Nifty GUI performance issues

pspeed · January 11, 2013, 3:16pm

@normen said:
@pspeed said: You could preload the textures in the asset manager if you are worried about load times breaking frame... but adding the picture itself to the HUD should be instant. Unless your textures are really big then you shouldn't even notice.
If 100 different icons are suddenly supposed to render, 100 icons will be pushed through the PCI bus to the GPU; you can't "preload" that. Even if it's 7x7 images, that's the equiv of one 700x700 image. As the issue is apparently the multitude of images and hence the multitude of geometry, I guess just using single picture objects will yield a similar result. The general overhead might be less, though, as Nifty basically generates the composited image each frame.

To me, it sounded like he was preloading 4000 icons because 45 of them might be displayed. The assumption is that most of the time the toolbar isn’t changing and that at game time it’s just one or two icons changing at a time.

If he/she is really swapping out 1000 icons per frame then there will always be issues. Even resending the texture coordinate buffers every time could be a bottleneck.

normen · January 11, 2013, 3:22pm

@pspeed said: To me, it sounded like he was preloading 4000 icons because 45 of them might be displayed. The assumption is that most of the time the toolbar isn't changing and that at game time it's just one or two icons changing at a time.
If he/she is really swapping out 1000 icons per frame then there will always be issues. Even resending the texture coordinate buffers every time could be a bottleneck.

To me it sounded like he went off about this not being the problem and that in fact he’ll need an immense number of buttons at some point. Maybe a custom Nifty control would also work.

tamasszolcsanyi · January 11, 2013, 3:27pm

In my first implementations, before I started testing the performance, I preloaded 100 elements, something which I first deemed a reasonable maximum. Inserting and removing icons isn’t done in massive numbers; in the worst case, it’s a dozen or two in a frame, but even that is unlikely. I haven’t noticed issues with displaying/hiding these icons, but when all 100 were shown, the FPS dropped to sub-60, as I mentioned.

@normen said: Even if its 7×7 images, thats the equiv of one 700×700 image

Do you mean equal in number of pixels? Wouldn't it then be 70x70? 100 7x7 images have 4,900 pixels; a 700x700 one has 490,000. Did you mean with overheads?
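For reference, the arithmetic in this question can be checked with a minimal sketch. The class and method names are made up for illustration, and it only counts raw pixels; it says nothing about per-texture or per-draw overhead.

```java
/** Raw pixel counts for the image sizes discussed in this thread. */
class IconPixelMath {

    /** Total pixels in `count` images of size width x height. */
    static long totalPixels(int count, int width, int height) {
        return (long) count * width * height;
    }

    public static void main(String[] args) {
        long icons = totalPixels(100, 7, 7);       // 100 icons of 7x7
        long square70 = totalPixels(1, 70, 70);    // one 70x70 image
        long strip = totalPixels(1, 700, 7);       // one 700x7 strip
        long square700 = totalPixels(1, 700, 700); // one 700x700 image

        System.out.println("100 x 7x7 = " + icons);
        System.out.println("70x70     = " + square70);
        System.out.println("700x7     = " + strip);
        System.out.println("700x700   = " + square700);
    }
}
```

By pixel count alone, 100 icons of 7x7 match a 70x70 image or a 700x7 strip, and are 100 times smaller than a 700x700 image.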

pspeed · January 11, 2013, 3:30pm

How big are your icons’ source images and how much space do they typically take up on screen?

nehon · January 11, 2013, 3:34pm

IMO I would go with a custom UI made of quads, with texture atlases and a custom shader to account for the icon index in the atlas… all of this in a batch node…
This could end up rendering only one object for the entire GUI, and would drastically increase performance. And you would still keep the ability to move GUI stuff around.
All of this is doable without much effort right now.

The only drawback is that all of this has to be done in code.

If OP absolutely wants to keep Nifty, I would go with one panel for the entire toolbar, generate the entire toolbar texture with Zarch’s ImagePainter, and only refresh it when an icon is changed in the toolbar…
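The "only refresh it when an icon is changed" idea is a dirty-flag cache. Below is a minimal sketch of that pattern in plain Java, assuming icons are equally sized ARGB pixel arrays laid out in one horizontal strip. The class `ToolbarComposite` and its methods are hypothetical; they do not use ImagePainter or any jME/Nifty API, and uploading the resulting pixels to a texture is left out.

```java
/** Caches a composited toolbar image and rebuilds it only after a slot changes. */
class ToolbarComposite {
    private final int slots;
    private final int iconSize;
    private final int[][] slotPixels; // one ARGB array (iconSize * iconSize) per slot
    private final int[] composite;    // (slots * iconSize) wide, iconSize tall
    private boolean dirty = true;
    private int rebuildCount = 0;

    ToolbarComposite(int slots, int iconSize) {
        this.slots = slots;
        this.iconSize = iconSize;
        this.slotPixels = new int[slots][iconSize * iconSize]; // starts fully transparent
        this.composite = new int[slots * iconSize * iconSize];
    }

    /** Replaces the icon in one slot and marks the composite as stale. */
    void setIcon(int slot, int[] argbPixels) {
        if (argbPixels.length != iconSize * iconSize) {
            throw new IllegalArgumentException("icon must be " + iconSize + "x" + iconSize);
        }
        slotPixels[slot] = argbPixels.clone();
        dirty = true;
    }

    /** Returns the composited pixels, rebuilding them only if something changed. */
    int[] getComposite() {
        if (dirty) {
            rebuild();
            dirty = false;
        }
        return composite;
    }

    int getRebuildCount() {
        return rebuildCount;
    }

    private void rebuild() {
        int width = slots * iconSize;
        for (int s = 0; s < slots; s++) {
            for (int y = 0; y < iconSize; y++) {
                for (int x = 0; x < iconSize; x++) {
                    composite[y * width + s * iconSize + x] = slotPixels[s][y * iconSize + x];
                }
            }
        }
        rebuildCount++;
    }
}
```

Calling `getComposite()` every frame costs nothing extra while the toolbar is unchanged; the copy loop only runs on the first call after a `setIcon`.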

pspeed · January 11, 2013, 3:37pm

@nehon said: IMO I would go with a custom UI made of quads, with texture atlases and a custom shader to account for the icon index in the atlas... all of this in a batch node... This could end up rendering only one object for the entire GUI, and would drastically increase performance. And you would still keep the ability to move GUI stuff around. All of this is doable without much effort right now.

I was going to suggest this too but I wanted to get the data in order first. If he’s doing something crazy like using 512x512 images for his icons then an atlas may not be the first thing to try.

But yeah, I think an atlas is totally the way to go. A particular icon’s update is then just four texture coordinates.
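As a rough illustration of "just four texture coordinates", here is a sketch that maps an icon index to the UV corners of its cell. It assumes a grid atlas of equally sized cells, counted left to right and top to bottom, with v = 0 at the top row; the corner order is also an assumption. `AtlasUv` and its methods are hypothetical helpers, not part of jME or Nifty.

```java
/** Computes texture coordinates for a cell in a uniform grid atlas. */
class AtlasUv {

    /** Returns {u0, v0, u1, v1} for the cell at `index` (v = 0 at the top row). */
    static float[] cellUv(int index, int cols, int rows) {
        if (index < 0 || index >= cols * rows) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        int col = index % cols;
        int row = index / cols;
        float u0 = (float) col / cols;
        float v0 = (float) row / rows;
        float u1 = (float) (col + 1) / cols;
        float v1 = (float) (row + 1) / rows;
        return new float[] {u0, v0, u1, v1};
    }

    /** Expands a cell into four (u, v) corner pairs for a quad. */
    static float[] quadTexCoords(int index, int cols, int rows) {
        float[] c = cellUv(index, cols, rows);
        return new float[] {
            c[0], c[1], // top-left
            c[2], c[1], // top-right
            c[2], c[3], // bottom-right
            c[0], c[3]  // bottom-left
        };
    }
}
```

Changing which icon a button shows then means writing these eight floats into the quad's texture coordinate buffer; the atlas image itself is not touched. If the texture origin is bottom-left, flip v before use.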

tamasszolcsanyi · January 11, 2013, 3:48pm

I haven’t started working with a more defined implementation, since I noticed this performance hit very early, and tried to figure it out; made the topic after cutting it down to 5 objects: one image, the file was a 7x7 png, and three text elements. The fifth object was the container panel.
Ideally, I’d like to have an icon with a size between 32x32 and 64x64, depending on resolution and the general structure of the UI. I’d also like to have a border texture for it; a separate one would be nice, since I could shade it depending on the situation. Since using the 12-parameter scaling creates too many objects, a single image is acceptable too. Then two more text fields; a third likely wouldn’t be used. These would be cooldown and hotkey in the case of action bar buttons, duration and stacks in the case of status effects.

I considered writing custom UI code; I actually attempted, and partially succeeded in an older project to do so, though it only used immediate mode. I would have preferred to avoid it if possible.

Currently, my likeliest solution will be to use a custom framework for the performance sensitive parts only. I can use Nifty to lay out the more complex panels, such as character sheet and such.

Edit: A few more details:
I can envision the action bar to have between 10 and 50 buttons. Status effects can end up a few dozen, with sustained effects, buffs, and debuffs in heavy combat situations.
The reason I’m so unsure about the action bar is that, at first, I only want to get a simple tech demo for the combat together. Only once that’s done will I work on the other parts of the game, and I can’t yet judge what would be a healthy number for the player to use.

zarch · January 11, 2013, 4:15pm

I’ve run Nifty with a substantial number of panels/images etc. on screen with no problems. Not hundreds of icons, but some of the more complex screens like the deck builder or challenge screens must be getting quite high:

Each of those progress bars is 2 images, then there is the text, the background panels, the buttons, etc.

Having said that I agree with the others, either ImagePainter and a single Image or a custom mesh and a texture atlas would give much better performance than trying to do this via Nifty.

normen · January 11, 2013, 5:53pm

*700x7. Anyway, lots of data at once ^^ If it was an atlas, it would be sent once and after that it’s only texture coords.

t0neg0d · January 11, 2013, 7:56pm

@nehon said: IMO I would go with a custom UI made of quads, with texture atlases and a custom shader to account for the icon index in the atlas... all of this in a batch node... This could end up rendering only one object for the entire GUI, and would drastically increase performance. And you would still keep the ability to move GUI stuff around. All of this is doable without much effort right now.
The only drawback is that all of this has to be done in code.

If OP absolutely wants to keep Nifty, I would go with one panel for the entire toolbar, generate the entire toolbar texture with Zarch’s ImagePainter, and only refresh it when an icon is changed in the toolbar…

Wonder who the two people who thumbed-up this were? =)

@tamasszolcsanyi said: In my first implementations, before I started testing the performance, I preloaded 100 elements, something which I first deemed a reasonable maximum. Inserting and removing icons isn't done in massive numbers; in the worst case, it's a dozen or two in a frame, but even that is unlikely. I haven't noticed issues with displaying/hiding these icons, but when all 100 were shown, the FPS dropped to sub-60, as I mentioned.
Even if its 7×7 images, thats the equiv of one 700×700 image
Do you mean equal in number of pixels? Wouldn't it be then 70x70? 100 7x7 images has 4900 pixels, a 700x700 one has 490000. Did you mean with overheads?

When you say 12-24 on any given frame… do you mean per frame? Or when the circumstance arises that 12 effects start / end?

One of the reasons I was suggesting the reuse of geometries in Nifty was to stop the possibility of sending those geometries over again to the GPU.

Say you had 12 effects going.

The GPU has all twelve (possibly containing multiple geometries) sitting in memory already. Effectively, you can either

a) hide any that aren’t being used
b) swap out the image for a 1-pixel transparent one

then, when you need them again:

Swap out the image (now only the image is being pushed to the GPU).

If you need more than 12 at some point, just push the new one down, though I would also suggest GCing them after some set time of inactivity.

Basically this gives you your preloading plan… it is just preloaded in stages… and the preloading is somewhat temporary. >.<
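The staged preloading described in this post amounts to a small object pool: hide unused slots, reuse a hidden slot before creating a new one, and drop slots that have been idle for too long. A minimal sketch in plain Java follows. `IconSlotPool`, `Slot`, and the string image keys are all hypothetical stand-ins for whatever GUI element and image handle are actually used; nothing here is Nifty or jME API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Reuses hidden icon slots and evicts the ones that stay idle. */
class IconSlotPool {

    /** Stand-in for an on-screen element that already exists on the GPU side. */
    static final class Slot {
        String imageKey;     // hypothetical image identifier; null while hidden
        boolean visible;
        long lastUsedMillis; // time of the last show or hide
    }

    private final List<Slot> slots = new ArrayList<>();
    private int createdCount = 0;

    /** Shows an image, reusing a hidden slot if one exists. */
    Slot show(String imageKey, long nowMillis) {
        Slot slot = null;
        for (Slot candidate : slots) {
            if (!candidate.visible) {
                slot = candidate;
                break;
            }
        }
        if (slot == null) {
            slot = new Slot(); // only now is a new element created
            slots.add(slot);
            createdCount++;
        }
        slot.imageKey = imageKey; // only the image is swapped
        slot.visible = true;
        slot.lastUsedMillis = nowMillis;
        return slot;
    }

    /** Hides a slot but keeps it around for reuse. */
    void hide(Slot slot, long nowMillis) {
        slot.visible = false;
        slot.imageKey = null;
        slot.lastUsedMillis = nowMillis;
    }

    /** Removes hidden slots idle for longer than maxIdleMillis; returns how many. */
    int evictIdle(long nowMillis, long maxIdleMillis) {
        int removed = 0;
        Iterator<Slot> it = slots.iterator();
        while (it.hasNext()) {
            Slot slot = it.next();
            if (!slot.visible && nowMillis - slot.lastUsedMillis > maxIdleMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return slots.size();
    }

    int getCreatedCount() {
        return createdCount;
    }
}
```

Timestamps are passed in rather than read from the system clock so the eviction rule is easy to test; in a game loop they could come from the frame timer.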

pspeed · January 11, 2013, 8:31pm

The JME nifty layer only has one mesh and reuses it over and over and over.

t0neg0d · January 11, 2013, 9:14pm

@pspeed said: The JME nifty layer only has one mesh and reuses it over and over and over.

/doh … of course it does. You’d think I would know this by now >.< The idea somehow rings a bell.

normen · January 12, 2013, 7:37pm

After looking at the nifty docs, a custom nifty control that uses an atlas of buttons should be a breeze. And you keep the option to have the GUI layout separate as well as exchangeable for different platforms.

pspeed · January 12, 2013, 10:12pm

I think a custom nifty control that uses an atlas of buttons will still result in buttonCount number of draw calls on the JME side. And it will resend the texture coordinates for each button. The nifty JME layer can only draw images one at a time given source/target offsets, etc. and it reuses the same quad and updates the texture coordinates for each draw call.

nehon · January 12, 2013, 10:15pm

@pspeed said: I think a custom nifty control that uses an atlas of buttons will still result in buttonCount number of draw calls on the JME side. And it will resend the texture coordinates for each button. The nifty JME layer can only draw images one at a time given source/target offsets, etc. and it reuses the same quad and updates the texture coordinates for each draw call.

Not anymore, I changed it like a year ago. This was killing performance on Android.

Edit: I mean the texcoords being updated every frame.

pspeed · January 12, 2013, 10:18pm

@nehon said:
@pspeed said: I think a custom nifty control that uses an atlas of buttons will still result in buttonCount number of draw calls on the JME side. And it will resend the texture coordinates for each button. The nifty JME layer can only draw images one at a time given source/target offsets, etc. and it reuses the same quad and updates the texture coordinates for each draw call.
Not anymore, i changed it like a year ago. This was killing performance on android
Edit, I mean the texcoords updated every frame

If they are different then they are updated. It cannot be otherwise.

nehon · January 12, 2013, 10:21pm

Yeah, you’re right, my mistake. What I changed was the color: instead of writing it to the mesh, it now uses a material color.

Coords are still written.

normen · January 13, 2013, 12:39pm

@pspeed said: If they are different then they are updated. It cannot be otherwise.

But they are only different when the layout changes, and they use only one image that was uploaded when the game started. Plus, you don’t have any additional overhead that might be caused by stacks of hundreds of generic Nifty layout elements being processed in its update loop. You can build it in whatever way makes most sense for the application: “WoW-like button bar overkill”.

pspeed · January 13, 2013, 5:27pm

@normen said: But they are only different when the layout changes and they use only one image that has been uploaded when the game started, plus you don’t have any additional overhead that might be caused by stacks of hundreds of generic layout elements of nifty being processed in its update loop. You can do it like it makes most sense for the application “wow-like button bar overkill”.

No. You misunderstand the interaction between Nifty and JME. Nifty says: draw this part of this image, draw this part of this image, draw this part of this image, etc… for each of those, JME sets the texture coordinates, sets the transform of the quad, renders the quad… over and over. Effectively this is like having x number of quads.
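To make the draw-call point concrete, here is a toy model that only counts submissions; it does not render anything and is not the actual Nifty/JME render code. It assumes each GUI image is identified by the texture it samples from. `GuiDrawCalls` and its methods are hypothetical, and the batched variant ignores draw-order and overlap concerns that a real batcher would have to handle.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Counts draw calls for two ways of submitting the same list of GUI images. */
class GuiDrawCalls {

    /** One shared quad, re-pointed and drawn once per image: one call per image. */
    static int perImageDrawCalls(List<String> imageTextures) {
        int drawCalls = 0;
        for (String texture : imageTextures) {
            // a real renderer would set texture coordinates and the quad transform here
            drawCalls++;
        }
        return drawCalls;
    }

    /** Quads sharing a texture are merged into one mesh: one call per distinct texture. */
    static int batchedDrawCalls(List<String> imageTextures) {
        Map<String, Integer> quadsPerTexture = new LinkedHashMap<>();
        for (String texture : imageTextures) {
            quadsPerTexture.merge(texture, 1, Integer::sum);
        }
        return quadsPerTexture.size();
    }

    public static void main(String[] args) {
        // 50 buttons that all sample from the same (hypothetical) atlas texture
        List<String> buttons = Collections.nCopies(50, "atlas.png");
        System.out.println("per-image draw calls: " + perImageDrawCalls(buttons));
        System.out.println("batched draw calls:   " + batchedDrawCalls(buttons));
    }
}
```

In this model an atlas alone does not reduce the per-image count; the reduction comes from merging the quads into one mesh, which is why the atlas and the batch node were suggested together.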

normen · January 13, 2013, 5:42pm

@pspeed said: No. You misunderstand the interaction between Nifty and JME. Nifty says: draw this part of this image, draw this part of this image, draw this part of this image, etc... for each of those, JME sets the texture coordinates, sets the transform of the quad, renders the quad... over and over. Effectively this is like having x number of quads.

Yes! You must be misinterpreting me talking about object count in the post you quote, I guess. I gave an example for a possible low-level / self-built solution for that quite some time ago in this thread.