Nifty GUI performance issues

Hello!

I started working on the GUI for my action-RPG, and ran into some performance issues when displaying a high number of icons.
My icon is composed of a panel, with the icon image in the middle with a small inset, and at most 3 texts: timer, counter and hotkey. These icons can be used for status effects and ability hotkeys as well. This is as bare-bones as it can get.

I am using a hidden layer for storing unused icons, and when they are needed (in my current case, when status effects are applied) they are moved into a visible panel. When they expire, they are moved back into the hidden layer. This in itself doesn’t have any noticeable performance problems.

The problem arises when I have a somewhat high number (~100) of icons on the screen. My FPS drops from 1500 to ~50.
While a hundred icons may seem a little high, it’s not too unreasonable. The player can likely have a dozen or two status effects at a time, and the action bar could likely have as many, or more. I don’t have anything else in the UI either; no status bars, combat logs and such.

Even inserting a few hundred image elements such as below right into the xml results in a noticeable performance hit:
[java]<image width="10px" height="10px" filename="Interface/border.png" />[/java]

My computer is not an ancient machine: Core 2 Duo 2.8 GHz CPU, Radeon 4850 graphics card, 4 GB RAM, running Windows 7 64-bit.

What could be the problem here? Does everyone else see such a performance hit in a similar case?

Thanks for the help!

How are you creating your elements?

You should probably use… uh… what does nifty call them? Builders? To construct the objects you need on the fly. Flag them for removal when they are not needed.

You’ll have to get a bazillion pointers to use/manipulate the objects you create… one for the Nifty Element, One for the Renderer… one just because 1 simple object isn’t half as fun as managing 17!

Anyways… try and look at it the same way you would your code…

if you were optimizing:

Create an icon because none exist…
Create a second because the first is in use
1st one expires… don’t GC the instance
Repurpose existing icon 1 for new effect…
Create 3rd instance because others are in use…

Player one rocks and has 36 effects… player 2 sucks and has 1.

No need to tax both clients unnecessarily.

etc, etc, etc… amen
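
The create-on-demand-then-reuse steps listed above can be sketched as a small pool. This is only a minimal illustration, not Nifty API: `IconPool` and its `factory` are hypothetical names, and the factory stands in for whatever code actually builds one icon element.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Minimal reuse pool: builds an icon only when no idle one exists. */
final class IconPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory; // hypothetical: builds one icon element
    private int created = 0;

    IconPool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns an idle icon if one is available, otherwise builds a new one. */
    T acquire() {
        T icon = idle.poll();
        if (icon == null) {
            icon = factory.get();
            created++;
        }
        return icon;
    }

    /** Called when an effect expires: keep the instance for later reuse. */
    void release(T icon) {
        idle.push(icon);
    }

    /** Number of icons built so far (never more than the peak in use). */
    int createdCount() {
        return created;
    }
}
```

With this shape, a client whose player has 36 effects ends up with 36 icons, while a client with a single effect only ever builds one.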


out of interest, what does object count show in the frame stats

That’s similar to what I’m doing currently. I create a large pool of icons at initialization; they are stored on the hidden layer and used when needed. Building them on the fly is really inefficient actually.
The problem though is not the way I’m handling these elements.

Simply having many image elements visible on the nifty screen has a massive performance hit, much more than would be reasonable. Having 500-1000 textured rectangles on the UI shouldn’t cripple the system when the renderer can handle much more complex scenes.

Object count is ~500 when I have 100 icons up. Each icon is 5 objects, likely the container panel, the three numbers and the icon image.

Edit: I’d like to add, there is no performance hit when the icons are on the hidden layer; only when they are visible.

Have you tried putting them directly in the GUI node of JME, instead of Nifty?

I don’t want to bash Nifty, just to check whether the trouble comes from the software or from the hardware.
If you see the same impact, then Nifty is not at fault… If not, there is something wrong with the way it’s done…

Displaying 500 images using a simple (but long) XML (pastebin link):
Results in 30 FPS

Using this code:

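[java]// Runs inside SimpleApplication.simpleInitApp(); needs: import com.jme3.ui.Picture;
for (int i = 0; i < 500; i++) {
    Picture pic = new Picture("HUD Picture");
    pic.setImage(assetManager, "Interface/border.png", true); // true = use alpha
    pic.setWidth(10);
    pic.setHeight(10);
    // 50 columns on an 11px pitch, rows starting 200px above the bottom edge
    pic.setPosition((i % 50) * 11, (i / 50) * 11 + 200);
    guiNode.attachChild(pic);
}[/java]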
Results in 200 FPS

Only when changing the second code to display 4000 images does the FPS drop down to 30.

Edit: In both cases, each image results in a separate object.
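
To make the three measurements easier to compare, they can be converted from FPS into frame time per image. This is a rough sketch only: it assumes the whole frame time is spent on the images and ignores any fixed per-frame cost, and `FrameCost` is just an illustrative helper name.

```java
/** Rough per-image frame cost, assuming all frame time goes to the images. */
final class FrameCost {
    /** Converts frames per second to milliseconds per frame. */
    static double frameMillis(double fps) {
        return 1000.0 / fps;
    }

    /** Average milliseconds of frame time per displayed image. */
    static double millisPerImage(double fps, int images) {
        return frameMillis(fps) / images;
    }
}
```

Under that assumption, 500 Nifty images at 30 FPS come to roughly 0.067 ms each, 500 Picture objects at 200 FPS to 0.01 ms each, and 4000 Picture objects at 30 FPS to roughly 0.008 ms each.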

@tamasszolcsanyi said: Building them on the fly is really inefficient actually.

I don’t know how to put this any other way… um… wrong.

Each object and all of its references are eating up memory… not to mention you’re pushing down 100 unused images to the GPU, because Nifty uses clipping to hide images.

Create them as you need and reuse the ones you create… set a time limit to how long you are willing to consume memory before you GC them.
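
The "time limit" idea can be sketched as a pool that drops icons once they have sat idle too long. This is a hypothetical illustration rather than anything Nifty provides: `ExpiringIconPool`, `maxIdleMillis` and the explicit `nowMillis` parameter are made-up names, and the caller is assumed to pass a non-decreasing clock value (for example from `System.currentTimeMillis()`).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Pool that forgets icons which have been idle longer than a limit. */
final class ExpiringIconPool<T> {
    /** An idle icon together with the time it was released. */
    private static final class Idle<E> {
        final E icon;
        final long releasedAtMillis;

        Idle(E icon, long releasedAtMillis) {
            this.icon = icon;
            this.releasedAtMillis = releasedAtMillis;
        }
    }

    // Oldest release at the head, most recent at the tail.
    private final Deque<Idle<T>> idle = new ArrayDeque<>();
    private final long maxIdleMillis;

    ExpiringIconPool(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Stores an icon that is no longer displayed. */
    void release(T icon, long nowMillis) {
        idle.addLast(new Idle<>(icon, nowMillis));
    }

    /** Returns the most recently released icon, or null if none is idle. */
    T acquire() {
        Idle<T> entry = idle.pollLast();
        return entry == null ? null : entry.icon;
    }

    /** Drops icons idle for longer than the limit so the GC can reclaim them. */
    int evictExpired(long nowMillis) {
        int dropped = 0;
        while (!idle.isEmpty()
                && nowMillis - idle.peekFirst().releasedAtMillis > maxIdleMillis) {
            idle.pollFirst();
            dropped++;
        }
        return dropped;
    }

    int idleCount() {
        return idle.size();
    }
}
```

Taking icons from the tail keeps the oldest entries at the head, so eviction only ever has to look at the front of the queue.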

I don’t even know where to begin, or if I should even reply, because I don’t know whether you are trolling or just stupid.

  1. Building new nifty Element objects on the fly takes some time; enough to have a small but noticeable performance impact. I know this because this was my first approach, and it was even slower than the current one. The correct solution is to create objects in advance, as many as are likely to be used, and if more are needed, they can be created dynamically.
  2. The memory eaten up by the objects is minuscule compared to the hundreds or thousands of megabytes of available memory. Due to this, it doesn’t really matter if there are some unused objects. Optimizing memory consumption is better done elsewhere.
  3. There are no 100 images pushed onto the GPU. Images loaded into the engine are likely already stored in graphics memory (not the GPU, which is the chip on the card); all that is being pushed is vertex data, a few dozen bytes per icon. That is something modern graphics pipelines and memory buses should easily handle.
  4. What you are talking about is irrelevant to my issue, useless. I haven’t even gone into fleshing out the details for this system. As I said (which you willfully or unintentionally ignored), simply displaying a hundred image elements using XML has an unreasonable performance cost. Without solving this issue, everything else is irrelevant; the features detailed are needed for my game, unavoidably. (Thanks to yang71, I now know I can use the JME GUI node directly for better performance).
    Contributor status or not, you are just wasting my time.
@tamasszolcsanyi said: I don't even know where to begin, or if I should even reply, because I don't know whether you are trolling or just stupid.
1) wrong 2) wrong, images are large and we are talking about OpenGL memory, the GPU has even less. 3) wrong, the image is pushed to the GPU when it's rendered and there's no way to tell OpenGL to do it at another time; the PCI bus is the bottleneck. 4) wrong, it's a possible solution to your issue
1) wrong
Care to elaborate? Are you telling me that it's better to weigh the gameplay performance down (even if it's occasional and small) when it could be done in advance, at loading? Or care to elaborate what would be the "right" way?
2) wrong, images are large and we are talking about OpenGL memory, the GPU has even less.
Images aren't necessarily large. They can be large. My icon images will not be large. Right now, they actually only display the same 7x7px image in addition to the bitmap fonts. So memory shouldn't really be an issue here.
3) wrong, the image is pushed to the GPU when it's rendered and there's no way to tell OpenGL to do it at another time; the PCI bus is the bottleneck.
Okay, is this something nifty specific? Because modern OpenGL renderers usually work by loading the texture data from the disk into the graphics card memory, and (ideally) only texture coordinate data needs to be transmitted through the PCI bus.
4) wrong, it's a possible solution to your issue
I'll say it for the third time; the core of the issue, as far as I could break it down, is:

Using NiftyGui to display a somewhat large number of image elements (even of the same image) has a much higher performance cost than doing it with Picture objects attached to the GuiNode, about an order of magnitude.
The situation doesn’t change whether it’s done through a simple XML description or the Java builder pattern.
This performance cost is large enough to be a significant problem later. I can’t avoid displaying about 50 or so icons for the action bar and the status effects combined. On top of this come many more UI elements: status bars, combat text, tooltips.
Creating the elements during gameplay doesn’t change the situation: if there are 100 of these images visible at the same time, performance drops below 60 fps. So it hasn’t solved the issue.

Edit: Disregard my OpenGL performance arguments. I actually don’t know too much about it, and I might even be wrong. Either way, the points above are the only thing I would like an answer to. I can admit I may be wrong in other details.

Being an ass will surely get you more people to help you find a solution to your problem, right? Not.

If you disagree with t0neg0d’s answer then explain yourself politely; that you’re having a shitty day is irrelevant. We’re mostly here to help each other, but acting that way tells other people that the help they can provide you might result in that kind of treatment if you’re not happy with what’s being offered.

Just don’t be an ass. Bite your tongue and be civil. It’s not that hard and you might end up getting more than you hoped. Respect is one of those.


All right, I apologize for being an ass. I too have let more emotion into my reply than I should’ve.

I still prefer to tell, and be told, the reasons when calling someone’s arguments wrong. I have probably run into baseless “rebuttals” one too many times, and it’s a touchy subject for me. I don’t like ending up with the wrong information because someone just wants to win the argument. I hope, and it does look like, that this is just a case of misunderstanding.

As I said, the last paragraph of my previous post is my issue, condensed, as I see it. I will try a different approach, using absolute layout and positioned images to get as close as possible to the Picture object way. I wonder what the performance difference will be then.

uh…please calm down.
You had 3 people trying to help you in about 3 hours after your post, even if their answers didn’t solve your issue, they DID spend THEIR time to look into it and give advice…calling them stupid is just plain unfair.

Your problem is due to the vast amount of geometries displayed on screen. One geometry = 1 draw call.
Modern GPUs are very good at crunching big amounts of polygons, but can handle only a relatively small number of draw calls.

You’ll have the same issue in a 3D scene, not only in a GUI.
Unfortunately, you don’t have a way to batch those objects coming out of nifty like you would for classic geometry in JME.

One solution though would be to use fewer objects for your UI. 5 objects per icon sounds excessive. Using simple panels for icons should drop your object count drastically. Maybe try to compensate by changing your textures.
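
As a quick back-of-the-envelope check of the "one geometry = 1 draw call" rule, the draw call count scales with icons times geometries per icon. The helper below is only an illustrative sketch with made-up names:

```java
/** Estimates draw calls under the "one geometry = one draw call" rule. */
final class DrawCallEstimate {
    static int drawCalls(int icons, int geometriesPerIcon) {
        return icons * geometriesPerIcon;
    }
}
```

With 100 icons this gives 500 draw calls at 5 geometries per icon (matching the reported object count), 300 at 3 per icon, and 100 if each icon were a single panel.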

You could also use multiple quads in one geometry and just change their UVs to display different icons from a texture atlas.
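
For illustration, picking an icon out of an atlas comes down to computing the UV rectangle of its cell. The sketch below assumes a uniform grid atlas with cells numbered row by row; `AtlasUv` and `cellUv` are hypothetical names, and whether v = 0 is the top or bottom row depends on how the texture is loaded.

```java
/** UV lookup for a uniform grid atlas (cells numbered row by row). */
final class AtlasUv {
    /** Returns {u0, v0, u1, v1} for the cell at the given index. */
    static float[] cellUv(int index, int columns, int rows) {
        if (index < 0 || index >= columns * rows) {
            throw new IllegalArgumentException("index outside atlas: " + index);
        }
        int col = index % columns;
        int row = index / columns;
        float cellW = 1f / columns; // cell width in UV space
        float cellH = 1f / rows;    // cell height in UV space
        float u0 = col * cellW;
        float v0 = row * cellH;
        return new float[] { u0, v0, u0 + cellW, v0 + cellH };
    }
}
```

Each quad in the shared geometry would then get the four corners of that rectangle as its texture coordinates, so switching an icon means rewriting four UV pairs instead of swapping a texture.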

@normen said: You could also use multiple quads in one geometry and just change their UVs to display different icons from a texture atlas.
Well...not with nifty...

All right, I already apologized; I too would rather undo this stupidity of mine if I could. It wasn’t useful for any of us.

Anyway, I understand there are some problems with how the draw calls are structured and the performance issues behind them.
Sometimes it’s unavoidable, and in my case I can’t really cut my icons down much more. The least I can go down to is three objects per icon: image, hotkey/stack count and cooldown/duration numbers. This is very plain though; at least a border would be nice, which ups it to 4.

I have done one more test, to see whether seemingly similar approaches have any performance differences. I tried displaying this XML file. It contains 500 image tags in absolute layout, positioned with x,y coordinates in a 50*10 grid. This is as close as I could get to this approach, the same one I posted previously:
[java]
// Runs inside SimpleApplication.simpleInitApp(); needs: import com.jme3.ui.Picture;
for (int i = 0; i < 500; i++) {
    Picture pic = new Picture("HUD Picture");
    pic.setImage(assetManager, "Interface/border.png", true); // true = use alpha
    pic.setWidth(10);
    pic.setHeight(10);
    // 50 columns on an 11px pitch, rows starting 200px above the bottom edge
    pic.setPosition((i % 50) * 11, (i / 50) * 11 + 200);
    guiNode.attachChild(pic);
}
[/java]

That XML surprisingly yields 30 FPS, even worse than using other layouts.

Either way, it seems that if I want decent performance, I could use Picture objects on the GuiNode. I really don't see any way to display 100 icons, each made up of a border image, an icon image and three texts, with NiftyGui with good performance.

@nehon said:
@normen said: You could also use multiple quads in one geometry and just change their UVs to display different icons from a texture atlas.
Well...not with nifty...
No but you don't have accessible geometry in Nifty either ^^ In fact nifty could do this at some point if image atlas support was added..

Just a note, the nifty layer is already reusing the same object over and over. Only the texture coordinate buffer is resent to the GPU for every image (if it is a sub-image). The textures may or may not be reuploaded but I think that depends on the size of the textures and how full the GPU memory is. If these are just icons in a tool bar then I’d expect their textures to be pretty small… if not then that’s an issue worth looking at. In general, I think the JME-Nifty layer will be really poor at this sort of UI, though.

Adding a new icon to either nifty or a Picture in the regular scene should be nearly instant. If this is taking a lot of time then I again wonder how big the textures must be. I can only speculate about why a nifty UI with 1000 icons is slower than 1000 separate Picture meshes.

At any rate, using Picture for your icons is probably the best way anyway. You shouldn’t have to preload them, either. Just add them to the HUD when you need them and let the regular JME systems take care of the asset caching and stuff for you. You could preload the textures in the asset manager if you are worried about load times breaking frame… but adding the picture itself to the HUD should be instant. Unless your textures are really big then you shouldn’t even notice.

@pspeed said: You could preload the textures in the asset manager if you are worried about load times breaking frame... but adding the picture itself to the HUD should be instant. Unless your textures are really big then you shouldn't even notice.
If 100 different icons are suddenly supposed to render, 100 icons will be pushed through the PCI bus to the GPU, you can't "preload" that. Even if they're 7x7 images, that's the equivalent of one 70x70 image. As the issue is apparently the multitude of images and hence the multitude of geometry, I guess just using single Picture objects will yield a similar result. The general overhead might be less though, as nifty basically generates the composited image each frame.
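
As a rough size check for icon textures of this kind, the pixel and byte totals are simple to work out. The sketch assumes uncompressed RGBA at 4 bytes per pixel, which is an assumption since the actual texture format isn't stated in the thread; `TextureBudget` is an illustrative name.

```java
/** Rough texture size arithmetic for a set of equally sized icons. */
final class TextureBudget {
    /** Total pixel count of all icons. */
    static long pixels(int count, int width, int height) {
        return (long) count * width * height;
    }

    /** Total bytes, given an assumed bytes-per-pixel figure (4 for RGBA8). */
    static long bytes(int count, int width, int height, int bytesPerPixel) {
        return pixels(count, width, height) * bytesPerPixel;
    }
}
```

For 100 icons of 7x7 pixels this gives 4,900 pixels, the same area as a single 70x70 image, or 19,600 bytes at 4 bytes per pixel.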