Need some tips to improve performance

I have been checking the performance of my game. Even though I have not put in everything I had planned, the frame rate is stuck around 32 FPS with all debugging stuff reduced to a minimum.

In this screen the following is going on (only the big chunks):

  • There is a globe model with a custom terrain shader
  • There is a clone of this model with a custom water shader
  • There is an interface being rendered to a texture (you can not see it here)
  • There is a Fog-filter and Depth-of-field filter
  • There is a shadow filter
  • There are 200 tiny guys with brains running around and hitting each other.

I think that is not very impressive. I have a reasonably fast computer with a dedicated GPU and I do not think it should have any trouble with this.
Screenshot from 2022-11-15 08-54-08

These are my main suspects of the cycle theft. I started turning them off to see where the biggest cycle-eaters are:

  • Setting the number of men to 0: 26 cycles (number of cycles gained)
  • Turning off shadow filter: 15 cycles
  • Turning off Fog and DOF: 2 cycles
  • Turning off the interface offscreen renderer: 7 cycles
  • Turning off the water (model and shader): 5 cycles
  • Turning off the globe with terrain AND water shaders: 28 cycles

I know this is not a very exact way to measure performance, but I think it is a good indication. It seems that the worst cycle eaters are the little men and the terrain shader.
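
One caveat when comparing those numbers: if "cycles" means frames per second gained (an assumption, the post does not define the term), the gains are not additive, because the same FPS gain corresponds to a different amount of saved time depending on the starting frame rate. A minimal sketch in plain Java, with hypothetical class and method names, that converts the figures above into milliseconds saved per frame:

```java
/** Converts FPS readings into per-frame time so that savings can be compared. */
final class FrameTime {

    /** Milliseconds spent on one frame at the given frame rate. */
    static double frameTimeMs(double fps) {
        return 1000.0 / fps;
    }

    /** Milliseconds saved per frame when the rate rises from baseFps by gainedFps. */
    static double savedMs(double baseFps, double gainedFps) {
        return frameTimeMs(baseFps) - frameTimeMs(baseFps + gainedFps);
    }

    public static void main(String[] args) {
        double base = 32.0; // baseline frame rate from the post
        String[] names = {"men", "shadow filter", "fog + DOF", "interface", "water", "globe"};
        double[] gains = {26, 15, 2, 7, 5, 28}; // assumed to be FPS gained
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-14s %5.2f ms saved%n", names[i], savedMs(base, gains[i]));
        }
    }
}
```

At a 32 FPS baseline a frame takes 31.25 ms, so a gain of 26 FPS corresponds to roughly 14 ms saved per frame. Comparing milliseconds rather than FPS makes it easier to line the results up with what a profiler reports.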

Does anyone know how to further profile the shader performance? Or does anyone have any dos or don’ts for shaders that could help?

Without looking at anything else, 8000 objects is kind of a lot:

8018 draw calls. Not sure how your scene breaks down but usually lowering that is effective.

JME also has a detailed profiler that can provide some more information on where your time is going each frame.

I don’t know what “cycles” means in your above post.

// Attach during application setup; DetailedProfilerState is in the com.jme3.app package.
DetailedProfilerState detailedProfilerState = new DetailedProfilerState();
getStateManager().attach(detailedProfilerState);

By default it binds to F6 if I remember correctly.

Wow - that is useful AND fun!

I have been experimenting to see where the objects come from. Lemur, with a simple interface showing just a label and a button, already accounts for over 400 objects.

The little guys account for 4 objects each. They are all cloned and have one single mesh. They share a material. So I am a bit puzzled about what counts as an object.

The trees are very obviously guilty: they each consist of four meshes.

What I think I should do first is dive into instancing. I wrongly thought cloning a node amounted to the same thing in JME3, but it obviously does not.

If the trees are static, GeometryBatchFactory might be a quicker and simpler way than instancing.


And, correct me if I’m wrong, but instancing doesn’t lower the object count. Batching (into a single mesh per group) does.

Also, spherical worlds are tricky, because things on the opposite side of the planet don’t get culled, even if they’re out of sight. Maybe some custom culling based on the normals of the objects in relation to the camera would help. Batching also hinders culling, since the bounding volume of a batch naturally covers all of the meshes in it. So batching them all together might not be a good idea (but it is worth testing).
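
The normal-based culling idea can be sketched without any engine types. Assuming the planet is a sphere centred at the origin and the object sits on its surface, the object is beyond the horizon when the vector from the object to the camera points into the ground, that is, when the dot product of the surface normal and that vector is not positive. The class and method names are hypothetical, and tall objects close to the horizon would need some extra margin that this sketch leaves out:

```java
/** Horizon test for objects standing on a spherical planet centred at the origin. */
final class HorizonCull {

    /**
     * @param objectPos position of the object on the surface, relative to the planet centre
     * @param cameraPos position of the camera, relative to the planet centre
     * @return true if the object is on the camera-facing side of the horizon
     */
    static boolean isFacingCamera(double[] objectPos, double[] cameraPos) {
        // The surface normal at the object points along objectPos, so the sign of
        // normal . (camera - object) says whether the camera is above the local ground plane.
        double dot = 0.0;
        for (int i = 0; i < 3; i++) {
            dot += objectPos[i] * (cameraPos[i] - objectPos[i]);
        }
        return dot > 0.0;
    }

    public static void main(String[] args) {
        double[] camera = {0, 30, 0};   // camera above the "north pole" of a radius-10 planet
        double[] nearSide = {0, 10, 0};
        double[] farSide = {0, -10, 0};
        System.out.println("near side visible: " + isFacingCamera(nearSide, camera));
        System.out.println("far side visible:  " + isFacingCamera(farSide, camera));
    }
}
```

In jME the result of such a test could be used to switch a spatial's cull hint between always culled and inherited, for example once per frame or whenever the camera has moved far enough; how often to run it is a design choice that depends on the number of objects.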

Yes, cloning has nothing to do with instancing. You can create an InstancedNode and attach models into that and call InstancedNode.instance().

You can find some examples of using it here:

Note that, to work with instancing, when you clone a spatial you must use spatial.clone(false) so that the material is not cloned and the same material instance is used; that is required for “instancing”. Also note that rigged models do not work with instancing, because when you clone them JME internally also clones the mesh and does not share the same mesh (that is required for animation). For “instancing” the models must share the same mesh and the same material. I would prefer to use instancing for large models like trees, and batching for small things like grass.
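
The sharing requirement can be illustrated with a small plain-Java model. The types below are hypothetical stand-ins, not jME classes: geometries can only collapse into one instanced group when they reference the very same mesh object and the very same material object, so a clone that copies the material ends up in a group of its own.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class InstancingGroups {

    /** Stand-in for a geometry: it only holds references to its mesh and material. */
    static final class Geom {
        final Object mesh;
        final Object material;

        Geom(Object mesh, Object material) {
            this.mesh = mesh;
            this.material = material;
        }
    }

    /** Counts groups of geometries that share both mesh and material by reference. */
    static int countGroups(List<Geom> geoms) {
        Map<Object, Set<Object>> materialsByMesh = new IdentityHashMap<>();
        for (Geom g : geoms) {
            materialsByMesh
                .computeIfAbsent(g.mesh,
                    m -> Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>()))
                .add(g.material);
        }
        int groups = 0;
        for (Set<Object> materials : materialsByMesh.values()) {
            groups += materials.size();
        }
        return groups;
    }
}
```

Three trees that share one mesh and one material count as a single group here; adding a fourth tree with its own copy of the material raises the count to two, which mirrors why the material must not be cloned.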

It is a pity rigged models cannot be instanced… does that mean that the guys need to stay their own unique selves?

And @tonihele, I can confirm from testing that object count does not go down with instancing.

@rickard why would things on the opposite side of the planet not get culled? Do you mean at a mesh level? Because faces will be culled, I guess.

Instancing the trees and rocks helped a little, but not much. GeometryBatchFactory can not be used since the rocks can be moved and trees will be chopped down later on.

Yes, I mean objects. Away-facing tris will be culled.

Try it anyway to see if it has an impact. If it works you can create ways around the problem.

You can still batch them after a change. We have a dungeon game (GitHub - tonihele/OpenKeeper: Dungeon Keeper II remake) in which the dungeon is ever changing. We just apply the changes, batch again, and swap the result into the scene graph. We use the stock jME batching, which in our case is definitely not the most efficient choice, yet it works just fine.

Maybe I don’t fully understand GeometryBatchFactory. No, that is not a maybe.

The Wiki suggests BatchNode would be a better way to go here. As I understand it from reading the Wiki, GeometryBatchFactory does not let you control individual items.

Btw: this is what the profiler says:

Yes, that is the thing. GeometryBatchFactory is for static stuff. But you can also, like Rickard said, quickly test batching with it. BatchNode is the same thing then, except maybe marginally slower, and it needs some setup.

Hmm, it says the GPU frame time is 16.86 ms, which corresponds to about 60 fps. So it seems the GPU (rendering) is not the problem in your case, but the CPU is(?), if I am right!!

Edit:
I have never used the profiler before so I could be wrong about those numbers.

I believe so. Afaik you can not batch or instance them but I think you can still use LOD (level of detail) control on them which is another good way to improve rendering performance.

Edit:
Note: I think you can use LOD control with instancing as well. JME InstancedGeometry supports that.

Looks like the CPU is the problem, most probably because of the object count (8000), but it might also be something in your code.

As we can see the GPU is fine, so shaders/tris/etc. are fine.

So the first thing I would look at is what causes the CPU issue: is it the object count, or maybe some of your terrain code, or something else? It just needs debugging.

Just please note, looking at your planet with its elements, I would expect something like 100-200 objects:

  • 1 terrain
  • 50-100 trees
  • 50-100 characters
    If you are going to have any grass, it should also be just 1 object in this case.

Also, if you have some “all time generator code” going on, you can move it to a different thread so it runs on a different core, because the JME main thread uses just 1 of your 6 CPU cores.
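
A minimal plain-Java sketch of that idea, with hypothetical names throughout: the heavy generation work runs on a background executor, and finished results are handed over through a thread-safe queue that the main loop drains once per frame, so the scene is only ever touched from the main thread. The generate method is a placeholder for whatever the real expensive work is.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class BackgroundGenerator {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final ConcurrentLinkedQueue<int[]> finished = new ConcurrentLinkedQueue<>();

    /** Called from the main thread; queues the work and returns immediately. */
    void requestChunk(int seed) {
        worker.execute(() -> finished.add(generate(seed)));
    }

    /** Placeholder for the expensive work; runs on the worker thread. */
    private static int[] generate(int seed) {
        int[] heights = new int[16];
        for (int i = 0; i < heights.length; i++) {
            heights[i] = seed * 31 + i;
        }
        return heights;
    }

    /** Called once per frame from the main thread; returns how many results were applied. */
    int drainFinished() {
        int applied = 0;
        while (finished.poll() != null) {
            // A real game would attach or update scene objects here.
            applied++;
        }
        return applied;
    }

    /** Lets queued work finish, then stops the worker thread. */
    void shutdown() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In jME the per-frame drain would typically be called from an update method, and the worker should be shut down when the application closes. A single worker thread keeps results in request order; a larger pool would trade that ordering for throughput.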


Here I have selected just the most time consuming items in the stats:

It appears to lose a lot of CPU time in the DirectionalLightShadowRenderer: over 11 ms of the 20 ms at this given moment (it fluctuates quite a bit).

About the expected object count, I would expect:
200 Humans x 1 object = 200 objects (seems to be over 800)
100 Trees x 4 objects = 400 objects
100 Trees x 4 objects = 400 objects
----------------
1000 objects (give or take a few)

So you are right, it is not clear what makes up an object. And it is also fluctuating.

1st line: if it is 800, something is wrong if each human has 1 object. Characters like this can easily be just 1 object each.
Lines 2-3 are duplicated? Anyway, trees could have 2 objects if you want the leaves to use a different material; 4 is too much imo. So 200 objects is preferred in this case imo.

Most optimal:
200 Humans x 1 object = 200 objects
+
100 Trees x 4 objects → batched / instanced → 1-4 objects, I believe.

So it could still be around 200.

Ofc you can make it 1000, that should be OK too, but the fewer the better.

Not sure why DirectionalLightShadowRenderer causes an issue for you, maybe you use the wrong parameters for it?

You can try switching things off, like removing the trees/characters, and see if there is a difference. If not, then something else causes the issue.

Trust me, the model for the humans only has one mesh. And still it counts for four times as many objects as expected. That’s where my question comes from: how does JME3 count objects?

I did experiment with switching off stuff: trees and rocks are not the problem. They don’t cost that much.

Geom = 1 Object

In Blender, if you have 1 object but it has 4 different materials, it becomes 4 geoms and therefore 4 objects in JME.
That’s why it is better to use a special shader for characters that can apply “tattoos”/“skin color”/etc. with just a single material.

I did experiment with switching off stuff: trees and rocks are not the problem. They don’t cost that much.

Then try the terrain, maybe that is it?

Edit:
And see Paul’s response below, he is right: shadows duplicate the number of render calls too. So my suggested 200 objects would still be 400 anyway.
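
As a back-of-the-envelope check of these counts, here is a small sketch with hypothetical names. It assumes that each material on a model becomes one geometry, that every geometry casts shadows, and that each shadow pass renders all of them again. That is a simplification, since the real number depends on culling and on how many shadow maps the renderer uses.

```java
/** Rough estimate of how many objects end up being rendered per frame. */
final class ObjectCountEstimate {

    /**
     * @param models             number of model instances in the scene
     * @param geometriesPerModel geometries per model (one per material on the model)
     * @param shadowPasses       extra passes in which every geometry is rendered again
     */
    static int estimate(int models, int geometriesPerModel, int shadowPasses) {
        return models * geometriesPerModel * (1 + shadowPasses);
    }

    public static void main(String[] args) {
        System.out.println("humans, no shadows:    " + estimate(200, 1, 0));
        System.out.println("humans, 1 shadow pass: " + estimate(200, 1, 1));
        System.out.println("trees, 4 materials:    " + estimate(100, 4, 0));
    }
}
```

With 200 single-geometry characters and one shadow pass this gives 400, the figure mentioned above. Three shadow passes would give 800, which is in the range reported for the humans, though that match may be a coincidence.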