How does jMonkeyEngine do Geometry Instancing?

giantmustache · June 5, 2011, 8:22pm

I’m still working on my grass project and having an issue with drawing too many meshes.

At the moment, I am creating a new mesh for every blade of grass, which is wrong of course, but the blades all have different sizes, so every mesh is a little bit different.

But suppose that we have the same mesh for every blade. Each blade would still have a different Material, because each one has a different wind angle as a uniform.

How can I do geometry instancing to use only one mesh over and over again?

Do I have to call this myself, like a DrawInstanced() in OpenGL? Or does jMonkey handle this for me?
I saw a package called scene.geometryinstancing in jme2, but it doesn't seem to be in jme3 anymore.

normen · June 6, 2011, 12:02am

If you load your geometry from the assetManager, the mesh is shared as long as there is no animation bound to it. Otherwise you can just reference a mesh multiple times. Look at the box example: a geometry is created from the Box mesh, and you can create multiple geometries from one mesh.

giantmustache · June 6, 2011, 12:03am

So jMonkeyEngine always uses instancing? That’s a good thing! Nice to know. Thank you.

Momoko_Fan · June 6, 2011, 12:39am

If you have many grass blades, it's probably best to use batching instead of instancing. See the GeometryBatchFactory class for how to do it.

johncl · November 4, 2011, 11:53am

I have also been wondering about this. So if you make several geometries sharing the same mesh, the mesh will actually be shared on the GPU as well? But each geometry is still treated as a separate Spatial in the rendering pipeline and each will get an individual draw call, right?

So if several meshes using the same material are close to each other you are better off just merging these right? But if you merge all instances spread over a big area you will get lower framerates since they are not culled if one is visible right?

Just want to get these things clear. It seems that things like grass or small pebbles on the ground are often wise to group into blocks, to greatly reduce the number of draw calls. These kinds of optimizations are rather fun, as you get immediate feedback in the form of higher fps when the scene is packed with details.

normen · November 4, 2011, 12:54pm

Yes

nehon · November 4, 2011, 5:06pm

@johncl said:
But if you merge all instances spread over a big area you will get lower framerates since they are not culled if one is visible right?

Actually, from my tests, it's almost always worth a shot. It depends on many things, though: it can be slower and it can be faster, depending on how many geoms you have, how complex they are, and how they are distributed over the area.
The thing is, culling is done before the geom is sent to the GPU. So it's vital to have it, because the fewer geoms you send to the GPU, the faster your scene will render.
If you batch a group of objects, you'll have one and only one object to send to the GPU, and the overhead of sending a new geometry to the GPU can be huge compared to that of processing more polygons. Current GPUs are built to process huge batches of polygons but are quite bad at processing lots of separate geometries.
Also, OpenGL automatically clips polygons that are outside the view frustum, so polygons belonging to geometries that would have been culled won't be rasterized anyway.

But right now batching comes with big downsides: batched geometries can no longer be transformed individually, and they need to share the same material.
However, we are looking for ways to address those issues.
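The trade-off described above (fewer draw calls versus coarser culling) can be illustrated with a toy model. This is only a sketch and not jME3 code: the `Drawable` class, the 1D "world" and the numbers are assumptions made up for illustration, and a simple interval test stands in for the real bounding-volume culling.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchCullingSketch {

    /** A drawable occupying [min, max] on a 1D "world" axis; it costs one draw call if visible. */
    static final class Drawable {
        final double min, max;
        final int triangles;

        Drawable(double min, double max, int triangles) {
            this.min = min;
            this.max = max;
            this.triangles = triangles;
        }

        /** Stand-in for the CPU-side bounding-volume test done before drawing. */
        boolean isVisible(double viewMin, double viewMax) {
            return max >= viewMin && min <= viewMax;
        }
    }

    /** Returns {drawCalls, trianglesSubmitted} for one frame. */
    static int[] render(List<Drawable> scene, double viewMin, double viewMax) {
        int calls = 0;
        int triangles = 0;
        for (Drawable d : scene) {
            if (d.isVisible(viewMin, viewMax)) {
                calls++;
                triangles += d.triangles;
            }
        }
        return new int[] {calls, triangles};
    }

    /** One grass blade (2 triangles) at every integer position 0..count-1. */
    static List<Drawable> blades(int count) {
        List<Drawable> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            out.add(new Drawable(i, i, 2));
        }
        return out;
    }

    /** Merges each run of groupSize consecutive drawables into a single drawable. */
    static List<Drawable> batch(List<Drawable> in, int groupSize) {
        List<Drawable> out = new ArrayList<>();
        for (int start = 0; start < in.size(); start += groupSize) {
            int end = Math.min(start + groupSize, in.size());
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            int triangles = 0;
            for (Drawable d : in.subList(start, end)) {
                min = Math.min(min, d.min);
                max = Math.max(max, d.max);
                triangles += d.triangles;
            }
            out.add(new Drawable(min, max, triangles));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Drawable> single = blades(1000);
        double viewMin = 250, viewMax = 349; // the camera sees a tenth of the field
        int[] a = render(single, viewMin, viewMax);
        int[] b = render(batch(single, 100), viewMin, viewMax);
        int[] c = render(batch(single, 1000), viewMin, viewMax);
        System.out.println("unbatched:     " + a[0] + " calls, " + a[1] + " triangles");
        System.out.println("blocks of 100: " + b[0] + " calls, " + b[1] + " triangles");
        System.out.println("one big batch: " + c[0] + " calls, " + c[1] + " triangles");
    }
}
```

In this model, batching into blocks cuts the draw calls sharply while submitting somewhat more triangles, and a single huge batch submits everything every frame. Which end wins in a real scene depends on the factors listed above, so it has to be measured.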

@giantmustache said:
So jMonkeyEngine always uses instancing? That's a good thing! Nice to know. Thank you.

No, that's not what Normen said. Your mesh is shared, which means there is only one instance of your mesh in memory.
Instancing in the OpenGL sense is the ability to render an object multiple times in one draw call, and this is not supported by JME at the moment.
It also requires the exact same mesh with the exact same material.

johncl · November 4, 2011, 6:42pm

Ah, thanks for clearing that up. I seem to recall reading that jme3 didn't support instancing of geometry, so even if you use the same mesh for two geometry objects, they will be sent to the GPU as two separate draws. Hence the speed increase from batching these meshes together if they share the same material: one draw is faster than two draws, even if the mesh sent is bigger. I saw this very quickly when I optimized the geometry for a tree I had that was using many quads for leaves. Merging the meshes for these quads gave a very noticeable increase in fps.

By the way, I have an odd problem with GeometryBatchFactory.mergeGeometries(). After I call it, I get a mesh that, when I add it to a Geometry, seems to be culled too early (when objects are close to the screen edge and my camera). Should I call something afterwards to set it up correctly? Perhaps some boundary calculations are missing?

johncl · November 4, 2011, 6:51pm

Hm, GeometryBatchFactory.optimize() works fine though, so I guess it's better to use that.

nehon · November 4, 2011, 7:59pm

Yep, use optimize().

asija · March 20, 2012, 11:57am

Just a quick question.

OpenGL geometry instancing (in this sense):

http://pyopengl.sourceforge.net/context/tutorials/shader_instanced.xhtml

OpenGL 3.2 Geometry Instancing Culling on GPU Demo | Geeks3D

http://http.developer.nvidia.com/GPUGems3/gpugems3_ch02.html

It is not currently supported in jMonkey, is it?

Is it planned for the future?

Or is there no need for it because it could be outperformed by other methods (BatchNode?)?

nehon · March 20, 2012, 12:28pm

No, it’s not supported right now (the renderer has everything needed to do it, but there is no API for the user).
That’s still under discussion. Geometry instancing is rather specific because it only works on geoms that have the same mesh and the same material. That’s not a situation you encounter in a lot of games. But since it’s a common OpenGL feature, I think we should support it.
BatchNode won’t outperform geometry instancing, because at best it does the same thing. So it can be as fast as geometry instancing (in theory; I did not run any comparison benchmark). But there are cases where it will be slower (when sub-geoms are moving).

asija · March 20, 2012, 2:56pm

Thanks for the clear answers.

@nehon said:
it only works on geoms that have the same mesh and same material. That’s not a situation you encounter in a lot of games.

Forests, big armies, and repeated buildings on one map look like pretty common things in games simulating the real world.

normen · March 20, 2012, 3:53pm

@asija said:
Forests, big armies, and repeated buildings on one map look like pretty common things in games simulating the real world.

Yes, but compared to the GeometryBatchFactory, you only get memory usage improvements from hardware instancing for static objects. Whether hardware instancing is really faster than the BatchNode when you move the separate geometries remains to be seen; you would have to transfer the locations to the GPU as well, so...

asija · March 20, 2012, 5:32pm

The truth is that I don’t know what BatchNode does to accelerate rendering if the objects are moving (what is the principle?).

But the principle of HW geometry instancing is to pass one buffer of N vertices of the mesh and one buffer of M instance transform matrices in order to get N x M vertices on the screen. That looks like a huge performance improvement over passing a mesh of N x M vertices all moved by the CPU, or passing M independent meshes (if CPU-GPU bandwidth and call overhead are an issue).
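The N versus N x M argument can be put into rough numbers. This is a back-of-the-envelope sketch only: it assumes position-only vertices (3 floats each) and one full 4x4 matrix (16 floats) per instance, ignores index buffers, normals and texture coordinates, and the class and method names are made up for illustration.

```java
public class InstancingUploadSketch {
    static final int FLOATS_PER_POSITION = 3; // assumption: xyz only
    static final int FLOATS_PER_MATRIX = 16;  // assumption: one 4x4 transform per instance

    /** Instancing: one mesh of n vertices plus m per-instance matrices. */
    static long instancedFloats(long n, long m) {
        return n * FLOATS_PER_POSITION + m * FLOATS_PER_MATRIX;
    }

    /** Batching: one big mesh holding n * m pre-transformed vertices. */
    static long batchedFloats(long n, long m) {
        return n * m * FLOATS_PER_POSITION;
    }

    public static void main(String[] args) {
        long n = 500;    // vertices in the shared mesh (hypothetical)
        long m = 10_000; // number of instances (hypothetical)
        System.out.println("instanced: " + instancedFloats(n, m) + " floats");
        System.out.println("batched:   " + batchedFloats(n, m) + " floats");
    }
}
```

The gap shrinks for very small meshes such as a grass blade with a handful of vertices, where the per-instance matrices start to dominate, so the numbers are worth redoing for the actual mesh in question.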

normen · March 20, 2012, 5:55pm

The principle is that your GPU can render a big mesh easily, but sending the information for rendering a single geometry object takes a long time (via the AGP/PCI bus). Hence BatchNode “simulates” separate geometry objects but in fact just sends one big mesh containing them all, and moves the vertices inside that mesh.
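The idea of one big mesh whose vertices are moved per sub-object can be sketched in plain Java. This is not how BatchNode is actually implemented; it is a minimal model under stated assumptions: every sub-object uses the same template mesh, only translation is supported, and `BatchBufferSketch` and its methods are hypothetical names.

```java
public final class BatchBufferSketch {
    private final float[] template;  // local-space xyz of one sub-mesh
    private final float[] positions; // all sub-meshes concatenated: the "one big mesh"

    BatchBufferSketch(float[] templateXyz, int instances) {
        this.template = templateXyz.clone();
        this.positions = new float[templateXyz.length * instances];
        for (int i = 0; i < instances; i++) {
            setTranslation(i, 0f, 0f, 0f);
        }
    }

    /**
     * "Moves" sub-object index by rewriting its vertex range on the CPU.
     * Returns the number of floats that changed and would need re-uploading.
     */
    int setTranslation(int index, float tx, float ty, float tz) {
        int base = index * template.length;
        for (int v = 0; v < template.length; v += 3) {
            positions[base + v] = template[v] + tx;
            positions[base + v + 1] = template[v + 1] + ty;
            positions[base + v + 2] = template[v + 2] + tz;
        }
        return template.length;
    }

    /** The whole batch is drawn as one mesh, so it costs a single draw call. */
    int drawCalls() {
        return 1;
    }

    float[] buffer() {
        return positions;
    }

    public static void main(String[] args) {
        float[] triangle = {0, 0, 0, 1, 0, 0, 0, 1, 0};
        BatchBufferSketch batch = new BatchBufferSketch(triangle, 3);
        int dirty = batch.setTranslation(1, 10f, 0f, 0f);
        System.out.println("floats rewritten: " + dirty + " of " + batch.buffer().length);
        System.out.println("draw calls: " + batch.drawCalls());
    }
}
```

The sketch makes the cost visible: moving one sub-object costs CPU work proportional to its vertex count plus a buffer update, in exchange for keeping everything in one draw call.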

zarch · March 20, 2012, 11:16pm

Presumably that means every time anything within the batch moves then the whole batch needs repackaging and resending? How bad is the overhead on that sort of thing?

i.e. clearly for frequently moving objects this would be a bad idea but for example doors that only occasionally get opened and closed would you consider them a good thing to be batched? Would you leave the door batched as it opens/closes or remove it from the batch while it moves and then batch it again after?

(This is more a theoretical discussion to clarify in my mind just when it’s useful to use batchnode rather than an immediate requirement I have).

normen · March 20, 2012, 11:22pm

Yes, at the moment it basically means that. We are looking into updating only the specific ranges, but it’s already a lot faster than the material switching overhead. As for what’s best for your game… only you can tell. Generally I’d batch everything I can, but obviously you have to keep the batches small enough that culling can still occur.

atomix · May 21, 2013, 4:18pm

Hey guys, sorry for waking the dead but…

… as we are migrating to lwjgl 2.9.0, and we recently got hardware skinning, I just want to ask if there is any chance we could make hardware geometry instancing easier to do.

Let’s say I want to draw thousands of static meshes and somehow have an “Instance_ID” passed to the vertex shader?

Going further, I’d like to render thousands of football fans moving and cheering in a stadium. I wrote the part that compresses the animation into a texture, but I don’t know how to tweak the Renderer to render instanced meshes.

I know the function below has been there for ages, but can you guys point out how to do something useful with it?

http://code.google.com/p/jmonkeyengine/source/browse/trunk/engine/src/lwjgl/com/jme3/renderer/lwjgl/LwjglRenderer.java?r=10618
[java]
public void drawTriangleArray(Mesh.Mode mode, int count, int vertCount) {
    if (count > 1) {
        // Instanced path: draw the same vertex range `count` times in one call.
        ARBDrawInstanced.glDrawArraysInstancedARB(convertElementMode(mode), 0,
                vertCount, count);
    } else {
        // Regular path: a single non-instanced draw.
        glDrawArrays(convertElementMode(mode), 0, vertCount);
    }
}
[/java]

for example:
http://http.developer.nvidia.com/GPUGems3/gpugems3_ch02.html

nehon · May 21, 2013, 4:48pm

The renderer supports it but there is no API for the user… so you can’t use it.
The thing is, instancing seems nice at first, but you have to understand that it can only render the same mesh with the same material.
So basically it rules out any animated mesh, unless you want all instances to play their animations in sync (hardware skinning won’t solve this issue).
BatchNode can cover any of the instancing use cases (as far as I know, but I may be wrong).

Your use case is not trivial though: you want batched/instanced geometries with individual animations.
Now that we have hardware skinning, maybe we could try to batch models with bone animations, and it could be pretty efficient…