How does jMonkeyEngine do Geometry Instancing?

Instancing will use a single draw call. It’s not as good as one big mesh because the driver/GPU still has to do a bunch of separate internal draws but it’s definitely better than a bunch of separate draw calls.

If it were me, I’d probably use a custom mesh and update the vertexes myself. It’s going to be waaaaaay more efficient than JME’s automatic batching. But the automatic instancing stuff might get you where you want to go.

That’s too advanced for me :) This is my first experience with 3D, so I’ll go with the easier solutions.

Can you please clarify what you mean by automatic instancing? As I understand it, it’s not implemented in the engine right now?

In short: use InstancedNode.

You can take a look at the instancing examples here

You need to create an InstancedNode, attach all the instances to it, and then call the instance() method on it. Note that every time you add a new object to the node, you should call instance() again.

Note that instances must have the same material.

Wow! I didn’t know about it! That’s great news for me! Thanks! It does all the magic :)

It seems to lag when adding a large number of geometries to the InstancedNode, so I will try adding them in smaller chunks.

Thanks again!

To clarify, I mean the exact same reference (==), not a clone. So if you are cloning the tree parts, make sure to use clone(false) so it won’t clone the material.

The same goes for meshes as well.
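
To illustrate why the exact same reference matters, here is a minimal, engine-independent sketch. `Mesh`, `Material`, `Geom`, and `countBatches` are hypothetical stand-ins, not jMonkeyEngine classes. The sketch only assumes that instances are grouped by the identity (==) of their mesh and material, as described above; it does not show how JME does this internally.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class SharedReferenceSketch {
    // Hypothetical stand-ins for engine types; NOT jMonkeyEngine classes.
    static final class Mesh {}
    static final class Material {}

    static final class Geom {
        final Mesh mesh;
        final Material material;
        Geom(Mesh mesh, Material material) {
            this.mesh = mesh;
            this.material = material;
        }
    }

    /** Counts groups of geometries that share the same mesh AND material objects (==). */
    static int countBatches(List<Geom> geoms) {
        Map<Mesh, Map<Material, Integer>> groups = new IdentityHashMap<>();
        int batches = 0;
        for (Geom g : geoms) {
            Map<Material, Integer> byMaterial =
                    groups.computeIfAbsent(g.mesh, m -> new IdentityHashMap<>());
            if (byMaterial.merge(g.material, 1, Integer::sum) == 1) {
                batches++; // first time this exact (mesh, material) pair is seen
            }
        }
        return batches;
    }

    /** Builds n geometries that all share one mesh and one material object. */
    static List<Geom> shared(int n) {
        Mesh mesh = new Mesh();
        Material material = new Material();
        List<Geom> geoms = new ArrayList<>();
        for (int i = 0; i < n; i++) geoms.add(new Geom(mesh, material));
        return geoms;
    }

    /** Builds n geometries that share a mesh but each get their own material copy. */
    static List<Geom> clonedMaterials(int n) {
        Mesh mesh = new Mesh();
        List<Geom> geoms = new ArrayList<>();
        for (int i = 0; i < n; i++) geoms.add(new Geom(mesh, new Material()));
        return geoms;
    }
}
```

With a shared material, `countBatches(shared(100))` yields one group; with per-object copies, `countBatches(clonedMaterials(100))` yields 100 groups, which is the situation clone(false) is meant to avoid.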

It might also be worth mentioning that JME culls (does not render) instances that are outside the camera view when it sends the instance data buffer to the GPU.

Yeah, I get this. Thanks!

Does InstanceNode use glDrawElementsInstanced calls or what does it use?

I miss instanced drawing. For simple particle instances, it is far better. I could push out over 100k meshes with no performance issues, but with JME not doing it, I couldn’t do 20% of that without affecting performance.

In my own engine I was doing the following for instanced batching.

    private void renderChunkInstanced(List<GameItem> gameItems, Transformation transformation, Matrix4f viewMatrix, boolean view3d) 
    {
        this.modelViewBuffer.clear();
        this.colorMatrixBuffer.clear();
        this.modelPosBuffer.clear();
        this.textureAtlasBuffer.clear();
        
        int i = 0;
        Texture text = gameItems.get(0).getTexture();
        for (GameItem gameItem : gameItems) {
            // Model-view (or orthographic) matrix for this instance
            Matrix4f modelViewMatrix;
            if (view3d)
                modelViewMatrix = transformation.getModelViewMatrix(gameItem, viewMatrix);
            else
                modelViewMatrix = transformation.getOrtoProjModelMatrix(gameItem, viewMatrix);
            modelViewMatrix.get(MATRIX_SIZE_FLOATS * i, modelViewBuffer);

            // Per-instance color (RGBA): 4 floats per instance, assuming the
            // index argument is counted in floats (as in JOML)
            Vector4f color = new Vector4f(gameItem.getColor().x, gameItem.getColor().y, gameItem.getColor().z, gameItem.getTranslusentLevel());
            color.get(4 * i, colorMatrixBuffer);

            // Per-instance position: 4 floats per instance
            Vector4f pos = new Vector4f(gameItem.getPosition().x, gameItem.getPosition().y, gameItem.getPosition().z, 1.0f);
            pos.get(4 * i, modelPosBuffer);

            // Texture atlas offset: 2 floats per instance
            if (text != null) {
                int col = gameItem.getTextPos() % text.getNumCols();
                int row = gameItem.getTextPos() / text.getNumCols();
                float textXOffset = (float) col / text.getNumCols();
                float textYOffset = (float) row / text.getNumRows();
                Vector2f pos2f = new Vector2f(textXOffset, textYOffset);
                pos2f.get(2 * i, textureAtlasBuffer);
            } else {
                Vector2f pos2f = new Vector2f(1.0f, 1.0f);
                pos2f.get(2 * i, textureAtlasBuffer);
            }

            // Advance only after every buffer has been written for this instance
            i++;
        }

        glBindBuffer(GL_ARRAY_BUFFER, modelViewVBO);
        glBufferData(GL_ARRAY_BUFFER, modelViewBuffer, GL_DYNAMIC_DRAW);

        glBindBuffer(GL_ARRAY_BUFFER, colorMatrixVBO);
        glBufferData(GL_ARRAY_BUFFER, colorMatrixBuffer, GL_DYNAMIC_DRAW);

        glBindBuffer(GL_ARRAY_BUFFER, modelPosVBO);
        glBufferData(GL_ARRAY_BUFFER, modelPosBuffer, GL_DYNAMIC_DRAW);

        glBindBuffer(GL_ARRAY_BUFFER, textureAtlasVBO);
        glBufferData(GL_ARRAY_BUFFER, textureAtlasBuffer, GL_DYNAMIC_DRAW);
        
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, i_id);
        glDrawElementsInstanced(GL_TRIANGLES, draw_count, GL_UNSIGNED_INT, 0, gameItems.size());

        glBindBuffer(GL_ARRAY_BUFFER, 0);
    }

It would allow 100k+ instances without performance hits. If InstancedNode used glDrawElementsInstanced, it would be worth converting my particle emitter to use it.
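
The per-instance packing in the listing above can be sketched without OpenGL. This is a minimal sketch, assuming a layout of 4 floats for color and 2 floats for an atlas offset per instance. `InstancePacking`, `packColors`, and `packAtlasOffsets` are hypothetical names, and the absolute-index `FloatBuffer.put` calls stand in for whatever math library fills the buffers. The point is that slot `i` always starts at `stride * i` for every attribute of the same instance.

```java
import java.nio.FloatBuffer;

final class InstancePacking {
    static final int COLOR_FLOATS = 4; // r, g, b, a
    static final int ATLAS_FLOATS = 2; // u offset, v offset

    /** Packs one RGBA color per instance; each colors[i] must hold 4 floats. */
    static FloatBuffer packColors(float[][] colors) {
        // Heap buffer keeps the sketch dependency-free; a real GL upload
        // through LWJGL would need a direct buffer instead.
        FloatBuffer buf = FloatBuffer.allocate(colors.length * COLOR_FLOATS);
        for (int i = 0; i < colors.length; i++) {
            int base = COLOR_FLOATS * i; // start of instance i's slot, in floats
            for (int c = 0; c < COLOR_FLOATS; c++) {
                buf.put(base + c, colors[i][c]);
            }
        }
        return buf;
    }

    /** Converts a tile index in a cols x rows sprite sheet to a (u, v) offset per instance. */
    static FloatBuffer packAtlasOffsets(int[] tileIndices, int cols, int rows) {
        FloatBuffer buf = FloatBuffer.allocate(tileIndices.length * ATLAS_FLOATS);
        for (int i = 0; i < tileIndices.length; i++) {
            int base = ATLAS_FLOATS * i;
            int col = tileIndices[i] % cols;
            int row = tileIndices[i] / cols;
            buf.put(base, (float) col / cols);
            buf.put(base + 1, (float) row / rows);
        }
        return buf;
    }
}
```

Computing `base` once per instance and per attribute avoids the off-by-one that appears when the counter is incremented between writes.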

I just tried “TestInstanceNode”.

It is set to 30 instances. If you change it to 300, it basically crashes JME: the frame rate drops below 1 fps.

Even 100 instances really affects the FPS, dropping it to around 20 fps.

It must not be using glDrawElementsInstanced.

I can’t speak to the efficiency of TestInstanceNode, but at the mesh level it works fine.

It uses glDrawArraysInstancedARB

Note, 30 is not the number of instances, it will create 3600 instances. Setting it to 300 will create 360000 instances.
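
The arithmetic behind those counts can be checked with a short sketch. It assumes the test builds its grid with two nested loops that each run from -n up to n, which is consistent with the numbers quoted above, but the test source is not shown here. `InstanceGridCount` and `countInstances` are hypothetical names.

```java
final class InstanceGridCount {
    /**
     * Counts the objects created by two nested loops that each run from
     * -n (inclusive) to n (exclusive). Assumed loop shape, not copied
     * from the actual test.
     */
    static int countInstances(int n) {
        int count = 0;
        for (int y = -n; y < n; y++) {
            for (int x = -n; x < n; x++) {
                count++;
            }
        }
        return count; // equals (2 * n) * (2 * n)
    }
}
```

Under that assumption, a setting of 30 gives 60 x 60 = 3600 instances and a setting of 300 gives 600 x 600 = 360000.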

Thanks for the correction on the count. I didn’t notice that the loops start at the negative value, so each dimension is doubled and then the two are multiplied together.

Do you know what controls the number of instances drawn at a time? I can’t locate it. On my machine it draws in ranges of about 2,000 objects at a time, which is very low.

Can you control this and increase the batch draw size?

When I set it to 90, that is 32,400 instances.

Sorry, I just noticed there is no limit; it was limited because of the random material usage.
If you use one material, it renders all of them at the same time.

With 300, it did a GPU call for 90,000+ geometries.

The FPS was very poor: 3 fps.

But I see that on every frame there is a lot of looping through all the items, which takes a huge hit.

One issue with having the scene graph do auto-instancing is that there will always be scene graph overhead. For a super-large number of objects, that’s going to be significant… especially if all of them are moving and stuff. It’s convenient but it comes with those “everything is duplicated and we still have to manage a million objects” limitations.

But the real instancing support is down in the Mesh and it’s pretty easy to construct a raw Mesh that works with JME’s materials and uses instancing. Then it’s always just one object in the scene graph and you just have to edit the transform buffers yourself. It’s a trade-off between performance and simpler code. Though for me, I’ve never found the raw buffer updates to be particularly onerous.

One thing I noticed in TestInstanceNode is that if the objects do not move, the fps goes up: in my case from 170 fps (when the objects are moving) to 230 fps (when they are not). I cannot explain why.

Thanks, I found an example in Shaderblow that uses instancing and draws over 90k items at 200+ fps.

I’m going to use that technique for my particle emitter for my rain/snow.

Thanks for helping. This is basically like what I was doing in my old engine.

Can you share the link to that example?

Sure, no problem.

He is doing his own instancing, outside of the JME architecture. He just builds his own vertex buffers and then updates the vertex data in simpleUpdate. Then he uses a simple shader to draw them.

This is with 80k instances, doing a simple update to move the rain drops to the ground and then reposition them to the top again to simulate rain.

I’m still getting 500+ fps on this. This is a simple quad using a sprite sheet, so I altered the texture coordinates.
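
The update step described above (move each drop toward the ground, then put it back at the top) can be sketched without any engine code. This is only an illustration under stated assumptions: `RainUpdate`, `fall`, and all of its parameters are hypothetical, and drops are reduced to a single height value each. In the setup described, the resulting values would then be written into the vertex buffer every frame.

```java
final class RainUpdate {
    /**
     * Moves every drop down by speed * tpf and wraps drops that pass the
     * ground back toward the top, keeping the overshoot so the spacing
     * between drops stays even.
     */
    static void fall(float[] heights, float speed, float tpf, float groundY, float topY) {
        float range = topY - groundY;
        if (range <= 0f) {
            throw new IllegalArgumentException("topY must be above groundY");
        }
        for (int i = 0; i < heights.length; i++) {
            float y = heights[i] - speed * tpf;
            while (y < groundY) {
                y += range; // reposition at the top instead of creating a new drop
            }
            heights[i] = y;
        }
    }
}
```

Because the array is updated in place, the per-frame cost is one pass over the drops with no allocation, which fits the idea of keeping a single object in the scene graph and editing its buffers directly.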
