Sprite batching for Monkeysheet

Ogli · September 28, 2016, 3:22pm

Good question. I would guess that it batches all child-Quads in one pass.
In that case it would not really matter.
You either do it manually or let the BatchNode do the same work.
I would opt for the clearer code with less complexity.

But if you remove randomly from the inside very often:
That would indicate using a linked list instead of an array of all colors.

Pesegato · September 30, 2016, 9:29am

Experimenting further, I’d like to change parameters at runtime, otherwise where’s the fun?

    public void simpleUpdate(float tpf){
        for (int i = 0; i < 10000; i++) {
            //allMyColors[i].r++;
            //if (allMyColors[i].r==255)
                allMyColors[i].r=0;
            allQuads[i].setBuffer(VertexBuffer.Type.Color, 4, new float[]{allMyColors[i].r, allMyColors[i].g, allMyColors[i].b, allMyColors[i].a,
                allMyColors[i].r, allMyColors[i].g, allMyColors[i].b, allMyColors[i].a,
                allMyColors[i].r, allMyColors[i].g, allMyColors[i].b, allMyColors[i].a,
                allMyColors[i].r, allMyColors[i].g, allMyColors[i].b, allMyColors[i].a});
        }        
        //bn.batch();
    }

However, I’m doing it wrong: the framerate drops, and there is no visual change (the quads remain at their original color)…

pspeed · September 30, 2016, 9:37am

Well, if you change their color then you obviously have to rebatch them.

Pesegato · September 30, 2016, 9:53am

Already tried that

    public void simpleUpdate(float tpf){
        for (int i = 0; i < 10000; i++) {
            allMyColors[i] = ColorRGBA.randomColor();
            //allMyColors[i].r++;
            //if (allMyColors[i].r==255)
            allQuads[i].setBuffer(VertexBuffer.Type.Color, 4, new float[]{allMyColors[i].r, allMyColors[i].g, allMyColors[i].b, allMyColors[i].a,
                allMyColors[i].r, allMyColors[i].g, allMyColors[i].b, allMyColors[i].a,
                allMyColors[i].r, allMyColors[i].g, allMyColors[i].b, allMyColors[i].a,
                allMyColors[i].r, allMyColors[i].g, allMyColors[i].b, allMyColors[i].a});
        }        
        bn.batch();
    }

Same as above: no color change, and the framerate drops.

pspeed · September 30, 2016, 9:55am

Well, sure… you are changing everything every frame. That’s the worst case scenario for batching.

If this is a batch node, it’s possible you may need to do something else to let it know that a particular child is dirty. I think normally it won’t rebatch unless things have moved.
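
A minimal sketch of the "mark a particular child as dirty" idea, so that only sprites that actually changed get touched each frame. `DirtySprites` is a hypothetical helper, not a jME class, and the callback passed to `flush` stands in for whatever call ends up refreshing one geometry in the batch.

```java
import java.util.BitSet;
import java.util.function.IntConsumer;

/** Tracks which sprites changed since the last flush (hypothetical helper, not part of jME). */
final class DirtySprites {
    private final BitSet dirty;

    DirtySprites(int spriteCount) {
        dirty = new BitSet(spriteCount);
    }

    /** Call whenever sprite i's color (or other per-vertex data) changes. */
    void markDirty(int i) {
        dirty.set(i);
    }

    /** Visits each dirty index once in ascending order, clears the set, returns the count. */
    int flush(IntConsumer refreshOne) {
        int count = 0;
        for (int i = dirty.nextSetBit(0); i >= 0; i = dirty.nextSetBit(i + 1)) {
            refreshOne.accept(i); // placeholder: refresh only sprite i in the batch
            count++;
        }
        dirty.clear();
        return count;
    }
}
```

With this, the per-frame loop would call `markDirty(i)` only for sprites whose data changed, and `flush` once at the end of the update.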

Pesegato · September 30, 2016, 9:59am

I also guess that having 10000 new(s) per frame is making the GC sad… however, this can be sorted out.
What I don’t get is why the quads don’t update the colors.

Pesegato · September 30, 2016, 10:00am

Ah, ok.

pspeed · September 30, 2016, 10:04am

Looking at the javadoc, could be that you can just manually call:
http://javadoc.jmonkeyengine.org/com/jme3/scene/BatchNode.html#onMeshChange-com.jme3.scene.Geometry-

Else, I’m not sure how the Geometry knows the mesh has changed to update it anyway. It does mean that you need to keep the relationship from quad to your geometry around somehow.

Pesegato · September 30, 2016, 10:08am

java.lang.UnsupportedOperationException: Cannot set the mesh of a batched geometry
at com.jme3.scene.BatchNode.onMeshChange(BatchNode.java:116)

pspeed wrote:
“It does mean that you need to keep the relationship from quad to your geometry around somehow.”

Please elaborate… I have no clue how to proceed. Or do I have to create my own custom mesh with vertices and all?

pspeed · September 30, 2016, 12:03pm

Try onTransformChange()… I’m not sure why none of the other onXXX methods are implemented but that one doesn’t actually check to see if the transform is changed, it just rebatches that one geometry.

I just meant that your loop goes through quads… which doesn’t do you any damned good if you need geometries… ergo, you must have some way of saying “Gee, what’s the geometry for this quad?” Which I feel like is more or less exactly what I said… but I don’t know. Deedle deedle doo, wubba wubba wubba.
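
One way to answer "what's the geometry for this quad?" is a lookup table filled in when the quads are created. The sketch below is generic so that it stays dependency-free; `MeshOwnerRegistry` is a hypothetical name, and in jME the type parameters would presumably be `Quad` and `Geometry`.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Remembers which geometry owns which mesh (hypothetical helper, not part of jME). */
final class MeshOwnerRegistry<M, G> {
    // Identity-based: two distinct mesh objects never collide, even if they compare equal.
    private final Map<M, G> ownerByMesh = new IdentityHashMap<>();

    /** Call once when the geometry is created around the mesh. */
    void register(M mesh, G geometry) {
        ownerByMesh.put(mesh, geometry);
    }

    /** Returns the geometry registered for this mesh, or null if none was registered. */
    G ownerOf(M mesh) {
        return ownerByMesh.get(mesh);
    }
}
```

A simpler alternative may be to keep an array of geometries in place of the array of quads and reach each mesh through `Geometry.getMesh()`.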

Ogli · September 30, 2016, 12:41pm

Yes, I think what you want to do is bad too.

Problem #1: iterating over 10,000 items each frame and allocating 10,000 new float arrays each frame costs a lot of CPU and RAM. You can counter the memory churn by reusing the float arrays or, perhaps better, by writing into the existing buffer directly, depending on the code inside jME.

Problem #2: uploading a big batch every frame and dropping the old one spams the GPU and VRAM. I guess this is less of a problem, since uploading 10,000 x 4 vertices each frame should not be hard for a current-gen graphics card and the bus that transports the data.
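
For Problem #1, a minimal sketch of writing colors into an existing buffer instead of allocating a new `float[]` per quad per frame. It uses a plain `java.nio.FloatBuffer` to stay self-contained; `ColorBufferReuse` is a hypothetical name. In jME the buffer would presumably come from the mesh's color `VertexBuffer` and would need to be flagged as updated afterwards, which is an assumption to check against the jME source.

```java
import java.nio.FloatBuffer;

/** Rewrites a quad's vertex colors in place (hypothetical helper, not part of jME). */
final class ColorBufferReuse {
    static final int VERTS_PER_QUAD = 4;

    /** Overwrites the 4 vertex colors of one quad; allocates nothing. */
    static void writeQuadColor(FloatBuffer colors, float r, float g, float b, float a) {
        colors.clear(); // resets position and limit; the old contents are simply overwritten
        for (int v = 0; v < VERTS_PER_QUAD; v++) {
            colors.put(r).put(g).put(b).put(a);
        }
        colors.flip(); // make the 16 floats readable from position 0
    }

    public static void main(String[] args) {
        // A heap buffer keeps the sketch dependency-free; jME typically uses direct buffers.
        FloatBuffer colors = FloatBuffer.allocate(VERTS_PER_QUAD * 4);
        writeQuadColor(colors, 1f, 0f, 0f, 1f); // color components as floats in the 0..1 range
        System.out.println(colors.get(0) + " " + colors.get(15));
    }
}
```

The buffer is allocated once per quad at setup time, so the per-frame loop produces no garbage.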

Maybe it’s about time to reveal what you want to do. What’s your goal? Depending on that, we might suggest a certain route and a certain technology for it.

Pesegato · September 30, 2016, 1:03pm

As the topic says: Sprite batching for Monkeysheet

Once I’ve understood how buffer management works with batching, I’ll modify the Material to manage the monkeysheet animation and put these values in the buffer.

Ogli · September 30, 2016, 1:17pm

Still not clear to me.

Maybe you could start explaining at the end and not at the middle. The “goal” means a description of what you want in the end. Then we could discuss steps to get there (or optimizations for steps to get there).

pspeed · September 30, 2016, 1:20pm

He’s writing a sprite library. He’s trying to learn how to incorporate batching into it so he’s playing around with stuff.

Pesegato · September 30, 2016, 1:22pm

This is the shader that draws the sprite out of a spritesheet:

github.com/Pesegato/MonkeySheet/blob/master/src/main/resources/MonkeySheet/MatDefs/Anim.vert

    uniform mat4 g_WorldViewProjectionMatrix;
    uniform float m_SizeX;
    uniform float m_SizeY;
    uniform float m_Position;
    uniform float m_FlipHorizontal;

    attribute vec3 inPosition;
    attribute vec2 inTexCoord;
    attribute float inTexCoord2;
    attribute float inTexCoord3;

    varying float vAlpha;
    varying vec2 texCoord;

    void main(){

        float t = m_Position;
        #ifdef HAS_VERTEXSHEETPOS
            t = inTexCoord2;
            vAlpha = inTexCoord3;

(This file has been truncated.)

Each sprite is a quad with this shader; so I’ll probably want to put m_Position on a buffer.
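
Since the shader reads `inTexCoord2` per vertex, putting the position on a buffer presumably means repeating each sprite's value for all four vertices of its quad. A minimal sketch, assuming 4 vertices per quad and one float per vertex; `SheetPositionBuffer` is a hypothetical name, and how the buffer gets attached to the mesh is left out.

```java
import java.nio.FloatBuffer;

/** Expands one sheet position per sprite into one value per vertex (hypothetical helper). */
final class SheetPositionBuffer {
    /** Writes perSprite[i] four times, once for each vertex of quad i. */
    static void fill(FloatBuffer out, float[] perSprite) {
        out.clear();
        for (float p : perSprite) {
            out.put(p).put(p).put(p).put(p);
        }
        out.flip(); // readable range is now 4 * perSprite.length floats
    }
}
```

The output buffer needs a capacity of at least `4 * perSprite.length` and can be reused across frames.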

Ogli · September 30, 2016, 1:39pm

If the goal is a 2D game made of sprites, where the world has static levels that scroll as you walk, some characters run around, and some particles fly around (see: lots of “if”), then maybe do it like this:

Have your texture (sprite sheet) for background elements + build a large batch for the level (or clusters, each being the size of a game screen).

Have another texture with all the animations of the game characters + make one Quad for each character for each frame (or a small batch for all characters on the screen - similar to particle systems).

Have several more textures (texture atlases, sprite sheets) for the particles and look for an efficient way to implement a particle system (starting with how jME particle systems work).

As far as I can see, you won’t need a custom shader and could use Unshaded.j3md or something like that.

And yes, if BatchNode doesn’t work as expected, then maybe look at how particle systems do it (I think they use a custom mesh consisting of a list of quads with some magic code to orient those quads to face the camera). That might be the better starting point for what you are trying to do.
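
To illustrate the "custom mesh consisting of a list of quads" idea, here is a sketch that builds the raw position and index arrays for N quads in one mesh. It is not how jME's particle system is implemented; `QuadListMesh` and the layout (unit quads placed side by side along x) are assumptions made for the example.

```java
/** Builds vertex and index data for many quads in a single mesh (hypothetical helper). */
final class QuadListMesh {
    /** x,y,z for 4 corners per quad; quad i is a unit square with its lower-left corner at (i, 0, 0). */
    static float[] positions(int quadCount) {
        float[] p = new float[quadCount * 4 * 3];
        for (int i = 0; i < quadCount; i++) {
            int o = i * 12;
            float x = i;
            p[o]     = x;      p[o + 1]  = 0f; p[o + 2]  = 0f; // bottom-left
            p[o + 3] = x + 1f; p[o + 4]  = 0f; p[o + 5]  = 0f; // bottom-right
            p[o + 6] = x + 1f; p[o + 7]  = 1f; p[o + 8]  = 0f; // top-right
            p[o + 9] = x;      p[o + 10] = 1f; p[o + 11] = 0f; // top-left
        }
        return p;
    }

    /** Two counter-clockwise triangles per quad, indexing into the 4 corners above. */
    static int[] indices(int quadCount) {
        int[] idx = new int[quadCount * 6];
        for (int i = 0; i < quadCount; i++) {
            int base = i * 4;
            int o = i * 6;
            idx[o]     = base; idx[o + 1] = base + 1; idx[o + 2] = base + 2;
            idx[o + 3] = base; idx[o + 4] = base + 2; idx[o + 5] = base + 3;
        }
        return idx;
    }
}
```

Updating one sprite then means rewriting its 4 vertices inside the shared arrays, with no per-sprite geometry involved.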

Ogli · September 30, 2016, 1:53pm

And … in the end it’s just a question of how much you can do on the GPU in order to optimize performance.

If you use Unshaded.j3md and some CPU-based code then you don’t need a special shader and you will pass only vertex positions and maybe colors down to the GPU. I know how to code this, it’s easy if you ever made a “custom mesh”.

If you want to utilize more GPU power then you will need to pass more stuff down to the GPU (e.g. a buffer that tells the shader which frame of the sprite’s animation to use; this would be another buffer, similar to the color buffer, holding a value in the [0…1] interval, with 0 being the start of the animation and 1 the end). I currently don’t know the best way to utilize the power of the GPU for this, so maybe ask someone else for hints.
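
The [0…1] animation value can be turned into a frame index and a texture rectangle with a little arithmetic. The sketch below does it on the CPU in Java; a shader would do the same math. `SpriteFrames` is a hypothetical name, and the grid layout (row-major frames, row 0 at v = 0) is an assumption.

```java
/** Maps a normalized animation position to a sprite-sheet frame (hypothetical helper). */
final class SpriteFrames {
    /** Maps t in [0,1] to a frame index in [0, frameCount - 1]; t = 1 yields the last frame. */
    static int frameIndex(float t, int frameCount) {
        float clamped = Math.max(0f, Math.min(1f, t));
        return Math.min(frameCount - 1, (int) (clamped * frameCount));
    }

    /** Returns {u0, v0, u1, v1} for a frame in a cols x rows grid, frames laid out row by row. */
    static float[] uvRect(int frame, int cols, int rows) {
        int col = frame % cols;
        int row = frame / cols;
        return new float[] {
            (float) col / cols,       (float) row / rows,
            (float) (col + 1) / cols, (float) (row + 1) / rows
        };
    }
}
```

For example, with 8 frames in a 4 x 2 sheet, `frameIndex(0.5f, 8)` selects frame 4, which `uvRect(4, 4, 2)` places at column 0 of the second row.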

There are always tradeoffs like these and usually the CPU is the weak point. But for me the clearer code is CPU-based code (Java or C++). Maybe get it to work with CPU-code first and then try to optimize it by writing a GPU-shader and see how much performance you can gain. It’s difficult to tell.

EDIT:

Usually the position is the “vertex-buffer” (i.e. the one buffer you set in the custom mesh that holds the x,y,z positions).

Pesegato · September 30, 2016, 2:02pm

Same result as before.

You touched on many points, but I’d like to focus on sprite batching. Hacking around the code I provided is probably the best way to assess what is viable and what is not.
And yes, some of your ideas are also mine.

Pesegato · October 3, 2016, 12:34pm

onGeometryUnassociated did the trick. However, I get less than 1 FPS as everything gets rebatched…

Pesegato · October 3, 2016, 12:42pm

I’ve discovered that with the same setup, I get about 16 FPS if I DON’T use batching…