Vertex shader displacement and batching

I’m trying to batch grass with an animated vertex shader (the one taken from the SimArboreal trees), but it doesn’t work properly. Without batching everything behaves as it should, but with batching the models move away from their positions. The farther a model is from the origin of the batched mesh (or so I think), the bigger the offset.

I’ve seen other posts where they use batching, and I know it is possible to achieve a good effect, but I can’t, at least not with that shader.

What am I missing? Is it because of the shader itself (I’ve been looking at it, but I’m not really a shader guru), or is there something more I must do apart from batching?

I’ve tried with BatchNode and manually batching with GeometryBatchFactory.optimize.

I’ve also tried instancing them (with the same effect). (I tested it wrong, it was always batching)

Could you post the shader code?

From the name of the shader I guess Paul uses that shader for trees, which use instancing instead of batching.

Your issue comes from line 177:

vec4 groundPos = worldMatrix * vec4(0.0, 0.0, 0.0, 1.0);

In your case, worldMatrix will always be the same for every grass blade, and the wind displacement is calculated as groundPos + offset.
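To make that concrete, here is a minimal CPU-side sketch in plain Java. It is not jME code and not the actual shader; the class and method names are made up for illustration, and it assumes a translation-only world matrix. It mimics worldMatrix * vec4(0.0, 0.0, 0.0, 1.0): unbatched objects each have their own matrix and therefore their own ground position, while a batch has a single matrix, so every object in it resolves to the batch origin.

```java
final class GroundPosDemo {

    /** Builds a column-major 4x4 translation matrix (the layout GLSL uses). */
    static float[] translation(float x, float y, float z) {
        return new float[] {
            1, 0, 0, 0,
            0, 1, 0, 0,
            0, 0, 1, 0,
            x, y, z, 1
        };
    }

    /** CPU equivalent of worldMatrix * vec4(0.0, 0.0, 0.0, 1.0). */
    static float[] groundPos(float[] worldMatrix) {
        float[] v = {0f, 0f, 0f, 1f};
        float[] result = new float[4];
        for (int row = 0; row < 4; row++) {
            for (int col = 0; col < 4; col++) {
                result[row] += worldMatrix[col * 4 + row] * v[col];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Unbatched: each grass object has its own world matrix.
        float[] a = groundPos(translation(5f, 0f, 3f));
        float[] b = groundPos(translation(-2f, 0f, 8f));
        System.out.println("unbatched A: " + a[0] + ", " + a[1] + ", " + a[2]);
        System.out.println("unbatched B: " + b[0] + ", " + b[1] + ", " + b[2]);

        // Batched: vertices are baked into batch space and one matrix is shared,
        // so both objects get the batch origin as their "ground position".
        float[] batch = groundPos(translation(0f, 0f, 0f));
        System.out.println("batched A and B: " + batch[0] + ", " + batch[1] + ", " + batch[2]);
    }
}
```

Anything the shader derives from that ground position is then computed relative to the batch origin rather than to each plant's own base.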

That makes sense, I should have seen it myself :S. So… is there a built-in way to know the original object position, or must I fill a vertex buffer (e.g. one of the texcoord ones) with that position when batching?

Thanks for the help.

Why are you using a tree shader for grass?

You should use the grass shader from the IsoSurfaceDemo.

I’m using it for grass because it is a type of grass that doesn’t need billboarding, and the plants aren’t grass blades but something more complex (crossing polygons). I think I already asked you about the main differences between the two shaders, and it was the in-shader billboarding. So, currently I’m using the tree one for both grass and trees (I need it to work for trees too, so I have the same problem there).

That shader is only meant for grass blades, and each blade is made up of a single triangle, isn’t it? The trick used there to find the groundPos wouldn’t work for trees or more complex plants, would it?

Yes, but trees have 17283659846593647345189 triangles so instancing is good.

Grass has, like 8 or 10 or so triangles, so batching is better.

Yes, I know, but not all GPUs support instancing, so I need some fallback solution.

I’m also aware of this, so I’m trying to have my “complex grass” batched but with a wind shader (the IsoSurface one is not suitable as it is just for blades, so my best bet is SimArboreal’s).

The only solution I can think of is to fill a buffer when batching so that the object position is available in the vertex shader.
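A minimal sketch of that buffer fill in plain Java, assuming every batched object has the same vertex count and that its vertices are stored contiguously in the batch. The class and method names are hypothetical. In jME the resulting array could presumably be attached to the batched mesh through a spare texcoord buffer and read in the shader as an attribute; that wiring is an assumption and is not shown here.

```java
final class OriginBufferSketch {

    /**
     * Repeats each object's origin once per vertex, so that every vertex of the
     * batched mesh carries the position of the object it belongs to.
     *
     * @param objectOrigins     one {x, y, z} triple per batched object
     * @param verticesPerObject vertex count of a single object (assumed equal for all)
     * @return a flat array with 3 floats per vertex, in batch vertex order
     */
    static float[] buildOriginBuffer(float[][] objectOrigins, int verticesPerObject) {
        float[] out = new float[objectOrigins.length * verticesPerObject * 3];
        int i = 0;
        for (float[] origin : objectOrigins) {
            for (int v = 0; v < verticesPerObject; v++) {
                out[i++] = origin[0];
                out[i++] = origin[1];
                out[i++] = origin[2];
            }
        }
        return out;
    }
}
```

The shader would then use this per-vertex origin in place of worldMatrix * vec4(0.0, 0.0, 0.0, 1.0). The cost is three extra floats per vertex in GPU memory.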

GL 3.1 release date: March 24, 2009.
I’m not sure what your target specs are, but you will hit limits everywhere if you target older systems. 512 MB of VRAM was considered high-end back then.

Another option would be passing an array of world positions to the material and using the InstanceId to get the correct one.
The fun part is that you don’t need different meshes for each batch. UBOs for the poor people :wink:
Note that each GPU has a register limit; if your array is too large, shader compilation will fail.
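One way to stay under such a limit is to split the positions into several smaller arrays and issue one draw per chunk. The sketch below is plain Java with hypothetical names; the maximum number of positions per draw is a caller-supplied assumption, since the real limit depends on the GPU and on what else the shader uses.

```java
import java.util.ArrayList;
import java.util.List;

final class PositionChunker {

    /**
     * Splits a flat {x, y, z, x, y, z, ...} array into chunks that each hold
     * at most maxPositionsPerDraw positions.
     */
    static List<float[]> chunk(float[] positions, int maxPositionsPerDraw) {
        if (maxPositionsPerDraw <= 0) {
            throw new IllegalArgumentException("maxPositionsPerDraw must be positive");
        }
        List<float[]> chunks = new ArrayList<>();
        int total = positions.length / 3;
        for (int start = 0; start < total; start += maxPositionsPerDraw) {
            int count = Math.min(maxPositionsPerDraw, total - start);
            float[] part = new float[count * 3];
            System.arraycopy(positions, start * 3, part, 0, count * 3);
            chunks.add(part);
        }
        return chunks;
    }
}
```

Each chunk would be uploaded as the array parameter for its own draw call, which keeps shader compilation safe but adds draw calls and uploads.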

Instancing is needed for that, isn’t it? So it won’t work for batching.

Just remove the billboarding in the IsoSurface shader… let the positions go where they want instead. Just add wind to them.

If you know the number of vertices each object has, the id can be calculated in the shader. I would have to look up the docs, I can’t remember exactly how.

int id = int(gl_VertexID / numberOfVertices);
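The integer division above can be checked on the CPU. This plain Java sketch uses hypothetical names and assumes every object has the same vertex count and that its vertices are laid out contiguously in the batch. It maps a vertex index to an object id and then looks up that object's position in a flat array, which is what the shader would do with the array passed to the material.

```java
final class ObjectIdSketch {

    /** Same arithmetic as the shader line: vertex index divided by vertices per object. */
    static int objectId(int vertexId, int verticesPerObject) {
        return vertexId / verticesPerObject;
    }

    /**
     * Returns the {x, y, z} position of the object that owns the given vertex.
     *
     * @param worldPositions flat array with 3 floats per object
     */
    static float[] lookupOrigin(float[] worldPositions, int vertexId, int verticesPerObject) {
        int base = objectId(vertexId, verticesPerObject) * 3;
        return new float[] {
            worldPositions[base],
            worldPositions[base + 1],
            worldPositions[base + 2]
        };
    }
}
```

With 4 vertices per object, vertices 0 to 3 map to object 0, vertices 4 to 7 to object 1, and so on.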

Not sure this will work, I’ll give it another try though and look further (when using that material, my grass looks really strange, bigger and “bladed”).

Yes, this could work (using the vertex index?). So, instead of filling a vertex buffer with the position I fill a shorter array, avoiding a different mesh for every grass object. So, the only negative thing is the gpu register limit.

Just attaching this here for possible readers: Built-in Variable (GLSL) - OpenGL Wiki

You basically trade higher GPU memory usage + higher loading time for higher GPU usage + higher bandwidth usage + more API interactions.

Ah, I forgot: it does not work with indexed rendering.
Well, it probably does if you duplicate all the vertices.

Could you expand this explanation?

Every technique comes with downsides. But IMHO in 99% of the cases you are CPU-limited.

When doing the array thing, you have to upload the array in each draw call, which is a state change and requires CPU->GPU bandwidth, plus you have random access to the array and some additional calculations in the shader.

When doing classical batching, you have to create, batch and upload a mesh for each tile of grass.

But then, the array alternative is not the best solution in that 99% of the cases, if I understood correctly. With the array you have:

plus the batching.

While with the extra-buffer solution you just have to batch (at the cost of one extra buffer), and once it is sent to the GPU the CPU has nothing more to do with it. The bad part is that you consume more “CPU memory”, because you can’t back all those batched objects with a single mesh; you need a mesh instance for each of them (in the case of grass, where they are all the same). However, these meshes could internally share all buffers except the extra one with the object positions (so it isn’t really that much).

So, comparing one to the other (keeping in mind that both need batching): with the buffer solution there is a little overhead in the batch creation (I’m not sure it is even noticeable) and higher GPU memory usage (the size of the extra batched buffer), and with the array one there is a constant higher GPU/CPU overhead plus the register limit problem?

(Pretty much what you already said, but in other words xP)

So, to sum up, you recommend the buffer alternative in 99% of the cases?

There is no need for batching. You create the mesh just one time, and then the amount and the positions can be set with the array.
The very same mesh is shared across your world.
Mesh generation for me is always a heavy task.

But I have to say that I personally try to avoid conditional additional load at all costs. I prefer paying a static, controllable amount over a one-time fee at some undefined point.

And there we are back at the usual problem: it all depends on your type of game. Large levels, small levels, loading screens in between, whatever.