I’m trying to batch grass that uses an animated vertex shader (the one taken from the SimArboreal trees), but it doesn’t work properly. Without batching everything behaves as it should, but with batching the models move away from their positions. The farther a model is from the origin of the batched mesh (or so I think), the bigger the offset.
I’ve seen other posts where people use batching, and I know it is possible to get a good effect, but I can’t, at least not with that shader.
What am I missing? Is it because of the shader itself (I’ve been looking at it, but I’m not really a shader guru), or is there something more I must do apart from batching?
I’ve tried with BatchNode and manually batching with GeometryBatchFactory.optimize.
I’ve also tried instancing them (with the same effect). (I tested it wrong, it was always batching)
It makes sense, I should have seen it myself :S. So… is there an implemented way to know the original object position, or must I fill a vertex buffer (e.g. one of the texcoord ones) with that position when batching?
I’m using it for grass because it is a type of grass that doesn’t need billboarding: the models aren’t grass blades but something more complex (crossing polygons). I think I already asked you about the main differences between the two, and it was the in-shader billboarding. So currently I’m using the tree one for both grass and trees (I need it to work for trees too, so I have the same problem there).
That shader is only meant for grass blades, and each blade is made up of a single triangle, isn’t it? The trick used there to find the groundPos wouldn’t work for trees or more complex plants, would it?
Yes, I know about it, but not all GPUs support instancing, so I need some fallback solution.
I’m also aware of this, so I’m trying to have my “complex grass” batched but with a wind shader (the one from isosurfaces is not suitable as it is just for blades, so my best bet is SimArboreal’s).
The only solution I can think of is to fill a buffer while batching, so that the object position is available in the vertex shader.
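For illustration, the buffer-fill idea could look roughly like the sketch below. It is plain Java with no engine dependency, and it assumes the objects are batched in the same order as the `origins` array and that each object's vertices stay contiguous in the batched mesh. The class and method names are made up for this example. In jME the resulting array could presumably be attached with `Mesh.setBuffer(VertexBuffer.Type.TexCoord2, 3, data)`, and the wind shader would then have to be changed to read that attribute instead of assuming the model origin.

```java
import java.util.Arrays;

/** Hypothetical helper: builds per-vertex "object origin" data for a batched mesh. */
final class ObjectOriginBuffer {

    /**
     * @param origins      origins[i] = {x, y, z} of object i, in the batch's local space
     * @param vertexCounts vertexCounts[i] = number of vertices object i adds to the batch
     * @return flat array with 3 floats per batched vertex, in batching order
     */
    static float[] build(float[][] origins, int[] vertexCounts) {
        if (origins.length != vertexCounts.length) {
            throw new IllegalArgumentException("one vertex count is needed per origin");
        }
        int totalVertices = 0;
        for (int count : vertexCounts) {
            totalVertices += count;
        }
        float[] out = new float[totalVertices * 3];
        int write = 0;
        for (int i = 0; i < origins.length; i++) {
            // Every vertex of object i carries the same origin.
            for (int v = 0; v < vertexCounts[i]; v++) {
                out[write++] = origins[i][0];
                out[write++] = origins[i][1];
                out[write++] = origins[i][2];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] origins = { {0f, 0f, 0f}, {10f, 0f, 5f} };
        int[] vertexCounts = { 2, 1 };
        System.out.println(Arrays.toString(build(origins, vertexCounts)));
    }
}
```

The price is one extra 3-float attribute per vertex, which is the "extra buffer" cost discussed further down.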
GL 3.1 Release date: March 24, 2009
Not sure what your target specs are, but you will hit limits everywhere if you target older systems. 512 MB of VRAM was considered high-end back then.
Another option would be passing an array of world positions to the material and using the InstanceId to get the correct one.
The fun part is that you don’t need different meshes for each batch. UBOs for the poor people.
Note that each GPU has a register limit; if your array is too large, compilation will fail.
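To stay under that limit, the positions would have to be split into chunks, one chunk per draw call. A minimal sketch in plain Java follows. The class name and the limit of 64 entries are assumptions for the example only, since the real maximum depends on the GPU and on what else the shader uses. In jME each chunk could presumably be converted to a `Vector3f[]` and passed with `Material.setParam(name, VarType.Vector3Array, value)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits object positions into uniform-array-sized chunks. */
final class PositionChunks {

    /**
     * @param positions  positions[i] = {x, y, z} of object i
     * @param maxPerDraw largest array size the shader was compiled with (assumed value)
     * @return consecutive chunks, each holding at most maxPerDraw positions
     */
    static List<float[][]> split(float[][] positions, int maxPerDraw) {
        if (maxPerDraw <= 0) {
            throw new IllegalArgumentException("maxPerDraw must be positive");
        }
        List<float[][]> chunks = new ArrayList<>();
        for (int start = 0; start < positions.length; start += maxPerDraw) {
            int end = Math.min(start + maxPerDraw, positions.length);
            // Copies the row references only; the {x, y, z} triples are shared.
            chunks.add(Arrays.copyOfRange(positions, start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        float[][] positions = new float[150][3];
        List<float[][]> chunks = split(positions, 64);
        for (float[][] chunk : chunks) {
            System.out.println("draw call with " + chunk.length + " positions");
        }
    }
}
```

Each chunk means one more draw call and one more array upload, so a smaller limit trades shader safety for CPU work.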
Not sure this will work; I’ll give it another try though and look further (when using that material, my grass looks really strange, bigger and “bladed”).
Yes, this could work (using the vertex index?). So, instead of filling a vertex buffer with the position, I fill a shorter array, which avoids needing a different mesh for every grass object. So the only negative is the GPU register limit.
Every technique comes with downsides. But IMHO in 99% of cases you are CPU limited.
When doing the array thing you have to upload the array on each draw call, which is a state change and requires CPU->GPU bandwidth, plus you have random access into the array and some additional calculations in the shader.
When doing classical batching you have to create, batch and upload a mesh for each tile of grass.
But then, if I understood correctly, the array alternative is not the better solution in that 99% of cases. With the array you have:
plus the batching.
while with the extra buffer solution you just have to batch (at the cost of an extra buffer), and once it is sent to the GPU the CPU has nothing more to do with it. The bad part is that you consume more “CPU memory”, because you can’t have all those batched objects share a single mesh; you need a mesh instance for each of them (in the case of grass, where all are the same). However, these meshes could internally share all buffers except the extra one with the object positions (so it isn’t really much).
So, comparing one to the other (bearing in mind that both need batching): with the buffer solution there is a little overhead in batch creation (I’m not sure it is even noticeable) and higher GPU memory usage (the size of the extra batched buffer), while with the array one there is a constant higher GPU/CPU overhead plus the register limit problem?
(Pretty much what you already said, but in other words xP)
So, to sum up, you recommend the buffer alternative in 99% of cases?
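To put rough numbers on the comparison above, here is a back-of-the-envelope sketch in plain Java. The object and vertex counts are made-up examples, and it only counts the raw position data (3 floats of 4 bytes each), not any driver or state-change overhead.

```java
/** Hypothetical estimate of the raw data size of both approaches. */
final class GrassCostEstimate {

    static final int BYTES_PER_FLOAT = 4;
    static final int FLOATS_PER_POSITION = 3;

    /** Extra vertex buffer: one position per batched vertex, uploaded once. */
    static long extraBufferBytes(long objects, long verticesPerObject) {
        return objects * verticesPerObject * FLOATS_PER_POSITION * BYTES_PER_FLOAT;
    }

    /** Uniform array: one position per object, uploaded again for every draw call. */
    static long arrayUploadBytesPerDraw(long objectsPerDraw) {
        return objectsPerDraw * FLOATS_PER_POSITION * BYTES_PER_FLOAT;
    }

    public static void main(String[] args) {
        // Example figures only: 1000 grass objects of 12 vertices each, 64 objects per draw.
        System.out.println("extra buffer, one-time: " + extraBufferBytes(1000, 12) + " bytes");
        System.out.println("array, per draw call:   " + arrayUploadBytesPerDraw(64) + " bytes");
    }
}
```

With these example figures the extra buffer is 144000 bytes paid once, while the array costs 768 bytes on every draw call.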
There is no need for batching. You build the mesh just one time, and then the amount and the positions can be set with the array.
The very same mesh is shared across your world.
Mesh generation for me is always a heavy task.
But I have to say that I personally try to avoid conditional additional load at all costs. I favor paying a static, controllable amount over a one-time fee at some undefined point.
And there we are back at the usual problem: it all depends on your type of game. Large levels, small levels, loading screens in between, whatever.