Optimizing forest framerate

It could be that launch4j expects some specific registry setup?

Anyway, the .exe is also a jar… so you can also go to the directory and run:
java -jar IsoSurfaceDemo.exe

…should work in theory.


@shagros for good performance, use instancing for the billboards. Then handle the quad rotation/image selection in a vertex shader.
- What you have now is one draw call per quad. That is expensive.
- What is fast is a few objects (even one) containing all the quads.

Similarly, as @8Keep123 said, you can check the distance only every few frames.
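A minimal sketch of that "check every few frames" idea, in case it helps. The class name, the interval and the swap distance are all made up for illustration; nothing here is jME API, and the distance test works on squared distances to avoid a square root per object.

```java
/** Hypothetical helper: runs the billboard/object distance test only every N frames. */
final class ThrottledDistanceCheck {
    private final int interval;      // frames between checks (assumed value, tune per scene)
    private final float swapDistSq;  // squared swap distance, so no sqrt is needed per object
    private int frame = 0;

    ThrottledDistanceCheck(int interval, float swapDistance) {
        this.interval = interval;
        this.swapDistSq = swapDistance * swapDistance;
    }

    /** Call once per frame; returns true only on frames where the test should run. */
    boolean shouldCheck() {
        boolean due = (frame % interval) == 0;
        frame++;
        return due;
    }

    /** True if the object at (ox, oz) is farther from the camera than the swap distance. */
    boolean isBillboard(float camX, float camZ, float ox, float oz) {
        float dx = ox - camX;
        float dz = oz - camZ;
        return dx * dx + dz * dz > swapDistSq;
    }
}
```

In an update loop you would call `shouldCheck()` every frame and only walk the object list when it returns true.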

PS: For comparison, my billboard system renders 100k quads at ~40 FPS as a single draw call on an Intel HD 4400.

Use batching for the billboards. Instancing has per-object costs that are not really worth it in the case of quads.

The other part is true… do the “billboarding” in the shader.

…which is what the SimArboreal code does.

What kind of per-object costs? My setup has a single vertex buffer with 4 vertices, etc., and then just the instanced data buffer, one mat3x4 per quad.

Batching is fine too, but when making changes you have to update multiple buffers, whereas I change only the InstancedData buffer.

The driver has to set up each instance… you do it in one draw call but it’s not free. It’s cheaper than multiple draw calls but more expensive than just one single non-instanced (batched) geometry.

What is the use case for this? Usually, I’d use batching or instancing for static stuff like trees, rocks, and grass, etc… don’t have much call for dynamically morphing ALL of them at once.

Thank you for the information. I’ve implemented batching in my billboard system to compare it with instancing. I tested it with 1M quads and got 15 FPS with instancing and 15-17 FPS with batching. So in my test batching is slightly faster, but it takes more memory: in my case, 3.74x more than instancing.

I use it for static stuff too, but as the player moves closer, at some point the billboards are replaced by actual objects. I have two choices: either I remove the billboards that are close and swap them for objects (resending the whole instance data buffer, e.g. a few times a second), or I create degenerate triangles in the vertex shader or discard the primitive in the geometry shader if the triangle is too close to the camera/player.

At the moment I am using the first method. I haven’t attempted to implement the second yet. If I manage to implement it without artefacts, preferably using only the vertex shader, I will not have to reupload the instance data that often.

On another note: billboards are also useful for non-static objects, for example games use them for far characters, audience in a stadium, etc. There are also animated billboards. Atm, I am not using them for such cases, but I might use them to display far characters.

But that’s only because I guess your quads are not quads? Else the data you have to include with your instanced quads would likely outweigh the savings of just positioning them where they are supposed to be.

For swapping, generally speaking, you’d group your objects by zones and swap out whole zones. So farther zones are billboards, closer zones are objects. This has other advantages as well since it lets you sort any semi-transparent quads in bulk as part of their zone (for proper transparency front to back sorting is necessary)… since on the whole the zone won’t change relative to the player until the player’s zone shifts. You only have to sort when crossing the zone boundary.
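A rough sketch of the zone idea, under some assumptions: zones sit on a square grid, "close" means within a fixed number of zones of the player's zone, and all names (`ZoneLod`, `objectRadius`, and so on) are invented for this example. It is not how IsoSurfaceDemo does it internally, just the general shape.

```java
/** Hypothetical zone-based level-of-detail selection on a square grid of zones. */
final class ZoneLod {
    enum Detail { OBJECTS, BILLBOARDS }

    private final float zoneSize;    // world units per zone edge (assumed, e.g. 64)
    private final int objectRadius;  // how many zones around the player use real objects

    ZoneLod(float zoneSize, int objectRadius) {
        this.zoneSize = zoneSize;
        this.objectRadius = objectRadius;
    }

    /** Grid index of the zone containing a world coordinate (floor handles negatives). */
    int zoneIndex(float world) {
        return (int) Math.floor(world / zoneSize);
    }

    /** Detail level for zone (zx, zz) given the player's zone (px, pz). */
    Detail detailFor(int zx, int zz, int px, int pz) {
        int d = Math.max(Math.abs(zx - px), Math.abs(zz - pz));
        return d <= objectRadius ? Detail.OBJECTS : Detail.BILLBOARDS;
    }

    /** True when the player moved into a different zone: the only time swaps and re-sorting are needed. */
    boolean crossedBoundary(float oldX, float oldZ, float newX, float newZ) {
        return zoneIndex(oldX) != zoneIndex(newX) || zoneIndex(oldZ) != zoneIndex(newZ);
    }
}
```

The point of `crossedBoundary` is that nothing has to be re-evaluated while the player stays inside one zone.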

IsoSurfaceDemo does all of these things, as an example.

They are quads… just with a few extra instanced attributes: one selecting the texture region (since the texture contains several billboard images), another for proper scaling, and I am planning to add a colour attribute so that I can tint each quad separately.

My terrain is already split into 64x64 regions (64 meters) and loads/unloads whole segments.
However, whether an object is a billboard or not is determined purely by the distance from the camera to the object, checked a few times a second. You could say that I could load, let’s say, 9 regions around the player with actual objects.
In my case, this does not work well, since a single 64x64 region can have 2k-4k objects. I’ve got pretty dense foliage :) (trees, flowers, rocks, etc).

PS: regarding transparency, I will most likely end up using deferred rendering with dithered transparency.

When you compare memory costs of batching versus instancing, are you taking the per-instance position + transform information into account? Else I have trouble with the 3.74x more stat.

It’s specific to my shader. These are my buffers for instanced and batched versions.

// Instanced version: one entry per quad
data = BufferUtils.createFloatBuffer(12*capacity); //mat3x4 InstanceData (instanced)
texSlotData = BufferUtils.createFloatBuffer(4*capacity); //TexCoord2 (instanced)
viewSizeData = BufferUtils.createFloatBuffer(4*capacity); //TexCoord3 (instanced)
// plus, once, for the single shared quad mesh:
//+ 12 floats in position buffer
//+ 6 int/short in index buffer
//+ 8 float in uv

//1M quads takes 20M floats

// Batched version: per-quad attributes repeated for each of the 4 vertices
dataPos = BufferUtils.createFloatBuffer(12*capacity); //Position
indexData = BufferUtils.createIntBuffer(6*capacity); //Index
texSlotData = BufferUtils.createFloatBuffer(4*4*capacity); //TexCoord2
viewSizeData = BufferUtils.createFloatBuffer(4*4*capacity); //TexCoord3
rotData = BufferUtils.createFloatBuffer(4*4*capacity); //TexCoord4
texData = BufferUtils.createFloatBuffer(2*4*capacity); //TexCoord

//1M quads takes 74M floats

As you can see from the snippet, suppose I would like to add e.g. a Color to tint each quad. In the instanced version I need 3 or 4 floats. In the batched version, I need to include the same color for each vertex, thus 4 times: 12-16 floats.
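The per-quad totals can be re-derived from the buffer sizes in the snippet. This is only a sketch of the arithmetic: the class and method names are invented, and the 6 index entries are counted as if they were floats, the same way the 74M total above does.

```java
/** Hypothetical helper reproducing the per-quad element counts from the buffer snippet. */
final class QuadMemory {
    /** Per-quad floats, instanced layout. The shared 4-vertex quad mesh is not counted. */
    static int instancedFloats(int extraPerInstance) {
        return 12   // mat3x4 InstanceData
             + 4    // TexCoord2
             + 4    // TexCoord3
             + extraPerInstance;
    }

    /** Per-quad elements, batched layout. Per-vertex extras are stored 4 times. */
    static int batchedFloats(int extraPerVertex) {
        return 12        // Position: 4 vertices * 3 floats
             + 6         // Index: 2 triangles * 3 (ints, counted like floats)
             + 4 * 4     // TexCoord2
             + 4 * 4     // TexCoord3
             + 4 * 4     // TexCoord4
             + 2 * 4     // TexCoord
             + 4 * extraPerVertex;
    }

    public static void main(String[] args) {
        System.out.println("instanced: " + instancedFloats(0));            // 20 per quad
        System.out.println("batched:   " + batchedFloats(0));              // 74 per quad
        System.out.println("with RGBA: " + instancedFloats(4) + " vs " + batchedFloats(4));
    }
}
```

With these counts the ratio is 74 / 20 = 3.7, close to the 3.74x quoted earlier, and an RGBA tint costs 4 floats per quad instanced versus 16 batched.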

I’m done with the billboard shader, now I need to implement the texture swapping. From your experience, what is a good trade-off between having enough textures and keeping the texture swapping from being too obvious on screen? I’m thinking of organizing a 4 x 4 texture containing renderings of the tree at angles of 0, 10, 20 and 30 degrees from the xz plane and yaws of 0, 90, 180 and 270 degrees. Apart from memory, is there an issue with having too big a texture?
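For what it's worth, picking a cell in such a 4 x 4 atlas could look roughly like the sketch below. The layout is an assumption (columns = yaw snapshots at 0/90/180/270, rows = elevation snapshots at 0/10/20/30), and the class and method names are invented; in practice the same arithmetic would more likely live in the vertex shader.

```java
/** Hypothetical cell selection for a 4 x 4 billboard atlas (columns = yaw, rows = elevation). */
final class BillboardAtlas {
    static final int CELLS = 4;

    /** Column of the yaw snapshot nearest to the viewing yaw (degrees, any range). */
    static int yawColumn(float yawDeg) {
        float y = ((yawDeg % 360f) + 360f) % 360f;  // wrap into [0, 360)
        return Math.round(y / 90f) % CELLS;         // 315..360 wraps back to column 0
    }

    /** Row of the elevation snapshot nearest to the viewing pitch, clamped to 0..30 degrees. */
    static int pitchRow(float pitchDeg) {
        int row = Math.round(pitchDeg / 10f);
        return Math.max(0, Math.min(CELLS - 1, row));
    }

    /** Lower-left UV corner of a cell; every cell spans 1/CELLS in both u and v. */
    static float[] cellOrigin(int column, int row) {
        float step = 1f / CELLS;
        return new float[] { column * step, row * step };
    }
}
```

The per-quad UV is then `cellOrigin + quadUv / CELLS`, so only the column and row need to change when the view angle does.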

FYI I am also including the billboard shader I built

#import "Common/ShaderLib/Instancing.glsllib"
#import "Common/ShaderLib/Skinning.glsllib"
#import "Common/ShaderLib/Lighting.glsllib"

 attribute vec3 inPosition;
 attribute vec2 inTexCoord;
 attribute vec3 inNormal;

 varying vec2 texCoord;

 uniform mat4 g_ModelViewMatrix;

 uniform mat4 g_ModelViewProjectionMatrix;
 uniform mat4 g_ModelMatrix;
 uniform mat4 g_ModelWorldMatrix;

 uniform int spherical; // 1 for spherical; 0 for cylindrical

 void main(){
     mat4 modelView = g_WorldViewMatrix;
     vec4 unitX = g_WorldViewMatrix * vec4(1.0,0.0,0.0,1.0);
     vec4 unitZ = g_WorldViewMatrix * vec4(0.0,0.0,1.0,1.0);
     vec4 origin = g_WorldViewMatrix * vec4(0.0,0.0,0.0,1.0);

     float scaleX = distance(unitX, origin);
     float scaleZ = distance(unitZ, origin);

     modelView[0][0] = scaleX;
     modelView[0][1] = 0.0;
     modelView[0][2] = 0.0;

     if (spherical == 1) {
       // in case you want the mesh to move on all axis, not just x and z
       modelView[1][0] = 0.0;
       modelView[1][1] = 1.0;
       modelView[1][2] = 0.0;
     }

     modelView[2][0] = 0.0;
     modelView[2][1] = 0.0;
     modelView[2][2] = scaleZ;

     vec4 P = g_ModelViewMatrix * vec4(inPosition,1.0);
     //gl_Position = g_ProjectionMatrix * P;

     //ok gl_Position = g_WorldViewProjectionMatrix * vec4(inPosition, 1.0);
     //ok gl_Position = g_ProjectionMatrix * g_WorldViewMatrix * vec4(inPosition,1.0);
     //kind of ok but wrong gl_Position = g_ViewProjectionMatrix * g_WorldViewMatrix * vec4(inPosition,1.0);
     //ok gl_Position = g_ViewProjectionMatrix * g_WorldMatrix * vec4(inPosition,1.0);
     //gl_Position = g_ProjectionMatrix * g_ViewMatrix * g_WorldMatrix * vec4(inPosition,1.0);
     gl_Position = g_ProjectionMatrix * modelView * vec4(inPosition,1.0);
     texCoord = inTexCoord;
 }

#import "Common/ShaderLib/Texture.glsllib"
#import "Common/ShaderLib/Parallax.glsllib"
#import "Common/ShaderLib/Optics.glsllib"
#import "Common/ShaderLib/BlinnPhongLighting.glsllib"
#import "Common/ShaderLib/Lighting.glsllib"

 out vec4 out_Color;

 uniform sampler2D m_ColorMap;

 varying vec2 texCoord;

 void main(){
     vec4 texColor = texture2D(m_ColorMap, texCoord);

     out_Color = texColor; //vec4(0.6,0.1,0.5,1.0);
 }

Then material

 MaterialDef Simple {
     MaterialParameters {
         Texture2D ColorMap
     }
     Technique {
         LightMode MultiPass

         WorldParameters {
         }

         VertexShader GLSL130: Materials/Forest/Billb.vert
         FragmentShader GLSL130: Materials/Forest/Billb.frag
     }
     Technique FixedFunc {
     }
 }


How many renderings of a tree are you considering? 8 or 16?
Yaw 0, 90, 180, 270 is fine. How big of a texture do you have in mind? Do you plan to support mobile too? PC can handle larger textures than mobile.
It is preferable to have one larger texture rather than a lot of smaller ones, as you will have fewer texture switches.
On another note, there is also GL_MAX_TEXTURE_SIZE, which is hardware dependent: if you try creating a bigger texture than your hardware can handle, you will get an error.

Regarding the noticeable texture swapping: if you check some games that use this technique, some employ a sort of fade in, fade out to another texture.

Here is an article with a video at the end of it implementing such a transition.
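As a rough illustration of such a fade (not taken from the article): the weight of the incoming texture can ramp from 0 to 1 while some view parameter, an angle or a distance, passes through a transition band. The class name and the band limits are assumptions for the example; the easing is a plain smoothstep.

```java
/** Hypothetical cross-fade weight for blending from one billboard texture to the next. */
final class BillboardFade {
    /**
     * Weight of the incoming texture: 0 before the band starts, 1 after it ends,
     * eased with a smoothstep in between. The outgoing texture gets (1 - weight).
     */
    static float incomingWeight(float value, float bandStart, float bandEnd) {
        float t = (value - bandStart) / (bandEnd - bandStart);
        t = Math.max(0f, Math.min(1f, t));   // clamp to the band
        return t * t * (3f - 2f * t);        // smoothstep: zero slope at both ends
    }
}
```

In a shader the same value would typically drive a `mix()` between the two texture samples.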

Ok, so by applying the billboarding algo (see also [Solved] Unable to produce file with writeImageFile - #21 by shagros ), with everything done on the GPU (rotation and change of texture depending on distance), I managed to get a 2x to 3x speed improvement.

Now I still need to work on a few things.

For example, the lighting is quite off (and not implemented at all in the shader), and actually I think the issue also comes from the main trees rather than the billboards. Trees don’t reflect light… I’m going to try using a colorRamp with almost no shadow, so that the trees are more sensitive to ambient light vs sun light.

Then I need to fine-tune the swap itself, so that trunks are properly positioned when the swap occurs.

@pspeed and @The_Leo thanks a lot for your help and patience! And feel free to comment