Quickly compute Normals for Terrain

I am currently implementing chunked LOD geomipmapping. However, I am running into an interesting speed-related issue.



When a chunk splits, it creates 4 child chunks. Each chunk is 33 x 33 vertices with a total of 2048 triangles. Compared to the time it takes to build the indices and vertices, the normals take a lot of time:



(In milliseconds)

Vertex: 9

Index: 0

Normals: 338



That is per child, so each split causes a long stall while the normals are calculated.

I am wondering if there is a faster way to handle this?

I am really open to anything. One idea is to spawn new threads to handle the normal calculations, but that worries me because you never know how many splits you could have at once.
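To make the threading idea concrete, here is a rough sketch of how the number of threads could be capped no matter how many splits happen at once: a fixed-size pool queues the extra jobs instead of spawning a thread per split. The class name `NormalJobs` and the worker count are hypothetical, and nothing here is engine-specific. It only uses `java.util.concurrent`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: runs normal-building jobs on a bounded set of worker threads. */
final class NormalJobs {
    private final ExecutorService pool;

    NormalJobs(int workers) {
        // A fixed pool never runs more than `workers` jobs at once;
        // additional submissions wait in the pool's internal queue.
        pool = Executors.newFixedThreadPool(workers, runnable -> {
            Thread t = new Thread(runnable, "normal-worker");
            t.setDaemon(true); // do not keep the JVM alive on exit
            return t;
        });
    }

    /** Queues a job and returns a handle that can be polled with isDone(). */
    <T> Future<T> submit(Callable<T> job) {
        return pool.submit(job);
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

The update loop could then poll `Future.isDone()` each frame and only attach the finished normal buffer to the mesh from the main thread, so a burst of splits delays the normals rather than stalling the frame.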



Perhaps building them on the graphics card or a faster computation algorithm? Does anyone have any ideas?
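As one candidate for a faster computation: since geomipmapped terrain is normally a regular height grid, the normals could be derived straight from the heights with central differences, skipping the triangles entirely. This is only a sketch under that assumption. The class `HeightfieldNormals`, the row-major `heights` layout, and the `spacing` parameter are all made up for illustration.

```java
/** Hypothetical sketch: per-vertex normals for a square, regularly spaced height grid. */
final class HeightfieldNormals {

    /**
     * @param heights row-major heights, heights[z * size + x]
     * @param size    vertices per side (33 for the chunks described here), must be >= 2
     * @param spacing world-space distance between neighbouring vertices
     * @return xyz normals, three floats per vertex, unit length, Y up
     */
    static float[] compute(float[] heights, int size, float spacing) {
        if (size < 2 || heights.length != size * size) {
            throw new IllegalArgumentException("expected a size x size grid with size >= 2");
        }
        float[] normals = new float[size * size * 3];
        for (int z = 0; z < size; z++) {
            // Clamp at the border so edge vertices use a one-sided difference.
            int z0 = Math.max(z - 1, 0);
            int z1 = Math.min(z + 1, size - 1);
            for (int x = 0; x < size; x++) {
                int x0 = Math.max(x - 1, 0);
                int x1 = Math.min(x + 1, size - 1);

                // Slope of the height function along X and Z.
                float dhdx = (heights[z * size + x1] - heights[z * size + x0]) / ((x1 - x0) * spacing);
                float dhdz = (heights[z1 * size + x] - heights[z0 * size + x]) / ((z1 - z0) * spacing);

                // For a surface y = h(x, z) the normal is proportional to (-dh/dx, 1, -dh/dz).
                float nx = -dhdx, ny = 1f, nz = -dhdz;
                float inv = 1f / (float) Math.sqrt(nx * nx + ny * ny + nz * nz);

                int o = (z * size + x) * 3;
                normals[o] = nx * inv;
                normals[o + 1] = ny * inv;
                normals[o + 2] = nz * inv;
            }
        }
        return normals;
    }
}
```

One caveat with this approach: the one-sided differences on the chunk border will not exactly match the neighbouring chunk, so sampling one extra row of heights from the full heightmap would probably be needed to avoid lighting seams.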

Thank you.



Here is the current function I am using to compute normals:


   private void computeNormals(TriMesh batch) {
      Vector3f vector1 = new Vector3f();
      Vector3f vector2 = new Vector3f();
      Vector3f vector3 = new Vector3f();
      
      FloatBuffer vb = batch.getVertexBuffer();
      IntBuffer ib = batch.getIndexBuffer();
      int tCount = batch.getTriangleCount();
      int vCount = batch.getVertexCount();
      
      // Allocate the temporary storage: one face normal per triangle
      // and one averaged normal per vertex
      Vector3f[] tempNormals = new Vector3f[tCount];
      Vector3f[] normals = new Vector3f[vCount];
      
      // Go through all of the faces of this object
      for (int i = 0; i < tCount; i++) {
         BufferUtils.populateFromBuffer(vector1, vb, ib.get(i*3));
         BufferUtils.populateFromBuffer(vector2, vb, ib.get(i*3+1));
         BufferUtils.populateFromBuffer(vector3, vb, ib.get(i*3+2));
         vector1.subtractLocal(vector3);
         tempNormals[i] = vector1.cross(vector3.subtract(vector2)).normalizeLocal();
      }
      
      Vector3f sum = new Vector3f();
      int shared = 0;
      
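      // For every vertex, scan every triangle to find the faces that share it.
      // With 1089 vertices and 2048 triangles that is roughly 2.2 million iterations.
      for (int i = 0; i < vCount; i++) {
         for (int j = 0; j < tCount; j++) {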
            if (ib.get(j*3) == i
                  || ib.get(j*3+1) == i
                  || ib.get(j*3+2) == i) {
               sum.addLocal(tempNormals[j]);
               
               shared++;
            }
         }
         normals[i] = sum.divide((-shared)).normalizeLocal();
         
         sum.zero(); // Reset the sum
         shared = 0; // Reset the shared
      }
      batch.setNormalBuffer(BufferUtils.createFloatBuffer(normals));
   }
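
For comparison, here is a sketch of a variation on the function above that I would expect to be much cheaper: instead of searching all triangles for each vertex, each triangle adds its face normal to its own three vertices, so the work is linear in the triangle count. It is written against plain arrays rather than the `TriMesh` buffers to keep it self-contained. The class `FastNormals` and its parameters are hypothetical, and it assumes counter-clockwise winding, which should give the same orientation as the negated sum above.

```java
/** Hypothetical sketch: averaged vertex normals in a single pass over the triangles. */
final class FastNormals {

    /**
     * @param positions xyz per vertex (length = 3 * vertexCount)
     * @param indices   three vertex indices per triangle
     * @return xyz normals, three floats per vertex, unit length
     */
    static float[] compute(float[] positions, int[] indices) {
        float[] normals = new float[positions.length]; // starts zeroed

        for (int t = 0; t < indices.length; t += 3) {
            int a = indices[t] * 3, b = indices[t + 1] * 3, c = indices[t + 2] * 3;

            // Two edges of the triangle, both starting at vertex a.
            float e1x = positions[b] - positions[a];
            float e1y = positions[b + 1] - positions[a + 1];
            float e1z = positions[b + 2] - positions[a + 2];
            float e2x = positions[c] - positions[a];
            float e2y = positions[c + 1] - positions[a + 1];
            float e2z = positions[c + 2] - positions[a + 2];

            // Face normal = e1 x e2, normalized so every face weighs the same.
            float nx = e1y * e2z - e1z * e2y;
            float ny = e1z * e2x - e1x * e2z;
            float nz = e1x * e2y - e1y * e2x;
            float len = (float) Math.sqrt(nx * nx + ny * ny + nz * nz);
            if (len == 0f) continue; // skip degenerate triangles
            nx /= len; ny /= len; nz /= len;

            // Accumulate the face normal on each of the triangle's three vertices.
            for (int v : new int[] {a, b, c}) {
                normals[v] += nx;
                normals[v + 1] += ny;
                normals[v + 2] += nz;
            }
        }

        // Normalizing the sum gives the same direction as averaging and then normalizing.
        for (int v = 0; v < normals.length; v += 3) {
            float len = (float) Math.sqrt(normals[v] * normals[v]
                    + normals[v + 1] * normals[v + 1] + normals[v + 2] * normals[v + 2]);
            if (len > 0f) {
                normals[v] /= len;
                normals[v + 1] /= len;
                normals[v + 2] /= len;
            }
        }
        return normals;
    }
}
```

For a 33 x 33 chunk this touches each of the 2048 triangles once, instead of testing all 2048 triangles against each of the 1089 vertices.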