GeometryBatchFactory, BatchNode, OOM issues

Hmmm… having some issues (always)… but! These I think someone can help with.



I think I need to force a GC after optimizing a node. At least that's what I think I need to do. I'm sectoring out areas (by terrain tile) and running optimization on objects as they build out. The problem is, as the topic title states, I'm hitting OOM issues. Not due to the number of objects/verts/etc. (these are between 4 and 15 objects, all with low vert counts), but due to the number of nodes I need to optimize individually (approximately 12x12 at initial load… a max of 24 at any given load time after that).



How does one do this to avoid OOM issues? >.<

Is that heap memory or direct memory that runs out?

Forcing a GC will never help you avoid OOM. If memory runs out you can be sure the GC has done all it can to reclaim memory. So either the application simply needs more memory than you have set as max or you are keeping references to objects that aren’t needed - Like you forgot that you also keep references in a hashmap somewhere.

@jmaasing said:
Is that heap memory or direct memory that runs out?
Forcing a GC will never help you avoid OOM. If memory runs out you can be sure the GC has done all it can to reclaim memory. So either the application simply needs more memory than you have set as max or you are keeping references to objects that aren't needed - Like you forgot that you also keep references in a hashmap somewhere.


No references :( Nothing I can do to fix this issue (that I can think of). It only occurs when using either the GeometryBatchFactory or the BatchNode… guess it's time to see if these are holding unneeded references.

Oops, forgot to answer your question: direct memory. The error is actually thrown by the BufferUtils class. Are there known issues with this?

@normen I’m not sure who wrote the GeometryBatchFactory (so I randomly picked you to mention this to)… however, after going through the code I noticed a few things:


  1. All of the code for LoDs was commented out (and with good reason) except this:

    [java]for (int lodLevel = 0; lodLevel < lodLevels; lodLevel++) {
    }[/java]



    The Phantom Loopace


  2. Lists are created throughout, but instead of being cleaned up, they rely on the GC to get around to removing them. Not sure if this is a potential issue or not when calling optimize() on many nodes. /shrug



    Guess it’s time to go through the BufferUtils class. Maybe I’ll see something there :frowning:

As per the BufferUtils class:



[java]/**
 * Direct buffers are garbage collected by using a phantom reference and a
 * reference queue. Every once in a while, the JVM checks the reference queue
 * and cleans the direct buffers. However, as this doesn't happen
 * immediately after discarding all references to a direct buffer, it's
 * easy to OutOfMemoryError yourself using direct buffers. This function
 * explicitly calls the Cleaner method of a direct buffer.
 *
 * @param toBeDestroyed
 *          The direct buffer that will be "cleaned". Utilizes reflection.
 */
public static void destroyDirectBuffer(Buffer toBeDestroyed) { … }[/java]



Never seen this called directly by anything… anyone?



@normen Sorry for bugging you… is there a way to force the GC to check this? It would seem to me that this would be a good thing to do when GeometryBatchFactory.optimize() is called… or BatchNode.batch().



Another question. What happens to the original buffers that were allocated for the models/custom meshes/etc that exist within a node?

BatchNode keeps references to the underlying geometries, so they are always kept (this might change at some point though).

This made it easy to keep geometry picking working.



GeometryBatchFactory just creates a geometry with one mesh that is the result of merging the sub-geometries. So if your geometries are static, I recommend using it instead of BatchNode.

All the geometries you passed to the optimize method are going to be collected at some point, but there is no guarantee the buffers in direct memory will be.

I guess you can iterate through the buffers of each mesh and call destroyDirectBuffer. First make sure the buffer is direct though.



Maybe Normen, pspeed or Momoko_Fan can confirm whether it’s safe.
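The "make sure the buffer is direct first" check can be sketched with plain java.nio types. This is a hypothetical helper with no jME3 dependency (the class name BufferScan is made up); in a real scene you would pull the buffers off each Mesh and call BufferUtils.destroyDirectBuffer where the comment indicates:

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.FloatBuffer;
import java.util.ArrayList;
import java.util.List;

public class BufferScan {
    // Tally the direct-memory footprint of a list of buffers. Only direct
    // buffers need explicit cleanup; heap buffers are ordinary GC'd objects.
    static long directBytes(List<Buffer> meshBuffers) {
        long total = 0;
        for (Buffer b : meshBuffers) {
            if (b.isDirect()) {
                // In jME3, this is where you'd call
                // BufferUtils.destroyDirectBuffer(b);
                total += b.capacity();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Buffer> buffers = new ArrayList<>();
        buffers.add(ByteBuffer.allocateDirect(1024)); // native memory
        buffers.add(FloatBuffer.allocate(256));       // heap-backed, skipped
        System.out.println(directBytes(buffers));     // 1024
    }
}
```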

@nehon said:
GeometryBatchFactory just creates a geometry with one mesh that is the result of merging the sub-geometries. So if your geometries are static, I recommend using it instead of BatchNode.
All the geometries you passed to the optimize method are going to be collected at some point, but there is no guarantee the buffers in direct memory will be.
I guess you can iterate through the buffers of each mesh and call destroyDirectBuffer. First make sure the buffer is direct though.


I'll give this a go. Thank you!

Direct memory is handled differently from heap memory in how and when it is collected and also what you can do to make sure it is collected. It is also more scarce to begin with.

I have to set -XX:MaxDirectMemorySize=512m as a JVM argument on my Mac when loading big terrain tiles. I guess the texture and/or mesh data is simply too big for the default max size (which is pretty low: http://stackoverflow.com/questions/3773775/default-for-xxmaxdirectmemorysize)

Yeah, what he said.



Allocated direct memory is not considered by the JVM when deciding to GC. If your heap is set large enough that you rarely GC then direct memory will accumulate and you will get errors.



Counter-intuitively, it’s generally better to set the heap max relatively small… or at least “just big enough”. This keeps the direct memory pool much fresher overall.
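A minimal sketch of the pattern pspeed describes, assuming nothing beyond java.nio: every iteration drops its reference immediately, yet the native memory behind each buffer stays reserved until a heap GC happens to run.

```java
import java.nio.ByteBuffer;

// Hedged demo of the accumulation pattern: each allocation's reference is
// dropped immediately, so the tiny ByteBuffer objects sit on the heap until
// a GC runs, and until then the native memory behind them stays reserved.
// With a large -Xmx and a small -XX:MaxDirectMemorySize this pattern can
// throw "OutOfMemoryError: Direct buffer memory" despite no live references.
public class DirectChurn {
    static int churn(int megabytes) {
        for (int i = 0; i < megabytes; i++) {
            ByteBuffer.allocateDirect(1 << 20); // 1 MB, discarded at once
        }
        return megabytes;
    }

    public static void main(String[] args) {
        System.out.println("churned " + churn(16) + " MB of direct buffers");
    }
}
```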

If it wasn’t clear from pspeed’s explanation: direct memory isn’t directly reclaimed by the GC and does not factor in when the GC decides to run. What, in theory, should happen is that the Java heap object (the ByteBuffer instance) becomes eligible for collection, and when that happens the cleanup hook in the ByteBuffer (a phantom-reference-based cleaner) frees the direct memory.

But the problem is that you can’t “force” the GC to collect the heap objects; you can strongly hint it to do so, but you can’t be sure. So setting a lower heap will hopefully make the GC run more often, which increases the chance of the direct buffers being collected.



But to make sure that the direct memory is freed, all the horrors of managing memory surface, now in a language that has no malloc and free :slight_smile: The usual trick is to use reflection to call the cleaner method directly.
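The reflection trick looks roughly like this. It is a sketch that leans on JDK internals (it works on Oracle/OpenJDK 8; newer JDKs lock this down and offer sun.misc.Unsafe.invokeCleaner instead), so treat it as a last resort and keep the fallback path:

```java
import java.lang.reflect.Method;
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class DirectBufferCleaner {
    // Attempt to free a direct buffer's native memory immediately via its
    // internal cleaner. Returns false for heap buffers or if reflection
    // is blocked, in which case the GC reclaims the memory eventually.
    public static boolean clean(Buffer buf) {
        if (buf == null || !buf.isDirect()) {
            return false; // only direct buffers have a cleaner
        }
        try {
            Method cleanerMethod = buf.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buf);
            Method cleanMethod = cleaner.getClass().getMethod("clean");
            cleanMethod.setAccessible(true);
            cleanMethod.invoke(cleaner);
            return true;
        } catch (Throwable t) {
            return false; // fall back to normal GC reclamation
        }
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);
        System.out.println("cleaned: " + clean(direct));
    }
}
```

This is essentially what jME3's BufferUtils.destroyDirectBuffer does internally, per its javadoc quoted earlier in the thread.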

@pspeed @nehon @normen



I had an idea that may or may not be a suitable fix for the OOM issues related to batching (at least in many circumstances). Let me run through the basic idea and maybe you can tell me thoughts on a) is it worth trying b) pointers for making it work well.



The idea is basically:


  1. Use a custom mesh and set a template mesh (loaded model), then extract the buffers into Java-side arrays/lists/whatever is appropriate for the buffer type. (Edit: at this point you could destroy the template model, freeing up memory… eventually… whenever the GC decides to do it.)
  2. Have an appendFromTemplate(Vector3f position, Quaternion rot) method that appends the template buffer data, transformed by the provided loc/rot, to the local lists.
  3. Finally call build() to allocate the direct memory buffers and create the final mesh.



    This should stop buffers being allocated for every object you want to load and then batch. It would allocate buffers for the single template and then the final “batched” mesh.



    At some point, indexes/lengths could be stored for each object added and manipulated at some later point, without having to recreate the buffers. Basically avoiding re-batching. This should work well for static objects at a minimum.



    Maybe?
@jmaasing said:
If it wasn't clear from pspeed's explanation: direct memory isn't directly reclaimed by the GC and does not factor in when the GC decides to run. What, in theory, should happen is that the Java heap object (the ByteBuffer instance) becomes eligible for collection, and when that happens the cleanup hook in the ByteBuffer (a phantom-reference-based cleaner) frees the direct memory.
But the problem is that you can't "force" the GC to collect the heap objects; you can strongly hint it to do so, but you can't be sure. So setting a lower heap will hopefully make the GC run more often, which increases the chance of the direct buffers being collected.

But to make sure that the direct memory is freed, all the horrors of managing memory surface, now in a language that has no malloc and free :) The usual trick is to use reflection to call the cleaner method directly.


True. In Mythruna, I used to get out of memory errors all the time for the direct heap. I make and throw away direct memory every time the user clicks a block, basically. Lowering the max heap size back to a reasonable level completely got rid of these errors.

I now also free the old direct memory explicitly when it is being replaced but at that point it was more a performance issue. I'd occasionally see out of memory errors but mostly the app system memory size was always huge and some GC sweeps were taking a while. Freeing the buffers explicitly when I throw them away got rid of both of those issues.
@t0neg0d said:
@pspeed @nehon @normen

I had an idea that may or may not be a suitable fix for the OOM issues related to batching (at least in many circumstances). Let me run through the basic idea and maybe you can tell me thoughts on a) is it worth trying b) pointers for making it work well.

The idea is basically:

1. Use a custom mesh and set a template mesh (loaded model), then extract the buffers into Java handled arrays/lists/whatever is appropriate for the buffer type. (Edit: At this point you could destroy the template model, freeing up memory... eventually... whenever the GC decides to do it)
2. Have an appendFromTemplate(Vector3f position, Quaternion rot) method that appends the template buffer data, transformed by the provided loc/rot, to the local lists.
3. Finally call build() to allocate the direct memory buffers and create the final mesh.

This should stop buffers being allocated for every object you want to load and then batch. It would allocate buffers for the single template and then the final "batched" mesh.

At some point, indexes/lengths could be stored for each object added and manipulated at some later point, without having to recreate the buffers. Basically avoiding re-batching. This should work well for static objects at a minimum.

Maybe?


Well... it is a heck of a lot faster than GeometryBatchFactory.optimize(); however, it doesn't solve the issue. Can someone explain to me the difference between loading 150 models (say with 100 verts each) and creating a custom mesh with 15000 verts? I can do the first without OOM issues... however, if I try to batch these.... toast. If I try to create a custom mesh instead of these... toast.

Anyone?

What is your direct memory size set to? By default it’s super low on some versions of Java.



By that I mean what do you have for: XX:MaxDirectMemorySize



It’s still a little strange that you are hitting issues… I can run Mythruna without any special memory settings. It runs like crap but I don’t overflow the direct heap, and I have many hundreds of thousands of verts.

@t0neg0d said:
Well... it is a heck of a lot faster than GeometryBatchFactory.optimize(); however, it doesn't solve the issue. Can someone explain to me the difference between loading 150 models (say with 100 verts each) and creating a custom mesh with 15000 verts? I can do the first without OOM issues... however, if I try to batch these.... toast. If I try to create a custom mesh instead of these... toast.

Anyone?

I am just speculating here, haven't had time to look. But I guess one difference could be the size of the buffer needed to send data to the GPU. 100 items one at a time or allocating a buffer large enough for all of them at the same time is my guess, or something like that would explain the difference.
@jmaasing said:
I am just speculating here, haven't had time to look. But I guess one difference could be the size of the buffer needed to send data to the GPU. 100 items one at a time or allocating a buffer large enough for all of them at the same time is my guess, or something like that would explain the difference.


I'm skeptical, though. 15,000 verts isn't that many no matter how many attributes you load on. For example, even with something ridiculous like 64 32-bit attributes per vertex, that's less than 4 meg.

Something strange is going on, though.
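The arithmetic behind that 4 meg figure, spelled out (the class name is just for illustration):

```java
// 15,000 verts, with a deliberately ridiculous 64 attributes per vertex,
// each attribute 4 bytes (a 32-bit float or int), is still under 4 MB.
public class VertBudget {
    static long totalBytes(long verts, long attrs, long bytesPerAttr) {
        return verts * attrs * bytesPerAttr;
    }

    public static void main(String[] args) {
        long total = totalBytes(15_000, 64, 4);
        System.out.println(total);                    // 3840000 bytes
        System.out.println(total < 4L * 1024 * 1024); // true: under 4 MB
    }
}
```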
@pspeed said:
What is your direct memory size set to? By default it's super low on some versions of Java.

By that I mean what do you have for: XX:MaxDirectMemorySize

It's still a little strange that you are hitting issues.. I can run Mythruna without any special memory settings. It runs like crap but I don't overflow the direct heap and I have many hundreds of thousands of verts.


-XX:MaxDirectMemorySize=1024m
@t0neg0d said:
-XX:MaxDirectMemorySize=1024m


There is no way you should be running out of direct memory just by batching some geometry as you describe. Something else has to be gobbling it up.
@pspeed said:
There is no way you should be running out of direct memory just by batching some geometry as you describe. Something else has to be gobbling it up.


I can't figure this out. :(
@t0neg0d said:
I can't figure this out. :(


Try to reduce the issue to a simple test case and work from there. Maybe there are other factors at play in the larger application.