Hi,
I am trying to understand an exception that has been occurring irregularly for a couple of days. I do not know exactly which change in my code caused it, so I’d like to ask whether you are familiar with it and can give me a hint.
This is the code at line #1073 in GLRenderer (the jME version should be 3.1.0):
@Override
public void postFrame() {
    objManager.deleteUnused(this);
    OpenCLObjectManager.getInstance().deleteUnusedObjects();
    gl.resetStats();
}
So, this gives me at least some indication. The error happens irregularly and is therefore hard to reproduce, but it is very annoying.
I can only assume that it occurs because scenery content is loaded step by step. I have a human-readable input file in which each line loads an object, generates a node, applies a decal, etc. I go through the file almost line by line and make changes to the scenegraph as I go along. The loading itself therefore happens at a slow pace, but always on the JME thread:
Thread t = new Thread(() ->
{
    while (!isDone)
    {
        try
        {
            visual.enqueue(() ->
            {
                long tik = System.currentTimeMillis();
                while (!isDone && System.currentTimeMillis() - tik < 10L)
                {
                    try
                    {
                        processNextStatement();
                    }
                    catch (Exception e)
                    {
                        LOGGER.warning("Exception while reading file '" + inputFileName + "': " + e.getMessage());
                    }
                }
            });
        }
        catch (NullPointerException e)
        {
            LOGGER.warning("NullPointerException while reading file: " + inputFileName + ". Loading may not have completed.");
            e.printStackTrace();
        }
        //try {Thread.sleep(20);} catch (InterruptedException e) {};
        Thread.yield();
    }
    isStreaming = false;
});
t.start();
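For readers who want to experiment with this loading scheme outside the engine, here is a minimal, self-contained sketch of the same idea: statements are processed in short, time-limited batches on a single "render" thread. It is an illustration under assumptions, not the poster's actual code. The class name TimeSlicedLoaderSketch, the methods runSlice and loadAll, and the single-thread executor standing in for the jME render thread are all hypothetical. One deliberate variation: the feeding thread waits for each batch to finish before submitting the next, so at most one batch is pending at any time.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a line-by-line scenery loader; not jME code. */
final class TimeSlicedLoaderSketch {

    private final Queue<String> pending = new ArrayDeque<>();
    private int processed;

    TimeSlicedLoaderSketch(List<String> lines) {
        pending.addAll(lines);
    }

    /** Placeholder for the real per-line work (load a model, attach a node, ...). */
    private void processNextStatement() {
        pending.poll();
        processed++;
    }

    /**
     * Meant to run on the render thread. Processes at least one statement,
     * then continues until the input is empty or the time budget is spent.
     * Returns true once all statements have been processed.
     */
    boolean runSlice(long budgetMillis) {
        long start = System.nanoTime();
        long budgetNanos = budgetMillis * 1_000_000L;
        while (!pending.isEmpty()) {
            processNextStatement();
            if (System.nanoTime() - start >= budgetNanos) {
                break;
            }
        }
        return pending.isEmpty();
    }

    int processedCount() {
        return processed;
    }

    /** Submits one slice at a time and waits for it before submitting the next. */
    static int loadAll(List<String> lines, long budgetMillis) {
        TimeSlicedLoaderSketch loader = new TimeSlicedLoaderSketch(lines);
        // Assumption: a single-thread executor plays the role of the render thread.
        ExecutorService renderThread = Executors.newSingleThreadExecutor();
        try {
            boolean done = false;
            while (!done) {
                Future<Boolean> slice = renderThread.submit(() -> loader.runSlice(budgetMillis));
                done = slice.get(); // blocks until this batch has run
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Loading did not complete", e);
        } finally {
            renderThread.shutdown();
        }
        return loader.processedCount();
    }
}
```

Calling TimeSlicedLoaderSketch.loadAll(lines, 10L) processes every line in batches of roughly 10 ms each. In jME itself, the overload of enqueue that takes a Callable returns a Future, which could serve the same purpose as the Future used here.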
It is hard to find the actual code of Image.java for your snapshot to see which value is null. But would it be enough to just fix the NPE in toString()? In my opinion that should never throw an NPE anyway. Or is something else wrong?
I’m afraid it will become even more puzzling when I show the code of Image (see below). I guess it is the call to format.name(), which should not throw any Exception unless format is null.
Hmm, yeah, and it shouldn’t be null. At least internally, Image.java always sets the format via the setter, and the setter has a sanity check against null.
You could try changing the format field in Image.java from protected to private to see whether it is accessed from somewhere else.
I believe the NPE is an after-effect of this code in NativeObjectManager.java:
throw new IllegalArgumentException("The " + obj + " NativeObject is not "
+ "registered in this NativeObjectManager");
What strikes me as odd is that (in 3.1.0-stable) “Image.java” has only 1065 lines of source code. There’s no line 1166. So perhaps the jme3-core version isn’t really 3.1.0.
For me, the obvious question is whether you could upgrade to a more recent engine, such as JME 3.4.0-stable.
Thank you. I will give it a try on the weekend. Since I have a lot of scenery content I will also take a closer look at the texture count and memory consumption and try to pin point which part of the scenery is causing the issue.
I am almost able to reproduce the error within 5-10 minutes. I jump between 4 different airports very heavily and call System.gc() every ten seconds. This seems to provoke the Exception.
There is no guarantee however, sometimes it just works and sometimes it does not. (Probably I should try it only on Mondays.)
Additionally, I found another thread. The difference is that I do not dispose of anything manually, but I think the same thing could be happening. I don’t want to jinx it, but the additional call to ref2.clear() seems to fix the problem on my side too. I’ll keep you informed in any case.
I would appreciate your thoughts however, because I still do not understand the problem 100%.
Ah yes, this very much looks like the same issue I was having. The “NativeObject is not registered in this NativeObjectManager” crash sometimes manifests as the NPE you are getting. As you can see the problem is a tricky one to get to the root of because of the obfuscating nature of how the NativeObjectManager works.
An important thing to note is that your code is, in effect, manually disposing of some NativeObjects, because the engine itself makes numerous such calls. FilterPostProcessor.cleanup(), for example, disposes of its NativeObjects explicitly.
Which is only called if you remove the FPP… and if you are constantly adding and removing FPP then you are definitely taking your life into your own hands.
I think most apps probably create it once and remove it once (at shutdown).
I’d be curious to know how many of these are done “all the time” and how many are done during normal shutdown.
@Retzinsky Thanks for joining the discussion too. I am curious whether you experienced the problem after your proposed fix. Were you able to isolate the issue or make it exactly reproducible?
@pspeed Maybe we should not call it “manually disposing” stuff, because in a way every process which removes stuff from the scenegraph and removes all references (ArrayLists etc.) suffers from the same problem. I think that it is not connected to the FPP. Maybe one issue is that the objects are rendered in multiple viewports?
Well, I think we need to distinguish between manually disposing and “normal GC disposing”… because they are certainly different. And the errors you indicate seem impossible for naturally GC’ed objects to encounter (how could something be trying to call toString() on an image that nothing references, for example).
So, the question is: does your app manually, as in your code calls it, dispose of objects hoping to be GC-friendly? Or are 100% of your objects reclaimed naturally by the garbage collector?
Are there other things that you add/remove that a simpler app might just enable/disable (post processors being one example but I’m sure there are others)?
I’m not saying we shouldn’t be allowed to do these things but it would definitely help narrow down the issues.
So you (maybe) solved the issue by modifying the Engine, specifically “NativeObjectManager.java”?
(Before this, I didn’t realize that @retzinsky was talking about modifying the Engine!)
I’m convinced that clearing the reference is a good idea, to avoid the race condition you apparently encountered. I’ll open an issue at GitHub to that effect.
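To make the race concrete, here is a minimal, self-contained sketch of why clearing the reference helps. It is a simplified model and not the engine's NativeObjectManager: the class RefClearSketch, its Resource and ResourceRef types, and the method names are hypothetical, and it uses WeakReference as an assumption about the general mechanism. The idea: when an object is disposed explicitly, its entry is removed from the registry. If its reference is not cleared, the garbage collector can still enqueue that reference later, and the per-frame cleanup then finds an object that is "not registered". Calling clear() on the reference means it will never be enqueued.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Simplified model of a native-object registry; NOT the engine's code. */
final class RefClearSketch {

    /** Hypothetical stand-in for a GPU-side resource such as a texture. */
    static final class Resource {
        final long id;
        Resource(long id) { this.id = id; }
    }

    /** Weak reference that remembers the id, so cleanup works after the referent is gone. */
    static final class ResourceRef extends WeakReference<Resource> {
        final long id;
        ResourceRef(Resource r, ReferenceQueue<Resource> q) {
            super(r, q);
            this.id = r.id;
        }
    }

    private final ReferenceQueue<Resource> queue = new ReferenceQueue<>();
    private final Map<Long, ResourceRef> registry = new HashMap<>();

    void register(Resource r) {
        registry.put(r.id, new ResourceRef(r, queue));
    }

    /** Explicit ("manual") disposal of a resource that is still referenced. */
    void deleteNow(Resource r) {
        ResourceRef ref = registry.remove(r.id);
        if (ref == null) {
            throw new IllegalArgumentException("Resource " + r.id + " is not registered");
        }
        // The important line: a cleared reference is never enqueued by the GC,
        // so deleteUnused() cannot encounter this resource a second time.
        ref.clear();
    }

    /** Per-frame cleanup of resources the GC has found unreachable. */
    int deleteUnused() {
        int deleted = 0;
        ResourceRef ref;
        while ((ref = (ResourceRef) queue.poll()) != null) {
            if (registry.remove(ref.id) == null) {
                // Without ref.clear() in deleteNow(), this is where the race would surface.
                throw new IllegalArgumentException("Resource " + ref.id + " is not registered");
            }
            deleted++;
        }
        return deleted;
    }

    int registeredCount() {
        return registry.size();
    }

    public static void main(String[] args) {
        RefClearSketch manager = new RefClearSketch();
        Resource texture = new Resource(1L);
        manager.register(texture);
        manager.deleteNow(texture); // explicit disposal, reference cleared
        texture = null;
        System.gc();                // request a collection; the cleared ref stays out of the queue
        System.out.println("deleted by GC path: " + manager.deleteUnused());
    }
}
```

Removing the ref.clear() line from deleteNow() re-creates the conditions for the failure in this model, although whether it actually triggers depends on GC timing, which matches the irregular behaviour reported in this thread.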
Sometimes I’m changing the kind of camera I’m using and the ViewPort along with it and was only doing what seemed like good housekeeping in destroying the FPP and making another later as required.
It hasn’t happened since the fix. I was able to make it sufficiently reproducible that I’m happy the solution works.