Diagnosing "java.lang.IllegalStateException: Scene graph is not properly updated for rendering." "Problem spatial name: Root Node"

I am receiving this error intermittently:

java.lang.IllegalStateException: Scene graph is not properly updated for rendering.
State was changed after rootNode.updateGeometricState() call. 
Make sure you do not modify the scene from another thread!
Problem spatial name: Root Node

I understand at a high level what this means: I’m changing the scene graph from another thread. The bit I understand less is “Problem spatial name: Root Node”. Does that mean something was added directly to the root node (or removed from it), that some property of the root node was changed (e.g. whether it is culled), that the root node was added to something else (which would obviously be a bit weird), or something else?

It’s a big game, and this happens rarely and seemingly at random, so diagnosing it is going to be tricky.

It doesn’t even need to be caused by another thread.

Technically, as I noted, it happens when you modify a Node at the wrong point in the frame workflow.

Here is part of JME update loop:

@Override
public void update() {
    if (prof!=null) prof.appStep(AppStep.BeginFrame);

    super.update(); // makes sure to execute AppTasks
    if (speed == 0 || paused) {
        return;
    }

    float tpf = timer.getTimePerFrame() * speed;

    // update states
    if (prof!=null) prof.appStep(AppStep.StateManagerUpdate);
    stateManager.update(tpf);

    // simple update and root node
    simpleUpdate(tpf);

    if (prof!=null) prof.appStep(AppStep.SpatialUpdate);
    rootNode.updateLogicalState(tpf);
    guiNode.updateLogicalState(tpf);

    rootNode.updateGeometricState();
    guiNode.updateGeometricState();

    // render states
    if (prof!=null) prof.appStep(AppStep.StateManagerRender);
    stateManager.render(renderManager);

    if (prof!=null) prof.appStep(AppStep.RenderFrame);
    renderManager.render(tpf, context.isRenderable());
    simpleRender(renderManager);
    stateManager.postRender();

    if (prof!=null) prof.appStep(AppStep.EndFrame);
}

When you execute something that modifies Nodes after these calls:

    rootNode.updateLogicalState(tpf);
    guiNode.updateLogicalState(tpf);

    rootNode.updateGeometricState();
    guiNode.updateGeometricState();

it will cause this issue for sure.

So as I see it, this is possible even on the same thread, when you modify Nodes in simpleRender instead of simpleUpdate, or in any other method that runs after these lines.
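To make the ordering argument concrete, here is a minimal single-threaded sketch. It assumes a made-up FakeNode class with one boolean standing in for jME's refresh flags; none of this is jME code, it only mimics the "change, update, render" order of the loop quoted above.

```java
// Simplified stand-in for a scene-graph node; NOT jME's Spatial class.
class FrameOrderSketch {
    static class FakeNode {
        private boolean dirty = false;           // stands in for jME's refresh flags

        void attachChild() { dirty = true; }     // any scene change marks the node dirty
        void updateGeometricState() { dirty = false; }

        void render() {
            if (dirty) {
                throw new IllegalStateException(
                        "Scene graph is not properly updated for rendering.");
            }
        }
    }

    /** Runs one simulated frame on a single thread; returns true if rendering succeeded. */
    static boolean runFrame(boolean modifyAfterUpdate) {
        FakeNode root = new FakeNode();
        root.attachChild();               // fine: happens before the update
        root.updateGeometricState();
        if (modifyAfterUpdate) {
            root.attachChild();           // too late: between the update and the render
        }
        try {
            root.render();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runFrame(false)); // true
        System.out.println(runFrame(true));  // false
    }
}
```

No second thread is involved: the only thing that matters is whether the change lands before or after the update call.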

It’s the same with another thread, of course, because it depends on whether the change is executed before or after those calls. So if you are using multiple threads, you should hand scene changes to the jME application thread through the application’s task queue.

like:

    app.enqueue(() -> {
        //do something
    });

(There is a longer version for older Java versions; I don’t remember it right now.)
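For anyone wondering why the queue helps, here is a rough plain-Java model of the idea. It is only a sketch: the class name EnqueueSketch and its children list are invented for illustration, and jME's real implementation differs. The point is that worker threads only record work, and the update thread runs it at the start of the frame (the "super.update(); // makes sure to execute AppTasks" line in the loop above).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical miniature of the "enqueue, then run on the update thread" pattern.
class EnqueueSketch {
    private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
    final List<String> children = new ArrayList<>(); // only touched by the update thread

    /** Safe to call from any thread: it only records the work. */
    void enqueue(Runnable task) {
        tasks.add(task);
    }

    /** Called once per frame on the update thread, before the scene is updated. */
    void update() {
        Runnable task;
        while ((task = tasks.poll()) != null) {
            task.run();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        EnqueueSketch app = new EnqueueSketch();
        Thread worker = new Thread(() -> app.enqueue(() -> app.children.add("enemy")));
        worker.start();
        worker.join();
        System.out.println(app.children.size()); // 0: queued, but not applied yet
        app.update();
        System.out.println(app.children.size()); // 1: applied on the update thread
    }
}
```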

Anyway, in this code, anything that modifies the scene after

rootNode.updateGeometricState();

will cause this issue.

I hope someone will verify this, because I haven’t checked it in a long time.

about:

The bit I understand less is “Problem spatial name: Root Node”

It just means the error was reported for this node. So, as said above, this node was modified at a point in the update loop where it was no longer ready for changes (that is, after the rootNode.updateGeometricState() call).

By the way, a “Spatial” is anything in the scene graph, so it can be a Node, a Geometry, or something else. It’s like an entity.

that rootnode was added to something else (which would obviously be a bit weird), or something else.

Of course it would be odd, because the app thread manages whether rootNode is ready for updates or not.

I can also guess that you might be using offscreen rendering with rootNode, which could cause some issues?

It’s a big game, and this happens rarely and seemingly at random, so diagnosing it is going to be tricky.

Yes, in a big game it will be hard to find.

But I have some guesses:

  • you use offscreen rendering with rootNode
  • you modify rootNode in calls like render()
  • you have multiple threads and modify rootNode from a thread without going through the queue

First of all:
What you see is a safety measure of jME. Multithreading is evil because the errors only show up randomly. That does not mean it’s less of a problem.

Actually, what happens here is what @oxplay2 showed: the scene graph is modified right after the state has been set. This means someone is modifying the scene (a position, for example) DURING the engine’s rendering. This can cause all kinds of issues (think: one half of a mesh is still in place, the other has moved, or whatever else). You only see this rarely because in all other cases the manipulation happens before or after, so jME is unable to detect it, but it’s still there.

The obvious answer would be to check for other threads (e.g. with a profiler/debugger), most likely something network related or some worker threads.

If nothing works, fork the engine, look for the parts of Spatial’s API that affect the flags (there is something called RF_TRANSFORM, which is used to detect the problem), and add a simple check like:

if (!Thread.currentThread().getName().equals("jME3 Main-Thread")) {
    throw new IllegalStateException("See the stacktrace for the villain");
}
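A variant of the same idea, as a hedged sketch: instead of comparing thread names (which depends on how the render thread happens to be named), remember the owning Thread object and compare identities. ThreadGuardSketch is a hypothetical helper, not something that exists in jME, and you would still have to call check() from a forked Spatial yourself.

```java
// Hypothetical debug helper; not part of jME.
class ThreadGuardSketch {
    private final Thread owner;

    ThreadGuardSketch(Thread owner) {
        this.owner = owner;
    }

    /** Throws if called from any thread other than the owner. */
    void check() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("Scene modified from thread '"
                    + Thread.currentThread().getName()
                    + "'; see the stack trace for the culprit");
        }
    }

    /** Runs check() on a fresh thread and reports whether it threw. */
    static boolean failsOnOtherThread(ThreadGuardSketch guard) {
        boolean[] threw = {false};
        Thread t = new Thread(() -> {
            try {
                guard.check();
            } catch (IllegalStateException e) {
                threw[0] = true;
            }
        });
        t.start();
        try {
            t.join(); // join() makes the worker's write to threw[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threw[0];
    }

    public static void main(String[] args) {
        ThreadGuardSketch guard = new ThreadGuardSketch(Thread.currentThread());
        guard.check();                                  // same thread: passes silently
        System.out.println(failsOnOtherThread(guard));  // true
    }
}
```

Note that a guard like this also fires for a worker thread that is legitimately building a spatial which is not attached to the scene yet, so it is only useful as a temporary debugging aid.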

A shame the engine doesn’t use assertions for this. I know it’s considered hand-holding, but all other ways are hard, as you really have to use a fork for that.

That’s an interesting suggestion. I assume it’s not always on for performance reasons? Could we put a mode in the engine (for debugging) that does that: throws errors early on threaded access to the Spatial’s API?

(I know it’s worker threads; I have them all over the place, but I (clearly incorrectly) thought none of them were touching the graphics state.)

The “Root Node” part means it was the Spatial that threw the exception. “Something” was changed, but it could be a number of things (there are 5 flags; one is the RF_TRANSFORM already mentioned).

Adding on to oxplay2’s post, are you overriding update() at all? If so, can you use simpleUpdate() instead?

Since you say it “happens rarely” I’m thinking it’s probably not this simple and more likely thread related, but if you have some conditional code that runs inside update() I guess you could see this “rarely” too.

I’ve been told that asserts only run when you launch the JVM in a special mode (with assertions enabled via -ea). I think the stance was that this is not something the engine should care about any further than this.
We also had this discussion for bad code like Vector3f.UNIT_X.addLocal(0f, 1.5f, 0f) (though obviously more obfuscated). This will break almost all code.

How would you do it without being incredibly invasive? And without accidentally detecting cases that are perfectly fine?

If you can answer how a Spatial would know it’s being accessed from a non-rendering thread and ALSO attached to a render-managed ViewPort… then sure. We could add it. The problem is that there is no answer to that question.

Oh right, you could also have a different thread that uses locks to synchronize.

Or a different thread to load and manipulate the spatial before transferring it thread-safely to the main thread for adding to the scene graph.

That’s why I’d suggest it’s an optional mode, off by default. So the answer to “weird bug is happening” can be “turn on this mode and see where it’s coming from”.

“Or a different thread to load and manipulate the spatial before transferring it thread-safely to the main thread for adding to the scene graph”

This is a good point, but I’m guessing it can walk up through its parents to reach the root node(s)? (Or not, in which case it’s fine.)

But I can see all that being fiddly.

It can be difficult tracking these things down, but experience tells me that logging will get you a long way. Start by logging every time you manipulate the scene in a threaded manner. I’m sure you’ll have an idea of where it might begin.

This particular exception is only thrown in Spatial.checkCulling() when it sees refreshFlags set. It looks like there are 5 possible flags:

/**
 * Refresh flag types
 */
protected static final int RF_TRANSFORM = 0x01, // need light resort + combine transforms
                           RF_BOUND = 0x02,
                           RF_LIGHTLIST = 0x04, // changes in light lists 
                           RF_CHILD_LIGHTLIST = 0x08, // some child need geometry update
                           RF_MATPARAM_OVERRIDE = 0x10;

Would it be (slightly) more helpful to indicate which flag(s) were set? Or at least log which were set before throwing the IllegalStateException? It won’t point you directly to the problem but might help you narrow down things if you’re completely uncertain.
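As a rough sketch of what that could look like, assuming the five constants above are the complete set: a small helper that turns the bit field into readable names for the log or exception message. RefreshFlagNames and describe() are made-up names for illustration; the flag values are copied from the snippet above.

```java
import java.util.StringJoiner;

// Hypothetical helper; only the flag values come from the jME snippet above.
class RefreshFlagNames {
    static final int RF_TRANSFORM = 0x01,
                     RF_BOUND = 0x02,
                     RF_LIGHTLIST = 0x04,
                     RF_CHILD_LIGHTLIST = 0x08,
                     RF_MATPARAM_OVERRIDE = 0x10;

    /** Lists the names of all flags set in the given bit field. */
    static String describe(int refreshFlags) {
        StringJoiner names = new StringJoiner(" | ");
        if ((refreshFlags & RF_TRANSFORM) != 0) names.add("RF_TRANSFORM");
        if ((refreshFlags & RF_BOUND) != 0) names.add("RF_BOUND");
        if ((refreshFlags & RF_LIGHTLIST) != 0) names.add("RF_LIGHTLIST");
        if ((refreshFlags & RF_CHILD_LIGHTLIST) != 0) names.add("RF_CHILD_LIGHTLIST");
        if ((refreshFlags & RF_MATPARAM_OVERRIDE) != 0) names.add("RF_MATPARAM_OVERRIDE");
        return names.length() == 0 ? "none" : names.toString();
    }

    public static void main(String[] args) {
        // prints: RF_TRANSFORM | RF_MATPARAM_OVERRIDE
        System.out.println(describe(RF_TRANSFORM | RF_MATPARAM_OVERRIDE));
    }
}
```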

What’s a parent? The spatial does not ever know what ViewPorts that it’s in. Only the ViewPort knows.

So you’d have to instrument the whole scene graph tree and modify all of the code that ever uses it to force this instrumentation in. So everyone always pays the price even if it’s not turned on.

…so that you can detect when you’ve strayed from best practices and have a mess on your hands.

It’s a hard thing to track down. Which is why there are best practices in the first place.

What you are actually trying to catch:
-I’ve modified my spatial somewhere after the state was finalized for rendering and now I’m rendering.

It’s not even thread specific.

The only place to check this is in rendering… which is where it’s checked.

How about this:

This records a presumptive exception every time one of those flags is set. Then, if there is a problem, it prints that presumptive exception (and if there is no problem, it just throws it away).

I’m sure that has a performance penalty (generating exceptions is expensive) but within my game it wasn’t a noticeable performance drain, and you’d turn it on only when needed.
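In outline, the idea looks something like the following. This is a stripped-down model written for this post, not the actual change to Spatial: the class and method names are placeholders, and only the two exception messages are taken from the real output.

```java
// Hypothetical model of the debug mode described above; NOT jME's Spatial.
class PresumptiveTraceSketch {
    private int refreshFlags = 0;
    private Throwable lastChange;   // stack trace of the most recent flag change

    void setRefreshFlag(int flag) {
        refreshFlags |= flag;
        // Created but not thrown: it only captures where the change came from.
        lastChange = new RuntimeException("State changed at illegal point");
    }

    void updateGeometricState() {
        refreshFlags = 0;
        lastChange = null;          // no problem this frame, discard the record
    }

    void checkBeforeRender() {
        if (refreshFlags != 0) {
            // Attach the recorded trace so the culprit shows up as the cause.
            throw new IllegalStateException(
                    "Scene graph is not properly updated for rendering.", lastChange);
        }
    }

    public static void main(String[] args) {
        PresumptiveTraceSketch node = new PresumptiveTraceSketch();
        node.setRefreshFlag(0x01);
        node.updateGeometricState();
        node.checkBeforeRender();           // passes: change happened before the update
        node.setRefreshFlag(0x10);          // late change
        try {
            node.checkBeforeRender();
        } catch (IllegalStateException e) {
            e.getCause().printStackTrace(); // points at the setRefreshFlag call above
        }
    }
}
```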

I set up a thread that was just continuously adding nodes to the root node and got:

java.lang.RuntimeException: State changed at illegal point
at com.jme3.scene.Spatial.onSetRefreshFlag(Spatial.java:278)
at com.jme3.scene.Spatial.setMatParamOverrideRefresh(Spatial.java:322)
at com.jme3.scene.Node.setMatParamOverrideRefresh(Node.java:143)
at com.jme3.scene.Node.attachChildAt(Node.java:362)
at com.jme3.scene.Node.attachChild(Node.java:332)
at mygame.Main.lambda$simpleInitApp$5(Main.java:499) <-- This is indeed the problem line
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

As the main exception could say “turn on debug mode”, it could cut down on forum questions as well.

So every spatial gets a special extra field and added method calls for the 0.001% of cases where someone does something in a strange place and sees the exception.

Trust me that with a little experience, you will either never see this issue or know exactly what is causing it as soon as you see it.

In all other cases, it’s not a big deal to hack JME locally as you’ve done.

Also, for what it’s worth, there is no reason to throw an exception just to catch and capture. That’s what this method is for:
https://docs.oracle.com/javase/7/docs/api/java/lang/Throwable.html#fillInStackTrace()

Edit: and to be honest, I’m not even sure that’s necessary. I do new Throwable().printStackTrace() all the time and it works to dump the current call stack.
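For illustration, a tiny sketch of capturing the current call stack without throwing anything. StackDumpSketch and captureHere() are made-up names; new Throwable() and getStackTrace() are standard Java.

```java
// Minimal example: a Throwable records the stack at the point where it is constructed.
class StackDumpSketch {
    /** Returns the call stack as seen from inside this method. */
    static StackTraceElement[] captureHere() {
        return new Throwable().getStackTrace();
    }

    public static void main(String[] args) {
        StackTraceElement[] stack = captureHere();
        System.out.println(stack[0].getMethodName()); // captureHere
        new Throwable("who called me?").printStackTrace(); // dumps the stack to stderr
    }
}
```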