[solved] A "Compare function result changed!" bug appeared after updating the SDK

Hello all,

I have a problem with the following bug, which appeared when I updated to the new SDK:

java.lang.UnsupportedOperationException: Compare function result changed! Make sure you do not modify the scene from another thread and that the comparisons are not based on NaN values.
	at com.jme3.util.ListSort.mergeLow(ListSort.java:702)
	at com.jme3.util.ListSort.mergeRuns(ListSort.java:474)
	at com.jme3.util.ListSort.mergeForceCollapse(ListSort.java:423)
	at com.jme3.util.ListSort.sort(ListSort.java:241)
	at com.jme3.renderer.queue.GeometryList.sort(GeometryList.java:158)
	at com.jme3.renderer.queue.RenderQueue.renderGeometryList(RenderQueue.java:262)
	at com.jme3.renderer.queue.RenderQueue.renderQueue(RenderQueue.java:305)
	at com.jme3.renderer.RenderManager.renderViewPortQueues(RenderManager.java:877)
	at com.jme3.renderer.RenderManager.flushQueue(RenderManager.java:779)
	at com.jme3.renderer.RenderManager.renderViewPort(RenderManager.java:1108)
	at com.jme3.renderer.RenderManager.render(RenderManager.java:1158)
	at com.jme3.app.SimpleApplication.update(SimpleApplication.java:253)
	at com.jme3.system.lwjgl.LwjglAbstractDisplay.runLoop(LwjglAbstractDisplay.java:151)
	at com.jme3.system.lwjgl.LwjglDisplay.runLoop(LwjglDisplay.java:197)
	at com.jme3.system.lwjgl.LwjglAbstractDisplay.run(LwjglAbstractDisplay.java:232)
	at java.lang.Thread.run(Thread.java:748)

I know I have probably done something wrong somewhere in my code, but nothing in the stack trace points to my code, so it's hard to find the culprit. Especially because it happens randomly (I need to run the game for around 5 minutes before it happens).

Here is the code which seems to be the problem:

Vector3f newLocation = new Vector3f(
        information.getLocation().x+information.getTranslatingVector().x*advanceTime,
        information.getLocation().y+information.getTranslatingVector().y*advanceTime-0.2f,
        information.getLocation().z+information.getTranslatingVector().z*advanceTime);
playerModel.setLocalTranslation(newLocation);

Here is a little background about what this code is doing:
We can assume for now that the game is a multiplayer FPS; the bug happens client side.

Enemy information is updated this way:

  • When I receive data from the server (like an enemy's new location), I change the "information" object (which contains only data, no geometries at all) without modifying the scene.
  • When simpleUpdate runs, I move the geometry representing the enemy to the new location (the code above).

I also update the player's rotation and other information in a similar way without triggering the bug. There is also an update of physical objects, which works like a charm. The only problem I have so far is when I directly modify the location of a geometry in simpleUpdate.

Has anyone had the same bug? Is there something I'm doing wrong?

Thanks in advance,

Ga3L

You should check to see if any of your positions become "NaN" because I think that is one of the things that will trigger the bug. Edit: the other thing is modifying positions while the sort is being done... but I'm going to assume (for the moment) that you aren't doing that.
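A quick way to run that check is to test each component before handing the position to the scene. This is only a sketch: `FiniteCheck` and `isFinite` are hypothetical names, not part of jMonkeyEngine, and the values in `main` are made up to reproduce a NaN.

```java
// Hypothetical helper, not part of jMonkeyEngine.
final class FiniteCheck {
    private FiniteCheck() {}

    /** True only if all three components are ordinary numbers (neither NaN nor infinite). */
    static boolean isFinite(float x, float y, float z) {
        return Float.isFinite(x) && Float.isFinite(y) && Float.isFinite(z);
    }

    public static void main(String[] args) {
        float advanceTime = 0f / 0f;      // floating-point 0/0 yields NaN, no exception
        float x = 1f + 2f * advanceTime;  // NaN propagates through the arithmetic
        System.out.println(isFinite(x, 0f, 0f)); // prints "false"
    }
}
```

In simpleUpdate, a check like this could be called on the three components of the new location, skipping (or logging) the update when it returns false.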

Just a tip from experience, I'm guessing that "information" is not thread safe at all? If you are writing to it from the network thread and reading it from the render thread then you are going to see strange results all the time. You need a thread-safe data structure.

Is that code trying to "tween" or something?

Edit: I'm assuming you aren't modifying scene things from the network thread but it's hard to tell for sure. The seemingly unsafe 'information' sharing is a clue that maybe there is some threading inexperience.
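One possible shape for such a structure, as a sketch only: the network thread publishes an immutable snapshot through an `AtomicReference`, and the render thread reads whichever snapshot is current. `EnemyState` and `EnemyInfo` are hypothetical names made up for this illustration; they are not from the original code or from any library.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable snapshot of what the server sent for one enemy.
final class EnemyState {
    final float x, y, z;    // last known location
    final float vx, vy, vz; // translating vector

    EnemyState(float x, float y, float z, float vx, float vy, float vz) {
        this.x = x; this.y = y; this.z = z;
        this.vx = vx; this.vy = vy; this.vz = vz;
    }
}

// Hypothetical holder shared between the network thread and the render thread.
final class EnemyInfo {
    private final AtomicReference<EnemyState> latest =
            new AtomicReference<>(new EnemyState(0f, 0f, 0f, 0f, 0f, 0f));

    /** Called from the network thread: swaps in a complete new snapshot. */
    void publish(EnemyState state) {
        latest.set(state);
    }

    /** Called from simpleUpdate: returns one consistent snapshot, never a half-written one. */
    EnemyState snapshot() {
        return latest.get();
    }
}
```

Because each snapshot is immutable and replaced as a whole, the reader can never observe a location from one update paired with a vector from another, and neither thread has to block the other.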

I checked, and you were right: it was only a NaN problem. The value "advanceTime" became NaN due to a divide by 0 caused by a ping of 0 ms (which happens because the server is on the same machine as the client).
I don't know why this error didn't happen in the previous version: either the ping never dropped to 0 ms, or a NaN in a location wasn't triggering any error?
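For anyone hitting the same thing, here is a sketch of how the division can be guarded. The exact formula is an assumption made for illustration (elapsed time divided by ping); `TweenTime`, `advanceTime`, `elapsedMs` and `pingMs` are hypothetical names, not the real code.

```java
// Hypothetical helper; the elapsed/ping formula is assumed for illustration only.
final class TweenTime {
    private TweenTime() {}

    /**
     * Returns elapsedMs / pingMs, or 0 when the ping is zero, negative or NaN,
     * so that a 0 ms ping can never produce NaN or Infinity.
     */
    static float advanceTime(float elapsedMs, float pingMs) {
        if (!(pingMs > 0f)) { // written this way so a NaN ping is rejected too
            return 0f;
        }
        float t = elapsedMs / pingMs;
        return Float.isFinite(t) ? t : 0f;
    }

    public static void main(String[] args) {
        System.out.println(advanceTime(0f, 0f));   // prints "0.0" instead of NaN
        System.out.println(advanceTime(10f, 20f)); // prints "0.5"
    }
}
```

Falling back to 0 simply leaves the enemy at its last known location for that frame; clamping the ping to a small minimum would be another reasonable choice.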

Yes, that code is tweening to make the enemy move smoothly between two updates from the server.

Because of the error message, I suspected a threading problem more, especially because this code is tied to network synchronization.

The method which pushes the new information and the one which updates the object are both synchronized, to keep the network thread from accessing it while simpleUpdate is updating.

Anyway, thanks a lot!

Updating the SDK most likely updated the version of the engine, which made your code perform faster it seems :smiley:
Or, which is more likely: you changed something in your codebase in parallel with upgrading the SDK, and so you suspected the SDK to be the issue.

We changed to a more optimized sort some time back and it is sensitive to this ordering issue. Because comparing a number to NaN is always false, it looks like !(A > B) and !(B > A) and A != B... which doesn't make sense to the sort routine.
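To make that concrete, here is a small sketch. `compareDistance` is a hypothetical comparison built from the relational operators; it is not copied from the engine. With NaN on either side it answers "greater" in both directions, which is the kind of contradiction the sort detects.

```java
// Hypothetical comparison for illustration; not engine code.
final class NanCompare {
    private NanCompare() {}

    /** Orders two distances using ==, < and an implicit "otherwise greater". */
    static int compareDistance(float d1, float d2) {
        if (d1 == d2) {
            return 0;
        }
        return d1 < d2 ? -1 : 1;
    }

    public static void main(String[] args) {
        float nan = Float.NaN;
        // Every relational test against NaN is false, so both calls fall through to 1.
        System.out.println(compareDistance(nan, 1f)); // prints "1"
        System.out.println(compareDistance(1f, nan)); // prints "1"
        // Float.compare defines a total order in which NaN sorts above everything else.
        System.out.println(Float.compare(nan, 1f));   // prints "1"
        System.out.println(Float.compare(1f, nan));   // prints "-1"
    }
}
```

`Float.compare` stays self-consistent because it treats NaN as larger than every other value, but a NaN position is still a bug worth fixing at the source rather than sorting around.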

synchronized is SUPER heavy-handed for this sort of thing.

Note: SimMath already has thread-safe data structures for tweening:

One thread can push new positions and any number of threads can ask for "tweens".

Like so: