[solved] A "Compare function result changed!" bug appeared after updating the SDK


#1

Hello all,

I have a problem with the following exception, which appeared after I updated to the new SDK:

java.lang.UnsupportedOperationException: Compare function result changed! Make sure you do not modify the scene from another thread and that the comparisons are not based on NaN values.
	at com.jme3.util.ListSort.mergeLow(ListSort.java:702)
	at com.jme3.util.ListSort.mergeRuns(ListSort.java:474)
	at com.jme3.util.ListSort.mergeForceCollapse(ListSort.java:423)
	at com.jme3.util.ListSort.sort(ListSort.java:241)
	at com.jme3.renderer.queue.GeometryList.sort(GeometryList.java:158)
	at com.jme3.renderer.queue.RenderQueue.renderGeometryList(RenderQueue.java:262)
	at com.jme3.renderer.queue.RenderQueue.renderQueue(RenderQueue.java:305)
	at com.jme3.renderer.RenderManager.renderViewPortQueues(RenderManager.java:877)
	at com.jme3.renderer.RenderManager.flushQueue(RenderManager.java:779)
	at com.jme3.renderer.RenderManager.renderViewPort(RenderManager.java:1108)
	at com.jme3.renderer.RenderManager.render(RenderManager.java:1158)
	at com.jme3.app.SimpleApplication.update(SimpleApplication.java:253)
	at com.jme3.system.lwjgl.LwjglAbstractDisplay.runLoop(LwjglAbstractDisplay.java:151)
	at com.jme3.system.lwjgl.LwjglDisplay.runLoop(LwjglDisplay.java:197)
	at com.jme3.system.lwjgl.LwjglAbstractDisplay.run(LwjglAbstractDisplay.java:232)
	at java.lang.Thread.run(Thread.java:748)

I know I have probably done something wrong somewhere in my code, but nothing in the stack trace points to my code, so it’s hard to find the culprit, especially because it happens randomly (I need to run the game for around 5 minutes before it happens).

Here is the code which seems to be the problem:

Vector3f newLocation = new Vector3f(
        information.getLocation().x+information.getTranslatingVector().x*advanceTime,
        information.getLocation().y+information.getTranslatingVector().y*advanceTime-0.2f,
        information.getLocation().z+information.getTranslatingVector().z*advanceTime);
playerModel.setLocalTranslation(newLocation);

Here is a little background about what this code is doing:
For now, assume the game is a multiplayer FPS; the bug happens on the client side.

Enemy information is updated this way:

  • When I receive data from the server (like an enemy’s new location), I change the “information” object (which contains only data, no geometries at all) without modifying the scene.
  • When simpleUpdate runs, I move the geometry representing the enemy to the new location (the code above).

I also update the player’s rotation and other information in a similar way without triggering the bug. There is also an update of physics objects, which works like a charm. The only problem I have so far is when I directly modify the location of a geometry in simpleUpdate.

Has anyone had the same bug? Is there something I’m doing wrong?

Thanks in advance,

Ga3L


#2

You should check to see if any of your positions become “NaN” because I think that is one of the things that will trigger the bug. Edit: the other thing is modifying positions while the sort is being done… but I’m going to assume (for the moment) that you aren’t doing that.

Just a tip from experience, I’m guessing that “information” is not thread safe at all? If you are writing to it from the network thread and reading it from the render thread then you are going to see strange results all the time. You need a thread-safe data structure.

Is that code trying to “tween” or something?

Edit: I’m assuming you aren’t modifying scene things from the network thread but it’s hard to tell for sure. The seemingly unsafe ‘information’ sharing is a clue that maybe there is some threading inexperience.


#3

I checked, and you were right: it was just a NaN problem. The value “advanceTime” became NaN because of a division by zero caused by a ping of 0 ms (which happens because the server is on the same machine as the client).
I don’t know why this error didn’t happen in the previous version: either the ping never dropped to 0 ms, or a NaN in the location didn’t trigger any error?
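
For anyone hitting the same thing, here is a minimal sketch of guarding the tween factor. The thread does not show how “advanceTime” is computed, so this assumes it is roughly elapsed time divided by the ping interval; the names TweenGuard, advanceFactor and isFinite are hypothetical and not part of jME.

```java
final class TweenGuard {
    /**
     * Hypothetical helper: how far along we are between two server updates.
     * Falls back to 0 when the interval is zero, negative, NaN or infinite,
     * so the result is never NaN or infinite.
     */
    static float advanceFactor(float elapsedMs, float intervalMs) {
        // !(x > 0) is also true when x is NaN, so this catches that case too.
        if (!(intervalMs > 0f) || Float.isInfinite(intervalMs)) {
            return 0f;
        }
        float t = elapsedMs / intervalMs;
        return Float.isFinite(t) ? t : 0f;
    }

    /** True when all three components are ordinary finite numbers. */
    static boolean isFinite(float x, float y, float z) {
        return Float.isFinite(x) && Float.isFinite(y) && Float.isFinite(z);
    }

    public static void main(String[] args) {
        System.out.println(0f / 0f);                  // NaN
        System.out.println(advanceFactor(0f, 0f));    // 0.0
        System.out.println(advanceFactor(25f, 50f));  // 0.5
    }
}
```

In simpleUpdate, checking isFinite on the computed coordinates before calling setLocalTranslation keeps a bad value out of the scene graph, and logging the rejected value points at the real culprit much faster than the sort exception does.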

Yes, that code is tweening to make enemies move smoothly between two updates from the server.

Because of the error message, I suspected a threading problem, especially because this code is tied to network synchronization.

The method that pushes the new information and the one that updates the object are both synchronized, to keep the network thread from accessing it while simpleUpdate is updating.

Anyway, thanks a lot!


#4

Updating the SDK most likely updated the version of the engine, which made your code run faster, it seems :smiley:
Or, more likely: you changed something in your codebase at the same time as upgrading the SDK, and so you suspected the SDK was the issue.


#5

We changed to a more optimized sort some time back, and it is sensitive to this ordering issue. Because any ordered comparison of a number with NaN is false, it looks like !(A > B) and !(B > A) and A != B… which doesn’t make sense to the sort routine.
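
To make that concrete, here is a small standalone sketch in plain Java. The NAIVE comparator is a made-up illustration built only from < and >; it is not the comparator jME actually uses.

```java
import java.util.Comparator;

final class NanCompareDemo {
    /** Illustrative comparator built only from < and >, so NaN "ties" with everything. */
    static final Comparator<Float> NAIVE =
            (a, b) -> a < b ? -1 : (a > b ? 1 : 0);

    public static void main(String[] args) {
        float a = Float.NaN, b = 1f, c = 2f;

        System.out.println(a > b);   // false
        System.out.println(b > a);   // false
        System.out.println(a != b);  // true

        // NaN compares as "equal" to both 1 and 2, yet 1 and 2 are not equal
        // to each other. That is the contradiction a merge-based sort can trip over.
        System.out.println(NAIVE.compare(a, b));  // 0
        System.out.println(NAIVE.compare(a, c));  // 0
        System.out.println(NAIVE.compare(b, c));  // -1

        // Float.compare defines a total order in which NaN sorts above every other value.
        System.out.println(Float.compare(a, b) > 0);  // true
    }
}
```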

synchronized is SUPER heavy-handed for this sort of thing.
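
As a rough idea of a lighter approach (this is only a sketch in plain Java, not the SimMath API): have the network thread publish an immutable snapshot through an AtomicReference and let the render thread read whatever is current. EnemyState and EnemyStateBox are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Immutable snapshot of what the server last reported (hypothetical type). */
final class EnemyState {
    final float x, y, z;
    final long timestampMs;

    EnemyState(float x, float y, float z, long timestampMs) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.timestampMs = timestampMs;
    }
}

/** Hands the latest snapshot from one writer thread to any number of readers. */
final class EnemyStateBox {
    private final AtomicReference<EnemyState> latest =
            new AtomicReference<>(new EnemyState(0f, 0f, 0f, 0L));

    /** Called from the network thread when new data arrives. */
    void publish(EnemyState state) {
        latest.set(state);
    }

    /** Called from the render thread, e.g. inside simpleUpdate. */
    EnemyState read() {
        return latest.get();
    }
}
```

Because each snapshot is immutable and swapped in as a whole, the reader never sees a half-updated position, and neither thread blocks the other.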

Note: SimMath already has thread-safe data structures for tweening:

One thread can push new positions and any number of threads can ask for “tweens”.

Like so: