StandardGame memory synchronization issues

When adding tasks via GameTaskQueueManager, there is no synchronization involved and so, the memory among threads is not consistent. This can cause random NullPointerExceptions among other things.

For example:

  1. Thread A creates alot of objects (a part of a scenegraph) and then calls GameTaskQueueManager.update() to attach them to the scenegraph being rendered.
  2. Some object fields reside in a memory local to the Thread A and are not synchronized with the main memory.
  3. When the GameTask.invoke() is executed the synchronized keyword causes the OpenGL thread to update it's local memory from the main memory. However the main memory has inconsistent values because Thread A hasn't flushed the local memory cache.
  4. If the inconsistent object field was an object pointer that is null in the main memory but set in Thread A local memory cache, then a NullPointerException gets thrown on any attempt to call methods on the object.

    Adding “synchronized” keyword to GameTaskQueue.enqueue() should solve this problem. This will cause Thread A to flush the local memory cache into the main memory.

    Also, the lock used for OpenGL thread is a fair lock. However, according to the following article from IBM, fair locks are very slow and even locking via synchronized doesn’t use fair locks.
but would also create some synchronization problems that would perhaps hinder performance.

Why is everyone so SCARED of synchronization??

First of all, to add a task in the queue, you create new Future object every time, that alone is pretty slow and comparable to a synchronization call. How come nobody cares about that?
Second of all, whenever you call any task you call a synchronized method on the task: GameTask.ivnoke()! Again, why is that not creating so called synchronization problems.

The synchronization is ALREADY in there, but only one way: OpenGL thread syncs thread memory to main memory, but any other thread using GameTaskQueue is not syncronizing its memory to the main memory. That's the problem.

Synchronization is slower than a regular function call, and one of the reasons is because of synchronizing memory. And in this case this is exactly what is needed: to synchronize the memory. You cant have multithreaded application and expect to pay nothing for thread synchronization, that's just not realistic.

And why why why do you say that synchronization is in a problem... It is not a problem it is a solution: to the problem of the game crashing - i.e. not being usable.

It's like a myth, and it's starting to drive me nuts! First it was "java is slow" now it's "synchronization is slow". Yes it is slower, but it's not THAT slow, and there absolutely no reason to avoid synchronization in places where it helps.

The only time synchronized vs not synchronized makes a difference is when you have a crazy multithreaded algorithm that does hundreds of thousands of calls per second. And the GameTaskQueue is far from that. And it's arealdy using synchronization either way and creates garbage collectable GameTasksObjects. Another synchronized call there will not cause you to get 1 fps.

Wouldn't volatile variables be required for this instead of synchronized access?

A volatile variable would suffice if you were sharing that variable among the Thread A and the OpenGL thread. The problem is: you can create any object tree of any depth in the Tread A and then pass the root of the tree to the OpenGL thread. For volatile approach to work, every single variable in that object tree would have to be declared volatile. However volatile variables require per-variable monitoring and for many variables it’s much cheaper to synchronize all of them via synchronized code block / method.

Besides blocking threads, synchronized keyword has another effect. When JVM encounters the synchronized code block, it updates the local thread cache from the main memory, and before JVM exits the synchronized code block, it writes the local thread cache back into the main memory.

Edit: under new JMM semantics (1.5 and up) it is possible to use volatile variables as flags. Because writing to a volatile variable was changed to force the memory synchronization with JSR133. So the cost of a write to a volatile variable is roughly half of the cost of synchronized. So it is doable to use a volatile variable to flush the thread’s local memory.

I would post a test, but this problem like most other multithreading problems is very hard to replicate. In my program the bug is manifested about 1 in 100 000. So everything works fine for a couple of minutes to half an hour, then suddenly crashes with null pointer exception.

Probably synchronizing the whole method would be too much. Since the queue is thread safe, this should do it:

    public <V> Future<V> enqueue(Callable<V> callable) {
       GameTask<V> task = null;
       synchronized (this) {
          task = new GameTask<V>(callable);
        return task;

I must admit I'm not up on this area of Java.  I don't mind us putting this in if enough people who know what they are talking about chime in with affirmation. :slight_smile:

Ok, I'll try to come up with a test to expose the issue. And then you can test that fix solves the problem.

While I was at it, I found some other issues with standard game, taht are pretty straightforward:

  • there is a variable named frameCount that is declared, incremented, compared and set to zero, but never actually used for anything useful.
  • shudown() method cleans joystick and audio system, however that method is never called if the game is closed via the window button (display.isClosing()). Also the audio system and the joystick are not cleaned when calling StandardGame.finish(). Possible solution: move the cleanup code in the clean() method and remove shutdown() method.

Things that are StandardGame specific would be DF's area

I am not exactly sure this is desirable. As the implementation of ConcurrentLinkedQueue<E> states, the queue (that used in GameTaskManager) is a no wait operation, so it never synchronizes. In other words, what lex is proposing would solve the problem, but would also create some synchronization problems that would perhaps hinder performance.

While I agree that this might be a problem, I think it is a lot more important to let the user do a synchronized block in their code, and let the responsability of consistency be there (or perhaps create a API call for it, but still reserve the fast implementation there for people to use with care).

Does it make sense?

Instead of lex's small modification being made maybe instead he should optimize the whole game task queue system using his incredible knowledge of concurrency.  :wink:

Man, I was about to answer with different examples in which a synchronization would not be necessary, and good programming techniques would solve the problem, but your reply just left me speechless  :smiley:

It is important to have a stable game engine, if a problem is to be fixed then it must be fixed right, and for that you need an expert.

Momoko_Fan said:

Instead of lex's small modification being made maybe instead he should optimize the whole game task queue system using his incredible knowledge of concurrency.  ;)

I'm making dynamic loading system which requires a separate thread and tight resource management.

Thought I would not consider myself an expert, I can spot and fix the problems. However I would not go as far as using custom locks and volatiles because it's very easy to make a mistake (such as not releasing the lock in the finally block and comparing two volatiles). Unless there is a very good performance gain when avoiding synchronized blocks - at least a couple of percent in overall application performance.

I kept reading documentation for ConcurrentLinkedQueue and at the end they state that it insures memory consistency. I have refactored StandardGame by removing the reentrant lock, simplifying the GameTaskQueue and removing singletons.

Now the bug seems to be gone for the time being, even without synchronization.

It was definitly some kind of multithreading bug, but It is very unlikely that the implementation of ConcurrentLinkedQueue is bugged. Maybe it was some bizzare problem due to multiple memory flushes with ConcurrentLinkedQueue, synchronized invoke() method and the reentrant lock at the same time.

Everyone is happy now, I am happy at least, and I am waiting for the Diff Patch and to be commited to CVS  :wink: