AbstractGame thread safety

Heya,

  Long time no post (new job, etc etc), but I've still been slowly developing my game. I sync'd with head and saw a thread safety issue, here's the patch. Setting elements of an array like getAttributes() to get around the "final" is a concurrency no-no. I threw the patch on my site in case it gets mangled here:

http://mindwidgets.com/jme/com.jme.app.AbstractGame-04142008-patch.txt



Index: src/com/jme/app/AbstractGame.java

===================================================================

RCS file: /cvs/jme/src/com/jme/app/AbstractGame.java,v

retrieving revision 1.35

diff -u -r1.35 AbstractGame.java

— src/com/jme/app/AbstractGame.java 10 Apr 2008 16:08:23 -0000 1.35

+++ src/com/jme/app/AbstractGame.java 15 Apr 2008 02:32:57 -0000

@@ -36,6 +36,7 @@

import java.net.MalformedURLException;

import java.net.URL;

import java.util.Stack;

+import java.util.concurrent.atomic.AtomicReference;

import java.util.logging.Level;

import java.util.logging.Logger;



@@ -191,22 +192,26 @@

        if ((!loaded && dialogBehaviour == FIRSTRUN_OR_NOCONFIGFILE_SHOW_PROPS_DIALOG)

            || dialogBehaviour == ALWAYS_SHOW_PROPS_DIALOG) {



-            final LWJGLPropertiesDialog[] dialog = new LWJGLPropertiesDialog[1];

+            final AtomicReference<LWJGLPropertiesDialog> dialogRef = new AtomicReference<LWJGLPropertiesDialog>();

            final Stack<Runnable> mainThreadTasks = new Stack<Runnable>();

            try {

  •             EventQueue.invokeLater(new Runnable() {
  •                 public void run() {
  •                     dialog[0] = new LWJGLPropertiesDialog(properties, dialogImage, mainThreadTasks);
  •                 }
  •             });

    +            if (EventQueue.isDispatchThread()) {

    +            dialogRef.set(new LWJGLPropertiesDialog(properties, dialogImage, mainThreadTasks));

    +            } else {

    +            EventQueue.invokeAndWait(new Runnable() {

    +            public void run() {

    +            dialogRef.set(new LWJGLPropertiesDialog(properties, dialogImage, mainThreadTasks));

    +            }

    +            });

    +            }

                } catch (Exception e) {

                    logger.logp(Level.SEVERE, this.getClass().toString(),

                            "LWJGLPropertiesDialog(PropertiesIO, URL)", "Exception", e);

                    return;

                }

           

    -

    -            while (dialog[0] == null || dialog[0].isVisible()) {

    +            final LWJGLPropertiesDialog dialog = dialogRef.get();

    +            while (dialog == null || dialog.isVisible()) {

                    try {

                    // check worker queue for work

                    while (!mainThreadTasks.isEmpty()) {

    @@ -219,7 +224,7 @@

                    }

                }



    -            if (dialog[0].isCancelled()) {

    +            if (dialog.isCancelled()) {

                    //System.exit(0);

                    finish();

                }

Hmm, this patch causes a hang on my machine because the change to invokeAndWait will not return until some tasks are completed that need to be run below that in the mainThreadTasks.  Perhaps it worked for you because this part is true for you?

EventQueue.isDispatchThread()





Anyhow, I fixed it by returning to invokeLater and using dialogRef.get() in place of the dialog variables below.  The new patch (which is now applied in cvs) is below (sorry for whitespace difs):


### Eclipse Workspace Patch 1.0
#P jme
Index: src/com/jme/app/AbstractGame.java
===================================================================
RCS file: /cvs/jme/src/com/jme/app/AbstractGame.java,v
retrieving revision 1.35
diff -u -r1.35 AbstractGame.java
--- src/com/jme/app/AbstractGame.java   10 Apr 2008 16:08:23 -0000   1.35
+++ src/com/jme/app/AbstractGame.java   15 Apr 2008 13:31:20 -0000
@@ -36,6 +36,7 @@
 import java.net.MalformedURLException;
 import java.net.URL;
 import java.util.Stack;
+import java.util.concurrent.atomic.AtomicReference;
 import java.util.logging.Level;
 import java.util.logging.Logger;
 
@@ -189,24 +190,29 @@
         boolean loaded = properties.load();
 
         if ((!loaded && dialogBehaviour == FIRSTRUN_OR_NOCONFIGFILE_SHOW_PROPS_DIALOG)
-            || dialogBehaviour == ALWAYS_SHOW_PROPS_DIALOG) {
+            || dialogBehaviour == ALWAYS_SHOW_PROPS_DIALOG) {
 
-            final LWJGLPropertiesDialog[] dialog = new LWJGLPropertiesDialog[1];
-            final Stack<Runnable> mainThreadTasks = new Stack<Runnable>();
-            try {
-               EventQueue.invokeLater(new Runnable() {
-                   public void run() {
-                       dialog[0] = new LWJGLPropertiesDialog(properties, dialogImage, mainThreadTasks);
-                   }
-               });
-            } catch (Exception e) {
-                logger.logp(Level.SEVERE, this.getClass().toString(),
-                        "LWJGLPropertiesDialog(PropertiesIO, URL)", "Exception", e);
-                return;
-            }
+         final AtomicReference<LWJGLPropertiesDialog> dialogRef = new AtomicReference<LWJGLPropertiesDialog>();
+         final Stack<Runnable> mainThreadTasks = new Stack<Runnable>();
+         try {
+            if (EventQueue.isDispatchThread()) {
+               dialogRef.set(new LWJGLPropertiesDialog(properties,
+                     dialogImage, mainThreadTasks));
+            } else {
+               EventQueue.invokeLater(new Runnable() {
+                  public void run() {
+                     dialogRef.set(new LWJGLPropertiesDialog(properties,
+                           dialogImage, mainThreadTasks));
+                  }
+               });
+            }
+         } catch (Exception e) {
+            logger.logp(Level.SEVERE, this.getClass().toString(),
+                  "AbstractGame.getAttributes()", "Exception", e);
+            return;
+         }
            
-
-            while (dialog[0] == null || dialog[0].isVisible()) {
+         while (dialogRef.get() == null || dialogRef.get().isVisible()) {
                 try {
                    // check worker queue for work
                    while (!mainThreadTasks.isEmpty()) {
@@ -219,7 +225,7 @@
                 }
             }
 
-            if (dialog[0].isCancelled()) {
+            if (dialogRef.get().isCancelled()) {
                 //System.exit(0);
                 finish();
             }



Thanks for the suggestions, the AtomicReference is something I was not well familiar with.  Any other suggestions to further improve on this are of course welcome.

I guess that’s the problem with concurrency, so many things work by accident. I didn’t realize I submitted my own bug fixing another :wink: You may want to store the dialog value in a temp variable to make the while expression an atomic check:



LWJGLPropertiesDialog dialogCheck = dialogRef.get();

while (dialogCheck == null || dialogCheck.isVisible()) {



    try {

        // check worker queue for work

        while (!mainThreadTasks.isEmpty()) {

            mainThreadTasks.pop().run();

        }

        // go back to sleep for a while

        Thread.sleep(50);

    } catch (InterruptedException e) {

        logger.warning( “Error waiting for dialog system, using defaults.”);

    }


   

    dialogCheck = dialogRef.get();

}



This is because theoretically between the two gets in the original while statement the value of dailogRef could change from not null to null, so you’d get an NPE. Sounds fussy but so many java apps break from little things like this when moved from single core to dual/quad.



java.util.concurrent is one of my favorite libraries, its very well designed. If you have shared state it can really cut down code size and give performance boosts. Java Concurrency in Practice significantly helped my understanding, and it covers most of what we need to know in under 400 pages.

Crap, just realized this line may access AWT state outside of the AWT thread:



while (dialogRef.get() == null || dialogRef.get().isVisible()) {



I'm at work now so can't code up a solution right now, I can take a look at this later tonight. AWT can be such a pain sometimes :wink:




Thanks for the book suggestion, I'll pop that in my cart. :)  I've locally changed to use the atomic check… 



As for isVisible, it seems really onerous to have to ask isVisible in the AWT thread.  Conceptually I get that this is correct because it ensures serial operations on that value…  But dang.

Visibility is more of an issue with that code. For instance, lets say you have the AWT thread running on CPU0 and WorkerThread running on CPU1. Each of these CPU's have a few megabytes of local cache so they don't have to access the slower system memory all the time. Since there is no synchronization around the visible state, the CPUs don't know that their local cached copies of visible need to be in sync.



If CPU0 sets visibleto true, it may only set its local cache copy of that variable. CPU1 may have its own local cache copy of visible, and never check system memory. Synchronization via the keyword or the new java.util.concurrent locks would let the VM know that the value of visible is shared and must be kept in sync.

Why use AtomicReference instead of volatile keyword?

darkfrog said:

Why use AtomicReference instead of volatile keyword?


needs to be final to be accessed in the anonymous Runnable.

Anonymous classes are of the devil!  :cry:



Repent sinner!

lazlohf said:
If CPU0 sets visibleto true, it may only set its local cache copy of that variable. CPU1 may have its own local cache copy of visible, and never check system memory. Synchronization via the keyword or the new java.util.concurrent locks would let the VM know that the value of visible is shared and must be kept in sync.


Good to know.  Is this really likely in this scenario though?
renanse said:

Good to know.  Is this really likely in this scenario though?


Of course not, but he just got done reading a pretty good book and wants to put it to use...who can blame him? ;)

Seriously though, volatile is the preferred approach to this as the Atomic classes are simply logical wrappers around the internal functionality of volatile.  Volatile is usually of most benefit in a scenario when you're modifying primitives from multiple threads.  As an example, I was once working on a project that simply did:

public void doSomething() {
   ...
   i++;
}



However, I had multiple threads that could potentially invoke it at the same time.  I didn't know it was possible (and it really freaked me out since there was absolutely no other code that modified the primitive value), but at one point I checked the value and it was something like -934523403...which, logically was impossible, but because of the concurrent modification it corrupted the primitive value.
darkfrog said:

renanse said:

Good to know.  Is this really likely in this scenario though?


Of course not, but he just got done reading a pretty good book and wants to put it to use...who can blame him? ;)



Or I have a job where I track down concurrency problems the "experts" and "architects" can't find, because they think it can't happen yet their systems are failing ;) Yes Renanse, it happens all the time and tracking these issues down can take days to weeks because it "only happens on that one guy's machine about once a week and we can't reproduce it, but it crashes his app". The probability of these issues happening shoots up quickly as the number of cores increases. Another very common signature is that the app behaves fine until it is under load, then it blows up (which is usually the worst timing). Is it a big deal in AbstractGame? Maybe not, the user will probably just restart their game if it actually occurs.

darkfrog said:

Seriously though, volatile is the preferred approach to this as the Atomic classes are simply logical wrappers around the internal functionality of volatile. 


Atomic classes also have atomic compare/get operations and all the other stuff that most programmers either don't do correctly or have boilerplate code they have to copy all over. Also, don't rely on volatile in Java VM's lower than 1.5, it will not work correctly even on booleans (this is a well documented problem).

darkfrog said:

Volatile is usually of most benefit in a scenario when you're modifying primitives from multiple threads.  As an example, I was once working on a project that simply did:

public void doSomething() {
   ...
   i++;
}



However, I had multiple threads that could potentially invoke it at the same time.  I didn't know it was possible (and it really freaked me out since there was absolutely no other code that modified the primitive value), but at one point I checked the value and it was something like -934523403...which, logically was impossible, but because of the concurrent modification it corrupted the primitive value.


Yep volatiles are mainly used on primitives, and the code above is an example of how not to use them :) The ++ operand is not atomic, which probably caused your corruption. Using AtomicInteger for the i variable would get you what you want.

http://java.sun.com/javase/6/docs/api/java/util/concurrent/atomic/AtomicInteger.html

There will be concurrency sessions at JavaOne, you should check them out and ask Josh Bloch and friends if you don't believe what I wrote here. Especially go to Java Puzzlers if you can, the volatile i++ is one of Bloch's puzzles and he will explain why it doesn't work.

I would agree that thread-safety issues can be among the most complicated problems to solve in programming.



…and sorry, I didn't mean to put you on the defensive.


Especially go to Java Puzzlers if you can, the volatile i++ is one of Bloch's puzzles and he will explain why it doesn't work


Interesting, what chapter is that in?  I've done extensive testing using volatile i++ and I haven't had any problems.

i++ is actually three operations and is not atomic.



eg, fetch the i value, add one to it, write the value back



During those operations another thread may have read from it. Of course, making the decision is all down to tolerance of the application and wether such atomic accuracy at a given junction is necesary

I realize this. :slight_smile:

darkfrog said:


Especially go to Java Puzzlers if you can, the volatile i++ is one of Bloch's puzzles and he will explain why it doesn't work


Interesting, what chapter is that in?  I've done extensive testing using volatile i++ and I haven't had any problems.


Sorry, its not in the current Puzzlers revision but from what I've been told will be in the next (target release date is this JavaOne).
renanse said:

Thanks for the suggestions, the AtomicReference is something I was not well familiar with.  Any other suggestions to further improve on this are of course welcome.


I think this patch takes care of the threading issues for AbstractGame.getAttributes() with minimal changes. I only tested on my dual core XP laptop. Having a standard task queue manager for OpenGL, AWT/Swing, OpenAL, etc may be something for the long term. Maybe a blend of parts of GameTaskQueue and the jmex StandardGame.

Patch: http://www.mindwidgets.com/jme/AbstractGameThreading2-04202008-patch.txt