JME3 Performance Issue

Hi JME Community

I am using JME3 to visualize scientific data gathered during four-dimensional image processing in Java. As soon as I add a SimpleApplication to my application, the performance of my main loop gets worse every few iterations. The calculation runs in a multithreaded environment and its results are then passed to a consumer for visualization. The consumer in this case is a SimpleApplication who has a shared threadsafe queue holding the data to display. Even if I don’t display any kind of mesh or geometry and simply start a new SimpleApplication in a new thread, the performance keeps getting worse. Please have a look at my log file:

<— Start executing FastMarchingEngine with DistanceCalculator: Medialness3DDistanceCalculator | NeighbourhoodCalculator: Connected26Neighbourhood3DCalculator
Loop #100 average 100 processing time: 13 [ms]
Heap Size: 488
Loop #200 average 100 processing time: 7 [ms]
Heap Size: 834
Loop #300 average 100 processing time: 8 [ms]
Heap Size: 1328
Loop #400 average 100 processing time: 8 [ms]
Heap Size: 1642
Loop #500 average 100 processing time: 8 [ms]
Heap Size: 2145
Loop #600 average 100 processing time: 8 [ms]
Heap Size: 2701
Loop #700 average 100 processing time: 8 [ms]
Heap Size: 3165
Loop #800 average 100 processing time: 7 [ms]
Heap Size: 3389
Loop #900 average 100 processing time: 6 [ms]
Heap Size: 3591
Loop #1000 average 100 processing time: 6 [ms]
Heap Size: 3753


Loop #8600 average 100 processing time: 6 [ms]
Heap Size: 22548
Loop #8700 average 100 processing time: 6 [ms]
Heap Size: 22722
Loop #8800 average 100 processing time: 6 [ms]
Heap Size: 22898
Loop #8900 average 100 processing time: 7 [ms]
Heap Size: 23127
Loop #9000 average 100 processing time: 7 [ms]
Heap Size: 23392
Loop #9100 average 100 processing time: 5 [ms]
Heap Size: 23390
Loop #9200 average 100 processing time: 6 [ms]
Heap Size: 23438
Loop #9300 average 100 processing time: 7 [ms]
Heap Size: 23716
Loop #9400 average 100 processing time: 8 [ms]
Heap Size: 24149
Loop #9500 average 100 processing time: 6 [ms]
Heap Size: 24306
Loop #9600 average 100 processing time: 7 [ms]
Heap Size: 24545
Loop #9700 average 100 processing time: 7 [ms]
Heap Size: 24802

Now I start the SimpleApplication, which displays nothing, for debugging purposes:

Loop #9900 average 100 processing time: 8 [ms]
Heap Size: 24998
Loop #10000 average 100 processing time: 8 [ms]
Heap Size: 25260
Loop #10100 average 100 processing time: 7 [ms]
Heap Size: 25404
Loop #10200 average 100 processing time: 7 [ms]
Heap Size: 25491
Loop #10300 average 100 processing time: 8 [ms]
Heap Size: 25567
Loop #10400 average 100 processing time: 8 [ms]
Heap Size: 25730
Loop #10500 average 100 processing time: 11 [ms]
Heap Size: 25908
Loop #10600 average 100 processing time: 10 [ms]


Loop #13900 average 100 processing time: 35 [ms]
Heap Size: 30472
Loop #14000 average 100 processing time: 34 [ms]
Heap Size: 30623
Loop #14100 average 100 processing time: 34 [ms]
Heap Size: 30642
Loop #14200 average 100 processing time: 37 [ms]
Heap Size: 30745
Loop #14300 average 100 processing time: 39 [ms]
Heap Size: 30813
Loop #14400 average 100 processing time: 45 [ms]


Loop #22600 average 100 processing time: 143 [ms]
Heap Size: 39807
Loop #22700 average 100 processing time: 148 [ms]
Heap Size: 39861
Loop #22800 average 100 processing time: 167 [ms]
Heap Size: 40011
Loop #22900 average 100 processing time: 146 [ms]
Heap Size: 40069
Loop #23000 average 100 processing time: 155 [ms]
Heap Size: 40190

Does anyone have an idea what I am doing wrong here or has encountered similar problems? I can also provide src code if needed.

Thank you very much in advance for your feedback.

How many simple applications are you starting?

Note: I think your threading model sounds overly heavy. It’s best to let the render thread run without any locks.

Just that empty one. Please correct me if I am wrong, but as far as I know no lock should be involved if I start a (clean) SimpleApplication in a new thread. If I don’t start a JME instance, the processing time stays around 6 ms.

You said:
"The consumer in this case is a SimpleApplication who has a shared threadsafe queue holding the data to display" - leading me to believe that the simple application is actually accessing the queue. Since there is one thread writing to the queue and another thread reading, there must be a lock somewhere. It might be a highly optimized lock, but there must be some barrier or you are not thread safe.

So without source it is impossible to say. For example: is the simple application doing something silly like asking the queue for its size while the producer keeps adding to it?
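
To illustrate the kind of pattern meant here, a minimal sketch, assuming the shared queue is a `java.util.concurrent.ConcurrentLinkedQueue` (the actual queue type has not been posted). On that class `size()` walks the whole queue, so a consumer that checks it on every pass gets slower as the backlog grows. Draining with `poll()` until it returns `null`, capped at a per-pass budget, avoids that. The class name `QueueDrainSketch` and the `drain` helper are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class QueueDrainSketch {

    /** Removes up to maxItems elements from the queue without ever calling queue.size(). */
    public static <T> List<T> drain(Queue<T> queue, int maxItems) {
        List<T> batch = new ArrayList<>();
        T item;
        // poll() returns null when the queue is currently empty
        while (batch.size() < maxItems && (item = queue.poll()) != null) {
            batch.add(item);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();

        // Hypothetical producer standing in for the calculation threads
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                queue.offer(i);
            }
        });
        producer.start();
        producer.join();

        // Consumer side: take at most 64 items per pass, as an update loop might
        int total = 0;
        List<Integer> batch;
        while (!(batch = drain(queue, 64)).isEmpty()) {
            total += batch.size();
        }
        System.out.println("Consumed " + total + " items");
    }
}
```

In a real consumer the `drain` call would sit in the per-frame update, and the budget of 64 is an arbitrary value to tune.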

I am sorry for not being clearer. Normally my modified SimpleApplication polls elements from the queue inside the simpleUpdate() method, but for debugging and profiling purposes I’ve just added an unmodified SimpleApplication to my main application to see what happens to the performance of the main loop.

I’ve implemented a test class to verify my situation. It looks like the calculation in my main loop is somehow responsible for the performance problem, since the test shows no problems at all:

    import com.jme3.app.SimpleApplication;
    import java.util.concurrent.TimeUnit;
    import java.util.logging.Level;
    import java.util.logging.Logger;

    public class JMonkey3DPerformanceIssueTester {

        /**
         * @param args the command line arguments
         */
        public static void main(String[] args) {

            // Stand-in for the real main loop: sleeps 10 ms per iteration and
            // prints the average iteration time every 100 iterations
            Thread mainThread = new Thread(new Runnable() {

                @Override
                public void run() {

                    int loopCounter = 0;
                    // Sum of the iteration times (in ns) since the last report
                    long averageProcessingTime = 0;

                    while (true) {

                        long nanoTimeStart = System.nanoTime();

                        try {
                            Thread.sleep(10);
                        } catch (InterruptedException ex) {
                            Logger.getLogger(JMonkey3DPerformanceIssueTester.class.getName()).log(Level.SEVERE, null, ex);
                        }

                        loopCounter++;
                        long nanoTimeStop = System.nanoTime();
                        averageProcessingTime += nanoTimeStop - nanoTimeStart;

                        if (loopCounter % 100 == 0) {
                            long processingTime = TimeUnit.MILLISECONDS.convert(averageProcessingTime / 100, TimeUnit.NANOSECONDS);
                            System.out.println("Loop #" + loopCounter + " average 100 processing time: " + processingTime + " [ms]");
                            averageProcessingTime = 0;
                        }

                    }
                }
            });

            // Create JME3 instance
            Thread jmengine = new Thread(new Runnable() {

                @Override
                public void run() {
                    SimpleApplication sa = new SimpleApplication() {

                        @Override
                        public void simpleInitApp() {
                            // Intentionally empty: nothing is attached to the scene
                        }
                    };

                    // Has to be set before start(), which is where the settings dialog is shown
                    sa.setShowSettings(false);
                    sa.start();
                }
            });

            // Start both threads
            mainThread.start();
            jmengine.start();
        }
    }

Log file:

Loop #100 average 100 processing time: 9 [ms]
Loop #200 average 100 processing time: 9 [ms]
Loop #300 average 100 processing time: 9 [ms]
Loop #400 average 100 processing time: 9 [ms]
Loop #500 average 100 processing time: 9 [ms]
Loop #600 average 100 processing time: 9 [ms]
Loop #700 average 100 processing time: 9 [ms]


Loop #18100 average 100 processing time: 9 [ms]
Loop #18200 average 100 processing time: 9 [ms]
Loop #18300 average 100 processing time: 9 [ms]
Loop #18400 average 100 processing time: 9 [ms]
Loop #18500 average 100 processing time: 9 [ms]
Loop #18600 average 100 processing time: 9 [ms]
Loop #18700 average 100 processing time: 9 [ms]
Loop #18800 average 100 processing time: 9 [ms]
Loop #18900 average 100 processing time: 9 [ms]
Loop #19000 average 100 processing time: 9 [ms]
Loop #19100 average 100 processing time: 9 [ms]

So the problem is in the code we can’t see? Or now there is no problem?

The problem seems to be in the code you can’t see so far. It’s a little bit complicated to post the code here, since there are a lot of classes involved. It looks like a multithreading issue, as you mentioned earlier; I am just not yet sure how to dig into this.
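
One possible starting point for digging into a suspected threading problem is to ask the JVM which threads are blocked and how much CPU time each has used. The following is a minimal sketch using the standard `java.lang.management` API. The class name `ThreadContentionSketch` and the `report` method are made up for illustration and are not part of the application discussed here. Note that the blocked count only covers `synchronized` monitors; threads parked on `java.util.concurrent` locks show up as WAITING instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadContentionSketch {

    /** Builds a one-line-per-thread report of state, blocked count and CPU time. */
    public static String report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();

        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            // CPU time is optional in the JVM spec; -1 marks "not available"
            long cpuNanos = threads.isThreadCpuTimeSupported()
                    ? threads.getThreadCpuTime(info.getThreadId())
                    : -1;
            long cpuMs = cpuNanos < 0 ? -1 : cpuNanos / 1_000_000;

            out.append(String.format("%-30s %-13s blocked=%d cpu=%d ms%n",
                    info.getThreadName(),
                    info.getThreadState(),
                    info.getBlockedCount(),
                    cpuMs));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Calling `report()` every few hundred iterations of the main loop and comparing the output over time would show whether the calculation threads are spending more time blocked, or whether some other thread is consuming the CPU.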

Just a little update from my side. I finally found the cause of the poor performance. After updating the Nvidia Quadro 2000 driver to the newest version, everything is running as it should. No more performance issues here :wink:.