Race condition during initialization?

We suspect a race condition in the LWJGL initialization. Specifically, the Pbuffer does not appear to be initialized on all platforms before LwjglOffscreenBuffer accesses it. We’ve been wrong before, but here is our evidence. In our hands, JME3 canvases/panels are created successfully on faster graphics cards, while the SAME code fails on slower graphics cards with GL2.X support. Errors reported by JME3 have included ‘Pixel format not supported’, ‘Failed to create display’ and ‘Pixel format not accelerated’. The smallest reproduction we can find of the first sign of the pending error is to insert the following two lines at the top of initInThread() in LwjglOffscreenBuffer.java:



[java]System.out.println("LwjglOffScreenBuffer.initInThread Thread: " +
        Thread.currentThread().getName());
System.out.println("LwjglOffScreenBuffer: Pbuffer.getCapabilities:" +
        Pbuffer.getCapabilities());[/java]
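As far as we understand the LWJGL 2.x API, the number printed by the second line is a bit mask rather than a plain boolean, so a small decoder can make the logged value easier to read. The sketch below is only an illustration: the class name PbufferCaps is hypothetical, and the constant values are copied in locally on the assumption that they match org.lwjgl.opengl.Pbuffer, so that the example compiles without LWJGL on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns the int from Pbuffer.getCapabilities() into flag names.
// The constants below are ASSUMED to mirror org.lwjgl.opengl.Pbuffer in LWJGL 2.x.
final class PbufferCaps {
    static final int PBUFFER_SUPPORTED = 1 << 0;
    static final int RENDER_TEXTURE_SUPPORTED = 1 << 1;
    static final int RENDER_TEXTURE_RECTANGLE_SUPPORTED = 1 << 2;
    static final int RENDER_DEPTH_TEXTURE_SUPPORTED = 1 << 3;

    // Returns the names of all flags set in caps; an empty list means no pbuffer support.
    static List<String> describe(int caps) {
        List<String> names = new ArrayList<>();
        if ((caps & PBUFFER_SUPPORTED) != 0) names.add("PBUFFER_SUPPORTED");
        if ((caps & RENDER_TEXTURE_SUPPORTED) != 0) names.add("RENDER_TEXTURE_SUPPORTED");
        if ((caps & RENDER_TEXTURE_RECTANGLE_SUPPORTED) != 0) names.add("RENDER_TEXTURE_RECTANGLE_SUPPORTED");
        if ((caps & RENDER_DEPTH_TEXTURE_SUPPORTED) != 0) names.add("RENDER_DEPTH_TEXTURE_SUPPORTED");
        return names;
    }

    public static void main(String[] args) {
        // Inside initInThread() the argument would be Pbuffer.getCapabilities().
        System.out.println("caps=1 -> " + describe(1));
        System.out.println("caps=0 -> " + describe(0));
    }
}
```

Logging the decoded names next to the raw value would show whether a failing machine reports no flags at all or only a subset of them.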



We observe that on faster systems the Pbuffer call returns a value of ‘1’, indicating that off-screen pbuffers are supported by the adapter/driver. Executing the SAME code on another machine (that runs the SDK just fine) returns a value of ‘0’. We have now tested this on 6 systems. The table below recaps 7 tests on these systems that demonstrate our thinking:

Test 1,2,3: all hardware CAN display JME canvases/panels

Test 4,5: JME3 in our app can ONLY run on faster adapters using a canvas or panel

Test 5: The problem occurs with a simplified version of the SDK topcomponent/application

Test 6,7: AWTTestPanel can ONLY run on faster adapters - regardless of whether it is running from Installer.java in the SDK project or our app (eliminates our app/project as culprit).



NOTE: All graphics drivers have been updated from the graphics vendor directly. All platforms successfully run AwtTestPanels and TestSafeCanvas via (modified) TestChooser.



A ‘Y’ indicates that the JME3 canvas/panel was visible and interactive.

PC          Test:  1  2  3  4  5  6  7
Desk1              Y  Y  Y  Y  Y  Y  Y
Desk2              Y  Y  Y  Y  Y  Y  Y
Desk3              Y  Y  Y  N  N  ?  ?
Laptop1            Y  Y  Y  N  N  N  N
Laptop2            Y  Y  Y  N  N  N  N
MacLaptop          Y  Y  Y  N  N  ?  ?

Tests
1. AwtTestPanels / modified TestChooser
2. TestSafeCanvas / TestChooser
3. SDK (using AWT panels)
4. Our app using Swing Canvas context
5. Our app using a reduced SDK SceneViewerTopComponent / SceneApplication.
6. AwtTestPanels from Installer.java of our app (before our code can go wrong)
7. AwtTestPanels from Installer.java of SDK (before your code can do something right)

Desktop 1: Windows 7/64, NVIDIA GeForce 8800
Desktop 2: Windows XP, NVIDIA Quadro 4000
Desktop 3: Linux, NVIDIA GTX 480
Laptop 1: Windows XP, Mobile Intel 4 Series Express Chipset
Laptop 2: Windows XP, Mobile Intel 945 Express Chipset
MacLaptop: OS 10.6, 2006 era adapter that supports GL2.x

Things we've tried without success:
- Instantiate LWJGL Sys in Installer, access PBuffer from topcomponent later
- Pause/sleep within LwjglOffscreenBuffer.initInThread() to buy time for PBuffer
- Wait for PBuffer to return getCapabilities=1 (now seems goofy)
- Separate worker thread to run context.create() (now seems goofy)
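
For anyone who wants to repeat the "wait for getCapabilities=1" experiment from the list above, the polling loop can be sketched roughly as follows. This is a generic illustration, not the code we actually ran: the class name CapabilityPoller and its parameters are hypothetical, and the probe is passed in as an IntSupplier so that the example runs without LWJGL. In the real experiment the probe would be Pbuffer.getCapabilities().

```java
import java.util.function.IntSupplier;

// Hypothetical helper: polls a capability probe until a required bit appears or a timeout expires.
final class CapabilityPoller {
    // Returns the last value read from probe. The caller checks whether requiredMask is set in it.
    static int pollUntil(IntSupplier probe, int requiredMask, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        int value = probe.getAsInt();
        // Comparing the difference to zero keeps the check correct even if nanoTime wraps around.
        while ((value & requiredMask) == 0 && deadline - System.nanoTime() > 0) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
            value = probe.getAsInt();
        }
        return value;
    }

    public static void main(String[] args) {
        // Stand-in probe: reports 0 on the first two calls, then 1.
        int[] calls = {0};
        int result = pollUntil(() -> (++calls[0] >= 3) ? 1 : 0, 1, 1000, 10);
        System.out.println("result=" + result + " after " + calls[0] + " calls");
    }
}
```

On our failing machines this kind of waiting made no difference, which is part of why the approach now seems goofy to us.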

Extra info ===============================
Desktop 1: output excerpt from SDK showing successful instantiation
INFO [com.jme3.system.lwjgl.LwjglOffscreenBuffer]: Using LWJGL 2.8.2
INFO [com.jme3.system.lwjgl.LwjglOffscreenBuffer]: Offscreen buffer created.
INFO [com.jme3.system.lwjgl.LwjglContext]: Adapter: nvd3dumx,nvwgf2umx,nvwgf2umx
INFO [com.jme3.system.lwjgl.LwjglContext]: Driver Version: null
INFO [com.jme3.system.lwjgl.LwjglContext]: Vendor: NVIDIA Corporation
INFO [com.jme3.system.lwjgl.LwjglContext]: OpenGL Version: 3.3.0
INFO [com.jme3.system.lwjgl.LwjglContext]: Renderer: GeForce 8800 GT/PCI/SSE2
INFO [com.jme3.system.lwjgl.LwjglContext]: GLSL Ver: 3.30 NVIDIA via Cg compiler
INFO [com.jme3.system.lwjgl.LwjglTimer]: Timer resolution: 1,000 ticks per second
INFO [com.jme3.renderer.lwjgl.LwjglRenderer]: Caps: [FrameBuffer, FrameBufferMRT, FrameBufferMultisample, TextureMultisample, OpenGL20, OpenGL21, OpenGL30, OpenGL31, OpenGL32, ARBprogram, GLSL100, GLSL110, GLSL120, GLSL130, GLSL140, GLSL150, VertexTextureFetch, TextureArray, TextureBuffer, FloatTexture, FloatColorBuffer, FloatDepthBuffer, PackedFloatTexture, SharedExponentTexture, PackedFloatColorBuffer, TextureCompressionLATC, NonPowerOfTwoTextures, MeshInstancing, VertexBufferArray]

Thanks, this is valuable info. We are definitely having issues with lwjgl and canvases despite our attempts to apply a pretty safe method to initialize our implementations… Maybe you can also post this info in the lwjgl forum.

A reduced version of this posting has been cross-posted to the LWJGL forum. See http://lwjgl.org/forum/index.php/topic,4386.0.html

New Info: All platforms also run LWJGL test cases such as PBufferTest independent of NetBeans and JME3.

I would be happy to take suggestions and code additional test cases to further localize the problem.

We have some stress tests in the test suite that use vanilla AWT and cause the same issues irregularly.

FWIW, we see the same results for AWTPanels as for Swing Canvas tests on the platforms described above. However, we have another test system which passes the TestCanvas and TestSafeCanvas tests but fails TestAWTPanels. This system might be of use in debugging. It is a virtual Windows XP machine running under VMware 8.x on a physical Linux machine with a GTX 480. For these tests GL2 is provided by Mesa 7.11.



INFO LwjglRenderer 3:41:41 PM Caps: [FrameBuffer, FrameBufferMRT, FrameBufferMultisample, OpenGL20, OpenGL21, ARBprogram, GLSL100, GLSL110, GLSL120, NonPowerOfTwoTextures, VertexBufferArray]
INFO LwjglContext 3:41:41 PM Adapter: smsmdd
INFO LwjglContext 3:41:41 PM Driver Version: 4.0.6163.1000
INFO LwjglContext 3:41:41 PM Vendor: VMware, Inc.
INFO LwjglContext 3:41:41 PM OpenGL Version: 2.1 Mesa 7.11-devel
INFO LwjglContext 3:41:41 PM Renderer: Gallium 0.4 on SVGA3D; build: RELEASE;
INFO LwjglContext 3:41:41 PM GLSL Ver: 1.20
INFO LwjglTimer 3:41:41 PM Timer resolution: 1,000 ticks per second

As said, we ran out of ideas long ago on how to work around these issues in other ways. I only mentioned it because you said that the lwjgl pbuffer tests “independent of jME3” work. Maybe they should apply torture methods similar to ours to expose the issues.

I never had success running jME3 in a VM.

LWJGL just doesn’t seem to support it for some reason.

Ouch - we are currently only able to run on about half of our GL 2.x capable adapters and this makes us question if we are pursuing the right course. I will set the expectations low but we can take a look at the lwjgl code and specifically look for possible causes of the symptom we’re finding. Are there any design notes/documents for the com.jme3.system.lwjgl package outside the javadocs? It would be especially helpful to know the approach to threading and what LWJGL code was used as a template.

Ok, let me say this clearly once: The issue is in lwjgl. We were initializing the canvas “normally” as per the lwjgl documentation but very often got freezing issues with it. Then we worked with a lot of pbuffer-based workarounds on and off, and then the lwjgl guys applied some fixes to their wrapper and things got a bit better, but as you point out still not 100% right. It’s best to leave jME3 out of the equation and try to find the actual issue in lwjgl with the described torture methods. The initialization of the canvas basically amounts to the code that is in the respective class.