Quake 3 level viewer [Java3D vs Xith3D vs JME benchmark]

I’ve ported the Xith3D q3 level viewer to Java3D and JME. The JME source can be found here:

http://home.halden.net/tombr/jmeq3.jar



To run the code you need JME, plus the level data and the vecmath library found in David’s QuakeIII Demo. Either put the level data folder in the current directory or hardcode the full path in the JMEQuakeRenderer class.



You can read more about the port on javagaming.org:

Porting Quake III to Java3D

Awesome!  I can't wait to take a look at it.



You say something about "benchmark" in the subject…which one performs better? :-p



darkfrog

Java3D is by far the fastest, JME is the slowest.

That's not good.  Perhaps we should be looking into why that is.  I had heard before that Java3D is extremely fast, is there something they do inherently different that makes their engine faster than the rest?  I would say speed in a game engine is at the top of the list of requirements.  I'm confident jME can get there, and it already has so many other features that if we could get the performance to match or exceed the rest there would be no need to consider any alternatives.



darkfrog

I mean, if you switch on the boundingbox-drawing, it's almost as detailed as the geometry itself  XD

interesting reading about java3d's performance features:

http://java.sun.com/products/java-media/3D/collateral/1.2.1.perfguide.html



some nice things:

  • Scene graph flattening: TransformGroup nodes that are neither readable nor writable are collapsed into a single transform node.
  • Combining Shape3D nodes: Non-writable Shape3D nodes that have the same appearance attributes and are under the same TransformGroup (after flattening) are combined, internally, into a single Shape3D node that can be rendered with less overhead.



    I really think we should pick the best optimization-parts from java3d and incorporate them into jME
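To make the two optimizations from the guide concrete, here is a minimal sketch in plain Java. It is not Java3D's or jME's actual implementation: the FlattenSketch, Group and Shape names are made up for illustration, and transforms are assumed to be translation-only to keep it short. It walks a static tree once, bakes the accumulated transform into the vertices, and concatenates shapes that share an appearance key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FlattenSketch {
    /** Hypothetical leaf shape: an appearance key plus x,y,z triples in local space. */
    static final class Shape {
        final String appearance;
        final float[] vertices;
        Shape(String appearance, float... vertices) {
            this.appearance = appearance;
            this.vertices = vertices;
        }
    }

    /** Hypothetical static group node; translation-only (an assumption of this sketch). */
    static final class Group {
        final float tx, ty, tz;
        final List<Group> children = new ArrayList<>();
        final List<Shape> shapes = new ArrayList<>();
        Group(float tx, float ty, float tz) { this.tx = tx; this.ty = ty; this.tz = tz; }
    }

    /** Returns one world-space vertex array per appearance key. */
    static Map<String, float[]> flatten(Group root) {
        Map<String, List<Float>> merged = new LinkedHashMap<>();
        collect(root, 0f, 0f, 0f, merged);
        Map<String, float[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Float>> e : merged.entrySet()) {
            float[] out = new float[e.getValue().size()];
            for (int i = 0; i < out.length; i++) out[i] = e.getValue().get(i);
            result.put(e.getKey(), out);
        }
        return result;
    }

    // Accumulates the parent translation, then bakes it into every vertex of this node's shapes.
    private static void collect(Group g, float px, float py, float pz,
                                Map<String, List<Float>> merged) {
        float wx = px + g.tx, wy = py + g.ty, wz = pz + g.tz;
        for (Shape s : g.shapes) {
            List<Float> target = merged.computeIfAbsent(s.appearance, k -> new ArrayList<>());
            for (int i = 0; i + 2 < s.vertices.length; i += 3) {
                target.add(s.vertices[i] + wx);
                target.add(s.vertices[i + 1] + wy);
                target.add(s.vertices[i + 2] + wz);
            }
        }
        for (Group child : g.children) collect(child, wx, wy, wz, merged);
    }

    public static void main(String[] args) {
        Group root = new Group(1f, 0f, 0f);
        Group child = new Group(0f, 2f, 0f);
        root.children.add(child);
        root.shapes.add(new Shape("brick", 1f, 1f, 1f));
        child.shapes.add(new Shape("brick", 0f, 0f, 0f));
        System.out.println("merged shapes: " + flatten(root).size());
    }
}
```

This only works for nodes that never change after setup, which is why the guide ties both optimizations to nodes that are neither readable nor writable.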



http://www.jmonkeyengine.com/wiki/doku.php?id=geometrybatch

I haven't stress tested GeometryBatch so I don't know how usable it is or how fast. If anyone wants to try.... :)

That performance guide is for 1.2.1, which is about 3 years old if I recall correctly?

It was a discussion about core optimization techniques, so I think it's relevant anyway… and I doubt they have changed those…

I've converted it to jME math, which obviously doesn't affect running speed but makes it easier for a jME user to run (no need to find Java3D's math lib and add it to the class path).



You can pull it down from:

http://www.renanse.com/jme/jme_q3_src.zip



more to come later as we do tweaks.  I already get 60-140 FPS in the level before doing any work on jme to allow it to better handle scene structures of this type…  something due this release which will help everyone's games I believe.

any idea on how to achieve the same thing with renderstates? a way to lock them as well to allow for easy sorting…

The plan is that locking a spatial will lock everything, including the renderstates. (If it's a Node, all children will be locked as well.) When compile is called on the locked spatial, it will optimize the tree and the renderstates. I'm going to be cryptic with the use of the word optimize, because it's unknown at this point what all the optimizations are going to be.

well, we already do not reapply states if they are already applied (to a certain degree, although this could be improved.)  Your second statement though is in line with what we are planning, which is to sort the opaque queue by renderstate properties.  Work on that will commence soon (as soon as I can leave behind the texture/clip and other stuff :wink: )
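As a rough illustration of why sorting the opaque queue by renderstate helps, here is a small sketch in plain Java. It is not jME's RenderQueue code: the Drawable class, its textureId and blendMode fields, and the two helper methods are hypothetical. The counter only "applies" a state when it differs from the one currently bound, so a sorted queue produces fewer changes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class StateSortSketch {
    /** Hypothetical queue entry; state ids are assumed to be non-negative. */
    static final class Drawable {
        final String name;
        final int textureId;
        final int blendMode;
        Drawable(String name, int textureId, int blendMode) {
            this.name = name;
            this.textureId = textureId;
            this.blendMode = blendMode;
        }
    }

    /** Groups entries with equal state next to each other (texture first, then blend mode). */
    static void sortByState(List<Drawable> queue) {
        queue.sort(Comparator.comparingInt((Drawable d) -> d.textureId)
                .thenComparingInt(d -> d.blendMode));
    }

    /** Counts state applications, skipping any state that is already bound. */
    static int countStateChanges(List<Drawable> queue) {
        int changes = 0;
        int boundTexture = -1; // -1 means "nothing bound yet"
        int boundBlend = -1;
        for (Drawable d : queue) {
            if (d.textureId != boundTexture) { boundTexture = d.textureId; changes++; }
            if (d.blendMode != boundBlend) { boundBlend = d.blendMode; changes++; }
        }
        return changes;
    }

    public static void main(String[] args) {
        List<Drawable> queue = new ArrayList<>();
        queue.add(new Drawable("wallA", 1, 0));
        queue.add(new Drawable("floorA", 2, 0));
        queue.add(new Drawable("wallB", 1, 0));
        queue.add(new Drawable("floorB", 2, 0));
        System.out.println("unsorted changes: " + countStateChanges(queue));
        sortByState(queue);
        System.out.println("sorted changes: " + countStateChanges(queue));
    }
}
```

Texture is used as the primary sort key here on the assumption that texture switches are the most expensive change; a real queue would weigh its states by measured cost.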

hehehe your point is taken renanse…but at least I provided some clipping info of my own instead of just bogging you s  :smiley:

After playing with the Quake loader a bit here are some of my preliminary findings:


  1. Turning off updateGeometricState raises the FPS at the initial camera position from ~90 to ~215.

      - This is related to the updating/merging/transforming of the bounding volumes for all the nodes in the BSP tree.
  2. Zooming out to the full level with updateGeometricState still turned off drops the FPS to 17 (the same as with updateGeometricState on).

      - We are sending 3706 TriMeshes to be rendered… individually. We need to batch these TriMeshes better so that as few calls as possible are required.
  3. The PVS set seems very poor. Turning off PVS updating and zooming out shows that a PVS sector has ~2400 individual meshes when only about 100-200 were on screen. The PVS should be doing most of the work for us, but it doesn't seem to be doing that job, leaving the majority of the culling work to the bounding volumes.



    Therefore, the priorities are reducing the required updates to the boundings (the lock mechanism discussed before) and reducing the GL calls by batching buffers in some way. For instance, the render queue would group objects based on RenderStates, then compile the buffers of these groups into a single buffer for sending to the graphics card.



    EDIT: After hacking a bit with it and forcing some batching in for the geometry I got the full view from ~15 FPS to ~60 FPS.
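    For anyone wondering what "compiling into a single buffer" could look like, here is a minimal sketch in plain Java. It is not the hack mentioned in the EDIT and does not use jME's TriMesh class: the Mesh type and merge method are hypothetical, and the meshes are assumed to share the same renderstates and to be already in world space. The key step is rebasing each mesh's indices by the number of vertices copied before it.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSketch {
    /** Hypothetical indexed mesh: x,y,z triples plus triangle indices. */
    static final class Mesh {
        final float[] vertices;
        final int[] indices;
        Mesh(float[] vertices, int[] indices) {
            this.vertices = vertices;
            this.indices = indices;
        }
    }

    /** Concatenates all meshes into one, so a single draw call can cover them. */
    static Mesh merge(List<Mesh> meshes) {
        int vertexFloats = 0;
        int indexCount = 0;
        for (Mesh m : meshes) {
            vertexFloats += m.vertices.length;
            indexCount += m.indices.length;
        }
        float[] vertices = new float[vertexFloats];
        int[] indices = new int[indexCount];
        int vOffset = 0;
        int iOffset = 0;
        for (Mesh m : meshes) {
            System.arraycopy(m.vertices, 0, vertices, vOffset, m.vertices.length);
            int baseVertex = vOffset / 3; // vertices already copied before this mesh
            for (int k = 0; k < m.indices.length; k++) {
                indices[iOffset + k] = m.indices[k] + baseVertex;
            }
            vOffset += m.vertices.length;
            iOffset += m.indices.length;
        }
        return new Mesh(vertices, indices);
    }

    public static void main(String[] args) {
        List<Mesh> meshes = new ArrayList<>();
        meshes.add(new Mesh(new float[] {0, 0, 0, 1, 0, 0, 0, 1, 0}, new int[] {0, 1, 2}));
        meshes.add(new Mesh(new float[] {2, 0, 0, 3, 0, 0, 2, 1, 0}, new int[] {0, 1, 2}));
        Mesh batch = merge(meshes);
        System.out.println("batched indices: " + batch.indices.length);
    }
}
```

    The trade-off is that a merged batch is culled as a whole, so batching per PVS sector or per renderstate group is probably a better unit than the entire level.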

Just a thought I had in the middle of the night: you guys could use OpenGL's display lists to cache the static operations.

Yes Patrick, that's what Java3D does, I believe. It's certainly one of the options for static geometry; however, the overhead of merging bounding volumes and such is easier to eliminate. This will also give us a "locking system" that can eventually be used for display lists.

I've finished a preliminary writeup of locking that allows you to lock various aspects of a spatial one at a time.  So far you can lock the Bounds, Transforms and/or the MeshData.  Locking the MeshData has no real effect at the moment but will allow for special compiled buffer options later (think merging like data, automatic static VBO generation across multiple spatials in the same area, etc.)



Locking the bounds prevents regeneration of the world bounds…  locking the transforms prevents regeneration of the world translation/rotation/scale.  When you lock, those items are first generated for you, so you'd generally want to lock once your scene is set up and ready to go.
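The lock behaviour described above can be sketched roughly as follows. This is plain Java, not the code being checked in: the LockSketch class, the flag names and the one-dimensional "transform" are all assumptions made to keep the example small. Locking first generates the world value once and then skips regeneration until the lock is released.

```java
final class LockSketch {
    // Hypothetical lock flags; a bounds lock would follow the same pattern as the transform lock.
    static final int LOCK_BOUNDS = 1;
    static final int LOCK_TRANSFORMS = 2;

    /** Hypothetical spatial with a 1D translation standing in for translation/rotation/scale. */
    static final class Spatial {
        final Spatial parent;
        float localX;
        float worldX;
        int locks;
        int transformUpdates; // how many times the world value was regenerated

        Spatial(Spatial parent, float localX) {
            this.parent = parent;
            this.localX = localX;
        }

        /** Regenerates the world value unless the transform lock is set. */
        void updateWorldTransform() {
            if ((locks & LOCK_TRANSFORMS) != 0) {
                return; // locked: keep the cached world value
            }
            worldX = (parent == null ? 0f : parent.worldX) + localX;
            transformUpdates++;
        }

        /** Generates the locked items once, then freezes them. */
        void lock(int flags) {
            if ((flags & LOCK_TRANSFORMS) != 0) {
                updateWorldTransform();
            }
            locks |= flags;
        }

        void unlock(int flags) {
            locks &= ~flags;
        }
    }

    public static void main(String[] args) {
        Spatial root = new Spatial(null, 1f);
        Spatial child = new Spatial(root, 2f);
        root.updateWorldTransform();
        child.lock(LOCK_TRANSFORMS);
        for (int frame = 0; frame < 100; frame++) {
            child.updateWorldTransform(); // no work is done while locked
        }
        System.out.println("regenerations: " + child.transformUpdates);
    }
}
```

Note that a locked child will not follow later changes to its parent, which is why locking only makes sense once the scene is set up.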



These two locks took my FPS from about 120FPS at the initial position to 240FPS



I've also cleaned up the ordering mechanism in the RenderQueue OpaqueBucket so that expensive calls such as texture state changes are minimized.  This change brought the FPS up to 265FPS.  More interesting, when viewed from above, these changes brought my FPS from 17 to 24.



I also cleaned up a few object creation problems that caused hiccups when changing portal occlusion sectors…



I'll check these changes in, plus update the jme-quake src code soon… I'd also like to set up a webstart version so others can more easily try it out.  Unfortunately I will be tied up at a family reunion for most of the rest of the week so if I am unresponsive for a bit, that's why.



Cheers.

great work!

renanse,



Our deepest sympathy will be with you. :-p



darkfrog