JOGL Implementation

Hi there,

I'm just a newbie who has read Eberly's book, and I'm pleased to see that you've done a cool Java implementation of his engine, and also that your code is clear to read (no empty or unimplemented confusing methods). I really like it, and would be pleased to take part in its development. I don't have a lot of time due to my job and my family, but if you agree, I'll be glad to help as I can :D.



Actually, I'm trying to make a JOGL implementation (I will explain why later), but due to my lack of knowledge of your design, of LWJGL, and even of JOGL, I have some design difficulties with the implementation.



This may be the beginning of a long list of questions :/



It seems that you explicitly ask the Window to render itself, to clear its back buffer, and so on.

But I'm a bit confused about threading. Which thread are you in when you make such calls?

Using an AWT Frame to host JOGL's Canvas, performance should be better if the "SimpleGame" process is integrated into the render thread itself, i.e. into the AWT event queue. Is that right? Or do I have to make both threads (rendering and AWT) cooperate using some message-passing pattern?

For example, the "takeSnapShot" method should only be handled after rendering is performed, and I guess it shouldn't run while rendering or clearing is in progress.
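For instance, here is the kind of message-passing arrangement I have in mind (just a sketch; the class and method names are mine, not jME's):

import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: tasks posted from other threads (e.g. AWT) are
// queued and only run once the current frame has been fully rendered.
public class RenderTaskQueue {
  private final ConcurrentLinkedQueue<Runnable> tasks =
      new ConcurrentLinkedQueue<Runnable>();

  // Called from any thread, e.g. when the user requests a snapshot.
  public void post(Runnable task) {
    tasks.add(task);
  }

  // Called by the render thread after the buffer swap, so a queued
  // takeSnapShot() never overlaps rendering or buffer clearing.
  public void drain() {
    Runnable task;
    while ((task = tasks.poll()) != null) {
      task.run();
    }
  }
}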

Am I clear enough? Did I miss something :/ ?





Now, let me explain why JOGL. I'd like to use Cg in a cool scene graph environment (it seems I've found the latter, thanks to jME!! XD), and then probably integrate it as a jME RenderState. At the moment it seems that only JOGL handles it :? If not, please tell me.



Thanks,

Note: my English is poor, so please be nice to me!

Art.

My guess is that it would be easier to add the necessary support to jME, which I think is already under development. Before I was using jME, I was using Cg fragment programs in the underlying LWJGL (well, I compiled them to ARB fragment shaders at build time, rather than using the Cg runtime).

Does LWJGL have support for the Cg runtime?

It didn’t in version 0.7, no, but I don’t know what version they’re at now.

0.9 should be released during Easter (about now-ish). We'll see what happens.

It seems that you explicitly ask the Window to render itself, to clear its back buffer, and so on.
But I'm a bit confused about threading. Which thread are you in when you make such calls?
Using an AWT Frame to host JOGL's Canvas, performance should be better if the "SimpleGame" process is integrated into the render thread itself, i.e. into the AWT event queue. Is that right? Or do I have to make both threads (rendering and AWT) cooperate using some message-passing pattern?
For example, the "takeSnapShot" method should only be handled after rendering is performed, and I guess it shouldn't run while rendering or clearing is in progress.
Am I clear enough? Did I miss something?


All rendering/updates/etc. is handled in a single thread. There is no AWT/Swing in LWJGL, so there are no separate threads. All JOGL ports will be very straightforward, except for this one sticking point: I don't believe JOGL currently supports explicit buffer swapping. I know it's been requested many times, and people have complained that it wasn't there, but I don't know if they have honored that request yet. However, if you can port LWJGLDisplaySystem and LWJGLRenderer (buffer handling), the rest will be very easy.
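Roughly, the main loop looks like this (an illustrative sketch, not the exact jME/LWJGL API):

// Illustrative single-threaded main loop; the Display interface here is a
// stand-in, not a real jME or LWJGL type. Everything, including the swap,
// happens on one thread, which is why explicit swap control matters.
public class MainLoopSketch {
  interface Display {
    void clearBuffers();
    void draw();
    void swapBuffers();
  }

  static void run(Display display, int frames) {
    for (int i = 0; i < frames; i++) {
      // input polling and the game-logic update would happen here
      display.clearBuffers(); // clear color and depth
      display.draw();         // traverse the scene graph, issue GL calls
      display.swapBuffers();  // explicit under LWJGL; JOGL swaps
                              // automatically after display() returns
    }
  }
}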

If you have any specific questions, we'll be able to walk you through it, I'm sure. Good luck.

Thanks,

I agree, only DisplaySystem, Renderer and TextureRenderer should be the tricky parts… (at least I hope so)



The swap problem shouldn't be so important if both engines are based on the same process, render and then swap, since in JOGL the swap is done automatically after rendering is performed.



OK, I'll deal with the thread issue, but this brings up a new design question.



Because rendering is done directly (?) by traversing the scene graph from its root, I'm wondering how additional processing can occur.

Let me explain:

  • for performance reasons, Eberly (and others) say that RenderStates should be sorted in order to minimize how often they are switched and changed. Such sorting is already in place because of state inheritance, but… is that enough?
  • transparent objects must be rendered in a specific order due to the limitation of the z-buffer (which isn't an alpha buffer), and even the object itself should be rendered from back to front (but that's another problem, I guess); see the sketch after this list
  • shadow rendering must be done in a specific pass
  • etc.
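Here is a sketch of what I mean for the transparency case (hypothetical names, nothing from jME itself):

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Sketch of a back-to-front queue for transparent objects. Item is a
// minimal stand-in for a renderable; depth would be the (squared)
// distance from the camera, computed during the update/cull pass.
public class TransparentQueue {
  static class Item {
    final String name;
    final float depth;
    Item(String name, float depth) { this.name = name; this.depth = depth; }
  }

  private final List<Item> items = new ArrayList<Item>();

  void add(Item item) { items.add(item); }

  // Draw the farthest items first so nearer transparent surfaces blend
  // over them; the z-buffer alone cannot resolve this ordering.
  void render() {
    Collections.sort(items, new Comparator<Item>() {
      public int compare(Item a, Item b) {
        return Float.compare(b.depth, a.depth);
      }
    });
    for (int i = 0; i < items.size(); i++) {
      System.out.println("draw " + items.get(i).name); // real code: GL calls
    }
    items.clear();
  }
}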



    Does jME handle such processing? If so, where is it done? If not, what about a kind of scene graph buffer which can be sorted, and so on?

    Such a scene graph would then be the rendering scene graph (call it RS) and not the application-specific scene graph (call it AS). The RS would only hold a kind of Spatial proxy of the AS Spatial data.



    So, back to the thread issue: such a feature (the RS) is the way I think I will handle the rendering process.

    The AS will be passed to the renderer to be rendered; the renderer will build its own RS and, during the rendering process, call all the draw-specific methods.



    So, do you agree with such a design?

    I'm just afraid of performance issues, but since (in the case of JOGL) direct rendering is not tightly coupled with the AS and the "SimpleGameEngine", the two threads should cooperate.

The rendering scene graph may not be a scene graph, in fact; it could be a tree of states with mesh data as leaves, but a queue could be better: an ordered state queue, maybe.
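A sketch of what I mean by an ordered state queue (all names are mine; I assume each state can be reduced to a sortable integer key):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch: draw calls are grouped under a sortable state key (e.g. a
// texture id), so each state is applied once per frame instead of once
// per mesh.
public class StateQueue {
  private final Map<Integer, List<Runnable>> buckets =
      new TreeMap<Integer, List<Runnable>>();

  void enqueue(int stateKey, Runnable drawCall) {
    List<Runnable> bucket = buckets.get(stateKey);
    if (bucket == null) {
      bucket = new ArrayList<Runnable>();
      buckets.put(stateKey, bucket);
    }
    bucket.add(drawCall);
  }

  void render() {
    for (Map.Entry<Integer, List<Runnable>> e : buckets.entrySet()) {
      applyState(e.getKey()); // one state switch per bucket
      List<Runnable> bucket = e.getValue();
      for (int i = 0; i < bucket.size(); i++) bucket.get(i).run();
    }
    buckets.clear();
  }

  private void applyState(int stateKey) {
    System.out.println("apply state " + stateKey); // placeholder for GL calls
  }
}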

I agree that we will have issues regarding alpha and shadow sorting. That will be taken care of soon, and it should be taken care of in an OpenGL-binding-independent way, i.e. it should be the same between LWJGL and JOGL. So I'm not sure how those aspects directly affect a JOGL port.



We need to guarantee that the JOGL version works the same as the LWJGL version (that's how the design was done). Therefore, the JOGL version should be a direct "copy" of the LWJGL version, with any warts it has. That way problems are located in a central location and can be resolved once.



However, that said, I do agree with you. Later, we will be implementing a mechanism to separate opaque, alpha and shadow rendering into separate render passes.

I was thinking about this earlier. Independently of the order in which nodes are sorted, a simple optimization is not calling set() when the new node’s render state is the same as the one that’s currently applied.



For example, if I’m rendering two nodes which have different TextureStates but identical LightStates, then the renderer need only switch TextureStates - suppressing calls to oldLightState.unset() and newLightState.set().



If this seems reasonable, I’d be happy to implement equals() (and of course hashCode()) methods for the RenderState objects as a first step.
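Something like this, as a sketch (the fields on this stand-in LightState are assumptions, not jME's actual members):

import java.util.ArrayList;
import java.util.List;

// Sketch only: value equality for a render state, so the renderer can
// skip redundant unset()/set() pairs. The fields are illustrative.
public class LightStateSketch {
  private boolean enabled;
  private final List<Object> lights = new ArrayList<Object>();

  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof LightStateSketch)) return false;
    LightStateSketch other = (LightStateSketch) o;
    return enabled == other.enabled && lights.equals(other.lights);
  }

  public int hashCode() {
    return (enabled ? 31 : 17) * 31 + lights.hashCode();
  }
}

The renderer could then skip oldState.unset() and newState.set() whenever oldState.equals(newState).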

For example, if I'm rendering two nodes which have different TextureStates but identical LightStates, then the renderer need only switch TextureStates - suppressing calls to oldLightState.unset() and newLightState.set().


This is already taken care of: the states are inherited, so your shared light state should be set in the parent of your two nodes.

Ah, so a null render state doesn’t mean unset everything, it means use the parent settings? Nifty.



That’s fascinating - it leads to all sorts of speculation about how to arrange your scene graph. I’d always assumed it would be done spatially, but if swapping texture states is much more expensive than pushing and popping matrix transforms, then you’d stand to gain by arranging your scene graph by texture. Eeenteresting.

Ah, so a null render state doesn't mean unset everything, it means use the parent settings?


Exactly. So, while this is quite fast and keeps state switching to a minimum, it does rely on the user to think about the graph design to maximize its effectiveness. Which is true for any implementation, I suppose.
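Conceptually, resolving the effective state is a walk up the parent chain; a minimal sketch (the Node fields and the notion of a global default are illustrative, not jME's actual API):

// Sketch of "null means inherit": the effective state of a node is the
// nearest non-null state found walking up toward the root.
public class StateInheritSketch {
  static class Node {
    Node parent;
    Object renderState; // null means "use the parent's"
  }

  static Object effectiveState(Node n, Object globalDefault) {
    for (Node cur = n; cur != null; cur = cur.parent) {
      if (cur.renderState != null) return cur.renderState;
    }
    return globalDefault; // nothing set anywhere up the chain
  }
}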

Hi there,

just a note to say that I've almost finished a raw implementation of jME using JOGL and JInput, which successfully passed the TestScenegraph app :smiley: .

That means TriMesh, Line and TextureState are working…



I think I will use this thread as a kind of diary of what is done. Feel free to give me any information that may help. Especially about using JInput and JOAL.



[size=18px]1-Renderer[/size]
Currently, all render states are implemented but not tested, and all rendering code is implemented but not tested.

TextureRenderer is not implemented. I need to investigate how JOGL's pBuffer works.

[size=18px]2-Fullscreen[/size]

Fullscreen mode is done and works, with unpredictable results... For those to whom that may sound familiar: the DisplayMode switch doesn't seem to be supported in every case (maybe also because I'm using a laptop with a single resolution...). There are some JOGL bugs that explain part of this, but not all; I'll continue to investigate later :? .

[size=18px]3-VBOs[/size]

VBOs are implemented but not tested. This brings me to a request: unlike LWJGL, JOGL needs raw buffer data to be provided as a native (i.e. byte) type, not as Float. TriMesh only stores a view of the underlying ByteBuffer, as a FloatBuffer. FloatBuffers are great but cannot be handed to the GPU buffer.
I'm asking to store both the native, direct ByteBuffer and also (as it is now) one of its views, the FloatBuffer. This is true for vertices, normals, colors and texture coordinates. This could be written as:


nativeBuffer = NIOHeap.newByteBuffer(size * NIOHeap.SIZEOF_FLOAT);
floatBuffer = NIOHeap.setupFloatBuffer(nativeBuffer);


where:


import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

ByteBuffer newByteBuffer(int numBytes) {
  // optionally check a cache for an unused buffer of the same size...
  ByteBuffer bb = ByteBuffer.allocateDirect(numBytes);
  bb.order(ByteOrder.nativeOrder()); // native byte order is required for GL
  return bb;
}

FloatBuffer setupFloatBuffer(ByteBuffer bb) {
  bb.order(ByteOrder.nativeOrder());
  return bb.asFloatBuffer(); // a float view backed by the same memory
}



This brings me to a remark: how do you know that the cached data is invalid with respect to the Mesh? More precisely, it looks like (from reading only part of the code) that when using VBOs on dynamic data, the VBO cache isn't invalidated and reloaded...
I would create a new structure on Geometry that stores all data related to a specific component. Let me explain this with a code sample. Instead of the following in Geometry:


    /** The geometry's vertex information. */
    protected Vector3f[] vertex;
    protected transient FloatBuffer vertBuf;
    private boolean useVBOVertex = false;


I would use:


    protected BufferData vertices = new BufferData(BufferData.VERTEX);


where:


public class BufferData {

  public static final CachePolicy CACHE_MAPPED_DYNAMIC =
    new CachePolicy("mapped-dynamic", true, true, true);
  public static final CachePolicy CACHE_DYNAMIC =
    new CachePolicy("dynamic", true, true, false);
  public static final CachePolicy CACHE_STATIC =
    new CachePolicy("static", true, false, false);
  public static final CachePolicy NOCACHE =
    new CachePolicy("nocache", false, false, false);

  public static final int NOTYPE = 0;
  public static final int COLOR = 1;
  public static final int NORMAL = 2;
  public static final int VERTEX = 3;
  public static final int TEXTURE = 4;
  // number of floats per element for each data type;
  // values are indexed using the NOTYPE, COLOR... constants above.
  public static final int[] defaultSize = { 3, 4, 3, 3, 2 };
  //
  public static final int NOT_INITIALIZED = -1;
  public static final boolean DEBUG = true;

  //


  protected int vboID = NOT_INITIALIZED;
  protected boolean useVBO;
  protected int type = NOTYPE;
  protected boolean dirty = true;
 
  protected int elemSize;

  protected int floatCount;
  protected FloatBuffer buffer;
  protected ByteBuffer nativeBuffer;

  protected CachePolicy cachePolicy = CACHE_STATIC;
  protected MapBufferParameters mapBufferParams;
...



[size=9px]
OK, now, why such code? Just to separate the raw data (every single buffer) from its model (the geometry). By doing so, each piece of data can be handled separately on its own; the dirty flag indicates that the data is stale and may need to be updated. Furthermore, a similar structure for indices may allow storing TRIANGLE as well as STRIP or FAN primitives, or even a composite of each. Multi-texturing is handled by a subclass of BufferData called MultiBufferData, which stores an array of textureUnits - 1 BufferData instances (unit 0, the default texture coordinates, is the instance itself). I've done an implementation of such data, if someone is interested.

I've started on a new Geometry class based on it, and thus a new rendering method (note that this rendering method is largely the same as the TriMesh one, except that indices aren't assumed to be organized as TRIANGLES, and several index buffers can be used).
This implementation doesn't store any arrays and is entirely based on NIO buffers, so data isn't duplicated between buffers and arrays. The only bottleneck is the bound.computeFromPoints(verticesCache) method, which only takes arrays...
[/size]
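To show what I mean by the dirty flag, here is a sketch of how a renderer could use it (the GL wrapper below is a placeholder interface, not JOGL's actual signatures):

import java.nio.ByteBuffer;

// Sketch: re-upload a VBO only when its BufferData is marked dirty.
public class VboUploadSketch {
  interface Gl { // placeholder wrapper, not JOGL's API
    int genBuffer();
    void bindBuffer(int id);
    void bufferData(ByteBuffer data, boolean dynamic);
  }

  static final int NOT_INITIALIZED = -1;

  int vboID = NOT_INITIALIZED;
  boolean dirty = true;
  ByteBuffer nativeBuffer;

  void bind(Gl gl, boolean dynamic) {
    if (vboID == NOT_INITIALIZED) {
      vboID = gl.genBuffer();
      dirty = true; // first use always uploads
    }
    gl.bindBuffer(vboID);
    if (dirty) {
      gl.bufferData(nativeBuffer, dynamic); // re-upload only when changed
      dirty = false;
    }
  }
}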


Side note: I've used
[url]http://www.javagaming.org/cgi-bin/JGNetForums/YaBB.cgi?board=share;action=display;num=1071940370[/url]
on a Sphere (s = new Sphere("Sphere", 100, 100, 25)) using the TestSphere-based sample and a *STRIP* version of the LWJGLRenderer, and fps goes from 70 to 140. The only drawback is that sphere stripification takes quite long, almost one minute.

[size=16px]4-Timer[/size]
No high-accuracy Timer is provided by the java.net libraries, so I'm using the LWJGL one.

[size=16px]5-Input[/size]

Because of how JInput works, I need a better approach to identify which controller is the keyboard and which one is the mouse. This may sound funny, but when using my laptop with an external MS keyboard and MS mouse (which has a wheel and several buttons, more than 3!), the controllers are split up, and their types (BUTTON, MOUSE, ...) are not enough to identify them properly...
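For illustration, the kind of identification I'm doing now (assuming JInput's Controller.Type values; the extra heuristics I need would go inside this loop):

import net.java.games.input.Controller;
import net.java.games.input.ControllerEnvironment;

// Sketch: pick the keyboard and mouse out of JInput's controller list.
// With split/multiple devices, Type alone is not a sufficient criterion.
public class PickControllers {
  public static void main(String[] args) {
    Controller keyboard = null, mouse = null;
    Controller[] all =
        ControllerEnvironment.getDefaultEnvironment().getControllers();
    for (int i = 0; i < all.length; i++) {
      if (keyboard == null && all[i].getType() == Controller.Type.KEYBOARD)
        keyboard = all[i];
      if (mouse == null && all[i].getType() == Controller.Type.MOUSE)
        mouse = all[i];
    }
    System.out.println("keyboard=" + keyboard + ", mouse=" + mouse);
  }
}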

Another missing thing is cursor-hiding functionality. I could probably work around this using AWT functionality, since JOGL renders inside an AWT component, but when the cursor leaves the component it becomes visible again; furthermore, this introduces a real dependency between the input facility and the rendering support. It also rules out using JInput this way with LWJGL, for example, so it needs to be done by other means. I'll check this later...

[size=16px]6-Sound[/size]

I haven't started, but I hope it will be easier than JInput, since tutorials are provided.

[size=16px]7-Cg[/size]

Arrhhhh, my grail! All this porting for Cg use in jME. I've implemented CgVertexState and CgFragmentState, which... ...work!!! In fact I haven't tested CgFragmentState, since my graphics card doesn't support fragment programs.
This brings me two questions (one I've already posted without response :'( )

a) Since only one vertex state can be set at a time, should I define CgVertexState as RS_VERTEX_STATE or as a new state type? Either way raises problems. Because of the way RenderStates are handled, default states are used to unset the current state when needed, instead of disabling the current one (actually I would prefer the latter in any case...), but CgVertexState and VertexProgramState don't "unapply" the same way, and each needs its own "unapply". If I define a new RS_STATE, one cannot ensure that the ARB shader is unapplied before the Cg one is applied, and vice versa (ouch, my English is failing even more...), without adding constraints on RenderState, which complicates state handling a lot...
(Same thing with fragment states...)
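To illustrate the alternative I'd prefer (letting the current state disable itself instead of applying a default one), here is a sketch; all the names are mine, not jME's:

// Sketch: track the active state per slot and call ITS unapply(), so an
// ARB program and a Cg program can each clean up in their own way.
public class StateSwitchSketch {
  interface State {
    int getType();   // slot index, e.g. a vertex-program slot
    void apply();    // enables itself (ARB or Cg)
    void unapply();  // knows how to disable itself correctly
  }

  private final State[] current;

  StateSwitchSketch(int slots) { current = new State[slots]; }

  void switchTo(State next) {
    State prev = current[next.getType()];
    if (prev == next) return;         // nothing to do
    if (prev != null) prev.unapply(); // Cg and ARB clean up differently
    next.apply();
    current[next.getType()] = next;
  }
}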

b) Most such shaders need to be updated before any mesh is rendered (because they bypass the fixed-function transform and lighting on the card). Currently, I've defined a Callback that is added to the CgState. The callback is retrieved by the renderer and called after setting the OpenGL transform, just before the mesh is rendered. That works fine, but only in my own customized Renderer.
Internal interface of com.jme.scene.state.CgShader:


public interface Callback {
  public void onCreate();   // invoked once when the shader state is created
  public void onApply();    // invoked when the state is applied
  public void onDrawMesh(); // invoked just before each mesh is drawn
}
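In my customized renderer the hook is used roughly like this (a sketch; the renderer-side method names are my own):

// Sketch of where onDrawMesh() fits in the draw path.
public class CgDrawSketch {
  interface Callback { void onCreate(); void onApply(); void onDrawMesh(); }

  void drawMesh(Object mesh, Callback cb) {
    setModelViewMatrix(mesh);  // the fixed-function transform is set first
    if (cb != null) {
      cb.onDrawMesh();         // the shader re-reads tracked matrices here
    }
    issueDrawCalls(mesh);      // then the mesh is actually rendered
  }

  void setModelViewMatrix(Object mesh) { /* GL transform setup */ }
  void issueDrawCalls(Object mesh) { /* glDrawElements etc. */ }
}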



[size=16px]8-Misc[/size]
And since this is a kind of diary, here is my plan for hypothetical future exploration:
+ explore NIO mapped buffers on files, in order to draw geometry that is too large to be stored in memory.
+ add occlusion culling capability: transform the current rendering pipeline to add occlusion culling and prevent CPU/GPU stalls, probably using a modified RenderQueue facility.
+ add several ways to handle transparent data based on speed requirements: test depth peeling (pbuffer and Cg?) for an accurate but slow process; test two rendering passes, one with GL_BACK as the CullState and the second with GL_FRONT, which is faster but not always accurate (see the sketch below); or only depth sorting, as the RenderQueue does now.
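A sketch of the two-pass idea from the last point (the CullFace values and the Renderer interface are illustrative stand-ins; note that the back faces need to be drawn first, i.e. front faces culled in the first pass):

// Sketch: draw back faces first, then front faces, so interior surfaces
// of a transparent object blend in roughly the right order.
public class TwoPassTransparency {
  enum CullFace { FRONT, BACK }

  interface Renderer {
    void setCullFace(CullFace face);
    void draw(Object node);
  }

  static void drawTransparent(Renderer r, Object node) {
    r.setCullFace(CullFace.FRONT); // pass 1: cull front, draw back faces
    r.draw(node);
    r.setCullFace(CullFace.BACK);  // pass 2: cull back, draw front faces
    r.draw(node);
  }
}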

That's all for now, more on this later ;)

What is the difference between

glGenBuffers
glBindBuffer
glBufferData

and

glGenBuffersARB
glBindBufferARB
glBufferDataARB

Did the first set exist before the VBO ARB extension was approved?

I'm looking forward to seeing JOGL in jME… good luck finishing it.

"arthemnos" wrote:
what is the difference between

glGenBuffers
glBindBuffer
glBufferData
...

and
glGenBuffersARB
glBindBufferARB
glBufferDataARB

was the first one before the VBO ARB extension was approved?

I can't recall; however, the one we use now is better supported by ATI and nVidia.