VoxelEngine Questions

Samwise · June 30, 2018, 11:36am

Good Morning jMonkeys,

[edit] I’m sorry, I see this post is too long, so here are the questions only:

I see a continuously increasing amount of direct memory usage. Since I create new geometries for all chunks each time they are loaded, I guess it’s related to that, so how do I delete the old meshes’ buffers?
How do I use the LOD level stuff in jME? I know the buffer types from the mesh page in the wiki and the setLodLevels(VertexBuffer[] lodLevels) method of the mesh, but I don’t get how to set the different buffers when the section listing all buffers only mentions one buffer each for vertices, indices etc.
Is it worth switching from a 3d-array to a 1d-array, memory-overhead-wise?[/edit]

I’ve been working on some VoxelEngine stuff for a while now. I basically started with a simple cube, implemented chunks as groups of blocks, and added some sort of culling-meshing to only display the faces that are visible. Since I didn’t really get how LODs work in jME, I came up with my own way of doing LODs: I create completely independent geometries for each LOD per chunk, while the node that holds the chunk’s geometries keeps attaching and detaching the required LOD geometries depending on the distance to the camera. So far so good (not actually good but meh).

Now there are a bunch of questions coming to my mind about further improvements. I know there are always discussions about whether to optimize at this point or only once you actually get low FPS. But I’m programming on a somewhat mediocre notebook, I actually want to play my own game with rather solid FPS, and I just started with some basic AI and other stuff that’s not going to speed up the game but make it more heavyweight instead, so I need some headroom in terms of processing power.

So back to what I currently have:

chunkwise: although I might still play with it, I currently have a chunk size of 32x256x32. The data is stored in a 3d-array that’s put into a buffer once the chunk is unloaded, and as soon as another chunk needs to be loaded it does not create a new 32x256x32 array but takes one from the buffer, zeroes it out and loads in the data needed for that chunk.
meshingwise: I got greedy meshing and marching cubes working (except for the texture coords), since I’m not yet 100% sure which one to use in my game. Whenever a chunk’s data has been loaded, it’s given to the mesher, which creates new geometries for that chunk (currently one geometry per blocktype per chunk, and also per LOD, but more about LOD problems later) and replaces the old geometries of that chunk with the freshly created ones. The List(Vector3f), List(Vector2f) and List(Integer) needed during meshing for the vertices, texture coords and indices respectively are array lists that, just like the chunks’ arrays, are put in a buffer; the same is true even for the Vector3fs and Vector2fs (all that buffering together resulted in a really decent meshing boost).
an additional ChunkHolderAppState (the name might be misleading: it does not actually hold any chunks, it just checks which chunks should be loaded/unloaded depending on a given array of character positions (multiplayer), ensures that chunks that would be loaded by more than one character are only loaded once, makes sure chunks that are closer to characters are loaded first (it doesn’t load them itself but tells the WorldData instance to do so in the given order) and such)
a bunch more stuff that’s not important for now
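
The array reuse described under "chunkwise" could look roughly like the following. This is only a sketch: the class name `ChunkArrayPool`, the `short` block type and the flat array layout are assumptions made for illustration, not taken from the project.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical pool of chunk data arrays; all names are illustrative. */
final class ChunkArrayPool {
    private final ArrayDeque<short[]> free = new ArrayDeque<>();
    private final int length;

    ChunkArrayPool(int sizeX, int sizeY, int sizeZ) {
        this.length = sizeX * sizeY * sizeZ;
    }

    /** Returns a zeroed array, reusing a released one when available. */
    short[] acquire() {
        short[] data = free.poll();
        if (data == null) {
            return new short[length]; // fresh Java arrays are already zeroed
        }
        Arrays.fill(data, (short) 0); // wipe the previous chunk's blocks
        return data;
    }

    /** Hands an array back once its chunk has been unloaded. */
    void release(short[] data) {
        if (data.length == length) {
            free.push(data);
        }
    }

    /** Number of arrays currently waiting for reuse. */
    int pooled() {
        return free.size();
    }
}
```

A chunk loader would call `acquire()` before filling in block data and `release()` when the chunk is unloaded, so the number of live arrays stays close to the number of loaded chunks.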

Now onto my first question:
When I switched from my culling-meshing to the greedy meshing, as I mentioned, I continued to produce one geometry per blocktype per chunk instead of meshing it all together and using a texture atlas. As far as I know, that would mean I have to write my own shader to decide which pixel gets what color, since the built-in ones only support either repeating textures or “cutting” a texture out as done with a texture atlas, but not a combination of both. I might be wrong here, but I just can’t imagine how to pass that information or how the shader would make use of it, since I guess more information is needed than 2 floats of texture coords per vertex when I want repeating textures from a texture atlas.

Then I was hinted towards the TextureAtlas class, which can batch stuff together, create a texture atlas from the textures used, and probably update the texture coords of the affected vertices. I thought that might be just what I need, but I realized that using TextureAtlas.makeBatch on my chunks’ nodes results in as many materials as there are chunks loaded (yeah, it was actually obvious), meaning that for voxel-related stuff it’s no improvement at all when used for the chunks.

Then I found out about texture arrays, and I also considered making use of mipmapping. Since I guess after Minecraft basically everyone wanted to make his/her own voxel game, people have experience regarding which techniques work well or badly, and I don’t want to reinvent the wheel. So, performance-wise: is it better to go for a texture atlas of mipmapped textures or for a texture array of mipmapped textures? And can I still mesh all the different block types together when using a texture array and get data from that array of textures in the shader? (I guess I should be able to, but then I don’t really see the difference between a texture array and a texture atlas, since the texture atlas data should internally be stored in an array too?)
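
To make the "more than 2 floats per vertex" point concrete, here is the arithmetic a custom shader would have to do per fragment to repeat a texture that lives inside an atlas, written as plain Java. It is a sketch only: `AtlasTiling`, `atlasUv` and the parameter names are made up for this example, and the tile origin and size would have to reach the shader as extra vertex attributes or a lookup. With a texture array, by contrast, the layer index acts as a third texture coordinate and the normal repeat wrap mode applies within each layer.

```java
/** Hypothetical helper showing the per-fragment math for tiling inside an atlas. */
final class AtlasTiling {
    /** Fractional part in [0, 1), also for negative input (like GLSL fract). */
    static float fract(float v) {
        return v - (float) Math.floor(v);
    }

    /**
     * Maps a repeating coordinate (for example 0..n across a greedy-meshed quad)
     * into the sub-rectangle of the atlas that holds one block texture.
     * tileU/tileV is the tile's origin and tileSizeU/tileSizeV its extent,
     * both in atlas UV space (0..1).
     */
    static float[] atlasUv(float repeatU, float repeatV,
                           float tileU, float tileV,
                           float tileSizeU, float tileSizeV) {
        return new float[] {
            tileU + fract(repeatU) * tileSizeU,
            tileV + fract(repeatV) * tileSizeV
        };
    }
}
```

Because the wrap happens per fragment, it cannot be baked into the two texture coordinates stored at the quad's corners, which is why an atlas plus greedy meshing usually needs a custom shader.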

And the second question:
As I mentioned, I currently create independent geometries per LOD (and currently still per blocktype). I thought that instead of creating independent geometries, I could just add the vertices for the different LODs to the list of vertices of the fully detailed geometry and then just change the list of indices depending on the distance to the camera, so that it uses the vertices of the corresponding LOD.
https://wiki.jmonkeyengine.org/jme3/advanced/mesh.html told me about the different types of buffers and also mentions setLodLevels(VertexBuffer[] lodLevels), but in the list of buffers I don’t see several buffers for several LODs for the vertices or indices, and since my LOD generator might need vertices at positions where there are no vertices in the fully detailed version, I’m wondering whether I can actually use the built-in LOD thing or not.
And somewhat regarding the same topic: I currently only create new geometries when chunks are loaded, and a once-loaded chunk’s geometry stays visible even if the chunk’s data is unloaded (it switches LODs but only updates once you get into chunk-load-distance again; you can imagine it a little like Age of Empires: some areas are invisible, some are visible but frozen, and some are visible, showing enemies and getting continuously updated). So when moving on I’m facing an ever-increasing amount of direct memory usage, and I was wondering how I can modify the buffers of the already generated meshes instead of creating new meshes. (And shouldn’t the buffers of old geometries’ meshes that are no longer attached to the scenegraph, since they were replaced by a later loaded version of that chunk, actually be deleted? If not, how do I delete them manually?)

And one last thing I’m wondering, though it’s not actually jMonkeyEngine related:
Currently I’m using a 3d-array in my chunks for the data instead of a 1d-array. Would that result in 8704 bytes more overhead per chunk, meaning that for a chunk-load-distance of 16 chunks (= 805 chunks loaded) it would only be a difference of 6.68 MB for all loaded chunks, while also being more efficient in terms of the time it takes to access the data, or am I wrong on that?
In case I’m breaking a rule by asking that non-jMonkeyEngine-related 3rd question I can sure just remove it, but since it belongs to the topic of improvement here I thought I might quickly throw it in to make sure I got that right.

I know it’s quite a long post, so if you made it all the way down to here, thanks a lot for that already, and thanks a lot in advance for every hint, idea or thought related to the topic that you might leave here.

Greetings from the shire, Samwise

[edit] changed the generics stuff in the lists I mention for the meshing to be put into parentheses instead of tags because that tag stuff screwed it all up [/edit]
[edit2] put the questions summed up at the top so the rest is only needed in case there are questions about my questions lol[/edit2]

pspeed · June 30, 2018, 3:30pm

I think it’s more than that, isn’t it? A 32x256x32 array is 32 arrays of 256 arrays. Each array has overhead… let’s call that overhead ‘n’ (which is bound to be at least 4 bytes, eh?)

So, a 256x32 2D array is 256 * n just in array overhead. 32x256x32 is 32 * 256 * n… or 8192 * n… or at least 32 k but probably much more.

A 1d array also avoids two index-out-of-bounds checks per access… and in my experience, if the index lookup math is in a private method then it’s inlined and very fast.
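
A minimal sketch of that idea, assuming the 32x256x32 chunk size from the first post; the class name `FlatChunk`, the `short` block type and the x-major index order are illustrative choices, not something from the thread:

```java
/** Hypothetical chunk storage backed by a single flat array. */
final class FlatChunk {
    static final int SIZE_X = 32;
    static final int SIZE_Y = 256;
    static final int SIZE_Z = 32;

    // One array object instead of 1 + 32 + 32 * 256 nested ones.
    private final short[] blocks = new short[SIZE_X * SIZE_Y * SIZE_Z];

    // Small private method: a typical candidate for JIT inlining.
    // Same cell order as blocks[x][y][z] in a nested array.
    private static int index(int x, int y, int z) {
        return (x * SIZE_Y + y) * SIZE_Z + z;
    }

    short get(int x, int y, int z) {
        return blocks[index(x, y, z)];
    }

    void set(int x, int y, int z, short type) {
        blocks[index(x, y, z)] = type;
    }
}
```

Note that a flat array only bounds-checks the combined index, so an out-of-range `x`, `y` or `z` can silently land in a neighbouring cell; callers have to keep coordinates inside the chunk themselves.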

As to your other questions, I will let others answer in detail… but the built in LOD may not be what you want. You share one vertex buffer and multiple index buffers, basically.

You talk about “marching cubes” which is for IsoSurfaces but you say you are doing a block world. That part confuses me.

Samwise · June 30, 2018, 5:28pm

Hello pspeed,

[edit] I was again unable to distinguish between what’s important and what’s not, but I can’t think of a way to shorten my answers without losing clarity, so I’m sorry for that[/edit]

First of all thanks a lot for the answer!
You are totally right. Since I switched over to 32x256x32 only a few days ago, I still had the old 16x256x16 chunk size in mind when I tried to calculate it, and I even failed with that, since what I calculated should be the overhead for a 16x16x256 chunk instead of 16x256x16.
Just a few days ago I saw a video about Java memory overhead, and that made me think about how much more efficient it could be to switch over to 1d arrays. If I’m not mistaken there are 4 bytes for the class pointer, 4 bytes for flags and 4 bytes for something I forgot (might be locks or something), plus another 4 bytes if it’s an array to hold the size of the array, meaning 16 bytes overhead per array, or even 32 bytes for the 64bit version.
Since with the 1d array I would also need one array with 16/32 bytes overhead, the difference is 32 + 32x256 arrays, that’s 8224 arrays, resulting in 8224x16 bytes overhead, that’s 131 584 bytes. Multiplied by Pi x 16 x 16 for the amount of chunks loaded, that would mean there is an overhead of a freaking 100.923 MB (or 201.846 MB for 64bit)? That would already make it worth switching over to 1d arrays.
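
The arithmetic above can be reproduced with a few lines of Java. This is a sketch that takes the 16 bytes per array header from the post as an assumption (the real figure depends on the JVM and its settings) and ignores the reference slots the outer arrays also need; `ArrayOverhead` and its method names are made up for the example.

```java
/** Hypothetical helper for the nested-array header overhead estimate. */
final class ArrayOverhead {
    /**
     * Number of array objects behind a nested [x][y][z] array:
     * 1 outer array, x middle arrays and x * y innermost arrays.
     * The innermost length z does not change the count.
     */
    static long arrayObjects(int x, int y) {
        return 1L + x + (long) x * y;
    }

    /** Header bytes spent on top of the single array a flat layout needs. */
    static long extraHeaderBytes(int x, int y, int headerBytes) {
        return (arrayObjects(x, y) - 1) * headerBytes;
    }
}
```

For a 32x256x32 chunk, `extraHeaderBytes(32, 256, 16)` gives the 131 584 bytes from the post, and for the old 16x16x256 layout with 32-byte headers it gives the 8704 bytes from the first post.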

And for the LOD thing: if it does actually share one vertex buffer and I’m not bound to make use of all vertices in each index buffer, then I can just add up the vertices from the different LODs, as well as the texcoords, and make each LOD’s index buffer only use the vertices it needs for its LOD, even if the most detailed version would not make use of all vertices in the vertex buffer, since the LOD generator might produce new vertices in order to ultimately reduce the amount of vertices for the lower LODs. Or do I have to make use of all vertices in the most detailed version’s index buffer? Or the other way round: would it be a waste of memory to hold the unused vertices in the buffer when, for example, only the lowest detailed version is needed but all vertices and texcoords for all the different LODs are still in the buffers? Might that be related to the problem of the continuously increasing amount of direct memory, or is that rather related to how I convert the Vector2f, Vector3f and Integer lists to buffers (short code at the bottom, and I probably need to free the buffer somehow)?
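
The shared-vertex idea can be sketched independently of jME in plain Java: one vertex pool that contains the vertices of all LODs, plus one index list per LOD that references only the vertices it needs. Index lists do not have to reference every vertex in the pool. The class `SharedVertexLod`, the distance thresholds and the method names are assumptions for illustration; in jME, each index list would presumably become one entry of the `VertexBuffer[]` passed to `setLodLevels`.

```java
/** Hypothetical container: one shared vertex pool, one index list per LOD. */
final class SharedVertexLod {
    private final float[] positions;       // xyz triples for the vertices of ALL LODs
    private final int[][] lodIndices;      // lodIndices[0] is the most detailed level
    private final float[] switchDistances; // ascending, one entry fewer than LOD levels

    SharedVertexLod(float[] positions, int[][] lodIndices, float[] switchDistances) {
        if (switchDistances.length != lodIndices.length - 1) {
            throw new IllegalArgumentException("need one switch distance between each pair of LODs");
        }
        int vertexCount = positions.length / 3;
        for (int[] indices : lodIndices) {
            for (int i : indices) {
                if (i < 0 || i >= vertexCount) {
                    throw new IllegalArgumentException("index outside the shared vertex pool: " + i);
                }
            }
        }
        this.positions = positions;
        this.lodIndices = lodIndices;
        this.switchDistances = switchDistances;
    }

    /** Picks the LOD level: 0 up close, higher levels further away. */
    int levelFor(float cameraDistance) {
        int level = 0;
        while (level < switchDistances.length && cameraDistance >= switchDistances[level]) {
            level++;
        }
        return level;
    }

    /** Index list to draw with at this distance; the vertex pool stays the same. */
    int[] indicesFor(float cameraDistance) {
        return lodIndices[levelFor(cameraDistance)];
    }

    int vertexCount() {
        return positions.length / 3;
    }
}
```

The trade-off is the one raised above: every LOD's vertices stay in memory for as long as the mesh exists, even while only the coarsest index list is in use.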

Lastly, regarding the marching cubes: I found that JS implementation from mikolalysenko on GitHub and modified it to take my 32x256x32 block data arrays as input (I did not understand 100% how that worked out, nor did I understand all the lookup tables used to speed it up). It results in a terrain that’s not bound to 90 degree angles like in Minecraft but gives me 45 degree angles (I could probably play around with the interpolation to adjust the vertex positions, but as I said I don’t really get what’s going on there). I’m basically just not sure yet which look I would prefer for the game; it seems like both have advantages and disadvantages.

Thats the code used to make a FloatBuffer from a List of Vector3fs (I unfortunately cannot remember where I got that from but I feel like it was in a post here on the forum)

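    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.FloatBuffer;
    import java.util.List;
    import com.jme3.math.Vector3f;

    public static FloatBuffer createFloatBufferFromVector3fList(List<Vector3f> list) {
        // 3 floats per vector, 4 bytes per float; direct memory in native byte order
        FloatBuffer buf = ByteBuffer.allocateDirect(3 * 4 * list.size()).order(ByteOrder.nativeOrder()).asFloatBuffer();
        for (Vector3f current : list) {
            buf.put(current.x);
            buf.put(current.y);
            buf.put(current.z);
        }
        // reset the position to 0 so readers start at the first float
        buf.flip();
        return buf;
    }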

and it’s the same for Vector2f and Integer-Lists respectively.
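
One possible reason for the growing direct memory is that every call allocates a fresh direct buffer, and that memory is only given back once the buffer object itself has been garbage-collected. A minimal sketch of reusing one buffer instead is shown below; `ReusableFloatBuffer` is a made-up name, the input is a plain `float[]` of xyz triples rather than a Vector3f list, and how the refilled buffer is handed back to the mesh is left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical holder that refills one direct buffer instead of allocating new ones. */
final class ReusableFloatBuffer {
    private FloatBuffer buffer;

    /** Copies the floats into a direct buffer, reallocating only when the old one is too small. */
    FloatBuffer fill(float[] xyz) {
        if (buffer == null || buffer.capacity() < xyz.length) {
            buffer = ByteBuffer.allocateDirect(xyz.length * Float.BYTES)
                    .order(ByteOrder.nativeOrder())
                    .asFloatBuffer();
        }
        buffer.clear(); // position = 0, limit = capacity
        buffer.put(xyz);
        buffer.flip();  // limit = number of floats written, position = 0
        return buffer;
    }
}
```

After `fill`, the buffer's `limit()` reports how many floats are valid, which can be less than its capacity when a smaller mesh reuses a larger buffer.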

Thanks a lot again,
Samwise

[edit2] had to change asterix (it wasnt called like that but quite similar) to ‘x’ because it resulted in italics[/edit2]
[edit3] use that backticks thing although on my last post the code blocks worked just by using the java-tag i used this time[/edit3]

pspeed · June 30, 2018, 9:02pm