Cubes – A Block World Framework [Update Preview]

@ulfgur said: I have the following problem popping up when I try to use sleaker's system:

diamond operator is not supported in -source 1.5
(use -source 7 or higher to enable diamond operator)

I tried updating my JDK, but that didn’t fix it. Any ideas?

Look in your IDE configuration, I bet it’s saying 1.5 somewhere.

In Eclipse, it would be under:
Window->Preferences, Java->Compiler, Compiler compliance level.
There are also per-project settings; they are reachable from that page but apply to just the current project, and they are directly accessible via the context menu of the project root:
Properties, Java Compiler. (If you don’t have the “Enable project specific settings” checkbox active, the per-workspace settings from above will be used. Per-project settings become important if you wish to share the Eclipse project with others without sharing the entire workspace.)

While sleaker’s system is indubitably better, it does not play nicely with the standard Cubes framework, and is a pain in the butt to implement. Does anyone know of an easier system, or a shortcut, or a different mesher, or something similar that I might use? I appreciate any help, especially since I’m not sure I can get his mesher to work.

Hello destroyflyer,

I took notice of your Cubes framework, as I am starting a project which requires use of a voxel world. I played around with it a bit, and it certainly works great! However, while I was trying to build my own project on top, there were a couple things which made it hard to extend functionality and get at underlying data. So I dug into your code and ended up doing some major refactoring. The code can be found on my bitbucket here: https://bitbucket.org/seizure/cubes

Please tell me what you think of my changes, and if you plan on using my code - I am happy to contribute everything to your project, and I will probably continue to contribute as my project continues. All the algorithms and the basic structure remain more or less the same, but there are a few key changes:

  • Complete package refactoring: Maybe it’s just my OCD tendencies, but huge single packages stuffed with classes bother me. I separated the project into different packages so that it is easier to grasp how everything fits together. Additionally, as the project grows and more classes are added, it will make everything more manageable.

  • Register Blocks by Object, rather than Class: Perhaps you had a reason for this, but some disadvantages I noticed were that you needed an entirely new, hard-coded class for each block even if they were all identical, or even completely unmodified from their super! It also meant having to jump through hoops to get the actual Block object from a Vector3Int, since it handed you the Class rather than the originally registered Block.

My new version registers blocks by Object (see the sketch after this list). In my opinion, this works better since you can register multiple blocks (dirt, stone, wood, etc.) from many instances of a single class, and simply supply different textures for each in the constructor, which means less class clutter. In addition, when you get a block by Vector3Int from the database, it returns you the original Object, with all its methods and variables. This is very useful, since it lets you easily extend functionality onto Blocks and integrate it into Cubes/your project. For example, a “Minecraftian” leaf block could have checkDecay(Vector3Int) to see if a particular leaf block should “die” after a tree trunk is chopped down.

In doing this, I noticed that having separate Block and BlockType classes wasn’t as necessary, so I joined them into one class… in the future, it might be nice to create a distinction again, where a Block refers to the physical, abstract voxel and simply contains some navigational helper methods (e.g. getNeighbor(BlockFace face)) and a method to retrieve the actual underlying BlockType class which the user extends and registers. But for now, I think it is fine as is.

  • Abstracted rendering functionality from voxel storage: I can think of a couple of use cases where one would want to store voxel data but not create geometry. For one, it would be useful for server instances. Secondly, one could use it to store data that parallels the voxel world but shouldn’t be visible. “Metadata”? :stuck_out_tongue_winking_eye:

  • Changed AbstractControl to Control interface: Because of the way I abstracted rendering/voxel storage as described above, I couldn’t extend AbstractControl AND the base storage class. Therefore, it was necessary to convert it from a class extension to an interface implementation. Doing so makes the project easier to further extend too.

And yes, Java 1.8’s default methods let interfaces carry implementation (which gets you most of the way towards extending multiple classes), but I think leaning on that completely defeats the purpose of having Interfaces at all, and anyway, I noticed that JME encourages your code to be 1.5 compatible.
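
Going back to the register-by-object point above, here is a rough sketch of the idea (the BlockRegistry/Block names and the texture parameter are made up for illustration, not the actual Cubes API):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: registering block *instances* instead of classes.
// Identical blocks become instances of one class that differ only by constructor data.
public class BlockRegistry {

    public static class Block {
        private final String name;
        private final String texture; // e.g. a path or an index into the texture atlas

        public Block(String name, String texture) {
            this.name = name;
            this.texture = texture;
        }
        public String getName()    { return name; }
        public String getTexture() { return texture; }
    }

    // Subclass only where extra behaviour is needed, e.g. the "Minecraftian" leaf example.
    public static class LeafBlock extends Block {
        public LeafBlock(String texture) { super("leaves", texture); }
        public boolean shouldDecay() { return false; } // placeholder for real decay logic
    }

    private final List<Block> blocks = new ArrayList<Block>();

    /** Registers a block instance and returns its id (its index in the registry). */
    public byte register(Block block) {
        blocks.add(block);
        return (byte) (blocks.size() - 1);
    }

    /** The very object that was registered comes back out, methods and all. */
    public Block getBlock(byte id) {
        return blocks.get(id & 0xFF);
    }

    public static void main(String[] args) {
        BlockRegistry registry = new BlockRegistry();
        byte dirt  = registry.register(new Block("dirt",  "Textures/dirt.png"));
        byte stone = registry.register(new Block("stone", "Textures/stone.png"));
        registry.register(new LeafBlock("Textures/leaves.png"));
        System.out.println(registry.getBlock(dirt).getName() + ", " + registry.getBlock(stone).getTexture());
    }
}
```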

Functionality I would like to add in the future:

  • More thorough event system (Events for blockAdded, blockRemoved, blockSet, chunkAdded, etc.)
  • A more flexible/extensible texture system. Maybe I didn’t look at the code carefully enough, but it seems like I am hardcoded into a 16 x 16 “tile” texture atlas. Just because we are limited to 256 block types doesn’t mean we should be limited to 256 textures!
  • Moving ChunkMaterial definition from CubesSettings into BlockType. This will make it easier to customize the texture according to various conditions, such as lightening/darkening the texture, or optimized texture usage by having greyscale textures which are conditionally “colorized”.

I also have a question in regards to data storage: Storing voxel data in 3D arrays is super efficient for calculations, but super inefficient in terms of RAM usage. I am curious about implementing the option to store data as a HashMap<Vector3Int, BlockType>. I am sure there will be some impact on performance, but I wonder to what extent? How do applications like Minecraft store their large data sets?
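
To be concrete about what I mean by the HashMap option (stand-in classes, not the actual Cubes ones): the key type just needs sane equals() and hashCode(), and empty space simply costs nothing:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: sparse block storage keyed by an immutable coordinate.
// Only non-empty cells occupy memory; a missing key simply means "air".
public final class SparseBlockStore<T> {

    /** Immutable coordinate key; equals/hashCode are what make it usable in a HashMap. */
    public static final class Key {
        public final int x, y, z;
        public Key(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return x == k.x && y == k.y && z == k.z;
        }
        @Override public int hashCode() {
            return (x * 31 + y) * 31 + z;
        }
    }

    private final Map<Key, T> blocks = new HashMap<Key, T>();

    public void setBlock(int x, int y, int z, T block) {
        if (block == null) blocks.remove(new Key(x, y, z));   // null = air, frees the entry
        else               blocks.put(new Key(x, y, z), block);
    }

    public T getBlock(int x, int y, int z) {
        return blocks.get(new Key(x, y, z)); // null for empty space
    }
}
```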

@ulfgur said: While sleaker's system is indubitably better, it does not play nicely with the standard Cubes framework, and is a pain in the butt to implement. Does anyone know of an easier system, or a shortcut, or a different mesher, or something similar that I might use? I appreciate any help, especially since I'm not sure I can get his mesher to work.

The Greedy Mesher is most definitely harder; that’s one of the reasons why I rolled my own version of the framework. For myself it was a road worth travelling, because like all engineers we like to know how things work, and the best way to learn something is to pull it apart and play, break and test.

Honestly though, whatever works as long as it works and performs well; any of the meshers in the framework will do for the job at hand, so go with what you feel comfortable with. Also, have you seen the amazing work done by pspeed?
Marching Cubes

Will be happy to share any information about the Greedy Mesher, where possible with my limited knowledge on this subject.

@seizure

Wow, seriously, those are some really great changes! Is it alright if I include some of your changes in mine and @echospot's game?

Also we converted the array to a hashmap just like you considered and see no performance impact at all. (But RAM usage is down a lot!)

@seizure

Oh yeah, nicely done. Had a quick glance at your project and definitely think it’s worthwhile expanding. sleaker did some nice work with a one-dimensional array, shifting bits for the x, y, z coordinates, so there’s no need for a three-dimensional array; another approach to think about :wink:
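
For anyone curious, the bit-shifting trick looks roughly like this (just a sketch assuming a 16x16x16 chunk; sleaker’s actual chunk size and layout may differ):

```java
// Sketch: one flat byte[] per chunk, indexed by packing x, y, z into a single int.
// Assumes a 16x16x16 chunk, so each coordinate fits in 4 bits.
public class FlatChunk {
    public static final int SIZE = 16;          // blocks per axis
    public static final int SHIFT = 4;          // log2(SIZE)
    public static final int MASK = SIZE - 1;    // 0x0F

    private final byte[] blocks = new byte[SIZE * SIZE * SIZE];

    private static int index(int x, int y, int z) {
        return (x << (SHIFT * 2)) | (y << SHIFT) | z;   // x*256 + y*16 + z
    }

    public byte get(int x, int y, int z)          { return blocks[index(x & MASK, y & MASK, z & MASK)]; }
    public void set(int x, int y, int z, byte id) { blocks[index(x & MASK, y & MASK, z & MASK)] = id; }
}
```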

@PewPewKapowie
Feel free to use the code however you like! I am hoping that my changes are in line with @destroyflyer's plans and that he can use them, since I am sure I will continue developing this and would like my code to be easily available.

My guess concerning HashMaps vs native arrays is that for worlds with relatively small data sets, and/or which require relatively infrequent modification/checking, the speed difference will be negligible, as you stated. As the amount of data stored goes up, there are more items in a HashMap which potentially have to be traversed before finding the correct one, and the more times you call getter methods or modify the data (setting new blocks would be particularly costly, I believe?), the more times those costly HashMap traversals have to occur.

The other thing to consider is “empty space” vs “block space”. With an array of natives, enough space to hold every single cell is immediately grabbed from the heap and saved for use. This is why you notice that distinct hang at the beginning of many of the test classes, and why it takes up such a large amount of RAM - even if the number of actual blocks is relatively low, the rectangular boundary around them (that is, all the chunks needed to store them) will be much larger, and you are storing space for empty blocks. With a HashMap, you only store data for the Vector3Ints which actually contain something.

For a volume which is entirely populated or close to it, a native array definitely makes sense. But the more empty space you have, the stronger an argument you have to use a HashMap. This is especially true for worlds which have “clusters” of blocks which are spread very far apart - storing all the space in between is extremely costly and a huge waste.

I wonder how storing a Java Byte[][][] would work though? They are classes, not natives. But if the JRE just creates an array of pointers (very likely) which start out leading to null, and are to be replaced, you’re no better off (actually worse off, since pointers are definitely going to be larger than a byte). But maybe if the JRE were made such that it allocated the space dynamically…? I don’t know the technical details.

I think what could DEFINITELY stand to use a HashMap over an array is the main BlockVolume, which contains all the Chunks. In most use cases, you won’t be storing an extraordinary number of chunks (and if you are, you should maybe consider decreasing that number, and instead using larger chunks with smaller sub-chunks in another HashMap), and this means that you can initialize an absolutely massive volume with minimal memory cost, and then dynamically create/delete chunks as needed when blocks are added or removed.

Actually wouldn’t that entirely remove the need for volume borders, creating the grounds for an infinite terrain generator? :o You could then make an extension class with borders if you wanted.

@lawsy
There are a number of possible approaches to storing voxel data in the base-level chunks, each with varied pros and cons, as I started to get into. Bitwise single-dimensional arrays are super fast, but still have that memory problem. I’m not even sure exactly how much faster they are than normal 3D arrays. It’s easy to implement in any case. Having said all that, I think playing around with chunk size/storage, as I was describing above, is more interesting and fruitful, especially in these early stages.

@PewPewKapowie

Well said, and most of the time it all boils down to game design. My design has a ConcurrentHashMap for the chunk manager; with under 200 chunks loaded at one time, a ConcurrentHashMap has no great impact. Using sleaker’s idea of a bitwise single-dimensional array for all the blocks inside a chunk works well for my design, because there is no such thing as a null value: each block is either solid, liquid or gas, very similar to the Minecraft design, plus gas is playing a huge role in this design.

Had many an argument with myself over which way to handle chunks and blocks, and in the end the game design pretty well dictated my approach.

@seizure said: My guess concerning HashMaps vs native arrays, is that for worlds with relatively small data sets, and/or which require relatively infrequent modification/checking, the speed difference will be negligible, as you stated. As amount of data stored goes up, there are more items in a HashMap which potentially have to be traversed before finding the correct one, and the more times you call getter methods or modify the data (setting new blocks would be particularly costly I believe?), the more times those costly HashMap traversals have to occur.

HashMap traversals and updates are constant-time actually, just slower than arrays. (MUCH slower if you choose too high a load factor, but the standard load factor in the JDK is just fine.)
Insertions can be slower: if you happen to exceed the preallocated size, it will allocate a new backing store and copy every entry. So if you know the maximum size beforehand, preallocate enough; if you don’t, don’t worry too much (resizes happen only O(log N) times, so the cost averages out), but put the HashMap updates into a separate thread to avoid that ultra-rare but annoying stutter.

HashMaps have a bad locality of reference, and that does not play nice with the usual restrictions of CPUs and RAM access. If your code spends a lot of time iterating by coordinate, use a two-tier approach: use a HashMap of chunks, where each chunk is an array of arrays.
Benchmark chunk sizes and choose the one that is fastest overall. Ideally, allow the program to adapt to the best chunk size while it runs, though that might be more work than it’s worth. (Notch chose chunk size for Minecraft after benchmarking.)
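
A rough sketch of that two-tier layout (names and the 16-block chunk size are just for illustration, benchmark your own): a HashMap of chunks keyed by packed chunk coordinates, a dense flat array inside each chunk for locality, and chunks created only where blocks actually exist:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: HashMap of chunks (sparse, created on demand) with a dense flat array inside each chunk.
public class TwoTierWorld {
    private static final int SIZE = 16, SHIFT = 4, MASK = SIZE - 1;

    /** Chunk coordinates packed into one long so they can key a HashMap without a wrapper class. */
    private static long chunkKey(int x, int y, int z) {
        return (((long) (x >> SHIFT) & 0x1FFFFF) << 42)
             | (((long) (y >> SHIFT) & 0x1FFFFF) << 21)
             |  ((long) (z >> SHIFT) & 0x1FFFFF);
    }

    // Preallocate capacity here if you know roughly how many chunks you will hold.
    private final Map<Long, byte[]> chunks = new HashMap<Long, byte[]>(1024);

    public void setBlock(int x, int y, int z, byte id) {
        Long key = chunkKey(x, y, z);
        byte[] chunk = chunks.get(key);
        if (chunk == null) {                       // create chunks only where blocks exist
            chunk = new byte[SIZE * SIZE * SIZE];
            chunks.put(key, chunk);
        }
        chunk[((x & MASK) << (SHIFT * 2)) | ((y & MASK) << SHIFT) | (z & MASK)] = id;
    }

    public byte getBlock(int x, int y, int z) {
        byte[] chunk = chunks.get(chunkKey(x, y, z));
        if (chunk == null) return 0;               // 0 = air in this sketch
        return chunk[((x & MASK) << (SHIFT * 2)) | ((y & MASK) << SHIFT) | (z & MASK)];
    }
}
```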

@seizure said: The other thing to consider is "empty space" vs "block space". With an array of natives, enough space to hold every single cell is immediately grabbed from the heap and saved for use.

Actually it’s just an array of pointers, unless you use a native buffer.
It can still be a lot of data. A 100x100x100 chunk means you have an array of 100 pointers to 100 arrays of 100 pointers to 10,000 arrays of 100 pointers each, i.e. 1,000,000 pointers at the bottom.

This still shouldn’t be too noticeable. Allocation is just a pointer increment on the desktop.
You’ll get lag only if you actually initialize all these preallocated blocks, i.e. if there’s a constructor or member initialization.
(Waste of space is unrelated and can still happen, of course.)

@seizure said: I wonder how storing a Java Byte[][][] would work though? They are classes, not natives. But if the JRE just creates an array of pointers (very likely) which start out leading to null, and are to be replaced, you're no better off (actually worse off, since pointers are definitely going to larger then a byte). But maybe if the JRE were made such that it allocated the space dynamically...? I don't know the technical details.

Byte[][][] would just replace the lowest-level byte[] with a Byte[]. It would also allocate the individual Byte objects as separate heap entities, so the overall per-array-entry size would go up from 1 byte to whatever the address size is (32 or 64 bits). Also, the actual payload per byte would now be an additional pointer (to refer to the Byte class) and a byte rounded up to heap granularity, i.e. another pointer’s worth (for most architectures). I.e. each byte now requires the size of three addresses to store; with 8-byte addresses on a 64-bit machine, that means your size goes up by a factor of 24.
On an older Android, each Byte would also carry a substantial (multi-byte) space overhead for heap management; things would be even worse there.

Byte[] does not give you anything beyond byte[], actually. I wouldn’t even try it (though it might be interesting to check the theory I’m explaining above).

@seizure said: But maybe if the JRE were made such that it allocated the space dynamically...? I don't know the technical details.

It isn’t made that way. The JRE programmers have enough to do; they don’t code for approaches that make little sense. They don’t even cover all approaches that do make sense…
(The same holds for CPU designers, actually. A lot of what’s fast and what isn’t has grown from consensus, and could be done faster if enough people agreed on different approaches.)


Thanks for the tips on storage! However:

This still shouldn't not be too noticeable. Allocation is just a pointer increment on the desktop. You'll get lag only if you actually initialize all these preallocated blocks, i.e. if there's a constructor or member initialization. (Waste of space is unrelated and can still happen, of course.)
Are you sure about this? I have always assumed the long lag that occurs is due to grabbing a block of RAM. For example, if you run the ModelTest scenario in Cubes, you will see that the app hangs there, blank, for an extraordinary amount of time - this is despite most of the final volume being empty. The vast majority of the space is simply "air". It seemed to me like what was hanging it was the sheer volume of the chunk boundaries, rather than having to set the relatively few non-empty blocks.
@seizure said: Thanks for the tips on storage! However:

Are you sure about this? I have always assumed the long lag that occurs is due to grabbing a block of RAM. For example, if you run the ModelTest scenario in Cubes, you will see that the app hangs there, blank, for an extraordinary amount of time - this is despite most of the final volume being empty. Vast majority of the space is simply “air”. It seemed to me like what was hanging it was the sheer volume of the chunk boundaries, rather then having to set the relatively few non-empty blocks.

It’s current best knowledge. You’ll really have to profile your code and see where it spends its time - and then you have to mistrust your interpretation, because profiling is really hard (and there are a lot of wrong interpretations around, so you have to take everything with a grain of salt).

The standard JVMs (OpenJDK, Oracle JDK) use a copying garbage collector (the generational variety).
For a good introduction, look at http://icu-project.org/docs/papers/cpp_report/an_introduction_to_garbage_collection_part_ii.html .

@lawsy said: Greedy Mesher is most definitely harder, one of the reason why I rolled my own version of the framework, now for myself it was a road worth travelling, because like all engineers we like to know how things work and the best way to learn something is too pull it apart and play, break and test.

Honestly though, whatever works as long as it works, and preforms well, any of these meshers in the framework will do, for the job at hand, go with what you feel comfortable with. Also have you seen the amazing awesome work done by pspeed
Marching Cubes

Will be happy to share any information about the Greedy Mesher, where possible with my limited knowledge on this subject.

Ok, wait, now I’m all confused. I’m getting bad performance even though I use multithreading. As I understand it, this is because of the inefficiencies of the built-in mesher. The solution is therefore to use a better mesher, the best of which is probably sleaker’s greedy mesher. But now the solution is to use any mesher? Or is it just that any mesher in sleaker’s package is good enough?

@ulfgur said: Ok, wait, now I'm all confused. I'm getting bad performance even though I use multithreading. As I understand it, this is because the inefficiencies of the built-in mesher. The solution is therefore to use a better mesher, the best of which is probably sleaker's greedy mesher. But now the solution is to use any mesher? Or is it just any mesher in sleaker's package is good enough?

I’m going to assume you see a performance decrease when you are manipulating (adding/removing) chunk meshes in the object space, not necessarily when generating them. I personally would implement sleaker’s greedy mesher, and although you are correct that it was a pain to implement his different VoxelMesher system, once you get that working you can implement a lot of different meshers, such as the Marching Cubes algorithm, to experiment with. All the Cubes framework needs is a mesh generated when updateSpacial() is called.
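
In other words the seam is tiny; a hypothetical sketch (not the actual Cubes or sleaker interface, just the shape of it) of what the framework needs from a mesher:

```java
import com.jme3.scene.Mesh;

// Hypothetical seam: the framework only needs *a* Mesh back when a chunk is (re)built,
// so naive, greedy or marching-cubes meshers can all sit behind the same interface.
public interface VoxelMesher {
    /** Builds the render mesh for one chunk; ChunkData is whatever block storage the framework uses. */
    Mesh generateMesh(ChunkData chunk);
}

// Stand-in for the framework's chunk storage, just so the sketch compiles.
interface ChunkData {
    byte getBlock(int x, int y, int z);
}
```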

What they were discussing is the load on the JVM depending on whether you store block information in HashMaps or multi-dimensional arrays. As @lawsy has stated, it all boils down to how you want/need to manipulate the chunks in your game. The HashMap has worked wonders due to its ability to dynamically allocate its needed space in the JVM memory. I haven’t seen any performance decrease yet, as we have the future threads managing them, THEN adding them to the space once they’re fully generated.

Also, how would creating your own class and just making an ArrayList work? (e.g. “ArrayList<BlockInfo>”) Ever since I started working with the Perlin/Simplex noise generator, I keep hearing that member access is much faster than array access. Anyone care to expound on that?

@echospot said: Also, how would creating your own class and just making an ArrayList work? (e.g. "ArrayList<BlockInfo>") Ever since I started working with the Perlin/Simplex noise generator, I keep hearing that member access is much faster than array access. Anyone care to expound on that?

This confuses me just because an ArrayList will of course always be slower than an array (ArrayList is just a wrapper for an array after all). So you may need to be more specific.

Generally:
foo.x is faster than (direct field access, not counting the ‘foo’ dereference and null check in these)
foo.x[0] (two dereferences + a null check and an index check) which is faster than
foo.x[0][0] (three dereferences + two null checks and two index checks).

Also, while I’m on the subject something like:
int[] array = new int[width * height]
…is faster and smaller than…
int[][] array = new int[width][height]
…even when factoring in the math necessary to find the right index.

Though it is easier to mess up.
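
Spelled out, the flat-array version looks something like this (a small sketch; width/height are whatever your grid dimensions happen to be):

```java
// Sketch: a 2D grid stored in a single flat array, indexed as y * width + x.
public class FlatGrid {
    private final int width, height;
    private final int[] cells;

    public FlatGrid(int width, int height) {
        this.width = width;
        this.height = height;
        this.cells = new int[width * height];   // one allocation, one bounds check per access
    }

    public int  get(int x, int y)            { return cells[y * width + x]; }
    public void set(int x, int y, int value) { cells[y * width + x] = value; }
}
```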

@ulfgur said: Ok, wait, now I'm all confused. I'm getting bad performance even though I use multithreading. As I understand it, this is because the inefficiencies of the built-in mesher. The solution is therefore to use a better mesher, the best of which is probably sleaker's greedy mesher. But now the solution is to use any mesher? Or is it just any mesher in sleaker's package is good enough?

Sorry to confuse you; what I should have stated is this: if you only require a small voxel world without a chunk manager, the Cubes framework will do nicely. If you want a large voxel world with a chunk manager and infinite terrain, then I would highly recommend rolling your own, using the Cubes framework as your base camp. Marching cubes, naive, voxel or greedy mesher: each mesher works, with pros and cons, and using sleaker’s code will require some implementing. If you are only focused on game design, then go with what you feel comfortable with; if you are building a game engine, then work out which mesher performs best for your game, or ask pspeed about his marching cubes project, that project is freaking amazing!

This is my approach to the voxel world; it’s not perfect but it works:
Noise-generated chunks are handled separately in batches (from the chunk manager) using a future/callable thread, which generates all the chunk data needed for calculation and meshing. Once the batch of chunks is finished and returned from the callable, in order from top to bottom (FIFO), the chunks are sent to the rendering thread to be processed and attached one chunk at a time, with all the chunk data from neighbouring chunks at hand.
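
Stripped to its bones, that pipeline is just a worker pool plus a queue that the render thread drains; a rough sketch (plain ExecutorService and a placeholder Chunk class, not my actual chunk manager):

```java
import java.util.concurrent.*;

// Sketch: generate chunks on worker threads, hand finished chunks to the render thread in FIFO order.
public class ChunkPipeline {

    /** Stand-in for whatever data + mesh a finished chunk carries. */
    public static class Chunk {
        final int cx, cz;
        Chunk(int cx, int cz) { this.cx = cx; this.cz = cz; }
    }

    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    private final BlockingQueue<Chunk> finished = new LinkedBlockingQueue<Chunk>();

    /** Called from game logic: schedule a chunk for background generation. */
    public void requestChunk(final int cx, final int cz) {
        workers.submit(new Runnable() {
            public void run() {
                Chunk chunk = new Chunk(cx, cz);   // noise generation + meshing would happen here
                finished.add(chunk);               // FIFO hand-off to the render thread
            }
        });
    }

    /** Called once per frame on the render thread: attach at most one finished chunk per frame. */
    public void update() {
        Chunk chunk = finished.poll();
        if (chunk != null) {
            // attach the chunk's geometry to the scene graph here (e.g. node.attachChild(...))
        }
    }
}
```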

My tests with the greedy mesher:
Using 32 x 32 x 32 blocks per chunk, with liquid, gas and lighting cellular calculations plus meshing (my chunk rendering pipeline), generating a chunk takes between 10 and 50 milliseconds, depending on how many liquid, gas and lighting sources are inside the chunk and how many neighbouring sources affect the chunk being generated.

The joys of engineering, build, break, beg, steal, and test test test until you have something good!!!


Added note: personally I do like the greedy mesher, but there are things to be aware of when using it. A texture atlas needs implementing, baked lighting can be a little tricky, and transparent blocks like glass and liquids will need to be separated - but don’t create another greedy mesher for these; calculate everything inside the one greedy mesher and attach to different geometries. Only meshing visible blocks that have lighting, and adding/removing blocks across neighbouring chunks, are the things lacking in the current form of the greedy mesher.

Check out @roleary's post; it helped me a lot.
greedy meshing for complex voxel data

Ok, all, thanks for all the help and clarification.

@pspeed said: This confuses me just because an ArrayList will of course always be slower than an array (ArrayList is just a wrapper for an array after all). So you may need to be more specific.

Generally:
foo.x is faster than (direct field access, not counting the ‘foo’ dereference and null check in these)
foo.x[0] (two dereferences + a null check and an index check) which is faster than
foo.x[0][0] (three dereferences + two null checks and two index checks.

Also, while I’m on the subject something like:
int[] array = new int[width * height]
…is faster and smaller than…
int[][] array = new int[width][height]
…even when factoring in the math necessary to find the right index.

Though it is easier to mess up.

Alright, I understand what you were saying. Yeah, I meant array, sorry about that xD I was just wondering, regardless of game design and all that, what the easiest, fastest, and most efficient way to store the block data would be, since we were all discussing multi-dimensional arrays vs HashMaps etc. Thanks for your insight though, very helpful.

@lawsy said: Added note, personally I do like the greedy mesher, but things to be aware of using the greedy mesher, texture atlas need implementing, baked lighting can be a little tricky, transparent blocks like glass and liquids will need to be separated, but don't create another greedy mesher for these, calculate all inside the greedy mesher and attach too different geometries, only meshing visible blocks that have lighting, and adding/removing blocks across neighbouring chunks, these things are lacking in the current form of the greedy mesher.

Checkout @roleary post, this helped me a lot.
greedy meshing for complex voxel data

I actually combine the naive and greedy meshers in the transparency vs opaque scenario. However, one thing you will notice in my past screenshots: I’m still fighting the texturing, since I do not have every block indexed with a square as I do with the naive mesher. The texturing I have implemented right now with sleaker’s greedy implementation only does a textured quad for every 2 triangles, which gives a cool, but also strangely stretched and skewed, effect.

I actually combine the naive and greedy meshers in the transparency vs opaque scenario. However, one thing you will notice in my past screenshots, I'm still fighting the texturing as that I do not have every block indexed with a square as I do with the naive mesher. The texturing I have implemented right now with sleakers greedy implementation only does a textured quad for every 2 triangles, which gives a cool, but also strange stretched and skewed effect to it.

Interesting way of handling the transparency vs opaque.

Using @roleary's techniques for the greedy mesher:
To avoid any extra looping in the meshing process, or another greedy mesher, I created geometries for solid, transparent, liquid and fluid blocks. (Just to be clear, fluids include liquids, gases and plasmas: anything flowing or continually deforming.) The greedy mesher creates a quad face and, depending on that quad face’s voxel type, determines which corresponding geometry to attach it to, without introducing huge overheads and while still applying greedy mesh techniques to every type of block. When the quad faces are finished meshing, all the geometries are smashed together with the geometry batch factory and attached to the chunk.

However, I’m still hunting a bug in this system: transparent blocks look great viewed from certain directions but are only partly see-through from other directions.

Example:
I can see all edges of a glass block from one direction, but only the front-facing edges from the opposite direction; hoping this is just directional light, material cull rendering or a similar issue.

Had a lot of fun with the texture atlas and greedy mesh as well; in the end I had to scale the texture coordinates and repeat them in the shader to tile a quad face. Maybe not the ideal way to do things, but it does work.
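
For what it’s worth, the per-category routing described above reads roughly like this in jME terms (hypothetical names and deliberately simplified, not the actual greedy-mesher code); the Transparent bucket and blend mode lines are also the first things I’d check for the “see-through from one side only” issue:

```java
import com.jme3.asset.AssetManager;
import com.jme3.material.Material;
import com.jme3.material.RenderState.BlendMode;
import com.jme3.renderer.queue.RenderQueue.Bucket;
import com.jme3.scene.Geometry;
import com.jme3.scene.Mesh;
import com.jme3.scene.Node;

// Sketch: one geometry per block category, filled by the same greedy pass, then attached to the chunk node.
// Liquid/gas would follow the same pattern as the transparent geometry below.
public class ChunkGeometryRouter {

    public Node buildChunkNode(AssetManager assets, Mesh solid, Mesh transparent) {
        Node chunkNode = new Node("chunk");

        // In practice you would use your atlas texture / lighting material here.
        Material solidMat = new Material(assets, "Common/MatDefs/Misc/Unshaded.j3md");
        Geometry solidGeo = new Geometry("solid", solid);
        solidGeo.setMaterial(solidMat);
        chunkNode.attachChild(solidGeo);

        Material transMat = new Material(assets, "Common/MatDefs/Misc/Unshaded.j3md");
        transMat.getAdditionalRenderState().setBlendMode(BlendMode.Alpha); // alpha blending for glass/liquids
        Geometry transGeo = new Geometry("transparent", transparent);
        transGeo.setMaterial(transMat);
        transGeo.setQueueBucket(Bucket.Transparent); // sorted back-to-front, drawn after opaque geometry
        chunkNode.attachChild(transGeo);

        // The per-category geometries could be combined further with GeometryBatchFactory if needed.
        return chunkNode;
    }
}
```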

@toolforger said: It's current best knowledge. You'll really have to profile your code and see where it spends its time - and then you have to mistrust your interpretation, because profiling is really hard (and there's a lot of wrong interpretations around so you have to take everything with a grain of salt).

The standard JVMs (OpenJDK, Oracle JDK) use a copying garbage collector (the generational variety).
For a good introduction, look at http://icu-project.org/docs/papers/cpp_report/an_introduction_to_garbage_collection_part_ii.html .

Great advice and thank you. I decided to take some time out and do some profiling after reading the link you posted, and found a memory leak which in hindsight explains a mysterious little stutter (pausing the world) in my project. Once fixed, everything is running fast and smoothly :-o