Well I’d really like to know how you can deduce this ^. How can you tell your algorithm is memory bound in Java.
I’ve been experimenting with data oriented design lately. For those who don’t know what it is : it’s based on the idea that your CPU uses L1, L2, L3 cache to store chunks of memory for fast access while processing. The caches works in a way that when your program access data at a memory address all nearby / consecutive data is also loaded in the cache, with the assumption that the program may access nearby data subsequently.
The problem is copying data in this cache is long so if your program treat chunks of data that are scattered throughout the ram, it will constantly load useless data in the cache an dismiss them right after. This is called a cache miss.
Basically there are a lot of programs where the CPU is just waiting for memory to be copied in the cache instead of processing it. That’s why cpus do branch prediction during “down” time to get ahead of future execution (that’s what lead to the Meltdown thingy not so long ago, but that’s another story).
When you think of a OOP structure you thing objects in objects in objects, this usually ends up in objects allocated far away in the memory (like your example of an array of objects, the pointers to the objects will be in one chunk in memory, but there is no guaranty of where will be the actual objects in memory)
The idea with data oriented design is to make your design around the data, and make sure all data that will be processed at the same time is continuous in memory to maximize cache efficiency.
So I experimented, with java of course, and the result were… totally unexpected. In the way that the DoD approach didn’t give me any perf boost… even the opposite. I tried many things: primitive types arrays, bytebuffers, even used the sun.misc.Unsafe to allocate memory myself… and I barely get on par with OOP design…
So I may have done shit… but, I’m not exactly the Java nooby… so I don’t know.
My assumption is this one (DISCLAIMER, this may be totally wrong, it’s just my idea based on empiric observation and I have no proof so take it as it is) : Hotspot aligns the memory to minimize cache misses. There is no guaranty of where the JVM will put your variables in memory. IMO that’s because they may move. And they move according the the most accessed memory.
So … if this is true… having struct won’t help that much. Sure it will minimize memory usage… But I really doubt it will give you a 2,33 perf gain as you claim.
Note that in C++ data oriented design is an immediate perf win. The approach is really interesting, and it’s worth a read or 2.
EDIT : and btw I’d be glad to have comments about “DoD in java” if some of you experimented. I couldn’t find a lot of literature on the topic.