Well Iâd really like to know how you can deduce this ^. How can you tell your algorithm is memory bound in Java.
Iâve been experimenting with data oriented design lately. For those who donât know what it is : itâs based on the idea that your CPU uses L1, L2, L3 cache to store chunks of memory for fast access while processing. The caches works in a way that when your program access data at a memory address all nearby / consecutive data is also loaded in the cache, with the assumption that the program may access nearby data subsequently.
The problem is copying data in this cache is long so if your program treat chunks of data that are scattered throughout the ram, it will constantly load useless data in the cache an dismiss them right after. This is called a cache miss.
Basically there are a lot of programs where the CPU is just waiting for memory to be copied in the cache instead of processing it. Thatâs why cpus do branch prediction during âdownâ time to get ahead of future execution (thatâs what lead to the Meltdown thingy not so long ago, but thatâs another story).
When you think of a OOP structure you thing objects in objects in objects, this usually ends up in objects allocated far away in the memory (like your example of an array of objects, the pointers to the objects will be in one chunk in memory, but there is no guaranty of where will be the actual objects in memory)
The idea with data oriented design is to make your design around the data, and make sure all data that will be processed at the same time is continuous in memory to maximize cache efficiency.
So I experimented, with java of course, and the result were⌠totally unexpected. In the way that the DoD approach didnât give me any perf boost⌠even the opposite. I tried many things: primitive types arrays, bytebuffers, even used the sun.misc.Unsafe to allocate memory myself⌠and I barely get on par with OOP designâŚ
So I may have done shit⌠but, Iâm not exactly the Java nooby⌠so I donât know.
My assumption is this one (DISCLAIMER, this may be totally wrong, itâs just my idea based on empiric observation and I have no proof so take it as it is) : Hotspot aligns the memory to minimize cache misses. There is no guaranty of where the JVM will put your variables in memory. IMO thatâs because they may move. And they move according the the most accessed memory.
So ⌠if this is true⌠having struct wonât help that much. Sure it will minimize memory usage⌠But I really doubt it will give you a 2,33 perf gain as you claim.
Note that in C++ data oriented design is an immediate perf win. The approach is really interesting, and itâs worth a read or 2.
EDIT : and btw Iâd be glad to have comments about âDoD in javaâ if some of you experimented. I couldnât find a lot of literature on the topic.