[Solved] How to make my AssetLoaders threadsafe / How to do async loading?

Ogli · August 19, 2016, 4:28pm

Hello jME people,

This is about: [see title of the topic].

I looked here: Multithreaded AssetLoading
Also here: jmonkeyengine/ImplHandler.java at master · jMonkeyEngine/jmonkeyengine · GitHub
And started to understand ThreadLocal and ThreadLocalRandom by having read some examples in my main spoken language. Currently reading Understanding the concept behind ThreadLocal - DZone Java

Still I’m not very sure about how this works in detail and I can’t figure out what exactly is going on under the hood. Maybe these two questions will lead to meaningful insights:

(1)
What do I need to watch out for when making my own threadsafe AssetLoader for the AssetManager - will the code from the DesktopAssetManager handle these things for me?
Some AssetLoader I’ve seen (the BitmapFontLoader) doesn’t have any state (no attributes) which makes it threadsafe already (but not the loading of sub-assets, e.g. Texture pages for bitmap characters or the Material used to render the characters).
Do I need to do it in the same way or can I use attributes in my loader and the DesktopAssetManager will handle the thread-safety for me?

(2)
How do I do async asset loading with jME, for example when streaming in tiles of a large world? I’m sure that there are some examples for this, but maybe someone knows the perfect example and the name of the class that contains the relevant code for this?
(2b)
Is it, in your experience, more useful to use only one async loader thread or to use 2+ loader threads? I can imagine that a hard disk or solid state drive may have difficulties when too many loaders are accessing the medium. The bottleneck most certainly is this external medium, not the CPU/CPU-Cache/RAM part of the system.
(2c)
Are memory-mapped files a good option? How do I use them with jME/Java?

Thanks for sharing your time reading my noob questions,

pspeed · August 19, 2016, 9:52pm

Nothing, really. AssetManager is already thread safe. As long as your AssetLoader doesn’t keep its own state then there is nothing special to do. (And if it does keep its own state then it could be the sign of a design problem with the loader.)
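A minimal sketch of what "doesn’t keep its own state" can look like. This is not jME code: `StatelessLineCounter` is a hypothetical stand-in (jME’s real `AssetLoader` receives an `AssetInfo` and returns an `Object`), and counting newlines only stands in for real parsing. The point is that everything a load needs lives in local variables.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical loader, not part of jME. It has no instance fields, so one
// shared instance can be called from any number of threads at once.
class StatelessLineCounter {
    static final StatelessLineCounter SHARED = new StatelessLineCounter();

    // All working state (the counter, the current byte) is local to the call.
    int load(InputStream in) throws IOException {
        int lines = 0;
        for (int b = in.read(); b != -1; b = in.read()) {
            if (b == '\n') {
                lines++;
            }
        }
        return lines;
    }

    // Convenience wrapper for the example: loads from an in-memory string.
    static int countLines(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            return SHARED.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If a loader does need intermediate data, passing it along as method parameters keeps the instance itself free of state.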

My IsoSurfaceDemo uses my OSS Pager library.
IsoSurfaceDemo:

Pager library:

There is very little documentation there but the IsoSurfaceDemo kind of shows how to use it. There was a big long thread here on the development of the demo that may be helpful.

At least one other dev used the Pager in their project and there was a thread of questions/support but I can’t seem to find it right now.

JME’s standard terrain apparently also does async loading, but its approach is less general.

It kind of depends on how many things you are loading, how long they take, how many CPU cores are available, etc… In the general case in Mythruna, I set the number of paging threads to be the same as the number of cores (according to Java’s Runtime class). It was also a configurable option, as I recall.
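A rough sketch of that sizing rule, assuming a plain `java.util.concurrent` pool rather than the Pager library’s own threading. `PagingPool` and its `configured` parameter are hypothetical names for illustration only.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not taken from Mythruna or the Pager library.
class PagingPool {
    // Uses the configured value when it is positive, otherwise the core
    // count reported by Java's Runtime class.
    static int workerCount(int configured) {
        int cores = Runtime.getRuntime().availableProcessors();
        return configured > 0 ? configured : cores;
    }

    // Builds a fixed-size pool of daemon threads so that leftover paging
    // work never keeps the JVM alive after the game window closes.
    static ExecutorService create(int configured) {
        return Executors.newFixedThreadPool(workerCount(configured), r -> {
            Thread t = new Thread(r, "pager-worker");
            t.setDaemon(true);
            return t;
        });
    }
}
```

Calling `PagingPool.create(0)` would give one worker per core, and a value such as `1` makes it easy to compare against a single loader thread on a given disk.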

No experience here with paging specifically. Most of the time there is significant difference in what is in memory versus what is on disk in the case of 3D assets. Java supports memory mapped files in the nio package… that’s a deep topic and there are lots of non-game specific resources about that on the web.
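For reference, a small sketch of a read-only memory-mapped file using `java.nio`. It has nothing jME-specific in it: `MappedRead` is a hypothetical class, and summing the bytes only stands in for whatever parsing a real tile format would need.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class; only standard java.nio calls are used.
class MappedRead {
    // Maps the whole file read-only and sums its bytes (as unsigned values).
    static long sumBytes(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.get() & 0xFF;
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes four known bytes to a temporary file and reads them back mapped.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("tile", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[] {1, 2, 3, (byte) 250});
            return sumBytes(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The mapping stays valid after the channel is closed, and the operating system pages data in on demand, so this mostly helps when the on-disk layout is close to what is needed in memory.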

Darkchaos · August 20, 2016, 6:16am

Given your posts a few days/weeks ago, I feel like you try to overcomplicate things because you simply don’t know of them.

The Callable<> is a good example of that.
Back in the day I always subclassed Thread to hold some variables and then started it. You had two variables, boolean isRunning and Object/Spatial/... value, then simply polled isRunning, and when it was done you could read value.

This approach is obviously wrong or at least not good (writing a boolean is atomic, but without volatile or a lock the reading thread is not guaranteed to see the write, so you would want a volatile flag or a mutex). But that’s just a side note.

You can always do something like Callable<Spatial> call = new Callable<Spatial>() { public Spatial call() { return assetManager.loadModel("blah.j3o"); } };

Then, once you enqueue or submit it, you get back a Future; you can check future.isDone() and, if it is done, call future.get(). You could enqueue all loading into the main thread like this.

Note: I just realized that this won’t help you. I’ll keep it in, though, since it might be good to know for some people.
On the other hand you could throw those callables in a thread pool and given the AssetLoader is threadsafe you only have to keep a list of callables and know when everything is loaded.
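The thread-pool variant could look roughly like this. It is only a sketch: `BackgroundLoads` is a hypothetical class, `fakeLoad` stands in for a real asset manager call, and the pool size of 2 is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example; no jME classes are used.
class BackgroundLoads {
    // Stand-in for a real (thread-safe) asset load.
    static String fakeLoad(String name) {
        return "loaded:" + name;
    }

    // Submits one Callable per name, keeps the Futures in a list, and
    // collects the results in submission order.
    static List<String> loadAll(List<String> names) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String name : names) {
                Callable<String> task = () -> fakeLoad(name);
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get()); // blocks until this load has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Inside a game loop you would not block on `get()` like this. Instead, check `isDone()` on each pending Future once per frame and attach whatever has finished.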

Anyway, the best option might be to simply use Paul’s code.
Regarding the thread question: If you load j3os, the bottleneck should be the hard drive. You can simply watch your CPU usage. If it is less than (100% / NumberOfCores), then your main thread isn’t limited by CPU power.

Ogli · August 20, 2016, 6:34pm

Thanks both of you for your replies.

In the meantime I studied the https://github.com/jMonkeyEngine/jmonkeyengine/blob/master/jme3-core/src/main/java/com/jme3/asset/ImplHandler.java a little bit more in detail.

So, as I guessed, it’s not necessary to handle threadsafety myself, since that ImplHandler (the ThreadLocal) handles it.

What surprised me: The ImplHandler (the ThreadLocal) expects the AssetLoader instance to provide a public constructor (c’tor) with zero arguments and uses reflection to instantiate an instance whenever a new thread demands access to that AssetLoader.

So, in theory, I don’t even need to make the AssetLoader stateless (even though that can easily be achieved by passing ad-hoc data capsules along a chain of methods if needed).

What I do need to provide is a public c’tor for reflection-based instantiation - a fact that is never mentioned anywhere in the documentation I’ve seen so far.
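The mechanism described above can be sketched in isolation like this. It is not the actual ImplHandler source, just an illustration of the pattern with hypothetical classes (`CountingLoader`, `PerThreadLoaders`): a ThreadLocal whose initial value is created by reflection through a zero-argument constructor.

```java
// Hypothetical loader with instance state; safe only because every thread
// ends up with its own instance.
class CountingLoader {
    int calls;

    public CountingLoader() {
    }

    int load() {
        return ++calls;
    }
}

// Hypothetical helper illustrating the pattern, not jME's implementation.
class PerThreadLoaders {
    // Creates one instance per thread, on first access from that thread.
    static <T> ThreadLocal<T> perThread(Class<T> type) {
        return ThreadLocal.withInitial(() -> {
            try {
                return type.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(
                        "Needs an accessible zero-arg constructor: " + type, e);
            }
        });
    }

    // True if the same thread always sees one instance and a second thread
    // gets a different one.
    static boolean distinctInstancesPerThread() {
        ThreadLocal<CountingLoader> loaders = perThread(CountingLoader.class);
        CountingLoader mine = loaders.get();
        CountingLoader[] other = new CountingLoader[1];
        Thread t = new Thread(() -> other[0] = loaders.get());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mine == loaders.get() && other[0] != null && other[0] != mine;
    }
}
```

A class without a usable zero-argument constructor fails at the first `get()` call, which matches the surprise described above: the requirement only shows up at runtime.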

Now to reply to your input, guys:

I admit, that I don’t have much experience in software design even though I started coding at the age of 10. I compensate for that by other traits I have - traits that most other people are lacking. So thanks for the design hint - that’s exactly why I was asking that question here.

Okay, thanks. I will schedule that on my TODO list for when my framework OGF is finally done.
Btw, I have some experience writing a pager with jME 2. Back then I implemented a producer-reducer-system and even though doing all i/o in separate threads, the jME 2 app caused a lot of stuttering. A colleague said, it might be because of the blocking disk access. So maybe I’ve done something wrong with the async i/o back then or the system just failed due to other reasons (back then not all CPUs had more than one core).

Okay, and what did happen when using only one loader thread? Did the performance go up or down?
I just found out yesterday that we could make it in a pipeline: Have one thread read i/o data, have another thread prepare that data for rendering, hand it to the scenegraph thread when done.
In theory that should give some benefit even when only having one thread for disk i/o.
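A sketch of that pipeline idea, under the assumption that two blocking queues connect the stages. `LoadPipeline` is hypothetical and the string wrapping only marks where disk reads and mesh preparation would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical three-stage pipeline: reader thread -> preparer thread -> caller.
class LoadPipeline {
    // End-of-stream marker, compared by identity so no real item can match it.
    private static final String DONE = new String("<done>");

    static List<String> run(List<String> tileNames) {
        BlockingQueue<String> raw = new LinkedBlockingQueue<>();
        BlockingQueue<String> ready = new LinkedBlockingQueue<>();

        // Stage 1: the only thread that would touch the disk.
        Thread reader = new Thread(() -> {
            for (String name : tileNames) {
                raw.add("bytes(" + name + ")");
            }
            raw.add(DONE);
        }, "io-reader");

        // Stage 2: turns raw data into something renderable.
        Thread preparer = new Thread(() -> {
            try {
                for (String item = raw.take(); item != DONE; item = raw.take()) {
                    ready.add("mesh(" + item + ")");
                }
                ready.add(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "preparer");

        reader.start();
        preparer.start();

        // Stage 3: the calling thread plays the role of the scenegraph thread.
        List<String> attached = new ArrayList<>();
        try {
            for (String item = ready.take(); item != DONE; item = ready.take()) {
                attached.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return attached;
    }
}
```

In a real application the last stage would use the non-blocking `poll()` once per frame instead of `take()`, so the render loop never waits on the loader threads.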

Okay, then I will need to fiddle with it myself. If I do something with it then I will put that into the open source OGF framework so that others can use that solution.

As a sidenote: There is something called AtomicLong which does exactly that with a long integer in the Random class that Java provides. In the ThreadLocalRandom it has been replaced by unsafe integer operations and a per-thread instantiation of Random instances.

Yeah, but that has two disadvantages (even though it’s “best practice” with jME to do it that way):

a) You create one Callable each time you want something simple to be done (even though it might just be a simple rotation of a Node). Obviously it’s “the Java way” to mass create objects and hope that they will be gc’ed as “first generation” minor garbage. When writing efficient code (with Java, .NET or C++) you would never spam your system with mass-creation of minor ad-hoc objects.

b) When submitting Callable to the queue, you don’t have any guarantee when and with what priority things are done. There may be 1000 geometry rotations and only 3 or 4 A.I. path finding tasks. Having a separate thread for the two things might give you guarantees and priorities of your choosing, not a random execution of callable instances. Though, I think that load balancing over multiple threads should be better. Those guarantees might not be important to most coders and so this b) is not a very big disadvantage in the end.

Yes, that might indeed be a good idea. I’m slapping myself a lot because I wrote yet another UI system for my OGF. Lemur and other libraries profit immensely from an experienced software engineer like @pspeed.

That’s why I was asking ^^
However, theory and practice often diverge. I can imagine that a typical SATA buffer filled with read/write requests in combination with certain hard disks and solid state drives might cause other results than I expect at first thought.