Thread efficiency vs time per frame deficiency - opinion welcomed :)

amirkr · December 5, 2013, 6:43pm

OK, here it is.

I have attached a schematic drawing explaining what happens in the code (or at least what I’m aiming at).
There are only 2 classes so it should be pretty straightforward: I’m creating 24 MyImage objects. Each one has a geometry that is a “placeholder” and a geometry that is the image I actually want. Both geometries are attached to a single node for convenience. After creating a MyImage object two things happen:

I attach its node to the rootNode (thus making the placeholder visible).
I create a new thread with that MyImage, which runs in the background and loads the image.
When a MyImage’s run() method ends, it adds itself to the readyQueue, indicating the loading is done.
simpleUpdate checks that queue every frame and polls “ready” MyImage objects out of it. Those objects have an image geometry ready, so the update loop calls a method in MyImage which attaches the ready image to the MyImage node.
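
For anyone who does not want to open the attachment, the setup described above can be sketched in plain Java roughly like this. This is only a minimal sketch, not the attached code: LoadableImage and ReadyQueueSketch are hypothetical stand-ins for MyImage and Main, the "loading" and "attaching" are reduced to placeholders, and the polling loop stands in for simpleUpdate.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for MyImage: no real image is decoded here.
class LoadableImage implements Runnable {
    private final Queue<LoadableImage> readyQueue;
    boolean attached;

    LoadableImage(Queue<LoadableImage> readyQueue) {
        this.readyQueue = readyQueue;
    }

    @Override
    public void run() {
        // Real code would load the image from disk here (background thread).
        readyQueue.add(this);   // signal the update loop that loading is done
    }

    void setImage() {
        attached = true;        // real code would attach the image geometry to the node
    }
}

class ReadyQueueSketch {
    // Starts one loader thread per image, then polls the queue the way simpleUpdate would.
    static int loadAndAttach(int imageCount) {
        Queue<LoadableImage> readyQueue = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < imageCount; i++) {
            new Thread(new LoadableImage(readyQueue)).start();
        }

        int attachedCount = 0;
        while (attachedCount < imageCount) {      // each iteration stands in for one frame
            LoadableImage ready = readyQueue.poll();
            if (ready != null) {
                ready.setImage();
                attachedCount++;
            } else {
                Thread.yield();                   // nothing ready yet; a real frame would render here
            }
        }
        return attachedCount;
    }

    public static void main(String[] args) {
        System.out.println("attached: " + loadAndAttach(24));
    }
}
```

The queue is a ConcurrentLinkedQueue because the loader threads add to it while the update loop polls it; only the update loop ever touches the scene.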

What should you see? (If the problem persists on your machine):
Keep an eye on the Fps from launch until all images are loaded. It should be OK for the little images, but much much lower on image3. When all images have been loaded the Fps boosts up again.

Note:

There are two constants in main: the first is the path (where the images are stored), the second is the image name (that image will be loaded 24 times). So you should just replace those with wherever the images are on your machine.
If you load the heaviest image (image3, size: 245KB, dimensions 1920x1200), my Fps drops to 0 until every single placeholder has been filled. If you delete the single code line which attaches the image (everything else still happens: the loading, the texture… just not the showing), the Fps does not drop. This is line 37 in Main, “m.setImage();”.

Try to see what you get…

Thanks so much!!!

Amir.

The code: All needed files

pspeed · December 5, 2013, 6:53pm

http://hub.jmonkeyengine.org/javadoc/com/jme3/asset/TextureKey.html#setGenerateMips(boolean)

amirkr · December 6, 2013, 12:48am

@pspeed said: http://hub.jmonkeyengine.org/javadoc/com/jme3/asset/TextureKey.html#setGenerateMips(boolean)

Thank you. Using a TextureKey with mipmap generation set to false, the textures loaded almost 8 times faster (what did you say about the mipmaps…? use them…? =D ).
Anyhow, though loading is now quick, the problem still exists: while the textures are being loaded (well, attached actually; something happens there, the transfer to the GPU?) the Fps drops, even if the image is small, which is very odd.
I’m not being petty about this - the problem is it lags the movement in such a way that for those 1-2 seconds of loading (which occur very often) things look bad. So the improvement you offered helped a lot but I still need to figure out how to tell the render thread to shut up and keep working at high Fps even when it needs to attach other geometries (possibly fat ones).

I have found something which better “pinpoints” the problem:
I thought of the following solution -
if I mostly care about keeping the Fps high, why not attach the geometries a bit more “rarely” (but still rather quickly?).

Basically doing something like this in the update loop:

[java]
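// Fields assumed on the application class: MyImage m, int counter, and the _readyQueue.
public void simpleUpdate(float tpf)
{
    // Take the next loaded image only when none is pending.
    if (m == null)
    {
        m = _readyQueue.poll();
    }

    // Attach at most one image every 100 frames.
    if (m != null && counter % 100 == 0)
    {
        m.setImage();
        m = null;
    }

    counter++;
}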
[/java]

And I have discovered that this happens (pspeed, I’m sorry if it’s again what you were trying to say, but I think this is something else?): the movement is indeed smooth until, once a second or so, a geometry is attached. Then, exactly when it is attached, the scene freezes for a second, then continues. The Fps isn’t showing a decrease (it’s about 150…), but maybe the counter just isn’t fast enough to detect this “epsilon time decrease” when the geometry is attached.

Is there any way to stop this freeze? If this is accomplished my work is done (for this issue) and I’ll buy each and every one of you a six pack of beers. Honestly.

Amir.

pspeed · December 6, 2013, 1:33am

You can load images 5000 times in RAM. You can move them around. You can flip them. You can shrink them. Never once do they have to go to the GPU… until you attach them to the scene. THEN they need to get rendered and only then will they go to the GPU.

When else would we send them? OpenGL context is single threaded… so whenever they go to the GPU it’s going to pause the render thread because render thread = open gl thread = render thread.

Empire_Phoenix · December 6, 2013, 8:27am

You could test with dds files and see if that reduces the time, as dds is around 1/4 or 1/8 of the size from a GPU perspective. If that improves it, it’s probably really the GPU transfers.

Also, are you by any chance using a Unix-based OS with an AMD graphics card?

amirkr · December 6, 2013, 8:46am

@Empire Phoenix said: You could test with dds files and see if that reduces the time, as dds is around 1/4 or 1/8 of the size from a GPU perspective. If that improves it, it’s probably really the GPU transfers.
Also, are you by any chance using a Unix-based OS with an AMD graphics card?

I didn’t know that about dds files. Interesting! Do you know of a way to take jpeg files and turn them (in code) into dds files? Or to re-encode BufferedImages as dds? (Eventually I keep all images in BufferedImage objects, since I need to resize them as pspeed suggested.)

As for your second question: no, I’m using Windows with an nVidia GeForce 410M (it’s not my regular computer).

Thanks!

Empire_Phoenix · December 6, 2013, 11:52am

Well, there are several tools that can write dds. Search for a command-line tool you like and call it via ProcessBuilder.
(On the server of course, then only deliver the dds out.)

(Alternatively, the only pure Java solution seems to be a class inside NASA World Wind; mind the license, however.)
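
Calling an external converter from Java could look roughly like the sketch below. The tool name "dds-tool" and its "input output" argument order are made up for illustration; substitute whatever command and arguments the converter you pick actually expects.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

class DdsConvertSketch {
    // "tool" is the converter executable. The argument order (input, then output)
    // is an assumption, not the syntax of any particular real tool.
    static ProcessBuilder buildCommand(String tool, Path input, Path output) {
        return new ProcessBuilder(tool, input.toString(), output.toString());
    }

    // Runs the converter and reports whether it exited with status 0.
    static boolean convert(String tool, Path input, Path output) {
        try {
            // inheritIO() forwards the tool's console output to this process.
            Process process = buildCommand(tool, input, output).inheritIO().start();
            return process.waitFor() == 0;
        } catch (IOException e) {
            return false;                          // tool missing or not executable
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();    // keep the interrupt flag set
            return false;
        }
    }

    public static void main(String[] args) {
        ProcessBuilder pb = buildCommand("dds-tool",
                Paths.get("image3.jpg"), Paths.get("image3.dds"));
        System.out.println(pb.command());
    }
}
```

Since the conversion runs as a separate process, it fits naturally on the server side or on a loader thread, never on the render thread.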

normen · December 6, 2013, 12:14pm

I still think jME should provide another OpenGL context for exactly this kind of stuff (uploading/preparing data). Depending on the driver implementation that should effectively allow multithreading and keeping the update loop running while uploading data.

Empire_Phoenix · December 6, 2013, 12:47pm

@normen said: I still think jME should provide another OpenGL context for exactly this kind of stuff (uploading/preparing data). Depending on the driver implementation that should effectively allow multithreading and keeping the update loop running while uploading data.

Wait, is it really possible (jME aside) to open two contexts at a low level, use one for rendering and one for uploading, and share the objects in between? Never knew that. If so, we would only need a way to push one of those into jME (aka wrap the whole jME stuff with a dual context provider).

normen · December 6, 2013, 1:31pm

@Empire Phoenix said: Wait, is it really possible (jME aside) to open two contexts at a low level, use one for rendering and one for uploading, and share the objects in between? Never knew that. If so, we would only need a way to push one of those into jME (aka wrap the whole jME stuff with a dual context provider).

Yeah, that’s the suggested way to do multithreading as per the OpenGL spec. You’d still have to manage this manually though, as in actively using the other context to upload and then assigning the texture/material in the main context after it has been uploaded. So it would require some special access methods for this; it cannot happen “automagically”. If you just outsource the uploading, the main context would still have to wait until the texture is actually uploaded.

Empire_Phoenix · December 6, 2013, 4:24pm

Well, I could use an asset listener though, to enqueue an upload before it finishes returning.

As I already use a secondary thread for everything but attaching, this could already help.

normen · December 6, 2013, 4:31pm

@Empire Phoenix said: Well, I could use an asset listener though, to enqueue an upload before it finishes returning.
As I already use a secondary thread for everything but attaching, this could already help.

Yeah exactly, something like that is what I meant by “handling it manually”. You still need a way to trigger an actual upload, too.

toolforger · December 6, 2013, 10:36pm

I’d like that in a way that the application doesn’t have to manage the second thread; it’s too easy to muck up things that way.
Something like “if you specify the asset in this particular way, you’ll get back a Future that will be filled in only after the upload completed”. Application can then defer using the asset until after the upload happened.

normen · December 6, 2013, 10:45pm

@toolforger said: I'd like that in a way that the application doesn't have to manage the second thread; it's too easy to muck up things that way. Something like "if you specify the asset in this particular way, you'll get back a Future that will be filled in only after the upload completed". Application can then defer using the asset until after the upload happened.

A helper AppState for this could certainly be added, but it would be two method calls (load asset, upload data) in a Runnable, not exactly complicated. Having this in a rigid API would bind the user to (yet) another thread that has to be considered; I guess it’s better as a more low-level extension. Plus most larger games have separate threads for asset loading anyway.

toolforger · December 7, 2013, 10:33am

Not sure whether an AppState is useful for that; AppStates are just there so you can enable and disable processing steps from outside the threads they are called from.
I think the asset-loading thread of larger games should still be decoupled from load-to-GPU because the two transfer processes might need to be prioritized differently (e.g. load-to-GPU might want to upload GUI meshes/textures before background images even if stuff got loaded from disk in a different order).

I think an API extension would be necessary. The application won’t want to use an asset before it’s in the GPU, so it needs a way to know about a completed upload. (What would happen if the application tried to use an asset that’s not completely uploaded yet?)
A Future would be one way to do that. A Runnable that the uploader thread can submit() to the render thread would be another one (probably a better one, with a Future per asset the render thread would have to poll all outstanding Futures).
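
To make the two ideas above concrete (uploads ordered by priority rather than by disk-load order, plus a Runnable that fires once an upload is done), here is a rough plain-Java sketch. UploadRequest and UploadQueueSketch are hypothetical names, the "upload" itself is a placeholder, and the callback is run directly where a real uploader would hand it to the render thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical upload request; nothing like this exists in jME.
class UploadRequest implements Comparable<UploadRequest> {
    final String assetName;
    final int priority;          // lower value = uploaded sooner
    final Runnable onUploaded;   // what to do once the data is on the GPU

    UploadRequest(String assetName, int priority, Runnable onUploaded) {
        this.assetName = assetName;
        this.priority = priority;
        this.onUploaded = onUploaded;
    }

    @Override
    public int compareTo(UploadRequest other) {
        return Integer.compare(priority, other.priority);
    }
}

class UploadQueueSketch {
    // Processes requests by priority, regardless of the order they arrived in,
    // and returns the asset names in the order they were "uploaded".
    static List<String> drainInPriorityOrder(List<UploadRequest> requests) {
        PriorityBlockingQueue<UploadRequest> queue = new PriorityBlockingQueue<>(requests);
        List<String> order = new ArrayList<>();
        UploadRequest next;
        while ((next = queue.poll()) != null) {
            // A real uploader would push the data to the GPU here.
            order.add(next.assetName);
            // A real uploader would submit this to the render thread instead.
            next.onUploaded.run();
        }
        return order;
    }

    public static void main(String[] args) {
        List<UploadRequest> requests = new ArrayList<>();
        requests.add(new UploadRequest("background", 2, () -> { }));
        requests.add(new UploadRequest("gui", 0, () -> { }));
        requests.add(new UploadRequest("terrain", 1, () -> { }));
        System.out.println(drainInPriorityOrder(requests));
    }
}
```

Requests with equal priority come out in no guaranteed order, so a real version would probably want a tie-breaker such as a sequence number.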

toolforger · December 7, 2013, 10:34am

@erlend_sh can you please move the messages starting from #253144 to a separate thread? This is interesting but won’t help the original poster with his problem.

normen · December 7, 2013, 11:06am

@toolforger said: Not sure whether an AppState is useful for that; AppStates are just there so you can enable and disable processing steps from outside the threads they are called from. I think the asset-loading thread of larger games should still be decoupled from load-to-GPU because the two transfer processes might need to be prioritized differently (e.g. load-to-GPU might want to upload GUI meshes/textures before background images even if stuff got loaded from disk in a different order).
I think an API extension would be necessary. The application won’t want to use an asset before it’s in the GPU, so it needs a way to know about a completed upload. (What would happen if the application tried to use an asset that’s not completely uploaded yet?)
A Future would be one way to do that. A Runnable that the uploader thread can submit() to the render thread would be another one (probably a better one, with a Future per asset the render thread would have to poll all outstanding Futures).

Dude, you’re rambling. As I said, this is normal threading, no need to get all fuzzy about it. The simplest way to determine when it’s uploaded would be when the upload call has finished. :roll: Your suggestion doesn’t require an API; it’s exactly what the existing concurrency API in Java does. As I said, you just put the upload process in a Callable and have exactly what you describe. So there’s no need for an “API” apart from the actual upload call.

I also don’t know where you get the idea that “Appstates are just there so you can enable and disable processing steps from outside the threads they are called from”, because it’s nonsense. AppStates are for all kinds of things, but you can group them all under “extending Application”. In this case an AppState would be appropriate so you can synchronize the uploading with the update loop. Finally, all this is not even a current discussion or anything we ask for input on.
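
As a rough illustration of the “put it in a Callable” point, using only java.util.concurrent: the sketch below assumes an upload call existed, which it does not in jME as discussed here, so both steps inside the Callable are string placeholders and every name is hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class UploadFutureSketch {
    // Two calls in one job: load the asset, then upload it. Both are placeholders.
    static Callable<String> uploadJob(String assetName) {
        return () -> {
            String loaded = "loaded:" + assetName;   // stand-in for loading the asset
            return "uploaded:" + loaded;             // stand-in for the (hypothetical) upload call
        };
    }

    // Submits the job to a separate thread and polls the Future like an update loop would.
    static String runAndPoll(String assetName) {
        ExecutorService uploader = Executors.newSingleThreadExecutor();
        try {
            Future<String> pending = uploader.submit(uploadJob(assetName));
            while (!pending.isDone()) {
                Thread.yield();                      // a real update loop would render a frame here
            }
            return pending.get();                    // returns immediately, the job is done
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            uploader.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndPoll("image3"));
    }
}
```

The Future is done exactly when the Callable returns, i.e. when the upload call has finished, so no extra notification mechanism is needed.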

pspeed · December 7, 2013, 11:42am

To be clear, this discussion is arguing over the color of the curtains you will put on a house that hasn’t even been built yet. The “house” is the hard part.

So in other words, I kind of agree with Normen. I don’t think we have anyone who will be doing the harder OpenGL context-related work to make this happen so it falls into the realm of holiday wishes.