Skeletal Animation Revisited

After playing a little with the Cal3d loader (changing the update frequency, etc.) I managed to have 50 animated models at a frame rate of around 100 fps.

Without animating the models the fps were around 135.

With 20 models animated at the same time the fps were around 250.

My system is an Acer Aspire 5652 with an Intel Core Duo 1.66 GHz, a 256 MB Nvidia GeForce 7600 and 1 GB of RAM.

The only overhead is in the controller update, so by customizing the update to animate only the models that are drawn on screen, I believe we can achieve good performance for certain types of games. I mean, in an FPS or RPG, even if you're getting attacked by a squad of enemies, I think the animated models on screen cannot number more than, say, 40 or 50 units.
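To illustrate the idea of animating only what is drawn, here is a minimal Java sketch. The class names, the `onScreen` flag and the counters are all hypothetical stand-ins, not jME or Cal3d API; in a real scene the flag would come from the renderer's culling result. Every model's clock still advances, so a model that comes back into view is not out of sync, but the costly skeleton update is skipped while it is off screen.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an animated model; not a jME or Cal3d class. */
class VisibleOnlyAnimModel {
    boolean onScreen;      // assumed to be set from the renderer's culling pass
    int skeletonUpdates;   // counts how often the expensive update ran
    float animationTime;   // always advanced so the clip stays in sync

    VisibleOnlyAnimModel(boolean onScreen) {
        this.onScreen = onScreen;
    }

    void advanceClock(float dt) {
        animationTime += dt;
    }

    void updateSkeleton() {
        // Bone interpolation and vertex skinning would happen here.
        skeletonUpdates++;
    }
}

class VisibleOnlyAnimUpdater {
    /** Runs the costly update only for on-screen models; returns how many were updated. */
    static int update(List<VisibleOnlyAnimModel> models, float dt) {
        int updated = 0;
        for (VisibleOnlyAnimModel m : models) {
            m.advanceClock(dt);          // cheap, done for every model
            if (m.onScreen) {
                m.updateSkeleton();      // expensive, visible models only
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        List<VisibleOnlyAnimModel> models = new ArrayList<>();
        models.add(new VisibleOnlyAnimModel(true));
        models.add(new VisibleOnlyAnimModel(false));
        System.out.println("skeleton updates this frame: " + update(models, 0.02f));
    }
}
```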

And for standard animations that don't need blending (idle, walk or attack) we can use keyframe caching (I don't know the memory overhead of such a solution).
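A rough sketch of what keyframe caching could look like, in Java. This is only one possible reading of the idea: each sampled frame of a non-blended clip is computed once and then reused by every model playing that clip. The class, the `poseSource` callback and the sample rate are hypothetical, not part of the Cal3d loader. The `estimateBytes` helper gives a back-of-the-envelope memory figure: assuming the cache stores only skinned vertex positions for an 817-vertex model, a one-second clip sampled at 30 frames costs 30 × 817 × 3 floats × 4 bytes = 294,120 bytes.

```java
import java.util.function.IntFunction;

/** Hypothetical lazy cache of sampled animation frames for one clip. */
class KeyframeCacheSketch {
    private final float[][] frames;                 // frames[i] stays null until first use
    private final float sampleRate;                 // samples per second
    private final IntFunction<float[]> poseSource;  // assumed slow path that computes frame i
    int computeCount;                               // how many frames were actually computed

    KeyframeCacheSketch(int frameCount, float sampleRate, IntFunction<float[]> poseSource) {
        this.frames = new float[frameCount][];
        this.sampleRate = sampleRate;
        this.poseSource = poseSource;
    }

    /** Returns the cached frame for a non-negative time, looping the clip. */
    float[] frameAt(float timeSeconds) {
        int i = (int) (timeSeconds * sampleRate) % frames.length;
        if (frames[i] == null) {
            frames[i] = poseSource.apply(i);        // computed once, shared afterwards
            computeCount++;
        }
        return frames[i];
    }

    /** Memory needed if every frame is cached, in bytes. */
    static long estimateBytes(int frameCount, int floatsPerFrame) {
        return (long) frameCount * floatsPerFrame * Float.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(estimateBytes(30, 817 * 3) + " bytes for 30 cached frames");
    }
}
```

The trade-off is memory for CPU: the cost grows with clip length and vertex count, but it is paid once per clip rather than once per model.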

Of course I cannot calculate at this time the overhead of shadows, the game level, etc., but I believe that on modern hardware a frame rate of 80 fps can be achieved.

Any thoughts or comments are appreciated.

Perhaps this will be the second contest!

Were these models from a shared node or all separate? :?

kman,



Let me borrow your machine and I'll see if I can't get it to work faster…don't tell me you're afraid I'll keep it. :-p



darkfrog

kidneybean said:

Were these models from a shared node or all separate? :?

No, all separate. The concept is to minimize the overhead of the skeletal controller for 30 or 40 different models.
darkfrog said:

Let me borrow your machine and I'll see if I can't get it to work faster....don't tell me you're afraid I'll keep it. :-p

At the end of June, when I come back from my honeymoon in the Dominican Republic and get my Opteron, why not  }:-@

I hope your honeymoon is going great :slight_smile:



Let's hope md5 achieves the same performance :wink:



Though it's true that you seem to have a really powerful machine!



You know, I'll be running the very opposite test: an old Celeron and TNT1 cards, heh. You can't beat that. And if you can, I can still count on a Pentium 150 MMX from when Gandalf was born. Nah, I don't even pretend our game will run on a TNT1, though it is a safe bet. If it keeps running there with the md5 loaded (only one character at a time in our case), I'll be totally sure it runs on our main target: a GeForce2 MX (approximately a GeForce1 or lower). But I keep wanting it all to run on a TNT1 16 MB, Celeron 400, 256 MB of ye olde PC100 RAM. Until now, what we need to put on screen is running great. Let's see if md5 kills it all, hehehe.

I noticed that with a single-mesh model the fps were around 400 with 40 models, even if their polycount is greater than that of the multi-mesh models.

As far as I remember, when I was initially developing the cal3d loader on a GeForce 440 MX, with one model the fps were around 400.

The main problem with skeletal animation (at least my way of doing skeletal animation) is that it is CPU dependent, so I don't think you are going to manage to run it on old Celeron processors.

And a question: does jME run on TNT2 cards?

kman, what are the stats of the models you are testing?


Number of bones?

Number of verts?

Are these weighted bones?

I'm curious because here at work we have a complete weighted skeletal system with a skinned mesh. Ran a test with a creature that has 1500 verts, 56 bones with weighted vertices. We get around 100 fps with 10 of them on screen at the same time animating (individual objects, nothing shared).

First test case:

model with 1144 triangles and 817 vertices, 50 bones, with 20 meshes (single trimesh) on screen: 276 fps.

Second test case:

model with 1359 triangles and 1009 vertices, 43 bones, with 20 meshes (single trimesh) on screen: 220 fps.



The animation supports weights and blending, and has a physics system for vertices not attached to a bone.

For more information on cal3d check out their site.



When running on an AMD64 3000+ with 1 GB of RAM and a GeForce 6500 the fps are around 180-190 (sorry, it's the only weaker machine I have).

Wow, impressive, I'd love to play around with that sometime and take a look at what you've done.


kman said:

I noticed that with a single-mesh model the fps were around 400 with 40 models, even if their polycount is greater than that of the multi-mesh models.
As far as I remember, when I was initially developing the cal3d loader on a GeForce 440 MX, with one model the fps were around 400.
The main problem with skeletal animation (at least my way of doing skeletal animation) is that it is CPU dependent, so I don't think you are going to manage to run it on old Celeron processors.


grrr.... Couldn't casual gamers just upgrade the freakin' machine...

Hmmm... but let me make sure of that. I have gone from surprise to surprise (good ones) with jME and Tora's coding. I'll tell you once we reach the point where we can test it. Tora's going at full speed. :)

kman said:

And a question: does jME run on TNT2 cards?



Totally. BUT... it has its problems, of course. With what we need to put on screen, that is, physics, and I don't remember the scenery count now, but think of a low-poly room with quite a lot of mesh detail (more than the old Counter-Strike ones)... we needed to stick to small textures, most at 128x128, very few at 256. That is, texturing is what seems to be hurting performance most, not really polycount (though of course, our room tends to be a bit above 3,000 tris). We don't need more: I'm an expert at low poly, hehe, and we need every poly.

BTW, it is *not* a TNT2. It is a TNT1. Quite a bit lower, because it also has just 16 MB of memory.
And yup, it runs with all the meshes and physics we want to put in, but right at the limit: 50 fps. At the limit because a 1,000-plus-something single-mesh md5 may end up killing it all once added, along with some overlaid UI bitmaps and, of course, AI, etc. Dunno how much that ye olde CPU can take. That's what keeps me nervous. md5 involves bones and all, but also weights. Unsure how that was handled on the old TNT1.

Anyway, our main target is a Duron 800, 256 MB of PC133 RAM, a 32 MB GeForce2 MX (quite a bit lower than a GeForce4 MX), an UBER low HD (the worst I've ever seen), and an unstable Win Me eating all the freaking resources. In some cases it eats even more than the TNT1 machine! And that's even with the TNT1 machine's W98SE being in very bad shape, too.

To be clear, the drop to 50 fps was due to a load of physics stuff. But it's enough to build the game on.

You see, I'm the gfx man and the tester. Tora is the coder and... the... magician.
We're getting nicer results than I could ever have hoped for from a Java engine (you know what's said out there ;) )... but I am certainly afraid of the moment in development when we add the loaded md5 to that... :cry:

And thanks once again to kman; this format will totally solve our HD space problem while keeping 100% quality.
If the machine dies, I'll go back to a vertex animation format, surely Hevee's. But then the anim files won't be just a few KB or bytes...

My plans for the md5 are pretty modest, though. I'd like a 1,500-tri character; I know I'll surely have to stay at 1,000 or so.  :'(
No fingers (maybe just the thumb), two hand bones (for at least flexing), 2 or 3 spine bones, arms, legs, only one toe, a head bone. That's all.
Anyway, at my job I'm used to models of only 400-500 tris (on a rare occasion I did one of ~800) that are still visually quite detailed... fooling the eye, of course! And heavy work with small textures (one of 128x128, one of 64x64 for the weapon).

Oh, and properly weighted vertices; I want to avoid the 1/0 weighting of MS3D or HL1 models. I even prefer vertex animation to that (as it can still do organic bending).


Of course I'd love to go for 65,000 tris, my magic number ;) and some displacement map for several million tris of detail ;) But you know, the TNT1 Celeron I have here for testing would have something to say about it...




Mojo: do you use hardware skinning? If not, does anybody know of any good resources for hardware skinning?

No, we are not currently doing hardware skinning; however, we designed with that in mind. In fact, after speaking with one of the engineers for NCsoft's C++ graphics library we redesigned the system to allow for that. However, at a first-glance comparison (and admittedly it's very hard to compare by just discussing it and not actually experimenting), you are getting roughly twice the speed we are getting. So, I'm very curious as to how you pulled off the feat. Of course, it'd be nice to do a more direct comparison with the same assets.


Well, getting 100 fps with 50 animated characters of that count (low poly these days, but still quite a bit above 1k tris) with software skinning... sounds very good... yet I dunno how powerful an Acer Aspire 5652 with an Intel Core Duo 1.66 GHz, a 256 MB Nvidia GeForce 7600 and 1 GB of RAM is. (Does Core Duo mean two processors? I'm a bit lost with English... 1.6 sounds like a very low speed...)

I'll mention that sometimes unwelded vertices in a weighted mesh (especially if that then means weighting two vertices instead of one), or an export format that carries way more info than a more direct, only-what's-needed format (e.g. vertex colors), can add extra load.

Certainly the assets have a lot to say. But so do those other things. Or whatever else the engine is doing at that moment... Or even... how the models are created by the artists ;)





Just to mention that on an average machine... probably with a bit less RAM... a 5,000-tri model loaded and animated, alone on screen... in Blitz3D (software-weighted vertices, but no spline interpolation, just the ugly linear kind) starts to kill the machine. That's using the b3d format. Anyway, it looks like jME in general has quite a bit more performance than the usual inexpensive (probably the expensive ones too) commercial game-making environments.

A pity that Blender's cal3d plugin looks to be more problematic than the md5 one (zero problems), and that md5 files are much lighter in size.

mojo: I've purchased a low-poly model for testing, so PM me if you want a Max or MS3D file of the model for your tests.

How often do you update in your code? I've included a modifier so that the update methods can be called at specific time intervals.

If every model is updated every frame then the fps are about 70. With the update logic running at 50 ups the frame rate rises to 200 and the animation remains smooth.
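For anyone who wants to try the same trick, here is a minimal Java sketch of a fixed-rate update throttle. It is not the modifier from the cal3d loader; the class and method names are made up for illustration. The frame time is accumulated every frame, and the animation is only stepped once the interval (1/50 s for 50 ups) has elapsed, using the full accumulated time so playback speed does not change.

```java
/** Hypothetical throttle that lets animation run at a lower rate than rendering. */
class FixedRateAnimationTicker {
    private final float interval;   // seconds between animation updates
    private float accumulated;      // frame time gathered since the last update

    FixedRateAnimationTicker(float updatesPerSecond) {
        this.interval = 1f / updatesPerSecond;
    }

    /**
     * Call once per rendered frame. Returns the elapsed time to feed into the
     * animation update, or 0 when no update is due yet.
     */
    float tick(float frameDt) {
        accumulated += frameDt;
        if (accumulated < interval) {
            return 0f;                  // skip the skeleton update this frame
        }
        float elapsed = accumulated;    // pass on everything that built up
        accumulated = 0f;
        return elapsed;
    }

    public static void main(String[] args) {
        FixedRateAnimationTicker ticker = new FixedRateAnimationTicker(50f);
        int updates = 0;
        for (int frame = 0; frame < 200; frame++) {     // roughly one second at 200 fps
            if (ticker.tick(0.005f) > 0f) {
                updates++;                              // the model update would run here
            }
        }
        System.out.println("animation updates in 200 frames: " + updates);
    }
}
```

A fixed frame counter (update every N calls) is simpler, but a time-based interval keeps the animation rate steady when the frame rate varies.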

snaga: as far as cal3d size is concerned, I can change it to an md5-like text format so that it becomes much lighter in size if you zip it.

Updates only occur when a bone moves, which with the animation sets we currently have occurs about every 0.0333 seconds.


Interesting... but... I would have to solve some problems I personally have with cal3d and my packages... I'll do some more testing, then (I guess cal3d is ready to use now?) we will probably test both formats. We're really tight on downloadable file size. Yes, we compress it with a very strong algorithm or something... (you know, the artist never gets the whole picture ;) )

I'll tell you more about possible issues, etc., on the art side... at least what I can find...

One measurement that might help the comparison would be bone-mesh calculations per second.  So in a simple (very simple) case with two bones and 1 vert, with each bone affecting the vert 1 time, every time the mesh is updated you have 2 bone-mesh calculations.  We have a BoneInfluence class that represents the combination of a weight, a vert and a Bone reference, so we could basically count those and multiply by mesh updates per second.  You'd want to test without lighting and without textures as well, to throw those out.  Keep in mind that a comparison would require that both methods also do normals recalculation on each mesh update.
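The metric can be written down as a small Java sketch. The `BoneInfluenceSketch` class below is only a guess at the shape of such a record (bone, vertex, weight) and is not the actual BoneInfluence class mentioned above; the update rate of 50 per second is just an example value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical record of one bone affecting one vertex with a given weight. */
class BoneInfluenceSketch {
    final int boneIndex;
    final int vertexIndex;
    final float weight;

    BoneInfluenceSketch(int boneIndex, int vertexIndex, float weight) {
        this.boneIndex = boneIndex;
        this.vertexIndex = vertexIndex;
        this.weight = weight;
    }
}

class BoneMeshMetric {
    /** Bone-mesh calculations per second = influence count x mesh updates per second. */
    static long calculationsPerSecond(List<BoneInfluenceSketch> influences,
                                      double meshUpdatesPerSecond) {
        return Math.round(influences.size() * meshUpdatesPerSecond);
    }

    public static void main(String[] args) {
        // The very simple case: two bones, one vertex, each bone affecting it once.
        List<BoneInfluenceSketch> influences = new ArrayList<>();
        influences.add(new BoneInfluenceSketch(0, 0, 0.5f));
        influences.add(new BoneInfluenceSketch(1, 0, 0.5f));
        System.out.println(calculationsPerSecond(influences, 50.0) + " bone-mesh calculations/s");
    }
}
```

Because the count is independent of rendering, it gives a way to compare two skinning systems even when the test assets and frame rates differ.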



That Max model (assuming it is already set up with animations) is something we can convert to COLLADA, and it might be very helpful in doing an apples-to-apples comparison.

Here is a test model: http://www.3drt.com/3dm/free/3drt.com-free-monster-animated-character.zip

Pretty simple animations: 884 triangles, 780 vertices, 40 bones.

By updating every 5 calls to SimpleGame’s update call I get 250 fps with lighting and textures enabled and with normal calculations.

By updating every 10 calls to SimpleGame’s update call I get 350 fps.

By updating every call I only get 58, since my loader calculates the interpolated bone transformation every update cycle.

My code needs a lot of refactoring, and perhaps I’ll also move towards supporting a hardware skinning method.