Hey monkeys, this is intended more as a blog post than a forum post, but… we don’t have a blog… so this might be a longer read than regular forum posts.
A while ago I initiated a project called Monkanim that was supposed to replace the whole JME animation system (see the previous post, “Monkanim: new animation system in the works”).
TL;DR: it was based on the current bone and spatial animation system, and provided a state machine to set up your animation flow.
It more or less works, but lacks some features.
Recently I worked on the gltf loader and I had to dive deeply into the bone animation system. I realized there were some major issues (some transforms were not combined properly and it worked just with a hack) and the system itself was very convoluted, because YES… skeletal animation is actually quite simple.
I truly believe that this hack and the convoluted design played a big role over the years in the difficulties faced by people trying to make importers, most of the time resulting in an “everything works except bone animation TM”.
I have come to that point where I have enough grasp of our system, and enough experience to redo it entirely from scratch… so I did it.
I made a monkanim branch in the engine repo with the new system. It’s still a work in progress but the base of the system is here and it’s working. This is intended as the big feature for 3.3.
It is not the monkanim that was explained in the previous post: I switched the design to something else, for the better, I think.
So what changed?
Today’s Skeletal Animation System.
Skeleton: A Skeleton is basically a hierarchical Bone structure. You can have several root bones to the skeleton, and the skeleton also maintains a flat array of the bones so that they can be referenced by an index for skinning.
Bone: Here resides the wonky part IMO. A Bone has a local Transform, that is its transform relative to its parent bone, and a bind transform, that is the transform that aligns it with the undeformed mesh. The bind transform is expressed in local space, which doesn’t make a lot of sense because the bind pose is actually only useful in model space (the same space as the vertex coordinates in the mesh). If you set a bone in its local bind pose, it will only be aligned to the mesh if its parents, all the way up to the root bone, are also in bind pose.
On each frame the model transforms of each bone (transforms in model space) are computed in much the same way as world transforms are computed in the scene graph. From there the inverseBindModelMatrix of the bone is computed, which is actually what we need for skinning.
What’s this thing?… Think of it as a matrix that can transform the mesh’s vertices into the bone’s space, making it easy to apply the bone’s model transforms to the mesh’s vertices.
The wonkiest part is how the local transforms of a bone are resolved, either when setting them by hand (setUserTransforms) or when an animation sets them. You are never setting the local transforms directly; instead you are providing a transform that will be “combined” with the bind transform of the bone, and this combination will result in the actual local transforms…
Worst of all, the combination is just wrong, because it just adds the translations, multiplies the rotations and multiplies the scales. This is not how you combine transforms → the right way is the one in Transform.java in the engine repo (jmonkeyengine/Transform.java at master · jMonkeyEngine/jmonkeyengine · GitHub).
I think this way works OK as long as scale is (1, 1, 1), and I guess that’s the vast majority of cases… but still.
I traced this back to the Ogre mesh importer. It’s actually the structure you have in a skeleton.xml. And since the system was made based on this… well… it’s done like this.
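To make the difference between the two ways of combining transforms concrete, here is a minimal sketch. It is not engine code: Trs2D is a hypothetical 2D stand-in for a translation/rotation/scale transform, and combineWithParent only mirrors what I understand the linked Transform.java to do (scale the translation, rotate it, then offset it). The parent/child pair plays the same role as the bind/animation pair described above.

```java
/** Simplified 2D translation/rotation/scale transform (hypothetical, not the jME class). */
final class Trs2D {
    final double tx, ty;   // translation
    final double angle;    // rotation around Z, in radians
    final double sx, sy;   // per-axis scale

    Trs2D(double tx, double ty, double angle, double sx, double sy) {
        this.tx = tx; this.ty = ty; this.angle = angle; this.sx = sx; this.sy = sy;
    }

    /** Component-wise combination: translations added, rotations and scales multiplied. */
    static Trs2D combineNaive(Trs2D parent, Trs2D child) {
        return new Trs2D(parent.tx + child.tx, parent.ty + child.ty,
                parent.angle + child.angle,      // composing 2D rotations = adding angles
                parent.sx * child.sx, parent.sy * child.sy);
    }

    /** Parent-aware combination: the child's translation is scaled, rotated, then offset by the parent. */
    static Trs2D combineWithParent(Trs2D parent, Trs2D child) {
        double x = child.tx * parent.sx;
        double y = child.ty * parent.sy;
        double cos = Math.cos(parent.angle);
        double sin = Math.sin(parent.angle);
        return new Trs2D(parent.tx + cos * x - sin * y,
                parent.ty + sin * x + cos * y,
                parent.angle + child.angle,
                parent.sx * child.sx, parent.sy * child.sy);
    }
}

final class CombineDemo {
    public static void main(String[] args) {
        Trs2D parent = new Trs2D(10, 0, 0, 2, 2);  // parent scaled by 2
        Trs2D child = new Trs2D(1, 0, 0, 1, 1);    // child one unit along X
        Trs2D naive = Trs2D.combineNaive(parent, child);
        Trs2D proper = Trs2D.combineWithParent(parent, child);
        System.out.println("naive tx = " + naive.tx + ", parent-aware tx = " + proper.tx);
    }
}
```

With the scaled parent, the naive version ignores the parent's scale when placing the child, so the two results differ. Set the parent scale back to 1 and they agree again.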
So… basically, whenever you want to set transforms yourself… it’s kind of confusing. Also, for an importer, the animation data (which usually gives the localTransforms of each bone for each frame) has to be modified with the inverse operation: “demultiplying the bind transform”. This, IMO, blocked a lot of attempts to have bone animations properly imported into JME (of course… on top of the complexity of the imported format itself).
Also, the skeleton is resetting the bone transform to its bind transform on each frame. That’s why you have to call setUserControl(true) on a bone to “lock” this mechanism, so that your transform won’t be erased on the next frame.
The new System
Armature: I’m not gonna lie…An armature is pretty much the same as a skeleton. I had to change the name so that there won’t be any confusion, as the old system is staying in the code base for backward compatibility but deprecated. So… Armature = Skeleton.
Joint: This one is replacing Bone, mainly because even the current Bones are actually Joints. Bones usually have a length; they don’t in JME, so they are more like joints.
A Joint has a localTransform and an inverseModelBindMatrix. Period.
If you want to change the transforms just set its local transforms as you do with Spatials.
If you are making your own armature, align each bone with the mesh by changing its local transforms, then call setBindTransform on the armature, and the current local transforms will be used to compute the inverseModelBindMatrix of each bone. You will never have to manipulate this matrix; it’s used internally for skinning.
Model transforms are computed on each frame (needed for skinning). The way they are accumulated can be changed so that non uniform scales are properly supported.
No transform resetting on each frame.
And it works just fine.
This change alone allowed me to drop around 200 lines of code in the gltf loader to fixup animation data.
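The Joint idea described above (a local transform, an inverse model-space bind matrix, nothing else) can be sketched in a few lines. This is only an illustration under loud assumptions: SketchJoint, saveBind and skin are hypothetical names, not the engine's API, and the sketch works in 2D using the JDK's java.awt.geom.AffineTransform instead of real 3D matrices.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.NoninvertibleTransformException;
import java.awt.geom.Point2D;

/** Hypothetical 2D joint: a local transform plus an inverse model-space bind matrix. */
final class SketchJoint {
    final SketchJoint parent;                        // null for a root joint
    AffineTransform local = new AffineTransform();   // transform relative to the parent
    AffineTransform inverseModelBind = new AffineTransform();

    SketchJoint(SketchJoint parent) { this.parent = parent; }

    /** Model transform = parent's model transform * local, like world transforms in a scene graph. */
    AffineTransform modelTransform() {
        AffineTransform model = (parent == null) ? new AffineTransform() : parent.modelTransform();
        model.concatenate(local);
        return model;
    }

    /** Uses the current pose as the bind pose: stores the inverse of the model transform. */
    void saveBind() {
        try {
            inverseModelBind = modelTransform().createInverse();
        } catch (NoninvertibleTransformException e) {
            throw new IllegalStateException("joint transform is not invertible", e);
        }
    }

    /** Skinning: bring the vertex into joint space (inverse bind), then apply the current model transform. */
    Point2D skin(Point2D vertex) {
        AffineTransform skinning = modelTransform();
        skinning.concatenate(inverseModelBind);
        return skinning.transform(vertex, null);
    }
}

final class JointDemo {
    public static void main(String[] args) {
        SketchJoint root = new SketchJoint(null);
        SketchJoint child = new SketchJoint(root);
        child.local.translate(1, 0);       // align the child joint with the (imaginary) mesh
        root.saveBind();
        child.saveBind();
        root.local.rotate(Math.PI / 2);    // pose the armature by setting a local transform
        System.out.println(child.skin(new Point2D.Double(1.5, 0)));
    }
}
```

In bind pose the skinning matrix is the identity, so vertices stay where they are. Once a local transform changes, nothing resets it on the next frame and no combination with a bind transform is involved.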
That covers the low layers of the system; let’s have a look at the higher levels.
Today’s Animation System.
Animation: Holds keyframe animations. Keyframe animations are split into tracks
BoneTrack: The keyframe information for a specific bone for a specific animation
SpatialTrack: The keyframe information for a specific spatial for a specific animation
AudioTrack: Allows you to add a sound into your animations. It’s more of a hack into the system. I know because I did it.
EffectTrack: Allows you to add a particle emitter effect to your animations. Here again… Hack.
AnimControl: It holds all the animations and allows you to play them through a Channel that you’ll have created beforehand. That’s what manages the animations.
AnimChannel: A channel is some kind of animation lane where you are going to set an animation. The anim channel allows you to switch from one animation to another with a smooth transition. The main idea behind the channels is that they are defined for a set of bones in the skeleton and will only affect those bones. The problem is that the need to use a subset of bones is pretty rare, or at least not the usual case, and the channels are not very intuitive to use.
SkeletonControl: Handles the skinning of the model through the skeleton. Supports software and hardware skinning. Nothing much to say.
AnimEventListener: A listener that allows you to hook your own code at the start or at the end of an animation. I’m not fond of this system as it’s not really flexible, and cumbersome to set up.
This system works fine, but lacks a lot of features like animation blending, and only allows you to manage transforms for a skeleton or a Spatial. The audio and effect parts are just plugged into it in a not very convenient way, and I’m not sure a lot of people are using them.
Monkanim was first designed as a state machine, very similar to Mecanim, the Unity animation system. You define states and transitions between states. However, following a hint from my friend Paul Speed, I dug into Lemur’s animation system, which is based on Tweens. A Tween is “something that happens in between”. It goes from a starting state to an end state through an interpolation process. A very simple concept, yet powerful.
Also the most interesting part about them is the way you can compose them. Lemur allows you to make a sequence tween, that is a chain of sub tweens, or a parallel tween that will interpolate sub tweens in parallel.
This offers huge potential for animation composition, in a very intuitive “lego” way.
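Here is a minimal sketch of what sequence and parallel composition could look like. The SketchTween interface and the three classes are hypothetical stand-ins written for this post, loosely modeled on the idea of a tween with a length and an interpolate step; they are not Lemur's or the engine's actual classes.

```java
/** Hypothetical tween contract: a duration and a way to apply the state at time t. */
interface SketchTween {
    double getLength();              // duration in seconds
    boolean interpolate(double t);   // applies the state at time t, returns true while still running
}

/** Leaf tween: interpolates a number from one value to another. */
final class ValueTween implements SketchTween {
    private final double from, to, length;
    double value;

    ValueTween(double from, double to, double length) {
        this.from = from; this.to = to; this.length = length; this.value = from;
    }
    public double getLength() { return length; }
    public boolean interpolate(double t) {
        double k = length <= 0 ? 1 : Math.max(0, Math.min(1, t / length));
        value = from + (to - from) * k;
        return t < length;
    }
}

/** Plays its children one after the other. */
final class SequenceTween implements SketchTween {
    private final SketchTween[] steps;
    private final double length;

    SequenceTween(SketchTween... steps) {
        this.steps = steps;
        double total = 0;
        for (SketchTween s : steps) total += s.getLength();
        this.length = total;
    }
    public double getLength() { return length; }
    public boolean interpolate(double t) {
        double start = 0;
        for (SketchTween s : steps) {
            double end = start + s.getLength();
            if (t < end) {
                s.interpolate(Math.max(0, t - start));  // the step currently playing
                return true;
            }
            s.interpolate(s.getLength());               // finished steps are held at their end state
            start = end;
        }
        return false;
    }
}

/** Plays all its children at the same time; lasts as long as the longest child. */
final class ParallelTween implements SketchTween {
    private final SketchTween[] tweens;
    private final double length;

    ParallelTween(SketchTween... tweens) {
        this.tweens = tweens;
        double max = 0;
        for (SketchTween s : tweens) max = Math.max(max, s.getLength());
        this.length = max;
    }
    public double getLength() { return length; }
    public boolean interpolate(double t) {
        boolean running = false;
        for (SketchTween s : tweens) running |= s.interpolate(t);  // every child is always updated
        return running;
    }
}
```

Because SequenceTween and ParallelTween are themselves SketchTweens, they can be nested freely, for example new SequenceTween(a, new ParallelTween(b, c)).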
So I went and rewrote monkanim based on a tween system. The very same tweens you have in Lemur.
Monkanim Action System
It’s still a WIP, but the main principle is already in place.
AnimClip: The replacement for Animation. It’s mostly the same: a stateless class holding raw animation data. It can be a Joint animation, a spatial animation, or both. It’s holding TransformTracks: basically animation data for something that has a transform.
AnimComposer: Basically what replaces the AnimControl. As its predecessor it allows you to change the playing animation, but in a different way. It allows you to create actions from the AnimClip, and play these actions on demand.
Action: An action is a composition of Tweens. It can be a single tween, a sequence of tweens, tweens run in parallel, or a mix of all of these. An Action is also a Tween, so it can be composed as part of another action, and so on. An action can be provided with a Mask that will define a subset of targets on which it applies (a subset of Joints in an Armature or a subset of spatials for spatial animations).
Notable built-in actions (this may change a bit):
BaseAction: an Action that wraps a Tween
ClipAction: an Action that plays an AnimClip
BlendAction: an Action that allows you to blend AnimClips together (think of a half walk / half run animation that varies with the forward speed)
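To give an idea of the walk/run case, here is a rough sketch of the blending math only. It says nothing about how BlendAction is actually implemented: BlendSketch and its methods are hypothetical, each "clip" is reduced to a function returning a single number for a given time, and both clips are assumed to have the same length. Real joint rotations would need a quaternion interpolation rather than a plain lerp.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper showing how two clips could be mixed according to the forward speed. */
final class BlendSketch {

    /** Maps a speed to a blend weight: 0 at walkSpeed (pure walk), 1 at runSpeed (pure run). */
    static double weightForSpeed(double speed, double walkSpeed, double runSpeed) {
        double w = (speed - walkSpeed) / (runSpeed - walkSpeed);
        return Math.max(0, Math.min(1, w));
    }

    /** Samples both clips at time t and linearly interpolates between the two results. */
    static double sample(DoubleUnaryOperator walk, DoubleUnaryOperator run, double weight, double t) {
        double a = walk.applyAsDouble(t);
        double b = run.applyAsDouble(t);
        return a + (b - a) * weight;
    }
}

final class BlendDemo {
    public static void main(String[] args) {
        DoubleUnaryOperator walk = t -> 10.0 * Math.sin(t);  // made-up knee angle while walking
        DoubleUnaryOperator run = t -> 30.0 * Math.sin(t);   // made-up knee angle while running
        double weight = BlendSketch.weightForSpeed(3.0, 2.0, 4.0);  // halfway between walk and run
        System.out.println(BlendSketch.sample(walk, run, weight, 1.0));
    }
}
```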
That’s basically it. Note that Tweens can be pretty much anything, so the base action already provides virtually endless possibilities.
Stock Tweens that I plan to provide:
- SoundTween: play a sound
- EffectTween: fire a particle emitter
- MaterialTween: animate material’s properties
- ChangeActionTween: tells the AnimComposer to change the current Action
Stock Tweens available in Lemur (will probably be ported to jme’s core):
- Sequence tween: a tween that plays tweens in sequence.
- Parallel tween: a tween that plays tweens in parallel.
- Delay tween: a tween that just waits…
- Stretch tween: a tween that wraps another tween and changes its duration.
- Camera tween: moves the camera…
Then… your tweens. Tweens are very easy to implement and you can definitely make your own and compose them in actions.
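As a rough idea of how little code a tween needs, here is a sketch of a few of them. Everything in it is an assumption made for illustration: MiniTween is a hypothetical contract (a length plus an interpolate step), and CallbackTween, FadeTween, DelayTween and StretchTween are made-up names, not the stock tweens from Lemur or the engine.

```java
/** Hypothetical tween contract (an assumption for this sketch). */
interface MiniTween {
    double getLength();              // duration in seconds
    boolean interpolate(double t);   // applies the state at time t, returns true while still running
}

/** Instant tween that runs a callback once, e.g. to trigger a sound or an effect. */
final class CallbackTween implements MiniTween {
    private final Runnable callback;
    private boolean fired;

    CallbackTween(Runnable callback) { this.callback = callback; }
    public double getLength() { return 0; }
    public boolean interpolate(double t) {
        if (!fired) { fired = true; callback.run(); }
        return false;  // instantaneous, so never "still running"
    }
}

/** Custom tween: fades a value from 1 to 0 over its length. */
final class FadeTween implements MiniTween {
    private final double length;
    double alpha = 1;

    FadeTween(double length) { this.length = length; }
    public double getLength() { return length; }
    public boolean interpolate(double t) {
        double k = length <= 0 ? 1 : Math.max(0, Math.min(1, t / length));
        alpha = 1 - k;
        return t < length;
    }
}

/** Does nothing for a given duration. */
final class DelayTween implements MiniTween {
    private final double length;

    DelayTween(double length) { this.length = length; }
    public double getLength() { return length; }
    public boolean interpolate(double t) { return t < length; }
}

/** Wraps another tween and rescales time so that it lasts newLength seconds. */
final class StretchTween implements MiniTween {
    private final MiniTween delegate;
    private final double length;

    StretchTween(MiniTween delegate, double newLength) {
        this.delegate = delegate; this.length = newLength;
    }
    public double getLength() { return length; }
    public boolean interpolate(double t) {
        // Map the stretched time back to the delegate's own timeline.
        double inner = length > 0 ? t * delegate.getLength() / length : delegate.getLength();
        delegate.interpolate(inner);
        return t < length;
    }
}
```

For example, new StretchTween(new FadeTween(1.0), 4.0) plays a one-second fade over four seconds without the fade itself knowing anything about it.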
Anyway, you got it, with this composition system you can do pretty much anything.
All of this will come with factory-like methods to easily construct them.
SkinningControl: The replacement for SkeletonControl. It does pretty much the same.
I feel this system is very simple, yet it offers a shitload more possibilities than the old one and is a lot more flexible.
Note that the system could even replace the cinematic system with very few additions (basically an AppState that would coordinate different AnimComposers…).
As I said, this is still a WIP and some parts have yet to be implemented.