Quaternion-to/from-Rotation Matrix brokenness

Issue 229 in the bug database

  https://jme.dev.java.net/issues/show_bug.cgi?id=229

might need some additional review by more eyes.



The quaternion conversions to and from a rotation matrix were very broken – controlling a camera using quaternion interpolation was just impossible without huge jerky motions due to numerical errors.  I've tracked the problems down to errors in the "Matrix and Quaternion FAQ", from which jME's Quaternion implementation seems to derive, and replaced the code in question with the recommended algorithms from Graphics Gems ( ftp://ftp.cis.upenn.edu/pub/graphics/shoemake/quatut.ps.Z ) which fixed the problems. (My camera no longer jerks upward!)

Hello?  (lo?) Is there an echo in here? (echo in here?)

Interesting… I am about to start smoothing out my camera movement. Mind posting your code that fixes the bug?

according to cvs renanse checked in some fix referring to "cananian", but it doesnt seems to be a complete rewrite, just a change from

"if (t > 0)" to "if (t > 1e-5)" in fromRotationMatrix…

I don't really agree with some of cananian's patch, but I do have a lot more to check in for Quat and both Matrix classes…  unfortunately, these changes fix a bug that Bone is counting on and perhaps other classes count on.  I could also get my changes to Bone checked in, but we're not ready to check in everything and I'm afraid I might be forgetting what those "other classes" are.  If that's not too big of a deal, say so and I'll just plow ahead.

Could you describe what you find objectionable about the patch?



The changes involved are extremely trivial (basically, a change from 1e-5 to 1 and a call to Quaternion. norm()), it comes with a test suite vividly demonstrating the problems, and the code now closely follows a widely respected reference in the field (instead of following some FAQ found on the internet).



I'm finding it very hard to contribute to this project.

cananian, hehe, I'm sorry buddy, I really do sympathize with you…you're having sort of rough luck getting going.

agree, lets just check it out and test it, and if it works out, it's in…



hope you don't get discouraged to contribute, since you are allready valued…

The patch is available from the URL above (https://jme.dev.java.net/issues/show_bug.cgi?id=229) as well as a test suite.  Feel free to try it out!

Ah, sorry… didn't even see that link  :P. This is perfect, as I am just about to start using .fromAxes in my current project.

One of the things I don't like about your patch is the calculation of the trace without the +1.  Every textbook description I can find on the subject uses m00+m11+m22+1.  (On a side note, out of all the books I covered recently, Van Verth's book "Essential Mathematics For Games & Interactive Applications" is a great resource for game math!)



Really, the major problem was that there was a ton of code waiting to be checked in, including existing fixes to the math classes.  Those are now in cvs as of an hour or two ago, so what needs doing is a retest for completeness.

FYI, a shortened version has been checked in that does not require branching and is a lot cleaner to read.  See source here: Maths - Conversion Matrix to Quaternion - Martin Baker



Also, just to back up the way this is being done, see Intel's paper (one of many such papers I've looked at on this): http://cache-www.intel.com/cd/00/00/29/37/293748_293748.pdf



I hope you interpret this as an effort to understand and research the issue.  Being an integral core method, I feel proper due dilligence is needed here on my part before just accepting something.  The code in cvs appears to work fine and smooth for bone calculations, node handling, and other uses that I have tested.



If you still want to argue for your method, please explain A, why your 2nd test program is a valid test and B, what you feel you method is doing more efficient than what is in cvs.  After all this research, I'm definitely curious on both points.

Oddly enough, the paper you cited from intel cited the method I used as "the commonly used implementation by Ken Shoemake".  And the first reference you cited explicitly refers to the "trace plus 1", which is the fix which my patch proposed.  Sigh.



Anyway, I'll look at what you've checked in and how it compares.  I'd feel a lot more positive about this if you'd also checked in a test suite for your method, instead of just "appears to work" – the old quaternion code "appeared to work" as well, until I actually tried to use it to control a camera which was slewing to a particular position which fell into the (broken) first case of the code.  Does your new code pass the test suite which I submitted in my patch?

Basically you are ignoring what I asked for in my last post…  :slight_smile:

Well, it takes time to review your new code. [EDITED: snide remark deleted.]



Your new version fails the test suite I attached to Issue #229.  Your implementation of toRotationMatrix seems to be wrong when norm != 1; but I'm fairly certain that's not the only problem.



I've never claimed my method is more efficient: just that it is correct.  The "efficiency" of your new implementation seems to depend critically on the implementation of Math.max – do you have any actual benchmarks proving that you've improved performance?



As for the test program, it is very straightforward:  it converts a matrix to a quaternion and back, and checks that the results are reasonably similar.  When conversions to/from quaternions are broken or numerical unstable, this is not the case.  The program prints out the "bad" matrices, so that you can compare for yourself that accuracy isn't up to snuff.  For example, original matrix:


com.jme.math.Matrix3f
[
 -0.9747772  0.19721694  -0.03884777
 -0.008699911  0.23692809  0.99724585
 0.2230103  0.9512994  -0.0631785
]


Result after converting to quaternions and back:


com.jme.math.Matrix3f
[
 -0.9040756  0.27828974  -0.32434866
 -0.27828974  0.19264323  0.9409801
 0.32434866  0.9409801  -0.09671891
]



This test is only valid if the original matrix is orthonormal, and you can verify that the original matrices are.  The result of the conversion from a matrix should be a normalized quaternion, and the test suite even normalizes it to give the conversion back to a matrix as much benefit of the doubt as possible.  You can comment the 'q.normalize()' line out to show that it doesn't have any effect on the results in this case, which means the bug is not in 'toRotationMatrix()' (which has the normalization bug described above), but in 'fromRotationMatrix()'.  I haven't figured out exactly where the bug there is, but the test suite clearly indicates that there is one.

Another useful test suite might be to generate random rotation matrices and convert them to/from quaternions.  This might give you more confidence in the correctness of your code.

OK, I've found the bug in your fromRotationMatrix code now, too.  [EDITED: needless flamage elided.]  The poster in "euclidianspace" started with the same (broken) implementation in the Matrix/Quaternions FAQ floating around the interwebs, and so the conclusions he draws are in error.  [ED: this isn't actually correct; see later reply.]

Also, your toRotationMatrix code, in addition to incorrectly handing the "norm()!=1" case, involves significantly more multiplications than mine (well, Ken Shoemake's).  So, although I didn't mention efficiency as a concern originally, I will now confidently state that you should use my patched version because it is both faster and more correct.

Thank you, I appreciate your williness to do some foot work here. (Your slightly aloof and snide remarks aside.)  If it is possible to generate a patch against the current cvs to ensure completeness, that would be great.

I've attached an updated patch to the bug: https://jme.dev.java.net/issues/show_bug.cgi?id=229



The anonymous poster on the "euclideanspace" forum gives a correct formulation, if rounding errors did not exist.  But all floating point math is approximate, and his "Math.max(0, …)" only covers up the problem.  Some of the "trace-like" quantities (1 +/- m00 +/- m11 +/- m12) may actually be negative, due to rounding errors; using floats instead of doubles magnifies these problems.  Coercing these negative values to 0 yields pretty bad results.



Ken Shoemake's algorithm "makes the best of a bad thing" by computing the largest "trace-like" quantity (hence, the one least affected by rounding errors), and then computing the other elements of the quaternion based on this.  The end result is much less affected by numerical inaccuracies.



The anonymous poster also compared the speed of sqrt favorably to division (which is true: on a hardware level square root is almost identical to division), but he failed to notice that Shoemake's formulation only performs one division, while the alternate algorithm involves four sqrts (and 4 divisions as well, but those can/should be replaced with multiplications by 0.5f).



I'll eat a bit of humble pie here and admit that my first intuition about where the "bug" in fromRotationMatrix was incorrect; the problem was subtler than the simple off-by-one error I thought I had found (although the bug in toRotationMatrix is more obvious).  I apologize for being "aloof and snide"; I was frustrated, but I should have kept my tone more level.

Thanks, I admit I felt frustrated myself because I had done research for the from method, see above referenced books, (although I am embarrassed that I took that research and swapped in the shorter version because it seemed like a simplification of the other approach.)  While good at math in school, I am no mathematician so I like to be given more information before accepting a change to such core classes as math.  Sorry if I was being stubborn, but there it is.



Look for your Quaternion changes to be in this weekend.  All of this has only intensified my wish for more unit testing in the engine.