Quaternion.getRotationColumn Optimization

The current implementation of Quaternion.getRotationColumn precalculates 9 products at 2 multiplications each, for a total of 18 multiplications. Each case of the switch statement uses only 6 of those products, so the 6 multiplications spent on the 3 unused products are wasted on every call. Inlining the multiplications into the switch statement, rather than precalculating them, eliminates that waste: the method performs 12 multiplications instead of 18 (or, counting the 3 multiplications by 2 already in each case block, 15 instead of 21).



public Vector3f getRotationColumn(int i, Vector3f store) {
        if (store == null)
            store = new Vector3f();

        float norm = norm();
        if (norm != 1.0f) {
            norm = FastMath.invSqrt(norm);
        }
       
        switch (i) {
            case 0:
                store.x  = 1 - 2 * ( (y*y*norm) + (z*z*norm) );
                store.y  =     2 * ( (x*y*norm) + (z*w*norm) );
                store.z  =     2 * ( (x*z*norm) - (y*w*norm) );
                break;
            case 1:
                store.x  =     2 * ( (x*y*norm) - (z*w*norm) );
                store.y  = 1 - 2 * ( (x*x*norm) + (z*z*norm) );
                store.z  =     2 * ( (y*z*norm) + (x*w*norm) );
                break;
            case 2:
                store.x  =     2 * ( (x*z*norm) + (y*w*norm) );
                store.y  =     2 * ( (y*z*norm) - (x*w*norm) );
                store.z  = 1 - 2 * ( (x*x*norm) + (y*y*norm) );
                break;
            default:
                LoggingSystem.getLogger().log(Level.WARNING,
                        "Invalid column index.");
                throw new JmeException("Invalid column index. " + i);
        }

        return store;
}
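
For anyone who wants to check the counting, here is a minimal standalone sketch that puts the two variants side by side and compares their results. It is not the jME code: the class name RotationColumnSketch, the method names precalculated and inlined, the parameter s (standing in for the norm factor) and the product names xx, xy, and so on are assumptions made for illustration, and the precalculated variant is reconstructed from the description in this post rather than copied from the jME source.

```java
// Standalone sketch; RotationColumnSketch and its methods are hypothetical, not jME API.
final class RotationColumnSketch {

    // Precalculating variant as described in the post: 9 products, 18 multiplies
    // (not counting the multiplications by 2). Each column reads only 6 of the products.
    static float[] precalculated(float x, float y, float z, float w, float s, int i) {
        float xx = x * x * s, xy = x * y * s, xz = x * z * s;
        float xw = x * w * s, yy = y * y * s, yz = y * z * s;
        float yw = y * w * s, zz = z * z * s, zw = z * w * s;
        switch (i) {
            case 0:
                return new float[] {1 - 2 * (yy + zz), 2 * (xy + zw), 2 * (xz - yw)};
            case 1:
                return new float[] {2 * (xy - zw), 1 - 2 * (xx + zz), 2 * (yz + xw)};
            case 2:
                return new float[] {2 * (xz + yw), 2 * (yz - xw), 1 - 2 * (xx + yy)};
            default:
                throw new IllegalArgumentException("Invalid column index. " + i);
        }
    }

    // Inlined variant: only the 6 products a column needs, 12 multiplies
    // (again not counting the multiplications by 2).
    static float[] inlined(float x, float y, float z, float w, float s, int i) {
        switch (i) {
            case 0:
                return new float[] {
                    1 - 2 * ((y * y * s) + (z * z * s)),
                        2 * ((x * y * s) + (z * w * s)),
                        2 * ((x * z * s) - (y * w * s))};
            case 1:
                return new float[] {
                        2 * ((x * y * s) - (z * w * s)),
                    1 - 2 * ((x * x * s) + (z * z * s)),
                        2 * ((y * z * s) + (x * w * s))};
            case 2:
                return new float[] {
                        2 * ((x * z * s) + (y * w * s)),
                        2 * ((y * z * s) - (x * w * s)),
                    1 - 2 * ((x * x * s) + (y * y * s))};
            default:
                throw new IllegalArgumentException("Invalid column index. " + i);
        }
    }

    public static void main(String[] args) {
        // Arbitrary sample components and scale factor, chosen only for the comparison.
        float x = 0.1f, y = 0.2f, z = 0.3f, w = 0.9f, s = 1.0f;
        for (int i = 0; i < 3; i++) {
            boolean same = java.util.Arrays.equals(
                    precalculated(x, y, z, w, s, i), inlined(x, y, z, w, s, i));
            System.out.println("column " + i + " matches: " + same);
        }
    }
}
```

Both variants evaluate each product in the same order, so the results should agree. The sketch only shows that the two forms are equivalent, it says nothing about whether the JIT already removes the unused products on its own; that would need a proper benchmark.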

An optimization that might well already be applied by the VM…