Android Performance Tip

In the game I have to detach about 8 nodes on average and attach another 8 roughly every 10 seconds. I tried this instead of setting CullHints, and it resulted in 40-60 fps. Thus, I’m sticking with CullHints plus not calling updateLogicalState, which gives 60+ fps.

@Momoko_Fan said: I don't really see any solution here except just having less objects ... Unless anybody has other ideas.
You do know that I have presented a solution, right? By not calling the updateLogicalState on rootNode.

@pspeed Yes, I noticed that for large scenes updateLogicalState and updateGeometricState cause an overhead that makes a huge difference on Android. But by selectively calling updateLogicalState only on Spatials that have a Control, and by using the DynNode below instead of Node, jme3 suddenly seems to work pretty fast for large scenes as well.

When I profiled again, I found that a lot of time is spent in updateGeometricState. I have significantly reduced the overhead by using this class instead of Node:

public class DynNode extends Node {
    public DynNode() {
        super();
    }
    public DynNode(String name) {
        super(name);
    }
    @Override
    public void updateGeometricState() {
        // Skip the whole subtree when no refresh flag is set on this node.
        if(refreshFlags == 0) return;
        super.updateGeometricState();
    }
}

This seems to eliminate many of the updateGeometricState calls. My rootNode and guiNode are now DynNode instances as well.

Here is a screenshot taken on Android, while also playing background music, and using accelerometer. The fps is 63. :slight_smile:

@The Leo said: You do know that I have presented a solution, right? By not calling the updateLogicalState on rootNode.
Well, that's a solution for your game, but it can hardly be a generic solution for the engine. Kirill wants a generic solution. updateLogicalState updates controls. If a control moves an object, it has to be called even when the object is off screen; otherwise the object will move only while it's in the view frustum, which is not acceptable.
@The Leo said:
public class DynNode extends Node {
    public DynNode() {
        super();
    }
    public DynNode(String name) {
        super(name);
    }
    public void updateGeometricState() {
        if(refreshFlags == 0) return;
        super.updateGeometricState();
    }
}
You realize that if a child of this node is transformed, you won't update it? Refresh flags are not updated upward; refreshFlags == 0 does not guarantee that the subgraph didn't change. Doing less obviously gives better performance... but that's doing less...

You may think those are good solutions because they give good numbers, but they may get you into trouble later, for example if you want moving platforms that go in and out of the view…

Your analysis is interesting, because it clearly shows that those methods are the weak point when you have a big scene. But your solutions only work for very particular cases and cannot be generalized.

On a side note, your character looks very nice, I like the cartoonish style.

@Refresh flags are not updated upward, refreshFlags == 0 does not guarantee that the subgraph didn’t change.

I’m not really sure about this. From my observation:

setTransformRefresh() always calls setBoundRefresh(), which updates flags upward to the parent.
Adding a child to a node calls setTransformRefresh(), which calls setBoundRefresh(), which updates flags upward to the parent.
Detaching also calls setTransformRefresh(), and so on.

I do agree with your statement in one respect: setLightListRefresh() does not update upward to the parent. I assumed I could ignore this because I do not use lights.

Thus I assume DynNode can be used for scenes with no lighting, which is common on Android. I say ‘assume’ because so far I have not encountered any problems with it. Maybe it should have been called NoLightNode or something.
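
To make the propagation argument concrete, here is a small stand-alone toy model. It is not jME3 source code: MiniSpatial, its fields, and the flag values are hypothetical stand-ins that only mimic the behaviour described above (transform and bound changes mark the ancestors, light list changes stay local).

```java
// Toy model of refresh-flag propagation. NOT jME3 source code:
// MiniSpatial and the RF_* values are hypothetical stand-ins.
class MiniSpatial {
    static final int RF_TRANSFORM = 0x01;
    static final int RF_BOUND     = 0x02;
    static final int RF_LIGHTLIST = 0x04;

    MiniSpatial parent;
    int refreshFlags = 0;

    void attachTo(MiniSpatial newParent) {
        parent = newParent;
        setTransformRefresh();   // attaching dirties the transform ...
        setLightListRefresh();   // ... and the inherited light list
    }

    void setTransformRefresh() {
        refreshFlags |= RF_TRANSFORM;
        setBoundRefresh();       // a moved object changes its ancestors' bounds
    }

    void setBoundRefresh() {
        refreshFlags |= RF_BOUND;
        // Walk upward, marking ancestors until one is already marked.
        for (MiniSpatial p = parent;
             p != null && (p.refreshFlags & RF_BOUND) == 0;
             p = p.parent) {
            p.refreshFlags |= RF_BOUND;
        }
    }

    void setLightListRefresh() {
        refreshFlags |= RF_LIGHTLIST; // stays local, ancestors are not marked
    }

    void clearFlags() {
        refreshFlags = 0;
    }
}
```

In this model a root whose flags are 0 can still have a child with a pending light list refresh, which is exactly the case a refreshFlags == 0 shortcut would miss.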

Regarding a general case solution for updating Controls. What about one of these:

  1. I guess one could create two rootNodes: one for objects with controls, on which updateLogicalState is called, and another for objects without controls, on which it is not. That seems to be an easy way to do things.
  2. What if there were one list of Controls? When an object is attached to rootNode or one of its sub-nodes, all of the object's controls are added to this list; when it is detached, they are removed. Likewise, when a Control is added to an object that is already attached to the rootNode, that Control is added to the list. This would definitely eliminate the updateLogicalState overhead. The following sample code shows how this could be done:

For convenience I extended the Node and Spatial classes; in the engine, NewNode would become Node and NewSpatial would become Spatial. RootNode would be a special kind of node used as the rootNode.

public abstract class NewSpatial extends Spatial {
    public void addControl(Control control) {
        controls.add(control);
        control.setSpatial(this);
        /* Only needed when this is the first control: if the list already
         had controls, it was registered earlier. */
        if(controls.size() == 1) addControlsToList(controls);
    }
    
    public void removeControl(Class<? extends Control> controlType) {
        // Iterate backwards so that removing an element does not skip the next one.
        for (int i = controls.size() - 1; i >= 0; i--) {
            if (controlType.isAssignableFrom(controls.get(i).getClass())) {
                Control control = controls.remove(i);
                control.setSpatial(null);
            }
        }
        if(controls.isEmpty()) removeControlsFromList(controls);
    }
    public boolean removeControl(Control control) {
        boolean result = controls.remove(control);
        if (result) {
            control.setSpatial(null);
        }
        if(controls.isEmpty()) removeControlsFromList(controls);
        return result;
    }
    
    protected void addControlsToList(SafeArrayList<Control> list) {
        if(parent != null)
            ((NewNode)parent).addControlsToList(list);
    }
    protected void removeControlsFromList(SafeArrayList<Control> list) {
        if(parent != null)
            ((NewNode)parent).removeControlsFromList(list);
    }
}
public class NewNode extends Node {
    private static final Logger logger = Logger.getLogger(NewNode.class.getName());

    public int attachChild(Spatial child) {
        return attachChildAt(child, children.size());
    }
    public int attachChildAt(Spatial child, int index) {
        if (child == null)
            throw new NullPointerException();
        if (child.getParent() != this && child != this) {
            if (child.getParent() != null) {
                child.getParent().detachChild(child);
            }
            child.setParent(this);
            children.add(index, child);
            if(parent != null || this instanceof RootNode) {
                if(child instanceof Node) {
                    Node n = (Node)child;
                    n.depthFirstTraversal(new SceneGraphVisitor() {
                        public void visit(Spatial s) {
                            if(!s.controls.isEmpty())
                                addControlsToList(s.controls);
                        }
                    });
                }
                else {
                    if(!child.controls.isEmpty())
                        addControlsToList(child.controls);
                }
            }
            
            child.setTransformRefresh();
            child.setLightListRefresh();
            if (logger.isLoggable(Level.FINE)) {
                logger.log(Level.FINE,"Child ({0}) attached to this node ({1})",
                        new Object[]{child.getName(), getName()});
            }
        }
        return children.size();
    }
    public Spatial detachChildAt(int index) {
        Spatial child =  children.remove(index);
        if ( child != null ) {
            child.setParent( null );
            logger.log(Level.FINE, "{0}: Child removed.", this.toString());

            if(parent != null || this instanceof RootNode) {
                if(child instanceof Node) {
                    Node n = (Node)child;
                    n.depthFirstTraversal(new SceneGraphVisitor() {
                        public void visit(Spatial s) {
                            if(!s.controls.isEmpty())
                                removeControlsFromList(s.controls);
                        }
                    });
                }
                else {
                    if(!child.controls.isEmpty())
                        removeControlsFromList(child.controls);
                }
            }
            // since a child with a bound was detached;
            // our own bound will probably change.
            setBoundRefresh();

            // our world transform no longer influences the child.
            // XXX: Not neccessary? Since child will have transform updated
            // when attached anyway.
            child.setTransformRefresh();
            // lights are also inherited from parent
            child.setLightListRefresh();
        }
        return child;
    }
    
    protected void addControlsToList(SafeArrayList<Control> list) {
        if(parent != null)
            ((NewNode)parent).addControlsToList(list);
    }
    protected void removeControlsFromList(SafeArrayList<Control> list) {
        if(parent != null)
            ((NewNode)parent).removeControlsFromList(list);
    }
}
public class RootNode extends NewNode {
    // Assumed fix: the original left this field uninitialized (null).
    @SuppressWarnings({"unchecked", "rawtypes"})
    SafeArrayList<SafeArrayList<Control>> controlsList = new SafeArrayList(SafeArrayList.class);
    protected void addControlsToList(SafeArrayList<Control> list) {
        controlsList.add(list);
    }
    protected void removeControlsFromList(SafeArrayList<Control> list) {
        controlsList.remove(list);
    }
    public void updateLogicalState(float tpf) {
        SafeArrayList<Control>[] cList = controlsList.getArray();
        for(int i = 0; i < cList.length; i++) {
            Control[] c = cList[i].getArray();
            for(int j = 0; j < c.length; j++)
                c[j].update(tpf); // index with j (the original used i here)
        }
    }
}

I hope I did not miss anything important. If not, this should eliminate the updateLogicalState overhead for spatials with no controls. The price to pay for this change is traversing sub-nodes when attaching and detaching nodes.

Tell me what you think! :wink:

There are general solutions we could employ that complicate the engine a little and have side-effects that we have to deal with. Your solution is not an optimal one because it does a lot of potentially unnecessary work on attach/detach, requires a special root node class (ugh), and some other distasteful things. There is a way to do it better without all of that but as a team we’d need to decide that we want to handle what has previously been considered oversized scenes. This is why I’ve not proposed any solutions before now.

Your solution is probably the best you can do if you don’t want to modify the core engine but we are talking about potentially modifying the core engine. We can get away with a few different things that way (such as keeping the list of spatials with controls instead of a list of lists because we can call runControlUpdate() directly if made protected, etc.)

The advice in the past has always been “make smaller scenes”… so I kept these ideas to myself. Because it’s not really bad advice either. For example, one reason that control updates rank so highly in your scene is because you have “too many nodes” and your game isn’t otherwise doing very much. The strategy that you’re pursuing has a scalability limit, because if you get the engine to handle 800 spatials OK and then add 10 more tomorrow, you might be over the hump again. Alternatively, there are ways to better organize your scene so that you can add/remove entire sections based on distance or whatever. Not “several times a second” but “when the player crosses a threshold”. A solution like that could be scaled far higher. Attaching three nodes and detaching three others when crossing a grid boundary is not going to have a high cost, but it opens up scalability to the point that you will hit other limits before you are plagued with update issues again.
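
As a rough illustration of the "when the player crosses a threshold" idea, here is a stand-alone sketch. GridPager, CELL_SIZE, and the string cell keys are hypothetical names made up for this example; in a real game each key would map to a pre-built cell Node, and the marked places would call rootNode.attachChild / rootNode.detachChild.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of grid-based paging; not part of jME3.
class GridPager {
    static final float CELL_SIZE = 32f;           // assumed cell edge length

    private int currentX = Integer.MIN_VALUE;     // no cell entered yet
    private int currentZ = Integer.MIN_VALUE;
    private final Set<String> attached = new HashSet<>();
    int attachCount = 0;                          // instrumentation only
    int detachCount = 0;

    /** Call every frame. Does real work only when the player enters a new cell. */
    boolean update(float playerX, float playerZ) {
        int cx = (int) Math.floor(playerX / CELL_SIZE);
        int cz = (int) Math.floor(playerZ / CELL_SIZE);
        if (cx == currentX && cz == currentZ) {
            return false;                         // same cell: nothing to do
        }
        currentX = cx;
        currentZ = cz;

        // The 3x3 block of cells around the player should be in the scene.
        Set<String> wanted = new HashSet<>();
        for (int dx = -1; dx <= 1; dx++) {
            for (int dz = -1; dz <= 1; dz++) {
                wanted.add((cx + dx) + "," + (cz + dz));
            }
        }
        for (String key : new HashSet<>(attached)) {
            if (!wanted.contains(key)) {
                attached.remove(key);             // real game: detach the cell node
                detachCount++;
            }
        }
        for (String key : wanted) {
            if (attached.add(key)) {              // real game: attach the cell node
                attachCount++;
            }
        }
        return true;
    }

    Set<String> attachedCells() {
        return attached;
    }
}
```

With a 3x3 neighbourhood, stepping into an adjacent cell detaches three cells and attaches three new ones, and frames spent inside one cell cost only two divisions and a comparison.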

Do I think the engine should avoid traversing the whole scene graph three times per frame? Yes. Do I think you should look at reorganizing your scene? Also yes.

Oh you’re right about the updateBounds. Forgot about that…I’m not sure why it’s needed though since we traverse top to bottom on update…anyway.

I’m not fond either of the idea of a special RootNode.

@pspeed we should talk about it then. Even if @The Leo should have some partitioning scheme in his scene, he does have a point.
I’m not sure this would have a lot of benefit on desktop though, since there is more chance that you are GPU bound than CPU. Android is special in that matter…

@nehon said: @pspeed we should talk about it then.

Maybe I’ll write something up and e-mail it to the group.

Since you are on this track, would it be possible to disable frustum culling completely?

As for most 2d games you would already have code that manages the scene, removing/reusing every single object out there. In most cases everything attached to the scenegraph is drawn too, making the culling checks only a waste of computing time.

Might only be a small benefit, but on Android every nanosecond counts.

Add: to be more clear, it would just be nice if there were an off switch somewhere. Culling would of course stay enabled by default.

@zzuegg said: Since you are on this track, would it be possible to disable frustum culling completely?

As for most 2d games you would already have code that manages the scene, removing/reusing every single object out there. In most cases everything attached to the scenegraph is drawn too, making the culling checks only a waste of computing time.

For 2D, use the GUI node and you will not have normal frustum culling but just a simple screen bounds check. Unless you are really talking about 2.5d.

In my case, I am working on a 3D game, but I am 100% sure that every object in the scenegraph is in the frustum too. Otherwise I would have to check my object-reuse code.

AFAIK 2.5D games are quite common too. Having perspective really helps, and it should not cost that much performance compared to flat 2D.

800 objects and 60 fps - impossible!
Test code gives 17 fps with 343 objects:


package classes;

import com.jme3.app.SimpleApplication;
import com.jme3.material.Material;
import com.jme3.math.Vector3f;
import com.jme3.scene.Geometry;
import com.jme3.scene.Spatial;
import com.jme3.scene.shape.Box;

public class Java_SE extends SimpleApplication {

    float size = 7;

    public static void main(String[] args) { new Java_SE().start(); }

    @Override
    public void simpleInitApp() {

        flyCam.setMoveSpeed(15);
        cam.setLocation(new Vector3f(0, 0, 18f));

        Geometry geometry = new Geometry("", new Box(0.3f, 0.3f, 0.3f));
        Material mat = new Material(assetManager, "Common/MatDefs/Misc/ShowNormals.j3md");
        geometry.setMaterial(mat);

        // size^3 = 343 cloned boxes, centered around the origin
        for (float z = 0; z < size; z++) {
            for (float y = 0; y < size; y++) {
                for (float x = 0; x < size; x++) {
                    Spatial s = geometry.clone();
                    s.setLocalTranslation(((-size+1)/2f)+x,
                                          ((-size+1)/2f)+y,
                                          ((-size+1)/2f)+z);
                    rootNode.attachChild(s);
                }
            }
        }

    }

    @Override
    public void simpleUpdate(float tpf) { rootNode.rotate(tpf/5, tpf/5, tpf/5); }

    //@Override
    //public void update() {
    //
    //    super.update();
    //
    //    if (this.speed == 0.0f || this.paused) { return; }
    //
    //    float tpf = this.timer.getTimePerFrame() * this.speed;
    //
    //    this.stateManager.update(tpf);
    //    this.simpleUpdate(tpf);
    //
    //    this.rootNode.updateLogicalState(tpf);
    //    this.guiNode.updateLogicalState(tpf);
    //
    //    this.rootNode.updateGeometricState();
    //    this.guiNode.updateGeometricState();
    //
    //    this.stateManager.render(this.renderManager);
    //    this.renderManager.render(tpf, this.context.isRenderable());
    //    this.simpleRender(this.renderManager);
    //    this.stateManager.postRender();
    //
    //}

}

If I comment out updateLogicalState(tpf), the game doesn’t stop.

Since this thread got necro’d, I thought I’d follow up on this…

I did more than write something up for the core devs, I actually implemented it. In the latest 3.1 code, updateLogicalState() no longer traverses the scene graph. It collects the spatials that are detected to need updateLogicalState() in a list and just updates those. (Presuming you follow best practices and don’t extend spatial then you get this for free automatically… else you have to override a method on Spatial to avoid backwards-compatibility mode.)
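
For readers who want to picture the idea, here is a toy model of "collect the spatials that need updates in a list and just update those". It is only loosely based on the description above and is not the jME 3.1 source; ToyControl, ToySpatial, ToyRoot, and invalidateUpdateList are names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical and are NOT the jME 3.1 source.
interface ToyControl {
    void update(float tpf);
}

class ToySpatial {
    final List<ToyControl> controls = new ArrayList<>();
    final List<ToySpatial> children = new ArrayList<>();
}

class ToyRoot extends ToySpatial {
    private final List<ToySpatial> updateList = new ArrayList<>();
    private boolean updateListValid = false;
    int spatialsVisited = 0; // instrumentation for the example

    /** Call after attaching/detaching spatials or adding/removing controls. */
    void invalidateUpdateList() {
        updateListValid = false;
    }

    void updateLogicalState(float tpf) {
        if (!updateListValid) {
            updateList.clear();
            collect(this);              // the full traversal happens only here
            updateListValid = true;
        }
        for (ToySpatial s : updateList) {
            spatialsVisited++;
            for (ToyControl c : s.controls) {
                c.update(tpf);
            }
        }
    }

    private void collect(ToySpatial s) {
        if (!s.controls.isEmpty()) {
            updateList.add(s);
        }
        for (ToySpatial child : s.children) {
            collect(child);
        }
    }
}
```

In this sketch a scene with 100 spatials and 2 controls touches only 2 spatials per frame once the list has been built; the cost moves to the frames in which the scene structure changes.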

@pspeed: thanks for implementing that

@Rutar: I think you didn’t read the part where I cull far-away objects, so at any moment only 30-80 are visible. If you’re struggling with Android performance, I suggest using the NetBeans Profiler.

@The_Leo Cool game you got there.

I’ve never asked myself whether anything bad can happen in updateLogicalState(), but if it has been improved now, so much the better…

Regarding optimization for this kind of runner game: I think you need a pool of the spatials that make up the scene. In the past, I was involved in making Despicable Me 2 (the Minions game that was at the top of the Android charts for a while).

Besides a very well optimized engine, we used several tricks to make sure the game was “ready” for every upcoming model loaded down the road the Minions run along. Those tricks can also be applied to JME games for Android and mobile, for example:

  1. Simplify the physics collision of the scenes. We had just a few faces and boxes near the player’s position. Simplify interactions too, but that can affect game design, so consider optimizing interactions later…
  2. Pool everything and reuse as much as possible. In your game, I see walls could be made of tiles, trees could be rotated for variety, and you can even reuse particle emitters, shaders, lights… Remember to count every single thing that you use!
  3. For paging, I think toggling a spatial should in general take the form of showing/hiding or moving spatials in/out, not adding/removing them in the scenegraph.
  4. On low-memory devices, keep the number of spatials to be updated to a minimum. In my other JME games I usually keep a list of updatables (outside the scene graph) with fewer than 50 objects. This list can be called the EntityManager and works like what the update loop does with spatials and their controls. Also use Controls where needed and try to share controls between instances. A small trick is to let a Control “operate” on Spatials via their user data.
  5. Make smaller, lower-triangle versions of models from the start (if your team has artists), then switch between them by quality. If you have good LOD code, you will gain some more cycles per second. If the device is too weak, try a lower quality or even consider re-modelling if needed.
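
The pooling tip can be sketched in a few lines. SimplePool is a hypothetical helper written for this example, not a jME3 class; in a game, T would typically be a Spatial or a particle emitter, and release() would be called when the object scrolls out of play.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical minimal object pool; not part of jME3.
class SimplePool<T> {
    private final Deque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    int created = 0;                     // how many objects were really allocated

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns a recycled object if one is available, otherwise creates a new one. */
    T acquire() {
        T item = free.pollFirst();
        if (item == null) {
            item = factory.get();
            created++;
        }
        return item;
    }

    /** Hands an object back for later reuse. The caller must reset its state. */
    void release(T item) {
        free.addFirst(item);
    }
}
```

The point of counting `created` is the same as "count every single thing that you use": once the game reaches a steady state, the number should stop growing, which also means the garbage collector has nothing new to clean up.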

Those are some general tips, nothing special, but for low GPU budgets such as Android phones, they are about all you can do even after genius coding. :smile:

Cheers,

@The_Leo: Thank you for your reply, I really missed that point. I’m having trouble implementing your approach, because I write for JME 3.0 while the code on GitHub is different (probably 3.1). Could you give a complete (or even partial) sample of your code (your own implementation of the class)? I only recently started programming for Android and cannot implement the parts of the code you posted by myself. I think your experience will be very useful for many novice programmers.
P.S. Excuse me please for my English :slight_smile:

@Rutar: I would not advise trying these changes unless you understand what they do. With them, controls added to your objects will no longer be updated unless you explicitly add them to the list mentioned below. However, you will notice a nice performance boost on Android. Thanks to @pspeed’s optimization in 3.1, this is only needed in jme 3.0.

The idea is to avoid the tree traversal by not calling updateLogicalState() on your root nodes. If you extend SimpleApplication, that call is made by the class’s update method. To change this, make a copy of the SimpleApplication class, give it a different name, and edit it.

  1. Comment out updateLogicalState(). updateLogicalState() is responsible for updating the Controls each Spatial has.
  2. Keep a collection of the Controls/Spatials on which we will call updateLogicalState.
  3. Comment out the following two lines in the update method, and add the three lines below them instead.

//        rootNode.updateLogicalState(tpf);
//        guiNode.updateLogicalState(tpf);
Spatial[] s = logical.getArray();
for(int i = 0; i < s.length; i++)
    s[i].updateLogicalState(tpf);

  4. Define the list that will hold the spatials that have controls:

public SafeArrayList<Spatial> logical = new SafeArrayList<Spatial>(Spatial.class);

Alternatives:

  1. manage your own root node… won’t require you to mess with cut-pasted code, etc.
  2. or, upgrade to 3.1… get the feature automatically

@The_Leo, thank you very much for the detailed reply, I will try it in practice.

@pspeed, is version 3.1 already available? How can I update to it?

You have to build it yourself.

@pspeed, please give me a link to 3.1. This page only covers 3.0: http://wiki.jmonkeyengine.org/doku.php/jme3:build_from_sources