When reading up on Java reflection it’s hard to browse very far without hearing about how slow reflection supposedly is. Probably most of us have seen about a bajillion benchmarks comparing reflection to direct method dispatch, lambdas, etc., and reflection usually loses badly. The other day, however, I encountered a situation where reflection significantly outperformed the alternatives.
Over the last few months I’ve been putting most of my free time into a scripting language/interpreter for game development. The language is dynamically typed and compiles into a custom bytecode format that is run by an interpreter written in Java. The original design of the interpreter utilized a “catch all” base class that relied on overridden methods to handle most interactions with the interpreter (very similar to how Python handles operator overloading). The problem with this design was that it was freakishly verbose and resulted in ultra-dense, ugly code even for simple operations. For example, here’s the integer addition code:
@Override
public CObject __plus__(CObject other){
    if(other instanceof CInt){
        return new CInt(intValue + ((CInt) other).intValue);
    }else if(other instanceof CFloat){
        return new CFloat((float)intValue + ((CFloat) other).floatValue);
    }else{
        throw new UnimplementedOperation(String.format("Undefined operation: cannot perform int + %s", other.getClass().getSimpleName()));
    }
}
There were two major problems with this design:
- The code was nasty to write and maintain.
- Interop with Java code was quite tricky - you’d have to write your own wrapper to expose Java objects to the interpreter, one wrapper per Java class.
To work around the second issue I had plans to write a reflective “auto wrapper.” Before I had a chance to do that, I got tired of writing the freakishly tangled up code above and totally rewrote the internals of the interpreter. The new design uses reflection to dispatch opcodes to Java methods, replacing the base-class override design above. I was concerned about the performance impact this would have, so when I did the rewrite I took the time to set up the project with proper JMH benchmarks. I’ve been using a simple count-to-one-million loop to benchmark/profile the core “basic” functionality of the runtime:
var x = 0
while(x < 1000000){
    x = x + 1
}
Sure enough, my shiny new reflective design benchmarked at a dismal 7 seconds to complete one iteration of the count-to-a-million benchmark. I knew it wouldn’t be optimal because I was looking up the Method objects on every opcode dispatch, but I didn’t expect it to be that bad. I promptly added a Method cache to save and reuse method references between dispatches (the lookup is done only once, the first time a method is called). I also disabled access checking by calling method.setAccessible(true). The results were drastically better: ~0.27 seconds to run the benchmark on my main development machine (about 0.18 seconds on a machine with a newer processor and faster RAM clocks). Profiling with VisualVM showed that about one third of the time per opcode was spent decoding and preparing arguments and two thirds was spent inside the reflective method invocation itself, so I decided to leave well enough alone and move on to other, higher-priority tasks on the project.
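The cached dispatch works roughly like this. This is a minimal sketch under my own naming - ReflectiveDispatch, IntOps, and plus are hypothetical stand-ins, not the interpreter's actual classes - but it shows the two tricks: resolve the Method once per (class, name) pair, and call setAccessible(true) so later invocations skip access checks.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReflectiveDispatch {
    // Stand-in for a real opcode handler class.
    public static class IntOps {
        public Object plus(Integer a, Integer b) { return a + b; }
    }

    // Method cache: one reflective lookup per (class, method name),
    // reused on every subsequent dispatch.
    private static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    public static Object invoke(Object target, String name, Object... args) throws Exception {
        String key = target.getClass().getName() + "#" + name;
        Method m = CACHE.get(key);
        if (m == null) {
            // First call only: resolve the Method from the boxed argument types...
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) types[i] = args[i].getClass();
            m = target.getClass().getMethod(name, types);
            // ...and disable access checking for all later invocations.
            m.setAccessible(true);
            CACHE.put(key, m);
        }
        return m.invoke(target, args);
    }

    public static void main(String[] args) throws Exception {
        IntOps ops = new IntOps();
        System.out.println(invoke(ops, "plus", 1, 2));  // first call: resolves and caches; prints 3
        System.out.println(invoke(ops, "plus", 40, 2)); // second call: cache hit; prints 42
    }
}
```

A real interpreter would also key the cache on argument types to handle overloads, but the one-lookup-then-reuse shape is the same.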
The other day, however, I had the idea of reintroducing the old style base class into the new design. Objects that extended from the base class would be dispatched statically through virtual method calls, check their parameter types, and perform the operations without reflection ever being involved. Objects that didn’t sport this “optimization” would go through the plain ol’ reflective dispatch. I tossed in enough code to use this technique for the benchmark above and fired up JMH, expecting a modest performance improvement since I’d bypassed the reflective invocation.
Much to my surprise, the reflective version ran the benchmark ~20% faster. My “optimization” spent more time checking parameter types and invoking the operation than the JVM did. In hindsight, I shouldn’t have been surprised - whenever you go from dynamically typed parameters to statically typed method invocations you’ve got to pay some type checking cost, and the JVM apparently has a more optimized way to do this than “if-instanceof-then-cast” techniques.
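For concreteness, the hybrid scheme I tried looked roughly like this (a sketch reusing the CObject/CInt names from the snippet above; the HybridDispatch wrapper and its plus method are hypothetical). The fast path is exactly the "if-instanceof-then-cast" check that, per the benchmark, ended up costing more than the JVM's own internal type check during a cached reflective invoke.

```java
public class HybridDispatch {
    public static abstract class CObject {
        // Fast path: subclasses override, check parameter types, cast, operate.
        public CObject __plus__(CObject other) {
            throw new UnsupportedOperationException("__plus__ not implemented");
        }
    }

    public static class CInt extends CObject {
        public final int intValue;
        public CInt(int v) { intValue = v; }

        @Override
        public CObject __plus__(CObject other) {
            // The hand-written type check that lost to reflective dispatch.
            if (other instanceof CInt) {
                return new CInt(intValue + ((CInt) other).intValue);
            }
            throw new UnsupportedOperationException(
                "int + " + other.getClass().getSimpleName());
        }
    }

    static CObject plus(Object a, Object b) {
        if (a instanceof CObject && b instanceof CObject) {
            // Static/virtual dispatch through the base class.
            return ((CObject) a).__plus__((CObject) b);
        }
        // Objects without the "optimization" would fall through to
        // the reflective dispatch path (elided in this sketch).
        throw new UnsupportedOperationException("reflective fallback elided");
    }

    public static void main(String[] args) {
        CInt r = (CInt) plus(new CInt(2), new CInt(3));
        System.out.println(r.intValue); // prints 5
    }
}
```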
My takeaways from this:
- As usual, make a good design first and optimize the unacceptably slow parts later (after proper benchmarking/profiling). Performance issues often aren’t where you think they are.
- Reflection is slower than static method dispatch, but if you’ve got to dynamically bind method invocations (and invokedynamic/MethodHandles aren’t a good/applicable option), it may very well be faster than handwritten alternatives. In other words, reflection can be fast - quite fast.