What causes this speedup?
It is most likely JIT compilation, but code loading and/or heap warm-up effects could also contribute.
How can I be sure my measurements are accurate when testing other functions? Do I have to call the function more than four times before measuring?
You need to do something like that. There is no other way to eliminate JVM warmup effects from your measurements, and still get representative results. Writing valid "micro-benchmarks" for Java is difficult, and you need to read up on all of the issues before you try. Start with this: How do I write a correct micro-benchmark in Java?
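As a rough illustration of "calling the function several times before measuring", here is a minimal sketch of a hand-rolled warm-up loop. The class name, the `averageNanos` helper, the iteration counts, and the body of `testMethod` are all assumptions made up for this example, not taken from the question. It only shows the shape of the idea; a dedicated harness such as JMH deals with the remaining pitfalls far more rigorously.

```java
import java.util.function.IntSupplier;

public class WarmupBenchmark {
    // Volatile sink: storing each result here keeps the JIT from
    // discarding the measured work as dead code.
    static volatile int sink;

    /**
     * Runs the task warmupRuns times without timing it, then returns
     * the mean wall-clock time in nanoseconds over measuredRuns calls.
     */
    static double averageNanos(IntSupplier task, int warmupRuns, int measuredRuns) {
        // Untimed warm-up: gives class loading and JIT compilation a chance to happen.
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.getAsInt();
        }
        // Timed phase: time the whole batch and divide by the number of calls.
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink = task.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / measuredRuns;
    }

    // Hypothetical workload standing in for the method under test.
    static int testMethod() {
        int sum = 0;
        for (int i = 0; i < 10_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        // The run counts are arbitrary; pick them by watching when the numbers stabilize.
        double ns = averageNanos(WarmupBenchmark::testMethod, 10_000, 10_000);
        System.out.printf("average: %.1f ns per call%n", ns);
    }
}
```

Timing a batch of calls rather than a single call also reduces the influence of timer granularity on the result.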
I would also note a couple of other things:
Your attempts to remove the cost of garbage collection from your measurements (I assume that's what you are trying to do) appear to have failed. It looks like you are getting minor collections during the execution of testMethod. That would account for the ~7% variability in your "steady state" results.
Separating the cost of allocating an object from the cost of freeing it is likely to give you misleading results. The "total" cost of allocating an object includes the cost of zeroing the memory when it is recycled ... and that is done by the garbage collector.
In fact, the most useful measure is the amortized per-object cost of an allocate / collect cycle. And that (surprisingly) depends on the amount of non-garbage (live objects) when the GC runs ... which is something that your benchmark doesn't take into account.
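One way to check whether collections really are happening inside a timed region is to compare the JVM's collection counters before and after it. The sketch below uses the standard `java.lang.management` API; the class name, the helper names, and the allocation loop in `main` are illustrative assumptions, not code from the question.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCheck {
    // Volatile sink so the allocations in main are not optimized away.
    static volatile Object sink;

    /** Sums the collection counts of all collectors; a count of -1 (undefined) is skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Returns how many collections were recorded while the task ran. */
    static long collectionsDuring(Runnable task) {
        long before = totalCollections();
        task.run();
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        // Hypothetical allocation-heavy workload standing in for the timed region.
        long n = collectionsDuring(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                sink = new byte[1024];
            }
        });
        System.out.println("collections during run: " + n);
    }
}
```

A non-zero result means GC work is mixed into the timing of that run. Running the JVM with GC logging enabled gives the same information in more detail.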