Java: What causes the performance increase when repeatedly calling a function?

Question 1

What causes this speedup?

It is most likely JIT compilation, but it could also be contributions from code loading and/or heap warm up effects.

How can I be sure my measurements are accurate when testing other functions, do I have to call the function more than four times before measuring?

You need to do something like that. There is no other way to eliminate JVM warmup effects from your measurements, and still get representative results. Writing valid "micro-benchmarks" for Java is difficult, and you need to read up on all of the issues before you try. Start with this: How do I write a correct micro-benchmark in Java?

I would also note a couple of other things:

Your attempts to remove the cost of garbage collection from your measurements (I assume that's what you seem to be going) appear to have failed. It looks like you are getting minor collections during the execution of testMethod. That would account for the ~7% variability in your "steady state" results.
Separating the cost of allocating an object from the cost of freeing it is likely to give you misleading results. The "total" cost of allocating an object includes the cost of zeroing the memory when it is recycled ... and that is done by the garbage collector.
In fact, the most useful measure is the amortized cost per-object of an allocate / collect cycle. And that (surprisingly) depends on the amount of non-garbage when the GC runs ... which is something that your benchmark doesn't take into account.

Question 2

There are multiple factors that could speed up the execution of the function, as you stated:

most of all JIT may occur at different deegrees or moment during your application (this is why profiling without giving the JVM enough time for warm up can lead to misleading results)
the request of memory to the OS
the reuse of memory on heap for objects that you are allocating

You can't be sure of when every step occurs and you can't know when pieces of code are compiled into native code, but assuming that this happens right before first call is not correct, it may happen exactly when time drops between third and fourth iteration or whatever. Unfortunately these details are hidden inside JVM implementation.