Question

Question 1: Why is JMH better than simply timing with System.nanoTime()?

Question 2: Besides the fact that validateLongKeyBinary is 64 percent faster than validateLongKeyAscii, what can I conclude from the results (see the benchmarking results section)?

Example (Code):

import net.spy.memcached.util.StringUtils;
import org.openjdk.jmh.annotations.GenerateMicroBenchmark;

public class KeyBench {

    private static final String LONG_KEY = "thisIsAFunkyKeyWith_underscores_AndAlso334" +
            "3252545345NumberslthisIsAFunkyKeyWith_underscores_AndAlso3343252545345Numbe" +
            "rslthisIsAFunkyKeyWith_underscores_AndAlso3343252545345NumberslthisIsAFunkyK" +
            "eyWith_underscores_AndAlso3343252545345Numbersl";
    @GenerateMicroBenchmark
    public void validateLongKeyBinary() {
        StringUtils.validateKey(LONG_KEY, true);
    }

    @GenerateMicroBenchmark
    public void validateLongKeyAscii() {
        StringUtils.validateKey(LONG_KEY, false);
    }
}

Benchmarking results

# Running: benchmarks.KeyBench.validateLongKeyAscii

Result : 393,667 ±(95%) 13,985 ±(99%) 20,094 ops/ms
  Statistics: (min, avg, max) = (357,445, 393,667, 413,004), stdev = 19,552
  Confidence intervals: 95% [379,682, 407,653], 99% [373,573, 413,762]


# Running: benchmarks.KeyBench.validateLongKeyBinary

Result : 644,023 ±(95%) 6,881 ±(99%) 9,887 ops/ms
  Statistics: (min, avg, max) = (621,784, 644,023, 654,178), stdev = 9,620
  Confidence intervals: 95% [637,142, 650,904], 99% [634,136, 653,910]

Benchmark                             Mode Thr     Count  Sec         Mean   Mean error    Units
b.KeyBench.validateLongKeyAscii      thrpt   1        10    1      393,667       20,094   ops/ms
b.KeyBench.validateLongKeyBinary     thrpt   1        10    1      644,023        9,887   ops/ms
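
The 64 percent figure in Question 2 follows directly from the mean throughputs in the summary table (the locale prints decimals with commas; they are written with dots below). A quick sketch of the arithmetic:

```java
public class SpeedupCalc {
    public static void main(String[] args) {
        // Mean throughputs from the JMH summary table, in ops/ms.
        double ascii = 393.667;
        double binary = 644.023;

        double speedup = binary / ascii;            // ≈ 1.64x the throughput
        double percentFaster = (speedup - 1) * 100; // ≈ 64% more ops/ms
        System.out.printf("binary is %.0f%% faster than ascii%n", percentFaster);
    }
}
```

Note this compares throughput means only; the confidence intervals above tell you the difference is far larger than the run-to-run noise, so the effect is statistically meaningful.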

Answer

JMH maintainer here.

Let me ask the leading question: why would one use a library at all, if you can code most of the things yourself? The answer is actually simple: of course you can write everything given infinite time, but in practice we have to reuse code to fit into a reasonable time.

Now, it might seem that taking two timestamps around the code is enough to measure its performance. However, you have to control what exactly you are measuring: e.g., whether you are still in the transitional warm-up phase, whether your code actually executes or you are measuring a hollow shell left behind after optimization, how statistically significant your effect is, and so on. Good benchmark frameworks try to help with all of that.
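
The warm-up and dead-code points can be illustrated with a naive hand-rolled harness (a hypothetical sketch, not JMH; all names here are made up for illustration). The first timed runs include interpretation and JIT compilation, and if the loop's result is never consumed, the JIT is free to delete the work, leaving you timing a hollow shell:

```java
public class NaiveHarness {
    static long sink; // publishing the result here keeps the work observable

    // Time a loop of cheap work. 'consume' controls whether the result is
    // published; if it is not, the JIT may eliminate the loop entirely and
    // the "benchmark" measures nothing.
    static double nsPerOp(int iterations, boolean consume) {
        long start = System.nanoTime();
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += Integer.rotateLeft(i * 31, 7);
        }
        long elapsed = System.nanoTime() - start;
        if (consume) sink = acc;
        return elapsed / (double) iterations;
    }

    public static void main(String[] args) {
        // Warm-up runs: without these, the first measurement mostly times
        // JIT compilation rather than the steady-state loop.
        for (int i = 0; i < 10; i++) nsPerOp(1_000_000, true);
        System.out.printf("measured: %.2f ns/op%n", nsPerOp(1_000_000, true));
    }
}
```

JMH handles warm-up iterations, result consumption (via returned values and Blackholes), and the statistics for you, which is exactly the point of the answer above.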

You can get a glimpse of the issues you will have to face by looking at the JMH Samples, our benchmarking talks, or the relevant SO answers. Oh, and using nanoTime is harder than you think.
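
On that last point: System.nanoTime() itself has a call cost and a finite observable granularity, so wrapping two timestamps around a fast operation largely measures the clock rather than the operation. A small probe (an illustrative sketch) makes this visible:

```java
public class NanoTimeProbe {
    public static void main(String[] args) {
        // Spin until nanoTime() advances; the observed delta bounds the
        // clock's effective granularity plus the call's own latency.
        long t0 = System.nanoTime();
        long t1;
        do {
            t1 = System.nanoTime();
        } while (t1 == t0);
        System.out.println("smallest observed tick: " + (t1 - t0) + " ns");
    }
}
```

On typical hardware the delta is tens of nanoseconds, which is already larger than many of the operations people try to time this way.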

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow