Challenges
Here are some common challenges in regular development activities:
- How to justify that a new version of the logic is better than the older version
- How to justify that new JDK features, such as parallel streams, are better than their sequential counterparts
- How to compare the same logic under various load conditions (sample sizes)
- How to compare logic under different environment variables, JVM settings, or with GC forced on or off
One quick solution is to use a good benchmarking tool. A benchmark is meant to measure the performance of a piece of software: it is an attempt to reproduce in the laboratory what will happen in production.
OpenJDK has come up with a nice tool called JMH.
Overview
JMH (Java Microbenchmark Harness) is a Java harness for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM. JMH is Maven-driven, so having Maven installed will give the best experience.
JMH allows us to deeply understand the implementation details that impact the execution and concurrency of a new implementation. JMH is certainly a tool that you'll want to bring into your toolbox if you care at all about understanding the performance of your applications (especially down at the algorithm and language level).
One advantage of JMH over Caliper is that it runs on Windows.
Setup
Please go through the JMH portal for more details on setup.
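As a minimal sketch, the Maven setup usually amounts to adding the jmh-core dependency to your pom.xml (the version is left as a placeholder; check the JMH portal for the current release):

<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version><!-- use the latest version from the JMH portal --></version>
</dependency>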
Sample Code
JMH has only 2 requirements (everything else is a recommendation):
- You need the jmh-core Maven dependency
- You need to annotate test methods with the @GenerateMicroBenchmark annotation
Annotate the required methods with @GenerateMicroBenchmark. Typical JMH code with the @GenerateMicroBenchmark annotation looks like this:
@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public List<Long> performSerialStream() {
    long rangeEnd = 10000L;
    // Collect perfect numbers with a sequential stream. Returning the
    // result keeps dead code elimination from removing the computation
    // (see the BlackHole section below).
    return LongStream.rangeClosed(1, rangeEnd)
            .filter(PerfectNumberFinderBenchMark::isPerfect)
            .collect(ArrayList<Long>::new, ArrayList<Long>::add, ArrayList<Long>::addAll);
}

@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public List<Long> performParallelStream() {
    long rangeEnd = 10000L;
    // Same computation, but on a parallel stream.
    return LongStream.rangeClosed(1, rangeEnd).parallel()
            .filter(PerfectNumberFinderBenchMark::isPerfect)
            .collect(ArrayList<Long>::new, ArrayList<Long>::add, ArrayList<Long>::addAll);
}
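The isPerfect helper is not shown in the original listing. A minimal sketch of what it might look like (a number is perfect when it equals the sum of its proper divisors):

static boolean isPerfect(long n) {
    if (n < 6) {
        return false; // 6 is the smallest perfect number
    }
    long sum = 1; // 1 is a proper divisor of every n > 1
    for (long i = 2; i * i <= n; i++) {
        if (n % i == 0) {
            sum += i;
            long pair = n / i;
            if (pair != i) {
                sum += pair; // count the paired divisor only once
            }
        }
    }
    return sum == n;
}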
Sample Output
The benchmark output reports details for the different modes of operation (Throughput/thrpt, AverageTime/avgt, SampleTime/sample, SingleShotTime/ss).
How it Works?
- Finds annotated micro-benchmarks using reflection
- Generates plain Java infrastructure source code around the calls to the micro-benchmarks
- Compile, pack, run, profit
- No reflection during benchmark execution
Customization
Many customizations are supported, both at run time and through annotations. Many metrics can be measured; some of them are:
- Single execution time
- Operations per time unit
- Average time per operation
- Percentile estimation of time per operation
Run time:
Some of the important options to customize the output are listed below, followed by a sample invocation:
- bm <mode>: benchmark mode. Available modes are: [Throughput/thrpt, AverageTime/avgt, SampleTime/sample, SingleShotTime/ss, All/all]
- i: number of benchmarked iterations; use 10 or more to get a good idea
- r: how long to run each benchmark iteration
- wi: number of warmup iterations
- w: how long to run each warmup iteration (give ample room for warmup; how much depends on the code you are trying to measure, but try to have it execute 100K times or so)
- gc [bool]: should JMH force GC between iterations?
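For example, assuming the benchmarks are packaged into the default microbenchmarks.jar produced by the JMH Maven setup (the jar name and benchmark regex below are illustrative), a run might look like:

java -jar target/microbenchmarks.jar ".*PerfectNumberFinderBenchMark.*" -wi 5 -i 10 -bm thrpt -gc true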
Annotations
JMH uses an annotation-driven approach to detect benchmarks. It provides a BlackHole to consume results (and CPU cycles).
GenerateMicroBenchmark
Marks the payload method as the target for microbenchmark generation. Annotation processors will translate methods marked with this annotation into the correct MicroBenchmark-annotated classes.
This annotation only accepts parameters affecting the workload generation (currently only Mode). Other parameters for run control are available as separate annotations (e.g. @Measurement, @Threads, and @Fork), which can be used both on concrete @GenerateMicroBenchmark-annotated methods and on the classes containing the target methods. Class-level annotations will be honored first, then any method-level annotations, as sketched below.
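A minimal sketch of how class-level defaults combine with method-level overrides (the annotation values are illustrative, and the JMH annotations are assumed to be imported):

@Fork(1)
@Measurement(iterations = 10)
public class StreamBenchmark {

    // Inherits the class-level @Fork and @Measurement settings.
    @GenerateMicroBenchmark
    public long defaults() {
        return LongStream.rangeClosed(1, 1000).sum();
    }

    // Method-level annotation takes precedence over the class-level default.
    @GenerateMicroBenchmark
    @Measurement(iterations = 20)
    public long moreIterations() {
        return LongStream.rangeClosed(1, 1000).sum();
    }
}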
State
The @State annotation defines the scope in which an instance of a given class will be available. JMH allows you to run tests in multiple threads simultaneously, so choose the right state:
- Scope.Thread: This is the default state. An instance will be allocated for each thread running the given test.
- Scope.Benchmark: An instance will be shared across all threads running the same test. Could be used to test the multithreaded performance of a state object (or just mark your benchmark with this scope).
- Scope.Group: An instance will be allocated per thread group (see the Groups section down below).
Besides marking a separate class with @State, you can also mark your own benchmark class with @State. All the scope rules above apply to this case as well.
@State means benchmark data is shared across the benchmark, a thread, or a group of threads. It also allows performing fixtures (setup and teardown) in the scope of the whole run, an iteration, or a single execution, as sketched below.
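A minimal sketch of a per-thread state class with fixtures (the field and fixture levels are illustrative; this assumes the @Setup and @TearDown annotations from the JMH annotations package):

@State(Scope.Thread)
public static class ThreadState {
    List<Long> scratch;

    // Runs once before each iteration, for this thread's instance only.
    @Setup(Level.Iteration)
    public void setUp() {
        scratch = new ArrayList<>();
    }

    // Runs once after each iteration.
    @TearDown(Level.Iteration)
    public void tearDown() {
        scratch.clear();
    }
}

@GenerateMicroBenchmark
public int useState(ThreadState state) {
    // The state instance is injected as a method argument.
    state.scratch.add(42L);
    return state.scratch.size();
}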
@Threads
A simple way to run a concurrent test, provided you have defined the correct @State.
@Group
Groups threads to assign them a particular role in the benchmark.
BenchmarkMode
You can use the following test modes, specified with the @BenchmarkMode annotation on the test methods (a usage sketch follows the list):
- Mode.Throughput: Calculates the number of operations in a time unit.
- Mode.AverageTime: Calculates the average running time per operation.
- Mode.SampleTime: Calculates how long it takes for a method to run (including percentiles).
- Mode.SingleShotTime: Just runs a method once (useful for cold-testing mode), or more than once if you have specified a batch size for your iterations (see the @Measurement annotation below); in this case JMH will calculate the batch running time (the total time for all invocations in a batch).
- Any set of these modes: You can specify any set of these modes; the test will be run several times (depending on the number of requested modes).
- Mode.All: All these modes, one after another.
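A minimal sketch of combining modes with a time unit on a benchmark method (the method body is illustrative):

@GenerateMicroBenchmark
@BenchmarkMode({Mode.AverageTime, Mode.SampleTime})
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public long sumRange() {
    // Measured once per requested mode; the same time unit applies to both.
    return LongStream.rangeClosed(1, 10_000).sum();
}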
Time units
You can specify the time unit to use via @OutputTimeUnit, which requires an argument of the standard Java type java.util.concurrent.TimeUnit.
Unfortunately, if you have specified several test modes for one test, the given time unit will be used for all of them (for example, it may be convenient to measure SampleTime in nanoseconds, but throughput is better measured in longer time units).
BlackHole for Dead Code Results
Dead code elimination is a well-known problem among microbenchmark writers. The general solution is to use the result of the calculations somehow; JMH does not do any magic tricks on its own. If you want to defend against dead code elimination, never write void tests. Always return the result of your calculations, and JMH will take care of the rest. If you need to return more than one value from your test, either combine all return values with some cheap operation (cheap compared to the cost of the operations by which you got your results) or use a BlackHole method argument and sink all your results into it (note that BlackHole.consume may be more expensive than manually combining results in some cases). BlackHole is a thread-scoped class.
@GenerateMicroBenchmark(BenchmarkType.All)
public void arrayListIterator(BlackHole bh) {
    Iterator<Integer> iterator = arrayList.iterator();
    while (iterator.hasNext()) {
        Integer i = iterator.next();
        // Sink each element so the loop cannot be optimized away.
        bh.consume(i);
    }
}
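For comparison, a minimal sketch of the return-based alternative mentioned above, combining the values with a cheap operation instead of a BlackHole (the arrayList field is assumed from the surrounding benchmark class):

@GenerateMicroBenchmark(BenchmarkType.All)
public int arrayListSum() {
    int sum = 0;
    for (Integer i : arrayList) {
        sum += i; // cheap combining operation
    }
    return sum; // returning the result defeats dead code elimination
}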