Challenges
Here are some common challenges in regular development activities:
- How to justify that a new version of the logic is better than the older version
- How to justify that new JDK features, such as parallel streams, are better than their sequential counterparts
- How to compare the same logic under various load conditions (sample sizes)
- How to compare logic under different environment variables, JVM settings, or with GC forced on or off
One quick solution is to use a good benchmarking tool. A benchmark is meant to measure the performance of a piece of software: it is an attempt to reproduce in the laboratory what will happen in production.
OpenJDK has come up with a nice tool called JMH.
Overview
JMH (Java Microbenchmark Harness) is a Java harness for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM. JMH is Maven-driven, so having Maven installed will give the best experience.
JMH allows us to deeply understand the implementation details that impact the execution and concurrency of a new implementation. JMH is certainly a tool that you'll want to bring into your toolbox if you care at all about understanding the performance of your applications (especially down at the algorithm and language level).
One advantage of JMH over Caliper is that it runs on Windows.
Setup
Please go through the JMH portal for more details on setup.
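As a minimal sketch, the Maven setup usually amounts to adding the jmh-core dependency to your pom.xml (the version is left as a placeholder; check the JMH portal for the current release):

<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version><!-- use the latest version from the JMH portal --></version>
</dependency>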
Sample Code
JMH has only 2 requirements (everything else is a recommendation):
- You need the jmh-core Maven dependency
- You need to annotate test methods with the @GenerateMicroBenchmark annotation
Annotate the required methods with @GenerateMicroBenchmark. Typical JMH code with the @GenerateMicroBenchmark annotation looks like this:
@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public List<Long> performSerialStream() {
    long rangeEnd = 10000L;
    // Collect perfect numbers with a sequential stream. Returning the
    // result keeps dead code elimination from removing the computation
    // (see the BlackHole section below).
    return LongStream.rangeClosed(1, rangeEnd)
            .filter(PerfectNumberFinderBenchMark::isPerfect)
            .collect(ArrayList<Long>::new, ArrayList<Long>::add, ArrayList<Long>::addAll);
}

@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public List<Long> performParallelStream() {
    long rangeEnd = 10000L;
    // Same computation, but on a parallel stream.
    return LongStream.rangeClosed(1, rangeEnd).parallel()
            .filter(PerfectNumberFinderBenchMark::isPerfect)
            .collect(ArrayList<Long>::new, ArrayList<Long>::add, ArrayList<Long>::addAll);
}
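The isPerfect helper is not shown in the original listing. A minimal sketch of what it might look like (a number is perfect when it equals the sum of its proper divisors):

static boolean isPerfect(long n) {
    if (n < 6) {
        return false; // 6 is the smallest perfect number
    }
    long sum = 1; // 1 is a proper divisor of every n > 1
    for (long i = 2; i * i <= n; i++) {
        if (n % i == 0) {
            sum += i;
            long pair = n / i;
            if (pair != i) {
                sum += pair; // count the paired divisor only once
            }
        }
    }
    return sum == n;
}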
Sample Output
The benchmark output reports details for the different modes of operation (Throughput/thrpt, AverageTime/avgt, SampleTime/sample, SingleShotTime/ss).
How it Works?
- Finds annotated micro-benchmarks using reflection
- Generates plain Java infrastructure source code around the calls to the micro-benchmarks
- Compile, pack, run, profit
- No reflection during benchmark execution
Customization
Many customizations are supported, both at run time and through annotations. Many metrics can be measured; some of them are:
- Single execution time
- Operations per time unit
- Average time per operation
- Percentile estimation of time per operation
Run time:
Some of the important options to customize the output are listed below, followed by a sample invocation:
- bm <mode>: benchmark mode. Available modes are: [Throughput/thrpt, AverageTime/avgt, SampleTime/sample, SingleShotTime/ss, All/all]
- i: number of benchmarked iterations; use 10 or more to get a good idea
- r: how long to run each benchmark iteration
- wi: number of warmup iterations
- w: how long to run each warmup iteration (give ample room for warmup; how much depends on the code you are trying to measure, but try to have it execute 100K times or so)
- gc [bool]: should JMH force GC between iterations?
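For example, assuming the benchmarks are packaged into the default microbenchmarks.jar produced by the JMH Maven setup (the jar name and benchmark regex below are illustrative), a run might look like:

java -jar target/microbenchmarks.jar ".*PerfectNumberFinderBenchMark.*" -wi 5 -i 10 -bm thrpt -gc true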
Annotations
JMH uses an annotation-driven approach to detect benchmarks. It provides a BlackHole to consume results (and CPU cycles).
GenerateMicroBenchmark
Marks the payload method as the target for microbenchmark generation. Annotation processors will translate methods marked with this annotation into the correct MicroBenchmark-annotated classes.
This annotation only accepts parameters affecting the workload generation (currently only Mode). Other parameters for run control are available as separate annotations (e.g. @Measurement, @Threads, and @Fork), which can be used both on concrete @GenerateMicroBenchmark-annotated methods and on the classes containing the target methods. Class-level annotations will be honored first, then any method-level annotations, as sketched below.
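A minimal sketch of how class-level defaults combine with method-level overrides (the annotation values are illustrative, and the JMH annotations are assumed to be imported):

@Fork(1)
@Measurement(iterations = 10)
public class StreamBenchmark {

    // Inherits the class-level @Fork and @Measurement settings.
    @GenerateMicroBenchmark
    public long defaults() {
        return LongStream.rangeClosed(1, 1000).sum();
    }

    // Method-level annotation takes precedence over the class-level default.
    @GenerateMicroBenchmark
    @Measurement(iterations = 20)
    public long moreIterations() {
        return LongStream.rangeClosed(1, 1000).sum();
    }
}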
State
The @State annotation defines the scope in which an instance of a given class will be available. JMH allows you to run tests in multiple threads simultaneously, so choose the right state:
- Scope.Thread: This is the default state. An instance will be allocated for each thread running the given test.
- Scope.Benchmark: An instance will be shared across all threads running the same test. Could be used to test the multithreaded performance of a state object (or just mark your benchmark with this scope).
- Scope.Group: An instance will be allocated per thread group (see the Groups section down below).
Besides marking a separate class with @State, you can also mark your own benchmark class with @State. All the scope rules above apply to this case as well.
@State means benchmark data is shared across the benchmark, a thread, or a group of threads. It also allows performing fixtures (setup and teardown) in the scope of the whole run, an iteration, or a single execution, as sketched below.
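A minimal sketch of a per-thread state class with fixtures (the field and fixture levels are illustrative; this assumes the @Setup and @TearDown annotations from the JMH annotations package):

@State(Scope.Thread)
public static class ThreadState {
    List<Long> scratch;

    // Runs once before each iteration, for this thread's instance only.
    @Setup(Level.Iteration)
    public void setUp() {
        scratch = new ArrayList<>();
    }

    // Runs once after each iteration.
    @TearDown(Level.Iteration)
    public void tearDown() {
        scratch.clear();
    }
}

@GenerateMicroBenchmark
public int useState(ThreadState state) {
    // The state instance is injected as a method argument.
    state.scratch.add(42L);
    return state.scratch.size();
}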
@Threads
A simple way to run a concurrent test, provided you have defined the correct @State.
@Group
Groups threads to assign them a particular role in the benchmark.
BenchmarkMode
You can use the following test modes, specified with the @BenchmarkMode annotation on the test methods (a usage sketch follows the list):
- Mode.Throughput: Calculates the number of operations in a time unit.
- Mode.AverageTime: Calculates the average running time per operation.
- Mode.SampleTime: Calculates how long it takes for a method to run (including percentiles).
- Mode.SingleShotTime: Just runs a method once (useful for cold-testing mode), or more than once if you have specified a batch size for your iterations (see the @Measurement annotation below); in this case JMH will calculate the batch running time (the total time for all invocations in a batch).
- Any set of these modes: You can specify any set of these modes; the test will be run several times (depending on the number of requested modes).
- Mode.All: All these modes, one after another.
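A minimal sketch of combining modes with a time unit on a benchmark method (the method body is illustrative):

@GenerateMicroBenchmark
@BenchmarkMode({Mode.AverageTime, Mode.SampleTime})
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public long sumRange() {
    // Measured once per requested mode; the same time unit applies to both.
    return LongStream.rangeClosed(1, 10_000).sum();
}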
Time units
You can specify the time unit to use via @OutputTimeUnit, which requires an argument of the standard Java type java.util.concurrent.TimeUnit.
Unfortunately, if you have specified several test modes for one test, the given time unit will be used for all of them (for example, it may be convenient to measure SampleTime in nanoseconds, but throughput is better measured in longer time units).
BlackHole for Dead Code Results
Dead code elimination is a well-known problem among microbenchmark writers. The general solution is to use the result of the calculations somehow; JMH does not do any magic tricks on its own. If you want to defend against dead code elimination, never write void tests. Always return the result of your calculations, and JMH will take care of the rest. If you need to return more than one value from your test, either combine all return values with some cheap operation (cheap compared to the cost of the operations by which you got your results) or use a BlackHole method argument and sink all your results into it (note that BlackHole.consume may be more expensive than manually combining results in some cases). BlackHole is a thread-scoped class.
@GenerateMicroBenchmark(BenchmarkType.All)
public void arrayListIterator(BlackHole bh) {
    Iterator<Integer> iterator = arrayList.iterator();
    while (iterator.hasNext()) {
        Integer i = iterator.next();
        // Sink each element so the loop cannot be optimized away.
        bh.consume(i);
    }
}
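For comparison, a minimal sketch of the return-based alternative mentioned above, combining the values with a cheap operation instead of a BlackHole (the arrayList field is assumed from the surrounding benchmark class):

@GenerateMicroBenchmark(BenchmarkType.All)
public int arrayListSum() {
    int sum = 0;
    for (Integer i : arrayList) {
        sum += i; // cheap combining operation
    }
    return sum; // returning the result defeats dead code elimination
}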