Writing micro-benchmarks in Java is famously hard/tricky/complicated. If you don't think so, read these articles:
- Robust Java Benchmarking
- The perils of benchmarking under dynamic compilation
- How to write a benchmark? (from StackOverflow)
Get started
I'll assume you have Ant sorted and nothing more, so you'll need to add Ivy support to your build. You can use the following build file, which I cannibalised from the Ivy sample:
<project xmlns:ivy="antlib:org.apache.ivy.ant" name="TheBestestProjectEver" default="build">
    <property name="ivy.install.version" value="2.0.0-beta1" />
    <property name="ivy.jar.dir" value="${basedir}/ivy" />
    <property name="ivy.jar.file" value="${ivy.jar.dir}/ivy.jar" />
    <property name="allocation.jar.file" value="${basedir}/lib/allocation.jar" />
    <condition property="skip.download.ivy">
        <and>
            <available file="${ivy.jar.file}"/>
        </and>
    </condition>
    <condition property="skip.download.allocation">
        <and>
            <available file="${allocation.jar.file}"/>
        </and>
    </condition>
    <property name="build.dir" value="build" />
    <property name="src.dir" value="src" />
    <property name="experiments.dir" value="experiments" />
    <target name="download-ivy" unless="skip.download.ivy">
        <mkdir dir="${ivy.jar.dir}" />
        <!--
            download Ivy from web site so that it can be used even without any special installation
        -->
        <echo message="downloading ivy..." />
        <get src="http://repo1.maven.org/maven2/org/apache/ivy/ivy/${ivy.install.version}/ivy-${ivy.install.version}.jar" dest="${ivy.jar.file}" usetimestamp="true" />
    </target>
    <target name="download-allocation-jar" unless="skip.download.allocation">
        <mkdir dir="lib" />
        <!--
            download allocation jar; I'm sure there's a better way to do this with Ivy, but this works
        -->
        <echo message="downloading allocation jar.." />
        <get src="http://java-allocation-instrumenter.googlecode.com/files/allocation.jar" dest="${allocation.jar.file}" usetimestamp="true" />
    </target>
    <!--
        this target is not necessary if you put ivy.jar in your ant lib directory;
        if you already have ivy in your ant lib, you can simply remove this target
        and the dependency the 'build' target has on it
    -->
    <target name="install-ivy" depends="download-ivy" description="--> install ivy">
        <!--
            try to load ivy here from local ivy dir, in case the user has not already dropped
            it into ant's lib dir (note that the latter copy will always take precedence).
            We will not fail as long as local lib dir exists (it may be empty) and
            ivy is in at least one of ant's lib dir or the local lib dir.
        -->
        <path id="ivy.lib.path">
            <fileset dir="${ivy.jar.dir}" includes="*.jar" />
        </path>
        <taskdef resource="org/apache/ivy/ant/antlib.xml" uri="antlib:org.apache.ivy.ant" classpathref="ivy.lib.path" />
    </target>
    <!-- Build using the classpath defined by the ivy.xml file found in the same directory -->
    <target name="build" depends="install-ivy" description=" resolve dependencies, and compile project">
        <ivy:cachepath pathid="lib.path.id" />
        <echo message="compiling..." />
        <mkdir dir="${build.dir}" />
        <javac srcdir="${src.dir}" destdir="${build.dir}" classpathref="lib.path.id" />
        <javac srcdir="${experiments.dir}" destdir="${build.dir}" classpathref="lib.path.id" />
    </target>
</project>
And the ivy.xml that sits next to it:
<ivy-module version="2.0">
    <info organisation="psy.lob.saw" module="experiments"/>
    <dependencies>
        <dependency org="com.google.caliper" name="caliper" rev="0.5-rc1">
            <exclude name='java-allocation-instrumenter' />
        </dependency>
    </dependencies>
</ivy-module>
You can use an Eclipse plugin to support Ivy integration and bring the jars into your project. You'll still need to get the allocation.jar and sort it out as described below.
Now that we've got through the boring bits, let's see why we bothered.
UTF-8 Encoding benchmarks: Before and after
To give context to this tool you need to review how hand-rolled benchmarks often look. In this case I'll just revisit a benchmark I did for a previous post, measuring different methods of encoding UTF-8 Strings into a byte buffer. The full code base is here, but here's the original code used for benchmarking and comparing (written by Evan Jones):
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.util.ArrayList;

public class StringEncodingTest {
    private static void error() {
        System.err.println("(bytebuffer|string|chars|custom) (once|reuse) (buffer|array|bytebuffer) (input strings)");
        System.exit(1);
    }

    private static enum OutputMode {
        ARRAY, REUSE_BUFFER, NEW_BYTEBUFFER,
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            error();
            return;
        }
        byte[] destination = new byte[4096];
        UTF8Encoder encoder;
        if (args[0].equals("bytebuffer")) {
            encoder = new DirectEncoder(destination);
        } else if (args[0].equals("string")) {
            encoder = new StringEncoder(destination);
        } else if (args[0].equals("string2")) {
            encoder = new StringEncoder2(destination);
        } else if (args[0].equals("chars")) {
            encoder = new CharBufferCopyEncoder(destination);
        } else if (args[0].equals("custom")) {
            encoder = new CustomEncoder(destination);
        } else {
            error();
            return;
        }
        boolean reuseEncoder = true;
        if (args[1].equals("once")) {
            reuseEncoder = false;
        } else if (!args[1].equals("reuse")) {
            error();
            return;
        }
        OutputMode outputMode;
        if (args[2].equals("array")) {
            outputMode = OutputMode.ARRAY;
        } else if (args[2].equals("buffer")) {
            outputMode = OutputMode.REUSE_BUFFER;
        } else if (args[2].equals("bytebuffer")) {
            outputMode = OutputMode.NEW_BYTEBUFFER;
        } else {
            error();
            return;
        }
        ArrayList<String> strings = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(args[3]), "UTF-8"));
        String line;
        while ((line = reader.readLine()) != null) {
            strings.add(line);
        }
        // ~ final int ITERATIONS = 5000000;
        // ~ final int ITERATIONS = 1000000;
        // ~ final int ITERATIONS = 10000;
        final int ITERATIONS = 1000;
        for (int j = 0; j < 50; ++j) {
            long start = System.nanoTime();
            testLoop(destination, encoder, reuseEncoder, outputMode, strings,
                    ITERATIONS);
            long end = System.nanoTime();
            System.out.println(((double) end - start) / 1000000. + " millis");
            System.gc();
        }
    }

    private static void testLoop(byte[] destination, UTF8Encoder encoder,
            boolean reuseEncoder, OutputMode outputMode,
            ArrayList<String> strings, final int ITERATIONS)
            throws UnsupportedEncodingException {
        for (int i = 0; i < ITERATIONS; ++i) {
            encodeLoop(destination, encoder, reuseEncoder, outputMode, strings);
        }
    }

    private static void encodeLoop(byte[] destination, UTF8Encoder encoder,
            boolean reuseEncoder, OutputMode outputMode, ArrayList<String> strings) throws UnsupportedEncodingException {
        for (String value : strings) {
            UTF8Encoder temp = encoder;
            if (!reuseEncoder) {
                temp = encoder.newInstance();
            }
            if (outputMode == OutputMode.REUSE_BUFFER) {
                int bytes = temp.encode(value);
                assert new String(destination, 0, bytes, "UTF-8").equals(value);
            } else if (outputMode == OutputMode.ARRAY) {
                byte[] out = temp.encodeToArray(value);
                assert new String(out, "UTF-8").equals(value);
            } else {
                assert outputMode == OutputMode.NEW_BYTEBUFFER;
                ByteBuffer out = temp.encodeToNewBuffer(value);
                assert new String(out.array(), 0, out.remaining(), "UTF-8")
                        .equals(value);
            }
        }
    }
}
And here is the Caliper version:
package utf8;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;

import com.google.caliper.Param;
import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;

public class Utf8EncodingBenchmark extends SimpleBenchmark {
    @Param(value = "utf8.txt")
    String stringsFile;
    @Param({"true","false"})
    boolean direct;

    private ArrayList<String> strings = new ArrayList<String>();
    private ByteBuffer dest;
    private char[] chars;
    private CharBuffer charBuffer;
    private CharsetEncoder encoder;
    private CustomUtf8Encoder customEncoder;

    @Override
    protected void setUp() throws IOException {
        // @Param values are guaranteed to have been injected by now
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new InputStreamReader(new FileInputStream(stringsFile), "UTF-8"));
            String line;
            while ((line = reader.readLine()) != null) {
                strings.add(line);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            if (reader != null)
                reader.close();
        }
        if (direct) {
            dest = ByteBuffer.allocateDirect(4096);
        } else {
            dest = ByteBuffer.allocate(4096);
        }
        chars = new char[4096];
        charBuffer = CharBuffer.wrap(chars);
        encoder = Charset.forName("UTF-8").newEncoder();
        customEncoder = new CustomUtf8Encoder();
    }

    public int timeCustomEncoder(int reps) {
        int countBytes = 0;
        for (int i = 0; i < reps; i++) {
            for (int stringIndex = 0; stringIndex < strings.size(); stringIndex++) {
                customEncoder.encodeString(strings.get(stringIndex), dest);
                countBytes += dest.position();
                dest.clear();
            }
        }
        return countBytes;
    }

    public int timeStringGetBytes(int reps) throws UnsupportedEncodingException {
        int countBytes = 0;
        for (int i = 0; i < reps; i++) {
            for (int stringIndex = 0; stringIndex < strings.size(); stringIndex++) {
                dest.put(strings.get(stringIndex).getBytes("UTF-8"));
                countBytes += dest.position();
                dest.clear();
            }
        }
        return countBytes;
    }

    public int timeCharsetEncoder(int reps) throws UnsupportedEncodingException {
        int countBytes = 0;
        for (int i = 0; i < reps; i++) {
            for (int stringIndex = 0; stringIndex < strings.size(); stringIndex++) {
                String source = strings.get(stringIndex);
                source.getChars(0, source.length(), chars, 0);
                charBuffer.clear();
                encoder.reset();
                charBuffer.limit(source.length());
                encoder.encode(charBuffer, dest, true);
                countBytes += dest.position();
                dest.clear();
            }
        }
        return countBytes;
    }

    public static void main(String[] args) throws Exception {
        Runner.main(Utf8EncodingBenchmark.class, args);
    }
}
Running main gives you the following output:
 0% Scenario{vm=java, trial=0, benchmark=CustomEncoder, direct=true, stringsFile=utf8.txt} 107419.06 ns; σ=1032.57 ns @ 4 trials
17% Scenario{vm=java, trial=0, benchmark=StringGetBytes, direct=true, stringsFile=utf8.txt} 199795.25 ns; σ=6067.76 ns @ 10 trials
33% Scenario{vm=java, trial=0, benchmark=CharsetEncoder, direct=true, stringsFile=utf8.txt} 278708.17 ns; σ=10344.13 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=CustomEncoder, direct=false, stringsFile=utf8.txt} 102597.54 ns; σ=3081.05 ns @ 10 trials
67% Scenario{vm=java, trial=0, benchmark=StringGetBytes, direct=false, stringsFile=utf8.txt} 196083.46 ns; σ=4185.70 ns @ 10 trials
83% Scenario{vm=java, trial=0, benchmark=CharsetEncoder, direct=false, stringsFile=utf8.txt} 185340.89 ns; σ=17881.00 ns @ 10 trials

     benchmark direct  us linear runtime
 CustomEncoder   true 107 ===========
 CustomEncoder  false 103 ===========
StringGetBytes   true 200 =====================
StringGetBytes  false 196 =====================
CharsetEncoder   true 279 ==============================
CharsetEncoder  false 185 ===================

vm: java
trial: 0
stringsFile: utf8.txt
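Each scenario line above reports a time, a standard deviation (σ) and the number of trials it took. To make those numbers a bit less abstract, here is a rough, Caliper-free sketch of the same idea: warm up, run a handful of timed trials, then summarise them. The class and method names are made up for illustration, the workload is a stand-in, and I'm assuming a plain sample standard deviation. Caliper's real measurement logic is more involved and is not reproduced here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names come from Caliper.
final class TrialStatsSketch {

    // Arithmetic mean of the per-trial measurements.
    static double mean(double[] samples) {
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // Sample standard deviation (n - 1 in the denominator). This is an
    // assumption; Caliper may summarise its trials differently.
    static double stdDev(double[] samples) {
        if (samples.length < 2) {
            return 0;
        }
        double m = mean(samples);
        double sumSq = 0;
        for (double s : samples) {
            sumSq += (s - m) * (s - m);
        }
        return Math.sqrt(sumSq / (samples.length - 1));
    }

    // Stand-in workload: encode a string 'reps' times and return the byte
    // total, so the work has a visible result.
    static int work(String s, int reps) {
        int bytes = 0;
        for (int i = 0; i < reps; i++) {
            bytes += s.getBytes(StandardCharsets.UTF_8).length;
        }
        return bytes;
    }

    public static void main(String[] args) {
        final String input = "caf\u00e9 au lait";
        final int reps = 100_000;
        final int trials = 10;

        int sink = work(input, reps); // crude warm-up before measuring

        double[] nanosPerRep = new double[trials];
        for (int t = 0; t < trials; t++) {
            long start = System.nanoTime();
            sink += work(input, reps);
            nanosPerRep[t] = (System.nanoTime() - start) / (double) reps;
        }
        System.out.printf("%.2f ns; sd=%.2f ns @ %d trials (sink=%d)%n",
                mean(nanosPerRep), stdDev(nanosPerRep), trials, sink);
    }
}
```

A large standard deviation relative to the mean is a hint that the trials disagree with each other, which is exactly why a single measurement is not worth much.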
Caliper's Runner accepts several command-line options, including:
--trials <n> : runs several trials of your benchmark. Very important! You can't rely on a single measurement to draw conclusions.
--debug : runs the benchmark without spawning a new process, so that your breakpoints are hit easily when you debug it.
--warmupMillis <millis> : how long to warm up your code for.
--runMillis <millis> : how long a trial run should take.
--measureMemory : measures and compares allocations.
Isn't that great? Sadly, the last one (measureMemory) is a bit annoying to get working because:
- The dependency jar does not work
- Just getting the right jar is not enough because...
- You need to set up a magical environment variable: ALLOCATION_JAR
- Don't rename allocation.jar: the name is in the manifest and is required for the Java agent to work.
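Since these requirements are easy to get wrong, a small pre-flight check can save some head scratching. The sketch below is a hypothetical helper (it is not part of Caliper or of the allocation instrumenter) and it only verifies what the list above states: that ALLOCATION_JAR is set, that it points at an existing file, and that the file is still called allocation.jar.

```java
import java.io.File;

// Hypothetical pre-flight check; not part of Caliper.
final class AllocationJarCheck {

    // Returns null when the value looks usable, otherwise a short
    // description of what is wrong with it.
    static String problemWith(String allocationJarPath) {
        if (allocationJarPath == null || allocationJarPath.isEmpty()) {
            return "ALLOCATION_JAR is not set";
        }
        File jar = new File(allocationJarPath);
        if (!"allocation.jar".equals(jar.getName())) {
            return "the jar must keep its original name, allocation.jar";
        }
        if (!jar.isFile()) {
            return "no file found at " + jar.getAbsolutePath();
        }
        return null;
    }

    public static void main(String[] args) {
        String problem = problemWith(System.getenv("ALLOCATION_JAR"));
        System.out.println(problem == null
                ? "ALLOCATION_JAR looks fine"
                : "Problem: " + problem);
    }
}
```

It does not inspect the jar's manifest, so a correctly named jar from the wrong source would still pass.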
The following Ant target sets the variable and passes --measureMemory to the benchmark:
<target name="benchmark-utf8-encoding" depends="build,download-allocation-jar" description=" run experiments">
    <ivy:cachepath pathid="lib.path.id" />
    <java classpathref="lib.path.id" classname="utf8.Utf8EncodingBenchmark" fork="true">
        <classpath>
            <pathelement path="${build.dir}"/>
            <pathelement location="${allocation.jar.file}"/>
        </classpath>
        <env key="ALLOCATION_JAR" value="${allocation.jar.file}"/>
        <arg value="--measureMemory" />
    </java>
</target>
Running that target now reports allocations as well:
benchmark-utf8-encoding:
[java]  0% Scenario{vm=java, trial=0, benchmark=CustomEncoder, direct=true, stringsFile=utf8.txt} 99842.67 ns; σ=741.65 ns @ 3 trials, allocated 0 instances for a total of 0B
[java] 17% Scenario{vm=java, trial=0, benchmark=StringGetBytes, direct=true, stringsFile=utf8.txt} 204718.91 ns; σ=11685.98 ns @ 10 trials, allocated 2672 instances for a total of 166024B
[java] 33% Scenario{vm=java, trial=0, benchmark=CharsetEncoder, direct=true, stringsFile=utf8.txt} 267964.76 ns; σ=2541.32 ns @ 3 trials, allocated 0 instances for a total of 0B
[java] 50% Scenario{vm=java, trial=0, benchmark=CustomEncoder, direct=false, stringsFile=utf8.txt} 102537.55 ns; σ=5065.85 ns @ 10 trials, allocated 0 instances for a total of 0B
[java] 67% Scenario{vm=java, trial=0, benchmark=StringGetBytes, direct=false, stringsFile=utf8.txt} 202675.42 ns; σ=6581.97 ns @ 10 trials, allocated 2672 instances for a total of 166024B
[java] 83% Scenario{vm=java, trial=0, benchmark=CharsetEncoder, direct=false, stringsFile=utf8.txt} 197434.59 ns; σ=11170.45 ns @ 10 trials, allocated 0 instances for a total of 0B
[java]
[java]      benchmark direct instances      B    us linear runtime
[java]  CustomEncoder   true         0      0  99.8 ===========
[java]  CustomEncoder  false         0      0 102.5 ===========
[java] StringGetBytes   true      2672 166024 204.7 ======================
[java] StringGetBytes  false      2672 166024 202.7 ======================
[java] CharsetEncoder   true         0      0 268.0 ==============================
[java] CharsetEncoder  false         0      0 197.4 ======================
[java]
[java] vm: java
[java] trial: 0
[java] stringsFile: utf8.txt
BUILD SUCCESSFUL
Total time: 2 minutes 11 seconds
Finally, to their eternal credit, the writers of Caliper include a page on the project site which highlights some of the pitfalls and considerations around micro-benchmarks, so please "Beware the Jabberwock" :P
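One pitfall of this kind that is easy to show in code is dead code elimination: if a benchmark loop computes something and nobody uses the result, the JIT is free to remove the very work you are trying to measure. That is presumably why the time methods in the benchmark above return countBytes. A minimal sketch of the two shapes, with hypothetical names and no Caliper dependency:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; these methods are not from the post's code base.
final class DeadCodeSketch {

    // Risky shape: the encoded bytes are never used, so an optimising JIT
    // may legally drop some or all of the loop body.
    static void encodeAndDiscard(String s, int reps) {
        for (int i = 0; i < reps; i++) {
            s.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Safer shape: every iteration feeds into the return value, so the
    // work cannot simply be thrown away.
    static int encodeAndCount(String s, int reps) {
        int countBytes = 0;
        for (int i = 0; i < reps; i++) {
            countBytes += s.getBytes(StandardCharsets.UTF_8).length;
        }
        return countBytes;
    }

    public static void main(String[] args) {
        encodeAndDiscard("na\u00efve", 1_000);
        System.out.println(encodeAndCount("na\u00efve", 1_000) + " bytes encoded");
    }
}
```

Whether the first loop actually gets eliminated depends on the JVM and its settings, which is part of what makes hand-rolled numbers hard to trust.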
Enjoy.
Thanks for the stuff about the allocation.jar. That was annoying. I went to Maven Central for my artifact but the manifests are entirely different and it does not work.
If you're using Maven you must download the allocation.jar manually and set up a system scope dependency.
Would you be able to send me the pom.xml or post it as a gist to add to the completeness of the above HowTo?
Sure.
https://gist.github.com/kay/5031707