Java performance profiling using flame graphs

[Figure: MySQL flame graph from Brendan Gregg’s article]

Mahesh Senniappan, Apr 15

One of the great advantages of microservices is that, when there is an issue, you already have a pretty good idea of where it is happening and which microservice is responsible for it.

And if it is a performance issue, you have a manageable amount of code or libraries to investigate, rather than dealing with the monolith as a whole.

There are a lot of performance measurement tools that come as part of the JDK itself, such as JConsole, VisualVM, and HPROF.

Most of them profile the application as a whole and it would take some effort to get to class or method level hot spots.

While I was trying to evaluate the performance of one of our microservices, I came across a technique based on flame graphs, which I found very effective for finding out where the code spends its CPU time.

This post is more of a how-to and all credits go to Brendan Gregg.

Requirements

- A Linux machine with perf
- JDK 8u60 or later
- FlameGraph visualizer
- jvm-profiling-tools/perf-map-agent
- An application to profile :)

I used an EC2 machine running RHEL 7 for this exercise. Although I never tried it, I expect Vagrant or VirtualBox should also work.

If the application is an API, you need a load testing tool like JMeter or wrk to generate traffic for the API.

Steps

At a very high level, this is what needs to be done.

1. Install perf_events
2. Build perf-map-agent from source
3. Run the Java application on the machine with the -XX:+PreserveFramePointer JVM option
4. Generate load for the application using a load testing tool
5. Run perf record to capture a performance counter profile
6. Run perf-map-agent to generate a map for JIT-compiled methods
7. Generate stack trace output from the previously recorded data by running perf script
8. Generate the flame graph

I am using a RHEL machine, so the commands in this post are based on it, but it should be easy to find equivalent commands for your OS.
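Steps 5 to 8 above can be sketched as one small shell helper that prints the commands in order. This is only an illustration: profile_commands, AGENT_DIR and FLAMEGRAPH_DIR are hypothetical names, and the default paths assume that perf-map-agent and FlameGraph were cloned into the current directory and that the agent script can be called from outside its own directory.

```shell
# Hypothetical helper: print the profiling commands for one JVM, in order.
# AGENT_DIR and FLAMEGRAPH_DIR are assumed checkout locations, not fixed paths.
AGENT_DIR="${AGENT_DIR:-./perf-map-agent}"
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-./FlameGraph}"

profile_commands() {
  pid="$1"            # PID of the running Java process
  seconds="${2:-10}"  # sampling duration, defaults to ten seconds
  cat <<EOF
perf record -F 99 -p ${pid} -g -- sleep ${seconds}
${AGENT_DIR}/bin/create-java-perf-map.sh ${pid}
perf script > out.perf
${FLAMEGRAPH_DIR}/stackcollapse-perf.pl out.perf > out.folded
${FLAMEGRAPH_DIR}/flamegraph.pl out.folded > graph.svg
EOF
}
```

Printing the plan first makes it easy to review. Once it looks right, something like `profile_commands 1234 | sh -ex` would run each command and stop at the first failure.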

Let’s look at each of these steps in detail now.

Install perf_events

As the flame graphs are generated from the output of Linux perf_events, the first step is to install it, which provides the perf CLI command.

Command to install perf_events:

    yum install perf

Build perf-map-agent

When an application is running, the JVM performs just-in-time (JIT) compilation of the byte code at runtime to optimize frequently used “hot” code.

The byte code is converted to native code to improve performance and this native code is stored in memory.

When perf samples the process, it sees only the memory addresses of this native code, not the actual Java class or method names.

A tool like perf-map-agent attaches to a running JVM process and exports a map file that perf can use to generate stack traces with the actual Java method names.
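To make the idea concrete: a perf map file is plain text with one line per symbol, holding the start address and size in hexadecimal followed by the symbol name. The sketch below shows the kind of lookup such a file makes possible. resolve_addr is a hypothetical helper written for illustration, not something perf or perf-map-agent provides, and perf's own lookup is implemented differently.

```shell
# Resolve a hex address against a perf map file (lines: START SIZE NAME, hex).
# resolve_addr is a hypothetical helper for illustration, not part of perf.
resolve_addr() {
  map_file="$1"
  addr=$(( 0x$2 ))
  while read -r start size name; do
    [ -n "$start" ] || continue          # skip blank lines
    s=$(( 0x$start ))                    # first address of the symbol
    e=$(( s + 0x$size ))                 # one past its last address
    if [ "$addr" -ge "$s" ] && [ "$addr" -lt "$e" ]; then
      printf '%s\n' "$name"
      return 0
    fi
  done < "$map_file"
  printf '[unknown]\n'                   # no symbol covers this address
  return 1
}
```

Called as `resolve_addr <map file> <hex address>`, it prints the name of the symbol whose address range contains the address, or [unknown] when nothing matches. Unmapped JIT frames are similarly unhelpful in a profile.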

To build perf-map-agent follow the instructions in the source repo.

It should be something like this:

    git clone https://github.com/jvm-profiling-tools/perf-map-agent.git
    export JAVA_HOME=<JDK_DIR>
    cd perf-map-agent
    yum install cmake
    cmake .
    make

Run the application

The next step is to run the application with the JVM option -XX:+PreserveFramePointer.

Frame pointers are commonly used to provide information to the debuggers about the call stack.

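By default, the JIT compiler reuses the frame pointer register as a general-purpose register, which breaks frame-pointer-based stack walking. With this option set, the JVM preserves the frame pointer, so perf can construct more accurate stack traces for the currently executing method.

This option requires JDK 8u60 or later.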

    java -XX:+PreserveFramePointer -jar app.jar

Keep the application running until both the performance profile (perf record) and the symbol map (perf-map-agent) have been captured.

Generate load

Generate load for your application using any of the load testing tools or a different approach depending on the application.

Capture performance profile

When the application is running, start capturing the CPU profile using perf_events with the following command (it assumes only one Java process is running, so that pgrep returns a single PID):

    perf record -F 99 -p `pgrep java` -g -- sleep 10

- -F 99: sample at this frequency (99 Hz)
- -p: profile an existing process with this PID
- -g: record call graphs
- sleep 10: profile for ten seconds

Once the profiling completes after ten seconds, this command will have written a file called perf.data to the current directory.

Export symbols

Assuming you have already built perf-map-agent, run the following command from the perf-map-agent directory while the application is running to generate a map file of JVM symbols:

    bin/create-java-perf-map.sh `pgrep java`

This command will create the file /tmp/perf-<PID>.map.

The application can be stopped at this point and is not needed for the subsequent steps.

Generate trace output

Now that we have the profile data and the symbol map, we can generate a detailed trace output of the profiled information.

Run this command in the same directory as the perf.data file generated earlier:

    perf script > out.perf

This command will look for the map file in /tmp and use it to generate the output.

If the .map file is not present in /tmp, perf cannot resolve the JIT-compiled frames to Java method names.

Flame graph

Get the scripts to generate the flame graph from the source repo.

Run the scripts by passing the trace output generated earlier.

    git clone --depth 1 https://github.com/brendangregg/FlameGraph.git
    ./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
    ./FlameGraph/flamegraph.pl out.folded > graph.svg

graph.svg is the flame graph and it can be opened in your favorite browser to explore.
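The intermediate out.folded file is also useful on its own. Each line holds one stack, with frames separated by semicolons from root to leaf, followed by a sample count. As a rough sketch, the helper below sums samples per leaf frame, which gives a quick text view of the methods that were on-CPU most often. top_frames is a hypothetical name for this illustration, not a script from the FlameGraph repo.

```shell
# Sum samples per leaf (on-CPU) frame in a folded stack file, busiest first.
# top_frames is a hypothetical helper; $1 = folded file, $2 = rows to show.
top_frames() {
  awk '{
    count = $NF                       # last field is the sample count
    stack = $0
    sub(/ [0-9]+$/, "", stack)        # drop the trailing count
    n = split(stack, frames, ";")     # frames are ordered root first, leaf last
    total[frames[n]] += count
  }
  END { for (f in total) print total[f], f }' "$1" | sort -k1,1nr -k2 | head -n "${2:-10}"
}
```

For example, `top_frames out.folded 20` would list the twenty leaf frames with the most samples. It only counts time spent in the frame itself, so it complements the flame graph and does not replace it.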

Conclusion

Check out the reference articles and video linked below to get more information about flame graphs.

Hope you found this useful.

References

- CPU Flame Graphs: "On this page I'll introduce and explain CPU flame graphs, list generic instructions for their creation, then discuss…" (www.brendangregg.com)
- Java in Flames: "mixed-mode flame graphs provide a complete visualization of CPU usage" (medium.com)
