tradeoffs, bad science, and polar bears: the world of java optimisation
Holly Cummins IBM @holly_cummins
Slide 10
why optimise?
0.5s extra search page time: 20% drop in traffic
100 ms latency on page load: 7% lower conversion rate
10 ms delay in trading platform: 10% drop in revenue
Slide 11
what is optimising?
Slide 12
“make it go faster” for whom? when? doing what?
Slide 13
design thinking
Slide 14
Slide 26
performance can be:
throughput, latency, capacity, utilisation
transactions per second, response time, ramp-up time, start-up time, footprint, CPU usage
… #IBM
Slide 28
Never underestimate the bandwidth [throughput] of a station wagon full of tapes hurtling down the highway. –Andrew Tanenbaum, 1981
but the latency is terrible …
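The station wagon trade-off can be made concrete with a little arithmetic. This is a minimal sketch: the tape count, tape capacity, distance, and speed are all hypothetical numbers chosen for illustration, not figures from the talk.

```java
class StationWagon {
    /** Seconds for the wagon to cover the distance; nothing arrives before then, so this is the latency. */
    static double latencySeconds(double distanceKm, double speedKmh) {
        return distanceKm / speedKmh * 3600.0;
    }

    /** Average throughput in bits per second, measured over the whole trip. */
    static double throughputBitsPerSecond(long tapes, double bytesPerTape,
                                          double distanceKm, double speedKmh) {
        double totalBits = tapes * bytesPerTape * 8.0;
        return totalBits / latencySeconds(distanceKm, speedKmh);
    }

    public static void main(String[] args) {
        // Hypothetical load: 1000 tapes of 1 TB each, driven 100 km at 100 km/h.
        double latency = latencySeconds(100, 100);
        double throughput = throughputBitsPerSecond(1000, 1e12, 100, 100);
        System.out.printf("latency: %.0f s, throughput: %.0f Gbit/s%n",
                latency, throughput / 1e9);
    }
}
```

With these made-up numbers the throughput is in the terabits-per-second range, but the first byte takes a full hour to arrive.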
Slide 29
requirements change
Slide 30
Slide 31
Slide 32
Slide 33
Slide 34
I am not designed for this.
Slide 35
the world changes
Slide 36
Slide 38
-Xmx == $
footprint
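Since -Xmx caps the heap, one rough way to see how much of that budget an application is using is to ask the JVM itself. The sketch below uses only the standard `Runtime` API; the class and method names are hypothetical, and heap usage is only one part of a process's total footprint.

```java
class HeapFootprint {
    /** Heap currently in use, in bytes (committed heap minus free space within it). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper bound the JVM will try to use for the heap; Long.MAX_VALUE if there is no limit. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println("used heap (MB): " + usedHeapBytes() / mb);
        System.out.println("max heap (MB):  " + maxHeapBytes() / mb);
    }
}
```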
Slide 39
Slide 40
which performs better?
Slide 42
quarkus: trading off flexibility against startup speed and footprint
uhh … are you supposed to shut down applications after using them?
Slide 43
behaviour at idle
30% of VMs are zombies (
Slide 44
how to optimise?
Slide 45
find the bottleneck.
fix it.
Slide 46
pitfall 1
Slide 47
this is not the place for ideas
Slide 48
measure, don’t guess.
Slide 50
measure the right thing
what do your users care about?
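As a rough illustration of measuring rather than guessing, here is a minimal timing sketch. The class name, the workload, and the warm-up and run counts are all hypothetical. For serious microbenchmarks a dedicated harness such as JMH is a better choice, and timing a code fragment still says nothing about whether users care about it.

```java
import java.util.Arrays;
import java.util.function.Supplier;

class MeasureDontGuess {
    // Written to on every run so the JIT cannot discard the measured work as dead code.
    static volatile int sink;

    /** Runs the task `warmups` times untimed, then returns the median of `runs` timed executions. */
    static long medianNanos(Supplier<?> task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink = task.get().hashCode();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink = task.get().hashCode();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        // Hypothetical workload: build a 10,000-character string.
        Supplier<String> task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append('x');
            }
            return sb.toString();
        };
        System.out.println("median ns: " + medianNanos(task, 1_000, 101));
    }
}
```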
Slide 51
pitfall 2
Slide 52
Slide 61
leading indicators: predictive of a thing we care about, hard to identify, easy to change
lagging indicators: we care about them, easy to measure, hard to change
Slide 62
caution: performance experiments for entertainment purposes only. do not try these at home.
Slide 63
Slide 65
bad-ish advice: “reduce time spent in garbage collection”
actually, garbage collection can make your application go faster
-verbose:gc -Xverbosegclog:gclog.xml -Xgcpolicy:optthruput -Xmx300m -Xms300m -Xcompactgc
why does the performance stay exactly the same no matter what gc settings I choose?
Slide 75
by the way, this is cheating. (remember the ‘bad science’?)
Slide 76
Slide 77
Slide 78
Slide 79
Slide 80
Slide 81
Slide 82
Slide 83
Slide 84
Slide 85
Slide 92
lagging indicator ?
leading indicator ?
total GC time: 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s
total GC time: 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
Slide 93
so wait, what changed to make the app faster?
running JMeter on the same machine as the app gives a big speedup!
Slide 94
“Any improvements made anywhere besides the bottleneck are an illusion.” – Gene Kim
Slide 95
time kills all performance advice (even mine)
Slide 96
the takeaways:
• gc can improve performance by rearranging the heap
• find the bottleneck
• validate advice independently
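Figures like the total GC time and percentage of time in GC quoted earlier can be approximated from inside a running JVM with the standard management beans. This is a sketch under assumptions: the class and method names are hypothetical, and `getCollectionTime()` reports approximate accumulated collection time, which is not necessarily identical to pause time for every collector. The talk's own numbers may well have come from a verbose GC log rather than this API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcOverhead {
    /** Sum of the approximate collection time, in milliseconds, across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time as a percentage of JVM uptime so far. */
    static double gcPercentOfUptime() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis == 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.println("total GC time (ms): " + totalGcMillis());
        System.out.printf("time in GC: %.1f%%%n", gcPercentOfUptime());
    }
}
```

As the slides point out, this is only a leading indicator: it is worth recording next to a lagging one such as transactions per second before drawing conclusions.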
Slide 97
pitfall 3
Slide 98
I read it on the internet!
Slide 99
“make one big method because method dispatching is slow”
Slide 100
“re-use your objects to help the garbage collector”
use StringBuilder, never concatenate strings with +=
Slide 103
noooooo! wait, what? yes, right?
use StringBuilder, never concatenate strings with +=
Slide 104
2 things ruin advice:
• context
• time
Slide 105
pitfall 4
Slide 106
Slide 107
static String beSlow() {
    String result = "";
    for (int i = 0; i < 314159; i++) {
        result += getStringData(i);
    }
    return result;
}
Slide 111
this never gets called
@Override
public String toString() {
    String ret = "\n\tMarket Summary at: " + getSummaryDate()
            + "\n\t\t TSIA:" + getTSIA()
            + "\n\t\t openTSIA:" + getOpenTSIA()
            + "\n\t\t gain:" + getGainPercent()
            + "\n\t\t volume:" + getVolume();
    if ((getTopGainers() == null) || (getTopLosers() == null)) {
        return ret;
    }
    ret += "\n\t\t Current Top Gainers:";
    Iterator<QuoteDataBean> it = getTopGainers().iterator();
    while (it.hasNext()) {
        QuoteDataBean quoteData = it.next();
        ret += ("\n\t\t\t" + quoteData.toString());
    }
    ret += "\n\t\t Current Top Losers:";
    it = getTopLosers().iterator();
    while (it.hasNext()) {
        QuoteDataBean quoteData = it.next();
        ret += ("\n\t\t\t" + quoteData.toString());
    }
    return ret;
}
Slide 112
let’s make travel energy-efficient?
Slide 113
every little helps?
Slide 114
every little helps? every optimisation is another optimisation you aren’t doing
Slide 115
our platforms help
Slide 116
static String beSlow() {
    String result = "";
    for (int i = 0; i < 314159; i++) {
        result += getStringData(i);
    }
    return result;
}
Slide 118
static String beSlow() {
    String result = "";
    result += getStringData(1);
    result += getStringData(2);
    result += getStringData(3);
    return result;
}
this is fine
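The contrast between the loop and the straight-line version can be sketched as follows. In a long loop, each += copies the string accumulated so far, so the total copying grows roughly quadratically with the number of iterations; a StringBuilder appends into a growable buffer instead. With a handful of concatenations the difference is negligible. The `getStringData` stub here is a hypothetical stand-in, since the talk does not show its body.

```java
class ConcatDemo {
    // Hypothetical stand-in for the getStringData method on the slides.
    static String getStringData(int i) {
        return Integer.toString(i);
    }

    /** Concatenation with += inside a loop: a new String is created on every iteration. */
    static String concatInLoop(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += getStringData(i);
        }
        return result;
    }

    /** Same output, but appending into a single growable buffer. */
    static String buildInLoop(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(getStringData(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both approaches produce the same string; only the amount of work differs.
        System.out.println(concatInLoop(1_000).equals(buildInLoop(1_000)));
    }
}
```

Whether the difference matters in a given application is exactly the kind of thing to measure rather than assume, especially if the code in question is rarely or never called.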
Slide 119
the JVM writers have far more time for optimising than you do
clean, typical code runs best
Slide 120
ok, but how to optimise?
Slide 121
Slide 122
“What you can optimize is limited to what you can observe.” -Susie Xia, Netflix
Slide 123
Slide 136
method profiler: flame graphs, VisualVM, Mission Control
GC analysis: IBM Health Center (for OpenJ9)
heap analysis: Eclipse MAT
APM: AppDynamics*, New Relic*
distributed tracing
* not free
this is an incomplete list, because there are a lot of tools out there, and many cost money
Slide 137
optimising a micro-service: is that micro-optimising?
Netflix microservice architecture
Slide 138
you may need to know the whole system context to know what to optimise
Slide 139
“Nines don’t matter if your users aren’t happy.” – Charity Majors
Slide 140
don’t forget the edges
queueing theory helps us understand where the disasters happen
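One way to see where the disasters happen is the textbook single-server (M/M/1) queue, where mean response time is the service time divided by (1 - utilisation). The choice of this model, the class name, and the 10 ms service time are assumptions made for illustration; real systems rarely match the model exactly.

```java
class QueueingSketch {
    /** Mean response time for an M/M/1 queue: R = S / (1 - utilisation). */
    static double meanResponseTime(double serviceTimeMs, double utilisation) {
        if (utilisation < 0.0 || utilisation >= 1.0) {
            throw new IllegalArgumentException("utilisation must be in [0, 1)");
        }
        return serviceTimeMs / (1.0 - utilisation);
    }

    public static void main(String[] args) {
        double serviceTimeMs = 10.0; // hypothetical service time per request
        for (double u : new double[] {0.5, 0.8, 0.9, 0.95, 0.99}) {
            System.out.printf("utilisation %.0f%% -> mean response %.0f ms%n",
                    u * 100, meanResponseTime(serviceTimeMs, u));
        }
    }
}
```

Under this model a 10 ms request takes 20 ms on average at 50% utilisation, about 100 ms at 90%, and about 1000 ms at 99%: the curve is flat for a long time and then very steep near the edge.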
Slide 141
“When it comes to IT performance, amateurs look at averages. Professionals look at distributions.” – Avishai Ish-Shalom
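A small sketch of why distributions tell a different story from averages. The latency samples are invented for illustration (95 fast requests and 5 slow ones), and the percentile uses the simple nearest-rank definition.

```java
import java.util.Arrays;

class LatencyDistribution {
    static double mean(long[] samplesMs) {
        long sum = 0;
        for (long s : samplesMs) {
            sum += s;
        }
        return (double) sum / samplesMs.length;
    }

    /** Nearest-rank percentile: the smallest sample with at least percent% of samples at or below it. */
    static long percentile(long[] samplesMs, int percent) {
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percent * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Hypothetical data: 95 requests at 10 ms and 5 requests at 1000 ms. */
    static long[] sample() {
        long[] samples = new long[100];
        Arrays.fill(samples, 0, 95, 10L);
        Arrays.fill(samples, 95, 100, 1000L);
        return samples;
    }

    public static void main(String[] args) {
        long[] s = sample();
        System.out.println("mean ms: " + mean(s));
        System.out.println("p50 ms:  " + percentile(s, 50));
        System.out.println("p99 ms:  " + percentile(s, 99));
    }
}
```

For this data the mean is 59.5 ms, a latency no request actually experienced, while the median is 10 ms and the 99th percentile is 1000 ms.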
Slide 142
slow performance can turn into big cloud bills
make cloud costs visible to engineers
Slide 143
ok, but you promised bears
Slide 144
if you leave the TV on when you’re not using it, you’re a polar bear murderer
Slide 146
there is a moral imperative to avoid waste
• electricity
• hardware
Slide 147
data centres use 1-2% of the world’s electricity
Slide 153
higher efficiency
fewer devices
lower footprint
more multitenancy
longer lifetime
the end of planned obsolescence?
optimise for longevity
Slide 154
sooo …
you can optimise, and it can be fun
measure, don’t guess
only optimise what matters
now for questions!