tradeoffs, bad science, and polar bears: the world of java optimisation
Holly Cummins IBM @holly_cummins
Slide 2
why optimise?
#IBM
@holly_cummins
Slide 3
why optimise?
#IBM
@holly_cummins
Slide 4
0.5s extra search page time
why optimise?
#IBM
@holly_cummins
Slide 5
0.5s extra search page time
20% drop in traffic
why optimise?
#IBM
@holly_cummins
Slide 6
0.5s extra search page time
20% drop in traffic
100 ms latency on page load
why optimise?
#IBM
@holly_cummins
Slide 7
0.5s extra search page time
20% drop in traffic
100 ms latency on page load
7% lower conversion rate
why optimise?
#IBM
@holly_cummins
Slide 8
0.5s extra search page time
20% drop in traffic
100 ms latency on page load
7% lower conversion rate
why optimise?
#IBM
@holly_cummins
Slide 9
0.5s extra search page time
20% drop in traffic
100 ms latency on page load
7% lower conversion rate
10 ms delay in trading platform
#IBM
why optimise?
@holly_cummins
Slide 10
0.5s extra search page time
20% drop in traffic
100 ms latency on page load
7% lower conversion rate
10 ms delay in trading platform
#IBM
10% drop in revenue
why optimise?
@holly_cummins
Slide 11
what is optimising?
#IBM
@holly_cummins
Slide 12
“make it go faster” for whom? when? doing what?
#IBM
@holly_cummins
Slide 13
design thinking
#IBM
@holly_cummins
Slide 14
#IBM
@holly_cummins
Slide 15
performance can be:
#IBM
@holly_cummins
Slide 16
performance can be: throughput
#IBM
@holly_cummins
Slide 17
performance can be: throughput
#IBM
transactions per second
@holly_cummins
Slide 18
performance can be: throughput
transactions per second
latency
#IBM
@holly_cummins
Slide 19
performance can be: throughput latency
#IBM
transactions per second
start-up time
@holly_cummins
Slide 20
performance can be: transactions per second
throughput latency
#IBM
response time
start-up time
@holly_cummins
Slide 21
performance can be: transactions per second
throughput latency
#IBM
response time
ramp-up time start-up time
@holly_cummins
Slide 22
performance can be: transactions per second
throughput latency
response time
ramp-up time start-up time
capacity
#IBM
@holly_cummins
Slide 23
performance can be: transactions per second
throughput latency capacity
#IBM
ramp-up time
response time
start-up time footprint
@holly_cummins
Slide 24
performance can be: transactions per second
throughput latency capacity
ramp-up time
response time
start-up time footprint CPU usage
#IBM
@holly_cummins
Slide 25
performance can be: transactions per second
throughput latency capacity utilisation
#IBM
ramp-up time
response time
start-up time footprint CPU usage
@holly_cummins
Slide 26
performance can be: transactions per second
throughput latency capacity utilisation
ramp-up time
response time
start-up time footprint CPU usage
…
#IBM
@holly_cummins
Slide 27
Never underestimate the bandwidth [throughput] of a station wagon full of tapes hurtling down the highway. –Andrew Tanenbaum, 1981
#IBM
@holly_cummins
Slide 28
Never underestimate the bandwidth [throughput] of a station wagon full of tapes hurtling down the highway. –Andrew Tanenbaum, 1981
but the latency is terrible …
#IBM
@holly_cummins
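The station wagon trade-off can be put into numbers. The sketch below is illustrative only: the tape count, tape size, and drive time are assumptions, not figures from the talk. Throughput is payload divided by delivery time, while latency is the whole delivery time, because nothing arrives until the car does.

```java
final class StationWagon {
    /** Effective throughput in bits per second: payload size divided by delivery time. */
    static double throughputBitsPerSecond(double payloadBytes, double transitSeconds) {
        if (transitSeconds <= 0) {
            throw new IllegalArgumentException("transit time must be positive");
        }
        return payloadBytes * 8 / transitSeconds;
    }

    public static void main(String[] args) {
        // Assumed, made-up numbers for illustration:
        double payloadBytes = 1_000 * 1e12; // 1,000 tapes of 1 TB each
        double transitSeconds = 4 * 3600;   // a four-hour drive

        double gbitPerSecond = throughputBitsPerSecond(payloadBytes, transitSeconds) / 1e9;
        System.out.println("throughput (Gbit/s): " + gbitPerSecond);
        System.out.println("latency to first byte (s): " + transitSeconds);
    }
}
```

With these assumed inputs the throughput comes out in the hundreds of gigabits per second, yet the first byte takes four hours to arrive. Which of the two numbers counts as "performance" depends on what the user is trying to do.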
Slide 29
requirements change
#IBMGarage
@holly_cummins
Slide 30
#IBMGarage
@holly_cummins
Slide 31
#IBMGarage
@holly_cummins
Slide 32
#IBMGarage
@holly_cummins
Slide 33
#IBMGarage
@holly_cummins
Slide 34
I am not designed for this.
#IBMGarage
@holly_cummins
Slide 35
the world changes
#IBMGarage
@holly_cummins
Slide 36
#IBM
@holly_cummins
Slide 37
-Xmx == $
#IBM
@holly_cummins
Slide 38
-Xmx == $ footprint
#IBM
@holly_cummins
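To see what heap ceiling a running JVM has actually been given, something like the following can help. It is a minimal sketch using the standard `Runtime` API; the class and method names are hypothetical, and heap is only one part of a process's total footprint.

```java
final class HeapCeiling {
    private static final long MIB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in MiB (bounded by -Xmx when that flag is set). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently in use, in MiB: committed heap minus free space inside it. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB):  " + maxHeapMiB());
        System.out.println("used heap (MiB): " + usedHeapMiB());
    }
}
```

A large gap between the two numbers under realistic load suggests the ceiling, and the bill that goes with it, may be higher than it needs to be.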
Slide 39
#IBM
@holly_cummins
Slide 40
which performs better?
#IBM
@holly_cummins
Slide 41
quarkus: trading off flexibility against startup speed and footprint
#IBM
@holly_cummins
Slide 42
quarkus: trading off flexibility against startup speed and footprint
uhh … are you supposed to shut down applications after using them?
#IBM
@holly_cummins
Slide 43
behaviour at idle
30% of VMs are zombies (antithesisgroup.com)
#IBM
@holly_cummins
Slide 44
how to optimise?
#IBM
@holly_cummins
Slide 45
find the bottleneck.
fix it.
#IBM
@holly_cummins
Slide 46
pitfall 1
intuition
#IBM
@holly_cummins
Slide 47
this is not the place for ideas
#IBM
@holly_cummins
Slide 48
measure, don’t guess.
#IBM
@holly_cummins
Slide 49
measure the right thing
#IBM
@holly_cummins
Slide 50
measure the right thing
what do your users care about?
#IBM
@holly_cummins
Slide 51
pitfall 2
numbers
#IBM
@holly_cummins
Slide 52
#IBM
@holly_cummins
Slide 53
leading indicators
#IBM
@holly_cummins
Slide 54
leading indicators
#IBM
lagging indicators
@holly_cummins
Slide 55
leading indicators
lagging indicators: we care about them
#IBM
@holly_cummins
Slide 56
leading indicators
lagging indicators: we care about them, easy to measure
#IBM
@holly_cummins
Slide 57
leading indicators
lagging indicators: we care about them, easy to measure, hard to change
#IBM
@holly_cummins
Slide 58
leading indicators: easy to change
lagging indicators: we care about them, easy to measure, hard to change
#IBM
@holly_cummins
Slide 59
leading indicators: predictive of a thing we care about, easy to change
lagging indicators: we care about them, easy to measure, hard to change
#IBM
@holly_cummins
Slide 60
leading indicators: predictive of a thing we care about, hard to identify, easy to change
lagging indicators: we care about them, easy to measure, hard to change
#IBM
@holly_cummins
Slide 62
caution: performance experiments for entertainment purposes only. do not try these at home.
#IBM
@holly_cummins
Slide 63
2007
#IBM
@holly_cummins
Slide 64
bad-ish advice: “reduce time spent in garbage collection”
#IBM
@holly_cummins
Slide 65
bad-ish advice: “reduce time spent in garbage collection”
actually, garbage collection can make your application go faster
#IBM
@holly_cummins
-verbose:gc -Xverbosegclog:gclog.xml -Xgcpolicy:optthruput -Xmx300m -Xms300m -Xcompactgc
why does the performance stay exactly the same no matter what gc settings I choose?
#IBM
@holly_cummins
Slide 75
by the way, this is cheating. (remember the ‘bad science’?)
#IBM
@holly_cummins
Slide 76
-verbose:gc
#IBM
@holly_cummins
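Verbose GC logs are the detailed source, but a rough in-process view of GC time is also available through the standard `java.lang.management` API. The sketch below is an assumption about how one might check it, not the method used in the talk. The class and method names are hypothetical, and the reported time is approximate: for concurrent collectors it is not purely pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcTimeReport {
    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 means "undefined" for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Reported GC time as a fraction of JVM uptime so far. */
    static double gcFraction() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.println("total GC time (ms): " + totalGcMillis());
        System.out.println("fraction of uptime in GC: " + gcFraction());
    }
}
```

This number only says how much time GC took. It does not say whether GC is the bottleneck, so it needs reading alongside a measure users care about, such as transactions per second.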
Slide 77
Slide 78
Slide 79
Slide 80
Slide 81
Slide 82
Slide 83
Slide 84
Slide 85
Slide 86
total GC time: 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s
total GC time: 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
Slide 87
total GC time: 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s
total GC time: 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
#IBM
@holly_cummins
Slide 88
leading indicator
total GC time: 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s
total GC time: 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
#IBM
@holly_cummins
Slide 89
leading indicator
total GC time: 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s
total GC time: 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
#IBM
@holly_cummins
Slide 90
lagging indicator
leading indicator
total GC time: 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s
total GC time: 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
#IBM
@holly_cummins
Slide 91
lagging indicator
leading indicator ?
total GC time: 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s
total GC time: 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
#IBM
@holly_cummins
Slide 92
lagging indicator ?
leading indicator ?
total GC time: 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s
total GC time: 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
#IBM
@holly_cummins
Slide 93
so wait, what changed to make the app faster?
running JMeter on the same machine as the app gives a big speedup!
#IBM
@holly_cummins
Slide 94
“Any improvements made anywhere besides the bottleneck are an illusion.” – Gene Kim
#IBM
@holly_cummins
Slide 95
time kills all performance advice (even mine)
#IBM
@holly_cummins
Slide 96
the takeaways:
• gc can improve performance by rearranging the heap
• find the bottleneck
• validate advice independently
#IBM
@holly_cummins
Slide 97
pitfall 3
advice
#IBM
@holly_cummins
Slide 98
I read it on the internet!
#IBM
@holly_cummins
Slide 99
noooooo!
“make one big method because method dispatching is slow”
#IBM
@holly_cummins
Slide 100
noooooo!
“re-use your objects to help the garbage collector”
#IBM
@holly_cummins
noooooo!
use StringBuilder, never concatenate strings with +=
#IBM
@holly_cummins
Slide 103
noooooo! wait, what? yes, right?
use StringBuilder, never concatenate strings with +=
#IBM
@holly_cummins
Slide 104
2 things ruin advice:
• context
• time
#IBM
@holly_cummins
Slide 105
pitfall 4
micro-optimisation
#IBM
@holly_cummins
Slide 106
#IBM
@holly_cummins
Slide 107
static String beSlow() {
    String result = "";
    for (int i = 0; i < 314159; i++) {
        result += getStringData(i);
    }
    return result;
}
#IBM
@holly_cummins
Slide 108
@Override
public String toString() {
    String ret = "\n\tMarket Summary at: " + getSummaryDate()
            + "\n\t\t TSIA:" + getTSIA()
            + "\n\t\t openTSIA:" + getOpenTSIA()
            + "\n\t\t gain:" + getGainPercent()
            + "\n\t\t volume:" + getVolume();
    if ((getTopGainers() == null) || (getTopLosers() == null)) {
        return ret;
    }
    ret += "\n\t\t Current Top Gainers:";
    Iterator<QuoteDataBean> it = getTopGainers().iterator();
    while (it.hasNext()) {
        QuoteDataBean quoteData = it.next();
        ret += ("\n\t\t\t" + quoteData.toString());
    }
    ret += "\n\t\t Current Top Losers:";
    it = getTopLosers().iterator();
    while (it.hasNext()) {
        QuoteDataBean quoteData = it.next();
        ret += ("\n\t\t\t" + quoteData.toString());
    }
    return ret;
}
#IBM
@holly_cummins
Slide 111
this never gets called
@Override
public String toString() {
    String ret = "\n\tMarket Summary at: " + getSummaryDate()
            + "\n\t\t TSIA:" + getTSIA()
            + "\n\t\t openTSIA:" + getOpenTSIA()
            + "\n\t\t gain:" + getGainPercent()
            + "\n\t\t volume:" + getVolume();
    if ((getTopGainers() == null) || (getTopLosers() == null)) {
        return ret;
    }
    ret += "\n\t\t Current Top Gainers:";
    Iterator<QuoteDataBean> it = getTopGainers().iterator();
    while (it.hasNext()) {
        QuoteDataBean quoteData = it.next();
        ret += ("\n\t\t\t" + quoteData.toString());
    }
    ret += "\n\t\t Current Top Losers:";
    it = getTopLosers().iterator();
    while (it.hasNext()) {
        QuoteDataBean quoteData = it.next();
        ret += ("\n\t\t\t" + quoteData.toString());
    }
    return ret;
}
#IBM
@holly_cummins
Slide 112
let’s make travel energy-efficient?
#IBM
@holly_cummins
Slide 113
every little helps?
#IBM
@holly_cummins
Slide 114
every little helps?
every optimisation is another optimisation you aren’t doing
#IBM
@holly_cummins
Slide 115
our platforms help
#IBM
@holly_cummins
Slide 116
static String beSlow() {
    String result = "";
    for (int i = 0; i < 314159; i++) {
        result += getStringData(i);
    }
    return result;
}
#IBM
@holly_cummins
Slide 117
static String beSlow() {
    String result = "";
    result += getStringData(1);
    result += getStringData(2);
    result += getStringData(3);
    return result;
}
#IBM
@holly_cummins
Slide 118
static String beSlow() {
    String result = "";
    result += getStringData(1);
    result += getStringData(2);
    result += getStringData(3);
    return result;
}
this is fine
#IBM
@holly_cummins
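For the case where the advice does apply, a long loop on a path that is actually hot, the two styles look like this side by side. This is a sketch: `getStringData` here is a hypothetical stand-in for the method on the slides, and the class name is invented. Both versions produce the same string; the `+=` loop typically copies the accumulated text on each pass, while the builder appends into one growing buffer.

```java
final class ConcatStyles {
    /** Hypothetical stand-in for the slide's getStringData(i). */
    static String getStringData(int i) {
        return Integer.toString(i);
    }

    /** The style from the slide: readable, and fine for a handful of concatenations. */
    static String withPlusEquals(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += getStringData(i);
        }
        return result;
    }

    /** The StringBuilder style: worth it when n is large and a profiler says this is hot. */
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(getStringData(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlusEquals(5).equals(withBuilder(5)));
    }
}
```

Whether the difference matters is a question for measurement, not intuition: if the method is never called, or called with three items, neither version is the bottleneck.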
Slide 119
the JVM writers have far more time for optimising than you do
clean, typical code runs best
#IBM
@holly_cummins
Slide 120
ok, but how to optimise?
#IBM
@holly_cummins
Slide 121
tools
#IBM
@holly_cummins
Slide 122
“What you can optimize is limited to what you can observe.” -Susie Xia, Netflix
#IBM
@holly_cummins
Slide 123
observability
#IBM
@holly_cummins
Slide 124
method profiler
GC analysis
heap analysis
APM
distributed tracing
* not free
this is an incomplete list, because there are a lot of tools out there, and many cost money
#IBM
@holly_cummins
Slide 125
method profiler
VisualVM
GC analysis heap analysis APM distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 126
method profiler
VisualVM Mission Control
GC analysis heap analysis APM distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 127
method profiler
VisualVM Mission Control
GC analysis
IBM Health Center (for OpenJ9)
heap analysis APM distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 128
method profiler
flame graphs VisualVM Mission Control
GC analysis
IBM Health Center (for OpenJ9)
heap analysis APM distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 129
method profiler
flame graphs VisualVM Mission Control
GC analysis
IBM Health Center (for OpenJ9)
GCMV
heap analysis APM distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 130
method profiler
flame graphs VisualVM Mission Control
GC analysis
IBM Health Center (for OpenJ9)
GCMV
heap analysis
Eclipse MAT
APM distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 131
method profiler
flame graphs VisualVM Mission Control
GC analysis
GCMV
heap analysis APM
IBM Health Center (for OpenJ9)
Eclipse MAT
GlowRoot
distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 132
method profiler
flame graphs VisualVM Mission Control
GC analysis
GCMV
heap analysis APM
IBM Health Center (for OpenJ9)
GlowRoot
Eclipse MAT
New Relic*
distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 133
method profiler
flame graphs VisualVM Mission Control
GC analysis
GCMV
heap analysis APM
IBM Health Center (for OpenJ9)
GlowRoot
Eclipse MAT
AppDynamics* New Relic*
distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 134
method profiler
flame graphs VisualVM Mission Control
GC analysis
GCMV
heap analysis APM
IBM Health Center (for OpenJ9)
GlowRoot
Eclipse MAT
AppDynamics* New Relic*
Dynatrace*
distributed tracing * not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 135
method profiler
flame graphs VisualVM Mission Control
GC analysis
GCMV
heap analysis APM
IBM Health Center (for OpenJ9)
GlowRoot
distributed tracing
Eclipse MAT
AppDynamics* New Relic*
Dynatrace*
Zipkin
not free #IBM
this is an incomplete list, because there are a lot of tools out there, and many cost money
@holly_cummins
Slide 136
method profiler: flame graphs, VisualVM, Mission Control, IBM Health Center (for OpenJ9)
GC analysis: GCMV
heap analysis: Eclipse MAT
APM: GlowRoot, AppDynamics*, New Relic*, Dynatrace*
distributed tracing: Zipkin, Jaeger
* not free
this is an incomplete list, because there are a lot of tools out there, and many cost money
#IBM
@holly_cummins
Slide 137
optimising a micro-service: is that micro-optimising?
Netflix microservice architecture
#IBM
@holly_cummins
Slide 138
you may need to know the whole system context to know what to optimise
#IBMGarage
@holly_cummins
Slide 139
“Nines don’t matter if your users aren’t happy.” – Charity Majors
#IBM
@holly_cummins
Slide 140
don’t forget the edges
queueing theory helps us understand where the disasters happen
#IBM
@holly_cummins
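One way to see where the edge is: in the textbook single-server M/M/1 queue, mean response time is the service time divided by (1 - utilisation). The model choice is an assumption made here for illustration, not something the talk specifies, and the class and method names are hypothetical.

```java
final class QueueEdge {
    /**
     * Mean time in system for an M/M/1 queue: serviceTime / (1 - utilisation).
     * Utilisation must be in [0, 1); at 1 the queue grows without bound.
     */
    static double meanResponseTime(double serviceTimeMs, double utilisation) {
        if (utilisation < 0.0 || utilisation >= 1.0) {
            throw new IllegalArgumentException("utilisation must be in [0, 1)");
        }
        return serviceTimeMs / (1.0 - utilisation);
    }

    public static void main(String[] args) {
        double serviceTimeMs = 10.0; // assumed service time, for illustration
        for (double u : new double[] {0.5, 0.8, 0.9, 0.95, 0.99}) {
            System.out.println("utilisation " + u + " -> mean response (ms) "
                    + meanResponseTime(serviceTimeMs, u));
        }
    }
}
```

Under this model, response time doubles at 50% utilisation and is about a hundred times the service time at 99%. The curve is flat for a long time and then very steep, which is why a system can look healthy right up to the point where it falls over.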
Slide 141
“When it comes to IT performance, amateurs look at averages. Professionals look at distributions.” – Avishai Ish-Shalom
#IBM
@holly_cummins
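A small example of why the average hides the problem. The latency samples below are made up, and the nearest-rank percentile is one common definition among several; the class and method names are hypothetical.

```java
import java.util.Arrays;

final class LatencyDistribution {
    /** Arithmetic mean of the samples. */
    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0 || p <= 0.0 || p > 100.0) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up latencies in ms: nine fast requests and one very slow one.
        double[] latencies = {10, 10, 10, 10, 10, 10, 10, 10, 10, 1000};
        System.out.println("mean: " + mean(latencies));
        System.out.println("p50:  " + percentile(latencies, 50));
        System.out.println("p99:  " + percentile(latencies, 99));
    }
}
```

For this data the mean is 109 ms, a latency no request actually experienced. The median shows that most users wait 10 ms, and the 99th percentile shows that one waited a full second.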
Slide 142
slow performance can turn into big cloud bills
make cloud costs visible to engineers
#IBM
@holly_cummins
Slide 143
ok, but you promised bears
#IBM
@holly_cummins
Slide 144
if you leave the TV on when you’re not using it, you’re a polar bear murderer
#IBM
@holly_cummins
Slide 145
there is a moral imperative to avoid waste
#IBM
@holly_cummins
Slide 146
there is a moral imperative to avoid waste
electricity
hardware
#IBM
@holly_cummins
Slide 147
data centres use 1-2% of the world’s electricity
#IBM
@holly_cummins
Slide 148
fewer devices
longer lifetime
#IBM
@holly_cummins
Slide 149
higher efficiency
fewer devices
longer lifetime
@holly_cummins
#IBM
Slide 150
higher efficiency
fewer devices
lower footprint
longer lifetime
@holly_cummins
#IBM
Slide 151
higher efficiency
fewer devices
lower footprint
more multitenancy
longer lifetime
@holly_cummins
#IBM
Slide 152
higher efficiency
fewer devices
lower footprint
more multitenancy
longer lifetime
@holly_cummins
#IBM
optimise for longevity
Slide 153
higher efficiency
fewer devices
lower footprint
more multitenancy
longer lifetime
the end of planned obsolescence?
@holly_cummins
#IBM
optimise for longevity
Slide 154
sooo … you can optimise, and it can be fun
measure, don’t guess
only optimise what matters
now for questions!
#IBM
@holly_cummins