Running a Serverless Lucene Reverse Geocoder

A presentation at Search Meetup Munich in April 2019 in Munich, Germany by Alexander Reelsen

Slide 1

Running a Serverless Lucene Reverse Geocoder Alexander Reelsen alex@elastic.co @spinscale

Slide 2

Agenda
‣ What is serverless?
‣ Searching for Locations
‣ Demo
‣ Searching with Lucene using a binary

Slide 3

Serverless?

Slide 4

Serverless?
‣ FaaS (Function as a Service)
‣ Execution Environment as a Service
‣ Payment model: Pay per code runtime
‣ Not running? No bill!
‣ Configure memory size (also changes CPU power)
‣ Maximum function execution time
‣ Provider takes care of scaling functions

Slide 5

Providers?
‣ AWS Lambda, GCP Cloud Functions, Azure Functions, Cloudflare, IBM OpenWhisk
‣ FaaStRuby, Binaris, Spotinst
‣ K8s: Knative, Fission, Kubeless, Nuclio, OpenFaaS
‣ Docker: Fn, OpenFaaS

Slide 6

Java?
‣ Not too well suited for short-lived tasks
  ‣ JVM startup time
  ‣ JIT compiler
  ‣ Dependency initialisation
  ‣ Application initialisation
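
FaaS platforms commonly reuse an already started process for later invocations, so initialisation costs like the ones above mostly hit the first (cold) call. A minimal sketch of caching an expensive resource across invocations follows; the class name `WarmCache` and the idea that the loader opens an index are assumptions for illustration, not something shown in the talk.

```java
import java.util.function.Supplier;

/** Caches an expensive resource so only the first (cold) invocation pays for it. */
final class WarmCache<T> {
    private final Supplier<T> loader;
    private T value;       // stays null until the first invocation
    private int loads = 0; // counts how often the loader actually ran

    WarmCache(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, running the loader only on the first call. */
    synchronized T get() {
        if (value == null) {
            value = loader.get();
            loads++;
        }
        return value;
    }

    synchronized int loads() {
        return loads;
    }
}
```

A function handler could keep one such cache in a static field, so that warm invocations skip the loading step entirely.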

Slide 7

Location search

Slide 8

Reverse Geocoder
‣ Input: Latitude, Longitude
‣ Output: City (a readable representation)

Slide 9

Search across points
‣ Each city gets indexed with a lat/lon pair
‣ Search for the point nearest to the supplied one
‣ Problem: Neighbours!
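
The neighbour problem can be reproduced with a brute-force nearest-point search. The sketch below is plain Java, not Lucene; the class name `NearestCityFinder` is hypothetical and the haversine formula is one common choice of distance function, not necessarily the one used in the demo.

```java
import java.util.List;

final class NearestCityFinder {
    private static final double EARTH_RADIUS_KM = 6371.0;

    /** A city reduced to a single representative point. */
    static final class City {
        final String name;
        final double lat;
        final double lon;

        City(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** Great-circle distance in kilometres (haversine formula). */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Linear scan for the closest city; returns null for an empty list. */
    static City nearest(List<City> cities, double lat, double lon) {
        City best = null;
        double bestKm = Double.POSITIVE_INFINITY;
        for (City c : cities) {
            double km = haversineKm(lat, lon, c.lat, c.lon);
            if (km < bestKm) {
                bestKm = km;
                best = c;
            }
        }
        return best;
    }
}
```

With approximate centre points for Munich (48.137, 11.575) and Garching (48.249, 11.651), a query near Munich's northern edge such as (48.2188, 11.6247) resolves to Garching, because the centre of the small neighbour is closer than the centre of the large city.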

Slide 10

Point based search: Near neighbours

Slide 11

Point based search: Near neighbours

Slide 12

Search across shapes
‣ Each city gets indexed with a lat/lon pair
‣ Certain cities get indexed as a geoshape
‣ Search twice:
  ‣ Lat/Lon within any shape
  ‣ Lat/Lon nearby any point
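
The two-step lookup can be sketched without Lucene: first test whether the coordinate lies inside any indexed shape, and only fall back to the nearest point if none matches. Everything here is a stand-in under stated assumptions: `TwoStepGeocoder` is a hypothetical name, shapes are simple rings without holes, containment uses planar ray casting, and the fallback uses a flat-earth distance approximation.

```java
import java.util.ArrayList;
import java.util.List;

final class TwoStepGeocoder {
    private final List<String> shapeNames = new ArrayList<>();
    private final List<double[][]> shapeRings = new ArrayList<>(); // vertices as {lat, lon}
    private final List<String> pointNames = new ArrayList<>();
    private final List<double[]> points = new ArrayList<>();       // {lat, lon}

    void addShape(String name, double[][] ring) {
        shapeNames.add(name);
        shapeRings.add(ring);
    }

    void addPoint(String name, double lat, double lon) {
        pointNames.add(name);
        points.add(new double[] {lat, lon});
    }

    /** Ray casting: count how many ring edges a ray from the point crosses. */
    static boolean contains(double[][] ring, double lat, double lon) {
        boolean inside = false;
        for (int i = 0, j = ring.length - 1; i < ring.length; j = i++) {
            double latI = ring[i][0], lonI = ring[i][1];
            double latJ = ring[j][0], lonJ = ring[j][1];
            boolean crosses = (latI > lat) != (latJ > lat)
                    && lon < (lonJ - lonI) * (lat - latI) / (latJ - latI) + lonI;
            if (crosses) {
                inside = !inside;
            }
        }
        return inside;
    }

    /** Step 1: shape containment. Step 2: nearest point. Returns null if nothing is indexed. */
    String lookup(double lat, double lon) {
        for (int i = 0; i < shapeRings.size(); i++) {
            if (contains(shapeRings.get(i), lat, lon)) {
                return shapeNames.get(i);
            }
        }
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < points.size(); i++) {
            double dLat = points.get(i)[0] - lat;
            // Shrink longitude differences by cos(latitude) to approximate real distances.
            double dLon = (points.get(i)[1] - lon) * Math.cos(Math.toRadians(lat));
            double dist = dLat * dLat + dLon * dLon;
            if (dist < bestDist) {
                bestDist = dist;
                best = pointNames.get(i);
            }
        }
        return best;
    }
}
```

Checking shapes first means a coordinate inside a large city is attributed to that city even when a neighbouring town's centre point is closer.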

Slide 13

Geo and Lucene: BFF!
‣ LatLonPoint: two dimensions, 4 bytes each
‣ LatLonShape: triangular mesh tessellation
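
Storing each dimension in 4 bytes means quantising a double into a 32-bit integer. The sketch below illustrates that idea in plain Java; it is modelled on the concept and is not copied from Lucene's `GeoEncodingUtils`, and the class name `LatLonQuantizer` is hypothetical.

```java
final class LatLonQuantizer {
    // Spread the full value range over all 2^32 integer values.
    private static final double LAT_SCALE = (1L << 32) / 180.0;
    private static final double LON_SCALE = (1L << 32) / 360.0;

    static int encodeLat(double lat) {
        if (lat < -90.0 || lat > 90.0) {
            throw new IllegalArgumentException("latitude out of range: " + lat);
        }
        // Java's double-to-int cast saturates, so +90 maps to Integer.MAX_VALUE.
        return (int) Math.floor(lat * LAT_SCALE);
    }

    static int encodeLon(double lon) {
        if (lon < -180.0 || lon > 180.0) {
            throw new IllegalArgumentException("longitude out of range: " + lon);
        }
        return (int) Math.floor(lon * LON_SCALE);
    }

    /** Returns the lower bound of the quantisation cell the value fell into. */
    static double decodeLat(int encoded) {
        return encoded / LAT_SCALE;
    }

    static double decodeLon(int encoded) {
        return encoded / LON_SCALE;
    }
}
```

With this scheme one latitude step is 180 / 2^32 degrees, roughly 4.2e-8 degrees, so the rounding error stays far below what a city-level reverse geocoder needs.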

Slide 14

Geo and Lucene: BFF! https://home.apache.org/~mikemccand/geobench.html

Slide 15

Geo and Lucene: BFF! https://home.apache.org/~mikemccand/geobench.html

Slide 16

Serverless Lucene
‣ Local execution, index part of the package
‣ Offline index creation
‣ Packaging index into code
‣ Index needs to be unpacked, using Lucene via classpath resources is tricky
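
One way to handle the unpacking step is to ship the index as a zip archive inside the package and extract it to a temporary directory on first use. This is a sketch under assumptions: the talk does not show how the demo packaged its index, and the class name `IndexUnpacker` and the zip format are illustrative choices.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class IndexUnpacker {
    /** Extracts a zipped index into a fresh temporary directory and returns its path. */
    static Path unpack(InputStream zipped) throws IOException {
        Path target = Files.createTempDirectory("lucene-index").toAbsolutePath().normalize();
        try (ZipInputStream zip = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = target.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would land outside the target directory.
                if (!out.startsWith(target)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(zip, out); // reads until the end of the current entry
                }
            }
        }
        return target;
    }
}
```

A handler could call `IndexUnpacker.unpack(SomeHandler.class.getResourceAsStream("/index.zip"))`, where the handler class and resource name are hypothetical, and pass the returned path to Lucene's `FSDirectory.open`.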

Slide 17

Demo

Slide 18

Summary
‣ Works!
‣ Problem: Data quality, getting accurate shape data
‣ Problem: First invocation is slow (up to 2s)
  ‣ JVM startup
  ‣ Lucene index opening

Slide 19

Faster startup & runtime

Slide 20

Enter GraalVM!
‣ A new compiler, supporting HotSpot and AOT compilation
‣ Graal compiler part of Java 9 (experimental!)
‣ Graal JIT compiler part of Java 10 (Linux 64-bit only)
‣ Project Metropolis: Java-on-Java HotSpot implementation
‣ Truffle: Interpreter to implement other languages on top of Graal (JRuby replacement)

Slide 21

Enter GraalVM!
‣ AOT static compilation + SubstrateVM = executable binaries of Java apps
‣ Using SubstrateVM
‣ Reflection!
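
Reflection is a caveat because ahead-of-time compilation has to know at build time which classes are reachable. The sketch below shows the kind of code that is hard to analyse statically: the class name only exists as a string at run time. `ReflectiveLoader` is a hypothetical name; on a regular JVM this works as is, whereas a native image generally needs such classes registered for reflection up front.

```java
import java.lang.reflect.Constructor;

final class ReflectiveLoader {
    /** Instantiates a class via its no-argument constructor, given only its name. */
    static Object instantiate(String className) throws ReflectiveOperationException {
        Class<?> clazz = Class.forName(className);           // resolved at run time
        Constructor<?> ctor = clazz.getDeclaredConstructor(); // looked up reflectively
        return ctor.newInstance();
    }
}
```

If the name came from a configuration file or a service descriptor, no build-time analysis could tell which class to include in the binary.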

Slide 22

Slide 23

Discussion … ask all the things!

Slide 24

Links
‣ https://serverless.com/framework/docs/
‣ https://www.openfaas.com
‣ https://cloud.google.com/knative/
‣ https://kubeless.io/
‣ https://fission.io/
‣ http://fnproject.io/
‣ https://nuclio.io
‣ https://openwhisk.incubator.apache.org/
‣ https://www.graalvm.org
‣ https://openjdk.java.net/projects/metropolis/
‣ https://github.com/oracle/graal/tree/master/substratevm
‣ https://en.wikipedia.org/wiki/Reverse_geocoding