Practical Service Mesh

A presentation at KubeCon EU Sponsor Lightning Talks in May 2019 in Barcelona, Spain by Jesse Butler

Slide 1

Practical Service Mesh
A Quick Tour Through Some Real Use Cases
Jesse Butler, Cloud Native Advocate, Oracle Cloud Infrastructure
@jlb13
cloudnative.oracle.com
Copyright © 2019, Oracle and/or its affiliates. All rights reserved.

Slide 2

The Inevitable First Slide: What is a Service Mesh?
Connect, secure, control and observe services at scale, often with no changes to service code.
Though many options exist, Linkerd and Istio are the two main projects.

Slide 3

Service Mesh 101
• Infrastructure layer for controlling and monitoring service-to-service traffic
• Data plane deployed alongside application services, control plane used to manage the mesh
• Greatly simplifies service implementation by offering transparent service discovery, automated retries, timeouts and more

Slide 4

Service Mesh is Not an API Gateway
API gateways deal with north-south traffic, inbound to your cluster.
A service mesh is concerned with east-west traffic, between your services within your cluster.

Slide 5

Service Mesh Architecture
• Both Istio and Linkerd use a sidecar pattern, adding a proxy container to each pod that joins the mesh
• Each proxy instance manages traffic for its pod, and is fully configurable
• This vantage point is what gives a service mesh its power: it sees and knows all

Slide 6

Sidecar Proxy

Slide 7

Sidecar Proxy

Slide 8

Sidecar Proxy
HTTP/1.1, HTTP/2, gRPC or TCP
With or without mTLS

Slide 9

Observability
• Metrics: aggregate data regarding the behavior of a thing over time
• Tracing: instrumentation which provides an instance of an action, traversing the entire stack
• Logging: developer breadcrumbs we leave to give context for a certain code path

Slide 10

Triaging Issues
• Metrics are instrumented and scraped for analytic use
• Traces are implemented on a per-span basis at points of interest
• Logs are specific, and a gift we give our future selves… treat yourself

Slide 11

Service Mesh Brings Observability Gifts
• All traffic in the mesh is routed through the proxies
• Boundary tracing, on-wire traffic, calls and status are all obvious in a mesh
• Metrics and traces can be taken for free, with no modifications to code
• Most issues can be triaged with this information
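
As a rough illustration of why proxy-level visibility is enough for a first pass at triage, the sketch below aggregates per-request records into per-edge metrics. The `Request` record and its fields are hypothetical stand-ins for what a sidecar could observe on the wire; this is not the data model of Linkerd or Istio.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical record of one call as seen by a sidecar proxy.
    source: str
    destination: str
    status: int
    latency_ms: float


def summarize(requests):
    """Group requests by (source, destination) edge and compute simple metrics."""
    edges = defaultdict(list)
    for req in requests:
        edges[(req.source, req.destination)].append(req)

    summary = {}
    for edge, reqs in edges.items():
        # Treat any non-5xx response as a success for this sketch.
        ok = sum(1 for r in reqs if r.status < 500)
        summary[edge] = {
            "requests": len(reqs),
            "success_rate": ok / len(reqs),
            "max_latency_ms": max(r.latency_ms for r in reqs),
        }
    return summary


if __name__ == "__main__":
    sample = [
        Request("web", "foo", 200, 12.0),
        Request("web", "foo", 503, 250.0),
        Request("foo", "db", 200, 3.5),
    ]
    for edge, stats in summarize(sample).items():
        print(edge, stats)
```

None of this requires the services themselves to emit anything, which is the point of the slide: the numbers come from the traffic, not from application code.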

Slide 12

Linkerd Dashboard

Slide 13

Kiali, the Istio Dashboard

Slide 14

Traffic Management
• Proxy instances provide traffic shifting capabilities
• We can configure proxies based upon knowledge of our services
• Through proxy configuration we have intelligent routing of our cluster traffic

Slide 15

Traffic Management Details with Istio
[Diagram: Deployment:v1 (Pod foo:v1) and Deployment:v2 (Pod foo:v2) behind the ‘foo’ VirtualService and DestinationRule]
• ‘foo’ deployed services routed through ‘foo’ VirtualService
• DestinationRules for ‘foo:v1’ and ‘foo:v2’ pods, with weights
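
A minimal sketch of the two resources on this slide, built as plain Python dictionaries and printed as JSON. The 90/10 split and the `version` pod label are assumptions made for illustration. In Istio's API the DestinationRule names the `v1` and `v2` subsets, while the weights themselves sit on the VirtualService routes.

```python
import json


def destination_rule(host, versions):
    """Build a DestinationRule with one subset per version label."""
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "DestinationRule",
        "metadata": {"name": host},
        "spec": {
            "host": host,
            # Assumes pods carry a `version` label such as version=v1.
            "subsets": [{"name": v, "labels": {"version": v}} for v in versions],
        },
    }


def virtual_service(host, weights):
    """Build a VirtualService that splits HTTP traffic across subsets by weight."""
    if sum(weights.values()) != 100:
        raise ValueError("route weights must sum to 100")
    routes = [
        {"destination": {"host": host, "subset": subset}, "weight": weight}
        for subset, weight in weights.items()
    ]
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {"hosts": [host], "http": [{"route": routes}]},
    }


if __name__ == "__main__":
    manifests = [
        destination_rule("foo", ["v1", "v2"]),
        virtual_service("foo", {"v1": 90, "v2": 10}),
    ]
    # Wrap both resources in a List so they can be handled as one document.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Changing the split is then a one-line change to the weights; the deployments and pods are untouched.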

Slide 16

Speak Kubernetes at your Kubernetes

Slide 17

Leveraging Traffic Shifting
• Manage and shift traffic via configuration
• Take advantage of zero-downtime changes in routing between versions
• We can automate deployments of any kind
  – Canary deployments
  – Blue/Green deployments
  – Whatever we want
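
One way to picture an automated rollout is as a sequence of weight pairs applied one after another. The sketch below only plans that sequence; `is_healthy` is a hypothetical callback standing in for whatever check (error rate, latency) gates each step, and actually applying the weights to the mesh is left out.

```python
def canary_steps(step=25):
    """Yield (stable_weight, canary_weight) pairs from 100/0 up to 0/100."""
    if not 0 < step <= 100:
        raise ValueError("step must be between 1 and 100")
    canary = 0
    while canary < 100:
        yield 100 - canary, canary
        canary = min(100, canary + step)
    yield 0, 100


def plan_rollout(step, is_healthy):
    """Return the weight pairs that would be applied, in order.

    Rolls back to (100, 0) as soon as is_healthy(canary_weight) is False.
    """
    applied = []
    for stable, canary in canary_steps(step):
        applied.append((stable, canary))
        if canary and not is_healthy(canary):
            applied.append((100, 0))  # send everything back to stable
            break
    return applied


if __name__ == "__main__":
    # A canary rollout in 25% increments with a check that always passes.
    print(plan_rollout(25, lambda weight: True))
    # A blue/green switch is the same plan with a single 100% step.
    print(plan_rollout(100, lambda weight: True))
```

Seen this way, canary and blue/green differ only in step size, which is why the slide can lump them under "deployments of any kind".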

Slide 18

Traffic Mirroring and Dark Launches
• Traffic shifting, but 100% of production traffic goes to production services
• Mirror as much or as little traffic as you like to other services in the cluster
• These routes can be intelligently filtered
  – Test automation
  – Beta users
  – That one dev who keeps bugging you…
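
A hedged sketch of what a mirrored route might look like as an Istio VirtualService, again built as a Python dictionary. The subset names and the `x-beta-user` header are assumptions for illustration, and the `mirrorPercentage` field depends on the Istio version in use.

```python
import json


def mirrored_route(host, live, shadow, percent=100.0, header=None):
    """Send all traffic to `live` and copy a share of it to `shadow`.

    If `header` is a (name, value) pair, only matching requests are mirrored.
    """
    live_route = [{"destination": {"host": host, "subset": live}, "weight": 100}]
    mirrored = {
        "route": live_route,
        "mirror": {"host": host, "subset": shadow},
        "mirrorPercentage": {"value": percent},
    }
    http = [mirrored]
    if header is not None:
        name, value = header
        mirrored["match"] = [{"headers": {name: {"exact": value}}}]
        # Requests that do not match still reach `live`, with no mirroring.
        http.append({"route": live_route})
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {"hosts": [host], "http": http},
    }


if __name__ == "__main__":
    # Mirror 10% of beta users' requests from v1 to v2 (hypothetical header).
    manifest = mirrored_route("foo", "v1", "v2", percent=10.0,
                              header=("x-beta-user", "true"))
    print(json.dumps(manifest, indent=2))
```

Responses from the mirrored subset are discarded by the proxy, so callers only ever see answers from the live version.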

Slide 19

Test in Production Safely
• Isolate traffic as required
• Deploy test candidates
• Mirror real production data to them, shift their responses to test fixtures
• Meanwhile, prod keeps humming along

Slide 20

Testing
• Core mesh features: retries, timeouts, circuit breakers
• Through the same proxy configuration we can inject latency trivially as well
• Modify on-wire data including message bodies and header information
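
To make the first bullet concrete, here is a small sketch of retries and a timeout expressed as Istio VirtualService configuration rather than service code. The specific values (2s overall, three attempts of 500ms each) are arbitrary assumptions chosen for the example.

```python
import json


def resilient_route(host, timeout="2s", attempts=3, per_try_timeout="500ms"):
    """Build a VirtualService with an overall timeout and a retry policy."""
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    "route": [{"destination": {"host": host}}],
                    # Upper bound for the whole request, retries included.
                    "timeout": timeout,
                    "retries": {
                        "attempts": attempts,
                        "perTryTimeout": per_try_timeout,
                        "retryOn": "5xx,connect-failure",
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(resilient_route("foo"), indent=2))
```

The calling service makes one ordinary request; the sidecar does the retrying and enforces the deadline.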

Slide 21

Welcome to Microservices Fight Club
• Inject faults by modifying reply status or mutating parameter data
• Inject latency to test resilience and response
• Redirect traffic to API stubs / mocks
• Use traffic shifting/mirroring to target test traffic as needed
• Let your imagination run
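
A sketch of the first two bullets as Istio fault injection on a VirtualService, assuming a service named `foo`. The percentages, the 5s delay and the 503 status are illustrative values, not recommendations.

```python
import json


def fault_route(host, delay=None, abort=None):
    """Build a VirtualService that injects latency and/or error replies.

    delay: (percent_of_requests, duration string such as "5s")
    abort: (percent_of_requests, HTTP status code to return)
    """
    fault = {}
    if delay is not None:
        percent, fixed_delay = delay
        fault["delay"] = {"percentage": {"value": percent}, "fixedDelay": fixed_delay}
    if abort is not None:
        percent, status = abort
        fault["abort"] = {"percentage": {"value": percent}, "httpStatus": status}

    rule = {"route": [{"destination": {"host": host}}]}
    if fault:
        rule["fault"] = fault
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {"hosts": [host], "http": [rule]},
    }


if __name__ == "__main__":
    # Delay 10% of requests by 5 seconds and fail 5% with a 503.
    print(json.dumps(fault_route("foo", delay=(10.0, "5s"), abort=(5.0, 503)), indent=2))
```

Because the fault lives in mesh configuration, removing it restores normal behavior without redeploying the service.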

Slide 22

Caution
• It is trivial to modify on-wire data via mesh configuration
• Service protocols could mutate over time through configuration changes, with no visibility in source code
• This is a recipe for disaster, and would repeat mistakes we learned from long ago

Slide 23

Security
• Deploying services in containers requires careful provisioning, build and deployment practices
• There are options to leverage in both CI/CD and registry scanning
• Once services are deployed in the wild, they are on their own

Slide 24

Security
• Istio and Linkerd are capable of creating a zero-touch, zero-trust network
• Services within your cluster authenticate via the mesh
• Leveraging mTLS, the cluster is transparently hardened and protected from many types of attacks
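
As an illustration of how little configuration strict mTLS can take, the sketch below builds a PeerAuthentication resource as found in newer Istio releases. This is an assumption about the reader's Istio version: the resource postdates the talk, and the authentication API available in 2019 was different. The `prod` namespace is hypothetical.

```python
import json


def strict_mtls(namespace):
    """Build a PeerAuthentication that requires mTLS for workloads in a namespace."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        # STRICT rejects plaintext connections to workloads in this namespace.
        "spec": {"mtls": {"mode": "STRICT"}},
    }


if __name__ == "__main__":
    print(json.dumps(strict_mtls("prod"), indent=2))
```

The services keep speaking plain HTTP or gRPC to their local sidecar; certificates and encryption are handled between the proxies.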

Slide 25

Thanks!
cloudnative.oracle.com
@jlb13