OVHCloud K8s: DevOps & Agility in a Product Team

A presentation at JDev 2020 in July 2020 by Horacio Gonzalez

Slide 1

OVHCloud K8s DevOps & Agility in a Product Team Horacio Gonzalez 2020-07-09

Slide 2

Who are we? Introducing myself and introducing OVHcloud

Slide 3

Horacio Gonzalez @LostInBrittany Spaniard lost in Brittany, developer, dreamer and all-around geek Flutter

Slide 4

OVHcloud: A Global Leader
● 200k Private cloud VMs running
● #1 Dedicated IaaS Europe
● 30 Datacenters
● Own 20Tbps Network with 35 PoPs
● Hosting capacity: 1.3M Physical Servers
● 360k Servers already deployed
● 1.3M Customers in 138 Countries

Slide 5

OVHcloud: 4 Universes of Products
● WebCloud
● Baremetal Cloud
● Public Cloud
● Hosted Private Cloud
[Product matrix diagram: for each universe, offerings spanning domains and email, web hosting, VPS and dedicated servers, compute (VM, K8S), storage, databases, network, security, big data / analytics / ML, AI, hybrid cloud, and partner and managed services]

Slide 6

DevOps at OVHcloud You build it, you run it

Slide 7

A small fish in a big pond Needs to be adaptable, flexible, quick

Slide 8

Huge range of products For a comparatively small technical staff

Slide 9

With a wide variety of technology stacks Open source, non-open-source and self-built

Slide 10

Only one way to keep up You build it, you run it

Slide 11

The DevOps approach was natural Even if not always fully structured

Slide 12

Let’s create a new product OVHcloud Managed Kubernetes

Slide 13

In 2018 we decided to build a product OVHcloud Managed Kubernetes

Slide 14

Built on our Public Cloud We leverage our OpenStack expertise

Slide 15

Building a team for the project
A small DevOps team to:
● Bootstrap the project
● Build the product
● Operate it

Slide 16

A balanced mix of skills needed
At the beginning of the project:
● Architects
● Developers
● Sysadmins/Ops

Slide 17

But a clear objective By the end of the project, having DevOps

Slide 18

Everybody on call The most direct road from Dev to DevOps

Slide 19

Developers need to be on call
● Engineering is about building and maintaining services
● On call should not be life-impacting
● Services are better when feedback loops are short

Slide 20

On-call duty needs to be humane
● Paid on-call duty
  ○ Flat rate per rotation period
  ○ Call-out fee for responding to an alert
● Enough people in rotation
  ○ Between 5 and 10 people
● Flexibility
  ○ Allowing people to swap on-call time
● Curated alerts
  ○ Not all systems or errors deserve an on-call alert
● A clear escalation policy
  ○ And level 2/3 on-call duties

Slide 21

Building it for efficiency Because we need to facilitate operations

Slide 22

Kubinception: running K8s on K8s

Slide 23

And the ETCD

Slide 24

And the agility in all that? The keystone of the whole project and DevOps approach

Slide 25

DevOps teams all around the country
● Roubaix
● Paris
● Rennes
● Bordeaux
● Toulouse
● Lyon
● Nantes
● Brest

Slide 26

Tools & Methodology « SCRUM » « Farmer’s Wisdom » « KINTSUGI* »

Slide 27

What do we mean by agility?

Slide 28

A team with prejudices about Agility “Being Agile? This is a trick to justify a failure in project management.” “Scrum does not work for development teams, it is for managers” “Scrum is only useful to watch and control us” “Scrum? We tried it 3 years ago, we quickly understood that it was not for us…”

Slide 29

1st step: building trust “I’m here to give you visibility” “We will create value together” “Your rhythm will be respected” “We will keep our commitments”

Slide 30

2nd step: team commitments “We respect the rules of Scrum” “We track our Impediments” “We respect our tickets’ lifecycle”

Slide 31

3rd step: convince with numbers and metrics!

Slide 32

Agility and the real life of a DevOps team “I have a customer who came to ask me to solve a problem” “I am too often interrupted to fix bugs!” “I didn’t have time to finish the ticket, I had to deal with an emergency”

Slide 33

Impediments Unexpected tasks asking for immediate attention and diverting time from scheduled tasks

Slide 34

Tracking burndown and impediments
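As an illustration of tracking a burndown together with impediments, here is a minimal Go sketch. The `Day` type, its fields, the `Burndown` function and all the numbers are hypothetical and not taken from the talk; the only assumption borrowed from the slides is that impediments are unplanned work counted separately from scheduled tickets.

```go
package main

import "fmt"

// Day is a hypothetical record of one sprint day: story points completed
// on planned tickets, and points spent on impediments (unplanned work).
type Day struct {
	DonePoints       int
	ImpedimentPoints int
}

// Burndown returns, for each day, the planned points still remaining and
// the cumulative points absorbed by impediments so far.
func Burndown(committed int, days []Day) (remaining, impediments []int) {
	left, lost := committed, 0
	for _, d := range days {
		left -= d.DonePoints
		if left < 0 {
			left = 0 // the sprint cannot burn below zero
		}
		lost += d.ImpedimentPoints
		remaining = append(remaining, left)
		impediments = append(impediments, lost)
	}
	return remaining, impediments
}

func main() {
	// Illustrative data: 30 committed points over a four-day window.
	days := []Day{{5, 0}, {3, 2}, {0, 5}, {8, 0}}
	remaining, impediments := Burndown(30, days)
	fmt.Println("remaining:  ", remaining)
	fmt.Println("impediments:", impediments)
}
```

Plotting both series side by side makes a flat day on the burndown readable: on the third day nothing was burned, but five points went into impediments.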

Slide 35

✔ Decision making ✔ Vision ✔ Creation of value ✔ Respect for the rhythm of the team

Slide 36

4th step: agile telemetry

Slide 37

Agile telemetry workflow

Slide 38

Extracting metrics from Jira: bot.jerem (language: Golang) scrapes the JIRA API and sends the metrics by HTTP POST to the metrics database (Warp10)
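The push step of such a pipeline could look roughly like the Go sketch below. It is not the bot.jerem code: the metric class name, the label names and the function names are made up for illustration. It assumes the Warp 10 update endpoint (`/api/v0/update`), the `X-Warp10-Token` header, the text-based GTS input format, a platform configured for microsecond timestamps, and label values without characters that need escaping.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"strings"
)

// FormatGTS renders one data point in the Warp 10 GTS input format:
//
//	TS// class{label=value,...} value
//
// Location and elevation are left empty. Labels are sorted so the output
// is deterministic. Label values are assumed not to need escaping.
func FormatGTS(tsMicros int64, class string, labels map[string]string, value int64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+labels[k])
	}
	return fmt.Sprintf("%d// %s{%s} %d", tsMicros, class, strings.Join(pairs, ","), value)
}

// Push sends data points to a Warp 10 update endpoint with an HTTP POST.
// endpoint is the full URL of the update API; writeToken is a write token.
func Push(endpoint, writeToken string, lines []string) error {
	body := strings.NewReader(strings.Join(lines, "\n"))
	req, err := http.NewRequest(http.MethodPost, endpoint, body)
	if err != nil {
		return err
	}
	req.Header.Set("X-Warp10-Token", writeToken)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("warp10 update failed: %s", resp.Status)
	}
	return nil
}

func main() {
	// Hypothetical metric: story points done for one epic of one project.
	line := FormatGTS(1594288800000000, "jira.epic.storypoints.done",
		map[string]string{"project": "OB", "epic": "Alpha"}, 12)
	fmt.Println(line)
	// To actually send it (endpoint and token are placeholders):
	// err := Push("https://warp10.example.org/api/v0/update", "WRITE_TOKEN", []string{line})
}
```

Keeping the formatting separate from the HTTP call makes the data-point layout easy to check without a running Warp 10 instance.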

Slide 39

From metrics to visualisation [Diagram: 1 object per data point; tickets of 3, 3, 3, 5, 5, 8, 8 and 8 pts. Example: Project: OB, Epic: Alpha, Information: DONE tasks, 12 pts then 18 pts over time, at a regular interval of about 20 min]

Slide 40

And then to the dashboard

Slide 41

OVHcloud agile telemetry
MACRO
• Top management
• Product Marketing
• Program Manager
MICRO
• Development team
• Scrum Master
• Technical leader

Slide 42

Last but not least… “We have kept our promises, with a fairly good level of quality” - Developer “We don’t apply the method mindlessly, we adapt it” - Developer “Product release with good visibility and accuracy” - Manager A convinced product team

Slide 43

That’s all, folks! Thank you all!