Industrializing AI in the cloud

A presentation at Cloud Expo Europe - Madrid Tech in October 2021 in Madrid, Spain by Horacio Gonzalez

Slide 1

Cloud and Artificial Intelligence industrialization
Matías Sosa & Horacio González
2021-10-27

Slide 2

Who are we?
Introducing ourselves and introducing OVHcloud

Slide 3

Matías Sosa
Marketing Product Manager

Slide 4

Horacio Gonzalez
@LostInBrittany
Spaniard lost in Brittany, developer, dreamer and all-around geek
Flutter

Slide 5

OVHcloud: A global leader

Product lines: Web Cloud & Telecom, Private Cloud, Public Cloud, Storage, Network & Security

30 data centers in 12 locations
1 million+ servers produced since 1999
34 points of presence on a 20 Tbps bandwidth network
1.5 million customers across 132 countries
2200 employees worldwide
3.8 million websites hosted
115K Private Cloud VMs running
1.5 billion euros invested since 2016
300K Public Cloud instances running
P.U.E. 1.09 (energy efficiency indicator)
380K physical servers running in our data centers
20+ years in business, disrupting since 1999

Slide 6

The many faces of AI
And the people who work on it

Slide 7

We often identify two kinds of AI users

Slide 8

But there is a third one: DevOps/SRE

Slide 9

They speak different languages

Slide 10

But they need to work together

Slide 11

The challenge of integration
Integrating AI/ML & DevOps/SRE teams, processes, and tools

Slide 12

Many questions to answer…

Slide 13

Automate the end-to-end pipeline
From idea to production

Slide 14

OVHcloud & AI
Our answer to AI pipeline automation

Slide 15

Our approach to tackle the problem

Slide 16

AI Platform

Slide 17

OVHcloud AI Platform

Store: OVHcloud Object Storage
Explore and preprocess: OVHcloud Data Processing
Train: OVHcloud AI Training
Deploy and Serve: OVHcloud AI Serving

Each project step can be managed with user-friendly AI Notebooks
Quickly train your model without complex setup or configuration, with CPU/GPU parallelization
Built on OVHcloud's trusted and secured cloud, designed for large datasets (leveraging Object Storage scalability)
Deploy your model with industry-leading AI frameworks
Working with third parties to offer out-of-the-box AI services and ready-to-use ML models adapted to specific use cases (e.g., Healthcare, Transport)

Slide 18

OVHcloud AI Hub

Ex: Chatbot, Search engine, Fraud detector, Healthcare apps, DataViz, Translators, …

AI Hub
✔ Hosted on OVHcloud
✔ Fully integrated
✔ Easy to use (as a service)
More partners to come (ETA Nov. 2021)

Diagram labels: Ongoing; Done; Speech, Image, Text; AI Frameworks; AI Hub (integrated partners & frameworks); Cloud infrastructure & policies; AI Services; Consulting; MCO; … and more

Slide 19

Secured by design

Software and Infrastructure
✔ OpenStack-based public cloud platform
✔ Strong authentication mechanisms (Keystone)
✔ Fully managed by OVHcloud (no root access)
✔ Availability in multiple regions
✔ Clustered and resilient AI services by default

Certifications & Compliance
✔ ISO 27001
✔ HDS/HIPAA
✔ GDPR

Slide 20

Pay as you go, simple, and aggressive pricing

Object storage (GRA + BHS)
Pay per GB, starting at: 0.01 € excl. tax /month /GB to store + 0.01 € excl. tax /GB traffic OUT
(example: 10 TB = 100 € excl. tax /month)

AI Notebooks / AI Training
Pay per GPU per minute, starting at: 1.75 € /hour /GPU (NVIDIA V100S 32GB)
Pay per CPU per minute, starting at: 0.03 € /hour /CPU (Intel Xeon 1 vCPU + 4GB)

What's the cost of 50 hours of notebook with 1 x NVIDIA V100 GPU?
Chart values: 87 €, 121 €, 161 €, 195 €
Offers compared:
OVHcloud AI-standard, 1 x V100S 32GB
GCP Standard_V100, 1 x V100 16GB
Azure Standard_NC6s_v3, 1 x V100 16GB
AWS P3.2xlarge, 1 x V100 16GB
Prices in EU datacenters, without storage attached, no period commitment.
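The per-minute pricing above can be checked with a little arithmetic. The following is only a sketch using the starting prices quoted on the slide; the function names, the rounding to the nearest cent and the 1 TB = 1000 GB conversion are assumptions made for illustration, not OVHcloud's actual billing rules.

```python
from decimal import Decimal, ROUND_HALF_UP

# Starting prices quoted on the slide (EU datacenters, excl. tax).
GPU_EUR_PER_HOUR = Decimal("1.75")          # per NVIDIA V100S 32GB GPU
CPU_EUR_PER_HOUR = Decimal("0.03")          # per Intel Xeon vCPU + 4GB
STORAGE_EUR_PER_GB_MONTH = Decimal("0.01")  # Object Storage, per GB stored

CENT = Decimal("0.01")


def compute_cost(minutes: int, units: int, eur_per_hour: Decimal) -> Decimal:
    """Cost of `units` GPUs or CPUs used for `minutes`, billed per minute.

    Rounding to the nearest cent is an assumption of this sketch.
    """
    cost = Decimal(minutes) * units * eur_per_hour / 60
    return cost.quantize(CENT, rounding=ROUND_HALF_UP)


def storage_cost(gigabytes: int, months: int = 1) -> Decimal:
    """Cost of keeping `gigabytes` in Object Storage for `months` months."""
    cost = Decimal(gigabytes) * months * STORAGE_EUR_PER_GB_MONTH
    return cost.quantize(CENT, rounding=ROUND_HALF_UP)


# 50 hours of notebook on one V100S GPU (the slide's comparison scenario).
notebook_50h = compute_cost(minutes=50 * 60, units=1, eur_per_hour=GPU_EUR_PER_HOUR)

# 10 TB stored for one month, assuming 1 TB = 1000 GB (the slide's example).
storage_10tb = storage_cost(gigabytes=10_000)

print(f"50 h on 1 GPU: {notebook_50h} EUR")
print(f"10 TB for one month: {storage_10tb} EUR")
```

With these inputs the notebook scenario comes to 87.50 €, in line with the 87 € shown in the chart, and the storage example comes to 100 € per month.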

Slide 21

An answer to many AI questions…

We need to set up / use AI environments easily & quickly. Infrastructure? Not our job.
Managed services, no setup cost, no sysadmin skills required

Our team needs to collaborate and have guaranteed access to resources.
Object storage synced & linked to GPUs or CPUs

Our needs for resources are evolving. We need scalability and pay as we grow.
Simple & predictable pay as you go, per minute

How do we guarantee the privacy, security and compliance of our data and models in the Cloud?
GDPR compliant, ISO 27001, HDS certifications

Slide 22

OVHcloud & AI: sum-up

For everyone, everywhere: available anywhere in the world. Can be launched by anyone in self-service.
Made with communities: working closely with worldwide AI communities and partners.
European sovereignty: European legislation, powered by open source.
End-to-end AI offering: built on 20 years of cloud experience, with an open-source mindset.

Slide 23

Some use-cases
Building AI together

Slide 24

Use-case #1: AI for Health Challenge 2020

OVHcloud infrastructure has been leveraged for two projects in the AI for Health Challenge 2020:
Identify and forecast live cancer diagnosis with AI algorithms
Develop AI algorithms to help therapeutic decision to detect lung cancer

Partners
16 startups working on the projects
Isolated/secured data and compute environment, with an additional audit by APHP
ISO 27k certified, GDPR compliant, HDS, HIPAA
In line with a limited budget while leveraging high-end AI infrastructure
1M€ reward for the 2 winning startups

Slide 25

Use-case #2: Zaion

Context
Zaion is the European expert on customer relationship tools: callbot, welcomebot, chatbot, …
Their main differentiator: strong accuracy, enhanced features (emotion detection)

Challenges
A lot of data (audio) to play with. They need a lot of GPU power and low latency with their datasets.

Products
Zaion bots: voice recognition (speech to text), emotion detection, answer generation
Example dialogue: "Hello, I may have COVID! What should I do?" / "Hello! First, can you describe the symptoms?"

Slide 26

Use-case #2: Zaion

Infra used
• 7 GPUs in parallel
• 20 TB of data

Diagram labels: AI Notebooks; valid code; AI Training (three jobs); trained models; several TB for datasets; large-capacity storage (OVHcloud Object Storage)

Challenges solved
✔ Ability to work on huge datasets
✔ Lots of GPUs in parallel
✔ Low latency between data/GPUs
✔ No more infra to maintain 🡺 they gain time
✔ Improve their bots with new data / new code

Users can benefit from smart AI bots

Slide 27

Do you want to know more? 16-17 November 2021

Slide 28

That’s all, folks! Thank you all!