Cloud Computing & Storage - More than a File System in the Sky

A presentation at Guest Lecture - Storage Systems Class in October 2012 in Pittsburgh, PA, USA by Erik Riedel

Slide 1

Cloud Computing & Storage: More than a File System in the Sky
Erik Riedel, PhD
Technology & Architecture, Cloud Infrastructure Group, EMC
© Copyright 2012 EMC Corporation. All rights reserved.

Slide 2

[Figure: “One Vision” EMC brand chart. Legible labels: the EMC master brand, Avamar, Centera, Data Domain, DiskXtender, Documentum, Documentum xCP, Greenplum Community Edition. The remaining labels are truncated in the source.]

Slide 3

Slide 4

Slide 5

Cloud Computing

Slide 6

Slide 7

Supporting the Shift to Cloud Inside, Outside, and Across Organizations

Cloud is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g. networks, servers, storage, applications) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

  • Private Cloud – Infrastructure deployed and operated exclusively for an organization or enterprise
  • Hybrid Cloud – Composition of two or more clouds, private and/or public
  • Public Cloud – Infrastructure made available to general public or many industry groups/customers

Source: National Institute of Standards and Technology, V15, October 2009

Slide 8

Big Data

Slide 9

IN 2010 THE DIGITAL UNIVERSE WAS 1.2 ZETTABYTES
1,200,000,000,000,000,000,000
Source: 2010 IDC Digital Universe Study

  • 600 million disk drives sold in 2011 (so another 1.2 ZB!)
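
The arithmetic behind that bullet can be checked with a short sketch. The slide does not state a per-drive capacity; the 2 TB average used below is an assumption, chosen because it is the value that makes 600 million drives add up to 1.2 ZB.

```python
# Sanity check for the "another 1.2 ZB" remark, using decimal (SI) units.
TERABYTE = 10**12
ZETTABYTE = 10**21

digital_universe_2010_bytes = 1_200_000_000_000_000_000_000  # 1.2 ZB, from the slide
drives_sold_2011 = 600_000_000                                # from the slide

# Average capacity each drive would need for the two figures to match.
implied_bytes_per_drive = digital_universe_2010_bytes // drives_sold_2011

# Assumption (not on the slide): every drive averages 2 TB.
assumed_tb_per_drive = 2
shipped_bytes = drives_sold_2011 * assumed_tb_per_drive * TERABYTE

print(implied_bytes_per_drive / TERABYTE, "TB per drive implied")
print(shipped_bytes / ZETTABYTE, "ZB shipped under the 2 TB assumption")
```

If the real average capacity was lower than 2 TB, the shipped total scales down in proportion.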

Slide 10

Who Is It Really For

Slide 11

IT Managers vs. Programmers
  • Programmers buzz – Ruby/Rails, MapReduce/Hadoop
  • IT Managers buzz – VM images, vApps, VLANs
  • Marketing buzz – Virtualization, IaaS, PaaS, SaaS

Slide 12

The previously separate roles of software developer and operations have [become] increasingly intermeshed and intertwined. Things are materially different…
Ray Ozzie, Chief Software Architect, Microsoft

Slide 13

IT Managers, Programmers
  • Programmers buzz – Ruby/Rails, MapReduce/Hadoop
  • IT Managers buzz – VM images, vApps, VLANs
  • Marketing buzz – Virtualization, IaaS, PaaS, SaaS

Slide 14

Cloud is often an “excuse” for enterprises to move to “New IT” – away from the old client/server model that has been used for the past ten years [toward Web 2.0 IT]
Werner Vogels, CTO, Amazon

Slide 15

Agility Is The #1 Private Cloud Driver
[Chart: 55% Agility, 9% Business Alignment; the 15% and 24% slices are labeled Other and Cost, in an order the extracted text does not make clear.]
“The majority see agility and speed as the primary benefits of private cloud computing.” (Gartner)
Source: “The Drivers And Challenges Of Private Cloud Computing”, March 2011, Gartner

Slide 16

A Few Details

Slide 17

It’s not possible to “start over” and re-write all applications using scale-out design patterns in the first few months of a cloud deployment, but it is possible to adapt many legacy applications with the help of virtualization, so cloud infrastructure can support and enable both development models, including mixing the two.

Slide 18

“Developers” Range Widely in Focus/Expertise
  • IT managers/admins deploying applications encapsulated or pre-packaged into virtual machines
    – Language: configuration scripts, command lines
    – Input: catalog of vApp templates or pre-configured VMs
    – Output: VM images, VM configurations, system configurations
    – Runs on: vSphere/ESX, virtual networks, legacy storage + scale-out storage
  • Programmers using application frameworks such as Groovy/Grails or Hadoop
    – Language: Grails/Java, MapReduce/Hadoop
    – Input: code, with help of an IDE
    – Output: Rails + database configurations, job scripts
    – Runs on: Rails + MySQL, virtual networks, scale-out storage

Slide 19

Apps + Data
  • Development – new applications
    – explicitly scale-out (e.g. MapReduce, Hadoop)
    – built on higher-level frameworks (e.g. Ruby/Rails, Azure)
  • Deployment – legacy applications
    – “packaged” into virtual machine containers
    – easy to replicate and migrate across virtual infrastructure
  • Data
    – shared corporate data is the common ground (enterprise apps)
    – consumer value centered around their personal data (consumer apps)

Slide 20

Example – Deployment
Marketing buzz – IaaS – Infrastructure as a Service

Slide 21

Example – Development
Marketing buzz – PaaS – Platform as a Service

Slide 22

Example – EMC Greenplum HD
Enterprise-Ready Hadoop Platform For Unstructured Data
  • Addresses The Growth Of Unstructured Data
  • More Reliable For The Enterprise
  • Easier To Use With Existing Systems And Tools
Marketing buzz – Big Data – MapReduce, Hadoop

Slide 23

More About Apps + Data
  • From the perspective of development & deployment, the key new technology component is a combined data + app (storage + compute) platform where apps are created, deployed, monitored & managed with a common set of tools.
  • Underlying enablers:
    • Common object space – apps, configs, user data
    • Single identity store – public, private, enterprise, consumer
    • Federation (public + private) – seamless across infrastructures
    • Monitoring – continuous measurement to optimize (and generate bills)

Slide 24

Under The Covers
What About The Data?

Slide 25

Cloud – A New Architecture
  • Old World – Physical: apps on Dedicated, Vertical Stacks
  • New World – Virtual: apps on Dynamic Pools Of Compute & Storage
Operating Systems & Frameworks “disappear” into the cloud fabric

Slide 26

Builds on 20 Years of Storage Research
  • APIs vs. mount points
    – “no slashes required”
    – blocks vs. files vs. objects vs. “APIs”
  • App-driven and policy-automated / GUI
    – self-configuring, self-organizing, self-tuning, self-*
  • Built-in data services
    – self-healing RAID
  • Unlimited namespace, dynamic
    – billions and billions of objects, large and small
  • Native multi-tenancy
    – security/auth, monitoring, resource isolation
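
To make the "APIs vs. mount points" contrast concrete, here is a minimal sketch of a flat object namespace. It is a toy in-memory model, not the Atmos API: the `ObjectStore` class and its `create`, `read`, and `find` methods are hypothetical names invented for illustration. Objects are addressed by an opaque ID and located by metadata, so no directory path (and no slash) is involved.

```python
import uuid
from typing import Dict, List, Optional, Tuple


class ObjectStore:
    """Toy in-memory object store with a flat namespace (hypothetical API)."""

    def __init__(self) -> None:
        # object ID -> (payload bytes, metadata dict)
        self._objects: Dict[str, Tuple[bytes, Dict[str, str]]] = {}

    def create(self, data: bytes, metadata: Optional[Dict[str, str]] = None) -> str:
        """Store a payload and return an opaque ID; the caller never picks a path."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def read(self, object_id: str) -> bytes:
        """Fetch a payload by ID."""
        return self._objects[object_id][0]

    def find(self, **wanted: str) -> List[str]:
        """Return IDs of objects whose metadata matches every given key/value."""
        return [
            object_id
            for object_id, (_data, meta) in self._objects.items()
            if all(meta.get(key) == value for key, value in wanted.items())
        ]


store = ObjectStore()
photo_id = store.create(b"\x89PNG...", {"tenant": "acme", "type": "photo"})
store.create(b"quarterly numbers", {"tenant": "acme", "type": "report"})

print(photo_id)                     # opaque hex ID, no slashes
print(store.find(type="photo"))     # lookup by metadata instead of by directory
```

A file system would require the application to choose a mount point and a path up front; here the store owns placement and the application only keeps IDs and metadata.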

Slide 27

EMC Atmos
BIG. SMART. ELASTIC.

Slide 28

Atmos Gen 2 Hardware Configurations
Intel® Xeon® 5500 “Nehalem” microarchitecture; 1 / 2 / 3 TB SATA drives; expansion nodes (2 nodes per expansion)
Categories shown: Dense Compute, Capacity, Small/Medium Scale Compute
  • WS2-120
    – 1:15 server-disk ratio
    – 60 – 120 disks
    – 60 TB entry point / 240 TB max
    – 2 Node / 30 Disk expansions (7.2K)
    – PWR: 5 kW, HEAT: 16.7k BTU/hr
  • WS2-240
    – 1:15 server-disk ratio
    – 60 – 240 disks
    – Up to 480 TB total capacity
    – 2 Node / 30 Disk expansions (7.2K)
    – PWR: 10 kW, HEAT: 34.2k BTU/hr
  • WS2-360
    – 1:60 server-disk ratio
    – 240 or 360 disks
    – Up to 720 TB total capacity
    – 2 Node / 120 Disk expansions (7.2K)
    – PWR: 10.3 kW, HEAT: 35.1k BTU/hr
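
The capacity and heat figures on this slide follow from simple arithmetic, sketched below. Two things are assumptions rather than statements from the slide: the pairing of each figure with a model (read from the order in which the numbers appear), and the use of 2 TB drives for the maximum-capacity check (the slide lists 1, 2, and 3 TB options). The kW to BTU/hr factor is the standard conversion.

```python
BTU_PER_HOUR_PER_KW = 3412.14  # standard conversion: 1 kW is about 3412.14 BTU/hr

# Figures as listed on the slide; pairing them with models is our reading.
CONFIGS = {
    "WS2-120": {"max_disks": 120, "disks_per_server": 15, "power_kw": 5.0},
    "WS2-240": {"max_disks": 240, "disks_per_server": 15, "power_kw": 10.0},
    "WS2-360": {"max_disks": 360, "disks_per_server": 60, "power_kw": 10.3},
}


def max_capacity_tb(max_disks: int, tb_per_drive: int) -> int:
    """Raw capacity when every slot holds a drive of the given size."""
    return max_disks * tb_per_drive


def servers_needed(max_disks: int, disks_per_server: int) -> int:
    """Server count implied by the server-to-disk ratio."""
    return max_disks // disks_per_server


def heat_btu_per_hour(power_kw: float) -> float:
    """Heat output if all electrical power ends up as heat."""
    return power_kw * BTU_PER_HOUR_PER_KW


for model, cfg in CONFIGS.items():
    print(
        model,
        max_capacity_tb(cfg["max_disks"], 2), "TB with 2 TB drives,",
        servers_needed(cfg["max_disks"], cfg["disks_per_server"]), "servers,",
        round(heat_btu_per_hour(cfg["power_kw"])), "BTU/hr",
    )
```

With 2 TB drives the computed capacities match the 240, 480, and 720 TB figures on the slide, and the computed heat values land close to the listed ones.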

Slide 29

  • commodity SATA drives (as many as possible)
  • x86 servers/controllers (as few as possible)
  • SAS backplanes/cables (just the right number)
[Image: Promo Code 1, Front (tray pulled out)]

Slide 30

[Figure: drive-density comparison of storage enclosures. Labels: Dell, Supermicro, Backblaze, Promo Code 1 (Front, tray pulled out). Values shown: 8.6, 6, 12, 11.3, and 11.3 drives/U; the pairing of values with labels is not recoverable from the extracted text.]

Slide 31

  • commodity SATA drives (as many as possible)
  • x86 servers/controllers (as few as possible)
  • SAS backplanes/cables (just the right number)
14.1 drives/U
[Image: Promo Code 1, Front (tray pulled out)]

Slide 32

A New Approach For Distributed Big Data
Storage Islands (L.A., BOSTON, LONDON)
  • Disparate Systems
  • Manual Administration
  • One Tenant, Many Systems
  • IT Provisioned Storage
Single Storage Pool (L.A., BOSTON, LONDON)
  • Single System Across Locations
  • Automated Policies
  • Many Tenants, One System
  • Self-Service Access

Slide 33

What is EMC Atmos?
  • Custom or Packaged Applications
  • REST, SOAP, or file services access
  • No limits on namespace or location
  • Multi-tenancy securely isolates data
  • Automated location, protection, and efficiency services
  • Single GUI
  • Available on purpose-built appliances or virtualized software
  • Self-Service Experience
Sites: SITE #1 Los Angeles, SITE #2 New York, SITE #3 London

Slide 34

Case Studies
  • Petabyte-scale
  • Geographic distribution
  • Policy-driven storage
Case Study – Content-rich Web App
[Diagram labels: Customer-Facing Web Application; Atmos Web Service Interfaces + Metadata; Atmos Policy “isPaid N”; Atmos Policy “isPaid Y”; Canada, U.S., EMEA]
Content-Rich Web App on Atmos
  • Global distribution, content mix
  • Multi-tenancy, scale to multiple sites
  • Policy supports business models
CareCore “gets in the cloud” with Atmos
  • Wrote to Atmos REST API in one week
  • Bought Atmos and deployed in three weeks
  • Adding over 2 million objects a day to Atmos
  • Started with one app, spreading to many more
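
The "isPaid" policies in this case study suggest metadata-driven placement: the application tags each object, and the storage layer picks a policy from the tag. The sketch below illustrates that idea only. The slide does not say what each policy actually does, so the replica layouts here (all three regions for paid content, one region for unpaid) are assumptions, and `Policy` and `select_policy` are hypothetical names, not part of any Atmos interface.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

REGIONS: Tuple[str, ...] = ("Canada", "U.S.", "EMEA")  # regions named on the slide


@dataclass(frozen=True)
class Policy:
    name: str
    replica_regions: Tuple[str, ...]


# Assumed behavior: paid content is replicated everywhere, unpaid content is not.
POLICIES: Dict[str, Policy] = {
    "Y": Policy(name="isPaid Y", replica_regions=REGIONS),
    "N": Policy(name="isPaid N", replica_regions=("U.S.",)),
}


def select_policy(metadata: Dict[str, str]) -> Policy:
    """Pick a storage policy from object metadata; untagged objects count as unpaid."""
    return POLICIES[metadata.get("isPaid", "N")]


paid = select_policy({"isPaid": "Y", "owner": "subscriber-17"})
free = select_policy({"owner": "trial-user"})
print(paid.name, "->", paid.replica_regions)
print(free.name, "->", free.replica_regions)
```

The point of the design is that the business rule lives in the policy table, so changing what a paid tier receives does not require touching the application that writes the objects.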

Slide 35

Builds on 20 Years of Storage Research
  • APIs vs. mount points
    – “no slashes required”
    – blocks vs. files vs. objects vs. “APIs”
  • App-driven and policy-automated / GUI
    – self-configuring, self-organizing, self-tuning, self-*
  • Built-in data services
    – self-healing RAID
  • Unlimited namespace, dynamic
    – billions and billions of objects, large and small
  • Native multi-tenancy
    – security/auth, monitoring, resource isolation

Slide 36

Summary
What Changes

Slide 37

Summary – Structural Changes
Enterprise IT challenges/pain points
  • Adapting to the business model changes of cloud
    – Answer: private + public clouds with federation
  • Adapting to development model changes of cloud
    – Answer: leverage new tools, frameworks to develop Web 2.0 and scale-out apps
  • Migrating legacy applications to cloud
    – Answer: virtualization to encapsulate legacy OS + apps
  • Managing data across apps & users – governance
    – Answer: a combined data + app platform to manage the data flow among apps and virtual machines

Slide 38

Questions?

Slide 39

References
  • Geoff Moore, “Partly Cloudy: Business and Innovation in the Internet Era”, September 2010 – www.snia.org/cloud/Cloudburst/Moore_SNIA_Keynote.pdf
  • Peter Mell & Tim Grance, “The NIST Definition of Cloud Computing”, October 2009 – csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-v15.doc
  • Any business or computing magazine published at any point in 2009 or 2010

Slide 40

Slide 41

Big Data Challenges
INFORMATION IN THE ENTERPRISE WILL GROW 50X IN THE NEXT 10 YEARS
  • Unstructured Content – Prepare for digital universe explosion: 34 zettabytes of growth to 2020
  • Distributed Big Data – Aggregate data as a business advantage; manage as one system
  • Accessibility – Make available around the globe, from any device, any location

Slide 42

IN A DECADE THE DIGITAL UNIVERSE WILL BE 35 ZETTABYTES
35,000,000,000,000,000,000,000
Source: 2010 IDC Digital Universe Study
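
For scale, the two IDC figures quoted in this deck (1.2 ZB in 2010, 35 ZB a decade later) can be turned into a growth factor and an implied yearly rate. This is a back-of-the-envelope sketch that assumes smooth compound growth over exactly ten years, which the study itself does not claim.

```python
size_2010_zb = 1.2   # from the 2010 slide
size_2020_zb = 35.0  # from this slide
years = 10           # "in a decade"

growth_zb = size_2020_zb - size_2010_zb        # absolute growth
growth_factor = size_2020_zb / size_2010_zb    # how many times larger

# Compound annual growth rate: the constant yearly rate that turns
# the 2010 size into the 2020 size after `years` steps.
annual_growth_rate = growth_factor ** (1 / years) - 1

print(f"growth: {growth_zb:.1f} ZB")
print(f"factor: {growth_factor:.1f}x")
print(f"implied annual rate: {annual_growth_rate:.1%}")
```

The absolute growth rounds to the "34 zettabytes of growth to 2020" quoted on the Big Data Challenges slide.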

Slide 43

Key Technology Components
  • Policy-driven Orchestration
    – Application mgmt – via virtualization
    – Data mgmt – ILM is finally required
    – Continuous measurement & monitoring – to meter/bill; to maintain high efficiency
  • vPods (virtualization)
    – Enables easy migration and replication of containerized applications
    – Drives highly efficient resource utilization
    – Eases rapid deployment of new applications & new services
  • Pods (packaged racks)
    – Rack-level deployment of infrastructure (compute + network + storage)
    – Drives highly efficient acquisition and deployment vs. traditional full custom or semi-custom design-per-app
[Diagram labels: vPod(s), Customer [virtual] Data Center, MAN/WAN, VPN, Pod(s), Service Provider Data Center]
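
The "meter/bill" role of continuous measurement can be illustrated with a small sketch. Everything specific here is hypothetical: the hourly sampling interval, the tenant names, and the rate are made up for the example, and the functions do not correspond to any EMC tool. The idea is only that periodic usage samples per tenant are summed into GB-hours and then priced.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One sample per tenant per hour: (hour index, tenant, GB stored at that time).
Sample = Tuple[int, str, float]


def gb_hours(samples: List[Sample]) -> Dict[str, float]:
    """Sum hourly samples into GB-hours per tenant (each sample covers one hour)."""
    totals: Dict[str, float] = defaultdict(float)
    for _hour, tenant, gb_stored in samples:
        totals[tenant] += gb_stored
    return dict(totals)


def bill(samples: List[Sample], rate_per_gb_hour: float) -> Dict[str, float]:
    """Price each tenant's GB-hours at a flat rate (hypothetical)."""
    return {
        tenant: round(usage * rate_per_gb_hour, 2)
        for tenant, usage in gb_hours(samples).items()
    }


samples = [
    (0, "tenant-a", 100.0),
    (1, "tenant-a", 150.0),
    (0, "tenant-b", 10.0),
]
usage = gb_hours(samples)
charges = bill(samples, rate_per_gb_hour=0.001)
print(usage)
print(charges)
```

The same per-tenant totals that drive billing can also feed the efficiency side of the bullet, for example by flagging pools that sit mostly idle.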

Slide 44

Why The Cloud Is Here To Stay
Enterprise vs. Consumer Technology

Slide 45

Another angle – cloud computing is really about bringing enterprise computing technology and applications up to the norms and expectations of consumer computing technology.

Slide 46

The way we run our lives has forever changed. The employees we are hiring right out of school are appalled by the technology we use to run our companies. They are more productive at home than they are in the office.
Marc Benioff, CEO, Salesforce.com

Slide 47

“The barrier is becoming less and less between enterprises and consumers in application terms [expectations and functionality often are very much the same]”
Eric Schmidt, CEO, Google

Slide 48

From September 2010, SNIA CloudBurst keynote by Geoffrey Moore

Slide 49

Why should employees accept a 50% reduction in their productivity when they come to the office on Monday morning? On the weekend, Google can answer any question I have; on Monday, I can’t get the answer to “who are my five biggest customers?” On the weekend, someone from my high school can find me and try to be my friend; on Monday, I can’t find my VP of Finance.
Geoff Moore, Author, Crossing the Chasm

Slide 50

Consumer Attention
  • Diagram from Maya Gap
  • Courtesy Mick McManus, MAYA Design