Cloud Computing & Storage More than a File System in the Sky Erik Riedel, PhD Technology & Architecture Cloud Infrastructure Group EMC © Copyright 2012 EMC Corporation. All rights reserved. 1
A presentation at Guest Lecture - Storage Systems Class in October 2012 in Pittsburgh, PA, USA by erik riedel
ONE VISION: EMC brand slide (figure). Legible product names: Avamar, Centera, Data Domain, DiskXtender, Documentum, Documentum xCP, Greenplum Community Edition.
Cloud Computing
Supporting the Shift to Cloud: Inside, Outside, and Across Organizations
Cloud is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g. networks, servers, storage, applications) that can be rapidly provisioned and released with minimal management effort or service provider interaction.*
• Private Cloud – infrastructure deployed and operated exclusively for an organization or enterprise
• Hybrid Cloud – composition of two or more clouds, private and/or public
• Public Cloud – infrastructure made available to the general public or many industry groups/customers
*Source: National Institute of Standards and Technology, V15, October 2009
Big Data
IN 2010 THE DIGITAL UNIVERSE WAS 1.2 ZETTABYTES
1,200,000,000,000,000,000,000
Source: 2010 IDC Digital Universe Study
Who Is It Really For
IT Managers vs. Programmers
• Programmers buzz – Ruby/Rails, MapReduce/Hadoop
• IT Managers buzz – VM images, vApps, VLANs
• Marketing buzz – Virtualization, IaaS, PaaS, SaaS
“The previously separate roles of software developer and operations have [become] increasingly intermeshed and intertwined. Things are materially different…”
Ray Ozzie, Chief Software Architect, Microsoft
“Cloud is often an ‘excuse’ for enterprises to move to ‘New IT’ – away from the old client/server model that has been used for the past ten years [toward Web 2.0 IT]”
Werner Vogels, CTO, Amazon
Agility Is The #1 Private Cloud Driver
• Other – 15%
• Cost – 24%
• Business Alignment – 9%
• Agility – 55%
“The majority see agility and speed as the primary benefits of private cloud computing.” (Gartner)
Source: “The Drivers And Challenges Of Private Cloud Computing”, March 2011, Gartner
A Few Details
It’s not possible to “start over” and re-write all applications using scale-out design patterns in the first few months of a cloud deployment, but it is possible to adapt many legacy applications with the help of virtualization, so cloud infrastructure can support and enable both development models, including mixing the two.
“Developers” Range Widely in Focus/Expertise
• IT managers/admins deploying applications encapsulated or pre-packaged into virtual machines
– Language – configuration scripts, command lines
– Input – catalog of vApp templates or pre-configured VMs
– Output – VM images, VM configurations, system configurations
– Runs on – vSphere/ESX, virtual networks, legacy storage + scale-out storage
• Programmers using application frameworks such as Groovy/Grails or Hadoop
– Language – Grails/Java, MapReduce/Hadoop
– Input – code, with help of an IDE
– Output – Rails + database configurations, job scripts
– Runs on – Rails + MySQL, virtual networks, scale-out storage
Apps + Data
• Development – new applications
– explicitly scale-out (e.g. MapReduce, Hadoop)
– built on higher-level frameworks (e.g. Ruby/Rails, Azure)
• Deployment – legacy applications
– “packaged” into virtual machine containers
– easy to replicate and migrate across virtual infrastructure
• Data
– shared corporate data is the common ground (enterprise apps)
– consumer value centered around their personal data (consumer apps)
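As a rough illustration of the "explicitly scale-out" style named on this slide, here is a toy word count written in the map / shuffle / reduce shape. It is a single-process sketch in plain Python, not Hadoop code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are invented for this example.

```python
from collections import defaultdict

def map_phase(document):
    """Emit (word, 1) pairs for one input document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

def word_count(documents):
    # Each document could be mapped on a different node; here we simply loop.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["cloud storage", "cloud compute", "storage at scale"])
```

The point of the shape is that `map_phase` and `reduce_phase` share no state, so a framework is free to spread them across many machines.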
Example – Deployment
Marketing buzz – IaaS – Infrastructure as a Service
Example – Development
Marketing buzz – PaaS – Platform as a Service
Example – EMC Greenplum HD
Enterprise-Ready Hadoop Platform For Unstructured Data
• Addresses The Growth Of Unstructured Data
• More Reliable For The Enterprise
• Easier To Use With Existing Systems And Tools
Marketing buzz – Big Data – MapReduce, Hadoop
More About Apps + Data
• From the perspective of development & deployment, the key new technology component is a combined data + app (storage + compute) platform where apps are created, deployed, monitored & managed with a common set of tools.
• Underlying enablers:
– Common object space – apps, configs, user data
– Single identity store – public, private, enterprise, consumer
– Federation (public + private) – seamless across infrastructures
– Monitoring – continuous measurement to optimize (and generate bills)
Under The Covers
What About The Data?
Cloud – A New Architecture
• Old World – Physical: dedicated, vertical stacks (one per app)
• New World – Virtual: dynamic pools of compute & storage (shared by many apps)
Operating systems & frameworks “disappear” into the cloud fabric
Builds on 20 Years of Storage Research
• APIs vs. mount points – “no slashes required”
– blocks vs. files vs. objects vs. “APIs”
• App-driven and policy-automated / GUI
– self-configuring, self-organizing, self-tuning, self-*
• Built in data services
– self-healing RAID
• Unlimited namespace, dynamic
– billions and billions of objects, large and small
• Native multi-tenancy
– security/auth, monitoring, resource isolation
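To make "APIs vs. mount points" concrete, the sketch below shows a toy in-memory object store in which objects are addressed by opaque IDs and found by metadata rather than by directory paths. It is an illustration only and does not reproduce the Atmos API; the class `ObjectStore` and its methods are hypothetical names.

```python
import uuid

class ObjectStore:
    """Toy in-memory object store: flat namespace, IDs instead of paths."""

    def __init__(self):
        self._objects = {}

    def create(self, data, metadata=None):
        object_id = uuid.uuid4().hex  # opaque ID; no directory hierarchy
        self._objects[object_id] = (bytes(data), dict(metadata or {}))
        return object_id

    def read(self, object_id):
        return self._objects[object_id][0]

    def metadata(self, object_id):
        return dict(self._objects[object_id][1])

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches all criteria."""
        return [oid for oid, (_, meta) in self._objects.items()
                if all(meta.get(k) == v for k, v in criteria.items())]

store = ObjectStore()
photo_id = store.create(b"...jpeg bytes...", {"owner": "alice", "type": "photo"})
store.create(b"report", {"owner": "bob", "type": "document"})
alice_photos = store.find(owner="alice", type="photo")
```

Because the application never sees a path, the store is free to decide where each object physically lives, which is what lets placement be driven by policy.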
EMC Atmos
BIG. SMART. ELASTIC.
Atmos Gen 2 Hardware Configurations
Positioning labels on the slide: Dense; Compute; Small/Medium Scale
Common to all configurations:
• Compute – Intel® Xeon® 5500 “Nehalem” microarchitecture
• Capacity – 1 / 2 / 3 TB SATA drives
• Expansion nodes (2 nodes per expansion)
WS2-120
• 1:15 server-disk ratio
• 60 – 120 disks
• 60 TB entry point / 240 TB max
• 2 node / 30 disk expansions (7.2K)
• PWR: 5 kW; HEAT: 16.7k BTU/hr
WS2-240
• 1:15 server-disk ratio
• 60 – 240 disks
• Up to 480 TB total capacity
• 2 node / 30 disk expansions (7.2K)
• PWR: 10 kW; HEAT: 34.2k BTU/hr
WS2-360
• 1:60 server-disk ratio
• 240 or 360 disks
• Up to 720 TB total capacity
• 2 node / 120 disk expansions (7.2K)
• PWR: 10.3 kW; HEAT: 35.1k BTU/hr
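The capacity figures on this slide can be checked with simple arithmetic. The sketch below takes the disk counts and ratios from the slide; the drive sizes (1 TB for the entry point, 2 TB for the maximums) are an assumption, chosen because they reproduce the slide's numbers, and the helper names are made up for illustration.

```python
# Disk counts are taken from the slide; the 2 TB drive size is an assumption
# chosen because it reproduces the slide's maximum-capacity figures.
MAX_DISKS = {"WS2-120": 120, "WS2-240": 240, "WS2-360": 360}
ASSUMED_DRIVE_TB = 2

def raw_capacity_tb(disks, drive_tb):
    """Raw (unprotected) capacity in TB: number of disks times drive size."""
    return disks * drive_tb

max_capacity = {model: raw_capacity_tb(disks, ASSUMED_DRIVE_TB)
                for model, disks in MAX_DISKS.items()}

# Entry point for the WS2-120: 60 disks, assuming 1 TB drives.
entry_capacity_tb = raw_capacity_tb(60, 1)

def servers_needed(disks, disks_per_server):
    """Servers implied by a 1:N server-to-disk ratio."""
    return disks // disks_per_server

ws2_120_servers = servers_needed(120, 15)  # 1:15 ratio
ws2_360_servers = servers_needed(360, 60)  # 1:60 ratio
```

Under these assumptions the dense configuration serves three times the disks of the WS2-120 with fewer servers, which is the "as few servers as possible" trade-off in numbers.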
• commodity SATA drives (as many as possible)
• x86 servers/controllers (as few as possible)
• SAS backplanes/cables (just the right number)
Figure: Promo Code 1, front (tray pulled out)
Figure: drive density comparison across chassis, including Dell, Supermicro, and Backblaze. Values shown: 8.6, 6, 12, 11.3, and 11.3 drives/U.
Figure: Promo Code 1, front (tray pulled out)
• commodity SATA drives (as many as possible)
• x86 servers/controllers (as few as possible)
• SAS backplanes/cables (just the right number)
14.1 drives/U
Figure: Promo Code 1, front (tray pulled out)
A New Approach For Distributed Big Data
Storage Islands (L.A., Boston, London)
• Disparate Systems
• Manual Administration
• One Tenant, Many Systems
• IT Provisioned Storage
Single Storage Pool (L.A., Boston, London)
• Single System Across Locations
• Automated Policies
• Many Tenants, One System
• Self-Service Access
What is EMC Atmos?
• Custom or Packaged Applications
• REST, SOAP, or file services access
• No limits on namespace or location
• Multi-tenancy securely isolates data
• Automated location, protection, and efficiency services
• Single GUI
• Self-Service Experience
• Available on purpose-built appliances or virtualized software
Figure: Site #1 Los Angeles, Site #2 New York, Site #3 London
Case Studies
• Petabyte-scale
• Geographic distribution
• Policy-driven storage
Case Study – Content-rich Web App
Figure labels: Customer-Facing Web Application; Atmos Web Service Interfaces + Metadata; Atmos Policy “isPaid N”; Atmos Policy “isPaid Y”; Canada, U.S., EMEA
Content-Rich Web App on Atmos
• Global distribution, content mix
• Multi-tenancy, scale to multiple sites
• Policy supports business models
CareCore “gets in the cloud” with Atmos
• Wrote to Atmos REST API in one week
• Bought Atmos and deployed in three weeks
• Adding over 2 million objects a day to Atmos
• Started with one app, spreading to many more
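The “isPaid” policies in this case study suggest how object metadata can drive placement. The following is a minimal sketch of that idea, not Atmos's policy engine: the policy table, the replica counts, and the mapping of each policy to regions are all invented for illustration, since the slide does not say which policy targets which sites.

```python
# Hypothetical policy table: metadata selector -> placement rule.
# The replica counts and region choices are invented for illustration.
POLICIES = [
    {"name": "paid", "match": {"isPaid": "Y"},
     "replicas": 3, "regions": ["Canada", "U.S.", "EMEA"]},
    {"name": "free", "match": {"isPaid": "N"},
     "replicas": 1, "regions": ["U.S."]},
]
DEFAULT_POLICY = {"name": "default", "match": {},
                  "replicas": 1, "regions": ["U.S."]}

def select_policy(metadata):
    """Return the first policy whose selector matches the object's metadata."""
    for policy in POLICIES:
        if all(metadata.get(k) == v for k, v in policy["match"].items()):
            return policy
    return DEFAULT_POLICY

def place(metadata):
    """Choose the regions that should hold a copy of the object."""
    policy = select_policy(metadata)
    return policy["regions"][:policy["replicas"]]

paid_sites = place({"isPaid": "Y", "owner": "alice"})
free_sites = place({"isPaid": "N"})
```

In this sketch the application only tags the object; changing how paid and free content is protected means editing the policy table, not the application.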
Summary
What Changes
Summary – Structural Changes
Enterprise IT challenges/pain points
• Adapting to the business model changes of cloud
– Answer: private + public clouds with federation
• Adapting to development model changes of cloud
– Answer: leverage new tools, frameworks to develop Web 2.0 and scale-out apps
• Migrating legacy applications to cloud
– Answer: virtualization to encapsulate legacy OS + apps
• Managing data across apps & users – governance
– Answer: a combined data + app platform to manage the data flow among apps and virtual machines
Questions?
References
• Geoff Moore, “Partly Cloudy: Business and Innovation in the Internet Era”, September 2010 – www.snia.org/cloud/Cloudburst/Moore_SNIA_Keynote.pdf
• Peter Mell & Tim Grance, “The NIST Definition of Cloud Computing”, October 2009 – csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-v15.doc
• Any business or computing magazine published at any point in 2009 or 2010
Big Data Challenges
INFORMATION IN THE ENTERPRISE WILL GROW 50X IN THE NEXT 10 YEARS
• Unstructured Content – prepare for digital universe explosion: 34 zettabytes of growth to 2020
• Distributed Big Data – aggregate data as a business advantage; manage as one system
• Accessibility – make available around the globe, from any device, any location
IN A DECADE THE DIGITAL UNIVERSE WILL BE 35 ZETTABYTES
35,000,000,000,000,000,000,000
Source: 2010 IDC Digital Universe Study
Key Technology Components
• Policy-driven Orchestration
– Application mgmt – via virtualization
– Data mgmt – ILM is finally required
– Continuous measurement & monitoring – to meter/bill; to maintain high efficiency
• vPods (virtualization)
– Enables easy migration and replication of containerized applications
– Drives highly efficient resource utilization
– Eases rapid deployment of new applications & new services
• Pods (packaged racks)
– Rack-level deployment of infrastructure (compute + network + storage)
– Drives highly efficient acquisition and deployment vs. traditional full custom or semi-custom design-per-app
Figure labels: vPod(s), Customer [virtual] Data Center3; MAN/WAN VPN; Pod(s), Service Provider Data Center
Why The Cloud Is Here To Stay
Enterprise vs. Consumer Technology
Another angle – cloud computing is really about bringing enterprise computing technology and applications up to the norms and expectations of consumer computing technology.
“The way we run our lives has forever changed. The employees we are hiring right out of school are appalled by the technology we use to run our companies. They are more productive at home than they are in the office.”
Marc Benioff, CEO, Salesforce.com
“The barrier is becoming less and less between enterprises and consumers in application terms [expectations and functionality often are very much the same]”
Eric Schmidt, CEO, Google
From September 2010, SNIA CloudBurst keynote by Geoffrey Moore
“Why should employees accept a 50% reduction in their productivity when they come to the office on Monday morning? On the weekend, Google can answer any question I have; on Monday, I can’t get the answer to ‘who are my five biggest customers?’ On the weekend, someone from my high school can find me and try to be my friend; on Monday, I can’t find my VP of Finance.”
Geoff Moore, Author, Crossing the Chasm
Consumer Attention • Diagram from Maya Gap Courtesy Mick McManus, MAYA Design