A presentation at Global Azure Bootcamp in Linz, Austria by Hans-Peter Grahsl
Microsoft Azure ❤ Open Source Tech
Microsoft + Open Source ? 2 @hpgrahsl | #Azure @GlobalAzure, 27th April 2019, Linz - Austria
GitHub Octoverse 2018
HDInsight Services
$ whoami • Hans-Peter Grahsl • living & working in Graz • technical trainer at • independent engineer & consultant • associate lecturer • irregular conference speaker
Application Needs? • “It depends”! • plenty of pieces & components • irrespective of concrete use cases • two architectural pieces…
operational data store
Cosmos DB
messaging platform
Event Hubs
Cosmos DB • global distribution • high availability & elastic scaling • multi-model & native NoSQL APIs • consistency choices • leading security, compliance & SLAs
Multi-Model • keys + values • documents • column families • graphs
Native API Support • SQL • Table Storage • MongoDB • Cassandra • Gremlin
Throughput Provisioning • request units (RUs) • abstract performance metric • combination of CPU + Memory + IOPS • backed by SLAs
exciting …in theory
Azure Docs on RUs “The cost of all database operations is normalized by Azure Cosmos DB and is expressed in terms of Request Units (RUs). The cost to read a 1-KB item is 1 Request Unit (1 RU) and minimum RUs required to consume 1 GB of storage is 40.”
daunting …in practice
Measure via API • write operation consumed ??? RUs • given 1000 provisioned RUs at most ☛ 1000 ÷ 20.19 ≅ 49 docs/sec • no parallel activity considered
Measure via API • read operation consumed ??? RUs • given 1000 provisioned RUs at most ☛ 1000 ÷ 3.11 ≅ 321 reads/sec • no parallel activity considered
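The two estimates above can be reproduced with a few lines of Python. This is a minimal sketch: the per-operation charges (20.19 RUs per write, 3.11 RUs per read) and the 1000 provisioned RUs are taken from the slides, while the function name `max_ops_per_second` is hypothetical and used only for illustration. As on the slides, no parallel activity is considered.

```python
import math


def max_ops_per_second(provisioned_rus: float, ru_per_op: float) -> int:
    """Upper bound on whole operations per second for a given RU budget.

    Assumes the budget is spent on a single kind of operation and that
    nothing else consumes RUs at the same time.
    """
    if ru_per_op <= 0:
        raise ValueError("ru_per_op must be positive")
    return math.floor(provisioned_rus / ru_per_op)


# Charges measured in the talk (values from the slides).
writes_per_sec = max_ops_per_second(1000, 20.19)
reads_per_sec = max_ops_per_second(1000, 3.11)
print(writes_per_sec, reads_per_sec)
```

Any concurrent query, script, or indexing work draws from the same budget, so real throughput will typically be lower than these bounds.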
Influences on RUs • indexing • property count • consistency level • query patterns • scripts …and any other concurrent operation
Cosmos DB for MongoDB API • native implementation • wire protocol compatibility • transparent re-use (code + tools) • migration benefits: in both ways
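Because the service speaks the MongoDB wire protocol, existing drivers and tools should only need a different connection URI. The sketch below builds such a URI with the standard library. The host pattern `<account>.documents.azure.com`, port `10255`, and the `ssl=true&replicaSet=globaldb` options are assumptions based on the URI format the Azure portal displayed around the time of the talk; the helper name `cosmos_mongo_uri` and the sample account values are hypothetical. Always copy the actual connection string from the portal.

```python
from urllib.parse import quote_plus


def cosmos_mongo_uri(account: str, key: str) -> str:
    """Build a MongoDB connection URI for a Cosmos DB account.

    Host, port, and query options are assumptions (see lead-in), not
    values confirmed by the talk. The key is percent-encoded because
    account keys are base64 and may contain '+', '/' and '='.
    """
    return (
        f"mongodb://{account}:{quote_plus(key)}"
        f"@{account}.documents.azure.com:10255/"
        "?ssl=true&replicaSet=globaldb"
    )


# Hypothetical account name and key, for illustration only.
uri = cosmos_mongo_uri("myaccount", "abc+/==")
print(uri)
```

The resulting string can be handed to any MongoDB client that accepts a standard connection URI, which is what makes the re-use of code and tools transparent.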
Showtime!
Important Note • feature support is limited of course • carefully check what does (NOT) work • wire protocol versions: 3.2 GA, 3.4 preview
3.6 ?? 4.0
Currently Missing Out On… • powerful operators from 3.2 + 3.4 - $graphLookup, $facet, $bucket(Auto) - $reduce, $zip, $switch, $replaceRoot, … • great features & major advances from 3.6 + 4.0 - views, change streams, - type conversions, schema validation, - multi-element array updates, array filters, - sessions & transactions, …
Event Hubs • distributed event ingestion platform • decouple producers ⬌ consumers • source for stream processing • high availability & scalability • fully-managed
Concepts: Event Hub
Concepts: Partition
Concepts: Partition Keys
Concepts: Partition Offsets
Concepts: Big Picture
Event Hubs for Apache Kafka • overlay on top of Event Hubs • binary compatible with Kafka 1.0+ • transparent re-use (code + tools) • migration benefits: in both ways
The Promise… “You update the connection string in configurations to point to the Kafka endpoint exposed by your event hub instead of pointing to your Kafka cluster. Then, you can start streaming events from your applications that use the Kafka protocol into Event Hubs.”
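The configuration change described in the quote can be sketched as follows. The function derives Kafka client settings from an Event Hubs connection string. The property names follow the librdkafka style used by clients such as confluent-kafka; port `9093`, `SASL_SSL` with the `PLAIN` mechanism, and the literal user name `$ConnectionString` reflect the documented Event Hubs Kafka endpoint settings as far as I know them. The helper name `kafka_config_for_event_hubs` and the sample connection string are hypothetical.

```python
from urllib.parse import urlparse


def kafka_config_for_event_hubs(connection_string: str) -> dict:
    """Derive Kafka client settings from an Event Hubs connection string."""
    # Split "Key=Value;Key=Value" pairs; values may themselves contain '='.
    parts = dict(
        segment.split("=", 1)
        for segment in connection_string.split(";")
        if segment
    )
    # The Endpoint looks like sb://<namespace>.servicebus.windows.net/
    host = urlparse(parts["Endpoint"]).hostname
    return {
        "bootstrap.servers": f"{host}:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.username": "$ConnectionString",  # literal string, not a variable
        "sasl.password": connection_string,
    }


# Hypothetical connection string, for illustration only.
conn = (
    "Endpoint=sb://demo.servicebus.windows.net/;"
    "SharedAccessKeyName=RootManageSharedAccessKey;"
    "SharedAccessKey=abc="
)
config = kafka_config_for_event_hubs(conn)
print(config["bootstrap.servers"])
```

Producer and consumer code stays unchanged; only this configuration replaces the settings that previously pointed at a Kafka cluster.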
Showtime!
Unsupported Kafka Features • idempotent producers & transactions • message compression • size-based retention or log compaction • partition resize for existing topics • HTTP Kafka API support • Kafka Streams & KSQL
The devil is in the detail
THANK YOU Q&A? https://bit.ly/2W7KGn1
Over the last few years, Microsoft has repeatedly demonstrated the high priority it gives to open source technologies, especially across its Azure cloud computing services.
A brief introduction covers the key aspects of Azure Event Hubs and Azure Cosmos DB. Beyond standard operation, the session focuses on their alternative APIs, which are meant to let these two Azure services integrate seamlessly with, or serve as “drop-in” replacements for, very popular open source projects.
While Azure Event Hubs offers an Apache Kafka-compliant API, Azure Cosmos DB can also be used in a MongoDB-compatible variant. These possibilities are discussed using simple use cases and practical live demos.