A Cloud Native Foundation for Developers/Application Platforms Max Körbächer | Liquid Reply 12th October 2021
A presentation at Cloud Native DevX Day in October 2021 by Max Körbächer
INTRODUCTION
Max Körbächer | Co-Founder Liquid Reply
Sr. Manager Cloud Native Engineer, focusing on:
• Platform Engineering
• Application Delivery
• Cloud Native Advisory
Part of the Kubernetes Release Team & Release Engineering Team
Runs a cloud native newsletter: nativecloud.dev
What we are talking about
• Platforms: Platforms are (often) based on Kubernetes, where the applications run.
• Application Delivery: We implement tools that support development in delivering applications to the platform.
• Declarative: We use APIs and declarative manifests to provision infrastructure, platform and delivery.
• The Foundation: We build a trustworthy foundation for valuable solutions. It has to be reliable, secure and supportive of the applications' requirements.
• Automation: Everything is automated, tested and proven for reliability and zero downtime.
Platform or Internal Dev Platform „An Internal Developer Platform (IDP) is a layer on top of the tech and tooling an engineering team has in place already. It helps Ops teams structure their setup and enable developer self-service.“ Source: https://internaldeveloperplatform.org/
Orchestrate Chaos
When do IDPs appear?
• “Young” IT/data-driven companies – the Spotifies, Netflixes and Lyfts of the world
• Major companies with an IT history older than you…
The ideal IDP
Supporting Developers by:
• Extending local dev to a remote system for fast response
• Eliminating multiple entry points
• Giving everything ”on hand”
• 100% self-service
Platform Engineering fosters:
• Everything is the same API
• You get the same infrastructure
• Replaceability
• Advice for the good
Supporting Ops by:
• Provisioning similar environments
• Integrating with Monitoring, Logging & Alerting
• Applying self-healing capabilities
Supporting Business by:
• Cost transparency
• Reliability & availability
• Fostering speed and reducing idle time
• Portability
Supporting Security by:
• Providing environments which apply security rules
• Integrating by default with SIEM
• Proactively detecting threats
What you want to achieve
Or even better Container
Lessons Learned – the little things
• Whatever you do, it’s wrong: there is no one size fits all.
• Start early with security: Network Policies, Security Context and a hardened OS don’t hurt.
• The market is too fast: when it comes to the development of tools, methodologies or ideas, corporates don’t adapt fast enough.
• Simplify infrastructure and bet on discovery: you don’t need to train everyone as CKA/D/S.
Complexity of Platforms
[Diagram: platform components between the Application Layer and the Infrastructure Layer, grouped under DEVELOPMENT & DEPLOYMENT, K8S & APPLICATION BASE and OPERATION, with responsibility markers for Dev Team, K8s Expert, OSS Expert and CSP (legend: Need2know, No resp).]
Components shown: IDE, Source Code/Git, Build Pipeline, Artefact Build, CVE Scanner, Container Build, Container Scan, Deployment, Container Registry, Artefact Repository, SIEM, Custom App/Pod, Managed App/Pod, Commercial App/Pod, Proxy, Service Mesh, API Gateway, K8s (1), Container Engine, Bare Metal, CNI, VMs, K8s Base Components, Cloud, Secret Management, Chaos Engineering, Operations Tooling, Monitoring, Logging, Alerting, Anomaly Detection, Backend as Service, DB as Service, K8s (2).
(1) K8s per definition, (2) K8s by common understanding
Kubernetes is not a hypervisor
When you bet on Kubernetes, you have two choices:
1. You keep developing your apps the way you have done for the past years, maybe as microservices
2. You go all in on K8s and let it do the heavy lifting, e.g. encryption, DNS, operability, traffic management, security et al.
Way 1 means it will be hard, because the classic concepts don’t fit together with what K8s does for you.
Way 2 means it will be hard, because K8s is the critical success factor and must be done 100% correctly.
The cloud provider matters
On paper, AWS, Azure, GCP & others are “the same” in terms of services, pricing etc. Service capabilities, maturity, stability and reliability are worth a closer look. There is nothing worse than frustrated platform engineers struggling with the infrastructure. Also, how well the IaC tooling matches the provider's services can be an issue.
[Figure labels: AWS, Azure, X, GCP]
CI/CD: a myth/dilemma
Define clear checkpoints, such as a container registry and GitOps (e.g. Argo CD).
Dev tools now reach all the way to the remote clusters for faster feedback and direct interaction. Allow it!
However, for production workloads several rules must apply. Checkpoints that split CI/CD into CI and CD make it possible to enforce them:
• Container Registry/Chart Library – scan for CVEs, security flaws, bad configurations, BOMs
• GitOps (e.g.) – deploy based on triggers, metrics, validations
Questions per stage:
• Development: How and where do you do the development?
• Integration: Who compiles the code? Who makes the Dockerfile? Who ensures best practices & security configs are done?
• Deployment: Who defines how the application needs to be deployed? How do you deploy? How do you do updates?
[Diagram: Development (Local, Remote) → Integration (CR & Git) → Deployment (e.g. Argo; Int, Prod)]
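The registry checkpoint described above can be pictured as a small gate between CI and CD. The following is only a sketch under stated assumptions: the `Finding` shape, the severity scale and the function name `registry_checkpoint` are hypothetical and not taken from any particular scanner or registry.

```python
from dataclasses import dataclass

# Hypothetical severity ranking; real scanners define their own scales.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass
class Finding:
    """One scanner finding for an image (hypothetical shape)."""
    identifier: str  # placeholder ID, e.g. a CVE ID or a config rule name
    severity: str    # one of the keys in SEVERITY_RANK


def registry_checkpoint(findings, has_sbom, max_allowed="medium"):
    """Decide whether an image may pass from CI to CD.

    Returns (allowed, reasons). The image is blocked if it has no
    bill of materials or if any finding is above `max_allowed`.
    """
    reasons = []
    if not has_sbom:
        reasons.append("missing BOM")
    limit = SEVERITY_RANK[max_allowed]
    for finding in findings:
        if SEVERITY_RANK[finding.severity] > limit:
            reasons.append(f"{finding.identifier}: severity {finding.severity}")
    return (not reasons, reasons)


if __name__ == "__main__":
    # Placeholder findings, not real vulnerabilities.
    findings = [Finding("finding-a", "low"), Finding("finding-b", "critical")]
    allowed, reasons = registry_checkpoint(findings, has_sbom=True)
    print(allowed, reasons)
```

Keeping the decision in one place means the same rule applies no matter which dev tool or pipeline produced the image.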
GitOps, because there will be change
Developers write code and merge it via pull requests into a centralized or decentralized repository. GitOps distributes dependencies and responsibility between the development and deployment processes.
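At its core, a GitOps tool compares the state declared in Git with the state running in the cluster and works out what has to change. The sketch below illustrates only that comparison; the function name `plan_sync` and the dict-based state format are assumptions for illustration and do not reflect how Argo CD or any other tool is implemented.

```python
def plan_sync(desired, live):
    """Compare desired state (from Git) with live state (from the cluster).

    Both arguments map a resource name to its spec (any comparable value).
    Returns the actions needed to make `live` match `desired`.
    """
    return {
        "create": sorted(name for name in desired if name not in live),
        "update": sorted(
            name for name in desired
            if name in live and desired[name] != live[name]
        ),
        "delete": sorted(name for name in live if name not in desired),
    }


if __name__ == "__main__":
    # Hypothetical resources and image tags.
    desired = {"web": {"image": "web:1.1"}, "api": {"image": "api:2.0"}}
    live = {"web": {"image": "web:1.0"}, "old-job": {"image": "job:0.9"}}
    print(plan_sync(desired, live))
```

Because the plan is derived from Git alone, a merged pull request is the only way to change what gets deployed.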
Only one thing will be stable in the future
The cloud native environment changes so fast that you have to be prepared to adapt to it.
The delivery of code is the one thing that will keep happening (excl. the development of OpenAI), but what containers and OSs will look like will change. Think about WebAssembly. X?
You need unified observability
[Diagram labels: three sets of Operator, Server and Alert Manager; an Observer Cluster with Querier, Operator, Gateway and Compactor; Object Store]
Deliver to Devs the full view
[Diagram: three overlapping areas: Metrics (aggregatable), Tracing (request-scoped) and Logging (events). Overlaps: request-scoped metrics; request-scoped events; aggregatable events, e.g. rollups.]
Start with security, later it will be painful
Integrate continuous security into a DevOps environment through security checks at every stage of the development cycle in the CI/CD pipeline, without compromising speed or agility:
1. Implementing the necessary controls to ensure compliance and regulatory requirements are met
2. Integrating security-relevant tools into the CI/CD pipelines (e.g. security scanning via OWASP, CVE scanning of container images)
3. Role Based Access Control for users and technical users
4. Policy-based verification of rights (e.g. via Open Policy Agent/Gatekeeper, Kyverno)
5. Integrity chains from code to image
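To make policy-based verification more concrete, here is a minimal sketch of the kind of rule such a check enforces, written in plain Python rather than in the policy language of Open Policy Agent/Gatekeeper or Kyverno. The function name `check_pod_policy` and the chosen rules are assumptions for illustration; the `securityContext` field names are standard Kubernetes container fields.

```python
def check_pod_policy(pod):
    """Return a list of policy violations for a Pod manifest given as a dict.

    Only container-level securityContext is inspected in this sketch;
    pod-level settings and init containers are ignored for brevity.
    """
    violations = []
    containers = pod.get("spec", {}).get("containers", [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("privileged", False):
            violations.append(f"{name}: privileged containers are not allowed")
        if ctx.get("runAsNonRoot") is not True:
            violations.append(f"{name}: runAsNonRoot must be true")
        # Treat a missing allowPrivilegeEscalation as a violation.
        if ctx.get("allowPrivilegeEscalation", True):
            violations.append(f"{name}: allowPrivilegeEscalation must be false")
    return violations


if __name__ == "__main__":
    # Hypothetical manifest, already parsed from YAML into a dict.
    pod = {
        "spec": {
            "containers": [
                {"name": "app", "securityContext": {"privileged": True}},
            ]
        }
    }
    for violation in check_pod_policy(pod):
        print(violation)
```

In a real platform a rule like this would typically run as an admission policy, so non-compliant workloads are rejected before they reach the cluster.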
What doesn’t work well, yet
Topics: Multi Tenancy, Observability, Meta Data
• Too cost intensive
• Containers are not secure, so their isolation isn’t either
• Multi-tenant user management “is solved” but complex and hard
• Where to store all the data?
• e.g. no separation between namespaces
• Not designed for platforms
• Kubernetes is also not a DB, GitOps is nice but has downsides, and storing data somewhere else feels too custom
Photos by Frederick Marschall, Edgar Chaparro, Franki Chamaki on Unsplash
Some key metrics when building a platform
1. The number of scripts is an index of how inflexible you are
2. The more components have direct dependencies, the less you can use your platform in the future
3. The more complex your handbooks for developers are, the worse your platform is and the less likely it is to be used
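These three indicators can be tracked with very little tooling. The sketch below is one rough way to do so, under stated assumptions: the script suffixes, the dependency map format and the use of word count as a stand-in for handbook complexity are all illustrative choices, not definitions from the talk.

```python
from pathlib import PurePosixPath

# Assumed set of file suffixes that count as "scripts".
SCRIPT_SUFFIXES = {".sh", ".bash", ".ps1", ".bat"}


def count_scripts(paths):
    """Count files in `paths` (repo-relative path strings) that look like scripts."""
    return sum(
        1 for p in paths
        if PurePosixPath(p).suffix.lower() in SCRIPT_SUFFIXES
    )


def count_direct_dependencies(deps):
    """Count direct dependency edges.

    `deps` maps a component name to the set of components it depends on directly.
    """
    return sum(len(targets) for targets in deps.values())


def handbook_word_count(text):
    """Use word count as a crude proxy for handbook complexity."""
    return len(text.split())


if __name__ == "__main__":
    # Hypothetical repository contents and component graph.
    paths = ["deploy.sh", "README.md", "ci/build.PS1"]
    deps = {"api": {"db", "cache"}, "web": {"api"}, "db": set()}
    print(count_scripts(paths))
    print(count_direct_dependencies(deps))
    print(handbook_word_count("Request a namespace, then push your image."))
```

The absolute numbers matter less than their trend: if they keep rising between releases, the platform is getting harder to change and harder to use.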