TRACK: SITE RELIABILITY ENGINEERING
NOVEMBER 10, 2022
I am Cluster Admin, Destroyer of Everything You Hold Dear
Matt Williams, Evangelist @ Infra
Twitter: @technovangelist - Mastodon: @technovangelist@fosstodon.org

Least Privilege
According to the Cybersecurity & Infrastructure Security Agency (CISA):
"Only the minimum necessary rights should be assigned to a subject that requests access to a resource and should be in effect for the shortest duration necessary … careful delegation of access rights can limit attackers from damaging a system."

What happens when we skip Least Privilege?

Target - 2013
• HVAC on main network
• Useful for monitoring energy consumption at various stores
• Technician compromised
Attackers stole data for 40 million debit and credit cards

GitLab - 2017
• SRE responding to incident
• Intended to drop replica database
• Ran the command against the production database by mistake, and had excessive privileges to do it
GitLab went down and lost about 6 hours of data: 5k projects (issues, etc.), comments, users

Marriott - 2018
• User compromised
• Had admin access for everything
• Ran some database queries
Hundreds of millions of customer records exposed

Capital One - 2019
• Misconfigured firewall
• Generated temporary account credentials via SSRF exploit
• Had excessive privileges to sync S3 buckets
30GB of credit application data, affecting 100 million people in the US and 6 million in Canada

Verkada - 2021
• Credentials found for user
• Had excessive privileges
Attackers accessed 150k live camera feeds in schools, prisons, and hospitals

Reported by Rocky Chen? - 2021
• User accidentally deleted a namespace
• Recreated it, but did it wrong
• He thought he was in his test cluster
• Assumed AWS role made it difficult to troubleshoot

SW company with tools used by law enforcement and sec teams
• One of the devs ran a kubectl command
• Thought he was in test, was actually in prod
• Assumed roles, so they never figured out who did it
All access to Kubernetes was removed and they started over

What is the cost of breaches?
• Avg cost: $4.24 million in 2021
• Avg time to identify: 212 days
• Avg lifecycle: 286 days to identify and contain
• Likelihood that a breach is detected and prosecuted: 0.05%
• Personal data involved in 45%
https://www.securitymagazine.com/articles/93990-a-cluster-without-rbac-is-an-insecure-cluster

How is this relevant to this talk?
Let’s talk about Kubernetes & Cluster Admin
• Cluster Admin is wonderful because you can do anything you want!!
• Cluster Admin is scary because you can do anything you want!!
• Cluster Admin is the worst thing ever because you can do anything you want!!

So the answer is don’t give cluster admin to everyone, right??

But creating users in k8s is HARD
• Users don’t actually exist in Kubernetes
• Everything in k8s is a resource
• But there is no user resource
• It’s all about the certs in your kubeconfig

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: certgoeshere
    server: https://clusterendpoint.k8s.ondigitalocean.com
  name: mycluster
contexts:
- context:
    cluster: mycluster
    user: do-sfo3-matt-primary-admin
  name: mycontext
current-context: mycontext
kind: Config
preferences: {}
users:
- name: do-sfo3-matt-primary-admin
  user:
    token: dop_v1_dea9d7ff2b8eb092f53ffebogus31d2bd4602a62a19b5ac4
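
Reading the kubeconfig above means starting at current-context and following the context to its cluster and user entries. A minimal Python sketch of that lookup, assuming the file has already been parsed into a dict. The `resolve_context` helper is hypothetical (it is not part of kubectl or any client library), and the token is replaced with a placeholder.

```python
from typing import Any, Dict, List

# Parsed form of the kubeconfig on the slide (credential replaced by a placeholder).
KUBECONFIG: Dict[str, Any] = {
    "apiVersion": "v1",
    "kind": "Config",
    "current-context": "mycontext",
    "clusters": [
        {"name": "mycluster",
         "cluster": {"server": "https://clusterendpoint.k8s.ondigitalocean.com",
                     "certificate-authority-data": "certgoeshere"}},
    ],
    "contexts": [
        {"name": "mycontext",
         "context": {"cluster": "mycluster",
                     "user": "do-sfo3-matt-primary-admin"}},
    ],
    "users": [
        {"name": "do-sfo3-matt-primary-admin", "user": {"token": "bogus-token"}},
    ],
}


def _by_name(entries: List[Dict[str, Any]], name: str) -> Dict[str, Any]:
    """Return the entry in a kubeconfig list whose name matches."""
    for entry in entries:
        if entry["name"] == name:
            return entry
    raise KeyError(f"no entry named {name!r}")


def resolve_context(config: Dict[str, Any], context_name: str = "") -> Dict[str, Any]:
    """Hypothetical helper: follow a context to its cluster and user entries."""
    name = context_name or config["current-context"]
    context = _by_name(config["contexts"], name)["context"]
    cluster = _by_name(config["clusters"], context["cluster"])["cluster"]
    user = _by_name(config["users"], context["user"])["user"]
    return {
        "context": name,
        "server": cluster["server"],
        "user": context["user"],
        # Report which kinds of credential the entry carries, not their values.
        "credentials": sorted(user.keys()),
    }


if __name__ == "__main__":
    print(resolve_context(KUBECONFIG))
```

The user entry holds only a name and a credential (a token here, a client certificate and key in the certificate flow). Nothing in it corresponds to a user object stored in the cluster.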

What is a Role?
• Defines the level of access a ‘user’ has to the cluster
• Resource
• Verb

What is a Role?
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: marketing-dev
  labels:
    app.infrahq.com/include-role: "true"
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
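
One way to read the rules above: a request is allowed when some rule lists its API group, its resource, and its verb. The Python sketch below models only that matching. It is a simplification and not the API server's authorizer (it ignores resourceNames, subresources, and nonResourceURLs), and `rule_allows` and `is_allowed` are hypothetical names.

```python
from typing import Dict, List

Rule = Dict[str, List[str]]

# The rules from the marketing-dev ClusterRole on the slide.
MARKETING_DEV_RULES: List[Rule] = [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "watch", "list"]},
]

# A wildcard rule, similar in spirit to what the built-in cluster-admin role grants.
WILDCARD_RULES: List[Rule] = [
    {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
]


def _matches(allowed: List[str], value: str) -> bool:
    """A list matches if it names the value or contains the '*' wildcard."""
    return "*" in allowed or value in allowed


def rule_allows(rule: Rule, api_group: str, resource: str, verb: str) -> bool:
    """True when one rule covers the group, the resource, and the verb."""
    return (_matches(rule.get("apiGroups", []), api_group)
            and _matches(rule.get("resources", []), resource)
            and _matches(rule.get("verbs", []), verb))


def is_allowed(rules: List[Rule], api_group: str, resource: str, verb: str) -> bool:
    """RBAC is additive: one matching rule is enough, and there are no deny rules."""
    return any(rule_allows(rule, api_group, resource, verb) for rule in rules)


if __name__ == "__main__":
    print(is_allowed(MARKETING_DEV_RULES, "", "pods", "list"))    # True
    print(is_allowed(MARKETING_DEV_RULES, "", "pods", "delete"))  # False
    print(is_allowed(WILDCARD_RULES, "", "pods", "delete"))       # True
```

The last line is the point of the talk: with a wildcard rule, every request matches, including the destructive ones.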

How to create a User
• Create the user key (openssl genpkey…)
• Create the CSR (openssl req -new)
• Submit the CSR to the cluster (yaml)
• Approve the request (kubectl certificate approve…)

How to create a User
• Get the approved request (kubectl get csr…)
• Build the kubeconfig (kubectl --kubeconfig myuserconfig config set-credentials, kubectl --kubeconfig myuserconfig config set-context)
• Then distribute the file
https://infrahq.com/blog/how-to-create-users

How to create a User
• And then repeat often
• Ensure bad parties can’t access
• You can’t revoke a cert
• And redistribute
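
The "submit the CSR to the cluster (yaml)" step wraps the base64-encoded CSR in a CertificateSigningRequest object. A minimal Python sketch of building that manifest, assuming the CSR PEM was already produced with openssl. The function name, the user name, and the one-day default lifetime are illustrative choices, and `expirationSeconds` is only a request that the signer may shorten or ignore.

```python
import base64
import json
from typing import Any, Dict


def build_csr_manifest(username: str, csr_pem: bytes,
                       expiration_seconds: int = 86400) -> Dict[str, Any]:
    """Hypothetical helper: wrap a PEM CSR in a CertificateSigningRequest manifest."""
    # The API rejects requested lifetimes shorter than 10 minutes.
    if expiration_seconds < 600:
        raise ValueError("expirationSeconds must be at least 600")
    return {
        "apiVersion": "certificates.k8s.io/v1",
        "kind": "CertificateSigningRequest",
        "metadata": {"name": username},
        "spec": {
            # The API expects the PEM text base64-encoded.
            "request": base64.b64encode(csr_pem).decode("ascii"),
            # Signer for client certificates that authenticate to the API server.
            "signerName": "kubernetes.io/kube-apiserver-client",
            # Certs cannot be revoked, so a short requested lifetime limits exposure.
            "expirationSeconds": expiration_seconds,
            "usages": ["client auth"],
        },
    }


if __name__ == "__main__":
    # Placeholder bytes for illustration only; the API server would reject them.
    fake_csr = b"-----BEGIN CERTIFICATE REQUEST-----\nplaceholder\n-----END CERTIFICATE REQUEST-----\n"
    # JSON is valid YAML, so this output can be saved to a file and applied as a manifest.
    print(json.dumps(build_csr_manifest("myuser", fake_csr), indent=2))
```

The identity Kubernetes sees comes from the CSR subject (CN as the user name, O as the groups), not from metadata.name, so the name used in role bindings has to match what went into the openssl step.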

That’s a lot of steps. Can we automate it?

But… he doesn’t deal with file distribution

Is there something easier??

Infra
• Two deployment options
  • Self Hosted
  • Use Infra Cloud (coming soon)

DEMO

Summary
• Least Privilege is important
• but… complicated on Kubernetes
• RBAC
• You can automate…
• Infra makes it easier
