Lessons Learned When a Dev Does Opsy Things Cora Fedesna (she/her) @CorainChicago noti.st/corainchicago @DeliveryConf

What is ActiveCampaign? Automation. Email. CRM. Messaging. For over 80,000 growing businesses.

Software Engineer Developer Conference Organizer

Never been a: SRE, Ops, SysAdmin

Our story begins with the dealignment of the SREs (also known as when I joined the company)

SRE Alignment SRE Dealignment

We get to do opsy things

We have to do opsy things

Continuous Integration for your 1000% coverage

Continuous Integration Locally gradle/mvn clean test

Continuous Deployment Locally gradle/mvn clean build

Tasks
1. Write Terraform to make my CI/CD pipeline
2. Write Terraform to create the infrastructure for my application
3. Get someone to apply the Terraform
4. Learn how to write Terraform

terraform/ci

terraform/ci terraform/cd

Tutorial, Code, PR, PR Comments - SRE, PR update, PR Comments - SRE, PR update, 1:1 with SRE
Code, PR, PR Comments - SRE, PR Update, PR Comments - SRE, PR Update, 1:1, Apply - SRE

DEALIGNMENT: All teams are on a paging schedule.

GitHub GitLab

Kubernetes

.gitlab-ci.yml
include:
  - project: 'devops/name-of-project'
    ref: v1.1.1
    file: 'filename.yml'

.gitlab-ci.yml
cache:
  paths:
    - .gradle/wrapper
    - .gradle/caches

.gitlab-ci.yml
test:
  stage: test
  image: docker-image-for-jdk
  services:
    - docker:18-dind
  script:
    - export GRADLE_USER_HOME=`pwd`/.gradle
    - apk add docker
    - apk add bash
    - setup_docker
    - ./gradlew clean test

.gitlab-ci.yml
build:
  stage: build
  image: image-jdk-used
  services:
    - docker:18-dind
  script:
    - export GRADLE_USER_HOME=`pwd`/.gradle
    - apk add docker
    - apk add bash
    - setup_docker
    - ecr_login name-of-iam-role
    - ./gradlew build
    - docker pull $REPOSITORY:latest || true
    - docker build --cache-from $REPOSITORY:latest --tag $REPOSITORY:$TAG --tag $REPOSITORY:latest .
    - docker push $REPOSITORY:$TAG
    - docker push $REPOSITORY:latest

.gitlab-ci.yml
variables:
  VARIABLES_FROM_DEVOP_PROJECT_KEYS: value
  STAGING_ENABLED: "true"
  TEST_DISABLED: "true"
  REVIEW_DEPLOY_VALUES: >
    --env SPRING_PROFILES_ACTIVE=staging
    --env DEFAULT_LOG_LEVEL=debug
  STAGING_DEPLOY_VALUES: >
    --env SPRING_PROFILES_ACTIVE=staging
    --env DEFAULT_LOG_LEVEL=debug

app.yaml
name: service-name
namespace: service-name
image: service-name:latest
imagePullPolicy: IfNotPresent
http: true
gateway: service-name.test
iam: k8s/segue-e2e-role
livenessDelay: 360
livenessTimeout: 15
readinessDelay: 360
readinessTimeout: 50
memoryRequest: 2Gi
memoryLimit: 4Gi
vaultAddr: https://vault.staging.app-us1.com:8200
containerPort: 8080
secrets: ["vault/path/to/secret"]
healthPath: /actuator/health
prometheusPath: /actuator/prometheus
minScale: 1
maxScale: 1

app.yaml (same file as the previous slide)
Oops! Our devops team abstracted this work away from us.

Book Club Time!

Book Club Time! (actually reading chapters 1-7)

Kubernetes

With CI/CD now much easier, let’s go back to Terraform

Terraform

Playing with AWS == Scary

Dev View
resource "aws_db_instance" "name_mysql" {
  instance_class         = "${var.mysql_instance_class}"
  db_subnet_group_name   = "${aws_db_subnet_group.mysql.name}"
  engine                 = "mysql"
  allocated_storage      = "${var.allocated_storage}"
  engine_version         = "${var.mysql_version}"
  username               = "${var.mysql_user_name}"
  password               = "${var.mysql_password}"
  vpc_security_group_ids = [
    "${data.aws_security_group.rds.id}",
  ]
  skip_final_snapshot    = true
  identifier             = "${var.mysql_idenitifier}"
  tags = "${merge(
    local.common_tags,
    map(
      "Name", "${var.application_name}-db-instance"
    )
  )}"
  lifecycle {
    ignore_changes = [
      "password",
    ]
  }
}
Highlighted on the slide: Version, Username, Password

CODE Good Bad

Terraform Coding == Java Coding
Good
● Modular structure
● Variables used
● Documentation
● Examples available
Bad
● All in one
● Hard-coded values
● Doesn’t follow examples
● Doesn’t run
● Runs and does something inadvertent and horrible

Module
module "service_name" {
  source               = "../modules/filename"
  environment_type     = "staging"
  mysql_instance_class = "${var.mysql_instance_class}"
  mysql_version        = "${var.mysql_version}"
  mysql_password       = "${var.mysql_password}"
  mysql_user_name      = "${var.mysql_user_name}"
}

Resources
resource "aws_s3_bucket" "bucket_name" {
  bucket        = "${var.application_name}-controls"
  acl           = "private"
  force_destroy = true
  tags = "${merge(
    local.common_tags,
    map(
      "Name", "${var.application_name}-controls"
    )
  )}"
}
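The tags expression on this slide relies on Terraform's merge(), where a key defined in a later map takes precedence over the same key in an earlier one. A minimal Python sketch of that behavior follows. The contents of common_tags and the application name are hypothetical, since the talk does not show them:

```python
def merge(*maps):
    """Mimic Terraform's merge(): later maps win on duplicate keys."""
    merged = {}
    for current in maps:
        merged.update(current)
    return merged


# Hypothetical values; the talk does not show local.common_tags.
common_tags = {"Team": "example-team", "Environment": "staging", "Name": "default"}
application_name = "example-app"

# Equivalent of merge(local.common_tags, map("Name", "<app>-controls")).
tags = merge(common_tags, {"Name": f"{application_name}-controls"})
print(tags)
```

The shared tags stay in one place, and each resource overrides only the Name tag.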

Data Blocks
data "aws_vpc" "my_name_for_it" {
  tags {
    Name = "region I want to filter by"
  }
}

Data Blocks
data "aws_security_group" "rds" {
  name = "rds"
  tags {
    Zone = "zone name"
  }
}
resource "aws_db_instance" "name_mysql" {
  instance_class         = "${var.mysql_instance_class}"
  db_subnet_group_name   = "${aws_db_subnet_group.mysql.name}"
  engine                 = "mysql"
  allocated_storage      = "${var.allocated_storage}"
  engine_version         = "${var.mysql_version}"
  username               = "${var.mysql_user_name}"
  password               = "${var.mysql_password}"
  vpc_security_group_ids = [
    "${data.aws_security_group.rds.id}",
  ]
}

Variables
variable "environment_type" {
  default     = "staging"
  description = "notes about it, explanation"
}
variable "mysql_password" {
  description = "The password for the mysql database"
  type        = "string"
}

Data vs. Variable

terraform.tfvars
mysql_instance_class = "db.m5.12xlarge"
mysql_user_name      = "username"
kubernetes_vpc_name  = "vpc_name"

Terraform - Tricky Things
Destroys databases
Commits passwords
DEAR LORD - IT CAN FEEL SIDEWAYS

terraform fmt
Commit code
terraform init
terraform plan
terraform apply

Tutorial, Code, PR, PR Comments - SRE, PR update, PR Comments - SRE, PR update, 1:1 with SRE (briefer)
Code, PR, PR Comments - SRE, PR Update, PR Comments - SRE, PR Update, 1:1, Apply - SRE

DEALIGNMENT: My team’s SRE stops coming to stand-up.

Java, Spring, Docker

Java, Spring, Docker, Database, S3, API security, K8s Cluster

What’s still a struggle?

Things Devs Rarely Need to Think About: IAM Policies, CIDR Blocks, VPCs

Things Devs Rarely Deal With: SECURITY

IAM: Identity and Access Management
CIDR Blocks: Classless Inter-Domain Routing
VPC: Virtual Private Cloud

IAM - Identity and Access Management
Policy: an entity that, when attached to an identity or resource, defines their permissions.
AWS Permissions Thing

IAM - Identity and Access Management
data "aws_iam_policy_document" "policy_name" {
  statement {
    effect = "Allow"
    actions = [
      "s3:PutObject",
      "s3:GetObject"
    ]
    resources = [
      "${aws_s3_bucket.name.arn}/*",
    ]
  }
}

IAM - Identity and Access Management
data "aws_iam_policy_document" "policy_name" {
  statement {
    effect = "Allow"
    actions = [
      "s3:PutObject",
      "s3:GetObject",
    ]
    resources = [
      "${aws_s3_bucket.name.arn}/*",
    ]
  }
}
resource "aws_iam_policy" "name_task_policy" {
  name   = "name-task-profile"
  policy = "${data.aws_iam_policy_document.policy_name.json}"
}
resource "aws_iam_policy_attachment" "attachment_policy" {
  name       = "name-task-profile-attachment"
  roles      = ["${aws_iam_role.name_role.name}"]
  policy_arn = "${aws_iam_policy.name_task_policy.arn}"
}
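The policy document data block is rendered by Terraform into an IAM policy in JSON, and that JSON is what the aws_iam_policy resource receives. The Python sketch below shows roughly what it looks like. The bucket ARN is hypothetical (the talk uses the placeholder aws_s3_bucket.name), and the output Terraform actually generates may differ in details such as an added Sid field:

```python
import json


def s3_object_policy(bucket_arn):
    """Build an IAM policy allowing reads and writes of objects in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                # The trailing /* scopes the statement to objects in the bucket.
                "Resource": [f"{bucket_arn}/*"],
            }
        ],
    }


# Hypothetical ARN, used only for illustration.
policy = s3_object_policy("arn:aws:s3:::example-bucket")
print(json.dumps(policy, indent=2))
```

Reading the JSON this way makes the three moving parts visible: the effect, the allowed actions, and the resources they apply to.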

CIDR - Classless Inter-Domain Routing 192.168.100.14/26
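To unpack what the /26 suffix means for the address on this slide, here is a small sketch using Python's standard ipaddress module. The helper name describe_cidr is an illustration, not something from the talk:

```python
import ipaddress


def describe_cidr(cidr):
    """Return the network details implied by an address in CIDR notation."""
    network = ipaddress.ip_interface(cidr).network
    return {
        "network": str(network),
        "netmask": str(network.netmask),
        "network_address": str(network.network_address),
        "broadcast_address": str(network.broadcast_address),
        "address_count": network.num_addresses,
    }


info = describe_cidr("192.168.100.14/26")
for key, value in info.items():
    print(f"{key}: {value}")
```

A /26 fixes the first 26 bits, leaving 6 bits for hosts, so the block holds 2^6 = 64 addresses: 192.168.100.0 through 192.168.100.63.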

VPC - Virtual Private Cloud: a logically isolated section of any cloud, not only AWS (has a CIDR block)

data "aws_vpc" "kubernetes" {
  tags {
    Name = "${var.kubernetes_vpc_name}"
  }
}
cidr_blocks = ["${data.aws_vpc.kubernetes.cidr_block}"]

IAM: Identity and Access Management
CIDR Blocks: Classless Inter-Domain Routing
VPC: Virtual Private Cloud

TUNING

Generate Load

Volume hitting service? Database Connections - good? Services falling over? Error logs?

Volume hitting service? Will my service cause an alert or page? Database Connections - good? Services falling over? Error logs?

Volume hitting service? Services falling over? Can I prevent it? Database Connections - good? Error logs?

AWS Console, Datadog Dashboards, Kibana Logs, Kubectl Commands

Where’d my db go?

Datadog

Kibana

Logs
LOGGER.error("Here's the exact information you will need {}", request);

Kubectl Commands
● kubectl staging get pods -n namespace
● kubectl staging describe pods -n namespace
● kubectl staging logs pod_id -n namespace -c container -f

Modify

Repeat

Tutorial, Code, PR, PR Comments - SRE Dev, PR update, PR Comments - SRE, PR update, 1:1 with SRE (briefer)
Code, PR, PR Comments - SRE, PR Update, PR Comments - SRE, PR Update, 1:1, Apply - SRE, SRE Ticket to apply

Our story ends with… (Also known as now)

Thank you! (We’re hiring!)

Thank you! Cora Fedesna (she/her) @CorainChicago noti.st/corainchicago @DeliveryConf