Moving to Kubernetes on Amazon Web Services at Scale

MAKING VIDEO ADS PERSONAL

THE WAY DEVELOPERS PUT VIDEO ADS IN APPS

Moving to Kube on AWS at Scale

November 2015

ABOUT ME

Daniel Nelson

Staff Ops Engineer at Vungle

@packetcollision

WHAT WE HAD BEFORE

• Ubuntu 12.04

• Chef

• ASGs and manually started instances

• One app per server

• Uneven resource utilization

• Dev very different from prod

THE STAGE

• Docker dev environment

• Prod still piles of Chef

• Moving to a service-oriented architecture (SOA)

• More services

• More servers

THE DREAM

• Prod uses the same Docker images as dev/QA

• Easily add/remove machines

• Minimal maintenance

• Self-Service

THE OS

• Minimal

• Simple updates

• Community

OPTIONS

• CoreOS

• RancherOS

THE CONTENDERS

• Mesos

• Fleet

• Docker Swarm

• Amazon ECS

• Kubernetes

CONSTRUCTION CHALLENGES

• Can’t use kube-up.sh - hardcoded VPC name

• Can’t use ELB - hardcoded VPC name

• CloudFormation templates get big

• Tons of little things

REPLACING ELB

• ASG of router/load-balancer machines

• Not in Kube cluster

• Are in same SDN as Kube cluster

• Running Vulcand (or Waco Kid)

• Romulus for auto-configuration (use dev branch)

• Nginx/HAProxy Ingress balancer

REPLACING LEGACY SYSTEMS

• Kafka is our backbone - MirrorMaker mirrors topics from the new cluster to the old one

• Move consumers only once producers are on Kube

• GTM or Route53 to slowly move traffic over

• Help project teams move to Docker prod deploys and pod/service configs
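
One way to picture the "slowly move traffic over" step with Route 53 is a series of weighted record updates. The sketch below is only an illustration, not the setup described in the talk: the record name, both targets, the set identifiers and the number of steps are hypothetical placeholders. It builds the request payloads without sending them.

```python
# Hypothetical names: the record and both targets are placeholders.
RECORD_NAME = "api.example.com."
OLD_TARGET = "legacy-lb.example.com"
NEW_TARGET = "kube-router.example.com"


def weight_schedule(steps):
    """Return (old_weight, new_weight) pairs moving linearly from 100/0 to 0/100."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    schedule = []
    for i in range(steps + 1):
        new_weight = round(100 * i / steps)
        schedule.append((100 - new_weight, new_weight))
    return schedule


def change_batch(old_weight, new_weight, ttl=60):
    """Build a Route 53 ChangeBatch that upserts one weighted CNAME per environment."""

    def record(set_id, target, weight):
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": RECORD_NAME,
                "Type": "CNAME",
                "SetIdentifier": set_id,  # distinguishes the two weighted records
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Changes": [
            record("legacy", OLD_TARGET, old_weight),
            record("kube", NEW_TARGET, new_weight),
        ]
    }


if __name__ == "__main__":
    for old, new in weight_schedule(4):
        print(f"legacy={old} kube={new}")
```

Each dictionary returned by `change_batch` has the shape that the Route 53 ChangeResourceRecordSets call expects as its change batch (for example via boto3's `change_resource_record_sets`). A short TTL keeps each step quick to roll back.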

DESIGN FOR SCALE AND RELIABILITY

• Multi-region (can also be used to keep clusters smaller within one region)

• Make adjusting scale easy

• Assume machines and zones will go down
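
"Assume machines and zones will go down" can be turned into simple capacity arithmetic: size each zone so that the surviving zones still carry the full load. The helper below is a minimal sketch under that assumption. The function name and the example numbers are illustrative and do not come from the talk.

```python
import math


def nodes_per_zone(required_nodes, zones, zones_lost=1):
    """Nodes to run in each zone so that losing `zones_lost` zones still
    leaves at least `required_nodes` nodes running."""
    surviving = zones - zones_lost
    if surviving < 1:
        raise ValueError("at least one zone must survive")
    return math.ceil(required_nodes / surviving)


if __name__ == "__main__":
    per_zone = nodes_per_zone(required_nodes=12, zones=3)
    print(per_zone, "nodes per zone,", per_zone * 3, "in total")
```

With 12 nodes of required capacity over three zones, each zone runs 6 nodes, so the cluster holds 18 nodes in normal operation and still has 12 after one zone fails.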

SAVE MONEY

• Spot pricing is awesome

• You can bid on a bunch of different instance types

• Kubernetes makes instance eviction less painful
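
Bidding on several instance types can be sketched as a small planning step: rank the types by price per vCPU, then cap how many instances come from any one type so that a single price spike cannot evict the whole fleet. This is an illustrative sketch only. The prices are made-up numbers, and `SpotOffer`, `plan_capacity` and `max_per_type` are hypothetical names, not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotOffer:
    instance_type: str
    vcpus: int
    price_per_hour: float  # made-up example price, not a real quote


def rank_by_cpu_price(offers):
    """Sort offers from cheapest to most expensive per vCPU."""
    return sorted(offers, key=lambda o: o.price_per_hour / o.vcpus)


def plan_capacity(offers, vcpus_needed, max_per_type):
    """Fill the vCPU target cheapest-first, using at most
    `max_per_type` instances of any single instance type."""
    plan = {}
    remaining = vcpus_needed
    for offer in rank_by_cpu_price(offers):
        if remaining <= 0:
            break
        wanted = -(-remaining // offer.vcpus)  # ceiling division
        count = min(max_per_type, wanted)
        plan[offer.instance_type] = count
        remaining -= count * offer.vcpus
    if remaining > 0:
        raise ValueError("not enough capacity within max_per_type")
    return plan


if __name__ == "__main__":
    offers = [
        SpotOffer("c4.xlarge", 4, 0.06),
        SpotOffer("m4.xlarge", 4, 0.05),
        SpotOffer("r3.xlarge", 4, 0.08),
    ]
    print(plan_capacity(offers, vcpus_needed=20, max_per_type=3))
```

The cap trades a slightly higher average price for diversity across spot markets, which fits the point that Kubernetes can reschedule pods when some instances are evicted.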

COMMUNICATING BETWEEN CLUSTERS

• Avoid (synchronous) communication

• Kafka

• VPN

• Internal load balancer

DEPLOYMENT

• Have a standard

• Empower the project teams

• Automate, automate, automate

That’s it

WE’RE HIRING

Senior Software Engineer, Data

Senior Software Engineer, Machine Learning Infrastructure

Senior Software Engineer, Machine Learning

Data Scientist Engineer

Thank you!

@packetcollision