Kubernetes introduction with a running example
Dongwon Kim, PhD
SK Telecom
Why do we use Kubernetes?
Container-based virtualization + container orchestration
Satisfying common needs in production:
• co-locating helper processes
• mounting storage systems
• distributing secrets
• application health checking
• replicating application instances
• horizontal auto-scaling
• naming and discovery
• load balancing
• rolling updates
• resource monitoring
• log access and ingestion
• ...
from the official site: https://kubernetes.io/docs/whatisk8s/
Pod – the basic unit of Kubernetes
• Components
  • a group of containers (docker, rkt (pronounced “rock-it”) from CoreOS, etc.)
  • a group of shared storage called volumes
    • ephemeral volumes
    • persistent volumes: host local directories, nfs, iscsi, flocker, Google Compute Engine (GCE) Persistent Disk, Amazon Web Services (AWS) Elastic Block Store (EBS)
• Purpose
  • model an application-specific logical host/VM
• Characteristics
  • containers in a pod share IP addresses/ports
  • containers in a pod can communicate via IPC
[Diagram: a pod at address 10.244.1.10 with three containers (ports 1234, 3456, 5678); the containers claim their volumes (one ephemeral, one persistent), communicate via IPC, and reach each other via localhost:3456]
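The components above can be sketched as a minimal pod manifest (the pod name, image, and mount path are illustrative assumptions, not from the deck):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: zk-test            # illustrative name
spec:
  containers:
  - name: zookeeper
    image: zookeeper:3.4.9 # image name is an assumption
    ports:
    - containerPort: 2181
    volumeMounts:
    - name: data           # the container claims its volume here
      mountPath: /var/lib/zookeeper
  volumes:
  - name: data
    emptyDir: {}           # ephemeral volume; lives and dies with the pod
```

A persistent volume would replace `emptyDir` with one of the backends listed above (nfs, iscsi, GCE PD, AWS EBS, ...).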
A few things to consider when running Zookeeper on Kubernetes
• How to launch Zookeeper servers using pods?
• How to give IDs to pods?
• What is the domain name of each pod?
• How to make sure a certain number of pods is running during maintenance?
[Diagram: Zookeeper servers (zk) – three pods zk-1, zk-2, zk-3, each running a Zookeeper server (zk-1 is the leader) configured with
  myid: 1 / 2 / 3
  server.1=zk-1:2888:3888
  server.2=zk-2:2888:3888
  server.3=zk-3:2888:3888
Kafka servers (kk) – three pods kk-1, kk-2, kk-3, each running a Kafka server configured with
  broker.id: 1 / 2 / 3
  zookeeper.connect=zk-1.zk:2181,zk-2.zk:2181,zk-3.zk:2181
A majority quorum of Zookeeper servers must be present.]
StatefulSet – a way of launching ordered replicas of a container
[Diagram: three pods zk-0, zk-1, zk-2, each with its containers and volumes]
The StatefulSet creates 3 pods with ordinals suffixed to the pod names, and guarantees the following:
[Diagram: pod-0, pod-1, pod-2] pods are created sequentially
[Diagram: pod-2, pod-1, pod-0] pods are deleted in reverse order
[Diagram: pod-3 being added after pod-0, pod-1, pod-2] before a scaling operation is applied, all of its predecessors must be running
[Diagram: pod-0, pod-1, pod-2] before a pod is terminated, all of its successors are shut down
Create 3 replicas of servers using the following templates:
• each pod is created and scheduled using the pod template
• each pod lays its claim to storage using the volume claim template
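The two templates can be sketched in a StatefulSet manifest (names, image, and storage size are illustrative assumptions; the apiVersion depends on the cluster version, `apps/v1` in current clusters):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-headless    # governing headless service (assumed name)
  replicas: 3                 # creates zk-0, zk-1, zk-2 sequentially
  selector:
    matchLabels:
      app: zk-headless
  template:                   # pod template: each pod is created/scheduled from this
    metadata:
      labels:
        app: zk-headless
    spec:
      containers:
      - name: zookeeper
        image: zookeeper:3.4.9   # image name is an assumption
        ports:
        - containerPort: 2181
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
  volumeClaimTemplates:       # each pod lays its claim to storage from this
  - metadata:
      name: datadir
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi       # illustrative size
```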
Service – to represent a group of pods with a cluster IP
Let's say that we have 3 replicas of a pod for load balancing.
[Diagram: pods server-0, server-1, server-2, each with containers and volumes, behind a service with cluster IP 10.111.67.108]
Q) How to achieve the following?
• Users must be unaware of the replicas
• Traffic is distributed over the replicas
A) Define a service with a cluster IP. Then Kubernetes does round-robin forwarding.
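Such a service can be sketched as follows (the service name, label, and ports are illustrative assumptions; the cluster IP is normally allocated by Kubernetes rather than written by hand):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: server
spec:
  selector:
    app: server      # matches the label on server-0..server-2 (assumed label)
  ports:
  - port: 80         # port exposed on the cluster IP
    targetPort: 8080 # port the containers listen on
```

Traffic sent to the cluster IP on port 80 is forwarded across the matching pods.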
Headless service – a service without a common IP
• Zookeeper clients (e.g. Kafka) need to specify the address of each Zookeeper server
• Kubernetes depends on its DNS service for headless services
  • each pod is assigned a domain name by Kubernetes
  • each pod is accessed directly via its domain name (not through a cluster IP)
• Fully Qualified Domain Name (FQDN) format: $pod.$service.$namespace.svc.cluster.local
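A headless service is declared by setting `clusterIP: None` (service name and ports below are illustrative assumptions matching the Zookeeper example):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: zk-headless
spec:
  clusterIP: None        # headless: DNS resolves to the pod IPs directly
  selector:
    app: zk-headless     # assumed pod label
  ports:
  - name: client
    port: 2181
  - name: server
    port: 2888
  - name: leader-election
    port: 3888
```

With this in place, pod zk-0 is reachable as zk-0.zk-headless.default.svc.cluster.local.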
[Diagram (same deployment as before): Zookeeper pods zk-1, zk-2, zk-3 and Kafka pods kk-1, kk-2, kk-3, with the Kafka brokers reaching Zookeeper through the headless-service DNS names zk-1.zk:2181, zk-2.zk:2181, zk-3.zk:2181]
Namespace in Kubernetes
[Diagram: pods zk-0, zk-1, zk-2 behind the zk-headless service and pods kk-0, kk-1, kk-2, kk-3 behind the kafka service, both in the default namespace; an “alien” namespace accesses them by FQDN]
Three pods are defined within the zk-headless service, and they are given DNS entries of the following format:
  pod.service.namespace.svc.cluster.local
• zk-1:2181 (within the service)
• zk-1.zk-headless:2181 (within the same namespace)
• zk-1.zk-headless.default.svc.cluster.local:2181 (from another namespace)
The default namespace is used, as there is no namespace declaration.
Pod anti-affinity
“This pod should not run in X in which one or more pods that satisfy Y are running.”
• X belongs to a topology domain:
  • node (topologyKey: kubernetes.io/hostname in this example)
  • rack
  • cloud provider zone
  • cloud provider region
• Y is a label selector
  • here it selects all pods belonging to a service named zk-headless
(The manifest also contains a debugging hook: a pod pauses until it is set to true.)
kube-scheduler is about to schedule pod2 labeled app=zk-headless, but wants to avoid node3 because pod1 labeled app=zk-headless is already running there. Kubernetes provides pod anti-affinity for this case.
[Diagram: node1, node2, node3; pod1 (app=zk-headless) on node3; kube-scheduler placing pod2 (app=zk-headless) on another node]
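The rule above can be sketched in the pod spec (the label key/value are taken from the example; note that in the Kubernetes version contemporary with this deck, anti-affinity was expressed through an alpha annotation rather than the `affinity` field shown here):

```yaml
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:               # Y: which pods to stay away from
          matchExpressions:
          - key: app
            operator: In
            values: ["zk-headless"]
        topologyKey: kubernetes.io/hostname   # X: one such pod per node
```

With this rule, kube-scheduler refuses to place a second app=zk-headless pod on a node that already runs one.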
Files in the container image
• Dockerfile
  1. download the latest Zookeeper tarball
  2. extract and place the content under /opt/zookeeper
  3. ln -s /opt/zookeeper/* /usr/bin
• zkGenConfig.sh
  1. create zoo.cfg
  2. configure log-related properties
  3. create data directories
  4. set myid extracted from the domain name
  • e.g. zk-0.zk-headless.default.svc.cluster.local → ordinal 0, myid 0+1 = 1
• zkOk.sh (it's from Zookeeper)
  • check readiness and liveness of a pod
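The myid step of zkGenConfig.sh can be sketched in shell (a minimal sketch, assuming pod hostnames follow the StatefulSet pattern `<name>-<ordinal>`; inside the pod, `HOST` would come from `hostname -s`):

```shell
HOST="zk-0"            # in a real pod: HOST=$(hostname -s)
ORD="${HOST##*-}"      # strip everything up to the last dash -> ordinal
MYID=$((ORD + 1))      # Zookeeper ids start at 1, StatefulSet ordinals at 0
echo "$MYID"           # would be written to the myid file in the data dir
```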
Environment variables for container processes in a pod
env defines environment variables to be used in container processes.
Two ways to assign values:
1. value: a constant value
2. valueFrom: a value from a ConfigMap
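Both forms can be sketched in a container spec (variable names, the ConfigMap name, and its key are illustrative assumptions):

```yaml
env:
- name: ZK_CLIENT_PORT
  value: "2181"             # 1. constant value
- name: ZK_TICK_TIME
  valueFrom:
    configMapKeyRef:        # 2. value from a ConfigMap
      name: zk-config       # assumed ConfigMap name
      key: tick             # assumed key
```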
Readiness & liveness checks for containers
Kubernetes provides a means of checking readiness & liveness.
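Both checks can be wired to the zkOk.sh script from the image (timing values below are illustrative assumptions):

```yaml
readinessProbe:             # is the pod ready to receive traffic?
  exec:
    command: ["zkOk.sh"]
  initialDelaySeconds: 15
  timeoutSeconds: 5
livenessProbe:              # should the container be restarted?
  exec:
    command: ["zkOk.sh"]
  initialDelaySeconds: 15
  timeoutSeconds: 5
```

A failed readiness probe removes the pod from service endpoints; a failed liveness probe restarts the container.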
How to guarantee a certain number of running pods during maintenance
• Users can define a PodDisruptionBudget with minAvailable
  • e.g. at least two pods from zk must be available at any time
• Below is an example illustrating PodDisruptionBudget together with StatefulSet and PodAntiAffinity
[Diagram: zk-0 on node1, zk-1 on node2, zk-2 on node3, one Zookeeper pod per node]
Drain node1: the operation is permitted because allowed-disruptions=1.
Drain node2: 3 replicas have to be running due to the StatefulSet, so Kubernetes tries scheduling zk-0 on the other nodes. Oops! zk-0 cannot be scheduled on node2 or node3 due to PodAntiAffinity! The operation is not permitted because allowed-disruptions=0 (note that minAvailable=2). Please wait until node1 is up and zk-0 is rescheduled.
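The budget in this scenario can be sketched as follows (the object name and label are illustrative assumptions; the apiVersion depends on the cluster version, `policy/v1` in current clusters):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: zk-budget
spec:
  selector:
    matchLabels:
      app: zk-headless   # assumed pod label
  minAvailable: 2        # at least two zk pods must be up at any time
```

With 3 replicas and minAvailable=2, only one voluntary disruption (e.g. a node drain) is allowed at a time.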
Scaling issue with Zookeeper
• Dynamically changing the membership of a replicated distributed system, while preserving data consistency and system availability, is challenging
  • from “Dynamic Reconfiguration of Primary/Backup Clusters” in USENIX ATC 2012
• Prior to Zookeeper 3.5.0 (we use 3.4.9, the latest stable version at this point)
  • configuration parameters are loaded during boot
  • configuration parameters are immutable at runtime
  • operators have to carefully restart all daemons
• Starting with Zookeeper 3.5.0
  • full support for automated configuration changes, without service interruption and while preserving data consistency
  • covers the set of Zookeeper servers, roles of servers, all ports, and even quorum systems
* https://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html
Scaling up/down a StatefulSet
StatefulSet itself has means of scaling up/down:
• kubectl scale statefulset $statefulSetInstanceName --replicas=5
• kubectl patch statefulset $statefulSetInstanceName -p '{"spec":{"replicas":3}}'
Topics not covered here
• Detailed architecture of Kubernetes
  • https://github.com/kubernetes/community/blob/master/contributors/design-proposals/architecture.md
• ReplicaSet and Deployment (other than StatefulSet)
  • https://kubernetes.io/docs/user-guide/replicasets/
  • https://kubernetes.io/docs/user-guide/deployments/
• Persistent Volume and Persistent Volume Claim
  • https://kubernetes.io/docs/user-guide/volumes/
• Kubernetes network (Proxy, DNS, etc.)
  • https://kubernetes.io/docs/admin/networking/
  • https://kubernetes.io/docs/admin/dns/
The end