Kubernetes Orchestration

Foreword

Docker solved the "packaging" problem, and Kubernetes solves the "management" problem. When you have dozens or hundreds of containers that need deployment, scaling, and fault recovery, manual management is impractical. Kubernetes (K8s) is the "operating system" for containers, automating the deployment, scaling, and operations of containerized applications.

What will you learn from this article?

After completing this chapter, you will gain:

  • Architecture understanding: Master the composition of the K8s control plane and worker nodes
  • Core resources: Become familiar with core concepts like Pod, Deployment, and Service
  • Declarative management: Understand the philosophy of "declare desired state, system automatically converges"
  • Operations capabilities: Learn about rolling updates, auto-scaling, health checks, and other mechanisms
  • Hands-on introduction: Deploy a complete application using kubectl and YAML

| Chapter | Content | Core Concepts |
| --- | --- | --- |
| Chapter 1 | Why K8s is Needed | Challenges of container orchestration |
| Chapter 2 | K8s Architecture | Control plane, worker nodes, etcd |
| Chapter 3 | Core Resources | Pod, Deployment, Service, Ingress |
| Chapter 4 | Declarative Management | YAML, kubectl, control loops |
| Chapter 5 | Operations Practices | Rolling updates, HPA, health checks |

1. Why Kubernetes?

Docker makes packaging and running individual containers simple, but when you face the following scenarios, manual management becomes inadequate:

| Challenge | Description | K8s Solution |
| --- | --- | --- |
| Multi-instance deployment | A service needs 10 replicas running | Deployment automatically manages replica counts |
| Fault recovery | A container crashes and needs automatic restart | Controllers automatically detect and recreate Pods |
| Service discovery | Container IPs change; how to find each other? | Service provides stable DNS and IP |
| Rolling updates | Can't stop service when updating versions | Gradually replace old Pods with zero downtime |
| Elastic scaling | Auto-scale during traffic peaks | HPA automatically adjusts replica count based on CPU/memory |
| Resource scheduling | Place containers on the most suitable machines | Scheduler intelligently schedules |

K8s Core Philosophy: Declarative

You don't tell K8s "start 3 containers" (imperative). Instead, you tell it "I want 3 replicas running" (declarative). K8s continuously monitors to ensure the actual state matches your declared desired state. If a Pod crashes, it automatically creates a new one to replace it.
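
As a minimal sketch of what such a declaration might look like, the manifest below asks for three replicas of one container. It reuses the placeholder name my-app, image my-app:1.0, and port 3000 that this article uses in its other examples; the app: my-app label is an additional assumption made for illustration.

```yaml
# Illustrative Deployment; name, label, and image are placeholder assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3                # desired state: three Pods running at all times
  selector:
    matchLabels:
      app: my-app            # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:1.0
          ports:
            - containerPort: 3000
```

Nothing in the file says how to start or restart containers; it only states the target. If one of the three Pods disappears, the controllers behind the Deployment notice the shortfall and create a replacement.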


2. Kubernetes Architecture

A K8s cluster consists of a Control Plane and Worker Nodes.

  • Control Plane: API Server, etcd, Scheduler, Controller Manager
  • Worker Node (× N): kubelet, kube-proxy, container runtime

API Server

The front door of Kubernetes. All operations from kubectl, dashboards, and internal components go through the API Server. It handles authentication, authorization, admission control, and acts as the single entry point for the cluster.

Analogy: A company reception desk where every visitor and delivery is checked in.

Complete Path of a Request

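User Request → Ingress Controller → Service → kube-proxy → Pod (Container)

The Service step is backed by an endpoint list: the set of Pod addresses that currently match the Service's selector. kube-proxy uses this list to forward each request to one of those Pods.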
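
The path above could be wired up roughly as follows. This is a sketch under assumptions: the Pods carry an app: my-app label and listen on port 3000, the hostname my-app.example.com is a placeholder, and an Ingress controller is already installed in the cluster.

```yaml
# Illustrative Service: a stable name and virtual IP in front of matching Pods.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app              # assumed Pod label; selects the backend Pods
  ports:
    - port: 80               # port exposed by the Service
      targetPort: 3000       # assumed container port on each Pod
---
# Illustrative Ingress: routes external HTTP traffic to the Service above.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  rules:
    - host: my-app.example.com   # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
```

With cluster DNS running, other Pods in the same namespace can reach the Service by the name my-app; only traffic from outside the cluster needs the Ingress.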

3. Core Resource Objects

K8s describes the cluster's desired state through various "resource objects."

Pod: smallest scheduling unit

A Pod is the smallest deployable unit in K8s. It contains one or more tightly related containers. Containers inside the same Pod share networking and storage, so they can communicate through localhost.

YAML example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:1.0
      ports:
        - containerPort: 3000
```

Tip: Production workloads rarely create Pods directly; they are usually managed by Deployments.

Resource Object Categories

| Category | Resources | Purpose |
| --- | --- | --- |
| Workloads | Pod, Deployment, StatefulSet, DaemonSet, Job | Run applications |
| Networking | Service, Ingress, NetworkPolicy | Service discovery and traffic management |
| Configuration | ConfigMap, Secret | Configuration and sensitive data management |
| Storage | PersistentVolume, PersistentVolumeClaim | Persistent storage |
| Scheduling | Node, Namespace, ResourceQuota | Resource isolation and limits |
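
To illustrate the Configuration category, here is a minimal sketch of a ConfigMap and a Secret injected into a container as environment variables. Every name and value below (my-app-config, my-app-secret, LOG_LEVEL, DB_PASSWORD) is hypothetical.

```yaml
# Illustrative only: all names and values are placeholder assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config
data:
  LOG_LEVEL: "info"          # non-sensitive setting
---
apiVersion: v1
kind: Secret
metadata:
  name: my-app-secret
type: Opaque
stringData:
  DB_PASSWORD: "change-me"   # sensitive value, kept out of the image
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:1.0
      envFrom:               # expose every key as an environment variable
        - configMapRef:
            name: my-app-config
        - secretRef:
            name: my-app-secret
```

The same image can then run in different environments by swapping the ConfigMap and Secret, without being rebuilt.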

4. Declarative Management and kubectl

Reconciliation Loop

K8s's core working mechanism is the reconciliation loop:

Observe → Diff → Act → Observe ...

  • Observe: read the actual state
  • Diff: compare it with the desired state
  • Act: execute corrective actions

You declare replicas: 3. The controller discovers only 2 Pods running and creates 1 new one. Controllers run this loop continuously, reacting to change events from the API Server, so the system keeps converging toward the desired state.
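
One way to see the loop's inputs is to look at a Deployment object as stored by the API Server: spec holds the desired state, and status holds what the controllers last observed. The abridged fragment below is illustrative; the name and numbers are made up to match the 3-versus-2 scenario just described.

```yaml
# Abridged, illustrative view of a Deployment object; values are hypothetical.
# This is what you read back from the cluster, not a manifest to apply.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3              # desired state, written by you
status:
  replicas: 2              # Pods that currently exist for this Deployment
  readyReplicas: 2         # Pods passing their readiness checks
  unavailableReplicas: 1   # shortfall the controllers still have to close
```

You only ever edit spec. The status section is written by the system, and the gap between the two is what the reconciliation loop works to remove.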

Common kubectl Commands

| Command | Purpose | Example |
| --- | --- | --- |
| kubectl apply -f | Apply YAML configuration | kubectl apply -f deployment.yaml |
| kubectl get | List resources | kubectl get pods -o wide |
| kubectl describe | View resource details | kubectl describe pod my-app-xxx |
| kubectl logs | View Pod logs | kubectl logs -f my-app-xxx |
| kubectl exec | Enter Pod terminal | kubectl exec -it my-app-xxx -- sh |
| kubectl delete | Delete resources | kubectl delete -f deployment.yaml |
| kubectl scale | Manual scaling | kubectl scale deploy my-app --replicas=5 |

apply vs create

kubectl create is imperative — "create this resource," and will error if it already exists. kubectl apply is declarative — "ensure the resource is in this state," creating if it doesn't exist or updating if it does. In production, you should always use apply.


5. Operations Practices

5.1 Rolling Updates and Rollbacks

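A Deployment uses a rolling update strategy by default: it gradually creates new-version Pods while terminating old-version Pods. The settings below tighten the defaults (25% for both maxSurge and maxUnavailable) so that serving capacity never drops during the rollout.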

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # At most create 1 extra Pod
      maxUnavailable: 0   # No Pods allowed to be unavailable
```

| Operation | Command |
| --- | --- |
| Update image | kubectl set image deploy/my-app app=my-app:2.0 |
| View update status | kubectl rollout status deploy/my-app |
| View revision history | kubectl rollout history deploy/my-app |
| Rollback to previous version | kubectl rollout undo deploy/my-app |

5.2 Auto Scaling (HPA)

HPA (Horizontal Pod Autoscaler) automatically adjusts the number of Pod replicas based on CPU, memory, or custom metrics.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
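
A Utilization target is measured as a percentage of each container's CPU request, so the target Deployment needs resources.requests set, and the cluster needs a metrics source such as metrics-server. A minimal sketch of the relevant part of the Deployment's Pod template follows; the request and limit values are assumptions chosen for illustration.

```yaml
# Fragment of the my-app Deployment's Pod template (spec.template.spec);
# the CPU figures are illustrative assumptions.
containers:
  - name: app
    image: my-app:1.0
    resources:
      requests:
        cpu: 250m          # a 70% target means about 175m average usage per Pod
      limits:
        cpu: 500m
```

The HPA computes the new replica count as ceil(currentReplicas × currentUtilization / targetUtilization). For example, 4 replicas averaging 90% utilization against a 70% target give ceil(4 × 90 / 70) = ceil(5.14) = 6 replicas, always bounded by minReplicas and maxReplicas.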

5.3 Health Checks (Probes)

K8s monitors Pod health through three types of probes:

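| Probe | Purpose | Failure Consequence |
| --- | --- | --- |
| livenessProbe | Detect if container is alive | Container is restarted |
| readinessProbe | Detect if container is ready | Pod is removed from Service endpoints and receives no traffic |
| startupProbe | Detect if container has finished starting | Other probes are held off until it succeeds; if it never succeeds, the container is restarted |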

Importance of Probes

Without health check probes configured, K8s can only determine health by whether the process exists. But often the process is still running while the service is no longer responding (for example, a deadlock, or memory pressure that has not yet triggered an OOM kill). Configuring a livenessProbe allows K8s to automatically restart these "zombie" containers.
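
A minimal sketch of the three probes on one container follows. The endpoints /healthz and /ready, port 3000, and all timing values are assumptions for illustration; the application must actually serve those paths.

```yaml
# Fragment of a Pod template (spec.containers); paths and timings are
# illustrative assumptions.
containers:
  - name: app
    image: my-app:1.0
    ports:
      - containerPort: 3000
    startupProbe:            # allows a slow start of up to 30 x 10s = 300s
      httpGet:
        path: /healthz
        port: 3000
      periodSeconds: 10
      failureThreshold: 30
    livenessProbe:           # restart the container after 3 consecutive failures
      httpGet:
        path: /healthz
        port: 3000
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:          # stop routing traffic to the Pod while this fails
      httpGet:
        path: /ready
        port: 3000
      periodSeconds: 5
```

Keeping the liveness check cheap and independent of downstream dependencies helps avoid restart loops when, say, a database is briefly unreachable; that kind of condition is usually better expressed through the readiness check.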


Summary

Kubernetes is the de facto standard for container orchestration, and understanding its core concepts is the foundation of cloud-native development.

Key takeaways from this chapter:

  1. Declarative management: Tell K8s "what I want," not "how to do it" — the control loop automatically converges
  2. Layered architecture: Control plane makes decisions, worker nodes execute, etcd stores state
  3. Core resources: Pod (smallest unit), Deployment (replica management), Service (service discovery), Ingress (external entry point)
  4. Operations automation: Rolling updates with zero downtime, HPA elastic scaling, automatic fault recovery with probes
  5. Configuration separation: ConfigMap and Secret decouple configuration from images

Further Reading