Docker Containerization
Foreword
"It works on my machine" is the most classic developer excuse, and Docker makes that excuse disappear completely. Containerization technology packages an application and all its dependencies into a standardized unit, ensuring consistent execution across any environment. It is the cornerstone of modern software delivery.
What will you learn from this article?
After completing this chapter, you will gain:
- Core concepts: Understand the three core concepts of images, containers, and registries
- Architecture comparison: Understand the fundamental difference between containers and virtual machines
- Hands-on skills: Master Dockerfile writing and common commands
- Orchestration basics: Learn to manage multi-service applications with Docker Compose
- Best practices: Learn about image optimization, security hardening, and other production-grade practices
| Chapter | Content | Core Concepts |
|---|---|---|
| Chapter 1 | Why Containers Are Needed | Environment consistency, resource efficiency, standardized delivery |
| Chapter 2 | Core Concepts | Image, Container, Registry, Dockerfile |
| Chapter 3 | Docker Lifecycle | Write, Build, Push, Run, Manage |
| Chapter 4 | Docker Compose | Multi-service orchestration, networks, volumes |
| Chapter 5 | Best Practices | Image optimization, security, multi-stage builds |
1. Why Containers?
Before containers, deploying an application required manually installing runtimes, configuring environment variables, and handling dependency conflicts on servers. Differences between environments (development, testing, production) were breeding grounds for bugs.
What Problems Do Containers Solve?
| Problem | Traditional Approach | Container Approach |
|---|---|---|
| Inconsistent environments | "It works on my machine" | Package all dependencies, consistent everywhere |
| Dependency conflicts | App A needs Node 14, App B needs Node 18 | Each container has an independent environment |
| Resource waste | Each VM has a full OS | Share kernel, MB-level overhead |
| Slow deployment | Manual install and configure | docker run — one command |
| Difficult scaling | Create new VM, install environment, deploy | Start new containers in seconds |
The Essence of Containers
Containers are not lightweight virtual machines. Their essence is isolated processes. The Linux kernel implements containers through two mechanisms:
- Namespaces: Isolate a process's view (PID, network, filesystem, etc.)
- Cgroups: Limit a process's resource usage (CPU, memory, IO)
Processes inside a container are fundamentally no different from ordinary processes on the host machine — they're just "locked in a room where they can't see the outside."
2. Core Concepts
The Docker world revolves around three core concepts: Image, Container, and Registry.
| Concept | Analogy | Description |
|---|---|---|
| Image | Class / Template | Read-only application template containing code, runtime, libraries, and configuration |
| Container | Instance / Object | Running instance of an image, read-write, with independent lifecycle |
| Registry | App Store | Service for storing and distributing images (Docker Hub, ACR, ECR) |
| Dockerfile | Recipe / Blueprint | Text file defining how to build an image |
| Volume | External Hard Drive | Persistent data; data survives container deletion |
Image Layer Structure
Docker images are composed of multiple read-only layers stacked together. Each Dockerfile instruction creates one layer:
┌─────────────────────────┐
│ CMD ["node", "app.js"] │ ← Startup command layer
├─────────────────────────┤
│ COPY . /app │ ← Application code layer (changes frequently)
├─────────────────────────┤
│ RUN npm install │ ← Dependency installation layer (changes occasionally)
├─────────────────────────┤
│ FROM node:18-alpine │ ← Base image layer (rarely changes)
└─────────────────────────┘Why Layering Matters
Docker caches each layer. If a layer hasn't changed, the cache is reused during builds. Therefore, in Dockerfiles, you should place instructions that change infrequently at the top (like installing dependencies) and instructions that change frequently at the bottom (like copying code). This way, most builds hit the cache and are much faster.
3. Docker Lifecycle
From writing a Dockerfile to running a container, Docker's workflow is a clear pipeline.
FROM node:18-alpineChoose the base imageWORKDIR /appSet the working directoryCOPY package*.json ./Copy dependency files for cache reuseRUN npm installInstall dependenciesCOPY . .Copy application codeEXPOSE 3000Declare the portCMD ["node", "server.js"]Start commandDockerfile Common Instructions Reference
| Instruction | Purpose | Example |
|---|---|---|
FROM | Specify base image | FROM node:18-alpine |
WORKDIR | Set working directory | WORKDIR /app |
COPY | Copy files into image | COPY package.json ./ |
RUN | Execute commands during build | RUN npm install |
ENV | Set environment variables | ENV NODE_ENV=production |
EXPOSE | Declare port (documentation only) | EXPOSE 3000 |
CMD | Container startup command | CMD ["node", "app.js"] |
ENTRYPOINT | Container entry point (harder to override) | ENTRYPOINT ["nginx"] |
4. Docker Compose: Multi-Service Orchestration
Real projects typically involve more than one container. A web application might need: application server + database + Redis + Nginx. Docker Compose uses a single YAML file to define and manage multiple containers.
docker-compose.yml Example
version: '3.8'
services:
app:
build: .
ports:
- "3000:3000"
environment:
- DB_HOST=db
- REDIS_HOST=redis
depends_on:
- db
- redis
db:
image: postgres:15-alpine
volumes:
- db-data:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD=secret
redis:
image: redis:7-alpine
volumes:
db-data:Compose Core Concepts
| Concept | Description | Example |
|---|---|---|
| services | Define individual container services | app, db, redis |
| volumes | Persistent data volumes | db-data stores database files |
| networks | Custom networks (auto-created by default) | Services access each other by service name |
| depends_on | Startup order dependencies | app depends on db and redis |
| environment | Environment variables | Database password, connection address |
Service Discovery
In Docker Compose, the service name is the hostname. The app container can directly access the database using db:5432 and Redis using redis:6379 without knowing IP addresses. This is thanks to Docker's built-in DNS.
5. Best Practices
5.1 Multi-stage Build
Multi-stage builds are a powerful tool for optimizing image size. The build stage installs all tools and dependencies, while the final stage only retains the files needed at runtime.
# Build stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Runtime stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
EXPOSE 3000
CMD ["node", "dist/server.js"]5.2 Image Optimization Checklist
| Optimization | Approach | Effect |
|---|---|---|
| Choose small base images | Use alpine instead of ubuntu | Image from ~200MB down to ~50MB |
| Merge RUN instructions | Connect multiple commands with && | Reduce image layers |
| Use .dockerignore | Exclude node_modules, .git, etc. | Speed up builds, reduce context |
| Multi-stage builds | Separate build and runtime environments | Final image doesn't contain build tools |
| Pin version numbers | node:18.17-alpine instead of node:latest | Reproducible builds |
5.3 Security Practices
| Practice | Description |
|---|---|
| Don't run as root | USER node to specify a non-root user |
| Scan for vulnerabilities | docker scout or Trivy to scan images |
| Least privilege | Only install necessary packages, no debug tools |
| Don't hardcode secrets | Use environment variables or Docker Secrets |
| Regularly update base images | Fix security vulnerabilities promptly |
Summary
Docker containerization is the infrastructure of modern software delivery, and understanding it is crucial for any developer.
Key takeaways from this chapter:
- Containers vs VMs: Containers share the host kernel, are lighter and faster, but slightly weaker isolation than VMs
- Core trio: Image (template), Container (instance), Registry (distribution)
- Dockerfile: Layered builds, leverage caching, place rarely-changing instructions first
- Docker Compose: Define multi-service applications with YAML, service names are hostnames
- Production practices: Multi-stage builds reduce image size, alpine base images, run as non-root
Further Reading
- Docker Official Documentation - The most authoritative reference
- Docker Getting Started - Official beginner's tutorial
- Dockerfile Best Practices - Official best practices guide
- Docker Compose Documentation - Complete Compose reference