Gateway & Reverse Proxy

🎯 Core Question

In high-concurrency internet architectures, how do you route traffic to the right service safely and efficiently? Reverse proxies solve "how to distribute traffic," and API gateways solve "how to process requests." This article uses real-world analogies (reception desk, security system, intelligent routing) to explore the design philosophy and engineering practices of gateways.

1. Why "Gateway"?

1.1 A Real-World Case: The Architecture Evolution of an E-Commerce Platform

An e-commerce platform encountered serious architectural problems during rapid business growth:

Scenario:

Phase 1: Directly Exposing Services
Client → Directly calls User Service, Order Service, Payment Service...
         ↓
Problem 1: Service IPs are exposed — security risk
Problem 2: No unified authentication or rate limiting
Problem 3: Adding a new service requires modifying client configuration

⚠️ Critical Issues with Direct Exposure

Security risk: All service IPs are exposed and vulnerable to attacks
Redundant functionality: Every service must implement authentication, rate limiting, and logging
Scaling difficulty: Adding a new service requires changes to all clients
Protocol chaos: Some services use HTTP, others gRPC — clients must adapt to all

Improved Architecture (with Gateway):

Client → API Gateway (Nginx/Kong) → Internal Services
         ↓
      Unified authentication, rate limiting, routing
         ↓
      Client only knows the gateway address

✨ Benefits After Improvement

Security: Real service IPs are hidden; only the gateway is exposed externally
Consolidation: Authentication, rate limiting, and logging are handled centrally
Easy scaling: Adding a new service only requires configuring a route on the gateway
Protocol unification: HTTP externally, gRPC internally

1.2 A Real-Life Analogy for Gateways

The Reception Desk

Imagine visiting a large company:

No reception desk: Visitors go straight to departments, don't know where to go, chaos ensues
With reception: Visitors check in at reception first, explain their purpose, and are directed to the right department

An API gateway is the "reception desk" of a system:

Reverse proxy: The receptionist, guiding visitors to the correct department
API gateway: A smart receptionist that also verifies visitor identity (authentication) and limits visitor numbers (rate limiting)

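👤 User (browser) → requests the domain name → 🛡️ Reverse proxy (Nginx) → load balancing → ⚙️ Backend server cluster (Web1 | Web2 | Web3)

🛡️ Reverse Proxy Characteristics

Transparent to the client, which only needs to request the domain name
Hides the real server architecture behind a single external interface
Provides load balancing, security protection, SSL termination, and more
Typical examples: Nginx, HAProxy, AWS ELB

💡 Typical Use Cases

A website must carry high-concurrency traffic (load balancing)
Centralized HTTPS certificate management (SSL termination)
Protection against DDoS attacks and SQL injection
Canary releases, A/B testing, blue-green deployments

🧠 Mnemonic

"Reverse proxy = a proxy for the servers": the client never learns about the real servers, only the domain name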

2. What Is a Reverse Proxy?

2.1 Forward Proxy vs. Reverse Proxy

🤔 Terminology

Forward Proxy:

Deployed on the client side
Accesses external resources on behalf of the client
Typical applications: VPNs, circumvention tools
Example: In a corporate network, you access the internet through a proxy

Reverse Proxy:

Deployed on the server side
Receives client requests and forwards them to internal services
Clients only know the proxy exists, not the real servers
Examples: Nginx, HAProxy

Comparison:

Dimension	Forward Proxy	Reverse Proxy
Deployment side	Client side	Server side
Serves	Clients	Servers
Typical use	VPN, circumvention	Load balancing, gateway
Transparency	Server sees the proxy IP	Client sees the proxy IP
Purpose	Hide real client, accelerate access	Hide real server, load balance

2.2 Core Value of a Reverse Proxy

Value 1: Load Balancing

Distributes traffic across multiple backend servers to avoid single-point overload.

Client
  ↓
Nginx (Reverse Proxy)
  ↓
┌──────────┬──────────┬──────────┐
│ Server 1 │ Server 2 │ Server 3 │
└──────────┴──────────┴──────────┘

Value 2: Security Protection

Hides real server IPs to prevent direct attacks. Security is enforced at the proxy layer.

Client → Only sees Nginx's IP
Real servers → Only on the internal network, inaccessible externally

Value 3: SSL Termination

Handles HTTPS encryption/decryption at the proxy layer; backend services use HTTP, reducing backend computational overhead.

HTTPS Client → Nginx (encrypt/decrypt) → HTTP Backend Services
                   ↑
              SSL termination point

3. Nginx: How Does It Handle Millions of Concurrent Connections?

3.1 Master-Worker Process Model

Nginx uses a multi-process architecture, not multi-threaded:

Master Process (Manager):

Reads and validates configuration files
Manages Worker processes (start, stop, reload)
Does not handle actual requests

Worker Processes (Workers):

Actually handle HTTP requests
Each Worker is an independent, isolated process
The number is typically set to the number of CPU cores to avoid context-switching overhead

💡 Advantages

Strong isolation: One Worker crash does not affect other Workers
Full multi-core utilization: Each Worker runs independently
Avoids multi-threading complexity: No need to deal with locks, race conditions, etc.

3.2 Event-Driven + Async Non-Blocking

This is the core secret of Nginx's high performance:

Traditional Apache (multi-process/thread model):

One connection = one process/thread
Concurrency is limited by the number of system processes/threads
Under high connection counts, context-switching overhead is enormous

Nginx (event-driven model):

Uses efficient I/O multiplexing mechanisms such as epoll (Linux) or kqueue (macOS/BSD)
A single Worker process can handle tens of thousands of connections simultaneously
When a connection has no data, it consumes no CPU; it is woken up by event notifications when new data arrives

Real-Life Analogy

Apache: A restaurant where every diner gets a dedicated waiter (process); many diners require many waiters
Nginx: One super-waiter serving all diners simultaneously, going to whoever needs service rather than standing next to a single diner
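The "one waiter, many diners" idea can be sketched with Python's standard `selectors` module, which wraps epoll or kqueue where available. This is only an illustration of I/O multiplexing in a single thread, not how Nginx itself is implemented; the function name `serve_once` and the upper-casing "work" are assumptions made for the example.

```python
import selectors
import socket


def serve_once(messages):
    """Answer every connection from one thread, driven by readiness events."""
    selector = selectors.DefaultSelector()  # epoll on Linux, kqueue on macOS/BSD
    clients = []
    for message in messages:
        # socketpair() stands in for a real client/server TCP connection.
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        selector.register(server_side, selectors.EVENT_READ)
        client.sendall(message)
        clients.append(client)

    pending = len(messages)
    while pending:
        events = selector.select(timeout=1.0)
        if not events:
            break  # nothing became readable; stop instead of spinning forever
        for key, _mask in events:
            connection = key.fileobj
            data = connection.recv(1024)
            connection.sendall(data.upper())  # the "work" done per request
            selector.unregister(connection)
            connection.close()
            pending -= 1

    replies = [client.recv(1024) for client in clients]
    for client in clients:
        client.close()
    selector.close()
    return replies
```

The loop only touches a connection when the selector reports it as readable, so idle connections cost no CPU time.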

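Nginx Process Architecture

👑 Master process: manages all Workers and handles configuration loading and graceful upgrades
⚙️ Workers 1-4: four Worker processes, each handling requests independently
📡 epoll (Linux) / kqueue (macOS): event-driven, so one Worker handles tens of thousands of connections at once

Traditional Apache: one connection = one process/thread (❌ the C10K problem)
Nginx: event-driven + async non-blocking (✅ millions of concurrent connections)

💡 Production Recommendation

Number of Workers = number of CPU cores (usually set to auto so that Nginx detects it)
Too many Workers add context-switching overhead; too few leave cores idle.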

4. What Is an API Gateway?

4.1 Why Do You Need an API Gateway?

Imagine a system without a gateway:

The client must know addresses for multiple services (User Service, Order Service, Payment Service...)
Each service must implement its own authentication, rate limiting, and logging
Protocols are inconsistent — some use HTTP, others gRPC
When services upgrade, clients must change too

⚠️ Problems Without a Gateway

Client complexity: Must configure multiple service addresses
Redundant functionality: Every service must implement authentication and rate limiting
Protocol chaos: Clients must adapt to multiple protocols
Upgrade difficulty: Service upgrades force client-side changes

With an API gateway:

The client only needs to know the gateway address; the gateway routes to the correct service
Cross-cutting concerns like authentication, rate limiting, and logging are handled centrally
The gateway can perform protocol translation; externally it uniformly exposes HTTP
Backend service upgrades only require gateway config changes — clients are unaffected

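Clients (the visitors): 📱 App, 💻 Web, 🔧 Third party
⬇️ Single entry point
🚪 API gateway (the reception desk): 🔐 authentication, ⚡ rate limiting and circuit breaking, 🧭 route forwarding, 🔄 protocol translation
⬇️ Request dispatch
⚙️ Backend services (the departments): 👤 User Service (/api/users), 📦 Order Service (/api/orders), 💳 Payment Service (/api/pay)

🔐 Authentication

User identity is validated in one place, so backend services do not each need their own login logic. JWT, OAuth2, API keys, and other schemes are supported.

💡 In Practice

A request carries a JWT; the gateway verifies its signature and expiration, then adds the user ID to a request header and forwards the request to the backend service.

🤔 Without a Gateway vs. With a Gateway

Requirement	Without a gateway (direct access)	With an API gateway
Authentication	Every service reimplements login validation	✅ JWT validated once at the gateway layer
Rate limiting	Every service implements its own rate limiting	✅ Gateway rate-limits centrally, protecting backends
Protocol translation	HTTP, gRPC, and WebSocket each handled separately	✅ Gateway uniformly exposes HTTP externally
Canary release	Requires changing load balancer configuration	✅ Routed by header at the gateway layer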

4.2 Core Features of an API Gateway

Feature	Description	Typical Scenario
Route forwarding	Forwards requests to different services based on URL, headers, etc.	`/api/users` → User Service, `/api/orders` → Order Service
Load balancing	Distributes traffic when a service has multiple instances	User Service has 3 instances, round-robin request distribution
Authentication	Centrally validates JWT, OAuth tokens	Unauthenticated users cannot access `/api/admin`
Rate limiting & circuit breaking	Controls traffic caps to prevent service overload	Max 1000 requests/second; beyond that returns 429
Protocol translation	HTTP externally, can translate to gRPC internally	Client uses HTTP, gateway translates to gRPC for internal calls
Canary release	Routes a portion of traffic to a new version by header or ratio	5% of users experience the new version, 95% use the old
Logging & monitoring	Centrally records request logs for analysis and troubleshooting	Record request latency, status codes, response sizes
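The canary-release row can be made concrete with a small sketch. It assumes the gateway knows a stable user ID and that a hypothetical `X-Canary: true` header forces the new version; neither detail comes from any particular gateway product. Hashing the user ID keeps each user on the same version across requests.

```python
import hashlib


def canary_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_version(user_id: str, headers: dict, canary_percent: int = 5) -> str:
    """Return "v2" for canary traffic and "v1" for everyone else."""
    # Header-based routing: testers can opt in explicitly (hypothetical header).
    if headers.get("X-Canary") == "true":
        return "v2"
    # Ratio-based routing: roughly canary_percent of users fall below the cutoff.
    return "v2" if canary_bucket(user_id) < canary_percent else "v1"
```

Raising `canary_percent` step by step moves more users to the new version without reshuffling the ones already on it.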

5. Gateway in Practice: How to Build a Complete Gateway Architecture?

5.1 Full Architecture Diagram

┌───────────────────────────────────────────────────────────────────────┐
│                           Client (Browser/App)                         │
└───────────────────────────┬─────────────────────────────────────────┘
                                │ HTTPS
                                ▼
┌───────────────────────────────────────────────────────────────────────┐
│                      Outer Layer: CDN + WAF                            │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │  CDN (Content Delivery Network)                              │  │
│  │  - Static asset caching (images, CSS, JS)                    │  │
│  │  - Nearby access, reduced latency                            │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  WAF (Web Application Firewall)                               │  │
│  │  - Protection against SQL injection, XSS attacks              │  │
│  │  - Block malicious bots and crawlers                          │  │
│  │  - CC attack protection                                       │  │
│  └───────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌───────────────────────────────────────────────────────────────────────┐
│                  Middle Layer: API Gateway (Nginx/Kong)                │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 1: SSL Termination + Security                          │  │
│  │  - HTTPS / TLS 1.3                                            │  │
│  │  - HSTS, security response headers                            │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 2: Authentication & Authorization                      │  │
│  │  - JWT Token validation                                       │  │
│  │  - OAuth 2.0 / SSO integration                                │  │
│  │  - API Key management                                         │  │
│  │  - Permission checks (RBAC)                                   │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 3: Traffic Control                                     │  │
│  │  - Rate limiting — token bucket / leaky bucket algorithms     │  │
│  │  - Circuit breaking — prevent fault propagation               │  │
│  │  - Degradation — fallback when a service is unavailable       │  │
│  │  - Canary release — traffic splitting by ratio                │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 4: Routing & Load Balancing                            │  │
│  │  - Path-based Routing                                         │  │
│  │  - Host-based Routing                                         │  │
│  │  - Header-based Routing                                       │  │
│  │  - Load balancing algorithms — round-robin / weighted /       │  │
│  │    least connections / IP hash                                │  │
│  │  - Service Discovery integration                              │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 5: Protocol Translation & Data Processing              │  │
│  │  - SSL Termination — HTTPS ↔ HTTP                             │  │
│  │  - Protocol translation — HTTP ↔ gRPC / WebSocket             │  │
│  │  - Request/Response transformation — JSON ↔ XML               │  │
│  │  - Data compression — Gzip / Brotli                           │  │
│  │  - Caching — static assets and API responses                  │  │
│  └───────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌───────────────────────────────────────────────────────────────────────┐
│                    Inner Layer: Microservice Cluster                   │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐      │
│  │ User Svc    │ │ Order Svc   │ │ Product Svc │ │ Payment Svc │      │
│  │             │ │             │ │             │ │             │      │
│  └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘      │
│         │                │                │                │               │
│         └────────────────┴────────────────┴────────────────┘               │
│                                       │                              │
│                Service Discovery & Config Center (etcd)              │
│                - Service registration & discovery                    │
│                - Health checks                                       │
│                - KV config storage                                   │
└───────────────────────────────────────────────────────────────────────┘

5.2 Routing & Load Balancing

One of the gateway's core responsibilities is getting requests to the right place. This involves two key capabilities: routing (which server to go to) and load balancing (how to distribute traffic).

Routing Rules: From URL to Service

Imagine an e-commerce system where different URLs map to different services:

/api/users/* → User Service
/api/orders/* → Order Service
/api/products/* → Product Service
/api/pay/* → Payment Service

Nginx configuration example:

nginx

server {
    listen 80;
    server_name api.example.com;

    # User Service
    location /api/users/ {
        proxy_pass http://user-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Order Service
    location /api/orders/ {
        proxy_pass http://order-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Product Service
    location /api/products/ {
        proxy_pass http://product-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Payment Service (requires higher security)
    location /api/pay/ {
        # Restrict IP access
        allow 10.0.0.0/8;
        deny all;

        proxy_pass http://payment-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
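The path-to-service mapping above can be sketched in a few lines. For plain prefix locations, Nginx picks the longest matching prefix, and this sketch mirrors only that rule (no regex or exact-match locations). The `ROUTES` table and the `resolve` function are illustrative names, not part of any gateway API.

```python
from typing import Optional

# Route table copied from the URL mapping in this section.
ROUTES = {
    "/api/users/": "user-service",
    "/api/orders/": "order-service",
    "/api/products/": "product-service",
    "/api/pay/": "payment-service",
}


def resolve(path: str, routes: dict = ROUTES) -> Optional[str]:
    """Return the service for the longest matching prefix, or None."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return None  # a real gateway would answer 404 here
    return routes[max(matches, key=len)]
```

Adding a new service then means adding one entry to the table, which is the "only configure a route on the gateway" benefit described earlier.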

Load Balancing: Comparing Four Strategies

When a service has multiple instances, how do you choose?

Strategy	Principle	Use Case	Pros	Cons
Round-robin	Assigns to each server in order	Servers with similar specs	Simple and fair	Doesn't consider current server load
Weighted round-robin	Assigns by weight ratio; higher weight = more traffic	Servers with uneven specs	Fully utilizes high-perf servers	Requires sensible weight configuration
Least connections	Assigns to the server with the fewest active connections	Long-lived connections, video streaming	Dynamically adapts to load changes	Requires real-time connection tracking
IP hash	Hashes client IP; same IP always goes to the same server	Session persistence needed	Guarantees session consistency	A heavy-traffic IP can create a hotspot

Nginx configuration example:

nginx

# Weighted round-robin
upstream backend_weighted {
    server 10.0.1.10:8080 weight=3;  # High performance, handles more traffic
    server 10.0.1.11:8080 weight=2;
    server 10.0.1.12:8080 weight=1;  # Lower performance, handles less traffic
}

# Least connections
upstream backend_least_conn {
    least_conn;
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}

# IP hash (session persistence)
upstream backend_ip_hash {
    ip_hash;
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}
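To make the selection logic behind these directives concrete, here is a minimal sketch of three strategies. The weighted picker uses the "smooth" weighted round-robin idea (plain round-robin is the special case where every weight is 1); the `Server` class, its field names, and the use of SHA-256 for IP hashing are assumptions for illustration, not Nginx internals.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Server:
    address: str
    weight: int = 1
    active: int = 0   # currently open connections
    current: int = 0  # running score used by smooth weighted round-robin


def pick_weighted(servers):
    """Smooth weighted round-robin: spreads picks in proportion to weight."""
    total = sum(server.weight for server in servers)
    for server in servers:
        server.current += server.weight
    chosen = max(servers, key=lambda server: server.current)
    chosen.current -= total
    return chosen


def pick_least_conn(servers):
    """Least connections: choose the server with the fewest active connections."""
    return min(servers, key=lambda server: server.active)


def pick_ip_hash(servers, client_ip: str):
    """IP hash: the same client IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

With weights 3, 2, and 1, six consecutive calls to `pick_weighted` select the three servers 3, 2, and 1 times respectively, interleaved rather than in runs.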

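Round-robin: take turns, one request each

Requests are assigned to the servers one after another, in list order. It works like a bank calling queue numbers: window 1 takes a customer, then window 2, then window 3, and so on around.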


6. Gateway Security: How to Guard the System's Front Door?

6.1 Authentication & Authorization

Traditional approach (each service authenticates independently):

User Service, Order Service, Payment Service... each must validate JWTs
Code duplication, maintenance headache
Secrets scattered across services — higher leak risk

Gateway-unified authentication:

Client accesses the gateway with a token
Gateway validates the token (signature, expiration)
After validation, user info (e.g., user_id) is added to request headers and forwarded to backend services
Backend services don't need to re-validate the token; they read user info directly from headers, so the gateway must strip any client-supplied copies of those headers

💡 Core Idea

Authenticate at the gateway, authorize at the service:

Authentication: Who are you? (Validate token, obtain user identity)
Authorization: What can you do? (Determine permissions based on user role)

Like a company reception desk: reception authenticates your identity (ID card), but specific permissions are determined by each department.
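The gateway-side check can be sketched with the Python standard library alone. This is a minimal illustration for HS256 tokens, assuming a shared secret and a numeric `exp` claim; the header name `X-User-Id` and the helper names are hypothetical, and a production gateway would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def sign_hs256(payload: dict, secret: bytes) -> str:
    """Issue a token (what the authentication service would do)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [b64url_encode(json.dumps(obj, separators=(",", ":")).encode()) for obj in (header, payload)]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_hs256(token: str, secret: bytes, now=None):
    """Return the claims if signature and expiration are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        payload = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
    except ValueError:
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(payload, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # never let the token choose the algorithm
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return None
    exp = payload.get("exp")
    now = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or exp <= now:
        return None  # missing or expired
    return payload


def forward_headers(client_headers: dict, claims: dict) -> dict:
    """Drop any client-supplied identity header, then set it from the claims."""
    headers = {k: v for k, v in client_headers.items() if k.lower() != "x-user-id"}
    headers["X-User-Id"] = str(claims["sub"])
    return headers
```

Rebuilding the identity header from verified claims is what makes it safe for backend services to trust it without validating the token again.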

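JWT (JSON Web Token) Authentication Flow

User: enters a username and password and clicks log in
↓
Gateway/Nginx: forwards the login request to the authentication service
↓
Authentication service: verifies the password and issues a JWT (Header, Payload, Signature)
↓
User/client: stores the token (localStorage or a cookie)
↓
Subsequent requests: carry the token in an HTTP header: Authorization: Bearer <Token>
↓
Gateway/Nginx: verifies the token's signature and expiration, then forwards the request
↓
Backend service: reads the user information from the token and runs the business logic

🔑 JWT Structure (Base64URL-encoded)

PAYLOAD

eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ

{ "sub": "1234567890", "name": "John Doe", "iat": 1516239022 }

SIGNATURE

SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

HMACSHA256(base64Url(header) + "." + base64Url(payload), secret)

🛠️ Comparing Three Approaches

Dimension	Session + Cookie	JWT	OAuth 2.0
Storage	Server stores the session; client stores a cookie	Client stores the token; server is stateless	Authorization server stores state; client stores an access token
Scalability	❌ Sessions must be shared, which complicates scaling	✅ Stateless, easy to scale horizontally	✅ Distributed by design, supports large systems
Security	⚠️ Cookies can be stolen; CSRF protection is required	⚠️ Tokens can leak; use HTTPS and short lifetimes	✅ Industry best practice with multiple security mechanisms
Implementation complexity	🟢 Simple, works out of the box	🟡 Medium, requires token management	🔴 Complex, requires an authorization server
Best for	Traditional web apps, admin back offices	SPAs, mobile APIs, microservices	Third-party login, open platforms, SSO

🔒 Best Practices for Gateway-Layer Authentication

Validate centrally at the gateway

Do not duplicate authentication logic in every microservice; validate the JWT or session once at the gateway layer

Enforce HTTPS

Force HTTPS at the gateway layer so tokens cannot be stolen in transit (man-in-the-middle attacks)

Token expiration policy

Keep access tokens short-lived (15 minutes) and pair them with a refresh token for seamless renewal

Denylist

When a user logs out or a token leaks, add the token to a denylist (stored in Redis)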

6.2 HTTPS & SSL Termination

Why HTTPS?

Security: Prevents data from being stolen in transit
Compliance: Modern browsers show "Not Secure" warnings for HTTP sites
SEO: Search engines prioritize HTTPS sites

SSL termination approach:

Only configure HTTPS and certificates at the gateway layer
The gateway handles TLS handshakes and encryption/decryption
Communication between gateway and backend services uses plain HTTP (this assumes the internal network is trusted)
Backend services focus on business logic without handling TLS

💡 Advantages of SSL Termination

Simplified management: Certificates only configured on the gateway, not on backends
Reduced overhead: Backend services don't need to handle TLS handshakes
Unified updates: Certificate renewal only needs to happen on the gateway
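Once TLS ends at the gateway, a backend sees only the gateway's address and plain HTTP, so it has to recover the original client IP and scheme from forwarded headers. The sketch below shows one cautious way to do that, assuming the gateway sets `X-Forwarded-For` and `X-Forwarded-Proto` and lives in the 10.0.0.0/8 range used elsewhere in this article; the function names are illustrative.

```python
import ipaddress

# Assumption: only proxies in this range may be believed.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(address: str) -> bool:
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # garbage in a header is never trusted
    return any(ip in network for network in TRUSTED_PROXIES)


def original_request_info(peer_address: str, headers: dict):
    """Return (client_ip, scheme) as seen at the edge of the system."""
    if not is_trusted(peer_address):
        # The request did not come through the gateway: ignore its headers.
        return peer_address, "http"
    hops = [hop.strip() for hop in headers.get("X-Forwarded-For", "").split(",") if hop.strip()]
    client_ip = peer_address
    # Walk from the nearest hop outward and stop at the first untrusted address.
    for hop in reversed(hops):
        client_ip = hop
        if not is_trusted(hop):
            break
    return client_ip, headers.get("X-Forwarded-Proto", "http")
```

Stopping at the first untrusted hop means a value the client prepended to `X-Forwarded-For` is never mistaken for the real address.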

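🔐 HTTPS Traffic Decryption Flow

👤 Client (browser): sends an HTTPS request
🔒 TLS-encrypted connection. Certificate: *.example.com. Protocol: TLS 1.3. Cipher: AES-256-GCM
🚪 Nginx (SSL termination): 📜 presents the certificate, 🔓 decrypts traffic, 📝 adds X-Forwarded-* headers
🔓 Plain HTTP to the backend, carrying:

X-Forwarded-For: 203.0.113.42

X-Forwarded-Proto: https

X-Real-IP: 203.0.113.42

⚙️ Backend service cluster: focuses on business logic, with no TLS handling

📜 SSL Certificate Management

Generate a private key

Use OpenSSL to generate an RSA private key, the basis of the certificate

openssl genrsa -out private.key 2048

Create a CSR

Generate a certificate signing request containing the domain name and organization details

openssl req -new -key private.key -out csr.pem

Domain validation

The CA verifies ownership of the domain (via a DNS record or an HTTP file)

# Add a DNS TXT record or upload the validation file under /.well-known/

Issue the certificate

Once validation passes, the CA issues the certificate files

# Download certificate.crt and chain.crt

Deploy

Configure the certificate in Nginx and test

nginx -t && systemctl reload nginx

✨ Core Advantages of SSL Termination

🚀 Performance

TLS handshakes and encryption/decryption are CPU-intensive. Concentrating them in Nginx lets backend services focus on business logic, and overall throughput can improve by as much as 2-5x

🔧 Simpler Operations

Certificates are managed in one place: configure them once in Nginx instead of in every backend service, and renew or replace them in a single step

🛡️ Centralized Security

SSL/TLS settings are controlled in one place, so current protocol versions and cipher suites can be enforced and security response headers (HSTS, CSP, and so on) added uniformly

📊 Unified Monitoring

All HTTPS traffic passes through Nginx, so access logs, TLS handshake performance, and certificate expiry can be tracked in one place for auditing and troubleshooting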

7. Rate Limiting & Circuit Breaking: How to Prevent the System from Being Overwhelmed by "Traffic Floods"?

7.1 Rate Limiting Algorithm Comparison

Algorithm	Core Idea	Burst traffic	Use Case	Complexity
Token bucket	Bucket holds tokens; a request needs a token to pass	Allows some bursting	API rate limiting, bandwidth control	Medium
Leaky bucket	Requests enter the bucket and are processed at a steady rate	Enforces smoothing; bursts are queued or rejected	Scenarios requiring strict steady processing	Medium
Sliding window	Counts requests within a time window	Strictly counts by window; excess is rejected	Precise counting (e.g., "max 100 per minute")	High
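The difference between the first and last rows is easier to see in code. The sketch below implements a token bucket and a sliding-window log with an injected clock (`now` in seconds) so the behavior is deterministic; the class names are illustrative, and the sliding window shown is the simple "log of timestamps" variant.

```python
from collections import deque


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity`; each request spends one."""

    def __init__(self, rate: float, capacity: int, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.last = now

    def allow(self, now: float) -> bool:
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class SlidingWindowLog:
    """Allows at most `limit` requests in any trailing `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.stamps = deque()  # timestamps of accepted requests

    def allow(self, now: float) -> bool:
        # Forget requests that have slid out of the window.
        while self.stamps and self.stamps[0] <= now - self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False
```

The token bucket accepts a burst as large as its capacity and then settles to the refill rate, while the sliding window never lets the count inside the window exceed the limit, at the cost of storing one timestamp per accepted request.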

7.2 Nginx Rate Limiting Configuration in Practice

nginx

# Define rate-limiting zones (place in the http block)

# 1. Per-IP request rate limiting (limit_req implements a leaky bucket)
# zone=mylimit:10m: zone name and shared-memory size; per the Nginx docs,
#   1 MB holds about 16k 64-byte states or about 8k 128-byte states,
#   so 10 MB tracks roughly 80k-160k IPs depending on the platform
# rate=10r/s: 10 requests per second
limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

# 2. Stricter per-IP zone for the Order Service: 5 requests per second
# (limit_req_zone is only valid in the http context, so it cannot sit inside a location)
limit_req_zone $binary_remote_addr zone=order_limit:10m rate=5r/s;

# 3. Per-IP connection limit (prevents a single IP from opening too many connections)
limit_conn_zone $binary_remote_addr zone=addr:10m;

# 4. Server-wide rate limiting (keyed by $server_name, not per-IP; protects the backend as a whole)
# This zone only takes effect in locations that reference it with limit_req zone=server_limit
limit_req_zone $server_name zone=server_limit:10m rate=100r/s;

server {
    listen 80;
    server_name api.example.com;

    # Rejected requests return 503 by default; switch both limiters to 429
    limit_req_status 429;
    limit_conn_status 429;

    # User Service: normal rate limiting
    location /api/users/ {
        # burst=20: bucket capacity, allows 20 requests above the rate
        # nodelay: serve burst requests immediately instead of pacing them;
        #          anything beyond the burst is rejected
        limit_req zone=mylimit burst=20 nodelay;

        # At most 10 concurrent connections per IP
        limit_conn addr 10;

        proxy_pass http://user-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Order Service: stricter rate limiting (zone defined in the http block above)
    location /api/orders/ {
        limit_req zone=order_limit burst=10 nodelay;

        proxy_pass http://order-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Response body for rate-limited requests (429 Too Many Requests)
    error_page 429 /429.html;
    location = /429.html {
        internal;
        # "return" with a text body takes its Content-Type from default_type
        default_type application/json;
        return 429 '{"error": "Too Many Requests", "message": "Rate limit exceeded. Please try again later."}';
    }
}

💡 Rate Limiting Strategy Recommendations

Normal endpoints: 10 requests/second, allow 20 burst
Critical endpoints (payment, orders): 5 requests/second, allow 10 burst
Global protection: Total across all requests no more than 100/second
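How `rate`, `burst`, and `nodelay` interact is easier to follow with a small model. The sketch below is loosely modeled on the leaky-bucket "excess" counter idea behind `limit_req` with `nodelay`; it is an assumption about the shape of the algorithm for teaching purposes, not Nginx's actual code, and the class name is made up.

```python
class LeakyBucketMeter:
    """Tracks how far a client is ahead of its allowed rate ("excess")."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate      # requests drained per second
        self.burst = burst    # how much excess is tolerated
        self.excess = 0.0
        self.last = None      # time of the last accepted request

    def allow(self, now: float) -> bool:
        if self.last is None:
            self.last = now   # first request from this client starts with no excess
            return True
        elapsed = max(0.0, now - self.last)
        # The bucket drains at `rate`; this request adds one unit.
        excess = max(0.0, self.excess - self.rate * elapsed + 1.0)
        if excess > self.burst:
            return False      # rejected requests do not change the state
        self.excess = excess
        self.last = now
        return True
```

With `rate=10` and `burst=20`, this model accepts 21 back-to-back requests (one plus the burst) and rejects the rest until the bucket has drained, which is the behavior the "10 requests/second, allow 20 burst" recommendation relies on.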

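🪙 Token Bucket Illustration

The bucket holds up to 10 tokens and a generator adds 2 tokens per second; each request in the queue must take a token to pass.

📊 Comparing the Three Algorithms

Dimension	Token Bucket	Leaky Bucket	Sliding Window
Core idea	The bucket holds tokens; a request needs a token to pass	Requests enter the bucket and drain at a steady rate	Counts requests within a time window
Burst traffic	✅ Allows some bursting (while tokens remain)	❌ Enforces smoothing; bursts are buffered or rejected	❌ Strictly counts by window; excess is always rejected
Use case	API rate limiting, bandwidth control (bursts allowed)	Strict steady-rate processing (e.g., message queues)	Precise counting (e.g., "max 100 per minute")
Implementation complexity	Medium	Medium	Higher (must record the requests in each window)
Nginx configuration	limit_req_zone (leaky bucket)	limit_req_zone (leaky bucket)	Requires a third-party module or Lua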

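📝 Nginx Rate Limiting Configuration Example

# Define the rate-limiting zone
# $binary_remote_addr: rate-limit per IP
# zone=mylimit:10m: zone name and size
# rate=10r/s: at most 10 requests per second
limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

server {
    listen 80;
    server_name api.example.com;

    location / {
        # Apply rate limiting
        # burst=20: bucket capacity, allows a burst of 20 requests
        # nodelay: do not delay burst requests
        limit_req zone=mylimit burst=20 nodelay;

        proxy_pass http://backend;
    }
}

💡 Configuration Notes

limit_req_zone: defines the rate-limiting zone; it belongs in the http block
$binary_remote_addr: uses the binary IP address as the rate-limiting key (saves memory)
zone=mylimit:10m: a zone named mylimit with 10 MB of memory
rate=10r/s: allows 10 requests per second (leaky bucket algorithm)
burst=20: a bucket capacity of 20, which tolerates some burst traffic
nodelay: burst requests are not delayed (they are processed immediately or rejected)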

7.3 Circuit Breaking: Preventing Fault Propagation

How a circuit breaker works:

Closed state: Requests are forwarded normally; error rate is tracked
Open state: When the error rate exceeds the threshold, the circuit breaker opens, immediately returning errors without forwarding requests
Half-open state: After a period, a small number of requests are allowed through as probes; if they succeed, the circuit breaker closes, and if they fail, it opens again

💡 Core Idea

A circuit breaker is like an electrical fuse: when current is too high, the fuse blows automatically, protecting the entire circuit from burning out.

Similarly, when a backend service has a high error rate, the circuit breaker "trips," failing fast to prevent the fault from spreading across the entire system.
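The three states can be sketched as a small state machine. This is a simplified illustration, assuming an error-rate threshold measured over all calls since the breaker last closed and a single probe request in the half-open state; the class name, parameter names, and default values are made up for the example.

```python
class CircuitBreaker:
    def __init__(self, failure_threshold=0.5, min_calls=10, open_seconds=30.0):
        self.failure_threshold = failure_threshold  # error rate that trips the breaker
        self.min_calls = min_calls                  # do not judge on too few samples
        self.open_seconds = open_seconds            # how long to fail fast before probing
        self.state = "closed"
        self.successes = 0
        self.failures = 0
        self.opened_at = None

    def allow_request(self, now: float) -> bool:
        if self.state == "closed":
            return True
        if self.state == "open" and now - self.opened_at >= self.open_seconds:
            self.state = "half_open"  # let exactly one probe through
            return True
        return False  # open (fail fast) or half-open with a probe in flight

    def record(self, success: bool, now: float) -> None:
        if self.state == "half_open":
            if success:
                self._close()
            else:
                self._trip(now)
            return
        if self.state == "open":
            return  # late result from before the trip; ignore it
        if success:
            self.successes += 1
        else:
            self.failures += 1
        total = self.successes + self.failures
        if total >= self.min_calls and self.failures / total > self.failure_threshold:
            self._trip(now)

    def _trip(self, now: float) -> None:
        self.state = "open"
        self.opened_at = now

    def _close(self) -> None:
        self.state = "closed"
        self.successes = 0
        self.failures = 0
```

A caller checks `allow_request` before forwarding, returns a fallback response when it is `False`, and reports each outcome with `record`.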

8. Summary: Core Thinking in Gateway Design

8.1 Review of Core Principles

Principle	Meaning	Key Practices
Routing	Get requests to the right place	Path-based, host-based, header-based routing
Load balancing	Distribute traffic across servers	Round-robin, weighted, least connections, IP hash
Security	Guard the system's front door	Authentication & authorization, HTTPS, WAF
Rate limiting	Prevent being overwhelmed by traffic	Token bucket, leaky bucket, sliding window
Circuit breaking	Prevent fault propagation	Fail fast, degradation strategies
Observability	Monitoring and troubleshooting	Logging, metrics, distributed tracing

8.2 Technology Selection Advice

💡 Selection Decision Tree

Choosing a gateway:
│
├─ Only need reverse proxy & load balancing?
│  ├─ Yes → Nginx (first choice)
│  └─ No → Continue
│
├─ Need a rich plugin ecosystem?
│  ├─ Yes → Kong (built on Nginx)
│  └─ No → Continue
│
└─ Spring Cloud ecosystem?
   ├─ Yes → Spring Cloud Gateway
   └─ No → Nginx

9. Glossary

Term	Explanation
Reverse Proxy	A proxy deployed on the server side that receives client requests and forwards them to internal services. Clients only know the reverse proxy, not the real server addresses.
Forward Proxy	A proxy deployed on the client side that accesses external resources on behalf of the client. The server sees the proxy's IP, not the real client. Typical applications: VPNs, circumvention tools.
API Gateway	An intermediary layer between clients and backend services that provides routing, authentication, rate limiting, logging, and more — the "unified front door" of a microservice architecture.
Load Balancing	Distributing request traffic across multiple servers to avoid overloading a single server, improving system availability and performance.
SSL Termination	Handling HTTPS encryption/decryption at the gateway layer; backend services use HTTP, reducing backend computational overhead and simplifying certificate management.
Rate Limiting	Limiting the number of requests per unit of time to prevent the system from being overwhelmed by traffic bursts. Common algorithms: token bucket, leaky bucket, sliding window.
Circuit Breaking	Automatically cutting off calls to a failing dependency to prevent fault propagation, while providing a fallback strategy.
Session Persistence	Ensuring requests from the same client are always routed to the same backend server, used in scenarios requiring session state.
Health Check	Periodically checking the health of backend services, automatically removing faulty nodes to ensure traffic is only sent to healthy instances.
Canary Release	Routing a small portion of traffic to a new version, verifying stability, then gradually increasing the ratio to reduce release risk.
WAF	Web Application Firewall — protects against SQL injection, XSS, CC attacks, and other web security threats.
CDN	Content Delivery Network — deploys edge nodes globally to accelerate access to static assets.
