Gateway & Reverse Proxy

🎯 Core Question

In high-concurrency internet architectures, how do you route traffic to the right service safely and efficiently? Reverse proxies solve "how to distribute traffic," and API gateways solve "how to process requests." This article uses real-world analogies (reception desk, security system, intelligent routing) to explore the design philosophy and engineering practices of gateways.


1. Why "Gateway"?

1.1 A Real-World Case: The Architecture Evolution of an E-Commerce Platform

An e-commerce platform encountered serious architectural problems during rapid business growth:

Scenario:

Phase 1: Directly Exposing Services
Client → Directly calls User Service, Order Service, Payment Service...

Problem 1: Service IPs are exposed — security risk
Problem 2: No unified authentication or rate limiting
Problem 3: Adding a new service requires modifying client configuration

⚠️ Critical Issues with Direct Exposure

  • Security risk: All service IPs are exposed and vulnerable to attacks
  • Redundant functionality: Every service must implement authentication, rate limiting, and logging
  • Scaling difficulty: Adding a new service requires changes to all clients
  • Protocol chaos: Some services use HTTP, others gRPC — clients must adapt to all

Improved Architecture (with Gateway):

Client → API Gateway (Nginx/Kong) → Internal Services

      Unified authentication, rate limiting, routing

      Client only knows the gateway address

✨ Benefits After Improvement

  • Security: Real service IPs are hidden; only the gateway is exposed externally
  • Consolidation: Authentication, rate limiting, and logging are handled centrally
  • Easy scaling: Adding a new service only requires configuring a route on the gateway
  • Protocol unification: HTTP externally, gRPC internally

1.2 A Real-Life Analogy for Gateways

The Reception Desk

Imagine visiting a large company:

  • No reception desk: Visitors go straight to departments, don't know where to go, chaos ensues
  • With reception: Visitors check in at reception first, explain their purpose, and are directed to the right department

An API gateway is the "reception desk" of a system:

  • Reverse proxy: The receptionist, guiding visitors to the correct department
  • API gateway: A smart receptionist that also verifies visitor identity (authentication) and limits visitor numbers (rate limiting)
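
🔄 Reverse Proxy vs. Forward Proxy

In one sentence: a forward proxy is "the client's proxy"; a reverse proxy is "the server's proxy."

User (browser, accesses a domain name) → Reverse proxy (Nginx: proxying, load balancing) → Backend server cluster (Web1 | Web2 | Web3)

🛡️ Reverse Proxy Characteristics

  • Transparent to the client, which only needs to access the domain name
  • Hides the real server architecture behind a unified external interface
  • Provides load balancing, security protection, SSL offloading, and more
  • Typical examples: Nginx, HAProxy, AWS ELB

💡 Typical Use Cases

  • A website must carry high-concurrency traffic (load balancing)
  • Centralized HTTPS certificate management (SSL offloading)
  • Protection against DDoS attacks and SQL injection
  • Canary releases, A/B testing, blue-green deployments

🧠 Mnemonic

"Reverse proxy = proxy for the servers": the client does not know the real servers, only the domain name.
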

2. What Is a Reverse Proxy?

2.1 Forward Proxy vs. Reverse Proxy

🤔 Terminology

Forward Proxy:

  • Deployed on the client side
  • Accesses external resources on behalf of the client
  • Typical applications: VPNs, circumvention tools
  • Example: In a corporate network, you access the internet through a proxy

Reverse Proxy:

  • Deployed on the server side
  • Receives client requests and forwards them to internal services
  • Clients only know the proxy exists, not the real servers
  • Examples: Nginx, HAProxy

Comparison:

| Dimension | Forward Proxy | Reverse Proxy |
| --- | --- | --- |
| Deployment side | Client side | Server side |
| Serves | Clients | Servers |
| Typical use | VPN, circumvention | Load balancing, gateway |
| Transparency | Server sees the proxy IP | Client sees the proxy IP |
| Purpose | Hide real client, accelerate access | Hide real server, load balance |

2.2 Core Value of a Reverse Proxy

Value 1: Load Balancing

Distributes traffic across multiple backend servers to avoid single-point overload.

Client

Nginx (Reverse Proxy)

┌──────────┬──────────┬──────────┐
│ Server 1 │ Server 2 │ Server 3 │
└──────────┴──────────┴──────────┘
Value 2: Security Protection

Hides real server IPs to prevent direct attacks. Security is enforced at the proxy layer.

Client → Only sees Nginx's IP
Real servers → Only on the internal network, inaccessible externally
Value 3: SSL Termination

Handles HTTPS encryption/decryption at the proxy layer; backend services use HTTP, reducing backend computational overhead.

HTTPS Client → Nginx (encrypt/decrypt) → HTTP Backend Services

              SSL termination point

3. Nginx: How Does It Handle Millions of Concurrent Connections?

3.1 Master-Worker Process Model

Nginx uses a multi-process architecture, not multi-threaded:

Master Process (Manager):

  • Reads and validates configuration files
  • Manages Worker processes (start, stop, reload)
  • Does not handle actual requests

Worker Processes (Workers):

  • Actually handle HTTP requests
  • Each Worker is an independent, isolated process
  • The number is typically set to the number of CPU cores to avoid context-switching overhead

💡 Advantages

  • Strong isolation: One Worker crash does not affect other Workers
  • Full multi-core utilization: Each Worker runs independently
  • Avoids multi-threading complexity: No need to deal with locks, race conditions, etc.

3.2 Event-Driven + Async Non-Blocking

This is the core secret of Nginx's high performance:

Traditional Apache (multi-process/thread model, as in the prefork and worker MPMs):

  • One connection = one process/thread
  • Concurrency is limited by the number of system processes/threads
  • Under high connection counts, context-switching overhead is enormous

Nginx (event-driven model):

  • Uses efficient I/O multiplexing mechanisms such as epoll (Linux) or kqueue (FreeBSD/macOS)
  • A single Worker process can handle tens of thousands of connections simultaneously
  • When a connection has no data, it consumes no CPU; it is woken up by event notifications when new data arrives
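
The event-driven idea above can be sketched in a few lines of Python. This is only an illustration of I/O multiplexing using the standard `selectors` module (which picks epoll or kqueue where available), not a description of Nginx internals. The function names `run_event_loop` and `demo` are hypothetical, and socket pairs stand in for real client connections.

```python
import selectors
import socket


def run_event_loop(server_socks):
    """Serve every socket from a single thread; return the number of messages handled."""
    sel = selectors.DefaultSelector()
    for sock in server_socks:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)

    handled = 0
    while True:
        # Idle connections cost nothing here: we only wake up for ready sockets.
        events = sel.select(timeout=0.05)
        if not events:
            break  # nothing left to do in this demo
        for key, _mask in events:
            data = key.fileobj.recv(4096)
            if data:
                key.fileobj.sendall(data.upper())  # trivial "request handling"
                handled += 1
            else:
                sel.unregister(key.fileobj)  # peer closed the connection
                key.fileobj.close()
    sel.close()
    return handled


def demo(connections=100):
    """Open many connections, send one message on each, and collect the replies."""
    pairs = [socket.socketpair() for _ in range(connections)]
    for i, (client, _server) in enumerate(pairs):
        client.sendall(f"hello {i}".encode())
    handled = run_event_loop([server for _client, server in pairs])
    replies = [client.recv(4096).decode() for client, _server in pairs]
    for client, server in pairs:
        client.close()
        server.close()
    return handled, replies
```

A single loop serves all connections without creating a thread or process per connection, which is the property the waiter analogy below describes.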

Real-Life Analogy

  • Apache: A restaurant where every diner gets a dedicated waiter (process); many diners require many waiters
  • Nginx: One super-waiter serving all diners simultaneously, going to whoever needs service rather than standing next to a single diner
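
⚡ Nginx Architecture Revealed: Why Can It Withstand Millions of Concurrent Connections?

Master-Worker process model + event-driven I/O = the secret to high performance

Nginx process architecture:

  • Master process: manages all Workers and handles configuration loading and graceful upgrades
  • Worker processes (4 in this illustration): Worker 1 to Worker 4, each handling requests
  • epoll (Linux) / kqueue (FreeBSD/macOS): event-driven, so one Worker handles tens of thousands of connections at once

Traditional Apache vs. Nginx:

  • Traditional Apache: one connection = one process/thread (❌ the C10K problem)
  • Nginx: event-driven + async non-blocking (✅ millions of concurrent connections)

💡 Production Recommendation

Number of Workers = number of CPU cores (usually set worker_processes to auto so that Nginx detects it). Too many Workers cause heavy context-switching overhead; too few cannot use all cores.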

4. What Is an API Gateway?

4.1 Why Do You Need an API Gateway?

Imagine a system without a gateway:

  • The client must know addresses for multiple services (User Service, Order Service, Payment Service...)
  • Each service must implement its own authentication, rate limiting, and logging
  • Protocols are inconsistent — some use HTTP, others gRPC
  • When services upgrade, clients must change too

⚠️ Problems Without a Gateway

  • Client complexity: Must configure multiple service addresses
  • Redundant functionality: Every service must implement authentication and rate limiting
  • Protocol chaos: Clients must adapt to multiple protocols
  • Upgrade difficulty: Service upgrades force client-side changes

With an API gateway:

  • The client only needs to know the gateway address; the gateway routes to the correct service
  • Cross-cutting concerns like authentication, rate limiting, and logging are handled centrally
  • The gateway can perform protocol translation; externally it uniformly exposes HTTP
  • Backend service upgrades only require gateway config changes — clients are unaffected
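
🚪 API Gateway: The System's "Unified Front Door"

Think of it as the reception desk of an office building: every visitor must pass through it before reaching the different offices.

Clients (visitors): 📱 App, 💻 Web, 🔧 third parties
        ⬇️ single entry point
🚪 API gateway (reception): 🔐 authentication, rate limiting & circuit breaking, 🧭 route forwarding, 🔄 protocol translation
        ⬇️ request dispatch
⚙️ Backend services (departments): 👤 User Service (/api/users), 📦 Order Service (/api/orders), 💳 Payment Service (/api/pay)

🔐 Authentication

Validates user identity centrally so that backend services need not each implement login logic. Supports JWT, OAuth2, API keys, and other methods.

💡 Real-World Scenario

The user's request carries a JWT. The gateway checks the signature and expiration time; on success it adds the user ID to a request header and forwards the request to the backend service.

🤔 Without a Gateway vs. With a Gateway

| Requirement | Without a gateway (direct access) | With an API gateway |
| --- | --- | --- |
| Authentication | Every service implements its own login check | ✅ JWT validated centrally at the gateway |
| Rate limiting | Every service implements its own rate limiting | ✅ Gateway limits traffic centrally and protects backends |
| Protocol translation | HTTP, gRPC, and WebSocket each handled separately | ✅ Gateway uniformly exposes HTTP |
| Canary release | Requires changing load balancer configuration | ✅ Header-based routing at the gateway |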

4.2 Core Features of an API Gateway

| Feature | Description | Typical Scenario |
| --- | --- | --- |
| Route forwarding | Forwards requests to different services based on URL, headers, etc. | /api/users → User Service, /api/orders → Order Service |
| Load balancing | Distributes traffic when a service has multiple instances | User Service has 3 instances, round-robin request distribution |
| Authentication | Centrally validates JWT, OAuth tokens | Unauthenticated users cannot access /api/admin |
| Rate limiting & circuit breaking | Controls traffic caps to prevent service overload | Max 1000 requests/second; beyond that returns 429 |
| Protocol translation | HTTP externally, can translate to gRPC internally | Client uses HTTP, gateway translates to gRPC for internal calls |
| Canary release | Routes a portion of traffic to a new version by header or ratio | 5% of users experience the new version, 95% use the old |
| Logging & monitoring | Centrally records request logs for analysis and troubleshooting | Record request latency, status codes, response sizes |
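
As one illustration of the canary release row, ratio-based splitting is often done by hashing a stable user identifier into a bucket, so that the same user consistently sees the same version. The sketch below is a minimal example of that idea under stated assumptions: the function names and the `X-Canary` header are hypothetical, not features of any particular gateway.

```python
import hashlib


def canary_bucket(user_id: str) -> int:
    """Map a user id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_version(user_id: str, canary_percent: int, headers=None) -> str:
    """Return "new" for canary traffic and "old" otherwise."""
    headers = headers or {}
    # Header-based override (hypothetical header name), e.g. for internal testers.
    if headers.get("X-Canary") == "always":
        return "new"
    # Ratio-based split: buckets below the threshold go to the new version.
    return "new" if canary_bucket(user_id) < canary_percent else "old"
```

Because the bucket depends only on the user id, raising `canary_percent` from 5 to 20 keeps the original 5% on the new version and adds more users to it.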

5. Gateway in Practice: How to Build a Complete Gateway Architecture?

5.1 Full Architecture Diagram

┌───────────────────────────────────────────────────────────────────────┐
│                           Client (Browser/App)                         │
└───────────────────────────┬─────────────────────────────────────────┘
                                │ HTTPS
                                ▼
┌───────────────────────────────────────────────────────────────────────┐
│                      Outer Layer: CDN + WAF                            │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │  CDN (Content Delivery Network)                              │  │
│  │  - Static asset caching (images, CSS, JS)                    │  │
│  │  - Nearby access, reduced latency                            │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  WAF (Web Application Firewall)                               │  │
│  │  - Protection against SQL injection, XSS attacks              │  │
│  │  - Block malicious bots and crawlers                          │  │
│  │  - CC attack protection                                       │  │
│  └───────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────────┘
                                │
                                ▼


┌───────────────────────────────────────────────────────────────────────┐
│                  Middle Layer: API Gateway (Nginx/Kong)                │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 1: SSL Termination + Security                          │  │
│  │  - HTTPS / TLS 1.3                                            │  │
│  │  - HSTS, security response headers                            │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 2: Authentication & Authorization                      │  │
│  │  - JWT Token validation                                       │  │
│  │  - OAuth 2.0 / SSO integration                                │  │
│  │  - API Key management                                         │  │
│  │  - Permission checks (RBAC)                                   │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 3: Traffic Control                                     │  │
│  │  - Rate limiting — token bucket / leaky bucket algorithms     │  │
│  │  - Circuit breaking — prevent fault propagation               │  │
│  │  - Degradation — fallback when a service is unavailable       │  │
│  │  - Canary release — traffic splitting by ratio                │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 4: Routing & Load Balancing                            │  │
│  │  - Path-based Routing                                         │  │
│  │  - Host-based Routing                                         │  │
│  │  - Header-based Routing                                       │  │
│  │  - Load balancing algorithms — round-robin / weighted /       │  │
│  │    least connections / IP hash                                │  │
│  │  - Service Discovery integration                              │  │
│  └───────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Layer 5: Protocol Translation & Data Processing              │  │
│  │  - SSL Termination — HTTPS ↔ HTTP                             │  │
│  │  - Protocol translation — HTTP ↔ gRPC / WebSocket             │  │
│  │  - Request/Response transformation — JSON ↔ XML               │  │
│  │  - Data compression — Gzip / Brotli                           │  │
│  │  - Caching — static assets and API responses                  │  │
│  └───────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────────┘
                                │
                                ▼


┌───────────────────────────────────────────────────────────────────────┐
│                    Inner Layer: Microservice Cluster                   │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐      │
│  │ User Svc    │ │ Order Svc   │ │ Product Svc │ │ Payment Svc │      │
│  │             │ │             │ │             │ │             │      │
│  └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘      │
│         │                │                │                │               │
│         └────────────────┴────────────────┴────────────────┘               │
│                                       │                              │
│                Service Discovery & Config Center (etcd)              │
│                - Service registration & discovery                    │
│                - Health checks                                       │
│                - KV config storage                                   │
└───────────────────────────────────────────────────────────────────────┘

5.2 Routing & Load Balancing

One of the gateway's core responsibilities is getting requests to the right place. This involves two key capabilities: routing (which server to go to) and load balancing (how to distribute traffic).

Routing Rules: From URL to Service

Imagine an e-commerce system where different URLs map to different services:

  • /api/users/* → User Service
  • /api/orders/* → Order Service
  • /api/products/* → Product Service
  • /api/pay/* → Payment Service

Nginx configuration example:

nginx
server {
    listen 80;
    server_name api.example.com;

    # User Service
    location /api/users/ {
        proxy_pass http://user-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Order Service
    location /api/orders/ {
        proxy_pass http://order-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Product Service
    location /api/products/ {
        proxy_pass http://product-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Payment Service (requires higher security)
    location /api/pay/ {
        # Restrict IP access
        allow 10.0.0.0/8;
        deny all;

        proxy_pass http://payment-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
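
The prefix-to-service mapping above can also be sketched outside Nginx. The following is a minimal illustration of longest-prefix route matching, assuming the same four prefixes; `ROUTES` and `resolve_upstream` are hypothetical names, and real gateways apply richer matching rules (exact matches, regular expressions, host and header conditions).

```python
# Route table mirroring the URL prefixes listed above.
ROUTES = {
    "/api/users/": "user-service",
    "/api/orders/": "order-service",
    "/api/products/": "product-service",
    "/api/pay/": "payment-service",
}


def resolve_upstream(path: str, routes=ROUTES):
    """Return the upstream for the longest matching prefix, or None if no route matches."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return None  # a gateway would typically answer 404 here
    return routes[max(matches, key=len)]
```

Choosing the longest matching prefix means a more specific route such as `/api/orders/export/` would win over `/api/orders/` if both were configured.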
Load Balancing: Comparing Four Strategies

When a service has multiple instances, how do you choose?

| Strategy | Principle | Use Case | Pros | Cons |
| --- | --- | --- | --- | --- |
| Round-robin | Assigns to each server in order | Servers with similar specs | Simple and fair | Doesn't consider current server load |
| Weighted round-robin | Assigns by weight ratio; higher weight = more traffic | Servers with uneven specs | Fully utilizes high-perf servers | Requires sensible weight configuration |
| Least connections | Assigns to the server with the fewest active connections | Long-lived connections, video streaming | Dynamically adapts to load changes | Requires real-time connection tracking |
| IP hash | Hashes client IP; same IP always goes to the same server | Session persistence needed | Guarantees session consistency | A heavy-traffic IP can create a hotspot |

Nginx configuration example:

nginx
# Weighted round-robin
upstream backend_weighted {
    server 10.0.1.10:8080 weight=3;  # High performance, handles more traffic
    server 10.0.1.11:8080 weight=2;
    server 10.0.1.12:8080 weight=1;  # Lower performance, handles less traffic
}

# Least connections
upstream backend_least_conn {
    least_conn;
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}

# IP hash (session persistence)
upstream backend_ip_hash {
    ip_hash;
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}
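
To make the four strategies concrete, here is a minimal Python sketch of each selection rule. It is a simplified model for illustration, not Nginx's implementation: the class names are hypothetical, the weighted variant uses the "smooth" weighted round-robin scheme, and the IP hash uses CRC32 over the whole address.

```python
import itertools
import zlib


class RoundRobin:
    """Hand out servers in list order, wrapping around at the end."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self, client_ip=None):
        return next(self._cycle)


class SmoothWeightedRoundRobin:
    """Spread picks in proportion to weight without long runs on one server."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.current = {server: 0 for server in self.weights}

    def pick(self, client_ip=None):
        total = sum(self.weights.values())
        for server, weight in self.weights.items():
            self.current[server] += weight
        best = max(self.current, key=self.current.get)
        self.current[best] -= total
        return best


class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self, client_ip=None):
        best = min(self.active, key=self.active.get)
        self.active[best] += 1
        return best

    def release(self, server):
        self.active[server] -= 1  # call when the connection finishes


class IpHash:
    """Send the same client IP to the same server (session persistence)."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, client_ip):
        return self.servers[zlib.crc32(client_ip.encode()) % len(self.servers)]
```

With weights 3, 2, and 1, the smooth weighted picker yields three, two, and one picks per cycle of six, interleaved rather than grouped. Note that a plain modulo hash remaps most clients when the server list changes; consistent hashing is the usual remedy.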

⚖️ Load Balancing: Spreading the "Pressure" Evenly Across Multiple Servers

Think of a bank's queue-ticket system: customers are assigned evenly to the counters so that no single counter builds up a long line.

💡 Round-robin: one by one, everyone gets a share

Requests are assigned to each server in turn, following the order of the server list. It works like a bank calling numbers: after counter 1 comes counter 2, after counter 2 comes counter 3, and so on around.

6. Gateway Security: How to Guard the System's Front Door?

6.1 Authentication & Authorization

Traditional approach (each service authenticates independently):

  • User Service, Order Service, Payment Service... each must validate JWTs
  • Code duplication, maintenance headache
  • Secrets scattered across services — higher leak risk

Gateway-unified authentication:

  • Client accesses the gateway with a token
  • Gateway validates the token (signature, expiration)
  • After validation, user info (e.g., user_id) is added to request headers and forwarded to backend services
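  • Backend services don't need to re-validate the token; they read user info directly from headers. This is safe only if the gateway strips any client-supplied copies of those headers and the backends are reachable only through the gateway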
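
The validation step can be sketched with the Python standard library alone. This is a minimal illustration of HS256 signature and expiry checking, not a substitute for a maintained JWT library. The function names and the `X-User-Id` header are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(payload: dict, secret: bytes) -> str:
    """Create an HS256 token (what the authentication service would issue)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url(json.dumps(header, separators=(",", ":")).encode()),
        _b64url(json.dumps(payload, separators=(",", ":")).encode()),
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_jwt(token: str, secret: bytes, now=None):
    """Return the payload if the signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return None  # never trust the token to pick its own algorithm
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
        exp = payload.get("exp")
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token
    current = time.time() if now is None else now
    if exp is not None and current >= exp:
        return None  # expired
    return payload


def gateway_headers(token: str, secret: bytes, now=None):
    """Headers the gateway would add for the backend, or None to reject (401)."""
    payload = verify_jwt(token, secret, now)
    if payload is None:
        return None
    return {"X-User-Id": str(payload.get("sub", ""))}
```

The comparison uses `hmac.compare_digest` to avoid leaking timing information, and the algorithm is pinned rather than read from the token.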

💡 Core Idea

Authenticate at the gateway, authorize at the service:

  • Authentication: Who are you? (Validate token, obtain user identity)
  • Authorization: What can you do? (Determine permissions based on user role)

Like a company reception desk: reception authenticates your identity (ID card), but specific permissions are determined by each department.

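🔐 Authentication Middleware: Who May Enter the Front Door?

Think of an office building's access control: it checks badges and verifies identity, and people without permission cannot get in.

JWT (JSON Web Token) authentication flow:

  1. User: enters username and password and clicks log in
  2. Gateway/Nginx: forwards the login request to the authentication service
  3. Authentication service: verifies the password and generates a JWT (consisting of Header, Payload, and Signature)
  4. User/client: stores the token (LocalStorage or a cookie)
  5. Subsequent requests: carry the token in an HTTP header: Authorization: Bearer <Token>
  6. Gateway/Nginx: checks the token's signature and expiration time, then forwards the request
  7. Backend service: parses the user information from the token and runs the business logic

🔑 JWT structure (Base64URL-encoded)

HEADER
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
{ "alg": "HS256", "typ": "JWT" }
.
PAYLOAD
eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ
{ "sub": "1234567890", "name": "John Doe", "iat": 1516239022 }
.
SIGNATURE
SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
HMACSHA256(base64Url(header) + "." + base64Url(payload), secret)

🛠️ Comparison of Three Approaches

| Dimension | Session + Cookie | JWT | OAuth 2.0 |
| --- | --- | --- | --- |
| Storage location | Session stored on the server, cookie on the client | Token stored on the client; server is stateless | Stored by the authorization server; client holds an access token |
| Scalability | ❌ Requires shared sessions; scaling is complex | ✅ Stateless, easy to scale horizontally | ✅ Distributed architecture, supports large-scale systems |
| Security | ⚠️ Cookies can be stolen; needs CSRF protection | ⚠️ Token leak risk; needs HTTPS and short lifetimes | ✅ Industry best practice; supports multiple security mechanisms |
| Implementation complexity | 🟢 Simple, works out of the box | 🟡 Medium, needs token management | 🔴 Complex, needs an authorization server |
| Use case | Traditional web apps, admin back ends | SPAs, mobile APIs, microservices | Third-party login, open platforms, SSO |

🔒 Best Practices for Gateway-Layer Authentication

  1. Validate centrally at the gateway: don't repeat authentication logic in every microservice; check the JWT or session once at the gateway layer
  2. Enforce HTTPS: require HTTPS at the gateway so tokens cannot be stolen in transit (man-in-the-middle attacks)
  3. Token expiration policy: keep access tokens short-lived (15 minutes) and pair them with a refresh token for seamless renewal
  4. Blacklist mechanism: when a user logs out or a token leaks, add the token to a blacklist (stored in Redis)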

6.2 HTTPS & SSL Termination

Why HTTPS?

  1. Security: Prevents data from being stolen in transit
  2. Compliance: Modern browsers show "Not Secure" warnings for HTTP sites
  3. SEO: Search engines prioritize HTTPS sites

SSL termination approach:

  • Only configure HTTPS and certificates at the gateway layer
  • The gateway handles TLS handshakes and encryption/decryption
  • Communication between gateway and backend services uses plain HTTP (this assumes the internal network is trusted; if it is not, re-encrypt traffic between the gateway and the backends)
  • Backend services focus on business logic without handling TLS
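
Once TLS is terminated, the backend no longer sees the original scheme or client address, so the gateway usually passes them along in headers. The sketch below illustrates that bookkeeping for the commonly used `X-Forwarded-For`, `X-Forwarded-Proto`, and `X-Real-IP` headers; the function name `forwarded_headers` is hypothetical.

```python
def forwarded_headers(client_ip: str, scheme: str, incoming=None) -> dict:
    """Build the headers a TLS-terminating proxy would add before forwarding."""
    incoming = incoming or {}
    prior = incoming.get("X-Forwarded-For")
    # Append to an existing chain so that earlier proxies stay visible.
    chain = f"{prior}, {client_ip}" if prior else client_ip
    return {
        "X-Forwarded-For": chain,
        "X-Forwarded-Proto": scheme,  # tells the backend the client used HTTPS
        "X-Real-IP": client_ip,
    }
```

Backends should trust these headers only when the request actually arrived from the gateway, since a client can send the same header names itself.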

💡 Advantages of SSL Termination

  • Simplified management: Certificates only configured on the gateway, not on backends
  • Reduced overhead: Backend services don't need to handle TLS handshakes
  • Unified updates: Certificate renewal only needs to happen on the gateway
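
🔒 SSL Termination: The "Decryption Officer" for HTTPS Traffic

Think of a company's reception desk: it uses formal titles externally (HTTPS) and internal names internally (HTTP), and is responsible for "translating" between the two.

🔐 HTTPS decryption flow:

  1. 👤 Client (browser): initiates an HTTPS request
  2. 🔒 TLS-encrypted connection (certificate: *.example.com; protocol: TLS 1.3; cipher: AES-256-GCM)
  3. 🚪 Nginx (SSL termination): completes the TLS handshake with its certificate, decrypts the traffic, and adds X-Forwarded-* headers
  4. 🔓 Plain HTTP to the backend, carrying:
       X-Forwarded-For: 203.0.113.42
       X-Forwarded-Proto: https
       X-Real-IP: 203.0.113.42
  5. ⚙️ Backend service cluster: focuses on business logic without handling TLS

📜 SSL certificate management:

  1. Generate a private key: use OpenSSL to generate an RSA private key, the foundation of the certificate
       openssl genrsa -out private.key 2048
  2. Create a CSR: generate a certificate signing request containing the domain and organization information
       openssl req -new -key private.key -out csr.pem
  3. Domain validation: the CA verifies domain ownership (DNS record or HTTP file)
       # Add a DNS TXT record, or upload the validation file under /.well-known/
  4. Certificate issuance: once validation passes, the CA issues the certificate files
       # Download certificate.crt and chain.crt
  5. Deployment: configure the certificate in Nginx and test
       nginx -t && systemctl reload nginx

✨ Core Advantages of SSL Termination

  • 🚀 Performance: TLS handshakes and encryption/decryption are CPU-intensive. Concentrating them in Nginx lets backend services focus on business logic, and overall throughput can improve by 2-5x
  • 🔧 Simpler operations: certificates are managed in one place and configured once in Nginx rather than in every backend service, so renewal and replacement happen in a single step
  • 🛡️ Centralized security: SSL/TLS settings are controlled in one place, enforcing current protocol versions and cipher suites and adding security response headers (HSTS, CSP, etc.) uniformly
  • 📊 Unified monitoring: all HTTPS traffic passes through Nginx, so access logs, TLS handshake performance, and certificate expiry can be tracked centrally for auditing and troubleshooting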

7. Rate Limiting & Circuit Breaking: How to Prevent the System from Being Overwhelmed by "Traffic Floods"?

7.1 Rate Limiting Algorithm Comparison

| Algorithm | Core Idea | Burst traffic | Use Case | Complexity |
| --- | --- | --- | --- | --- |
| Token bucket | Bucket holds tokens; a request needs a token to pass | Allows some bursting | API rate limiting, bandwidth control | Medium |
| Leaky bucket | Requests enter the bucket and are processed at a steady rate | Enforces smoothing; bursts are queued or rejected | Scenarios requiring strict steady processing | Medium |
| Sliding window | Counts requests within a time window | Strictly counts by window; excess is rejected | Precise counting (e.g., "max 100 per minute") | High |
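
Two of the algorithms in the table can be sketched compactly. The classes below are a minimal single-process illustration with an injectable clock for testing; the class and parameter names are hypothetical, and a production limiter would also need locking and shared state across gateway instances.

```python
import time
from collections import deque


class TokenBucket:
    """Refill `rate` tokens per second up to `capacity`; each request spends one."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class SlidingWindowLog:
    """Allow at most `limit` requests in any trailing `window_seconds` interval."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.timestamps = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

The sliding-window log stores one timestamp per accepted request, which is why the table rates its complexity as high: memory grows with the limit, whereas the token bucket keeps only two numbers.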

7.2 Nginx Rate Limiting Configuration in Practice

nginx
# Define rate-limiting zones (place in the http block)

# 1. IP-based rate limiting (leaky bucket algorithm)
# zone=mylimit:10m: zone name and shared-memory size
#   (10 MB holds roughly 160k 64-byte states; on 64-bit platforms a state
#    takes 128 bytes, so roughly 80k IPs)
# rate=10r/s: 10 requests per second
limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

# 2. Stricter per-IP zone for the Order Service: 5 requests per second
# (limit_req_zone is only valid in the http context, not inside a location)
limit_req_zone $binary_remote_addr zone=order_limit:10m rate=5r/s;

# 3. IP-based connection limit (prevents a single IP from opening too many connections)
limit_conn_zone $binary_remote_addr zone=addr:10m;

# 4. Server-wide rate limiting (keyed by server name, not per-IP; protects the backend as a whole)
limit_req_zone $server_name zone=server_limit:10m rate=100r/s;

server {
    listen 80;
    server_name api.example.com;

    # Rejected requests return 503 by default; use 429 Too Many Requests instead
    limit_req_status 429;
    limit_conn_status 429;

    # User Service: normal rate limiting
    location /api/users/ {
        # burst=20: bucket capacity, allows 20 burst requests
        # nodelay: don't delay burst requests (process or reject immediately)
        limit_req zone=mylimit burst=20 nodelay;
        # Also count against the server-wide cap (burst value is illustrative)
        limit_req zone=server_limit burst=100 nodelay;

        # Limit connections per IP
        limit_conn addr 10;

        proxy_pass http://user-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Order Service: stricter rate limiting (5 requests per second)
    location /api/orders/ {
        limit_req zone=order_limit burst=10 nodelay;
        # Also count against the server-wide cap (burst value is illustrative)
        limit_req zone=server_limit burst=100 nodelay;

        proxy_pass http://order-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Response body for rate-limited requests (429 Too Many Requests)
    error_page 429 /429.html;
    location = /429.html {
        internal;
        # default_type sets the Content-Type of the body produced by return;
        # add_header without "always" is not applied to 429 responses
        default_type application/json;
        return 429 '{"error": "Too Many Requests", "message": "Rate limit exceeded. Please try again later."}';
    }
}

💡 Rate Limiting Strategy Recommendations

  • Normal endpoints: 10 requests/second, allow 20 burst
  • Critical endpoints (payment, orders): 5 requests/second, allow 10 burst
  • Global protection: Total across all requests no more than 100/second
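
To see how a rate and a burst allowance interact, the following sketch models a leaky bucket used as a meter: the level drains at `rate` per second, each accepted request adds one, and a request is rejected once the level exceeds `burst`. It is a simplified model in the spirit of a rate limit with an immediate (no-delay) burst, not a reproduction of Nginx's code; the class name and injectable clock are assumptions.

```python
import time


class LeakyBucketMeter:
    """Reject requests once more than `burst` accepted requests are still "in the bucket"."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate      # requests drained per second
        self.burst = burst    # extra requests tolerated above the steady rate
        self.clock = clock
        self.level = 0.0
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Drain the bucket for the time that has passed since the last request.
        self.level = max(0.0, self.level - (now - self.updated) * self.rate)
        self.updated = now
        if self.level > self.burst:
            return False  # over the allowance: reject immediately, no queuing
        self.level += 1
        return True
```

In this model, `rate=10` with `burst=20` accepts 21 back-to-back requests (one plus the burst) and then admits roughly 10 more for each second that passes, which matches the "normal endpoints" recommendation above.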
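
⚡ Rate Limiting Algorithms: How the System Avoids Being Swept Away by a "Traffic Flood"

Think of a dam's sluice gate: it controls the flow rate so that downstream is not flooded.

🪙 Token bucket illustration: the bucket holds 5 of a maximum 10 tokens, the token generator adds 2 tokens per second, and each queued request takes one token to pass.

📊 Comparison of the Three Algorithms

| Dimension | Token Bucket | Leaky Bucket | Sliding Window |
| --- | --- | --- | --- |
| Core idea | Bucket holds tokens; a request passes only if a token is available | Requests enter the bucket and flow out at a steady rate | Counts requests within a time window |
| Burst traffic | ✅ Allows some bursting (while tokens remain) | ❌ Enforces smoothing; bursts are buffered or rejected | ❌ Strictly counts by window; anything over is rejected |
| Use case | API rate limiting, bandwidth control (bursts allowed) | Scenarios requiring strictly steady processing (e.g., message queues) | Precise counting (e.g., "at most 100 per minute") |
| Implementation complexity | Medium | Medium | Higher (must record the requests in each time window) |
| Nginx configuration | limit_req_zone (leaky bucket) | limit_req_zone (leaky bucket) | Requires a third-party module or Lua |

📝 Nginx rate limiting configuration example

# Define the rate-limiting zone
# $binary_remote_addr: limit by IP
# zone=mylimit:10m: zone name and size
# rate=10r/s: at most 10 requests per second
limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

server {
    listen 80;
    server_name api.example.com;

    location / {
        # Apply rate limiting
        # burst=20: bucket capacity, allows a burst of 20 requests
        # nodelay: do not delay burst requests
        limit_req zone=mylimit burst=20 nodelay;

        proxy_pass http://backend;
    }
}

💡 Configuration Notes

  • limit_req_zone: defines the rate-limiting zone in the http block
  • $binary_remote_addr: uses the binary IP address as the rate-limiting key (saves memory)
  • zone=mylimit:10m: zone named mylimit with 10 MB of memory
  • rate=10r/s: allows 10 requests per second (leaky bucket algorithm)
  • burst=20: bucket capacity of 20, allowing a degree of burst traffic
  • nodelay: does not delay burst requests (they are processed or rejected immediately)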

7.3 Circuit Breaking: Preventing Fault Propagation

How a circuit breaker works:

  1. Closed state: Requests are forwarded normally; error rate is tracked
  2. Open state: When the error rate exceeds the threshold, the circuit breaker opens, immediately returning errors without forwarding requests
  3. Half-open state: After a period, a small number of requests are allowed through as probes; if successful, the circuit breaker closes
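
The three states above can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the class name, thresholds, and single-probe policy are hypothetical choices, and the error rate is computed over all calls since the last reset rather than over a rolling window.

```python
import time


class CircuitBreaker:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

    def __init__(self, failure_threshold=0.5, min_calls=10,
                 open_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # error rate that trips the breaker
        self.min_calls = min_calls                  # don't judge on too few samples
        self.open_seconds = open_seconds            # how long to fail fast
        self.clock = clock
        self._reset()

    def _reset(self):
        self.state = self.CLOSED
        self.calls = 0
        self.failures = 0
        self.probe_in_flight = False

    def _trip(self):
        self.state = self.OPEN
        self.opened_at = self.clock()
        self.calls = 0
        self.failures = 0

    def allow_request(self) -> bool:
        if self.state == self.OPEN:
            if self.clock() - self.opened_at < self.open_seconds:
                return False  # fail fast without calling the backend
            self.state = self.HALF_OPEN
            self.probe_in_flight = False
        if self.state == self.HALF_OPEN:
            if self.probe_in_flight:
                return False  # only one probe at a time in this sketch
            self.probe_in_flight = True
            return True
        return True

    def record(self, success: bool):
        """Report the outcome of a request that was allowed through."""
        if self.state == self.HALF_OPEN:
            if success:
                self._reset()  # probe succeeded: close the circuit
            else:
                self._trip()   # probe failed: open again
            return
        self.calls += 1
        if not success:
            self.failures += 1
        if (self.calls >= self.min_calls
                and self.failures / self.calls >= self.failure_threshold):
            self._trip()
```

A caller wraps each backend call: check `allow_request()`, return a fallback response if it is false, and otherwise call the backend and pass the outcome to `record()`.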

💡 Core Idea

A circuit breaker is like an electrical fuse: when current is too high, the fuse blows automatically, protecting the entire circuit from burning out.

Similarly, when a backend service has a high error rate, the circuit breaker "trips," failing fast to prevent the fault from spreading across the entire system.


8. Summary: Core Thinking in Gateway Design

8.1 Review of Core Principles

| Principle | Meaning | Key Practices |
| --- | --- | --- |
| Routing | Get requests to the right place | Path-based, host-based, header-based routing |
| Load balancing | Distribute traffic across servers | Round-robin, weighted, least connections, IP hash |
| Security | Guard the system's front door | Authentication & authorization, HTTPS, WAF |
| Rate limiting | Prevent being overwhelmed by traffic | Token bucket, leaky bucket, sliding window |
| Circuit breaking | Prevent fault propagation | Fail fast, degradation strategies |
| Observability | Monitoring and troubleshooting | Logging, metrics, distributed tracing |

8.2 Technology Selection Advice

💡 Selection Decision Tree

Choosing a gateway:

├─ Only need reverse proxy & load balancing?
│  ├─ Yes → Nginx (first choice)
│  └─ No → Continue

├─ Need a rich plugin ecosystem?
│  ├─ Yes → Kong (built on Nginx)
│  └─ No → Continue

├─ Spring Cloud ecosystem?
│  ├─ Yes → Spring Cloud Gateway
│  └─ No → Nginx

9. Glossary

| Term | Explanation |
| --- | --- |
| Reverse Proxy | A proxy deployed on the server side that receives client requests and forwards them to internal services. Clients only know the reverse proxy, not the real server addresses. |
| Forward Proxy | A proxy deployed on the client side that accesses external resources on behalf of the client. The server sees the proxy's IP, not the real client. Typical applications: VPNs, circumvention tools. |
| API Gateway | An intermediary layer between clients and backend services that provides routing, authentication, rate limiting, logging, and more; the "unified front door" of a microservice architecture. |
| Load Balancing | Distributing request traffic across multiple servers to avoid overloading a single server, improving system availability and performance. |
| SSL Termination | Handling HTTPS encryption/decryption at the gateway layer; backend services use HTTP, reducing backend computational overhead and simplifying certificate management. |
| Rate Limiting | Limiting the number of requests per unit of time to prevent the system from being overwhelmed by traffic bursts. Common algorithms: token bucket, leaky bucket, sliding window. |
| Circuit Breaking | Automatically cutting off calls to a failing dependency to prevent fault propagation, while providing a fallback strategy. |
| Session Persistence | Ensuring requests from the same client are always routed to the same backend server, used in scenarios requiring session state. |
| Health Check | Periodically checking the health of backend services, automatically removing faulty nodes to ensure traffic is only sent to healthy instances. |
| Canary Release | Routing a small portion of traffic to a new version, verifying stability, then gradually increasing the ratio to reduce release risk. |
| WAF | Web Application Firewall: protects against SQL injection, XSS, CC attacks, and other web security threats. |
| CDN | Content Delivery Network: deploys edge nodes globally to accelerate access to static assets. |