
Message Queues and Event-Driven Architecture

Core Question

When systems are tightly coupled and traffic spikes, how do you keep the critical path stable? Message queues are the "buffer" and "decoupler" of modern distributed systems. This article uses real-world cases (restaurant queuing, package sorting, flash sale systems) to explain the design philosophy and engineering practices behind message queues.


1. Why "Message Queues"?

1.1 A Real-World Case: The Evolution of Taobao's Order System

In 2012, Taobao's order system suffered a severe outage. At midnight on Double 11, traffic flooded in instantly. The order service directly called the inventory service, payment service, logistics service... the entire chain collapsed like dominoes.

The architecture at the time (tight coupling):

User places order → Order service → Sync call inventory service → Sync call payment service → Sync call logistics service
                    ↓                    ↓                    ↓
                 Response 200ms       Response 500ms       Response 300ms

Fatal Problems with Tight Coupling

  • Total response time = 200 + 500 + 300 = 1000ms (user waits 1 second)
  • Inventory service down → Order service also goes down (thread pool exhausted)
  • Payment service slows down → Entire chain is dragged down
  • Cannot scale horizontally → Can only scale vertically (expensive and limited)

Improved architecture (introducing message queues):

User places order → Order service → Send "order created" message → Immediate return (50ms)

                        Message queue (Kafka)

        ┌─────────────┬─────────────┬─────────────┐
        ▼             ▼             ▼             ▼
   Inventory      Payment       Logistics     Notification
   service        service       service       service
   (async deduct) (async process) (async create) (async send)

Improvements After Changes

  • User response time = 50ms (20x experience improvement)
  • Inventory service down → Messages stay in queue, continue processing after recovery
  • Payment service slows down → Doesn't affect order creation
  • Can scale horizontally → Just add more consumer instances

1.2 Everyday Analogies for Message Queues

Restaurant Queuing System

Imagine going to a popular restaurant:

  • No queuing system: Customers must stand at the window waiting; limited window space, long lines behind, restaurant under pressure
  • With queuing system: After ordering, you get a number; you can sit down first, pick up food when your number is called

A message queue is the software system's "queuing system":

  • Producer (the person ordering) → Puts messages (orders) into the queue
  • Queue (the number dispenser) → Temporarily stores messages
  • Consumer (the chef) → Processes messages at their own pace
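
The three roles above can be sketched in a single process with Python's standard `queue.Queue` and one worker thread. This is only an illustration of the producer/queue/consumer relationship, not a real message broker; the function name `run_demo` and the message strings are made up for the example.

```python
import queue
import threading


def run_demo(num_orders: int = 5) -> list:
    """Producer enqueues orders; one consumer thread processes them at its own pace."""
    q = queue.Queue(maxsize=100)      # the "number dispenser": buffers pending orders
    processed = []
    stop = object()                   # sentinel that tells the consumer to exit

    def consumer() -> None:
        while True:
            msg = q.get()             # blocks until a message is available
            if msg is stop:
                break
            processed.append(f"cooked {msg}")   # stand-in for real work

    worker = threading.Thread(target=consumer)
    worker.start()

    for i in range(num_orders):       # producer: enqueue and move on without waiting
        q.put(f"order-{i}")
    q.put(stop)
    worker.join()
    return processed


if __name__ == "__main__":
    print(run_demo(3))
```

The producer never calls the consumer directly; it only knows about the queue, which is the property the rest of this article builds on.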

[Interactive demo: Peak Shaving. Simulates a traffic burst against a consumer with a configurable processing capacity (default 200 req/s) and queue capacity (default 2,000 messages), and plots inbound traffic, processed traffic, queue backlog, and rejected (rate-limited) requests.]

Core principle: When inbound traffic exceeds processing capacity, the extra requests are stored in the message queue. After the traffic peak passes, the system keeps processing the backlog at full speed until the queue is empty. This is peak shaving.

2. What Is a Message Queue? (Definition + Core Three Elements)

2.1 What Is a "Message Queue"?

Terminology

A Message Queue (MQ) is a container for storing messages. Producers put messages in, and consumers take messages out for processing. It enables "asynchronous communication": the sender does not need to wait for the receiver to finish processing.

Synchronous vs Asynchronous:

  • Synchronous: Like a phone call. The other party must answer before you can communicate.
  • Asynchronous: Like sending a text. You send it, and they read it when they are available.

2.2 The Three Core Elements of a Message Queue

Element 1: Producer

Responsibility: Create and send messages to the queue.

Analogy: The producer is like a "sender," delivering letters (messages) to the post office (queue).

Key Design Points
  • Send method: Synchronous send (reliable but blocking) vs asynchronous send (high performance but needs callback handling)
  • Message confirmation: Wait for Broker confirmation (At Least Once) vs fire-and-forget (At Most Once)
  • Failure handling: Retry strategy, local log backup, dead letter queue
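
A minimal sketch of the producer-side design points above: a synchronous send that waits for the Broker's ACK, retries with backoff, and parks the message if every attempt fails. `FlakyBroker`, `send_with_retry`, and the `dead_letters` list are hypothetical stand-ins, not any real client API.

```python
import time
from typing import Optional


class FlakyBroker:
    """Hypothetical in-memory broker that rejects the first `fail_first` sends."""

    def __init__(self, fail_first: int = 0) -> None:
        self.fail_first = fail_first
        self.stored = []

    def send(self, msg: str) -> bool:
        """Return True as the ACK, or False to simulate a failed send."""
        if self.fail_first > 0:
            self.fail_first -= 1
            return False
        self.stored.append(msg)
        return True


def send_with_retry(broker: FlakyBroker, msg: str, max_attempts: int = 3,
                    base_delay: float = 0.01,
                    dead_letters: Optional[list] = None) -> bool:
    """Synchronous send: retry until ACKed, otherwise park the message."""
    for attempt in range(max_attempts):
        if broker.send(msg):
            return True                         # ACK received
        time.sleep(base_delay * 2 ** attempt)   # exponential backoff between retries
    if dead_letters is not None:
        dead_letters.append(msg)                # stand-in for a local log or DLQ
    return False
```

In this simplified model a failed send never stores the message. With a real broker the message may be stored even though the ACK is lost, so retries give at-least-once delivery and can produce duplicates.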

Element 2: Consumer

Responsibility: Get messages from the queue and process them.

Analogy: The consumer is like a "recipient," taking letters (messages) from the mailbox (queue) and processing them.

Key Design Points
  • Consumption mode: Push mode (Broker actively pushes) vs Pull mode (consumer actively pulls)
  • Consumption confirmation: Auto ACK (efficient but may lose messages) vs manual ACK (reliable but needs timeout handling)
  • Concurrency control: Single-threaded sequential consumption vs multi-threaded parallel consumption
  • Failure handling: Retry strategy, dead letter queue, compensation mechanism
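
The consumer-side points above (manual ACK, redelivery on failure, dead letter queue) can be sketched as a small loop. The function `consume` and its `max_retries` parameter are illustrative names; a real client would receive redeliveries from the Broker rather than from a local deque.

```python
from collections import deque
from typing import Callable, Iterable


def consume(messages: Iterable[str], handler: Callable[[str], None],
            max_retries: int = 2):
    """ACK on success, redeliver on failure, dead-letter after max_retries retries."""
    pending = deque((m, 0) for m in messages)   # (message, failed attempts so far)
    acked, dead_letters = [], []
    while pending:
        msg, failures = pending.popleft()
        try:
            handler(msg)
            acked.append(msg)                   # manual ACK only after success
        except Exception:
            if failures + 1 > max_retries:
                dead_letters.append(msg)        # give up: move to the DLQ
            else:
                pending.append((msg, failures + 1))   # no ACK, so redeliver later
    return acked, dead_letters
```

With `max_retries=2`, a message that always fails is attempted three times in total (one initial delivery plus two retries) before it is dead-lettered.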

Element 3: Broker (Message Broker)

Responsibility: Receive, store, and forward messages.

Analogy: The Broker is like a "post office" or "package sorting station," responsible for receiving, sorting, and delivering letters.

Key Design Points
  • Storage model: In-memory storage (low latency) vs disk storage (high reliability)
  • Replication strategy: Primary-secondary replication, multi-replica synchronization
  • High availability: Cluster deployment, automatic failover
  • Scalability: Partitions, Sharding
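
As a rough illustration of the partitioning point above, the sketch below maps a message key to a partition so that all messages with the same key land in the same partition. The function name `partition_for` and the MD5-based hash are assumptions for illustration; real brokers ship their own partitioners.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    # Use a stable hash: Python's built-in hash() is randomized per process for str.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

Keying by something like an order ID keeps one order's messages in a single partition, while different orders spread across partitions and can be consumed in parallel.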

3. Core Problem 1: How to Decouple Systems and Avoid "Pulling One Thread and Moving the Whole System"?

3.1 The Tragedy of Tight Coupling: One Service Goes Down, Everything Falls

Scenario recreation: An e-commerce platform's early architecture

Order service directly calls downstream services:
┌─────────────┐
│  Order      │
│  Service    │
└──────┬──────┘

       ├───────────┬───────────┬───────────┐
       ▼           ▼           ▼           ▼
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│Inventory │ │Payment   │ │Logistics │ │SMS       │
│Service   │ │Service   │ │Service   │ │Service   │
│  200ms   │ │  500ms   │ │  300ms   │ │  100ms   │
└──────────┘ └──────────┘ └──────────┘ └──────────┘

Pain Point Analysis

| Pain Point | Specific Manifestation | Consequence |
|---|---|---|
| Cascading failure | Inventory service goes down, order service sync call times out | Order service thread pool exhausted, cannot process new requests |
| Response latency | Must wait for all downstream service responses | User waits over 1 second, terrible experience |
| Difficult to extend | Adding a points service requires modifying order service code | Longer release cycles, increased risk |
| Resource waste | Order service must wait for SMS service | Database connections occupied for long periods |

3.2 Decoupling Solution: Introduce Message Queue as "Middle Layer"

Architecture after decoupling:

Order service only sends messages, doesn't care who consumes:

┌─────────────┐
│  Order      │ ──Send "order created" message─┐
│  Service    │                                │
└─────────────┘                                ▼
                                    ┌───────────────────┐
                                    │   Message Queue    │
                                    │  (Kafka/RabbitMQ)  │
                                    │   - Reliable store │
                                    │   - Multi-replica  │
                                    │   - Order guarantee│
                                    └─────────┬─────────┘

              ┌───────────────────────┼───────────────────────┐
              │                       │                       │
              ▼                       ▼                       ▼
       ┌──────────────┐      ┌──────────────┐      ┌──────────────┐
       │  Inventory   │      │  Payment     │      │  Logistics   │
       │  Service     │      │  Service     │      │  Service     │
       │  Subscribe   │      │  Subscribe   │      │  Subscribe   │
       │  order events│      │  order events│      │  order events│
       └──────────────┘      └──────────────┘      └──────────────┘

[Interactive demo: System Decoupling, from tight coupling to loose coupling. In the tightly coupled view, the order service creates the order and then synchronously calls the inventory, points, and notification (SMS/email) services.]

Problems shown in the demo:

  • Strong dependency: a notification failure blocks order creation
  • Slow response: total time = 300ms + 500ms + 400ms = 1200ms
  • Hard to extend: adding a service requires changing order service code

Core idea: Synchronous calls create strong dependencies and slow responses; asynchronous messaging decouples systems, improves response speed, and scales more easily.

Benefits of Decoupling

| Dimension | Before Decoupling | After Decoupling |
|---|---|---|
| Fault isolation | Inventory down = Order down | Inventory down, messages stay in queue, consumed after recovery |
| Response time | 1000ms (synchronous wait) | 50ms (return after sending message) |
| Extensibility | New service requires changing order code | New service just subscribes to topic |
| System complexity | Order service tightly depends on downstream | Order service only depends on message queue |

3.3 The Essence of Decoupling: From "Direct Calls" to "Event-Driven"

Paradigm shift:

Traditional thinking (imperative):
"Order service commands inventory service: Deduct inventory for me!"
  ↓ Direct call
  ↓ High coupling, callee must be online
  ↓ Caller needs to know callee's interface

Event-driven thinking (declarative):
"Order service declares: Order has been created. Whoever cares, handle it."
  ↓ Send event to message queue
  ↓ Decoupled, consumers can be offline
  ↓ Producer doesn't need to know consumers exist
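
The shift from commands to events can be sketched with a tiny publish/subscribe bus. `EventBus` and the topic name `order.created` are hypothetical. Unlike a real message queue, this in-process version calls handlers immediately and stores nothing, so it shows only the "producer does not know its consumers" part of the idea.

```python
from collections import defaultdict
from typing import Any, Callable


class EventBus:
    """Minimal in-process pub/sub: publishers and subscribers share only topic names."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Any) -> int:
        """Deliver the event to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


if __name__ == "__main__":
    bus = EventBus()
    bus.subscribe("order.created", lambda e: print("inventory: deduct for", e))
    bus.subscribe("order.created", lambda e: print("logistics: create shipment for", e))
    bus.publish("order.created", {"order_id": 1})
```

Adding a points service here means one more `subscribe` call; the code that publishes `order.created` does not change.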

4. Core Problem 2: How to Handle Traffic Spikes with Peak Shaving?

4.1 Flash Sale Scenario: How to Handle 100K QPS Smoothly?

Scenario recreation: An e-commerce platform's Double 11 flash sale, expected peak 100K QPS, but the database can only handle 1,000 QPS.

Consequences of direct impact:

User requests ──→ App server ──→ Database
  100K/s         100K/s          1K/s (limit)

                         Connection pool exhausted
                         Response timeout
                         Database crash

                         Cascading failure (all services depending on DB go down)

Terminology

QPS (Queries Per Second): The number of requests a system handles each second, a common metric for throughput.

100K QPS means 100,000 requests per second, like 100,000 people rushing into a store simultaneously.

4.2 Peak Shaving Solution: Message Queue as "Reservoir"

Architecture design:

┌───────────────────────────────────────────────────────────────────────┐
│                     Flash Sale System Architecture                    │
├───────────────────────────────────────────────────────────────────────┤
│                                                                       │
│  Layer 1: Gateway (hard rate limiting)                               │
│  ┌───────────────────────────────────────────────────────────────┐   │
│  │  - Token bucket: 100K/s → 10K/s (drop 90% of requests)       │   │
│  │  - CDN caches static resources (product detail pages)         │   │
│  │  - CAPTCHA / queue page (first layer of peak shaving)         │   │
│  └───────────────────────────────────────────────────────────────┘   │
│                            │                                          │
│                            ▼                                          │
│  Layer 2: Service (soft rate limiting)                               │
│  ┌───────────────────────────────────────────────────────────────┐   │
│  │  - Nginx rate limiting: 10K/s → 5K/s                         │   │
│  │  - Redis pre-deduct inventory (atomic operation):             │   │
│  │    * Use Lua script for atomicity                              │   │
│  │    * Insufficient stock → return "Sold out" directly           │   │
│  │  - Generate order token (queue voucher)                        │   │
│  └───────────────────────────────────────────────────────────────┘   │
│                            │                                          │
│                            ▼                                          │
│  Layer 3: Message Queue (core peak shaving)                          │
│  ┌───────────────────────────────────────────────────────────────┐   │
│  │  Kafka/RocketMQ:                                               │   │
│  │  - Batch write: 5K/s → 1K/s (matching DB capacity)           │   │
│  │  - Message persistence: disk write guarantees no message loss  │   │
│  │  - Multi-partition parallel consumption: boost throughput      │   │
│  │  - Consumer offset management: support failure recovery        │   │
│  │                                                                 │   │
│  │  Key metrics monitoring:                                        │   │
│  │  - Produce Rate                                                 │   │
│  │  - Consume Rate                                                 │   │
│  │  - Lag (message backlog)                                        │   │
│  └───────────────────────────────────────────────────────────────┘   │
│                            │                                          │
│                            ▼                                          │
│  Layer 4: Consumer (async processing)                                │
│  ┌───────────────────────────────────────────────────────────────┐   │
│  │  Order processing consumers (multiple instances):              │   │
│  │  - Pull messages from Kafka (1K/s, matching DB capacity)       │   │
│  │  - DB transaction: create order + deduct inventory              │   │
│  │  - Update order status to "Created"                             │   │
│  │  - Send order creation success notification (email/SMS/push)    │   │
│  │  - Confirm message consumption (ACK)                            │   │
│  │                                                                 │   │
│  │  Consumer scaling strategy:                                     │   │
│  │  - When Lag > 10,000, auto-scale up consumer instances          │   │
│  │  - When Lag < 1,000, scale down consumer instances (save cost) │   │
│  └───────────────────────────────────────────────────────────────┘   │
│                                                                       │
└───────────────────────────────────────────────────────────────────────┘
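
Layer 1 above names a token bucket. The sketch below is one common way to write that algorithm, under the assumption of a single process and an injectable clock (useful for testing); the class name `TokenBucket` and its parameters are illustrative, not taken from any particular gateway.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum tokens the bucket can hold
        self.tokens = capacity        # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                  # over the limit: reject the request
```

If 100 requests arrive in the same instant against a bucket with `rate=10, capacity=10`, only 10 pass and 90% are dropped, which mirrors the 100K/s → 10K/s ratio in the diagram at a smaller scale.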

4.3 Mathematical Principles of Peak Shaving

Traffic smoothing effect:

Original traffic (spike):              Smoothed traffic:

100K/s │    ╱╲                  1K/s │████████████████
       │   ╱  ╲                      │
       │  ╱    ╲                     │
   1K/s│╱        ╲               0/s │
       └───────────────               └────────────────
       0s   1s   2s                   0s             100s

Original: 100K/s peak, lasting 1 second
Smoothed: 1K/s constant rate, lasting 100 seconds

Key formulas:

Queue length = Producer rate × Duration - Consumer rate × Duration
            = 100,000 × 1 - 1,000 × 1
            = 99,000 messages (peak queue backlog)

Time to consume all messages = Queue length / Consumer rate
                             = 99,000 / 1,000
                             = 99 seconds
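
The arithmetic above can be checked with a small second-by-second simulation. This is a simplified model that assumes an unbounded queue and a constant consumer rate; the function name `simulate` is made up for the example.

```python
from typing import Sequence, Tuple


def simulate(arrivals_per_sec: Sequence[int], consume_rate: int) -> Tuple[int, int]:
    """Return (peak backlog, total seconds until the queue is empty)."""
    backlog = 0
    peak = 0
    second = 0
    while second < len(arrivals_per_sec) or backlog > 0:
        incoming = arrivals_per_sec[second] if second < len(arrivals_per_sec) else 0
        backlog += incoming                     # producers enqueue this second's traffic
        backlog -= min(backlog, consume_rate)   # consumers drain at a fixed rate
        peak = max(peak, backlog)
        second += 1
    return peak, second


if __name__ == "__main__":
    peak, total = simulate([100_000], 1_000)
    print(peak, total)   # 99000 100
```

The peak backlog is 99,000 messages, and the queue is empty after 100 seconds in total: the 1-second burst plus the 99 seconds of draining computed above.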

5. Core Problem 3: How to Ensure Messages Are Not Lost, Not Duplicated, and In Order?

5.1 Message Reliability: Three Lines of Defense

Messages can be lost at three stages: during producer sending, during Broker storage, and during consumer processing.

Three Lines of Defense

Defense 1: Producer ACK

  • When sending a message, wait for the Broker to confirm receipt
  • If no confirmation received, retry or log locally

Defense 2: Broker Persistence

  • Write messages to disk, not just in memory
  • Multi-replica synchronization to ensure no data loss

Defense 3: Consumer ACK

  • After processing a message, manually confirm (ACK)
  • If processing fails, don't confirm; Broker will redeliver
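
Defense 2 (write to disk, not just memory) can be illustrated with an append-only log that is flushed and fsynced before a send is considered stored. `FileLog` is a hypothetical class for this sketch; it shows single-node persistence only and leaves out replication.

```python
import json
import os


class FileLog:
    """Append-only message log: one JSON document per line."""

    def __init__(self, path: str) -> None:
        self.path = path

    def append(self, msg: dict) -> None:
        """Write the message and force it to disk before returning."""
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(msg) + "\n")
            f.flush()                 # push Python's buffer to the OS
            os.fsync(f.fileno())      # ask the OS to write it to the device

    def replay(self) -> list:
        """Read back every stored message, e.g. after a restart."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
```

Because each `FileLog` instance reads from the file rather than from memory, a new instance created after a restart sees everything that was appended before it.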

[Interactive demo: Message Reliability, three lines of defense.]

  • Line 1, Producer ACK: the producer sends a message, the Broker receives and stores it, then returns an ACK. If no ACK is received, the producer retries or records a local log.
  • Line 2, Broker persistence: memory-only storage is fast but lost on restart (high risk). With multi-replica sync, messages are synced to 3 nodes, so data is not lost even if 1 node fails.
  • Line 3, Consumer ACK: (1) pull the message from the Broker, (2) run the business logic, (3) manually ACK after processing. Auto ACK is efficient but can lose messages and is not recommended. If processing fails, no ACK is sent and the Broker redelivers.

All three lines of defense are required: Producer ACK → Broker persistence → Consumer ACK

5.2 How to Handle Duplicate Message Consumption?

Message duplication can occur in the following scenarios:

  1. Producer retry: Producer sends message but doesn't receive ACK, retries sending the same message
  2. Consumer ACK timeout: Consumer finishes processing but ACK times out, Broker redelivers
  3. Network jitter: Consumer ACK doesn't reach Broker, Broker considers message unconsumed
  4. Consumer restart: After consumer restarts, re-consumes the same batch of messages

Idempotency

Idempotency: Executing the same operation multiple times produces the same result as executing it once.

Everyday idempotency examples:

  • Idempotent: Pressing an elevator button (pressing 10 times or once, the elevator still comes)
  • Non-idempotent: Bank transfer (transferring $10, executing twice transfers $20)

Technical solution: Generate a unique ID for each message; check if already processed before handling.
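
The unique-ID check can be sketched with the bank-transfer example. `Account`, `debit`, and the message IDs are hypothetical names. The in-memory set stands in for durable storage; in a real system the "already processed?" check and the balance update would need to be atomic, for example in one database transaction.

```python
class Account:
    """Account whose debit operation is idempotent per message ID."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.processed_ids = set()    # IDs of messages already applied

    def debit(self, message_id: str, amount: int) -> bool:
        """Apply the debit once per message_id; return False for duplicates."""
        if message_id in self.processed_ids:
            return False              # duplicate delivery: skip the side effect
        self.balance -= amount
        self.processed_ids.add(message_id)
        return True


if __name__ == "__main__":
    acct = Account(balance=500)
    acct.debit("msg-1", 100)
    acct.debit("msg-1", 100)          # redelivered message is ignored
    print(acct.balance)               # 400
```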

[Interactive demo: Idempotence. A ¥100 bank debit message is consumed more than once.]

  • Without idempotence protection: duplicate consumption causes multiple debits
  • With idempotence protection: duplicate requests are filtered, so the debit happens once

6. Practice: How to Choose a Message Queue?

6.1 Comparison of Four Mainstream Message Queues

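| Feature | RabbitMQ | Kafka | RocketMQ | Redis Stream |
|---|---|---|---|---|
| Positioning | Traditional MQ | Distributed log stream | E-commerce-grade MQ | Lightweight queue |
| Throughput | ~10K/s | ~1M/s | ~100K/s | ~50K/s |
| Latency | Microseconds | Milliseconds | Milliseconds | Milliseconds |
| Reliability | High (persistence) | High (multi-replica) | High (sync flush) | Medium (AOF) |
| Message replay | Not supported | Supported | Supported | Supported |
| Transactional messages | Supported (weak) | Kafka-internal transactions only | Supported (strong) | Not supported |
| Delayed messages | Supported (via plugin or TTL + dead lettering) | Not supported | Supported | Not supported |
| Use cases | Traditional enterprise apps | Logs, big data | E-commerce, finance | Small-scale apps |

Throughput and latency figures are rough orders of magnitude; actual numbers depend on hardware, configuration, and message size.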

Selection Recommendations

Decision tree:

Choosing a message queue:

├─ Need transactional messages (distributed transactions)?
│  ├─ Yes → RocketMQ (first choice) or RabbitMQ
│  └─ No → continue

├─ Need to process massive logs/real-time streams?
│  ├─ Yes → Kafka (first choice)
│  └─ No → continue

├─ QPS > 10K?
│  ├─ Yes → RocketMQ or Kafka
│  └─ No → continue

├─ Need complex routing (e.g., header matching)?
│  ├─ Yes → RabbitMQ
│  └─ No → continue

├─ Already have Redis infrastructure?
│  ├─ Yes → Redis Stream (quick start)
│  └─ No → RabbitMQ (full-featured, moderate learning curve)
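
The decision tree above translates directly into a chain of checks. The function `choose_mq` and its parameter names are invented for this sketch; it encodes only the article's rules of thumb, evaluated in the same order.

```python
def choose_mq(*, needs_transactions: bool = False, log_streaming: bool = False,
              qps: int = 0, complex_routing: bool = False,
              has_redis: bool = False) -> str:
    """Walk the decision tree top to bottom and return the first match."""
    if needs_transactions:
        return "RocketMQ (or RabbitMQ)"
    if log_streaming:
        return "Kafka"
    if qps > 10_000:
        return "RocketMQ or Kafka"
    if complex_routing:
        return "RabbitMQ"
    if has_redis:
        return "Redis Stream"
    return "RabbitMQ"


if __name__ == "__main__":
    print(choose_mq(qps=20_000))       # RocketMQ or Kafka
    print(choose_mq(has_redis=True))   # Redis Stream
```

Because the checks run in order, earlier questions take priority: a system that needs transactional messages gets RocketMQ even if it also handles high QPS.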

7. Summary: Message Queue Design Principles

7.1 Core Principles Review

| Principle | Meaning | Practice Points |
|---|---|---|
| Decoupling | Services don't directly depend on each other | Communicate via message queue; consumer failure doesn't affect producer |
| Peak shaving | Smooth traffic fluctuations | Message queue as reservoir; consumers process at constant rate |
| Reliability | Messages not lost | Producer ACK + Broker persistence + Consumer ACK |
| Idempotency | Duplicate consumption has no effect | Business-level idempotency guarantees (unique keys, state machines) |
| Ordering | Message order guarantee | Single-partition ordering or consumer-side sorting |

7.2 Design Checklist

Before introducing a message queue, ask yourself:

  • [ ] Do you really need a message queue? (Simple async can use thread pools)
  • [ ] Is message loss acceptable? (Determines reliability level)
  • [ ] Will message duplication affect the business? (Determines idempotency investment)
  • [ ] Is message order important? (Determines partition strategy)
  • [ ] What's the consumer processing capacity? (Determines queue size and alert thresholds)
  • [ ] How to handle consumption failures? (Determines retry and dead letter strategies)

8. Glossary

| Term | Full Name | Description |
|---|---|---|
| MQ | Message Queue | Middleware for asynchronous communication, decoupling producers and consumers. |
| Producer | - | The party that sends messages. |
| Consumer | - | The party that receives and processes messages. |
| Broker | - | The server program that stores and forwards messages. |
| Topic | - | Logical categorization of messages (e.g., "orders"). |
| Queue | - | Physical container storing messages. |
| Partition | - | A Kafka concept; one Topic can be split into multiple Partitions for higher concurrency. |
| ACK | Acknowledgment | Confirmation of receipt or processing: the Broker confirms to the producer that a message was stored, and the consumer confirms to the Broker after processing a message. |
| Pub/Sub | Publish/Subscribe | A messaging pattern where one message can be received by multiple consumers. |
| P2P | Point-to-Point | A messaging pattern where one message can only be received by one consumer. |
| DLQ | Dead Letter Queue | Stores messages that cannot be consumed. |
| Idempotence | - | Multiple executions produce the same result. |
| Throughput | - | Number of messages processed per unit time. |
| Latency | - | Time difference from message send to receipt. |
| Persistence | - | Messages written to disk, not just stored in memory. |
| Replication | - | Messages copied to multiple nodes for high availability. |
| Transaction Message | - | Guarantees consistency between local transaction and message sending. |
| Backpressure | - | When consumers can't keep up, they notify producers to slow down. |
| Offset | - | The consumer's consumption position within a partition. |
| Rebalance | - | Reassigning partitions when consumer group members change. |