Skip to content

The Browser Is an Operating System

Preface

You use a browser every day — watching videos, reading news, working online. But have you ever wondered: When you type a URL in the address bar and press Enter, what happens behind the scenes?

This article will use a relatable analogy of "online shopping" combined with the actual technical process to walk you step by step through how a browser turns a URL into a rich, colorful page.

After reading this, you will be able to:

  • Understand the complete process from entering a URL to displaying a page
  • Master core concepts like URL, DNS, TCP, and HTTP
  • Learn how browsers render pages
  • Know the difference between static and dynamic websites

No programming experience required — all you need is your everyday online shopping experience.

What will you learn from this article?

After completing this chapter, you will master the complete technical process from entering a URL to page display, and understand how browsers and servers work together. This knowledge is the foundation for learning about APIs, interfaces, network security, and other technologies. It's also key to troubleshooting everyday issues like "webpage won't load" or "slow loading."

ChapterContentCore Concepts
Chapter 1URL ParsingStructure and purpose of URLs
Chapter 2DNS LookupHow domain names convert to IP addresses
Chapter 3TCP HandshakeHow to establish a reliable connection
Chapter 4HTTP CommunicationHow browsers and servers talk to each other
Chapter 5Browser RenderingHow code becomes a visual page
Chapter 6Static vs DynamicHow webpage content is generated

0. Introduction: The Moment You Press Enter

Core Question

When you type a URL in your browser and press Enter, what happens behind the scenes? Why do some pages load quickly while others are slow? Why do you sometimes get a "server not found" error?

Real-life Analogy: An Online Shopping Journey

Imagine you're doing some online shopping. The entire process can be divided into 5 steps:

Step 1: Fill Out the Order Select products and confirm the delivery address

Step 2: Find the Warehouse The system locates the specific shipping warehouse

Step 3: Establish a Channel Confirm the warehouse is open and can ship

Step 4: Warehouse Ships The courier delivers the package to your door

Step 5: Unboxing Experience Open the package and see your desired product

The process of visiting a webpage is remarkably similar to online shopping!

When you type google.com in your browser and press Enter, you are the "buyer." Through a series of operations, the browser eventually delivers the "product" (webpage content) from a distant server to your screen.

https://
试一试:
🛒
出发
🗺️
查仓库
📞
建立通道
🚚
发货
🎁
收货
👈 在左上角输入网址,开启网络快递之旅

Key Insight

The key to understanding how browsers work is: map complex technical processes to familiar real-life scenarios. The 5 steps of online shopping perfectly correspond to the 5 technical stages of a browser accessing a webpage.


1. Step One: Fill Out the "Order" — URL Parsing

Core Question

Why does a URL look like this? https://www.example.com:8080/path/page.html?id=123#section — What does this string of characters actually mean?

Real-life Analogy: Filling Out an Order Form

If you only wrote "buy shoes" on an order, the warehouse wouldn't know which pair to send. You need to specify:

  • Store type (official flagship store / regular store)
  • Store name (Nike official store)
  • Product location (Men's shoes / Running series)
  • Specific model (Air Max 90)
  • Special notes (I want the red one)

The Real Process: Browser Parses the URL

A URL (Uniform Resource Locator) is the "product locator code" of the browser world. When you type https://www.example.com:8080/path/page.html?id=123#section in the address bar, the browser immediately breaks it down:

URL PartExample ValueShopping AnalogyTechnical Purpose
Protocol https://Secure Hypertext Transfer ProtocolShipping method: Secure delivery (HTTPS) vs regular delivery (HTTP)Determines the communication rules. http is regular transfer, https is encrypted transfer
Domain www.example.comHuman-readable name of the serverStore name: JD SupermarketTells the browser which server to find. Domain names are for human memorization and must eventually be converted to IP addresses
Port :8080Specific "door number" of the serverCounter number: Counter #3 (default is omitted)A server may have multiple services; the port specifies which one to access. HTTP defaults to 80, HTTPS to 443
Path /path/page.htmlFile location on the serverShelf location: Daily goods aisle / Third rowSpecifies the exact resource location on the server
Query parameters ?id=123Additional informationOrder notes: Red, size XLExtra data passed to the server, such as search keywords and page numbers
Fragment #sectionPosition within the pageManual page reference: Turn to page 5Automatically scrolls to the specified position after the page loads; not sent to the server
URL Parsing -- Translating human text into structured information
https://www.google.com/search
🚛 Transport mode (Protocol)
This says you want the safest vehicle, HTTPS encrypted communication. With HTTP, it is like an open-top car that anyone along the way can inspect.
🏢 Store name (Host)
This is the destination server domain. The browser later translates it into the numeric IP address used by the network.
📍 Exact shelf (Path)
After entering the store, this tells the server which room, resource, or action you want.
Hover over each part to see its responsibility

Key Understanding

URLs exist so that humans can remember and type them. What computers ultimately need are IP addresses (just as couriers need the actual warehouse address, not just the name "Nike Official Store").


2. Step Two: Check the "Address Book" — DNS Lookup

Core Question

How does the browser find the website? You type a human-readable domain name (like baidu.com), but what computers really need are numeric addresses (IP). What happens in between?

Real-life Analogy: Looking Up the Warehouse Address

You wrote "Nike Official Store" on your order, but the logistics system doesn't know where the warehouse is. It needs to check an address book:

  1. First check recent addresses (have I bought from this store recently?) → Browser cache
  2. If not, ask the neighborhood delivery point (they know the general area assignments) → Local DNS server
  3. Ask the headquarters dispatch center (knows who manages .com stores) → Root name server
  4. Ask the brand management office (ultimately finds the real shipping warehouse for the Nike store) → Authoritative name server

The Real Process: DNS Hierarchical Lookup

DNS (Domain Name System) is the internet's "distributed address book lookup system." Since there are billions of domain names worldwide, a hierarchical architecture is used to distribute the query load:

You (Browser)
    ↓ Ask: What's the IP for google.com?
Local DNS server (your ISP, such as China Telecom/China Unicom)
    ↓ Ask: Who manages .com?
Root name server (13 root server groups worldwide, managing all top-level domains)
    ↓ Tell: Ask the .com manager
Top-level domain server (Verisign manages .com)
    ↓ Tell: Ask the google.com manager
Authoritative name server (Google's own DNS server)
    ↓ Tell: google.com's IP is 142.250.80.46
Return IP address to browser

Query types explained:

  • Recursive Query: The browser sends only one request; the local DNS is responsible for querying all levels and returning the result
  • Iterative Query: Each level only tells the next level where to look; the browser needs multiple queries
  • Caching mechanism: Query results are cached for direct return next time, greatly speeding up access
DNS Lookup -- Finding coordinates in the address book
📱
Browser
Wants to visit www.google.com
Ask for coordinates
Return IP
📞
Directory service (DNS)
Standing by
Click the button to tell the browser that you do not know where the Google server is

Why So Many Layers?

Imagine if the entire world had only one address book — billions of people querying it simultaneously would crash it instantly. The hierarchical design lets each level manage only its own "jurisdiction," making it both efficient and reliable.

This is a core design principle of the internet: distributed systems.


3. Step Three: Phone Confirmation — TCP Three-Way Handshake

Core Question

Why is a "three-way handshake" needed? After finding the server address, why can't you just send data directly? Why go through three rounds of communication first?

Real-life Analogy: Establishing a Logistics Channel

If a logistics truck drove directly to the warehouse, the result could be:

  • The warehouse is closed → Wasted trip
  • The warehouse is overloaded and not accepting orders → Can't ship
  • Can't find the loading dock → Can't connect

So before actually shipping goods, you must first establish a reliable transport channel.

The Real Process: TCP Three-Way Handshake

TCP (Transmission Control Protocol) is the protocol that ensures reliable data transmission. Before transmitting goods (data), a connection must be established through the "three-way handshake":

Client (Your Computer)              Server (Merchant Warehouse)
   |                                |
   |--- SYN=1 --------------------->|  1st: Hello, I'm home and ready to receive! (SYN)
   |                                |
   |<-- SYN=1, ACK=1 ---------------|  2nd: Got it! I'm ready to ship too. Are you home? (SYN-ACK)
   |                                |
   |--- ACK=1 --------------------->|  3rd: Yes! Please ship. (ACK)
   |                                |
   ===== Channel established, shipping begins =====

Why three times, not two?

  • First (SYN): Client proves it can send
  • Second (SYN-ACK): Server proves it can receive and send
  • Third (ACK): Client proves it can receive

The three-way handshake ensures: both parties can send and both can receive — all four conditions are met for reliable transmission.

TCP is also responsible for:

  • Data segmentation: Breaking large data into smaller packets for transmission
  • Sequential reassembly: Ensuring packets are assembled in the correct order
  • Error retransmission: Automatically resending lost packets
  • Flow control: Adjusting sending speed based on network conditions
TCP Three-Way Handshake -- Establishing a reliable channel
💻
Browser (you)
Waiting
🖥️
Google server
Waiting
Click "Start connection" to simulate the TCP three-way handshake

HTTPS Extra Step: For HTTPS (secure websites), after the TCP handshake, a TLS handshake (1-RTT or 2-RTT) is also performed. Both parties exchange encryption keys to ensure only they can understand the subsequent conversation — like communicating in secret code.


4. Step Four: The "Buyer" and "Merchant" Conversation — HTTP Request and Response

Core Question

What are the browser and server saying to each other? After establishing a connection, how does the browser "tell" the server what it wants? And how does the server "respond"?

Real-life Analogy: Warehouse Shipping

The logistics truck arrives at the warehouse: "Here's the order (HTTP request), I want to pick up the product (webpage HTML source code)!" The warehouse manager verifies: "The order is valid. Here's your package (HTML file). Please take it."

The Real Process: HTTP Protocol Communication

HTTP (HyperText Transfer Protocol) is the "conversation rules" between browser and server. After the channel is established, the browser sends a pickup request. The core goal is to retrieve the webpage's source code (HTML file):

HTTP Request Example:

http
GET /index.html HTTP/1.1          ← Request method + path + protocol version
Host: www.example.com             ← Target host (supports virtual hosting; one server can host multiple websites)
User-Agent: Chrome/120.0          ← Client identifier (server can return adapted content based on this)
Accept: text/html,application/xhtml+xml  ← Acceptable response formats
Accept-Language: zh-CN,zh;q=0.9   ← Preferred languages
Accept-Encoding: gzip, deflate    ← Supported compression formats
Connection: keep-alive            ← Keep connection alive (reuse TCP connection)
Cookie: session_id=abc123         ← Authentication credentials

Developer Epiphany: Isn't This Just an API?

Exactly the same! The API calls you write (fetch / axios) and the browser accessing a webpage are exactly the same thing at the HTTP level.

Both send a request, and the server returns text data.

  • If the server returns HTML, the browser renders it (turning it into a webpage).
  • If the server returns JSON, your code stores it (for logic processing).

There aren't "two types" of requests — there's only one kind of HTTP request. The difference is just the response data format (Content-Type). This is why understanding HTTP means you understand 90% of backend API principles.

If you want to dive deeper into API development, please refer to the API chapter.

Common HTTP Methods:

  • GET: Retrieve a resource (safe, idempotent, cacheable)
  • POST: Submit data (create a resource, such as registration, login)
  • PUT: Update a resource (full replacement)
  • PATCH: Partially update a resource
  • DELETE: Delete a resource
  • HEAD: Get response headers only (no body returned; used to check if a resource exists)

Server Returns HTTP Response:

http
HTTP/1.1 200 OK                   ← Protocol version + status code + status description
Date: Mon, 23 May 2025 12:00:00 GMT  ← Server time
Content-Type: text/html; charset=UTF-8  ← Content type and encoding
Content-Length: 1234              ← Content length (bytes)
Cache-Control: max-age=3600       ← Caching policy
Set-Cookie: user_id=xyz789        ← Set Cookie

<!DOCTYPE html>...                ← Response body (webpage content)

HTTP Status Code Categories:

Status CodeCategoryMeaningReal-life Analogy
200SuccessRequest successfully processed"Order confirmed, shipping now"
301/302RedirectResource has moved"This store has relocated, please order at the new location"
304Not ModifiedCache is still valid"What you bought last time is still usable, no need to reship"
400Client ErrorMalformed request"The order is unclear and unreadable"
401UnauthorizedAuthentication required"Please show your membership card first"
403ForbiddenInsufficient permissions"Staff only, no entry"
404Not FoundResource does not exist"This product is not in stock"
500Server ErrorInternal server error"The warehouse is on fire, can't ship for now"
502Bad GatewayUpstream server unresponsive"Main warehouse is out of stock, branch can't transfer either"
503Service UnavailableServer overloaded or under maintenance"Too many orders, temporarily paused"
HTTP Request and Response -- Sending a note to buy a package
📤 Buyer note: HTTP Request
GET /search HTTP/1.1
Host: www.google.com
User-Agent: Mac Chrome browser
Accept-Language: en-US (I want English content)
📥 Seller package: HTTP Response
The server response package will appear here...
The HTTP request is assembled with a path and supporting headers.

5. Step Five: Opening the "Package" — Browser Rendering

Core Question

How does code become a visual display? The server sends dry HTML/CSS/JavaScript code. How does the browser turn it into a rich, colorful webpage?

Real-life Analogy: Unboxing and Assembly

You finally receive the delivery package (HTTP response), but when you open it, it's not ready-made furniture — it's a pile of parts (HTML) and an assembly manual (CSS). As the "buyer" (browser), you need to assemble it yourself:

  1. Open the packaging: Take out all parts and check the list (parse HTML → DOM tree).
  2. Read the manual: Understand the instructions — which part goes where, what color (parse CSS → CSSOM tree).
  3. Sort and organize: Pick out the parts to assemble, discard the packing foam (display: none), and prepare for assembly (build render tree).
  4. Measure positions: Use a ruler to measure the room dimensions and decide exactly where each piece of furniture goes (layout/reflow).
  5. Paint and decorate: Apply paint and decals to the furniture (paint).
  6. Final display: Clean up and turn on the lights for display (compositing).

The Real Process: Browser Rendering Engine

What the browser receives is HTML/CSS/JavaScript code (dry text), but it needs to become pixel imagery (a beautiful webpage). This process is called rendering, performed by the browser's rendering engine (such as Chrome's Blink, Safari's WebKit).

Step 1: Parse HTML → Build DOM Tree (Parts List)

The browser reads the HTML byte stream and parses it into a DOM (Document Object Model) tree. This is like organizing a pile of scattered parts into a hierarchically structured list:

html
<!-- Original HTML -->
<div class="header">Title</div>
<div class="content">Content</div>
text
DOM tree structure:
Document
 └─ html
     └─ body
         ├─ div.header ("Title")
         └─ div.content ("Content")

Step 2: Parse CSS → Build CSSOM Tree (Manual)

The browser parses all CSS (inline, external files) and builds a CSSOM (CSS Object Model) tree. This is like understanding the style rules in the manual:

css
.header {
  color: blue;
  font-size: 24px;
} /* Title should be blue */
.content {
  display: none;
} /* Content temporarily hidden */

Step 3: Merge → Render Tree (Ready to Assemble)

DOM tree + CSSOM tree = Render Tree. Key point: Only "visible" elements are in the render tree.

  • .header: In the render tree (visible).
  • .content: Not in the render tree (because display: none, like discarded wrapping paper that doesn't need assembly).

Step 4: Layout (Reflow) — Measure Dimensions

The browser calculates the exact coordinates and sizes of each node in the render tree on screen.

  • "This title box is 100px wide, 50px tall, positioned at screen coordinates (0,0)."
  • This process is called reflow. If the window size changes (e.g., rotating a phone), all element positions must be recalculated — very performance-intensive.

Step 5: Paint — Apply Colors

After knowing the positions, the browser starts filling in pixels: painting background colors, text colors, borders, shadows, etc.

Step 6: Composite — Final Display

Modern browsers divide the page into multiple layers for separate painting (e.g., 3D transforms, independent scrollbar layers), and the GPU composites them together like Photoshop layers onto the screen.

Browser Rendering -- Turning plain text into a polished page
Receive plain text source code
The response is just plain HTML and CSS text. It is an instruction manual for the page, not the visual page yet.
<html>
  <style>
   .title { color: #f00; }
  </style>
  <body>
   <h1 class="title">
     Google Search
   </h1>
   <input />
  </body>
</html>
Click each step icon above to see the output of each rendering stage

Did You Know?

Layout and painting are when the browser is busiest. The more elements and complex the structure on a page, the more time the browser needs to calculate positions and paint colors. This is why some complex pages can feel sluggish.


5.5 How Are Webpages "Generated"? Static vs Dynamic Websites

Core Question

Where does webpage content come from? We covered how browsers render pages, but how does the HTML file on the server get there? Is it prepared in advance or made on demand?

Everything we've covered so far is about how the browser "opens the package" — rendering the HTML/CSS/JS sent by the server into a page. But have you ever wondered: how did that HTML file on the server get there in the first place?

The answer: there are two ways — and that's the difference between static and dynamic websites.

Static Websites: Pre-made and Served Directly

Imagine going to a supermarket to buy cookies. The cookies on the shelf are already produced at the factory — you just pick them up, no waiting needed.

A static website works the same way — the webpage is already prepared on the server. When you visit, the server sends the ready-made HTML file directly without any additional processing.

Characteristics:

  • Fast access (server just sends files, no computation needed)
  • Simple to create (just write HTML and it's ready)
  • High capacity (can use CDN distribution; handles any traffic volume)
  • Hard to update content (changing content requires regenerating files)

Common examples: Company profile pages, product documentation, help centers, personal blogs

Dynamic Websites: Made to Order, Different Each Time

Now imagine going to a restaurant and ordering food. The chef prepares your dish fresh based on your order — if you order Kung Pao Chicken, you won't get Sweet and Sour Pork.

A dynamic website generates pages "on the spot" when you visit — after receiving your request, the server queries databases, computes data, and generates a brand new HTML file for you.

Characteristics:

  • Real-time content (shopping carts show latest inventory, news updates instantly)
  • Personalized (after logging in, you see your own profile)
  • Powerful features (search, comments, recommendations, payments all possible)
  • Slower access (server needs time to compute)
  • Higher server load (many simultaneous visitors may need to queue)

Common examples: Taobao, Weibo, online banking, online documents

Do you need a server? Dynamic websites do need some kind of "backend" to generate content, but the forms vary:

  • Traditional servers: Buy/rent your own server (Alibaba Cloud ECS, AWS EC2)
  • Serverless: No server management needed; the cloud provider runs your code (AWS Lambda, Alibaba Cloud Function Compute, Cloudflare Workers)
  • Third-party API calls: Use Stripe for payments, weather bureau API for weather — no need to write your own backend code

Combining Static and Dynamic

Many websites today are "hybrid": the main page structure is static, but certain parts (like comment sections, search boxes) are dynamically loaded. JavaScript can call APIs to fetch data after the page loads, achieving "static pages + dynamic functionality."

Static vs Dynamic — A Clear Comparison

Static WebsiteDynamic Website
How it's generatedPre-made, stored on serverMade on demand when visited
AnalogyProducts on supermarket shelvesMade-to-order dishes at a restaurant
SpeedFastSlower (requires computation)
Can content be changed?Difficult (need to regenerate)Easy (change in the backend)
Best forDisplay content (profiles, docs)Interactive applications (shopping, social)
Typical examplesCompany websites, help docsTaobao, WeChat, online banking

Common Questions

Q: Can static websites not use JavaScript?

Of course they can! Interactive features like carousels, dropdown menus, and form validation can all be implemented with JavaScript on a static website. When we say "static" and "dynamic," we're referring to whether the page content is pre-prepared, not whether it has interactive features.

Q: Do dynamic websites require buying your own server?

Not necessarily. Besides traditional servers, you can use Serverless (cloud functions) or directly call third-party APIs. The current trend is "avoid touching servers when possible" — using static websites + JavaScript API calls for speed and cost savings.

Important Note

Whether a website is static or dynamic, the browser rendering principles are exactly the same! The browser renders whatever the server sends. The only difference is:

  • Static website: The server sends a "finished product"
  • Dynamic website: The server sends a "freshly made" product

As a frontend developer, your main focus is how the browser handles the content it receives, not how the server generated it.


6. Summary: A Complete "Online Shopping" Journey

After This Chapter, You Should Be Able To

  • Explain the complete process from entering a URL to displaying a page
  • Understand the roles and relationships of URL, DNS, TCP, and HTTP
  • Know how browsers render pages
  • Distinguish between static and dynamic websites
  • Explain browser workings to others using real-life analogies

Let's review the entire journey:

StageTechnical TermShopping AnalogyCore TaskKey Technology
1. ParseURL ParsingFill out orderUnderstand what the buyer wantsProtocol, domain, port, path, parameters
2. LookupDNS LookupFind warehouse addressFind the store's shipping warehouseRecursive/iterative queries, caching
3. ConnectTCP HandshakeEstablish channelEnsure smooth logisticsThree-way handshake, sequence numbers, flow control
4. ExchangeHTTP ExchangeWarehouse shipsSubmit order and receive goodsRequest methods, status codes, header fields
5. DisplayBrowser RenderingUnbox and assembleDisplay the productDOM, CSSOM, render tree, layout, paint

The entire process typically completes in a few hundred milliseconds — think about how remarkable that is!

Your browser, in less than 1 second:

  • Parsed a complex address
  • Queried DNS servers distributed around the world
  • Established a reliable connection with a server thousands of miles away
  • Conducted a complete HTTP conversation
  • Turned dry code into a beautiful visual display

This is the charm of the internet: complex technology, simple experience.

Further Learning

If you want to dive deeper into any part, check out:


7. Glossary

TermFull NameSimple Explanation
URLUniform Resource LocatorThe "address" of a webpage, telling the browser where to find the resource.
DNSDomain Name SystemThe internet's "phone book," converting human-readable domain names into machine-readable IP addresses.
IP AddressInternet Protocol AddressThe unique "door number" for every networked device, such as 192.168.1.1.
TCPTransmission Control ProtocolThe "rules" ensuring reliable data transmission, establishing connections through a three-way handshake.
HTTPHyperText Transfer ProtocolThe rules for "conversation" between browser and server.
HTTPSHTTP SecureSecure HTTP. Adds encryption (TLS/SSL) on top of HTTP to protect data security.
HTMLHyperText Markup LanguageThe "skeleton" of a webpage, defining content structure.
CSSCascading Style SheetsThe "skin" of a webpage, defining content appearance.
DOMDocument Object ModelThe tree structure the browser converts HTML into for easier manipulation.
CSSOMCSS Object ModelThe tree structure the browser converts CSS into.
RenderingThe process of the browser converting code into screen pixels.
RTTRound Trip TimeThe time for a data packet to be sent and its acknowledgment received, affecting webpage loading speed.

Congratulations

Now when you type a URL in the address bar and press Enter again, you can already see the busy and fascinating digital world behind your screen.

You now understand:

  • Why sometimes webpages won't open (DNS resolution failure, server down)
  • Why some pages are fast and others slow (network latency, server performance, page complexity)
  • How the browser turns code into a visual display (rendering pipeline)

This is the value of understanding technical principles — when problems arise, you know where to look for the cause instead of feeling helpless.