Command Line & Shell Scripts

💡 Learning Guide: This chapter gives readers with zero prior experience a systematic understanding of how the terminal works. No computer science background is required: we use interactive demonstrations to explain the terminal's operating mechanisms step by step.

0. Quick Start: How to Open the Terminal?

Before you start learning, you first need to find the terminal. It comes pre-installed on every major desktop operating system, so you don't need to install any additional software to use it.

🖥️ How to Open on Different Systems

 macOS (Mac)

Press Command (⌘) + Space to open Spotlight Search.
Type Terminal.
Press Enter, and you'll see a window with white text on black (or black text on white).

🪟 Windows

Method 1 (CMD): Press Win + R, type cmd, and press Enter. This opens Command Prompt, the older, classic Windows command line.
Method 2 (PowerShell): Press Win + R, type powershell, and press Enter. This is a more modern and powerful terminal.
Tip: Either works for simple tasks, but PowerShell or WSL (Windows Subsystem for Linux) is recommended for development.

🐧 Linux

The shortcut is usually Ctrl + Alt + T.
Or search for Terminal in the application menu.

0.1 Hands-on Lab: Try It Yourself

Practice makes perfect. Before diving into dry theory, let's get a feel for "typing commands" firsthand.

💡 Tip: For safety and convenience, we recommend using the web simulator below. If you're confident, you can also open your computer's real terminal using the method from Section 0 and follow along (the experience is the same).

In this exercise, you will learn to:

View files: Learn to use ls or dir to see what's in the current directory.
Create and navigate: Learn to use mkdir to create new folders and cd to enter them like a portal.
Create files: Learn to quickly create a new file using commands.
Install software: Experience the thrill of installing a Python library or system software with a single line of code.
Delete and clean up: Learn how to delete unwanted files (use with caution!).
Ask AI for help: This is the most important one! When you forget a command, learn to ask AI: "How do I delete a file on Mac?" — it will give you the answer directly.

Step 1: See What's Here

Before working with files, we first need to know what files are in the current directory.

In Windows CMD, use the `dir` command (short for "directory") to list files. On macOS and Linux, use `ls` instead.

Expected Goal: List all files in the current directory.

0.2 Why Give Up the Mouse? (Why CLI?)

You might ask: "Graphical interfaces (GUIs) are so easy to use — just click with the mouse. Why bother typing complex commands into a black-and-white window?"

This isn't about "acting like a hacker." It's because in certain situations, language (commands) is more powerful than gestures (mouse clicks).

1. The Mouse Struggles with "Batch Operations" and "Logic"

GUI (mouse): Good for "click what you see." If you want to delete one photo, right-clicking is quick. But if you want to "delete all photos taken in 2023 that are larger than 5MB and in PNG format," the mouse can't help — you'd have to manually filter for ages.
CLI (commands): Good for "describing what you want to do." The above requirement only needs one line of command, and the computer will automatically find and process matching files — even if there are 10,000 of them.
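As a rough illustration of "describing what you want," the photo example can be written as a short script. This is a minimal sketch, not a definitive tool: the function name find_large_pngs is hypothetical, and it assumes "taken in 2023" can be approximated by the file's last-modified time. It only lists matches and deletes nothing.

```python
from datetime import datetime
from pathlib import Path


def find_large_pngs(root, year=2023, min_bytes=5 * 1024 * 1024):
    """Return PNG files under `root` modified in `year` and larger than `min_bytes`."""
    matches = []
    for path in Path(root).rglob("*.png"):
        info = path.stat()
        # Assumption: the modification time stands in for "date taken".
        modified = datetime.fromtimestamp(info.st_mtime)
        if modified.year == year and info.st_size > min_bytes:
            matches.append(path)
    return sorted(matches)
```

Calling find_large_pngs("Pictures") returns the same kind of list whether the folder holds ten files or ten thousand. Deleting each match would be one extra line (path.unlink()), which is exactly why such commands deserve caution.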

2. Commands Can Be Recorded and Reused

GUI: Configuring an environment requires dozens of menu clicks. When you switch to a new computer, you have to re-do everything from memory — and it's easy to miss steps.
CLI: You can write all commands into a file (a script). Next time, just run that file and the computer will reproduce your operations with zero errors. This is the foundation of "automation."

3. The Only Option for Remote Control

GUI: Transmitting visuals is like watching HD video — it requires very high bandwidth. If the network is even slightly slow, the mouse stutters and becomes unusable.
CLI: Only plain text is transmitted — just a few dozen characters. Even in a remote mountain area with poor signal, you can smoothly control a data center server on the other side of the planet.

Summary: GUI is for exploration (browsing the web, viewing images), CLI is for production (development, operations, batch processing). As developers, we use the terminal because it is more precise, more controllable, and more efficient.

1. Concept Definition: What Is a Terminal?

Different operating systems have different-looking terminals and different command syntax. For example, Windows CMD uses dir where macOS and Linux use ls to do the same thing.

Before graphical user interfaces (GUIs) became widespread, the terminal was the primary way humans interacted with computers. Even today, it remains the most precise and efficient tool for developers to control a computer.

CLI (Command Line Interface): In this mode, the computer only understands characters. Each key press is converted to an encoding and sent to the system, which processes it and returns text results. It doesn't care where you click, only what you type.

At its core, a terminal is a character stream input/output environment:

Input: Send instructions via the keyboard (character signals).
Output: Display text feedback on a screen grid.

It doesn't handle complex graphics, images, or video — it focuses on text-based interaction.

2. Core Architecture: The Art of Decoupling (The Big Picture)

Before diving deeper, consider this question: Does the terminal window itself actually understand what you're saying?

In reality, the terminal is like a display that only relays messages. When you type the date command, the terminal doesn't know it means "show the date" — it just packages those 4 letters and sends them to the real boss behind the scenes: the Shell.

The Shell is the "brain" that can understand your commands and direct the computer to do the work.

To understand how they work together, let's look at these three components with clear roles. The best analogy for understanding their relationship is that of a web browser and a web server.

2.1 Role Division

🖥️ Terminal — Like the "Browser"
- Responsibility: It only handles input (relaying your keystrokes to the other party) and display (rendering the characters sent back on screen).
- Characteristic: It has no intelligence of its own and doesn't understand what ls or cd means. It's like the Chrome browser — whether you visit Baidu or Google, it just renders the page.
- Common terminals: Windows Terminal or the console window that hosts CMD/PowerShell, macOS Terminal.app, VS Code's built-in terminal.
🧠 Shell — Like the "Web Server"
- Responsibility: It's the logical brain. It runs in the background, receiving the command strings you send, parsing their meaning, and then directing the operating system to execute them.
- Characteristic: It's invisible and intangible, communicating only through text streams.
- Common Shells: Bash, Zsh, Fish, PowerShell.
⚙️ Kernel — The "Grand Manager" Behind the Scenes
- Responsibility: The core of the operating system — only it can directly control hardware (read/write disks, allocate memory, control the CPU).
- Relationship: The Shell is the kernel's "secretary," translating your human language into something the kernel can understand.
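To make the Shell's "parsing" step a little more concrete, here is a minimal sketch of the first thing a shell does with a command string: split it into a program name and its arguments. It uses Python's standard shlex module, which follows POSIX-style quoting rules; the helper name parse_command is hypothetical, and a real shell does much more (variables, pipes, redirection).

```python
import shlex


def parse_command(line):
    """Split a command line into (program, arguments), honoring quotes."""
    words = shlex.split(line)
    if not words:
        return None, []
    return words[0], words[1:]


print(parse_command("ls -la /tmp"))         # ('ls', ['-la', '/tmp'])
print(parse_command('echo "Hello World"'))  # ('echo', ['Hello World'])
```

Only after this step does the Shell know which program to ask the kernel to run. The terminal never sees this structure; it only relays the characters.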

2.2 Why Separate Them? (Replaceability)

Because the display layer (terminal) and the logic layer (Shell) are completely separate, they can be freely combined:

Change the "skin": On macOS, you can use the built-in Terminal, download iTerm2, or use VS Code's terminal. They look different but all connect to the same Shell (zsh), so the commands are identical.
Change the "brain": In the same terminal window, you can switch from bash to zsh, or to a Python interactive session. The terminal hasn't changed, but the logic processing commands has.

2.3 Interaction Flow: The Vanishing Keystroke

You might think: "I press 'a' on the keyboard, and the terminal draws 'a' on the screen." Not quite. The character you see has made a round trip, a process called echo:

Press 'a': The keyboard signal is sent to the terminal.
Send signal: The terminal sends the encoding of 'a' to the Shell.
Shell processes it: The Shell receives 'a', finds no issue, and sends 'a' back to the terminal unchanged.
Display character: The terminal receives the 'a' sent back by the Shell and only then draws it on screen.

💡 Mini experiment: Some programs (for example, password prompts) turn echo off. Your keystrokes are still sent from the terminal, but nothing is sent back for display, so the screen stays blank. This is a privacy protection measure.

One-sentence summary of the flow: You type in the terminal ➡️ signal goes to the Shell ➡️ Shell sends it back unchanged (you see the characters) and interprets it ➡️ Shell directs the kernel to do the work.
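The round trip can be modeled in a few lines. This is a toy simulation under simplifying assumptions, not how a real terminal is implemented: the function name session and its echo flag are hypothetical, and the "screen" is just a string.

```python
def session(keys, echo=True):
    """Return (what the program receives, what the screen shows)."""
    received = ""
    screen = ""
    for key in keys:
        received += key      # the keystroke always travels to the program
        if echo:
            screen += key    # it is drawn only if it is sent back
    return received, screen


print(session("ls"))                   # ('ls', 'ls')
print(session("hunter2", echo=False))  # ('hunter2', '')
```

With echo off, the program still receives every character, which matches what happens at a password prompt.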

The diagram for this process shows three boxes. In user space, the Terminal (/dev/tty) and the Shell (/bin/zsh, on macOS / Linux) exchange characters over stdin / stdout. The Kernel sits in kernel space, and the Shell crosses that boundary only through system calls. The example traced through the diagram is the command ls.

3. Visual Model: The Grid System

Unlike modern graphical interfaces, which draw with pixels, the terminal's display foundation is the character grid. The terminal screen is divided into rows and columns, and each position in the grid is called a cell.

3.1 Cell Composition

Each cell is the smallest unit of terminal display, containing two types of core information:

Glyph: The actual displayed character (e.g., A, 中, $).
Attributes: The character's styling (e.g., foreground color, background color, bold, underline).

When you resize the terminal window, you're essentially changing the grid's rows and columns.
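One way to picture the grid is as a two-dimensional list of cells. The sketch below is illustrative only: the Cell fields mirror the glyph and attributes described above, while the names make_grid and resize and the simple "keep what still fits" resizing rule are assumptions (real terminals usually also reflow text).

```python
from dataclasses import dataclass


@dataclass
class Cell:
    glyph: str = " "              # the character shown in this cell
    foreground: str = "default"   # attribute: text color
    background: str = "default"   # attribute: background color
    bold: bool = False
    underline: bool = False


def make_grid(rows, cols):
    """Create a rows x cols grid of blank cells."""
    return [[Cell() for _ in range(cols)] for _ in range(rows)]


def resize(grid, rows, cols):
    """Return a new grid, keeping whatever cells still fit."""
    new = make_grid(rows, cols)
    for r in range(min(rows, len(grid))):
        for c in range(min(cols, len(grid[r]))):
            new[r][c] = grid[r][c]
    return new


grid = make_grid(24, 80)
grid[0][0] = Cell("$", foreground="green", bold=True)
grid = resize(grid, 10, 40)
print(len(grid), len(grid[0]), grid[0][0].glyph)  # 10 40 $
```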

3.2 Style Inspection

A classic terminal cannot display images: all "interfaces" are built from combinations of characters, colors, and styles.

4. Communication Protocol: Escape Sequences

You might wonder: if the terminal only transmits text, how are colored text, moving cursors, and screen clearing achieved?

The answer is escape sequences. These are special character instructions, usually starting with the ESC character (written \033 in octal or 1B in hex). When the terminal receives them, it doesn't display them on screen but interprets them as control commands.

For example:

Regular character A → draw A on screen.
Sequence \033[31m → command: set subsequent text color to red.
Sequence \033[2J → command: clear the screen.

This is like an agreement with a friend: if I speak normally, you write it down; if I raise my left hand (equivalent to ESC), the next sentence is a command, not content.
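The sequences above can be tried directly from a program. The following is a small sketch, assuming a terminal that understands these common ANSI-style sequences; the constant and function names are illustrative. The reset sequence \033[0m, which turns styling back off, is included so the red color does not leak into later output.

```python
ESC = "\033"                 # octal 33, hex 1B: the ESC character
RED = ESC + "[31m"           # set subsequent text color to red
RESET = ESC + "[0m"          # turn all styling back off
CLEAR_SCREEN = ESC + "[2J"   # clear the screen


def red(text):
    """Wrap text so a terminal shows it in red."""
    return f"{RED}{text}{RESET}"


print(red("error:"), "disk is full")
print(repr(red("hi")))  # '\x1b[31mhi\x1b[0m'
```

The second line prints the raw characters with repr, which makes the otherwise invisible ESC bytes visible.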

Example input byte stream (each character followed by its hex byte):

H 48, i 69, ESC 1B, [ 5B, 3 33, 1 31, m 6D, V 56, i 69, b 62, e 65, ESC 1B, [ 5B, 0 30, m 6D, ! 21

The terminal's parser switches between two states as it reads this stream:

Normal mode: Characters go directly to the screen.
Escape mode (after ESC): The terminal stops printing and collects the following characters as a command until the command ends (e.g. with m), then executes it.
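The two modes amount to a tiny state machine. Here is a minimal sketch of one, run on the same stream as the example ("Hi", a red "Vibe", then "!"). It is deliberately simplified: the function name parse and the event tuples are hypothetical, and only the common ESC [ ... letter form is handled properly.

```python
def parse(stream):
    """Split a character stream into ('print', ch) and ('command', seq) events."""
    events = []
    state = "NORMAL"
    buffer = ""
    for ch in stream:
        if state == "NORMAL":
            if ch == "\x1b":            # ESC: stop printing, start collecting
                state, buffer = "ESCAPE", ""
            else:
                events.append(("print", ch))
        else:
            buffer += ch
            # A sequence ends at the first character from '@' to '~',
            # not counting the '[' that opens it.
            if buffer != "[" and "@" <= ch <= "~":
                events.append(("command", buffer))
                state = "NORMAL"
    return events


events = parse("Hi\x1b[31mVibe\x1b[0m!")
print("".join(ch for kind, ch in events if kind == "print"))  # HiVibe!
print([seq for kind, seq in events if kind == "command"])     # ['[31m', '[0m']
```

The printed characters and the commands come out of the same stream, which is the whole trick: control information travels in-band with the text.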

Beyond single colors, escape sequences fall into a few families: the 16-color palette, style sequences, and cursor control sequences.

5. Input Mechanism: Byte Stream

The input process is often misunderstood. When you press a key, the terminal doesn't directly "draw" the character on screen — it performs an encoding transmission.

Key capture: The terminal captures your physical key press.
Encoding conversion: Converts the key press into a specific byte sequence.
- Pressing a → sends byte a.
- Pressing Up Arrow → sends sequence ^[[A.
Send: Sends the byte stream to the Shell or the currently running program.

Key point: At the transmission level, all key presses (including function keys, and mouse clicks when a program has enabled mouse reporting) are just byte data.
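The encoding step can be shown as a lookup from key to bytes. This is a sketch with assumed values: the table name KEY_BYTES is hypothetical, Enter is shown as a carriage return and Up Arrow as ESC [ A (the ^[[A above, where ^[ is how ESC is displayed), which is what terminals commonly send in their default mode, and non-ASCII characters are assumed to be encoded as UTF-8.

```python
KEY_BYTES = {
    "a": b"a",
    "Enter": b"\r",
    "Up Arrow": b"\x1b[A",
    "中": "中".encode("utf-8"),
}


def describe(key):
    """Show the bytes sent for a key as space-separated hex."""
    return " ".join(f"{byte:02X}" for byte in KEY_BYTES[key])


for key in KEY_BYTES:
    print(key, "->", describe(key))
```

A single key press can therefore produce one byte (a), or several (Up Arrow, 中); the receiving program has to reassemble them.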

6. Running Modes: Typewriter vs. Game Console (Cooked vs. Raw Mode)

The terminal has two very different "personalities." Understanding this helps you see why typing commands and playing Snake in the terminal are completely different experiences.

Cooked Mode — Like a Typewriter
- This is the default mode.
- Behavior: Characters you type are held in a line buffer (by the operating system's terminal driver, not the program) until you press Enter.
- Benefit: This gives you a chance to correct mistakes. Made a typo? Press Backspace to delete and retype — the program never knows you made a mistake.
- Use case: Typing everyday commands (like ls, cd).
Raw Mode — Like a Game Controller
- This is the "expert" mode.
- Behavior: Every key you press (including arrow keys, Ctrl combinations) is instantly sent to the program with no buffering.
- Benefit: The program can respond to your actions in real time.
- Use case: Playing terminal games (like Snake), using the Vim editor (a keyboard-only editor).
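The difference between the two modes can be simulated without touching a real terminal. The sketch below is a toy model: the function names cooked and raw are hypothetical, and it assumes Backspace arrives as the DEL character (\x7f) and Enter as a carriage return, which is common but not universal.

```python
BACKSPACE = "\x7f"
ENTER = "\r"


def cooked(keys):
    """Deliver whole lines: edits happen in the buffer before Enter."""
    delivered, line = [], ""
    for key in keys:
        if key == BACKSPACE:
            line = line[:-1]        # fix the typo inside the buffer
        elif key == ENTER:
            delivered.append(line)  # only now does the program see anything
            line = ""
        else:
            line += key
    return delivered


def raw(keys):
    """Deliver every key immediately, including Backspace and Enter."""
    return list(keys)


typed = "lx" + BACKSPACE + "s" + ENTER
print(cooked(typed))  # ['ls']
print(raw(typed))     # ['l', 'x', '\x7f', 's', '\r']
```

In cooked mode the program never learns about the mistyped x. In raw mode it sees every key and must decide for itself what Backspace means.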

In cooked mode, input passes through three stages:

1. Keyboard Input: you type characters.
2. Line Buffer (Kernel): characters wait here until you press Enter, and Backspace still works.
3. Application Receives: the finished line is handed to the program.

In raw mode, stage 2 is skipped and each key goes straight to the application.

7. Process Control: Signals

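Pressing Ctrl+C in the terminal usually stops a program. The program does not receive this as an ordinary typed character: the operating system's terminal driver recognizes the key combination and delivers a signal to the program instead.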

Signals are an operating system-level notification mechanism used to tell a program that a specific event has occurred.

Ctrl+C → sends SIGINT (Interrupt): notifies the program "please interrupt the current operation."
Ctrl+Z → sends SIGTSTP (Suspend): notifies the program "please pause." The program is suspended and control returns to the Shell.

Because signals are delivered by the operating system rather than read through the program's normal input channel, users retain control even when a program is busy or frozen.
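From the program's side, a signal shows up as a callback rather than as input. The following minimal sketch installs a handler for SIGINT and then raises the signal from inside the process to stand in for a Ctrl+C press. It assumes Python 3.8 or later for signal.raise_signal; the handler name on_interrupt is hypothetical.

```python
import signal

received = []


def on_interrupt(signum, frame):
    """Record the signal instead of letting it stop the program."""
    received.append(signal.Signals(signum).name)


previous = signal.signal(signal.SIGINT, on_interrupt)  # install the handler
signal.raise_signal(signal.SIGINT)                     # stand-in for Ctrl+C
signal.signal(signal.SIGINT, previous)                 # restore the old handler

print(received)
```

Most programs do not install such a handler, which is why Ctrl+C simply stops them.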

Ctrl+C → SIGINT: Stop the running program. Sends SIGINT (signal interrupt) to the foreground process. Most programs respond by stopping immediately. It's how you cancel a long-running command or exit a program that's stuck.

Example: Running `sleep 100` and pressing Ctrl+C stops it immediately.

8. Advanced Application: Full-screen Interfaces & Buffers

Have you noticed that when you edit a file with vim or check system status with htop, they take over the entire screen? And when you exit, the screen instantly returns to its previous state with all your command history intact?

This is because the terminal has two "canvases" that switch back and forth:

Primary Buffer — Like a Scratchpad
- You write a line, the system responds with a line.
- When it's full, it scrolls — everything you wrote before is still there.
- Used for: Everyday command typing.
Alternate Buffer — Like a Blackboard
- The program wipes the blackboard clean and draws on it (full-screen display).
- No matter what's drawn, it never affects the scratchpad on your desk.
- When you exit the program, it's like putting the blackboard away — you're back to your scratchpad.
- Used for: Vim, Nano, games, and other full-screen software.
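Switching canvases is itself done with escape sequences. The sketch below uses the xterm-style sequences ESC [ ? 1049 h and ESC [ ? 1049 l, which many terminal emulators support, though this is an assumption about your terminal. The helper name full_screen is hypothetical.

```python
ENTER_ALT_SCREEN = "\x1b[?1049h"   # switch to the alternate buffer
LEAVE_ALT_SCREEN = "\x1b[?1049l"   # switch back to the primary buffer
CLEAR_AND_HOME = "\x1b[2J\x1b[H"   # clear the screen, cursor to top-left


def full_screen(body):
    """Wrap `body` so it is drawn on the alternate buffer and then discarded."""
    return ENTER_ALT_SCREEN + CLEAR_AND_HOME + body + LEAVE_ALT_SCREEN


print(repr(full_screen("editing notes.txt")))
```

A full-screen program would write the first two sequences when it starts and the last one when it exits, leaving the scrollback in the primary buffer untouched.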

For example, the primary buffer might hold this history just before a full-screen program starts:

➜ ls -la
total 16
drwxr-xr-x 2 user staff 64 Jan 15 10:00 .
drwxr-xr-x 4 user staff 128 Jan 15 09:55 ..
-rw-r--r-- 1 user staff 1024 Jan 15 10:00 notes.txt
➜ echo "Hello World"
Hello World
➜ vim notes.txt

This is the standard scrolling log of the primary buffer, where commands are executed line by line. Running vim switches to the alternate buffer, and quitting vim brings this history back unchanged.

9. Summary

The terminal is not a mysterious black box — it's a standardized text interaction interface.

Display: Based on grids and characters.
Control: Based on escape sequences.
Interaction: Based on input/output streams and signals.

By understanding these underlying principles, you no longer need to blindly memorize commands — you can truly understand the logical flow behind every keystroke.

Appendix: Common Terminology (Vocabulary)

Term	Explanation
Terminal	The window program responsible for display and input (front-end).
Shell	The program responsible for parsing commands and executing logic (back-end).
CLI (Command Line Interface)	A text-based interaction method.
TUI (Text User Interface)	A pseudo-graphical interface built with characters in the terminal.
Escape Sequence	Special character instructions used to control terminal cursor, colors, etc.
Standard Input/Output (stdin/stdout)	Standard channels through which programs receive and output data.

References

How Terminals Work: The structure and demonstrations in this article are deeply inspired by this project. If you want to dive deeper into engineering implementation details, we highly recommend reading the original tutorial.
