欢迎大家提Issue反馈问题或建议,建设推理王国!
Skip to content

Chapter 14: Formal Systems — Giving Reasoning a Foundation

Before you can prove something, you must first say clearly what "proof" means.


The thirteen chapters of the previous volume were a journey. We started from entropy increase, passed through the ruins of symbolic systems, detoured around the traps of vector spaces, and spent a long time at the boundary of complexity theory. Throughout the entire process, we kept talking about "reasoning" — but we never truly defined it.

This was intentional. Intuition needs to go first; definitions follow behind.

But now it is time to pay the bill. The next volume has only one thing to do: say clearly what reasoning is. Not with metaphors, not with history, but with precise language — dig out the structure of reasoning, place it under the light, and see it clearly.

This requires a starting point. That starting point is called a formal system.


14.0 The Axiom Stamp Game: What Does "Legally Derived" Mean?

Don't rush to talk about "truth." Let's first play a colder game: the Axiom Stamp Game.

On the table are three things: a stack of blank paper, a box of stamps, and a rule sheet. The stamps are engraved with certain fixed formulas, for example P(QP). The rule sheet lists a small number of permitted moves, for example: if P and PQ already appear on the paper, you may stamp Q on the next line.

When the game begins, there are no conclusions on the table. You can perform two kinds of actions:

  1. Apply an axiom stamp: instantiate some axiom template and write it on the paper.
  2. Take one step by rule: from formulas already appearing on the paper, write a new formula according to an inference rule.

Beyond this, nothing is permitted. You may not say "this is obviously true," may not say "my intuition tells me it should be so," may not invoke the real world for help. In this game, a "proof" is just a sequence of legal stamp records; a "theorem" is the formula appearing on the last line of some sheet.

The formal skeleton of this game

  • State space: all finite sequences of formulas (A1,A2,,An).
  • Legal actions: write down an axiom instance, or apply an inference rule to existing formulas.
  • Transition rule: from sequence s to s=sA, where A must be legally generated by an axiom or rule.
  • Victory condition: the target formula G appears at the end of the sequence, i.e., FG.
  • Failure mode: you feel that G is very correct, but the rule sheet does not permit you to write it. The formal system does not care about your feelings.

The point of this game: legality comes before truth. A formal system first stipulates what symbol strings can be written and what steps can be taken; whether these symbols actually describe the world is a problem that "semantics" will handle later.

This is the first cut of formalization. It cuts reasoning out of natural language and places it into a checkable game. Winning does not mean you possess truth; you have only proved one thing: under this set of rules, you indeed reached that position.

A formal system is not a truth machine. It is a chessboard, a box of pieces, a rulebook. Being able to reach a certain square only shows that this move is legal, not that the universe owes you applause. Separating "legally derived" from "truly holds" is the first lesson of this volume.

14.1 The Problem of Linguistic Ambiguity

Consider this sentence: "If it rains, the ground gets wet. It rained. Therefore, the ground is wet."

This reasoning seems completely correct. But the question is: why is it correct?

You might say: because it conforms to common sense. But common sense is not mathematics. Or you might say: because logically it must be so. But what does "logically it must be so" mean? If we are to take this question seriously, we must find some explanation that does not rely on "common sense" and "obvious."

Natural language is ambiguous. "Or" in different contexts may mean "at least one" or "exactly one." "If...then..." varies in countless ways in everyday usage — "If you pass the exam, I'll treat you to dinner" and "If this algorithm is correct, then the complexity is O(nlogn)" — the logical structures of these two "ifs" are completely different.

The purpose of formalization is not to make language ugly, but to eliminate ambiguity. We need a set of symbols, each symbol having only one meaning, given by definition, not depending on context, not depending on common sense.

"Common sense is not mathematics" — this sentence is very well said. But don't think that formalization is just translating common sense into symbols. Formalization is forcing you to admit: how much of what you call "obvious" hides things you never thought through clearly. Symbols are only tools that force you to show your true form, not the answer itself. That's all.


14.2 Propositions: The Raw Material of Reasoning

The smallest unit of reasoning is the proposition — a statement that can be asserted as true or false. "It is raining today" is a proposition. "Raining" is not, because it does not assert anything. "This sentence is false" looks like a proposition, but it creates a self-referential paradox — we set it aside for now; Chapter 15 will confront it directly.

Propositions are denoted by capital letters P,Q,R,. Complex propositions are composed of simple propositions through logical connectives:

SymbolReadingMeaning
¬Pnot Pthe negation of P
PQP and Qboth hold
PQP or Qat least one holds
PQif P then Qwhen P holds, Q must hold
PQP if and only if Qthe two are true or false together

The meanings of these symbols are given by precise rules, not by convention. PQ does not mean "P causes Q," does not mean "P temporally precedes Q," but only: there is no case where P is true and Q is false. This definition may be colder than your intuition — "If the moon is square, then 2 + 2 = 5" is, under this definition, a true proposition — but it is precisely this coldness that makes mathematical reasoning possible.

Logical connectives are not describing causation, but constraining the legality of inference. The question of PQ is: if you accept P, are you forced to also accept Q? Only when P is true and Q is false is the answer negative — so that is the only case to exclude. When the antecedent is false, the implication imposes no constraint, and the whole is regarded as true. The cost is that "true proposition" no longer equals "meaningful proposition." Mathematicians consider this cost worth paying.


14.3 Formal Systems: Setting the Rules of the Game for Reasoning

With the language of propositions in hand, the next step is to say clearly what kinds of inferences are legal. This is the work of a formal system.

A formal system F consists of four parts:

Language L: specifies which strings of symbols are legal propositions. Just as Chinese grammar specifies which strings of characters are legal sentences, the grammar of a formal language specifies which symbol strings are legal formulas.

Axioms A: a batch of propositions directly accepted as true, requiring no proof. For example, in propositional logic, P(QP) — no matter what P and Q are, this formula always holds. Axioms are the starting point of the entire system.

Inference rules R: the legal steps for deriving new propositions from existing ones. The most basic one is called Modus Ponens: if you know PQ, and also know P, then you may conclude Q. Written in horizontal-line form:

PQPQ

The premises are above the line, the conclusion is below. This is not describing what is "true," but stipulating what kinds of inferential steps are permitted.

Proof: a finite sequence that connects axioms and inference rules. A proof is a sequence of propositions A1,A2,,An, where each Ai is either an axiom or the result of applying some inference rule to earlier propositions. The last proposition An is the theorem established by this proof.

When we write FA — read as "A is provable in F" — it means there exists such a proof sequence, ending in A.

and : two similar-looking symbols with sharply different meanings These two symbols are core symbols that appear frequently in this volume; confusing them will make everything hard to understand:

  • (single horizontal line, turnstile): syntactic provability. FA means "there exists a formal proof, starting from axioms and proceeding step by step according to rules, that derives A." This is the result of pure symbol manipulation, having nothing to do with "truth."
  • (double horizontal line): semantic truth. A means "A is true under all possible interpretations" (a tautology). This is a statement about meaning.

Analogy: is "derived by a machine according to rules," is "indeed holds in the real world."

Soundness: AA — whatever can be proved is true.
Completeness: A⇒⊢A — whatever is true can be proved.

These two directions are not automatically true; both require proof.


14.4 Proofs Are Syntactic Objects

Here there is a crucial philosophical stance worth pausing to consider.

In a formal system, a proof is a syntactic object — a sequence of symbol strings that can be mechanically verified. Verifying a proof does not require understanding what it is "saying"; it only requires checking whether each step legally applies some inference rule.

What does this mean? It means that a machine — one that does not understand mathematics, does not know what "truth" means, and can only do symbol matching — can verify the correctness of any formal proof. Modern theorem-proving assistants (Lean, Coq, Isabelle) work exactly this way: you write the proof, the machine checks the symbols.

Note the asymmetry here. Verifying a proof is mechanical, doable in polynomial time — one need only check step by step whether each inference rule is legally applied. Discovering a proof is another matter: there is no general algorithm that can guarantee finding a proof under arbitrary circumstances (Chapter 15 will explain why this is impossible). This asymmetry runs through the entire theory of computability and is one of the sources of confusion around the "NP problem."

This is not the end of mathematics, but a deep property of mathematics: mathematical truths can be mechanically verified. We need not rely on authority, need not trust intuition — we need only check symbol strings.

Of course, the process of thinking up a proof — discovering the correct inference path — is not mechanical, and perhaps never will be. But verification is mechanical, and this point alone is deep enough.

It is easy to drift here. "A machine can verify proofs" does not equal "a machine understands proofs." Verification is pattern matching on symbol strings, which has no relation to "understanding." If you are going to praise "mechanically verifiable" as the glory of mathematics, you must first think clearly about what exactly you are praising. What is verified is form, not meaning.


14.5 Context: The Bookkeeping of Assumptions

Real reasoning rarely starts from zero. Usually we say: "Assume P holds, then we can derive…" This kind of reasoning with assumptions requires a place to record the assumptions, called a context, denoted by Γ.

ΓA is read: under the set of assumptions Γ, A can be derived.

For instance, {PQ, P}Q says: assuming "if P then Q" and "P," one can derive Q.

Context looks like a bookkeeping tool, but its behavior patterns determine the properties of the entire logical system. There are three basic rules about context, usually running silently in the background, to the point that most people never realize they exist:

Weakening: if A can be derived under assumptions Γ, then with more assumptions, A can still be derived — extra assumptions do no harm. This is very intuitive, but what it says is: the assumption set can contain things that are never used.

Contraction: the same assumption can be used arbitrarily many times. You know P; you can use it once, or use it a hundred times, with no loss whatsoever.

Exchange: the order of assumptions does not matter. {P,Q} and {Q,P} are the same.

These three rules are taken for granted in classical logic — resources are unlimited, order irrelevant. But Chapter 16 will remove the contraction rule, and this change will fundamentally alter the nature of the entire logic: each assumption can be used exactly once, and reasoning becomes resource management.

Weakening, Contraction, Exchange: sounds boring, but Chapter 16 will destroy them

These three structural rules (Structural Rules) are the hidden foundation of classical logic; the vast majority of logic textbooks never explicitly name them, because they "obviously hold":

  • Weakening: extra premises cause no trouble. Knowing "it rained" and "the wallet is at home," one can derive "the ground is wet" — even though "the wallet is at home" is completely useless.
  • Contraction: the same premise can be reused infinitely. Knowing "I have five dollars," one can use this piece of knowledge arbitrarily many times.
  • Exchange: the order of premises does not matter.

Why list them separately? Because Chapter 16's linear logic will remove contraction — suddenly, "five dollars can only be used once," and reasoning becomes resource management. The existence of these three rules is precisely to make the contrast clearer when they are removed in Chapter 16.

Most people study logic for ten years and never realize they have been silently using these three rules all along. This is not extra knowledge — it is the foundation of your reasoning. You don't even know what your foundation is, yet you build buildings on top of it — the things you build, you yourself don't know why they can stand. The absence of names does not mean their absence.

Because they are not "logic itself," but meta-choices about logic — choose them, and you get classical logic; reject a certain one among them, and you get a different logical system. Most math classes never mention these three rules at all, because in the classical context they are never violated. But it is precisely this "taken for granted" that conceals the optionality of logic. One of the things a formal system can do is make these hidden choices visible.


14.6 Semantics: The Meaning Behind the Symbols

A formal system manipulates symbols at the syntactic level. But we usually hope that these symbols "have meaning" — corresponding to some real world or mathematical structure.

The way to assign meaning to propositions is called a valuation: mapping each propositional variable to true or false. Under a given valuation, the truth value of a compound proposition is determined recursively by the rules for the connectives — PQ is true only when both P and Q are true, PQ is true when P is false or Q is true, and so on.

If a proposition is true under all valuations, it is called a tautology, written A. This is a concept at the semantic level — A holds no matter what values the propositional variables take.

Now there are two levels to distinguish:

  • FA: at the syntactic level, A has a formal proof
  • A: at the semantic level, A is a tautology

The relationship between these two levels is the core problem of formal logic:

Soundness: if FA, then A. The system proves only true things; it does not lie.

Completeness: if A, then FA. All true things can be proved by the system; nothing is missed.

The standard formal system of propositional logic satisfies both properties. First-order predicate logic does too — Gödel proved this in 1930, which was the last time he brought good news. One year later, he brought something completely different.


14.7 The Stronger the System, the Greater the Cost

Different formal systems have different expressive power, roughly ordered from weak to strong as follows:

Propositional logic: deals only with propositions and connectives, cannot talk about "all" and "exists." Simple enough to be mechanically decidable — there exists an algorithm that, for any proposition, determines in finitely many steps whether it is a tautology.

First-order predicate logic: adds quantifiers (for all) and (exists), can express statements like "every natural number has a successor." Complete, but undecidable — there is no general algorithm for deciding whether a first-order proposition is provable.

Peano Arithmetic (PA): adds axioms of the natural numbers on top of first-order logic, powerful enough to express nearly all statements of elementary number theory. It is at this level that Gödel's bad news arrives.

Set Theory (ZFC): the universal language of modern mathematics; nearly all mathematical objects can be encoded and studied within it.

This hierarchy has a pattern: the stronger the system, the greater its expressive power, but the smaller its capacity for self-verification. Propositional logic is simple enough to exhaustively check itself; Peano Arithmetic is powerful enough that it cannot even prove its own consistency. Power has its price. That price is precisely the bill that the next chapter will reckon clearly.

Intuitively: a sufficiently strong system can talk about itself, and once it can talk about itself, it can construct paradoxical statements about itself. Weak systems cannot express "my own propositions," so they do not encounter this problem. This is like language — the more expressively rich a language, the more prone it is to self-referential paradoxes; simple formal languages are safe precisely because they cannot self-refer. What Gödel discovered is not a flaw in mathematics, but the intrinsic cost of expressive power.


14.8 Hands-on: Write Your First Formal Proof

Before continuing, stop and do one thing: write a real formal proof sequence. Not "understand what it is saying," but actually write it out, noting for each line where it comes from.

This will make everything that follows feel different.


We use a minimal system of propositional logic, with only two axioms (this is Łukasiewicz's simplified version, sufficient for our practice):

Axiom schema L1: P(QP)

Axiom schema L2: (P(QR))((PQ)(PR))

Only inference rule: Modus Ponens: from P and PQ, derive Q.

Axiom schema means: substituting arbitrary propositional variables for P,Q,R yields results that are all axioms. For example, A(BA), A((AB)A), (CC)(CC), all of these are instances of L1.


Goal: prove PP (a proposition implies itself).

This seems obvious — but in this system, it is not an axiom; it must be derived from the axioms.

Here is a scaffold; try to fill in the source of each line (axiom instance, or MP applied to which two lines):

LinePropositionSource
1P((PP)P)L1, with Q taken as PP
2(P((PP)P))((P(PP))(PP))?
3(P(PP))(PP)? (MP on lines 1, 2)
4P(PP)?
5PP? (MP on lines 3, 4)

Line 2 is an instance of L2 — figure out what values P,Q,R take to produce this line. Line 4 is an instance of L1 — similarly, find the corresponding P,Q.

After filling it in, you will have completed a complete formal proof: five lines, each with a clear source, depending on no intuition, relying solely on symbol manipulation.

The significance of this proof lies not in the conclusion PP, but in your firsthand experience of what "a proof is a syntactic object" means — each step is a mechanical symbol check, with no "obvious" anywhere.

Returning to the axiom stamp game at the start of this chapter: you have just personally stamped five stamps with your own hands. What is truly worth remembering is not the final line PP, but how the rules of the game rewrite "I feel it is obvious" into "I can point out where each line comes from." The coldness of formal systems is precisely here, and it also reveals its democratic character: anyone can check whether your stamps are legal; no one needs to believe in your genius, your credentials, or your intuition.

But precisely because of this, a sharper question surfaces: if all legal moves are pre-scripted by the rule sheet, then is the rule sheet itself reliable? Might it stamp out false things? Might it miss true things? These are the two walls that the next chapter must face.


Unresolved

Can formalization capture all reasoning? Mathematicians rely on a great deal of informal intuition in their actual work — a certain construction "should" work, a certain proof "obviously" can be generalized. Can these intuitions be formalized? Or does a formal system inherently miss some ineffable dimension of reasoning?

Is the choice of axioms objective? Different sets of axioms yield different mathematical universes. The Axiom of Choice — the most controversial axiom in ZFC — is not "obviously" correct; mathematicians have debated it for a century. Formalization has made this debate clearer, but has not resolved it.

Is this machine sound? Is it complete? We have built a formal system, described its language, axioms, and inference rules, and distinguished syntax from semantics. But there is one question we have never directly answered: are the things it proves all true? Can it prove everything that is true? The answers to these two questions — one comforting, the other shocking. Chapter 15 confronts them directly.


Exercises

★ Warm-up

Which of the following symbol strings are legal propositional formulas? Which are not? No need to give reasons, just judge.

  1. PQR
  2. ¬¬P
  3. PQ
  4. (PQ)(¬Q¬P)
  5. P

(Hint: legal propositional formulas are composed of propositional variables, connectives, and parentheses according to grammatical rules. Connectives are binary or unary operators and cannot appear alone at the end.)


★★ Derivation

Which of the following three assertions hold? State the reasons clearly in natural language, and indicate which concept from this chapter each assertion depends on (inference rules, weakening, contraction, or the deduction theorem).

  1. If ΓPQ, then ΓP.
  2. If ΓP and ΓQ, then ΓPQ.
  3. If Γ,PQ, then ΓPQ.

The third is called the deduction theorem. It says: being able to derive Q under the assumption P is equivalent to being able to derive "P implies Q" without that assumption. The significance of this rule is: assumptions can be "absorbed" into the implication connective. Think clearly about whether this absorption happens at the syntactic level or the semantic level.


★★★ Challenge

The deduction theorem is a meta-theorem about a formal system, not an axiom or inference rule inside the system. Try using the language introduced in this chapter (, context Γ, inference rules) to write out precisely what the difference is between "meta-theorem" and "theorem inside the system."

Further: can the deduction theorem itself be proved inside some formal system? Or must it be established outside the system, using mathematical induction on the length of proofs? What connection exists between the answer to this question and the fact in Chapter 15 that "a system cannot prove its own consistency"? No need to solve it — only need to use the language of this chapter to state the question clearly.