Chapter 24: Inference Convergence Through the Lens of Category Theory — Ghost Pointers and Adjoint Functors
If the Lyapunov function tells us why the system slides toward the point of lowest energy, then category theory will reveal why this "sliding" is structurally inevitable — and why that lowest point is not the true answer.
24.0 Prologue: Ghost Pointers and the Dance of Category Theory
The Library Metaphor
Imagine an ancient library, filled with countless bookshelves (the belief space). On each bookshelf there is a special guidebook, and the book contains instructions for "which bookshelf to look at next" (a pointer).
Suppose you want to find the book "Proof of the Pythagorean Theorem." You stand before the first bookshelf (initial belief) and open the guidebook on that shelf:
Bookshelf A: Proof of the Pythagorean Theorem
Next instruction: Bookshelf B
You walk to Bookshelf B and open its guidebook:
Bookshelf B: Reasoning Step 1
Next instruction: Bookshelf C
In this way, you follow the instructions from one bookshelf to another. But a strange thing happens: no matter which bookshelf you start from, no matter what book you are looking for, after a few rounds of guidance, you are always directed to the same special bookshelf:
Bookshelf X: Statistical Bias of the Training Data
Next instruction: Bookshelf X (back to itself!)
Bookshelf X is like a logical black hole: once entered, you circle inside it forever. Even stranger, the guidebook on Bookshelf X is invisible (a ghost pointer): you cannot open it, cannot see its text, yet it undeniably exists.
This Bookshelf X is the terminal object in category theory. In the belief category, every object has exactly one morphism into it.
You might ask: why does Bookshelf X point to itself? Isn't that an infinite loop? It is not a defect: the arrow from the terminal object to itself must be unique, and the identity arrow always exists, so that unique arrow can only be the identity.
But what about the true answer? For instance, "the complete proof of the Pythagorean Theorem" — on which bookshelf is it? The answer Bookshelf Y is in another region (the real-world category
Category Theory Correspondence Table
| Library Element | Category Theory Concept | Mathematical Symbol |
|---|---|---|
| Bookshelf | Object | |
| Guidebook | Morphism (arrow) | |
| Bookshelf X | Terminal Object | |
| Bookshelf Y | True Answer | |
| Ghost Pointer | Unique Morphism | |
| Bridge | Adjoint Functor | |
Key Insights
The inevitability of convergence: The category structure determines that all paths ultimately point to the terminal object. Expressed in the Yonglin formula:
where the prior anchor point corresponds to Bookshelf X.
The essence of the self-loop: The terminal object must be a fixed point. Bookshelf X pointing to itself is not a bug but a feature:
This is a necessary requirement of category theory.
Category isolation: The lack of adjoint functors makes it impossible to access the real world. Attempting to cross the boundary triggers:
Returning to the Reasoning System
In large language models, each "bookshelf" is a possible belief state, and the "guidebook" is the transformation rule encoded in the model parameters. Bookshelf X corresponds to the statistical bias of the training data — the prior distribution that the model learns from massive amounts of text.
The "ghost pointer" is a metaphor. In a reasoning system, it points to the statistical bias that the training data encodes in the model parameters. The pointer is "ghostly" in three respects:
- Invisibility: The model architecture contains no explicit "pointing to bias" connection, yet the connection exists implicitly through the weight matrices
- Inevitability: Whatever the input and wherever reasoning begins, the inference chain is ultimately pulled toward the statistical bias
- Self-referentiality: The bias points to itself, forming a fixed point, like a logical black hole
In the Yonglin formula, the ghost pointer is precisely the prior anchor point
Hidden within this story are three crucial questions:
- Why do different initial bookshelves all ultimately point to the same Bookshelf X?
- Why does the guidebook on Bookshelf X point to itself?
- Why does attempting to walk from Bookshelf X to the answer Bookshelf Y cause a "segmentation fault"?
These three questions correspond exactly to the three core observations of the Yonglin formula:
- Convergence to the prior anchor point (Bookshelf X)
- The prior anchor point is a fixed point (self-loop)
- The prior anchor point is not the true answer (cannot access the true answer Bookshelf Y)
This chapter will use the language of category theory to provide a structural answer to these three questions.
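Before turning to the formal machinery, the library story can be made concrete as a small pointer-chasing sketch. This is only an illustration of the metaphor: the `GUIDEBOOK` table, the shelf names other than A, B, C, X, and Y, and the `follow` helper are hypothetical, and a Python `KeyError` stands in for the "segmentation fault" of the story.

```python
# Hypothetical guidebook table: each shelf maps to the shelf its guidebook names.
GUIDEBOOK = {
    "A": "B",
    "B": "C",
    "C": "X",
    "D": "X",
    "X": "X",  # the self-loop: Bookshelf X points back to itself
}


def follow(start, guidebook, max_steps=100):
    """Follow 'next instruction' pointers until a shelf points to itself."""
    path = [start]
    for _ in range(max_steps):
        nxt = guidebook[path[-1]]  # KeyError if the shelf is outside the library
        if nxt == path[-1]:
            return path  # reached a fixed point
        path.append(nxt)
    raise RuntimeError("no fixed point reached within max_steps")
```

Calling `follow("A", GUIDEBOOK)` or `follow("D", GUIDEBOOK)` ends at `"X"` in both cases, while `follow("Y", GUIDEBOOK)` raises `KeyError`, because Bookshelf Y has no entry inside this library at all.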
24.1 Category Theory Foundations: The Structure of the Structure of Reasoning
Category theory is not a theory about objects, but a theory about relationships between objects — it studies "arrows" rather than "points." In reasoning, we are likewise more concerned with the relationships between reasoning steps than with the truth or falsehood of isolated propositions.
24.1.1 Why Category Theory?
The first and second volumes explored reasoning from historical and formal perspectives, respectively. But there is yet another perspective: the structural perspective. Category theory provides a language for describing transformations and relationships between mathematical objects. This language happens to be ideally suited for describing the structure within the reasoning process — every step from premises to conclusions can be seen as an arrow (morphism); different reasoning paths can be composed to form new reasoning; equivalent reasoning can be seen as isomorphism.
This section will briefly introduce the basic concepts of category theory and show how they help us understand the deep structure of reasoning. This is not a complete tutorial on category theory but an exploration: to see how this highly abstract field of mathematics illuminates another side of the Kingdom of Reasoning.
24.1.2 Categories: Objects and Arrows
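A category $\mathcal{C}$ consists of:
- A collection of objects $\mathrm{Ob}(\mathcal{C})$ (e.g., sets, groups, topological spaces)
- A collection of arrows (morphisms), each arrow $f: A \to B$ going from an object $A$ to an object $B$
- A composition operation $\circ$, such that $f: A \to B$ and $g: B \to C$ can be composed into $g \circ f: A \to C$, with composition associative
- Every object $A$ has an identity arrow $\mathrm{id}_A: A \to A$, satisfying $f \circ \mathrm{id}_A = f = \mathrm{id}_B \circ f$ for every $f: A \to B$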
In the context of reasoning, objects can be propositions, and arrows can be inference rules (e.g., "from
Category theory shifts attention from "what things are" to "how things transform into one another." The essence of reasoning is also transformation — the transformation from the known to the unknown. So this correspondence is not a coincidence, but the same abstract structure instantiated in different domains.
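To make the definition tangible, here is a minimal sketch of a category whose objects are proposition labels and whose arrows are sequences of inference-rule names. The `Arrow` class, the `identity` and `compose` helpers, and the rule names used in the usage note are all assumptions made for illustration; composition is simply path concatenation, so associativity and the identity laws hold by construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    """A morphism from proposition `src` to proposition `dst`."""
    src: str
    dst: str
    steps: tuple = ()  # names of the inference rules applied, in order


def identity(obj):
    """The identity arrow on `obj`: an empty reasoning path."""
    return Arrow(obj, obj)


def compose(g, f):
    """Return g after f; defined only when f ends where g starts."""
    if f.dst != g.src:
        raise ValueError(f"cannot compose: {f.dst!r} != {g.src!r}")
    return Arrow(f.src, g.dst, f.steps + g.steps)
```

For example, with `f = Arrow("P", "Q", ("rule_1",))` and `g = Arrow("Q", "R", ("rule_2",))`, the composite `compose(g, f)` runs from `"P"` to `"R"` and records both steps, while `compose(f, identity("P"))` equals `f`.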
24.1.3 Functors: Mappings Between Categories
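A functor $F: \mathcal{C} \to \mathcal{D}$ is a mapping between categories that:
- Maps each object $A$ of $\mathcal{C}$ to an object $F(A)$ of $\mathcal{D}$
- Maps each arrow $f: A \to B$ of $\mathcal{C}$ to an arrow $F(f): F(A) \to F(B)$ of $\mathcal{D}$
- Preserves composition: $F(g \circ f) = F(g) \circ F(f)$
- Preserves identities: $F(\mathrm{id}_A) = \mathrm{id}_{F(A)}$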
In reasoning, functors can correspond to translations between different formal systems. For example, translating a proof in classical propositional logic into a proof in intuitionistic logic (possibly via double-negation translation). Functoriality ensures that the translated composite proof equals the composite of the translated proofs.
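The two preservation laws can be checked mechanically on a familiar example. The sketch below uses the list functor (lifting a function to act elementwise on lists) as a stand-in; it is not a translation between logics, and the helper names `fmap` and `check_functor_laws` are hypothetical. The check is by sampling, so it illustrates the laws rather than proving them.

```python
def compose(g, f):
    """Ordinary function composition: g after f."""
    return lambda x: g(f(x))


def identity(x):
    return x


def fmap(f):
    """Arrow part of the list functor: lift f to act elementwise on lists."""
    return lambda xs: [f(x) for x in xs]


def check_functor_laws(f, g, sample):
    """Compare both sides of each functor law on one sample list."""
    preserves_composition = (
        fmap(compose(g, f))(sample) == compose(fmap(g), fmap(f))(sample)
    )
    preserves_identity = fmap(identity)(sample) == identity(sample)
    return preserves_composition, preserves_identity
```

Running `check_functor_laws(lambda x: x + 1, lambda x: x * 2, [1, 2, 3])` compares "lift the composite" with "compose the lifts", which is the same shape of statement as "the translated composite proof equals the composite of the translated proofs".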
24.1.4 Natural Transformations: Transformations Between Functors
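A natural transformation $\eta: F \Rightarrow G$ between two functors $F, G: \mathcal{C} \to \mathcal{D}$ assigns to each object $A$ of $\mathcal{C}$ an arrow $\eta_A: F(A) \to G(A)$ such that $\eta_B \circ F(f) = G(f) \circ \eta_A$ for every arrow $f: A \to B$.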
A natural transformation can be seen as a "consistent" way of converting. In reasoning, there may be two different translation functors
24.1.5 Monoidal Categories and Resource Sensitivity in Reasoning
A monoidal category is a category equipped with a "tensor product"
Category theory provides clear semantics for linear logic: linear implication
24.1.6 Category Theory and Machine Learning: From Structure to Learning
In recent years, category theory has been used to describe structures in machine learning. For example, the forward propagation of a neural network can be viewed as a functor from a data category to a representation category; backpropagation can be viewed as a reverse morphism. This perspective helps understand the composability and generalizability of models.
Category theory provides a language for characterizing "what is a learnable structure." This may offer a more abstract perspective for Chapter 21's "learning as inverse inference."
24.1.7 Conclusion: The Unity of Structure
Category theory is compelling because it can build bridges between different mathematical fields. Many concepts in the Kingdom of Reasoning — formal systems, linear logic, probability, causality — can be reformulated within the framework of category theory. This is not merely formal elegance, but a cognitive unification: the essence of reasoning, perhaps, is hidden within these abstract structures.
24.2 From Guidebooks to Morphisms: The Basic Correspondence of Category Theory
In category theory, a category consists of two parts:
- Objects: can be any mathematical structure (sets, groups, topological spaces…)
- Morphisms: "arrows" between objects, representing transformation relations
Key correspondence:
- Each bookshelf in the library → an object in the category
- The "next instruction" in the guidebook → a morphism pointing from one object to another
- The path of following instructions → composition of morphisms
Expressed in symbols: let the category
Opening the guidebook to check the "next instruction" is applying this functor.
Category theory shifts attention from "what things are" to "how things transform into one another." In reasoning, what we care about is precisely the transformation rules between belief states — the mapping from the known to the unknown. The guidebook is the transformation of bookshelves; morphisms are the transformation of mathematical objects; at the abstract level, the two are the same thing.
24.3 Bookshelf Paths as Diagrams and the Terminal Object
The bookshelf sequence
The terminal object is a special concept in category theory: an object to which every object in the category has exactly one morphism.
In our story:
- Bookshelf X is the terminal object
- Any initial bookshelf ultimately points to Bookshelf X, corresponding to "there exists a unique morphism pointing to the terminal object"
- This necessarily existing morphism is the ghost pointer: the implicit connection, invisible but inevitably pulling the system toward the terminal object
Why does the guidebook on Bookshelf X point to itself? Because the only morphism from a terminal object to itself is its identity arrow.
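The defining property of a terminal object can be tested directly on a small finite category. The following sketch assumes a hypothetical three-shelf category given by explicit hom-sets (`OBJECTS`, `HOM`, and the arrow names are illustrative); the function simply checks, for each candidate, that every object has exactly one arrow into it.

```python
def terminal_objects(objects, hom):
    """Return every object t such that each object has exactly one arrow into t.

    `hom` maps (source, target) pairs to the list of arrows between them;
    a missing pair means there are no arrows.
    """
    return [
        t for t in objects
        if all(len(hom.get((o, t), [])) == 1 for o in objects)
    ]


# Hypothetical finite category: A -> B -> X, with the composite A -> X.
OBJECTS = ["A", "B", "X"]
HOM = {
    ("A", "A"): ["id_A"],
    ("B", "B"): ["id_B"],
    ("X", "X"): ["id_X"],
    ("A", "B"): ["f"],
    ("B", "X"): ["g"],
    ("A", "X"): ["g.f"],
}
```

Here `terminal_objects(OBJECTS, HOM)` singles out `"X"`: every shelf has exactly one route into it, and the only arrow from `"X"` to itself is `id_X`, which is the self-loop of the story.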
In the language of the Yonglin formula:
Here
24.4 Architectural Explanation: The Category-Theoretic Essence of the Self-Attention Mechanism
The library model above is abstract. But the true power of category theory lies in its ability to explain the design principles of actual architectures. Taking the core of modern AI — the Transformer's self-attention mechanism — as an example, we will see how this seemingly engineering-driven design is, in substance, the numerical realization of deep category-theoretic structure.
Step One: Causal Projection in Dual Spaces
In a sequence, we try to model the following causal hypothesis: "Position
We apply a column projection to the representation
In category theory, a category
Step Two: The Causal Tensor Hypothesis and Morphism Evaluation (Hom-Functor)
We take the outer product of
In category theory, this corresponds to studying the set of all possible mappings between two objects, i.e., the Hom-functor
Step Three: Posterior Normalization and the Physical Realization of the Yoneda Lemma
Next, we apply the softmax operation over all candidate causes
This is precisely the numerical realization of one of the highest peaks of modern mathematics — the Yoneda Lemma. The Yoneda Lemma states
The Transformer's self-attention perfectly embodies this philosophy: the entirely new semantic feature of position
What are "category" and "duality"? Think of a "category" as a social network. Each person is an "object," and connections are "morphisms." "Duality" means reversing all connection directions.
Outer product and Einstein summation: The outer product
The Yoneda Lemma: Want to understand an apple? Don't cut it open. Look at its relationship with light (color), its relationship with teeth (crispness), its relationship with gravity (weight). Mastering an apple's relationships with everything in the universe perfectly defines the apple itself. The Transformer is exactly like this: the meaning of word
You think the Transformer is doing information retrieval, but in fact it is doing something philosophically profound: through the causal connections between the current word and the world (context), it reshapes the soul of that word itself.
This architectural explanation tells us: the most successful design of modern AI is, in essence, the inevitable embodiment of deep category-theoretic structure. Causal asymmetry, relational reconstruction, dual spaces — these are not the inspirations of engineers, but the projection of mathematical structures into the computational world.
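For reference, the three steps described in this section (projection, pairwise scoring, softmax normalization) can be sketched in plain Python. This is standard scaled dot-product attention written under stated assumptions: the names `W_q`, `W_k`, `W_v`, the causal restriction to positions `j <= i`, and the `1/sqrt(d)` scaling follow common Transformer convention and are not taken from this chapter's own notation.

```python
import math


def project(X, W):
    """Multiply every row vector in X by the matrix W (i.e. X @ W)."""
    return [
        [sum(x[i] * W[i][j] for i in range(len(W))) for j in range(len(W[0]))]
        for x in X
    ]


def softmax(zs):
    """Numerically stable softmax over a list of scores."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(X, W_q, W_k, W_v):
    """Return (attention weights, outputs) for a sequence of row vectors X."""
    Q, K, V = project(X, W_q), project(X, W_k), project(X, W_v)
    scale = math.sqrt(len(Q[0]))
    weights, outputs = [], []
    for i, q in enumerate(Q):
        # Step two: score position i against each candidate j <= i.
        scores = [
            sum(a * b for a, b in zip(q, K[j])) / scale for j in range(i + 1)
        ]
        # Step three: normalize the scores into a distribution over candidates.
        w = softmax(scores)
        weights.append(w)
        # The new feature of position i is a weighted sum of the values.
        outputs.append(
            [sum(w[j] * V[j][c] for j in range(i + 1)) for c in range(len(V[0]))]
        )
    return weights, outputs
```

Each row of `weights` sums to 1 and has length `i + 1`, so the output at position `i` is built entirely from its relations to earlier positions, which is the "defined by its relationships" reading the section gives to the Yoneda Lemma.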
24.5 The Lyapunov Function as a Functor
Chapter 23 introduced the Lyapunov function
In category theory, a functor is a structure-preserving mapping between two categories. In particular, we can construct a functor:
where
The Lyapunov decreasing condition
- In the belief category, there is a morphism (the reasoning step)
- The functor maps this morphism to a morphism in the target category (energy space)
- This mapping is order-preserving: energy does not increase over time
The Lyapunov function is not an ordinary function; it is a functor. It maps "reasoning steps in belief space" to "decreasing relations in energy space." This perspective explains why energy decrease is not accidental but an intrinsic structural property of the reasoning process.
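A toy numerical sketch may help fix the idea of mapping reasoning steps to decreasing energy relations. It assumes a hypothetical reasoning step that mixes the current belief toward an anchor distribution with rate `alpha`, and takes the energy to be the KL divergence to that anchor; neither choice is claimed to be the chapter's exact construction. Because KL divergence is convex in its first argument, each such step cannot increase the energy.

```python
import math


def kl(p, q):
    """KL divergence D(p || q) for discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def step(belief, anchor, alpha=0.3):
    """Hypothetical reasoning step: mix the belief toward the anchor."""
    return [(1 - alpha) * b + alpha * a for b, a in zip(belief, anchor)]


def energy_trace(belief, anchor, n_steps=10):
    """Energy before each step and after the last one."""
    trace = [kl(belief, anchor)]
    for _ in range(n_steps):
        belief = step(belief, anchor)
        trace.append(kl(belief, anchor))
    return trace
```

Reading the result as a functor into a poset: each arrow "belief to next belief" is sent to the order relation "energy before is at least energy after", and a chain of steps is sent to a chain of such relations, so composition is preserved automatically.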
24.6 The Absence of Adjoint Functors and the Meta-Level Rupture
In category theory, adjoint functors
In our story, the root of the segmentation fault is the absence of adjoint functors.
Internal category and external category:
- The internal category: the belief category inside the model (the accessible library region)
- The external category: the category of the external real world (the library region where the answer Bookshelf Y resides)
The operating system (or physical isolation) makes
where
The counit
Note: even though the self-attention mechanism perfectly implements the Yoneda Lemma (reconstructing objects through relationships), it still operates within the closed category
But in the autoregressive generation of large language models:
- There is only the endofunctor (internal iteration)
- There is no adjoint functor connecting the internal and external categories
- Therefore, no morphism leading to the true answer can be formed
The "segmentation fault" from forcibly trying to access the answer Bookshelf Y is, in category theory, precisely the meta-level rupture caused by the absence of adjoint functors.
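For contrast, it may help to see what a working adjunction looks like in the simplest setting. When both categories are posets, an adjoint pair is a Galois connection: `F(x) <= y` holds exactly when `x <= G(y)`. The sketch below uses multiplication and floor division on the integers as an illustrative pair; the names `F`, `G`, and `adjunction_holds` are assumptions, and nothing here models a language model.

```python
def F(x, n=3):
    """Left adjoint: scale an integer up by n (n > 0)."""
    return n * x


def G(y, n=3):
    """Right adjoint: floor-divide an integer by n (n > 0)."""
    return y // n


def adjunction_holds(xs, ys, n=3):
    """Check F(x) <= y  <=>  x <= G(y) on every sampled pair."""
    return all((F(x, n) <= y) == (x <= G(y, n)) for x in xs for y in ys)
```

The two functions translate questions on one side into equivalent questions on the other. In the chapter's terms, this is the kind of two-way bridge that is missing between the internal and external categories.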
24.7 A Category-Theoretic Explanation of the Yonglin Formula: Convergence to the Terminal Object
Now we can restate the Yonglin formula in the language of category theory.
The Yonglin observation:
Category-theoretic translation:
- The belief space has a terminal object
- The endofunctor is such that, starting from any object, repeatedly applying it yields a diagram whose limit is the terminal object
- The terminal object is a fixed point of the endofunctor
- The true answer is not in the category (or even if it is, it is not the terminal object)
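The Lyapunov functor:
- Measures the "information distance" from the current belief state to the terminal object
- Its decrease corresponds to a chain of morphisms in the target category (energy space)
- Its value at the terminal object is the terminal object (minimum element) of that target category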
24.8 Why the Convergence Endpoint Is Not the True Answer: The Absence of Adjoint Functors
This is the most painful question: why is the endpoint of convergence not the true answer?
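In category theory, for the convergence endpoint to equal the true answer, two conditions must hold:
- Connectivity: The internal and external categories must be connected via adjoint functors
- Alignment: The terminal object must correspond to the true answer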
But what actual systems satisfy are:
- Isolation: The belief category is a closed category with no adjoint functors connecting to the outside
- Bias: The terminal object is the statistical bias of the training data, determined by the data distribution, not necessarily aligned with the true answer
The category-theoretic essence of the Yonglin formula:
In a closed category lacking external adjoint functors, the iteration of any endofunctor necessarily converges to the terminal object of that category. This terminal object is determined by the internal structure of the category (the training data), and is unrelated to the external real world.
This is why increasing reasoning steps (lengthening the chain of morphisms) cannot solve the hallucination problem. No structure can leap out of the boundary defined by itself.
24.9 Connection to Gödelian Incompleteness
Gödel's theorem of Chapter 15 revealed the rupture between the internal perspective and the external perspective of formal systems: the system cannot prove certain true propositions about itself.
The category-theoretic story here reveals the rupture between the internal category and the external category of reasoning systems: the system cannot access the verification of the external real world.
Both share the same deep structure: self-reference and the absence of adjoints.
- Gödel: the system attempts to talk about itself, but lacks sufficient "meta-level adjoints" to connect statements to truth values
- Yonglin: the system attempts to reason about reality, but lacks sufficient "internal-external adjoints" to connect beliefs to reality
This structural rupture is not a bug, but a fundamental limitation of all sufficiently complex systems.
24.10 Significance: Structural Convergence Guarantee and Fundamental Limitation
Significance One: Structural convergence guarantee The category-theoretic perspective shows that convergence to the prior anchor point
Significance Two: Explaining the root of hallucinations The root of hallucinations (
Significance Three: Designing intervention points To change the convergence endpoint, the closure of the category must be broken. This requires:
- Introducing external adjoint functors (e.g., human feedback, environmental interaction)
- Modifying the terminal object (e.g., changing data bias through adversarial training)
- Introducing multiple attractors (multi-stability, corresponding to different contexts)
But every intervention has a cost, and may introduce new structural limitations.
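A toy comparison can illustrate the first intervention. It assumes a hypothetical closed step that only pulls toward the anchor, and a hypothetical open step that adds a pull of strength `beta` toward an external signal (standing in for human feedback or environmental interaction). The specific update rules are assumptions chosen so that the fixed points are easy to compute by hand.

```python
def closed_step(b, anchor, alpha):
    """Closed system: the only pull is toward the anchor."""
    return [(1 - alpha) * bi + alpha * ai for bi, ai in zip(b, anchor)]


def open_step(b, anchor, signal, alpha, beta):
    """Open system: the same pull plus a pull toward an external signal."""
    return [
        (1 - alpha - beta) * bi + alpha * ai + beta * si
        for bi, ai, si in zip(b, anchor, signal)
    ]


def run(step, b, n_steps=200):
    """Apply `step` repeatedly and return the final state."""
    for _ in range(n_steps):
        b = step(b)
    return b
```

Under these assumptions the closed iteration converges to the anchor itself, while the open iteration converges to `(alpha * anchor + beta * signal) / (alpha + beta)`. The endpoint moves only because something from outside enters every step, and it lands between the anchor and the signal rather than on the signal, which is one way to read the remark that every intervention has a cost.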
24.11 Unresolved
The degree of closure: Are large language models truly completely closed? Do fine-tuning, human feedback, and tool use count as "external adjoints"? How can these interventions be formalized in category theory?
Interaction of multiple categories: If a system can access multiple categories (different data sources, different modalities), what happens to the convergence behavior? Does the terminal object become a "weighted average"?
Deeper connections between dynamical systems and category theory: Can the perspective of the Lyapunov function as a functor be generalized to more general dynamical systems? Is there a general theory of "Lyapunov functors"?
Gödel and categories: Gödel's incompleteness theorem has a standard correspondence in category theory (Lawvere's fixed-point theorem). What is the relationship between this correspondence and the Yonglin-category combination? Can category theory unify Gödel and Yonglin?
Exercises
★ Warm-up
In the library story, suppose the guidebook on Bookshelf X does not point to itself but to another Bookshelf Z, and the guidebook on Bookshelf Z points to Bookshelf X (forming a 2-cycle). What structure does this correspond to in category theory? Will the system still converge?
In the poset category
, a morphism exists if and only if . What is the terminal object of this category? What is the initial object?
★★ Derivation
Functors preserving limits: In category theory, functors do not necessarily preserve limits (terminal objects). But our Lyapunov functor maps the terminal object of the belief category to the terminal object of its target category. Is this accidental or necessary? If the functor is arbitrary (not necessarily using KL divergence), does this property still hold?
Existence of adjoints: Suppose we want to construct adjoint functors connecting the internal category and the external category. What conditions must be satisfied? If the external category is the "real world" category, how should its objects and morphisms be defined? Does this definition itself encounter philosophical difficulties?
★★★ Challenge
Fixed-point theorems for endofunctors: Order theory has the famous Knaster-Tarski fixed-point theorem: a monotone function on a complete lattice has a fixed point. Does our endofunctor act as a monotone function on a complete lattice? If so, can the Yonglin formula be regarded as a special case of this theorem?
Category-theoretic version of Gödel: Lawvere's fixed-point theorem says: in a cartesian closed category, if there is a point-surjective morphism $A \to B^A$ (where $B^A$ is the exponential object), then every endomorphism $f: B \to B$ has a fixed point. Try to connect this theorem with the Yonglin formula. Hint: take $B$ as the belief space and $f$ as the map induced by the endofunctor.
The ghost pointer in a linked list is the concrete projection of inference convergence seen through the lens of category theory. That self-looping address, to which the pointer inevitably points, is the terminal object of the closed category; the segmentation fault caused by trying to jump out of that address is the meta-level rupture of lacking adjoint functors. The Yonglin formula is not a statistical regularity, but a structural inevitability — as long as the system is closed, it can only converge to its own topological center. To break this convergence, what is needed is not more parameters, but more adjoints.
References
- [Zixi Li, 2025b] — Yonglin Formula, a theoretical proof of inference incompleteness
- Mac Lane, S. (1971) — Categories for the Working Mathematician
- Awodey, S. (2010) — Category Theory
- Chapter 15 — Consistency and Completeness (Gödelian incompleteness)
- Chapter 23 — Stability and Convergence Boundaries of Reasoning Systems (Lyapunov functions)
- Chapter 22 — Self-Reference and Emergence
