How to Architect a Harness
Garry Tan, CEO of Y Combinator, outlines an architecture for a harness (a wrapper around a model) that relies almost entirely on skills, not code.
He says that the productivity gap among 1x, 10x, and 100x engineers using AI coding agents has almost nothing to do with model quality. Instead, the difference comes from the architecture wrapped around the model. Tan calls this the harness, and the central claim is that the harness, not the model, determines whether someone gets mediocre results or transformative ones.
The idea is that modern models already know how to reason, synthesize, and write code. Models fail when they do not understand the specific environment they are working in. The harness exists to give the model the right context at the right moment, without overwhelming it with irrelevant information. Tan describes reading the leaked Claude Code source and discovering that the real magic is not in the model but in the wrapper that orchestrates it. Can confirm.
To explain how this works, Tan introduces five concepts: skill files, the harness, resolvers, latent and deterministic work, and diarization. These form the backbone of a system that can scale from simple tasks to complex knowledge work.
1. Skill Files
Skill files are reusable markdown documents that describe a process. They don’t tell the model what to do; they tell it how to do it. The user supplies the goal, and the skill supplies the method. Tan compares a skill file to a method call with parameters. A single skill can be invoked in many different contexts, producing different outcomes depending on the arguments. Tan argues that markdown is a better medium for encoding judgment and process than traditional code, because it matches the model’s native mode of thinking.
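The "skill as method call" idea can be sketched in a few lines of Python. This is a hypothetical illustration, not Tan's implementation: `render_skill_prompt` and the `skills/code-review.md` path are made up, but they show how one markdown file (the method) can be combined with different arguments (the goal) on each invocation.

```python
# Hypothetical sketch: a skill file is loaded as the "method body" and the
# user's goal is passed in as the "arguments".

def render_skill_prompt(skill_path: str, **params: str) -> str:
    """Combine a reusable skill (the how) with task parameters (the what)."""
    with open(skill_path) as f:
        skill = f.read()  # markdown describing the process, not the goal
    args = "\n".join(f"- {k}: {v}" for k, v in params.items())
    return f"{skill}\n\n## Arguments\n{args}"

# The same skill, invoked with different arguments, yields different outcomes:
# render_skill_prompt("skills/code-review.md", repo="api", pr="141")
# render_skill_prompt("skills/code-review.md", repo="web", pr="97")
```

The design mirrors an ordinary function call: the skill is written once, and every caller supplies only the parameters that vary.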
2. The Harness
The harness is the minimal program that runs the model. It loops the model, manages context, reads and writes files, and enforces safety. Tan warns against building a bloated harness full of tools and abstractions that slow everything down and clutter the context window. His ideal harness is thin, while the skills are fat. Modern software no longer needs to be precious or over‑engineered; it should be fast, narrow, and purpose‑built.
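A thin harness reduces to a short loop. The sketch below is an assumption about what such a loop looks like, not the actual Claude Code wrapper: `call_model` stands in for a real model API, and `tools` is a plain dict of deterministic functions.

```python
# Hypothetical thin harness: loop the model, execute tool calls, stop on an
# answer. `call_model` and the tools are stand-ins, not a real API.

def run_harness(call_model, tools, goal, max_steps=10):
    """Minimal agent loop: the model proposes a step, the harness executes it."""
    context = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        action = call_model(context)          # model decides the next step
        if action["type"] == "answer":        # model is done
            return action["content"]
        result = tools[action["tool"]](action["input"])  # deterministic work
        context.append({"role": "tool", "content": str(result)})
    raise RuntimeError("step budget exhausted")
```

Everything else (which documents to load, how to do the task) lives outside this loop, in skills and resolvers, which is what keeps the harness thin.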
3. Resolvers
Resolvers act as routing tables for context. They decide which documents to load when a certain type of task appears. Without resolvers, developers tend to stuff everything into a single giant prompt, which degrades the model's attention. With resolvers, the system loads only the relevant documents at the right time. Tan describes how his team's massive CLAUDE.md file became unmanageable until they replaced it with a small resolver that pointed to the right documents on demand.
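In its simplest form, a resolver really is just a routing table. The sketch below is a minimal illustration under assumed names (the task types and document paths are invented); the point is that each task type maps to a small, specific document set rather than one giant always-loaded prompt.

```python
# Hypothetical resolver: map a task type to the few documents worth loading,
# instead of stuffing everything into one giant prompt.

RESOLVER = {
    "billing": ["docs/billing.md", "docs/refund-policy.md"],
    "deploy":  ["docs/ci.md", "docs/rollback.md"],
}

def resolve_context(task_type: str) -> list[str]:
    """Return only the documents relevant to this task type."""
    return RESOLVER.get(task_type, ["docs/general.md"])
```

A billing task loads two billing documents and nothing about deployment, so the context window stays small and focused.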
4. Latent vs Deterministic Work
The distinction between latent and deterministic work is another key idea. Latent space is where the model exercises judgment, interpretation, and synthesis. Deterministic space is where reliability matters: SQL queries, arithmetic, and other operations that must produce the same output every time. Problems arise when tasks that belong in deterministic space are forced into latent space. At scale, such queries have to be handled deterministically.
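The split can be made concrete with a small sketch (invented function, not from the article): arithmetic is pushed into deterministic code, while only the judgment calls around it would be delegated to the model.

```python
# Hypothetical split: judgment stays in latent space (the model); anything
# that must be exactly reproducible is pushed into deterministic code.

def deterministic_total(prices: list[float]) -> float:
    """Arithmetic belongs here: same input, same output, every time."""
    return round(sum(prices), 2)

# Latent-space work would be delegated to the model instead, e.g. deciding
# which line items in a messy invoice actually count as "prices".
```

Asking a model to sum a column works most of the time; calling a function works every time, which is the property that matters at scale.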
5. Diarization
Diarization is the process of reading a large body of material and producing a structured profile that captures the essential judgments. This is something neither SQL nor RAG can do. It requires the model to read, compare, notice contradictions, and synthesize. Tan argues that diarization is the key to making AI useful for real knowledge work.
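One plausible shape for a diarization pass is a fold over the corpus: read chunk by chunk, and let the model compare each chunk against the profile built so far. This is a sketch under stated assumptions; `summarize` is a stand-in for a model call, and the profile schema (`claims`, `contradictions`) is invented for illustration.

```python
# Hypothetical diarization pass: read a large corpus chunk by chunk and fold
# each chunk into a structured profile. `summarize` stands in for a model call.

def diarize(chunks: list[str], summarize) -> dict:
    """Accumulate judgments across the whole corpus into one profile."""
    profile = {"claims": [], "contradictions": []}
    for chunk in chunks:
        update = summarize(chunk, profile)   # model compares chunk vs profile
        profile["claims"].extend(update.get("claims", []))
        profile["contradictions"].extend(update.get("contradictions", []))
    return profile
```

Because each chunk is read against the running profile, the model can notice contradictions between documents, which is exactly what a single SQL query or a one-shot RAG lookup cannot do.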
Architecture
These five ideas combine into a three‑layer architecture. At the top are the fat skills, which encode judgment and process. In the middle is the thin harness, which orchestrates the model. At the bottom is the deterministic layer, which handles reliable operations like database queries and file access. The guiding principle is to push intelligence upward into skills and push execution downward into deterministic tools.
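The three layers can be wired together in a few lines. This is an illustrative sketch, not the article's system: `model` and `db_query` are hypothetical stand-ins, and the dict-based plan format is invented.

```python
# Hypothetical three-layer wiring: fat skill (judgment) on top, a thin
# harness in the middle, deterministic tools at the bottom.

def answer(goal, skill_md, model, db_query):
    prompt = skill_md + "\n\nGoal: " + goal      # layer 1: skill supplies the method
    plan = model(prompt)                          # layer 2: harness invokes the model
    if plan["needs_data"]:
        rows = db_query(plan["sql"])              # layer 3: deterministic query
        return model(prompt + f"\n\nData: {rows}")["text"]
    return plan["text"]
```

Intelligence (what to ask, how to interpret the rows) is pushed up into the skill and the model; execution (running the SQL) is pushed down into code that cannot hallucinate.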
The article ends with a principle: if a task will need to be done more than once, it should be codified into a skill. Skills become permanent upgrades. They never degrade, and they improve automatically when models improve. This compounding effect, not smarter models, is what produces the dramatic productivity gains described at the beginning.