Making English a Programming Language

Jun 15, 2026

Product Vault is a public, MIT-licensed harness for building software with coding agents like Claude Code or Codex. It is especially useful for non-technical builders who want to build more complex products without the codebase turning messy.

The short version

Problem

  • Coding agents are becoming very good at turning natural language into production-grade code.
  • The problem is that they mostly reason from the current codebase and the current chat thread.
  • Code records how a product was built. It does not reliably record why it was built, what it is meant to do, or which trade-offs were rejected.
  • As the codebase grows, lost intent makes each new change riskier: duplicated paths, weaker modularity, tech-stack drift, and new features that break things that used to work.

Solution

  • The missing layer is a product wiki: a natural-language layer above the codebase.
  • It stays current with what the product is supposed to do, not just how the code happens to work today.
  • The core units are actors, jobs, stories, acceptance criteria, rules, journeys, capabilities, and decisions.
  • As the product grows, the wiki also tracks outcomes, non-goals, assumptions, risks, and glossary terms.
  • Those units become building blocks. They make the product readable from the top down, so a non-technical person can understand what the product does and why important code decisions exist.
  • The important part is that the wiki persists after the chat ends. The same units give the agent something to check before it writes code: what already exists, what to reuse, what must not break, and where the change belongs.

How it works

Shape the request

  • You still speak to the coding agent normally: "add this feature", "fix this bug", "change this workflow".
  • The harness runs a proposal skill before implementation. It asks questions, finds where the change fits, shows what it touches, and drafts a proposed addition to the product wiki.

Update the product wiki

  • Once approved, the product wiki is updated.
  • The wiki records what must be true. Those rules and acceptance criteria become executable checks before code is written.

Compile the change

  • A compiler skill then turns that wiki change into behaviour, design, implementation steps, code, and checks.
  • The sequence matters. It decides the product change first, then the affected journey, rules, capabilities, architecture, checks, and code path. Each decision narrows the next one, so the agent moves from the whole product to the smallest safe edit.

Check and maintain

  • Before implementation, design checks ask whether the change can reuse existing parts, fit the architecture, and stay on the chosen stack.
  • Specialist subagents are used only when a step benefits from separate review, such as architecture, verification, or consistency.
  • Loops keep the wiki, checks, dependencies, architecture, and code aligned over time.

Result

  • It works whether you are starting a new product, wrapping an existing codebase, or adding a feature to a mature system.
  • It helps non-technical operators, founders, PMs, and domain experts shape software without handing the architecture to a chat thread.
  • It helps engineers because requirements, trade-offs, interfaces, expected behaviour, and architecture choices are explicit before implementation starts.
  • The result is more consistent software: one product picture, clearer requirements, cleaner boundaries, fewer duplicate paths, less tech-stack drift, and less code that works today but collapses under the next change.

Core idea

Product wiki, compiled into code

Product intent is broken into small connected units. The compiler turns an approved wiki change into code in checked steps. The checks run against the code, so the wiki stays useful rather than decorative.

1. Normal request (chat box)

"Add this feature", "fix this bug", or "change this workflow".

2. Product wiki (building blocks)

The proposal skill asks questions and writes the change as small units: jobs, stories, rules, outcomes, checks, and decisions.

3. Compile the change (compiler skill)

The compiler skill turns the approved wiki change into behaviour, design, implementation steps, code, and executable checks.

4. Maintained code (repo)

Loops keep the wiki, checks, dependencies, architecture, and code up to date.

Why I Built This

I am the COO of Ascend, a travel-tech scale-up with 80 people and roughly $25M ARR. Before that I was a founder, product manager, and growth operator. I have always been a builder at heart, but I have never trained as an engineer.

Over the last year, my commit history has started to look like an engineer's. Some of the products I have built with agents are now live in production at Ascend. Others flopped, or worked for a couple of weeks before collapsing like a house of cards.

The good ones had a pattern: clear intent, careful review, and enough structure around the agent that I could judge the work as a non-engineer.

Commit history: 6,327 contributions in the last year, redrawn from my GitHub snapshot. [Figure: contribution heatmap, October to June.]

Something has changed in who can build software. Some of us were not trained in traditional programming, but we can reason about product, constraints, tests, and review well enough to direct an agent. However, the failure mode we keep hitting is product mess.

Agents can still produce working code, but every change has to rediscover the architecture. That is how duplicate paths, inconsistent patterns, and fragile branches creep in. The chat box is useful, but it is a weak place to keep all of that together. The repo I am open-sourcing is my attempt to fix that.

The Gap in the Chat Box

Andrej Karpathy put it simply: "The hottest new programming language is English." The problem is that most of us write that language into a chat box, where it gets used once and then disappears into the transcript. The code is left as the durable record, even though code mostly records how the product works, not what the product is meant to be.

Claude Code and Codex can already read a repo, search files, resume transcripts, follow project guidance, and delegate exploration to subagents. That is useful, and it is getting better. The issue is the material they reconstruct from. They mostly reason from the current codebase and the current thread.

Code is a strong record of implementation. It is a weak record of intent.

The missing intent is usually:

  • The actors the product serves.
  • The jobs those people are trying to get done.
  • The rules the product must apply.
  • The journeys it supports.
  • The behaviour that must always hold.
  • The decisions taken and the options rejected.

So the agent rebuilds a guess at your intent from the code each time. The guess is often good, but it is lossy, and it does not accumulate. Better models will help, but they cannot maintain a record of intent that was never written down.

What the Harness Contains

Product Vault sits between the chat box and the codebase. You put in the kind of request you would normally type to a coding agent. Instead of going straight to code, the harness runs a chain of skills.

First, the proposal skill works with you in natural language. It clarifies the request, asks questions when something is missing, finds where the change fits, and drafts the smallest useful addition to the product wiki.

You approve, edit, or reject that proposal. Once approved, the product wiki is updated. Then the compiler skill turns the approved wiki change into design decisions, executable checks, an implementation plan, code, and verification.

That distinction matters. A skill is the playbook that moves the work forward. A subagent is a specialist the playbook can call when separate context helps. Most steps do not need one. Architecture, verification, and consistency review sometimes do.

Take one example, which starts as a support signal: "Customers keep calling on weekends because they cannot change flights while agents are offline." This is what happens once it becomes a request.

Running example: one request through the skill chain

1. Input

"Travellers call on weekends because agents are offline. Can they make simple flight changes themselves?"

2. Proposal skill

The skill asks what counts as simple, which bookings are eligible, and which fee rule applies.

3. Draft wiki addition

It drafts natural-language changes to the job, story, outcome, acceptance criteria, reused capabilities, and open decision.

4. Approve

You approve, edit, or send it back. The product wiki changes before any implementation starts.

5. Compiler skill

The skill maps the blast radius, chooses the lightest safe path, defines executable checks, and plans the code change.

6. Implement and reconcile

The agent writes the smallest code change, runs the checks against the code, and records anything new back into the wiki.

The core idea is simple. Break the product into the smallest useful natural-language units. Give each unit one job and clear links to the rest of the system. Then turn those units into code in small checked steps.

I call this practical determinism. It does not make the model deterministic. It pins down the behaviour a change must produce, so there is less room for the agent to invent the architecture while it writes the code.

The Product Wiki

The product wiki is the durable record. It is a set of linked natural-language files that describe the product as the smallest useful units. Each unit has one job, a stable ID, and links to the other units it depends on.
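
A minimal sketch of what "one job, a stable ID, and links" could look like as data. The `WikiUnit` class, the `dangling_links` helper, and the `ACTOR-`, `STORY-`, and `RULE-` ID prefixes are hypothetical illustrations, not Product Vault's actual schema. Only `AC-001` appears in the examples in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WikiUnit:
    """One product wiki unit (hypothetical shape)."""
    unit_id: str                 # stable ID, e.g. "AC-001"
    kind: str                    # "actor", "job", "story", "rule", ...
    summary: str                 # the unit's one job, in plain language
    links: tuple[str, ...] = ()  # IDs of the units this one depends on


def dangling_links(units: list[WikiUnit]) -> dict[str, list[str]]:
    """Return, per unit, the linked IDs that do not exist in the wiki."""
    known = {u.unit_id for u in units}
    missing: dict[str, list[str]] = {}
    for u in units:
        broken = [target for target in u.links if target not in known]
        if broken:
            missing[u.unit_id] = broken
    return missing


wiki = [
    WikiUnit("ACTOR-001", "actor", "Frequent flyer"),
    WikiUnit("STORY-001", "story", "Reschedule a booked flight in the app",
             ("ACTOR-001",)),
    WikiUnit("AC-001", "acceptance-criteria", "Options appear within 60 seconds",
             ("STORY-001", "RULE-001")),  # RULE-001 is not defined yet
]

print(dangling_links(wiki))  # {'AC-001': ['RULE-001']}
```

Keeping links as plain IDs rather than object references is a deliberate choice in this sketch: it mirrors markdown files that point at each other by name, and it makes a broken link something a script can report.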

Product wiki units

  • Actor: Who is involved? Example: Frequent flyer. A role in the system, not a full persona.
  • Job: What problem is worth solving? Example: When plans change after hours, I need to keep my trip on track without waiting for an agent.
  • Story: What user-facing change should exist? Example: As a frequent flyer, I want to reschedule a booked flight in the app, so I can handle simple changes myself.
  • Acceptance criteria: How do we know the story is done? Example: AC-001: Given a confirmed booking, when a new date is requested, then options appear within 60 seconds.
  • Rule: What product logic must apply across stories? Example: Fare differences over $500 require explicit confirmation.
  • Flow / journey: What path does this sit inside? Example: Manage an existing trip: view booking, choose change, compare fares, confirm payment.
  • Capability: What reusable system function supports it? Example: Flight search, fare comparison, payment authorisation.
  • Decision: Why did we choose this shape? Example: ADR-004: Low-value changes can be self-serve. High-value changes still route to an agent.
  • Outcome: How will we know it mattered? Example: Weekend flight-change calls fall by 30% without increasing failed payments.
  • Non-goal: What is deliberately out of scope? Example: Do not support multi-city reissues in the first version.
  • Assumption / risk: What might make this wrong? Example: Airline change penalties may not be available quickly enough for every carrier.
  • Glossary: Which words must mean one thing? Example: "Change" means date or time change on the same origin and destination.

A proposal is the gate before code. It asks simple questions: which job does this serve, which journey changes, which capability should be reused, what outcome would prove it mattered, what is deliberately out of scope, what must be true, and what still needs a human decision?

That is the abstraction layer. A non-technical person can read the product from the top down. If they need detail, they can follow the links into requirements, checks, dependencies, and code.

The wiki is harder to ignore than a design doc because it sits on the path to implementation. A new feature, bug, or workflow change does not go straight into code. It first becomes a proposal.

proposal.md

Proposal

Self-serve flight changes outside agent hours

Incoming request

"Travellers call on weekends because agents are offline. Can they make simple flight changes themselves?"

Questions before code

  • Which bookings are eligible?
  • Which fee rule applies?
  • Can existing search and payment flows be reused?

Proposed wiki addition

Still natural language. Nothing in `src/` changes yet.

awaiting approval
+wiki/jobs/manage-trip.md

Traveller can change a flight when support is offline.

+wiki/outcomes/reduce-weekend-calls.md

Track whether weekend change calls fall without increasing failed payments.

+wiki/stories/self-serve-reschedule.md

Add story for rescheduling a booked flight without calling support.

+wiki/acceptance-criteria/reschedule.md

Options appear within 60 seconds. Fare changes over $500 need confirmation.

=wiki/capabilities/flight-search.md

Reuse existing search, fare comparison, and payment authorisation.

?wiki/decisions/weekend-fees.md

Confirm whether weekend changes use the same fee policy.

-src/**

No code changes until the wiki addition is approved.

Human review

Approve, edit, or send back. The proposal is a suggested addition to the wiki, not an implementation.

That proposal is a suggested addition to the wiki. You can approve it, edit it, or send it back. Nothing reaches the code until the product change is clear.
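
The gate can be sketched in a few lines. This is an illustration under assumptions, not the harness's real implementation: the `Proposal` and `Status` names are hypothetical, and treating an unresolved decision as a blocker is my reading of "until the product change is clear".

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    AWAITING_APPROVAL = "awaiting approval"
    APPROVED = "approved"
    SENT_BACK = "sent back"


@dataclass
class Proposal:
    title: str
    wiki_changes: list[str]  # wiki paths the proposal adds or reuses
    open_decisions: list[str] = field(default_factory=list)
    status: Status = Status.AWAITING_APPROVAL


def may_touch_code(proposal: Proposal) -> bool:
    """Allow code changes only after approval, with no decision left open."""
    return proposal.status is Status.APPROVED and not proposal.open_decisions


proposal = Proposal(
    title="Self-serve flight changes outside agent hours",
    wiki_changes=["wiki/stories/self-serve-reschedule.md"],
    open_decisions=["wiki/decisions/weekend-fees.md"],
)
print(may_touch_code(proposal))  # False: not approved, one decision open

proposal.open_decisions.clear()  # the human settles the fee policy
proposal.status = Status.APPROVED
print(may_touch_code(proposal))  # True
```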

The Compiler

The compiler is the workflow that turns an approved wiki change into code. It runs as a disciplined sequence of small decisions. It works on one bounded change, plus the dependencies in its blast radius.
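
One way to picture the blast radius is a walk over the dependency map, starting from the changed unit. The map contents, the unit names, and the `blast_radius` function below are hypothetical. The real dependency map format may differ.

```python
from collections import deque

# Hypothetical dependency map: each unit lists the units and code areas it touches.
DEPENDENCY_MAP = {
    "story/self-serve-reschedule": [
        "capability/flight-search",
        "capability/payment-authorisation",
        "rule/fare-confirmation",
    ],
    "capability/flight-search": ["src/search/"],
    "capability/payment-authorisation": ["src/payments/"],
    "rule/fare-confirmation": ["src/booking/"],
    "story/view-booking": ["src/booking/"],
}


def blast_radius(changed: list[str], dep_map: dict[str, list[str]]) -> set[str]:
    """Breadth-first walk from the changed units to everything they touch."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for touched in dep_map.get(node, []):
            if touched not in seen:
                seen.add(touched)
                queue.append(touched)
    return seen


radius = blast_radius(["story/self-serve-reschedule"], DEPENDENCY_MAP)
for item in sorted(radius):
    print(item)
```

In this sketch the unrelated `story/view-booking` stays outside the radius even though it shares `src/booking/`. A stricter version could also walk the map in reverse to flag other stories that depend on the same code area.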

The compiler should not carry the same weight for every task. A one-line copy fix can take a light path: make the edit, run the relevant checks, and review the diff. A cross-module product change runs the full path.

The order matters. First, apply the approved wiki change. Then locate the affected product units and code paths. Then decide what to reuse, whether the architecture can absorb the change, what checks prove the behaviour, and what code needs to change. Each step makes one decision explicit before the next one starts.

By the time the model writes code, the hard choices should already be made: what the change must do, what should be reused, how it fits the architecture, and how it will be checked.

This is where the architecture review happens. Before the system writes code for weekend flight changes, it asks: can the current booking architecture support self-serve changes, or would this be another fragile branch inside a module that is already doing too much? If the answer is no, the system raises a refactor proposal before the feature is layered on top.

For a meaningful feature, the compiler runs in six stages.

Compiler: one approved feature, in order

Small edits can take a lighter path. This is the full path for a meaningful feature.

1. Apply wiki change

Write the approved addition, then update the wiki index, log, and dependency map.

2. Locate blast radius

Find the changed job, story, rule, journey, capability, and the code areas they already touch.

3. Reuse or refactor

Decide whether existing capabilities can absorb the change. If not, raise a refactor proposal first.

4. Define checks

Turn acceptance criteria, rules, and journeys into automated checks that can run against the code.

5. Plan the edit

Name the files, interfaces, data paths, and edge cases before writing code.

6. Implement and verify

Write the smallest code change, run the checks, and reconcile anything new back into the wiki.

Branch

If the current architecture cannot absorb the change cleanly, the compiler stops and asks for a refactor decision instead of adding a fragile branch.
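
The six stages and the refactor branch can be sketched as an ordered list of steps where one step is allowed to halt the run. This is a toy model: the stage bodies are placeholders, and `RefactorNeeded`, `compile_change`, and the `architecture_can_absorb` flag are hypothetical names, not part of the published skills.

```python
from typing import Callable


class RefactorNeeded(Exception):
    """Raised when the architecture cannot absorb the change cleanly."""


def no_op(change: dict) -> None:
    """Placeholder for stages this sketch does not model."""


def reuse_or_refactor(change: dict) -> None:
    # Assumption: an earlier review has set this flag on the change.
    if not change.get("architecture_can_absorb", True):
        raise RefactorNeeded(f"refactor decision needed before: {change['title']}")


STAGES: list[tuple[str, Callable[[dict], None]]] = [
    ("apply wiki change", no_op),
    ("locate blast radius", no_op),
    ("reuse or refactor", reuse_or_refactor),
    ("define checks", no_op),
    ("plan the edit", no_op),
    ("implement and verify", no_op),
]


def compile_change(change: dict) -> list[str]:
    """Run the stages in order and return the names of those that completed."""
    completed = []
    for name, stage in STAGES:
        try:
            stage(change)
        except RefactorNeeded:
            break  # stop here and hand the decision back to a human
        completed.append(name)
    return completed


print(compile_change({"title": "weekend flight changes"}))
print(compile_change({"title": "weekend flight changes",
                      "architecture_can_absorb": False}))
```

The second call stops after "locate blast radius", so no checks are defined and no code is planned on top of a structure that has already been flagged.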

The model is still probabilistic. The thing pinned down is behaviour, not the exact lines of code. The generated code may differ from one run to the next, but it must satisfy the same acceptance criteria, journeys, and rules. That only means something if the checks run against the code. Otherwise the wiki becomes another stale document with more confidence around it.

What Keeps It Safe

The system is trying to stop the codebase becoming a house of cards. It starts with three plain checks before implementation.

First, reuse. For weekend flight changes, the system should not invent a second flight search or a second fare comparison path. It checks whether the existing capabilities can be reused or extended. It can still propose something new, but it has to explain why.

Second, architecture. The architecture review checks whether the current structure can absorb the change cleanly. If self-serve changes would add a messy set of special cases to the booking module, the system should say so and propose a refactor first. This is the macro check: does the whole product still make sense if we add this?

Third, correctness. The wiki says what must be true: options appear within 60 seconds, fare differences over $500 need confirmation, and payment is not taken until the traveller confirms. Those acceptance criteria become automated checks before the code is written. This is the micro check: did the small change do the exact thing it promised?
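
As a rough illustration, two of those statements can be written as checks in plain Python. The function and class names are hypothetical, the 60-second timing criterion is not modelled, and the small implementation is included only so the sketch runs. In the workflow described here the checks would exist before the code.

```python
from dataclasses import dataclass

CONFIRMATION_THRESHOLD = 500.0  # dollars, taken from the rule in the wiki


@dataclass
class ChangeRequest:
    fare_difference: float
    traveller_confirmed: bool


def needs_explicit_confirmation(request: ChangeRequest) -> bool:
    """Rule: fare differences over $500 require explicit confirmation."""
    return request.fare_difference > CONFIRMATION_THRESHOLD


def may_take_payment(request: ChangeRequest) -> bool:
    """Payment is not taken until the traveller confirms."""
    return request.traveller_confirmed


def check_fare_confirmation_rule() -> None:
    assert needs_explicit_confirmation(ChangeRequest(500.01, False))
    # "Over $500" is read as strictly greater than 500 in this sketch.
    assert not needs_explicit_confirmation(ChangeRequest(500.00, False))


def check_payment_requires_confirmation() -> None:
    assert not may_take_payment(ChangeRequest(120.0, False))
    assert may_take_payment(ChangeRequest(120.0, True))


check_fare_confirmation_rule()
check_payment_requires_confirmation()
print("all checks passed")
```

Writing the boundary case down (exactly $500) is the kind of detail the proposal step is meant to surface before implementation, because the sentence in the wiki alone does not settle it.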

For production changes, the compiler also asks the ordinary engineering questions that agents easily miss: does this cross a trust boundary, does it change data that needs migration or backward compatibility, and will we know if it fails in production?

The guardrails still matter. Linting, type checks, builds, sandbox rules, hooks, and CI catch the ordinary mistakes models still make. The wiki handles product coherence. The guardrails catch implementation errors and unsafe actions.

The Loops

The first version of a spec is never enough. Products change, dependencies move, tests get stale, and real usage teaches you things the original design missed. The loops keep the wiki and the code from drifting apart after the first implementation.

The practical trick is traceability. A rule, story, or acceptance criterion should have a stable ID. The relevant tests and code paths can point back to it. Then a loop can spot the gaps: wiki claims with no checks, checks with no wiki claim, and code that has drifted outside both.
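
A minimal sketch of that gap-spotting, assuming each check declares the wiki IDs it covers. The `traceability_gaps` function, the check names, and the `RULE-001` and `AC-900` IDs are hypothetical. The third gap, code that has drifted outside both, is not modelled here.

```python
def traceability_gaps(
    wiki_ids: set[str],
    check_refs: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Compare wiki claims with the IDs that checks say they cover."""
    covered = set().union(*check_refs.values())
    orphan_checks = {
        name for name, refs in check_refs.items() if not (refs & wiki_ids)
    }
    return {
        "claims_without_checks": wiki_ids - covered,
        "checks_without_claims": orphan_checks,
    }


wiki_ids = {"AC-001", "RULE-001"}
check_refs = {
    "test_fare_confirmation": {"RULE-001"},
    "test_legacy_export": {"AC-900"},  # points at an ID the wiki no longer has
}

gaps = traceability_gaps(wiki_ids, check_refs)
print(gaps["claims_without_checks"])  # {'AC-001'}
print(gaps["checks_without_claims"])  # {'test_legacy_export'}
```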

For the weekend-flight feature, the loops might find four different things.

Maintenance: loops stop the wiki becoming decorative documentation

  • Traceability: AC-001 exists, but no automated check covers it. Action: generate or request the missing check.
  • Dependency: The code now touches payment authorisation, but the wiki did not declare it. Action: update the dependency map and inform the user.
  • Architecture: The booking module has become the dumping ground for self-serve exceptions. Action: raise a refactor proposal.
  • Regression: The self-serve journey no longer shows options within 60 seconds. Action: fail the run and report the breakage.

Some loops only raise proposals because they need judgement. Others can act automatically when the answer is objective and easy to verify. The important point is that the output stays in plain language. You wake up to decisions about the product, not unexplained edits buried in code.

Starting from an Existing Codebase

This also works on existing repos. You can wrap it around a codebase that already has years of decisions inside it.

The first import runs the compiler backwards. It reads the code and drafts a product wiki from what it can infer: the capabilities that already exist, the stories they appear to serve, and the dependencies between them. That first pass is a proposal, not a fact, because the code reveals only part of the what and almost none of the why.

For a large repo, that import has to be chunked by capability. A huge first wiki would be impossible to review. The useful version is a set of small, low-confidence proposals that a human can accept, correct, or reject.
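
A rough sketch of chunking, under a strong assumption: that the first directory under `src/` is a usable proxy for a capability. Real repos will need a better grouping. The function names and the proposal fields are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def chunk_by_capability(paths: list[str]) -> dict[str, list[str]]:
    """Group files by the first directory under src/ (assumed capability)."""
    chunks: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        parts = PurePosixPath(path).parts
        # Files outside src/<capability>/ go into one bucket for manual sorting.
        in_capability = len(parts) > 2 and parts[0] == "src"
        chunks[parts[1] if in_capability else "unsorted"].append(path)
    return dict(chunks)


def draft_import_proposals(paths: list[str]) -> list[dict]:
    """One small, low-confidence proposal per chunk, ready for human review."""
    return [
        {
            "capability": name,
            "files": sorted(files),
            "confidence": "low",
            "status": "awaiting approval",
        }
        for name, files in sorted(chunk_by_capability(paths).items())
    ]


repo_files = [
    "src/search/query.py",
    "src/search/rank.py",
    "src/payments/authorise.py",
    "README.md",
]
for proposal in draft_import_proposals(repo_files):
    print(proposal["capability"], len(proposal["files"]), proposal["confidence"])
```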

You review the import the way you review any proposal. You confirm the stories it read correctly, fix the ones it misunderstood, and add the constraints, rejected options, and product rules that were never in the code. From there, every new feature updates the wiki before it touches the implementation.

How It Runs

Underneath, it is a folder of markdown and a small set of agent primitives. AGENTS.md is the portable agent contract. CLAUDE.md should import or mirror that contract for Claude Code. The product wiki is markdown. Skills are the repeatable workflows. Subagents are conditional reviewers. Hooks, scripts, CI, evals, and routines are the guardrails and loops.

The structure follows the LLM wiki idea: keep raw inputs, a maintained wiki, an index, a log, and schema files that tell the agent how to work. The extra layer is the compiler that turns approved product wiki changes into implementation.

The design system belongs in the product wiki too. The principles can be documented in CONSTITUTION.md, but hard rules need enforcement through hooks, permissions, scripts, and CI. The actual design language should live as linked wiki files: principles, tokens, components, accessibility rules, and content patterns. That way UI work is checked against the same product wiki as everything else.

An ordinary repo with a product wiki above it

product-vault/

Agent instructions

Portable guidance, with enforcement handled elsewhere.

  • AGENTS.md: portable agent contract
  • CLAUDE.md: imports or mirrors the contract
  • CONSTITUTION.md: principles, not enforcement

Raw inputs

Requests, bugs, notes, and support signals land here first.

  • intake/raw/: immutable requests
  • intake/proposals/: proposed additions
  • intake/archive/: processed inputs

Product wiki

The linked natural-language product wiki.

  • wiki/index.md: catalog
  • wiki/log.md: audit trail
  • wiki/actors/: who
  • wiki/jobs/: why
  • wiki/stories/: what
  • wiki/rules/: policy
  • wiki/acceptance-criteria/: done
  • wiki/journeys/: paths
  • wiki/capabilities/: reuse
  • wiki/outcomes/: success measures
  • wiki/non-goals/: out of scope
  • wiki/assumptions/: risks to test
  • wiki/glossary/: shared language
  • wiki/decisions/: why this shape

Design system

Product design rules that UI work must reuse.

  • wiki/design-system/principles.md: product feel
  • wiki/design-system/tokens.md: colour, type, space
  • wiki/design-system/components.md: UI patterns
  • wiki/design-system/accessibility.md: usable by default
  • wiki/design-system/content.md: interface copy
  • src/styles/: implemented tokens

Schemas

The file contracts that keep the wiki consistent.

  • schemas/proposal.md: proposal shape
  • schemas/wiki-unit.md: unit shape
  • schemas/dependency-map.md: links
  • schemas/traceability-map.md: wiki to tests/code
  • schemas/check.md: testable claims

Skills

Repeatable workflows you invoke directly.

  • propose-change: request to wiki addition
  • apply-wiki-change: approved addition to wiki
  • compile-change: wiki to code
  • import-codebase: code to wiki proposal
  • reconcile-wiki: sync drift
  • review-architecture: reuse or refactor
  • generate-checks: criteria to tests

Conditional reviewers

Separate context only when fresh review helps.

  • architecture-reviewer: structure and reuse
  • verification-reviewer: proof before merge
  • consistency-reviewer: drift and stale claims

Loops and checks

Recurring maintenance and enforced guardrails.

  • hooks/pre-tool-use.*: block unsafe actions
  • hooks/post-tool-use.*: capture evidence
  • routines/wiki-health.md: stale claims
  • routines/traceability-drift.md: missing coverage
  • routines/architecture-drift.md: sprawl
  • routines/design-drift.md: UI consistency
  • evals/golden/: regression cases
  • scripts/wiki-lint.*: missing links
  • .github/workflows/wiki-checks.yml: CI proof

Implementation

The compiled output and its ordinary checks.

  • src/: code
  • tests/: automated checks
  • docs/: human-facing docs

Requests land in intake. Skills run the proposal and compiler workflows. Most work stays in the main context. Specialist reviewers are called only when separate context helps. Hooks, scripts, evals, CI, and routines keep the wiki, design system, tests, architecture, and code from drifting apart.

The pages are plain markdown, so any coding agent can read them. The skills are named entry points: propose a change, apply an approved wiki change, compile it, import an existing codebase, reconcile drift, review architecture, and generate checks. Subagents are not one per step. They are called when fresh context helps: architecture review, verification review, and consistency review. Loops run the recurring checks that keep the wiki, design system, tests, architecture, and code aligned.

Coding Without Writing Code

This is coding without writing code. You still do the human parts: decide what should exist, frame the constraints, and review what came back. The translation, typing, wiring, and checking happen a level down.

Good programming languages give you abstraction without losing the path to execution. You write in higher-level building blocks, and the compiler makes the path to the machine repeatable. This harness applies the same idea to product intent. Natural language is the input language. The wiki gives it structure. Executable checks make the behaviour repeatable.

You start in plain language, at the level a product person can read. Follow the links down and the system becomes more precise: jobs, stories, acceptance criteria, checks, dependencies, code. The wiki stays readable. The code is compiled, maintained, and inspectable.

Build Your Own

Product Vault is now live as a public, MIT-licensed GitHub repo. It scaffolds the product wiki, the agent contract, the schemas, the first skills, the hooks, the check manifest, and the maintenance routines. You bring the product.

If you build your own version, tell me where it bends under your work. You can reach me at hi@omarismail.com.