Filename-as-Specification
Filename-as-Specification is a methodology I developed for keeping AI coding agents from hallucinating about the files they are asked to produce. The central idea is to encode the full specification of a file into the filename itself, in a grammar the agent parses before it writes a single line of the body, so that the agent cannot drift from what was asked for without first altering the name on disk. An agent cannot avoid reading the filename, and if the specification is already inside the filename, the agent cannot avoid reading the specification along with it.
The problem I was trying to solve
Anyone who has done serious work with a coding agent has had the same experience. You write a careful prompt describing the file you want. The agent writes confident, well-formatted code that does not actually match what you asked for. You correct it, and the correction introduces a new problem somewhere else, and a few hours later you are deleting whole directories with the grim satisfaction of a gardener who has given up on a lawn.
The root cause, as far as I can tell, is a tool-use ordering issue. The agent reads the filename first, whether in a directory listing, a tool-call response, or a patch header, and then decides what to write into the body. If the filename carries no meaningful signal, the agent fills the resulting vacuum with whatever it guesses the file ought to contain, and commits to that guess before the actual content of the file has had a chance to constrain its behavior.
The insight, such as it is, was to put the specification precisely where the agent was going to look first anyway. Once the full specification lives inside the filename, the agent cannot open the file without absorbing the specification, which means it cannot miss the specification, cannot ignore it, and cannot drift away from it without visibly contradicting the name of the file it is writing into.
The anatomy of a FaS filename
user_sv_M0gM0hM0i_D01D04_I+rp0g+E0k+sc0g+E0h_E0m_V010_S2.py user → Entity: User sv → Layer: service M0gM0hM0i → Methods: register, authenticate, get_by_id D01D04 → Dependencies: sqlalchemy, jwt I+rp0g... → Internal imports: UserRepository, UserSchema E0m → Export: UserService V010 → Version: 0.1.0 S2 → Status: implemented
Every one of those axes is orthogonal to the others, and the entire string is parseable both by a human reader and by a deterministic grammar. The agent reads the filename before it writes any part of the body, and the body it eventually writes is constrained, down to the level of which symbols it is allowed to export, by what the filename has already told it.
The four roles
- ARCH-AGENT — Translates requirements into FaS filenames. Given "a user service with register, authenticate, and lookup", it produces
user_sv_M0gM0hM0i_D01D04_E0m_V010_S0.py. StatusS0= unimplemented. - CODE-AGENT — Reads the filename, implements exactly what it demands, writes the body. Status advances from
S0→S1(in-progress) →S2(implemented). - VALIDATOR — Runs a six-phase pipeline: syntax → completeness → dependency resolution → import chain → export compliance → cross-file consistency. Each phase has to pass before the next runs. Errors are reported with the filename that failed.
- ORCHESTRATOR — Walks a specification tree of FaS filenames and drives the pipeline end-to-end. Hands-off code production from a
filetree.ymlthat only contains names.
Domain matrices
A matrix is a lookup table that maps the compact FaS codes into human-readable concepts for a specific domain. Six matrices ship with the system, and there's a template for building your own.
- REST API — Routes, middleware, controllers, services, repositories. Method codes for CRUD, auth, search, pagination, file upload, webhooks.
- React / Next.js — Components, hooks, providers, pages, layouts. Props, state, effects, context, server actions.
- Python CLI — Commands, arguments, options, subcommands. Click/Typer patterns with validation, prompts, and output formatting.
- Game Development — Entities, systems, components. Physics, input, rendering, AI. ECS architecture expressed in filename form.
- Data pipelines — Sources, transforms, sinks, schedules. DAG edges implied by import codes.
- Infrastructure — Containers, services, volumes, networks. Terraform-style resources as files.
The MCP server
A Model Context Protocol server that gives Claude Desktop native FaS capabilities. Install once and Claude gains permanent access to four tools: parse_fas, generate_architecture, validate_tree, and matrix_lookup. No copy-paste, no context-window pollution, no fragile system prompts.
The server is stateless and reads a local matrices/ directory so you can drop new domain matrices in without restarting it. It streams validation results via MCP's progress protocol so long-running validator phases don't hang the agent.
Why it works
- It exploits token-order attention. The filename is read before the body. By the time the agent sees the body, the filename has already anchored its generation.
- It's deterministic. The grammar is regular. Parse errors fail fast. No "close enough" matches.
- It's human-legible. Once you know the matrix, you can read an entire architecture from a
ls -la. - It composes. Internal-import codes encode the dependency graph as part of the filename, so the validator catches cycles, missing exports, and broken imports without running the code.
- It's portable. Works with any agent that reads filenames — which is all of them.
I built this because I had mass-deleted AI-generated code one too many times after sessions that had quietly gone off the rails. The insight was stupid-simple: put the specification into the filename. Everything else is a refinement of that core idea.
Get it
One package on Gumroad — $100. Everything: the methodology document, all six domain matrices, the three agent prompts, the six-phase validator, the MCP server, the orchestrator, and the matrix-builder template for rolling your own.