Code as a Throwaway Artifact
Natural-language contracts, video-as-requirements, and living guardrails.
What you'll learn
- Write natural-language contracts and use video as requirements
- Treat code as a throwaway artifact behind a state machine + modules
- Maintain living guardrails while treating generated code as disposable and ideas as durable
In a nutshell
This final lesson is about shifting your loyalty away from individual generated code artifacts and toward durable structures: clear natural-language goals, recorded human behavior, reviewable diffs, state-machine boundaries, isolated modules, and living project guardrails. John’s big point is blunt: tools, models, harnesses, plugins, and even code can change or be regenerated, so the professional edge is defining intent, preserving evidence, reviewing the concrete diff, and keeping the architecture understandable.
Key concepts, explained
Natural-Language Contracts
A natural-language contract is a plain-English description of what a tool, app, or agent is supposed to do. John frames this as an "agent first" mindset: define the work in language so the underlying tool can change without destroying the workflow.
Why it matters: It keeps you from overfitting to today’s specific tool, harness, plugin, or workflow. The portable asset is the intent.
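One way to make a contract concrete is to hold it as a small structure that renders to plain text. The sketch below is only an illustration: the `Contract` class, its field names, and the note-export feature are hypothetical and do not come from the workshop.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    """A plain-English contract for one piece of work (hypothetical shape)."""

    goal: str
    must_not_change: list[str] = field(default_factory=list)
    done_when: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the contract as text any agent or teammate can read."""
        lines = [f"Goal: {self.goal}"]
        if self.must_not_change:
            lines.append("Must not change:")
            lines += [f"- {item}" for item in self.must_not_change]
        if self.done_when:
            lines.append("Done when:")
            lines += [f"- {item}" for item in self.done_when]
        return "\n".join(lines)


# Illustrative feature request; the details are made up for this example.
contract = Contract(
    goal="Let a signed-in user export their notes as a single text file.",
    must_not_change=["Existing note URLs", "The sign-in flow"],
    done_when=["The downloaded file contains every note the user can see"],
)
print(contract.to_prompt())
```

Nothing in the rendered text names a tool, harness, or plugin, so the same contract can be handed to whichever agent is in use.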
Video-as-Requirements
John recommends recording yourself or users actually using the app while speaking intent out loud. The recording captures clicks, typing, layout changes, responsive behavior, and what the person wanted from the app.
Why it matters: It captures human interaction that is hard to fully express in written specs. A video-capable model can help turn that usage into requirements, goals, and QA stories.
Diff-Driven Review
John reduces an agent run to a simple equation: prompt plus agent, harness, and skills equals diff. The diff is the concrete artifact worth reviewing, testing against, and handing to a fresh session for critique.
Why it matters: It gives you a clean review loop. A second agent can inspect the goal and diff without inheriting the first agent’s assumptions.
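A minimal sketch of that review loop, assuming the change lives in a Git working tree: `current_diff` shells out to the real `git diff` command, while `build_review_prompt` and the prompt wording are hypothetical helpers written for this example. The fresh session receives only the goal and the diff.

```python
import subprocess


def current_diff(base: str = "HEAD") -> str:
    """Return the working-tree diff against `base` using the git CLI."""
    result = subprocess.run(
        ["git", "diff", base],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def build_review_prompt(goal: str, diff: str) -> str:
    """Combine only the goal and the diff; nothing from the first session."""
    return (
        "You are reviewing a change made by another agent.\n"
        "You have only the original goal and the resulting diff.\n\n"
        f"## Goal\n{goal.strip()}\n\n"
        f"## Diff\n{diff.strip()}\n\n"
        "List contract violations, missing tests, and risky assumptions."
    )


# A tiny hand-written diff so the example runs outside a repository.
example_diff = """\
--- a/greeting.py
+++ b/greeting.py
@@ -1 +1 @@
-print("hello")
+print("hello, world")
"""
print(build_review_prompt("Greet the whole world, not just one person.", example_diff))
```

In a real project you would pass `current_diff()` instead of the hand-written diff and paste the result into a clean session.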
State Machines and Modular Boundaries
When a codebase can grow by tens of thousands of lines overnight, John says the safety comes from structure: isolated modules and business logic that is navigable through state machines.
Why it matters: These boundaries let humans and agents digest, regenerate, refactor, or replace implementation details without losing the shape of the product.
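To illustrate what "navigable through state machines" can look like, here is a small sketch with an explicit transition table. The checkout flow, state names, and events are invented for the example; the point is that the table stays readable even if everything behind each state is regenerated.

```python
from enum import Enum


class State(Enum):
    CART = "cart"
    PAYMENT = "payment"
    CONFIRMED = "confirmed"
    CANCELLED = "cancelled"


# Transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.CART, "checkout"): State.PAYMENT,
    (State.CART, "cancel"): State.CANCELLED,
    (State.PAYMENT, "pay"): State.CONFIRMED,
    (State.PAYMENT, "cancel"): State.CANCELLED,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state.name}") from None


state = State.CART
for event in ["checkout", "pay"]:
    state = step(state, event)
print(state.name)  # CONFIRMED
```

Keeping the allowed moves in one table means a reviewer can check the behavior's shape without reading the modules that handle payment or cart storage.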
Living Guardrails
Files like agents.md or claude.md should stay fairly minimal and evolve with the project. John suggests reviewing recent commits, learning from mistakes, and iterating on the agent instructions over time.
Why it matters: Mistakes become useful artifacts when they update the project’s guardrails. The file should capture real project knowledge, not a giant upfront rulebook.
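As a rough sketch of keeping such a file small while it evolves, the helper below adds a lesson only if it is not already recorded. It assumes the guardrail file is Markdown with a "## Lessons learned" section; that heading, the `add_lesson` name, and the sample lessons are all hypothetical.

```python
def add_lesson(guardrails: str, lesson: str, heading: str = "## Lessons learned") -> str:
    """Return the guardrail text with `lesson` recorded once under `heading`."""
    bullet = f"- {lesson.strip()}"
    lines = guardrails.splitlines()
    if bullet in lines:
        return guardrails  # already recorded; keep the file small
    if heading in lines:
        # Newest lesson goes directly under the existing heading.
        lines.insert(lines.index(heading) + 1, bullet)
    else:
        if lines:
            lines.append("")  # blank line before the new section
        lines += [heading, bullet]
    return "\n".join(lines) + "\n"


text = "# Project notes\n"
text = add_lesson(text, "Run the test suite before committing.")
text = add_lesson(text, "Run the test suite before committing.")  # no duplicate
text = add_lesson(text, "Do not edit generated files by hand.")
print(text)
```

The function works on text rather than on a file path, so you decide when a reviewed lesson is actually written back to agents.md or claude.md.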
Disposable Code, Durable Ideas
John explicitly says not to get too attached to a specific workflow or generated code artifact. A plugin, hook, or implementation can be generated quickly and thrown away later.
Why it matters: The idea, boundary, state machine, module structure, and product intent matter more than the exact code the current agent happened to write.
Curated references
OpenAI Codex
github.com/openai/codex
The coding-agent environment John referred to when discussing skills, agents.md, generated code, and diffs.
Reach for it when you want an agent to work inside a project and plan to review the resulting diff as the main artifact of the run.
agents.md
A project-level instruction file that gives coding agents local context, conventions, and constraints.
Reach for it when you want a small, evolving guardrail file; John recommended reviewing recent commits and updating it with lessons from mistakes.
Video-capable multimodal models
🔍 multimodal video understanding app requirements
John named Gemini as an example of a model that can understand videos of app usage.
Reach for it when narrated screen recordings contain richer requirements than a written prompt alone.
Git diffs
🔍 git diff
The concrete artifact John treated as the output of an agent run: prompt plus agent, harness, and skills equals diff.
Reach for it when you need a concrete basis for review, testing discussions, logs, and clean-session critique.
Recommendations & best practices
- Define goals and tool contracts in natural language so the underlying agent, harness, or implementation can change.
- Record real app usage with narration; treat the video as a requirements and QA artifact, not just a demo.
- Do not over-specify exactly how the agent must prove success; keep your own QA judgment in the loop.
- Review diffs as the primary artifact of an agent run, then give the goal and diff to a fresh session to find missed cases or tests.
- Keep generated code disposable, but keep product intent, state machines, module boundaries, and names stable enough to reason about.
- Use agents.md or claude.md as living documents: update them from recent commits, repeated mistakes, and real project constraints.
- Stay flexible; do not define yourself too tightly by one language, framework, tool, harness, or generated implementation.
Make it stick
Practice treating generated code as replaceable while making your intent, evidence, architecture, diffs, and guardrails durable.
🧩 Quick quiz
1. What is the main value of a natural-language contract in an agent-first workflow?
2. Why did John frame video recordings as useful requirements artifacts?
3. In John’s diff-driven review loop, what is the most important artifact to inspect after an agent run?
4. A project suddenly grows by 30,000 generated lines. According to John, what makes that complexity manageable?
5. What is the best way to maintain an agents.md or claude.md file?
✅ Try it yourself
🚀 Challenges
Contract Before Code
Easy: Pick a small feature request and write a natural-language contract before opening the code. Include what the user wants, what must not change, what counts as done, and what evidence would convince you it works.
Done when: A fresh agent or teammate can read the contract and correctly describe the expected behavior without seeing your implementation notes.
Video QA Story
Medium: Record yourself using a real app flow while narrating intent, then ask a video-capable model to turn the recording into requirements, expected behavior, and QA stories.
Done when: The resulting QA story captures a meaningful user outcome from the video, not just a brittle click path or superficial selector.
State Machine Boundary Pass
Medium: Take a messy feature with branching behavior and describe it as a state machine. Then split the surrounding implementation into clear modules with separate responsibilities.
Done when: A human or agent can look at the state machine and module names and understand the shape of the behavior without reading every implementation line.
Diff Injection Review
Hard: Let an agent implement a bounded change, then copy only the original goal and resulting Git diff into a fresh session. Ask the new session to critique the change, propose missing tests, and flag contract violations.
Done when: The second review finds at least one concrete risk, missing test, or confirmation point that improves your final merge decision.
💭 Reflect
- In your current codebase, what should be treated as durable: product intent, state machines, module boundaries, tests, guardrails, or the generated code itself?
- Where do agents repeatedly make the same kind of mistake in your projects, and what small agents.md or claude.md update would help prevent that without becoming noisy?
- Which side of the future engineering path are you actively building right now: deeper hyper-expert judgment, broader project-creator leverage, or a deliberate mix of both?
Go deeper
- Record a short narrated app walkthrough, then ask a video-capable model to extract the user goals, actions, expected behavior, and QA stories.
- Run a diff injection review: give a clean agent session only the original goal and the previous agent’s diff, then ask what tests or risks are missing.
- Rewrite one complex feature as a state machine with named states and transitions, then map the related implementation into isolated modules.
- Review recent commits and agent mistakes, then add only the most useful project-specific lessons to agents.md or claude.md.
- Map your own growth into two possible tracks: hyper-expert depth in a critical technical area and broader project-creator leverage with agents.