Field Guides

Turn the workshop references into Codex workflows

Practical, Oracle-grounded guides for the tools and references that deserved more depth. Each guide ties back to Codex prompts, skills, plugins, hooks, browser receipts, and artifacts a web developer can hand to an agent.

Lesson 07 / Lesson 11 - Oracle: codex-primitives-depth

Codex Primitive Field Guide

AGENTS.md, skills, plugins, hooks, MCP, and Goal mode without mixing the layers.

Skills use progressive disclosure to manage context efficiently.
OpenAI Codex Skills docs

Use this when a workflow keeps failing because instructions, tools, and enforcement are all living in the same prompt.

Primitive | Use when | Codex does | Artifact
AGENTS.md | Stable repo or directory rules | Setup, commands, conventions, done criteria | Short instruction file near the code
Skill | Reusable workflow with judgment | Browser proof, release notes, Oracle handoff | .agents/skills/<name>/SKILL.md
Plugin | Shareable installable bundle | Team workflow with skills, MCP, apps, hooks | .codex-plugin/plugin.json plus bundled folders
Hook | Lifecycle enforcement | Prompt injection, tool policy, Stop receipt gate | hooks/hooks.json plus small scripts
MCP | External systems and context | Browser, Figma, docs, DB, SaaS tools | .mcp.json plus tool policy
Goal mode | Long-running objective | Research, plan, implement, verify loop | .goals/<task>.md plus receipts

Agent-ready prompt
Create a Codex primitive map for this repo.
Read AGENTS.md, package scripts, existing skills/plugins/hooks, and MCP config first.
For each repeated workflow, decide whether it belongs in AGENTS.md, a skill, a plugin, a hook, MCP, or a goal file.
Write .notes/codex-primitives/decision.md with the decision, files to create, verification command, and rollback path.
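
The decision file from the prompt above can be kept consistent by treating each entry as a small record. This is a minimal sketch, not a Codex-defined schema: the PrimitiveDecision type, the renderDecision helper, and the example values (including the npm test command) are hypothetical.

```typescript
// Hypothetical record for one entry in .notes/codex-primitives/decision.md.
// Field names mirror the prompt; none of this is a format Codex itself defines.
type Primitive = "AGENTS.md" | "skill" | "plugin" | "hook" | "MCP" | "goal file";

interface PrimitiveDecision {
  workflow: string;
  primitive: Primitive;
  filesToCreate: string[];
  verificationCommand: string;
  rollbackPath: string;
}

// Render one decision as a short plain-markdown section.
function renderDecision(d: PrimitiveDecision): string {
  return [
    `## ${d.workflow}`,
    `- Decision: ${d.primitive}`,
    `- Files to create: ${d.filesToCreate.join(", ")}`,
    `- Verification command: ${d.verificationCommand}`,
    `- Rollback path: ${d.rollbackPath}`,
  ].join("\n");
}

// Example values are invented for illustration.
const exampleDecision: PrimitiveDecision = {
  workflow: "UI fixes need browser proof",
  primitive: "skill",
  filesToCreate: [".agents/skills/browser-proof-saas/SKILL.md"],
  verificationCommand: "npm test",
  rollbackPath: "delete the skill folder",
};
const decisionMarkdown = renderDecision(exampleDecision);
```

Keeping the rollback path as a required field makes it harder to record a decision that nobody knows how to undo.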

Real web-dev stories

Next.js monorepo guardrails

Platform lead

Agents keep running the wrong tests and claiming UI work is done without proof.

Codex brief

Ask Codex to create a root AGENTS.md under 120 lines, copy commands from package scripts or CI, and add browser-proof expectations for UI work.

  • AGENTS.md exists
  • Commands match real scripts
  • Codex can summarize loaded instructions
  • No one-off task details
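
Two of the checks above (length and command accuracy) are mechanical enough to sketch. The helper below is hypothetical: the 120-line cap comes from the brief, while the assumption that commands appear as "npm run <script>" is only one possible convention.

```typescript
// Hypothetical audit for a root AGENTS.md.
// Assumes commands are written as "npm run <script>"; other runners need other patterns.
interface AgentsAudit {
  lineCount: number;
  withinLimit: boolean;
  unknownScripts: string[];
}

function auditAgentsMd(
  agentsMd: string,
  packageScripts: Record<string, string>,
  maxLines = 120,
): AgentsAudit {
  const lineCount = agentsMd.split("\n").length;
  const unknownScripts: string[] = [];
  const pattern = /npm run ([\w:-]+)/g;
  let match: RegExpExecArray | null;
  // Collect every referenced script that package.json does not define.
  while ((match = pattern.exec(agentsMd)) !== null) {
    const scriptName = match[1];
    const known = Object.prototype.hasOwnProperty.call(packageScripts, scriptName);
    if (!known && unknownScripts.indexOf(scriptName) === -1) {
      unknownScripts.push(scriptName);
    }
  }
  return { lineCount, withinLimit: lineCount <= maxLines, unknownScripts };
}

// package.json here defines "test" and "lint", but AGENTS.md mentions "test:unit".
const agentsAudit = auditAgentsMd(
  "# Commands\n- Test: npm run test:unit\n- Lint: npm run lint",
  { lint: "eslint .", test: "vitest" },
);
```

In this example agentsAudit.unknownScripts contains only "test:unit", which is the kind of drift the "Commands match real scripts" check is meant to catch.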

Browser proof skill

SaaS frontend engineer

Every UI fix needs the same route, viewport, action, console, screenshot, and command receipt.

Codex brief

Ask Codex to create .agents/skills/browser-proof-saas/SKILL.md with clear trigger and non-trigger language.

  • SKILL.md has name/description
  • Proof schema is included
  • Smoke prompt exists
  • Skill says when not to use it
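
A quick structural check of the skill file might look like the sketch below. It assumes SKILL.md opens with a front-matter block delimited by --- lines that carries name and description, and that the non-trigger language contains a phrase such as "do not use"; both are assumptions to confirm against the Codex Skills docs, and checkSkillFile is a hypothetical helper.

```typescript
// Hypothetical structural check for a SKILL.md file.
// Assumption: metadata lives between the first two '---' lines as "key: value" pairs.
interface SkillCheck {
  hasName: boolean;
  hasDescription: boolean;
  hasNonTrigger: boolean;
}

function checkSkillFile(text: string): SkillCheck {
  const parts = text.split(/^---[ \t]*$/m);
  const hasFrontMatter = parts.length >= 3;
  const frontMatter = hasFrontMatter ? parts[1] : "";
  const body = hasFrontMatter ? parts.slice(2).join("---") : text;
  return {
    hasName: /^name:[ \t]*\S+/m.test(frontMatter),
    hasDescription: /^description:[ \t]*\S+/m.test(frontMatter),
    // Assumed phrasing for the "when not to use it" section.
    hasNonTrigger: /do not use|when not to use/i.test(body),
  };
}

const sampleSkill = [
  "---",
  "name: browser-proof-saas",
  "description: Capture route, viewport, action, console, and screenshot receipts for UI fixes.",
  "---",
  "# Browser proof",
  "Do not use this skill for backend-only changes.",
].join("\n");

const skillCheck = checkSkillFile(sampleSkill);
const bareCheck = checkSkillFile("# No front matter\nAlways use this.");
```

A check like this only proves the file has the right shape; whether the trigger language is actually clear still needs a smoke prompt.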

Checkout Stop-hook gate

Checkout owner

Agents stop after one validation fix instead of proving cart to confirmation.

Codex brief

Ask Codex to scaffold a local plugin with a Stop hook that keeps the turn going until the checkout test and browser receipts exist.

  • Manifest validates
  • Hook simulator covers missing/success/cap
  • Receipts are named
  • No global install before simulation
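
The three simulator cases (missing, success, cap) can be expressed as one pure decision function. This is a sketch under stated assumptions: only the decision: "block" plus reason output comes from this guide, while the GateInput shape, the receipt paths, the retry cap, and the use of null for "allow the stop" are hypothetical.

```typescript
// Sketch of the decision logic behind a Stop-hook receipt gate.
// Assumptions: the input shape, receipt names, and attempt cap are illustrative only.
interface GateInput {
  existingFiles: string[]; // files the hook script found on disk
  requiredReceipts: string[];
  attempts: number; // how many times the gate has already blocked
  maxAttempts: number;
}

// null stands for "emit nothing and let the turn stop" (an assumption).
type GateOutput = { decision: "block"; reason: string } | null;

function stopGate(input: GateInput): GateOutput {
  const missing = input.requiredReceipts.filter(
    (receipt) => input.existingFiles.indexOf(receipt) === -1,
  );
  if (missing.length === 0) return null; // success: every receipt exists
  if (input.attempts >= input.maxAttempts) return null; // cap: stop looping
  return { decision: "block", reason: `Missing receipts: ${missing.join(", ")}` };
}

// Hypothetical receipt paths for the checkout flow.
const requiredReceipts = [".notes/checkout/test.log", ".notes/checkout/browser-proof.md"];

const missingCase = stopGate({
  existingFiles: [requiredReceipts[0]],
  requiredReceipts,
  attempts: 0,
  maxAttempts: 3,
});
const successCase = stopGate({
  existingFiles: requiredReceipts,
  requiredReceipts,
  attempts: 1,
  maxAttempts: 3,
});
const capCase = stopGate({
  existingFiles: [],
  requiredReceipts,
  attempts: 3,
  maxAttempts: 3,
});
```

Keeping the logic pure makes it easy to simulate before any install; the hook script itself would only gather the file list and print the result.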

Tips and gotchas

  • Keep AGENTS.md short. Put workflow machinery in skills/plugins/hooks.
  • Treat a skill as guidance, not enforcement. Move deterministic checks to hooks.
  • Follow current Codex Stop-hook behavior: returning decision: "block" with a reason keeps the turn going instead of letting it stop.
  • Installing a plugin does not automatically make plugin hooks trusted. Teach users to review and trust hook definitions.
  • Use MCP for external stateful systems; use local package scripts for repo-local tests and builds.
References agents can follow

OpenAI Codex Skills
https://developers.openai.com/codex/skills

OpenAI Codex Plugins
https://developers.openai.com/codex/plugins

OpenAI Codex Hooks
https://developers.openai.com/codex/hooks

OpenAI AGENTS.md guide
https://developers.openai.com/codex/guides/agents-md

AGENTS.md standard
https://agents.md/

Better Plugins examples
https://github.com/johnlindquist/better-plugins/tree/main/plugins

Lesson 05 / Lesson 08 / Lesson 11 - Oracle: browser-proof-depth

Browser Proof Field Guide

Make Codex prove rendered web behavior with routes, actions, logs, screenshots, and receipts.

Browser automation CLI designed for AI agents.
agent-browser docs

Use this when a UI, auth, form, route, or responsive change cannot be considered done from source inspection alone.

Primitive | Use when | Codex does | Artifact
Codex Browser | Local/public unauthenticated route | Fast visual proof and screenshots | .notes/browser-proof/<task>/browser-proof.md
Chrome extension/manual Chrome | Signed-in browser state | Auth flows and internal SaaS pages | Redacted manual proof plus screenshot
agent-browser | Terminal-friendly browser control | snapshot, click, fill, screenshot | snapshot.txt plus screenshot.png
Playwright | Repeatable or CI-friendly flow | Trace, console, screenshot, deterministic test | trace.zip plus console.json
DevTools/CDP | Raw console/network evidence | HAR, console objects, performance details | console.log plus network.har

Agent-ready prompt
Use browser proof before claiming this UI task is done.
Inspect package.json and AGENTS.md first.
Pick the browser surface that matches the auth and repeatability needs.
Save .notes/browser-proof/<task>/browser-proof.md with URL, viewport, auth context, actions, expected/observed result, console errors, network failures, artifacts, command output, diff summary, and pass/fail.
Redact cookies, tokens, customer data, QR codes, and auth headers.
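
The redaction step in the prompt above can be sketched as a text pass that runs before a receipt is saved. The patterns below are illustrative assumptions and are not exhaustive; the redact helper and the sample log lines are hypothetical.

```typescript
// Hypothetical redaction pass for receipt text.
// The patterns are examples only; real secrets take many more shapes.
const REDACTIONS: Array<[RegExp, string]> = [
  // Whole value of an Authorization header.
  [/(authorization:[ \t]*).+/gi, "$1[REDACTED]"],
  // Whole value of Cookie and Set-Cookie headers.
  [/((?:set-)?cookie:[ \t]*).+/gi, "$1[REDACTED]"],
  // Common secret-bearing query parameters.
  [/([?&](?:token|access_token|api_key)=)[^&\s]+/gi, "$1[REDACTED]"],
];

function redact(text: string): string {
  let output = text;
  for (const [pattern, replacement] of REDACTIONS) {
    output = output.replace(pattern, replacement);
  }
  return output;
}

const rawReceipt = [
  "GET https://app.example.test/settings/billing?token=abc123&tab=plan",
  "Authorization: Bearer eyJhbGciOi",
  "Cookie: session=s3cr3t; theme=dark",
  "Console: 0 errors",
].join("\n");

const redactedReceipt = redact(rawReceipt);
```

Pattern-based redaction is a backstop, not a guarantee; screenshots and QR codes still need a human or a separate check.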

Real web-dev stories

Authenticated billing route

SaaS frontend engineer

Codex changed /settings/billing, but the route only works when logged in.

Codex brief

Use signed-in Chrome or a seeded login; verify current plan, change plan, submit, and capture final visible success state.

  • browser-proof.md
  • screenshot
  • console/network summary
  • typecheck/test log

Pricing weird-path QA

Growth engineer

Plan toggles, coupons, and checkout modals fail after repeated back/refresh/toggle sequences.

Codex brief

Run five browser QA perspectives and collate reproducible findings into .notes/qa/pricing-summary.md.

  • perspective files
  • summary severity list
  • screenshots/transcripts
  • console status

Preview deployment triage

Frontend lead

A PR preview is green, but nobody opened the changed routes.

Codex brief

Discover preview URL, open homepage and changed route, capture status, visible error text, console errors, and screenshot.

  • preview URL
  • browser receipt
  • blocker/warning/clean verdict
  • no source edits
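
The blocker/warning/clean verdict can be derived from the captured evidence. The rules in this sketch are assumptions chosen for illustration (an error status or visible error text blocks, console errors alone warn); the RouteCheck type and the example URLs are hypothetical.

```typescript
// Hypothetical verdict rules for preview triage; tune the thresholds per team.
type Verdict = "blocker" | "warning" | "clean";

interface RouteCheck {
  url: string;
  status: number;
  visibleErrorText: string | null;
  consoleErrors: string[];
}

function verdictFor(check: RouteCheck): Verdict {
  // Assumed rule: anything outside 200-399 counts as a failed load.
  const badStatus = check.status < 200 || check.status >= 400;
  if (badStatus || check.visibleErrorText !== null) return "blocker";
  if (check.consoleErrors.length > 0) return "warning";
  return "clean";
}

// The worst single route decides the overall verdict.
function overallVerdict(checks: RouteCheck[]): Verdict {
  const verdicts = checks.map(verdictFor);
  if (verdicts.indexOf("blocker") !== -1) return "blocker";
  if (verdicts.indexOf("warning") !== -1) return "warning";
  return "clean";
}

const routeChecks: RouteCheck[] = [
  { url: "https://preview.example.test/", status: 200, visibleErrorText: null, consoleErrors: [] },
  {
    url: "https://preview.example.test/pricing",
    status: 200,
    visibleErrorText: null,
    consoleErrors: ["Hydration mismatch"],
  },
];
const previewVerdict = overallVerdict(routeChecks);
```

With the two example routes, the homepage is clean and the pricing route carries a console error, so the overall verdict is "warning".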

Tips and gotchas

  • Tests prove code paths; browser proof proves user-visible behavior.
  • Use the in-app browser for localhost/public pages, but use Chrome/manual auth for signed-in state.
  • Do not make Playwright mandatory for every UI task; use it when repeatability matters.
  • Treat screenshots, traces, HARs, and console logs as sensitive local artifacts by default.
  • A Stop hook should check receipt files, not drive the browser itself.
References agents can follow

Codex in-app browser
https://developers.openai.com/codex/app/browser

Codex Chrome extension
https://developers.openai.com/codex/app/chrome-extension

agent-browser
https://github.com/vercel-labs/agent-browser

Playwright screenshots
https://playwright.dev/docs/screenshots

Playwright Trace Viewer
https://playwright.dev/docs/trace-viewer

Chrome DevTools Console
https://developer.chrome.com/docs/devtools/console/log

Lesson 02 / Lesson 03 - Oracle: scenario-cmux-orchestrate

CMUX Orchestration Field Guide

Use panes, browser tabs, and a manager agent without letting workers step on each other.

Cmux is awesome.
Zac Jones, workshop chat

Use this when one Codex session should coordinate multiple focused workers, a dev server, and browser verification.

Primitive | Use when | Codex does | Artifact
Manager pane | Human-facing planning and delegation | One agent owns status and decisions | .notes/cmux-manager.md
Worker pane | Narrow research or implementation slice | One task, one output file | .notes/workers/<name>.md
Browser pane | Local rendered proof | Keep visible app state nearby | .notes/browser-proof/<task>.md
Server pane | Long-running dev server | Keep logs stable and inspectable | .notes/server-log.md

Agent-ready prompt
Act as the CMUX manager.
Inspect the current repo and create a pane plan before opening workers.
Each worker gets one bounded task, writes one markdown receipt, and does not edit outside its assigned files.
Keep a dev-server pane and browser-proof pane visible for web work.
Collate findings into .notes/cmux-summary.md before asking me for decisions.
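
The "one bounded task, one receipt, no edits outside assigned files" rule can be checked before any worker starts. This is a minimal sketch with hypothetical names (WorkerPane, planConflicts); it compares exact file paths only and does not understand globs or directories.

```typescript
// Hypothetical pane plan check: unique pane names and no file owned by two workers.
interface WorkerPane {
  name: string;
  task: string;
  receipt: string;
  allowedFiles: string[];
}

function planConflicts(workers: WorkerPane[]): string[] {
  const problems: string[] = [];
  // Prototype-free objects so file names like "constructor" cannot collide.
  const seenNames: Record<string, boolean> = Object.create(null);
  const owners: Record<string, string> = Object.create(null);
  for (const worker of workers) {
    if (seenNames[worker.name]) problems.push(`duplicate pane name: ${worker.name}`);
    seenNames[worker.name] = true;
    for (const file of worker.allowedFiles) {
      if (file in owners && owners[file] !== worker.name) {
        problems.push(`${file} is assigned to both ${owners[file]} and ${worker.name}`);
      } else {
        owners[file] = worker.name;
      }
    }
  }
  return problems;
}

// Example plan; paths and tasks are invented.
const panePlan: WorkerPane[] = [
  {
    name: "route-audit",
    task: "List routes that still use the old router",
    receipt: ".notes/workers/route-audit.md",
    allowedFiles: [],
  },
  {
    name: "component-patch",
    task: "Patch the billing form",
    receipt: ".notes/workers/component-patch.md",
    allowedFiles: ["app/billing/form.tsx"],
  },
  {
    name: "docs",
    task: "Update the migration notes",
    receipt: ".notes/workers/docs.md",
    allowedFiles: ["docs/migration.md", "app/billing/form.tsx"],
  },
];
const paneProblems = planConflicts(panePlan);
```

Here the docs worker also claims app/billing/form.tsx, so the manager gets one conflict to resolve before opening panes.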

Real web-dev stories

Route migration split

Full-stack lead

A Next.js migration needs source audit, implementation, browser proof, and docs update in parallel.

Codex brief

Manager opens workers for route audit, component patch, browser proof, and docs. Only manager summarizes and asks for decisions.

  • worker receipts
  • server log
  • browser proof
  • single manager summary

Bug bash control room

Frontend lead

Several agents should explore different app personas without losing track.

Codex brief

Assign each pane a persona and require numbered findings, screenshots, severity, and reproducibility.

  • per-persona QA files
  • deduped summary
  • known false positives removed

Tips and gotchas

  • Give every pane a name and a single owner.
  • Use workers for gathering and narrow edits; keep decisions in the manager pane.
  • Give each long-running server its own pane so its logs do not disappear.
  • Have Codex read worker screens/files back before summarizing.
References agents can follow

CMUX
https://cmux.com/

cmux extensions
https://github.com/johnlindquist/cmux-extensions

Zellij
https://zellij.dev/

Lesson 01 / Lesson 10 - Oracle: scenario-oracle-handoff

Oracle + PackX Handoff Field Guide

Spend Codex on implementation, not on reading 500K tokens of context.

Use Oracle with a codebase.
Workshop transcript

Use this when the hard part is broad context synthesis and Codex should bring back an implementation-ready plan.

Primitive | Use when | Codex does | Artifact
PackX preview | Before any large handoff | Verify files and token estimate | ~/.oracle/bundles/<slug>.txt
Browser Oracle | Expensive broad reasoning | Upload bundle to ChatGPT browser mode | ~/.oracle/sessions/<slug>/output.log
Codex implementation | After plan returns | Patch, test, verify, summarize | .notes/<task>/verification.md

Agent-ready prompt
Use Oracle + PackX for this broad codebase question.
Preview the bundle first.
Build a markdown bundle under ~/.oracle/bundles/<slug>.txt.
Run Oracle in browser mode with a 3-5 word slug.
Read the full Oracle output, write a local plan, implement the first shippable slice, and verify locally before asking Oracle again.
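
The "preview first" and "3-5 word slug" steps can be sanity-checked locally. The sketch below does not call PackX or Oracle; previewBundle and isValidSlug are hypothetical helpers, the four-characters-per-token ratio is a rough assumption, and the hyphenated lowercase slug format is also assumed.

```typescript
// Hypothetical pre-handoff checks; this is not PackX's own estimator.
interface BundleFile {
  path: string;
  chars: number;
}

interface BundlePreview {
  fileCount: number;
  estimatedTokens: number;
  withinBudget: boolean;
}

function previewBundle(files: BundleFile[], tokenBudget: number): BundlePreview {
  const totalChars = files.reduce((sum, file) => sum + file.chars, 0);
  // Rough heuristic: about four characters per token.
  const estimatedTokens = Math.ceil(totalChars / 4);
  return { fileCount: files.length, estimatedTokens, withinBudget: estimatedTokens <= tokenBudget };
}

// Assumed slug format: three to five lowercase words joined by hyphens.
function isValidSlug(slug: string): boolean {
  return /^[a-z0-9]+(-[a-z0-9]+){2,4}$/.test(slug);
}

const bundlePreview = previewBundle(
  [
    { path: "tests/cart.test.ts", chars: 4000 },
    { path: "ci/last-run.log", chars: 1000 },
  ],
  1000,
);
```

In this example the estimate is 1250 tokens against a budget of 1000, so the bundle should be trimmed before it is uploaded.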

Real web-dev stories

Stale test-suite migration

Senior engineer

Hundreds of tests need classification into keep, rewrite, browser QA, or delete.

Codex brief

Pack tests, source owners, failures, and CI logs; ask Oracle for a migration plan; Codex implements one bucket at a time.

  • bundle path
  • Oracle plan
  • migration goal file
  • first verified slice

Architecture consistency audit

Tech lead

A repo has several competing patterns and nobody knows the current owner path.

Codex brief

Bundle current source and docs, ask Oracle for live owner paths and invariants, then patch docs and tests locally.

  • owner map
  • invariant list
  • docs patch
  • verification commands

Tips and gotchas

  • Preview before packing so the handoff is intentional.
  • Keep Oracle text-only; the local agent owns artifacts and commits.
  • Do not ask Oracle again on the same topic until you have implemented, verified, hit a blocker, or narrowed scope.
  • Preserve the session slug and output path for auditability.
References agents can follow

Oracle
https://askoracle.sh/

Oracle GitHub
https://github.com/johnlindquist/oracle

PackX npm
https://www.npmjs.com/package/packx

Lesson 03 / Lesson 10 / Lesson 11 - Oracle: scenario-agent-memory

Goal, Memory, and Context Hygiene Field Guide

Keep long-running Codex work grounded with local files, compact prompts, and receipt checkpoints.

compressed, easy to read, less tokens
Tyler Newman, workshop chat

Use this when /goal work burns tokens, drifts, or cannot converge because context is not being externalized.

Primitive | Use when | Codex does | Artifact
.goals/ | Durable objective and checkpoints | Long-running work | .goals/<task>.md
.notes/ | Receipts and scratch state | Verification, logs, decisions | .notes/<task>/
Grill-me | Clarify before a big goal | Ambiguous or risky scope | .notes/<task>/questions.md
Compact summaries | Keep Codex in the useful zone | Large sessions and repeated loops | .notes/<task>/handoff.md

Agent-ready prompt
Before starting this goal, externalize state.
Create .goals/<task>.md with objective, constraints, stop condition, and verification.
Create .notes/<task>/ for receipts.
Every major phase must write a checkpoint: research, plan, implementation, verification, final diff review.
If context gets large, compact to .notes/<task>/handoff.md and resume from that artifact.
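
The goal file and phase checkpoints described above can be modeled in a few lines. This is a sketch, not a Codex format: the Goal type, renderGoal, and nextPhase are hypothetical names, and the example goal is invented. Only the field list and the five phases come from the prompt.

```typescript
// Hypothetical model of .goals/<task>.md and its phase checkpoints.
const PHASES = ["research", "plan", "implementation", "verification", "final diff review"] as const;
type Phase = (typeof PHASES)[number];

interface Goal {
  task: string;
  objective: string;
  constraints: string[];
  stopCondition: string;
  verification: string;
}

function renderGoal(goal: Goal): string {
  return [
    `# Goal: ${goal.task}`,
    `Objective: ${goal.objective}`,
    "Constraints:",
    ...goal.constraints.map((constraint) => `- ${constraint}`),
    `Stop condition: ${goal.stopCondition}`,
    `Verification: ${goal.verification}`,
  ].join("\n");
}

// Return the first phase that has no checkpoint yet, or null when all are done.
function nextPhase(done: Phase[]): Phase | null {
  for (const phase of PHASES) {
    if (done.indexOf(phase) === -1) return phase;
  }
  return null;
}

const goalText = renderGoal({
  task: "cart-badge",
  objective: "Show the item count on the cart icon",
  constraints: ["No new dependencies", "Touch only the header component"],
  stopCondition: "Badge matches cart contents after add and remove",
  verification: "Browser receipt plus passing unit tests",
});
const upcoming = nextPhase(["research", "plan"]);
```

Resuming from a handoff file then becomes a matter of reading which checkpoints exist and asking for the next phase only.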

Real web-dev stories

Goal that will not converge

Solo developer

Codex keeps exploring alternatives and never ships.

Codex brief

Narrow to one shippable slice, write an explicit stop condition, and require a verification receipt before continuing.

  • goal file
  • single next slice
  • verification command
  • final receipt

Memory drift cleanup

Repo maintainer

Agent memory says old commands are true, but package scripts changed.

Codex brief

Audit memories against package.json/CI, write corrections to local notes, and update durable instructions only after verification.

  • memory audit
  • verified commands
  • updated AGENTS.md or skill
  • stale facts removed

Tips and gotchas

  • Use local markdown as the durable memory for a task; do not rely on hidden conversational state.
  • Ask for numbered options with a recommended pick when you need to steer quickly.
  • Treat /goal as research + plan + implement + verify in one bounded loop, not infinite autoresearch.
  • Use compact summaries when receipts are better than raw chat history.
References agents can follow

Grill Me skill
https://www.aihero.dev/skills-grill-me

Beads
https://github.com/gastownhall/beads

Caveman
https://github.com/JuliusBrussee/caveman

Lesson 09 / Lesson 10 - Oracle: scenario-daemon-profile

Codex Daemon Profile Field Guide

Turn repeated web-dev chores into narrow, cheap, low-context Codex workers.

low reasoning is under utilized
rosa, workshop chat

Use this when a repeated CLI or browser-adjacent task deserves its own focused Codex profile instead of a full general agent.

Primitive | Use when | Codex does | Artifact
Daemon profile | Repeated narrow task | Preview deploy checks, log triage, changelog drafting | profiles/pro-<tool>.ts
Fake CODEX_HOME | Hard isolation | Disable memories/MCP/plugins unless needed | .codex-daemon-home/
Command map | Small tool vocabulary | Literal tasks with predictable outputs | README plus examples

Agent-ready prompt
Design a narrow Codex daemon profile for this repeated task.
Use one command map, one output shape, low reasoning by default, and no broad memory/MCP/plugin access unless required.
Keep it read-only first.
Write docs, examples, and a smoke test before using it on a real repo.
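
A profile with one command map and a read-only default might be shaped like the sketch below. This is not the codex-daemons API: DaemonProfile, resolveCommand, and the changelog example are hypothetical, and the only real commands used are plain git invocations.

```typescript
// Hypothetical profile shape for a narrow, read-only worker.
interface DaemonProfile {
  name: string;
  reasoning: "low" | "medium" | "high";
  readOnly: boolean;
  commandMap: Record<string, { run: string; mutates: boolean }>;
}

type Resolved = { ok: true; run: string } | { ok: false; error: string };

// Only tasks in the command map are allowed; mutating ones are refused while read-only.
function resolveCommand(profile: DaemonProfile, task: string): Resolved {
  if (!Object.prototype.hasOwnProperty.call(profile.commandMap, task)) {
    return { ok: false, error: `unknown task: ${task}` };
  }
  const entry = profile.commandMap[task];
  if (profile.readOnly && entry.mutates) {
    return { ok: false, error: `task ${task} mutates and the profile is read-only` };
  }
  return { ok: true, run: entry.run };
}

const changelogProfile: DaemonProfile = {
  name: "pro-changelog",
  reasoning: "low",
  readOnly: true,
  commandMap: {
    "recent-commits": { run: "git log --oneline -n 20", mutates: false },
    "changed-files": { run: "git diff --name-only HEAD~1", mutates: false },
    publish: { run: "git push", mutates: true },
  },
};

const allowed = resolveCommand(changelogProfile, "recent-commits");
const refused = resolveCommand(changelogProfile, "publish");
const unknown = resolveCommand(changelogProfile, "deploy");
```

Marking each command as mutating or not keeps the "start read-only" rule enforceable in code rather than in prose.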

Real web-dev stories

Preview deploy checker

Frontend lead

Every PR needs preview status, changed-route browser proof, and a short verdict.

Codex brief

Create a read-only deploy-check profile that discovers preview URL, opens changed routes, and writes a browser receipt.

  • profile file
  • command map
  • read-only proof
  • smoke test

Log triage worker

On-call engineer

A local dev server spews logs, and a general Codex session wastes context reading everything.

Codex brief

Create a profile that tails, filters, summarizes, and writes exact blocker text for the main Codex session.

  • filtered log
  • blocker summary
  • no source edits
  • handoff path
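
The filter-and-summarize step can be sketched as a pure function over log lines. The triageLog helper is hypothetical, and the match pattern and the cap of five blockers are assumptions to adjust per project; the point is that blocker text is passed through exactly as logged.

```typescript
// Hypothetical log triage: keep exact error lines, drop repeats, cap the output.
interface Triage {
  matched: number;
  blockers: string[];
}

function triageLog(lines: string[], maxBlockers = 5): Triage {
  // Assumed signal words; extend for the frameworks in use.
  const pattern = /(error|failed|unhandled|ECONNREFUSED)/i;
  const blockers: string[] = [];
  let matched = 0;
  for (const line of lines) {
    if (!pattern.test(line)) continue;
    matched += 1;
    const exact = line.trim();
    // Report each distinct line once, in first-seen order.
    if (blockers.indexOf(exact) === -1 && blockers.length < maxBlockers) {
      blockers.push(exact);
    }
  }
  return { matched, blockers };
}

const triage = triageLog([
  "ready - started server on 0.0.0.0:3000",
  "Error: Cannot find module './cart'",
  "GET /cart 200 in 12ms",
  "Error: Cannot find module './cart'",
  "TypeError: total is not a function",
]);
```

For the sample lines, three lines match but only two distinct blockers are handed to the main session.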

Tips and gotchas

  • A daemon profile should be boring: one job, one command map, one output format.
  • Disable unrelated memories, MCPs, apps, and skills to keep context small.
  • Start read-only; add mutation only after a smoke test proves the profile is scoped.
  • Use low reasoning for literal CLI wrappers and reserve smarter models for ambiguous planning.
References agents can follow

codex-daemons
https://github.com/johnlindquist/codex-daemons

Codex SDK
https://developers.openai.com/codex/sdk

Codex MCP
https://developers.openai.com/codex/mcp

Lesson 04 / Lesson 05 / Lesson 06 / Lesson 11 - Oracle: scenario-video-reqs

Design, Video, and QA Story Field Guide

Convert taste, screenshots, narrated videos, and weird user behavior into Codex-ready implementation stories.

Bundle files, logs, issues, screenshots and traces.
Workshop chat

Use this when words are weak and Codex needs visual requirements plus browser-verifiable acceptance criteria.

Primitive | Use when | Codex does | Artifact
design.md | Durable taste contract | Style DNA, type, color, layout | design.md
Image reference | Human-selected direction | Implement one concrete look | .notes/design/reference.png
Video requirements | Narrated UX behavior | Extract acceptance criteria and edge cases | .goals/<feature>-video.md
QA swarm | Weird-path exploration | Personas, broken paths, browser receipts | .notes/qa/

Agent-ready prompt
Treat the visual artifact as requirements, not decoration.
Extract acceptance criteria, user stories, responsive states, and proof moves.
Implement only the selected direction.
Verify in browser with screenshots/logs.
Write the durable contract to design.md, .goals/<feature>.md, or .notes/qa/<flow>.md depending on the task.

Real web-dev stories

Reference image to UI

Product engineer

The team can recognize the desired look but cannot write the CSS spec.

Codex brief

Ask Codex to distill the selected image into design.md, implement one route, then browser-proof the result.

  • design.md update
  • implemented route
  • desktop/mobile screenshots
  • diff review

Narrated bug report

Full-stack developer

A PM records cart behavior that is faster to show than explain.

Codex brief

Extract requirements from video into a goal file, implement behavior, and prove cart states in browser.

  • video-derived goal
  • acceptance criteria
  • browser proof
  • test/typecheck log

5 Monkeys QA swarm

QA-minded frontend lead

Happy path passes, but users hit weird navigation and state sequences.

Codex brief

Launch persona-based QA workers that write reproducible findings, then collate only high-signal bugs.

  • persona findings
  • dedupe summary
  • screenshots
  • repro steps

Tips and gotchas

  • Do not ask Codex to invent taste from nothing; hand it references and constraints.
  • A video should become acceptance criteria, not a brittle click macro.
  • Keep visual QA disposable until the flow stabilizes; promote only stable checks to tests.
  • Every visual/design story should end with browser proof across relevant viewports.
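
"Browser proof across relevant viewports" can be made checkable by listing the screenshots a story is expected to produce. This is a sketch with hypothetical names: ViewportSpec, expectedScreenshots, missingScreenshots, the file-naming scheme, and the viewport sizes are all assumptions.

```typescript
// Hypothetical proof matrix: one expected screenshot per route and viewport.
interface ViewportSpec {
  name: string;
  width: number;
  height: number;
}

function routeSlug(route: string): string {
  const trimmed = route.replace(/^\/+|\/+$/g, "");
  // The root route gets a readable name; other separators become hyphens.
  return trimmed === "" ? "home" : trimmed.replace(/[^a-zA-Z0-9]+/g, "-");
}

function expectedScreenshots(routes: string[], viewports: ViewportSpec[]): string[] {
  const names: string[] = [];
  for (const route of routes) {
    for (const viewport of viewports) {
      names.push(`${routeSlug(route)}-${viewport.name}.png`);
    }
  }
  return names;
}

function missingScreenshots(expected: string[], captured: string[]): string[] {
  return expected.filter((name) => captured.indexOf(name) === -1);
}

const proofMatrix = expectedScreenshots(
  ["/", "/settings/billing"],
  [
    { name: "desktop", width: 1280, height: 800 },
    { name: "mobile", width: 390, height: 844 },
  ],
);
const stillMissing = missingScreenshots(proofMatrix, [
  "home-desktop.png",
  "settings-billing-desktop.png",
]);
```

In this example both mobile screenshots are still missing, which is exactly the gap a receipt check should surface before the story is called done.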
References agents can follow

Impeccable
https://impeccable.style/

Inspiration board
https://inspiration-board.pages.dev/

OpenAI Codex Figma/frontend post
https://developers.openai.com/codex

Playwright screenshots
https://playwright.dev/docs/screenshots