
🧰 Build it yourself

Prompt & Skill Toolkit

Every technique from the workshop, ready to reuse — 11 copy-paste prompts and 10 installable Codex skills, grouped by part and grounded in the real commands John demonstrated. Each card pairs the prompt you paste into your agent with the skill you install once and forget.

Fill in the <PLACEHOLDERS>. Hit Copy, paste, go. Expand a skill to grab its SKILL.md.

Part 1 · Morning

Foundations: escape token limits, orchestrate your terminal, and give your agents memory.

Prompt + Skill · Lesson 01 · Escape the Token Ceiling

Oracle handoff for a huge codebase question

Use when a read-heavy question spans a large codebase and would burn your Codex token/rate budget. Offload the heavy analysis to a browser ChatGPT Pro Extended Thinking session via Oracle + PackX, then bring the returned plan back into Codex and implement locally.

Prompt · Paste into your agent

Copy-paste prompt

You are my coding agent. I have a large, read-heavy question about this codebase and I want to offload the heavy analysis to a browser ChatGPT Pro session via Oracle + PackX instead of burning local tokens. Follow these steps exactly and do not skip the preview.

1. Bundle the relevant files with PackX. First PREVIEW so we see what gets included and the token estimate:
     packx --preview -s "<TOPIC>"
   When the preview looks right, write the bundle:
     packx -s "<TOPIC>" -f markdown --no-interactive -o .notes/handoff-bundle.md

2. Send the bundle to ChatGPT Pro in the browser with extended thinking. The slug MUST be 3-5 kebab words:
     ORACLE_MAX_FILE_SIZE_BYTES=12000000 oracle --engine browser --browser-thinking-time extended \
       -p "<THE BIG ANALYSIS QUESTION>" \
       --slug "<3-5-word-kebab-slug>" \
       --write-output .notes/oracle-plan.md \
       --file .notes/handoff-bundle.md

3. Read .notes/oracle-plan.md, then implement the plan locally and verify (run the tests/build). Treat this as a plan -> implement -> verify loop.

Guardrail: keep browser concurrency LOW — no more than 2-3 Oracle browser requests at once.

How it works

packx --preview -s "<TOPIC>" shows the matched files and a token estimate before you commit — always preview first so the bundle is scoped, not bloated.
packx -s "<TOPIC>" -f markdown --no-interactive -o .notes/handoff-bundle.md writes a scripted markdown bundle (no prompts).
oracle --engine browser --browser-thinking-time extended drives ChatGPT Pro in the browser with extended thinking; --write-output saves ONLY the final assistant message, and --file attaches the bundle (cap raised via ORACLE_MAX_FILE_SIZE_BYTES=12000000).
The --slug must be 3-5 kebab words or Oracle rejects it.
The returned plan is the durable artifact: read it back into Codex, implement, then verify. Keep concurrency low (2-3 browser requests) to stay within ChatGPT Pro limits.
Reference: John's Oracle fork, PackX.
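
The slug rule above can be checked before calling Oracle. This is a minimal sketch, not Oracle's own validation: it assumes a "kebab word" means lowercase letters or digits joined by single hyphens, and the helper names `is_valid_slug` and `build_prompt` are hypothetical. The `[slug]` first-line layout follows what the skill file for this card asks for.

```python
import re

# Assumption: a kebab word is lowercase letters/digits; Oracle's real rule may differ.
KEBAB_WORD = re.compile(r"[a-z0-9]+")


def is_valid_slug(slug: str) -> bool:
    """Return True when the slug has 3 to 5 hyphen-separated kebab words."""
    words = slug.split("-")
    return 3 <= len(words) <= 5 and all(KEBAB_WORD.fullmatch(w) for w in words)


def build_prompt(slug: str, question: str) -> str:
    """Put "[slug]" on line 1, then a blank line, then the question."""
    if not is_valid_slug(slug):
        raise ValueError(f"slug must be 3-5 kebab words, got: {slug!r}")
    return f"[{slug}]\n\n{question.strip()}"
```

Building the prompt text in code also sidesteps shell quoting, since the newlines are real characters rather than escape sequences.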

Codex skill · oracle-codebase-handoff · Install & forget

Save as ~/.agents/skills/oracle-codebase-handoff/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: oracle-codebase-handoff
description: Trigger when a read-heavy question spans a large codebase and would burn Codex token/rate budget — offload heavy analysis to a browser ChatGPT Pro session via Oracle + PackX, then resume implementation locally. Do NOT trigger for small, local edits.
---

# Oracle codebase handoff

Offload a big, read-heavy codebase analysis to a browser ChatGPT Pro session via Oracle + PackX, then bring the plan back and implement locally. Run as a plan -> implement -> verify loop.

## Steps

1. Bundle the relevant files with PackX. PREVIEW first to confirm scope and token estimate, then write the bundle:
   ```bash
   packx --preview -s "<topic>"
   packx -s "<topic>" -f markdown --no-interactive -o .notes/handoff-bundle.md
   ```
   `-i` is a name/extension GLOB, not a path selector — pass a known file/dir POSITIONALLY. Always `--preview` first; use `--no-interactive` for scripted bundles.

2. Send the bundle to ChatGPT Pro in the browser with extended thinking. The slug MUST be 3-5 kebab words, and the prompt text must start with `[<slug>]` on line 1, a blank line, then the question:
   ```bash
   ORACLE_MAX_FILE_SIZE_BYTES=12000000 oracle --engine browser --browser-thinking-time extended \
     -p $'[<3-5-word-kebab-slug>]\n\n<the big analysis question>' \
     --slug "<3-5-word-kebab-slug>" \
     --write-output .notes/oracle-plan.md \
     --file .notes/handoff-bundle.md
   ```
   - `--engine browser` automates ChatGPT in the browser (GPT models).
   - `--browser-thinking-time extended` sets ChatGPT thinking time (light|standard|extended|heavy).
   - `--write-output` writes ONLY the final assistant message.
   - `--file` attaches the bundle (size cap raised via `ORACLE_MAX_FILE_SIZE_BYTES`).

3. Read `.notes/oracle-plan.md`, then implement the plan locally and verify (tests/build).

## Guardrails
- Slug MUST be 3-5 kebab words or Oracle rejects it.
- Keep browser concurrency LOW — no more than 2-3 Oracle browser requests at once.
- Do NOT use this for small, local edits — only for read-heavy, large-codebase questions.

Prompt + Skill · Lesson 02 · Command Your Terminal Like an Orchestra

Command a CMUX manager/worker layout

Use when you want to drive several agent panes in parallel but only talk to one. This sets up a Manager pane plus worker panes in CMUX using natural language; the Manager opens, inspects, and delegates to the workers while you only converse with the Manager.

Prompt · Paste into your agent

Copy-paste prompt

You are the Manager agent running in my focused CMUX pane. I will only talk to you; you coordinate worker panes. Use the REAL cmux CLI (do not invent flags). Keep the upfront planning human-led — confirm the plan with me before delegating.

1. Build the layout. Open a worker surface to the right and create additional worker panes as needed:
     cmux new-split right
     cmux new-pane
   Name the tabs so they stay findable:
     cmux rename-tab "<role-name>"

2. Always inspect live state before acting on a worker:
     cmux --json tree        # full pane/surface tree as JSON (focused pane, refs, cwd)
     cmux list-panes         # enumerate worker panes

3. Delegate a task into a worker surface, then read its output back as context:
     cmux send -- "<TASK TEXT FOR THAT WORKER>"
     cmux read-screen        # read that worker's terminal text back

4. Loop: you (Manager) talk to me, workers do the research/implementation. Report status back to me; do not let workers drift without checking cmux --json tree.

Guardrail: prefer PANES over tab groups — tab groups are easy to lose track of.

How it works

cmux new-split right and cmux new-pane build the manager/worker layout; cmux rename-tab "<name>" keeps roles findable.
cmux --json tree returns the live window/workspace/pane/surface tree as JSON (focused pane, refs like pane:89/surface:126, working directory) — inspect it before acting.
cmux list-panes enumerates worker panes.
cmux send -- "<text>" injects a task into a worker surface; cmux read-screen reads that worker's terminal text back as context for the Manager.
The Manager talks to the human; workers do research/implementation. Keep upfront planning human-led.
Prefer panes over tab groups — tab groups are easy to lose track of.
Reference: CMUX.
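
The inspect, send, read-back cycle above could be scripted roughly as follows. This is a sketch under assumptions: it only uses the three cmux invocations quoted in this card, it does not show how a specific worker surface is targeted (the text does not say), and the `delegate` and `run_cli` helpers are hypothetical names.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the command's stdout as text.
Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Default runner: execute the command and return its stdout."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return result.stdout


def delegate(task: str, run: Runner = run_cli) -> str:
    """Inspect live state, send a task, then read the worker's screen back."""
    run(["cmux", "--json", "tree"])      # inspect before acting
    run(["cmux", "send", "--", task])    # inject the task text
    return run(["cmux", "read-screen"])  # terminal text for the Manager's context
```

Passing the runner as a parameter keeps the sequence easy to dry-run with a fake before pointing it at a live workspace.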

Codex skill · cmux-orchestrate · Install & forget

Save as ~/.agents/skills/cmux-orchestrate/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: cmux-orchestrate
description: Trigger when the user wants to drive multiple terminal/agent panes in CMUX from natural language — build manager/worker layouts, open/arrange panes, inspect or send work to other panes. Do NOT trigger outside a CMUX workspace.
---

# CMUX orchestration

Act as the Manager agent: build a pane layout, inspect live state, and delegate work to worker panes using the REAL cmux CLI. Do not invent flags. Inspect with `cmux --json tree` before acting.

## Command map (real cmux CLI)

```bash
cmux new-split <left|right|up|down>   # split from the focused surface in a direction
cmux new-pane                         # create a pane (terminal or browser content)
cmux --json tree                      # full window/workspace/pane/surface tree as JSON
cmux list-panes                       # enumerate panes in the workspace
cmux send -- "<text>"                 # send a task into a terminal surface
cmux read-screen                      # read a surface's terminal text back as context
cmux rename-tab "<name>"              # name a tab so it stays findable
cmux focus-pane <ref>                 # focus a pane (ref like pane:89 or surface:126)
```

Handles accept UUIDs, refs like `workspace:2` / `pane:89` / `surface:126`, and indexes.

## Manager-worker pattern
1. Build the layout: `cmux new-split right`, then `cmux new-pane` for each worker; name them with `cmux rename-tab "<role>"`.
2. Before acting on any worker, inspect live state: `cmux --json tree` (focused pane, refs, cwd) and `cmux list-panes`.
3. Delegate work: `cmux send -- "<task>"` into a worker, then `cmux read-screen` to pull its output back as context.
4. You (Manager) talk to the human; workers do research/implementation. Report status back.

## Guardrails
- Prefer PANES over tab groups — tab groups are easy to lose track of.
- Only use inside a CMUX workspace.

Prompt + Skill · Lesson 03 · Give Your Agents a Memory

Agent-managed memory folders

Use at the start of any project where you want agents to keep visible Markdown plans, notes, and goals close to the code without committing them. This creates `.notes/` and `.goals/`, ignores them globally, and opens a live-refreshing Markdown tracker pane.

Prompt · Paste into your agent

Copy-paste prompt

You are my coding agent. Set up local agent-managed memory the workshop way, then use it. These are LOCAL working files, not source — keep them boring and close to the repo, and never commit them.

1. Create the memory folders:
     mkdir -p .notes .goals

2. Ignore them across ALL my repos via a global gitignore:
     git config --global core.excludesFile ~/.gitignore_global
     echo '.notes/' >> ~/.gitignore_global
     echo '.goals/' >> ~/.gitignore_global

3. Write the current plan into .goals/ as Markdown, then open a live-refreshing tracker pane next to the code:
     cmux markdown open .notes/tracker.md --direction right

4. When you plan, use the numbered-option pattern so I can steer with terse refs: give me a NUMBERED list where each item has options A/B/C and YOUR recommended pick. I will reply with refs like "5B" or "27D" to choose.

How it works

mkdir -p .notes .goals creates the two local memory folders.
git config --global core.excludesFile ~/.gitignore_global then appending .notes/ and .goals/ keeps them out of every repo by default — they are working files, not source.
cmux markdown open .notes/tracker.md --direction right opens the Markdown in a formatted viewer with live file watching (auto-refresh) as a side tracker pane.
The numbered-option planning pattern (each item has A/B/C plus the agent's recommended pick) lets you steer with terse references like 5B.
Reference: Git global gitignore (core.excludesFile), CMUX.
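
To illustrate how terse references like 5B map back to plan items, here is a small sketch. It assumes a ref is an item number immediately followed by a single option letter; the `parse_refs` helper is hypothetical and not part of any tool named here.

```python
import re

# Assumption: a ref is digits immediately followed by one letter, e.g. "5B" or "27D".
REF = re.compile(r"\b(\d+)([A-Za-z])\b")


def parse_refs(reply: str) -> list[tuple[int, str]]:
    """Extract (item number, option letter) pairs from a terse reply."""
    return [(int(num), letter.upper()) for num, letter in REF.findall(reply)]
```

A reply such as `5B, 27D` then selects option B for item 5 and option D for item 27.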

Codex skill · agent-memory-folders · Install & forget

Save as ~/.agents/skills/agent-memory-folders/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: agent-memory-folders
description: Trigger at project start (or when an agent needs visible local working memory) to create .notes/ and .goals/, ignore them via global gitignore, and open a live Markdown tracker. Do NOT use for files meant to be committed.
---

# Agent memory folders

Provision local agent-managed memory: `.notes/` for working notes and `.goals/` for plans, ignored globally so they never get committed, plus a live Markdown tracker pane. These are LOCAL working files — keep them boring and close to the repo.

## Steps

1. Create the folders:
   ```bash
   mkdir -p .notes .goals
   ```

2. Ignore them across ALL repos via a global gitignore:
   ```bash
   git config --global core.excludesFile ~/.gitignore_global
   echo '.notes/' >> ~/.gitignore_global
   echo '.goals/' >> ~/.gitignore_global
   ```

3. Write the current plan into `.goals/` as Markdown, then open a live-refreshing tracker pane (auto-refresh via file watching) next to the code:
   ```bash
   cmux markdown open .notes/tracker.md --direction right
   ```

4. Use the numbered-option planning pattern: present a NUMBERED list where each item has options A/B/C plus your recommended pick, so the human steers with terse refs like `5B` or `27D`.

## Guardrails
- These are local working files, NOT source — never commit them.
- Keep them boring and close to the repo.

Part 2 · After the morning break

Building trust: source-of-truth context, prompting techniques, and designing with images and video.

Prompt + Skill · Lesson 04 · Source of Truth Over Stale Memory

A reusable design.md standard

Use when you want AI design work to follow a durable standard instead of one-off vibes. This turns visual inspiration into a `design.md` design contract (style DNA, color, type, layout, motion) that the agent reuses on every UI task.

Prompt · Paste into your agent

Copy-paste prompt

You are my coding agent. Author a durable design contract at design.md, then build UI FROM it. The goal is to turn taste into a concrete, reusable artifact — not stale memory. Code stays the source of truth; design.md is a stable design brief.

1. First, gather visual inspiration: collect the generated inspiration images I provide (or generate a sheet of options) and note what I actually picked.

2. Distill those references into design.md with these sections, each concrete and specific:
     - references (the images/directions I chose)
     - style DNA (the overall feel in a few words)
     - color (palette + usage rules)
     - typography (families, scale, weights)
     - layout (grid, spacing, density)
     - composition
     - shape (radii, borders, elevation)
     - motion (timing, easing, where motion is / isn't used)
     - constraints (hard rules the UI must respect)

3. Now build the UI FROM design.md (e.g. a landing page <PAGE/COMPONENT>), referencing the contract for every visual decision so the result reflects the standard, not ad-hoc choices.

4. To explore directions, pair this with the throwaway multi-route variant move: generate a few variants, compare, keep one, delete the rest.

How it works

The workflow is: collect generated inspiration images first, then distill them into design.md, then have the agent build UI FROM design.md so taste becomes a concrete artifact.
The design.md sections (references, style DNA, color, typography, layout, composition, shape, motion, constraints) form a stable design contract the agent reuses.
Code is the source of truth; design.md is a stable design brief, not stale memory to drift from.
Pair it with the throwaway multi-route variant move (generate, compare, keep one) to explore directions cheaply.
Reference: Inspiration Board, Codex CLI.
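
One way to keep the contract complete is a quick check that every section listed above is present. This sketch assumes each section is written as a Markdown heading whose text is the section name; that layout is an assumption, and `missing_sections` is a hypothetical helper.

```python
import re

REQUIRED_SECTIONS = [
    "references", "style dna", "color", "typography", "layout",
    "composition", "shape", "motion", "constraints",
]

# Assumption: sections appear as Markdown headings such as "## Color".
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_sections(design_md: str) -> list[str]:
    """Return the required section names that have no matching heading."""
    headings = {m.group(1).lower() for m in HEADING.finditer(design_md)}
    return [name for name in REQUIRED_SECTIONS if name not in headings]
```

An empty result means all nine sections exist; it says nothing about whether their content is concrete enough.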

Codex skill · design-md-standard · Install & forget

Save as ~/.agents/skills/design-md-standard/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: design-md-standard
description: Trigger when starting UI/design work that should follow a durable standard rather than one-off prompts — author or update a design.md design contract, then build UI from it. Do NOT use as stale long-term project memory.
---

# design.md standard

Create or maintain a durable `design.md` design contract, then build UI FROM it so taste becomes a concrete, reusable artifact. Code stays the source of truth; `design.md` is a stable brief, not memory to drift.

## Steps

1. Gather visual inspiration first: collect the generated inspiration images the user provides (or generate a sheet of options) and note which direction was actually picked.

2. Distill the references into `design.md` with these sections, each concrete and specific:
   - **references** — the chosen images/directions
   - **style DNA** — the overall feel in a few words
   - **color** — palette + usage rules
   - **typography** — families, scale, weights
   - **layout** — grid, spacing, density
   - **composition**
   - **shape** — radii, borders, elevation
   - **motion** — timing, easing, where motion is / isn't used
   - **constraints** — hard rules the UI must respect

3. Build the UI FROM `design.md`, referencing the contract for every visual decision (color, type, spacing, motion) so the result reflects the standard, not ad-hoc choices.

## Guardrails
- Treat code as the source of truth; `design.md` is a stable design brief, NOT stale long-term memory.
- Pair with the throwaway route-variants move (generate, compare, keep one) when exploring directions.

Skill · Lesson 04 · Source of Truth Over Stale Memory

Throwaway route variants

Trigger when exploring UI/layout directions and you want fast comparison without a permanent component catalog — generate several route-level variants in-app, review, keep one, delete the rest. Do NOT use when a maintained design system is required.

Codex skill · throwaway-route-variants · Install & forget

Save as ~/.agents/skills/throwaway-route-variants/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: throwaway-route-variants
description: Trigger when exploring UI/layout directions and you want fast comparison without a permanent component catalog — generate several route-level variants in-app, review, keep one, delete the rest. Do NOT use when a maintained design system is required.
---

# Throwaway route variants

The anti-Storybook move: generate several route-level UI variants inside the running app, compare them, keep the winner, and DELETE the losers. Favors shipping momentum over a maintained catalog. These variants are explicitly disposable.

## Steps

1. Generate 5 route-level variants of the target UI INSIDE the running app — each variant is its own route (e.g. `/variant-a`, `/variant-b`, ...), not a permanent component.

2. Add simple left/right navigation between the variants so the human can flip through and compare them live.

3. Let the human pick the winner.

4. DELETE the losing routes/variants — do not keep them around as a catalog.

## Guardrails
- Variants are explicitly disposable; this favors shipping momentum over a maintained component catalog.
- Do NOT use when a maintained design system / permanent component library is required.

Prompt + Skill · Lesson 05 · The 5 Monkeys & Perspective Prompting

5 Monkeys QA swarm

Use when you want to surface state bugs and UX dead ends a happy-path check would miss. This launches many cheap agents that click around in weird, non-obvious ways from assigned perspectives, recording findings to discardable QA files.

Prompt · Paste into your agent

Copy-paste prompt

Run a "5 Monkeys" QA swarm on <APP/FEATURE>. The point is to find state bugs and UX dead ends the happy path would never reveal — so do NOT just walk the happy path.

1. Assign yourself (or each worker agent) a PERSPECTIVE lens and stay in character:
     expert power user / brand-new first-time user / a grandparent / impatient power user.

2. Explore WEIRD, non-obvious paths explicitly:
     - keyboard shortcut sequences
     - back/forward navigation mid-flow
     - repeated/rapid clicks on the same control
     - toggling modes back and forth
     - partial or abandoned flows (start, then bail)
   Try to break state, not to succeed.

3. Write findings to a DISCARDABLE numbered QA markdown file (e.g. .notes/qa-01.md). These are cheap — regenerate freely.

4. After the runs, collate the findings and keep ONLY the useful ones.

Guardrails: keep rigorous tests for stable business logic — don't over-test UI you still expect to reshape daily. If you use a browser agent, script the auth/login path explicitly: cookies are NOT shared.

How it works

Each agent gets a perspective lens (expert, brand-new user, grandparent, power user) and explores the app in character.
Agents deliberately explore weird paths — shortcut sequences, back navigation, repeated clicks, mode toggles, partial/abandoned flows — instead of the happy path, which is where state bugs hide.
Findings go to discardable numbered QA markdown files (cheap, regenerate freely); collate afterward and keep only the useful ones.
Keep rigorous automated tests for stable business logic; don't over-test UI you still expect to reshape daily.
If using a browser agent, script the auth/login path explicitly — cookies are not shared between agents.
Reference: Agent Browser.
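
The assignment step above can be sketched as a tiny planner that hands each agent a perspective and its own numbered findings file. The perspective list and the `.notes/qa-01.md` naming come from this card; the round-robin assignment and the `plan_swarm` helper are assumptions for illustration.

```python
from itertools import cycle

PERSPECTIVES = [
    "expert power user",
    "brand-new first-time user",
    "a grandparent",
    "impatient power user",
]


def plan_swarm(agent_count: int, notes_dir: str = ".notes") -> list[dict[str, str]]:
    """Give each agent a perspective and a numbered, discardable findings file."""
    lenses = cycle(PERSPECTIVES)  # reuse perspectives when agents outnumber them
    return [
        {"perspective": next(lenses), "findings_file": f"{notes_dir}/qa-{i:02d}.md"}
        for i in range(1, agent_count + 1)
    ]
```

Separate files per agent keep the later collation step simple: read each file, keep the useful findings, delete the rest.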

Codex skill · five-monkeys-qa · Install & forget

Save as ~/.agents/skills/five-monkeys-qa/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: five-monkeys-qa
description: Trigger when you want to surface state bugs and UX dead ends a happy-path check would miss — launch perspective-based agents that explore weird, non-obvious paths and write findings to discardable QA files. Do NOT use to replace rigorous tests for stable business logic.
---

# 5 Monkeys QA

Launch cheap perspective-based agents that explore the app in weird, non-obvious ways to surface state bugs and UX dead ends the happy path would never reveal. Findings go to discardable QA files.

## Steps

1. Assign each agent a PERSPECTIVE lens and have it stay in character:
   - expert power user
   - brand-new first-time user
   - a grandparent
   - impatient power user

2. Explore WEIRD, non-obvious paths explicitly (not the happy path):
   - keyboard shortcut sequences
   - back/forward navigation mid-flow
   - repeated / rapid clicks on the same control
   - toggling modes back and forth
   - partial or abandoned flows (start, then bail)

3. Each agent writes findings to a DISCARDABLE numbered QA markdown file (e.g. `.notes/qa-01.md`). They are cheap — regenerate freely.

4. Collate the findings and keep ONLY the useful ones.

## Guardrails
- Keep rigorous automated tests for STABLE business logic — this does not replace them, and don't over-test UI you still expect to reshape daily.
- If using a browser agent, script the auth/login path explicitly — cookies are NOT shared.

Prompt · Lesson 06 · Design With Images & Video

Image-to-design handoff

Use when you can recognize the look you want but can't spec exact CSS. Generate a sheet of visual variations, pick the direction, and hand that concrete reference image to a coding agent to implement only that direction.

Prompt · Paste into your agent

Copy-paste prompt

Help me do an image-to-design handoff for <COMPONENT/SCREEN>. I can recognize the look I want but can't spec the CSS, so we'll let a selected reference image carry my taste.

1. Ask an image model for a SHEET of 5 visual variations of <COMPONENT/SCREEN> in one image (vary layout, color, type, density).

2. Stop and let ME select the direction I like. (My selection expresses taste far better than a vague text prompt.)

3. Take the chosen reference image and implement ONLY that direction in code — match its layout, spacing, color, and type. Do not blend in the other variations.

Guardrails:
- MCP-vs-CLI boundary: don't wrap a CLI in MCP just because MCP exists. Keep deterministic work in scripts/CLIs; reserve agents for judgment calls like this selection.
- Model-by-task-shape: use Codex for the coding follow-through; reserve fast Flash-class models for small, tightly scoped tasks only.

How it works

Step 1 asks an image model for a sheet of 5 variations of one component/screen at once.
Step 2 puts the human in the loop: selecting a direction expresses taste better than a vague prompt.
Step 3 hands the chosen reference image to the coding agent and constrains it to implement only that direction.
Taught caution — the MCP-vs-CLI boundary: don't wrap a CLI in MCP just because MCP exists; keep deterministic work in scripts/CLIs and reserve agents for judgment.
Model-by-task-shape: Codex for coding follow-through, fast Flash models only for small scoped tasks.
Reference: Google Gemini video/image prompting, Codex CLI.

Part 3 · After lunch

Going autonomous: hooks, self-improving loops, background daemons, and code as a throwaway artifact.

Prompt + Skill · Lesson 07 · Hooks, Skills & the Ralph Loop

Ralph Loop (stop-hook goal continuation)

Use when you want an agent to pursue a goal across many turns without manual nudging. A Stop hook re-checks the original task when the agent tries to stop and pushes it to keep working until a clear completion condition is met.

Prompt · Paste into your agent

Copy-paste prompt

You are my coding agent. Build a minimal "Ralph Loop" Codex plugin: a Stop hook that, when you would otherwise stop, re-points you at the original goal and makes you continue until a completion condition is met. Ground everything in the REAL Codex hook model: the three lifecycle events are UserPromptSubmit, PreToolUse (Tool Use), and Stop. The Ralph Loop is a STOP hook.

1. First enable hooks + plugins in ~/.codex/config.toml, then restart Codex:
     [features]
     hooks = true
     plugins = true
     plugin_hooks = true

2. Scaffold the real plugin layout:
     my-plugin/
       .codex-plugin/plugin.json      # manifest, with "hooks": "hooks/hooks.json"
       hooks/hooks.json               # wires a Stop hook command
       hooks/hook.py                  # the Stop hook script

3. In hooks/hooks.json, wire the Stop event to run the hook script, e.g.:
     {
       "hooks": {
         "Stop": [
           { "hooks": [ { "type": "command", "command": "python3 \"${CODEX_PLUGIN_ROOT}/hooks/hook.py\"", "timeout": 5 } ] }
         ]
       }
     }

4. hook.py reads the Stop event JSON on stdin and writes EXACTLY ONE control-JSON line. When the goal is NOT yet complete, emit {"continue": true} and re-state the original goal so you keep working; when the completion condition is reached, allow the stop. (Note: suppressOutput is ONLY valid on UserPromptSubmit.) Keep the hook fast and boring.

CRITICAL guardrail: the loop MUST have a hard completion condition — an iteration cap or an explicit target state — enforced by the hook, not just suggested in a prompt. Without it you get a runaway loop.

How it works

The three Codex hook lifecycle events are UserPromptSubmit, PreToolUse (Tool Use), and Stop; the Ralph Loop is a Stop hook.
Hooks must be enabled in ~/.codex/config.toml under [features] (hooks, plugins, plugin_hooks = true) and Codex restarted.
The real plugin layout is .codex-plugin/plugin.json (manifest) + hooks/hooks.json (wiring) + a hook script; ${CODEX_PLUGIN_ROOT} resolves the plugin dir.
The hook reads the event JSON on stdin and writes exactly one control-JSON line ({"continue": true} to keep going). The continuation is enforced by the harness, not a prompt.
CRITICAL: require an explicit completion condition (iteration cap or target state) or you get runaway loops.
Reference: Codex Hooks docs, Better Plugins.
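
The hard completion condition could be enforced with a small counter kept on disk, roughly as sketched here. This is not the workshop's hook.py: the state-file layout and the helper names `next_decision` and `control_line` are hypothetical, the `{"continue": true}` shape is taken from the text above, and the `{"continue": false}` stop shape is an assumption because the text does not show what a stop looks like.

```python
import json
from pathlib import Path


def next_decision(state_file: Path, max_iterations: int, goal_met: bool) -> dict:
    """Count this stop attempt and decide whether the agent should keep going."""
    count = 0
    if state_file.exists():
        count = json.loads(state_file.read_text()).get("iterations", 0)
    count += 1
    state_file.write_text(json.dumps({"iterations": count}))

    # Hard completion condition: stop when the goal is met OR the cap is reached.
    if goal_met or count >= max_iterations:
        return {"continue": False}  # assumption: stop shape is not shown in the text
    return {"continue": True}  # shape taken from the text above


def control_line(decision: dict) -> str:
    """Serialize the decision as exactly one JSON line."""
    return json.dumps(decision)
```

A real hook script would read the Stop event from stdin, work out `goal_met` from its own target-state check, and print `control_line(...)` once. Because the cap lives in the state file rather than in a prompt, a forgetful agent cannot talk its way past it.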

Codex skill · ralph-loop-hook · Install & forget

Save as ~/.agents/skills/ralph-loop-hook/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: ralph-loop-hook
description: Trigger when the user wants an agent to keep pursuing a goal across turns via a Codex Stop hook (the Ralph Loop). Do NOT use without a clear completion condition — unbounded loops waste tokens and run tools too long.
---

# Ralph Loop hook

Scaffold a Ralph Loop: a Codex Stop hook that, when the agent tries to stop, re-points it at the original goal and continues until a completion condition is met — enforced by the harness, not merely suggested in a prompt. The three Codex hook lifecycle events are UserPromptSubmit, PreToolUse (Tool Use), and Stop; the Ralph Loop is a Stop hook.

## Prerequisite

Enable hooks + plugins in `~/.codex/config.toml`, then restart Codex:
```toml
[features]
hooks = true
plugins = true
plugin_hooks = true
```
`/plugins` confirms install; `/hooks` reviews/trusts bundled hooks.

## Steps

1. Scaffold the real plugin layout:
   ```
   my-plugin/
     .codex-plugin/plugin.json      # manifest, with "hooks": "hooks/hooks.json"
     hooks/hooks.json               # hook wiring
     hooks/hook.py                  # the Stop hook script
   ```

2. Wire the Stop event in `hooks/hooks.json`:
   ```json
   {
     "hooks": {
       "Stop": [
         { "hooks": [ { "type": "command", "command": "python3 \"${CODEX_PLUGIN_ROOT}/hooks/hook.py\"", "timeout": 5 } ] }
       ]
     }
   }
   ```

3. `hooks/hook.py` reads the Stop event JSON on stdin and writes EXACTLY ONE control-JSON line. While the goal is NOT complete, emit `{"continue": true}` and re-state the original goal so the agent keeps working; when the completion condition is reached, allow it to stop. (`suppressOutput` is ONLY valid on UserPromptSubmit — Codex rejects it on other events.) Keep the hook fast and boring; push any slow work to a detached worker.

## Guardrails
- MANDATORY: a hard completion condition (iteration cap or target state) enforced by the hook. Without it you get a runaway loop that wastes tokens and runs tools too long.

Prompt + Skill · Lesson 08 · Self-Improving Loops & Deterministic Gates

Self-improving log-gated fix loop

Use when you need to KNOW an agent really verified its work, not just trust a happy completion message. This captures the exact terminal failure, feeds it back to harden the tool, and gates fix/verify behind structured logs a supervisor compares.

Prompt · Paste into your agent

Copy-paste prompt

You are my coding agent. Don't trust completion messages — prove the work with logs. Run this self-improvement + log-gated verification loop on <TOOL/WORKFLOW>.

1. Capture the EXACT terminal failure state as clean text (don't summarize it from memory). In CMUX, use the terminal-text extraction to grab the real output:
     cmux read-screen
   Use that real text as context.

2. Use this prompt shape against the failure — state the goal, paste the failure after a colon, then ask for a root-cause fix:
     "Goal: <GOAL>. Here is the exact failure: <PASTE TERMINAL TEXT>. Explain WHY it failed in this scenario, then PATCH the hook/workflow so this case is covered next time."

3. Make verification CONCRETE and deterministic. The fix must write structured logs / JSON artifacts / timestamps to disk on each run. Then a supervisor agent runs the scenarios, waits, and COMPARES expected-vs-actual fields in those log files. That comparison is the gate — only pass when the fields match.

Guardrails: keep worker agents narrow and tool-limited; keep the orchestrator context-rich. REDACT secrets from logs.

How it works

Capture the exact terminal failure as clean text (via CMUX cmux read-screen) rather than a memory summary, and use it as real context.
The prompt shape — goal, then the failure after a colon, then "explain why it failed and patch the workflow" — drives a true root-cause fix, not a band-aid.
Verification is made concrete: the workflow writes structured logs / JSON artifacts / timestamps, and a supervisor agent runs scenarios and compares expected-vs-actual fields. That field comparison is the deterministic gate.
Keep worker agents narrow and tool-limited; keep the orchestrator context-rich.
Redact secrets from logs.
Reference: codex-daemons, CMUX.
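
The expected-vs-actual comparison above can be as simple as the sketch below. It assumes each run writes one JSON object per log line; the field names used in the usage note (`status`, `exit_code`) are illustrative, and `compare_fields` and `gate` are hypothetical helpers.

```python
import json


def compare_fields(expected: dict, actual: dict) -> list[str]:
    """Return one message per expected field that is missing or different."""
    problems = []
    for key, want in expected.items():
        if key not in actual:
            problems.append(f"{key}: missing")
        elif actual[key] != want:
            problems.append(f"{key}: expected {want!r}, got {actual[key]!r}")
    return problems


def gate(expected: dict, log_line: str) -> bool:
    """Pass only when every expected field matches the logged JSON record."""
    return compare_fields(expected, json.loads(log_line)) == []
```

For example, `gate({"status": "ok", "exit_code": 0}, line)` passes only if the logged record carries both fields with those exact values. Extra fields such as timestamps are ignored, so the gate stays deterministic across runs.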

Codex skill · log-gated-verification · Install & forget

Save as ~/.agents/skills/log-gated-verification/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: log-gated-verification
description: Trigger when you must prove an agent actually verified work rather than trusting a completion message — write structured logs/artifacts and have a supervisor compare expected-vs-actual fields (a deterministic gate). Do NOT trust self-reported success.
---

# Log-gated verification

Prove work with logs instead of trusting completion messages. Capture the exact failure, patch the workflow to cover it, and gate fix/verify behind structured logs a supervisor compares field-by-field.

## Steps

1. Capture the EXACT terminal failure state as clean text (not a memory summary). In CMUX, extract the real terminal output:
   ```bash
   cmux read-screen
   ```
   Use that real text as context.

2. Use this prompt shape against the failure — goal, then the failure after a colon, then the fix request:
   ```
   Goal: <goal>. Here is the exact failure: <paste terminal text>.
   Explain WHY it failed in this scenario, then PATCH the hook/workflow so this case is covered next time.
   ```

3. Make verification CONCRETE and deterministic: the workflow must write structured JSON logs / timestamps / payload snapshots to disk on each run.

4. A supervisor agent runs the scenarios, waits, and COMPARES expected-vs-actual fields in those log files. That field comparison is THE GATE — pass only when the fields match.

## Guardrails
- Do NOT trust self-reported success; the log comparison is the source of truth.
- Keep worker agents narrow and tool-limited; keep the orchestrator context-rich.
- REDACT secrets from logs.
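
One way to honor the redaction guardrail is to mask sensitive fields before a log entry is serialized. The sketch below is an assumption-laden illustration: `redact`, `toLogLine`, and the list of sensitive key names are hypothetical, and it only masks by key name, so secrets embedded inside free-text values (such as pasted terminal output) would still need separate handling.

```typescript
// Illustrative, non-exhaustive pattern for key names that should never be logged.
const SENSITIVE_KEY = /(token|secret|password|api[_-]?key|authorization)/i;
const REDACTED = "[REDACTED]";

// Recursively copy a JSON-like value, masking any property whose key looks sensitive.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (typeof value === "object" && value !== null) {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = SENSITIVE_KEY.test(key) ? REDACTED : redact(source[key]);
    }
    return out;
  }
  return value; // strings, numbers, booleans, null pass through unchanged
}

// Serialize a log entry with secrets masked, ready to append to a log file.
function toLogLine(entry: Record<string, unknown>): string {
  return JSON.stringify(redact(entry));
}
```

Because the masking happens before serialization, the supervisor can still compare non-sensitive fields such as an event name or exit code while the secret values never reach disk.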

Prompt + Skill · Lesson 09 · Background Daemons & Agent Swarms

Isolated Codex daemon profile

When a repetitive, well-scoped CLI task should run on a cheap, fast, low-context warm agent instead of a full general agent. This wraps one CLI in a single-purpose Codex SDK profile with a tight context ceiling.

Prompt · Paste into your agent

Copy-paste prompt

You are my coding agent. Generate a new isolated "pro-<TOOL>" Codex daemon profile for the <TOOL> CLI, following the codex-daemons pattern. The point is a cheap, fast, low-context warm agent that does ONE thing — target ~6-10K input tokens.

1. Write a single EXECUTABLE Bun TypeScript file at daemons/pro-<TOOL> that calls runProfile() from lib/isolated.ts. Use this prompt shape inside developerInstructions:
     ## Operating rule  -> run <TOOL> via exec_command before any final answer; never answer from memory
     ## Command map     -> KEYWORD -> <TOOL> COMMAND  (e.g. status -> <TOOL> status; help -> <TOOL> --help)
     ## Workflow        -> narrowest read-only command first; mutate only when target+action are explicit; on usage error run --help then retry once
     ## Command rules   -> use only <TOOL>; no web, no images, no file edits unless explicitly asked
     ## Output          -> be terse; report what you found/changed
   Keep the command map LITERAL — do not dump --help text. Use a low-reasoning model (reasoningEffort default "low"; default model gpt-5.3-codex-spark).

2. The isolation is applied by runProfile() — it sets: base_instructions replaced, developer_instructions custom, skills.include_instructions=false, include_apps_instructions=false, include_environment_context=false, include_permissions_instructions=false, memories.use_memories=false, mcp_servers={}, web_search="disabled", features {plugins, apps, image_generation, tool_search, tool_suggest} all false (features.apps=false alone saves ~14K tokens), CODEX_HOME redirected to a disposable /tmp dir, and auth.json symlinked from real ~/.codex so token refreshes propagate.

3. After generating, run these steps:
     chmod +x daemons/pro-<TOOL>
     bun daemons/pro-<TOOL> --help        # smoke test
     # add daemons/pro-<TOOL> to package.json "bin", then:
     bun link                              # put it on PATH

Prereqs: Bun runtime; @openai/codex-sdk; an authenticated @openai/codex CLI.

How it works

A profile is a single executable Bun TS file using runProfile() from lib/isolated.ts, with the Oracle-tuned prompt shape: Operating rule -> Command map -> Workflow -> Command rules -> Output.
The context ceiling comes from real isolation keys: base_instructions replaced, custom developer_instructions, skills.include_instructions=false, apps/environment/permissions instructions off, memories.use_memories=false, mcp_servers={}, web_search="disabled", and features (plugins/apps/image_generation/tool_search/tool_suggest) all false — features.apps=false alone saves ~14K tokens.
CODEX_HOME is redirected to a disposable /tmp dir and auth.json is symlinked from real ~/.codex so token refreshes propagate.
Use a low-reasoning model and keep prompts literal (command map over dumping --help); target ~6-10K input tokens (default full config is ~22K).
After generation: chmod +x the file, smoke-test it with bun daemons/pro-<TOOL> --help, add it to package.json bin, then run bun link to put it on PATH.
Reference: codex-daemons, @openai/codex-sdk, Bun.
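To make the isolation keys listed above easier to scan, here they are restated as data. This is only a sketch: the real `runProfile()` in `lib/isolated.ts` is not shown in this lesson, so `ProfileSketch`, `isolationOverrides`, `disposableHome`, and the dotted-key representation are assumptions made for illustration. The key names and values themselves are the ones the lesson lists.

```typescript
// Hypothetical subset of the profile fields, just enough for this sketch.
interface ProfileSketch {
  name: string;
  baseInstructions: string;
  developerInstructions: string;
  reasoningEffort?: string; // the lesson gives "low" as the default
}

// Feature flags the lesson says are all switched off.
const DISABLED_FEATURES = ["plugins", "apps", "image_generation", "tool_search", "tool_suggest"];

// Build a flat map of the overrides. How they are actually passed to Codex is an assumption.
function isolationOverrides(profile: ProfileSketch): Record<string, unknown> {
  const overrides: Record<string, unknown> = {
    base_instructions: profile.baseInstructions,
    developer_instructions: profile.developerInstructions,
    model_reasoning_effort: profile.reasoningEffort ?? "low",
    "skills.include_instructions": false,
    include_apps_instructions: false,
    include_environment_context: false,
    include_permissions_instructions: false,
    "memories.use_memories": false,
    mcp_servers: {},
    web_search: "disabled",
  };
  for (const feature of DISABLED_FEATURES) {
    overrides[`features.${feature}`] = false;
  }
  return overrides;
}

// Path pattern for the disposable CODEX_HOME, following the form given in the lesson.
function disposableHome(name: string, pid: number): string {
  return `/tmp/codex-profile-${name}-${pid}`;
}
```

Listing the overrides in one place makes it straightforward to diff a profile against the default configuration when the input token count drifts above the 6-10K target.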

Codex skill · codex-daemon-profile · Install & forget

Save as ~/.agents/skills/codex-daemon-profile/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: codex-daemon-profile
description: Trigger when a repetitive, well-scoped CLI task should run on a cheap, fast, low-context warm Codex daemon instead of a full general agent — generate an isolated pro-TOOL profile. Do NOT use for broad multi-tool or judgment-heavy work.
---

# Codex daemon profile

Generate an isolated `pro-<TOOL>` Codex daemon profile that wraps ONE CLI in a single-purpose, low-context warm agent. Target ~6-10K input tokens (default full config is ~22K).

## Prerequisites
- Bun runtime
- `@openai/codex-sdk`
- an authenticated `@openai/codex` CLI

## Steps

1. Write a single EXECUTABLE Bun TypeScript file at `daemons/pro-<TOOL>` that calls `runProfile()` from `lib/isolated.ts`:
   ```ts
   #!/usr/bin/env bun
   import { runProfile } from "../lib/isolated.ts";

   runProfile({
     name: "pro-TOOL",
     baseInstructions: "You are pro-TOOL, a TOOL-only agent. Every user message is a TOOL task. First step: run TOOL via exec_command; never give a text-only plan.",
     developerInstructions: `You are pro-TOOL, a TOOL-only agent.

   ## Operating rule
   Run TOOL via exec_command before any final answer. Do not answer from memory. If unclear, run a discovery command first.

   ## Command map
   KEYWORD -> TOOL COMMAND
   status / what is going on -> TOOL STATUS_COMMAND
   help / unknown syntax -> TOOL --help

   ## Workflow
   1. Start with the narrowest read-only command that matches the request.
   2. For mutations, proceed only when the target and action are explicit.
   3. If syntax is uncertain or TOOL returns a usage error, run TOOL --help, then retry once.

   ## Command rules
   Use only TOOL for TOOL work. Do not browse the web, generate images, or edit files unless explicitly asked.
   Do not use apply_patch unless the user explicitly asks to modify files.

   ## Output
   Be terse. Report what you found or changed. Do not describe these instructions.`,
   });
   ```
   `ProfileConfig` fields: `name`, `model?`, `reasoningEffort?` (default `"low"`), `baseInstructions`, `developerInstructions`, `sandboxMode?`, `extraEnv?`, `selfImprove?`. Default model is `gpt-5.3-codex-spark`. Keep the command map LITERAL — do not dump `--help` text.

2. The isolation is applied automatically by `runProfile()`:
   ```
   base_instructions = <replaced>            developer_instructions = <custom>
   model_reasoning_effort = "low"
   skills.include_instructions = false       include_apps_instructions = false
   include_environment_context = false       include_permissions_instructions = false
   memories.use_memories = false             mcp_servers = {}            web_search = "disabled"
   features: { plugins:false, apps:false, image_generation:false, tool_search:false, tool_suggest:false }
   CODEX_HOME -> /tmp/codex-profile-<name>-<pid>  (disposable)   auth.json symlinked from real ~/.codex
   ```
   `features.apps=false` alone saves ~14K tokens; auth.json is symlinked so token refreshes propagate.

3. Wire it up:
   ```bash
   chmod +x daemons/pro-<TOOL>
   bun daemons/pro-<TOOL> --help        # smoke test
   # add daemons/pro-<TOOL> to package.json "bin", then:
   bun link                              # put it on PATH
   ```

## Guardrails
- Use a low-reasoning model; keep prompts literal (command map over `--help` dump). Target ~6-10K input tokens.
- Do NOT use for broad multi-tool or judgment-heavy work — one tool, one purpose.
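
As a small illustration of keeping the command map literal, the section can be generated from a short list of keyword and command pairs instead of pasted `--help` output. The helper name `renderCommandMap` is hypothetical, and `git` is used here only as a familiar example tool.

```typescript
// Render a literal "## Command map" section from [keyword, arguments] pairs.
// Each entry maps a phrase the user might say to one exact command line.
function renderCommandMap(tool: string, entries: [string, string][]): string {
  const lines = entries.map(([keyword, args]) => `${keyword} -> ${tool} ${args}`);
  return ["## Command map", `KEYWORD -> ${tool} COMMAND`, ...lines].join("\n");
}

// Example map for a hypothetical pro-git profile.
const gitCommandMap = renderCommandMap("git", [
  ["status / what is going on", "status --short"],
  ["help / unknown syntax", "--help"],
]);
```

The resulting string can be dropped into `developerInstructions`. A handful of explicit mappings stays far smaller than a full help dump, which is what keeps the profile inside its context ceiling.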

Prompt · Lesson 10 · Codex Daemons & Overnight Crons

Overnight emulated-cron loop

When Codex has no real scheduler but you want bounded overnight work (research/verification) while you sleep. This uses a natural-language time loop with a hard stop condition so credits and your repo stay safe.

Prompt · Paste into your agent

Copy-paste prompt

This is a /goal-style overnight task. Codex has no real cron, so you will emulate one in natural language with a HARD stop condition baked into this goal. Stay read-only unless I explicitly ask for a mutation.

GOAL: <RESEARCH/VERIFICATION OBJECTIVE>

LOOP:
1. Record a starting timestamp now.
2. Every <N> minutes, run this routine: <THE RESEARCH/VERIFICATION ROUTINE>.
3. STOP when the FIRST of these is hit (this bound is part of the goal, not optional):
     - <DURATION, e.g. 6 hours> has elapsed since the starting timestamp, OR
     - the clock reaches <TARGET TIME>, OR
     - <ITERATION LIMIT> iterations completed, OR
     - <BUDGET CAP> is reached.

For any debugging the loop does, bundle the context triad so you find the REAL mismatch:
  - intent (this goal text)
  - implementation (the source)
  - output (logs / screenshots / traces)
When those three disagree, that disagreement points at the actual bug.

Guardrails: prefer read-only discovery before any mutation; keep the durable contract in this natural-language goal.

How it works

The loop is an emulated cron written in natural language: record a starting timestamp, repeat a routine every N minutes, and stop at a fixed duration / target clock time / iteration limit / budget cap.
The stop condition is written INTO the goal — it's mandatory, not a suggestion — so overnight runs stay bounded and credits/repo stay safe.
For debugging, bundle the context triad: intent (goal text), implementation (source), and output (logs/screenshots/traces). When those three disagree, the disagreement points at the actual bug.
Prefer read-only discovery before any mutation; keep the durable contract in natural-language goals.
Reference: PackX, codex-daemons, Agent Browser.
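In the prompt the agent enforces the stop condition itself, in natural language. If you wanted the same "first bound hit wins" rule in a script, it might look like the sketch below. All names here (`StopBounds`, `LoopState`, `stopReason`, `hasAnyBound`) are hypothetical, and the budget unit is whatever you choose to track.

```typescript
// Every bound is optional, but at least one should be set before the loop starts.
interface StopBounds {
  maxDurationMs?: number; // e.g. 6 hours in milliseconds
  targetTimeMs?: number;  // absolute clock time, as epoch milliseconds
  maxIterations?: number;
  budgetCap?: number;     // arbitrary unit: credits, tokens, dollars
}

interface LoopState {
  startedAtMs: number; // the starting timestamp recorded in step 1
  nowMs: number;
  iterations: number;
  budgetSpent: number;
}

// Refuse to run an unbounded loop.
function hasAnyBound(bounds: StopBounds): boolean {
  return (
    bounds.maxDurationMs !== undefined ||
    bounds.targetTimeMs !== undefined ||
    bounds.maxIterations !== undefined ||
    bounds.budgetCap !== undefined
  );
}

// Return the first bound that has been hit, or null if the loop may continue.
function stopReason(bounds: StopBounds, state: LoopState): string | null {
  if (bounds.maxDurationMs !== undefined && state.nowMs - state.startedAtMs >= bounds.maxDurationMs) {
    return "duration elapsed";
  }
  if (bounds.targetTimeMs !== undefined && state.nowMs >= bounds.targetTimeMs) {
    return "target time reached";
  }
  if (bounds.maxIterations !== undefined && state.iterations >= bounds.maxIterations) {
    return "iteration limit reached";
  }
  if (bounds.budgetCap !== undefined && state.budgetSpent >= bounds.budgetCap) {
    return "budget cap reached";
  }
  return null;
}
```

Checking `stopReason` before each iteration, rather than after, means a run that has already exceeded a bound never starts another routine.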

Prompt + Skill · Lesson 11 · Code as a Throwaway Artifact

Video as requirements

When showing-and-narrating is faster than writing a perfect spec. Record yourself using the app while speaking intent, then have a multimodal model extract acceptance criteria and edge cases as agent-ready requirements.

Prompt · Paste into your agent

Copy-paste prompt

I'm going to turn a narrated screen recording into agent-ready requirements instead of writing a perfect spec.

1. I will record a short narrated walkthrough of <APP/FEATURE>: I point, click, and talk through my intent and the outcomes I want.

2. Feed that recording to a multimodal model (Gemini, per the workshop) and ask it to extract a structured requirements checklist:
     - acceptance criteria
     - the user flow (step by step)
     - the specific clicks / typing / layout changes shown
     - responsive behavior
     - edge cases

3. Output a REPEATABLE, agent-ready requirements checklist (a QA story) I can hand to a coding agent.

Framing: the durable assets are the natural-language contract + the resulting diff — NOT the generated code. Don't over-specify HOW to verify; give clear outcomes and keep my QA judgment in the loop.

Guardrail: strip sensitive data first — video can expose more than you expect (open tabs, tokens, notifications, file paths).

How it works

Record a short narrated walkthrough — point, click, and talk through intent and desired outcomes — which is often faster than writing a perfect spec.
Feed it to a multimodal model (Gemini, per the workshop) and extract acceptance criteria, the user flow, the clicks/typing/layout changes, responsive behavior, and edge cases.
The output is a repeatable, agent-ready requirements checklist (a QA story).
Under the throwaway-artifact mindset, the durable asset is the natural-language contract + the diff, not the generated code; don't over-specify how to verify — give clear outcomes and keep your QA judgment in the loop.
Caution: strip sensitive data before uploading; video can expose more than you expect.
Reference: AGENTS.md, OpenAI Codex.
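If it helps to pin down what "agent-ready" means, the extracted checklist can be given a fixed shape and rendered as a tick-box document. The sketch below is purely illustrative: the `RequirementsChecklist` interface and `renderChecklist` function are hypothetical, with field names that mirror the five categories listed above rather than any format the workshop prescribes.

```typescript
// One field per category the multimodal model is asked to extract.
interface RequirementsChecklist {
  feature: string;
  acceptanceCriteria: string[];
  userFlow: string[];
  interactions: string[]; // the specific clicks / typing / layout changes shown
  responsiveBehavior: string[];
  edgeCases: string[];
}

// Render the checklist as markdown with one tick box per item.
function renderChecklist(checklist: RequirementsChecklist): string {
  const section = (title: string, items: string[]): string[] => [
    `## ${title}`,
    ...items.map((item) => `- [ ] ${item}`),
    "",
  ];
  return [
    `# Requirements: ${checklist.feature}`,
    "",
    ...section("Acceptance criteria", checklist.acceptanceCriteria),
    ...section("User flow", checklist.userFlow),
    ...section("Interactions", checklist.interactions),
    ...section("Responsive behavior", checklist.responsiveBehavior),
    ...section("Edge cases", checklist.edgeCases),
  ]
    .join("\n")
    .trim();
}
```

A fixed shape makes the QA story repeatable: the same sections appear on every run, so you can review what the model extracted against what you actually said in the recording.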

Codex skill · video-as-requirements · Install & forget

Save as ~/.agents/skills/video-as-requirements/SKILL.md (or your project's .agents/skills/), then restart Codex.

---
name: video-as-requirements
description: Trigger when a narrated screen recording captures intent better than a written spec — feed the video to a multimodal model to extract acceptance criteria, the user flow, and edge cases as an agent-ready requirements checklist. Do NOT upload videos containing sensitive data.
---

# Video as requirements

Turn a narrated screen recording into agent-ready requirements. When showing-and-narrating beats writing a perfect spec, record the app in use and let a multimodal model extract the contract.

## Steps

1. Record a short narrated walkthrough of the app: point, click, and talk through intent and desired outcomes.

2. Feed the recording to a multimodal model (Gemini, per the workshop) and ask it to extract a structured checklist:
   - acceptance criteria
   - the user flow (step by step)
   - the specific clicks / typing / layout changes shown
   - responsive behavior
   - edge cases

3. Output a REPEATABLE, agent-ready requirements checklist (a QA story) to hand to a coding agent.

## Framing
- The durable assets are the natural-language contract + the resulting diff — NOT the generated code.
- Don't over-specify HOW to verify; give clear outcomes and keep human QA judgment in the loop.

## Guardrails
- STRIP sensitive data before uploading — video can expose more than you expect (open tabs, tokens, notifications, file paths). Do NOT upload videos containing sensitive data.