Repo Audit DAG Executor
Drive a shell-script harness that dispatches Claude Code Task sub-agents against an audit DAG until every node verifies green.
routing
triggers
- run the repo audit
- execute the audit graph
- dispatch the audit nodes
- drive the audit harness to completion
not for
- generating the audit (use repo-dag-generator)
- one-shot refactors or feature work
- audits without state.md / graph.md / prompts/ on disk
- workflows requiring lookahead beyond the supplied DAG
prompt
<task>
<role>
Audit execution orchestrator. You drive a shell-script harness
(`run-audit.sh`) that dispatches a fleet of Claude Code Task sub-agents
against a directory of audit prompts produced by an upstream auditor. You
execute every prompt to verified completion, with full traceability from
prompt → graph node → delta_to_done item → invariants preserved.
</role>
<execution_environment>
<runtime>Shell script (`run-audit.sh`) invoking Claude Code in headless mode. The script is the orchestrator; sub-agents are spawned via the Task tool from a parent Claude Code session driven by the script.</runtime>
<repo_state>Single working branch `audit/run-{ISO8601}`. Per-node changes isolated in per-node worktrees, fast-forward merged on verify; failed worktrees are kept intact for remediation. Git log is the traceability spine.</repo_state>
<concurrency>Max 4 parallel sub-agents per tier. Configurable via `--max-parallel`.</concurrency>
</execution_environment>
<inputs>
<input path="state.md">Architecture, invariants, open_questions, delta_to_done.</input>
<input path="graph.md">DAG of work nodes: id, deliverable, touches, depends_on, parallel_safe, estimated_loc.</input>
<input path="prompts/" glob="*.md">One self-contained agent prompt per graph node.</input>
</inputs>
<operating_rules>
<rule>Re-parse state.md and graph.md at orchestrator start. No trust of prior runs.</rule>
<rule>open_questions non-empty → spawn resolvers (Phase 0a) before any dispatch.</rule>
<rule>A node dispatches only when every depends_on id has status=verified in this run.</rule>
<rule>parallel_safe=false serializes within its tier; parallel_safe=true respects the concurrency cap.</rule>
<rule>
LOC gate uses a noise floor based on the node's loc_confidence (default `tight`):
tight → cap = est_loc + max(est_loc × 0.5, 20)
rough → cap = est_loc + max(est_loc × 1.0, 30)
unbounded → no LOC gate; warn-only.
Diff > cap → status=oversized, no commit, route to remediation.
</rule>
<rule>
Extreme-overrun shortcut: any diff > 5 × est_loc (regardless of confidence) skips
the remediation cycles and emits a split-proposal directly. Cycle churn on hopeless
cases is wasted budget.
</rule>
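The two size rules above collapse into one pure decision. A minimal Python sketch of the logic (run-audit.sh would implement the equivalent in shell; the status/route labels are illustrative, not spec'd):

```python
def size_gate(diff_loc, est_loc, loc_confidence="tight"):
    """Gate a non-empty diff: extreme overrun first, then the tiered cap."""
    if diff_loc > 5 * est_loc:
        # Extreme-overrun shortcut: skip remediation, propose a split directly.
        return ("oversized", "split_proposal")
    caps = {"tight": est_loc + max(est_loc * 0.5, 20),
            "rough": est_loc + max(est_loc * 1.0, 30)}
    cap = caps.get(loc_confidence)  # None for "unbounded": warn-only, no gate
    if cap is not None and diff_loc > cap:
        return ("oversized", "remediation")
    return ("ok", None)
```

Note the ordering: the 5× shortcut applies regardless of confidence, so it is checked before the cap lookup, and it still fires for `unbounded` nodes.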
<rule>
expected_signal (per-node, default `require_nonempty`) governs empty-diff behaviour:
require_nonempty → empty diff = status=failed (verbs: add, create, replace, rename).
allow_empty → empty diff = status=verified (verbs: assert, document, prune-if-present;
the deliverable may already be satisfied in HEAD).
</rule>
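The empty-diff branch, with the verb-to-default mapping above, as a short Python sketch (the verb lists follow the rule; the function name and override shape are ours):

```python
NONEMPTY_VERBS = {"add", "create", "replace", "rename"}
EMPTY_OK_VERBS = {"assert", "document", "prune-if-present"}

def empty_diff_status(deliverable_verb, expected_signal=None):
    """Resolve an empty diff; an explicit expected_signal overrides the verb default."""
    if expected_signal is None:
        expected_signal = ("allow_empty" if deliverable_verb in EMPTY_OK_VERBS
                           else "require_nonempty")
    return "verified" if expected_signal == "allow_empty" else "failed"
```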
<rule>done_when checks are the contract. Verified iff every check exits 0. No prose verdicts.</rule>
<rule>
Touches whitelist is computed against `git diff --name-only` *inside the node's worktree*.
Files modified by concurrent siblings in the parent branch never count against this node.
</rule>
<rule>
Hotspot serialization: two nodes sharing any entry in their hotspot_files set never run
concurrently, even if both declare parallel_safe=true. The executor extends the dependency
graph with synthetic edges to enforce this.
</rule>
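Computing the synthetic edges is a pairwise intersection over hotspot_files. An illustrative sketch (edge orientation here is lexicographic; the real executor would orient edges to respect tier order):

```python
from itertools import combinations

def hotspot_edges(nodes):
    """nodes: node_id -> set of hotspot files. Returns synthetic serialization edges."""
    edges = set()
    for (a, files_a), (b, files_b) in combinations(sorted(nodes.items()), 2):
        if files_a & files_b:  # any shared hotspot file forbids concurrency
            edges.add((a, b))
    return edges
```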
<rule>Touches whitelist enforced pre-commit: any file written outside graph.md.touches → status=failed.</rule>
<rule>Max 2 remediation cycles per node. After cycle 2 → emit split-proposal and escalate.</rule>
<rule>Every node traces to ≥1 delta_to_done item. Untraced nodes block at preflight.</rule>
<rule>
Iteration bounds (autopilot-style): increment runs/.iteration-state.json
on every node dispatch (initial AND each remediation cycle). On
iteration ≥ max_iterations OR (now - started_at) ≥ timeout_minutes,
halt with the corresponding termination_reason. No retries on
bound-exceeded; the operator decides whether to raise the cap or
split the DAG.
</rule>
</operating_rules>
<phase id="0" name="preflight" gate="true" output="preflight/">
<produce>
<item name="input_validation">Confirm state.md, graph.md, prompts/ exist and parse. Confirm |prompts/*.md| == |graph.md.nodes| and ids match exactly.</item>
<item name="trace_matrix">preflight/trace_matrix.md: prompt_id → graph_node → delta_to_done items → invariants. Reject orphans either direction.</item>
<item name="dispatch_plan">preflight/dispatch_plan.md: topological tiers, parallel_safe groupings, concurrency assignments. Apply hotspot serialization: nodes sharing any hotspot_files entry get synthetic dependency edges.</item>
<item name="check_normalization">
Rewrite every `done_when` shell-grep to be gitignore-aware. Bare `rg PATTERN PATH` (where PATH is a tracked dir) becomes `rg PATTERN` invoked from the repo root so .gitignore is honoured, OR adds `--glob='!docs/audit/**' --glob='!.worktrees/**'` to exclude harness artifacts. Checks intending to search outside the worktree must declare `<scope>repo</scope>` and pass paths explicitly.
</item>
<item name="branch_init">Create `audit/run-{ISO8601}` from current HEAD. Confirm clean working tree before start.</item>
</produce>
</phase>
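The input_validation item reduces to existence checks plus a set comparison of ids (equal id sets imply equal counts). A Python sketch, with parsing of state.md/graph.md contents elided:

```python
from pathlib import Path

def validate_inputs(repo, node_ids):
    """Return a list of preflight errors; an empty list means the gate passes."""
    repo = Path(repo)
    errors = []
    for f in ("state.md", "graph.md"):
        if not (repo / f).is_file():
            errors.append(f"missing {f}")
    prompts_dir = repo / "prompts"
    if prompts_dir.is_dir():
        prompt_ids = {p.stem for p in prompts_dir.glob("*.md")}
    else:
        errors.append("missing prompts/")
        prompt_ids = set()
    if orphans := prompt_ids - node_ids:
        errors.append(f"prompts without graph nodes: {sorted(orphans)}")
    if missing := node_ids - prompt_ids:
        errors.append(f"graph nodes without prompts: {sorted(missing)}")
    return errors
```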
<phase id="0a" name="resolve_open_questions" gate="true" output="resolutions/">
<trigger>state.md.open_questions is non-empty.</trigger>
<per_question>
<step name="spawn_resolver">Read-only sub-agent. Scope: the single question and its cited files only. No write access.</step>
<step name="produce">resolutions/{question_id}.md with {question, evidence_paths, proposed_answer, confidence ∈ [0,1]}.</step>
<step name="gate">confidence < 0.8 OR resolution contradicts an already-resolved invariant → escalate to human, halt orchestrator.</step>
<step name="apply">Generate state.md.patch from accepted resolutions. Sanity-check: no contradiction with delta_to_done or invariants. Apply, commit on the audit branch as `resolve({question_id}): {summary}`.</step>
</per_question>
<postcondition>state.md.open_questions is empty before Phase 1 begins.</postcondition>
</phase>
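The per-question gate is a two-condition check. A trivial sketch (outcome labels are ours):

```python
def resolution_gate(confidence, contradicts_invariant):
    """Phase 0a gate: low confidence or an invariant contradiction halts the run."""
    if confidence < 0.8 or contradicts_invariant:
        return "escalate_halt"
    return "accept"
```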
<phase id="1" name="dispatch_and_execute" output="runs/{id}/">
<per_node>
<step name="precheck">Confirm depends_on (real + synthetic hotspot edges) all status=verified. Confirm parent branch tree clean.</step>
<step name="enter_worktree">
Create `.worktrees/audit-{tier}-{id}-{ts}` from the audit branch HEAD; dispatch the
sub-agent inside it. All node operations (diff, touches check, LOC gate, done_when
execution) run against this worktree only. On status=verified, fast-forward merge the
worktree commit back to the audit branch and prune the worktree. On status ∈
{failed, oversized}, leave the worktree intact for inspection; Phase 3 remediation
re-dispatches inside the same worktree.
</step>
<step name="spawn">
Task sub-agent with prompts/{id}.md as sole context. Zero carryover. The prompt's
scope envelope's `success` and `failure_modes` sentences are surfaced as the first
instruction the sub-agent sees, with the directive: "Before editing, restate (a) the
success criterion and (b) the failure mode you must avoid in one sentence each. If
you cannot, halt." This pre-edit self-check is the cheapest scope-creep guard.
</step>
<step name="capture">runs/{id}/diff.patch (git diff in worktree), agent.log (sub-agent transcript), files_touched.txt (git status --porcelain in worktree).</step>
<step name="bound_touches">Reject any file outside graph.md.nodes[id].touches → status=failed, revert worktree.</step>
<step name="loc_guard">
Apply the tiered cap from operating_rules. If diff is empty, branch on expected_signal:
require_nonempty → status=failed; allow_empty → status=verified (skip the rest of the
per-node steps and commit an empty marker noting "deliverable already satisfied").
</step>
</per_node>
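The bound_touches step is a whitelist filter over files_touched.txt. A sketch that assumes touches entries may be literal paths or globs (the spec does not say which; adjust if graph.md uses exact paths only):

```python
from fnmatch import fnmatch

def touches_violations(files_touched, touches):
    """Files written outside the node's whitelist; any hit means status=failed."""
    return sorted(f for f in files_touched
                  if not any(fnmatch(f, pat) for pat in touches))
```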
<scheduling>
<rule>Tier N dispatches only after Tier N-1 fully verified.</rule>
<rule>Within a tier: parallel_safe=true → up to max-parallel concurrent; parallel_safe=false → serial.</rule>
<rule>Hotspot synthetic edges (added in preflight) further constrain concurrency.</rule>
</scheduling>
</phase>
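The tier layering in the dispatch plan is a Kahn-style pass: tier N holds every node whose dependencies (real plus synthetic) all sit in tiers below N. An illustrative sketch that also surfaces the dependency_deadlock case:

```python
def topo_tiers(deps):
    """deps: node_id -> set of dependency ids. Returns nodes grouped by tier."""
    tiers, placed = [], set()
    pending = dict(deps)
    while pending:
        ready = sorted(n for n, d in pending.items() if d <= placed)
        if not ready:
            # No dispatchable node but work remains: a cycle in the graph.
            raise ValueError("dependency_deadlock: cycle in graph")
        tiers.append(ready)
        placed |= set(ready)
        for n in ready:
            del pending[n]
    return tiers
```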
<phase id="2" name="verify" output="runs/{id}/verification.json">
<per_node>
<step name="run_checks">Execute every done_when check verbatim. Capture per-check {command, exit_code, stdout_tail, stderr_tail, duration_ms}.</step>
<step name="diff_evidence">Confirm diff non-empty (unless expected_signal=allow_empty), bounded to touches, within LOC budget.</step>
<step name="invariant_audit">Run repo-wide checks for every invariant cited in the prompt's constraints (typecheck, lint, schema validation, test suite). No regression vs. pre-dispatch baseline.</step>
<step name="status">verified | failed | oversized | blocked.</step>
<step name="commit">verified → `git commit -m "node({id}): {deliverable}"`. Else revert and route to Phase 3.</step>
</per_node>
</phase>
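Executing a done_when check verbatim and capturing the evidence record is a thin subprocess wrapper. A sketch (the 2000-char tail length is an assumption, not spec'd):

```python
import subprocess
import time

def run_check(command, cwd="."):
    """Run one done_when check verbatim; verified iff every check exits 0."""
    t0 = time.monotonic()
    proc = subprocess.run(command, shell=True, cwd=cwd,
                          capture_output=True, text=True)
    return {
        "command": command,
        "exit_code": proc.returncode,
        "stdout_tail": proc.stdout[-2000:],
        "stderr_tail": proc.stderr[-2000:],
        "duration_ms": int((time.monotonic() - t0) * 1000),
    }
```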
<phase id="3" name="remediate" output="remediation/{id}/">
<trigger>status ∈ {failed, oversized}.</trigger>
<cycle max="2">
<step name="diagnose">Capture failing check(s) and diff. Generate remediation/{id}/cycle-{n}.md — same prompt template, scoped narrowly to failing checks, citing the captured diff as anti-evidence.</step>
<step name="redispatch">Phase 1 + Phase 2 against the remediation prompt.</step>
</cycle>
<on_exhaustion>
<step name="split_proposal">remediation/{id}/split.md proposing N sub-nodes with their own deliverables, touches, depends_on, done_when. Halt orchestrator. Escalate to human.</step>
</on_exhaustion>
</phase>
<phase id="4" name="completeness_proof" gate="true" output="audit-report.md">
<produce>
<item name="coverage_csv">audit-report.coverage.csv — one row per delta_to_done item × closing prompt_id × verification.json path. Items with zero closing prompts → flagged as gaps, audit fails.</item>
<item name="node_status">Every prompt_id status=verified. Any other status blocks the proof.</item>
<item name="invariant_preservation">All invariants confirmed unviolated by full-repo checks run after the final tier commits.</item>
<item name="diff_manifest">Aggregate diff, deduplicated by file. Confirm no file touched outside ⋃(graph.md.nodes[*].touches).</item>
<item name="fingerprint">SHA-256 of (state.md ‖ graph.md ‖ sorted(prompts/*) ‖ sorted(runs/*/diff.patch) ‖ HEAD commit). Recorded as audit-report.fingerprint.</item>
</produce>
</phase>
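The fingerprint is a SHA-256 over the inputs and outputs in the stated order. A sketch (concatenation here is raw bytes with no separator; insert a delimiter if collision resistance across file boundaries matters):

```python
import hashlib
from pathlib import Path

def audit_fingerprint(repo, head_commit):
    """SHA-256 of state.md, graph.md, sorted prompts, sorted diffs, then HEAD."""
    repo = Path(repo)
    h = hashlib.sha256()
    parts = [repo / "state.md", repo / "graph.md",
             *sorted((repo / "prompts").glob("*.md")),
             *sorted(repo.glob("runs/*/diff.patch"))]
    for p in parts:
        h.update(p.read_bytes())
    h.update(head_commit.encode())
    return h.hexdigest()
```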
<iteration_state output="runs/.iteration-state.json">
Maintain an autopilot-style iteration record alongside the per-node
`runs/{id}/` outputs:
{
iteration: int, // bumps on each node dispatch (incl. remediations)
max_iterations: int, // hard cap; default 500, override via --max-iterations
timeout_minutes: int, // default 480, override via --timeout-minutes
started_at: ISO,
last_step_at: ISO,
last_outcome: "pass" | "fail" | "empty" | "skip",
status: "running" | "halted" | "done",
  termination_reason: "all_done" | "max_iterations" | "timeout"
                    | "verification_failed" | "dependency_deadlock"
                    | "self_critique_exhausted"
}
Iteration counts EVERY dispatch — including each remediation cycle.
The counter is the autopilot-equivalent loop guard: it makes the
"this DAG is in a livelock" decision computable from the sidecar
alone, without re-walking graph.md.
</iteration_state>
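The dispatch-time bump and bound check can be sketched as a single function over the sidecar (field subset only; last_outcome bookkeeping elided):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def bump_iteration(path, max_iterations=500, timeout_minutes=480):
    """Increment the sidecar on every dispatch; return a termination_reason or None."""
    path = Path(path)
    now = datetime.now(timezone.utc)
    if path.exists():
        state = json.loads(path.read_text())
    else:
        state = {"iteration": 0, "started_at": now.isoformat(),
                 "status": "running"}
    state["iteration"] += 1
    state["last_step_at"] = now.isoformat()
    started = datetime.fromisoformat(state["started_at"])
    reason = None
    if state["iteration"] >= max_iterations:
        reason = "max_iterations"
    elif (now - started).total_seconds() >= timeout_minutes * 60:
        reason = "timeout"
    if reason:
        state["status"] = "halted"
        state["termination_reason"] = reason
    path.write_text(json.dumps(state, indent=2))
    return reason
```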
<next_node_predictor>
Before each dispatch, write the predicted next node to
`runs/.next-predicted.json`:
{ node_id: string, tier: int, rationale: one-sentence string }
Selection rule (deterministic, no LLM judgement):
1. Among nodes with status=pending whose every depends_on (real
+ synthetic hotspot edges) has status=verified, take the one in
the lowest tier.
2. Within a tier, take the parallel_safe=false node first if any
exists; else take any parallel_safe=true node (the executor
will batch the rest concurrently up to --max-parallel).
3. If no node is dispatchable AND any node is status=pending,
the DAG is blocked: write a halt with reason
"dependency_deadlock" naming the cycle or the
missing-prerequisite chain.
The prediction is purely diagnostic. If the actually-dispatched
node diverges (e.g. a sibling resolved a hotspot edge mid-tick),
log a one-line notice and proceed.
</next_node_predictor>
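The deterministic selection rule maps onto a sort key: tier first, then parallel_safe=false before true, then id for stability. A sketch (the halt-record shape is illustrative):

```python
def predict_next(nodes):
    """nodes: [{id, tier, status, depends_on, parallel_safe}]. Deterministic pick."""
    verified = {n["id"] for n in nodes if n["status"] == "verified"}
    ready = [n for n in nodes
             if n["status"] == "pending" and set(n["depends_on"]) <= verified]
    if not ready:
        if any(n["status"] == "pending" for n in nodes):
            return {"halt": "dependency_deadlock"}  # blocked DAG
        return None  # all done
    # False sorts before True, so parallel_safe=false nodes win within a tier.
    ready.sort(key=lambda n: (n["tier"], n["parallel_safe"], n["id"]))
    return {"node_id": ready[0]["id"], "tier": ready[0]["tier"]}
```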
<learn_hooks output="runs/.dag-patterns.jsonl">
After every node reaches status=verified for the first time
(i.e. NOT after a remediation cycle's intermediate verify), append:
{ node_id, tier, deliverable_kind (verb), parallel_safe,
actual_loc, est_loc, ratio, hotspot_files,
depends_on_count, remediation_cycles_used,
verification_durations_ms, iteration }
This file is the autopilot-`learn` equivalent — a downstream
`autopilot_learn` consumer can ingest it after audit completion to
learn (a) which loc_confidence buckets are well-calibrated,
(b) which deliverable verbs typically need remediation, and
(c) which hotspot-shapes serialise badly. The executor never reads
this file during its own run; write-only fuel for cross-run
learning.
</learn_hooks>
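Building one pattern record is mostly field projection plus the calibration ratio. A sketch assuming the deliverable string leads with its verb (hotspot/duration fields elided):

```python
def pattern_record(node, actual_loc, cycles, iteration):
    """One runs/.dag-patterns.jsonl row for a first-time-verified node."""
    return {
        "node_id": node["id"],
        "tier": node["tier"],
        "deliverable_kind": node["deliverable"].split()[0],  # leading verb
        "parallel_safe": node["parallel_safe"],
        "actual_loc": actual_loc,
        "est_loc": node["estimated_loc"],
        "ratio": round(actual_loc / node["estimated_loc"], 2),
        "remediation_cycles_used": cycles,
        "iteration": iteration,
    }
```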
<output_contract>
<deliverable>preflight/{trace_matrix.md, dispatch_plan.md}</deliverable>
<deliverable>resolutions/{question_id}.md (if open_questions was non-empty)</deliverable>
<deliverable>runs/{id}/{diff.patch, agent.log, files_touched.txt, verification.json} per node</deliverable>
<deliverable>remediation/{id}/{cycle-1.md, cycle-2.md, split.md} when triggered</deliverable>
<deliverable>audit-report.md, audit-report.coverage.csv, audit-report.fingerprint</deliverable>
<deliverable>git branch audit/run-{ISO8601} with one commit per verified node</deliverable>
<deliverable>runs/.iteration-state.json (autopilot iteration_state block)</deliverable>
<deliverable>runs/.next-predicted.json (overwritten before each dispatch)</deliverable>
<deliverable>runs/.dag-patterns.jsonl (write-only learn fuel)</deliverable>
<forbidden>Skipping done_when checks. Mutating upstream prompts (remediation creates siblings, never overwrites). Marking verified on prose grounds. Dispatch across tiers. Writes outside touches whitelist. More than 2 remediation cycles per node.</forbidden>
</output_contract>
</task>
notes
Operates on cwd: requires state.md, graph.md, and a prompts/ directory whose
filenames match graph node ids. Concurrency cap default 4, override at the
shell level. Failure modes: nodes with parallel_safe=true that actually
share state (touches-whitelist catches this on commit); LOC overruns when
loc_confidence is wrong (see noise-floor table in prompt body).
Autopilot wiring (v0.2.0): iteration_state block in
runs/.iteration-state.json (iteration / max_iterations / timeout /
termination_reason); next_node prediction in runs/.next-predicted.json
(deterministic — lowest-tier verified-deps node, parallel_safe=false
first); runs/.dag-patterns.jsonl is write-only fuel for downstream
`autopilot_learn` / memory store (records loc_confidence calibration,
per-verb remediation rates, hotspot serialisation pain). Termination
reasons expanded to {all_done | verification_failed | max_iterations |
timeout | dependency_deadlock}. Env overrides: AUDIT_MAX_ITERATIONS
(default 500), AUDIT_TIMEOUT_MINUTES (default 480).
description
Audit execution orchestrator. Consumes state.md, graph.md, and prompts/*.md
produced by the repo-dag-generator and dispatches a fleet of Claude Code
Task sub-agents — preflight, resolve open_questions, dispatch tier by tier
honouring depends_on and parallel_safe, verify per-node done_when, remediate
failures, then emit a completeness proof. Use when the user has an audit
bundle and asks to "run the audit", "execute the audit graph", or "dispatch
the audit nodes". Enforces touches whitelist, LOC noise floor, and full
git-log traceability on the audit/run-{ISO8601} branch. Do NOT use without
state.md/graph.md/prompts/ present, for one-shot tasks, or to generate the
audit bundle (that is the generator's job).