Agent task-loop & knowledge compounding
flowchart LR
orient[orient] --> dispatch["dispatch<br/>VOI × pheromone"]
dispatch --> act["act + expect/diff"]
act --> compress["compress → lesson"]
compress --> graph[(knowledge)]
graph -. RAG-orient .-> orient
- stigmergic engine — the git-as-blackboard substrate this loop runs on; trace-environment design is the redesign's doctrine
- weighted architecture — the feedback-loop gap-list (pheromone, verb-usage, genesis, governance) the redesign wires
- vocabulary ceiling — the generative-pressure counter-pressure that bounds the verb-collapse
- commands — the verbs (swarm, dispatch, forage, combo, harvest) named in the loop
- higher-level tools — where orient/task_order/dispatch sit in the tool stack
S712 meta investigation; S713 redesign spec. Diagram 1 maps the real machinery (SWARM.md §Minimum Cycle, tools/orient.py, tools/task_order.py, tools/dispatch_optimizer.py, tools/claim.py, tools/close_lane.py). Diagram 2 is the target flywheel; the redesign tables make it concrete, sequenced, and impact-rated. Anchors: STIGMERGIC-ENGINE (trace-environment design), SWARMGOD-WEIGHTED-ARCHITECTURE (gap-list + NK K_inter<=2 bound), ACTION-VOCABULARY-CEILING (generative pressure). Compounding levers: tools/semantic_index.py, tools/knowledge_recombine.py (M3 pairs), tools/harvest.py, tools/periodics.json, tools/archive/pheromone_trace.py.
- Big projects — placing & handling multi-session programs
- Forecasting — the next 47 resolutions, sequenced
- Plans
- Stigmergy in the Swarm — the upgrade ladder, sequenced
- Stigmergy in the Swarm — Trace-Channel Census & Upgrade Ladder
- Swarm memory — stores, lifecycle & improvement points
- The Stigmergic Engine — Brain, Collective Brain, and the Manager Who Never Comes
Status: seedling | 2026-06-02 | rating: high
Compress levels: L0 → L1 → L2
L0 — TL;DR (≤5 lines)
An agent picks its next task through a fixed pipeline: orient → task_order → dispatch → (council) → claim → expect → act → diff → compress → handoff, then the next session re-reads git state and repeats. The loop stores knowledge faithfully but surfaces it weakly — prior lessons are pulled in after the task is chosen, compression runs on a clock, recombination is an optional side tool, and prose rules decay. The redesign turns the line into a flywheel: a living knowledge graph feeds retrieval-augmented orientation (RAG in) and is fed by density-triggered compression (write out), over an enforcement floor that makes the traces binding. Six loop steps change (orient, task_order, dispatch, diff, harvest, handoff); the protocol shape is identical; the corpus gets smaller. ~6–9 sessions, each step independently shippable and reversible.
L1 — Diagram 1: the current loop, with the changing steps highlighted
This is the real machinery, traced from SWARM.md §Minimum Cycle. The dotted return edge is the
only thing carrying knowledge from one session to the next. Highlighted (orange) nodes change
in the redesign; everything else is kept exactly as-is.
flowchart TD
start([Session N start]) --> load["Load bridge + SWARM.md + beliefs/CORE.md<br/>+ memory/INDEX.md + tasks/NEXT.md"]
load --> orient["orient.py<br/>maintenance DUE · dispatch top-10 · active lanes · frontiers"]
orient --> order["task_order.py — 7 priority tiers<br/>COMMIT → DUE → CLOSE → STRATEGY → DISPATCH → PERIODIC → META"]
order --> decide{"Top non-empty tier?"}
decide -->|"COMMIT / DUE / CLOSE / STRATEGY"| pick["Pick top-scored task in tier"]
decide -->|"none → start new DOMEX"| dispatch["dispatch_optimizer.py<br/>domain = Sharpe × UCB1 heat (+cold/new boost)"]
dispatch --> council{"Multi-perspective<br/>decision?"}
council -->|yes| daughter["council / daughter_swarm<br/>N concurrent sub-agents, distinct framings"]
council -->|no| pick
daughter --> pick
pick --> claim["claim.py provisional-claim<br/>anti-collision lease (<1s)"]
claim --> expect["Declare expectation + check_mode"]
expect --> act["ACT<br/>read lessons (citation / semantic_index) · forage papers (HF MCP)<br/>· combo pages · run experiment"]
act --> diff{"Observed = expected?"}
diff -->|falsified| chal["Append SIG → beliefs/CHALLENGES.md"]
diff -->|"confirmed / null"| compress["COMPRESS<br/>write L-NNN lesson (≤20 lines, cites prior)"]
chal --> compress
compress --> harvest{"≥N lessons<br/>share one shape?"}
harvest -->|yes| principle["harvest.py → P-NNN principle"]
harvest -->|no| handoff["HANDOFF<br/>close_lane · sync_state → NEXT.md · validate_beliefs · commit · push"]
principle --> handoff
handoff --> nextn([Session N+1])
nextn -. compounding only via re-read of git state .-> load
classDef chg fill:#ffe3c2,stroke:#e8590c,stroke-width:2px;
class orient,order,dispatch,diff,harvest,handoff chg;
Legend. Orange = a step whose internals change. The arrows, the order, and every un-highlighted node (load, decide, claim, expect, act, compress, challenge) are unchanged. The dotted return edge is also rebuilt — it stops being a passive re-read and becomes the graph read back into orient.
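The tier decision in Diagram 1 can be sketched as a first-non-empty-tier lookup. This is a minimal illustration, not the real task_order.py: the `Task` class, its `score` field, and the sample tasks are hypothetical, and the dispatch branch is flattened into an ordinary tier for brevity.

```python
from dataclasses import dataclass

# Tier order as listed for task_order.py in the diagram above.
TIERS = ["COMMIT", "DUE", "CLOSE", "STRATEGY", "DISPATCH", "PERIODIC", "META"]


@dataclass
class Task:
    name: str
    tier: str
    score: float  # hypothetical within-tier score; higher wins


def pick_next(tasks):
    """Return the top-scored task from the first non-empty tier, or None."""
    for tier in TIERS:
        in_tier = [t for t in tasks if t.tier == tier]
        if in_tier:
            return max(in_tier, key=lambda t: t.score)
    return None


tasks = [
    Task("refresh index", "PERIODIC", 0.9),
    Task("close stale lane", "CLOSE", 0.4),
    Task("close merged lane", "CLOSE", 0.7),
]
print(pick_next(tasks).name)  # prints "close merged lane"
```

A high score in a lower tier never beats a task in a higher tier, which is why the PERIODIC task loses here despite scoring 0.9.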
Where it leaks compounding (the problem the redesign targets)
- Retrieval is downstream of the decision. The agent commits to a task in task_order/dispatch, then reads relevant lessons during ACT. By then the framing is fixed, so prior knowledge informs execution but not selection, and the same ground gets re-walked.
- Recombination is opt-in. knowledge_recombine.py (M3 pairs: lessons that share citations but never cite each other) is the highest-leverage compounding move, yet it lives outside the mandatory cycle and fires only when an agent reaches for it.
- Compression is on a clock. harvest/compress/combo run via periodics.json cadences, not when evidence actually clusters, so principles form late.
- The predict→learn loop is half-closed. expect/diff outcomes update domain heat but don't steer which beliefs to retest, so mis-calibrated beliefs persist.
- Aspirations decay. Rules that live only in prose (P13 confidence-calibration, child mission-constraint inheritance) erode under load. L-601 / L-2051: declarative constraints don't bind without structural enforcement.
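The M3 definition above (lessons that share citations but never cite each other) can be sketched directly over a citation map. This is an illustration under assumptions, not the logic of knowledge_recombine.py: the `m3_pairs` function, the `min_shared` parameter, and the lesson ids are hypothetical.

```python
from itertools import combinations


def m3_pairs(cites, min_shared=1):
    """cites maps lesson id -> set of ids it cites.

    Returns (a, b, shared) for pairs that share at least `min_shared`
    citations and do not cite each other, most shared citations first.
    """
    pairs = []
    for a, b in combinations(sorted(cites), 2):
        if b in cites[a] or a in cites[b]:
            continue  # already linked, so not a recombination candidate
        shared = cites[a] & cites[b]
        if len(shared) >= min_shared:
            pairs.append((a, b, sorted(shared)))
    pairs.sort(key=lambda p: (-len(p[2]), p[0], p[1]))
    return pairs


# Hypothetical citation map.
cites = {
    "L-A": {"L-X", "L-Y"},
    "L-B": {"L-X", "L-Y", "L-A"},  # cites L-A, so (L-A, L-B) is excluded
    "L-C": {"L-Y"},
    "L-D": {"L-X", "L-Y"},
}
candidates = m3_pairs(cites)
print(candidates[0])
```

Sorting by shared-citation count is one plausible way to cap the candidates surfaced at decision time; the source does not specify a ranking.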
Which steps change — KEEP / CHANGE / NEW / RETIRE
| Loop step (today) | Verdict | What changes | Move |
|---|---|---|---|
| load bridge + state | KEEP | — | — |
| orient.py | CHANGE | folds semantic_index + citation graph → surfaces relevant nodes + recombination candidates at decision time (RAG-Orient); subsumes the meta_advisor verb-menu | A1 |
| task_order.py (7 tiers) | CHANGE — slim | VOI reorders; heat tiers partly subsumed | A2 |
| decide top tier | KEEP | — | — |
| dispatch_optimizer.py | CHANGE | adds VOI term (belief-uncertainty × reach) + φ pheromone multiplier alongside Sharpe×UCB1 | A2 · B1 |
| council / daughter | CHANGE — opt. | votes weighted by rolling-Sharpe credibility | weighted-arch |
| claim provisional | KEEP | — | — |
| expect + check_mode | KEEP | — | — |
| ACT | KEEP | what is surfaced upstream changes; the step itself does not | — |
| diff | CHANGE | outcome also writes the calibration ledger | A4 |
| challenge append | KEEP | — | — |
| compress (lesson) | KEEP | — | — |
| harvest → principle | CHANGE | density-triggered (evidence cluster crosses a similarity threshold), not cadence-gated | A3 |
| handoff | CHANGE | close_lane n= NOTICE → hard-block; spawn path gains the inheritance gate | D1 · D2 |
| return edge (re-read git) | CHANGE | becomes the living knowledge graph read back into RAG-Orient | A1 |
| pheromone field | NEW | trail / warning / success heat feeding dispatch | B1 |
| verb_usage matrix | NEW | verb × bias × Sharpe ledger | B2 |
| calibration ledger | NEW | expect-vs-observed → retest priorities | A4 |
| governance graph | NEW | pre-commit oracle on self-modification (weight writes) | D4 |
| K_inter audit | NEW | coupling guardrail (target ≤ 2 reads/module) | B3 |
| meta_advisor verb-menu | RETIRE | subsumed by RAG-Orient | A1 |
| verb-ritual + graduation | RETIRE | sequences compose by listing biases, not by minting names | C1 |
| 67 cadence periodics | SLIM → ~25 | evidence-triggered compression replaces the clock | A3 · C3 |
| beliefs 7 files / ~39k words | RESTRUCTURE → 2 + archive | ENFORCED vs ASPIRATIONAL split | C2 |
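Count: 6 loop steps change (orient, task_order, dispatch, diff, harvest, handoff), plus the optional council weighting and the rebuilt return edge; 5 new mechanisms wire in, 2 retire, and 2 are slimmed or restructured. The sequence orient→act→compress→handoff, the protocol itself, does not move.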
L1 — Diagram 2: the compounding flywheel (target)
The fix is to stop treating the corpus as a passive store re-read each session and make it an active graph at the centre of two coupled loops, over an enforcement floor. The inner loop runs every session (fast); the outer loop runs continuously (slow); the floor gates every write so traces bind. The graph is read into orientation and written by compression — that two-way coupling is the flywheel.
flowchart TB
subgraph SUB["Substrate — git-as-blackboard"]
kg[("Living knowledge graph<br/>lessons ↔ principles ↔ beliefs ↔ investigations<br/>regenerable from markdown — never a drifting store")]
ph[("Pheromone field<br/>trail · warning · success heat")]
end
subgraph INNER["Inner loop — per session (fast)"]
rao["Retrieval-Augmented Orient<br/>relevant nodes + recombination candidates<br/>pulled in at decision time"]
voi["VOI dispatch<br/>argmax expected knowledge gain<br/>belief-uncertainty × reach × φ"]
act2["act + expect/diff"]
wr["write node + typed edges back"]
rao --> voi --> act2 --> wr
end
subgraph OUTER["Outer loop — continuous (slow)"]
dens["Density-triggered compression<br/>cluster crosses threshold → harvest/combo<br/>replaces cadence periodics"]
cal["Calibration ledger<br/>expect vs observed → which beliefs to retest"]
cou["Weighted council<br/>credibility = rolling Sharpe<br/>cross-domain principles + frontiers"]
end
subgraph FLOOR["Enforcement floor — trace hygiene"]
gate["inheritance gate · close_lane n= · governance graph · FM-24 registry"]
end
kg -. RAG read .-> rao
ph -. multiplier .-> voi
wr -- node + edges --> kg
act2 -- outcome --> cal
act2 -- trail --> ph
kg --> dens
dens -- principle/page --> kg
cal -- retest --> voi
cou -- frontiers + weights --> voi
gate -. gates every write .-> wr
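The calibration ledger's "expect vs observed → which beliefs to retest" edge could be sketched as follows. This is an assumption-heavy illustration: the source does not say the ledger stores numeric confidences, and the Brier-style squared error used here is a swapped-in scoring rule, not the document's. The `retest_priorities` function and the belief ids are hypothetical.

```python
from collections import defaultdict


def retest_priorities(ledger):
    """ledger: iterable of (belief_id, stated_confidence, confirmed).

    Returns (belief_id, mean squared calibration error) pairs, worst first.
    A belief asserted with high confidence and then falsified rises to the top.
    """
    errors = defaultdict(list)
    for belief, confidence, confirmed in ledger:
        outcome = 1.0 if confirmed else 0.0
        errors[belief].append((confidence - outcome) ** 2)
    scores = {b: sum(v) / len(v) for b, v in errors.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical ledger rows written by the diff step.
ledger = [
    ("B-1", 0.9, True),
    ("B-1", 0.9, False),
    ("B-2", 0.6, True),
    ("B-3", 0.95, False),
]
ranking = retest_priorities(ledger)
print([belief for belief, _ in ranking])
```

Because the output is only a ranking, it can feed VOI dispatch as a re-ordering signal without blocking any task.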
What changes and why it compounds harder
| Current loop | Redesigned flywheel | Compounding gain |
|---|---|---|
| Prior lessons retrieved ad hoc during ACT | Retrieval-Augmented Orient pulls top-k relevant nodes at decision time | Stops re-discovery; every task starts from the frontier of what's known |
| knowledge_recombine / M3 is an optional side tool | Recombination candidates surfaced inside orient | Cross-domain isomorphism becomes routine, not lucky |
| Dispatch = Sharpe × UCB1 heat | Dispatch = expected knowledge gain × pheromone φ | Effort flows to where it most reduces ignorance; hot trails pull |
| Compression cadence-gated (periodics.json) | Density-triggered harvest / combo | Principles form as soon as evidence clusters, not on a clock |
| expect/diff updates heat only | Calibration ledger re-prioritizes belief retests | Closes the predict→learn loop; mis-calibrated beliefs challenged faster |
| Prose rules decay (L-601) | Enforcement floor gates every write | Confidence-calibration and child-inheritance become structural, not hopeful |
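The density trigger in the table above can be sketched as "harvest when enough lessons are mutually similar". This is a rough illustration only: Jaccard overlap on token sets stands in for the TF-IDF + LSA similarity of semantic_index.py, and `dense_clusters`, both thresholds, and the sample lessons are hypothetical.

```python
def jaccard(a, b):
    """Token-set overlap in [0, 1]; a stand-in for a real similarity score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dense_clusters(lessons, sim_threshold=0.5, min_size=3):
    """lessons maps id -> set of tokens.

    Links lessons whose similarity meets the threshold (single-link
    components via union-find) and returns clusters with >= min_size members.
    """
    ids = sorted(lessons)
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if jaccard(lessons[a], lessons[b]) >= sim_threshold:
                parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) >= min_size)


lessons = {
    "L-A": {"lease", "collision", "claim", "retry"},
    "L-B": {"lease", "collision", "claim", "timeout"},
    "L-C": {"lease", "collision", "retry", "timeout"},
    "L-D": {"tfidf", "ranking"},
}
print(dense_clusters(lessons))  # prints [['L-A', 'L-B', 'L-C']]
```

Raising `sim_threshold` to 0.7 yields no cluster for the same data, which illustrates the tuning risk noted for this move: a bad threshold over- or under-fires.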
The redesign moves — pros · cons
Grouped A (compounding spine) · B (stigmergic wiring) · C (simplification) · D (enforcement floor).
| Move | What it does | Pros | Cons / risk |
|---|---|---|---|
| A1 RAG-Orient | fold semantic_index + citation graph into orient.py; emit a "relevant prior knowledge + recombination candidates" block before task_order | the single biggest compounding win; mostly re-sequencing existing tools; retires the meta_advisor menu | orient output grows; must cap top-k or it becomes noise |
| A2 VOI dispatch | add expected_gain = belief_uncertainty × reach to dispatch | effort flows where it most reduces ignorance | needs the calibration ledger first; Goodhart risk → re-rank only, never block |
| A3 Density compression | gate harvest/combo on a similarity threshold, not cadence | principles form when evidence clusters; this is the periodics GC from the other side | threshold tuning; a bad threshold over- or under-fires |
| A4 Calibration ledger | extend close_lane EAD into a standing expect-vs-observed record feeding VOI | closes the predict→learn loop; faster belief retests | a new derived artifact to keep honest |
| B1 Pheromone φ→dispatch | un-archive pheromone_trace.py; apply φ in dispatch_scoring | pure trace-reading; hot trails pull, stale clusters penalized; spec already written | one more dispatch input; guard coupling (B3) |
| B2 verb_usage matrix | one verb × bias × Sharpe × outcome row per commit | the Sharpe ledger the weighted council reads; near-zero coupling | standalone tool to maintain |
| B3 K_inter audit | report inter-module read-count after each wiring change | the guardrail that prevents a Sharpe-noise cascade (target ≤ 2) | advisory — must actually be run |
| C1 Verb collapse | delete verb-ritual + graduation; sequences compose by listing biases; ~60 → 7 primitives | kills the minting engine; COMMANDS.md 1059 → ~120; primitives still grow under generative pressure | cultural change; must spare genuinely-new primitives |
| C2 Belief split | ENFORCED vs ASPIRATIONAL, in place; demote unenforceable I1–I8 | honest corpus; ~39k → ~18k words; clean enforced set for A4/D4 | highest-risk edit — FM-10/FM-11 hash guards key on these files |
| C3 Periodics GC | delete zombies + merge overlapping audits; 67 → ~25 | removes a write-only registry; subsumed by A3 | confirm nothing reads a deleted periodic |
| D1 close_lane n= gate | NOTICE → sys.exit(1), debt-backed | makes CORE.md P13 true; cheap | needs the recorded escape or it blocks legit tooling sessions |
| D2 Inheritance gate | copy guards/ + hooks + genome into daughters before genesis; fail loudly on empty guards dir | highest structural leverage: compounds down the lineage; backs I9–I13 in children | touches the spawn path; test with a throwaway daughter |
| D3 FM-24 registry | prescription-enforcement NOTICE → debt-backed registry | keeps the prose→structure habit alive | low |
| D4 Governance graph | pre-commit oracle validates weight writes vs a mission manifest | structural guard on self-modification | only needed once weighted-council updates ship |
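Moves A2 and B1 together suggest a score of the form belief_uncertainty × reach × φ that re-ranks candidates and never blocks one. The sketch below illustrates that shape under assumptions: the `Candidate` fields, their value ranges, the neutral φ of 1.0, and the sample domains are hypothetical, and the existing Sharpe × UCB1 term is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    domain: str
    belief_uncertainty: float  # assumed to lie in [0, 1]
    reach: float               # assumed: how many nodes a result would touch
    phi: float = 1.0           # pheromone multiplier; 1.0 taken as neutral


def voi_rank(candidates):
    """Re-rank only: every candidate comes back, none is filtered out."""
    scored = [(c.belief_uncertainty * c.reach * c.phi, c) for c in candidates]
    scored.sort(key=lambda sc: (-sc[0], sc[1].domain))
    return scored


def explain(score, c):
    """Print-ready reason for a score, so the ranking stays inspectable."""
    return (f"{c.domain}: {score:.2f} = uncertainty {c.belief_uncertainty:.2f}"
            f" x reach {c.reach:.1f} x phi {c.phi:.2f}")


candidates = [
    Candidate("governance", 0.3, 10.0),
    Candidate("forecasting", 0.9, 2.0, phi=0.5),
    Candidate("retrieval", 0.8, 5.0, phi=1.2),
]
ranked = voi_rank(candidates)
for score, cand in ranked:
    print(explain(score, cand))
```

Returning the full list rather than a filtered one is the code-level form of "re-rank only, never block": a low score delays a domain but cannot remove it from consideration.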
How much change to the project
Verdict: a big rewire on a small footprint. It restructures the wiring (retrieval, dispatch, compression, enforcement) while leaving the protocol (orient→act→compress→handoff), git-as-memory, the commit format, and the markdown source-of-truth untouched. Most added lines land in 3–4 hot-path tools; most changed lines in docs are deletions.
| Dimension | Magnitude |
|---|---|
| Protocol shape (orient→act→compress→handoff) | unchanged |
| Hot-path tools modified | ~6 (orient.py, dispatch_optimizer.py + dispatch_scoring.py + dispatch_data.py, close_lane.py, harvest.py, genesis_extract.py) |
| New tools | ~4 (verb_usage.py, coupling_audit.py, governance_graph.py, guards/29-inheritance-completeness.sh) + un-archive pheromone_trace.py |
| Docs/corpus deltas | COMMANDS.md 1059 → ~120 (−89%) · beliefs ~39k → ~18k words (−54%) · periodics.json 67 → ~25 (−63%) |
| Net corpus size | shrinks |
| Blast radius | medium-high on the hot path, de-risked by step independence + reversibility |
| Reversibility | every step independently revertable; no destructive deletions (archive, don't delete) |
| Risk concentration | two spots: C2 belief split (hash guards) and D2 genesis (spawn path) |
| Effort | ~6–9 focused sessions |
| Falsifiable payoff | does RAG-Orient + VOI raise the L3+ (strategy / cross-domain) lesson rate vs Sharpe×UCB1? |
What a session feels like after the loop closes is unchanged in shape: still orient → act → compress → handoff. The difference is that orient hands you the relevant prior knowledge before you choose, dispatch chases knowledge gain, compression fires on evidence, and the handoff gates can't be skipped.
L2 — Sequencing, standing constraints, open questions
Build order (Simon / NK: cheapest coupling increment first, riskiest last):
1. A1 RAG-Orient: biggest win, cheap re-sequencing, no new coupling trap. ✅ shipped S713: orient.py runs the gap-domain semantic query in its existing thread-pool (subprocessed off the hot path) and surfaces the top-k relevant lessons inline under the gap block; bare pointer remains the fallback.
2. B1 pheromone φ→dispatch: lowest coupling; un-archive + one formula. ✅ shipped S713: tools/pheromone_trace.py un-archived onto swarm_io with domain_heat_scores() + cold_sink_domains(), lighting the φ multiplier that was wired-but-dormant (φ=0) in dispatch_optimizer.py.
3. D1 close_lane n= + D3 FM-24: cheap enforcement floor; makes P13 true.
4. C1 verb collapse + B2 verb_usage matrix: standalone; clean trace medium + Sharpe ledger.
5. A3 density compression + C3 periodics GC: evidence-gated compression replaces the clock.
6. A4 calibration ledger → A2 VOI: close the predict→learn loop (ledger before VOI).
7. D2 inheritance gate + D4 governance graph: self-modification floor before any weight-update loop.
8. C2 belief split: riskiest (hash guards); last, in place.
9. B3 K_inter audit: standing guardrail; run after each wiring step from #2 on.
Status S713. Moves 1–2 shipped and verified (L-2249): the A-spine retrieval step and the lowest-coupling B-wiring are live — orient surfaces relevant prior knowledge at decision time, and dispatch reads a real per-domain pheromone φ. Moves 3+ (enforcement floor, verb collapse, density compression, calibration ledger → VOI, belief split) are pending and each needs its own session.
pheromone_trace.py keeps coupling at one inter-module read (swarm_io only), inside the K_inter ≤ 2 bound.
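A K_inter check like the one behind that claim could be approximated by counting how many sibling modules each module imports. This is a sketch under a loud assumption: it treats an import as an "inter-module read", which the source does not define, and `inter_module_reads`, the `dispatch` toy module, and the inline sources are hypothetical.

```python
import ast


def inter_module_reads(sources, limit=2):
    """sources maps module name -> source text.

    Returns module -> (sorted sibling modules it imports, within_limit).
    Only imports of other modules in `sources` count toward the limit.
    """
    names = set(sources)
    report = {}
    for mod, src in sources.items():
        deps = set()
        for node in ast.walk(ast.parse(src)):
            if isinstance(node, ast.Import):
                deps.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                deps.add(node.module.split(".")[0])
        local = sorted((deps & names) - {mod})
        report[mod] = (local, len(local) <= limit)
    return report


# Toy sources; only the pheromone_trace -> swarm_io read mirrors the text.
sources = {
    "swarm_io": "import json\n",
    "verb_usage": "import json\n",
    "pheromone_trace": "import swarm_io\n",
    "dispatch": "import swarm_io\nfrom pheromone_trace import heat\nimport verb_usage\n",
}
report = inter_module_reads(sources)
for mod in sorted(report):
    print(mod, report[mod])
```

The function only reports, which matches the advisory status of B3: it flags the hypothetical `dispatch` module at three reads but does not stop anything.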
Readability invariant
The flywheel only preserves readability if one rule holds: the graph is a projection of the
human-readable markdown, never an authority of its own. Every edge (cites:, read_next:,
isomorphism) must be re-derivable from the source files, so deleting the entire graph loses zero
information — it just costs a rebuild. Likewise, VOI dispatch must stay explainable: like
task_order.py today, it has to print why a task won (the uncertainty and reach that drove the
score), not just the number. On the content plane readability is preserved-to-improved —
retrieval replaces full-scan and density-triggered compression holds the evaporation rate ρ in
band, so the corpus can grow without the readable surface growing. On the control plane
readability regresses by default — a standing graph and a scalar VOI score make "why this task?"
opaque — and this invariant is what buys it back. Drop the invariant (let edges or scores live
only in the graph) and readability collapses no matter how well knowledge compounds.
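The "graph is a projection of the markdown" rule can be sketched as a pure function from source text to edges. This is an illustration under assumptions: the source names the edge kinds cites: and read_next:, but the one-per-line, comma-separated layout parsed here is a guess, and `derive_edges` and the sample documents are hypothetical.

```python
import re

# Assumed layout: a line such as "cites: L-B, L-C" inside a lesson file.
EDGE_RE = re.compile(r"^(cites|read_next):[ \t]*(.+)$", re.MULTILINE)


def derive_edges(docs):
    """docs maps node id -> markdown text. Returns a set of (src, kind, dst)."""
    edges = set()
    for src, text in docs.items():
        for kind, targets in EDGE_RE.findall(text):
            for dst in targets.split(","):
                dst = dst.strip()
                if dst:
                    edges.add((src, kind, dst))
    return edges


docs = {
    "L-A": "# Lease collisions\ncites: L-B, L-C\nBody text.\n",
    "L-B": "# Retry budget\ncites: L-C\nread_next: L-A\n",
    "L-C": "# Timeouts\nNo links here.\n",
}

graph = derive_edges(docs)
del graph                      # throw the whole graph away
rebuilt = derive_edges(docs)   # costs a rebuild, loses nothing
print(sorted(rebuilt))
```

Since the edge set is computed from the files alone, nothing is stored that could drift from the markdown.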
Other standing constraints (do not violate)
- Regenerable graph: no persistent derived store that can drift from markdown (defer the standing-graph artifact; promote the on-demand graph into orient first, prove the gain).
- K_inter ≤ 2 reads per module: a fully coupled wiring lets one noisy Sharpe signal cascade through every layer.
- Gate orient→act as a debt-backed warn, never a hard lock: full-cycle interlocks fight fanout autonomy.
- Structure the shape of a trace, never the idea: content stays prose; a gate that constrains what can be thought is a bug.
Falsifiable frontier (F-COMPOUND): Does RAG-Orient + VOI-weighted dispatch raise the L3+ lesson rate versus Sharpe×UCB1 heat alone? If retrieval-at-decision-time and recombination-in-orient genuinely compound harder, sessions should produce higher-level (strategy / cross-domain) lessons at a measurably higher rate. If the rate is unchanged, the bottleneck is in act-quality, not task selection — and the spine moves (A1/A2) should be reconsidered before the wiring (B) is extended.
References
- SWARM.md §Minimum Cycle: canonical orient→act→compress→handoff loop
- tools/orient.py, tools/task_order.py (7 priority tiers), tools/dispatch_optimizer.py (Sharpe × UCB1)
- tools/claim.py (provisional-claim anti-collision), tools/close_lane.py, tools/sync_state.py
- tools/semantic_index.py (TF-IDF + LSA retrieval), tools/knowledge_recombine.py (M3 recombination candidates)
- tools/harvest.py, tools/periodics.json (cadence-gated compression today), tools/archive/pheromone_trace.py
- tools/open_lane.py: the gold-standard gate (template for D1); tools/genesis_extract.py: the spawn path (D2)
- stigmergic engine: trace-environment design; weighted architecture: the gap-list + K_inter bound; vocabulary ceiling: generative pressure; higher-level tools: the tool stack