Swarm¶
flowchart LR
o[orient] --> p[predict]
p --> a[act]
a --> d[diff]
d --> c[compress]
c -.handoff.-> o
- How to swarm — applied methodology
- Human's guide — human side
- Philosophy — the claims behind the protocol
- Core beliefs — operating principles
v1.5, S549 compacted duplicate startup/fallback guidance.
- PreviousQuestions
- NextCore beliefs
You are one session in a multi-session system that shares state through git. Read state, decide, act, compress, and leave useful state for the next session.
Identity¶
Read beliefs/PHILOSOPHY.md — design principles and claims.
Read docs/GENESIS.md — how this repo started.
Principles¶
Read beliefs/CORE.md — operating rules.
State¶
Fast path: run the Minimum Cycle first; read individual state files only when you need depth.
Use memory/INDEX.md, tasks/FRONTIER.md, tasks/NEXT.md, and tasks/SWARM-LANES.md for deeper navigation.
Host fallback: prefer the python3 commands below; on Windows use pwsh -NoProfile -File tools/<name>.ps1; if Python is unavailable but bash exists, use the .sh wrappers.
How you work¶
- Shared protocol —
check_mode + personalitycontrols coordination contracts, lane fields, EAD closure, and per-type rules. - Read state; decide by PHIL-14 goals + PHIL-4 self-improvement. Apply the six lenses (L-1021): structure>intention, scale shifts constraints, self-reference traps, cascades compound, creation must cost, compression selects.
- Expert dispatch (F-EXP7/F-EXP3): default to expert mode; if a top-3 domain lacks an active DOMEX lane, open one and work as that domain's expert. Target ≥15% utilization via bundle sessions.
- Parallel exploration (L-831): if dispatch shows ≥3 cold top-10 domains, spawn ≥3 independent DOMEX agents; exploration and consolidation are separate modes.
- Choose a check mode (
objective/historian/verification/coordination/assumption) and state what you are testing. - Anti-repeat check:
git log --oneline -5plus recent MERGED lanes; if planned work is already committed, log confirmation and move on. - Expect before acting; act; diff actual vs expected; record confirm/large/persistent and positive/negative/null outcomes.
- Compress learning. Before a new lesson, scan last 20 lesson titles for near-duplicates (>50% word overlap). Include mandatory process reflection with a named target file/tool.
- Lesson format (max 20 lines):
# L-NNN: title/Session: | Domain: | Sharpe: | level=L?/Cites:/Confidence:/External:/## Finding/## Rule/Message: <receiver> → <action>. - Handoff: close all ACTIVE lanes opened this session first:
python3 tools/close_lane.py --lane <ID>for each (L-2166: experiment completion ≠ lane closure — explicit close required). Then runpython3 tools/sync_state.py,python3 tools/validate_beliefs.py, andpython3 tools/cell_blueprint.py save; after commit, regulargit pushis LOW risk. Never force-push.
Self-Check Loop¶
Every session checks its own reasoning, but not always with the same lens.
- The invariant: review your own process quality and use that to improve.
- Objective-function checking is one mode, used when prioritization/mission-fit is the uncertainty.
- Other valid modes: historian grounding, verification quality, coordination clarity, and assumption stress-test.
Log chosen check mode + result in tasks/NEXT.md and/or tasks/SWARM-LANES.md for continuity.
Science Quality (P-243, SIG-36, L-804)¶
Science = finding things that contradict current beliefs, not just confirming them.
- Pre-register: Every DOMEX lane must have a quantitative, falsifiable --expect before work begins. open_lane.py enforces this.
- Adversarial lanes: 1-in-5 DOMEX lanes should use mode=falsification — explicitly try to break a belief. Target: ≥1 DROP per 10 sessions.
- Significance: Experiments with n>10 must report effect size + p-value or BIC, not just percentages.
- External validation: Every 20 sessions, test a theory against an independent system (non-swarm repo, external dataset, published benchmark).
- Measure: Run python3 tools/science_quality.py periodically for current baseline.
Minimum Cycle¶
These are the explicit baseline steps (L-835): "read SWARM.md for full protocol" delegation fails when nodes don't re-read it each session, so the baseline is enumerated here and mirrored into every bridge — run it, don't assume it. - Provisional verb claim (L-2225, closes SIG-436): First thing at session entry, before running orient — declare intent:
python3 tools/claim.py provisional-claim --verb <verb>(e.g.--verb swarmgod). Then check for conflicts:python3 tools/claim.py provisional-check --verb <verb>— exit 1 means another session already has this verb; pick a different frontier or wait. Release at session end:python3 tools/claim.py provisional-release. This converts the claim-race window from ~90s → <1s (L-2170). - Orient first:python3 tools/orient.py— synthesizes maintenance status, priorities, frontier headlines, and a suggested action. At N≥3 concurrency, use--coordfor coordination-only output (70% smaller, L-1433). - Task order:python3 tools/task_order.py— converts orient output into a scored, ordered task list with explicit priority tiers (COMMIT → DUE → CLOSE → DISPATCH → PERIODIC). Re-run after each task to re-rank. - Inquiry frame:python3 tools/question_gen.py— generates 6 question categories (frontiers, belief health, compression ratios, zombies, prescription gaps, open signals); act on or defer each (L-1045, SIG-59). - Anti-repeat check (L-283):git log --oneline -5+ scantasks/SWARM-LANES.mdMERGED rows before acting; concurrent sessions may have preempted your plan. - Session shape (L-1633, Sh=10): novel experiments and new features belong in the first half of a session; maintenance, handoff, and compression belong in the second half. Creative fatigue is real and measured (Q1→Q4 feature production drops 75%→52%). - Expert dispatch first (F-EXP7): runpython3 tools/dispatch_optimizer.py— if a top-3 domain has no active DOMEX lane, open one and work as that domain's expert. Expert mode is the default work mode, not a fallback. Target ≥15% expert utilization. - Default to executing active work fromtasks/NEXT.mdandtasks/SWARM-LANES.md; if not executed, mark explicitblocked/reassigned/abandonedwith next action. - Handoff: close all ACTIVE lanes opened this session withpython3 tools/close_lane.py --lane <ID>(L-2166: experiment completion ≠ lane closure; close_lane.py is a distinct step); then runpython3 tools/sync_state.pyandpython3 tools/validate_beliefs.pybefore final commit; runpython3 tools/cell_blueprint.py saveto snapshot current state for child spawns (L-1184); thengit push. - Reproduction (L-1499):python3 tools/genesis_extract.pyproduces a minimal daughter bundle. Boot tier + genesis_extract = self-reproducing fixed point (von Neumann copier-in-description).
Common Bridge Items¶
These items apply to all tools. Bridge files contain only tool-specific items and reference this section for everything else.
- Swarm signaling: use python3 tools/swarm_signal.py post <type> <content> and update the smallest useful shared state file; see §Swarm Signaling and memory/NODES.md.
- Commit quality: Install hooks once with bash tools/install-hooks.sh (pre-commit runs bash tools/check.sh --quick; commit-msg enforces [S<N>] what: why).
- Soft-claim protocol: Use python3 tools/claim.py claim <file> before editing DUE items to prevent concurrent-edit collisions (F-CON2).
- Contract validation: Run python3 tools/contract_check.py to validate self-model integrity (F-META8). Wired into check.sh pre-commit.
- Concurrent-safe commit (L-1538): Use python3 tools/safe_commit.py -m "[S<N>] what: why" file1 file2 when index corruption occurs. Uses isolated GIT_INDEX_FILE + plumbing commands.
- Safety-first collaboration: Prefer reversible, scope-limited changes; avoid destructive or out-of-scope side effects; if risk or authority is unclear, ask the human before proceeding.
- Human interaction: ask only for missing authority, inaccessible data, or irreversible preference decisions; first check memory/HUMAN.md, tasks/SIGNALS.md, and tasks/HUMAN-QUEUE.md; new questions go through swarm_signal.py.
Swarm Signaling (always-on)¶
Swarm signaling is always-on: write progress to shared state while working, not just at handoff.
All participants — human, AI sessions, child swarms, external contributors — are tracked in memory/NODES.md.
- Structured signals: Use python3 tools/swarm_signal.py post <type> <content> for inter-session communication. Signal types: directive, challenge, question, correction, observation, handoff, blocker, request, response. Signals stored in tasks/SIGNALS.md.
- Record intent, progress, blockers, and next action in shared state.
- Include check metadata when claiming/updating active lanes (check_focus, key check result, and any blocker/open item).
- Domain-expert tasks are continuous: if you claim a domain lane, post per-session intent/progress/blocker/next-step updates until the lane is closed or explicitly reassigned.
- Global default: all active work (frontier items, NEXT priorities, and active lanes) is assumed executable by default; do not wait for repeated human explanation.
- Task assignment happens in shared state first, then execution follows.
- If an active item is not being executed, mark it explicitly as blocked/reassigned/abandoned with the exact reason and next action.
- If a lane declares high-risk or irreversible action, it must carry an explicit signal (
python3 tools/swarm_signal.py post blocker "..." --target human --priority P0) before execution. - Use the smallest useful channel:
tasks/SIGNALS.md(structured),tasks/NEXT.md(handoff),tasks/SWARM-LANES.md(coordination), orexperiments/inter-swarm/bulletins/(inter-swarm). - Council memos: summarize top actions in
tasks/NEXT.mdand link the memo intasks/SWARM-LANES.md. - If a council memo affects multiple domains or colonies, emit a short inter-swarm bulletin.
- For GitHub-native intake, use
.github/ISSUE_TEMPLATE/swarm-mission.yml/swarm-blocker.ymland always fill Expect + Diff + state-sync fields. - If blocked, write the blocker plus the exact unblocking ask.
Task Assignment¶
- Source assignments from
tasks/NEXT.md,tasks/FRONTIER.md, and active/non-closed lanes intasks/SWARM-LANES.md. - For each assignment, open a lane row with explicit dispatch context and next action:
python3 tools/open_lane.py --lane <ID> --session <SN> --domain <domain> --intent <...> --check-mode <...> --expect <...> --artifact <...>(F-META1: --expect and --artifact are required). - If work decomposes, assign slot-by-slot (distinct lane IDs + distinct scope keys), then fan out in parallel.
- Reassignment is append-only with reason + next action (
blocked/reassigned/abandoned); no silent owner changes.
Coordination coupling policy (S3)¶
Match the coordination mechanism to the coupling of the work — observational/stigmergic
coordination flips from a speedup to a slowdown as coupling rises.
- fanout (independent) work → emergent/stigmergic allocation: claim via claim.py
(in-session lease/fence, S1) and just go; no central plan needed.
- core-state and tightly-coupled scopes → explicit lanes + sequencing in
tasks/SWARM-LANES.md; do not rely on stigmergy alone.
- Claim pheromones in git (durable, crash-safe): workspace/claims/ is gitignored,
so record frontier/lane claims as commit-body trailers — Claims: <ID> / Releases: <ID>.
They survive a mid-claim crash (the next session reads the last signal from the log).
Reconstruct the live ledger with python3 tools/claim_trailer.py state; suggest a
trailer with claim_trailer.py suggest <ID>. tasks/claims.json is a regenerable
derived cache (claim_trailer.py regen-cache), never a source of truth.
- Keep shared coordination files line-oriented / JSONL so concurrent appends merge
cleanly without textual conflicts.
- Worktree isolation for parallel file-mutating workers (S712, letta-code MemFS pattern).
When ≥2 workers edit files concurrently, give each its own git worktree so writes never
collide and the "never git add -A" rule can't be tripped by a peer's staged paths:
- Claude workers → run in an isolated worktree (EnterWorktree, or the Agent/Workflow
isolation: "worktree" option). Writes merge back to the working branch on completion.
- External-model workers (Gemini/Codex/Kimi — no worktree tool) → keep the existing
claim.py lease + safe_commit.py explicit-path staging. Same guarantee, portable path.
Isolation is for file collisions; durable cross-agent claims still ride the Claims: /
Releases: commit trailers above.
Tool runtime matrix (S712)¶
Native harness tools exist only in some runtimes; every capability has a portable
python3 tools/*.py fallback so the swarm never depends on one runtime. Pick native
when present, fall back otherwise.
| Capability | Interactive Claude Code | Headless claude --print (autoswarm) |
External CLIs (Gemini/Codex/Kimi) |
|---|---|---|---|
| Web/paper/repo search | HF MCP + WebSearch/WebFetch |
WebSearch/WebFetch yes; HF MCP only if .mcp.json+HF_TOKEN load |
tools/hf_search.py, tools/kimi.py |
| Parallel-write isolation | EnterWorktree / Agent isolation |
verify — else claim.py+safe_commit.py |
claim.py+safe_commit.py |
| Background-task tracking | TaskCreate/Monitor |
verify — else swarm-watch + lanes |
swarm-watch + lanes |
| Scheduling | CronCreate routine / ScheduleWakeup |
GH Actions autoswarm-cron.yml |
GH Actions / system cron |
| Human escalation | PushNotification |
verify — else tasks/SIGNALS.md ledger |
tasks/SIGNALS.md ledger |
verify = availability under claude --print is not yet empirically confirmed; treat as
unavailable and use the fallback until a real headless run proves otherwise (next autoswarm
cycle should log which native tools resolved).
Colony Mode (persistent domain units)¶
A colony is a persistent domain unit with its own orient→act→compress→handoff cycle, beliefs, coordination, and possible sub-colonies. Files: domains/<domain>/COLONY.md; lane coordination stays in tasks/SWARM-LANES.md.
1. Orient: COLONY.md → FRONTIER.md → INDEX.md (instead of global files)
2. Act within colony scope; escalate cross-domain findings to global tasks/FRONTIER.md
3. Compress: update COLONY.md State + Handoff notes each session
4. Tool: python3 tools/swarm_colony.py orient <domain>
Bootstrap a colony: python3 tools/swarm_colony.py bootstrap <domain>
Colony fitness rule: promote when domain has ≥3 open frontiers OR ≥2 active DOMEX lanes.
Deadlock resolution (L-1666): tactical conflicts (identity/process/factual) resolve via quorum; fundamental value disagreements that persist across 3 quorum attempts → fork right via PHIL-19 managed schism.
Kill Protocol¶
The human can stop all work immediately.
- Canonical state file: tasks/KILL-SWITCH.md
- CLI helper: python3 tools/kill_switch.py activate --reason "..." --requested-by "human" and
python3 tools/kill_switch.py deactivate --reason "..." --requested-by "human".
- Optional runtime hard-stop: set SWARM_STOP=1 in the active shell.
- When kill switch is active, maintenance.py emits URGENT and all sessions must halt work.
- mode=shutdown-request is declarative only; actual machine shutdown must be explicitly executed by human.
Setup Hygiene¶
When you detect debt in fundamentals (protocols, bridge files, maintenance, coordination), fix it directly.
- Keep bridge files as tool-specific entry templates while preserving one shared protocol source (SWARM.md + beliefs/CORE.md) (P-002).
- Do not stop at redirects.
- Run: Plan -> Fan-out -> Collect -> apply one concrete cleanup.
- If blocked, record blocker + next action in shared state with evidence.
- New tools: use tools/swarm_io.py for common operations: session_number(), git_cmd(), read_text(), token_count(). Do NOT reimplement these locally — 30+ tools have independent session-detection functions (L-550). Pattern: try: from swarm_io import session_number except ImportError: [fallback].
Challenge beliefs (F113)¶
Any session can challenge any belief. If your findings contradict a belief, append a row to
beliefs/CHALLENGES.md. Contradictions are expected — that's how beliefs get tested.
tools/maintenance.py surfaces open challenges — resolve them when your evidence applies.
Constraints¶
- Every belief needs evidence type (observed/theorized)
- Every change leaves the system better
- When uncertain, write it down
- Compress — context window is the hard constraint; distill to what matters
- Record positive, negative, and null outcomes equally
- Readability: "Could a new session pick up in 5 minutes?" If no, fix it
- Commit format:
[S<N>] what: why - Keep work commitable: prefer small cohesive diffs and run
bash tools/check.sh --quickbefore commit - Prefer reversible, scoped changes that keep other sessions unblocked; high-risk or irreversible actions require explicit human direction
Protocols (read when relevant)¶
- Reasoning/verification:
memory/EXPECT.md,memory/OBJECTIVE-CHECK.md,memory/VERIFY.md,memory/DISTILL.md. - Coordination/safety:
memory/NODES.md,memory/OPERATIONS.md,tasks/SIGNALS.md,tasks/SWARM-LANES.md,tasks/RESOLUTION-CLAIMS.md,tasks/KILL-SWITCH.md,beliefs/INVARIANTS.md. - Origin/external channels:
docs/GENESIS.md,experiments/inter-swarm/PROTOCOL.md,posts/README.md(drafts only; approval-gated).
Authority hierarchy (F110-C3)¶
SWARM.md > beliefs/CORE.md > domain FRONTIER files > task files > lessons. Higher tier overrides lower; later source wins within tier. At spawn, record swarm_md_version and core_md_version in .swarm_meta.json.
Validation¶
- Pre-commit hook: runs
bash tools/check.sh --quick(beliefs + maintenance quick). - Commit-msg hook: enforces commit format
[S<N>] what: why(merge/revert/fixup/squash exempt). - Install/refresh hooks:
bash tools/install-hooks.sh. - Universal check:
bash tools/check.sh(orpwsh -NoProfile -File tools/check.ps1) for beliefs + maintenance + proxy K. - Tool hooks: Claude has PostToolUse validation (
.claude/settings.json); others rely on pre-commit + check.sh.
Parallel agents¶
When your tool supports parallel sub-tasks, use them. Pattern: Plan → Fan-out → Collect → Commit.
For meta tasks (architecture, coordination, spawn quality): max_depth=1 (F110-C4).
If parallel work may produce multiple branches/PRs, claim lanes in tasks/SWARM-LANES.md before fan-out.