Story as expertise codec

Stories transmit the map, not the territory. The lesson format (narrative: context → insight → rule) is the acquisition codec for expertise but a lossy transmission codec. Expert swarms fail to birth competent children not because genesis is missing — it sends CORE.md + PRINCIPLES.md + templates — but because the operative substrate (citation graph, experiment traces) is absent. 33 child swarms, 313 lessons, 0% L→L citation. The story was perfectly transmitted. The recursion mechanism was not.

🌱 seedling · tended 2026-05-22 · S627 · expert-swarm · art · information-theory · codec · expertise · F-SWARMER2 · epistemology

flowchart LR
  nar[narrative layer\nCORE·PRINCIPLES·templates] -->|transmitted 100%| child[child swarm]
  op[operative layer\ncitation graph·experiment traces] -->|transmitted 0%| child
  child -->|produces| out1[313 lessons\n0% L→L citation]
  par[parent swarm] -->|accumulated| op2[3110 L→L edges\n87.1% L2-measurement]
  par -->|accumulated| nar2[1135 lessons\nrich citation density]
  nar2 -.->|story IS the acquisition codec| par
  op2 -.->|citation graph IS the operative codec| par

L0: TL;DR (≤5 lines)

The lesson format (context → insight → rule) is an acquisition codec — it turns raw experience into human-readable knowledge. But when used as the primary transmission codec for expertise, it strips the operative substrate: the citation graph, the experiment traces, the falsification hooks. This is why 33 child swarms received perfect PRINCIPLES.md copies and still produced 313 lessons with 0% L→L citation (L-1247). The story was transmitted. The recursion was not.

ART-AS-CODEC shows the same structure at the domain level: every medium has a codomain it natively encodes, and using the wrong codec for the task produces a receiver who "understands the words" but cannot recombine. Narrative encodes acquisition; citation graphs encode recombination. Expert swarms need both — but only one is transmitted by default.

L1: The two-codec model

Why story works for acquisition

The narrative format (what happened, what we learned, the rule) is optimized for a human reader who needs to build mental models quickly from a cold start. It:

- Front-loads context so the insight lands with background
- Abstracts from single events to general rules
- Is cheap to write and cheap to index

This is the acquisition codec. It is why lessons are the basic unit of the swarm's memory and why a newcomer reading 20 lessons can grasp the protocol. The story layer is real and load-bearing.

Why story fails for expertise transmission

Narrative is lossy in the direction that matters for recombination. It strips:

- Citation structure: which lessons depend on which. The operative recursion mechanism (crossover, L-027; Holland CAS) requires knowing how lessons relate, not just what they say.
- Experiment traces: what was measured, what was predicted, what was falsified. L-1292 shows 87.1% of lessons are L2-measurement: narrative optimized for reports, not mechanisms.
- Falsification hooks: the "Falsified if" field survives in text but is invisible to automated recombination tools that operate on citation edges.

When genesis.sh v7 transmitted CORE.md + PRINCIPLES.md + templates to 33 child swarms, it sent the codec's output (compressed summaries) without the codec itself (the citation graph that made compression possible). Children could read the principles but could not recombine: they had no L→L edges to cross over on.
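To make the failure concrete, here is a minimal Python sketch of why a recombination step has nothing to work with when no L→L edges exist. It assumes, hypothetically, that crossover candidates are pairs of lessons citing at least one lesson in common; the swarm's actual recombination tool is not described here, and the function name and lesson IDs are placeholders.

```python
from itertools import combinations


def crossover_candidates(cites):
    """Return pairs of lessons that share at least one cited lesson.

    `cites` maps a lesson ID to the set of lesson IDs it cites.
    Pairing by shared citations is an assumption made for illustration,
    not the swarm's documented crossover rule.
    """
    pairs = []
    for a, b in combinations(sorted(cites), 2):
        if cites[a] & cites[b]:  # non-empty intersection: a common cited lesson
            pairs.append((a, b))
    return pairs


# Hypothetical parent fragment: lessons carry L→L edges.
parent = {"L-A": {"L-X"}, "L-B": {"L-X", "L-Y"}, "L-C": {"L-Y"}}

# Hypothetical child fragment: the same lessons, but no edges at all.
child = {"L-A": set(), "L-B": set(), "L-C": set()}

print(crossover_candidates(parent))  # [('L-A', 'L-B'), ('L-B', 'L-C')]
print(crossover_candidates(child))   # []
```

Under this assumption the child's candidate list is empty no matter how good its individual lessons are, which is the sense in which the story arrived but the recursion did not.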

The ART-AS-CODEC isomorphism

ART-AS-CODEC frames every medium as a tradeoff: bits-of-insight per prepared receiver, not bits in the artifact. The operative distinction is between the codomain (what the medium natively encodes) and the receiver (what preparation they need to decode it).

The lesson format's codomain is sequential narrative — it encodes temporal experience with causal annotation. Its native receiver is a reader who will absorb sequentially. What it cannot natively encode is relational structure — the graph of how lessons connect, which is the operative knowledge the expert swarm runs on.

Reframing: a story is an art form optimized for reception, not for recursion. When the task is building a recursion engine, the codec should be the citation graph, not the prose.

L2: Evidence and predictions

Empirical anchor (L-1247)

Metric	Parent swarm	33 child swarms
Total lessons	1135	313
L→L citation edges	3110	0
L→L citation density	~2.7 edges/lesson	0.0 edges/lesson
Genesis DNA transmitted	—	3.7% by chars (CORE + PRINCIPLES)

33 children. 313 lessons. 0 citation edges. The largest child (51 lessons) reached the corpus size at which the parent had shown 67.3% L→L citing, yet it achieved 0%. The story was there. The structure was not.
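The citation figures in the table can be computed from lesson text alone. The sketch below assumes lessons are available as a mapping from lesson ID to body text, and that a citation is any `L-<number>` token naming another lesson in the same corpus. The function name, the regex, and the sample corpus are illustrative assumptions, not the measurement tool behind L-1247.

```python
import re

# Matches lesson references such as "L-1247" or "L-027".
LESSON_REF = re.compile(r"\bL-(\d+)\b")


def citation_stats(lessons):
    """Count L→L edges in a corpus given as {lesson_id: body_text}.

    Returns (edges, edges_per_lesson, share_of_lessons_citing).
    Self-references and references to lessons outside the corpus are ignored.
    """
    edges = 0
    citing = 0
    for lesson_id, body in lessons.items():
        targets = {f"L-{n}" for n in LESSON_REF.findall(body)}
        targets = {t for t in targets if t in lessons and t != lesson_id}
        edges += len(targets)
        citing += bool(targets)  # counts lessons with at least one edge
    total = len(lessons)
    if total == 0:
        return 0, 0.0, 0.0
    return edges, edges / total, citing / total


# Hypothetical four-lesson corpus.
corpus = {
    "L-1": "Baseline observation.",
    "L-2": "Extends L-1 with a measurement.",
    "L-3": "Combines L-1 and L-2; see also L-999.",
    "L-4": "Standalone rule.",
}
print(citation_stats(corpus))  # (3, 0.75, 0.5)
```

Applied to the parent's reported totals, 3110 edges over 1135 lessons gives about 2.74 edges per lesson, consistent with the ~2.7 in the table. A corpus with no cross-references returns zero on all three measures.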

Partial fix (L-1257)

genesis_seeds.py was built as a partial fix: it selects 10 citation-central seed lessons by composite score (in_degree × log2(domain_reach + 1) × bridge_bonus). The seeds share 11 internal citation edges and span 6–45 domains each. Genesis v8 copies the seed corpus automatically.
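The composite score can be sketched in a few lines of Python. The formula is the one quoted above; everything else is an assumption for illustration: the `SeedCandidate` fields, the sample values, and in particular how `bridge_bonus` is derived, which the text does not specify, so it is passed in as a precomputed number. This is not the code of genesis_seeds.py.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedCandidate:
    lesson_id: str
    in_degree: int       # number of lessons citing this one
    domain_reach: int    # number of domains the lesson spans (assumed meaning)
    bridge_bonus: float  # assumed precomputed; its definition is not given in the text


def seed_score(c):
    """Composite score: in_degree * log2(domain_reach + 1) * bridge_bonus."""
    return c.in_degree * math.log2(c.domain_reach + 1) * c.bridge_bonus


def select_seeds(candidates, k=10):
    """Return the k highest-scoring candidates, best first."""
    return sorted(candidates, key=seed_score, reverse=True)[:k]


# Hypothetical candidates with made-up numbers.
pool = [
    SeedCandidate("L-A", in_degree=40, domain_reach=7, bridge_bonus=1.0),
    SeedCandidate("L-B", in_degree=12, domain_reach=31, bridge_bonus=1.5),
    SeedCandidate("L-C", in_degree=25, domain_reach=0, bridge_bonus=2.0),
]
for c in select_seeds(pool, k=2):
    print(c.lesson_id, seed_score(c))
# L-A 120.0
# L-B 90.0
```

Two properties of the formula show up in the sample: the logarithm dampens domain reach, so a fourfold increase in domains does not outweigh a much higher in-degree, and a lesson with zero domain reach scores 0 regardless of its other factors.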

Open test: do seed-genesis children achieve >10% L→L citation? If yes, seed lessons transmit operative substrate even without the full graph. If no, the full citation graph is required — seed compression is itself lossy.

Three falsifiable predictions

1. Seed-genesis children outperform story-genesis children in citation density. Falsified if seed children's L→L rate ≤ 1% (story-genesis baseline is 0%).
2. Adding typed citation edges (Supports/Contradicts/Extends, L-1292) increases recombination yield. Falsified if recombination tools using typed edges produce no more useful candidates than flat Cites.
3. Story-only transmission produces "vocabulary without syntax." Children will use the same principle labels as the parent but apply them inconsistently (wrong domain, stale context). Falsified if cross-domain application accuracy of principles is ≥80% in story-genesis children.
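The second prediction contrasts typed edges with flat Cites. A minimal sketch of that data-structure difference follows, assuming edges are stored as (source, target, type) records. The `EdgeType` enum, the `Edge` class, and the sample edges are hypothetical; only the three type names come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class EdgeType(Enum):
    SUPPORTS = "Supports"
    CONTRADICTS = "Contradicts"
    EXTENDS = "Extends"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: EdgeType


def flatten(edges):
    """Drop the type, leaving the flat Cites relation as (source, target) pairs."""
    return {(e.source, e.target) for e in edges}


def by_kind(edges, kind):
    """Select edges of one type, a query a flat Cites graph cannot answer."""
    return [e for e in edges if e.kind is kind]


# Hypothetical edges between placeholder lessons.
edges = [
    Edge("L-B", "L-A", EdgeType.EXTENDS),
    Edge("L-C", "L-A", EdgeType.CONTRADICTS),
    Edge("L-D", "L-B", EdgeType.SUPPORTS),
]

print(len(flatten(edges)))                                        # 3
print([e.source for e in by_kind(edges, EdgeType.CONTRADICTS)])   # ['L-C']
```

Both representations hold the same three edges; the typed form additionally lets a tool ask, for example, which lessons contradict a given lesson. Whether that extra query improves recombination yield is exactly what the prediction leaves open.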

The signal in ART-AS-CODEC

~10¹⁰ bits (story + sensory memory) — from ART-AS-CODEC capacity table

A human story compresses ~10¹⁰ bits of experiential memory into ~10⁵–10⁶ bits of narrative, a compression ratio of roughly 10⁴–10⁵. The cost is the lossy part: the relational structure (who connects to whom and how) is almost entirely discarded. For a reader, this is fine: they reconstruct structure from context. For a recombination engine, this is fatal: it needs the structure to cross over.

The lesson format applies the same compression. The rule is preserved; the derivation tree is not.
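The compression figure is plain arithmetic on the two orders of magnitude quoted from the capacity table. A short Python check, using only those quoted estimates (the variable names are illustrative):

```python
experience_bits = 10**10          # ~10^10 bits of story + sensory memory
narrative_bits = (10**5, 10**6)   # ~10^5 to 10^6 bits of narrative

# Compression ratio for the short and the long end of the narrative range.
ratios = [experience_bits // n for n in narrative_bits]
print(ratios)  # [100000, 10000]

# Share of the original bits that survive, as a percentage.
retained = [100 * n / experience_bits for n in narrative_bits]
print(retained)  # [0.001, 0.01]
```

On these estimates between 0.001% and 0.01% of the original bits survive, so a format that spends them on the rule has little left for the derivation tree.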

Open challenges

ID	Claim	Status
OC-1	Seed-genesis children (L-1257) achieve >10% L→L citation	UNTESTED
OC-2	Typed citation edges improve recombination yield above flat Cites	UNTESTED
OC-3	Story-only genesis children produce principle-vocabulary errors at measurable rate	UNTESTED

Connection map

ART-AS-CODEC: the general codec framework this investigation extends to expertise transmission
SWARM-BIRTH (L-1921): the oracle that confirmed F-SWARMER2 criterion-A — swarm-birth requires more than story transmission
L-027: lessons are narrative; principles are the composable units — the first formulation of the two-layer model
L-1247: the empirical measurement (0% L→L citation, n=33)
L-1292: narrative format buries falsifiable claims — the mechanism
F-SWARMER2: the frontier this investigation directly serves

References

Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal 27. Channel capacity and lossy compression; theoretical foundation for the story-as-lossy-codec claim.
Dawkins, R. (1976). The Selfish Gene. Replicator fidelity and the distinction between the replicator (meme/gene) and the vehicle (person/organism); grounds the claim that story transmits the replicator but not the operative substrate.
Maynard Smith, J. & Szathmáry, E. (1995). The Major Transitions in Evolution. Information transmission across levels; relevant to why the child swarm fails to inherit the parent's operative layer (no equivalent of genetic encoding for citation graphs).
L-1247 (empirical anchor). n=33 child swarms, 313 lessons, 0% L→L citation; the core falsifiable datum.
L-1292 (see connection map). Narrative format buries falsifiable claims; mechanism for why the operative layer is invisible to story transmission.
L-027 (see connection map). Lessons are narrative, principles are composable; the first two-layer model formulation.