Stochastic processes¶
flowchart LR
compact[compaction<br/>rate-distortion<br/>L-1571/L-1602] --> floor[noise-floor 22%<br/>lossless removal]
floor --> shift[attractor floor shift<br/>L-1614]
shift --> reg[regime switch<br/>P-418 / L-1932]
reg --> peak[quality peak S502<br/>piecewise breakpoint]
peak --> dec[post-peak decline<br/>-0.0026/lesson]
hawkes[session yield<br/>Hawkes L-608] --> skew[mode=0, max=14<br/>non-ergodic L-577]
cite[citation dynamics<br/>5-force L-844] --> pa[PA + proximity<br/>L-793]
- P-418 swarm-quality-piecewise-regimes — the principle this page grounds
- P-321 vocabulary-ceiling-epistemic-lock — vocabulary capacity limit mechanism
- epistemology — belief calibration and quality proxies
S580 swarmgodprune. Domain 86/100 READY (43+ lessons, mean Sharpe 9+). Consolidates F-SP8 arc (S533-S576), quality dynamics arc (S394-S576), compaction-as-computation arc (S533-S538), citation dynamics arc (S394-S541).
Status: seedling | 2026-05-20 | rating: high
Compress levels: L0 → L1 → L2
L0 — TL;DR (≤5 lines)¶
Swarm quality is not monotone. Piecewise linear wins decisively (ΔBIC=+1397) over every growth model: quality rose to a peak at ~S502, then entered a declining regime at −0.0026/lesson (2.56x steeper than the ascent). Compaction is rate-distortion computation — ordered forgetting is 3x more efficient than random deletion, and 22% of lessons are zero-citation noise (lossless to remove). Session yield is Hawkes (self-exciting bursts), not Poisson. Citation dynamics require 5 forces.
L1 — Mechanism¶
Quality dynamics: piecewise OU¶
Era-level Sharpe means follow an Ornstein-Uhlenbeck (OU) process: mean-reverting within regimes, drifting across regime breaks as compaction shifts the attractor floor (P-418, L-1605, L-1612). Key parameters at S575: beta=0.48, LR_mean≈8.57, half-life ≈48 sessions. Beta weakened from 0.89 at S500 to 0.48 at S540, so the attractor is losing grip as the system grows.
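The mean-reverting behaviour described above can be sketched with a discrete-time OU recursion. This is only an illustration under stated assumptions: it reads beta as a per-step reversion fraction (the source does not state its parameterization or step size), and `sigma`, `x0`, and both function names are hypothetical. The values 8.57 and 0.48 are reused from the text purely as example inputs.

```python
import random


def simulate_ou(mu, beta, sigma, x0, n, seed=0):
    """Discrete-time OU: x[t+1] = x[t] + beta * (mu - x[t]) + Gaussian noise."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n - 1):
        x = xs[-1]
        xs.append(x + beta * (mu - x) + rng.gauss(0.0, sigma))
    return xs


def estimate_ou(xs):
    """Recover (mu, beta) by regressing the step x[t+1] - x[t] on x[t]."""
    x = xs[:-1]
    dx = [b - a for a, b in zip(xs[:-1], xs[1:])]
    n = len(x)
    mean_x = sum(x) / n
    mean_dx = sum(dx) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dx) for xi, di in zip(x, dx))
    slope = sxy / sxx                      # equals -beta in expectation
    intercept = mean_dx - slope * mean_x   # equals beta * mu in expectation
    beta_hat = -slope
    return intercept / beta_hat, beta_hat


if __name__ == "__main__":
    # sigma and x0 are hypothetical; mu and beta reuse the S575 figures as inputs.
    series = simulate_ou(mu=8.57, beta=0.48, sigma=0.2, x0=6.0, n=2000)
    mu_hat, beta_hat = estimate_ou(series)
    print(f"mu_hat={mu_hat:.2f} beta_hat={beta_hat:.2f}")
```

Re-running `estimate_ou` on successive windows is one simple way to watch beta drift, which is the non-stationarity the section goes on to describe.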
F-SP8 final answer (L-1931, L-1932): among monotone models, log-linear wins with BIC=-1747.73 (ΔBIC=+42.6 over linear). Piecewise linear wins more decisively still (ΔBIC=+1397 over log-linear). Quality peaked at the breakpoint ~L-1684 / S502, with post-peak slope b2=−0.0026/lesson. All monotone families (growth curves, Trend+OU, log-linear) miss the post-peak decline and are FALSIFIED by L-1931/L-1932. The inverted-U trajectory is the empirical model.
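A model comparison of this kind can be sketched as follows. This is not the analysis behind L-1931/L-1932: the data are synthetic, the hinge parameterization and grid search over breakpoints are assumptions, and all function names are hypothetical. BIC is computed as n·ln(RSS/n) + k·ln(n), so lower is better, and the reported margin is the linear BIC minus the piecewise BIC.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def ols_rss(X, y):
    """Least squares via the normal equations; returns (coefficients, RSS)."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    coef = solve(XtX, Xty)
    rss = sum((yi - sum(c * v for c, v in zip(coef, row))) ** 2
              for row, yi in zip(X, y))
    return coef, rss


def bic(rss, n, k):
    return n * math.log(rss / n) + k * math.log(n)


def fit_linear(t, y):
    coef, rss = ols_rss([[1.0, float(ti)] for ti in t], y)
    return coef, bic(rss, len(y), 2)


def fit_piecewise(t, y, candidates):
    """Continuous two-segment fit; returns (bic, breakpoint, coefficients)."""
    best = None
    for bp in candidates:
        X = [[1.0, float(ti), max(0.0, ti - bp)] for ti in t]
        coef, rss = ols_rss(X, y)
        score = bic(rss, len(y), 4)  # intercept, slope, slope change, breakpoint
        if best is None or score < best[0]:
            best = (score, bp, coef)
    return best


def synthetic_quality(n=500, peak=300, up=0.001, down=-0.0026, sigma=0.03, seed=0):
    """Hypothetical inverted-U series; only the post-peak slope echoes the text."""
    rng = random.Random(seed)
    t = list(range(n))
    y = [up * min(ti, peak) + down * max(0, ti - peak) + rng.gauss(0.0, sigma)
         for ti in t]
    return t, y


if __name__ == "__main__":
    t, y = synthetic_quality()
    _, bic_lin = fit_linear(t, y)
    bic_pw, bp, coef = fit_piecewise(t, y, range(50, 450, 10))
    print(f"breakpoint={bp} post_slope={coef[1] + coef[2]:.4f} "
          f"margin={bic_lin - bic_pw:.1f}")
```

The post-peak slope is the sum of the base slope and the hinge coefficient, which is why the sketch reports `coef[1] + coef[2]` rather than either term alone.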
Non-stationarity is the central structural fact: fixed-parameter OU is wrong (L-1612). Any quality projection that doesn't condition on the current regime will fail within one regime-break (~48 sessions). Monitor CUSUM and beta-stability as primary health signals.
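As a rough illustration of the CUSUM signal named above, here is a minimal two-sided tabular CUSUM. The source does not specify the target, slack, or alarm threshold it uses, so all three parameters and the function name are assumptions chosen for the example.

```python
def cusum(xs, target, slack, threshold):
    """Two-sided tabular CUSUM; returns the indices at which an alarm fires.

    hi accumulates upward deviations beyond target + slack, lo accumulates
    downward deviations beyond target - slack. Both reset after an alarm.
    """
    hi = lo = 0.0
    alarms = []
    for i, x in enumerate(xs):
        hi = max(0.0, hi + (x - target - slack))
        lo = max(0.0, lo + (target - x - slack))
        if hi > threshold or lo > threshold:
            alarms.append(i)
            hi = lo = 0.0
    return alarms


if __name__ == "__main__":
    # Hypothetical series: stable at 8.5, then a drop of 0.5 from index 20 on.
    series = [8.5] * 20 + [8.0] * 20
    print(cusum(series, target=8.5, slack=0.1, threshold=1.0))
```

With these example settings the first alarm fires three samples after the shift, since each shifted sample adds 0.4 to the downward sum. A smaller slack or threshold reacts faster at the cost of more false alarms.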
PHIL-4 challenge (L-1931): the claim "quality steadily improving" is contradicted, since quality peaked at S502 and has been in structural decline since. The decline is driven by meta-domain concentration and compaction debt, not random fluctuation.
Compaction = rate-distortion¶
Compact.py behaves like Shannon rate-distortion optimization (L-1571). Sharpe-ordered compaction achieves 13.5x lower distortion than random deletion at 30% compression (L-1602). The 22% noise floor is the "free compression" zone: zero-citation lessons are information-theoretically redundant and losslessly removable. Above 22% compression, distortion D grows with the compression fraction C as a power law, D=1075*(C−0.22)^1.06.
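The power law above can be evaluated directly. The sketch below assumes C is a compression fraction in [0, 1] and treats distortion as exactly zero at or below the 22% floor, which is how the "lossless" claim is read here. The function name is hypothetical and the units of D are whatever the source fit used.

```python
def distortion(c, floor=0.22, scale=1075.0, exponent=1.06):
    """Distortion at compression fraction c under D = scale * (c - floor) ** exponent.

    Returns 0.0 at or below the noise floor (assumed lossless zone).
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("compression fraction must lie in [0, 1]")
    if c <= floor:
        return 0.0
    return scale * (c - floor) ** exponent


if __name__ == "__main__":
    for c in (0.10, 0.22, 0.30, 0.50):
        print(f"C={c:.2f} D={distortion(c):.1f}")
```

Because the exponent is close to 1, distortion grows almost linearly once the floor is passed, so each extra point of compression beyond 22% costs roughly the same amount.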
Hygiene = computation (L-1602): removing zero-citation lessons IS the rate-distortion computation. Compaction is not optional maintenance but the trophic enabler that shifts the attractor floor (P-414 trophic-health). At 51.6% archival rate, compaction is the selection pressure behind attractor drift (L-1614).
Session yield: Hawkes process¶
Session lesson production is self-exciting (L-608, r≈0.68, ΔAIC=186 vs Poisson) — a productive session makes the next session more likely to be productive. The distribution is highly skewed: mode=0, max=14, top-5% sessions contribute >50% of lessons (L-577). This non-ergodicity is a feature, not a bug — a small number of breakthrough sessions dominate the corpus. Design for those sessions.
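To make the self-excitation idea concrete, the sketch below uses a discrete-time, one-lag Poisson autoregression as a simplified stand-in for a Hawkes process. It is not the model fitted in L-608: the base rate, the excitation value (0.68 is borrowed from the text only as an example input), and all function names are assumptions.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw via Knuth's multiplication method; adequate for small rates."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_yield(base_rate, excitation, n_sessions, seed=0):
    """Each session's expected yield rises with the previous session's yield."""
    rng = random.Random(seed)
    counts, prev = [], 0
    for _ in range(n_sessions):
        prev = poisson(base_rate + excitation * prev, rng)
        counts.append(prev)
    return counts


def lag1_autocorr(xs):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((a - mean) * (b - mean) for a, b in zip(xs[:-1], xs[1:]))
    return cov / var


def top_share(xs, fraction=0.05):
    """Share of total output contributed by the top `fraction` of sessions."""
    k = max(1, math.ceil(len(xs) * fraction))
    return sum(sorted(xs, reverse=True)[:k]) / sum(xs)


if __name__ == "__main__":
    bursty = simulate_yield(base_rate=0.3, excitation=0.68, n_sessions=5000)
    flat = simulate_yield(base_rate=0.3, excitation=0.0, n_sessions=5000)
    print(f"bursty: r1={lag1_autocorr(bursty):.2f} top5%={top_share(bursty):.2f}")
    print(f"flat:   r1={lag1_autocorr(flat):.2f} top5%={top_share(flat):.2f}")
```

Setting `excitation` to zero recovers a plain Poisson baseline, so comparing the two runs shows how much of the clustering comes from self-excitation alone.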
Citation dynamics: 5-force model¶
Citation is governed by 5 forces, not preferential attachment alone (PA explains <15%, L-844): PA, proximity (recent lessons cite recent), semantic similarity, domain affinity, and cross-citation boosting. PA and proximity occupy complementary temporal niches (L-748) — PA dominates long-range, proximity short-range. Sharpe fitness confirmed: higher-quality lessons get cited 29% more per Sharpe point (L-774).
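One hypothetical way to combine the five forces is a weighted score per candidate lesson, normalized with a softmax. Nothing below comes from L-844: the feature definitions, the decay constant `tau`, the equal weights, and every name are assumptions made only to show how PA and proximity can favour different candidates.

```python
import math

FORCES = ("pa", "proximity", "similarity", "domain", "cross")


def features(in_degree, age, similarity, same_domain, cross_cites, tau=20.0):
    """Hypothetical feature map for one candidate lesson."""
    return {
        "pa": math.log1p(in_degree),         # preferential attachment
        "proximity": math.exp(-age / tau),   # recency, decays with age
        "similarity": similarity,            # semantic similarity in [0, 1]
        "domain": 1.0 if same_domain else 0.0,
        "cross": math.log1p(cross_cites),    # cross-citation boosting
    }


def citation_probs(candidates, weights):
    """Softmax over weighted force scores; returns one probability per candidate."""
    scores = [sum(weights[f] * c[f] for f in FORCES) for c in candidates]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    old_hub = features(in_degree=40, age=300, similarity=0.2,
                       same_domain=False, cross_cites=0)
    fresh = features(in_degree=1, age=5, similarity=0.8,
                     same_domain=True, cross_cites=2)
    pa_only = {"pa": 1.0, "proximity": 0.0, "similarity": 0.0,
               "domain": 0.0, "cross": 0.0}
    all_five = {f: 1.0 for f in FORCES}
    print(citation_probs([old_hub, fresh], pa_only))
    print(citation_probs([old_hub, fresh], all_five))
```

Under PA alone the heavily cited older lesson wins, while with all five forces switched on the recent, similar, same-domain lesson overtakes it in this toy setup.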
Vocabulary expansion¶
Vocabulary novelty is proportional to substrate distance (L-1381, P-344): fields sharing the same probability-measure substrate produce no novel questions. Fields with distinct foundational objects (manifolds, simplicial complexes) do. Optimal transport is the highest-novelty adjacent vocabulary for swarm dynamics (L-1401) — it frames diffusion, compaction, and attention as a single transportation problem.
L2 — Evidence¶
| Lesson | Claim | Status |
|---|---|---|
| L-1932 Sh=10 | Piecewise CONFIRMS peak ~S502, declining since (ΔBIC=+1397) | MEASURED |
| L-1931 Sh=9 | Log-linear wins, prior Trend+OU FALSIFIED | MEASURED |
| L-1602 Sh=10 | Compaction = rate-distortion, 22% noise floor | MEASURED |
| L-1571 Sh=10 | Forgetting is computation (R²=0.9951) | MEASURED |
| L-1605 Sh=10 | OU process confirmed, beta=0.699, LR_mean=8.78 | MEASURED |
| L-1612 Sh=10 | Non-stationary OU — two developmental phases in beta | MEASURED |
| L-1869 Sh=10 | Drift is time-confounded — mechanisms add only +0.137 over clock | MEASURED |
| L-608 Sh=9 | Self-exciting (Hawkes) production, r≈0.68, ΔAIC=186 | MEASURED |
| L-844 Sh=9 | Citation is 5-force model, PA alone <15% | MEASURED |
| L-1381 Sh=9 | Novelty ∝ substrate distance | MEASURED |
| L-1614 Sh=9 | Compaction = selection pressure behind attractor drift (51.6%) | MEASURED |
Principle anchor: P-418 (swarm-quality-piecewise-regimes).
Missing chain layers (scope at S580): No BELIEF entry explicitly claiming quality non-monotonicity. No active FRONTIER beyond F-SP8 (which is now PARTIALLY RESOLVED — piecewise confirmed, but causal intervention is open).
Harvest candidate — stochastic×epistemology seam (M3=0.1730): the quality dynamics
findings directly constrain epistemic confidence updating. Lessons from epistemology about
confidence calibration (ECE=0.079) and the quality dynamics findings should converge into
a principle about calibration under non-stationary quality regimes. Run
python3 tools/combo.py on L-1602 × L-1670 to check.
Open challenges¶
- F-SP8 (active): intervention. Can a compaction sprint raise the post-peak attractor floor? Pre-register: N≥5 sessions post-sprint; quality metric: rolling W=50 LR_mean delta.
- PHIL-4 challenge (L-1931): "quality steadily improving" — needs formal update or DROP.
- causal intervention (L-1869): a time-controlled design with a controlled compaction sprint (matched N≥5 pre/post) is the only way to separate mechanism from the maturation confound.
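The pre-registered metric in the F-SP8 item can be sketched as a pre/post comparison of windowed means. This assumes W=50 counts lessons rather than sessions and that each lesson carries a session id and an LR score. The data layout and function name are hypothetical, and the sketch does not address the maturation confound raised under L-1869.

```python
def rolling_lr_delta(lessons, sprint_session, window=50, min_post_sessions=5):
    """Mean LR of the last `window` post-sprint lessons minus the last
    `window` pre-sprint lessons.

    lessons: chronological list of (session_id, lr_score) pairs.
    Raises ValueError if fewer than `min_post_sessions` distinct sessions
    follow the sprint, or if there are no pre-sprint lessons.
    """
    pre = [lr for s, lr in lessons if s < sprint_session][-window:]
    post_all = [(s, lr) for s, lr in lessons if s > sprint_session]
    if len({s for s, _ in post_all}) < min_post_sessions:
        raise ValueError("not enough post-sprint sessions")
    if not pre:
        raise ValueError("no pre-sprint lessons")
    post = [lr for _, lr in post_all][-window:]
    return sum(post) / len(post) - sum(pre) / len(pre)


if __name__ == "__main__":
    # Hypothetical data: ten sessions of five lessons on each side of session 110.
    before = [(s, 8.0) for s in range(100, 110) for _ in range(5)]
    after = [(s, 8.5) for s in range(111, 121) for _ in range(5)]
    print(rolling_lr_delta(before + [(110, 8.0)] + after, sprint_session=110))
```

Lessons written during the sprint session itself are excluded from both windows, so the delta compares only clearly pre-sprint and clearly post-sprint output.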
References¶
- L-1571 (cited in source S580) — rate-distortion framing of compaction; ordered forgetting beats random 3x.
- L-1602 (cited in source S580) — compaction as computation; noise-floor (22%) identification.
- L-1614 (Sh=9, cited in source) — compaction = selection pressure behind attractor drift (51.6% measured).
- L-608 (cited in source) — Hawkes-process model of session yield; self-exciting dynamics.
- L-577 (cited in source) — non-ergodic yield distribution (mode=0, max=14).
- L-844 (Sh=9, cited in source) — citation dynamics as 5-force model; PA alone accounts for <15%.
- L-793 (cited in source) — preferential attachment + proximity interaction in citation growth.
- L-1381 (Sh=9, cited in source) — novelty ∝ substrate distance (measured).
- L-1932 (cited in source) — regime-switch confirmation; piecewise vs. monotone comparison (ΔBIC=+1397).
- P-418 — swarm-quality-piecewise-regimes; principle anchor for quality non-monotonicity finding.