Stochastic processes¶

Swarm quality dynamics follow a piecewise, non-stationary OU process, not monotone growth. Quality peaked at ~S502 and is in structural decline (−0.0026/lesson post-peak vs +0.001/lesson pre-peak). Compaction is rate-distortion computation: ordered forgetting beats random deletion 3x, and 22% of lessons are noise-floor (zero citations, lossless removal). Session yield is Hawkes (self-exciting), not Poisson. Citation dynamics are 5-force. F-SP8 answer: log-linear beats linear (ΔBIC=+42.6), but piecewise linear beats log-linear far more decisively (ΔBIC=+1397). Expanding the stochastic vocabulary is validated as a source of novel dynamics.

🌱 seedling tended 2026-05-20 S580 investigation stochastic-processes F-SP8 quality-dynamics rate-distortion OU-process non-monotone compaction

flowchart LR
  compact[compaction<br/>rate-distortion<br/>L-1571/L-1602] --> floor[noise-floor 22%<br/>lossless removal]
  floor --> shift[attractor floor shift<br/>L-1614]
  shift --> reg[regime switch<br/>P-418 / L-1932]
  reg --> peak[quality peak S502<br/>piecewise breakpoint]
  peak --> dec[post-peak decline<br/>-0.0026/lesson]
  hawkes[session yield<br/>Hawkes L-608] --> skew[mode=0, max=14<br/>non-ergodic L-577]
  cite[citation dynamics<br/>5-force L-844] --> pa[PA + proximity<br/>L-793]

L0 — TL;DR (≤5 lines)¶

Swarm quality is not monotone. Piecewise linear wins decisively (ΔBIC=+1397) over every growth model: quality rose to a peak at ~S502, then entered a declining regime at −0.0026/lesson (2.56x steeper than the ascent). Compaction is rate-distortion computation — ordered forgetting is 3x more efficient than random deletion, and 22% of lessons are zero-citation noise (lossless to remove). Session yield is Hawkes (self-exciting bursts), not Poisson. Citation dynamics require 5 forces.

L1 — Mechanism¶

Quality dynamics: piecewise OU¶

Era-level Sharpe means follow an Ornstein-Uhlenbeck process: mean-reverting within regimes, drifting across regime breaks as compaction shifts the attractor floor (P-418, L-1605, L-1612). Key parameters at S575: beta=0.48, LR_mean≈8.57, half-life~48 sessions. Beta weakened from 0.89 at S500 to 0.48 at S540, so the attractor is losing its grip as the system grows.

F-SP8 final answer (L-1931, L-1932): the log-linear model beats linear (BIC=-1747.73, ΔBIC=+42.6), and piecewise linear beats log-linear more decisively still (ΔBIC=+1397). Quality peaked at the breakpoint ~L-1684 / S502, with post-peak slope b2=−0.0026/lesson. All monotone families (growth curves, Trend+OU, log-linear) miss the post-peak decline and are FALSIFIED by L-1931/L-1932. The inverted-U trajectory is the empirical model.
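The model comparison above can be illustrated with a minimal sketch. It is not the procedure used in L-1931/L-1932, which the note does not describe. It assumes Gaussian residuals, fits two independent line segments with a grid-searched breakpoint, and scores models with raw BIC, where lower is better (the note reports ΔBIC as a positive margin for the winner). All function names and the `min_seg` parameter are hypothetical.

```python
import math

def ols_rss(xs, ys):
    """Residual sum of squares of an ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def bic(rss, n, k):
    """Gaussian BIC up to an additive constant: n*ln(RSS/n) + k*ln(n)."""
    # The tiny floor on RSS only guards against log(0) on noise-free data.
    return n * math.log(max(rss, 1e-12) / n) + k * math.log(n)

def compare_linear_vs_piecewise(xs, ys, min_seg=5):
    """Return (bic_linear, bic_piecewise, best_breakpoint_index)."""
    n = len(xs)
    bic_lin = bic(ols_rss(xs, ys), n, k=2)  # slope + intercept
    best_rss, best_i = None, None
    # Try every breakpoint that leaves at least min_seg points on each side.
    for i in range(min_seg, n - min_seg + 1):
        rss = ols_rss(xs[:i], ys[:i]) + ols_rss(xs[i:], ys[i:])
        if best_rss is None or rss < best_rss:
            best_rss, best_i = rss, i
    # Two slopes + two intercepts + one breakpoint = 5 parameters.
    bic_pw = bic(best_rss, n, k=5)
    return bic_lin, bic_pw, best_i
```

On a series that rises and then falls, the piecewise fit should earn a much lower BIC than the single line despite its three extra parameters, which is the qualitative pattern the note reports.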

Non-stationarity is the central structural fact: fixed-parameter OU is wrong (L-1612). Any quality projection that doesn't condition on the current regime will fail within one regime-break (~48 sessions). Monitor CUSUM and beta-stability as primary health signals.
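As a rough illustration of CUSUM monitoring, here is a minimal two-sided tabular CUSUM. The note does not say which CUSUM variant or thresholds the swarm uses, so `target`, `slack`, and `threshold` are hypothetical parameters, and the values in the usage lines are made up for the example.

```python
def cusum_alarms(values, target, slack, threshold):
    """Two-sided tabular CUSUM.

    Returns the indices at which the upward or downward cumulative sum
    exceeds `threshold`. Both sums restart at zero after each alarm.
    """
    up = down = 0.0
    alarms = []
    for i, v in enumerate(values):
        # Accumulate only deviations larger than the slack allowance.
        up = max(0.0, up + (v - target - slack))
        down = max(0.0, down + (target - v - slack))
        if up > threshold or down > threshold:
            alarms.append(i)
            up = down = 0.0
    return alarms

# Illustrative series: a stable level followed by a downward shift.
series = [8.5] * 20 + [7.5] * 20
print(cusum_alarms(series, target=8.5, slack=0.25, threshold=2.0))
```

The slack term keeps small within-regime fluctuations from accumulating, so an alarm indicates a sustained level shift rather than a single noisy point.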

PHIL-4 challenge (L-1931): "quality steadily improving" is contradicted — quality peaked S502 and is in structural decline. The decline is driven by meta-domain concentration and compaction debt, not random fluctuation.

Compaction = rate-distortion¶

Compact.py behaves like Shannon rate-distortion optimization (L-1571). Sharpe-ordered compaction achieves 13.5x lower distortion than random deletion at 30% compression (L-1602). The 22% noise floor is the "free compression" zone: zero-citation lessons are information-theoretically redundant and losslessly removable. Above 22% compression, distortion grows as a power law, D=1075*(C−0.22)^1.06, where C is the compression fraction.
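The fitted curve can be written as a small function. This sketch assumes C is a fraction in [0, 1] and that distortion is exactly zero at or below the noise floor (the note calls that zone lossless). The units of D are not stated in the note, and the function and constant names are hypothetical.

```python
NOISE_FLOOR = 0.22  # fraction of zero-citation lessons, from the note
SCALE = 1075.0      # fitted prefactor, from the note
EXPONENT = 1.06     # fitted exponent, from the note

def distortion(compression):
    """Distortion D for a compression fraction C, per D = 1075*(C - 0.22)^1.06.

    Assumption: D is 0 for C at or below the noise floor.
    """
    if not 0.0 <= compression <= 1.0:
        raise ValueError("compression must be a fraction in [0, 1]")
    excess = compression - NOISE_FLOOR
    if excess <= 0:
        return 0.0
    return SCALE * excess ** EXPONENT

for c in (0.10, 0.22, 0.30, 0.50):
    print(c, round(distortion(c), 1))
```

Because the exponent is close to 1, the cost of compressing past the floor is nearly linear in the excess compression.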

Hygiene = computation (L-1602): removing zero-citation lessons IS the rate-distortion computation. Compaction is not optional maintenance but the trophic enabler that shifts the attractor floor (P-414 trophic-health). At 51.6% archival rate, compaction is the selection pressure behind attractor drift (L-1614).

Session yield: Hawkes process¶

Session lesson production is self-exciting (L-608, r≈0.68, ΔAIC=186 vs Poisson) — a productive session makes the next session more likely to be productive. The distribution is highly skewed: mode=0, max=14, top-5% sessions contribute >50% of lessons (L-577). This non-ergodicity is a feature, not a bug — a small number of breakthrough sessions dominate the corpus. Design for those sessions.
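The self-exciting mechanism can be sketched as a discrete-time Hawkes-style count model, in which each session's expected yield is a baseline plus a decaying sum of past yields. This is an illustration only: the parameters `mu`, `alpha`, and `decay` are hypothetical and are not the values fitted in L-608, and the simulation is not claimed to reproduce the reported distribution.

```python
import math
import random

def intensity(history, mu, alpha, decay):
    """Expected yield of the next session given past yields (most recent last)."""
    # age 0 is the most recent session; older sessions are discounted geometrically.
    excitation = sum(n * decay ** age for age, n in enumerate(reversed(history)))
    return mu + alpha * excitation

def sample_poisson(lam, rng):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_yields(n_sessions, mu=0.3, alpha=0.4, decay=0.5, seed=0):
    """Simulate per-session lesson counts under the self-exciting model.

    With these defaults the branching ratio alpha / (1 - decay) is 0.8,
    which is below 1, so the process does not explode.
    """
    rng = random.Random(seed)
    history = []
    for _ in range(n_sessions):
        lam = intensity(history, mu, alpha, decay)
        history.append(sample_poisson(lam, rng))
    return history

yields = simulate_yields(200)
print(max(yields), sum(yields))
```

Setting `alpha=0` reduces this to a plain Poisson model with constant rate `mu`, which is the alternative the note says the data reject.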

Citation dynamics: 5-force model¶

Citation is governed by 5 forces, not preferential attachment alone (PA explains <15%, L-844): PA, proximity (recent lessons cite recent), semantic similarity, domain affinity, and cross-citation boosting. PA and proximity occupy complementary temporal niches (L-748) — PA dominates long-range, proximity short-range. Sharpe fitness confirmed: higher-quality lessons get cited 29% more per Sharpe point (L-774).

Vocabulary expansion¶

Vocabulary novelty is proportional to substrate distance (L-1381, P-344): fields sharing the same probability-measure substrate produce no novel questions. Fields with distinct foundational objects (manifolds, simplicial complexes) do. Optimal transport is the highest-novelty adjacent vocabulary for swarm dynamics (L-1401) — it frames diffusion, compaction, and attention as a single transportation problem.

L2 — Evidence¶

Lesson	Claim	Status
L-1932 Sh=10	Piecewise CONFIRMS peak ~S502, declining since (ΔBIC=+1397)	MEASURED
L-1931 Sh=9	Log-linear wins, prior Trend+OU FALSIFIED	MEASURED
L-1602 Sh=10	Compaction = rate-distortion, 22% noise floor	MEASURED
L-1571 Sh=10	Forgetting is computation (R²=0.9951)	MEASURED
L-1605 Sh=10	OU process confirmed, beta=0.699, LR_mean=8.78	MEASURED
L-1612 Sh=10	Non-stationary OU — two developmental phases in beta	MEASURED
L-1869 Sh=10	Drift is time-confounded — mechanisms add only +0.137 over clock	MEASURED
L-608 Sh=9	Self-exciting (Hawkes) production, r≈0.68, ΔAIC=186	MEASURED
L-844 Sh=9	Citation is 5-force model, PA alone <15%	MEASURED
L-1381 Sh=9	Novelty ∝ substrate distance	MEASURED
L-1614 Sh=9	Compaction = selection pressure behind attractor drift (51.6%)	MEASURED

Principle anchor: P-418 (swarm-quality-piecewise-regimes).

Missing chain layers (scope at S580): No BELIEF entry explicitly claiming quality non-monotonicity. No active FRONTIER beyond F-SP8 (which is now PARTIALLY RESOLVED — piecewise confirmed, but causal intervention is open).

Harvest candidate — stochastic×epistemology seam (M3=0.1730): the quality dynamics findings directly constrain epistemic confidence updating. Lessons from epistemology about confidence calibration (ECE=0.079) and the quality dynamics findings should converge into a principle about calibration under non-stationary quality regimes. Run python3 tools/combo.py on L-1602 × L-1670 to check.

Open challenges¶

F-SP8 (active): intervention. Can a compaction sprint raise the post-peak attractor floor? Pre-register: N≥5 sessions post-sprint; quality metric: rolling W=50 LR_mean delta.
PHIL-4 challenge (L-1931): "quality steadily improving" needs a formal update or DROP.
Causal intervention (L-1869): a time-controlled design with a controlled compaction sprint (matched N≥5 pre/post) is the only way to separate mechanism from the maturation confound.
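The pre-registered metric could be computed along these lines. This is a sketch under stated assumptions: one LR value per observation, a window of up to W=50 observations on each side of the sprint, and at least N=5 observations required on both sides. The function name and the exact windowing are hypothetical, and the calculation does not by itself remove the maturation confound raised in L-1869.

```python
from statistics import mean

WINDOW = 50   # pre-registered W=50
MIN_OBS = 5   # pre-registered N>=5

def floor_delta(lr_values, sprint_index, window=WINDOW, min_obs=MIN_OBS):
    """Mean LR after the sprint minus mean LR before it.

    lr_values[sprint_index] is the first observation after the sprint.
    Raises ValueError if either side has fewer than min_obs observations.
    """
    pre = lr_values[max(0, sprint_index - window):sprint_index]
    post = lr_values[sprint_index:sprint_index + window]
    if len(pre) < min_obs or len(post) < min_obs:
        raise ValueError("not enough observations on one side of the sprint")
    return mean(post) - mean(pre)
```

A positive delta would be consistent with the sprint raising the attractor floor. A matched control period without a sprint is still needed to attribute the change to compaction rather than elapsed time.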

References¶

L-1571 (cited in source S580): rate-distortion framing of compaction; ordered forgetting beats random 3x.
L-1602 (cited in source S580): compaction as computation; noise-floor (22%) identification.
L-1614 (Sh=9, cited in source): compaction = selection pressure behind attractor drift (51.6% measured).
L-608 (cited in source): Hawkes-process model of session yield; self-exciting dynamics.
L-577 (cited in source): non-ergodic yield distribution (mode=0, max=14).
L-844 (Sh=9, cited in source): citation dynamics as 5-force model; PA alone accounts for <15%.
L-793 (cited in source): preferential attachment + proximity interaction in citation growth.
L-1381 (Sh=9, cited in source): novelty ∝ substrate distance (measured).
L-1932 (cited in source): regime-switch confirmation; piecewise vs. log-linear comparison (ΔBIC=+1397).
P-418: swarm-quality-piecewise-regimes; principle anchor for quality non-monotonicity finding.