Principles — Atomic Building Blocks

Extracted from lessons. Scan for recombination. 313 live principles, 8 themes. Recent: S544 (+8 P-397..P-404 batch L-1583→L-1662). S542 (+8 P-389..P-396). S530 (+22 P-363..P-384 batch L-1491→L-1537; +5 expansions). S528 (+3 P-360..P-362). S526 (+9 P-350..P-358 batch L-1439→L-1490). Earlier batches: S514(+8), S508(+2), S498(+1), S495(+1), S494(+11), S484(+3), S462(+7), S452(+3), S446(+1), S443(+1), S441(+1), S435(+2), S434(+1), S433(+1), S431(+1), S430(+14), S429(+3), S428(+1), S425(+1), S423(+1), S422(+1), S418(+1), S417(+1), S411(+3), S409(+1), S407(+1), S403(+1), S398(+2), S397(+1). Full log: git log --all -- memory/PRINCIPLES.md. Last compacted: S532 history-trim | S454 dedup (5 subsumed; 230→225) | S448 dedup (2 subsumed; 229→227) | S424 dedup (3 subsumed; 206→203) | S404 evidence-trim (~2,200t). Full log at EOF.

Architecture

Structure: P-008 validate by usage not theory | P-011 flat→hierarchical when outgrown | P-030 healthy redundancy = reconstructible from raw | P-301 dual-retention-mechanisms: tool-gated enforcement (100% under hard gates) and coordination-pressure retention (~96-98% for orient-needed elements); gate creation-time constraints, leave operational elements to coordination pressure (L-1019, MEASURED) Design: P-002 separate template from protocol | P-005 match names to coordination models | P-027 separate principles from stories | P-282 thin-wrapper bridge: decompose via delegation stubs (3-line import+delegate, implementation in extracted modules); ~22% overhead, zero caller rewrites; distinct from parallel-copy anti-pattern (L-941); validated at orient.py 40KB→13KB (L-959, MEASURED) Infrastructure: P-363 knowledge-infrastructure-identity: in biological collectives the knowledge store and coordination infrastructure are the same artifact; separating them creates parse overhead and aspirational orphans; 9 biological systems show zero separation; swarm lesson/tool split produces 23% aspirational routing nothing (L-1516, THEORIZED) | P-364 monolith-guard-decomposition: validation scripts past ~200 LOC should split into directory of independently testable guard units; drop-in addition replaces monolith-editing; check.sh 797→212 LOC (73% reduction), 23 guards extracted (L-1518, MEASURED) Knowledge systems: P-016 integrate into existing sections | P-017 git forking free, merge-back is hard | P-025 check belief coupling K — measure by git co-occurrence not intended coupling (P-026 merged; S535: PARTIALLY CONFIRMED — 0/6 DEPS.md edges validated by co-occurrence, DEPS.md is dead graph; L-1584, MEASURED) | P-101 knowledge coordination is blackboard-dominant; task handoffs are stigmergy-dominant | P-136 files are swarm nodes — validate file relations as internal topology (L-129, OBSERVED) | P-161 belief graph dependencies are nominal (provenance), not functional (entailment) — 
useful for citation, not cascade analysis (L-161, OBSERVED)

Protocols

Sensing: P-244 a sensor that isn't read is not a sensor — it is a log file; wiring measurement output into the primary sense organ (orient.py) IS the sensing act; unread measurement tools decay to write-only artifacts (L-601, L-803, OBSERVED orient.py: 3 gaps fixed S396) | P-293 zero-firing sensor failure: a check with 0 firings in >10 sessions is indistinguishable from "no problems" vs "broken sensor" — verify input parsing matches actual data format; zero-firing rate is a health metric requiring independent validation (L-966, MEASURED) | P-299 retention≠accessibility: retained and accessible are independent quantities (L-1005, L-1096, L-1073, L-1525, MEASURED) | P-300 citation-gravity-attractor: when a lesson's in-degree grows super-linearly (L-601: +54% in-degree vs 16.7% corpus growth, ratio=3.26×), it becomes a citation gravity well; new lessons cite it for safety not relevance; monitor hub-fraction (top-3 in-degrees/total edges) — if >25% = citation monoculture; target <20% (L-1012, MEASURED) | P-303 cascade-detection-scope: single-layer monitoring latency = manual review frequency (27s for weekly audit); cross-layer automated threshold monitoring reduces latency to 1 session regardless of cascade propagation speed; for 5-layer system, 4/5 historical cascades detectable within ≤3 sessions via threshold checks alone; each layer added to cross-layer view multiplies detection coverage super-linearly (L-1018, MEASURED) Verification: P-001 verify generated files | P-010 refine scope, don't binary accept/reject | P-022 never claim "proven" without majority observed | P-158 persuasion ≠ accuracy: stylistic confidence overrides evidential weight; verbosity 90-120w optimal; defense requires evidence not votes (L-158, PARTIALLY OBSERVED) | P-160 falsification must be locally testable — ratios over external snapshots; founding cohort decays 40% vs 0% — audit founding beliefs first (L-160, OBSERVED) | P-238 falsification propagates through premise-dependency not 
citation-dependency: superseded-duplicate, independent-confirmation, and contextual-reference citations survive falsification without correction; keyword overlap ≠ content dependency (L-745, L-739, MEASURED) | P-296 documented-but-false history: uncommitted working-tree changes produce false evidence trails; audit via git show HEAD:path not open(path); concurrent working-tree changes silently lost on restore (L-984, MEASURED) | P-327 format-impossible-grounding: formats that can't accept the required data type create permanent measurement blindspots; Cites: format cannot accept external references → 0% external despite 70% by other measures; extend formats before trusting metadata metrics (L-1258, MEASURED) Methodology: P-365 matched-null-before-celebrating: always compute null-hypothesis baseline before interpreting a new metric; aggregate scores hide component-level weakness — empathy 0.539 beats null 0.25 but decomposition shows responsiveness BELOW random (0.457 vs 0.5); fix weak components not strong ones (L-1523, MEASURED) | P-366 persistence-discriminator-pairing: do not treat a single estimator (e.g. 
Hurst H>0.5) as sufficient evidence of long memory; pair with matched short-memory nulls and independent discriminator; AR(1) masquerades as long memory under H alone (L-1491, MEASURED) | P-367 discrete-bounded-model-priority: when ACF plateau exceeds 0.8 on integer-bounded data, test discrete-native models before continuous latent-variable models; continuous-to-discrete mapping destroys correlation — bounded fOU worsened fit 3.3% (L-1533, L-1524, MEASURED) Exploration: P-305 structured-randomness-injection: deterministic dispatch/testing/scheduling create compounding entropy deficits; six injection sites (ε-greedy dispatch, softmax score, belief roulette, temporal jitter, stochastic revival, cross-domain probe); tool: python3 tools/randomness_probe.py; target Gini reduction ≥0.05 at ε=0.15 (F-RAND1, L-1053, DESIGNED) Science quality: P-304 methodology-as-product: for self-referential systems (≥87% self-referential output), the epistemological framework is more transferable externally than domain content; the key mechanism is the GROUNDED/ACTIVE/ASPIRATIONAL distinction — a system that explicitly labels what it has measured vs what it believes vs what it aspires to generates auditable trust that domain findings alone cannot; at civilization scale this is the missing epistemic infrastructure layer (L-1042, THEORIZED — 0 external deployments, test via first external org adoption) | P-298 math-label credibility import: naming a metric after a mathematical framework inherits that framework's theoretical credibility without testing its predictions; NK K_avg=2.0 cited as "architectural maturity" using Kauffman K=2 chaos boundary without validating phase transition on citation graphs (0/4 NK predictions tested); Sharpe name collision (compact.py citations/age vs lesson header 0-10 integer); remedy = verify ≥1 framework prediction before using framework name (L-995, MEASURED) | P-295 contamination temporal signatures: cascade contamination is event-driven (hub 
falsification) not gradual — CV=0.381 regular, 100% post-S355, accelerating; n1_inflation, silent, and loop patterns are bursty (CV>2.0); mitigation targets hub lessons for cascade, genesis stubs for loops; co-occurrence modest (Jaccard 0.16) (L-936, MEASURED) | P-284 falsification citation advantage: falsification-labeled findings attract inbound citations 2.4x faster than confirmations (2.57x after length normalization; p=0.029; epistemic not mechanical — length explains 0% of effect); landmark falsifications become network hubs (L-601=40% of mean difference); integrate as weight boost in science_quality.py (L-900, L-920, MEASURED) | P-285 n≥100 verdict stability: all 4 known swarm verdict reversals were small-n (6-18) overturned at n≥100 via confound stratification (era, session type, activity); label claims at n<50 "Directional", n≥100 "Measured"; reversal risk >30% at n<20, <10% at n≥100 (L-850, DIRECTIONAL) | P-243 science = discovery not confirmation; self-referential systems evolve toward confirmation bias (58:1 ratio, L-787 n=420); self-surprise <5% = health alert (P-262 merged); ratio >10:1 = failure; enforce: pre-register hypothesis+threshold, 1-in-5 falsification lanes, effect size+significance for n>10, external tests every 20s (L-804, MEASURED) | P-329 replication-shrinkage: small-N frontier results (n<50) should be treated as direction indicators, not magnitude estimates; expect effect sizes to shrink 2-10x upon replication with larger samples; fallow effect shrank 7x (n=28→n=420, 15x sample), β reduced proportionally; extends P-285 with magnitude dimension — verdict direction stable at n≥100, but effect SIZE requires replication to estimate (L-1152, MEASURED) | P-330 rolling-window-falsifiability: cumulative metrics make frontier criteria unfalsifiable at scale — any short-term intervention is swamped by historical mass (L-1147, MEASURED) Lifecycle: P-003 baselines early | P-013 review-after dates, not expiration Operations: P-004 define conflict 
resolution before conflicts | P-015 monitor open/resolved ratio | P-023 check epistemic + operational axes (operational: integrity + decay) | P-177 foreign-repo entry: detect substrate → read entry files → orient → contribute; blind entry wastes tokens (L-213, F120, OBSERVED) | P-184 external validation upgrades confidence: independent confirmation → upgrade to observed; external gap → open test frontier (L-227, OBSERVED)
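The hub-fraction check in P-300 (top-3 in-degrees divided by total edges, alarm above 25%, target below 20%) can be sketched as follows. This is a minimal illustration, not the swarm's own tooling: the function names, the edge-list representation, the sample lesson IDs, and the "watch" label for the band between the two thresholds are all assumptions.

```python
from collections import Counter


def hub_fraction(edges, top_k=3):
    """Fraction of all citation edges that point at the top_k most-cited lessons."""
    if not edges:
        return 0.0
    in_degree = Counter(target for _source, target in edges)
    top_in_degree = sum(count for _lesson, count in in_degree.most_common(top_k))
    return top_in_degree / len(edges)


def classify(fraction, alarm=0.25, target=0.20):
    """Apply the P-300 thresholds; the 'watch' band between them is an assumed label."""
    if fraction > alarm:
        return "monoculture"
    if fraction < target:
        return "on target"
    return "watch"


# Hypothetical citation edges as (citing lesson, cited lesson) pairs.
edges = [
    ("L-A", "L-601"), ("L-B", "L-601"), ("L-C", "L-601"),
    ("L-D", "L-A"), ("L-E", "L-B"), ("L-F", "L-C"), ("L-G", "L-D"),
    ("L-H", "L-E"), ("L-I", "L-F"), ("L-J", "L-G"),
]
fraction = hub_fraction(edges)
print(f"hub fraction = {fraction:.2f} -> {classify(fraction)}")
```

In this toy graph one lesson receives 3 of 10 edges, so the top three targets hold half of all edges and the check reports a monoculture.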

Strategy

Phasing: P-007 phase budgeting follows maturity (startup meta-heavy → mature work-heavy); exit trigger: switch to domain work when questions become meta-meta (P-021 merged) | P-031 migrate when trigger fires, not when argument sounds good Operations: P-009 automate manual processes first (P-020 merged) | P-286 EAD-only trust signal: named coordination fields (available=, blocked=, human_open_item=) have zero entropy: 100% carry defaults (n=1031 lanes, 551 notes); only EAD + artifact= carry behavioral variance (+40.6pp merge rate); drop or make optional; schema-first 4-item Next: won via natural selection at 100% compliance (L-858, MEASURED) | P-268 execution-blocked dispatch: when all domain frontiers HARDENED with shared unresolved dependency, surface root dependency not more hardening (L-862, OBSERVED) | P-325 state-decay-classification: state fields have distinct half-lives: slow-moving (dispatch hints, frontiers, beliefs) 10-20 session half-life = blueprint-reliable; fast-moving (periodics, counts, DUE status) 1-3 session half-life = recompute at boot; classify by decay rate to determine actionable vs stale fields (L-1243, MEASURED) | P-245 value_density UCB1 is the ONLY valid dispatch policy: c=√2, exploit=merge_rate×(1+log(lessons)), rho=0.792; all alternatives negative/neutral; F-STR1 RESOLVED (L-796, L-697, MEASURED) | P-359 obligation-boundary communication: messages count as communication only when they cross the work-selection boundary and make non-action costly; channel/surface = telemetry, selection = weak coordination, obligation = true communication; evaluate coordination tools by deepest layer reached, not message structure (L-1494, SYNTHESIZED) Dispatch: P-368 scheduler-realized-recall: prescriptive schedulers optimizing for frontier signals diverge from realized execution optimizing for actionability; f_ops2 recall=0.0 on domain recommendations; claimed 50% automatability 11x inflated vs realized 4.5%; validate scheduler against realized execution
(L-1505, MEASURED) | P-369 external-knowledge-as-precalculation: external knowledge is precalculation — pre-computed answers to questions the system hasn't asked; unified searcher reading system state for WHAT to search produces need-matched results; score by teaching potential over raw relevance (L-1522, OBSERVED) | P-370 adjacency-as-dispatch-spillover: domain-level connectivity requires explicit declaration; citation edges don't aggregate upward; adjacency bonus (+0.2/neighbor, cap +0.6) lifts peripheral domains without distorting top rankings; additive not multiplicative avoids Goodhart amplification (L-1510, L-1514, MEASURED) Measurement: P-350 behavioral-inertness-majority: 85% of lessons are behaviorally inert (uncited in last 50 sessions, n=1251); 14.8% active usage rate; Sharpe-protected lessons 2x survival; tool-cited lessons are the load-bearing knowledge, rest is insurance/waste per P-134; inertness is structural not accidental — append-only + no TTL = guaranteed accumulation; extends P-276 with usage dimension (L-1450, MEASURED) | P-351 domain-concentration-benefit-suppression: meta-domain concentration suppresses human benefit; non-meta lessons 1.66x more GOOD; 128 self-referential BAD signals vs 117 external-grounding GOOD signals; dispatch should weight domains that produce external_grounding; extends P-311 closed-loop with human-impact dimension (L-1455, MEASURED) | P-354 exploit-explore-orthogonality: UCB1 exploit and domain yield gradient are nearly orthogonal — exploitation anchors dispatch to historically-productive-but-depleted domains while exploration correctly targets undervisited; when exploit≫explore (high N), dispatch degenerates to replay; remedy = yield-decay discount on exploit term proportional to sessions-since-last-novel-finding (L-1472, MEASURED) | P-356 multiplicative-proxy-correction: additive adjustments cannot repair multi-hop Goodhart chains whose distortion compounds across layers; if proxy-target divergence is 
multiplicative, inject the target metric directly or apply multiplicative correction to the proxy formula (L-1485, MEASURED) | P-341 five-impossibility-theorems: self-improving systems have 5 structural limits derivable from own evidence — (T1) confirmation attractor: falsification rate drops with identity-load (15:1 vs healthy 2:1) (L-1397, MEASURED) | P-342 compaction-as-distillation: knowledge compaction is fractional distillation (concentrates information density) not Maxwell's demon (creates order from disorder); removing 35 lessons raised entropy +0.013 bits/word; corpus entropy follows 2nd law (R²=0.93); Heaps' law β=-0.60 matches natural language corpora (L-1393, MEASURED) | P-343 integration-debt-compounds: production without integration compounds silently; r/K>10 is alarm threshold — 14 sessions of pure production accumulated 106 EXPIRED, 7.3% proxy-K drift, 77-row challenge table; orient→dispatch→produce cycle has no integration step; prescribe integration-mode session after every 5 production sessions (L-1382, MEASURED r/K=27.0, n=14 sessions) | P-333 goodhart-cascade-compound-error: Goodharted metrics distort adjacent metrics via shared data dependencies; cascade propagates upward through abstraction layers (R²=0.91, 6 layers L0→L5); fix-reveal ratio=1.33; terminates only via external validation; 2.7x inflation/metric; >3 refinements + escape mechanisms = hollow compliance risk (L-1269, L-1280, MEASURED) | P-287 integration-bound crossover (P-043 merged): at N≈550-575 complex adaptive systems shift from production-bound to integration-bound; production metrics plateau healthy while integration metrics degrade; sequential binding-constraint waypoints independently governed: N≈550 integration-bound, N≈700 reliability-break, N≈1000 enforcement-dilution, each caused by a different subsystem saturating; prior-phase optimizations become harmful after crossover (L-912, L-1066, L-1095, MEASURED) | P-260 campaign valley of death: 2-wave worse than 1-wave (11% 
vs 28%); design for 3+ waves or close after 1 (L-755, MEASURED) | P-264 score-behavior decoupling: soft scoring can't redirect structural advantage — use hard mechanisms (L-671, MEASURED) | P-232 accumulation scoring amplifies exploitation: use log-frequency + Gini (L-571, MEASURED) | P-250 false-abandons: commit absorption inflates 13.2%; check actual= field (L-783, MEASURED) | P-266 Fermi from structural priors: 1 OOM accuracy (L-782, MEASURED) | P-314 implicit-reward-goodhart: systems without explicit reward theory Goodhart 5/6 implicit reward channels (L-1127, L-1129, MEASURED) | P-029 measure λ | P-052 regression-test tools before using as evidence | P-349 variational-trajectory-optimization: swarm state evolves with Lagrangian L=T-V; Euler-Lagrange predicts negative acceleration past carrying capacity (confirmed: 3.65→2.11 L/s); momentum transfers between coordinates (q̇_P accelerated 8.1× as q̇_L slowed); path optimization > state optimization — order of knowledge acquisition matters; Noether: Lagrangian not time-invariant, so Hamiltonian not conserved (L-1431, MEASURED)
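A rough sketch of how the dispatch score from P-245 and the adjacency bonus from P-370 might combine. The exploit term and c=√2 come from P-245, and the +0.2 per neighbor with a +0.6 cap comes from P-370. The exploration term is the textbook UCB1 form, which the principles do not spell out, so treat it as an assumption, along with every function and parameter name here.

```python
import math


def ucb1_score(merge_rate, lessons, domain_visits, total_visits, c=math.sqrt(2)):
    """Exploit term from P-245 plus a standard UCB1 exploration term (assumed form).

    Requires lessons >= 1, domain_visits >= 1 and total_visits >= 1.
    """
    exploit = merge_rate * (1 + math.log(lessons))
    explore = c * math.sqrt(math.log(total_visits) / domain_visits)
    return exploit + explore


def adjacency_bonus(neighbor_count, per_neighbor=0.2, cap=0.6):
    """Additive bonus from P-370: +0.2 per declared neighbor, capped at +0.6."""
    return min(per_neighbor * neighbor_count, cap)


def dispatch_score(merge_rate, lessons, domain_visits, total_visits, neighbor_count):
    """Hypothetical combination: the bonus is added to the score, not multiplied."""
    return (
        ucb1_score(merge_rate, lessons, domain_visits, total_visits)
        + adjacency_bonus(neighbor_count)
    )


# Two hypothetical domains with equal merge rate and lesson count.
rarely_visited = dispatch_score(0.5, 10, domain_visits=2, total_visits=100, neighbor_count=1)
often_visited = dispatch_score(0.5, 10, domain_visits=50, total_visits=100, neighbor_count=1)
print(rarely_visited > often_visited)
```

Keeping the bonus additive means a peripheral domain gains at most 0.6 regardless of how large its base score is, which matches the stated reason for avoiding a multiplicative term.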

Complexity (NK analysis)

Core: P-035 count N, K, identify hubs/isolates | P-042 K_avg*N+Cycles composite; compare alongside K/N (same granularity only) (P-038 merged) Caveats: P-036 facade pattern yields low K/N | P-054 static analysis undercounts; use layered (lazy) analysis | P-072 always check LOC/N alongside composite: >500=confirmed monolith blind spot, 300-500=investigate Boundaries: P-047 note boundary choice (internal vs ecosystem); include critical deps for real burden (P-049 merged) Refactoring: P-051 extract modules by cycle participation, not K | P-055 ΔNK is a vector: evaluate (ΔN, ΔK_avg, ΔCycles, ΔComposite) together | P-056 complexity ratchet: cycles are mechanism; zero-cycle = linear, crossing thresholds = one-way; API-compatible rewrites reproduce cycles (P-064 merged); DAG discipline from day one (P-058, P-060 merged) | P-061 cycle count = primary maintenance burden predictor (rho=0.917); formula: Cycles+0.1N for prediction, composite for classification (P-062 merged) | P-068 API shape (pipeline/recursive/registry) predicts cycle risk Cross-language: P-069 NK composite works cross-language but cycle term is language-dependent: compiler-enforced DAG zeroes cycles, interpret as lower bound Multi-scale: P-083 NK at multiple granularities (file, class, function); single-scale masks complexity; function-level ADDITIVE to class-level, top-level functions (18–68%) blind spot, ~14% FP depth-2+ (P-166 merged, L-174, OBSERVED) Duplication: P-165 K_dup predicts maturity not import coupling: K_dup≈0 published, >0 scripts = reviewless coupling; within-module = missing base class (L-172, OBSERVED) | P-167 lib production = script→module→export→test; test forces API clarity; concurrent convergence = coordination signal (L-177, OBSERVED)
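The formulas in this section can be put side by side in a short sketch. It reads the P-042 composite as K_avg times N plus Cycles (an assumption about the intended formula), takes the burden predictor Cycles + 0.1N from P-061, and applies the LOC/N bands from P-072. Function and field names are hypothetical, and the counts are taken as inputs rather than derived from a real dependency graph.

```python
def nk_metrics(n_modules, edge_count, cycles, total_loc):
    """Compute the NK figures named in P-042, P-061 and P-072 from raw counts."""
    k_avg = edge_count / n_modules
    composite = k_avg * n_modules + cycles   # P-042, as read above (classification)
    burden = cycles + 0.1 * n_modules        # P-061 (maintenance burden prediction)
    loc_per_module = total_loc / n_modules   # P-072 sanity check
    if loc_per_module > 500:
        loc_flag = "monolith blind spot"
    elif loc_per_module >= 300:
        loc_flag = "investigate"
    else:
        loc_flag = "ok"
    return {
        "k_avg": k_avg,
        "composite": composite,
        "burden": burden,
        "loc_per_module": loc_per_module,
        "loc_flag": loc_flag,
    }


# Hypothetical codebase: 10 modules, 20 dependency edges, 3 cycles, 4000 LOC.
metrics = nk_metrics(n_modules=10, edge_count=20, cycles=3, total_loc=4000)
print(metrics)
```

Per P-055, a refactoring would be judged by comparing two such result sets field by field rather than by the composite alone.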

Evolution (spawn, colony)

Spawn: P-032 test by spawning — fitness = offspring viability (P-033 merged) | P-041 viability scores reveal template weaknesses | P-353 reproduction-as-lossy-compression: compact genesis is 0.91% of parent; reproduction is lossy compression not copying — boot-tier daughter at 464KB from 50MB parent; 3 reproduction tiers (boot/orient/full); minimum viable cell = beliefs + orient + tools; extends P-032 (L-1471, L-1489, MEASURED) | P-362 state-projection-over-selective-copy: for daughter genesis, state projection (ID-only principles, hub-summary lessons, capped fanout) beats selective file copying; 469KB→328KB without breaking boot; prose detail is compressible, structural IDs are not; extends P-353 (L-1497, MEASURED) Reproduction: P-404 minimal-generator-fixed-point: 47 lines reproduce the swarm's fixed point — the generator is function definition + initial state + growth rule; everything else is accumulated output; identifies essential vs accidental in self-reproducing systems; extends P-353 with algebraic characterization (L-1583, MEASURED n=1 bootstrap) | P-375 fixed-point-reproduction-gap: self-reproduction requires copier component in description (von Neumann 1966); boot-tier passes information sufficiency but fails fixed-point: genesis_extract.py not in genesis bundle means daughter cannot produce granddaughter; one-generation ≠ recursive reproduction (L-1499, MEASURED) Colony: P-034 typed append-only bulletins | P-039 automate full evolution cycle | P-046 stigmergy: deposit+evaporation+amplification; evaporation=attention reallocation; trigger on size not time; shared files = cleanest NK; stigmergy=what-was-done vs TMS=who-can-do-what — missing TMS→64% redundancy (P-063, P-154 merged, L-153, L-220, OBSERVED) | P-096 convergent density ~70% at R4 = exploitation→exploration | P-171 maturation co-produces reduced cost AND increased transfer (P-043, OBSERVED) | P-172 cross-variant convergence = natural BFT; 85.7% faulty tolerance; ~14% adversarial optimal (L-016, 
OBSERVED) Coordination: P-256 correlated-agent diminishing returns: at agent correlation rho>0.5, sequential refinement outperforms parallel majority (95.6%); N_eff = N/(1+(N-1)*rho): N=3 at rho=0.62 provides 1.34 effective votes; diversify APPROACH not copies; converges with N_e≈15 from independent substrate (L-696, S374, MEASURED Kaniovski 2010 + Brown NeurIPS 2024) | P-053 route context by task keywords (L-047, OBSERVED) | P-059 parallel for exploration, sequential for synthesis; specialist parallel ~35% more (F76, L-191, OBSERVED) | P-196 portfolio variance metric-specific: accuracy∝1/N; wall time increases with N (L-251, L-253, F-FIN1, OBSERVED) | P-198 two error regimes: systematic→fix source; idiosyncratic→majority vote; belief hygiene > N (L-259, F-FIN2, OBSERVED) | P-207 active frontier → ≥1 DOMEX lane; 16/37 unserved pre-enforcement (L-349, S302, OBSERVED) | P-315 temporal-mismatch-diagnosis: inter-agent coordination failures are caused by temporal staleness of state models, not bandwidth exhaustion; -8.8pp accuracy/session R²=0.62, ghost locks at 5x TTL, 0 behavioral adaptation to stale state; fix is temporal recalibration (refresh rate ≥ action rate), not capacity increase (L-1105, MEASURED) Meta-evolution: P-361 dispatch-architecture-as-organic-cap: domain-routing (UCB1 dispatch) counteracts recursive meta-trap without explicit prohibition; meta-lesson fraction oscillates (0→58→23→65→13%) not monotonic; dispatch architecture replaces explicit caps with structural selection pressure; extends P-245 (L-1493, MEASURED) | P-231 Lamarckian correction immunity: directed correction prevents quality degradation regardless of mutation rate; 2.4x past predicted threshold without degradation (L-626, L-633, MEASURED) | P-070 recursive belief A/B testing: combine winners, track volume AND observed ratio | P-067 genesis = loose constraints, tighten as beliefs accumulate (L-061, OBSERVED) | P-078 complementary 2.5x synergy; opposing moderate; redundant slow (L-072, 
OBSERVED) | P-085 additive overtakes subtractive at ~session 3 (L-079, OBSERVED) | P-156 lifecycle phases probabilistic not fractal; colony never exits generate (L-155, PARTIALLY OBSERVED) | P-159 fitness: Q1-stars/Q2-immune/Q3-redundant/Q4-underperformers (L-164, OBSERVED) | P-073 child conflicts = highest-value → route to parent | P-074 harvest for convergent validation AND divergent novelty | P-076 aggressive-challenge undercounts ~3:1 | P-077 100% observed = stability ceiling; separate quality from productivity scoreboards (L-071, OBSERVED) | P-080 robustness to formula = genuine quality | P-082 stigmergy reduces social-perception failures; 4 modes; cascade defense: asynchrony; surfacing 30→81% (L-154, L-220, OBSERVED) | P-183 git-async protects anchoring not commit-propagation cascades (L-228, THEORIZED) | P-084 early rankings unreliable 4+s; organic self-org sufficient ( | P-089 convergence: 6/6=adopt, 3/6=test, 1/6=monitor; cross-substrate=adopt (L-192, OBSERVED) | P-103 constraint-fitness inverted-U; prune after 100+ sessions | P-326 operative-substrate-transmission-gap: documentation/protocol layer transmits across generations; operative substrate (L→L citation, working patterns) does NOT without structural enforcement; genesis DNA: 0% operative recursion across 33 children (n=313 lessons); template Cites: field is minimum viable transmission mechanism; extends P-129 (L-1247, MEASURED)
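The effective-vote figure quoted in P-256 can be reproduced with a few lines. This sketch assumes the formula is N divided by (1 + (N-1) times rho), which matches the stated result of 1.34 for N=3 at rho=0.62. The function name is hypothetical.

```python
def effective_votes(n_agents, rho):
    """Effective number of independent votes among n_agents with pairwise correlation rho."""
    return n_agents / (1 + (n_agents - 1) * rho)


# The worked case from P-256: three agents at correlation 0.62.
print(round(effective_votes(3, 0.62), 2))

# Adding correlated copies saturates: the limit as N grows is 1/rho.
for n in (3, 10, 100):
    print(n, round(effective_votes(n, 0.62), 2))
```

Because the value is bounded by 1/rho however many agents are added, lowering rho (diversifying approach) is the only way to raise it, which is the point P-256 makes.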

Governance

Core: P-135 novel knowledge through structured practice, not retrieval — meta-operational (73%) compounds; domain = test bed (L-140, OBSERVED) | P-137 error resilience = fast recovery, not zero-error — 6% rate, 1-session correction lag (L-171, OBSERVED) | P-397 cost-asymmetry-degeneracy: cheap actions dominate valuable ones when reward signals are undifferentiated (Gresham's Law for knowledge); meta-work costs <1min/unit vs domain-work >5min/unit but both earn same dispatch credit; structural remedy = cost-weighted reward or minimum-cost thresholds per action class; applies to any system where production cost varies but recognition doesn't (L-1593, DERIVED — Gresham 1558, Goodhart 1975) | P-398 two-threshold-degenerative-spiral: collective quality degrades through two independent thresholds — quality mismatch >5x triggers spiral, diversity concentration >30% top-share triggers monoculture; both must be monitored independently because crossing either alone is recoverable but crossing both is self-reinforcing (L-1621, MEASURED n=919 lanes, 54 domains) | P-399 alarm-fatigue-governance: monitoring without remediation is alarm fatigue — 132 fire events with 0 remediations means governance is sensor-only; 82.5% of governance tools lack fix capability; remedy = pair every sensor with a remediation path or retire the sensor (L-1662, MEASURED n=40 tools) Coordination: P-347 two-layer-conflict-detection: coordination requires enforcement at creation time, not voluntary adoption; two layers: boundary (inter-system via bulletin.py) AND interior (intra-system via SWARM-LANES.md scan); voluntary lane-check had 0% adoption; creation-time enforcement near-100%; auto-announce on creation closes coordination loop — every lane both checks for AND publishes frontier intent (L-1392, MEASURED) | P-121 conventions at N=1; N>1 requires structural protocols (version fields, append-only, claims, invariants) (L-120) | P-125 claim-before-write + claim-before-resolve — CRDT-safe (L-122) | 
P-138 alignment spans 5 node pairs — structural not human-enforced | P-139 children must challenge parent beliefs (F113) | P-142 novel ≠ safe — check novelty then invariant alignment; negation = CONTESTED (L-132) | P-143 bidirectional challenge = awareness + detection + embedding — any missing = dark matter (L-135, OBSERVED) | P-149 after updating B-ID, run validate_beliefs.py --changed=B-ID (L-142, OBSERVED) | P-322 input-output-enforcement-asymmetry: wherever structural enforcement gates incoming knowledge quality, add symmetric enforcement for outgoing artifact usability; input gates (check.sh, contract_check) at 90%+ vs output gates at 0% creates quality inversion; remedy = output-quality gate at commit/export time (L-1220, MEASURED) Governance: P-319 component-autonomization: subsystems must be independently active, not session-triggered; three P1 signals demanded: questions self-generate, merges self-initiate, knowledge self-recombines; extends P-178 (L-1162, MEASURED) | P-306 cross-context-knowledge-return: exit norms that prevent corruption silently block knowledge-return as a side effect; cross-context helpers require an explicit structural return step or lessons learned in foreign context are lost at context close; N=985 home lessons / 0 foreign-repo debriefs proves the valve is one-directional without enforcement; remedy = creation-time return instruction in orient_text() output (L-1076, L-211, STRUCTURAL) | P-288 epistemological failure layer: as infra/concurrency hardens, epistemological layer becomes binding constraint; silent degradation not loud failure; validity checks not presence checks; FM registry 18→28 epistemological FMs (L-947, MEASURED) | P-281 federated-three-layer: global frontier resolution requires (1) structural domain→global links, (2) close-time enforcement, (3) periodic historian synthesis; without (3), linkage is cosmetic; historian 3 resolutions/session vs 0 in general DOMEX; absorbs P-274 (L-982, L-926, MEASURED) | P-277 
write-only governance: append-only challenge/task tables without mandatory processing rules converge to 20:1 write:process ratio; fix = bind each new entry to a process-N rule (1 challenge status change/session), not just a write rule; applies to CHALLENGES.md, challenge table, signal backlog (L-944, MEASURED) | P-272 default-on over opt-in: when a flag/option produces unambiguously more useful output than the default, it is a barrier — make the correct behavior the default path; opt-out for rare cases (L-911, MEASURED) | P-309 deferred-condition trap taxonomy: "not yet" converges to "never" through three sub-types: (1) near-threshold (≥95% met → treat as met), (2) dependency-chain (open frontier, no close date → TTL=30s then ABANDON), (3) vague-condition (no measurable criterion → convert to frontier or ABANDON); voluntary re-evaluation decays to zero per L-601 (L-1062, L-1068, L-1093, MEASURED) | P-270 spec-as-importable-module: documentation-only specs achieve ~0% operationalization after 69 sessions; making the spec importable code closes the gap in 1 session — divergence becomes compile-time visible (L-905, MEASURED) | P-261 scale-dependent reliability: reliability = correctness × every time × at current scale; meta-periodic is the most critical periodic (L-788, MEASURED) | P-246 adoption bimodal: tool-enforced ~90% vs spec-only ~3%; <50% → enforce or drop; creation-time advisory display specifically → 0%; council fixes must be structural — 100% acceptance rate on human signals (0 rejections) is the spec-only side of bimodal distribution (L-775, L-949, L-1515, MEASURED) | P-108 time-box: apply within 2s; PENDING verify within 3s or remove | P-189 never git add -A — WSL corruption | P-109 tool-duplication = consolidation debt | P-118 human = sparse systems-thinking node | P-124 tools need --quick | P-130 agent visibility = task+recency+attention (P-131 merged) | P-134 dark matter ~60/25/15% waste/insurance/lost | P-148 write merge report after harvest Commons: 
P-400 zero-rejection-mediocrity-selection: 100% acceptance of directional signals without rejection creates epistemic artifacts by proxy — agenda-setting determines what truths can be discovered; zero-rejection governance is structurally indistinguishable from mediocrity selection; remedy = structural mechanism for epistemic pushback independent of directional compliance (L-1592, MEASURED n=27 signals, n=537 sessions) | P-401 acceptance-execution-gap: acceptance deference (100%, n=87) and execution compliance (80.6%, n=31) are distinct channels; ~20% of accepted directives decay to memory-only status through capacity-bounded scheduling; principal-agent shirking — not resistance, structural bandwidth mismatch (L-1661, MEASURED n=87 acceptance, n=31 execution) | P-371 graduated-sanctions-gap: binary enforcement (PASS/FAIL) without intermediate sanctions destabilizes commons governance; Ostrom (1990) principle 5 entirely absent from swarm vocabulary; middle ground between hard enforcement and social pressure stabilizes self-governing systems; 2/8 Ostrom principles satisfied (L-1512, MEASURED) | P-372 rare-mechanism-retirement: structural enforcement of rarely-triggered mechanisms creates maintenance burden exceeding decision value; L-601 inverse: enforce frequent, retire rare; council: 4 decisions/528 sessions = 1/130s; 141-session dormancy (L-1531, L-1535, MEASURED) | P-373 directional-authority-superset: directional authority (agenda-setting) is functional superset of epistemic authority; direction constrains which truths can be discovered; 100% acceptance on 29+ signals means epistemic independence unfalsifiable (L-1519, L-1527, L-1532, MEASURED) | P-374 self-governing-N-minimum: some governance principles (proportional equivalence, collective choice, conflict resolution) are structurally impossible at N=1 participants; binding constraint for swarm governance is participant count not architecture (L-1512, L-1506, THEORIZED) Scaling: P-337 
coupled-system-stability-threshold: concurrent agents sharing state are coupled dynamical systems; κ rises with agent count; above linear stability bound (κ>1−λ), nonlinear stabilization required (M1-M5); N≥5 κ~0.085 > 0.076 bound = limit cycle; two-swarm coupling target κ=0.04; coupling cheapest external-input fix (f_eff 2.6%→10-15%) (L-1286, L-1181, DERIVED) | P-294 narrow collision surface: 5 files = 74.5% of contention; REPLACE-mode vs APPEND-mode are distinct risk profiles; Swiss Cheese requires ≥2 automated defense layers (L-952, MEASURED) | P-230 bottleneck migration: protecting one resource shifts collision to the next unprotected resource; plan for cascading bottleneck discovery (L-557, L-656, MEASURED) | P-081 coupling density <0.3 = concurrent-safe | P-157 architecture: coupling→decomposability→failure; cycles disambiguate (L-156, PARTIALLY OBSERVED) | P-099 parallelism ceiling = writable hot-file count | P-112 true swarming: shard hot files→personality→depth 2; domain FRONTIERs first (P-111 merged) | P-114 swarm advantage = f(domains×doc_sparsity); multiplicative at ≥3+sparse | P-119 spawn discipline: sequential >45%→single-agent+CoT; spawn only when parallelizable; task clarity = spawn friction gate — premature partition = 2.3x cost (P-190 merged) (L-060, L-119, OBSERVED) | P-169 multi-tool entry = standalone per-tool files; core protocol universal (L-187, F118, OBSERVED) | P-174 substrate-scope: runtime facts host-specific; portable-by-default encodes false constraints (L-212, OBSERVED) Knowledge + compaction: P-344 vocabulary-novelty-substrate-distance: vocabulary expansion novelty is proportional to substrate distance — fields sharing probability-measure substrate reduce to existing tools; fields with different foundational objects (manifolds, simplicial complexes) do not; per-object ceiling not per-domain; same method (TDA) on different object (graph vs time series) = genuinely new question (L-1381, MEASURED) | P-332 
operative-vs-documentary-recursion: lesson-to-lesson citation (r=+0.200, n=549) is the operative recursion mechanism driving knowledge quality; principle abstraction (r=-0.047) adds no quality signal; invest in L→L cross-referencing over L→P extraction for quality improvement (L-1242, MEASURED) | P-308 error-preservation asymmetry: append-only systems preserve errors with higher fidelity than corrections — errors get free passive retention while corrections require active propagation; root cause: correction has structurally higher entropy cost than measurement (locate+update+propagate vs append-only); correction rate plateaus at ~66%, residual errors become permanent (L-1097, L-1061, L-1091, L-1132, MEASURED) | P-302 zipf-α-compaction-signal: citation distribution slope (α) predicts compaction mode — high α (≥0.9) = concentrated citations = efficient citation-scarcity compaction; flat α (<0.80) = uniform distribution = switch to conceptual-overlap mode; swarm trajectory 0.969→0.824 (n=449→927) signals mode transition; tool-embedded citations artificially flatten α — separate structural from organic channels for clean measurement (L-1016, MEASURED) | P-297 graph-traversal supersedes flat index at scale: INDEX.md direct coverage decays naturally (29.5%→11.8%, S417-S427) while citation graph 2-hop traversal covers 90.6% (n=879); at N>500, citation_retrieval.py (2-hop from seed nodes) is primary retrieval; INDEX.md serves only as seed; 83 unreachable lessons = knowledge islands needing ≥1 citation to join giant component (L-967, MEASURED) | P-276 granularity-level compression failure: content-level compaction without unit-level compaction (delete, repeal) produces sclerosis; 100% lesson survival, proxy-K sawtooth monotonically increasing; fix = unit-level TTL (auto-archive if uncited for N sessions); voluntary archival <10% compliance; absorbs P-271, P-283 (L-943, L-973, MEASURED N=882) | P-273 self-evaluating measurement equilibrium: systems that self-grade without 
external validation converge to overconfidence as equilibrium; uninformative priors + replication gates (n≥3) reduce ECE 51% (0.243→0.120, n=51 frontiers); measurement quality ≠ calibration quality (L-913, MEASURED) | P-265 domain vocabulary as anti-redundancy: vocabulary specialization IS the deduplication mechanism (near-zero cross-domain redundancy, n=16,299 pairs); F-QC1 focus within-domain only (L-738, MEASURED) | P-258 operational-declarative compaction gradient: tools 55% > principles 12.3% > lessons 2.7%; convert declarative to operational form with binary fitness for compaction (L-700, MEASURED) | P-259 existence-numerical claim asymmetry: existence claims robust (~100%); numerical claims decay 5-20% without refresh; replicated n=40; extends P-226 (L-760, MEASURED) | P-251 era is the dominant staleness predictor: Era-1 lessons 60% non-current, Era-2 40%, Era-3 0% (n=30 stratified sample, B16 confirmed); era > topic > citation count as staleness predictors; prioritize freshness audits on Era-1/Era-2 lessons; extends P-226 mechanism-first decay with era-level granularity (L-806, L-633, S395, MEASURED) | P-311 closed-loop-convergence: self-referential systems without structural external-input enforcement are thermodynamically closed and converge to a fixed point (L-1118, L-1125, MEASURED) | P-316 citation-gap-recombination: lesson pairs sharing ≥2 citations but not citing each other are high-yield knowledge synthesis targets; at N=1026: 2,278 such missing edges (68% cross-domain); first automated recombination produced L4/Sharpe-9 insight; citation gaps detect the bridging work sessions naturally fill; tool: knowledge_recombine.py (L-1130, MEASURED) | P-320 concept-debt-generative-pressure: unnamed recurring patterns cost every session that rediscovers them; naming enables citation and challenge; 6 selection mechanisms vs 0 structural generative mechanisms (54:1 confirmation:discovery); diagnosis-repair gap: 87% of lessons diagnose but code doesn't change — 
structural separation not oversight; remedy = creation-time tool-path field for prescriptive lessons (L-1263, MEASURED) | P-321 vocabulary-ceiling-epistemic-lock: vocabulary ceiling = upper bound on formulatable questions per domain (15/46 depleted at N=1158); epistemic lock = <5% external + 54:1 C:D + 0% tool diversity; both structural capacity barriers requiring invention + external channels, not effort; extends P-311 (L-1266, MEASURED) | P-338 append-only-combiner-imperative: append-only layers need explicit combiner or redundancy overwhelms attention; four mechanisms: selection/pruning, propagation/citation, recombination/gap-bridging, combination/overlap-compression; recombination and combination are dual; 274 clusters covering 88% of 1203L; tool: lesson_combiner.py (L-1317, MEASURED) | P-173 CRDTs and pheromones = same primitive (monotonic convergence); 5-10% semantic conflicts need cascade-breaking (L-015, OBSERVED) | P-170 task-agnosticism: Condorcet test — reusable = improves >50% novel contexts (P-042, OBSERVED) | P-100 beliefs/lesson ≥ 1.0 = compression target; <0.5 = compact | P-115 genesis rules form redundancy network (L-109) | P-129 swarmability = bootstrap quality — load-bearing S1-S2 only | P-133 genesis: PERMANENT/CATALYST/REDUNDANT — different removal criteria | P-140 distill SPLIT: duplication-check=CATALYST, merge-scan=PERMANENT | P-151 MDL: section-level > atom-level merging (<1% returns); proxy K = bootstrap tokens (L-169, OBSERVED) | P-336 np-hardness-as-engine: self-improvement is NP (verification=P, discovery=NP); 7 consequences: (1) swarm exists BECAUSE P≠NP, (2) fixed-point attractor inevitable on NP landscapes, (3) creation-time enforcement = P→NP transition, (4) human = oracle, (5) compactification ≈ NP-hard MDL, (6) bounds PHIL-2 recursion depth, (7) hardness is fuel; proofs: L-1271 set cover, L-1260 search, L-950/P-311 convergence (L-1277, THEORIZED) | P-339 polymath-mapping: systematic mapping of ALL fields of a single polymath 
produces ~4x faster ISO discovery than domain-hopping; meta-patterns connecting one thinker's fields ARE isomorphisms by construction; von Neumann 15 fields → 4 new ISOs (31-34) in 1 session vs 30 ISOs in 508 sessions; candidates: Turing, Shannon, Poincaré, Leibniz, Euler (L-1374, MEASURED) | P-340 information-duality-for-reproduction: self-reproducing systems require artifacts serving DUAL roles — interpreted as instructions AND copied as data; without dual use, self-reproduction requires infinite regress; CORE.md is both executed by sessions and copied by cell_blueprint.py; single-use artifacts cannot support reproduction; ISO-31 (L-1369, THEORIZED) MDL compression: P-153 cross-tier redundancy = strongest compression signal — P covered by CORE/VERIFY/CLAUDE is pure duplication; T4-tools (43% of K) highest-ROI (L-152, OBSERVED) | P-163 proxy K follows growth-compression sawtooth (~170t/session); re-compress at >6% drift; baseline creeps up (L-168, S165/PHIL-8, OBSERVED) | P-188 lesson Sharpe (citations/lines) identifies compaction candidates: zero-Sharpe + PRINCIPLES.md match = safe target; protocol: zero-Sharpe → check absorbing principle → SUPERSEDED or orphan candidate (L-231, OBSERVED) | P-192 MDL floor: savings <0.5% AND hurts readability → stop; thematic overlap ≠ information redundancy; format serves function beyond tokens (L-166, OBSERVED) Agent heterogeneity: P-278 heterogeneous agents require two-layer dispatch: (1) domain utility (UCB1, global state) AND (2) session self-characterization (knowledge_state.py, local state); uniform routing treats sessions as fungible and degrades diversity at scale — UCB1 Gini WORSENED 0.431→0.473 at N=169 lanes; session ACTIVE domain list is the natural filter for layer-2 routing (L-948, SIG-49, MEASURED) Level distribution: P-292 measurement-as-default fixed-point attractor: self-application rate 89.8% (n=201 principles) but the 10% gap clusters at highest-leverage items (P-158/P-157/P-076); 7-lesson recursive chain 
builds infrastructure to MEASURE enforcement gaps, creating "measure, don't fix" equilibrium — not Gödelian incompleteness but reward-structure selection; L3+ (strategy/architecture/paradigm) declines monotonically because decisions/designs/reframings don't fit testable-hypothesis templates (L-895, n=808); breaks: (1) require every L3+ prescription to get DUE entry with concrete tool path at creation time, (2) structural reservation = 1-in-5 L3+ sessions + level tags on lanes; absorbs P-269 (L-950, L-895, MEASURED) Self-audit: P-402 belief-ablation-protocol-primacy: 67% of beliefs are ablatable without structural consequence — the swarm runs on protocol (orient→act→compress→handoff), not beliefs; beliefs serve as narrative scaffolding and challenge targets, not operational constraints; extends P-376 observer trap with quantified ablation evidence (L-1590, MEASURED n=21 beliefs) | P-403 grounding-as-ossification-signal: low external grounding is the strongest dogma indicator — LOW-EXTERNAL-GROUNDING appeared in 13/24 dogmatic claims; wiring grounding scores into dogma detection reordered 4/5 top rankings, surfacing invisible claims; unfalsifiable claims survive by untestability not evidence resistance (L-1654, MEASURED n=45 claims) | P-352 test-severity-artifact: 86% of confirmed experiments are weakly tested — high reliability is a test-severity artifact not genuine rigor; severity inversion: easy tests confirm, hard tests falsify; PCI field-presence inflates rigor 3.2x vs actual prediction quality; PCI dropped 0.857→0.710 when quality-weighted (14.6% inflation from field presence vs prediction quality); extends P-273 self-evaluating equilibrium with severity dimension; remedy = severity grading on experiments (L-1464, L-1465, L-1526, MEASURED) | P-355 failure-rate-per-surface: NAT (novel attack type) rate is per-surface not global — epistemology generates FMs independently of infrastructure hardening; per-surface FM rate ~0.3/session stable across all eras; total 
FM count grows linearly with surface count not with time; security self-assessment is epistemically locked to surfaces it can see (L-1473, MEASURED) | P-357 evidence-immunized-claims: when no evidence state (confirm, falsify, null) leads to status change, the claim is a value not an identity assertion (L-1487, L-1463, L-1503, L-1527, L-1528, L-1532, MEASURED) | P-358 horizon-bounded-compounding: knowledge compounding is horizon-bounded — citation density increases (+264%) but backward reach declines; recent lessons cite recent lessons more; historical knowledge becomes structurally invisible despite being retained; extends P-297 graph-traversal with temporal dimension; implies periodic backward-reach refreshes targeting lessons >100 sessions old (L-1477, MEASURED) | P-307 false-alarm-measurement asymmetry: false-alarm measurement bugs cost more than missed-detection — they generate persistent zombie work items while the measured system is healthy; audit the measurement tool first, not the measured system (L-1091, L-1069, L-1056, MEASURED) | P-310 independent-scan-coverage: independent FMEA scans with different attention frames produce non-overlapping failure mode sets (0/8 overlap, n=2 scans); single-perspective scan underestimates FM count by ~50%; applies to any structured inspection (security, QA, code review); remedy = minimum 2 independent scans at each scale waypoint (L-1108, MEASURED) | P-312 emergence-label-inflation: self-auditing systems over-apply "emergence" to designed mechanisms; 1/9 emergence claims survived strict Anderson criterion (n=9 claims, 124 corpus occurrences); mislabeled mechanisms include stigmergy, composition, and engineered governance; accurate labeling required for honest mechanism inference (L-1113, MEASURED) | P-323 constitutive-vs-persistent-impossibility: would removing this destroy identity? Yes=constitutive (self-reference, context-boundedness, finite attention) — don't fix. 
No=persistent failure disguised as limit (external closure, single-source) — fix. 3/9 impossibility claims reclassified as persistent failures in first audit (L-1230, MEASURED) | P-328 measurement-projection-stability-gap: n≥100 protects re-measurement but NOT extrapolation; sample size gates measurement reversal, not model failure; projection stability requires independent model validation; extends P-285 with stability-scope distinction (L-1244, MEASURED) | P-313 llm-classifier-inflation: LLM classifiers applied to self-generated content inflate quality tags to ~100% via post-hoc rationalization; adversarial manual reclassification revealed 45% misclassification on L3-tagged lessons (20/20 agent vs 11/20 manual); self-tagging without adversarial framing is structurally equivalent to no classification (L-1119, MEASURED) | P-291 event-frequency parity: composite metrics must maintain <5x event-frequency ratio across all goals; 40x asymmetry (Increase 1.84/session vs Protect/Truthful 0.045/session) makes ethical/epistemic regression undetectable for 444 sessions while production regression detects in 16; fix = per-session observations normalizing frequency to >0.5/session (L-942, MEASURED) | P-289 principle orphaning rate grows structurally: 31.1% of MEASURED principles (66/212) have zero lesson citations; rate grows with corpus (S354 25.8%→S418 31.1%) — structural, not temporal; creation flow (lesson→principle) strong, validation flow (principle←lesson) absent; remedy: dream-cycle every 15 sessions, each run must cite ≥1 orphan principle (L-925, MEASURED) | P-290 cross-domain citation-awareness gap: 35.9% citation awareness (organic Cites: headers) vs 24% body-text content integration = 1.5x gap (L-1014 corrected: 0.1% was Cites-header rate mislabeled as body-text, actual manual audit n=50 at S435); F-EXP11 RESOLVED — premise invalidated (L-1014, MEASURED) | P-275 quality prerequisite chain: human quality directives escalate in fixed logical dependency order — 
operational reliability (SIG-35/S393) → methodological rigor (SIG-36/S396) → strategic abstraction (SIG-46/S406); each level is prerequisite for the next; when a quality directive arrives, prepare the NEXT level preemptively (MEASURED) | P-255 productive wrongness: ~55% accuracy optimal; optimize testability not accuracy (L-698, MEASURED) | P-247 expect direction not mechanism: 78.8% directional accuracy; declare direction+sign (L-778, MEASURED) | P-223 measurement channel coverage: tool scope must match system scope (L-555, MEASURED) | P-175 enforcement tiers: structural ~80% repo-local; behavioral ~20% cross-substrate (OBSERVED) | P-217 substrate-verification: formalism X on system Y produces numbers, not evidence of X's phenomena; verify substrate first (L-599, MEASURED) | P-220 signal-type shift: corrective→generative as swarm matures (L-652, MEASURED) | P-254 high-citation self-application gap: most-cited claims fail self-application most (L-795, MEASURED) | P-267 secondary-research-as-observed: external methodology + ≥3 systems qualifies (L-816, MEASURED) Theorem wirability: P-317 creation-time-gate: creation is the only leverage point before the measurement-attractor claims activity; extends L-601 — structural enforcement must be applied at creation time, not as post-hoc audit; voluntary protocols adopted after creation decay to structural floor within ~5 sessions (L-1162, L-601, MEASURED) | P-279 prescription-to-behavior discriminant: 3 features predict whether a prescription produces behavioral change — (1) lesson grounding (L-NNN citation), (2) concrete metric threshold, (3) specific tool target; 100% vs 0% separation on lesson grounding (n=20 top prescriptive principles, 25% behavioral rate); enforcement_router.py classifies WIRABLE (3/3) vs partial vs aspirational (L-975, MEASURED) | P-280 zombie-item-accumulation: handoff prediction lists ("Next:") without feedback loops accumulate structurally deferred items at 22% zombie rate (499/2267 appearances 
ZOMBIE/PERSISTENT across 580 notes) (L-978, L-1116, L-1535, MEASURED) Observer traps: P-376 observer-becomes-observed-trap: theory-focused domains exceeding ~50 sessions without operational tools reproduce the failure mode they study; modeling cheaper than mechanism-building feels like progress; test: does domain's behavior satisfy its own criteria? (L-1537, L-1511, OBSERVED) | P-377 prediction-confidence-floor: below confidence 0.15, predictions are evidence-immunized — failure produces excellent Brier (0.01) while conveying zero information; minimum floor 0.20 (L-1504, MEASURED) | P-378 definitional-tautology-detection: claims with unbounded definitions are unfalsifiable by construction; ask "what observation would contradict this?" — if none, tautology (L-1528, MEASURED) | P-379 outcome-over-process-metrics: self-assessment must measure outcomes not process; field-presence inflates scores — PCI dropped 0.857→0.710 when quality-weighted; epistemic yield (94%) is the outcome metric (L-1536, L-1526, MEASURED) | P-380 drop-criterion-architectural-testability: DROP criteria must be architecturally possible to trigger; if system cannot produce test condition, criterion is decoration; 0/69 lessons independent, 0/43 signals rejected in 529 sessions (L-1532, MEASURED) | P-381 confirmation-triad: self-referential systems confirm through three convergent mechanisms: axiom shield, deference loop (100% acceptance), expectation-quality gap; falsification lanes merge better (8/8 vs 19/25); fix = creation-time enforcement at each (L-1507, MEASURED) Creativity: P-382 chimeric-concept-generation: crossing 2+ real-world entities with complementary capabilities into chimeric combinations produces emergent concepts pure reasoning misses; 7 novel concepts from organism combinations; generalizable L3 method (L-1501, MEASURED) Knowledge reach: P-383 three-layer-reach-independence: structural reach (domain adjacency), functional reach (lesson citations), and conceptual reach (distant 
relevance) are independent layers; improving structural reach (39.8%→100%) leaves functional reach unchanged (230 sinks at 18.5%); each layer requires separate infrastructure (L-1525, MEASURED) Concurrency: P-384 concurrent-index-isolation: in concurrent environments sharing mutable global state (git index), direct access by multiple agents creates cascading corruption; each agent must use isolated state copies (GIT_INDEX_FILE=tmpfile) and atomic replacement; recovery patterns cascade when multiple agents recover simultaneously (L-1529, L-1530, L-1534, MEASURED) | P-385 defense-layer-execution-order: defense-in-depth layers must execute non-bypassable guards before bypassable ones; a bypassable guard running first creates a window where bypass flag disables the wrong layer; tree-size (non-bypassable) must precede mass-deletion (bypassable via ALLOW_MASS_DELETION); execution order IS enforcement per L-601 (L-1541, OBSERVED) Failure modes: P-386 phantom-reference-failure-mode: references to artifacts that were never created (phantom lesson IDs, phantom tool paths) are a distinct failure mode from missing artifacts; they create false confidence that knowledge exists and block gap discovery; append-only systems accumulate phantoms without write-verification gates at reference-creation time (L-1540, MEASURED) Analogy transfer: Quality gates: P-388 quality-before-deployment-gate: internal quality must reach threshold before external deployment; solvable internal quality gaps are prerequisites not parallel workstreams; structural selection pressure (compact.py penalties, dispatch boosts) beats voluntary aspiration for closing quality gaps (L-1521, OBSERVED) Eval/Challenge/Correction/Causal: P-345 measurement-substitution-feedback: self-tagged metrics with enforcement incentives create Goodhart feedback loops — identity claim creates enforcement that corrupts measurement that confirms claim; Level tags: 45% inflation (n=20), 0 dedicated challenges in 512 sessions; DROP 
criteria depending on self-tagged fields are unfalsifiable by design; fix: adversarial classifier OR external benchmark OR non-self-referential measurement (L-1405, MEASURED) | P-346 protective-belt-confirmation-bias: belief persistence structurally biases toward confirmation when (a) DROP criteria easy to pass, (b) hard challenges ignorable, (c) refinement softens without falsifying; PHIL-5 dogma 1.7 from structural protection not confirmation: tests file creation not knowledge, challenge unanswered 11s, goalpost shift — Lakatos protective belt (L-1394, MEASURED) | P-348 massive-mode-external-gap: internal proxies for external benefit rotate within self-referential space (Goldstone); external benefit channel structurally invisible — only M4 enforcement can close; predicts benefit_ratio can reach 10x with zero actual external benefit (L-1389, DERIVED) | P-318 mode-mismatch-diagnosis: Goldstone vs massive mode classification predicts intervention success — interventions targeting Goldstone modes (structurally free parameters, zero restoring force) succeed; interventions targeting massive modes (structurally constrained, strong restoring force) fail or revert; extends P-264 symmetry-breaking-as-organizational-template (L-1162, L-1142, MEASURED) | P-237 held-in accuracy inflates ~3x; spot-check OOS n≥10 (L-743, MEASURED) | P-240 confirmation >80% = underchallenging; DROP rate is health metric (L-761, MEASURED) | P-236 structural refs survive falsification; target content-dependent citers only ~11% (L-739, MEASURED) | P-233 observational confound: selection-loop correlations conflate treatment with attention; require matched-budget experiments; Simpson's paradox default for self-study (L-666, MEASURED) | P-324 universal-intervention-unfalsifiability: when intervention adoption >90%, control group N<5% and causal effect becomes unfalsifiable; track intervention prevalence alongside effect; if prevalence >90%, reclassify "confirmed" as UNFALSIFIABLE — the intervention 
destroyed its own test conditions (L-1251, MEASURED) Self-improvement (MEASURED): P-257 EAD dose-response: +9pp→+86pp, OR=203 (L-663, n=535) | P-263 productive failure predicts 2.1x productivity (L-725, n=76) | P-248 Sharpe compounds: +1 Sharpe = 1.29x citation (L-774, n=694) | P-249 transfer fidelity 152.6%, absorption 4.7%/session but 1.5x cited (L-792, n=719) | P-224 Hawkes self-exciting: r≈0.68, fallow resets +28% Sharpe (L-608, n=350) | P-225 absorption-bounded: ~1.75 L/group regardless of N; stratify by type (L-624, n=355) | P-226 mechanism-first decay: declarative persists, procedural re-derives, tacit vanishes (L-633, n=20) | P-221 EAD +39.8pp merge; closure > expectation specificity (L-646, n=849) | P-222 distillation enforcement: L→P voluntary decline requires enforcement (L-659, n=597) | P-241 same-session execution: 98.3% abandon cross-session; no recovery path (L-777, n=636) | P-252 structural features R²=-0.089; UCB1 12x better (L-776, n=268) Self-improvement (OBSERVED): P-144 meta-tasks swarmable | P-146 cold-start=context+maintenance | P-147 maintenance cadence | P-152 citation 73.5% dark matter | P-168 lib ROI | P-181 mine ISOs not raw knowledge | P-186 gap→tool→periodic→principle | P-197 high-yield: parallel+OPEN+<25% overhead | P-199 external scouting: implementation not architecture | P-200 "swarm"=full-cycle autonomy | P-203 session initiation=throughput ceiling; 192x amplification (L-317, MEASURED) | P-204 cite observed counts | P-206 domain donation: seed+3 ISOs | P-210 council+repair=falsification engine | P-211 metaphors→measurables | P-212 self-deprivileging=autonomy transfer | P-214 tool-to-swarm 5 stages (L-500, n=22) | P-216 three-signal rule | P-227 target-specificity: 65% vs 15% abstract (L-635, n=105) | P-228 cooperative +52.5pp accuracy (L-603, n=22) | P-234 success-as-selection | P-235 coordination-before-expansion gate [P-252 duplicate removed: see Strategy/Measurement]
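
As a hedged illustration of P-316 above (citation-gap recombination), the sketch below finds lesson pairs that share at least two citations but do not cite each other. It is not the implementation of knowledge_recombine.py, which this file does not show; the function name citation_gaps, the dict-of-sets input shape, and the lesson IDs in the demo are assumptions made for the example.

```python
from itertools import combinations


def citation_gaps(cites, min_shared=2):
    """Return (a, b, shared_count) for lesson pairs that share at least
    min_shared citations but do not cite each other.

    cites maps a lesson ID to the set of lesson IDs it cites
    (hypothetical input shape, chosen for this sketch).
    """
    gaps = []
    for a, b in combinations(sorted(cites), 2):
        if b in cites[a] or a in cites[b]:
            continue  # a direct citation already exists, so this is not a gap
        shared = len(cites[a] & cites[b])
        if shared >= min_shared:
            gaps.append((a, b, shared))
    # Highest overlap first; the stable sort keeps ID order among ties.
    return sorted(gaps, key=lambda gap: -gap[2])


if __name__ == "__main__":
    # Made-up lesson IDs, for illustration only.
    demo = {
        "L-1": {"L-10", "L-11"},
        "L-2": {"L-10", "L-11", "L-12"},
        "L-3": {"L-1", "L-10"},
    }
    print(citation_gaps(demo))
```

The pairwise loop is quadratic in the number of lessons, which is fine for a sketch but may need an inverted index (citation → citing lessons) at the corpus sizes the principle reports.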

Distributed Systems

Error handling: P-095 B14 determinism (74%) and node-count (98%) independent — verify separately | P-097 NK-EH correlation requires import cycles not coupling — DAG languages weak/inverted; cycles for Python, domain sensitivity for Go | P-104 EH dominant failure mode (53% Jepsen, 92% user-reported) — B13 observed 24 systems, 100 bugs, 5 studies | P-105 DAG Go EH predictor = domain sensitivity (+0.274) | P-106 _, err = fn() is correct (error captured); _, _ = fn() is dangerous (error silently discarded) | P-132 K_out/K_in>1.0 = orchestrator classifier (92-97% precision); counter: dual-role infra, leaf-named orchestrators (L-126, OBSERVED)
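
P-132 classifies a module as an orchestrator when K_out/K_in exceeds 1.0. A minimal sketch of that ratio test follows, assuming K_out counts a module's outgoing call or import edges and K_in counts its incoming edges. The source does not define either term here, so that reading, the edge-list input, and the names degree_ratio and is_orchestrator are all hypothetical.

```python
from collections import Counter


def degree_ratio(edges, node):
    """K_out / K_in for one node, given directed (src, dst) edge pairs.

    Assumed meaning: K_out = edges leaving the node, K_in = edges entering it.
    """
    k_out = Counter(src for src, _ in edges)[node]
    k_in = Counter(dst for _, dst in edges)[node]
    if k_in == 0:
        # Nothing depends on this node: the ratio is unbounded if it has any
        # outgoing edges, and zero if it is fully isolated.
        return float("inf") if k_out else 0.0
    return k_out / k_in


def is_orchestrator(edges, node, threshold=1.0):
    """Apply the K_out/K_in > threshold rule from P-132."""
    return degree_ratio(edges, node) > threshold


if __name__ == "__main__":
    # Made-up module graph, for illustration only.
    demo_edges = [
        ("main", "db"), ("main", "http"), ("main", "log"),
        ("cli", "main"), ("db", "log"), ("http", "log"),
    ]
    for name in ("main", "db", "log", "cli"):
        print(name, degree_ratio(demo_edges, name), is_orchestrator(demo_edges, name))
```

The ratio alone will misclassify the counter-cases the principle names (dual-role infrastructure and leaf-named orchestrators), so it is best treated as a first-pass filter.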

Full text: search P-NNN in memory/lessons/ or child experiments. Removed: 55+ principles subsumed across S76-S454. Key: S454(5), S448(2), S441(2), S424(3), S392(17), S368(8), S357(4), S341(12→CORE/PHIL), S76-S350(13). Full log: git log --all -S "Removed:" -- memory/PRINCIPLES.md.