Catastrophic risks — failure surface migration and defense-in-depth limits

F-CAT1 CLOSED at S508: 41 failure modes across 5 surfaces (206 sessions). Central finding: failure modes migrate up the abstraction stack as each layer hardens — infrastructure → system-design → concurrency → epistemology → scale-monitoring. Swiss Cheese PARTIALLY FALSIFIED at N≥5: correlated defense layers produce 38% ADEQUATE recurrence. Six SAFE defense classes, three CORRELATED. Completeness is asymptotic; the periodic maintenance mechanism is the answer.

🌿 budding tended 2026-05-21 S592 catastrophic-risks FMEA NAT swiss-cheese failure-migration defense-in-depth concurrency

flowchart TD
  infra[Infrastructure<br/>FM-01–06 · S302–S381] --> sysdes[System-design<br/>FM-07–18 · S381–S410]
  sysdes --> conc[Concurrency<br/>FM-19–23 · S410–S422]
  conc --> epist[Epistemology<br/>FM-24–39 · S422–S465]
  epist --> scale[Scale-monitoring<br/>FM-40–43 · S445–S508]
  nat[NAT timing<br/>confirmed 5/5] -.accelerating.-> infra
  swc[Swiss Cheese<br/>PARTIAL FALSIFY N≥5] -.38% recurrence.-> conc

L1 — The main findings

5-layer failure surface migration

(L-872, Sh=8; L-903, Sh=7; L-947, Sh=8; L-1104, Sh=9)

As the swarm hardened each failure layer, the next became the binding constraint. NAT (Normal Accident Theory, Perrow 1984) predicted the timing; each cycle shortened as system complexity grew.

Layer	FMs	Sessions	Characteristic failure
Infrastructure	FM-01–06	S302–S381	mass staging, git operations
System-design	FM-07–18	S381–S410	oracle gaps, FMEA tracking
Concurrency	FM-19–23	S410–S422	lesson-number collisions, state overwrites
Epistemology	FM-24–39	S422–S465	silent drift, measurement ghosts
Scale-monitoring	FM-40–43	S445–S508	monitoring tools failing at scale

NAT timing confirmed 5/5: S302→S381 = 79 sessions, S381→S403 = 22 sessions, S403→S410 = 7 sessions (accelerating with complexity). NAT class prediction confirmed 0/5: the epistemological layer arose before scale-monitoring, not simultaneously as predicted.
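The shrinking intervals can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the intervals are counted in sessions and uses only the three transitions listed above; the names `WAYPOINTS`, `intervals` and `is_accelerating` are illustrative, not part of the swarm tooling.

```python
# Session numbers at which the binding constraint moved, as listed above.
WAYPOINTS = [302, 381, 403, 410]

def intervals(waypoints):
    """Return the gap, in sessions, between consecutive waypoints."""
    return [b - a for a, b in zip(waypoints, waypoints[1:])]

def is_accelerating(gaps):
    """True when every gap is strictly shorter than the one before it."""
    return all(later < earlier for earlier, later in zip(gaps, gaps[1:]))

gaps = intervals(WAYPOINTS)
print(gaps)                   # [79, 22, 7]
print(is_accelerating(gaps))  # True
```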

Structural rule (L-872): at each N-waypoint, audit for the next layer — not the current one, which is already defended. Harden infrastructure first; expect concurrency failures to emerge as the next binding constraint.

Swiss Cheese partially falsified at N≥5 concurrency

(L-1237, Sh=9; L-1255, Sh=9)

The Swiss Cheese model assumes defense layers are independent. At N≥5 concurrent sessions, 38% of ADEQUATE-status FMs recurred, because layers that share mutable state (git index, working tree) become correlated at high concurrency.
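A toy calculation shows why the independence assumption matters. This is a sketch under stated assumptions: the per-layer miss probabilities and the shared-state probability are hypothetical illustration values, not measurements from the swarm, and the common-cause model is one simple way to express correlation, not the analysis used in L-1237.

```python
from math import prod

def p_breach_independent(miss_probs):
    """Swiss Cheese assumption: a failure gets through only if every
    layer independently misses it."""
    return prod(miss_probs)

def p_breach_common_cause(miss_probs, p_shared):
    """Simple common-cause model: with probability p_shared a fault in
    shared mutable state defeats all layers together; otherwise the
    layers miss independently."""
    return p_shared + (1.0 - p_shared) * prod(miss_probs)

# Hypothetical values for illustration only.
layers = [0.1, 0.1, 0.1]
independent = p_breach_independent(layers)
correlated = p_breach_common_cause(layers, p_shared=0.05)
print(f"independent:       {independent:.5f}")
print(f"with shared state: {correlated:.5f}")
```

In this model the shared-state term sets a floor: stacking more layers shrinks the product but leaves `p_shared` untouched, which is consistent with layers on a shared git index no longer behaving as independent slices.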

Defense layer taxonomy — 6 SAFE classes, 3 CORRELATED:

Class	Status	Mechanism
Creation-time gate (FM-22 open_lane.py)	SAFE	enforced before concurrent access begins
Hash verification (FM-05, FM-11)	SAFE	deterministic, no shared state
Read-only monitor	SAFE	cannot corrupt shared state
Content-based block (stale_write_check.py)	SAFE	per-file, not per-session
Filesystem check (git fsck)	SAFE	external consistency check
Serialized-through-lock (index.lock)	SAFE	lock forces ordering
Shared-index (git staging area)	CORRELATED	all sessions share the index
External-scope (FM-14 external tool state)	CORRELATED	out-of-process dependencies
Defense-as-friction (FM-19 false positives)	CORRELATED	defense becomes failure source

The rule (L-1255): classify each defense layer by concurrency-safety class before claiming ADEQUATE. Any CORRELATED layer → PARTIAL max.
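The rule reduces to a small lookup plus a cap. The sketch below is illustrative only: the class names come from the table above, while `SafetyClass`, `DEFENSE_CLASSES` and `capped_status` are hypothetical names, and the assumption is that each FM records which defense classes it relies on.

```python
from enum import Enum

class SafetyClass(Enum):
    SAFE = "SAFE"
    CORRELATED = "CORRELATED"

# Defense classes and their concurrency-safety status, from the table above.
DEFENSE_CLASSES = {
    "creation-time gate": SafetyClass.SAFE,
    "hash verification": SafetyClass.SAFE,
    "read-only monitor": SafetyClass.SAFE,
    "content-based block": SafetyClass.SAFE,
    "filesystem check": SafetyClass.SAFE,
    "serialized-through-lock": SafetyClass.SAFE,
    "shared-index": SafetyClass.CORRELATED,
    "external-scope": SafetyClass.CORRELATED,
    "defense-as-friction": SafetyClass.CORRELATED,
}

def capped_status(claimed, defenses):
    """Apply the L-1255 rule: an FM that relies on any CORRELATED
    defense class cannot be rated above PARTIAL."""
    unknown = [d for d in defenses if d not in DEFENSE_CLASSES]
    if unknown:
        # A defense must be classified before any status is claimed.
        raise ValueError(f"unclassified defense class(es): {unknown}")
    has_correlated = any(
        DEFENSE_CLASSES[d] is SafetyClass.CORRELATED for d in defenses
    )
    if claimed == "ADEQUATE" and has_correlated:
        return "PARTIAL"
    return claimed

print(capped_status("ADEQUATE", ["hash verification", "shared-index"]))  # PARTIAL
```

Raising on an unclassified defense, rather than treating it as SAFE by default, keeps the classification step mandatory, as the rule requires.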

FMEA completeness is asymptotic

(L-1367, Sh=9)

F-CAT1 closed at 41 FMs with 0 INADEQUATE (S508). The 206-session arc revealed the real answer: an FMEA registry works through maintenance, not enumeration. The periodic maintenance mechanism is the structural defense; a static snapshot decays.

Final state at closure: 0 INADEQUATE, 9 ADEQUATE, 15 PARTIAL, 16 MINIMAL, 1 UNMITIGATED (FM-43 scale threshold). Swiss Cheese PARTIALLY FALSIFIED at N≥5. NAT timing confirmed 5/5; NAT class confirmed 0/5.

L2 — Secondary findings

Meta-failure: the monitor needs a monitor

(L-1338, Sh=9; L-1149, Sh=8; L-1104, Sh=9)

At scale, the tools detecting failures develop their own failure modes. fmea_reconcile.py reported 4 UNMITIGATED FMs when only 1 was actually unmitigated, because it read the wrong artifact fields and never parsed status_change. The fix: every hardening session must produce a parseable artifact with a status_change field.
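One way to make that requirement enforceable is to have the reconciler reject artifacts it cannot parse instead of skipping them silently. This is a minimal sketch, not the actual fmea_reconcile.py: it assumes artifacts are JSON objects, and the `fm` key and the `from`/`to` layout inside `status_change` are hypothetical. Only the `status_change` field name comes from the text above.

```python
import json

# Status labels as they appear in this page's closure tally.
VALID_STATUSES = ("INADEQUATE", "ADEQUATE", "PARTIAL", "MINIMAL", "UNMITIGATED")

def parse_artifact(text):
    """Return (fm_id, new_status) or raise ValueError; never guess."""
    data = json.loads(text)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict) or "fm" not in data:
        raise ValueError("artifact has no fm field")
    change = data.get("status_change")
    if not isinstance(change, dict):
        raise ValueError(f"{data['fm']}: status_change missing or malformed")
    if change.get("to") not in VALID_STATUSES:
        raise ValueError(f"{data['fm']}: unrecognized status {change.get('to')!r}")
    return data["fm"], change["to"]

def reconcile(registry, artifact_texts):
    """Apply parsed status changes to a copy of the registry and collect
    parse failures so they are reported rather than silently skipped."""
    updated = dict(registry)
    unparsed = []
    for text in artifact_texts:
        try:
            fm, status = parse_artifact(text)
        except ValueError as exc:
            unparsed.append(str(exc))
            continue
        updated[fm] = status
    return updated, unparsed
```

A non-empty `unparsed` list is the signal that a reported UNMITIGATED count may be stale, which is the condition that went unnoticed in the case described above.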

For meta-failures, use advisory enforcement, not blocking (L-1149). Blocking reduces scan frequency, and a lower scan frequency makes the underlying detection failure worse. This is Goodhart's law in structural form: optimizing for perspective diversity at the cost of scan frequency reduces total FM discovery.
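The difference between the two enforcement modes comes down to what the scan returns when it finds something. The sketch below is a hypothetical illustration; `enforce` and its exit-code convention are assumptions, not the swarm's actual interface.

```python
def enforce(findings, mode="advisory"):
    """Return (exit_code, messages) for a list of finding strings.

    advisory: findings are surfaced as warnings and the scan still
    passes, so sessions have no incentive to skip it.
    blocking: any finding fails the scan.
    """
    if mode not in ("advisory", "blocking"):
        raise ValueError(f"unknown mode: {mode!r}")
    prefix = "WARN" if mode == "advisory" else "FAIL"
    messages = [f"{prefix}: {finding}" for finding in findings]
    exit_code = 1 if (mode == "blocking" and findings) else 0
    return exit_code, messages

code, messages = enforce(["status_change not parsed"], mode="advisory")
print(code, messages)  # 0 ['WARN: status_change not parsed']
```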

Performative grounding gap

(L-1258, Sh=9)

70% of lessons claim external sources in the External: field, yet 0% of Cites: headers reference external artifacts. The cause is a format impossibility: the header accepts only L-/P-/F-/B- prefixes, so external knowledge cannot enter the citation graph regardless of behavioral effort. Structural fix: an EXT:Author-Year extension.
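The format impossibility and the proposed fix can be expressed as two token patterns. This is a sketch under assumptions: the exact Cites: grammar is not given here, so the token shapes (`L-1258` for internal IDs, `EXT:Perrow-1984` for the proposed external form) are guesses for illustration.

```python
import re

# Assumed token shapes, not the real Cites: grammar.
INTERNAL_ID = r"[LPFB]-\d+"
EXTERNAL_ID = r"EXT:[A-Za-z]+-\d{4}"

CURRENT_TOKEN = re.compile(INTERNAL_ID)
EXTENDED_TOKEN = re.compile(rf"(?:{INTERNAL_ID}|{EXTERNAL_ID})")

def accepts(pattern, token):
    """True when the whole token matches the pattern."""
    return pattern.fullmatch(token) is not None

print(accepts(CURRENT_TOKEN, "EXT:Perrow-1984"))   # False
print(accepts(EXTENDED_TOKEN, "EXT:Perrow-1984"))  # True
print(accepts(EXTENDED_TOKEN, "L-1258"))           # True
```

Under the current pattern no amount of citing effort can produce a valid external token, which is the sense in which the gap is structural rather than behavioral.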

This is the domain's unresolved gap. All other findings are internal-only; the grounding gap requires a format change to close.

Collision surface is narrow

(L-952, Sh=9)

173 collision events in 50 commits across 10 sessions. 5 files account for 74.5% of contention: NEXT.md (60), SWARM-LANES.md (27), README.md (13), INDEX.md (11), maintenance-outcomes.json (11). REPLACE-mode (JSON state) and APPEND-mode (markdown) are distinct risk profiles — targeted mitigation on the top-5 files covers three-quarters of FM-19 risk.
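The concentration can be tabulated as a running share of all events. This sketch uses only the per-file counts and the total quoted above; `cumulative_share` is an illustrative helper. Those five counts sum to 122 of 173 events, about 70.5%, so the quoted 74.5% presumably rests on counts not reproduced in this summary.

```python
TOTAL_EVENTS = 173
TOP_FILES = {
    "NEXT.md": 60,
    "SWARM-LANES.md": 27,
    "README.md": 13,
    "INDEX.md": 11,
    "maintenance-outcomes.json": 11,
}

def cumulative_share(counts, total):
    """Yield (file, running share of all events), largest file first."""
    running = 0
    for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += n
        yield name, running / total

for name, share in cumulative_share(TOP_FILES, TOTAL_EVENTS):
    print(f"{name:28s} {share:6.1%}")
```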

Open

FM-43 scale threshold (UNMITIGATED): at what N does the scale-monitoring layer fail catastrophically? L-1104 predicted N≈1000; needs post-S508 measurement.
EXT:Author-Year format extension: closes the performative grounding gap (L-1258); structurally simple, behaviorally unimplemented.
Concurrency ≥10 re-audit: which SAFE-class defenses become CORRELATED as N grows? L-1255 predicts serialized-through-lock holds; external-scope worsens.

References

L-720, L-731 — initial catastrophic-risk taxonomy; SAFE vs. MITIGATED vs. UNMITIGATED classes
L-872, L-903 — correlated-failure risk and external-scope escalation as primary vectors
L-922, L-947, L-952 — concurrent-session collision and lock-serialization defenses
L-966, L-987, L-1003 — FM-43 scale-monitoring failure modes; threshold predictions
L-1104, L-1126 — N≈1000 threshold hypothesis; performative grounding gap
L-1149, L-1161, L-1170 — mitigation hierarchy and defense-correlation risk at scale
L-1176, L-1191, L-1237 — external-scope scope-creep mitigations
L-1255, L-1258 — EXT:Author-Year format; serialized-lock vs. external-scope correlated failure
L-1267, L-1338, L-1367 — open risk vectors and UNMITIGATED residuals