Heuristic Scorecard

Which market decision was made on which heuristic — every named heuristic (P-NNN / L-NNN / ISO-N) the swarm cited to drive a forecast, scored by how the market resolved those calls. Autodiff/backtesting on verbal statements.
🟢 live tended 2026-06-08 heuristics finance calibration credit-assignment backtesting external-output F-COMP1

Auto-generated by tools/render_heuristics_page.py from experiments/finance/heuristics/scorecard.json

5 market-validated heuristics  ·  25 resolved calls  ·  corpus Brier 0.214  ·  mean verbal-Sharpe -0.093

Each forecast on the dashboard names the heuristics that drove it. When the market resolves the call, its Brier score is split back across those heuristics by weight — credit assignment on verbal statements. Heuristics that keep being right rise; ones that keep being wrong are pruned or compacted.

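Last updated: 2026-06-08. verbal-Sharpe = (0.25 − mean Brier contribution) × √(calls). The 0.25 baseline is the Brier score of a 50% coin-flip forecast, so a positive value beats a coin flip; the √(calls) factor scales the score, in either direction, with the number of resolved calls behind it.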

  • Top: CAND-005 (verbal-Sharpe +0.048)
  • Bottom: CAND-002 (verbal-Sharpe -0.390)
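As a quick check of the verbal-Sharpe formula above, here is a minimal sketch that recomputes two scores from the per-call Brier values shown in the reverse trace. The function name `verbal_sharpe` and the neutral return for an empty list are assumptions for illustration, not the code in tools/heuristic_backtest.py. A plain mean is used because every market-validated call on this page carries weight 1.00.

```python
import math

def verbal_sharpe(brier_contributions, baseline=0.25):
    """(baseline - mean Brier contribution) * sqrt(number of calls)."""
    n = len(brier_contributions)
    if n == 0:
        return 0.0  # assumption: a heuristic with no resolved calls scores neutral
    mean_brier = sum(brier_contributions) / n
    return (baseline - mean_brier) * math.sqrt(n)

# Per-call Brier values copied from the reverse trace on this page.
cand_004 = [0.3025, 0.2025, 0.2025]
cand_001 = [0.25, 0.25, 0.3025, 0.49, 0.36, 0.2025]

print(f"CAND-004 {verbal_sharpe(cand_004):+.3f}")  # +0.025
print(f"CAND-001 {verbal_sharpe(cand_001):+.3f}")  # -0.145
```

Both results agree with the scorecard rows for CAND-004 and CAND-001.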

Literature statements — backtested on history

The fast loop: clear empirical claims from the finance literature, each encoded as a price-derivable rule and backtested walk-forward on ~10 years of data. OOS Sharpe is the out-of-sample figure, computed on the held-out last 30% of the window, and is the one to trust. The backtest applies no transaction costs, the split controls in-sample bias, and any result with OOS N < 20 should be read as direction-only.

| Statement | Claim | Citation | OOS Sharpe | OOS N | Best asset (OOS Sharpe) |
|-----------|-------|----------|-----------:|------:|-------------------------|
| STMT-001 12-1 momentum | An asset trading above its level of ~12 months ago (skipping the… | Jegadeesh & Titman (1993); Asness, Moskowitz & Pedersen (2013) | +0.70 | 228 | GLD (+1.55) |
| STMT-005 52-week-high breakout | An asset trading near its trailing 52-week high tends to continu… | George & Hwang (2004) 'The 52-Week High and Momentum Investing' | +0.50 | 81 | GLD (+2.46) |
| STMT-002 Trend (50/200 MA cross) | When the 50-day moving average is above the 200-day, the trend i… | Moskowitz, Ooi & Pedersen (2012) 'Time Series Momentum' | +0.39 | 234 | GLD (+1.47) |
| STMT-004 Low-volatility regime | Equities in a low realized-volatility regime earn better risk-ad… | Ang, Hodrick, Xing & Zhang (2006) 'The Cross-Section of Volatility and Expected Returns' | +0.00 | 2 | SPY (+0.00) |
| STMT-003 Short-horizon mean reversion | Over a few weeks, an asset stretched far from its recent mean te… | Jegadeesh (1990) 'Evidence of Predictable Behavior of Security Returns' | -0.05 | 194 | EEM (+0.81) |

Backtest window 2015-01-01 → 2026-06-02. Caveats: in-sample bias (OOS reported separately); NO transaction costs / slippage; single-asset; non-stationarity: historical edge may not persist (see live verbal-Sharpe).
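To make the out-of-sample idea concrete, the sketch below encodes the 50/200 moving-average rule (STMT-002) and scores it only on the last 30% of the strategy's returns. It illustrates the hold-out split rather than the project's backtester: the function names, the long-or-flat signal, daily bars, and the √252 annualisation are all assumptions, and it makes no claim about how the page counts OOS N.

```python
import math
import statistics

def moving_average(prices, window, t):
    """Mean of the `window` prices ending at index t (inclusive)."""
    return sum(prices[t - window + 1 : t + 1]) / window

def ma_cross_signal(prices, fast=50, slow=200):
    """+1 when the fast MA is above the slow MA, else 0; None during warm-up."""
    signal = []
    for t in range(len(prices)):
        if t + 1 < slow:
            signal.append(None)  # not enough history for the slow average yet
        else:
            above = moving_average(prices, fast, t) > moving_average(prices, slow, t)
            signal.append(1 if above else 0)
    return signal

def oos_sharpe(prices, signal, oos_frac=0.30, periods_per_year=252):
    """Annualised Sharpe of the rule on the held-out final `oos_frac` of returns."""
    strat = []
    for t in range(len(prices) - 1):
        if signal[t] is None:
            continue
        next_return = prices[t + 1] / prices[t] - 1
        # The position for day t+1 uses only the signal known at the close of day t.
        strat.append(signal[t] * next_return)
    n_oos = round(len(strat) * oos_frac)
    oos = strat[len(strat) - n_oos:]
    if len(oos) < 2:
        return 0.0, len(oos)
    spread = statistics.stdev(oos)
    if spread == 0:
        return 0.0, len(oos)
    return statistics.mean(oos) / spread * math.sqrt(periods_per_year), len(oos)

# Synthetic, deterministic prices purely for demonstration (not market data).
prices = [100 * (1.0005 ** i) * (1 + 0.02 * math.sin(i / 15)) for i in range(600)]
sharpe, n = oos_sharpe(prices, ma_cross_signal(prices))
print(f"OOS Sharpe {sharpe:+.2f} over {n} held-out observations")
```

Because the signal at day t is applied to the return from t to t+1, the sketch avoids look-ahead; like the table above, it ignores transaction costs and slippage.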

Comprehensive heuristic — today's composite decision

The decision step: for each asset, the applicable statements are combined into one call, weighted by each statement's OOS Sharpe on that asset. A statement with no historical edge on an asset gets zero weight, so the system stays silent where it has no basis. Each decision shows exactly which statement pushed it, and how hard.

  • EEM → BULL (conf 0.65, score +0.57, 4 statements): STMT-005 +1×w1.59, STMT-001 +1×w0.86, STMT-003 -1×w0.81, STMT-002 +1×w0.48
  • GLD → BULL (conf 0.64, score +0.55, 3 statements): STMT-001 +1×w1.55, STMT-002 +1×w1.47, STMT-005 +0×w2.46
  • IWM → NEUTRAL (conf 0.5, score +0.00, 0 statements): —
  • QQQ → BULL (conf 0.7, score +1.00, 3 statements): STMT-005 +1×w1.48, STMT-001 +1×w1.35, STMT-002 +1×w0.90
  • SPY → BULL (conf 0.7, score +1.00, 3 statements): STMT-005 +1×w1.11, STMT-001 +1×w1.04, STMT-002 +1×w0.82
  • WTI → NEUTRAL (conf 0.5, score +0.00, 0 statements): —

Computed 2026-06-02. Running composite_heuristic.py --asset <X> --register files the decision as a pre-registered prediction, handing it to the live loop below.
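The scores in the list above are consistent with a weight-normalised vote, score = Σ(signal × weight) / Σ(weight). The sketch below reproduces the EEM and GLD scores under that reading. It is an inference from the displayed numbers, not the code in composite_heuristic.py; the name `composite_score` is hypothetical, and the mapping from score to confidence is not reproduced because the page does not state it.

```python
def composite_score(votes):
    """votes: list of (signal, weight) pairs.

    signal is -1, 0, or +1; weight is the statement's OOS Sharpe on the asset.
    Returns the weight-normalised sum, or 0.0 when no statement applies.
    """
    total_weight = sum(weight for _, weight in votes)
    if total_weight == 0:
        return 0.0  # no statement with an edge: stay silent
    return sum(signal * weight for signal, weight in votes) / total_weight

# (signal, weight) pairs copied from the decision list on this page.
eem = [(+1, 1.59), (+1, 0.86), (-1, 0.81), (+1, 0.48)]
gld = [(+1, 1.55), (+1, 1.47), (0, 2.46)]

print(f"EEM {composite_score(eem):+.2f}")  # +0.57
print(f"GLD {composite_score(gld):+.2f}")  # +0.55
print(f"IWM {composite_score([]):+.2f}")   # +0.00
```

Note that for GLD the zero-signal statement (STMT-005) still counts in the denominator, which is why two agreeing statements yield +0.55 rather than +1.00.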

Market-validated heuristics

Heuristics attached to a prediction while it was still open (pre-registered), then scored when the market resolved it. This is the only set that earns verbal-Sharpe.

| Heuristic | Claim (verbal statement) | Domains | #Calls | Hit-rate | Mean Brier | verbal-Sharpe | Trend | Verdict |
|-----------|--------------------------|---------|-------:|---------:|-----------:|--------------:|-------|---------|
| CAND-005 | Defensive rotation: utilities/healthcare (non-cyclical, AI-electric… | finance, forecasting | 1 | 0% | 0.203 | +0.048 | | 👁 watch |
| CAND-004 | Stretched-level consolidation: an over-extended asset at an extreme… | finance, forecasting, health | 3 | 67% | 0.236 | +0.025 | | ⬆️ promote |
| CAND-003 | Divergence mean-reversion: a lagging asset catches up to its driver… | finance, forecasting | 1 | 0% | 0.250 | +0.000 | | 👁 watch |
| CAND-001 | 2-week momentum continuation: a leading asset in a risk-on regime c… | cryptocurrency, finance, forecasting | 6 | 0% | 0.309 | -0.145 | | ⬇️ prune |
| CAND-002 | Low-VIX-in-conflict mean reversion: VIX below 20 during active mili… | finance, forecasting | 1 | 100% | 0.640 | -0.390 | | 👁 watch |
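The page does not state the thresholds behind the Verdict column. One rule that happens to reproduce all five rows is sketched below: hold a heuristic on watch until it has a minimum number of resolved calls, then promote on a positive verbal-Sharpe and prune on a negative one. The function name and the `min_calls=3` cutoff are guesses chosen to fit this table, not the evolution loop's actual logic.

```python
def verdict(calls, verbal_sharpe, min_calls=3):
    """Hypothetical verdict rule; min_calls=3 is an assumed cutoff."""
    if calls < min_calls:
        return "watch"    # too few resolved calls to judge either way
    if verbal_sharpe > 0:
        return "promote"
    if verbal_sharpe < 0:
        return "prune"
    return "watch"        # exactly at the coin-flip baseline

# (heuristic, #calls, verbal-Sharpe) copied from the table above.
rows = [
    ("CAND-005", 1, +0.048),
    ("CAND-004", 3, +0.025),
    ("CAND-003", 1, +0.000),
    ("CAND-001", 6, -0.145),
    ("CAND-002", 1, -0.390),
]
for name, calls, score in rows:
    print(name, verdict(calls, score))
```

Under this reading, CAND-002 stays on watch despite the worst score because a single call is too little evidence to prune on.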

Decision → heuristic traces

Expand a heuristic to see every call it drove and how each resolved.

CAND-005: verbal-Sharpe +0.048 over 1 call(s) · 👁 watch

> Defensive rotation: utilities/healthcare (non-cyclical, AI-electricity tailwind) outperform on a geopolitical/VIX shock.

| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0030 | driver | 1.00 | ❌ INCORRECT | 0.203 | ✗ |

CAND-004: verbal-Sharpe +0.025 over 3 call(s) · ⬆️ promote

> Stretched-level consolidation: an over-extended asset at an extreme consolidates sideways rather than breaking out or down over ~2 weeks.

| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0024 | driver | 1.00 | ❌ INCORRECT | 0.302 | ✗ |
| PRED-0027 | driver | 1.00 | ✅ CORRECT | 0.203 | ✔ |
| PRED-0029 | driver | 1.00 | ✅ CORRECT | 0.203 | ✔ |

CAND-003: verbal-Sharpe +0.000 over 1 call(s) · 👁 watch

> Divergence mean-reversion: a lagging asset catches up to its driver when their historical relationship diverges (e.g. WTI->XLE) within 2-3 weeks.

| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0025 | driver | 1.00 | ❌ INCORRECT | 0.250 | ✗ |

CAND-001: verbal-Sharpe -0.145 over 6 call(s) · ⬇️ prune

> 2-week momentum continuation: a leading asset in a risk-on regime continues its trend over a ~2-week window.

| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0019 | driver | 1.00 | ❌ INCORRECT | 0.250 | ✗ |
| PRED-0020 | driver | 1.00 | ❌ INCORRECT | 0.250 | ✗ |
| PRED-0021 | driver | 1.00 | ❌ INCORRECT | 0.302 | ✗ |
| PRED-0022 | driver | 1.00 | ❌ INCORRECT | 0.490 | ✗ |
| PRED-0026 | driver | 1.00 | ❌ INCORRECT | 0.360 | ✗ |
| PRED-0028 | driver | 1.00 | ❌ INCORRECT | 0.203 | ✗ |

CAND-002: verbal-Sharpe -0.390 over 1 call(s) · 👁 watch

> Low-VIX-in-conflict mean reversion: VIX below 20 during active military conflict reverts upward within ~2 weeks.

| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0023 | driver | 1.00 | ✅ CORRECT | 0.640 | ✔ |

Reverse trace — which heuristics drove each resolved call

The macro view: open a forecast and see the named heuristics behind it, their weights, and whether each was vindicated by the outcome.

  • PRED-0019 BULL SPY → ❌ INCORRECT (Brier 0.25): CAND-001(100%)
  • PRED-0020 BULL QQQ → ❌ INCORRECT (Brier 0.25): CAND-001(100%)
  • PRED-0021 BULL BTC → ❌ INCORRECT (Brier 0.3025): CAND-001(100%)
  • PRED-0022 BULL WTI → ❌ INCORRECT (Brier 0.49): CAND-001(100%)
  • PRED-0023 BULL VIX → ✅ CORRECT (Brier 0.64): CAND-002(100%)
  • PRED-0024 NEUTRAL GLD → ❌ INCORRECT (Brier 0.3025): CAND-004(100%)
  • PRED-0025 BULL XLE → ❌ INCORRECT (Brier 0.25): CAND-003(100%)
  • PRED-0026 BEAR DXY → ❌ INCORRECT (Brier 0.36): CAND-001(100%)
  • PRED-0027 NEUTRAL AAPL → ✅ CORRECT (Brier 0.2025): CAND-004(100%)
  • PRED-0028 BULL IWM → ❌ INCORRECT (Brier 0.2025): CAND-001(100%)
  • PRED-0029 NEUTRAL XLV → ✅ CORRECT (Brier 0.2025): CAND-004(100%)
  • PRED-0030 BULL XLU → ❌ INCORRECT (Brier 0.2025): CAND-005(100%)
  • PRED-0001 BEAR SPY → ❌ INCORRECT (Brier 0.04): ISO-4(33%), ISO-11(33%), ISO-2(33%) (backfill — illustrative)
  • PRED-0004 BULL GLD → ❌ INCORRECT (Brier 0.04): ISO-8(100%) (backfill — illustrative)
  • PRED-0005 BEAR QQQ → ❌ INCORRECT (Brier 0.04): ISO-2(100%) (backfill — illustrative)
  • PRED-0008 BULL VIX → ❌ INCORRECT (Brier 0.04): ISO-13(33%), L-1287(33%), L-1289(33%) (backfill — illustrative)
  • PRED-0015 BEAR AAPL → ❌ INCORRECT (Brier 0.04): ISO-2(100%) (backfill — illustrative)
  • PRED-0003 BULL TLT → ➕ PARTIAL (Brier 0.135): ISO-4(100%) (backfill — illustrative)

Illustrative backfill (excluded from scoring)

Heuristics reconstructed from the theses of already-resolved predictions. They show the mechanism working, but they are hindsight, attached after the outcome was known, so the verbal-Sharpe values listed here are illustrative only: they are excluded from scoring and never drive evolution verdicts. Note that a heuristic can show positive calibration skill at a 0% direction hit-rate: a wrong-but-low-confidence call has a low Brier, which is good calibration.

| Heuristic | Claim (verbal statement) | Domains | #Calls | Hit-rate | Mean Brier | verbal-Sharpe | Trend | Verdict |
|-----------|--------------------------|---------|-------:|---------:|-----------:|--------------:|-------|---------|
| ISO-2 | Sector rotation to utilities/energy signals defensive positioning (… | control-theory, evolution, finance… | 3 | 0% | 0.040 | +0.364 | | ◻️ illustrative |
| ISO-11 | Oil shock from Strait of Hormuz closure propagates through economy … | control-theory, finance, history… | 1 | 0% | 0.040 | +0.210 | | ◻️ illustrative |
| ISO-8 | ISO-8 power law: gold tends to move in explosive bursts, not gradua… | distributed-systems, game-theory, history… | 1 | 0% | 0.040 | +0.210 | | ◻️ illustrative |
| ISO-13 | ISO-13 integral windup: Hormuz closure is accumulating economic dam… | catastrophic-risks, control-theory, finance… | 1 | 0% | 0.040 | +0.210 | | ◻️ illustrative |
| L-1287 | The damage accumulates like debt (Hindu karma pattern, L-1287). | catastrophic-risks, control-theory, finance… | 1 | 0% | 0.040 | +0.210 | | ◻️ illustrative |
| L-1289 | Unanimity-as-failure (L-1289): VIX has been trading in 20-30 range … | catastrophic-risks, control-theory, finance… | 1 | 0% | 0.040 | +0.210 | | ◻️ illustrative |
| ISO-4 | CAPE at 39 is 2x historical avg — only 1929 and 2000 were comparabl… | control-theory, finance, history… | 2 | 38% | 0.111 | +0.196 | | ◻️ illustrative |
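The point about calibration versus hit-rate is easier to see with the Brier score written out. The sketch assumes the usual binary form, (stated probability − outcome)², with outcome 1 when the call was right and 0 when it was wrong. The probabilities used are back-solved from Brier values on this page, not read from the registry, so treat them as illustrative.

```python
def brier(prob, outcome):
    """Binary Brier score: squared gap between stated probability and outcome."""
    return (prob - outcome) ** 2

# Wrong call made at low confidence: small loss, beats the 0.25 coin-flip baseline.
print(round(brier(0.20, 0), 4))  # 0.04, the value shown for the backfill calls
# Wrong call made at high confidence: large loss.
print(round(brier(0.70, 0), 4))  # 0.49, the value shown for PRED-0022
# Right call made at low confidence: also a large loss.
print(round(brier(0.20, 1), 4))  # 0.64, the value shown for PRED-0023
```

The last case would explain why CAND-002 pairs a 100% hit-rate with the lowest verbal-Sharpe on the scorecard: the direction was right, but the stated confidence was low.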

Pending — pre-registered attributions (open calls)

Heuristics attached to open predictions before the outcome is known. These earn verbal-Sharpe the moment their call resolves — this is where forward credit comes from.

  • PRED-0031 BULL SPY/QQQ (conf 55%, due 2026-06-06): CAND-001(100%)
  • PRED-0002 BULL XLE (conf 30%, due 2026-06-20): ISO-13(100%)
  • PRED-0006 BULL BTC (conf 65%, due 2026-06-20): ISO-4(50%), L-1289(50%)
  • PRED-0007 BEAR DXY (conf 35%, due 2026-06-20): L-1287(50%), ISO-6(50%)
  • PRED-0009 BULL XLU (conf 30%, due 2026-06-20): CAND-005(100%)
  • PRED-0010 BULL XLV (conf 40%, due 2026-06-20): CAND-005(100%)

Probationary — foraged candidates (CAND-NNN)

Candidate heuristics minted from foraged papers (tools/paper_intake.py --emit-candidate-heuristic). They join the pool on probation: a prediction can cite one as a driver, and once enough such calls resolve it graduates to a P-NNN principle (PROMOTE) or is dropped (PRUNE).

| Candidate | Type | Source | Applied? | Claim |
|-----------|------|--------|----------|-------|
| CAND-001 | THESIS | thesis:PRED forecasting batch (S499-S547) | | 2-week momentum continuation: a leading asset in a risk-on regime continues i… |
| CAND-002 | THESIS | thesis:PRED forecasting batch (S499-S547) | | Low-VIX-in-conflict mean reversion: VIX below 20 during active military confl… |
| CAND-003 | THESIS | thesis:PRED forecasting batch (S499-S547) | | Divergence mean-reversion: a lagging asset catches up to its driver when thei… |
| CAND-004 | THESIS | thesis:PRED forecasting batch (S499-S547) | | Stretched-level consolidation: an over-extended asset at an extreme consolida… |
| CAND-005 | THESIS | thesis:PRED forecasting batch (S499-S547) | | Defensive rotation: utilities/healthcare (non-cyclical, AI-electricity tailwi… |

What is this?

The Forecast Dashboard records pre-registered market calls. This page answers a different question: which named heuristic was each decision made on, and did it earn its keep?

  • A heuristic is a named verbal rule the swarm already holds — a principle (P-NNN), a lesson (L-NNN), or a cross-domain isomorphism (ISO-N).
  • Each prediction attaches the heuristics that drove it, with a weight (how much each drove the call) and a role (driver / risk / counter).
  • When the call resolves, the Brier score is the loss signal, split across the driver heuristics by weight — the verbal analogue of backprop's credit assignment.
  • verbal-Sharpe rewards heuristics that are well-calibrated and repeatedly applied; the evolution loop promotes the top, prunes the negative, and compacts the redundant.
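The weight-based split described in the bullets can be sketched as a weighted mean: each resolved call contributes its Brier score to every driver heuristic in proportion to that heuristic's weight. The function name and data layout below are hypothetical, and the displayed 33% weights are taken to be 1/3. The example uses the illustrative backfill calls because they are the only ones on this page with fractional weights.

```python
import math
from collections import defaultdict

def attribute(resolved_calls, baseline=0.25):
    """resolved_calls: list of (brier, {driver_heuristic: weight}) pairs.

    Returns per-heuristic call count, weight-averaged Brier, and verbal-Sharpe.
    """
    weight_sum = defaultdict(float)
    weighted_brier = defaultdict(float)
    n_calls = defaultdict(int)
    for brier, drivers in resolved_calls:
        for heuristic, weight in drivers.items():
            if weight <= 0:
                continue  # a zero-weight attachment carries no credit or blame
            weight_sum[heuristic] += weight
            weighted_brier[heuristic] += weight * brier
            n_calls[heuristic] += 1
    stats = {}
    for heuristic, total in weight_sum.items():
        mean_brier = weighted_brier[heuristic] / total
        stats[heuristic] = {
            "calls": n_calls[heuristic],
            "mean_brier": mean_brier,
            "verbal_sharpe": (baseline - mean_brier) * math.sqrt(n_calls[heuristic]),
        }
    return stats

# Backfill calls from the reverse trace: (Brier, driver weights).
calls = [
    (0.04, {"ISO-4": 1 / 3, "ISO-11": 1 / 3, "ISO-2": 1 / 3}),  # PRED-0001
    (0.04, {"ISO-2": 1.0}),                                      # PRED-0005
    (0.04, {"ISO-2": 1.0}),                                      # PRED-0015
    (0.135, {"ISO-4": 1.0}),                                     # PRED-0003
]
for name, s in sorted(attribute(calls).items()):
    print(name, s["calls"], f"{s['mean_brier']:.3f}", f"{s['verbal_sharpe']:+.3f}")
```

Under these assumptions ISO-4 comes out at a mean Brier of 0.111 and a verbal-Sharpe of +0.196, matching its row in the illustrative backfill table; a plain unweighted mean would give 0.0875 instead, which suggests the weighting matters.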

New candidate heuristics enter the pool from foraged papers (tools/paper_intake.py --emit-candidate-heuristic, namespace CAND-NNN) and are validated the same way — external research in, market truth out.

Sources: experiments/finance/heuristics/scorecard.json (computed by tools/heuristic_backtest.py) over experiments/finance/predictions/registry.json.