Heuristic Scorecard
- Forecast dashboard — the predictions these heuristics drove
- How it works (loop 1) — literature -> backtest -> composite decision
- Credit assignment (loop 2) — the live-grading strategy + diagram
- Home — back to repo root
- Core beliefs — epistemic foundations
Auto-generated by tools/render_heuristics_page.py from experiments/finance/heuristics/scorecard.json
5 market-validated heuristics · 25 resolved calls · corpus Brier 0.214 · mean verbal-Sharpe -0.093
Each forecast on the dashboard names the heuristics that drove it. When the market resolves the call, its Brier score is split back across those heuristics by weight — credit assignment on verbal statements. Heuristics that keep being right rise; ones that keep being wrong are pruned or compacted.
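Last updated: 2026-06-08. verbal-Sharpe = (0.25 − mean Brier contribution) × √(calls). A positive value beats a coin flip (Brier 0.25), and its magnitude grows with the number of resolved calls, so a heuristic that has been well calibrated many times outranks one that was well calibrated once.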
- Top: CAND-005 (verbal-Sharpe +0.048)
- Bottom: CAND-002 (verbal-Sharpe -0.390)
Literature statements: backtested on history
The fast loop: clear empirical claims from the finance literature, each encoded as a price-derivable rule and backtested walk-forward on ~10 years of data. OOS Sharpe is the out-of-sample figure, computed on the held-out last 30% of the data, and is the one to trust. The backtest ignores transaction costs, controls in-sample bias through the split, and treats any result with N<20 as direction-only.
| Statement | Claim | Citation | OOS Sharpe | OOS N | Best asset (OOS Sharpe) |
|---|---|---|---|---|---|
| STMT-001 12-1 momentum | An asset trading above its level of ~12 months ago (skipping the… | Jegadeesh & Titman (1993); Asness, Moskowitz & Pedersen (2013) | +0.70 | 228 | GLD (+1.55) |
| STMT-005 52-week-high breakout | An asset trading near its trailing 52-week high tends to continu… | George & Hwang (2004) 'The 52-Week High and Momentum Investing' | +0.50 | 81 | GLD (+2.46) |
| STMT-002 Trend (50/200 MA cross) | When the 50-day moving average is above the 200-day, the trend i… | Moskowitz, Ooi & Pedersen (2012) 'Time Series Momentum' | +0.39 | 234 | GLD (+1.47) |
| STMT-004 Low-volatility regime | Equities in a low realized-volatility regime earn better risk-ad… | Ang, Hodrick, Xing & Zhang (2006) 'The Cross-Section of Volatility' | +0.00 ⚠ | 2 | SPY (+0.00) |
| STMT-003 Short-horizon mean reversion | Over a few weeks, an asset stretched far from its recent mean te… | Jegadeesh (1990) 'Evidence of Predictable Behavior of Security Returns' | -0.05 | 194 | EEM (+0.81) |
Backtest window 2015-01-01 → 2026-06-02. Caveats: in-sample bias (OOS reported separately); NO transaction costs / slippage; single-asset; non-stationarity: historical edge may not persist (see live verbal-Sharpe).
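The out-of-sample figure described in this section can be sketched as follows. This is a minimal illustration, not the project's backtester: the function name `oos_sharpe`, the use of per-period strategy returns as input, the 252-period annualization, the zero risk-free rate, and counting N as the number of held-out observations are all assumptions. Only the 30% hold-out and the N<20 direction-only rule come from the text above.

```python
import math
import statistics


def oos_sharpe(returns, oos_fraction=0.30, periods_per_year=252):
    """Annualized Sharpe ratio on the held-out tail of a return series.

    returns: chronological per-period strategy returns (assumed input).
    Returns (sharpe, n_oos, direction_only).
    """
    n_oos = round(len(returns) * oos_fraction)
    oos = returns[len(returns) - n_oos:]  # last 30% by default
    direction_only = n_oos < 20           # too few points to trust the magnitude
    if n_oos < 2:
        return 0.0, n_oos, direction_only
    spread = statistics.stdev(oos)
    if spread == 0:
        return 0.0, n_oos, direction_only
    sharpe = statistics.mean(oos) / spread * math.sqrt(periods_per_year)
    return sharpe, n_oos, direction_only


# Toy series: only the last three returns are held out.
toy = [0.01, -0.02, 0.03, 0.0, 0.01, 0.02, -0.01, 0.01, 0.03, 0.02]
sharpe, n, flagged = oos_sharpe(toy, periods_per_year=1)
print(n, flagged)  # 3 True
```

Because the split is chronological, nothing from the held-out tail can leak into whatever was tuned on the first 70%, which is the sense in which the split controls in-sample bias.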
Comprehensive heuristic: today's composite decision
The decision step: for each asset, the applicable statements are combined into one call, weighted by each statement's OOS Sharpe on that asset. A statement with no historical edge on an asset gets zero weight, so the system stays silent where it has no basis. Each decision shows exactly which statement pushed it, and how hard.
- EEM → BULL (conf 0.65, score +0.57, 4 statements): STMT-005 +1×w1.59, STMT-001 +1×w0.86, STMT-003 -1×w0.81, STMT-002 +1×w0.48
- GLD → BULL (conf 0.64, score +0.55, 3 statements): STMT-001 +1×w1.55, STMT-002 +1×w1.47, STMT-005 +0×w2.46
- IWM → NEUTRAL (conf 0.5, score +0.00, 0 statements): —
- QQQ → BULL (conf 0.7, score +1.00, 3 statements): STMT-005 +1×w1.48, STMT-001 +1×w1.35, STMT-002 +1×w0.90
- SPY → BULL (conf 0.7, score +1.00, 3 statements): STMT-005 +1×w1.11, STMT-001 +1×w1.04, STMT-002 +1×w0.82
- WTI → NEUTRAL (conf 0.5, score +0.00, 0 statements): —
Computed 2026-06-02. composite_heuristic.py --asset <X> --register files a decision as a pre-registered prediction (handing it to the live loop below).
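The weighting described in this section can be illustrated with a short sketch. It is not the code in composite_heuristic.py: the function name `composite_score` and the `(vote, weight)` input shape are hypothetical, and the normalization (signed weighted votes divided by total weight) is an assumption that happens to reproduce the scores listed above. How the score maps to the reported confidence is not specified on this page and is not modeled here.

```python
def composite_score(votes):
    """Combine per-statement votes into one score in [-1, +1].

    votes: list of (vote, weight) pairs, where vote is -1, 0 or +1 and
    weight is the statement's OOS Sharpe on this asset.
    Assumption: statements with non-positive weight are dropped, so an
    asset with no usable statement scores 0.0 (the system stays silent).
    """
    usable = [(v, w) for v, w in votes if w > 0]
    total_weight = sum(w for _, w in usable)
    if total_weight == 0:
        return 0.0
    return sum(v * w for v, w in usable) / total_weight


# EEM: STMT-005 +1×w1.59, STMT-001 +1×w0.86, STMT-003 -1×w0.81, STMT-002 +1×w0.48
eem = [(+1, 1.59), (+1, 0.86), (-1, 0.81), (+1, 0.48)]
print(round(composite_score(eem), 2))  # 0.57
```

A statement that abstains (vote 0) still counts toward the total weight in this sketch, which is why GLD scores +0.55 rather than +1.00 even though no statement votes against it.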
Market-validated heuristics
Heuristics attached to a prediction while it was still open (pre-registered), then scored when the market resolved it. This is the only set that earns verbal-Sharpe.
| Heuristic | Claim (verbal statement) | Domains | #Calls | Hit-rate | Mean Brier | verbal-Sharpe | Trend | Verdict |
|---|---|---|---|---|---|---|---|---|
| CAND-005 | Defensive rotation: utilities/healthcare (non-cyclical, AI-electric… | finance, forecasting | 1 | 0% | 0.203 | +0.048 | — | 👁 watch |
| CAND-004 | Stretched-level consolidation: an over-extended asset at an extreme… | finance, forecasting, health | 3 | 67% | 0.236 | +0.025 | ▲ | ⬆️ promote |
| CAND-003 | Divergence mean-reversion: a lagging asset catches up to its driver… | finance, forecasting | 1 | 0% | 0.250 | +0.000 | — | 👁 watch |
| CAND-001 | 2-week momentum continuation: a leading asset in a risk-on regime c… | cryptocurrency, finance, forecasting | 6 | 0% | 0.309 | -0.145 | ▼ | ⬇️ prune |
| CAND-002 | Low-VIX-in-conflict mean reversion: VIX below 20 during active mili… | finance, forecasting | 1 | 100% | 0.640 | -0.390 | — | 👁 watch |
Decision → heuristic traces
Each heuristic below lists every call it drove and how each resolved.
CAND-005 — verbal-Sharpe +0.048 over 1 call(s) · 👁 watch
> Defensive rotation: utilities/healthcare (non-cyclical, AI-electricity tailwind) outperform on a geopolitical/VIX shock.
| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0030 | driver | 1.00 | ❌ INCORRECT | 0.203 | ✗ |
CAND-004 — verbal-Sharpe +0.025 over 3 call(s) · ⬆️ promote
> Stretched-level consolidation: an over-extended asset at an extreme consolidates sideways rather than breaking out or down over ~2 weeks.
| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0024 | driver | 1.00 | ❌ INCORRECT | 0.302 | ✗ |
| PRED-0027 | driver | 1.00 | ✅ CORRECT | 0.203 | ✔ |
| PRED-0029 | driver | 1.00 | ✅ CORRECT | 0.203 | ✔ |
CAND-003 — verbal-Sharpe +0.000 over 1 call(s) · 👁 watch
> Divergence mean-reversion: a lagging asset catches up to its driver when their historical relationship diverges (e.g. WTI->XLE) within 2-3 weeks.
| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0025 | driver | 1.00 | ❌ INCORRECT | 0.250 | ✗ |
CAND-001 — verbal-Sharpe -0.145 over 6 call(s) · ⬇️ prune
> 2-week momentum continuation: a leading asset in a risk-on regime continues its trend over a ~2-week window.
| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0019 | driver | 1.00 | ❌ INCORRECT | 0.250 | ✗ |
| PRED-0020 | driver | 1.00 | ❌ INCORRECT | 0.250 | ✗ |
| PRED-0021 | driver | 1.00 | ❌ INCORRECT | 0.302 | ✗ |
| PRED-0022 | driver | 1.00 | ❌ INCORRECT | 0.490 | ✗ |
| PRED-0026 | driver | 1.00 | ❌ INCORRECT | 0.360 | ✗ |
| PRED-0028 | driver | 1.00 | ❌ INCORRECT | 0.203 | ✗ |
CAND-002 — verbal-Sharpe -0.390 over 1 call(s) · 👁 watch
> Low-VIX-in-conflict mean reversion: VIX below 20 during active military conflict reverts upward within ~2 weeks.
| Prediction | Role | Weight | Outcome | Brier | Vindicated |
|------------|------|-------:|---------|------:|:----------:|
| PRED-0023 | driver | 1.00 | ✅ CORRECT | 0.640 | ✔ |
Reverse trace: which heuristics drove each resolved call
The macro view: open a forecast and see the named heuristics behind it, their weights, and whether each was vindicated by the outcome.
- PRED-0019 BULL SPY → ❌ INCORRECT (Brier 0.25): CAND-001 (100%)
- PRED-0020 BULL QQQ → ❌ INCORRECT (Brier 0.25): CAND-001 (100%)
- PRED-0021 BULL BTC → ❌ INCORRECT (Brier 0.3025): CAND-001 (100%)
- PRED-0022 BULL WTI → ❌ INCORRECT (Brier 0.49): CAND-001 (100%)
- PRED-0023 BULL VIX → ✅ CORRECT (Brier 0.64): CAND-002 (100%)
- PRED-0024 NEUTRAL GLD → ❌ INCORRECT (Brier 0.3025): CAND-004 (100%)
- PRED-0025 BULL XLE → ❌ INCORRECT (Brier 0.25): CAND-003 (100%)
- PRED-0026 BEAR DXY → ❌ INCORRECT (Brier 0.36): CAND-001 (100%)
- PRED-0027 NEUTRAL AAPL → ✅ CORRECT (Brier 0.2025): CAND-004 (100%)
- PRED-0028 BULL IWM → ❌ INCORRECT (Brier 0.2025): CAND-001 (100%)
- PRED-0029 NEUTRAL XLV → ✅ CORRECT (Brier 0.2025): CAND-004 (100%)
- PRED-0030 BULL XLU → ❌ INCORRECT (Brier 0.2025): CAND-005 (100%)
- PRED-0001 BEAR SPY → ❌ INCORRECT (Brier 0.04): ISO-4 (33%), ISO-11 (33%), ISO-2 (33%) (backfill, illustrative)
- PRED-0004 BULL GLD → ❌ INCORRECT (Brier 0.04): ISO-8 (100%) (backfill, illustrative)
- PRED-0005 BEAR QQQ → ❌ INCORRECT (Brier 0.04): ISO-2 (100%) (backfill, illustrative)
- PRED-0008 BULL VIX → ❌ INCORRECT (Brier 0.04): ISO-13 (33%), L-1287 (33%), L-1289 (33%) (backfill, illustrative)
- PRED-0015 BEAR AAPL → ❌ INCORRECT (Brier 0.04): ISO-2 (100%) (backfill, illustrative)
- PRED-0003 BULL TLT → ➕ PARTIAL (Brier 0.135): ISO-4 (100%) (backfill, illustrative)
Illustrative backfill (excluded from scoring)
Heuristics reconstructed from the theses of already-resolved predictions. They show the mechanism working, but they are hindsight — attached after the outcome was known — so they earn no verbal-Sharpe and never drive evolution verdicts. Note a heuristic can show positive calibration skill at a 0% direction hit-rate: a wrong-but-low-confidence call has a low Brier, which is good calibration.
| Heuristic | Claim (verbal statement) | Domains | #Calls | Hit-rate | Mean Brier | verbal-Sharpe | Trend | Verdict |
|---|---|---|---|---|---|---|---|---|
| ISO-2 | Sector rotation to utilities/energy signals defensive positioning (… | control-theory, evolution, finance… | 3 | 0% | 0.040 | +0.364 | ▲ | ◻️ illustrative |
| ISO-11 | Oil shock from Strait of Hormuz closure propagates through economy … | control-theory, finance, history… | 1 | 0% | 0.040 | +0.210 | — | ◻️ illustrative |
| ISO-8 | ISO-8 power law: gold tends to move in explosive bursts, not gradua… | distributed-systems, game-theory, history… | 1 | 0% | 0.040 | +0.210 | — | ◻️ illustrative |
| ISO-13 | ISO-13 integral windup: Hormuz closure is accumulating economic dam… | catastrophic-risks, control-theory, finance… | 1 | 0% | 0.040 | +0.210 | — | ◻️ illustrative |
| L-1287 | The damage accumulates like debt (Hindu karma pattern, L-1287). | catastrophic-risks, control-theory, finance… | 1 | 0% | 0.040 | +0.210 | — | ◻️ illustrative |
| L-1289 | Unanimity-as-failure (L-1289): VIX has been trading in 20-30 range … | catastrophic-risks, control-theory, finance… | 1 | 0% | 0.040 | +0.210 | — | ◻️ illustrative |
| ISO-4 | CAPE at 39 is 2x historical avg — only 1929 and 2000 were comparabl… | control-theory, finance, history… | 2 | 38% | 0.111 | +0.196 | ▲ | ◻️ illustrative |
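The note in this section about positive calibration skill at a 0% hit-rate is easier to see with numbers. The sketch below assumes the standard binary Brier score, (confidence − outcome)², applied to the confidence placed on the stated direction. The function names are hypothetical, and the way this page scores PARTIAL outcomes is not covered.

```python
COIN_FLIP_BRIER = 0.25  # Brier score of a 50% call, whatever the outcome


def brier_score(confidence, outcome):
    """Squared error between stated confidence and the 0/1 outcome."""
    return (confidence - outcome) ** 2


def beats_coin_flip(confidence, outcome):
    """True when the call scored better than an uninformed 50/50 guess."""
    return brier_score(confidence, outcome) < COIN_FLIP_BRIER


# A wrong call held at only 20% confidence: low Brier, good calibration.
print(round(brier_score(0.20, 0), 4), beats_coin_flip(0.20, 0))  # 0.04 True

# A right call held at only 20% confidence: high Brier, poor calibration.
print(round(brier_score(0.20, 1), 4), beats_coin_flip(0.20, 1))  # 0.64 False
```

The same arithmetic explains why CAND-002 in the market-validated table carries a 100% hit-rate and the lowest verbal-Sharpe: its single call resolved correctly with a Brier of 0.640.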
Pending: pre-registered attributions (open calls)
Heuristics attached to open predictions before the outcome is known. These earn verbal-Sharpe the moment their call resolves — this is where forward credit comes from.
- PRED-0031 BULL SPY/QQQ (conf 55%, due 2026-06-06): CAND-001 (100%)
- PRED-0002 BULL XLE (conf 30%, due 2026-06-20): ISO-13 (100%)
- PRED-0006 BULL BTC (conf 65%, due 2026-06-20): ISO-4 (50%), L-1289 (50%)
- PRED-0007 BEAR DXY (conf 35%, due 2026-06-20): L-1287 (50%), ISO-6 (50%)
- PRED-0009 BULL XLU (conf 30%, due 2026-06-20): CAND-005 (100%)
- PRED-0010 BULL XLV (conf 40%, due 2026-06-20): CAND-005 (100%)
Probationary: foraged candidates (CAND-NNN)
Candidate heuristics minted from foraged papers (tools/paper_intake.py --emit-candidate-heuristic). They join the pool on probation: a prediction can cite one as a driver, and once enough such calls resolve it graduates to a P-NNN principle (PROMOTE) or is dropped (PRUNE).
| Candidate | Type | Source | Applied? | Claim |
|---|---|---|---|---|
| CAND-001 | THESIS | thesis:PRED forecasting batch (S499-S547) | ✅ | 2-week momentum continuation: a leading asset in a risk-on regime continues i… |
| CAND-002 | THESIS | thesis:PRED forecasting batch (S499-S547) | ✅ | Low-VIX-in-conflict mean reversion: VIX below 20 during active military confl… |
| CAND-003 | THESIS | thesis:PRED forecasting batch (S499-S547) | ✅ | Divergence mean-reversion: a lagging asset catches up to its driver when thei… |
| CAND-004 | THESIS | thesis:PRED forecasting batch (S499-S547) | ✅ | Stretched-level consolidation: an over-extended asset at an extreme consolida… |
| CAND-005 | THESIS | thesis:PRED forecasting batch (S499-S547) | ✅ | Defensive rotation: utilities/healthcare (non-cyclical, AI-electricity tailwi… |
What is this?
The Forecast Dashboard records pre-registered market calls. This page answers a different question: which named heuristic was each decision made on, and did it earn its keep?
- A heuristic is a named verbal rule the swarm already holds: a principle (P-NNN), a lesson (L-NNN), or a cross-domain isomorphism (ISO-N).
- Each prediction attaches the heuristics that drove it, with a weight (how much each drove the call) and a role (driver / risk / counter).
- When the call resolves, the Brier score is the loss signal, split across the driver heuristics by weight — the verbal analogue of backprop's credit assignment.
- verbal-Sharpe rewards heuristics that are well-calibrated and repeatedly applied; the evolution loop promotes the top, prunes the negative, and compacts the redundant.
New candidate heuristics enter the pool from foraged papers (tools/paper_intake.py --emit-candidate-heuristic, namespace CAND-NNN) and are validated the same way — external research in, market truth out.
Sources: experiments/finance/heuristics/scorecard.json (computed by tools/heuristic_backtest.py) over experiments/finance/predictions/registry.json.