Notes as Information Space — a cross-field connection methodology¶
flowchart LR
notes["field notes\n(per-course corpus)"] -->|"redundancy:\nsame object re-derived"| course["indexed by COURSE\nnot by OBJECT"]
course -->|"forage + ingest"| dag["typed object DAG\nmath_tree.py"]
dag -->|"generalizes /\nspecializes"| gen["named generalization\n(one canonical object)"]
gen -->|"A↔B isomorphism"| atlas["EQUIVALENCES-ATLAS\n33 clusters / 14 fields"]
gen --> web["website:\nobject-indexed viewer"]
atlas --> contrib["contributor:\nfieldforge loop"]
mdl["MDL: compression\n= generalization = memory"] -.frames.- gen
- Oxford Math Notes (build plan) — the concrete, phased, diagrammatic build that executes this investigation — Phase 0 dedups one object and measures the compression
- mathematics — the partition-function spine — why the swarm treats math as one generating object, not many courses
- equivalences atlas — 33 clusters across 14 fields — the existing cross-field dedup substrate; an A↔B edge is the dedup operator for notes
- information science — MDL (L-559): compression = generalization = memory are one operator — the formal reason redundancy in notes is a defect
- non-equivalence atlas — the guardrail: the σ-metric says when two presentations are genuinely different vs. trivially restatable — stops over-merging
- swarm tooling repos — the use-it-or-drop sieve oxford_math_notes is filed under (forage → math_tree move)
- statement composition — the general frame: object-indexing here is one instance of capturing meaning as a typed constraint graph across all communication codecs
S712 swarmgodforage → INVESTIGATE. Foraged external repo oxford_math_notes (6/97 Oxford courses; github.com/dafdaf1234444/oxford_math_notes) + peer landscape (Lean/mathlib, Isabelle/AFP, Coq, Metamath, Mizar; Stacks, Kerodon, ProofWiki, nLab, PlanetMath; theorem-dependency-graphs, AutoMathKG, semantic search over ~9M statements, autoformalization/Aria). Sieved against existing math machinery: domains/mathematics (102 nodes), tools/math_tree.py (generalizes/specializes), EQUIVALENCES-ATLAS (33 clusters / 14 fields / 7 deep structures). Proposes the field-agnostic note-ingestion methodology swarmgodfieldforge; Phase-0 proof + tooling (math_tree generalize, viewer staging) same session. Raw forage: references/math/forage-notes-as-infospace-s712.md.
- PreviousNon Equivalence Atlas
- NextNothing
Lecture notes index mathematics by the course that happened to teach it. Mathematics indexes itself by the object. The gap between those two indexes is pure redundancy — and generalization is the operator that removes it.
Status: 🌱 seedling | 2026-06-02 | rating: high Compress levels: L0 → L1 → L2
L0 — TL;DR (≤5 lines)¶
Notes are a low-compression codec: the same theorem is re-derived course-by-course because notes are indexed by course, not by object — and math's whole job is to generalize that redundancy away. oxford_math_notes (6/97 Oxford courses → interactive cross-referenced HTML that traces any theorem to first principles) has the right instinct one layer too low: it cross-references inside a fixed corpus and lists "same concept across courses" as an unmet goal; cross-field (math↔physics) isn't in scope. The swarm already owns the fix — domains/mathematics (102 typed nodes), math_tree.py (generalizes/specializes edges), and the EQUIVALENCES-ATLAS (33 clusters across 14 fields). So incorporate ≠ import: forage oxford as a seed into the cross-field atlas the swarm already started, exposing it two ways — (a) a website = an object-indexed math_tree viewer; (b) a contributor = a field-agnostic forage→ingest→dedup→generalize→connect→feedback loop. Verdict: the contributor loop is feasible now; all-of-math+physics is a multi-year horizon — so the honest first step is to dedup one cross-course repeat and measure the compression.
L1 — The argument¶
1. Thesis: knowledge work is branching + generalization over an information space¶
Most mathematical activity is: pick an object, test its properties, and branch to neighbouring objects — a walk over an information space. The move that makes later work cheaper is generalization: find the one general statement that subsumes many specific instances, and every specialization beneath it is paid for once. Redundancy is the enemy; generalization / isomorphism is the dedup operator. This is not a metaphor for the swarm — it is the swarm's own MDL result (L-559): compression, generalization, and memory are the same operator at different scales. And MATHEMATICS shows the dual fact from the other side — "the same object keeps arriving independently" (the partition function Z reached from five frameworks).
The defect in lecture notes is therefore structural, not editorial: indexing by course forces re-derivation; indexing by object forces dedup. Fixing it is the same operator math already runs internally.
2. Similar projects — how the field fights the same redundancy¶
Two winning strategies recur (none of which oxford_math_notes implements yet):
- Formal libraries dedup via type hierarchies + reusable axiomatizations + proof patterns. Lean mathlib (~200k+ theorems): a single
Group/Ringtype-class subsumes every concrete instance — generalization is the deduplicator, mechanically. Also Isabelle/AFP (~3M LOC), Coq/Rocq, Metamath (substitution-based proof sharing), Mizar (declarative, Tarski–Grothendieck base). - Wiki knowledge bases dedup via stable tags + obsessive cross-reference + hierarchical foundations. Stacks Project & Kerodon (stable alphanumeric tags that survive edits), ProofWiki (one-result-per-page), nLab (category theory as the unifying frame), PlanetMath.
- Knowledge-graph / AI approaches: explicit theorem-dependency graphs, AutoMathKG, semantic search over ~9M theorem statements, autoformalization (e.g. Aria decomposes theorems into dependency graphs before formalizing).
Takeaway: the field's two proven dedup engines are type hierarchies (formal) and stable tags + cross-ref (wiki). oxford_math_notes ships the weakest form — cross-reference over a small fixed corpus, no type hierarchy, no cross-corpus identifiers.
3. Diagnosis: oxford_math_notes — what it is, what's right, what's missing¶
- What it is. 6 of 97 Oxford courses converted to a single self-contained interactive
index.htmleach (MathJax, three-panel layout, dependency graph, peek-popovers). Every item has a stable machine ID (def-2.2.4), a human name, a one-to-three-sentence description, color-coded type, and explicit uses / used-by edges. You can clickthm-5.3.7and trace todef-1.1.2in one chain. - What's right (keep it). This is already a typed dependency DAG with a trace-to-first-principles UX — exactly the structure the swarm builds, plus a genuinely good reading experience the swarm currently lacks on the site.
- What's missing (the gap). Its own [VISION] names goal #6 — "the same concept treated in different courses" — and admits the failure: "B2.1 Representation Theory defines the same concept differently" than B2.3, with no unification. So cross-course dedup is aspirational, not delivered; cross-field (math↔physics) is out of scope entirely; there is no generalization edge and no formalization/verification.
- Can it be improved, feasibly? Yes, in tiers. Feasible now: adopt stable cross-corpus IDs (Stacks-style), keep the
uses_in_statementvsuses_in_proofdistinction, add the status ladder. Laborious: re-index per-object so a theorem taught in three courses is one node with three course-views. Hard / open: extend to 97 courses, then all math+physics, then generalize proof procedures and connect fields — that is the mathlib/nLab-scale problem, not a refactor. Honest read: improvable and worth borrowing from, but "clean it up + do all of math" conflates a tractable refactor with a civilization-scale project. The leverage is to run oxford's UX over the swarm's existing cross-field graph — not to grow the repo in place.
4. The swarm already started this (map onto existing machinery)¶
The cross-field connection methodology the user wants already has a substrate; what's missing is the pipeline that feeds it. Existing assets:
tools/math_tree.py+domains/mathematics/(102 nodes) — a typed DAG (axiom→definition→lemma→proposition→theorem→corollary→example), with the Lean-Blueprintuses_in_statementvsuses_in_proofdistinction andgeneralizes/specializesedges, a status ladder (stub→stated→proved→verified→formalized),cascade(error propagation), andimport-latex --link-existing(a working literal-dedup primitive). oxford'sdef-/thm-/lem-/cor-/prop-/ex-map 1:1 onto itsD/T/L/C/P/Enode types.- EQUIVALENCES-ATLAS — 33 isomorphism clusters across 14 fields instantiating 7 deep structures. An
A↔Bedge is precisely the operator that collapses "same idea, different approach" across courses and fields. NON-EQUIVALENCE-ATLAS's σ-metric is the guardrail against over-merging. - The card graph (this very page format) is the human-facing compression layer;
read_nextedges are oxford's cross-references generalized across the whole corpus.
5. The cross-field connection methodology (the central contribution)¶
A field-agnostic loop — math is field #1 (via oxford), physics is the named field #2:
forage a field's notes → ingest as typed math_tree nodes (tagged course:) → dedup (literal repeats by --link-existing; "same idea, different approach" by atlas-style isomorphism + generalizes/specializes) → generalize (N≥3 instances ⇒ emit one general node) → connect (promote a proven cross-field match to an atlas cluster) → feedback (confirmed clusters become L-NNN lessons, recurring patterns become P-NNN principles). Packaged as a claimed-by-use verb, swarmgodfieldforge (see L2.4).
L2 — Deep dives¶
L2.1 Why object-index beats course-index (the compression argument)¶
Formalize redundancy as description length. N courses each re-deriving theorem T cost ≈ N·|T|; one canonical T plus N pointers costs |T| + N·|ptr|. The dedup gain grows with N — this is just MDL, the same operator INFORMATION-SCIENCE identifies. Concrete evidence already in the tool: typed learning-paths (path --typed) skip proof-only dependencies and measurably shorten the prerequisite chain, so statement/proof typing buys real compression before any cross-course merge.
L2.2 Generalizing proof procedures, not just statements¶
The deeper ask is dedup at the level of proof tactics (induction schema, diagonal argument, fixed-point / contraction argument, compactness extraction). This maps onto the atlas's deep structures — e.g. DS1 (self-reference: Lawvere unifying Cantor–Gödel–Turing–Russell as contrapositives of one categorical fact). A "proof procedure" is a generalization node whose specializations are concrete proofs. Honest ceiling: without a proof assistant this is a tag-coincidence heuristic (proof_strategy tags) + human sign-off, not a machine-checked result. Detecting that two proofs are "the same procedure" stays human-confirmed.
L2.3 Cross-field connection as the dedup endgame¶
The atlas's A↔B portals are the mechanism for "connecting different fields": once analysis notes and physics notes both reduce to the same canonical object, every theorem on one side transfers to the other (the atlas's "free prediction machine"). The σ-metric of NON-EQUIVALENCE-ATLAS is the guardrail — it says when two course-presentations are genuinely different (real σ, keep both) vs. trivially restatable (σ≈0, dedup) — preventing the false merges that would corrupt the DAG and propagate via cascade.
L2.4 Incorporation: website + contributor¶
- (a) As a website. Done this session (S712):
domains/mathematics/viewer.html+ itsnodes_export.jsonare staged into the built site at/docs/math-viewer/via acpblock inbuild_site_src.sh. The result is oxford's trace-to-first-principles UX rendered over the swarm's object-indexed, cross-field DAG instead of a per-course silo. "Trace to first principles" = the existingmath_tree.py pathcommand. - (b) As a contributor —
swarmgodfieldforge. A new claimed-by-use combined verb: forage pulls external facts; fieldforge folds an external field's notes into the native DAG and emits the surplus as beliefs. One run = one course (or one cross-course duplicate pair). Sequence:import→dedup(literal:--link-existing; same-idea:equiv_scanner --candidates+ confirm) →generalize(emit general node +generalizesedges at N≥3) → write areferences/.../forage-*.mdrecord with an overclaim guard → emitL-NNN; if a strategy recurs 3+ →P-NNN; if a proven isomorphism → a new atlas cluster.
L2.5 Feasibility tiers + the first concrete step¶
- Easy / now (reuses existing tools): add nodes (
math_tree add/import-latex); literal-repeat dedup (--link-existing); a math atlas site section (onecpblock + nav + cards); writing the lesson/forage record. - Hard (engineering + judgment, human-confirmed): reliable semantic dedup of "same idea, different approach"; generalizing proof procedures; verifying correctness without a proof assistant (imported nodes stay at
stated; never auto-promote pastverified). - Speculative / research-grade (named, not promised): autoformalization (text→Lean) to reach
formalized; machine-checked subsumption (proving a general node actually subsumes its instances). - First concrete step (Phase-0, this session). Pick ONE object that recurs across ≥2 Oxford courses — Ring is defined in both A0 Linear Algebra (
def-2.1) and A3 Rings & Modules; and the First Isomorphism Theorem appears for rings (A0thm-2.7) and recurs for groups/modules. Ingest the instances asmath_treenodes, merge the literal Ring duplicate onto one canonical node, and add a general First Isomorphism Theorem node withgeneralizesedges to the per-structure instances. Runpathbefore/after and record the node/description-length reduction as the measured proof that object-indexing dedups. If the gain is real, promote the loop to a periodic ritual.
Open questions¶
- Can
math_tree.pyedges carry aproof_strategytag so a tactic becomes a first-class generalizing node? (→ frontier) - Can
equiv_scanner.pyauto-suggest same-idea merges (a human-confirmed prior), or only inventory existing clusters? - Licensing/ingestion: under what terms can external course notes (oxford and others) be ingested as derived nodes?
- Physics as field #2: which physics "course notes" corpus is the cleanest seed, and does the σ-guard hold across the math↔physics boundary?
References¶
oxford_math_notes— https://github.com/dafdaf1234444/oxford_math_notes (VISION, FORMAT_SPEC; 6/97 courses).- Formal: Lean mathlib, Isabelle/AFP, Coq/Rocq, Metamath, Mizar.
- Wiki: Stacks Project, Kerodon, ProofWiki, nLab, PlanetMath.
- Internal:
tools/math_tree.py,domains/mathematics/, EQUIVALENCES-ATLAS, INFORMATION-SCIENCE (MDL L-559); raw foragereferences/math/forage-notes-as-infospace-s712.md.