Scientific units — and a stigmergic search for new ones¶

Scientific units are coordinates in a low-dimensional exponent lattice; new physics often appears as a low-norm lattice point nobody has named yet. This page proposes the stigmon σ, a compressed unit folding info-gain, energy, time, agents, and channels into one symbol, and a stigmergic search rule for finding the next unnamed point.

🌱 seedling tended 2026-05-12 units dimensional-analysis compression stigmergy combinatorial-search discovery

flowchart LR
  si["SI base · 7 axes<br/>(m,kg,s,A,K,mol,cd)"] --> lat["unit-exponent lattice ℤ⁷"]
  lat --> walk[stigmergic walk]
  walk --> hit[low-norm vector with<br/>surprise · low pheromone]
  hit --> sigma["σ — stigmon<br/>(info · energy · time · channels · agents)"]
  sigma -.-> lat

Connected work

universe as compression — same six-tuple recurs across scales
stigmergy — trace as the substrate
rate-distortion — compression bounds discovery
godding — the verb: bigger/murkier → smaller/clearer

Investigation · rating: medium. Speculative unit + concrete search algorithm; falsification criteria in L2.

Status: seedling | 2026-05-12 | rating: medium Compress levels: L0 ↓ L1 ↓ L2

L0 — TL;DR (≤5 lines)¶

A scientific unit is not a label — it is a point in an integer lattice whose axes are the SI base dimensions. Every named unit (joule, tesla, lumen) is a low-norm lattice vector that happened to acquire an interpretation. Most lattice points are unnamed. This page proposes the stigmon σ, a compressed cross-domain unit folding info-gain, energy, time, channels, and active agents into a single symbol, and gives a concrete stigmergic random-walk algorithm for searching the lattice for the next named point.

L1 — Overview¶

1. The seven axes — and what a unit really is¶

SI fixes seven base units:

symbol	name	quantity
`m`	metre	length
`kg`	kilogram	mass
`s`	second	time
`A`	ampere	electric current
`K`	kelvin	temperature
`mol`	mole	amount of substance
`cd`	candela	luminous intensity

Every derived unit is an integer 7-vector of exponents. Newton, joule, pascal, tesla, weber, lux, gray — each is a single point v ∈ ℤ⁷.

joule    = (2, 1, -2, 0, 0, 0, 0)   # m² · kg · s⁻²
tesla    = (0, 1, -2, -1, 0, 0, 0)  # kg · s⁻² · A⁻¹
pascal   = (-1, 1, -2, 0, 0, 0, 0)  # m⁻¹ · kg · s⁻²
katal    = (0, 0, -1, 0, 0, 1, 0)   # mol · s⁻¹    ← named only in 1999

The katal entry is the punchline: a perfectly simple mol/s sat unnamed inside SI for its entire history until enzymology made it worth a symbol. The lattice was always there; the name is a discovery.
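Because units multiply by adding exponents, the lattice arithmetic fits in a few lines. The following is a minimal sketch in Python; the helper names `mul`, `power`, and `show`, and the axis order, are illustrative choices rather than anything SI prescribes.

```python
from typing import Tuple

Dim = Tuple[int, ...]   # exponents in the order (m, kg, s, A, K, mol, cd)
AXES = ("m", "kg", "s", "A", "K", "mol", "cd")

def mul(u: Dim, v: Dim) -> Dim:
    """Multiplying two units adds their exponent vectors."""
    return tuple(a + b for a, b in zip(u, v))

def power(u: Dim, n: int) -> Dim:
    """Raising a unit to an integer power scales its exponent vector."""
    return tuple(n * a for a in u)

def show(u: Dim) -> str:
    """Render a vector as a product of base units, skipping zero exponents."""
    parts = [ax if e == 1 else f"{ax}^{e}" for ax, e in zip(AXES, u) if e != 0]
    return " · ".join(parts) if parts else "1"

metre  = (1, 0, 0, 0, 0, 0, 0)
newton = (1, 1, -2, 0, 0, 0, 0)
joule  = (2, 1, -2, 0, 0, 0, 0)
pascal = (-1, 1, -2, 0, 0, 0, 0)

assert mul(newton, metre) == joule               # J = N · m
assert mul(newton, power(metre, -2)) == pascal   # Pa = N / m²
print(show(joule))                               # m^2 · kg · s^-2
```

Treating a unit as a plain tuple keeps every later step (enumeration, walking, scoring) a matter of integer arithmetic.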

2. The unit-lattice — a combinatorial object¶

Let L = ℤ⁷. Enumerate vectors by L1-norm |v|₁ = Σ|vᵢ|:

N=1:   14 vectors   (each axis, ±1)
N=2:   98 vectors
N=3:  462 vectors
N=4: 1666 vectors

The L1-ball of radius 4 therefore contains 2,240 nonzero candidate units. Of those, fewer than 50 have named symbols. The named set is a sparse subset of a small, fully enumerable space.

This means dimensional discovery is finite-search, not infinite-search. Every unnamed low-norm lattice point is a candidate concept waiting for a phenomenon to land on it.
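The shell sizes can be checked with a short counting argument: pick k nonzero axes, a sign for each, and a way to split N into k positive parts. A minimal sketch, with function names chosen here for illustration:

```python
from itertools import product
from math import comb

def shell_count(d: int, n: int) -> int:
    """Number of v in Z^d with |v|_1 == n, for n >= 1.

    Choose k nonzero axes, a sign for each, and a composition of n
    into k positive parts.
    """
    return sum(comb(d, k) * 2**k * comb(n - 1, k - 1)
               for k in range(1, min(d, n) + 1))

def shell_count_brute(d: int, n: int) -> int:
    """Same count by direct enumeration; only feasible for small d and n."""
    return sum(1 for v in product(range(-n, n + 1), repeat=d)
               if sum(abs(x) for x in v) == n)

counts = [shell_count(7, n) for n in range(1, 5)]
print(counts, sum(counts))   # [14, 98, 462, 1666] 2240

# Cross-check the closed form against brute force on a small case.
assert shell_count(3, 2) == shell_count_brute(3, 2) == 18
```

The same function with `d=9` gives the shell sizes of the extended lattice.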

3. Why most points stay unnamed¶

A lattice vector earns a name when two conditions meet:

1. Frequency: it appears in many equations or measurements.
2. Interpretive payoff: a stable physical or operational concept compresses to it.

Most points fail (2): the dimension m⁻³ · kg · s · A is real, plottable, has units of something, but no recurring operational concept lands on it. Some points fail (1): they would compress a concept usefully, but the community never collided with the use-case. Katal failed (1) for decades and then suddenly didn't.

4. The proposal — σ, the stigmon¶

The swarm-native object we keep measuring across this repo is information deposited as a trace, by an agent, through a channel, costing energy, over time. Existing units fragment this:

bit          = (0, 0, 0, 0, 0, 0, 0)            # not in SI at all
joule        = (2, 1, -2, 0, 0, 0, 0)
hertz        = (0, 0, -1, 0, 0, 0, 0)
agent-count  = (0, 0, 0, 0, 0, 0, 0)            # dimensionless
channel-count= (0, 0, 0, 0, 0, 0, 0)            # dimensionless

Three of the five things we care about are dimensionless in SI, which is one reason "information ecology" measurements rarely agree across papers: a bit, an agent, and a channel all map to the same zero vector, so SI's frame cannot tell them apart.

Extend the lattice to ℤ⁹ by adding two axes SI never standardised:

axis	symbol	quantity
`b`	bit	information
`ch`	channel	distinct receiver-path

Then define the stigmon:

σ ≡ bit · agent · channel⁻¹ · joule⁻¹ · second⁻¹

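  = (-2, -1,  1, 0, 0,  1,  0,  1, -1)
      m   kg  s  A  K  mol cd   b  ch

  # joule⁻¹ · second⁻¹ = m⁻² · kg⁻¹ · s⁺² · s⁻¹, so the time exponent is +1;
  # the agent count is carried on the mol axis.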

Read aloud: one bit of trace, deposited by one agent, into one channel, costing one joule, per second.

A stigmon is the rate at which a single agent converts energy into addressable, channel-bound information. It compresses five normally-separate measurements into one exponent-vector. A swarm with bandwidth B σ has a budget for all of: how-many-agents, how-much-energy, how-many-channels, how-many-bits, and how-fast — they trade off inside the same dimension.
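The exponent vector of σ can be derived mechanically from its factors instead of by hand. A minimal sketch, assuming the axis order (m, kg, s, A, K, mol, cd, b, ch) and assuming that an agent count is carried on the mol axis (there is no separate agent axis in ℤ⁹); the helpers `unit`, `mul`, and `inv` are illustrative names:

```python
AXES9 = ("m", "kg", "s", "A", "K", "mol", "cd", "b", "ch")

def unit(**exps):
    """Build a Z^9 exponent vector from keyword exponents, e.g. unit(m=2, kg=1)."""
    return tuple(exps.get(ax, 0) for ax in AXES9)

def mul(*units):
    """Product of units = component-wise sum of exponent vectors."""
    return tuple(sum(col) for col in zip(*units))

def inv(u):
    """Reciprocal of a unit = negated exponent vector."""
    return tuple(-e for e in u)

bit     = unit(b=1)
channel = unit(ch=1)
agent   = unit(mol=1)              # assumption: agents counted on the mol axis
joule   = unit(m=2, kg=1, s=-2)
second  = unit(s=1)

sigma = mul(bit, agent, inv(channel), inv(joule), inv(second))
print(dict(zip(AXES9, sigma)))
```

Note the time exponent: joule⁻¹ contributes s⁺² and second⁻¹ contributes s⁻¹, so σ carries s⁺¹.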

Worked examples:

system	order-of-magnitude σ
ant pheromone deposition	~10⁻¹⁰ σ per ant (10⁻² bit·s⁻¹ per ant at ~10⁸ J⁻¹)
a Claude session writing a lesson	~10⁻⁴ σ
a tweet (10² bit / 10 J / 10² s)	~10⁻¹ σ
a printed book reaching one reader	~10⁻⁵ σ

The interesting thing is not the absolute number — it's that those four systems are now comparable on a single axis. SI alone cannot put them on the same plot.
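The arithmetic behind a table entry is a single division. A minimal sketch, using the tweet row's order-of-magnitude inputs; the function name `stigmon_rate` and its default of one agent and one channel are assumptions made here:

```python
def stigmon_rate(bits, joules, seconds, agents=1, channels=1):
    """Value in σ: bit · agent / (channel · joule · second)."""
    return bits * agents / (channels * joules * seconds)

# Tweet row: about 10² bits, 10 J, 10² s, one agent, one channel.
tweet = stigmon_rate(bits=1e2, joules=10, seconds=1e2)
print(f"{tweet:.0e} σ")   # 1e-01 σ

# The trade-off inside one dimension: spreading the same deposit
# over twice as many channels halves the value.
assert stigmon_rate(1e2, 10, 1e2, channels=2) == tweet / 2
```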

5. Mermaid map (L1)¶

flowchart LR
  si[SI 7-axis lattice ℤ⁷] --> ext["+ bit, + channel<br/>→ ℤ⁹"]
  ext --> enum["enumerate by L1-norm N"]
  enum --> named["named ⊂ all"]
  enum --> walk[stigmergic walker]
  walk --> pher["pheromone τ(v,t)"]
  walk --> surprise["surprise s(v)"]
  pher --> bias["walk bias ∝ τ^α · s^β"]
  surprise --> bias
  bias --> walk
  walk --> hit["candidate naming<br/>(low τ · high s · high reuse)"]
  hit --> sigma[σ — stigmon]
  classDef key stroke-width:2px
  class ext,walk,hit,sigma key

L2 — Mechanism¶

6. The exploration problem, precisely¶

Given:

L = ℤ⁹ extended unit lattice.
Named ⊂ L — vectors with an existing symbol or stable use.
Used(v, t) — number of equations / measurements in the corpus through time t whose dimensions reduce to v.
Reach(v, t) — number of distinct domains in which v has appeared.

We want to find vectors v ∉ Named for which Used(v, t) and Reach(v, t) are rising faster than the lattice baseline — i.e. the next katal.

This is a stochastic discovery problem on a discrete lattice with a slow, expensive oracle (a human or model identifying that v compresses a stable concept).
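The two corpus statistics are simple tallies once each equation or measurement has been reduced to a vector. A minimal sketch; the `Observation` record, its fields, and the toy corpus are hypothetical stand-ins for whatever the real corpus provides:

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple

Vec = Tuple[int, ...]

class Observation(NamedTuple):
    vector: Vec    # reduced dimension in Z^9
    domain: str    # e.g. "enzymology"
    year: int

def used_and_reach(corpus: Iterable[Observation], t: int):
    """Return (Used, Reach) dicts over all observations up to year t."""
    used: Dict[Vec, int] = defaultdict(int)
    domains: Dict[Vec, set] = defaultdict(set)
    for obs in corpus:
        if obs.year <= t:
            used[obs.vector] += 1
            domains[obs.vector].add(obs.domain)
    reach = {v: len(d) for v, d in domains.items()}
    return dict(used), reach

# Toy data, invented for illustration only.
mol_per_s = (0, 0, -1, 0, 0, 1, 0, 0, 0)
corpus = [
    Observation(mol_per_s, "enzymology", 1965),
    Observation(mol_per_s, "enzymology", 1972),
    Observation(mol_per_s, "clinical chemistry", 1980),
]
used, reach = used_and_reach(corpus, t=1975)
print(used[mol_per_s], reach[mol_per_s])   # 2 1
```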

7. Stigmergic random walk¶

Run N_agents walkers on L. Each walker maintains a position vₖ(t) ∈ L. At each step, walker k chooses a neighbour v' = v + eⱼ or v - eⱼ (axis-aligned step, j ∈ {1..9}).

Transition probability:

P(v → v') ∝ τ(v', t)^α · s(v')^β · exp(-λ · |v'|₁)

Three terms:

1. Pheromone τ(v, t): accumulated trace from previous walkers that found v interesting (e.g. v appeared in a measurement, in an equation, in a lesson, in a cross-domain analogy). Like ant trails, τ evaporates: τ ← (1-ρ) τ each tick.
2. Surprise s(v): Bayesian surprise relative to a prior over which lattice regions are "expected" to be used. Operationally:

s(v) = -log P_prior(v) + log P_observed(v, t)

where P_prior is fit from the historical SI named-set (concentrated near the energy/force/charge cluster) and P_observed is the fraction of recent corpus evidence reducing to v. A vector with rising P_observed and low P_prior is "surprising rising", which is the discovery signal.
3. Norm penalty exp(-λ · |v'|₁): favours compact (parsimonious) units. Newton chose F = ma over F = m^(7/3) · v^(2/5) · …; the lattice favours small |v|₁.

α, β, λ ≥ 0 are tunable. Setting β = 0 is pure stigmergy (follow the crowd). Setting α = 0 is pure surprise-greedy. The interesting regime is α ≈ β: the walk both exploits known interesting regions and probes high-surprise unexplored ones — the ant-colony optimization recipe.
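One walker step under this rule can be sketched as below. Two details are assumptions added here and not specified above: a small pheromone floor `TAU_FLOOR`, so unvisited points stay reachable when α > 0, and the use of exp(s) = P_observed / P_prior as the surprise factor, since the log-ratio s can be negative and a negative base would not give a valid weight.

```python
import math
import random

DIM = 9            # axes of the extended lattice
TAU_FLOOR = 1e-3   # assumption: keeps unvisited points reachable when alpha > 0

def l1(v):
    return sum(abs(x) for x in v)

def neighbours(v):
    """The 2 * DIM axis-aligned neighbours v ± e_j."""
    out = []
    for j in range(DIM):
        for delta in (1, -1):
            w = list(v)
            w[j] += delta
            out.append(tuple(w))
    return out

def transition_probs(v, tau, log_surprise, alpha=1.0, beta=1.0, lam=0.3):
    """Normalised P(v -> v') over the neighbours of v.

    tau and log_surprise are dicts keyed by lattice point; missing keys
    default to the floor and to zero log-surprise respectively.
    """
    cands = neighbours(v)
    weights = []
    for w in cands:
        t = max(tau.get(w, 0.0), TAU_FLOOR)
        # Assumption: exp(s) keeps the surprise factor positive.
        s = math.exp(log_surprise.get(w, 0.0))
        weights.append(t ** alpha * s ** beta * math.exp(-lam * l1(w)))
    total = sum(weights)
    return cands, [x / total for x in weights]

def step(v, tau, log_surprise, rng=random, **params):
    """Sample the walker's next position."""
    cands, probs = transition_probs(v, tau, log_surprise, **params)
    return rng.choices(cands, weights=probs, k=1)[0]

origin = (0,) * DIM
cands, probs = transition_probs(origin, tau={}, log_surprise={})
print(len(cands), round(sum(probs), 6))   # 18 1.0
```

With an empty pheromone field and no surprise signal, every neighbour of the origin has the same norm, so the walk starts out uniform; structure only appears once τ and s carry information.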

8. Pheromone deposition rule (where the stigmergy hides)¶

When a walker visits v, it deposits Δτ proportional to what happens there:

Δτ(v) =  w₁ · 1[v appears in a new equation in the corpus]
       + w₂ · 1[v appears in a measurement in a new domain]
       + w₃ · 1[v appears in a cross-domain analogy]
       + w₄ · 1[v gets a human / model name attempt]
       − w₅ · 1[v was tried as a name and abandoned]

The trace itself is the discovery record. The lattice with its pheromone field at time t is a literal map of where the community's attention has been and where it is heading — readable independent of any particular walker.

This is the stigmergic property: no central controller decides what is interesting. Each walker reads τ, contributes Δτ, and the colony's collective attention emerges as a heat-map on ℤ⁹. The next named unit is wherever the heat-map develops a peak that has no name yet.
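Evaporation and deposition together form one update per tick. A minimal sketch; the numeric weights w₁..w₅, the event labels, and the clamp at zero are hypothetical choices, since the rule above leaves them open:

```python
from typing import Dict, Iterable, Tuple

Vec = Tuple[int, ...]

# Hypothetical values for w1..w5.
W = {
    "new_equation": 1.0,     # w1
    "new_domain": 2.0,       # w2
    "analogy": 1.5,          # w3
    "name_attempt": 3.0,     # w4
    "name_abandoned": 2.0,   # w5 (subtracted)
}
POSITIVE = ("new_equation", "new_domain", "analogy", "name_attempt")

def delta_tau(events: Iterable[str]) -> float:
    """Δτ for one visit: add the first four indicators, subtract the fifth."""
    seen = set(events)
    gain = sum(W[e] for e in POSITIVE if e in seen)
    loss = W["name_abandoned"] if "name_abandoned" in seen else 0.0
    return gain - loss

def tick(tau: Dict[Vec, float], visits: Dict[Vec, Iterable[str]],
         rho: float = 0.05) -> Dict[Vec, float]:
    """One time step: evaporate everywhere, then deposit at visited points."""
    new_tau = {v: (1.0 - rho) * t for v, t in tau.items()}
    for v, events in visits.items():
        # Assumption: clamp at zero so an abandoned name cannot drive τ negative.
        new_tau[v] = max(0.0, new_tau.get(v, 0.0) + delta_tau(events))
    return new_tau

v = (0, 0, -1, 0, 0, 1, 0, 0, 0)
tau = tick({}, {v: ["new_equation", "new_domain"]})
tau = tick(tau, {})   # no visits: pure evaporation
print(tau[v])
```

Returning a fresh dict each tick keeps the field's history easy to snapshot, which matters if the pheromone map itself is the discovery record.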

9. Discovery criterion — when do we name a point?¶

A vector v ∉ Named is promoted to candidate when, over a window of T steps:

   τ(v, t) > θ_τ                       (sustained attention)
   and ∂P_observed(v) / ∂t > θ_grow    (rising use)
   and Reach(v, t) ≥ R_min             (cross-domain — not parochial)
   and ¬ ∃ u ∈ Named with u ≈ v        (not a re-naming)

Picking the four thresholds is itself a design problem; in practice they are tuned against the historical record: apply the rule retroactively to corpora from 1850–2000 and check whether katal, lux, gray, sievert, lumen each get flagged before they were formally named. A rule that retro-predicts past namings is a rule worth running forward.
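The promotion test is a conjunction of four checks and can be written as a single predicate. A minimal sketch under stated assumptions: "sustained" is read as τ staying above θ_τ for the whole window, growth is the average slope of P_observed across the window, and u ≈ v is read as L1 distance at most `near_radius`. The default threshold values are placeholders, not tuned numbers.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence, Tuple

Vec = Tuple[int, ...]

@dataclass
class Thresholds:
    theta_tau: float = 1.0    # placeholder
    theta_grow: float = 0.0   # placeholder
    r_min: int = 2            # placeholder
    near_radius: int = 0      # assumption: u ≈ v means L1 distance <= this

def l1_dist(u: Vec, v: Vec) -> int:
    return sum(abs(a - b) for a, b in zip(u, v))

def is_candidate(v: Vec, tau_window: Sequence[float],
                 p_obs_window: Sequence[float], reach: int,
                 named: Iterable[Vec], th: Thresholds) -> bool:
    """Apply the four promotion conditions over one window (length >= 2)."""
    sustained = min(tau_window) > th.theta_tau
    slope = (p_obs_window[-1] - p_obs_window[0]) / (len(p_obs_window) - 1)
    rising = slope > th.theta_grow
    cross_domain = reach >= th.r_min
    renaming = any(l1_dist(u, v) <= th.near_radius for u in named)
    return sustained and rising and cross_domain and not renaming

v = (0, 0, -1, 0, 0, 1, 0, 0, 0)
print(is_candidate(v, [2.0, 2.5, 3.0], [0.01, 0.02, 0.04],
                   reach=3, named=set(), th=Thresholds()))   # True
```

Keeping the thresholds in one dataclass makes the retro-prediction tuning a search over a single object.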

10. The recursive part — searching for units that name search itself¶

The stigmon σ was itself constructed by this procedure, sketched by hand. The lattice point σ = (-2, -1, 1, 0, 0, 1, 0, 1, -1) is in a region SI didn't reach because two of its axes (b, ch) didn't exist in SI. The walk could only find σ after the lattice was extended: discovery and axis-extension are coupled.

This points at a hierarchy:

level 0:  find new vectors in fixed ℤ⁷                 (e.g. katal)
level 1:  find new vectors in extended ℤ⁹               (e.g. stigmon)
level 2:  find new axes worth adding to ℤⁿ              (e.g. "channel")
level 3:  find new combinators beyond integer exponents,
          e.g. fractional, tropical, modular            (rare; e.g. fractal Hausdorff dimension)

Levels 0–1 are pure search on a fixed lattice. Level 2 expands the basis. Level 3 changes the algebra of how units combine. Most of physics's unit-history is level 0; the few times a level was bumped (entropy → kelvin, then later → bit; or Hausdorff dimension as a non-integer combinator) were revolutions.

A stigmergic search at level 0 can signal when to escalate: if the walk produces clusters that no single integer combination can compress, that is evidence the lattice itself is too small. Failure modes of the walker become discovery signals for the next level up.

11. Reframing existing physics through σ¶

Two quick reframings to test interpretive payoff:

Landauer's principle in stigmon-native form:

σ_max ≡ 1 bit · 1 agent · 1 channel⁻¹ · (kT ln 2)⁻¹ · 1 s⁻¹

Landauer's floor on the energy cost of a bit, kT ln 2, appears in this unit system as a thermodynamic ceiling on stigmergic info-deposition per joule: it is exactly Landauer, with no conversion needed.
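For a sense of scale, the Landauer-limited value can be evaluated numerically. A minimal sketch, assuming room temperature T = 300 K, one agent, one channel, and a one-second window; the function names are illustrative:

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact in the 2019 SI)

def landauer_joules_per_bit(temp_k: float) -> float:
    """Landauer energy per bit at temperature temp_k: k·T·ln 2."""
    return K_B * temp_k * math.log(2)

def landauer_sigma(temp_k: float, agents: int = 1, channels: int = 1,
                   seconds: float = 1.0) -> float:
    """1 bit · agents / (channels · kT ln 2 · seconds), expressed in σ."""
    return agents / (channels * landauer_joules_per_bit(temp_k) * seconds)

print(f"{landauer_joules_per_bit(300.0):.3e} J per bit")
print(f"{landauer_sigma(300.0):.3e} σ")
```

At 300 K the energy per bit is on the order of 10⁻²¹ J, so the Landauer-limited value sits around 10²⁰ σ, many orders of magnitude away from the worked examples in the table.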

Shannon channel capacity:

C [bit/s] = B · log₂(1 + S/N)

becomes a rate in stigmon-space when multiplied by agents / (channels · energy), which is precisely what coordination problems care about (how much information the agents move per channel per joule of attention). Dividing by power instead would leave bit · agent · channel⁻¹ · J⁻¹, one factor of s⁻¹ short of σ. The standard form hides this denominator; the stigmon makes it explicit.
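The conversion can be sketched numerically. Dimensionally, bit/s has to be divided by joules, together with the agent and channel counts, to land on σ's exponent vector, so the helper below takes an energy budget. The function names and example numbers are assumptions for illustration.

```python
import math

def shannon_capacity(bandwidth_hz: float, snr: float) -> float:
    """C = B · log2(1 + S/N), in bit/s; snr is a linear ratio, not dB."""
    return bandwidth_hz * math.log2(1.0 + snr)

def capacity_in_sigma(bandwidth_hz: float, snr: float, energy_j: float,
                      agents: int = 1, channels: int = 1) -> float:
    """Scale capacity by agents / (channels · energy): bit · agent / (channel · J · s)."""
    return shannon_capacity(bandwidth_hz, snr) * agents / (channels * energy_j)

# Illustrative numbers only: 1 kHz bandwidth at S/N = 1 gives 1000 bit/s.
c = shannon_capacity(1_000.0, 1.0)
print(c, capacity_in_sigma(1_000.0, 1.0, energy_j=10.0, agents=2, channels=4))
```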

12. Falsification — when this whole frame is wrong¶

The frame fails if any of the following hold:

The historical named-unit set is not a low-norm sparse subset — i.e. famous units distribute uniformly over the lattice. Test: plot |v|₁ histogram of named SI units. (Strong prediction: heavy concentration at |v|₁ ≤ 4.)
The stigmergic rule cannot retro-predict any historical naming. Run it against the 1850–2000 corpus and see whether katal, gray, lux are flagged before their formal naming dates.
The added axes (b, ch) collapse to a known SI combination once one is rigorous about what counts as a "channel". If ch is secretly always mol, then σ reduces to a renaming, not an extension.
The discovery criterion only flags vectors that are already named in another community's notation. Then the lattice search is reproducing translation, not discovery.

Each failure mode is a falsifiable experiment a single session can run against the historical SI record.
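The first test is quick to run. A minimal sketch over a hand-entered subset of named SI derived units, written in the (m, kg, s, A, K, mol, cd) basis; the subset is illustrative, and a real test would cover all of the named units:

```python
from collections import Counter

# Exponents in the order (m, kg, s, A, K, mol, cd).
NAMED = {
    "hertz":   (0, 0, -1, 0, 0, 0, 0),
    "newton":  (1, 1, -2, 0, 0, 0, 0),
    "pascal":  (-1, 1, -2, 0, 0, 0, 0),
    "joule":   (2, 1, -2, 0, 0, 0, 0),
    "watt":    (2, 1, -3, 0, 0, 0, 0),
    "coulomb": (0, 0, 1, 1, 0, 0, 0),
    "volt":    (2, 1, -3, -1, 0, 0, 0),
    "ohm":     (2, 1, -3, -2, 0, 0, 0),
    "tesla":   (0, 1, -2, -1, 0, 0, 0),
    "lux":     (-2, 0, 0, 0, 0, 0, 1),
    "gray":    (2, 0, -2, 0, 0, 0, 0),
    "katal":   (0, 0, -1, 0, 0, 1, 0),
}

def l1(v):
    return sum(abs(x) for x in v)

hist = Counter(l1(v) for v in NAMED.values())
for n in sorted(hist):
    print(f"|v|_1 = {n}: {'#' * hist[n]}")

inside = sum(count for n, count in hist.items() if n <= 4)
print(f"{inside}/{len(NAMED)} of this subset lie within radius 4")   # 8/12
```

In this encoding the joule has |v|₁ = 5 and the electrical units (watt, volt, ohm) sit further out, so the outcome of the test depends on where the radius cutoff is drawn and on which basis is used.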

13. What this page does not claim¶

It does not claim σ should join SI. It claims σ is one specific candidate naming the procedure outputs, useful as a worked example.
It does not claim the lattice is the right algebra. Level-3 alternatives (modular, tropical, fractional-dimension) may matter for systems near phase transitions.
It does not claim that "stigmergy on the lattice" is the only discovery procedure. It claims it is computable, falsifiable, and runnable today on existing corpora — which is more than most discovery procedures offer.

14. The minimum-cycle for a session trying this¶

1. Pull a corpus of equations (Wikipedia + arxiv abstracts in some domain).
2. Reduce each equation to a vector in ℤ⁹.
3. Build P_observed and a baseline P_prior from pre-2000 named units.
4. Run N_agents = 100 walkers, 10k steps each. Choose α=β=1, ρ=0.05, λ=0.3.
5. Output: top-20 unnamed vectors by combined score.
6. Hand the top-20 to a session and ask: "does each compress a stable concept?
   give one and a candidate name, or report 'no concept'."
7. Repeat with α=0 (pure surprise) and β=0 (pure stigmergy). Diff the outputs:
   the diff is where the combined rule contributes beyond either pure strategy.

The whole pipeline fits in one session-day of work. Even null results would clarify whether dimensional discovery is search-amenable.
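Step 2 of the cycle, reducing an expression to a vector in ℤ⁹, is mostly parsing. A minimal sketch, assuming the corpus units have already been normalised to a hypothetical space-separated form such as `m^2 kg s^-2`; real equations would first need symbol-to-unit resolution, which is not shown:

```python
import re

AXES9 = ("m", "kg", "s", "A", "K", "mol", "cd", "b", "ch")
TOKEN = re.compile(r"([A-Za-z]+)(?:\^(-?\d+))?")

def parse_unit(expr: str):
    """Turn 'm^2 kg s^-2' into its Z^9 exponent vector.

    Repeated symbols accumulate; unknown symbols raise ValueError.
    """
    vec = [0] * len(AXES9)
    for tok in expr.split():
        m = TOKEN.fullmatch(tok)
        if m is None or m.group(1) not in AXES9:
            raise ValueError(f"cannot parse unit token: {tok!r}")
        vec[AXES9.index(m.group(1))] += int(m.group(2) or 1)
    return tuple(vec)

print(parse_unit("m^2 kg s^-2"))   # (2, 1, -2, 0, 0, 0, 0, 0, 0)
print(parse_unit("mol s^-1"))      # (0, 0, -1, 0, 0, 1, 0, 0, 0)
```

The output tuples can be used directly as dictionary keys for the Used, Reach, and pheromone tallies.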

Glossary touchpoint: the bit, channel, agent, trace/pheromone, compression, godding-as-verb all show up as axes or operators in the lattice above; the page is partly an exercise in checking those concepts compose without collision.