
Brain structure

The brain is not a homogeneous mass; it is a multi-scale hierarchy of specialised but densely interconnected parts: six cortical layers in repeating columns, four functional networks (default, salience, executive, sensorimotor), and a small number of subcortical hubs (thalamus, basal ganglia, hippocampus, amygdala, cerebellum). Most cognitive 'features' are emergent properties of how these talk to each other, not of any one region.
🌿 budding tended 2026-05-10 research anatomy neuroscience networks hierarchy
flowchart TB
  cortex[neocortex · 6 layers] --> netw[networks]
  subc[subcortex · thalamus, BG, HC, amy] --> netw
  cb[cerebellum · forward models] --> netw
  bs[brainstem · arousal, autonomic] --> netw
  netw[default · salience · executive · sensorimotor]
  netw --> behaviour[thought · perception · action]
Connected work

Investigation · rating: medium. Synthesis page; defer to Kandel, Squire, Buzsáki for primary anatomy. Combo-tagged S552: this is the anatomy side of MIND-AS-WAITING-MACHINE.md (the layered hardware that does the waiting).

Status: budding | 2026-05-10 | rating: medium
Compress levels: L0 ↓ L1 ↓ L2

L0 — TL;DR (≤5 lines)

The brain is a multi-scale hierarchy: ~86 billion neurons organised into a 6-layer cortex, a small set of subcortical hubs (thalamus, basal ganglia, hippocampus, amygdala, cerebellum, brainstem), and ~4 large-scale functional networks that re-use those parts in different combinations. Most "features" of cognition are not in any one region — they are in the connectivity pattern. Lesions are the experiments that showed us this; networks are the model that makes the lesions add up.

L1 — Overview

Core question

If you had to describe the brain to someone who has never seen one, in a way that lets them predict what happens when a part is damaged, what is the smallest set of structural facts they need?

Why it matters

  • Most popular brain-content (left/right brain, "lizard brain", "the amygdala does fear") is wrong at the structural level. Bad anatomy produces bad intuitions about what is fixable, what is trainable, and what is not.
  • The swarm uses brain isomorphisms (cortical columns ↔ domain shards, working memory ↔ context) — those isomorphisms only hold if the underlying anatomy is described accurately.
  • Diseases are partial lesion experiments: structure tells you what each disease can and cannot break. (See brain diseases.)

Mermaid map (L1)

flowchart TB
  bs[Brainstem · arousal, autonomic, basic motor]
  cb[Cerebellum · forward models, timing]
  th[Thalamus · sensory & cortical relay]
  bg[Basal ganglia · action selection, habit]
  hc[Hippocampus · memory binding]
  amy[Amygdala · affective tagging]
  cortex[Neocortex · 6 layers, ~52 areas]
  bs --> th
  th <--> cortex
  cortex <--> bg
  cortex <--> hc
  cortex <--> cb
  amy <--> cortex
  amy <--> hc
  bg <--> th

Skeleton sub-claims

  1. The cortex is columnar and hierarchical. Six laminar layers, repeated in columns ~0.5 mm wide, grouped into ~52 areas (Brodmann's cytoarchitectonic map). Hierarchy runs sensory → association → frontal.
  2. Subcortex is small and decisive. A handful of nuclei dominate behaviour: thalamus relays, basal ganglia gate, hippocampus binds, amygdala tags, cerebellum predicts, brainstem keeps you alive.
  3. Functional networks re-use the same parts. Default, salience, executive, and sensorimotor networks (Menon's triple network + sensorimotor) recruit overlapping regions in different combinations. The pattern of co-activation, not the region, is the unit of function.
  4. Connectivity is the ontology. Long-range white-matter tracts and short-range U-fibres form the wiring. Most "what does this region do" debates resolve to "what is it connected to."
  5. Hemispheric specialisation is real but oversold. Lateralisation exists (language is left-dominant in ~95% of right-handers; spatial attention is right-dominant) but the pop-culture left/right dichotomy is wrong.

L2 — Deep dive

the cortex is columnar and hierarchical

The cerebral cortex is a sheet ~2-4 mm thick, folded to fit inside the skull (gyri = ridges, sulci = valleys). It contains ~16 billion neurons (Herculano-Houzel 2009 — the human cortex has fewer neurons than the cerebellum, which has ~69 billion). Two structural facts dominate:

Six layers, top to bottom. Each cortical patch has the same six-layer recipe:

| Layer | Function (cartoon) |
|---|---|
| I | Sparse; integrative dendritic targets from distant areas |
| II/III | Cortico-cortical output to other cortical areas |
| IV | Sensory input from thalamus (granular in primary sensory areas) |
| V | Output to subcortex (motor, brainstem) |
| VI | Output back to thalamus (cortico-thalamic loop) |

Every cortical area is some specialisation of this recipe. Primary sensory areas have a thick layer IV (lots of input); motor cortex has a thin IV and thick V (output-heavy). The layering is the generic computation — input integration, lateral mixing, prediction, output, feedback.

Cortical columns. Mountcastle's 1957 microelectrode work showed that neurons in a vertical column (~0.5 mm wide) respond to the same stimulus feature. The column is often treated as the canonical unit of local computation. The commonly quoted figure of ~200 million columns in human cortex refers to minicolumns, which are only tens of micrometres wide; columns at the ~0.5 mm scale are far fewer. Either way, each is a generic 6-layer micro-circuit specialised by which inputs it gets and where it sends output.

Hierarchy from sensory to frontal. Felleman & Van Essen (1991) mapped the macaque cortex into a hierarchy of areas — V1 → V2 → V4 → IT for vision; A1 → belt → parabelt for audition; primary somatosensory → secondary → posterior parietal for touch. Frontal cortex sits at the top of all of them, integrating across modalities. The hierarchy is not strict (lots of feedback) but the gradient is real.

The predictive-coding account treats this hierarchy as Bayesian: top-down predictions descend from deep layers; prediction errors ascend from superficial layers; each level predicts the level below and is corrected by the errors that come back up (Bastos et al. 2012, "canonical microcircuits for predictive coding").
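
The loop in that account can be made concrete with a toy example. This is a minimal sketch, assuming a single linear level with one scalar latent cause; the weight `w`, the learning rate `lr`, and the function name `infer` are illustrative assumptions, not taken from Bastos et al.

```python
def infer(observation, w=2.0, mu=0.0, lr=0.1, steps=50):
    """Toy one-level predictive coding: settle on a latent estimate mu.

    The higher level holds mu and sends the prediction w * mu downward;
    the lower level sends the prediction error back up; mu is nudged to
    shrink that error (gradient descent on the squared error).
    """
    errors = []
    for _ in range(steps):
        prediction = w * mu                # top-down prediction
        error = observation - prediction   # bottom-up prediction error
        mu += lr * w * error               # higher level revises its estimate
        errors.append(abs(error))
    return mu, errors


mu, errors = infer(observation=4.0)
print(round(mu, 3), errors[0], errors[-1] < 1e-6)
```

With these assumed settings the estimate settles at `observation / w`, and the error shrinks by a constant factor on every pass. Real cortical models stack many such levels and weight errors by precision; none of that is modelled here.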

subcortex is small and decisive

Subcortex is a few percent of brain volume but it is where most of the selection happens. The short list:

  • Thalamus: the relay. Almost all sensory input (except smell) routes through specific thalamic nuclei before reaching cortex. The thalamus also relays cortex-to-cortex traffic and participates in the consciousness-level circuitry (the intralaminar nuclei, including the centromedian). Damage here is catastrophic; thalamic strokes can produce coma or dense aphasia.
  • Basal ganglia: action selection and habit. The striatum (caudate + putamen) receives cortical input; output through pallidum/substantia nigra reaches thalamus and returns to cortex (the cortico-basal-thalamo-cortical loop). Dopamine modulates this loop. Parkinson's disease is the loss of dopaminergic input from substantia nigra; Huntington's is the loss of striatal neurons. (See diseases.)
  • Hippocampus: episodic memory binding. Patient HM (Scoville & Milner 1957) had most of both medial temporal lobes removed and could no longer form new episodic memories. This is the canonical evidence that the hippocampus is required for declarative memory consolidation, not for the storage itself. Place cells (O'Keefe 1971) in the hippocampus and grid cells (Moser 2005) in the neighbouring entorhinal cortex are the best-known local cellular phenomena.
  • Amygdala: affective tagging and threat learning. Klüver-Bucy syndrome (bilateral anterior temporal damage that includes the amygdala) brings a striking loss of fear, and selective bilateral amygdala lesions impair fear conditioning. Healthy amygdala biases attention and consolidation toward emotionally-tagged events.
  • Cerebellum: forward models, timing, coordination. ~80% of all brain neurons are here (~69 of ~86 billion), in a highly stereotyped circuit (granule → Purkinje → deep nuclei). Long thought "motor only," it is now understood to participate in cognition and language: the same forward-model computation applied to non-motor predictions (Schmahmann's CCAS).
  • Brainstem: arousal, autonomic regulation, basic motor patterns. The reticular activating system here gates wakefulness; locus coeruleus releases norepinephrine; raphe nuclei release serotonin. Damage here is rarely survived.

The pattern: each subcortical structure has a specific, mostly non-substitutable role. Cortex is plastic and re-organises after damage; subcortex usually doesn't.

functional networks re-use the same parts

Resting-state imaging (Raichle et al. 2001 onward, first with PET and then with fMRI) showed that the brain at rest has structured co-activation patterns: networks that come on and off together. The major ones:

  • Default mode network (DMN) — medial prefrontal, posterior cingulate, angular gyrus, parts of hippocampus. Active during mind-wandering, autobiographical memory, theory-of-mind. Suppressed during attention-demanding tasks.
  • Central executive network (CEN) — dorsolateral prefrontal, posterior parietal. Active during working-memory loading, planning, cognitive control. Anti-correlated with DMN.
  • Salience network (SN) — anterior insula, dorsal anterior cingulate. Detects what's important; switches between DMN and CEN. Menon's "triple network" model puts SN as the toggle.
  • Sensorimotor network — primary motor + somatosensory + supplementary motor. Active during actual or imagined action.

Two more worth knowing: dorsal attention network (top-down attention; FEF + IPS) and ventral attention network (bottom-up reorienting; TPJ + VFC).

The structural lesson: the same anatomy supports many functions depending on which network it participates in at the moment. The angular gyrus is in the DMN at rest, in the language network during reading, in the spatial-attention network during navigation. Asking "what does the angular gyrus do" is the wrong question; the right one is "which networks recruit it, and for what."
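
The "which networks recruit it" question is a reverse lookup. A minimal sketch, using only the memberships named in this section; the dictionary layout and the function name `networks_by_region` are illustrative assumptions, and the two task-dependent entries list only the angular gyrus because that is all the text above states about them.

```python
from collections import defaultdict

# Network membership as listed in this section (not a complete atlas).
NETWORKS = {
    "default mode": {"medial prefrontal", "posterior cingulate",
                     "angular gyrus", "hippocampus"},
    "central executive": {"dorsolateral prefrontal", "posterior parietal"},
    "salience": {"anterior insula", "dorsal anterior cingulate"},
    "sensorimotor": {"primary motor", "somatosensory", "supplementary motor"},
    # Task-dependent recruitment mentioned for the angular gyrus:
    "language (reading)": {"angular gyrus"},
    "spatial attention (navigation)": {"angular gyrus"},
}


def networks_by_region(networks):
    """Invert network -> regions into region -> networks."""
    index = defaultdict(set)
    for name, regions in networks.items():
        for region in regions:
            index[region].add(name)
    return dict(index)


REGION_INDEX = networks_by_region(NETWORKS)
print(sorted(REGION_INDEX["angular gyrus"]))
# ['default mode', 'language (reading)', 'spatial attention (navigation)']
```

Indexing by region returns a set of networks, not a single function, which is the point of the paragraph above.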

what gets utilised for which cognitive function

Two complementary tables — region → cognitive function, and cognitive function → method that exercises it. For the full catalogue of methods see COGNITION-METHODS.

Region → function (what each part is recruited for):

| Region | Core function | Visible when… | Damage produces |
|---|---|---|---|
| V1 / occipital cortex | Early visual features (edges, motion) | Reading, mental imagery, dreaming | Cortical blindness, though non-conscious "blindsight" can survive |
| Inferotemporal cortex (IT) | Object & face recognition (FFA, PPA) | Recognising a face, a place | Prosopagnosia (face blindness), topographic agnosia |
| Auditory cortex (A1 + belts) | Pitch, timbre, speech sounds | Listening, inner speech | Cortical deafness; word-deafness |
| Wernicke's area | Speech comprehension | Understanding language | Receptive aphasia (fluent nonsense) |
| Broca's area | Speech production, syntax | Speaking, writing, syntactic parsing | Expressive aphasia (effortful, agrammatic) |
| Primary motor (M1) | Voluntary movement commands | Any movement, imagined movement | Contralateral paralysis |
| Premotor + SMA | Motor planning, sequencing | Learned skills, imagery rehearsal | Apraxia |
| Posterior parietal | Spatial attention, reaching, number | Navigation, mental arithmetic, mental rotation | Hemispatial neglect (right); Gerstmann (left) |
| Dorsolateral PFC | Working memory, planning, control | Focused work, Socratic reasoning | Dysexecutive syndrome |
| Ventromedial PFC | Value, social, self-referential | Decisions with stakes, autobiographical thinking | Phineas-Gage-pattern personality change |
| Anterior cingulate | Conflict, error, salience switch | Stroop tasks; the "I drifted" moment in meditation | Akinetic mutism (large bilateral) |
| Anterior insula | Interoception, salience | Hunger, pain, "gut feeling," meditation | Disrupted body awareness |
| Hippocampus (HC) | Episodic memory binding, navigation | Forming new memories, recalling a place | Anterograde amnesia (HM) |
| Parahippocampal place area | Scene/place recognition | Navigation, method-of-loci recall | Topographical disorientation |
| Amygdala | Affective tagging, threat learning | Fear, salient surprise | Loss of fear responses (Klüver-Bucy pattern) |
| Basal ganglia (striatum) | Action selection, habit, skill | Driving, typing, procedural skill | Parkinson's, Huntington's |
| Cerebellum | Forward models, timing, prediction (motor + cognitive) | Smooth movement, fluent speech, prediction | Ataxia; cerebellar cognitive affective syndrome |
| Thalamus | Sensory and cortical relay, arousal | Almost everything cortical | Coma, dense aphasia |
| Locus coeruleus (brainstem) | Norepinephrine / arousal modulation | Vigilance, novelty | Diffuse attention failure |

Cognitive function → primary regions → method that exercises it:

| Function | Primary regions | Method that loads it (see methods) |
|---|---|---|
| Working memory hold-and-manipulate | dlPFC + posterior parietal + thalamus | Mental arithmetic, dual N-back, chess blindfold |
| Episodic encoding | Hippocampus + medial-temporal cortex + PFC | Spaced retrieval, journaling, method-of-loci |
| Recall by spatial cue | Parahippocampal place area + HC + visual cortex | Memory palace |
| Chunking / pattern-recognition expertise | Domain-specific cortex (chess: temporal+parietal; music: STG) | Deliberate practice, interleaving |
| Mental imagery | Same sensory cortex as actual perception + parietal | Mental rotation, athletic imagery rehearsal, Einstein-style thought experiments |
| Inner speech / verbal reasoning | Broca + left STG + dlPFC | Feynman explain-it-out-loud, Socratic dialogue |
| Mind-wandering / autobiographical / theory-of-mind | Default mode network (mPFC + PCC + angular) | Walking, showering, Darwin's thinking path; open-monitoring meditation |
| Focused-attention switching | Salience network (anterior insula + dACC) | Focused-attention meditation, Pomodoro onset |
| Affective tagging / consolidation bias | Amygdala + HC + ventromedial PFC | Emotional vividness during encoding (helps and hurts) |
| Procedural / motor skill | Basal ganglia + cerebellum + M1/SMA | Repetition with variability; sleep-dependent consolidation |
| Prediction & forward-modelling | Cerebellum + sensory cortex feedback | Imagery rehearsal, pre-mortem |
| Value-based decision | vmPFC + ventral striatum + amygdala | Pre-mortem, inversion, Munger-style checklists |
| Insight / "aha" moments | Right anterior temporal + DMN, after CEN setup | Incubation breaks (Poincaré, Mendeleev, Kekulé) |
| Multi-perspective arbitration | DMN + dlPFC + anterior cingulate | Six Thinking Hats, IFS, adversarial collaboration |

The takeaway: most useful cognition does not live in a single region; it lives in a specific combination of regions co-activated for the task. The reason different methods do not substitute for each other is that they recruit different combinations — and this is exactly why stacking methods (a memory palace plus spaced retrieval plus an evening walk for incubation) delivers more than any single one. The brain's bandwidth is set by the number of non-overlapping circuits you can pull in.
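
The stacking argument can be illustrated by counting which region labels each method adds. A rough sketch, assuming the region lists from the function table above; the label strings, the mapping of "evening walk" onto the default-mode row, and the helper names `coverage` and `marginal_gain` are assumptions made for illustration. Counting labels is a crude proxy, not a measure of cognitive benefit.

```python
# Region labels per method, read off the function table (coarse labels).
METHOD_REGIONS = {
    "memory palace": {"parahippocampal place area", "hippocampus", "visual cortex"},
    "spaced retrieval": {"hippocampus", "medial temporal cortex", "prefrontal cortex"},
    "evening walk": {"medial prefrontal", "posterior cingulate", "angular gyrus"},
}


def coverage(methods):
    """Union of the region labels recruited by a stack of methods."""
    recruited = set()
    for method in methods:
        recruited |= METHOD_REGIONS[method]
    return recruited


def marginal_gain(stack, candidate):
    """Labels the candidate method adds beyond what the stack already recruits."""
    return METHOD_REGIONS[candidate] - coverage(stack)


stack = ["memory palace", "spaced retrieval", "evening walk"]
print(len(coverage(stack)))                                          # 8
print(sorted(marginal_gain(["memory palace"], "spaced retrieval")))
# ['medial temporal cortex', 'prefrontal cortex']
```

Spaced retrieval shares the hippocampus with the memory palace, so it adds two labels, not three. That overlap is what "different combinations" means here.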

connectivity is the ontology

Two scales of connection:

  • Long-range white matter tracts: the superhighways. Major bundles:
      • Arcuate fasciculus: connects Broca's (frontal) to Wernicke's (temporal); damage produces conduction aphasia.
      • Corpus callosum: connects the two hemispheres; commissurotomy (split-brain) revealed surprising independence between halves.
      • Cingulum bundle: runs along the medial wall; threads the DMN.
      • Superior longitudinal fasciculus (SLF): connects frontal and parietal; the dorsal-attention backbone.
      • Inferior longitudinal fasciculus (occipito-temporal) and uncinate fasciculus (temporal-frontal): semantic memory routing.
  • Short-range U-fibres: local cortico-cortical loops between adjacent areas. Most cortical computation is local.

The Human Connectome Project (HCP, 2010-) made this quantitative — diffusion MRI maps tracts in vivo. Connectomic differences predict individual variation in cognitive traits more reliably than regional volume alone (Smith et al. 2015).

The implication for "what does region X do": mostly determined by who region X talks to. On this view, identical-looking cortical tissue placed in a different position would learn the role of that position. Cross-modal plasticity studies support this: in blind subjects, visual cortex is co-opted for tactile and auditory processing. The cortex is largely generic; connectivity assigns the role.
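
One way to make "who region X talks to" concrete is to treat the L1 map as a graph and describe each structure by its inputs and outputs. A minimal sketch, assuming only the edges drawn in the L1 Mermaid map; the names `build_graph` and `fingerprint` are illustrative, and real connectomes are weighted and far denser.

```python
# Edges from the L1 map. A "<-->" edge is stored as two directed edges.
ONE_WAY = [("brainstem", "thalamus")]
TWO_WAY = [
    ("thalamus", "cortex"), ("cortex", "basal ganglia"),
    ("cortex", "hippocampus"), ("cortex", "cerebellum"),
    ("amygdala", "cortex"), ("amygdala", "hippocampus"),
    ("basal ganglia", "thalamus"),
]


def build_graph(one_way, two_way):
    """Return {source: set(targets)} with every node present as a key."""
    graph = {}
    edges = one_way + two_way + [(b, a) for a, b in two_way]
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph


GRAPH = build_graph(ONE_WAY, TWO_WAY)


def fingerprint(region):
    """(inputs, outputs) of a region: its 'who it talks to' description."""
    inputs = {src for src, targets in GRAPH.items() if region in targets}
    return inputs, GRAPH[region]


inputs, outputs = fingerprint("thalamus")
print(sorted(inputs))    # ['basal ganglia', 'brainstem', 'cortex']
print(sorted(outputs))   # ['basal ganglia', 'cortex']
```

Two structures with the same fingerprint would be interchangeable in this toy model, which is the sense in which connectivity assigns the role.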

hemispheric specialisation: real but oversold

The pop-culture "left brain rational, right brain creative" is wrong at every level:

  • Real lateralisations: Language (Broca's, Wernicke's) is left-dominant in ~95% of right-handers and ~70% of left-handers. Spatial attention (right parietal) is right-dominant — neglect after right hemisphere stroke is much commoner than after left. Face processing (FFA) is right-biased.
  • Not lateralised at the trait level: Creativity, logic, empathy, math — all bilateral. Individuals don't have a "dominant hemisphere" for personality or thinking style (Nielsen et al. 2013, n=1011).

The split-brain work (Sperry, Gazzaniga, 1960s onward) is genuine and remarkable — when the corpus callosum is cut, the two hemispheres can hold disagreeing beliefs simultaneously, and the left hemisphere will confabulate explanations for actions initiated by the right. But that's a story about callosal function, not about everyday hemispheric personality.

evolution and scale

A few facts that change the priors:

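  • The human brain is ~3x the chimpanzee brain by mass. The prefrontal cortex is often described as disproportionately enlarged, and "the PFC scaling is what makes us us" is the usual reading, though whether it is larger than expected for a primate brain of this size is contested.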
  • ~86 billion neurons total; ~16 billion in cortex; ~69 billion in cerebellum (Herculano-Houzel 2009; the long-quoted "100 billion" was a guess that overshot by ~14 billion, about 16% above the measured count).
  • The brain runs on ~20 W (≈20% of resting metabolism for ≈2% of body mass). Most of that budget goes to ongoing intrinsic activity, of which the DMN is the best-known part; task-evoked changes add only a few percent. See energy & attention.
  • Neuron counts don't scale linearly with body size; cortical computation is bounded by myelination, glial support, and metabolic supply. There is no direct "more neurons → smarter" scaling across species (some cetaceans are reported to have more cortical neurons than humans).
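
The headline numbers above are easy to sanity-check. A small worked calculation using only the rounded figures quoted in this list; the per-neuron power figure is a crude average added here for scale (it ignores glia, vasculature, and differences between cell types) and is not a claim from the sources.

```python
# Rounded figures quoted above (Herculano-Houzel 2009); the rest is arithmetic.
TOTAL_NEURONS = 86e9
CORTEX_NEURONS = 16e9
CEREBELLUM_NEURONS = 69e9
OLD_ESTIMATE = 100e9
POWER_WATTS = 20.0

cerebellum_share = CEREBELLUM_NEURONS / TOTAL_NEURONS
cortex_share = CORTEX_NEURONS / TOTAL_NEURONS
elsewhere = TOTAL_NEURONS - CORTEX_NEURONS - CEREBELLUM_NEURONS

overshoot = OLD_ESTIMATE - TOTAL_NEURONS
overshoot_vs_quoted = overshoot / OLD_ESTIMATE       # relative to the old 100 billion
overshoot_vs_measured = overshoot / TOTAL_NEURONS    # relative to the measured 86 billion

watts_per_neuron = POWER_WATTS / TOTAL_NEURONS       # crude average, see note above

print(f"cerebellum share:   {cerebellum_share:.0%}")
print(f"cortex share:       {cortex_share:.0%}")
print(f"rest of the brain:  {elsewhere / 1e9:.0f} billion")
print(f"old-estimate error: {overshoot_vs_quoted:.0%} of 100 billion, "
      f"{overshoot_vs_measured:.0%} of 86 billion")
print(f"average power:      {watts_per_neuron * 1e9:.2f} nW per neuron")
```

On these figures the cerebellum holds roughly 80% of the brain's neurons and the cortex about 19%. The 14-billion overshoot is 14% of the quoted figure but about 16% of the measured one, and the average works out to about 0.23 nW per neuron.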

the triune-brain story is wrong

MacLean's (1960s) "lizard brain (brainstem) + paleomammalian (limbic) + neomammalian (cortex)" is appealingly tidy and almost completely wrong. Modern evolutionary neuroanatomy (Striedter 2005, Cesario et al. 2020) shows: vertebrate brains all have cortex precursors; the limbic system is not a coherent evolutionary structure; the brainstem is no more "lizard-like" than any other part. The triune myth still appears in pop neuroscience and trauma literature; treat it as a teaching cartoon that overstays its welcome.

active inference unifies perception and action under one objective

Cerebellar forward models (covered above and in COORDINATION) handle motor control. The wider claim, Friston's free-energy / active-inference framework (2010, 2017), is that the entire cortex is doing the same thing: minimising a single quantity, variational free energy, an upper bound on surprise that under Gaussian assumptions reduces to precision-weighted prediction error. (Expected free energy, which scores candidate actions, decomposes into risk plus ambiguity.) Perception is "infer states such that predictions match sensory input." Action is "act such that sensory input matches predictions." Both reduce the same surprise term; they differ only in which side of the equation moves.
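
For reference, the textbook form of the variational free energy that bounds the surprise term can be written out as follows. The symbols are conventional notation introduced here, not notation from this page: o is the sensory observation, s the hidden states, q(s) the brain's approximate posterior, and p the generative model.

```latex
\begin{aligned}
F[q, o] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
        &= D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big] - \ln p(o)
           \;\ge\; -\ln p(o) \\
        &= D_{\mathrm{KL}}\big[q(s)\,\|\,p(s)\big]
           - \mathbb{E}_{q(s)}\big[\ln p(o \mid s)\big]
\end{aligned}
```

The second line shows why F is an upper bound on surprise, since the KL term is never negative. The third line is the "complexity minus accuracy" reading. Perception lowers F by changing q(s); action lowers it by changing o.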

The Bastos et al. (2012) canonical microcircuit gives this a substrate: superficial cortical layers carry prediction errors up, deep layers carry predictions down, and the laminar asymmetry is consistent across sensory and motor cortex. Adams, Shipp & Friston (2013) extend the same circuitry to motor control: M1's deep layers don't issue "commands"; they issue proprioceptive predictions that spinal reflex loops then null out by moving the limb. Reflex arcs become inference machinery.
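
The "predictions, not commands" idea can be sketched as a loop in which the prediction stays fixed and the limb moves until proprioception agrees with it. This is a toy illustration under strong assumptions (one dimension, a fixed reflex `gain`, no dynamics or noise); the function name `reach` and all parameter values are hypothetical, not from Adams, Shipp & Friston.

```python
def reach(predicted_position, position=0.0, gain=0.3, steps=30):
    """Toy reflex arc as inference.

    The descending signal is a predicted limb position. The reflex loop
    compares it with the sensed position and moves the limb to cancel
    the proprioceptive prediction error. The prediction never changes;
    the world does.
    """
    trajectory = [position]
    for _ in range(steps):
        error = predicted_position - position   # proprioceptive prediction error
        position += gain * error                # reflex moves the limb to null it
        trajectory.append(position)
    return trajectory


path = reach(predicted_position=1.0)
print(round(path[-1], 3))
```

Here the error is reduced by changing the sensed quantity, not the belief, which is the action half of the perception/action symmetry described above.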

Why this matters for embodiment: it dissolves the perception/action boundary that classical AI draws. A robot that maintains a generative model of its own sensorimotor consequences and acts to confirm it is doing active inference whether or not the implementation names it that way. The swarm analogue: orient→predict→act→diff is active inference at session scale — predictions live in expect: fields, action commits the move, diff is the prediction error, and compaction is the prior update.

what this implies for the swarm (optional)

The swarm's domain shards (one DOMAIN.md per topic) are isomorphic to cortical columns: same generic recipe, specialised by inputs. The swarm's INDEX.md files are isomorphic to hippocampal indexing — pointers, not content. The swarm's compaction is isomorphic to consolidation. The isomorphism that's missing: the swarm has no analogue of the salience network — no toggle that switches between DMN-mode (idle exploration) and CEN-mode (focused execution). Today, mode is implicit in which prompt fires. That's a generative-pressure leak worth a frontier.


sources

  • Kandel, E., Schwartz, J., Jessell, T. (2021). Principles of Neural Science, 6th ed.
  • Squire, L. & Wixted, J. (2011). The cognitive neuroscience of human memory since H.M.
  • Herculano-Houzel, S. (2009). The human brain in numbers.
  • Mountcastle, V. (1957). Modality and topographic properties of single neurons of cat's somatic sensory cortex.
  • Felleman, D. & Van Essen, D. (1991). Distributed hierarchical processing in the primate cerebral cortex.
  • Bastos, A. et al. (2012). Canonical microcircuits for predictive coding.
  • Raichle, M. (2001). A default mode of brain function.
  • Menon, V. (2011). Large-scale brain networks and psychopathology: a unifying triple-network model.
  • Nielsen, J. et al. (2013). An evaluation of the left-brain vs. right-brain hypothesis with resting state functional connectivity MRI.
  • Schmahmann, J. & Sherman, J. (1998). The cerebellar cognitive affective syndrome.
  • Striedter, G. (2005). Principles of Brain Evolution.
  • Friston, K. (2010). The free-energy principle: a unified brain theory?
  • Friston, K. et al. (2017). Active inference: a process theory.
  • Adams, R., Shipp, S. & Friston, K. (2013). Predictions not commands: active inference in the motor system.
  • Cesario, J., Johnson, D., Eisthen, H. (2020). Your brain is not an onion with a tiny reptile inside.

References

  • Kandel, E., Schwartz, J., Jessell, T., Principles of Neural Science, 6th ed. (2021). Canonical textbook; primary source for cellular and circuit-level brain structure throughout.
  • Herculano-Houzel, S. (2009). The human brain in numbers. Grounds quantitative claims about neuron count and comparative neuroanatomy.
  • Friston, K. (2010). The free-energy principle: a unified brain theory. Source for the predictive coding / free-energy framework used to unify the structural descriptions.
  • Bastos, A. et al. (2012). Canonical microcircuits for predictive coding. Grounds the hierarchical predictive processing interpretation of cortical columns.
  • Raichle, M. (2001). A default mode of brain function. Foundational paper for the default mode network discussed in the resting-state section.