Traveling Salesman¶
flowchart LR
cities[n cities + distances] --> tour[permutation π]
tour --> cost["Σ d(π_i, π_{i+1})"]
cost --> min[minimise]
- Math dependency trees — where TSP sits in the lattice
- Span of logic gates — another worst-case vs practice story
Dantzig–Fulkerson–Johnson 1954; Christofides 1976; Lin–Kernighan 1973; Arora 1996; Karlin–Klein–Gharan 2020; Concorde solver.
doc_version: 1.0 | 2026-05-16
Given n cities and a distance d(i,j) between each pair, find a tour visiting each city exactly once and returning to the start, of minimum total length. The decision version ("is there a tour ≤ L?") was one of Karp's original 21 NP-complete problems (1972). The optimisation version is NP-hard.
And yet: the Concorde solver has cracked instances with 85,900 cities to
provable optimum (the pla85900 VLSI instance, 2006), and Lin-Kernighan
variants routinely produce tours within 0.01% of optimum on million-node
instances in minutes. This gap — theoretical intractability vs practical
solvability — is the main thing TSP teaches.
1. The Problem, Tightly¶
Input. A complete weighted graph K_n with edge weights d : E → ℝ_{≥0}.
Output. A Hamiltonian cycle C minimising Σ_{(i,j) ∈ C} d(i,j).
Variants by structure on d:
| Name | Constraint on d | Hardness of approximation |
|---|---|---|
| General TSP | none | not approximable within any constant unless P=NP |
| Metric TSP | d(i,j) ≤ d(i,k) + d(k,j) | 1.5 − 10⁻³⁶ (KKG 2020) |
| Euclidean TSP | cities in ℝ², d = L² | PTAS (Arora 1996, Mitchell 1996) |
| Asymmetric TSP | d(i,j) ≠ d(j,i) allowed | O(1)-approximable (Svensson–Tarnawski–Végh 2018) |
| (1,2)-TSP | d ∈ | 8/7-approximable |
The metric assumption is the cliff edge: general TSP is inapproximable because you can encode Hamiltonian-cycle decision into it with infinite penalties; once triangle inequality is back, every shortcut is at most as long as the path it bypasses, and approximation becomes possible.
2. Why It's Hard (Theory)¶
Three independent sources of hardness, each blocking a natural attack:
Exponential search space. (n−1)!/2 distinct tours. Brute force is dead at n ≈ 15.
Held-Karp DP is exponential. Define OPT(S, j) = shortest path from 1 through exactly S ending at j. Recurrence is O(n² · 2ⁿ) time, O(n · 2ⁿ) space. Best known exact worst-case algorithm — and the exponent in n is believed essential under the Exponential Time Hypothesis.
The natural ILP has exponentially many constraints. Variables x_{ij} ∈ {0,1}, degree-2 at every vertex. To forbid subtours you need a constraint for every subset S ⊊ V:
Σ_{i ∈ S, j ∉ S} x_{ij} ≥ 2 for all ∅ ⊊ S ⊊ V
That's 2ⁿ − 2 constraints. The LP relaxation also has fractional vertices (half-integral solutions on the Petersen graph), so even rounding is nontrivial.
3. Why It's Easy (Practice)¶
Cutting planes turn 2ⁿ constraints into a few thousand. You don't enumerate subtour constraints — you find violated ones on demand using min-cut (Padberg–Rinaldi 1991). Concorde solves the LP relaxation, finds the most violated cut, adds it, repeats. Branch only when no cut tightens the LP. This is branch-and-cut and it is the reason exact TSP works.
Lin-Kernighan is sequential 2-opt with backtracking. A 2-opt move removes two edges and reconnects to form a new tour. LK chains these: remove an edge, add an edge, remove an edge, add an edge, ... — keeping a running gain and accepting the sequence only if the cumulative gain is positive. LKH (Helsgaun's variant) adds 5-opt and α-nearness candidate lists. On random Euclidean instances it produces tours within 0.05% of optimum at n = 10⁶.
Christofides–Serdyukov gives 1.5-approximation for metric TSP in polynomial time: 1. Build a minimum spanning tree T. 2. Find a minimum-weight perfect matching M on the odd-degree vertices of T (Edmonds' algorithm). 3. T ∪ M is Eulerian; shortcut the Euler tour to a Hamiltonian one.
The 1.5 bound stood from 1976 to 2020, when Karlin, Klein, and Oveis Gharan shaved it by 10⁻³⁶ using randomised half-integral relaxation rounding — the first improvement in 44 years. The actual constant doesn't matter; the proof of some improvement does, because it cracked a stuck question.
4. The Algorithm Map¶
flowchart TD
P[TSP instance]
P --> E[exact]
P --> H[heuristic]
P --> A[approximation]
E --> HK[Held-Karp DP<br/>O(n² 2ⁿ), n ≲ 25]
E --> BC[Branch-and-cut / Concorde<br/>n ≲ 10⁵ in practice]
H --> NN[nearest neighbour<br/>~25% over optimum]
H --> O2[2-opt / 3-opt<br/>~5% over optimum]
H --> LK[Lin-Kernighan / LKH<br/>~0.05% over optimum]
H --> EA[GA, ACO, SA<br/>variable, often worse than LKH]
A --> CH[Christofides<br/>1.5 · OPT, metric only]
A --> KKG[KKG 2020<br/>1.5 − 10⁻³⁶, metric]
A --> PTAS[Arora PTAS<br/>(1+ε) · OPT, Euclidean]
5. Reductions That Matter¶
- ATSP → STSP. Replace each vertex v with three vertices v_in, v_mid, v_out connected by a path with very negative weights; original edges attach to v_in and v_out. A symmetric tour in the expanded graph encodes a direction in the original. Cost: 3× the vertices.
- TSP → Hamiltonian Cycle. Set d(i,j) = 1 for edges in G, d(i,j) = 2 otherwise. Tour of length n iff G is Hamiltonian.
- Hamiltonian Cycle → TSP is what makes TSP NP-hard.
- TSP path → TSP cycle. Add a dummy vertex with distance 0 to all others.
6. The Lesson¶
The reason TSP is tractable in practice is structure in the input. Real instances — chip routing, drilling, vehicle routing — have geographic clustering, low intrinsic dimension, and a Lipschitz-ish distance function. Under such inputs, the LP relaxation is nearly integral, cuts close the gap quickly, and 2-opt's landscape has few deep local minima.
The pathological inputs that force exponential behaviour (random high-dimensional distance matrices with no metric structure) do not occur in production.
This is the general shape of much of applied algorithmics: worst-case complexity is a property of the input distribution you assume, not of the problem itself. SAT, ILP, graph isomorphism, and TSP all have the same shape — undecidable or NP-hard in the lab, routinely solved at industrial scale. Linear programming had it too, until Khachiyan (1979) and Karmarkar (1984) closed the theory-practice gap from the theory side.
The open frontier for TSP is the same as for the rest: characterise the input distributions under which polynomial-time solvers provably succeed, rather than the worst case under which they provably fail. Smoothed analysis (Spielman–Teng 2001, originally for simplex) is the cleanest extant framework.
7. Numbers Worth Remembering¶
| Instance | n | First solved to optimum | Solver | CPU time |
|---|---|---|---|---|
| dantzig42 | 42 | 1954 | DFJ by hand | days (people) |
| gr120 | 120 | 1977 | Grötschel | hours |
| pr2392 | 2,392 | 1987 | Padberg–Rinaldi | hours |
| usa13509 | 13,509 | 1998 | Concorde | weeks (cluster) |
| sw24978 | 24,978 | 2004 | Concorde | months (cluster) |
| pla85900 | 85,900 | 2006 | Concorde | 136 CPU-years |
The gap from 42 to 85,900 in 52 years is ~12% growth per year — close to Moore's Law on the hardware side, but with algorithmic improvements contributing at least as much. Concorde with 1986 hardware would still beat 1986 Concorde with 2006 hardware.