Benford's Law and Ergodic Theory

Why does the digit 1 lead 30% of real-world numbers?

UT Math DRP Symposium, April 24, 2025

Abstract: Working from Nillsen's Randomness and Recurrence in Dynamical Systems, I presented a proof sketch of Benford's Law using dynamical systems: leading digits map to mantissas via $\log_{10},$ and Weyl equidistribution on irrational circle rotations (Kronecker systems) yields the formula. I also studied recurrence (Poincaré, Kac), normal numbers, and ergodicity as the shared framework for “random-looking” digit behavior.

The Phenomenon

Benford's Law shows up in tax returns, stock prices, physical constants, population data - anywhere numbers span orders of magnitude. The leading digit $d$ satisfies $P(d) = \log_{10}\left(1 + \frac{1}{d}\right),$ so 1 leads 30.1% of the time and 9 just 4.6%.
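As a quick check on the formula above, here is a minimal Python sketch that tabulates $P(d)$ for each digit. The function name `benford_probability` is just an illustrative choice, not something from the talk.

```python
import math

def benford_probability(d: int) -> float:
    """Benford probability that the leading digit equals d (d = 1, ..., 9)."""
    return math.log10(1 + 1 / d)

# Tabulate the law; the nine probabilities telescope to log10(10) = 1.
probabilities = {d: benford_probability(d) for d in range(1, 10)}
for d, p in probabilities.items():
    print(f"{d}: {p:.3f}")
```

The sum telescopes because $\log_{10}(1 + 1/d) = \log_{10}(d+1) - \log_{10}(d),$ which is why the nine values form a probability distribution.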

Why It Happens

Take the sequence $2^n$. Any positive real number can be written as $10^k \cdot m$ where $k$ is an integer and $m \in [1, 10)$ is the mantissa - the part that determines the leading digit. Since $2 = 10^{\log_{10}(2)},$ we have $2^n = 10^{n \log_{10}(2)}.$

Write $n \log_{10}(2) = k + \{n \log_{10}(2)\}$ where $k = \lfloor n \log_{10}(2) \rfloor$ and $\{x\} = x - \lfloor x \rfloor$ is the fractional part. Then $2^n = 10^k \cdot 10^{\{n \log_{10}(2)\}},$ so $m = 10^{\{n \log_{10}(2)\}}$. The leading digit is $d$ iff $m \in [d, d+1),$ i.e. $\{n \log_{10}(2)\} \in [\log_{10}(d), \log_{10}(d+1))$.
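The mantissa computation can be sketched in Python and compared against exact integer arithmetic. This is a floating-point illustration, not part of the proof; the function names are hypothetical, and the comparison starts at $n = 4$ because for $2, 4, 8$ the mantissa is an exact integer, where rounding could in principle tip the result either way.

```python
import math

def leading_digit_via_log(n: int) -> int:
    """Leading digit of 2**n read off the fractional part of n*log10(2)."""
    frac = (n * math.log10(2)) % 1.0   # {n log10(2)}
    mantissa = 10 ** frac              # m in [1, 10)
    return int(mantissa)               # floor(m) is the leading digit

def leading_digit_exact(n: int) -> int:
    """Leading digit of 2**n using exact integer arithmetic."""
    return int(str(2 ** n)[0])

# For n >= 4 the mantissa is never an integer, so floating-point
# rounding is not expected to change the floor in this range.
agree = all(leading_digit_via_log(n) == leading_digit_exact(n)
            for n in range(4, 200))
print("log method matches exact digits for 4 <= n < 200:", agree)
```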

This sequence of fractional parts is an orbit of a Kronecker system: rotation on $[0,1)$ by the angle $\log_{10}(2),$ which is irrational (if $\log_{10}(2) = p/q$ then $2^q = 10^p,$ which is impossible because 5 does not divide any power of 2). Weyl's Equidistribution Theorem says such orbits are uniformly distributed. Since the orbit visits intervals in proportion to their length, the limiting frequency is $P(\text{leading digit} = d) = \log_{10}(d+1) - \log_{10}(d) = \log_{10}\left(1 + \frac{1}{d}\right),$ which is exactly Benford's Law. The same argument applies to any geometric sequence $a^n$ for which $\log_{10}(a)$ is irrational.
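The equidistribution claim is a statement about limits, but it can be checked empirically. The sketch below counts leading digits of the first 1000 powers of 2 using exact integers and prints them beside Benford's prediction. The cutoff of 1000 terms is an arbitrary choice for illustration.

```python
import math
from collections import Counter

def leading_digit_counts(num_terms: int) -> Counter:
    """Count leading digits of 2**1, 2**2, ..., 2**num_terms exactly."""
    counts = Counter()
    power = 1
    for _ in range(num_terms):
        power *= 2
        counts[int(str(power)[0])] += 1
    return counts

num_terms = 1000  # arbitrary illustrative cutoff
counts = leading_digit_counts(num_terms)
for d in range(1, 10):
    observed = counts[d] / num_terms
    expected = math.log10(1 + 1 / d)
    print(f"digit {d}: observed {observed:.3f}, Benford {expected:.3f}")
```

A finite sample only approximates the limit, so small deviations from the predicted frequencies are expected.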

Intuitively, going from leading digit 1 to 2 requires doubling, while 8 to 9 is only a 12.5% increase. So, in real-world datasets, multiplicative processes spend more time with smaller leading digits.

Scale Invariance

Scale invariance gives a different perspective. Units are arbitrary - a natural leading-digit law shouldn't care whether we measure in miles or kilometers. Benford's Law is the unique distribution with this property. On a log scale, multiplying by a constant just adds a constant; invariance under these shifts corresponds to a uniform distribution of fractional parts of $\log_{10}(x),$ which produces Benford's formula.
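A small simulation can illustrate the invariance. The sketch assumes a synthetic sample whose logarithms are uniform over six decades (a convenient way to get Benford-distributed data, not data from the talk), then rescales it by the miles-to-kilometers factor and compares leading-digit frequencies.

```python
import math
import random

def leading_digit(x: float) -> int:
    """Leading decimal digit of a positive float, read off its mantissa."""
    frac = math.log10(x) % 1.0
    # min(...) guards the rare rounding case where frac rounds up to 1.0.
    return min(int(10 ** frac), 9)

def digit_frequencies(values):
    """Return a list whose entry d is the frequency of leading digit d."""
    counts = [0] * 10
    for x in values:
        counts[leading_digit(x)] += 1
    return [c / len(values) for c in counts]

rng = random.Random(0)  # fixed seed so the run is reproducible
# Assumption: log10 of the data is uniform on [0, 6), i.e. six full decades.
sample = [10 ** rng.uniform(0, 6) for _ in range(100_000)]
KM_PER_MILE = 1.609344
rescaled = [KM_PER_MILE * x for x in sample]

before = digit_frequencies(sample)
after = digit_frequencies(rescaled)
for d in range(1, 10):
    print(f"digit {d}: miles {before[d]:.3f}, km {after[d]:.3f}")
```

Individual numbers change their leading digit under rescaling, but the overall frequencies should stay close to each other and to Benford's values.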

Recurrence and Waiting Times

The Kronecker system above is one example of a dynamical system - a model of how states evolve over time. A natural question: how often do these systems revisit particular states? Poincaré's recurrence theorem gives a first answer: for a measure-preserving transformation on a finite-measure space (e.g., a length-preserving map on a bounded interval), almost every point in a measurable set $U$ eventually returns to $U$. The strengthened version says almost every point is recurrent, returning arbitrarily close to its starting position infinitely often.

Kac's theorem quantifies the average return time: in an ergodic system, if $U$ is a region of positive measure within a space $S,$ the expected number of steps to return to $U$ (averaged over starting points in $U$) is $\mu(S)/\mu(U),$ the ratio of total measure to the region's measure. If $U$ occupies 10% of the space, you return on average every 10 iterations.
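The 10% example can be tried on the rotation by $\log_{10}(2)$ from earlier. The sketch below follows one orbit, records the times it lands in $U = [0, 0.1),$ and averages the gaps between visits. Using a single long orbit as a stand-in for the average over starting points in $U$ is an assumption that leans on ergodicity, and the step count is arbitrary.

```python
import math

def mean_return_time(alpha: float, lo: float, hi: float, num_steps: int) -> float:
    """Average gap between successive visits of the orbit {n*alpha} to [lo, hi)."""
    visit_times = [n for n in range(num_steps)
                   if lo <= (n * alpha) % 1.0 < hi]
    gaps = [later - earlier
            for earlier, later in zip(visit_times, visit_times[1:])]
    return sum(gaps) / len(gaps)

# U = [0, 0.1) has measure 0.1 in a space of measure 1,
# so Kac's formula predicts an average return time of 1 / 0.1 = 10.
average = mean_return_time(math.log10(2), 0.0, 0.1, 100_000)
print(f"mean return time to [0, 0.1): {average:.3f}")
```

Individual return times vary from visit to visit; only their average is pinned down by the formula.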

Randomness and Normal Numbers

Benford's Law shows that first digits aren't uniformly distributed across datasets. But what about all digits within a single number? When do those look “random”? A number is normal (in a given base) if its digits look statistically uniform at every level: single digits, pairs, triples, and longer blocks all occur in the proportions you'd expect from random digits. Borel's Normal Numbers Theorem: almost all real numbers are normal. (This is a loose definition; the formal treatment involves measure theory and is worth exploring if you're curious.)

Important distinction: normality is about digit distribution within a single number's base-$b$ expansion, while Benford's Law is about leading digits across a dataset. For example, $2/3$ is not normal - its expansion is eventually periodic in any integer base. In binary, $2/3 = 0.\overline{10}_2$: 0s and 1s appear equally often, but among length-2 blocks only 10 and 01 appear; 00 and 11 never occur, where a normal number would have each pair at frequency $1/4$. So it looks “normal” for single digits but fails normality at length 2. Both Benford and normality ask “what does randomness look like?” in different contexts.
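The $2/3$ example can be verified directly. This sketch generates binary digits with exact rational arithmetic and tallies block frequencies; the helper names and the choice of 1000 digits are illustrative only.

```python
from collections import Counter
from fractions import Fraction

def binary_digits(x: Fraction, num_digits: int) -> str:
    """First num_digits binary digits after the point of x, for 0 <= x < 1."""
    digits = []
    for _ in range(num_digits):
        x *= 2
        bit = int(x)        # integer part is the next binary digit
        digits.append(str(bit))
        x -= bit            # keep only the fractional part
    return "".join(digits)

def block_frequencies(digits: str, block_length: int) -> dict:
    """Frequency of each overlapping block of the given length."""
    total = len(digits) - block_length + 1
    counts = Counter(digits[i:i + block_length] for i in range(total))
    return {block: count / total for block, count in counts.items()}

expansion = binary_digits(Fraction(2, 3), 1000)
singles = block_frequencies(expansion, 1)
pairs = block_frequencies(expansion, 2)
print("single digits:", singles)
print("length-2 blocks:", pairs)
```

The single-digit frequencies come out balanced, while the blocks 00 and 11 are absent from the length-2 tally, matching the failure of normality described above.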

The Bigger Picture

These seemingly distinct results share a common foundation in ergodic theory. Informally, a system is ergodic if a typical trajectory, followed long enough, visits every region in proportion to its measure. Birkhoff's Ergodic Theorem formalizes this: in ergodic systems, time averages equal space averages for almost every starting point. Track one orbit over time, and the fraction of time it spends in any region converges to that region's measure.
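In symbols, a standard way to state this special case is as follows (the notation is introduced here for illustration: $T$ is the transformation, $\mu$ an invariant probability measure for which $T$ is ergodic, and $A$ a measurable region). For almost every starting point $x,$

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \mathbf{1}_A(T^n x) = \mu(A).$$

Taking $T$ to be rotation by $\log_{10}(2)$ on $[0,1)$ and $A = [\log_{10}(d), \log_{10}(d+1))$ gives $\mu(A) = \log_{10}\left(1 + \frac{1}{d}\right),$ the Benford frequency of the digit $d$.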

Borel's theorem, Weyl's theorem, and Benford's Law are all instances of this principle. The digit 1 leads about 30% of the time because the corresponding interval on $[0,1)$ has measure $\log_{10}(2) - \log_{10}(1) \approx 0.301$.