Article 5CCTR Average sum of digits

Average sum of digits

by
John
from John D. Cook on (#5CCTR)

The smaller the base you write numbers in, the smaller their digits sums will be on average.

This need not be true of any given number, only for numbers on average. For example, let n = 2021. In base 10, the sum of the digits is 5, but in base 2 the sum of the digits is 8 because n is written as 11111100101 in binary. So for our particular example, a smaller base lead to a larger digit sum. In fact, all smaller bases than 10 lead to a larger digit sum for 2021. But on average, numbers have smaller digit sums when written in binary than when written in base 10.

Let's explore this with a little Python code, then look at a theoretical result.

First, we need some code to express numbers in different bases, and here's a simple function for that.

 # From https://stackoverflow.com/a/28666223 def numberToBase(n, b): if n == 0: return [0] digits = [] while n: digits.append(int(n % b)) n //= b return digits[::-1]

Using the code above, our digit sum function is trivial:

 def digitSum(n, b): return sum(numberToBase(n, b))

Now for our simulation.

 import numpy as np np.random.seed(20200103) upper_bound = 1000000000 sample_size = 1000 bases = np.arange(2, 17) sample = np.random.randint(0, upper_bound, size = sample_size) out = [sum(digitSum(n, b) for n in sample)/sample_size for b in bases]

A theorem in [1] says that

digit_sum_asymp.svg

where S(b, N) is the sum of the number of digits of all numbers from 1 to N when written in base b. This value is asymptotically equal to the expression on the right.

We can compute the average number of digits we'd expect in our simulation using the asymptotic result as an approximation.

 asymp = [0.5*(b-1)*np.log(upper_bound) / np.log(b) for b in bases]

The table below compares the simulation results to theory.

 |------+------------+------------| | base | simulation | asymptotic | |------+------------+------------| | 2 | 14.899 | 14.949 | | 3 | 18.660 | 18.863 | | 4 | 22.317 | 22.423 | | 5 | 25.398 | 25.752 | | 6 | 28.345 | 28.915 | | 7 | 30.996 | 31.949 | | 8 | 34.839 | 34.880 | | 9 | 36.422 | 37.726 | | 10 | 40.275 | 40.500 | | 11 | 41.480 | 43.211 | | 12 | 44.353 | 45.868 | | 13 | 47.454 | 48.476 | | 14 | 49.949 | 51.041 | | 15 | 51.772 | 53.567 | | 16 | 53.610 | 56.058 | |------+------------+------------|

[1] L. E. Bush. An Asymptotic Formula for the Average Sum of the Digits of Integers. The American Mathematical Monthly , Mar., 1940, Vol. 47, pp. 154-156.

The post Average sum of digits first appeared on John D. Cook.omu7VqKEKrU
External Content
Source RSS or Atom Feed
Feed Location http://feeds.feedburner.com/TheEndeavour?format=xml
Feed Title John D. Cook
Feed Link https://www.johndcook.com/blog
Reply 0 comments