Randomness is a slippery concept, defying precise definition. A simple example of a random series is provided by repeatedly tossing a coin. Assigning “1” for heads and “0” for tails, we generate a random sequence of binary digits or bits. Ten tosses might produce a sequence such as 1001110100. Continuing thus, we can generate a sequence of any length having no discernible pattern.
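A computer cannot truly toss a coin, but it can imitate one. The short sketch below uses Python's built-in pseudo-random generator (a notion returned to later in this article) to produce a ten-bit sequence like the one described:

```python
import random

# Simulate ten coin tosses: "1" for heads, "0" for tails.
# This uses a pseudo-random generator, so it only imitates
# true coin tossing.
tosses = "".join(random.choice("01") for _ in range(10))
print(tosses)  # a different ten-bit string on each run
```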
Random sequences are needed for computer simulations of complex phenomena like fluid flow, interactions of subatomic particles and the evolution of galaxies. These simulations may require billions of random numbers every minute and a reliable means of generating these is vital for the integrity of the simulations. Tossing coins is impractical if large samples of random numbers are required. We need more effective methods and also rigorous statistical tests to ensure that the sequences are really random.
What is randomness?
Several definitions of randomness have been proposed, none entirely satisfactory. In 1919, Austrian mathematician Richard von Mises proposed that a binary sequence is random if the proportion of zeros and ones approaches 50 per cent and if this property also holds for any sub-sequence generated by simple rules. Taking the alternating sequence 0101010101, it is clear that zeros and ones occur with equal frequency, but the sub-sequence formed by taking every second digit contains only ones, so the alternating sequence fails von Mises's test and is clearly not random.
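Von Mises's criterion can be illustrated in a few lines of Python. The sketch below checks the 50 per cent property for the alternating sequence and for the sub-sequence of every second digit (the tolerance of 5 per cent is an arbitrary choice for illustration):

```python
def balanced(seq):
    """True if zeros and ones each make up about half of seq."""
    return abs(seq.count("1") / len(seq) - 0.5) < 0.05

s = "01" * 5               # the alternating sequence 0101010101
print(balanced(s))         # True: equal numbers of 0s and 1s
print(balanced(s[1::2]))   # False: every second digit is a 1
```

The full sequence passes the simple frequency test, yet a trivially constructed sub-sequence fails it, which is exactly why von Mises required the property to hold for sub-sequences as well.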
The Russian mathematician Andrey Kolmogorov defined the complexity of a binary sequence in terms of the length of the shortest computer program or algorithm that can generate it. A non-random sequence has patterns that enable it to be expressed in a form more compact than its complete list of bits.
Obviously, the phrase “a sequence of one million 1s” completely defines this sequence, whereas the actual listing of the sequence requires a million bits of data. Thus, non-random sequences are compressible. Randomness and incompressibility are equivalent concepts.
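The link between pattern and compressibility is easy to demonstrate with an off-the-shelf compressor. The sketch below (using Python's standard zlib module as a stand-in for the theoretical notion of a shortest program) compresses a million identical bytes and a million pseudo-random bytes:

```python
import random
import zlib

patterned = b"1" * 1_000_000
unpatterned = bytes(random.getrandbits(8) for _ in range(1_000_000))

# The patterned data shrinks to a tiny fraction of its size;
# the random-looking data barely shrinks at all.
print(len(zlib.compress(patterned)))    # on the order of a thousand bytes
print(len(zlib.compress(unpatterned)))  # close to a million bytes
```

A general-purpose compressor is only a crude proxy for Kolmogorov complexity, but the contrast it reveals is the same: structure compresses, randomness does not.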
No computational algorithm can generate a truly random binary sequence. This is simple to explain: any sequence produced by a computer program is automatically compressible to the length of that program. If the sequence were truly random, the program would have to be of length comparable to the sequence itself, and we might just as well list the sequence directly. So, any number that is computable is non-random. The best we can do is to generate long sequences that have some of the key properties of random numbers. The programs that do this are called pseudo-random number generators.
Pseudo-random versus truly random numbers
Pseudo-random number generators are algorithms that use mathematical formulae to produce sequences of numbers. These sequences are deterministic: they are completely determined by the algorithm and its starting value, or seed. However, they appear completely random and satisfy various statistical conditions for randomness. The algorithms vary in quality: the best are very good, the worst are very bad.
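One of the oldest and simplest such algorithms is the linear congruential generator. The sketch below uses the well-known "minimal standard" constants of Park and Miller; it is an illustration of the idea, not a generator suitable for serious use:

```python
def lcg(seed, n, a=16807, m=2**31 - 1):
    """Linear congruential generator with Park-Miller constants.
    Returns n pseudo-random numbers in the interval [0, 1)."""
    x = seed
    out = []
    for _ in range(n):
        x = (a * x) % m   # the entire "formula" of the generator
        out.append(x / m)
    return out

print(lcg(seed=42, n=3))
```

Running it twice with the same seed reproduces exactly the same numbers, which shows the deterministic character of such sequences despite their random appearance.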
True random number generators extract randomness from physical phenomena that are completely unpredictable. One source of randomness is atmospheric noise, the “static” generated by lightning discharges: world-wide, there are about 40 lightning flashes every second, which can be detected by an ordinary radio. Atmospheric noise produces numbers that pass all statistical checks for randomness.
For 20 years, Dr Mads Haahr of the school of computer science and statistics at Trinity College, Dublin has been using atmospheric noise to produce random numbers for the website random.org. The data is used for lotteries and sweepstakes, to drive online games, for scientific applications and for art and music. The system now uses a distributed configuration with receivers in different geographic locations, and the results are subjected to a battery of statistical tests to give confidence in the randomness of the numbers.
Peter Lynch is emeritus professor at UCD School of Mathematics & Statistics. He blogs at thatsmaths.com