Random Number Generation in Python

Python offers a random number module named random that offers a broad collection of random number generation facilities. How is it possible to generate a random number using software? Shouldn’t a computer program execute consis­tently and always produce the same answer each time? Yes, it should. The resolu­tion of this apparent problem lies is the definition of random.

First, randomness is defined only for collections of events or numbers. One number, or even a small collection, cannot be said to be random. Randomness re­flects the lack of a pattern, and only one or two events don’t really display a pattern. Randomness is more of a statistical property of a sequence and is not necessarily related strictly to unpredictability. After all, if a computer program can generate random numbers, then it should be possible to predict the next one it will generate.

A random number generator (RNG) on a computer is referred to as pseudo­random; it is not truly random, but exhibits properties of randomness. These properties can be tested statistically. A typical RNG returns a floating point number between 0.0 and 1.0. This value can easily be transformed into a random number, either real or integer, in any desired range. A die roll is an integer between 1 and 6 inclusive. An RNG function named rand01() can be converted into a die roll as

int(rand01()*6 + 1)

If the numbers generated by rand01() are random, then it should produce die rolls that each have a probability of 1/6. If not, then there is a bias.

If a coin is flipped many times and the sequence HTHTHTHTHTHTHT re­sults, the probability of H or T (heads or tails) is 0.5, or 50%, which is what would be expected. If a sequence has the correct percentages for each outcome, then it passes the frequency test. Yet this sequence is probably not random because of the obvious pattern in the results. The frequency test is not enough.

A second test would consider pairs in the sequence and compare the prob­ability of occurrence of each pair against the theoretical. In the coin toss, there are four possible pairs: HH, HT, TH, and TT. Each pair should appear with equal probability, and yet the string above shows only HT instances. It is not random. A standard suite of randomness tests called Diehard includes a more complex version of this test, involving groups of five elements in the sequence, each one having a theoretical probability of 1 in 120. This kind of test can be called the serial test or overlapping permutations.

A third test involves using the RNG to generate poker hands. The probability of specific hands is well-known, and any consistent variation from these prob­abilities would imply a flaw in the RNG. This is the poker test. Any complex random game could be used, and the Diehard suite uses the game of craps.

There are many other tests that could be applied, and all are based on gener­ating complex situations and comparing the theoretical distribution of properties generated against what the RNG creates. So, now that there are ways of testing an RNG, can one be written in Python and tested?

1. Linear Congruential Method

Pseudo-random number generators basically shuffle the bits around in a number in complex and non-repeating ways; at least, they don’t repeat for a large number of trials. A common method for doing this is to calculate a value that is
bound to be larger than the place where it is to be stored and keep only the re­mainder each time. The value of this remainder is pseudo-random under certain conditions. A linear equation can be used and is fast to calculate:

X , = ( aX + b) mod m

where Xi is the previous random number in the sequence and Xi+1 is the next one. The value of m should be quite large and it should be a prime number. Many computers have used a 32-bit integer size, and as it happens 232 – 1 is a good value for m ( = 2147483647). Python integers can be as large as desired, so larger values could be used. Keeping then to 32 bits is accomplished using an and operation and masking the result with a 32-bit constant: OxFFFFFFFF.

Values for a and b are more flexible, but large values are a good idea, and too many factors can cause problems. One good set of values is a=69069 and b=362437. This method uses a previous value to calculate the next one, so an initial value is required. This is called the seed, and it must be possible for a user/ programmer to be able to set this seed value to whatever they choose. If not, then the RNG will generate the same set of values each time it is used. That’s good for debugging, because when tracking down a problem, it is important that the program behave consistently.

The basic RNG described above is as follows:

xseed = 76951

def irandOl ():

global xseed

xseed = (69069* xseed+362437)   & OxFFFFFFFF

return xseed

This function returns a number between 0 and 2147483647, and resets the seed (_xseed, a global) each time. However, we need a function that returns a number between 0 and 1; a second function does this simply by dividing the above result by 2147483647:

def rand01():

return irand01()/0xFFFFFFFF

A function that can set the seed is needed, too:

def setseed (x):

global_xseed

_xseed = x

A commonly used function in the Python random package is randrange(a, b), which returns a random integer between a and b. The code for a die has al­ready been written, and so the math is known. Using the tools just written, this is coded as

def randrange (n1, n2):

x = (int) (rand01()*(n2-n1+1)) + n1

return x

How can a random number generator be made to generate a different set of numbers every time a program starts using it? Simply by setting the seed to a number that is hard to predict. Such a number is found in the low bits (millisec­onds and microseconds) of the system clock. It is impossible to predict what these will be. Randomizing the RNG can be accomplished like this:

def randomize ():

global_xseed

_xseed = int(time.time ())  & 0xFF

The time.time() function returns the number of seconds since a fixed date in the past, called the epoch. This date is usually January 1st, 1970, midnight.

Other methods for generating random numbers exist and are commonly used. Python’s random class uses the Mersenne Twister algorithm, which is often seen as a default in programming languages, but is slow. The Blum-Blum-Shub algo­rithm resembles the linear congruential, but uses the relation xi+1 = xi2 mod m where m is the product of two prime numbers. Dozens more methods exist. There are also practical methods for generating true random numbers, and these are based on specific hardware that captures a truly random process such as radioac­tive decay, the photoelectric effect, or random electromagnetic noise.

Finally, there are websites that offer random numbers and sequences on re­quest. Random.org serves up true random numbers, for example, and there are dozens of other such sites. The time needed to connect to a server and upload a random number is considerable, so they should be used knowing the tradeoff of time for the random number quality.

Source: Parker James R. (2021), Python: An Introduction to Programming, Mercury Learning and Information; Second edition.

Leave a Reply

Your email address will not be published. Required fields are marked *