From: George Marsaglia Subject: Random numbers for C: The END? Date: Wed, 20 Jan 1999 10:55:14 -0500 Newsgroups: sci.stat.math,sci.math Keywords: Some good(?) random number generators, with C code, comparisons My offer of RNG's for C was an invitation to dance; I did not expect the Tarantella. I hope this post will stop the music, or at least slow it to a stately dance for language chauvinists and software police---under a different heading. In response to a number of requests for good RNG's in C, and mindful of the desirability of having a variety of methods readily available, I offered several. They were implemented as in-line functions using the #define feature of C. Numerous responses have led to improvements; the result is the listing below, with comments describing the generators. I thank all the experts who contributed suggestions, either directly to me or as part of the numerous threads. It seems necessary to use a (circular) table in order to get extremely long periods for some RNG's. Each new number is some combination of the previous r numbers, kept in the circular table. The circular table has to keep at least the last r, but possible more than r, numbers. For speed, an 8-bit index seems best for accessing members of the table---at least for Fortran, where an 8-bit integer is readily available via integer*1, and arithmetic on the index is automatically mod 256 (least-absolute-residue). Having little experience with C, I got out my little (but BIG) Kernighan and Ritchie book to see if there were an 8-bit integer type. I found none, but I did find char and unsigned char: one byte. Furthemore, K&R said arithmetic on characters was ok. That, and a study of the #define examples, led me to propose #define's for in-line generators LFIB4 and SWB, with monster periods. But it turned out that char arithmetic jumps "out of character", other than for simple cases such as c++ or c+=1. So, for safety, the index arithmetic below is kept in character by the UC definition. Another improvement on the original version takes advantage of the comma operator, which, to my chagrin, I had not seen in K&R. It is there, but only with an example of (expression,expression). From the advice of contributors, I found that the comma operator allows (expression,...,expression,expression) with the last expression determining the value. That makes it much easier to create in-line functions via #define (see SHR3, LFIB4, SWB and FIB below). The improved #define's are listed below, with a function to initialize the table and a main program that calls each of the in-line functions one million times and then compares the result to what I got with a DOS version of gcc. That main program can serve as a test to see if your system produces the same results as mine. _________________________________________ |If you run the program below, your output| | should be seven lines, each a 0 (zero).| ----------------------------------------- Some readers of the threads are not much interested in the philosophical aspects of computer languages, but want to know: what is the use of this stuff? Here are simple examples of the use of the in-line functions: Include the #define's in your program, with the accompanying static variable declarations, and a procedure, such as the example, for initializing the static variable (seeds) and the table. Then any one of those in-line functions, inserted in a C expression, will provide a random 32-bit integer, or a random float if UNI or VNI is used. For example, KISS&255; would provide a random byte, while 5.+2.*UNI; would provide a random real (float) from 5 to 7. Or 1+MWC%10; would provide the proverbial "take a number from 1 to 10", (but with not quite, but virtually, equal probabilities). More generally, something such as 1+KISS%n; would provide a practical uniform random choice from 1 to n, if n is not too big. A key point is: a wide variety of very fast, high- quality, easy-to-use RNG's are available by means of the nine in-line functions below, used individually or in combination. The comments after the main test program describe the generators. These descriptions are much as in the first post, for those who missed them. Some of the generators (KISS, MWC, LFIB4) seem to pass all tests of randomness, particularly the DIEHARD battery of tests, and combining virtually any two or more of them should provide fast, reliable, long period generators. (CONG or FIB alone and CONG+FIB are suspect, but quite useful in combinations.) Serious users of random numbers may want to run their simulations with several different generators, to see if they get consistent results. These #define's may make it easy to do. Bonne chance, George Marsaglia The C code follows---------------------------------: #include #define znew (z=36969*(z&65535)+(z>>16)) #define wnew (w=18000*(w&65535)+(w>>16)) #define MWC ((znew<<16)+wnew ) #define SHR3 (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5)) #define CONG (jcong=69069*jcong+1234567) #define FIB ((b=a+b),(a=b-a)) #define KISS ((MWC^CONG)+SHR3) #define LFIB4 (c++,t[c]=t[c]+t[UC(c+58)]+t[UC(c+119)]+t[UC(c+178)]) #define SWB (c++,bro=(x2^7700 and is highly recommended. Subtract-with-borrow has the same local behaviour as lagged Fibonacci using +,-,xor---the borrow merely provides a much longer period. SWB fails the birthday spacings test, as do all lagged Fibonacci and other generators that merely combine two previous values by means of =,- or xor. Those failures are for a particular case: m=512 birthdays in a year of n=2^24 days. There are choices of m and n for which lags >1000 will also fail the test. A reasonable precaution is to always combine a 2-lag Fibonacci or SWB generator with another kind of generator, unless the generator uses *, for which a very satisfactory sequence of odd 32-bit integers results. The classical Fibonacci sequence mod 2^32 from FIB fails several tests. It is not suitable for use by itself, but is quite suitable for combining with other generators. The last half of the bits of CONG are too regular, and it fails tests for which those bits play a significant role. CONG+FIB will also have too much regularity in trailing bits, as each does. But keep in mind that it is a rare application for which the trailing bits play a significant role. CONG is one of the most widely used generators of the last 30 years, as it was the system generator for VAX and was incorporated in several popular software packages, all seemingly without complaint. Finally, because many simulations call for uniform random variables in 0 Subject: Re: Random number generator question Date: Thu, 25 Feb 1999 10:51:08 +0100 Newsgroups: sci.math.research I wouldn't use a random number generator that wasn't (a) recommended by an expert; (b) recent. In particular I distrust random number generators provided as part of general software libraries (for example the Java random number generators, or those available on Unix boxes). Fortunately there are a number of generators around which satisfy both conditions; for example you might care to do a web search for the Mersenne Twister, or look at what claims to be its (English language) home page at http://www.math.keio.ac.jp/matumoto/emt.html Of course I have no problem with people trying to invent their own random number generators. Perhaps many of the experts started by trying to invent the perfect RNG and then discovering its flaws. But I wouldn't use a random number generator that I had invented.