Randomization

Randomization is the process of making something random; in various contexts this involves, for example:

generating a random permutation of a sequence (such as when shuffling cards);
selecting a random sample of a population (important in statistical sampling);
allocating experimental units via random assignment to a treatment or control condition;
generating random numbers (see Random number generation); or
transforming a data stream (such as when using a scrambler in telecommunications).

Randomization is not haphazard. Instead, a random process is a sequence of random variables whose outcomes do not follow a deterministic pattern but evolve according to probability distributions. For example, a random sample of individuals from a population is a sample in which every individual has a known probability of being selected. This contrasts with nonprobability sampling, in which individuals are selected arbitrarily.

Applications

Randomization is used in statistics and in gambling.

Statistics

Randomization is a core principle in statistical theory, whose importance was emphasized by Charles S. Peirce in "Illustrations of the Logic of Science" (1877-1878) and "A Theory of Probable Inference" (1883). Randomization-based inference is especially important in experimental design and in survey sampling. The first use of "randomization" listed in the Oxford English Dictionary is its use by Ronald Fisher in 1926.

Randomized experiments

In the statistical theory of design of experiments, randomization involves randomly allocating the experimental units across the treatment groups. For example, if an experiment compares a new drug against a standard drug, then the patients should be allocated to either the new drug or the standard drug (the control) using randomization. Randomization reduces confounding by equalising, on average, factors (independent variables) that have not been accounted for in the experimental design.
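
As an illustration of random allocation, the following Python sketch shuffles a list of patients and deals them alternately into two arms. It is one simple scheme (complete randomization with near-equal group sizes), not a description of any particular trial protocol; the function name randomize_assignment, the patient labels, and the seed are hypothetical and chosen only for the example.

```python
import random

def randomize_assignment(units, arms=("treatment", "control"), seed=None):
    """Randomly allocate units to arms with near-equal group sizes.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)      # a fixed seed makes the allocation reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)          # random permutation of the units
    allocation = {arm: [] for arm in arms}
    # Deal the shuffled units round-robin across the arms.
    for i, unit in enumerate(shuffled):
        allocation[arms[i % len(arms)]].append(unit)
    return allocation

# Made-up patient labels; the seed is arbitrary.
patients = [f"patient_{i:02d}" for i in range(1, 11)]
groups = randomize_assignment(patients, arms=("new drug", "standard drug"), seed=2024)
for arm, members in groups.items():
    print(arm, sorted(members))
```

Because the allocation depends only on the random shuffle, no characteristic of a patient, measured or unmeasured, can systematically influence which drug that patient receives.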

Survey sampling

Survey sampling uses randomization, following the criticisms of previous "representative methods" by Jerzy Neyman in his 1922 report to the International Statistical Institute.

Resampling

Some important methods of statistical inference use resampling from the observed data. Multiple alternative versions of the data-set that "might have been observed" are created by randomization of the original data-set, the only one observed. The variation of statistics calculated for these alternative data-sets is a guide to the uncertainty of statistics estimated from the original data.
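
One widely used resampling method is the bootstrap, which creates each alternative data set by drawing from the observed data with replacement. The Python sketch below uses it to estimate the standard error of a sample mean. The bootstrap is chosen here only as an example of resampling; the function name bootstrap_standard_error, the observations, and the number of resamples are assumptions made for the illustration.

```python
import random
import statistics

def bootstrap_standard_error(data, statistic=statistics.mean,
                             n_resamples=2000, seed=None):
    """Estimate the standard error of a statistic by bootstrap resampling.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_resamples):
        # One alternative data set that "might have been observed":
        # n draws from the original data, with replacement.
        resample = rng.choices(data, k=n)
        replicates.append(statistic(resample))
    # The spread of the statistic across resamples is the uncertainty estimate.
    return statistics.stdev(replicates)

# Made-up observations.
observed = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
se = bootstrap_standard_error(observed, seed=1)
print(f"mean = {statistics.mean(observed):.3f}, bootstrap SE = {se:.3f}")
```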

Gambling

Randomization is used extensively in the field of gambling. Because poor randomization may allow a skilled gambler to take advantage, much research has been devoted to effective randomization. A classic example of randomizing is shuffling playing cards.

Techniques

Although "manual" randomization techniques (such as shuffling cards, drawing pieces of paper from a bag, or spinning a roulette wheel) were historically common, automated techniques are now mostly used. Because both selecting a random sample and generating a random permutation can be reduced to selecting random numbers, random number generation methods are now most commonly used. These include both hardware random number generators and pseudo-random number generators.
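
The reduction to random numbers can be sketched in Python. The Fisher-Yates shuffle below builds a uniformly random permutation using only random integers, and a random sample is then taken as the first k items of such a permutation. The function names are hypothetical; random.Random is a pseudo-random generator from the Python standard library, used here with an arbitrary seed.

```python
import random

def fisher_yates_shuffle(items, rng):
    """Return a random permutation of items using only random integers."""
    result = list(items)
    # Walk from the last position down, swapping each position with a
    # uniformly chosen position at or before it.
    for i in range(len(result) - 1, 0, -1):
        j = rng.randrange(i + 1)   # uniform integer in 0..i inclusive
        result[i], result[j] = result[j], result[i]
    return result

def simple_random_sample(items, k, rng):
    """Return k distinct items: the first k of a random permutation."""
    if not 0 <= k <= len(items):
        raise ValueError("k must be between 0 and len(items)")
    return fisher_yates_shuffle(items, rng)[:k]

rng = random.Random(42)            # pseudo-random generator, arbitrary seed
deck = list(range(52))             # cards labelled 0..51
shuffled = fisher_yates_shuffle(deck, rng)
hand = simple_random_sample(deck, 5, rng)
print(shuffled[:5], hand)
```

Any object that provides a randrange method can be passed as rng. For instance, random.SystemRandom() draws from the operating system's entropy source instead of a seeded pseudo-random sequence.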

Non-algorithmic randomization methods include:

Casting yarrow stalks (for the I Ching)
Throwing dice
Flipping a coin
Drawing straws
Shuffling cards
Roulette wheels
Drawing pieces of paper or balls from a bag
Lottery machines
Observing atomic decay using a radiation counter
