Random selection is how you draw the sample of people for your study from a population.
Random assignment is how you assign the sample that you draw to different groups or treatments in your study.
It is possible to have both random selection and random assignment in a study. Let’s say you drew a random sample of 100 clients from a population list of 1000 current clients of your organization. That is random selection (random sampling). Now, let’s say you randomly assign 50 of these clients to get some new additional treatment and the other 50 to be controls. That’s random assignment.
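The two steps in this example can be sketched in a few lines of Python. This is only an illustration: the client IDs, the fixed seed, and the variable names are assumptions made for the sketch, not part of any real study.

```python
import random

# Fixed seed so the draw can be reproduced (an assumption for this sketch).
rng = random.Random(42)

# Hypothetical population list of 1000 current clients.
population = [f"client_{i:04d}" for i in range(1000)]

# Random selection: draw 100 clients without replacement.
sample = rng.sample(population, 100)

# Random assignment: shuffle a copy of the sample, then split it 50/50.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment = shuffled[:50]
control = shuffled[50:]

print(len(sample), len(treatment), len(control))
```

Replacing `rng.sample(population, 100)` with `population[:100]` would give the nonrandom selection case, while the assignment step would still be random.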
It is also possible to have only one of these (random selection or random assignment) but not the other in a study. For instance, if you do not randomly draw the 100 cases from your list of 1000 but instead just take the first 100 on the list, you do not have random selection. But you could still randomly assign this nonrandom sample to treatment versus control. Or, you could randomly select 100 from your list of 1000 and then nonrandomly (haphazardly) assign them to treatment or control.
And, it’s possible to have neither random selection nor random assignment. In a typical nonequivalent groups design in education you might nonrandomly choose two 5th grade classes to be in your study. This is nonrandom selection. Then, you could arbitrarily assign one to get the new educational program and the other to be the control. This is nonrandom (or nonequivalent) assignment.
Random selection is related to sampling, so it bears most directly on the external validity (or generalizability) of your results. After all, we randomly sample so that our research participants better represent the larger group from which they’re drawn. Random assignment is related to design. In fact, when we randomly assign participants to treatments we have, by definition, an experimental design. Random assignment therefore bears most directly on internal validity: we randomly assign in order to help ensure that our treatment groups are similar to each other (i.e., equivalent) prior to the treatment.
How to do randomization?
A few months ago I read this paper on how to do randomization; it has just come on-line, and I recommend it highly. Meanwhile, I summarized it; here are the greatest hits.
In Pursuit of Balance: Randomization in Practice in Development Field Experiments
By Miriam Bruhn (WB) and David McKenzie (WB, IZA, BREAD)
Randomized experiments are increasingly used in development economics. … This paper carries out an extensive review of the randomization methods used in existing randomized experiments, presents new evidence from a survey of leading development economists, and carries out simulation results in order to provide guidance for researchers considering which method to use for randomization.
The shortest summary of results:
in samples of 300 or greater, the different randomization methods perform similarly in terms of achieving balance in outcomes variables at follow-up. In smaller samples, however, the choice of randomization method is important, with matching and stratification performing best at achieving balance. Moreover, the ex-post analysis should explicitly account for how the randomization was conducted by including the appropriate controls. [Don’t worry: they tell us how!]
How are most researchers randomizing?
most researchers have at some point used simple randomization (probably with some stratification) – 80 percent of the full sample and 94 percent of researchers who have carried out five or more experiments have done this. However, we also see much more use of other methods than is apparent from the existing literature. 56 percent had used pairwise matching … 32 percent of all researchers…have subjectively decided whether to re-randomize based on an initial test of balance. The multiple draws process described above [do a bunch of randomizations, then pick best balance] has also been used by 24 percent of researchers, and is more common amongst the researchers with 38 percent of the 5 or more experiment group using this method.
“Which methods do better in terms of achieving balance and avoiding extremes?”
on average all methods of randomizing lead to balance. however… stratification, matching, and especially the minmax t-stat method have much less extreme differences in baseline outcomes, while the big stick method only results in narrow improvements in balance over a single random draw. [In other words, on average they’re all about the same, but you’re less likely to occasionally get a highly unbalanced draw with stratification, matching, and minimizing the t-stat.]
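To make two of these methods concrete, here is a rough sketch of stratified assignment and of a min-max t-stat style procedure (draw many simple randomizations, keep the one whose largest baseline t-statistic is smallest). It is one reading of those ideas and not the paper's own code. The simulated baseline data, the number of draws, the Welch form of the t-statistic, and all function names are assumptions.

```python
import random
import statistics
from collections import defaultdict


def stratified_assign(strata, rng):
    """Assign half of each stratum to treatment (1), the rest to control (0).

    strata[i] is the stratum label of unit i. In an odd-sized stratum the
    extra unit goes to control (one arbitrary convention among several).
    """
    members = defaultdict(list)
    for unit, label in enumerate(strata):
        members[label].append(unit)
    assignment = [0] * len(strata)
    for units in members.values():
        rng.shuffle(units)
        for unit in units[: len(units) // 2]:
            assignment[unit] = 1
    return assignment


def abs_t_stat(xs, ys):
    """Absolute two-sample (Welch) t-statistic for a difference in means."""
    se = (statistics.variance(xs) / len(xs) + statistics.variance(ys) / len(ys)) ** 0.5
    return abs(statistics.mean(xs) - statistics.mean(ys)) / se


def minmax_t_assign(covariates, n_draws, rng):
    """Draw n_draws simple randomizations and keep the one whose largest
    baseline t-statistic is smallest. covariates[i] lists unit i's values.
    """
    n, k = len(covariates), len(covariates[0])
    best, best_max_t = None, float("inf")
    for _ in range(n_draws):
        units = list(range(n))
        rng.shuffle(units)
        treated = set(units[: n // 2])
        max_t = max(
            abs_t_stat(
                [covariates[i][j] for i in range(n) if i in treated],
                [covariates[i][j] for i in range(n) if i not in treated],
            )
            for j in range(k)
        )
        if max_t < best_max_t:
            best = [1 if i in treated else 0 for i in range(n)]
            best_max_t = max_t
    return best, best_max_t


# Hypothetical baseline data: 40 units, 2 covariates, 4 strata of 10 units.
data_rng = random.Random(1)
baseline = [[data_rng.gauss(0, 1), data_rng.gauss(5, 2)] for _ in range(40)]
strata = [i % 4 for i in range(40)]

stratified = stratified_assign(strata, random.Random(2))
single, single_max_t = minmax_t_assign(baseline, 1, random.Random(3))
chosen, chosen_max_t = minmax_t_assign(baseline, 1000, random.Random(3))
print(single_max_t, chosen_max_t)
```

With the same seed, the first of the 1000 draws is the single draw, so the chosen randomization can never be worse balanced on these covariates than the single draw. A "big stick" variant would instead redraw only when some t-statistic crosses a preset threshold.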
“What does balance on observables imply about balance on unobservables?”
Aickin (2001) notes that methods which balance on observables can do no worse than pure randomization with regard to balancing unobserved variables.
Should you control for stratification or pair-wise matching in the analysis?
Thus, on average, it is overly conservative to not include the controls for stratum or pair in analysis. [i.e., your std errors are too big] … BUT in a non-trivial proportion of draws, it will be the case that not including stratum dummies will be anti-conservative, potentially leading the researcher to find a significant effect that is no longer significant when stratum dummies are controlled for. Hence researchers can not argue that if they ignore the randomization method, and find significant effects treating their study as if they purely randomized, that these same treatment effects will necessarily remain significant if one were to account for the method of randomization.
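As an illustration of what "including the controls for stratum" can mean in practice, the sketch below compares a plain difference in means with an OLS regression of the outcome on treatment plus stratum dummies. The regression is computed by demeaning within strata, which gives the same treatment coefficient as a full set of dummies (the Frisch-Waugh-Lovell result). The toy data, the function names, and the classical (homoskedastic) standard error formula are assumptions of this sketch, not the paper's procedure.

```python
import statistics
from collections import defaultdict


def diff_in_means(y, t):
    """Treatment effect and SE ignoring how the randomization was done."""
    y1 = [v for v, a in zip(y, t) if a == 1]
    y0 = [v for v, a in zip(y, t) if a == 0]
    se = (statistics.variance(y1) / len(y1) + statistics.variance(y0) / len(y0)) ** 0.5
    return statistics.mean(y1) - statistics.mean(y0), se


def ols_with_stratum_dummies(y, t, strata):
    """Coefficient and classical SE on t from y ~ t + stratum dummies."""
    groups = defaultdict(list)
    for i, s in enumerate(strata):
        groups[s].append(i)
    y_dm = [0.0] * len(y)
    t_dm = [0.0] * len(t)
    for idx in groups.values():
        y_bar = sum(y[i] for i in idx) / len(idx)
        t_bar = sum(t[i] for i in idx) / len(idx)
        for i in idx:
            y_dm[i] = y[i] - y_bar  # outcome net of the stratum mean
            t_dm[i] = t[i] - t_bar  # treatment net of the stratum mean
    stt = sum(v * v for v in t_dm)
    beta = sum(a * b for a, b in zip(t_dm, y_dm)) / stt
    rss = sum((b - beta * a) ** 2 for a, b in zip(t_dm, y_dm))
    # Degrees of freedom: n minus one dummy per stratum minus the treatment term.
    dof = len(y) - len(groups) - 1
    se = (rss / dof / stt) ** 0.5
    return beta, se


# Toy data (hypothetical): three strata with very different outcome levels
# and a true treatment effect of 2, with a little noise added by hand.
strata = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
t = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
y = [12.1, 10.0, 11.9, 9.9, 22.0, 20.2, 22.1, 19.9, 31.8, 30.1, 32.0, 30.0]

naive_effect, naive_se = diff_in_means(y, t)
strat_effect, strat_se = ols_with_stratum_dummies(y, t, strata)
print(naive_effect, naive_se)
print(strat_effect, strat_se)
```

In this toy case the strata explain most of the outcome variation, so the standard error that ignores them is far larger than the one that controls for them. That is the "overly conservative" case; as the quoted passage warns, particular draws can go the other way.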
In the analysis, how do you control for having randomized a bunch of times and then chosen the randomization with best balance?
the correct statistical methods for covariate-dependent randomization schemes such as minimization are still a conundrum in the statistics literature