Quasi-experimental designs are useful for real-world evaluations when a true experiment cannot be conducted: when we cannot randomly assign persons to treatment or control groups, when we cannot control the administration of the program or policy or restrict the policy to a treatment group, or when programs are not directed at individuals. The term quasi-experimental unfortunately implies to some persons that there is something wrong or second-rate about the design. Quite to the contrary, quasi-experimental approaches seek to maintain the logic of full experimentation but without the procedures, hardware, techniques, or control of the laboratory. Cook and Campbell’s Quasi-Experimentation provides extensive coverage of this approach and the design options available, including a description of appropriate statistical tests for the designs. There are two basic designs that planners and analysts should find very useful: the non-equivalent control-group design and the interrupted time-series design. The non-equivalent control-group design involves the comparison of a treatment group and a similar (but not randomly selected) group before and after the policy or program is implemented. The interrupted time-series design involves observing a single treatment group several times both before and after the policy or program is implemented.
Non-equivalent Control-Group Design
The non-equivalent control-group design (Table 1) is read with the Ts and Cs indicating observations or measurements for the treatment and control groups, respectively, and the subscripts 1 and 2 indicating, respectively, preprogram and postprogram measurements. The dashed line indicates that the control group is logically selected so as to be similar, but not necessarily equivalent, to the treatment group. In short, a group, locale, or other entity is given a program, and both before- and after-program observations are made of relevant variables. Before-and-after observations are also made on the same criteria for a similar group that does not receive the program.
The pretest (before) and posttest (after) observations are compared to judge whether there are pre- and postprogram differences and to what extent the change can be attributed to the policy or program. A variety of possible differences might occur, but if there were no external influence on the treatment and control groups, the groups were similar before the treatment, and the policy or program had an effect, the posttest score for the treatment group should show an increase or decrease compared with the control group. This design, therefore, allows us to narrow the range of explanations for any changes observed. The pretest and posttest allow us to measure change over time for both treatment and control groups, and the use of a control group helps us judge whether change in the treatment group resulted from the policy or program or whether it simply reflected a change taking place among similar groups, perhaps caused by external (nonpolicy) factors.
This design controls for many of the internal threats to validity, but it is still not perfect. Differences observed after program implementation may result because the two groups were really not similar, because the members of one group developed more quickly than members of the other group, because non-treatment events affected one group and not the other, or because a control group with extreme pretest scores was selected. Before using such a design, the analyst should establish a strong theory to guide the evaluation and develop an understanding of plausible results.
Table 1: Non-equivalent Control-Group Evaluation Design
| |Before-Program Status|After-Program Status|
|Treatment Group|T1|T2|
|- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -|
|Control Group|C1|C2|
Key:
T1 = value of indicator for treatment group before program is implemented
T2 = value of indicator for treatment group after program is implemented
C1 = value of indicator for control group before program is implemented
C2 = value of indicator for control group after program is implemented
Note: The dashed line indicates that the treatment and control groups are not equivalent.
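Using the notation of Table 1, the comparison described above can be sketched as a simple net-change calculation: the change in the treatment group (T2 - T1) minus the change in the control group (C2 - C1). This is only an illustration. The function name and the indicator values are hypothetical, and the calculation assumes the two groups would have changed in the same way had there been no program.

```python
def program_effect(t1, t2, c1, c2):
    """Net change: treatment-group change minus control-group change.

    t1, t2: indicator for the treatment group before and after the program.
    c1, c2: indicator for the control group before and after the program.
    """
    treatment_change = t2 - t1
    control_change = c2 - c1
    return treatment_change - control_change


# Hypothetical values, e.g. truants per hundred students.
effect = program_effect(t1=6.0, t2=3.0, c1=6.5, c2=5.5)
print(effect)  # -2.0
```

Here the treatment group fell by 3.0 while the control group fell by 1.0, so only 2.0 of the decline is tentatively attributed to the program. The threats to validity listed above still apply to this figure.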
The non-equivalent control-group design can be modified in a number of ways. For example, statistical adjustments can be made when pretest and posttest scores cannot be collected with the same instrument. Adding one or more pretests appears to be the best use of resources in improving the non-equivalent control-group design. This allows us to determine whether the two groups were changing in similar or different ways that might affect the postprogram scores, to develop better estimates of the preprogram scores, and to permit better statistical analysis of gain scores. We could also take separate samples for the preprogram and postprogram measures to eliminate the chance that the pretest affects either the posttest score or the test taker’s receptivity to the treatment. This improvement requires great caution in design, sampling, and interpretation.
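As a rough illustration of how additional pretests might be used, the sketch below compares the average per-period change in each group before the program begins. The function names and scores are hypothetical, and a real analysis would rely on a proper statistical test rather than a raw difference.

```python
def average_change(values):
    """Mean change per period across consecutive observations."""
    steps = [later - earlier for earlier, later in zip(values, values[1:])]
    return sum(steps) / len(steps)


def pretrend_gap(treatment_pretests, control_pretests):
    """Difference in preprogram trends (treatment minus control)."""
    return average_change(treatment_pretests) - average_change(control_pretests)


# Hypothetical scores from three pretests for each group.
treatment_pretests = [50.0, 52.0, 54.0]
control_pretests = [48.0, 49.0, 50.0]
gap = pretrend_gap(treatment_pretests, control_pretests)
print(gap)  # 1.0
```

A gap near zero suggests the groups were changing in similar ways before the program. A clearly nonzero gap, as in this example, warns that postprogram differences may partly reflect the groups' different preprogram trajectories.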
Interrupted Time-Series Design
The interrupted time-series design involves periodic tests, measurements, or observations of a relevant variable for our group or locale at equally spaced intervals, with the introduction of a policy, program, or treatment at a predetermined interval. The time-series data are examined to determine whether the introduction of the policy had an effect. This approach is depicted in Table 2. The effect of the treatment might be measured as a change in the level or direction of the observed variable. For example, before the treatment the data might have depicted a level trend, and following the treatment a similar, but higher- or lower-level, trend might be discerned, indicating that the treatment had an effect. Consider a truancy-reduction program: before the program the truancy rate might have been six students per hundred, but after the new truancy-prevention program is instituted the rate might fall to three students per hundred. Other results are possible, including an increase or stabilization in the rate.
Table 2: Interrupted Time-Series Evaluation Design
| |Before-Program Status|After-Program Status|
|One Group|B1 B2 B3 B4|A1 A2 A3 A4|
Key:
B1 through B4 = values of indicator for the group for observation periods before the program is implemented.
A1 through A4 = values of indicator for the group for observation periods after the program is implemented.
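One way to measure a change in level or direction, sketched here under simplifying assumptions, is to fit a straight line to the before-program observations and another to the after-program observations, then compare them at the first after-program period. The function names are hypothetical, the data reuse the truancy figures from the text, and the sketch ignores autocorrelation and seasonality, which a full time-series analysis would need to address.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys): returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interruption_effect(before, after):
    """Return (level_change, slope_change) at the first after-program period."""
    pre_x = list(range(1, len(before) + 1))
    post_x = list(range(len(before) + 1, len(before) + len(after) + 1))
    pre_slope, pre_intercept = fit_line(pre_x, before)
    post_slope, post_intercept = fit_line(post_x, after)
    start = post_x[0]
    projected = pre_intercept + pre_slope * start   # preprogram trend carried forward
    observed = post_intercept + post_slope * start  # postprogram trend at the same point
    return observed - projected, post_slope - pre_slope


before = [6.0, 6.0, 6.0, 6.0]  # B1 through B4: six truants per hundred
after = [3.0, 3.0, 3.0, 3.0]   # A1 through A4: three truants per hundred
level_change, slope_change = interruption_effect(before, after)
print(level_change, slope_change)  # -3.0 0.0
```

The level change of -3.0 with no change in slope corresponds to the case described in the text: a level trend before the program and a similar but lower-level trend after it.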
In practice, time-series analyses are complicated because the trend data are not always smooth. Impacts may be delayed rather than instantaneous, may vary by the season, or may decay over time as the treatment wears off. On rare occasions the impact may even increase over time. To complicate matters, the combination of effects must be interpreted. For example, a policy may induce a change in a rate that is delayed and decays over time.
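To make the idea of a delayed and decaying impact concrete, the sketch below generates one possible impact pattern over the after-program periods. The exponential form of the decay, the function name, and the parameter values are all assumptions chosen for illustration; the text does not specify any particular functional form.

```python
import math


def impact_profile(periods, size, delay=0, decay_rate=0.0):
    """Impact in each after-program period.

    The impact is zero until `delay` periods have passed, then starts at
    `size` and shrinks exponentially at `decay_rate` per period.
    """
    profile = []
    for t in range(periods):
        if t < delay:
            profile.append(0.0)
        else:
            profile.append(size * math.exp(-decay_rate * (t - delay)))
    return profile


# Hypothetical: a drop of 3 truants per hundred that appears two periods
# after the program starts and then gradually wears off.
for period, impact in enumerate(impact_profile(6, -3.0, delay=2, decay_rate=0.5)):
    print(period, round(impact, 2))
```

An analyst who compared only the first after-program observation with the last before-program observation would miss this impact entirely, and one who looked only at the final periods would understate it.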
The interrupted time-series design contains several threats to internal validity. The most obvious problem is that the design does not control for history. Since there is no equivalent control group, the possibility exists that changes observed were not induced by the policy or program but by an external event or nonprogram-related change. Because the time-series data are collected over a relatively long period of time, there is a chance that the way records are kept during the data-collection period may change. There is also the chance that the policy or program may cause participants to drop out, with the result that the remaining participants may constitute a group with different characteristics, and thus different posttest scores, from what the full group would have had. Time-series data may also be affected by seasonal or cyclical trends, which could lead to false interpretations.
Steps can be taken to reduce these threats to internal validity. Using a no-treatment control group helps to identify possible effects of history, and shortening the time intervals between observations enhances interpretations. Carefully monitoring record-keeping procedures throughout the experiment will reveal if differences occur simply because of bookkeeping changes. Including a supplemental study to determine the effect on groups or persons present during the full term of the experiment will avoid the threat of self-selection. Finally, collecting data for longer time series will help to identify cyclical variation.
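The value of a longer series for separating cyclical variation from program effects can be illustrated with a simple seasonal comparison, in which each observation is compared with the observation one full cycle earlier. This is a minimal sketch: the function name, the quarterly cycle, and the data are hypothetical, and it assumes the seasonal pattern is stable from cycle to cycle.

```python
def seasonal_differences(series, season_length):
    """Difference between each observation and the one a full cycle earlier."""
    return [
        series[i] - series[i - season_length]
        for i in range(season_length, len(series))
    ]


# Hypothetical quarterly truancy rates (per hundred) over three years.
# The program starts at the beginning of year 3.
series = [
    8, 6, 4, 6,  # year 1
    8, 6, 4, 6,  # year 2
    6, 4, 2, 4,  # year 3 (program in place)
]
print(seasonal_differences(series, season_length=4))
# [0, 0, 0, 0, -2, -2, -2, -2]
```

The swings within each year cancel out, so the year-2 differences are all zero, while the year-3 differences show a consistent drop of two per hundred. With only a few observations on either side of the program, the same seasonal swings could easily be mistaken for a program effect.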
The time-series design can be modified in a variety of ways to address the needs of different programs or to respond to data availability. In addition to adding a no-treatment control group, the approach could be modified to test the withdrawal of a treatment after a period of observation. Treatments could be introduced, removed, reintroduced, removed, and so on, or the treatment could be switched back and forth between two groups, each serving as the other’s control.
To produce an accurate evaluation, it is essential to select the most appropriate quasi-experimental design. Research has shown that, for the same situation, different quasi-experimental designs can generate results that vary greatly in the magnitude of estimated effects. In addition, Schwartz and Zorn argue that statistical controls should be added to quasi-experimental evaluation designs to permit the detection of smaller effects.
Literature and Internet Links
Seymour I. Schwartz and Peter M. Zorn, “A Critique of Statistical Controls for Measuring Program Effects: Applications to Urban Growth Control,” Journal of Policy Analysis and Management 7, no. 3 (Spring 1988), 491-505