While E-TRIALS provides many covariates and offers context that other educational research studies do not, it does not capture the full context of the study environment for each student. A lack of context may prevent you from knowing whether the teacher has done something that nullifies the benefits of your treatment: students in the control, for example, may have been terribly confused and may have kept asking the teacher for help. While a lack of context is a potential weakness, it will likely dilute the effects of a treatment rather than inflate them.
A second, more troublesome issue is contamination effects. In their review paper, McMillan et al. (2007) stated, “an important principle of good randomized studies is that the intervention and control groups are completely independent, without any effect on each other. This condition is often problematic in field research.” In real classrooms, a student in one condition may show their assignment to a student in another condition. This will more likely dilute effects than inflate them. One solution for this problem might be to add a self-report question, like “Collaboration is a good thing in learning. Did you collaborate with anyone else on this assignment?” In Kelly et al. (2013), we found that some students in the control condition (which represented a business-as-usual condition) self-reported that they texted their friends asking for help. We realized that this diluted the effect size we estimated; in that case, the effect size was large and we still found reliable differences. As McMillan and colleagues suggest, “when control subjects realize that intervention subjects have received something ‘special’ they may react by initiating behavior to obtain the same outcomes as the intervention group (compensatory rivalry) or may be resentful and be less motivated (resentful demoralization).” Compensatory rivalry will dilute effects, while resentful demoralization will inflate them. Debriefing could include surveys to assess whether students and teachers noticed that conditions were different.
Differential attrition is another threat. Since the posttest section is always at the end, you may find that differences in posttest results are due to students in the different conditions completing the posttest measures at different rates. Differential attrition, however, is a threat that you can not only anticipate but use in a positive way: if one condition causes students to not complete their work and fail to do the problems on the posttest section, that, in and of itself, is a useful dependent measure. You may find a condition that causes some students to not complete their homework, but for those who do finish, the effect is big enough to show that finishing students are better off in that condition. We call this a “tough-love” condition: it causes some students to quit, but those who do not quit do significantly better.
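Treating attrition itself as a dependent measure amounts to comparing completion rates across conditions rather than discarding non-completers. A minimal sketch of that comparison is below; the record format and the function name `completion_rates` are illustrative assumptions, not part of E-TRIALS:

```python
def completion_rates(records):
    """records: list of (condition, completed_posttest) pairs,
    one per student (completed_posttest is a bool).

    Returns the per-condition posttest completion rate, so dropout
    can be analyzed as an outcome instead of silently biasing the
    posttest comparison."""
    totals, completed = {}, {}
    for cond, done in records:
        totals[cond] = totals.get(cond, 0) + 1
        completed[cond] = completed.get(cond, 0) + int(done)
    return {cond: completed[cond] / totals[cond] for cond in totals}
```

A large gap between conditions in these rates signals that posttest means should be interpreted with attrition in mind, e.g. a “tough-love” condition would show a lower completion rate but higher scores among completers.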
Another potential threat to validity comes from sequencing effects as students are exposed to a series of experiments: carryover effects from one study may influence the results of a later study. Separate, independent randomization for each study helps prevent this, but we can also put automatic blocking into place: when randomization is done for study #2, we block to ensure that the students in each condition of study #1 are assigned in equal numbers to each condition of study #2. Any effects of study #1 will then simply increase variance, making differences harder to detect, but will not threaten the validity of a finding.
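The blocking step described above can be sketched as stratified assignment: group students by their study #1 condition, then deal each group out evenly across the study #2 conditions. This is a minimal illustration; `blocked_assignment` and its inputs are hypothetical names, not the E-TRIALS implementation:

```python
import random
from collections import defaultdict

def blocked_assignment(study1_assignments, study2_conditions, seed=0):
    """Assign each student to a study #2 condition, blocking on their
    study #1 condition so every study #1 group is split as evenly as
    possible across the study #2 conditions.

    study1_assignments: dict mapping student id -> study #1 condition.
    study2_conditions:  list of study #2 condition labels.
    """
    rng = random.Random(seed)
    # Group students into blocks by their study #1 condition.
    blocks = defaultdict(list)
    for student, cond in study1_assignments.items():
        blocks[cond].append(student)
    assignment = {}
    for students in blocks.values():
        rng.shuffle(students)
        # Deal each shuffled block out round-robin, so the block is
        # balanced across the study #2 conditions.
        for i, student in enumerate(students):
            assignment[student] = study2_conditions[i % len(study2_conditions)]
    return assignment
```

Because each study #1 condition contributes equally to each study #2 condition, any carryover from study #1 is balanced across the new arms and shows up only as added variance.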
A final threat to internal validity is novelty effects, or Hawthorne effects. A novelty effect occurs when a new or different condition improves learning simply because students pay attention to it due to its novelty. A condition that you submit may best our certified control condition, but the effects may not generalize to other problem sets. Novelty effects inflate an observed effect size. Ultimately, we are able to detect novelty effects through replication: applying an idea multiple times to see whether the effect fades as the novelty wears off.