Study suggests poor fidelity might mean effective education strategies never see light of day
Jan 21, 2022
Cambridge (England) [UK], January 21: A study has warned that promising new education interventions may have been 'unnecessarily scrapped' because the trials testing their effectiveness were insufficiently faithful to the original research.
The cautionary note was raised after researchers ran a large-scale computer simulation of more than 11,000 research trials to examine how much 'fidelity' influenced the results. The findings were published in 'Psychological Methods'.
In science and the social sciences, 'fidelity' means the extent to which tests evaluating an innovation adhere to the design of the original experiment on which that innovation was based. In much the same way that scientists test a new drug before it is approved, new strategies for improving learning are often evaluated thoroughly in schools or other settings before being rolled out.
Many innovations are rejected at this stage because the trials indicate that they result in little or no learning progress. Academics have, however, for some time voiced concerns that in some cases fidelity losses could be compromising the trial. In many cases, fidelity has not been consistently measured or reported.
The new study put this theory to the test. Researchers at the University of Cambridge and Carnegie Mellon University ran thousands of computer-modelled trials, featuring millions of simulated participants. They then examined how far changes in fidelity altered the 'effect size' of an intervention.
They found that even relatively subtle deviations in fidelity could have a significant impact. For every 5 per cent of fidelity lost in the simulated follow-up tests, the effect size fell by a corresponding 5 per cent.
In real-life contexts, this could mean that some high-potential innovations were deemed unfit for use because low fidelity was distorting the results. The study noted: "There is growing concern that a substantial number of null findings in educational interventions... could be due to a lack of fidelity, resulting in potentially sound programmes being unnecessarily scrapped."
The findings may be particularly useful to organisations such as the Education Endowment Foundation (EEF) in the United Kingdom, or the What Works Clearinghouse in the United States, both of which evaluate new education research. The EEF reports the results of project trials on its website. At present, more than three out of five reports indicate that the intervention being tested led to no progress, or negative progress, for pupils.
Michelle Ellefson, Professor of Cognitive Science at the Faculty of Education, University of Cambridge, said: "A lot of money is being invested in these trials, so we should look closely at how well they are controlling for fidelity. Replicability in research is hugely important, but the danger is that we could be throwing out promising interventions because of fidelity violations and creating an unnecessary trust gap between teachers and researchers."
Academics have frequently referred to a 'replication crisis' precisely because the results of so many studies are difficult to reproduce. In education, trials are often carried out by a mix of teachers and researchers. Larger studies, in particular, create ample opportunities for inadvertent fidelity losses, either through human factors (such as research instructions being misread), or changes in the research environment (for example to the timing or conditions of the test).
Ellefson and Professor Daniel Oppenheimer from Carnegie Mellon University developed a computer-based randomised controlled trial, which, in the first instance, simulated an imaginary intervention in 40 classrooms, each with 25 students. They ran this over and over again, each time adjusting a set of variables - including the potential effect size of the intervention, the ability levels of the students, and the fidelity of the trial itself.
In subsequent models, they added additional, confounding elements which might further affect the results - for example, the quality of resources in the school, or the fact that better teachers might have higher-performing students. The study combined representative permutations of the variables they introduced, modelling 11,055 trials altogether.
Strikingly, across the entire data set, the results indicated that for every 1 per cent of fidelity lost in a trial, the effect size of the intervention also dropped by 1 per cent. This 1:1 correspondence meant that even a trial with, for example, 80 per cent fidelity, would see a significant drop in effect size, which might cast doubt on the value of the intervention being tested.
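The simulated set-up and the 1:1 relationship described above can be illustrated with a minimal Python sketch. It is not the researchers' model: it assumes, purely for illustration, that 'fidelity' is the probability that a student in the treatment arm receives the intervention as designed, that scores are normally distributed, and that classrooms are allocated to arms alternately. The function names `simulate_trial` and `mean_effect` are hypothetical, and the confounding factors mentioned in the article are left out.

```python
import random
import statistics


def simulate_trial(true_effect, fidelity, n_classrooms=40, class_size=25, rng=None):
    """Simulate one trial and return the observed standardised effect size.

    Assumption (not from the study): fidelity is the probability that a
    treated student actually receives the intervention as designed.
    """
    rng = rng or random.Random()
    treated, control = [], []
    for classroom in range(n_classrooms):
        # Simplification: alternate classrooms between the two arms.
        in_treatment_arm = classroom % 2 == 0
        for _ in range(class_size):
            score = rng.gauss(0.0, 1.0)  # baseline ability, mean 0, SD 1
            if in_treatment_arm:
                if rng.random() < fidelity:
                    score += true_effect  # intervention delivered faithfully
                treated.append(score)
            else:
                control.append(score)
    # Standardise the mean difference by the control-group SD (Glass's delta).
    difference = statistics.mean(treated) - statistics.mean(control)
    return difference / statistics.stdev(control)


def mean_effect(true_effect, fidelity, n_trials=200, seed=0):
    """Average the observed effect size over many simulated trials."""
    rng = random.Random(seed)
    return statistics.mean(
        simulate_trial(true_effect, fidelity, rng=rng) for _ in range(n_trials)
    )


if __name__ == "__main__":
    for fidelity in (1.0, 0.8, 0.5):
        observed = mean_effect(true_effect=0.5, fidelity=fidelity)
        print(f"fidelity={fidelity:.0%}  observed effect size={observed:.3f}")
```

Under these assumptions the average observed effect size should come out close to the true effect multiplied by the fidelity, so a trial run at 80 per cent fidelity on an intervention with a true effect of 0.5 would be expected to report roughly 0.4.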
A more granular analysis then revealed that the effect of fidelity losses tended to be greater where a bigger effect size was anticipated. In other words, the most promising research innovations were also more sensitive to fidelity violations.
Although the confounding factors weakened this overall relationship, fidelity had by far the greatest impact on the effect sizes in all the tests the researchers ran.
Ellefson and Oppenheimer suggested that organisations conducting research trials might wish to establish firmer processes for ensuring, measuring and reporting fidelity so that their recommendations are as robust as possible. Their paper pointed to research from 2013 which found that only 29 per cent of after-school intervention studies measured fidelity, and another study, in 2010, which found that only 15 per cent of social work intervention studies collected fidelity data.
"When teachers are asked to try out new teaching methods, it is natural - perhaps even admirable - for them to want to adapt the method to the needs of their specific students," Oppenheimer said. "To have reliable scientific tests, however, it's essential to follow the instructions precisely; otherwise, researchers can't know whether the intervention will be broadly effective. It's really important for research teams to monitor and measure fidelity in studies, in order to be able to draw valid conclusions."
Ellefson concluded by saying: "Many organisations do a great job of independently evaluating research, but they need to make sure that fidelity is both measured and scrupulously checked. Sometimes the right response when findings cannot be replicated may not be to dismiss the research altogether, but to step back, and ask why it might have worked in one case, but not in another?"