Study casts doubt on evidence for 'gold standard' psychological treatments
LAWRENCE — A paper appearing today in a special edition of the Journal of Abnormal Psychology questions much of the statistical evidence underpinning therapies designated as “Empirically Supported Treatments,” or ESTs, by Division 12 of the American Psychological Association.
For years, ESTs have represented a “gold standard” in research-supported psychotherapies for conditions like depression, schizophrenia, eating disorders, substance abuse, generalized anxiety and post-traumatic stress disorder. But recent concerns about the replicability of research findings in clinical psychology prompted the re-examination of their evidence.
The new study, led by researchers at the University of Kansas and University of Victoria, concluded that while underlying evidence for a small number of empirically supported treatments is strong, “power and replicability estimates were concerningly low across almost all ESTs, and individually, some ESTs scored poorly across multiple metrics.”
“By some accounts, there are over 600 approaches to psychotherapy, and some are going to be more effective than others,” said co-lead author Alexander Williams, program director of psychology and director of the Psychological Clinic for KU’s Edwards Campus. “Since the 1970s, people have been trying to figure out which are most effective using clinical trials just like in medicine, where some subjects are assigned to a therapy and some to a control group. Division 12 of the APA has a list of therapies with strong scientific evidence from these trials, called ESTs. Ours is the first attempt anyone has made using this broad suite of statistical tools to evaluate the EST literature.”
The researchers analyzed 78 ESTs rated as having “strong” or “modest” research support by Division 12 of the APA, the Society of Clinical Psychology, drawing on more than 450 published articles. They assessed four types of evidential value: rates of misreported statistics, power, R-index and Bayes factors. Among the key conclusions:
- 56% (44 of 78) of ESTs fared poorly across most metric scores.
- 19% (15 of 78) of ESTs fared strongly across most metric scores.
- 52% (26 of 50) of ESTs deemed by Division 12 of the APA as having Strong Research Support fared poorly across most metric scores.
- 22% (11 of 50) of ESTs deemed by Division 12 of the APA as having Strong Research Support fared strongly across most metric scores.
- 64% (18 of 28) of ESTs deemed by Division 12 of the APA as having Modest Research Support fared poorly across most metric scores.
- 14% (4 of 28) of ESTs deemed by Division 12 of the APA as having Modest Research Support fared strongly across most metric scores.
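As a quick check on the arithmetic, the percentages above can be recomputed from the reported counts. The short sketch below is illustrative only: the group labels, the `reported` table and the `pct` and `summarize` helpers are assumptions made for this example and are not part of the study's own analysis.

```python
# Counts as reported in the list above: (fared poorly, fared strongly, total ESTs).
# The dictionary layout and labels are hypothetical, chosen for this illustration.
reported = {
    "All ESTs": (44, 15, 78),
    "Strong Research Support": (26, 11, 50),
    "Modest Research Support": (18, 4, 28),
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


def summarize(table):
    """Map each group label to its (percent poor, percent strong) pair."""
    return {
        label: (pct(poor, total), pct(strong, total))
        for label, (poor, strong, total) in table.items()
    }


if __name__ == "__main__":
    for label, (poor_pct, strong_pct) in summarize(reported).items():
        total = reported[label][2]
        print(f"{label}: {poor_pct}% poor, {strong_pct}% strong (n={total})")
```

The two subgroups also add up to the overall figures: 26 + 18 = 44 ESTs that fared poorly, 11 + 4 = 15 that fared strongly, and 50 + 28 = 78 ESTs in total.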
“Our findings don’t mean that therapy doesn’t work, they don’t mean that anything goes or everything is the same,” said co-lead author John Sakaluk, assistant professor in the University of Victoria’s Department of Psychology, who earned his doctorate at KU. “But based on this evidence, we don’t know if most therapies designated as ESTs do actually have better science on their side compared to alternative, research-supported forms of therapy.”
According to Williams, the field of clinical psychology may be ripe for a broad-scale reassessment of therapies that, until now, were thought to be supported by rigorous scientific evidence.
“Medical researchers coined a term called ‘medical reversal,’” the KU researcher said. “Sometimes these are medical practices that doctors use across the country, but they are discontinued after it’s found they don’t work or aren’t more effective than less-costly alternatives — or they’re actually harmful. Pending replications of our results, we may need broad systems-level psychotherapy reversals. Some of these ESTs are widely implemented in big systems like the Veterans Health Administration. If we find evidence for them isn’t as strong as believed, it may be worth looking at. Let’s say, hypothetically, there are two therapies for depression, and people have said, ‘Well, Therapy A has stronger evidence for it than Therapy B.’ But we know Therapy B works, too, and it’s less costly. Today, if we find the evidence for Therapy A isn’t actually stronger, it may be time to promote Therapy B.”
Further, Williams advised clinicians and patients to continually evaluate progress in therapy and to adjust therapeutic approaches based more on patient progress than on research evidence of a given therapy’s effectiveness.
“For clinicians and clients, this speaks to the importance of frequently assessing how well a client is doing in therapy,” he said. “Routine outcome monitoring is always a good thing to be doing, but it may be a particularly good idea based on new evidence that we don’t know if some therapies are effective. So, if I’m a patient, I want to assess how I’m doing — and there are different measures for doing that. This study suggests it’s even more important than previously believed.”
For the research community, the authors recommended a reassessment of the size and power of clinical trials and more collaborations between labs to increase the precision of analyses, along with fresh approaches to how research is appraised, published and evaluated.
“One of the things that becomes really obvious when you look at the literature is researchers are collecting and analyzing their data in ways that are extremely flexible,” Sakaluk said. “If you don’t follow certain rules of statistical inference, you can inadvertently trick yourself into claiming effects that aren’t really there. For EST research, it may become important to define in advance what researchers are going to do — like how they’ll analyze data — and go on record in a way that restricts what they’re going to do. This would coincide with a movement to encourage researchers to propose what they’d like to do and get reviewers and journal editors to weigh in before — not after — scientists do research, and to publish it irrespective of what they find.”
Williams said studies supporting the power of clinical treatments should improve over time with more exacting approaches to statistical data.
“This is a system-level issue that will get better as our field begins to grapple with replication,” he said. “We think you’ll see improvement in study design going forward. There wasn’t a fieldwide appreciation for these problems until a decade ago. It takes time for the field to improve. We think our results will complement ongoing efforts by Division 12 to increase the quality of EST research and evaluation.”
Williams and Sakaluk’s co-authors were Robyn Kilshaw of the University of Utah and Kathleen Teresa Rhyner of the Canandaigua VA Medical Center, the latter of whom also earned her doctorate at KU.