Simine Vazire crafted a thought-provoking blog post about how some in the field respond to counter-intuitive findings. One common reaction among critics of this kind of research is to claim that the results are unbelievable. This reaction seems to fit with the maxim that extraordinary claims should require extraordinary evidence (AKA the Sagan doctrine). For example, the standard of evidence needed to support the claim that a high-calorie/low-nutrient diet coupled with a sedentary lifestyle is negatively associated with morbidity might be different than the standard of proof needed to support the claim that attending class is positively associated with exam performance. One claim seems far more extraordinary than the other. Put another way: Prior subjective beliefs about the truthiness of these claims might differ, and thus the research evidence needed to modify these pre-existing beliefs should be different.
I like the Sagan doctrine, but I think we can all appreciate the difficulties that arise when trying to determine the standards of evidence needed to justify a particular research claim. There are no easy answers except for the tried-and-true response that all scientific claims should be thoroughly evaluated by multiple teams using strong methods and multiple operational definitions of the underlying constructs. But this is a “long-term” perspective and provides little guidance when trying to interpret any single study or package of studies. Except that it does, sort of. A long-term perspective means that most findings should be viewed with a big grain of salt, at least initially. Skepticism is a virtue (and I think this is one of the overarching themes of Simine’s blog posts thus far). However, skepticism does not preclude publication or even some initial excitement about an idea. It simply precludes making bold and definitive statements based on initial results with unknown generality. More research is needed because of the inherent uncertainty of scientific claims. To quote a lesser-known U2 lyric: “Uncertainty can be a guiding light.”
Anyways, I will admit to having the “unbelievable” reaction to a number of research studies. However, my reaction usually springs from a different set of concerns rather than just a suspicion that a particular claim is counter to my own intuitions. I am fairly skeptical of my own intuitions. I am also fairly skeptical of the intuitions of others. And I still find lots of studies literally unbelievable.
Here is a partial list of the reasons for my skepticism. (Note: These points cover well-worn ground, so feel free to ignore this if it sounds like I am beating a dead horse!)
1. Large effect sizes coupled with small sample sizes. Believe it or not, there is guidance in the literature to help generate an expected value for research findings in “soft” psychology. A reasonable number of effects are between .20 and .30 in the r metric and relatively few are above .50 (see Hemphill, 2003; Richard et al., 2003). Accordingly, when I read studies that generate “largish” effect size estimates (i.e., |r| ≥ .40), I tend to be skeptical. I think an effect size estimate of .50 is in fact an extraordinary claim.
My skepticism gets compounded when the sample sizes are small and thus the confidence intervals are wide. This means that the published findings are consistent with a wide range of plausible effect sizes so that any inference about the underlying effect size is not terribly constrained. The point estimates are not precise. Authors might be excited about the .50 correlation but the 95% CI suggests that the data are actually consistent with anything from a tiny effect to a massive effect. Frankly, I also hate it when the lower bound of the CI falls just slightly above 0 and thus the p value is just slightly below .05. It makes me suspect p-hacking was involved. (Sorry, I said it!)
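The width problem is easy to see with a quick Fisher z calculation. Here is a minimal sketch (the r = .50 and the sample sizes are illustrative, not taken from any particular study):

```python
import math

def r_confidence_interval(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z transform."""
    z = math.atanh(r)                # Fisher z of the observed r
    se = 1 / math.sqrt(n - 3)        # standard error of z
    lo = z - z_crit * se
    hi = z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # back-transform to the r metric

# An exciting r = .50 from a small study...
print(r_confidence_interval(0.50, 30))   # roughly (.17, .73)
# ...versus the same r = .50 from a larger study
print(r_confidence_interval(0.50, 200))  # roughly (.39, .60)
```

With n = 30, the interval runs from a small effect to a massive one, so the point estimate tells us much less than the headline number suggests; at n = 200, the same r = .50 is actually informative.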
2. Conceptual replications but no direct replications. The multi-study package common to prestigious outlets such as PS or JPSP has drawn critical attention in the last three or so years. Although these packages seem persuasive on the surface, they often show hints of publication bias on closer inspection. The worry is that the original researchers actually conducted a number of related studies and only those that worked were published. Thus, the published package reflects a biased sampling of the entire body of studies. The ones that failed to support the general idea were left to languish in the proverbial file drawer. This generates inflated effect size estimates and makes the case for an effect seem far more compelling than it should be in light of all of the evidence. Given these issues, I tend to want to see a package of studies that reports both direct and conceptual replications. If I see only conceptual replications, I get skeptical. This is compounded when each study itself has a modest sample size with a relatively large effect size estimate that produces a 95% CI that gets quite close to 0 (see Point #1).
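The file-drawer inflation is easy to demonstrate with a toy simulation. This is a sketch under made-up assumptions (a true effect of r = .20, n = 30 per study, and only studies reaching p &lt; .05 getting “published”):

```python
import math
import random

random.seed(1)

def simulate_r(rho, n):
    """Sample a correlation from n bivariate-normal pairs with true correlation rho."""
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [rho * x + math.sqrt(1 - rho**2) * random.gauss(0, 1) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

TRUE_R, N, STUDIES = 0.20, 30, 2000
R_CRIT = 0.361  # |r| needed for p < .05 (two-tailed) at n = 30

rs = [simulate_r(TRUE_R, N) for _ in range(STUDIES)]
published = [r for r in rs if abs(r) > R_CRIT]  # the file drawer eats the rest

print(f"mean r across all studies:       {sum(rs) / len(rs):.2f}")
print(f"mean r across 'published' ones:  {sum(published) / len(published):.2f}")
```

The full set of studies averages out near the true .20, but the “published” subset averages roughly double that — which is exactly why an all-significant multi-study package with modest samples should raise an eyebrow.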
3. Breathless press releases. Members of some of my least favorite crews in psychology seem to create press releases for every paper they publish. (Of course, my perceptions could be biased!). At any rate, press releases are designed by the university PR office to get media attention. The PR office is filled with smart people trained to draw positive attention to the university using the popular media. I do not have a problem with this objective per se. However, I do not think this should be the primary mission of the social scientist. Sometimes good science is only interesting to the scientific community. I get skeptical when the press release makes the paper seem like it was the most groundbreaking research in all of psychology. I also get skeptical when the press release draws strong real world implications from fairly constrained lab studies. It makes me think the researchers overlooked the thorny issues with generalized causal inference.
I worry about saying this, but I will put it out there: I suspect that some press releases were envisioned before the research was even conducted. This is probably an unfair reaction to many press releases, but at least I am being honest. So I get skeptical when there is a big disconnect between the press release and the underlying research, such as when sweeping claims are made based on a study of, say, 37 kids. Or when big claims about money and happiness are drawn from priming studies involving pictures of money.
I would be interested to hear what makes others skeptical of published claims.
A little background tangential to the main points of this post:
One way to generate press excitement is to quote the researcher(s) as being shocked by the results. Unfortunately, I often think some of the shock and awe expressed in these press releases is disingenuous. Why? Researchers designed the studies to test specific predictions in the first place, so they had some expectations as to what they would find. Alternatively, if someone did obtain a shocking initial result, they should conduct multiple direct replications to make sure the original result was not simply a false positive. This kind of narrative is not usually part of the press release.
I also hate to read press releases that generalize the underlying results well beyond the initial design and purpose of the research. Sometimes the real world implications of experiments are just not clear. In fact, not all research is designed to have real world implications. If we take the classic Mook reading at face value, lots of experimental research in psychology has no clear real world implications. This is perfectly OK but it might make the findings less interesting to the general public. Or at least it probably requires more background knowledge to make the implications interesting. Such background is beyond the scope of the press release.