I had no intention of revisiting the debate over the use of brain imaging in social neuroscience, which I blogged about last month. But that post brought such a tsunami of anger, dismay, invective and outrage that I felt an obligation to go back and dig more deeply into whether the charges in a paper by Ed Vul of MIT, Hal Pashler of UC San Diego and colleagues, in press at Perspectives on Psychological Science, were as meritless as many of the scientists I heard from claimed.
The basic criticism of Vul et al rests on statistics, so I sought out eminent statisticians who have no horse in this race. I had room to cite only two of them in my magazine column this week, but there were no dissenting voices: some studies that use fMRI to correlate patterns of brain activity with some measure of emotion, thought or other psychological trait/behavior of interest to social neuroscience are indeed problematic. For a good discussion, I recommend a post by Andrew Gelman, professor of statistics and of political science and director of the Applied Statistics Center at Columbia University.
But the targets of the criticism also make legitimate points. They're right that calling anyone’s science “voodoo” (in the title of the Vul et al paper) is not very nice or conducive to constructive dialogue. And as they say, social neuroscience is not the only field that uses functional neuroimaging in a way that has problematic stats.
Neither of these points gets to the heart of the criticism, however. Even a response prepared by Matthew Lieberman of UCLA and colleagues, while viewed as the best of the bunch by the statisticians and neuroscientists who were kind enough to read it for me, doesn’t answer all the concerns Vul et al. raised. As Lieberman told me, “What we’re fundamentally interested in is whether there are these relationships [between a pattern of brain activity and a psychological measure] at all. The initial test [looking for patterns that correlate with these measures] tells you there are regions of the brain worth interrogating.” Once scientists do that initial pass, he argues, those who know what they’re doing apply the proper controls and methods of statistical analysis to make sure their subsequent analyses are independent of the first. The problem, say other scientists with extensive experience in neuroimaging (and in reading neuroimaging papers), is that “what he describes as good statistical practice doesn’t occur in a lot of these papers,” as one researcher (who doesn’t want to antagonize colleagues more than he already has) told me.
Failure to do the stats properly is the main problem identified by Vul et al. Alas, some experienced practitioners of neuroimaging concede that their field is indeed beset by the “circularity” the critics identified. As Nikolaus Kriegeskorte of NIMH told me, “In extreme cases, the effect [in which a pattern of brain activity is correlated with a behavior, feeling, etc.] doesn’t exist at all, and what you are reporting is just noise. Because we have so much data and selection is inevitable, neuroscience is faced with the challenge of avoiding the bias that can come with data selection.” That problem is not unique to imaging, I hasten to add: EEGs and invasive recording have it, too. “It is not a new problem, and there are techniques to avoid it,” Kriegeskorte said.
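For readers who want to see the selection bias Kriegeskorte describes in action, here is a toy simulation of my own devising (it is not from the Vul et al. paper, and the numbers of "subjects" and "voxels" are arbitrary). The "brain data" is pure noise, with no real relationship to the behavioral score. A circular analysis, which picks the voxel most correlated with behavior and then reports that same correlation, finds a strikingly large effect anyway, simply because searching thousands of noise voxels guarantees some will correlate by chance. An independent analysis, which selects the voxel on one half of the subjects and measures the correlation on the held-out half, reveals there is nothing there:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_voxels = 20, 10_000

# Pure noise: voxel activity and the behavioral score are independent.
activity = rng.standard_normal((n_subjects, n_voxels))
behavior = rng.standard_normal(n_subjects)

# Circular ("non-independent") analysis: choose the voxel most
# correlated with behavior, then report that same correlation.
corrs = np.array([np.corrcoef(activity[:, v], behavior)[0, 1]
                  for v in range(n_voxels)])
best = np.argmax(np.abs(corrs))
circular_r = corrs[best]

# Independent analysis: select the voxel using half the subjects,
# then measure its correlation in the held-out half.
half = n_subjects // 2
sel = np.array([np.corrcoef(activity[:half, v], behavior[:half])[0, 1]
                for v in range(n_voxels)])
best_held_out = np.argmax(np.abs(sel))
independent_r = np.corrcoef(activity[half:, best_held_out],
                            behavior[half:])[0, 1]

print(f"circular r    = {circular_r:.2f}")   # large despite pure noise
print(f"independent r = {independent_r:.2f}")  # typically much smaller
```

The circular estimate is inflated even when a true effect exists; with no effect at all, as here, it manufactures one out of noise. This is the sense in which, as Kriegeskorte says, an extreme case can be "just noise."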
“Things can go wrong, but how wrong?” he continued. “Our sense is, a whole range of things can happen, from a slight distortion [in the strength of correlations] to entirely spurious results. Some papers do not deal with it well, and are based on incorrect statistics. Whether the central conclusions are wrong cannot be determined without redoing or at least reanalyzing the experiment. Vul et al. have the central point right, but they were unnecessarily inflammatory and their estimate of how much [reported correlations have been inflated] might be too high. But reported correlations are almost certainly higher than they should be.”
The reason that matters is that brain imaging is increasingly being used not just for pure discovery and hypothesis testing, as UCLA’s Lieberman rightly explains, but for real-world applications with potentially worrisome implications, as I explain in my column this week.
So how can laymen, not to mention science journalists, separate good studies from questionable ones? Not easily. Even when we play by the rules and report only studies that have been peer-reviewed and published, it turns out, we can't be assured that the study found what it claims to: some of the most problematic studies ID'd by Vul et al. are in eminent journals. But speaking for myself, when I write about neuroimaging studies in the future I will ask a lot more, and harder, questions about the method of analysis than I have in the past.