Tag Archives: psychology

Almost Two-thirds of Psychological Studies Are Wrong

Einstein, as everyone knows, famously defined insanity as doing the same thing repeatedly and expecting different results. Science is the mirror image of insanity (which is not to say there are no mad scientists). It expects — indeed, requires — the same results when scientists do the same experiments or calculations over and over. Thus, according to an important and widely noticed study just published in Science, “Estimating The Reproducibility Of Psychological Science,” there is a real question whether much of the allegedly scientific research published in learned journals of psychology actually qualifies as science.

The Reproducibility Project, coordinated by University of Virginia psychology professor Brian Nosek, executive director of the Center for Open Science, involved a team of 270 psychologists from around the world who attempted to replicate the findings of 100 articles published in 2008 selected from three leading psychology journals: Psychological Science, Journal of Personality and Social Psychology, and Journal of Experimental Psychology: Learning, Memory, and Cognition.

A substantial majority of the studies examined, it turned out, were not reproducible, leading to “a clear conclusion” (as stated in the Science report): “A large portion of replications produced weaker evidence for the original findings despite using materials provided by the original authors, review in advance for methodological fidelity, and high statistical power to detect the original effect sizes.”

Humorously Understated

“Weaker evidence for the original findings” is a polite, statistically precise but obfuscatory way of saying that the conclusions of those studies could not be confirmed. Reviewing these results, the New York Times declared in an article with an almost humorously understated title that “Many Psychology Findings Not As Strong As Claimed, Study Says.” The actual Times article was considerably more dramatic than its title suggests, noting for example that “Strictly on the basis of significance — a statistical measure of how likely it is that a result did not occur by chance — 35 of the studies held up, and 62 did not. (Three were excluded because their significance was not clear.) The overall ‘effect size,’ a measure of the strength of a finding, dropped by about half across all of the studies.”
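The arithmetic behind the “almost two-thirds” headline figure can be checked directly from the numbers the Times reports. A minimal sketch (the variable names are mine, not from either article):

```python
# Figures from the Times summary of the Reproducibility Project:
# 100 studies attempted, 3 excluded for unclear significance,
# 35 replicated on the significance criterion, 62 did not.
total_attempted = 100
excluded = 3
replicated = 35
failed = 62

# Sanity check: the reported counts account for all 100 studies.
assert replicated + failed + excluded == total_attempted

evaluated = total_attempted - excluded
failure_rate = failed / evaluated
print(f"{failed} of {evaluated} evaluated studies failed to replicate "
      f"({failure_rate:.0%})")
```

On these numbers the failure rate is 62/97, about 64 percent, which is where the “almost two-thirds” characterization comes from.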

The fact that the Reproducibility Project found that the findings of nearly two-thirds of the studies its researchers examined could not be reproduced is proving to be a substantial embarrassment in the field of psychology, and those associated with the review project are making a great effort to soften the impact of their striking results. “The eye-opening results don’t necessarily mean that those original findings were incorrect or that the scientific process is flawed,” the Smithsonian Magazine insisted.

When one study finds an effect that a second study can’t replicate, there are several possible reasons, says co-author Cody Christopherson of Southern Oregon University. Study A’s result may be false, or Study B’s results may be false—or there may be some subtle differences in the way the two studies were conducted that impacted the results.

“This project is not evidence that anything is broken. Rather, it’s an example of science doing what science does,” says Christopherson. “It’s impossible to be wrong in a final sense in science. You have to be temporarily wrong, perhaps many times, before you are ever right.”

Well, sure, but how reassuring is it to be told that about two-thirds of the presumably peer-reviewed psychological research published in leading journals is wrong … but only “temporarily”?

The original studies were virtually (probably literally) all based on experiments that the reproducers tried to reproduce, and thus a substantial amount of both the original studies and the replication attempts was devoted to statistical analysis of significance, reliability, etc. That is no doubt as it should be, and to its credit the Reproducibility Project and the Center for Open Science have made all of their own research available online. Perhaps in the future another reproducibility project with even more resources and researchers will check the work of this one.

Odd Research Design

If there is such an effort in the future, I think it would be in order to consider a dimension that so far as I can tell was not attempted here — moving beyond an analysis of the statistical fit between research methodology and conclusions to a more qualitative consideration of the research design, significance, and even good sense. Several of the studies I looked at would have fallen short on those grounds even if their conclusions had been found to be statistically valid.

Consider, for example, K.R. Morrison and D.T. Miller, “Distinguishing between silent and vocal minorities,” Journal of Personality and Social Psychology 94 (2008): 871-882, whose results were confirmed for the Reproducibility Project by Prof. Matt Motyl of the University of Illinois at Chicago. Morrison and Miller set out to test the entirely reasonable hypothesis that people will be more willing to express their opinions to an audience they think supportive than to one they think would be critical. To test this hypothesis they

compared the proportions of bumper stickers [counted in the parking lots of 3 Target department stores] expressing liberal or conservative opinions in a county that voted for a more liberal candidate or a more conservative candidate in the 2004 US Presidential Election. Specifically, they hypothesized that liberals in the liberal county would be more likely to express their opinions than conservatives in the liberal county, and conservatives in the conservative county would be more likely to express their opinions than liberals in the conservative county.

Surprise! There were more Democratic bumper stickers in the Democratic county and more Republican bumper stickers in the Republican county. But do these findings really confirm the hypothesis? Can’t they be as readily explained by the fact that Democratic counties have more Democrats and Republican counties more Republicans? Or perhaps the political parties were more organized and had more to spend on bumper stickers in counties where they were strong. And do we know the demographic/political breakdown of Target shoppers? Thus the fact that these findings were replicated by this method hardly makes them more significant.

I also looked at two studies by Stanford’s Claude Steele and co-authors purporting to test his ubiquitous “stereotype threat” theory. In “The Space Between Us: Stereotype Threat and Distance in Interracial Contexts,” Journal of Personality and Social Psychology 94 (2008): 91-107, the authors “use stereotype threat theory as a model” to test a prediction that whites would physically distance themselves from blacks in a conversation where the whites feared being stereotyped as racists. In a sense the theory, assumed to have been established by Steele’s earlier work, was used to test itself. Elaborate scenarios were established, and the authors found to their relief and satisfaction that the target white males sat closer to the black confederates when the conversation was about “love and relationships” than when the subject was “racial profiling,” unless the latter were described as a “learning experience.”

The attempt to replicate this study “was unable to attain statistical significance.” It did confirm that when the subject was racial profiling whites sat farther from blacks but was unable to attribute that to any perceived “stereotype threat” fear of being regarded as racist because the distance was largely unaffected by the “learning experience” variable. “Perhaps the prominence of racial profiling in the media, such as Ferguson, Missouri, and New York, has made people, regardless of ethnicity, more apprehensive to discuss the topic and subsequently distance themselves more during conversation,” the replication author suggested. The replication, however, did not even attempt to evaluate the authors’ conclusion that the “social distance” they found confirmed their view that “one’s concern with appearing prejudiced might have the ironic and unintended consequence of causing racial harms,” that “there may be ‘racism without racists.’” Thus there is reason to doubt whether those conclusions would be warranted even if the replication had been able “to attain statistical significance.”

In another study, “Social Identity Contingencies: How Diversity Cues Signal Threat or Safety for African Americans in Mainstream Institutions,” Journal of Personality and Social Psychology 94 (2008): 615-630, Steele et al. claim to have demonstrated that “people at risk of devaluation based on group membership are attuned to cues that signal social identity contingencies — judgments, stereotypes, opportunities, restrictions, and treatments that are tied to one’s social identity.”

In English: blacks are attuned to cues that they might be devalued because they are black. One of the most prominent threatening cues identified by Steele and his co-authors was “colorblindness,” which can be seen as “a means to ignore or invalidate the challenges that come with stigmatized group identities. Interpreted in this way, a colorblind diversity philosophy is diagnostic of marginalization, and we expect this cue to activate threatening social identity contingencies.”

The analysis of this study “did not replicate the original finding that fairness cues create more trust for Black but not White participants in an environment with low-minority representation.” It did not, however, attempt to evaluate the accuracy or reasonableness of the “cue” that a company’s colorblind policy can be seen as a threat to marginalize its black employees. Even if that and the study’s other findings had been confirmed, the original study would probably provide more convincing evidence of pervasive political correctness in the Bay Area, where the participants were selected, than of the persuasiveness of Steele’s “stereotype threat” theory.

Methodological replication, in short, is important … but it is not all-important. Studies like these three, for example, would be unconvincing even if their findings were confirmed.

One Result of Income Inequality–Dubious Psychological Studies

As an academic specialty, psychology suffers from a distinct lack of respect. For one clue as to why, consider last week’s story on Inside Higher Ed, “Does Income Inequality Promote Cheating?” A doctoral student at Queen’s University in Ontario says yes–and he didn’t even have to leave his computer to reach that conclusion. A Google search for sites that offer college students free term papers or easily plagiarized papers for sale, he says, suggests that states with the highest income inequality generate social mistrust that leads to a generally high rate of cheating.


A Double Shock to Liberal Professors

Social psychology has long been a haven for left-wing scholars. Jonathan Haidt, one of the best known and most respected young social psychologists, has heaved two bombshells at his field–one indicting it for effectively excluding conservatives (he is a liberal) and the other for what he sees as a jaundiced and cult-like opposition to religion (he is an atheist).

Here he is on the treatment of conservatives:

I submit to you that the under-representation of conservatives in social psychology, by a factor of several hundred, is evidence that we are a tribal moral community that actively discourages conservatives from entering. … We should take our own rhetoric about the benefits of diversity seriously and apply it to ourselves. … Just imagine if we had a true diversity of perspectives in social psychology. Imagine if conservative students felt free enough to challenge our dominant ideas, and bold enough to pull us out of our deepest ideological ruts. That is my vision for our bright post-partisan future.


Psychology: The Latest Threat to Campus Free Speech?

Steven Pinker, the noted Harvard psychologist and linguist, delivered an address to mark Boston’s Ford Hall Forum’s presentation of its Louis P. and Evelyn Smith First Amendment Award to the Foundation for Individual Rights in Education. Pinker’s speech draws valuably on both of his hats – psychologist and FIRE adviser – in offering a sharp analysis of the threat that fashionable psychological notions pose to free speech. Pinker outlines the subconscious force of the “psychology of taboo” and the theoretically innocuous speculations, such as putting a price on betrayal or infidelity, that “in fact are corrosive because they require people to think exactly the kind of thoughts that they should not think if they are committed friends, allies, family members.” Recognize that taboo? I’m sure you do. Individually, it’s a taboo that’s hardwired; the problem arises when institutions larger than the individual, such as academia – “which is, at least nominally devoted to pursuing the truth no matter how uncomfortable it makes people emotionally” – begin to buttress the taboo with institutional force, banning speech and inquiry of the sort that might cause discomfort and squarely quashing First Amendment rights in the process. This is the path that leads to the University of Northern Iowa’s attempt to ban “unwelcome electronic communications,” and it’s a frightening one for sure. Read the speech to find out more.