New Media Ethics 2009 course: group items tagged Meta-analysis

Weiye Loh

Odds Are, It's Wrong - Science News

  • science has long been married to mathematics. Generally it has been for the better. Especially since the days of Galileo and Newton, math has nurtured science. Rigorous mathematical methods have secured science’s fidelity to fact and conferred a timeless reliability to its findings.
  • a mutant form of math has deflected science’s heart from the modes of calculation that had long served so faithfully. Science was seduced by statistics, the math rooted in the same principles that guarantee profits for Las Vegas casinos. Supposedly, the proper use of statistics makes relying on scientific results a safe bet. But in practice, widespread misuse of statistical methods makes science more like a crapshoot.
  • science’s dirtiest secret: The “scientific method” of testing hypotheses by statistical analysis stands on a flimsy foundation. Statistical tests are supposed to guide scientists in judging whether an experimental result reflects some real effect or is merely a random fluke, but the standard methods mix mutually inconsistent philosophies and offer no meaningful basis for making such decisions. Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.
  • Experts in the math of probability and statistics are well aware of these problems and have for decades expressed concern about them in major journals. Over the years, hundreds of published papers have warned that science’s love affair with statistics has spawned countless illegitimate findings. In fact, if you believe what you read in the scientific literature, you shouldn’t believe what you read in the scientific literature.
  • “There are more false claims made in the medical literature than anybody appreciates,” he says. “There’s no question about that.” Nobody contends that all of science is wrong, or that it hasn’t compiled an impressive array of truths about the natural world. Still, any single scientific study alone is quite likely to be incorrect, thanks largely to the fact that the standard statistical system for drawing conclusions is, in essence, illogical. “A lot of scientists don’t understand statistics,” says Goodman. “And they don’t understand statistics because the statistics don’t make sense.”
  • In 2007, for instance, researchers combing the medical literature found numerous studies linking a total of 85 genetic variants in 70 different genes to acute coronary syndrome, a cluster of heart problems. When the researchers compared genetic tests of 811 patients that had the syndrome with a group of 650 (matched for sex and age) that didn’t, only one of the suspect gene variants turned up substantially more often in those with the syndrome — a number to be expected by chance. “Our null results provide no support for the hypothesis that any of the 85 genetic variants tested is a susceptibility factor” for the syndrome, the researchers reported in the Journal of the American Medical Association. How could so many studies be wrong? Because their conclusions relied on “statistical significance,” a concept at the heart of the mathematical analysis of modern scientific experiments.
  • Statistical significance is a phrase that every science graduate student learns, but few comprehend. While its origins stretch back at least to the 19th century, the modern notion was pioneered by the mathematician Ronald A. Fisher in the 1920s. His original interest was agriculture. He sought a test of whether variation in crop yields was due to some specific intervention (say, fertilizer) or merely reflected random factors beyond experimental control. Fisher first assumed that fertilizer caused no difference — the “no effect” or “null” hypothesis. He then calculated a number called the P value, the probability that an observed yield in a fertilized field would occur if fertilizer had no real effect. If P is less than .05 — meaning the chance of a fluke is less than 5 percent — the result should be declared “statistically significant,” Fisher arbitrarily declared, and the no effect hypothesis should be rejected, supposedly confirming that fertilizer works. Fisher’s P value eventually became the ultimate arbiter of credibility for science results of all sorts.
  • But in fact, there’s no logical basis for using a P value from a single study to draw any conclusion. If the chance of a fluke is less than 5 percent, two possible conclusions remain: There is a real effect, or the result is an improbable fluke. Fisher’s method offers no way to know which is which. On the other hand, if a study finds no statistically significant effect, that doesn’t prove anything, either. Perhaps the effect doesn’t exist, or maybe the statistical test wasn’t powerful enough to detect a small but real effect.
  • Soon after Fisher established his system of statistical significance, it was attacked by other mathematicians, notably Egon Pearson and Jerzy Neyman. Rather than testing a null hypothesis, they argued, it made more sense to test competing hypotheses against one another. That approach also produces a P value, which is used to gauge the likelihood of a “false positive” — concluding an effect is real when it actually isn’t. What eventually emerged was a hybrid mix of the mutually inconsistent Fisher and Neyman-Pearson approaches, which has rendered interpretations of standard statistics muddled at best and simply erroneous at worst. As a result, most scientists are confused about the meaning of a P value or how to interpret it. “It’s almost never, ever, ever stated correctly, what it means,” says Goodman.
  • experimental data yielding a P value of .05 means that there is only a 5 percent chance of obtaining the observed (or more extreme) result if no real effect exists (that is, if the no-difference hypothesis is correct). But many explanations mangle the subtleties in that definition. A recent popular book on issues involving science, for example, states a commonly held misperception about the meaning of statistical significance at the .05 level: “This means that it is 95 percent certain that the observed difference between groups, or sets of samples, is real and could not have arisen by chance.”
  • That interpretation commits an egregious logical error (technical term: “transposed conditional”): confusing the odds of getting a result (if a hypothesis is true) with the odds favoring the hypothesis if you observe that result. A well-fed dog may seldom bark, but observing the rare bark does not imply that the dog is hungry. A dog may bark 5 percent of the time even if it is well-fed all of the time. (See Box 2)
    • Weiye Loh
       
      Does the problem, then, lie not in statistics but in the interpretation of statistics? Is the fallacy of appeal to probability at work in such interpretations?
  • Another common error equates statistical significance to “significance” in the ordinary use of the word. Because of the way statistical formulas work, a study with a very large sample can detect “statistical significance” for a small effect that is meaningless in practical terms. A new drug may be statistically better than an old drug, but for every thousand people you treat you might get just one or two additional cures — not clinically significant. Similarly, when studies claim that a chemical causes a “significantly increased risk of cancer,” they often mean that it is just statistically significant, possibly posing only a tiny absolute increase in risk.
  • Statisticians perpetually caution against mistaking statistical significance for practical importance, but scientific papers commit that error often. Ziliak studied journals from various fields — psychology, medicine and economics among others — and reported frequent disregard for the distinction.
  • “I found that eight or nine of every 10 articles published in the leading journals make the fatal substitution” of equating statistical significance to importance, he said in an interview. Ziliak’s data are documented in the 2008 book The Cult of Statistical Significance, coauthored with Deirdre McCloskey of the University of Illinois at Chicago.
  • Multiplicity of mistakes. Even when “significance” is properly defined and P values are carefully calculated, statistical inference is plagued by many other problems. Chief among them is the “multiplicity” issue — the testing of many hypotheses simultaneously. When several drugs are tested at once, or a single drug is tested on several groups, chances of getting a statistically significant but false result rise rapidly.
  • Recognizing these problems, some researchers now calculate a “false discovery rate” to warn of flukes disguised as real effects. And genetics researchers have begun using “genome-wide association studies” that attempt to ameliorate the multiplicity issue (SN: 6/21/08, p. 20).
  • Many researchers now also commonly report results with confidence intervals, similar to the margins of error reported in opinion polls. Such intervals, usually given as a range that should include the actual value with 95 percent confidence, do convey a better sense of how precise a finding is. But the 95 percent confidence calculation is based on the same math as the .05 P value and so still shares some of its problems.
  • Statistical problems also afflict the “gold standard” for medical research, the randomized, controlled clinical trials that test drugs for their ability to cure or their power to harm. Such trials assign patients at random to receive either the substance being tested or a placebo, typically a sugar pill; random selection supposedly guarantees that patients’ personal characteristics won’t bias the choice of who gets the actual treatment. But in practice, selection biases may still occur, Vance Berger and Sherri Weinstein noted in 2004 in Controlled Clinical Trials. “Some of the benefits ascribed to randomization, for example that it eliminates all selection bias, can better be described as fantasy than reality,” they wrote.
  • Randomization also should ensure that unknown differences among individuals are mixed in roughly the same proportions in the groups being tested. But statistics do not guarantee an equal distribution any more than they prohibit 10 heads in a row when flipping a penny. With thousands of clinical trials in progress, some will not be well randomized. And DNA differs at more than a million spots in the human genetic catalog, so even in a single trial differences may not be evenly mixed. In a sufficiently large trial, unrandomized factors may balance out, if some have positive effects and some are negative. (See Box 3) Still, trial results are reported as averages that may obscure individual differences, masking beneficial or harmful effects and possibly leading to approval of drugs that are deadly for some and denial of effective treatment to others.
  • Another concern is the common strategy of combining results from many trials into a single “meta-analysis,” a study of studies. In a single trial with relatively few participants, statistical tests may not detect small but real and possibly important effects. In principle, combining smaller studies to create a larger sample would allow the tests to detect such small effects. But statistical techniques for doing so are valid only if certain criteria are met. For one thing, all the studies conducted on the drug must be included — published and unpublished. And all the studies should have been performed in a similar way, using the same protocols, definitions, types of patients and doses. When combining studies with differences, it is necessary first to show that those differences would not affect the analysis, Goodman notes, but that seldom happens. “That’s not a formal part of most meta-analyses,” he says.
  • Meta-analyses have produced many controversial conclusions. Common claims that antidepressants work no better than placebos, for example, are based on meta-analyses that do not conform to the criteria that would confer validity. Similar problems afflicted a 2007 meta-analysis, published in the New England Journal of Medicine, that attributed increased heart attack risk to the diabetes drug Avandia. Raw data from the combined trials showed that only 55 people in 10,000 had heart attacks when using Avandia, compared with 59 people per 10,000 in comparison groups. But after a series of statistical manipulations, Avandia appeared to confer an increased risk.
  • combining small studies in a meta-analysis is not a good substitute for a single trial sufficiently large to test a given question. “Meta-analyses can reduce the role of chance in the interpretation but may introduce bias and confounding,” Hennekens and DeMets write in the Dec. 2 Journal of the American Medical Association. “Such results should be considered more as hypothesis formulating than as hypothesis testing.”
  • Some studies show dramatic effects that don’t require sophisticated statistics to interpret. If the P value is 0.0001 — a hundredth of a percent chance of a fluke — that is strong evidence, Goodman points out. Besides, most well-accepted science is based not on any single study, but on studies that have been confirmed by repetition. Any one result may be likely to be wrong, but confidence rises quickly if that result is independently replicated. “Replication is vital,” says statistician Juliet Shaffer, a lecturer emeritus at the University of California, Berkeley. And in medicine, she says, the need for replication is widely recognized. “But in the social sciences and behavioral sciences, replication is not common,” she noted in San Diego in February at the annual meeting of the American Association for the Advancement of Science. “This is a sad situation.”
  • Most critics of standard statistics advocate the Bayesian approach to statistical reasoning, a methodology that derives from a theorem credited to Bayes, an 18th century English clergyman. His approach uses similar math, but requires the added twist of a “prior probability” — in essence, an informed guess about the expected probability of something in advance of the study. Often this prior probability is more than a mere guess — it could be based, for instance, on previous studies.
  • it basically just reflects the need to include previous knowledge when drawing conclusions from new observations. To infer the odds that a barking dog is hungry, for instance, it is not enough to know how often the dog barks when well-fed. You also need to know how often it eats — in order to calculate the prior probability of being hungry. Bayesian math combines a prior probability with observed data to produce an estimate of the likelihood of the hunger hypothesis. “A scientific hypothesis cannot be properly assessed solely by reference to the observational data,” but only by viewing the data in light of prior belief in the hypothesis, wrote George Diamond and Sanjay Kaul of UCLA’s School of Medicine in 2004 in the Journal of the American College of Cardiology. “Bayes’ theorem is ... a logically consistent, mathematically valid, and intuitive way to draw inferences about the hypothesis.” (See Box 4)
  • In many real-life contexts, Bayesian methods do produce the best answers to important questions. In medical diagnoses, for instance, the likelihood that a test for a disease is correct depends on the prevalence of the disease in the population, a factor that Bayesian math would take into account.
  • But Bayesian methods introduce a confusion into the actual meaning of the mathematical concept of “probability” in the real world. Standard or “frequentist” statistics treat probabilities as objective realities; Bayesians treat probabilities as “degrees of belief” based in part on a personal assessment or subjective decision about what to include in the calculation. That’s a tough placebo to swallow for scientists wedded to the “objective” ideal of standard statistics. “Subjective prior beliefs are anathema to the frequentist, who relies instead on a series of ad hoc algorithms that maintain the facade of scientific objectivity,” Diamond and Kaul wrote. Conflict between frequentists and Bayesians has been ongoing for two centuries. So science’s marriage to mathematics seems to entail some irreconcilable differences. Whether the future holds a fruitful reconciliation or an ugly separation may depend on forging a shared understanding of probability. “What does probability mean in real life?” the statistician David Salsburg asked in his 2001 book The Lady Tasting Tea. “This problem is still unsolved, and ... if it remains unsolved, the whole of the statistical approach to science may come crashing down from the weight of its own inconsistencies.”
  • Odds Are, It's Wrong: Science fails to face the shortcomings of statistics
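
Two calculations from the excerpts above can be checked with a few lines of Python. This is a minimal sketch, not anything taken from the Science News article: the function names are hypothetical, the tests are assumed to be independent, and the dog's prior probability of being hungry (1 percent) and its chance of barking when hungry (90 percent) are invented for illustration. Only the 85 variants, the .05 threshold, and the 5 percent bark rate for a well-fed dog come from the text.

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Expected number of 'significant' results when every null hypothesis is true."""
    return num_tests * alpha


def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** num_tests


def posterior_hungry(prior_hungry, p_bark_if_hungry, p_bark_if_fed):
    """Bayes' theorem: P(hungry | bark) from a prior and the two bark rates."""
    p_bark = (p_bark_if_hungry * prior_hungry
              + p_bark_if_fed * (1.0 - prior_hungry))
    return p_bark_if_hungry * prior_hungry / p_bark


if __name__ == "__main__":
    # Multiplicity: 85 genetic variants, each tested at the .05 level.
    print("expected false positives:", expected_false_positives(85))
    print("P(at least one false positive):", familywise_error_rate(85))

    # Transposed conditional: P(bark | well-fed) = 0.05 is not P(well-fed | bark).
    # The prior (0.01) and P(bark | hungry) (0.9) are assumed values.
    p_hungry = posterior_hungry(prior_hungry=0.01,
                                p_bark_if_hungry=0.9,
                                p_bark_if_fed=0.05)
    print("P(hungry | bark):", p_hungry)
    print("P(well-fed | bark):", 1.0 - p_hungry)
```

With these assumed numbers, a bark still leaves the dog far more likely to be well-fed than hungry, even though a well-fed dog barks only 5 percent of the time. Likewise, 85 independent tests at the .05 level would be expected to yield about four false positives even if none of the variants had a real effect.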
Weiye Loh

Meta-analysis - PsychWiki - A Collaborative Psychology Wiki

  • A meta-analysis is only informative if it adequately summarizes the existing literature, so a thorough literature search is critical to retrieve every relevant study. Search methods include database searches, the ancestry approach, the descendancy approach, hand searching, and the invisible college (i.e., the network of researchers who know about unpublished studies, conference proceedings, etc.). For more information, see Johnson & Eagly (2000) in the Handbook of Research Methods in Social and Personality Psychology, which details five general ways to retrieve relevant articles.
    • Weiye Loh
       
      How can one know that one has exhausted the "invisible college"? Perhaps we need an official record or a database of unpublished studies, conference proceedings, etc.
Weiye Loh

Alzheimer's Studies Find New Genetic Links - NYTimes.com

  • The two largest studies of Alzheimer’s disease have led to the discovery of no fewer than five genes that provide intriguing new clues to why the disease strikes and how it progresses.
  • For years, there have been unproven but persistent hints that cholesterol and inflammation are part of the disease process. People with high cholesterol are more likely to get the disease. Strokes and head injuries, which make Alzheimer’s more likely, also cause brain inflammation. Now, some of the newly discovered genes appear to bolster this line of thought, because some are involved with cholesterol and others are linked to inflammation or the transport of molecules inside cells.
  • By themselves, the genes are not nearly as important a factor as APOE, a gene discovered in 1995 that greatly increases risk for the disease: by 400 percent if a person inherits a copy from one parent, by 1,000 percent if from both parents.
  • In contrast, each of the new genes increases risk by no more than 10 to 15 percent; for that reason, they will not be used to decide if a person is likely to develop Alzheimer’s. APOE, which is involved in metabolizing cholesterol, “is in a class of its own,” said Dr. Rudolph Tanzi, a neurology professor at Harvard Medical School and an author of one of the papers.
  • But researchers say that even a slight increase in risk helps them in understanding the disease and developing new therapies. And like APOE, some of the newly discovered genes appear to be involved with cholesterol.
  • The other paper is by researchers in Britain, France and other European countries with contributions from the United States. They confirmed the genes found by the American researchers and added one more gene.
  • The American study got started about three years ago when Gerard D. Schellenberg, a pathology professor at the University of Pennsylvania, went to the National Institutes of Health with a complaint and a proposal. Individual research groups had been doing their own genome studies but not having much success, because no one center had enough subjects. In an interview, Dr. Schellenberg said that he had told Dr. Richard J. Hodes, director of the National Institute on Aging, the small genomic studies had to stop, and that Dr. Hodes had agreed. These days, Dr. Hodes said, “the old model in which researchers jealously guarded their data is no longer applicable.”
  • So Dr. Schellenberg set out to gather all the data he could on Alzheimer’s patients and on healthy people of the same ages. The idea was to compare one million positions on each person’s genome to determine whether some genes were more common in those who had Alzheimer’s. “I spent a lot of time being nice to people on the phone,” Dr. Schellenberg said. He got what he wanted: nearly every Alzheimer’s center and Alzheimer’s geneticist in the country cooperated. Dr. Schellenberg and his colleagues used the mass of genetic data to do an analysis and find the genes and then, using two different populations, to confirm that the same genes were conferring the risk. That helped assure the investigators that they were not looking at a chance association. It was a huge effort, Dr. Mayeux said. Many medical centers had Alzheimer’s patients’ tissue sitting in freezers. They had to extract the DNA and do genome scans.
  • “One of my jobs was to make sure the Alzheimer’s cases really were cases — that they had used some reasonable criteria” for diagnosis, Dr. Mayeux said. “And I had to be sure that people who were unaffected really were unaffected.”
  • Meanwhile, the European group, led by Dr. Julie Williams of the School of Medicine at Cardiff University, was engaged in a similar effort. Dr. Schellenberg said the two groups compared their results and were reassured that they were largely finding the same genes. “If there were mistakes, we wouldn’t see the same things,” he added. Now the European and American groups are pooling their data to do an enormous study, looking for genes in the combined samples. “We are upping the sample size,” Dr. Schellenberg said. “We are pretty sure more stuff will pop out.”
  • Gene Study Yields
Weiye Loh

The Decline Effect and the Scientific Method : The New Yorker

  • On September 18, 2007, a few dozen neuroscientists, psychiatrists, and drug-company executives gathered in a hotel conference room in Brussels to hear some startling news. It had to do with a class of drugs known as atypical or second-generation antipsychotics, which came on the market in the early nineties.
  • the therapeutic power of the drugs appeared to be steadily waning. A recent study showed an effect that was less than half of that documented in the first trials, in the early nineteen-nineties. Many researchers began to argue that the expensive pharmaceuticals weren’t any better than first-generation antipsychotics, which have been in use since the fifties. “In fact, sometimes they now look even worse,” John Davis, a professor of psychiatry at the University of Illinois at Chicago, told me.
  • Before the effectiveness of a drug can be confirmed, it must be tested and tested again. Different scientists in different labs need to repeat the protocols and publish their results. The test of replicability, as it’s known, is the foundation of modern research. Replicability is how the community enforces itself. It’s a safeguard for the creep of subjectivity. Most of the time, scientists know what results they want, and that can influence the results they get. The premise of replicability is that the scientific community can correct for these flaws.
  • But now all sorts of well-established, multiply confirmed findings have started to look increasingly uncertain. It’s as if our facts were losing their truth: claims that have been enshrined in textbooks are suddenly unprovable. This phenomenon doesn’t yet have an official name, but it’s occurring across a wide range of fields, from psychology to ecology. In the field of medicine, the phenomenon seems extremely widespread, affecting not only antipsychotics but also therapies ranging from cardiac stents to Vitamin E and antidepressants: Davis has a forthcoming analysis demonstrating that the efficacy of antidepressants has gone down as much as threefold in recent decades.
  • the effect is especially troubling because of what it exposes about the scientific process. If replication is what separates the rigor of science from the squishiness of pseudoscience, where do we put all these rigorously validated findings that can no longer be proved? Which results should we believe? Francis Bacon, the early-modern philosopher and pioneer of the scientific method, once declared that experiments were essential, because they allowed us to “put nature to the question.” But it appears that nature often gives us different answers.
  • At first, he assumed that he’d made an error in experimental design or a statistical miscalculation. But he couldn’t find anything wrong with his research. He then concluded that his initial batch of research subjects must have been unusually susceptible to verbal overshadowing. (John Davis, similarly, has speculated that part of the drop-off in the effectiveness of antipsychotics can be attributed to using subjects who suffer from milder forms of psychosis which are less likely to show dramatic improvement.) “It wasn’t a very satisfying explanation,” Schooler says. “One of my mentors told me that my real mistake was trying to replicate my work. He told me doing that was just setting myself up for disappointment.”
  • In private, Schooler began referring to the problem as “cosmic habituation,” by analogy to the decrease in response that occurs when individuals habituate to particular stimuli. “Habituation is why you don’t notice the stuff that’s always there,” Schooler says. “It’s an inevitable process of adjustment, a ratcheting down of excitement. I started joking that it was like the cosmos was habituating to my ideas. I took it very personally.”
  • The most likely explanation for the decline is an obvious one: regression to the mean. As the experiment is repeated, that is, an early statistical fluke gets cancelled out. The extrasensory powers of Schooler’s subjects didn’t decline—they were simply an illusion that vanished over time. And yet Schooler has noticed that many of the data sets that end up declining seem statistically solid—that is, they contain enough data that any regression to the mean shouldn’t be dramatic. “These are the results that pass all the tests,” he says. “The odds of them being random are typically quite remote, like one in a million. This means that the decline effect should almost never happen. But it happens all the time!”
  • this is why Schooler believes that the decline effect deserves more attention: its ubiquity seems to violate the laws of statistics. “Whenever I start talking about this, scientists get very nervous,” he says. “But I still want to know what happened to my results. Like most scientists, I assumed that it would get easier to document my effect over time. I’d get better at doing the experiments, at zeroing in on the conditions that produce verbal overshadowing. So why did the opposite happen? I’m convinced that we can use the tools of science to figure this out. First, though, we have to admit that we’ve got a problem.”
  • In 2001, Michael Jennions, a biologist at the Australian National University, set out to analyze “temporal trends” across a wide range of subjects in ecology and evolutionary biology. He looked at hundreds of papers and forty-four meta-analyses (that is, statistical syntheses of related studies), and discovered a consistent decline effect over time, as many of the theories seemed to fade into irrelevance. In fact, even when numerous variables were controlled for—Jennions knew, for instance, that the same author might publish several critical papers, which could distort his analysis—there was still a significant decrease in the validity of the hypothesis, often within a year of publication. Jennions admits that his findings are troubling, but expresses a reluctance to talk about them publicly. “This is a very sensitive issue for scientists,” he says. “You know, we’re supposed to be dealing with hard facts, the stuff that’s supposed to stand the test of time. But when you see these trends you become a little more skeptical of things.”
  • the worst part was that when I submitted these null results I had difficulty getting them published. The journals only wanted confirming data. It was too exciting an idea to disprove, at least back then.
  • the steep rise and slow fall of fluctuating asymmetry is a clear example of a scientific paradigm, one of those intellectual fads that both guide and constrain research: after a new paradigm is proposed, the peer-review process is tilted toward positive results. But then, after a few years, the academic incentives shift—the paradigm has become entrenched—so that the most notable results are now those that disprove the theory.
  • Jennions, similarly, argues that the decline effect is largely a product of publication bias, or the tendency of scientists and scientific journals to prefer positive data over null results, which is what happens when no effect is found. The bias was first identified by the statistician Theodore Sterling, in 1959, after he noticed that ninety-seven per cent of all published psychological studies with statistically significant data found the effect they were looking for. A “significant” result is defined as any data point that would be produced by chance less than five per cent of the time. This ubiquitous test was invented in 1922 by the English mathematician Ronald Fisher, who picked five per cent as the boundary line, somewhat arbitrarily, because it made pencil and slide-rule calculations easier. Sterling saw that if ninety-seven per cent of psychology studies were proving their hypotheses, either psychologists were extraordinarily lucky or they published only the outcomes of successful experiments. In recent years, publication bias has mostly been seen as a problem for clinical trials, since pharmaceutical companies are less interested in publishing results that aren’t favorable. But it’s becoming increasingly clear that publication bias also produces major distortions in fields without large corporate incentives, such as psychology and ecology.
  • While publication bias almost certainly plays a role in the decline effect, it remains an incomplete explanation. For one thing, it fails to account for the initial prevalence of positive results among studies that never even get submitted to journals. It also fails to explain the experience of people like Schooler, who have been unable to replicate their initial data despite their best efforts.
  • an equally significant issue is the selective reporting of results—the data that scientists choose to document in the first place. Palmer’s most convincing evidence relies on a statistical tool known as a funnel graph. When a large number of studies have been done on a single subject, the data should follow a pattern: studies with a large sample size should all cluster around a common value—the true result—whereas those with a smaller sample size should exhibit a random scattering, since they’re subject to greater sampling error. This pattern gives the graph its name, since the distribution resembles a funnel.
  • The funnel graph visually captures the distortions of selective reporting. For instance, after Palmer plotted every study of fluctuating asymmetry, he noticed that the distribution of results with smaller sample sizes wasn’t random at all but instead skewed heavily toward positive results.
  • Palmer has since documented a similar problem in several other contested subject areas. “Once I realized that selective reporting is everywhere in science, I got quite depressed,” Palmer told me. “As a researcher, you’re always aware that there might be some nonrandom patterns, but I had no idea how widespread it is.” In a recent review article, Palmer summarized the impact of selective reporting on his field: “We cannot escape the troubling conclusion that some—perhaps many—cherished generalities are at best exaggerated in their biological significance and at worst a collective illusion nurtured by strong a-priori beliefs often repeated.”
  • Palmer emphasizes that selective reporting is not the same as scientific fraud. Rather, the problem seems to be one of subtle omissions and unconscious misperceptions, as researchers struggle to make sense of their results. Stephen Jay Gould referred to this as the “shoehorning” process. “A lot of scientific measurement is really hard,” Simmons told me. “If you’re talking about fluctuating asymmetry, then it’s a matter of minuscule differences between the right and left sides of an animal. It’s millimetres of a tail feather. And so maybe a researcher knows that he’s measuring a good male”—an animal that has successfully mated—“and he knows that it’s supposed to be symmetrical. Well, that act of measurement is going to be vulnerable to all sorts of perception biases. That’s not a cynical statement. That’s just the way human beings work.”
  • One of the classic examples of selective reporting concerns the testing of acupuncture in different countries. While acupuncture is widely accepted as a medical treatment in various Asian countries, its use is much more contested in the West. These cultural differences have profoundly influenced the results of clinical trials. Between 1966 and 1995, there were forty-seven studies of acupuncture in China, Taiwan, and Japan, and every single trial concluded that acupuncture was an effective treatment. During the same period, there were ninety-four clinical trials of acupuncture in the United States, Sweden, and the U.K., and only fifty-six per cent of these studies found any therapeutic benefits. As Palmer notes, this wide discrepancy suggests that scientists find ways to confirm their preferred hypothesis, disregarding what they don’t want to see. Our beliefs are a form of blindness.
  • John Ioannidis, an epidemiologist at Stanford University, argues that such distortions are a serious issue in biomedical research. “These exaggerations are why the decline has become so common,” he says. “It’d be really great if the initial studies gave us an accurate summary of things. But they don’t. And so what happens is we waste a lot of money treating millions of patients and doing lots of follow-up studies on other themes based on results that are misleading.”
  • In 2005, Ioannidis published an article in the Journal of the American Medical Association that looked at the forty-nine most cited clinical-research studies in three major medical journals. Forty-five of these studies reported positive results, suggesting that the intervention being tested was effective. Because most of these studies were randomized controlled trials—the “gold standard” of medical evidence—they tended to have a significant impact on clinical practice, and led to the spread of treatments such as hormone replacement therapy for menopausal women and daily low-dose aspirin to prevent heart attacks and strokes. Nevertheless, the data Ioannidis found were disturbing: of the thirty-four claims that had been subject to replication, forty-one per cent had either been directly contradicted or had their effect sizes significantly downgraded.
  • The situation is even worse when a subject is fashionable. In recent years, for instance, there have been hundreds of studies on the various genes that control the differences in disease risk between men and women. These findings have included everything from the mutations responsible for the increased risk of schizophrenia to the genes underlying hypertension. Ioannidis and his colleagues looked at four hundred and thirty-two of these claims. They quickly discovered that the vast majority had serious flaws. But the most troubling fact emerged when he looked at the test of replication: out of four hundred and thirty-two claims, only a single one was consistently replicable. “This doesn’t mean that none of these claims will turn out to be true,” he says. “But, given that most of them were done badly, I wouldn’t hold my breath.”
  • the main problem is that too many researchers engage in what he calls “significance chasing,” or finding ways to interpret the data so that it passes the statistical test of significance—the ninety-five-per-cent boundary invented by Ronald Fisher. “The scientists are so eager to pass this magical test that they start playing around with the numbers, trying to find anything that seems worthy,” Ioannidis says. In recent years, Ioannidis has become increasingly blunt about the pervasiveness of the problem. One of his most cited papers has a deliberately provocative title: “Why Most Published Research Findings Are False.”
  • The problem of selective reporting is rooted in a fundamental cognitive flaw, which is that we like proving ourselves right and hate being wrong. “It feels good to validate a hypothesis,” Ioannidis said. “It feels even better when you’ve got a financial interest in the idea or your career depends upon it. And that’s why, even after a claim has been systematically disproven”—he cites, for instance, the early work on hormone replacement therapy, or claims involving various vitamins—“you still see some stubborn researchers citing the first few studies that show a strong effect. They really want to believe that it’s true.”
  • scientists need to become more rigorous about data collection before they publish. “We’re wasting too much time chasing after bad studies and underpowered experiments,” he says. The current “obsession” with replicability distracts from the real problem, which is faulty design. He notes that nobody even tries to replicate most science papers—there are simply too many. (According to Nature, a third of all studies never even get cited, let alone repeated.)
  • Schooler recommends the establishment of an open-source database, in which researchers are required to outline their planned investigations and document all their results. “I think this would provide a huge increase in access to scientific work and give us a much better way to judge the quality of an experiment,” Schooler says. “It would help us finally deal with all these issues that the decline effect is exposing.”
  • Although such reforms would mitigate the dangers of publication bias and selective reporting, they still wouldn’t erase the decline effect. This is largely because scientific research will always be shadowed by a force that can’t be curbed, only contained: sheer randomness. Although little research has been done on the experimental dangers of chance and happenstance, the research that exists isn’t encouraging
  • John Crabbe, a neuroscientist at the Oregon Health and Science University, conducted an experiment that showed how unknowable chance events can skew tests of replicability. He performed a series of experiments on mouse behavior in three different science labs: in Albany, New York; Edmonton, Alberta; and Portland, Oregon. Before he conducted the experiments, he tried to standardize every variable he could think of. The same strains of mice were used in each lab, shipped on the same day from the same supplier. The animals were raised in the same kind of enclosure, with the same brand of sawdust bedding. They had been exposed to the same amount of incandescent light, were living with the same number of littermates, and were fed the exact same type of chow pellets. When the mice were handled, it was with the same kind of surgical glove, and when they were tested it was on the same equipment, at the same time in the morning.
  • The premise of this test of replicability, of course, is that each of the labs should have generated the same pattern of results. “If any set of experiments should have passed the test, it should have been ours,” Crabbe says. “But that’s not the way it turned out.” In one experiment, Crabbe injected a particular strain of mouse with cocaine. In Portland the mice given the drug moved, on average, six hundred centimetres more than they normally did; in Albany they moved seven hundred and one additional centimetres. But in the Edmonton lab they moved more than five thousand additional centimetres. Similar deviations were observed in a test of anxiety. Furthermore, these inconsistencies didn’t follow any detectable pattern. In Portland one strain of mouse proved most anxious, while in Albany another strain won that distinction.
  • The disturbing implication of the Crabbe study is that a lot of extraordinary scientific data are nothing but noise. The hyperactivity of those coked-up Edmonton mice wasn’t an interesting new fact—it was a meaningless outlier, a by-product of invisible variables we don’t understand. The problem, of course, is that such dramatic findings are also the most likely to get published in prestigious journals, since the data are both statistically significant and entirely unexpected. Grants get written, follow-up studies are conducted. The end result is a scientific accident that can take years to unravel.
  • This suggests that the decline effect is actually a decline of illusion.
  • While Karl Popper imagined falsification occurring with a single, definitive experiment—Galileo refuted Aristotelian mechanics in an afternoon—the process turns out to be much messier than that. Many scientific theories continue to be considered true even after failing numerous experimental tests. Verbal overshadowing might exhibit the decline effect, but it remains extensively relied upon within the field. The same holds for any number of phenomena, from the disappearing benefits of second-generation antipsychotics to the weak coupling ratio exhibited by decaying neutrons, which appears to have fallen by more than ten standard deviations between 1969 and 2001. Even the law of gravity hasn’t always been perfect at predicting real-world phenomena. (In one test, physicists measuring gravity by means of deep boreholes in the Nevada desert found a two-and-a-half-per-cent discrepancy between the theoretical predictions and the actual data.) Despite these findings, second-generation antipsychotics are still widely prescribed, and our model of the neutron hasn’t changed. The law of gravity remains the same.
  • Such anomalies demonstrate the slipperiness of empiricism. Although many scientific ideas generate conflicting results and suffer from falling effect sizes, they continue to get cited in the textbooks and drive standard medical practice. Why? Because these ideas seem true. Because they make sense. Because we can’t bear to let them go. And this is why the decline effect is so troubling. Not because it reveals the human fallibility of science, in which data are tweaked and beliefs shape perceptions. (Such shortcomings aren’t surprising, at least for scientists.) And not because it reveals that many of our most exciting theories are fleeting fads and will soon be rejected. (That idea has been around since Thomas Kuhn.) The decline effect is troubling because it reminds us how difficult it is to prove anything. We like to pretend that our experiments define the truth for us. But that’s often not the case. Just because an idea is true doesn’t mean it can be proved. And just because an idea can be proved doesn’t mean it’s true. When the experiments are done, we still have to choose what to believe.
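The funnel-graph argument in the excerpts above can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not Palmer's or Sterling's actual method: the true effect is set to zero, measurements are drawn from a normal distribution with a known standard deviation of 1, and a study counts as "published" only if it clears a one-sided five-per-cent significance threshold. The function names (`simulate_study`, `run`, `mean_estimate_by_size`) and all parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate_study(n, true_effect, rng):
    """Simulate one study of n subjects; return (estimated effect, z-score)."""
    sample = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    estimate = statistics.fmean(sample)
    standard_error = 1.0 / math.sqrt(n)  # assumption: population sd is known to be 1
    return estimate, estimate / standard_error


def run(num_studies=2000, true_effect=0.0, seed=1):
    """Simulate many studies and apply a selective-reporting filter."""
    rng = random.Random(seed)
    all_studies, published = [], []
    for _ in range(num_studies):
        n = rng.choice([10, 20, 40, 80, 160, 320])  # hypothetical sample sizes
        estimate, z = simulate_study(n, true_effect, rng)
        all_studies.append((n, estimate))
        # Selective reporting: keep only "significant" positive results
        # (one-sided z > 1.645, roughly p < 0.05).
        if z > 1.645:
            published.append((n, estimate))
    return all_studies, published


def mean_estimate_by_size(studies):
    """Average the estimated effect within each sample size."""
    by_size = {}
    for n, estimate in studies:
        by_size.setdefault(n, []).append(estimate)
    return {n: statistics.fmean(values) for n, values in sorted(by_size.items())}


if __name__ == "__main__":
    all_studies, published = run()
    print("share published:", len(published) / len(all_studies))
    print("all studies:    ", mean_estimate_by_size(all_studies))
    print("published only: ", mean_estimate_by_size(published))
```

In this sketch the full set of studies scatters around zero at every sample size, which is the symmetric funnel. The published subset contains only positive estimates, and the smallest studies show the largest apparent effects, because a small study has to overshoot by more to reach significance. Larger follow-up studies would then report smaller effects, which looks like a decline.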
Weiye Loh

Research integrity: Sabotage! : Nature News - 0 views

  • University of Michigan in Ann Arbor
  • Vipul Bhrigu, a former postdoc at the university's Comprehensive Cancer Center, wears a dark-blue three-buttoned suit and a pinched expression as he cups his pregnant wife's hand in both of his. When Pollard Hines calls Bhrigu's case to order, she has stern words for him: "I was inclined to send you to jail when I came out here this morning."
  • Bhrigu, over the course of several months at Michigan, had meticulously and systematically sabotaged the work of Heather Ames, a graduate student in his lab, by tampering with her experiments and poisoning her cell-culture media. Captured on hidden camera, Bhrigu confessed to university police in April and pleaded guilty to malicious destruction of personal property, a misdemeanour that apparently usually involves cars: in the spaces for make and model on the police report, the arresting officer wrote "lab research" and "cells". Bhrigu has said on multiple occasions that he was compelled by "internal pressure" and had hoped to slow down Ames's work. Speaking earlier this month, he was contrite. "It was a complete lack of moral judgement on my part," he said.
  • Bhrigu's actions are surprising, but probably not unique. There are few firm numbers showing the prevalence of research sabotage, but conversations with graduate students, postdocs and research-misconduct experts suggest that such misdeeds occur elsewhere, and that most go unreported or unpoliced. In this case, the episode set back research, wasted potentially tens of thousands of dollars and terrorized a young student. More broadly, acts such as Bhrigu's — along with more subtle actions to hold back or derail colleagues' work — have a toxic effect on science and scientists. They are an affront to the implicit trust between scientists that is necessary for research endeavours to exist and thrive.
  • Despite all this, there is little to prevent perpetrators re-entering science.
  • federal bodies that provide research funding have limited ability and inclination to take action in sabotage cases because they aren't interpreted as fitting the federal definition of research misconduct, which is limited to plagiarism, fabrication and falsification of research data.
  • In Bhrigu's case, administrators at the University of Michigan worked with police to investigate, thanks in part to the persistence of Ames and her supervisor, Theo Ross. "The question is, how many universities have such procedures in place that scientists can go and get that kind of support?" says Christine Boesz, former inspector-general for the US National Science Foundation in Arlington, Virginia, and now a consultant on scientific accountability. "Most universities I was familiar with would not necessarily be so responsive."
  • Some labs are known to be hyper-competitive, with principal investigators pitting postdocs against each other. But Ross's lab is a small, collegial place. At the time that Ames was noticing problems, it housed just one other graduate student, a few undergraduates doing projects, and the lab manager, Katherine Oravecz-Wilson, a nine-year veteran of the lab whom Ross calls her "eyes and ears". And then there was Bhrigu, an amiable postdoc who had joined the lab in April 2009.
  • Some people whom Ross consulted with tried to convince her that Ames was hitting a rough patch in her work and looking for someone else to blame. But Ames was persistent, so Ross took the matter to the university's office of regulatory affairs, which advises on a wide variety of rules and regulations pertaining to research and clinical care. Ray Hutchinson, associate dean of the office, and Patricia Ward, its director, had never dealt with anything like it before. After several meetings and two more instances of alcohol in the media, Ward contacted the department of public safety — the university's police force — on 9 March. They immediately launched an investigation — into Ames herself. She endured two interrogations and a lie-detector test before investigators decided to look elsewhere.
  • At 4:00 a.m. on Sunday 18 April, officers installed two cameras in the lab: one in the cold room where Ames's blots had been contaminated, and one above the refrigerator where she stored her media. Ames came in that day and worked until 5:00 p.m. On Monday morning at around 10:15, she found that her medium had been spiked again. When Ross reviewed the tapes of the intervening hours with Richard Zavala, the officer assigned to the case, she says that her heart sank. Bhrigu entered the lab at 9:00 a.m. on Monday and pulled out the culture media that he would use for the day. He then returned to the fridge with a spray bottle of ethanol, usually used to sterilize lab benches. With his back to the camera, he rummaged through the fridge for 46 seconds. Ross couldn't be sure what he was doing, but it didn't look good. Zavala escorted Bhrigu to the campus police department for questioning. When he told Bhrigu about the cameras in the lab, the postdoc asked for a drink of water and then confessed. He said that he had been sabotaging Ames's work since February. (He denies involvement in the December and January incidents.)
  • Misbehaviour in science is nothing new — but its frequency is difficult to measure. Daniele Fanelli at the University of Edinburgh, UK, who studies research misconduct, says that overtly malicious offences such as Bhrigu's are probably infrequent, but other forms of indecency and sabotage are likely to be more common. "A lot more would be the kind of thing you couldn't capture on camera," he says. Vindictive peer review, dishonest reference letters and withholding key aspects of protocols from colleagues or competitors can do just as much to derail a career or a research project as vandalizing experiments. These are just a few of the questionable practices that seem quite widespread in science, but are not technically considered misconduct. In a meta-analysis of misconduct surveys, published last year (D. Fanelli PLoS ONE 4, e5738; 2009), Fanelli found that up to one-third of scientists admit to offences that fall into this grey area, and up to 70% say that they have observed them.
  • Some say that the structure of the scientific enterprise is to blame. The big rewards — tenured positions, grants, papers in stellar journals — are won through competition. To get ahead, researchers need only be better than those they are competing with. That ethos, says Brian Martinson, a sociologist at HealthPartners Research Foundation in Minneapolis, Minnesota, can lead to sabotage. He and others have suggested that universities and funders need to acknowledge the pressures in the research system and try to ease them by means of education and rehabilitation, rather than simply punishing perpetrators after the fact.
  • Bhrigu says that he felt pressure in moving from the small college at Toledo to the much bigger one in Michigan. He says that some criticisms he received from Ross about his incomplete training and his work habits frustrated him, but he doesn't blame his actions on that. "In any kind of workplace there is bound to be some pressure," he says. "I just got jealous of others moving ahead and I wanted to slow them down."
  • At Washtenaw County Courthouse in July, having reviewed the case files, Pollard Hines delivered Bhrigu's sentence. She ordered him to pay around US$8,800 for reagents and experimental materials, plus $600 in court fees and fines — and to serve six months' probation, perform 40 hours of community service and undergo a psychiatric evaluation.
  • But the threat of a worse sentence hung over Bhrigu's head. At the request of the prosecutor, Ross had prepared a more detailed list of damages, including Bhrigu's entire salary, half of Ames's, six months' salary for a technician to help Ames get back up to speed, and a quarter of the lab's reagents. The court arrived at a possible figure of $72,000, with the final amount to be decided upon at a restitution hearing in September.
  • Ross, though, is happy that the ordeal is largely over. For the month-and-a-half of the investigation, she became reluctant to take on new students or to hire personnel. She says she considered packing up her research programme. She even questioned her own sanity, worrying that she was the one sabotaging Ames's work via "an alternate personality". Ross now wonders if she was too trusting, and urges other lab heads to "realize that the whole spectrum of humanity is in your lab. So, when someone complains to you, take it seriously."
  • She also urges others to speak up when wrongdoing is discovered. After Bhrigu pleaded guilty in June, Ross called Trempe at the University of Toledo. He was shocked, of course, and for more than one reason. His department at Toledo had actually re-hired Bhrigu. Bhrigu says that he lied about the reason he left Michigan, blaming it on disagreements with Ross. Toledo let Bhrigu go in July, not long after Ross's call.
  • Now that Bhrigu is in India, there is little to prevent him from getting back into science. And even if he were in the United States, there wouldn't be much to stop him. The National Institutes of Health in Bethesda, Maryland, through its Office of Research Integrity, will sometimes bar an individual from receiving federal research funds for a time if they are found guilty of misconduct. But Bhrigu probably won't face that prospect because his actions don't fit the federal definition of misconduct, a situation Ross finds strange. "All scientists will tell you that it's scientific misconduct because it's tampering with data," she says.
  • Ames says that the experience shook her trust in her chosen profession. "I did have doubts about continuing with science. It hurt my idea of science as a community that works together, builds upon each other's work and collaborates."
  • Research integrity: Sabotage! Postdoc Vipul Bhrigu destroyed the experiments of a colleague in order to get ahead.
Weiye Loh

News Clips: Pinning down acupuncture: It's a placebo - 0 views

  • some doctors seem to have embraced even disproven remedies. Take, for instance, a review of acupuncture research that appeared last July in the New England Journal of Medicine. This highly respected journal is one of the most widely read by doctors across specialities. In Acupuncture For Chronic Low Back Pain, the authors reviewed clinical trials done to assess if acupuncture actually helps in chronic low back pain. The most important meta-analysis available was a 2008 study involving 6,359 patients, which 'showed that real acupuncture treatments were no more effective than sham acupuncture treatments'.
  • The authors then editorialised: 'There was nevertheless evidence that both real acupuncture and sham acupuncture were more effective than no treatment and that acupuncture can be a useful supplement to other forms of conventional therapy for low back pain.'
  • First, they admit that pooled clinical trials of the best sort show that real acupuncture does no better than sham acupuncture. This should mean that acupuncture does not work - full stop. But then they say that sham and real acupuncture work as well as each other and thus are useful. Translation: Please use acupuncture as a placebo on your patients; just don't let them know it is a placebo.
  • I should add that I am not criticising TCM per se. Only acupuncture, a facet of TCM, albeit its most dramatic, is being scrutinised here. Chinese herbology must be analysed on its own merits. Interestingly, although acupuncture may be TCM's poster boy today, the Chinese physician in days of yore would have looked askance at it. Instead, his practice and prestige were based upon his grasp of the Chinese pharmacopoeia.
  • Acupuncture was left to the shamans and blood letters. After all, it was grounded, not in the knowledge of which herbs were best for what conditions, but astrology.
  • In Giovanni Maciocia's 2005 book, The Foundations Of Chinese Medicine: A Comprehensive Text For Acupuncturists And Herbalists, there is a chart showing the astrological provenance of acupuncture. The chart shows how the 12 main acupuncture meridians and the 12 main body segments correspond to the 12 Houses of the Chinese zodiac.
  • In Chinese cosmology, all life is animated by a numinous force called qi, the flow of which mirrors the sun's apparent 'movement' during the year through the ecliptic. (The ecliptic is the imaginary plane of the earth's orbit around the sun). Moreover, everything in the Chinese zodiac is mirrored on Earth and in Man. This was taught even in the earliest systematised TCM text, the Yellow Emperor's Canon Of Medicine, thus: 'Heaven is covered with constellations, Earth with waterways, and man with channels.' This 'as above, so below' doctrine means that if there is qi flowing around in the imaginary closed loop of the zodiac, there is qi flowing correspondingly in the body's closed loop of imaginary meridians as well.
  • Note that not only is acupuncture astrological in origin but also the astrology is based on a model of the universe which has the earth at its centre. This geocentric model was an erroneous idea widely accepted before the Copernican revolution.
  • So should doctors check the daily horoscopes of their patients?
Weiye Loh

The Breakthrough Institute: New Report: How Efficiency Can Increase Energy Consumption - 0 views

  • There is a large expert consensus and strong evidence that below-cost energy efficiency measures drive a rebound in energy consumption that erodes much and in some cases all of the expected energy savings, concludes a new report by the Breakthrough Institute. "Energy Emergence: Rebound and Backfire as Emergent Phenomena" covers over 96 published journal articles and is one of the largest reviews of the peer-reviewed journal literature to date. (Readers in a hurry can download Breakthrough's PowerPoint demonstration here or download the full paper here.)
  • In a statement accompanying the report, Breakthrough Institute founders Ted Nordhaus and Michael Shellenberger wrote, "Below-cost energy efficiency is critical for economic growth and should thus be aggressively pursued by governments and firms. However, it should no longer be considered a direct and easy way to reduce energy consumption or greenhouse gas emissions." The lead author of the new report is Jesse Jenkins, Breakthrough's Director of Energy and Climate Policy; Nordhaus and Shellenberger are co-authors.
  • The findings of the new report are significant because governments have in recent years relied heavily on energy efficiency measures as a means to cut greenhouse gases. "I think we have to have a strong push toward energy efficiency," said President Obama recently. "We know that's the low-hanging fruit, we can save as much as 30 percent of our current energy usage without changing our quality of life." While there is robust evidence for rebound in academic peer-reviewed journals, it has largely been ignored by major analyses, including the widely cited 2009 McKinsey and Co. study on the cost of reducing greenhouse gases.
  • The idea that increased energy efficiency can increase energy consumption at the macro-economic level strikes many as a new idea, or paradoxical, but it was first observed in 1865 by British economist William Stanley Jevons, who pointed out that Watt's more efficient steam engine and other technical improvements that increased the efficiency of coal consumption actually increased rather than decreased demand for coal. More efficient engines, Jevons argued, would increase future coal consumption by lowering the effective price of energy, thus spurring greater demand and opening up useful and profitable new ways to utilize coal. Jevons was proven right, and the reality of what is today known as "Jevons Paradox" has long been uncontroversial among economists.
  • Economists have long observed that increasing the productivity of any single factor of production -- whether labor, capital, or energy -- increases demand for all of those factors. This is one of the basic dynamics of economic growth. Luddites who feared there would be fewer jobs with the emergence of weaving looms were proved wrong by lower prices for woven clothing and demand that has skyrocketed (and continued to increase) ever since. And today, no economist would posit that an X% improvement in labor productivity would lead directly to an X% reduction in employment. In fact, the opposite is widely expected: labor productivity is a chief driver of economic growth and thus increases in employment overall. There is no evidence, the report points out, that energy is any different, as per capita energy consumption everywhere on earth continues to rise, even as economies become more efficient each year.
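The excerpts above describe rebound as eroding "much and in some cases all" of the expected energy savings. One common way to put a number on that, sketched here as an assumption rather than as the report's own formula, is the fraction of expected savings that fails to materialize. The function name `rebound_fraction` and the sample figures are hypothetical.

```python
def rebound_fraction(expected_savings, actual_savings):
    """Fraction of the expected energy savings lost to increased consumption.

    0.0 means the full savings were realized, 1.0 means all savings were
    eroded, and a value above 1.0 means consumption rose overall (backfire).
    Both arguments must use the same energy unit.
    """
    if expected_savings <= 0:
        raise ValueError("expected_savings must be positive")
    return 1.0 - actual_savings / expected_savings


if __name__ == "__main__":
    # Hypothetical figures: 100 units of savings were projected.
    for actual in (100, 40, 0, -20):
        print(actual, rebound_fraction(100, actual))
```

Under this definition, the case Jevons described for coal, where efficiency gains raise total demand, corresponds to a value above 1.0.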
Weiye Loh

Political - or politicized? - psychology » Scienceline - 0 views

  • The idea that your personal characteristics could be linked to your political ideology has intrigued political psychologists for decades. Numerous studies suggest that liberals and conservatives differ not only in their views toward government and society, but also in their behavior, their personality, and even how they travel, decorate, clean and spend their leisure time. In today’s heated political climate, understanding people on the “other side” — whether that side is left or right — takes on new urgency. But as researchers study the personal side of politics, could they be influenced by political biases of their own?
  • Consider the following 2006 study by the late California psychologists Jeanne and Jack Block, which compared the personalities of nursery school children to their political leanings as 23-year olds. Preschoolers who went on to identify as liberal were described by the authors as self-reliant, energetic, somewhat dominating and resilient. The children who later identified as conservative were described as easily offended, indecisive, fearful, rigid, inhibited and vulnerable. The negative descriptions of conservatives in this study strike Jacob Vigil, a psychologist at the University of New Mexico, as morally loaded. Studies like this one, he said, use language that suggests the researchers are “motivated to present liberals with more ideal descriptions as compared to conservatives.”
  • Most of the researchers in this field are, in fact, liberal. In 2007 UCLA’s Higher Education Research Institute conducted a survey of faculty at four-year colleges and universities in the United States. About 68 percent of the faculty in history, political science and social science departments characterized themselves as liberal, 22 percent characterized themselves as moderate, and only 10 percent as conservative. Some social psychologists, like Jonathan Haidt of the University of Virginia, have charged that this liberal majority distorts the research in political psychology.
  • It’s a charge that John Jost, a social psychologist at New York University, flatly denies. Findings in political psychology bear upon deeply held personal beliefs and attitudes, he said, so they are bound to spark controversy. Research showing that conservatives score higher on measures of “intolerance of ambiguity” or the “need for cognitive closure” might bother some people, said Jost, but that does not make it biased.
  • “The job of the behavioral scientist is not to try to find something to say that couldn’t possibly be offensive,” said Jost. “Our job is to say what we think is true, and why.”
  • Jost and his colleagues in 2003 compiled a meta-analysis of 88 studies from 12 different countries conducted over a 40-year period. They found strong evidence that conservatives tend to have higher needs to reduce uncertainty and threat. Conservatives also share psychological factors like fear, aggression, dogmatism, and the need for order, structure and closure. Political conservatism, they explained, could serve as a defense against anxieties and threats that arise out of everyday uncertainty, by justifying the status quo and preserving conditions that are comfortable and familiar.
  • The study triggered quite a public reaction, particularly within the conservative blogosphere. But the criticisms, according to Jost, were mistakenly focused on the researchers themselves; the findings were not disputed by the scientific community and have since been replicated. For example, a 2009 study followed college students over the span of their undergraduate experience and found that higher perceptions of threat did indeed predict political conservatism. Another 2009 study found that when confronted with a threat, liberals actually become more psychologically and politically conservative. Some studies even suggest that physiological traits like sensitivity to sudden noises or threatening images are associated with conservative political attitudes.
  • “The debate should always be about the data and its proper interpretation,” said Jost, “and never about the characteristics or motives of the researchers.” Philip Tetlock, a psychologist at the University of California, Berkeley, agrees. However, Tetlock thinks that identifying the proper interpretation can be tricky, since personality measures can be described in many ways. “One observer’s ‘dogmatism’ can be another’s ‘principled,’ and one observer’s ‘open-mindedness’ can be another’s ‘flaccid and vacillating,’” Tetlock explained.
  • Richard Redding, a professor of law and psychology at Chapman University in Orange, California, points to a more general, indirect bias in political psychology. “It’s not the case that researchers are intentionally skewing the data,” Redding said, adding that such skewing rarely happens. Rather, the problem may lie in what sorts of questions are or are not asked.
  • For example, a conservative might be more inclined to undertake research on affirmative action in a way that would identify any negative outcomes, whereas a liberal probably wouldn’t, said Redding. Likewise, there may be aspects of personality that liberals simply haven’t considered. Redding is currently conducting a large-scale study on self-righteousness, which he suspects may be associated more highly with liberals than conservatives.
  • “The way you frame a problem is to some extent dictated by what you think the problem is,” said David Sears, a political psychologist at the University of California, Los Angeles. People’s strong feelings about issues like prejudice, sexism, authoritarianism, aggression, and nationalism — the bread and butter of political psychology — may influence how they design a study or present a problem.
  • The indirect bias that Sears and Redding identify is a far cry from the liberal groupthink others warn against. But given that psychology departments are predominantly left-leaning, it’s important to seek out alternative viewpoints and explanations, said Jesse Graham, a social psychologist at the University of Southern California. A self-avowed liberal, Graham thinks it would be absurd to say he couldn’t do fair science because of his political preferences. “But,” he said, “it is something that I try to keep in mind.”
    The idea that your personal characteristics could be linked to your political ideology has intrigued political psychologists for decades. Numerous studies suggest that liberals and conservatives differ not only in their views toward government and society, but also in their behavior, their personality, and even how they travel, decorate, clean and spend their leisure time. In today's heated political climate, understanding people on the "other side" - whether that side is left or right - takes on new urgency. But as researchers study the personal side of politics, could they be influenced by political biases of their own?