
Home/ New Media Ethics 2009 course/ Group items tagged Kuhn


Weiye Loh

The Ashtray: The Ultimatum (Part 1) - NYTimes.com

  • “Under no circumstances are you to go to those lectures. Do you hear me?” Kuhn, the head of the Program in the History and Philosophy of Science at Princeton where I was a graduate student, had issued an ultimatum. It concerned the philosopher Saul Kripke’s lectures — later to be called “Naming and Necessity” — which he had originally given at Princeton in 1970 and planned to give again in the fall of 1972.
  • Whiggishness — in history of science, the tendency to evaluate and interpret past scientific theories not on their own terms, but in the context of current knowledge. The term comes from Herbert Butterfield’s “The Whig Interpretation of History,” written when Butterfield, a future Regius professor of history at Cambridge, was only 31 years old. Butterfield had complained about Whiggishness, describing it as “…the study of the past with direct and perpetual reference to the present” – the tendency to see all history as progressive, and in an extreme form, as an inexorable march to greater liberty and enlightenment. [3] For Butterfield, on the other hand, “…real historical understanding” can be achieved only by “attempting to see life with the eyes of another century than our own.” [4][5].
  • Kuhn had attacked my Whiggish use of the term “displacement current.” [6] I had failed, in his view, to put myself in the mindset of Maxwell’s first attempts at creating a theory of electricity and magnetism. I felt that Kuhn had misinterpreted my paper, and that he — not me — had provided a Whiggish interpretation of Maxwell. I said, “You refuse to look through my telescope.” And he said, “It’s not a telescope, Errol. It’s a kaleidoscope.” (In this respect, he was probably right.) [7].
  • ...9 more annotations...
  • I asked him, “If paradigms are really incommensurable, how is history of science possible? Wouldn’t we be merely interpreting the past in the light of the present? Wouldn’t the past be inaccessible to us? Wouldn’t it be ‘incommensurable?’ ” [8] He started moaning. He put his head in his hands and was muttering, “He’s trying to kill me. He’s trying to kill me.” And then I added, “…except for someone who imagines himself to be God.” It was at this point that Kuhn threw the ashtray at me.
  • I call Kuhn’s reply “The Ashtray Argument.” If someone says something you don’t like, you throw something at him. Preferably something large, heavy, and with sharp edges. Perhaps we were engaged in a debate on the nature of language, meaning and truth. But maybe we just wanted to kill each other.
  • That's the problem with relativism: Who's to say who's right and who's wrong? Somehow I'm not surprised to hear Kuhn was an ashtray-hurler. In the end, what other argument could he make?
  • For us to have a conversation and come to an agreement about the meaning of some word without having to refer to some outside authority like a dictionary, we would of necessity have to be satisfied that our agreement was genuine and not just a polite acknowledgement of each other's right to an opinion, can you agree with that? If so, then let's see if we can agree on the meaning of the word 'know' because that may be the crux of the matter. When I use the word 'know' I mean more than the capacity to apprehend some aspect of the world through language or some other representational symbolism. Included in the word 'know' is the direct sensorial perception of some aspect of the world. For example, I sense the floor that my feet are now resting upon. I 'know' the floor is really there, I can sense it. Perhaps I don't 'know' what the floor is made of, who put it there, and other incidental facts one could know through the usual symbolism such as language as in a story someone tells me. Nevertheless, the reality I need to 'know' is that the floor, or whatever you may wish to call the solid - relative to my body - flat and level surface supported by more structure than the earth, is really there and reliably capable of supporting me. This is true and useful knowledge that goes directly from the floor itself to my knowing about it - via sensation - and that has nothing to do with my interpretive system.
  • Now I am interested in 'knowing' my feet in the same way that my feet and the whole body they are connected to 'know' the floor. I sense my feet sensing the floor. My feet are as real as the floor and I know they are there, sensing the floor, because I can sense them. Furthermore, now I 'know' that it is 'I' sensing my feet, sensing the floor. Do you see where I am going with this line of thought? I am including in the word 'know' more meaning than it is commonly given by everyday language. Perhaps it sounds as if I want to expand on the Cartesian formula of cogito ergo sum, and in truth I prefer to say I sense therefore I am. It is through my sensations of the world, first and foremost, that my awareness, such as it is, is actively engaged with reality. Now, any healthy normal animal senses the world, but we can't 'know' if they experience reality as we do, since we can't have a conversation with them to arrive at agreement. But we humans can have this conversation and possibly agree that we can 'know' the world through sensation. We can even know what is 'I' through sensation. In fact, there is no other way to know 'I' except through sensation. Thought is symbolic representation, not direct sensing, so even though the thoughtful modality of regarding the world may be a far more reliable modality than sensation in predicting what might happen next, its very capacity for such accurate prediction is also its biggest weakness: its capacity for error.
  • Sensation cannot be 'wrong' unless it is used to predict outcomes. Thought can be wrong for both predicting outcomes and for 'knowing' reality. Sensation alone can 'know' reality even though it is relatively unreliable, useless even, for making predictions.
  • If we prioritize our interests by placing predictability over pure knowing through sensation, then of course we will not value the 'knowledge' to be gained through sensation. But if we can switch the priorities - out of sheer curiosity perhaps - then we can enter a realm of knowledge through sensation that is unbelievably spectacular. Our bodies are 'made of' reality, and by methodically exercising our nascent capacity for self sensing, we can connect our knowing 'I' to reality directly. We will not be able to 'know' what it is that we are experiencing in the way we might wish, which is to be able to predict what will happen next or to represent to ourselves symbolically what we might experience when we turn our attention to that sensation. But we can arrive at a depth and breadth of 'knowing' that is utterly unprecedented in our lives by operating that modality.
  • One of the impressions that comes from a sustained practice of self sensing is a clearer feeling for what "I" is and why we have a word for that self referential phenomenon, seemingly located somewhere behind our eyes and between our ears. The thing we call "I" or "me", depending on the context, turns out to be a moving point, a convergence vector for a variety of images, feelings and sensations. It is a reference point into which certain impressions flow and out of which certain impulses to act diverge and which may or may not animate certain muscle groups into action. Following this tricky exercise in attention and sensation, we can quickly see for ourselves that attention is more like a focused beam and awareness is more like a diffuse cloud, but both are composed of energy, and like all energy they vibrate, they oscillate with a certain frequency. That's it for now.
  • I loved the writer's efforts to find a fixed definition of “Incommensurability;” there was of course never a concrete meaning behind the word. Smoke and mirrors.
Weiye Loh

Skepticblog » The Value of Vertigo

  • But Ruse’s moment of vertigo is not as surprising as it may appear. Indeed, he put effort into achieving this immersion: “I am atypical, I took about three hours to go through [the creation museum] but judging from my students most people don’t read the material as obsessively as I and take about an hour.” Why make this meticulous effort, when he could have dismissed creationism’s well-known scientific problems from the parking lot, or from his easy chair at home?
  • According to Ruse, the vertiginous “what if?” feeling has a practical value. After all, it’s easy to find problems with a pseudoscientific belief; what’s harder is understanding how and why other people believe. “It is silly just to dismiss this stuff as false,” Ruse argues (although it is false, and although Ruse has fought against “this stuff” for decades). “A lot of people believe Creationism so we on the other side need to get a feeling not just for the ideas but for the psychology too.”
  • In June of 2009, philosopher of biology Michael Ruse took a group of grad students to the Answers in Genesis Creation Museum in Kentucky (and also some mainstream institutions) as part of a course on how museums present science. In a critical description of his visit, Ruse reflected upon "the extent to which the Creationist museum uses modern science to its own ends, melding it in seamlessly with its own Creationist message." Continental drift, the Big Bang, and even natural selection are all presented as evidence in support of Young Earth cosmology and flood geology. While immersing himself in the museum's pitch, Ruse wrote, "Just for one moment about half way through the exhibit…I got that Kuhnian flash that it could all be true - it was only a flash (rather like thinking that Freudianism is true or that the Republicans are right on anything whatsoever) but it was interesting nevertheless to get a sense of how much sense this whole display and paradigm can make to people."
Weiye Loh

Skepticblog » The Decline Effect

  • The first group are those with an overly simplistic or naive sense of how science functions. This is a view of science similar to those films created in the 1950s and meant to be watched by students, with the jaunty music playing in the background. This view generally respects science, but has a significant underappreciation for the flaws and complexity of science as a human endeavor. Those with this view are easily scandalized by revelations of the messiness of science.
  • The second cluster is what I would call scientific skepticism – which combines a respect for science and empiricism as a method (really “the” method) for understanding the natural world, with a deep appreciation for all the myriad ways in which the endeavor of science can go wrong. Scientific skeptics, in fact, seek to formally understand the process of science as a human endeavor with all its flaws. It is therefore often skeptics pointing out phenomena such as publication bias, the placebo effect, the need for rigorous controls and blinding, and the many vagaries of statistical analysis. But at the end of the day, as complex and messy the process of science is, a reliable picture of reality is slowly ground out.
  • The third group, often frustrating to scientific skeptics, are the science-deniers (for lack of a better term). They may take a postmodernist approach to science – science is just one narrative with no special relationship to the truth. Whatever you call it, what the science-deniers in essence do is describe all of the features of science that the skeptics do (sometimes annoyingly pretending that they are pointing these features out to skeptics) but then come to a different conclusion at the end – that science (essentially) does not work.
  • ...13 more annotations...
  • this third group – the science deniers – started out in the naive group, and then were so scandalized by the realization that science is a messy human endeavor that they leapt right to the nihilistic conclusion that science must therefore be bunk.
  • The article by Lehrer falls generally into this third category. He is discussing what has been called “the decline effect” – the fact that effect sizes in scientific studies tend to decrease over time, sometimes to nothing.
  • This term was first applied to the parapsychological literature, and was in fact proposed as a real phenomenon of ESP – that ESP effects literally decline over time. Skeptics have criticized this view as magical thinking and hopelessly naive – Occam’s razor favors the conclusion that it is the flawed measurement of ESP, not ESP itself, that is declining over time.
  • Lehrer, however, applies this idea to all of science, not just parapsychology. He writes: And this is why the decline effect is so troubling. Not because it reveals the human fallibility of science, in which data are tweaked and beliefs shape perceptions. (Such shortcomings aren’t surprising, at least for scientists.) And not because it reveals that many of our most exciting theories are fleeting fads and will soon be rejected. (That idea has been around since Thomas Kuhn.) The decline effect is troubling because it reminds us how difficult it is to prove anything. We like to pretend that our experiments define the truth for us. But that’s often not the case. Just because an idea is true doesn’t mean it can be proved. And just because an idea can be proved doesn’t mean it’s true. When the experiments are done, we still have to choose what to believe.
  • Lehrer is ultimately referring to aspects of science that skeptics have been pointing out for years (as a way of discerning science from pseudoscience), but Lehrer takes it to the nihilistic conclusion that it is difficult to prove anything, and that ultimately “we still have to choose what to believe.” Bollocks!
  • Lehrer is describing the cutting edge or the fringe of science, and then acting as if it applies all the way down to the core. I think the problem is that there is so much scientific knowledge that we take for granted – so much so that we forget it is knowledge that derived from the scientific method, and at one point was not known.
  • It is telling that Lehrer uses as his primary examples of the decline effect studies from medicine, psychology, and ecology – areas where the signal to noise ratio is lowest in the sciences, because of the highly variable and complex human element. We don’t see as much of a decline effect in physics, for example, where phenomena are more objective and concrete.
  • If the truth itself does not “wear off”, as the headline of Lehrer’s article provocatively states, then what is responsible for this decline effect?
  • it is no surprise that effect sizes in preliminary studies tend to be positive. This can be explained on the basis of experimenter bias – scientists want to find positive results, and initial experiments are often flawed or less than rigorous. It takes time to figure out how to rigorously study a question, and so early studies will tend not to control for all the necessary variables. There is further publication bias, in which positive studies tend to be published more than negative studies.
  • Further, some preliminary research may be based upon chance observations – a false pattern based upon a quirky cluster of events. If these initial observations are used in the preliminary studies, then the statistical fluke will be carried forward. Later studies are then likely to exhibit a regression to the mean, or a return to more statistically likely results (which is exactly why you shouldn’t use initial data when replicating a result, but should use entirely fresh data – a mistake for which astrologers are infamous).
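The regression-to-the-mean mechanism described above is easy to demonstrate numerically. Below is a minimal sketch, not anything from the post itself: the true effect size (0.2), the noise level, and the "exciting" cutoff are all hypothetical numbers chosen for illustration. Many noisy studies are run, only the fluke-large results are selected for follow-up, and each is replicated with entirely fresh data.

```python
# A toy simulation of regression to the mean (illustrative numbers only).
import random

random.seed(42)

TRUE_EFFECT = 0.2    # the real underlying effect size (hypothetical)
NOISE = 0.5          # standard deviation of measurement noise
N_STUDIES = 10_000

def run_study():
    """One noisy measurement of the true effect."""
    return TRUE_EFFECT + random.gauss(0, NOISE)

# Initial studies: keep only the "exciting" ones (large observed effects).
initial = [run_study() for _ in range(N_STUDIES)]
exciting = [e for e in initial if e > 1.0]   # statistical flukes

# Replications with entirely fresh data: each fluke is re-run once.
replications = [run_study() for _ in exciting]

mean_exciting = sum(exciting) / len(exciting)
mean_replication = sum(replications) / len(replications)

print(f"mean of selected initial effects: {mean_exciting:.2f}")
print(f"mean of their replications:       {mean_replication:.2f}")
# The replications cluster back near the true effect, so the measured
# effect appears to "decline" even though nothing real has changed.
```

The selected initial effects average well above 1.0, while their replications fall back near 0.2: exactly the "decline" that vanishes once fresh data are used, which is the mistake for which the post says astrologers are infamous.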
  • skeptics are frequently cautioning against new or preliminary scientific research. Don’t get excited by every new study touted in the lay press, or even by a university’s press release. Most new findings turn out to be wrong. In science, replication is king. Consensus and reliable conclusions are built upon multiple independent lines of evidence, replicated over time, all converging on one conclusion.
  • Lehrer does make some good points in his article, but they are points that skeptics are fond of making. In order to have a mature and functional appreciation for the process and findings of science, it is necessary to understand how science works in the real world, as practiced by flawed scientists and scientific institutions. This is the skeptical message.
  • But at the same time reliable findings in science are possible, and happen frequently – when results can be replicated and when they fit into the expanding intricate weave of the picture of the natural world being generated by scientific investigation.
Weiye Loh

Science Warriors' Ego Trips - The Chronicle Review - The Chronicle of Higher Education

  • By Carlin Romano Standing up for science excites some intellectuals the way beautiful actresses arouse Warren Beatty, or career liberals boil the blood of Glenn Beck and Rush Limbaugh. It's visceral.
  • A brave champion of beleaguered science in the modern age of pseudoscience, this Ayn Rand protagonist sarcastically derides the benighted irrationalists and glows with a self-anointed superiority. Who wouldn't want to feel that sense of power and rightness?
  • You hear the voice regularly—along with far more sensible stuff—in the latest of a now common genre of science patriotism, Nonsense on Stilts: How to Tell Science From Bunk (University of Chicago Press), by Massimo Pigliucci, a philosophy professor at the City University of New York.
  • ...24 more annotations...
  • it mixes eminent common sense and frequent good reporting with a cocksure hubris utterly inappropriate to the practice it apotheosizes.
  • According to Pigliucci, both Freudian psychoanalysis and Marxist theory of history "are too broad, too flexible with regard to observations, to actually tell us anything interesting." (That's right—not one "interesting" thing.) The idea of intelligent design in biology "has made no progress since its last serious articulation by natural theologian William Paley in 1802," and the empirical evidence for evolution is like that for "an open-and-shut murder case."
  • Pigliucci offers more hero sandwiches spiced with derision and certainty. Media coverage of science is "characterized by allegedly serious journalists who behave like comedians." Commenting on the highly publicized Dover, Pa., court case in which U.S. District Judge John E. Jones III ruled that intelligent-design theory is not science, Pigliucci labels the need for that judgment a "bizarre" consequence of the local school board's "inane" resolution. Noting the complaint of intelligent-design advocate William Buckingham that an approved science textbook didn't give creationism a fair shake, Pigliucci writes, "This is like complaining that a textbook in astronomy is too focused on the Copernican theory of the structure of the solar system and unfairly neglects the possibility that the Flying Spaghetti Monster is really pulling each planet's strings, unseen by the deluded scientists."
  • Or is it possible that the alternate view unfairly neglected could be more like that of Harvard scientist Owen Gingerich, who contends in God's Universe (Harvard University Press, 2006) that it is partly statistical arguments—the extraordinary unlikelihood eons ago of the physical conditions necessary for self-conscious life—that support his belief in a universe "congenially designed for the existence of intelligent, self-reflective life"?
  • Even if we agree that capital "I" and "D" intelligent-design of the scriptural sort—what Gingerich himself calls "primitive scriptural literalism"—is not scientifically credible, does that make Gingerich's assertion, "I believe in intelligent design, lowercase i and lowercase d," equivalent to Flying-Spaghetti-Monsterism? Tone matters. And sarcasm is not science.
  • The problem with polemicists like Pigliucci is that a chasm has opened up between two groups that might loosely be distinguished as "philosophers of science" and "science warriors."
  • Philosophers of science, often operating under the aegis of Thomas Kuhn, recognize that science is a diverse, social enterprise that has changed over time, developed different methodologies in different subsciences, and often advanced by taking putative pseudoscience seriously, as in debunking cold fusion
  • The science warriors, by contrast, often write as if our science of the moment is isomorphic with knowledge of an objective world-in-itself—Kant be damned!—and any form of inquiry that doesn't fit the writer's criteria of proper science must be banished as "bunk." Pigliucci, typically, hasn't much sympathy for radical philosophies of science. He calls the work of Paul Feyerabend "lunacy," deems Bruno Latour "a fool," and observes that "the great pronouncements of feminist science have fallen as flat as the similarly empty utterances of supporters of intelligent design."
  • It doesn't have to be this way. The noble enterprise of submitting nonscientific knowledge claims to critical scrutiny—an activity continuous with both philosophy and science—took off in an admirable way in the late 20th century when Paul Kurtz, of the University at Buffalo, established the Committee for the Scientific Investigation of Claims of the Paranormal (Csicop) in May 1976. Csicop soon after launched the marvelous journal Skeptical Inquirer
  • Although Pigliucci himself publishes in Skeptical Inquirer, his contributions there exhibit his signature smugness. For an antidote to Pigliucci's overweening scientism 'tude, it's refreshing to consult Kurtz's curtain-raising essay, "Science and the Public," in Science Under Siege (Prometheus Books, 2009, edited by Frazier)
  • Kurtz's commandment might be stated, "Don't mock or ridicule—investigate and explain." He writes: "We attempted to make it clear that we were interested in fair and impartial inquiry, that we were not dogmatic or closed-minded, and that skepticism did not imply a priori rejection of any reasonable claim. Indeed, I insisted that our skepticism was not totalistic or nihilistic about paranormal claims."
  • Kurtz combines the ethos of both critical investigator and philosopher of science. Describing modern science as a practice in which "hypotheses and theories are based upon rigorous methods of empirical investigation, experimental confirmation, and replication," he notes: "One must be prepared to overthrow an entire theoretical framework—and this has happened often in the history of science ... skeptical doubt is an integral part of the method of science, and scientists should be prepared to question received scientific doctrines and reject them in the light of new evidence."
  • Pigliucci, alas, allows his animus against the nonscientific to pull him away from sensitive distinctions among various sciences to sloppy arguments one didn't see in such earlier works of science patriotism as Carl Sagan's The Demon-Haunted World: Science as a Candle in the Dark (Random House, 1995). Indeed, he probably sets a world record for misuse of the word "fallacy."
  • To his credit, Pigliucci at times acknowledges the nondogmatic spine of science. He concedes that "science is characterized by a fuzzy borderline with other types of inquiry that may or may not one day become sciences." Science, he admits, "actually refers to a rather heterogeneous family of activities, not to a single and universal method." He rightly warns that some pseudoscience—for example, denial of HIV-AIDS causation—is dangerous and terrible.
  • But at other points, Pigliucci ferociously attacks opponents like the most unreflective science fanatic
  • He dismisses Feyerabend's view that "science is a religion" as simply "preposterous," even though he elsewhere admits that "methodological naturalism"—the commitment of all scientists to reject "supernatural" explanations—is itself not an empirically verifiable principle or fact, but rather an almost Kantian precondition of scientific knowledge. An article of faith, some cold-eyed Feyerabend fans might say.
  • He writes, "ID is not a scientific theory at all because there is no empirical observation that can possibly contradict it. Anything we observe in nature could, in principle, be attributed to an unspecified intelligent designer who works in mysterious ways." But earlier in the book, he correctly argues against Karl Popper that susceptibility to falsification cannot be the sole criterion of science, because science also confirms. It is, in principle, possible that an empirical observation could confirm intelligent design—i.e., that magic moment when the ultimate UFO lands with representatives of the intergalactic society that planted early life here, and we accept their evidence that they did it.
  • "As long as we do not venture to make hypotheses about who the designer is and why and how she operates," he writes, "there are no empirical constraints on the 'theory' at all. Anything goes, and therefore nothing holds, because a theory that 'explains' everything really explains nothing."
  • Here, Pigliucci again mixes up what's likely or provable with what's logically possible or rational. The creation stories of traditional religions and scriptures do, in effect, offer hypotheses, or claims, about who the designer is—e.g., see the Bible.
  • Far from explaining nothing because it explains everything, such an explanation explains a lot by explaining everything. It just doesn't explain it convincingly to a scientist with other evidentiary standards.
  • A sensible person can side with scientists on what's true, but not with Pigliucci on what's rational and possible. Pigliucci occasionally recognizes that. Late in his book, he concedes that "nonscientific claims may be true and still not qualify as science." But if that's so, and we care about truth, why exalt science to the degree he does? If there's really a heaven, and science can't (yet?) detect it, so much the worse for science.
  • Pigliucci quotes a line from Aristotle: "It is the mark of an educated mind to be able to entertain a thought without accepting it." Science warriors such as Pigliucci, or Michael Ruse in his recent clash with other philosophers in these pages, should reflect on a related modern sense of "entertain." One does not entertain a guest by mocking, deriding, and abusing the guest. Similarly, one does not entertain a thought or approach to knowledge by ridiculing it.
  • Long live Skeptical Inquirer! But can we deep-six the egomania and unearned arrogance of the science patriots? As Descartes, that immortal hero of scientists and skeptics everywhere, pointed out, true skepticism, like true charity, begins at home.
  • Carlin Romano, critic at large for The Chronicle Review, teaches philosophy and media theory at the University of Pennsylvania.
  • April 25, 2010
Weiye Loh

The Decline Effect and the Scientific Method : The New Yorker

  • On September 18, 2007, a few dozen neuroscientists, psychiatrists, and drug-company executives gathered in a hotel conference room in Brussels to hear some startling news. It had to do with a class of drugs known as atypical or second-generation antipsychotics, which came on the market in the early nineties.
  • the therapeutic power of the drugs appeared to be steadily waning. A recent study showed an effect that was less than half of that documented in the first trials, in the early nineteen-nineties. Many researchers began to argue that the expensive pharmaceuticals weren’t any better than first-generation antipsychotics, which have been in use since the fifties. “In fact, sometimes they now look even worse,” John Davis, a professor of psychiatry at the University of Illinois at Chicago, told me.
  • Before the effectiveness of a drug can be confirmed, it must be tested and tested again. Different scientists in different labs need to repeat the protocols and publish their results. The test of replicability, as it’s known, is the foundation of modern research. Replicability is how the community enforces itself. It’s a safeguard for the creep of subjectivity. Most of the time, scientists know what results they want, and that can influence the results they get. The premise of replicability is that the scientific community can correct for these flaws.
  • ...30 more annotations...
  • But now all sorts of well-established, multiply confirmed findings have started to look increasingly uncertain. It’s as if our facts were losing their truth: claims that have been enshrined in textbooks are suddenly unprovable. This phenomenon doesn’t yet have an official name, but it’s occurring across a wide range of fields, from psychology to ecology. In the field of medicine, the phenomenon seems extremely widespread, affecting not only antipsychotics but also therapies ranging from cardiac stents to Vitamin E and antidepressants: Davis has a forthcoming analysis demonstrating that the efficacy of antidepressants has gone down as much as threefold in recent decades.
  • In private, Schooler began referring to the problem as “cosmic habituation,” by analogy to the decrease in response that occurs when individuals habituate to particular stimuli. “Habituation is why you don’t notice the stuff that’s always there,” Schooler says. “It’s an inevitable process of adjustment, a ratcheting down of excitement. I started joking that it was like the cosmos was habituating to my ideas. I took it very personally.”
  • At first, he assumed that he’d made an error in experimental design or a statistical miscalculation. But he couldn’t find anything wrong with his research. He then concluded that his initial batch of research subjects must have been unusually susceptible to verbal overshadowing. (John Davis, similarly, has speculated that part of the drop-off in the effectiveness of antipsychotics can be attributed to using subjects who suffer from milder forms of psychosis which are less likely to show dramatic improvement.) “It wasn’t a very satisfying explanation,” Schooler says. “One of my mentors told me that my real mistake was trying to replicate my work. He told me doing that was just setting myself up for disappointment.”
  • the effect is especially troubling because of what it exposes about the scientific process. If replication is what separates the rigor of science from the squishiness of pseudoscience, where do we put all these rigorously validated findings that can no longer be proved? Which results should we believe? Francis Bacon, the early-modern philosopher and pioneer of the scientific method, once declared that experiments were essential, because they allowed us to “put nature to the question.” But it appears that nature often gives us different answers.
  • The most likely explanation for the decline is an obvious one: regression to the mean. As the experiment is repeated, that is, an early statistical fluke gets cancelled out. The extrasensory powers of Schooler’s subjects didn’t decline—they were simply an illusion that vanished over time. And yet Schooler has noticed that many of the data sets that end up declining seem statistically solid—that is, they contain enough data that any regression to the mean shouldn’t be dramatic. “These are the results that pass all the tests,” he says. “The odds of them being random are typically quite remote, like one in a million. This means that the decline effect should almost never happen. But it happens all the time!”
  • this is why Schooler believes that the decline effect deserves more attention: its ubiquity seems to violate the laws of statistics. “Whenever I start talking about this, scientists get very nervous,” he says. “But I still want to know what happened to my results. Like most scientists, I assumed that it would get easier to document my effect over time. I’d get better at doing the experiments, at zeroing in on the conditions that produce verbal overshadowing. So why did the opposite happen? I’m convinced that we can use the tools of science to figure this out. First, though, we have to admit that we’ve got a problem.”
  • In 2001, Michael Jennions, a biologist at the Australian National University, set out to analyze “temporal trends” across a wide range of subjects in ecology and evolutionary biology. He looked at hundreds of papers and forty-four meta-analyses (that is, statistical syntheses of related studies), and discovered a consistent decline effect over time, as many of the theories seemed to fade into irrelevance. In fact, even when numerous variables were controlled for—Jennions knew, for instance, that the same author might publish several critical papers, which could distort his analysis—there was still a significant decrease in the validity of the hypothesis, often within a year of publication. Jennions admits that his findings are troubling, but expresses a reluctance to talk about them publicly. “This is a very sensitive issue for scientists,” he says. “You know, we’re supposed to be dealing with hard facts, the stuff that’s supposed to stand the test of time. But when you see these trends you become a little more skeptical of things.”
  • The worst part was that when I submitted these null results I had difficulty getting them published. The journals only wanted confirming data. It was too exciting an idea to disprove, at least back then.
  • The steep rise and slow fall of fluctuating asymmetry is a clear example of a scientific paradigm, one of those intellectual fads that both guide and constrain research: after a new paradigm is proposed, the peer-review process is tilted toward positive results. But then, after a few years, the academic incentives shift—the paradigm has become entrenched—so that the most notable results are now those that disprove the theory.
  • Jennions, similarly, argues that the decline effect is largely a product of publication bias, or the tendency of scientists and scientific journals to prefer positive data over null results, which is what happens when no effect is found. The bias was first identified by the statistician Theodore Sterling, in 1959, after he noticed that ninety-seven per cent of all published psychological studies with statistically significant data found the effect they were looking for. A “significant” result is defined as any data point that would be produced by chance less than five per cent of the time. This ubiquitous test was invented in 1922 by the English mathematician Ronald Fisher, who picked five per cent as the boundary line, somewhat arbitrarily, because it made pencil and slide-rule calculations easier. Sterling saw that if ninety-seven per cent of psychology studies were proving their hypotheses, either psychologists were extraordinarily lucky or they published only the outcomes of successful experiments. In recent years, publication bias has mostly been seen as a problem for clinical trials, since pharmaceutical companies are less interested in publishing results that aren’t favorable. But it’s becoming increasingly clear that publication bias also produces major distortions in fields without large corporate incentives, such as psychology and ecology.
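Sterling’s point — that publishing only “significant” results guarantees a literature full of effects even when nothing real is there — can be demonstrated with a small simulation. The sample size and number of experiments below are arbitrary choices for illustration:

```python
import random

random.seed(0)

def experiment(n=30, true_effect=0.0):
    """Run one null experiment and apply Fisher's five-per-cent criterion."""
    xs = [random.gauss(true_effect, 1) for _ in range(n)]
    mean = sum(xs) / n
    se = 1 / n ** 0.5                  # known unit variance, so a z-test suffices
    significant = abs(mean / se) > 1.96
    return mean, significant

results = [experiment() for _ in range(2000)]   # no real effect anywhere
published = [m for m, sig in results if sig]    # journals keep only the "hits"

# Roughly five per cent of null experiments clear the bar by chance alone...
print(len(published) / len(results))
# ...and every one of them reports a spuriously large effect.
print(min(abs(m) for m in published))
```

Nothing in the simulated world has any effect at all, yet the “published” subset consists entirely of sizable, statistically significant findings — exactly the distortion Sterling inferred from the ninety-seven-per-cent figure.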
  • While publication bias almost certainly plays a role in the decline effect, it remains an incomplete explanation. For one thing, it fails to account for the initial prevalence of positive results among studies that never even get submitted to journals. It also fails to explain the experience of people like Schooler, who have been unable to replicate their initial data despite their best efforts.
  • An equally significant issue is the selective reporting of results—the data that scientists choose to document in the first place. Palmer’s most convincing evidence relies on a statistical tool known as a funnel graph. When a large number of studies have been done on a single subject, the data should follow a pattern: studies with a large sample size should all cluster around a common value—the true result—whereas those with a smaller sample size should exhibit a random scattering, since they’re subject to greater sampling error. This pattern gives the graph its name, since the distribution resembles a funnel.
  • The funnel graph visually captures the distortions of selective reporting. For instance, after Palmer plotted every study of fluctuating asymmetry, he noticed that the distribution of results with smaller sample sizes wasn’t random at all but instead skewed heavily toward positive results.
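The funnel shape, and the way selective reporting skews it, can be reproduced with simulated studies. A true effect size of 0.1 and the two sample sizes below are assumptions made for the sketch, not values from Palmer’s data:

```python
import random
import statistics

random.seed(2)

TRUE_EFFECT = 0.1

def study(n):
    """Return the mean effect estimate from a study with n observations."""
    return sum(random.gauss(TRUE_EFFECT, 1) for _ in range(n)) / n

small = [study(10) for _ in range(500)]    # wide mouth of the funnel
large = [study(500) for _ in range(500)]   # narrow spout near the true value

print(round(statistics.stdev(small), 3))
print(round(statistics.stdev(large), 3))

# Selective reporting: suppose small studies only see print when positive.
reported_small = [e for e in small if e > 0]
print(round(statistics.mean(reported_small), 3))  # skewed above TRUE_EFFECT
```

An unbiased funnel is symmetric around the true value at every sample size; once the negative small-study results are dropped, the mouth of the funnel shifts upward, which is the asymmetry Palmer observed in the fluctuating-asymmetry literature.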
  • Palmer has since documented a similar problem in several other contested subject areas. “Once I realized that selective reporting is everywhere in science, I got quite depressed,” Palmer told me. “As a researcher, you’re always aware that there might be some nonrandom patterns, but I had no idea how widespread it is.” In a recent review article, Palmer summarized the impact of selective reporting on his field: “We cannot escape the troubling conclusion that some—perhaps many—cherished generalities are at best exaggerated in their biological significance and at worst a collective illusion nurtured by strong a-priori beliefs often repeated.”
  • Palmer emphasizes that selective reporting is not the same as scientific fraud. Rather, the problem seems to be one of subtle omissions and unconscious misperceptions, as researchers struggle to make sense of their results. Stephen Jay Gould referred to this as the “shoehorning” process. “A lot of scientific measurement is really hard,” Simmons told me. “If you’re talking about fluctuating asymmetry, then it’s a matter of minuscule differences between the right and left sides of an animal. It’s millimetres of a tail feather. And so maybe a researcher knows that he’s measuring a good male”—an animal that has successfully mated—“and he knows that it’s supposed to be symmetrical. Well, that act of measurement is going to be vulnerable to all sorts of perception biases. That’s not a cynical statement. That’s just the way human beings work.”
  • One of the classic examples of selective reporting concerns the testing of acupuncture in different countries. While acupuncture is widely accepted as a medical treatment in various Asian countries, its use is much more contested in the West. These cultural differences have profoundly influenced the results of clinical trials. Between 1966 and 1995, there were forty-seven studies of acupuncture in China, Taiwan, and Japan, and every single trial concluded that acupuncture was an effective treatment. During the same period, there were ninety-four clinical trials of acupuncture in the United States, Sweden, and the U.K., and only fifty-six per cent of these studies found any therapeutic benefits. As Palmer notes, this wide discrepancy suggests that scientists find ways to confirm their preferred hypothesis, disregarding what they don’t want to see. Our beliefs are a form of blindness.
  • John Ioannidis, an epidemiologist at Stanford University, argues that such distortions are a serious issue in biomedical research. “These exaggerations are why the decline has become so common,” he says. “It’d be really great if the initial studies gave us an accurate summary of things. But they don’t. And so what happens is we waste a lot of money treating millions of patients and doing lots of follow-up studies on other themes based on results that are misleading.”
  • In 2005, Ioannidis published an article in the Journal of the American Medical Association that looked at the forty-nine most cited clinical-research studies in three major medical journals. Forty-five of these studies reported positive results, suggesting that the intervention being tested was effective. Because most of these studies were randomized controlled trials—the “gold standard” of medical evidence—they tended to have a significant impact on clinical practice, and led to the spread of treatments such as hormone replacement therapy for menopausal women and daily low-dose aspirin to prevent heart attacks and strokes. Nevertheless, the data Ioannidis found were disturbing: of the thirty-four claims that had been subject to replication, forty-one per cent had either been directly contradicted or had their effect sizes significantly downgraded.
  • The situation is even worse when a subject is fashionable. In recent years, for instance, there have been hundreds of studies on the various genes that control the differences in disease risk between men and women. These findings have included everything from the mutations responsible for the increased risk of schizophrenia to the genes underlying hypertension. Ioannidis and his colleagues looked at four hundred and thirty-two of these claims. They quickly discovered that the vast majority had serious flaws. But the most troubling fact emerged when he looked at the test of replication: out of four hundred and thirty-two claims, only a single one was consistently replicable. “This doesn’t mean that none of these claims will turn out to be true,” he says. “But, given that most of them were done badly, I wouldn’t hold my breath.”
  • The main problem is that too many researchers engage in what he calls “significance chasing,” or finding ways to interpret the data so that they pass the statistical test of significance—the ninety-five-per-cent boundary invented by Ronald Fisher. “The scientists are so eager to pass this magical test that they start playing around with the numbers, trying to find anything that seems worthy,” Ioannidis says. In recent years, Ioannidis has become increasingly blunt about the pervasiveness of the problem. One of his most cited papers has a deliberately provocative title: “Why Most Published Research Findings Are False.”
  • The problem of selective reporting is rooted in a fundamental cognitive flaw, which is that we like proving ourselves right and hate being wrong. “It feels good to validate a hypothesis,” Ioannidis said. “It feels even better when you’ve got a financial interest in the idea or your career depends upon it. And that’s why, even after a claim has been systematically disproven”—he cites, for instance, the early work on hormone replacement therapy, or claims involving various vitamins—“you still see some stubborn researchers citing the first few studies that show a strong effect. They really want to believe that it’s true.”
  • Scientists need to become more rigorous about data collection before they publish. “We’re wasting too much time chasing after bad studies and underpowered experiments,” he says. The current “obsession” with replicability distracts from the real problem, which is faulty design. He notes that nobody even tries to replicate most science papers—there are simply too many. (According to Nature, a third of all studies never even get cited, let alone repeated.)
  • Schooler recommends the establishment of an open-source database, in which researchers are required to outline their planned investigations and document all their results. “I think this would provide a huge increase in access to scientific work and give us a much better way to judge the quality of an experiment,” Schooler says. “It would help us finally deal with all these issues that the decline effect is exposing.”
  • Although such reforms would mitigate the dangers of publication bias and selective reporting, they still wouldn’t erase the decline effect. This is largely because scientific research will always be shadowed by a force that can’t be curbed, only contained: sheer randomness. Although little research has been done on the experimental dangers of chance and happenstance, the research that exists isn’t encouraging.
  • John Crabbe, a neuroscientist at the Oregon Health and Science University, conducted an experiment that showed how unknowable chance events can skew tests of replicability. He performed a series of experiments on mouse behavior in three different science labs: in Albany, New York; Edmonton, Alberta; and Portland, Oregon. Before he conducted the experiments, he tried to standardize every variable he could think of. The same strains of mice were used in each lab, shipped on the same day from the same supplier. The animals were raised in the same kind of enclosure, with the same brand of sawdust bedding. They had been exposed to the same amount of incandescent light, were living with the same number of littermates, and were fed the exact same type of chow pellets. When the mice were handled, it was with the same kind of surgical glove, and when they were tested it was on the same equipment, at the same time in the morning.
  • The premise of this test of replicability, of course, is that each of the labs should have generated the same pattern of results. “If any set of experiments should have passed the test, it should have been ours,” Crabbe says. “But that’s not the way it turned out.” In one experiment, Crabbe injected a particular strain of mouse with cocaine. In Portland the mice given the drug moved, on average, six hundred centimetres more than they normally did; in Albany they moved seven hundred and one additional centimetres. But in the Edmonton lab they moved more than five thousand additional centimetres. Similar deviations were observed in a test of anxiety. Furthermore, these inconsistencies didn’t follow any detectable pattern. In Portland one strain of mouse proved most anxious, while in Albany another strain won that distinction.
  • The disturbing implication of the Crabbe study is that a lot of extraordinary scientific data are nothing but noise. The hyperactivity of those coked-up Edmonton mice wasn’t an interesting new fact—it was a meaningless outlier, a by-product of invisible variables we don’t understand. The problem, of course, is that such dramatic findings are also the most likely to get published in prestigious journals, since the data are both statistically significant and entirely unexpected. Grants get written, follow-up studies are conducted. The end result is a scientific accident that can take years to unravel.
  • This suggests that the decline effect is actually a decline of illusion.
  • While Karl Popper imagined falsification occurring with a single, definitive experiment—Galileo refuted Aristotelian mechanics in an afternoon—the process turns out to be much messier than that. Many scientific theories continue to be considered true even after failing numerous experimental tests. Verbal overshadowing might exhibit the decline effect, but it remains extensively relied upon within the field. The same holds for any number of phenomena, from the disappearing benefits of second-generation antipsychotics to the weak coupling ratio exhibited by decaying neutrons, which appears to have fallen by more than ten standard deviations between 1969 and 2001. Even the law of gravity hasn’t always been perfect at predicting real-world phenomena. (In one test, physicists measuring gravity by means of deep boreholes in the Nevada desert found a two-and-a-half-per-cent discrepancy between the theoretical predictions and the actual data.) Despite these findings, second-generation antipsychotics are still widely prescribed, and our model of the neutron hasn’t changed. The law of gravity remains the same.
  • Such anomalies demonstrate the slipperiness of empiricism. Although many scientific ideas generate conflicting results and suffer from falling effect sizes, they continue to get cited in the textbooks and drive standard medical practice. Why? Because these ideas seem true. Because they make sense. Because we can’t bear to let them go. And this is why the decline effect is so troubling. Not because it reveals the human fallibility of science, in which data are tweaked and beliefs shape perceptions. (Such shortcomings aren’t surprising, at least for scientists.) And not because it reveals that many of our most exciting theories are fleeting fads and will soon be rejected. (That idea has been around since Thomas Kuhn.) The decline effect is troubling because it reminds us how difficult it is to prove anything. We like to pretend that our experiments define the truth for us. But that’s often not the case. Just because an idea is true doesn’t mean it can be proved. And just because an idea can be proved doesn’t mean it’s true. When the experiments are done, we still have to choose what to believe.