New Media Ethics 2009 course: Group items matching "Data" in title, tags, annotations or URL

Weiye Loh

Open-Access Economics by Barry Eichengreen - Project Syndicate - 0 views

  • in a discipline that regards ingenuity as the ultimate virtue, those who engage in the grunt work of data cleaning and replication receive few rewards. Nobel prizes are not awarded for constructing new historical estimates of GDP that allow policy analysis to be extended back in time.
  • How could a flawed study have appeared first in the prestigious working-paper series of the National Bureau of Economic Research (NBER) and then in a journal of the American Economic Association? And, if this was possible, why should policymakers and a discerning public vest any credibility in economic research? It was possible because economists are not obliged to make their data and programs publicly available when publishing scientific research. It is said that NBER working papers are even more prestigious than publication in refereed journals. Yet the Bureau does not require scholars to post their data and programs to its Web site as a condition for working-paper publication.
  • Statistics are helpful. But in economics, as in other lines of social inquiry, they are no substitute for proper historical analysis.
  •  
    "Big data promises big progress. But large data sets also make replication impossible without the author's cooperation. And the incentive for authors to cooperate is, at best, mixed. It is therefore the responsibility of editorial boards and the directors of organizations like the NBER to make open access obligatory."
Weiye Loh

Let's make science metrics more scientific : Article : Nature - 0 views

  • Measuring and assessing academic performance is now a fact of scientific life.
  • Yet current systems of measurement are inadequate. Widely used metrics, from the newly fashionable Hirsch index to the 50-year-old citation index, are of limited use.
  • Existing metrics do not capture the full range of activities that support and transmit scientific ideas, which can be as varied as mentoring, blogging or creating industrial prototypes.
  • ...15 more annotations...
  • narrow or biased measures of scientific achievement can lead to narrow and biased science.
  • Global demand for, and interest in, metrics should galvanize stakeholders — national funding agencies, scientific research organizations and publishing houses — to combine forces. They can set an agenda and foster research that establishes sound scientific metrics: grounded in theory, built with high-quality data and developed by a community with strong incentives to use them.
  • Scientists are often reticent to see themselves or their institutions labelled, categorized or ranked. Although happy to tag specimens as one species or another, many researchers do not like to see themselves as specimens under a microscope — they feel that their work is too complex to be evaluated in such simplistic terms. Some argue that science is unpredictable, and that any metric used to prioritize research money risks missing out on an important discovery from left field.
    • Weiye Loh
       
      It is ironic that while scientists feel that their work is too complex to be evaluated in simplistic terms or metrics, they nevertheless feel OK evaluating the world in simplistic terms.
  • It is true that good metrics are difficult to develop, but this is not a reason to abandon them. Rather it should be a spur to basing their development in sound science. If we do not press harder for better metrics, we risk making poor funding decisions or sidelining good scientists.
  • Metrics are data driven, so developing a reliable, joined-up infrastructure is a necessary first step.
  • We need a concerted international effort to combine, augment and institutionalize these databases within a cohesive infrastructure.
  • On an international level, the issue of a unique researcher identification system is one that needs urgent attention. There are various efforts under way in the open-source and publishing communities to create unique researcher identifiers using the same principles as the Digital Object Identifier (DOI) protocol, which has become the international standard for identifying unique documents. The ORCID (Open Researcher and Contributor ID) project, for example, was launched in December 2009 by parties including Thomson Reuters and Nature Publishing Group. The engagement of international funding agencies would help to push this movement towards an international standard.
  • if all funding agencies used a universal template for reporting scientific achievements, it could improve data quality and reduce the burden on investigators.
    • Weiye Loh
       
      So in future, we'll only have one robust metric to evaluate scientific contribution? Hmm...
  • Importantly, data collected for use in metrics must be open to the scientific community, so that metric calculations can be reproduced. This also allows the data to be efficiently repurposed.
  • As well as building an open and consistent data infrastructure, there is the added challenge of deciding what data to collect and how to use them. This is not trivial. Knowledge creation is a complex process, so perhaps alternative measures of creativity and productivity should be included in scientific metrics, such as the filing of patents, the creation of prototypes, and even the production of YouTube videos.
  • Perhaps publications in these different media should be weighted differently in different fields.
  • There needs to be a greater focus on what these data mean, and how they can be best interpreted.
  • This requires the input of social scientists, rather than just those more traditionally involved in data capture, such as computer scientists.
  • An international data platform supported by funding agencies could include a virtual 'collaboratory', in which ideas and potential solutions can be posited and discussed. This would bring social scientists together with working natural scientists to develop metrics and test their validity through wikis, blogs and discussion groups, thus building a community of practice. Such a discussion should be open to all ideas and theories and not restricted to traditional bibliometric approaches.
  • Far-sighted action can ensure that metrics goes beyond identifying 'star' researchers, nations or ideas, to capturing the essence of what it means to be a good scientist.
  •  
    Let's make science metrics more scientific. To capture the essence of good science, stakeholders must combine forces to create an open, sound and consistent system for measuring all the activities that make up academic productivity, says Julia Lane.
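
A concrete piece of the identifier plumbing discussed above: ORCID iDs end in a check character computed with the ISO 7064 mod 11-2 algorithm, which lets software catch mistyped identifiers. A minimal sketch, assuming the standard published algorithm:

    def orcid_check_digit(base_digits):
        # ISO 7064 mod 11-2 check character over the first 15 digits
        # of an ORCID iD; returns "0"-"9" or "X".
        total = 0
        for d in base_digits:
            total = (total + int(d)) * 2
        result = (12 - total % 11) % 11
        return "X" if result == 10 else str(result)

    print(orcid_check_digit("000000021825009"))  # -> "7"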
Weiye Loh

Congress told that Internet data caps will discourage piracy - 0 views

  • While usage-based billing and data caps are often talked about in terms of their ability to curb congestion, it's rarely suggested that making Internet access more expensive is a positive move for the content industries. But Castro has a whole host of such suggestions, drawn largely verbatim from his 2009 report (PDF) on the subject.
  • Should the US government actually fund antipiracy research? Sure. Should the US government “enlist” Internet providers to block entire websites? Sure. Should copyright holders suggest to the government which sites should go on the blocklist? Sure. Should ad networks and payment processors be forced to cut ties to such sites, even if those sites are legal in the countries where they operate? Sure.
  • Castro's original 2009 paper goes further, suggesting that deep packet inspection (DPI) be routinely deployed by ISPs in order to scan subscriber traffic for potential copyright infringements. Sound like wiretapping? Yes, though Castro has a solution if courts do crack down on the practice: "the law should be changed." After all, "piracy mitigation with DPI deals with a set of issues virtually identical to the largely noncontroversial question of virus detection and mitigation."
  • ...1 more annotation...
  • If you think that some of these approaches to antipiracy enforcement have problems, Castro knows why; he told Congress yesterday that critics of such ideas "assume that piracy is the bedrock of the Internet economy" and don't want to disrupt it, a statement patently absurd on its face.
  •  
    Internet data caps aren't just good at stopping congestion; they can also be useful tools for curtailing piracy. That was one of the points made by Daniel Castro, an analyst at the Information Technology and Innovation Foundation (ITIF) think tank in Washington DC. Castro testified (PDF) yesterday before the House Judiciary Committee about the problem of "parasite" websites, saying that usage-based billing and monthly data caps were both good ways to discourage piracy, and that the government shouldn't do anything to stand in their way. The government should allow "pricing structures and usage caps that discourage online piracy," he wrote, which comes pretty close to suggesting that heavy data use implies piracy and should be limited.
Weiye Loh

Designers Make Data Much Easier to Digest - NYTimes.com - 0 views

  • On the benefit side, people become more engaged when they can filter information that is presented visually and make discoveries on their own. On the risk side, Professor Shneiderman says, tools as powerful as visualizations have the potential to mislead or confuse consumers. And privacy implications arise, he says, as increasing amounts of personal, housing, medical and financial data become widely accessible, searchable and viewable.
  • In the 1990s, Professor Shneiderman developed tree mapping, which uses interlocking rectangles to represent complicated data sets. The rectangles are sized and colored to convey different kinds of information, like revenue or geographic region, says Jim Bartoo, the chief executive of the Hive Group, a software company that uses tree mapping to help companies and government agencies monitor operational data. When executives or plant managers see the nested rectangles grouped together, he adds, they should be able to immediately spot anomalies or trends. In one tree-map visualization of a sales department on the Hive Group site, red tiles represent underperforming sales representatives while green tiles represent people who exceeded their sales quotas. So it’s easy to identify the best sales rep in the company: the biggest green tile. But viewers can also reorganize the display — by region, say, or by sales manager — to see whether patterns exist that explain why some employees are falling behind. “It’s the ability of the human brain to pick out size and color” that makes tree mapping so intuitive, Mr. Bartoo says. Information visualization, he adds, “suddenly starts answering questions that you didn’t know you had.”
  • data visualization is no longer just a useful tool for researchers and corporations. It’s also an entertainment and marketing vehicle.
  • ...2 more annotations...
  • In 2009, for example, Stamen Design, a technology and design studio in San Francisco, created a live visualization of Twitter traffic during the MTV Video Music awards. In the animated graphic, floating bubbles, each displaying a photograph of a celebrity, expanded or contracted depending on the volume of Twitter activity about each star. The project provided a visceral way for viewers to understand which celebrities dominated Twitter talk in real time, says Eric Rodenbeck, the founder and creative director of Stamen Design.
  • Designers once created visual representations of data that would steer viewers to information that seemed the most important or newsworthy, he says; now they create visualizations that contain attractive overview images and then let users direct their own interactive experience — wherever it may take them. “It’s not about leading with a certain view anymore,” he says. “It’s about delivering the view that gets the most participation and engagement.”
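
The tree-mapping idea described above (rectangles sized by one variable, colored by another) can be sketched in a few lines. This is the simple "slice" layout rather than the squarified layout production tools typically use, and the sales figures are invented for illustration:

    def slice_layout(items, x, y, w, h, vertical=True):
        # Partition the rectangle (x, y, w, h) into tiles whose areas are
        # proportional to each item's size; items are (label, size) pairs.
        total = float(sum(size for _, size in items))
        tiles, offset = [], 0.0
        for label, size in items:
            frac = size / total
            if vertical:   # stack tiles top to bottom
                tiles.append((label, (x, y + offset, w, h * frac)))
                offset += h * frac
            else:          # stack tiles left to right
                tiles.append((label, (x + offset, y, w * frac, h)))
                offset += w * frac
        return tiles

    # Tile area encodes sales volume; in a real treemap, color would
    # encode quota performance (red under, green over).
    reps = [("Ann", 120), ("Bo", 45), ("Cy", 80)]
    for label, rect in slice_layout(reps, 0, 0, 100, 100):
        print(label, rect)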
Weiye Loh

Roger Pielke Jr.'s Blog: Continued Deceleration of the Decarbonization of the Global Economy - 0 views

  •  
    data shows that in 2010 the world saw the rate of change in its carbon dioxide emissions per unit of economic activity continue to decrease -- to zero.  (The data that I use are global GDP data from Angus Maddison extended using IMF global GDP growth rates and NEAA carbon dioxide data extended to 2010 using the 2010 growth rate released by the IEA yesterday).  The deceleration of the decarbonization of the global economy means that the world is moving away from stabilization of concentrations of carbon dioxide in the atmosphere, and despite the various reports issued and assertions made, there is no evidence to support claims to the contrary. 
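
The "rate of decarbonization" Pielke tracks is the year-over-year change in carbon intensity, i.e. emissions divided by GDP. A minimal sketch with invented numbers (not his Maddison/IEA series) shows why the rate hits zero when emissions grow exactly as fast as output:

    def decarb_rate(co2_prev, gdp_prev, co2_now, gdp_now):
        # Fractional change in CO2 per unit of GDP; negative values
        # mean the economy is decarbonizing.
        return (co2_now / gdp_now) / (co2_prev / gdp_prev) - 1.0

    # Illustrative only: emissions and GDP both grow 5 percent,
    # so carbon intensity is unchanged.
    print(decarb_rate(30.0, 60.0, 31.5, 63.0))  # -> 0.0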
Weiye Loh

DenialDepot: A word of caution to the BEST project team - 0 views

  • 1) Any errors, however inconsequential, will be taken Very Seriously and accusations of fraud will be made.
  • 2) If you adjust the raw data we will accuse you of fraudulently fiddling the figures whilst cooking the books.
  • 3) If you don't adjust the raw data we will accuse you of fraudulently failing to account for station biases and UHI.
  • 7) By all means publish all your source code, but we will still accuse you of hiding the methodology for your adjustments.
  • ...10 more annotations...
  • 8) If you publish results to your website and errors are found, we will accuse you of a Very Serious Error irregardless of severity (see point #1) and bemoan the press release you made about your results even though you won't remember making any press release about your results.
  • 9) With regard to point #8 above, at extra cost and time to yourself you must employ someone to thoroughly check each monthly update before it is published online, even if this delays publication of the results till the end of the month. You might be surprised at this because no-one actually relies on such freshly published data anyway and aren't the many eyes of blog audit better than a single pair of eyes? Well that's irrelevant. See points #1 and #8.
  • 10) If you don't publish results promptly at the start of the month on the public website, but instead, say, publish the results to a private site for checks to be performed before release, we will accuse you of engaging in unscientific-like secrecy and massaging the data behind closed doors.
  • 14) If any region/station shows a warming trend that doesn't match the raw data, and we can't understand why, we will accuse you of fraud and dismiss the entire record. Don't expect us to have to read anything to understand results.
  • 15) You must provide all input datasets on your website. It's no good referencing NOAAs site and saying they "own" the GHCN data for example. I don't want their GHCN raw temperatures file, I want the one on your hard drive which you used for the analysis, even if you claim they are the same. If you don't do this we will accuse you of hiding the data and preventing us checking your results.
  • 24. In the event that you comply with all of the above, we will point out that a mere hundred-odd years of data is irrelevant next to the 4.5 billion year history of Earth. So why do you even bother?
  • 23) In the unlikely event that I haven't wasted enough of your time forcing you to comply with the above rules, I also demand to see all emails you have sent or will send during the period 1950 to 2050 that contain any of these keywords
  • 22) We don't need any scrutiny because our role isn't important.
  • 17) We will treat your record as if no alternative exists. As if your record is the make or break of Something Really Important (see point #1) and we just can't check the results in any other way.
  • 16) You are to blame for any station data your team uses. If we find out that a station you use is next to an AC Unit, we will conclude you personally planted the thermometer there to deliberately get warming.
  • an article today by Roger Pielke Nr. (no relation) that posited the fascinating concept that thermometers are just as capricious and unreliable proxies for temperature as tree rings. In fact probably more so, and re-computing global temperature by gristlecone pines would reveal the true trend of global cooling, which will be in all our best interests and definitely NOT just those of well paying corporate entities.
  •  
    Dear Professor Muller and Team, If you want your Berkeley Earth Surface Temperature project to succeed and become the center of attention you need to learn from the vast number of mistakes Hansen and Jones have made with their temperature records. To aid this task I created a point by point list for you.
Weiye Loh

Keen On… Michael Fertik: Why Data is the New Oil and Why We, the Consumer, Aren't Benefitting From It | TechCrunch - 0 views

  •  
    In today's Web 3.0 personal data rich economy, reputation is replacing cash, Fertik believes. And he is confident that his company, Reputation.com, is well placed to become the new rating index of this digital ecosystem. But Fertik isn't ecstatic about the way in which new online products, such as facial recognition technology, are exploiting the privacy of online consumers. Arguing that "data is the new oil," Fertik believes that the only people not benefitting from today's social economy are consumers themselves. Rather than government legislation, however, the solution, Fertik told me, is more start-up entrepreneurs like himself providing paid products that empower consumers in our Web 3.0 world of pervasive personalized data. This is the second and final part of my interview with Fertik. Yesterday, he explained to me why people will pay for privacy.
Weiye Loh

"Open" - "Necessary" but not "Sufficient" « Gurstein's Community Informatics - 0 views

  • Egon Willighagen, commenting on Peter Murray-Rust's response to my blog post, writes: Open Data is *not* about how to present (governmental) data in a human readable way to the general public to take advantage of (though I understand why he got that idea), but Open Data is about making this technically and legally *possible*. He did not get that point, unfortunately.
  • “Open Data” as articulated above by Willighagen has the form of a private club—open “technically” (and “legally”) to all to join, but whose membership requires a degree of education, resources, and technical skill such as to put it out of the reach of any but a very select group.
  • Parminder Jeet Singh in his own comments contrasts Open Data with Public Data—a terminology and conceptual shift with which I am coming to agree—where Public Data is data which is not only “open” but also is designed and structured so as to be usable by the broad “public” (“the people”).
Weiye Loh

How should we use data to improve our lives? - By Michael Agger - Slate Magazine - 0 views

  • The Swiss economists Bruno Frey and Alois Stutzer argue that people do not appreciate the real cost of a long commute. And especially when that commute is unpredictable, it takes a toll on our daily well-being.
  • imagine if we shared our commuting information so that we could calculate the average commute from various locations around a city. When the growing family of four pulls up to a house for sale in New Jersey, the listing would indicate not only the price and the number of bathrooms but also the rush-hour commute time to Midtown Manhattan. That would be valuable information to have, since buyers could realistically weigh the tradeoffs of remaining in a smaller space closer to work against moving to a larger space and taking on a longer commute.
  • In a cover story for the New York Times Magazine, the writer Gary Wolf documented the followers of “The Data-Driven Life,” programmers, students, and self-described geeks who track various aspects of their lives. Seth Roberts does a daily math exercise to measure small changes in his mental acuity. Kiel Gilleade is a "Body Blogger" who shares his heart rate via Twitter. On the more extreme end, Mark Carranza has a searchable database of every idea he's had since 1984. They're not alone. This community continues to thrive, and its efforts are chronicled at a blog called the Quantified Self, co-founded by Wolf and Kevin Kelly.
  • ...3 more annotations...
  • If you've ever asked Nike+ to log your runs or given Google permission to keep your search history, you've participated in a bit of self-tracking. Now that more people have location-aware smartphones and the Web has made data easy to share, personal data is poised to become an important tool to understand how we live, and how we all might live better. One great example of this phenomenon in action is the site Cure Together, which allows you to enter your symptoms—for, say, "anxiety" or "insomnia"—and the various remedies you've tried to feel better. One thing the site does is aggregate this information and present the results in chart form. Here is the chart for depression:
  • Instead of being isolated in your own condition, you can now see what has worked for others. The same principle is at work at the site Fuelly, where you can "track, share, and compare" your miles per gallon and see how efficient certain makes and models really are.
  • Businesses are also using data tracking to spur their employees to accomplish companywide goals: Wal-Mart partnered with Zazengo to help employees track their "personal sustainability" actions such as making a home-cooked meal or buying local produce. The app RescueTime, which records all of the activity on your computer, gives workers an easy way to account for their time. And that comes in handy when you want to show the boss how efficient telecommuting can be.
  •  
    Data for a better planet
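
The commute-aware listing imagined above boils down to aggregating shared trip records by origin. A toy sketch; the place names and minutes are hypothetical:

    import pandas as pd

    # Crowd-sourced rush-hour commutes to Midtown, in minutes.
    trips = pd.DataFrame({
        "origin":  ["Maplewood", "Maplewood", "Hoboken", "Hoboken", "Hoboken"],
        "minutes": [62, 75, 28, 35, 31],
    })
    # What a listing could show: the mean commute and its spread per origin.
    print(trips.groupby("origin")["minutes"].agg(["mean", "std"]))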
Weiye Loh

Information technology and economic change: The impact of the printing press | vox - Research-based policy analysis and commentary from leading economists - 0 views

  • Despite the revolutionary technological advance of the printing press in the 15th century, there is precious little economic evidence of its benefits. Using data on 200 European cities between 1450 and 1600, this column finds that economic growth was higher by as much as 60 percentage points in cities that adopted the technology.
  • Historians argue that the printing press was among the most revolutionary inventions in human history, responsible for a diffusion of knowledge and ideas, “dwarfing in scale anything which had occurred since the invention of writing” (Roberts 1996, p. 220). Yet economists have struggled to find any evidence of this information technology revolution in measures of aggregate productivity or per capita income (Clark 2001, Mokyr 2005). The historical data thus present us with a puzzle analogous to the famous Solow productivity paradox – that, until the mid-1990s, the data on macroeconomic productivity showed no effect of innovations in computer-based information technology.
  • In recent work (Dittmar 2010a), I examine the revolution in Renaissance information technology from a new perspective by assembling city-level data on the diffusion of the printing press in 15th-century Europe. The data record each city in which a printing press was established 1450-1500 – some 200 out of over 1,000 historic cities (see also an interview on this site, Dittmar 2010b). The research emphasises cities for three principal reasons. First, the printing press was an urban technology, producing for urban consumers. Second, cities were seedbeds for economic ideas and social groups that drove the emergence of modern growth. Third, city sizes were historically important indicators of economic prosperity, and broad-based city growth was associated with macroeconomic growth (Bairoch 1988, Acemoglu et al. 2005).
  • ...8 more annotations...
  • Figure 1 summarises the data and shows how printing diffused from Mainz 1450-1500. [Figure 1: The diffusion of the printing press]
  • City-level data on the adoption of the printing press can be exploited to examine two key questions: Was the new technology associated with city growth? And, if so, how large was the association? I find that cities in which printing presses were established 1450-1500 had no prior growth advantage, but subsequently grew far faster than similar cities without printing presses. My work uses a difference-in-differences estimation strategy to document the association between printing and city growth. The estimates suggest early adoption of the printing press was associated with a population growth advantage of 21 percentage points 1500-1600, when mean city growth was 30 percentage points. The difference-in-differences model shows that cities that adopted the printing press in the late 1400s had no prior growth advantage, but grew at least 35 percentage points more than similar non-adopting cities from 1500 to 1600.
  • The restrictions on diffusion meant that cities relatively close to Mainz were more likely to receive the technology other things equal. Printing presses were established in 205 cities 1450-1500, but not in 40 of Europe’s 100 largest cities. Remarkably, regulatory barriers did not limit diffusion. Printing fell outside existing guild regulations and was not resisted by scribes, princes, or the Church (Neddermeyer 1997, Barbier 2006, Brady 2009).
  • Historians observe that printing diffused from Mainz in “concentric circles” (Barbier 2006). Distance from Mainz was significantly associated with early adoption of the printing press, but neither with city growth before the diffusion of printing nor with other observable determinants of subsequent growth. The geographic pattern of diffusion thus arguably allows us to identify exogenous variation in adoption. Exploiting distance from Mainz as an instrument for adoption, I find large and significant estimates of the relationship between the adoption of the printing press and city growth. I find a 60 percentage point growth advantage between 1500-1600.
  • The importance of distance from Mainz is supported by an exercise using “placebo” distances. When I employ distance from Venice, Amsterdam, London, or Wittenberg instead of distance from Mainz as the instrument, the estimated print effect is statistically insignificant.
  • Cities that adopted print media benefitted from positive spillovers in human capital accumulation and technological change broadly defined. These spillovers exerted an upward pressure on the returns to labour, made cities culturally dynamic, and attracted migrants. In the pre-industrial era, commerce was a more important source of urban wealth and income than tradable industrial production. Print media played a key role in the development of skills that were valuable to merchants. Following the invention of printing, European presses produced a stream of math textbooks used by students preparing for careers in business.
  • These and hundreds of similar texts worked students through problem sets concerned with calculating exchange rates, profit shares, and interest rates. Broadly, print media was also associated with the diffusion of cutting-edge business practice (such as book-keeping), literacy, and the social ascent of new professionals – merchants, lawyers, officials, doctors, and teachers.
  • The printing press was one of the greatest revolutions in information technology. The impact of the printing press is hard to identify in aggregate data. However, the diffusion of the technology was associated with extraordinary subsequent economic dynamism at the city level. European cities were seedbeds of ideas and business practices that drove the transition to modern growth. These facts suggest that the printing press had very far-reaching consequences through its impact on the development of cities.
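
Dittmar's difference-in-differences logic can be seen in miniature: compare adopter and non-adopter city growth before and after 1500, then difference out the common trend. The numbers below are invented to mimic the shape of the result, not his estimates:

    # City growth in percentage points (illustrative, not Dittmar's data).
    pre  = {"adopter": 18.0, "non_adopter": 17.0}   # 1450-1500
    post = {"adopter": 51.0, "non_adopter": 30.0}   # 1500-1600

    did = (post["adopter"] - pre["adopter"]) \
        - (post["non_adopter"] - pre["non_adopter"])
    print(did)  # -> 20.0: adopters' extra growth net of the common trend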
Weiye Loh

Roger Pielke Jr.'s Blog: Blind Spots in Australian Flood Policies - 0 views

  • better management of flood risks in Australia will depend upon better data on flood risk. However, collecting such data has proven problematic
  • As many Queenslanders affected by January’s floods are realising, riverine flood damage is commonly excluded from household insurance policies. And this is unlikely to change until councils – especially in Queensland – stop dragging their feet and actively assist in developing comprehensive data insurance companies can use.
  • Why? Because there is often little available information that would allow an insurer to adequately price this flood risk. Without this, there is little economic incentive for insurers to accept this risk. It would be irresponsible for insurers to cover riverine flood without quantifying and pricing the risk accordingly.
  • ...8 more annotations...
  • The first step in establishing risk-adjusted premiums is to know the likelihood of the depth of flooding at each address. This information has to be address-specific because the severity of flooding can vary widely over small distances, for example, from one side of a road to the other.
  • A litany of reasons is given for withholding data. At times it seems that refusal stems from a view that insurance is innately evil. This is ironic in view of the gratuitous advice sometimes offered by politicians and commentators in the aftermath of extreme events, exhorting insurers to pay claims even when no legal liability exists and riverine flood is explicitly excluded from policies.
  • Risk Frontiers is involved in jointly developing the National Flood Information Database (NFID) for the Insurance Council of Australia with Willis Re, a reinsurance broking intermediary. NFID is a five-year project aiming to integrate flood information from all city councils in a consistent insurance-relevant form. The aim of NFID is to help insurers understand and quantify their risk. Unfortunately, obtaining the base data for NFID from some local councils is difficult and sometimes impossible despite the support of all state governments for the development of NFID. Councils have an obligation to assess their flood risk and to establish rules for safe land development. However, many are antipathetic to the idea of insurance. Some states and councils have been very supportive – in New South Wales and Victoria, particularly. Some states have a central repository – a library of all flood studies and digital terrain models (digital elevation data). Council reluctance to release data is most prevalent in Queensland, where, unfortunately, no central repository exists.
  • Second, models of flood risk are sometimes misused:
  • many councils only undertake flood modelling in order to create a single design flood level, usually the so-called one-in-100 year flood. (For reasons given later, a better term is the flood with a 1% annual likelihood of being exceeded.)
  • Inundation maps showing the extent of the flood with a 1% annual likelihood of exceedance are increasingly common on council websites, even in Queensland. Unfortunately these maps say little about the depth of water at an address or, importantly, how depth varies for less probable floods. Insurance claims usually begin when the ground is flooded and increase rapidly as water rises above the floor level. At Windsor in NSW, for example, the difference in the water depth between the flood with a 1% annual chance of exceedance and the maximum possible flood is nine metres. In other catchments this difference may be as small as ten centimetres. The risk of damage is quite different in both cases and an insurer needs this information if they are to provide coverage in these areas.
  • The ‘one-in-100 year flood’ term is misleading. To many it is something that happens regularly once every 100 years — with the reliability of a bus timetable. It is still possible, though unlikely, that a flood of similar or even greater magnitude could happen twice in one year or three times in successive years.
  • The calculations underpinning this are not straightforward but the probability that an address exposed to a 1-in-100 year flood will experience such an event or greater over the lifetime of the house – 50 years say – is around 40%. Over the lifetime of a typical home mortgage – 25 years – the probability of occurrence is 22%. These are not good odds.
  •  
    John McAneney of Risk Frontiers at Macquarie University in Sydney identifies some opportunities for better flood policies in Australia.
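
McAneney's 40% and 22% figures follow from simple binomial arithmetic on a 1% annual exceedance probability:

    # Chance of at least one "1-in-100 year" flood over n years:
    # 1 - (1 - 0.01) ** n.
    for years in (50, 25):
        p = 1 - (1 - 0.01) ** years
        print(f"{years} years: {p:.1%}")
    # 50 years: 39.5% (about 40%); 25 years: 22.2% (about 22%)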
Weiye Loh

World Bank Institute: We're also the data bank - video | Media | guardian.co.uk - 0 views

  •  
    Aleem Walji, practice manager for innovation at the World Bank Institute, which assists and advises policy makers and NGOs, tells the Guardian's Activate summit in London about the organisation's commitment to open data
Jiamin Lin

Firms allowed to share private data - 0 views

  •  
    Companies that request their customers' private information may in turn distribute these confidential particulars to others. As a result, cases of fraud and identity theft have surfaced, with fraudsters using these distributed identities to apply for loans or credit cards. Unlike other countries, Singapore has no privacy law in place to safeguard an individual's data against unauthorized commercial use, and fraudsters are able to exploit this loophole. Ethical question: Is it right for companies to request their customers' private information for certain reasons? Is it fair that they distribute this information to third parties, perhaps as a way to make money? Problem: I think the main problem is that there isn't a law in Singapore that safeguards an individual's data against unauthorized commercial use. Even though the Model Data Protection Code scheme tries to do the above, it is, after all, still a voluntary scheme. Companies can opt to adopt the scheme, but whether they apply it consistently is another issue. As long as a privacy law is not in place, this issue will continue to recur in Singapore.
Weiye Loh

The Data-Driven Life - NYTimes.com - 0 views

  • Humans make errors. We make errors of fact and errors of judgment. We have blind spots in our field of vision and gaps in our stream of attention.
  • These weaknesses put us at a disadvantage. We make decisions with partial information. We are forced to steer by guesswork. We go with our gut.
  • Others use data.
  • ...3 more annotations...
  • Others use data. A timer running on Robin Barooah’s computer tells him that he has been living in the United States for 8 years, 2 months and 10 days. At various times in his life, Barooah — a 38-year-old self-employed software designer from England who now lives in Oakland, Calif. — has also made careful records of his work, his sleep and his diet.
  • A few months ago, Barooah began to wean himself from coffee. His method was precise. He made a large cup of coffee and removed 20 milliliters weekly. This went on for more than four months, until barely a sip remained in the cup. He drank it and called himself cured. Unlike his previous attempts to quit, this time there were no headaches, no extreme cravings. Still, he was tempted, and on Oct. 12 last year, while distracted at his desk, he told himself that he could probably concentrate better if he had a cup. Coffee may have been bad for his health, he thought, but perhaps it was good for his concentration. Barooah wasn’t about to try to answer a question like this with guesswork. He had a good data set that showed how many minutes he spent each day in focused work. With this, he could do an objective analysis. Barooah made a chart with dates on the bottom and his work time along the side. Running down the middle was a big black line labeled “Stopped drinking coffee.” On the left side of the line, low spikes and narrow columns. On the right side, high spikes and thick columns. The data had delivered their verdict, and coffee lost.
  • “People have such very poor sense of time,” Barooah says, and without good time calibration, it is much harder to see the consequences of your actions. If you want to replace the vagaries of intuition with something more reliable, you first need to gather data. Once you know the facts, you can live by them.
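
Barooah's before/after chart is, in effect, a two-sample comparison of daily focused minutes. A hedged sketch of that analysis; the numbers are invented, and he eyeballed a chart rather than running a formal test:

    from scipy.stats import ttest_ind

    # Minutes of focused work per day (illustrative numbers only).
    with_coffee    = [190, 205, 178, 214, 199, 186]
    without_coffee = [242, 230, 255, 238, 247, 251]

    # Welch's t-test: is the difference bigger than day-to-day noise?
    stat, p = ttest_ind(without_coffee, with_coffee, equal_var=False)
    print(f"t = {stat:.2f}, p = {p:.4f}")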
Weiye Loh

Likert scale - Wikipedia, the free encyclopedia - 0 views

  • Whether individual Likert items can be considered as interval-level data, or whether they should be considered merely ordered-categorical data is the subject of disagreement. Many regard such items only as ordinal data, because, especially when using only five levels, one cannot assume that respondents perceive all pairs of adjacent levels as equidistant. On the other hand, often (as in the example above) the wording of response levels clearly implies a symmetry of response levels about a middle category; at the very least, such an item would fall between ordinal- and interval-level measurement; to treat it as merely ordinal would lose information. Further, if the item is accompanied by a visual analog scale, where equal spacing of response levels is clearly indicated, the argument for treating it as interval-level data is even stronger.
  • When treated as ordinal data, Likert responses can be collated into bar charts, central tendency summarised by the median or the mode (but some would say not the mean), dispersion summarised by the range across quartiles (but some would say not the standard deviation), or analyzed using non-parametric tests, e.g. chi-square test, Mann–Whitney test, Wilcoxon signed-rank test, or Kruskal–Wallis test.[4] Parametric analysis of ordinary averages of Likert scale data is also justifiable by the Central Limit Theorem, although some would disagree that ordinary averages should be used for Likert scale data.
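
The ordinal-versus-interval debate above maps directly onto which statistics you compute. A sketch of the ordinal-safe toolkit on two hypothetical groups of 1-5 responses:

    from statistics import median, mode
    from scipy.stats import mannwhitneyu

    group_a = [2, 3, 3, 4, 2, 3, 5, 4]   # hypothetical Likert responses
    group_b = [4, 4, 5, 3, 5, 4, 4, 5]

    # Ordinal-appropriate summaries: median and mode rather than the mean.
    print(median(group_a), mode(group_a))
    # Non-parametric group comparison instead of a t-test.
    u, p = mannwhitneyu(group_a, group_b, alternative="two-sided")
    print(f"U = {u}, p = {p:.4f}")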
Weiye Loh

Libel Chill and Me « Skepticism « Critical Thinking « Skeptic North - 0 views

  • Skeptics may by now be very familiar with recent attempts in Canada to ban wifi from public schools and libraries.  In short: there is no valid scientific reason to be worried about wifi.  It has also been revealed that the chief scientists pushing the wifi bans have been relying on poor data and even poorer studies.  By far the vast majority of scientific data that currently exists supports the conclusion that wifi and cell phone signals are perfectly safe.
  • So I wrote about that particular topic in the summer.  It got some decent coverage, but the fear mongering continued. I wrote another piece after I did a little digging into one of the main players behind this, one Rodney Palmer, and I discovered some decidedly pseudo-scientific tendencies in his past, as well as some undisclosed collusion.
  • One night I came home after a long day at work, a long commute, and a phone call that a beloved family pet was dying, and will soon be in significant pain.  That is the state I was in when I read the news about Palmer and Parliamentary committee.
  • ...18 more annotations...
  • That’s when I wrote my last significant piece for Skeptic North.  Titled, “Rodney Palmer: When Pseudoscience and Narcissism Collide,” it was a fiery take-down of every claim I heard Palmer speak before the committee, as well as reiterating some of his undisclosed collusion, unethical media tactics, and some reasons why he should not be considered an expert.
  • This time, the article got a lot more reader eyeballs than anything I had ever written for this blog (or my own) and it also caught the attention of someone on a school board which was poised to vote on wifi.  In these regards: Mission very accomplished.  I finally thought that I might be able to see some people in the media start to look at Palmer’s claims with a more critical eye than they had been previously, and I was flattered at the mountain of kind words, re-tweets, reddit comments and Facebook “likes.”
  • The comments section was mostly supportive of my article, and they were one of the few things that kept me from hiding in a hole for six weeks.  There were a few comments in opposition to what I wrote, some sensible, most incoherent rambling (one commenter, when asked for evidence, actually linked to a YouTube video which they referred to as “peer reviewed”)
  • One commenter was none other than the titular subject of the post, Rodney Palmer himself. Here is a screen shot of what he said: [screen shot of the libel/slander threat]
  • Knowing full well the story of the libel threat against Simon Singh, I’ve always thought that if ever a threat like that came my way, I’d happily beat it back with the righteous fury and good humour of a person with the facts on their side.  After all, if I’m wrong, you’d be able to prove me wrong, rather than try to shut me up with a threat of a lawsuit.  Indeed, I’ve been through a similar situation once before, so I should be an old hat at this! Let me tell you friends, it’s not that easy.  In fact, it’s awful.  Outside observers could easily identify that Palmer had no case against me, but that was still cold comfort to me.  It is a very stressful situation to find yourself in.
  • The state of libel and slander laws in this country are such that a person can threaten a lawsuit without actually threatening a lawsuit.  There is no need to hire a lawyer to investigate the claims, look into who I am, where I live, where I work, and issue a carefully worded threatening letter demanding compliance.  All a person has to say is some version of  “Libel.  Slander.  Hmmmm….,” and that’s enough to spook a lot of people into backing off. It’s a modern day bogeyman.  They don’t have to prove it.  They don’t have to act on it.  A person or organization just has to say “BOO!” with sufficient seriousness, and unless you’ve got a good deal of editorial and financial support, discussion goes out the window. Libel Chill refers to the ‘chilling effect’ that the possibility of a libel/slander lawsuit has.  If a person is scared they might get sued, then they won’t even comment on a piece at all.  In my case, I had already commented three times on the wifi scaremongering, but this bogus threat against me was surely a major contributing factor to my not commenting again.
  • I ceased to discuss anything in the comment thread of the original article, and even shied away from other comment threads calling me out. I learned a great deal about the wifi/EMF issue since I wrote the article, but I did not comment on any of it, because I knew that Palmer and his supporters were watching me like a hawk (sorry to stretch the simile), and would likely try to silence me again. I couldn't risk a lawsuit. Even though I knew there was no case against me, I couldn't afford a lawyer just to prove that I didn't do anything illegal.
  • The Libel and Slander Act of Ontario, 1990 hasn't really caught up with the internet. There isn't a clear precedent that defines a blog post, Twitter feed or Facebook post as falling under the umbrella of “broadcast,” which is what the bill addresses. If I had written the original article in print, Palmer would have had six weeks to file suit against me. But the internet is only kind of considered ‘broadcast.’ So it could be just six weeks, but he could also have up to two years to act and get a lawyer after me. Truth is, there’s not a clear demarcation point for our Canadian legal system.
  • Libel laws in Canada are somewhere in between the Plaintiff-favoured UK system, and the Defendant-favoured US system.  On the one hand, if Palmer chose to incur the expense and time to hire a lawyer and file suit against me, the burden of proof would be on me to prove that I did not act with malice.  Easy peasy.  On the other hand, I would have a strong case that I acted in the best interests of Canadians, which would fall under the recent Supreme Court of Canada decision on protecting what has been termed, “Responsible Communication.”  The Supreme Court of Canada decision does not grant bloggers immunity from libel and slander suits, but it is a healthy dose of welcome freedom to discuss issues of importance to Canadians.
  • Palmer himself did not specify anything against me in his threat.  There was nothing particular that he complained about, he just said a version of “Libel and Slander!” at me.  He may as well have said “Boo!”
  • This is not a DBAD discussion (although I wholeheartedly agree with Phil Plait there). 
  • If you’d like to boil my lessons down to an acronym, I suppose the best one would be DBRBC: Don’t be reckless. Be Careful.
  • I wrote a piece that, although it was not incorrect in any measurable way, was written with fire and brimstone, piss and vinegar.  I stand by my piece, but I caution others to be a little more careful with the language they use.  Not because I think it is any less or more tactically advantageous (because I’m not sure anyone can conclusively demonstrate that being an aggressive jerk is an inherently better or worse communication tool), but because the risks aren’t always worth it.
  • I’m not saying don’t go after a person.  There are egomaniacs out there who deserve to be called out and taken down (verbally, of course).  But be very careful with what you say.
  • ask yourself some questions first: 1) What goal(s) are you trying to accomplish with this piece? Are you trying to convince people that there is a scientific misunderstanding here?  Are you trying to attract the attention of the mainstream media to a particular facet of the issue?  Are you really just pissed off and want to vent a little bit?  Is this article a catharsis, or is it communicative?  Be brutally honest with your intentions, it’s not as easy as you think.  Venting is okay.  So is vicious venting, but be careful what you dress it up as.
  • 2) In order to attain your goals, did you use data, or personalities?  If the former, are you citing the best, most current data you have available to you? Have you made a reasonable effort to check your data against any conflicting data that might be out there? If the latter, are you providing a mountain of evidence, and not just projecting onto personalities?  There is nothing inherently immoral or incorrect with going after the personalities.  But it is a very risky undertaking. You have to be damn sure you know what you’re talking about, and damn ready to defend yourself.  If you’re even a little loose with your claims, you will be called out for it, and a legal threat is very serious and stressful. So if you’re going after a personality, is it worth it?
  • 3) Are you letting the science speak for itself?  Are you editorializing?  Are you pointing out what part of your piece is data and what part is your opinion?
  • 4) If this piece was written in anger, frustration, or otherwise motivated by a powerful emotion, take a day.  Let your anger subside.  It will.  There are many cathartic enterprises out there, and you don’t need to react to the first one that comes your way.  Let someone else read your work before you share it with the internet.  Cooler heads definitely do think more clearly.
Weiye Loh

Scientists Are Cleared of Misuse of Data - NYTimes.com - 0 views

  • The inquiry, by the Commerce Department’s inspector general, focused on e-mail messages between climate scientists that were stolen and circulated on the Internet in late 2009 (NOAA is part of the Commerce Department). Some of the e-mails involved scientists from NOAA.
  • Climate change skeptics contended that the correspondence showed that scientists were manipulating or withholding information to advance the theory that the earth is warming as a result of human activity.
  • In a report dated Feb. 18 and circulated by the Obama administration on Thursday, the inspector general said, “We did not find any evidence that NOAA inappropriately manipulated data.”
  • ...6 more annotations...
  • The finding comes at a critical moment for NOAA as some newly empowered Republican House members seek to rein in the Environmental Protection Agency’s plans to regulate greenhouse gas emissions, often contending that the science underpinning global warming is flawed. NOAA is the federal agency tasked with monitoring climate data.
  • The inquiry into NOAA’s conduct was requested last May by Senator James M. Inhofe, Republican of Oklahoma, who has challenged the science underlying human-induced climate change. Mr. Inhofe was acting in response to the controversy over the e-mail messages, which were stolen from the Climatic Research Unit at the University of East Anglia in England, a major hub of climate research. Mr. Inhofe asked the inspector general of the Commerce Department to investigate how NOAA scientists responded internally to the leaked e-mails. Of 1,073 messages, 289 were exchanges with NOAA scientists.
  • The inspector general reviewed the 1,073 e-mails, and interviewed Dr. Lubchenco and staff members about their exchanges. The report did not find scientific misconduct; it did, however, challenge the agency over its handling of some Freedom of Information Act requests in 2007. And it noted the inappropriateness of e-mailing a collage cartoon depicting Senator Inhofe and five other climate skeptics marooned on a melting iceberg that passed between two NOAA scientists.
  • The report was not a review of the climate data itself. It joins a series of investigations by the British House of Commons, Pennsylvania State University, the InterAcademy Council and the National Research Council into the leaked e-mails that have exonerated the scientists involved of scientific wrongdoing.
  • But Mr. Inhofe said the report was far from a clean bill of health for the agency and that contrary to its executive summary, showed that the scientists “engaged in data manipulation.”
  • “It also appears that one senior NOAA employee possibly thwarted the release of important federal scientific information for the public to assess and analyze,” he said, referring to an employee’s failure to provide material related to work for the Intergovernmental Panel on Climate Change, a different body that compiles research, in response to a Freedom of Information request.
Weiye Loh

Climategate: Hiding the Decline? - 0 views

  • Regarding the “hide the decline” email, Jones has explained that when he used the word “trick”, he simply meant “a mathematical approach brought to bear to solve a problem”. The inquiry made the following criticism of the resulting graph (its emphasis): [T]he figure supplied for the WMO Report was misleading. We do not find that it is misleading to curtail reconstructions at some point per se, or to splice data, but we believe that both of these procedures should have been made plain — ideally in the figure but certainly clearly described in either the caption or the text. [1.3.2] But this was one isolated instance that occurred more than a decade ago. The Review did not find anything wrong with the overall picture painted about divergence (or uncertainties generally) in the literature and in IPCC reports. The Review notes that the WMO report in question “does not have the status or importance of the IPCC reports”, and concludes that divergence “is not hidden” and “the subject is openly and extensively discussed in the literature, including CRU papers.” [1.3.2]
  • As for the treatment of uncertainty in the AR4’s paleoclimate chapter, the Review concludes that the central Figure 6.10 is not misleading, that “[t]he variation within and between lines, as well as the depiction of uncertainty is quite apparent to any reader”, that “there has been no exclusion of other published temperature reconstructions which would show a very different picture”, and that “[t]he general discussion of sources of uncertainty in the text is extensive, including reference to divergence”. [7.3.1]
  • Regarding CRU’s selections of tree ring series, the Review does not presume to say whether one series is better than another, though it does point out that CRU have responded to the accusation that Briffa misused the Yamal data on their website. The Review found no evidence that CRU scientists knowingly promoted non-representative series or that their input cast doubt on the IPCC’s conclusions. The much-maligned Yamal series was included in only 4 of the 12 temperature reconstructions in the AR4 (and not at all in the TAR).
  • ...1 more annotation...
  • What about the allegation that CRU withheld the Yamal data? The Review found that “CRU did not withhold the underlying raw data (having correctly directed the single request to the owners)”, although “we believe that CRU should have ensured that the data they did not own, but on which their publications relied, was archived in a more timely way.” [1.3.2]
Weiye Loh

Pew Research raw survey data now available - 0 views

  •  
    The Pew Research Center churns out a lot of interesting results from a number of surveys about online and American culture, but it usually shares only aggregated results and pre-made charts and graphs. This is well and good for the information-consuming public; however, these results can spawn curiosities that are fun to dig into. Luckily, the Pew Research Center launched a Data Sets section that provides raw survey responses and the questions in a variety of easy-to-use data formats.
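
Once raw responses are downloadable, re-tabulating a question takes a couple of lines. A sketch assuming a CSV export and a hypothetical column name; Pew's actual files and formats vary:

    import pandas as pd

    # Hypothetical file and column; substitute the real download.
    responses = pd.read_csv("pew_internet_survey.csv")
    print(responses["social_media_use"].value_counts(normalize=True))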
Weiye Loh

Scientist Beloved by Climate Deniers Pulls Rug Out from Their Argument - Environment - GOOD - 0 views

  • One of the scientists was Richard Muller from University of California, Berkeley. Muller has been working on an independent project to better estimate the planet's surface temperatures over time. Because he is willing to say publicly that he has some doubts about the accuracy of the temperature stations that most climate models are based on, he has been embraced by the science denying crowd.
  • A Koch brothers charity, for example, has donated nearly 25 percent of the financial support provided to Muller's project.
  • Skeptics of climate science have been licking their lips waiting for his latest research, which they hoped would undermine the data behind basic theories of anthropogenic climate change. At the hearing today, however, Muller threw them for a loop with this graph:
  • ...3 more annotations...
  • Muller's data (black line) tracks pretty well with the three established data sets. This is just an initial sampling of Muller's data—just 2 percent of the 1.6 billion records he's working with—but these early findings are incredibly consistent with the previous findings
  • In his testimony, Muller made these points (emphasis mine): The Berkeley Earth Surface Temperature project was created to make the best possible estimate of global temperature change using as complete a record of measurements as possible and by applying novel methods for the estimation and elimination of systematic biases. We see a global warming trend that is very similar to that previously reported by the other groups. The world temperature data has sufficient integrity to be used to determine global temperature trends. Despite potential biases in the data, methods of analysis can be used to reduce bias effects well enough to enable us to measure long-term Earth temperature changes. data integrity is adequate. Based on our initial work at Berkeley Earth, I believe that some of the most worrisome biases are less of a problem than I had previously thought.
  • For the many climate deniers who hang their arguments on Muller's "doubts," this is a severe blow. Of course, when the hard scientific truths are inconvenient, climate denying House leaders can always call a lawyer, a marketing professor, and an economist into the scientific hearing.
  •  
    Today, there was a climate science hearing in the House Committee on Science, Space, and Technology. Of the six "expert" witnesses, only three were scientists. The others were an economist, a lawyer, and a professor of marketing. One of the scientists was Richard Muller from University of California, Berkeley. Muller has been working on an independent project to better estimate the planet's surface temperatures over time. Because he is willing to say publicly that he has some doubts about the accuracy of the temperature stations that most climate models are based on, he has been embraced by the science denying crowd. A Koch brothers charity, for example, has donated nearly 25 percent of the financial support provided to Muller's project.