Home/ Groups/ DJCamp2011
Tom Johnson

Reporters' Lab // Creating a newsroom tool in 30 hours or less - 1 views

  •  
    Creating a newsroom tool in 30 hours or less (June 28, 2012 at 2:51 PM). At NewsHack in San Francisco, a team of eight journalists and developers spent 30 hours cobbling together Haystax, a point-and-click Web scraper to help anyone collect public information from online databases. Now we need help taking it to the next level.
Tom Johnson

Mr. People - Data cleaning - 1 views

  •  
    Mr. People. Years ago, while trying to clean up the names of donors in campaign finance data from the Federal Election Commission, I hacked together a Perl module, loosely based on the Lingua-EN-NameParse module, to standardize names. One port to Ruby later, I've finally put together a Web front end for it. Try it out below: paste your own data in or try the sample data. To use the people Ruby gem in your own scripts, run sudo gem install people, then read the documentation. Suggestions? Send them to mrpeople@ericson.net
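    As a rough illustration of the kind of name standardization described above, here is a minimal Python sketch. It is not the people gem or Lingua-EN-NameParse: the function standardize_name, its tiny title and suffix lists, and the sample names are all assumptions made for this example, and a real parser handles many more cases (couples, particles such as "van", mixed-case surnames like "McDonald").

```python
import re

# Hypothetical, deliberately small rule sets; real name parsers cover far more.
TITLES = {"mr", "mrs", "ms", "dr"}
SUFFIXES = {"jr", "sr", "ii", "iii"}
ROMAN = {"ii", "iii"}


def standardize_name(raw: str) -> dict:
    """Split a messy donor name into first, middle, last and suffix parts."""
    text = raw.strip()
    # Reorder "LAST, FIRST MIDDLE" into "FIRST MIDDLE LAST".
    if "," in text:
        last, rest = text.split(",", 1)
        text = f"{rest.strip()} {last.strip()}"
    # Split on whitespace and periods, dropping empty pieces.
    tokens = [t for t in re.split(r"[\s.]+", text) if t]
    # Drop a leading courtesy title such as "Dr".
    if tokens and tokens[0].lower() in TITLES:
        tokens = tokens[1:]
    # Pull out the first generational suffix found after the first token.
    suffix = ""
    for i in range(1, len(tokens)):
        if tokens[i].lower() in SUFFIXES:
            suffix = tokens.pop(i)
            break
    first = tokens[0].capitalize() if tokens else ""
    last = tokens[-1].capitalize() if len(tokens) > 1 else ""
    middle = " ".join(t.capitalize() for t in tokens[1:-1])
    if suffix:
        suffix = suffix.upper() if suffix.lower() in ROMAN else suffix.capitalize()
    return {"first": first, "middle": middle, "last": last, "suffix": suffix}


print(standardize_name("SMITH, JOHN Q JR"))
print(standardize_name("Dr. jane doe"))
```

    Returning a dictionary of parts, rather than a single cleaned string, makes it easier to match the same donor across records that order or abbreviate the name differently.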
Tom Johnson

http://theyrule.net - 1 views

  •  
    They Rule. Overview: They Rule aims to provide a glimpse of some of the relationships of the US ruling class. It takes as its focus the boards of some of the most powerful U.S. companies, which share many of the same directors. Some individuals sit on 5, 6 or 7 of the top 1000 companies. It allows users to browse through these interlocking directories and run searches on the boards and companies. A user can save a map of connections complete with their annotations and email links to these maps to others. They Rule is a starting point for research about these powerful individuals and corporations.
    Context: A few companies control much of the economy and oligopolies exert control in nearly every sector of the economy. The people who head up these companies swap on and off the boards from one company to another, and in and out of government committees and positions. These people run the most powerful institutions on the planet, and we have almost no say in who they are. This is not a conspiracy, they are proud to rule, yet these connections of power are not always visible to the public eye. Karl Marx once called this ruling class a 'band of hostile brothers.' They stand against each other in the competitive struggle for the continued accumulation of their capital, but they stand together as a family supporting their interests in perpetuating the profit system as a whole. Protecting this system can require the cover of a 'legitimate' force, and this is the role that is played by the state. An understanding of this system cannot be gleaned from looking at the inter-personal relations of this class alone, but rather how they stand in relation to other classes in society. Hopefully They Rule will raise larger questions about the structure of our society and in whose benefit it is run.
    The Data: We do not claim that this data is 100% accurate at all times.
Corporate directors have a habit of dying, quitting boards, joining new ones and most frustratingly passing on their name
  •  
    I think this data must be very useful to the people in Occupy Wall Street.
Tom Johnson

An Applied Demography Toolbox - 1 views

  •  
    An Applied Demography Toolbox. A collection of applied demography programs, scripts, spreadsheets and databases. If you have any questions (such as how to apply the tools to your own work), recommendations or additions, you can send a message to me (Eddie Hunsinger) at edynivn@gmail.com. If you would like to use, share or reproduce any information or ideas from the linked files, be sure to cite the respective source. http://www.demog.berkeley.edu/~eddieh/toolbox.html#MedianCalculator
  •  
    These tend to be US-centric, but there are universal tools here for statistical analysis.
Tom Johnson

You are here - 1 views

  •  
    Google-based app put together by the guys at the Chicago Tribune. Presented at the hackathon in Santiago, Chile.
Tom Johnson

Needlebase - for acquiring, integrating, cleansing, analyzing and publishing data on th... - 1 views

  •  
    ITA Software is proud to introduce Needlebase™, a revolutionary platform for acquiring, integrating, cleansing, analyzing and publishing data on the web. Using Needlebase through a web browser, without programmers or DBAs, your data team can easily:
    Acquire data from multiple sources: A simple tagging process quickly imports structured data from complex websites, XML feeds, and spreadsheets into a unified database of your design.
    Merge, deduplicate and cleanse: Needlebase uses intelligent semantics to help you find and merge variant forms of the same record. Your merges, edits and deletions persist even after the original data is refreshed from its source.
    Build and publish custom data views: Use Needlebase's visual UI and powerful query language to configure exactly your desired view of the data, whether as a list, table, grid, or map. Then, with one click, publish the data for others to see, or export a feed of the clean data to your own local database.
    Needlebase dramatically reduces the time, cost, and expertise needed to build and maintain comprehensive databases of practically anything. Read on to learn more about Needlebase's capabilities and our early adopters' success stories, or watch our tutorial videos. Then sign up to get started! http://needlebase.com
Tom Johnson

Playground | Social Analytics For Marketers - 0 views

  •  
    What is it? A social analytics platform that contains over 1,000 days of tweets (all 70 billion of them), Facebook activity and blog posts.
    How is it of use to journalists? "Journalists can easily develop real-time insights into any story from Playground," PeopleBrowsr UK CEO Andrew Grill explains. Complex keyword searches can be divided by user influence, geolocation, sentiment, and virtual communities of people with shared interests and affinities. These features, and many more, let reporters and researchers easily drill down to find the people and content driving the conversation on social networks on any subject. Playground lets you use the data the way you want to use it. You can either export the graphs and tables that the site produces automatically or export the results in a CSV file to create your own visualisations, which could potentially make it the next favourite tool of data journalists. Grill added: "The recent launch of our fully transparent Kred influencer platform will make it faster and easier for journalists to find key influencers in a particular community." You can give Playground a try for the first 14 days before signing up for one of their subscriptions ($19 a month for students and journalists, $149 for organisations and companies).
Tom Johnson

Data Science Central - 0 views

  •  
    Welcome to Data Science Central! Data Science Central is the industry's one-stop resource for big data practitioners. From Analytics to Data Integration to Visualization, Data Science Central (DSC) provides a true community experience through social interaction, peer-to-peer technical support, the latest in technology, tools and trends, and even job opportunities. We look forward to hearing your feedback as we grow this community of professionals in our exciting industry during times of dramatic change.
Tom Johnson

cohuman collaboration tool - 0 views

  •  
    Who uses Cohuman? Cohuman is ideal for any group of people that needs to communicate more dynamically and effectively than email or traditional collaboration tools will allow. Startups, Distributed Teams, Small Businesses, Deal Teams, Departments in larger organizations... in short, Cohuman is for any group that requires a solution designed to coordinate people and manage projects more intelligently.
    Clear Task Ownership: Assigning and tracking tasks is unambiguous. Each team member has their personal responsibilities defined.
    Transparent Communication: Everyone on the team knows exactly who is doing what, without extra effort.
    Intelligent Prioritization: Every Task is ranked by Cohuman from the team's inputs in order of priority for people and projects, so the important Tasks get done first.
    Dynamic Updates: If a Task priority changes, the information is shared automatically with each team member. No Status update meetings or emails required.
    Powerful Email Integration: Cohuman works for everyone on your team. Even those without a Cohuman account can interact with Cohuman via their email.
Tom Johnson

Open Data for Africa - Home - 0 views

  •  
    AfDB Launches the Open Data for Africa Platform. The African Development Bank Group (AfDB), in partnership with Knoema, has launched an Open Data for Africa platform aimed at significantly increasing access to quality data necessary for managing and monitoring development results in African countries, including the MDGs. The platform will also serve as a knowledge center for collecting, accessing…
Tom Johnson

Public sector needs to improve quality of information, warns Eurim | Guardian Governmen... - 0 views

  • Public sector needs to improve quality of information, warns Eurim. Parliamentary group gives cautious welcome to the EU's plans to open up more public sector data. By Sade Laja, Guardian Professional, Monday 19 December 2011 07.08 EST. Sharing data on public services could have serious consequences unless the material has been valued, maintained and protected and the original reasons for its collection have been taken into account, the Information Society Alliance (Eurim) has warned. In a report on the quality of public sector information, the group says that the drive to put central and local government data online, open to public scrutiny, has revealed the long-standing problems with quality that lie behind the reluctance of some departments and agencies to trust one another's data. It adds that it is important that decisions on spending cuts are based on good quality information.
  •  
    An important article. Please read.
Tom Johnson

World Public Opinion - 0 views

  •  
    WorldPublicOpinion.org is an international collaborative project whose aim is to give voice to public opinion around the world on international issues. As the world becomes increasingly integrated, problems have become increasingly global, pointing to a greater need for understanding between nations and for elucidating global norms. With the growth of democracy in the world, public opinion has come to play a greater role in the foreign policy process. WorldPublicOpinion.org seeks to reveal the values and views of publics in specific nations around the world as well as global patterns of world public opinion. WorldPublicOpinion.org was initiated by and is managed by the Program on International Policy Attitudes.
Tom Johnson

SchemaSpy - 0 views

  •  
    SchemaSpy: Graphical Database Schema Metadata Browser, by John Currier. Do you hate starting on a new project and having to try to figure out someone else's idea of a database? Or are you in QA and the developers expect you to understand all the relationships in their schema? If so, then this tool's for you. SchemaSpy is a Java-based tool (requires Java 5 or higher) that analyzes the metadata of a schema in a database and generates a visual representation of it in a browser-displayable format. It lets you click through the hierarchy of database tables via child and parent table relationships as represented by both HTML links and entity-relationship diagrams. It's also designed to help resolve the obtuse errors that a database sometimes gives related to failures due to constraints.
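    To make "analyzes the metadata of a schema" a little more concrete, here is a minimal Python sketch that lists child-to-parent table relationships from a database's own metadata. It is not SchemaSpy, which is a Java tool with its own command line; this uses the standard-library sqlite3 module against an in-memory database, and the helper child_to_parent_links and the donors/donations tables are invented for the example.

```python
import sqlite3


def child_to_parent_links(conn: sqlite3.Connection) -> list:
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    links = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA foreign_key_list row is:
        # (id, seq, table, from, to, on_update, on_delete, match)
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            links.append((table, row[3], row[2], row[4]))
    return links


# Hypothetical two-table schema, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE donors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE donations (
        id INTEGER PRIMARY KEY,
        donor_id INTEGER REFERENCES donors(id),
        amount REAL
    );
    """
)

links = child_to_parent_links(conn)
for child, child_col, parent, parent_col in links:
    print(f"{child}.{child_col} -> {parent}.{parent_col}")
```

    A schema browser builds its clickable parent and child links and its entity-relationship diagrams from this same kind of relationship list.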
Tom Johnson

Mining of Massive Datasets - 0 views

  •  
    Mining of Massive Datasets. The book has now been published by Cambridge University Press. A hardcopy can be obtained Here. By agreement with the publisher, you can still download it free from this page. Cambridge Press does, however, retain copyright on the work, and we expect that you will acknowledge our authorship if you republish parts or all of it. We are sorry to have to mention this point, but we have evidence that other items we have published on the Web have been appropriated and republished under other names. It is easy to detect such misuse, by the way, as you will learn in Chapter 3. --- Anand Rajaraman (@anand_raj) and Jeff Ullman
    Downloads: the complete book (340 pages, approximately 2MB), or individual chapters:
    Preface and Table of Contents
    Chapter 1: Data Mining
    Chapter 2: Large-Scale File Systems and Map-Reduce
    Chapter 3: Finding Similar Items
    Chapter 4: Mining Data Streams
    Chapter 5: Link Analysis
    Chapter 6: Frequent Itemsets
    Chapter 7: Clustering
    Chapter 8: Advertising on the Web
    Chapter 9: Recommendation Systems
    Index
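    The authors' remark about detecting republished text points at Chapter 3, "Finding Similar Items". As a rough sketch of the simplest form of that idea (not code from the book), the Python below compares texts by the Jaccard similarity of their shingle sets. The function names, the choice of word-level shingles with k = 3, and the sample sentences are assumptions made for this example.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up sample texts: one partial copy of the original, one unrelated.
original = "it is easy to detect such misuse by the way"
copied = "it is easy to detect such misuse as you will learn"
unrelated = "every chart starts with a table of data points"

sim_copy = jaccard(shingles(original), shingles(copied))
sim_other = jaccard(shingles(original), shingles(unrelated))
print(f"partial copy: {sim_copy:.2f}, unrelated: {sim_other:.2f}")
```

    Comparing every pair of documents this way does not scale to large collections, which is the problem the chapter's later techniques are meant to address.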
Tom Johnson

How to use gestalt laws to make better charts The Excel Charts Blog - 0 views

  •  
    Perception: Gestalt Laws. Every chart starts with a table. We transcribe this table into a visual representation of distances between data points: the "origin chart". That's when our "eye-brain system" starts making assumptions. It assumes that data points are somewhat related, even if they are not.
Tom Johnson

Interactive charts add heft to your data stories - Online News Association - 0 views

  •  
    Interactive charts add heft to your data stories. Posted Feb. 16, 10 a.m., in MJ Bear Fellows, Resources, by Lucas Timmons. Filed under: data. Data journalism can be very compelling. Stitched with a good narrative, it can tell one amazing story. But we can do better than that. We can also visualize the data and provide a great package. With that in mind, here are three free options for creating animated and interactive charts.
Tom Johnson

The Open Data Handbook - Open Data Manual - 0 views

  •  
    "The Open Data Handbook This handbook discusses the legal, social and technical aspects of open data. It can be used by anyone but is especially designed for those seeking to open up data. It discusses the why, what and how of open data - why to go open, what open is, and the how to 'open' data. To get started, you may wish to look at the Introduction. You can navigate through the report using the Table of Contents (see sidebar or below). We warmly welcome comments on the text and will incorporate feedback as we go forward. We also welcome contributions or suggestions for additional sections and areas to examine."
Tom Johnson

Reporters' Lab @ Duke University - 0 views

  •  
    The site now has reviews of common (and some uncommon) tools that promise to help your reporting, projects for the future and soon, we hope, news of promising and interesting use of new methods for reporting.
Tom Johnson

The Overview Project » Using Overview to analyze 4500 pages of documents on s... - 0 views

  •  
    Using Overview to analyze 4500 pages of documents on security contractors in Iraq, by Jonathan Stray, 02/21/2012. This post describes how we used a prototype of the Overview software to explore 4,500 pages of incident reports concerning the actions of private security contractors working for the U.S. State Department during the Iraq war. This was the core of the reporting work for our previous post, where we reported the results of that analysis. The promise of a document set like this is that it will give us some idea of the broader picture, beyond the handful of really egregious incidents that have made headlines. To do this, in some way we have to take into account most or all of the documents, not just the small number that might match a particular keyword search. But at one page per minute, eight hours per day, it would take about 10 days for one person to read all of these documents - to say nothing of taking notes or doing any sort of followup. This is exactly the sort of problem that Overview would like to solve. The reporting was a multi-stage process: splitting the massive PDFs into individual documents and extracting the text; exploration and subject tagging with the Overview prototype; random sampling to estimate the frequency of certain types of events; and followup and comparison with other sources.
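    The reading-time estimate and the random-sampling step can be sketched in a few lines of Python. The page count, reading rate and hours per day come from the post; everything else is an assumption for illustration: the helper names, the made-up corpus in which about 20% of items are flagged, the sample size of 100, and the simple normal-approximation interval, which is not necessarily the method the Overview team used.

```python
import math
import random


def reading_days(pages: int, pages_per_minute: float = 1.0,
                 hours_per_day: float = 8.0) -> float:
    """Days needed for one person to read every page at the given rate."""
    return pages / (pages_per_minute * 60 * hours_per_day)


def estimate_share(labels: list, z: float = 1.96) -> tuple:
    """Return the sample proportion and the half-width of an approximate 95% interval."""
    n = len(labels)
    p = sum(labels) / n
    return p, z * math.sqrt(p * (1 - p) / n)


# 4,500 pages at one page per minute, eight hours per day.
days = reading_days(4500)
print(f"reading time: {days:.1f} days")

# Hypothetical corpus: True marks an item of the event type being counted.
rng = random.Random(0)
corpus = [rng.random() < 0.2 for _ in range(4500)]

# Read and tag only a random sample, then estimate the share in the whole set.
sample = rng.sample(corpus, 100)
share, margin = estimate_share(sample)
print(f"estimated share: {share:.2f} +/- {margin:.2f}")
```

    The point of sampling is that tagging 100 randomly chosen items takes hours rather than days, at the cost of an error margin that shrinks only with the square root of the sample size.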
Tom Johnson

Jigsaw: Visual Analytics for Exploring and Understanding Document Collections - 0 views

  •  
    Be sure to view the video tutorial: http://www.cc.gatech.edu/gvu/ii/jigsaw/Jigsaw-tutorial.mov. See also http://www.cc.gatech.edu/gvu/ii/jigsaw/views.html. Jigsaw: Visual Analytics for Exploring and Understanding Document Collections. System Views: Jigsaw presents the individual reports in a document collection and the entities within those reports through a series of visualizations. We call these visualizations the system views. Below, we illustrate each view provided by the system and briefly describe its characteristics. Click on the individual images to see a larger version of the view. A tutorial video also illustrates the different views, and the interactive behavior for each view can be seen on the video tutorial page. -tj
  •  
    Also see "The Information Interfaces Group, an HCI research group in the School of Interactive Computing at Georgia Tech, develops computing technologies that help people take advantage of information to enrich their lives." http://www.cc.gatech.edu/gvu/ii/