Tom Johnson

RegExr: Free Online RegEx Testing Tool

  •  
    "RegExr is an online tool for editing and testing Regular Expressions (RegExp / RegEx). It provides a simple interface to enter RegEx expressions, and visualize matches in real-time editable source text. It also provides a handy RegExp snippet sidebar with descriptions and usage examples to make it easier to learn Regular Expressions through trial and error. It isn't as powerful as a product like RegExBuddy, but it has the advantage of being online and free. I will be releasing a free desktop version for Mac OSX and Windows built with AIR in the next day or two. So far this has only taken a day of development, and the main app is only 150 lines of code. Flex 3 makes this kind of app so darn simple to put together."
Tom Johnson

The Overview Project » Using Overview to analyze 4500 pages of documents on s...

  •  
    Using Overview to analyze 4500 pages of documents on security contractors in Iraq, by Jonathan Stray on 02/21/2012. This post describes how we used a prototype of the Overview software to explore 4,500 pages of incident reports concerning the actions of private security contractors working for the U.S. State Department during the Iraq war. This was the core of the reporting work for our previous post, where we reported the results of that analysis. The promise of a document set like this is that it will give us some idea of the broader picture, beyond the handful of really egregious incidents that have made headlines. To do this, in some way we have to take into account most or all of the documents, not just the small number that might match a particular keyword search. But at one page per minute, eight hours per day, it would take about 10 days for one person to read all of these documents - to say nothing of taking notes or doing any sort of followup. This is exactly the sort of problem that Overview would like to solve. The reporting was a multi-stage process: (1) splitting the massive PDFs into individual documents and extracting the text; (2) exploration and subject tagging with the Overview prototype; (3) random sampling to estimate the frequency of certain types of events; (4) followup and comparison with other sources.
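    The reading-time arithmetic and the random-sampling step in the excerpt can be sketched roughly as follows. This is only an illustration in Python, not the Overview team's code: the function names, the made-up document labels, and the normal-approximation margin of error are all assumptions.

```python
import math
import random


def reading_days(pages, pages_per_minute=1.0, hours_per_day=8.0):
    """Working days needed to read `pages` at the stated pace."""
    minutes = pages / pages_per_minute
    return minutes / 60.0 / hours_per_day


def estimate_frequency(labels, sample_size, seed=0):
    """Estimate the share of True labels from a simple random sample.

    Returns (estimate, margin), where margin is an approximate 95%
    margin of error based on the normal approximation.
    """
    rng = random.Random(seed)
    sample = rng.sample(labels, sample_size)
    p = sum(sample) / sample_size
    margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p, margin


if __name__ == "__main__":
    # 4,500 pages at one page per minute, eight hours per day.
    print(f"{reading_days(4500):.1f} days")
    # Hypothetical document set: True = document mentions the event of interest.
    docs = [True] * 300 + [False] * 700
    p, margin = estimate_frequency(docs, sample_size=100)
    print(f"estimated share: {p:.2f} +/- {margin:.2f}")
```

    At the stated pace the reading time works out to a little over nine working days, which is consistent with the "about 10 days" in the excerpt.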
Tom Johnson

Making square bar charts in Excel

  •  
    Solving the Pie. By Chris Gemignani, December 14, 2006. Last week I challenged you to reproduce this alternative to pie charts in Excel. I promised a screencast to show how it's done: http://juiceanalytics.com/writing/2006/12/square-pie-screencast/ Eighteen people answered the call with nearly three dozen different solutions. Click here to watch the screencast showing how to accomplish the two most popular solutions: filling cells with conditional formatting and pushing the column chart to extremes. If you want to look at the source, Clint Ivy produced an excellent version of the cell-filling approach.
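    The cell-filling idea can be illustrated outside Excel with a small text grid. The sketch below is an assumption-laden stand-in, not the screencast's method: it assumes a 10 x 10 grid in which each cell stands for 1%, and it prints a letter where Excel would apply a conditional-format colour. The function name and the sample percentages are hypothetical.

```python
def square_pie(shares, size=10):
    """Return a size x size grid of labels, one cell per share unit.

    `shares` maps a label to a whole number of cells; the counts
    must sum to size * size.
    """
    cells = [label for label, count in shares.items() for _ in range(count)]
    if len(cells) != size * size:
        raise ValueError("shares must sum to size * size cells")
    return [cells[row * size:(row + 1) * size] for row in range(size)]


if __name__ == "__main__":
    # Hypothetical percentages; each cell of the 10 x 10 grid is 1%.
    grid = square_pie({"A": 45, "B": 30, "C": 25})
    for row in grid:
        print(" ".join(row))
```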
Tom Johnson

Investigative Reporters and Editors | Listserv archives

  •  
    Listserv archives: IRE and NICAR offer several opportunities for members and even non-members to exchange ideas, information, techniques and war stories. Joining is easy. If you are an IRE member, you may view the list archives: click an archive link and log in with any e-mail address on record with the IRE office. Click "Get Password" on your first visit to receive your LISTSERV password (separate from the IRE website password). Most users will log in with the e-mail used for their IRE login account. Please e-mail listmaster@ire.org if you need help or have any questions. Archives: IRE-L, NICAR-L, IREPLUS-L, CENSUS-L. The following lists are less active: CFIC-L, IRE-EDU-L, IREBC-L.
Tom Johnson

Download PowerPivot - Excel - Office.com

  •  
    Tom Torok (NYT) writes: After years of looking down my nose at Excel because of its limitations, I have to say that I'm very impressed with Excel 2010 when used with a free Microsoft add-in called PowerPivot. http://office.microsoft.com/en-us/excel/download-powerpivot-HA101959985.aspx In a PowerPivot tutorial (link below), I imported eight tables from several sources and joined them - yes, you can join relational data. It uses some magical data compression that allows for lightning-fast sorts, filters and calculated fields. The largest table in the tutorial has about 2 million rows. A calculated field on that table took seconds. I did a pivot table on the table and the answers appeared as soon as I selected the fields. In one of the training videos (http://www.powerpivot.com/) an MS guy works with a 101 million-record table on his laptop. It's really amazing. http://powerpivotsdr.codeplex.com/ If you install, be sure to read the prerequisites or you'll be installing and uninstalling both PowerPivot and Excel. I'm running it on a 32-bit XP machine (it won't run on a 64-bit XP but will work on Windows 7 64-bit). The tutorial is for a Windows 7 setup, but there are items in the menu bar that match the reference to the tutorial's ribbon. I noticed that if I call up an xlsx by double-clicking on a file in Windows Explorer, PowerPivot is not enabled in the ribbon. If you call up a file from within Excel 2010, everything works as advertised. Regards, TT
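    For readers new to the idea of joining relational tables and then summarising a calculated field, here is a rough analogue in Python using the standard-library sqlite3 module. It is not PowerPivot and says nothing about how PowerPivot works internally; the table names, columns, and figures are invented for illustration.

```python
import sqlite3


def build_demo_db():
    """Create two small hypothetical tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stores (store_id INTEGER PRIMARY KEY, region TEXT)")
    conn.execute("CREATE TABLE sales (store_id INTEGER, units INTEGER, unit_price REAL)")
    conn.executemany("INSERT INTO stores VALUES (?, ?)", [(1, "East"), (2, "West")])
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(1, 10, 2.0), (1, 5, 4.0), (2, 3, 10.0)],
    )
    return conn


def totals_by_region(conn):
    """Join sales to stores, then total a calculated field per region."""
    query = """
        SELECT stores.region, SUM(sales.units * sales.unit_price) AS revenue
        FROM sales
        JOIN stores ON stores.store_id = sales.store_id
        GROUP BY stores.region
        ORDER BY stores.region
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for region, revenue in totals_by_region(build_demo_db()):
        print(region, revenue)
```

    The join plays the role of relating the imported tables, `units * unit_price` stands in for a calculated field, and the GROUP BY is the pivot-table step.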
Tom Johnson

Data-Driven Journalism Workshop on EU Spending: Tools & Techniques. Utrecht, 8th-9th Se...

  •  
    Data-Driven Journalism Workshop on EU Spending: Tools & Techniques. Utrecht, 8th-9th September. Posted on August 9, 2011 by Lucy Chambers The following post is by Liliana Bonegru, Project Coordinator at the European Journalism Centre (EJC), and Lucy Chambers, Community Coordinator at the Open Knowledge Foundation. The post announces a joint workshop between the EJC and OKF, focusing on how to get started with data-driven reporting on spending data. This workshop will focus particularly on EU spending data. Interested in data-driven journalism and EU spending? The European Journalism Centre together with the Open Knowledge Foundation is hosting a one and a half day data-driven journalism workshop on EU spending in Utrecht, the Netherlands on 8th-9th September.
Tom Johnson

Data Without Borders | Connecting data science and non-profits in the service of humanity.

  •  
    Data Without Borders seeks to match non-profits in need of data analysis with freelance and pro bono data scientists who can work to help them with data collection, analysis, visualization, or decision support.
  •  
    A good resource to extend the intellectual power and reach of your newsroom.
Tom Johnson

When Maps Shouldn't Be Maps « Matthew Ericson - ericson.net

  •  
    Matthew Ericson: When Maps Shouldn't Be Maps. Often, when you get data that is organized by geography - say, for example, food stamp rates in every county, high school graduation rates in every state, election results in every House district, racial and ethnic distributions in each census tract - the impulse is that since the data CAN be mapped, the best way to present the data MUST be a map. You plug the data into ArcView, join it up with a shapefile, export to Illustrator, clean up the styles and voilà! Instant graphic ready to be published. And in many cases, that's the right call.
Tom Johnson

Interactive charts add heft to your data stories - Online News Association

  •  
    Interactive charts add heft to your data stories Posted Feb. 16 - 10 a.m. in MJ Bear Fellows, Resources by Lucas Timmons Filed under data Data journalism can be very compelling. Stitched with a good narrative, it can tell one amazing story. But we can do better than that. We can also visualize the data and provide a great package. With that in mind, here are three free options for creating animated and interactive charts.
Tom Johnson

Google Language Translation Tools

  •  
    This is the link to Google's Language Tools. The top tool searches with translation (click away from the automatically selected languages): your English words are translated into Arabic, searched against Arabic pages, and the results are returned translated into English (as well as can be expected). http://www.google.com/language_tools It still works. The other tools are listed below it.
Tom Johnson

T-LAB Tools for Text Analysis

  •  
    The all-in-one software for Content Analysis and Text Mining Hello We are pleased to announce the release of T-LAB 8.0. This version represents a major change in the usability and the effectiveness of our software for text analysis. The most significant improvements concern the integration of bottom-up (i.e. unsupervised) methods for exploratory text analysis with top-down (i.e. supervised) approaches for the automated classification of textual units like words, sentences, paragraphs and documents. Among other things, this means that - besides discovering emerging patterns of words and themes from texts - the users can now easily build, apply and validate their models (e.g. dictionaries of categories or pre-existing manual categorizations) both for classical content analysis and for sentiment analysis. For this purpose several T-LAB functionalities have been expanded and a new ergonomic and powerful tool named 'Dictionary-Based Classification' has been added. No specific dictionaries have been built in; however, with some minor re-formatting, lots of resources available over the Internet and customized word lists can be quickly imported. Last but not least, in order to meet the needs of many customers, temporary licenses of the software are now on sale; moreover, without any time limit, the trial mode of the software now allows you to analyse your own texts up to 20 kb in txt format, each of which can include up to 20 short documents. To learn more, use the following link http://www.tlab.it/en/80news.php The Demo, the User's Manual and the Quick Introduction are available at http://www.tlab.it/en/download.php Kind Regards The T-LAB Team web: http://www.tlab.it/ e-mail: info@tlab.it
Tom Johnson

cohuman collaboration tool

  •  
    Who uses Cohuman? Teams: Cohuman is ideal for any group of people that needs to communicate more dynamically and effectively than email or traditional collaboration tools will allow. Startups, Distributed Teams, Small Businesses, Deal Teams, Departments in larger organizations... in short, Cohuman is for any group that requires a solution designed to coordinate people and manage projects more intelligently. Clear Task Ownership: Assigning and tracking tasks is unambiguous. Each team member has their personal responsibilities defined. Transparent Communication: Everyone on the team knows exactly who is doing what - without extra effort. Intelligent Prioritization: Every Task is ranked by Cohuman from the team's inputs in order of priority for people and projects, so the important Tasks get done first. Dynamic Updates: If a Task priority changes, the information is shared automatically with each team member - no status update meetings or emails required. Powerful Email Integration: Cohuman works for everyone on your team. Even those without a Cohuman account can interact with Cohuman via their email.
Tom Johnson

COS 597G: Surveillance and Countermeasures, Fall 2013

  •  
    "COS-597G: Surveillance and Countermeasures (Fall 2013) Course description. This course surveys research on surveillance technologies and technical countermeasures. Readings come mostly from the computer science research literature, with some legal and policy readings to establish context. Course work will include reading and discussion, a few short writing assignments, and a substantial student-chosen course project. The course is designed for students with a solid grounding in computer science. Students unsure of their suitability of the course should contact the instructor. "
Tom Johnson

Free planning tool - download now for free! - PlanningForce has been chosen by a huge ...

  •  
    Express Planner http://www.planningforce-express.com/ Those persons with a yen for project management will want to take a look at Planning Force's Express Planner. The program is designed for those doing work in project management and business, and it gives users the ability to apply calendars to projects and tasks, prioritize items, and create reports. The site includes several tutorials, and it is compatible with computers running Linux and Windows 2000 and newer. [KMG]
Tom Johnson

TransparencyCamp '11 Recap - Sunlight Foundation

  •  
    "TransparencyCamp '11 Recap Nicole Aro May 4, 2011, 11:28 a.m. Sunlight's fourth TransparencyCamp was this past weekend, and I'd like to take this moment to say to all of our attendees: Thank you -- you guys rock. To everyone else, I'm sorry that you missed such an awesome weekend, but we hope to see you next time around! This weekend was made possible by the generosity of our sponsors: Microsoft, Google, O'Reilly, Governing, iStrategyLabs, Forum One, and Adobe. I'd like to say a special thank you to Patrick Svenburg of Microsoft who stayed late to make sure we could finish setup and even helped us carry supplies(!). The weekend brought together about 250 government workers, software developers, investigative journalists, bloggers, students and open government advocates of all stripes to share stories, build relationships, and plan together to take on the challenges of building more open government. This year, TransparencyCamp also went global, bringing in 22 amazing transparency advocates from around the world to teach, learn and share with us here in the states. "
Tom Johnson

Data Visualization Platform, Weave, Now Open Source | Government In The Lab

  •  
    Data Visualization Platform, Weave, Now Open Source. Civic Commons, Contributors (Karl Fogel, Author). With more and more civic data becoming available and accessible, the challenge grows for policy makers and citizens to leverage that data for better decision-making. It is often difficult to understand context and perform analysis. "Weave", however, helps. A web-based data visualization tool, Weave enables users to explore, analyze, visualize and disseminate data online from any location at any time. We saw tremendous potential in the platform and have been helping open-source the software, advising on community engagement strategy and licensing. This week, we were excited to see the soft launch of the Weave 1.0 Beta, which went open-source on Wednesday, June 15. Weave is the result of a broad partnership: it was developed by the Institute for Visualization and Perception Research at the University of Massachusetts Lowell, with support from the Open Indicators Consortium, which is made up of over ten municipal, regional, and state member organizations. This consortium will probably expand now that Weave is open source, leading hopefully to greater collaboration, more development, and further innovation on this important platform. Early-adopter data geeks should give it a spin. One of Weave's key features is high-speed interactivity and responsiveness, which is somewhat unusual in web-based visualization software; try out the demo sites or watch the video below. Our congratulations and thanks to the Weave team! As city management is increasingly data-driven, so data analysis and visualization tools will continue to be an important part of every city manager's toolkit. We are excited to see this evolving toolkit enter the civic commons. http://govinthelab.com/data-visualization-platform-weave-now-open-source
Tom Johnson

Visual.ly | Infographics & Visualizations. Create, Share, Explore

  •  
    Visual.ly - a new tool to create data visualisations. July 28th, 2011. Posted by Sarah Marshall in Data, Design and graphics, Handy tools and technology, Multimedia. Visual.ly is a new platform to allow you to explore and share data visualisations. According to the video below, it is two things: a platform to upload and promote your own visualisations and a space to connect "dataviz pros", advertisers and publishers. Visual.ly has teamed up with media partners, including GigaOM, Mashable and the Atlantic, who each have a profile showcasing their data visualisations. You will soon be able to create your own "beautiful visualisations in minutes" and will "instantly apply the graphics genius of the world's top information designers to your designs", the site promises. Plug and play, then grab and go with our push-button approach to visualisation creation. The sample images are impressive, but journalists will have to wait until they can upload their own data.
Tom Johnson

International Dataset Search

  •  
    International Dataset Search. Description: The TWC International Open Government Dataset Catalog (IOGDC) is a linked data application based on metadata scraped from an increasing number of international dataset catalog websites publishing a rich variety of government data. Metadata extracted from these catalog websites is automatically converted to RDF linked data, re-published via the TWC LOGD SPARQL endpoint, and made available for download. The TWC IOGDC demo site features an efficient, reconfigurable faceted browser with search capabilities, offering a compelling demonstration of the value of a common metadata model for open government dataset catalogs. We believe that the vocabulary choices demonstrated by IOGDC highlight the potential for useful linked data applications to be created from open government catalogs and will encourage the adoption of such a standard worldwide. Warning: This demo will crash IE7 and IE8. Contributors: Eric Rozell, Jinguang Zheng, Yongmei Shi. Live Demo: http://logd.tw.rpi.edu/demo/international_dataset_catalog_search Notes: This is an experimental demo and some queries may take a long time to respond (30-60 seconds). Please refresh this page if the demo does not load. Our metadata model can be accessed here. The procedure for getting and publishing metadata is described here. The RDF dump of the datasets can be downloaded here. International OGD Catalog Search (searching 736,578 datasets).
  •  
    Loads surprisingly quickly. Try entering your favorite search term in the top blue box. You can use quotes to define phrases.
Tom Johnson

The special trick that helps identify dodgy stats | Ben Goldacre | Comment is free | Th...

  • The results were fun. Greece – whose economy has tanked – showed the largest and most suspicious deviation from Benford's law of any country in the euro.
  •  
    if you go to the website testingbenfordslaw.com you'll see the proportions of each leading digit from lots of real-world datasets, graphed alongside what Benford's law predicts they should be, with data from Twitter users' follower counts to the number of books in different libraries across the US
  •  
    The special trick that helps identify dodgy stats Using Benford's law, forensic statisticians can spot suspicious patterns in the raw numbers, and estimate the chances figures have been tampered with
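    The comparison described above, observed leading-digit proportions against what Benford's law predicts, can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the article or on testingbenfordslaw.com: the function names and the test data (powers of 2, which are known to follow Benford's law closely) are assumptions, and no significance test for tampering is attempted.

```python
import math
from collections import Counter


def benford_expected():
    """Benford's law: P(d) = log10(1 + 1/d) for leading digits 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """First non-zero digit of a non-zero number."""
    digits = f"{abs(value):.15e}"  # scientific notation, e.g. '4.5...e+03'
    return int(digits[0])


def leading_digit_shares(values):
    """Observed share of each leading digit among the non-zero values."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


if __name__ == "__main__":
    # Hypothetical data set chosen only because it fits the law well.
    data = [2 ** n for n in range(1, 200)]
    observed = leading_digit_shares(data)
    for d, expected in benford_expected().items():
        print(d, f"observed={observed[d]:.3f}", f"expected={expected:.3f}")
```

    Under the law roughly 30% of values start with 1 and under 5% start with 9; a real data set whose leading digits stray far from these shares is a prompt for closer scrutiny, not proof of manipulation.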
Tom Johnson

Corporate Accountability Data in Influence Explorer - Sunlight Labs: Blog

  •  
    Again, US-centric, but this might generate some ideas of what could be accomplished in your city/nation. Late yesterday we announced a bunch of new features for Influence Explorer: http://sunlightlabs.com/blog/2011/ie-corporate-accountability/ As the blog post explains, you can now find information about a corporation's EPA violations, federal advisory committee memberships, and participation in the rulemaking process -- all in one place. I wanted to highlight that last feature a bit more, though. To my knowledge, this is the first time that the full corpus of public comments submitted to regulations.gov has been available for bulk download and analysis. This isn't a coincidence: regulations.gov is built using technologies that make scraping it unusually difficult. This is unfortunate, since everyone seems to agree that federal rulemakings are gaining in importance -- both because of congressional gridlock that leaves the regulatory process as a second-best option, and because of calls to simplify the regulatory landscape as a pro-growth measure. It's an area where influence is certainly exerted -- rulemakers are obliged to review every comment -- but little attention is paid to who's flooding dockets with comments, and which directions rules are being pushed. It's taken us several months to develop a reliable solution and to obtain past rulemakings, but we now have the data in hand. We plan to do much more with this dataset, and we're hoping that others will want to dig in, too. You can find a link to the bulk download options in the post above -- the full compressed archive of extracted text and metadata is ~16GB, but we've provided options for grabbing individual agencies' or dockets' data. If anyone wants the original documents (PDFs, DOCs, etc) we can talk through how to make that happen, but as they clock in at 1.5TB we'll want to make sure folks know what they're getting into before we spend the time and bandwidth. Finally, note that we currently o