Home/ DJCamp2011/ Group items tagged search

Tom Johnson

Google Correlate - 0 views

  •  
    Google Correlate lets you see how your data relates to search queries. Posted: 25 May 2011 11:27 AM PDT. A while back, Google showed how influenza outbreaks correlated to searches for flu-related terms with Google Flu Trends. It helped researchers and policy-makers estimate flu activity much sooner than with previous methods. Google Correlate is the evolution of Flu Trends in that now you can correlate search trends with not just flu cases, but with your own data or other search queries. The above, which you already know about, matches flu cases with searches for "treatment for flu." Similarly, the search phrase that correlates highest with "Toyota for sale" is "used Hyundai," as shown below. You can also see how your data is related geographically. For example, annual rainfall (left) strongly correlates with searches for "disney vacation package." It looks like distance is a strong factor in the latter, though, which should be a reminder that correlation is different from causation. Google is careful to point this out in their FAQ and explanation of the tool. Nevertheless, it's fun to poke around and sometimes see the nonsensical correlations. For example, the strongest correlation with "flowingdata" is "how to scan a document," because the growth rates of both seem similar. There's also a search-by-drawing function. You draw a time series, and Correlate finds terms that best match that trend. In the chart below, I drew a line (blue) that had steady growth, but plateaued towards the present day. What weird correlations can you find? [Google Correlate]
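    To make the idea concrete, here is a minimal sketch of ranking candidate search-term series by how closely they track a target series, using the Pearson correlation coefficient. The series names and numbers are invented for illustration, and this is only a stand-in for the concept, not a description of how Google Correlate is actually implemented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(target, candidates):
    """Return (name, r) pairs sorted from most to least correlated with target."""
    scored = [(name, pearson(target, series)) for name, series in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical weekly values, made up for this example.
flu_cases = [10, 12, 20, 35, 30, 18]
candidates = {
    "treatment for flu": [8, 11, 19, 33, 31, 17],
    "beach holidays": [30, 28, 20, 10, 12, 25],
}

ranking = rank_by_correlation(flu_cases, candidates)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

    A high coefficient only says the two series move together; as the excerpt notes, it says nothing about why.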
Tom Johnson

BuzzData | Blog - 0 views

  •  
    What is BuzzData? Data should be free-flowing, well-organized and easy to share. Wouldn't it be nice if there was a place where you could store, share and show off your data with just a couple of mouse clicks? BuzzData lets you publish your data in a smarter, easier way. Instead of juggling versions and overwriting files, use BuzzData and enjoy a social network designed for data.
Tom Johnson

Michelle Minkoff » Learning to love…grep (let the computer search text for you) - 0 views

  •  
    Learning to love…grep (let the computer search text for you). Posted by Michelle Minkoff on Aug 9, 2012. I've gotten into the habit of posting daily learnings on Twitter, but some things require a more in-depth reminder. I also haven't done as much paying it forward as I'd like (but I'm having a TON of fun! and dealing with health problems! but mostly fun!). I'd like to try to start posting more helpful tips here, partially as a notebook for myself, and partially to help others with similar issues. Today's problem: I needed to search for a few lines of text, which could be contained in any one of nine files with 100,000 lines each. Opening all of the files took a very long time on my computer, not to mention executing a search. Enter the "grep" command in Terminal, which allows you to quickly search files using the power of the computer.
  •  
    An easy to use method for content analysis
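    For readers who want to see what such a search involves, here is a rough Python sketch of the same idea: scan several files line by line and report each line that contains a phrase. The file names, contents and search phrase are invented for the demo, and this is only an approximation of what grep does (on the command line, something like `grep -n "budget" *.txt` would be the usual route).

```python
from pathlib import Path
import tempfile


def search_files(paths, needle):
    """Yield (file name, line number, line) for every line containing needle."""
    for path in paths:
        # Read line by line so a large file is never loaded into memory at once.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if needle in line:
                    yield Path(path).name, line_number, line.rstrip("\n")


# Demo on two small temporary files (hypothetical contents).
with tempfile.TemporaryDirectory() as folder:
    first = Path(folder) / "a.txt"
    second = Path(folder) / "b.txt"
    first.write_text("alpha\nbudget request\ngamma\n", encoding="utf-8")
    second.write_text("delta\nepsilon\nanother budget line\n", encoding="utf-8")
    matches = list(search_files([first, second], "budget"))

for name, line_number, text in matches:
    print(f"{name}:{line_number}:{text}")
```

    Because the files are streamed rather than opened in an editor, the nine 100,000-line files from the post would not need to be loaded in full.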
Tom Johnson

International Dataset Search - 0 views

  •  
    International Dataset Search. Description: The TWC International Open Government Dataset Catalog (IOGDC) is a linked data application based on metadata scraped from an increasing number of international dataset catalog websites publishing a rich variety of government data. Metadata extracted from these catalog websites is automatically converted to RDF linked data, re-published via the TWC LOGD SPARQL endpoint and made available for download. The TWC IOGDC demo site features an efficient, reconfigurable faceted browser with search capabilities, offering a compelling demonstration of the value of a common metadata model for open government dataset catalogs. We believe that the vocabulary choices demonstrated by IOGDC highlight the potential for useful linked data applications to be created from open government catalogs and will encourage the adoption of such a standard worldwide. Warning: This demo will crash IE7 and IE8. Contributors: Eric Rozell, Jinguang Zheng, Yongmei Shi. Live Demo: http://logd.tw.rpi.edu/demo/international_dataset_catalog_search Notes: This is an experimental demo and some queries may take a long time to respond (30 to 60 seconds). Please refresh the page if the demo does not load. Our metadata model can be accessed here. The procedure for getting and publishing metadata is described here. The RDF dump of the datasets can be downloaded here. International OGD Catalog Search (searching 736,578 datasets).
  •  
    Loads surprisingly quickly. Try entering your favorite search term in top blue box. Can use quotes to define phrases.
Tom Johnson

The Overview Project » Document mining shows Paul Ryan relying on the pro... - 0 views

  •  
    Document mining shows Paul Ryan relying on the programs he criticizes, by Jonathan Stray on 11/02/2012. One of the jobs of a journalist is to check the record. When Congressman Paul Ryan became a vice-presidential candidate, Associated Press reporter Jack Gillum decided to examine the candidate through his own words. Hundreds of Freedom of Information requests and 9,000 pages later, Gillum wrote a story showing that Ryan has asked for money from many of the same Federal programs he has criticized as wasteful, including stimulus money and funding for alternative fuels. This would have been much more difficult without special software for journalism. In this case Gillum relied on two tools: DocumentCloud to upload, OCR, and search the documents, and Overview to automatically sort the documents into topics and visualize the contents. Both projects are previous Knight News Challenge winners. But first Gillum had to get the documents. As a member of Congress, Ryan isn't subject to the Freedom of Information Act. Instead, Gillum went to every federal agency - whose files are covered under FOIA - for copies of letters or emails that might identify Ryan's favored causes, names of any constituents who sought favors, and more. Bit by bit, the documents arrived - on paper. The stack grew over weeks, eventually piling up two feet high on Gillum's desk. Then he scanned the pages and loaded them into the AP's internal installation of DocumentCloud. The software converts the scanned pages to searchable text, but there were still 9,000 pages of material. That's where Overview came in. Developed in house at the Associated Press, this open-source visualization tool processes the full text of each document and clusters similar documents together, producing a visualization that graphically shows the contents of the complete document set. "I used Overview to take these 9,000 pages of documents, and knowing there was probably going to be a lot of garbage or ext
Tom Johnson

FusionTablesLayer Builder - 0 views

  •  
    FusionTablesLayer Builder This wizard helps you create the HTML for a map with a FusionTablesLayer and search element (either text-based search or select menu). After creating your map, you can copy and paste the HTML code in the textarea below to display the map on your own website! Please submit bug reports here: Issue Tracker
  •  
    Click on the "Add another feature" drop-down to add an additional layer or search box.
Tom Johnson

Searchable Map Template with Google Fusion Tables - 0 views

  •  
    Searchable Map Template with Google Fusion Tables. Turn a spreadsheet into a searchable map. You want to put your data on a searchable, filterable map. This is a free, open source tool to help you do it. Features: clean, full-screen layout; mobile and tablet friendly using responsive design (new); address search (with variable radius); geolocation (find me!); RESTful URLs for sharing searches (new); results count (using Google's Fusion Tables API); ability to easily add additional search filters (checkboxes, sliders, etc.); all done with HTML, CSS and JavaScript, with no server-side code required. Technologies used: Google Fusion Tables (useful resources), Google Maps API V3, jQuery, jQuery Address, Twitter Bootstrap. Note: This template now supports the Fusion Tables v1 API. For more info on this, see their migration guide.
Tom Johnson

Zanran Numerical Data Search - 0 views

  •  
    Zanran helps you to find 'semi-structured' data on the web. This is the numerical data that people have presented as graphs and tables and charts. For example, the data could be a graph in a PDF report, or a table in an Excel spreadsheet, or a barchart shown as an image in an HTML page. This huge amount of information can be difficult to find using conventional search engines, which are focused primarily on finding text rather than graphs, tables and bar charts. Put more simply: Zanran is Google for data. Language. English only please... for now. Phrase search. You can use double quotes to make phrases (e.g. "mobile phones"). Booleans. You can use a plus '+' to make a word mandatory, or a minus '-' to exclude it (e.g. +gas -oil production) Vocabulary. We have only limited synonyms - please try different words in your query. And we don't spell-check ... yet.
  •  
    OpenData Open Data
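    The query conventions described above (double quotes for phrases, '+' for mandatory words, '-' for excluded words) can be sketched as a small parser. This is only an illustration of the syntax as the excerpt describes it; the function name and the output layout are assumptions, and nothing here reflects how Zanran itself processes queries.

```python
import re

# An optional +/- sign followed by either a double-quoted phrase or a bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split a query into required, excluded and optional terms."""
    parsed = {"required": [], "excluded": [], "optional": []}
    for sign, phrase, word in TOKEN.findall(query):
        term = phrase or word
        if not term:
            continue  # skip an empty pair of quotes
        if sign == "+":
            parsed["required"].append(term)
        elif sign == "-":
            parsed["excluded"].append(term)
        else:
            parsed["optional"].append(term)
    return parsed


print(parse_query("+gas -oil production"))
print(parse_query('"mobile phones" sales'))
```

    Hyphens inside a word (for example "e-mail") are left alone, because the sign is only recognised at the start of a term.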
Tom Johnson

Spokeo People Search | White Pages | Find People - 0 views

  • What is Spokeo? Spokeo is a people search engine that aggregates white-pages listings and public records.
  •  
    What is Spokeo? Spokeo is a people search engine that organizes vast quantities of white-pages listings, social information, and other people-related data from a large variety of public sources. Our mission is to help people find and connect with others, more easily than ever.
Tom Johnson

Snap Bird - search twitter's history - 0 views

  •  
    Searches an individual's history of posts on Twitter.
Tom Johnson

Google Language Translation Tools - 0 views

  •  
    This is the link to Google's Language Tools. The top tool searches with translation (click away from the automatically selected languages): your English words are translated into Arabic, searched on Arabic pages, and the results are returned translated into English (as well as can be expected). http://www.google.com/language_tools It still works. The other tools are further down the page.
Tom Johnson

The Overview Project » Using Overview to analyze 4500 pages of documents on s... - 0 views

  •  
    Using Overview to analyze 4500 pages of documents on security contractors in Iraq, by Jonathan Stray on 02/21/2012. This post describes how we used a prototype of the Overview software to explore 4,500 pages of incident reports concerning the actions of private security contractors working for the U.S. State Department during the Iraq war. This was the core of the reporting work for our previous post, where we reported the results of that analysis. The promise of a document set like this is that it will give us some idea of the broader picture, beyond the handful of really egregious incidents that have made headlines. To do this, in some way we have to take into account most or all of the documents, not just the small number that might match a particular keyword search. But at one page per minute, eight hours per day, it would take about 10 days for one person to read all of these documents - to say nothing of taking notes or doing any sort of followup. This is exactly the sort of problem that Overview would like to solve. The reporting was a multi-stage process: (1) splitting the massive PDFs into individual documents and extracting the text; (2) exploration and subject tagging with the Overview prototype; (3) random sampling to estimate the frequency of certain types of events; (4) followup and comparison with other sources.
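    The reading-time estimate and the random-sampling step can be worked through in a few lines. The page count and reading pace come from the excerpt; the sample size, the number of flagged documents, and the use of a normal-approximation margin of error are assumptions made here for illustration, since the excerpt does not say how the authors computed their estimates.

```python
import math

# Figures taken from the excerpt.
PAGES = 4500
PAGES_PER_MINUTE = 1
HOURS_PER_DAY = 8

hours = PAGES / PAGES_PER_MINUTE / 60   # 75.0 hours of reading
days = hours / HOURS_PER_DAY            # 9.375 working days, i.e. "about 10"
print(f"{hours:.0f} hours, or about {days:.1f} working days")


def estimate_share(labels, z=1.96):
    """Share of True labels in a random sample, plus a ~95% margin of error
    from the normal approximation to the binomial distribution."""
    n = len(labels)
    if n == 0:
        raise ValueError("sample is empty")
    share = sum(labels) / n
    margin = z * math.sqrt(share * (1 - share) / n)
    return share, margin


# Hypothetical sample: 100 randomly drawn documents, 12 flagged by a reader
# as describing the type of event of interest.
sample_labels = [True] * 12 + [False] * 88
share, margin = estimate_share(sample_labels)
print(f"estimated share: {share:.0%} (plus or minus {margin:.1%})")
```

    The appeal of sampling is that reading 100 documents takes hours rather than days, at the cost of an explicit margin of error on the result.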
Tom Johnson

Open Data Directory - 0 views

  •  
    A free search engine for data sets published by governments, private companies and other organizations. It now indexes 255,180 datasets from many sources.
Tom Johnson

Twiangulate: analyzing the connections between friends and followers - 0 views

  •  
    Who can use Twiangulate? Job seekers: Looking for an "in" at a company or trying to learn which tweeps might influence the hiring manager or boss? Or looking for tweeps at target companies or industries in a certain location? Twiangulate has you covered. Journalists: Want to find hidden relationships or linchpin players in an industry? Maybe you're looking for sources in a microniche? Or who's most followed at a conference? Twiangulate does it all. Tech junkies: Want to know who your most influential Twitter followers are? Love social network mapping? Want to know when two tweeps start following the same person? Twiangulate!
Tom Johnson

8 must-reads detail how to verify information in real-time, from social media, users | ... - 0 views

  •  
    8 must-reads detail how to verify information in real-time, from social media, users. By Craig Silverman. Published Apr. 27, 2012 7:46 am; updated Apr. 27, 2012 9:23 am. Over the past couple of years, I've been trying to collect every good piece of writing and advice about verifying social media content and other types of information that flow across networks. This form of verification involves some new tools and techniques, and requires a basic understanding of the way networks operate and how people use them. It also requires many of the so-called old school values and techniques that have been around for a while: being skeptical, asking questions, tracking down high quality sources, exercising restraint, collaborating and communicating with team members. For example, lots of people talk about how Andy Carvin does crowdsourced verification and turns his Twitter feed into a real time newswire. Lost in the discussion is the fact that Carvin also develops sources and contacts on the ground and stays in touch with them on Skype and through other means. What you see on Twitter is only one part of the process. Some things never go out of style. At the same time, there are new tools, techniques and approaches every journalist should have in their arsenal. Fortunately, several leading practitioners of what I sometimes call the New Verification are gracious and generous about sharing what they know. One such generous lot are the folks at Storyful, a social media curation and verification operation that works with clients such as Reuters, ABC News, and The New York Times, among others. I wrote about them last year and examined how in some ways they act as an outsourced verification service for newsrooms. That was partly inspired by this post from Storyful founder Mark Little: I find it helps to think of curation as three central questions: * Discovery: How do we find valuable social media content? * Verification: How do we make sure we c
Tom Johnson

Playground | Social Analytics For Marketers - 0 views

  •  
    What is it? A social analytics platform which contains over 1,000 days of tweets (all 70 billion of them), Facebook activity and blog posts. How is it of use to journalists? "Journalists can easily develop real-time insights into any story from Playground," PeopleBrowsr UK CEO Andrew Grill explains. Complex keyword searches can be divided by user influence, geolocation, sentiment, and virtual communities of people with shared interests and affinities. These features - and many more - let reporters and researchers easily drill down to find the people and content driving the conversation on social networks on any subject. Playground lets you use the data the way you want to use it. You can either export the graphs and tables that the site produces automatically or export the results in a CSV file to create your own visualisations, which could potentially make it the next favourite tool of data journalists. Grill added: The recent launch of our fully transparent Kred influencer platform will make it faster and easier for journalists to find key influencers in a particular community. You can give Playground a try for the first 14 days before signing up for one of their subscriptions ($19 a month for students and journalists, $149 for organisations and companies).
Tom Johnson

Reporters' Lab // Spotted in St. Louis: Video Notebook sneak peek - 0 views

  •  
    Something that, at least for now, we've dubbed the Video Notebook. Your notes, as well as the sources you've imported, scroll along with the video. Just click on a note and the video jumps to the proper location in the timeline. The lab's lead developer, Charlie Szymanski, is heading up the project. His goal is to create an application to index, search and analyze recorded video by syncing notes and data feeds from sources like Twitter, Storify and live blogs. Essentially, it will allow reporters to save hours of time normally spent wading through video by jumping right to the segments they're looking for. We're hoping a tool like this will be especially helpful to reporters planning to live tweet recorded events, from city council meetings to political stump speeches.
Tom Johnson

Open Data Stories | About - 0 views

  • The challenge: As noted in Open Data Stories’ first story, there are calls from various quarters for more data on the utility of governments releasing data and other material for re-use. The challenge would seem to be this: if people and organisations want governments to continue to invest in open data initiatives, they should jump into the feedback loop and tell governments, and the world, when they are putting open government data to good use. The scale or nature of beneficial use shouldn’t matter. It might be economic, creative, cultural or environmental. Or it could be something else. But tell us your story. Equally, interested stakeholders such as Creative Commons, the Open Knowledge Foundation and the Sunlight Foundation can tell us their stories too, even if that is only drawing us to relevant (and openly licensed) articles that we can repost on Open Data Stories. And, of course, agencies who see the data they steward being put to good use should tell us too. Whatever the case, share your stories with others. The more you do, the richer the feedback loop and that, in turn, is likely to enable open data policies to be better developed and refined and, ultimately, to be sustainable.
  •  
    http://www.zanran.com/ a search engine for data & statistics. Time to open your data, people! #opendata
Tom Johnson

http://theyrule.net - 1 views

  •  
    They Rule. Overview: They Rule aims to provide a glimpse of some of the relationships of the US ruling class. It takes as its focus the boards of some of the most powerful U.S. companies, which share many of the same directors. Some individuals sit on 5, 6 or 7 of the top 1000 companies. It allows users to browse through these interlocking directories and run searches on the boards and companies. A user can save a map of connections complete with their annotations and email links to these maps to others. They Rule is a starting point for research about these powerful individuals and corporations. Context: A few companies control much of the economy and oligopolies exert control in nearly every sector of the economy. The people who head up these companies swap on and off the boards from one company to another, and in and out of government committees and positions. These people run the most powerful institutions on the planet, and we have almost no say in who they are. This is not a conspiracy; they are proud to rule, yet these connections of power are not always visible to the public eye. Karl Marx once called this ruling class a 'band of hostile brothers.' They stand against each other in the competitive struggle for the continued accumulation of their capital, but they stand together as a family supporting their interests in perpetuating the profit system as a whole. Protecting this system can require the cover of a 'legitimate' force - and this is the role that is played by the state. An understanding of this system cannot be gleaned from looking at the inter-personal relations of this class alone, but rather from how they stand in relation to other classes in society. Hopefully They Rule will raise larger questions about the structure of our society and in whose benefit it is run. The Data: We do not claim that this data is 100% accurate at all times. Corporate directors have a habit of dying, quitting boards, joining new ones and most frustratingly passing on their name
  •  
    I think this data must be very useful to the people in Occupy Wall Street
Tom Johnson

DropboxAddons - Dropbox Wiki - 0 views

  •  
    Some interesting DropboxAddons here for most operating systems. Note: In addition to the Addons listed here, there are many other applications available at the Dropbox Apps Page.