Home/ DJCamp2011/ Group items tagged content analysis

Tom Johnson

T-LAB Tools for Text Analysis - 0 views

  •  
    The all-in-one software for Content Analysis and Text Mining Hello We are pleased to announce the release of T-LAB 8.0. This version represents a major change in the usability and the effectiveness of our software for text analysis. The most significant improvements concern the integration of bottom-up (i.e. unsupervised) methods for exploratory text analysis with top-down (i.e. supervised) approaches for the automated classification of textual units like words, sentences, paragraphs and documents. Among other things, this means that - besides discovering emerging patterns of words and themes from texts - the users can now easily build, apply and validate their models (e.g. dictionaries of categories or pre-existing manual categorizations) both for classical content analysis and for sentiment analysis. For this purpose several T-LAB functionalities have been expanded and a new ergonomic and powerful tool named 'Dictionary-Based Classification' has been added. No specific dictionaries have been built in; however, with some minor re-formatting, lots of resources available over the Internet and customized word lists can be quickly imported. Last but not least, in order to meet the needs of many customers, temporary licenses of the software are now on sale; moreover, without any time limit, the trial mode of the software now allows you to analyse your own texts up to 20 kb in txt format, each of which can include up to 20 short documents. To learn more, use the following link http://www.tlab.it/en/80news.php The Demo, the User's Manual and the Quick Introduction are available at http://www.tlab.it/en/download.php Kind Regards The T-LAB Team web: http://www.tlab.it/ e-mail: info@tlab.it
Tom Johnson

Software for Content Analysis - 0 views

  •  
    "Software for Content Analysis: Links to external sites The list below provides links to web sites where one can find information (often including purchasing information) regarding content analysis software as well as other types of software that are often utilized by content analysts. The list was last updated in December 2008. Some links may change. You might also find Will Lowe's Review of Software for Content Analysis useful. "
Tom Johnson

DIVA-GIS | DIVA-GIS: free, simple & effective - 0 views

  • DIVA-GIS is a free computer program for mapping and geographic data analysis (a geographic information system, or GIS). With DIVA-GIS you can make maps of the world, or of a very small area, using, for example, state boundaries, rivers, a satellite image, and the locations of sites where an animal species was observed. We also provide free spatial data for the whole world that you can use in DIVA-GIS or other programs. You can use the discussion forum to ask questions, report problems, or make suggestions. Or contact us, and read the blog entries for the latest news. But first download the program and read the documentation. DIVA-GIS is particularly useful for mapping and analyzing biodiversity data, such as the distribution of species, or other 'point-distributions'. It reads and writes standard data formats such as ESRI shapefiles, so interoperability is not a problem. DIVA-GIS runs on Windows and (with minor effort) on Mac OS X (see instructions). You can use the program to analyze data, for example by making grid (raster) maps of the distribution of biological diversity, to find areas that have high, low, or complementary levels of diversity. You can also map and query climate data, and you can predict species distributions using the BIOCLIM or DOMAIN models.
Tom Johnson

Michelle Minkoff » Learning to love…grep (let the computer search text for you) - 0 views

  • Learning to love…grep (let the computer search text for you). Posted by Michelle Minkoff on Aug 9, 2012. I’ve gotten into the habit of posting daily learnings on Twitter, but some things require a more in-depth reminder. I also haven’t done as much paying it forward as I’d like (but I’m having a TON of fun! and dealing with health problems! but mostly fun!). I’d like to try to start posting more helpful tips here, partially as a notebook for myself, and partially to help others with similar issues. Today’s problem: I needed to search for a few lines of text, which could be contained in any one of nine files with 100,000 lines each. Opening all of the files took a very long time on my computer, not to mention executing a search. Enter the “grep” command in Terminal, which lets you quickly search files using the power of the computer.
  •  
    An easy to use method for content analysis
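
The excerpt above does not show the post's actual commands, so the following is only a rough Python sketch of the same idea: scan several large files line by line and report the matching lines without opening them in an editor. The function name `grep_files` and the plain substring match are assumptions made for illustration; on the command line, something like `grep -n "some phrase" *.txt` does a similar job.

```python
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def grep_files(pattern: str, paths: Iterable[Path]) -> Iterator[Tuple[str, int, str]]:
    """Yield (file name, line number, line text) for every line containing pattern.

    Hypothetical helper for illustration; it does a plain substring match,
    not a regular-expression match.
    """
    for path in paths:
        # Read one line at a time so a 100,000-line file is never held in memory whole.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if pattern in line:
                    yield (Path(path).name, line_number, line.rstrip("\n"))
```

A call such as `list(grep_files("needle", Path("data").glob("*.txt")))` would collect every hit across the files in a (hypothetical) `data` folder.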
Tom Johnson

The Overview Project » Document mining shows Paul Ryan relying on the pro... - 0 views

  •  
    Document mining shows Paul Ryan relying on the programs he criticizes, by Jonathan Stray on 11/02/2012. One of the jobs of a journalist is to check the record. When Congressman Paul Ryan became a vice-presidential candidate, Associated Press reporter Jack Gillum decided to examine the candidate through his own words. Hundreds of Freedom of Information requests and 9,000 pages later, Gillum wrote a story showing that Ryan has asked for money from many of the same Federal programs he has criticized as wasteful, including stimulus money and funding for alternative fuels. This would have been much more difficult without special software for journalism. In this case Gillum relied on two tools: DocumentCloud to upload, OCR, and search the documents, and Overview to automatically sort the documents into topics and visualize the contents. Both projects are previous Knight News Challenge winners. But first Gillum had to get the documents. As a member of Congress, Ryan isn't subject to the Freedom of Information Act. Instead, Gillum went to every federal agency - whose files are covered under FOIA - for copies of letters or emails that might identify Ryan's favored causes, names of any constituents who sought favors, and more. Bit by bit, the documents arrived - on paper. The stack grew over weeks, eventually piling up two feet high on Gillum's desk. Then he scanned the pages and loaded them into the AP's internal installation of DocumentCloud. The software converts the scanned pages to searchable text, but there were still 9,000 pages of material. That's where Overview came in. Developed in-house at the Associated Press, this open-source visualization tool processes the full text of each document and clusters similar documents together, producing a visualization that graphically shows the contents of the complete document set. "I used Overview to take these 9,000 pages of documents, and knowing there was probably going to be a lot of garbage or ext
Tom Johnson

Mining of Massive Datasets - 0 views

  •  
    Mining of Massive Datasets The book has now been published by Cambridge University Press. A hardcopy can be obtained Here. By agreement with the publisher, you can still download it free from this page. Cambridge Press does, however, retain copyright on the work, and we expect that you will acknowledge our authorship if you republish parts or all of it. We are sorry to have to mention this point, but we have evidence that other items we have published on the Web have been appropriated and republished under other names. It is easy to detect such misuse, by the way, as you will learn in Chapter 3. --- Anand Rajaraman (@anand_raj) and Jeff Ullman Downloads Download the Complete Book (340 pages, approximately 2MB) Download chapters of the book: Preface and Table of Contents Chapter 1 Data Mining Chapter 2 Large-Scale File Systems and Map-Reduce Chapter 3 Finding Similar Items Chapter 4 Mining Data Streams Chapter 5 Link Analysis Chapter 6 Frequent Itemsets Chapter 7 Clustering Chapter 8 Advertising on the Web Chapter 9 Recommendation Systems Index
Tom Johnson

Politilines - 0 views

  •  
    Visualizing the words used in the 2011-2012 Republican Primary debates. The method: We collected transcripts from the American Presidency Project at UCSB, categorized them by hand, then ranked lemmatized word-phrases (or n-grams) by their frequency of use. Word-phrases can be made of up to five words. Our ranking algorithm accounts for things such as exclusive word-phrases - meaning, it won't count "United States" twice if it's used in a higher n-gram such as "President of the United States." While still in beta, the mini-app is responsive and easy to use. The next challenge, I think, is to really show what everyone talked about. For example, click on education and you see that Newt Gingrich, Ron Paul, and Rick Perry brought it up. Then roll over the names to see the words each candidate used related to that topic. You get some sense of content, but it's still hard to decipher what each actually said about education.
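
The "exclusive word-phrases" idea described above can be sketched roughly as follows. This is not the Politilines ranking algorithm, which the excerpt does not spell out; it is one possible approach, assuming a greedy longest-match pass over already tokenized text, and it skips lemmatization entirely. The names `rank_exclusive_phrases`, `max_n` and `min_count` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Phrase = Tuple[str, ...]


def ngrams(tokens: Sequence[str], n: int) -> List[Phrase]:
    """Return every run of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rank_exclusive_phrases(
    tokens: Sequence[str], max_n: int = 5, min_count: int = 2
) -> List[Tuple[Phrase, int]]:
    """Rank phrases of 2..max_n words, counting each stretch of text only once.

    Assumption for illustration: a phrase is "frequent" if its raw count
    reaches min_count, and the longest frequent phrase at each position wins.
    """
    # Pass 1: raw counts of every n-gram, used only to decide which phrases qualify.
    raw = Counter()
    for n in range(2, max_n + 1):
        raw.update(ngrams(tokens, n))
    frequent = {gram for gram, count in raw.items() if count >= min_count}

    # Pass 2: greedy longest match, so words inside a longer phrase are not recounted.
    exclusive = Counter()
    i = 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 1, -1):
            gram = tuple(tokens[i:i + n])
            if gram in frequent:
                exclusive[gram] += 1
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no frequent phrase starts here
    return exclusive.most_common()
```

With this sketch, text in which "president of the united states" recurs would credit the five-word phrase and leave "united states" uncounted wherever it only appears inside the longer phrase.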
Tom Johnson

ELAN description | The Language Archive - 0 views

  • ELAN description: ELAN is a professional tool for the creation of complex annotations on video and audio resources. With ELAN a user can add an unlimited number of annotations to audio and/or video streams. An annotation can be a sentence, word or gloss, a comment, translation or a description of any feature observed in the media. Annotations can be created on multiple layers, called tiers. Tiers can be hierarchically interconnected. An annotation can either be time-aligned to the media or it can refer to other existing annotations. The textual content of annotations is always in Unicode and the transcription is stored in an XML format. ELAN provides several different views on the annotations; each view is connected and synchronized to the media playhead. Up to 4 video files can be associated with an annotation document. Each video can be integrated in the main document window or displayed in its own resizable window. ELAN delegates media playback to an existing media framework, like Windows Media Player, QuickTime or JMF (Java Media Framework). As a result a wide variety of audio and video formats is supported and high performance media playback can be achieved. ELAN is written in the Java programming language and the sources are available for non-commercial use. It runs on Windows, Mac OS X and Linux.
Tom Johnson

Shorenstein Center paper argues for collaboration in investigative reporting | Harvard ... - 0 views

  • Shorenstein Center paper argues for collaboration in investigative reporting. Thursday, June 2, 2011. Sandy Rowe, former editor of The Oregonian, and Knight Fellow at the Shorenstein Center fall 2010 and spring 2011. Photograph by Martha Stewart. Shorenstein Center, Harvard Kennedy School. Contact: Janell Sims, janell_sims@harvard.edu, http://www.hks.harvard.edu/presspol/index.html Media organizations may be able to perform their watchdog roles more effectively working together than apart. That is one conclusion in a new paper, “Partners of Necessity: The Case for Collaboration in Local Investigative Reporting,” authored by Sandy Rowe, former editor of Portland’s The Oregonian. The paper is based on interviews and research that Rowe conducted while serving as a Knight Fellow at the Shorenstein Center on the Press, Politics and Public Policy at Harvard Kennedy School. Rowe’s research examines the theory underpinning collaborative work and shows emerging models of collaboration that can lead to more robust investigative and accountability reporting in local and regional markets. “Growing evidence suggests that collaborations and partnerships between new and established news organizations, universities and foundations may be the overlooked key for investigative journalism to thrive at the local and state levels,” Rowe writes. “These partnerships, variously and often loosely organized, can share responsibility for content creation, generate wider distribution of stories and spread the substantial cost of accountability journalism.” Rowe was editor of The Oregonian from 1993 until January 2010. Under her leadership, the newspaper won five Pulitzer Prizes including the Gold Medal for Public Service. Rowe chairs the Board of Visitors of The Knight Fellowships at Stanford University and is a board member of the Committee to Protect Journalists.
From 1984 until April 1993, Rowe was executive editor and vice president of The Virginian-Pilot and The Ledger-Star, Norfolk and Virginia Beach, Virginia. The Virginian-Pilot won the Pulitzer Prize for general news reporting under her leadership. Rowe’s year-long fellowship at the Shorenstein Center was funded by the John S. and James L. Knight Foundation. Read the full paper on the Shorenstein Center’s website.
Tom Johnson

The Overview Project - 0 views

  •  
    How Overview turns Documents into Pictures, by Jonathan Stray on 06/04/2012. Overview produces intricate visualizations of large document sets - beautiful, but what do they mean? These visualizations are saying something about the documents, which you can interpret if you know a little about how they're plotted. There are two visualizations in the current prototype version of Overview, and both are based on document clustering.
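
To give a feel for what "document clustering" can mean in practice, here is a minimal sketch that groups texts by the cosine similarity of their TF-IDF word vectors. It is an assumption-laden illustration, not Overview's implementation: the weighting, the single-pass grouping, the `threshold` value and all function names are chosen here for brevity.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Turn each document into a sparse {term: weight} vector (whitespace tokens)."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    return [
        {term: count * math.log(n_docs / doc_freq[term])
         for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs: List[str], threshold: float = 0.2) -> List[List[int]]:
    """Single-pass grouping: a document joins the first cluster whose
    first member is at least `threshold` similar, else starts a new one."""
    vectors = tfidf_vectors(docs)
    clusters: List[List[int]] = []
    for index, vector in enumerate(vectors):
        for members in clusters:
            if cosine(vector, vectors[members[0]]) >= threshold:
                members.append(index)
                break
        else:
            clusters.append([index])
    return clusters
```

The result is a list of clusters, each holding the positions of the documents it contains; a real tool would add better tokenization and a more careful clustering method.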