Home/ DJCamp2011/ Group items tagged is

Tom Johnson

Data Science Central - 0 views

  •  
    Welcome to Data Science Central! Data Science Central is the industry's one stop resource for big data practitioners. From Analytics to Data Integration to Visualization, Data Science Central (DSC) provides a true community experience through social interaction, peer to peer technical support, the latest in technology, tools and trends --and even job opportunities. We look forward to hearing your feedback as we grow this community of professionals in our exciting industry during times of dramatic change.
Tom Johnson

Mining of Massive Datasets - 0 views

  •  
    Mining of Massive Datasets. The book has now been published by Cambridge University Press. A hardcopy can be obtained Here. By agreement with the publisher, you can still download it free from this page. Cambridge Press does, however, retain copyright on the work, and we expect that you will acknowledge our authorship if you republish parts or all of it. We are sorry to have to mention this point, but we have evidence that other items we have published on the Web have been appropriated and republished under other names. It is easy to detect such misuse, by the way, as you will learn in Chapter 3. --- Anand Rajaraman (@anand_raj) and Jeff Ullman. Downloads: Download the Complete Book (340 pages, approximately 2MB). Download chapters of the book: Preface and Table of Contents; Chapter 1, Data Mining; Chapter 2, Large-Scale File Systems and Map-Reduce; Chapter 3, Finding Similar Items; Chapter 4, Mining Data Streams; Chapter 5, Link Analysis; Chapter 6, Frequent Itemsets; Chapter 7, Clustering; Chapter 8, Advertising on the Web; Chapter 9, Recommendation Systems; Index.
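    The authors' remark about detecting republished text points at Chapter 3, "Finding Similar Items". As a rough illustration only, and not the book's own code, here is a minimal Python sketch of one common way to compare two texts: break each into overlapping character k-shingles and compute the Jaccard similarity of the two sets. The function names `shingles` and `jaccard` and the default k=5 are assumptions made for this example.

```python
def shingles(text: str, k: int = 5) -> set[str]:
    """Return the set of overlapping k-character substrings of the text."""
    # Normalize case and collapse whitespace so trivial edits do not matter.
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # Two empty sets are treated as identical here.
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    original = "It is easy to detect such misuse, as you will learn."
    copied = "It is easy to detect such misuse,  as you WILL learn."
    unrelated = "Maps tell a story, see what you're missing."
    print(jaccard(shingles(original), shingles(copied)))
    print(jaccard(shingles(original), shingles(unrelated)))
```

    A score near 1 suggests one text was largely copied from the other; a score near 0 suggests they are unrelated. Comparing every pair of documents this way does not scale to a large collection, so treat this as a way to see the idea rather than a production approach.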
Tom Johnson

COS 597G: Surveillance and Countermeasures, Fall 2013 - 0 views

  •  
    "COS-597G: Surveillance and Countermeasures (Fall 2013) Course description. This course surveys research on surveillance technologies and technical countermeasures. Readings come mostly from the computer science research literature, with some legal and policy readings to establish context. Course work will include reading and discussion, a few short writing assignments, and a substantial student-chosen course project. The course is designed for students with a solid grounding in computer science. Students unsure of their suitability for the course should contact the instructor."
Tom Johnson

Investigative Reporters and Editors | Listserv archives - 0 views

  •  
    Listserv archives IRE and NICAR offer several opportunities for members and even non-members to exchange ideas, information, techniques and war stories. Joining is easy. If you are an IRE member, you may view the list archives: * Click an archive link and login with any e-mail address on record with the IRE office. Click "Get Password" if your first visit, to receive your LISTSERV password (separate from the IRE website password). Most users will login with the e-mail used for their IRE login account. Please e-mail listmaster@ire.org if you need help or have any questions. IRE-L archives. NICAR-L archives. IREPLUS-L archives. CENSUS-L archives. The following lists are less active: CFIC-L archives IRE-EDU-L archives IREBC-L archives
Tom Johnson

We Just Ran Twenty-Three Million Queries of the World Bank's Website - Working Paper 36... - 0 views

  •  
    "Abstract Much of the data underlying global poverty and inequality estimates is not in the public domain, but can be accessed in small pieces using the World Bank's PovcalNet online tool. To overcome these limitations and reproduce this database in a format more useful to researchers, we ran approximately 23 million queries of the World Bank's web site, accessing only information that was already in the public domain. This web scraping exercise produced 10,000 points on the cumulative distribution of income or consumption from each of 942 surveys spanning 127 countries over the period 1977 to 2012. This short note describes our methodology, briefly discusses some of the relevant intellectual property issues, and illustrates the kind of calculations that are facilitated by this data set, including growth incidence curves and poverty rates using alternative PPP indices. The full data can be downloaded at www.cgdev.org/povcalnet. "
Tom Johnson

Open Data Stories | About - 0 views

  • The challenge: As noted in Open Data Stories’ first story, there are calls from various quarters for more data on the utility of governments releasing data and other material for re-use. The challenge would seem to be this: if people and organisations want governments to continue to invest in open data initiatives, they should jump into the feedback loop and tell governments, and the world, when they are putting open government data to good use. The scale or nature of beneficial use shouldn’t matter. It might be economic, creative, cultural or environmental. Or it could be something else. But tell us your story. Equally, interested stakeholders such as Creative Commons, the Open Knowledge Foundation and the Sunlight Foundation can tell us their stories too, even if that is only drawing us to relevant (and openly licensed) articles that we can repost on Open Data Stories. And, of course, agencies who see the data they steward being put to good use should tell us too. Whatever the case, share your stories with others. The more you do, the richer the feedback loop and that, in turn, is likely to enable open data policies to be better developed and refined and, ultimately, to be sustainable.
  •  
    http://www.zanran.com/ a search engine for data & statistics. Time to open your data, people! #opendata
Tom Johnson

Javascript used to display Business Database Search from The Dallas Morning News - 0 views

  •  
    Daniel Lathrop: Wanted to share with all of you my latest installment in my ongoing love affair with Google Fusion Tables, the Dallas publicly-traded companies list. http://newsapps.dallasnews.com/media/dfw-public-companies.html I got the data from the biz desk on Thursday and wrote this little thing using JQuery, JQueryUI and FusionTables pretty quickly. And before everyone gets all "but you could have used [Caspio, TableSetter, Rails, PHP, Ilene, etc.]" on me, I know I could have. But doing this with Fusion Tables let me do all my work on the client side and let me create the user-experience I wanted. Plus, I now have a starting place to do this for any similar Fusion Tables project. For the curious, the Javascript can be found here: http://newsapps.dallasnews.com/media/fusiondmn.pubcompanies.js It's fewer than 150 lines, and more than a quarter of that is my Javascript for rendering integer/floating point #s in newsroom style (e.g. $4.2 billion). I'm hoping to turn it into a robust tool for deploying searchable data with Fusion Tables and am going to ask my corporate overlords to let me open source it once I've done some refactoring to make it generally applicable. Critiques welcome. -Daniel --------------------------- Daniel Lathrop 206.718.0349 (cell)
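    For readers curious what "newsroom style" number rendering might involve, here is a minimal sketch in Python. It is not Daniel Lathrop's Javascript, which is not reproduced here; the function name `newsroom_number`, the `prefix` parameter, and the rounding rule (one decimal place, trailing ".0" dropped) are all assumptions chosen to match the "$4.2 billion" example in his note.

```python
def newsroom_number(value: float, prefix: str = "$") -> str:
    """Render a number in a newsroom-like style, e.g. '$4.2 billion'.

    Hypothetical helper: the thresholds and rounding are assumptions,
    not taken from the Dallas Morning News script.
    """
    units = [(1e12, "trillion"), (1e9, "billion"), (1e6, "million")]
    sign = "-" if value < 0 else ""
    magnitude = abs(value)
    for threshold, word in units:
        if magnitude >= threshold:
            scaled = round(magnitude / threshold, 1)
            # Drop a trailing ".0" so 3,000,000 reads "3 million".
            text = f"{scaled:.1f}".rstrip("0").rstrip(".")
            return f"{sign}{prefix}{text} {word}"
    # Below one million, fall back to comma-grouped whole numbers.
    return f"{sign}{prefix}{magnitude:,.0f}"


if __name__ == "__main__":
    for amount in (4_200_000_000, 3_000_000, 12_500):
        print(newsroom_number(amount))
```

    Passing `prefix=""` gives plain counts rather than dollar amounts. Values just under a threshold (for example 999,950,000) can round up to "1000 million", so a real tool would likely need an extra check there.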
Tom Johnson

MDA Analytics - 0 views

  •  
    An interesting example of yet another "next generation" data analysis and presentation tool. You can see the demos at http://www.lavastorm.com/ Emphasis is on visualizing the data analytic method while doing the analysis.
Tom Johnson

Benetech® :: Human Rights :: Overview - 0 views

  •  
    We are committed to equal access to technology. Our software is freely available, and anyone may share our technology and modify it to suit their needs - all without asking our permission. Benetech created Martus and Analyzer specifically for human rights data collection, coding and processing. These tools include cryptographic security features and flexible data structures that can be adapted to the needs of each human rights project. By releasing our software as open source, we participate in the technological community where tools can be audited and improved by others, as well as enabling widespread access to our ideas.
Tom Johnson

Redliner - Solve the Frustrations of Document Collaboration and Approval - 0 views

  •  
    Redliner, a recent addition to the SaaS field, takes the concept one step further. A "next generation" online collaboration tool, Redliner goes beyond establishing a shared work space "in the cloud"-where individuals can access, edit and comment on documents-and adds innovative workflow features that actually get important documents completed faster. Surpassing the capabilities of current online word processors, such as Google™ Docs, Zoho® Writer and Adobe® Buzzword®, Redliner abolishes the grunt work inherent in the document collaboration process. When several individuals are working on a single document and accessing various versions, how many times do they find themselves asking, "What has changed?", "Is this the latest version?" or "What do I need to respond to?"
Tom Johnson

Google refine basic: Full Tutorial by David Huynh - 0 views

  •  
    Google Refine is a power tool for working with messy data, primarily for * detecting and fixing inconsistencies * transforming data from one structure or format to another * connecting names within your data to name registries (databases) Use Google Refine when you need something ... * more powerful than a spreadsheet * more interactive and visual than scripting * more provisional / exploratory / experimental / playful than a database
Tom Johnson

BatchGeo - 0 views

  •  
    The web site batchgeo.com provides an easy-to-use web interface for creating interactive Google maps. If you have names and addresses and other information, you can quickly create an online Google map with up to 2500 points. "Maps tell a story, see what you're missing BatchGeo is simply the fastest way to create google maps from your address lists. It accepts addresses, intersections, cities, states, and postal codes. We do the hard work of figuring out where all your data lives in the real world."
Tom Johnson

An Applied Demography Toolbox - 1 views

  •  
    An Applied Demography Toolbox: A collection of applied demography programs, scripts, spreadsheets and databases. If you have any questions (such as how to apply the tools to your own work), recommendations or additions, you can send a message to me (Eddie Hunsinger) at edynivn@gmail.com. If you would like to use, share or reproduce any information or ideas from the linked files, be sure to cite the respective source. http://www.demog.berkeley.edu/~eddieh/toolbox.html#MedianCalculator
  •  
    These tend to be US-centric, but there are universal tools here for statistical analysis.
Tom Johnson

Suggestions (but not standards) for live tweeting « The Buttry Diary - 0 views

  •  
    Suggestions (but not standards) for live tweeting September 6, 2011 by Steve Buttry "Do you know of any standards for content of live tweets?" a commenter asked on my blog recently. "I have students live tweet meetings and speeches. Would love some specific guidelines for what makes a good tweet," asked Michele Day, who teaches journalism at Northern Kentucky University. I know of no such standards. And if I did, I'd probably react that "standards" for a developing pursuit such as live-tweeting might be a bit rigid. This is a new technique and we are learning about it as we do it. I don't want standards to inhibit our development and experimentation with the technique. My standards would be the standards of good reporting: Be accurate, fair, interesting and engaging. https://stevebuttry.wordpress.com/2011/09/06/suggestions-but-not-standards-for-live-tweeting
Tom Johnson

OpenAustralia.org: Are your Representatives and Senators working for you in Australia's... - 0 views

  •  
    OpenAustralia.org is a non-partisan website run by a charity, the OpenAustralia Foundation and volunteers. It aims to make it easy for people to keep tabs on their representatives in Parliament.
Tom Johnson

10 ways to screw up your spreadsheet design | TechRepublic - 0 views

  •  
    10 ways to screw up your spreadsheet design. By Susan Harkins, June 23, 2011, 8:25 AM PDT. Takeaway: How you set up a spreadsheet determines its efficiency, usability, and reliability. Avoiding these pitfalls during the design phase will save you a million headaches. Wrong references, missing values, and invalid data aren't the only things that will ruin a spreadsheet. The development process starts before you do a thing, while you're planning the design. These types of mistakes are worse than bugs because you can't troubleshoot them. All you can do is start over. Here are 10 mistakes to avoid early in the process, when you're still in the decision-making phase.
  •  
    A good list. Read down into the comments, too; there are additional good tips there.
Tom Johnson

Open Data Cook Book - 0 views

  •  
    Open Data Cook Book: Making Open Data Accessible for Everyone. About the Cook Book: The open data cook book is collecting recipes for ways to find and use open data, particularly open data of social value - such as open government data, or open data for campaigners and charities. Working with data can seem scary. But it doesn't have to be. There are many different ways to make data useful - and lots of different gadgets to help you. Take a look at the growing list of cook book recipes to find simple step by step ideas for making use of open data. Recipes: You can find a list of the recipes so far here. Drafts, ideas and notes: In the cook's notebook you can find draft notes on using different datasets and sketches that might develop into recipes in future. Get Involved: Find out how to get involved here or jump right in and create a recipe. Tweet with the #opendatacookbook tag, or bookmark content on del.icio.us 'opendatacookbook' to share with the project. Join the mailing list to discuss developments. Update: After a brief experiment with Drupal as a CMS for the cook book - we've switched to DokuWiki for a bit to make compiling a list of recipes a lot easier before we work out the best way to run the Cook Book.
Tom Johnson

Eurostat - 0 views

  •  
    Eurostat was established in 1953 to meet the requirements of the Coal and Steel Community. Over the years its task has broadened and when the European Community was founded in 1958 it became a Directorate-General (DG) of the European Commission. Eurostat's key role is to supply statistics to other DGs and supply the Commission and other European Institutions with data so they can define, implement and analyse Community policies. The result: Eurostat offers a whole range of important and interesting data that governments, businesses, the education sector, journalists and the public can use for their work and daily life. With the development of Community policies, Eurostat's role has changed. Today, collecting data for EMU and developing statistical systems in candidate countries for EU membership are more important than ten years ago.
Tom Johnson

Learn to code | Codecademy - 0 views

  •  
    Learn to code Codecademy is the easiest way to learn how to code. It's interactive, fun, and you can do it with your friends.
Tom Johnson

Code to make legend in Google Map - 0 views

  •  
    Ties in with Google Fusion.
  •  
    A handy cut-and-paste HTML script that is easily customized