Scraping for Journalism: A Guide for Collecting Data - ProPublica
awqi zar on 06 Jan 11
Most of the techniques are within the ability of the moderately experienced programmer. The most difficult-to-scrape site was actually a previous Adobe Flash incarnation of Eli Lilly's disclosure site. Lilly has since released their data in PDF format.