sensemaking: Group items tagged crawler

Jack Park

Technology Review: A Web Spider for Everyone

  • A user can start a Web crawl through 80legs's Web-based interface. The form on the company's site lets them set parameters for the project and upload custom code needed to control how the crawler does its job. For example, a user might want the crawler to find images and check them against a database of copyrighted ones. Deysarkar says his company's crawlers are capable of processing up to two billion pages a day. The company charges $2 for every million pages crawled, plus a fee of three cents per hour of processing used.
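
The pricing quoted above ($2 per million pages crawled, plus three cents per hour of processing) can be turned into a rough cost estimate. The sketch below is only an illustration: the function name `estimate_crawl_cost` and its parameters are hypothetical, and it assumes simple linear pricing with no minimum charge, rounding, or volume discount, none of which the excerpt specifies.

```python
# Rates as quoted in the excerpt; everything else here is an assumption.
PRICE_PER_MILLION_PAGES = 2.00    # dollars per million pages crawled
PRICE_PER_PROCESSING_HOUR = 0.03  # dollars per hour of processing used


def estimate_crawl_cost(pages: int, processing_hours: float) -> float:
    """Return an estimated crawl cost in dollars.

    Assumes both charges scale linearly, which the excerpt does not confirm.
    """
    if pages < 0 or processing_hours < 0:
        raise ValueError("pages and processing_hours must be non-negative")
    page_cost = pages / 1_000_000 * PRICE_PER_MILLION_PAGES
    processing_cost = processing_hours * PRICE_PER_PROCESSING_HOUR
    return page_cost + processing_cost


if __name__ == "__main__":
    # Example: 10 million pages and 100 hours of processing.
    print(f"${estimate_crawl_cost(10_000_000, 100):.2f}")
```

Under these assumptions the page charge dominates for most jobs: 10 million pages cost $20, while 100 hours of processing add only $3.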
Jack Park

Heritrix - Home Page

  • Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.