Arquitectura?: Group items tagged web-crawling

Pablo Lalloni

PuerkitoBio/fetchbot

  • "A simple and flexible web crawler that follows the robots.txt policies and crawl delays."
Pablo Lalloni

Metascraper

  • "A Scala Library for Scraping Page Metadata. Scraping metadata (e.g. title, description, url, etc.) from a URL is something that Facebook currently does for you when you paste a URL into the 'Update Status' box. For a service that I'm currently building out, we wanted to do this as well for our users. Thus Metascraper was born. There was already a Ruby solution called link_thumbnailer, but since this is an I/O-heavy operation, I knew I wanted to build a solution using tools that supported non-blocking I/O and could be used without getting caught in callback spaghetti. Scala, Akka, and the Play framework immediately came to mind."