Tabula really is a wonderful tool for extracting data from tables in PDFs. It’s a locally hosted web app that allows you to

1. Select one or more PDFs with the data you want.
2. Identify the area of the page from which to extract the data.
3. Save the data in CSV, TSV, or JSON format.

I gave Tabula a try on the same PDF tables I wrote about last night, and it worked perfectly. You may recall that I didn’t like the column headings in the original table. Well, Tabula let me drag a rectangle to select just the data portion of the table, leaving the stuff I didn’t want out of the extracted CSV file.
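
Because the selection leaves the headings out, the exported CSV starts straight in on the data, so you may want to attach headings of your own afterward. Here is a minimal sketch of one way to do that with Python's standard `csv` module. The function name `read_headerless_csv`, the column names, and the sample rows are all made up for illustration; they are not from the PDF discussed here, and this is not something Tabula itself provides.

```python
import csv
import io


def read_headerless_csv(text, headings):
    """Parse header-free CSV text and pair each row with the given headings.

    `headings` is a hypothetical list of column names chosen by the reader.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            # Skip blank lines that sometimes trail an export.
            continue
        if len(record) != len(headings):
            raise ValueError(
                f"expected {len(headings)} columns, got {len(record)}: {record}"
            )
        # Trim stray whitespace around each cell before storing it.
        rows.append(dict(zip(headings, (cell.strip() for cell in record))))
    return rows


# Made-up sample standing in for a CSV saved from Tabula.
sample = "Widget A,12,3.50\nWidget B,7,4.25\n"
rows = read_headerless_csv(sample, ["item", "quantity", "unit_price"])
for row in rows:
    print(row)
```

For a real export you would read the saved file's contents and pass them in place of `sample`. All values come back as strings, so any numeric conversion is left to the caller.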