Then we put the data up, but the problem with Solr was that it didn’t have a user interface, so we used Project Blacklight, which is open source software normally used by librarians. We used it for the journalists. It’s simple because it lets you do faceted search: for example, you can facet by the folder structure of the leak, by year, or by type of file. There were more complex things too. It supports regular expression queries, so the more advanced users were able to search for documents containing a certain pattern of numbers, such as the one passports use. You could also preview and download the documents. ICIJ open-sourced the code of our document processing chain, created by our web developer Matthew Caruana Galizia.
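To make the faceting and regular expression search concrete, here is a minimal sketch of how such a request to Solr might be assembled. It is not ICIJ's code. The field names (`text`, `folder`, `year`, `file_type`) and the passport-like pattern are assumptions made up for illustration; only the `q`, `facet`, `facet.field`, `rows` and `wt` parameters and the `field:/regex/` query syntax are standard Solr.

```python
from urllib.parse import urlencode


def build_faceted_regex_query(field, pattern, facet_fields, rows=10):
    """Build a Solr query string combining a regex query with facets.

    `field`, `pattern` and `facet_fields` are hypothetical examples;
    a real index would use its own schema.
    """
    params = [
        # Lucene/Solr regex queries are written as field:/pattern/
        # and are matched against individual indexed terms.
        ("q", f"{field}:/{pattern}/"),
        ("facet", "true"),
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    # facet.field may be repeated, once per field to facet on.
    params.extend(("facet.field", name) for name in facet_fields)
    return urlencode(params)


# Assumed pattern: two letters followed by seven digits.
query_string = build_faceted_regex_query(
    "text", "[a-z]{2}[0-9]{7}", ["folder", "year", "file_type"]
)
print(query_string)
```

The resulting string would be appended to a core's `/select` endpoint. Blacklight builds requests of this kind behind its search form, so journalists never had to write them by hand.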
We also developed a batch-searching feature. So say you were looking for politicians in your country: you upload your list to Blacklight, run it through the system, and get a CSV back saying yes, there are matches for these names. Not only exact matches, but also matches based on proximity. So you would say “I want Mar Cabra proximity 2” and that would give you “Mar Cabra,” “Mar whatever Cabra,” and “Cabra, Mar.” That was good, because very quickly journalists were able to see: I have this list of politicians and they are in the data!
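The batch search described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ICIJ implementation: it assumes the proximity search maps onto the Lucene/Solr phrase-slop syntax (`"Mar Cabra"~2`), and `count_hits` is a hypothetical callable standing in for a real search backend. The hit counts in the demo are placeholders, not real data.

```python
import csv
import io


def proximity_query(name, slop=2):
    """Turn a name into a phrase query with a proximity (slop) value."""
    # Escape backslashes and quotes so the name stays one quoted phrase.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"~{slop}'


def batch_search(names, count_hits, slop=2):
    """Run one proximity query per name and return the results as CSV text.

    `count_hits` is a hypothetical function: query string -> number of hits.
    """
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["name", "query", "hits"])
    for name in names:
        query = proximity_query(name, slop)
        writer.writerow([name, query, count_hits(query)])
    return out.getvalue()


# Stand-in for a real index; the count is a made-up placeholder.
fake_index = {'"Mar Cabra"~2': 3}
report = batch_search(
    ["Mar Cabra", "Jane Doe"], lambda q: fake_index.get(q, 0)
)
print(report)
```

With a slop of 2, a Lucene-style phrase query tolerates one intervening word or a swap of two adjacent words, which lines up with the “Mar whatever Cabra” and “Cabra, Mar” examples.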
Not Alone: Cooperative and Trade Union Solutions for Freelancers - Shareable - 0 views
Maybe It's Time to Trust Microsoft -- Maybe Not | FOSS Force - 0 views
You Will Never Kill Piracy, and Piracy Will Never Kill You - 0 views
Netflix Disappears From MPAA's 'Legal' Movie Search Engine - TorrentFreak - 1 views
Learn how to calculate ROI for open hardware projects | Opensource.com - 0 views
The People and Tech Behind the Panama Papers - Features - Source: An OpenNews project - 0 views
Sorry for the SPAM - 1 views
OK, panic-newly evolved ransomware is bad news for everyone | Ars Technica UK - 0 views
Massive EU data protection overhaul finally approved | Ars Technica UK - 0 views
The importance of technical terms used in copyright licenses | Opensource.com - 2 views
RIAA Says YouTube is Running a DMCA Protection Racket - TorrentFreak - 0 views
48% of people who buy vinyl don't listen to the records | What Hi-Fi? - 0 views
HTML5.1 begins to take shape on GitHub | InfoWorld - 0 views
De Microsoft a Google pasando por Amazon: la guerra digital de Europa contra los gigant... - 0 views
Support for huge transatlantic trade deal TTIP plummets in both US and Germany | Ars Te... - 0 views
How Big Is Your Target? - Freedom Penguin - 0 views
Hungry for Authentic Journalism? Picture This! | Jerry Ashton | LinkedIn - 0 views