Group items tagged graph - Future of the Web

PT's blog » Compound documents in ICE and beyond: referencing parts of things - 0 views

ptsefton.com/...eferencing-parts-of-things.htm

compound-documents parallax pilcrow

shared by Gary Edwards on 30 Aug 08 - Cached

Gary Edwards on 30 Aug 08

Ben O'Steen has put up some thoughts on what he refers to as 'compound' documents and how to store them in repositories and allow for referencing of parts of a document, such as a table, a graph or even a paragraph. Why did I add the scare quotes to compound? While to a computer scientist a research paper with its graphs and tables and paragraphs might be compound, I suspect most authors tend to think of a research article as a single entity. Until we start giving them access to services that make it clear that it's not monolithic, that is. As background, Ben gives four rules: Note that the four rules of the web (well, of Linked Data technically) are in essence: * give everything a name, * make that name a URL … * which results in data about that thing, * and have it link to other related things.

<div class="cArrow"> </div><div class="cContentInner">Ben O'Steen has put up some thoughts on what he refers to as 'compound' documents and how to store them in repositories and allow for referencing of parts of a document, such as a table, a graph or even a paragraph. Why did I add the scare quotes to compound? While to a computer scientist a research paper with its graphs and tables and paragraphs might be compound, I suspect most authors tend to think of a research article as a single entity. Until we start giving them access to services that make it clear that it's not monolithic, that is. As background, Ben gives four rules: Note that the four rules of the web (well, of Linked Data technically) are in essence: * give everything a name, * make that name a URL … * which results in data about that thing, * and have it link to other related things. </div>

...

Cancel

NSA contractors use LinkedIn profiles to cash in on national security | Al Jazeera America - 0 views

america.aljazeera.com/...tractors-linkedinprofiles.html

surveillance state NSA NSA-methods spying-on-spies

shared by Paul Merrell on 30 May 14 - No Cached

NSA spies need jobs, too. And that is why many covert programs could be hiding in plain sight. Job websites such as LinkedIn and Indeed.com contain hundreds of profiles that reference classified NSA efforts, posted by everyone from career government employees to low-level IT workers who served in Iraq or Afghanistan. They offer a rare glimpse into the intelligence community's projects and how they operate. Now some researchers are using the same kinds of big-data tools employed by the NSA to scrape public LinkedIn profiles for classified programs. But the presence of so much classified information in public view raises serious concerns about security — and about the intelligence industry as a whole. “I’ve spent the past couple of years searching LinkedIn profiles for NSA programs,” said Christopher Soghoian, the principal technologist with the American Civil Liberties Union’s Speech, Privacy and Technology Project.
...

Cancel
On Aug. 3, The Wall Street Journal published a story about the FBI’s growing use of hacking to monitor suspects, based on information Soghoian provided. The next day, Soghoian spoke at the Defcon hacking conference about how he uncovered the existence of the FBI’s hacking team, known as the Remote Operations Unit (ROU), using the LinkedIn profiles of two employees at James Bimen Associates, with which the FBI contracts for hacking operations. “Had it not been for the sloppy actions of a few contractors updating their LinkedIn profiles, we would have never known about this,” Soghoian said in his Defcon talk. Those two contractors were not the only ones being sloppy.
...

Cancel
And there are many more. A quick search of Indeed.com using three code names unlikely to return false positives — Dishfire, XKeyscore and Pinwale — turned up 323 résumés. The same search on LinkedIn turned up 48 profiles mentioning Dishfire, 18 mentioning XKeyscore and 74 mentioning Pinwale. Almost all these people appear to work in the intelligence industry. Network-mapping the data Fabio Pietrosanti of the Hermes Center for Transparency and Digital Human Rights noticed all the code names on LinkedIn last December. While sitting with M.C. McGrath at the Chaos Communication Congress in Hamburg, Germany, Pietrosanti began searching the website for classified program names — and getting serious results. McGrath was already developing Transparency Toolkit, a Web application for investigative research, and knew he could improve on Pietrosanti’s off-the-cuff methods.
...

Cancel
...2 more annotations...
“I was, like, huh, maybe there’s more we can do with this — actually get a list of all these profiles that have these results and use that to analyze the structure of which companies are helping with which programs, which people are helping with which programs, try to figure out in what capacity, and learn more about things that we might not know about,” McGrath said. He set up a computer program called a scraper to search LinkedIn for public profiles that mention known NSA programs, contractors or jargon — such as SIGINT, the agency’s term for “signals intelligence” gleaned from intercepted communications. Once the scraper found the name of an NSA program, it searched nearby for other words in all caps. That allowed McGrath to find the names of unknown programs, too. Once McGrath had the raw data — thousands of profiles in all, with 70 to 80 different program names — he created a network graph that showed the relationships between specific government agencies, contractors and intelligence programs. Of course, the data are limited to what people are posting on their LinkedIn profiles. Still, the network graph gives a sense of which contractors work on several NSA programs, which ones work on just one or two, and even which programs military units in Iraq and Afghanistan are using. And that is just the beginning.
...

Cancel
Click on the image to view an interactive network illustration of the relationships between specific national security surveillance programs in red, and government organizations or private contractors in blue.
...

Cancel

Paul Merrell on 30 May 14

What a giggle, public spying on NSA and its contractors using Big Data. The interactive network graph with its sidebar display of relevant data derived from LinkedIn profiles is just too delightful.

<div class="cArrow"> </div><div class="cContentInner">What a giggle, public spying on NSA and its contractors using Big Data. The interactive network graph with its sidebar display of relevant data derived from LinkedIn profiles is just too delightful. </div>

...

Cancel

The People and Tech Behind the Panama Papers - Features - Source: An OpenNews project - 0 views

source.opennews.org/...-and-tech-behind-panama-papers

Panama-Papers technology open-source

shared by Paul Merrell on 12 Apr 16 - No Cached

Then we put the data up, but the problem with Solr was it didn’t have a user interface, so we used Project Blacklight, which is open source software normally used by librarians. We used it for the journalists. It’s simple because it allows you to do faceted search—so, for example, you can facet by the folder structure of the leak, by years, by type of file. There were more complex things—it supports queries in regular expressions, so the more advanced users were able to search for documents with a certain pattern of numbers that, for example, passports use. You could also preview and download the documents. ICIJ open-sourced the code of our document processing chain, created by our web developer Matthew Caruana Galizia. We also developed a batch-searching feature. So say you were looking for politicians in your country—you just run it through the system, and you upload your list to Blacklight and you would get a CSV back saying yes, there are matches for these names—not only exact matches, but also matches based on proximity. So you would say “I want Mar Cabra proximity 2” and that would give you “Mar Cabra,” “Mar whatever Cabra,” “Cabra, Mar,”—so that was good, because very quickly journalists were able to see… I have this list of politicians and they are in the data!
...

Cancel
Last Sunday, April 3, the first stories emerging from the leaked dataset known as the Panama Papers were published by a global partnership of news organizations working in coordination with the International Consortium of Investigative Journalists, or ICIJ. As we begin the second week of reporting on the leak, Iceland’s Prime Minister has been forced to resign, Germany has announced plans to end anonymous corporate ownership, governments around the world launched investigations into wealthy citizens’ participation in tax havens, the Russian government announced that the investigation was an anti-Putin propaganda operation, and the Chinese government banned mentions of the leak in Chinese media. As the ICIJ-led consortium prepares for its second major wave of reporting on the Panama Papers, we spoke with Mar Cabra, editor of ICIJ’s Data & Research unit and lead coordinator of the data analysis and infrastructure work behind the leak. In our conversation, Cabra reveals ICIJ’s years-long effort to build a series of secure communication and analysis platforms in support of genuinely global investigative reporting collaborations.
...

Cancel
For communication, we have the Global I-Hub, which is a platform based on open source software called Oxwall. Oxwall is a social network, like Facebook, which has a wall when you log in with the latest in your network—it has forum topics, links, you can share files, and you can chat with people in real time.
...

Cancel
...3 more annotations...
We had the data in a relational database format in SQL, and thanks to ETL (Extract, Transform, and Load) software Talend, we were able to easily transform the data from SQL to Neo4j (the graph-database format we used). Once the data was transformed, it was just a matter of plugging it into Linkurious, and in a couple of minutes, you have it visualized—in a networked way, so anyone can log in from anywhere in the world. That was another reason we really liked Linkurious and Neo4j—they’re very quick when representing graph data, and the visualizations were easy to understand for everybody. The not-very-tech-savvy reporter could expand the docs like magic, and more technically expert reporters and programmers could use the Neo4j query language, Cypher, to do more complex queries, like show me everybody within two degrees of separation of this person, or show me all the connected dots…
...

Cancel
We believe in open source technology and try to use it as much as possible. We used Apache Solr for the indexing and Apache Tika for document processing, and it’s great because it processes dozens of different formats and it’s very powerful. Tika interacts with Tesseract, so we did the OCRing on Tesseract. To OCR the images, we created an army of 30–40 temporary servers in Amazon that allowed us to process the documents in parallel and do parallel OCR-ing. If it was very slow, we’d increase the number of servers—if it was going fine, we would decrease because of course those servers have a cost.
...

Cancel
For the visualization of the Mossack Fonseca internal database, we worked with another tool called Linkurious. It’s not open source, it’s licensed software, but we have an agreement with them, and they allowed us to work with it. It allows you to represent data in graphs. We had a version of Linkurious on our servers, so no one else had the data. It was pretty intuitive—journalists had to click on dots that expanded, basically, and could search the names.
...

Cancel

Managing your Social Graph with Google+ [Google Plus] - Life With Alacrity - 2 views

www.lifewithalacrity.com/...al-graph-with-google-plus.html

managing social graph google plus life productivity Concentration brain psychology

shared by Michelle Pitman on 16 Jul 11 - No Cached

Mozilla Acquires Pocket | The Mozilla Blog - 0 views

blog.mozilla.org/...mozilla-acquires-pocket

Mozilla acquisitions Pocket

shared by Paul Merrell on 28 Feb 17 - No Cached

huong3a3y and ngoclinh91 liked it

e are excited to announce that the Mozilla Corporation has completed the acquisition of Read It Later, Inc. the developers of Pocket. Mozilla is growing, experimenting more, and doubling down on our mission to keep the internet healthy, as a global public resource that’s open and accessible to all. As our first strategic acquisition, Pocket contributes to our strategy by growing our mobile presence and providing people everywhere with powerful tools to discover and access high quality web content, on their terms, independent of platform or content silo. Pocket will join Mozilla’s product portfolio as a new product line alongside the Firefox web browsers with a focus on promoting the discovery and accessibility of high quality web content. (Here’s a link to their blog post on the acquisition).  Pocket’s core team and technology will also accelerate Mozilla’s broader Context Graph initiative.
...

Cancel
“We believe that the discovery and accessibility of high quality web content is key to keeping the internet healthy by fighting against the rising tide of centralization and walled gardens. Pocket provides people with the tools they need to engage with and share content on their own terms, independent of hardware platform or content silo, for a safer, more empowered and independent online experience.” – Chris Beard, Mozilla CEO Pocket brings to Mozilla a successful human-powered content recommendation system with 10 million unique monthly active users on iOS, Android and the Web, and with more than 3 billion pieces of content saved to date. In working closely with Pocket over the last year around the integration within Firefox, we developed a shared vision and belief in the opportunity to do more together that has led to Pocket joining Mozilla today. “We’ve really enjoyed partnering with Mozilla over the past year. We look forward to working more closely together to support the ongoing growth of Pocket and to create great new products that people love in support of our shared mission.” – Nate Weiner, Pocket CEO As a result of this strategic acquisition, Pocket will become a wholly owned subsidiary of Mozilla Corporation and will become part of the Mozilla open source project.
...

Cancel

"Self-Censorship on Facebook Sauvik Das and Adam Kramer - 0 views

sauvik.me/...ip_on_facebook_cameraready.pdf

self-censorship censorship Facebook Internet personal data privacy prudence publlishing Sauvik Das Adam Kramer espionaje seguridad

shared by Gonzalo San Gil, PhD. on 17 Dec 13 - No Cached

Gonzalo San Gil, PhD. on 17 Dec 13

Abstract We report results from an exploratory analysis examining "last - minute" self - censorship, or content that is filtered after being written, on Facebook. We collected data from 3.9 milion users over 17 days and associate self- censorship behavior with features describing users, their social graph, and the interactions between them. "

<div class="cArrow"> </div><div class="cContentInner">Abstract We report results from an exploratory analysis examining "last - minute" self - censorship, or content that is filtered after being written, on Facebook. We collected data from 3.9 milion users over 17 days and associate self- censorship behavior with features describing users, their social graph, and the interactions between them. "</div>

...

Cancel

Transparency Toolkit - 0 views

transparencytoolkit.org/about.html

transparency transparency-tools research-resources