
Future of the Web: group items tagged "languages"


Gary Edwards

Wolfram Alpha is Coming -- and It Could be as Important as Google | Twine - 0 views

  • The first question was could (or even should) Wolfram Alpha be built using the Semantic Web in some manner, rather than (or as well as) the Mathematica engine it is currently built on. Is anything missed by not building it with Semantic Web's languages (RDF, OWL, Sparql, etc.)? The answer is that there is no reason that one MUST use the Semantic Web stack to build something like Wolfram Alpha. In fact, in my opinion it would be far too difficult to try to explicitly represent everything Wolfram Alpha knows and can compute using OWL ontologies. It is too wide a range of human knowledge and giant OWL ontologies are just too difficult to build and curate.
  • However, for the internal knowledge representation and reasoning that takes place in the system, it appears Wolfram has found a pragmatic and efficient representation of his own, and I don't think he needs the Semantic Web at that level. It seems to be doing just fine without it. Wolfram Alpha is built on hand-curated knowledge and expertise. Wolfram and his team have somehow figured out a way to make that practical where all others who have tried this have failed to achieve their goals. The task is gargantuan -- there is just so much diverse knowledge in the world. Representing even a small segment of it formally turns out to be extremely difficult and time-consuming.
  • It has generally not been considered feasible for any one group to hand-curate all knowledge about every subject. This is why the Semantic Web was invented -- by enabling everyone to curate their own knowledge about their own documents and topics in parallel, in principle at least, more knowledge could be represented and shared in less time by more people -- in an interoperable manner. At least that is the vision of the Semantic Web.
  • Where Google is a system for FINDING things that we as a civilization collectively publish, Wolfram Alpha is for ANSWERING questions about what we as a civilization collectively know. It's the next step in the distribution of knowledge and intelligence around the world -- a new leap in the intelligence of our collective "Global Brain." And like any big next-step, Wolfram Alpha works in a new way -- it computes answers instead of just looking them up.
  •  
    A Computational Knowledge Engine for the Web
    In a nutshell, Wolfram and his team have built what he calls a "computational knowledge engine" for the Web. OK, so what does that really mean? Basically it means that you can ask it factual questions and it computes answers for you. It doesn't simply return documents that (might) contain the answers, like Google does, and it isn't just a giant database of knowledge, like Wikipedia. It doesn't simply parse natural language and then use that to retrieve documents, like Powerset, for example. Instead, Wolfram Alpha actually computes the answers to a wide range of questions that have factual answers, such as "What country is Timbuktu in?", "How many protons are in a hydrogen atom?", "What is the average rainfall in Seattle this month?", "What is the 300th digit of Pi?", "Where is the ISS?" or "When was GOOG worth more than $300?" Think about that for a minute. It computes the answers. Wolfram Alpha doesn't simply contain huge amounts of manually entered pairs of questions and answers, nor does it search for answers in a database of facts. Instead, it understands and then computes answers to certain kinds of questions.
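The Semantic Web languages discussed above (RDF, OWL, SPARQL) all rest on one primitive: the subject-predicate-object triple. A minimal, hypothetical sketch of that model, with plain Python tuples standing in for a real RDF library and made-up facts for illustration:

```python
# Toy triple store: each statement is a (subject, predicate, object) tuple.
# The facts below are illustrative stand-ins, not a real ontology.
TRIPLES = {
    ("Timbuktu", "locatedIn", "Mali"),
    ("Mali", "locatedIn", "Africa"),
    ("Hydrogen", "protonCount", "1"),
}

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard,
    roughly what a single SPARQL triple pattern does."""
    return {
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    }

print(query("Timbuktu", "locatedIn", None))
# {('Timbuktu', 'locatedIn', 'Mali')}
```

The article's point is that curating millions of such statements by hand is what makes giant OWL ontologies impractical for a project of Wolfram Alpha's breadth.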
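The "computes answers instead of just looking them up" distinction can be shown in miniature. This is a hypothetical toy, not Wolfram's actual architecture: a few hand-curated base facts plus a derivation rule that computes a quantity no stored fact contains (the population and area figures are rough illustrative values):

```python
# Curated base facts (illustrative values, not authoritative data).
FACTS = {
    ("France", "population"): 67_000_000,
    ("France", "area_km2"): 643_801,
}

# Derived quantities are computed on demand from base facts.
RULES = {
    "population_density": lambda c: FACTS[(c, "population")] / FACTS[(c, "area_km2")],
}

def answer(country, quantity):
    """Return a stored fact if one exists; otherwise compute it from a rule."""
    if (country, quantity) in FACTS:
        return FACTS[(country, quantity)]
    return RULES[quantity](country)

print(answer("France", "area_km2"))                   # stored fact: 643801
print(round(answer("France", "population_density")))  # computed: 104 (people per km^2)
```

A search engine can only retrieve the first kind of answer; the second never existed anywhere until it was computed.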
Paul Merrell

Sir Tim Berners-Lee on 'Reinventing HTML' - 0 views

    • Paul Merrell
       
      Berners-Lee gives the obligatory lip service to participation of "other stakeholders", but the stark reality is that W3C is the captive of the major browser developers. One may still credit W3C staff and Berners-Lee for what they have accomplished despite that reality, but in an organization that sells votes, the needs of "other stakeholders" will always be neglected.
  • Some things are clearer with hindsight of several years. It is necessary to evolve HTML incrementally. The attempt to get the world to switch to XML, including quotes around attribute values, slashes in empty tags, and namespaces, all at once, didn't work. The large HTML-generating public did not move, largely because the browsers didn't complain. Some large communities did shift and are enjoying the fruits of well-formed systems, but not all. It is important to maintain HTML incrementally, as well as continuing a transition to a well-formed world, and developing more power in that world.
  • The plan is, informed by Webforms, to extend HTML forms. At the same time, there is a work item to look at how HTML forms (existing and extended) can be thought of as XForms equivalents, to allow an easy escalation path. A goal would be to have an HTML forms language which is a superset of the existing HTML language, and a subset of an XForms language with added HTML compatibility.
  • There will be no dependency of HTML work on the XHTML2 work.
    • Paul Merrell
       
      He just confirms that the incremental migration from HTML forms to XForms is entirely a pie-in-the-sky aspiration, not a plan.
  • This is going to be a very major collaboration on a very important spec, one of the crown jewels of web technology. Even though hundreds of people will be involved, we are evolving the technology which millions going on billions will use in the future. There won't seem like enough thankyous to go around some days.
    • Paul Merrell
       
      This is the precise reason the major browser developers must be brought to heel rather than being catered to with a standard that serves only the needs of the browser developers and not the needs of users for interoperable web applications. CSS is in the web app page templates, not in the markup that can be exchanged by web apps. Why can't MediaWiki exchange page content with Drupal? It's because HTML really sucks big time as a data exchange format. All the power is in the CSS site templates, not in what users can stick in HTML forms.
    • Paul Merrell
       
      Bye-bye XForms.
    • Paul Merrell
       
      Perhaps a political reality. But I am 62 years old, have had three major heart attacks, and am still smoking cigarettes. I would like to experience interoperable web apps before I die. What does the incremental strategy do for me? I would much prefer to see Berners-Lee raising his considerable voice and stature against the dominance of the browser developers at W3C.
  • The perceived accountability of the HTML group has been an issue. Sometimes this was a departure from the W3C process, sometimes a sticking to it in principle, but not actually providing assurances to commenters. An issue was the formation of the breakaway WHATWG, which attracted reviewers though it did not have a process or specific accountability measures itself.
  • Some things are very clear. It is really important to have real developers on the ground involved with the development of HTML. It is also really important to have browser makers intimately involved and committed. And also all the other stakeholders, including users and user companies and makers of related products.
Paul Merrell

Rapid - Press Releases - EUROPA - 0 views

  • Did the Commission co-operate with the United States on this case? The Commission and the United States Federal Trade Commission have kept each other regularly and closely informed on the state of play of their respective Intel investigations. These discussions have been held in a co-operative and friendly atmosphere, and have been substantively fruitful in terms of sharing experiences on issues of common interest.
  • Where does the money go? Once final judgment has been delivered in any appeals before the Court of First Instance (CFI) and the Court of Justice, the money goes into the EU’s central budget, thus reducing the contributions that Member States pay to the EU.
    Does Intel have to pay the fine if it appeals to the European Court of First Instance (CFI)? Yes. In case of appeals to the CFI, it is normal practice that the fine is paid into a blocked bank account pending the final outcome of the appeals process. Any fine that is provisionally paid will produce interest based on the interest rate applied by the European Central Bank to its main refinancing operations. In exceptional circumstances, companies may be allowed to cover the amount of the fine by a bank guarantee at a higher interest rate.
    What percentage of Intel's turnover does the fine represent? The fine represents 4.15% of Intel's turnover in 2008. This is less than half the allowable maximum, which is 10% of a company's annual turnover.
  • How long is the Decision? The Decision is 542 pages long.
    When is the Decision going to be published? The Decision in English (the official language version of the Decision) will be made available as soon as possible on DG Competition’s website (once relevant business secrets have been taken out). French and German translations will also be made available on DG Competition’s website in due course. A summary of the Decision will be published in the EU's Official Journal L series in all languages (once the translations are available).
anonymous

The Word As We Knew It - 0 views

  •  
    The internet and its unique ability to rapidly share information across the planet has created a sort of 'hotbed' for the evolution of language. New phrases, words, acronyms and slang have been given the ability to virally evolve and disseminate to new populations within a matter of days. Definitions are born, morph, and die based on the evolving collective consciousness of humanity.
Gonzalo San Gil, PhD.

GNU Gnash - GNU Project - Free Software Foundation (FSF) - 0 views

  •  
    "GNU Gnash is the GNU Flash movie player - Flash is an animation file format pioneered by Macromedia which continues to be supported by their successor company, Adobe. Flash has been extended to include audio and video content, and programs written in ActionScript, an ECMAScript-compatible language. Gnash is based on GameSWF, and supports most SWF v7 features and some SWF v8 and v9." [installing GNU Gnash http://gnashdev.org/?q=node/11]
Gonzalo San Gil, PhD.

Achieving Impossible Things with Free Culture and Commons-Based Enterprise : Terry Hanc... - 0 views

  •  
    "Author: Terry Hancock Keywords: free software; open source; free culture; commons-based peer production; commons-based enterprise; Free Software Magazine; Blender Foundation; Blender Open Movies; Wikipedia; Project Gutenberg; Open Hardware; One Laptop Per Child; Sugar Labs; licensing; copyleft; hosting; marketing; design; online community; Debian GNU/Linux; GNU General Public License; Creative Commons Attribution-ShareAlike License; TAPR Open Hardware License; collective patronage; women in free software; Creative Commons; OScar; C,mm,n; Free Software Foundation; Open Source Initiative; Freedom Defined; Free Software Definition; Debian Free Software Guidelines; Sourceforge; Google Code; digital rights management; digital restrictions management; technological protection measures; DRM; TPM; linux; gnu; manifesto Publisher: Free Software Magazine Press Year: 2009 Language: English Collection: opensource"
Paul Merrell

Most Agencies Falling Short on Mandate for Online Records - 1 views

  • Nearly 20 years after Congress passed the Electronic Freedom of Information Act Amendments (E-FOIA), only 40 percent of agencies have followed the law's instruction for systematic posting of records released through FOIA in their electronic reading rooms, according to a new FOIA Audit released today by the National Security Archive at www.nsarchive.org to mark Sunshine Week. The Archive team audited all federal agencies with Chief FOIA Officers as well as agency components that handle more than 500 FOIA requests a year — 165 federal offices in all — and found only 67 with online libraries populated with significant numbers of released FOIA documents and regularly updated.
  • Congress called on agencies to embrace disclosure and the digital era nearly two decades ago, with the passage of the 1996 "E-FOIA" amendments. The law mandated that agencies post key sets of records online, provide citizens with detailed guidance on making FOIA requests, and use new information technology to post online proactively records of significant public interest, including those already processed in response to FOIA requests and "likely to become the subject of subsequent requests." Congress believed then, and openness advocates know now, that this kind of proactive disclosure, publishing online the results of FOIA requests as well as agency records that might be requested in the future, is the only tenable solution to FOIA backlogs and delays. Thus the National Security Archive chose to focus on the e-reading rooms of agencies in its latest audit. Even though the majority of federal agencies have not yet embraced proactive disclosure of their FOIA releases, the Archive E-FOIA Audit did find that some real "E-Stars" exist within the federal government, serving as examples to lagging agencies that technology can be harnessed to create state-of-the art FOIA platforms. Unfortunately, our audit also found "E-Delinquents" whose abysmal web performance recalls the teletype era.
  • E-Delinquents include the Office of Science and Technology Policy at the White House, which, despite being mandated to advise the President on technology policy, does not embrace 21st century practices by posting any frequently requested records online. Another E-Delinquent, the Drug Enforcement Administration, insults its website's viewers by claiming that it "does not maintain records appropriate for FOIA Library at this time."
  • "The presumption of openness requires the presumption of posting," said Archive director Tom Blanton. "For the new generation, if it's not online, it does not exist." The National Security Archive has conducted fourteen FOIA Audits since 2002. Modeled after the California Sunshine Survey and subsequent state "FOI Audits," the Archive's FOIA Audits use open-government laws to test whether or not agencies are obeying those same laws. Recommendations from previous Archive FOIA Audits have led directly to laws and executive orders which have: set explicit customer service guidelines, mandated FOIA backlog reduction, assigned individualized FOIA tracking numbers, forced agencies to report the average number of days needed to process requests, and revealed the (often embarrassing) ages of the oldest pending FOIA requests. The surveys include:
  • The federal government has made some progress moving into the digital era. The National Security Archive's last E-FOIA Audit in 2007, "File Not Found," reported that only one in five federal agencies had put online all of the specific requirements mentioned in the E-FOIA amendments, such as guidance on making requests, contact information, and processing regulations. The new E-FOIA Audit finds the number of agencies that have checked those boxes is now much higher — 100 out of 165 — though many (66 of 165) have posted just the bare minimum, especially when posting FOIA responses. An additional 33 agencies even now do not post these types of records at all, clearly thwarting the law's intent.
  • The FOIAonline Members (Department of Commerce, Environmental Protection Agency, Federal Labor Relations Authority, Merit Systems Protection Board, National Archives and Records Administration, Pension Benefit Guaranty Corporation, Department of the Navy, General Services Administration, Small Business Administration, U.S. Citizenship and Immigration Services, and Federal Communications Commission) won their "E-Star" by making past requests and releases searchable via FOIAonline. FOIAonline also allows users to submit their FOIA requests digitally.
  • THE E-DELINQUENTS: WORST OVERALL AGENCIES (in alphabetical order)
  • Key Findings
  • Excuses Agencies Give for Poor E-Performance
  • Justice Department guidance undermines the statute. Currently, the FOIA stipulates that documents "likely to become the subject of subsequent requests" must be posted by agencies somewhere in their electronic reading rooms. The Department of Justice's Office of Information Policy defines these records as "frequently requested records… or those which have been released three or more times to FOIA requesters." Of course, it is time-consuming for agencies to develop a system that keeps track of how often a record has been released, which is in part why agencies rarely do so and are often in breach of the law. Troublingly, both the current House and Senate FOIA bills include language that codifies the instructions from the Department of Justice. The National Security Archive believes the addition of this "three or more times" language actually harms the intent of the Freedom of Information Act as it will give agencies an easy excuse ("not requested three times yet!") not to proactively post documents that agency FOIA offices have already spent time, money, and energy processing. We have formally suggested alternate language requiring that agencies generally post "all records, regardless of form or format that have been released in response to a FOIA request."
  • Disabilities Compliance. Despite the E-FOIA Act, many government agencies do not embrace the idea of posting their FOIA responses online. The most common reason agencies give is that it is difficult to post documents in a format that complies with the Americans with Disabilities Act, also referred to as being "508 compliant," and the 1998 Amendments to the Rehabilitation Act that require federal agencies "to make their electronic and information technology (EIT) accessible to people with disabilities." E-Star agencies, however, have proven that 508 compliance is no barrier when the agency has a will to post. All documents posted on FOIAonline are 508 compliant, as are the documents posted by the Department of Defense and the Department of State. In fact, every document created electronically by the US government after 1998 should already be 508 compliant. Even old paper records that are scanned to be processed through FOIA can be made 508 compliant with just a few clicks in Adobe Acrobat, according to this Department of Homeland Security guide (essentially OCRing the text, and including information about where non-textual fields appear). Even if agencies are insistent it is too difficult to OCR older documents that were scanned from paper, they cannot use that excuse with digital records.
  • Privacy. Another commonly articulated concern about posting FOIA releases online is that doing so could inadvertently disclose private information from "first person" FOIA requests. This is a valid concern, and this subset of FOIA requests should not be posted online. (The Justice Department identified "first party" requester rights in 1989. Essentially agencies cannot use the b(6) privacy exemption to redact information if a person requests it for him or herself. An example of a "first person" FOIA would be a person's request for his own immigration file.)
    Cost and Waste of Resources. There is also a belief that there is little public interest in the majority of FOIA requests processed, and hence it is a waste of resources to post them. This thinking runs counter to the governing principle of the Freedom of Information Act: that government information belongs to US citizens, not US agencies. As such, the reason that a person requests information is immaterial as the agency processes the request; the "interest factor" of a document should also be immaterial when an agency is required to post it online. Some think that posting FOIA releases online is not cost effective. In fact, the opposite is true. It is not cost effective to spend tens (or hundreds) of person hours to search for, review, and redact FOIA requests only to mail the result to the requester, who may slip it into a desk drawer and forget about it. That is a waste of resources. The released document should be posted online for any interested party to utilize. This will only become easier as FOIA processing systems evolve to automatically post the documents they track. The State Department earned its "E-Star" status demonstrating this very principle; it spent no new funds and hired no contractors to build its Electronic Reading Room, instead building a self-sustaining platform that will save the agency time and money going forward.
Paul Merrell

Help:CirrusSearch - MediaWiki - 0 views

  • CirrusSearch is a new search engine for MediaWiki. The Wikimedia Foundation is migrating to CirrusSearch since it features key improvements over the previously used search engine, LuceneSearch. This page describes the features that are new or different compared to the past solutions.
  • Contents:
    1 Frequently asked questions
      1.1 What's improved?
    2 Updates
    3 Search suggestions
    4 Full text search
      4.1 Stemming
      4.2 Filters (intitle:, incategory: and linksto:)
      4.3 prefix:
      4.4 Special prefixes
      4.5 Did you mean
      4.6 Prefer phrase matches
      4.7 Fuzzy search
      4.8 Phrase search and proximity
      4.9 Quotes and exact matches
      4.10 prefer-recent:
      4.11 hastemplate:
      4.12 boost-templates:
      4.13 insource:
      4.14 Auxiliary Text
      4.15 Lead Text
      4.16 Commons Search
    5 See also
  • Stemming
    In search terminology, support for "stemming" means that a search for "swim" will also include "swimming" and "swimmed", but not "swam". There is support for dozens of languages, but more are wanted. There is a list of currently supported languages at elasticsearch.org; see their documentation on contributing to submit requests or patches.
  • See also: full specifications in the browser tests
  •  
    Lots of new tricks to learn on sites using MediaWiki as folks update their installations. I'm not a big fan of programs written in PHP and JavaScript, but they're impossible to avoid on the Web. So is MediaWiki, so any real improvements help.
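The stemming behavior described in the highlight above ("swim" matching "swimming" and "swimmed" but not "swam") can be sketched with a toy suffix stripper. Real engines such as Elasticsearch use per-language analyzers; this sketch only illustrates the idea:

```python
def toy_stem(word):
    """Crude suffix-stripping stemmer, for illustration only.
    The 'ming'/'med' entries come before 'ing'/'ed' to crudely undo the
    doubled consonant in 'swimming'/'swimmed'; real stemmers use proper
    per-language rules."""
    for suffix in ("ming", "med", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def matches(query, term):
    """Two words match if they reduce to the same stem."""
    return toy_stem(query) == toy_stem(term)

print(toy_stem("swimming"), toy_stem("swimmed"), toy_stem("swam"))  # swim swim swam
print(matches("swim", "swimming"))  # True
print(matches("swim", "swam"))      # False
```

Vowel changes like "swim" to "swam" are exactly what suffix stripping cannot catch, which is why the help page calls that case out.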
Paul Merrell

Official Google Blog: A first step toward more global email - 0 views

  • Whether your email address is firstname.lastname@ or something more expressive like corgicrazy@, an email address says something about who you are. But from the start, email addresses have always required you to use non-accented Latin characters when signing up. Less than half of the world’s population has a mother tongue that uses the Latin alphabet. And even fewer people use only the letters A-Z. So if your name (or that of your favorite pet) contains accented characters (like “José Ramón”) or is written in another script like Chinese or Devanagari, your email address options are limited. But all that could change. In 2012, an organization called the Internet Engineering Task Force (IETF) created a new email standard that supports addresses with non-Latin and accented Latin characters (e.g. 武@メール.グーグル). In order for this standard to become a reality, every email provider and every website that asks you for your email address must adopt it. That’s obviously a tough hill to climb. The technology is there, but someone has to take the first step.
  • Today we're ready to be that someone. Starting now, Gmail (and shortly, Calendar) will recognize addresses that contain accented or non-Latin characters. This means Gmail users can send emails to, and receive emails from, people who have these characters in their email addresses. Of course, this is just a first step and there’s still a ways to go. In the future, we want to make it possible for you to use them to create Gmail accounts. Last month, we announced the addition of 13 new languages in Gmail. Language should never be a barrier when it comes to connecting with others and with this step forward, truly global email is now even closer to becoming a reality.
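The 2012 IETF standard the post refers to is the internationalized-email work (RFC 6531 defines the SMTPUTF8 extension), and a message to a non-ASCII address can only travel between servers that support it. A minimal sketch of the check a sending client might make, using example addresses from the post:

```python
def needs_smtputf8(address):
    """True if the address contains non-ASCII characters and therefore
    requires an SMTPUTF8-capable mail path (RFC 6531). Simplified sketch:
    a non-ASCII domain could alternatively be IDNA-encoded, but UTF-8
    throughout is the straightforward case."""
    local, _, domain = address.rpartition("@")
    return not local.isascii() or not domain.isascii()

print(needs_smtputf8("firstname.lastname@example.com"))  # False
print(needs_smtputf8("武@メール.グーグル"))                 # True
```

This is why adoption is "a tough hill to climb": every hop in the delivery path, and every signup form, has to accept such addresses before they work everywhere.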
Gonzalo San Gil, PhD.

Organize a Giving Guide Giveaway - Free Software Foundation - December 1, 2014 - 0 views

  •  
    "by Free Software Foundation - Published on Nov 17, 2014 04:18 PM Organize an event to help people choose electronics gifts that actually give more than they take. In the flurry of holiday advertising that happens at the end of the year, many people are swept into buying freedom-denying and DRM-laden gifts that take more than they give. Each holiday season the FSF releases a Giving Guide to make it easy for you to choose tech gifts that respect your rights as a computer user and avoid those that don't. We'll be launching 2014's guide on Black Friday (November 28th), full of gifts that are fun and free, made by companies that share your values. It will be similar to 2013's Giving Guide, but more extensive and spruced up with a new design. It'll even have discounts on some of our favorite items, and translations into multiple languages."
Gonzalo San Gil, PhD.

Cyber bill's final language likely to anger privacy advocates | TheHill - 0 views

  •  
    "By Cory Bennett - 12/07/15 09:55 AM EST Digital rights advocates are in an uproar as the final text of a major cybersecurity bill appears to lack some of the privacy community's favored clauses. In the last few weeks, House and Senate negotiators have been working unofficially to reach a compromise between multiple versions of a cyber bill that would encourage businesses to share more data on hacking threats with the government."
Gonzalo San Gil, PhD.

Join the Battle for Net Neutrality - 0 views

  •  
    "Congress is trying to sneak language into a budget bill that would take away the FCC's ability to enforce the net neutrality rules we worked hard to pass, undermining everything we did to protect the open Internet. Thousands of calls and emails will nip this in the bud - contact Congress now! "
Paul Merrell

The People and Tech Behind the Panama Papers - Features - Source: An OpenNews project - 0 views

  • Then we put the data up, but the problem with Solr was it didn’t have a user interface, so we used Project Blacklight, which is open source software normally used by librarians. We used it for the journalists. It’s simple because it allows you to do faceted search—so, for example, you can facet by the folder structure of the leak, by years, by type of file. There were more complex things—it supports queries in regular expressions, so the more advanced users were able to search for documents with a certain pattern of numbers that, for example, passports use. You could also preview and download the documents. ICIJ open-sourced the code of our document processing chain, created by our web developer Matthew Caruana Galizia. We also developed a batch-searching feature. So say you were looking for politicians in your country—you just run it through the system, and you upload your list to Blacklight and you would get a CSV back saying yes, there are matches for these names—not only exact matches, but also matches based on proximity. So you would say “I want Mar Cabra proximity 2” and that would give you “Mar Cabra,” “Mar whatever Cabra,” “Cabra, Mar,”—so that was good, because very quickly journalists were able to see… I have this list of politicians and they are in the data!
  • Last Sunday, April 3, the first stories emerging from the leaked dataset known as the Panama Papers were published by a global partnership of news organizations working in coordination with the International Consortium of Investigative Journalists, or ICIJ. As we begin the second week of reporting on the leak, Iceland’s Prime Minister has been forced to resign, Germany has announced plans to end anonymous corporate ownership, governments around the world launched investigations into wealthy citizens’ participation in tax havens, the Russian government announced that the investigation was an anti-Putin propaganda operation, and the Chinese government banned mentions of the leak in Chinese media. As the ICIJ-led consortium prepares for its second major wave of reporting on the Panama Papers, we spoke with Mar Cabra, editor of ICIJ’s Data & Research unit and lead coordinator of the data analysis and infrastructure work behind the leak. In our conversation, Cabra reveals ICIJ’s years-long effort to build a series of secure communication and analysis platforms in support of genuinely global investigative reporting collaborations.
  • For communication, we have the Global I-Hub, which is a platform based on open source software called Oxwall. Oxwall is a social network, like Facebook, which has a wall when you log in with the latest in your network—it has forum topics, links, you can share files, and you can chat with people in real time.
  • We had the data in a relational database format in SQL, and thanks to ETL (Extract, Transform, and Load) software Talend, we were able to easily transform the data from SQL to Neo4j (the graph-database format we used). Once the data was transformed, it was just a matter of plugging it into Linkurious, and in a couple of minutes, you have it visualized—in a networked way, so anyone can log in from anywhere in the world. That was another reason we really liked Linkurious and Neo4j—they’re very quick when representing graph data, and the visualizations were easy to understand for everybody. The not-very-tech-savvy reporter could expand the docs like magic, and more technically expert reporters and programmers could use the Neo4j query language, Cypher, to do more complex queries, like show me everybody within two degrees of separation of this person, or show me all the connected dots…
  • We believe in open source technology and try to use it as much as possible. We used Apache Solr for the indexing and Apache Tika for document processing, and it’s great because it processes dozens of different formats and it’s very powerful. Tika interacts with Tesseract, so we did the OCRing on Tesseract. To OCR the images, we created an army of 30–40 temporary servers in Amazon that allowed us to process the documents in parallel and do parallel OCR-ing. If it was very slow, we’d increase the number of servers—if it was going fine, we would decrease because of course those servers have a cost.
  • For the visualization of the Mossack Fonseca internal database, we worked with another tool called Linkurious. It’s not open source, it’s licensed software, but we have an agreement with them, and they allowed us to work with it. It allows you to represent data in graphs. We had a version of Linkurious on our servers, so no one else had the data. It was pretty intuitive—journalists had to click on dots that expanded, basically, and could search the names.
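The “two degrees of separation” query Cabra describes is, at bottom, a bounded graph traversal. As a minimal sketch of that idea in plain Python — the names and edges below are hypothetical stand-ins, not data from the leak, and the equivalent Cypher appears only as a comment:

```python
from collections import deque

# Hypothetical officer/company graph. In Neo4j the equivalent Cypher
# query would be roughly:
#   MATCH (p {name: "A"})-[*1..2]-(other) RETURN DISTINCT other
EDGES = {
    "A": {"ShellCo1"},
    "ShellCo1": {"A", "B"},
    "B": {"ShellCo1", "ShellCo2"},
    "ShellCo2": {"B", "C"},
    "C": {"ShellCo2"},
}

def within_degrees(graph, start, max_depth):
    """Return every node reachable from `start` in at most `max_depth` hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(start)  # the person themselves is not a "connection"
    return seen

print(sorted(within_degrees(EDGES, "A", 2)))  # → ['B', 'ShellCo1']
```

A graph database does the same breadth-first expansion, but indexed and at scale, which is why the dot-expanding interface could feel instantaneous to reporters.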
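The parallel OCR fan-out Cabra describes can be sketched with Python's standard thread pool. The `ocr_document` function below is a hypothetical placeholder for the real Tesseract call; the point is only the pattern of dividing documents across workers and scaling the worker count up or down, as ICIJ did with its temporary Amazon servers:

```python
from concurrent.futures import ThreadPoolExecutor

def ocr_document(path):
    """Stand-in for the real OCR step (ICIJ used Tesseract via Apache Tika)."""
    return f"text extracted from {path}"  # hypothetical placeholder

def ocr_in_parallel(paths, workers=8):
    """Process documents concurrently; raise `workers` when throughput lags."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ocr_document, paths))

results = ocr_in_parallel(["doc1.tif", "doc2.tif"])
```

Because each document is independent, the job is embarrassingly parallel: the same map-over-workers shape applies whether the workers are threads on one machine or, as in ICIJ's case, 30–40 rented servers.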
Paul Merrell

Commentary: Don't be so sure Russia hacked the Clinton emails | Reuters - 0 views

  • By James Bamford. Last summer, cyber investigators plowing through the thousands of leaked emails from the Democratic National Committee uncovered a clue. A user named “Феликс Эдмундович” modified one of the documents using settings in the Russian language. Translated, his name was Felix Edmundovich, a pseudonym referring to Felix Edmundovich Dzerzhinsky, the chief of the Soviet Union’s first secret-police organization, the Cheka. It was one more link in the chain of evidence pointing to Russian President Vladimir Putin as the man ultimately behind the operation. During the Cold War, when Soviet intelligence was headquartered in Dzerzhinsky Square in Moscow, Putin was a KGB officer assigned to the First Chief Directorate. Its responsibilities included “active measures,” a form of political warfare that included media manipulation, propaganda and disinformation. Soviet active measures, retired KGB Major General Oleg Kalugin told Army historian Thomas Boghart, aimed to discredit the United States and “conquer world public opinion.” As the Cold War has turned into the code war, Putin recently unveiled his new, greatly enlarged spy organization: the Ministry of State Security, taking the name from Joseph Stalin’s secret service. Putin also resurrected, according to James Clapper, the U.S. director of national intelligence, some of the KGB’s old active-measures tactics. On October 7, Clapper issued a statement: “The U.S. Intelligence community is confident that the Russian government directed the recent compromises of emails from U.S. persons and institutions, including from U.S. political organizations.” Notably, however, the FBI declined to join the chorus, according to reports by the New York Times and CNBC. A week later, Vice President Joe Biden said on NBC’s Meet the Press that "we're sending a message" to Putin and "it will be at the time of our choosing, and under the circumstances that will have the greatest impact." When asked if the American public would know a message was sent, Biden replied, "Hope not." Meanwhile, the CIA was asked, according to an NBC report on October 14, “to deliver options to the White House for a wide-ranging ‘clandestine’ cyber operation designed to harass and ‘embarrass’ the Kremlin leadership.” But as both sides begin arming their cyberweapons, it is critical for the public to be confident that the evidence is really there, and to understand the potential consequences of a tit-for-tat cyberwar escalating into a real war.
  • This is a prospect that has long worried Richard Clarke, the former White House cyber czar under President George W. Bush. “It’s highly likely that any war that began as a cyberwar,” Clarke told me last year, “would ultimately end up being a conventional war, where the United States was engaged with bombers and missiles.” The problem with attempting to draw a straight line from the Kremlin to the Clinton campaign is the number of variables that get in the way. For one, there is little doubt about Russian cyber fingerprints in various U.S. campaign activities. Moscow, like Washington, has long spied on such matters. The United States, for example, inserted malware in the recent Mexican election campaign. The question isn’t whether Russia spied on the U.S. presidential election, it’s whether it released the election emails. Then there’s the role of Guccifer 2.0, the person or persons supplying WikiLeaks and other organizations with many of the pilfered emails. Is this a Russian agent? A free agent? A cybercriminal? A combination, or some other entity? No one knows. There is also the problem of groupthink that led to the war in Iraq. For example, just as the National Security Agency, the Central Intelligence Agency and the rest of the intelligence establishment are convinced Putin is behind the attacks, they also believed it was a slam-dunk that Saddam Hussein had a trove of weapons of mass destruction. Consider as well the speed of the political-hacking investigation, followed by a lack of skepticism, culminating in a rush to judgment. After the Democratic committee discovered the potential hack last spring, it called in the cybersecurity firm CrowdStrike in May to analyze the problem.
  • CrowdStrike took just a month or so before it conclusively determined that Russia’s FSB, the successor to the KGB, and the Russian military intelligence organization, GRU, were behind it. Most of the other major cybersecurity firms quickly fell in line and agreed. By October, the intelligence community made it unanimous. That speed and certainty contrasts sharply with a previous suspected Russian hack in 2010, when the target was the Nasdaq stock market. According to an extensive investigation by Bloomberg Businessweek in 2014, the NSA and FBI made numerous mistakes over many months that stretched to nearly a year. “After months of work,” the article said, “there were still basic disagreements in different parts of government over who was behind the incident and why.” There was no consensus, with just a 70 percent certainty that the hack was a cybercrime. Months later, this determination was revised again: It was just a Russian attempt to spy on the exchange in order to design its own. The federal agents also considered the possibility that the Nasdaq snooping was not connected to the Kremlin. Instead, “someone in the FSB could have been running a for-profit operation on the side, or perhaps sold the malware to a criminal hacking group.” Again, that’s why it’s necessary to better understand the role of Guccifer 2.0 in releasing the Democratic National Committee and Clinton campaign emails before launching any cyberweapons.
  • It is strange that clues in the Nasdaq hack were very difficult to find ― as one would expect from a professional, state-sponsored cyber operation. Conversely, the sloppy, Inspector Clouseau-like nature of the Guccifer 2.0 operation, with someone hiding behind a silly Bolshevik cover name, and Russian language clues in the metadata, smacked more of either an amateur operation or a deliberate deception. Then there’s the Shadow Brokers, that mysterious person or group that surfaced in August with its farcical “auction” to profit from a stolen batch of extremely secret NSA hacking tools, in essence, cyberweapons. Where do they fit into the picture? They have a small armory of NSA cyberweapons, and they appeared just three weeks after the first DNC emails were leaked. On Monday, the Shadow Brokers released more information, including what they claimed is a list of hundreds of organizations that the NSA has targeted over more than a decade, complete with technical details. This offers further evidence that their information comes from a leaker inside the NSA rather than the Kremlin. The Shadow Brokers also discussed Obama’s threat of cyber retaliation against Russia. Yet they seemed most concerned that the CIA, rather than the NSA or Cyber Command, was given the assignment. This may be a possible indication of a connection to NSA’s elite group, Tailored Access Operations, considered by many the A-Team of hackers. “Why is DirtyGrandpa threating CIA cyberwar with Russia?” they wrote. “Why not threating with NSA or Cyber Command? CIA is cyber B-Team, yes? Where is cyber A-Team?” Because of legal and other factors, the NSA conducts cyber espionage, Cyber Command conducts cyberattacks in wartime, and the CIA conducts covert cyberattacks.
  • The Shadow Brokers connection is important because Julian Assange, the founder of WikiLeaks, claimed to have received identical copies of the Shadow Brokers cyberweapons even before they announced their “auction.” Did he get them from the Shadow Brokers, from Guccifer, from Russia or from an inside leaker at the NSA? Despite the rushed, incomplete investigation and unanswered questions, the Obama administration has announced its decision to retaliate against Russia. But a public warning about a secret attack makes little sense. If a major cyber crisis happens in Russia sometime in the future, such as a deadly power outage in frigid winter, the United States could be blamed even if it had nothing to do with it. That could then trigger a major retaliatory cyberattack against the U.S. cyber infrastructure, which would call for another reprisal attack ― potentially leading to Clarke’s fear of a cyberwar triggering a conventional war. President Barack Obama has also not taken a nuclear strike off the table as an appropriate response to a devastating cyberattack.
  •  
    Article by James Bamford, the first NSA whistleblower and author of three books on the NSA.
Gonzalo San Gil, PhD.

Install and Access Facebook Messenger on Linux Desktop - 0 views

  •  
    "The linuxmessenger app is a "Facebook-like" client for the Linux desktop, written in Python. It lets you log in to your Facebook account right from the command line without installing it on your system and chat with your loved ones through a Facebook-like interface. If you want, you can install it as a desktop client. The application has some built-in features such as desktop notifications, pop-up alerts, friend requests and chat sounds (with On/Off options)."
Paul Merrell

WikiLeaks - Vault 7: Projects - 0 views

  • Today, March 31st 2017, WikiLeaks releases Vault 7 "Marble" -- 676 source code files for the CIA's secret anti-forensic Marble Framework. Marble is used to hamper forensic investigators and anti-virus companies from attributing viruses, trojans and hacking attacks to the CIA. Marble does this by hiding ("obfuscating") text fragments used in CIA malware from visual inspection. This is the digital equivalent of a specialized CIA tool to place covers over the English-language text on U.S.-produced weapons systems before giving them to insurgents secretly backed by the CIA. Marble forms part of the CIA's anti-forensics approach and the CIA's Core Library of malware code. It is "[D]esigned to allow for flexible and easy-to-use obfuscation" as "string obfuscation algorithms (especially those that are unique) are often used to link malware to a specific developer or development shop." The Marble source code also includes a deobfuscator to reverse CIA text obfuscation. Combined with the revealed obfuscation techniques, a pattern or signature emerges which can assist forensic investigators in attributing previous hacking attacks and viruses to the CIA. Marble was in use at the CIA during 2016. It reached 1.0 in 2015.
  • The source code shows that Marble has test examples not just in English but also in Chinese, Russian, Korean, Arabic and Farsi. This would permit a forensic attribution double game: for example, pretending that the spoken language of the malware creator was not American English but Chinese, while showing attempts to conceal the use of Chinese, drawing forensic investigators even more strongly to the wrong conclusion. There are other possibilities as well, such as hiding fake error messages. The Marble Framework is used for obfuscation only and does not contain any vulnerabilities or exploits by itself.
  •  
    But it was the Russians who hacked the 2016 U.S. election. Really.
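As an illustration only — this is not the Marble algorithm itself, just the generic technique the release describes — a string obfuscator with a paired deobfuscator can be as simple as a repeating-key XOR, so that telltale literals never appear in the compiled binary:

```python
# Minimal sketch of string obfuscation with a matching deobfuscator.
# NOT the Marble Framework's algorithm; only the general idea: XOR each
# byte with a repeating key so the plain text is invisible to inspection.
KEY = b"\x5a\xa3\x17"  # arbitrary demo key

def obfuscate(text, key=KEY):
    data = text.encode("utf-8")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def deobfuscate(blob, key=KEY):
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(blob)).decode("utf-8")

hidden = obfuscate("error: connection refused")
assert b"error" not in hidden   # the literal no longer appears in the bytes
print(deobfuscate(hidden))      # the malware recovers it at runtime
```

The forensic point the passage makes follows directly: if an obfuscation scheme (the key schedule, the encoding quirks) is unique to one shop, it acts as a signature linking otherwise unrelated samples to the same developer.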
Gonzalo San Gil, PhD.

Directory of Open Access Journals - 1 views

  •  
    Free, full text, quality controlled scientific and scholarly journals, covering all subjects and many languages
Gonzalo San Gil, PhD.

Negotiating relicensing written works for the open knowledge movement | opensource.com - 0 views

  •  
    "Posted 2 May 2014 by Subhashish Panigrahi I began working with the Wikimedia Foundation in January 2012 for program and community support in India. With the Centre for Internet and Society's Access To Knowledge program, we focus on open access for scholarly publications to help communities enrich Wikipedia entries for Indic languages."
Gonzalo San Gil, PhD.

The Net vs. The Power of Narratives | TorrentFreak - 1 views

  •  
    " By Rick Falkvinge on April 29, 2012 C: 76 Opinion The net changes the world's power structures in a much more fundamental way than changing the way a few groups of entrepreneurs are able to make money. The net is the greatest equalizer that humankind has ever invented. It is either the greatest invention since the printing press, or the greatest invention since written language. The battles we see are not a result of loss of money; they are caused by a loss of the power of narratives."