Future of the Web / Group items tagged search

Paul Merrell

News - Antitrust - Competition - European Commission - 0 views

  • Google inquiries: Commission accuses Google of systematically favouring own shopping comparison service. Infographic: Google might be favouring 'Google Shopping' when displaying general search results.
  • Antitrust: Commission sends Statement of Objections to Google on comparison shopping service; opens separate formal investigation on Android (Wed, 15 Apr 2015 10:00:00 GMT)
  • Antitrust: Commission opens formal investigation against Google in relation to Android mobile operating system (Wed, 15 Apr 2015 10:00:00 GMT)
  • Antitrust: Commission sends Statement of Objections to Google on comparison shopping service (Wed, 15 Apr 2015 10:00:00 GMT)
  • Statement by Commissioner Vestager on antitrust decisions concerning Google (Wed, 15 Apr 2015 11:39:00 GMT)
  •  
    The more interesting issue to me is the accusation that Google violates antitrust law by boosting its comparison shopping results in its general search results, unfairly disadvantaging competing shopping services and not delivering the best results to users. What's interesting to me is that the Commission is attempting to portray general search as a separate market from comparison shopping search, accusing Google of attempting to leverage its general search monopoly into the separate comparison shopping search market. At first blush, I'm not convinced that these are or should be regarded as separable markets. But the ramifications are enormous. If that is a separate market, then arguably so is Google's book search, its Google Scholar search, its definition search, its site search, etc. It isn't clear to me how one might draw a defensible line that does not also sweep in every new search feature as a separate market.
Paul Merrell

Shaking My Head - Medium - 0 views

  • Last month, at the request of the Department of Justice, the Courts approved changes to the obscure Rule 41 of the Federal Rules of Criminal Procedure, which governs search and seizure. By the nature of this obscure bureaucratic process, these rules become law unless Congress rejects the changes before December 1, 2016. Today I, along with my colleagues Senators Paul from Kentucky, Baldwin from Wisconsin, and Daines and Tester from Montana, am introducing the Stopping Mass Hacking (SMH) Act (bill, summary), a bill to protect millions of law-abiding Americans from a massive expansion of government hacking and surveillance. Join the conversation with #SMHact.
  • For law enforcement to conduct a remote electronic search, they generally need to plant malware in — i.e. hack — a device. These rule changes will allow the government to search millions of computers with the warrant of a single judge. To me, that’s clearly a policy change that’s outside the scope of an “administrative change,” and it is something that Congress should consider. An agency with the record of the Justice Department shouldn’t be able to wave its arms and grant itself entirely new powers.
  • These changes say that if law enforcement doesn’t know where an electronic device is located, a magistrate judge will now have the authority to issue a warrant to remotely search the device, anywhere in the world. While it may be appropriate to address the issue of allowing a remote electronic search for a device at an unknown location, Congress needs to consider what protections must be in place to protect Americans’ digital security and privacy. This is a new and uncertain area of law, so there needs to be full and careful debate. The ACLU has a thorough discussion of the Fourth Amendment ramifications and the technological questions at issue with these kinds of searches. The second part of the change to Rule 41 would give a magistrate judge the authority to issue a single warrant that would authorize the search of an unlimited number — potentially thousands or millions — of devices, located anywhere in the world. These changes would dramatically expand the government’s hacking and surveillance authority. The American public should understand that these changes won’t just affect criminals: computer security experts and civil liberties advocates say the amendments would also dramatically expand the government’s ability to hack the electronic devices of law-abiding Americans if their devices were affected by a computer attack. Devices will be subject to search if their owners were victims of a botnet attack — so the government will be treating victims of hacking the same way they treat the perpetrators.
  • ...1 more annotation...
  • As the Center on Democracy and Technology has noted, there are approximately 500 million computers that fall under this rule. The public doesn’t know nearly enough about how law enforcement executes these hacks, and what risks these types of searches will pose. By compromising the computer’s system, the search might leave it open to other attackers or damage the computer they are searching. Don’t take it from me that this will impact your security, read more from security researchers Steven Bellovin, Matt Blaze and Susan Landau. Finally, these changes to Rule 41 would also give some types of electronic searches different, weaker notification requirements than physical searches. Under this new Rule, they are only required to make “reasonable efforts” to notify people that their computers were searched. This raises the possibility of the FBI hacking into a cyber attack victim’s computer and not telling them about it until afterward, if at all.
Paul Merrell

In Hearing on Internet Surveillance, Nobody Knows How Many Americans Impacted in Data C... - 0 views

  • The Senate Judiciary Committee held an open hearing today on the FISA Amendments Act, the law that ostensibly authorizes the digital surveillance of hundreds of millions of people both in the United States and around the world. Section 702 of the law, scheduled to expire next year, is designed to allow U.S. intelligence services to collect signals intelligence on foreign targets related to our national security interests. However—thanks to the leaks of many whistleblowers including Edward Snowden, the work of investigative journalists, and statements by public officials—we now know that the FISA Amendments Act has been used to sweep up data on hundreds of millions of people who have no connection to a terrorist investigation, including countless Americans. What do we mean by “countless”? As became increasingly clear in the hearing today, the exact number of Americans impacted by this surveillance is unknown. Senator Franken asked the panel of witnesses, “Is it possible for the government to provide an exact count of how many United States persons have been swept up in Section 702 surveillance? And if not the exact count, then what about an estimate?”
  • The lack of information makes rigorous oversight of the programs all but impossible. As Senator Franken put it in the hearing today, “When the public lacks even a rough sense of the scope of the government’s surveillance program, they have no way of knowing if the government is striking the right balance, whether we are safeguarding our national security without trampling on our citizens’ fundamental privacy rights. But the public can’t know if we succeed in striking that balance if they don’t even have the most basic information about our major surveillance programs."  Senator Patrick Leahy also questioned the panel about the “minimization procedures” associated with this type of surveillance, the privacy safeguard that is intended to ensure that irrelevant data and data on American citizens is swiftly deleted. Senator Leahy asked the panel: “Do you believe the current minimization procedures ensure that data about innocent Americans is deleted? Is that enough?”  David Medine, who recently announced his pending retirement from the Privacy and Civil Liberties Oversight Board, answered unequivocally:
  • Elizabeth Goitein, the Brennan Center director whose articulate and thought-provoking testimony was the highlight of the hearing, noted that at this time an exact number would be difficult to provide. However, she asserted that an estimate should be possible for most if not all of the government’s surveillance programs. None of the other panel participants—which included David Medine and Rachel Brand of the Privacy and Civil Liberties Oversight Board as well as Matthew Olsen of IronNet Cybersecurity and attorney Kenneth Wainstein—offered an estimate. Today’s hearing reaffirmed that it is not only the American people who are left in the dark about how many people or accounts are impacted by the NSA’s dragnet surveillance of the Internet. Even vital oversight committees in Congress like the Senate Judiciary Committee are left to speculate about just how far-reaching this surveillance is. It's part of the reason why we urged the House Judiciary Committee to demand that the Intelligence Community provide the public with a number. 
  • ...2 more annotations...
  • Senator Leahy, they don’t. The minimization procedures call for the deletion of innocent Americans’ information upon discovery to determine whether it has any foreign intelligence value. But what the board’s report found is that in fact information is never deleted. It sits in the databases for 5 years, or sometimes longer. And so the minimization doesn’t really address the privacy concerns of incidentally collected communications—again, where there’s been no warrant at all in the process… In the United States, we simply can’t read people’s emails and listen to their phone calls without court approval, and the same should be true when the government shifts its attention to Americans under this program. One of the most startling exchanges from the hearing today came toward the end of the session, when Senator Dianne Feinstein—who also sits on the Intelligence Committee—seemed taken aback by Ms. Goitein’s mention of “backdoor searches.” 
  • Feinstein: Wow, wow. What do you call it? What’s a backdoor search?
    Goitein: Backdoor search is when the FBI or any other agency targets a U.S. person for a search of data that was collected under Section 702, which is supposed to be targeted against foreigners overseas.
    Feinstein: Regardless of the minimization that was properly carried out.
    Goitein: Well the data is searched in its unminimized form. So the FBI gets raw data, the NSA, the CIA get raw data. And they search that raw data using U.S. person identifiers. That’s what I’m referring to as backdoor searches.
    It’s deeply concerning that any member of Congress, much less a member of the Senate Judiciary Committee and the Senate Intelligence Committee, might not be aware of the problem surrounding backdoor searches. In April 2014, the Director of National Intelligence acknowledged the searches of this data, which Senators Ron Wyden and Mark Udall termed “the ‘back-door search’ loophole in section 702.” The public was so incensed that the House of Representatives passed an amendment to that year's defense appropriations bill effectively banning the warrantless backdoor searches. Nonetheless, in the hearing today it seemed like Senator Feinstein might not recognize or appreciate the serious implications of allowing U.S. law enforcement agencies to query the raw data collected through these Internet surveillance programs. Hopefully today’s testimony helped convince the Senator that there is more to this topic than what she’s hearing in jargon-filled classified security briefings.
  •  
    The 4th Amendment: "The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and *particularly describing the place to be searched, and the* persons or *things to be seized."* So much for the particularized description of the place to be searched and the things to be seized. Fah! Who needs a Constitution, anyway ....
Alexandra IcecreamApps

Google Search Alternatives - Icecream Tech Digest - 1 views

  •  
    When it comes to searching for something online, the first website we turn to is Google. This king of the search engines has become so major that plenty of people say "Google it" instead of "Look this up online." However, …
Paul Merrell

Evidence of Google blacklisting of left and progressive sites continues to mount - Worl... - 0 views

  • A growing number of leading left-wing websites have confirmed that their search traffic from Google has plunged in recent months, adding to evidence that Google, under the cover of a fraudulent campaign against fake news, is implementing a program of systematic and widespread censorship. Truthout, a not-for-profit news website that focuses on political, social, and ecological developments from a left progressive standpoint, had its readership plunge by 35 percent since April. The Real News, a nonprofit video news and documentary service, has had its search traffic fall by 37 percent. Another site, Common Dreams, last week told the WSWS that its search traffic had fallen by up to 50 percent. As extreme as these sudden drops in search traffic are, they do not equal the nearly 70 percent drop in traffic from Google seen by the WSWS. “This is political censorship of the worst sort; it’s just an excuse to suppress political viewpoints,” said Robert Epstein, a former editor in chief of Psychology Today and noted expert on Google. Epstein said that at this point, the question was whether the WSWS had been flagged specifically by human evaluators employed by the search giant, or whether those evaluators had influenced the Google Search engine to demote left-wing sites. “What you don’t know is whether this was the human evaluators who are demoting you, or whether it was the new algorithm they are training,” Epstein said.
  • Richard Stallman, the world-renowned technology pioneer and a leader of the free software movement, said he had read the WSWS’s coverage on Google’s censorship of left-wing sites. He warned about the immense control exercised by Google over the Internet, saying, “For people’s main way of finding articles about a topic to be run by a giant corporation creates an obvious potential for abuse.” According to data from the search optimization tool SEMRush, search traffic to Mr. Stallman’s personal website, Stallman.org, fell by 24 percent, while traffic to gnu.org, operated by the Free Software Foundation, fell 19 percent. Eric Maas, a search engine optimization consultant working in the San Francisco Bay area, said his team has surveyed a wide range of alternative news sites affected by changes in Google’s algorithms since April.  “While the update may be targeting specific site functions, there is evidence that this update is promoting only large mainstream news organizations. What I find problematic with this is that it appears that some sites have been targeted and others have not.” The massive drop in search traffic to the WSWS and other left-wing sites followed the implementation of changes in Google’s search evaluation protocols. In a statement issued on April 25, Ben Gomes, the company’s vice president for engineering, stated that Google’s update of its search engine would block access to “offensive” sites, while working to surface more “authoritative content.” In a set of guidelines issued to Google evaluators in March, the company instructed its search evaluators to flag pages returning “conspiracy theories” or “upsetting” content unless “the query clearly indicates the user is seeking an alternative viewpoint.”
Paul Merrell

2nd Cir. Affirms That Creation of Full-Text Searchable Database of Works Is Fair Use | ... - 0 views

  • The fair use doctrine permits the unauthorized digitization of copyrighted works in order to create a full-text searchable database, the U.S. Court of Appeals for the Second Circuit ruled June 10. Affirming summary judgment in favor of a consortium of university libraries, the court also ruled that the fair use doctrine permits the unauthorized conversion of those works into accessible formats for use by persons with disabilities, such as the blind.
  • The dispute is connected to the long-running conflict between Google Inc. and various authors of books that Google included in a mass digitization program. In 2004, Google began soliciting the participation of publishers in its Google Print for Publishers service, part of what was then called the Google Print project, aimed at making information available for free over the Internet. Subsequently, Google announced a new project, Google Print for Libraries. In 2005, Google Print was renamed Google Book Search and it is now known simply as Google Books. Under this program, Google made arrangements with several of the world's largest libraries to digitize the entire contents of their collections to create an online full-text searchable database. The announcement of this program triggered a copyright infringement action by the Authors Guild that continues to this day.
  • Part of the deal between Google and the libraries included an offer by Google to hand over to the libraries their own copies of the digitized versions of their collections. In 2011, a group of those libraries announced the establishment of a new service, called the HathiTrust digital library, to which the libraries would contribute their digitized collections. This database of copies is to be made available for full-text searching and preservation activities. Additionally, it is intended to offer free access to works to individuals who have “print disabilities.” For works under copyright protection, the search function would return only a list of page numbers that a search term appeared on and the frequency of such appearance.
  • ...3 more annotations...
  • Turning to the fair use question, the court first concluded that the full-text search function of the HathiTrust Digital Library was a “quintessentially transformative use,” and thus constituted fair use. The court said: the result of a word search is different in purpose, character, expression, meaning, and message from the page (and the book) from which it is drawn. Indeed, we can discern little or no resemblance between the original text and the results of the HDL full-text search. There is no evidence that the Authors write with the purpose of enabling text searches of their books. Consequently, the full-text search function does not “supersede[ ] the objects [or purposes] of the original creation.” Turning to the fourth fair use factor—whether the use functions as a substitute for the original work—the court rejected the argument that such use represents lost sales to the extent that it prevents the future development of a market for licensing copies of works to be used in full-text searches. However, the court emphasized that the search function “does not serve as a substitute for the books that are being searched.”
  • The court also rejected the argument that the database represented a threat of a security breach that could result in the full text of all the books becoming available for anyone to access. The court concluded that HathiTrust's assertions of its security measures were unrebutted. Thus, the full-text search function was found to be protected as fair use.
  • The court also concluded that allowing those with print disabilities access to the full texts of the works collected in the HathiTrust database was protected as fair use. Support for this conclusion came from the legislative history of the Copyright Act's fair use provision, 17 U.S.C. §107.
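The restricted output the court describes for in-copyright works (a list of page numbers where a term appears, plus its frequency on each, with none of the surrounding text) can be made concrete with a toy inverted index. This is purely illustrative: the data and function names are invented and have no connection to HathiTrust's actual system.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercased term to {page_number: occurrence_count}.

    `pages` is a list of page texts; page numbers are 1-based.
    """
    index = defaultdict(lambda: defaultdict(int))
    for page_no, text in enumerate(pages, start=1):
        for word in text.lower().split():
            index[word][page_no] += 1
    return index

def search(index, term):
    """Return (page_number, frequency) pairs for a term.

    Only locations and counts are disclosed, never the surrounding
    text -- mirroring the restricted result the court described.
    """
    return sorted(index.get(term.lower(), {}).items())

# Invented sample pages, standing in for a scanned book.
pages = [
    "the whale surfaced near the ship",
    "call me ishmael said the narrator",
    "the whale the whale cried the crew",
]
idx = build_index(pages)
print(search(idx, "whale"))  # pages containing "whale", with counts
```

The point of the design is that the index alone cannot reconstruct readable text, which is central to the court's conclusion that the search function does not substitute for the books themselves.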
Paul Merrell

Hakia Retools Semantic Search Engine to Better Battle Google, Yahoo - 0 views

  • Semantic search engine startup Hakia has retooled its Web site, adding tabs for news, images and "credible" site searches as a way to differentiate between its search approach and what it calls the "10 blue links" approach search incumbents Google, Yahoo and Microsoft have used in the first era of search engines. Hakia employs semantic search technologies, leveraging natural language processing to derive broader meaning from search queries.
  • Hakia began hawking "credible" Web sites, vetted by librarians and information professionals, in April for health and medical searches drawing from sites examined by the Medical Library Association. These sites have a peer review process or strict editorial controls to ensure the accuracy of the information and zero commercial bias. The idea is to clearly define sites users can trust in an age when do-it-yourself chronicling via Wikipedia and other sites that enable crowdsourcing activities has led to some questionable results.
Paul Merrell

Federal Court Rules Suspicionless Searches of Travelers' Phones and Laptops Unconstitut... - 1 views

  • In a major victory for privacy rights at the border, a federal court in Boston ruled today that suspicionless searches of travelers’ electronic devices by federal agents at airports and other U.S. ports of entry are unconstitutional. The ruling came in a lawsuit, Alasaad v. McAleenan, filed by the American Civil Liberties Union (ACLU), Electronic Frontier Foundation (EFF), and ACLU of Massachusetts, on behalf of 11 travelers whose smartphones and laptops were searched without individualized suspicion at U.S. ports of entry. “This ruling significantly advances Fourth Amendment protections for millions of international travelers who enter the United States every year,” said Esha Bhandari, staff attorney with the ACLU’s Speech, Privacy, and Technology Project. “By putting an end to the government’s ability to conduct suspicionless fishing expeditions, the court reaffirms that the border is not a lawless place and that we don’t lose our privacy rights when we travel.”
  • The district court order puts an end to the asserted authority of Customs and Border Protection (CBP) and Immigration and Customs Enforcement (ICE) to search and seize travelers’ devices for purposes far afield from the enforcement of immigration and customs laws. Border officers must now demonstrate individualized suspicion of illegal contraband before they can search a traveler’s device. The number of electronic device searches at U.S. ports of entry has increased significantly. Last year, CBP conducted more than 33,000 searches, almost four times the number from just three years prior. International travelers returning to the United States have reported numerous cases of abusive searches in recent months. While searching through the phone of Zainab Merchant, a plaintiff in the Alasaad case, a border agent knowingly rifled through privileged attorney-client communications. An immigration officer at Boston Logan Airport reportedly searched an incoming Harvard freshman’s cell phone and laptop, reprimanded the student for friends’ social media postings expressing views critical of the U.S. government, and denied the student entry into the country following the search.
    For the order: https://www.eff.org/document/alasaad-v-nielsen-summary-judgment-order
    For more on this case: https://www.eff.org/cases/alasaad-v-duke
Paul Merrell

Update: Google rolls out semantic search capabilities | InfoWorld | News | 2009-03-24 |... - 0 views

  • Google has given its Web search engine an injection of semantic technology, as the search leader pushes into what many consider the future of search on the Internet. The new technology will allow Google's search engine to identify associations and concepts related to a query, improving the list of related search terms Google displays along with its results, the company announced in an official blog on Tuesday.
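As a very rough illustration of what "identifying associations related to a query" can mean in practice, here is a toy co-occurrence sketch: terms that frequently appear alongside the query term across a corpus are suggested as related. It is hypothetical throughout; Google has not published the algorithms behind this feature, and the mini-corpus and function names below are invented.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Invented mini-corpus standing in for query/document logs.
documents = [
    "solar energy panel installation",
    "solar panel cost estimate",
    "wind energy turbine cost",
    "solar energy storage battery",
]

def related_terms(docs, query, top_n=3):
    """Suggest terms that most often co-occur with the query term."""
    co = defaultdict(Counter)
    for doc in docs:
        words = set(doc.lower().split())  # one count per document
        for a, b in combinations(sorted(words), 2):
            co[a][b] += 1
            co[b][a] += 1
    return [term for term, _ in co[query.lower()].most_common(top_n)]

print(related_terms(documents, "solar"))
```

Production semantic search layers far richer signals on top of this idea (entity graphs, natural language parsing, and today, learned embeddings), but document-level co-occurrence is the simplest starting point for surfacing related concepts.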
Paul Merrell

From Radio to Porn, British Spies Track Web Users' Online Identities - 1 views

  • THERE WAS A SIMPLE AIM at the heart of the top-secret program: Record the website browsing habits of “every visible user on the Internet.” Before long, billions of digital records about ordinary people’s online activities were being stored every day. Among them were details cataloging visits to porn, social media and news websites, search engines, chat forums, and blogs. The mass surveillance operation — code-named KARMA POLICE — was launched by British spies about seven years ago without any public debate or scrutiny. It was just one part of a giant global Internet spying apparatus built by the United Kingdom’s electronic eavesdropping agency, Government Communications Headquarters, or GCHQ. The revelations about the scope of the British agency’s surveillance are contained in documents obtained by The Intercept from National Security Agency whistleblower Edward Snowden. Previous reports based on the leaked files have exposed how GCHQ taps into Internet cables to monitor communications on a vast scale, but many details about what happens to the data after it has been vacuumed up have remained unclear.
  • Amid a renewed push from the U.K. government for more surveillance powers, more than two dozen documents being disclosed today by The Intercept reveal for the first time several major strands of GCHQ’s existing electronic eavesdropping capabilities.
  • The surveillance is underpinned by an opaque legal regime that has authorized GCHQ to sift through huge archives of metadata about the private phone calls, emails and Internet browsing logs of Brits, Americans, and any other citizens — all without a court order or judicial warrant.
  • ...17 more annotations...
  • A huge volume of the Internet data GCHQ collects flows directly into a massive repository named Black Hole, which is at the core of the agency’s online spying operations, storing raw logs of intercepted material before it has been subject to analysis. Black Hole contains data collected by GCHQ as part of bulk “unselected” surveillance, meaning it is not focused on particular “selected” targets and instead includes troves of data indiscriminately swept up about ordinary people’s online activities. Between August 2007 and March 2009, GCHQ documents say that Black Hole was used to store more than 1.1 trillion “events” — a term the agency uses to refer to metadata records — with about 10 billion new entries added every day. As of March 2009, the largest slice of data Black Hole held — 41 percent — was about people’s Internet browsing histories. The rest included a combination of email and instant messenger records, details about search engine queries, information about social media activity, logs related to hacking operations, and data on people’s use of tools to browse the Internet anonymously.
  • Throughout this period, as smartphone sales started to boom, the frequency of people’s Internet use was steadily increasing. In tandem, British spies were working frantically to bolster their spying capabilities, with plans afoot to expand the size of Black Hole and other repositories to handle an avalanche of new data. By 2010, according to the documents, GCHQ was logging 30 billion metadata records per day. By 2012, collection had increased to 50 billion per day, and work was underway to double capacity to 100 billion. The agency was developing “unprecedented” techniques to perform what it called “population-scale” data mining, monitoring all communications across entire countries in an effort to detect patterns or behaviors deemed suspicious. It was creating what it said would be, by 2013, “the world’s biggest” surveillance engine “to run cyber operations and to access better, more valued data for customers to make a real world difference.”
  • A document from the GCHQ target analysis center (GTAC) shows the Black Hole repository’s structure.
  • The data is searched by GCHQ analysts in a hunt for behavior online that could be connected to terrorism or other criminal activity. But it has also served a broader and more controversial purpose — helping the agency hack into European companies’ computer networks. In the lead up to its secret mission targeting Netherlands-based Gemalto, the largest SIM card manufacturer in the world, GCHQ used MUTANT BROTH in an effort to identify the company’s employees so it could hack into their computers. The system helped the agency analyze intercepted Facebook cookies it believed were associated with Gemalto staff located at offices in France and Poland. GCHQ later successfully infiltrated Gemalto’s internal networks, stealing encryption keys produced by the company that protect the privacy of cell phone communications.
  • Similarly, MUTANT BROTH proved integral to GCHQ’s hack of Belgian telecommunications provider Belgacom. The agency entered IP addresses associated with Belgacom into MUTANT BROTH to uncover information about the company’s employees. Cookies associated with the IPs revealed the Google, Yahoo, and LinkedIn accounts of three Belgacom engineers, whose computers were then targeted by the agency and infected with malware. The hacking operation resulted in GCHQ gaining deep access into the most sensitive parts of Belgacom’s internal systems, granting British spies the ability to intercept communications passing through the company’s networks.
  • In March, a U.K. parliamentary committee published the findings of an 18-month review of GCHQ’s operations and called for an overhaul of the laws that regulate the spying. The committee raised concerns about the agency gathering what it described as “bulk personal datasets” being held about “a wide range of people.” However, it censored the section of the report describing what these “datasets” contained, despite acknowledging that they “may be highly intrusive.” The Snowden documents shine light on some of the core GCHQ bulk data-gathering programs that the committee was likely referring to — pulling back the veil of secrecy that has shielded some of the agency’s most controversial surveillance operations from public scrutiny. KARMA POLICE and MUTANT BROTH are among the key bulk collection systems. But they do not operate in isolation — and the scope of GCHQ’s spying extends far beyond them.
  • The agency operates a bewildering array of other eavesdropping systems, each serving its own specific purpose and designated a unique code name, such as: SOCIAL ANTHROPOID, which is used to analyze metadata on emails, instant messenger chats, social media connections and conversations, plus “telephony” metadata about phone calls, cell phone locations, text and multimedia messages; MEMORY HOLE, which logs queries entered into search engines and associates each search with an IP address; MARBLED GECKO, which sifts through details about searches people have entered into Google Maps and Google Earth; and INFINITE MONKEYS, which analyzes data about the usage of online bulletin boards and forums. GCHQ has other programs that it uses to analyze the content of intercepted communications, such as the full written body of emails and the audio of phone calls. One of the most important content collection capabilities is TEMPORA, which mines vast amounts of emails, instant messages, voice calls and other communications and makes them accessible through a Google-style search tool named XKEYSCORE.
  • As of September 2012, TEMPORA was collecting “more than 40 billion pieces of content a day” and it was being used to spy on people across Europe, the Middle East, and North Africa, according to a top-secret memo outlining the scope of the program. The existence of TEMPORA was first revealed by The Guardian in June 2013. To analyze all of the communications it intercepts and to build a profile of the individuals it is monitoring, GCHQ uses a variety of different tools that can pull together all of the relevant information and make it accessible through a single interface. SAMUEL PEPYS is one such tool, built by the British spies to analyze both the content and metadata of emails, browsing sessions, and instant messages as they are being intercepted in real time. One screenshot of SAMUEL PEPYS in action shows the agency using it to monitor an individual in Sweden who visited a page about GCHQ on the U.S.-based anti-secrecy website Cryptome.
  • Partly due to the U.K.’s geographic location — situated between the United States and the western edge of continental Europe — a large amount of the world’s Internet traffic passes through its territory across international data cables. In 2010, GCHQ noted that what amounted to “25 percent of all Internet traffic” was transiting the U.K. through some 1,600 different cables. The agency said that it could “survey the majority of the 1,600” and “select the most valuable to switch into our processing systems.”
  • According to Joss Wright, a research fellow at the University of Oxford’s Internet Institute, tapping into the cables allows GCHQ to monitor a large portion of foreign communications. But the cables also transport masses of wholly domestic British emails and online chats, because when anyone in the U.K. sends an email or visits a website, their computer will routinely send and receive data from servers that are located overseas. “I could send a message from my computer here [in England] to my wife’s computer in the next room and on its way it could go through the U.S., France, and other countries,” Wright says. “That’s just the way the Internet is designed.” In other words, Wright adds, that means “a lot” of British data and communications transit across international cables daily, and are liable to be swept into GCHQ’s databases.
  • A map from a classified GCHQ presentation about intercepting communications from undersea cables. GCHQ is authorized to conduct dragnet surveillance of the international data cables through so-called external warrants that are signed off by a government minister. The external warrants permit the agency to monitor communications in foreign countries as well as British citizens’ international calls and emails — for example, a call from Islamabad to London. They prohibit GCHQ from reading or listening to the content of “internal” U.K. to U.K. emails and phone calls, which are supposed to be filtered out from GCHQ’s systems if they are inadvertently intercepted unless additional authorization is granted to scrutinize them. However, the same rules do not apply to metadata. A little-known loophole in the law allows GCHQ to use external warrants to collect and analyze bulk metadata about the emails, phone calls, and Internet browsing activities of British people, citizens of closely allied countries, and others, regardless of whether the data is derived from domestic U.K. to U.K. communications and browsing sessions or otherwise. In March, the existence of this loophole was quietly acknowledged by the U.K. parliamentary committee’s surveillance review, which stated in a section of its report that “special protection and additional safeguards” did not apply to metadata swept up using external warrants and that domestic British metadata could therefore be lawfully “returned as a result of searches” conducted by GCHQ.
  • Perhaps unsurprisingly, GCHQ appears to have readily exploited this obscure legal technicality. Secret policy guidance papers issued to the agency’s analysts instruct them that they can sift through huge troves of indiscriminately collected metadata records to spy on anyone regardless of their nationality. The guidance makes clear that there is no exemption or extra privacy protection for British people or citizens from countries that are members of the Five Eyes, a surveillance alliance that the U.K. is part of alongside the U.S., Canada, Australia, and New Zealand. “If you are searching a purely Events only database such as MUTANT BROTH, the issue of location does not occur,” states one internal GCHQ policy document, which is marked with a “last modified” date of July 2012. The document adds that analysts are free to search the databases for British metadata “without further authorization” by inputing a U.K. “selector,” meaning a unique identifier such as a person’s email or IP address, username, or phone number. Authorization is “not needed for individuals in the U.K.,” another GCHQ document explains, because metadata has been judged “less intrusive than communications content.” All the spies are required to do to mine the metadata troves is write a short “justification” or “reason” for each search they conduct and then click a button on their computer screen.
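The analyst workflow described above — enter a selector, supply a short free-text justification, click search — can be sketched as follows. This is a toy illustration of the gate the documents describe; the real MUTANT BROTH interface is not public, and all names and data here are invented:

```python
def search_events(database, selector, justification):
    """Return metadata events for a selector, but only if the analyst
    supplied a non-empty justification string -- the sole check the
    documents describe for searches on U.K. selectors."""
    if not justification.strip():
        raise ValueError("a short justification is required for each search")
    return [event for event in database if event["selector"] == selector]

# Hypothetical sample records.
database = [
    {"selector": "+44 20 7946 0000", "event": "call", "peer": "+44 161 4960000"},
    {"selector": "alice@example.org", "event": "email", "peer": "bob@example.net"},
]

hits = search_events(database, "alice@example.org", "target development")
print(len(hits))  # 1
```

The point of the sketch is how little stands between an analyst and the data: the justification is free text, recorded but not pre-approved, which matches the "without further authorization" language in the policy documents.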
  • Intelligence GCHQ collects on British persons of interest is shared with domestic security agency MI5, which usually takes the lead on spying operations within the U.K. MI5 conducts its own extensive domestic surveillance as part of a program called DIGINT (digital intelligence).
  • GCHQ’s documents suggest that it typically retains metadata for periods of between 30 days and six months. It stores the content of communications for a shorter period, varying from three to 30 days. The retention periods can be extended if deemed necessary for “cyber defense.” One secret policy paper dated from January 2010 lists the wide range of information the agency classes as metadata — including location data that could be used to track your movements, your email, instant messenger, and social networking “buddy lists,” logs showing who you have communicated with by phone or email, the passwords you use to access “communications services” (such as an email account), and information about websites you have viewed.
  • Records showing the full website addresses you have visited — for instance, www.gchq.gov.uk/what_we_do — are treated as content. But the first part of an address you have visited — for instance, www.gchq.gov.uk — is treated as metadata. In isolation, a single metadata record of a phone call, email, or website visit may not reveal much about a person’s private life, according to Ethan Zuckerman, director of Massachusetts Institute of Technology’s Center for Civic Media. But if accumulated and analyzed over a period of weeks or months, these details would be “extremely personal,” he told The Intercept, because they could reveal a person’s movements, habits, religious beliefs, political views, relationships, and even sexual preferences. For Zuckerman, who has studied the social and political ramifications of surveillance, the most concerning aspect of large-scale government data collection is that it can be “corrosive towards democracy” — leading to a chilling effect on freedom of expression and communication. “Once we know there’s a reasonable chance that we are being watched in one fashion or another it’s hard for that not to have a ‘panopticon effect,’” he said, “where we think and behave differently based on the assumption that people may be watching and paying attention to what we are doing.”
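The address-classification rule described above is mechanical — everything before the first slash of a visited address is treated as metadata, while the full address is treated as content — and can be expressed directly. This is a sketch of the rule as the documents describe it, not GCHQ code:

```python
def classify_visit(address):
    """Split a visited web address into the part treated as metadata
    (the site) and the part treated as content (the full address)."""
    site, _, _path = address.partition("/")
    return {"metadata": site, "content": address}

record = classify_visit("www.gchq.gov.uk/what_we_do")
print(record["metadata"])  # www.gchq.gov.uk
print(record["content"])   # www.gchq.gov.uk/what_we_do
```

The split looks innocuous, but as Zuckerman notes, even the "metadata" half — a time-ordered list of every site a person visits — accumulates into an intimate portrait.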
  • When compared to surveillance rules in place in the U.S., GCHQ notes in one document that the U.K. has “a light oversight regime.” The more lax British spying regulations are reflected in secret internal rules that highlight greater restrictions on how NSA databases can be accessed. The NSA’s troves can be searched for data on British citizens, one document states, but they cannot be mined for information about Americans or other citizens from countries in the Five Eyes alliance. No such constraints are placed on GCHQ’s own databases, which can be sifted for records on the phone calls, emails, and Internet usage of Brits, Americans, and citizens from any other country. The scope of GCHQ’s surveillance powers explain in part why Snowden told The Guardian in June 2013 that U.K. surveillance is “worse than the U.S.” In an interview with Der Spiegel in July 2013, Snowden added that British Internet cables were “radioactive” and joked: “Even the Queen’s selfies to the pool boy get logged.”
  • In recent years, the biggest barrier to GCHQ’s mass collection of data does not appear to have come in the form of legal or policy restrictions. Rather, it is the increased use of encryption technology that protects the privacy of communications that has posed the biggest potential hindrance to the agency’s activities. “The spread of encryption … threatens our ability to do effective target discovery/development,” says a top-secret report co-authored by an official from the British agency and an NSA employee in 2011. “Pertinent metadata events will be locked within the encrypted channels and difficult, if not impossible, to prise out,” the report says, adding that the agencies were working on a plan that would “(hopefully) allow our Internet Exploitation strategy to prevail.”
Paul Merrell

Dept. of Justice Accuses Google of Illegally Protecting Monopoly - The New York Times - 1 views

  • The Justice Department accused Google on Tuesday of illegally protecting its monopoly over search and search advertising, the government’s most significant challenge to a tech company’s market power in a generation and one that could reshape the way consumers use the internet.In a much-anticipated lawsuit, the agency accused Google of locking up deals with giant partners like Apple and throttling competition through exclusive business contracts and agreements.Google’s deals with Apple, mobile carriers and other handset makers to make its search engine the default option for users accounted for most of its dominant market share in search, the agency said, a figure that it put at around 80 percent.“For many years,” the agency said in its 57-page complaint, “Google has used anticompetitive tactics to maintain and extend its monopolies in the markets for general search services, search advertising and general search text advertising — the cornerstones of its empire.”The lawsuit, which may stretch on for years, could set off a cascade of other antitrust lawsuits from state attorneys general. About four dozen states and jurisdictions, including New York and Texas, have conducted parallel investigations and some of them are expected to bring separate complaints against the company’s grip on technology for online advertising. Eleven state attorneys general, all Republicans, signed on to support the federal lawsuit.
  • The Justice Department did not immediately put forward remedies, such as selling off parts of the company or unwinding business contracts, in the lawsuit. Such actions are typically pursued in later stages of a case.Ryan Shores, an associate deputy attorney general, said “nothing is off the table” in terms of remedies.
  • Democratic lawmakers on the House Judiciary Committee released a sprawling report on the tech giants two weeks ago, also accusing Google of controlling a monopoly over online search and the ads that come up when users enter a query.
  • Google last faced serious scrutiny from an American antitrust regulator nearly a decade ago, when the Federal Trade Commission investigated whether it had abused its power over the search market. The agency’s staff recommended bringing charges against the company, according to a memo reported on by The Wall Street Journal. But the agency’s five commissioners voted in 2013 not to bring a case.Other governments have been more aggressive toward the big tech companies. The European Union has brought three antitrust cases against Google in recent years, focused on its search engine, advertising business and Android mobile operating system. Regulators in Britain and Australia are examining the digital advertising market, in inquiries that could ultimately implicate the company.“It’s the most newsworthy monopolization action brought by the government since the Microsoft case in the late ’90s,” said Bill Baer, a former chief of the Justice Department’s antitrust division. “It’s significant in that the government believes that a highly successful tech platform has engaged in conduct that maintains its monopoly power unlawfully, and as a result injures consumers and competition.”
Paul Merrell

Sun to Distribute Microsoft Live Search-Powered Toolbar as Part of Java Runtime Environ... - 0 views

  • Sun and Microsoft have agreed on a search distribution deal that will offer the MSN Toolbar, powered by Microsoft Live Search, to U.S.-based Internet Explorer users who download the Java Runtime Environment (JRE). This agreement gives Internet Explorer users downloading Sun’s JRE the option to download the MSN Toolbar for one-click access to Live Search features, as well as news, entertainment, sports and more from the MSN network and direct access to Windows Live Hotmail and Windows Live Messenger.
  • “This agreement with Sun Microsystems is another important milestone in our strategy to secure broad-scale distribution for our search offering, enabling millions more people to experience the benefits of Live Search,” said Yusuf Mehdi, senior vice president of the Online Audience Business at Microsoft. “With the vast array of Java software-based Web applications that are downloaded every month, this deal will expose Live Search to millions more Internet users and drive increased volume for our search advertisers.”
Paul Merrell

The Latest Rules on How Long NSA Can Keep Americans' Encrypted Data Look Too Familiar |... - 0 views

  • Does the National Security Agency (NSA) have the authority to collect and keep all encrypted Internet traffic for as long as is necessary to decrypt that traffic? That was a question first raised in June 2013, after the minimization procedures governing telephone and Internet records collected under Section 702 of the Foreign Intelligence Surveillance Act were disclosed by Edward Snowden. The issue quickly receded into the background, however, as the world struggled to keep up with the deluge of surveillance disclosures. The Intelligence Authorization Act of 2015, which passed Congress this last December, should bring the question back to the fore. It established retention guidelines for communications collected under Executive Order 12333 and included an exception that allows NSA to keep ‘incidentally’ collected encrypted communications for an indefinite period of time. This creates a massive loophole in the guidelines. NSA’s retention of encrypted communications deserves further consideration today, now that these retention guidelines have been written into law. It has become increasingly clear over the last year that surveillance reform will be driven by technological change—specifically by the growing use of encryption technologies. Therefore, any legislation touching on encryption should receive close scrutiny.
  • Section 309 of the intel authorization bill describes “procedures for the retention of incidentally acquired communications.” It establishes retention guidelines for surveillance programs that are “reasonably anticipated to result in the acquisition of [telephone or electronic communications] to or from a United States person.” Communications to or from a United States person are ‘incidentally’ collected because the U.S. person is not the actual target of the collection. Section 309 states that these incidentally collected communications must be deleted after five years unless they meet a number of exceptions. One of these exceptions is that “the communication is enciphered or reasonably believed to have a secret meaning.” This exception appears to be directly lifted from NSA’s minimization procedures for data collected under Section 702 of FISA, which were declassified in 2013. 
  • While Section 309 specifically applies to collection taking place under E.O. 12333, not FISA, several of the exceptions described in Section 309 closely match exceptions in the FISA minimization procedures. That includes the exception for “enciphered” communications. Those minimization procedures almost certainly served as a model for these retention guidelines and will likely shape how this new language is interpreted by the Executive Branch. Section 309 also asks the heads of each relevant member of the intelligence community to develop procedures to ensure compliance with new retention requirements. I expect those procedures to look a lot like the FISA minimization guidelines.
  • This language is broad, circular, and technically incoherent, so it takes some effort to parse appropriately. When the minimization procedures were disclosed in 2013, this language was interpreted by outside commentators to mean that NSA may keep all encrypted data that has been incidentally collected under Section 702 for at least as long as is necessary to decrypt that data. Is this the correct interpretation? I think so. It is important to realize that the language above isn’t just broad. It seems purposefully broad. The part regarding relevance seems to mirror the rationale NSA has used to justify its bulk phone records collection program. Under that program, all phone records were relevant because some of those records could be valuable to terrorism investigations and (allegedly) it isn’t possible to collect only those valuable records. This is the “to find a needle in a haystack, you first have to have the haystack” argument. The same argument could be applied to encrypted data and might be at play here.
  • This exception doesn’t just apply to encrypted data that might be relevant to a current foreign intelligence investigation. It also applies to cases in which the encrypted data is likely to become relevant to a future intelligence requirement. This is some remarkably generous language. It seems one could justify keeping any type of encrypted data under this exception. Upon close reading, it is difficult to avoid the conclusion that these procedures were written carefully to allow NSA to collect and keep a broad category of encrypted data under the rationale that this data might contain the communications of NSA targets and that it might be decrypted in the future. If NSA isn’t doing this today, then whoever wrote these minimization procedures wanted to at least ensure that NSA has the authority to do this tomorrow.
  • There are a few additional observations that are worth making regarding these nominally new retention guidelines and Section 702 collection. First, the concept of incidental collection as it has typically been used makes very little sense when applied to encrypted data. The way that NSA’s Section 702 upstream “about” collection is understood to work is that technology installed on the network does some sort of pattern match on Internet traffic; say that an NSA target uses example@gmail.com to communicate. NSA would then search content of emails for references to example@gmail.com. This could notionally result in a lot of incidental collection of U.S. persons’ communications whenever the email that references example@gmail.com is somehow mixed together with emails that have nothing to do with the target. This type of incidental collection isn’t possible when the data is encrypted because it won’t be possible to search and find example@gmail.com in the body of an email. Instead, example@gmail.com will have been turned into some alternative, indecipherable string of bits on the network. Incidental collection shouldn’t occur because the pattern match can’t occur in the first place. This demonstrates that, when communications are encrypted, it will be much harder for NSA to search Internet traffic for a unique ID associated with a specific target.
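The point about pattern matching can be demonstrated directly: a plaintext selector is findable in plaintext traffic but not in the corresponding ciphertext. The single-byte XOR "cipher" below is NOT real encryption — it is a deliberately trivial stand-in, chosen only so the example is self-contained, and the message itself is invented:

```python
SELECTOR = b"example@gmail.com"

def toy_encrypt(data, key=0x5A):
    """A toy XOR transform standing in for real encryption. It is enough
    to show that the ciphertext no longer contains the plaintext selector,
    so a network pattern match on the selector finds nothing."""
    return bytes(b ^ key for b in data)

plaintext = b"From: example@gmail.com\r\nSubject: hello\r\n\r\nbody text"
ciphertext = toy_encrypt(plaintext)

print(SELECTOR in plaintext)   # True
print(SELECTOR in ciphertext)  # False
```

With real ciphers the effect is the same but stronger: the selector bytes are unrecoverable without the key, so "about" collection cannot trigger on them at all.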
  • This lends further credence to the conclusion above: rather than doing targeted collection against specific individuals, NSA is collecting, or plans to collect, a broad class of data that is encrypted. For example, NSA might collect all PGP encrypted emails or all Tor traffic. In those cases, NSA could search Internet traffic for patterns associated with specific types of communications, rather than specific individuals’ communications. This would technically meet the definition of incidental collection because such activity would result in the collection of communications of U.S. persons who aren’t the actual targets of surveillance. Collection of all Tor traffic would entail a lot of this “incidental” collection because the communications of NSA targets would be mixed with the communications of a large number of non-target U.S. persons. However, this “incidental” collection is inconsistent with how the term is typically used, which is to refer to over-collection resulting from targeted surveillance programs. If NSA were collecting all Tor traffic, that activity wouldn’t actually be targeted, and so any resulting over-collection wouldn’t actually be incidental. Moreover, greater use of encryption by the general public would result in an ever-growing amount of this type of incidental collection.
  • This type of collection would also be inconsistent with representations of Section 702 upstream collection that have been made to the public and to Congress. Intelligence officials have repeatedly suggested that search terms used as part of this program have a high degree of specificity. They have also argued that the program is an example of targeted rather than bulk collection. ODNI General Counsel Robert Litt, in a March 2014 meeting before the Privacy and Civil Liberties Oversight Board, stated that “there is either a misconception or a mischaracterization commonly repeated that Section 702 is a form of bulk collection. It is not bulk collection. It is targeted collection based on selectors such as telephone numbers or email addresses where there’s reason to believe that the selector is relevant to a foreign intelligence purpose.” The collection of Internet traffic based on patterns associated with types of communications would be bulk collection; more akin to NSA’s collection of phone records en masse than it is to targeted collection focused on specific individuals. Moreover, this type of collection would certainly fall within the definition of bulk collection provided just last week by the National Academy of Sciences: “collection in which a significant portion of the retained data pertains to identifiers that are not targets at the time of collection.”
  • The Section 702 minimization procedures, which will serve as a template for any new retention guidelines established for E.O. 12333 collection, create a large loophole for encrypted communications. With everything from email to Internet browsing to real-time communications moving to encrypted formats, an ever-growing amount of Internet traffic will fall within this loophole.
  •  
    Tucked into a budget authorization act in December without press notice. Section 309 (the Act is linked from the article) appears to be very broad authority for the NSA to intercept any form of telephone or other electronic information in bulk. There are far more exceptions from the five-year retention limitation than the encrypted information exception. When reading this, keep in mind that the U.S. intelligence community plays semantic games to obfuscate what it does. One of its word plays is that communications are not "collected" until an analyst looks at or listens to particular data, even though the data will be searched to find information countless times before it becomes "collected." That searching was the major basis for a decision by the U.S. District Court in Washington, D.C. that bulk collection of telephone communications was unconstitutional: under the Fourth Amendment, a "search" or "seizure" requiring a judicial warrant occurs no later than when the information is intercepted. That case is on appeal, has been briefed and argued, and a decision could come any time now. Similar cases are pending in two other courts of appeals. Also, an important definition from the new Intelligence Authorization Act: "(a) DEFINITIONS.-In this section: (1) COVERED COMMUNICATION.-The term ''covered communication'' means any nonpublic telephone or electronic communication acquired without the consent of a person who is a party to the communication, including communications in electronic storage."
Gary Edwards

Microsoft, Google Search and the Future of the Open Web - Google Docs - 0 views

  •  
    Response to the InformationWeek article "Remaking Microsoft: Get Out of Web Search!". Covers "The Myth of Google Enterprise Search", and the refusal of Google to implement or recognize W3C Semantic Web technologies. This refusal protects Google's proprietary search and categorization algorithms, but it opens the door wide for Microsoft Office editors to totally exploit the end-user semantic interface opportunities. If Microsoft can pull this off, they will take "search" to the Enterprise and beyond into every high end discipline using MSOffice to edit Web ready documents (private and public use). Also a bit about WebKit as the most disruptive technology Microsoft has faced since the advent of the Web.
Gary Edwards

Why Google Isn't Enough - Forbes.com - 0 views

  • There are three key ways that successful implementations of enterprise search differ from the search we use on the Web: richer user interfaces, business process context and heterogeneous content.
  •  
    One key refrain that expresses this trend is heard in companies around the world: "Why can't we have a Google inside the four walls of our company?" While at first this seems like a good idea, the problem of using search inside a company is much more complicated than just indexing documents, throwing up a search box and asking people if they feel lucky. This week, JargonSpy explores just what "enterprise search" means and why it is a complicated challenge that is becoming increasingly urgent for most companies to solve.
Paul Merrell

Inside Google Desktop: Google Desktop Update - 0 views

  •  
    Google throws in the towel on desktop search, just as Microsoft somehow reached into my WinXP Pro (which never runs with automatic updates turned on) and killed the file search functionality, replaced by a message that file search is no longer supported in Explorer 6, with an invitation to upgrade MSIE or use Bing. As though I would ever let MSIE outside my firewall! Somehow, the ability to search the cloud just isn't enough for me.  
Gary Edwards

Official Google Webmaster Central Blog: Introducing Rich Snippets - 0 views

  •  
    Google "Rich Snippets" is a new presentation of HTML snippets that applies Google's algorithms to highlight structured data embedded in web pages. Rich Snippets give end-users convenient summary information about their search results at a glance. Google is currently supporting a very limited subset of data about reviews and people. When searching for a product or service, users can easily see reviews and ratings, and when searching for a person, they'll get help distinguishing between people with the same name. It's a simple change to the display of search results, yet our experiments have shown that users find the new data valuable. For this to work though, both Web-masters and Web-workers have to annotate their pages with structured data in a standard format. Google Snippets supports microformats and RDFa. Existing Web data can be wrapped with some additional tags to accomplish this. Notice that Google avoids mention of RDF and the W3C's vision of a "Semantic Web" where Web objects are fully described in machine-readable semantics. Over at the WHATWG group, where work on HTML5 continues, Google's Ian Hickson has been fighting RDFa and the Semantic Web in what looks to be an effort to protect the infamous Google algorithms. RDFa provides a means for Web-workers, knowledge-workers, line-of-business managers and document-generating end-users to enrich their HTML+ with machine semantics. The idea is that the document experts creating Web content can best describe to search engine and content management machines the objects-of-information used. The Google algorithms provide a proprietary semantics of this same content. The best solution to the tsunami of content the Web has wrought would be to combine end-user semantic expertise with Google algorithms. Let's hope Google stays the RDFa course and comes around to recognize the full potential of organizing the world's information with the input of content providers. One thing the world desperatel
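A minimal illustration of the annotation idea: a review marked up with microformat class names, and a small extractor that recovers the structured fields the way a snippet generator would. The `fn` and `rating` property names follow the hReview microformat; the review itself is invented, and the extractor is a sketch, not Google's parser:

```python
from html.parser import HTMLParser

# Hypothetical hReview-annotated page fragment.
REVIEW_HTML = """
<div class="hreview">
  <span class="item"><span class="fn">Blast 'Em Up</span></span>
  Rating: <span class="rating">4.5</span> out of 5.
</div>
"""

class HReviewExtractor(HTMLParser):
    """Collect text found inside elements whose class attribute is a
    known hReview property (here just 'fn' and 'rating')."""
    PROPS = {"fn", "rating"}

    def __init__(self):
        super().__init__()
        self.open_classes = []  # class attribute of each currently open element
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        self.open_classes.append(dict(attrs).get("class"))

    def handle_endtag(self, tag):
        if self.open_classes:
            self.open_classes.pop()

    def handle_data(self, data):
        for cls in self.open_classes:
            if cls in self.PROPS:
                self.fields[cls] = data.strip()

parser = HReviewExtractor()
parser.feed(REVIEW_HTML)
print(parser.fields)  # {'fn': "Blast 'Em Up", 'rating': '4.5'}
```

Because the semantics live in the markup rather than in a proprietary algorithm, any consumer — not just Google — can extract the same structured fields, which is exactly the openness argument made above.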
Willis Wee

How Google Social Search Works [VIDEO] - 4 views

  •  
    Google Social Search went live just two days ago. Some may find it confusing how Google indexes its search results. From a layman's point of view, Google Social Search pulls relevant content created by the people in your online social circle and includes it in your search results.
Gonzalo San Gil, PhD.

Disney Patents a Piracy Free Search Engine | TorrentFreak - 0 views

  •  
    " Ernesto on October 31, 2014 C: 56 News Disney has just obtained a patent for a search engine that ranks sites based on various "authenticity" factors. One of the goals of the technology is to filter pirated material from search results while boosting the profile of copyright and trademark holders' websites." [# ! #Imagine... # ! ... this kind of '#artifacts' mandatory to the computer # ! manufacturers... [# Additional #WARNING: #Disney to #Decide the "#Authenticity" of web #contents... what's next?]]
Paul Merrell

Help:CirrusSearch - MediaWiki - 0 views

  • CirrusSearch is a new search engine for MediaWiki. The Wikimedia Foundation is migrating to CirrusSearch since it features key improvements over the previously used search engine, LuceneSearch. This page describes the features that are new or different compared to the past solutions.
  • Contents: 1 Frequently asked questions (1.1 What's improved?) · 2 Updates · 3 Search suggestions · 4 Full text search (4.1 Stemming · 4.2 Filters (intitle:, incategory: and linksto:) · 4.3 prefix: · 4.4 Special prefixes · 4.5 Did you mean · 4.6 Prefer phrase matches · 4.7 Fuzzy search · 4.8 Phrase search and proximity · 4.9 Quotes and exact matches · 4.10 prefer-recent: · 4.11 hastemplate: · 4.12 boost-templates: · 4.13 insource: · 4.14 Auxiliary Text · 4.15 Lead Text · 4.16 Commons Search) · 5 See also
  • Stemming In search terminology, support for "stemming" means that a search for "swim" will also include "swimming" and "swimmed", but not "swam". There is support for dozens of languages, but all languages are wanted. There is a list of currently supported languages at elasticsearch.org; see their documentation on contributing to submit requests or patches.
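The behaviour described — "swim" matching "swimming" and "swimmed" but not "swam" — can be reproduced with even a crude suffix-stripping stemmer. This is a toy sketch for illustration; CirrusSearch actually relies on Elasticsearch's per-language analyzers, not rules like these:

```python
def crude_stem(word):
    """Strip a common verbal suffix and undouble the final consonant.
    A deliberately crude stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            if len(word) >= 2 and word[-1] == word[-2]:
                word = word[:-1]  # swimm -> swim
            break
    return word

def matches(query, term):
    """Two words match if they reduce to the same stem."""
    return crude_stem(query) == crude_stem(term)

print(matches("swim", "swimming"))  # True
print(matches("swim", "swimmed"))   # True
print(matches("swim", "swam"))      # False
```

Note the limits of a rule-based approach: irregular forms like "swam" need a dictionary, not suffix rules, which is one reason production stemmers are language-specific.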
  • See also: Full specifications in the browser tests
  •  
    Lots of new tricks to learn on sites using MediaWiki as folks update their installations. I'm not a big fan of programs written in PHP and JavaScript, but they're impossible to avoid on the Web. So is MediaWiki, so any real improvements help.