Skip to main content

Home/ Future of the Web/ Group items tagged Internet-Archives

Rss Feed Group items tagged

4More

Archiveteam - 0 views

  • HISTORY IS OUR FUTURE And we've been trashing our history Archive Team is a loose collective of rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage. Since 2009 this variant force of nature has caught wind of shutdowns, shutoffs, mergers, and plain old deletions - and done our best to save the history before it's lost forever. Along the way, we've gotten attention, resistance, press and discussion, but most importantly, we've gotten the message out: IT DOESN'T HAVE TO BE THIS WAY. This website is intended to be an offloading point and information depot for a number of archiving projects, all related to saving websites or data that is in danger of being lost. Besides serving as a hub for team-based pulling down and mirroring of data, this site will provide advice on managing your own data and rescuing it from the brink of destruction. Currently Active Projects (Get Involved Here!) Archive Team recruiting Want to code for Archive Team? Here's a starting point.
  • Archive Team is a loose collective of rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage. Since 2009 this variant force of nature has caught wind of shutdowns, shutoffs, mergers, and plain old deletions - and done our best to save the history before it's lost forever. Along the way, we've gotten attention, resistance, press and discussion, but most importantly, we've gotten the message out: IT DOESN'T HAVE TO BE THIS WAY. This website is intended to be an offloading point and information depot for a number of archiving projects, all related to saving websites or data that is in danger of being lost. Besides serving as a hub for team-based pulling down and mirroring of data, this site will provide advice on managing your own data and rescuing it from the brink of destruction.
  • Who We Are and how you can join our cause! Deathwatch is where we keep track of sites that are sickly, dying or dead. Fire Drill is where we keep track of sites that seem fine but a lot depends on them. Projects is a comprehensive list of AT endeavors. Philosophy describes the ideas underpinning our work. Some Starting Points The Introduction is an overview of basic archiving methods. Why Back Up? Because they don't care about you. Back Up your Facebook Data Learn how to liberate your personal data from Facebook. Software will assist you in regaining control of your data by providing tools for information backup, archiving and distribution. Formats will familiarise you with the various data formats, and how to ensure your files will be readable in the future. Storage Media is about where to get it, what to get, and how to use it. Recommended Reading links to others sites for further information. Frequently Asked Questions is where we answer common questions.
  •  
    The Archive Team Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive - and it's really easy to do! The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the Tracker. Basic usage The warrior runs on Windows, OS X and Linux using a virtual machine. You'll need one of: VirtualBox (recommended) VMware workstation/player (free-gratis for personal use) See below for alternative virtual machines Partners with and contributes lots of archives to the Wayback Machine. Here's how you can help by contributing some bandwidth if you run an always-on box with an internet connection.
3More

Anti link-rot SaaS for web publishers -- WebCite - 0 views

  • The Problem Authors increasingly cite webpages and other digital objects on the Internet, which can "disappear" overnight. In one study published in the journal Science, 13% of Internet references in scholarly articles were inactive after only 27 months. Another problem is that cited webpages may change, so that readers see something different than what the citing author saw. The problem of unstable webcitations and the lack of routine digital preservation of cited digital objects has been referred to as an issue "calling for an immediate response" by publishers and authors [1]. An increasing number of editors and publishers ask that authors, when they cite a webpage, make a local copy of the cited webpage/webmaterial, and archive the cited URL in a system like WebCite®, to enable readers permanent access to the cited material.
  • What is WebCite®? WebCite®, a member of the International Internet Preservation Consortium, is an on-demand archiving system for webreferences (cited webpages and websites, or other kinds of Internet-accessible digital objects), which can be used by authors, editors, and publishers of scholarly papers and books, to ensure that cited webmaterial will remain available to readers in the future. If cited webreferences in journal articles, books etc. are not archived, future readers may encounter a "404 File Not Found" error when clicking on a cited URL. Try it! Archive a URL here. It's free and takes only 30 seconds. A WebCite®-enhanced reference is a reference which contains - in addition to the original live URL (which can and probably will disappear in the future, or its content may change) - a link to an archived copy of the material, exactly as the citing author saw it when he accessed the cited material.
  •  
    Free service spun off from the University of Toronto's University Health Network. Automagic archiving of cited internet content, generation of citations that include the url for the archived copy. Now if Google would just make it easier to use its search cache copies for the same purpose ...
3More

The Internet of Things Will Turn Large-Scale Hacks into Real World Disasters | Motherboard - 0 views

  • Disaster stories involving the Internet of Things are all the rage. They feature cars (both driven and driverless), the power grid, dams, and tunnel ventilation systems. A particularly vivid and realistic one, near-future fiction published last month in New York Magazine, described a cyberattack on New York that involved hacking of cars, the water system, hospitals, elevators, and the power grid. In these stories, thousands of people die. Chaos ensues. While some of these scenarios overhype the mass destruction, the individual risks are all real. And traditional computer and network security isn’t prepared to deal with them.Classic information security is a triad: confidentiality, integrity, and availability. You’ll see it called “CIA,” which admittedly is confusing in the context of national security. But basically, the three things I can do with your data are steal it (confidentiality), modify it (integrity), or prevent you from getting it (availability).
  • So far, internet threats have largely been about confidentiality. These can be expensive; one survey estimated that data breaches cost an average of $3.8 million each. They can be embarrassing, as in the theft of celebrity photos from Apple’s iCloud in 2014 or the Ashley Madison breach in 2015. They can be damaging, as when the government of North Korea stole tens of thousands of internal documents from Sony or when hackers stole data about 83 million customer accounts from JPMorgan Chase, both in 2014. They can even affect national security, as in the case of the Office of Personnel Management data breach by—presumptively—China in 2015. On the Internet of Things, integrity and availability threats are much worse than confidentiality threats. It’s one thing if your smart door lock can be eavesdropped upon to know who is home. It’s another thing entirely if it can be hacked to allow a burglar to open the door—or prevent you from opening your door. A hacker who can deny you control of your car, or take over control, is much more dangerous than one who can eavesdrop on your conversations or track your car’s location. With the advent of the Internet of Things and cyber-physical systems in general, we've given the internet hands and feet: the ability to directly affect the physical world. What used to be attacks against data and information have become attacks against flesh, steel, and concrete. Today’s threats include hackers crashing airplanes by hacking into computer networks, and remotely disabling cars, either when they’re turned off and parked or while they’re speeding down the highway. We’re worried about manipulated counts from electronic voting machines, frozen water pipes through hacked thermostats, and remote murder through hacked medical devices. The possibilities are pretty literally endless. The Internet of Things will allow for attacks we can’t even imagine.
  •  
    Bruce Scneier on the insecurity of the Internet of Things, and possible consequences.
2More

El Impacto de Internet en la Industria Discográfica [2005] [Tesis Doctoral] |... - 0 views

  •  
    "Posted by Gonzalo San Gil, PhD ⋅ 11/10/2014 ⋅ Leave a comment El Impacto de Internet en la Industria Discográfica [2005] [Tesis Doctoral] # Disappeared -after five years and close to 3000 (#Free) downloads from archive.org… " due to issues with the item's content." (?) (https://archive.org/details/ElImpactoDeInternetEnLaIndustriaDiscogrficaV2.1) … and with more than 10000 reads 'stolen' from Scribd due to "bot removal" (?) (https://www.scribd.com/doc/48406334/El-Impacto-de-Internet-en-la-Industria-Discografica-v2-1-2005) I try to share it here to see how it lasts… and how many Pe@ple can access to an original copylefted work untill the next 'issue'… "
  •  
    "Posted by Gonzalo San Gil, PhD ⋅ 11/10/2014 ⋅ Leave a comment El Impacto de Internet en la Industria Discográfica [2005] [Tesis Doctoral] # Disappeared -after five years and close to 3000 (#Free) downloads from archive.org… " due to issues with the item's content." (?) (https://archive.org/details/ElImpactoDeInternetEnLaIndustriaDiscogrficaV2.1) … and with more than 10000 reads 'stolen' from Scribd due to "bot removal" (?) (https://www.scribd.com/doc/48406334/El-Impacto-de-Internet-en-la-Industria-Discografica-v2-1-2005) I try to share it here to see how it lasts… and how many Pe@ple can access to an original copylefted work untill the next 'issue'… "
20More

From Radio to Porn, British Spies Track Web Users' Online Identities - 1 views

  • HERE WAS A SIMPLE AIM at the heart of the top-secret program: Record the website browsing habits of “every visible user on the Internet.” Before long, billions of digital records about ordinary people’s online activities were being stored every day. Among them were details cataloging visits to porn, social media and news websites, search engines, chat forums, and blogs. The mass surveillance operation — code-named KARMA POLICE — was launched by British spies about seven years ago without any public debate or scrutiny. It was just one part of a giant global Internet spying apparatus built by the United Kingdom’s electronic eavesdropping agency, Government Communications Headquarters, or GCHQ. The revelations about the scope of the British agency’s surveillance are contained in documents obtained by The Intercept from National Security Agency whistleblower Edward Snowden. Previous reports based on the leaked files have exposed how GCHQ taps into Internet cables to monitor communications on a vast scale, but many details about what happens to the data after it has been vacuumed up have remained unclear.
  • Amid a renewed push from the U.K. government for more surveillance powers, more than two dozen documents being disclosed today by The Intercept reveal for the first time several major strands of GCHQ’s existing electronic eavesdropping capabilities.
  • The surveillance is underpinned by an opaque legal regime that has authorized GCHQ to sift through huge archives of metadata about the private phone calls, emails and Internet browsing logs of Brits, Americans, and any other citizens — all without a court order or judicial warrant
  • ...17 more annotations...
  • A huge volume of the Internet data GCHQ collects flows directly into a massive repository named Black Hole, which is at the core of the agency’s online spying operations, storing raw logs of intercepted material before it has been subject to analysis. Black Hole contains data collected by GCHQ as part of bulk “unselected” surveillance, meaning it is not focused on particular “selected” targets and instead includes troves of data indiscriminately swept up about ordinary people’s online activities. Between August 2007 and March 2009, GCHQ documents say that Black Hole was used to store more than 1.1 trillion “events” — a term the agency uses to refer to metadata records — with about 10 billion new entries added every day. As of March 2009, the largest slice of data Black Hole held — 41 percent — was about people’s Internet browsing histories. The rest included a combination of email and instant messenger records, details about search engine queries, information about social media activity, logs related to hacking operations, and data on people’s use of tools to browse the Internet anonymously.
  • Throughout this period, as smartphone sales started to boom, the frequency of people’s Internet use was steadily increasing. In tandem, British spies were working frantically to bolster their spying capabilities, with plans afoot to expand the size of Black Hole and other repositories to handle an avalanche of new data. By 2010, according to the documents, GCHQ was logging 30 billion metadata records per day. By 2012, collection had increased to 50 billion per day, and work was underway to double capacity to 100 billion. The agency was developing “unprecedented” techniques to perform what it called “population-scale” data mining, monitoring all communications across entire countries in an effort to detect patterns or behaviors deemed suspicious. It was creating what it said would be, by 2013, “the world’s biggest” surveillance engine “to run cyber operations and to access better, more valued data for customers to make a real world difference.”
  • A document from the GCHQ target analysis center (GTAC) shows the Black Hole repository’s structure.
  • The data is searched by GCHQ analysts in a hunt for behavior online that could be connected to terrorism or other criminal activity. But it has also served a broader and more controversial purpose — helping the agency hack into European companies’ computer networks. In the lead up to its secret mission targeting Netherlands-based Gemalto, the largest SIM card manufacturer in the world, GCHQ used MUTANT BROTH in an effort to identify the company’s employees so it could hack into their computers. The system helped the agency analyze intercepted Facebook cookies it believed were associated with Gemalto staff located at offices in France and Poland. GCHQ later successfully infiltrated Gemalto’s internal networks, stealing encryption keys produced by the company that protect the privacy of cell phone communications.
  • Similarly, MUTANT BROTH proved integral to GCHQ’s hack of Belgian telecommunications provider Belgacom. The agency entered IP addresses associated with Belgacom into MUTANT BROTH to uncover information about the company’s employees. Cookies associated with the IPs revealed the Google, Yahoo, and LinkedIn accounts of three Belgacom engineers, whose computers were then targeted by the agency and infected with malware. The hacking operation resulted in GCHQ gaining deep access into the most sensitive parts of Belgacom’s internal systems, granting British spies the ability to intercept communications passing through the company’s networks.
  • In March, a U.K. parliamentary committee published the findings of an 18-month review of GCHQ’s operations and called for an overhaul of the laws that regulate the spying. The committee raised concerns about the agency gathering what it described as “bulk personal datasets” being held about “a wide range of people.” However, it censored the section of the report describing what these “datasets” contained, despite acknowledging that they “may be highly intrusive.” The Snowden documents shine light on some of the core GCHQ bulk data-gathering programs that the committee was likely referring to — pulling back the veil of secrecy that has shielded some of the agency’s most controversial surveillance operations from public scrutiny. KARMA POLICE and MUTANT BROTH are among the key bulk collection systems. But they do not operate in isolation — and the scope of GCHQ’s spying extends far beyond them.
  • The agency operates a bewildering array of other eavesdropping systems, each serving its own specific purpose and designated a unique code name, such as: SOCIAL ANTHROPOID, which is used to analyze metadata on emails, instant messenger chats, social media connections and conversations, plus “telephony” metadata about phone calls, cell phone locations, text and multimedia messages; MEMORY HOLE, which logs queries entered into search engines and associates each search with an IP address; MARBLED GECKO, which sifts through details about searches people have entered into Google Maps and Google Earth; and INFINITE MONKEYS, which analyzes data about the usage of online bulletin boards and forums. GCHQ has other programs that it uses to analyze the content of intercepted communications, such as the full written body of emails and the audio of phone calls. One of the most important content collection capabilities is TEMPORA, which mines vast amounts of emails, instant messages, voice calls and other communications and makes them accessible through a Google-style search tool named XKEYSCORE.
  • As of September 2012, TEMPORA was collecting “more than 40 billion pieces of content a day” and it was being used to spy on people across Europe, the Middle East, and North Africa, according to a top-secret memo outlining the scope of the program. The existence of TEMPORA was first revealed by The Guardian in June 2013. To analyze all of the communications it intercepts and to build a profile of the individuals it is monitoring, GCHQ uses a variety of different tools that can pull together all of the relevant information and make it accessible through a single interface. SAMUEL PEPYS is one such tool, built by the British spies to analyze both the content and metadata of emails, browsing sessions, and instant messages as they are being intercepted in real time. One screenshot of SAMUEL PEPYS in action shows the agency using it to monitor an individual in Sweden who visited a page about GCHQ on the U.S.-based anti-secrecy website Cryptome.
  • Partly due to the U.K.’s geographic location — situated between the United States and the western edge of continental Europe — a large amount of the world’s Internet traffic passes through its territory across international data cables. In 2010, GCHQ noted that what amounted to “25 percent of all Internet traffic” was transiting the U.K. through some 1,600 different cables. The agency said that it could “survey the majority of the 1,600” and “select the most valuable to switch into our processing systems.”
  • According to Joss Wright, a research fellow at the University of Oxford’s Internet Institute, tapping into the cables allows GCHQ to monitor a large portion of foreign communications. But the cables also transport masses of wholly domestic British emails and online chats, because when anyone in the U.K. sends an email or visits a website, their computer will routinely send and receive data from servers that are located overseas. “I could send a message from my computer here [in England] to my wife’s computer in the next room and on its way it could go through the U.S., France, and other countries,” Wright says. “That’s just the way the Internet is designed.” In other words, Wright adds, that means “a lot” of British data and communications transit across international cables daily, and are liable to be swept into GCHQ’s databases.
  • A map from a classified GCHQ presentation about intercepting communications from undersea cables. GCHQ is authorized to conduct dragnet surveillance of the international data cables through so-called external warrants that are signed off by a government minister. The external warrants permit the agency to monitor communications in foreign countries as well as British citizens’ international calls and emails — for example, a call from Islamabad to London. They prohibit GCHQ from reading or listening to the content of “internal” U.K. to U.K. emails and phone calls, which are supposed to be filtered out from GCHQ’s systems if they are inadvertently intercepted unless additional authorization is granted to scrutinize them. However, the same rules do not apply to metadata. A little-known loophole in the law allows GCHQ to use external warrants to collect and analyze bulk metadata about the emails, phone calls, and Internet browsing activities of British people, citizens of closely allied countries, and others, regardless of whether the data is derived from domestic U.K. to U.K. communications and browsing sessions or otherwise. In March, the existence of this loophole was quietly acknowledged by the U.K. parliamentary committee’s surveillance review, which stated in a section of its report that “special protection and additional safeguards” did not apply to metadata swept up using external warrants and that domestic British metadata could therefore be lawfully “returned as a result of searches” conducted by GCHQ.
  • Perhaps unsurprisingly, GCHQ appears to have readily exploited this obscure legal technicality. Secret policy guidance papers issued to the agency’s analysts instruct them that they can sift through huge troves of indiscriminately collected metadata records to spy on anyone regardless of their nationality. The guidance makes clear that there is no exemption or extra privacy protection for British people or citizens from countries that are members of the Five Eyes, a surveillance alliance that the U.K. is part of alongside the U.S., Canada, Australia, and New Zealand. “If you are searching a purely Events only database such as MUTANT BROTH, the issue of location does not occur,” states one internal GCHQ policy document, which is marked with a “last modified” date of July 2012. The document adds that analysts are free to search the databases for British metadata “without further authorization” by inputing a U.K. “selector,” meaning a unique identifier such as a person’s email or IP address, username, or phone number. Authorization is “not needed for individuals in the U.K.,” another GCHQ document explains, because metadata has been judged “less intrusive than communications content.” All the spies are required to do to mine the metadata troves is write a short “justification” or “reason” for each search they conduct and then click a button on their computer screen.
  • Intelligence GCHQ collects on British persons of interest is shared with domestic security agency MI5, which usually takes the lead on spying operations within the U.K. MI5 conducts its own extensive domestic surveillance as part of a program called DIGINT (digital intelligence).
  • GCHQ’s documents suggest that it typically retains metadata for periods of between 30 days to six months. It stores the content of communications for a shorter period of time, varying between three to 30 days. The retention periods can be extended if deemed necessary for “cyber defense.” One secret policy paper dated from January 2010 lists the wide range of information the agency classes as metadata — including location data that could be used to track your movements, your email, instant messenger, and social networking “buddy lists,” logs showing who you have communicated with by phone or email, the passwords you use to access “communications services” (such as an email account), and information about websites you have viewed.
  • Records showing the full website addresses you have visited — for instance, www.gchq.gov.uk/what_we_do — are treated as content. But the first part of an address you have visited — for instance, www.gchq.gov.uk — is treated as metadata. In isolation, a single metadata record of a phone call, email, or website visit may not reveal much about a person’s private life, according to Ethan Zuckerman, director of Massachusetts Institute of Technology’s Center for Civic Media. But if accumulated and analyzed over a period of weeks or months, these details would be “extremely personal,” he told The Intercept, because they could reveal a person’s movements, habits, religious beliefs, political views, relationships, and even sexual preferences. For Zuckerman, who has studied the social and political ramifications of surveillance, the most concerning aspect of large-scale government data collection is that it can be “corrosive towards democracy” — leading to a chilling effect on freedom of expression and communication. “Once we know there’s a reasonable chance that we are being watched in one fashion or another it’s hard for that not to have a ‘panopticon effect,’” he said, “where we think and behave differently based on the assumption that people may be watching and paying attention to what we are doing.”
  • When compared to surveillance rules in place in the U.S., GCHQ notes in one document that the U.K. has “a light oversight regime.” The more lax British spying regulations are reflected in secret internal rules that highlight greater restrictions on how NSA databases can be accessed. The NSA’s troves can be searched for data on British citizens, one document states, but they cannot be mined for information about Americans or other citizens from countries in the Five Eyes alliance. No such constraints are placed on GCHQ’s own databases, which can be sifted for records on the phone calls, emails, and Internet usage of Brits, Americans, and citizens from any other country. The scope of GCHQ’s surveillance powers explain in part why Snowden told The Guardian in June 2013 that U.K. surveillance is “worse than the U.S.” In an interview with Der Spiegel in July 2013, Snowden added that British Internet cables were “radioactive” and joked: “Even the Queen’s selfies to the pool boy get logged.”
  • In recent years, the biggest barrier to GCHQ’s mass collection of data does not appear to have come in the form of legal or policy restrictions. Rather, it is the increased use of encryption technology that protects the privacy of communications that has posed the biggest potential hindrance to the agency’s activities. “The spread of encryption … threatens our ability to do effective target discovery/development,” says a top-secret report co-authored by an official from the British agency and an NSA employee in 2011. “Pertinent metadata events will be locked within the encrypted channels and difficult, if not impossible, to prise out,” the report says, adding that the agencies were working on a plan that would “(hopefully) allow our Internet Exploitation strategy to prevail.”
1More

Takedown, Staydown Would Be a Disaster, Internet Archive Warns - TorrentFreak - 0 views

  •  
    " By Andy on June 7, 2016 C: 100 News The Internet Archive has issued its sternest warning yet over proposed changes to the DMCA. The Archive says that 'Notice and Staydown' would be an "absolute disaster" for the Internet that would trample due process, promote user monitoring, censorship, and have First Amendment implications."
1More

Internet Archive Seeks to Defend Against Wrongful Copyright Takedowns - TorrentFreak [#... - 0 views

  •  
    " By Andy on March 23, 2016 C: 25 Breaking As a non-profit library of millions of free books, movies, software and music, the Internet Archive has a keen interest in copyright law. In a submission to the U.S. Copyright Office the Archive says since the major studios often send invalid notices, they're suggesting a change in the law to allow content to remain up while disputes are settled."
1More

This year, the Internet Archive celebrated its 20th birthday | Opensource.com - 0 views

  •  
    "On May 12, 1996, like a benevolent mad scientist, Brewster Kahle brought the Internet Archive to life. The World Wide Web was in its infancy and the Archive was there to capture its growing pains. Inspired by and "
1More

Internet Archive Needs Your Help - 1 views

  •  
    [ This year the Internet Archive needs your help. In 2009-2010 we were able to employ hundreds of low income, out of work parents using a stimulus wage subsidy. Most of these parents worked to scan over 150,000 recent books for the blind and dyslexic. We appealed to legislators, but this program was defunded. ... ]
3More

This project aims to make '404 not found' pages a thing of the past - 0 views

  • The Internet is always changing. Sites are rising and falling, content is deleted, and bad URLs can lead to '404 Not Found' errors that are as helpful as a brick wall. A new project proposes an do away with dead 404 errors by implementing new HTML code that will help access prior versions of hyperlinked content. With any luck, that means that you’ll never have to run into a dead link again. The “404-No-More” project is backed by a formidable coalition including members from organizations like the Harvard Library Innovation Lab, Los Alamos National Laboratory, Old Dominion University, and the Berkman Center for Internet & Society. Part of the Knight News Challenge, which seeks to strengthen the Internet for free expression and innovation through a variety of initiatives, 404-No-More recently reached the semifinal stage. The project aims to cure so-called link rot, the process by which hyperlinks become useless overtime because they point to addresses that are no longer available. If implemented, websites such as Wikipedia and other reference documents would be vastly improved. The new feature would also give Web authors a way provide links that contain both archived copies of content and specific dates of reference, the sort of information that diligent readers have to hunt down on a website like Archive.org.
  • While it may sound trivial, link rot can actually have real ramifications. Nearly 50 percent of the hyperlinks in Supreme Court decisions no longer work, a 2013 study revealed. Losing footnotes and citations in landmark legal decisions can mean losing crucial information and context about the laws that govern us. The same study found that 70 percent of URLs within the Harvard Law Review and similar journals didn’t link to the originally cited information, considered a serious loss surrounding the discussion of our laws. The project’s proponents have come up with more potential uses as well. Activists fighting censorship will have an easier time combatting government takedowns, for instance. Journalists will be much more capable of researching dynamic Web pages. “If every hyperlink was annotated with a publication date, you could automatically view an archived version of the content as the author intended for you to see it,” the project’s authors explain. The ephemeral nature of the Web could no longer be used as a weapon. Roger Macdonald, a director at the Internet Archive, called the 404-No-More project “an important contribution to preservation of knowledge.”
  • The new feature would come in the form of introducing the mset attribute to the <a> element in HTML, which would allow users of the code to specify multiple dates and copies of content as an external resource. For instance, if both the date of reference and the location of a copy of targeted content is known by an author, the new code would like like this: The 404-No-More project’s goals are numerous, but the ultimate goal is to have mset become a new HTML standard for hyperlinks. “An HTML standard that incorporates archives for hyperlinks will loop in these efforts and make the Web better for everyone,” project leaders wrote, “activists, journalists, and regular ol’ everyday web users.”
8More

Internet Archive: Scanning Services - 1 views

  • Digitizing Print Collections with the Internet Archive Open and free online access, permanent storage, unlimited downloads and lifetime file management. We can help digitize your collections in 4 simple steps:
  • In addition to permanent hosting on archive.org, your books will be integrated with Open Library, openlibrary.org, a page on the web for every book.
  • Non-destructive color scanning using our Scribe system at one of our scanning centers across the globe. Complete MARC records, Dublin Core & XML, just 10c USD per image and a small set up charge per item.
  • ...4 more annotations...
  • Create and upload high-quality JP2000 images; persistent identifiers, lifetime hosting of files, lifetime management of file system and file access.
  • Create high quality PDF A files; run OCR across texts to allow "search inside" all books. Add to Internet Archive search engine; display via our open source Book Reader
  • 2,000,000 books online 600 million pages scanned 1,500 book scanned each day 15 million downloads each month 33 scanning centers in 7 countries 3.5 petabytes of storage 8 Gb per second bandwidth
  • Library of Congress Harvard University The New York Public Library Smithsonian Institution The Getty Research Institute University of California University of Toronto Biodiversity Heritage Library Boston Library Consortium C.A.R.L.I. Johns Hopkins University Allen County Public Library Lyrasis Massachusetts Institute of technology State Library of Massachusetts . . . and over 1,000 other Open Content Alliance partners
  •  
    I've been looking for a permanent online home for a couple of historical works I co-authored. My guidiing criterion has been the best chance of the works' long-term survival in a publicly-accessible form after my death. I think I may have just found my solution. 
1More

Internet Archive posts millions of historic images to Flickr | Ars Technica - 1 views

  •  
    "Earlier this year, communications technology scholar Kalev Leetaru began culling over 14 million images from the Internet Archive's public domain ebooks and uploading them to the Internet Archive's Flickr account. As of today, 2.6 million images are now easily searchable and downloadable."
6More

Profiled From Radio to Porn, British Spies Track Web Users' Online Identities | Global ... - 0 views

  • One system builds profiles showing people’s web browsing histories. Another analyzes instant messenger communications, emails, Skype calls, text messages, cell phone locations, and social media interactions. Separate programs were built to keep tabs on “suspicious” Google searches and usage of Google Maps. The surveillance is underpinned by an opaque legal regime that has authorized GCHQ to sift through huge archives of metadata about the private phone calls, emails and Internet browsing logs of Brits, Americans, and any other citizens  all without a court order or judicial warrant.
  • The power of KARMA POLICE was illustrated in 2009, when GCHQ launched a top-secret operation to collect intelligence about people using the Internet to listen to radio shows. The agency used a sample of nearly 7 million metadata records, gathered over a period of three months, to observe the listening habits of more than 200,000 people across 185 countries, including the U.S., the U.K., Ireland, Canada, Mexico, Spain, the Netherlands, France, and Germany.
  • GCHQ’s documents indicate that the plans for KARMA POLICE were drawn up between 2007 and 2008. The system was designed to provide the agency with “either (a) a web browsing profile for every visible user on the Internet, or (b) a user profile for every visible website on the Internet.” The origin of the surveillance system’s name is not discussed in the documents. But KARMA POLICE is also the name of a popular song released in 1997 by the Grammy Award-winning British band Radiohead, suggesting the spies may have been fans. A verse repeated throughout the hit song includes the lyric, “This is what you’ll get, when you mess with us.”
  • ...3 more annotations...
  • GCHQ vacuums up the website browsing histories using “probes” that tap into the international fiber-optic cables that transport Internet traffic across the world. A huge volume of the Internet data GCHQ collects flows directly into a massive repository named Black Hole, which is at the core of the agency’s online spying operations, storing raw logs of intercepted material before it has been subject to analysis. Black Hole contains data collected by GCHQ as part of bulk “unselected” surveillance, meaning it is not focused on particular “selected” targets and instead includes troves of data indiscriminately swept up about ordinary people’s online activities. Between August 2007 and March 2009, GCHQ documents say that Black Hole was used to store more than 1.1 trillion “events”  a term the agency uses to refer to metadata records  with about 10 billion new entries added every day. As of March 2009, the largest slice of data Black Hole held  41 percent  was about people’s Internet browsing histories. The rest included a combination of email and instant messenger records, details about search engine queries, information about social media activity, logs related to hacking operations, and data on people’s use of tools to browse the Internet anonymously.
  • Throughout this period, as smartphone sales started to boom, the frequency of people’s Internet use was steadily increasing. In tandem, British spies were working frantically to bolster their spying capabilities, with plans afoot to expand the size of Black Hole and other repositories to handle an avalanche of new data. By 2010, according to the documents, GCHQ was logging 30 billion metadata records per day. By 2012, collection had increased to 50 billion per day, and work was underway to double capacity to 100 billion. The agency was developing “unprecedented” techniques to perform what it called “population-scale” data mining, monitoring all communications across entire countries in an effort to detect patterns or behaviors deemed suspicious. It was creating what it saidwould be, by 2013, “the world’s biggest” surveillance engine “to run cyber operations and to access better, more valued data for customers to make a real world difference.” HERE WAS A SIMPLE AIM at the heart of the top-secret program: Record the website browsing habits of “every visible user on the Internet.” Before long, billions of digital records about ordinary people’s online activities were being stored every day. Among them were details cataloging visits to porn, social media and news websites, search engines, chat forums, and blogs.
  • The mass surveillance operation — code-named KARMA POLICE — was launched by British spies about seven years ago without any public debate or scrutiny. It was just one part of a giant global Internet spying apparatus built by the United Kingdom’s electronic eavesdropping agency, Government Communications Headquarters, or GCHQ. The revelations about the scope of the British agency’s surveillance are contained in documents obtained by The Intercept from National Security Agency whistleblower Edward Snowden. Previous reports based on the leaked files have exposed how GCHQ taps into Internet cables to monitor communications on a vast scale, but many details about what happens to the data after it has been vacuumed up have remained unclear.
2More

Liberad Internet! - 0 views

  •  
    "RESUMEN: Los intentos de control de Internet y los constantes ataques a su integridad han sido la tónica general por parte de poderes políticos, empresariales y mediáticos, desde su consolidación a principios de este Siglo XXI. La respuesta social ha sido tan contundente como reprimida. Los defensores de la Libertad de Acceso y de Expresión en Internet son, frecuentemente, acusados -de manera absolutamente infundada- de cómplices de piratería, promotores de abusos a menores y de difusión de pornografía, así como de colaborar con las redes mafiosas y el terrorismo global... Se olvidan (tal vez, deliberadamente) de las Campañas de Concienciación contra la Pena de Muerte o la Tortura, de la Solidaridad ante Catástrofes o, simplemente, de tod@s l@s Usuari@s que comparten, desinteresadamente, creación artística, literaria, técnica o científica en Internet...
  •  
    Conferencia de Apertura del Area de Gobierno de Internet en Mundo Internet 2.0: XI Congreso Nacional de Internet, Telecomunicaciones y Sociedad de la Informacion. Málaga 2007.
6More

Canada Casts Global Surveillance Dragnet Over File Downloads - The Intercept - 0 views

  • Canada’s leading surveillance agency is monitoring millions of Internet users’ file downloads in a dragnet search to identify extremists, according to top-secret documents. The covert operation, revealed Wednesday by CBC News in collaboration with The Intercept, taps into Internet cables and analyzes records of up to 15 million downloads daily from popular websites commonly used to share videos, photographs, music, and other files. The revelations about the spying initiative, codenamed LEVITATION, are the first from the trove of files provided by National Security Agency whistleblower Edward Snowden to show that the Canadian government has launched its own globe-spanning Internet mass surveillance system. According to the documents, the LEVITATION program can monitor downloads in several countries across Europe, the Middle East, North Africa, and North America. It is led by the Communications Security Establishment, or CSE, Canada’s equivalent of the NSA. (The Canadian agency was formerly known as “CSEC” until a recent name change.)
  • The latest disclosure sheds light on Canada’s broad existing surveillance capabilities at a time when the country’s government is pushing for a further expansion of security powers following attacks in Ottawa and Quebec last year. Ron Deibert, director of University of Toronto-based Internet security think tank Citizen Lab, said LEVITATION illustrates the “giant X-ray machine over all our digital lives.” “Every single thing that you do – in this case uploading/downloading files to these sites – that act is being archived, collected and analyzed,” Deibert said, after reviewing documents about the online spying operation for CBC News. David Christopher, a spokesman for Vancouver-based open Internet advocacy group OpenMedia.ca, said the surveillance showed “robust action” was needed to rein in the Canadian agency’s operations.
  • In a top-secret PowerPoint presentation, dated from mid-2012, an analyst from the agency jokes about how, while hunting for extremists, the LEVITATION system gets clogged with information on innocuous downloads of the musical TV series Glee. CSE finds some 350 “interesting” downloads each month, the presentation notes, a number that amounts to less than 0.0001 per cent of the total collected data. The agency stores details about downloads and uploads to and from 102 different popular file-sharing websites, according to the 2012 document, which describes the collected records as “free file upload,” or FFU, “events.” Only three of the websites are named: RapidShare, SendSpace, and the now defunct MegaUpload.
  • ...3 more annotations...
  • “The specific uses that they talk about in this [counter-terrorism] context may not be the problem, but it’s what else they can do,” said Tamir Israel, a lawyer with the University of Ottawa’s Canadian Internet Policy and Public Interest Clinic. Picking which downloads to monitor is essentially “completely at the discretion of CSE,” Israel added. The file-sharing surveillance also raises questions about the number of Canadians whose downloading habits could have been swept up as part of LEVITATION’s dragnet. By law, CSE isn’t allowed to target Canadians. In the LEVITATION presentation, however, two Canadian IP addresses that trace back to a web server in Montreal appear on a list of suspicious downloads found across the world. The same list includes downloads that CSE monitored in closely allied countries, including the United Kingdom, United States, Spain, Brazil, Germany and Portugal. It is unclear from the document whether LEVITATION has ever prevented any terrorist attacks. The agency cites only two successes of the program in the 2012 presentation: the discovery of a hostage video through a previously unknown target, and an uploaded document that contained the hostage strategy of a terrorist organization. The hostage in the discovered video was ultimately killed, according to public reports.
  • LEVITATION does not rely on cooperation from any of the file-sharing companies. A separate secret CSE operation codenamed ATOMIC BANJO obtains the data directly from internet cables that it has tapped into, and the agency then sifts out the unique IP address of each computer that downloaded files from the targeted websites. The IP addresses are valuable pieces of information to CSE’s analysts, helping to identify people whose downloads have been flagged as suspicious. The analysts use the IP addresses as a kind of search term, entering them into other surveillance databases that they have access to, such as the vast repositories of intercepted Internet data shared with the Canadian agency by the NSA and its British counterpart Government Communications Headquarters. If successful, the searches will return a list of results showing other websites visited by the people downloading the files – in some cases revealing associations with Facebook or Google accounts. In turn, these accounts may reveal the names and the locations of individual downloaders, opening the door for further surveillance of their activities.
  • Canada’s leading surveillance agency is monitoring millions of Internet users’ file downloads in a dragnet search to identify extremists, according to top-secret documents. The covert operation, revealed Wednesday by CBC News in collaboration with The Intercept, taps into Internet cables and analyzes records of up to 15 million downloads daily from popular websites commonly used to share videos, photographs, music, and other files. The revelations about the spying initiative, codenamed LEVITATION, are the first from the trove of files provided by National Security Agency whistleblower Edward Snowden to show that the Canadian government has launched its own globe-spanning Internet mass surveillance system. According to the documents, the LEVITATION program can monitor downloads in several countries across Europe, the Middle East, North Africa, and North America. It is led by the Communications Security Establishment, or CSE, Canada’s equivalent of the NSA. (The Canadian agency was formerly known as “CSEC” until a recent name change.)
1More

Internet Archive's S.F. office damaged in fire - SFGate November 6, 2013 - 0 views

  •  
    "The office of the nonprofit Internet Archive in San Francisco's Richmond District caught fire early Wednesday, destroying equipment and damaging several apartments next door. The blaze began about 3:45 a.m. in the single-story Clement Street office, between 12th and Funston avenues, and spread to an adjacent three-story complex, firefighters said. About eight residents of the next-door building were evacuated, but no injuries were reported. Firefighters had the blaze under control by 5 a.m."
2More

WG Review: Internet Wideband Audio Codec (codec) - 0 views

  • According to reports from developers of Internet audio applications and operators of Internet audio services, there are no standardized, high-quality audio codecs that meet all of the following three conditions: 1. Are optimized for use in interactive Internet applications. 2. Are published by a recognized standards development organization (SDO) and therefore subject to clear change control. 3. Can be widely implemented and easily distributed among application developers, service operators, and end users. There exist codecs that provide high quality encoding of audio information, but that are not optimized for the actual conditions of the Internet; according to reports, this mismatch between design and deployment has hindered adoption of such codecs in interactive Internet applications.
  • The goal of this working group is to develop a single high-quality audio codec that is optimized for use over the Internet and that can be widely implemented and easily distributed among application developers, service operators, and end users. Core technical considerations include, but are not necessarily limited to, the following: 1. Designing for use in interactive applications (examples include, but are not limited to, point-to-point voice calls, multi-party voice conferencing, telepresence, teleoperation, in-game voice chat, and live music performance) 2. Addressing the real transport conditions of the Internet as identified and prioritized by the working group 3. Ensuring interoperability with the Real-time Transport Protocol (RTP), including secure transport via SRTP 4. Ensuring interoperability with Internet signaling technologies such as Session Initiation Protocol (SIP), Session Description Protocol (SDP), and Extensible Messaging and Presence Protocol (XMPP); however, the result should not depend on the details of any particular signaling technology
1More

Make a Donation to the Internet Archive - 0 views

  •  
    [3-for-1 Match for All Donations! A generous supporter has offered to match every dollar we raise 3-to-1 through December 31st. We are trying to raise $150,000 in donations by the end of the year - with the match, that will give us $600,000, enough to buy 4 more petabytes of storage. Help us keep the library free for millions of people by making a tax-deductible donation today.]
2More

The Internet May Be Underwater in 15 Years - 1 views

  • When the internet goes down, life as the modern American knows it grinds to a halt. Gone are the cute kitten photos and the Facebook status updates—but also gone are the signals telling stoplights to change from green to red, and doctors’ access to online patient records. A vast web of physical infrastructure undergirds the internet connections that touch nearly every aspect of modern life. Delicate fiber optic cables, massive data transfer stations, and power stations create a patchwork of literal nuts and bolts that facilitates the flow of zeros and ones. Now, research shows that a whole lot of that infrastructure sits squarely in the path of rising seas. (See what the planet would look like if all the ice melted.) Scientists mapped out the threads and knots of internet infrastructure in the U.S. and layered that on top of maps showing future sea level rise. What they found was ominous: Within 15 years, thousands of miles of fiber optic cable—and hundreds of pieces of other key infrastructure—are likely to be swamped by the encroaching ocean. And while some of that infrastructure may be water resistant, little of it was designed to live fully underwater. “So much of the infrastructure that's been deployed is right next to the coast, so it doesn't take much more than a few inches or a foot of sea level rise for it to be underwater,” says study coauthor Paul Barford, a computer scientist at the University of Wisconsin, Madison. “It was all was deployed 20ish years ago, when no one was thinking about the fact that sea levels might come up.”
  • “This will be a big problem,” says Rae Zimmerman, an expert on urban adaptation to climate change at NYU. Large parts of internet infrastructure soon “will be underwater, unless they're moved back pretty quickly.”
3More

Microsoft breaks IE8 interoperability promise | The Register - 0 views

  • In March, Microsoft announced that their upcoming Internet Explorer 8 would: "use its most standards compliant mode, IE8 Standards, as the default." Note the last word: default. Microsoft argued that, in light of their newly published interoperability principles, it was the right thing to do. This declaration heralded an about-face and was widely praised by the web standards community; people were stunned and delighted by Microsoft's promise. This week, the promise was broken. It lasted less than six months. Now that Internet Explorer IE8 beta 2 is released, we know that many, if not most, pages viewed in IE8 will not be shown in standards mode by default.
  • How many pages are affected by this change? Here's the back of my envelope: The PC market can be split into two segments — the enterprise market and the home market. The enterprise market accounts for around 60 per cent of all PCs sold, while the home market accounts for the remaining 40 per cent. Within enterprises, intranets are used for all sorts of things and account for, perhaps, 80 per cent of all page views. Thus, intranets account for about half of all page views on PCs!
  •  
    Article by Hakon Lie of Opera Software. Also note that acdcording to the European Commission, "As for the tying of separate software products, in its Microsoft judgment of 17 September 2007, the Court of First Instance confirmed the principles that must be respected by dominant companies. In a complaint by Opera, a competing browser vendor, Microsoft is alleged to have engaged in illegal tying of its Internet Explorer product to its dominant Windows operating system. The complaint alleges that there is ongoing competitive harm from Microsoft's practices, in particular in view of new proprietary technologies that Microsoft has allegedly introduced in its browser that would reduce compatibility with open internet standards, and therefore hinder competition. In addition, allegations of tying of other separate software products by Microsoft, including desktop search and Windows Live have been brought to the Commission's attention. The Commission's investigation will therefore focus on allegations that a range of products have been unlawfully tied to sales of Microsoft's dominant operating system." http://europa.eu/rapid/pressReleasesAction.do?reference=MEMO/08/19&format=HTML&aged=0&language=EN&guiLanguage=en
1 - 20 of 58 Next › Last »
Showing 20 items per page