Skip to main content

Home/ Future of the Web/ Group items tagged link rot

Rss Feed Group items tagged

Paul Merrell

This project aims to make '404 not found' pages a thing of the past - 0 views

  • The Internet is always changing. Sites are rising and falling, content is deleted, and bad URLs can lead to '404 Not Found' errors that are as helpful as a brick wall. A new project proposes an do away with dead 404 errors by implementing new HTML code that will help access prior versions of hyperlinked content. With any luck, that means that you’ll never have to run into a dead link again. The “404-No-More” project is backed by a formidable coalition including members from organizations like the Harvard Library Innovation Lab, Los Alamos National Laboratory, Old Dominion University, and the Berkman Center for Internet & Society. Part of the Knight News Challenge, which seeks to strengthen the Internet for free expression and innovation through a variety of initiatives, 404-No-More recently reached the semifinal stage. The project aims to cure so-called link rot, the process by which hyperlinks become useless overtime because they point to addresses that are no longer available. If implemented, websites such as Wikipedia and other reference documents would be vastly improved. The new feature would also give Web authors a way provide links that contain both archived copies of content and specific dates of reference, the sort of information that diligent readers have to hunt down on a website like Archive.org.
  • While it may sound trivial, link rot can actually have real ramifications. Nearly 50 percent of the hyperlinks in Supreme Court decisions no longer work, a 2013 study revealed. Losing footnotes and citations in landmark legal decisions can mean losing crucial information and context about the laws that govern us. The same study found that 70 percent of URLs within the Harvard Law Review and similar journals didn’t link to the originally cited information, considered a serious loss surrounding the discussion of our laws. The project’s proponents have come up with more potential uses as well. Activists fighting censorship will have an easier time combatting government takedowns, for instance. Journalists will be much more capable of researching dynamic Web pages. “If every hyperlink was annotated with a publication date, you could automatically view an archived version of the content as the author intended for you to see it,” the project’s authors explain. The ephemeral nature of the Web could no longer be used as a weapon. Roger Macdonald, a director at the Internet Archive, called the 404-No-More project “an important contribution to preservation of knowledge.”
  • The new feature would come in the form of introducing the mset attribute to the <a> element in HTML, which would allow users of the code to specify multiple dates and copies of content as an external resource. For instance, if both the date of reference and the location of a copy of targeted content is known by an author, the new code would like like this: The 404-No-More project’s goals are numerous, but the ultimate goal is to have mset become a new HTML standard for hyperlinks. “An HTML standard that incorporates archives for hyperlinks will loop in these efforts and make the Web better for everyone,” project leaders wrote, “activists, journalists, and regular ol’ everyday web users.”
Paul Merrell

Anti link-rot SaaS for web publishers -- WebCite - 0 views

  • The Problem Authors increasingly cite webpages and other digital objects on the Internet, which can "disappear" overnight. In one study published in the journal Science, 13% of Internet references in scholarly articles were inactive after only 27 months. Another problem is that cited webpages may change, so that readers see something different than what the citing author saw. The problem of unstable webcitations and the lack of routine digital preservation of cited digital objects has been referred to as an issue "calling for an immediate response" by publishers and authors [1]. An increasing number of editors and publishers ask that authors, when they cite a webpage, make a local copy of the cited webpage/webmaterial, and archive the cited URL in a system like WebCite®, to enable readers permanent access to the cited material.
  • What is WebCite®? WebCite®, a member of the International Internet Preservation Consortium, is an on-demand archiving system for webreferences (cited webpages and websites, or other kinds of Internet-accessible digital objects), which can be used by authors, editors, and publishers of scholarly papers and books, to ensure that cited webmaterial will remain available to readers in the future. If cited webreferences in journal articles, books etc. are not archived, future readers may encounter a "404 File Not Found" error when clicking on a cited URL. Try it! Archive a URL here. It's free and takes only 30 seconds. A WebCite®-enhanced reference is a reference which contains - in addition to the original live URL (which can and probably will disappear in the future, or its content may change) - a link to an archived copy of the material, exactly as the citing author saw it when he accessed the cited material.
  •  
    Free service spun off from the University of Toronto's University Health Network. Automagic archiving of cited internet content, generation of citations that include the url for the archived copy. Now if Google would just make it easier to use its search cache copies for the same purpose ...
Paul Merrell

BitTorrent Sync creates private, peer-to-peer Dropbox, no cloud required | Ars Technica - 6 views

  • BitTorrent today released folder syncing software that replicates files across multiple computers using the same peer-to-peer file sharing technology that powers BitTorrent clients. The free BitTorrent Sync application is labeled as being in the alpha stage, so it's not necessarily ready for prime-time, but it is publicly available for download and working as advertised on my home network. BitTorrent, Inc. (yes, there is a legitimate company behind BitTorrent) took to its blog to announce the move from a pre-alpha, private program to the publicly available alpha. Additions since the private alpha include one-way synchronization, one-time secrets for sharing files with a friend or colleague, and the ability to exclude specific files and directories.
  • BitTorrent Sync provides "unlimited, secure file-syncing," the company said. "You can use it for remote backup. Or, you can use it to transfer large folders of personal media between users and machines; editors and collaborators. It’s simple. It’s free. It’s the awesome power of P2P, applied to file-syncing." File transfers are encrypted, with private information never being stored on an external server or in the "cloud." "Since Sync is based on P2P and doesn’t require a pit-stop in the cloud, you can transfer files at the maximum speed supported by your network," BitTorrent said. "BitTorrent Sync is specifically designed to handle large files, so you can sync original, high quality, uncompressed files."
  •  
    Direct P2P encrypted file syncing, no cloud intermediate, which should translate to far more secure exchange of files, with less opportunity for snooping by governments or others, than with cloud-based services. 
  • ...5 more comments...
  •  
    Hey Paul, is there an open source document management system that I could hook the BitTorrent Sync to?
  •  
    More detail please. What do you want to do with the doc management system? Platform? Server-side or stand-alone? Industrial strength and highly configurable or lightweight and simple? What do you mean by "hook?" Not that I would be able to answer anyway. I really know very little about BitTorrent Sync. In fact, as far as I'd gone before your question was to look at the FAQ. It's linked from . But there's a link to a forum on the same page. Giving the first page a quick scan confirms that this really is alpha-state software. But that would probably be a better place to ask. (Just give them more specific information of what you'd like to do.) There are other projects out there working on getting around the surveillance problem. I2P is one that is a farther along than BitTorrent Sync and quite a bit more flexible. See . (But I haven't used it, so caveat emptor.)
  •  
    There is a great list of PRISM Proof software at http://prism-break.org/. Includes a link to I2P. I want to replace gmail though, but would like another Web based system since I need multi device access. Of course, I need to replace my Google Apps / Google Docs system. That's why I asked about a PRISM Proof sync-share-store DMS. My guess is that there are many users similarly seeking a PRISM Proof platform of communications, content and collaborative computing systems. BusinessIndiser.com is crushed with articles about Google struggling to squirm out from under the NSA PRISM boot-on-the-back-of-their-neck situation. As if blaming the NSA makes up for the dragnet that they consented/allowed/conceded to cover their entire platform. Perhaps we should be watching Germany? There must be tons of startup operations underway, all seeking to replace Google, Amazon, FaceBook, Microsoft, Skype and so many others. It's a great day for Libertyware :)
  •  
    Is the NSA involvement the "Kiss of Death"? Google seems to think so. I'm wondering what the impact would be if ZOHO were to announce a PRISM Proof productivity platform?
  •  
    It is indeed. The E.U. has far more protective digital privacy rights than we do (none). If you're looking for a Dropbox replacement (you should be), for a cloud-based solution take a look at . Unlike Dropbox, all of the encryption/decryption happens on your local machine; Wuala never sees your files unencrypted. Dropbox folks have admitted that there's no technical barrier to them looking at your files. Their encrypt/decrypt operations are done in the cloud (if they actually bother) and they have the key. Which makes it more chilling that the PRISM docs Snowden link make reference to Dropbox being the next cloud service NSA plans to add to their collection. Wuala also is located (as are its servers) in Switzerland, which also has far stronger digital data privacy laws than the U.S. Plus the Swiss are well along the path to E.U. membership; they've ratified many of the E.U. treaties including the treaty on Human Rights, which as I recall is where the digital privacy sections are. I've begun to migrate from Dropbox to Wuala. It seems to be neck and neck with Dropbox on features and supported platforms, with the advantage of a far more secure approach and 5 GB free. But I'd also love to see more approaches akin to IP2 and Bittorrent Sync that provide the means to bypass the cloud. Don't depend on government to ensure digital privacy, route around the government voyeurs. Hmmm ... I wonder if the NSA has the computer capacity to handle millions of people switching to encrypted communication? :-) Thanks for the link to the software list.
  •  
    Re: Google. I don't know if it's the 'kiss of death" but they're definitely going to take a hit, particularly outside the U.S. BTW, I'm remembering from a few years back when the ODF Foundation was still kicking. I did a fair bit of research on the bureaucratic forces in the E.U. that were pushing for the Open Document Exchange Formats. That grew out of a then-ongoing push to get all of the E.U. nations connected via a network that is not dependent on the Internet. It was fairly complete at the time down to the national level and was branching out to the local level and the plan from there was to push connections to business and then to Joe Sixpack and wife. Interop was key, hence ODEF. The E.U. might not be that far away from an ability to sever the digital connections with the U.S. Say a bunch of daisy-chained proxy anonymizers for communications with the U.S. Of course they'd have to block the UK from the network and treat it like it is the U.S. There's a formal signals intelligence service collaboration/integration dating back to WW 2, as I recall, among the U.S., the U.K., Canada, Australia, and New Zealand. Don't remember its name. But it's the same group of nations that were collaborating on Echelon. So the E.U. wouldn't want to let the UK fox inside their new chicken coop. Ah, it's just a fantasy. The U.S. and the E.U. are too interdependent. I have no idea hard it would be for the Zoho folk to come up with desktop/side encryption/decryption. And I don't know whether their servers are located outside the reach of a U.S. court's search warrant. But I think Google is going to have to move in that direction fast if it wants to minimize the damage. Or get way out in front of the hounds chomping at the NSA's ankles and reduce the NSA to compost. OTOH, Google might be a government covert op. for all I know. :-) I'm really enjoying watching the NSA show. Who knows what facet of their Big Brother operation gets revealed next?
  •  
    ZOHO is an Indian company with USA marketing offices. No idea where the server farm is located, but they were not on the NSA list. I've known Raju Vegesna for years, mostly from the old Web 2.0 and Office 2.0 Conferences. Raju runs the USA offices in Santa Clara. I'll try to catch up with him on Thursday. How he could miss this once in a lifetime moment to clean out Google, Microsoft and SalesForce.com is something I'd like to find out about. Thanks for the Wuala tip. You sent me that years ago, when i was working on research and design for the SurDocs project. Incredible that all our notes, research, designs and correspondence was left to rot in Google Wave! Too too funny. I recall telling Alex from SurDocs that he had to use a USA host, like Amazon, that could be trusted by USA customers to keep their docs safe and secure. Now look what i've done! I've tossed his entire company information set into the laps of the NSA and their cabal of connected corporatists :)
1 - 3 of 3
Showing 20 items per page