Ontology is Overrated -- Categories, Links, and Tags
www.shirky.com/...ontology_overrated.html
research mapping NML theasu tagging socialbookmarking bookmarking web2.0 ENG654 eng102 clayshirky shirky amnoi

-
the link, which can point to anything, and the tag, which is a way of attaching labels to links
-
The question ontology asks is: What kinds of things exist or can exist in the world, and what manner of relations can those things have to each other?
-
Dewey, 200: Religion
210 Natural theology
220 Bible
230 Christian theology
240 Christian moral & devotional theology
250 Christian orders & local church
260 Christian social theology
270 Christian church history
280 Christian sects & denominations
290 Other religions
-
It is organized into non-overlapping categories that get more detailed at lower and lower levels -- any concept is supposed to fit in one category and in no other categories
-
The essence of a book isn't the ideas it contains. The essence of a book is "book." Thinking that library catalogs exist to organize concepts confuses the container for the thing contained.
-
Look what's happened here. Yahoo, faced with the possibility that they could organize things with no physical constraints, added the shelf back
-
The charitable explanation for this is that they thought of this kind of a priori organization as their job, and as something their users would value. The uncharitable explanation is that they thought there was business value in determining the view the user would have to adopt to use the system
-
if you've got enough links, you don't need the hierarchy anymore. There is no shelf. There is no file system. The links alone are enough. [ Just Links (There Is No Filesystem) ]
-
Browse versus search is a radical increase in the trust we put in link infrastructure, and in the degree of power derived from that link structure. Browse says the people making the ontology, the people doing the categorization, have the responsibility to organize the world in advance.
-
Search says that, at the moment that you are looking for it, we will do our best to service it based on this link structure, because we believe we can build a world where we don't need the hierarchy to coexist with the link structure.
-
You can also turn that list around. You can say "Here are some characteristics where ontological classification doesn't work well":
Domain: Large corpus; No formal categories; Unstable entities; Unrestricted entities; No clear edges
Participants: Uncoordinated users; Amateur users; Naive catalogers; No Authority
-
where the people doing the categorizing believe, even if only unconsciously, that naming the world changes it.
-
"Oh my god, that means you won't be introducing the movies people to the cinema people!" To which the obvious answer is "Good. The movie people don't want to hang out with the cinema people."
-
The problem is, because the cataloguers assume their classification should have force on the world, they underestimate the difficulty of understanding what users are thinking, and they overestimate the amount to which users will agree, either with one another or with the catalogers, about the best way to categorize. They also underestimate the loss from erasing difference of expression, and they overestimate loss from the lack of a thesaurus.
-
We pretend that 'country' refers to a physical area the same way 'city' does, but it's not true, as we know from places like the former Yugoslavia.
-
A: "This is a book about Dresden." B: "This is a book about Dresden, and it goes in the category 'East Germany'."
-
They're able to take my books in while ignoring my categories, because all my books have ISBN numbers, International Standard Book Numbers.
-
Now imagine a world where everything can have a unique identifier. This should be easy, since that's the world we currently live in -- the URL gives us a way to create a globally unique ID for anything we need to point to. Sometimes the pointers are direct, as when a URL points to the contents of a Web page. Sometimes they are indirect, as when you use an Amazon link to point to a book. Sometimes there are layers of indirection, as when you use a URI, a uniform resource identifier, to name something whose location is indeterminate. But the basic scheme gives us ways to create a globally unique identifier for anything. And once you can do that, anyone can label those pointers, can tag those URLs, in ways that make them more valuable, and all without requiring top-down organization schemes. And this -- an explosion in free-form labeling of links, followed by all sorts of ways of grabbing value from those labels -- is what I think is happening now.
-
Tags are simply labels for URLs, selected to help the user in later retrieval of those URLs. Tags have the additional effect of grouping related URLs together. There is no fixed set of categories or officially approved choices. You can use words, acronyms, numbers, whatever makes sense to you, without regard for anyone else's needs, interests, or requirements.
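
A minimal sketch of the idea in this highlight, assuming a hypothetical in-memory store: tags are just free-form labels attached to URLs, and grouping falls out as a side effect. The `tag` function, the `tags_to_urls` name, and the example.com URLs are illustrative assumptions, not anything from the essay or from del.icio.us.

```python
from collections import defaultdict

# Hypothetical in-memory tag store; names and URLs are illustrative only.
tags_to_urls = defaultdict(set)

def tag(url, *labels):
    """Attach free-form labels to a URL; no approved vocabulary is checked."""
    for label in labels:
        tags_to_urls[label].add(url)

tag("http://example.com/a", "ontology", "to_read")
tag("http://example.com/b", "ontology", "web2.0")

# URLs that share a tag are grouped simply because they were labeled alike.
ontology_urls = sorted(tags_to_urls["ontology"])
print(ontology_urls)
```

Nothing here validates a label against a category scheme, which is the point of the passage: the cost of adding a tag is a single set insertion.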
-
Tags are important mainly for what they leave out. By forgoing formal classification, tags enable a huge amount of user-produced organizational value, at vanishingly small cost.
-
And if you can find any way to create value from combining myriad amateur classifications over time, they will come to be more valuable than professional categorization schemes, particularly with regard to robustness and cost of creation.
-
because you can derive 'this is who this link was tagged by' and 'this is when it was tagged', you can start to do inclusion and exclusion around people and time, not just tags. You can start to do grouping. You can start to do decay. "Roll up tags from just this group of users, I'd like to see what they are talking about" or "Give me all tags with this signature, but anything that's more than a week old or a year old."
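
The inclusion and exclusion described here could be sketched roughly as follows, assuming each tagging event records who tagged what and when. The `Tagging` record and the `roll_up` function are hypothetical names chosen for illustration, and the time filter is a hard cutoff rather than a true decay weighting.

```python
from datetime import datetime, timedelta
from typing import NamedTuple

class Tagging(NamedTuple):
    """One tagging event (hypothetical record layout)."""
    user: str
    url: str
    tag: str
    when: datetime

def roll_up(taggings, users=None, newer_than=None):
    """Count tags, optionally restricted to a group of users and a time window."""
    counts = {}
    for t in taggings:
        if users is not None and t.user not in users:
            continue  # exclude people outside the chosen group
        if newer_than is not None and t.when < newer_than:
            continue  # exclude taggings older than the cutoff
        counts[t.tag] = counts.get(t.tag, 0) + 1
    return counts

now = datetime(2005, 5, 1)  # fixed reference time so the example is repeatable
log = [
    Tagging("alice", "http://example.com/a", "ontology", now - timedelta(days=2)),
    Tagging("bob", "http://example.com/a", "tagging", now - timedelta(days=400)),
    Tagging("carol", "http://example.com/b", "ontology", now - timedelta(days=3)),
]

# "Roll up tags from just this group of users."
group_counts = roll_up(log, users={"alice", "bob"})
# Keep only taggings from the last week.
recent_counts = roll_up(log, newer_than=now - timedelta(days=7))
```

A real decay would weight older taggings less instead of dropping them; the cutoff is used here only because it is the simplest form of the same idea.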
-
With tagging, when there is signal loss, it comes from people not having any commonality in talking about things. The loss is from the multiplicity of points of view, rather than from compression around a single point of view.
-
Tagging, by contrast, gets better with scale. With a multiplicity of points of view the question isn't "Is everyone tagging any given link 'correctly'", but rather "Is anyone tagging it the way I do?"
-
You merge from the URLs, and then try and derive something about the categorization from there. This allows for partial, incomplete, or probabilistic merges that are better fits to uncertain environments -- such as the real world -- than rigid classification schemes.
-
Merges are Probabilistic, not Binary - Merges create partial overlap between tags, rather than defining tags as synonyms. Instead of saying that any given tag "is" or "is not" the same as another tag, del.icio.us is able to recommend related tags by saying "A lot of people who tagged this 'Mac' also tagged it 'OSX'." We move from a binary choice between saying two tags are the same or different to the Venn diagram option of "kind of is/somewhat is/sort of is/overlaps to this degree". That is a really profound change.
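
One way to make "overlaps to this degree" concrete is to measure how often two tags land on the same bookmark. The sketch below is an assumption for illustration, not del.icio.us's actual method: `tag_overlap` and the sample data are hypothetical, and a bookmark is taken to be one user's tags on one URL.

```python
from collections import defaultdict

def tag_overlap(taggings, a, b):
    """Fraction of (user, url) bookmarks tagged `a` that were also tagged `b`."""
    tags_by_bookmark = defaultdict(set)
    for user, url, tag in taggings:
        tags_by_bookmark[(user, url)].add(tag)
    with_a = [tags for tags in tags_by_bookmark.values() if a in tags]
    if not with_a:
        return 0.0  # tag `a` was never used, so there is no overlap to report
    return sum(1 for tags in with_a if b in tags) / len(with_a)

# Illustrative sample data only.
sample = [
    ("alice", "http://example.com/x", "Mac"),
    ("alice", "http://example.com/x", "OSX"),
    ("bob", "http://example.com/x", "Mac"),
    ("carol", "http://example.com/y", "Mac"),
    ("carol", "http://example.com/y", "OSX"),
    ("dave", "http://example.com/z", "OSX"),
    ("erin", "http://example.com/z", "OSX"),
]

mac_to_osx = tag_overlap(sample, "Mac", "OSX")  # 2 of 3 "Mac" bookmarks
osx_to_mac = tag_overlap(sample, "OSX", "Mac")  # 2 of 4 "OSX" bookmarks
```

The result is a degree between 0 and 1 rather than a yes-or-no synonym judgment, and it is not symmetric: "Mac" implies "OSX" more strongly than the reverse in this sample.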
-
You can see there's a tag "to_read". A professional cataloguer would look at this tag in horror -- "This is context-dependent and temporary." Well, so was the category "East Germany." Once you expand your time scale to include the actual life of the categorization scheme itself, you recognize that the distinction between temporary and permanent is awfully vague. There isn't in fact a binary condition of a tag that can or cannot survive any kind of long-term examination.
-
It comes down ultimately to a question of philosophy. Does the world make sense or do we make sense of the world?
-
If, on the other hand, you believe that we make sense of the world, if we are, from a bunch of different points of view, applying some kind of sense to the world, then you don't privilege one top level of sense-making over the other. What you do instead is you try to find ways that the individual sense-making can roll up to something which is of value in aggregate, but you do it without an ontological goal. You do it without a goal of explicitly getting to or even closely matching some theoretically perfect view of the world.
-
"A lot of users tagging things foobar are also tagging them frobnitz. I'll tell the user foobar and frobnitz are related." It's up to the user to decide whether or not that recommendation is useful -- del.icio.us has no idea what the tags mean. The tag overlap is in the system, but the tag semantics are in the users. This is not a way to inject linguistic meaning into the machine.