sensemaking: Group items tagged categorizing

Jack Park

Java Text Categorizing Library - 0 views

  •  
    The Java Text Categorizing Library (JTCL) is a pure Java 1.5 implementation of libTextCat, which in turn is "a library that was primarily developed for language guessing, a task on which it is known to perform with near-perfect accuracy". It is distributed under the LGPL and can also be used to categorize text into arbitrary topics by computing appropriate fingerprints that represent the categories.
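    The fingerprint idea can be sketched in a few lines. What follows is a minimal Python illustration of ranked character n-gram profiles compared with an "out-of-place" distance, the approach usually attributed to Cavnar and Trenkle and commonly associated with libTextCat. It is not JTCL's API: the function names, the toy training sentences, and the `max_n` and `top_k` parameters are all assumptions made for illustration.

```python
from collections import Counter


def fingerprint(text, max_n=3, top_k=300):
    """Return the top_k most frequent character n-grams (sizes 1..max_n), ranked."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # underscores mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Descending frequency; ties broken alphabetically so the ranking is deterministic.
    ranked = sorted(counts, key=lambda gram: (-counts[gram], gram))
    return ranked[:top_k]


def out_of_place(doc_fp, category_fp):
    """Sum of rank differences; n-grams absent from the category get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(category_fp)}
    penalty = len(category_fp)
    return sum(abs(i - rank[gram]) if gram in rank else penalty
               for i, gram in enumerate(doc_fp))


def categorize(text, category_fps):
    """Return the name of the category whose fingerprint is closest to the text."""
    doc_fp = fingerprint(text)
    return min(category_fps, key=lambda name: out_of_place(doc_fp, category_fps[name]))


# Hypothetical toy training data; a real category needs far more text.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat sat on the mat with the hat",
    "german": "der schnelle braune fuchs springt über den faulen hund und die katze sitzt auf der matte",
}
CATEGORY_FPS = {name: fingerprint(text) for name, text in SAMPLES.items()}
```

    Calling `categorize(some_text, CATEGORY_FPS)` returns the name of the closest category. Topic categories work the same way as languages here: each topic's fingerprint is simply built from sample documents on that topic.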
Jack Park

InfoTangle :: The Hive Mind: Folksonomies and User-Based Tagging :: December :: 2005 - 1 views

  •  
    There is a revolution happening on the Internet that is alive and building momentum with each passing tag. With the advent of social software and Web 2.0, we usher in a new era of Internet order, one in which the user has the power to affect their own online experience and contribute to others'. Today, users are adding metadata and using tags to organize their own digital collections, categorize the content of others and build bottom-up classification systems. The wisdom of crowds, the hive mind, and the collective intelligence are doing what heretofore only expert catalogers, information architects and website authors have done. They are categorizing and organizing the Internet and determining the user experience, and it's working. No longer do the experts have the monopoly on this domain; in this new age users have been empowered to determine their own cataloging needs. Metadata is now in the realm of the Everyman.
Jack Park

Sluijs - 0 views

  •  
    The present research analyses the 'social visualization' tool Sense.us, a commercial interactive Web application in which U.S. Census data are visualized. Sense.us was developed as a tool for social data exploration and interaction, in which it would be worthwhile to pay attention to the socio-cultural values that have driven the collection and categorization of the underlying U.S. Census datasets. It is argued that closer attention to value-driven U.S. Census statistics would greatly enhance the social appeal of Sense.us, and would be a logical next step in the development of online social visualization tools. In order to allow for explicit socio-cultural values of statistics in online visualizations, three strategies are offered: pro-active annotation; more attention to visual aesthetics; and a tighter integration of user profiles and represented data.
Jack Park

Everybody | Faviki - Social bookmarking tool using smart semantic Wikipedia (DBpedia) tags - 1 views

  •  
    Faviki is a social bookmarking tool which allows you to tag webpages you want to remember with Wikipedia terms. This means that everybody uses the same names for tags from the world's largest collection of knowledge. Thanks to DBpedia, which extracts structured information from Wikipedia and represents it in a flexible data model, these tags are references to objects which are categorized automatically, keeping your and your friends' bookmarks and interests well organized.
Jack Park

PostGlobal Global Power Barometer (washingtonpost.com) - 0 views

  •  
    As it tracks and analyzes thought and actions across the world, the Global Power Barometer (GPB) frequently catches sight of issues that will impact global politics. These are the issues that likely will move the icons in coming weeks. We'll share our peeks at the future as they pass certain momentum thresholds. In future days we'll categorize the "Emerging Issues" and provide snippets about the progress of significant trends.
Jack Park

HCLSIG BioRDF Subgroup/aTags - ESW Wiki - 1 views

  •  
    1. The primary intention of creating aTags is not the categorization of the document, but the representation of the key facts inside the document. Key facts in the biomedical domain might be, for example, "Protein A interacts with protein B" or "Overexpression of protein A in tissue B is the cause of disease C".
    2. An aTag comprises a set of associated entities. The size of the set is arbitrary, but will typically lie between 2 and 5 entities. For example, the fact "Protein A binds to protein B" can be represented with an aTag comprising the three entities "Protein A", "Molecular interaction" and "Protein B". Similarly, the fact "Overexpression of protein A in tissue B is the cause of disease C" can be represented with an aTag comprising the four entities "Overexpression", "Protein A", "Tissue B" and "Disease C".
    3. Each document or database entry can be described with an arbitrary number of such aTags. Each aTag can be associated with the relevant portions of text or data in a fine granularity.
    4. The entities in an aTag are not simple strings, but resources that are part of ontologies and RDF/OWL-enabled databases. For example, "Protein A" and "Protein B" are resources that are defined in the UniProt database, whereas "Molecular Interaction" is a class in the branch of biological processes of the Gene Ontology. They are identified with their URIs.
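    The aTag structure described above might be modelled roughly as follows. This is a minimal Python sketch, not the HCLSIG serialization: the class names, the `make_atag` helper, the `source` field, and every `example.org` URI are hypothetical placeholders standing in for real UniProt and Gene Ontology identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A resource identified by a URI rather than by a plain string."""
    label: str
    uri: str


@dataclass(frozen=True)
class ATag:
    """An unordered set of associated entities describing one key fact."""
    entities: frozenset
    source: str = ""  # hypothetical pointer to the annotated text portion


def make_atag(*entities, source=""):
    """Build an aTag from any number of entities (typically 2 to 5)."""
    return ATag(frozenset(entities), source)


def labels(atag):
    """Return the entity labels of an aTag in alphabetical order."""
    return sorted(entity.label for entity in atag.entities)


# Placeholder URIs; real aTags would point at UniProt and Gene Ontology resources.
protein_a = Entity("Protein A", "http://example.org/uniprot/ProteinA")
protein_b = Entity("Protein B", "http://example.org/uniprot/ProteinB")
interaction = Entity("Molecular interaction", "http://example.org/go/MolecularInteraction")

# The fact "Protein A binds to protein B" as a three-entity aTag.
binding = make_atag(protein_a, interaction, protein_b, source="abstract, sentence 2")

# One document can carry an arbitrary number of aTags.
document_atags = {"http://example.org/doc/1": [binding]}
```

    Using a frozen set reflects that the entities are associated without any stated order, and two aTags built from the same entities and source compare equal.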
Stian Danenbarger

Halpin et al: "The Complex Dynamics of Collaborative Tagging" (PDF, 2007) - 6 views

  •  
    "The debate within the Web community over the optimal means by which to organize information often pits formalized classifications against distributed collaborative tagging systems. A number of questions remain unanswered, however, regarding the nature of collaborative tagging systems including whether coherent categorization schemes can emerge from unsupervised tagging by users. This paper uses data from the social bookmarking site del.icio.us to examine the dynamics of collaborative tagging systems. In particular, we examine whether the distribution of the frequency of use of tags for “popular” sites with a long history (many tags and many users) can be described by a power law distribution, often characteristic of what are considered complex systems. We produce a generative model of collaborative tagging in order to understand the basic dynamics behind tagging, including how a power law distribution of tags could arise. We empirically examine the tagging history of sites in order to determine how this distribution arises over time and to determine the patterns prior to a stable distribution. Lastly, by focusing on the high-frequency tags of a site where the distribution of tags is a stabilized power law, we show how tag co-occurrence networks for a sample domain of tags can be used to analyze the meaning of particular tags given their relationship to other tags."
  •  
    The paper shows that the tags users choose are not chaotic, but rather quickly converge to a common descriptive set of tags that is almost unchanging over time. Perhaps once the tags have stabilized, coherent URI-based identification schemes could emerge?
  •  
    Nice paper, thanks. Categories / tags / subjects / topics / issues ... that's what I'm working with right now. p.s. sure would be nice if the email notification included the source URL. I'm far more likely to download the PDF when I see something like www2007.org/paper635.pdf
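    To get a feel for how a skewed, slowly changing tag distribution could arise, here is a small Python simulation. It is a generic preferential-attachment sketch, not the generative model from the paper: the function name, the `p_new` probability of inventing a new tag, and the fixed seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def simulate_tagging(steps, p_new=0.1, seed=0):
    """Simulate tag choices where popular tags are more likely to be reused.

    At each step a user either invents a new tag (probability p_new) or
    copies a tag from the history, which picks tags in proportion to how
    often they have already been used.
    """
    rng = random.Random(seed)
    history = [0]  # tag ids in order of use; tag 0 is the first tag applied
    next_id = 1
    for _ in range(steps - 1):
        if rng.random() < p_new:
            history.append(next_id)
            next_id += 1
        else:
            history.append(rng.choice(history))  # reuse, weighted by past use
    return Counter(history)


counts = simulate_tagging(5000)
# Tag frequencies from most to least used; a few tags dominate, many are rare.
frequencies = [count for _, count in counts.most_common()]
```

    Plotting `frequencies` against rank on log-log axes is the usual way to check by eye whether such a distribution is heavy-tailed.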
Jack Park

The Trade & Environment Database - 0 views

  •  
    The Trade & Environment Database (TED) is a collection of categorical case studies that began with a focus solely on environmental issues, but did not include the economic consequences of other social policy choices, such as culture, rights, or other issues.