Group items tagged

wiki.dbpedia.org : About - 2 views

dbpedia.org/About

dbpedia wikipedia

shared by Kurt Laitner on 19 Feb 10

10 Ways to Use OpenCalais Today | OpenCalais - 0 views

www.opencalais.com/...10-ways-use-opencalais-today

entity extraction opencalais openpublish

shared by François Dongier on 06 Feb 10

What Does Calais Do?

It analyzes text you send it and extracts entities (people, organizations, geographies, etc.). In many cases, it links those entities to the world of Linked Data. It extracts facts, such as the fact that John Doe is the CEO of Acme Corporation. It extracts events, such as mergers, earnings announcements, natural disasters and a bunch of others. It attaches a topic to the text as a whole, much like a newspaper would (Sports, Finance, Health, etc.). It creates SocialTags – our attempt to “tag” the article the way a human would to file it away somewhere.

It’s free for up to 50,000 submissions per day for commercial or non-commercial purposes.

Content Enhancement: There’s a whole world of Linked Data out there, and OpenCalais can be your entry point. For example, take in press releases and extract the companies mentioned in them. Use OpenCalais’ Linked Data entry points to get the SIC codes and the link to DBpedia. Access DBpedia and enhance your content with other information about the company, such as locations, people and products. Access GeoNames to figure out what region the company is located in. Take that enhanced content and do cool things with it (like triage, workflow and presentation).
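The DBpedia lookup step described above can be sketched roughly as follows. This only builds a SPARQL query and a request URL; it does not call the network. The resource name `Acme_Corporation` and both helper functions are hypothetical, and the three `dbo:` properties are an assumption about what a company resource might carry. The endpoint URL is DBpedia's public SPARQL endpoint, and the `format` parameter is the one its Virtuoso server is generally understood to accept.

```python
from urllib.parse import urlencode

DBPEDIA_SPARQL = "https://dbpedia.org/sparql"  # public DBpedia endpoint


def build_company_query(resource_name: str) -> str:
    """Return a SPARQL query for a few facts about one DBpedia resource.

    The property list is an assumption; real resources may lack some of them.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?property ?value WHERE {{
  VALUES ?property {{ dbo:location dbo:keyPerson dbo:product }}
  dbr:{resource_name} ?property ?value .
}}
LIMIT 50
""".strip()


def build_request_url(resource_name: str) -> str:
    """Encode the query as a GET URL that asks for JSON results."""
    params = {
        "query": build_company_query(resource_name),
        "format": "application/sparql-results+json",
    }
    return f"{DBPEDIA_SPARQL}?{urlencode(params)}"


if __name__ == "__main__":
    # "Acme_Corporation" is a placeholder, not a claim about a real entry.
    print(build_request_url("Acme_Corporation"))
```

In practice the resource URI would come from the link OpenCalais returns for the extracted company rather than from a hand-typed name.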

Alerting: Give users the ability to be alerted when certain types of content become available. Unlike simple keyword alerting, with OpenCalais + Linked Data you can construct alerts like, “Tell me when there is M&A activity for a company in the Steel industry.”
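A minimal sketch of such an alert, assuming each article has already been reduced to a set of extracted event types plus a set of industries looked up through Linked Data. The `Document` and `AlertRule` classes and the labels `"MergerAcquisition"` and `"Steel"` are hypothetical, not OpenCalais output formats.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical, simplified view of the metadata for one article."""
    title: str
    events: set = field(default_factory=set)      # extracted event types
    industries: set = field(default_factory=set)  # from linked company data


@dataclass
class AlertRule:
    event_type: str
    industry: str

    def matches(self, doc: Document) -> bool:
        # Both conditions must hold: the event was extracted AND a
        # mentioned company belongs to the industry of interest.
        return self.event_type in doc.events and self.industry in doc.industries


def triggered(rule: AlertRule, docs: list) -> list:
    """Return the titles of all documents that satisfy the rule."""
    return [doc.title for doc in docs if rule.matches(doc)]


if __name__ == "__main__":
    docs = [
        Document("Steelmaker buys rival", {"MergerAcquisition"}, {"Steel"}),
        Document("Steel prices rise", set(), {"Steel"}),
        Document("Bank merger announced", {"MergerAcquisition"}, {"Banking"}),
    ]
    print(triggered(AlertRule("MergerAcquisition", "Steel"), docs))
```

A keyword alert on "steel" would fire for the first two articles; the combined rule fires only for the first.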

Automated News Portals: Want to create a general-purpose news portal? Or maybe one that deals only with baseball news? Great. Subscribe to and/or acquire some content sources, and feed them through OpenCalais. Then use the metadata to throw away what you don’t care about and to organize the rest by topic, geography, person – whatever. A great example of an off-the-shelf solution that does this is OpenPublish.
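The "throw away what you don't care about and organize the rest" step might look like the sketch below. The article dictionaries and topic labels are made up for illustration and do not reflect the actual shape of OpenCalais responses.

```python
from collections import defaultdict


def organize_by_topic(articles, wanted_topics):
    """Drop articles with no wanted topic and group the rest by topic.

    `articles` is a list of dicts with hypothetical keys "title" and "topics".
    """
    sections = defaultdict(list)
    for article in articles:
        for topic in article["topics"]:
            if topic in wanted_topics:
                sections[topic].append(article["title"])
    return dict(sections)


if __name__ == "__main__":
    feed = [
        {"title": "Ninth-inning comeback", "topics": ["Sports"]},
        {"title": "Quarterly earnings beat", "topics": ["Finance"]},
        {"title": "Stadium bond issue", "topics": ["Sports", "Finance"]},
        {"title": "Flu season update", "topics": ["Health"]},
    ]
    print(organize_by_topic(feed, {"Sports", "Finance"}))
```

An article carrying several wanted topics appears in each matching section, which suits a portal with overlapping channels.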

Finer-Grained / Higher-Value Syndication: Do you have content consumers via RSS or other syndication methods? Give them a better experience by allowing them to create their own channels based on OpenCalais metadata. Create channels based on region, types of events, companies, etc., or any combination of those and other items.

SEO: Something we get asked about all the time. We know people are experimenting, but they’re not being very public about their experimentation. Here’s a simple idea though: make your content more search friendly. There are two routes, one easy and one a little harder. Route 1: Translate events into human-readable text and get it on your page. Have a complicated article about an LBO of company x by people y? OpenCalais will identify an M&A event. Take that event and turn it into a tag like “Acquisitions”, something people might actually search for. Don’t just use it as a metatag; incorporate it into the page via navigation or whatever so Google pays attention. Route 2: Use linked data to enhance your content. If you’re talking about a company or geography, use OpenCalais Linked Data to enhance the page with additional information from DBpedia, GeoNames, the CIA World Factbook or a bunch of other sources.
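Route 1 could be sketched as a small lookup from extracted event types to reader-facing tags, rendered as visible navigation links. The event-type names, the tag table and the `/tag/...` URL scheme are all hypothetical.

```python
import html

# Hypothetical mapping from extracted event types to searchable tags.
EVENT_TO_TAG = {
    "MergerAcquisition": "Acquisitions",
    "EarningsAnnouncement": "Earnings",
    "NaturalDisaster": "Natural Disasters",
}


def visible_tags(event_types):
    """Map event types to tags, dropping unknown events and duplicates."""
    tags = []
    for event_type in event_types:
        tag = EVENT_TO_TAG.get(event_type)
        if tag and tag not in tags:
            tags.append(tag)
    return tags


def render_nav(tags):
    """Render tags as on-page links rather than hidden metatags."""
    items = "".join(
        '<li><a href="/tag/{slug}">{label}</a></li>'.format(
            slug=tag.lower().replace(" ", "-"), label=html.escape(tag)
        )
        for tag in tags
    )
    return f"<ul>{items}</ul>"


if __name__ == "__main__":
    print(render_nav(visible_tags(["MergerAcquisition", "Unknown"])))
```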

Extracting Enterprise Vocabularies Using Linked Open Data | Semantic Web Dog Food - 0 views

data.semanticweb.org/...html

linkeddata Juan Sequeda entity extraction vocabularies

shared by François Dongier on 04 Feb 10

A common vocabulary is vital to smooth business operation, yet codifying and maintaining an enterprise vocabulary is an arduous, manual task. We describe a process to automatically extract a domain-specific vocabulary (terms and types) from unstructured data in the enterprise, guided by term definitions in Linked Open Data (LOD). We validate our techniques by applying them to the IT (Information Technology) domain, taking 58 Gartner analyst reports and using two specific LOD sources: DBpedia and Freebase.
- François Dongier on 04 Feb 10
  
  This IBM article is referenced by Juan Sequeda in a post to the Linking Open Data mailing list (public-lod@w3.org, Feb 4, 2010):

  Hi Matthias,

  We worked on something similar: entity type discovery using linked open data. Our project was, given a corpus of documents in the same domain, to identify specific entity types in the documents. Our objective was to search for documents in a corpus by specific entities. For example: "find articles that are about RDBMSs".

  Standard NER tools identify high-level types such as persons, organizations, and places because they have been previously trained on general corpora. I assume tools like OpenCalais have been trained on news-like documents and Zemanta has been trained on blog-like documents. We were interested in identifying specific types such as "RDBMS" when the word "Oracle" would show up in the text.

  In order to do that, we followed several domain term extraction techniques. We used LOD, specifically DBpedia, Freebase and OpenCyc, to disambiguate terms and also retrieve the entities. Honestly, evaluation is pretty hard to do, but our current implementation was not that bad (75% precision and 55% recall). We built upon some work by IBM where they create a vocabulary from text using LOD [1].

  Let me see if I can clean up the code and publish it as a service.

  [1] http://data.semanticweb.org/conference/iswc/2009/paper/inuse/143/html

  Juan Sequeda
  (575) SEQ-UEDA
  www.juansequeda.com
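To put the two quoted figures in context, precision and recall are often combined into an F1 score, their harmonic mean. The email does not report F1; the snippet below is only illustrative arithmetic on the stated 75% precision and 55% recall, which works out to roughly 0.63.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Figures quoted in the email: 75% precision, 55% recall.
    print(round(f1_score(0.75, 0.55), 2))
```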
  
