MHC Languages: group items tagged "search engine"

LRC MHC

Linguos Search Engine - 0 views

shared by LRC MHC on 12 May 09
  •  
    linguos.com is a window to the non-English web. Users can search the non-English web with just an English QWERTY keyboard. This is NOT a translation service, but a transliteration- and transcription-based search engine. In most languages, enter your queries as they would sound in your target language. linguos.com is powered by Linguaseek Language Technologies. linguaseek.com is a portal and a platform that enables multilingual search, communication and content generation. A single interface allows transliteration to over 120 languages (virtually all digitally available languages). Global portals and services can benefit from the service by integrating with or licensing linguaseek.com web services, allowing users to search for multilingual content, communicate (IM, email, etc.) and generate content (blogs, comments, web pages, etc.), all without requiring custom keyboards, software or transliteration schemes. linguaseek's transliteration is based on ISO standards where available and optimized for user input. Users familiar with English and one or more other languages will benefit the most from this service.
Daryl Beres

[oucs] All About Xaira - 1 views

  •  
    "Xaira is a text searching software originally developed at OUCS for use with the British National Corpus. This new version has been entirely re-written as a general purpose XML search engine, which will operate on any corpus of well-formed XML documents. It is however best used with TEI-conformant documents. Xaira has full Unicode support. This means you can use it to search and display text in any language, provided you have a suitable Unicode font installed on your system. At the heart of Xaira is the Xaira Object Model. This defines a range of objects and methods for representing and searching large amounts of linguistic data. The Xaira Server program implements this model. The Xaira Indexer program creates platform-independent indexes from collections of XML documents for use by the Server. Both these Xaira components can be deployed on any platform. Client programs can access a Xaira server using a close-coupled API such as that used by the Windows client (which is written in C++), or via XMLRPC or SOAP. We provide a fully-featured client for Windows, and a PHP code library which makes it easy to develop applications for the web which can talk to a Xaira server. All versions of Xaira are now distributed free of charge under the GNU General Public Licence."
Daryl Beres

Google as a Quick 'n Dirty Corpus Tool - 0 views

  •  
    "Until recently it was assumed that specialized software was required to do concordancing, but it turns out that a search engine such as Google can generate queries into almost limitless corpora (using the Advanced Search feature from the main portal page, for example). This paper by Tom Robb addresses more refined issues regarding the integrity of the data thus derived, and how we might improve on the integrity of that data through more defined searches, as explained here. "
LRC MHC

Open-Source Large Vocabulary CSR Engine Julius - 0 views

  •  
    ""Julius" is a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers. Based on word N-gram and context-dependent HMM, it can perform almost real-time decoding on most current PCs in 60k word dictation task. Major search techniques are fully incorporated such as tree lexicon, N-gram factoring, cross-word context dependency handling, enveloped beam search, Gaussian pruning, Gaussian selection, etc. Besides search efficiency, it is also modularized carefully to be independent from model structures, and various HMM types are supported such as shared-state triphones and tied-mixture models, with any number of mixtures, states, or phones. Standard formats are adopted to cope with other free modeling toolkit such as HTK, CMU-Cam SLM toolkit, etc. The main platform is Linux and other Unix workstations, and also works on Windows. Most recent version is developed on Linux and Windows (cygwin / mingw), and also has Microsoft SAPI version. Julius is distributed with open license together with source codes. Note: you should prepare a language model and an acoustic model to run a speech recognition with Julius. "
LRC MHC

New Search Technologies Mine the Web More Deeply - NYTimes.com - 0 views

  •  
    Beyond those trillion pages lies an even vaster Web of hidden data: financial information, shopping catalogs, flight schedules, medical research and all kinds of other material stored in databases that remain largely invisible to search engines.
LRC MHC

JimStroud - How to make Google your English Teacher | Englishcafe - 0 views

  •  
    Google is a very popular search engine, but did you know that it could also serve as a tutor? Click here to download a 5-page guide, or scroll down to preview a few tips from the guide itself.