kcenter_search: Group items tagged indexation

Christophe ICD

[pdf] EQUALITY OF RETRIEVAL: Levelling the Metadata Playing Field in Big Indexes, by A... - 0 views

  •  
    "The University of Calgary's Libraries and Cultural Resources became a beta partner with Serials Solutions' unified discovery service, Summon, in the spring of 2009. Since then it has worked to include metadata from numerous disparate systems in a single index to drive discovery in a Google-like environment. The University of Calgary has examined how MARC and other metadata schemas are mapped into Summon with an eye to ensuring the maximum possible population of index fields representing facets in addition to adhering to the established standards for cross mapping metadata schemas and indexing. The University of Calgary has investigated existing standards and worked closely with the Summon team to create mappings that reflect how MARC and other metadata can ultimately be used in big indexes. Combined with the normalization or collapsing of metadata records representing the same resource into a single metadata-rich record, fully leveraging MARC and other metadata in big indexes should not only level the metadata playing field but make competition between records a non-issue."
Christophe ICD

MasterKey Platform | Index Data - 0 views

  •  
    "MasterKey is a growing and evolving family of tools for building sophisticated information discovery solutions. They can be used individually, or they can be combined together using a shared service-based architecture. Some of the tools have been released by Index Data under OSS (open source software) licenses, whereas others, at this time, are made available exclusively to our customers. To complement the software, Index Data offers a full range of support, consulting, and development services. For businesses, our components are flexible and modular enough to fit into practically any existing software platform. For consortia and larger libraries, we offer an opportunity to create a solution which is uniquely adapted to local needs, while also based on open software components and open standards."
Christophe ICD

Zebra | Index Data - 0 views

  •  
    "Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads structured records in a variety of input formats (eg. email, XML, MARC) and allows access to them through exact boolean search expressions and relevance-ranked free-text queries. Zebra supports large databases (more than ten gigabytes of data, tens of millions of records). It supports incremental, safe database updates on live systems. You can access data stored in Zebra using a variety of Index Data tools (eg. YAZ and PHP/YAZ) as well as commercial and freeware Z39.50 clients and toolkits. Zebra is free software, available under the GPL license."
Christophe ICD

Information Toolmakers | Index Data - 0 views

  •  
    "For more than 15 years, Index Data has built technologies and solutions in support of searching. Our customers and partners include national libraries and consortia, government agencies, and commercial companies. Our business is to seek out the most challenging problems in the area of metasearching and large-scale content indexing and searching, and to engineer industrial-strength tools to help people solve those problems. We have been active participants in and contributors to the international standards communities in our field since the very beginning. Our open-source protocol implementations are the most widely used in the industry and have had significant impact on both the evolution and penetration of international standards like Z39.50 and SRU. Because of our long history in this area, and our focus on building flexible, reusable tools for developers and organizations, our partners can focus their attention on adding value and on creating compelling user experiences. "
Christophe ICD

Présentation de Summon | 24 hour library people - 0 views

  •  
    "Contrairement aux moteurs de recherche fédérée, Summon s'appuie sur un index unique, constamment mis à jour, qui contient les méta-données des ressources que propose la bibliothèque, ce qui garantirait un temps de réponse plus court. Concernant l'affichage des résultats, il se fait « selon leur pertinence », sans que l'on puisse savoir sur quoi cette pertinence est basée. Il semble cependant que l'algorithme traite principalement les métadonnées des notices plutôt que celles du texte intégral pour ne pas générer de bruit. Les résultats sont proposés sous la forme d'un affichage à facettes, avec des possibilités pour affiner sa recherche (tri par contenu, par date, par bibliothèque…), et un fil RSS est proposé sur les résultats de la recherche. Summon est construit sur une architecture logicielle « libre » (Lucene). Rien n'est à installer « en dur » localement, tout est hébergé sur les serveurs de SerialsSolutions (Seattle, WA). Summon bénéficie de la puissance de feu de ProQuest-CSA (métadonnées de Ulrich's) et a passé des accords avec de nombreux éditeurs pour proposer aujourd'hui un index riche de plus de 500 millions de documents indexés, provenant de plus de 70 000 périodiques et plus de 100 fournisseurs de contenus dans sa base de connaissance. L'interrogation des bases d'archives ouvertes est possible."
Christophe ICD

Any published benchmarks between Google, FAST, Verity, Autonomy, or other enterprise se... - 0 views

  •  
    "The Google Search Appliance (GSA) is a black-box system in that you install it, set up your options, and it runs. It is certainly standards-based: it indexes HTML and other popular formats; and the results are typically defined using XML and style sheets. But the options you can customize with regard to data sources, relevance ranking, and extended search (thesaurus, taxonomies, and parametric or faceted search) are somewhat limited. FAST, Autonomy/Verity K2, OmniFind and other traditional enterprise search engines have always been toolkits. You install the software and begin the process of customizing it for your environment. Data in databases or content repositories? No problem. Custom security implementation? Modify the indexing and search methods. Have custom thesauri or existing taxonomies? Plug them in. Need parametric or faceted search results? Small matter of programming - although not much. Want to change the way results are ranked or sorted? Use the native query syntax - for example, FAST Query Language (FQL) or the Verity Query Language (VQL)."
Christophe ICD

VuFind: The library OPAC meets Web2.0! - 0 views

  •  
    " VuFind is a library resource portal designed and developed for libraries by libraries. The goal of VuFind is to enable your users to search and browse through all of your library's resources by replacing the traditional OPAC to include: * Catalog Records * Locally Cached Journals * Digital Library Items * Institutional Repository * Institutional Bibliography * Other Library Collections and Resources VuFind is completely modular so you can implement just the basic system, or all of the components. And since it's open source, you can modify the modules to best fit your need or you can add new modules to extend your resource offerings. VuFind has many APIs to interact with the search, data and many other features. You can syndicate your record data with other institutions via an OAI server. You can search using vufind's algorithms via OpenSearch. And if you want complete access to your indexed data, you can interact with Solr, VuFind's backend search and index engine."
Christophe ICD

After Losing Users in Catalogs, Libraries Find Better Search Software | Technology - Th... - 0 views

  •  
    "But commercial vendors, smelling a new market, are stepping in. Serials Solutions, a subsidiary of ProQuest, released a software product in July called Summon. The company has been negotiating deals with publishers and content providers to create a searchable index of their content. It's like Google, except what Summon provides is an index of the "deep Web" of paid content. So now university libraries that pay for a subscription to Summon can let their users search their licensed content as well as locally owned stuff, together. Summon has 17 customers so far, including Arizona State University and Dartmouth College. The catch? It can be expensive. Andrew S. Nagy, senior discovery-services engineer at Serials Solutions, wouldn't say how expensive. But the cost of a subscription can run into the tens of thousands, said one university administrator who was not authorized to discuss price and thus wanted to remain anonymous. Summon also does not have permission to display the full text of articles."
Christophe ICD

ALA Midwinter 2010 - Recent Trends in Catalog Architecture | ALCTS Committee on Catalog... - 0 views

  •  
    - To Fix A Leaky Sink: Envisioning The Potential of Discovery Layers
    - LENS: Catalog Records and Additional Data Sources in the Aquabrowser Implementation at the University of Chicago
    - Automated Metadata Repurposing Using eXtensible Catalog Software
    - Equality of Retrieval: Leveling the Metadata Playing Field in Big Indexes
Christophe ICD

Hacking Summon | Code{4}lib - Issue 11, 2010-09-21 - 0 views

  •  
    This article will explore the space between Summon's out-of-the-box user interface and full developer API, providing practical advice on tweaking configuration information and catalog exports to take full advantage of Summon's indexing and faceting features. The article then describes the creation of OSUL's home-grown open source availability service which replaced and enhanced the availability information that Summon would normally pull directly from the catalog.
Christophe ICD

Evaluation of Federated Searching Options for the School Library | American Association... - 0 views

  •  
    "Three hosted federated search tools, Follett One Search, Gale PowerSearch Plus, and WebFeat Express, were configured and implemented in a school library. Databases from five vendors and the OPAC were systematically searched. Federated search results were compared with each other and to the results of the same searches in the database's native interface to disclose differences in handling query syntax, searching, retrieval, browsing results, etc. Each product was easily configured, but none were capable of searching every database desired. Simpler Boolean queries are the most successful queries because of the underlying structure and differences of the databases, and the capabilities of certain products. Federated search products succeed in simplifying access to multiple database resources at school, but searching remains different from the familiar Web search engines in many ways. To become more Google-like, federated searching must be done against indexes built in advance instead of the current real-time searching method."
Christophe ICD

Search features of digital libraries | Information Research, Vol. 5 No. 3, April 2000 - 0 views

  •  
    "Traditional on-line search services such as Dialog, DataStar and Lexis provide a wide range of search features (boolean and proximity operators, truncation, etc). This paper discusses the use of these features for effective searching, and argues that these features are required, regardless of advances in search engine technology. The literature on on-line searching is reviewed, identifying features that searchers find desirable for effective searching. A selective survey of current digital libraries available on the Web was undertaken, identifying which search features are present. The survey indicates that current digital libraries do not implement a wide range of search features. For instance: under half of the examples included controlled vocabulary, under half had proximity searching, only one enabled browsing of term indexes, and none of the digital libraries enable searchers to refine an initial search. Suggestions are made for enhancing the search effectiveness of digital libraries; for instance, by providing a full range of search operators, enabling browsing of search terms, enhancement of records with controlled vocabulary, enabling the refining of initial searches, etc."
Christophe ICD

Time Challenges - Challenging Times for Future Information Search - 0 views

  •  
    "It is hard to predict what the major challenge in search will be 100 years from now. The challenge may not even be related to information retrieval itself but could be the result of shortages of electricity, network disruptions due to insurgencies, information manipulation or access denial by an uncontrolled computer-based artificial intelligence (as imagined in the science fiction movie The Terminator). Of course, we could simply extrapolate the current ongoing trends, which we know do have an effect on information storage and retrieval, and might hope this gives an indication of some of the challenges that may affect finding and understanding information in a long-term perspective. In this article our focus will be on challenges that can be traced back to Time. Search is a two-sided issue: On one side, once data has been generated it has to be stored somewhere (on volatile or non-volatile media) and this stored data must somehow be accessible to a search or indexing engine; on the other side are the processes of search, retrieval and analysis of the data."
Christophe ICD

Endeca Information Access Platform (IAP) - 0 views

  •  
    "Endeca's revolutionary Information Access Platform (IAP) is the foundation of all our enterprise search solutions, and enables your business to rapidly and cost-effectively configure and deploy search applications that fit your business needs. By delivering configurable search-based business applications that offer any user interactive access to large volumes of any type of data, no matter the source or location, Endeca's IAP has differentiated itself from all other enterprise search technologies. With our innovative IAP, you'll have comprehensive functionality for connecting to and indexing enterprise content, and then exposing that information to your customers or employees with specific information needs."
Christophe ICD

Emerald: Article Request - "Power tags" in information retrieval - 0 views

  •  
    "Many Web 2.0 services (including Library 2.0 catalogs) make use of folksonomies. The purpose of this paper is to cut off all tags in the long tail of a document-specific tag distribution. The remaining tags at the beginning of a tag distribution are considered power tags and form a new, additional search option in information retrieval systems. In a theoretical approach the paper discusses document-specific tag distributions (power law and inverse-logistic shape), the development of such distributions (Yule-Simon process and shuffling theory) and introduces search tags (besides the well-known index tags) as a possibility for generating tag distributions."
Christophe ICD

koha-fr | communauté francophone koha - 0 views

  •  
    "Koha utilise le moteur d'indexation open-source Zebra. Le site http://www.indexdata.dk/zebra/ présente Zébra comme : "un outil hautement performant permettant d'indexer et de rechercher du texte structuré. Il est capable de lire les enregistrements dans différents formats d'entrée (par exemple email, XML, MARC) et donne accès aux données grâce à l'utilisation d'équations de recherches booléennes et de requêtes en texte libre. Zebra supporte de grosses bases de données ( dizaines de millions d'enregistrements, dizaines de gigabytes). Il permet la mise à jour incrémentale de la base de données sur des systèmes en production. Du fait que Zebra support le protocole de recherche Z39.50, vous pouvez rechercher dans les bases de données Zebra par l'intermédiaire d'une grande variété de programmes et outils, commerciaux ou libres, qui sont capables de communiquer via ce protocole." (traduit du Zebra - User-s Guide and Reference, p3, http://www.indexdata.dk/zebra/doc/zebra.pdf)."
Christophe ICD

Summon 'web scale'? I don't think so. | synthesize-specialize-mobilize - 0 views

  •  
    "I don't think its obvious, but what OCLC is trying to do with WorldCat is much bolder than Serials Solutions and Summon. With Summon, libraries are basically throwing all of their content into one index to break down the data silos within an institution. But what you end up with is a big search silo for that institution. With WorldCat, the vision is to break down not only the silos within institutions but also the silos between institutions. And not just break down those silos in the sense of harvest-and-search. The concept is that libraries and their patrons will be working together to improve a shared database through intentional and professional metadata. This shared database will be big enough to have a real impact on the web. Its records will surface in search engine results. Its interface will be familiar to many, and it will be customizable for a particular audience via the WorldCat Local route."
Christophe ICD

Sophia Search - 0 views

  •  
    "SOPHIA has been designed to address these recognised limitations. SOPHIA is unique in that it addresses the 'search problem' from a linguistic view point as opposed to from a purely mathematical perspective. It is based on an established model of language called Semiotics which describes how we understand and interpret the meaning of signs and texts in context. We don't argue that SOPHIA should replace Boolean search entirely - moreover it is a complementary tool that provides an enriched search experience that enables users to browse, navigate, discover and understand their information in context. SOPHIA has a 3-tier architecture designed for OEM and ease of integration with Development Partners. Contact sales@sophiasearch.com to explore Working Together on a project or application that can be enhanced and developed by a relationship with SOPHIA's semantic search technology."
Christophe ICD

[pdf] Visualisation interactive d'information, by M. Hascoët and M. Beaudouin... - 0 views

  •  
    "Les interfaces graphiques de recherche relèvent presque toutes d'un modèle unique et primaire : un champ de saisie uniligne et un bouton permettant de valider la requête saisie dans le champ. Face à la diversité des individus et de leurs besoins en informations, de nombreuses techniques de visualisation et d'interaction innovantes ont émergés ces dix dernières années. Pourtant ces innovations sont rarement utilisées et le but de cet article est de faire le point sur l'état de l'art des techniques de visualisation et d'interaction qui peuvent être pertinentes pour la recherche d'information."