Scientific Information Retrieval

This paper explains how new computer-based information retrieval (IR) methods have become an essential tool for the scientific community. The vast body of data generated by the sciences poses a problem: how can this data be converted into usable knowledge? Data is distributed worldwide, and work from universities and institutes is published and posted weekly, daily, even hourly in thousands of journals and reports.


The common search, or Boolean query, that computer users perform every day submits a term to a search engine. The engine runs a Boolean algorithm that finds the documents containing the search term, supported by an index of all terms in the database. The simple form of the Boolean query, which is efficiently implemented over large databases, suffers from several limitations: the number of retrieved documents is typically prohibitively large, and a substantial part of the retrieved documents is irrelevant to the user's information need.
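As a minimal sketch of the index-backed Boolean query described above, the following Python builds a term index and answers an AND query over it. The function names, the whitespace tokenizer, and the three sample documents are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercase whitespace tokenization stands in for real parsing.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, terms):
    """Return the ids of documents that contain every query term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical sample collection, for illustration only.
docs = {
    1: "protein folding simulation",
    2: "protein sequence database",
    3: "galaxy formation simulation",
}
index = build_index(docs)

print(sorted(boolean_and(index, ["protein"])))
print(sorted(boolean_and(index, ["protein", "simulation"])))
```

Each lookup touches only the index entries for the query terms, which is why this form scales to large databases. The result is an unranked set, however: every matching document is returned with equal standing, which is one way the limitations noted above show up in practice.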
A broadly used alternative to the Boolean query is the similarity query, which is typically based on the vector-space model. Under this setting, documents are viewed as (algebraic) vectors over terms. A query, q, may consist of many terms and may even comprise a complete document. It too is viewed as a body of text, rather than merely as a combination of search terms, and is represented as a vector as well. The retrieval task reduces to searching the database for the document vectors that are most similar to the query vector. ...
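The similarity query can be sketched in Python as follows. The text above does not say how similarity is measured or how terms are weighted, so this example assumes cosine similarity over raw term counts, a common choice for the vector-space model. The function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent a body of text as a term-count vector."""
    # Assumption: raw counts with whitespace tokenization; real systems
    # usually apply a weighting scheme on top of this.
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (score, doc_id) pairs, most similar document first."""
    q = to_vector(query)
    scored = [(cosine(q, to_vector(text)), doc_id) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample collection, for illustration only.
docs = {
    1: "protein folding simulation",
    2: "protein sequence database",
    3: "galaxy formation simulation",
}

for score, doc_id in rank("protein simulation", docs):
    print(doc_id, round(score, 3))
```

Because the query is itself just a vector, a whole document can be passed as `query` without changing the code. Unlike the Boolean case, every document receives a score, so the results can be truncated to the top few rather than returned as one large unordered set.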