Modern Information Retrieval (Baeza) - Chapter 4

Title: Query Languages
Summary: The author begins by making a clear distinction between data retrieval and information retrieval.

  • Protocol - a language that a higher-level software package uses to query an online database or a CD-ROM archive.
  • Document or Retrieval Unit - basic element which can be retrieved as an answer to a query (normally a set of basic elements is retrieved, sometimes ranked by relevance or other criterion).
  • Query - formulation of a user information need.
  • Term Frequency - number of times a word appears inside a document.
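  • Inverse Document Frequency - a weight that decreases as the number of documents containing a word increases, commonly computed as log(N / n), where N is the number of documents in the collection and n is the number of documents in which the word appears.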
  • Keyword-Based Query
    • Single Word Query
    • Context Query (Phrase, Proximity)
    • Boolean Query - has a syntax composed of atoms (i.e. basic queries) that retrieve documents, and of boolean operators which work on their operands.
    • Natural Language
  • Pattern Matching
    • Allows retrieval of pieces of text that have some property.
    • It is more difficult to rank the results of a pattern matching expression.
    • Types of pattern: words, prefixes, suffixes, substrings, ranges, allowing errors, regular expressions.
  • Structural Query - text collections tend to have some structure built into them (e.g. HTML).
    • Fixed Structure - e.g. mail (To, From, Body).
    • Hypertext - directed graph where the nodes hold some text and the links represent connections between nodes or between positions inside the nodes.
    • Hierarchical Structure
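
To make the term frequency and inverse document frequency definitions above concrete, here is a minimal sketch. It assumes whitespace tokenization and the common log(N / n) form of inverse document frequency; the toy collection `DOCS` and all function names are hypothetical and not taken from the chapter.

```python
import math
from collections import Counter

# Hypothetical toy collection, invented for illustration only.
DOCS = {
    "d1": "query languages for information retrieval",
    "d2": "boolean query languages",
    "d3": "pattern matching in text",
}


def term_frequency(term, text):
    """Number of times `term` occurs in `text` (whitespace tokenization)."""
    return Counter(text.lower().split())[term]


def document_frequency(term, docs):
    """Number of documents in which `term` occurs at least once."""
    return sum(1 for text in docs.values() if term in text.lower().split())


def inverse_document_frequency(term, docs):
    """log(N / n), one common form of IDF (an assumption of this sketch).

    Returns 0.0 for a term that appears in no document, to avoid
    dividing by zero.
    """
    n = document_frequency(term, docs)
    if n == 0:
        return 0.0
    return math.log(len(docs) / n)
```

A word that occurs in fewer documents, such as "pattern" here, receives a higher inverse document frequency than one that occurs in more documents, such as "query".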
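
The Boolean query description in the notes (atoms that retrieve documents, operators that work on their operands) can be sketched as set operations over an inverted index. This is only an illustration under stated assumptions: the nested-tuple query representation, the names `build_index` and `evaluate`, and the toy collection are hypothetical, and NOT is taken relative to the whole collection.

```python
from collections import defaultdict

# Hypothetical toy collection, invented for illustration only.
DOCS = {
    "d1": "query languages for information retrieval",
    "d2": "boolean query languages",
    "d3": "pattern matching in text",
}


def build_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def evaluate(query, index, all_ids):
    """Evaluate a Boolean query and return the set of matching document ids.

    A query is either a single word (an atom) or a tuple:
    ("AND", q1, q2), ("OR", q1, q2) or ("NOT", q).
    """
    if isinstance(query, str):
        # Atom: retrieve the documents that contain the word.
        return set(index.get(query, set()))
    op = query[0]
    if op == "AND":
        return evaluate(query[1], index, all_ids) & evaluate(query[2], index, all_ids)
    if op == "OR":
        return evaluate(query[1], index, all_ids) | evaluate(query[2], index, all_ids)
    if op == "NOT":
        return all_ids - evaluate(query[1], index, all_ids)
    raise ValueError(f"unknown operator: {op!r}")


INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)
```

For example, `evaluate(("AND", "query", ("NOT", "boolean")), INDEX, ALL_IDS)` selects the documents that contain "query" but not "boolean". The result is an unranked set, since a pure Boolean query either matches a document or does not.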


Copyright 2008 by WillWork.Org