Readings
  • Baeza, Chapter 2, Section 2.5.3

Modern Information Retrieval (Baeza) - Chapter 2

Title: Modeling
Summary: Section 2.5.3 focuses primarily on vector modeling.

  • Vector modeling - assigning non-binary weights to index terms in queries and documents.
    • Pros
      • Its term-weighting scheme improves retrieval performance (more precise).
      • Its partial matching strategy allows retrieval of documents that approximate the query conditions.
      • Its cosine ranking formula sorts the documents according to their degree of similarity to the query.
    • Cons
      • It yields ranked answer sets which are difficult to improve without query expansion or relevance feedback.
  • Degree of similarity - computed between each document stored in the system and the user query, using the term weights.
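
The cosine ranking and partial matching described above can be illustrated with a minimal sketch. The document names, terms, and weight values below are hypothetical and chosen only for illustration; the reading does not prescribe them, and a real system would derive the weights from a term-weighting scheme.

```python
import math


def cosine_similarity(doc_weights, query_weights):
    """Cosine of the angle between two sparse weight vectors (term -> weight)."""
    dot = sum(w * query_weights.get(term, 0.0) for term, w in doc_weights.items())
    doc_norm = math.sqrt(sum(w * w for w in doc_weights.values()))
    query_norm = math.sqrt(sum(w * w for w in query_weights.values()))
    if doc_norm == 0.0 or query_norm == 0.0:
        return 0.0  # an empty vector has no direction, so treat it as dissimilar
    return dot / (doc_norm * query_norm)


def rank(documents, query_weights):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(w, query_weights)) for name, w in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical non-binary term weights for three documents and one query.
documents = {
    "d1": {"attention": 0.8, "profiling": 0.6},
    "d2": {"attention": 0.5, "music": 0.9},
    "d3": {"shopping": 1.0},
}
query = {"attention": 1.0, "profiling": 1.0}

ranking = rank(documents, query)
for name, score in ranking:
    print(name, round(score, 3))
```

Here d2 matches only one of the two query terms, yet it still receives a nonzero score and is ranked below d1, which is the partial matching behavior listed under Pros. d3 shares no terms with the query and scores zero.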
Assignment

Information Need
In preparation for my thesis in the upcoming semesters, I have been researching attention profiling. Attention profiling is a form of user profiling that focuses on what a user pays attention to online. Online activities include e-mail, news articles, music, shopping, etc. Most of the information I have collected so far comes from various blogs and wikis. At this point, it is in my best interest to begin researching publications and journal articles related to attention profiling.

Downloads:

Report (44KB)



Information Storage and Retrieval (Korfhage)

Chapter 3 - Query Structures

  • Weight vector - a vector of terms and their respective weights or values.
  • Weights - can be assigned automatically based on the frequency counts of terms.
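
As a rough sketch of assigning weights automatically from frequency counts, the snippet below builds a weight vector from raw text. The whitespace tokenization and the normalization by the most frequent term are assumptions made for illustration; the chapter notes do not specify a particular formula, and `weight_vector` is a hypothetical helper name.

```python
from collections import Counter


def weight_vector(text):
    """Map each term to a weight derived from its frequency count.

    Assumption: weights are counts divided by the largest count,
    so the most frequent term gets weight 1.0.
    """
    terms = text.lower().split()  # naive whitespace tokenization
    counts = Counter(terms)
    max_count = max(counts.values(), default=1)
    return {term: count / max_count for term, count in counts.items()}


vector = weight_vector("attention profiling tracks attention online")
print(vector)
```

Other normalizations (for example, dividing by the total number of terms) would fit the same description; the choice affects the scale of the weights but not their ordering within a single document.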

Chapter 4 - The Matching Process

  • The fact that a document contains a given term does not mean that the document is strongly related to the term.
  • Document space - an organized set of documents.
  • In some models, the structure of the query precludes it from inclusion in the document space (e.g. Boolean model).
  • Characteristic function - a function having the value of 1 on documents relevant to the query and 0 on documents that are not relevant.
  • When the query and the document representation are similar, the query can be considered as a point in the document space.
  • Most retrieval systems equate relevance, in the form of topicality, with lexical similarity.
  • Vector modeling - similarity measurements are either distance-based or angular.
  • Cosine measure - cosine of the angle between the vectors representing the document and the query.
  • Distance measures - intrinsic, based solely on the group of documents under consideration.
  • Angular measures - extrinsic, representing a view of the document space from a fixed point, the origin.
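
One way to see the intrinsic/extrinsic distinction is to move every vector by the same offset, which changes where the origin sits relative to the documents. The sketch below is an illustration under that interpretation, not an example taken from the text; the two-dimensional vectors and the `shift` helper are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Distance measure: depends only on the two points themselves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Angular measure: cosine of the angle between a and b as seen from the origin."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def shift(vector, offset):
    """Translate a vector by adding the same offset to every coordinate."""
    return [x + offset for x in vector]


doc = [1.0, 0.0]
query = [0.0, 1.0]

distance_before = euclidean_distance(doc, query)
cosine_before = cosine(doc, query)

# Translate both points by the same amount, i.e. view them from a different origin.
moved_doc, moved_query = shift(doc, 1.0), shift(query, 1.0)
distance_after = euclidean_distance(moved_doc, moved_query)
cosine_after = cosine(moved_doc, moved_query)

print(distance_before, distance_after)  # unchanged by the translation
print(cosine_before, cosine_after)      # changes with the choice of origin
```

The distance between the two points is the same before and after the shift, while the cosine changes, which is consistent with describing distance measures as intrinsic and angular measures as tied to the origin.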

Copyright 2008 by WillWork.Org