On Human Communication (Cherry) - Chapter 1
Title: Communication and Organization
Summary:
- Speech and writing are by no means our only systems of communication. Social intercourse is greatly strengthened by habits of gesture - little movements of the hands and face.
- Communication means a sharing of elements of behavior, or modes of life, by the existence of sets of rules.
- Sign - any physical event used in communication.
- Three types of rules operate upon signs:
  - Syntactic rules - rules of syntax, relations between signs.
  - Semantic rules - relations between signs and the things, actions, relationships, and quantities they stand for (designata).
  - Pragmatic rules - relations between signs and their users.
- A man has remarkable powers of learning. Every communication, every perception adds to his accumulation of experiences; he is continually becoming a different person, for his every experience is part of a continuing process.
- Bees are able to discuss one thing only - food and where to find it.
- Animal signs can relate only to the future; unlike human language, they can never refer to the past.

Online Information Retrieval (Harter) - Chapter 3
Title: Database Structure, Organization, and Search
Summary: This chapter is written from the user's perspective.
- Record - refers to a document surrogate - a representation of the document for storage and subsequent retrieval.
- Entity - an object about which information will be stored. Entities are considered in terms of their characteristics, called attributes.
- Field - a set of characters that represent the value of an attribute for the entity under consideration.
- Hierarchy of data elements: bit → byte → subfield → field → record → database → library.
- Linear File - a set of index records, each describing one item or entity, arranged in an order based on the values of one or more attributes.
- Inverted Index - consists of records, typically alphabetically arranged, that are created from a linear file.
- Document / Term Matrix - rows are documents or records (the linear file), while columns are index terms (the inverted index). Example on page 73.
- Controlled Vocabulary - can be used for searching related terms.
- Boolean Operators - And, Or, Not: the order of operations is important and can be ambiguous.
- Word Proximity - requires, for example, that two search terms be adjacent; or be present in a particular field or fields, such as abstract or title; or be present together in any field or sentence; or be separated by n or fewer words.
- Truncation - to search on a piece of a longer word or phrase, usually its leftmost portion - using a wildcard (e.g. *).
- Stop Words - have no value for indexing or retrieval, and receive no entries in the inverted index (e.g. a, an, and, by).
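
The linear file, inverted index, stop words, truncation, and Boolean operators listed above can be sketched together in a few lines of Python. This is a minimal illustration, not the chapter's own example: the three records, the stop list, and the names `LINEAR_FILE`, `STOP_WORDS`, `build_inverted_index`, and `lookup` are all assumptions made for the sketch.

```python
import re

# Hypothetical linear file: one record per document, ordered by record id.
LINEAR_FILE = {
    1: "Online searching of an inverted index",
    2: "Indexing and retrieval by computer",
    3: "A history of the library catalog",
}

# Illustrative stop list (an assumption); real systems define their own.
STOP_WORDS = {"a", "an", "and", "by", "of", "the"}


def build_inverted_index(linear_file):
    """Map each non-stop term to the set of record ids that contain it."""
    index = {}
    for record_id, text in linear_file.items():
        for term in re.findall(r"[a-z]+", text.lower()):
            if term not in STOP_WORDS:
                index.setdefault(term, set()).add(record_id)
    return index


def lookup(index, term):
    """Return record ids for a term; a trailing * truncates (prefix match)."""
    term = term.lower()
    if term.endswith("*"):
        prefix = term[:-1]
        hits = set()
        for indexed_term, ids in index.items():
            if indexed_term.startswith(prefix):
                hits |= ids
        return hits
    return set(index.get(term, set()))


index = build_inverted_index(LINEAR_FILE)

# Boolean operators expressed as set operations on the posting sets.
both = lookup(index, "index*") & lookup(index, "retrieval")      # AND
either = lookup(index, "library") | lookup(index, "searching")   # OR
without = lookup(index, "index*") - lookup(index, "retrieval")   # NOT
```

Here `index*` matches both "index" and "indexing", and `sorted(index)` gives the alphabetical arrangement typical of an inverted index. Stop words such as "and" never receive an entry.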

Information Storage and Retrieval (Korfhage)
Chapter 1 - Overview
- A person uses an information system in two major ways: to store information in anticipation of a future need, and to find information in response to a current need.
- An information system is composed of two major portions:
  - Ectosystem - consists of those system factors that are not under the control of the designer (i.e. user, funder, and server).
  - Endosystem - consists of those factors that the designer can specify and control (i.e. the media used to store the information, the devices used to process the information, the algorithms by which the devices work, and the data structures used to organize the information).
- Signal → Data → Information → Knowledge → Wisdom
- The concept of information has both personal and time-dependent components that are not present in the concept of data.
- Information has a higher level of organization imposed by its relationship to a specific information need.
- Knowledge builds upon information to form a large, coherent view of a portion of reality.
- Wisdom adds to this knowledge a broader view still, encompassing all of known reality, and governing the use of the information that has been obtained and the knowledge that has been developed.
Chapter 3 - Query Structures
- The matching process is complicated by the fact that the query and the documents may have quite different forms.
- Stemming - reduction of a word to its root form.
- Proximity Operators - within X words of another word.
- Boolean Query
  - Cons:
    - There is no good way to weight terms for significance.
    - Misstated queries - AND, OR, and NOT are hard for non-experts to understand.
    - Order of precedence among the operators.
    - The user is free to enter a very complex query.
    - The result set can be very small or very large.
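
Stemming, proximity operators, and the order-of-precedence problem noted above can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: `naive_stem` is a crude suffix stripper standing in for a real stemmer, `within` measures distance as the difference in word positions (so adjacent words are 1 apart), and the sets `a`, `b`, and `c` are made-up result sets.

```python
import re


def naive_stem(word):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def positions(text, term):
    """Word positions in text whose stem equals the stem of term."""
    words = re.findall(r"[a-z]+", text.lower())
    target = naive_stem(term.lower())
    return [i for i, w in enumerate(words) if naive_stem(w) == target]


def within(text, term_a, term_b, max_distance):
    """True if the two terms occur at most max_distance word positions apart."""
    return any(
        abs(i - j) <= max_distance
        for i in positions(text, term_a)
        for j in positions(text, term_b)
    )


text = "Searching the indexes returns matching documents"
near = within(text, "index", "document", 3)   # stems match "indexes", "documents"
far = within(text, "index", "document", 2)

# Order of precedence: the same three result sets, grouped two ways.
a, b, c = {1, 2}, {2, 3}, {3, 4}
or_first = (a | b) & c    # (A OR B) AND C
and_first = a | (b & c)   # A OR (B AND C)
```

The two groupings return different result sets, which is why an unparenthesized query such as A OR B AND C depends on the precedence rules of the particular retrieval system.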