.\" "@(#)invert.1 4.4 9/10/85";
.TH INVERT 1 "28 July 1983"
.UC 4
.SH NAME
invert, lookup \(em create and access an inverted index
.SH SYNOPSIS
.B invert
[option ... ] file ...
.ns
.PP
.B lookup
[option ... ]
.SH DESCRIPTION
.I Invert
creates an inverted index to one or more files.
.I Lookup
retrieves records from files for which an inverted index exists.
The inverted indices are intended for use with
.IR bib (1).
.PP
.I Invert
creates one inverted index to all of its input files.
The index must be stored in the current directory and may not be moved.
Input files may be absolute path names or paths relative to the current
directory.
Each input file is viewed as a set of records;
each record consists of non-blank lines;
records are separated by blank lines.
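.PP
As a rough illustration of the record format described above,
the following Python sketch groups input lines into records.
It is not taken from the
.I invert
source: the name split_records is hypothetical,
and treating whitespace-only lines as blank is an assumption.
.nf
```python
def split_records(lines):
    """Group lines into records: runs of non-blank lines separated by blank lines."""
    records = []
    current = []
    for line in lines:
        if line.strip():
            # A non-blank line belongs to the record being collected.
            current.append(line)
        elif current:
            # A blank line closes the current record; extra blank lines are skipped.
            records.append(current)
            current = []
    if current:
        # The last record need not be followed by a blank line.
        records.append(current)
    return records


if __name__ == "__main__":
    sample = ["%A Budd", "%T Bib", "", "", "%A Levin", "%T Invert"]
    for record in split_records(sample):
        print(record)
```
.fi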
.PP
.I Lookup
retrieves records based on its input
.RI ( stdin ).
Each line of input is a retrieval request.
All records that contain all of the keywords in the retrieval request
are sent to
.IR stdout .
If there are no matching references, ``No references found.'' is sent to
.IR stdout .
.I Lookup
first searches in the user's private index (default INDEX)
and then, if no references are found,
in the system index (/usr/dict/papers/INDEX).
The system index was produced using
.I invert
with the default options;
in general, the user is advised to use the defaults.
.PP
A keyword is a sequence of non-whitespace characters
with non-alphanumeric characters removed.
Keywords must be at least two characters long and are truncated
to a maximum length (default 6).
Common words (see the \-c option) are ignored.
Lines that begin with certain field markers (see the \-% option)
are skipped when collecting keywords.
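.PP
The keyword rules above can be sketched in Python as follows.
This is an illustration under stated assumptions, not the actual
.I invert
code: the name extract_keywords and its parameters are hypothetical,
the defaults mirror the documented option defaults,
and details the manual does not specify (no case folding,
common words tested before truncation, duplicate keys dropped,
the first max_keys keys kept) are guesses.
.nf
```python
def extract_keywords(lines, common=frozenset(), max_len=6,
                     ignore_fields="CNOPVX", max_keys=100):
    """Collect keys from one record, following the documented keyword rules."""
    keys = []
    for line in lines:
        # Skip lines that begin with %x where x is in ignore_fields.
        if len(line) >= 2 and line[0] == "%" and line[1] in ignore_fields:
            continue
        for token in line.split():
            # Remove non-alphanumeric characters (ASCII only, an assumption).
            word = "".join(c for c in token if c.isascii() and c.isalnum())
            # Keywords must be at least two characters; common words are ignored.
            if len(word) < 2 or word in common:
                continue
            key = word[:max_len]  # truncate to the maximum key length
            if key not in keys and len(keys) < max_keys:
                keys.append(key)
    return keys


if __name__ == "__main__":
    record = [
        "%A Timothy A. Budd",
        "%T the UNIX Bibliographic Database Facility",
        "%P 10-20",
    ]
    print(extract_keywords(record, common={"the"}))
```
.fi
With this sample record the %P line is skipped, the one-character
tokens are dropped, and the longer words are cut to six characters.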
.PP
The following options are available for
.IR invert :
.IP "\-c \fIfile\fP" 8m
.ns
.IP \-c\fIfile\fP
File contains common words, one per line.
Common words are not used as keys.
(Default /usr/new/lib/bmac/common.)
.IP "\-k \fIi\fP"
.ns
.IP \-k\fIi\fP
Maximum number of keys kept per record. (Default 100)
.IP "\-l \fIi\fP"
.ns
.IP \-l\fIi\fP
Maximum length of keys. (Default 6)
.IP "\-p \fIfile\fP"
.ns
.IP \-p\fIfile\fP
File is the name of the private index file (output of
.IR invert ).
(Default is INDEX.)
The index must be stored in the current directory.
(Be careful with the second form:
the shell does not expand a file name attached directly to the option.
For example, \-p~/index won't work; use \-p\ ~/index.)
.IP \-s
Silent.
Suppress statistics.
.IP \-%\fIstr\fP
Ignore lines that begin with %x
where x is in
.IR str .
(Default is CNOPVX.
See
.IR bib (1)
for an explanation of field names.)
.PP
.I Lookup
has only the options
.BR c ,
.BR l ,
and
.B p
with the same meanings as in
.IR bib (1).
In particular, the
.B p
option can be followed by a list of comma-separated index files.
These are searched in order from left to right until at least one reference
is found.
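.PP
The search order described above can be sketched in Python.
The sketch assumes each index is an in-memory mapping from key to a set
of record identifiers, which is a simplification chosen for illustration
and not the on-disk format written by
.IR invert ;
the names find_refs and lookup_refs are hypothetical.
.nf
```python
def find_refs(index, keys):
    """Return the ids of records that contain every key in the request."""
    result = None
    for key in keys:
        postings = index.get(key, set())
        # Intersect posting sets so only records matching all keys remain.
        result = set(postings) if result is None else result & postings
    return result if result is not None else set()


def lookup_refs(indices, keys):
    """Search indices left to right; stop at the first one with a match."""
    for index in indices:
        refs = find_refs(index, keys)
        if refs:
            return refs
    return set()  # the caller would report that no references were found


if __name__ == "__main__":
    private = {"budd": {1}, "bib": {1, 2}}
    system = {"levin": {7}, "bib": {7, 8}}
    print(lookup_refs([private, system], ["bib"]))
    print(lookup_refs([private, system], ["levin", "bib"]))
```
.fi
In this example the first request is answered from the private index alone,
while the second falls through to the system index because the private
index holds no record with both keys.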
.SH FILES
INDEX inverted index
.br
/usr/tmp/invertxxxxxx scratch file for invert
.br
/usr/new/lib/bmac/common default list of common words
.br
/usr/dict/papers/INDEX default system index
.SH SEE ALSO
\fIA UNIX Bibliographic Database Facility\fP,
Timothy A. Budd and Gary M. Levin,
University of Arizona Technical Report 82-1, 1982.
.br
bib(1)
.SH DIAGNOSTICS
Messages indicating trouble accessing files are sent to
.IR stderr .
There is an explicit message on
.I stdout
from
.I lookup
if no references are found.
.LP
.I Invert
produces a one-line message of the form
\*(oq%D\ documents\ \ \ %D distinct\ keys\ \ %D\ key\ occurrences\*(cq.
This can be suppressed with the \-s option.
.LP
The message \*(oqlocate: first key (%s) matched too many refs\*(cq
indicates that the first key matched more references than could be stored
in memory.
The simple solution is to use a less frequently occurring key as the first
key in the citation.
.SH BUGS
No attempt is made to check the compatibility between an index
and the files indexed.
The user must create a new index whenever
the files that are indexed are modified.
.LP
Attempting to invert a file containing unprintable characters can
cause chaos.