INVERT(1)                 UNIX Programmer's Manual                 INVERT(1)

NAME
invert, lookup - create and access an inverted index

SYNOPSIS
invert [option ... ] file ...

lookup [option ... ]

DESCRIPTION
Invert creates an inverted index to one or more files.
Lookup retrieves records from files for which an inverted
index exists. The inverted indices are intended for use
with bib(1).

Invert creates one inverted index to all of its input files.
The index must be stored in the current directory and may
not be moved. Input files may be absolute path names or
paths relative to the current directory. Each input file is
viewed as a set of records; each record consists of non-blank
lines; records are separated by blank lines.
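
The record structure described above can be sketched in Python.
This is only an illustration, not the code used by invert; the
function name split_records is hypothetical, and treating a
whitespace-only line as blank is an assumption.

```python
def split_records(text):
    """Split text into records: runs of non-blank lines separated by blank lines."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip():
            # Non-blank line: it belongs to the record being collected.
            current.append(line)
        elif current:
            # Blank line: it closes the current record, if there is one.
            records.append(current)
            current = []
    if current:
        # The last record need not be followed by a blank line.
        records.append(current)
    return records


sample = "%A Timothy A. Budd\n%T First title\n\n\n%A Gary M. Levin\n%T Second title\n"
for record in split_records(sample):
    print(record)
```

Several consecutive blank lines count as a single separator in
this sketch, so no empty records are produced.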

Lookup retrieves records based on its input (stdin). Each
line of input is a retrieval request. All records that contain
all of the keywords in the retrieval request are sent
to stdout. If there are no matching references, ``No references
found.'' is sent to stdout. Lookup first searches in
the user's private index (default INDEX) and then, if no
references are found, in the system index
(/usr/dict/papers/INDEX). The system index was produced
using invert with the default options; in general, the user
is advised to use the defaults.
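
A minimal Python sketch of the retrieval rule above, assuming
each index is held in memory as a list of (record, key set)
pairs. The names find_all and lookup_request and the in-memory
layout are hypothetical; the on-disk index format is not
described here.

```python
NOT_FOUND = "No references found."


def find_all(request_keys, index):
    """Return every record whose key set contains all requested keys."""
    wanted = set(request_keys)
    return [record for record, keys in index if wanted <= keys]


def lookup_request(request_keys, private_index, system_index):
    """Search the private index first; use the system index only on a miss."""
    hits = find_all(request_keys, private_index)
    if not hits:
        hits = find_all(request_keys, system_index)
    return hits if hits else [NOT_FOUND]


# Hypothetical sample data, for illustration only.
private = [("private record", {"budd", "levin", "unix"})]
system = [("system record", {"knuth", "sortin"})]

print(lookup_request(["budd", "unix"], private, system))
print(lookup_request(["knuth"], private, system))
print(lookup_request(["budd", "knuth"], private, system))
```

The subset test (wanted <= keys) expresses "contains all of the
keywords": a record matching only some of the keys is not
returned.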

A keyword is a sequence of non-white-space characters with
non-alphanumeric characters removed. Keywords must be at
least two characters long and are truncated to a maximum
length (default 6). Common words are ignored (see the -c
option). Some lines of input are ignored for the purpose of
collecting keywords (see the -% option).
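
The keyword rules above can be sketched as follows. This is an
illustration under stated assumptions, not the invert source:
the function name keywords is hypothetical, the length and
common-word checks are assumed to happen before truncation, and
no case folding is done because none is mentioned here.

```python
def keywords(line, common=frozenset(), max_len=6):
    """Extract keys from one line of text, following the rules in the text."""
    keys = []
    for token in line.split():
        # Remove every non-alphanumeric character from the token.
        key = "".join(ch for ch in token if ch.isalnum())
        if len(key) < 2:
            continue  # keywords must be at least two characters
        if key in common:
            continue  # common words are not used as keys (assumed: checked before truncation)
        keys.append(key[:max_len])  # truncate to the maximum key length
    return keys


print(keywords("the x-ray Bibliographic Database, a", common={"the"}))
```

Passing a different max_len corresponds to the -l option, and a
different common set to the -c option.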

The following options are available for invert:

-c file
-cfile    File contains common words, one per line. Common
          words are not used as keys. (Default
          /usr/new/lib/bmac/common.)

-k i
-ki       Maximum number of keys kept per record. (Default
          100)

-l i
-li       Maximum length of keys. (Default 6)

-p file
-pfile    File is the name of the private index file (output
          of invert). (Default is INDEX.) The index must be
          stored in the current directory. (Be careful of the
          second form. The shell will not know to expand the
          file name. E.g. -p~/index won't work; use
          -p ~/index.)

-s        Silent. Suppress statistics.

-%str     Ignore lines that begin with %x where x is in str.
          (Default is CNOPVX. See bib(1) for explanation of
          field names.)
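
The effect of the -% option can be sketched in Python. The
function name indexable_lines is hypothetical and the sample
record is invented for illustration; only the rule itself (skip
lines that begin with %x where x is in str) comes from the text
above.

```python
def indexable_lines(record_lines, ignore="CNOPVX"):
    """Keep only the lines that should contribute keywords."""
    kept = []
    for line in record_lines:
        # A line is skipped when it starts with '%' followed by a letter in ignore.
        if len(line) >= 2 and line[0] == "%" and line[1] in ignore:
            continue
        kept.append(line)
    return kept


record = [
    "%A Timothy A. Budd",
    "%T A sample title",
    "%C Tucson",
    "%X A sample abstract",
]
print(indexable_lines(record))
```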

Lookup has only the options c, l, and p with the same meanings
as bib. In particular, the p option can be followed by
a list of comma-separated index files. These are searched
in order from left to right until at least one reference is
found.
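
The left-to-right search over a comma-separated index list might
look like the sketch below. The names search_indices and probe
are hypothetical; the caller supplies the function that searches
a single index, since that step is not described here.

```python
def search_indices(p_argument, search):
    """Try each comma-separated index name in order; stop at the first hit."""
    for name in p_argument.split(","):
        hits = search(name)
        if hits:
            # At least one reference was found, so later indices are not tried.
            return name, hits
    return None, []


# Stand-in for a real index search, recording which names were consulted.
consulted = []


def probe(name):
    consulted.append(name)
    return ["a reference"] if name == "second" else []


print(search_indices("first,second,third", probe))
print(consulted)
```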

FILES
INDEX                         inverted index
/usr/tmp/invertxxxxxx         scratch file for invert
/usr/new/lib/bmac/common      default list of common words
/usr/dict/papers/INDEX        default system index

SEE ALSO
A UNIX Bibliographic Database Facility, Timothy A. Budd and
Gary M. Levin, University of Arizona Technical Report 82-1,
1982.
bib(1)

DIAGNOSTICS
Messages indicating trouble accessing files are sent on
stderr. There is an explicit message on stdout from lookup
if no references are found.

Invert produces a one-line message of the form,
%D documents %D distinct keys %D key occurrences. This
can be suppressed with the -s option.

The message ``locate: first key (%s) matched too many refs''
indicates that the first key matched more references than
could be stored in memory. The simple solution is to use a
less frequently occurring key as the first key in the citation.

BUGS
No attempt is made to check the compatibility between an
index and the files indexed. The user must create a new
index whenever the files that are indexed are modified.

Attempting to invert a file containing unprintable characters
can cause chaos.

Printed 8/22/89                    28 July 1983