|
|
1.1 root 1: .de L \" literal font
2: .ft 5
3: .it 1 }N
4: .if !\\$1 \&\\$1 \\$2 \\$3 \\$4 \\$5 \\$6
5: ..
6: .de LR
7: .}S 5 1 \& "\\$1" "\\$2" "\\$3" "\\$4" "\\$5" "\\$6"
8: ..
9: .de RL
10: .}S 1 5 \& "\\$1" "\\$2" "\\$3" "\\$4" "\\$5" "\\$6"
11: ..
12: .de EX \" start example
13: .ta 1i 2i 3i 4i 5i 6i
14: .PP
15: .RS
16: .PD 0
17: .ft 5
18: .nf
19: ..
20: .de EE \" end example
21: .fi
22: .ft
23: .PD
24: .RE
25: .PP
26: ..
27: .TH HASH 3
28: .SH NAME
29: hash \- hash table support
30: .SH SYNOPSIS
31: .L "#include <hash.h>"
32: .SH DESCRIPTION
33: The
34: .I hash
35: routines manipulate collections of dynamic, scoped hash tables.
36: .PP
37: A hash table provides an association between a
38: .I key
39: and its
40: .IR value .
41: A
42: .I key
43: is a sequence of
44: .L char
45: elements and a
46: .I value
47: is a user supplied pointer to the value.
48: Each hash table has a dynamic number of slots,
49: each pointing to the head of a forward linked
50: .IR "collision chain" .
51: .PP
52: Hashing occurs as follows:
53: a
54: .I "hash function"
55: takes a
56: .I key
57: as an argument and returns a non-negative index.
58: The index modulo the table size produces a
59: slot that points to a
60: .IR "collision chain" .
61: The collision chain is sequentially searched until a match is found for
62: .IR key .
63: The hash tables are automatically resized as new entries are added.
64: .PP
65: Each hash table has one key type.
66: The default key type is a pointer to a null-terminated string.
67: The alternate key type is a pointer to a fixed length byte buffer,
68: declared for a given table by the
69: .L hashalloc()
70: function described below.
71: .PP
72: Hash table information is partitioned into two parts for efficient scoping.
73: The
74: .I root
75: part contains fixed information that is shared among a set of related
76: hash tables.
77: The remaining information is maintained on a per-table basis.
78: .PP
79: These basic types are defined in the header file
80: .B hash.h
81: (alternate names are listed in parenthesis):
82: .TP
83: .L "HASHTABLE (Hashtab_t)"
84: The per-table information.
85: The readonly public elements are:
86: .RS
87: .TP
88: .L "int buckets"
89: The number of table entries.
90: .TP
91: .L "char* name"
92: The hash table name.
93: .TP
94: .L "root"
95: The root information.
96: The public elements are:
97: .RS
98: .TP
99: .L "int root->accesses"
100: The number of lookups.
101: .TP
102: .L "int root->collisions"
103: The number of lookup collisions.
104: .RE
105: .TP
106: .L "HASHTABLE* scope"
107: The table that this scope covers,
108: .L NULL
109: if the table is not a scope.
110: .TP
111: .L "int size"
112: The current hash table size.
113: .RE
114: .TP
115: .L "HASHBUCKET (Hashbin_t)"
116: A collision chain hash bucket.
117: The public structure elements are:
118: .RS
119: .TP
120: .L "char* hashname(HASHBUCKET*)"
121: Returns a pointer to the hash bucket key given the bucket pointer.
122: .TP
123: .L "char* value"
124: The value associated with the key.
125: .RE
126: .TP
127: .L "HASHHEADER (Hashhdr_t)"
128: The hash bucket header that must be the first element in all user defined
129: buckets.
130: .L HASH_VALUE
131: may not be used with user defined buckets.
132: .TP
133: .L "HASHPOSITION (Hashpos_t)"
134: Stores the hash table position for
135: .LR hashscan .
136: The public elements are:
137: .RS
138: .TP
139: .L "HASHBUCKET* bucket"
140: The current hash bucket pointer.
141: .RE
142: .PP
143: The routines are:
144: .TP
145: .L "HASHTABLE* hashalloc(HASHTABLE* ref, int op, ...)"
146: Creates a new hash table and returns a pointer to the table.
147: .IR malloc (3)
148: is used to allocate space for the table.
149: .L NULL
150: is returned if the table cannot be created.
151: .L ref
152: is a pointer to a reference hash table that provides
153: default values for unspecified information.
154: The new hash table and
155: .L ref
156: share the same root information.
157: If
158: .L ref
159: is
160: .L NULL
161: then new root information is created for the new table.
162: The remaining arguments appear in
163: .I op-arg
164: pairs, followed by a final
165: .L 0
166: argument.
167: The
168: .I op-arg
169: pairs are:
170: .RS
171: .TP
172: .L "HASH_alloc, (char(*)()) alloc"
173: .L alloc
174: is a function that is called to process
175: .L HASHBUCKET
176: .L value
177: assignments.
178: The single argument is
179: .L "char* value"
180: and the processed
181: .L char*
182: value is returned.
183: .TP
184: .L "HASH_clear, int flags"
185: .L flags
186: are the
187: .L ref
188: flags to be cleared in the new hash table.
189: See
190: .L HASH_set
191: below.
192: .TP
193: .L "HASH_compare, (int(*)()) compare"
194: Specifies an alternate
195: .I key
196: comparison function.
197: The arguments and return value for
198: .L compare
199: are the same as for
200: .IR strncmp (3)
201: if
202: .L HASH_namesize
203: is specified and
204: .IR strcmp (3)
205: otherwise.
206: The first argument is the
207: .I key
208: from the current hash bucket on the
209: .I "collision chain"
210: and the second argument is the user supplied
211: .IR key .
212: .TP
213: .L "HASH_free, (int(*)()) free"
214: .L free
215: is a function that is called when a hash bucket is freed.
216: If
217: .L HASH_BUCKET
218: was set in
219: .L hashalloc
220: then the hash bucket pointer is passed, otherwise the bucket
221: .L value
222: pointer is passed.
223: .TP
224: .L "HASH_hash, (int(*)()) hash"
225: Specifies an alternate
226: .I key
227: hash function.
228: A
229: .L char*
230: key argument (and, if
231: .L HASH_namesize
232: is specified, an
233: .L int
234: key size argument) is passed to
235: .LR hash .
236: The return value must be a non-negative
237: .LR int .
238: .TP
239: .L "HASH_meanchain, int meanchain"
240: Specifies the mean collision chain length.
241: The hash table is automatically resized when this value is exceeded.
242: The default mean chain length is 2.
243: .TP
244: .L "HASH_name, char* name"
245: Associates
246: .L name
247: with the hash table.
248: Used by
249: .LR hashdump) .
250: .TP
251: .L "HASH_namesize, int namesize"
252: The fixed size in bytes for
253: .I keys
254: in the table.
255: If
256: .L namesize
257: is 0 (the default) then the
258: .I keys
259: are interpreted as null-terminated strings.
260: .TP
261: .L "HASH_set, int flags"
262: Changes the hash table flags by
263: .IR or ing
264: in
265: .LR flags .
266: The flags, which may be
267: .IR or ed
268: together, are:
269: .RS
270: .TP
271: .L HASH_ALLOCATE
272: Keys for new hash table entries are to be copied to data areas obtained from
273: .IR malloc (3).
274: .TP
275: .L HASH_FIXED
276: Fixes the hash table size, disabling any automatic table resizing.
277: .TP
278: .L HASH_SCOPE
279: The new hash table is a scope that is to be pushed on top of
280: .LR ref .
281: .L ref
282: must be
283: .RL non- NULL .
284: .RE
285: .RE
286: .TP
287: .L "HASHTABLE* hashfree(HASHTABLE* tab)"
288: The hash table
289: .L tab
290: is freed.
291: The scope covered table pointer is returned,
292: .L NULL
293: if
294: .L tab
295: is not a scope.
296: .TP
297: .L "char* hashlook(HASHTABLE* tab, char* name, int flags, char* value)"
298: Operates on the key
299: .L name
300: in the hash table
301: .L tab
302: according to
303: .L flags
304: and
305: .LR value .
306: A
307: .L HASHBUCKET
308: pointer is returned unless otherwise noted.
309: There are three basic lookup operations:
310: .RS
311: .TP
312: .L HASH_CREATE
313: .L name
314: is entered into the top level scope if it does not already exist.
315: If
316: .L name
317: also appears in a lower scope and
318: .L HASH_ALLOC
319: is set for the table then the new bucket will share the
320: .L name
321: field value with the lower scope.
322: .TP
323: .L HASH_DELETE
324: .L name
325: is deleted from the top level scope if it exists.
326: .L NULL
327: is returned.
328: .TP
329: .L HASH_LOOKUP
330: The scopes are searched in order from top to bottom for
331: .L name .
332: The bucket pointer for the first occurrence is returned.
333: .L NULL
334: is returned if
335: .L name
336: is not found.
337: .RE
338: The basic operations may be qualified by the following
339: (the qualifiers are restricted to the basic operations in
340: the parenthesized list):
341: .RS
342: .TP
343: .L "HASH_BUCKET (HASH_CREATE,HASH_DELETE,HASH_LOOKUP)"
344: .L name
345: is a pointer to a bucket that has already been entered into the table.
346: .TP
347: .L "HASH_FIXED (HASH_CREATE)"
348: .L value
349: is taken to be the size of the hash bucket to be created for
350: .L name
351: in the top level scope.
352: The minimum bucket size is silently restricted to
353: .LR sizeof(HASHHEADER) .
354: .TP
355: .L "HASH_INSTALL (HASH_CREATE)"
356: .L name
357: is a pointer to a bucket that has not been entered into the table.
358: .TP
359: .L "HASH_NOSCOPE (HASH_LOOKUP)"
360: The lookup is restricted to the top level scope.
361: .TP
362: .L "HASH_OPAQUE (HASH_CREATE,HASH_DELETE)"
363: Sets
364: .L (HASH_CREATE)
365: or clears
366: .L (HASH_DELETE)
367: the
368: .I opaque
369: property for the bucket.
370: An opaque bucket is not visible in lower scopes.
371: .TP
372: .L "HASH_SCOPE (HASH_CREATE,HASH_DELETE)"
373: All scopes are searched for the bucket.
374: If the bucket is not found for
375: .L HASH_CREATE
376: then a new bucket is created in the lowest scope.
377: .TP
378: .L "HASH_VALUE (HASH_CREATE,HASH_LOOKUP)"
379: For
380: .L HASH_CREATE
381: the bucket
382: .L value
383: field is set to
384: .L value
385: and the bucket
386: .L name
387: value is returned.
388: For
389: .L HASH_LOOKUP
390: the bucket
391: .L value
392: field is returned,
393: .L NULL
394: if the bucket is not found.
395: .RE
396: If
397: .L name
398: .L NULL
399: then the name from the most recent
400: .L hashlook()
401: is used, avoiding recomputation of some internal parameters.
402: .TP
403: .L "char* hashget(HASHTABLE* tab, char* name)"
404: Returns the value
405: associated with the key
406: .L name
407: in the hash table
408: .LR tab .
409: If
410: .L name
411: is
412: .L NULL
413: then the name from the most recent
414: .L hashget()
415: is used, avoiding recomputation of some internal parameters.
416: .L NULL
417: is returned if
418: .L name
419: is not in the table.
420: All scope covered tables are searched.
421: .TP
422: .L "HASHBUCKET* hashlast(HASHTABLE* tab)"
423: Returns a pointer to the most recent hash bucket for
424: the most recent hash table.
425: The value is set by
426: .LR hashlook() ,
427: .L hashscan()
428: and
429: .LR hashwalk() .
430: .TP
431: .L "char* hashput(HASHTABLE* tab, char* name, char* value)"
432: Set the value of the key
433: .L name
434: to
435: .L value
436: in the top level scope of the hash table
437: .LR tab .
438: .L name
439: is entered into the top level scope if necessary.
440: The (possibly re-allocated) key name pointer is returned
441: (see
442: .LR HASH_ALLOCATE ).
443: .TP
444: .L "int hashwalk(HASHTABLE* tab, int flags, (int(*)()) walker)"
445: The function
446: .L walker
447: is applied to each entry (not covered by a scope starting at
448: .LR tab )
449: in the hash table
450: .LR tab .
451: If
452: .L flags
453: is
454: .L HASH_NOSCOPE
455: then only the top level hash table is used, otherwise the walk includes
456: all scope covered tables.
457: .L walker
458: is called with
459: .L char*
460: .I key
461: as the first argument and
462: .L char*
463: .I value
464: as the second argument.
465: The walk terminates after the last entry or when
466: .L walker
467: returns a negative value.
468: The return value of the last call to
469: .L walker
470: is returned.
471: Only one walk may be active within a collection of scoped tables.
472: .TP
473: .L "void hashscan(HASHTABLE* tab, int flags, HASHPOSITION* pos)"
474: Initializes
475: .L pos
476: for a sequential scan on the hash table
477: .LR tab .
478: If
479: .L flags
480: is
481: .L HASH_NOSCOPE
482: then only the top level hash table is used, otherwise the scan includes
483: all scope covered tables.
484: Only one scan may be active within a collection of scoped tables.
485: .L hashdone()
486: must be called to terminate the scan.
487: .TP
488: .L "int hashnext(HASHPOSITION* pos)"
489: Generates the next element in the sequential scan set up by
490: .L hashscan()
491: on
492: .LR pos .
493: If no elements remain then
494: .L 0
495: is returned.
496: Otherwise
497: .L pos->bucket
498: points to the hash bucket of the next element and
499: .L 1
500: is returned.
501: .TP
502: .L "void hashdone(HASHPOSITION* pos)"
503: Completes a scan initiated by
504: .L hashscan()
505: on
506: .LR pos .
507: .TP
508: .L "int hashset(HASHTABLE* tab, int flags)"
509: Sets the flags for the hash table
510: .L tab
511: by
512: .IR or ing
513: in
514: .LR flags .
515: Only
516: .L HASH_ALLOCATE
517: and
518: .L HASH_FIXED
519: may be set.
520: .TP
521: .L "int hashclear(HASHTABLE* tab, int flags)"
522: Clears the flags for the hash table
523: .L tab
524: by masking out
525: .LR flags .
526: Only
527: .L HASH_ALLOCATE
528: and
529: .L HASH_FIXED
530: may be cleared.
531: .TP
532: .L "void hashdump(FILE* fp, HASHTABLE* tab, int flags)"
533: Dumps hash table accounting info to the output file stream
534: .LR fp .
535: If
536: .L tab
537: is
538: .L NULL
539: then all allocated hash tables are dumped, otherwise only information on
540: .L tab
541: is dumped.
542: If
543: .L flags
544: is
545: .L HASH_BUCKET
546: then the hash bucket
547: .I key-value
548: pairs for each collision chain are also dumped.
549: .TP
550: .L "void hashsize(HASHTABLE* tab, int size)"
551: Changes the size of the hash table
552: .L tab
553: to
554: .L size
555: where
556: .L size
557: must be a power of 2.
558: Explicit calls to this routine are not necessary as hash tables
559: are automatically resized.
560: .TP
561: .L "int strhash(char* name)"
562: Hashes the null terminated character string
563: .L name
564: using a linear congruent pseudo-random number generator algorithm
565: and returns a non-negative
566: .L int
567: hash value.
568: .TP
569: .L "int memhash(char* buf, int siz)"
570: Hashes the buffer
571: .L buf
572: of
573: .L siz
574: bytes using a linear congruent pseudo-random number generator algorithm
575: and returns a non-negative
576: .L int
577: hash value.
578: .TP
579: .L "long strsum(char* name, long sum)"
580: Returns a running 31-bit checksum of the string
581: .L name
582: where
583: .L sum
584: is
585: .L 0
586: on the first call and
587: the return value from a previous
588: .L memsum
589: or
590: .L strsum
591: call otherwise.
592: The checksum value is consistent across all implementations.
593: .TP
594: .L "long memsum(char* buf, int siz, long sum)"
595: Returns a running 31-bit checksum of buffer
596: .L buf
597: of
598: .L siz
599: bytes where
600: .L sum
601: is
602: .L 0
603: on the first call and
604: the return value from a previous
605: .L memsum
606: or
607: .L strsum
608: call otherwise.
609: The checksum value is consistent across all implementations.
610: .SH "SEE ALSO"
611: sum(1)
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.