1: .\" @(#)compress.1 6.5 (Berkeley) 5/11/86
2: .\"
3: .TH COMPRESS 1 "May 11, 1986"
4: .UC 6
5: .SH NAME
6: compress, uncompress, zcat \- compress and expand data
7: .SH SYNOPSIS
8: .PU
9: .ll +8
10: .B compress
11: [
12: .B \-f
13: ] [
14: .B \-v
15: ] [
16: .B \-c
17: ] [
18: .B \-b
19: .I bits
20: ] [
21: .I "name \&..."
22: ]
23: .ll -8
24: .br
25: .B uncompress
26: [
27: .B \-f
28: ] [
29: .B \-v
30: ] [
31: .B \-c
32: ] [
33: .I "name \&..."
34: ]
35: .br
36: .B zcat
37: [
38: .I "name \&..."
39: ]
40: .SH DESCRIPTION
41: .I Compress
42: reduces the size of the named files using adaptive Lempel-Ziv coding.
43: Whenever possible,
44: each file is replaced by one with the extension
45: .B "\&.Z,"
46: while keeping the same ownership, modes, and access and modification times.
47: If no files are specified, the standard input is compressed to the
48: standard output.
49: Compressed files can be restored to their original form using
50: .I uncompress
51: or
52: .IR zcat .
53: .PP
54: The
55: .B \-f
56: option will force compression of
57: .IR name ,
58: even if it does not actually shrink
59: or the corresponding
60: .IR name .Z
61: file already exists.
62: Except when run in the background under
63: .IR /bin/sh ,
64: if
65: .B \-f
66: is not given the user is prompted as to whether an existing
67: .IR name .Z
68: file should be overwritten.
69: .PP
70: The
71: .B \-c
72: (``cat'') option makes
73: .I compress/uncompress
74: write to the standard output; no files are changed.
75: The nondestructive behavior of
76: .I zcat
77: is identical to that of
78: .I uncompress
79: .BR \-c .
80: .PP
81: .I Compress
82: uses the modified Lempel-Ziv algorithm popularized in
83: "A Technique for High Performance Data Compression",
84: Terry A. Welch,
85: .I "IEEE Computer,"
86: vol. 17, no. 6 (June 1984), pp. 8-19.
87: Common substrings in the file are first replaced by 9-bit codes 257 and up.
88: When code 512 is reached, the algorithm switches to 10-bit codes and
89: continues to use more bits until the
90: limit specified by the
91: .B \-b
92: flag is reached (default 16).
93: .I Bits
94: must be between 9 and 16. The default can be changed in the source to allow
95: .I compress
96: to be run on a smaller machine.
97: .PP
98: After the
99: .I bits
100: limit is attained,
101: .I compress
102: periodically checks the compression ratio. If it is increasing,
103: .I compress
104: continues to use the existing code dictionary. However,
105: if the compression ratio decreases,
106: .I compress
107: discards the table of substrings and rebuilds it from scratch. This allows
108: the algorithm to adapt to the next "block" of the file.
109: .PP
110: Note that the
111: .B \-b
112: flag is omitted for
113: .IR uncompress ,
114: since the
115: .I bits
116: parameter specified during compression
117: is encoded within the output, along with
118: a magic number to ensure that neither decompression of random data nor
119: recompression of compressed data is attempted.
120: .PP
121: .ne 8
122: The amount of compression obtained depends on the size of the
123: input, the number of
124: .I bits
125: per code, and the distribution of common substrings.
126: Typically, text such as source code or English
127: is reduced by 50\-60%.
128: Compression is generally much better than that achieved by
129: Huffman coding (as used in
130: .IR pack ),
131: or adaptive Huffman coding
132: .RI ( compact ),
133: and takes less time to compute.
134: .PP
135: The
136: .B \-v
137: option prints
138: the percentage reduction of each file.
139: .PP
140: The exit status is 1 if an error occurs,
141: 2 if the last file was not compressed because it would have become larger,
142: and 0 otherwise.
143: .SH "DIAGNOSTICS"
144: Usage: compress [\-fvc] [\-b maxbits] [file ...]
145: .in +8
146: Invalid options were specified on the command line.
147: .in -8
148: Missing maxbits
149: .in +8
150: Maxbits must follow
151: .BR \-b \.
152: .in -8
153: .IR file :
154: not in compressed format
155: .in +8
156: The file specified to
157: .I uncompress
158: has not been compressed.
159: .in -8
160: .IR file :
161: compressed with
162: .I xx
163: bits, can only handle
164: .I yy
165: bits
166: .in +8
167: .I File
168: was compressed by a program that could deal with
169: more
170: .I bits
171: than the compress code on this machine.
172: Recompress the file with smaller
173: .IR bits \.
174: .in -8
175: .IR file :
176: already has .Z suffix -- no change
177: .in +8
178: The file is assumed to be already compressed.
179: Rename the file and try again.
180: .in -8
181: .IR file :
182: filename too long to tack on .Z
183: .in +8
184: The file cannot be compressed because its name is longer than
185: 12 characters.
186: Rename and try again.
187: This message does not occur on BSD systems.
188: .in -8
189: .I file
190: already exists; do you wish to overwrite (y or n)?
191: .in +8
192: Respond "y" if you want the output file to be replaced; "n" if not.
193: .in -8
194: uncompress: corrupt input
195: .in +8
196: A SIGSEGV violation was detected, which usually means that the input file is
197: corrupted.
198: .in -8
199: Compression:
200: .I "xx.xx%"
201: .in +8
202: Percentage of the input saved by compression.
203: (Relevant only for
204: .BR \-v \.)
205: .in -8
206: -- not a regular file: unchanged
207: .in +8
208: When the input file is not a regular file
209: (e.g. a directory), it is
210: left unaltered.
211: .in -8
212: -- has
213: .I xx
214: other links: unchanged
215: .in +8
216: The input file has links; it is left unchanged. See
217: .IR ln "(1)"
218: for more information.
219: .in -8
220: -- file unchanged
221: .in +8
222: No saving is achieved by
223: compression. The input is left unchanged.
224: .in -8
225: .SH "BUGS"
226: Although compressed files are compatible between machines with large memory,
227: .BR \-b 12
228: should be used for file transfer to architectures with
229: a small process data space (64KB or less, as exhibited by the DEC PDP
230: series, the Intel 80286, etc.).
231: .PP
232: .I compress
233: should be more flexible about the existence of the `.Z' suffix.