2: @(#)README 5.3 (Berkeley) 9/17/85
3:
4: Compress version 4.0 improvements over 3.0:
5: o compress() speedup (10-50%) by changing division hash to xor
6: o decompress() speedup (5-10%)
7: o Memory requirements reduced (3-30%)
8: o Stack requirements reduced to less than 4kb
9: o Removed 'Big+Fast' compress code (FBITS) because of compress speedup
10: o Portability mods for Z8000 and PC/XT (but not zeus 3.2)
11: o Default to 'quiet' mode
12: o Unification of 'force' flags
13: o Manual page overhaul
14: o Portability enhancement for M_XENIX
15: o Removed text on #else and #endif
16: o Added "-V" switch to print version and options
17: o Added #defines for SIGNED_COMPARE_SLOW
18: o Added Makefile and "usermem" program
19: o Removed all floating point computations
20: o New programs: [deleted]
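
The first item above replaces a division-based hash with an xor. A minimal sketch of the difference follows. The names hash_div and hash_xor and the constants HSIZE and HSHIFT are assumptions made for illustration; they are not quoted from compress.c.

```c
#include <stdint.h>

/* Hypothetical sizes for illustration only; not quoted from compress.c. */
#define HSIZE  69001u   /* number of hash slots (assumed) */
#define HSHIFT 8        /* shift applied to the new byte before the xor */

/* Division-style hash: combine prefix code and byte, then take a remainder. */
static uint32_t hash_div(uint32_t prefix, uint8_t c)
{
    uint32_t key = ((uint32_t)c << 16) + prefix;
    return key % HSIZE;              /* one integer division per probe */
}

/*
 * Xor-style hash: shift and xor only, no division.
 * With prefix codes below 65536 the result is also below 65536,
 * so it always fits in a table of HSIZE slots.
 */
static uint32_t hash_xor(uint32_t prefix, uint8_t c)
{
    return ((uint32_t)c << HSHIFT) ^ prefix;
}
```

On machines where integer division is slow, dropping the remainder from the inner loop is one plausible source of the quoted speedup.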
21:
22: The "usermem" script attempts to determine the maximum process size. Some
23: editing of the script may be necessary (see the comments). [It should work
24: fine on 4.3 bsd.] If you can't get it to work at all, just create file
25: "USERMEM" containing the maximum process size in decimal.
26:
27: The following preprocessor symbols control the compilation of "compress.c":
28:
29: o USERMEM               Maximum process memory on the system
30: o SACREDMEM             Amount to reserve for other processes
31: o SIGNED_COMPARE_SLOW   Unsigned compare instructions are faster
32: o NO_UCHAR              Don't use "unsigned char" types
33: o BITS                  Overrules default set by USERMEM-SACREDMEM
34: o vax                   Generate inline assembler
35: o interdata             Defines SIGNED_COMPARE_SLOW
36: o M_XENIX               Makes arrays < 65536 bytes each
37: o pdp11                 BITS=12, NO_UCHAR
38: o z8000                 BITS=12
39: o pcxt                  BITS=12
40: o BSD4_2                Allow long filenames ( > 14 characters) &
41:                         Call setlinebuf(stderr)
42:
43: The difference "usermem-sacredmem" determines the maximum BITS that can be
44: specified with the "-b" flag.
45:
46: memory: at least        BITS
47: ------  -- -----        ----
48:      433,484             16
49:      229,600             15
50:      127,536             14
51:       73,464             13
52:            0             12
53:
54: The default is BITS=16.
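
The table above can be read as a simple threshold lookup. The sketch below is illustration only: the function name max_bits_for is hypothetical, and the README describes this choice as being made at compile time from the preprocessor symbols, not by a run-time call.

```c
/*
 * Thresholds copied from the table above.
 * avail is "usermem - sacredmem" in bytes.
 * The function name is hypothetical.
 */
static int max_bits_for(long avail)
{
    if (avail >= 433484L) return 16;
    if (avail >= 229600L) return 15;
    if (avail >= 127536L) return 14;
    if (avail >= 73464L)  return 13;
    return 12;                      /* anything smaller still gets 12 */
}
```

For example, a machine with 300,000 bytes available falls in the 229,600 row and is limited to BITS=15.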
55:
56: The maximum bits can be overruled by specifying "-DBITS=bits" at
57: compilation time.
58:
59: WARNING: files compressed on a large machine with more bits than allowed by
60: a version of compress on a smaller machine cannot be decompressed! Use the
61: "-b12" flag to generate a file on a large machine that can be uncompressed
62: on a 16-bit machine.
63:
64: The output of compress 4.0 is fully compatible with that of compress 3.0.
65: In other words, the output of compress 4.0 may be fed into uncompress 3.0 or
66: the output of compress 3.0 may be fed into uncompress 4.0.
67:
68: The output of compress 4.0 is not compatible with that of
69: compress 2.0. However, compress 4.0 still accepts the output of
70: compress 2.0. To generate output that is compatible with compress
71: 2.0, use the undocumented "-C" flag.
72:
73: -from mod.sources, submitted by vax135!petsd!joe (Joe Orost), 8/1/85
74: --------------------------------
75:
76: Enclosed is compress version 3.0 with the following changes:
77:
78: 1. "Block" compression is performed. After the BITS run out, the
79: compression ratio is checked every so often. If it is decreasing,
80: the table is cleared and a new set of substrings is generated.
81:
82: This makes the output of compress 3.0 not compatible with that of
83: compress 2.0. However, compress 3.0 still accepts the output of
84: compress 2.0. To generate output that is compatible with compress
85: 2.0, use the undocumented "-C" flag.
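
The ratio check described in item 1 can be sketched roughly as follows. This is an assumption-laden illustration, not the code in compress.c: the struct, its field names, and the function should_clear_table are all hypothetical, and the scaling by 256 is just one way to keep the arithmetic in integers.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for illustration; names are not from compress.c. */
struct ratio_state {
    long long in_bytes;    /* input bytes consumed so far */
    long long out_bytes;   /* output bytes produced so far */
    long long best_ratio;  /* best scaled ratio seen since the last clear */
};

/*
 * Meant to be called "every so often" once the code table is full.
 * Returns true when the compression ratio has decreased, which is the
 * signal to clear the table and start building substrings again.
 */
static bool should_clear_table(struct ratio_state *s)
{
    long long ratio;

    if (s->out_bytes <= 0)
        return false;                       /* nothing written yet */

    ratio = (s->in_bytes * 256LL) / s->out_bytes;   /* scaled in/out ratio */
    if (ratio >= s->best_ratio) {
        s->best_ratio = ratio;              /* still holding or improving */
        return false;
    }
    s->best_ratio = 0;                      /* restart tracking after a clear */
    return true;
}
```

A caller would emit whatever "clear" marker the file format uses and reset its table when this returns true; those details are outside this sketch.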
86:
87: 2. A quiet "-q" flag has been added for use by the news system.
88:
89: 3. The character chaining has been deleted and the program now uses
90: hashing. This improves the speed of the program, especially
91: during decompression. Other speed improvements have been made,
92: such as using putc() instead of fwrite().
93:
94: 4. A large table is used on large machines when a relatively small
95: number of bits is specified. This saves much time when compressing
96: for a 16-bit machine on a 32-bit virtual machine. Note that the
97: speed improvement only occurs when the input file is > 30000
98: characters, and the -b BITS is less than or equal to the cutoff
99: described below.
100:
101: Most of these changes were made by James A. Woods (ames!jaw). Thank you
102: James!
103:
104: To compile compress:
105:
106: cc -O -DUSERMEM=usermem -o compress compress.c
107:
108: Where "usermem" is the amount of physical user memory available (in bytes).
109: If any physical memory is to be reserved for other processes, put in
110: "-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.
111:
112: The difference "usermem-sacredmem" determines the maximum BITS that can be
113: specified, and the cutoff bits where the large+fast table is used.
114:
115: memory: at least        BITS    cutoff
116: ------  -- -----        ----    ------
117:    4,718,592             16        13
118:    2,621,440             16        12
119:    1,572,864             16        11
120:    1,048,576             16        10
121:      631,808             16        --
122:      329,728             15        --
123:      178,176             14        --
124:       99,328             13        --
125:            0             12        --
126:
127: The default memory size is 750,000 which gives a maximum BITS=16 and no
128: large+fast table.
129:
130: The maximum bits can be overruled by specifying "-DBITS=bits" at
131: compilation time.
132:
133: If your machine doesn't support unsigned characters, define "NO_UCHAR"
134: when compiling.
135:
136: If your machine has "int" as 16-bits, define "SHORT_INT" when compiling.
137:
138: After compilation, move "compress" to a standard executable location, such
139: as /usr/local. Then:
140: cd /usr/local
141: ln compress uncompress
142: ln compress zcat
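
The links above suggest that one binary behaves differently depending on the name it was run under. A minimal sketch of that kind of dispatch, assuming the decision is based on the last component of argv[0]; the enum and the helper mode_from_name are hypothetical names, not taken from compress.c.

```c
#include <string.h>

enum mode { MODE_COMPRESS, MODE_UNCOMPRESS, MODE_ZCAT };

/* Hypothetical helper: pick a mode from the name the program was run as. */
static enum mode mode_from_name(const char *argv0)
{
    const char *base = strrchr(argv0, '/');

    base = base ? base + 1 : argv0;     /* strip any leading directory */
    if (strcmp(base, "uncompress") == 0)
        return MODE_UNCOMPRESS;
    if (strcmp(base, "zcat") == 0)
        return MODE_ZCAT;
    return MODE_COMPRESS;               /* any other name compresses */
}
```

A main() would typically call mode_from_name(argv[0]) before parsing flags, so that hard links share one copy of the program on disk.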
143:
144: On machines that have a fixed stack size (such as Perkin-Elmer), set the
145: stack to at least 12kb. ("setstack compress 12" on Perkin-Elmer).
146:
147: Next, install the manual (compress.l).
148: cp compress.l /usr/man/manl
149: cd /usr/man/manl
150: ln compress.l uncompress.l
151: ln compress.l zcat.l
152:
153: - or -
154:
155: cp compress.l /usr/man/man1/compress.1
156: cd /usr/man/man1
157: ln compress.1 uncompress.1
158: ln compress.1 zcat.1
159:
160: regards,
161: petsd!joe
162:
163: Here is a note from the net:
164:
165: >From hplabs!pesnta!amd!turtlevax!ken Sat Jan 5 03:35:20 1985
166: Path: ames!hplabs!pesnta!amd!turtlevax!ken
167: From: [email protected] (Ken Turkowski)
168: Newsgroups: net.sources
169: Subject: Re: Compress release 3.0 : sample Makefile
170: Organization: CADLINC, Inc. @ Menlo Park, CA
171:
172: In the compress 3.0 source recently posted to mod.sources, there is a
173: #define variable which can be set for optimum performance on a machine
174: with a large amount of memory. A program (usermem) to calculate the
175: useable amount of physical user memory is enclosed, as well as a sample
176: 4.2bsd Vax Makefile for compress.
177:
178: Here is the README file from the previous version of compress (2.0):
179:
180: >Enclosed is compress.c version 2.0 with the following bugs fixed:
181: >
182: >1. The packed files produced by compress are different on different
183: > machines and dependent on the vax sysgen option.
184: > The bug was in the different byte/bit ordering on the
185: > various machines. This has been fixed.
186: >
187: > This version is NOT compatible with the original vax posting
188: > unless the '-DCOMPATIBLE' option is specified to the C
189: > compiler. The original posting has a bug which I fixed,
190: > causing incompatible files. I recommend you NOT to use this
191: > option unless you already have a lot of packed files from
192: > the original posting by thomas.
193: >2. The exit status is not well defined (on some machines) causing the
194: > scripts to fail.
195: > The exit status is now 0,1 or 2 and is documented in
196: > compress.l.
197: >3. The function getopt() is not available in all C libraries.
198: > The function getopt() is no longer referenced by the
199: > program.
200: >4. Error status is not being checked on the fwrite() and fflush() calls.
201: > Fixed.
202: >
203: >The following enhancements have been made:
204: >
205: >1. Added facilities of "compact" into the compress program. "Pack",
206: > "Unpack", and "Pcat" are no longer required (no longer supplied).
207: >2. Installed work around for C compiler bug with "-O".
208: >3. Added a magic number header (\037\235). Put the bits specified
209: > in the file.
210: >4. Added "-f" flag to force overwrite of output file.
211: >5. Added "-c" flag and "zcat" program. 'ln compress zcat' after you
212: > compile.
213: >6. The 'uncompress' script has been deleted; simply
214: > 'ln compress uncompress' after you compile and it will work.
215: >7. Removed extra bit masking for machines that support unsigned
216: > characters. If your machine doesn't support unsigned characters,
217: > define "NO_UCHAR" when compiling.
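
A reader of such files could check the magic number from enhancement 3 along these lines. The magic bytes are the ones quoted above; everything else is an assumption for illustration: the function name header_bits is hypothetical, and the text only says the bits are "put in the file", so reading them from the low 5 bits of the third byte is an assumed layout.

```c
#include <stddef.h>

/* Magic bytes from the text above: \037\235 (octal). */
#define MAGIC_1 0x1f
#define MAGIC_2 0x9d

/*
 * Returns the BITS value stored in the header, or -1 if the buffer does
 * not start with the magic number.
 * ASSUMPTION: the low 5 bits of the third byte hold BITS.
 */
static int header_bits(const unsigned char *buf, size_t len)
{
    if (len < 3 || buf[0] != MAGIC_1 || buf[1] != MAGIC_2)
        return -1;
    return buf[2] & 0x1f;
}
```

A decompressor built with a smaller maximum could compare the returned value against its own limit and refuse the file, which is the situation the "-b12" warning earlier in this file is about.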
218: >
219: >Compile "compress.c" with "-O -o compress" flags. Move "compress" to a
220: >standard executable location, such as /usr/local. Then:
221: > cd /usr/local
222: > ln compress uncompress
223: > ln compress zcat
224: >
225: >On machines that have a fixed stack size (such as Perkin-Elmer), set the
226: >stack to at least 12kb. ("setstack compress 12" on Perkin-Elmer).
227: >
228: >Next, install the manual (compress.l).
229: > cp compress.l /usr/man/manl - or -
230: > cp compress.l /usr/man/man1/compress.1
231: >
232: >Here is the README that I sent with my first posting:
233: >
234: >>Enclosed is a modified version of compress.c, along with scripts to make it
235: >>run identically to pack(1), unpack(1), and pcat(1). Here is what I
236: >>(petsd!joe) and a colleague (petsd!peora!srd) did:
237: >>
238: >>1. Removed VAX dependencies.
239: >>2. Changed the struct to separate arrays; saves mucho memory.
240: >>3. Did comparisons in unsigned, where possible. (Faster on Perkin-Elmer.)
241: >>4. Sorted the character next chain and changed the search to stop
242: >>prematurely. This saves a lot on the execution time when compressing.
243: >>
244: >>This version is totally compatible with the original version. Even though
245: >>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
246: >>machine, due to the size of the arrays.
247: >>
248: >>Here is the README file from the original author:
249: >>
250: >>>Well, with all this discussion about file compression (for news batching
251: >>>in particular) going around, I decided to implement the text compression
252: >>>algorithm described in the June Computer magazine. The author claimed
253: >>>blinding speed and good compression ratios. It's certainly faster than
254: >>>compact (but, then, what wouldn't be), but it's also the same speed as
255: >>>pack, and gets better compression than both of them. On 350K bytes of
256: >>>unix-wizards, compact took about 8 minutes of CPU, pack took about 80
257: >>>seconds, and compress (herein) also took 80 seconds. But, compact and
258: >>>pack got about 30% compression, whereas compress got over 50%. So, I
259: >>>decided I had something, and that others might be interested, too.
260: >>>
261: >>>As is probably true of compact and pack (although I haven't checked),
262: >>>the byte order within a word is probably relevant here, but as long as
263: >>>you stay on a single machine type, you should be ok. (Can anybody
264: >>>elucidate on this?) There are a couple of asm's in the code (extv and
265: >>>insv instructions), so anyone porting it to another machine will have to
266: >>>deal with this anyway (and could probably make it compatible with Vax
267: >>>byte order at the same time). Anyway, I've linted the code (both with
268: >>>and without -p), so it should run elsewhere. Note the longs in the
269: >>>code, you can take these out if you reduce BITS to <= 15.
270: >>>
271: >>>Have fun, and as always, if you make good enhancements, or bug fixes,
272: >>>I'd like to see them.
273: >>>
274: >>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
275: >>
276: >> regards,
277: >> joe
278: >>
279: >>--
280: >>Full-Name: Joseph M. Orost
281: >>UUCP: ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
282: >>US Mail: MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
283: >>Phone: (201) 870-5844