1.1 root 1: Enclosed is compress version 3.0 with the following changes:
2:
3: 1. "Block" compression is performed. After the BITS run out, the
4: compression ratio is checked every so often. If it is decreasing,
5: the table is cleared and a new set of substrings is generated.
6:
7: This makes the output of compress 3.0 incompatible with that of
8: compress 2.0. However, compress 3.0 still accepts the output of
9: compress 2.0. To generate output that is compatible with compress
10: 2.0, use the undocumented "-C" flag.
11:
12: 2. A quiet "-q" flag has been added for use by the news system.
13:
14: 3. The character chaining has been deleted and the program now uses
15: hashing. This improves the speed of the program, especially
16: during decompression. Other speed improvements have been made,
17: such as using putc() instead of fwrite().
18:
19: 4. A large table is used on large machines when a relatively small
20: number of bits is specified. This saves much time when compressing
21: for a 16-bit machine on a 32-bit virtual machine. Note that the
22: speed improvement only occurs when the input file is > 30000
23: characters, and the -b BITS is less than or equal to the cutoff
24: described below.
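
The ratio check described in item 1 above can be pictured roughly as follows. This is a minimal sketch, not the code in compress.c: the names "ratio_state" and "should_clear_table", and the fixed-point scaling by 256, are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical state for the ratio check; not taken from compress.c. */
struct ratio_state {
    long best_ratio;    /* highest ratio since the last clear, scaled by 256 */
};

/*
 * Meant to be called "every so often" once the code table is full.
 * bytes_in and bytes_out are running totals for the current block.
 * Returns 1 if the table should be cleared (the ratio is decreasing),
 * 0 otherwise.  Assumes bytes_in * 256 fits in a long.
 */
int should_clear_table(struct ratio_state *st, long bytes_in, long bytes_out)
{
    long ratio;

    assert(bytes_out > 0);
    ratio = (bytes_in * 256) / bytes_out;   /* fixed point, no floating point */

    if (ratio >= st->best_ratio) {
        st->best_ratio = ratio;     /* steady or improving: keep the table */
        return 0;
    }
    st->best_ratio = 0;             /* ratio fell: start a fresh table */
    return 1;
}
```

A caller that gets 1 back would emit whatever "clear" signal the format uses and reset its string table. Those steps are left out here because the text above does not describe them.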
25:
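
For item 3 above, the idea of replacing a chained search with hashing can be sketched as a lookup keyed on a (prefix code, next character) pair. This is only an illustration under stated assumptions: the table size, the linear probing, and the names "hash_clear" and "hash_find_or_add" are hypothetical and are not claimed to match compress.c.

```c
#include <assert.h>

#define HSIZE 5003              /* illustrative prime table size (assumption) */
#define EMPTY (-1L)

static long key_tab[HSIZE];     /* packed (prefix, ch) key, or EMPTY */
static int  code_tab[HSIZE];    /* code assigned to that key */

/* Mark every slot as unused. */
void hash_clear(void)
{
    int i;

    for (i = 0; i < HSIZE; i++)
        key_tab[i] = EMPTY;
}

/*
 * Look up the string "prefix followed by ch".
 * Returns its code if it is already in the table.  Otherwise stores
 * new_code for it and returns -1.  The caller must keep the table
 * less than full so that probing always terminates.
 */
int hash_find_or_add(int prefix, int ch, int new_code)
{
    long key;
    int i;

    assert(prefix >= 0);
    key = ((long)prefix << 8) | (long)(ch & 0xff);
    i = (int)(key % HSIZE);

    while (key_tab[i] != EMPTY) {
        if (key_tab[i] == key)
            return code_tab[i];     /* found: one probe in the common case */
        if (++i == HSIZE)           /* linear probe, wrapping at the end */
            i = 0;
    }
    key_tab[i] = key;
    code_tab[i] = new_code;
    return -1;
}
```

The point of the design is that a lookup usually costs one or two probes no matter how many strings share a prefix, where a chained search walks every sibling.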
26: Most of these changes were made by James A. Woods (ames!jaw). Thank you
27: James!
28:
29: Version 3.0 has been beta tested on many machines.
30:
31: To compile compress:
32:
33: cc -O -DUSERMEM=usermem -o compress compress.c
34:
35: Where "usermem" is the amount of physical user memory available (in bytes).
36: If any physical memory is to be reserved for other processes, put in
37: "-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.
38:
39: The difference "usermem-sacredmem" determines the maximum BITS that can be
40: specified, and the cutoff bits where the large+fast table is used.
41:
42:     memory: at least    BITS    cutoff
43:     ------  -- -----    ----    ------
44:            4,718,592      16        13
45:            2,621,440      16        12
46:            1,572,864      16        11
47:            1,048,576      16        10
48:              631,808      16        --
49:              329,728      15        --
50:              178,176      14        --
51:               99,328      13        --
52:                    0      12        --
53:
54: The default memory size is 750,000 which gives a maximum BITS=16 and no
55: large+fast table.
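
The table above can be read as a simple lookup on "usermem-sacredmem". The sketch below encodes the rows as printed. The struct and function names are hypothetical, and a cutoff of 0 is used here to stand for the "--" entries (no large+fast table). It is not taken from compress.c.

```c
#include <stddef.h>

/* Limits derived from the README table; names are illustrative only. */
struct mem_limits {
    int max_bits;   /* largest BITS that may be specified */
    int cutoff;     /* large+fast table used when -b BITS <= cutoff; 0 = never */
};

struct mem_limits limits_for_memory(long usermem, long sacredmem)
{
    static const struct {
        long at_least;
        int  max_bits;
        int  cutoff;
    } rows[] = {
        { 4718592L, 16, 13 },
        { 2621440L, 16, 12 },
        { 1572864L, 16, 11 },
        { 1048576L, 16, 10 },
        {  631808L, 16,  0 },
        {  329728L, 15,  0 },
        {  178176L, 14,  0 },
        {   99328L, 13,  0 },
        {       0L, 12,  0 }
    };
    const size_t nrows = sizeof rows / sizeof rows[0];
    long avail = usermem - sacredmem;
    struct mem_limits out;
    size_t i;

    /* Rows run from largest to smallest; take the first one that fits.
     * If none fits (avail is negative), fall through to the last row. */
    for (i = 0; i < nrows - 1; i++)
        if (avail >= rows[i].at_least)
            break;

    out.max_bits = rows[i].max_bits;
    out.cutoff   = rows[i].cutoff;
    return out;
}
```

With the default of 750,000 bytes and nothing reserved, this yields max_bits 16 and cutoff 0, which agrees with the statement above that the default gives BITS=16 and no large+fast table.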
56:
57: The maximum bits can be overridden by specifying "-DBITS=bits" at
58: compilation time.
59:
60: If your machine doesn't support unsigned characters, define "NO_UCHAR"
61: when compiling.
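
To show why a flag like "NO_UCHAR" matters: where only a signed "char" is available, a byte value above 127 becomes negative when widened to "int" and must be masked before it is used as an index. The sketch below is an assumption about the general technique, with hypothetical names ("byte_t", "BYTE_VALUE", "table_lookup"). It is not the actual text of compress.c.

```c
/* Hypothetical names; compress.c may spell this differently. */
#ifdef NO_UCHAR
typedef char byte_t;                    /* plain char, possibly signed */
#define BYTE_VALUE(c) ((c) & 0xff)      /* extra mask strips sign extension */
#else
typedef unsigned char byte_t;
#define BYTE_VALUE(c) (c)               /* already 0..255, no mask needed */
#endif

/* Index a 256-entry table safely with an input byte (8-bit bytes assumed). */
int table_lookup(const int table[256], byte_t c)
{
    return table[BYTE_VALUE(c)];
}
```

Compiled either with or without "-DNO_UCHAR", the lookup sees the same index for the same input byte. Only the cost of the extra mask differs.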
62:
63: If "int" is 16 bits on your machine, define "SHORT_INT" when compiling.
64:
65: After compilation, move "compress" to a standard executable location, such
66: as /usr/local. Then:
67: cd /usr/local
68: ln compress uncompress
69: ln compress zcat
70:
71: On machines that have a fixed stack size (such as Perkin-Elmer), set the
72: stack to at least 12kb. ("setstack compress 12" on Perkin-Elmer).
73:
74: Next, install the manual (compress.l).
75: cp compress.l /usr/man/manl
76: cd /usr/man/manl
77: ln compress.l uncompress.l
78: ln compress.l zcat.l
79:
80: - or -
81:
82: cp compress.l /usr/man/man1/compress.1
83: cd /usr/man/man1
84: ln compress.1 uncompress.1
85: ln compress.1 zcat.1
86:
87: The zmore shell script and manual page are for use on systems that have a
88: "more(1)" program. Install the shell script and the manual page in a "bin"
89: and "man" directory, respectively. If your system doesn't have the
90: "more(1)" program, just skip "zmore".
91:
92: regards,
93: petsd!joe
94:
95: Here is the README file from the previous version of compress (2.0):
96:
97: >Enclosed is compress.c version 2.0 with the following bugs fixed:
98: >
99: >1. The packed files produced by compress are different on different
100: > machines and dependent on the vax sysgen option.
101: > The bug was in the different byte/bit ordering on the
102: > various machines. This has been fixed.
103: >
104: > This version is NOT compatible with the original vax posting
105: > unless the '-DCOMPATIBLE' option is specified to the C
106: > compiler. The original posting has a bug which I fixed,
107: > causing incompatible files. I recommend that you NOT use this
108: > option unless you already have a lot of packed files from
109: > the original posting by thomas.
110: >2. The exit status is not well defined (on some machines) causing the
111: > scripts to fail.
112: > The exit status is now 0,1 or 2 and is documented in
113: > compress.l.
114: >3. The function getopt() is not available in all C libraries.
115: > The function getopt() is no longer referenced by the
116: > program.
117: >4. Error status is not being checked on the fwrite() and fflush() calls.
118: > Fixed.
119: >
120: >The following enhancements have been made:
121: >
122: >1. Added facilities of "compact" into the compress program. "Pack",
123: > "Unpack", and "Pcat" are no longer required (no longer supplied).
124: >2. Installed work around for C compiler bug with "-O".
125: >3. Added a magic number header (\037\235). Put the bits specified
126: > in the file.
127: >4. Added "-f" flag to force overwrite of output file.
128: >5. Added "-c" flag and "zcat" program. 'ln compress zcat' after you
129: > compile.
130: >6. The 'uncompress' script has been deleted; simply
131: > 'ln compress uncompress' after you compile and it will work.
132: >7. Removed extra bit masking for machines that support unsigned
133: > characters. If your machine doesn't support unsigned characters,
134: > define "NO_UCHAR" when compiling.
135: >
136: >Compile "compress.c" with "-O -o compress" flags. Move "compress" to a
137: >standard executable location, such as /usr/local. Then:
138: > cd /usr/local
139: > ln compress uncompress
140: > ln compress zcat
141: >
142: >On machines that have a fixed stack size (such as Perkin-Elmer), set the
143: >stack to at least 12kb. ("setstack compress 12" on Perkin-Elmer).
144: >
145: >Next, install the manual (compress.l).
146: > cp compress.l /usr/man/manl - or -
147: > cp compress.l /usr/man/man1/compress.1
148: >
149: >Here is the README that I sent with my first posting:
150: >
151: >>Enclosed is a modified version of compress.c, along with scripts to make it
152: >>run identically to pack(1), unpack(1), and pcat(1). Here is what I
153: >>(petsd!joe) and a colleague (petsd!peora!srd) did:
154: >>
155: >>1. Removed VAX dependencies.
156: >>2. Changed the struct to separate arrays; saves mucho memory.
157: >>3. Did comparisons in unsigned, where possible. (Faster on Perkin-Elmer.)
158: >>4. Sorted the character next chain and changed the search to stop
159: >>prematurely. This saves a lot on the execution time when compressing.
160: >>
161: >>This version is totally compatible with the original version. Even though
162: >>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
163: >>machine, due to the size of the arrays.
164: >>
165: >>Here is the README file from the original author:
166: >>
167: >>>Well, with all this discussion about file compression (for news batching
168: >>>in particular) going around, I decided to implement the text compression
169: >>>algorithm described in the June Computer magazine. The author claimed
170: >>>blinding speed and good compression ratios. It's certainly faster than
171: >>>compact (but, then, what wouldn't be), but it's also the same speed as
172: >>>pack, and gets better compression than both of them. On 350K bytes of
173: >>>unix-wizards, compact took about 8 minutes of CPU, pack took about 80
174: >>>seconds, and compress (herein) also took 80 seconds. But, compact and
175: >>>pack got about 30% compression, whereas compress got over 50%. So, I
176: >>>decided I had something, and that others might be interested, too.
177: >>>
178: >>>As is probably true of compact and pack (although I haven't checked),
179: >>>the byte order within a word is probably relevant here, but as long as
180: >>>you stay on a single machine type, you should be ok. (Can anybody
181: >>>elucidate on this?) There are a couple of asm's in the code (extv and
182: >>>insv instructions), so anyone porting it to another machine will have to
183: >>>deal with this anyway (and could probably make it compatible with Vax
184: >>>byte order at the same time). Anyway, I've linted the code (both with
185: >>>and without -p), so it should run elsewhere. Note the longs in the
186: >>>code, you can take these out if you reduce BITS to <= 15.
187: >>>
188: >>>Have fun, and as always, if you make good enhancements, or bug fixes,
189: >>>I'd like to see them.
190: >>>
191: >>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
192: >>
193: >> regards,
194: >> joe
195: >>
196: >>--
197: >>Full-Name: Joseph M. Orost
198: >>UUCP: ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
199: >>US Mail: MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
200: >>Phone: (201) 870-5844