Annotation of researchv9/cmd/compress/README3.0, revision 1.1

1.1     ! root        1: Enclosed is compress version 3.0 with the following changes:
        !             2: 
        !             3: 1.     "Block" compression is performed.  After the BITS run out, the
        !             4:        compression ratio is checked every so often.  If it is decreasing,
        !             5:        the table is cleared and a new set of substrings is generated.
        !             6: 
        !             7:        This makes the output of compress 3.0 incompatible with that of
        !             8:        compress 2.0.  However, compress 3.0 still accepts the output of
        !             9:        compress 2.0.  To generate output that is compatible with compress
        !            10:        2.0, use the undocumented "-C" flag.
        !            11: 
        !            12: 2.     A quiet "-q" flag has been added for use by the news system.
        !            13: 
        !            14: 3.     The character chaining has been deleted and the program now uses
        !            15:        hashing.  This improves the speed of the program, especially
        !            16:        during decompression.  Other speed improvements have been made,
        !            17:        such as using putc() instead of fwrite().
        !            18: 
        !            19: 4.     A large table is used on large machines when a relatively small
        !            20:        number of bits is specified.  This saves much time when compressing
        !            21:        for a 16-bit machine on a 32-bit virtual machine.  Note that the
        !            22:        speed improvement only occurs when the input file is > 30000
        !            23:        characters, and the -b BITS is less than or equal to the cutoff
        !            24:        described below.
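
The ratio check described in item 1 can be sketched roughly as follows.
This is an illustration only, not the code from compress.c: the structure,
the function name, the CHECK_GAP interval and the choice to treat "no
improvement" the same as a drop are all assumptions made for the example.

```c
#include <assert.h>

#define CHECK_GAP 10000L        /* assumed number of input bytes between checks */

/* Hypothetical bookkeeping; compress.c may organize this differently. */
struct ratio_state {
    long   in_count;            /* input bytes read so far */
    long   bytes_out;           /* compressed bytes written so far */
    long   checkpoint;          /* in_count at which the next check is due */
    double best_ratio;          /* best in/out ratio seen since the last clear */
};

/*
 * Returns 1 when the string table should be cleared, 0 otherwise.
 * table_full is nonzero once no codes are left at the maximum BITS.
 */
int should_clear_table(struct ratio_state *s, int table_full)
{
    double ratio;

    assert(s != 0 && s->in_count >= 0 && s->bytes_out >= 0);

    if (!table_full || s->in_count < s->checkpoint)
        return 0;               /* not time to check yet */
    s->checkpoint = s->in_count + CHECK_GAP;

    if (s->bytes_out == 0)
        return 0;               /* nothing written, no ratio to judge */

    ratio = (double)s->in_count / (double)s->bytes_out;
    if (ratio > s->best_ratio) {
        s->best_ratio = ratio;  /* still improving: keep the table */
        return 0;
    }
    s->best_ratio = 0.0;        /* ratio stopped improving: start a new block */
    return 1;
}
```

A caller would invoke should_clear_table() once per input symbol (or once
per output code) and, on a return of 1, reset its table and tell the
decompressor to do the same.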
        !            25: 
        !            26: Most of these changes were made by James A. Woods (ames!jaw).  Thank you,
        !            27: James!
        !            28: 
        !            29: Version 3.0 has been beta tested on many machines.
        !            30: 
        !            31: To compile compress:
        !            32: 
        !            33:        cc -O -DUSERMEM=usermem -o compress compress.c
        !            34: 
        !            35: Where "usermem" is the amount of physical user memory available (in bytes).
        !            36: If any physical memory is to be reserved for other processes, add
        !            37: "-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.
        !            38: 
        !            39: The difference "usermem-sacredmem" determines the maximum BITS that can be
        !            40: specified, and the cutoff bits where the large+fast table is used.
        !            41: 
        !            42: memory (bytes): at least       max BITS        cutoff
        !            43: ------------------------       --------        ------
        !            44:    4,718,592                    16               13
        !            45:    2,621,440                    16               12
        !            46:    1,572,864                    16               11
        !            47:    1,048,576                    16               10
        !            48:      631,808                    16               --
        !            49:      329,728                    15               --
        !            50:      178,176                    14               --
        !            51:       99,328                    13               --
        !            52:            0                    12               --
        !            53: 
        !            54: The default memory size is 750,000 which gives a maximum BITS=16 and no
        !            55: large+fast table.
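
As a worked reading of the table above, the sketch below looks up the
maximum BITS and the cutoff for a given "usermem" and "sacredmem".  The
numbers are copied from the table; the struct, the function name and the
use of 0 to mean "no large+fast table" are assumptions made for the
example.  compress.c presumably makes this choice at compile time from the
-D values, so a run-time function like this is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>

struct mem_row {
    long min_mem;   /* smallest usermem-sacredmem (bytes) for this row */
    int  max_bits;  /* maximum BITS that can be specified */
    int  cutoff;    /* large+fast table cutoff; 0 means none (assumed encoding) */
};

/* Rows in descending order of memory, as in the README table. */
static const struct mem_row mem_table[] = {
    { 4718592L, 16, 13 },
    { 2621440L, 16, 12 },
    { 1572864L, 16, 11 },
    { 1048576L, 16, 10 },
    {  631808L, 16,  0 },
    {  329728L, 15,  0 },
    {  178176L, 14,  0 },
    {   99328L, 13,  0 },
    {       0L, 12,  0 },
};

/* Hypothetical helper: first row whose threshold the available memory meets. */
const struct mem_row *limits_for(long usermem, long sacredmem)
{
    const size_t n = sizeof mem_table / sizeof mem_table[0];
    long avail = usermem - sacredmem;
    size_t i;

    assert(usermem >= 0 && sacredmem >= 0);
    if (avail < 0)
        avail = 0;              /* more reserved than available: use last row */

    for (i = 0; i < n; i++)
        if (avail >= mem_table[i].min_mem)
            return &mem_table[i];
    return &mem_table[n - 1];   /* not reached: the last threshold is 0 */
}
```

With the default of 750,000 bytes and nothing reserved, the lookup lands on
the 631,808 row, which agrees with the statement above: BITS=16 and no
large+fast table.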
        !            56: 
        !            57: The maximum bits can be overridden by specifying "-DBITS=bits" at
        !            58: compilation time.
        !            59: 
        !            60: If your machine doesn't support unsigned characters, define "NO_UCHAR" 
        !            61: when compiling.
        !            62: 
        !            63: If "int" is 16 bits on your machine, define "SHORT_INT" when compiling.
        !            64: 
        !            65: After compilation, move "compress" to a standard executable location, such 
        !            66: as /usr/local.  Then:
        !            67:        cd /usr/local
        !            68:        ln compress uncompress
        !            69:        ln compress zcat
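
The links work because a single binary can decide what to do from the name
it was invoked under.  The sketch below shows one way such a check might
look; the enum and function name are hypothetical and this is not taken
from compress.c.

```c
#include <assert.h>
#include <string.h>

enum mode { MODE_COMPRESS, MODE_UNCOMPRESS, MODE_ZCAT };

/* Hypothetical helper: choose a mode from argv[0], ignoring any directory part. */
enum mode mode_from_name(const char *argv0)
{
    const char *base;

    assert(argv0 != NULL);

    base = strrchr(argv0, '/');         /* last path separator, if any */
    base = base ? base + 1 : argv0;

    if (strcmp(base, "uncompress") == 0)
        return MODE_UNCOMPRESS;
    if (strcmp(base, "zcat") == 0)
        return MODE_ZCAT;
    return MODE_COMPRESS;               /* any other name compresses */
}
```

A main() would call mode_from_name(argv[0]) before parsing flags, so that
"zcat file" behaves like decompression to standard output.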
        !            70: 
        !            71: On machines that have a fixed stack size (such as Perkin-Elmer), set the
        !            72: stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
        !            73: 
        !            74: Next, install the manual (compress.l).
        !            75:        cp compress.l /usr/man/manl
        !            76:        cd /usr/man/manl
        !            77:        ln compress.l uncompress.l
        !            78:        ln compress.l zcat.l
        !            79: 
        !            80:                - or -
        !            81: 
        !            82:        cp compress.l /usr/man/man1/compress.1
        !            83:        cd /usr/man/man1
        !            84:        ln compress.1 uncompress.1
        !            85:        ln compress.1 zcat.1
        !            86: 
        !            87: The zmore shell script and manual page are for use on systems that have a
        !            88: "more(1)" program.  Install the shell script and the manual page in a "bin"
        !            89: and "man" directory, respectively.  If your system doesn't have the
        !            90: "more(1)" program, just skip "zmore".
        !            91: 
        !            92:                                        regards,
        !            93:                                        petsd!joe
        !            94: 
        !            95: Here is the README file from the previous version of compress (2.0):
        !            96: 
        !            97: >Enclosed is compress.c version 2.0 with the following bugs fixed:
        !            98: >
        !            99: >1.    The packed files produced by compress are different on different
        !           100: >      machines and dependent on the vax sysgen option.
        !           101: >              The bug was in the different byte/bit ordering on the
        !           102: >              various machines.  This has been fixed.
        !           103: >
        !           104: >              This version is NOT compatible with the original vax posting
        !           105: >              unless the '-DCOMPATIBLE' option is specified to the C
        !           106: >              compiler.  The original posting has a bug which I fixed, 
        !           107: >              causing incompatible files.  I recommend that you NOT use this
        !           108: >              option unless you already have a lot of packed files from
        !           109: >              the original posting by thomas.
        !           110: >2.    The exit status is not well defined (on some machines) causing the
        !           111: >      scripts to fail.
        !           112: >              The exit status is now 0,1 or 2 and is documented in
        !           113: >              compress.l.
        !           114: >3.    The function getopt() is not available in all C libraries.
        !           115: >              The function getopt() is no longer referenced by the
        !           116: >              program.
        !           117: >4.    Error status is not being checked on the fwrite() and fflush() calls.
        !           118: >              Fixed.
        !           119: >
        !           120: >The following enhancements have been made:
        !           121: >
        !           122: >1.    Added facilities of "compact" into the compress program.  "Pack",
        !           123: >      "Unpack", and "Pcat" are no longer required (no longer supplied).
        !           124: >2.    Installed work around for C compiler bug with "-O".
        !           125: >3.    Added a magic number header (\037\235).  Put the bits specified
        !           126: >      in the file.
        !           127: >4.    Added "-f" flag to force overwrite of output file.
        !           128: >5.    Added "-c" flag and "zcat" program.  'ln compress zcat' after you
        !           129: >      compile.
        !           130: >6.    The 'uncompress' script has been deleted; simply 
        !           131: >      'ln compress uncompress' after you compile and it will work.
        !           132: >7.    Removed extra bit masking for machines that support unsigned
        !           133: >      characters.  If your machine doesn't support unsigned characters,
        !           134: >      define "NO_UCHAR" when compiling.
        !           135: >
        !           136: >Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
        !           137: >standard executable location, such as /usr/local.  Then:
        !           138: >      cd /usr/local
        !           139: >      ln compress uncompress
        !           140: >      ln compress zcat
        !           141: >
        !           142: >On machines that have a fixed stack size (such as Perkin-Elmer), set the
        !           143: >stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
        !           144: >
        !           145: >Next, install the manual (compress.l).
        !           146: >      cp compress.l /usr/man/manl             - or -
        !           147: >      cp compress.l /usr/man/man1/compress.1
        !           148: >
        !           149: >Here is the README that I sent with my first posting:
        !           150: >
        !           151: >>Enclosed is a modified version of compress.c, along with scripts to make it
        !           152: >>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
        !           153: >>(petsd!joe) and a colleague (petsd!peora!srd) did:
        !           154: >>
        !           155: >>1. Removed VAX dependencies.
        !           156: >>2. Changed the struct to separate arrays; saves mucho memory.
        !           157: >>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
        !           158: >>4. Sorted the character next chain and changed the search to stop
        !           159: >>prematurely.  This saves a lot on the execution time when compressing.
        !           160: >>
        !           161: >>This version is totally compatible with the original version.  Even though
        !           162: >>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
        !           163: >>machine, due to the size of the arrays.
        !           164: >>
        !           165: >>Here is the README file from the original author:
        !           166: >> 
        !           167: >>>Well, with all this discussion about file compression (for news batching
        !           168: >>>in particular) going around, I decided to implement the text compression
        !           169: >>>algorithm described in the June Computer magazine.  The author claimed
        !           170: >>>blinding speed and good compression ratios.  It's certainly faster than
        !           171: >>>compact (but, then, what wouldn't be), but it's also the same speed as
        !           172: >>>pack, and gets better compression than both of them.  On 350K bytes of
        !           173: >>>unix-wizards, compact took about 8 minutes of CPU, pack took about 80
        !           174: >>>seconds, and compress (herein) also took 80 seconds.  But, compact and
        !           175: >>>pack got about 30% compression, whereas compress got over 50%.  So, I
        !           176: >>>decided I had something, and that others might be interested, too.
        !           177: >>>
        !           178: >>>As is probably true of compact and pack (although I haven't checked),
        !           179: >>>the byte order within a word is probably relevant here, but as long as
        !           180: >>>you stay on a single machine type, you should be ok.  (Can anybody
        !           181: >>>elucidate on this?)  There are a couple of asm's in the code (extv and
        !           182: >>>insv instructions), so anyone porting it to another machine will have to
        !           183: >>>deal with this anyway (and could probably make it compatible with Vax
        !           184: >>>byte order at the same time).  Anyway, I've linted the code (both with
        !           185: >>>and without -p), so it should run elsewhere.  Note the longs in the
        !           186: >>>code, you can take these out if you reduce BITS to <= 15.
        !           187: >>>
        !           188: >>>Have fun, and as always, if you make good enhancements, or bug fixes,
        !           189: >>>I'd like to see them.
        !           190: >>>
        !           191: >>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
        !           192: >>
        !           193: >>                                     regards,
        !           194: >>                                     joe
        !           195: >>
        !           196: >>--
        !           197: >>Full-Name:  Joseph M. Orost
        !           198: >>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
        !           199: >>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
        !           200: >>Phone:      (201) 870-5844
