Annotation of researchv10dc/vol2/grap/paper.cstr, revision 1.1.1.1

1.1       root        1: .so ../ADM/mac
                      2: .XX grap 109 "Grap \(em A Language for Typesetting Graphs"
                      3: .TR 114
                      4: .ND "Revised, May 1991"
                      5: .EQ
                      6: delim $$
                      7: .EN
                      8: .so macros
                      9: .ds g \f2grap\fP
                     10: .ds G \f2Grap\fP
                     11: .TL
                     12: Grap \(em A Language for Typesetting Graphs
                     13: .br
                     14: Tutorial and User Manual
                     15: .AU
                     16: Jon L. Bentley
                     17: Brian W. Kernighan
                     18: .AI
                     19: .MH
                     20: .AB
                     21: \*G
                     22: is a language for describing plots of data.
                     23: This graph of the 1984
                     24: age distribution in the United States
                     25: .grap agepop1.g
                     26: is produced by the
                     27: \*g
                     28: commands
                     29: .P1
                     30: .get agepop1.g
                     31: .P2
                     32: (Each line in the data file
                     33: .CW agepop.d
                     34: contains an age and the number of Americans of that
                     35: age alive in 1984; the file is sorted by age.)
                     36: .PP
                     37: The
                     38: \*g
                     39: preprocessor works with
                     40: .I pic |reference(latest pic)
                     41: and
                     42: .I troff |reference(latest troff reference).
                     43: Most of its input is passed
                     44: through untouched, but statements between
                     45: .CW .G1
                     46: and
                     47: .CW .G2
                     48: are translated into
                     49: .I pic
                     50: commands that draw graphs.
                     51: .AE
                     52: .NH
                     53: Introduction
                     54: .PP
                     55: \*G
                     56: is a language for describing graphical
                     57: displays of data.
                     58: It provides such services as automatic scaling and
                     59: labeling of axes, and
                     60: .CW for
                     61: statements,
                     62: .CW if
                     63: statements, and macros to facilitate user
                     64: programmability.
                     65: \*G
                     66: is intended primarily for including graphs in
                     67: documents prepared on the
                     68: .UX
                     69: operating system, and is only marginally
                     70: useful for elementary tasks in data analysis.
                     71: .PP
                     72: Section 2 of this document is a tutorial introduction to
                     73: \*g;
                     74: readers who find it slow going may wish to skim ahead.
                     75: The examples in Section 3 illustrate
                     76: the various kinds of graphs that
                     77: \*g
                     78: can produce and some common
                     79: \*g
                     80: idioms.
                     81: Mundane matters about using
                     82: \*g
                     83: are discussed in Section 4,
                     84: and Section 5 contains a brief reference manual.
                     85: .PP
                     86: We have tried to illustrate good principles of
                     87: statistics and graphical design in the
                     88: graphs we present.
                     89: In several places, though, good taste has lost to
                     90: the necessity of illustrating
                     91: \*g
                     92: capabilities.
                     93: Readers interested in statistical
                     94: integrity and taste should
                     95: consult the literature, for example |reference(chambers graphs)
                     96: |reference(tufte graphs) |reference(cleveland elements).
                     97: .NH
                     98: Tutorial
                     99: .PP
                    100: The following is a simple
                    101: \*g
                    102: program\(dg
                    103: .FS
                    104: \(dg Throughout
                    105: this document we will show only the first five
                    106: lines and the last line of data files;
                    107: omitted lines are indicated by ``...''.
                    108: .FE
                    109: .P1
                    110: \&.G1
                    111: .d 400mtimes.d
                    112: \&.G2
                    113: .P2
                    114: The single number on each line
                    115: is the winning time in seconds for the
                    116: men's 400 meter run,
                    117: from the first modern Olympic Games (1896)
                    118: to the twenty-first (1988).
                    119: If the file
                    120: .CW olymp.g
                    121: contains the text above,
                    122: then typing the command
                    123: .P1
                    124: grap olymp.g | pic | troff > junk
                    125: .P2
                    126: creates a
                    127: .I troff
                    128: output file
                    129: .CW junk
                    130: that contains the
                    131: picture
                    132: .grap 4001.g
                    133: The graph shows the decrease
                    134: in winning times from 54.2
                    135: seconds to 43.87 seconds.
                    136: If the times are
                    137: contained in the file
                    138: .CW 400mtimes.d ,
                    139: we can
                    140: produce the same graph with the
                    141: shorter program
                    142: .P1
                    143: .get 4001.g
                    144: .P2
                    145: Writing
                    146: .CW copy
                    147: .CW \&"fname"
                    148: in a
                    149: \*g
                    150: program is equivalent to including the
                    151: contents of file
                    152: .CW fname
                    153: at that point in the file.
                    154: (In the interests of compatibility with other programs,
                    155: .CW include
                    156: is a synonym for
                    157: .CW copy .)
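                         .PP
                         Assuming the shorter program holds nothing beyond a
                         .CW copy
                         of the data file, it might read as follows.
                         This is a sketch, not a transcript of
                         .CW 4001.g .
                         .P1
                         \&.G1
                         copy "400mtimes.d"
                         \&.G2
                         .P2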
                    158: .PP
                    159: Each line in the file
                    160: .CW 400mpairs.d
                    161: contains two numbers, the
                    162: year of the Olympics and the winning time:
                    163: .P1
                    164: .d 400mpairs.d
                    165: .P2
                    166: If we plot this data with the program
                    167: .P1
                    168: .get 4002.g
                    169: .P2
                    170: the bottom ($x$) axis represents the year of the Olympics.
                    171: .grap 4002.g
                    172: The ``holes'' in $x$-values reflect the fact
                    173: that the 1916, 1940, and 1944 Olympics
                    174: were cancelled due to war.
                    175: Because the previous data
                    176: (in
                    177: .CW 400mtimes.d )
                    178: had just one number per
                    179: line,
                    180: \*g
                    181: viewed it as a ``time series'' and
                    182: supplied $x$-values of $1, ~ 2, ~ 3, ...$
                    183: before plotting
                    184: the data as $y$-values.
                    185: The input to the
                    186: second program has two values per line,
                    187: so they are interpreted as $( x , y )$ pairs.
                    188: .PP
                    189: Rather than a scatter plot of points, we might prefer to
                    190: see the winning times connected by a solid
                    191: line.
                    192: The program
                    193: .P1
                    194: .get 4003.g
                    195: .P2
                    196: produces the graph
                    197: .grap 4003.g
                    198: Eric Liddell of Great Britain
                    199: won his gold medal
                    200: in Paris in 1924 with a time of 47.6 seconds.
                    201: (Remember ``Chariots
                    202: of Fire''?)
                    203: .PP
                    204: We can make the graph more attractive
                    205: by modifying its frame
                    206: and adding labels.
                    207: .P1
                    208: .get 4004.g
                    209: .P2
                    210: The
                    211: .CW frame
                    212: command describes
                    213: the graph's bounding box:
                    214: the overall frame (which has four sides)
                    215: is invisible; it is 2 inches high and 3 inches
                    216: wide (which happen to be the
                    217: default height and width),
                    218: and the left and bottom
                    219: sides are solid (they could have been
                    220: dashed or dotted instead).
                    221: The labels appear on the left and bottom, as requested.
                    222: .grap 4004.g
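                         .PP
                         A hedged sketch of such a program follows.
                         The
                         .CW frame
                         clauses mirror the description above;
                         the label strings and the
                         .CW "draw solid"
                         line are assumptions, not the contents of
                         .CW 4004.g .
                         .P1
                         \&.G1
                         frame invis ht 2 wid 3 left solid bot solid
                         label left "Time (in seconds)"
                         label bot "Olympic 400 Meter Run: Winning Times"
                         draw solid
                         copy "400mpairs.d"
                         \&.G2
                         .P2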
                    223: .PP
                    224: To set the range of each axis,
                    225: \*g
                    226: examines the data and pads both
                    227: dimensions
                    228: by seven percent at each end.
                    229: The
                    230: .CW coord
                    231: (``coordinates'') command
                    232: allows you to specify the range of one or both axes explicitly;
                    233: it also turns off automatic padding.
                    234: .P1
                    235: .get 4005.g
                    236: .P2
                    237: The $y$-axis now ranges from 42 to 56 seconds
                    238: (a little more than before),
                    239: and the $x$-axis from 1894 to 1990
                    240: (a little less).
                    241: .grap 4005.g
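                         .PP
                         For illustration only, a
                         .CW coord
                         command giving the ranges quoted above could be written as follows
                         (a sketch; the actual line in
                         .CW 4005.g
                         may differ).
                         .P1
                         coord x 1894,1990 y 42,56
                         .P2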
                    242: .PP
                    243: The ticks in the preceding graphs were generated
                    244: by
                    245: \*g
                    246: guessing at reasonable values.
                    247: If you would rather provide your own,
                    248: you may
                    249: use the
                    250: .CW ticks
                    251: command,
                    252: which comes in the flavors illustrated below.
                    253: .P1
                    254: .get 4006.g
                    255: .P2
                    256: The first
                    257: .CW ticks
                    258: command deals with the left axis:
                    259: it puts the ticks facing out at
                    260: the numbers in the list.
                    261: \*G
                    262: puts labels only at values
                    263: with strings,
                    264: except that when no labels at all are
                    265: given, each number serves as its own label,
                    266: as in the second
                    267: .CW ticks
                    268: command.
                    269: That command
                    270: is for the bottom axis:
                    271: it puts the ticks facing in at steps of 20
                    272: from 1900 to 1980.
                    273: The command
                    274: .CW "ticks off"
                    275: turns off all ticks.
                    276: \*G
                    277: does its best to place labels appropriately, but
                    278: it sometimes needs your help:
                    279: the
                    280: .CW "left .2"
                    281: clause moves the left label 0.2 inches further left to
                    282: avoid the new ticks.
                    283: .grap 4006.g
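                         .PP
                         The commands just described might be written as shown here.
                         The tick positions and the label text are assumed for illustration
                         and are not taken from
                         .CW 4006.g .
                         .P1
                         ticks left out at 44 "44", 46, 48 "48", 50, 52 "52", 54
                         ticks bot in from 1900 to 1980 by 20
                         label left "Time (in seconds)" left .2
                         .P2
                         In the first command only 44, 48 and 52 carry strings,
                         so only those ticks are labeled.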
                    284: .PP
                    285: The file
                    286: .CW 400wpairs.d
                    287: contains the times for
                    288: the women's 400 meter race, which has been run
                    289: only since 1964.
                    290: .P1
                    291: .d 400wpairs.d
                    292: .P2
                    293: To add these times to the graph,
                    294: we use
                    295: .P1
                    296: .get 4007.g
                    297: .P2
                    298: The
                    299: .CW new
                    300: command tells
                    301: \*g
                    302: to end
                    303: the old curve and to start a new curve
                    304: (which in this case will be drawn
                    305: with a dotted line).
                    306: Text is placed on the graph by
                    307: commands of the form
                    308: .P1
                    309: "string" at xvalue, yvalue
                    310: .P2
                    311: The
                    312: .CW size
                    313: clauses following the quoted strings tell
                    314: \*g
                    315: to shrink the characters by three points (absolute point sizes
                    316: may also be specified).
                    317: Strings are usually centered at the specified position,
                    318: but can be adjusted by clauses to be illustrated shortly.
                    319: .grap 4007.g
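                         .PP
                         A sketch of these additions follows, with label text and positions
                         assumed for illustration rather than copied from
                         .CW 4007.g .
                         .P1
                         new dotted
                         copy "400wpairs.d"
                         "Women" size -3 at 1958,52
                         "Men" size -3 at 1910,47
                         .P2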
                    320: .PP
                    321: The file
                    322: .CW phone.d
                    323: records the number of telephones in the United States from
                    324: 1900 to 1970.
                    325: .P1
                    326: .d phone.d
                    327: .P2
                    328: Each line gives a year and the number of telephones
                    329: present in that year
                    330: (in millions, truncated to a multiple of one hundred thousand).
                    331: The simple
                    332: \*g
                    333: program
                    334: .P1
                    335: .get phone1.g
                    336: .P2
                    337: produces the simple graph
                    338: .grap phone1.g
                    339: .PP
                    340: The number of telephones appears to
                    341: grow exponentially;
                    342: to study that we will plot the data with
                    343: a logarithmic $y$-axis by adding
                    344: .CW log
                    345: .CW y
                    346: to the
                    347: .CW coord
                    348: command.
                    349: We will also add cosmetic changes of labels, more ticks,
                    350: and a solid line to replace the unconnected dots.
                    351: .P1
                    352: .get phone2.g
                    353: .P2
                    354: The third
                    355: .CW ticks
                    356: command provides a string that is used to print the tick
                    357: labels.
                    358: .UC C
                    359: programmers will recognize it as a
                    360: .CW printf
                    361: format string; others may view the
                    362: .CW %g
                    363: as the place to put
                    364: the number and anything else (in this case just an apostrophe) as
                    365: literal text to appear in the labels.
                    366: To suppress
                    367: labels, use the empty format string ("").
                    368: The program produces
                    369: .grap phone2.g
                    370: The number of telephones grew rapidly
                    371: in the first decade of the twentieth century,
                    372: and then settled down to an exponential growth rate upset only
                    373: by a decrease in the Great Depression and a post-war growth
                    374: spurt
                    375: to return the curve to its pre-Depression line.
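                         .PP
                         As a hedged sketch of the commands discussed above,
                         assume that
                         .CW phone.d
                         stores each year as an offset from 1900 (0 through 70);
                         the axis ranges, tick positions and label text are likewise assumptions.
                         .P1
                         \&.G1
                         coord x 0,70 y 1,200 log y
                         ticks left out at 1, 10, 100
                         ticks bot out at 0 "1900", 70 "1970"
                         ticks bot out from 10 to 60 by 10 "'%g"
                         label left "Millions of Telephones" left .2
                         draw solid
                         copy "phone.d"
                         \&.G2
                         .P2
                         With that format string, the tick at 10 is labeled '10, the tick at 20
                         is labeled '20, and so on.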
                    376: .PP
                    377: Our presentation so far has been to
                    378: start with a simple
                    379: \*g
                    380: program that illustrates the data, and then refine it.
                    381: Later in this document we will ignore the design
                    382: phase, and present rather complex graphs in
                    383: their final form.
                    384: Beware.
                    385: .PP
                    386: All the examples so far have placed data on the
                    387: graph implicitly by
                    388: .CW copy ing
                    389: a file of numbers
                    390: (either a time series with one number per line or
                    391: pairs of numbers).
                    392: It is also possible to draw points and lines explicitly.
                    393: The
                    394: \*g
                    395: commands to draw on a graph
                    396: are illustrated in the following
                    397: fragment.
                    398: .P1
                    399: .get geom.g
                    400: .P2
                    401: .PP
                    402: The
                    403: .CW grid
                    404: command is similar to the
                    405: .CW ticks
                    406: command, except that grid lines extend
                    407: across the frame.
                    408: The next few commands plot text at specified positions.
                    409: The plotting characters (such as
                    410: .CW bullet )
                    411: are implemented as predefined
                    412: macros \(em more on that shortly.
                    413: Unlike arbitrary characters,
                    414: the visual centers of the markers
                    415: are near their plotting centers.
                    416: The
                    417: .CW circle
                    418: command draws a circle centered at the specified location.
                    419: A radius in inches may be specified;
                    420: if no radius is given, then the circle will be the
                    421: small circle shown at the center of the graph.
                    422: The
                    423: .CW line
                    424: and
                    425: .CW arrow
                    426: commands draw the obvious objects shown at the upper left.
                    427: .grap geom.g
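                         .PP
                         A fragment in the spirit of this figure might look as follows.
                         All coordinates, and the 0 to 100 ranges, are chosen for illustration only
                         and are not taken from
                         .CW geom.g .
                         .P1
                         coord x 0,100 y 0,100
                         grid bot dotted from 20 to 80 by 20
                         grid left dotted from 20 to 80 by 20
                         "Text above" above at 50,50
                         bullet at 80,90
                         circle at 50,50
                         circle at 50,80 radius .25
                         line dashed from 10,90 to 30,90
                         arrow from 10,70 to 30,90
                         .P2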
                    428: .PP
                    429: This figure also illustrates the combined use of the
                    430: .CW draw
                    431: and
                    432: .CW next
                    433: commands.
                    434: Saying
                    435: .CW draw
                    436: .CW A
                    437: .CW solid
                    438: defines the style
                    439: for a connected sequence of line fragments to be called
                    440: .CW A .
                    441: Subsequent commands of the form
                    442: .CW next
                    443: .CW A
                    444: .CW at
                    445: .I point
                    446: add
                    447: .I point
                    448: to the end of
                    449: .CW A .
                    450: There are two such sequences active in the above
                    451: example
                    452: .CW A "" (
                    453: and
                    454: .CW B );
                    455: note that their
                    456: .CW next
                    457: commands are intermixed.
                    458: Because the predefined string
                    459: .CW delta
                    460: follows the specification of
                    461: .CW B ,
                    462: that string is plotted at each point in the sequence.
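                         .PP
                         The interleaving can be sketched as follows;
                         the points are invented for illustration.
                         .P1
                         draw A solid
                         draw B dashed delta
                         next A at 1,4
                         next B at 2,3
                         next A at 3,5
                         next B at 4,4
                         .P2
                         Each
                         .CW next
                         extends only the sequence it names, so
                         .CW A
                         runs from 1,4 to 3,5 while
                         .CW B
                         runs from 2,3 to 4,4.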
                    463: .PP
                    464: \*G
                    465: has numeric variables (implemented as double-precision
                    466: floating point numbers) and
                    467: the usual collection of arithmetic operators and
                    468: mathematical functions; see the reference section
                    469: for details.
                    470: .PP
                    471: \*G
                    472: provides the same rudimentary macro facility that
                    473: .I pic
                    474: does:
                    475: .P1
                    476: define \f2name\fP  { \f2replacement text\fP }
                    477: .P2
                    478: defines
                    479: .IT name
                    480: to be the
                    481: .IT "replacement text" .
                    482: The replacement may be any text that contains balanced open and closing braces
                    483: .CW "{ }" .
                    484: (Alternatively, the
                    485: .IT "replacement text"
                    486: may be quoted by
                    487: any single character that does not appear in the replacement;
                    488: the string is terminated by the next occurrence of that character.)
                    489: Any subsequent occurrence of
                    490: .IT name
                    491: will be replaced by
                    492: .IT "replacement text" .
                    493: .EQ
                    494: delim %%
                    495: .EN
                    496: .PP
                    497: The replacement text of a macro definition may
                    498: contain occurrences of
                    499: .CW $1 ,
                    500: .CW $2 ,
                    501: etc.;
                    502: these will be replaced by the corresponding actual
                    503: arguments when the macro is invoked.
                    504: The invocation for a macro with arguments is
                    505: .P1
                    506: name(arg1, arg2, ...)
                    507: .P2
                    508: Non-existent arguments are replaced by null
                    509: strings.
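                         .PP
                         For instance, under the rules above a one-argument macro
                         could be defined and used like this
                         (a sketch; the names
                         .CW square
                         and
                         .CW area
                         are chosen for illustration).
                         .P1
                         define square { ($1)*($1) }
                         area = square(3)
                         .P2
                         After expansion the second line is equivalent to
                         .CW "area = (3)*(3)" .
                         The parentheses around each argument keep an invocation such as
                         .CW square(a+b)
                         from expanding into an expression with the wrong precedence.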
                    510: .EQ
                    511: delim $$
                    512: .EN
                    513: .PP
                    514: The following
                    515: \*g
                    516: program uses macros and arithmetic to plot
                    517: crude approximations to
                    518: the square and square root functions.
                    519: .P1
                    520: .get macarith.g
                    521: .P2
                    522: The macro
                    523: .CW root
                    524: uses the
                    525: .CW ^
                    526: exponentiation operator.
                    527: (Because
                    528: \*g
                    529: has the square root function
                    530: .CW sqrt ,
                    531: that macro is in fact superfluous.)
                    532: The program produces
                    533: .grap macarith.g
                    534: .PP
                    535: The
                    536: .CW copy
                    537: command has a
                    538: .CW thru
                    539: parameter that allows each line of a file to
                    540: be treated as though it were a macro call, with
                    541: the first field serving as
                    542: the first argument,
                    543: and so on.
                    544: This is the typical
                    545: \*g
                    546: mechanism for plotting files that are not stored as
                    547: time series or as $(x,y)$ pairs.
                    548: We will illustrate its use on the file
                    549: .CW states.d ,
                    550: which contains data on the fifty states.
                    551: .P1
                    552: .d states.d
                    553: .P2
                    554: The first field is the postal abbreviation of the state's
                    555: name (Alaska, Wyoming, Vermont, ...), the second field
                    556: is the number of Representatives to Congress from the state
                    557: after the 1981 reapportionment, and the third field is
                    558: the population of the state as measured in the 1980 Census.
                    559: The states appear in increasing order of
                    560: population.
                    561: .PP
                    562: We will first plot this data as
                    563: population, representative pairs.
                    564: (In the
                    565: .CW coord
                    566: statement,
                    567: .CW "log log"
                    568: is a synonym for
                    569: .CW "log x log y" .)
                    570: .P1
                    571: .get states1.g
                    572: .P2
                    573: Although the population is given in persons,
                    574: the
                    575: .CW PlotState
                    576: macro
                    577: plots the population in millions by dividing
                    578: the third input field
                    579: by one million (written in exponential notation
                    580: as
                    581: .CW 1e6 ,
                    582: for $1 times 10 sup 6$).
                    583: .grap states1.g
                    584: Using
                    585: .CW circle
                    586: as a plotting symbol displays
                    587: overlapping points that are obscured when
                    588: the data is plotted with bullets.
                    589: The representation of a state is roughly proportional
                    590: to its population, except in the very small states.
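                         .PP
                         A hedged sketch of such a program follows.
                         The axis ranges are assumptions chosen to bracket the data,
                         and the macro body is inferred from the description above
                         rather than copied from
                         .CW states1.g .
                         .EQ
                         delim %%
                         .EN
                         .P1
                         \&.G1
                         coord x .3,30 y .8,50 log log
                         define PlotState { circle at $3/1e6, $2 }
                         copy "states.d" thru PlotState
                         \&.G2
                         .P2
                         .EQ
                         delim $$
                         .EN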
                    591: .PP
                    592: Our next plot will use the state's rank
                    593: in population as the $x$-coordinate and two
                    594: different $y$-coordinates: population and number of
                    595: representatives.
                    596: We will use two
                    597: .CW coord
                    598: commands to define the two coordinate systems
                    599: .CW pop
                    600: and
                    601: .CW rep .
                    602: We then explicitly give the coordinate system
                    603: whenever we refer to a point,
                    604: both in constructing axes and plotting data.
                    605: .P1
                    606: .get states2.g
                    607: .P2
                    608: The
                    609: .CW copy
                    610: statement in the program uses an
                    611: .I "immediate macro"
                    612: enclosed in curly brackets and thus avoids having to
                    613: name a macro for this task.
                    614: Because the program assumes that the states are
                    615: sorted in increasing order of population, it
                    616: generates
                    617: .CW thisrank
                    618: internally as a
                    619: \*g
                    620: variable.
                    621: The program produces
                    622: .grap states2.g
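                         .PP
                         The idea can be sketched as follows.
                         The ranges and plotting symbols are assumptions, not the contents of
                         .CW states2.g ;
                         only the names
                         .CW pop ,
                         .CW rep
                         and
                         .CW thisrank
                         come from the text.
                         .EQ
                         delim %%
                         .EN
                         .P1
                         coord pop x 0,51 y .2,30 log y
                         coord rep x 0,51 y .3,100 log y
                         thisrank = 50
                         copy "states.d" thru {
                             bullet at pop thisrank, $3/1e6
                             square at rep thisrank, $2
                             thisrank = thisrank - 1
                         }
                         .P2
                         .EQ
                         delim $$
                         .EN
                         Because the file is sorted by increasing population,
                         the first line read gets rank 50 and the last gets rank 1.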
                    623: .PP
                    624: The plotting symbols were chosen for contrast in
                    625: both shape and shading.
                    626: This graph also indicates that representation is proportional
                    627: to population.
                    628: Once we see this graph, though, we should realize that we don't
                    629: really need two coordinate systems: we can relate the two by
                    630: dividing the population of the U.S. \(em about 226,000,000 \(em by
                    631: the number of representatives \(em 435 \(em to see that each
                    632: representative should count as 520,000 people.
                    633: If the purpose of this graph were to tell a story about
                    634: American politics rather than to illustrate
                    635: multiple coordinate systems,
                    636: it should be redrawn with a single coordinate
                    637: system.
                    638: .PP
                    639: Many graphs plot both observed data and a function
                    640: that (theoretically) describes the data.
                    641: There are many ways to draw a function
                    642: in \*g:
                    643: a series of
                    644: .CW next
                    645: commands is tedious but works, as does writing a
                    646: simple program to write a data file that is subsequently
                    647: read and plotted by \*g.
                    648: The
                    649: .CW for
                    650: statement often provides a better solution.
                    651: This
                    652: \*g
                    653: program
                    654: .P1
                    655: .get sin1.g
                    656: .P2
                    657: produces
                    658: .grap sin1.g
                    659: .a
                    660: The
                    661: .CW for
                    662: statement uses the same syntax as the
                    663: .CW ticks
                    664: statement, but the
                    665: .CW from
                    666: keyword can be replaced by
                    667: .CW = '', ``
                    668: which will look more familiar to programmers.
                    669: It varies the index variable over the specified range
                    670: and for each value executes all statements inside the delimiter
                    671: characters, which use the same rules as macro
                    672: delimiters.
                    673: It is, of course, useful for many tasks beyond plotting functions.
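                         .PP
                         A sketch of such a loop, assuming a sine curve sampled every 0.1 radian
                         (the file
                         .CW sin1.g
                         may differ in detail):
                         .P1
                         \&.G1
                         frame ht 1 wid 3
                         draw solid
                         pi = atan2(0,-1)
                         for i from 0 to 2*pi by .1 do { next at i, sin(i) }
                         \&.G2
                         .P2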
                    674: .EQ
                    675: delim %%
                    676: .EN
                    677: .PP
                    678: The
                    679: .CW if
                    680: statement provides a simple mechanism for conditional execution.
                    681: If a file contains data on both cities and states (and lines
                    682: describing states have ``S'' in the first field), it could be plotted
                    683: by statements like
                    684: .P1
                    685: if "$1" == "S" then {
                    686:        PlotState($2,$3,$4)
                    687: } else {
                    688:        PlotCity($2,$3,$4,$5,$6)
                    689: }
                    690: .P2
                    691: The
                    692: .CW else
                    693: clause
                    694: is optional; delimiters use the same rules as macros and
                    695: .CW for
                    696: statements.
                    697: .EQ
                    698: delim $$
                    699: .EN
                    700: .NH
                    701: A Collection of Examples
                    702: .PP
                    703: The previous section covered the
                    704: \*g
                    705: commands that are used in common graphs.
                    706: In this section we'll spend less time on
                    707: language features, and survey a wider variety of
                    708: graphs.
                    709: These examples are intended more for browsing and
                    710: reference than for straight-through reading.
                    711: Be prepared to refer to the manual in Section 5 when you stumble over a new
                    712: \*g
                    713: feature.
                    714: .PP
                    715: The file
                    716: .CW cars.d
                    717: contains the mileage (miles per gallon) and the weight
                    718: (pounds) for 74 models of automobiles sold in the United States
                    719: in the 1979 model year.
                    720: .P1
                    721: .d cars.d
                    722: .P2
                    723: The trivial
                    724: \*g
                    725: program
                    726: .P1
                    727: .get cars1.g
                    728: .P2
                    729: produces
                    730: .grap cars1.g
                    731: This graph shows that weights bottom out somewhat
                    732: below 2000
                    733: pounds and that heavier cars get worse mileage;
                    734: it is hard to say much more about the relationship
                    735: between weight and mileage.
                    736: .PP
                    737: The next graph provides labels, uses circles
                    738: to expose data hidden in the clouds of bullets,
                    739: and re-expresses the $x$-axis in gallons per mile.
                    740: It also changes the point size and vertical spacing
                    741: to a size appropriate for camera-ready journal articles
                    742: and books; the size changes should be made outside the
                    743: \*g
                    744: program.
                    745: The
                    746: .CW \&.ft
                    747: command changes to a Helvetica font, which
                    748: some people prefer for graphs.
                    749: .P1
                    750: .get cars2.g
                    751: .P2
                    752: \*G
                    753: supports logarithmic re-expression of data with the
                    754: .CW log
                    755: clause in the
                    756: .CW coord
                    757: statement; any other re-expression of data must be done
                    758: with
                    759: \*g
                    760: arithmetic, as above.
                    761: .br
                    762: .grap cars2.g
                    763: This graph shows that
                    764: gallons per mile is roughly proportional to weight.
                    765: (The two outliers near 4000 pounds are the Cadillac
                    766: Seville and the Oldsmobile 98.)
                    767: .PP
                    768: In
                    769: .I "Visual Display of Quantitative Information" ,
                    770: Tufte proposes the ``dot-dash-plot'' as a means for maximizing
                    771: data ink (showing the two-dimensional distribution and
                    772: the two one-dimensional marginal distributions) while minimizing
                    773: what he calls ``chart junk'' \(em ink wasted on borders
                    774: and non-data labels.
                    775: His preference is easy to express in \*g:
                    776: .P1
                    777: .get cars3.g
                    778: .P2
                    779: Although the resulting graph is visually attractive, we do not find
                    780: it as useful for interpreting the data.
                    781: .grap cars3.g
                    782: Tufte's graph does point out two facts that are
                    783: not obvious in the previous graphs:
                    784: there is a gap in car weights near 3000 pounds (exhibited
                    785: by the hole in the $y$-axis ticks), and the gallons per
                    786: mile axis is regularly structured (the ticks
                    787: are the reciprocals of an almost dense sequence of integers).
                    788: The reader may decide whether those insights are worth
                    789: the decrease in clarity.
                    790: .PP
                    791: Throughout the twentieth century, horses, cars and people
                    792: have gotten faster;
                    793: let's study those improvements.
                    794: For horses, we'll consider the winning times
                    795: of the Kentucky Derby from 1909 to 1988, in
                    796: the file
                    797: .CW speedhorse.d :
                    798: .P1
                    799: .d speedhorse.d
                    800: .P2
                    801: The program
                    802: .P1
                    803: .get speedhorse1.g
                    804: .P2
                    805: produces the graph
                    806: .grap speedhorse1.g
                    807: Each race is recorded with a bullet and
                    808: record times are marked by horizontal lines.
                    809: Secretariat is the only horse to have run the
                    810: one-and-a-quarter-mile
                    811: race in under two minutes; he won in 1973 in
                    812: 1:59.4.
                    813: .PP
                    814: For automobiles we will study the
                    815: world land speed record (even though those vehicles
                    816: are by now just low-flying airplanes).
                    817: The file
                    818: .CW speedcar.d
                    819: lists years in which speed records were set and the record
                    820: set in that year, in miles per hour averaged over a one-mile
                    821: course.
                    822: .P1
                    823: .d speedcar.d
                    824: .P2
                    825: We will plot the data with the following
                    826: \*g
                    827: program, which uses nested braces in the
                    828: .CW copy
                    829: and
                    830: .CW if
                    831: statements.
                    832: .P1
                    833: .get speedcar1.g
                    834: .P2
                    835: .PP
                    836: Each record line is drawn after the
                    837: .I next
                    838: record is read, because
                    839: the program must know when the record was broken to draw
                    840: its line.
                    841: The
                    842: .CW if
                    843: statement handles the first record, and the extra
                    844: .CW line
                    845: command extends the last record out to the current date.
                    846: .grap speedcar1.g
                    847: The horizontal lines reflect the nature of world records: they
                    848: last until they are broken.
                    849: The records could also have been plotted by a scatterplot
                    850: in which each point represents the setting of a record,
                    851: but it would be misleading to connect adjacent
                    852: points with line segments
                    853: (which we inappropriately did in the graphs
                    854: of the Olympic 400 meter run).
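.PP
The delayed drawing described above (a record's line can be drawn only
once the next record is known) can be sketched outside
\*g
as a short Python function.
This is an illustration only, not the program in
.CW speedcar1.g :
the name
.CW record_segments ,
the tuple layout and the end-year argument are assumptions made for the sketch.

```python
def record_segments(records, end_year):
    """Return horizontal segments ((x0, y), (x1, y)) for a list of records.

    records: (year, value) pairs sorted by year; each value stands
    until the next record is set. end_year closes the final segment.
    """
    segments = []
    previous = None
    for year, value in records:
        if previous is not None:
            # Only now do we know when the previous record was broken.
            prev_year, prev_value = previous
            segments.append(((prev_year, prev_value), (year, prev_value)))
        previous = (year, value)
    if previous is not None:
        # The record that still stands is extended out to end_year.
        prev_year, prev_value = previous
        segments.append(((prev_year, prev_value), (end_year, prev_value)))
    return segments
```

Given the made-up records (1900, 10), (1910, 15) and (1930, 30) and an end year of 1950,
the function returns three segments, the last running from 1930 to 1950 at height 30.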
                    855: .PP
                    856: The following graph shows the world record times for the
                    857: one mile run;
                    858: because its
                    859: \*g
                    860: program is so similar to its automotive counterpart,
                    861: we won't show the program or data.
                    862: .grap speedman1.g
                    863: The three graphs show three different kinds of
                    864: changes.
                    865: Although horses are getting faster, they appear to
                    866: be approaching a barrier near two minutes.
                    867: Cars show great jumps as new technologies are introduced
                    868: followed by a plateau as limits of the
                    869: technology are reached.
                    870: Milers have shown a fairly consistent
                    871: linear improvement
                    872: over this century, but there must be an
                    873: asymptote down there somewhere.
                    874: .PP
                    875: The next file gives the median heights of boys
                    876: in the United States aged 2 to 18, together with
                    877: the fifth and ninety-fifth percentiles.
                    878: .P1
                    879: .d boyhts.d
                    880: .P2
                    881: The heights are given in centimeters (1 foot = 30.48 centimeters).
                    882: The trivial program
                    883: .P1
                    884: .get boyhts1.g
                    885: .P2
                    886: displays the data as
                    887: .grap boyhts1.g
                    888: Because there are four numbers on each input line, the first is
                    889: taken as an $x$-value and the remaining three are plotted
                    890: as $y$-values.
                    891: .PP
                    892: The three curves appear to be roughly straight
                    893: (at least up to age 16),
                    894: so it makes sense to fit a line
                    895: through them.
                    896: We will use the standard least squares regression
                    897: in which
                    898: .EQ
                    899: slope ~=~ {
                    900: {n SIGMA x y ~ - ~ SIGMA x SIGMA y }
                    901: over
                    902: {n SIGMA x sup 2 ~ - ~ ( SIGMA x ) sup 2 }
                    903: }
                    904: .EN
                    905: (where the summations range over all $n$ $x$ and $y$ values
                    906: in the data set) and the $y$-intercept is
                    907: .EQ
                    908: {SIGMA y ~ - ~ slope times SIGMA x} over n
                    909: .EN
                    910: The following
                    911: \*g
                    912: program boldly (and rather foolishly) implements that formula.
                    913: .P1
                    914: .get boyhts3.g
                    915: .P2
                    916: It plots the fifth and ninety-fifth percentiles as the ends of a bar through
                    917: the median, which is plotted as a bullet.
                    918: All heights are converted to feet before plotting and calculating
                    919: the regression line.
                    920: .grap boyhts3.g
                    921: .PP
                    922: \*G
                    923: .CW print
                    924: statements write on
                    925: .CW stderr
                    926: as they are processed by \*g;
                    927: their single argument can be either an expression or a string.
                    928: The
                    929: .CW print
                    930: statements (which are commented out in
                    931: the above
                    932: \*g
                    933: program) at one time
                    934: showed that the regression line is
                    935: .EQ
                    936: Height ~ in ~ Feet ~ = ~ 2.61 ~ + ~ .19 times Age
                    937: .EN
                    938: Thus for most American
                    939: boys between 3 and 16, you may safely assume
                    940: that they started out life at 2 feet 7 inches and grew at the
                    941: rate of two and a quarter inches per year.
                    942: .PP
                    943: This program probably misapplies \*g;
                    944: if you really want to perform least squares regressions on
                    945: data, you should usually use a simple
                    946: .I awk
                    947: program like
                    948: .P1
                    949: .get regress.awk
                    950: .P2
                    951: (Be warned, though, that this program is not numerically
                    952: robust.)
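.PP
For comparison, here is a minimal Python sketch of the same slope and
intercept formulas.
It is not the
.CW regress.awk
program; the function name
.CW least_squares
and the list-of-pairs input are assumptions.
It uses sums centered on the means, which are algebraically equal to the
formula given earlier but avoid subtracting two large, nearly equal
quantities, one common source of the numerical trouble mentioned above.

```python
def least_squares(points):
    """Fit y = intercept + slope * x to (x, y) pairs by least squares."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centered sums: algebraically equal to
    # (n*Sxy - Sx*Sy) / (n*Sxx - Sx**2), but better behaved
    # when the x values are large and close together.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    # Same as (Sy - slope * Sx) / n.
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For points lying exactly on the line y = 1 + 2x, the function returns a
slope of 2 and an intercept of 1, up to roundoff.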
                    953: .PP
                    954: While we're on the subject of fitting straight lines to data,
                    955: we'll redraw three graphs from J. W. Tukey's
                    956: .I "Exploratory Data Analysis" .
                    957: The file
                    958: .CW usapop.d
                    959: records the population of the United States
                    960: in millions at ten-year intervals.
                    961: .P1
                    962: .d usapop.d
                    963: .P2
                    964: Tukey's first two graphs indicate that the later population
                    965: growth was linear while the early growth was exponential.
                    966: The following
                    967: \*g
                    968: program plots them as a pair, using
                    969: .CW graph
                    970: commands to place internally unrelated graphs adjacent to
                    971: one another.
                    972: .P1
                    973: .get usapop1.g
                    974: .P2
                    975: The statements defining each graph are indented for clarity.
                    976: The second graph has the northern point of its frame 0.05
                    977: inch below the southern point of the frame of the first graph;
                    978: the
                    979: .CW with
                    980: clause is passed directly through to
                    981: .I pic
                    982: without being evaluated for macros or expressions.
                    983: The names of both graphs begin with capital letters to
                    984: conform to
                    985: .I pic
                    986: syntax for labels.
                    987: .grap usapop1.g
                    988: .PP
                    989: Polynomial functions lie between the linear and exponential
                    990: functions; Tukey shows how a seventh-degree polynomial provides
                    991: a better (and longer) fit to the early population growth.
                    992: .P1
                    993: .get usapop2.g
                    994: .P2
                    995: This program re-expresses the $x$-axis with
                    996: \*g
                    997: arithmetic and uses an
                    998: .CW if
                    999: statement to graph only part of the data file.
                   1000: It produces
                   1001: .grap usapop2.g
                   1002: .nr k \n%
                   1003: The
                   1004: .I eqn
                   1005: .CW "space 0"
                   1006: clause is necessary to keep
                   1007: .I eqn
                   1008: from adding extra space that would interfere
                   1009: with positions computed by \*g;
                   1010: see Section 4.
                   1011: .PP
                   1012: The file
                   1013: .CW army.d
                   1014: contains four related time series
                   1015: describing the United States Army.
                   1016: .P1
                   1017: .d army.d
                   1018: .P2
                   1019: The first field is the year; the next four fields give
                   1020: the number of male officers, female officers, enlisted males
                   1021: and enlisted females, each in thousands.
                   1022: (Actually, there were no female enlisted personnel in the
                   1023: Army until 1943; the value 1 in 1940 and 1942 is just
                   1024: a placeholder, since
                   1025: \*g
                   1026: has no mechanism for handling missing data.)
                   1027: The following
                   1028: \*g
                   1029: program draws the four series with four different sets of
                   1030: .CW draw
                   1031: and
                   1032: .CW next
                   1033: commands.
                   1034: .P1
                   1035: .get army1.g
                   1036: .P2
                   1037: The program labels the lines by
                   1038: .CW copy ing
                   1039: immediate data;
                   1040: the program is therefore shorter to write and easier to change.
                   1041: The delimiter string
                   1042: .CW XXX
                   1043: in the
                   1044: .CW until
                   1045: clause could be deleted in this graph: the
                   1046: .CW \&.G2
                   1047: line also denotes the end of data.
                   1048: Even though that string is enclosed in quotes,
                   1049: it may not contain spaces.
                   1050: The $y$-positions of the labels are the
                   1051: result of several iterations.
                   1052: .grap army1.g
                   1053: .PP
                   1054: This data can tell many stories: the buildup during the
                   1055: Second World War is obvious, as is the exodus after the
                   1056: war; increases during Korea and Vietnam are
                   1057: also apparent.
                   1058: We will consider a different story: the ratio of
                   1059: enlisted men to the three other classes of personnel.
                   1060: There are several ways to plot this data
                   1061: (the most obvious graph uses three time series showing how
                   1062: the ratios change over time, and is
                   1063: left as an exercise for the reader).
                   1064: .PP
                   1065: We will instead construct a graph that gives little insight into this
                   1066: data, but illustrates a general method that is quite useful
                   1067: in conjunction with \*g.
                   1068: The graph is a ``scatterplot vector'' that shows how one
                   1069: variable (the number of enlisted men) varies as a function of
                   1070: the other three.
                   1071: Breaking with tradition, we first show the final graphs, all
                   1072: of which have logarithmic scales.
                   1073: .grap army2.g
                   1074: The number of enlisted men is almost linearly
                   1075: related to the number of male officers, is somewhat related to the number
                   1076: of female officers, and varies widely as a function of the number
                   1077: of enlisted women.
                   1078: .PP
                   1079: Much more interesting than the graph itself is the method we used to
                   1080: produce it.
                   1081: We wrote a miniature ``compiler'' that accepts as
                   1082: its ``source language'' a description of a scatterplot vector and
                   1083: produces as ``object code'' a
                   1084: \*g
                   1085: program to draw the graph.
                   1086: The source program for the above example is
                   1087: .P1
                   1088: .get army2.v
                   1089: .P2
                   1090: The program lists several
                   1091: global attributes of the graph, the
                   1092: $y$-variable to be plotted, and as many $x$-variables as
                   1093: are desired; with each variable is its field in the file
                   1094: and a descriptive string.
                   1095: The language is ``compiled'' by the following
                   1096: .I awk
                   1097: program.
                   1098: .P1
                   1099: .get scatvec.awk
                   1100: .P2
                   1101: Running this program on the above description produces the following
                   1102: output, which is typically piped directly to \*g.
                   1103: .P1
                   1104: .get army2.g
                   1105: .P2
                   1106: The generated program uses the
                   1107: .I pic
                   1108: trick of re-using the same name
                   1109: .CW A ) (
                   1110: for several objects.
                   1111: .PP
                   1112: Although the program above is merely a toy,
                   1113: ``minicompilers'' can produce useful preprocessors
                   1114: for \*g.
                   1115: The
                   1116: .CW scatmat
                   1117: program, for instance, is a 90-line
                   1118: .I awk
                   1119: program that reads a simple input language and produces as
                   1120: output a
                   1121: \*g
                   1122: program to produce a ``scatterplot matrix'', which
                   1123: is a handy graphical device for spotting pairwise interactions
                   1124: among several variables.
                   1125: If
                   1126: \*g
                   1127: lacks a feature you desire, consider building
                   1128: a simple preprocessor to provide it.
                   1129: An alternative is to define
                   1130: macros for the task; which approach is best depends
                   1131: strongly on the job you wish to accomplish.
                   1132: .PP
                   1133: The next graph uses iterators to make a graph without
                   1134: reading data from a file.
                   1135: Rather, its ``data'' is a
                   1136: function of two variables
                   1137: that describes a
                   1138: derivative field and a function of one variable
                   1139: that describes one solution to the differential
                   1140: equation.
                   1141: .P1
                   1142: .get ode1.g
                   1143: .P2
                   1144: The left label uses
                   1145: .I eqn
                   1146: text between the $font CW "$$"$ delimiters.
                   1147: The variable
                   1148: .CW scale
                   1149: ensures that all lines in the direction field are the same
                   1150: length.
                   1151: The
                   1152: .CW in
                   1153: clauses in the
                   1154: .CW ticks
                   1155: statements specify that the ticks go in zero inches
                   1156: to avoid overprinting.
                   1157: The variables
                   1158: .CW tx
                   1159: and
                   1160: .CW ty
                   1161: are so named because
                   1162: .CW x
                   1163: and
                   1164: .CW y
                   1165: are reserved words for the
                   1166: .CW coord
                   1167: statement.
                   1168: .grap ode1.g
                   1169: .PP
                   1170: Programmers familiar with floating point arithmetic may be
                   1171: surprised that the above graph is correct.
                   1172: Because of roundoff error, iteration
                   1173: .CW "from 0 to 1 by .05" '' ``
                   1174: usually produces the values
                   1175: $0, ~ .05, ~ .10, ~ ..., ~ .95$ and stops short of 1.
                   1176: \*G
                   1177: uses a ``fuzzy test''
                   1178: in the
                   1179: .CW for
                   1180: statement to avoid that problem, which may in turn introduce
                   1181: other problems.
                   1182: Such problems may be avoided by iterating over an integer range
                   1183: and incrementing a non-integer value within the loop.
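.PP
Both approaches can be illustrated in Python, which uses the same kind of
binary floating point.
This is a sketch under stated assumptions, not a description of how
\*g
is implemented: the function names are hypothetical, the step is assumed
to be positive, and the tolerance in the second function is an arbitrary
choice.

```python
def steps_by_integer(start, stop, step):
    """Yield start, start+step, ... using an integer loop counter.

    Each value is computed directly from the counter, so roundoff
    does not accumulate and the iteration count is fixed up front.
    Assumes step > 0 and that stop - start is a whole number of steps.
    """
    count = round((stop - start) / step)
    for i in range(count + 1):
        yield start + i * step


def steps_with_fuzz(start, stop, step, fuzz=1e-6):
    """Accumulate by step, comparing against stop with a small tolerance.

    The tolerance (a fraction of step) is an arbitrary value chosen
    for this sketch. Assumes step > 0.
    """
    x = start
    limit = stop + fuzz * abs(step)
    while x <= limit:
        yield x
        x += step
```

Called with 0, 1 and .05, each function yields 21 values, the last of
which is 1 to within roundoff.
The integer version never depends on a floating-point comparison; the
fuzzy version does, and a badly chosen tolerance could admit one value too many.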
                   1184: .PP
                   1185: Most of the data we have seen so far is inherently
                   1186: two (or more) dimensional.
                   1187: As an example of one-dimensional data, we will return to
                   1188: the populations of the fifty states, which
                   1189: are the third field in the file
                   1190: .CW states.d
                   1191: introduced earlier;
                   1192: the file is sorted in increasing order of population.
                   1193: Our first graph takes the most space, but
                   1194: it also gives the most information.
                   1195: .P1
                   1196: .get states8.g
                   1197: .P2
                   1198: The
                   1199: .CW L
                   1200: macro (for Label)
                   1201: with input parameter $X$ evaluates to the number
                   1202: $2 sup X / 1,000,000$ followed by the string "$X$"
                   1203: (the
                   1204: .CW ticks
                   1205: command expects a number followed by a string label).
                   1206: .grap states8.g
                   1207: The dotted line is the least squares regression
                   1208: .EQ
                   1209: log sub 10 ~ Population ~ = ~ 7.214 ~ - ~ .03 times Rank
                   1210: .EN
                   1211: which gives 15.3 million as the population of the
                   1212: largest state and .515 million as the population
                   1213: of the smallest state.
                   1214: It says that
                   1215: population drops by a factor of two every ten states
                   1216: (compare the top and left scales).
                   1217: As sloppy as the exponential fit is, though, it is a much better
                   1218: fit to this data
                   1219: than a Zipf's Law curve is (drawing that curve is left as
                   1220: an exercise for the reader).
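.PP
The arithmetic behind those figures can be checked with a few lines of
Python.
The sketch assumes that rank 1 denotes the largest state and rank 50 the
smallest, and it uses the rounded coefficients printed above; the function
name is hypothetical.

```python
def fitted_population(rank, intercept=7.214, slope=-0.03):
    """Population (in people) predicted for a state of the given rank.

    Rank 1 is assumed to be the largest state; the defaults are the
    rounded coefficients quoted in the text.
    """
    return 10 ** (intercept + slope * rank)


largest = fitted_population(1) / 1e6    # roughly 15.3 (millions)
smallest = fitted_population(50) / 1e6  # roughly 0.52 (millions)
# Moving down ten ranks multiplies the population by 10**-0.3, about 0.5.
ratio = fitted_population(11) / fitted_population(1)
```

With the rounded coefficients the smallest state comes out near .52
million rather than exactly .515; the small difference presumably reflects
rounding in the printed slope and intercept.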
                   1221: .PP
                   1222: The next graph is a more standard representation of
                   1223: one-dimensional data.
                   1224: .P1
                   1225: .get states3.g
                   1226: .P2
                   1227: The markers were chosen to be
                   1228: .CW vticks
                   1229: because they denote only an $x$-value.
                   1230: .grap states3.g
                   1231: .PP
                   1232: The next one-dimensional graph uses the state's name as
                   1233: its marker; to reduce overprinting the graph is ``jittered''
                   1234: by using a random number as a $y$-value.
                   1235: .P1
                   1236: .get states4.g
                   1237: .P2
                   1238: The function
                   1239: .CW rand()
                   1240: returns a pseudo-random real number chosen uniformly over the interval [0,1).
                   1241: .grap states4.g
                   1242: This graph is too cluttered; circles would have been
                   1243: a better choice as a plotting symbol (bullets, once again, would
                   1244: hide data).
                   1245: .PP
                   1246: Histograms are a standard way of presenting one-dimensional
                   1247: data in two-dimensional form.
                   1248: Our first step in building a histogram of the population
                   1249: data is the following
                   1250: .I awk
                   1251: program, which counts how many states are in each ``bin''
                   1252: of a million people.
                   1253: .P1
                   1254: .get states5.awk
                   1255: .P2
                   1256: The variable
                   1257: .CW bzs
                   1258: tells where bin zero starts; although it is zero in this
                   1259: graph, it might be 95 in a histogram
                   1260: of human body temperatures in degrees Fahrenheit.
                   1261: The program produces the following output in
                   1262: .CW states2.d :
                   1263: .P1
                   1264: .d states2.d
                   1265: .P2
                   1266: There are 12 states with population between 0 and 999,999,
                   1267: 5 states with population between 1,000,000 and 1,999,999,
                   1268: and so on.
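.PP
The binning step can be sketched in Python as follows.
This is an illustration rather than a transcription of
.CW states5.awk :
the function name, the default bin width and the output layout are
assumptions, and only the role of
.CW bzs
is taken from the description above.

```python
from collections import Counter


def bin_counts(values, bin_width=1_000_000, bzs=0):
    """Count how many values fall in each bin of the given width.

    Bin k covers bzs + k*bin_width up to (but not including)
    bzs + (k+1)*bin_width. Returns (bin_number, count) pairs sorted
    by bin number; empty bins are omitted.
    """
    counts = Counter((v - bzs) // bin_width for v in values)
    return sorted(counts.items())
```

For the made-up values 400,000, 999,999, 1,000,000 and 3,500,000 the
function reports two values in bin 0, one in bin 1 and one in bin 3.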
                   1269: .PP
                   1270: This
                   1271: \*g
                   1272: program uses three
                   1273: .CW line
                   1274: commands to plot each rectangle in the histogram.
                   1275: .P1
                   1276: .get states5.g
                   1277: .P2
                   1278: It produces
                   1279: .grap states5.g
                   1280: .PP
                   1281: The same file can be plotted in a
                   1282: more attractive (and more useful) form by
                   1283: .P1
                   1284: .get states6.g
                   1285: .P2
                   1286: which produces
                   1287: one of Bill Cleveland's ``dot charts'' or ``lolliplots'':
                   1288: .grap states6.g
                   1289: (We use
                   1290: .CW \e(bu ,
                   1291: the
                   1292: .I troff
                   1293: character for a bullet, rather than the built-in string, to
                   1294: get a larger size.)
                   1295: .PP
                   1296: Other histograms are possible.
                   1297: The following
                   1298: .I awk
                   1299: program
                   1300: .P1
                   1301: .get states7.awk
                   1302: .P2
                   1303: produces the file
                   1304: .CW states3.d
                   1305: .P1
                   1306: .d states3.d
                   1307: .P2
                   1308: which lists each state's abbreviation, bin number, and
                   1309: height within the bin.
                   1310: The
                   1311: \*g
                   1312: program
                   1313: .P1
                   1314: .get states7.g
                   1315: .P2
                   1316: reads that file to make the following histogram, in which
                   1317: the state names are used to display the heights of the bins.
                   1318: In each bin, the states occur in increasing order of
                   1319: population from bottom to top.
                   1320: .grap states7.g
                   1321: .PP
                   1322: The next data set is a run-time profile of an early version of \*g,
                   1323: created by compiling the program with the
                   1324: .CW -p
                   1325: option and running
                   1326: .CW prof
                   1327: after the program executed.
                   1328: .P1
                   1329: .d prof1.d
                   1330: .P2
                   1331: Although there were more than fifty procedures in the program, the
                   1332: top four time-hogs accounted for more than half of the run time.
                   1333: This file is difficult for
                   1334: \*g
                   1335: to deal with:
                   1336: even though
                   1337: .CW if
                   1338: statements would allow us to extract lines 2 through 11
                   1339: of the file, we could not remove the leading
                   1340: .CW _ 
                   1341: from a routine name or access the last field in a record.
                   1342: We will therefore process it with
                   1343: the following
                   1344: .I awk
                   1345: program.
                   1346: .P1
                   1347: .get prof1.awk
                   1348: .P2
                   1349: The program produces
                   1350: .P1
                   1351: .d prof2.d
                   1352: .P2
                   1353: We could even use the
                   1354: .I sh
                   1355: statement to execute the
                   1356: .I awk
                   1357: program from within \*g, which would make the latter entirely
                   1358: self-contained (see the reference manual for details).
                   1359: .PP
                   1360: We will display the data with this program.
                   1361: .P1
                   1362: .get prof1.g
                   1363: .P2
                   1364: Observe that the program knows nothing about the range of the data.
                   1365: It uses default ticks and a
                   1366: .CW frame
                   1367: statement with a computed height to achieve
                   1368: total data independence.
                   1369: .grap prof1.g
                   1370: This bar chart highlights the fact that most of the time spent by
                   1371: \*g
                   1372: is devoted to input and output.
                   1373: .PP
                   1374: J. W. Tukey's box and whisker plots
                   1375: represent the median, quartiles, and extremes of a
                   1376: one-dimensional distribution.
                   1377: The following
                   1378: \*g
                   1379: program defines a macro to draw a box plot, and then
                   1380: uses that shape to compare the distribution of heights of
                   1381: volcanoes with the distribution of heights of States of the Union.
                   1382: .P1
                   1383: .get box1.g
                   1384: .P2
                   1385: Boxes are one of many shapes used for the graphical
                   1386: representation of several quantities.
                   1387: If you use such shapes frequently then you should
                   1388: make a library file of their macros to
                   1389: .CW copy
                   1390: into your
                   1391: \*g
                   1392: programs.
                   1393: The above program produces
                   1394: .grap box1.g
                   1395: Even though the extreme heights are the same, state heights
                   1396: have a lower median and a greater spread.
                   1397: .PP
                   1398: Someday you may use
                   1399: \*g
                   1400: to prepare overhead transparencies, only to find that
                   1401: everything comes out too small.
                   1402: The following program illustrates some ways to get larger
                   1403: graphs.
                   1404: .P1
                   1405: .zzz slide1.g
                   1406: .P2
                   1407: The
                   1408: .CW ps
                   1409: and
                   1410: .CW vs
                   1411: commands preceding the graph set the text size to 14 points and
                   1412: the vertical spacing to 18 points; the two quantities are
                   1413: reset by the commands following the
                   1414: .CW .G2 .
                   1415: Such size changes should be made outside the
                   1416: \*g
                   1417: program, as mentioned earlier.
                   1418: The
                   1419: .CW 4
                   1420: following the
                   1421: .CW .G1
                   1422: stretches the graph (including
                   1423: \*g's
                   1424: estimate of the accompanying text) to be four inches wide;
                   1425: it is an alternative to altering the
                   1426: .CW frame
                   1427: command.
                   1428: The macro
                   1429: .CW blob
                   1430: is a plotting symbol that is much larger than
                   1431: .CW bullet ;
                   1432: the different name ensures that later references to
                   1433: .CW bullet
                   1434: are unaffected.
                   1435: The
                   1436: .I troff
                   1437: commands within the
                   1438: .CW blob
                   1439: string move the character down one-tenth of an em
                   1440: to center its plotting position (determined experimentally)
                   1441: and then reset the vertical position.
                   1442: The program produces this trivial (but large) graph.
                   1443: .br
                   1444: .grap slide1.g
                   1445: .NH
                   1446: Using Grap
                   1447: .PP
                   1448: The following are a few day-to-day matters that arise in using \*g.
                   1449: .NH 2
                   1450: Errors
                   1451: .PP
                   1452: \*G
                   1453: attempts to pinpoint input errors; for example,
                   1454: the input
                   1455: .P1
                   1456: \&.G1
                   1457: i = i + 1
                   1458: .P2
                   1459: results in this message on
                   1460: .CW stderr :
                   1461: .P1
                   1462: grap: syntax error near line 1, file -
                   1463:  context is
                   1464:        i = i >>>  + <<<  1
                   1465: .P2
                   1466: The error was noticed
                   1467: at the
                   1468: .CW + .
                   1469: Unfortunately, pinpointing is not the same as explaining:
                   1470: the real error is that the variable
                   1471: .CW i
                   1472: was not initialized.
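                         Assuming the intent was to count up from zero
                         (the starting value is our guess, chosen only for illustration),
                         initializing the variable first should be enough
                         to make the input acceptable:
                         .P1
                         \&.G1
                         i = 0
                         i = i + 1
                         \&.G2
                         .P2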
                   1473: .PP
                   1474: The ``words''
                   1475: .CW x
                   1476: and
                   1477: .CW y
                   1478: are reserved (for the
                   1479: .CW coord
                   1480: statement);
                   1481: you will get an equally inexplicable syntax error message if you use them
                   1482: as variable names.
                   1483: (This design is bad, but not nearly so bad as
                   1484: having the
                   1485: .CW log
                   1486: and
                   1487: .CW exp
                   1488: functions use base 10.)
                   1489: .PP
                   1490: \*G
                   1491: tries to load a file of standard macro definitions
                   1492: .CW /usr/lib/grap.defines ) (
                   1493: for terms like
                   1494: .CW bullet ,
                   1495: .CW plus ,
                   1496: etc.
                   1497: It doesn't complain if that file isn't found,
                   1498: but if you later use one of these words,
                   1499: you'll get a syntax error message.
                   1500: .PP
                   1501: Certain constructs suggested by analogy to
                   1502: .I pic
                   1503: do not work.
                   1504: For example,
                   1505: .CW .GS
                   1506: and
                   1507: .CW .GE
                   1508: would have been nicer than
                   1509: .CW .G1
                   1510: and
                   1511: .CW .G2 ,
                   1512: but they were already taken.
                   1513: The
                   1514: .I pic
                   1515: construct
                   1516: .P1
                   1517: \&.PS <file
                   1518: .P2
                   1519: has been superseded by 
                   1520: \*g's
                   1521: .CW copy
                   1522: command (which in turn has been retrofitted into
                   1523: .I pic ).
                   1524: .NH 2
                   1525: \fITroff\fP issues
                   1526: .PP
                   1527: You may use
                   1528: .I troff
                   1529: commands like
                   1530: .CW .ps
                   1531: or
                   1532: .CW .ft
                   1533: to change text sizes and fonts within a graph,
                   1534: or use balanced
                   1535: .CW \es
                   1536: and
                   1537: .CW \ef
                   1538: commands within a string.
                   1539: Do not, however,
                   1540: add space
                   1541: .CW .sp ) (
                   1542: or change the line spacing
                   1543: .CW .vs , (
                   1544: .CW .ls )
                   1545: within a graph.
                   1546: Some defined terms like
                   1547: .CW bullet
                   1548: contain embedded size changes;
                   1549: further qualifying them with
                   1550: \*g
                   1551: .CW size
                   1552: commands may not always work.
                   1553: .PP
                   1554: Because
                   1555: \*g
                   1556: is built on top of
                   1557: .I pic ,
                   1558: the following quote from the
                   1559: .I pic
                   1560: manual is relevant:
                   1561: ``There is a subtle problem with complicated equations inside
                   1562: .I pic
                   1563: pictures \(em they come out wrong if
                   1564: .I eqn
                   1565: has to leave extra vertical space for the equation.
                   1566: If your equation involves more than subscripts and superscripts,
                   1567: you must add to the beginning of each such equation the extra information
                   1568: .CW "space 0" ''.
                   1569: This feature was illustrated in the graph of the
                   1570: United States population in Section 3.
                   1571: .NH 2
                   1572: Alternatives
                   1573: .PP
                   1574: Besides
                   1575: \*g
                   1576: and your local draftsperson, what other choices are there?
                   1577: .PP
                   1578: The S system |reference(slanguage chambers) provides
                   1579: a host of tools for statistical analysis,
                   1580: but somewhat fewer tools than
                   1581: \*g
                   1582: for producing document-quality graphs.
                   1583: S produces graphs on the screen of a DMD 5620 terminal much more quickly than
                   1584: \*g
                   1585: (often in seconds rather than minutes), but it
                   1586: takes somewhat longer to learn (at least for us).
                   1587: If you expect to do a lot of interactive data analysis, then
                   1588: S is probably the right tool for you.
                   1589: S may be used to generate 
                   1590: .I pic
                   1591: commands.
                   1592: .PP
                   1593: The standard UNIX program
                   1594: .I graph
                   1595: provides many of the basic features of
                   1596: \*g,
                   1597: though with quite a bit less control over details, particularly
                   1598: text.
                   1599: It produces output only in the
                   1600: .UX
                   1601: .I plot (5)
                   1602: language,
                   1603: which may be processed by a variety of filters
                   1604: for a variety of output devices.
                   1605: .PP
                   1606: The original
                   1607: .UX
                   1608: typesetter graphics programs are
                   1609: .I pic
                   1610: and
                   1611: .I ideal ;
                   1612: you may be able to do as well without using
                   1613: \*g
                   1614: as an intermediary.
                   1615: In particular,
                   1616: .I ideal
                   1617: provides shading and clipping,
                   1618: which are useful
                   1619: in presentation-quality bar charts and the like, but are
                   1620: well beyond the capabilities of 
                   1621: .I pic .
                   1622: .EQ
                   1623: delim $$
                   1624: .EN
                   1625: .NH
                   1626: References
                   1627: .LP
                   1628: |reference_placement
                   1629: .NH
                   1630: Reference Manual
                   1631: .PP
                   1632: In the following, 
                   1633: .I italic
                   1634: terms are syntactic categories,
                   1635: .CW typewriter
                   1636: terms are literals,
                   1637: parenthesized constructs are optional, and ... indicates repetition.
                   1638: In most cases, the order of statements,
                   1639: constructs and attributes is immaterial.
                   1640: .P1
                   1641: .IT "grap program" :
                   1642:        .G1 \f2(width in inches)\fP
                   1643:        \f2grap statement\fP
                   1644:        ...
                   1645:        .G2
                   1646: .P2
                   1647: A width on the
                   1648: .CW .G1
                   1649: line overrides the computed width, as in
                   1650: .I pic .
                   1651: .P1
                   1652: .IT "grap statement" :
                   1653: .I
                   1654:          frame \(or label \(or coord \(or ticks \(or grid \(or plot \(or line \(or circle \(or draw \(or new \(or next
                   1655:        \(or graph \(or numberlist \(or copy \(or for \(or if \(or sh \(or pic \(or assignment \(or print
                   1656: .ft
                   1657: .P2
                   1658: .PP
                   1659: The
                   1660: .CW frame
                   1661: statement defines the frame that surrounds the graph:
                   1662: .P1
                   1663: .IT frame :
                   1664:        frame \f2(\fPht \f2expr)\fP \f2(\fPwid \f2expr)\fP \f2((side) linedesc)\fP \f2...\fP
                   1665: .IT side :
                   1666:        top \(or bot \(or left \(or right
                   1667: .IT linedesc :
                   1668:        solid \(or invis \(or dotted \f2(expr)\fP \(or dashed \f2(expr)\fP
                   1669: .P2
                   1670: Height and width default to 2 and 3 inches;
                   1671: sides default to solid.
                   1672: If
                   1673: .I side
                   1674: is omitted, the
                   1675: .I linedesc
                   1676: applies to the entire frame.
                   1677: The optional expressions after
                   1678: .CW dotted
                   1679: and
                   1680: .CW dashed
                   1681: change the spacing exactly as in
                   1682: .I pic .
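                         .PP
                         As a small sketch with arbitrary dimensions,
                         the following statement asks for a frame 1.5 inches high
                         and 3 inches wide, with the top and right sides invisible;
                         the other two sides keep the default solid style.
                         .P1
                         frame ht 1.5 wid 3 top invis right invis
                         .P2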
                   1683: .PP
                   1684: The
                   1685: .CW label
                   1686: statement places a label on a specified side:
                   1687: .P1
                   1688: .IT label :
                   1689:        label \f2side\fP \f2strlist\fP \f2...\fP \f2shift\fP
                   1690: .IT shift :
                   1691:        left\f2 \(or \fPright\f2 \(or \fPup\f2 \(or \fPdown \f2expr ...\fP
                   1692: .IT strlist :
                   1693:        \f2str ... (\fPrjust\f2 \(or \fPljust\f2 \(or \fPabove\f2 \(or \fPbelow\f2) ... (\fPsize \f2(\fP\(+-\f2) expr) ...\fP
                   1694: .IT str :
                   1695:        "\f2...\fP"
                   1696: .P2
                   1697: Lists of text strings are stacked vertically.
                   1698: In any context, string lists may contain clauses
                   1699: to adjust the position or change the point size.
                   1700: Each clause applies to the string preceding it
                   1701: and all following strings.
                   1702: Labels may also have a
                   1703: .CW width
                   1704: attribute, to override
                   1705: \*g's
                   1706: default computation.
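                         .PP
                         For illustration only (the label text is invented),
                         the first statement below stacks two strings on the left side
                         and shifts them left by .2;
                         the second sets the bottom label two points smaller
                         than the current size.
                         .P1
                         label left "Population" "(millions)" left .2
                         label bot "Year" size -2
                         .P2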
                   1707: .PP
                   1708: Normally the coordinate system is defined by the data,
                   1709: with 7 percent extra on each side.
                   1710: (To change that to 5 percent, assign 0.05 to the
                   1711: \*g
                   1712: variable
                   1713: .CW margin ,
                   1714: which is reset to 0.07 at each
                   1715: .CW .G1
                   1716: statement.)
                   1717: The
                   1718: .CW coord
                   1719: statement defines an overriding system:
                   1720: .P1
                   1721: .IT coord :
                   1722:        coord \f2(name)\fP \f2(\fPx \f2expr,expr)\fP \f2(\fPy \f2expr,expr)\fP \f2(\fPlog x \(or log y \(or log log\f2) \fP
                   1723: .P2
                   1724: Coordinate systems can be named;
                   1725: ranges, logarithmic scaling, etc., are done separately for each.
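                         .PP
                         A sketch with made-up ranges:
                         the first statement fixes the range of the default coordinate system,
                         and the second defines a separate system,
                         here given the hypothetical name
                         .CW big ,
                         with logarithmic scaling in both dimensions.
                         .P1
                         coord x 1785,1955 y 0,5
                         coord big x 1,1000 y 1,100000 log log
                         .P2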
                   1726: .PP
                   1727: The
                   1728: .CW ticks
                   1729: statement places tick marks on one side of the frame:
                   1730: .P1
                   1731: .IT ticks :
                   1732:        ticks \f2side\fP \f2(\fPin \(or out \f2(expr))\fP \f2(shift)  (tick-locations)\fP
                   1733: .IT tick-locations :
                   1734:          at \f2(name) expr (str)\fP, \f2expr (str)\fP, \f2...\fP
                   1735:        \(or from \f2(name) expr\fP to \f2expr\fP \f2(\fPby \f2(op) expr)\fP \f2(str)\fP
                   1736: .P2
                   1737: If no ticks are specified, they will be provided automatically;
                   1738: .CW ticks
                   1739: .CW off
                   1740: suppresses automatic ticks.
                   1741: The optional expression after
                   1742: .CW in
                   1743: or
                   1744: .CW out
                   1745: specifies the length of the ticks in inches.
                   1746: The optional name refers to a coordinate system.
                   1747: If
                   1748: .IT str
                   1749: contains
                   1750: format specifiers like
                   1751: .CW %f
                   1752: or
                   1753: .CW %g ,
                   1754: they are interpreted as by
                   1755: .CW printf .
                   1756: If no
                   1757: .IT str
                   1758: is supplied, the tick labels will be the values of the
                   1759: expressions.
                   1760: .PP
                   1761: If the
                   1762: .CW by
                   1763: clause is omitted, steps are of size 1.
                   1764: If the
                   1765: .CW by
                   1766: expression is preceded by one of
                   1767: .CW + ,
                   1768: .CW - ,
                   1769: .CW *
                   1770: or
                   1771: .CW / ,
                   1772: the step is scaled by that operator,
                   1773: e.g.,
                   1774: .CW *10
                   1775: means that each step is 10 times the previous one.
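                         .PP
                         The statements below are independent illustrations
                         with invented values, not a single program.
                         The first places outward ticks every 20 units from 1900 to 1980;
                         the second places three labeled inward ticks, each 0.05 inches long;
                         the third places ticks at 1, 10, 100 and 1000,
                         as one might want on a logarithmic axis.
                         .P1
                         ticks bot out from 1900 to 1980 by 20
                         ticks left in 0.05 at 0 "none", 50 "half", 100 "all"
                         ticks bot out from 1 to 1000 by *10
                         .P2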
                   1776: .PP
                   1777: The
                   1778: .CW grid
                   1779: statement produces grid lines along (i.e., perpendicular to)
                   1780: the named side.
                   1781: .P1
                   1782: .IT grid :
                   1783:        grid \f2side (linedesc) (shift)  (tick-locations)\fP
                   1784: .P2
                   1785: Grids are labeled by the same mechanism as
                   1786: .CW ticks .
                   1787: It is possible to draw grids without ticks by placing the phrase
                   1788: .CW ticks
                   1789: .CW off
                   1790: after the side name and before the iterator.
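                         .PP
                         As a sketch with invented ranges,
                         the first statement draws dotted grid lines
                         with the usual labels,
                         and the second draws grid lines without ticks.
                         .P1
                         grid bot dotted from 1900 to 1980 by 20
                         grid left ticks off from 0 to 100 by 25
                         .P2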
                   1791: .PP
                   1792: Plot
                   1793: statements place text at a point:
                   1794: .P1
                   1795: .IT plot :
                   1796:        \f2strlist\fP at \f2point\fP
                   1797:        plot \f2expr (str)\fP at \f2point\fP
                   1798: .IT point :
                   1799:        \f2(name) expr,expr\fP
                   1800: .P2
                   1801: As in the
                   1802: .CW label
                   1803: statement, the string list may contain
                   1804: position and size modifiers.
                   1805: The
                   1806: .CW plot
                   1807: statement uses the optional format string as in C's
                   1808: .CW printf
                   1809: statement \(em it may contain a
                   1810: .CW %f
                   1811: or
                   1812: .CW %g .
                   1813: The optional name refers to a coordinate system.
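                         .PP
                         Two illustrative statements with invented data:
                         the first places a left-justified string at a point,
                         and the second displays the value of an expression,
                         formatted to one decimal place.
                         .P1
                         "peak" ljust at 1861, 31.4
                         plot 31.4 "%.1f" at 1870, 31.4
                         .P2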
                   1814: .PP
                   1815: The
                   1816: .CW line
                   1817: statement draws a line or arrow from here to there:
                   1818: .P1
                   1819: .IT line :
                   1820:        \f2(\fPline \(or arrow\f2)\fP from \f2point\fP to \f2point (linedesc)\fP
                   1821: .P2
                   1822: The
                   1823: .CW circle
                   1824: statement draws a circle:
                   1825: .P1
                   1826: .IT circle :
                   1827:        circle at \f2point (\fPradius \f2expr)\fP
                   1828: .P2
                   1829: The radius is in inches; the default size is small.
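                         .PP
                         For example, with arbitrary coordinates,
                         these statements draw a dashed line, an arrow,
                         and a circle of radius 0.1 inches.
                         .P1
                         line from 0,0 to 10,5 dashed
                         arrow from 2,8 to 5,5
                         circle at 5,5 radius 0.1
                         .P2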
                   1830: .PP
                   1831: The 
                   1832: .CW draw
                   1833: statement defines a sequence of lines:
                   1834: .P1
                   1835: .IT draw :
                   1836:        draw \f2(name) linedesc (str)\fP
                   1837: .P2
                   1838: Subsequent data for the named sequence
                   1839: will be plotted as a line of the specified style,
                   1840: with the optional
                   1841: .IT str
                   1842: plotted at each point.
                   1843: The
                   1844: .CW next
                   1845: statement continues a sequence:
                   1846: .P1
                   1847: .IT next :
                   1848:        next \f2(name)\fP at \f2point (linedesc)\fP
                   1849: .P2
                   1850: If a line description is specified, it overrides the default
                   1851: display mode for the line segment ending at
                   1852: .I point .
                   1853: The
                   1854: .CW new
                   1855: statement starts a new sequence; it has the same format as the
                   1856: .CW draw
                   1857: statement.
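                         .PP
                         The following sketch uses two sequences with the hypothetical names
                         .CW boys
                         and
                         .CW girls
                         and invented data.
                         The last statement overrides the solid style
                         for the single segment that ends at its point.
                         .P1
                         draw boys solid
                         draw girls dashed
                         next boys at 1,3.2
                         next girls at 1,3.1
                         next boys at 2,4.0
                         next girls at 2,3.8
                         next boys at 3,5.1 dotted
                         .P2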
                   1858: .PP
                   1859: A line consisting of a set of numbers
                   1860: is treated as a family of points
                   1861: $x$, $y sub 1$, $y sub 2$, etc.,
                   1862: to be plotted at the single
                   1863: $x$ value.
                   1864: .P1
                   1865: .IT numberlist :
                   1866:        \f2number\fP ...
                   1867: .P2
                   1868: If there is only one number it is treated as
                   1869: a $y$ value, and $x$ values of 1, 2, 3, ...
                   1870: are supplied automatically.
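                         .PP
                         As an illustration with invented values,
                         each of the following input lines
                         supplies two points at one horizontal position:
                         .P1
                         1 3.2 3.1
                         2 4.0 3.8
                         3 5.1 4.9
                         .P2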
                   1871: .PP
                   1872: \*G 
                   1873: provides arithmetic with the operators
                   1874: .CW + ,
                   1875: .CW - ,
                   1876: .CW * ,
                   1877: .CW / ,
                   1878: and
                   1879: .CW ^ .
                   1880: Variables may be assigned to;
                   1881: assignments are expressions.
                   1882: Built-in functions include
                   1883: .CW log ,
                   1884: .CW exp
                   1885: (both base 10 \(em beware!),
                   1886: .CW int
                   1887: (truncates towards zero),
                   1888: .CW sin ,
                   1889: .CW cos 
                   1890: (both use radians),
                   1891: .CW atan2(dy,dx) ,
                   1892: .CW sqrt ,
                   1893: .CW min
                   1894: (two arguments only),
                   1895: .CW max
                   1896: (ditto),
                   1897: and
                   1898: .CW rand()
                   1899: (returns a random real number in the interval [0,1)).
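                         .PP
                         A few assignments, with variable names chosen only for this sketch,
                         show the conventions described above.
                         The first sets
                         .CW pi
                         to approximately 3.14159;
                         because
                         .CW exp
                         and
                         .CW log
                         use base 10, the next two yield 100 and 3;
                         .CW int
                         truncates -3.7 to -3;
                         and the last yields 1024.
                         .P1
                         pi = atan2(0, -1)
                         hundred = exp(2)
                         three = log(1000)
                         n = int(-3.7)
                         big = max(2^10, 1000)
                         .P2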
                   1900: .PP
                   1901: The
                   1902: .CW for
                   1903: statement provides a modest looping facility:
                   1904: .P1
                   1905: .IT for :
                   1906:        for \f2var\fP from \f2expr\fP to \f2expr (\fPby \f2(op) expr)\fP do { \f2anything\fP }
                   1907: .P2
                   1908: The string may contain internally balanced braces.
                   1909: Alternatively, any other character may appear immediately after the word
                   1910: .CW do ,
                   1911: and the string is terminated by the next occurrence of that character.
                   1912: The text
                   1913: .IT anything
                   1914: (which may contain newlines) is repeated as 
                   1915: .IT var
                   1916: takes on values from
                   1917: .IT expr1
                   1918: to
                   1919: .IT expr2 .
                   1920: As with tick iterators, the
                   1921: .CW by
                   1922: clause is optional, and may proceed arithmetically or multiplicatively.
                   1923: In a
                   1924: .CW for
                   1925: statement,
                   1926: the
                   1927: .CW from
                   1928: may be replaced by
                   1929: .CW = ''. ``
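                         .PP
                         Two small loops, offered as a sketch:
                         the first plots a mark at each of five points,
                         and the second steps multiplicatively, so that
                         .CW i
                         takes the values 1, 10, 100 and 1000.
                         The loop variable is named
                         .CW i
                         because
                         .CW x
                         and
                         .CW y
                         are reserved.
                         .P1
                         for i from 1 to 5 do { "+" at i, i*i }
                         for i = 1 to 1000 by *10 do { "+" at i, log(i) }
                         .P2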
                   1930: .PP
                   1931: The
                   1932: .CW if-then-else
                   1933: statement provides conditional evaluation:
                   1934: .P1
                   1935: .IT if :
                   1936:        if \f2expr\fP then { \f2anything\fP } else { \f2anything\fP }
                   1937: .P2
                   1938: The
                   1939: .CW else
                   1940: clause
                   1941: is optional.
                   1942: Relational and logical operators include
                   1943: .CW == ,
                   1944: .CW != ,
                   1945: .CW > ,
                   1946: .CW >= ,
                   1947: .CW < ,
                   1948: .CW <= ,
                   1949: .CW ! ,
                   1950: .CW || ,
                   1951: and
                   1952: .CW && .
                   1953: Strings may be compared with the operators
                   1954: .CW ==
                   1955: and
                   1956: .CW != .
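                         .PP
                         As a sketch, with the hypothetical variables
                         .CW t
                         and
                         .CW v
                         set to invented values,
                         the following plots one mark if
                         .CW v
                         lies in a range and a different mark otherwise.
                         .P1
                         t = 3
                         v = 7
                         if v >= 0 && v < 10 then { "+" at t, v } else { "o" at t, v }
                         .P2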
                   1957: .PP
                   1958: It is possible to convert numeric expressions to formatted strings:
                   1959: .P1
                   1960: sprintf("\f2format\fP", \f2expr\fP, \f2expr\fP, ...)
                   1961: .P2
                   1962: is equivalent to a quoted string in any context.
                   1963: Variants of
                   1964: .CW %f
                   1965: and
                   1966: .CW %g
                   1967: are the only sensible format conversions.
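                         .PP
                         For instance, assuming a variable
                         .CW v
                         introduced only for this sketch,
                         the second line below places the text 3.14 at a point,
                         and the third uses a formatted string as a label.
                         .P1
                         v = 3.14159
                         sprintf("%.2f", v) at 1, v
                         label bot sprintf("n = %g", 50)
                         .P2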
                   1968: .PP
                   1969: \*G
                   1970: provides the same macro processor that
                   1971: .I pic
                   1972: does:
                   1973: .P1
                   1974: define \f2macro-name\fP { \f2anything\fP }
                   1975: .P2
                   1976: .EQ
                   1977: delim %%
                   1978: .EN
                   1979: Subsequent occurrences of the macro name will be replaced
                   1980: by the string, with arguments of the form \f(CW$\fIn\fR
                   1981: replaced by corresponding actual arguments.
                   1982: Macro definitions persist across
                   1983: .CW .G2
                   1984: boundaries, as do values of variables.
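                         .PP
                         A minimal sketch, using the hypothetical macro name
                         .CW square :
                         after the two lines below,
                         .CW v
                         has the value 9.
                         The parentheses in the macro body guard against
                         surprises when the argument is itself an expression.
                         .P1
                         define square { ($1)*($1) }
                         v = square(3)
                         .P2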
                   1985: .EQ
                   1986: delim $$
                   1987: .EN
                   1988: .PP
                   1989: The
                   1990: .CW copy
                   1991: statement is somewhat overloaded:
                   1992: .P1
                   1993: copy "\f2filename\fP"
                   1994: .P2
                   1995: includes the contents of the named file at that point;
                   1996: .P1
                   1997: copy "\f2filename\fP" thru \f2macro-name\fP
                   1998: .P2
                   1999: copies the file through the macro; and
                   2000: .P1
                   2001: copy thru \f2macro-name\fP
                   2002: .P2
                   2003: copies subsequent lines through the macro;
                   2004: each number or quoted string is treated as an argument.
                   2005: In each case, copying continues until end of file or the next
                   2006: .CW .G2 .
                   2007: The optional clause
                   2008: .CW until
                   2009: .IT str
                   2010: causes copying to terminate when a line whose
                   2011: first field is
                   2012: .IT str
                   2013: occurs.
                   2014: In all cases, the macro can be specified inline rather than by name:
                   2015: .P1
                   2016: copy thru { \f2macro body\fP }
                   2017: .P2
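                         .PP
                         A minimal sketch of inline copying with an
                         .CW until
                         clause follows.
                         The data values and the terminating string
                         .CW END
                         are made up for illustration,
                         and the sketch assumes the
                         .CW until
                         clause may follow the macro body.
                         .EQ
                         delim off
                         .EN
                         .P1
                         copy thru { bullet at $1,$2 } until "END"
                         1 2.5
                         2 3.1
                         3 4.7
                         END
                         .P2
                         .EQ
                         delim $$
                         .EN
                         Each of the three data lines supplies two arguments to the macro;
                         copying stops at the line whose first field is
                         .CW END .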
                   2018: .PP
                   2019: The
                   2020: .CW sh
                   2021: command passes text through to the UNIX shell.
                   2022: .P1
                   2023: .IT sh :
                   2024:        sh { \f2anything\fP }
                   2025: .P2
                   2026: The body of the command is scanned for macros.
                   2027: The built-in macro
                   2028: .CW pid
                   2029: is a string consisting of the process identification number;
                   2030: it can be used to generate unique file names.
                   2031: .PP
                   2032: The
                   2033: .CW pic
                   2034: command passes text through to
                   2035: .I pic 
                   2036: with the 
                   2037: .CW pic '' ``
                   2038: removed; variables and macros are not evaluated.
                   2039: Lines beginning with a period (that are not numbers)
                   2040: are passed through literally, under the assumption that they
                   2041: are
                   2042: .I troff
                   2043: commands.
                   2044: .PP
                   2045: The
                   2046: .CW graph
                   2047: statement
                   2048: .P1
                   2049: .IT graph :
                   2050:        graph \f2Picname (pic-text)\fP
                   2051: .P2
                   2052: defines a new graph named
                   2053: .I Picname ,
                   2054: resetting all coordinate systems.
                   2055: If any
                   2056: .CW graph
                   2057: commands are used in a
                   2058: \*g
                   2059: program, then the statement after the
                   2060: .CW \&.G1
                   2061: must be a
                   2062: .CW graph
                   2063: command.
                   2064: The
                   2065: .I pic-text
                   2066: can be used to position this graph relative
                   2067: to previous graphs by referring to their
                   2068: .CW Frame s,
                   2069: as in
                   2070: .P1
                   2071:        graph First
                   2072:         ...
                   2073:        graph Second with .Frame.w at First.Frame.e + (0.1,0)
                   2074: .P2
                   2075: Macros and expressions in
                   2076: .I pic-text
                   2077: are not evaluated.
                   2078: .I Picname s
                   2079: must begin with a capital letter to satisfy 
                   2080: .I pic
                   2081: syntax.
                   2082: .PP
                   2083: The
                   2084: .CW print
                   2085: statement
                   2086: .P1
                   2087: .IT print :
                   2088:        print \f2(expr\fP \(or \f2str)\fP
                   2089: .P2
                   2090: writes on
                   2091: .CW stderr
                   2092: as
                   2093: \*g
                   2094: processes its input; it is sometimes useful for debugging.
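                         .PP
                         As a small sketch of this kind of debugging,
                         the fragment below prints a label and then the value of a variable;
                         the variable name
                         .CW slope
                         and the numbers are made up for illustration.
                         .P1
                         slope = (9 - 3) / (4 - 1)
                         print "slope:"
                         print slope
                         .P2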
                   2095: .PP
                   2096: Many reserved words have synonyms, such as
                   2097: .CW thru
                   2098: for
                   2099: .CW through ,
                   2100: .CW tick
                   2101: for
                   2102: .CW ticks ,
                   2103: and
                   2104: .CW bot
                   2105: for
                   2106: .CW bottom .
                   2107: .PP
                   2108: The
                   2109: .CW #
                   2110: introduces a comment, which ends at the end of the line.
                   2111: Statements may be continued over several lines by preceding each
                   2112: newline with a
                   2113: backslash character.
                   2114: Multiple statements may appear on a single line separated
                   2115: by semicolons.
                   2116: \*G
                   2117: ignores any line that is entirely blank, including those
                   2118: processed by
                   2119: .CW "copy thru"
                   2120: commands.
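                         .PP
                         The fragment below is a brief sketch of comments and of
                         several statements on one line;
                         the dimensions and label text are made up for illustration.
                         .P1
                         # size the frame, then label two sides
                         frame ht 2 wid 3        # dimensions in inches
                         label left "Rate"; label bot "Year"
                         .P2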
                   2121: .PP
                   2122: When
                   2123: \*g
                   2124: is first executed it reads standard macro definitions
                   2125: from the file
                   2126: .CW /usr/lib/grap.defines .
                   2127: The definitions include
                   2128: .CW bullet ,
                   2129: .CW plus ,
                   2130: .CW box ,
                   2131: .CW star ,
                   2132: .CW dot ,
                   2133: .CW times ,
                   2134: .CW htick ,
                   2135: .CW vtick ,
                   2136: .CW square ,
                   2137: and
                   2138: .CW delta .
