1: .so ../ADM/mac
2: .XX grap 109 "Grap \(em A Language for Typesetting Graphs"
3: .TR 114
4: .ND "Revised, May 1991
5: .EQ
6: delim $$
7: .EN
8: .so macros
9: .ds g \f2grap\fP
10: .ds G \f2Grap\fP
11: .TL
12: Grap \(em A Language for Typesetting Graphs
13: .br
14: Tutorial and User Manual
15: .AU
16: Jon L. Bentley
17: Brian W. Kernighan
18: .AI
19: .MH
20: .AB
21: \*G
22: is a language for describing plots of data.
23: This graph of the 1984
24: age distribution in the United States
25: .grap agepop1.g
26: is produced by the
27: \*g
28: commands
29: .P1
30: .get agepop1.g
31: .P2
32: (Each line in the data file
33: .CW agepop.d
34: contains an age and the number of Americans of that
35: age alive in 1984; the file is sorted by age.)
36: .PP
37: The
38: \*g
39: preprocessor works with
40: .I pic |reference(latest pic)
41: and
42: .I troff |reference(latest troff reference).
43: Most of its input is passed
44: through untouched, but statements between
45: .CW .G1
46: and
47: .CW .G2
48: are translated into
49: .I pic
50: commands that draw graphs.
51: .AE
52: .NH
53: Introduction
54: .PP
55: \*G
56: is a language for describing graphical
57: displays of data.
58: It provides such services as automatic scaling and
59: labeling of axes, and
60: .CW for
61: statements,
62: .CW if
63: statements, and macros to facilitate user
64: programmability.
65: \*G
66: is intended primarily for including graphs in
67: documents prepared on the
68: .UX
69: operating system, and is only marginally
70: useful for elementary tasks in data analysis.
71: .PP
72: Section 2 of this document is a tutorial introduction to
73: \*g;
74: readers who find it slow going may wish to skim ahead.
75: The examples in Section 3 illustrate
76: the various kinds of graphs that
77: \*g
78: can produce and some common
79: \*g
80: idioms.
81: Mundane matters about using
82: \*g
83: are discussed in Section 4,
84: and Section 5 contains a brief reference manual.
85: .PP
86: We have tried to illustrate good principles of
87: statistics and graphical design in the
88: graphs we present.
89: In several places, though, good taste has lost to
90: the necessity of illustrating
91: \*g
92: capabilities.
93: Readers interested in statistical
94: integrity and taste should
95: consult the literature, for example |reference(chambers graphs)
96: |reference(tufte graphs) |reference(cleveland elements).
97: .NH
98: Tutorial
99: .PP
100: The following is a simple
101: \*g
102: program\(dg
103: .FS
104: \(dg Throughout
105: this document we will show only the first five
106: lines and the last line of data files;
107: omitted lines are indicated by ``...''.
108: .FE
109: .P1
110: \&.G1
111: .d 400mtimes.d
112: \&.G2
113: .P2
114: The single number on each line
115: is the winning time in seconds for the
116: men's 400 meter run,
117: from the first modern Olympic Games (1896)
118: to the twenty-first (1988).
119: If the file
120: .CW olymp.g
121: contains the text above,
122: then typing the command
123: .P1
124: grap olymp.g | pic | troff > junk
125: .P2
126: creates a
127: .I troff
128: output file
129: .CW junk
130: that contains the
131: picture
132: .grap 4001.g
133: The graph shows the decrease
134: in winning times from 54.2
135: seconds to 43.87 seconds.
136: If the times are
137: contained in the file
138: .CW 400mtimes.d ,
139: we could
140: produce the same graph with the
141: shorter program
142: .P1
143: .get 4001.g
144: .P2
145: Writing
146: .CW copy
147: .CW \&"fname"
148: in a
149: \*g
150: program is equivalent to including the
151: contents of file
152: .CW fname
153: at that point in the file.
154: (In the interests of compatibility with other programs,
155: .CW include
156: is a synonym for
157: .CW copy .)
158: .PP
159: Each line in the file
160: .CW 400mpairs.d
161: contains two numbers, the
162: year of the Olympics and the winning time:
163: .P1
164: .d 400mpairs.d
165: .P2
166: If we plot this data with the program
167: .P1
168: .get 4002.g
169: .P2
170: the bottom ($x$) axis represents the year of the Olympics.
171: .grap 4002.g
172: The ``holes'' in $x$-values reflect the fact
173: that the 1916, 1940, and 1944 Olympics
174: were cancelled due to war.
175: Because the previous data
176: (in
177: .CW 400mtimes.d )
178: had just one number per
179: line,
180: \*g
181: viewed it as a ``time series'' and
182: supplied $x$-values of $1, ~ 2, ~ 3, ...$
183: before plotting
184: the data as $y$-values.
185: The input to the
186: second program has two values per line,
187: so they are interpreted as $( x , y )$ pairs.
188: .PP
189: Rather than a scatter plot of points, we might prefer to
190: see the winning times connected by a solid
191: line.
192: The program
193: .P1
194: .get 4003.g
195: .P2
196: produces the graph
197: .grap 4003.g
198: Eric Liddell of Great Britain
199: won his gold medal
200: in Paris in 1924 with a time of 47.6 seconds.
201: (Remember ``Chariots
202: of Fire''?)
203: .PP
204: We can make the graph more attractive
205: by modifying its frame
206: and adding labels.
207: .P1
208: .get 4004.g
209: .P2
210: The
211: .CW frame
212: command describes
213: the graph's bounding box:
214: the overall frame (which has four sides)
215: is invisible, it is 2 inches high and 3 inches
216: wide (which happen to be the
217: default height and width),
218: and the left and bottom
219: sides are solid (they could have been
220: dashed or dotted instead).
221: The labels appear on the left and bottom, as requested.
222: .grap 4004.g
223: .PP
224: To set the range of each axis,
225: \*g
226: examines the data and pads both
227: dimensions
228: by seven percent at each end.
229: The
230: .CW coord
231: (``coordinates'') command
232: allows you to specify the range of one or both axes explicitly;
233: it also turns off automatic padding.
234: .P1
235: .get 4005.g
236: .P2
237: The $y$-axis now ranges from 42 to 56 seconds
238: (a little more than before),
239: and the $x$-axis from 1894 to 1990
240: (a little less).
241: .grap 4005.g
242: .PP
243: The ticks in the preceding graphs were generated
244: by
245: \*g
246: guessing at reasonable values.
247: If you would rather provide your own,
248: you may
249: use the
250: .CW ticks
251: command,
252: which comes in the flavors illustrated below.
253: .P1
254: .get 4006.g
255: .P2
256: The first
257: .CW ticks
258: command deals with the left axis:
259: it puts the ticks facing out at
260: the numbers in the list.
261: \*G
262: puts labels only at values
263: with strings,
264: except that when no labels at all are
265: given, each number serves as its own label,
266: as in the second
267: .CW ticks
268: command.
269: That command
270: is for the bottom axis:
271: it puts the ticks facing in at steps of 20
272: from 1900 to 1980.
273: The command
274: .CW "ticks off"
275: turns off all ticks.
276: \*G
277: does its best to place labels appropriately, but
278: it sometimes needs your help:
279: the
280: .CW "left .2"
281: clause moves the left label 0.2 inches further left to
282: avoid the new ticks.
283: .grap 4006.g
284: .PP
285: The file
286: .CW 400wpairs.d
287: contains the times for
288: the women's 400 meter race, which has been run
289: only since 1964.
290: .P1
291: .d 400wpairs.d
292: .P2
293: To add these times to the graph,
294: we use
295: .P1
296: .get 4007.g
297: .P2
298: The
299: .CW new
300: command tells
301: \*g
302: to end
303: the old curve and to start a new curve
304: (which in this case will be drawn
305: with a dotted line).
306: Text is placed on the graph by
307: commands of the form
308: .P1
309: "string" at xvalue, yvalue
310: .P2
311: The
312: .CW size
313: clauses following the quoted strings tell
314: \*g
315: to shrink the characters by three points (absolute point sizes
316: may also be specified).
317: Strings are usually centered at the specified position,
318: but can be adjusted by clauses to be illustrated shortly.
319: .grap 4007.g
320: .PP
321: The file
322: .CW phone.d
323: records the number of telephones in the United States from
324: 1900 to 1970.
325: .P1
326: .d phone.d
327: .P2
328: Each line gives a year and the number of telephones
329: present in that year
330: (in millions, truncated to a multiple of one hundred thousand).
331: The simple
332: \*g
333: program
334: .P1
335: .get phone1.g
336: .P2
337: produces the simple graph
338: .grap phone1.g
339: .PP
340: The number of telephones appears to
341: grow exponentially;
342: to study that we will plot the data with
343: a logarithmic $y$-axis by adding
344: .CW log
345: .CW y
346: to the
347: .CW coord
348: command.
349: We will also add cosmetic changes of labels, more ticks,
350: and a solid line to replace the unconnected dots.
351: .P1
352: .get phone2.g
353: .P2
354: The third
355: .CW ticks
356: command provides a string that is used to print the tick
357: labels.
358: .UC C
359: programmers will recognize it as a
360: .CW printf
361: format string; others may view the
362: .CW %g
363: as the place to put
364: the number and anything else (in this case just an apostrophe) as
365: literal text to appear in the labels.
366: To suppress
367: labels, use the empty format string ("").
368: The program produces
369: .grap phone2.g
370: The number of telephones grew rapidly
371: in the first decade of this century,
372: and then settled down to an exponential growth rate upset only
373: by a decrease in the Great Depression and a post-war growth
374: spurt
375: to return the curve to its pre-Depression line.
376: .PP
377: Our presentation so far has been to
378: start with a simple
379: \*g
380: program that illustrates the data, and then refine it.
381: Later in this document we will ignore the design
382: phase, and present rather complex graphs in
383: their final form.
384: Beware.
385: .PP
386: All the examples so far have placed data on the
387: graph implicitly by
388: .CW copy ing
389: a file of numbers
390: (either a time series with one number per line or
391: pairs of numbers).
392: It is also possible to draw points and lines explicitly.
393: The
394: \*g
395: commands to draw on a graph
396: are illustrated in the following
397: fragment.
398: .P1
399: .get geom.g
400: .P2
401: .PP
402: The
403: .CW grid
404: command is similar to the
405: .CW ticks
406: command, except that grid lines extend
407: across the frame.
408: The next few commands plot text at specified positions.
409: The plotting characters (such as
410: .CW bullet )
411: are implemented as predefined
412: macros \(em more on that shortly.
413: Unlike arbitrary characters,
414: the visual centers of the markers
415: are near their plotting centers.
416: The
417: .CW circle
418: command draws a circle centered at the specified location.
419: A radius in inches may be specified;
420: if no radius is given, then the circle will be the
421: small circle shown at the center of the graph.
422: The
423: .CW line
424: and
425: .CW arrow
426: commands draw the obvious objects shown at the upper left.
427: .grap geom.g
428: .PP
429: This figure also illustrates the combined use of the
430: .CW draw
431: and
432: .CW next
433: commands.
434: Saying
435: .CW draw
436: .CW A
437: .CW solid
438: defines the style
439: for a connected sequence of line fragments to be called
440: .CW A .
441: Subsequent commands of
442: .CW next
443: .CW A
444: .CW at
445: .I point
446: add
447: .I point
448: to the end of
449: .CW A .
450: There are two such sequences active in the above
451: example
452: .CW A "" (
453: and
454: .CW B );
455: note that their
456: .CW next
457: commands are intermixed.
458: Because the predefined string
459: .CW delta
460: follows the specification of
461: .CW B ,
462: that string is plotted at each point in the sequence.
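.PP
As a minimal sketch of this interleaving (the coordinates are
illustrative assumptions, not the contents of
.CW geom.g ),
the fragment
.P1
# hypothetical points, chosen only to show the pattern
draw A solid
draw B dashed delta
next A at 1,1
next B at 1,3
next A at 2,2
next B at 2,4
.P2
would build two independent two-point sequences:
.CW A
joins (1,1) to (2,2) with a solid line, and
.CW B
joins (1,3) to (2,4) with a dashed line and a
.CW delta
at each point.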
463: .PP
464: \*G
465: has numeric variables (implemented as double-precision
466: floating point numbers) and
467: the usual collection of arithmetic operators and
468: mathematical functions; see the reference section
469: for details.
470: .PP
471: \*G
472: provides the same rudimentary macro facility that
473: .I pic
474: does:
475: .P1
476: define \f2name\fP { \f2replacement text\fP }
477: .P2
478: defines
479: .IT name
480: to be the
481: .IT "replacement text" .
482: The replacement may be any text that contains balanced open and closing braces
483: .CW "{ }" .
484: (Alternatively, the
485: .IT "replacement text"
486: may be quoted by
487: any single character that does not appear in the replacement;
488: the string is terminated by the next occurrence of that character.)
489: Any subsequent occurrence of
490: .IT name
491: will be replaced by
492: .IT "replacement text" .
493: .EQ
494: delim %%
495: .EN
496: .PP
497: The replacement text of a macro definition may
498: contain occurrences of
499: .CW $1 ,
500: .CW $2 ,
501: etc.;
502: these will be replaced by the corresponding actual
503: arguments when the macro is invoked.
504: The invocation for a macro with arguments is
505: .P1
506: name(arg1, arg2, ...)
507: .P2
508: Non-existent arguments are replaced by null
509: strings.
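.PP
A small sketch of macros with arguments, using hypothetical
macro names that are not taken from any program in this document:
.P1
# hypothetical macros, for illustration only
define cube { (($1) * ($1) * ($1)) }
define cubept { bullet at $1, cube($1) }
cubept(2)
.P2
Here
.CW cubept(2)
should expand to a command that plots a bullet at the point (2, 8);
the parentheses around each
.CW $1
guard against surprises when the actual argument is itself an expression.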
510: .EQ
511: delim $$
512: .EN
513: .PP
514: The following
515: \*g
516: program uses macros and arithmetic to plot
517: crude approximations to
518: the square and square root functions.
519: .P1
520: .get macarith.g
521: .P2
522: The macro
523: .CW root
524: uses the
525: .CW ^
526: exponentiation operator.
527: (Because
528: \*g
529: has the square root function
530: .CW sqrt ,
531: that macro is in fact superfluous.)
532: The program produces
533: .grap macarith.g
534: .PP
535: The
536: .CW copy
537: command has a
538: .CW thru
539: parameter that allows each line of a file to
540: be treated as though it were a macro call, with
541: the first field serving as
542: the first argument,
543: and so on.
544: This is the typical
545: \*g
546: mechanism for plotting files that are not stored as
547: time series or as $(x,y)$ pairs.
548: We will illustrate its use on the file
549: .CW states.d ,
550: which contains data on the fifty states.
551: .P1
552: .d states.d
553: .P2
554: The first field is the postal abbreviation of the state's
555: name (Alaska, Wyoming, Vermont, ...), the second field
556: is the number of Representatives to Congress from the state
557: after the 1981 reapportionment, and the third field is
558: the population of the state as measured in the 1980 Census.
559: The states appear in increasing order of
560: population.
561: .PP
562: We will first plot this data as
563: population, representative pairs.
564: (In the
565: .CW coord
566: statement,
567: .CW "log log"
568: is a synonym for
569: .CW "log x log y" .)
570: .P1
571: .get states1.g
572: .P2
573: Although the population is given in persons,
574: the
575: .CW PlotState
576: macro
577: plots the population in millions by dividing
578: the third input field
579: by one million (written in exponential notation
580: as
581: .CW 1e6 ,
582: for $1 times 10 sup 6$).
583: .grap states1.g
584: Using
585: .CW circle
586: as a plotting symbol displays
587: overlapping points that are obscured when
588: the data is plotted with bullets.
589: The representation of a state is roughly proportional
590: to its population, except in the very small states.
591: .PP
592: Our next plot will use the state's rank
593: in population as the $x$-coordinate and two
594: different $y$-coordinates: population and number of
595: representatives.
596: We will use two
597: .CW coord
598: commands to define the two coordinate systems
599: .CW pop
600: and
601: .CW rep .
602: We then explicitly give the coordinate system
603: whenever we refer to a point,
604: both in constructing axes and plotting data.
605: .P1
606: .get states2.g
607: .P2
608: The
609: .CW copy
610: statement in the program uses an
611: .I "immediate macro"
612: enclosed in curly brackets and thus avoids having to
613: name a macro for this task.
614: Because the program assumes that the states are
615: sorted in increasing order of population, it
616: generates
617: .CW thisrank
618: internally as a
619: \*g
620: variable.
621: The program produces
622: .grap states2.g
623: .PP
624: The plotting symbols were chosen for contrast in
625: both shape and shading.
626: This graph also indicates that representation is proportional
627: to population.
628: Once we see this graph, though, we should realize that we don't
629: really need two coordinate systems: we can relate the two by
630: dividing the population of the U.S. \(em about 226,000,000 \(em by
631: the number of representatives \(em 435 \(em to see that each
632: representative should count as 520,000 people.
633: If the purpose of this graph were to tell a story about
634: American politics rather than to illustrate
635: multiple coordinate systems,
636: it should be redrawn with a single coordinate
637: system.
638: .PP
639: Many graphs plot both observed data and a function
640: that (theoretically) describes the data.
641: There are many ways to draw a function
642: in \*g:
643: a series of
644: .CW next
645: commands is tedious but works, as does writing a
646: simple program to write a data file that is subsequently
647: read and plotted by \*g.
648: The
649: .CW for
650: statement often provides a better solution.
651: This
652: \*g
653: program
654: .P1
655: .get sin1.g
656: .P2
657: produces
658: .grap sin1.g
659: .a
660: The
661: .CW for
662: statement uses the same syntax as the
663: .CW ticks
664: statement, but the
665: .CW from
666: keyword can be replaced by
667: .CW = '', ``
668: which will look more familiar to programmers.
669: It varies the index variable over the specified range
670: and for each value executes all statements inside the delimiter
671: characters, which use the same rules as macro
672: delimiters.
673: It is, of course, useful for many tasks beyond plotting functions.
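.PP
As a minimal sketch (the variable name and range are
illustrative assumptions), the loop
.P1
# plot the squares of 1 through 5
for i = 1 to 5 by 1 do {
    bullet at i, i*i
}
.P2
is equivalent to one that begins
.CW "for i from 1 to 5 by 1" .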
674: .EQ
675: delim %%
676: .EN
677: .PP
678: The
679: .CW if
680: statement provides a simple mechanism for conditional execution.
681: If a file contains data on both cities and states (and lines
682: describing states have ``S'' in the first field), it could be plotted
683: by statements like
684: .P1
685: if "$1" == "S" then {
686: PlotState($2,$3,$4)
687: } else {
688: PlotCity($2,$3,$4,$5,$6)
689: }
690: .P2
691: The
692: .CW else
693: clause
694: is optional; delimiters use the same rules as macros and
695: .CW for
696: statements.
697: .EQ
698: delim $$
699: .EN
700: .NH
701: A Collection of Examples
702: .PP
703: The previous section covered the
704: \*g
705: commands that are used in common graphs.
706: In this section we'll spend less time on
707: language features, and survey a wider variety of
708: graphs.
709: These examples are intended more for browsing and
710: reference than for straight-through reading.
711: Be prepared to refer to the manual in Section 5 when you stumble over a new
712: \*g
713: feature.
714: .PP
715: The file
716: .CW cars.d
717: contains the mileage (miles per gallon) and the weight
718: (pounds) for 74 models of automobiles sold in the United States
719: in the 1979 model year.
720: .P1
721: .d cars.d
722: .P2
723: The trivial
724: \*g
725: program
726: .P1
727: .get cars1.g
728: .P2
729: produces
730: .grap cars1.g
731: This graph shows that weights bottom out somewhat
732: below 2000
733: pounds and that heavier cars get worse mileage;
734: it is hard to say much more about the relationship
735: between weight and mileage.
736: .PP
737: The next graph provides labels, uses circles
738: to expose data hidden in the clouds of bullets,
739: and re-expresses the $x$-axis in gallons per mile.
740: It also changes the point size and vertical spacing
741: to a size appropriate for camera-ready journal articles
742: and books; the size changes should be made outside the
743: \*g
744: program.
745: The
746: .CW \&.ft
747: command changes to a Helvetica font, which
748: some people prefer for graphs.
749: .P1
750: .get cars2.g
751: .P2
752: \*G
753: supports logarithmic re-expression of data with the
754: .CW log
755: clause in the
756: .CW coord
757: statement; any other re-expression of data must be done
758: with
759: \*g
760: arithmetic, as above.
761: .br
762: .grap cars2.g
763: This graph shows that
764: gallons per mile is roughly proportional to weight.
765: (The two outliers near 4000 pounds are the Cadillac
766: Seville and the Oldsmobile 98.)
767: .PP
768: In
769: .I "Visual Display of Quantitative Information" ,
770: Tufte proposes the ``dot-dash-plot'' as a means for maximizing
771: data ink (showing the two-dimensional distribution and
772: the two one-dimensional marginal distributions) while minimizing
773: what he calls ``chart junk'' \(em ink wasted on borders
774: and non-data labels.
775: His preference is easy to express in \*g:
776: .P1
777: .get cars3.g
778: .P2
779: Although the resulting graph is visually attractive, we do not find
780: it as useful for interpreting the data.
781: .grap cars3.g
782: Tufte's graph does point out two facts that are
783: not obvious in the previous graphs:
784: there is a gap in car weights near 3000 pounds (exhibited
785: by the hole in the $y$-axis ticks), and the gallons per
786: mile axis is regularly structured (the ticks
787: are the reciprocals of an almost dense sequence of integers).
788: The reader may decide whether those insights are worth
789: the decrease in clarity.
790: .PP
791: Throughout the twentieth century, horses, cars and people
792: have gotten faster;
793: let's study those improvements.
794: For horses, we'll consider the winning times
795: of the Kentucky Derby from 1909 to 1988, in
796: the file
797: .CW speedhorse.d :
798: .P1
799: .d speedhorse.d
800: .P2
801: The program
802: .P1
803: .get speedhorse1.g
804: .P2
805: produces the graph
806: .grap speedhorse1.g
807: Each race is recorded with a bullet and
808: record times are marked by horizontal lines.
809: Secretariat is the only horse to have run the
810: one-and-a-quarter-mile
811: race in under two minutes; he won in 1973 in
812: 1:59.4.
813: .PP
814: For automobiles we will study the
815: world land speed record (even though those vehicles
816: are by now just low-flying airplanes).
817: The file
818: .CW speedcar.d
819: lists years in which speed records were set and the record
820: set in that year, in miles per hour averaged over a one-mile
821: course.
822: .P1
823: .d speedcar.d
824: .P2
825: We will plot the data with the following
826: \*g
827: program, which uses nested braces in the
828: .CW copy
829: and
830: .CW if
831: statements.
832: .P1
833: .get speedcar1.g
834: .P2
835: .PP
836: Each record line is drawn after the
837: .I next
838: record is read, because
839: the program must know when the record was broken to draw
840: its line.
841: The
842: .CW if
843: statement handles the first record, and the extra
844: .CW line
845: command extends the last record out to the current date.
846: .grap speedcar1.g
847: The horizontal lines reflect the nature of world records: they
848: last until they are broken.
849: The records could also have been plotted by a scatterplot
850: in which each point represents the setting of a record,
851: but it would be misleading to connect adjacent
852: points with line segments
853: (which we inappropriately did in the graphs
854: of the Olympic 400 meter run).
855: .PP
856: The following graph shows the world record times for the
857: one mile run;
858: because its
859: \*g
860: program is so similar to its automotive counterpart,
861: we won't show the program or data.
862: .grap speedman1.g
863: The three graphs show three different kinds of
864: changes.
865: Although horses are getting faster, they appear to
866: be approaching a barrier near two minutes.
867: Cars show great jumps as new technologies are introduced
868: followed by a plateau as limits of the
869: technology are reached.
870: Milers have shown a fairly consistent
871: linear improvement
872: over this century, but there must be an
873: asymptote down there somewhere.
874: .PP
875: The next file gives the median heights of boys
876: in the United States aged 2 to 18, together with
877: the fifth and ninety-fifth percentiles.
878: .P1
879: .d boyhts.d
880: .P2
881: The heights are given in centimeters (1 foot = 30.48 centimeters).
882: The trivial program
883: .P1
884: .get boyhts1.g
885: .P2
886: displays the data as
887: .grap boyhts1.g
888: Because there are four numbers on each input line, the first is
889: taken as an $x$-value and the remaining three are plotted
890: as $y$-values.
891: .PP
892: The three curves appear to be roughly straight
893: (at least up to age 16),
894: so it makes sense to fit a line
895: through them.
896: We will use the standard least squares regression
897: in which
898: .EQ
899: slope ~=~ {
900: {n SIGMA x y ~ - ~ SIGMA x SIGMA y }
901: over
902: {n SIGMA x sup 2 ~ - ~ ( SIGMA x ) sup 2 }
903: }
904: .EN
905: (where the summations range over all $n$ $x$ and $y$ values
906: in the data set) and the $y$-intercept is
907: .EQ
908: {SIGMA y ~ - ~ slope times SIGMA x} over n
909: .EN
910: The following
911: \*g
912: program boldly (and rather foolishly) implements that formula.
913: .P1
914: .get boyhts3.g
915: .P2
916: It plots the fifth and ninety-fifth percentiles as the ends of a bar through
917: the median, which is plotted as a bullet.
918: All heights are converted to feet before plotting and calculating
919: the regression line.
920: .grap boyhts3.g
921: .PP
922: \*G
923: .CW print
924: statements write on
925: .CW stderr
926: as they are processed by \*g;
927: their single argument can be either an expression or a string.
928: The
929: .CW print
930: statements (which are commented out in
931: the above
932: \*g
933: program) at one time
934: showed that the regression line is
935: .EQ
936: Height ~ in ~ Feet ~ = ~ 2.61 ~ + ~ .19 times Age
937: .EN
938: Thus for most American
939: boys between 3 and 16, you may safely assume
940: that they started out life at 2 feet 7 inches and grew at the
941: rate of two and a quarter inches per year.
942: .PP
943: This program probably misapplies \*g;
944: if you really want to perform least squares regressions on
945: data, you should usually use a simple
946: .I awk
947: program like
948: .P1
949: .get regress.awk
950: .P2
951: (Be warned, though, that this program is not numerically
952: robust.)
953: .PP
954: While we're on the subject of fitting straight lines to data,
955: we'll redraw three graphs from J. W. Tukey's
956: .I "Exploratory Data Analysis" .
957: The file
958: .CW usapop.d
959: records the population of the United States
960: in millions at ten-year intervals.
961: .P1
962: .d usapop.d
963: .P2
964: Tukey's first two graphs indicate that the later population
965: growth was linear while the early growth was exponential.
966: The following
967: \*g
968: program plots them as a pair, using
969: .CW graph
970: commands to place internally unrelated graphs adjacent to
971: one another.
972: .P1
973: .get usapop1.g
974: .P2
975: The statements defining each graph are indented for clarity.
976: The second graph has the northern point of its frame 0.05
977: inch below the southern point of the frame of the first graph;
978: the
979: .CW with
980: clause is passed directly through to
981: .I pic
982: without being evaluated for macros or expressions.
983: The names of both graphs begin with capital letters to
984: conform to
985: .I pic
986: syntax for labels.
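.PP
In outline (a sketch with hypothetical graph names, with the
statements of each graph elided), such a pair has the form
.P1
# hypothetical names; bodies omitted
graph Upper
    # statements defining the first graph
graph Lower with .Frame.n at Upper.Frame.s - (0, .05)
    # statements defining the second graph
.P2
Everything after the name
.CW Lower
is handed to
.I pic
as written, so it must already be valid
.I pic
positioning text.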
987: .grap usapop1.g
988: .PP
989: Polynomial functions lie between the linear and exponential
990: functions; Tukey shows how a seventh-degree polynomial provides
991: a better (and longer) fit to the early population growth.
992: .P1
993: .get usapop2.g
994: .P2
995: This program re-expresses the $x$-axis with
996: \*g
997: arithmetic and uses an
998: .CW if
999: statement to graph only part of the data file.
1000: It produces
1001: .grap usapop2.g
1002: .nr k \n%
1003: The
1004: .I eqn
1005: .CW "space 0"
1006: clause is necessary to keep
1007: .I eqn
1008: from adding extra space that would interfere
1009: with positions computed by \*g;
1010: see Section 4.
1011: .PP
1012: The file
1013: .CW army.d
1014: contains four related time series
1015: describing the United States Army.
1016: .P1
1017: .d army.d
1018: .P2
1019: The first field is the year; the next four fields give
1020: the number of male officers, female officers, enlisted males
1021: and enlisted females, each in thousands.
1022: (Actually, there were no female enlisted personnel in the
1023: Army until 1943; the value 1 in 1940 and 1942 is just
1024: a placeholder, since
1025: \*g
1026: has no mechanism for handling missing data.)
1027: The following
1028: \*g
1029: program draws the four series with four different sets of
1030: .CW draw
1031: and
1032: .CW next
1033: commands.
1034: .P1
1035: .get army1.g
1036: .P2
1037: The program labels the lines by
1038: .CW copy ing
1039: immediate data;
1040: the program is therefore shorter to write and easier to change.
1041: The delimiter string
1042: .CW XXX
1043: in the
1044: .CW until
1045: clause could be deleted in this graph: the
1046: .CW \&.G2
1047: line also denotes the end of data.
1048: Even though that string is enclosed in quotes,
1049: it may not contain spaces.
1050: The $y$-positions of the labels are the
1051: result of several iterations.
1052: .grap army1.g
1053: .PP
1054: This data can tell many stories: the buildup during the
1055: Second World War is obvious, as is the exodus after the
1056: war; increases during Korea and Vietnam are
1057: also apparent.
1058: We will consider a different story: the ratio of
1059: enlisted men to the three other classes of personnel.
1060: There are several ways to plot this data
1061: (the most obvious graph uses three time series showing how
1062: the ratios change over time, and is
1063: left as an exercise for the reader).
1064: .PP
1065: We will instead construct a graph that gives little insight into this
1066: data, but illustrates a general method that is quite useful
1067: in conjunction with \*g.
1068: The graph is a ``scatterplot vector'' that shows how one
1069: variable (the number of enlisted men) varies as a function of
1070: the other three.
1071: Breaking with tradition, we first show the final graphs, all
1072: of which have logarithmic scales.
1073: .grap army2.g
1074: The number of enlisted men is almost linearly
1075: related to the number of male officers, it is somewhat related to the number
1076: of female officers, and it varies widely as a function of the number
1077: of enlisted women.
1078: .PP
1079: Much more interesting than the graph itself is the method we used to
1080: produce it.
1081: We wrote a miniature ``compiler'' that accepts as
1082: its ``source language'' a description of a scatterplot vector and
1083: produces as ``object code'' a
1084: \*g
1085: program to draw the graph.
1086: The source program for the above example is
1087: .P1
1088: .get army2.v
1089: .P2
1090: The program lists several
1091: global attributes of the graph, the
1092: $y$-variable to be plotted, and as many $x$-variables as
1093: are desired; with each variable is its field in the file
1094: and a descriptive string.
1095: The language is ``compiled'' by the following
1096: .I awk
1097: program.
1098: .P1
1099: .get scatvec.awk
1100: .P2
1101: Running this program on the above description produces the following
1102: output, which is typically piped directly to \*g.
1103: .P1
1104: .get army2.g
1105: .P2
1106: The generated program uses the
1107: .I pic
1108: trick of re-using the same name
1109: .CW A ) (
1110: for several objects.
1111: .PP
1112: Although the program above is merely a toy,
1113: ``minicompilers'' can produce useful preprocessors
1114: for \*g.
1115: The
1116: .CW scatmat
1117: program, for instance, is a 90-line
1118: .I awk
1119: program that reads a simple input language and produces as
1120: output a
1121: \*g
1122: program to produce a ``scatterplot matrix'', which
1123: is a handy graphical device for spotting pairwise interactions
1124: among several variables.
1125: If
1126: \*g
1127: lacks a feature you desire, consider building
1128: a simple preprocessor to provide it.
1129: An alternative is to define
1130: macros for the task; which approach is best depends
1131: strongly on the job you wish to accomplish.
1132: .PP
1133: The next graph uses iterators to make a graph without
1134: reading data from a file.
1135: Rather, its ``data'' is a
1136: function of two variables
1137: that describes a
1138: derivative field and a function of one variable
1139: that describes one solution to the differential
1140: equation.
1141: .P1
1142: .get ode1.g
1143: .P2
1144: The left label uses
1145: .I eqn
1146: text between the $font CW "$$"$ delimiters.
1147: The variable
1148: .CW scale
1149: ensures that all lines in the direction field are the same
1150: length.
1151: The
1152: .CW in
1153: clauses in the
1154: .CW ticks
1155: statements specify that the ticks go in zero inches
1156: to avoid overprinting.
1157: The variables
1158: .CW tx
1159: and
1160: .CW ty
1161: are so named because
1162: .CW x
1163: and
1164: .CW y
1165: are reserved words for the
1166: .CW coord
1167: statement.
1168: .grap ode1.g
1169: .PP
1170: Programmers familiar with floating point arithmetic may be
1171: surprised that the above graph is correct.
1172: Because of roundoff error, iteration
1173: .CW "from 0 to 1 by .05" '' ``
1174: usually produces the values
1175: $0, ~ .05, ~ .10, ~ ..., ~ .95$.
1176: \*G
1177: uses a ``fuzzy test''
1178: in the
1179: .CW for
1180: statement to avoid that problem, which may in turn introduce
1181: other problems.
1182: Such problems may be avoided by iterating over an integer range
1183: and incrementing a non-integer value within the loop.
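.PP
As a minimal sketch of that advice (the variable names
.CW i
and
.CW tx
and the plotted function are our own, chosen only for illustration),
the loop below iterates over the integers 0 through 20
and computes the fractional value inside the body,
so the final value is exactly 1:
.P1
\&.G1
frame ht 1 wid 2
coord x 0,1 y 0,1
for i from 0 to 20 do {
    tx = i / 20
    bullet at tx, tx * tx
}
\&.G2
.P2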
1184: .PP
1185: Most of the data we have seen so far is inherently
1186: two (or more) dimensional.
1187: As an example of one-dimensional data, we will return to
1188: the populations of the fifty states, which
1189: is the third field in the file
1190: .CW states.d
1191: introduced earlier;
1192: the file is sorted in increasing order of population.
1193: Our first graph takes the most space, but
1194: it also gives the most information.
1195: .P1
1196: .get states8.g
1197: .P2
1198: The
1199: .CW L
1200: macro (for Label)
1201: with input parameter $X$ evaluates to the number
1202: $2 sup X / 1,000,000$ followed by the string "$X$"
1203: (the
1204: .CW ticks
1205: command expects a number followed by a string label).
1206: .grap states8.g
1207: The dotted line is the least squares regression
1208: .EQ
1209: log sub 10 ~ Population ~ = ~ 7.214 ~ - ~ .03 times Rank
1210: .EN
1211: which gives 15.3 million as the population of the
1212: largest state and .515 million as the population
1213: of the smallest state.
1214: It says that
1215: population drops by a factor of two every ten states
1216: (compare the top and left scales).
1217: As sloppy as the exponential fit is, though, it is a much better
1218: fit to this data
1219: than a Zipf's Law curve is (drawing that curve is left as
1220: an exercise for the reader).
1221: .PP
1222: The next graph is a more standard representation of
1223: one-dimensional data.
1224: .P1
1225: .get states3.g
1226: .P2
1227: The markers were chosen to be
1228: .CW vticks
1229: because they denote only an $x$-value.
1230: .grap states3.g
1231: .PP
1232: The next one-dimensional graph uses the state's name as
1233: its marker; to reduce overprinting the graph is ``jittered''
1234: by using a random number as a $y$-value.
1235: .P1
1236: .get states4.g
1237: .P2
1238: The function
1239: .CW rand()
1240: returns a pseudo-random real number chosen uniformly over the interval [0,1).
1241: .grap states4.g
1242: This graph is too cluttered; circles would have been
1243: a better choice as a plotting symbol (bullets, once again, would
1244: hide data).
1245: .PP
1246: Histograms are a standard way of presenting one-dimensional
1247: data in two-dimensional form.
1248: Our first step in building a histogram of the population
1249: data is the following
1250: .I awk
1251: program, which counts how many states are in each ``bin''
1252: of a million people.
1253: .P1
1254: .get states5.awk
1255: .P2
1256: The variable
1257: .CW bzs
1258: tells where bin zero starts; although it is zero in this
1259: graph, it might be 95 in a histogram
1260: of human body temperatures in degrees Fahrenheit.
1261: The program produces the following output in
1262: .CW states2.d :
1263: .P1
1264: .d states2.d
1265: .P2
1266: There are 12 states with population between 0 and 999,999,
1267: 5 states with population between 1,000,000 and 1,999,999,
1268: and so on.
1269: .PP
1270: This
1271: \*g
1272: program uses three
1273: .CW line
1274: commands to plot each rectangle in the histogram.
1275: .P1
1276: .get states5.g
1277: .P2
1278: It produces
1279: .grap states5.g
1280: .PP
1281: The same file can be plotted in a
1282: more attractive (and more useful) form by
1283: .P1
1284: .get states6.g
1285: .P2
1286: which produces
1287: one of Bill Cleveland's ``dot charts'' or ``lolliplots'':
1288: .grap states6.g
1289: (We use
1290: .CW \e(bu ,
1291: the
1292: .I troff
1293: character for a bullet, rather than the built-in string to
1294: get a larger size.)
1295: .PP
1296: Other histograms are possible.
1297: The following
1298: .I awk
1299: program
1300: .P1
1301: .get states7.awk
1302: .P2
1303: produces the file
1304: .CW states3.d
1305: .P1
1306: .d states3.d
1307: .P2
1308: which lists the state's abbreviation, bin number, and
1309: height within the bin.
1310: The
1311: \*g
1312: program
1313: .P1
1314: .get states7.g
1315: .P2
1316: reads that file to make the following histogram, in which
1317: the state names are used to display the heights of the bins.
1318: In each bin, the states occur in increasing order of
1319: population from bottom to top.
1320: .grap states7.g
1321: .PP
1322: The next data set is a run-time profile of an early version of \*g,
1323: created by compiling the program with the
1324: .CW -p
1325: option and running
1326: .CW prof
1327: after the program executed.
1328: .P1
1329: .d prof1.d
1330: .P2
1331: Although there were more than fifty procedures in the program, the
1332: top four time-hogs accounted for more than half of the run time.
1333: This file is difficult for
1334: \*g
1335: to deal with:
1336: even though
1337: .CW if
1338: statements would allow us to extract lines 2 through 11
1339: of the file, we could not remove the leading
1340: .CW _
1341: from a routine name or access the last field in a record.
1342: We will therefore process it with
1343: the following
1344: .I awk
1345: program.
1346: .P1
1347: .get prof1.awk
1348: .P2
1349: The program produces
1350: .P1
1351: .d prof2.d
1352: .P2
1353: We could even use the
1354: .I sh
1355: statement to execute the
1356: .I awk
1357: program from within \*g, which would make the latter entirely
1358: self-contained (see the reference manual for details).
1359: .PP
1360: We will display the data with this program.
1361: .P1
1362: .get prof1.g
1363: .P2
1364: Observe that the program knows nothing about the range of the data.
1365: It uses default ticks and a
1366: .CW frame
1367: statement with a computed height to achieve
1368: total data independence.
1369: .grap prof1.g
1370: This bar chart highlights the fact that most of the time spent by
1371: \*g
1372: is devoted to input and output.
1373: .PP
1374: J. W. Tukey's box and whisker plots
1375: represent the median, quartiles, and extremes of a
1376: one-dimensional distribution.
1377: The following
1378: \*g
1379: program defines a macro to draw a box plot, and then
1380: uses that shape to compare the distribution of heights of
1381: volcanoes with the distribution of heights of States of the Union.
1382: .P1
1383: .get box1.g
1384: .P2
1385: Boxes are one of many shapes used for the graphical
1386: representation of several quantities.
1387: If you use such shapes frequently then you should
1388: make a library file of their macros to
1389: .CW copy
1390: into your
1391: \*g
1392: programs.
1393: The above program produces
1394: .grap box1.g
1395: Even though the extreme heights are the same, state heights
1396: have a lower median and a greater spread.
1397: .PP
1398: Someday you may use
1399: \*g
1400: to prepare overhead transparencies, only to find that
1401: everything comes out too small.
1402: The following program illustrates some ways to get larger
1403: graphs.
1404: .P1
1405: .zzz slide1.g
1406: .P2
1407: The
1408: .CW ps
1409: and
1410: .CW vs
1411: commands preceding the graph set the text size to 14 points and
1412: the vertical spacing to 18 points; the two quantities are
1413: reset by the commands following the
1414: .CW .G2 .
1415: Such size changes should be made outside the
1416: \*g
1417: program, as mentioned earlier.
1418: The
1419: .CW 4
1420: following the
1421: .CW .G1
1422: stretches the graph (including
1423: \*g's
1424: estimate of the accompanying text) to be four inches wide;
1425: it is an alternative to altering the
1426: .CW frame
1427: command.
1428: The macro
1429: .CW blob
1430: is a plotting symbol that is much larger than
1431: .CW bullet ;
1432: the different name ensures that later references to
1433: .CW bullet
1434: are unaffected.
1435: The
1436: .I troff
1437: commands within the
1438: .CW blob
1439: string move the character down one-tenth of an em
1440: to center its plotting position (determined experimentally)
1441: and then reset the vertical position.
1442: The program produces this trivial (but large) graph.
1443: .br
1444: .grap slide1.g
1445: .NH
1446: Using Grap
1447: .PP
1448: Following are a few day-to-day matters about using \*g.
1449: .NH 2
1450: Errors
1451: .PP
1452: \*G
1453: attempts to pinpoint input errors; for example,
1454: the input
1455: .P1
1456: \&.G1
1457: i = i + 1
1458: .P2
1459: results in this message on
1460: .CW stderr :
1461: .P1
1462: grap: syntax error near line 1, file -
1463: context is
1464: i = i >>> + <<< 1
1465: .P2
1466: The error was noticed
1467: at the
1468: .CW + .
1469: Unfortunately, pinpointing is not the same as explaining:
1470: the real error is that the variable
1471: .CW i
1472: was not initialized.
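.PP
One possible repair, assuming that a starting value of zero was intended,
is to assign to
.CW i
before using it:
.P1
\&.G1
i = 0
i = i + 1
\&.G2
.P2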
1473: .PP
1474: The ``words''
1475: .CW x
1476: and
1477: .CW y
1478: are reserved (for the
1479: .CW coord
1480: statement);
1481: you will get an equally inexplicable syntax error message if you use them
1482: as variable names.
1483: (This design is bad, but not nearly so bad as
1484: having the
1485: .CW log
1486: and
1487: .CW exp
1488: functions use base 10.)
1489: .PP
1490: \*G
1491: tries to load a file of standard macro definitions
1492: .CW /usr/lib/grap.defines ) (
1493: for terms like
1494: .CW bullet ,
1495: .CW plus ,
1496: etc.
1497: It doesn't complain if that file isn't found,
1498: but if you later use one of these words,
1499: you'll get a syntax error message.
1500: .PP
1501: Certain constructs suggested by analogy to
1502: .I pic
1503: do not work.
1504: For example,
1505: .CW .GS
1506: and
1507: .CW .GE
1508: would have been nicer than
1509: .CW .G1
1510: and
1511: .CW .G2 ,
1512: but they were already taken.
1513: The
1514: .I pic
1515: construct
1516: .P1
1517: \&.PS <file
1518: .P2
1519: has been superseded by
1520: \*g's
1521: .CW copy
1522: command (which in turn has been retrofitted into
1523: .I pic ).
1524: .NH 2
1525: \fITroff\fP issues
1526: .PP
1527: You may use
1528: .I troff
1529: commands like
1530: .CW .ps
1531: or
1532: .CW .ft
1533: to change text sizes and fonts within a graph,
1534: or use balanced
1535: .CW \es
1536: and
1537: .CW \ef
1538: commands within a string.
1539: Do not, however,
1540: add space
1541: .CW .sp ) (
1542: or change the line spacing
1543: .CW .vs , (
1544: .CW .ls )
1545: within a graph.
1546: Some defined terms like
1547: .CW bullet
1548: contain embedded size changes;
1549: further qualifying them with
1550: \*g
1551: .CW size
1552: commands may not always work.
1553: .PP
1554: Because
1555: \*g
1556: is built on top of
1557: .I pic ,
1558: the following quote from the
1559: .I pic
1560: manual is relevant:
1561: ``There is a subtle problem with complicated equations inside
1562: .I pic
1563: pictures \(em they come out wrong if
1564: .I eqn
1565: has to leave extra vertical space for the equation.
1566: If your equation involves more than subscripts and superscripts,
1567: you must add to the beginning of each such equation the extra information
1568: .CW "space 0" ''.
1569: This feature was illustrated in the graph of the
1570: United States population in Section 3.
1571: .NH 2
1572: Alternatives
1573: .PP
1574: Besides
1575: \*g
1576: and your local draftsperson, what other choices are there?
1577: .PP
1578: The S system |reference(slanguage chambers) provides
1579: a host of tools for statistical analysis,
1580: but somewhat fewer tools than
1581: \*g
1582: for producing document-quality graphs.
1583: S produces graphs on the screen of a DMD 5620 terminal much more quickly than
1584: \*g
1585: (often in seconds rather than minutes), but it
1586: takes somewhat longer to learn (at least for us).
1587: If you expect to do a lot of interactive data analysis, then
1588: S is probably the right tool for you.
1589: S may be used to generate
1590: .I pic
1591: commands.
1592: .PP
1593: The standard UNIX program
1594: .I graph
1595: provides many of the basic features of
1596: \*g,
1597: though with quite a bit less control over details, particularly
1598: text.
1599: It produces output only in the
1600: .UX
1601: .I plot (5)
1602: language,
1603: which may be processed by a variety of filters
1604: for a variety of output devices.
1605: .PP
1606: The original
1607: .UX
1608: typesetter graphics programs are
1609: .I pic
1610: and
1611: .I ideal ;
1612: you may be able to do as well without using
1613: \*g
1614: as an intermediary.
1615: In particular,
1616: .I ideal
1617: provides shading and clipping,
1618: which are useful
1619: in presentation-quality bar charts and the like, but are
1620: well beyond the capabilities of
1621: .I pic .
1622: .EQ
1623: delim $$
1624: .EN
1625: .NH
1626: References
1627: .LP
1628: |reference_placement
1629: .NH
1630: Reference Manual
1631: .PP
1632: In the following,
1633: .I italic
1634: terms are syntactic categories,
1635: .CW typewriter
1636: terms are literals,
1637: parenthesized constructs are optional, and ... indicates repetition.
1638: In most cases, the order of statements,
1639: constructs and attributes is immaterial.
1640: .P1
1641: .IT "grap program" :
1642: .G1 \f2(width in inches)\fP
1643: \f2grap statement\fP
1644: ...
1645: .G2
1646: .P2
1647: A width on the
1648: .CW .G1
1649: line overrides the computed width, as in
1650: .I pic .
1651: .P1
1652: .IT "grap statement" :
1653: .I
1654: frame \(or label \(or coord \(or ticks \(or grid \(or plot \(or line \(or circle \(or draw \(or new \(or next
1655: \(or graph \(or numberlist \(or copy \(or for \(or if \(or sh \(or pic \(or assignment \(or print
1656: .ft
1657: .P2
1658: .PP
1659: The
1660: .CW frame
1661: statement defines the frame that surrounds the graph:
1662: .P1
1663: .IT frame :
1664: frame \f2(\fPht \f2expr)\fP \f2(\fPwid \f2expr)\fP \f2((side) linedesc)\fP \f2...\fP
1665: .IT side :
1666: top \(or bot \(or left \(or right
1667: .IT linedesc :
1668: solid \(or invis \(or dotted \f2(expr)\fP \(or dashed \f2(expr)\fP
1669: .P2
1670: Height and width default to 2 and 3 inches;
1671: sides default to solid.
1672: If
1673: .I side
1674: is omitted, the
1675: .I linedesc
1676: applies to the entire frame.
1677: The optional expressions after
1678: .CW dotted
1679: and
1680: .CW dashed
1681: change the spacing exactly as in
1682: .I pic .
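.PP
For illustration (the dimensions and the dash spacing are arbitrary choices of ours),
the following statement should draw a frame 1.5 inches high and 2.5 inches wide
with an invisible top and right side and a dashed bottom;
the left side keeps the default solid line:
.P1
frame ht 1.5 wid 2.5 top invis right invis bot dashed .05
.P2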
1683: .PP
1684: The
1685: .CW label
1686: statement places a label on a specified side:
1687: .P1
1688: .IT label :
1689: label \f2side\fP \f2strlist\fP \f2...\fP \f2shift\fP
1690: .IT shift :
1691: left\f2 \(or \fPright\f2 \(or \fPup\f2 \(or \fPdown \f2expr ...\fP
1692: .IT strlist :
1693: \f2str ... (\fPrjust\f2 \(or \fPljust\f2 \(or \fPabove\f2 \(or \fPbelow\f2) ... (\fPsize \f2(\fP\(+-\f2) expr) ...\fP
1694: .IT str :
1695: "\f2...\fP"
1696: .P2
1697: Lists of text strings are stacked vertically.
1698: In any context, string lists may contain clauses
1699: to adjust the position or change the point size.
1700: Each clause applies to the string preceding it
1701: and all following strings.
1702: Labels may also have a
1703: .CW width
1704: attribute, to override
1705: \*g's
1706: default computation.
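.PP
A hedged example (the label text and the offsets are invented for illustration):
.P1
label left "Population" "(millions)" left .2
label bot "Rank in population" size -2
.P2
The first statement stacks two strings on the left side and
should shift the label a further 0.2 inches to the left;
the second reduces the bottom label by two points.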
1707: .PP
1708: Normally the coordinate system is defined by the data,
1709: with 7 percent extra on each side.
1710: (To change that to 5 percent, assign 0.05 to the
1711: \*g
1712: variable
1713: .CW margin ,
1714: which is reset to 0.07 at each
1715: .CW .G1
1716: statement.)
1717: The
1718: .CW coord
1719: statement defines an overriding system:
1720: .P1
1721: .IT coord :
1722: coord \f2(name)\fP \f2(\fPx \f2expr,expr)\fP \f2(\fPy \f2expr,expr)\fP \f2(\fPlog x \(or log y \(or log log\f2) \fP
1723: .P2
1724: Coordinate systems can be named;
1725: ranges, logarithmic scaling, etc., are done separately for each.
1726: .PP
1727: The
1728: .CW ticks
1729: statement places tick marks on one side of the frame:
1730: .P1
1731: .IT ticks :
1732: ticks \f2side\fP \f2(\fPin \(or out \f2(expr))\fP \f2(shift) (tick-locations)\fP
1733: .IT tick-locations :
1734: at \f2(name) expr (str)\fP, \f2expr (str)\fP, \f2...\fP
1735: \(or from \f2(name) expr\fP to \f2expr\fP \f2(\fPby \f2(op) expr)\fP \f2str\fP
1736: .P2
1737: If no ticks are specified, they will be provided automatically;
1738: .CW ticks
1739: .CW off
1740: suppresses automatic ticks.
1741: The optional expression after
1742: .CW in
1743: or
1744: .CW out
1745: specifies the length of the ticks in inches.
1746: The optional name refers to a coordinate system.
1747: If
1748: .IT str
1749: contains
1750: format specifiers like
1751: .CW %f
1752: or
1753: .CW %g ,
1754: they are interpreted as by
1755: .CW printf .
1756: If no
1757: .IT str
1758: is supplied, the tick labels will be the values of the
1759: expressions.
1760: .PP
1761: If the
1762: .CW by
1763: clause is omitted, steps are of size 1.
1764: If the
1765: .CW by
1766: expression is preceded by one of
1767: .CW + ,
1768: .CW - ,
1769: .CW *
1770: or
1771: .CW / ,
1772: the step is scaled by that operator,
1773: e.g.,
1774: .CW *10
1775: means that each step is 10 times the previous one.
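.PP
As a sketch (the ranges are arbitrary), the first statement below should place ticks at
0, 25, 50, 75 and 100, while the second, intended for a logarithmic axis,
should place them at 1, 10, 100 and 1000:
.P1
ticks bot out from 0 to 100 by 25
ticks left out from 1 to 1000 by *10
.P2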
1776: .PP
1777: The
1778: .CW grid
1779: statement produces grid lines along (i.e., perpendicular to)
1780: the named side.
1781: .P1
1782: .IT grid :
1783: grid \f2side (linedesc) (shift) (tick-locations)\fP
1784: .P2
1785: Grids are labeled by the same mechanism as
1786: .CW ticks .
1787: It is possible to draw grids without ticks by placing the phrase
1788: .CW ticks
1789: .CW off
1790: after the side name and before the iterator.
1791: .PP
1792: Plot
1793: statements place text at a point:
1794: .P1
1795: .IT plot :
1796: \f2strlist\fP at \f2point\fP
1797: plot \f2expr (str)\fP at \f2point\fP
1798: .IT point :
1799: \f2(name) expr,expr\fP
1800: .P2
1801: As in the
1802: .CW label
1803: statement, the string list may contain
1804: position and size modifiers.
1805: The
1806: .CW plot
1807: statement uses the optional format string as in C's
1808: .CW printf
1809: statement \(em it may contain a
1810: .CW %f
1811: or
1812: .CW %g .
1813: The optional name refers to a coordinate system.
1814: .PP
1815: The
1816: .CW line
1817: statement draws a line or arrow from here to there:
1818: .P1
1819: .IT line :
1820: \f2(\fPline \(or arrow\f2)\fP from \f2point\fP to \f2point (linedesc)\fP
1821: .P2
1822: The
1823: .CW circle
1824: statement draws a circle:
1825: .P1
1826: .IT circle :
1827: circle at \f2point (\fPradius \f2expr)\fP
1828: .P2
1829: The radius is in inches; the default size is small.
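.PP
The following fragment, with coordinates chosen arbitrarily for illustration,
uses all three forms:
.P1
line from 0,0 to 10,10 dashed
arrow from 2,8 to 5,5
circle at 5,5 radius .1
.P2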
1830: .PP
1831: The
1832: .CW draw
1833: statement defines a sequence of lines:
1834: .P1
1835: .IT draw :
1836: draw \f2(name) linedesc (str)\fP
1837: .P2
1838: Subsequent data for the named sequence
1839: will be plotted as a line of the specified style,
1840: with the optional
1841: .IT str
1842: plotted at each point.
1843: The
1844: .CW next
1845: statement continues a sequence:
1846: .P1
1847: .IT next :
1848: next \f2(name)\fP at \f2point (linedesc)\fP
1849: .P2
1850: If a line description is specified, it overrides the default
1851: display mode for the line segment ending at
1852: .I point .
1853: The
1854: .CW new
1855: statement starts a new sequence; it has the same format as the
1856: .CW draw
1857: statement.
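.PP
A minimal sketch with two interleaved sequences
(the names
.CW rise
and
.CW fall
and the data values are invented):
.P1
draw rise solid
draw fall dashed
next rise at 1, 2
next fall at 1, 9
next rise at 2, 5
next fall at 2, 6
next rise at 3, 8 dotted
.P2
The last statement should draw only the segment ending at (3,8) dotted;
the rest of
.CW rise
remains solid.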
1858: .PP
1859: A line consisting of a set of numbers
1860: is treated as a family of points
1861: $x$, $y sub 1$, $y sub 2$, etc.,
1862: to be plotted at the single
1863: $x$ value.
1864: .P1
1865: .IT numberlist :
1866: \f2number\fP ...
1867: .P2
1868: If there is only one number it is treated as
1869: a $y$ value, and $x$ values of 1, 2, 3, ...
1870: are supplied automatically.
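.PP
For example, given the invented data lines
.P1
1 3 10
2 5 12
3 4 15
.P2
the rule above implies that
\*g
plots the points (1,3), (1,10), (2,5), (2,12), (3,4) and (3,15).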
1871: .PP
1872: \*G
1873: provides arithmetic with the operators
1874: .CW + ,
1875: .CW - ,
1876: .CW * ,
1877: .CW / ,
1878: and
1879: .CW ^ .
1880: Variables may be assigned to;
1881: assignments are expressions.
1882: Built-in functions include
1883: .CW log ,
1884: .CW exp
1885: (both base 10 \(em beware!),
1886: .CW int
1887: (truncates towards zero),
1888: .CW sin ,
1889: .CW cos
1890: (both use radians),
1891: .CW atan2(dy,dx) ,
1892: .CW sqrt ,
1893: .CW min
1894: (two arguments only),
1895: .CW max
1896: (ditto),
1897: and
1898: .CW rand()
1899: (returns a pseudo-random real number uniform on [0,1)).
1900: .PP
1901: The
1902: .CW for
1903: statement provides a modest looping facility:
1904: .P1
1905: .IT for :
1906: for \f2var\fP from \f2expr\fP to \f2expr (\fPby \f2(op) expr)\fP do { \f2anything\fP }
1907: .P2
1908: The string may contain internally balanced braces.
1909: Alternatively, any other character may appear immediately after the word
1910: .CW do ,
1911: and the string is terminated by the next occurrence of that character.
1912: The text
1913: .IT anything
1914: (which may contain newlines) is repeated as
1915: .IT var
1916: takes on values from
1917: .IT expr1
1918: to
1919: .IT expr2 .
1920: As with tick iterators, the
1921: .CW by
1922: clause is optional, and may proceed arithmetically or multiplicatively.
1923: In a
1924: .CW for
1925: statement,
1926: the
1927: .CW from
1928: may be replaced by
1929: .CW = ''. ``
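.PP
As a brief illustration (the loop variable and body are arbitrary),
the following loop uses
.CW =
in place of
.CW from
and a multiplicative step, so that
.CW i
should take the values 1, 10, 100 and 1000:
.P1
for i = 1 to 1000 by *10 do {
    bullet at i, i
}
.P2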
1930: .PP
1931: The
1932: .CW if-then-else
1933: statement provides conditional evaluation:
1934: .P1
1935: .IT if :
1936: if \f2expr\fP then { \f2anything\fP } else { \f2anything\fP }
1937: .P2
1938: The
1939: .CW else
1940: clause
1941: is optional.
1942: Relational operators include
1943: .CW == ,
1944: .CW != ,
1945: .CW > ,
1946: .CW >= ,
1947: .CW < ,
1948: .CW <= ,
1949: .CW ! ,
1950: .CW || ,
1951: and
1952: .CW && .
1953: Strings may be compared with the operators
1954: .CW ==
1955: and
1956: .CW != .
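.PP
A small sketch (the test and the plotted strings are invented for illustration)
that marks two points specially:
.P1
for i from 1 to 10 do {
    if (i == 5 || i == 10) then {
        "*" at i, i
    } else {
        bullet at i, i
    }
}
.P2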
1957: .PP
1958: It is possible to convert numeric expressions to formatted strings:
1959: .P1
1960: sprintf("\f2format\fP", \f2expr\fP, \f2expr\fP, ...)
1961: .P2
1962: is equivalent to a quoted string in any context.
1963: Variants of
1964: .CW %f
1965: and
1966: .CW %g
1967: are the only sensible format conversions.
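.PP
For instance, assuming variables
.CW tx
and
.CW ty
have already been assigned (the names are ours), the statement
.P1
sprintf("%.1f", ty) above at tx, ty
.P2
should place the value of
.CW ty ,
formatted with one decimal place, just above the point.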
1968: .PP
1969: \*G
1970: provides the same macro processor that
1971: .I pic
1972: does:
1973: .P1
1974: define \f2macro-name\fP { \f2anything\fP }
1975: .P2
1976: .EQ
1977: delim %%
1978: .EN
1979: Subsequent occurrences of the macro name will be replaced
1980: by the string, with arguments of the form \f(CW$\fIn\fR
1981: replaced by corresponding actual arguments.
1982: Macro definitions persist across
1983: .CW .G2
1984: boundaries, as do values of variables.
1985: .EQ
1986: delim $$
1987: .EN
1988: .PP
1989: The
1990: .CW copy
1991: statement is somewhat overloaded:
1992: .P1
1993: copy "\f2filename\fP"
1994: .P2
1995: includes the contents of the named file at that point;
1996: .P1
1997: copy "\f2filename\fP" thru \f2macro-name\fP
1998: .P2
1999: copies the file through the macro; and
2000: .P1
2001: copy thru \f2macro-name\fP
2002: .P2
2003: copies subsequent lines through the macro;
2004: each number or quoted string is treated as an argument.
2005: In each case, copying continues until end of file or the next
2006: .CW .G2 .
2007: The optional clause
2008: .CW until
2009: .IT str
2010: causes copying to terminate when a line whose
2011: first field is
2012: .IT str
2013: occurs.
2014: In all cases, the macro can be specified inline rather than by name:
2015: .P1
2016: copy thru { \f2macro body\fP }
2017: .P2
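.PP
As a hedged sketch (the macro name
.CW spot ,
the terminator string, and the data are invented for illustration),
inline data can be copied through a named macro and stopped by an
.CW until
clause:
.EQ
delim %%
.EN
.P1
define spot { bullet at $1, $2 }
copy thru spot until "END"
1 3
2 5
3 4
END
.P2
.EQ
delim $$
.EN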
2018: .PP
2019: The
2020: .CW sh
2021: command passes text through to the UNIX shell.
2022: .P1
2023: .IT sh :
2024: sh { \f2anything\fP }
2025: .P2
2026: The body of the command is scanned for macros.
2027: The built-in macro
2028: .CW pid
2029: is a string consisting of the process identification number;
2030: it can be used to generate unique file names.
2031: .PP
2032: The
2033: .CW pic
2034: command passes text through to
2035: .I pic
2036: with the
2037: .CW pic '' ``
2038: removed; variables and macros are not evaluated.
2039: Lines beginning with a period (that are not numbers)
2040: are passed through literally, under the assumption that they
2041: are
2042: .I troff
2043: commands.
2044: .PP
2045: The
2046: .CW graph
2047: statement
2048: .P1
2049: .IT graph :
2050: graph \f2Picname (pic-text)\fP
2051: .P2
2052: defines a new graph named
2053: .I Picname ,
2054: resetting all coordinate systems.
2055: If any
2056: .CW graph
2057: commands are used in a
2058: \*g
2059: program, then the statement after the
2060: .CW \&.G1
2061: must be a
2062: .CW graph
2063: command.
2064: The
2065: .I pic-text
2066: can be used to position this graph relative
2067: to previous graphs by referring to their
2068: .CW Frame s,
2069: as in
2070: .P1
2071: graph First
2072: ...
2073: graph Second with .Frame.w at First.Frame.e + (0.1,0)
2074: .P2
2075: Macros and expressions in
2076: .I pic-text
2077: are not evaluated.
2078: .I Picname s
2079: must begin with a capital letter to satisfy
2080: .I pic
2081: syntax.
2082: .PP
2083: The
2084: .CW print
2085: statement
2086: .P1
2087: .IT print :
2088: print \f2(expr\fP \(or \f2str)\fP
2089: .P2
2090: writes on
2091: .CW stderr
2092: as
2093: \*g
2094: processes its input; it is sometimes useful for debugging.
2095: .PP
2096: Many reserved words have synonyms, such as
2097: .CW thru
2098: for
2099: .CW through ,
2100: .CW tick
2101: for
2102: .CW ticks ,
2103: and
2104: .CW bot
2105: for
2106: .CW bottom .
2107: .PP
2108: The
2109: .CW #
2110: introduces a comment, which ends at the end of the line.
2111: Statements may be continued over several lines by preceding each
2112: newline with a
2113: backslash character.
2114: Multiple statements may appear on a single line separated
2115: by semicolons.
2116: \*G
2117: ignores any line that is entirely blank, including those
2118: processed by
2119: .CW "copy thru"
2120: commands.
2121: .PP
2122: When
2123: \*g
2124: is first executed it reads standard macro definitions
2125: from the file
2126: .CW /usr/lib/grap.defines .
2127: The definitions include
2128: .CW bullet ,
2129: .CW plus ,
2130: .CW box ,
2131: .CW star ,
2132: .CW dot ,
2133: .CW times ,
2134: .CW htick ,
2135: .CW vtick ,
2136: .CW square ,
2137: and
2138: .CW delta .