1: .so ../ADM/mac
2: .XX grap 109 "Grap \(em A Language for Typesetting Graphs"
3: .EQ
4: delim $$
5: .EN
6: .so macros
7: .ds g \f2grap\fP
8: .ds G \f2Grap\fP
9: .TL
10: Grap \(em A Language for Typesetting Graphs
11: .br
12: Tutorial and User Manual
13: .AU
14: Jon L. Bentley
15: Brian W. Kernighan
16: .AI
17: .MH
18: .AB
19: \*G
20: is a language for describing plots of data.
21: This graph of the 1984
22: age distribution in the United States
23: .grap agepop1.g
24: is produced by the
25: \*g
26: commands
27: .P1
28: .get agepop1.g
29: .P2
30: (Each line in the data file
31: .UL agepop.d
32: contains an age and the number of Americans of that
33: age alive in 1984; the file is sorted by age.)
34: .PP
35: The
36: \*g
37: preprocessor works with
38: .I pic |reference(latest pic)
39: and
40: .I troff |reference(latest troff reference).
41: Most of its input is passed
42: through untouched, but statements between
43: .UL .G1
44: and
45: .UL .G2
46: are translated into
47: .I pic
48: commands that draw graphs.
49: .AE
50: .NH
51: Introduction
52: .PP
53: \*G
54: is a language for describing graphical
55: displays of data.
56: It provides such services as automatic scaling and
57: labeling of axes, and
58: .UL for
59: statements,
60: .UL if
61: statements, and macros to facilitate user
62: programmability.
63: \*G
64: is intended primarily for including graphs in
65: documents prepared on the
66: .UX
67: operating system, and is only marginally
68: useful for elementary tasks in data analysis.
69: .PP
70: Section 2 of this document is a tutorial introduction to
71: \*g;
72: readers who find it slow going may wish to skim ahead.
73: The examples in Section 3 illustrate
74: the various kinds of graphs that
75: \*g
76: can produce and some common
77: \*g
78: idioms.
79: Mundane matters about using
80: \*g
81: are discussed in Section 4,
82: and Section 5 contains a brief reference manual.
83: .PP
84: We have tried to illustrate good principles of
85: statistics and graphical design in the
86: graphs we present.
87: In several places, though, good taste has lost to
88: the necessity of illustrating
89: \*g
90: capabilities.
91: Readers interested in statistical
92: integrity and taste should
93: consult the literature, for example |reference(chambers graphs)
94: |reference(tufte graphs) |reference(cleveland elements).
95: .NH
96: Tutorial
97: .PP
98: The following is a simple
99: \*g
100: program\(dg
101: .FS
102: \(dg Throughout
103: this document we will show only the first five
104: lines and the last line of data files;
105: omitted lines are indicated by ``...''.
106: .FE
107: .P1
108: \&.G1
109: .d 400mtimes.d
110: \&.G2
111: .P2
112: The single number on each line
113: is the winning time in seconds for the
114: men's 400 meter run,
115: from the first modern Olympic Games (1896)
116: to the twenty-first (1988).
117: If the file
118: .UL olymp.g
119: contains the text above,
120: then typing the command
121: .P1
122: grap olymp.g | pic | troff > junk
123: .P2
124: creates a
125: .I troff
126: output file
127: .UL junk
128: that contains the
129: picture
130: .grap 4001.g
131: The graph shows the decrease
132: in winning times from 54.2
133: seconds to 43.87 seconds.
134: If the times are
135: contained in the file
136: .UL 400mtimes.d ,
137: we could
138: produce the same graph with the
139: shorter program
140: .P1
141: .get 4001.g
142: .P2
143: Writing
144: .UL copy
145: .UL \&"fname"
146: in a
147: \*g
148: program is equivalent to including the
149: contents of file
150: .UL fname
151: at that point in the file.
152: (In the interests of compatibility with other programs,
153: .UL include
154: is a synonym for
155: .UL copy .)
156: .PP
157: Each line in the file
158: .UL 400mpairs.d
159: contains two numbers, the
160: year of the Olympics and the winning time:
161: .P1
162: .d 400mpairs.d
163: .P2
164: If we plot this data with the program
165: .P1
166: .get 4002.g
167: .P2
168: the bottom ($x$) axis represents the year of the Olympics.
169: .grap 4002.g
170: The ``holes'' in $x$-values reflect the fact
171: that the 1916, 1940, and 1944 Olympics
172: were cancelled due to war.
173: Because the previous data
174: (in
175: .UL 400mtimes.d )
176: had just one number per
177: line,
178: \*g
179: viewed it as a ``time series'' and
180: supplied $x$-values of $1, ~ 2, ~ 3, ...$
181: before plotting
182: the data as $y$-values.
183: The input to the
184: second program has two values per line,
185: so they are interpreted as $( x , y )$ pairs.
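.PP
The rule for interpreting input lines can be sketched outside
\*g.
The Python fragment below only illustrates the behavior described in this document;
it is not the preprocessor's code, and the name to_points is invented for the example.
The handling of lines with more than two numbers follows the rule stated later in Section 3.

```python
def to_points(lines):
    """Turn data lines into (x, y) points the way the text describes.

    One number per line: a time series, so x is 1, 2, 3, ...
    Two or more numbers: the first is x, each remaining number is a y.
    """
    points = []
    row = 0
    for line in lines:
        fields = [float(f) for f in line.split()]
        if not fields:
            continue  # skip blank lines
        row += 1
        if len(fields) == 1:
            points.append((float(row), fields[0]))
        else:
            x = fields[0]
            points.extend((x, y) for y in fields[1:])
    return points


# Values quoted in the text; they are not consecutive lines of the data files.
print(to_points(["54.2", "47.6"]))            # time series: x is supplied
print(to_points(["1896 54.2", "1924 47.6"]))  # explicit (x, y) pairs
```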
186: .PP
187: Rather than a scatter plot of points, we might prefer to
188: see the winning times connected by a solid
189: line.
190: The program
191: .P1
192: .get 4003.g
193: .P2
194: produces the graph
195: .grap 4003.g
196: Eric Liddell of Great Britain
197: won his gold medal
198: in Paris in 1924 with a time of 47.6 seconds.
199: (Remember ``Chariots
200: of Fire''?)
201: .PP
202: We can make the graph more attractive
203: by modifying its frame
204: and adding labels.
205: .P1
206: .get 4004.g
207: .P2
208: The
209: .UL frame
210: command describes
211: the graph's bounding box:
212: the overall frame (which has four sides)
213: is invisible, it is 2 inches high and 3 inches
214: wide (which happen to be the
215: default height and width),
216: and the left and bottom
217: sides are solid (they could have been
218: dashed or dotted instead).
219: The labels appear on the left and bottom, as requested.
220: .grap 4004.g
221: .PP
222: To set the range of each axis,
223: \*g
224: examines the data and pads both
225: dimensions
226: by seven percent at each end.
227: The
228: .UL coord
229: (``coordinates'') command
230: allows you to specify the range of one or both axes explicitly;
231: it also turns off automatic padding.
232: .P1
233: .get 4005.g
234: .P2
235: The $y$-axis now ranges from 42 to 56 seconds
236: (a little more than before),
237: and the $x$-axis from 1894 to 1990
238: (a little less).
239: .grap 4005.g
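.PP
The automatic padding can be approximated in a few lines of Python.
This is a sketch under one assumption that the text does not spell out:
that the seven percent is taken of the data span (largest value minus smallest).
The function name padded_range is hypothetical.

```python
def padded_range(values, pad=0.07):
    """Return (low, high) after widening the data range by pad at each end.

    Assumption: the padding is a fraction of the data span (max - min).
    """
    low, high = min(values), max(values)
    margin = pad * (high - low)
    return low - margin, high + margin


low, high = padded_range([1896, 1988])
print(f"default x range: {low:.2f} to {high:.2f}")
```

Under this assumption the years 1896 to 1988 pad out to roughly 1889.6 to 1994.4,
which agrees with the remark that an explicit range of 1894 to 1990 is a little less.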
240: .PP
241: The ticks in the preceding graphs were generated
242: by
243: \*g
244: guessing at reasonable values.
245: If you would rather provide your own,
246: you may
247: use the
248: .UL ticks
249: command,
250: which comes in the flavors illustrated below.
251: .P1
252: .get 4006.g
253: .P2
254: The first
255: .UL ticks
256: command deals with the left axis:
257: it puts the ticks facing out at
258: the numbers in the list.
259: \*G
260: puts labels only at values
261: with strings,
262: except that when no labels at all are
263: given, each number serves as its own label,
264: as in the second
265: .UL ticks
266: command.
267: That command
268: is for the bottom axis:
269: it puts the ticks facing in at steps of 20
270: from 1900 to 1980.
271: The command
272: .UL "ticks off"
273: turns off all ticks.
274: \*G
275: does its best to place labels appropriately, but
276: it sometimes needs your help:
277: the
278: .UL "left .2"
279: clause moves the left label 0.2 inches further left to
280: avoid the new ticks.
281: .grap 4006.g
282: .PP
283: The file
284: .UL 400wpairs.d
285: contains the times for
286: the women's 400 meter race, which has been run
287: only since 1964.
288: .P1
289: .d 400wpairs.d
290: .P2
291: To add these times to the graph,
292: we use
293: .P1
294: .get 4007.g
295: .P2
296: The
297: .UL new
298: command tells
299: \*g
300: to end
301: the old curve and to start a new curve
302: (which in this case will be drawn
303: with a dotted line).
304: Text is placed on the graph by
305: commands of the form
306: .P1
307: "string" at xvalue, yvalue
308: .P2
309: The
310: .UL size
311: clauses following the quoted strings tell
312: \*g
313: to shrink the characters by three points (absolute point sizes
314: may also be specified).
315: Strings are usually centered at the specified position,
316: but can be adjusted by clauses to be illustrated shortly.
317: .grap 4007.g
318: .PP
319: The file
320: .UL phone.d
321: records the number of telephones in the United States from
322: 1900 to 1970.
323: .P1
324: .d phone.d
325: .P2
326: Each line gives a year and the number of telephones
327: present in that year
328: (in millions, truncated to a multiple of one hundred thousand).
329: The simple
330: \*g
331: program
332: .P1
333: .get phone1.g
334: .P2
335: produces the simple graph
336: .grap phone1.g
337: .PP
338: The number of telephones appears to
339: grow exponentially;
340: to study that we will plot the data with
341: a logarithmic $y$-axis by adding
342: .UL log
343: .UL y
344: to the
345: .UL coord
346: command.
347: We will also add cosmetic changes of labels, more ticks,
348: and a solid line to replace the unconnected dots.
349: .P1
350: .get phone2.g
351: .P2
352: The third
353: .UL ticks
354: command provides a string that is used to print the tick
355: labels.
356: .UC C
357: programmers will recognize it as a
358: .UL printf
359: format string; others may view the
360: .CW %g
361: as the place to put
362: the number and anything else (in this case just an apostrophe) as
363: literal text to appear in the labels.
364: To suppress
365: labels, use the empty format string ("").
366: The program produces
367: .grap phone2.g
368: The number of telephones grew rapidly
369: in the first decade of this century,
370: and then settled down to an exponential growth rate upset only
371: by a decrease in the Great Depression and a post-war growth
372: spurt
373: to return the curve to its pre-Depression line.
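.PP
Readers unfamiliar with printf formats may find a small experiment helpful.
Python's percent operator accepts the same style of format string,
so the sketch below mimics how a tick label might be built from a format and a number.
The helper tick_label is invented for this illustration,
and the sample format (an apostrophe followed by %g) is only a guess at the kind of string described above.

```python
def tick_label(fmt, value):
    """Format one tick value with a printf-style format string.

    An empty format suppresses the label, as the text describes.
    """
    if fmt == "":
        return ""
    return fmt % value


for number in (0, 10, 20):
    print(tick_label("'%g", number))
```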
374: .PP
375: Our presentation so far has been to
376: start with a simple
377: \*g
378: program that illustrates the data, and then refine it.
379: Later in this document we will ignore the design
380: phase, and present rather complex graphs in
381: their final form.
382: Beware.
383: .PP
384: All the examples so far have placed data on the
385: graph implicitly by
386: .UL copy ing
387: a file of numbers
388: (either a time series with one number per line or
389: pairs of numbers).
390: It is also possible to draw points and lines explicitly.
391: The
392: \*g
393: commands to draw on a graph
394: are illustrated in the following
395: fragment.
396: .P1
397: .get geom.g
398: .P2
399: .PP
400: The
401: .UL grid
402: command is similar to the
403: .UL ticks
404: command, except that grid lines extend
405: across the frame.
406: The next few commands plot text at specified positions.
407: The plotting characters (such as
408: .UL bullet )
409: are implemented as predefined
410: macros \(em more on that shortly.
411: Unlike arbitrary characters,
412: the visual centers of the markers
413: are near their plotting centers.
414: The
415: .UL circle
416: command draws a circle centered at the specified location.
417: A radius in inches may be specified;
418: if no radius is given, then the circle will be the
419: small circle shown at the center of the graph.
420: The
421: .UL line
422: and
423: .UL arrow
424: commands draw the obvious objects shown at the upper left.
425: .grap geom.g
426: .PP
427: This figure also illustrates the combined use of the
428: .UL draw
429: and
430: .UL next
431: commands.
432: Saying
433: .UL draw
434: .UL A
435: .UL solid
436: defines the style
437: for a connected sequence of line fragments to be called
438: .UL A .
439: Subsequent commands of the form
440: .UL next
441: .UL A
442: .UL at
443: .I point
444: add
445: .I point
446: to the end of
447: .UL A .
448: There are two such sequences active in the above
449: example
450: .UL A "" (
451: and
452: .UL B );
453: note that their
454: .UL next
455: commands are intermixed.
456: Because the predefined string
457: .UL delta
458: follows the specification of
459: .UL B ,
460: that string is plotted at each point in the sequence.
461: .PP
462: \*G
463: has numeric variables (implemented as double-precision
464: floating point numbers) and
465: the usual collection of arithmetic operators and
466: mathematical functions; see the reference section
467: for details.
468: .PP
469: \*G
470: provides the same rudimentary macro facility that
471: .I pic
472: does:
473: .P1
474: define \f2name\fP { \f2replacement text\fP }
475: .P2
476: defines
477: .IT name
478: to be the
479: .IT "replacement text" .
480: The replacement may be any text that contains balanced open and closing braces
481: .UL "{ }" .
482: (Alternatively, the
483: .IT "replacement text"
484: may be quoted by
485: any single character that does not appear in the replacement;
486: the string is terminated by the next occurrence of that character.)
487: Any subsequent occurrence of
488: .IT name
489: will be replaced by
490: .IT "replacement text" .
491: .EQ
492: delim %%
493: .EN
494: .PP
495: The replacement text of a macro definition may
496: contain occurrences of
497: .UL $1 ,
498: .UL $2 ,
499: etc.;
500: these will be replaced by the corresponding actual
501: arguments when the macro is invoked.
502: The invocation for a macro with arguments is
503: .P1
504: name(arg1, arg2, ...)
505: .P2
506: Non-existent arguments are replaced by null
507: strings.
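.PP
The substitution rule is easy to imitate.
The following Python sketch is not how the preprocessor is implemented;
it simply replaces each dollar-digit placeholder with the matching argument
and uses an empty string when the argument is missing.
The name expand_macro and the restriction to nine arguments are assumptions made for the example.

```python
import re


def expand_macro(body, args):
    """Replace $1, $2, ... in body with the matching entries of args.

    Arguments that were not supplied become empty strings.
    Assumption: only single-digit placeholders ($1 to $9) are handled.
    """
    def substitute(match):
        index = int(match.group(1)) - 1
        return args[index] if index < len(args) else ""

    return re.sub(r"\$([1-9])", substitute, body)


print(expand_macro("bullet at $1, $2", ["3", "4"]))
print(expand_macro("[$1][$2]", ["only"]))
```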
508: .EQ
509: delim $$
510: .EN
511: .PP
512: The following
513: \*g
514: program uses macros and arithmetic to plot
515: crude approximations to
516: the square and square root functions.
517: .P1
518: .get macarith.g
519: .P2
520: The macro
521: .UL root
522: uses the
523: .UL ^
524: exponentiation operator.
525: (Because
526: \*g
527: has the square root function
528: .UL sqrt ,
529: that macro is in fact superfluous.)
530: The program produces
531: .grap macarith.g
532: .PP
533: The
534: .UL copy
535: command has a
536: .UL thru
537: parameter that allows each line of a file to
538: be treated as though it were a macro call, with
539: the first field serving as
540: the first argument,
541: and so on.
542: This is the typical
543: \*g
544: mechanism for plotting files that are not stored as
545: time series or as $(x,y)$ pairs.
546: We will illustrate its use on the file
547: .UL states.d ,
548: which contains data on the fifty states.
549: .P1
550: .d states.d
551: .P2
552: The first field is the postal abbreviation of the state's
553: name (Alaska, Wyoming, Vermont, ...), the second field
554: is the number of Representatives to Congress from the state
555: after the 1981 reapportionment, and the third field is
556: the population of the state as measured in the 1980 Census.
557: The states appear in increasing order of
558: population.
559: .PP
560: We will first plot this data as
561: population, representative pairs.
562: (In the
563: .UL coord
564: statement,
565: .UL "log log"
566: is a synonym for
567: .UL "log x log y" .)
568: .P1
569: .get states1.g
570: .P2
571: Although the population is given in persons,
572: the
573: .UL PlotState
574: macro
575: plots the population in millions by dividing
576: the third input field
577: by one million (written in exponential notation
578: as
579: .UL 1e6 ,
580: for $1 times 10 sup 6$).
581: .grap states1.g
582: Using
583: .UL circle
584: as a plotting symbol displays
585: overlapping points that are obscured when
586: the data is plotted with bullets.
587: The representation of a state is roughly proportional
588: to its population, except in the very small states.
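.PP
The effect of copying a file through a macro can be pictured as a loop over its lines.
The Python sketch below is an analogy only:
copy_thru and plot_state are hypothetical names,
and the input line is made up rather than taken from the data file.

```python
def copy_thru(lines, macro):
    """Call macro once per non-blank line, one argument per field."""
    return [macro(*line.split()) for line in lines if line.strip()]


def plot_state(abbrev, reps, population):
    """Stand-in for a plotting macro: returns (x, y, label).

    x is the population in millions, y the number of representatives.
    """
    return (float(population) / 1e6, float(reps), abbrev)


# Made-up input line, not taken from states.d.
print(copy_thru(["XX 2 1000000"], plot_state))
```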
589: .PP
590: Our next plot will use the state's rank
591: in population as the $x$-coordinate and two
592: different $y$-coordinates: population and number of
593: representatives.
594: We will use two
595: .UL coord
596: commands to define the two coordinate systems
597: .UL pop
598: and
599: .UL rep .
600: We then explicitly give the coordinate system
601: whenever we refer to a point,
602: both in constructing axes and plotting data.
603: .P1
604: .get states2.g
605: .P2
606: The
607: .UL copy
608: statement in the program uses an
609: .I "immediate macro"
610: enclosed in curly brackets and thus avoids having to
611: name a macro for this task.
612: Because the program assumes that the states are
613: sorted in increasing order of population, it
614: generates
615: .UL thisrank
616: internally as a
617: \*g
618: variable.
619: The program produces
620: .grap states2.g
621: .PP
622: The plotting symbols were chosen for contrast in
623: both shape and shading.
624: This graph also indicates that representation is proportional
625: to population.
626: Once we see this graph, though, we should realize that we don't
627: really need two coordinate systems: we can relate the two by
628: dividing the population of the U.S. \(em about 226,000,000 \(em by
629: the number of representatives \(em 435 \(em to see that each
630: representative should count as 520,000 people.
631: If the purpose of this graph were to tell a story about
632: American politics rather than to illustrate
633: multiple coordinate systems,
634: it should be redrawn with a single coordinate
635: system.
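.PP
The division behind that figure is quickly checked.
The short Python fragment below uses only the two numbers quoted above;
the variable names are chosen for the example.

```python
population = 226_000_000   # U.S. population quoted in the text
seats = 435                # number of representatives

people_per_seat = population / seats
print(round(people_per_seat))      # nearest whole person
print(round(people_per_seat, -4))  # nearest ten thousand
```

Dividing a state's population by this ratio would put both quantities on a single scale.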
636: .PP
637: Many graphs plot both observed data and a function
638: that (theoretically) describes the data.
639: There are many ways to draw a function
640: in \*g:
641: a series of
642: .UL next
643: commands is tedious but works, as does writing a
644: simple program to write a data file that is subsequently
645: read and plotted by \*g.
646: The
647: .UL for
648: statement often provides a better solution.
649: This
650: \*g
651: program
652: .P1
653: .get sin1.g
654: .P2
655: produces
656: .grap sin1.g
657: .a
658: The
659: .UL for
660: statement uses the same syntax as the
661: .UL ticks
662: statement, but the
663: .UL from
664: keyword can be replaced by
665: .UL = '', ``
666: which will look more familiar to programmers.
667: It varies the index variable over the specified range
668: and for each value executes all statements inside the delimiter
669: characters, which use the same rules as macro
670: delimiters.
671: It is, of course, useful for many tasks beyond plotting functions.
672: .EQ
673: delim %%
674: .EN
675: .PP
676: The
677: .UL if
678: statement provides a simple mechanism for conditional execution.
679: If a file contains data on both cities and states (and lines
680: describing states have ``S'' in the first field), it could be plotted
681: by statements like
682: .P1
683: if "$1" == "S" then {
684: PlotState($2,$3,$4)
685: } else {
686: PlotCity($2,$3,$4,$5,$6)
687: }
688: .P2
689: The
690: .UL else
691: clause
692: is optional; delimiters use the same rules as macros and
693: .UL for
694: statements.
695: .EQ
696: delim $$
697: .EN
698: .NH
699: A Collection of Examples
700: .PP
701: The previous section covered the
702: \*g
703: commands that are used in common graphs.
704: In this section we'll spend less time on
705: language features, and survey a wider variety of
706: graphs.
707: These examples are intended more for browsing and
708: reference than for straight-through reading.
709: Be prepared to refer to the manual in Section 5 when you stumble over a new
710: \*g
711: feature.
712: .PP
713: The file
714: .UL cars.d
715: contains the mileage (miles per gallon) and the weight
716: (pounds) for 74 models of automobiles sold in the United States
717: in the 1979 model year.
718: .P1
719: .d cars.d
720: .P2
721: The trivial
722: \*g
723: program
724: .P1
725: .get cars1.g
726: .P2
727: produces
728: .grap cars1.g
729: This graph shows that weights bottom out somewhat
730: below 2000
731: pounds and that heavier cars get worse mileage;
732: it is hard to say much more about the relationship
733: between weight and mileage.
734: .PP
735: The next graph provides labels, uses circles
736: to expose data hidden in the clouds of bullets,
737: and re-expresses the $x$-axis in gallons per mile.
738: It also changes the point size and vertical spacing
739: to a size appropriate for camera-ready journal articles
740: and books; the size changes should be made outside the
741: \*g
742: program.
743: The
744: .UL \&.ft
745: command changes to a Helvetica font, which
746: some people prefer for graphs.
747: .P1
748: .get cars2.g
749: .P2
750: \*G
751: supports logarithmic re-expression of data with the
752: .UL log
753: clause in the
754: .UL coord
755: statement; any other re-expression of data must be done
756: with
757: \*g
758: arithmetic, as above.
759: .br
760: .grap cars2.g
761: This graph shows that
762: gallons per mile is roughly proportional to weight.
763: (The two outliers near 4000 pounds are the Cadillac
764: Seville and the Oldsmobile 98.)
765: .PP
766: In
767: .I "Visual Display of Quantitative Information" ,
768: Tufte proposes the ``dot-dash-plot'' as a means for maximizing
769: data ink (showing the two-dimensional distribution and
770: the two one-dimensional marginal distributions) while minimizing
771: what he calls ``chart junk'' \(em ink wasted on borders
772: and non-data labels.
773: His preference is easy to express in \*g:
774: .P1
775: .get cars3.g
776: .P2
777: Although visually attractive, we do not find the
778: resulting graph as useful for interpreting the data.
779: .grap cars3.g
780: Tufte's graph does point out two facts that are
781: not obvious in the previous graphs:
782: there is a gap in car weights near 3000 pounds (exhibited
783: by the hole in the $y$-axis ticks), and the gallons per
784: mile axis is regularly structured (the ticks
785: are the reciprocals of an almost dense sequence of integers).
786: The reader may decide whether those insights are worth
787: the decrease in clarity.
788: .PP
789: Throughout the twentieth century, horses, cars and people
790: have gotten faster;
791: let's study those improvements.
792: For horses, we'll consider the winning times
793: of the Kentucky Derby from 1909 to 1988, in
794: the file
795: .UL speedhorse.d :
796: .P1
797: .d speedhorse.d
798: .P2
799: The program
800: .P1
801: .get speedhorse1.g
802: .P2
803: produces the graph
804: .grap speedhorse1.g
805: Each race is recorded with a bullet and
806: record times are marked by horizontal lines.
807: Secretariat is the only horse to have run the
808: one-and-a-quarter-mile
809: race in under two minutes; he won in 1973 in
810: 1:59.4.
811: .PP
812: For automobiles we will study the
813: world land speed record (even though those vehicles
814: are by now just low-flying airplanes).
815: The file
816: .UL speedcar.d
817: lists years in which speed records were set and the record
818: set in that year, in miles per hour averaged over a one-mile
819: course.
820: .P1
821: .d speedcar.d
822: .P2
823: We will plot the data with the following
824: \*g
825: program, which uses nested braces in the
826: .UL copy
827: and
828: .UL if
829: statements.
830: .P1
831: .get speedcar1.g
832: .P2
833: .PP
834: Each record line is drawn after the
835: .I next
836: record is read, because
837: the program must know when the record was broken to draw
838: its line.
839: The
840: .UL if
841: statement handles the first record, and the extra
842: .UL line
843: command extends the last record out to the current date.
844: .grap speedcar1.g
845: The horizontal lines reflect the nature of world records: they
846: last until they are broken.
847: The records could also have been plotted by a scatterplot
848: in which each point represents the setting of a record,
849: but it would be misleading to connect adjacent
850: points with line segments
851: (which we inappropriately did in the graphs
852: of the Olympic 400 meter run).
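.PP
The bookkeeping described above (draw each record's line only once the next record is known,
then extend the last one to the present) can be sketched in Python.
This is an illustration of the idea, not a transcription of the program shown;
the function name and the sample numbers are invented.

```python
def record_segments(records, today):
    """Return horizontal segments ((x1, y), (x2, y)) for a list of records.

    records: (year, value) pairs in chronological order.
    Each record's line runs until the year of the next record;
    the last one is extended to today.
    """
    segments = []
    previous = None
    for year, value in records:
        if previous is not None:
            # the earlier record ended when this one was set
            segments.append(((previous[0], previous[1]), (year, previous[1])))
        previous = (year, value)
    if previous is not None:
        segments.append(((previous[0], previous[1]), (today, previous[1])))
    return segments


# Invented numbers, for illustration only.
for segment in record_segments([(1900, 10), (1910, 12), (1925, 15)], today=1930):
    print(segment)
```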
853: .PP
854: The following graph shows the world record times for the
855: one mile run;
856: because its
857: \*g
858: program is so similar to its automotive counterpart,
859: we won't show the program or data.
860: .grap speedman1.g
861: The three graphs show three different kinds of
862: changes.
863: Although horses are getting faster, they appear to
864: be approaching a barrier near two minutes.
865: Cars show great jumps as new technologies are introduced
866: followed by a plateau as limits of the
867: technology are reached.
868: Milers have shown a fairly consistent
869: linear improvement
870: over this century, but there must be an
871: asymptote down there somewhere.
872: .PP
873: The next file gives the median heights of boys
874: in the United States aged 2 to 18, together with
875: the fifth and ninety-fifth percentiles.
876: .P1
877: .d boyhts.d
878: .P2
879: The heights are given in centimeters (1 foot = 30.48 centimeters).
880: The trivial program
881: .P1
882: .get boyhts1.g
883: .P2
884: displays the data as
885: .grap boyhts1.g
886: Because there are four numbers on each input line, the first is
887: taken as an $x$-value and the remaining three are plotted
888: as $y$-values.
889: .PP
890: The three curves appear to be roughly straight
891: (at least up to age 16),
892: so it makes sense to fit a line
893: through them.
894: We will use the standard least squares regression
895: in which
896: .EQ
897: slope ~=~ {
898: {n SIGMA x y ~ - ~ SIGMA x SIGMA y }
899: over
900: {n SIGMA x sup 2 ~ - ~ ( SIGMA x ) sup 2 }
901: }
902: .EN
903: (where the summations range over all $n$ $x$ and $y$ values
904: in the data set) and the $y$-intercept is
905: .EQ
906: {SIGMA y ~ - ~ slope times SIGMA x} over n
907: .EN
908: The following
909: \*g
910: program boldly (and rather foolishly) implements that formula.
911: .P1
912: .get boyhts3.g
913: .P2
914: It plots the fifth and ninety-fifth percentiles as the ends of a bar through
915: the median, which is plotted as a bullet.
916: All heights are converted to feet before plotting and calculating
917: the regression line.
918: .grap boyhts3.g
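.PP
For readers who want to check the two formulas independently,
here is a direct transcription into Python.
It is a minimal sketch: the function name is hypothetical,
the test points are chosen to lie exactly on a known line rather than taken from the height data,
and it shares the numerical weaknesses of the textbook formula.

```python
def least_squares(points):
    """Return (slope, intercept) of the least squares line through points."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


# Points chosen to lie exactly on y = 2 + 3x; they are not data from the text.
print(least_squares([(0, 2), (1, 5), (2, 8), (3, 11)]))
```

The function needs at least two distinct x values; otherwise the denominator is zero.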
919: .PP
920: \*G
921: .UL print
922: statements write on
923: .UL stderr
924: as they are processed by \*g;
925: their single argument can be either an expression or a string.
926: The
927: .UL print
928: statements (which are commented out in
929: the above
930: \*g
931: program) at one time
932: showed that the regression line is
933: .EQ
934: Height ~ in ~ Feet ~ = ~ 2.61 ~ + ~ .19 times Age
935: .EN
936: Thus for most American
937: boys between 3 and 16, you may safely assume
938: that they started out life at 2 feet 7 inches and grew at the
939: rate of two and a quarter inches per year.
940: .PP
941: This program probably misapplies \*g;
942: if you really want to perform least squares regressions on
943: data, you should usually use a simple
944: .I awk
945: program like
946: .P1
947: .get regress.awk
948: .P2
949: (Be warned, though, that this program is not numerically
950: robust.)
951: .PP
952: While we're on the subject of fitting straight lines to data,
953: we'll redraw three graphs from J. W. Tukey's
954: .I "Exploratory Data Analysis" .
955: The file
956: .UL usapop.d
957: records the population of the United States
958: in millions at ten-year intervals.
959: .P1
960: .d usapop.d
961: .P2
962: Tukey's first two graphs indicate that the later population
963: growth was linear while the early growth was exponential.
964: The following
965: \*g
966: program plots them as a pair, using
967: .UL graph
968: commands to place internally unrelated graphs adjacent to
969: one another.
970: .P1
971: .get usapop1.g
972: .P2
973: The statements defining each graph are indented for clarity.
974: The second graph has the northern point of its frame 0.05
975: inch below the southern point of the frame of the first graph;
976: the
977: .UL with
978: clause is passed directly through to
979: .I pic
980: without being evaluated for macros or expressions.
981: The names of both graphs begin with capital letters to
982: conform to
983: .I pic
984: syntax for labels.
985: .grap usapop1.g
986: .PP
987: Polynomial functions lie between the linear and exponential
988: functions; Tukey shows how a seventh-degree polynomial provides
989: a better (and longer) fit to the early population growth.
990: .P1
991: .get usapop2.g
992: .P2
993: This program re-expresses the $x$-axis with
994: \*g
995: arithmetic and uses an
996: .UL if
997: statement to graph only part of the data file.
998: It produces
999: .grap usapop2.g
1000: .nr k \n%
1001: The
1002: .I eqn
1003: .UL "space 0"
1004: clause is necessary to keep
1005: .I eqn
1006: from adding extra space that would interfere
1007: with positions computed by \*g;
1008: see Section 4.
1009: .PP
1010: The file
1011: .UL army.d
1012: contains four related time series
1013: describing the United States Army.
1014: .P1
1015: .d army.d
1016: .P2
1017: The first field is the year; the next four fields give
1018: the number of male officers, female officers, enlisted males
1019: and enlisted females, each in thousands.
1020: (Actually, there were no female enlisted personnel in the
1021: Army until 1943; the value 1 in 1940 and 1942 is just
1022: a placeholder, since
1023: \*g
1024: has no mechanism for handling missing data.)
1025: The following
1026: \*g
1027: program draws the four series with four different sets of
1028: .UL draw
1029: and
1030: .UL next
1031: commands.
1032: .P1
1033: .get army1.g
1034: .P2
1035: The program labels the lines by
1036: .UL copy ing
1037: immediate data;
1038: the program is therefore shorter to write and easier to change.
1039: The delimiter string
1040: .UL XXX
1041: in the
1042: .UL until
1043: clause could be deleted in this graph: the
1044: .UL \&.G2
1045: line also denotes the end of data.
1046: Even though that string is enclosed in quotes,
1047: it may not contain spaces.
1048: The $y$-positions of the labels are the
1049: result of several iterations.
1050: .grap army1.g
1051: .PP
1052: This data can tell many stories: the buildup during the
1053: Second World War is obvious, as is the exodus after the
1054: war; increases during Korea and Vietnam are
1055: also apparent.
1056: We will consider a different story: the ratio of
1057: enlisted men to the three other classes of personnel.
1058: There are several ways to plot this data
1059: (the most obvious graph uses three time series showing how
1060: the ratios change over time, and is
1061: left as an exercise for the reader).
1062: .PP
1063: We will instead construct a graph that gives little insight into this
1064: data, but illustrates a general method that is quite useful
1065: in conjunction with \*g.
1066: The graph is a ``scatterplot vector'' that shows how one
1067: variable (the number of enlisted men) varies as a function of
1068: the other three.
1069: Breaking with tradition, we first show the final graphs, all
1070: of which have logarithmic scales.
1071: .grap army2.g
1072: The number of enlisted men is almost linearly
1073: related to the number of male officers, it is somewhat related to the number
1074: of female officers, and it varies widely as a function of the number
1075: of enlisted women.
1076: .PP
1077: Much more interesting than the graph itself is the method we used to
1078: produce it.
1079: We wrote a miniature ``compiler'' that accepts as
1080: its ``source language'' a description of a scatterplot vector and
1081: produces as ``object code'' a
1082: \*g
1083: program to draw the graph.
1084: The source program for the above example is
1085: .P1
1086: .get army2.v
1087: .P2
1088: The program lists several
1089: global attributes of the graph, the
1090: $y$-variable to be plotted, and as many $x$-variables as
1091: are desired; with each variable is its field in the file
1092: and a descriptive string.
1093: The language is ``compiled'' by the following
1094: .I awk
1095: program.
1096: .P1
1097: .get scatvec.awk
1098: .P2
1099: Running this program on the above description produces the following
1100: output, which is typically piped directly to \*g.
1101: .P1
1102: .get army2.g
1103: .P2
1104: The generated program uses the
1105: .I pic
1106: trick of re-using the same name
1107: .UL A ) (
1108: for several objects.
1109: .PP
1110: Although the program above is merely a toy,
1111: ``minicompilers'' can produce useful preprocessors
1112: for \*g.
1113: The
1114: .UL scatmat
1115: program, for instance, is a 90-line
1116: .I awk
1117: program that reads a simple input language and produces as
1118: output a
1119: \*g
1120: program to produce a ``scatterplot matrix'', which
1121: is a handy graphical device for spotting pairwise interactions
1122: among several variables.
1123: If
1124: \*g
1125: lacks a feature you desire, consider building
1126: a simple preprocessor to provide it.
1127: An alternative is to define
1128: macros for the task; which approach is best depends
1129: strongly on the job you wish to accomplish.
1130: .PP
1131: The next graph uses iterators to make a graph without
1132: reading data from a file.
1133: Rather, its ``data'' is a
1134: function of two variables
1135: that describes a
1136: derivative field and a function of one variable
1137: that describes one solution to the differential
1138: equation.
1139: .P1
1140: .get ode1.g
1141: .P2
1142: The left label uses
1143: .I eqn
1144: text between the $font CW "$$"$ delimiters.
1145: The variable
1146: .UL scale
1147: ensures that all lines in the direction field are the same
1148: length.
1149: The
1150: .UL in
1151: clauses in the
1152: .UL ticks
1153: statements specify that the ticks go in zero inches
1154: to avoid overprinting.
1155: The variables
1156: .UL tx
1157: and
1158: .UL ty
1159: are so named because
1160: .UL x
1161: and
1162: .UL y
1163: are reserved words for the
1164: .UL coord
1165: statement.
1166: .grap ode1.g
1167: .PP
1168: Programmers familiar with floating point arithmetic may be
1169: surprised that the above graph is correct.
1170: Because of roundoff error, iteration
1171: .UL "from 0 to 1 by .05" '' ``
1172: usually produces the values
1173: $0, ~ .05, ~ .10, ~ ..., ~ .95$ and stops short of the endpoint 1.
1174: \*G
1175: uses a ``fuzzy test''
1176: in the
1177: .UL for
1178: statement to avoid that problem, which may in turn introduce
1179: other problems.
1180: Such problems may be avoided by iterating over an integer range
1181: and incrementing a non-integer value within the loop.
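.PP
As a minimal sketch of that workaround (it is not one of the original examples,
and the variable names
.UL i
and
.UL t
are illustrative),
the loop below counts over the integers 0 through 20 and advances
.UL t
by .05 on each pass, so the number of iterations does not depend on roundoff:
.P1
t = 0
for i from 0 to 20 do {
    bullet at t, t^2
    t = t + .05
}
.P2
The loop plots 21 points, the last of them at or very near 1.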
1182: .PP
1183: Most of the data we have seen so far is inherently
1184: two (or more) dimensional.
1185: As an example of one-dimensional data, we will return to
1186: the populations of the fifty states, which
1187: is the third field in the file
1188: .UL states.d
1189: introduced earlier;
1190: the file is sorted in increasing order of population.
1191: Our first graph takes the most space, but
1192: it also gives the most information.
1193: .P1
1194: .get states8.g
1195: .P2
1196: The
1197: .UL L
1198: macro (for Label)
1199: with input parameter $X$ evaluates to the number
1200: $2 sup X / 1,000,000$ followed by the string "$X$"
1201: (the
1202: .UL ticks
1203: command expects a number followed by a string label).
1204: .grap states8.g
1205: The dotted line is the least squares regression
1206: .EQ
1207: log sub 10 ~ Population ~ = ~ 7.214 ~ - ~ .03 times Rank
1208: .EN
1209: which gives 15.3 million as the population of the
1210: largest state and .515 million as the population
1211: of the smallest state.
1212: It says that
1213: population drops by a factor of two every ten states
1214: (compare the top and left scales).
1215: As sloppy as the exponential fit is, though, it is a much better
1216: fit to this data
1217: than a Zipf's Law curve is (drawing that curve is left as
1218: an exercise for the reader).
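.PP
The two figures quoted above can be checked by hand from the rounded
coefficients, assuming that the largest state has rank 1:
.EQ
10 sup {7.214 - .03} ~ = ~ 10 sup 7.184 ~ approx ~ 15.3 times 10 sup 6
.EN
and a step of ten in rank multiplies the population by
.EQ
10 sup {- .03 times 10} ~ = ~ 10 sup {-.3} ~ approx ~ .50
.EN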
1219: .PP
1220: The next graph is a more standard representation of
1221: one-dimensional data.
1222: .P1
1223: .get states3.g
1224: .P2
1225: The markers were chosen to be
1226: .UL vtick s
1227: because they denote only an $x$-value.
1228: .grap states3.g
1229: .PP
1230: The next one-dimensional graph uses the state's name as
1231: its marker; to reduce overprinting the graph is ``jittered''
1232: by using a random number as a $y$-value.
1233: .P1
1234: .get states4.g
1235: .P2
1236: The function
1237: .UL rand()
1238: returns a pseudo-random real number chosen uniformly over the interval [0,1).
1239: .grap states4.g
1240: This graph is too cluttered; circles would have been
1241: a better choice as a plotting symbol (bullets, once again, would
1242: hide data).
1243: .PP
1244: Histograms are a standard way of presenting one-dimensional
1245: data in two-dimensional form.
1246: Our first step in building a histogram of the population
1247: data is the following
1248: .I awk
1249: program, which counts how many states are in each ``bin''
1250: of a million people.
1251: .P1
1252: .get states5.awk
1253: .P2
1254: The variable
1255: .UL bzs
1256: tells where bin zero starts; although it is zero in this
1257: graph, it might be 95 in a histogram
1258: of human body temperatures in degrees Fahrenheit.
1259: The program produces the following output in
1260: .UL states2.d :
1261: .P1
1262: .d states2.d
1263: .P2
1264: There are 12 states with population between 0 and 999,999,
1265: 5 states with population between 1,000,000 and 1,999,999,
1266: and so on.
1267: .PP
1268: This
1269: \*g
1270: program uses three
1271: .UL line
1272: commands to plot each rectangle in the histogram.
1273: .P1
1274: .get states5.g
1275: .P2
1276: It produces
1277: .grap states5.g
1278: .PP
1279: The same file can be plotted in a
1280: more attractive (and more useful) form by
1281: .P1
1282: .get states6.g
1283: .P2
1284: which produces
1285: one of Bill Cleveland's ``dot charts'' or ``lolliplots'':
1286: .grap states6.g
1287: (We use
1288: .UL \e(bu ,
1289: the
1290: .I troff
1291: character for a bullet, rather than the built-in string to
1292: get a larger size.)
1293: .PP
1294: Other histograms are possible.
1295: The following
1296: .I awk
1297: program
1298: .P1
1299: .get states7.awk
1300: .P2
1301: produces the file
1302: .UL states3.d
1303: .P1
1304: .d states3.d
1305: .P2
1306: which lists the state's abbreviation, bin number, and
1307: height within the bin.
1308: The
1309: \*g
1310: program
1311: .P1
1312: .get states7.g
1313: .P2
1314: reads that file to make the following histogram, in which
1315: the state names are used to display the heights of the bins.
1316: In each bin, the states occur in increasing order of
1317: population from bottom to top.
1318: .grap states7.g
1319: .PP
1320: The next data set is a run-time profile of an early version of \*g,
1321: created by compiling the program with the
1322: .UL -p
1323: option and running
1324: .UL prof
1325: after the program executed.
1326: .P1
1327: .d prof1.d
1328: .P2
1329: Although there were more than fifty procedures in the program, the
1330: top four time-hogs accounted for more than half of the run time.
1331: This file is difficult for
1332: \*g
1333: to deal with:
1334: even though
1335: .UL if
1336: statements would allow us to extract lines 2 through 11
1337: of the file, we could not remove the leading
1338: .CW _
1339: from a routine name or access the last field in a record.
1340: We will therefore process it with
1341: the following
1342: .I awk
1343: program.
1344: .P1
1345: .get prof1.awk
1346: .P2
1347: The program produces
1348: .P1
1349: .d prof2.d
1350: .P2
1351: We could even use the
1352: .I sh
1353: statement to execute the
1354: .I awk
1355: program from within \*g, which would make the latter entirely
1356: self-contained (see the reference manual for details).
1357: .PP
1358: We will display the data with this program.
1359: .P1
1360: .get prof1.g
1361: .P2
1362: Observe that the program knows nothing about the range of the data.
1363: It uses default ticks and a
1364: .UL frame
1365: statement with a computed height to achieve
1366: total data independence.
1367: .grap prof1.g
1368: This bar chart highlights the fact that most of the time spent by
1369: \*g
1370: is devoted to input and output.
1371: .PP
1372: J. W. Tukey's box and whisker plots
1373: represent the median, quartiles, and extremes of a
1374: one-dimensional distribution.
1375: The following
1376: \*g
1377: program defines a macro to draw a box plot, and then
1378: uses that shape to compare the distribution of heights of
1379: volcanoes with the distribution of heights of States of the Union.
1380: .P1
1381: .get box1.g
1382: .P2
1383: Boxes are one of many shapes used for the graphical
1384: representation of several quantities.
1385: If you use such shapes frequently then you should
1386: make a library file of their macros to
1387: .UL copy
1388: into your
1389: \*g
1390: programs.
1391: The above program produces
1392: .grap box1.g
1393: Even though the extreme heights are the same, state heights
1394: have a lower median and a greater spread.
1395: .PP
1396: Someday you may use
1397: \*g
1398: to prepare overhead transparencies, only to find that
1399: everything comes out too small.
1400: The following program illustrates some ways to get larger
1401: graphs.
1402: .P1
1403: .zzz slide1.g
1404: .P2
1405: The
1406: .UL ps
1407: and
1408: .UL vs
1409: commands preceding the graph set the text size to 14 points and
1410: the vertical spacing to 18 points; the two quantities are
1411: reset by the commands following the
1412: .UL .G2 .
1413: Such size changes should be made outside the
1414: \*g
1415: program, as mentioned earlier.
1416: The
1417: .UL 4
1418: following the
1419: .UL .G1
1420: stretches the graph (including
1421: \*g's
1422: estimate of the accompanying text) to be four inches wide;
1423: it is an alternative to altering the
1424: .UL frame
1425: command.
1426: The macro
1427: .UL blob
1428: is a plotting symbol that is much larger than
1429: .UL bullet ;
1430: the different name ensures that later references to
1431: .UL bullet
1432: are unaffected.
1433: The
1434: .I troff
1435: commands within the
1436: .UL blob
1437: string move the character down one-tenth of an em
1438: to center its plotting position (determined experimentally)
1439: and then reset the vertical position.
1440: The program produces this trivial (but large) graph.
1441: .br
1442: .grap slide1.g
1443: .NH
1444: Using Grap
1445: .PP
1446: Following are a few day-to-day matters about using \*g.
1447: .NH 2
1448: Errors
1449: .PP
1450: \*G
1451: attempts to pinpoint input errors; for example,
1452: the input
1453: .P1
1454: \&.G1
1455: i = i + 1
1456: .P2
1457: results in this message on
1458: .UL stderr :
1459: .P1
1460: grap: syntax error near line 1, file -
1461: context is
1462: i = i >>> + <<< 1
1463: .P2
1464: The error was noticed
1465: at the
1466: .UL + .
1467: Unfortunately, pinpointing is not the same as explaining:
1468: the real error is that the variable
1469: .UL i
1470: was not initialized.
1471: .PP
1472: The ``words''
1473: .UL x
1474: and
1475: .UL y
1476: are reserved (for the
1477: .UL coord
1478: statement);
1479: you will get an equally inexplicable syntax error message if you use them
1480: as variable names.
1481: (This design is bad, but not nearly so bad as
1482: having the
1483: .UL log
1484: and
1485: .UL exp
1486: functions use base 10.)
1487: .PP
1488: \*G
1489: tries to load a file of standard macro definitions
1490: .UL /usr/lib/grap.defines ) (
1491: for terms like
1492: .UL bullet ,
1493: .UL plus ,
1494: etc.
1495: It doesn't complain if that file isn't found,
1496: but if you later use one of these words,
1497: you'll get a syntax error message.
1498: .PP
1499: Certain constructs suggested by analogy to
1500: .I pic
1501: do not work.
1502: For example,
1503: .UL .GS
1504: and
1505: .UL .GE
1506: would have been nicer than
1507: .UL .G1
1508: and
1509: .UL .G2 ,
1510: but they were already taken.
1511: The
1512: .I pic
1513: construct
1514: .P1
1515: \&.PS <file
1516: .P2
1517: has been superseded by
1518: \*g's
1519: .UL copy
1520: command (which in turn has been retrofitted into
1521: .I pic ).
1522: .NH 2
1523: \fITroff\fP issues
1524: .PP
1525: You may use
1526: .I troff
1527: commands like
1528: .UL .ps
1529: or
1530: .UL .ft
1531: to change text sizes and fonts within a graph,
1532: or use balanced
1533: .UL \es
1534: and
1535: .UL \ef
1536: commands within a string.
1537: Do not, however,
1538: add space
1539: .UL .sp ) (
1540: or change the line spacing
1541: .UL .vs , (
1542: .UL .ls )
1543: within a graph.
1544: Some defined terms like
1545: .UL bullet
1546: contain embedded size changes;
1547: further qualifying them with
1548: \*g
1549: .UL size
1550: commands may not always work.
1551: .PP
1552: Because
1553: \*g
1554: is built on top of
1555: .I pic ,
1556: the following quote from the
1557: .I pic
1558: manual is relevant:
1559: ``There is a subtle problem with complicated equations inside
1560: .I pic
1561: pictures \(em they come out wrong if
1562: .I eqn
1563: has to leave extra vertical space for the equation.
1564: If your equation involves more than subscripts and superscripts,
1565: you must add to the beginning of each such equation the extra information
1566: .UL "space 0" ''.
1567: This feature was illustrated in the graph of the
1568: United States population in Section 3.
1569: .NH 2
1570: Alternatives
1571: .PP
1572: Besides
1573: \*g
1574: and your local draftsperson, what other choices are there?
1575: .PP
1576: The S system |reference(slanguage chambers) provides
1577: a host of tools for statistical analysis,
1578: but somewhat fewer tools than
1579: \*g
1580: for producing document-quality graphs.
1581: S produces graphs on the screen of a DMD 5620 terminal much more quickly than
1582: \*g
1583: (often in seconds rather than minutes), but it
1584: takes somewhat longer to learn (at least for us).
1585: If you expect to do a lot of interactive data analysis, then
1586: S is probably the right tool for you.
1587: S may be used to generate
1588: .I pic
1589: commands.
1590: .PP
1591: The standard UNIX program
1592: .I graph
1593: provides many of the basic features of
1594: \*g,
1595: though with quite a bit less control over details, particularly
1596: text.
1597: It produces output only in the
1598: .UX
1599: .I plot (5)
1600: language,
1601: which may be processed by a variety of filters
1602: for a variety of output devices.
1603: .PP
1604: The original
1605: .UX
1606: typesetter graphics programs are
1607: .I pic
1608: and
1609: .I ideal ;
1610: you may be able to do as well without using
1611: \*g
1612: as an intermediary.
1613: In particular,
1614: .I ideal
1615: provides shading and clipping,
1616: which are useful
1617: in presentation-quality bar charts and the like, but are
1618: well beyond the capabilities of
1619: .I pic .
1620: .EQ
1621: delim $$
1622: .EN
1623: .NH
1624: References
1625: .LP
1626: |reference_placement
1627: .NH
1628: Reference Manual
1629: .PP
1630: In the following,
1631: .I italic
1632: terms are syntactic categories,
1633: .UL typewriter
1634: terms are literals,
1635: parenthesized constructs are optional, and ... indicates repetition.
1636: In most cases, the order of statements,
1637: constructs and attributes is immaterial.
1638: .P1
1639: .IT "grap program" :
1640: .G1 \f2(width in inches)\fP
1641: \f2grap statement\fP
1642: ...
1643: .G2
1644: .P2
1645: A width on the
1646: .UL .G1
1647: line overrides the computed width, as in
1648: .I pic .
1649: .P1
1650: .IT "grap statement" :
1651: .I
1652: frame \(or label \(or coord \(or ticks \(or grid \(or plot \(or line \(or circle \(or draw \(or new \(or next
1653: \(or graph \(or numberlist \(or copy \(or for \(or if \(or sh \(or pic \(or assignment \(or print
1654: .ft
1655: .P2
1656: .PP
1657: The
1658: .UL frame
1659: statement defines the frame that surrounds the graph:
1660: .P1
1661: .IT frame :
1662: frame \f2(\fPht \f2expr)\fP \f2(\fPwid \f2expr)\fP \f2((side) linedesc)\fP \f2...\fP
1663: .IT side :
1664: top \(or bot \(or left \(or right
1665: .IT linedesc :
1666: solid \(or invis \(or dotted \f2(expr)\fP \(or dashed \f2(expr)\fP
1667: .P2
1668: Height and width default to 2 and 3 inches;
1669: sides default to solid.
1670: If
1671: .I side
1672: is omitted, the
1673: .I linedesc
1674: applies to the entire frame.
1675: The optional expressions after
1676: .UL dotted
1677: and
1678: .UL dashed
1679: change the spacing exactly as in
1680: .I pic .
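.PP
For instance, the following statement (a sketch written from the rules above,
not one of the tutorial examples)
asks for a frame 1.5 inches high and 2.5 inches wide in which only the
left and bottom sides are drawn:
.P1
frame invis ht 1.5 wid 2.5 left solid bot solid
.P2
The leading
.UL invis
names no side and so applies to the whole frame;
we assume here that the later clauses override it for the sides they name.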
1681: .PP
1682: The
1683: .UL label
1684: statement places a label on a specified side:
1685: .P1
1686: .IT label :
1687: label \f2side\fP \f2strlist\fP \f2...\fP \f2shift\fP
1688: .IT shift :
1689: left\f2 \(or \fPright\f2 \(or \fPup\f2 \(or \fPdown \f2expr ...\fP
1690: .IT strlist :
1691: \f2str ... (\fPrjust\f2 \(or \fPljust\f2 \(or \fPabove\f2 \(or \fPbelow\f2) ... (\fPsize \f2(\fP\(+-\f2) expr) ...\fP
1692: .IT str :
1693: "\f2...\fP"
1694: .P2
1695: Lists of text strings are stacked vertically.
1696: In any context, string lists may contain clauses
1697: to adjust the position or change the point size.
1698: Each clause applies to the string preceding it
1699: and all following strings.
1700: Labels may also have a
1701: .UL width
1702: attribute, to override
1703: \*g's
1704: default computation.
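.PP
As an illustration (the label text is invented for this sketch),
the first statement below stacks two strings on the left side and
shifts them .2 inch further left;
the second sets the bottom label two points smaller than the current size:
.P1
label left "Population" "(millions)" left .2
label bot "Rank in Population" size -2
.P2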
1705: .PP
1706: Normally the coordinate system is defined by the data,
1707: with 7 percent extra on each side.
1708: (To change that to 5 percent, assign 0.05 to the
1709: \*g
1710: variable
1711: .UL margin ,
1712: which is reset to 0.07 at each
1713: .UL .G1
1714: statement.)
1715: The
1716: .UL coord
1717: statement defines an overriding system:
1718: .P1
1719: .IT coord :
1720: coord \f2(name)\fP \f2(\fPx \f2expr,expr)\fP \f2(\fPy \f2expr,expr)\fP \f2(\fPlog x \(or log y \(or log log\f2) \fP
1721: .P2
1722: Coordinate systems can be named;
1723: ranges, logarithmic scaling, etc., are done separately for each.
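.PP
The following sketch defines two named coordinate systems over the same
horizontal range, one with a logarithmic vertical scale,
and plots one point in each.
The names
.UL pop
and
.UL area
and all of the ranges are hypothetical, chosen only for illustration:
.P1
coord pop x 0,50 y .1,100 log y
coord area x 0,50 y 0,600
bullet at pop 10, 5
plus at area 10, 300
.P2
The name before each point selects the coordinate system in which
that point is interpreted.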
1724: .PP
1725: The
1726: .UL ticks
1727: statement places tick marks on one side of the frame:
1728: .P1
1729: .IT ticks :
1730: ticks \f2side\fP \f2(\fPin \(or out \f2(expr))\fP \f2(shift) (tick-locations)\fP
1731: .IT tick-locations :
1732: at \f2(name) expr (str)\fP, \f2expr (str)\fP, \f2...\fP
1733: \(or from \f2(name) expr\fP to \f2expr\fP \f2(\fPby \f2(op) expr)\fP \f2(str)\fP
1734: .P2
1735: If no ticks are specified, they will be provided automatically;
1736: .UL ticks
1737: .UL off
1738: suppresses automatic ticks.
1739: The optional expression after
1740: .UL in
1741: or
1742: .UL out
1743: specifies the length of the ticks in inches.
1744: The optional name refers to a coordinate system.
1745: If
1746: .IT str
1747: contains
1748: format specifiers like
1749: .UL %f
1750: or
1751: .UL %g ,
1752: they are interpreted as by
1753: .UL printf .
1754: If no
1755: .IT str
1756: is supplied, the tick labels will be the values of the
1757: expressions.
1758: .PP
1759: If the
1760: .UL by
1761: clause is omitted, steps are of size 1.
1762: If the
1763: .UL by
1764: expression is preceded by one of
1765: .UL + ,
1766: .UL - ,
1767: .UL *
1768: or
1769: .UL / ,
1770: the step is scaled by that operator,
1771: e.g.,
1772: .UL *10
1773: means that each step is 10 times the previous one.
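.PP
As a short sketch of the two kinds of step (the ranges are illustrative),
the first statement below should place ticks at 1, 10, 100, 1000 and 10000,
and the second at 0, 25, 50, 75 and 100:
.P1
ticks bot out from 1 to 10000 by *10
ticks left out from 0 to 100 by 25
.P2
The multiplicative form is a natural companion to a logarithmic scale.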
1774: .PP
1775: The
1776: .UL grid
1777: statement produces grid lines along (i.e., perpendicular to)
1778: the named side.
1779: .P1
1780: .IT grid :
1781: grid \f2side (linedesc) (shift) (tick-locations)\fP
1782: .P2
1783: Grids are labeled by the same mechanism as
1784: .UL ticks .
1785: It is possible to draw grids without ticks by placing the phrase
1786: .UL ticks
1787: .UL off
1788: after the side name and before the iterator.
1789: .PP
1790: Plot
1791: statements place text at a point:
1792: .P1
1793: .IT plot :
1794: \f2strlist\fP at \f2point\fP
1795: plot \f2expr (str)\fP at \f2point\fP
1796: .IT point :
1797: \f2(name) expr,expr\fP
1798: .P2
1799: As in the
1800: .UL label
1801: statement, the string list may contain
1802: position and size modifiers.
1803: The
1804: .UL plot
1805: statement uses the optional format string as in C's
1806: .UL printf
1807: function \(em it may contain a
1808: .UL %f
1809: or
1810: .UL %g .
1811: The optional name refers to a coordinate system.
1812: .PP
1813: The
1814: .UL line
1815: statement draws a line or arrow from here to there:
1816: .P1
1817: .IT line :
1818: \f2(\fPline \(or arrow\f2)\fP from \f2point\fP to \f2point (linedesc)\fP
1819: .P2
1820: The
1821: .UL circle
1822: statement draws a circle:
1823: .P1
1824: .IT circle :
1825: circle at \f2point (\fPradius \f2expr)\fP
1826: .P2
1827: The radius is in inches; the default size is small.
1828: .PP
1829: The
1830: .UL draw
1831: statement defines a sequence of lines:
1832: .P1
1833: .IT draw :
1834: draw \f2(name) linedesc (str)\fP
1835: .P2
1836: Subsequent data for the named sequence
1837: will be plotted as a line of the specified style,
1838: with the optional
1839: .IT str
1840: plotted at each point.
1841: The
1842: .UL next
1843: statement continues a sequence:
1844: .P1
1845: .IT next :
1846: next \f2(name)\fP at \f2point (linedesc)\fP
1847: .P2
1848: If a line description is specified, it overrides the default
1849: display mode for the line segment ending at
1850: .I point .
1851: The
1852: .UL new
1853: statement starts a new sequence; it has the same format as the
1854: .UL draw
1855: statement.
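.PP
The sketch below, with made-up coordinates, puts the three statements together.
It draws a solid line through three points, with the last segment dotted,
and then starts a second, dashed line:
.P1
draw solid
next at 1,1
next at 2,4
next at 3,9 dotted
new dashed
next at 1,2
next at 3,6
.P2
Neither sequence is named here, so each
.UL next
continues the sequence most recently started.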
1856: .PP
1857: A line consisting of a set of numbers
1858: is treated as a family of points
1859: $x$, $y sub 1$, $y sub 2$, etc.,
1860: to be plotted at the single
1861: $x$ value.
1862: .P1
1863: .IT numberlist :
1864: \f2number\fP ...
1865: .P2
1866: If there is only one number it is treated as
1867: a $y$ value, and $x$ values of 1, 2, 3, ...
1868: are supplied automatically.
1869: .PP
1870: \*G
1871: provides arithmetic with the operators
1872: .UL + ,
1873: .UL - ,
1874: .UL * ,
1875: .UL / ,
1876: and
1877: .UL ^ .
1878: Variables may be assigned to;
1879: assignments are expressions.
1880: Built-in functions include
1881: .UL log ,
1882: .UL exp
1883: (both base 10 \(em beware!),
1884: .UL int
1885: (truncates towards zero),
1886: .UL sin ,
1887: .UL cos
1888: (both use radians),
1889: .UL atan2(dy,dx) ,
1890: .UL sqrt ,
1891: .UL min
1892: (two arguments only),
1893: .UL max
1894: (ditto),
1895: and
1896: .UL rand()
1897: (returns a random real number in the interval [0,1)).
1898: .PP
1899: The
1900: .UL for
1901: statement provides a modest looping facility:
1902: .P1
1903: .IT for :
1904: for \f2var\fP from \f2expr\fP to \f2expr (\fPby \f2(op) expr)\fP do { \f2anything\fP }
1905: .P2
1906: The text between the braces may itself contain internally balanced braces.
1907: Alternatively, any other character may appear immediately after the word
1908: .UL do ,
1909: and the string is terminated by the next occurrence of that character.
1910: The text
1911: .IT anything
1912: (which may contain newlines) is repeated as
1913: .IT var
1914: takes on values from
1915: .IT expr1
1916: to
1917: .IT expr2 .
1918: As with tick iterators, the
1919: .UL by
1920: clause is optional, and may proceed arithmetically or multiplicatively.
1921: In a
1922: .UL for
1923: statement,
1924: the
1925: .UL from
1926: may be replaced by
1927: .UL = ''. ``
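.PP
As a minimal sketch of a multiplicative loop (the variable name
.UL i
and the range are illustrative),
the statement below should plot a bullet at each power of two
from 1 through 1024, eleven points in all:
.P1
for i from 1 to 1024 by *2 do {
    bullet at i, log(i)
}
.P2
Because
.UL log
is base 10, the plotted heights run from 0 to about 3.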
1928: .PP
1929: The
1930: .UL if-then-else
1931: statement provides conditional evaluation:
1932: .P1
1933: .IT if :
1934: if \f2expr\fP then { \f2anything\fP } else { \f2anything\fP }
1935: .P2
1936: The
1937: .UL else
1938: clause
1939: is optional.
1940: Relational operators include
1941: .UL == ,
1942: .UL != ,
1943: .UL > ,
1944: .UL >= ,
1945: .UL < ,
1946: .UL <= ,
1947: .UL ! ,
1948: .UL || ,
1949: and
1950: .UL && .
1951: Strings may be compared with the operators
1952: .UL ==
1953: and
1954: .UL != .
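.PP
The following sketch chooses a plotting symbol according to a value.
The variables
.UL rank
and
.UL pop
are hypothetical and are assumed to have been assigned earlier in the program:
.P1
if pop > 10 then {
    "big" at rank, pop
} else {
    bullet at rank, pop
}
.P2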
1955: .PP
1956: It is possible to convert numeric expressions to formatted strings:
1957: .P1
1958: sprintf("\f2format\fP", \f2expr\fP, \f2expr\fP, ...)
1959: .P2
1960: is equivalent to a quoted string in any context.
1961: Variants of
1962: .UL %f
1963: and
1964: .UL %g
1965: are the only sensible format conversions.
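.PP
For example, in this sketch (the variable
.UL v
and the coordinates are illustrative)
the formatted string is used where a quoted string would otherwise appear
in a plot statement:
.P1
v = 15.3
sprintf("%g million", v) at 1, v
.P2
This places the text ``15.3 million'' at the point (1, 15.3).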
1966: .PP
1967: \*G
1968: provides the same macro processor that
1969: .I pic
1970: does:
1971: .P1
1972: define \f2macro-name\fP { \f2anything\fP }
1973: .P2
1974: .EQ
1975: delim %%
1976: .EN
1977: Subsequent occurrences of the macro name will be replaced
1978: by the string, with arguments of the form \f(CW$\fIn\fR
1979: replaced by corresponding actual arguments.
1980: Macro definitions persist across
1981: .UL .G2
1982: boundaries, as do values of variables.
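.PP
As a small sketch, the macro
.UL sq
below (a name invented for this illustration) squares its argument,
and the loop uses it to plot the first ten squares:
.P1
define sq { (($1) * ($1)) }
for i from 1 to 10 do {
    bullet at i, sq(i)
}
.P2
Each use of
.UL sq(i)
is replaced by the text of the macro body with
.UL i
substituted for the first argument.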
1983: .EQ
1984: delim $$
1985: .EN
1986: .PP
1987: The
1988: .UL copy
1989: statement is somewhat overloaded:
1990: .P1
1991: copy "\f2filename\fP"
1992: .P2
1993: includes the contents of the named file at that point;
1994: .P1
1995: copy "\f2filename\fP" thru \f2macro-name\fP
1996: .P2
1997: copies the file through the macro; and
1998: .P1
1999: copy thru \f2macro-name\fP
2000: .P2
2001: copies subsequent lines through the macro;
2002: each number or quoted string is treated as an argument.
2003: In each case, copying continues until end of file or the next
2004: .UL .G2 .
2005: The optional clause
2006: .UL until
2007: .IT str
2008: causes copying to terminate when a line whose
2009: first field is
2010: .IT str
2011: occurs.
2012: In all cases, the macro can be specified inline rather than by name:
2013: .P1
2014: copy thru { \f2macro body\fP }
2015: .P2
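.EQ
delim %%
.EN
.PP
A minimal sketch of the inline form, with made-up data,
plots one bullet for each of the lines that follow it:
.P1
copy thru { bullet at $1, $2 }
1 1
2 4
3 9
.P2
Each number on a data line becomes an argument, so the first line is
processed as though it read
.UL "bullet at 1, 1" .
Since no
.UL until
clause is given, the data runs to the end of the graph.
.EQ
delim $$
.EN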
2016: .PP
2017: The
2018: .UL sh
2019: command passes text through to the UNIX shell.
2020: .P1
2021: .IT sh :
2022: sh { \f2anything\fP }
2023: .P2
2024: The body of the command is scanned for macros.
2025: The built-in macro
2026: .UL pid
2027: is a string consisting of the process identification number;
2028: it can be used to generate unique file names.
2029: .PP
2030: The
2031: .UL pic
2032: command passes text through to
2033: .I pic
2034: with the
2035: .UL pic '' ``
2036: removed; variables and macros are not evaluated.
2037: Lines beginning with a period (that are not numbers)
2038: are passed through literally, under the assumption that they
2039: are
2040: .I troff
2041: commands.
2042: .PP
2043: The
2044: .UL graph
2045: statement
2046: .P1
2047: .IT graph :
2048: graph \f2Picname (pic-text)\fP
2049: .P2
2050: defines a new graph named
2051: .I Picname ,
2052: resetting all coordinate systems.
2053: If any
2054: .UL graph
2055: commands are used in a
2056: \*g
2057: program, then the statement after the
2058: .UL \&.G1
2059: must be a
2060: .UL graph
2061: command.
2062: The
2063: .I pic-text
2064: can be used to position this graph relative
2065: to previous graphs by referring to their
2066: .UL Frame s,
2067: as in
2068: .P1
2069: graph First
2070: ...
2071: graph Second with .Frame.w at First.Frame.e + (0.1,0)
2072: .P2
2073: Macros and expressions in
2074: .I pic-text
2075: are not evaluated.
2076: .I Picname s
2077: must begin with a capital letter to satisfy
2078: .I pic
2079: syntax.
2080: .PP
2081: The
2082: .UL print
2083: statement
2084: .P1
2085: .IT print :
2086: print \f2(expr\fP \(or \f2str)\fP
2087: .P2
2088: writes on
2089: .UL stderr
2090: as
2091: \*g
2092: processes its input; it is sometimes useful for debugging.
2093: .PP
2094: Many reserved words have synonyms, such as
2095: .UL thru
2096: for
2097: .UL through ,
2098: .UL tick
2099: for
2100: .UL ticks ,
2101: and
2102: .UL bot
2103: for
2104: .UL bottom .
2105: .PP
2106: The
2107: .UL #
2108: introduces a comment, which ends at the end of the line.
2109: Statements may be continued over several lines by preceding each
2110: newline with a
2111: backslash character.
2112: Multiple statements may appear on a single line separated
2113: by semicolons.
2114: \*G
2115: ignores any line that is entirely blank, including those
2116: processed by
2117: .UL "copy thru"
2118: commands.
2119: .PP
2120: When
2121: \*g
2122: is first executed it reads standard macro definitions
2123: from the file
2124: .UL /usr/lib/grap.defines .
2125: The definitions include
2126: .UL bullet ,
2127: .UL plus ,
2128: .UL box ,
2129: .UL star ,
2130: .UL dot ,
2131: .UL times ,
2132: .UL htick ,
2133: .UL vtick ,
2134: .UL square ,
2135: and
2136: .UL delta .