.so ../ADM/mac
.XX grap 109 "Grap \(em A Language for Typesetting Graphs"
.EQ
delim $$
.EN
.so macros
.ds g \f2grap\fP
.ds G \f2Grap\fP
.TL
Grap \(em A Language for Typesetting Graphs
.br
Tutorial and User Manual
.AU
Jon L. Bentley
Brian W. Kernighan
.AI
.MH
.AB
\*G
is a language for describing plots of data.
This graph of the 1984
age distribution in the United States
.grap agepop1.g
is produced by the
\*g
commands
.P1
.get agepop1.g
.P2
(Each line in the data file
.UL agepop.d
contains an age and the number of Americans of that
age alive in 1984; the file is sorted by age.)
.PP
The
\*g
preprocessor works with
.I pic |reference(latest pic)
and
.I troff |reference(latest troff reference).
Most of its input is passed
through untouched, but statements between
.UL .G1
and
.UL .G2
are translated into
.I pic
commands that draw graphs.
.AE
.NH
Introduction
.PP
\*G
is a language for describing graphical
displays of data.
It provides such services as automatic scaling and
labeling of axes, and
.UL for
statements,
.UL if
statements, and macros to facilitate user
programmability.
\*G
is intended primarily for including graphs in
documents prepared on the
.UX
operating system, and is only marginally
useful for elementary tasks in data analysis.
.PP
Section 2 of this document is a tutorial introduction to
\*g;
readers who find it slow going may wish to skim ahead.
The examples in Section 3 illustrate
the various kinds of graphs that
\*g
can produce and some common
\*g
idioms.
Mundane matters about using
\*g
are discussed in Section 4,
and Section 5 contains a brief reference manual.
.PP
We have tried to illustrate good principles of
statistics and graphical design in the
graphs we present.
In several places, though, good taste has lost to
the necessity of illustrating
\*g
capabilities.
Readers interested in statistical
integrity and taste should
consult the literature, for example |reference(chambers graphs)
|reference(tufte graphs) |reference(cleveland elements).
.NH
Tutorial
.PP
The following is a simple
\*g
program\(dg
.FS
\(dg Throughout
this document we will show only the first five
lines and the last line of data files;
omitted lines are indicated by ``...''.
.FE
.P1
\&.G1
.d 400mtimes.d
\&.G2
.P2
The single number on each line
is the winning time in seconds for the
men's 400 meter run,
from the first modern Olympic Games (1896)
to the twenty-first (1988).
If the file
.UL olymp.g
contains the text above,
then typing the command
.P1
grap olymp.g | pic | troff > junk
.P2
creates a
.I troff
output file
.UL junk
that contains the
picture
.grap 4001.g
The graph shows the decrease
in winning times from 54.2
seconds to 43.87 seconds.
If the times are
contained in the file
.UL 400mtimes.d ,
we could
produce the same graph with the
shorter program
.P1
.get 4001.g
.P2
Writing
.UL copy
.UL \&"fname"
in a
\*g
program is equivalent to including the
contents of file
.UL fname
at that point in the file.
(In the interests of compatibility with other programs,
.UL include
is a synonym for
.UL copy .)
.PP
Each line in the file
.UL 400mpairs.d
contains two numbers, the
year of the Olympics and the winning time:
.P1
.d 400mpairs.d
.P2
If we plot this data with the program
.P1
.get 4002.g
.P2
the bottom ($x$) axis represents the year of the Olympics.
.grap 4002.g
The ``holes'' in $x$-values reflect the fact
that the 1916, 1940, and 1944 Olympics
were cancelled due to war.
Because the previous data
(in
.UL 400mtimes.d )
had just one number per
line,
\*g
viewed it as a ``time series'' and
supplied $x$-values of $1, ~ 2, ~ 3, ...$
before plotting
the data as $y$-values.
The input to the
second program has two values per line,
so they are interpreted as $( x , y )$ pairs.
.PP
Rather than a scatter plot of points, we might prefer to
see the winning times connected by a solid
line.
The program
.P1
.get 4003.g
.P2
produces the graph
.grap 4003.g
Eric Liddell of Great Britain
won his gold medal
in Paris in 1924 with a time of 47.6 seconds.
(Remember ``Chariots
of Fire''?)
.PP
We can make the graph more attractive
by modifying its frame
and adding labels.
.P1
.get 4004.g
.P2
The
.UL frame
command describes
the graph's bounding box:
the overall frame (which has four sides)
is invisible, it is 2 inches high and 3 inches
wide (which happen to be the
default height and width),
and the left and bottom
sides are solid (they could have been
dashed or dotted instead).
The labels appear on the left and bottom, as requested.
.grap 4004.g
.PP
To set the range of each axis,
\*g
examines the data and pads both
dimensions
by seven percent at each end.
The
.UL coord
(``coordinates'') command
allows you to specify the range of one or both axes explicitly;
it also turns off automatic padding.
.P1
.get 4005.g
.P2
The $y$-axis now ranges from 42 to 56 seconds
(a little more than before),
and the $x$-axis from 1894 to 1990
(a little less).
.grap 4005.g
.PP
The ticks in the preceding graphs were generated
by
\*g
guessing at reasonable values.
If you would rather provide your own,
you may
use the
.UL ticks
command,
which comes in the flavors illustrated below.
.P1
.get 4006.g
.P2
The first
.UL ticks
command deals with the left axis:
it puts the ticks facing out at
the numbers in the list.
\*G
puts labels only at values
with strings,
except that when no labels at all are
given, each number serves as its own label,
as in the second
.UL ticks
command.
That command
is for the bottom axis:
it puts the ticks facing in at steps of 20
from 1900 to 1980.
The command
.UL "ticks off"
turns off all ticks.
\*G
does its best to place labels appropriately, but
it sometimes needs your help:
the
.UL "left .2"
clause moves the left label 0.2 inches further left to
avoid the new ticks.
.grap 4006.g
.PP
The file
.UL 400wpairs.d
contains the times for
the women's 400 meter race, which has been run
only since 1964.
.P1
.d 400wpairs.d
.P2
To add these times to the graph,
we use
.P1
.get 4007.g
.P2
The
.UL new
command tells
\*g
to end
the old curve and to start a new curve
(which in this case will be drawn
with a dotted line).
Text is placed on the graph by
commands of the form
.P1
"string" at xvalue, yvalue
.P2
The
.UL size
clauses following the quoted strings tell
\*g
to shrink the characters by three points (absolute point sizes
may also be specified).
Strings are usually centered at the specified position,
but can be adjusted by clauses to be illustrated shortly.
.grap 4007.g
.PP
The file
.UL phone.d
records the number of telephones in the United States from
1900 to 1970.
.P1
.d phone.d
.P2
Each line gives a year and the number of telephones
present in that year
(in millions, truncated to the nearest hundred thousand).
The simple
\*g
program
.P1
.get phone1.g
.P2
produces the simple graph
.grap phone1.g
.PP
The number of telephones appears to
grow exponentially;
to study that we will plot the data with
a logarithmic $y$-axis by adding
.UL log
.UL y
to the
.UL coord
command.
We will also add cosmetic changes of labels, more ticks,
and a solid line to replace the unconnected dots.
.P1
.get phone2.g
.P2
The third
.UL ticks
command provides a string that is used to print the tick
labels.
.UC C
programmers will recognize it as a
.UL printf
format string; others may view the
.CW %g
as the place to put
the number and anything else (in this case just an apostrophe) as
literal text to appear in the labels.
To suppress
labels, use the empty format string ("").
The program produces
.grap phone2.g
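.PP
The fragment below is a minimal sketch of the kind of
program just described; it is not the text of
.UL phone2.g .
It assumes that the years in
.UL phone.d
are stored as offsets from 1900 (0 through 70)
and that the counts lie between 0.1 and 200 million;
both ranges are guesses made only for illustration.
.P1
\&.G1
# assumed ranges: x is years since 1900, y is millions of telephones
coord x 0,70 y 0.1,200 log y
ticks left out at 0.1, 1, 10, 100
ticks bot out at 0 "1900", 70 "1970"
# %g is replaced by the tick value, so these labels read '10, '20, ...
ticks bot out from 10 to 60 by 10 "'%g"
draw solid
copy "phone.d"
\&.G2
.P2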
The number of telephones grew rapidly
in the first decade of this century,
and then settled down to an exponential growth rate upset only
by a decrease in the Great Depression and a post-war growth
spurt
to return the curve to its pre-Depression line.
.PP
Our presentation so far has been to
start with a simple
\*g
program that illustrates the data, and then refine it.
Later in this document we will ignore the design
phase, and present rather complex graphs in
their final form.
Beware.
.PP
All the examples so far have placed data on the
graph implicitly by
.UL copy ing
a file of numbers
(either a time series with one number per line or
pairs of numbers).
It is also possible to draw points and lines explicitly.
The
\*g
commands to draw on a graph
are illustrated in the following
fragment.
.P1
.get geom.g
.P2
.PP
The
.UL grid
command is similar to the
.UL ticks
command, except that grid lines extend
across the frame.
The next few commands plot text at specified positions.
The plotting characters (such as
.UL bullet )
are implemented as predefined
macros \(em more on that shortly.
Unlike arbitrary characters,
the visual centers of the markers
are near their plotting centers.
The
.UL circle
command draws a circle centered at the specified location.
A radius in inches may be specified;
if no radius is given, then the circle will be the
small circle shown at the center of the graph.
The
.UL line
and
.UL arrow
commands draw the obvious objects shown at the upper left.
.grap geom.g
.PP
This figure also illustrates the combined use of the
.UL draw
and
.UL next
commands.
Saying
.UL draw
.UL A
.UL solid
defines the style
for a connected sequence of line fragments to be called
.UL A .
Subsequent commands of
.UL next
.UL A
.UL at
.I point
add
.I point
to the end of
.UL A .
There are two such sequences active in the above
example
.UL A "" (
and
.UL B );
note that their
.UL next
commands are intermixed.
Because the predefined string
.UL delta
follows the specification of
.UL B ,
that string is plotted at each point in the sequence.
.PP
\*G
has numeric variables (implemented as double-precision
floating point numbers) and
the usual collection of arithmetic operators and
mathematical functions; see the reference section
for details.
.PP
\*G
provides the same rudimentary macro facility that
.I pic
does:
.P1
define \f2name\fP { \f2replacement text\fP }
.P2
defines
.IT name
to be the
.IT "replacement text" .
The replacement may be any text that contains balanced open and closing braces
.UL "{ }" .
(Alternatively, the
.IT "replacement text"
may be quoted by
any single character that does not appear in the replacement;
the string is terminated by the next occurrence of that character.)
Any subsequent occurrence of
.IT name
will be replaced by
.IT "replacement text" .
.EQ
delim %%
.EN
.PP
The replacement text of a macro definition may
contain occurrences of
.UL $1 ,
.UL $2 ,
etc.;
these will be replaced by the corresponding actual
arguments when the macro is invoked.
The invocation for a macro with arguments is
.P1
name(arg1, arg2, ...)
.P2
Non-existent arguments are replaced by null
strings.
.EQ
delim $$
.EN
.PP
The following
\*g
program uses macros and arithmetic to plot
crude approximations to
the square and square root functions.
.P1
.get macarith.g
.P2
The macro
.UL root
uses the
.UL ^
exponentiation operator.
(Because
\*g
has the square root function
.UL sqrt ,
that macro is in fact superfluous.)
The program produces
.grap macarith.g
.PP
The
.UL copy
command has a
.UL thru
parameter that allows each line of a file to
be treated as though it were a macro call, with
the first field serving as
the first argument,
and so on.
This is the typical
\*g
mechanism for plotting files that are not stored as
time series or as $(x,y)$ pairs.
We will illustrate its use on the file
.UL states.d ,
which contains data on the fifty states.
.P1
.d states.d
.P2
The first field is the postal abbreviation of the state's
name (Alaska, Wyoming, Vermont, ...), the second field
is the number of Representatives to Congress from the state
after the 1981 reapportionment, and the third field is
the population of the state as measured in the 1980 Census.
The states appear in increasing order of
population.
.PP
We will first plot this data as
population, representative pairs.
(In the
.UL coord
statement,
.UL "log log"
is a synonym for
.UL "log x log y" .)
.P1
.get states1.g
.P2
Although the population is given in persons,
the
.UL PlotState
macro
plots the population in millions by dividing
the third input field
by one million (written in exponential notation
as
.UL 1e6 ,
for $1 times 10 sup 6$).
.grap states1.g
Using
.UL circle
as a plotting symbol displays
overlapping points that are obscured when
the data is plotted with bullets.
The representation of a state is roughly proportional
to its population, except in the very small states.
.PP
Our next plot will use the state's rank
in population as the $x$-coordinate and two
different $y$-coordinates: population and number of
representatives.
We will use two
.UL coord
commands to define the two coordinate systems
.UL pop
and
.UL rep .
We then explicitly give the coordinate system
whenever we refer to a point,
both in constructing axes and plotting data.
.P1
.get states2.g
.P2
The
.UL copy
statement in the program uses an
.I "immediate macro"
enclosed in curly brackets and thus avoids having to
name a macro for this task.
Because the program assumes that the states are
sorted in increasing order of population, it
generates
.UL thisrank
internally as a
\*g
variable.
The program produces
.grap states2.g
.PP
The plotting symbols were chosen for contrast in
both shape and shading.
This graph also indicates that representation is proportional
to population.
Once we see this graph, though, we should realize that we don't
really need two coordinate systems: we can relate the two by
dividing the population of the U.S. \(em about 226,000,000 \(em by
the number of representatives \(em 435 \(em to see that each
representative should count as 520,000 people.
If the purpose of this graph were to tell a story about
American politics rather than to illustrate
multiple coordinate systems,
it should be redrawn with a single coordinate
system.
.PP
Many graphs plot both observed data and a function
that (theoretically) describes the data.
There are many ways to draw a function
in \*g:
a series of
.UL next
commands is tedious but works, as does writing a
simple program to write a data file that is subsequently
read and plotted by \*g.
The
.UL for
statement often provides a better solution.
This
\*g
program
.P1
.get sin1.g
.P2
produces
.grap sin1.g
.a
The
.UL for
statement uses the same syntax as the
.UL ticks
statement, but the
.UL from
keyword can be replaced by
.UL = '', ``
which will look more familiar to programmers.
It varies the index variable over the specified range
and for each value executes all statements inside the delimiter
characters, which use the same rules as macro
delimiters.
It is, of course, useful for many tasks beyond plotting functions.
.EQ
delim %%
.EN
.PP
The
.UL if
statement provides a simple mechanism for conditional execution.
If a file contains data on both cities and states (and lines
describing states have ``S'' in the first field), it could be plotted
by statements like
.P1
if "$1" == "S" then {
PlotState($2,$3,$4)
} else {
PlotCity($2,$3,$4,$5,$6)
}
.P2
The
.UL else
clause
is optional; delimiters use the same rules as macros and
.UL for
statements.
.EQ
delim $$
.EN
.NH
A Collection of Examples
.PP
The previous section covered the
\*g
commands that are used in common graphs.
In this section we'll spend less time on
language features, and survey a wider variety of
graphs.
These examples are intended more for browsing and
reference than for straight-through reading.
Be prepared to refer to the manual in Section 5 when you stumble over a new
\*g
feature.
.PP
The file
.UL cars.d
contains the mileage (miles per gallon) and the weight
(pounds) for 74 models of automobiles sold in the United States
in the 1979 model year.
.P1
.d cars.d
.P2
The trivial
\*g
program
.P1
.get cars1.g
.P2
produces
.grap cars1.g
This graph shows that weights bottom out somewhat
below 2000
pounds and that heavier cars get worse mileage;
it is hard to say much more about the relationship
between weight and mileage.
.PP
The next graph provides labels, uses circles
to expose data hidden in the clouds of bullets,
and re-expresses the $x$-axis in gallons per mile.
It also changes the point size and vertical spacing
to a size appropriate for camera-ready journal articles
and books; the size changes should be made outside the
\*g
program.
The
.UL \&.ft
command changes to a Helvetica font, which
some people prefer for graphs.
.P1
.get cars2.g
.P2
\*G
supports logarithmic re-expression of data with the
.UL log
clause in the
.UL coord
statement; any other re-expression of data must be done
with
\*g
arithmetic, as above.
.br
.grap cars2.g
This graph shows that
gallons per mile is roughly proportional to weight.
(The two outliers near 4000 pounds are the Cadillac
Seville and the Oldsmobile 98.)
.PP
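Re-expression with
\*g
arithmetic can be sketched as follows.
This fragment is an illustration rather than the text of
.UL cars2.g ;
it assumes that each line of
.UL cars.d
gives the mileage first and the weight second,
and the label strings are our own.
.EQ
delim %%
.EN
.P1
\&.G1
label left "Weight (pounds)"
label bot "Gallons per Mile"
# assumed field order: first field is miles per gallon, second is pounds
copy "cars.d" thru { circle at 1/$1, $2 }
\&.G2
.P2
.EQ
delim $$
.EN
Taking the reciprocal of the first field turns miles per gallon
into gallons per mile before each circle is plotted.
.PP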
In
.I "The Visual Display of Quantitative Information" ,
Tufte proposes the ``dot-dash-plot'' as a means for maximizing
data ink (showing the two-dimensional distribution and
the two one-dimensional marginal distributions) while minimizing
what he calls ``chart junk'' \(em ink wasted on borders
and non-data labels.
His preference is easy to express in \*g:
.P1
.get cars3.g
.P2
Although the resulting graph is visually attractive,
we do not find it as useful for interpreting the data.
.grap cars3.g
Tufte's graph does point out two facts that are
not obvious in the previous graphs:
there is a gap in car weights near 3000 pounds (exhibited
by the hole in the $y$-axis ticks), and the gallons per
mile axis is regularly structured (the ticks
are the reciprocals of an almost dense sequence of integers).
The reader may decide whether those insights are worth
the decrease in clarity.
.PP
Throughout the twentieth century, horses, cars and people
have gotten faster;
let's study those improvements.
For horses, we'll consider the winning times
of the Kentucky Derby from 1909 to 1988, in
the file
.UL speedhorse.d :
.P1
.d speedhorse.d
.P2
The program
.P1
.get speedhorse1.g
.P2
produces the graph
.grap speedhorse1.g
Each race is recorded with a bullet and
record times are marked by horizontal lines.
Secretariat is the only horse to have run the
one-and-a-quarter-mile
race in under two minutes; he won in 1973 in
1:59.4.
.PP
For automobiles we will study the
world land speed record (even though those vehicles
are by now just low-flying airplanes).
The file
.UL speedcar.d
lists years in which speed records were set and the record
set in that year, in miles per hour averaged over a one-mile
course.
.P1
.d speedcar.d
.P2
We will plot the data with the following
\*g
program, which uses nested braces in the
.UL copy
and
.UL if
statements.
.P1
.get speedcar1.g
.P2
.PP
Each record line is drawn after the
.I next
record is read, because
the program must know when the record was broken to draw
its line.
The
.UL if
statement handles the first record, and the extra
.UL line
command extends the last record out to the current date.
.grap speedcar1.g
The horizontal lines reflect the nature of world records: they
last until they are broken.
The records could also have been plotted by a scatterplot
in which each point represents the setting of a record,
but it would be misleading to connect adjacent
points with line segments
(which we inappropriately did in the graphs
of the Olympic 400 meter run).
.PP
The following graph shows the world record times for the
one mile run;
because its
\*g
program is so similar to its automotive counterpart,
we won't show the program or data.
.grap speedman1.g
The three graphs show three different kinds of
changes.
Although horses are getting faster, they appear to
be approaching a barrier near two minutes.
Cars show great jumps as new technologies are introduced
followed by a plateau as limits of the
technology are reached.
Milers have shown a fairly consistent
linear improvement
over this century, but there must be an
asymptote down there somewhere.
.PP
The next file gives the median heights of boys
in the United States aged 2 to 18, together with
the fifth and ninety-fifth percentiles.
.P1
.d boyhts.d
.P2
The heights are given in centimeters (1 foot = 30.48 centimeters).
The trivial program
.P1
.get boyhts1.g
.P2
displays the data as
.grap boyhts1.g
Because there are four numbers on each input line, the first is
taken as an $x$-value and the remaining three are plotted
as $y$-values.
.PP
The three curves appear to be roughly straight
(at least up to age 16),
so it makes sense to fit a line
through them.
We will use the standard least squares regression
in which
.EQ
slope ~=~ {
{n SIGMA x y ~ - ~ SIGMA x SIGMA y }
over
{n SIGMA x sup 2 ~ - ~ ( SIGMA x ) sup 2 }
}
.EN
(where the summations range over all $n$ $x$ and $y$ values
in the data set) and the $y$-intercept is
.EQ
{SIGMA y ~ - ~ slope times SIGMA x} over n
.EN
The following
\*g
program boldly (and rather foolishly) implements that formula.
.P1
.get boyhts3.g
.P2
It plots the fifth and ninety-fifth percentiles as a bar through
the median, which is plotted as a bullet.
All heights are converted to feet before plotting and calculating
the regression line.
.grap boyhts3.g
.PP
\*G
.UL print
statements write on
.UL stderr
as they are processed by \*g;
their single argument can be either an expression or a string.
The
.UL print
statements (which are commented out in
the above
\*g
program) at one time
showed that the regression line is
.EQ
Height ~ in ~ Feet ~ = ~ 2.61 ~ + ~ .19 times Age
.EN
Thus for most American
boys between 3 and 16, you may safely assume
that they started out life at 2 feet 7 inches and grew at the
rate of two and a quarter inches per year.
.PP
This program probably misapplies \*g;
if you really want to perform least squares regressions on
data, you should usually use a simple
.I awk
program like
.P1
.get regress.awk
.P2
(Be warned, though, that this program is not numerically
robust.)
.PP
While we're on the subject of fitting straight lines to data,
we'll redraw three graphs from J. W. Tukey's
.I "Exploratory Data Analysis" .
The file
.UL usapop.d
records the population of the United States
in millions at ten-year intervals.
.P1
.d usapop.d
.P2
Tukey's first two graphs indicate that the later population
growth was linear while the early growth was exponential.
The following
\*g
program plots them as a pair, using
.UL graph
commands to place internally unrelated graphs adjacent to
one another.
.P1
.get usapop1.g
.P2
The statements defining each graph are indented for clarity.
The second graph has the northern point of its frame 0.05
inch below the southern point of the frame of the first graph;
the
.UL with
clause is passed directly through to
.I pic
without being evaluated for macros or expressions.
The names of both graphs begin with capital letters to
conform to
.I pic
syntax for labels.
.grap usapop1.g
.PP
Polynomial functions lie between the linear and exponential
functions; Tukey shows how a seventh-degree polynomial provides
a better (and longer) fit to the early population growth.
.P1
.get usapop2.g
.P2
This program re-expresses the $x$-axis with
\*g
arithmetic and uses an
.UL if
statement to graph only part of the data file.
It produces
.grap usapop2.g
.nr k \n%
The
.I eqn
.UL "space 0"
clause is necessary to keep
.I eqn
from adding extra space that would interfere
with positions computed by \*g;
see Section 4.
.PP
The file
.UL army.d
contains four related time series
describing the United States Army.
.P1
.d army.d
.P2
The first field is the year; the next four fields give
the number of male officers, female officers, enlisted males
and enlisted females, each in thousands.
(Actually, there were no female enlisted personnel in the
Army until 1943; the value 1 in 1940 and 1942 is just
a placeholder, since
\*g
has no mechanism for handling missing data.)
The following
\*g
program draws the four series with four different sets of
.UL draw
and
.UL next
commands.
.P1
.get army1.g
.P2
The program labels the lines by
.UL copy ing
immediate data;
the program is therefore shorter to write and easier to change.
The delimiter string
.UL XXX
in the
.UL until
clause could be deleted in this graph: the
.UL \&.G2
line also denotes the end of data.
Even though that string is enclosed in quotes,
it may not contain spaces.
The $y$-positions of the labels are the
result of several iterations.
.grap army1.g
.PP
This data can tell many stories: the buildup during the
Second World War is obvious, as is the exodus after the
war; increases during Korea and Vietnam are
also apparent.
We will consider a different story: the ratio of
enlisted men to the three other classes of personnel.
There are several ways to plot this data
(the most obvious graph uses three time series showing how
the ratios change over time, and is
left as an exercise for the reader).
.PP
We will instead construct a graph that gives little insight into this
data, but illustrates a general method that is quite useful
in conjunction with \*g.
The graph is a ``scatterplot vector'' that shows how one
variable (the number of enlisted men) varies as a function of
the other three.
Breaking with tradition, we first show the final graphs, all
of which have logarithmic scales.
.grap army2.g
The number of enlisted men is almost linearly
related to the number of male officers, it is somewhat related to the number
of female officers, and it varies widely as a function of the number
of enlisted women.
.PP
Much more interesting than the graph itself is the method we used to
produce it.
We wrote a miniature ``compiler'' that accepts as
its ``source language'' a description of a scatterplot vector and
produces as ``object code'' a
\*g
program to draw the graph.
The source program for the above example is
.P1
.get army2.v
.P2
The program lists several
global attributes of the graph, the
$y$-variable to be plotted, and as many $x$-variables as
are desired; with each variable is its field in the file
and a descriptive string.
The language is ``compiled'' by the following
.I awk
program.
.P1
.get scatvec.awk
.P2
Running this program on the above description produces the following
output, which is typically piped directly to \*g.
.P1
.get army2.g
.P2
The generated program uses the
.I pic
trick of re-using the same name
.UL A ) (
1108: for several objects. ! 1109: .PP ! 1110: Although the program above is merely a toy, ! 1111: ``minicompilers'' can produce useful preprocessors ! 1112: for \*g. ! 1113: The ! 1114: .UL scatmat ! 1115: program, for instance, is a 90-line ! 1116: .I awk ! 1117: program that reads a simple input language and produces as ! 1118: output a ! 1119: \*g ! 1120: program to produce a ``scatterplot matrix'', which ! 1121: is a handy graphical device for spotting pairwise interactions ! 1122: among several variables. ! 1123: If ! 1124: \*g ! 1125: lacks a feature you desire, consider building ! 1126: a simple preprocessor to provide it. ! 1127: An alternative is to define ! 1128: macros for the task; which approach is best depends ! 1129: strongly on the job you wish to accomplish. ! 1130: .PP ! 1131: The next graph uses iterators to make a graph without ! 1132: reading data from a file. ! 1133: Rather, its ``data'' is a ! 1134: function of two variables ! 1135: that describes a ! 1136: derivative field and a function of one variable ! 1137: that describes one solution to the differential ! 1138: equation. ! 1139: .P1 ! 1140: .get ode1.g ! 1141: .P2 ! 1142: The left label uses ! 1143: .I eqn ! 1144: text between the $font CW "$$"$ delimiters. ! 1145: The variable ! 1146: .UL scale ! 1147: ensures that all lines in the direction field are the same ! 1148: length. ! 1149: The ! 1150: .UL in ! 1151: clauses in the ! 1152: .UL ticks ! 1153: statements specify that the ticks go in zero inches ! 1154: to avoid overprinting. ! 1155: The variables ! 1156: .UL tx ! 1157: and ! 1158: .UL ty ! 1159: are so named because ! 1160: .UL x ! 1161: and ! 1162: .UL y ! 1163: are reserved words for the ! 1164: .UL coord ! 1165: statement. ! 1166: .grap ode1.g ! 1167: .PP ! 1168: Programmers familiar with floating point arithmetic may be ! 1169: surprised that the above graph is correct. ! 1170: Because of roundoff error, iteration ! 1171: .UL "from 0 to 1 by .05" '' `` ! 
1172: usually produces the values ! 1173: $0, ~ .05, ~ .10, ~ ..., ~ .95$. ! 1174: \*G ! 1175: uses a ``fuzzy test'' ! 1176: in the ! 1177: .UL for ! 1178: statement to avoid that problem, which may in turn introduce ! 1179: other problems. ! 1180: Such problems may be avoided by iterating over an integer range ! 1181: and incrementing a non-integer value within the loop. ! 1182: .PP ! 1183: Most of the data we have seen so far is inherently ! 1184: two (or more) dimensional. ! 1185: As an example of one-dimensional data, we will return to ! 1186: the populations of the fifty states, which ! 1187: is the third field in the file ! 1188: .UL states.d ! 1189: introduced earlier; ! 1190: the file is sorted in increasing order of population. ! 1191: Our first graph takes the most space, but ! 1192: it also gives the most information. ! 1193: .P1 ! 1194: .get states8.g ! 1195: .P2 ! 1196: The ! 1197: .UL L ! 1198: macro (for Label) ! 1199: with input parameter $X$ evaluates to the number ! 1200: $2 sup X / 1,000,000$ followed by the string "$X$" ! 1201: (the ! 1202: .UL ticks ! 1203: command expects a number followed by a string label). ! 1204: .grap states8.g ! 1205: The dotted line is the least squares regression ! 1206: .EQ ! 1207: log sub 10 ~ Population ~ = ~ 7.214 ~ - ~ .03 times Rank ! 1208: .EN ! 1209: which gives 15.3 million as the population of the ! 1210: largest state and .515 million as the population ! 1211: of the smallest state. ! 1212: It says that ! 1213: population drops by a factor of two every ten states ! 1214: (compare the top and left scales). ! 1215: As sloppy as the exponential fit is, though, it is a much better ! 1216: fit to this data ! 1217: than a Zipf's Law curve is (drawing that curve is left as ! 1218: an exercise for the reader). ! 1219: .PP ! 1220: The next graph is a more standard representation of ! 1221: one-dimensional data. ! 1222: .P1 ! 1223: .get states3.g ! 1224: .P2 ! 1225: The markers were chosen to be ! 1226: .UL vticks ! 
1227: because they denote only an $x$-value. ! 1228: .grap states3.g ! 1229: .PP ! 1230: The next one-dimensional graph uses the state's name as ! 1231: its marker; to reduce overprinting the graph is ``jittered'' ! 1232: by using a random number as a $y$-value. ! 1233: .P1 ! 1234: .get states4.g ! 1235: .P2 ! 1236: The function ! 1237: .UL rand() ! 1238: returns a pseudo-random real number chosen uniformly over the interval [0,1). ! 1239: .grap states4.g ! 1240: This graph is too cluttered; circles would have been ! 1241: a better choice as a plotting symbol (bullets, once again, would ! 1242: hide data). ! 1243: .PP ! 1244: Histograms are a standard way of presenting one-dimensional ! 1245: data in two-dimensional form. ! 1246: Our first step in building a histogram of the population ! 1247: data is the following ! 1248: .I awk ! 1249: program, which counts how many states are in each ``bin'' ! 1250: of a million people. ! 1251: .P1 ! 1252: .get states5.awk ! 1253: .P2 ! 1254: The variable ! 1255: .UL bzs ! 1256: tells where bin zero starts; although it is zero in this ! 1257: graph, it might be 95 in a histogram ! 1258: of human body temperatures in degrees Fahrenheit. ! 1259: The program produces the following output in ! 1260: .UL states2.d : ! 1261: .P1 ! 1262: .d states2.d ! 1263: .P2 ! 1264: There are 12 states with population between 0 and 999,999, ! 1265: 5 states with population between 1,000,000 and 1,999,999, ! 1266: and so on. ! 1267: .PP ! 1268: This ! 1269: \*g ! 1270: program uses three ! 1271: .UL line ! 1272: commands to plot each rectangle in the histogram. ! 1273: .P1 ! 1274: .get states5.g ! 1275: .P2 ! 1276: It produces ! 1277: .grap states5.g ! 1278: .PP ! 1279: The same file can be plotted in a ! 1280: more attractive (and more useful) form by ! 1281: .P1 ! 1282: .get states6.g ! 1283: .P2 ! 1284: which produces ! 1285: one of Bill Cleveland's ``dot charts'' or ``lolliplots'': ! 1286: .grap states6.g ! 1287: (We use ! 1288: .UL \e(bu , ! 
1289: the ! 1290: .I troff ! 1291: character for a bullet, rather than the built-in string to ! 1292: get a larger size.) ! 1293: .PP ! 1294: Other histograms are possible. ! 1295: The following ! 1296: .I awk ! 1297: program ! 1298: .P1 ! 1299: .get states7.awk ! 1300: .P2 ! 1301: produces the file ! 1302: .UL states3.d ! 1303: .P1 ! 1304: .d states3.d ! 1305: .P2 ! 1306: which lists the state's abbreviation, bin number, and ! 1307: height within the bin. ! 1308: The ! 1309: \*g ! 1310: program ! 1311: .P1 ! 1312: .get states7.g ! 1313: .P2 ! 1314: reads that file to make the following histogram, in which ! 1315: the state names are used to display the heights of the bins. ! 1316: In each bin, the states occur in increasing order of ! 1317: population from bottom to top. ! 1318: .grap states7.g ! 1319: .PP ! 1320: The next data set is a run-time profile of an early version of \*g, ! 1321: created by compiling the program with the ! 1322: .UL -p ! 1323: option and running ! 1324: .UL prof ! 1325: after the program executed. ! 1326: .P1 ! 1327: .d prof1.d ! 1328: .P2 ! 1329: Although there were more than fifty procedures in the program, the ! 1330: top four time-hogs accounted for more than half of the run time. ! 1331: This file is difficult for ! 1332: \*g ! 1333: to deal with: ! 1334: even though ! 1335: .UL if ! 1336: statements would allow us to extract lines 2 through 11 ! 1337: of the file, we could not remove the leading ! 1338: .CW _ ! 1339: from a routine name or access the last field in a record. ! 1340: We will therefore process it with ! 1341: the following ! 1342: .I awk ! 1343: program. ! 1344: .P1 ! 1345: .get prof1.awk ! 1346: .P2 ! 1347: The program produces ! 1348: .P1 ! 1349: .d prof2.d ! 1350: .P2 ! 1351: We could even use the ! 1352: .I sh ! 1353: statement to execute the ! 1354: .I awk ! 1355: program from within \*g, which would make the latter entirely ! 1356: self-contained (see the reference manual for details). ! 1357: .PP ! 
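As a hedged sketch of that idea, the fragment below runs the
.I awk
program at the start of a graph.
The file names
.UL prof1.awk ,
.UL prof1.d
and
.UL prof2.d
are assumptions chosen to match the listings above, and the plotting statements are omitted.
Keeping the script in its own file also keeps its
.CW $
field references out of the
.UL sh
body, which \*g scans for macros.
.P1
\&.G1
# assumed file names; see the listings above
sh { awk -f prof1.awk <prof1.d >prof2.d }
# statements that read prof2.d would follow here
\&.G2
.P2
.PP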
1358: We will display the data with this program. ! 1359: .P1 ! 1360: .get prof1.g ! 1361: .P2 ! 1362: Observe that the program knows nothing about the range of the data. ! 1363: It uses default ticks and a ! 1364: .UL frame ! 1365: statement with a computed height to achieve ! 1366: total data independence. ! 1367: .grap prof1.g ! 1368: This bar chart highlights the fact that most of the time spent by ! 1369: \*g ! 1370: is devoted to input and output. ! 1371: .PP ! 1372: J. W. Tukey's box and whisker plots ! 1373: represent the median, quartiles, and extremes of a ! 1374: one-dimensional distribution. ! 1375: The following ! 1376: \*g ! 1377: program defines a macro to draw a box plot, and then ! 1378: uses that shape to compare the distribution of heights of ! 1379: volcanoes with the distribution of heights of States of the Union. ! 1380: .P1 ! 1381: .get box1.g ! 1382: .P2 ! 1383: Boxes are one of many shapes used for the graphical ! 1384: representation of several quantities. ! 1385: If you use such shapes frequently then you should ! 1386: make a library file of their macros to ! 1387: .UL copy ! 1388: into your ! 1389: \*g ! 1390: programs. ! 1391: The above program produces ! 1392: .grap box1.g ! 1393: Even though the extreme heights are the same, state heights ! 1394: have a lower median and a greater spread. ! 1395: .PP ! 1396: Someday you may use ! 1397: \*g ! 1398: to prepare overhead transparencies, only to find that ! 1399: everything comes out too small. ! 1400: The following program illustrates some ways to get larger ! 1401: graphs. ! 1402: .P1 ! 1403: .zzz slide1.g ! 1404: .P2 ! 1405: The ! 1406: .UL ps ! 1407: and ! 1408: .UL vs ! 1409: commands preceding the graph set the text size to 14 points and ! 1410: the vertical spacing to 18 points; the two quantities are ! 1411: reset by the commands following the ! 1412: .UL .G2 . ! 1413: Such size changes should be made outside the ! 1414: \*g ! 1415: program, as mentioned earlier. ! 1416: The ! 
1417: .UL 4 ! 1418: following the ! 1419: .UL .G1 ! 1420: stretches the graph (including ! 1421: \*g's ! 1422: estimate of the accompanying text) to be four inches wide; ! 1423: it is an alternative to altering the ! 1424: .UL frame ! 1425: command. ! 1426: The macro ! 1427: .UL blob ! 1428: is a plotting symbol that is much larger than ! 1429: .UL bullet ; ! 1430: the different name ensures that later references to ! 1431: .UL bullet ! 1432: are unaffected. ! 1433: The ! 1434: .I troff ! 1435: commands within the ! 1436: .UL blob ! 1437: string move the character down one-tenth of an em ! 1438: to center its plotting position (determined experimentally) ! 1439: and then reset the vertical position. ! 1440: The program produces this trivial (but large) graph. ! 1441: .br ! 1442: .grap slide1.g ! 1443: .NH ! 1444: Using Grap ! 1445: .PP ! 1446: Following are a few day-to-day matters about using \*g. ! 1447: .NH 2 ! 1448: Errors ! 1449: .PP ! 1450: \*G ! 1451: attempts to pinpoint input errors; for example, ! 1452: the input ! 1453: .P1 ! 1454: \&.G1 ! 1455: i = i + 1 ! 1456: .P2 ! 1457: results in this message on ! 1458: .UL stderr : ! 1459: .P1 ! 1460: grap: syntax error near line 1, file - ! 1461: context is ! 1462: i = i >>> + <<< 1 ! 1463: .P2 ! 1464: The error was noticed ! 1465: at the ! 1466: .UL + . ! 1467: Unfortunately, pinpointing is not the same as explaining: ! 1468: the real error is that the variable ! 1469: .UL i ! 1470: was not initialized. ! 1471: .PP ! 1472: The ``words'' ! 1473: .UL x ! 1474: and ! 1475: .UL y ! 1476: are reserved (for the ! 1477: .UL coord ! 1478: statement); ! 1479: you will get an equally inexplicable syntax error message if you use them ! 1480: as variable names. ! 1481: (This design is bad, but not nearly so bad as ! 1482: having the ! 1483: .UL log ! 1484: and ! 1485: .UL exp ! 1486: functions use base 10.) ! 1487: .PP ! 1488: \*G ! 1489: tries to load a file of standard macro definitions ! 
1490: .UL /usr/lib/grap.defines ) ( ! 1491: for terms like ! 1492: .UL bullet , ! 1493: .UL plus , ! 1494: etc. ! 1495: It doesn't complain if that file isn't found, ! 1496: but if you later use one of these words, ! 1497: you'll get a syntax error message. ! 1498: .PP ! 1499: Certain constructs suggested by analogy to ! 1500: .I pic ! 1501: do not work. ! 1502: For example, ! 1503: .UL .GS ! 1504: and ! 1505: .UL .GE ! 1506: would have been nicer than ! 1507: .UL .G1 ! 1508: and ! 1509: .UL .G2 , ! 1510: but they were already taken. ! 1511: The ! 1512: .I pic ! 1513: construct ! 1514: .P1 ! 1515: \&.PS <file ! 1516: .P2 ! 1517: has been superseded by ! 1518: \*g's ! 1519: .UL copy ! 1520: command (which in turn has been retrofitted into ! 1521: .I pic ). ! 1522: .NH 2 ! 1523: \fITroff\fP issues ! 1524: .PP ! 1525: You may use ! 1526: .I troff ! 1527: commands like ! 1528: .UL .ps ! 1529: or ! 1530: .UL .ft ! 1531: to change text sizes and fonts within a graph, ! 1532: or use balanced ! 1533: .UL \es ! 1534: and ! 1535: .UL \ef ! 1536: commands within a string. ! 1537: Do not, however, ! 1538: add space ! 1539: .UL .sp ) ( ! 1540: or change the line spacing ! 1541: .UL .vs , ( ! 1542: .UL .ls ) ! 1543: within a graph. ! 1544: Some defined terms like ! 1545: .UL bullet ! 1546: contain embedded size changes; ! 1547: further qualifying them with ! 1548: \*g ! 1549: .UL size ! 1550: commands may not always work. ! 1551: .PP ! 1552: Because ! 1553: \*g ! 1554: is built on top of ! 1555: .I pic , ! 1556: the following quote from the ! 1557: .I pic ! 1558: manual is relevant: ! 1559: ``There is a subtle problem with complicated equations inside ! 1560: .I pic ! 1561: pictures \(em they come out wrong if ! 1562: .I eqn ! 1563: has to leave extra vertical space for the equation. ! 1564: If your equation involves more than subscripts and superscripts, ! 1565: you must add to the beginning of each such equation the extra information ! 1566: .UL "space 0" ''. ! 
1567: This feature was illustrated in the graph of the ! 1568: United States population in Section 3. ! 1569: .NH 2 ! 1570: Alternatives ! 1571: .PP ! 1572: Besides ! 1573: \*g ! 1574: and your local draftsperson, what other choices are there? ! 1575: .PP ! 1576: The S system |reference(slanguage chambers) provides ! 1577: a host of tools for statistical analysis, ! 1578: but somewhat fewer tools than ! 1579: \*g ! 1580: for producing document-quality graphs. ! 1581: S produces graphs on the screen of a DMD 5620 terminal much more quickly than ! 1582: \*g ! 1583: (often in seconds rather than minutes), but it ! 1584: takes somewhat longer to learn (at least for us). ! 1585: If you expect to do a lot of interactive data analysis, then ! 1586: S is probably the right tool for you. ! 1587: S may be used to generate ! 1588: .I pic ! 1589: commands. ! 1590: .PP ! 1591: The standard UNIX program ! 1592: .I graph ! 1593: provides many of the basic features of ! 1594: \*g, ! 1595: though with quite a bit less control over details, particularly ! 1596: text. ! 1597: It produces output only in the ! 1598: .UX ! 1599: .I plot (5) ! 1600: language, ! 1601: which may be processed by a variety of filters ! 1602: for a variety of output devices. ! 1603: .PP ! 1604: The original ! 1605: .UX ! 1606: typesetter graphics programs are ! 1607: .I pic ! 1608: and ! 1609: .I ideal ; ! 1610: you may be able to do as well without using ! 1611: \*g ! 1612: as an intermediary. ! 1613: In particular, ! 1614: .I ideal ! 1615: provides shading and clipping, ! 1616: which are useful ! 1617: in presentation-quality bar charts and the like, but are ! 1618: well beyond the capabilities of ! 1619: .I pic . ! 1620: .EQ ! 1621: delim $$ ! 1622: .EN ! 1623: .NH ! 1624: References ! 1625: .LP ! 1626: |reference_placement ! 1627: .NH ! 1628: Reference Manual ! 1629: .PP ! 1630: In the following, ! 1631: .I italic ! 1632: terms are syntactic categories, ! 1633: .UL typewriter ! 1634: terms are literals, ! 
1635: parenthesized constructs are optional, and ... indicates repetition. ! 1636: In most cases, the order of statements, ! 1637: constructs and attributes is immaterial. ! 1638: .P1 ! 1639: .IT "grap program" : ! 1640: .G1 \f2(width in inches)\fP ! 1641: \f2grap statement\fP ! 1642: ... ! 1643: .G2 ! 1644: .P2 ! 1645: A width on the ! 1646: .UL .G1 ! 1647: line overrides the computed width, as in ! 1648: .I pic . ! 1649: .P1 ! 1650: .IT "grap statement" : ! 1651: .I ! 1652: frame \(or label \(or coord \(or ticks \(or grid \(or plot \(or line \(or circle \(or draw \(or new \(or next ! 1653: \(or graph \(or numberlist \(or copy \(or for \(or if \(or sh \(or pic \(or assignment \(or print ! 1654: .ft ! 1655: .P2 ! 1656: .PP ! 1657: The ! 1658: .UL frame ! 1659: statement defines the frame that surrounds the graph: ! 1660: .P1 ! 1661: .IT frame : ! 1662: frame \f2(\fPht \f2expr)\fP \f2(\fPwid \f2expr)\fP \f2((side) linedesc)\fP \f2...\fP ! 1663: .IT side : ! 1664: top \(or bot \(or left \(or right ! 1665: .IT linedesc : ! 1666: solid \(or invis \(or dotted \f2(expr)\fP \(or dashed \f2(expr)\fP ! 1667: .P2 ! 1668: Height and width default to 2 and 3 inches; ! 1669: sides default to solid. ! 1670: If ! 1671: .I side ! 1672: is omitted, the ! 1673: .I linedesc ! 1674: applies to the entire frame. ! 1675: The optional expressions after ! 1676: .UL dotted ! 1677: and ! 1678: .UL dashed ! 1679: change the spacing exactly as in ! 1680: .I pic . ! 1681: .PP ! 1682: The ! 1683: .UL label ! 1684: statement places a label on a specified side: ! 1685: .P1 ! 1686: .IT label : ! 1687: label \f2side\fP \f2strlist\fP \f2...\fP \f2shift\fP ! 1688: .IT shift: ! 1689: left\f2 \(or \fPright\f2 \(or \fPup\f2 \(or \fPdown \f2expr ...\fP ! 1690: .IT strlist : ! 1691: \f2str ... (\fPrjust\f2 \(or \fPljust\f2 \(or \fPabove\f2 \(or \fPbelow\f2) ... (\fPsize \f2(\fP\(+-\f2) expr) ...\fP ! 1692: .IT str : ! 1693: "\f2...\fP" ! 1694: .P2 ! 1695: Lists of text strings are stacked vertically. ! 
1696: In any context, string lists may contain clauses ! 1697: to adjust the position or change the point size. ! 1698: Each clause applies to the string preceding it ! 1699: and all following strings. ! 1700: Labels may also have a ! 1701: .UL width ! 1702: attribute, to override ! 1703: \*g's ! 1704: default computation. ! 1705: .PP ! 1706: Normally the coordinate system is defined by the data, ! 1707: with 7 percent extra on each side. ! 1708: (To change that to 5 percent, assign 0.05 to the ! 1709: \*g ! 1710: variable ! 1711: .UL margin , ! 1712: which is reset to 0.07 at each ! 1713: .UL .G1 ! 1714: statement.) ! 1715: The ! 1716: .UL coord ! 1717: statement defines an overriding system: ! 1718: .P1 ! 1719: .IT coord : ! 1720: coord \f2(name)\fP \f2(\fPx \f2expr,expr)\fP \f2(\fPy \f2expr,expr)\fP \f2(\fPlog x \(or log y \(or log log\f2) \fP ! 1721: .P2 ! 1722: Coordinate systems can be named; ! 1723: ranges, logarithmic scaling, etc., are done separately for each. ! 1724: .PP ! 1725: The ! 1726: .UL ticks ! 1727: statement places tick marks on one side of the frame: ! 1728: .P1 ! 1729: .IT ticks : ! 1730: ticks \f2side\fP \f2(\fPin \(or out \f2(expr))\fP \f2(shift) (tick-locations)\fP ! 1731: .IT tick-locations : ! 1732: at \f2(name) expr (str)\fP, \f2expr (str)\fP, \f2...\fP ! 1733: \(or from \f2(name) expr\fP to \f2expr\fP \f2(\fPby \f2(op) expr)\fP \f2str\fP ! 1734: .P2 ! 1735: If no ticks are specified, they will be provided automatically; ! 1736: .UL ticks ! 1737: .UL off ! 1738: suppresses automatic ticks. ! 1739: The optional expression after ! 1740: .UL in ! 1741: or ! 1742: .UL out ! 1743: specifies the length of the ticks in inches. ! 1744: The optional name refers to a coordinate system. ! 1745: If ! 1746: .IT str ! 1747: contains ! 1748: format specifiers like ! 1749: .UL %f ! 1750: or ! 1751: .UL %g , ! 1752: they are interpreted as by ! 1753: .UL printf . ! 1754: If no ! 1755: .IT str ! 
1756: is supplied, the tick labels will be the values of the ! 1757: expressions. ! 1758: .PP ! 1759: If the ! 1760: .UL by ! 1761: clause is omitted, steps are of size 1. ! 1762: If the ! 1763: .UL by ! 1764: expression is preceded by one of ! 1765: .UL + , ! 1766: .UL - , ! 1767: .UL * ! 1768: or ! 1769: .UL / , ! 1770: the step is scaled by that operator, ! 1771: e.g., ! 1772: .UL *10 ! 1773: means that each step is 10 times the previous one. ! 1774: .PP ! 1775: The ! 1776: .UL grid ! 1777: statement produces grid lines along (i.e., perpendicular to) ! 1778: the named side. ! 1779: .P1 ! 1780: .IT grid : ! 1781: grid \f2side (linedesc) (shift) (tick-locations)\fP ! 1782: .P2 ! 1783: Grids are labeled by the same mechanism as ! 1784: .UL ticks . ! 1785: It is possible to draw grids without ticks by placing the phrase ! 1786: .UL ticks ! 1787: .UL off ! 1788: after the side name and before the iterator. ! 1789: .PP ! 1790: Plot ! 1791: statements place text at a point: ! 1792: .P1 ! 1793: .IT plot : ! 1794: \f2strlist\fP at \f2point\fP ! 1795: plot \f2expr (str)\fP at \f2point\fP ! 1796: .IT point : ! 1797: \f2(name) expr,expr\fP ! 1798: .P2 ! 1799: As in the ! 1800: .UL label ! 1801: statement, the string list may contain ! 1802: position and size modifiers. ! 1803: The ! 1804: .UL plot ! 1805: statement uses the optional format string as in C's ! 1806: .UL printf ! 1807: statement \(em it may contain a ! 1808: .UL %f ! 1809: or ! 1810: .UL %g . ! 1811: The optional name refers to a coordinate system. ! 1812: .PP ! 1813: The ! 1814: .UL line ! 1815: statement draws a line or arrow from here to there: ! 1816: .P1 ! 1817: .IT line : ! 1818: \f2(\fPline \(or arrow\f2)\fP from \f2point\fP to \f2point (linedesc)\fP ! 1819: .P2 ! 1820: The ! 1821: .UL circle ! 1822: statement draws a circle: ! 1823: .P1 ! 1824: .IT circle : ! 1825: circle at \f2point (\fPradius \f2expr)\fP ! 1826: .P2 ! 1827: The radius is in inches; the default size is small. ! 1828: .PP ! 1829: The ! 
1830: .UL draw ! 1831: statement defines a sequence of lines: ! 1832: .P1 ! 1833: .IT draw : ! 1834: draw \f2(name) linedesc (str)\fP ! 1835: .P2 ! 1836: Subsequent data for the named sequence ! 1837: will be plotted as a line of the specified style, ! 1838: with the optional ! 1839: .IT str ! 1840: plotted at each point. ! 1841: The ! 1842: .UL next ! 1843: statement continues a sequence: ! 1844: .P1 ! 1845: .IT next : ! 1846: next \f2(name)\fP at \f2point (linedesc)\fP ! 1847: .P2 ! 1848: If a line description is specified, it overrides the default ! 1849: display mode for the line segment ending at ! 1850: .I point . ! 1851: The ! 1852: .UL new ! 1853: statement starts a new sequence; it has the same format as the ! 1854: .UL draw ! 1855: statement. ! 1856: .PP ! 1857: A line consisting of a set of numbers ! 1858: is treated as a family of points ! 1859: $x$, $y sub 1$, $y sub 2$, etc., ! 1860: to be plotted at the single ! 1861: $x$ value. ! 1862: .P1 ! 1863: .IT numberlist : ! 1864: \f2number\fP ... ! 1865: .P2 ! 1866: If there is only one number it is treated as ! 1867: a $y$ value, and $x$ values of 1, 2, 3, ... ! 1868: are supplied automatically. ! 1869: .PP ! 1870: \*G ! 1871: provides arithmetic with the operators ! 1872: .UL + , ! 1873: .UL - , ! 1874: .UL * , ! 1875: .UL / , ! 1876: and ! 1877: .UL ^ . ! 1878: Variables may be assigned to; ! 1879: assignments are expressions. ! 1880: Built-in functions include ! 1881: .UL log , ! 1882: .UL exp ! 1883: (both base 10 \(em beware!), ! 1884: .UL int ! 1885: (truncates towards zero), ! 1886: .UL sin , ! 1887: .UL cos ! 1888: (both use radians), ! 1889: .UL atan2(dy,dx) , ! 1890: .UL sqrt , ! 1891: .UL min ! 1892: (two arguments only), ! 1893: .UL max ! 1894: (ditto), ! 1895: and ! 1896: .UL rand() ! 1897: (returns a real number random on [0,1)). ! 1898: .PP ! 1899: The ! 1900: .UL for ! 1901: statement provides a modest looping facility: ! 1902: .P1 ! 1903: .IT for : ! 
1904: for \f2var\fP from \f2expr\fP to \f2expr (\fPby \f2(op) expr)\fP do { \f2anything\fP } ! 1905: .P2 ! 1906: The string may contain internally balanced braces. ! 1907: Alternatively, any other character may appear immediately after the word ! 1908: .UL do , ! 1909: and the string is terminated by the next occurrence of that character. ! 1910: The text ! 1911: .IT anything ! 1912: (which may contain newlines) is repeated as ! 1913: .IT var ! 1914: takes on values from ! 1915: .IT expr1 ! 1916: to ! 1917: .IT expr2 . ! 1918: As with tick iterators, the ! 1919: .UL by ! 1920: clause is optional, and may proceed arithmetically or multiplicatively. ! 1921: In a ! 1922: .UL for ! 1923: statement, ! 1924: the ! 1925: .UL from ! 1926: may be replaced by ! 1927: .UL = ''. `` ! 1928: .PP ! 1929: The ! 1930: .UL if-then-else ! 1931: statement provides conditional evaluation: ! 1932: .P1 ! 1933: .IT if : ! 1934: if \f2expr\fP then { \f2anything\fP } else { \f2anything\fP } ! 1935: .P2 ! 1936: The ! 1937: .UL else ! 1938: clause ! 1939: is optional. ! 1940: Relational operators include ! 1941: .UL == , ! 1942: .UL != , ! 1943: .UL > , ! 1944: .UL >= , ! 1945: .UL < , ! 1946: .UL <= , ! 1947: .UL ! , ! 1948: .UL || , ! 1949: and ! 1950: .UL && . ! 1951: Strings may be compared with the operators ! 1952: .UL == ! 1953: and ! 1954: .UL != . ! 1955: .PP ! 1956: It is possible to convert numeric expressions to formatted strings: ! 1957: .P1 ! 1958: sprintf("\f2format\fP", \f2expr\fP, \f2expr\fP, ...) ! 1959: .P2 ! 1960: is equivalent to a quoted string in any context. ! 1961: Variants of ! 1962: .UL %f ! 1963: and ! 1964: .UL %g ! 1965: are the only sensible format conversions. ! 1966: .PP ! 1967: \*G ! 1968: provides the same macro processor that ! 1969: .I pic ! 1970: does: ! 1971: .P1 ! 1972: define \f2macro-name\fP { \f2anything\fP } ! 1973: .P2 ! 1974: .EQ ! 1975: delim %% ! 1976: .EN ! 1977: Subsequent occurrences of the macro name will be replaced ! 
1978: by the string, with arguments of the form \f(CW$\fIn\fR ! 1979: replaced by corresponding actual arguments. ! 1980: Macro definitions persist across ! 1981: .UL .G2 ! 1982: boundaries, as do values of variables. ! 1983: .EQ ! 1984: delim $$ ! 1985: .EN ! 1986: .PP ! 1987: The ! 1988: .UL copy ! 1989: statement is somewhat overloaded: ! 1990: .P1 ! 1991: copy "\f2filename\fP" ! 1992: .P2 ! 1993: includes the contents of the named file at that point; ! 1994: .P1 ! 1995: copy "\f2filename\fP" thru \f2macro-name\fP ! 1996: .P2 ! 1997: copies the file through the macro; and ! 1998: .P1 ! 1999: copy thru \f2macro-name\fP ! 2000: .P2 ! 2001: copies subsequent lines through the macro; ! 2002: each number or quoted string is treated as an argument. ! 2003: In each case, copying continues until end of file or the next ! 2004: .UL .G2 . ! 2005: The optional clause ! 2006: .UL until ! 2007: .IT str ! 2008: causes copying to terminate when a line whose ! 2009: first field is ! 2010: .IT str ! 2011: occurs. ! 2012: In all cases, the macro can be specified inline rather than by name: ! 2013: .P1 ! 2014: copy thru { \f2macro body\fP } ! 2015: .P2 ! 2016: .PP ! 2017: The ! 2018: .UL sh ! 2019: command passes text through to the UNIX shell. ! 2020: .P1 ! 2021: .IT sh : ! 2022: sh { \f2anything\fP } ! 2023: .P2 ! 2024: The body of the command is scanned for macros. ! 2025: The built-in macro ! 2026: .UL pid ! 2027: is a string consisting of the process identification number; ! 2028: it can be used to generate unique file names. ! 2029: .PP ! 2030: The ! 2031: .UL pic ! 2032: command passes text through to ! 2033: .I pic ! 2034: with the ! 2035: .UL pic '' `` ! 2036: removed; variables and macros are not evaluated. ! 2037: Lines beginning with a period (that are not numbers) ! 2038: are passed through literally, under the assumption that they ! 2039: are ! 2040: .I troff ! 2041: commands. ! 2042: .PP ! 2043: The ! 2044: .UL graph ! 2045: statement ! 2046: .P1 ! 
2047: .IT graph : ! 2048: graph \f2Picname (pic-text)\fP ! 2049: .P2 ! 2050: defines a new graph named ! 2051: .I Picname , ! 2052: resetting all coordinate systems. ! 2053: If any ! 2054: .UL graph ! 2055: commands are used in a ! 2056: \*g ! 2057: program, then the statement after the ! 2058: .UL \&.G1 ! 2059: must be a ! 2060: .UL graph ! 2061: command. ! 2062: The ! 2063: .I pic-text ! 2064: can be used to position this graph relative ! 2065: to previous graphs by referring to their ! 2066: .UL Frame s, ! 2067: as in ! 2068: .P1 ! 2069: graph First ! 2070: ... ! 2071: graph Second with .Frame.w at First.Frame.e + (0.1,0) ! 2072: .P2 ! 2073: Macros and expressions in ! 2074: .I pic-text ! 2075: are not evaluated. ! 2076: .I Picname s ! 2077: must begin with a capital letter to satisfy ! 2078: .I pic ! 2079: syntax. ! 2080: .PP ! 2081: The ! 2082: .UL print ! 2083: statement ! 2084: .P1 ! 2085: .IT print : ! 2086: print \f2(expr\fP \(or \f2str)\fP ! 2087: .P2 ! 2088: writes on ! 2089: .UL stderr ! 2090: as ! 2091: \*g ! 2092: processes its input; it is sometimes useful for debugging. ! 2093: .PP ! 2094: Many reserved words have synonyms, such as ! 2095: .UL thru ! 2096: for ! 2097: .UL through , ! 2098: .UL tick ! 2099: for ! 2100: .UL ticks , ! 2101: and ! 2102: .UL bot ! 2103: for ! 2104: .UL bottom . ! 2105: .PP ! 2106: The ! 2107: .UL # ! 2108: introduces a comment, which ends at the end of the line. ! 2109: Statements may be continued over several lines by preceding each ! 2110: newline with a ! 2111: backslash character. ! 2112: Multiple statements may appear on a single line separated ! 2113: by semicolons. ! 2114: \*G ! 2115: ignores any line that is entirely blank, including those ! 2116: processed by ! 2117: .UL "copy thru" ! 2118: commands. ! 2119: .PP ! 2120: When ! 2121: \*g ! 2122: is first executed it reads standard macro definitions ! 2123: from the file ! 2124: .UL /usr/lib/grap.defines . ! 2125: The definitions include ! 2126: .UL bullet , !
2127: .UL plus , ! 2128: .UL box , ! 2129: .UL star , ! 2130: .UL dot , ! 2131: .UL times , ! 2132: .UL htick , ! 2133: .UL vtick , ! 2134: .UL square , ! 2135: and ! 2136: .UL delta .
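.PP
As a closing illustration, the following sketch (not one of the example
files used in this document) combines several of the statements described above.
It iterates over an integer range and derives the non-integer value
.UL tx
inside the loop, the approach suggested in Section 3 for avoiding roundoff
trouble in
.UL for
statements.
The frame size, labels and coordinate ranges are arbitrary choices made
for the illustration.
.P1
\&.G1
frame ht 2 wid 3
label bot "angle in radians"
label left "sine"
coord x 0,6.4 y -1.1,1.1
draw solid
# i is an integer counter; tx = i/10 is computed afresh on each
# pass, so no roundoff error accumulates in the loop variable.
# (x and y are reserved words, hence the name tx.)
for i from 0 to 64 do {
	tx = i / 10
	next at tx, sin(tx)
}
\&.G2
.P2
Because no
.UL ticks
statement is given, tick marks are supplied automatically.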