.so ../ADM/mac
.XX grap 109 "Grap \(em A Language for Typesetting Graphs"
.TR 114
.ND "Revised, May 1991
.EQ
delim $$
.EN
.so macros
.ds g \f2grap\fP
.ds G \f2Grap\fP
.TL
Grap \(em A Language for Typesetting Graphs
.br
Tutorial and User Manual
.AU
Jon L. Bentley
Brian W. Kernighan
.AI
.MH
.AB
\*G
is a language for describing plots of data.
This graph of the 1984
age distribution in the United States
.grap agepop1.g
is produced by the
\*g
commands
.P1
.get agepop1.g
.P2
(Each line in the data file
.CW agepop.d
contains an age and the number of Americans of that
age alive in 1984; the file is sorted by age.)
.PP
The
\*g
preprocessor works with
.I pic |reference(latest pic)
and
.I troff |reference(latest troff reference).
Most of its input is passed
through untouched, but statements between
.CW .G1
and
.CW .G2
are translated into
.I pic
commands that draw graphs.
.AE
.NH
Introduction
.PP
\*G
is a language for describing graphical
displays of data.
It provides such services as automatic scaling and
labeling of axes, and
.CW for
statements,
.CW if
statements, and macros to facilitate user
programmability.
\*G
is intended primarily for including graphs in
documents prepared on the
.UX
operating system, and is only marginally
useful for elementary tasks in data analysis.
.PP
Section 2 of this document is a tutorial introduction to
\*g;
readers who find it slow going may wish to skim ahead.
The examples in Section 3 illustrate
the various kinds of graphs that
\*g
can produce and some common
\*g
idioms.
Mundane matters about using
\*g
are discussed in Section 4,
and Section 5 contains a brief reference manual.
.PP
We have tried to illustrate good principles of
statistics and graphical design in the
graphs we present.
In several places, though, good taste has lost to
the necessity of illustrating
\*g
capabilities.
Readers interested in statistical
integrity and taste should
consult the literature, for example |reference(chambers graphs)
|reference(tufte graphs) |reference(cleveland elements).
.NH
Tutorial
.PP
The following is a simple
\*g
program\(dg
.FS
\(dg Throughout
this document we will show only the first five
lines and the last line of data files;
omitted lines are indicated by ``...''.
.FE
.P1
\&.G1
.d 400mtimes.d
\&.G2
.P2
The single number on each line
is the winning time in seconds for the
men's 400 meter run,
from the first modern Olympic Games (1896)
to the twenty-first (1988).
If the file
.CW olymp.g
contains the text above,
then typing the command
.P1
grap olymp.g | pic | troff > junk
.P2
creates a
.I troff
output file
.CW junk
that contains the
picture
.grap 4001.g
The graph shows the decrease
in winning times from 54.2
seconds to 43.87 seconds.
If the times are
contained in the file
.CW 400mtimes.d ,
we could
produce the same graph with the
shorter program
.P1
.get 4001.g
.P2
Writing
.CW copy
.CW \&"fname"
in a
\*g
program is equivalent to including the
contents of file
.CW fname
at that point in the file.
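.PP
As a rough sketch of that equivalence, suppose a hypothetical file
.CW sample.d
(a name invented for this illustration, holding made-up values)
contains the three lines 10, 12 and 11.
Under that assumption, the program
.P1
\&.G1
copy "sample.d"
\&.G2
.P2
should draw the same graph as the program with the data written in place:
.P1
\&.G1
10
12
11
\&.G2
.P2
Keeping the numbers in their own file lets several programs share one copy of the data.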
(In the interests of compatibility with other programs,
.CW include
is a synonym for
.CW copy .)
.PP
Each line in the file
.CW 400mpairs.d
contains two numbers, the
year of the Olympics and the winning time:
.P1
.d 400mpairs.d
.P2
If we plot this data with the program
.P1
.get 4002.g
.P2
the bottom ($x$) axis represents the year of the Olympics.
.grap 4002.g
The ``holes'' in $x$-values reflect the fact
that the 1916, 1940, and 1944 Olympics
were cancelled due to war.
Because the previous data
(in
.CW 400mtimes.d )
had just one number per
line,
\*g
viewed it as a ``time series'' and
supplied $x$-values of $1, ~ 2, ~ 3, ...$
before plotting
the data as $y$-values.
The input to the
second program has two values per line,
so they are interpreted as $( x , y )$ pairs.
.PP
Rather than a scatter plot of points, we might prefer to
see the winning times connected by a solid
line.
The program
.P1
.get 4003.g
.P2
produces the graph
.grap 4003.g
Eric Liddell of Great Britain
won his gold medal
in Paris in 1924 with a time of 47.6 seconds.
(Remember ``Chariots
of Fire''?)
.PP
We can make the graph more attractive
by modifying its frame
and adding labels.
.P1
.get 4004.g
.P2
The
.CW frame
command describes
the graph's bounding box:
the overall frame (which has four sides)
is invisible, it is 2 inches high and 3 inches
wide (which happen to be the
default height and width),
and the left and bottom
sides are solid (they could have been
dashed or dotted instead).
The labels appear on the left and bottom, as requested.
.grap 4004.g
.PP
To set the range of each axis,
\*g
examines the data and pads both
dimensions
by seven percent at each end.
The
.CW coord
(``coordinates'') command
allows you to specify the range of one or both axes explicitly;
it also turns off automatic padding.
.P1
.get 4005.g
.P2
The $y$-axis now ranges from 42 to 56 seconds
(a little more than before),
and the $x$-axis from 1894 to 1990
(a little less).
.grap 4005.g
.PP
The ticks in the preceding graphs were generated
by
\*g
guessing at reasonable values.
If you would rather provide your own,
you may
use the
.CW ticks
command,
which comes in the flavors illustrated below.
.P1
.get 4006.g
.P2
The first
.CW ticks
command deals with the left axis:
it puts the ticks facing out at
the numbers in the list.
\*G
puts labels only at values
with strings,
except that when no labels at all are
given, each number serves as its own label,
as in the second
.CW ticks
command.
That command
is for the bottom axis:
it puts the ticks facing in at steps of 20
from 1900 to 1980.
The command
.CW "ticks off"
turns off all ticks.
\*G
does its best to place labels appropriately, but
it sometimes needs your help:
the
.CW "left .2"
clause moves the left label 0.2 inches further left to
avoid the new ticks.
.grap 4006.g
.PP
The file
.CW 400wpairs.d
contains the times for
the women's 400 meter race, which has been run
only since 1964.
.P1
.d 400wpairs.d
.P2
To add these times to the graph,
we use
.P1
.get 4007.g
.P2
The
.CW new
command tells
\*g
to end
the old curve and to start a new curve
(which in this case will be drawn
with a dotted line).
Text is placed on the graph by
commands of the form
.P1
"string" at xvalue, yvalue
.P2
The
.CW size
clauses following the quoted strings tell
\*g
to shrink the characters by three points (absolute point sizes
may also be specified).
Strings are usually centered at the specified position,
but can be adjusted by clauses to be illustrated shortly.
.grap 4007.g
.PP
The file
.CW phone.d
records the number of telephones in the United States from
1900 to 1970.
.P1
.d phone.d
.P2
Each line gives a year and the number of telephones
present in that year
(in millions, truncated to the nearest hundred thousand).
The simple
\*g
program
.P1
.get phone1.g
.P2
produces the simple graph
.grap phone1.g
.PP
The number of telephones appears to
grow exponentially;
to study that we will plot the data with
a logarithmic $y$-axis by adding
.CW log
.CW y
to the
.CW coord
command.
We will also add cosmetic changes of labels, more ticks,
and a solid line to replace the unconnected dots.
.P1
.get phone2.g
.P2
The third
.CW ticks
command provides a string that is used to print the tick
labels.
.UC C
programmers will recognize it as a
.CW printf
format string; others may view the
.CW %g
as the place to put
the number and anything else (in this case just an apostrophe) as
literal text to appear in the labels.
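.PP
As a small illustration of format strings, the fragment below is a sketch
rather than part of any program shown here; it assumes a bottom axis whose
values run from 0 to 70 (years since 1900, a hypothetical encoding of the data).
.P1
ticks bot out at 0 "1900", 70 "1970"
ticks bot out from 10 to 60 by 10 "'%g"
.P2
Under that assumption the first command labels the two end ticks with explicit strings,
and the second should label the ticks between them with an apostrophe followed by
the tick value, giving labels such as '10 and '20.
These two lines are only a fragment; the complete program shown above also sets
the coordinates, labels, and line style.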
To suppress
labels, use the empty format string ("").
The program produces
.grap phone2.g
The number of telephones grew rapidly
in the first decade of this century,
and then settled down to an exponential growth rate upset only
by a decrease in the Great Depression and a post-war growth
spurt
to return the curve to its pre-Depression line.
.PP
Our presentation so far has been to
start with a simple
\*g
program that illustrates the data, and then refine it.
Later in this document we will ignore the design
phase, and present rather complex graphs in
their final form.
Beware.
.PP
All the examples so far have placed data on the
graph implicitly by
.CW copy ing
a file of numbers
(either a time series with one number per line or
pairs of numbers).
It is also possible to draw points and lines explicitly.
The
\*g
commands to draw on a graph
are illustrated in the following
fragment.
.P1
.get geom.g
.P2
.PP
The
.CW grid
command is similar to the
.CW ticks
command, except that grid lines extend
across the frame.
The next few commands plot text at specified positions.
The plotting characters (such as
.CW bullet )
are implemented as predefined
macros \(em more on that shortly.
Unlike arbitrary characters,
the visual centers of the markers
are near their plotting centers.
The
.CW circle
command draws a circle centered at the specified location.
A radius in inches may be specified;
if no radius is given, then the circle will be the
small circle shown at the center of the graph.
The
.CW line
and
.CW arrow
commands draw the obvious objects shown at the upper left.
.grap geom.g
.PP
This figure also illustrates the combined use of the
.CW draw
and
.CW next
commands.
Saying
.CW draw
.CW A
.CW solid
defines the style
for a connected sequence of line fragments to be called
.CW A .
Subsequent commands of
.CW next
.CW A
.CW at
.I point
add
.I point
to the end of
.CW A .
There are two such sequences active in the above
example
.CW A "" (
and
.CW B );
note that their
.CW next
commands are intermixed.
Because the predefined string
.CW delta
follows the specification of
.CW B ,
that string is plotted at each point in the sequence.
.PP
\*G
has numeric variables (implemented as double-precision
floating point numbers) and
the usual collection of arithmetic operators and
mathematical functions; see the reference section
for details.
.PP
\*G
provides the same rudimentary macro facility that
.I pic
does:
.P1
define \f2name\fP { \f2replacement text\fP }
.P2
defines
.IT name
to be the
.IT "replacement text" .
The replacement may be any text that contains balanced open and closing braces
.CW "{ }" .
(Alternatively, the
.IT "replacement text
may be quoted by
any single character that does not appear in the replacement;
the string is terminated by the next occurrence of that character.)
Any subsequent occurrence of
.IT name
will be replaced by
.IT "replacement text" .
.EQ
delim %%
.EN
.PP
The replacement text of a macro definition may
contain occurrences of
.CW $1 ,
.CW $2 ,
etc.;
these will be replaced by the corresponding actual
arguments when the macro is invoked.
The invocation for a macro with arguments is
.P1
name(arg1, arg2, ...)
.P2
Non-existent arguments are replaced by null
strings.
.EQ
delim $$
.EN
.PP
The following
\*g
program uses macros and arithmetic to plot
crude approximations to
the square and square root functions.
.P1
.get macarith.g
.P2
The macro
.CW root
uses the
.CW ^
exponentiation operator.
(Because
\*g
has the square root function
.CW sqrt ,
that macro is in fact superfluous.)
The program produces
.grap macarith.g
.PP
The
.CW copy
command has a
.CW thru
parameter that allows each line of a file to
be treated as though it were a macro call, with
the first field serving as
the first argument,
and so on.
This is the typical
\*g
mechanism for plotting files that are not stored as
time series or as $(x,y)$ pairs.
We will illustrate its use on the file
.CW states.d ,
which contains data on the fifty states.
.P1
.d states.d
.P2
The first field is the postal abbreviation of the state's
name (Alaska, Wyoming, Vermont, ...), the second field
is the number of Representatives to Congress from the state
after the 1981 reapportionment, and the third field is
the population of the state as measured in the 1980 Census.
The states appear in increasing order of
population.
.PP
We will first plot this data as
population, representative pairs.
(In the
.CW coord
statement,
.CW "log log"
is a synonym for
.CW "log x log y" .)
.P1
.get states1.g
.P2
Although the population is given in persons,
the
.CW PlotState
macro
plots the population in millions by dividing
the third input field
by one million (written in exponential notation
as
.CW 1e6 ,
for $1 times 10 sup 6$).
.grap states1.g
Using
.CW circle
as a plotting symbol displays
overlapping points that are obscured when
the data is plotted with bullets.
The representation of a state is roughly proportional
to its population, except in the very small states.
.PP
Our next plot will use the state's rank
in population as the $x$-coordinate and two
different $y$-coordinates: population and number of
representatives.
We will use two
.CW coord
commands to define the two coordinate systems
.CW pop
and
.CW rep .
We then explicitly give the coordinate system
whenever we refer to a point,
both in constructing axes and plotting data.
.P1
.get states2.g
.P2
The
.CW copy
statement in the program uses an
.I "immediate macro"
enclosed in curly brackets and thus avoids having to
name a macro for this task.
Because the program assumes that the states are
sorted in increasing order of population, it
generates
.CW thisrank
internally as a
\*g
variable.
The program produces
.grap states2.g
.PP
The plotting symbols were chosen for contrast in
both shape and shading.
This graph also indicates that representation is proportional
to population.
Once we see this graph, though, we should realize that we don't
really need two coordinate systems: we can relate the two by
dividing the population of the U.S. \(em about 226,000,000 \(em by
the number of representatives \(em 435 \(em to see that each
representative should count as 520,000 people.
If the purpose of this graph were to tell a story about
American politics rather than to illustrate
multiple coordinate systems,
it should be redrawn with a single coordinate
system.
.PP
Many graphs plot both observed data and a function
that (theoretically) describes the data.
There are many ways to draw a function
in \*g:
a series of
.CW next
commands is tedious but works, as does writing a
simple program to write a data file that is subsequently
read and plotted by \*g.
The
.CW for
statement often provides a better solution.
This
\*g
program
.P1
.get sin1.g
.P2
produces
.grap sin1.g
.a
The
.CW for
statement uses the same syntax as the
.CW ticks
statement, but the
.CW from
keyword can be replaced by
.CW = '', ``
which will look more familiar to programmers.
It varies the index variable over the specified range
and for each value executes all statements inside the delimiter
characters, which use the same rules as macro
delimiters.
It is, of course, useful for many tasks beyond plotting functions.
.EQ
delim %%
.EN
.PP
The
.CW if
statement provides a simple mechanism for conditional execution.
If a file contains data on both cities and states (and lines
describing states have ``S'' in the first field), it could be plotted
by statements like
.P1
if "$1" == "S" then {
	PlotState($2,$3,$4)
} else {
	PlotCity($2,$3,$4,$5,$6)
}
.P2
The
.CW else
clause
is optional; delimiters use the same rules as macros and
.CW for
statements.
.EQ
delim $$
.EN
.NH
A Collection of Examples
.PP
The previous section covered the
\*g
commands that are used in common graphs.
In this section we'll spend less time on
language features, and survey a wider variety of
graphs.
These examples are intended more for browsing and
reference than for straight-through reading.
Be prepared to refer to the manual in Section 5 when you stumble over a new
\*g
feature.
.PP
The file
.CW cars.d
contains the mileage (miles per gallon) and the weight
(pounds) for 74 models of automobiles sold in the United States
in the 1979 model year.
.P1
.d cars.d
.P2
The trivial
\*g
program
.P1
.get cars1.g
.P2
produces
.grap cars1.g
This graph shows that weights bottom out somewhat
below 2000
pounds and that heavier cars get worse mileage;
it is hard to say much more about the relationship
between weight and mileage.
.PP
The next graph provides labels, uses circles
to expose data hidden in the clouds of bullets,
and re-expresses the $x$-axis in gallons per mile.
It also changes the point size and vertical spacing
to a size appropriate for camera-ready journal articles
and books; the size changes should be made outside the
\*g
program.
The
.CW \&.ft
command changes to a Helvetica font, which
some people prefer for graphs.
.P1
.get cars2.g
.P2
\*G
supports logarithmic re-expression of data with the
.CW log
clause in the
.CW coord
statement; any other re-expression of data must be done
with
\*g
arithmetic, as above.
.br
.grap cars2.g
This graph shows that
gallons per mile is roughly proportional to weight.
(The two outliers near 4000 pounds are the Cadillac
Seville and the Oldsmobile 98.)
.PP
In
.I "Visual Display of Quantitative Information" ,
Tufte proposes the ``dot-dash-plot'' as a means for maximizing
data ink (showing the two-dimensional distribution and
the two one-dimensional marginal distributions) while minimizing
what he calls ``chart junk'' \(em ink wasted on borders
and non-data labels.
His preference is easy to express in \*g:
.P1
.get cars3.g
.P2
Although the resulting graph is visually attractive,
we do not find it as useful for interpreting the data.
.grap cars3.g
Tufte's graph does point out two facts that are
not obvious in the previous graphs:
there is a gap in car weights near 3000 pounds (exhibited
by the hole in the $y$-axis ticks), and the gallons per
mile axis is regularly structured (the ticks
are the reciprocals of an almost dense sequence of integers).
The reader may decide whether those insights are worth
the decrease in clarity.
.PP
Throughout the twentieth century, horses, cars and people
have gotten faster;
let's study those improvements.
For horses, we'll consider the winning times
of the Kentucky Derby from 1909 to 1988, in
the file
.CW speedhorse.d :
.P1
.d speedhorse.d
.P2
The program
.P1
.get speedhorse1.g
.P2
produces the graph
.grap speedhorse1.g
Each race is recorded with a bullet and
record times are marked by horizontal lines.
Secretariat is the only horse to have run the
one-and-a-quarter-mile
race in under two minutes; he won in 1973 in
1:59.4.
.PP
For automobiles we will study the
world land speed record (even though those vehicles
are by now just low-flying airplanes).
The file
.CW speedcar.d
lists years in which speed records were set and the record
set in that year, in miles per hour averaged over a one-mile
course.
.P1
.d speedcar.d
.P2
We will plot the data with the following
\*g
program, which uses nested braces in the
.CW copy
and
.CW if
statements.
.P1
.get speedcar1.g
.P2
.PP
Each record line is drawn after the
.I next
record is read, because
the program must know when the record was broken to draw
its line.
The
.CW if
statement handles the first record, and the extra
.CW line
command extends the last record out to the current date.
.grap speedcar1.g
The horizontal lines reflect the nature of world records: they
last until they are broken.
The records could also have been plotted by a scatterplot
in which each point represents the setting of a record,
but it would be misleading to connect adjacent
points with line segments
(which we inappropriately did in the graphs
of the Olympic 400 meter run).
.PP
The following graph shows the world record times for the
one mile run;
because its
\*g
program is so similar to its automotive counterpart,
we won't show the program or data.
.grap speedman1.g
The three graphs show three different kinds of
changes.
Although horses are getting faster, they appear to
be approaching a barrier near two minutes.
Cars show great jumps as new technologies are introduced
followed by a plateau as limits of the
technology are reached.
Milers have shown a fairly consistent
linear improvement
over this century, but there must be an
asymptote down there somewhere.
.PP
The next file gives the median heights of boys
in the United States aged 2 to 18, together with
the fifth and ninety-fifth percentiles.
.P1
.d boyhts.d
.P2
The heights are given in centimeters (1 foot = 30.48 centimeters).
The trivial program
.P1
.get boyhts1.g
.P2
displays the data as
.grap boyhts1.g
Because there are four numbers on each input line, the first is
taken as an $x$-value and the remaining three are plotted
as $y$-values.
.PP
The three curves appear to be roughly straight
(at least up to age 16),
so it makes sense to fit a line
through them.
We will use the standard least squares regression
in which
.EQ
slope ~=~ {
{n SIGMA x y ~ - ~ SIGMA x SIGMA y }
over
{n SIGMA x sup 2 ~ - ~ ( SIGMA x ) sup 2 }
}
.EN
(where the summations range over all $n$ $x$ and $y$ values
in the data set) and the $y$-intercept is
.EQ
{SIGMA y ~ - ~ slope times SIGMA x} over n
.EN
The following
\*g
program boldly (and rather foolishly) implements that formula.
.P1
.get boyhts3.g
.P2
It plots the extreme fifth percentiles as a bar through
the median, which is plotted as a bullet.
All heights are converted to feet before plotting and calculating
the regression line.
.grap boyhts3.g
.PP
\*G
.CW print
statements write on
.CW stderr
as they are processed by \*g;
their single argument can be either an expression or a string.
The
.CW print
statements (which are commented out in
the above
\*g
program) at one time
showed that the regression line is
.EQ
Height ~ in ~ Feet ~ = ~ 2.61 ~ + ~ .19 times Age
.EN
Thus for most American
boys between 3 and 16, you may safely assume
that they started out life at 2 feet 7 inches and grew at the
rate of two and a quarter inches per year.
.PP
This program probably misapplies \*g;
if you really want to perform least squares regressions on
data, you should usually use a simple
.I awk
program like
.P1
.get regress.awk
.P2
(Be warned, though, that this program is not numerically
robust.)
.PP
While we're on the subject of fitting straight lines to data,
we'll redraw three graphs from J. W. Tukey's
.I "Exploratory Data Analysis" .
The file
.CW usapop.d
records the population of the United States
in millions at ten-year intervals.
.P1
.d usapop.d
.P2
Tukey's first two graphs indicate that the later population
growth was linear while the early growth was exponential.
The following
\*g
program plots them as a pair, using
.CW graph
commands to place internally unrelated graphs adjacent to
one another.
.P1
.get usapop1.g
.P2
The statements defining each graph are indented for clarity.
The second graph has the northern point of its frame 0.05
inch below the southern point of the frame of the first graph;
the
.CW with
clause is passed directly through to
.I pic
without being evaluated for macros or expressions.
The names of both graphs begin with capital letters to
conform to
.I pic
syntax for labels.
.grap usapop1.g
.PP
Polynomial functions lie between the linear and exponential
functions; Tukey shows how a seventh-degree polynomial provides
a better (and longer) fit to the early population growth.
.P1
.get usapop2.g
.P2
This program re-expresses the $x$-axis with
\*g
arithmetic and uses an
.CW if
statement to graph only part of the data file.
It produces
.grap usapop2.g
.nr k \n%
The
.I eqn
.CW "space 0"
clause is necessary to keep
.I eqn
from adding extra space that would interfere
with positions computed by \*g;
see Section 4.
.PP
The file
.CW army.d
contains four related time series
describing the United States Army.
.P1
.d army.d
.P2
The first field is the year; the next four fields give
the number of male officers, female officers, enlisted males
and enlisted females, each in thousands.
(Actually, there were no female enlisted personnel in the
Army until 1943; the value 1 in 1940 and 1942 is just
a placeholder, since
\*g
has no mechanism for handling missing data.)
The following
\*g
program draws the four series with four different sets of
.CW draw
and
.CW next
commands.
.P1
.get army1.g
.P2
The program labels the lines by
.CW copy ing
immediate data;
the program is therefore shorter to write and easier to change.
The delimiter string
.CW XXX
in the
.CW until
clause could be deleted in this graph: the
.CW \&.G2
line also denotes the end of data.
Even though that string is enclosed in quotes,
it may not contain spaces.
The $y$-positions of the labels are the
result of several iterations.
.grap army1.g
.PP
This data can tell many stories: the buildup during the
Second World War is obvious, as is the exodus after the
war; increases during Korea and Vietnam are
also apparent.
We will consider a different story: the ratio of
enlisted men to the three other classes of personnel.
There are several ways to plot this data
(the most obvious graph uses three time series showing how
the ratios change over time, and is
left as an exercise for the reader).
.PP
We will instead construct a graph that gives little insight into this
data, but illustrates a general method that is quite useful
in conjunction with \*g.
The graph is a ``scatterplot vector'' that shows how one
variable (the number of enlisted men) varies as a function of
the other three.
Breaking with tradition, we first show the final graphs, all
of which have logarithmic scales.
.grap army2.g
The number of enlisted men is almost linearly
related to the number of male officers, it is somewhat related to the number
of female officers, and it varies widely as a function of the number
of enlisted women.
.PP
Much more interesting than the graph itself is the method we used to
produce it.
We wrote a miniature ``compiler'' that accepts as
its ``source language'' a description of a scatterplot vector and
produces as ``object code'' a
\*g
program to draw the graph.
The source program for the above example is
.P1
.get army2.v
.P2
The program lists several
global attributes of the graph, the
$y$-variable to be plotted, and as many $x$-variables as
are desired; with each variable is its field in the file
and a descriptive string.
The language is ``compiled'' by the following
.I awk
program.
.P1
.get scatvec.awk
.P2
Running this program on the above description produces the following
output, which is typically piped directly to \*g.
.P1
.get army2.g
.P2
The generated program uses the
.I pic
trick of re-using the same name
.CW A ) (
for several objects.
.PP
Although the program above is merely a toy,
``minicompilers'' can produce useful preprocessors
for \*g.
The
.CW scatmat
program, for instance, is a 90-line
.I awk
program that reads a simple input language and produces as
output a
\*g
program to produce a ``scatterplot matrix'', which
is a handy graphical device for spotting pairwise interactions
among several variables.
If
\*g
lacks a feature you desire, consider building
a simple preprocessor to provide it.
An alternative is to define
macros for the task; which approach is best depends
strongly on the job you wish to accomplish.
.PP
The next graph uses iterators to make a graph without
reading data from a file.
Rather, its ``data'' is a
function of two variables
that describes a
derivative field and a function of one variable
that describes one solution to the differential
equation.
.P1
.get ode1.g
.P2
The left label uses
.I eqn
text between the $font CW "$$"$ delimiters.
The variable
.CW scale
ensures that all lines in the direction field are the same
length.
The
.CW in
clauses in the
.CW ticks
statements specify that the ticks go in zero inches
to avoid overprinting.
The variables
.CW tx
and
.CW ty
are so named because
.CW x
and
.CW y
are reserved words for the
.CW coord
statement.
.grap ode1.g
.PP
Programmers familiar with floating point arithmetic may be
surprised that the above graph is correct.
Because of roundoff error, iteration
.CW "from 0 to 1 by .05" '' ``
usually produces the values
$0, ~ .05, ~ .10, ~ ..., ~ .95$.
\*G
uses a ``fuzzy test''
in the
.CW for
statement to avoid that problem, which may in turn introduce
other problems.
Such problems may be avoided by iterating over an integer range
and incrementing a non-integer value within the loop.
.PP
Most of the data we have seen so far is inherently
two (or more) dimensional.
As an example of one-dimensional data, we will return to
the populations of the fifty states, which
is the third field in the file
.CW states.d
introduced earlier;
the file is sorted in increasing order of population.
Our first graph takes the most space, but
it also gives the most information.
.P1
.get states8.g
.P2
The
.CW L
macro (for Label)
with input parameter $X$ evaluates to the number
$2 sup X / 1,000,000$ followed by the string "$X$"
(the
.CW ticks
command expects a number followed by a string label).
.grap states8.g
The dotted line is the least squares regression
.EQ
log sub 10 ~ Population ~ = ~ 7.214 ~ - ~ .03 times Rank
.EN
which gives 15.3 million as the population of the
largest state and .515 million as the population
of the smallest state.
It says that
population drops by a factor of two every ten states
(compare the top and left scales).
As sloppy as the exponential fit is, though, it is a much better
fit to this data
than a Zipf's Law curve is (drawing that curve is left as
an exercise for the reader).
.PP
The next graph is a more standard representation of
one-dimensional data.
.P1
.get states3.g
.P2
The markers were chosen to be
.CW vticks
because they denote only an $x$-value.
.grap states3.g
.PP
The next one-dimensional graph uses the state's name as
its marker; to reduce overprinting the graph is ``jittered''
by using a random number as a $y$-value.
.P1
.get states4.g
.P2
The function
.CW rand()
returns a pseudo-random real number chosen uniformly over the interval [0,1).
.grap states4.g
This graph is too cluttered; circles would have been
a better choice as a plotting symbol (bullets, once again, would
hide data).
.PP
Histograms are a standard way of presenting one-dimensional
data in two-dimensional form.
Our first step in building a histogram of the population
data is the following
.I awk
program, which counts how many states are in each ``bin''
of a million people.
.P1
.get states5.awk
.P2
The variable
.CW bzs
tells where bin zero starts; although it is zero in this
graph, it might be 95 in a histogram
of human body temperatures in degrees Fahrenheit.
The program produces the following output in
.CW states2.d :
.P1
.d states2.d
.P2
There are 12 states with population between 0 and 999,999,
5 states with population between 1,000,000 and 1,999,999,
and so on.
.PP
This
\*g
program uses three
.CW line
commands to plot each rectangle in the histogram.
.P1
.get states5.g
.P2
It produces
.grap states5.g
.PP
The same file can be plotted in a
more attractive (and more useful) form by
.P1
.get states6.g
.P2
which produces
one of Bill Cleveland's ``dot charts'' or ``lolliplots'':
.grap states6.g
(We use
.CW \e(bu ,
the
.I troff
character for a bullet, rather than the built-in string to
get a larger size.)
.PP
Other histograms are possible.
The following
.I awk
program
.P1
.get states7.awk
.P2
produces the file
.CW states3.d
.P1
.d states3.d
.P2
which lists the state's abbreviation, bin number, and
height within the bin.
The
\*g
program
.P1
.get states7.g
.P2
reads that file to make the following histogram, in which
the state names are used to display the heights of the bins.
In each bin, the states occur in increasing order of
population from bottom to top.
.grap states7.g
.PP
The next data set is a run-time profile of an early version of \*g,
created by compiling the program with the
.CW -p
option and running
.CW prof
after the program executed.
.P1
.d prof1.d
.P2
Although there were more than fifty procedures in the program, the
top four time-hogs accounted for more than half of the run time.
This file is difficult for
\*g
to deal with:
even though
.CW if
statements would allow us to extract lines 2 through 11
of the file, we could not remove the leading
.CW _
from a routine name or access the last field in a record.
We will therefore process it with
the following
.I awk
program.
.P1
.get prof1.awk
.P2
The program produces
.P1
.d prof2.d
.P2
We could even use the
.I sh
statement to execute the
.I awk
program from within \*g, which would make the latter entirely
self-contained (see the reference manual for details).
.PP
We will display the data with this program.
.P1
.get prof1.g
.P2
Observe that the program knows nothing about the range of the data.
It uses default ticks and a
.CW frame
statement with a computed height to achieve
total data independence.
.grap prof1.g
This bar chart highlights the fact that most of the time spent by
\*g
is devoted to input and output.
.PP
J. W. Tukey's box and whisker plots
represent the median, quartiles, and extremes of a
one-dimensional distribution.
The following
\*g
program defines a macro to draw a box plot, and then
uses that shape to compare the distribution of heights of
volcanoes with the distribution of heights of States of the Union.
.P1
.get box1.g
.P2
Boxes are one of many shapes used for the graphical
representation of several quantities.
If you use such shapes frequently then you should
make a library file of their macros to
.CW copy
into your
\*g
programs.
The above program produces
.grap box1.g
Even though the extreme heights are the same, state heights
have a lower median and a greater spread.
.PP
Someday you may use
\*g
to prepare overhead transparencies, only to find that
everything comes out too small.
The following program illustrates some ways to get larger
graphs.
.P1
.zzz slide1.g
.P2
The
.CW ps
and
.CW vs
commands preceding the graph set the text size to 14 points and
the vertical spacing to 18 points; the two quantities are
reset by the commands following the
.CW .G2 .
Such size changes should be made outside the
\*g
program, as mentioned earlier.
The
.CW 4
following the
.CW .G1
stretches the graph (including
\*g's
estimate of the accompanying text) to be four inches wide;
it is an alternative to altering the
.CW frame
command.
The macro
.CW blob
is a plotting symbol that is much larger than
.CW bullet ;
the different name ensures that later references to
.CW bullet
are unaffected.
The
.I troff
commands within the
.CW blob
string move the character down one-tenth of an em
to center its plotting position (determined experimentally)
and then reset the vertical position.
The program produces this trivial (but large) graph.
.br
.grap slide1.g
.NH
Using Grap
.PP
Following are a few day-to-day matters about using \*g.
.NH 2
Errors
.PP
\*G
attempts to pinpoint input errors; for example,
the input
.P1
\&.G1
i = i + 1
.P2
results in this message on
.CW stderr :
.P1
grap: syntax error near line 1, file -
context is
i = i >>> + <<< 1
.P2
The error was noticed
at the
.CW + .
Unfortunately, pinpointing is not the same as explaining:
the real error is that the variable
.CW i
was not initialized.
.PP
The ``words''
.CW x
and
.CW y
are reserved (for the
.CW coord
statement);
you will get an equally inexplicable syntax error message if you use them
as variable names.
(This design is bad, but not nearly so bad as
having the
.CW log
and
.CW exp
functions use base 10.)
.PP
\*G
tries to load a file of standard macro definitions
.CW /usr/lib/grap.defines ) (
for terms like
.CW bullet ,
.CW plus ,
etc.
It doesn't complain if that file isn't found,
but if you later use one of these words,
you'll get a syntax error message.
.PP
Certain constructs suggested by analogy to
.I pic
do not work.
For example,
.CW .GS
and
.CW .GE
would have been nicer than
.CW .G1
and
.CW .G2 ,
but they were already taken.
The
.I pic
construct
.P1
\&.PS <file
.P2
has been superseded by
\*g's
.CW copy
command (which in turn has been retrofitted into
.I pic ).
.NH 2
\fITroff\fP issues
.PP
You may use
.I troff
commands like
.CW .ps
or
.CW .ft
to change text sizes and fonts within a graph,
or use balanced
.CW \es
and
.CW \ef
commands within a string.
Do not, however,
add space
.CW .sp ) (
or change the line spacing
.CW .vs , (
.CW .ls )
within a graph.
Some defined terms like
.CW bullet
contain embedded size changes;
further qualifying them with
\*g
.CW size
commands may not always work.
.PP
Because
\*g
is built on top of
.I pic ,
the following quote from the
.I pic
manual is relevant:
``There is a subtle problem with complicated equations inside
.I pic
pictures \(em they come out wrong if
.I eqn
has to leave extra vertical space for the equation.
If your equation involves more than subscripts and superscripts,
you must add to the beginning of each such equation the extra information
.CW "space 0" ''.
This feature was illustrated in the graph of the
United States population in Section 3.
.NH 2
Alternatives
.PP
Besides
\*g
and your local draftsperson, what other choices are there?
.PP
The S system |reference(slanguage chambers) provides
a host of tools for statistical analysis,
but somewhat fewer tools than
\*g
for producing document-quality graphs.
S produces graphs on the screen of a DMD 5620 terminal much more quickly than
\*g
(often in seconds rather than minutes), but it
takes somewhat longer to learn (at least for us).
If you expect to do a lot of interactive data analysis, then
S is probably the right tool for you.
S may be used to generate
.I pic
commands.
.PP
The standard UNIX program
.I graph
provides many of the basic features of
\*g,
though with quite a bit less control over details, particularly
text.
It produces output only in the
.UX
.I plot (5)
language,
which may be processed by a variety of filters
for a variety of output devices.
.PP
The original
.UX
typesetter graphics programs are
.I pic
and
.I ideal ;
you may be able to do as well without using
\*g
as an intermediary.
In particular,
.I ideal
provides shading and clipping,
which are useful
in presentation-quality bar charts and the like, but are
well beyond the capabilities of
.I pic .
.EQ
delim $$
.EN
.NH
References
.LP
|reference_placement
.NH
Reference Manual
.PP
In the following,
.I italic
terms are syntactic categories,
.CW typewriter
terms are literals,
parenthesized constructs are optional, and ... indicates repetition.
In most cases, the order of statements,
constructs and attributes is immaterial.
.P1
.IT "grap program" :
    .G1 \f2(width in inches)\fP
    \f2grap statement\fP
    ...
    .G2
.P2
A width on the
.CW .G1
line overrides the computed width, as in
.I pic .
.P1
.IT "grap statement" :
.I
    frame \(or label \(or coord \(or ticks \(or grid \(or plot \(or line \(or circle \(or draw \(or new \(or next
    \(or graph \(or numberlist \(or copy \(or for \(or if \(or sh \(or pic \(or assignment \(or print
.ft
.P2
.PP
The
.CW frame
statement defines the frame that surrounds the graph:
.P1
.IT frame :
    frame \f2(\fPht \f2expr)\fP \f2(\fPwid \f2expr)\fP \f2((side) linedesc)\fP \f2...\fP
.IT side :
    top \(or bot \(or left \(or right
.IT linedesc :
    solid \(or invis \(or dotted \f2(expr)\fP \(or dashed \f2(expr)\fP
.P2
Height and width default to 2 and 3 inches;
sides default to solid.
If
.I side
is omitted, the
.I linedesc
applies to the entire frame.
The optional expressions after
.CW dotted
and
.CW dashed
change the spacing exactly as in
.I pic .
.PP
The
.CW label
statement places a label on a specified side:
.P1
.IT label :
    label \f2side\fP \f2strlist\fP \f2...\fP \f2shift\fP
.IT shift :
    left\f2 \(or \fPright\f2 \(or \fPup\f2 \(or \fPdown \f2expr ...\fP
.IT strlist :
    \f2str ... (\fPrjust\f2 \(or \fPljust\f2 \(or \fPabove\f2 \(or \fPbelow\f2) ... (\fPsize \f2(\fP\(+-\f2) expr) ...\fP
.IT str :
    "\f2...\fP"
.P2
Lists of text strings are stacked vertically.
In any context, string lists may contain clauses
to adjust the position or change the point size.
Each clause applies to the string preceding it
and all following strings.
Labels may also have a
.CW width
attribute, to override
\*g's
default computation.
.PP
Normally the coordinate system is defined by the data,
with 7 percent extra on each side.
(To change that to 5 percent, assign 0.05 to the
\*g
variable
.CW margin ,
which is reset to 0.07 at each
.CW .G1
statement.)
The
.CW coord
statement defines an overriding system:
.P1
.IT coord :
    coord \f2(name)\fP \f2(\fPx \f2expr,expr)\fP \f2(\fPy \f2expr,expr)\fP \f2(\fPlog x \(or log y \(or log log\f2) \fP
.P2
Coordinate systems can be named;
ranges, logarithmic scaling, etc., are done separately for each.
.PP
The
.CW ticks
statement places tick marks on one side of the frame:
.P1
.IT ticks :
    ticks \f2side\fP \f2(\fPin \(or out \f2(expr))\fP \f2(shift) (tick-locations)\fP
.IT tick-locations :
    at \f2(name) expr (str)\fP, \f2expr (str)\fP, \f2...\fP
    \(or from \f2(name) expr\fP to \f2expr\fP \f2(\fPby \f2(op) expr)\fP \f2str\fP
.P2
If no ticks are specified, they will be provided automatically;
.CW ticks
.CW off
suppresses automatic ticks.
The optional expression after
.CW in
or
.CW out
specifies the length of the ticks in inches.
The optional name refers to a coordinate system.
If
.IT str
contains
format specifiers like
.CW %f
or
.CW %g ,
they are interpreted as by
.CW printf .
If no
.IT str
is supplied, the tick labels will be the values of the
expressions.
.PP
If the
.CW by
clause is omitted, steps are of size 1.
If the
.CW by
expression is preceded by one of
.CW + ,
.CW - ,
.CW *
or
.CW / ,
the step is scaled by that operator,
e.g.,
.CW *10
means that each step is 10 times the previous one.
.PP
The
.CW grid
statement produces grid lines along (i.e., perpendicular to)
the named side.
.P1
.IT grid :
    grid \f2side (linedesc) (shift) (tick-locations)\fP
.P2
Grids are labeled by the same mechanism as
.CW ticks .
It is possible to draw grids without ticks by placing the phrase
.CW ticks
.CW off
after the side name and before the iterator.
.PP
Plot
statements place text at a point:
.P1
.IT plot :
    \f2strlist\fP at \f2point\fP
    plot \f2expr (str)\fP at \f2point\fP
.IT point :
    \f2(name) expr,expr\fP
.P2
As in the
.CW label
statement, the string list may contain
position and size modifiers.
The
.CW plot
statement uses the optional format string as in C's
.CW printf
statement \(em it may contain a
.CW %f
or
.CW %g .
The optional name refers to a coordinate system.
.PP
The
.CW line
statement draws a line or arrow from here to there:
.P1
.IT line :
    \f2(\fPline \(or arrow\f2)\fP from \f2point\fP to \f2point (linedesc)\fP
.P2
The
.CW circle
statement draws a circle:
.P1
.IT circle :
    circle at \f2point (\fPradius \f2expr)\fP
.P2
The radius is in inches; the default size is small.
.PP
The
.CW draw
statement defines a sequence of lines:
.P1
.IT draw :
    draw \f2(name) linedesc (str)\fP
.P2
Subsequent data for the named sequence
will be plotted as a line of the specified style,
with the optional
.IT str
plotted at each point.
The
.CW next
statement continues a sequence:
.P1
.IT next :
    next \f2(name)\fP at \f2point (linedesc)\fP
.P2
If a line description is specified, it overrides the default
display mode for the line segment ending at
.I point .
The
.CW new
statement starts a new sequence; it has the same format as the
.CW draw
statement.
.PP
A line consisting of a set of numbers
is treated as a family of points
$x$, $y sub 1$, $y sub 2$, etc.,
to be plotted at the single
$x$ value.
.P1
.IT numberlist :
    \f2number\fP ...
.P2
If there is only one number it is treated as
a $y$ value, and $x$ values of 1, 2, 3, ...
are supplied automatically.
.PP
\*G
provides arithmetic with the operators
.CW + ,
.CW - ,
.CW * ,
.CW / ,
and
.CW ^ .
Variables may be assigned to;
assignments are expressions.
Built-in functions include
.CW log ,
.CW exp
(both base 10 \(em beware!),
.CW int
(truncates towards zero),
.CW sin ,
.CW cos
(both use radians),
.CW atan2(dy,dx) ,
.CW sqrt ,
.CW min
(two arguments only),
.CW max
(ditto),
and
.CW rand()
(returns a random real number in [0,1)).
.PP
The
.CW for
statement provides a modest looping facility:
.P1
.IT for :
    for \f2var\fP from \f2expr\fP to \f2expr (\fPby \f2(op) expr)\fP do { \f2anything\fP }
.P2
The string may contain internally balanced braces.
Alternatively, any other character may appear immediately after the word
.CW do ,
and the string is terminated by the next occurrence of that character.
The text
.IT anything
(which may contain newlines) is repeated as
.IT var
takes on values from
.IT expr1
to
.IT expr2 .
As with tick iterators, the
.CW by
clause is optional, and may proceed arithmetically or multiplicatively.
In a
.CW for
statement,
the
.CW from
may be replaced by
.CW = ''. ``
.PP
The
.CW if-then-else
statement provides conditional evaluation:
.P1
.IT if :
    if \f2expr\fP then { \f2anything\fP } else { \f2anything\fP }
.P2
The
.CW else
clause
is optional.
Relational operators include
.CW == ,
.CW != ,
.CW > ,
.CW >= ,
.CW < ,
.CW <= ,
.CW ! ,
.CW || ,
and
.CW && .
Strings may be compared with the operators
.CW ==
and
.CW != .
.PP
It is possible to convert numeric expressions to formatted strings:
.P1
    sprintf("\f2format\fP", \f2expr\fP, \f2expr\fP, ...)
.P2
is equivalent to a quoted string in any context.
Variants of
.CW %f
and
.CW %g
are the only sensible format conversions.
.PP
\*G
provides the same macro processor that
.I pic
does:
.P1
    define \f2macro-name\fP { \f2anything\fP }
.P2
.EQ
delim %%
.EN
Subsequent occurrences of the macro name will be replaced
by the string, with arguments of the form \f(CW$\fIn\fR
replaced by corresponding actual arguments.
Macro definitions persist across
.CW .G2
boundaries, as do values of variables.
.EQ
delim $$
.EN
.PP
The
.CW copy
statement is somewhat overloaded:
.P1
    copy "\f2filename\fP"
.P2
includes the contents of the named file at that point;
.P1
    copy "\f2filename\fP" thru \f2macro-name\fP
.P2
copies the file through the macro; and
.P1
    copy thru \f2macro-name\fP
.P2
copies subsequent lines through the macro;
each number or quoted string is treated as an argument.
In each case, copying continues until end of file or the next
.CW .G2 .
The optional clause
.CW until
.IT str
causes copying to terminate when a line whose
first field is
.IT str
occurs.
In all cases, the macro can be specified inline rather than by name:
.P1
    copy thru { \f2macro body\fP }
.P2
.PP
The
.CW sh
command passes text through to the UNIX shell.
.P1
.IT sh :
    sh { \f2anything\fP }
.P2
The body of the command is scanned for macros.
The built-in macro
.CW pid
is a string consisting of the process identification number;
it can be used to generate unique file names.
.PP
The
.CW pic
command passes text through to
.I pic
with the
.CW pic '' ``
removed; variables and macros are not evaluated.
Lines beginning with a period (that are not numbers)
are passed through literally, under the assumption that they
are
.I troff
commands.
.PP
The
.CW graph
statement
.P1
.IT graph :
    graph \f2Picname (pic-text)\fP
.P2
defines a new graph named
.I Picname ,
resetting all coordinate systems.
If any
.CW graph
commands are used in a
\*g
program, then the statement after the
.CW \&.G1
must be a
.CW graph
command.
The
.I pic-text
can be used to position this graph relative
to previous graphs by referring to their
.CW Frame s,
as in
.P1
    graph First
    ...
    graph Second with .Frame.w at First.Frame.e + (0.1,0)
.P2
Macros and expressions in
.I pic-text
are not evaluated.
.I Picname s
must begin with a capital letter to satisfy
.I pic
syntax.
.PP
The
.CW print
statement
.P1
.IT print :
    print \f2(expr\fP \(or \f2str)\fP
.P2
writes on
.CW stderr
as
\*g
processes its input; it is sometimes useful for debugging.
.PP
Many reserved words have synonyms, such as
.CW thru
for
.CW through ,
.CW tick
for
.CW ticks ,
and
.CW bot
for
.CW bottom .
.PP
The
.CW #
introduces a comment, which ends at the end of the line.
Statements may be continued over several lines by preceding each
newline with a
backslash character.
Multiple statements may appear on a single line separated
by semicolons.
\*G
ignores any line that is entirely blank, including those
processed by
.CW "copy thru"
commands.
.PP
When
\*g
is first executed it reads standard macro definitions
from the file
.CW /usr/lib/grap.defines .
The definitions include
.CW bullet ,
.CW plus ,
.CW box ,
.CW star ,
.CW dot ,
.CW times ,
.CW htick ,
.CW vtick ,
.CW square ,
and
.CW delta .
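.PP
As a small illustration of how several of the statements in this manual fit together,
the following minimal sketch plots the first five powers of ten
on a logarithmic $y$-axis.
It is an illustration only, not one of the example programs used earlier in this document:
the frame size, the coordinate ranges, and the label strings are arbitrary choices,
and the sketch assumes that the standard definitions file was found, so that
.CW bullet
is defined.
.P1
\&.G1
# illustrative sketch; sizes, ranges and labels are arbitrary
frame ht 2 wid 3
label left "value"
label bot "exponent"
# explicit ranges override the data-driven defaults; y is logarithmic
coord x 0,4 y 1,10000 log y
# multiplicative iterator: 1, 10, 100, 1000, 10000
ticks left out from 1 to 10000 by *10
# no by clause, so steps are of size 1
ticks bot out from 0 to 4
# exp is base 10; bullet is assumed to come from grap.defines
for i from 0 to 4 do { bullet at i, exp(i) }
\&.G2
.P2
Because
.CW exp
is base 10, each bullet should fall at the height of one of the ticks generated by the
.CW "by *10"
iterator.
With the
.CW by
clause omitted, both the bottom ticks and the
.CW for
loop step by 1.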