.TH AWK 1
.SH NAME
awk \- pattern-directed scanning and processing language
.SH SYNOPSIS
.B awk
[
.BI -F fs
]
[
.BI -v
.I var=value
]
[
.BI -mr n
]
[
.BI -mf n
]
[
.B -f
.I prog
[
.I prog
]
[
.I file ...
]
.SH DESCRIPTION
.I Awk
scans each input
.I file
for lines that match any of a set of patterns specified literally in
.IR prog
or in one or more files
specified as
.B -f
.IR file .
With each pattern
there can be an associated action that will be performed
when a line of a
.I file
matches the pattern.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
The file name
.L -
means the standard input.
Any
.IR file
of the form
.I var=value
is treated as an assignment, not a file name,
and is executed at the time it would have been opened if it were a file name.
The option
.B -v
followed by
.I var=value
is an assignment to be done before
.I prog
is executed;
any number of
.B -v
options may be present.
.PP
An input line is normally made up of fields separated by white space,
or by regular expression
.IR fs .
The fields are denoted
.BR $1 ,
.BR $2 ,
\&..., while
.B $0
refers to the entire line.
.PP
To compensate for inadequate implementation of storage management,
the
.B -mr
option can be used to set the maximum size of the input record,
and the
.B -mf
option to set the maximum number of fields.
.PP
A pattern-action statement has the form
.IP
.IB pattern " { " action " }
.PP
A missing
.BI { " action " }
means print the line;
a missing pattern always matches.
Pattern-action statements are separated by newlines or semicolons.
.PP
An action is a sequence of statements.
A statement can be one of the following:
.PP
.EX
.ta \w'\fLdelete array[expression]'u
if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]\fP
while(\fI expression \fP)\fI statement\fP
for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI statement\fP
for(\fI var \fPin\fI array \fP)\fI statement\fP
do\fI statement \fPwhile(\fI expression \fP)
break
continue
{\fR [\fP\fI statement ... \fP\fR] \fP}
\fIexpression\fP	#\fR commonly\fP\fI var = expression\fP
print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
return\fR [ \fP\fIexpression \fP\fR]\fP
next	#\fR skip remaining patterns on this input line\fP
delete\fI array\fP[\fI expression \fP]	#\fR delete an array element\fP
exit\fR [ \fP\fIexpression \fP\fR]\fP	#\fR exit immediately; status is \fP\fIexpression\fP
.EE
.DT
.PP
Statements are terminated by
semicolons, newlines or right braces.
An empty
.I expression-list
stands for
.BR $0 .
String constants are quoted \&\fL"\ "\fR,
with the usual C escapes recognized within.
Expressions take on string or numeric values as appropriate,
and are built using the operators
.B + - * / % ^
(exponentiation), and concatenation (indicated by white space).
The operators
.B
! ++ -- += -= *= /= %= ^= > >= < <= == != ?:
are also available in expressions.
Variables may be scalars, array elements
(denoted
.IB x [ i ] )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
Multiple subscripts such as
.B [i,j,k]
are permitted; the constituents are concatenated,
separated by the value of
.BR SUBSEP .
.PP
The
.B print
statement prints its arguments on the standard output
(or on a file if
.BI > file
or
.BI >> file
is present or on a pipe if
.BI | cmd
is present), separated by the current output field separator,
and terminated by the output record separator.
.I file
and
.I cmd
may be literal names or parenthesized expressions;
identical string values in different statements denote
the same open file.
The
.B printf
statement formats its expression list according to the format
(see
.IR fprintf (2)) .
The built-in function
.BI close( expr )
closes the file or pipe
.IR expr .
.PP
The mathematical functions
.BR exp ,
.BR log ,
.BR sqrt ,
.BR sin ,
.BR cos ,
and
.BR atan2
are built in.
Other built-in functions:
.TF length
.TP
.B length
the length of its argument
taken as a string,
or of
.B $0
if no argument.
.TP
.B rand
random number on (0,1)
.TP
.B srand
sets seed for
.B rand
and returns the previous seed.
.TP
.B int
truncates to an integer value
.TP
.B utf
converts its numerical argument, a character number, to a
.SM UTF
string
.TP
.BI substr( s , " m" , " n\fL)
the
.IR n -character
substring of
.I s
that begins at position
.IR m
counted from 1.
.TP
.BI index( s , " t" )
the position in
.I s
where the string
.I t
occurs, or 0 if it does not.
.TP
.BI match( s , " r" )
the position in
.I s
where the regular expression
.I r
occurs, or 0 if it does not.
The variables
.B RSTART
and
.B RLENGTH
are set to the position and length of the matched string.
.TP
.BI split( s , " a" , " fs\fL)
splits the string
.I s
into array elements
.IB a [1]\f1,
.IB a [2]\f1,
\&...,
.IB a [ n ]\f1,
and returns
.IR n .
The separation is done with the regular expression
.I fs
or with the field separator
.B FS
if
.I fs
is not given.
.TP
.BI sub( r , " t" , " s\fL)
substitutes
.I t
for the first occurrence of the regular expression
.I r
in the string
.IR s .
If
.I s
is not given,
.B $0
is used.
.TP
.B gsub
same as
.B sub
except that all occurrences of the regular expression
are replaced;
.B sub
and
.B gsub
return the number of replacements.
.TP
.BI sprintf( fmt , " expr" , " ...\fL)
the string resulting from formatting
.I expr ...
according to the
.I printf
format
.I fmt
.TP
.BI system( cmd )
executes
.I cmd
and returns its exit status
.PD
.PP
The ``function''
.B getline
sets
.B $0
to
the next input record from the current input file;
.B getline
.BI < file
sets
.B $0
to the next record from
.IR file .
.B getline
.I x
sets variable
.I x
instead.
Finally,
.IB cmd " | getline
pipes the output of
.I cmd
into
.BR getline ;
each call of
.B getline
returns the next line of output from
.IR cmd .
In all cases,
.B getline
returns 1 for a successful input,
0 for end of file, and \-1 for an error.
.PP
Patterns are arbitrary Boolean combinations
(with
.BR "! || &&" )
of regular expressions and
relational expressions.
Regular expressions are as in
.IR regexp (6).
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions, using the operators
.BR ~
and
.BR !~ .
.BI / re /
is a constant regular expression;
any string (constant or variable) may be used
as a regular expression, except in the position of an isolated regular expression
in a pattern.
.PP
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
from an occurrence of the first pattern
through an occurrence of the second.
.PP
A relational expression is one of the following:
.IP
.I expression matchop regular-expression
.br
.I expression relop expression
.br
.IB expression " in " array-name
.br
.BI ( expr , expr,... ") in " array-name
.PP
where a
.I relop
is any of the six relational operators in C,
and a
.I matchop
is either
.B ~
(matches)
or
.B !~
(does not match).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.PP
The special patterns
.B BEGIN
and
.B END
may be used to capture control before the first input line is read
and after the last.
.B BEGIN
and
.B END
do not combine with other patterns.
.PP
Variable names with special meanings:
.TF FILENAME
.TP
.B FS
regular expression used to separate fields; also settable
by option
.BI -F fs\f1.
.TP
.BR NF
number of fields in the current record
.TP
.B NR
ordinal number of the current record
.TP
.B FNR
ordinal number of the current record in the current file
.TP
.B FILENAME
the name of the current input file
.TP
.B RS
input record separator (default newline)
.TP
.B OFS
output field separator (default blank)
.TP
.B ORS
output record separator (default newline)
.TP
.B OFMT
output format for numbers (default
.BR "%.6g" )
.TP
.B SUBSEP
separates multiple subscripts (default 034)
.TP
.B ARGC
argument count, assignable
.TP
.B ARGV
argument array, assignable;
non-null members are taken as file names
.TP
.B ENVIRON
array of environment variables; subscripts are names.
.PD
.PP
Functions may be defined (at the position of a pattern-action statement) thus:
.IP
.L
function foo(a, b, c) { ...; return x }
.PP
Parameters are passed by value if scalar and by reference if array name;
functions may be called recursively.
Parameters are local to the function; all other variables are global.
Thus local variables may be created by providing excess parameters in
the function definition.
.SH EXAMPLES
.TP
.L
length > 72
Print lines longer than 72 characters.
.TP
.L
{ print $2, $1 }
Print first two fields in opposite order.
.PP
.EX
BEGIN	{ FS = ",[ \et]*|[ \et]+" }
	{ print $2, $1 }
.EE
.ns
.IP
Same, with input fields separated by comma and/or blanks and tabs.
.PP
.EX
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.EE
.ns
.IP
Add up first column, print sum and average.
.TP
.L
/start/, /stop/
Print all lines between start/stop pairs.
.PP
.EX
BEGIN	{	# Simulate echo(1)
	for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
	printf "\en"
	exit }
.EE
.SH SOURCE
.B /sys/src/cmd/awk
.SH SEE ALSO
.IR sed (1),
.IR regexp (6),
.br
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.I
The AWK Programming Language,
Addison-Wesley, 1988.
.SH BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate
\&\fL""\fP to it.
.br
The scope rules for variables in functions are a botch;
the syntax is worse.
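.PP
The two conversion idioms described above can be sketched as follows. This is only an illustration, assuming a POSIX-style shell with an awk on the path; the wrapper name coerce_demo and the sample values are hypothetical and not part of awk.

```shell
# Hypothetical wrapper; coerce_demo is not part of awk.
coerce_demo() {
	awk 'BEGIN {
		x = "10"; y = "9"            # both variables hold strings
		asstr = (x < y)              # compared as strings: "10" sorts before "9"
		asnum = (x+0 < y+0)          # adding 0 forces a numeric comparison
		n = 10; m = 9                # both variables hold numbers
		forced = ((n "") < (m ""))   # concatenating "" forces a string comparison
		print asstr, asnum, forced
	}'
}
coerce_demo
```

Under these assumptions the call prints 1 0 1: the string comparisons succeed because the character 1 sorts before 9, while the numeric comparison of 10 and 9 fails.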