1: .\" Copyright (c) 1990 Regents of the University of California.
2: .\" All rights reserved. The Berkeley software License Agreement
3: .\" specifies the terms and conditions for redistribution.
4: .\"
5: .\" @(#)awk.1 6.4 (Berkeley) 7/24/90
6: .\"
7: .Dd July 24, 1990
8: .Dt AWK 1
9: .Os ATT 7
10: .Sh NAME
11: .Nm awk
12: .Nd pattern scanning and processing language
13: .Sh SYNOPSIS
14: .Nm awk
15: .Oo
16: .Op Fl \&F Ar \&c
17: .Oo
18: .Op Fl f Ar prog_file
19: .Op Ar prog
20: .Ar
21: .Sh DESCRIPTION
22: .Nm Awk
23: scans each input
24: .Ar file
25: for lines that match any of a set of patterns specified in
26: .Ar prog .
27: With each pattern in
28: .Ar prog
29: there can be an associated action that will be performed
30: when a line of a
31: .Ar file
32: matches the pattern.
33: The set of patterns may appear literally as
34: .Ar prog
35: or in a file
36: specified as
37: .Fl f
38: .Ar prog_file .
39: .Pp
40: .Tw Ds
41: .Tp Cx Fl F
42: .Ar c
43: .Cx
44: Specify a field separator of
45: .Ar c .
46: .Tp Fl f
47: Use
48: .Ar prog_file
49: as an input
50: .Ar prog
51: (an awk script).
52: .Tp
53: .Pp
54: Files are read in order;
55: if there are no files, the standard input is read.
56: The file name
57: .Sq Fl
58: means the standard input.
59: Each line is matched against the
60: pattern portion of every pattern-action statement;
61: the associated action is performed for each matched pattern.
62: .Pp
63: An input line is made up of fields separated by white space.
64: (This default can be changed by using
65: .Li FS ,
66: .Em vide infra . )
67: The fields are denoted $1, $2, ... ;
68: $0 refers to the entire line.
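.Pp
As a minimal sketch of field access, the following splits one line on white space and prints the second field, then the whole line. The sample input is invented for illustration, and a POSIX shell with some awk on the PATH is assumed.

```shell
# Sample input is made up for illustration.
# $2 is the second white-space separated field; $0 is the entire line.
fields_out=$(printf 'alpha beta gamma\n' | awk '{ print $2; print $0 }')
printf '%s\n' "$fields_out"
```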
69: .Pp
70: A pattern-action statement has the form
71: .Pp
72: .Dl pattern {action}
73: .Pp
74: A missing { action } means print the line;
75: a missing pattern always matches.
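.Pp
The two defaults can be seen side by side in the sketch below. The fruit names are invented sample data; the first program has a pattern and no action, the second an action and no pattern.

```shell
# Pattern with no action: every line matching /an/ is printed as is.
pattern_only=$(printf 'apple\nbanana\ncherry\n' | awk '/an/')

# Action with no pattern: the action runs for every input line.
action_only=$(printf 'apple\nbanana\ncherry\n' | awk '{ print "line:", $0 }')

printf '%s\n' "$pattern_only" "$action_only"
```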
76: .Pp
77: An action is a sequence of statements.
78: A statement can be one of the following:
79: .Pp
80: .Ds I
81: if ( conditional ) statement [ else statement ]
82: while ( conditional ) statement
83: for ( expression ; conditional ; expression ) statement
84: break
85: continue
86: { [ statement ] ... }
87: variable = expression
88: print [ expression-list ] [ >expression ]
89: printf format [, expression-list ] [ >expression ]
90: next # skip remaining patterns on this input line
91: exit # skip the rest of the input
92: .De
93: .Pp
94: Statements are terminated by
95: semicolons, newlines or right braces.
96: An empty expression-list stands for the whole line.
97: Expressions take on string or numeric values as appropriate,
98: and are built using the operators
99: +, \-, *, /, %, and concatenation (indicated by a blank).
100: The C operators ++, \-\-, +=, \-=, *=, /=, and %=
101: are also available in expressions.
102: Variables may be scalars, array elements
103: (denoted
104: .Cx x
105: .Op i
106: .Cx )
107: .Cx
108: or fields.
109: Variables are initialized to the null string.
110: Array subscripts may be any string,
111: not necessarily numeric;
112: this allows for a form of associative memory.
113: String constants are quoted "...".
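.Pp
A short sketch of string subscripts and the assignment operators, assuming invented two-column input (a colour name and a number). The array name count and the variable total are illustrative only.

```shell
# count[] is subscripted by the string in the first field;
# total accumulates the second field with the += operator.
# Both start out as the null string, which acts as 0 in arithmetic.
counts_out=$(printf 'red 1\nblue 2\nred 3\nred 4\n' |
    awk '{ count[$1]++; total += $2 }
         END { print count["red"], count["blue"], total }')
printf '%s\n' "$counts_out"
```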
114: .Pp
115: The
116: .Ic print
117: statement prints its arguments on the standard output
118: (or on a file if
119: .Ar \&>file
120: is present), separated by the current output field separator,
121: and terminated by the output record separator.
122: The
123: .Ic printf
124: statement formats its expression list according to the format
125: (see
126: .Xr printf 3 ) .
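.Pp
The difference between the two statements can be sketched as follows, with an invented input line. print joins its arguments with the output field separator, while printf writes exactly what the format says, including the newline.

```shell
# print: arguments separated by the output field separator (a blank by default).
print_out=$(printf 'widget 3\n' | awk '{ print $1, $2 }')

# printf: output is controlled entirely by the format string.
printf_out=$(printf 'widget 3\n' | awk '{ printf "%s=%03d\n", $1, $2 }')

printf '%s\n' "$print_out" "$printf_out"
```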
127: .Pp
128: The built-in function
129: .Ic length
130: returns the length of its argument
131: taken as a string,
132: or of the whole line if no argument is given.
133: There are also built-in functions
134: .Ic exp ,
135: .Ic log ,
136: .Ic sqrt
137: and
138: .Ic int .
139: The last truncates its argument to an integer.
140: The function
141: .Fn substr s m n
142: returns the
143: .Cx Ar n
144: .Cx \-
145: .Cx character
146: .Cx
147: substring of
148: .Ar s
149: that begins at position
150: .Ar m .
151: The
152: .Fn sprintf fmt expr expr \&...
153: function
154: formats the expressions
155: according to the
156: .Xr printf 3
157: format given by
158: .Ar fmt
159: and returns the resulting string.
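.Pp
The built-in functions described above might be exercised like this. The input line is invented, and the output was reasoned through against a modern awk, so the 1990 implementation could differ in detail.

```shell
# length with no argument measures the whole line; substr counts from 1.
funcs_out=$(printf 'hello world\n' |
    awk '{ print length, length($1), substr($0, 7, 5), int(3.9), sprintf("%03d", 7) }')
printf '%s\n' "$funcs_out"
```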
160: .Pp
161: Patterns are arbitrary Boolean combinations
162: (!, \(or\(or, &&, and parentheses) of
163: regular expressions and
164: relational expressions.
165: Regular expressions must be surrounded
166: by slashes and are as in
167: .Xr egrep 1 .
168: Isolated regular expressions
169: in a pattern apply to the entire line.
170: Regular expressions may also occur in
171: relational expressions.
172: .Pp
173: A pattern may consist of two patterns separated by a comma;
174: in this case, the action is performed for all lines
175: between an occurrence of the first pattern
176: and the next occurrence of the second.
177: .Pp
178: A relational expression is one of the following:
179: .Pp
180: .Ds I
181: expression matchop regular-expression
182: expression relop expression
183: .De
184: .Pp
185: where a relop is any of the six relational operators in C,
186: and a matchop is either ~ (for contains)
187: or !~ (for does not contain).
188: A conditional is an arithmetic expression,
189: a relational expression,
190: or a Boolean combination
191: of these.
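.Pp
As an illustration of relational expressions used as patterns, the sketch below filters invented name and age records. The first program combines a relop with a negated match; the second uses a match on a single field.

```shell
# Sample records are made up: a name and an age per line.
records='alice 30
bob 17
carol 45'

# Age at least 18 AND name not starting with "c".
adults=$(printf '%s\n' "$records" | awk '$2 >= 18 && $1 !~ /^c/ { print $1 }')

# Name containing the letter "o".
has_o=$(printf '%s\n' "$records" | awk '$1 ~ /o/ { print $1 }')

printf '%s\n' "$adults" "$has_o"
```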
192: .Pp
193: The special patterns
194: .Li BEGIN
195: and
196: .Li END
197: may be used to capture control before the first input line is read
198: and after the last.
199: .Li BEGIN
200: must be the first pattern,
201: .Li END
202: the last.
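.Pp
A minimal sketch of the two special patterns, with BEGIN placed first and END last as required above. The counter n and the three input lines are illustrative.

```shell
# BEGIN runs before any input is read; END runs after the last line.
begin_end_out=$(printf 'a\nb\nc\n' |
    awk 'BEGIN { print "start" }
         { n++ }
         END { print "lines:", n }')
printf '%s\n' "$begin_end_out"
```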
203: .Pp
204: A single character
205: .Ar c
206: may be used to separate the fields by starting
207: the program with
208: .Pp
209: .Dl BEGIN { FS = "c" }
210: .Pp
211: or by using the
212: .Cx Fl F
213: .Ar c
214: .Cx
215: option.
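.Pp
Both ways of setting the separator are sketched below on an invented colon-separated line.

```shell
# Using the -F option on the command line.
by_flag=$(printf 'root:x:0\n' | awk -F: '{ print $1 }')

# Using an assignment to FS in a BEGIN action.
by_begin=$(printf 'root:x:0\n' | awk 'BEGIN { FS = ":" } { print $3 }')

printf '%s\n' "$by_flag" "$by_begin"
```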
216: .Pp
217: Other variable names with special meanings
218: include
219: .Dp Li NF
220: the number of fields in the current record;
221: .Dp Li NR
222: the ordinal number of the current record;
223: .Dp Li FILENAME
224: the name of the current input file;
225: .Dp Li OFS
226: the output field separator (default blank);
227: .Dp Li ORS
228: the output record separator (default newline);
229: .Dp Li OFMT
230: the output format for numbers (default "%.6g").
231: .Dp
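.Pp
Some of these variables can be seen at work in the following sketch, which assumes two invented input lines and sets the output field separator to a hyphen.

```shell
# NR is the record number, NF the field count of the current record;
# OFS is placed between the arguments of print.
vars_out=$(printf 'a b c\nd e\n' |
    awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }')
printf '%s\n' "$vars_out"
```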
232: .Pp
233: .Sh EXAMPLES
234: .Pp
235: Print lines longer than 72 characters:
236: .Pp
237: .Dl length > 72
238: .Pp
239: Print first two fields in opposite order:
240: .Pp
241: .Dl { print $2, $1 }
242: .Pp
243: Add up first column, print sum and average:
244: .Pp
245: .Ds I
246: { s += $1 }
247: END { print "sum is", s, " average is", s/NR }
248: .De
249: .Pp
250: Print fields in reverse order:
251: .Pp
252: .Dl { for (i = NF; i > 0; \-\-i) print $i }
253: .Pp
254: Print all lines between start/stop pairs:
255: .Pp
256: .Dl /start/, /stop/
257: .Pp
258: Print all lines whose first field differs from that of the previous line:
259: .Pp
260: .Dl $1 != prev { print; prev = $1 }
261: .Sh SEE ALSO
262: .Xr lex 1 ,
263: .Xr sed 1
264: .Pp
265: A. V. Aho, B. W. Kernighan, P. J. Weinberger,
266: .Em Awk \- a pattern scanning and processing language
267: .Sh HISTORY
268: .Nm Awk
269: appeared in Version 7 AT&T UNIX.
270: A much improved version of
271: .Nm awk ,
272: true to the book, appeared in the AT&T Toolchest in the late 1980s.
273: The version of
274: .Nm awk
275: this manual page describes
276: is a derivative of the original and not the Toolchest version.
277: .Sh BUGS
278: There are no explicit conversions between numbers and strings.
279: To force an expression to be treated as a number, add 0 to it;
280: to force it to be treated as a string, concatenate "" (an empty
281: string) to it.
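.Pp
The two idioms can be sketched as follows on the invented input 10 and 9. Compared as numbers, 10 is greater than 9; compared as strings, "10" sorts before "9" because the character 1 precedes 9. The result was reasoned through against a modern awk.

```shell
# Adding 0 forces numeric comparison; concatenating "" forces string comparison.
conv_out=$(printf '10 9\n' |
    awk '{ if ($1 + 0 > $2 + 0) print "numeric: greater"
           if (($1 "") < ($2 "")) print "string: less" }')
printf '%s\n' "$conv_out"
```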