.TH SORT 1
.SH NAME
sort \- sort and/or merge files
.SH SYNOPSIS
.B sort
[
.BI -cmuMbdf\&inrwt x
]
[
.BI + pos1
[
.BI - pos2
] ...
] ...
[
.B -k
.I pos1
[
.I ,pos2
]
] ...
[
.B -o
.I output
]
[
.B -T
.I dir
\&...
]
[
.I option
\&...
]
[
.I file
\&...
]
.SH DESCRIPTION
.I Sort\^
sorts
lines of all the
.I files
together and writes the result on
the standard output.
If no input files are named, the standard input is sorted.
.PP
The default sort key is an entire line.
Default ordering is
lexicographic by runes.
The ordering is affected globally by the following options,
one or more of which may appear.
.TP
.B -M
Compare as months.
The first three
non-white space characters
of the field
are folded
to upper case
and compared
so that
.L JAN
precedes
.LR FEB ,
etc.
Invalid fields
compare low to
.LR JAN .
.TP
.B -b
Ignore leading white space (spaces and tabs) in field comparisons.
.TP
.B -d
`Phone directory' order:
only letters,
accented letters,
digits and white space
are significant in comparisons.
.TP
.B -f
Fold lower case
letters onto upper case.
Accented characters are folded to their
non-accented upper case form.
.TP
.B -i
Ignore characters outside the
.SM ASCII
range 040-0176
in non-numeric comparisons.
.TP
.B -w
Like
.BR -i ,
but ignore only tabs and spaces.
.TP
.B -n
An initial numeric string,
consisting of optional white space,
optional plus or minus sign,
and zero or more digits with optional decimal point,
is sorted by arithmetic value.
.TP
.B -g
Numbers, like
.B -n
but with optional
.BR e -style
exponents, are sorted by value.
.TP
.B -r
Reverse the sense of comparisons.
.TP
.BI -t x\^
`Tab character' separating fields is
.IR x .
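.PP
As an illustration only, three of the ordering options above can be read
as functions that turn a field into a comparison key.
The Python sketch below is not taken from the sort source;
the names month_key, numeric_key and fold_key are hypothetical.
It assumes that leading white space is skipped before the month
abbreviation is taken, that a numeric string with no digits counts as zero,
and it approximates folding with plain upper-casing
(accents are not stripped).
.EX
```python
import re

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# Optional sign, then digits with an optional decimal point.
NUMBER = re.compile("[+-]?[0-9]*[.]?[0-9]*")


def month_key(field):
    """-M: rank by the first three characters, folded to upper case."""
    abbrev = field.lstrip()[:3].upper()   # assumption: skip leading white space
    # Invalid fields get rank 0, so they compare low to JAN (rank 1).
    return MONTHS.index(abbrev) + 1 if abbrev in MONTHS else 0


def numeric_key(field):
    """-n: arithmetic value of the initial numeric string."""
    text = NUMBER.match(field.lstrip()).group(0)
    try:
        return float(text)
    except ValueError:
        return 0.0                        # assumption: no digits means zero


def fold_key(field):
    """-f: fold lower case onto upper case (accent folding omitted)."""
    return field.upper()


if __name__ == "__main__":
    print(sorted(["mar 3", "Jan 9", "??", "feb 1"], key=month_key))
    print(sorted(["10", "9", " -2.5", "x"], key=numeric_key))
```
.EE
.PP
Passing one of these functions as the key to Python's sorted
mimics applying the corresponding option globally.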
.PP
The notation
.BI + "pos1\| " - pos2\^
restricts a sort key to a field beginning at
.I pos1\^
and ending just before
.IR pos2 .
.I Pos1\^
and
.I pos2\^
each have the form
.IB m . n\f1,
optionally followed by one or more of the flags
.BR Mbdfginr ,
where
.I m\^
tells a number of fields to skip from the beginning of the line and
.I n\^
tells a number of characters to skip further.
If any flags are present they override all the global
ordering options for this key.
A missing
.BI \&. n\^
means
.BR \&.0 ;
a missing
.BI - pos2\^
means the end of the line.
Under the
.BI -t x\^
option, fields are strings separated by
.IR x ;
otherwise fields are
non-empty strings separated by white space.
White space before a field
is part of the field, except under option
.BR -b .
A
.B b
flag may be attached independently to
.I pos1
and
.IR pos2 .
.PP
The notation
.B -k
.IR pos1 [, pos2 ]
is how POSIX
.I sort
defines fields:
.I pos1
and
.I pos2
have the same format but different meanings.
The value of
.I m\^
is origin 1 instead of origin 0
and a missing
.BI \&. n\^
in
.I pos2
is the end of the field.
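.PP
The two notations can be related mechanically.
The Python sketch below applies the conversion rule given in the POSIX
rationale for the obsolescent plus-position form;
that this implementation agrees with POSIX in every corner case is an
assumption, not something stated here.
The name old_to_k is hypothetical, and flags attached to the positions
are not handled.
.EX
```python
def old_to_k(pos1, pos2=None):
    """Translate '+pos1 -pos2' (given without the + and -) to '-k' form."""

    def split(pos):
        m, _, n = pos.partition(".")
        return int(m), int(n or 0)        # a missing .n means .0

    w, x = split(pos1)
    # Skipping w fields and x characters starts at field w+1, character x+1.
    key = f"{w + 1}.{x + 1}"

    if pos2 is not None:
        y, z = split(pos2)
        if z == 0:
            if y == 0:
                raise ValueError("key would be empty")
            # Ending just before field y+1 means ending at the end of field y.
            key += f",{y}"
        else:
            # -k end positions are inclusive, so z skipped characters of
            # field y+1 end the key at its character z.
            key += f",{y + 1}.{z}"
    return "-k " + key


if __name__ == "__main__":
    print(old_to_k("1"))            # key from the second field to end of line
    print(old_to_k("1", "2"))       # key is exactly the second field
    print(old_to_k("0.2", "0.5"))   # characters 3 to 5 of the first field
```
.EE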
.PP
When there are multiple sort keys, later keys
are compared only after all earlier keys
compare equal.
Lines that otherwise compare equal are ordered
with all bytes significant.
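.PP
The multi-key rule can be sketched in Python as a comparison that walks
the keys in order and falls back to the raw bytes of the line.
This is an illustration under stated assumptions, not the actual comparison routine:
compare_lines and sort_lines are hypothetical names, each key is modelled
as a function of the whole line, and lines are assumed to be UTF-8 text.
.EX
```python
from functools import cmp_to_key


def compare_lines(keys):
    """Build a comparison function from an ordered list of key extractors."""

    def cmp(a, b):
        # Later keys are consulted only when all earlier keys compare equal.
        for key in keys:
            ka, kb = key(a), key(b)
            if ka != kb:
                return -1 if ka < kb else 1
        # Every key compared equal: order with all bytes significant.
        ba, bb = a.encode("utf-8"), b.encode("utf-8")
        return (ba > bb) - (ba < bb)

    return cmp


def sort_lines(lines, keys):
    return sorted(lines, key=cmp_to_key(compare_lines(keys)))


if __name__ == "__main__":
    keys = [lambda l: l.split()[0].upper(),   # first field, folded
            lambda l: int(l.split()[1])]      # second field, numeric
    print(sort_lines(["b 2", "a 10", "B 1", "a 9"], keys))
```
.EE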
.PP
These option arguments are also understood:
.TP \w'\fL-z\fIrecsize\fLXX'u
.B -c
Check that the single input file is sorted according to the ordering rules;
give no output unless the file is out of sort.
.TP
.B -m
Merge; assume the input files are already sorted.
.TP
.B -u
Suppress all but one in each
set of equal lines.
Ignored bytes
and bytes outside keys
do not participate in
this comparison.
.TP
.B -o
The next argument is the name of an output file
to use instead of the standard output.
This file may be the same as one of the inputs.
.TP
.BI -T dir
Put temporary files in
.I dir
rather than in
.BR /tmp .
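.PP
The checks behind the c and u options can be sketched in Python as follows.
The names is_sorted and unique are hypothetical, the ordering rules are
modelled as a single key function,
and keeping the first line of each equal set is an assumption of the
sketch: the text above does not say which representative survives.
.EX
```python
def is_sorted(lines, key):
    """-c: true when no adjacent pair of lines is out of order."""
    return all(key(a) <= key(b) for a, b in zip(lines, lines[1:]))


def unique(sorted_lines, key):
    """-u: keep one line from each run of lines whose keys compare equal."""
    kept = []
    for line in sorted_lines:
        # Only the key takes part in the comparison; bytes outside it do not.
        if not kept or key(kept[-1]) != key(line):
            kept.append(line)             # assumption: the first one is kept
    return kept


if __name__ == "__main__":
    words = ["Apple", "apple", "Bob", "bob", "cat"]
    print(is_sorted(words, str.upper))
    print(unique(words, str.upper))
```
.EE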
.ne 4
.SH EXAMPLES
.TP
.L sort -u +0f +0 list
Print in alphabetical order all the unique spellings
in a list of words
where capitalized words differ from uncapitalized.
.TP
.L sort -t: +1 /adm/users
Print the users file
sorted by user name
(the second colon-separated field).
.TP
.L sort -umM dates
Print the first instance of each month in an already sorted file.
Options
.B -um
with just one input file make the choice of a
unique representative from a set of equal lines predictable.
.TP
.L
grep -n '^' input | sort -t: +1f +0n | sed 's/[0-9]*://'
A stable sort: input lines that compare equal will
come out in their original order.
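.PP
The last pipeline numbers each line, sorts on the folded text with the
line number as a second key, and then strips the numbers.
A Python sketch of the same three steps follows;
stable_fold_sort is a hypothetical name and folding is approximated with
plain upper-casing.
.EX
```python
def stable_fold_sort(lines):
    """Decorate with line numbers, sort, then strip the numbers again."""
    # grep -n '^' : prefix every line with its line number.
    numbered = list(enumerate(lines, start=1))
    # sort -t: +1f +0n : folded text first, line number as the tie-breaker.
    numbered.sort(key=lambda pair: (pair[1].upper(), pair[0]))
    # sed 's/[0-9]*://' : drop the line number.
    return [text for _, text in numbered]


if __name__ == "__main__":
    print(stable_fold_sort(["b", "A", "a", "B"]))
```
.EE
.PP
Python's own sort is already stable, so the explicit line-number key is
redundant there; it is kept only to mirror the pipeline.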
.SH FILES
.BI /tmp/sort. <pid>.<ordinal>
.SH SOURCE
.B /sys/src/cmd/sort.c
.SH SEE ALSO
.IR uniq (1),
.IR look (1)
.SH DIAGNOSTICS
.I Sort
comments and exits with non-null status for various trouble
conditions and for disorder discovered under option
.BR -c .
.SH BUGS
An external null character can be confused
with an internally generated end-of-field character.
The result can make a sub-field not sort
less than a longer field.
.PP
Some of the options, e.g.
.B -i
and
.BR -M ,
are hopelessly provincial.