1: .\" @(#)ae2 6.1 (Berkeley) 5/22/86
2: .\"
3: .NH
4: SPECIAL CHARACTERS
5: .PP
6: The editor
7: .UL ed
8: is the primary interface to the system
9: for many people, so
10: it is worthwhile to know
11: how to get the most out of
12: .UL ed
13: for the least effort.
14: .PP
15: The next few sections will discuss
16: shortcuts
17: and labor-saving devices.
18: Not all of these will be instantly useful
19: to any one person, of course,
20: but a few will be,
21: and the others should give you ideas to store
22: away for future use.
23: And as always,
24: until you try these things,
25: they will remain theoretical knowledge,
26: not something you have confidence in.
27: .SH
28: The List Command `l'
29: .PP
30: .UL ed
31: provides two commands for printing the contents of the lines
32: you're editing.
33: Most people are familiar with
34: .UL p ,
35: in combinations like
36: .P1
37: 1,$p
38: .P2
39: to print all the lines you're editing,
40: or
41: .P1
42: s/abc/def/p
43: .P2
44: to change
45: `abc'
46: to
47: `def'
48: on the current line and print the result.
49: Less familiar is the
50: .ul
51: list
52: command
53: .UL l
54: (the letter `\fIl\|\fR'),
55: which gives slightly more information than
56: .UL p .
57: In particular,
58: .UL l
59: makes visible characters that are normally invisible,
60: such as tabs and backspaces.
61: If you list a line that contains some of these,
62: .UL l
63: will print each tab as
64: .UL \z\(mi>
65: and each backspace as
66: .UL \z\(mi< .\(dg
67: .FS
68: \(dg These composite characters are created by overstriking a minus
69: and a > or <, so they only appear as < or > on display terminals.
70: .FE
71: This makes it much easier to correct the sort of typing mistake
72: that inserts extra spaces adjacent to tabs,
73: or inserts a backspace followed by a space.
74: .PP
75: The
76: .UL l
77: command
78: also `folds' long lines for printing:
79: any line that exceeds 72 characters is printed on multiple lines;
80: each printed line except the last is terminated by a backslash
81: .UL \*e ,
82: so you can tell it was folded.
83: This is useful for printing long lines on short terminals.
84: .PP
85: Occasionally the
86: .UL l
87: command will print, within a line, a string of digits preceded by a backslash,
88: such as \*e07 or \*e16.
89: These combinations are used to make visible characters that normally don't print,
90: like form feed or vertical tab or bell.
91: Each such combination is a single character.
92: When you see such characters, be wary:
93: they may have surprising meanings when printed on some terminals.
94: Often their presence means that your finger slipped while you were typing;
95: you almost never want them.
96: .SH
97: The Substitute Command `s'
98: .PP
99: Most of the next few sections will be taken up with a discussion
100: of the
101: substitute
102: command
103: .UL s .
104: Since this is the command for changing the contents of individual
105: lines,
106: it probably has the most complexity of any
107: .UL ed
108: command,
109: and the most potential for effective use.
110: .PP
111: As the simplest place to begin,
112: recall the meaning of a trailing
113: .UL g
114: after a substitute command.
115: With
116: .P1
117: s/this/that/
118: .P2
119: and
120: .P1
121: s/this/that/g
122: .P2
123: the
124: first
125: one replaces the
126: .ul
127: first
128: `this' on the line
129: with `that'.
130: If there is more than one `this' on the line,
131: the second form
132: with the trailing
133: .UL g
134: changes
135: .ul
136: all
137: of them.
138: .PP
139: Either form of the
140: .UL s
141: command can be followed by
142: .UL p
143: or
144: .UL l
145: to `print' or `list' (as described in the previous section)
146: the contents of the line:
147: .P1
148: s/this/that/p
149: s/this/that/l
150: s/this/that/gp
151: s/this/that/gl
152: .P2
153: are all legal, and mean slightly different things.
154: Make sure you know what the differences are.
155: .PP
156: Of course, any
157: .UL s
158: command can be preceded by one or two `line numbers'
159: to specify that the substitution is to take place
160: on a group of lines.
161: Thus
162: .P1
163: 1,$s/mispell/misspell/
164: .P2
165: changes the
166: .ul
167: first
168: occurrence of
169: `mispell' to `misspell' on every line of the file.
170: But
171: .P1
172: 1,$s/mispell/misspell/g
173: .P2
174: changes
175: .ul
176: every
177: occurrence in every line
178: (and this is more likely to be what you wanted in this
179: particular case).
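.PP
The difference between the two forms can be tried outside the editor.
The following is only a sketch in Python, not
.UL ed
itself; the sample lines are invented, and Python's
`re.sub' with `count=1' is assumed to stand in for a substitute without the trailing
.UL g .

```python
import re

# Invented sample lines; the first has two occurrences of the error.
lines = ["mispell and mispell", "no error here", "a mispell"]

# Like 1,$s/mispell/misspell/ : only the first occurrence on each line.
first_only = [re.sub("mispell", "misspell", line, count=1) for line in lines]

# Like 1,$s/mispell/misspell/g : every occurrence on each line.
every = [re.sub("mispell", "misspell", line) for line in lines]

print(first_only)
print(every)
```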
180: .PP
181: You should also notice that if you add a
182: .UL p
183: or
184: .UL l
185: to the end of any of these substitute commands,
186: only the last line that got changed will be printed,
187: not all the lines.
188: We will talk later about how to print all the lines
189: that were modified.
190: .SH
191: The Undo Command `u'
192: .PP
193: Occasionally you will make a substitution in a line,
194: only to realize too late that it was a ghastly mistake.
195: The `undo' command
196: .UL u
197: lets you `undo' the last substitution:
198: the last line that was substituted can be restored to
199: its previous state by typing the command
200: .P1
201: u
202: .P2
203: .SH
204: The Metacharacter `\*.'
205: .PP
206: As you have undoubtedly noticed
207: when you use
208: .UL ed ,
209: certain characters have unexpected meanings
210: when they occur in the left side of a substitute command,
211: or in a search for a particular line.
212: In the next several sections, we will talk about
213: these special characters,
214: which are often called `metacharacters'.
215: .PP
216: The first one is the period `\*.'.
217: On the left side of a substitute command,
218: or in a search with `/.../',
219: `\*.' stands for
220: .ul
221: any
222: single character.
223: Thus the search
224: .P1
225: /x\*.y/
226: .P2
227: finds any line where `x' and `y' occur separated by
228: a single character, as in
229: .P1
230: x+y
231: x\-y
232: x\*(BLy
233: x\*.y
234: .P2
235: and so on.
236: (We will use \*(BL to stand for a space whenever we need to
237: make it visible.)
238: .PP
239: Since `\*.' matches a single character,
240: that gives you a way to deal with funny characters
241: printed by
242: .UL l .
243: Suppose you have a line that, when printed with the
244: .UL l
245: command, appears as
246: .P1
247: .... th\*e07is ....
248: .P2
249: and you want to get rid of the
250: \*e07
251: (which represents the bell character, by the way).
252: .PP
253: The most obvious solution is to try
254: .P1
255: s/\*e07//
256: .P2
257: but this will fail. (Try it.)
258: The brute force solution, which most people would now take,
259: is to re-type the entire line.
260: This is guaranteed to work, and is actually quite a reasonable tactic
261: if the line in question isn't too big,
262: but for a very long line,
263: re-typing is a bore.
264: This is where the metacharacter `\*.' comes in handy.
265: Since `\*e07' really represents a single character,
266: if we say
267: .P1
268: s/th\*.is/this/
269: .P2
270: the job is done.
271: The `\*.' matches the mysterious character between the `h' and the `i',
272: .ul
273: whatever it is.
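.PP
The same idea can be checked with a short Python sketch.
This is an illustration under assumptions, not
.UL ed :
the sample text is invented, and the bell is written as the Python escape for character 7.

```python
import re

# Invented sample line containing a bell character (ASCII 7) inside "this".
line = "say th\x07is out loud"

# The dot matches the unprintable character, whatever it is.
fixed = re.sub("th.is", "this", line, count=1)

print(repr(line))
print(repr(fixed))
```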
274: .PP
275: Bear in mind that since `\*.' matches any single character,
276: the command
277: .P1
278: s/\*./,/
279: .P2
280: converts the first character on a line into a `,',
281: which very often is not what you intended.
282: .PP
283: As is true of many characters in
284: .UL ed ,
285: the `\*.' has several meanings, depending
286: on its context.
287: This line shows all three:
288: .P1
289: \&\*.s/\*./\*./
290: .P2
291: The first `\*.' is a line number,
292: the number of
293: the line we are editing,
294: which is called `line dot'.
295: (We will discuss line dot more in Section 3.)
296: The second `\*.' is a metacharacter
297: that matches any single character on that line.
298: The third `\*.' is the only one that really is
299: an honest literal period.
300: On the
301: .ul
302: right
303: side of a substitution, `\*.'
304: is not special.
305: If you apply this command to the line
306: .P1
307: Now is the time\*.
308: .P2
309: the result will
310: be
311: .P1
312: \&\*.ow is the time\*.
313: .P2
314: which is probably not what you intended.
315: .SH
316: The Backslash `\*e'
317: .PP
318: Since a period means `any character',
319: the question naturally arises of what to do
320: when you really want a period.
321: For example, how do you convert the line
322: .P1
323: Now is the time\*.
324: .P2
325: into
326: .P1
327: Now is the time?
328: .P2
329: The backslash `\*e' does the job.
330: A backslash turns off any special meaning that the next character
331: might have; in particular,
332: `\*e\*.' converts the `\*.' from a `match anything'
333: into a period, so
334: you can use it to replace
335: the period in
336: .P1
337: Now is the time\*.
338: .P2
339: like this:
340: .P1
341: s/\*e\*./?/
342: .P2
343: The pair of characters `\*e\*.' is considered by
344: .UL ed
345: to be a single real period.
346: .PP
347: The backslash can also be used when searching for lines
348: that contain a special character.
349: Suppose you are looking for a line that contains
350: .P1
351: \&\*.PP
352: .P2
353: The search
354: .P1
355: /\*.PP/
356: .P2
357: isn't adequate, for it will find
358: a line like
359: .P1
360: THE APPLICATION OF ...
361: .P2
362: because the `\*.' matches the letter `A'.
363: But if you say
364: .P1
365: /\*e\*.PP/
366: .P2
367: you will find only lines that contain `\*.PP'.
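.PP
A small Python sketch shows the effect of the backslash on a search.
The list of lines is invented for illustration, and Python's `re.search'
is assumed to behave like the context search for these simple patterns.

```python
import re

# Invented sample lines.
lines = ["THE APPLICATION OF ...", ".PP", "plain text"]

# Unescaped dot: matches any character before "PP", so "APP" qualifies.
unescaped = [text for text in lines if re.search(r".PP", text)]

# Escaped dot: only a real period followed by "PP" qualifies.
escaped = [text for text in lines if re.search(r"\.PP", text)]

print(unescaped)
print(escaped)
```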
368: .PP
369: The backslash can also be used to turn off special meanings for
370: characters other than `\*.'.
371: For example, consider finding a line that contains a backslash.
372: The search
373: .P1
374: /\*e/
375: .P2
376: won't work,
377: because the `\*e' isn't a literal `\*e', but instead means that the second `/'
378: no longer \%delimits the search.
379: But by preceding a backslash with another one,
380: you can search for a literal backslash.
381: Thus
382: .P1
383: /\*e\*e/
384: .P2
385: does work.
386: Similarly, you can search for a forward slash `/' with
387: .P1
388: /\*e//
389: .P2
390: The backslash turns off the meaning of the immediately following `/' so that
391: it doesn't terminate the /.../ construction prematurely.
392: .PP
393: As an exercise, before reading further, find two substitute commands each of which will
394: convert the line
395: .P1
396: \*ex\*e\*.\*ey
397: .P2
398: into the line
399: .P1
400: \*ex\*ey
401: .P2
402: .PP
403: Here are several solutions;
404: verify that each works as advertised.
405: .P1
406: s/\*e\*e\*e\*.//
407: s/x\*.\*./x/
408: s/\*.\*.y/y/
409: .P2
410: .PP
411: A couple of miscellaneous notes about
412: backslashes and special characters.
413: First, you can use any character to delimit the pieces
414: of an
415: .UL s
416: command: there is nothing sacred about slashes.
417: (But you must use slashes for context searching.)
418: For instance, in a line that contains a lot of slashes already, like
419: .P1
420: //exec //sys.fort.go // etc...
421: .P2
422: you could use a colon as the delimiter;
423: to delete all the slashes, type
424: .P1
425: s:/::g
426: .P2
427: .PP
428: Second, if # and @ are your character erase and line kill characters,
429: you have to type \*e# and \*e@;
430: this is true whether you're talking to
431: .UL ed
432: or any other program.
433: .PP
434: When you are adding text with
435: .UL a
436: or
437: .UL i
438: or
439: .UL c ,
440: backslash is not special, and you should only put in
441: one backslash for each one you really want.
442: .SH
443: The Dollar Sign `$'
444: .PP
445: The next metacharacter, the `$', stands for `the end of the line'.
446: As its most obvious use, suppose you have the line
447: .P1
448: Now is the
449: .P2
450: and you wish to add the word `time' to the end.
451: Use the $ like this:
452: .P1
453: s/$/\*(BLtime/
454: .P2
455: to get
456: .P1
457: Now is the time
458: .P2
459: Notice that a space is needed before `time' in
460: the substitute command,
461: or you will get
462: .P1
463: Now is thetime
464: .P2
465: .PP
466: As another example, replace the second comma in
467: the following line with a period without altering the first:
468: .P1
469: Now is the time, for all good men,
470: .P2
471: The command needed is
472: .P1
473: s/,$/\*./
474: .P2
475: The $ sign here provides context to make specific which comma we mean.
476: Without it, of course, the
477: .UL s
478: command would operate on the first comma to produce
479: .P1
480: Now is the time\*. for all good men,
481: .P2
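.PP
The effect of the `$' context can be reproduced in a Python sketch.
Python is used here only as a stand-in for
.UL ed ,
and `count=1' is assumed to mimic a substitute without a trailing
.UL g .

```python
import re

line = "Now is the time, for all good men,"

# With $ the pattern can only match the comma at the end of the line.
at_end = re.sub(r",$", ".", line, count=1)

# Without $ the first comma on the line is the one that changes.
first = re.sub(r",", ".", line, count=1)

print(at_end)
print(first)
```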
482: .PP
483: As another example, to convert
484: .P1
485: Now is the time\*.
486: .P2
487: into
488: .P1
489: Now is the time?
490: .P2
491: as we did earlier, we can use
492: .P1
493: s/\*.$/?/
494: .P2
495: .PP
496: Like `\*.', the `$'
497: has multiple meanings depending on context.
498: In the line
499: .P1
500: $s/$/$/
501: .P2
502: the first `$' refers to the
503: last line of the file,
504: the second refers to the end of that line,
505: and the third is a literal dollar sign,
506: to be added to that line.
507: .SH
508: The Circumflex `^'
509: .PP
510: The circumflex (or hat or caret)
511: `^' stands for the beginning of the line.
512: For example, suppose you are looking for a line that begins
513: with `the'.
514: If you simply say
515: .P1
516: /the/
517: .P2
518: you will in all likelihood find several lines that contain `the' in the middle before
519: arriving at the one you want.
520: But with
521: .P1
522: /^the/
523: .P2
524: you narrow the context, and thus arrive at the desired one
525: more easily.
526: .PP
527: The other use of `^' is of course to enable you to insert
528: something at the beginning of a line:
529: .P1
530: s/^/\*(BL/
531: .P2
532: places a space at the beginning of the current line.
533: .PP
534: Metacharacters can be combined. To search for a
535: line that contains
536: .ul
537: only
538: the characters
539: .P1
540: \&\*.PP
541: .P2
542: you can use the command
543: .P1
544: /^\*e\*.PP$/
545: .P2
546: .SH
547: The Star `*'
548: .PP
549: Suppose you have a line that looks like this:
550: .P1
551: \fItext \fR x y \fI text \fR
552: .P2
553: where
554: .ul
555: text
556: stands
557: for lots of text,
558: and there is some indeterminate number of spaces between the
559: .UL x
560: and the
561: .UL y .
562: Suppose the job is to replace all the spaces between
563: .UL x
564: and
565: .UL y
566: by a single space.
567: The line is too long to retype, and there are too many spaces
568: to count.
569: What now?
570: .PP
571: This is where the metacharacter `*'
572: comes in handy.
573: A character followed by a star
574: stands for as many consecutive occurrences of that
575: character as possible.
576: To refer to all the spaces at once, say
577: .P1
578: s/x\*(BL*y/x\*(BLy/
579: .P2
580: The construction
581: `\*(BL*'
582: means
583: `as many spaces as possible'.
584: Thus `x\*(BL*y' means `an x, as many spaces as possible, then a y'.
585: .PP
586: The star can be used with any character, not just space.
587: If the original example was instead
588: .P1
589: \fItext \fR x--------y \fI text \fR
590: .P2
591: then all `\-' signs can be replaced by a single space
592: with the command
593: .P1
594: s/x-*y/x\*(BLy/
595: .P2
596: .PP
597: Finally, suppose that the line was
598: .P1
599: \fItext \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fR
600: .P2
601: Can you see what trap lies in wait for the unwary?
602: If you blindly type
603: .P1
604: s/x\*.*y/x\*(BLy/
605: .P2
606: what will happen?
607: The answer, naturally, is that it depends.
608: If there are no other x's or y's on the line,
609: then everything works, but it's blind luck, not good management.
610: Remember that `\*.' matches
611: .ul
612: any
613: single character?
614: Then `\*.*' matches as many single characters as possible,
615: and unless you're careful, it can eat up a lot more of the line
616: than you expected.
617: If the line was, for example, like this:
618: .P1
619: \fItext \fRx\fI text \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fRy\fI text \fR
620: .P2
621: then saying
622: .P1
623: s/x\*.*y/x\*(BLy/
624: .P2
625: will take everything from the
626: .ul
627: first
628: `x' to the
629: .ul
630: last
631: `y',
632: which, in this example, is undoubtedly more than you wanted.
633: .PP
634: The solution, of course, is to turn off the special meaning of
635: `\*.' with
636: `\*e\*.':
637: .P1
638: s/x\*e\*.*y/x\*(BLy/
639: .P2
640: Now everything works, for `\*e\*.*' means `as many
641: .ul
642: periods
643: as possible'.
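.PP
The trap and its cure can be seen side by side in a Python sketch.
The sample line is invented (its filler words avoid the letters x and y),
and Python's regular expressions are assumed to agree with
.UL ed
on these patterns.

```python
import re

# Invented line: an extra x before, and an extra y after, the run of periods.
line = "aaa x bbb x......y ccc y ddd"

# Unescaped: .* is greedy and runs from the first x to the last y.
greedy = re.sub(r"x.*y", "x y", line, count=1)

# Escaped: \.* matches only periods, so just the intended span changes.
periods = re.sub(r"x\.*y", "x y", line, count=1)

print(greedy)
print(periods)
```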
644: .PP
645: There are times when the pattern `\*.*' is exactly what you want.
646: For example, to change
647: .P1
648: Now is the time for all good men ....
649: .P2
650: into
651: .P1
652: Now is the time\*.
653: .P2
654: use `\*.*' to eat up everything from the `for' onward:
655: .P1
656: s/\*(BLfor\*.*/\*./
657: .P2
658: .PP
659: There are a couple of additional pitfalls associated with `*' that you should be aware of.
660: Most notable is the fact that `as many as possible' means
661: .ul
662: zero
663: or more.
664: The fact that zero is a legitimate possibility is
665: sometimes rather surprising.
666: For example, if our line contained
667: .P1
668: \fItext \fR xy \fI text \fR x y \fI text \fR
669: .P2
670: and we said
671: .P1
672: s/x\*(BL*y/x\*(BLy/
673: .P2
674: the
675: .ul
676: first
677: `xy' matches this pattern, for it consists of an `x',
678: zero spaces, and a `y'.
679: The result is that the substitute acts on the first `xy',
680: and does not touch the later one that actually contains some intervening spaces.
681: .PP
682: The way around this, if it matters, is to specify a pattern like
683: .P1
684: /x\*(BL\*(BL*y/
685: .P2
686: which says `an x, a space, then as many more spaces as possible, then a y',
687: in other words, one or more spaces.
688: .PP
689: The other startling behavior of `*' is again related to the fact
690: that zero is a legitimate number of occurrences of something
691: followed by a star. The command
692: .P1
693: s/x*/y/g
694: .P2
695: when applied to the line
696: .P1
697: abcdef
698: .P2
699: produces
700: .P1
701: yaybycydyeyfy
702: .P2
703: which is almost certainly not what was intended.
704: The reason for this behavior is that zero is a legal number
705: of matches,
706: and there are no x's at the beginning of the line
707: (so that gets converted into a `y'),
708: nor between the `a' and the `b'
709: (so that gets converted into a `y'), nor ...
710: and so on.
711: Make sure you really want zero matches;
712: if not, in this case write
713: .P1
714: s/xx*/y/g
715: .P2
716: `xx*' is one or more x's.
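.PP
Both pitfalls of `zero or more' can be reproduced in a Python sketch.
The first sample line is invented, and Python's handling of empty matches is assumed
to agree with
.UL ed
for these cases.

```python
import re

# Invented line: "xy" with no spaces comes before "x   y" with several.
line = "aaa xy bbb x   y ccc"

# Zero spaces is a legal match, so the first "xy" is the one changed.
zero_ok = re.sub(r"x *y", "x y", line, count=1)

# A space followed by "space star" demands at least one space.
one_plus = re.sub(r"x  *y", "x y", line, count=1)

# x* matches the empty string at every position of a line with no x's.
empty_everywhere = re.sub(r"x*", "y", "abcdef")

# xx* needs at least one x, so the line is left alone.
at_least_one = re.sub(r"xx*", "y", "abcdef")

print(zero_ok, one_plus, empty_everywhere, at_least_one, sep="\n")
```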
717: .SH
718: The Brackets `[ ]'
719: .PP
720: Suppose that you want to delete any numbers
721: that appear
722: at the beginning of all lines of a file.
723: You might first think of trying a series of commands like
724: .P1
725: 1,$s/^1*//
726: 1,$s/^2*//
727: 1,$s/^3*//
728: .P2
729: and so on,
730: but this is clearly going to take forever if the numbers are at all long.
731: Unless you want to repeat the commands over and over until
732: finally all numbers are gone,
733: you must get all the digits on one pass.
734: This is the purpose of the brackets [ and ].
735: .PP
736: The construction
737: .P1
738: [0123456789]
739: .P2
740: matches any single digit;
741: the whole thing is called a `character class'.
742: With a character class, the job is easy.
743: The pattern `[0123456789]*' matches zero or more digits (an entire number), so
744: .P1
745: 1,$s/^[0123456789]*//
746: .P2
747: deletes all digits from the beginning of all lines.
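.PP
As a rough check, the same character class can be tried in Python.
The sample lines are invented, and the list comprehension is assumed to play the part of the
`1,$' line range.

```python
import re

# Invented sample lines; the last has no leading number at all.
lines = ["12 first", "345second", "no number"]

# The class matches any one digit; the star extends it to a whole number.
stripped = [re.sub(r"^[0123456789]*", "", line) for line in lines]

print(stripped)
```

A line with no leading digits is unchanged, since the class followed by a star
also matches zero digits.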
748: .PP
749: Any characters can appear within a character class,
750: and just to confuse the issue there are essentially no special characters
751: inside the brackets;
752: even the backslash doesn't have a special meaning.
753: To search for special characters, for example, you can say
754: .P1
755: /[\*.\*e$^[]/
756: .P2
757: Within [...], the `[' is not special.
758: To get a `]' into a character class,
759: make it the first character.
760: .PP
761: It's a nuisance to have to spell out the digits,
762: so you can abbreviate them as
763: [0\-9];
764: similarly, [a\-z] stands for the lower case letters,
765: and
766: [A\-Z] for upper case.
767: .PP
768: As a final frill on character classes, you can specify a class
769: that means `none of the following characters'.
770: This is done by beginning the class with a `^':
771: .P1
772: [^0-9]
773: .P2
774: stands for `any character
775: .ul
776: except
777: a digit'.
778: Thus you might find the first line that doesn't begin with a tab or space
779: by a search like
780: .P1
781: /^[^(space)(tab)]/
782: .P2
783: .PP
784: Within a character class,
785: the circumflex has a special meaning
786: only if it occurs at the beginning.
787: Just to convince yourself, verify that
788: .P1
789: /^[^^]/
790: .P2
791: finds a line that doesn't begin with a circumflex.
792: .SH
793: The Ampersand `&'
794: .PP
795: The ampersand `&' is used primarily to save typing.
796: Suppose you have the line
797: .P1
798: Now is the time
799: .P2
800: and you want to make it
801: .P1
802: Now is the best time
803: .P2
804: Of course you can always say
805: .P1
806: s/the/the best/
807: .P2
808: but it seems silly to have to repeat the `the'.
809: The `&' is used to eliminate the repetition.
810: On the
811: .ul
812: right
813: side of a substitute, the ampersand means `whatever
814: was just matched', so you can say
815: .P1
816: s/the/& best/
817: .P2
818: and the `&' will stand for `the'.
819: Of course this isn't much of a saving if the thing
820: matched is just `the', but if it is something truly long or awful,
821: or if it is something like `.*'
822: which matches a lot of text,
823: you can save some tedious typing.
824: There is also much less chance of making a typing error
825: in the replacement text.
826: For example, to parenthesize a line,
827: regardless of its length,
828: .P1
829: s/\*.*/(&)/
830: .P2
831: .PP
832: The ampersand can occur more than once on the right side:
833: .P1
834: s/the/& best and & worst/
835: .P2
836: makes
837: .P1
838: Now is the best and the worst time
839: .P2
840: and
841: .P1
842: s/\*.*/&? &!!/
843: .P2
844: converts the original line into
845: .P1
846: Now is the time? Now is the time!!
847: .P2
848: .PP
849: To get a literal ampersand, naturally the backslash is used to turn off the special meaning:
850: .P1
851: s/ampersand/\*e&/
852: .P2
853: converts the word into the symbol.
854: Notice that `&' is not special on the left side
855: of a substitute, only on the
856: .ul
857: right
858: side.
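.PP
Other tools spell the ampersand differently.
In the Python sketch below, which is an illustration and not
.UL ed ,
the replacement text `\*eg<0>' is Python's way of saying `whatever was just matched',
and a plain `&' in a Python replacement is already literal.

```python
import re

line = "Now is the time"

# Like s/the/& best/ : \g<0> plays the role of the ampersand.
best = re.sub("the", r"\g<0> best", line, count=1)

# Like s/.*/(&)/ : parenthesize the whole line, whatever its length.
paren = re.sub(".*", r"(\g<0>)", line, count=1)

# Like s/ampersand/\&/ : Python needs no backslash for a literal &.
literal = re.sub("ampersand", "&", "ampersand", count=1)

print(best, paren, literal, sep="\n")
```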
859: .SH
860: Substituting Newlines
861: .PP
862: .UL ed
863: provides a facility for splitting a single line into two or more shorter lines by `substituting in a newline'.
864: As the simplest example, suppose a line has gotten unmanageably long
865: because of editing (or merely because it was unwisely typed).
866: If it looks like
867: .P1
868: \fItext \fR xy \fI text \fR
869: .P2
870: you can break it between the `x' and the `y' like this:
871: .P1
872: s/xy/x\*e
873: y/
874: .P2
875: This is actually a single command,
876: although it is typed on two lines.
877: Bearing in mind that `\*e' turns off special meanings,
878: it seems relatively intuitive that a `\*e' at the end of
879: a line would make the newline there
880: no longer special.
881: .PP
882: You can in fact make a single line into several lines
883: with this same mechanism.
884: As a large example, consider underlining the word `very'
885: in a long line
886: by splitting `very' onto a separate line,
887: and preceding it by the
888: .UL roff
889: or
890: .UL nroff
891: formatting command `.ul'.
892: .P1
893: \fItext \fR a very big \fI text \fR
894: .P2
895: The command
896: .P1
897: s/\*(BLvery\*(BL/\*e
898: \&.ul\*e
899: very\*e
900: /
901: .P2
902: converts the line into four shorter lines,
903: preceding the word `very' by the
904: line
905: `.ul',
906: and eliminating the spaces around the `very',
907: all at the same time.
908: .PP
909: When a newline is substituted
910: in, dot is left pointing at the last line created.
911: .PP
912: .SH
913: Joining Lines
914: .PP
915: Lines may also be joined together,
916: but this is done with the
917: .UL j
918: command
919: instead of
920: .UL s .
921: Given the lines
922: .P1
923: Now is
924: \*(BLthe time
925: .P2
926: and supposing that dot is set to the first of them,
927: then the command
928: .P1
929: j
930: .P2
931: joins them together.
932: No blanks are added,
933: which is why we carefully showed a blank
934: at the beginning of the second line.
935: .PP
936: All by itself,
937: a
938: .UL j
939: command
940: joins line dot to line dot+1,
941: but any contiguous set of lines can be joined.
942: Just specify the starting and ending line numbers.
943: For example,
944: .P1
945: 1,$jp
946: .P2
947: joins all the lines into one big one
948: and prints it.
949: (More on line numbers in Section 3.)
950: .SH
951: Rearranging a Line with \*e( ... \*e)
952: .PP
953: (This section should be skipped on first reading.)
954: Recall that `&' is a shorthand that stands for whatever
955: was matched by the left side of an
956: .UL s
957: command.
958: In much the same way you can capture separate pieces
959: of what was matched;
960: the only difference is that you have to specify
961: on the left side just what pieces you're interested in.
962: .PP
963: Suppose, for instance, that
964: you have a file of lines that consist of names in the form
965: .P1
966: Smith, A. B.
967: Jones, C.
968: .P2
969: and so on,
970: and you want the initials to precede the name, as in
971: .P1
972: A. B. Smith
973: C. Jones
974: .P2
975: It is possible to do this with a series of editing commands,
976: but it is tedious and error-prone.
977: (It is instructive to figure out how it is done, though.)
978: .PP
979: The alternative
980: is to `tag' the pieces of the pattern (in this case,
981: the last name, and the initials),
982: and then rearrange the pieces.
983: On the left side of a substitution,
984: if part of the pattern is enclosed between
985: \*e( and \*e),
986: whatever matched that part is remembered,
987: and available for use on the right side.
988: On the right side,
989: the symbol `\*e1' refers to whatever
990: matched the first \*e(...\*e) pair,
991: `\*e2' to the second \*e(...\*e),
992: and so on.
993: .PP
994: The command
995: .P1
996: 1,$s/^\*e([^,]*\*e),\*(BL*\*e(\*.*\*e)/\*e2\*(BL\*e1/
997: .P2
998: although hard to read, does the job.
999: The first \*e(...\*e) matches the last name,
1000: which is any string up to the comma;
1001: this is referred to on the right side with `\*e1'.
1002: The second \*e(...\*e) is whatever follows
1003: the comma and any spaces,
1004: and is referred to as `\*e2'.
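.PP
The rearrangement can be tried in Python as a sketch.
One difference is assumed here: Python writes the tagging parentheses without backslashes,
while `\*e1' and `\*e2' on the right side mean the same as in
.UL ed .

```python
import re

names = ["Smith, A. B.", "Jones, C."]

# Group 1: everything up to the comma (the last name).
# Group 2: whatever follows the comma and any spaces (the initials).
pattern = r"^([^,]*), *(.*)"

# \2 \1 puts the initials first, then a space, then the last name.
swapped = [re.sub(pattern, r"\2 \1", name, count=1) for name in names]

print(swapped)
```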
1005: .PP
1006: Of course, with any editing sequence this complicated,
1007: it's foolhardy to simply run it and hope.
1008: The global commands
1009: .UL g
1010: and
1011: .UL v
1012: discussed in Section 4
1013: provide a way for you to print exactly those
1014: lines which were affected by the
1015: substitute command,
1016: and thus verify that it did what you wanted
1017: in all cases.