1.1 root 1: .TI CSHELL/INTERNALS
2: Internal Operation of the C Shell
3:
4:
5: The whole point of the shell is to find programs and start them running.
6: When you enter a command it searches for a program to do your bidding,
7: starts it running, stands by idly until it completes, and then prints
8: a prompt for another command.
9: It continues this read-execute-prompt cycle indefinitely, stopping
10: only when you log out or the computer goes down.
11:
12: Ultimately the shell's purpose is to take a user command and put it
13: in the form Unix requires for starting execution of new
14: programs: execl( PROGFILE, ARG0, ARG1, ARG2, ..., 0 ).
15: For example, if your command were "nroff -ms myfile", the shell's job
16: would be to execl( "/usr/bin/nroff", "nroff", "-ms", "myfile", 0 ),
17: where "/usr/bin/nroff" tells Unix in which file to find the nroff program.
18: In this case the shell had very little work to do.
19: If your next command were "!! | lpr ; wc * > ~/wcout",
20: the shell would have much more work to do and end up with 3 execl calls
21: bearing little resemblance to your command.
22: This is important because the arguments the shell winds up sending
23: to execl are what the programs involved really see.
24:
25: A program that is executing, as opposed
26: to one that is stored in a file, is called a process.
27: When you log in, Unix finds the C shell program in the file "/bin/csh"
28: and starts it running as a process on your terminal.
29: The same happens to everyone else when they log in,
30: but each of the resulting processes is independent and has
31: no knowledge of any other processes except those it might create.
32: Thus you have your own shell when you log in, and can
33: in fact personalize it to some extent.
34:
35: In a little greater detail than before,
36: here is what the C shell does with a command.
37: To illustrate this suppose you enter the command
38: .br
39: % nroff -ms chap* > outfile
40: .br
41: Your shell process ...
42:
43: [1] reads the command and breaks it into separate
44: command words: "nroff", "-ms", "chap*", ">", "outfile";
45:
46: [2] makes new command words if necessary: in this case replaces the
47: command word "chap*" by all filenames beginning with "chap",
48: for example, "chapintro", "chapter1", "chapter2";
49:
50: [3] finds a file (assumed to contain the program)
51: named by the first command word: "/usr/bin/nroff";
52:
53: [4] makes a copy of itself -- a child process -- which will later be
54: transformed into the nroff process.
55:
56: Here the child and parent processes do different things.
57:
58: [5] The child sets up input and output, removing command words which
59: indicate redirection: in this case opens a file
60: called "outfile" to which all future output from this child
61: process will be written instead of the terminal and removes
62: the words ">" and "outfile" from your command;
63:
64: [6] the child transforms itself into the program found in step 3 above
65: using execl: execl( "/usr/bin/nroff", "nroff", "-ms", "chapintro",
66: "chapter1", "chapter2", 0 );
67:
68: [7] the child dies, either because it is done or there was an error,
69: at which point the Unix kernel removes all
70: traces of it and sends a signal of this event to the parent process;
71:
72: [8] the parent process meanwhile literally waits idly for the child
73: process to finish, and then issues a prompt for another command.
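
Steps 4 through 8 above can be sketched in a few lines of Python, whose os module wraps the same Unix calls. This is only an illustration, not the C shell's actual code: the function name run_command and its arguments are made up for the example, and it assumes steps 1 through 3 have already produced the program file and the list of command words.

```python
import os


def run_command(program_file, words, outfile=None):
    """Run one command roughly as steps 4-8 describe; return its exit status."""
    pid = os.fork()                          # step 4: make a copy of this process
    if pid == 0:                             # only the child takes this branch
        try:
            if outfile is not None:          # step 5: send standard output to a file
                fd = os.open(outfile, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
                os.dup2(fd, 1)
                os.close(fd)
            os.execv(program_file, words)    # step 6: become the new program
        finally:
            os._exit(127)                    # reached only if a call above failed
    _, status = os.waitpid(pid, 0)           # step 8: the parent waits for step 7
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)              # child was killed by a signal
```

With these assumptions, the example command would correspond to run_command("/usr/bin/nroff", ["nroff", "-ms", "chapintro", "chapter1", "chapter2"], "outfile").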
74:
75: Each of these steps has interesting and important ramifications.
76: Some are explained below, others are mentioned below and explained
77: elsewhere.
78:
79: [1] Reads the command and breaks it into separate command words.
80:
81: This step (lexical analysis) is needed to get the command words
82: (arguments) into the execl format.
83: It gives the typist some flexibility while imposing some restrictions.
84: In particular, the shell breaks the line into separate words
85: at blanks and tabs, treating multiple blanks and tabs as if they
86: were one blank.
87: So, for example, if you accidentally type extra blanks at
88: the beginning or end of the command, or between words, the
89: shell will probably do what you had in mind.
90: On the other hand, if you leave out blanks between two adjacent
91: arguments, it will go ahead and bundle them up as one word.
92: For example, the shell considers the command
93: .br
94: % nroff-ms myfile
95: .br
96: as having only two words, the command name being "nroff-ms".
97: It then tries unsuccessfully to locate the program (step 3)
98: in a file of that name and responds with
99: .br
100: nroff-ms: Command not found.
101: .br
102: The last argument would have been correctly interpreted as "myfile".
103: To add another twist, the command
104: .br
105: % nroff -ms-o1,5 myfile
106: .br
107: would be execl'd successfully (step 6) but would provoke
108: an error message from nroff.
109:
110: One additional rule says that any one of the
111: characters &|;<>() is considered a separate word,
112: except when one of &|<> appears doubled, in which case
113: the doubled character is one word.
114: For example, the commands
115: .br
116: % neqn <paper|nroff -ms>> outfile&
117: .br
118: % neqn < paper | nroff -ms >> outfile &
119: .br
120: are interpreted identically, each consisting of 9 words.
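
The word-splitting rules described so far are simple enough to sketch. The function below is a hypothetical illustration (split_words is not a name from the shell) and it deliberately ignores quoting, so it covers only blanks, tabs, and the characters &|;<>().

```python
SINGLE = set("&|;<>()")    # each of these is a word by itself
DOUBLED = set("&|<>")      # these form one word when they appear doubled


def split_words(line):
    """Split a command line into words; quoting is not handled."""
    words, current = [], ""
    i = 0
    while i < len(line):
        ch = line[i]
        if ch in " \t":                          # blanks and tabs end a word
            if current:
                words.append(current)
                current = ""
            i += 1
        elif ch in SINGLE:                       # special characters end a word too
            if current:
                words.append(current)
                current = ""
            if ch in DOUBLED and line[i + 1:i + 2] == ch:
                words.append(ch * 2)             # doubled character: one word
                i += 2
            else:
                words.append(ch)
                i += 1
        else:
            current += ch
            i += 1
    if current:
        words.append(current)
    return words
```

Given either form of the neqn command above, this sketch returns the same nine words: neqn, <, paper, |, nroff, -ms, >>, outfile, &.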
121:
122: On the other hand, if you want a blank, tab, or one of &|;<>()
123: to be considered part of another word, you must surround
124: it with quote marks of the type ", `, or ', or precede it with a \\
125: (use of \\ is also termed quoting).
126: If you want a carriage-return (newline) to be part of a word,
127: you must surround it with quote marks AND precede it with a \\,
128: since a newline preceded by a \\ but not enclosed in
129: quote marks is treated as a blank.
130:
131: Beware.
132: Strictly speaking, quoting prevents the shell from interpreting
133: the quoted characters according to its usual practice, and this discussion
134: only mentions how the usual practice is suspended with respect
135: to word separation.
136: There are other much more profound side-effects of quoting depending on both
137: the quoted and the quoting characters.
138: The documentation is perhaps more unyielding, incomplete, and confusing
139: on this issue than on any other.
140:
141: [2] Makes new command words if necessary.
142:
143: The C shell recognizes a large variety of characters and constructs
144: as having special meanings and substitutes other words in their place.
145: This means that if your command line contains any of them,
146: as in "!! | lpr ; wc * > ~/wcout" from before,
147: the resulting call (or calls) to execl (step 6) may be the result
148: of sweeping changes made in this step.
149: Note that the programs being called never see your original command
150: and never have to know anything about the special characters.
151: Consequently, the same substitution rules apply to ALL programs called
152: from the shell (for example, "lpr", "vi", "nroff", etc.).
153:
154: Substitutions are classified by type and are applied in a definite order.
155: The shell scans command words for characters or constructs of the first type,
156: making substitutions if it finds any.
157: Then it takes the resulting command words and scans them to find and make
158: substitutions of the second type, if any, and so forth.
159: Here is a list of substitution types in order with an indication of the kinds
160: of special characters that will trigger them.
161:
162: .nf
163: .ta 8n 16n 24n 32n 40n 48n 56n 64n
164: Type          Triggered By                   Typical Uses
165: ------------------------------------------------------------------------
166: History       !event, ^old^new               re-use earlier commands
167: Alias         first command word             re-name commands
168: Variable      $var, $#var, $var[n]           scripts, personalized shell
169: Command       `shell command`                use command output as args
170: Filename      *, ?, [], {}, ~                abbreviate groups of files
171: Input/Output  <, >, |, <<, >>, $<            re-route input and output
172: Expressions   ( x <>=!~+-*/()&|^ y )         arithmetic and branching
173: .fi
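
The idea of successive passes, each working on the output of the one before, can be sketched as follows. This is a toy model under loud assumptions: only three of the substitution types are shown, the alias name "fmt" and the variable names are invented, file names are taken from a list rather than a real directory, and the matching rules of Python's fnmatch module are only an approximation of the shell's.

```python
import fnmatch


def alias_pass(words, aliases):
    """Replace the first command word if it names an alias."""
    if words and words[0] in aliases:
        return aliases[words[0]] + words[1:]
    return words


def variable_pass(words, variables):
    """Replace a word of the form $name with the words stored under name."""
    out = []
    for word in words:
        if word.startswith("$") and word[1:] in variables:
            out.extend(variables[word[1:]])
        else:
            out.append(word)
    return out


def filename_pass(words, filenames):
    """Replace a word containing *, ? or [ with the sorted matching names."""
    out = []
    for word in words:
        matches = []
        if any(c in word for c in "*?["):
            matches = sorted(fnmatch.filter(filenames, word))
        out.extend(matches or [word])    # simplification: keep the word if nothing matches
    return out


def substitute(words, aliases, variables, filenames):
    """Apply the passes in order, each seeing the result of the previous one."""
    words = alias_pass(words, aliases)
    words = variable_pass(words, variables)
    return filename_pass(words, filenames)
```

Because the filename pass runs after the variable pass, a variable whose value is "chap*" still ends up replaced by the matching file names in this sketch.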
174:
175: In the hands of a sober, well-informed user, substitutions are very
176: useful: (1) they can save tremendous amounts of typing, (2) they
177: need only be learned for the shell, since all programs called by
178: users have to go through the shell, and (3) they make it possible
179: to write programs consisting of shell commands.
180:
181: In the wrong hands, however, substitutions can be tricky.
182: To help you practice, the shell provides a way for you to see
183: exactly what it comes up with just before it calls execl.
184: The command "set echo" will cause it to print your command after
185: all substitutions have been made, just before calling execl.
186: To avoid the danger of executing a possibly incorrect command, you
187: can test whether a construct will end up the way you think
188: just by entering it as an argument to the "echo" command.
189: The "echo" command does nothing more than print its arguments
190: on the terminal and like all commands is subject to substitutions.
191: So, for example, "echo *" prints the words that would result,
192: on any command line, from substituting for * (which lists all your files).
193:
194: [3] Finds a file named by the first command word.
195:
196: The whole point of the shell is to run programs other than itself,
197: such as "vi", "cc", "troff", etc.
198: Occasionally there is a need for a command that the shell can perform
199: internally, that is, without locating a program file or
200: creating another process.
201: So in this step the shell usually tries to locate a file containing the
202: program named by the first command word, but not before checking
203: to see if it belongs to the set of commands built-in to itself.
204:
205: If a command is non-built-in, the shell scans a list of directories
206: called the searchpath, which may be personalized for each user.
207: It appends the first command word to the first directory on
208: the list and checks to see if the resulting file name exists.
209: If not, it checks the second directory in a similar fashion,
210: and so forth, until a file is found, and that file name is
211: used when execl is called in step 6.
212: In the case that no file is found, the shell reports this
213: and prompts for another command.
214:
215: If your searchpath becomes garbled, usually because you were
216: experimenting with it, the shell may not find some or all of
217: the usual non-built-in commands.
218: Besides panicking, there are two things to do.
219: Fortunately, the command to correct the searchpath is built-in
220: and can still be used, but only if you recognize that
221: that is the problem.
222: Also, if the first command word begins with a /, the shell
223: considers it to be the name of the program file to execute,
224: for example, the command "/usr/ucb/vi .cshrc" would work.
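
The search just described amounts to a short loop. Here is a minimal sketch; the name find_program is hypothetical, and the test for an executable regular file is an assumption about what "exists" should mean in practice.

```python
import os


def find_program(name, searchpath):
    """Return the file name to execute, or None if the search fails."""
    if name.startswith("/"):                 # already a file name: no search needed
        return name
    for directory in searchpath:             # try each directory in order
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None                              # "Command not found."
```

The order of the directories matters: if two directories on the list each hold a file with the same name, the one listed first wins.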
225:
226: If a command is built-in, the shell bypasses steps 4, 6, 7, and 8,
227: which reduces run time greatly, and performs the command in its own way.
228: For the sake of efficiency, a built-in command is preferred to a
229: non-built-in command if they perform the same function, and that is
230: why some of the built-in commands were created.
231: Other commands were built-in because they would not have worked
232: otherwise, due to the way that processes
233: disappear completely in step 7. In particular, if a command is
234: needed to change the behavior of your shell from that point on,
235: a non-built-in command would only be able to change the characteristics
236: of a child process of your shell, not the shell process that will read
237: your next command, and the child dies leaving no trace of the change.
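
You can watch a change vanish with the child in a small experiment. The sketch below (the function name chdir_in_child is made up) changes directory inside a forked child and then checks the parent's current directory, which is why a command like "cd" could not work as a separate program.

```python
import os


def chdir_in_child(directory):
    """Change directory in a child; return the parent's directory before and after."""
    before = os.getcwd()
    pid = os.fork()
    if pid == 0:
        try:
            os.chdir(directory)      # changes only the child's copy
        finally:
            os._exit(0)              # the child dies, taking the change with it
    os.waitpid(pid, 0)
    return before, os.getcwd()       # the parent never moved
```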
238:
239: The "echo" command, for example, is built-in to the C shell
240: because it is used so often.
241: A quick and ugly way to list the files in your directory,
242: without using the "ls" command, is to type "echo *".
243: A very quick way to create a one line file, without "vi",
244: is "echo This is a one line file. > oneliner".
245: Some commands that have to be built-in are "cd", "set",
246: "alias", and "history".
247: Unfortunately, most built-in commands do not have separate manual
248: sections, so the command "man set" will yield nothing, while "man csh"
249: will tell you about "cd" after printing the first 9 pages or so.
250: Ironically, "man echo" will display a manual page because users
251: of the Bourne shell do not have a built-in "echo" command.
252:
253: [4] Makes a copy of itself -- a child process.
254:
255: The Unix kernel requires the C shell -- in fact, requires
256: all programs that run other programs -- to use execl.
257: Unfortunately, execl replaces the program in the calling process,
258: so that process dies when the new program is done.
259: Your shell therefore has to create a new process to do the execl
260: in order that the old process survive to prompt you for the
261: next command.
262: The only way to create a new process on Unix, though, is for
263: an existing process to make a copy of itself by executing
264: a program statement called fork.
265: The new and old processes are identical except that one knows
266: it is a parent and the other knows it is a child, and
267: the internal code of the program for both processes can
268: take different branches on the basis of this information.
269: This step is time-consuming, and the documentation sometimes
270: mentions useful ways to avoid having to fork new processes,
271: for instance, by using built-in commands.
272:
273: [5] The child sets up input and output.
274:
275: In this step, the command words are scanned for special input or
276: output redirection constructs.
277: When these constructs have been interpreted, they are removed
278: from the list of command words.
279: Any output file specified is created if it does not already exist.
280: If the file or directory does not have the correct permissions,
281: or an input file does not exist, the shell, not the program named
282: by the first command word, issues an error message and prompts
283: for another command.
284: The program to be run has no knowledge that its inputs and outputs
285: have been changed.
286:
287: In the presence of a pipe between commands, the shell removes the
288: pipe constructs from the command line after first breaking it up
289: into separate subcommands. Each of these subcommands is processed
290: like any other command, with a separate fork and execl for each.
291: The main difference is that the parent sets up input and output
292: between processes and has them all started up before beginning
293: to wait on any of them.
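
A two-command pipeline might be set up roughly as follows. This is a sketch, not the shell's own code: run_pipeline is a hypothetical name, each command is given as a list of words whose first word is assumed to be the full program file name already, and only one pipe is handled.

```python
import os


def run_pipeline(left, right):
    """Run left | right; return the two raw wait statuses."""
    read_end, write_end = os.pipe()          # the parent creates the pipe first
    plan = ((left, None, write_end), (right, read_end, None))
    pids = []
    for words, stdin_fd, stdout_fd in plan:
        pid = os.fork()                      # one fork per subcommand
        if pid == 0:
            try:
                if stdin_fd is not None:
                    os.dup2(stdin_fd, 0)     # read from the pipe
                if stdout_fd is not None:
                    os.dup2(stdout_fd, 1)    # write into the pipe
                os.close(read_end)
                os.close(write_end)
                os.execv(words[0], words)    # one exec per subcommand
            finally:
                os._exit(127)                # reached only if a call above failed
        pids.append(pid)
    os.close(read_end)                       # the parent keeps no pipe ends open
    os.close(write_end)
    return [os.waitpid(pid, 0)[1] for pid in pids]   # wait only after all are started
```

Closing both pipe ends in the parent matters: if the parent kept the write end open, the right-hand program would never see end of input.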
294:
295: [6] The child transforms itself into the program found in step 3.
296:
297: This is where the child does the execl, but not precisely.
298: For simplicity I did not mention that the actual call is
299: of the form: execve( PROGFILE, ARGV, ENVP ), where ARGV is the list
300: ARG0, ARG1, ARG2, ..., 0 and ENVP is the list ENV0, ENV1, ENV2, ..., 0.
301: The second list contains definitions of
302: all the current process's environment variables.
303: These may contain any information the user chooses
304: to store in them using the built-in command "setenv".
305: Besides input/output redirection, the
306: current directory, and a handful of other data,
307: they are some of the very few things
308: that can be inherited by the new program after execl.
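
To see an environment variable cross into the new program, here is a small sketch using Python's os.execve, which wraps the Unix call of the same name. The function name run_with_env is invented, and a pipe is used only so the parent can collect what the child prints.

```python
import os


def run_with_env(program_file, words, env):
    """Fork, exec with an explicit environment, and return the child's output."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_end)
            os.dup2(write_end, 1)                    # child's output goes to the pipe
            os.execve(program_file, words, env)      # env is all the new program inherits
        finally:
            os._exit(127)                            # reached only if a call above failed
    os.close(write_end)
    with os.fdopen(read_end) as stream:
        output = stream.read()
    os.waitpid(pid, 0)
    return output
```

A program started this way sees exactly the variables in env, no more, so a variable left out of the list is simply unknown to it.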
309:
310: [7] The child dies.
311:
312: Processes can finish normally or abnormally,
313: but all of them die eventually.
314: For example, when you leave "vi" by typing ZZ, or when
315: "nroff" stops because of a macro/diversion overflow,
316: then the associated processes die.
317: Your shell itself is a process which dies when you log out.
318:
319: When the child process running the new program dies,
320: the Unix kernel sends a signal to the parent process (your shell)
321: notifying it of the event.
322:
323: [8] The parent waits for the child to die, then prompts the user.
324:
325: In the meantime, the parent process has executed a program statement
326: called wait which just puts it on hold until Unix
327: sends a signal notifying the shell that the child has died.
328: If you had entered an & at the end of the original command,
329: your shell would not wait for notification of the child's death
330: but would print the child's process number and then
331: prompt you for the next command.
332: That procedure is called backgrounding a process.
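
The difference between waiting and backgrounding comes down to whether the parent calls wait right away. A minimal sketch, with a made-up function name start and the program file assumed to be known already:

```python
import os


def start(program_file, words, background=False):
    """Fork and exec a program; wait for it unless background is true."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execv(program_file, words)
        finally:
            os._exit(127)             # reached only if the exec failed
    if background:
        return pid, None              # like "&": report the process number, do not wait
    _, status = os.waitpid(pid, 0)    # the usual case: hold until the child dies
    return pid, status
```

A backgrounded child still has to be waited for eventually, which the caller can do later with os.waitpid(pid, 0).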
333:
334: While the C shell is waiting for the child (only on 4.1 or 4.2 BSD Unix)
335: you can type ^Z to wake up the parent and
336: freeze the child for the time being.
337: At that point you could enter other commands to the shell and
338: at a later time you could issue commands to resume execution,
339: kill the child altogether, or resume execution in the background.
340: This useful feature is called job control.
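
The freeze and resume behind job control can be imitated with signals. The sketch below is an approximation under stated assumptions: there is no terminal involved, so SIGSTOP is sent directly as a stand-in for what typing ^Z would trigger, the child merely sleeps, and the function name stop_and_resume is invented.

```python
import os
import signal
import time


def stop_and_resume():
    """Freeze a child, notice the stop, resume it, then kill it."""
    pid = os.fork()
    if pid == 0:
        try:
            time.sleep(30)                       # stand-in for a long-running program
        finally:
            os._exit(0)
    os.kill(pid, signal.SIGSTOP)                 # freeze the child
    _, status = os.waitpid(pid, os.WUNTRACED)    # returns when the child stops
    stopped = os.WIFSTOPPED(status)
    os.kill(pid, signal.SIGCONT)                 # resume execution
    os.kill(pid, signal.SIGKILL)                 # or kill it altogether
    _, status = os.waitpid(pid, 0)
    return stopped, os.WIFSIGNALED(status)
```

The WUNTRACED option is what lets the waiting parent wake up when the child is merely stopped rather than dead.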
341:
342:
343: jak