43BSDTahoe/new/help/src/cshell/internals - annotate

Annotation of 43BSDTahoe/new/help/src/cshell/internals, revision 1.1.1.1

1.1       root        1: .TI CSHELL/INTERNALS
                      2: Internal Operation of the C Shell
                      3: 
                      4: 
                      5: The whole point of the shell is to find programs and start them running.
                      6: When you enter a command it searches for a program to do your bidding,
                      7: starts it running, stands by idly until it completes, and then prints
                      8: a prompt for another command.
                      9: It continues this read-execute-prompt cycle indefinitely, stopping
                     10: only when you logout or the computer goes down.
                     11: 
                     12: Ultimately the shell's purpose is to take a user command and put it
                     13: in the form Unix requires for starting execution of new
                     14: programs:  execl( PROGFILE, ARG0, ARG1, ARG2, ..., 0 ).
                     15: For example, if your command were "nroff -ms myfile", the shell's job
                     16: would be to execl( "/usr/bin/nroff", "nroff", "-ms", "myfile", 0 ),
                     17: where "/usr/bin/nroff" tells Unix in which file to find the nroff program.
                     18: In this case the shell had very little work to do.
                     19: If your next command were "!! | lpr ; wc * > ~/wcout",
                     20: the shell would have much more work to do and end up with 3 execl calls
                     21: bearing little resemblance to your command.
                     22: This is important because what the shell winds up sending to execl
                     23: as arguments is what the programs involved really see.
                     24: 
                     25: A program that is executing, as opposed
                     26: to one that is stored in a file, is called a process.
                     27: When you login, Unix finds the C shell program in the file "/bin/csh"
                     28: and starts it running as a process on your terminal.
                     29: The same happens to everyone else when they login,
                     30: but each of the resulting processes is independent and has
                     31: no knowledge of any other processes except those it might create.
                     32: Thus you have your own shell when you login, and can
                     33: in fact personalize it to some extent.
                     34: 
                     35: In a little greater detail than before,
                     36: here is what the C shell does with a command.
                     37: To illustrate this suppose you enter the command
                     38: .br
                     39:        % nroff -ms chap* > outfile
                     40: .br
                     41: Your shell process ...
                     42: 
                     43: [1] reads the command and breaks it into separate
                     44: command words:  "nroff", "-ms", "chap*", ">", "outfile";
                     45: 
                     46: [2] makes new command words if necessary:  in this case replaces the
                     47: command word "chap*" by all filenames beginning with "chap",
                     48: for example, "chapintro", "chapter1", "chapter2";
                     49: 
                     50: [3] finds a file (assumed to contain the program)
                     51: named by the first command word:  "/usr/bin/nroff";
                     52: 
                     53: [4] makes a copy of itself -- a child process -- which will later be
                     54: transformed into the nroff process.
                     55: 
                     56: Here the child and parent processes do different things.
                     57: 
                     58: [5] The child sets up input and output, removing command words which
                     59: indicate redirection:  in this case opens a file
                     60: called "outfile" to which all future output from this child
                     61: process will be written instead of the terminal and removes
                     62: the words ">" and "outfile" from your command;
                     63: 
                     64: [6] the child transforms itself into the program found in step 3 above
                     65: using execl:  execl( "/usr/bin/nroff", "nroff", "-ms", "chapintro",
                     66: "chapter1", "chapter2", 0 );
                     67: 
                     68: [7] the child dies, either because it is done or there was an error,
                     69: at which point the Unix kernel removes all
                     70: traces of it and sends a signal of this event to the parent process;
                     71: 
                     72: [8] the parent process meanwhile literally waits idly for the child
                     73: process to finish, and then issues a prompt for another command.
                     74: 
                     75: Each of these steps has interesting and important ramifications.
                     76: Some are explained below, others are mentioned below and explained
                     77: elsewhere.
                     78: 
                     79: [1] Reads the command and breaks it into separate command words.
                     80: 
                     81: This step (lexical analysis) is needed to get the command words
                     82: (arguments) into the execl format.
                     83: It gives the typist some flexibility while imposing some restrictions.
                     84: In particular, the shell breaks the line into separate words
                     85: at blanks and tabs, treating multiple blanks and tabs as if they
                     86: were one blank.
                     87: So, for example, if you accidentally type extra blanks at
                     88: the beginning or end of the command, or between words, the
                     89: shell will probably do what you had in mind.
                     90: On the other hand, if you leave out blanks between two adjacent
                     91: arguments, it will go ahead and bundle them up as one word.
                     92: For example, the shell considers the command
                     93: .br
                     94:        % nroff-ms                myfile
                     95: .br
                     96: as having only two words, the name of the command being "nroff-ms",
                     97: then tries unsuccessfully to locate the program (step 3)
                     98: in a file of that name and responds with
                     99: .br
                    100:        nroff-ms: Command not found.
                    101: .br
                    102: The last argument would have been correctly interpreted as "myfile".
                    103: To add another twist, the command
                    104: .br
                    105:        % nroff -ms-o1,5 myfile
                    106: .br
                    107: would be execl'd successfully (step 6) but would provoke
                    108: an error message from nroff.
                    109: 
                    110: One additional rule says that any one of the
                    111: characters &|;<>() is considered a separate word,
                    112: except when one of &|<> appears doubled, in which case
                    113: the doubled character is one word.
                    114: For example, the commands
                    115: .br
                    116:        % neqn    <paper|nroff -ms>>   outfile&
                    117: .br
                    118:        % neqn < paper | nroff -ms >> outfile &
                    119: .br
                    120: are interpreted identically, each consisting of 9 words.
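
As a rough illustration, the word-breaking rules described so far can be
sketched in a few lines of Python. This is not the shell's own code: the
function name split_words is made up for the example, the leading "% "
prompt is assumed to be already stripped, and quoting is ignored entirely.

```python
def split_words(line):
    """Break a command line into words at blanks, tabs, and &|;<>()."""
    words = []
    current = ""          # the ordinary word being accumulated, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if ch in " \t":
            # Any run of blanks and tabs simply ends the current word.
            if current:
                words.append(current)
                current = ""
            i += 1
        elif ch in "&|;<>()":
            # A special character ends the current word and is a word itself.
            if current:
                words.append(current)
                current = ""
            if ch in "&|<>" and line[i + 1:i + 2] == ch:
                words.append(ch + ch)   # doubled &&, ||, <<, >> form one word
                i += 2
            else:
                words.append(ch)
                i += 1
        else:
            current += ch
            i += 1
    if current:
        words.append(current)
    return words
```

Given either of the two neqn commands above, this sketch returns the same
list of 9 words, and given "nroff-ms                myfile" it returns only 2.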
                    121: 
                    122: On the other hand, if you want a blank, tab, or one of &|;<>()
                    123: to be considered part of another word, you must surround
                    124: it with quote marks of the type ", `, or ', or precede it with a \\
                    125: (use of \\ is also termed quoting).
                    126: If you want a carriage-return (newline) to be part of a word,
                    127: you must surround it with quote marks AND precede it with a \\,
                    128: since a newline preceded by a \\ but not surrounded by
                    129: quote marks is treated as a blank.
                    130: 
                    131: Beware.
                    132: Strictly speaking, quoting prevents the shell from interpreting
                    133: the quoted characters according to its usual practice, and this discussion
                    134: only mentions how the usual practice is suspended with respect
                    135: to word separation.
                    136: There are other much more profound side-effects of quoting depending on both
                    137: the quoted and the quoting characters.
                    138: The documentation is perhaps more unyielding, incomplete, and confusing
                    139: on this issue than on any other.
                    140: 
                    141: [2] Makes new command words if necessary.
                    142: 
                    143: The C Shell recognizes a large variety of characters and constructs
                    144: as having special meanings and substitutes other words in their place.
                    145: This means that if your command line contains any of them,
                    146: as in "!! | lpr ; wc * > ~/wcout" from before,
                    147: the resulting call (or calls) to execl (step 6) may be the result
                    148: of sweeping changes made in this step.
                    149: Note that the programs being called never see your original command
                    150: and never have to know anything about the special characters.
                    151: Consequently, the same substitution rules apply to ALL programs called
                    152: from the shell (for example, "lpr", "vi", "nroff", etc.).
                    153: 
                    154: Substitutions are classified by type and are applied in a definite order.
                    155: The shell scans command words for characters or constructs of the first type,
                    156: making substitutions if it finds any.
                    157: Then it takes the resulting command words and scans them to find and make
                    158: substitutions of the second type, if any, and so forth.
                    159: Here is a list of substitution types in order with an indication of the kinds
                    160: of special characters that will trigger them.
                    161: 
                    162: .nf
                    163: .ta 8n 16n 24n 32n 40n 48n 56n 64n
                    164: Type            Triggered By            Typical Uses
                    165: -------------------------------------------------------------------
                    166: History         !event, ^old^new        re-use earlier commands
                    167: Alias           first command word      re-name commands
                    168: Variable        $var, $#var, $var[n]    scripts, personalized shell
                    169: Command         `shell command`         use command output as args
                    170: Filename        *, ?, [], {}, ~         abbreviate groups of files
                    171: Input/Output    <, >, |, <<, >>, $<     re-route input and output
                    172: Expressions     ( x <>=!~+-*/()&|^ y )  arithmetic and branching
                    173: .fi
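
The idea of applying substitution types one after another, each working on
the words produced by the one before, might be sketched in Python as below.
Only toy versions of the alias and variable types are shown. The function
names and the dictionaries of aliases and variables are inventions of this
example, and the real rules are far richer.

```python
def alias_pass(words, aliases):
    """Replace the first command word if it names an alias."""
    if words and words[0] in aliases:
        return aliases[words[0]].split() + words[1:]
    return words

def variable_pass(words, variables):
    """Replace each word of the form $name by that variable's value."""
    result = []
    for word in words:
        if word.startswith("$") and word[1:] in variables:
            result.extend(variables[word[1:]].split())
        else:
            result.append(word)
    return result

def substitute(words, aliases, variables):
    """Apply the passes in a fixed order, feeding each the previous result."""
    passes = [
        lambda w: alias_pass(w, aliases),        # alias comes first ...
        lambda w: variable_pass(w, variables),   # ... then variable
    ]
    for apply_pass in passes:
        words = apply_pass(words)
    return words
```

Because the alias pass runs first, a variable that appears only inside an
alias definition is still picked up by the variable pass that follows.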
                    174: 
                    175: In the hands of a sober, well-informed user, substitutions are very
                    176: useful:  (1) they can save tremendous amounts of typing, (2) they
                    177: need only be learned for the shell, since all programs called by
                    178: users have to go through the shell, and (3) they make it possible
                    179: to write programs consisting of shell commands.
                    180: 
                    181: In the wrong hands, however, substitutions can be tricky.
                    182: To help you practice, the shell provides a way for you to see
                    183: exactly what it comes up with just before it calls execl.
                    184: The command "set echo" will cause it to print your command after
                    185: all substitutions have been made, just before calling execl.
                    186: To avoid the danger of executing a possibly incorrect command, you
                    187: can test whether a construct will end up the way you think
                    188: just by entering it as an argument to the "echo" command.
                    189: The "echo" command does nothing more than print its arguments
                    190: on the terminal and like all commands is subject to substitutions.
                    191: So, for example, "echo *" prints the words that would result,
                    192: on any command line, from substituting for * (which lists all your files).
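
What "echo chap*" would print can be imitated with a small Python sketch
of filename substitution. The function expand_filenames is hypothetical,
it takes the list of file names as an argument instead of reading a
directory, it handles only *, ?, and [] (not {} or ~), and treating a
pattern that matches nothing as an error is a simplification.

```python
import fnmatch

def expand_filenames(words, filenames):
    """Replace each word containing *, ? or [ by the sorted names it matches."""
    result = []
    for word in words:
        if not any(ch in word for ch in "*?["):
            result.append(word)          # nothing special: keep the word as is
            continue
        matches = sorted(
            name for name in filenames
            if fnmatch.fnmatchcase(name, word)
            # a leading "." has to be matched explicitly, as in the shell
            and (word.startswith(".") or not name.startswith("."))
        )
        if not matches:
            raise LookupError("No match")
        result.extend(matches)
    return result
```

With the files "chapintro", "chapter1", "chapter2", and "notes", the words
"nroff", "-ms", "chap*" become "nroff", "-ms", "chapintro", "chapter1",
"chapter2", which is what step 6 of the earlier example hands to nroff.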
                    193: 
                    194: [3] Finds a file named by the first command word.
                    195: 
                    196: The whole point of the shell is to run programs other than itself,
                    197: such as "vi", "cc", "troff", etc.
                    198: Occasionally there is a need for a command that the shell can perform
                    199: internally, that is, without locating a program file or
                    200: creating another process.
                    201: So in this step the shell usually tries to locate a file containing the
                    202: program named by the first command word, but not before checking
                    203: to see if it belongs to the set of commands built-in to itself.
                    204: 
                    205: If a command is non-built-in, the shell scans a list of directories
                    206: called the searchpath, which may be personalized for each user.
                    207: It appends the first command word to the first directory on
                    208: the list and checks to see if the resulting file name exists.
                    209: If not, it checks the second directory in a similar fashion,
                    210: and so forth, until a file is found, and that file name is
                    211: used when execl is called in step 6.
                    212: In the case that no file is found, the shell reports this
                    213: and prompts for another command.
                    214: 
                    215: If your searchpath becomes garbled, usually because you were
                    216: experimenting with it, the shell may not find some or all of
                    217: the usual non-built-in commands.
                    218: Besides panicking, there are two things to do.
                    219: Fortunately, the command to correct the searchpath is built-in
                    220: and can still be used, but only if you recognize that
                    221: that is the problem.
                    222: Also, if the first command word begins with a /, the shell
                    223: considers it to be the name of the program file to execute,
                    224: for example, the command "/usr/ucb/vi .cshrc" would work.
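
The directory search, together with the rule for a first word beginning
with a /, could be sketched in Python roughly as follows. The name
find_program is made up, the searchpath is passed in as a plain list, and
checking that the file is executable (not merely that it exists) is an
assumption of this sketch.

```python
import os

def find_program(word, searchpath):
    """Return the file name to hand to exec, or None if nothing is found."""
    if word.startswith("/"):
        return word                      # already names the program file
    for directory in searchpath:
        # Append the command word to each directory in turn.
        candidate = os.path.join(directory, word)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate             # first hit wins
    return None                          # caller reports "Command not found."
```

The order of the list matters: if two directories both hold a program of
the same name, the one in the earlier directory is the one that runs.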
                    225: 
                    226: If a command is built-in, the shell bypasses steps 4, 6, 7, and 8,
                    227: which reduces run time greatly, and performs the command in its own way.
                    228: For the sake of efficiency, a built-in command is preferred to a
                    229: non-built-in command if they perform the same function, and that is
                    230: why some of the built-in commands were created.
                    231: Other commands were built-in because they would not have worked
                    232: otherwise, due to the way that processes
                    233: disappear completely in step 7; in particular, if a command is
                    234: needed to change the behavior of your shell from that point on,
                    235: a non-built-in command would only be able to change the characteristics
                    236: of a child process of your shell, not of the shell process that will read
                    237: your next command, and when the child dies it leaves no trace of the change.
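
A small Python experiment shows why a command like "cd" could not work as
a separate program. The function name chdir_in_child is hypothetical. It
changes directory in a forked child and reports the parent's directory
before and after.

```python
import os

def chdir_in_child(directory):
    """Change directory in a child; return the parent's directory before and after."""
    before = os.getcwd()
    pid = os.fork()
    if pid == 0:
        # Child: this change affects only the child's copy of the process.
        try:
            os.chdir(directory)
        finally:
            os._exit(0)
    os.waitpid(pid, 0)          # parent waits for the child to die
    return before, os.getcwd()
```

The two values returned are always the same: the child's change of
directory vanishes along with the child.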
                    238: 
                    239: The "echo" command, for example, is built-in to the C shell
                    240: because it is used so often.
                    241: A quick and ugly way to list the files in your directory,
                    242: without using the "ls" command, is to type "echo *".
                    243: A very quick way to create a one line file, without "vi",
                    244: is "echo This is a one line file. > oneliner".
                    245: Some commands that have to be built-in are "cd", "set",
                    246: "alias", and "history".
                    247: Unfortunately, most built-in commands do not have separate manual
                    248: sections, so the command "man set" will yield nothing, while "man csh"
                    249: will tell you about "cd" after printing the first 9 pages or so.
                    250: Ironically, "man echo" will display a manual page because users
                    251: of the Bourne shell do not have a built-in "echo" command.
                    252: 
                    253: [4] Makes a copy of itself -- a child process.
                    254: 
                    255: The Unix kernel requires the C shell -- in fact, requires
                    256: all programs that run other programs -- to use execl.
                    257: Unfortunately, that causes the process running the new program
                    258: to die when it is done.
                    259: Your shell therefore has to create a new process to do the execl
                    260: in order that the old process survive to prompt you for the
                    261: next command.
                    262: The only way to create a new process on Unix, though, is for
                    263: an existing process to make a copy of itself by executing
                    264: a program statement called fork.
                    265: The new and old processes are identical except that one knows
                    266: it is a parent and the other knows it is a child, and
                    267: the internal code of the program for both processes can
                    268: take different branches on the basis of this information.
                    269: This step is time-consuming, and the documentation sometimes
                    270: mentions useful ways to avoid having to fork new processes,
                    271: for instance, by using built-in commands.
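
Python exposes fork directly as os.fork, so the two branches can be seen
in a short sketch. The function name fork_demo and the exit status 7 are
arbitrary choices made for this illustration.

```python
import os

def fork_demo():
    """Fork once; return (child's process number, child's exit status)."""
    pid = os.fork()
    if pid == 0:
        # Child branch: fork returned 0 here.
        os._exit(7)
    # Parent branch: fork returned the child's process number.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

The same code runs in both processes after the fork, and only the value
returned by fork tells each one which branch to take.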
                    272: 
                    273: [5] The child sets up input and output.
                    274: 
                    275: In this step, the command words are scanned for special input or
                    276: output redirection constructs.
                    277: When these constructs have been interpreted, they are removed
                    278: from the list of command words.
                    279: Any output file specified is created if it does not already exist.
                    280: If the file or directory does not have the correct permissions,
                    281: or an input file does not exist, the shell, not the program named
                    282: by the first command word, issues an error message and prompts
                    283: for another command.
                    284: The program to be run has no knowledge that its inputs and outputs
                    285: have been changed.
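
What the child does for "> outfile" might look like this in Python. It is
a sketch under assumptions, not the shell's code: the name run_with_output
is made up, only output redirection is handled, and on failure the child
simply exits with status 1 where the real shell would print a message.

```python
import os

def run_with_output(progfile, argv, outfile):
    """Run progfile with standard output sent to outfile; return its exit status."""
    pid = os.fork()
    if pid == 0:
        try:
            # Child: create (or truncate) the file and make it descriptor 1.
            fd = os.open(outfile, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
            if fd != 1:
                os.dup2(fd, 1)
                os.close(fd)
            os.execv(progfile, argv)    # does not return if it succeeds
        finally:
            os._exit(1)                 # reached only if open or exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

Note that argv never contains ">" or the file name. The program being run
just writes to descriptor 1 as usual and cannot tell where it leads.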
                    286: 
                    287: In the presence of a pipe between commands, the shell removes the
                    288: pipe constructs from the command line after first breaking it up
                    289: into separate subcommands.  Each of these subcommands is processed
                    290: like any other command, with a separate fork and execl for each.
                    291: The main difference is that the parent sets up input and output
                    292: between processes and has them all started up before beginning
                    293: to wait on any of them.
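
For a two-command pipe, the arrangement might be sketched in Python as
follows. The function run_pipeline and its argument layout (two pairs of
program file and argument list) are assumptions of this example.

```python
import os

def run_pipeline(left, right):
    """Run two commands joined by a pipe; return their two exit statuses.

    left and right are (progfile, argv) pairs.
    """
    read_end, write_end = os.pipe()      # the parent creates the connection
    pids = []
    for (progfile, argv), stdin_fd, stdout_fd in (
        (left, None, write_end),
        (right, read_end, None),
    ):
        pid = os.fork()                  # a separate fork for each subcommand
        if pid == 0:
            try:
                if stdin_fd is not None:
                    os.dup2(stdin_fd, 0)
                if stdout_fd is not None:
                    os.dup2(stdout_fd, 1)
                # The child no longer needs the original pipe descriptors.
                os.close(read_end)
                os.close(write_end)
                os.execv(progfile, argv)
            finally:
                os._exit(1)              # reached only if something failed
        pids.append(pid)
    # The parent must close its copies, or the reader never sees end-of-file.
    os.close(read_end)
    os.close(write_end)
    # Only now, with every process started, does the parent begin to wait.
    return [os.WEXITSTATUS(os.waitpid(pid, 0)[1]) for pid in pids]
```

Starting both children before waiting on either is essential, since the
writer may block until the reader is running to drain the pipe.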
                    294: 
                    295: [6] The child transforms itself into the program found in step 3.
                    296: 
                    297: This is where the child does the execl, but not precisely.
                    298: For simplicity I did not mention that the actual call is
                    299: of the form:  execve( PROGFILE, ARGV, ENVP ), where ARGV is the list
                    300: ARG0, ARG1, ARG2, ..., 0 and ENVP is the list ENV0, ENV1, ENV2, ..., 0.
                    301: The new list, ENVP, contains definitions of
                    302: all the current process's environment variables.
                    303: These may contain any information the user may choose
                    304: to store in them using the built-in command "setenv" and
                    305: have the property that besides input/output redirection, the
                    306: current directory, and a handful of other data,
                    307: they are some of the very few things
                    308: that can be inherited by the new program after execl.
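
Python's os.execve takes the program file, the argument list, and the
environment as three separate values, which makes the inheritance easy to
try out. In this sketch the function name run_with_environment and the
exit status 127 on failure are arbitrary choices.

```python
import os

def run_with_environment(progfile, argv, env):
    """Fork, then execve progfile with argv and the environment env (a dict)."""
    pid = os.fork()
    if pid == 0:
        try:
            # Third argument: the complete environment the new program sees.
            os.execve(progfile, argv, env)
        finally:
            os._exit(127)       # reached only if the exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

A variable placed in env is visible to the new program, while one left out
of env is not, whatever the parent's own environment contains.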
                    309: 
                    310: [7] The child dies.
                    311: 
                    312: Processes can finish normally or abnormally,
                    313: but all of them die eventually.
                    314: For example, when you leave "vi" by typing ZZ, or when
                    315: "nroff" stops because of a macro/diversion overflow,
                    316: then the associated processes die.
                    317: Your shell itself is a process which dies when you logout.
                    318: 
                    319: When the child process running the new program dies,
                    320: the Unix kernel sends a signal to the parent process (your shell)
                    321: notifying it of the event.
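
How a parent can tell a normal death from an abnormal one is shown in the
following Python sketch. The function names describe_death and child_fate
are made up, and the sketch looks only at the status that the parent
collects, not at the signal itself.

```python
import os
import signal

def describe_death(status):
    """Turn a status from os.waitpid into a short description."""
    if os.WIFEXITED(status):
        return "exited with status %d" % os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    return "unknown"

def child_fate(action):
    """Fork a child that runs action(), wait for it, and describe how it died."""
    pid = os.fork()
    if pid == 0:
        try:
            action()
        finally:
            os._exit(0)         # normal end if action() returns
    _, status = os.waitpid(pid, 0)
    return describe_death(status)
```

For instance, a child that calls os._exit(3) is reported as having exited
with status 3, while one that sends itself signal.SIGKILL is reported as
killed by that signal.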
                    322: 
                    323: [8] The parent waits for the child to die, then prompts the user.
                    324: 
                    325: In the meantime, the parent process has executed a program statement
                    326: called wait which just puts it on hold until Unix
                    327: sends a signal notifying the shell that the child has died.
                    328: If you had entered an & at the end of the original command,
                    329: your shell would not wait for notification of the child's death
                    330: but would print the child's process number and then
                    331: prompt you for the next command.
                    332: That procedure is called backgrounding a process.
                    333: 
                    334: While the C shell is waiting for the child (only on 4.1 or 4.2 BSD Unix)
                    335: you can type ^Z to wake up the parent and
                    336: freeze the child for the time being.
                    337: At that point you could enter other commands to the shell and
                    338: at a later time you could issue commands to resume execution,
                    339: kill it altogether, or resume execution in the background.
                    340: This useful feature is called job control.
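
The freezing and resuming of a child can be imitated from Python. This is
only a sketch: the function stop_and_resume is hypothetical, a half-second
sleep stands in for a long-running program, and the parent sends SIGSTOP
itself where a real ^Z would come from the terminal.

```python
import os
import signal
import time

def stop_and_resume():
    """Freeze a child, resume it, and report what the parent saw."""
    pid = os.fork()
    if pid == 0:
        try:
            time.sleep(0.5)             # stand-in for a long-running program
        finally:
            os._exit(0)
    os.kill(pid, signal.SIGSTOP)        # freeze the child
    # WUNTRACED makes waitpid return for a stopped child, not just a dead one.
    _, status = os.waitpid(pid, os.WUNTRACED)
    was_stopped = os.WIFSTOPPED(status)
    os.kill(pid, signal.SIGCONT)        # resume execution
    _, status = os.waitpid(pid, 0)      # now wait for it to die
    finished = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return was_stopped, finished
```

Between the two waits the parent is free to do other work, which is the
point at which a shell would prompt for further commands.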
                    341: 
                    342: 
                    343: jak