.TI CSHELL/FLOWCONTROL Control Statements: Branching and Looping In the C shell, control statements are special built-in commands that you enter around or near conventional commands, often on separate lines. They let you tell the shell how to control the execution of the commands they affect. This means that instead of having the shell execute all your commands one after another as it does normally, you can have it skip some or repeat some. This ability to skip or repeat commands, called branching and looping, respectively, is more powerful than it might sound at first. Branching control statements are truly useful only inside shell scripts, where the shell reads and executes commands from a file instead of from the terminal. Inside the script they serve mainly to let you store several different command sequences compactly, specifying ahead of time on what basis you want the shell to skip or choose certain statements while the script is running. The basis for decisions made while the script is running is usually furnished by such things as arguments given to the script, the time of day, whether a certain file exists, etc. You enter decision points using a special notation called expressions. Looping control statements are useful both inside shell scripts and when entered from the terminal. Although they are used to execute commands repeatedly, you seldom want to repeat the exact same command, but usually one that is close to it. To accomplish this you use a special notation called variables to indicate where in a command the shell is to substitute the piece of text that is to change when the command is repeated. In general you can best exploit control statements by using them non-interactively, that is, by putting them in command scripts so that they decide in your absence which commands to execute, when, and how often. On the other hand, you can use some of the looping statements interactively to great advantage. How can a script of commands do different things from one run to the next if the text of the commands does not change? The answer is: Only if parts of the commands and decisions have been entered by you with the special notation for variables and expressions. You need a basic working knowledge of variables and expressions to understand control statements. The next example illustrates the only control structure not requiring any special knowledge. A Simple Example Normally in a shell script or a terminal session, when the shell finishes doing one command, it proceeds to do the command on the next line, and then the next, and so forth. This usually continues until the end of the script file is reached or the end of the terminal session (logout). Suppose that instead of this normal behavior you wanted to execute the same command 10 times. In particular, suppose that you wanted 10 lineprinted copies of the file "letter". The line, .IP repeat 10 lpr letter .LP would use the control statement "repeat" to cause the command "lpr letter" to be run 10 times. Lines such as these are sometimes called control structures because they consist of a control part and a command part. A More Complicated Example Most control structures involve several lines and require variables. Here is an example of a 4-line structure which looks up some words in the Unix dictionary and then displays on the terminal all occurrences of them in a file called "paper". .IP 'nf foreach x (sharp ternimal prestidigitation aint unix) look $x grep $x paper end .fi .LP The "foreach" line tells the shell that it must first scan the succeeding lines until it finds one with the keyword "end". The two intervening commands will be executed in the order they appear as many times as there are words in parentheses. Before each cycle of the loop begins, the variable "x" is set to the next word in the list. In the first cycle, the shell executes the two commands using "$x" to signify the word "sharp", that is, it performs the command "look sharp" followed by "grep sharp paper". Then the shell proceeds with the next word on the list, this time performing the two commands using "$x" to signify the mispelled word "ternimal". This continues for each word on the list until they have all been used, at which point the shell resumes its normal operation by proceeding to the next command, if any, after the "end" statement. A Summary of Control Statements There is rarely a situation that requires the use of one kind of control statement over another, although some kinds are more convenient than others in certain circumstances. Here is a summary of the C shell's control statements indicating the general form of the first line, special associated words that accompany the statement on other lines, whether it is practical to use interactively, and the usual purpose. .nf .ne 8 Name and First Line Associated Keywords Intr? Usual Purpose --------------------------------------------------------------------------- if (expression) then, else, endif no conditional branching repeat N command yes simple repetition while (expression) end, break, continue yes ... until a condition foreach x (list) end, break, continue yes ... for each item in list goto label label: no unconditional branching switch (expression) case, endsw, breaksw no multi-way value branch .fi In the examples that follow I will use C shell variables and expressions as if you were already somewhat familiar with them. Additional explanation sometimes appears in the form of C shell comments (annotations) appended to command lines after a # sign. Also, I have followed a common convention in programming languages in indenting those commands grouped inside control structures. The \fBif\fP Statement This has several different forms, the first one being .IP \fBif (\fPexpression\fB)\fP command .LP where the \fIcommand\fP is executed only if the \fIexpression\fP is true. Here is a script fragment that informs the user of the number of files in the current directory. .IP 'nf set x = (*) # variable x gets list of files echo -n "There are $#x file" # $#x is number of files in $x if ($#x > 1) echo -n "s" # if more than 1, make plural echo " in this directory." # end sentence; no -n so add newline .fi .LP Unfortunately, the first form requires the \fIcommand\fP to be a simple command; in other words, no pipelines or multiple commands are allowed, even if they are aliased. You can avoid this restriction by calling a multi-line script in place of \fIcommand\fP, but that takes time. It would be more efficient to use the second form of the if statement, .IP 'nf \fBif (\fPexpression\fB) then \fPcommands\fB ... else if (\fPexpression\fB) then \fPcommands\fB ... \&... else \fPcommands\fB ... endif\fP .fi .LP The hardest part about this form is to remember to type the keyword \fBthen\fP; even the most experienced users trip over it fairly often. Here is an example of a multi-line \fBif\fP statement which checks to see if a file called "readme" exists and has read permission; if so it displays it, if not it takes other steps. .IP 'nf if (-e readme && -r readme) then # if it exists and is readable echo "I have a message." # announce that fact cat readme # and display the file else if (-e readme) then # else if it exists only ... echo "I have a message, but cannot read it." exit # complain and exit script else # else if it doesn't even exist /usr/games/fortune # do something irrational endif # this ends the multi-line if .fi .LP The \fBrepeat\fP Statement As you may have gathered, the form of this control statement is .IP \fBrepeat \fIN command\fR .LP Unfortunately, the \fIcommand\fP part suffers the same restrictions here as in the one-line if statement. As a result, it is hard to find uses for the repeat statement. One example cited occasionally uses it to make 100 copies of one file in another file, as in .IP repeat 100 cat data >> bigdata .LP A much more efficient way to do this is .IP cat `jot -b data 100` > bigdata .LP using command substitution (those quote marks should be grave accents). Similarly, our earlier example with "lpr" would have been better done with .IP lpr `jot -b letter 10` .LP The \fBwhile\fP Statement The general form of the while statement is .IP 'nf \fBwhile (\fPexpression\fB) commands ... end\fP .fi .LP There are no clear cut criteria for determining when to use this control statement over another, but in general, use it when you want to loop through a sequence of commands until a certain condition becomes false. On each iteration (cycle through the loop), if \fIexpression\fP is true the shell executes the commands up until the "end" statement, then begins the next iteration. When \fIexpression\fP becomes false, execution continues after the "end". You can enter the while statement at your terminal as well as in a script. At the terminal, the shell cannot begin processing until the entire body of commands up to the "end" have been entered, so each time you press RETURN, it reminds you that it still expects you to type in "end" sometime by prompting with a "?". For example, to print out the numbers from 1 to 300 on your terminal with an interactive while loop, .IP 'nf % set n = 1 # variable n starts out 1 % while ($n < 301) # while it's less than 301 ? echo $n # print it out; shell prompt is "?" ? @ n = ($n + 1) # @ is like set, but do arithmetic ? end # end of loop; loop starts after RETURN .fi .LP A better way to count to 300 would be "jot 300", but this example illustrates three points. First, most applications in the C shell that require doing a command sequence a certain number of times have this general form. The condition that the while loop is waiting to become false is an arithmetic relation involving a variable that is initialized before the loop, changes inside the loop, and is tested in the \fIexpression\fP. Secondly, any number of commands could have filled up the body of the loop, so this provides a way to get around the restriction in the repeat statement of repeating only a simple command. The counting variable, "n" in this case, need not always be used except to be incremented. Finally, the variable could have been used to count down instead of up, to increase by larger increments, or to change using any arithmetic operation. The \fIexpression\fP could easily be adjusted to test for different arithmetic relations. As another example, suppose several users need to edit a file called "project" from time to time, the contents of which are important enough that only one user should be allowed to edit it at a time. All the users involved could accomplish this by agreeing among themselves only to edit "project" using the script below. The while loop causes the shell to check every minute whether a file called "lock" exists. When it no longer exists, the shell creates the file again, letting other users know that someone is currently editing "project", and calls up the editor. The lock file is removed after editing, and the script is done. .IP 'nf while (-e lock) # while file "lock" exists ... echo Trying # tells you that you (still) must wait sleep 60 # do nothing for 60 seconds end # end of loop; when "lock" gone, proceed date > lock # recreate "lock", no matter what with vi project # call up vi on data being secured rm lock # let others know they can edit "project" .fi .LP The \fBforeach\fP Statement The general form of the foreach statement is .IP 'nf \fBforeach \fPvar\fB (\fPword1 word2 \fB...) commands ... end\fR .fi .LP As for the while statement there are no clear cut criteria for determining when to use the foreach statement over another, but in general, use it when you want to loop through a sequence of commands to be applied to a list of items. On each iteration, the variable \fIvar\fP is set to the next word in the parenthesized list. The shell executes the commands up until the "end" statement, then begins the next iteration. When there are no unused words left in the list, execution continues after the "end". The variable \fIvar\fP may be any name you choose, and you are not required to use it in the body of the loop. Also, the foreach statement is probably used interactively more often than any other control structure. For example, during a terminal session you could mail the files "data1", "data2", etc. up to "data7" to a user named "fred" using .IP 'nf % foreach f (data1 data2 data3 data4 data5 data6 data7) ? mail fred < $f ? end .fi .LP The list could have been abbreviated with (data[1-7]). All the C shell control structures can be nested (used inside one another). As an example, the following script copies each of the data files above into each of the directories "old", "new", and "current". .IP 'nf foreach d (old new current) # outer loop for directories mkdir $d # make the target directory foreach f (data[1-7]) # inner loop for files cp $f $d # copy (is this efficient?) end # end of inner loop end # end of outer loop .fi .LP This example above is actually rather inefficient because the inner loop (3 statements) could have been replaced with the single command "cp data[1-7] $d", so that the shell would only create one "cp" process per directory instead of 7. The lesson to learn is that however tempted you may be to use the foreach statement, check to see if the command involved takes more than one argument. The \fBgoto\fP and \fBonintr\fP Statements The goto statement consists of the word \fBgoto\fP followed be a \fIlabel\fP and causes the shell to jump to a line beginning with the string \fIlabel\fP followed by a : and to begin executing statements there. Strictly speaking all of the C shell's branching and looping constructs could be built up using combinations of just goto and simple if statements. Computer scientists tend to be fanatical in their opposition to the goto statement, so try not use it in their presence. Here is an example of a script that runs "sed" twice on each of the command line arguments. When there are no arguments left, the script removes a temporary file it created and exits. If an argument file is really a directory, an error message is printed and the temporary file is not removed. .IP 'nf foreach f ($argv) # foreach command argument if (-d $f) goto error # if a directory, goto error sed -f pass1 $f > /tmp/xyz$$ # sed stores in /tmp/xyz$$ sed -f pass2 /tmp/xyz$$ > $f # then stores back in $f end # end of foreach goto done: # skip the error part error: # just label, no other effect echo Error - $f is a directory. # print message goto end: # exit script avoiding "rm" done: # just a label rm /tmp/xyz$$ # remove temporary file end: # last label, nothing after it .LP Besides being clumsy, this script would not remove its temporary file if the user typed an interrupt to abort the execution prematurely; the script would just drop in its tracks. This is remedied by the onintr statement, which has the form \fBonintr \fIlabel\fR, where \fIlabel\fP is a label to jump to whenever the script receives a subsequent interrupt. The next example illustrates this, and also converts the goto statements into more socially acceptable statements. .IP 'nf onintr done # on interrupt, goto done foreach f ($argv) # foreach command argument if (-d $f) then # if a directory then ... echo Error - $f is a directory. exit # exit, but leave temp file endif # end of if statement sed -f pass1 $f > /tmp/xyz$$ # sed stores in /tmp/xyz$$ sed -f pass2 /tmp/xyz$$ > $f # then stores back in $f end # end of foreach done: # label for onintr rm /tmp/xyz$$ # remove temporary file .fi .LP The \fBswitch\fP Statement This is like a multi-line if statement which branches only on the value of one expression. It is usually most useful for visually separating the different branches taken in response to different values of the same thing. Without much further ado, here is a meaningless script fragment that processes its arguments through a while loop using a switch statement. Just before the each next interaction of the loop, the shift command causes the current value of "$argv[1]" to be discarded and replaced with "$argv[2]". .IP 'nf while ($#argv > 0) switch ($argv[1]) case -m: set m breaksw case -l*: set length = $argv[1] breaksw case -T: set sort = (/usr/lib/sort) set lpr = (/usr/ucb/lpr -Pevans1) breaksw case -Y: set sort = (/usr/lib/vsort -W) set lpr = (/usr/ucb/lpr -Pversatec) breaksw case -F: if ($#argv < 2) then echo -F takes following port name. exit(1) endif set argv = (-1 $2.f -2 $2.p -3 $2.o $argv[3-]) breaksw case -1: case -2: case -3: if ($#argv < 2) then echo $1 takes following port name. exit(1) endif breaksw case -x: set xyzzy breaksw case -*: set flags = ($flags $argv[1]) breaksw endsw shift argv end .fi .LP jak