--- gcc/internals.texinfo 2018/04/24 16:37:52 1.1 +++ gcc/internals.texinfo 2018/04/24 16:43:51 1.1.1.8 @@ -6,7 +6,7 @@ @ifinfo This file documents the internals of the GNU compiler. -Copyright (C) 1987 Richard M. Stallman. +Copyright (C) 1988 Free Software Foundation, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice @@ -27,9 +27,9 @@ distributed under the terms of a permiss Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, -except that the section entitled ``GNU CC General Public License'' may be -included in a translation approved by the author instead of in the original -English. +except that the section entitled ``GNU CC General Public License'' and +this permission notice may be included in translations approved by the +Free Software Foundation instead of in the original English. @end ifinfo @setchapternewpage odd @@ -38,9 +38,13 @@ English. @center @titlefont{Internals of GNU CC} @sp 2 @center Richard M. Stallman +@sp 3 +@center last updated 6 September 1988 +@sp 1 +@center for version 1.27 @page @vskip 0pt plus 1filll -Copyright @copyright{} 1987 Richard M. Stallman. +Copyright @copyright{} 1988 Free Software Foundation, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice @@ -61,28 +65,34 @@ English. @page @ifinfo -@node Top, Switches, , (DIR) - -Introduction -************ +@node Top, Copying,, (DIR) +@ichapter Introduction -This manual documents how to install and port the GNU C compiler. +This manual documents how to run, install and port the GNU C compiler, as +well as its new features and incompatibilities, and how to report bugs. @end ifinfo @menu * Copying:: GNU CC General Public License says how you can copy and share GNU CC. -* Switches:: Command switches supported by @samp{gcc}. 
+* Contributors:: People who have contributed to GNU CC. +* Options:: Command options supported by @samp{gcc}. * Installation:: How to configure, compile and install GNU CC. +* Trouble:: If you have trouble installing GNU CC. +* Incompatibilities:: Incompatibilities of GNU CC. +* Extensions:: GNU extensions to the C language. +* Bugs:: How to report bugs (if you want to get them fixed). * Portability:: Goals of GNU CC's portability features. +* Interface:: Function-call interface of GNU CC output. * Passes:: Order of passes, what they do, and what each file is for. * RTL:: The intermediate representation that most passes work on. * Machine Desc:: How to write machine description instruction patterns. * Machine Macros:: How to write the machine description C macros. @end menu -@node Copying, Switches, Top, Top +@node Copying, Contributors, Top, Top @unnumbered GNU CC GENERAL PUBLIC LICENSE +@center (Clarified 11 Feb 1988) The license agreements of most software companies keep you at the mercy of those companies. By contrast, our general public license is @@ -108,7 +118,7 @@ someone else and passed on, we want its they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation. - Therefore we (Richard Stallman and the Free Software Fundation, + Therefore we (Richard Stallman and the Free Software Foundation, Inc.) make the following terms which say what you must do to be allowed to distribute or change GNU CC. @@ -119,12 +129,12 @@ allowed to distribute or change GNU CC. 
You may copy and distribute verbatim copies of GNU CC source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy a valid copyright notice -``Copyright @copyright{} 1987 Free Software Foundation, Inc.'' (or -with the year updated if that is appropriate); keep intact the notices -on all files that refer to this License Agreement and to the absence -of any warranty; and give any other recipients of the GNU CC program a -copy of this License Agreement along with the program. You may charge -a distribution fee for the physical act of transferring a copy. +``Copyright @copyright{} 1988 Free Software Foundation, Inc.'' (or +with whatever year is appropriate); keep intact the notices on all +files that refer to this License Agreement and to the absence of any +warranty; and give any other recipients of the GNU CC program a copy +of this License Agreement along with the program. You may charge a +distribution fee for the physical act of transferring a copy. @item You may modify your copy or copies of GNU CC or any portion of it, @@ -137,13 +147,12 @@ cause the modified files to carry promin that you changed the files and the date of any change; and @item -cause the whole of any work that you distribute or publish, -that in whole or in part contains or is a derivative of GNU CC or -any part thereof, to be licensed at no charge to all third -parties on terms identical to those contained in this License -Agreement (except that you may choose to grant more extensive -warranty protection to some or all third parties, at your -option). +cause the whole of any work that you distribute or publish, that +in whole or in part contains or is a derivative of GNU CC or any +part thereof, to be licensed at no charge to all third parties on +terms identical to those contained in this License Agreement +(except that you may choose to grant more extensive warranty +protection to some or all third parties, at your option). 
@item You may charge a distribution fee for the physical act of @@ -151,54 +160,64 @@ transferring a copy, and you may at your protection in exchange for a fee. @end itemize +Mere aggregation of another unrelated program with this program (or its +derivative) on a volume of a storage or distribution medium does not bring +the other program under the scope of these terms. + @item -You may copy and distribute GNU CC or any portion of it in -compiled, executable or object code form under the terms of Paragraphs -1 and 2 above provided that you do the following: +You may copy and distribute GNU CC (or a portion or derivative of it, +under Paragraph 2) in object code or executable form under the terms +of Paragraphs 1 and 2 above provided that you also do one of the +following: @itemize @bullet @item -cause each such copy to be accompanied by the -corresponding machine-readable source code, which must -be distributed under the terms of Paragraphs 1 and 2 above; or, - -@item -cause each such copy to be accompanied by a -written offer, with no time limit, to give any third party -free (except for a nominal shipping charge) a machine readable -copy of the corresponding source code, to be distributed -under the terms of Paragraphs 1 and 2 above; or, - -@item -in the case of a recipient of GNU CC in compiled, executable -or object code form (without the corresponding source code) you -shall cause copies you distribute to be accompanied by a copy -of the written offer of source code which you received along -with the copy you received. 
+accompany it with the complete corresponding machine-readable +source code, which must be distributed under the terms of +Paragraphs 1 and 2 above; or, + +@item +accompany it with a written offer, valid for at least three +years, to give any third party free (except for a nominal +shipping charge) a complete machine-readable copy of the +corresponding source code, to be distributed under the terms of +Paragraphs 1 and 2 above; or, + +@item +accompany it with the information you received as to where the +corresponding source code may be obtained. (This alternative is +allowed only for noncommercial distribution and only if you +received the program in object code or executable form alone.) @end itemize +For an executable file, complete source code means all the source code +for all modules it contains; but, as a special exception, it need not +include source code for modules which are standard libraries that +accompany the operating system on which the executable file runs. + @item -You may not copy, sublicense, distribute or transfer GNU CC -except as expressly provided under this License Agreement. Any attempt -otherwise to copy, sublicense, distribute or transfer GNU CC is void and -your rights to use the program under this License agreement shall be -automatically terminated. However, parties who have received computer -software programs from you with this License Agreement will not have -their licenses terminated so long as such parties remain in full compliance. +You may not copy, sublicense, distribute or transfer GNU CC except as +expressly provided under this License Agreement. Any attempt +otherwise to copy, sublicense, distribute or transfer GNU CC is void +and your rights to use the program under this License agreement shall +be automatically terminated. However, parties who have received +computer software programs from you with this License Agreement will +not have their licenses terminated so long as such parties remain in +full compliance. 
@item If you wish to incorporate parts of GNU CC into other free programs whose distribution conditions are different, write to the Free Software -Foundation at 1000 Mass Ave, Cambridge, MA 02138. We have not yet worked +Foundation at 675 Mass Ave, Cambridge, MA 02139. We have not yet worked out a simple rule that can be stated here, but we will often permit this. We will be guided by the two goals of preserving the free status of all -derivatives our free software and of promoting the sharing and reuse of +derivatives of our free software and of promoting the sharing and reuse of software. @end enumerate Your comments and suggestions about our licensing policies and our software are welcome! Please contact the Free Software Foundation, Inc., -1000 Mass Ave, Cambridge, MA 02138, or call (617) 876-3296. +675 Mass Ave, Cambridge, MA 02139, or call (617) 876-3296. @unnumberedsec NO WARRANTY @@ -223,45 +242,631 @@ FAILURE OF THE PROGRAM TO OPERATE WITH A IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES, OR FOR ANY CLAIM BY ANY OTHER PARTY. -@node Switches, Installation, Copying, Top -@chapter GNU CC Switches +@node Contributors, Options, Copying, Top +@unnumbered Contributors to GNU CC + +In addition to Richard Stallman, several people have written parts +of GNU CC. + +@itemize @bullet +@item +The idea of using RTL and some of the optimization ideas came from the +U. of Arizona Portable Optimizer, written by Jack Davidson and +Christopher Fraser. See ``Register Allocation and Exhaustive Peephole +Optimization'', Software Practice and Experience 14 (9), Sept. 1984, +857-866. + +@item +Paul Rubin wrote most of the preprocessor. + +@item +Leonard Tower wrote parts of the parser, RTL generator, RTL +definitions, and of the Vax machine description. + +@item +Ted Lemon wrote parts of the RTL reader and printer. + +@item +Nobuyuki Hikichi of Software Research Associates, Tokyo, contributed +the support for the SONY NEWS machine. 
+ +@item +Charles LaBrec contributed the support for the Integrated Solutions +68020 system. + +@item +Michael Tiemann of MCC wrote most of the description of the National +Semiconductor 32000 series cpu. He also wrote the code for inline +function integration and for the SPARC cpu and Motorola 88000 cpu +and part of the Sun FPA support. + +@item +Jan Stein of the Chalmers Computer Society provided support for +Genix, as well as part of the 32000 machine description. + +@item +Randy Smith finished the Sun FPA support. + +@item +Robert Brown implemented the support for Encore 32000 systems. + +@item +David Kashtan of SRI adapted GNU CC to the Vomit-Making System. + +@item +Alex Crain provided changes for the 3b1. + +@item +Greg Satz and Chris Hanson assisted in making GNU CC work on HP-UX for +the 9000 series 300. + +@item +William Schelter did most of the work on the Intel 80386 support. +@end itemize + +@node Options, Installation, Contributors, Top +@chapter GNU CC Command Options + +The GNU C compiler uses a command syntax much like the Unix C compiler. +The @code{gcc} program accepts options and file names as operands. +Multiple single-letter options may @emph{not} be grouped: @samp{-dr} is +very different from @samp{-d -r}. + +When you invoke GNU CC, it normally does preprocessing, compilation, +assembly and linking. File names which end in @samp{.c} are taken as C +source to be preprocessed and compiled; compiler output files plus any +input files with names ending in @samp{.s} are assembled; then the +resulting object files, plus any other input files, are linked together to +produce an executable. + +Command options allow you to stop this process at an intermediate stage. +For example, the @samp{-c} option says not to run the linker. Then the +output consists of object files output by the assembler. + +Other command options are passed on to one stage. Some options control +the preprocessor and others the compiler itself. 
Yet other options +control the assembler and linker; these are not documented here because the +GNU assembler and linker are not yet released. + +Here are the options to control the overall compilation process, including +those that say whether to link, whether to assemble, and so on. + +@table @samp +@item -o @var{file} +Place output in file @var{file}. This applies regardless of what +sort of output is being produced, whether it be an executable file, +an object file, an assembler file or preprocessed C code. + +If @samp{-o} is not specified, the default is to put an executable file +in @file{a.out}, the object file for @file{@var{source}.c} in +@file{@var{source}.o}, an assembler file in @file{@var{source}.s}, and +preprocessed C on standard output.@refill + +@item -c +Compile or assemble the source files, but do not link. Produce object +files with names made by replacing @samp{.c} or @samp{.s} with +@samp{.o} at the end of the input file names. Do nothing at all for +object files specified as input. + +@item -S +Compile into assembler code but do not assemble. The assembler output +file name is made by replacing @samp{.c} with @samp{.s} at the end of +the input file name. Do nothing at all for assembler source files or +object files specified as input. + +@item -E +Run only the C preprocessor. Preprocess all the C source files +specified and output the results to standard output. + +@item -v +Compiler driver program prints the commands it executes as it runs +the preprocessor, compiler proper, assembler and linker. Some of +these are directed to print their own version numbers. + +@item -B@var{prefix} +Compiler driver program tries @var{prefix} as a prefix for each +program it tries to run. These programs are @file{cpp}, @file{cc1}, +@file{as} and @file{ld}. + +For each subprogram to be run, the compiler driver first tries the +@samp{-B} prefix, if any.
If that name is not found, or if @samp{-B} +was not specified, the driver tries two standard prefixes, which are +@file{/usr/lib/gcc-} and @file{/usr/local/lib/gcc-}. If neither of +those results in a file name that is found, the unmodified program +name is searched for using the directories specified in your +@samp{PATH} environment variable. + +The run-time support file @file{gnulib} is also searched for using +the @samp{-B} prefix, if needed. If it is not found there, the two +standard prefixes above are tried, and that is all. The file is left +out of the link if it is not found by those means. Most of the time, +on most machines, you can do without it. +@end table + +These options control the details of C compilation itself. @table @samp +@item -ansi +Support all ANSI standard C programs. + +This turns off certain features of GNU C that are incompatible with +ANSI C, such as the @code{asm}, @code{inline} and @code{typeof} +keywords, and predefined macros such as @code{unix} and @code{vax} +that identify the type of system you are using. It also enables the +undesirable and rarely used ANSI trigraph feature. + +The @samp{-ansi} option does not cause non-ANSI programs to be +rejected gratuitously. For that, @samp{-pedantic} is required in +addition to @samp{-ansi}. + +The macro @code{__STRICT_ANSI__} is predefined when the @samp{-ansi} +option is used. Some header files may notice this macro and refrain +from declaring certain functions or defining certain macros that the +ANSI standard doesn't call for; this is to avoid interfering with +any programs that might use these names for other things. + +@item -traditional +Attempt to support some aspects of traditional C compilers. +Specifically: + +@itemize @bullet +@item +All @code{extern} declarations take effect globally even if they +are written inside of a function definition. This includes implicit +declarations of functions. 
+ +@item +The keywords @code{typeof}, @code{inline}, @code{signed}, @code{const} +and @code{volatile} are not recognized.@refill + +@item +Comparisons between pointers and integers are always allowed. + +@item +Integer types @code{unsigned short} and @code{unsigned char} promote +to @code{unsigned int}. + +@item +Out-of-range floating point literals are not an error. + +@item +In the preprocessor, comments convert to nothing at all, rather than +to a space. This allows traditional token concatenation. + +@item +In the preprocessor, macro arguments are recognized within string +constants in a macro definition (and their values are stringified, +though without additional quote marks, when they appear in such a +context). The preprocessor always considers a string constant to end +at a newline. +@end itemize + @item -O -Do optimize. +Optimize. Optimizing compilation takes somewhat more time, and a lot +more memory for a large function. + +Without @samp{-O}, the compiler's goal is to reduce the cost of +compilation and to make debugging produce the expected results. +Statements are independent: if you stop the program with a breakpoint +between statements, you can then assign a new value to any variable or +change the program counter to any other statement in the function and +get exactly the results you would expect from the source code. + +Without @samp{-O}, only variables declared @code{register} are +allocated in registers. The resulting compiled code is a little worse +than produced by PCC without @samp{-O}. + +With @samp{-O}, the compiler tries to reduce code size and execution +time. + +Some of the @samp{-f} options described below turn specific kinds of +optimization on or off. @item -g -Produce debugging information in DBX format. +Produce debugging information in the operating system's native +format (for DBX or SDB). -@item -c -Compile but do not link the object files. +Unlike most other C compilers, GNU CC allows you to use @samp{-g} with +@samp{-O}. 
The shortcuts taken by optimized code may occasionally +produce surprising results: some variables you declared may not exist +at all; flow of control may briefly move where you did not expect it; +some statements may not be executed because they compute constant +results or their values were already at hand; some statements may +execute in different places because they were moved out of loops. +Nevertheless it proves possible to debug optimized output. This makes +it reasonable to use the optimizer for programs that might have bugs. + +@item -gg +Produce debugging information in GDB's own format. This requires +the GNU assembler and linker in order to work. -@item -o @var{file} -Place linker output in file @var{file}. +@item -w +Inhibit all warning messages. -@item -S -Compile into assembler code but do not assemble. +@item -W +Print extra warning messages for these events: + +@itemize @bullet +@item +An automatic variable is used without first being initialized. + +These warnings are possible only in optimizing compilation, +because they require data flow information that is computed only +when optimizing. They occur only for variables that are +candidates for register allocation. Therefore, they do not occur +for a variable that is declared @code{volatile}, or whose address +is taken, or whose size is other than 1, 2, 4 or 8 bytes. Also, +they do not occur for structures, unions or arrays, even when +they are in registers. + +Note that there may be no warning about a variable that is used +only to compute a value that itself is never used, because such +computations may be deleted by the flow analysis pass before the +warnings are printed. + +These warnings are made optional because GNU CC is not smart +enough to see all the reasons why the code might be correct +despite appearing to have an error. 
Here is one example of how +this can happen: + +@example +@{ + int x; + switch (y) + @{ + case 1: x = 1; + break; + case 2: x = 4; + break; + case 3: x = 5; + @} + foo (x); +@} +@end example + +@noindent +If the value of @code{y} is always 1, 2 or 3, then @code{x} is +always initialized, but GNU CC doesn't know this. Here is +another common case: + +@example +@{ + int save_y; + if (change_y) save_y = y, y = new_y; + @dots{} + if (change_y) y = save_y; +@} +@end example + +@noindent +This has no bug because @code{save_y} is used only if it is set. + +@item +A nonvolatile automatic variable might be changed by a call to +@code{longjmp}. These warnings as well are possible only in +optimizing compilation. + +The compiler sees only the calls to @code{setjmp}. It cannot know +where @code{longjmp} will be called; in fact, a signal handler could +call it at any point in the code. As a result, you may get a warning +even when there is in fact no problem because @code{longjmp} cannot +in fact be called at the place which would cause a problem. + +@item +A function can return either with or without a value. (Falling +off the end of the function body is considered returning without +a value.) For example, this function would inspire such a +warning: + +@example +foo (a) +@{ + if (a > 0) + return a; +@} +@end example + +Spurious warnings can occur because GNU CC does not realize that +certain functions (including @code{abort} and @code{longjmp}) +will never return. +@end itemize + +In the future, other useful warnings may also be enabled by this +option. + +@item -Wimplicit +Warn whenever a function is implicitly declared. + +@item -Wreturn-type +Warn whenever a function is defined with a return-type that defaults +to @code{int}. Also warn about any @code{return} statement with no +return-value in a function whose return-type is not @code{void}. + +@item -Wcomment +Warn whenever a comment-start sequence @samp{/*} appears in a comment. 
+ +@item -Wall +All of the above @samp{-W} options combined. + +@item -Wwrite-strings +Give string constants the type @code{const char[@var{length}]} so that +copying the address of one into a non-@code{const} @code{char *} +pointer will get a warning. These warnings will help you find at +compile time code that can try to write into a string constant, but +only if you have been very careful about using @code{const} in +declarations and prototypes. Otherwise, it will just be a nuisance; +this is why we did not make @samp{-Wall} request these warnings. + +@item -p +Generate extra code to write profile information suitable for the +analysis program @code{prof}. + +@item -pg +Generate extra code to write profile information suitable for the +analysis program @code{gprof}. + +@item -l@var{library} +Search a standard list of directories for a library named +@var{library}, which is actually a file named +@file{lib@var{library}.a}. The linker uses this file as if it +had been specified precisely by name. + +The directories searched include several standard system directories +plus any that you specify with @samp{-L}. + +Normally the files found this way are library files---archive files +whose members are object files. The linker handles an archive file by +scanning through it for members which define symbols that have so far +been referenced but not defined. But if the file that is found is an +ordinary object file, it is linked in the usual fashion. The only +difference between using an @samp{-l} option and specifying a file name +is that @samp{-l} searches several directories. + +@item -L@var{dir} +Add directory @var{dir} to the list of directories to be searched +for @samp{-l}. + +@item -nostdlib +Don't use the standard system libraries and startup files when +linking. Only the files you specify (plus @file{gnulib}) will be +passed to the linker. @item -m@var{machinespec} -Machine-dependent switch specifying something about the type -of target machine. 
For example, using the 68000 machine description, -@samp{-m68000} specifies do not use the 68020 instructions, -and @samp{-msoft-float} specifies do not use the 68881 floating point -instructions. +Machine-dependent option specifying something about the type of target +machine. These options are defined by the macro +@code{TARGET_SWITCHES} in the machine description. The default for +the options is also defined by that macro, which enables you to change +the defaults.@refill + +These are the @samp{-m} options defined in the 68000 machine +description: + +@table @samp +@item -m68020 +@itemx -mc68020 +Generate output for a 68020 (rather than a 68000). This is the +default if you use the unmodified sources. + +@item -m68000 +@itemx -mc68000 +Generate output for a 68000 (rather than a 68020). + +@item -m68881 +Generate output containing 68881 instructions for floating point. +This is the default if you use the unmodified sources. + +@item -mfpa +Generate output containing Sun FPA instructions for floating point. + +@item -msoft-float +Generate output containing library calls for floating point. + +@item -mshort +Consider type @code{int} to be 16 bits wide, like @code{short int}. + +@item -mnobitfield +Do not use the bit-field instructions. @samp{-m68000} implies +@samp{-mnobitfield}. + +@item -mbitfield +Do use the bit-field instructions. @samp{-m68020} implies +@samp{-mbitfield}. This is the default if you use the unmodified +sources. + +@item -mrtd +Use a different function-calling convention, in which functions +that take a fixed number of arguments return with the @code{rtd} +instruction, which pops their arguments while returning. This +saves one instruction in the caller since there is no need to pop +the arguments there. + +This calling convention is incompatible with the one normally +used on Unix, so you cannot use it if you need to call libraries +compiled with the Unix compiler. 
+ +Also, you must provide function prototypes for all functions that +take variable numbers of arguments (including @code{printf}); +otherwise incorrect code will be generated for calls to those +functions. + +In addition, seriously incorrect code will result if you call a +function with too many arguments. (Normally, extra arguments are +harmlessly ignored.) + +The @code{rtd} instruction is supported by the 68010 and 68020 +processors, but not by the 68000. +@end table + +These @samp{-m} options are defined in the Vax machine description: + +@table @samp +@item -munix +Do not output certain jump instructions (@code{aobleq} and so on) +that the Unix assembler for the Vax cannot handle across long +ranges. + +@item -mgnu +Do output those jump instructions, on the assumption that you +will assemble with the GNU assembler. + +@item -mg +Output code for g-format floating point numbers instead of d-format. +@end table + +@item -f@var{flag} +Specify machine-independent flags. These are the flags: + +@table @samp +@item -ffloat-store +Do not store floating-point variables in registers. This +prevents undesirable excess precision on machines such as the +68000 where the floating registers (of the 68881) keep more +precision than a @code{double} is supposed to have. + +For most programs, the excess precision does only good, but a few +programs rely on the precise definition of IEEE floating point. +Use @samp{-ffloat-store} for such programs. + +@item -fno-asm +Do not recognize @code{asm}, @code{inline} or @code{typeof} as a +keyword. These words may then be used as identifiers. + +@item -fno-defer-pop +Always pop the arguments to each function call as soon as that +function returns. Normally the compiler (when optimizing) lets +arguments accumulate on the stack for several function calls and +pops them all at once. + +@item -fcombine-regs +Allow the combine pass to combine an instruction that copies one +register into another. 
This might or might not produce better +code when used in addition to @samp{-O}. I am interested in +hearing about the difference this makes. + +@item -fforce-mem +Force memory operands to be copied into registers before doing +arithmetic on them. This may produce better code by making all +memory references potential common subexpressions. When they are +not common subexpressions, instruction combination should +eliminate the separate register-load. I am interested in hearing +about the difference this makes. + +@item -fforce-addr +Force memory address constants to be copied into registers before +doing arithmetic on them. This may produce better code just as +@samp{-fforce-mem} may. I am interested in hearing about the +difference this makes. + +@item -fomit-frame-pointer +Don't keep the frame pointer in a register for functions that +don't need one. This avoids the instructions to save, set up and +restore frame pointers; it also makes an extra register available +in many functions. @strong{It also makes debugging impossible.} + +On some machines, such as the Vax, this flag has no effect, +because the standard calling sequence automatically handles the +frame pointer and nothing is saved by pretending it doesn't +exist. The machine-description macro +@code{FRAME_POINTER_REQUIRED} controls whether a target machine +supports this flag. @xref{Registers}.@refill + +@item -finline-functions +Integrate all simple functions into their callers. The compiler +heuristically decides which functions are simple enough to be +worth integrating in this way. + +If all calls to a given function are integrated, and the function +is declared @code{static}, then the function is normally not +output as assembler code in its own right. + +@item -fkeep-inline-functions +Even if all calls to a given function are integrated, and the +function is declared @code{static}, nevertheless output a +separate run-time callable version of the function. 
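As a rough illustration of the two inlining flags described above, here is a small translation unit. It is a sketch only, not taken from the GNU CC sources, and the function names `square` and `sum_of_squares` are hypothetical. Whether the call is actually integrated is up to the compiler's heuristics.

```c
/* Hypothetical helper: small and declared static, so the compiler may
   judge it simple enough to integrate into its callers. */
static int square(int x)
{
    return x * x;
}

/* Caller: when optimizing with -finline-functions, each call to
   square() below may be replaced by the multiplication itself.  If
   every call is integrated, a separate copy of square() need not be
   output unless -fkeep-inline-functions is also given. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

One way to observe the effect, assuming the file is saved as `inline.c`, is to compile it with `gcc -O -finline-functions -S inline.c`, then again with `-fkeep-inline-functions` added, and compare the two `inline.s` files for a separate definition of `square`.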
+ +@item -fwritable-strings +Store string constants in the writable data segment and don't +uniquize them. This is for compatibility with old programs which +assume they can write into string constants. Writing into string +constants is a very bad idea; ``constants'' should be constant. + +@item -fno-function-cse +Do not put function addresses in registers; make each instruction +that calls a constant function contain the function's address +explicitly. + +This option results in less efficient code, but some strange +hacks that alter the assembler output may be confused by the +optimizations performed when this option is not used. + +@item -fvolatile +Consider all memory references through pointers to be volatile. + +@item -funsigned-char +Let the type @code{char} be unsigned, like @code{unsigned +char}. + +Each kind of machine has a default for what @code{char} should +be. It is either like @code{unsigned char} by default or like +@code{signed char} by default. (Actually, at present, the +default is always signed.) + +The type @code{char} is always a distinct type from either +@code{signed char} or @code{unsigned char}, even though its +behavior is always just like one of those two. + +@item -fsigned-char +Let the type @code{char} be signed, like @code{signed char}. + +@item -ffixed-@var{reg} +Treat the register named @var{reg} as a fixed register; generated +code should never refer to it (except perhaps as a stack pointer, +frame pointer or in some other fixed role). + +@var{reg} must be the name of a register. The register names +accepted are machine-specific and are defined in the +@code{REGISTER_NAMES} macro in the machine description macro +file. + +@item -fcall-used-@var{reg} +Treat the register named @var{reg} as an allocatable register +that is clobbered by function calls. It may be allocated for +temporaries or variables that do not live across a call. +Functions compiled this way will not save and restore the +register @var{reg}. 
+ +Use of this flag for a register that has a fixed pervasive role +in the machine's execution model, such as the stack pointer or +frame pointer, will produce disastrous results. + +@item -fcall-saved-@var{reg} +Treat the register named @var{reg} as an allocatable register +saved by functions. It may be allocated even for temporaries or +variables that live across a call. Functions compiled this way +will save and restore the register @var{reg} if they use it. + +Use of this flag for a register that has a fixed pervasive role +in the machine's execution model, such as the stack pointer or +frame pointer, will produce disastrous results. + +A different sort of disaster will result from the use of this +flag for a register in which function values may be returned. +@end table @item -d@var{letters} Says to make debugging dumps at times specified by @var{letters}. Here are the possible letters: @table @samp -@item t -Dump syntax-tree. @item r Dump after RTL generation. @item j Dump after first jump optimization. +@item J +Dump after last jump optimization. @item s -Dump after CSE. +Dump after CSE (including the jump optimization that sometimes +follows CSE). @item L Dump after loop optimization. @item f @@ -272,122 +877,1562 @@ Dump after instruction combination. Dump after local register allocation. @item g Dump after global register allocation. +@item m +Print statistics on memory usage, at the end of the run. @end table @item -pedantic -Attempt to support strict ANSI standard C. Valid ANSI standard C -programs should compile properly with or without this switch. -However, without this switch, certain useful or traditional constructs -banned by the standard are supported. With this switch, they are -rejected. There is no reason to use this switch; it exists only -to satisfy pedants. - -@item E -Preprocess the input files and output the results to standard output. - -@item C -Tell the preprocessor not to discard comments. Used with the @samp{-E} -switch.
+Issue all the warnings demanded by strict ANSI standard C; reject +all programs that use forbidden extensions. + +Valid ANSI standard C programs should compile properly with or without +this option (though a rare few will require @samp{-ansi}). However, +without this option, certain GNU extensions and traditional C features +are supported as well. With this option, they are rejected. There is +no reason to @i{use} this option; it exists only to satisfy pedants. +@end table + +These options control the C preprocessor, which is run on each C source +file before actual compilation. If you use the @samp{-E} option, nothing +is done except C preprocessing. Some of these options make sense only +together with @samp{-E} because they request preprocessor output that is +not suitable for actual compilation. + +@table @samp +@item -C +Tell the preprocessor not to discard comments. Used with the +@samp{-E} option. -@item I@var{dir} +@item -I@var{dir} Search directory @var{dir} for include files. -@item D@var{macro} +@item -I- +Any directories specified with @samp{-I} options before the @samp{-I-} +option are searched only for the case of @samp{#include "@var{file}"}; +they are not searched for @samp{#include <@var{file}>}. + +If additional directories are specified with @samp{-I} options after +the @samp{-I-}, these directories are searched for all @samp{#include} +directives. (Ordinarily @emph{all} @samp{-I} directories are used +this way.) + +In addition, the @samp{-I-} option inhibits the use of the current +directory as the first search directory for @samp{#include +"@var{file}"}. Therefore, the current directory is searched only if +it is requested explicitly with @samp{-I.}. Specifying both +@samp{-I-} and @samp{-I.} allows you to control precisely which +directories are searched before the current one and which are searched +after. + +@item -nostdinc +Do not search the standard system directories for header files. 
Only +the directories you have specified with @samp{-I} options (and the +current directory, if appropriate) are searched. + +Between @samp{-nostdinc} and @samp{-I-}, you can eliminate all +directories from the search path except those you specify. + +@item -M +Tell the preprocessor to output a rule suitable for @code{make} +describing the dependencies of each source file. For each source +file, the preprocessor outputs one @code{make}-rule whose target is +the object file name for that source file and whose dependencies are +all the files @samp{#include}d in it. This rule may be a single line +or may be continued with @samp{\}-newline if it is long. + +@samp{-M} implies @samp{-E}. + +@item -MM +Like @samp{-M} but the output mentions only the user-header files +included with @samp{#include "@var{file}"}. System header files +included with @samp{#include <@var{file}>} are omitted. + +@samp{-MM} implies @samp{-E}. + +@item -D@var{macro} Define macro @var{macro} with the empty string as its definition. -@item D@var{macro}=@var{defn} +@item -D@var{macro}=@var{defn} Define macro @var{macro} as @var{defn}. -@item U@var{macro} +@item -U@var{macro} Undefine macro @var{macro}. -@item w -Inhibit warning messages. - -@item v -Compiler driver program prints the commands it executes as it runs -the preprocessor, compiler proper, assembler and linker. - -@item B@var{prefix} -Compiler driver program tries @var{prefix} as a prefix for each program -it tries to run. These programs are @file{cpp}, @file{cc1}, -@file{as} and @file{ld}. - -For each subprogram to be run, the compiler driver first tries the -@samp{-B} prefix, if any. If that name is not found, or if @samp{-B} -was not specified, the driver tries two standard prefixes, which are -@file{/usr/lib/gcc-} and @file{/usr/local/lib/gcc-}. If neither of -those results in a file name that is found, the unmodified program -name is searched for using the @samp{PATH} environment variable. +@item -T +Support ANSI C trigraphs. 
You don't want to know about this +brain-damage. The @samp{-ansi} option also has this effect. @end table -@node Installation, Portability, Switches, Top +@node Installation, Trouble, Options, Top @chapter Installing GNU CC +Here is the procedure for installing GNU CC on a Unix system. +@menu +* VMS Install:: See below for installation on VMS. +@end menu +@iftex +(See below for VMS.) +@end iftex + @enumerate @item +Edit @file{Makefile}. If you are using HPUX, or any form of system V, +you must make a few changes described in comments at the beginning of +the file. + +@item +On a Sequent system, go to the Berkeley universe. + +@item Choose configuration files. @itemize @bullet @item -Make a symbolic link from file @file{config.h} to the top-level -config file for the machine you are using. Its name should be -@file{config-@var{machine}.h}. This file is responsible for -defining information about the host machine. It includes -@file{tm.h}. +Make a symbolic link named @file{config.h} to the top-level +config file for the machine you are using (@pxref{Config}). This +file is responsible for defining information about the host +machine. It includes @file{tm.h}. + +The file's name should be @file{config-@var{machine}.h}, with these +exceptions: + +@table @file +@item config-vms.h +for vaxen running VMS. +@item config-vaxv.h +for vaxen running system V. +@item config-i386v.h +for Intel 80386's running system V. +@item config-sun4.h +for Suns (model 3 or 4) running @emph{operating system} version 4. +@item config-hp9k3.h +for the HP 9000 series 300. +@item config-gnx.h +for the ns32000 running Genix +@end table + +If your system does not support symbolic links, you might want to +set up @file{config.h} to contain a @samp{#include} command which +refers to the appropriate file. 
@item -Make a symbolic link from @file{tm.h} to the machine-description +Make a symbolic link named @file{tm.h} to the machine-description macro file for your machine (its name should be @file{tm-@var{machine}.h}). +If your system is a 68000, don't use the file @file{tm-m68k.h} +directly. Instead, use one of these files: + +@table @file +@item tm-sun3.h +for Sun 3 machines. +@item tm-sun2.h +for Sun 2 machines. +@item tm-3b1.h +for AT&T 3b1 (aka 7300 Unix PC). +@item tm-isi68.h +for Integrated Solutions systems. +@item tm-news800.h +for SONY News systems. +@item tm-hp9k320.h +for HPUX systems, if you are using GNU CC with the system's +assembler and linker. +@item tm-hp9k320g.h +for HPUX systems, if you are using the GNU assembler, linker and +other utilities. Not all of the pieces of GNU software needed +for this mode of operation are as yet in distribution; full +instructions will appear here in the future.@refill +@end table + +For the vax, use @file{tm-vax.h} on BSD Unix, @file{tm-vaxv.h} on +system V, or @file{tm-vms.h} on VMS.@refill + +For the SPARC (Sun 4), use @file{tm-sparc.h}. + +For the Motorola 88000, use @file{tm-m88k.h}. The support for the +88000 has a few unfinished spots because there was no way to run the +output. Bugs are suspected in handling of branch-tables and in +the function prologue and epilogue. + +For the 80386, don't use @file{tm-i386.h} directly. Use +@file{tm-i386v.h} if the target machine is running system V, +@file{tm-seq386.h} for a Sequent 386 system, or @file{tm-compaq.h} for +a Compaq. + +For the 32000, use @file{tm-sequent.h} if you are using a Sequent +machine, or @file{tm-encore.h} for an Encore machine, or +@file{tm-gnx.h} if you are using Genix version 3; otherwise, perhaps +@file{tm-ns32k.h} will work for you. + +Note that Genix has bugs in @code{alloca} and @code{malloc}; you must +get the compiled versions of these from GNU Emacs and edit GNU CC's +@file{Makefile} to use them. 
+ +Note that Encore systems are supported only under BSD. + @item -Make a symbolic link from @file{md} to the -machine description pattern file (its name should be -@file{@var{machine}.md}). +Make a symbolic link named @file{md} to the machine description +pattern file (its name should be @file{@var{machine}.md}). @item -Make a symbolic link from -@file{aux-output.c} to the output-subroutine file for your machine -(its name should be @file{@var{machine}-output.c}). +Make a symbolic link named @file{aux-output.c} to the output +subroutine file for your machine (its name should be +@file{output-@var{machine}.c}). @end itemize @item -Make sure the Bison parser generator is installed. +Make sure the Bison parser generator is installed. (This is +unnecessary if the Bison output files @file{c-parse.tab.c} and +@file{cexp.c} are more recent than @file{c-parse.y} and @file{cexp.y} +and you do not plan to change the @samp{.y} files.) + +Note that if you have an old version of Bison you may get an error +from the line with the @samp{%expect} directive. If so, simply remove +that line from @file{c-parse.y} and proceed. + +@item +If you are using a Sun, make sure the environment variable +@code{FLOAT_OPTION} is not set. If this option were set to +@code{f68881} when @file{gnulib} is compiled, the resulting code would +demand to be linked with a special startup file and will not link +properly without special pains. @item Build the compiler. Just type @samp{make} in the compiler directory. @item -Delete @file{*.o} in the compiler directory. The executables from -the previous step remain for the next step. +Move the first-stage object files and executables into a subdirectory +with this command: + +@example +make stage1 +@end example + +The files are moved into a subdirectory named @file{stage1}. +Once installation is complete, you may wish to delete these files +with @code{rm -r stage1}. 
+ +@item +Recompile the compiler with itself, with this command: + +@example +make CC=stage1/gcc CFLAGS="-g -O -Bstage1/" +@end example + +On a 68000 or 68020 system lacking floating point hardware, +unless you have selected a @file{tm.h} file that expects by default +that there is no such hardware, do this instead: + +@example +make CC=stage1/gcc CFLAGS="-g -O -Bstage1/ -msoft-float" +@end example @item -Remake the compiler with +If you wish to test the compiler by compiling it with itself one more +time, do this: @example -make CC=./gcc CFLAGS="-g -O -I." +make stage2 +make CC=stage2/gcc CFLAGS="-g -O -Bstage2/" +foreach file (*.o) +cmp $file stage2/$file +end @end example +This will notify you if any of these stage 3 object files differs from +those of stage 2. Any difference, no matter how innocuous, indicates +that the stage 2 compiler has compiled GNU CC incorrectly, and is +therefore a potentially serious bug which you should investigate and +report (@pxref{Bugs}). + +Aside from the @samp{-B} option, the options should be the same as +when you made stage 2. + @item -Install the compiler's passes. Copy the file @file{cc1} made by the -compiler to the name @file{/usr/local/lib/gcc-cc1}. +Install the compiler driver, the compiler's passes and run-time support. +You can use the following command: -Make the file @file{/usr/local/lib/gcc-cpp} either a link to @file{/lib/cpp} -or a copy of the file @file{cpp} generated by @samp{make}. +@example +make install +@end example -@strong{Warning: the GNU CPP may not work for @file{ioctl.h}.} This -cannot be fixed in the GNU CPP because the bug is in @file{ioctl.h}: -at least on some machines, it relies on behavior that is incompatible +@noindent +This copies the files @file{cc1}, @file{cpp} and @file{gnulib} to +files @file{gcc-cc1}, @file{gcc-cpp} and @file{gcc-gnulib} in +directory @file{/usr/local/lib}, which is where the compiler driver +program looks for them.
It also copies the driver program @file{gcc} +into the directory @file{/usr/local}, so that it appears in typical +execution search paths.@refill + +@strong{Warning: there is a bug in @code{alloca} in the Sun library. +To avoid this bug, install the binaries of GNU CC that were compiled +by GNU CC. They use @code{alloca} as a built-in function and never +the one in the library.} + +@strong{Warning: the GNU CPP may not work for @file{ioctl.h}, +@file{ttychars.h} and other system header files unless the +@samp{-traditional} option is used.} The bug is in the header files: +at least on some machines, they rely on behavior that is incompatible with ANSI C. This behavior consists of substituting for macro -argument names when they appear inside of character constants. +argument names when they appear inside of character constants. The +@samp{-traditional} option tells GNU CC to behave the way these +headers expect. + +Because of this problem, you might prefer to configure GNU CC to use +the system's own C preprocessor. To do so, make the file +@file{/usr/local/lib/gcc-cpp} a link to @file{/lib/cpp}. + +Alternatively, on Sun systems and 4.3BSD at least, you can correct the +include files by running the shell script @file{fixincludes}. This +installs modified, corrected copies of the files @file{ioctl.h}, +@file{ttychars.h} and many others, in a special directory where only +GNU CC will normally look for them. + +See the file @file{fixincludes} for a list of all the files we know to +require correction. +@end enumerate + +If you cannot install the compiler's passes and run-time support in +@file{/usr/local/lib}, you can alternatively use the @samp{-B} option to +specify a prefix by which they may be found. The compiler concatenates +the prefix with the names @file{cpp}, @file{cc1} and @file{gnulib}. +Thus, you can put the files in a directory @file{/usr/foo/gcc} and +specify @samp{-B/usr/foo/gcc/} when you run GNU CC. 
+ +Also, you can specify an alternative default directory for these files +by setting the Make variable @code{libdir} when you make GNU CC. + +@node VMS Install,, Installation, Installation +@section Installing GNU CC on VMS + +The VMS version of GNU CC is distributed in an unusual tape format which +consists of several tape files. The first is a command file; the second is +an executable program which reads Unix tar format; the third is another +command file which uses this program to read the remainder of the tape. + +To load the tape, it suffices to mount it @samp{/foreign} and then do +@samp{@@mta0:} to execute the command file at the beginning of the tape. + +The tape contains executables and object files as well as sources, so no +compilation is necessary unless you change the sources. (This is a good +thing, since you probably don't have any other C compiler.) If you must +recompile, here is how: + +@enumerate +@item +Copy the file @file{tm-vms.h} to @file{tm.h}, @file{config-vms.h} to +@file{config.h}, @file{vax.md} to @file{md.} and @file{output-vax.c} +to @file{aux-output.c}.@refill + +@item +Type @samp{@@make} to recompile everything. +@end enumerate + +To install the @samp{GCC} command so you can use the compiler easily, in +the same manner as you use the VMS C compiler, you must install the VMS CLD +file for GNU CC as follows: +@enumerate @item -Install the compiler driver. This is the file @file{gcc} generated -by @samp{make}. +Define the VMS logical names @samp{GNU_CC} and @samp{GNU_CC_INCLUDE} +to point to the directories where the GNU CC executables +(@samp{gcc-cpp}, @samp{gcc-cc1}, etc.) and the C include files are +kept. This should be done with the commands:@refill + +@example +$ assign /super /system disk:[gcc] gnu_cc +$ assign /super /system disk:[gcc.include] gnu_cc_include +@end example + +@noindent +with the appropriate disk and directory names.
These commands can be +placed in your system startup file so they will be executed whenever +the machine is rebooted. + +@item +Install the @samp{GCC} command with the command line: + +@example +$ set command /table=sys$library:dcltables gnu_cc:gcc +@end example + +@noindent +Now you can invoke the compiler with a command like @samp{gcc /verbose +file.c}, which is equivalent to the command @samp{gcc -v -c file.c} in +Unix. @end enumerate -@node Portability, Passes, Installation, Top +@node Trouble, Incompatibilities, Installation, Top +@chapter Trouble in Installation + +Here are some of the things that have caused trouble for people installing +GNU CC. + +@itemize @bullet +@item +On certain systems, defining certain environment variables such as +@samp{CC} can interfere with the functioning of @code{make}. + +@item +Cross compilation can run into trouble for certain machines because +some target machines' assemblers require floating point numbers to be +written as @emph{integer} constants in certain contexts. + +The compiler writes these integer constants by examining the floating +point value as an integer and printing that integer, because this is +simple to write and independent of the details of the floating point +representation. But this does not work if the compiler is running on +a different machine with an incompatible floating point format, or +even a different byte-ordering. + +It is possible to fix this by writing machine-independent code which +understands the floating point representation of the target machine. +I am not interested in doing that much work to compensate for bugs +in assemblers. +@end itemize + +@node Incompatibilities, Extensions, Trouble, Top +@chapter Incompatibilities of GNU CC + +There are several noteworthy incompatibilities between GNU C and most +existing (non-ANSI) versions of C. 
+ +Ultimately our intention is that the @samp{-traditional} option will +eliminate most of these incompatibilities by telling GNU C to behave +like the other C compilers. + +@itemize @bullet +@item +GNU CC normally makes string constants read-only. If several +identical-looking string constants are used, GNU CC stores only one +copy of the string. + +One consequence is that you cannot call @code{mktemp} with a string +constant argument. The function @code{mktemp} always alters the +string its argument points to. + +Another consequence is that @code{sscanf} does not work on some +systems when passed a string constant as its format control string. +This is because @code{sscanf} incorrectly tries to write into the +string constant. + +The best solution to these problems is to change the program to use +@code{char}-array variables with initialization strings for these +purposes instead of string constants. But if this is not possible, +you can use the @samp{-fwritable-strings} flag, which directs GNU CC +to handle string constants the same way most C compilers do. + +@item +GNU CC does not substitute macro arguments when they appear inside of +string constants. For example, the following macro in GNU CC + +@example +#define foo(a) "a" +@end example + +@noindent +will produce output @samp{"a"} regardless of what the argument @var{a} is. + +The @samp{-traditional} option directs GNU CC to handle such cases +(among others) in the old-fashioned (non-ANSI) fashion. + +@item +When you use @code{setjmp} and @code{longjmp}, the only automatic +variables guaranteed to remain valid are those declared +@code{volatile}. This is a consequence of automatic register +allocation. 
Consider this function: + +@example +jmp_buf j; + +foo () +@{ + int a, b; + + a = fun1 (); + if (setjmp (j)) + return a; + + a = fun2 (); + /* @r{@code{longjmp (j)} may occur in @code{fun3}.} */ + return a + fun3 (); +@} +@end example + +Here @code{a} may or may not be restored to its first value when the +@code{longjmp} occurs. If @code{a} is allocated in a register, then +its first value is restored; otherwise, it keeps the last value stored +in it. + +If you use the @samp{-W} option with the @samp{-O} option, you will +get a warning when GNU CC thinks such a problem might be possible. + +@item +Declarations of external variables and functions within a block apply +only to the block containing the declaration. In other words, they +have the same scope as any other declaration in the same place. + +In some other C compilers, an @code{extern} declaration affects all the +rest of the file even if it happens within a block. + +The @samp{-traditional} option directs GNU C to treat all @code{extern} +declarations as global, like traditional compilers. + +@item +In traditional C, you can combine @code{long}, etc., with a typedef name, +as shown here: + +@example +typedef int foo; +typedef long foo bar; +@end example + +In ANSI C, this is not allowed: @code{long} and other type modifiers +require an explicit @code{int}. Because this criterion is expressed +by Bison grammar rules rather than C code, the @samp{-traditional} +flag cannot alter it. + +@item +PCC allows typedef names to be used as function parameters. The +difficulty described immediately above applies here too. + +@item +PCC allows whitespace in the middle of compound assignment operators +such as @samp{+=}. GNU CC, following the ANSI standard, does not +allow this. The difficulty described immediately above applies here +too. + +@item +When compiling functions that return @code{float}, PCC converts it to +a double. GNU CC actually returns a @code{float}.
If you are concerned +with PCC compatibility, you should declare your functions to return +@code{double}; you might as well say what you mean. + +@item +When compiling functions that return structures or unions, GNU CC +output code uses a method different from that used on most versions of +Unix. As a result, code compiled with GNU CC cannot call a +structure-returning function compiled with PCC, and vice versa. + +The method used by GCC is as follows: a structure or union which is 1, +2, 4 or 8 bytes long is returned like a scalar. A structure or union +with any other size is stored into an address supplied by the caller +in a special, fixed register. + +PCC usually handles all sizes of structures and unions by returning +the address of a block of static storage containing the value. This +method is not used in GCC because it is slower and nonreentrant. + +On systems where PCC works this way, you may be able to make GCC-compiled +code call such functions that were compiled with PCC by declaring them +to return a pointer to the structure or union instead of the structure +or union itself. For example, instead of this: + +@example +struct foo nextfoo (); +@end example + +@noindent +write this: + +@example +struct foo *nextfoo (); +#define nextfoo *nextfoo +@end example + +@noindent +(Note that this assumes you are using the GNU preprocessor and not +@samp{-traditional}, so that the ANSI antirecursion rules for macro +expansions are effective.) +@end itemize + +@node Extensions, Bugs, Incompatibilities, Top +@chapter GNU Extensions to the C Language + +GNU C provides several language features not found in ANSI standard C. +(The @samp{-pedantic} option directs GNU CC to print a warning message if +any of these features is used.) To test for the availability of these +features in conditional compilation, check for a predefined macro +@code{__GNUC__}, which is always defined under GNU CC. 
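As a minimal sketch of such a test (the names `USING_GNU_C`, `maxint_once` and `larger` are hypothetical, chosen only for illustration), a program might select a GNU-only definition when `__GNUC__` is defined and fall back to a portable one otherwise. The GNU branch assumes the statements-within-expressions extension listed in the menu that follows:

```c
/* Record which branch of the conditional compilation is in effect. */
#ifdef __GNUC__
#define USING_GNU_C 1
#else
#define USING_GNU_C 0
#endif

#if USING_GNU_C
/* GNU C: each operand is evaluated exactly once. */
#define maxint_once(a,b) \
  ({ int _a = (a), _b = (b); _a > _b ? _a : _b; })
#else
/* Portable fallback: one operand may be evaluated twice. */
#define maxint_once(a,b) ((a) > (b) ? (a) : (b))
#endif

/* Hypothetical caller; gives the same result under either definition
   as long as the operands have no side effects. */
int
larger (int x, int y)
{
  return maxint_once (x, y);
}
```

Under either branch, `larger (3, 7)` yields 7; only the number of times an operand is evaluated differs.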
+ +@menu +* Statement Exprs:: Putting statements and declarations inside expressions. +* Naming Types:: Giving a name to the type of some expression. +* Typeof:: @code{typeof}: referring to the type of an expression. +* Lvalues:: Using @samp{?:}, @samp{,} and casts in lvalues. +* Conditionals:: Omitting the middle operand of a @samp{?:} expression. +* Zero-Length:: Zero-length arrays. +* Variable-Length:: Arrays whose length is computed at run time. +* Subscripting:: Any array can be subscripted, even if not an lvalue. +* Pointer Arith:: Arithmetic on @code{void}-pointers and function pointers. +* Constructors:: Constructor expressions give structures, unions + or arrays as values. +* Dollar Signs:: Dollar sign is allowed in identifiers. +* Alignment:: Inquiring about the alignment of a type or variable. +* Inline:: Defining inline functions (as fast as macros). +* Extended Asm:: Assembler instructions with C expressions as operands. + (With them you can define ``built-in'' functions.) +* Asm Labels:: Specifying the assembler name to use for a C symbol. +@end menu + +@node Statement Exprs, Naming Types, Extensions, Extensions +@section Statements and Declarations inside of Expressions + +A compound statement in parentheses may appear inside an expression in GNU +C. This allows you to declare variables within an expression. For +example: + +@example +(@{ int y = foo (); int z; + if (y > 0) z = y; + else z = - y; + z; @}) +@end example + +@noindent +is a valid (though slightly more complex than necessary) expression +for the absolute value of @code{foo ()}. + +This feature is especially useful in making macro definitions ``safe'' (so +that they evaluate each operand exactly once). For example, the +``maximum'' function is commonly defined as a macro in standard C as +follows: + +@example +#define max(a,b) ((a) > (b) ? (a) : (b)) +@end example + +@noindent +But this definition computes either @var{a} or @var{b} twice, with bad +results if the operand has side effects. 
In GNU C, if you know the +type of the operands (here let's assume @code{int}), you can define +the macro safely as follows: + +@example +#define maxint(a,b) \ + (@{int _a = (a), _b = (b); _a > _b ? _a : _b; @}) +@end example + +Embedded statements are not allowed in constant expressions, such as +the value of an enumeration constant, the width of a bit field, or +the initial value of a static variable. + +If you don't know the type of the operand, you can still do this, but you +must use @code{typeof} (@pxref{Typeof}) or type naming (@pxref{Naming +Types}). + +@node Naming Types, Typeof, Statement Exprs, Extensions +@section Naming an Expression's Type + +You can give a name to the type of an expression using a @code{typedef} +declaration with an initializer. Here is how to define @var{name} as a +type name for the type of @var{exp}: + +@example +typedef @var{name} = @var{exp}; +@end example + +This is useful in conjunction with the statements-within-expressions +feature. Here is how the two together can be used to define a safe +``maximum'' macro that operates on any arithmetic type: + +@example +#define max(a,b) \ + (@{typedef _ta = (a), _tb = (b); \ + _ta _a = (a); _tb _b = (b); \ + _a > _b ? _a : _b; @}) +@end example + +The reason for using names that start with underscores for the local +variables is to avoid conflicts with variable names that occur within the +expressions that are substituted for @code{a} and @code{b}. Eventually we +hope to design a new form of declaration syntax that allows you to declare +variables whose scopes start only after their initializers; this will be a +more reliable way to prevent such conflicts. + +@node Typeof, Lvalues, Naming Types, Extensions +@section Referring to a Type with @code{typeof} + +Another way to refer to the type of an expression is with @code{typeof}. +The syntax of using this keyword looks like @code{sizeof}, but the +construct acts semantically like a type name defined with @code{typedef}.
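As a rough illustration of how this keyword pairs with statements-within-expressions, here is a sketch of a type-generic maximum. The names `generic_max`, `next_value`, `evaluations_for_one_use` and `max_double` are made up for this example. It is written with the `__typeof__` spelling, which later GCC releases accept as an alternative to `typeof`, on the assumption that the reader's compiler may be running in a strict ISO mode; treat it as a sketch rather than the definitive form.

```c
/* Each operand is evaluated exactly once, and _a and _b take the
   operands' own types instead of being forced to int. */
#define generic_max(a,b) \
  ({ __typeof__ (a) _a = (a); __typeof__ (b) _b = (b); \
     _a > _b ? _a : _b; })

/* Hypothetical operand with a side effect: counts its own calls. */
static int calls;

static int
next_value (void)
{
  return ++calls;
}

/* Returns how many times the operand was evaluated by one use
   of the macro. */
int
evaluations_for_one_use (void)
{
  calls = 0;
  (void) generic_max (next_value (), 0);
  return calls;
}

/* The same macro applied to a different arithmetic type. */
double
max_double (double x, double y)
{
  return generic_max (x, y);
}
```

Because the operand is copied into `_a` before the comparison, `next_value` runs once per use of the macro, which is the property the plain `?:` definition of a maximum macro lacks.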
+ +There are two ways of writing the argument to @code{typeof}: with an +expression or with a type. Here is an example with an expression: + +@example +typeof (x[0](1)) +@end example + +@noindent +This assumes that @code{x} is an array of functions; the type described +is that of the values of the functions. + +Here is an example with a typename as the argument: + +@example +typeof (int *) +@end example + +@noindent +Here the type described is that of pointers to @code{int}. + +A @code{typeof}-construct can be used anywhere a typedef name could be +used. For example, you can use it in a declaration, in a cast, or inside +of @code{sizeof} or @code{typeof}. + +@itemize @bullet +@item +This declares @code{y} with the type of what @code{x} points to. + +@example +typeof (*x) y; +@end example + +@item +This declares @code{y} as an array of such values. + +@example +typeof (*x) y[4]; +@end example + +@item +This declares @code{y} as an array of pointers to characters: + +@example +typeof (typeof (char *)[4]) y; +@end example + +@noindent +It is equivalent to the following traditional C declaration: + +@example +char *y[4]; +@end example + +To see the meaning of the declaration using @code{typeof}, and why it +might be a useful way to write, let's rewrite it with these macros: + +@example +#define pointer(T) typeof(T *) +#define array(T, N) typeof(T [N]) +@end example + +@noindent +Now the declaration can be rewritten this way: + +@example +array (pointer (char), 4) y; +@end example + +@noindent +Thus, @samp{array (pointer (char), 4)} is the type of arrays of 4 +pointers to @code{char}. +@end itemize + +@node Lvalues, Conditionals, Typeof, Extensions +@section Generalized Lvalues + +Compound expressions, conditional expressions and casts are allowed as +lvalues provided their operands are lvalues. This means that you can take +their addresses or store values into them. 
+ +For example, a compound expression can be assigned, provided the last +expression in the sequence is an lvalue. These two expressions are +equivalent: + +@example +(a, b) += 5 +a, (b += 5) +@end example + +Similarly, the address of the compound expression can be taken. These two +expressions are equivalent: + +@example +&(a, b) +a, &b +@end example + +A conditional expression is a valid lvalue if its type is not void and the +true and false branches are both valid lvalues. For example, these two +expressions are equivalent: + +@example +(a ? b : c) = 5 +(a ? b = 5 : (c = 5)) +@end example + +A cast is a valid lvalue if its operand is valid. Taking the address of +the cast is the same as taking the address without a cast, except for the +type of the result. For example, these two expressions are equivalent (but +the second may be valid when the type of @samp{a} does not permit a cast to +@samp{int *}). + +@example +&(int *)a +(int **)&a +@end example + +A simple assignment whose left-hand side is a cast works by converting the +right-hand side first to the specified type, then to the type of the inner +left-hand side expression. After this is stored, the value is converted +back to the specified type to become the value of the assignment. Thus, if +@samp{a} has type @samp{char *}, the following two expressions are +equivalent: + +@example +(int)a = 5 +(int)(a = (char *)5) +@end example + +An assignment-with-arithmetic operation such as @samp{+=} applied to a cast +performs the arithmetic using the type resulting from the cast, and then +continues as in the previous case. Therefore, these two expressions are +equivalent: + +@example +(int)a += 5 +(int)(a = (char *) ((int)a + 5)) +@end example + +@node Conditionals, Zero-Length, Lvalues, Extensions +@section Conditional Expressions with Omitted Middle-Operands + +The middle operand in a conditional expression may be omitted.
Then +if the first operand is nonzero, its value is the value of the conditional +expression. + +Therefore, the expression + +@example +x ? : y +@end example + +@noindent +has the value of @code{x} if that is nonzero; otherwise, the value of +@code{y}. + +This example is perfectly equivalent to + +@example +x ? x : y +@end example + +@noindent +In this simple case, the ability to omit the middle operand is not +especially useful. When it becomes useful is when the first operand does, +or may (if it is a macro argument), contain a side effect. Then repeating +the operand in the middle would perform the side effect twice. Omitting +the middle operand uses the value already computed without the undesirable +effects of recomputing it. + +@node Zero-Length, Variable-Length, Conditionals, Extensions +@section Arrays of Length Zero + +Zero-length arrays are allowed in GNU C. They are very useful as the last +element of a structure which is really a header for a variable-length +object: + +@example +struct line @{ + int length; + char contents[0]; +@}; + +@{ + struct line *thisline + = (struct line *) malloc (sizeof (struct line) + this_length); + thisline->length = this_length; +@} +@end example + +In standard C, you would have to give @code{contents} a length of 1, which +means either you waste space or complicate the argument to @code{malloc}. + +@node Variable-Length, Subscripting, Zero-Length, Extensions +@section Arrays of Variable Length + +Variable-length automatic arrays are allowed in GNU C. These arrays are +declared like any other automatic arrays, but with a length that is not a +constant expression. The storage is allocated at that time and +deallocated when the brace-level is exited. 
For example: + +@example +FILE *concat_fopen (char *s1, char *s2, char *mode) +@{ + char str[strlen (s1) + strlen (s2) + 1]; + strcpy (str, s1); + strcat (str, s2); + return fopen (str, mode); +@} +@end example + +You can also define structure types containing variable-length arrays, and +use them even for arguments or function values, as shown here: + +@example +int foo; + +struct entry +@{ + char data[foo]; +@}; + +struct entry +tester (struct entry arg) +@{ + struct entry new; + int i; + for (i = 0; i < foo; i++) + new.data[i] = arg.data[i] + 1; + return new; +@} +@end example + +@noindent +(Eventually there will be a way to say that the size of the array is +another member of the same structure.) + +The length of an array is computed on entry to the brace-level where the +array is declared and is remembered for the scope of the array in case you +access it with @code{sizeof}. + +Jumping or breaking out of the scope of the array name will also deallocate +the storage. Jumping into the scope is not allowed; you will get an error +message for it. + +You can use the function @code{alloca} to get an effect much like +variable-length arrays. The function @code{alloca} is available in +many other C implementations (but not in all). On the other hand, +variable-length arrays are more elegant. + +There are other differences between these two methods. Space allocated +with @code{alloca} exists until the containing @emph{function} returns. +The space for a variable-length array is deallocated as soon as the array +name's scope ends. (If you use both variable-length arrays and +@code{alloca} in the same function, deallocation of a variable-length array +will also deallocate anything more recently allocated with @code{alloca}.) + +@node Subscripting, Pointer Arith, Variable-Length, Extensions +@section Non-Lvalue Arrays May Have Subscripts + +Subscripting is allowed on arrays that are not lvalues, even though the +unary @samp{&} operator is not. 
For example, this is valid in GNU C though +not valid in other C dialects: + +@example +struct foo @{int a[4];@}; + +struct foo f(); + +bar (int index) +@{ + return f().a[index]; +@} +@end example + +@node Pointer Arith, Initializers, Subscripting, Extensions +@section Arithmetic on @code{void}-Pointers and Function Pointers + +In GNU C, addition and subtraction operations are supported on pointers to +@code{void} and on pointers to functions. This is done by treating the +size of a @code{void} or of a function as 1. + +A consequence of this is that @code{sizeof} is also allowed on @code{void} +and on function types, and returns 1. + +@node Initializers, Constructors, Pointer Arith, Extensions +@section Non-Constant Initializers + +The elements of an aggregate initializer are not required to be constant +expressions in GNU C. Here is an example of an initializer with run-time +varying elements: + +@example +foo (float f, float g) +@{ + float beat_freqs[2] = @{ f-g, f+g @}; + @dots{} +@} +@end example + +@node Constructors, Dollar Signs, Initializers, Extensions +@section Constructor Expressions + +GNU C supports constructor expressions. A constructor looks like a cast +containing an initializer. Its value is an object of the type specified in +the cast, containing the elements specified in the initializer. The type +must be a structure, union or array type. + +Assume that @code{struct foo} and @code{structure} are declared as shown: + +@example +struct foo @{int a; char b[2];@} structure; +@end example + +@noindent +Here is an example of constructing a @samp{struct foo} with a constructor: + +@example +structure = ((struct foo) @{x + y, 'a', 0@}); +@end example + +@noindent +This is equivalent to writing the following: + +@example +@{ + struct foo temp = @{x + y, 'a', 0@}; + structure = temp; +@} +@end example + +You can also construct an array. 
If all the elements of the constructor +are (made up of) simple constant expressions, suitable for use in +initializers, then the constructor is an lvalue and can be coerced to a +pointer to its first element, as shown here: + +@example +char **foo = (char *[]) @{ "x", "y", "z" @}; +@end example + +Array constructors whose elements are not simple constants are not very +useful, because the constructor is not an lvalue. There are only two valid +ways to use it: to subscript it, or to initialize an array variable with it. +The former is probably slower than a @code{switch} statement, while the +latter does the same thing an ordinary C initializer would do. + +@example +output = ((int[]) @{ 2, x, 28 @}) [input]; +@end example + +@node Dollar Signs, Alignment, Constructors, Extensions +@section Dollar Signs in Identifier Names + +In GNU C, you may use dollar signs in identifier names. This is because +many traditional C implementations allow such identifiers. + +@node Alignment, Inline, Dollar Signs, Extensions +@section Inquiring about the Alignment of a Type or Variable + +The keyword @code{__alignof} allows you to inquire about how an object +is aligned, or the minimum alignment usually required by a type. Its +syntax is just like @code{sizeof}. + +For example, if the target machine requires a @code{double} value to be +aligned on an 8-byte boundary, then @code{__alignof (double)} is 8. This +is true on many RISC machines. On more traditional machine designs, +@code{__alignof (double)} is 4 or even 2. + +Some machines never actually require alignment; they allow reference to any +data type even at an odd address. For these machines, @code{__alignof} +reports the @emph{recommended} alignment of a type. + +When the operand of @code{__alignof} is an lvalue rather than a type, the +value is the largest alignment that the lvalue is known to have.
It may +have this alignment as a result of its data type, or because it is part of +a structure and inherits alignment from that structure. For example, after +this declaration: + +@example +struct foo @{ int x; char y; @} foo1; +@end example + +@noindent +the value of @code{__alignof (foo1.y)} is probably 2 or 4, the same as +@code{__alignof (int)}, even though the data type of @code{foo1.y} does not +itself demand any alignment.@refill + +@node Inline, Extended Asm, Alignment, Extensions +@section An Inline Function is As Fast As a Macro + +By declaring a function @code{inline}, you can direct GNU CC to integrate +that function's code into the code for its callers. This makes execution +faster by eliminating the function-call overhead; in addition, if any of +the actual argument values are constant, their known values may permit +simplifications at compile time so that not all of the inline function's +code needs to be included. + +To declare a function inline, use the @code{inline} keyword in its +declaration, like this: + +@example +inline int +inc (int *a) +@{ + return (*a)++; +@} +@end example + +You can also make all ``simple enough'' functions inline with the +option @samp{-finline-functions}. Note that certain usages in a +function definition can make it unsuitable for inline substitution. + +When a function is both inline and @code{static}, if all calls to the +function are integrated into the caller, then the function's own assembler +code is never referenced. In this case, GNU CC does not actually output +assembler code for the function, unless you specify the option +@samp{-fkeep-inline-functions}. Some calls cannot be integrated for +various reasons (in particular, calls that precede the function's +definition cannot be integrated, and neither can recursive calls within the +definition). If there is a nonintegrated call, then the function is +compiled to assembler code as usual.
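As a minimal sketch of the @code{static} case just described, the fragment below declares a small helper both @code{static} and @code{inline}. The names @code{inc} and @code{bump_twice} are illustrative only, and whether the calls are actually integrated depends on the compiler version and the optimization options in use.

```c
/* Hypothetical helper: being both static and inline, it may be
   integrated into every caller, in which case no separate
   assembler code for it needs to be output.  */
static inline int
inc (int *a)
{
  /* Return the old value and leave *a incremented by one.  */
  return (*a)++;
}

/* Hypothetical caller: both calls follow the definition of inc,
   so they are candidates for integration.  */
int
bump_twice (int *counter)
{
  int first = inc (counter);
  int second = inc (counter);
  return first + second;
}
```

Starting from a counter of 41, @code{bump_twice} returns 41 + 42 and leaves the counter at 43, whether or not the calls were integrated.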
+ +When an inline function is not @code{static}, then the compiler must assume +that there may be calls from other source files; since a global symbol can +be defined only once in any program, the function must not be defined in +the other source files, so the calls therein cannot be integrated. +Therefore, a non-@code{static} inline function is always compiled on its +own in the usual fashion. + +@node Extended Asm, Asm Labels, Inline, Extensions +@section Assembler Instructions with C Expression Operands + +In an assembler instruction using @code{asm}, you can now specify the +operands of the instruction using C expressions. This means no more +guessing which registers or memory locations will contain the data you want +to use. + +You must specify an assembler instruction template much like what appears +in a machine description, plus an operand constraint string for each +operand. + +For example, here is how to use the 68881's @code{fsinx} instruction: + +@example +asm ("fsinx %1,%0" : "=f" (result) : "f" (angle)); +@end example + +@noindent +Here @code{angle} is the C expression for the input operand while +@code{result} is that of the output operand. Each has @samp{"f"} as its +operand constraint, saying that a floating-point register is required. The +constraints use the same language used in the machine description +(@pxref{Constraints}). + +Each operand is described by an operand-constraint string followed by the C +expression in parentheses. A colon separates the assembler template from +the first output operand, and another separates the last output operand +from the first input, if any. Commas separate output operands and separate +inputs. The number of operands is limited to the maximum number of +operands in any instruction pattern in the machine description. + +Output operand expressions must be lvalues; the compiler can check this. +The input operands need not be lvalues. 
The compiler cannot check whether +the operands have data types that are reasonable for the instruction being +executed. It does not parse the assembler instruction template and does +not know what it means, or whether it is valid assembler input. The +extended @code{asm} feature is most often used for machine instructions +that the compiler itself does not know exist. + +If there are no output operands, and there are input operands, then you +should write two colons in a row where the output operands would go. + +The output operands must be write-only; GNU CC will assume that the values +in these operands before the instruction are dead and need not be +generated. For an operand that is read-write, or in which not all bits are +written and the other bits contain useful information, you must logically +split its function into two separate operands, one input operand and one +write-only output operand. The connection between them is expressed by +constraints which say they need to be in the same location when the +instruction executes. You can use the same C expression for both operands, +or different expressions. For example, here we write the (fictitious) +@samp{combine} instruction with @code{bar} as its read-only source operand +and @code{foo} as its read-write destination: + +@example +asm ("combine %2,%0" : "=r" (foo) : "0" (foo), "g" (bar)); +@end example + +@noindent +The constraint @samp{"0"} for operand 1 says that it must occupy the same +location as operand 0. + +Only a digit in the constraint can guarantee that one operand will be in +the same place as another. The mere fact that @code{foo} is the value of +both operands is not enough to guarantee that they will be in the same +place in the generated assembler code. 
The following would not work: + +@example +asm ("combine %2,%0" : "=r" (foo) : "r" (foo), "g" (bar)); +@end example + +Various optimizations or reloading could cause operands 0 and 1 to be in +different registers; GNU CC knows no reason not to do so. For example, the +compiler might find a copy of the value of @code{foo} in one register and +use it for operand 1, but generate the output operand 0 in a different +register (copying it afterward to @code{foo}'s own address). Of course, +since the register for operand 1 is not even mentioned in the assembler +code, the result will not work, but GNU CC can't tell that. + +Unless an output operand has the @samp{&} constraint modifier, GNU CC may +allocate it in the same register as an unrelated input operand, on the +assumption that the inputs are consumed before the outputs are produced. +This assumption may be false if the assembler code actually consists of +more than one instruction. In such a case, use @samp{&} for each output +operand that may not overlap an input. @xref{Modifiers}. + +Some instructions clobber specific hard registers. To describe this, +write a third colon after the input operands, followed by the names of +the clobbered hard registers (given as strings). For example, on the vax, + +@example +asm volatile ("movc3 %0,%1,%2" + : /* no outputs */ + : "g" (from), "g" (to), "g" (count) + : "r0", "r1", "r2", "r3", "r4", "r5"); +@end example + +Usually the most convenient way to use these @code{asm} instructions is to +encapsulate them in macros that look like functions. For example, + +@example +#define sin(x) \ +(@{ double __value, __arg = (x); \ + asm ("fsinx %1,%0": "=f" (__value): "f" (__arg)); \ + __value; @}) +@end example + +@noindent +Here the variable @code{__arg} is used to make sure that the instruction +operates on a proper @code{double} value, and to accept only those +arguments @code{x} which can convert automatically to a @code{double}. 
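The same macro shape can be tried without a 68881 by letting ordinary C arithmetic stand in for the machine instruction. This is only a sketch under that assumption: @code{square}, @code{sq_value} and @code{sq_arg} are made-up names, and the multiplication replaces the @code{asm} statement. It relies on the GNU C statement-expression extension used in the macro above.

```c
/* Sketch only: plain multiplication stands in for the asm.
   The local sq_arg converts x to double and makes sure x is
   evaluated exactly once, just as __arg does in the text.  */
#define square(x) \
({ double sq_value, sq_arg = (x); \
   sq_value = sq_arg * sq_arg; \
   sq_value; })
```

Because the argument is copied into a local first, an argument with a side effect, such as a call or an increment, is performed only once even though the value is used twice in the body.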
+ +Another way to make sure the instruction operates on the correct data type +is to use a cast in the @code{asm}. This is different from using a +variable @code{__arg} in that it converts a wider range of types. For +example, if the desired type were @code{int}, casting the argument to +@code{int} would accept a pointer with no complaint, while assigning the +argument to an @code{int} variable named @code{__arg} would warn about +using a pointer unless the caller explicitly casts it. + +GNU CC assumes for optimization purposes that these instructions have no +side effects except to change the output operands. This does not mean that +instructions with a side effect cannot be used, but you must be careful, +because the compiler may eliminate them if the output operands aren't used, +or move them out of loops, or replace two with one if they constitute a +common subexpression. Also, if your instruction does have a side effect on +a variable that otherwise appears not to change, the old value of the +variable may be reused later if it happens to be found in a register. + +You can prevent an @code{asm} instruction from being deleted, moved or +combined by writing the keyword @code{volatile} after the @code{asm}. For +example: + +@example +#define set_priority(x) \ +asm volatile ("set_priority %1": \ + "=m" (*(char *)0): "g" (x)) +@end example + +@noindent +Note that we have supplied an output operand which is not actually used in +the instruction. This is because @code{asm} requires at least one output +operand. This requirement exists for internal implementation reasons and +we might be able to relax it in the future. + +In this case the output operand has the additional beneficial effect of giving the +appearance of writing to memory. As a result, GNU CC will assume that data +previously fetched from memory must be fetched again if needed again later.
+This may be desirable if you have not employed the @code{volatile} keyword +on all the variable declarations that ought to have it. + +@node Asm Labels,,Extended Asm, Extensions +@section Controlling Names Used in Assembler Code + +You can specify the name to be used in the assembler code for a C function +or variable by writing the @code{asm} keyword after the declarator as +follows: + +@example +int foo asm ("myfoo") = 2; +@end example + +@noindent +This specifies that the name to be used for the variable @code{foo} in +the assembler code should be @samp{myfoo} rather than the usual +@samp{_foo}. + +On systems where an underscore is normally prepended to the name of a C +function or variable, this feature allows you to define names for the +linker that do not start with an underscore. + +You cannot use @code{asm} in this way in a function @emph{definition}; but +you can get the same effect by writing a declaration for the function +before its definition and putting @code{asm} there, like this: + +@example +extern func () asm ("FUNC"); + +func (x, y) + int x, y; +@dots{} +@end example + +It is up to you to make sure that the assembler names you choose do not +conflict with any other assembler symbols. Also, you must not use a +register name; that would produce completely invalid assembler code. GNU +CC does not as yet have the ability to store static variables in registers. +Perhaps that will be added. + +@node Bugs, Portability, Extensions, Top +@chapter Reporting Bugs + +Your bug reports play an essential role in making GNU CC reliable. + +Reporting a bug may help you by bringing a solution to your problem, or it +may not. But in any case the important function of a bug report is to help +the entire community by making the next version of GNU CC work better. Bug +reports are your contribution to the maintenance of GNU CC. + +In order for a bug report to serve its purpose, you must include the +information that makes for fixing the bug. 
+ +@menu +* Criteria: Bug Criteria. Have you really found a bug? +* Reporting: Bug Reporting. How to report a bug effectively. +@end menu + +@node Bug Criteria, Bug Reporting, Bugs, Bugs +@section Have You Found a Bug? + +If you are not sure whether you have found a bug, here are some guidelines: + +@itemize @bullet +@item +If the compiler gets a fatal signal, for any input whatever, that is a +compiler bug. Reliable compilers never crash. + +@item +If the compiler produces invalid assembly code, for any input whatever +(except an @code{asm} statement), that is a compiler bug, unless the +compiler reports errors (not just warnings) which would ordinarily +prevent the assembler from being run. + +@item +If the compiler produces valid assembly code that does not correctly +execute the input source code, that is a compiler bug. + +However, you must double-check to make sure, because you may have run +into an incompatibility between GNU C and traditional C +(@pxref{Incompatibilities}). These incompatibilities might be considered +bugs, but they are inescapable consequences of valuable features. + +Or you may have a program whose behavior is undefined, which happened +by chance to give the desired results with another C compiler. + +For example, in many nonoptimizing compilers, you can write @samp{x;} +at the end of a function instead of @samp{return x;}, with the same +results. But the value of the function is undefined if @samp{return} +is omitted; it is not a bug when GNU CC produces different results. + +Problems often result from expressions with two increment operators, +as in @samp{f (*p++, *p++)}. Your previous compiler might have +interpreted that expression the way you intended; GNU CC might +interpret it another way; neither compiler is wrong. + +After you have localized the error to a single source line, it should +be easy to check for these things. If your program is correct and +well defined, you have found a compiler bug. 
+ +@item +If the compiler produces an error message for valid input, that is a +compiler bug. + +Note that the following is not valid input, and the error message for +it is not a bug: + +@example +int foo (char); + +int +foo (x) + char x; +@{ @dots{} @} +@end example + +@noindent +The prototype says to pass a @code{char}, while the definition says to +pass an @code{int} and treat the value as a @code{char}. This is what +the ANSI standard says, and it makes sense. + +@item +If the compiler does not produce an error message for invalid input, +that is a compiler bug. However, you should note that your idea of +``invalid input'' might be my idea of ``an extension'' or ``support +for traditional practice''. + +@item +If you are an experienced user of C compilers, your suggestions +for improvement of GNU CC are welcome in any case. +@end itemize + +@node Bug Reporting,, Bug Criteria, Bugs +@section How to Report Bugs + +Send bug reports for GNU C to one of these addresses: + +@example +bug-gcc@@prep.ai.mit.edu +@{ucbvax|mit-eddie|uunet@}!prep.ai.mit.edu!bug-gcc +@end example + +As a last resort, snail them to: + +@example +GNU Compiler Bugs +545 Tech Sq +Cambridge, MA 02139 +@end example + +The fundamental principle of reporting bugs usefully is this: +@strong{report all the facts}. If you are not sure whether to mention a +fact or leave it out, mention it! + +Often people omit facts because they think they know what causes the +problem and they conclude that some details don't matter. Thus, you might +assume that the name of the variable you use in an example does not matter. +Well, probably it doesn't, but one cannot be sure. Perhaps the bug is a +stray memory reference which happens to fetch from the location where that +name is stored in memory; perhaps, if the name were different, the contents +of that location would fool the compiler into doing the right thing despite +the bug. Play it safe and give an exact example. 
+ +If you want to enable me to fix the bug, you should include all these +things: + +@itemize @bullet +@item +The version of GNU CC. You can get this by running it with the +@samp{-v} option. + +Without this, I won't know whether there is any point in looking for +the bug in the current version of GNU CC. + +@item +A complete input file that will reproduce the bug. If the bug is in +the C preprocessor, send me a source file and any header files that it +requires. If the bug is in the compiler proper (@file{cc1}), run your +source file through the C preprocessor by doing @samp{gcc -E +@var{sourcefile} > @var{outfile}}, then include the contents of +@var{outfile} in the bug report. (Any @samp{-I}, @samp{-D} or +@samp{-U} options that you used in actual compilation should also be +used when doing this.) + +A single statement is not enough of an example. In order to compile +it, it must be embedded in a function definition; and the bug might +depend on the details of how this is done. + +Without a real example I can compile, all I can do about your bug +report is wish you luck. It would be futile to try to guess how to +provoke the bug. For example, bugs in register allocation and +reloading frequently depend on every little detail of the function +they happen in. + +@item +The command arguments you gave GNU CC to compile that example and +observe the bug. For example, did you use @samp{-O}? To guarantee +you won't omit something important, list them all. + +If I were to try to guess the arguments, I would probably guess wrong +and then I would not encounter the bug. + +@item +The names of the files that you used for @file{tm.h} and @file{md} +when you installed the compiler. + +@item +The type of machine you are using, and the operating system name and +version number. + +@item +A description of what behavior you observe that you believe is +incorrect. 
For example, ``It gets a fatal signal,'' or, ``There is an +incorrect assembler instruction in the output.'' + +Of course, if the bug is that the compiler gets a fatal signal, then I +will certainly notice it. But if the bug is incorrect output, I might +not notice unless it is glaringly wrong. I won't study all the +assembler code from a 50-line C program just on the off chance that it +might be wrong. + +Even if the problem you experience is a fatal signal, you should still +say so explicitly. Suppose something strange is going on, such as, +your copy of the compiler is out of synch, or you have encountered a +bug in the C library on your system. (This has happened!) Your copy +might crash and mine would not. If you @i{told} me to expect a crash, +then when mine fails to crash, I would know that the bug was not +happening for me. If you had not told me to expect a crash, then I +would not be able to draw any conclusion from my observations. + +In cases where GNU CC generates incorrect code, if you send me a small +complete sample program I will find the error myself by running the +program under a debugger. If you send me a large example or a part of +a larger program, I cannot do this; you must debug the compiled +program and narrow the problem down to one source line. Tell me which +source line it is, and what you believe is incorrect about the code +generated for that line. + +@item +If you send me examples of output from GNU CC, please use @samp{-g} +when you make them. The debugging information includes source line +numbers which are essential for correlating the output with the input. + +@item +If you wish to suggest changes to the GNU CC source, send me context +diffs. If you even discuss something in the GNU CC source, refer to +it by context, not by line number. + +The line numbers in my development sources don't match those in your +sources. They won't tell me anything. 
+@end itemize + +Here are some things that are not necessary: + +@itemize @bullet +@item +A description of the envelope of the bug. + +Often people who encounter a bug spend a lot of time investigating +which changes to the input file will make the bug go away and which +changes will not affect it. + +This is often time consuming and not very useful, because the way I +will find the bug is by running a single example under the debugger +with breakpoints, not by pure deduction from a series of examples. + +Of course, it can't hurt if you can find a simpler example that +triggers the same bug. Errors in the output will be easier to spot, +running under the debugger will take less time, etc. An easy way +to simplify an example is to delete all the function definitions +except the one where the bug occurs. Those earlier in the file +may be replaced by external declarations. + +However, simplification is not necessary; if you don't want to do +this, report the bug anyway. + +@item +A patch for the bug. + +A patch for the bug does help me if it is a good one. But don't omit +the necessary information, such as the test case, because I might see +problems with your patch and decide to fix the problem another way. + +Sometimes with a program as complicated as GNU CC it is very hard to +construct an example that will make the program go through a certain +point in the code. If you don't send me the example, I won't be able +to verify that the bug is fixed. + +@item +A guess about what the bug is or what it depends on. + +Such guesses are usually wrong. Even I can't guess right about such +things without using the debugger to find the facts. They also don't +serve a useful purpose. +@end itemize + +@node Portability, Interface, Bugs, Top @chapter GNU CC and Portability The main goal of GNU CC was to make a good, fast compiler for machines in @@ -413,52 +2458,125 @@ combinations of parameters. 
Often I hav cases, but only the common ones or only the ones that I have encountered. As a result, a new target may require additional strategies. You will know if this happens because the compiler will call @code{abort}. Fortunately, -the new strategies can be added to all versions of the compiler, and will -be relevant only for target machines that need them. +the new strategies can be added in a machine-independent fashion, and will +affect only the target machines that need them. -@node Passes, RTL, Portability, Top +@node Interface, Passes, Portability, Top +@chapter Interfacing to GNU CC Output + +GNU CC is normally configured to use the same function calling convention +that is in use on the target system. This is done with the +machine-description macros (@pxref{Machine Macros}). + +However, returning of structure and union values is done differently. +As a result, functions compiled with PCC returning such types cannot +be called from code compiled with GNU CC, and vice versa. This usually +does not cause trouble because the Unix library routines don't return +structures and unions. + +Structures and unions that are 1, 2, 4 or 8 bytes long are returned in the +same registers used for @code{int} or @code{double} return values. (GNU CC +typically allocates variables of such types in registers also.) Structures +and unions of other sizes are returned by storing them into an address +passed by the caller in a register. This method is faster than the one +normally used by PCC and is also reentrant. The register used for passing +the address is specified by the machine-description macro +@code{STRUCT_VALUE}. + +GNU CC always passes arguments on the stack. At some point it will be +extended to pass arguments in registers, for machines which use that as +the standard calling convention. This will make it possible to use such +a convention on other machines as well. However, that would render it +completely incompatible with PCC.
We will probably do this once we +have a complete GNU system so we can compile the libraries with GNU CC. + +If you use @code{longjmp}, beware of automatic variables. ANSI C says that +automatic variables that are not declared @code{volatile} have undefined +values after a @code{longjmp}. And this is all GNU CC promises to do, +because it is very difficult to restore register variables correctly, and +one of GNU CC's features is that it can put variables in registers without +your asking it to. + +If you want a variable to be unaltered by @code{longjmp}, and you don't +want to write @code{volatile} because old C compilers don't accept it, +just take the address of the variable. If a variable's address is ever +taken, even if just to compute it and ignore it, then the variable cannot +go in a register: + +@example +@{ + int careful; + &careful; + @dots{} +@} +@end example + +Code compiled with GNU CC may call certain library routines. The routines +needed on the Vax and 68000 are in the file @file{gnulib.c}. You must +compile this file with the standard C compiler, not with GNU CC, and then +link it with each program you compile with GNU CC. (In actuality, many +programs will not need it.) The usual function call interface is used +for calling the library routines. Some standard parts of the C library, +such as @code{bcopy}, are also called automatically. + +@node Passes, RTL, Interface, Top @chapter Passes and Files of the Compiler The overall control structure of the compiler is in @file{toplev.c}. This file is responsible for initialization, decoding arguments, opening and closing files, and sequencing the passes. -The parsing pass is invoked only once, to parse the entire input. Each -time a complete function definition or top-level data definition is read, -the parsing pass calls the function @code{rest_of_compilation} in -@file{toplev.c}, which is responsible for all further processing necessary, -ending with output of the assembler language. 
All other compiler passes -run, in sequence, within @code{rest_of_compilation}. After -@code{rest_of_compilation} returns from compiling a function definition, -the storage used for its compilation is entirely freed. +The parsing pass is invoked only once, to parse the entire input. The RTL +intermediate code for a function is generated as the function is parsed, a +statement at a time. Each statement is read in as a syntax tree and then +converted to RTL; then the storage for the tree for the statement is +reclaimed. Storage for types (and the expressions for their sizes), +declarations, and a representation of the binding contours and how they nest, +remains until the function is finished being compiled; these are all needed +to output the debugging information. + +Each time the parsing pass reads a complete function definition or +top-level declaration, it calls the function +@code{rest_of_compilation} or @code{rest_of_decl_compilation} in +@file{toplev.c}, which are responsible for all further processing +necessary, ending with output of the assembler language. All other +compiler passes run, in sequence, within @code{rest_of_compilation}. +When that function returns from compiling a function definition, the +storage used for that function definition's compilation is entirely +freed, unless it is an inline function (@pxref{Inline}). Here is a list of all the passes of the compiler and their source files. Also included is a description of where debugging dumps can be requested -with @samp{-d} switches. +with @samp{-d} options. @itemize @bullet @item Parsing. This pass reads the entire text of a function definition, -constructing a syntax tree. The tree representation does not entirely -follow C syntax, because it is intended to support other languages as well. - -C data type analysis is also done in this pass, and every tree node that -represents an expression has a data type attached. Variables are represented -as declaration nodes. 
- -Constant folding and associative-law simplifications are also done during -this pass. - -The source files of the parsing pass are @file{parse.y}, @file{decl.c}, -@file{typecheck.c}, @file{stor-layout.c}, @file{fold-const.c}, and -@file{tree.c}. The last three are intended to be language-independent. -There are also header files @file{parse.h}, @file{c-tree.h}, -@file{tree.h} and @file{tree.def}. The last two define the format of -the tree representation. - -@item -RTL generation. This pass converts the tree structure for one -function into RTL code. +constructing partial syntax trees. This and RTL generation are no longer +truly separate passes (formerly they were), but it is easier to think +of them as separate. + +The tree representation does not entirely follow C syntax, because it is +intended to support other languages as well. + +C data type analysis is also done in this pass, and every tree node +that represents an expression has a data type attached. Variables are +represented as declaration nodes. + +Constant folding and associative-law simplifications are also done +during this pass. + +The source files for parsing are @file{c-parse.y}, @file{c-decl.c}, +@file{c-typeck.c}, @file{c-convert.c}, @file{stor-layout.c}, +@file{fold-const.c}, and @file{tree.c}. The last three files are +intended to be language-independent. There are also header files +@file{c-parse.h}, @file{c-tree.h}, @file{tree.h} and @file{tree.def}. +The last two define the format of the tree representation.@refill + +@item +RTL generation. This is the conversion of syntax tree into RTL code. +It is actually done statement-by-statement during parsing, but for +most purposes it can be thought of as a separate pass. This is where the bulk of target-parameter-dependent code is found, since often it is necessary for strategies to apply only when certain @@ -471,39 +2589,52 @@ comparisons, boolean operations or condi recursion is detected at this time also. 
Decisions are made about how best to arrange loops and how to output @code{switch} statements. -The files of the RTL generation pass are @file{stmt.c}, @file{expr.c}, +The source files for RTL generation are @file{stmt.c}, @file{expr.c}, @file{explow.c}, @file{expmed.c}, @file{optabs.c} and @file{emit-rtl.c}. Also, the file @file{insn-emit.c}, generated from the machine description by the program @code{genemit}, is used in this pass. The header files -@file{expr.h} is used for communication within this pass. +@file{expr.h} is used for communication within this pass.@refill -The header files @file{insn-flags.h} and @file{insn-codes.h}, generated from -the machine description by the programs @code{genflags} and @code{gencodes}, -tell this pass which standard names are available for use and which patterns -correspond to them. +The header files @file{insn-flags.h} and @file{insn-codes.h}, +generated from the machine description by the programs @code{genflags} +and @code{gencodes}, tell this pass which standard names are available +for use and which patterns correspond to them.@refill Aside from debugging information output, none of the following passes -refers to the tree structure representation of the function. +refers to the tree structure representation of the function (only +part of which is saved). -The switch @samp{-dr} causes a debugging dump of the RTL code after this -pass. This dump file's name is made by appending @samp{.rtl} to the -input file name. +The decision of whether the function can and should be expanded inline +in its subsequent callers is made at the end of rtl generation. The +function must meet certain criteria, currently related to the size of +the function and the types and number of parameters it has. Note that +this function may contain loops, recursive calls to itself +(tail-recursive functions can be inlined!), gotos, in short, all +constructs supported by GNU CC. 
+ +The option @samp{-dr} causes a debugging dump of the RTL code after +this pass. This dump file's name is made by appending @samp{.rtl} to +the input file name. @item -Jump optimization. This pass simplifies jumps to the following instruction, -jumps across jumps, and jumps to jumps. It deletes unreferenced labels -and unreachable code, except that unreachable code that contains a loop -is not recognized as unreachable in this pass. (Such loops are deleted -later in the basic block analysis.) +Jump optimization. This pass simplifies jumps to the following +instruction, jumps across jumps, and jumps to jumps. It deletes +unreferenced labels and unreachable code, except that unreachable code +that contains a loop is not recognized as unreachable in this pass. +(Such loops are deleted later in the basic block analysis.) Jump optimization is performed two or three times. The first time is -immediately following RTL generation. +immediately following RTL generation. The second time is after CSE, +but only if CSE says repeated jump optimization is needed. The +last time is right before the final pass. That time, cross-jumping +and deletion of no-op move instructions are done together with the +optimizations described above. The source file of this pass is @file{jump.c}. -The switch @samp{-dj} causes a debugging dump of the RTL code after this -pass is run for the first time. This dump file's name is made by appending -@samp{.jump} to the input file name. +The option @samp{-dj} causes a debugging dump of the RTL code after +this pass is run for the first time. This dump file's name is made by +appending @samp{.jump} to the input file name. @item Register scan. This pass finds the first and last use of each @@ -514,9 +2645,9 @@ is in @file{regclass.c}. Common subexpression elimination. This pass also does constant propagation. Its source file is @file{cse.c}. 
If constant propagation causes conditional jumps to become unconditional or to -become no-ops, jump optimization is run again when cse is finished. +become no-ops, jump optimization is run again when CSE is finished. -The switch @samp{-ds} causes a debugging dump of the RTL code after +The option @samp{-ds} causes a debugging dump of the RTL code after this pass. This dump file's name is made by appending @samp{.cse} to the input file name. @@ -524,7 +2655,7 @@ the input file name. Loop optimization. This pass moves constant expressions out of loops. Its source file is @file{loop.c}. -The switch @samp{-dL} causes a debugging dump of the RTL code after +The option @samp{-dL} causes a debugging dump of the RTL code after this pass. This dump file's name is made by appending @samp{.loop} to the input file name. @@ -533,8 +2664,7 @@ Stupid register allocation is performed nonoptimizing compilation. It does a little data flow analysis as well. When stupid register allocation is in use, the next pass executed is the reloading pass; the others in between are skipped. -The source file is @file{stupid.c}, with header file @file{stupid.h} -used for communication with the RTL generation pass. +The source file is @file{stupid.c}. @item Data flow analysis (@file{flow.c}). This pass divides the program @@ -547,7 +2677,7 @@ This pass also deletes computations whos combines memory references with add or subtract instructions to make autoincrement or autodecrement addressing. -The switch @samp{-df} causes a debugging dump of the RTL code after +The option @samp{-df} causes a debugging dump of the RTL code after this pass. This dump file's name is made by appending @samp{.flow} to the input file name. If stupid register allocation is in use, this dump file reflects the full results of such allocation. @@ -559,22 +2689,22 @@ flow into single instructions. 
It combi the instructions by substitution, simplifies the result using algebra, and then attempts to match the result against the machine description. -The switch @samp{-dc} causes a debugging dump of the RTL code after +The option @samp{-dc} causes a debugging dump of the RTL code after this pass. This dump file's name is made by appending @samp{.combine} to the input file name. @item Register class preferencing. The RTL code is scanned to find out -which register class is best for each pseudo register. The source file -is @file{regclass.c}. +which register class is best for each pseudo register. The source +file is @file{regclass.c}. @item Local register allocation (@file{local-alloc.c}). This pass allocates hard registers to pseudo registers that are used only within one basic -block. Because the basic block is linear, it can use fast and powerful -techniques to do a very good job. +block. Because the basic block is linear, it can use fast and +powerful techniques to do a very good job. -The switch @samp{-dl} causes a debugging dump of the RTL code after +The option @samp{-dl} causes a debugging dump of the RTL code after this pass. This dump file's name is made by appending @samp{.lreg} to the input file name. @@ -584,29 +2714,36 @@ allocates hard registers for the remaini whose life spans are not contained in one basic block). @item -Reloading. This pass finds instructions that are invalid because a -value has failed to end up in a register, or has ended up in a -register of the wrong kind. It fixes up these instructions by -reloading the problematical values into registers temporarily. -Additional instructions are generated to do the copying. +Reloading. This pass renumbers pseudo registers with the hardware +registers numbers they were allocated. Pseudo registers that did not +get hard registers are replaced with stack slots. 
Then it finds +instructions that are invalid because a value has failed to end up in +a register, or has ended up in a register of the wrong kind. It fixes +up these instructions by reloading the problematical values +temporarily into registers. Additional instructions are generated to +do the copying. Source files are @file{reload.c} and @file{reload1.c}, plus the header @file{reload.h} used for communication between them. -The switch @samp{-dg} causes a debugging dump of the RTL code after +The option @samp{-dg} causes a debugging dump of the RTL code after this pass. This dump file's name is made by appending @samp{.greg} to the input file name. @item -Jump optimization is repeated, this time including cross-jumping. +Jump optimization is repeated, this time including cross-jumping +and deletion of no-op move instructions. Machine-specific peephole +optimizations are performed at the same time. + +The option @samp{-dJ} causes a debugging dump of the RTL code after +this pass. This dump file's name is made by appending @samp{.jump2} +to the input file name. @item Final. This pass outputs the assembler code for the function. It is -also responsible for identifying no-op move instructions and spurious -test and compare instructions. The function entry and exit sequences -are generated directly as assembler code in this pass; they never -exist as RTL. Pseudo registers that did not get hard registers are -given stack slots in this pass. +also responsible for identifying spurious test and compare +instructions. The function entry and exit sequences are generated +directly as assembler code in this pass; they never exist as RTL. The source files are @file{final.c} plus @file{insn-output.c}; the latter is generated automatically from the machine description by the @@ -628,9 +2765,9 @@ Every pass uses @file{machmode.def}, whi @item All the passes that work with RTL use the header files @file{rtl.h} -and @file{rtl.def}, and subroutines in file @file{rtl.c}. 
The -tools @code{gen*} also use these files to read and work with the -machine description RTL. +and @file{rtl.def}, and subroutines in file @file{rtl.c}. The tools +@code{gen*} also use these files to read and work with the machine +description RTL. @item Several passes refer to the header file @file{insn-config.h} which @@ -642,11 +2779,12 @@ automatically from the machine descripti Several passes use the instruction recognizer, which consists of @file{recog.c} and @file{recog.h}, plus the files @file{insn-recog.c} and @file{insn-extract.c} that are generated automatically from the -machine description by the tools @file{genrecog} and @file{genextract}. +machine description by the tools @file{genrecog} and +@file{genextract}.@refill @item -Several passes use the header file @file{regs.h} which defines the -information recorded about pseudo register usage, @file{basic-block.h} +Several passes use the header files @file{regs.h} which defines the +information recorded about pseudo register usage, and @file{basic-block.h} which defines the information recorded about basic blocks. @item @@ -661,7 +2799,7 @@ into loops. @chapter RTL Representation Most of the work of the compiler is done on an intermediate representation -called register tranfer language. In this language, the instructions to be +called register transfer language. In this language, the instructions to be output are described, pretty much one by one, in an algebraic form that describes what the instruction does. @@ -673,6 +2811,7 @@ form uses nested parentheses to indicate @menu * RTL Objects:: Expressions vs vectors vs strings vs integers. * Accessors:: Macros to access expression operands or vector elts. +* Flags:: Other flags in an RTL expression. * Machine Modes:: Describing the size and format of a datum. * Constants:: Expressions with constant values. * Regs and Memory:: Expressions representing register contents or memory. 
@@ -683,7 +2822,9 @@ form uses nested parentheses to indicate * RTL Declarations:: Declaring volatility, constancy, etc. * Side Effects:: Expressions for storing in registers, etc. * Incdec:: Embedded side-effects for autoincrement addressing. +* Assembler:: Representing @code{asm} with operands. * Insns:: Expression types for entire insns. +* Calls:: RTL representation of function call insns. * Sharing:: Some expressions are unique; others *must* be copied. @end menu @@ -691,17 +2832,17 @@ form uses nested parentheses to indicate @section RTL Object Types RTL uses four kinds of objects: expressions, integers, strings and vectors. -Expressions are the most important ones. An RTL expression is a C -structure, but it is usually referred to with a pointer; a type that is -given the typedef name @code{rtx}. +Expressions are the most important ones. An RTL expression (``RTX'', for +short) is a C structure, but it is usually referred to with a pointer; a +type that is given the typedef name @code{rtx}. An integer is simply an @code{int}, and a string is a @code{char *}. -Within rtl code, strings appear only inside @samp{symbol_ref} expressions, -but they appear in other contexts in the rtl expressions that make up +Within RTL code, strings appear only inside @samp{symbol_ref} expressions, +but they appear in other contexts in the RTL expressions that make up machine descriptions. Their written form uses decimal digits. A string is a sequence of characters. In core it is represented as a -@code{char *} in usual C fashion, and they are written in C syntax as well. +@code{char *} in usual C fashion, and it is written in C syntax as well. However, strings in RTL may never be null. If you write an empty string in a machine description, it is represented in core as a null pointer rather than as a pointer to a null character. In certain contexts, these null @@ -714,31 +2855,35 @@ the vector. The written form of a vecto whitespace separating them. 
Vectors of length zero are not created; null pointers are used instead. -Expressions are classified by @dfn{expression code}. The expression code -is a name defined in @file{rtl.def}, which is also (in upper case) a C -enumeration constant. The possible expression codes and their meanings are -machine-independent. The code of an rtx can be extracted with the macro -@code{GET_CODE (@var{x})} and altered with @code{PUT_CODE (@var{x}, -@var{newcode})}. +Expressions are classified by @dfn{expression codes} (also called RTX +codes). The expression code is a name defined in @file{rtl.def}, which is +also (in upper case) a C enumeration constant. The possible expression +codes and their meanings are machine-independent. The code of an RTX can +be extracted with the macro @code{GET_CODE (@var{x})} and altered with +@code{PUT_CODE (@var{x}, @var{newcode})}. The expression code determines how many operands the expression contains, and what kinds of objects they are. In RTL, unlike Lisp, you cannot tell by looking at an operand what kind of object it is. Instead, you must know from its context---from the expression code of the containing expression. -For example, in an expression of code @code{subreg}, the first operand is +For example, in an expression of code @samp{subreg}, the first operand is to be regarded as an expression and the second operand as an integer. In -an expression of code @code{plus}, there are two operands, both of which -are to be regarded as expressions. In a @code{symbol_ref} expression, +an expression of code @samp{plus}, there are two operands, both of which +are to be regarded as expressions. In a @samp{symbol_ref} expression, there is one operand, which is to be regarded as a string. Expressions are written as parentheses containing the name of the expression type, its flags and machine mode if any, and then the operands of the expression (separated by spaces). 
+Expression code names in the @samp{md} file are written in lower case, +but when they appear in C code they are written in upper case. In this +manual, they are shown as follows: @samp{const_int}. + In a few contexts a null pointer is valid where an expression is normally wanted. The written form of this is @samp{(nil)}. -@node Accessors, Machine Modes, RTL Objects, RTL +@node Accessors, Flags, RTL Objects, RTL @section Access to Operands For each expression type @file{rtl.def} specifies the number of contained @@ -746,38 +2891,39 @@ objects and their kinds, with four possi (actually a pointer to an expression), @samp{i} for integer, @samp{s} for string, and @samp{E} for vector of expressions. The sequence of letters for an expression code is called its @dfn{format}. Thus, the format of -@code{subreg} is @samp{ei}. +@samp{subreg} is @samp{ei}.@refill Two other format characters are used occasionally: @samp{u} and @samp{0}. @samp{u} is equivalent to @samp{e} except that it is printed differently in debugging dumps, and @samp{0} means a slot whose contents do not fit any normal category. @samp{0} slots are not printed at all in dumps, and are -often used in special ways by small parts of the compiler. +often used in special ways by small parts of the compiler.@refill There are macros to get the number of operands and the format of an expression code: @table @code @item GET_RTX_LENGTH (@var{code}) -Number of operands of an rtx of code @var{code}. +Number of operands of an RTX of code @var{code}. @item GET_RTX_FORMAT (@var{code}) -The format of an rtx of code @var{code}, as a C string. +The format of an RTX of code @var{code}, as a C string. @end table Operands of expressions are accessed using the macros @code{XEXP}, @code{XINT} and @code{XSTR}. Each of these macros takes two arguments: an -expression-pointer (rtx) and an operand number (counting from zero). Thus, +expression-pointer (RTX) and an operand number (counting from zero). 
+Thus,@refill @example -XEXP (x, 2) +XEXP (@var{x}, 2) @end example @noindent accesses operand 2 of expression @var{x}, as an expression. @example -XINT (x, 2) +XINT (@var{x}, 2) @end example @noindent @@ -791,15 +2937,16 @@ the containing expression. That is also operands there are. For example, if @var{x} is a @samp{subreg} expression, you know that it has -two operands which can be correctly accessed as @code{XEXP (x, 0)} and -@code{XINT (x, 1)}. If you did @code{XINT (x, 0)}, you would get the -address of the expression operand but cast as an integer; that might -occasionally be useful, but it would be cleaner to write @code{(int) XEXP -(x, 0)}. @code{XEXP (x, 1)} would also compile without error, and would -return the second, integer operand cast as an expression pointer, which -would probably result in a crash when accessed. Nothing stops you from -writing @code{XEXP (x, 28)} either, but this will access memory past the -end of the expression with unpredictable results. +two operands which can be correctly accessed as @code{XEXP (@var{x}, 0)} +and @code{XINT (@var{x}, 1)}. If you did @code{XINT (@var{x}, 0)}, you +would get the address of the expression operand but cast as an integer; +that might occasionally be useful, but it would be cleaner to write +@code{(int) XEXP (@var{x}, 0)}. @code{XEXP (@var{x}, 1)} would also +compile without error, and would return the second, integer operand cast as +an expression pointer, which would probably result in a crash when +accessed. Nothing stops you from writing @code{XEXP (@var{x}, 28)} either, +but this will access memory past the end of the expression with +unpredictable results.@refill Access to operands which are vectors is more complicated. You can use the macro @code{XVEC} to get the vector-pointer itself, or the macros @@ -814,9 +2961,9 @@ Access the vector-pointer which is opera Access the length (number of elements) in the vector which is in operand number @var{idx} in @var{exp}. This value is an @code{int}. 
-@item XVECLEN (@var{exp}, @var{idx}, @var{eltnum}) +@item XVECEXP (@var{exp}, @var{idx}, @var{eltnum}) Access element number @var{eltnum} in the vector which is -in operand number @var{idx} in @var{exp}. This value is an @code{rtx}. +in operand number @var{idx} in @var{exp}. This value is an RTX. It is up to you to make sure that @var{eltnum} is not negative and is less than @code{XVECLEN (@var{exp}, @var{idx})}. @@ -826,14 +2973,68 @@ All the macros defined in this section e can be used to assign the operands, lengths and vector elements as well as to access them. -@node Machine Modes, Constants, Accessors, RTL +@node Flags, Machine Modes, Accessors, RTL +@section Flags in an RTL Expression + +RTL expressions contain several flags (one-bit bit-fields) that are used +in certain types of expression. + +@table @code +@item used +This flag is used only momentarily, at the end of RTL generation for a +function, to count the number of times an expression appears in insns. +Expressions that appear more than once are copied, according to the +rules for shared structure (@pxref{Sharing}). + +@item volatil +This flag is used in @samp{mem} and @samp{reg} expressions and in insns. +In RTL dump files, it is printed as @samp{/v}. + +In a @samp{mem} expression, it is 1 if the memory reference is volatile. +Volatile memory references may not be deleted, reordered or combined. + +In a @samp{reg} expression, it is 1 if the value is a user-level variable. +0 indicates an internal compiler temporary. + +In an insn, 1 means the insn has been deleted. + +@item in_struct +This flag is used in @samp{mem} expressions. It is 1 if the memory +datum referred to is all or part of a structure or array; 0 if it is (or +might be) a scalar variable. A reference through a C pointer has 0 +because the pointer might point to a scalar variable. + +This information allows the compiler to determine something about possible +cases of aliasing. + +In an RTL dump, this flag is represented as @samp{/s}. 
+ +@item unchanging +This flag is used in @samp{reg} and @samp{mem} expressions. 1 means +that the value of the expression never changes (at least within the +current function). + +In an RTL dump, this flag is represented as @samp{/u}. + +@item integrated +In some kinds of expressions, including insns, this flag means the +rtl was produced by procedure integration. + +In a @samp{reg} expression, this flag indicates the register +containing the value to be returned by the current function. On +machines that pass parameters in registers, the same register number +may be used for parameters as well, but this flag is not set on such +uses. +@end table + +@node Machine Modes, Constants, Flags, RTL @section Machine Modes A machine mode describes a size of data object and the representation used for it. In the C code, machine modes are represented by an enumeration -type, @code{enum machine_mode}. Each rtl expression has room for a machine -mode and so do certain kinds of tree expressions (declarations and types, -to be precise). +type, @code{enum machine_mode}, defined in @file{machmode.def}. Each RTL +expression has room for a machine mode and so do certain kinds of tree +expressions (declarations and types, to be precise). In debugging dumps and machine descriptions, the machine mode of an RTL expression is written after the expression code with a colon to separate @@ -874,13 +3075,13 @@ floating point number. @item BLKmode ``Block'' mode represents values that are aggregates to which none of -the other modes apply. In rtl, only memory references can have this mode, +the other modes apply. In RTL, only memory references can have this mode, and only if they appear in string-move or vector instructions. On machines which have no such instructions, @code{BLKmode} will not appear in RTL. @item VOIDmode Void mode means the absence of a mode or an unspecified mode. 
-For example, RTL expresslons of code @samp{const_int} have mode +For example, RTL expressions of code @samp{const_int} have mode @code{VOIDmode} because they can be taken to have whatever mode the context requires. In debugging dumps of RTL, @code{VOIDmode} is expressed by the absence of any mode. @@ -924,10 +3125,10 @@ Here are some C macros that relate to ma @table @code @item GET_MODE (@var{x}) -Returns the machine mode of the rtx @var{x}. +Returns the machine mode of the RTX @var{x}. @item PUT_MODE (@var{x}, @var{newmode}) -Alters the machine mode of the rtx @var{x} to be @var{newmode}. +Alters the machine mode of the RTX @var{x} to be @var{newmode}. @item GET_MODE_SIZE (@var{m}) Returns the size in bytes of a datum of mode @var{m}. @@ -938,7 +3139,7 @@ Returns the size in bits of a datum of m @item GET_MODE_UNIT_SIZE (@var{m}) Returns the size in bits of the subunits of a datum of mode @var{m}. This is the same as @code{GET_MODE_SIZE} except in the case of -complex modes and @code{EPmode}. For them, the unit size ithe +complex modes and @code{EPmode}. For them, the unit size is the size of the real or imaginary part, or the size of the function pointer or the context pointer. @end table @@ -952,12 +3153,12 @@ The simplest RTL expressions are those t @item (const_int @var{i}) This type of expression represents the integer value @var{i}. @var{i} is customarily accessed with the macro @code{INTVAL} as in -@code{INTVAL (exp)}, which is equivalent to @code{XINT (exp, 0)}. +@code{INTVAL (@var{exp})}, which is equivalent to @code{XINT (@var{exp}, 0)}. There is only one expression object for the integer value zero; it is the value of the variable @code{const0_rtx}. Likewise, the only expression for integer value one is found in @code{const1_rtx}. -Any attempt to create an expression of code @code{const_int} and +Any attempt to create an expression of code @samp{const_int} and value zero or one will return @code{const0_rtx} or @code{const1_rtx} as appropriate. 
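The sharing rule stated in the hunk above for @samp{const_int} (one object for zero, one for one, and a constructor that hands those back instead of allocating) can be sketched in a few lines of C. This is only a toy model under stated assumptions: the names @code{toy_rtx}, @code{TOY_XINT}, @code{TOY_INTVAL}, @code{toy_const0_rtx}, @code{toy_const1_rtx} and @code{toy_gen_const_int} are hypothetical and are not taken from @file{rtl.h} or any other GNU CC source file; the real structure layout and constructor may differ.

```c
#include <stdlib.h>

/* Toy stand-in for an RTL expression.  Hypothetical layout, NOT the
   real structure declared in rtl.h.  */
enum toy_code { TOY_CONST_INT };

struct toy_rtx
{
  enum toy_code code;
  int operand[1];		/* models the single integer operand of const_int */
};
typedef struct toy_rtx *toy_rtx;

/* Modeled on the XINT and INTVAL accessors described in the text:
   INTVAL (exp) is the same as XINT (exp, 0).  */
#define TOY_XINT(x, n)	((x)->operand[n])
#define TOY_INTVAL(x)	TOY_XINT ((x), 0)

/* The unique objects for the values zero and one.  */
static struct toy_rtx toy_zero = { TOY_CONST_INT, { 0 } };
static struct toy_rtx toy_one = { TOY_CONST_INT, { 1 } };
toy_rtx toy_const0_rtx = &toy_zero;
toy_rtx toy_const1_rtx = &toy_one;

/* Hypothetical constructor: returns the shared object for 0 or 1,
   and allocates a fresh expression for every other value.  */
toy_rtx
toy_gen_const_int (int value)
{
  toy_rtx x;

  if (value == 0)
    return toy_const0_rtx;
  if (value == 1)
    return toy_const1_rtx;

  x = malloc (sizeof *x);
  if (x == NULL)
    abort ();
  x->code = TOY_CONST_INT;
  TOY_XINT (x, 0) = value;
  return x;
}
```

In this model, a test such as @code{x == toy_const0_rtx} is enough to recognize the constant zero, because no second zero object can be created through the constructor. Whether a given pass in the real compiler relies on pointer comparison in this way is not stated in the hunk and is not claimed here.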
@@ -967,7 +3168,7 @@ integers @var{i0} and @var{i1} together @code{double} value. To convert them to a @code{double}, do @example -union { double d; int i[2];} u; +union @{ double d; int i[2];@} u; u.i[0] = XINT (x, 0); u.i[1] = XINT (x, 1); @end example @@ -977,8 +3178,9 @@ and then refer to @code{u.d}. The value represented as a double in this fashion even if the value represented is single-precision. -@code{dconst0_rtx} and @code{fconst0_rtx} are @samp{CONST_DOUBLE} -expressions with value 0 and modes @code{DFmode} and @code{SFmode}. +The global variables @code{dconst0_rtx} and @code{fconst0_rtx} hold +@samp{const_double} expressions with value 0, in modes @code{DFmode} and +@code{SFmode}, respectively. @item (symbol_ref @var{symbol}) Represents the value of an assembler label for data. @var{symbol} is @@ -989,7 +3191,7 @@ the @samp{*}. Otherwise, the label is @ @item (label_ref @var{label}) Represents the value of an assembler label for code. It contains one -operand, an expression, which must be a @code{code_label} that appears +operand, an expression, which must be a @samp{code_label} that appears in the instruction sequence to identify the place where the label should go. @@ -1029,7 +3231,7 @@ machine registers that can be used for s hard register numbers, even those that can be used only in certain instructions or can hold only certain types of data. -Each pseudo register number used in a function's rtl code is +Each pseudo register number used in a function's RTL code is represented by a unique @samp{reg} expression. @var{m} is the machine mode of the reference. It is necessary because @@ -1083,6 +3285,11 @@ The compilation parameter @code{WORDS_BI that word number zero is the most significant part; otherwise, it is the least significant part. +Between the combiner pass and the reload pass, it is possible to have +a @samp{subreg} which contains a @samp{mem} instead of a @samp{reg} as +its first operand. 
The reload pass eliminates these cases by +reloading the @samp{mem} into a suitable register. + Note that it is not valid to access a @code{DFmode} value in @code{SFmode} using a @samp{subreg}. On some machines the most significant part of a @code{DFmode} value does not have the same format as a single-precision @@ -1093,38 +3300,39 @@ This refers to the machine's condition c operands and may not have a machine mode. It may be validly used in only two contexts: as the destination of an assignment (in test and compare instructions) and in comparison operators comparing against -zero (@code{const_int} with value zero; that is to say, -@code{const0_rtx}. +zero (@samp{const_int} with value zero; that is to say, +@code{const0_rtx}). -There is only one expression object of code @code{cc0}; it is the +There is only one expression object of code @samp{cc0}; it is the value of the variable @code{cc0_rtx}. Any attempt to create an -expression of code @code{cc0} will return @code{cc0_rtx}. +expression of code @samp{cc0} will return @code{cc0_rtx}. -One special thing about the condition code register is that instructions -can set it implicitly. On many machines, nearly all instructions set -the condition code based on the value that they compute or store. -It is not necessary to record these actions explicitly in the RTL -because the machine description includes a prescription for recognizing -the instructions that do so (by means of the macro @code{NOTICE_UPDATE_CC}). -Only instructions whose sole purpose is to set the condition code, -and instructions that use the condition code, need mention @code{(cc0)}. +One special thing about the condition code register is that +instructions can set it implicitly. On many machines, nearly all +instructions set the condition code based on the value that they +compute or store. 
It is not necessary to record these actions +explicitly in the RTL because the machine description includes a +prescription for recognizing the instructions that do so (by means of +the macro @code{NOTICE_UPDATE_CC}). Only instructions whose sole +purpose is to set the condition code, and instructions that use the +condition code, need mention @code{(cc0)}. @item (pc) This represents the machine's program counter. It has no operands and may not have a machine mode. @code{(pc)} may be validly used only in certain specific contexts in jump instructions. -There is only one expression object of code @code{pc}; it is the value of -the variable @code{pc_rtx}. Any attempt to create an expression of code -@code{pc} will return @code{pc_rtx}. +There is only one expression object of code @samp{pc}; it is the value +of the variable @code{pc_rtx}. Any attempt to create an expression of +code @samp{pc} will return @code{pc_rtx}. -All instructions that do not jump alter the program counter implicitly, -but there is no need to mention this in the RTL. +All instructions that do not jump alter the program counter implicitly +by incrementing it, but there is no need to mention this in the RTL. @item (mem:@var{m} @var{addr}) -This rtx represents a reference to main memory at an address -represented by the expression @var{addr}. @var{m} specifies how -large a unit of memory is accessed. +This RTX represents a reference to main memory at an address +represented by the expression @var{addr}. @var{m} specifies how large +a unit of memory is accessed. @end table @node Arithmetic, Comparisons, Regs and Memory, RTL @@ -1148,7 +3356,7 @@ computed without overflow, as if with in Of course, machines can't really subtract with infinite precision. However, they can pretend to do so when only the sign of the result will be used, which is the case when the result is stored -in @code{(cc0)}. And that is the only was this kind of expression +in @code{(cc0)}. 
And that is the only way this kind of expression may validly be used: as a value to be stored in the condition codes. @item (neg:@var{m} @var{x}) @@ -1160,12 +3368,12 @@ valid for mode @var{m}. Represents the signed product of the values represented by @var{x} and @var{y} carried out in machine mode @var{m}. If @var{x} and @var{y} are both valid for mode @var{m}, this is ordinary -size-preserving multiplication. Alteratively, both @var{x} and @var{y} +size-preserving multiplication. Alternatively, both @var{x} and @var{y} may be valid for a different, narrower mode. This represents the kind of multiplication that generates a product wider than the operands. Widening multiplication and same-size multiplication are completely distinct and supported by different machine instructions; machines may -support one but not the other. +support one but not the other.@refill @samp{mult} may be used for floating point division as well. Then @var{m} is a floating point machine mode. @@ -1173,7 +3381,7 @@ Then @var{m} is a floating point machine @item (umult:@var{m} @var{x} @var{y}) Like @samp{mult} but represents unsigned multiplication. It may be used in both same-size and widening forms, like @samp{mult}. -@samp{umult} is used only for fixed-point division. +@samp{umult} is used only for fixed-point multiplication. @item (div:@var{m} @var{x} @var{y}) Represents the quotient in signed division of @var{x} by @var{y}, @@ -1225,7 +3433,7 @@ description entry for the left-shift ins on the Vax, the mode of @var{c} is @code{QImode} regardless of @var{m}. On some machines, negative values of @var{c} may be meaningful; this -is why logical left shift an arithmetic left shift are distinguished. +is why logical left shift and arithmetic left shift are distinguished. 
For example, Vaxes have no right-shift instructions, and right shifts are represented as left-shift instructions whose counts happen to be negative constants or else computed (in a previous instruction) @@ -1250,6 +3458,13 @@ Represents the absolute value of @var{x} Represents the square root of @var{x}, computed in mode @var{m}. @var{x} must be valid for @var{m}. Most often @var{m} will be a floating point mode. + +@item (ffs:@var{m} @var{x}) +Represents the one plus the index of the least significant 1-bit in +@var{x}, represented as an integer of mode @var{m}. (The value is +zero if @var{x} is zero.) The mode of @var{x} need not be @var{m}; +depending on the target machine, various mode combinations may be +valid. @end table @node Comparisons, Bit Fields, Arithmetic, RTL @@ -1260,27 +3475,26 @@ represent the value 1 if the relation ho mode of the comparison is determined by the operands; they must both be valid for a common machine mode. A comparison with both operands constant would be invalid as the machine mode could not be deduced from it, but such -a comparison should never exist in rtl due to constant folding. +a comparison should never exist in RTL due to constant folding. Inequality comparisons come in two flavors, signed and unsigned. Thus, -there are distinct expression codes @samp{GT} and @samp{GTU} for signed and +there are distinct expression codes @samp{gt} and @samp{gtu} for signed and unsigned greater-than. These can produce different results for the same pair of integer values: for example, 1 is signed greater-than -1 but not unsigned greater-than, because -1 when regarded as unsigned is actually -0xffffffff which is greater than 1. +@code{0xffffffff} which is greater than 1. The signed comparisons are also used for floating point values. Floating point comparisons are distinguished by the machine modes of the operands. 
The comparison operators may be used to compare the condition codes -@code{(cc0)} against zero, as in @code{(eq (cc0) (const_int 0))}. -Such a construct actually refers to the result of the preceding -instruction in which the condition codes were set. The above -example stands for 1 if the condition codes were set to say -``zero'' or ``equal'', 0 otherwise. Although the same comparison -operators are used for this as may be used in other contexts -on actual data, no confusion can result since the machine description -would never allow both kinds of uses in the same context. +@code{(cc0)} against zero, as in @code{(eq (cc0) (const_int 0))}. Such a +construct actually refers to the result of the preceding instruction in +which the condition codes were set. The above example stands for 1 if the +condition codes were set to say ``zero'' or ``equal'', 0 otherwise. +Although the same comparison operators are used for this as may be used in +other contexts on actual data, no confusion can result since the machine +description would never allow both kinds of uses in the same context. @table @code @item (eq @var{x} @var{y}) @@ -1325,7 +3539,7 @@ to express conditional jumps. @section Bit-fields Special expression codes exist to represent bit-field instructions. -These types of expressions are lvalues in rtl; they may appear +These types of expressions are lvalues in RTL; they may appear on the left side of a assignment, indicating insertion of a value into the specified bit field. @@ -1334,14 +3548,14 @@ into the specified bit field. This represents a reference to a sign-extended bit-field contained or starting in @var{loc} (a memory or register reference). The bit field is @var{size} bits wide and starts at bit @var{pos}. The compilation -switch @code{BITS_BIG_ENDIAN} says which end of the memory unit +option @code{BITS_BIG_ENDIAN} says which end of the memory unit @var{pos} counts from. 
Which machine modes are valid for @var{loc} depends on the machine, but typically @var{loc} should be a single byte when in memory or a full word in a register. -@item (zero_extract:SI @var{loc} @var{pos} @var{size}) +@item (zero_extract:SI @var{loc} @var{size} @var{pos}) Like @samp{sign_extract} but refers to an unsigned or zero-extended bit field. The same sequence of bits are extracted, but they are filled to an entire word with zeros instead of by sign-extension. @@ -1352,7 +3566,7 @@ are filled to an entire word with zeros All conversions between machine modes must be represented by explicit conversion operations. For example, an expression -which the sum of a byte and a full word cannot be written as +which is the sum of a byte and a full word cannot be written as @code{(plus:SI (reg:QI 34) (reg:SI 80))} because the @samp{plus} operation requires two operands of the same machine mode. Therefore, the byte-sized operand is enclosed in a conversion @@ -1394,13 +3608,29 @@ to machine mode @var{m}. @var{m} must b and @var{x} a floating point value of a mode wider than @var{m}. @item (float:@var{m} @var{x}) -Represents the result of converting fixed point value @var{x} -to floating point mode @var{m}. +Represents the result of converting fixed point value @var{x}, +regarded as signed, to floating point mode @var{m}. + +@item (unsigned_float:@var{m} @var{x}) +Represents the result of converting fixed point value @var{x}, +regarded as unsigned, to floating point mode @var{m}. @item (fix:@var{m} @var{x}) -Represents the result of converting floating point value @var{x} -to fixed point mode @var{m}. How rounding is done is not specified. +When @var{m} is a fixed point mode, represents the result of +converting floating point value @var{x} to mode @var{m}, regarded as +signed. How rounding is done is not specified, so this operation may +be used validly in compiling C code only for integer-valued operands. 
+ +@item (unsigned_fix:@var{m} @var{x}) +Represents the result of converting floating point value @var{x} to +fixed point mode @var{m}, regarded as unsigned. How rounding is done +is not specified. +@item (fix:@var{m} @var{x}) +When @var{m} is a floating point mode, represents the result of +converting floating point value @var{x} (valid for mode @var{m}) to an +integer, still represented in floating point mode @var{m}, by rounding +towards zero. @end table @node RTL Declarations, Side Effects, Conversions, RTL @@ -1410,42 +3640,14 @@ Declaration expression codes do not repr but rather state assertions about their operands. @table @code -@item (volatile:@var{m} @var{x}) -Represents the same value @var{x} does, but makes the assertion -that it should be treated as a volatile value. This forbids -coalescing multiple accesses or deleting them even if it would -appear to have no effect on the program. @var{x} must be a @samp{mem} -expression with mode @var{m}. - -The first thing the reload pass does to an insn is to remove all -@samp{volatile} expressions from it; each one is replaced by its -operand. - -Recognizers will never recognize anything with @samp{volatile} in it. -This automatically prevents some optimizations on such things -(such as instruction combination). After the reload pass removes -all volatility information, the insns can be recognized. - -Cse removes @samp{volatile} from destinations of @samp{set}'s, because -no optimizations reorder such @samp{set}s. This is not required for -correct code and is done to permit some optimization on the value to -be stored. - -@item (unchanging:@var{m} @var{x}) -Represents the same value @var{x} does, but makes the assertion -that its value is effectively constant during the execution -of the current function. This permits references to @var{x} -to be moved freely within the function. @var{x} must be a @samp{reg} -expression with mode @var{m}. 
- @item (strict_low_part (subreg:@var{m} (reg:@var{n} @var{r}) 0)) This expression code is used in only one context: operand 0 of a @samp{set} expression. In addition, the operand of this expression must be a @samp{subreg} expression. The presence of @samp{strict_low_part} says that the part of the -register which is meaningful in mode @var{n} but is not part of -mode @var{m} is not to be altered. Normally, an assignment to such +register which is meaningful in mode @var{n}, but is not part of +mode @var{m}, is not to be altered. Normally, an assignment to such a subreg is allowed to have undefined effects on the rest of the register when @var{m} is less than a word. @end table @@ -1468,10 +3670,10 @@ Represents the action of storing the val represented by @var{lval}. @var{lval} must be an expression representing a place that can be stored in: @samp{reg} (or @samp{subreg} or @samp{strict_low_part}), @samp{mem}, @samp{pc} or -@samp{cc0}. +@samp{cc0}.@refill If @var{lval} is a @samp{reg}, @samp{subreg} or @samp{mem}, it has a -machine mode; then @var{x} must be valid for that mode. +machine mode; then @var{x} must be valid for that mode.@refill If @var{lval} is a @samp{reg} whose machine mode is less than the full width of the register, then it means that the part of the register @@ -1483,10 +3685,10 @@ rest of the register receives an undefin If @var{lval} is a @samp{strict_low_part} of a @samp{subreg}, then the part of the register specified by the machine mode of the @samp{subreg} is given the value @var{x} and the rest of the register -is not changed. +is not changed.@refill If @var{lval} is @code{(cc0)}, it has no machine mode, and @var{x} may -have any mode. This represents a ``test'' or ``compare'' instruction. +have any mode. This represents a ``test'' or ``compare'' instruction.@refill If @var{lval} is @code{(pc)}, we have a jump instruction, and the possibilities for @var{x} are very limited. 
It may be a @@ -1497,23 +3699,25 @@ does not jump) and the other of the two (for the case which does jump). @var{x} may also be a @samp{mem} or @code{(plus:SI (pc) @var{y})}, where @var{y} may be a @samp{reg} or a @samp{mem}; these unusual patterns are used to represent jumps through -branch tables. +branch tables.@refill @item (return) -Represents a return from the current function, on machines where -this can be done with one instruction, such as Vaxen. On machines -where a multi-instruction ``epilogue'' must be executed in order -to return from the function, returning is done by jumping to a -label which precedes the epilogue, and the @samp{return} expression -code is never used. +Represents a return from the current function, on machines where this +can be done with one instruction, such as Vaxes. On machines where a +multi-instruction ``epilogue'' must be executed in order to return +from the function, returning is done by jumping to a label which +precedes the epilogue, and the @samp{return} expression code is never +used. @item (call @var{function} @var{nargs}) Represents a function call. @var{function} is a @samp{mem} expression -whose address is the address of the function to be called. @var{nargs} -is an expression representing the number of words of argument. +whose address is the address of the function to be called. +@var{nargs} is an expression which can be used for two purposes: on +some machines it represents the number of bytes of stack argument; on +others, it represents the number of argument registers. Each machine has a standard machine mode which @var{function} must -have. The machine descripion defines macro @code{FUNCTION_MODE} to +have. The machine description defines macro @code{FUNCTION_MODE} to expand into the requisite mode name. 
The purpose of this mode is to specify what kind of addressing is allowed, on machines where the allowed kinds of addressing depend on the machine mode being @@ -1526,8 +3730,8 @@ undescribed value into @var{x}, which mu One place this is used is in string instructions that store standard values into particular hard registers. It may not be worth the -trouble to describe the values that are stored, but it is essential -to inform the compiler that the registers will be altered, lest it +trouble to describe the values that are stored, but it is essential to +inform the compiler that the registers will be altered, lest it attempt to keep data in them across the string instruction. @var{x} may also be null---a null C pointer, no expression at all. @@ -1541,23 +3745,22 @@ default to clobber these registers, so t call is assumed to have the potential to alter any memory location. @item (use @var{x}) -Represents the use of the value of @var{x}. It indicates that -the value in @var{x} at this point in the program is needed, -even though it may not be apparent whythis is so. Therefore, the -compiler will not attempt to delete instructions whose only -effect is to store a value in @var{x}. @var{x} must be a @samp{reg} -expression. +Represents the use of the value of @var{x}. It indicates that the +value in @var{x} at this point in the program is needed, even though +it may not be apparent why this is so. Therefore, the compiler will +not attempt to delete instructions whose only effect is to store a +value in @var{x}. @var{x} must be a @samp{reg} expression. @item (parallel [@var{x0} @var{x1} @dots{}]) Represents several side effects performed in parallel. The square brackets stand for a vector; the operand of @samp{parallel} is a vector of expressions. @var{x0}, @var{x1} and so on are individual side effects---expressions of code @samp{set}, @samp{call}, -@samp{return}, @samp{clobber} or @samp{use}. 
+@samp{return}, @samp{clobber} or @samp{use}.@refill -``In parallel'' means that first all the values used in -the individual side-effects are computed, and second all the actual -side-effects are performed. For example, +``In parallel'' means that first all the values used in the individual +side-effects are computed, and second all the actual side-effects are +performed. For example, @example (parallel [(set (reg:SI 1) (mem:SI (reg:SI 1))) @@ -1568,31 +3771,51 @@ side-effects are performed. For example says unambiguously that the values of hard register 1 and the memory location addressed by it are interchanged. In both places where @code{(reg:SI 1)} appears as a memory address it refers to the value -in register 1 @i{before} the execution of the instruction. +in register 1 @emph{before} the execution of the instruction. + +Peephole optimization, which takes place in the last jump-optimization +pass, can produce insns whose patterns consist of a @samp{parallel} +whose elements are the operands needed to output the resulting +assembler code--often @samp{reg}, @samp{mem} or constant expressions. +This would not be well-formed RTL at any other stage in compilation, +but it is ok then because no further optimization remains to be done. +However, the definition of the macro @code{NOTICE_UPDATE_CC} may need +to deal with such insns. + +@item (sequence [@var{insns} @dots{}]) +Represents a sequence of insns. Each of the @var{insns} that appears +in the vector is suitable for appearing in the chain of insns, so it +must be an @samp{insn}, @samp{jump_insn}, @samp{call_insn}, +@samp{code_label}, @samp{barrier} or @samp{note}. + +A @samp{sequence} RTX never appears in an actual insn. It represents +the sequence of insns that result from a @samp{define_expand} +@emph{before} those insns are passed to @code{emit_insn} to insert +them in the chain of insns. When actually inserted, the individual +sub-insns are separated out and the @samp{sequence} is forgotten. 
@end table -Three expression codes appear in place of a side effect, as the body -of an insn, though strictly speaking they do not describe side effects -as such: +Three expression codes appear in place of a side effect, as the body of an +insn, though strictly speaking they do not describe side effects as such: @table @code @item (asm_input @var{s}) Represents literal assembler code as described by the string @var{s}. @item (addr_vec:@var{m} [@var{lr0} @var{lr1} @dots{}]) -Represents a table of jump addresses. @var{lr0} etc. are -@samp{label_ref} expressions. The mode @var{m} specifies how much -space is given to each address; normally @var{m} would be +Represents a table of jump addresses. The vector elements @var{lr0}, +etc., are @samp{label_ref} expressions. The mode @var{m} specifies +how much space is given to each address; normally @var{m} would be @code{Pmode}. @item (addr_diff_vec:@var{m} @var{base} [@var{lr0} @var{lr1} @dots{}]) Represents a table of jump addresses expressed as offsets from -@var{base}. @var{lr0} etc. are @samp{label_ref} expressions and so is -@var{base}. The mode @var{m} specifies how much space is given to -each address-difference. +@var{base}. The vector elements @var{lr0}, etc., are @samp{label_ref} +expressions and so is @var{base}. The mode @var{m} specifies how much +space is given to each address-difference.@refill @end table -@node Incdec, Insns, Side Effects, RTL +@node Incdec, Assembler, Side Effects, RTL @section Embedded Side-Effects on Addresses Four special side-effect expression codes appear as memory addresses. @@ -1603,10 +3826,10 @@ Represents the side effect of decrementi amount and represents also the value that @var{x} has after being decremented. @var{x} must be a @samp{reg} or @samp{mem}, but most machines allow only a @samp{reg}. @var{m} must be the machine mode -for pointers on the machine in use. The amount @var{x} is decrement +for pointers on the machine in use. 
The amount @var{x} is decremented by is the length in bytes of the machine mode of the containing memory reference of which this expression serves as the address. Here is an -example of its use: +example of its use:@refill @example (mem:DF (pre_dec:SI (reg:SI 39))) @@ -1648,7 +3871,46 @@ allow them wherever a memory address is additional parallel stores would require doubling the number of entries in the machine description. -@node Insns, Sharing, Incdec, RTL +@node Assembler, Insns, IncDec, RTL +@section Assembler Instructions as Expressions + +The RTX code @samp{asm_operands} represents a value produced by a +user-specified assembler instruction. It is used to represent +an @code{asm} statement with arguments. An @code{asm} statement with +a single output operand, like this: + +@example +asm ("foo %1,%2,%0" : "a" (outputvar) : "g" (x + y), "di" (*z)); +@end example + +@noindent +is represented using a single @samp{asm_operands} RTX which represents +the value that is stored in @code{outputvar}: + +@example +(set @var{rtx-for-outputvar} + (asm_operands "foo %1,%2,%0" "a" 0 + [@var{rtx-for-addition-result} @var{rtx-for-*z}] + [(asm_input:@var{m1} "g") + (asm_input:@var{m2} "di")])) +@end example + +@noindent +Here the operands of the @samp{asm_operands} RTX are the assembler +template string, the output-operand's constraint, the index-number of the +output operand among the output operands specified, a vector of input +operand RTX's, and a vector of input-operand modes and constraints. The +mode @var{m1} is the mode of the sum @code{x+y}; @var{m2} is that of +@code{*z}. + +When an @code{asm} statement has multiple output values, its insn has +several such @samp{set} RTX's inside of a @samp{parallel}. Each @samp{set} +contains a @samp{asm_operands}; all of these share the same assembler +template and vectors, but each contains the constraint for the respective +output operand. 
They are also distinguished by the output-operand index +number, which is 0, 1, @dots{} for successive output operands. + +@node Insns, Calls, Assembler, RTL @section Insns The RTL representation of the code for a function is a doubly-linked @@ -1656,9 +3918,9 @@ chain of objects called @dfn{insns}. In special codes that are used for no other purpose. Some insns are actual instructions; others represent dispatch tables for @code{switch} statements; others represent labels to jump to or various sorts of -declaratory information. +declarative information. -In addition to its own specific data, each insn must have a unique id number +In addition to its own specific data, each insn must have a unique id-number that distinguishes it from all other insns in the current function, and chain pointers to the preceding and following insns. These three fields occupy the same position in every insn, independent of the expression code @@ -1690,7 +3952,7 @@ is always true. Every insn has one of the following six expression codes: -@table @code +@table @samp @item insn The expression code @samp{insn} is used for instructions that do not jump and do not do function calls. Insns with code @samp{insn} have four @@ -1727,7 +3989,7 @@ They contain no information beyond the t @item note @samp{note} insns are used to represent additional debugging and -declaratory information. They contain two nonstandard fields, an +declarative information. They contain two nonstandard fields, an integer which is accessed with the macro @code{NOTE_LINE_NUMBER} and a string accessed with @code{NOTE_SOURCE_FILE}. @@ -1769,14 +4031,14 @@ An expression for the side effect perfor @item REG_NOTES (@var{i}) A list (chain of @samp{expr_list} expressions) giving information about the usage of registers in this insn. This list is set up by the -@code{flow} pass; it is a null pointer until then. +flow analysis pass; it is a null pointer until then. 
@item LOG_LINKS (@var{i}) A list (chain of @samp{insn_list} expressions) of previous ``related'' insns: insns which store into registers values that are used for the first time in this insn. (An additional constraint is that neither a jump nor a label may come between the related insns). This list is -set up by the @code{flow} pass; it is a null pointer until then. +set up by the flow analysis pass; it is a null pointer until then. @item INSN_CODE (@var{i}) An integer that says which pattern in the machine description matches @@ -1792,37 +4054,68 @@ expressions. Each of these has two oper and the second is another @samp{insn_list} expression (the next one in the chain). The last @samp{insn_list} in the chain has a null pointer as second operand. The significant thing about the chain is which -insns apepar in it (as first operands of @samp{insn_list} +insns appear in it (as first operands of @samp{insn_list} expressions). Their order is not significant. The @code{REG_NOTES} field of an insn is a similar chain but of -@samp{expr_list} expressions instead of @samp{insn_list}. The first -operand is a @samp{reg} rtx. Its presence in the list can have three -possible meanings, distinguished by a value that is stored in the -machine-mode field of the @samp{expr_list} because that is a -conveniently available space, but that is not really a machine mode. -These values belong to the C type @code{enum reg_note} and there are -three of them: +@samp{expr_list} expressions instead of @samp{insn_list}. There are four +kinds of register notes, which are distinguished by the machine mode of the +@samp{expr_list}, which a register note is really understood as being an +@code{enum reg_note}. The first operand @var{op} of the @samp{expr_list} +is data whose meaning depends on the kind of note. 
Here are the four +kinds: @table @code @item REG_DEAD -The @samp{reg} listed dies in this insn; that is to say, altering -the value immediately after this insn would not affect the future -behavior of the program. +The register @var{op} dies in this insn; that is to say, altering the +value immediately after this insn would not affect the future behavior +of the program. @item REG_INC -The @samp{reg} listed is incremented (or decremented; at this level +The register @var{op} is incremented (or decremented; at this level there is no distinction) by an embedded side effect inside this insn. +This means it appears in a @code{POST_INC}, @code{PRE_INC}, +@code{POST_DEC} or @code{PRE_DEC} RTX. + +@item REG_EQUIV +The register that is set by this insn will be equal to @var{op} at run +time, and could validly be replaced in all its occurrences by +@var{op}. (``Validly'' here refers to the data flow of the program; +simple replacement may make some insns invalid.) + +The value which the insn explicitly copies into the register may look +different from @var{op}, but they will be equal at run time. + +For example, when a constant is loaded into a register that is never +assigned any other value, this kind of note is used. + +When a parameter is copied into a pseudo-register at entry to a function, +a note of this kind records that the register is equivalent to the stack +slot where the parameter was passed. Although in this case the register +may be set by other insns, it is still valid to replace the register +by the stack slot throughout the function. + +@item REG_EQUAL +The register that is set by this insn will be equal to @var{op} at run +time at the end of this insn (but not necessarily elsewhere in the +function). + +The RTX @var{op} is typically an arithmetic expression. For example, +when a sequence of insns such as a library call is used to perform an +arithmetic operation, this kind of note is attached to the insn that +produces or copies the final value. 
It tells the CSE pass how to +think of that value. + +@item REG_RETVAL +This insn copies the value of a library call, and @var{op} is the +first insn that was generated to set up the arguments for the library +call. -@item REG_CONST -The @samp{reg} listed has a value that could safely be replaced -everywhere by the value that this insn copies into it. (``Safety'' -here refers to the data flow of the program; such replacement may -require reloading into registers for some of the insns in which -the @samp{reg} is replaced.) +Flow analysis uses this note to delete all of a library call whose +result is dead. @item REG_WAS_0 -The @samp{reg} listed contained zero before this insn. You can rely +The register @var{op} contained zero before this insn. You can rely on this note if it is present; its absence implies nothing. @end table @@ -1832,7 +4125,72 @@ assumed to be an insn and is printed in unique id; the first operand of an @samp{expr_list} is printed in the ordinary way as an expression.) -@node Sharing,, Insns, RTL +@node Calls, Sharing, Insns, RTL +@section RTL Representation of Function-Call Insns + +Insns that call subroutines have the RTL expression code @samp{call_insn}. +These insns must satisfy special rules, and their bodies must use a special +RTL expression code, @samp{call}. + +A @samp{call} expression has two operands, as follows: + +@example +(call @var{nbytes} (mem:@var{fm} @var{addr})) +@end example + +@noindent +Here @var{nbytes} is an operand that represents the number of bytes of +argument data being passed to the subroutine, @var{fm} is a machine mode +(which must equal as the definition of the @code{FUNCTION_MODE} macro in +the machine description) and @var{addr} represents the address of the +subroutine. + +For a subroutine that returns no value, the @samp{call} RTX as shown above +is the entire body of the insn. + +For a subroutine that returns a value whose mode is not @code{BLKmode}, +the value is returned in a hard register. 
If this register's number is +@var{r}, then the body of the call insn looks like this: + +@example +(set (reg:@var{m} @var{r}) + (call @var{nbytes} (mem:@var{fm} @var{addr}))) +@end example + +@noindent +This RTL expression makes it clear (to the optimizer passes) that the +appropriate register receives a useful value in this insn. + +Immediately after RTL generation, if the value of the subroutine is +actually used, this call insn is always followed closely by an insn which +refers to the register @var{r}. This remains true through all the +optimizer passes until cross jumping occurs. + +The following insn has one of two forms. Either it copies the value into a +pseudo-register, like this: + +@example +(set (reg:@var{m} @var{p}) (reg:@var{m} @var{r})) +@end example + +@noindent +or (in the case where the calling function will simply return whatever +value the call produced, and no operation is needed to do this): + +@example +(use (reg:@var{m} @var{r})) +@end example + +@noindent +Between the call insn and this following insn there may intervene only a +stack-adjustment insn (and perhaps some @samp{note} insns). + +When a subroutine returns a @code{BLKmode} value, it is handled by +passing to the subroutine the address of a place to store the value. +So the call insn itself does not ``return'' any value, and it has the +same RTL form as a call that returns nothing. + +@node Sharing,, Calls, RTL @section Structure Sharing Assumptions The compiler assumes that certain kinds of RTL expressions are unique; @@ -1870,15 +4228,34 @@ There is only one @samp{const_double} ex value zero. @item -No @samp{label_ref} appears in more than one place in the RTL structure; -in other words, it is safe to do a tree-walk of all the insns in the function -and assume that each time a @samp{label_ref} is seen it is distinct from all -other @samp{label_refs} seen. 
+No @samp{label_ref} appears in more than one place in the RTL +structure; in other words, it is safe to do a tree-walk of all the +insns in the function and assume that each time a @samp{label_ref} is +seen it is distinct from all others that are seen. + +@item +Only one @samp{mem} object is normally created for each static +variable or stack slot, so these objects are frequently shared in all +the places they appear. However, separate but equal objects for these +variables are occasionally made. + +@item +No RTL object appears in more than one place in the RTL structure +except as described above. Many passes of the compiler rely on this +by assuming that they can modify RTL objects in place without unwanted +side-effects on other insns. + +@item +During initial RTL generation, shared structure is freely introduced. +After all the RTL for a function has been generated, all shared +structure is copied by @code{unshare_all_rtl} in @file{emit-rtl.c}, +after which the above rules are guaranteed to be followed. @item -Aside from the cases listed above, the only kind of expression -object that may appear in more than one place is the @samp{mem} -object that describes a stack slot or a static variable. +During the combiner pass, shared structure with an insn can exist +temporarily. However, the shared structure is copied before the +combiner is finished with the insn. This is done by +@code{copy_substitutions} in @samp{combine.c}. @end itemize @node Machine Desc, Machine Macros, RTL, Top @@ -1897,26 +4274,31 @@ See the next chapter for information on @menu * Patterns:: How to write instruction patterns. -* Example:: Example of an instruction pattern. +* Example:: An explained example of a @samp{define_insn} pattern. +* RTL Template:: The RTL template defines what insns match a pattern. +* Output Template:: The output template says how to make assembler code + from such an insn. +* Output Statement:: For more generality, write C code to output + the assembler code. 
* Constraints:: When not all operands are general operands. * Standard Names:: Names mark patterns to use for code generation. +* Pattern Ordering:: When the order of patterns makes a difference. * Dependent Patterns:: Having one pattern may make you need another. +* Jump Patterns:: Special considerations for patterns for jump insns. +* Peephole Definitions::Defining machine-specific peephole optimizations. +* Expander Definitions::Generating a sequence of several RTL insns + for a standard operation. @end menu @node Patterns, Example, Machine Desc, Machine Desc -@section Instruction Patterns +@section Everything about Instruction Patterns Each instruction pattern contains an incomplete RTL expression, with pieces to be filled in later, operand constraints that restrict how the pieces can be filled in, and an output pattern or C code to generate the assembler output, all wrapped up in a @samp{define_insn} expression. -Sometimes an insn can match more than one instruction pattern. Then the -pattern that appears first in the machine description is the one used. -Therefore, more specific patterns should usually go first in the -description. - -The @samp{define_insn} expression contains four operands: +A @samp{define_insn} is an RTL expression containing four operands: @enumerate @item @@ -1935,10 +4317,10 @@ Names that are not thus known and used i effect; they are equivalent to no name at all. @item -The recognition template. This is a vector of incomplete RTL -expressions which show what the instruction should look like. It is -incomplete because it may contain @samp{match_operand} and -@samp{match_dup} expressions that stand for operands of the +The @dfn{RTL template} (@pxref{RTL Template}) is a vector of +incomplete RTL expressions which show what the instruction should look +like. It is incomplete because it may contain @samp{match_operand} +and @samp{match_dup} expressions that stand for operands of the instruction. 
If the vector has only one element, that element is what the @@ -1962,50 +4344,55 @@ recognition template. The insn's operan @code{operands}. @item -A string that says how to output matching insns as assembler code. In -the simpler case, the string is an output template, much like a -@code{printf} control string. @samp{%} in the string specifies where -to insert the operands of the instruction; the @samp{%} is followed by -a single-digit operand number. +The @dfn{output template}: a string that says how to output matching +insns as assembler code. @samp{%} in this string specifies where +to substitute the value of an operand. @xref{Output Template}. -@samp{%c@var{digit}} can be used to subtitute an operand that is a -constant value without the syntax that normally indicates an immediate -operand. +When simple substitution isn't general enough, you can specify a piece +of C code to compute the output. @xref{Output Statement}. +@end enumerate -@samp{%a@var{digit}} can be used to substitute an operand as if it -were a memory reference, with the actual operand treated as the address. -This may be useful when outputting a ``load address'' instruction, -because often the assembler syntax for such an instruction requires -you to write the operand as if it were a memory reference. +@node Example, RTL Template, Patterns, Machine Desc +@section Example of @samp{define_insn} -The template may generate multiple assembler instructions. -Write the text for the instructions, with @samp{\;} between them. +Here is an actual example of an instruction pattern, for the 68000/68020. -If the output control string starts with a @samp{*}, then it is not an -output template but rather a piece of C program that should compute a -template. It should execute a @code{return} statement to return the -template-string you want. Most such templates use C string literals, -which require doublequote characters to delimit them. 
To include -these doublequote characters in the string, prefix each one with -@samp{\}. - -The operands may be found in the array @code{operands}, whose C -data type is @code{rtx []}. - -It is possible to output an assembler instruction and then go on to -output or compute more of them, using the subroutine -@code{output_asm_insn}. This receives two arguments: a -template-string and a vector of operands. The vector may be -@code{operands}, or it may be another array of @code{rtx} that you -declare locally and initialize yourself. -@end enumerate +@example +(define_insn "tstsi" + [(set (cc0) + (match_operand:SI 0 "general_operand" "rm"))] + "" + "* +@{ if (TARGET_68020 || ! ADDRESS_REG_P (operands[0])) + return \"tstl %0\"; + return \"cmpl #0,%0\"; @}") +@end example -The recognition template is used also, for named patterns, for -constructing insns. Construction involves substituting specified -operands into a copy of the template. Matching involves determining -the values that serve as the operands in the insn being matched. Both -of these activities are controlled by two special expression types -that direct matching and substitution of the operands. +This is an instruction that sets the condition codes based on the value of +a general operand. It has no condition, so any insn whose RTL description +has the form shown may be handled according to this pattern. The name +@samp{tstsi} means ``test a @code{SImode} value'' and tells the RTL generation +pass that, when it is necessary to test such a value, an insn to do so +can be constructed using this pattern. + +The output control string is a piece of C code which chooses which +output template to return based on the kind of operand and the specific +type of CPU for which code is being generated. + +@samp{"rm"} is an operand constraint. Its meaning is explained below. 
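The choice made by the C code in the @samp{tstsi} pattern above can be sketched as an ordinary C function. This is only an illustration: the two boolean parameters are hypothetical stand-ins for the @code{TARGET_68020} flag and the @code{ADDRESS_REG_P (operands[0])} test, and the function names are not part of GNU CC.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the C code of the "tstsi" pattern.
   target_68020      plays the role of TARGET_68020;
   operand_is_areg   plays the role of ADDRESS_REG_P (operands[0]).
   Returns the output template that the pattern would return. */
static const char *tstsi_template(bool target_68020, bool operand_is_areg)
{
    if (target_68020 || !operand_is_areg)
        return "tstl %0";
    /* Plain 68000 with an address-register operand: compare with zero. */
    return "cmpl #0,%0";
}

/* Small self-check covering all four combinations of the two inputs. */
static void tstsi_self_check(void)
{
    assert(strcmp(tstsi_template(true, true), "tstl %0") == 0);
    assert(strcmp(tstsi_template(true, false), "tstl %0") == 0);
    assert(strcmp(tstsi_template(false, false), "tstl %0") == 0);
    assert(strcmp(tstsi_template(false, true), "cmpl #0,%0") == 0);
}
```

Only one of the four combinations selects the @samp{cmpl} template, which is why the pattern tests for the common case first.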
+ +@node RTL Template, Output Template, Example, Machine Desc +@section RTL Template for Generating and Recognizing Insns + +The RTL template is used to define which insns match the particular pattern +and how to find their operands. For named patterns, the RTL template also +says how to construct an insn from specified operands. + +Construction involves substituting specified operands into a copy of the +template. Matching involves determining the values that serve as the +operands in the insn being matched. Both of these activities are +controlled by special expression types that direct matching and +substitution of the operands. @table @code @item (match_operand:@var{m} @var{n} @var{testfn} @var{constraint}) @@ -2018,8 +4405,9 @@ pattern will not match at all. Operand numbers must be chosen consecutively counting from zero in each instruction pattern. There may be only one @samp{match_operand} -expression in the pattern for each expression number, and they must -appear in order of increasing expression number. +expression in the pattern for each operand number. Usually operands +are numbered in the order of appearance in @samp{match_operand} +expressions. @var{testfn} is a string that is the name of a C function that accepts two arguments, a machine mode and an expression. During matching, @@ -2032,7 +4420,16 @@ Most often, @var{testfn} is @code{"gener that the putative operand is either a constant, a register or a memory reference, and that it is valid for mode @var{m}. -@var{constraint} is explained later. +For an operand that must be a register, @var{testfn} should be +@code{"register_operand"}. This prevents GNU CC from creating insns +that have memory references in these operands, insns which would only +have to be taken apart in the reload pass. + +For an operand that must be a constant, either @var{testfn} should be +@code{"immediate_operand"}, or the instruction pattern's extra condition +should check for constants, or both. 
+ +@var{constraint} is explained later (@pxref{Constraints}). @item (match_dup @var{n}) This expression is also a placeholder for operand number @var{n}. @@ -2040,10 +4437,10 @@ It is used when the operand needs to app insn. In construction, @samp{match_dup} behaves exactly like -@var{match_operand}: the operand is substituted into the insn being +@samp{match_operand}: the operand is substituted into the insn being constructed. But in matching, @samp{match_dup} behaves differently. It assumes that operand number @var{n} has already been determined by -a @samp{match_operand} apparing earlier in the recognition template, +a @samp{match_operand} appearing earlier in the recognition template, and it matches only an identical-looking expression. @item (address (match_operand:@var{m} @var{n} "address_operand" "")) @@ -2070,36 +4467,113 @@ modes and these modes might be written i expression. @end table -@node Example, Constraints, Patterns, Machine Desc -@section Example of @samp{define_insn} +@node Output Template, Output Statement, RTL Template, Machine Desc +@section Output Templates and Operand Substitution -Here is an actual example of an instruction pattern, for the 68000/68020. +The @dfn{output template} is a string which specifies how to output +the assembler code for an instruction pattern. Most of the template +is a fixed string which is output literally. The character @samp{%} +is used to specify where to substitute an operand; it can also be +used to identify places different variants of the assembler require +different syntax. + +In the simplest case, a @samp{%} followed by a digit @var{n} says to output +operand @var{n} at that point in the string. + +@samp{%} followed by a letter and a digit says to output an operand in an +alternate fashion. Four letters have standard, built-in meanings described +below. The machine description macro @code{PRINT_OPERAND} can define +additional letters with nonstandard meanings. 
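As a rough illustration of the simplest kind of substitution, here is a minimal C sketch that expands @samp{%} followed by a digit, plus the standard @samp{%%} case. It is not the code GNU CC uses: the function @code{expand_template} is hypothetical, operands are plain strings rather than @code{rtx} objects, and the lettered forms and @code{PRINT_OPERAND} are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GNU CC.
   Copies tmpl into out, replacing "%N" (N a single digit) with
   operands[N] and "%%" with a literal '%'.  Output that would not
   fit in outsize bytes (including the terminator) is truncated. */
static void expand_template(const char *tmpl, const char *const operands[],
                            char *out, size_t outsize)
{
    size_t used = 0;

    assert(out != NULL && outsize > 0);
    out[0] = '\0';

    for (const char *p = tmpl; *p != '\0'; p++) {
        char single[2] = { *p, '\0' };   /* default: copy one character */
        const char *piece = single;

        if (p[0] == '%' && isdigit((unsigned char) p[1])) {
            piece = operands[p[1] - '0'];  /* operand substitution */
            p++;
        } else if (p[0] == '%' && p[1] == '%') {
            piece = "%";                   /* literal percent sign */
            p++;
        }

        size_t len = strlen(piece);
        if (used + len >= outsize)
            break;                         /* truncate rather than overflow */
        memcpy(out + used, piece, len + 1);
        used += len;
    }
}
```

For instance, with operands @code{"d0"} and @code{"a1"}, the template @code{"movl %1,%0"} expands to @code{movl a1,d0}.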
+ +@samp{%c@var{digit}} can be used to substitute an operand that is a +constant value without the syntax that normally indicates an immediate +operand. + +@samp{%n@var{digit}} is like @samp{%c@var{digit}} except that the value of +the constant is negated before printing. + +@samp{%a@var{digit}} can be used to substitute an operand as if it were a +memory reference, with the actual operand treated as the address. This may +be useful when outputting a ``load address'' instruction, because often the +assembler syntax for such an instruction requires you to write the operand +as if it were a memory reference. + +@samp{%l@var{digit}} is used to substitute a @code{label_ref} into a jump +instruction. + +@samp{%} followed by a punctuation character specifies a substitution that +does not use an operand. Only one case is standard: @samp{%%} outputs a +@samp{%} into the assembler code. Other nonstandard cases can be +defined in the @code{PRINT_OPERAND} macro. + +The template may generate multiple assembler instructions. Write the text +for the instructions, with @samp{\;} between them. + +When the RTL contains two operands which are required by constraint to match +each other, the output template must refer only to the lower-numbered operand. +Matching operands are not always identical, and the rest of the compiler +arranges to put the proper RTL expression for printing into the lower-numbered +operand. + +One use of nonstandard letters or punctuation following @samp{%} is to +distinguish between different assembler languages for the same machine; for +example, Motorola syntax versus MIT syntax for the 68000. Motorola syntax +requires periods in most opcode names, while MIT syntax does not. For +example, the opcode @samp{movel} in MIT syntax is @samp{move.l} in Motorola +syntax. The same file of patterns is used for both kinds of output syntax, +but the character sequence @samp{%.} is used in each place where Motorola +syntax wants a period.
The @code{PRINT_OPERAND} macro for Motorola syntax +defines the sequence to output a period; the macro for MIT syntax defines +it to do nothing. + +@node Output Statement, Constraints, Output Template, Machine Desc +@section C Statements for Generating Assembler Output + +Often a single fixed template string cannot produce correct and efficient +assembler code for all the cases that are recognized by a single +instruction pattern. For example, the opcodes may depend on the kinds of +operands; or some unfortunate combinations of operands may require extra +machine instructions. + +If the output control string starts with a @samp{*}, then it is not an +output template but rather a piece of C program that should compute a +template. It should execute a @code{return} statement to return the +template-string you want. Most such templates use C string literals, which +require doublequote characters to delimit them. To include these +doublequote characters in the string, prefix each one with @samp{\}. + +The operands may be found in the array @code{operands}, whose C data type +is @code{rtx []}. + +It is possible to output an assembler instruction and then go on to output +or compute more of them, using the subroutine @code{output_asm_insn}. This +receives two arguments: a template-string and a vector of operands. The +vector may be @code{operands}, or it may be another array of @code{rtx} +that you declare locally and initialize yourself. + +When an insn pattern has multiple alternatives in its constraints, often +the appearance of the assembler code is determined mostly by which alternative +was matched. When this is so, the C code can test the variable +@code{which_alternative}, which is the ordinal number of the alternative +that was actually satisfied (0 for the first, 1 for the second alternative, +etc.). + +For example, suppose there are two opcodes for storing zero, @samp{clrreg} +for registers and @samp{clrmem} for memory locations.
Here is how +a pattern could use @code{which_alternative} to choose between them: @example -(define_insn "tstsi" - [(set (cc0) - (match_operand:SI 0 "general_operand" "rm"))] +(define_insn "" + [(set (match_operand:SI 0 "general_operand" "r,m") + (const_int 0))] "" "* -@{ if (TARGET_68020 || ! ADDRESS_REG_P (operands[0])) - return \"tstl %0\"; - return \"cmpl #0,%0\"; @}") + return (which_alternative == 0 + ? \"clrreg %0\" : \"clrmem %0\"); + ") @end example -This is an instruction that sets the condition codes based on the value of -a general operand. It has no condition, so any insn whose RTL description -has the form shown may be handled according to this pattern. The name -@samp{tstsi} means ``test a @code{SImode} value'' and tells the RTL generation -pass that, when it is necessary to test such a value, an insn to do so -can be constructed using this pattern. - -The output control string is a piece of C code which chooses which -output template to return based on the kind of operand and the specific -type of CPU for which code is being generated. - -@samp{"rm"} is an operand constraint. Its meaning is explained below. - -@node Constraints, Standard Names, Example, Machine Desc +@node Constraints, Standard Names, Output Statement, Machine Desc @section Operand Constraints Each @samp{match_operand} in an instruction pattern can specify a @@ -2111,7 +4585,7 @@ have. Constraints can also require two @menu * Simple Constraints:: Basic use of constraints. -* Multi-alternative:: When an insn has two alternative constraint-patterns. +* Multi-Alternative:: When an insn has two alternative constraint-patterns. * Class Preferences:: Constraints guide which hard register to put things in. * Modifiers:: More precise control over effects of constraints. * No Constraints:: Describing a clean machine without constraints. @@ -2124,88 +4598,122 @@ The simplest kind of constraint is a str which describes one kind of operand that is permitted. 
Here are the letters that are allowed: -@table @samp -@item m +@table @asis +@item @samp{m} A memory operand is allowed, with any kind of address that the machine supports in general. -@item o -A memory operand is allowed, but only if the address is @dfn{offsetable}. -This means that adding a small integer (actually, the width in bytes of the -operand, as determined by its machine mode) may be added to the address -and the result is also a valid memory address. For example, an address -which is constant is offsetable; so is an address that is the sum of -a register and a constant (as long as a slightly larger constant is also -within the range of address-offsets supported by the machine); but an -autoincrement or autodecrement address is not offsetable. More complicated -indirect/indexed addresses may or may not be offsetable depending on the -other addressing modes that the machine supports. +@item @samp{o} +A memory operand is allowed, but only if the address is +@dfn{offsetable}. This means that adding a small integer (actually, +the width in bytes of the operand, as determined by its machine mode) +may be added to the address and the result is also a valid memory +address. + +For example, an address which is constant is offsetable; so is an +address that is the sum of a register and a constant (as long as a +slightly larger constant is also within the range of address-offsets +supported by the machine); but an autoincrement or autodecrement +address is not offsetable. More complicated indirect/indexed +addresses may or may not be offsetable depending on the other +addressing modes that the machine supports. + +Note that in an output operand which can be matched by another +operand, the constraint letter @samp{o} is valid only when accompanied +by both @samp{<} (if the target machine has predecrement addressing) +and @samp{>} (if the target machine has preincrement addressing). 
-@item < +@item @samp{<} A memory operand with autodecrement addressing (either predecrement or postdecrement) is allowed. -@item > +@item @samp{>} A memory operand with autoincrement addressing (either preincrement or postincrement) is allowed. -@item r -A register operand is allowed provided that it is in a general register. +@item @samp{r} +A register operand is allowed provided that it is in a general +register. -@item d -@itemx a -@itemx f -@itemx @dots{} +@item @samp{d}, @samp{a}, @samp{f}, @dots{} Other letters can be defined in machine-dependent fashion to stand for particular classes of registers. @samp{d}, @samp{a} and @samp{f} are -defined on the 68000/68020 to stand for data, address and floating point -registers. +defined on the 68000/68020 to stand for data, address and floating +point registers. -@item i +@item @samp{i} An immediate integer operand (one with constant value) is allowed. +This includes symbolic constants whose values will be known only at +assembly time. -@item I -@item J -@item K -@itemx @dots{} -Other letters in the range @samp{I} through @samp{M} may be defined in a -machine-dependent fashion to permit immediate integer operands with -explicit integer values in specified ranges. For example, on the 68000, -@samp{I} is defined to stand for the range of values 1 to 8. This is the -range permitted as a shift count in the shift instructions. +@item @samp{n} +An immediate integer operand with a known numeric value is allowed. +Many systems cannot support assembly-time constants for operands less +than a word wide. Constraints for these operands should use @samp{n} +rather than @samp{i}. + +@item @samp{I}, @samp{J}, @samp{K}, @dots{} +Other letters in the range @samp{I} through @samp{M} may be defined in +a machine-dependent fashion to permit immediate integer operands with +explicit integer values in specified ranges. For example, on the +68000, @samp{I} is defined to stand for the range of values 1 to 8. 
+This is the range permitted as a shift count in the shift +instructions. -@item F +@item @samp{F} An immediate floating operand (expression code @samp{const_double}) is allowed. -@item G -@itemx H +@item @samp{G}, @samp{H} @samp{G} and @samp{H} may be defined in a machine-dependent fashion to permit immediate floating operands in particular ranges of values. -@item s +@item @samp{s} An immediate integer operand whose value is not an explicit integer is -allowed. This might appear strange; if an insn allows a constant operand -with a value not known at compile time, it certainly must allow any known +allowed. + +This might appear strange; if an insn allows a constant operand with a +value not known at compile time, it certainly must allow any known value. So why use @samp{s} instead of @samp{i}? Sometimes it allows -better code to be generated. For example, on the 68000 in a fullword -instruction it is possible to use an immediate operand; but if the -immediate value is between -32 and 31, better code results from loading the -value into a register and using the register. This is because the load -into the register can be done with a @samp{moveq} instruction. We arrange -for this to happen by defining the letter @samp{K} to mean ``any integer -outside the range -32 to 31'', and then specifying @samp{Ks} in the operand +better code to be generated. + +For example, on the 68000 in a fullword instruction it is possible to +use an immediate operand; but if the immediate value is between -32 +and 31, better code results from loading the value into a register and +using the register. This is because the load into the register can be +done with a @samp{moveq} instruction. We arrange for this to happen +by defining the letter @samp{K} to mean ``any integer outside the +range -32 to 31'', and then specifying @samp{Ks} in the operand constraints. 
-@item g +@item @samp{g} Any register, memory or immediate integer operand is allowed, except for registers that are not general registers. -@item @r{@var{n}, a digit} -An operand identical to operand number @var{n} is allowed. +@item @samp{@var{n}} (a digit) +An operand that matches operand number @var{n} is allowed. If a digit is used together with letters, the digit should come last. -@item p +This is called a @dfn{matching constraint} and what it really means is +that the assembler has only a single operand that fills two roles +considered separate in the RTL insn. For example, an add insn has two +input operands and one output operand in the RTL, but on most machines +an add instruction really has only two operands, one of them an +input-output operand. + +Matching constraints work only in circumstances like that add insn. +More precisely, the matching constraint must appear in an input-only +operand and the operand that it matches must be an output-only operand +with a lower number. + +For operands to match in a particular case usually means that they +are identical-looking RTL expressions. But in a few special cases +specific kinds of dissimilarity are allowed. For example, @code{*x} +as an input operand will match @code{*x++} as an output operand. +For proper results in such cases, the output template should always +use the output-operand's number when printing the operand. + +@item @samp{p} An operand that is a valid memory address is allowed. This is for ``load address'' and ``push address'' instructions. @@ -2216,7 +4724,7 @@ If @samp{p} is used in the constraint, t In order to have valid assembler code, each operand must satisfy its constraint. But a failure to do so does not prevent the pattern from applying to an insn. Instead, it directs the compiler to modify -the code such that the constraint will be satisfied. Usually this is +the code so that the constraint will be satisfied. Usually this is done by copying an operand into a register. 
Contrast, therefore, the two instruction patterns that follow: @@ -2303,8 +4811,8 @@ Here is how it is done for fullword logi @example (define_insn "iorsi3" [(set (match_operand:SI 0 "general_operand" "=%m,d") - (ior:SI (match_operand:SI 1 "general_operand" "0,0") - (match_operand:SI 2 "general_operand" "dKs,dmKs")))] + (ior:SI (match_operand:SI 1 "general_operand" "0,0") + (match_operand:SI 2 "general_operand" "dKs,dmKs")))] @dots{}) @end example @@ -2335,6 +4843,24 @@ When operands must be copied into regist never choose this alternative as the one to strive for. @end table +When an insn pattern has multiple alternatives in its constraints, +often the appearance of the assembler code is determined mostly by which +alternative was matched. When this is so, the C code for writing the +assembler code can use the variable @code{which_alternative}, which is +the ordinal number of the alternative that was actually satisfied +(0 for the first, 1 for the second alternative, etc.). For example: + +@example +(define_insn "" + [(set (match_operand:SI 0 "general_operand" "r,m") + (const_int 0))] + "" + "* + return (which_alternative == 0 + ? \"clrreg %0\" : \"clrmem %0\"); + ") +@end example + @node Class Preferences, Modifiers, Multi-Alternative, Constraints @subsection Register Class Preferences @@ -2356,8 +4882,8 @@ classes are defined. Then none of this @table @samp @item = -Means that this operand is written by the instruction, but its previous -value is not used. +Means that this operand is write-only for this instruction: the previous +value is discarded and replaced by output data. @item + Means that this operand is both read and written by the instruction. @@ -2368,19 +4894,68 @@ which are outputs from it. @samp{=} ide identifies an operand that is both input and output; all other operands are assumed to be input only. +@item & +Means (in a particular alternative) that this operand is written +before the instruction is finished using the input operands.
+Therefore, this operand may not lie in a register that is used as an +input operand or as part of any memory address. + +@samp{&} applies only to the alternative in which it is written. In +constraints with multiple alternatives, sometimes one alternative +requires @samp{&} while others do not. See, for example, the +@samp{movdf} insn of the 68000. + +@samp{&} does not obviate the need to write @samp{=}. + @item % -Declares the instruction to be commutative for operands 1 and 2. -This means that the compiler may interchange operands 1 and 2 -if that will make the operands fit their constraints. +Declares the instruction to be commutative for this operand and the +following operand. This means that the compiler may interchange the +two operands if that is the cheapest way to make all operands fit the +constraints. This is often used in patterns for addition instructions +that really have only two operands: the result must go in one of the +arguments. Here for example, is how the 68000 halfword-add +instruction is defined: + +@example +(define_insn "addhi3" + [(set (match_operand:HI 0 "general_operand" "=m,r") + (plus:HI (match_operand:HI 1 "general_operand" "%0,0") + (match_operand:HI 2 "general_operand" "di,g")))] + @dots{}) +@end example + +Note that in previous versions of GNU CC the @samp{%} constraint +modifier always applied to operands 1 and 2 regardless of which +operand it was written in. The usual custom was to write it in +operand 0. Now it must be in operand 1 if the operands to be +exchanged are 1 and 2. @item # -Says that all following characters, up to the next comma, are to be ignored -as a constraint. They are significant only for choosing register preferences. +Says that all following characters, up to the next comma, are to be +ignored as a constraint. They are significant only for choosing +register preferences. @item * Says that the following character should be ignored when choosing -register preferences. 
@samp{*} has no effect on the meaning of -the constraint as a constraint. +register preferences. @samp{*} has no effect on the meaning of the +constraint as a constraint. + +Here is an example: the 68000 has an instruction to sign-extend a +halfword in a data register, and can also sign-extend a value by +copying it into an address register. While either kind of register is +acceptable, the constraints on an address-register destination are +less strict, so it is best if register allocation makes an address +register its goal. Therefore, @samp{*} is used so that the @samp{d} +constraint letter (for data register) is ignored when computing +register preferences. + +@example +(define_insn "extendhisi2" + [(set (match_operand:SI 0 "general_operand" "=*d,a") + (sign_extend:SI + (match_operand:HI 1 "general_operand" "0,g")))] + @dots{}) +@end example @end table @node No Constraints,, Modifiers, Constraints @@ -2388,12 +4963,12 @@ the constraint as a constraint. Some machines are so clean that operand constraints are not required. For example, on the Vax, an operand valid in one context is valid in any other -context. On such a machine, every operand constraint would be @samp{"g"}, +context. On such a machine, every operand constraint would be @samp{g}, excepting only operands of ``load address'' instructions which are written as if they referred to a memory location's contents but actual -refer to its address. They would have constraint @samp{"p"}. +refer to its address. They would have constraint @samp{p}. -For such machines, instead of writing @samp{"g"} and @samp{"p"} for all +For such machines, instead of writing @samp{g} and @samp{p} for all the constraints, you can choose to write a description with empty constraints. Then you write @samp{""} for the constraint in every @samp{match_operand}. 
Address operands are identified by writing an @samp{address} expression @@ -2402,16 +4977,16 @@ around the @samp{match_operand}, not by When the machine description has just empty constraints, certain parts of compilation are skipped, making the compiler faster. -@node Standard Names, Dependent Patterns, Constraints, Machine Desc -@section Standard Insn Names +@node Standard Names, Pattern Ordering, Constraints, Machine Desc +@section Standard Names for Patterns Used in Generation Here is a table of the instruction names that are meaningful in the RTL generation pass of the compiler. Giving one of these names to an instruction pattern tells the RTL generation pass that it can use the pattern in to accomplish a certain task. -@table @samp -@item mov@var{m} +@table @asis +@item @samp{mov@var{m}} Here @var{m} is a two-letter machine mode name, in lower case. This instruction pattern moves data with that machine mode from operand 1 to operand 0. For example, @samp{movsi} moves full-word data. @@ -2421,56 +4996,80 @@ natural mode is wider than @var{m}, the to store the specified value in the part of the register that corresponds to mode @var{m}. The effect on the rest of the register is undefined. -@item movstrict@var{m} +This class of patterns is special in several ways. First of all, each +of these names @emph{must} be defined, because there is no other way +to copy a datum from one place to another. + +Second, these patterns are not used solely in the RTL generation pass. +Even the reload pass can generate move insns to copy values from stack +slots into temporary registers. When it does so, one of the operands +is a hard register and the other is an operand that can have a reload. + +Therefore, when given such a pair of operands, the pattern must +generate RTL which needs no temporary registers---no registers other +than the operands. 
For example, if you support the pattern with a +@code{define_expand}, then in such a case you mustn't call +@code{force_reg} or any other such function which might generate new +pseudo registers. + +This requirement exists even for subword modes on a RISC machine where +fetching those modes from memory normally requires several insns and +some temporary registers. Look in @file{spur.md} to see how the +requirement is satisfied. + +The variety of operands that have reloads depends on the rest of the +machine description, but typically on a RISC machine these can only be +pseudo registers that did not get hard registers, while on other +machines explicit memory references will get optional reloads. + +In addition, the constraints must allow any hard register to be moved +to any other hard register (provided that @code{HARD_REGNO_MODE_OK} +permits mode @var{m} in each of the registers). + +@item @samp{movstrict@var{m}} Like @samp{mov@var{m}} except that if operand 0 is a @samp{subreg} with mode @var{m} of a register whose natural mode is wider, the @samp{movstrict@var{m}} instruction is guaranteed not to alter any of the register except the part which belongs to mode @var{m}. -@item add@var{m}3 +@item @samp{add@var{m}3} Add operand 2 and operand 1, storing the result in operand 0. All operands must have mode @var{m}. This can be used even on two-address machines, by means of constraints requiring operands 1 and 0 to be the same location. -@item sub@var{m}3 -@itemx mul@var{m}3 -@itemx umul@var{m}3 -@itemx div@var{m}3 -@itemx udiv@var{m}3 -@itemx mod@var{m}3 -@itemx umod@var{m}3 -@itemx and@var{m}3 -@itemx ior@var{m}3 -@itemx xor@var{m}3 +@item @samp{sub@var{m}3}, @samp{mul@var{m}3}, @samp{umul@var{m}3}, @samp{div@var{m}3}, @samp{udiv@var{m}3}, @samp{mod@var{m}3}, @samp{umod@var{m}3}, @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3} Similar, for other arithmetic operations. 
-@item andcb@var{m}3 +There are special considerations for register classes for logical-and +instructions, affecting also the macro @code{PREFERRED_RELOAD_CLASS}. +They apply not only to the patterns with these standard names, but to +any patterns that will match such an instruction. @xref{Register +Classes}. + +@item @samp{andcb@var{m}3} Bitwise logical-and operand 1 with the complement of operand 2 and store the result in operand 0. -@item mulhisi3 +@item @samp{mulhisi3} Multiply operands 1 and 2, which have mode @code{HImode}, and store a @code{SImode} product in operand 0. -@item mulqihi3 -@itemx mulsidi3 +@item @samp{mulqihi3}, @samp{mulsidi3} Similar widening-multiplication instructions of other widths. -@item umulqihi3 -@item umulhisi3 -@item umulsidi3 +@item @samp{umulqihi3}, @samp{umulhisi3}, @samp{umulsidi3} Similar widening-multiplication instructions that do unsigned multiplication. -@item divmod@var{m}4 +@item @samp{divmod@var{m}4} Signed division that produces both a quotient and a remainder. Operand 1 is divided by operand 2 to produce a quotient stored in operand 0 and a remainder stored in operand 3. -@item udivmod@var{m}4 +@item @samp{udivmod@var{m}4} Similar, but does unsigned division. -@item divmod@var{m}@var{n}4 +@item @samp{divmod@var{m}@var{n}4} Like @samp{divmod@var{m}4} except that only the dividend has mode @var{m}; the divisor, quotient and remainder have mode @var{n}. For example, the Vax has a @samp{divmoddisi4} instruction @@ -2479,71 +5078,122 @@ is so slow that it is faster to compute circumlocution that the compiler will use if this instruction is not available). -@item ashl@var{m}3 +@item @samp{ashl@var{m}3} Arithmetic-shift operand 1 left by a number of bits specified by operand 2, and store the result in operand 0. Operand 2 has mode @code{SImode}, not mode @var{m}. 
-@item ashr@var{m}3 -@itemx lshl@var{m}3 -@itemx lshr@var{m}3 -@itemx rotl@var{m}3 -@itemx rotr@var{m}3 +@item @samp{ashr@var{m}3}, @samp{lshl@var{m}3}, @samp{lshr@var{m}3}, @samp{rotl@var{m}3}, @samp{rotr@var{m}3} Other shift and rotate instructions. -@item neg@var{m}2 +Logical and arithmetic left shift are the same. Machines that do not +allow negative shift counts often have only one instruction for +shifting left. On such machines, you should define a pattern named +@samp{ashl@var{m}3} and leave @samp{lshl@var{m}3} undefined. + +There are special considerations for register classes for shift +instructions, affecting also the macro @code{PREFERRED_RELOAD_CLASS}. +They apply not only to the patterns with these standard names, but to +any patterns that will match such an instruction. @xref{Register +Classes}. + +@item @samp{neg@var{m}2} Negate operand 1 and store the result in operand 0. -@item abs@var{m}2 +@item @samp{abs@var{m}2} Store the absolute value of operand 1 into operand 0. -@item sqrt@var{m}2 +@item @samp{sqrt@var{m}2} Store the square root of operand 1 into operand 0. -@item one_cmpl@var{m}2 +@item @samp{ffs@var{m}2} +Store into operand 0 one plus the index of the least significant 1-bit +of operand 1. If operand 1 is zero, store zero. @var{m} is the mode +of operand 0; operand 1's mode is specified by the instruction +pattern, and the compiler will convert the operand to that mode before +generating the instruction. + +@item @samp{one_cmpl@var{m}2} Store the bitwise-complement of operand 1 into operand 0. -@item cmp@var{m} +@item @samp{cmp@var{m}} Compare operand 0 and operand 1, and set the condition codes. 
+The RTL pattern should look like this: -@item tst@var{m} +@example +(set (cc0) (minus (match_operand:@var{m} 0 @dots{}) + (match_operand:@var{m} 1 @dots{}))) +@end example + +Each such definition in the machine description, for integer mode +@var{m}, must have a corresponding @samp{tst@var{m}} pattern, because +optimization can simplify the compare into a test when operand 1 is +zero. + +@item @samp{tst@var{m}} Compare operand 0 against zero, and set the condition codes. +The RTL pattern should look like this: -@item movstr@var{m} +@example +(set (cc0) (match_operand:@var{m} 0 @dots{})) +@end example + +@item @samp{movstr@var{m}} Block move instruction. The addresses of the destination and source strings are the first two operands, and both are in mode @code{Pmode}. The number of bytes to move is the third operand, in mode @var{m}. -@item cmpstr@var{m} +@item @samp{cmpstr@var{m}} Block compare instruction, with operands like @samp{movstr@var{m}} except that the two memory blocks are compared byte by byte in lexicographic order. The effect of the instruction is to set the condition codes. -@item float@var{m}@var{n}2 -Convert operand 1 (valid for floating point mode @var{m}) to fixed -point mode @var{n} and store in operand 0 (which has mode @var{n}). - -@item fix@var{m}@var{n}2 +@item @samp{float@var{m}@var{n}2} Convert operand 1 (valid for fixed point mode @var{m}) to floating point mode @var{n} and store in operand 0 (which has mode @var{n}). -@item trunc@var{m}@var{n} +@item @samp{fix@var{m}@var{n}2} +Convert operand 1 (valid for floating point mode @var{m}) to fixed +point mode @var{n} as a signed number and store in operand 0 (which +has mode @var{n}). This instruction's result is defined only when +the value of operand 1 is an integer. + +@item @samp{fixuns@var{m}@var{n}2} +Convert operand 1 (valid for floating point mode @var{m}) to fixed +point mode @var{n} as an unsigned number and store in operand 0 (which +has mode @var{n}). 
This instruction's result is defined only when the +value of operand 1 is an integer. + +@item @samp{ftrunc@var{m}2} +Convert operand 1 (valid for floating point mode @var{m}) to an +integer value, still represented in floating point mode @var{m}, and +store it in operand 0 (valid for floating point mode @var{m}). + +@item @samp{fix_trunc@var{m}@var{n}2} +Like @samp{fix@var{m}@var{n}2} but works for any floating point value +of mode @var{m} by converting the value to an integer. + +@item @samp{fixuns_trunc@var{m}@var{n}2} +Like @samp{fixuns@var{m}@var{n}2} but works for any floating point +value of mode @var{m} by converting the value to an integer. + +@item @samp{trunc@var{m}@var{n}} Truncate operand 1 (valid for mode @var{m}) to mode @var{n} and store in operand 0 (which has mode @var{n}). Both modes must be fixed point or both floating point. -@item extend@var{m}@var{n} +@item @samp{extend@var{m}@var{n}} Sign-extend operand 1 (valid for mode @var{m}) to mode @var{n} and store in operand 0 (which has mode @var{n}). Both modes must be fixed point or both floating point. -@item zero_extend@var{m}@var{n} +@item @samp{zero_extend@var{m}@var{n}} Zero-extend operand 1 (valid for mode @var{m}) to mode @var{n} and store in operand 0 (which has mode @var{n}). Both modes must be fixed point. -@item extv +@item @samp{extv} Extract a bit-field from operand 1 (a register or memory operand), where operand 2 specifies the width in bits and operand 3 the starting bit, and store it in operand 0. Operand 0 must have mode @code{SImode}. @@ -2557,10 +5207,10 @@ for operands 2 and 3. The bit-field value is sign-extended to a full word integer before it is stored in operand 0. -@item extzv +@item @samp{extzv} Like @samp{extv} except that the bit-field value is zero-extended. -@item insv +@item @samp{insv} Store operand 3 (which must be valid for @code{SImode}) into a bit-field in operand 0, where operand 1 specifies the width in bits and operand 2 the starting bit.
Operand 0 may have mode @code{QImode} @@ -2570,33 +5220,124 @@ Operands 1 and 2 must be valid for @code The RTL generation pass generates this instruction only with constants for operands 1 and 2. -@item s@var{cond}@var{m} -Store zero or -1 in the operand (with mode @var{m}) according to the -condition codes. Value stored is -1 iff the condition @var{cond} is -true. @var{cond} is the name of a comparison operation rtx code, such +@item @samp{s@var{cond}} +Store zero or nonzero in the operand according to the condition codes. +Value stored is nonzero iff the condition @var{cond} is true. +@var{cond} is the name of a comparison operation expression code, such as @samp{eq}, @samp{lt} or @samp{leu}. -@item b@var{cond} +You specify the mode that the operand must have when you write the +@code{match_operand} expression. The compiler automatically sees +which mode you have used and supplies an operand of that mode. + +The value stored for a true condition must have 1 as its low bit. +Otherwise the instruction is not suitable and must be omitted from the +machine description. You must tell the compiler exactly which value +is stored by defining the macro @code{STORE_FLAG_VALUE}. + +@item @samp{b@var{cond}} Conditional branch instruction. Operand 0 is a @samp{label_ref} that refers to the label to jump to. Jump if the condition codes meet condition @var{cond}. -@item call -Subroutine call instruction. Operand 1 is the number of arguments -and operand 0 is the function to call. Operand 1 should be a @samp{mem} -rtx whose address is the address of the function. +@item @samp{call} +Subroutine call instruction returning no value. Operand 0 is the +function to call; operand 1 is the number of bytes of arguments pushed +(in mode @code{SImode}, except it is normally a @samp{const_int}); +operand 2 is the number of registers used as operands. + +On most machines, operand 2 is not actually stored into the RTL +pattern. 
It is supplied for the sake of some RISC machines which need +to put this information into the assembler code; they can put it in +the RTL instead of operand 1. + +Operand 0 should be a @samp{mem} RTX whose address is the address of +the function. + +@item @samp{call_value} +Subroutine call instruction returning a value. Operand 0 is the hard +register in which the value is returned. There are three more +operands, the same as the three operands of the @samp{call} +instruction (but with numbers increased by one). -@item return +Subroutines that return @code{BLKmode} objects use the @samp{call} +insn. + +@item @samp{return} Subroutine return instruction. This instruction pattern name should be defined only if a single instruction can do all the work of returning from a function. -@item tablejump -@item case@var{m} +@item @samp{casesi} +Instruction to jump through a dispatch table, including bounds checking. +This instruction takes five operands: + +@enumerate +@item +The index to dispatch on, which has mode @code{SImode}. + +@item +The lower bound for indices in the table, an integer constant. + +@item +The upper bound for indices in the table, an integer constant. + +@item +A label to jump to if the index has a value outside the bounds. +(If the machine-description macro @code{CASE_DROPS_THROUGH} is defined, +then an out-of-bounds index drops through to the code following +the jump table instead of jumping to this label. In that case, +this label is not actually used by the @samp{casesi} instruction, +but it is always provided as an operand.) + +@item +A label that precedes the table itself. +@end enumerate + +The table is an @samp{addr_vec} or @samp{addr_diff_vec} inside of a +@samp{jump_insn}. The number of elements in the table is one plus the +difference between the upper bound and the lower bound. + +@item @samp{tablejump} +Instruction to jump to a variable address.
This is a low-level +capability which can be used to implement a dispatch table when there +is no @samp{casesi} pattern. + +This pattern requires two operands: the address or offset, and a label +which should immediately precede the jump table. If the macro +@code{CASE_VECTOR_PC_RELATIVE} is defined then the first operand is an +offset which counts from the address of the table; otherwise, it is an +absolute address to jump to. + +The @samp{tablejump} insn is always the last insn before the jump +table it uses. Its assembler code normally has no need to use the +second operand, but you should incorporate it in the RTL pattern so +that the jump optimizer will not delete the table as unreachable code. @end table -@node Dependent Patterns,, Standard Names, Machine Desc -@section Patterns Require Other Patterns +@node Pattern Ordering, Dependent Patterns, Standard Names, Machine Desc +@section When the Order of Patterns Matters + +Sometimes an insn can match more than one instruction pattern. Then the +pattern that appears first in the machine description is the one used. +Therefore, more specific patterns (patterns that will match fewer things) +and faster instructions (those that will produce better code when they +do match) should usually go first in the description. + +In some cases the effect of ordering the patterns can be used to hide +a pattern when it is not valid. For example, the 68000 has an +instruction for converting a fullword to floating point and another +for converting a byte to floating point. An instruction converting +an integer to floating point could match either one. We put the +pattern to convert the fullword first to make sure that one will +be used rather than the other. (Otherwise a large integer might +be generated as a single-byte immediate quantity, which would not work.) +Instead of using this pattern ordering it would be possible to make the +pattern for convert-a-byte smart enough to deal properly with any +constant value.
+ +@node Dependent Patterns, Jump Patterns, Pattern Ordering, Machine Desc +@section Interdependence of Patterns Every machine description must have a named pattern for each of the conditional branch names @samp{b@var{cond}}. The recognition template @@ -2644,7 +5385,7 @@ is the definition of one of them: @noindent If operand 1 is an explicit integer constant, an instruction constructed -using that pattern can end up looking like +using that pattern can be simplified into an `and' like this: @example (set (reg:SI 41) @@ -2666,10 +5407,10 @@ such an instruction. Here is what is us (match_operand:SI 1 "general_operand" "")))] "GET_CODE (operands[1]) == CONST_INT" "* -{ operands[1] +@{ operands[1] = gen_rtx (CONST_INT, VOIDmode, ~INTVAL (operands[1])); return \"bicl2 %1,%0\"; -}") +@}") @end example @noindent @@ -2678,7 +5419,383 @@ support on the Vax, this pattern is poss constant second argument: a special case that can be output as an `and not' instruction. -@node Machine Macros,, Machine Desc, Top +A ``compare'' instruction whose RTL looks like this: + +@example +(set (cc0) (minus @var{operand} (const_int 0))) +@end example + +@noindent +may be simplified by optimization into a ``test'' like this: + +@example +(set (cc0) @var{operand}) +@end example + +@noindent +So in the machine description, each ``compare'' pattern for an integer +mode must have a corresponding ``test'' pattern that will match the +result of such simplification. + +In some cases machines support instructions identical except for the +machine mode of one or more operands. 
For example, there may be +``sign-extend halfword'' and ``sign-extend byte'' instructions whose +patterns are + +@example +(set (match_operand:SI 0 @dots{}) + (extend:SI (match_operand:HI 1 @dots{}))) + +(set (match_operand:SI 0 @dots{}) + (extend:SI (match_operand:QI 1 @dots{}))) +@end example + +@noindent +Constant integers do not specify a machine mode, so an instruction to +extend a constant value could match either pattern. The pattern it +actually will match is the one that appears first in the file. For correct +results, this must be the one for the widest possible mode (@code{HImode}, +here). If the pattern matches the @code{QImode} instruction, the results +will be incorrect if the constant value does not actually fit that mode. + +Such instructions to extend constants are rarely generated because they are +optimized away, but they do occasionally happen in nonoptimized +compilations. + +@node Jump Patterns, Peephole Definitions, Dependent Patterns, Machine Desc +@section Defining Jump Instruction Patterns + +GNU CC assumes that the machine has a condition code. A comparison insn +sets the condition code, recording the results of both signed and unsigned +comparison of the given operands. A separate branch insn tests the +condition code and branches or not according to its value. The branch insns +come in distinct signed and unsigned flavors. Many common machines, such +as the Vax, the 68000 and the 32000, work this way. + +Some machines have distinct signed and unsigned compare instructions, and +only one set of conditional branch instructions. The easiest way to handle +these machines is to treat them just like the others until the final stage +where assembly code is written. At this time, when outputting code for the +compare instruction, peek ahead at the following branch using +@code{NEXT_INSN (insn)}. (The variable @code{insn} refers to the insn +being output, in the output-writing code in an instruction pattern.)
If +the RTL says that it is an unsigned branch, output an unsigned compare; +otherwise output a signed compare. When the branch itself is output, you +can treat signed and unsigned branches identically. + +The reason you can do this is that GNU CC always generates a pair of +consecutive RTL insns, one to set the condition code and one to test it, +and keeps the pair inviolate until the end. + +To go with this technique, you must define the machine-description macro +@code{NOTICE_UPDATE_CC} to do @code{CC_STATUS_INIT}; in other words, no +compare instruction is superfluous. + +Some machines have compare-and-branch instructions and no condition code. +A similar technique works for them. When it is time to ``output'' a +compare instruction, record its operands in two static variables. When +outputting the branch-on-condition-code instruction that follows, actually +output a compare-and-branch instruction that uses the remembered operands. + +It also works to define patterns for compare-and-branch instructions. +In optimizing compilation, the pair of compare and branch instructions +will be combined according to these patterns. But this does not happen +if optimization is not requested. So you must use one of the solutions +above in addition to any special patterns you define. + +@node Peephole Definitions, Expander Definitions, Jump Patterns, Machine Desc +@section Defining Machine-Specific Peephole Optimizers + +In addition to instruction patterns the @file{md} file may contain +definitions of machine-specific peephole optimizations. + +The combiner does not notice certain peephole optimizations when the data +flow in the program does not suggest that it should try them. For example, +sometimes two consecutive insns related in purpose can be combined even +though the second one does not appear to use a register computed in the +first one. A machine-specific peephole optimizer can detect such +opportunities.
+ +A definition looks like this: + +@example +(define_peephole + [@var{insn-pattern-1} + @var{insn-pattern-2} + @dots{}] + "@var{condition}" + "@var{template}") +@end example + +In this skeleton, @var{insn-pattern-1} and so on are patterns to match +consecutive instructions. The optimization applies to a sequence of +instructions when @var{insn-pattern-1} matches the first one, +@var{insn-pattern-2} matches the next, and so on.@refill + +@var{insn-pattern-1} and so on look @emph{almost} like the second operand +of @code{define_insn}. There is one important difference: this pattern is +an RTX, not a vector. If the @code{define_insn} pattern would be a vector +of one element, the @var{insn-pattern} should be just that element, no +vector. If the @code{define_insn} pattern would have multiple elements +then the @var{insn-pattern} must place the vector inside an explicit +@code{parallel} RTX.@refill + +The operands of the instructions are matched with @code{match_operand} and +@code{match_dup}, as usual. What is not usual is that the operand numbers +apply to all the instruction patterns in the definition. So, you can check +for identical operands in two instructions by using @code{match_operand} +in one instruction and @code{match_dup} in the other. + +The operand constraints used in @code{match_operand} patterns do not have +any direct effect on the applicability of the optimization, but they will +be validated afterward, so write constraints that are sure to fit whenever +the optimization is applied. It is safe to use @code{"g"} for each +operand. + +Once a sequence of instructions matches the patterns, the @var{condition} +is checked. This is a C expression which makes the final decision whether +to perform the optimization (do so if the expression is nonzero). If +@var{condition} is omitted (in other words, the string is empty) then the +optimization is applied to every sequence of instructions that matches the +patterns.
+ +The defined peephole optimizations are applied after register allocation is +complete. Therefore, the optimizer can check which operands have ended up +in which kinds of registers, just by looking at the operands. + +The way to refer to the operands in @var{condition} is to write +@code{operands[@var{i}]} for operand number @var{i} (as matched by +@code{(match_operand @var{i} @dots{})}). Use the variable @code{insn} to +refer to the last of the insns being matched; use @code{PREV_INSN} to find +the preceding insns (but be careful to skip over any @samp{note} insns that +intervene).@refill + +When optimizing computations with intermediate results, you can use +@var{condition} to match only when the intermediate results are not used +elsewhere. Use the C expression @code{dead_or_set_p (@var{insn}, +@var{op})}, where @var{insn} is the insn in which you expect the value to +be used for the last time (from the value of @code{insn}, together with use +of @code{PREV_INSN}), and @var{op} is the intermediate value (from +@code{operands[@var{i}]}).@refill + +Applying the optimization means replacing the sequence of instructions with +one new instruction. The @var{template} controls ultimate output of +assembler code for this combined instruction. It works exactly like the +template of a @code{define_insn}. Operand numbers in this template are the +same ones used in matching the original sequence of instructions. + +The result of a defined peephole optimizer does not need to match any of +the instruction patterns, and it does not have an opportunity to match +them. The peephole optimizer definition itself serves as the instruction +pattern to control how the instruction is output. + +Defined peephole optimizers are run in the last jump optimization pass, so +the instructions they produce are never combined or rearranged +automatically in any way. 
+ +Here is an example, taken from the 68000 machine description: + +@example +(define_peephole + [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4))) + (set (match_operand:DF 0 "register_operand" "f") + (match_operand:DF 1 "register_operand" "ad"))] + "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])" + "* +@{ + rtx xoperands[2]; + xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1); +#ifdef MOTOROLA + output_asm_insn (\"move.l %1,(sp)\", xoperands); + output_asm_insn (\"move.l %1,-(sp)\", operands); + return \"fmove.d (sp)+,%0\"; +#else + output_asm_insn (\"movel %1,sp@@\", xoperands); + output_asm_insn (\"movel %1,sp@@-\", operands); + return \"fmoved sp@@+,%0\"; +#endif +@} +") +@end example + +The effect of this optimization is to change + +@example +jbsr _foobar +addql #4,sp +movel d1,sp@@- +movel d0,sp@@- +fmoved sp@@+,fp0 +@end example + +@noindent +into + +@example +jbsr _foobar +movel d1,sp@@ +movel d0,sp@@- +fmoved sp@@+,fp0 +@end example + +@node Expander Definitions,, Peephole Definitions, Machine Desc +@section Defining RTL Sequences for Code Generation + +On some target machines, some standard pattern names for RTL generation +cannot be handled with a single insn, but a sequence of RTL insns can +represent them. For these target machines, you can write a +@samp{define_expand} to specify how to generate the sequence of RTL. + +A @samp{define_expand} is an RTL expression that looks almost like a +@samp{define_insn}; but, unlike the latter, a @samp{define_expand} is used +only for RTL generation and it can produce more than one RTL insn. + +A @samp{define_expand} RTX has four operands: + +@itemize @bullet +@item +The name. Each @samp{define_expand} must have a name, since the only +use for it is to refer to it by name. + +@item +The RTL template. This is just like the RTL template for a +@samp{define_peephole} in that it is a vector of RTL expressions +each being one insn. + +@item +The condition, a string containing a C expression.
This expression is +used to express how the availability of this pattern depends on +subclasses of target machine, selected by command-line options when +GNU CC is run. This is just like the condition of a +@samp{define_insn} that has a standard name. + +@item +The preparation statements, a string containing zero or more C +statements which are to be executed before RTL code is generated from +the RTL template. + +Usually these statements prepare temporary registers for use as +internal operands in the RTL template, but they can also generate RTL +insns directly by calling routines such as @samp{emit_insn}, etc. +Any such insns precede the ones that come from the RTL template. +@end itemize + +The RTL template, in addition to controlling generation of RTL insns, +also describes the operands that need to be specified when this pattern +is used. In particular, it gives a predicate for each operand. + +A true operand, which needs to be specified in order to generate RTL from +the pattern, should be described with a @samp{match_operand} in its first +occurrence in the RTL template. This enters information on the operand's +predicate into the tables that record such things. GNU CC uses the +information to preload the operand into a register if that is required for +valid RTL code. If the operand is referred to more than once, subsequent +references should use @samp{match_dup}. + +The RTL template may also refer to internal ``operands'' which are +temporary registers or labels used only within the sequence made by the +@samp{define_expand}. Internal operands are substituted into the RTL +template with @samp{match_dup}, never with @samp{match_operand}. The +values of the internal operands are not passed in as arguments by the +compiler when it requests use of this pattern. Instead, they are computed +within the pattern, in the preparation statements.
These statements +compute the values and store them into the appropriate elements of +@code{operands} so that @samp{match_dup} can find them. + +There are two special macros defined for use in the preparation statements: +@code{DONE} and @code{FAIL}. Use them with a following semicolon, +as a statement. + +@table @code +@item DONE +Use the @code{DONE} macro to end RTL generation for the pattern. The +only RTL insns resulting from the pattern on this occasion will be +those already emitted by explicit calls to @code{emit_insn} within the +preparation statements; the RTL template will not be generated. + +@item FAIL +Make the pattern fail on this occasion. When a pattern fails, it means +that the pattern was not truly available. The calling routines in the +compiler will try other strategies for code generation using other patterns. + +Failure is currently supported only for binary operations (addition, +multiplication, shifting, etc.). + +Do not emit any insns explicitly with @code{emit_insn} before failing. +@end table + +Here is an example, the definition of left-shift for the SPUR chip: + +@example +(define_expand "ashlsi3" + [(set (match_operand:SI 0 "register_operand" "") + (ashift:SI + (match_operand:SI 1 "register_operand" "") + (match_operand:SI 2 "nonmemory_operand" "")))] + "" + " +@{ + if (GET_CODE (operands[2]) != CONST_INT + || (unsigned) INTVAL (operands[2]) > 3) + FAIL; +@}") +@end example + +@noindent +This example uses @samp{define_expand} so that it can generate an RTL insn +for shifting when the shift-count is in the supported range of 0 to 3 but +fail in other cases where machine insns aren't available. When it fails, +the compiler tries another strategy using different patterns (such as a +library call). + +If the compiler were able to handle nontrivial condition-strings in +patterns with names, then it would be possible to use a +@samp{define_insn} in that case.
Here is another case (zero-extension on +the 68000) which makes more use of the power of @samp{define_expand}: + +@example +(define_expand "zero_extendhisi2" + [(set (match_operand:SI 0 "general_operand" "") + (const_int 0)) + (set (strict_low_part + (subreg:HI + (match_operand:SI 0 "general_operand" "") + 0)) + (match_operand:HI 1 "general_operand" ""))] + "" + "operands[1] = make_safe_from (operands[1], operands[0]);") +@end example + +@noindent +Here two RTL insns are generated, one to clear the entire output operand +and the other to copy the input operand into its low half. This sequence +is incorrect if the input operand refers to [the old value of] the output +operand, so the preparation statement makes sure this isn't so. The +function @code{make_safe_from} copies the @code{operands[1]} into a +temporary register if it refers to @code{operands[0]}. It does this +by emitting another RTL insn. + +Finally, a third example shows the use of an internal operand. +Zero-extension on the SPUR chip is done by @samp{and}-ing the result +against a halfword mask. But this mask cannot be represented by a +@samp{const_int} because the constant value is too large to be legitimate +on this machine. So it must be copied into a register with +@code{force_reg} and then the register used in the @samp{and}. + +@example +(define_expand "zero_extendhisi2" + [(set (match_operand:SI 0 "register_operand" "") + (and:SI (subreg:SI + (match_operand:HI 1 "register_operand" "") + 0) + (match_dup 2)))] + "" + "operands[2] + = force_reg (SImode, gen_rtx (CONST_INT, + VOIDmode, 65535)); ") +@end example + +@node Machine Macros, Config, Machine Desc, Top @chapter Machine Description Macros The other half of the machine description is a C header file conventionally @@ -2687,11 +5804,12 @@ link to it. The header file @file{confi compiler source files include @file{config.h}. @menu -* Run-time Target:: Defining -m switches like -m68000 and -m68020. 
+* Run-time Target:: Defining -m options like -m68000 and -m68020. * Storage Layout:: Defining sizes and alignments of data types. * Registers:: Naming and describing the hardware registers. * Register Classes:: Defining the classes of hardware registers. * Stack Layout:: Defining which way the stack grows and by how much. +* Library Names:: Specifying names of subroutines to call automatically. * Addressing Modes:: Defining addressing modes valid for memory operands. * Condition Code:: Defining how insns update the condition code. * Assembler Format:: Defining how to write insns and pseudo-ops to output. @@ -2703,15 +5821,31 @@ compiler source files include @file{conf @table @code @item CPP_PREDEFINES -Define this to be a string constant containing @samp{-D} switches -to define the predefined macros that identify this machine and system. +Define this to be a string constant containing @samp{-D} options to +define the predefined macros that identify this machine and system. +These macros will be predefined unless the @samp{-ansi} option is +specified. For example, on the Sun, one can use the value @example -"-Dmc68000 -Dsun" +"-Dmc68000 -Dsun -Dunix" @end example +@item CPP_SPEC +A C string constant that tells the GNU CC driver program options to +pass to CPP. It can also specify how to translate options you +give to GNU CC into options for GNU CC to pass to the CPP. + +Do not define this macro if it does not need to do anything. + +@item CC1_SPEC +A C string constant that tells the GNU CC driver program options to +pass to CC1. It can also specify how to translate options you +give to GNU CC into options for GNU CC to pass to the CC1. + +Do not define this macro if it does not need to do anything. + @item extern int target_flags; This declaration should be present. @@ -2733,24 +5867,24 @@ Its definition should test a bit in @cod One place where these macros are used is in the condition-expressions of instruction patterns. 
Note how @code{TARGET_68020} appears -frequently in the 68000 machine description file, @file{m68000.md}. +frequently in the 68000 machine description file, @file{m68k.md}. Another place they are used is in the definitions of the other macros in the @file{tm-@var{machine}.h} file. @item TARGET_SWITCHES -This macro defines names of command switches to set and clear +This macro defines names of command options to set and clear bits in @code{target_flags}. Its definition is an initializer -with a subgrouping for each command switches. +with a subgrouping for each command option. -Each subgrouping contains a string constant, that defines the switch +Each subgrouping contains a string constant, that defines the option name, and a number, which contains the bits to set in @code{target_flags}. A negative number says to clear bits instead; -the negative of the number is which bits to clear. The actual switch +the negative of the number is which bits to clear. The actual option name is made by appending @samp{-m} to the specified name. One of the subgroupings should have a null string. The number in this grouping is the default value for @code{target_flags}. Any -target switches act starting with that value. +target options act starting with that value. Here is an example which defines @samp{-m68000} and @samp{-m68020} with opposite meanings, and picks the latter as the default: @@ -2761,11 +5895,23 @@ with opposite meanings, and picks the la @{ "68000", -1@}, \ @{ "", 1@}@} @end example + +@item OVERRIDE_OPTIONS +Sometimes certain combinations of command options do not make sense on +a particular target machine. You can define a macro +@code{OVERRIDE_OPTIONS} to take account of this. This macro, if +defined, is executed once just after all the command options have been +parsed. 
@end table @node Storage Layout, Registers, Run-time Target, Machine Macros @section Storage Layout +Note that the definitions of the macros in this table which are sizes or +alignments measured in bits do not need to be constant. They can be C +expressions that refer to static variables, such as the @code{target_flags}. +@xref{Run-time Target}. + @table @code @item BITS_BIG_ENDIAN Define this macro if the most significant bit in a byte has the lowest @@ -2778,7 +5924,7 @@ Define this macro if the most significan lowest number. @item WORDS_BIG_ENDIAN -Define this macro if, in a multiword object, the most signficant +Define this macro if, in a multiword object, the most significant word has the lowest number. @item BITS_PER_UNIT @@ -2793,19 +5939,63 @@ Number of storage units in a word; norma @item POINTER_SIZE Width of a pointer, in bits. +@item POINTER_BOUNDARY +Alignment required for pointers stored in memory, in bits. + @item PARM_BOUNDARY -Alignment required for pointers, in bits. +Alignment required for function parameters on the stack, in bits. + +@item STACK_BOUNDARY +Define this macro if you wish to preserve a certain alignment for +the stack pointer at all times. The definition is a C expression +for the desired alignment (measured in bits). @item FUNCTION_BOUNDARY Alignment required for a function entry point, in bits. @item BIGGEST_ALIGNMENT -Biggest alignment that anything can require on this machine, in bits. +Biggest alignment that any data type can require on this machine, in bits. + +@item EMPTY_FIELD_BOUNDARY +Alignment in bits to be given to a structure bit field that follows an +empty field such as @code{int : 0;}. + +@item STRUCTURE_SIZE_BOUNDARY +Number of bits which any structure or union's size must be a multiple of. +Each structure or union's size is rounded up to a multiple of this. + +If you do not define this macro, the default is the same as +@code{BITS_PER_UNIT}. 
@item STRICT_ALIGNMENT Define this if instructions will fail to work if given data not on the nominal alignment. If instructions will merely go slower in that case, do not define this macro. + +@item PCC_BITFIELD_TYPE_MATTERS +Define this if you wish to imitate a certain bizarre behavior pattern +of some instances of PCC: a bit field whose declared type is +@code{int} has the same effect on the size and alignment of a +structure as an actual @code{int} would have. + +Just what effect that is in GNU CC depends on other parameters, but on +most machines it would force the structure's alignment and size to a +multiple of 32 or @code{BIGGEST_ALIGNMENT} bits. + +@item CHECK_FLOAT_VALUE (@var{mode}, @var{value}) +A C statement to validate the value @var{value} (of type +@code{double}) for mode @var{mode}. This means that you check whether +@var{value} fits within the possible range of values for mode +@var{mode} on this target machine. The mode @var{mode} is always +@code{SFmode} or @code{DFmode}. + +If @var{value} is not valid, you should call @code{error} to print an +error message and then assign some valid value to @var{value}. +Allowing an invalid value to go through the compiler can produce +incorrect assembler code which may even cause Unix assemblers to +crash. + +This macro need not be defined if there is no work for it to do. @end table @node Registers, Register Classes, Storage Layout, Machine Macros @@ -2815,20 +6005,27 @@ in that case, do not define this macro. @item FIRST_PSEUDO_REGISTER Number of hardware registers known to the compiler. They receive numbers 0 through @code{FIRST_PSEUDO_REGISTER-1}; thus, the first -pseudo register's number really is assigned the number7 +pseudo register's number really is assigned the number @code{FIRST_PSEUDO_REGISTER}. @item FIXED_REGISTERS An initializer that says which registers are used for fixed purposes all throughout the compiled code and are therefore not available for general allocation.
These would inclue the stack pointer, the frame -pointer, the program counter on machines where that is considered one -of the addressable registers, and any other numbered register with a -standard use. +general allocation. These would include the stack pointer, the frame +pointer (except on machines where that can be used as a general +register when no frame pointer is needed), the program counter on +machines where that is considered one of the addressable registers, +and any other numbered register with a standard use. This information is expressed as a sequence of numbers, separated by commas and surrounded by braces. The @var{n}th number is 1 if -register @var{n} is fixed, 0 otherwise +register @var{n} is fixed, 0 otherwise. + +The table initialized from this macro, and the table initialized by +the following one, may be overridden at run time either automatically, +by the actions of the macro @code{CONDITIONAL_REGISTER_USAGE}, or by +the user with the command options @samp{-ffixed-@var{reg}}, +@samp{-fcall-used-@var{reg}} and @samp{-fcall-saved-@var{reg}}. @item CALL_USED_REGISTERS Like @code{FIXED_REGISTERS} but has 1 for each register that is @@ -2837,10 +6034,74 @@ registers. This macro therefore identif available for general allocation of values that must live across function calls. -If a registers has 0 in @code{CALL_USED_REGISTERS}, the compiler +If a register has 0 in @code{CALL_USED_REGISTERS}, the compiler automatically saves it on function entry and restores it on function exit, if the register is used within the function. +@item CONDITIONAL_REGISTER_USAGE +Zero or more C statements that may conditionally modify two variables +@code{fixed_regs} and @code{call_used_regs} (both of type @code{char +[]}) after they have been initialized from the two preceding macros. + +This is necessary in case the fixed or call-clobbered registers depend +on target flags. + +You need not define this macro if it has no work to do. 
+ +If the usage of an entire class of registers depends on the target +flags, you may indicate this to gcc by using this macro to modify +@code{fixed_regs} and @code{call_used_regs} to 1 for each of the +registers in the classes which should not be used by gcc. Also define +the macro @code{REG_CLASS_FROM_LETTER} to return @code{NO_REGS} if it +is called with a letter for a class that shouldn't be used. + +(However, if this class is not included in @code{GENERAL_REGS} and all +of the insn patterns whose constraints permit this class are +controlled by target switches, then GCC will automatically avoid using +these registers when the target switches are opposed to them.) + +@item OVERLAPPING_REGNO_P (@var{regno}) +If defined, this is a C expression whose value is +nonzero if hard register number @var{regno} is an overlapping +register. This means a hard register which overlaps a hard register +with a different number. (Such overlap is undesirable, but +occasionally it allows a machine to be supported which otherwise could +not be.) This macro must return nonzero for @emph{all} the registers +which overlap each other. GNU CC can use an overlapping register only +in certain limited ways. It can be used for allocation within a basic +block, and may be spilled for reloading; that is all. + +If this macro is not defined, it means that none of the hard registers +overlap each other. This is the usual situation. + +@item INSN_CLOBBERS_REGNO_P (@var{insn}, @var{regno}) +If defined, this is a C expression whose value should be nonzero if +the insn @var{insn} has the effect of mysteriously clobbering the +contents of hard register number @var{regno}. By ``mysterious'' we +mean that the insn's RTL expression doesn't describe such an effect. + +If this macro is not defined, it means that no insn clobbers registers +mysteriously. This is the usual situation; all else being equal, +it is best for the RTL expression to show all the activity.
+ +@item PRESERVE_DEATH_INFO_REGNO_P (@var{regno}) +If defined, this is a C expression whose value is nonzero if accurate +@code{REG_DEAD} notes are needed for hard register number @var{regno} +at the time of outputting the assembler code. When this is so, a few +optimizations that take place after register allocation and could +invalidate the death notes are not done when this register is +involved. + +You would arrange to preserve death info for a register when some +of the code in the machine description which is executed to write +the assembler code looks at the death notes. This is +necessary only when the actual hardware feature which GNU CC +thinks of as a register is not actually a register of the usual sort. +(It might, for example, be a hardware stack.) + +If this macro is not defined, it means that no death notes need to be +preserved. This is the usual situation. + @item HARD_REGNO_REGS (@var{regno}, @var{mode}) A C expression for the number of consecutive hard registers, starting at register number @var{regno}, required to hold a value of mode @@ -2868,6 +6129,33 @@ are equivalent, a suitable definition is It is not necessary for this macro to check for fixed register numbers because the allocation mechanism considers them to be always occupied. +Many machines have special registers for floating point arithmetic. +Often people assume that floating point machine modes are allowed only +in floating point registers. This is not true. Any registers that +can hold integers can safely @emph{hold} a floating point machine +mode, whether or not floating arithmetic can be done on it in those +registers. + +The true significance of special floating registers is rather that +non-floating-point machine modes @emph{may not} go in those registers. +This is true if the floating registers normalize any value stored in +them, because storing a non-floating value there would garble it.
If +the floating registers do not automatically normalize, if you can +store any bit pattern in one and retrieve it unchanged without a trap, +then any machine mode may go in a floating register and this macro +should say so. + +Sometimes there are floating registers that are especially slow to +access, so that it is better to store a value in a stack frame than in +such a register if floating point arithmetic is not being done. As long +as the floating registers are not in class @code{GENERAL_REGS}, they +will not be used unless some insn's constraint asks for one. + +It is obligatory to support floating point `move' instructions into +and out of general registers, because unions and structures (which +have modes @samp{SImode} or @samp{DImode}) can be in those registers +and they may have floating point members. + @item MODES_TIEABLE_P (@var{mode1}, @var{mode2}) A C expression that is nonzero if it is desirable to choose register allocation so as to avoid move instructions between a value of mode @@ -2889,17 +6177,37 @@ machines, the hardware determines which @item FRAME_POINTER_REGNUM The register number of the frame pointer register, which is used to -access automatic variables in the stack frame. It must also described -in @code{FIXED_REGISTERS} as a fixed register. On some machines, the +access automatic variables in the stack frame. On some machines, the hardware determines which register this is. On other machines, you can choose any register you wish for this purpose. +@item FRAME_POINTER_REQUIRED +A C expression which is nonzero if a function must have and use a +frame pointer. This expression is evaluated in the reload pass, in +the function @code{reload}, and it can in principle examine the +current function and decide according to the facts, but on most +machines the constant 0 or the constant 1 suffices. Use 0 when the +machine allows code to be generated with no frame pointer, and doing +so saves some time or space. 
Use 1 when there is no possible +advantage to avoiding a frame pointer. + +In certain cases, the compiler does not know how to do without a frame +pointer. The compiler recognizes those cases and automatically gives +the function a frame pointer regardless of what +@code{FRAME_POINTER_REQUIRED} says. You don't need to worry about +them.@refill + +In a function that does not require a frame pointer, the frame pointer +register can be allocated for ordinary usage, unless you mark it as a +fixed register. See @code{FIXED_REGISTERS} for more information. + @item ARG_POINTER_REGNUM The register number of the arg pointer register, which is used to access the function's argument list. On some machines, this is the same as the frame pointer register. On some machines, the hardware determines which register this is. On other machines, you can choose -any register you wish for this purpose. It must in any case be a +any register you wish for this purpose. If this is not the same +register as the frame pointer register, then you must mark it as a fixed register according to @code{FIXED_REGISTERS}. @item STATIC_CHAIN_REGNUM @@ -2907,22 +6215,56 @@ The register number used for passing a f pointer. This is needed for languages such as Pascal and Algol where functions defined within other functions can access the local variables of the outer functions; it is not currently used because C -does not provide this feature. +does not provide this feature, but you must define the macro. The static chain register need not be a fixed register. -@item FUNCTION_VALUE_REGNUM -The register number used for returning values from a function. This -must be one of the call-used registers (since function calls alter -it!) but should not be a fixed register. When the value being -returned has a multi-word machine mode, multiple consecutive registers -starting with the specified one are used. 
- @item STRUCT_VALUE_REGNUM -When a function's value's mode is @code{BLKmode}, the value is not returned -in the register @code{FUNCTION_VALUE_REGNUM}. Instead, the caller passes -the address of a block of memory in which the value should be stored. -@code{STRUCT_VALUE_REGNUM} is the register in which this address is passed. +When a function's value's mode is @code{BLKmode}, the value is not +returned according to @code{FUNCTION_VALUE}. Instead, the caller +passes the address of a block of memory in which the value should be +stored. + +If this value is passed in a register, then @code{STRUCT_VALUE_REGNUM} +should be the number of that register. + +@item STRUCT_VALUE +If the structure value address is not passed in a register, define +@code{STRUCT_VALUE} as an expression returning an RTX for the place +where the address is passed. If it returns a @samp{mem} RTX, the +address is passed as an ``invisible'' first argument. + +@item STRUCT_VALUE_INCOMING_REGNUM +On some architectures the place where the structure value address +is found by the called function is not the same place that the +caller put it. This can be due to register windows, or it could +be because the function prologue moves it to a different place. + +If the incoming location of the structure value address is in a +register, define this macro as the register number. + +@item STRUCT_VALUE_INCOMING +If the incoming location is not a register, define +@code{STRUCT_VALUE_INCOMING} as an expression for an RTX for where the +called function should find the value. If it should find the value on +the stack, define this to create a @samp{mem} which refers to the +frame pointer. If the value is a @samp{mem}, the compiler assumes it +is for an invisible first argument, and leaves space for it when +finding the first real argument. 
+ +@item REG_ALLOC_ORDER +If defined, an initializer for a vector of integers, containing the +numbers of hard registers in the order in which GNU CC should +prefer to use them (from most preferred to least). + +If this macro is not defined, registers are used lowest numbered first +(all else being equal). + +One use of this macro is on the 360, where the highest numbered +registers must always be saved and the save-multiple-registers +instruction supports only sequences of consecutive registers. This +macro is defined to cause the highest numbered allocatable registers +to be used first. @end table @node Register Classes, Stack Layout, Registers, Machine Macros @@ -2953,11 +6295,27 @@ constraints is through machine-dependent You can define such letters to correspond to various classes, then use them in operand constraints. +You should define a class for the union of two classes whenever some +instruction allows both classes. For example, if an instruction allows +either a floating-point (coprocessor) register or a general register for a +certain operand, you should define a class @code{FLOAT_OR_GENERAL_REGS} +which includes both of them. Otherwise you will get suboptimal code. + You must also specify certain redundant information about the register classes: for each class, which classes contain it and which ones are contained in it; for each pair of classes, the largest class contained in their union. +Register classes used for input-operands of bitwise-and or shift +instructions have a special requirement: each such class must have, for +each fixed-point machine mode, a subclass whose registers can transfer that +mode to or from memory. For example, on some machines, the operations for +single-byte values (@code{QImode}) are limited to certain registers. When +this is so, each register class that is used in a bitwise-and or shift +instruction must have a subclass consisting of registers from which +single-byte values can be loaded or stored.
This is so that +@code{PREFERRED_RELOAD_CLASS} can always have a possible value to return. + @table @code @item enum reg_class An enumeral type that must be defined with all the register class names @@ -2970,6 +6328,13 @@ Each register class has a number, which the class name to type @code{int}. The number serves as an index in many of the tables described below. +@item N_REG_CLASSES +The number of distinct register classes, defined as follows: + +@example +#define N_REG_CLASSES (int) LIM_REG_CLASSES +@end example + @item REG_CLASS_NAMES An initializer containing the names of the register classes as C string constants. These names are used in writing some of the debugging dumps. @@ -2991,76 +6356,100 @@ A C expression whose value is a register which is @dfn{minimal}, meaning that no smaller class also contains the register. -@item REG_CLASS_SUPERCLASSES -A two-level initializer that says, for each class, which classes contain -it. The @var{n}th element of the initializer is a sub-initializer for -class @var{n}; it contains the names of the othe classes that contain class -@var{n} (but not the name of class @var{n} itself), followed by -@code{LIM_REG_CLASSES} to mark the end of the element. - -@item REG_CLASS_SUBCLASSES -Similar to @code{REG_CLASS_SUPERCLASSES}, except that element @var{n} lists -the classes @emph{contained in} class @var{n}, followed once again by -@code{LIM_REG_CLASSES} to mark the end of the element. - -@item REG_CLASS_SUBUNION -An two-level initializer for a two-dimensional array. The element -(@var{m}, @var{n}) of this array must be a class that is ``close to'' -being the union of classes @var{m} and @var{n}. If there is a class -that is exactly that union, use it; otherwise, choose some smaller -class, preferably as large as possible but certainly not containing -any register that is neither in class @var{m} nor in class @var{n}. 
+@item BASE_REG_CLASS +A macro whose definition is the name of the class to which a valid +base register must belong. A base register is one used in an address +which is the register value plus a displacement. @item INDEX_REG_CLASS -A macro whose definition is the name of the class to which a valid index -register must belong. +A macro whose definition is the name of the class to which a valid +index register must belong. An index register is one used in an +address where its value is either multiplied by a scale factor or +added to another register (as well as added to a displacement). @item REG_CLASS_FROM_LETTER (@var{char}) A C expression which defines the machine-dependent operand constraint -letters for register classes. If @var{char} is such a letter, the value -should be the register class corresponding to it. Otherwise, the value -should be @code{NO_REGS}. - -@item REGNO_OK_FOR_CLASS_P (@var{regno}, @var{class}) -A C expression which is nonzero if register number @var{regno} is a hard -register belonging to class @var{class}. The expression is always zero if -@var{regno} is a pseudo register. - -@item REG_OK_FOR_CLASS_P (@var{reg}, @var{class}) -A C expression which is nonzero if @var{reg} (an rtx assumed to have -code @samp{reg}) belongs to class @var{class}. - -What about pseudo registers? There are two alternatives, and the machine -description header file must be able to do either one on command. If the -macro @code{REG_OK_STRICT} is defined, this macro should be defined to -reject all pseudo registers (return 0 for them). Otherwise, this macro -should be defined to accept all pseudo registers (return 1 for them). - -Some source files of the compiler define @code{REG_OK_STRICT} before -including the machine description header file, while others do not, -according to the needs of that part of the compiler. +letters for register classes. If @var{char} is such a letter, the +value should be the register class corresponding to it. 
Otherwise, +the value should be @code{NO_REGS}. + +@item REGNO_OK_FOR_BASE_P (@var{num}) +A C expression which is nonzero if register number @var{num} is +suitable for use as a base register in operand addresses. It may be +either a suitable hard register or a pseudo register that has been +allocated such a hard register. + +@item REGNO_OK_FOR_INDEX_P (@var{num}) +A C expression which is nonzero if register number @var{num} is +suitable for use as an index register in operand addresses. It may be +either a suitable hard register or a pseudo register that has been +allocated such a hard register. + +The difference between an index register and a base register is that +the index register may be scaled. If an address involves the sum of +two registers, neither one of them scaled, then either one may be +labeled the ``base'' and the other the ``index''; but whichever +labeling is used must fit the machine's constraints of which registers +may serve in each capacity. The compiler will try both labelings, +looking for one that is valid, and will reload one or both registers +only if neither labeling works. @item PREFERRED_RELOAD_CLASS (@var{x}, @var{class}) A C expression that places additional restrictions on the register class to use when it is necessary to copy value @var{x} into a register in class @var{class}. The value is a register class; perhaps @var{class}, or perhaps -another, smaller class. @var{class} is always safe as a value. In fact, -the definition +another, smaller class. On many machines, the definition @example #define PREFERRED_RELOAD_CLASS(X,CLASS) CLASS @end example @noindent -is always safe. However, sometimes returning a more restrictive class -makes better code. For example, on the 68000, when @var{x} is an -integer constant that is in range for a @samp{moveq} instruction, -the value of this macro is always @code{DATA_REGS} as long as -@var{class} includes the data registers. Requiring a data register -guarantees that a @samp{moveq} will be used. 
+is safe. + +Sometimes returning a more restrictive class makes better code. For +example, on the 68000, when @var{x} is an integer constant that is in range +for a @samp{moveq} instruction, the value of this macro is always +@code{DATA_REGS} as long as @var{class} includes the data registers. +Requiring a data register guarantees that a @samp{moveq} will be used. + +If @var{x} is a @samp{const_double}, by returning @code{NO_REGS} +you can force @var{x} into a memory constant. This is useful on +certain machines where immediate floating values cannot be loaded into +certain kinds of registers. + +In a shift instruction or a bitwise-and instruction, the mode of @var{x}, +the value being reloaded, may not be the same as the mode of the +instruction's operand. (They will both be fixed-point modes, however.) In +such a case, @var{class} may not be a safe value to return. @var{class} is +certainly valid for the instruction, but it may not be valid for reloading +@var{x}. This problem can occur on machines such as the 68000 and 80386 +where some registers can handle full-word values but cannot handle +single-byte values. + +On such machines, this macro must examine the mode of @var{x} and return a +subclass of @var{class} which can handle loads and stores of that mode. On +the 68000, where address registers cannot handle @code{QImode}, if @var{x} +has @code{QImode} then you must return @code{DATA_REGS}. If @var{class} is +@code{ADDR_REGS}, then there is no correct value to return; but the shift +and bitwise-and instructions don't use @code{ADDR_REGS}, so this fatal case +never arises. + +@item CLASS_MAX_NREGS (@var{class}, @var{mode}) +A C expression for the maximum number of consecutive registers +of class @var{class} needed to hold a value of mode @var{mode}. + +This is closely related to the macro @code{HARD_REGNO_NREGS}. 
+In fact, the value of the macro @code{CLASS_MAX_NREGS (@var{class}, @var{mode})} +should be the maximum value of @code{HARD_REGNO_NREGS (@var{regno}, @var{mode})} +for all @var{regno} values in the class @var{class}. + +This macro helps control the handling of multiple-word values +in the reload pass. @end table -Two other special macros +Two other special macros describe which constants fit which constraint +letters. @table @code @item CONST_OK_FOR_LETTER_P (@var{value}, @var{c}) @@ -3073,20 +6462,23 @@ not one of those letters, the value shou @item CONST_DOUBLE_OK_FOR_LETTER_P (@var{value}, @var{c}) A C expression that defines the machine-dependent operand constraint letters that specify particular ranges of floating values. If @var{c} is -one of those letters, the expression should check that @var{value}, an rtx +one of those letters, the expression should check that @var{value}, an RTX of code @samp{const_double}, is in the appropriate range and return 1 if so, 0 otherwise. If @var{c} is not one of those letters, the value should be 0 regardless of @var{value}. @end table -@node Stack Layout, Addressing Modes, Register Classes, Machine Macros +@node Stack Layout, Library Names, Register Classes, Machine Macros @section Describing Stack Layout @table @code @item STACK_GROWS_DOWNWARD Define this macro if pushing a word onto the stack moves the stack -pointer to a smaller address. The definition is irrelevant because the -compiler checks this macro with @code{#ifdef}. +pointer to a smaller address. + +When we say, ``define this macro if @dots{},'' it means that the +compiler checks this macro only with @code{#ifdef} so the precise +definition used does not matter. @item FRAME_GROWS_DOWNWARD Define this macro if the addresses of local variable slots are at negative @@ -3104,6 +6496,11 @@ the value @code{STARTING_FRAME_OFFSET}. A C expression that is the number of bytes actually pushed onto the stack when an instruction attempts to push @var{npushed} bytes. 
+If the target machine does not have a push instruction, do not define +this macro. That directs GNU CC to use an alternate strategy: to +allocate the entire argument block and then store the arguments into +it. + On some machines, the definition @example @@ -3119,29 +6516,245 @@ alignment. Then the definition should b #define PUSH_ROUNDING(BYTES) (((BYTES) + 1) & ~1) @end example -@item FIRST_PARM_OFFSET -Offset from the argument pointer register to the first argument's address. +@item FIRST_PARM_OFFSET (@var{fundecl}) +Offset from the argument pointer register to the first argument's +address. On some machines it may depend on the data type of the +function. (In the next version of GNU CC, the argument will be +changed to the function data type rather than its declaration.) + +@item RETURN_POPS_ARGS (@var{funtype}) +A C expression that should be 1 if a function pops its own arguments +on returning, or 0 if the function pops no arguments and the caller +must therefore pop them all after the function returns. + +@var{funtype} is a C variable whose value is a tree node that +describes the function in question. Normally it is a node of type +@code{FUNCTION_TYPE} that describes the data type of the function. +From this it is possible to obtain the data types of the value and +arguments (if known). + +When a call to a library function is being considered, @var{funtype} +will contain an identifier node for the library function. Thus, if +you need to distinguish among various library functions, you can do so +by their names. Note that ``library function'' in this context means +a function used to perform arithmetic, whose name is known specially +in the compiler and was not mentioned in the C code being compiled. + +On the Vax, all functions always pop their arguments, so the +definition of this macro is 1. On the 68000, using the standard +calling convention, no functions pop their arguments, so the value of +the macro is always 0 in this case. 
But an alternative calling +convention is available in which functions that take a fixed number of +arguments pop them but other functions (such as @code{printf}) pop +nothing (the caller pops all). When this convention is in use, +@var{funtype} is examined to determine whether a function takes a +fixed number of arguments. + +@item FUNCTION_VALUE (@var{valtype}, @var{func}) +A C expression to create an RTX representing the place where a +function returns a value of data type @var{valtype}. @var{valtype} is +a tree node representing a data type. Write @code{TYPE_MODE +(@var{valtype})} to get the machine mode used to represent that type. +On many machines, only the mode is relevant. (Actually, on most +machines, scalar values are returned in the same place regardless of +mode).@refill + +If the precise function being called is known, @var{func} is a tree +node (@code{FUNCTION_DECL}) for it; otherwise, @var{func} is a null +pointer. This makes it possible to use a different value-returning +convention for specific functions when all their calls are +known.@refill + +@item FUNCTION_OUTGOING_VALUE (@var{valtype}, @var{func}) +Define this macro if the target machine has ``register windows'' +so that the register in which a function returns its value is not +the same as the one in which the caller sees the value. + +For such machines, @code{FUNCTION_VALUE} computes the register in +which the caller will see the value, and +@code{FUNCTION_OUTGOING_VALUE} should be defined in a similar fashion +to tell the function where to put the value.@refill + +If @code{FUNCTION_OUTGOING_VALUE} is not defined, +@code{FUNCTION_VALUE} serves both purposes.@refill + +@item LIBCALL_VALUE (@var{mode}) +A C expression to create an RTX representing the place where a library +function returns a value of mode @var{mode}. If the precise function +being called is known, @var{func} is a tree node +(@code{FUNCTION_DECL}) for it; otherwise, @var{func} is a null +pointer. 
This makes it possible to use a different value-returning +convention for specific functions when all their calls are +known.@refill + +Note that ``library function'' in this context means a compiler +support routine, used to perform arithmetic, whose name is known +specially by the compiler and was not mentioned in the C code being +compiled. + +@item FUNCTION_VALUE_REGNO_P (@var{regno}) +A C expression that is nonzero if @var{regno} is the number of a hard +register in which the values of called function may come back. + +A register whose use for returning values is limited to serving as the +second of a pair (for a value of type @code{double}, say) need not be +recognized by this macro. So for most machines, this definition +suffices: + +@example +#define FUNCTION_VALUE_REGNO_P(N) ((N) == 0) +@end example + +If the machine has register windows, so that the caller and the called +function use different registers for the return value, this macro +should recognize only the caller's register numbers. + +@item FUNCTION_ARG (@var{cum}, @var{mode}, @var{type}, @var{named}) +A C expression that controls whether a function argument is passed +in a register, and which register. + +The arguments are @var{cum}, which summarizes all the previous +arguments; @var{mode}, the machine mode of the argument; @var{type}, +the data type of the argument as a tree node or 0 if that is not known +(which happens for C support library functions); and @var{named}, +which is 1 for an ordinary argument and 0 for nameless arguments that +correspond to @samp{...} in the called function's prototype. + +The value of the expression should either be a @samp{reg} RTX for the +hard register in which to pass the argument, or zero to pass the +argument on the stack. + +For the Vax and 68000, where normally all arguments are pushed, zero +suffices as a definition. 
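As a minimal sketch of the register-passing case, assume a hypothetical machine that passes the first four words of named arguments in hard registers 0 through 3 and pushes the rest. The names @code{NUM_ARG_REGS}, @code{struct fake_reg} and @code{arg_regs} are invented for illustration, the word size is an assumed value, and @var{cum} is assumed to be a byte count. A real definition would return a @samp{reg} RTX rather than a pointer to a stand-in structure.

```c
#include <stddef.h>

/* Assumed values for a made-up target; not taken from any real
   machine description.  */
#define NUM_ARG_REGS 4     /* hypothetical: registers 0..3 carry arguments */
#define UNITS_PER_WORD 4   /* assumed word size in bytes */

/* Stand-in for the `reg' RTX that a real definition would return.  */
struct fake_reg { int regno; };
static struct fake_reg arg_regs[NUM_ARG_REGS] = { {0}, {1}, {2}, {3} };

/* CUM is assumed to hold the number of argument bytes scanned so far.
   The value is the register for this argument, or a null pointer
   (the "zero" of the text) when the argument goes on the stack.
   Nameless arguments are always pushed in this sketch.  */
#define FUNCTION_ARG(CUM, MODE, TYPE, NAMED)                    \
  (((NAMED) && (CUM) / UNITS_PER_WORD < NUM_ARG_REGS)           \
   ? &arg_regs[(CUM) / UNITS_PER_WORD]                          \
   : (struct fake_reg *) 0)
```

On a machine where every argument is pushed, such as the Vax and 68000 cases mentioned above, the whole definition reduces to the constant 0.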
+ +@item FUNCTION_INCOMING_ARG (@var{cum}, @var{mode}, @var{type}, @var{named}) +Define this macro if the target machine has ``register windows'', so +that the register in which a function sees an argument is not +necessarily the same as the one in which the caller passed the +argument. + +For such machines, @code{FUNCTION_ARG} computes the register in which +the caller passes the value, and @code{FUNCTION_INCOMING_ARG} should +be defined in a similar fashion to tell the function being called +where the arguments will arrive. + +If @code{FUNCTION_INCOMING_ARG} is not defined, @code{FUNCTION_ARG} +serves both purposes.@refill + +@item FUNCTION_ARG_PARTIAL_NREGS (@var{cum}, @var{mode}, @var{type}, @var{named}) +A C expression for the number of words, at the beginning of an +argument, that must be put in registers. The value must be zero for +arguments that are passed entirely in registers or that are entirely +pushed on the stack. + +On some machines, certain arguments must be passed partially in +registers and partially in memory. On these machines, typically the +first @var{n} words of arguments are passed in registers, and the rest +on the stack. If a multi-word argument (a @code{double} or a +structure) crosses that boundary, its first few words must be passed +in registers and the rest must be pushed. This macro tells the +compiler when this occurs, and how many of the words should go in +registers. -@item RETURN_POPS_ARGS -Define this macro if returning from a function automatically pops the -function's arguments. Do not define it if the caller must pop them. +@code{FUNCTION_ARG} for these arguments should return the first +register to be used by the caller for this argument; likewise +@code{FUNCTION_INCOMING_ARG}, for the called function. + +@item CUMULATIVE_ARGS +A C type for declaring a variable that is used as the first argument +of @code{FUNCTION_ARG} and other related values.
For some target +machines, the type @code{int} suffices and can hold the number of +bytes of argument so far. + +@item INIT_CUMULATIVE_ARGS (@var{cum}, @var{fntype}) +A C statement (sans semicolon) for initializing the variable @var{cum} +for the state at the beginning of the argument list. The variable has +type @code{CUMULATIVE_ARGS}. The value of @var{fntype} is the tree node +for the data type of the function which will receive the args, or 0 +if the args are to a compiler support library function. + +@item FUNCTION_ARG_ADVANCE (@var{cum}, @var{mode}, @var{type}, @var{named}) +Update the summarizer variable @var{cum} to advance past an argument +in the argument list. The values @var{mode}, @var{type} and +@var{named} describe that argument. Once this is done, the variable +@var{cum} is suitable for analyzing the @emph{following} argument +with @code{FUNCTION_ARG}, etc.@refill + +@item FUNCTION_ARG_REGNO_P (@var{regno}) +A C expression that is nonzero if @var{regno} is the number of a hard +register in which function arguments are sometimes passed. This does +@emph{not} include implicit arguments such as the static chain and +the structure-value address. On many machines, no registers can be +used for this purpose since all function arguments are pushed on the +stack. + +@item FUNCTION_ARG_PADDING (@var{mode}, @var{size}) +If defined, a C expression which determines whether, and in which direction, +to pad out an argument with extra space. The value should be of type +@code{enum direction}: either @code{upward} to pad above the argument, +@code{downward} to pad below, or @code{none} to inhibit padding. + +The argument @var{size} is an RTX which describes the size of the +argument, in bytes. It should be used only if @var{mode} is +@code{BLKmode}. Otherwise, @var{size} is 0. + +This macro does not control the @emph{amount} of padding; that is +always just enough to reach the next multiple of @code{PARM_BOUNDARY}. 
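The amount of padding described above can be sketched as a small calculation. This assumes that @code{PARM_BOUNDARY} is expressed in bits and uses illustrative values for a 32-bit target; the function name @code{padding_bytes} is invented for this example and is not part of GNU CC.

```c
/* Assumed values for a hypothetical 32-bit target.  */
#define BITS_PER_UNIT 8    /* bits per byte */
#define PARM_BOUNDARY 32   /* argument alignment, assumed to be in bits */

/* Number of bytes of padding needed to bring an argument of SIZE
   bytes up to the next multiple of PARM_BOUNDARY.  The direction in
   which the padding goes is a separate question, answered by
   FUNCTION_ARG_PADDING.  */
static int
padding_bytes (int size)
{
  int boundary = PARM_BOUNDARY / BITS_PER_UNIT;
  return (boundary - size % boundary) % boundary;
}
```

With these values a one-byte argument receives three bytes of padding and a four-byte argument receives none.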
+ +This macro has a default definition which is right for most systems. +For little-endian machines, the default is to pad upward. For +big-endian machines, the default is to pad downward for an argument of +constant size shorter than an @code{int}, and upward otherwise. @item FUNCTION_PROLOGUE (@var{file}, @var{size}) A C compound statement that outputs the assembler code for entry to a function. The prologue is responsible for setting up the stack frame, initializing the frame pointer register, saving registers that must be -saved, and allocating @var{size} additional bytes of storage for the local -variables. @var{size} is an integer. @var{file} is a stdio stream to -which the assembler code should be output. +saved, and allocating @var{size} additional bytes of storage for the +local variables. @var{size} is an integer. @var{file} is a stdio +stream to which the assembler code should be output. The label for the beginning of the function need not be output by this macro. That has already been done when the macro is run. To determine which registers to save, the macro can refer to the array -@code{regs_ever_live}: element @var{r} is nonzero if hard register @var{r} -is used anywhere within the function. This implies the function prologue -should save register @var{r}, but not if it is one of the call-used -registers. +@code{regs_ever_live}: element @var{r} is nonzero if hard register +@var{r} is used anywhere within the function. This implies the +function prologue should save register @var{r}, but not if it is one +of the call-used registers. + +On machines where functions may or may not have frame-pointers, the +function entry code must vary accordingly; it must set up the frame +pointer if one is wanted, and not otherwise. To determine whether a +frame pointer is wanted, the macro can refer to the variable +@code{frame_pointer_needed}. The variable's value will be 1 at run +time in a function that needs a frame pointer.
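The rules above can be put together in a rough sketch of a prologue. Everything machine-specific here is an assumption: the assembler mnemonics are invented, the register count is arbitrary, and the three variables are local stand-ins for the ones the compiler would supply (@code{call_used_regs} is assumed to be the array form of @code{CALL_USED_REGISTERS}). The helper @code{output_prologue} is a hypothetical name.

```c
#include <stdio.h>

#define FIRST_PSEUDO_REGISTER 8   /* assumed: eight hard registers */

/* Stand-ins for variables that the compiler itself maintains.  */
static char regs_ever_live[FIRST_PSEUDO_REGISTER];
static char call_used_regs[FIRST_PSEUDO_REGISTER];
static int frame_pointer_needed;

/* Write a prologue in an invented assembler syntax to STREAM and
   return the number of registers saved.  */
static int
output_prologue (FILE *stream, int size)
{
  int regno, saved = 0;

  /* Set up the frame pointer only if one is wanted.  */
  if (frame_pointer_needed)
    fprintf (stream, "\tlink fp,#%d\n", -size);
  else if (size != 0)
    fprintf (stream, "\tsub sp,#%d\n", size);

  /* Save each register that is used in the function, unless it is
     a call-used register.  */
  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
    if (regs_ever_live[regno] && !call_used_regs[regno])
      {
        fprintf (stream, "\tpush r%d\n", regno);
        saved++;
      }
  return saved;
}

#define FUNCTION_PROLOGUE(STREAM, SIZE) output_prologue (STREAM, SIZE)
```

Keeping the body in a function and letting the macro call it is only a convenience for this sketch; the macro may equally well contain the compound statement directly.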
+ +@item FUNCTION_PROFILER (@var{file}, @var{labelno}) +A C statement or compound statement to output to @var{file} some +assembler code to call the profiling subroutine @code{mcount}. +Before calling, the assembler code must load the address of a +counter variable into a register where @code{mcount} expects to +find the address. The name of this variable is @samp{LP} followed +by the number @var{labelno}, so you would generate the name using +@samp{LP%d} in a @code{fprintf}. + +The details of how the address should be passed to @code{mcount} are +determined by your operating system environment, not by GNU CC. To +figure them out, compile a small program for profiling using the +system's installed C compiler and look at the assembler code that +results. + +@item EXIT_IGNORES_STACK +Define this macro as a C expression that is nonzero if the return +instruction or the function epilogue ignores the value of the stack +pointer; in other words, if it is safe to delete an instruction to +adjust the stack pointer before a return from the function. + +Note that this macro's value is relevant only for functions for which frame +pointers are maintained. It is never possible to delete a final stack +adjustment in a function that has no frame pointer, and the compiler +knows this regardless of @code{EXIT_IGNORES_STACK}. @item FUNCTION_EPILOGUE (@var{file}, @var{size}) A C compound statement that outputs the assembler code for exit from a @@ -3152,13 +6765,78 @@ same arguments as the macro @code{FUNCTI registers to restore are determined from @code{regs_ever_live} and @code{CALL_USED_REGISTERS} in the same way. -On some machines, there is a single instruction that does all the work of -returning from the function. On these machines, give that instruction the -name @samp{return} and do not define the macro @code{FUNCTION_EPILOGUE} at -all. +On some machines, there is a single instruction that does all the work +of returning from the function.
On these machines, give that +instruction the name @samp{return} and do not define the macro +@code{FUNCTION_EPILOGUE} at all. + +Do not define a pattern named @samp{return} if you want the +@code{FUNCTION_EPILOGUE} to be used. If you want the target switches +to control whether return instructions or epilogues are used, define a +@samp{return} pattern with a validity condition that tests the target +switches appropriately. If the @samp{return} pattern's validity +condition is false, epilogues will be used. + +On machines where functions may or may not have frame-pointers, the +function exit code must vary accordingly. Sometimes the code for +these two cases is completely different. To determine whether a frame +pointer is wanted, the macro can refer to the variable +@code{frame_pointer_needed}. The variable's value will be 1 at run +time in a function that needs a frame pointer. + +On some machines, some functions pop their arguments on exit while +others leave that for the caller to do. For example, the 68020 when +given @samp{-mrtd} pops arguments in functions that take a fixed +number of arguments. + +Your definition of the macro @code{RETURN_POPS_ARGS} decides which +functions pop their own arguments. @code{FUNCTION_EPILOGUE} needs to +know what was decided. The variable @code{current_function_pops_args} +is nonzero if the function should pop its own arguments. If so, use +the variable @code{current_function_args_size} as the number of bytes +to pop. + +@item FIX_FRAME_POINTER_ADDRESS (@var{addr}, @var{depth}) +A C compound statement to alter a memory address that uses the frame +pointer register so that it uses the stack pointer register instead. +This must be done in the instructions that load parameter values into +registers, when the reload pass determines that a frame pointer is not +necessary for the function. @var{addr} will be a C variable name, and +the updated address should be stored in that variable.
@var{depth} +will be the current depth of stack temporaries (number of bytes of +arguments currently pushed). The change in offset between a +frame-pointer-relative address and a stack-pointer-relative address +must include @var{depth}. + +Even if your machine description specifies there will always be a +frame pointer in the frame pointer register, you must still define +@code{FIX_FRAME_POINTER_ADDRESS}, but the definition will never be +executed at run time, so it may be empty. +@end table + +@node Library Names, Addressing Modes, Stack Layout, Machine Macros +@section Library Subroutine Names + +@table @code +@item UDIVSI3_LIBCALL +A C string constant giving the name of the function to call for +division of a full-word by a full-word. If you do not define this +macro, the default name is used, which is @code{_udivsi3}, a function +defined in @file{gnulib}. + +@item UMODSI3_LIBCALL +A C string constant giving the name of the function to call for the +remainder in division of a full-word by a full-word. If you do not +define this macro, the default name is used, which is @code{_umodsi3}, +a function defined in @file{gnulib}. + +@item TARGET_MEM_FUNCTIONS +Define this macro if GNU CC should generate calls to the System V +(and ANSI C) library functions @code{memcpy} and @code{memset} +rather than the BSD functions @code{bcopy} and @code{bzero}. @end table -@node Addressing Modes, Misc, Stack Layout, Machine Macros +@node Addressing Modes, Misc, Library Names, Machine Macros @section Addressing Modes @table @code @@ -3171,10 +6849,14 @@ Define this macro if the machine support Similar for other kinds of addressing. @item CONSTANT_ADDRESS_P (@var{x}) -A C expression that is 1 if the rtx @var{x} is a constant whose value +A C expression that is 1 if the RTX @var{x} is a constant whose value is an integer. This includes integers whose values are not explicitly -known, such as @samp{symbol_ref} and @samp{label_ref} expressions -and @samp{const} arithmetic expressions. 
+known, such as @samp{symbol_ref} and @samp{label_ref} expressions and +@samp{const} arithmetic expressions. + +On most machines, this can be defined as @code{CONSTANT_P (@var{x})}, +but a few machines are more restrictive in which constant addresses +are supported. @item MAX_REGS_PER_ADDRESS A number, the maximum number of registers that can appear in a valid @@ -3182,18 +6864,62 @@ memory address. @item GO_IF_LEGITIMATE_ADDRESS (@var{mode}, @var{x}, @var{label}) A C compound statement with a conditional @code{goto @var{label};} -executed if @var{x} (an rtx) is a legitimate memory address on -the target machine for a memory operand of mode @var{mode}. +executed if @var{x} (an RTX) is a legitimate memory address on the +target machine for a memory operand of mode @var{mode}. It usually pays to define several simpler macros to serve as -subroutines for this one. Otherwise it may be too complicated -to understand. +subroutines for this one. Otherwise it may be too complicated to +understand. + +This macro must exist in two variants: a strict variant and a +non-strict one. The strict variant is used in the reload pass. It +must be defined so that any pseudo-register that has not been +allocated a hard register is considered a memory reference. In +contexts where some kind of register is required, a pseudo-register +with no hard register must be rejected. + +The non-strict variant is used in other passes. It must be defined to +accept all pseudo-registers in every context where some kind of +register is required. + +Compiler source files that want to use the strict variant of this +macro define the macro @code{REG_OK_STRICT}. You should use an +@code{#ifdef REG_OK_STRICT} conditional to define the strict variant +in that case and the non-strict variant otherwise. 
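A minimal sketch of the two variants follows, written for a register-checking subroutine macro of the kind that @code{GO_IF_LEGITIMATE_ADDRESS} would call. The target is hypothetical: sixteen hard registers, of which 8 through 15 may serve as base registers. @code{struct fake_reg}, @code{HARD_BASE_P} and the size of @code{reg_renumber} are stand-ins invented here, and @code{reg_renumber} is assumed to hold the hard register allocated to each pseudo-register, or -1 if none.

```c
#define FIRST_PSEUDO_REGISTER 16   /* assumed: hard registers are 0..15 */

/* Stand-in for a `reg' RTX and its accessor.  */
struct fake_reg { int regno; };
#define REGNO(X) ((X)->regno)

/* Stand-in table: hard register given to each pseudo, or -1.  */
short reg_renumber[32];

/* Hypothetical hardware rule: registers 8..15 may be base registers.  */
#define HARD_BASE_P(N) ((N) >= 8 && (N) < FIRST_PSEUDO_REGISTER)

#ifdef REG_OK_STRICT
/* Strict variant: a pseudo-register is acceptable only if it has been
   allocated a suitable hard register.  */
#define REG_OK_FOR_BASE_P(X)                                    \
  (REGNO (X) < FIRST_PSEUDO_REGISTER                            \
   ? HARD_BASE_P (REGNO (X))                                    \
   : (reg_renumber[REGNO (X)] >= 0                              \
      && HARD_BASE_P (reg_renumber[REGNO (X)])))
#else
/* Non-strict variant: every pseudo-register is acceptable.  */
#define REG_OK_FOR_BASE_P(X)                                    \
  (REGNO (X) >= FIRST_PSEUDO_REGISTER || HARD_BASE_P (REGNO (X)))
#endif
```

Both variants treat hard registers identically; they differ only in how pseudo-registers are handled.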
+ +Typically among the subroutines used to define +@code{GO_IF_LEGITIMATE_ADDRESS} are subroutines to check for +acceptable registers for various purposes (one for base registers, one +for index registers, and so on). Then only these subroutine macros +need have two variants; the higher levels of macros may be the same +whether strict or not.@refill + +@item REG_OK_FOR_BASE_P (@var{x}) +A C expression that is nonzero if @var{x} (assumed to be a @code{reg} +RTX) is valid for use as a base register. For hard registers, it +should always accept those which the hardware permits and reject the +others. Whether the macro accepts or rejects pseudo registers must be +controlled by @code{REG_OK_STRICT} as described above. This usually +requires two variant definitions, of which @code{REG_OK_STRICT} +controls the one actually used. + +@item REG_OK_FOR_INDEX_P (@var{x}) +A C expression that is nonzero if @var{x} (assumed to be a @code{reg} +RTX) is valid for use as an index register. + +The difference between an index register and a base register is that +the index register may be scaled. If an address involves the sum of +two registers, neither one of them scaled, then either one may be +labeled the ``base'' and the other the ``index''; but whichever +labeling is used must fit the machine's constraints of which registers +may serve in each capacity. The compiler will try both labelings, +looking for one that is valid, and will reload one or both registers +only if neither labeling works. @item LEGITIMIZE_ADDRESS (@var{x}, @var{oldx}, @var{mode}, @var{win}) A C compound statement that attempts to replace @var{x} with a valid -memory address for an operand of mode @var{mode}. @var{win} will be -a C statement label elsewhere in the code; the macro definition -may use +memory address for an operand of mode @var{mode}.
@var{win} will be a +C statement label elsewhere in the code; the macro definition may use @example GO_IF_LEGITIMATE_ADDRESS (@var{mode}, @var{x}, @var{win}); @@ -3206,14 +6932,35 @@ to avoid further processing if the addre and @var{oldx} will be the operand that was given to that function to produce @var{x}. -The code generated by this macro should not alter the substructure of @var{x}. -If it transforms @var{x} into a more legitimate form, it should assign @var{x} -(which will always be a C variable) a new value. - -It is not necessary for this macro to come up with a legitimate address. -The compiler has standard ways of doing so in all cases. In fact, it is -safe for this macro to do nothing. But often a machine-dependent strategy -can generate better code. +The code generated by this macro should not alter the substructure of +@var{x}. If it transforms @var{x} into a more legitimate form, it +should assign @var{x} (which will always be a C variable) a new value. + +It is not necessary for this macro to come up with a legitimate +address. The compiler has standard ways of doing so in all cases. In +fact, it is safe for this macro to do nothing. But often a +machine-dependent strategy can generate better code. + +@item GO_IF_MODE_DEPENDENT_ADDRESS (@var{addr}, @var{label}) +A C statement or compound statement with a conditional @code{goto +@var{label};} executed if memory address @var{addr} (an RTX) can have +different meanings depending on the machine mode of the memory +reference it is used for. + +Autoincrement and autodecrement addresses typically have mode-dependent +effects because the amount of the increment or decrement is the size +of the operand being addressed. Some machines have other mode-dependent +addresses. Many RISC machines have no mode-dependent addresses. + +You may assume that @var{addr} is a valid address for the machine.
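As an illustration of the autoincrement case, here is a minimal sketch for a hypothetical machine whose only mode-dependent addresses are post-increment and pre-decrement. The structure @code{struct fake_rtx}, the reduced list of codes and the wrapper @code{mode_dependent_address_p} are stand-ins invented so that the fragment is self-contained; a real definition would operate on genuine RTX objects.

```c
/* Stand-ins for the RTX codes and accessor used below.  */
enum rtx_code { REG, PLUS, POST_INC, PRE_DEC };
struct fake_rtx { enum rtx_code code; };
#define GET_CODE(X) ((X)->code)

/* Jump to LABEL if ADDR is an autoincrement or autodecrement address,
   the only mode-dependent kinds on this assumed machine.  */
#define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR, LABEL)               \
  { if (GET_CODE (ADDR) == POST_INC || GET_CODE (ADDR) == PRE_DEC) \
      goto LABEL; }

/* Hypothetical wrapper showing how a caller uses the macro:
   the value is 1 if ADDR is mode-dependent, 0 otherwise.  */
static int
mode_dependent_address_p (struct fake_rtx *addr)
{
  GO_IF_MODE_DEPENDENT_ADDRESS (addr, win);
  return 0;
 win:
  return 1;
}
```

On a machine with no mode-dependent addresses, the macro body can simply be empty.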
+ +@item LEGITIMATE_CONSTANT_P (@var{x}) +A C expression that is nonzero if @var{x} is a legitimate constant for +an immediate operand on the target machine. You can assume that +either @var{x} is a @samp{const_double} or it satisfies +@code{CONSTANT_P}, so you need not check these things. In fact, +@samp{1} is a suitable definition for this macro on machines where any +@samp{const_double} is valid and anything @code{CONSTANT_P} is valid.@refill @end table @node Misc, Condition Code, Addressing Modes, Machine Macros @@ -3221,70 +6968,138 @@ can generate better code. @table @code @item CASE_VECTOR_MODE -An alias for a machine mode name. This is the machine mode that elements -of a jump-table should have. +An alias for a machine mode name. This is the machine mode that +elements of a jump-table should have. @item CASE_VECTOR_PC_RELATIVE Define this macro if jump-tables should contain relative addresses. +@item CASE_DROPS_THROUGH +Define this if control falls through a @code{case} insn when the index +value is out of range. This means the specified default-label is +actually ignored by the @code{case} insn proper. + @item IMPLICIT_FIX_EXPR An alias for a tree code that should be used by default for conversion -of floating point values to fixed point. Normally, @code{FIX_ROUND_EXPR} -is used. +of floating point values to fixed point. Normally, +@code{FIX_ROUND_EXPR} is used.@refill + +@item FIXUNS_TRUNC_LIKE_FIX_TRUNC +Define this macro if the same instructions that convert a floating +point number to a signed fixed point number also convert validly to an +unsigned one. @item EASY_DIV_EXPR -An alias for a tree code that is the easiest kind of division to compile -code for in the general case. It may be @code{TRUNC_DIV_EXPR}, -@code{FLOOR_DIV_EXPR}, @code{CEIL_DIV_EXPR} or @code{ROUND_DIV_EXPR}. -These differ in how they round the result to an integer. 
-@code{EASY_DIV_EXPR} is used when it is permissible to use any of those -kinds of division and the choice should be made on the basis of efficiency. +An alias for a tree code that is the easiest kind of division to +compile code for in the general case. It may be +@code{TRUNC_DIV_EXPR}, @code{FLOOR_DIV_EXPR}, @code{CEIL_DIV_EXPR} or +@code{ROUND_DIV_EXPR}. These four division operators differ in how +they round the result to an integer. @code{EASY_DIV_EXPR} is used +when it is permissible to use any of those kinds of division and the +choice should be made on the basis of efficiency.@refill + +@item DEFAULT_SIGNED_CHAR +An expression whose value is 1 or 0, according to whether the type +@code{char} should be signed or unsigned by default. The user can +always override this default with the options @samp{-fsigned-char} +and @samp{-funsigned-char}. + +@item SCCS_DIRECTIVE +Define this if the preprocessor should ignore @code{#sccs} directives +and print no error message. + +@item IDENT_DIRECTIVE +Define this if the preprocessor should ignore @code{#ident} directives +and print no error message. @item MOVE_MAX The maximum number of bytes that a single instruction can move quickly from memory to memory. +@item INT_TYPE_SIZE +A C expression for the size in bits of the type @code{int} on the +target machine. + +@item SLOW_BYTE_ACCESS +Define this macro as a C expression which is nonzero if accessing less +than a word of memory (i.e. a @code{char} or a @code{short}) is slow +(requires more than one instruction). + @item SLOW_ZERO_EXTEND -Define this macro if zero-extension (of chars or shorts to integers) -can be done faster if the destination is a register that is known to be zero. +Define this macro if zero-extension (of a @code{char} or @code{short} +to an @code{int}) can be done faster if the destination is a register +that is known to be zero. 
+ +If you define this macro, you must have instruction patterns that +recognize RTL structures like this: + +@example +(set (strict-low-part (subreg:QI (reg:SI @dots{}) 0)) @dots{}) +@end example + +@noindent +and likewise for @code{HImode}. @item SHIFT_COUNT_TRUNCATED Define this macro if shift instructions ignore all but the lowest few bits of the shift count. It implies that a sign-extend or zero-extend instruction for the shift count can be omitted. -@item TRULY_NOOP_TRUNCATON (@var{outprec}, @var{inprec}) +@item TRULY_NOOP_TRUNCATION (@var{outprec}, @var{inprec}) A C expression which is nonzero if on this machine it is safe to -``convert'' an integer of @var{inprec} bits to one of @var{outprec} bits -(where @var{outprec} is smaller than @var{inprec}) by merely operating -on it as if it had only @var{inprec} bits. +``convert'' an integer of @var{inprec} bits to one of @var{outprec} +bits (where @var{outprec} is smaller than @var{inprec}) by merely +operating on it as if it had only @var{outprec} bits. On many machines, this expression can be 1. +@item NO_FUNCTION_CSE +Define this macro if it is as good or better to call a constant +function address than to call an address kept in a register. + +@item PROMOTE_PROTOTYPES +Define this macro if an argument declared as @code{char} or +@code{short} in a prototype should actually be passed as an +@code{int}. In addition to avoiding errors in certain cases of +mismatch, it also makes for better code on certain machines. + +@item STORE_FLAG_VALUE +A C expression for the value stored by a store-flag instruction +(@code{s@var{cond}}) when the condition is true. This is usually 1 or +-1; it is required to be an odd number. + +Do not define @code{STORE_FLAG_VALUE} if the machine has no store-flag +instructions. + @item Pmode -An alias for the machine mode for pointers. Normally the definition can be +An alias for the machine mode for pointers. 
Normally the definition +can be @example #define Pmode SImode @end example @item FUNCTION_MODE -An alias for the machine mode used for memory references to functions being -called, in @samp{call} RTL expressions. On most machines this should be -@code{QImode}. - -@item CONST_COST (@var{x}, @var{code}) -A part of a C @code{switch} statement that describes the relative costs of -constant RTL expressions. It must contain @code{case} labels for -expression codes @samp{const_int}, @samp{const}, @samp{symbol_ref}, -@samp{label_ref} and @code{const_double}. Each case must ultimately reach -a @code{return} statement to return the relative cost of the use of that +An alias for the machine mode used for memory references to functions +being called, in @samp{call} RTL expressions. On most machines this +should be @code{QImode}. + +@item CONST_COSTS (@var{x}, @var{code}) +A part of a C @code{switch} statement that describes the relative +costs of constant RTL expressions. It must contain @code{case} labels +for expression codes @samp{const_int}, @samp{const}, @samp{symbol_ref}, @samp{label_ref} +and @samp{const_double}. Each case must ultimately reach a +@code{return} statement to return the relative cost of the use of that kind of constant value in an expression. The cost may depend on the precise value of the constant, which is available for examination in @var{x}. -@var{code} is the expression code---redundant, since it can be obtained with -@code{GET_CODE (@var{x})}. +@var{code} is the expression code---redundant, since it can be +obtained with @code{GET_CODE (@var{x})}. + +@item DOLLARS_IN_IDENTIFIERS +Define this to be nonzero if the character @samp{$} should be allowed +by default in identifier names. @end table @node Condition Code, Assembler Format, Misc, Machine Macros @@ -3302,90 +7117,378 @@ information by defining @code{CC_STATUS_ @table @code @item CC_STATUS_MDEP -A type, with which the @code{mdep} component of @code{cc_status} should -be declared. 
It defaults to @code{int}. +C code for a data type which is used for declaring the @code{mdep} +component of @code{cc_status}. It defaults to @code{int}. @item CC_STATUS_MDEP_INIT -A C expression for the initial value of the @code{mdep} field. -It defaults to 0. +A C expression for the initial value of the @code{mdep} field. It +defaults to 0. @item NOTICE_UPDATE_CC (@var{exp}) A C compound statement to set the components of @code{cc_status} -appropriately for an insn whose body is @var{exp}. It is this -macro's responsibility to recognize insns that set the condition code -as a byproduct of other activity as well as those that explicitly -set @code{(cc0)}. - -If there are insn that do not set the condition code but do alter other -machine registers, this macro must check to see whether they invalidate the -expressions that the condition code is recorded as reflecting. For -example, on the 68000, insns that store in address registers do not set the -condition code, which means that usually @code{NOTICE_UPDATE_CC} can leave -@code{cc_status} unaltered for such insns. But suppose that the previous -insn set the condition code based on location @code{a4@@(102)} and the -current insn stores a new value in @code{a4}. Although the condition code -is not changed by this, it will no longer be true that it reflects the -contents of @code{a4@@(102)}. Therefore, @code{NOTICE_UPDATE_CC} must alter +appropriately for an insn whose body is @var{exp}. It is this macro's +responsibility to recognize insns that set the condition code as a +byproduct of other activity as well as those that explicitly set +@code{(cc0)}. + +If there are insns that do not set the condition code but do alter +other machine registers, this macro must check to see whether they +invalidate the expressions that the condition code is recorded as +reflecting.
For example, on the 68000, insns that store in address +registers do not set the condition code, which means that usually +@code{NOTICE_UPDATE_CC} can leave @code{cc_status} unaltered for such +insns. But suppose that the previous insn set the condition code +based on location @samp{a4@@(102)} and the current insn stores a new +value in @samp{a4}. Although the condition code is not changed by +this, it will no longer be true that it reflects the contents of +@samp{a4@@(102)}. Therefore, @code{NOTICE_UPDATE_CC} must alter @code{cc_status} in this case to say that nothing is known about the condition code value. + +The definition of @code{NOTICE_UPDATE_CC} must be prepared to deal +with the results of peephole optimization: insns whose patterns are +@samp{parallel} RTXs containing various @samp{reg}, @samp{mem} or +constants which are just the operands. The RTL structure of these +insns is not sufficient to indicate what the insns actually do. What +@code{NOTICE_UPDATE_CC} should do when it sees one is just to run +@code{CC_STATUS_INIT}. @end table @node Assembler Format,, Condition Code, Machine Macros @section Output of Assembler Code @table @code +@item ASM_SPEC +A C string constant that tells the GNU CC driver program options to +pass to the assembler. It can also specify how to translate options +you give to GNU CC into options for GNU CC to pass to the assembler. +See the file @file{tm-sun3.h} for an example of this. + +Do not define this macro if it does not need to do anything. + +@item LINK_SPEC +A C string constant that tells the GNU CC driver program options to +pass to the linker. It can also specify how to translate options you +give to GNU CC into options for GNU CC to pass to the linker. + +Do not define this macro if it does not need to do anything. + +@item LIB_SPEC +Another C string constant used much like @code{LINK_SPEC}. The difference +between the two is that @code{LIB_SPEC} is used at the end of the +command given to the linker.
+ +If this macro is not defined, a default is provided that +loads the standard C library from the usual place. See @file{gcc.c}. + +@item STARTFILE_SPEC +Another C string constant used much like @code{LINK_SPEC}. The +difference between the two is that @code{STARTFILE_SPEC} is used at +the very beginning of the command given to the linker. + +If this macro is not defined, a default is provided that loads the +standard C startup file from the usual place. See @file{gcc.c}. + +@item ASM_FILE_START (@var{stream}) +A C expression which outputs to the stdio stream @var{stream} +some appropriate text to go at the start of an assembler file. + +Normally this macro is defined to output a line containing +@samp{#NO_APP}, which is a comment that has no effect on most +assemblers but tells the GNU assembler that it can save time by not +checking for certain assembler constructs. + +On systems that use SDB, it is necessary to output certain commands; +see @file{tm-attasm.h}. + +@item ASM_APP_ON +A C string constant for text to be output before each @code{asm} +statement or group of consecutive ones. Normally this is +@code{"#APP"}, which is a comment that has no effect on most +assemblers but tells the GNU assembler that it must check the lines +that follow for all valid assembler constructs. + +@item ASM_APP_OFF +A C string constant for text to be output after each @code{asm} +statement or group of consecutive ones. Normally this is +@code{"#NO_APP"}, which tells the GNU assembler to resume making the +time-saving assumptions that are valid for ordinary compiler output. + @item TEXT_SECTION_ASM_OP A C string constant for the assembler operation that should precede instructions and read-only data. Normally @code{".text"} is right. @item DATA_SECTION_ASM_OP -A C string constant for the assembler operation to identify the following -data as writable initialized data. Normally @code{".data"} is right. 
+A C string constant for the assembler operation to identify the +following data as writable initialized data. Normally @code{".data"} +is right. @item REGISTER_NAMES -A C initializer containing the assembler's names for the machine registers, -each one as a C string constant. This is what translates register numbers -in the compiler into assembler language. +A C initializer containing the assembler's names for the machine +registers, each one as a C string constant. This is what translates +register numbers in the compiler into assembler language. @item DBX_REGISTER_NUMBER (@var{regno}) -A C expression that returns the DBX register number for the compiler register -number @var{regno}. In simple cases, the value of this expression may be -@var{regno} itself. But sometimes there are some registers that the compiler -knows about and DBX does not, or vice versa. In such cases, some register -may need to have one number in the compiler and another for DBX. +A C expression that returns the DBX register number for the compiler +register number @var{regno}. In simple cases, the value of this +expression may be @var{regno} itself. But sometimes there are some +registers that the compiler knows about and DBX does not, or vice +versa. In such cases, some register may need to have one number in +the compiler and another for DBX. + +@item DBX_DEBUGGING_INFO +Define this macro if GNU CC should produce debugging output for DBX +in response to the @samp{-g} option. + +@item SDB_DEBUGGING_INFO +Define this macro if GNU CC should produce debugging output for SDB +in response to the @samp{-g} option. + +@item PUT_SDB_@var{op} +Define these macros to override the assembler syntax for the special +SDB assembler directives. See @file{sdbout.c} for a list of these +macros and their arguments. If the standard syntax is used, you need +not define them yourself. 
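
The two register macros described above, @code{REGISTER_NAMES} and @code{DBX_REGISTER_NUMBER}, can be illustrated with a minimal sketch for a made-up four-register machine. The register names, the DBX numbering and every @code{demo_} identifier are assumptions invented for this example; they are not taken from any real machine description.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical machine: two data registers, one address register and
   a stack pointer.  The names are invented for this sketch.  */
#define REGISTER_NAMES { "d0", "d1", "a0", "sp" }

/* An array like this, initialized from REGISTER_NAMES, is what turns
   compiler register numbers into assembler names.  */
static const char *demo_reg_names[] = REGISTER_NAMES;

#define DEMO_NUM_REGS \
  ((int) (sizeof demo_reg_names / sizeof demo_reg_names[0]))

/* Assumption: the debugger numbers registers 0 and 1 as the compiler
   does, but numbers the remaining registers starting from 8.  */
#define DBX_REGISTER_NUMBER(REGNO) ((REGNO) < 2 ? (REGNO) : (REGNO) + 6)

/* Return the assembler name for compiler register REGNO.  */
static const char *
demo_register_name (int regno)
{
  assert (regno >= 0 && regno < DEMO_NUM_REGS);
  return demo_reg_names[regno];
}
```

On a machine where the compiler and DBX agree on every register, the second macro would reduce to its argument.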
+ +@item SDB_GENERATE_FAKE +Define this macro to override the usual method of constructing a dummy +name for anonymous structure and union types. See @file{sdbout.c} for +more information. + +@item DBX_NO_XREFS +Define this macro if DBX on your system does not support the construct +@samp{xs@var{tagname}}. On some systems, this construct is used to +describe a forward reference to a structure named @var{tagname}. +On other systems, this construct is not supported at all. + +@item DBX_CONTIN_LENGTH +A symbol name in DBX-format debugging information is normally +continued (split into two separate @code{.stabs} directives) when it +exceeds a certain length (by default, 80 characters). On some +operating systems, DBX requires this splitting; on others, splitting +must not be done. You can inhibit splitting by defining this macro +with the value zero. You can override the default splitting-length by +defining this macro as an expression for the length you desire. + +@item DBX_CONTIN_CHAR +Normally continuation is indicated by adding a @samp{\} character to +the end of a @code{.stabs} string when a continuation follows. To use +a different character instead, define this macro as a character +constant for the character you want to use. Do not define this macro +if backslash is correct for your system. + +@item ASM_OUTPUT_LABEL (@var{stream}, @var{name}) +A C statement (sans semicolon) to output to the stdio stream +@var{stream} the assembler definition of a label named @var{name}. Use +the expression @code{assemble_name (@var{stream}, @var{name})} to output +the name itself; before and after that, output the additional +assembler syntax for defining the name, and a newline. + +@item ASM_DECLARE_FUNCTION_NAME (@var{stream}, @var{name}, @var{decl}) +A C statement (sans semicolon) to output to the stdio stream +@var{stream} any text necessary for declaring the name @var{name} of a +function which is being defined.
This macro is responsible for +outputting the label definition (perhaps using +@code{ASM_OUTPUT_LABEL}). The argument @var{decl} is the +@code{FUNCTION_DECL} tree node representing the function. + +If this macro is not defined, then the function name is defined in the +usual manner as a label (by means of @code{ASM_OUTPUT_LABEL}). + +@item ASM_GLOBALIZE_LABEL (@var{stream}, @var{name}) +A C statement (sans semicolon) to output to the stdio stream +@var{stream} some commands that will make the label @var{name} global; +that is, available for reference from other files. Use the expression +@code{assemble_name (@var{stream}, @var{name})} to output the name +itself; before and after that, output the additional assembler syntax +for making that name global, and a newline. + +@item ASM_OUTPUT_EXTERNAL (@var{stream}, @var{name}, @var{decl}) +A C statement (sans semicolon) to output to the stdio stream +@var{stream} any text necessary for declaring the name of an external +symbol named @var{name} which is referenced in this compilation but +not defined. The value of @var{decl} is the tree node for the +declaration. + +This macro need not be defined if it does not need to output anything. +The GNU assembler and most Unix assemblers don't require anything. + +@item ASM_OUTPUT_LABELREF (@var{stream}, @var{name}) +A C statement to output to the stdio stream @var{stream} a reference in +assembler syntax to a label named @var{name}. The character @samp{_} +should be added to the front of the name, if that is customary on your +operating system, as it is in most Berkeley Unix systems. This macro +is used in @code{assemble_name}. + +@item ASM_GENERATE_INTERNAL_LABEL (@var{string}, @var{prefix}, @var{num}) +A C statement to store into the string @var{string} a label whose +name is made from the string @var{prefix} and the number @var{num}. 
+ +This string, when output subsequently by @code{ASM_OUTPUT_LABELREF}, +should produce the same output that @code{ASM_OUTPUT_INTERNAL_LABEL} +would produce with the same @var{prefix} and @var{num}. + +@item ASM_OUTPUT_INTERNAL_LABEL (@var{stream}, @var{prefix}, @var{num}) +A C statement to output to the stdio stream @var{stream} a label whose +name is made from the string @var{prefix} and the number @var{num}. +These labels are used for internal purposes, and there is no reason +for them to appear in the symbol table of the object file. On many +systems, the letter @samp{L} at the beginning of a label has this +effect. The usual definition of this macro is as follows: + +@example +fprintf (@var{stream}, "L%s%d:\n", @var{prefix}, @var{num}) +@end example -@item ASM_OUTPUT_DOUBLE (@var{file}, @var{value}) -A C statement to output to the stdio stream @var{file} an assembler +@item ASM_OUTPUT_CASE_LABEL (@var{stream}, @var{prefix}, @var{num}, @var{table}) +Define this if the label before a jump-table needs to be output +specially. The first three arguments are the same as for +@code{ASM_OUTPUT_INTERNAL_LABEL}; the fourth argument is the +jump-table which follows (a @samp{jump_insn} containing an +@samp{addr_vec} or @samp{addr_diff_vec}). + +This feature is used on System V to output a @code{swbeg} statement +for the table. + +If this macro is not defined, these labels are output with +@code{ASM_OUTPUT_INTERNAL_LABEL}. + +@item ASM_OUTPUT_CASE_END (@var{stream}, @var{num}, @var{table}) +Define this if something special must be output at the end of a jump-table. +The definition should be a C statement to be executed after the assembler +code for the table is written. It should write the appropriate code to +stdio stream @var{stream}. The argument @var{table} is the jump-table +insn, and @var{num} is the label-number of the preceding label. + +If this macro is not defined, nothing special is output at the end of +the jump-table.
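
The requirement that an internal label be spelled the same way where it is defined and where it is generated as a string can be sketched as follows. This is only an illustration under stated assumptions: the @code{demo_} functions and the @code{DEMO_GENERATE_INTERNAL_LABEL} macro are hypothetical stand-ins, the shared helper is a design choice of this sketch rather than something GNU CC requires, and the @samp{L} prefix follows the usual definition quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Build the text "L<prefix><num>" in BUF.  Both stand-ins below go
   through this helper, so a label definition and a later reference
   to the same label cannot drift apart.  */
static void
demo_internal_label_name (char *buf, size_t size,
                          const char *prefix, int num)
{
  int n = snprintf (buf, size, "L%s%d", prefix, num);
  assert (n >= 0 && (size_t) n < size);   /* the name must fit in BUF */
}

/* Stand-in for ASM_GENERATE_INTERNAL_LABEL: store the name in STRING.
   Assumes STRING is a char array, not a pointer, so that sizeof
   yields the buffer size.  */
#define DEMO_GENERATE_INTERNAL_LABEL(STRING, PREFIX, NUM) \
  demo_internal_label_name (STRING, sizeof (STRING), PREFIX, NUM)

/* Stand-in for ASM_OUTPUT_INTERNAL_LABEL: write the label definition,
   that is, the name followed by a colon and a newline.  */
static void
demo_output_internal_label (FILE *stream, const char *prefix, int num)
{
  char name[64];
  demo_internal_label_name (name, sizeof name, prefix, num);
  fprintf (stream, "%s:\n", name);
}
```

Deriving both spellings from one format string is one simple way to keep the two macros consistent when the label syntax of the assembler changes.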
+ +@item ASM_FORMAT_PRIVATE_NAME (@var{outvar}, @var{name}, @var{number}) +A C expression to assign to @var{outvar} (which is a variable of type +@code{char *}) a newly allocated string made from the string +@var{name} and the number @var{number}, with some suitable punctuation +added. Use @code{alloca} to get space for the string. + +This string will be used as the argument to @code{ASM_OUTPUT_LABELREF} +to produce an assembler label for an internal static variable whose +name is @var{name}. Therefore, the string must be such as to result +in valid assembler code. The argument @var{number} is different each +time this macro is executed; it prevents conflicts between +similarly-named internal static variables in different scopes. + +Ideally this string should not be a valid C identifier, to prevent any +conflict with the user's own symbols. Most assemblers allow periods +or percent signs in assembler symbols; putting at least one of these +between the name and the number will suffice. + +@item ASM_OUTPUT_ADDR_DIFF_ELT (@var{stream}, @var{value}, @var{rel}) +This macro should be provided on machines where the addresses +in a dispatch table are relative to the table's own address. + +The definition should be a C statement to output to the stdio stream +@var{stream} an assembler pseudo-instruction to generate a difference +between two labels. @var{value} and @var{rel} are the numbers of two +internal labels. The definitions of these labels are output using +@code{ASM_OUTPUT_INTERNAL_LABEL}, and they must be printed in the same +way here. For example, + +@example +fprintf (@var{stream}, "\t.word L%d-L%d\n", + @var{value}, @var{rel}) +@end example + +@item ASM_OUTPUT_ADDR_VEC_ELT (@var{stream}, @var{value}) +This macro should be provided on machines where the addresses +in a dispatch table are absolute. + +The definition should be a C statement to output to the stdio stream +@var{stream} an assembler pseudo-instruction to generate a reference to +a label. 
@var{value} is the number of an internal label whose +definition is output using @code{ASM_OUTPUT_INTERNAL_LABEL}. +For example, + +@example +fprintf (@var{stream}, "\t.word L%d\n", @var{value}) +@end example + +@item ASM_OUTPUT_DOUBLE (@var{stream}, @var{value}) +A C statement to output to the stdio stream @var{stream} an assembler instruction to assemble a @code{double} constant whose value is -@var{value}. @var{value} will be a C expression of type @code{double}. +@var{value}. @var{value} will be a C expression of type +@code{double}. -@item ASM_OUTPUT_FLOAT (@var{file}, @var{value}) -A C statement to output to the stdio stream @var{file} an assembler -instruction to assemble a @code{float} constant whose value is @var{value}. -@var{value} will be a C expression of type @code{float}. +@item ASM_OUTPUT_FLOAT (@var{stream}, @var{value}) +A C statement to output to the stdio stream @var{stream} an assembler +instruction to assemble a @code{float} constant whose value is +@var{value}. @var{value} will be a C expression of type @code{float}. + +@item ASM_OUTPUT_INT (@var{stream}, @var{exp}) +@itemx ASM_OUTPUT_SHORT (@var{stream}, @var{exp}) +@itemx ASM_OUTPUT_CHAR (@var{stream}, @var{exp}) +A C statement to output to the stdio stream @var{stream} an assembler +instruction to assemble an @code{int}, @code{short} or @code{char} +constant whose value is @var{exp}. The argument @var{exp} will be +an RTL expression which represents a constant value. Use +@samp{output_addr_const (@var{exp})} to output this value as an +assembler expression.@refill + +@item ASM_OUTPUT_BYTE (@var{stream}, @var{value}) +A C statement to output to the stdio stream @var{stream} an assembler +instruction to assemble a single byte containing the number @var{value}. + +@item ASM_OUTPUT_ASCII (@var{stream}, @var{ptr}, @var{len}) +A C statement to output to the stdio stream @var{stream} an assembler +instruction to assemble a string constant containing the @var{len} +bytes at @var{ptr}.
@var{ptr} will be a C expression of type +@code{char *} and @var{len} a C expression of type @code{int}. + +If the assembler has a @code{.ascii} pseudo-op as found in the +Berkeley Unix assembler, do not define the macro +@code{ASM_OUTPUT_ASCII}. -@item ASM_OUTPUT_SKIP (@var{file}, @var{nbytes}) -A C statement to output to the stdio stream @var{file} an assembler +@item ASM_OUTPUT_SKIP (@var{stream}, @var{nbytes}) +A C statement to output to the stdio stream @var{stream} an assembler instruction to advance the location counter by @var{nbytes} bytes. @var{nbytes} will be a C expression of type @code{int}. -@item ASM_OUTPUT_ALIGN (@var{file}, @var{power}) -A C statement to output to the stdio stream @var{file} an assembler +@item ASM_OUTPUT_ALIGN (@var{stream}, @var{power}) +A C statement to output to the stdio stream @var{stream} an assembler instruction to advance the location counter to a multiple of 2 to the @var{power} bytes. @var{power} will be a C expression of type @code{int}. -@item ASM_INT_OP -A C string constant for the assembler operation that assembles constants of -C type @code{int}. A space must follow the operation name. Normally -@code{".long@ "}. - -@item ASM_SHORT_OP -@itemx ASM_CHAR_OP -Likewise, for C types @code{short} and @code{char}. Normally @code{".word@ "} -and @code{".byte@ "}. +@item ASM_OUTPUT_COMMON (@var{stream}, @var{name}, @var{size}) +A C statement (sans semicolon) to output to the stdio stream +@var{stream} the assembler definition of a common-label named @var{name} +whose size is @var{size} bytes. Use the expression +@code{assemble_name (@var{stream}, @var{name})} to output the name +itself; before and after that, output the additional assembler syntax +for defining the name, and a newline. + +This macro controls how the assembler definitions of uninitialized +global variables are output. 
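
The shape of an @code{ASM_OUTPUT_COMMON} definition, with the name written by a separate routine and the assembler syntax placed around it, can be sketched as below. The sketch assumes an assembler that accepts a @code{.comm @var{name},@var{size}} directive and a system where external names get a leading underscore. @code{demo_assemble_name} is a hypothetical stand-in for the compiler's @code{assemble_name}, and @code{DEMO_OUTPUT_COMMON} is likewise invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for assemble_name: write NAME to STREAM with
   a leading underscore, as is customary on many Berkeley Unix
   systems.  */
static void
demo_assemble_name (FILE *stream, const char *name)
{
  assert (name != NULL);
  fprintf (stream, "_%s", name);
}

/* Sketch of an ASM_OUTPUT_COMMON definition.  It is written as a
   comma expression so that it can be used as a statement without a
   trailing semicolon of its own.  Assumes the assembler understands
   ".comm name,size".  */
#define DEMO_OUTPUT_COMMON(STREAM, NAME, SIZE)   \
  (fputs (".comm ", (STREAM)),                   \
   demo_assemble_name ((STREAM), (NAME)),        \
   fprintf ((STREAM), ",%d\n", (SIZE)))
```

With these assumptions, @code{DEMO_OUTPUT_COMMON (stdout, "counter", 4)} writes the single line @code{.comm _counter,4}.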
+ +@item ASM_OUTPUT_LOCAL (@var{stream}, @var{name}, @var{size}) +A C statement (sans semicolon) to output to the stdio stream +@var{stream} the assembler definition of a local-common-label named +@var{name} whose size is @var{size} bytes. Use the expression +@code{assemble_name (@var{stream}, @var{name})} to output the name +itself; before and after that, output the additional assembler syntax +for defining the name, and a newline. + +This macro controls how the assembler definitions of uninitialized +static variables are output. + +@item ASM_OUTPUT_SOURCE_LINE (@var{stream}, @var{line}) +A C statement to output DBX or SDB debugging information before code +for line number @var{line} of the current source file to the +stdio stream @var{stream}. + +This macro need not be defined if the standard form of debugging +information for the debugger in use is appropriate. @item TARGET_BELL -A C constant expression for the integer value for escape sequence @samp{\a}. +A C constant expression for the integer value for escape sequence +@samp{\a}. @item TARGET_BS @itemx TARGET_TAB @@ -3399,19 +7502,106 @@ C constant expressions for the integer v C constant expressions for the integer values for escape sequences @samp{\v}, @samp{\f} and @samp{\r}. -@item PRINT_OPERAND (@var{file}, @var{x}) -A C compound statement to output to stdio stream @var{file} -the assembler syntax for an instruction operand @var{x}. -@var{x} is an RTL expression. - -If @var{x} is a register, this macro should print the register's name. The -names can be found in an array @code{reg_names} whose type is @code{char -*[]}. @code{reg_names} is initialized from @code{REGISTER_NAMES}. - -@item PRINT_OPERAND_ADDRESS (@var{file}, @var{x}) -A C compound statement to output to stdio stream @var{file} the assembler -syntax for an instruction operand that is a memory reference whose address -is @var{x}. @var{x} is an RTL expression.
+@item ASM_OUTPUT_OPCODE (@var{stream}, @var{ptr}) +Define this macro if you are using an unusual assembler that +requires different names for the machine instructions. + +The definition is a C statement or statements which output an +assembler instruction opcode to the stdio stream @var{stream}. The +macro-operand @var{ptr} is a variable of type @code{char *} which +points to the opcode name in its ``internal'' form---the form that is +written in the machine description. The definition should output the +opcode name to @var{stream}, performing any translation you desire, and +increment the variable @var{ptr} to point at the end of the opcode +so that it will not be output twice. + +In fact, your macro definition may process less than the entire opcode +name, or more than the opcode name; but if you want to process text +that includes @samp{%}-sequences to substitute operands, you must take +care of the substitution yourself. Just be sure to increment +@var{ptr} over whatever text should not be output normally. + +If the macro definition does nothing, the instruction is output +in the usual way. + +@item FINAL_PRESCAN_INSN (@var{insn}, @var{opvec}, @var{noperands}) +If defined, a C statement to be executed just prior to the output of +assembler code for @var{insn}, to modify the extracted operands so +they will be output differently. + +Here the argument @var{opvec} is the vector containing the operands +extracted from @var{insn}, and @var{noperands} is the number of +elements of the vector which contain meaningful data for this insn. +The contents of this vector are what will be used to convert the insn +template into assembler code, so you can change the assembler output +by changing the contents of the vector. + +This macro is useful when various assembler syntaxes share a single +file of instruction patterns; by defining this macro differently, you +can cause a large class of instructions to be output differently (such +as with rearranged operands). 
Naturally, variations in assembler +syntax affecting individual insn patterns ought to be handled by +writing conditional output routines in those patterns. + +If this macro is not defined, it is equivalent to a null statement. + +@item PRINT_OPERAND (@var{stream}, @var{x}, @var{code}) +A C compound statement to output to stdio stream @var{stream} the +assembler syntax for an instruction operand @var{x}. @var{x} is an +RTL expression. + +@var{code} is a value that can be used to specify one of several ways +of printing the operand. It is used when identical operands must be +printed differently depending on the context. @var{code} comes from +the @samp{%} specification that was used to request printing of the +operand. If the specification was just @samp{%@var{digit}} then +@var{code} is 0; if the specification was @samp{%@var{ltr} +@var{digit}} then @var{code} is the ASCII code for @var{ltr}. + +If @var{x} is a register, this macro should print the register's name. +The names can be found in an array @code{reg_names} whose type is +@code{char *[]}. @code{reg_names} is initialized from +@code{REGISTER_NAMES}. + +When the machine description has a specification @samp{%@var{punct}} +(a @samp{%} followed by a punctuation character), this macro is called +with a null pointer for @var{x} and the punctuation character for +@var{code}. + +@item PRINT_OPERAND_ADDRESS (@var{stream}, @var{x}) +A C compound statement to output to stdio stream @var{stream} the +assembler syntax for an instruction operand that is a memory reference +whose address is @var{x}. @var{x} is an RTL expression. + +@item ASM_OPEN_PAREN +@itemx ASM_CLOSE_PAREN +These macros are defined as C string constants, describing the syntax +in the assembler for grouping arithmetic expressions.
The following +definitions are correct for most assemblers: + +@example +#define ASM_OPEN_PAREN "(" +#define ASM_CLOSE_PAREN ")" +@end example +@end table + +@node Config,, Machine Macros, Top +@chapter The Configuration File + +The configuration file @file{config-@var{machine}.h} contains macro +definitions that describe the machine and system on which the compiler is +running. Most of the values in it are actually the same on all machines +that GNU CC runs on, so almost all configuration files are identical. But +there are some macros that vary: + +@table @code +@item FAILURE_EXIT_CODE +A C expression for the status code to be returned when the compiler +exits after serious errors. + +@item SUCCESS_EXIT_CODE +A C expression for the status code to be returned when the compiler +exits without serious errors. @end table @contents