File: internals, Node: Passes, Next: RTL, Prev: Interface, Up: Top

Passes and Files of the Compiler
********************************

The overall control structure of the compiler is in `toplev.c'. This file
is responsible for initialization, decoding arguments, opening and closing
files, and sequencing the passes.

The parsing pass is invoked only once, to parse the entire input. The RTL
intermediate code for a function is generated as the function is parsed, a
statement at a time. Each statement is read in as a syntax tree and then
converted to RTL; then the storage for the tree for the statement is
reclaimed. Storage for types (and the expressions for their sizes),
declarations, and a representation of the binding contours and how they
nest, remains until the function is finished being compiled; these are all
needed to output the debugging information.

Each time the parsing pass reads a complete function definition or
top-level declaration, it calls the function `rest_of_compilation' or
`rest_of_decl_compilation' in `toplev.c', which are responsible for all
further processing necessary, ending with output of the assembler language.
All other compiler passes run, in sequence, within `rest_of_compilation'.
When that function returns from compiling a function definition, the
storage used for that function definition's compilation is entirely freed,
unless it is an inline function (*Note Inline::.).

Here is a list of all the passes of the compiler and their source files.
Also included is a description of where debugging dumps can be requested
with `-d' options.

   * Parsing. This pass reads the entire text of a function definition,
     constructing partial syntax trees. This and RTL generation are no
     longer truly separate passes (formerly they were), but it is easier to
     think of them as separate.

     The tree representation does not entirely follow C syntax, because it
     is intended to support other languages as well.

     C data type analysis is also done in this pass, and every tree node
     that represents an expression has a data type attached. Variables are
     represented as declaration nodes.

     Constant folding and associative-law simplifications are also done
     during this pass.

     The source files for parsing are `parse.y', `decl.c', `typecheck.c',
     `stor-layout.c', `fold-const.c', and `tree.c'. The last three are
     intended to be language-independent. There are also header files
     `parse.h', `c-tree.h', `tree.h' and `tree.def'. The last two define
     the format of the tree representation.

   * RTL generation. This is the conversion of syntax tree into RTL code.
     It is actually done statement-by-statement during parsing, but for
     most purposes it can be thought of as a separate pass.

     This is where the bulk of target-parameter-dependent code is found,
     since often it is necessary for strategies to apply only when certain
     standard kinds of instructions are available. The purpose of named
     instruction patterns is to provide this information to the RTL
     generation pass.

     Optimization is done in this pass for `if'-conditions that are
     comparisons, boolean operations or conditional expressions. Tail
     recursion is detected at this time also. Decisions are made about how
     best to arrange loops and how to output `switch' statements.

     The source files for RTL generation are `stmt.c', `expr.c',
     `explow.c', `expmed.c', `optabs.c' and `emit-rtl.c'. Also, the file
     `insn-emit.c', generated from the machine description by the program
     `genemit', is used in this pass.
     The header file `expr.h' is used
     for communication within this pass.

     The header files `insn-flags.h' and `insn-codes.h', generated from the
     machine description by the programs `genflags' and `gencodes', tell
     this pass which standard names are available for use and which
     patterns correspond to them.

     Aside from debugging information output, none of the following passes
     refers to the tree structure representation of the function (only part
     of which is saved).

     The decision of whether the function can and should be expanded inline
     in its subsequent callers is made at the end of RTL generation. The
     function must meet certain criteria, currently related to the size of
     the function and the types and number of parameters it has. Note that
     this function may contain loops, recursive calls to itself
     (tail-recursive functions can be inlined!), gotos, in short, all
     constructs supported by GNU CC.

     The option `-dr' causes a debugging dump of the RTL code after this
     pass. This dump file's name is made by appending `.rtl' to the input
     file name.

   * Jump optimization. This pass simplifies jumps to the following
     instruction, jumps across jumps, and jumps to jumps. It deletes
     unreferenced labels and unreachable code, except that unreachable code
     that contains a loop is not recognized as unreachable in this pass.
     (Such loops are deleted later in the basic block analysis.)

     Jump optimization is performed two or three times. The first time is
     immediately following RTL generation. The second time is after CSE,
     but only if CSE says repeated jump optimization is needed. The last
     time is right before the final pass. That time, cross-jumping and
     deletion of no-op move instructions are done together with the
     optimizations described above.

     The source file of this pass is `jump.c'.

     The option `-dj' causes a debugging dump of the RTL code after this
     pass is run for the first time. This dump file's name is made by
     appending `.jump' to the input file name.

   * Register scan. This pass finds the first and last use of each
     register, as a guide for common subexpression elimination. Its source
     is in `regclass.c'.

   * Common subexpression elimination. This pass also does constant
     propagation. Its source file is `cse.c'. If constant propagation
     causes conditional jumps to become unconditional or to become no-ops,
     jump optimization is run again when CSE is finished.

     The option `-ds' causes a debugging dump of the RTL code after this
     pass. This dump file's name is made by appending `.cse' to the input
     file name.

   * Loop optimization. This pass moves constant expressions out of loops.
     Its source file is `loop.c'.

     The option `-dL' causes a debugging dump of the RTL code after this
     pass. This dump file's name is made by appending `.loop' to the input
     file name.

   * Stupid register allocation is performed at this point in a
     nonoptimizing compilation. It does a little data flow analysis as
     well. When stupid register allocation is in use, the next pass
     executed is the reloading pass; the others in between are skipped.
     The source file is `stupid.c'.

   * Data flow analysis (`flow.c'). This pass divides the program into
     basic blocks (and in the process deletes unreachable loops); then it
     computes which pseudo-registers are live at each point in the program,
     and makes the first instruction that uses a value point at the
     instruction that computed the value.

     This pass also deletes computations whose results are never used, and
     combines memory references with add or subtract instructions to make
     autoincrement or autodecrement addressing.

     The option `-df' causes a debugging dump of the RTL code after this
     pass. This dump file's name is made by appending `.flow' to the input
     file name. If stupid register allocation is in use, this dump file
     reflects the full results of such allocation.

   * Instruction combination (`combine.c'). This pass attempts to combine
     groups of two or three instructions that are related by data flow into
     single instructions. It combines the RTL expressions for the
     instructions by substitution, simplifies the result using algebra, and
     then attempts to match the result against the machine description.

     The option `-dc' causes a debugging dump of the RTL code after this
     pass. This dump file's name is made by appending `.combine' to the
     input file name.

   * Register class preferencing. The RTL code is scanned to find out
     which register class is best for each pseudo register. The source
     file is `regclass.c'.

   * Local register allocation (`local-alloc.c'). This pass allocates hard
     registers to pseudo registers that are used only within one basic
     block. Because the basic block is linear, it can use fast and
     powerful techniques to do a very good job.

     The option `-dl' causes a debugging dump of the RTL code after this
     pass. This dump file's name is made by appending `.lreg' to the input
     file name.

   * Global register allocation (`global-alloc.c'). This pass allocates
     hard registers for the remaining pseudo registers (those whose life
     spans are not contained in one basic block).

   * Reloading. This pass renumbers pseudo registers with the hardware
     register numbers they were allocated. Pseudo registers that did not
     get hard registers are replaced with stack slots. Then it finds
     instructions that are invalid because a value has failed to end up in
     a register, or has ended up in a register of the wrong kind. It fixes
     up these instructions by reloading the problematical values
     temporarily into registers. Additional instructions are generated to
     do the copying.

     Source files are `reload.c' and `reload1.c', plus the header
     `reload.h' used for communication between them.

     The option `-dg' causes a debugging dump of the RTL code after this
     pass. This dump file's name is made by appending `.greg' to the input
     file name.

   * Jump optimization is repeated, this time including cross-jumping and
     deletion of no-op move instructions. Machine-specific peephole
     optimizations are performed at the same time.

     The option `-dJ' causes a debugging dump of the RTL code after this
     pass. This dump file's name is made by appending `.jump2' to the
     input file name.

   * Final. This pass outputs the assembler code for the function. It is
     also responsible for identifying spurious test and compare
     instructions. The function entry and exit sequences are generated
     directly as assembler code in this pass; they never exist as RTL.

     The source files are `final.c' plus `insn-output.c'; the latter is
     generated automatically from the machine description by the tool
     `genoutput'. The header file `conditions.h' is used for communication
     between these files.

   * Debugging information output. This is run after final because it must
     output the stack slot offsets for pseudo registers that did not get
     hard registers. Source files are `dbxout.c' for DBX symbol table
     format and `symout.c' for GDB's own symbol table format.

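As a quick reference, the `-d' letters and dump file suffixes named in the
pass list can be gathered into one table. The following is only an
illustrative sketch and is not taken from `toplev.c'; the names
`dump_entry', `dump_table' and `dump_suffix' are invented for this example.

```c
#include <stddef.h>

/* Hypothetical table (not from `toplev.c') pairing each `-d' letter
   from the pass list with the suffix appended to the input file name.  */
struct dump_entry
{
  char letter;
  const char *suffix;
  const char *pass;
};

static const struct dump_entry dump_table[] = {
  { 'r', ".rtl",     "RTL generation" },
  { 'j', ".jump",    "first jump optimization" },
  { 's', ".cse",     "common subexpression elimination" },
  { 'L', ".loop",    "loop optimization" },
  { 'f', ".flow",    "data flow analysis" },
  { 'c', ".combine", "instruction combination" },
  { 'l', ".lreg",    "local register allocation" },
  { 'g', ".greg",    "reloading" },
  { 'J', ".jump2",   "last jump optimization" },
};

/* Return the dump suffix for LETTER, or a null pointer if LETTER
   is not one of the letters listed above.  */
const char *
dump_suffix (char letter)
{
  size_t i;
  for (i = 0; i < sizeof dump_table / sizeof dump_table[0]; i++)
    if (dump_table[i].letter == letter)
      return dump_table[i].suffix;
  return NULL;
}
```

With this table, `dump_suffix ('s')' yields `.cse', so `-ds' on an input
named `foo.c' would correspond to a dump file named `foo.c.cse'.
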
Some additional files are used by all or many passes:

   * Every pass uses `machmode.def', which defines the machine modes.

   * All the passes that work with RTL use the header files `rtl.h' and
     `rtl.def', and subroutines in file `rtl.c'. The tools `gen*' also use
     these files to read and work with the machine description RTL.

   * Several passes refer to the header file `insn-config.h' which contains
     a few parameters (C macro definitions) generated automatically from
     the machine description RTL by the tool `genconfig'.

   * Several passes use the instruction recognizer, which consists of
     `recog.c' and `recog.h', plus the files `insn-recog.c' and
     `insn-extract.c' that are generated automatically from the machine
     description by the tools `genrecog' and `genextract'.

   * Several passes use the header files `regs.h' which defines the
     information recorded about pseudo register usage, and `basic-block.h'
     which defines the information recorded about basic blocks.

   * `hard-reg-set.h' defines the type `HARD_REG_SET', a bit-vector with a
     bit for each hard register, and some macros to manipulate it. This
     type is just `int' if the machine has few enough hard registers;
     otherwise it is an array of `int' and some of the macros expand into
     loops.


File: internals, Node: RTL, Next: Machine Desc, Prev: Passes, Up: Top

RTL Representation
******************

Most of the work of the compiler is done on an intermediate representation
called register transfer language. In this language, the instructions to
be output are described, pretty much one by one, in an algebraic form that
describes what the instruction does.

RTL is inspired by Lisp lists. It has both an internal form, made up of
structures that point at other structures, and a textual form that is used
in the machine description and in printed debugging dumps. The textual
form uses nested parentheses to indicate the pointers in the internal form.

* Menu:

* RTL Objects::       Expressions vs vectors vs strings vs integers.
* Accessors::         Macros to access expression operands or vector elts.
* Flags::             Other flags in an RTL expression.
* Machine Modes::     Describing the size and format of a datum.
* Constants::         Expressions with constant values.
* Regs and Memory::   Expressions representing register contents or memory.
* Arithmetic::        Expressions representing arithmetic on other expressions.
* Comparisons::       Expressions representing comparison of expressions.
* Bit Fields::        Expressions representing bit-fields in memory or reg.
* Conversions::       Extending, truncating, floating or fixing.
* RTL Declarations::  Declaring volatility, constancy, etc.
* Side Effects::      Expressions for storing in registers, etc.
* Incdec::            Embedded side-effects for autoincrement addressing.
* Assembler::         Representing `asm' with operands.
* Insns::             Expression types for entire insns.
* Calls::             RTL representation of function call insns.
* Sharing::           Some expressions are unique; others *must* be copied.



File: internals, Node: RTL Objects, Next: Accessors, Prev: RTL, Up: RTL

RTL Object Types
================

RTL uses four kinds of objects: expressions, integers, strings and vectors.
Expressions are the most important ones. An RTL expression (``RTX'', for
short) is a C structure, but it is usually referred to with a pointer; a
type that is given the typedef name `rtx'.

An integer is simply an `int', and a string is a `char *'. Within RTL
code, strings appear only inside `symbol_ref' expressions, but they appear
in other contexts in the RTL expressions that make up machine descriptions.
The written form of an integer uses decimal digits.

A string is a sequence of characters. In core it is represented as a `char
*' in usual C fashion, and it is written in C syntax as well. However,
strings in RTL may never be null. If you write an empty string in a
machine description, it is represented in core as a null pointer rather
than as a pointer to a null character. In certain contexts, these null
pointers are valid in place of strings.

A vector contains an arbitrary, specified number of pointers to
expressions. The number of elements in the vector is explicitly present in
the vector. The written form of a vector consists of square brackets
(`[...]') surrounding the elements, in sequence and with whitespace
separating them. Vectors of length zero are not created; null pointers are
used instead.

Expressions are classified by "expression codes" (also called RTX codes).
The expression code is a name defined in `rtl.def', which is also (in upper
case) a C enumeration constant. The possible expression codes and their
meanings are machine-independent. The code of an RTX can be extracted with
the macro `GET_CODE (X)' and altered with `PUT_CODE (X, NEWCODE)'.

The expression code determines how many operands the expression contains,
and what kinds of objects they are. In RTL, unlike Lisp, you cannot tell
by looking at an operand what kind of object it is. Instead, you must know
from its context---from the expression code of the containing expression.
For example, in an expression of code `subreg', the first operand is to be
regarded as an expression and the second operand as an integer. In an
expression of code `plus', there are two operands, both of which are to be
regarded as expressions. In a `symbol_ref' expression, there is one
operand, which is to be regarded as a string.

Expressions are written as parentheses containing the name of the
expression type, its flags and machine mode if any, and then the operands
of the expression (separated by spaces).

Expression code names in the `md' file are written in lower case, but when
they appear in C code they are written in upper case. In this manual, they
are shown as follows: `const_int'.

In a few contexts a null pointer is valid where an expression is normally
wanted. The written form of this is `(nil)'.


File: internals, Node: Accessors, Next: Flags, Prev: RTL Objects, Up: RTL

Access to Operands
==================

For each expression type `rtl.def' specifies the number of contained
objects and their kinds, with four possibilities: `e' for expression
(actually a pointer to an expression), `i' for integer, `s' for string, and
`E' for vector of expressions. The sequence of letters for an expression
code is called its "format". Thus, the format of `subreg' is `ei'.

Two other format characters are used occasionally: `u' and `0'. `u' is
equivalent to `e' except that it is printed differently in debugging dumps,
and `0' means a slot whose contents do not fit any normal category. `0'
slots are not printed at all in dumps, and are often used in special ways
by small parts of the compiler.

There are macros to get the number of operands and the format of an
expression code:

`GET_RTX_LENGTH (CODE)'
     Number of operands of an RTX of code CODE.

`GET_RTX_FORMAT (CODE)'
     The format of an RTX of code CODE, as a C string.

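To illustrate how a format string tells you the kind of each operand, here
is a small self-contained sketch. It does not use the real `rtl.h': the
enumeration `toy_code', the table `toy_format', the macros
`TOY_RTX_FORMAT' and `TOY_RTX_LENGTH', and the function `toy_operand_kind'
are stand-ins invented for this example, loosely modeled on
`GET_RTX_FORMAT' and `GET_RTX_LENGTH'. Only the operand kinds of `subreg',
`plus' and `symbol_ref' are taken from the text.

```c
#include <string.h>

/* Toy stand-ins for three expression codes; not the real RTX codes.  */
enum toy_code { TOY_SUBREG, TOY_PLUS, TOY_SYMBOL_REF, TOY_NUM_CODES };

/* One format string per code:
   e = expression, i = integer, s = string, E = vector.  */
static const char *const toy_format[TOY_NUM_CODES] = {
  "ei",   /* subreg: an expression, then an integer */
  "ee",   /* plus: two expressions */
  "s"     /* symbol_ref: one string */
};

#define TOY_RTX_FORMAT(CODE) (toy_format[(int) (CODE)])
#define TOY_RTX_LENGTH(CODE) ((int) strlen (TOY_RTX_FORMAT (CODE)))

/* Describe the kind of operand number IDX (counting from zero) in an
   expression of code CODE, or "none" if there is no such operand.  */
const char *
toy_operand_kind (enum toy_code code, int idx)
{
  if (idx < 0 || idx >= TOY_RTX_LENGTH (code))
    return "none";
  switch (TOY_RTX_FORMAT (code)[idx])
    {
    case 'e': return "expression";
    case 'i': return "integer";
    case 's': return "string";
    case 'E': return "vector";
    default:  return "other";
    }
}
```

In this sketch the operand count is simply the length of the format string,
which is why the expression code alone is enough to decide how each operand
may be accessed.
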
Operands of expressions are accessed using the macros `XEXP', `XINT' and
`XSTR'. Each of these macros takes two arguments: an expression-pointer
(RTX) and an operand number (counting from zero). Thus,

     XEXP (X, 2)

accesses operand 2 of expression X, as an expression.

     XINT (X, 2)

accesses the same operand as an integer. `XSTR', used in the same fashion,
would access it as a string.

Any operand can be accessed as an integer, as an expression or as a string.
You must choose the correct method of access for the kind of value
actually stored in the operand. You would do this based on the expression
code of the containing expression. That is also how you would know how
many operands there are.

For example, if X is a `subreg' expression, you know that it has two
operands which can be correctly accessed as `XEXP (X, 0)' and `XINT (X,
1)'. If you did `XINT (X, 0)', you would get the address of the expression
operand but cast as an integer; that might occasionally be useful, but it
would be cleaner to write `(int) XEXP (X, 0)'. `XEXP (X, 1)' would also
compile without error, and would return the second, integer operand cast as
an expression pointer, which would probably result in a crash when
accessed. Nothing stops you from writing `XEXP (X, 28)' either, but this
will access memory past the end of the expression with unpredictable results.

Access to operands which are vectors is more complicated. You can use the
macro `XVEC' to get the vector-pointer itself, or the macros `XVECEXP' and
`XVECLEN' to access the elements and length of a vector.

`XVEC (EXP, IDX)'
     Access the vector-pointer which is operand number IDX in EXP.

`XVECLEN (EXP, IDX)'
     Access the length (number of elements) in the vector which is in
     operand number IDX in EXP. This value is an `int'.

`XVECEXP (EXP, IDX, ELTNUM)'
     Access element number ELTNUM in the vector which is in operand number
     IDX in EXP. This value is an RTX.

     It is up to you to make sure that ELTNUM is not negative and is less
     than `XVECLEN (EXP, IDX)'.

All the macros defined in this section expand into lvalues and therefore
can be used to assign the operands, lengths and vector elements as well as
to access them.


File: internals, Node: Flags, Next: Machine Modes, Prev: Accessors, Up: RTL

Flags in an RTL Expression
==========================

RTL expressions contain several flags (one-bit bit-fields) that are used in
certain types of expression.

`used'
     This flag is used only momentarily, at the end of RTL generation for a
     function, to count the number of times an expression appears in insns.
     Expressions that appear more than once are copied, according to the
     rules for shared structure (*Note Sharing::.).

`volatil'
     This flag is used in `mem' and `reg' expressions and in insns. In RTL
     dump files, it is printed as `/v'.

     In a `mem' expression, it is 1 if the memory reference is volatile.
     Volatile memory references may not be deleted, reordered or combined.

     In a `reg' expression, it is 1 if the value is a user-level variable.
     0 indicates an internal compiler temporary.

     In an insn, 1 means the insn has been deleted.

`in_struct'
     This flag is used in `mem' expressions. It is 1 if the memory datum
     referred to is all or part of a structure or array; 0 if it is (or
     might be) a scalar variable. A reference through a C pointer has 0
     because the pointer might point to a scalar variable.

     This information allows the compiler to determine something about
     possible cases of aliasing.

     In an RTL dump, this flag is represented as `/s'.

`unchanging'
     This flag is used in `reg' and `mem' expressions. 1 means that the
     value of the expression never changes (at least within the current
     function).

     In an RTL dump, this flag is represented as `/u'.


File: internals, Node: Machine Modes, Next: Constants, Prev: Flags, Up: RTL

Machine Modes
=============

A machine mode describes a size of data object and the representation used
for it. In the C code, machine modes are represented by an enumeration
type, `enum machine_mode', defined in `machmode.def'. Each RTL expression
has room for a machine mode and so do certain kinds of tree expressions
(declarations and types, to be precise).

In debugging dumps and machine descriptions, the machine mode of an RTL
expression is written after the expression code with a colon to separate
them. The letters `mode' which appear at the end of each machine mode name
are omitted. For example, `(reg:SI 38)' is a `reg' expression with machine
mode `SImode'. If the mode is `VOIDmode', it is not written at all.

Here is a table of machine modes.

`QImode'
     ``Quarter-Integer'' mode represents a single byte treated as an integer.

`HImode'
     ``Half-Integer'' mode represents a two-byte integer.

`SImode'
     ``Single Integer'' mode represents a four-byte integer.

`DImode'
     ``Double Integer'' mode represents an eight-byte integer.

`TImode'
     ``Tetra Integer'' (?) mode represents a sixteen-byte integer.

`SFmode'
     ``Single Floating'' mode represents a single-precision (four byte)
     floating point number.

`DFmode'
     ``Double Floating'' mode represents a double-precision (eight byte)
     floating point number.

`TFmode'
     ``Tetra Floating'' mode represents a quadruple-precision (sixteen
     byte) floating point number.

`BLKmode'
     ``Block'' mode represents values that are aggregates to which none of
     the other modes apply. In RTL, only memory references can have this
     mode, and only if they appear in string-move or vector instructions.
     On machines which have no such instructions, `BLKmode' will not appear
     in RTL.

`VOIDmode'
     Void mode means the absence of a mode or an unspecified mode. For
     example, RTL expressions of code `const_int' have mode `VOIDmode'
     because they can be taken to have whatever mode the context requires.
     In debugging dumps of RTL, `VOIDmode' is expressed by the absence of
     any mode.

`EPmode'
     ``Entry Pointer'' mode is intended to be used for function variables
     in Pascal and other block structured languages. Such values contain
     both a function address and a static chain pointer for access to
     automatic variables of outer levels. This mode is only partially
     implemented since C does not use it.

`CSImode, ...'
     ``Complex Single Integer'' mode stands for a complex number
     represented as a pair of `SImode' integers. Any of the integer and
     floating modes may have `C' prefixed to its name to obtain a complex
     number mode. For example, there are `CQImode', `CSFmode', and
     `CDFmode'. Since C does not support complex numbers, these machine
     modes are only partially implemented.

`BImode'
     This is the machine mode of a bit-field in a structure. It is used
     only in the syntax tree, never in RTL, and in the syntax tree it
     appears only in declaration nodes. In C, it appears only in
     `FIELD_DECL' nodes for structure fields defined with a bit size.

The machine description defines `Pmode' as a C macro which expands into the
machine mode used for addresses. Normally this is `SImode'.

The only modes which a machine description must support are `QImode',
`SImode', `SFmode' and `DFmode'. The compiler will attempt to use `DImode'
for two-word structures and unions, but it would not be hard to program it
to avoid this. Likewise, you can arrange for the C type `short int' to
avoid using `HImode'. In the long term it would be desirable to make the
set of available machine modes machine-dependent and eliminate all
assumptions about specific machine modes or their uses from the
machine-independent code of the compiler.

Here are some C macros that relate to machine modes:

`GET_MODE (X)'
     Returns the machine mode of the RTX X.

`PUT_MODE (X, NEWMODE)'
     Alters the machine mode of the RTX X to be NEWMODE.

`GET_MODE_SIZE (M)'
     Returns the size in bytes of a datum of mode M.

`GET_MODE_BITSIZE (M)'
     Returns the size in bits of a datum of mode M.

`GET_MODE_UNIT_SIZE (M)'
     Returns the size in bytes of the subunits of a datum of mode M. This
     is the same as `GET_MODE_SIZE' except in the case of complex modes and
     `EPmode'. For them, the unit size is the size of the real or
     imaginary part, or the size of the function pointer or the context
     pointer.


File: internals, Node: Constants, Next: Regs and Memory, Prev: Machine Modes, Up: RTL

Constant Expression Types
=========================

The simplest RTL expressions are those that represent constant values.

`(const_int I)'
     This type of expression represents the integer value I. I is
     customarily accessed with the macro `INTVAL' as in `INTVAL (EXP)',
     which is equivalent to `XINT (EXP, 0)'.

     There is only one expression object for the integer value zero; it is
     the value of the variable `const0_rtx'. Likewise, the only expression
     for integer value one is found in `const1_rtx'. Any attempt to create
     an expression of code `const_int' and value zero or one will return
     `const0_rtx' or `const1_rtx' as appropriate.

`(const_double:M I0 I1)'
     Represents a floating point constant value of mode M. The two
     integers I0 and I1 together contain the bits of a `double' value. To
     convert them to a `double', do

          union { double d; int i[2];} u;
          u.i[0] = XINT (x, 0);
          u.i[1] = XINT (x, 1);

     and then refer to `u.d'. The value of the constant is represented as
     a double in this fashion even if the value represented is
     single-precision.

     The global variables `dconst0_rtx' and `fconst0_rtx' hold
     `const_double' expressions with value 0, in modes `DFmode' and
     `SFmode', respectively.

`(symbol_ref SYMBOL)'
     Represents the value of an assembler label for data. SYMBOL is a
     string that describes the name of the assembler label. If it starts
     with a `*', the label is the rest of SYMBOL not including the `*'.
     Otherwise, the label is SYMBOL, prefixed with `_'.

`(label_ref LABEL)'
     Represents the value of an assembler label for code. It contains one
     operand, an expression, which must be a `code_label' that appears in
     the instruction sequence to identify the place where the label should
     go.

     The reason for using a distinct expression type for code label
     references is so that jump optimization can distinguish them.

`(const EXP)'
     Represents a constant that is the result of an assembly-time
     arithmetic computation. The operand, EXP, is an expression that
     contains only constants (`const_int', `symbol_ref' and `label_ref'
     expressions) combined with `plus' and `minus'. However, not all
     combinations are valid, since the assembler cannot do arbitrary
     arithmetic on relocatable symbols.


File: internals, Node: Regs and Memory, Next: Arithmetic, Prev: Constants, Up: RTL

Registers and Memory
====================

Here are the RTL expression types for describing access to machine
registers and to main memory.

`(reg:M N)'
     For small values of the integer N (less than `FIRST_PSEUDO_REGISTER'),
     this stands for a reference to machine register number N: a "hard
     register". For larger values of N, it stands for a temporary value or
     "pseudo register". The compiler's strategy is to generate code
     assuming an unlimited number of such pseudo registers, and later
     convert them into hard registers or into memory references.

     The symbol `FIRST_PSEUDO_REGISTER' is defined by the machine
     description, since the number of hard registers on the machine is an
     invariant characteristic of the machine. Note, however, that not all
     of the machine registers must be general registers. All the machine
     registers that can be used for storage of data are given hard register
     numbers, even those that can be used only in certain instructions or
     can hold only certain types of data.

     Each pseudo register number used in a function's RTL code is
     represented by a unique `reg' expression.

     M is the machine mode of the reference. It is necessary because
     machines can generally refer to each register in more than one mode.
     For example, a register may contain a full word but there may be
672: instructions to refer to it as a half word or as a single byte, as ! 673: well as instructions to refer to it as a floating point number of ! 674: various precisions. ! 675: ! 676: Even for a register that the machine can access in only one mode, the ! 677: mode must always be specified. ! 678: ! 679: A hard register may be accessed in various modes throughout one ! 680: function, but each pseudo register is given a natural mode and is ! 681: accessed only in that mode. When it is necessary to describe an ! 682: access to a pseudo register using a nonnatural mode, a `subreg' ! 683: expression is used. ! 684: ! 685: A `reg' expression with a machine mode that specifies more than one ! 686: word of data may actually stand for several consecutive registers. If ! 687: in addition the register number specifies a hardware register, then it ! 688: actually represents several consecutive hardware registers starting ! 689: with the specified one. ! 690: ! 691: Such multi-word hardware register `reg' expressions may not be live ! 692: across the boundary of a basic block. The lifetime analysis pass does ! 693: not know how to record properly that several consecutive registers are ! 694: actually live there, and therefore register allocation would be ! 695: confused. The CSE pass must go out of its way to make sure the ! 696: situation does not arise. ! 697: ! 698: `(subreg:M REG WORDNUM)' ! 699: `subreg' expressions are used to refer to a register in a machine mode ! 700: other than its natural one, or to refer to one register of a ! 701: multi-word `reg' that actually refers to several registers. ! 702: ! 703: Each pseudo-register has a natural mode. If it is necessary to ! 704: operate on it in a different mode---for example, to perform a fullword ! 705: move instruction on a pseudo-register that contains a single byte--- ! 706: the pseudo-register must be enclosed in a `subreg'. In such a case, ! 707: WORDNUM is zero. ! 708: ! 
709: The other use of `subreg' is to extract the individual registers of a ! 710: multi-register value. Machine modes such as `DImode' and `EPmode' ! 711: indicate values longer than a word, values which usually require two ! 712: consecutive registers. To access one of the registers, use a `subreg' ! 713: with mode `SImode' and a WORDNUM that says which register. ! 714: ! 715: The compilation parameter `WORDS_BIG_ENDIAN', if defined, says that ! 716: word number zero is the most significant part; otherwise, it is the ! 717: least significant part. ! 718: ! 719: Note that it is not valid to access a `DFmode' value in `SFmode' using ! 720: a `subreg'. On some machines the most significant part of a `DFmode' ! 721: value does not have the same format as a single-precision floating ! 722: value. ! 723: ! 724: `(cc0)' ! 725: This refers to the machine's condition code register. It has no ! 726: operands and may not have a machine mode. It may be validly used in ! 727: only two contexts: as the destination of an assignment (in test and ! 728: compare instructions) and in comparison operators comparing against ! 729: zero (`const_int' with value zero; that is to say, `const0_rtx'). ! 730: ! 731: There is only one expression object of code `cc0'; it is the value of ! 732: the variable `cc0_rtx'. Any attempt to create an expression of code ! 733: `cc0' will return `cc0_rtx'. ! 734: ! 735: One special thing about the condition code register is that ! 736: instructions can set it implicitly. On many machines, nearly all ! 737: instructions set the condition code based on the value that they ! 738: compute or store. It is not necessary to record these actions ! 739: explicitly in the RTL because the machine description includes a ! 740: prescription for recognizing the instructions that do so (by means of ! 741: the macro `NOTICE_UPDATE_CC'). Only instructions whose sole purpose ! 742: is to set the condition code, and instructions that use the condition ! 
743: code, need mention `(cc0)'. ! 744: ! 745: `(pc)' ! 746: This represents the machine's program counter. It has no operands and ! 747: may not have a machine mode. `(pc)' may be validly used only in ! 748: certain specific contexts in jump instructions. ! 749: ! 750: There is only one expression object of code `pc'; it is the value of ! 751: the variable `pc_rtx'. Any attempt to create an expression of code ! 752: `pc' will return `pc_rtx'. ! 753: ! 754: All instructions that do not jump alter the program counter implicitly ! 755: by incrementing it, but there is no need to mention this in the RTL. ! 756: ! 757: `(mem:M ADDR)' ! 758: This RTX represents a reference to main memory at an address ! 759: represented by the expression ADDR. M specifies how large a unit of ! 760: memory is accessed. ! 761: ! 762: ! 763: File: internals, Node: Arithmetic, Next: Comparisons, Prev: Regs and Memory, Up: RTL ! 764: ! 765: RTL Expressions for Arithmetic ! 766: ============================== ! 767: ! 768: `(plus:M X Y)' ! 769: Represents the sum of the values represented by X and Y carried out in ! 770: machine mode M. This is valid only if X and Y both are valid for mode ! 771: M. ! 772: ! 773: `(minus:M X Y)' ! 774: Like `plus' but represents subtraction. ! 775: ! 776: `(minus X Y)' ! 777: Represents the result of subtracting Y from X for purposes of ! 778: comparison. The absence of a machine mode in the `minus' expression ! 779: indicates that the result is computed without overflow, as if with ! 780: infinite precision. ! 781: ! 782: Of course, machines can't really subtract with infinite precision. ! 783: However, they can pretend to do so when only the sign of the result ! 784: will be used, which is the case when the result is stored in `(cc0)'. ! 785: And that is the only way this kind of expression may validly be used: ! 786: as a value to be stored in the condition codes. ! 787: ! 788: `(neg:M X)' ! 
789: Represents the negation (subtraction from zero) of the value ! 790: represented by X, carried out in mode M. X must be valid for mode M. ! 791: ! 792: `(mult:M X Y)' ! 793: Represents the signed product of the values represented by X and Y ! 794: carried out in machine mode M. If X and Y are both valid for mode M, ! 795: this is ordinary size-preserving multiplication. Alternatively, both ! 796: X and Y may be valid for a different, narrower mode. This represents ! 797: the kind of multiplication that generates a product wider than the ! 798: operands. Widening multiplication and same-size multiplication are ! 799: completely distinct and supported by different machine instructions; ! 800: machines may support one but not the other. ! 801: ! 802: `mult' may be used for floating point multiplication as well. Then M is a ! 803: floating point machine mode. ! 804: ! 805: `(umult:M X Y)' ! 806: Like `mult' but represents unsigned multiplication. It may be used in ! 807: both same-size and widening forms, like `mult'. `umult' is used only ! 808: for fixed-point multiplication. ! 809: ! 810: `(div:M X Y)' ! 811: Represents the quotient in signed division of X by Y, carried out in ! 812: machine mode M. If M is a floating-point mode, it represents the ! 813: exact quotient; otherwise, the integerized quotient. If X and Y are ! 814: both valid for mode M, this is ordinary size-preserving division. ! 815: Some machines have division instructions in which the operands and ! 816: quotient widths are not all the same; such instructions are ! 817: represented by `div' expressions in which the machine modes are not ! 818: all the same. ! 819: ! 820: `(udiv:M X Y)' ! 821: Like `div' but represents unsigned division. ! 822: ! 823: `(mod:M X Y)' ! 824: `(umod:M X Y)' ! 825: Like `div' and `udiv' but represent the remainder instead of the ! 826: quotient. ! 827: ! 828: `(not:M X)' ! 829: Represents the bitwise complement of the value represented by X, !
830: carried out in mode M, which must be a fixed-point machine mode. X ! 831: must be valid for mode M, which must be a fixed-point mode. ! 832: ! 833: `(and:M X Y)' ! 834: Represents the bitwise logical-and of the values represented by X and ! 835: Y, carried out in machine mode M. This is valid only if X and Y both ! 836: are valid for mode M, which must be a fixed-point mode. ! 837: ! 838: `(ior:M X Y)' ! 839: Represents the bitwise inclusive-or of the values represented by X and ! 840: Y, carried out in machine mode M. This is valid only if X and Y both ! 841: are valid for mode M, which must be a fixed-point mode. ! 842: ! 843: `(xor:M X Y)' ! 844: Represents the bitwise exclusive-or of the values represented by X and ! 845: Y, carried out in machine mode M. This is valid only if X and Y both ! 846: are valid for mode M, which must be a fixed-point mode. ! 847: ! 848: `(lshift:M X C)' ! 849: Represents the result of logically shifting X left by C places. X ! 850: must be valid for the mode M, a fixed-point machine mode. C must be ! 851: valid for a fixed-point mode; which mode is determined by the mode ! 852: called for in the machine description entry for the left-shift ! 853: instruction. For example, on the Vax, the mode of C is `QImode' ! 854: regardless of M. ! 855: ! 856: On some machines, negative values of C may be meaningful; this is why ! 857: logical left shift and arithmetic left shift are distinguished. For ! 858: example, Vaxes have no right-shift instructions, and right shifts are ! 859: represented as left-shift instructions whose counts happen to be ! 860: negative constants or else computed (in a previous instruction) by ! 861: negation. ! 862: ! 863: `(ashift:M X C)' ! 864: Like `lshift' but for arithmetic left shift. ! 865: ! 866: `(lshiftrt:M X C)' ! 867: `(ashiftrt:M X C)' ! 868: Like `lshift' and `ashift' but for right shift. ! 869: ! 870: `(rotate:M X C)' ! 871: `(rotatert:M X C)' ! 872: Similar but represent left and right rotate. ! 
873: ! 874: `(abs:M X)' ! 875: Represents the absolute value of X, computed in mode M. X must be ! 876: valid for M. ! 877: ! 878: `(sqrt:M X)' ! 879: Represents the square root of X, computed in mode M. X must be valid ! 880: for M. Most often M will be a floating point mode. ! 881: ! 882: `(ffs:M X)' ! 883: Represents the one plus the index of the least significant 1-bit in X, ! 884: represented as an integer of mode M. (The value is zero if X is ! 885: zero.) The mode of X need not be M; depending on the target machine, ! 886: various mode combinations may be valid. ! 887: ! 888: ! 889: File: internals, Node: Comparisons, Next: Bit Fields, Prev: Arithmetic, Up: RTL ! 890: ! 891: Comparison Operations ! 892: ===================== ! 893: ! 894: Comparison operators test a relation on two operands and are considered to ! 895: represent the value 1 if the relation holds, or zero if it does not. The ! 896: mode of the comparison is determined by the operands; they must both be ! 897: valid for a common machine mode. A comparison with both operands constant ! 898: would be invalid as the machine mode could not be deduced from it, but such ! 899: a comparison should never exist in RTL due to constant folding. ! 900: ! 901: Inequality comparisons come in two flavors, signed and unsigned. Thus, ! 902: there are distinct expression codes `gt' and `gtu' for signed and unsigned ! 903: greater-than. These can produce different results for the same pair of ! 904: integer values: for example, 1 is signed greater-than -1 but not unsigned ! 905: greater-than, because -1 when regarded as unsigned is actually `0xffffffff' ! 906: which is greater than 1. ! 907: ! 908: The signed comparisons are also used for floating point values. Floating ! 909: point comparisons are distinguished by the machine modes of the operands. ! 910: ! 911: The comparison operators may be used to compare the condition codes `(cc0)' ! 912: against zero, as in `(eq (cc0) (const_int 0))'. 
Such a construct actually ! 913: refers to the result of the preceding instruction in which the condition ! 914: codes were set. The above example stands for 1 if the condition codes were ! 915: set to say ``zero'' or ``equal'', 0 otherwise. Although the same ! 916: comparison operators are used for this as may be used in other contexts on ! 917: actual data, no confusion can result since the machine description would ! 918: never allow both kinds of uses in the same context. ! 919: ! 920: `(eq X Y)' ! 921: 1 if the values represented by X and Y are equal, otherwise 0. ! 922: ! 923: `(ne X Y)' ! 924: 1 if the values represented by X and Y are not equal, otherwise 0. ! 925: ! 926: `(gt X Y)' ! 927: 1 if X is greater than Y. If they are fixed-point, the comparison ! 928: is done in a signed sense. ! 929: ! 930: `(gtu X Y)' ! 931: Like `gt' but does unsigned comparison, on fixed-point numbers only. ! 932: ! 933: `(lt X Y)' ! 934: `(ltu X Y)' ! 935: Like `gt' and `gtu' but test for ``less than''. ! 936: ! 937: `(ge X Y)' ! 938: `(geu X Y)' ! 939: Like `gt' and `gtu' but test for ``greater than or equal''. ! 940: ! 941: `(le X Y)' ! 942: `(leu X Y)' ! 943: Like `gt' and `gtu' but test for ``less than or equal''. ! 944: ! 945: `(if_then_else COND THEN ELSE)' ! 946: This is not a comparison operation but is listed here because it is ! 947: always used in conjunction with a comparison operation. To be ! 948: precise, COND is a comparison expression. This expression represents ! 949: a choice, according to COND, between the value represented by THEN and ! 950: the one represented by ELSE. ! 951: ! 952: On most machines, `if_then_else' expressions are valid only to express ! 953: conditional jumps. ! 954: ! 955: ! 956: File: internals, Node: Bit Fields, Next: Conversions, Prev: Comparisons, Up: RTL ! 957: ! 958: Bit-fields ! 959: ========== ! 960: ! 961: Special expression codes exist to represent bit-field instructions. These !
962: types of expressions are lvalues in RTL; they may appear on the left side ! 963: of an assignment, indicating insertion of a value into the specified bit ! 964: field. ! 965: ! 966: `(sign_extract:SI LOC SIZE POS)' ! 967: This represents a reference to a sign-extended bit-field contained or ! 968: starting in LOC (a memory or register reference). The bit field is ! 969: SIZE bits wide and starts at bit POS. The compilation option ! 970: `BITS_BIG_ENDIAN' says which end of the memory unit POS counts from. ! 971: ! 972: Which machine modes are valid for LOC depends on the machine, but ! 973: typically LOC should be a single byte when in memory or a full word in ! 974: a register. ! 975: ! 976: `(zero_extract:SI LOC SIZE POS)' ! 977: Like `sign_extract' but refers to an unsigned or zero-extended bit ! 978: field. The same sequence of bits is extracted, but it is filled ! 979: to an entire word with zeros instead of by sign-extension. ! 980: ! 981: ! 982: File: internals, Node: Conversions, Next: RTL Declarations, Prev: Bit Fields, Up: RTL ! 983: ! 984: Conversions ! 985: =========== ! 986: ! 987: All conversions between machine modes must be represented by explicit ! 988: conversion operations. For example, an expression which is the sum of a ! 989: byte and a full word cannot be written as `(plus:SI (reg:QI 34) (reg:SI ! 990: 80))' because the `plus' operation requires two operands of the same ! 991: machine mode. Therefore, the byte-sized operand is enclosed in a ! 992: conversion operation, as in ! 993: ! 994: (plus:SI (sign_extend:SI (reg:QI 34)) (reg:SI 80)) ! 995: ! 996: The conversion operation is not a mere placeholder, because there may be ! 997: more than one way of converting from a given starting mode to the desired ! 998: final mode. The conversion operation code says how to do it. ! 999: ! 1000: `(sign_extend:M X)' ! 1001: Represents the result of sign-extending the value X to machine mode M. !
1002: M must be a fixed-point mode and X a fixed-point value of a mode ! 1003: narrower than M. ! 1004: ! 1005: `(zero_extend:M X)' ! 1006: Represents the result of zero-extending the value X to machine mode M. ! 1007: M must be a fixed-point mode and X a fixed-point value of a mode ! 1008: narrower than M. ! 1009: ! 1010: `(float_extend:M X)' ! 1011: Represents the result of extending the value X to machine mode M. M ! 1012: must be a floating point mode and X a floating point value of a mode ! 1013: narrower than M. ! 1014: ! 1015: `(truncate:M X)' ! 1016: Represents the result of truncating the value X to machine mode M. M ! 1017: must be a fixed-point mode and X a fixed-point value of a mode wider ! 1018: than M. ! 1019: ! 1020: `(float_truncate:M X)' ! 1021: Represents the result of truncating the value X to machine mode M. M ! 1022: must be a floating point mode and X a floating point value of a mode ! 1023: wider than M. ! 1024: ! 1025: `(float:M X)' ! 1026: Represents the result of converting fixed point value X, regarded as ! 1027: signed, to floating point mode M. ! 1028: ! 1029: `(unsigned_float:M X)' ! 1030: Represents the result of converting fixed point value X, regarded as ! 1031: unsigned, to floating point mode M. ! 1032: ! 1033: `(fix:M X)' ! 1034: When M is a fixed point mode, represents the result of converting ! 1035: floating point value X to mode M, regarded as signed. How rounding is ! 1036: done is not specified, so this operation may be used validly in ! 1037: compiling C code only for integer-valued operands. ! 1038: ! 1039: `(unsigned_fix:M X)' ! 1040: Represents the result of converting floating point value X to fixed ! 1041: point mode M, regarded as unsigned. How rounding is done is not ! 1042: specified. ! 1043: ! 1044: `(fix:M X)' ! 1045: When M is a floating point mode, represents the result of converting ! 1046: floating point value X (valid for mode M) to an integer, still ! 
1047: represented in floating point mode M, by rounding towards zero. ! 1048: ! 1049: ! 1050: File: internals, Node: RTL Declarations, Next: Side Effects, Prev: Conversions, Up: RTL ! 1051: ! 1052: Declarations ! 1053: ============ ! 1054: ! 1055: Declaration expression codes do not represent arithmetic operations but ! 1056: rather state assertions about their operands. ! 1057: ! 1058: `(strict_low_part (subreg:M (reg:N R) 0))' ! 1059: This expression code is used in only one context: operand 0 of a `set' ! 1060: expression. In addition, the operand of this expression must be a ! 1061: `subreg' expression. ! 1062: ! 1063: The presence of `strict_low_part' says that the part of the register ! 1064: which is meaningful in mode N, but is not part of mode M, is not to be ! 1065: altered. Normally, an assignment to such a subreg is allowed to have ! 1066: undefined effects on the rest of the register when M is less than a ! 1067: word. ! 1068: ! 1069: