1.1 root 1: Info file gcc.info, produced by Makeinfo, -*- Text -*- from input
2: file gcc.texinfo.
3:
4: This file documents the use and the internals of the GNU compiler.
5:
1.1.1.2 ! root 6: Copyright (C) 1988, 1989 Free Software Foundation, Inc.
1.1 root 7:
8: Permission is granted to make and distribute verbatim copies of this
9: manual provided the copyright notice and this permission notice are
10: preserved on all copies.
11:
12: Permission is granted to copy and distribute modified versions of
13: this manual under the conditions for verbatim copying, provided also
1.1.1.2 ! root 14: that the section entitled ``GNU General Public License'' is included
! 15: exactly as in the original, and provided that the entire resulting
! 16: derived work is distributed under the terms of a permission notice
! 17: identical to this one.
1.1 root 18:
19: Permission is granted to copy and distribute translations of this
20: manual into another language, under the above conditions for modified
1.1.1.2 ! root 21: versions, except that the section entitled ``GNU General Public
1.1 root 22: License'' and this permission notice may be included in translations
23: approved by the Free Software Foundation instead of in the original
24: English.
25:
26:
27:
1.1.1.2 ! root 28: File: gcc.info, Node: Extended Asm, Next: Asm Labels, Prev: Inline, Up: Extensions
! 29:
! 30: Assembler Instructions with C Expression Operands
! 31: =================================================
! 32:
! 33: In an assembler instruction using `asm', you can now specify the
! 34: operands of the instruction using C expressions. This means no more
! 35: guessing which registers or memory locations will contain the data
! 36: you want to use.
! 37:
! 38: You must specify an assembler instruction template much like what
! 39: appears in a machine description, plus an operand constraint string
! 40: for each operand.
! 41:
! 42: For example, here is how to use the 68881's `fsinx' instruction:
! 43:
! 44: asm ("fsinx %1,%0" : "=f" (result) : "f" (angle));
! 45:
! 46: Here `angle' is the C expression for the input operand while `result'
! 47: is that of the output operand. Each has `"f"' as its operand
! 48: constraint, saying that a floating-point register is required. The
! 49: `=' in `=f' indicates that the operand is an output; all output
! 50: operands' constraints must use `='. The constraints use the same
! 51: language used in the machine description (*note Constraints::.).
! 52:
! 53: Each operand is described by an operand-constraint string followed by
! 54: the C expression in parentheses. A colon separates the assembler
! 55: template from the first output operand, and another separates the
! 56: last output operand from the first input, if any. Commas separate
! 57: output operands and separate inputs. The total number of operands is
! 58: limited to the maximum number of operands in any instruction pattern
! 59: in the machine description.
! 60:
! 61: If there are no output operands, and there are input operands, then
! 62: there must be two consecutive colons surrounding the place where the
! 63: output operands would go.
! 64:
! 65: Output operand expressions must be lvalues; the compiler can check
! 66: this. The input operands need not be lvalues. The compiler cannot
! 67: check whether the operands have data types that are reasonable for
! 68: the instruction being executed. It does not parse the assembler
! 69: instruction template and does not know what it means, or whether it
! 70: is valid assembler input. The extended `asm' feature is most often
! 71: used for machine instructions that the compiler itself does not know
! 72: exist.
! 73:
! 74: The output operands must be write-only; GNU CC will assume that the
! 75: values in these operands before the instruction are dead and need not
! 76: be generated. For an operand that is read-write, or in which not all
! 77: bits are written and the other bits contain useful information, you
! 78: must logically split its function into two separate operands, one
! 79: input operand and one write-only output operand. The connection
! 80: between them is expressed by constraints which say they need to be in
! 81: the same location when the instruction executes. You can use the
! 82: same C expression for both operands, or different expressions. For
! 83: example, here we write the (fictitious) `combine' instruction with
! 84: `bar' as its read-only source operand and `foo' as its read-write
! 85: destination:
! 86:
! 87: asm ("combine %2,%0" : "=r" (foo) : "0" (foo), "g" (bar));
! 88:
! 89: The constraint `"0"' for operand 1 says that it must occupy the same
! 90: location as operand 0. A digit in a constraint is allowed only in an
! 91: input operand, and it must refer to an output operand.
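The effect of a matching-digit constraint can be seen without relying on any particular machine's instructions. The following is only a sketch: the function name `copy_through_register' is hypothetical, the template is deliberately empty, and `__asm' is written instead of `asm' so that the fragment also compiles with `-ansi'. Because operand 1 must occupy the same location as operand 0, the value loaded for the input is simply left in place as the output.

```c
/* Hypothetical helper: forces `in' and `out' to share one register.
   The empty template emits no instruction, so whatever value the
   compiler loaded for operand 1 remains there as operand 0.  */
static int
copy_through_register (int in)
{
  int out;
  __asm ("" : "=r" (out) : "0" (in));
  return out;
}
```

Calling `copy_through_register (42)' is expected to yield 42. If the input constraint were `"r"' instead of `"0"', nothing would tie the two operands together and the result would be unpredictable.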
! 92:
! 93: Only a digit in the constraint can guarantee that one operand will be
! 94: in the same place as another. The mere fact that `foo' is the value
! 95: of both operands is not enough to guarantee that they will be in the
! 96: same place in the generated assembler code. The following would not
! 97: work:
! 98:
! 99: asm ("combine %2,%0" : "=r" (foo) : "r" (foo), "g" (bar));
! 100:
! 101: Various optimizations or reloading could cause operands 0 and 1 to be
! 102: in different registers; GNU CC knows no reason not to do so. For
! 103: example, the compiler might find a copy of the value of `foo' in one
! 104: register and use it for operand 1, but generate the output operand 0
! 105: in a different register (copying it afterward to `foo''s own
! 106: address). Of course, since the register for operand 1 is not even
! 107: mentioned in the assembler code, the result will not work, but GNU CC
! 108: can't tell that.
! 109:
! 110: Unless an output operand has the `&' constraint modifier, GNU CC may
! 111: allocate it in the same register as an unrelated input operand, on
! 112: the assumption that the inputs are consumed before the outputs are
! 113: produced. This assumption may be false if the assembler code
! 114: actually consists of more than one instruction. In such a case, use
! 115: `&' for each output operand that may not overlap an input. *Note
! 116: Modifiers::.
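As a rough illustration of when `&' is needed, here is a two-instruction template. It is a sketch under stated assumptions: the function name `add_in_two_steps' is hypothetical, the assembler branch assumes an x86 target using AT&T operand order, and other machines fall back to plain C so that the fragment still compiles.

```c
/* Hypothetical example; the x86 branch assumes AT&T assembler syntax.  */
static unsigned int
add_in_two_steps (unsigned int a, unsigned int b)
{
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
  unsigned int sum;
  /* `sum' is written by the first instruction, before `b' is read by
     the second, so it must not share a register with `b': hence `&'.  */
  __asm ("movl %1,%0\n\taddl %2,%0"
         : "=&r" (sum)
         : "r" (a), "r" (b));
  return sum;
#else
  /* Fallback for other machines: plain C with the same result.  */
  return a + b;
#endif
}
```

Without the `&', the compiler would be free to put `sum' and `b' in the same register, and the first instruction would then destroy `b' before the second one reads it.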
! 117:
! 118: Some instructions clobber specific hard registers. To describe this,
! 119: write a third colon after the input operands, followed by the names
! 120: of the clobbered hard registers (given as strings). Here is a
! 121: realistic example for the VAX:
! 122:
! 123: asm volatile ("movc3 %0,%1,%2"
! 124: : /* no outputs */
! 125: : "g" (from), "g" (to), "g" (count)
! 126: : "r0", "r1", "r2", "r3", "r4", "r5");
! 127:
! 128: You can put multiple assembler instructions together in a single
! 129: `asm' template, separated either with newlines (written as `\n') or
! 130: with semicolons if the assembler allows such semicolons. The GNU
! 131: assembler allows semicolons and all Unix assemblers seem to do so.
! 132: The input operands are guaranteed not to use any of the clobbered
! 133: registers, and neither will the output operands' addresses, so you
! 134: can read and write the clobbered registers as many times as you like.
! 135: Here is an example of multiple instructions in a template; it assumes
! 136: that the subroutine `_foo' accepts arguments in registers 9 and 10:
! 137:
! 138: asm ("movl %0,r9;movl %1,r10;call _foo"
! 139: : /* no outputs */
! 140: : "g" (from), "g" (to)
! 141: : "r9", "r10");
! 142:
! 143: If you want to test the condition code produced by an assembler
! 144: instruction, you must include a branch and a label in the `asm'
! 145: construct, as follows:
! 146:
! 147: asm ("clr %0;frob %1;beq 0f;mov #1,%0;0:"
! 148: : "=g" (result)
! 149: : "g" (input));
! 150:
! 151: This assumes your assembler supports local labels, as the GNU
! 152: assembler and most Unix assemblers do.
! 153:
! 154: Usually the most convenient way to use these `asm' instructions is to
! 155: encapsulate them in macros that look like functions. For example,
! 156:
! 157: #define sin(x) \
! 158: ({ double __value, __arg = (x); \
! 159: asm ("fsinx %1,%0": "=f" (__value): "f" (__arg)); \
! 160: __value; })
! 161:
! 162: Here the variable `__arg' is used to make sure that the instruction
! 163: operates on a proper `double' value, and to accept only those
! 164: arguments `x' which can convert automatically to a `double'.
! 165:
! 166: Another way to make sure the instruction operates on the correct data
! 167: type is to use a cast in the `asm'. This is different from using a
! 168: variable `__arg' in that it converts more different types. For
! 169: example, if the desired type were `int', casting the argument to
! 170: `int' would accept a pointer with no complaint, while assigning the
! 171: argument to an `int' variable named `__arg' would warn about using a
! 172: pointer unless the caller explicitly casts it.
! 173:
! 174: If an `asm' has output operands, GNU CC assumes for optimization
! 175: purposes that the instruction has no side effects except to change
! 176: the output operands. This does not mean that instructions with a
! 177: side effect cannot be used, but you must be careful, because the
! 178: compiler may eliminate them if the output operands aren't used, or
! 179: move them out of loops, or replace two with one if they constitute a
! 180: common subexpression. Also, if your instruction does have a side
! 181: effect on a variable that otherwise appears not to change, the old
! 182: value of the variable may be reused later if it happens to be found
! 183: in a register.
! 184:
! 185: You can prevent an `asm' instruction from being deleted, moved or
! 186: combined by writing the keyword `volatile' after the `asm'. For
! 187: example:
! 188:
! 189: #define set_priority(x) \
! 190: asm volatile ("set_priority %0": /* no outputs */ : "g" (x))
! 191:
! 192: (However, an instruction without output operands will not be deleted
! 193: or moved, regardless, unless it is unreachable.)
! 194:
! 195: It is a natural idea to look for a way to give access to the
! 196: condition code left by the assembler instruction. However, when we
! 197: attempted to implement this, we found no way to make it work
! 198: reliably. The problem is that output operands might need reloading,
! 199: which would result in additional following ``store'' instructions.
! 200: On most machines, these instructions would alter the condition code
! 201: before there was time to test it. This problem doesn't arise for
! 202: ordinary ``test'' and ``compare'' instructions because they don't
! 203: have any output operands.
! 204:
! 205: If you are writing a header file that should be includable in ANSI C
! 206: programs, write `__asm' instead of `asm'. *Note Alternate Keywords::.
! 207:
! 208:
! 209:
! 210: File: gcc.info, Node: Asm Labels, Next: Global Reg Vars, Prev: Extended Asm, Up: Extensions
! 211:
! 212: Controlling Names Used in Assembler Code
! 213: ========================================
! 214:
! 215: You can specify the name to be used in the assembler code for a C
! 216: function or variable by writing the `asm' (or `__asm') keyword after
! 217: the declarator as follows:
! 218:
! 219: int foo asm ("myfoo") = 2;
! 220:
! 221: This specifies that the name to be used for the variable `foo' in the
! 222: assembler code should be `myfoo' rather than the usual `_foo'.
! 223:
! 224: On systems where an underscore is normally prepended to the name of a
! 225: C function or variable, this feature allows you to define names for
! 226: the linker that do not start with an underscore.
! 227:
! 228: You cannot use `asm' in this way in a function *definition*; but you
! 229: can get the same effect by writing a declaration for the function
! 230: before its definition and putting `asm' there, like this:
! 231:
! 232: extern func () asm ("FUNC");
! 233:
! 234: func (x, y)
! 235: int x, y;
! 236: ...
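The same declaration-then-definition arrangement can be written out in full as follows. This is only a sketch: the names `answer' and `renamed_answer' are hypothetical, and it uses a prototype and the `__asm' spelling rather than the older style shown above.

```c
/* Declaration first: this is where the assembler name is attached.
   `answer' and `renamed_answer' are hypothetical names.  */
extern int answer (void) __asm ("renamed_answer");

/* The definition carries no `asm'; it takes the name given above.  */
int
answer (void)
{
  return 42;
}
```

C code continues to call the function as `answer ()'; only the symbol seen by the assembler and linker changes.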
! 237:
! 238: It is up to you to make sure that the assembler names you choose do
! 239: not conflict with any other assembler symbols. Also, you must not
! 240: use a register name; that would produce completely invalid assembler
! 241: code. GNU CC does not as yet have the ability to store static
! 242: variables in registers. Perhaps that will be added.
! 243:
! 244:
! 245:
! 246: File: gcc.info, Node: Global Reg Vars, Next: Alternate Keywords, Prev: Asm Labels, Up: Extensions
! 247:
! 248: Global Variables in Registers
! 249: =============================
! 250:
! 251: A few programs, such as programming language interpreters, may have a
! 252: couple of global variables that are accessed so often that it is
! 253: worth while to reserve registers throughout the program just for them.
! 254:
! 255: You can define a global register variable in GNU C like this:
! 256:
! 257: register int *foo asm ("a5");
! 258:
! 259: Here `a5' is the name of the register which should be used. Choose a
! 260: register which is normally saved and restored by function calls on
! 261: your machine, so that library routines will not clobber it.
! 262:
! 263: Naturally the register name is cpu-dependent, so you would need to
! 264: conditionalize your program according to cpu type. The register `a5'
! 265: would be a good choice on a 68000 for a variable of pointer type. On
! 266: machines with register windows, be sure to choose a ``global''
! 267: register that is not affected by the function call mechanism.
! 268:
! 269: In addition, operating systems on one type of cpu may differ in how
! 270: they name the registers; then you would need additional conditionals.
! 271: For example, some 68000 operating systems call this register `%a5'.
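One possible way to conditionalize such a declaration is sketched below. The variable `interp_sp' and the function `interp_push' are hypothetical, and the test on `__m68k__' is an assumption about which predefined macro identifies the 68000 family; on any other machine the sketch falls back to an ordinary global variable.

```c
/* Hypothetical interpreter stack pointer.  Only the 68000 branch
   reserves a register; other machines use ordinary memory.  */
#if defined (__GNUC__) && defined (__m68k__)
register int *interp_sp __asm ("a5");
#else
int *interp_sp;
#endif

/* Push a value through whichever kind of variable was selected.  */
static void
interp_push (int value)
{
  *interp_sp++ = value;
}
```

The rest of the program uses `interp_sp' in the same way in either case, so the register choice stays confined to the one conditional.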
! 272:
! 273: Eventually there may be a way of asking the compiler to choose a
! 274: register automatically, but first we need to figure out how it should
! 275: choose and how to enable you to guide the choice. No solution is
! 276: evident.
! 277:
! 278: Defining a global register variable in a certain register reserves
! 279: that register entirely for this use, at least within the current
! 280: compilation. The register will not be allocated for any other
! 281: purpose in the functions in the current compilation. The register
! 282: will not be saved and restored by these functions. Stores into this
! 283: register are never deleted even if they would appear to be dead, but
! 284: references may be deleted or moved or simplified.
! 285:
! 286: It is not safe to access the global register variables from signal
! 287: handlers, or from more than one thread of control, because the system
! 288: library routines may temporarily use the register for other things
! 289: (unless you recompile them specially for the task at hand).
! 290:
! 291: It is not safe for one function that uses a global register variable
! 292: to call another such function `foo' by way of a third function `lose'
! 293: that was compiled without knowledge of this variable (i.e. in a
! 294: different source file in which the variable wasn't declared). This
! 295: is because `lose' might save the register and put some other value
! 296: there. For example, you can't expect a global register variable to
! 297: be available in the comparison-function that you pass to `qsort',
! 298: since `qsort' might have put something else in that register. (If
! 299: you are prepared to recompile `qsort' with the same global register
! 300: variable, you can solve this problem.)
! 301:
! 302: If you want to recompile `qsort' or other source files which do not
! 303: actually use your global register variable, so that they will not use
! 304: that register for any other purpose, then it suffices to specify the
! 305: compiler option `-ffixed-REG'. You need not actually add a global
! 306: register declaration to their source code.
! 307:
! 308: A function which can alter the value of a global register variable
! 309: cannot safely be called from a function compiled without this
! 310: variable, because it could clobber the value the caller expects to
! 311: find there on return. Therefore, the function which is the entry
! 312: point into the part of the program that uses the global register
! 313: variable must explicitly save and restore the value which belongs to
! 314: its caller.
! 315:
! 316: On most machines, `longjmp' will restore to each global register
! 317: variable the value it had at the time of the `setjmp'. On some
! 318: machines, however, `longjmp' will not change the value of global
! 319: register variables. To be portable, the function that called
! 320: `setjmp' should make other arrangements to save the values of the
! 321: global register variables, and to restore them in a `longjmp'. This
! 322: way, the same thing will happen regardless of what `longjmp' does.
! 323:
! 324: All global register variable declarations must precede all function
! 325: definitions. If such a declaration could appear after function
! 326: definitions, the declaration would be too late to prevent the
! 327: register from being used for other purposes in the preceding functions.
! 328:
! 329: Global register variables may not have initial values, because an
! 330: executable file has no means to supply initial contents for a register.
! 331:
! 332:
! 333:
! 334: File: gcc.info, Node: Alternate Keywords, Prev: Global Reg Vars, Up: Extensions
! 335:
! 336: Alternate Keywords
! 337: ==================
! 338:
! 339: The option `-traditional' disables certain keywords; `-ansi' disables
! 340: certain others. This causes trouble when you want to use GNU C
! 341: extensions, or ANSI C features, in a general-purpose header file that
! 342: should be usable by all programs, including ANSI C programs and
! 343: traditional ones. The keywords `asm', `typeof' and `inline' cannot
! 344: be used since they won't work in a program compiled with `-ansi',
! 345: while the keywords `const', `volatile', `signed', `typeof' and
! 346: `inline' won't work in a program compiled with `-traditional'.
! 347:
! 348: The way to solve these problems is to put `__' in front of each
! 349: problematical keyword. For example, use `__asm' instead of `asm',
! 350: `__const' instead of `const', and `__inline' instead of `inline'.
! 351:
! 352: Other C compilers won't accept these alternative keywords; if you
! 353: want to compile with another compiler, you can define the alternate
! 354: keywords as macros to replace them with the customary keywords. It
! 355: looks like this:
! 356:
! 357: #ifndef __GNUC__
! 358: #define __asm asm
! 359: #endif
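A header that follows this advice might look like the sketch below. The function `twice' is hypothetical, and defining `__inline' as empty (rather than as `inline') is an assumption made for compilers that have no `inline' keyword at all.

```c
#ifndef __GNUC__
/* Not GNU C: map the alternate keywords onto what the compiler has.
   Defining `__inline' as empty is an assumption for compilers that
   lack an `inline' keyword.  */
#define __const const
#define __inline
#endif

/* Hypothetical header function usable with or without `-ansi'.  */
static __inline int
twice (__const int x)
{
  return 2 * x;
}
```

A program including this header can call `twice (21)' whether it is compiled by GNU CC with `-ansi' or by another compiler.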
! 360:
! 361:
! 362:
! 363: File: gcc.info, Node: Bugs, Next: Portability, Prev: Extensions, Up: Top
! 364:
! 365: Reporting Bugs
! 366: **************
! 367:
! 368: Your bug reports play an essential role in making GNU CC reliable.
! 369:
! 370: Reporting a bug may help you by bringing a solution to your problem,
! 371: or it may not. But in any case the important function of a bug
! 372: report is to help the entire community by making the next version of
! 373: GNU CC work better. Bug reports are your contribution to the
! 374: maintenance of GNU CC.
! 375:
! 376: In order for a bug report to serve its purpose, you must include the
! 377: information that makes it possible to fix the bug.
! 378:
! 379: * Menu:
! 380:
! 381: * Criteria: Bug Criteria. Have you really found a bug?
! 382: * Reporting: Bug Reporting. How to report a bug effectively.
! 383:
! 384:
! 385:
! 386: File: gcc.info, Node: Bug Criteria, Next: Bug Reporting, Prev: Bugs, Up: Bugs
! 387:
! 388: Have You Found a Bug?
! 389: =====================
! 390:
! 391: If you are not sure whether you have found a bug, here are some
! 392: guidelines:
! 393:
! 394: * If the compiler gets a fatal signal, for any input whatever,
! 395: that is a compiler bug. Reliable compilers never crash.
! 396:
! 397: * If the compiler produces invalid assembly code, for any input
! 398: whatever (except an `asm' statement), that is a compiler bug,
! 399: unless the compiler reports errors (not just warnings) which
! 400: would ordinarily prevent the assembler from being run.
! 401:
! 402: * If the compiler produces valid assembly code that does not
! 403: correctly execute the input source code, that is a compiler bug.
! 404:
! 405: However, you must double-check to make sure, because you may
! 406: have run into an incompatibility between GNU C and traditional C
! 407: (*note Incompatibilities::.). These incompatibilities might be
! 408: considered bugs, but they are inescapable consequences of
! 409: valuable features.
! 410:
! 411: Or you may have a program whose behavior is undefined, which
! 412: happened by chance to give the desired results with another C
! 413: compiler.
! 414:
! 415: For example, in many nonoptimizing compilers, you can write `x;'
! 416: at the end of a function instead of `return x;', with the same
! 417: results. But the value of the function is undefined if `return'
! 418: is omitted; it is not a bug when GNU CC produces different
! 419: results.
! 420:
! 421: Problems often result from expressions with two increment
! 422: operators, as in `f (*p++, *p++)'. Your previous compiler might
! 423: have interpreted that expression the way you intended; GNU CC
! 424: might interpret it another way; neither compiler is wrong.
! 425:
! 426: After you have localized the error to a single source line, it
! 427: should be easy to check for these things. If your program is
! 428: correct and well defined, you have found a compiler bug.
! 429:
! 430: * If the compiler produces an error message for valid input, that
! 431: is a compiler bug.
! 432:
! 433: Note that the following is not valid input, and the error
! 434: message for it is not a bug:
! 435:
! 436: int foo (char);
! 437:
! 438: int
! 439: foo (x)
! 440: char x;
! 441: { ... }
! 442:
! 443: The prototype says to pass a `char', while the definition says
! 444: to pass an `int' and treat the value as a `char'. This is what
! 445: the ANSI standard says, and it makes sense.
! 446:
! 447: * If the compiler does not produce an error message for invalid
! 448: input, that is a compiler bug. However, you should note that
! 449: your idea of ``invalid input'' might be my idea of ``an
! 450: extension'' or ``support for traditional practice''.
! 451:
! 452: * If you are an experienced user of C compilers, your suggestions
! 453: for improvement of GNU CC are welcome in any case.
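The `f (*p++, *p++)' case mentioned in the guidelines above can be made well defined by reading each value in a statement of its own. The following is a minimal sketch; the names `difference' and `difference_of_next_two' are hypothetical.

```c
/* Hypothetical callee: the argument order matters, so the order in
   which the two reads happen must be fixed by the caller.  */
static int
difference (int first, int second)
{
  return first - second;
}

/* Instead of difference (*p++, *p++), read each element in its own
   statement so that the order of the increments is well defined.  */
static int
difference_of_next_two (const int **pp)
{
  int first = *(*pp)++;
  int second = *(*pp)++;
  return difference (first, second);
}
```

Written this way, the program gives the same result with any compiler, so a different result from GNU CC would indicate a real problem.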
! 454:
! 455:
! 456:
! 457: File: gcc.info, Node: Bug Reporting, Prev: Bug Criteria, Up: Bugs
! 458:
! 459: How to Report Bugs
! 460: ==================
! 461:
! 462: Send bug reports for GNU C to one of these addresses:
! 463:
! 464: [email protected]
! 465: {ucbvax|mit-eddie|uunet}!prep.ai.mit.edu!bug-gcc
! 466:
! 467: As a last resort, snail them to:
! 468:
! 469: GNU Compiler Bugs
! 470: 545 Tech Sq
! 471: Cambridge, MA 02139
! 472:
! 473: The fundamental principle of reporting bugs usefully is this: *report
! 474: all the facts*. If you are not sure whether to mention a fact or
! 475: leave it out, mention it!
! 476:
! 477: Often people omit facts because they think they know what causes the
! 478: problem and they conclude that some details don't matter. Thus, you
! 479: might assume that the name of the variable you use in an example does
! 480: not matter. Well, probably it doesn't, but one cannot be sure.
! 481: Perhaps the bug is a stray memory reference which happens to fetch
! 482: from the location where that name is stored in memory; perhaps, if
! 483: the name were different, the contents of that location would fool the
! 484: compiler into doing the right thing despite the bug. Play it safe
! 485: and give an exact example.
! 486:
! 487: If you want to enable me to fix the bug, you should include all these
! 488: things:
! 489:
! 490: * The version of GNU CC. You can get this by running it with the
! 491: `-v' option.
! 492:
! 493: Without this, I won't know whether there is any point in looking
! 494: for the bug in the current version of GNU CC.
! 495:
! 496: * A complete input file that will reproduce the bug. If the bug
! 497: is in the C preprocessor, send me a source file and any header
! 498: files that it requires. If the bug is in the compiler proper
! 499: (`cc1'), run your source file through the C preprocessor by
! 500: doing `gcc -E SOURCEFILE > OUTFILE', then include the contents
! 501: of OUTFILE in the bug report. (Any `-I', `-D' or `-U' options
! 502: that you used in actual compilation should also be used when
! 503: doing this.)
! 504:
! 505: A single statement is not enough of an example. In order to
! 506: compile it, it must be embedded in a function definition; and
! 507: the bug might depend on the details of how this is done.
! 508:
! 509: Without a real example I can compile, all I can do about your
! 510: bug report is wish you luck. It would be futile to try to guess
! 511: how to provoke the bug. For example, bugs in register
! 512: allocation and reloading frequently depend on every little
! 513: detail of the function they happen in.
! 514:
! 515: * The command arguments you gave GNU CC to compile that example
! 516: and observe the bug. For example, did you use `-O'? To
! 517: guarantee you won't omit something important, list them all.
! 518:
! 519: If I were to try to guess the arguments, I would probably guess
! 520: wrong and then I would not encounter the bug.
! 521:
! 522: * The names of the files that you used for `tm.h' and `md' when
! 523: you installed the compiler.
! 524:
! 525: * The type of machine you are using, and the operating system name
! 526: and version number.
! 527:
! 528: * A description of what behavior you observe that you believe is
! 529: incorrect. For example, ``It gets a fatal signal,'' or, ``There
! 530: is an incorrect assembler instruction in the output.''
! 531:
! 532: Of course, if the bug is that the compiler gets a fatal signal,
! 533: then I will certainly notice it. But if the bug is incorrect
! 534: output, I might not notice unless it is glaringly wrong. I
! 535: won't study all the assembler code from a 50-line C program just
! 536: on the off chance that it might be wrong.
! 537:
! 538: Even if the problem you experience is a fatal signal, you should
! 539: still say so explicitly. Suppose something strange is going on,
! 540: such as, your copy of the compiler is out of synch, or you have
! 541: encountered a bug in the C library on your system. (This has
! 542: happened!) Your copy might crash and mine would not. If you
! 543: told me to expect a crash, then when mine fails to crash, I
! 544: would know that the bug was not happening for me. If you had
! 545: not told me to expect a crash, then I would not be able to draw
! 546: any conclusion from my observations.
! 547:
! 548: In cases where GNU CC generates incorrect code, if you send me a
! 549: small complete sample program I will find the error myself by
! 550: running the program under a debugger. If you send me a large
! 551: example or a part of a larger program, I cannot do this; you
! 552: must debug the compiled program and narrow the problem down to
! 553: one source line. Tell me which source line it is, and what you
! 554: believe is incorrect about the code generated for that line.
! 555:
! 556: * If you send me examples of output from GNU CC, please use `-g'
! 557: when you make them. The debugging information includes source
! 558: line numbers which are essential for correlating the output with
! 559: the input.
! 560:
! 561: * If you wish to suggest changes to the GNU CC source, send me
! 562: context diffs. If you even discuss something in the GNU CC
! 563: source, refer to it by context, not by line number.
! 564:
! 565: The line numbers in my development sources don't match those in
! 566: your sources. Your line numbers would convey no useful
! 567: information to me.
! 568:
! 569: * Additional information from a debugger might enable me to find a
! 570: problem on a machine which I do not have available myself.
! 571: However, you need to think when you collect this information if
! 572: you want it to have any chance of being useful.
! 573:
! 574: For example, many people send just a backtrace, but that is
! 575: never useful by itself. A simple backtrace with arguments
! 576: conveys little about GNU CC because the compiler is largely
! 577: data-driven; the same functions are called over and over for
! 578: different RTL insns, doing different things depending on the
! 579: details of the insn.
! 580:
! 581: Most of the arguments listed in the backtrace are useless
! 582: because they are pointers to RTL list structure. The numeric
! 583: values of the pointers, which the debugger prints in the
! 584: backtrace, have no significance whatever; all that matters is
! 585: the contents of the objects they point to (and most of the
! 586: contents are other such pointers).
! 587:
! 588: In addition, most compiler passes consist of one or more loops
! 589: that scan the RTL insn sequence. The most vital piece of
! 590: information about such a loop--which insn it has reached--is
! 591: usually in a local variable, not in an argument.
! 592:
! 593: What you need to provide in addition to a backtrace are the
! 594: values of the local variables for several stack frames up. When
! 595: a local variable or an argument is an RTX, first print its value
! 596: and then use the GDB command `pr' to print the RTL expression
! 597: that it points to. (If GDB doesn't run on your machine, use
! 598: your debugger to call the function `debug_rtx' with the RTX as
! 599: an argument.) In general, whenever a variable is a pointer, its
! 600: value is no use without the data it points to.
! 601:
! 602: In addition, include a debugging dump from just before the pass
! 603: in which the crash happens. Most bugs involve a series of
! 604: insns, not just one.
! 605:
! 606: Here are some things that are not necessary:
! 607:
! 608: * A description of the envelope of the bug.
! 609:
! 610: Often people who encounter a bug spend a lot of time
! 611: investigating which changes to the input file will make the bug
! 612: go away and which changes will not affect it.
! 613:
! 614: This is often time consuming and not very useful, because the
! 615: way I will find the bug is by running a single example under the
! 616: debugger with breakpoints, not by pure deduction from a series
! 617: of examples.
! 618:
! 619: Of course, if you can find a simpler example to report *instead*
! 620: of the original one, that is a convenience for me. Errors in
! 621: the output will be easier to spot, running under the debugger
! 622: will take less time, etc. Most GNU CC bugs involve just one
! 623: function, so the most straightforward way to simplify an example
! 624: is to delete all the function definitions except the one where
! 625: the bug occurs. Those earlier in the file may be replaced by
! 626: external declarations if the crucial function depends on them.
! 627:
! 628: However, simplification is not vital; if you don't want to do
! 629: this, report the bug anyway.
! 630:
! 631: * A patch for the bug.
! 632:
! 633: A patch for the bug does help me if it is a good one. But don't
! 634: omit the necessary information, such as the test case, because I
! 635: might see problems with your patch and decide to fix the problem
! 636: another way.
! 637:
! 638: Sometimes with a program as complicated as GNU CC it is very
! 639: hard to construct an example that will make the program follow a
! 640: certain path through the code. If you don't send me the
! 641: example, I won't be able to construct one, so I won't be able to
! 642: verify that the bug is fixed.
! 643:
! 644: * A guess about what the bug is or what it depends on.
! 645:
! 646: Such guesses are usually wrong. Even I can't guess right about
! 647: such things without using the debugger to find the facts.
! 648:
! 649:
! 650:
! 651: File: gcc.info, Node: Portability, Next: Interface, Prev: Bugs, Up: Top
! 652:
! 653: GNU CC and Portability
! 654: **********************
! 655:
! 656: The main goal of GNU CC was to make a good, fast compiler for
! 657: machines in the class that the GNU system aims to run on: 32-bit
! 658: machines that address 8-bit bytes and have several general registers.
! 659: Elegance, theoretical power and simplicity are only secondary.
! 660:
! 661: GNU CC gets most of the information about the target machine from a
! 662: machine description which gives an algebraic formula for each of the
! 663: machine's instructions. This is a very clean way to describe the
! 664: target. But when the compiler needs information that is difficult to
! 665: express in this fashion, I have not hesitated to define an ad-hoc
! 666: parameter to the machine description. The purpose of portability is
! 667: to reduce the total work needed on the compiler; it was not of
! 668: interest for its own sake.
! 669:
! 670: GNU CC does not contain machine dependent code, but it does contain
! 671: code that depends on machine parameters such as endianness (whether
! 672: the most significant byte has the highest or lowest address of the
! 673: bytes in a word) and the availability of autoincrement addressing.
! 674: In the RTL-generation pass, it is often necessary to have multiple
! 675: strategies for generating code for a particular kind of syntax tree,
! 676: strategies that are usable for different combinations of parameters.
! 677: Often I have not tried to address all possible cases, but only the
! 678: common ones or only the ones that I have encountered. As a result, a
! 679: new target may require additional strategies. You will know if this
! 680: happens because the compiler will call `abort'. Fortunately, the new
! 681: strategies can be added in a machine-independent fashion, and will
! 682: affect only the target machines that need them.
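
As a small illustration of the endianness parameter defined above, the following sketch tests the byte order of the machine it runs on. The function name is hypothetical. GNU CC itself takes this parameter from the description of the target machine, not from a run-time test like this one.

```c
/* Return 1 if the most significant byte of a word has the lowest
   address (big-endian), 0 if the least significant byte does.  */
int
most_significant_byte_first (void)
{
  unsigned int word = 1;   /* only the least significant byte is nonzero */
  const unsigned char *bytes = (const unsigned char *) &word;

  /* If the byte at the lowest address is zero, the nonzero (least
     significant) byte must live at a higher address.  */
  return bytes[0] == 0;
}
```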
! 683:
! 684:
! 685:
1.1 root 686: File: gcc.info, Node: Interface, Next: Passes, Prev: Portability, Up: Top
687:
688: Interfacing to GNU CC Output
689: ****************************
690:
691: GNU CC is normally configured to use the same function calling
692: convention that is in use on the target system. This is done with
693: the machine-description macros described (*note Machine Macros::.).
694:
695: However, returning of structure and union values is done differently
696: on some target machines. As a result, functions compiled with PCC
697: returning such types cannot be called from code compiled with GNU CC,
698: and vice versa. This does not cause trouble often because few Unix
699: library routines return structures or unions.
700:
701: GNU CC code returns structures and unions that are 1, 2, 4 or 8 bytes
702: long in the same registers used for `int' or `double' return values.
703: (GNU CC typically allocates variables of such types in registers
704: also.) Structures and unions of other sizes are returned by storing
705: them into an address passed by the caller (usually in a register).
706: The machine-description macros `STRUCT_VALUE' and
707: `STRUCT_INCOMING_VALUE' tell GNU CC where to pass this address.
708:
709: By contrast, PCC on most target machines returns structures and
710: unions of any size by copying the data into an area of static
711: storage, and then returning the address of that storage as if it were
712: a pointer value. The caller must copy the data from that memory area
713: to the place where the value is wanted. This is slower than the
714: method used by GNU CC, and fails to be reentrant.
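
The difference between the two conventions can be imitated in portable C. This is only a sketch at the source level: the structure and function names are hypothetical, and the real conventions are implemented in the generated code, not written by hand like this.

```c
struct triple { int first, second, third; };

/* Imitates the GNU CC method for structures of other sizes: the
   caller passes the address where the value should be stored.  */
void
fill_triple_into (struct triple *result, int base)
{
  result->first = base;
  result->second = base + 1;
  result->third = base + 2;
}

/* Imitates the PCC method: the value is copied into static storage
   and the address of that storage is returned.  Every call reuses
   the same area, which is why this method is not reentrant.  */
struct triple *
fill_triple_static (int base)
{
  static struct triple area;

  area.first = base;
  area.second = base + 1;
  area.third = base + 2;
  return &area;
}
```

With the static method, a second call overwrites the result of the first unless the caller has already copied it elsewhere.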
715:
716: On some target machines, such as RISC machines and the 80386, the
717: standard system convention is to pass to the subroutine the address
718: of where to return the value. On these machines, GNU CC has been
719: configured to be compatible with the standard compiler, when this
720: method is used. It may not be compatible for structures of 1, 2, 4
721: or 8 bytes.
722:
723: GNU CC uses the system's standard convention for passing arguments.
724: On some machines, the first few arguments are passed in registers; in
725: others, all are passed on the stack. It would be possible to use
726: registers for argument passing on any machine, and this would
727: probably result in a significant speedup. But the result would be
728: complete incompatibility with code that follows the standard
729: convention. So this change is practical only if you are switching to
730: GNU CC as the sole C compiler for the system. We may implement
731: register argument passing on certain machines once we have a complete
732: GNU system so that we can compile the libraries with GNU CC.
733:
734: If you use `longjmp', beware of automatic variables. ANSI C says
735: that automatic variables that are not declared `volatile' have
736: undefined values after a `longjmp'. And this is all GNU CC promises
737: to do, because it is very difficult to restore register variables
738: correctly, and one of GNU CC's features is that it can put variables
739: in registers without your asking it to.
740:
741: If you want a variable to be unaltered by `longjmp', and you don't
742: want to write `volatile' because old C compilers don't accept it,
743: just take the address of the variable. If a variable's address is
744: ever taken, even if just to compute it and ignore it, then the
745: variable cannot go in a register:
746:
747: {
748: int careful;
749: &careful;
750: ...
751: }
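
For compilers that do accept `volatile', the following sketch shows the other approach. The function names are hypothetical. Standard C guarantees the value of a `volatile' automatic variable that is changed between the `setjmp' and the `longjmp'.

```c
#include <setjmp.h>

static jmp_buf env;

static void
jump_back (void)
{
  longjmp (env, 1);
}

/* Return the value of `count' as seen after the longjmp.  */
int
count_after_longjmp (void)
{
  volatile int count = 0;   /* volatile keeps the variable in memory */

  if (setjmp (env) == 0)
    {
      count = 5;            /* changed between setjmp and longjmp */
      jump_back ();
    }
  return count;
}
```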
752:
753: Code compiled with GNU CC may call certain library routines. Most of
754: them handle arithmetic for which there are no instructions. This
755: includes multiply and divide on some machines, and floating point
756: operations on any machine for which floating point support is
757: disabled with `-msoft-float'. Some standard parts of the C library,
758: such as `bcopy' or `memcpy', are also called automatically. The
759: usual function call interface is used for calling the library routines.
760:
761: These library routines should be defined in the library `gnulib',
762: which GNU CC automatically searches whenever it links a program. On
763: machines that have multiply and divide instructions, if hardware
764: floating point is in use, normally `gnulib' is not needed, but it is
765: searched just in case.
766:
767: Each arithmetic function is defined in `gnulib.c' to use the
768: corresponding C arithmetic operator. As long as the file is compiled
769: with another C compiler, which supports all the C arithmetic
770: operators, this file will work portably. However, `gnulib.c' does
771: not work if compiled with GNU CC, because each arithmetic function
772: would compile into a call to itself!
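
A routine of this kind might look like the sketch below. The names are hypothetical and are not the ones used in `gnulib.c'. On a target with no multiply instruction, GNU CC would turn the `*' operator into a call to the library's multiply routine, which is how such a function would end up calling itself.

```c
/* Hypothetical library routine: multiplication written with the
   ordinary C operator, to be compiled by a compiler that supports
   that operator directly.  */
int
sketch_multiply (int a, int b)
{
  return a * b;
}

/* Hypothetical library routine for division, in the same style.  */
int
sketch_divide (int a, int b)
{
  return a / b;
}
```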
773:
774:
775:
776: File: gcc.info, Node: Passes, Next: RTL, Prev: Interface, Up: Top
777:
778: Passes and Files of the Compiler
779: ********************************
780:
781: The overall control structure of the compiler is in `toplev.c'. This
782: file is responsible for initialization, decoding arguments, opening
783: and closing files, and sequencing the passes.
784:
785: The parsing pass is invoked only once, to parse the entire input.
786: The RTL intermediate code for a function is generated as the function
787: is parsed, a statement at a time. Each statement is read in as a
788: syntax tree and then converted to RTL; then the storage for the tree
789: for the statement is reclaimed. Storage for types (and the
790: expressions for their sizes), declarations, and a representation of
791: the binding contours and how they nest, remains until the function is
792: finished being compiled; these are all needed to output the debugging
793: information.
794:
795: Each time the parsing pass reads a complete function definition or
796: top-level declaration, it calls the function `rest_of_compilation' or
797: `rest_of_decl_compilation' in `toplev.c', which are responsible for
798: all further processing necessary, ending with output of the assembler
799: language. All other compiler passes run, in sequence, within
800: `rest_of_compilation'. When that function returns from compiling a
801: function definition, the storage used for that function definition's
802: compilation is entirely freed, unless it is an inline function (*note
803: Inline::.).
804:
805: Here is a list of all the passes of the compiler and their source
806: files. Also included is a description of where debugging dumps can
807: be requested with `-d' options.
808:
809: * Parsing. This pass reads the entire text of a function
810: definition, constructing partial syntax trees. This and RTL
811: generation are no longer truly separate passes (formerly they
812: were), but it is easier to think of them as separate.
813:
814: The tree representation does not entirely follow C syntax,
815: because it is intended to support other languages as well.
816:
817: C data type analysis is also done in this pass, and every tree
818: node that represents an expression has a data type attached.
819: Variables are represented as declaration nodes.
820:
821: Constant folding and associative-law simplifications are also
822: done during this pass.
823:
824: The source files for parsing are `c-parse.y', `c-decl.c',
825: `c-typeck.c', `c-convert.c', `stor-layout.c', `fold-const.c',
826: and `tree.c'. The last three files are intended to be
827: language-independent. There are also header files `c-parse.h',
828: `c-tree.h', `tree.h' and `tree.def'. The last two define the
829: format of the tree representation.
830:
831: * RTL generation. This is the conversion of syntax tree into RTL
832: code. It is actually done statement-by-statement during
833: parsing, but for most purposes it can be thought of as a
834: separate pass.
835:
836: This is where the bulk of target-parameter-dependent code is
837: found, since often it is necessary for strategies to apply only
838: when certain standard kinds of instructions are available. The
839: purpose of named instruction patterns is to provide this
840: information to the RTL generation pass.
841:
842: Optimization is done in this pass for `if'-conditions that are
843: comparisons, boolean operations or conditional expressions.
844: Tail recursion is detected at this time also. Decisions are
845: made about how best to arrange loops and how to output `switch'
846: statements.
847:
848: The source files for RTL generation are `stmt.c', `expr.c',
849: `explow.c', `expmed.c', `optabs.c' and `emit-rtl.c'. Also, the
850: file `insn-emit.c', generated from the machine description by
851: the program `genemit', is used in this pass. The header file
852: `expr.h' is used for communication within this pass.
853:
854: The header files `insn-flags.h' and `insn-codes.h', generated
855: from the machine description by the programs `genflags' and
856: `gencodes', tell this pass which standard names are available
857: for use and which patterns correspond to them.
858:
859: Aside from debugging information output, none of the following
860: passes refers to the tree structure representation of the
861: function (only part of which is saved).
862:
863: The decision of whether the function can and should be expanded
864: inline in its subsequent callers is made at the end of RTL
865: generation. The function must meet certain criteria, currently
866: related to the size of the function and the types and number of
867: parameters it has. Note that this function may contain loops,
868: recursive calls to itself (tail-recursive functions can be
869: inlined!), gotos; in short, all constructs supported by GNU CC.
870:
871: The option `-dr' causes a debugging dump of the RTL code after
872: this pass. This dump file's name is made by appending `.rtl' to
873: the input file name.
874:
875: * Jump optimization. This pass simplifies jumps to the following
876: instruction, jumps across jumps, and jumps to jumps. It deletes
877: unreferenced labels and unreachable code, except that
878: unreachable code that contains a loop is not recognized as
879: unreachable in this pass. (Such loops are deleted later in the
880: basic block analysis.)
881:
882: Jump optimization is performed two or three times. The first
883: time is immediately following RTL generation. The second time
884: is after CSE, but only if CSE says repeated jump optimization is
885: needed. The last time is right before the final pass. That
886: time, cross-jumping and deletion of no-op move instructions are
887: done together with the optimizations described above.
888:
889: The source file of this pass is `jump.c'.
890:
891: The option `-dj' causes a debugging dump of the RTL code after
892: this pass is run for the first time. This dump file's name is
893: made by appending `.jump' to the input file name.
894:
895: * Register scan. This pass finds the first and last use of each
896: register, as a guide for common subexpression elimination. Its
897: source is in `regclass.c'.
898:
899: * Common subexpression elimination. This pass also does constant
900: propagation. Its source file is `cse.c'. If constant
901: propagation causes conditional jumps to become unconditional or
902: to become no-ops, jump optimization is run again when CSE is
903: finished.
904:
905: The option `-ds' causes a debugging dump of the RTL code after
906: this pass. This dump file's name is made by appending `.cse' to
907: the input file name.
908:
909: * Loop optimization. This pass moves constant expressions out of
910: loops. Its source file is `loop.c'.
911:
912: The option `-dL' causes a debugging dump of the RTL code after
913: this pass. This dump file's name is made by appending `.loop'
914: to the input file name.
915:
916: * Stupid register allocation is performed at this point in a
917: nonoptimizing compilation. It does a little data flow analysis
918: as well. When stupid register allocation is in use, the next
919: pass executed is the reloading pass; the others in between are
920: skipped. The source file is `stupid.c'.
921:
922: * Data flow analysis (`flow.c'). This pass divides the program
923: into basic blocks (and in the process deletes unreachable
924: loops); then it computes which pseudo-registers are live at each
925: point in the program, and makes the first instruction that uses
926: a value point at the instruction that computed the value.
927:
928: This pass also deletes computations whose results are never
929: used, and combines memory references with add or subtract
930: instructions to make autoincrement or autodecrement addressing.
931:
932: The option `-df' causes a debugging dump of the RTL code after
933: this pass. This dump file's name is made by appending `.flow'
934: to the input file name. If stupid register allocation is in
935: use, this dump file reflects the full results of such allocation.
936:
937: * Instruction combination (`combine.c'). This pass attempts to
938: combine groups of two or three instructions that are related by
939: data flow into single instructions. It combines the RTL
940: expressions for the instructions by substitution, simplifies the
941: result using algebra, and then attempts to match the result
942: against the machine description.
943:
944: The option `-dc' causes a debugging dump of the RTL code after
945: this pass. This dump file's name is made by appending
946: `.combine' to the input file name.
947:
948: * Register class preferencing. The RTL code is scanned to find
949: out which register class is best for each pseudo register. The
950: source file is `regclass.c'.
951:
952: * Local register allocation (`local-alloc.c'). This pass
953: allocates hard registers to pseudo registers that are used only
954: within one basic block. Because the basic block is linear, it
955: can use fast and powerful techniques to do a very good job.
956:
957: The option `-dl' causes a debugging dump of the RTL code after
958: this pass. This dump file's name is made by appending `.lreg'
959: to the input file name.
960:
961: * Global register allocation (`global-alloc.c'). This pass
962: allocates hard registers for the remaining pseudo registers
963: (those whose life spans are not contained in one basic block).
964:
965: * Reloading. This pass renumbers pseudo registers with the
966: hardware register numbers they were allocated. Pseudo
967: registers that did not get hard registers are replaced with
968: stack slots. Then it finds instructions that are invalid
969: because a value has failed to end up in a register, or has ended
970: up in a register of the wrong kind. It fixes up these
971: instructions by reloading the problematical values temporarily
972: into registers. Additional instructions are generated to do the
973: copying.
974:
975: Source files are `reload.c' and `reload1.c', plus the header
976: `reload.h' used for communication between them.
977:
978: The option `-dg' causes a debugging dump of the RTL code after
979: this pass. This dump file's name is made by appending `.greg'
980: to the input file name.
981:
982: * Jump optimization is repeated, this time including cross-jumping
1.1.1.2 ! root 983: and deletion of no-op move instructions.
1.1 root 984:
985: The option `-dJ' causes a debugging dump of the RTL code after
986: this pass. This dump file's name is made by appending `.jump2'
987: to the input file name.
988:
989: * Final. This pass outputs the assembler code for the function.
990: It is also responsible for identifying spurious test and compare
1.1.1.2 ! root 991: instructions. Machine-specific peephole optimizations are
! 992: performed at the same time. The function entry and exit
! 993: sequences are generated directly as assembler code in this pass;
! 994: they never exist as RTL.
1.1 root 995:
996: The source files are `final.c' plus `insn-output.c'; the latter
997: is generated automatically from the machine description by the
998: tool `genoutput'. The header file `conditions.h' is used for
999: communication between these files.
1000:
1001: * Debugging information output. This is run after final because
1002: it must output the stack slot offsets for pseudo registers that
1003: did not get hard registers. Source files are `dbxout.c' for DBX
1004: symbol table format and `symout.c' for GDB's own symbol table
1005: format.
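
To make one of the transformations in the list above concrete, here is a minimal sketch of simplifying jumps to jumps. It is not the algorithm in `jump.c': it assumes a toy representation in which each instruction is either an unconditional jump or not a jump at all, and all the names are hypothetical.

```c
#define NOT_A_JUMP (-1)

/* TARGET[i] is the index of the instruction that instruction i jumps
   to, or NOT_A_JUMP.  Every jump target is assumed to be a valid
   index below N.  Redirect each jump whose target is itself a jump
   so that it goes straight to the final destination.  */
void
thread_jumps (int *target, int n)
{
  int i;

  for (i = 0; i < n; i++)
    {
      int steps = 0;

      /* Follow the chain of jumps.  The step count bounds the walk
         so that a cycle of jumps cannot loop forever.  */
      while (target[i] != NOT_A_JUMP
             && target[target[i]] != NOT_A_JUMP
             && steps++ < n)
        target[i] = target[target[i]];
    }
}
```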
1006:
1007: Some additional files are used by all or many passes:
1008:
1009: * Every pass uses `machmode.def', which defines the machine modes.
1010:
1011: * All the passes that work with RTL use the header files `rtl.h'
1012: and `rtl.def', and subroutines in file `rtl.c'. The tools
1013: `gen*' also use these files to read and work with the machine
1014: description RTL.
1015:
1016: * Several passes refer to the header file `insn-config.h' which
1017: contains a few parameters (C macro definitions) generated
1018: automatically from the machine description RTL by the tool
1019: `genconfig'.
1020:
1021: * Several passes use the instruction recognizer, which consists of
1022: `recog.c' and `recog.h', plus the files `insn-recog.c' and
1023: `insn-extract.c' that are generated automatically from the
1024: machine description by the tools `genrecog' and `genextract'.
1025:
1026: * Several passes use the header files `regs.h' which defines the
1027: information recorded about pseudo register usage, and
1028: `basic-block.h' which defines the information recorded about
1029: basic blocks.
1030:
1031: * `hard-reg-set.h' defines the type `HARD_REG_SET', a bit-vector
1032: with a bit for each hard register, and some macros to manipulate
1033: it. This type is just `int' if the machine has few enough hard
1034: registers; otherwise it is an array of `int' and some of the
1035: macros expand into loops.
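
The array form of such a bit-vector can be sketched as follows. All the names here are hypothetical and are not the ones defined in `hard-reg-set.h'. The register count is an arbitrary assumption, and the sketch uses `unsigned int' so that the shifts are well defined.

```c
#include <limits.h>

#define SKETCH_NUM_HARD_REGS 70   /* assumed: too many for one int */
#define SKETCH_INT_BITS (sizeof (unsigned int) * CHAR_BIT)
#define SKETCH_SET_WORDS \
  ((SKETCH_NUM_HARD_REGS + SKETCH_INT_BITS - 1) / SKETCH_INT_BITS)

/* One bit for each hard register.  */
typedef unsigned int sketch_hard_reg_set[SKETCH_SET_WORDS];

/* Clearing the whole set expands into a loop over the array.  */
#define SKETCH_CLEAR_SET(set)                          \
  do {                                                 \
    unsigned int i_;                                   \
    for (i_ = 0; i_ < SKETCH_SET_WORDS; i_++)          \
      (set)[i_] = 0;                                   \
  } while (0)

/* Set or test the bit for register number REG.  */
#define SKETCH_SET_BIT(set, reg) \
  ((set)[(reg) / SKETCH_INT_BITS] |= 1u << ((reg) % SKETCH_INT_BITS))
#define SKETCH_TEST_BIT(set, reg) \
  (((set)[(reg) / SKETCH_INT_BITS] >> ((reg) % SKETCH_INT_BITS)) & 1u)
```

When the machine has few enough registers, the same macros could operate on a single integer with no loop at all.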
1036:
1037:
1038:
1039: File: gcc.info, Node: RTL, Next: Machine Desc, Prev: Passes, Up: Top
1040:
1041: RTL Representation
1042: ******************
1043:
1044: Most of the work of the compiler is done on an intermediate
1045: representation called register transfer language. In this language,
1046: the instructions to be output are described, pretty much one by one,
1047: in an algebraic form that describes what the instruction does.
1048:
1049: RTL is inspired by Lisp lists. It has both an internal form, made up
1050: of structures that point at other structures, and a textual form that
1051: is used in the machine description and in printed debugging dumps.
1052: The textual form uses nested parentheses to indicate the pointers in
1053: the internal form.
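
The relation between the two forms can be shown with a toy structure. This sketch is far simpler than the real RTL structures: the type and function names are hypothetical, machine modes are left out, and each expression has at most one number and two operands.

```c
#include <stdio.h>
#include <string.h>

/* A toy expression: a name, an optional number, and up to two
   pointers to operand expressions.  */
struct node
{
  const char *name;
  int has_number;
  int number;
  const struct node *operand[2];
};

/* Append the textual form of X to BUF, which is assumed to be large
   enough.  Each pointer in the internal form becomes one nested
   pair of parentheses in the textual form.  */
void
print_node (const struct node *x, char *buf)
{
  int i;

  sprintf (buf + strlen (buf), "(%s", x->name);
  if (x->has_number)
    sprintf (buf + strlen (buf), " %d", x->number);
  for (i = 0; i < 2 && x->operand[i] != NULL; i++)
    {
      strcat (buf, " ");
      print_node (x->operand[i], buf);
    }
  strcat (buf, ")");
}
```

A `set' node whose operands are a `reg' node numbered 1 and a `plus' node would be printed in the form `(set (reg 1) (plus ...))'.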
1054:
1055: * Menu:
1056:
1057: * RTL Objects:: Expressions vs vectors vs strings vs integers.
1058: * Accessors:: Macros to access expression operands or vector elts.
1059: * Flags:: Other flags in an RTL expression.
1060: * Machine Modes:: Describing the size and format of a datum.
1061: * Constants:: Expressions with constant values.
1062: * Regs and Memory:: Expressions representing register contents or memory.
1063: * Arithmetic:: Expressions representing arithmetic on other expressions.
1064: * Comparisons:: Expressions representing comparison of expressions.
1065: * Bit Fields:: Expressions representing bit-fields in memory or reg.
1066: * Conversions:: Extending, truncating, floating or fixing.
1067: * RTL Declarations:: Declaring volatility, constancy, etc.
1068: * Side Effects:: Expressions for storing in registers, etc.
1069: * Incdec:: Embedded side-effects for autoincrement addressing.
1070: * Assembler:: Representing `asm' with operands.
1071: * Insns:: Expression types for entire insns.
1072: * Calls:: RTL representation of function call insns.
1073: * Sharing:: Some expressions are unique; others *must* be copied.
1074:
1.1.1.2 ! root 1075: