File: internals, Node: Extensions, Next: Bugs, Prev: Incompatibilities, Up: Top

GNU Extensions to the C Language
********************************

GNU C provides several language features not found in ANSI standard C.
(The `-pedantic' option directs GNU CC to print a warning message if any of
these features is used.) To test for the availability of these features in
conditional compilation, check for a predefined macro `__GNUC__', which is
always defined under GNU CC.

* Menu:

* Statement Exprs::     Putting statements and declarations inside expressions.
* Naming Types::        Giving a name to the type of some expression.
* Typeof::              `typeof': referring to the type of an expression.
* Lvalues::             Using `?:', `,' and casts in lvalues.
* Conditionals::        Omitting the middle operand of a `?:' expression.
* Zero-Length::         Zero-length arrays.
* Variable-Length::     Arrays whose length is computed at run time.
* Subscripting::        Any array can be subscripted, even if not an lvalue.
* Pointer Arith::       Arithmetic on `void'-pointers and function pointers.
* Constructors::        Constructor expressions give structures, unions
                         or arrays as values.
* Dollar Signs::        Dollar sign is allowed in identifiers.
* Alignment::           Inquiring about the alignment of a type or variable.
* Inline::              Defining inline functions (as fast as macros).
* Extended Asm::        Assembler instructions with C expressions as operands.
                         (With them you can define ``built-in'' functions.)
* Asm Labels::          Specifying the assembler name to use for a C symbol.


File: internals, Node: Statement Exprs, Next: Naming Types, Prev: Extensions, Up: Extensions

Statements and Declarations inside of Expressions
=================================================

A compound statement in parentheses may appear inside an expression in GNU
C. This allows you to declare variables within an expression. For example:

     ({ int y = foo (); int z;
        if (y > 0) z = y;
        else z = - y;
        z; })

is a valid (though slightly more complex than necessary) expression for the
absolute value of `foo ()'.

This feature is especially useful in making macro definitions ``safe'' (so
that they evaluate each operand exactly once). For example, the
``maximum'' function is commonly defined as a macro in standard C as follows:

     #define max(a,b) ((a) > (b) ? (a) : (b))

But this definition computes either A or B twice, with bad results if the
operand has side effects. In GNU C, if you know the type of the operands
(here let's assume `int'), you can define the macro safely as follows:

     #define maxint(a,b) \
       ({int _a = (a), _b = (b); _a > _b ? _a : _b; })

Embedded statements are not allowed in constant expressions, such as the
value of an enumeration constant, the width of a bit field, or the initial
value of a static variable.

If you don't know the type of the operand, you can still do this, but you
must use `typeof' (*Note Typeof::.) or type naming (*Note Naming Types::.).


File: internals, Node: Naming Types, Next: Typeof, Prev: Statement Exprs, Up: Extensions

Naming an Expression's Type
===========================

You can give a name to the type of an expression using a `typedef'
declaration with an initializer. Here is how to define NAME as a type name
for the type of EXP:

     typedef NAME = EXP;

This is useful in conjunction with the statements-within-expressions
feature. Here is how the two together can be used to define a safe
``maximum'' macro that operates on any arithmetic type:

     #define max(a,b) \
       ({typedef _ta = (a), _tb = (b); \
         _ta _a = (a); _tb _b = (b); \
         _a > _b ? _a : _b; })

The reason for using names that start with underscores for the local
variables is to avoid conflicts with variable names that occur within the
expressions that are substituted for `a' and `b'. Eventually we hope to
design a new form of declaration syntax that allows you to declare
variables whose scopes start only after their initializers; this will be a
more reliable way to prevent such conflicts.


File: internals, Node: Typeof, Next: Lvalues, Prev: Naming Types, Up: Extensions

Referring to a Type with `typeof'
=================================

Another way to refer to the type of an expression is with `typeof'. The
syntax for using this keyword looks like `sizeof', but the construct acts
semantically like a type name defined with `typedef'.

There are two ways of writing the argument to `typeof': with an expression
or with a type. Here is an example with an expression:

     typeof (x[0](1))

This assumes that `x' is an array of functions; the type described is that
of the values of the functions.

Here is an example with a typename as the argument:

     typeof (int *)

Here the type described is that of pointers to `int'.

A `typeof'-construct can be used anywhere a typedef name could be used.
For example, you can use it in a declaration, in a cast, or inside of
`sizeof' or `typeof'.

   * This declares `y' with the type of what `x' points to.

          typeof (*x) y;

   * This declares `y' as an array of such values.

          typeof (*x) y[4];

   * This declares `y' as an array of pointers to characters:

          typeof (typeof (char *)[4]) y;

     It is equivalent to the following traditional C declaration:

          char *y[4];

     To see the meaning of the declaration using `typeof', and why it might
     be a useful way to write, let's rewrite it with these macros:

          #define pointer(T)  typeof(T *)
          #define array(T, N) typeof(T [N])

     Now the declaration can be rewritten this way:

          array (pointer (char), 4) y;

     Thus, `array (pointer (char), 4)' is the type of arrays of 4 pointers
     to `char'.


File: internals, Node: Lvalues, Next: Conditionals, Prev: Typeof, Up: Extensions

Generalized Lvalues
===================

Compound expressions, conditional expressions and casts are allowed as
lvalues provided their operands are lvalues. This means that you can take
their addresses or store values into them.

For example, a compound expression can be assigned, provided the last
expression in the sequence is an lvalue. These two expressions are
equivalent:

     (a, b) += 5
     a, (b += 5)

Similarly, the address of the compound expression can be taken. These two
expressions are equivalent:

     &(a, b)
     a, &b

A conditional expression is a valid lvalue if its type is not void and the
true and false branches are both valid lvalues. For example, these two
expressions are equivalent:

     (a ? b : c) = 5
     (a ? b = 5 : (c = 5))

A cast is a valid lvalue if its operand is an lvalue. Taking the address of
the cast is the same as taking the address without a cast, except for the
type of the result. For example, these two expressions are equivalent (but
the second may be valid when the type of `a' does not permit a cast to `int
*').

     &(int *)a
     (int **)&a

A simple assignment whose left-hand side is a cast works by converting the
right-hand side first to the specified type, then to the type of the inner
left-hand side expression. After this is stored, the value is converted
back to the specified type to become the value of the assignment. Thus, if
`a' has type `char *', the following two expressions are equivalent:

     (int)a = 5
     (int)(a = (char *)5)

An assignment-with-arithmetic operation such as `+=' applied to a cast
performs the arithmetic using the type resulting from the cast, and then
continues as in the previous case. Therefore, these two expressions are
equivalent:

     (int)a += 5
     (int)(a = (char *) ((int)a + 5))


File: internals, Node: Conditionals, Next: Zero-Length, Prev: Lvalues, Up: Extensions

Conditional Expressions with Omitted Middle-Operands
====================================================

The middle operand in a conditional expression may be omitted. Then if the
first operand is nonzero, its value is the value of the conditional
expression.

Therefore, the expression

     x ? : y

has the value of `x' if that is nonzero; otherwise, the value of `y'.

This example is perfectly equivalent to

     x ? x : y

In this simple case, the ability to omit the middle operand is not
especially useful. It becomes useful when the first operand does,
or may (if it is a macro argument), contain a side effect. Then repeating
the operand in the middle would perform the side effect twice. Omitting
the middle operand uses the value already computed without the undesirable
effects of recomputing it.
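
As a minimal sketch of the point about side effects, the following compares
the two spellings of the conditional. The names `calls', `next_value',
`pick_once', `pick_twice' and `check_omitted_operand' are hypothetical and
exist only for this illustration; the counter is there to make the number
of evaluations visible.

```c
#include <assert.h>

/* Hypothetical counter: records how often the operand is evaluated. */
static int calls = 0;

/* Hypothetical operand with a side effect: returns V and bumps the counter. */
static int next_value(int v)
{
    calls++;
    return v;
}

/* GNU C form: the first operand is evaluated once and reused as the result. */
static int pick_once(int v, int fallback)
{
    return next_value(v) ?: fallback;
}

/* Standard C form: the operand is written twice, so it may run twice. */
static int pick_twice(int v, int fallback)
{
    return next_value(v) ? next_value(v) : fallback;
}

/* Self-check of the behavior described above; aborts if it does not hold. */
static void check_omitted_operand(void)
{
    calls = 0;
    assert(pick_once(7, 3) == 7 && calls == 1);
    calls = 0;
    assert(pick_twice(7, 3) == 7 && calls == 2);
}
```

Both functions yield the same value; they differ only in how many times the
side effect is performed when the first operand is nonzero.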


File: internals, Node: Zero-Length, Next: Variable-Length, Prev: Conditionals, Up: Extensions

Arrays of Length Zero
=====================

Zero-length arrays are allowed in GNU C. They are very useful as the last
element of a structure which is really a header for a variable-length object:

     struct line {
       int length;
       char contents[0];
     };

     {
       struct line *thisline
         = (struct line *) malloc (sizeof (struct line) + this_length);
       thisline->length = this_length;
     }

In standard C, you would have to give `contents' a length of 1, which means
either you waste space or complicate the argument to `malloc'.


File: internals, Node: Variable-Length, Next: Subscripting, Prev: Zero-Length, Up: Extensions

Arrays of Variable Length
=========================

Variable-length automatic arrays are allowed in GNU C. These arrays are
declared like any other automatic arrays, but with a length that is not a
constant expression. The storage is allocated at that time and deallocated
when the brace-level is exited. For example:

     FILE *concat_fopen (char *s1, char *s2, char *mode)
     {
       char str[strlen (s1) + strlen (s2) + 1];
       strcpy (str, s1);
       strcat (str, s2);
       return fopen (str, mode);
     }

You can also define structure types containing variable-length arrays, and
use them even for arguments or function values, as shown here:

     int foo;

     struct entry
     {
       char data[foo];
     };

     struct entry
     tester (struct entry arg)
     {
       struct entry new;
       int i;
       for (i = 0; i < foo; i++)
         new.data[i] = arg.data[i] + 1;
       return new;
     }

(Eventually there will be a way to say that the size of the array is
another member of the same structure.)

The length of an array is computed on entry to the brace-level where the
array is declared and is remembered for the scope of the array in case you
access it with `sizeof'.

Jumping or breaking out of the scope of the array name will also deallocate
the storage. Jumping into the scope is not allowed; you will get an error
message for it.

You can use the function `alloca' to get an effect much like
variable-length arrays. The function `alloca' is available in many other C
implementations (but not in all). On the other hand, variable-length
arrays are more elegant.

There are other differences between these two methods. Space allocated
with `alloca' exists until the containing *function* returns. The space
for a variable-length array is deallocated as soon as the array name's
scope ends. (If you use both variable-length arrays and `alloca' in the
same function, deallocation of a variable-length array will also deallocate
anything more recently allocated with `alloca'.)
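
The following is a small sketch of two points made above: the array length
is an ordinary run-time expression, and `sizeof' reports the remembered
length. The function names `joined_length' and `bytes_for' are hypothetical,
chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: concatenate S1 and S2 in a buffer whose length is
   computed at run time, and return the length of the result. */
static size_t joined_length(const char *s1, const char *s2)
{
    char str[strlen(s1) + strlen(s2) + 1];   /* variable-length array */
    strcpy(str, s1);
    strcat(str, s2);
    return strlen(str);
}   /* the storage for str goes away when this brace level is exited */

/* Hypothetical helper: sizeof applied to a variable-length array yields
   the size computed when the array was declared. */
static size_t bytes_for(int n)
{
    assert(n > 0);          /* this sketch only handles positive lengths */
    int values[n];
    return sizeof values;
}
```

Because the buffer in `joined_length' lives only inside the function, the
sketch returns a length rather than a pointer into the array.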


File: internals, Node: Subscripting, Next: Pointer Arith, Prev: Variable-Length, Up: Extensions

Non-Lvalue Arrays May Have Subscripts
=====================================

Subscripting is allowed on arrays that are not lvalues, even though the
unary `&' operator is not. For example, this is valid in GNU C though not
valid in other C dialects:

     struct foo {int a[4];};

     struct foo f();

     bar (int index)
     {
       return f().a[index];
     }


File: internals, Node: Pointer Arith, Next: Initializers, Prev: Subscripting, Up: Extensions

Arithmetic on `void'-Pointers and Function Pointers
===================================================

In GNU C, addition and subtraction operations are supported on pointers to
`void' and on pointers to functions. This is done by treating the size of
a `void' or of a function as 1.

A consequence of this is that `sizeof' is also allowed on `void' and on
function types, and returns 1.


File: internals, Node: Initializers, Next: Constructors, Prev: Pointer Arith, Up: Extensions

Non-Constant Initializers
=========================

The elements of an aggregate initializer are not required to be constant
expressions in GNU C. Here is an example of an initializer with run-time
varying elements:

     foo (float f, float g)
     {
       float beat_freqs[2] = { f-g, f+g };
       ...
     }


File: internals, Node: Constructors, Next: Dollar Signs, Prev: Initializers, Up: Extensions

Constructor Expressions
=======================

GNU C supports constructor expressions. A constructor looks like a cast
containing an initializer. Its value is an object of the type specified in
the cast, containing the elements specified in the initializer. The type
must be a structure, union or array type.

Assume that `struct foo' and `structure' are declared as shown:

     struct foo {int a; char b[2];} structure;

Here is an example of constructing a `struct foo' with a constructor:

     structure = ((struct foo) {x + y, 'a', 0});

This is equivalent to writing the following:

     {
       struct foo temp = {x + y, 'a', 0};
       structure = temp;
     }

You can also construct an array. If all the elements of the constructor
are (made up of) simple constant expressions, suitable for use in
initializers, then the constructor is an lvalue and can be coerced to a
pointer to its first element, as shown here:

     char **foo = (char *[]) { "x", "y", "z" };

Array constructors whose elements are not simple constants are not very
useful, because the constructor is not an lvalue. There are only two valid
ways to use it: to subscript it, or initialize an array variable with it.
The former is probably slower than a `switch' statement, while the latter
does the same thing an ordinary C initializer would do.

     output = ((int[]) { 2, x, 28 }) [input];


File: internals, Node: Dollar Signs, Next: Alignment, Prev: Constructors, Up: Extensions

Dollar Signs in Identifier Names
================================

In GNU C, you may use dollar signs in identifier names. This is because
many traditional C implementations allow such identifiers.


File: internals, Node: Alignment, Next: Inline, Prev: Dollar Signs, Up: Extensions

Inquiring about the Alignment of a Type or Variable
===================================================

The keyword `__alignof' allows you to inquire about how an object is
aligned, or the minimum alignment usually required by a type. Its syntax
is just like `sizeof'.

For example, if the target machine requires a `double' value to be aligned
on an 8-byte boundary, then `__alignof (double)' is 8. This is true on
many RISC machines. On more traditional machine designs, `__alignof
(double)' is 4 or even 2.

Some machines never actually require alignment; they allow reference to any
data type even at an odd address. For these machines, `__alignof'
reports the *recommended* alignment of a type.

When the operand of `__alignof' is an lvalue rather than a type, the value
is the largest alignment that the lvalue is known to have. It may have
this alignment as a result of its data type, or because it is part of a
structure and inherits alignment from that structure. For example, after
this declaration:

     struct foo { int x; char y; } foo1;

the value of `__alignof (foo1.y)' is probably 2 or 4, the same as
`__alignof (int)', even though the data type of `foo1.y' does not itself
demand any alignment.


File: internals, Node: Inline, Next: Extended Asm, Prev: Alignment, Up: Extensions

An Inline Function is As Fast As a Macro
========================================

By declaring a function `inline', you can direct GNU CC to integrate that
function's code into the code for its callers. This makes execution faster
by eliminating the function-call overhead; in addition, if any of the
actual argument values are constant, their known values may permit
simplifications at compile time so that not all of the inline function's
code needs to be included.

To declare a function inline, use the `inline' keyword in its declaration,
like this:

     inline int
     inc (int *a)
     {
       return (*a)++;
     }

You can also make all ``simple enough'' functions inline with the option
`-finline-functions'. Note that certain usages in a function definition
can make it unsuitable for inline substitution.

When a function is both inline and `static', if all calls to the function
are integrated into the caller, then the function's own assembler code is
never referenced. In this case, GNU CC does not actually output assembler
code for the function, unless you specify the option
`-fkeep-inline-functions'. Some calls cannot be integrated for various
reasons (in particular, calls that precede the function's definition cannot
be integrated, and neither can recursive calls within the definition). If
there is a nonintegrated call, then the function is compiled to assembler
code as usual.

When an inline function is not `static', then the compiler must assume that
there may be calls from other source files; since a global symbol can be
defined only once in any program, the function must not be defined in the
other source files, so the calls therein cannot be integrated. Therefore,
a non-`static' inline function is always compiled on its own in the usual
fashion.
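
Here is a brief sketch of the `static' and `inline' combination discussed
above. The caller `count_up' is hypothetical and is placed after the
definition of `inc', since calls that precede the definition cannot be
integrated. Whether a given compiler version actually integrates the calls
is not something this sketch can show; it illustrates only the form of the
declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Both `static' and `inline': if every call is integrated, no separate
   assembler code for inc needs to be output. */
static inline int inc(int *a)
{
    assert(a != NULL);
    return (*a)++;      /* yields the old value, then increments */
}

/* Hypothetical caller, defined after inc so its calls can be integrated. */
static int count_up(int n)
{
    int i = 0;
    while (i < n)
        inc(&i);
    return i;
}
```

The function behaves the same whether or not it is integrated; `inline'
affects only how the code is generated.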


File: internals, Node: Extended Asm, Next: Asm Labels, Prev: Inline, Up: Extensions

Assembler Instructions with C Expression Operands
=================================================

In an assembler instruction using `asm', you can now specify the operands
of the instruction using C expressions. This means no more guessing which
registers or memory locations will contain the data you want to use.

You must specify an assembler instruction template much like what appears
in a machine description, plus an operand constraint string for each operand.

For example, here is how to use the 68881's `fsinx' instruction:

     asm ("fsinx %1,%0" : "=f" (result) : "f" (angle));

Here `angle' is the C expression for the input operand while `result' is
that of the output operand. Each has `"f"' as its operand constraint,
saying that a floating-point register is required. The constraints use the
same language used in the machine description (*Note Constraints::.).

Each operand is described by an operand-constraint string followed by the C
expression in parentheses. A colon separates the assembler template from
the first output operand, and another separates the last output operand
from the first input, if any. Commas separate output operands and separate
inputs. The number of operands is limited to the maximum number of
operands in any instruction pattern in the machine description.

Output operand expressions must be lvalues, and there must be at least one
of them. The compiler can check this. The input operands need not be
lvalues, and there need not be any. The compiler cannot check whether the
operands have data types that are reasonable for the instruction being
executed.

The output operands must be write-only; GNU CC will assume that the values
in these operands before the instruction are dead and need not be
generated. For an operand that is read-write, you must logically split its
function into two separate operands, one input operand and one write-only
output operand. The connection between them is expressed by constraints
which say they need to be in the same location when the instruction
executes. You can use the same C expression for both operands, or
different expressions. For example, here we write the (fictitious)
`combine' instruction with `bar' as its read-only source operand and `foo'
as its read-write destination:

     asm ("combine %2,%0" : "=r" (foo) : "0" (foo), "g" (bar));

The constraint `"0"' for operand 1 says that it must occupy the same
location as operand 0. Therefore it is not necessary to substitute operand
1 into the assembler code output.

Usually the most convenient way to use these `asm' instructions is to
encapsulate them in macros that look like functions. For example,

     #define sin(x) \
     ({ double __value, __arg = (x); \
        asm ("fsinx %1,%0": "=f" (__value): "f" (__arg)); \
        __value; })

Here the variable `__arg' is used to make sure that the instruction
operates on a proper `double' value, and to accept only those arguments `x'
which can convert automatically to a `double'.

Another way to make sure the instruction operates on the correct data type
is to use a cast in the `asm'. This is different from using a variable
`__arg' in that it converts more different types. For example, if the
desired type were `int', casting the argument to `int' would accept a
pointer with no complaint, while assigning the argument to an `int'
variable named `__arg' would warn about using a pointer unless the caller
explicitly casts it.

GNU CC assumes for optimization purposes that these instructions have no
side effects except to change the output operands. This does not mean that
instructions with a side effect cannot be used, but you must be careful,
because the compiler may eliminate them if the output operands aren't used,
or move them out of loops, or replace two with one if they constitute a
common subexpression. Also, if your instruction does have a side effect on
a variable that otherwise appears not to change, the old value of the
variable may be reused later if it happens to be found in a register.

You can prevent an `asm' instruction from being deleted, moved or combined
by writing the keyword `volatile' after the `asm'. For example:

     #define set_priority(x) \
     asm volatile ("set_priority %1": \
                   "=m" (*(char *)0): "g" (x))

Note that we have supplied an output operand which is not actually used in
the instruction. This is because `asm' requires at least one output
operand. This requirement exists for internal implementation reasons and
we might be able to relax it in the future.

In this case the output operand has the additional beneficial effect of
giving the appearance of writing in memory. As a result, GNU CC will assume
that data previously fetched from memory must be fetched again if needed
again later. This may be desirable if you have not employed the `volatile'
keyword on all the variable declarations that ought to have it.
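
The operand syntax can be tried without any particular machine's
instructions by giving an empty template, as in this sketch. It assumes a
compiler that accepts the alternate spellings `__asm__' and `__volatile__'
(used here so the code also compiles when the plain `asm' keyword is
disabled) and the register constraint `"r"'. The names `pass_through',
`pass_through_once' and `check_pass_through' are hypothetical.

```c
#include <assert.h>

/* Hypothetical macro in the style of the `sin' example.  The template is
   empty, so no instruction is emitted.  Constraint "0" places the input
   operand in the same location as output operand 0, so the value simply
   passes through.  The volatile keyword asks the compiler not to delete,
   move or combine the asm. */
#define pass_through(x) \
  ({ int value_, arg_ = (x); \
     __asm__ __volatile__ ("" : "=r" (value_) : "0" (arg_)); \
     value_; })

/* Hypothetical wrapper so the macro can be called like a function. */
static int pass_through_once(int x)
{
    return pass_through(x);
}

/* Self-check: the macro argument is evaluated exactly once. */
static void check_pass_through(void)
{
    int n = 5;
    int got = pass_through(n++);
    assert(got == 5 && n == 6);
}
```

As with `sin' above, the local variable `arg_' ensures that the macro
argument is evaluated once and converted to the intended type before it
becomes an operand.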


File: internals, Node: Asm Labels, Prev: Extended Asm, Up: Extensions

Controlling Names Used in Assembler Code
========================================

You can specify the name to be used in the assembler code for a C function
or variable by writing the `asm' keyword after the declarator as follows:

     int foo asm ("myfoo") = 2;

This specifies that the name to be used for the variable `foo' in the
assembler code should be `myfoo' rather than the usual `_foo'.

On systems where an underscore is normally prepended to the name of a C
function or variable, this feature allows you to define names for the
linker that do not start with an underscore.

You cannot use `asm' in this way in a function *definition*; but you can
get the same effect by writing a declaration for the function before its
definition and putting `asm' there, like this:

     extern func () asm ("FUNC");

     func (x, y)
          int x, y;
     ...

It is up to you to make sure that the assembler names you choose do not
conflict with any other assembler symbols. Also, you must not use a
register name; that would produce completely invalid assembler code. GNU
CC does not as yet have the ability to store static variables in registers.
Perhaps that will be added.


File: internals, Node: Bugs, Next: Portability, Prev: Extensions, Up: Top

Reporting Bugs
**************

Your bug reports play an essential role in making GNU CC reliable.

Reporting a bug may help you by bringing a solution to your problem, or it
may not. But in any case the important function of a bug report is to help
the entire community by making the next version of GNU CC work better. Bug
reports are your contribution to the maintenance of GNU CC.

In order for a bug report to serve its purpose, you must include the
information that makes it possible to fix the bug.

* Menu:

* Criteria: Bug Criteria.     Have you really found a bug?
* Reporting: Bug Reporting.   How to report a bug effectively.


File: internals, Node: Bug Criteria, Next: Bug Reporting, Prev: Bugs, Up: Bugs

Have You Found a Bug?
=====================

If you are not sure whether you have found a bug, here are some guidelines:

   * If the compiler gets a fatal signal, for any input whatever, that is a
     compiler bug. Reliable compilers never crash.

   * If the compiler produces invalid assembly code, for any input whatever
     (except an `asm' statement), that is a compiler bug, unless the
     compiler reports errors (not just warnings) which would ordinarily
     prevent the assembler from being run.

   * If the compiler produces valid assembly code that does not correctly
     execute the input source code, that is a compiler bug.

     However, you must double-check to make sure, because you may have run
     into an incompatibility between GNU C and traditional C (*Note
     Incompatibilities::.). These incompatibilities might be considered
     bugs, but they are inescapable consequences of valuable features.

     Or you may have a program whose behavior is undefined, which happened
     by chance to give the desired results with another C compiler.

     For example, in many nonoptimizing compilers, you can write `x;' at
     the end of a function instead of `return x;', with the same results.
     But the value of the function is undefined if `return' is omitted; it
     is not a bug when GNU CC produces different results.

     Problems often result from expressions with two increment operators,
     as in `f (*p++, *p++)'. Your previous compiler might have interpreted
     that expression the way you intended; GNU CC might interpret it
     another way; neither compiler is wrong.

     After you have localized the error to a single source line, it should
     be easy to check for these things. If your program is correct and
     well defined, you have found a compiler bug.

   * If the compiler produces an error message for valid input, that is a
     compiler bug.

     Note that the following is not valid input, and the error message for
     it is not a bug:

          int foo (char);

          int
          foo (x)
               char x;
          { ... }

     The prototype says to pass a `char', while the definition says to pass
     an `int' and treat the value as a `char'. This is what the ANSI
     standard says, and it makes sense.

   * If the compiler does not produce an error message for invalid input,
     that is a compiler bug. However, you should note that your idea of
     ``invalid input'' might be my idea of ``an extension'' or ``support
     for traditional practice''.

   * If you are an experienced user of C compilers, your suggestions for
     improvement of GNU CC are welcome in any case.


File: internals, Node: Bug Reporting, Prev: Bug Criteria, Up: Bugs

How to Report Bugs
==================

Send bug reports for GNU C to one of these addresses:

     [email protected]
     {ucbvax|mit-eddie|uunet}!prep.ai.mit.edu!bug-gcc

As a last resort, snail them to:

     GNU Compiler Bugs
     545 Tech Sq
     Cambridge, MA 02139

The fundamental principle of reporting bugs usefully is this: *report all
the facts*. If you are not sure whether to mention a fact or leave it out,
mention it!

Often people omit facts because they think they know what causes the
problem and they conclude that some details don't matter. Thus, you might
assume that the name of the variable you use in an example does not matter.
Well, probably it doesn't, but one cannot be sure. Perhaps the bug is a
stray memory reference which happens to fetch from the location where that
name is stored in memory; perhaps, if the name were different, the contents
of that location would fool the compiler into doing the right thing despite
the bug. Play it safe and give an exact example.

If you want to enable me to fix the bug, you should include all these things:

   * The version of GNU CC. You can get this by running it with the `-v'
     option.

     Without this, I won't know whether there is any point in looking for
     the bug in the current version of GNU CC.

   * A complete input file that will reproduce the bug. If the bug is in
     the C preprocessor, send me a source file and any header files that it
     requires. If the bug is in the compiler proper (`cc1'), run your
     source file through the C preprocessor by doing `gcc -E SOURCEFILE >
     OUTFILE', then include the contents of OUTFILE in the bug report.
     (Any `-I', `-D' or `-U' options that you used in actual compilation
     should also be used when doing this.)

     A single statement is not enough of an example. In order to compile
     it, it must be embedded in a function definition; and the bug might
     depend on the details of how this is done.

     Without a real example I can compile, all I can do about your bug
     report is wish you luck. It would be futile to try to guess how to
     provoke the bug. For example, bugs in register allocation and
     reloading frequently depend on every little detail of the function
     they happen in.

   * The command arguments you gave GNU CC to compile that example and
     observe the bug. For example, did you use `-O'? To guarantee you
     won't omit something important, list them all.

     If I were to try to guess the arguments, I would probably guess wrong
     and then I would not encounter the bug.

   * The names of the files that you used for `tm.h' and `md' when you
     installed the compiler.

   * The type of machine you are using, and the operating system name and
     version number.

   * A description of what behavior you observe that you believe is
     incorrect. For example, ``It gets a fatal signal,'' or, ``There is an
     incorrect assembler instruction in the output.''

     Of course, if the bug is that the compiler gets a fatal signal, then I
     will certainly notice it. But if the bug is incorrect output, I might
     not notice unless it is glaringly wrong. I won't study all the
     assembler code from a 50-line C program just on the off chance that it
     might be wrong.

     Even if the problem you experience is a fatal signal, you should still
     say so explicitly. Suppose something strange is going on, such as,
     your copy of the compiler is out of synch, or you have encountered a
     bug in the C library on your system. (This has happened!) Your copy
     might crash and mine would not. If you told me to expect a crash,
     then when mine fails to crash, I would know that the bug was not
     happening for me. If you had not told me to expect a crash, then I
     would not be able to draw any conclusion from my observations.

     In cases where GNU CC generates incorrect code, if you send me a small
     complete sample program I will find the error myself by running the
     program under a debugger.
     If you send me a large example or a part of
     a larger program, I cannot do this; you must debug the compiled
     program and narrow the problem down to one source line. Tell me which
     source line it is, and what you believe is incorrect about the code
     generated for that line.

   * If you send me examples of output from GNU CC, please use `-g' when
     you make them. The debugging information includes source line numbers
     which are essential for correlating the output with the input.

Here are some things that are not necessary:

   * A description of the envelope of the bug.

     Often people who encounter a bug spend a lot of time investigating
     which changes to the input file will make the bug go away and which
     changes will not affect it.

     This is often time consuming and not very useful, because the way I
     will find the bug is by running a single example under the debugger
     with breakpoints, not by pure deduction from a series of examples.

     Of course, it can't hurt if you can find a simpler example that
     triggers the same bug. Errors in the output will be easier to spot,
     running under the debugger will take less time, etc. An easy way to
     simplify an example is to delete all the function definitions except
     the one where the bug occurs. Those earlier in the file may be
     replaced by external declarations.

     However, simplification is not necessary; if you don't want to do
     this, report the bug anyway.

   * A patch for the bug.

     A patch for the bug does help me if it is a good one. But don't omit
     the necessary information, such as the test case, because I might see
     problems with your patch and decide to fix the problem another way.

     Sometimes with a program as complicated as GNU CC it is very hard to
     construct an example that will make the program go through a certain
     point in the code.
     If you don't send me the example, I won't be able
     to verify that the bug is fixed.

   * A guess about what the bug is or what it depends on.

     Such guesses are usually wrong. Even I can't guess right about such
     things without using the debugger to find the facts. They also don't
     serve a useful purpose.


File: internals, Node: Portability, Next: Interface, Prev: Bugs, Up: Top

GNU CC and Portability
**********************

The main goal of GNU CC was to make a good, fast compiler for machines in
the class that the GNU system aims to run on: 32-bit machines that address
8-bit bytes and have several general registers. Elegance, theoretical
power and simplicity are only secondary.

GNU CC gets most of the information about the target machine from a machine
description which gives an algebraic formula for each of the machine's
instructions. This is a very clean way to describe the target. But when
the compiler needs information that is difficult to express in this
fashion, I have not hesitated to define an ad-hoc parameter to the machine
description. The purpose of portability is to reduce the total work needed
on the compiler; it was not of interest for its own sake.

GNU CC does not contain machine dependent code, but it does contain code
that depends on machine parameters such as endianness (whether the most
significant byte has the highest or lowest address of the bytes in a word)
and the availability of autoincrement addressing. In the RTL-generation
pass, it is often necessary to have multiple strategies for generating code
for a particular kind of syntax tree, strategies that are usable for
different combinations of parameters.
Often I have not tried to address
all possible cases, but only the common ones or only the ones that I have
encountered. As a result, a new target may require additional strategies.
You will know if this happens because the compiler will call `abort'.
Fortunately, the new strategies can be added in a machine-independent
fashion, and will affect only the target machines that need them.


File: internals, Node: Interface, Next: Passes, Prev: Portability, Up: Top

Interfacing to GNU CC Output
****************************

GNU CC is normally configured to use the same function calling convention
normally in use on the target system. This is done with the
machine-description macros described (*Note Machine Macros::.).

However, returning of structure and union values is done differently. As a
result, functions compiled with PCC returning such types cannot be called
from code compiled with GNU CC, and vice versa. This usually does not
cause trouble because the Unix library routines don't return structures and
unions.

Structures and unions that are 1, 2, 4 or 8 bytes long are returned in the
same registers used for `int' or `double' return values. (GNU CC typically
allocates variables of such types in registers also.) Structures and
unions of other sizes are returned by storing them into an address passed
by the caller in a register. This method is faster than the one normally
used by PCC and is also reentrant. The register used for passing the
address is specified by the machine-description macro `STRUCT_VALUE_REGNUM'.

GNU CC always passes arguments on the stack. At some point it will be
extended to pass arguments in registers, for machines which use that as the
standard calling convention.
This will make it possible to use such a
convention on other machines as well. However, that would render it
completely incompatible with PCC. We will probably do this once we have a
complete GNU system so we can compile the libraries with GNU CC.

If you use `longjmp', beware of automatic variables. ANSI C says that
automatic variables that are not declared `volatile' have undefined values
after a `longjmp'. And this is all GNU CC promises to do, because it is
very difficult to restore register variables correctly, and one of GNU CC's
features is that it can put variables in registers without your asking it to.

If you want a variable to be unaltered by `longjmp', and you don't want to
write `volatile' because old C compilers don't accept it, just take the
address of the variable. If a variable's address is ever taken, even if
just to compute it and ignore it, then the variable cannot go in a register:

     {
       int careful;
       &careful;
       ...
     }

Code compiled with GNU CC may call certain library routines. The routines
needed on the Vax and 68000 are in the file `gnulib.c'. You must compile
this file with the standard C compiler, not with GNU CC, and then link it
with each program you compile with GNU CC. (In actuality, many programs
will not need it.) The usual function call interface is used for calling
the library routines. Some standard parts of the C library, such as
`bcopy', are also called automatically.

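To illustrate the `longjmp' caveat discussed earlier in this node, here is
a minimal sketch, assuming a compiler that accepts ANSI `volatile'. The
names `env', `jump_back' and `counter_after_longjmp' are invented for this
illustration and do not come from GNU CC. The automatic variable is
modified between the `setjmp' call and the `longjmp', and because it is
declared `volatile' its value is well defined afterward.

```c
#include <setjmp.h>

/* Hypothetical name: the saved context used by this demonstration. */
static jmp_buf env;

/* Jump back to the setjmp call in counter_after_longjmp. */
static void jump_back(void)
{
    longjmp(env, 1);
}

/* Return the value of an automatic variable that was changed
   between setjmp and longjmp. */
int counter_after_longjmp(void)
{
    /* volatile keeps the variable out of a register, so the store
       below is still visible once longjmp has returned control. */
    volatile int careful = 1;

    if (setjmp(env) == 0) {
        careful = 2;    /* modified after setjmp was called */
        jump_back();    /* does not return here */
    }

    /* Reached only on the second return from setjmp. */
    return careful;
}
```

Calling `counter_after_longjmp ()' returns 2. Without `volatile', the value
of `careful' after the jump would not be something a program could rely on.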