1.1 root 1: Info file gcc.info, produced by Makeinfo, -*- Text -*- from input
2: file gcc.texinfo.
3:
4: This file documents the use and the internals of the GNU compiler.
5:
1.1.1.4 root 6: Copyright (C) 1988, 1989, 1990 Free Software Foundation, Inc.
1.1 root 7:
8: Permission is granted to make and distribute verbatim copies of this
9: manual provided the copyright notice and this permission notice are
10: preserved on all copies.
11:
12: Permission is granted to copy and distribute modified versions of
13: this manual under the conditions for verbatim copying, provided also
1.1.1.4 root 14: that the sections entitled "GNU General Public License" and "Protect
15: Your Freedom--Fight `Look And Feel'" are included exactly as in the
16: original, and provided that the entire resulting derived work is
17: distributed under the terms of a permission notice identical to this
18: one.
1.1 root 19:
20: Permission is granted to copy and distribute translations of this
21: manual into another language, under the above conditions for modified
1.1.1.4 root 22: versions, except that the sections entitled "GNU General Public
23: License" and "Protect Your Freedom--Fight `Look And Feel'" and this
24: permission notice may be included in translations approved by the
25: Free Software Foundation instead of in the original English.
26:
27:
28:
1.1.1.5 ! root 29: File: gcc.info, Node: Zero-Length, Next: Variable-Length, Prev: Conditionals, Up: Extensions
! 30:
! 31: Arrays of Length Zero
! 32: =====================
! 33:
! 34: Zero-length arrays are allowed in GNU C. They are very useful as the
! 35: last element of a structure which is really a header for a
! 36: variable-length object:
! 37:
! 38: struct line {
! 39: int length;
! 40: char contents[0];
! 41: };
! 42:
! 43: {
! 44: struct line *thisline
! 45: = (struct line *) malloc (sizeof (struct line) + this_length);
! 46: thisline->length = this_length;
! 47: }
! 48:
! 49: In standard C, you would have to give `contents' a length of 1, which
! 50: means either you waste space or complicate the argument to `malloc'.
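
As a slightly fuller sketch of the same idea, the following allocates the header and the text in one block and copies the text in. The helper name `make_line' and the extra byte for the terminating null are assumptions made for illustration; they do not come from the example above.

```c
#include <stdlib.h>
#include <string.h>

struct line {
  int length;
  char contents[0];     /* GNU C extension: zero-length trailing array */
};

/* Hypothetical helper: build a `struct line' holding a copy of TEXT.
   Returns a null pointer if the allocation fails.  */
struct line *
make_line (const char *text)
{
  int this_length = (int) strlen (text);
  /* The header contributes no bytes for `contents', so the payload
     size (plus one byte for the terminating null) is simply added.  */
  struct line *thisline
    = (struct line *) malloc (sizeof (struct line) + this_length + 1);

  if (thisline == 0)
    return 0;
  thisline->length = this_length;
  memcpy (thisline->contents, text, this_length + 1);
  return thisline;
}
```

The caller releases the whole object, header and contents together, with a single call to `free'.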
! 51:
! 52:
! 53:
1.1.1.4 root 54: File: gcc.info, Node: Variable-Length, Next: Subscripting, Prev: Zero-Length, Up: Extensions
55:
56: Arrays of Variable Length
57: =========================
58:
59: Variable-length automatic arrays are allowed in GNU C. These arrays
60: are declared like any other automatic arrays, but with a length that
61: is not a constant expression. The storage is allocated at that time
62: and deallocated when the brace-level is exited. For example:
63:
64: FILE *concat_fopen (char *s1, char *s2, char *mode)
65: {
66: char str[strlen (s1) + strlen (s2) + 1];
67: strcpy (str, s1);
68: strcat (str, s2);
69: return fopen (str, mode);
70: }
71:
72: You can also use variable-length arrays as arguments to functions:
73:
74: struct entry
75: tester (int len, char data[len])
76: {
77: ...
78: }
79:
80: The length of an array is computed on entry to the brace-level where
81: the array is declared and is remembered for the scope of the array in
82: case you access it with `sizeof'.
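
A minimal sketch of the `sizeof' behavior described above; the function name `scratch_bytes' is hypothetical. The array lives only inside the inner braces, and `sizeof' yields its run-time size.

```c
#include <stddef.h>

/* Hypothetical helper: report how many bytes an N-element
   variable-length array of int occupies.  N must be positive.  */
size_t
scratch_bytes (int n)
{
  size_t bytes;

  {
    int scratch[n];             /* length is not a constant expression */
    bytes = sizeof scratch;     /* evaluated at run time */
  }                             /* storage for `scratch' is released here */

  return bytes;
}
```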
83:
84: Jumping or breaking out of the scope of the array name will also
85: deallocate the storage. Jumping into the scope is not allowed; you
86: will get an error message for it.
87:
88: You can use the function `alloca' to get an effect much like
89: variable-length arrays. The function `alloca' is available in many
90: other C implementations (but not in all). On the other hand,
91: variable-length arrays are more elegant.
92:
93: There are other differences between these two methods. Space
94: allocated with `alloca' exists until the containing *function* returns.
95: The space for a variable-length array is deallocated as soon as the
96: array name's scope ends. (If you use both variable-length arrays and
97: `alloca' in the same function, deallocation of a variable-length
98: array will also deallocate anything more recently allocated with
99: `alloca'.)
100:
101:
102:
103: File: gcc.info, Node: Subscripting, Next: Pointer Arith, Prev: Variable-Length, Up: Extensions
104:
105: Non-Lvalue Arrays May Have Subscripts
106: =====================================
107:
108: Subscripting is allowed on arrays that are not lvalues, even though
109: the unary `&' operator is not. For example, this is valid in GNU C
110: though not valid in other C dialects:
111:
112: struct foo {int a[4];};
113:
114: struct foo f();
115:
116: bar (int index)
117: {
118: return f().a[index];
119: }
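
The fragment above can be completed into something that compiles on its own. The body of `f' and the values it returns are assumptions chosen for illustration.

```c
struct foo {int a[4];};

/* Returns a structure by value; the result of the call is not an lvalue.  */
struct foo
f (void)
{
  struct foo result = {{10, 20, 30, 40}};
  return result;
}

int
bar (int index)
{
  /* Subscript the array member of the returned structure directly.  */
  return f ().a[index];
}
```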
120:
121:
122:
123: File: gcc.info, Node: Pointer Arith, Next: Initializers, Prev: Subscripting, Up: Extensions
124:
125: Arithmetic on `void'-Pointers and Function Pointers
126: ===================================================
127:
128: In GNU C, addition and subtraction operations are supported on
129: pointers to `void' and on pointers to functions. This is done by
130: treating the size of a `void' or of a function as 1.
131:
132: A consequence of this is that `sizeof' is also allowed on `void' and
133: on function types, and returns 1.
134:
135: The option `-Wpointer-arith' requests a warning if these extensions
136: are used.
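
A short sketch of both consequences, assuming the compiler is run in its default GNU mode without options that turn these extensions into errors. The function names are hypothetical.

```c
#include <stddef.h>

/* Advance a generic pointer by COUNT bytes.  In GNU C the addition
   is performed as if `void' had size 1.  */
void *
skip_bytes (void *base, size_t count)
{
  return base + count;
}

/* `sizeof (void)' is accepted in GNU C.  */
size_t
size_of_void (void)
{
  return sizeof (void);
}
```

Code that must be portable to other compilers should cast to `char *' before doing the arithmetic.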
137:
138:
139:
140: File: gcc.info, Node: Initializers, Next: Constructors, Prev: Pointer Arith, Up: Extensions
141:
142: Non-Constant Initializers
143: =========================
144:
145: The elements of an aggregate initializer for an automatic variable
146: are not required to be constant expressions in GNU C. Here is an
147: example of an initializer with run-time varying elements:
148:
149: foo (float f, float g)
150: {
151: float beat_freqs[2] = { f-g, f+g };
152: ...
153: }
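
Here is one way the elided body might be filled in, purely for illustration; the function name `beat_span' and what it returns are assumptions.

```c
/* Hypothetical completion: build the array from run-time values and
   return the distance between its two elements.  */
float
beat_span (float f, float g)
{
  float beat_freqs[2] = { f-g, f+g };   /* elements are not constants */

  return beat_freqs[1] - beat_freqs[0];
}
```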
154:
155:
156:
157: File: gcc.info, Node: Constructors, Next: Function Attributes, Prev: Initializers, Up: Extensions
158:
159: Constructor Expressions
160: =======================
161:
162: GNU C supports constructor expressions. A constructor looks like a
163: cast containing an initializer. Its value is an object of the type
164: specified in the cast, containing the elements specified in the
165: initializer. The type must be a structure, union or array type.
166:
167: Assume that `struct foo' and `structure' are declared as shown:
168:
169: struct foo {int a; char b[2];} structure;
170:
171: Here is an example of constructing a `struct foo' with a constructor:
172:
173: structure = ((struct foo) {x + y, 'a', 0});
174:
175: This is equivalent to writing the following:
176:
177: {
178: struct foo temp = {x + y, 'a', 0};
179: structure = temp;
180: }
181:
182: You can also construct an array. If all the elements of the
183: constructor are (made up of) simple constant expressions, suitable
184: for use in initializers, then the constructor is an lvalue and can be
185: coerced to a pointer to its first element, as shown here:
186:
187: char **foo = (char *[]) { "x", "y", "z" };
188:
189: Array constructors whose elements are not simple constants are not
190: very useful, because the constructor is not an lvalue. There are
191: only two valid ways to use it: to subscript it, or initialize an
192: array variable with it. The former is probably slower than a
193: `switch' statement, while the latter does the same thing an ordinary
194: C initializer would do.
195:
196: output = ((int[]) { 2, x, 28 }) [input];
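
The two uses shown above, a structure constructor and a subscripted array constructor, can be wrapped in functions so they compile on their own. The function names `make_foo' and `pick' are hypothetical.

```c
struct foo {int a; char b[2];};

/* Build a `struct foo' from run-time values with a constructor.  */
struct foo
make_foo (int x, int y)
{
  struct foo structure;

  structure = ((struct foo) {x + y, 'a', 0});
  return structure;
}

/* Subscript an array constructor whose middle element is not constant.
   INPUT must be 0, 1 or 2.  */
int
pick (int input, int x)
{
  return ((int[]) { 2, x, 28 }) [input];
}
```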
1.1.1.3 root 197:
1.1 root 198:
1.1.1.2 root 199:
1.1.1.3 root 200: File: gcc.info, Node: Function Attributes, Next: Dollar Signs, Prev: Constructors, Up: Extensions
201:
202: Declaring Attributes of Functions
203: =================================
204:
205: In GNU C, you can declare certain things about functions called in your
206: program which help the compiler optimize function calls.
207:
208: A few functions, such as `abort' and `exit', cannot return. These
209: functions should be declared `volatile'. For example,
210:
211: extern volatile void abort ();
212:
213: tells the compiler that it can assume that `abort' will not return.
214: This makes slightly better code, but more importantly it helps avoid
215: spurious warnings of uninitialized variables.
1.1.1.2 root 216:
1.1.1.3 root 217: Many functions do not examine any values except their arguments, and
218: have no effects except the return value. Such a function can be
219: subject to common subexpression elimination and loop optimization
220: just as an arithmetic operator would be. These functions should be
221: declared `const'. For example,
1.1.1.2 root 222:
1.1.1.3 root 223: extern const int square ();
1.1.1.2 root 224:
1.1.1.3 root 225: says that the hypothetical function `square' is safe to call fewer
226: times than the program says.
227:
228: Note that a function that has pointer arguments and examines the data
229: pointed to must *not* be declared `const'. Likewise, a function that
1.1.1.4 root 230: calls a non-`const' function usually must not be `const'.
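
The following sketch shows the kind of function that qualifies. It is written with `__attribute__ ((const))', the spelling that later GCC releases accept; that spelling is an assumption made so the example compiles with a current compiler, and it is not the declaration syntax described above. The name `twice_square' is hypothetical.

```c
/* `square' looks only at its argument and has no effect other than
   its return value, so repeated calls with the same argument may be
   merged by the compiler.  */
static int square (int x) __attribute__ ((const));

static int
square (int x)
{
  return x * x;
}

int
twice_square (int x)
{
  /* The compiler is free to evaluate `square (x)' once here.  */
  return square (x) + square (x);
}
```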
1.1.1.3 root 231:
232: Some people object to this feature, claiming that ANSI C's `#pragma'
233: should be used instead. There are two reasons I did not do this.
234:
235: 1. It is impossible to generate `#pragma' commands from a macro.
236:
237: 2. The `#pragma' command is just as likely as these keywords to
238: mean something else in another compiler.
239:
240: These two reasons apply to *any* application whatever: as far as I
241: can see, `#pragma' is never useful.
1.1.1.2 root 242:
243:
244:
1.1.1.3 root 245: File: gcc.info, Node: Dollar Signs, Next: Alignment, Prev: Function Attributes, Up: Extensions
1.1.1.2 root 246:
1.1.1.3 root 247: Dollar Signs in Identifier Names
248: ================================
249:
250: In GNU C, you may use dollar signs in identifier names. This is
251: because many traditional C implementations allow such identifiers.
1.1.1.2 root 252:
1.1.1.3 root 253: Dollar signs are allowed if you specify `-traditional'; they are not
254: allowed if you specify `-ansi'. Whether they are allowed by default
255: depends on the target machine; usually, they are not.
1.1.1.2 root 256:
257:
258:
1.1.1.3 root 259: File: gcc.info, Node: Alignment, Next: Inline, Prev: Dollar Signs, Up: Extensions
1.1.1.2 root 260:
1.1.1.3 root 261: Inquiring about the Alignment of a Type or Variable
262: ===================================================
1.1.1.2 root 263:
1.1.1.3 root 264: The keyword `__alignof__' allows you to inquire about how an object
265: is aligned, or the minimum alignment usually required by a type. Its
266: syntax is just like `sizeof'.
267:
268: For example, if the target machine requires a `double' value to be
269: aligned on an 8-byte boundary, then `__alignof__ (double)' is 8.
270: This is true on many RISC machines. On more traditional machine
271: designs, `__alignof__ (double)' is 4 or even 2.
272:
273: Some machines never actually require alignment; they allow reference
274: to any data type even at an odd address. For these machines,
275: `__alignof__' reports the *recommended* alignment of a type.
276:
277: When the operand of `__alignof__' is an lvalue rather than a type,
278: the value is the largest alignment that the lvalue is known to have.
279: It may have this alignment as a result of its data type, or because
280: it is part of a structure and inherits alignment from that structure.
281: For example, after this declaration:
282:
283: struct foo { int x; char y; } foo1;
284:
285: the value of `__alignof__ (foo1.y)' is probably 2 or 4, the same as
286: `__alignof__ (int)', even though the data type of `foo1.y' does not
287: itself demand any alignment.
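
A small sketch of `__alignof__' applied to types. The function names are hypothetical, and the values returned depend on the target machine, so none are shown.

```c
#include <stddef.h>

struct foo { int x; char y; };

/* Alignment of a scalar type.  */
size_t
int_alignment (void)
{
  return __alignof__ (int);
}

size_t
double_alignment (void)
{
  return __alignof__ (double);
}

/* A structure is aligned at least as strictly as its most strictly
   aligned member, here the `int'.  */
size_t
foo_alignment (void)
{
  return __alignof__ (struct foo);
}
```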
1.1.1.2 root 288:
289:
290:
1.1.1.3 root 291: File: gcc.info, Node: Inline, Next: Extended Asm, Prev: Alignment, Up: Extensions
292:
293: An Inline Function is As Fast As a Macro
294: ========================================
295:
296: By declaring a function `inline', you can direct GNU CC to integrate
297: that function's code into the code for its callers. This makes
298: execution faster by eliminating the function-call overhead; in
299: addition, if any of the actual argument values are constant, their
300: known values may permit simplifications at compile time so that not
301: all of the inline function's code needs to be included.
1.1.1.2 root 302:
1.1.1.3 root 303: To declare a function inline, use the `inline' keyword in its
304: declaration, like this:
1.1.1.2 root 305:
1.1.1.3 root 306: inline int
307: inc (int *a)
308: {
309: return (*a)++;
310: }
311:
312: (If you are writing a header file to be included in ANSI C programs,
313: write `__inline__' instead of `inline'. *Note Alternate Keywords::.)
314:
1.1.1.4 root 315: You can also make all "simple enough" functions inline with the
1.1.1.3 root 316: option `-finline-functions'. Note that certain usages in a function
317: definition can make it unsuitable for inline substitution.
318:
319: When a function is both inline and `static', if all calls to the
320: function are integrated into the caller, and the function's address
321: is never used, then the function's own assembler code is never
322: referenced. In this case, GNU CC does not actually output assembler
323: code for the function, unless you specify the option
324: `-fkeep-inline-functions'. Some calls cannot be integrated for
325: various reasons (in particular, calls that precede the function's
326: definition cannot be integrated, and neither can recursive calls
327: within the definition). If there is a nonintegrated call, then the
328: function is compiled to assembler code as usual. The function must
329: also be compiled as usual if the program refers to its address,
330: because that can't be inlined.
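
As a sketch of the `static' case, the function below is defined before its only caller and its address is never taken, so every call is a candidate for integration. The caller `count_up' is a hypothetical name, and whether the compiler actually inlines the calls depends on the optimization options in effect.

```c
/* Defined before use, so calls below can be integrated.  */
static inline int
inc (int *a)
{
  return (*a)++;
}

/* Hypothetical caller: increment START a total of TIMES times.  */
int
count_up (int start, int times)
{
  int n = start;
  int i;

  for (i = 0; i < times; i++)
    inc (&n);
  return n;
}
```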
331:
332: When an inline function is not `static', then the compiler must
333: assume that there may be calls from other source files; since a
334: global symbol can be defined only once in any program, the function
335: must not be defined in the other source files, so the calls therein
336: cannot be integrated. Therefore, a non-`static' inline function is
337: always compiled on its own in the usual fashion.
338:
339: If you specify both `inline' and `extern' in the function definition,
340: then the definition is used only for inlining. In no case is the
341: function compiled on its own, not even if you refer to its address
342: explicitly. Such an address becomes an external reference, as if you
343: had only declared the function, and had not defined it.
344:
345: This combination of `inline' and `extern' has almost the effect of a
346: macro. The way to use it is to put a function definition in a header
347: file with these keywords, and put another copy of the definition
348: (lacking `inline' and `extern') in a library file. The definition in
349: the header file will cause most calls to the function to be inlined.
350: If any uses of the function remain, they will refer to the single
351: copy in the library.
1.1.1.2 root 352:
353:
354:
1.1.1.3 root 355: File: gcc.info, Node: Extended Asm, Next: Asm Labels, Prev: Inline, Up: Extensions
356:
357: Assembler Instructions with C Expression Operands
358: =================================================
359:
360: In an assembler instruction using `asm', you can now specify the
361: operands of the instruction using C expressions. This means no more
362: guessing which registers or memory locations will contain the data
363: you want to use.
364:
365: You must specify an assembler instruction template much like what
366: appears in a machine description, plus an operand constraint string
367: for each operand.
368:
369: For example, here is how to use the 68881's `fsinx' instruction:
370:
371: asm ("fsinx %1,%0" : "=f" (result) : "f" (angle));
372:
373: Here `angle' is the C expression for the input operand while `result'
374: is that of the output operand. Each has `"f"' as its operand
375: constraint, saying that a floating-point register is required. The
376: `=' in `=f' indicates that the operand is an output; all output
377: operands' constraints must use `='. The constraints use the same
378: language used in the machine description (*note Constraints::.).
379:
380: Each operand is described by an operand-constraint string followed by
381: the C expression in parentheses. A colon separates the assembler
382: template from the first output operand, and another separates the
383: last output operand from the first input, if any. Commas separate
384: output operands and separate inputs. The total number of operands is
385: limited to the maximum number of operands in any instruction pattern
386: in the machine description.
387:
388: If there are no output operands, and there are input operands, then
389: there must be two consecutive colons surrounding the place where the
390: output operands would go.
391:
392: Output operand expressions must be lvalues; the compiler can check
393: this. The input operands need not be lvalues. The compiler cannot
394: check whether the operands have data types that are reasonable for
395: the instruction being executed. It does not parse the assembler
396: instruction template and does not know what it means, or whether it
397: is valid assembler input. The extended `asm' feature is most often
398: used for machine instructions that the compiler itself does not know
399: exist.
400:
401: The output operands must be write-only; GNU CC will assume that the
402: values in these operands before the instruction are dead and need not
403: be generated. Extended asm does not support input-output or
404: read-write operands. For this reason, the constraint character `+',
405: which indicates such an operand, may not be used.
406:
407: When the assembler instruction has a read-write operand, or an
408: operand in which only some of the bits are to be changed, you must
409: logically split its function into two separate operands, one input
410: operand and one write-only output operand. The connection between
411: them is expressed by constraints which say they need to be in the
412: same location when the instruction executes. You can use the same C
413: expression for both operands, or different expressions. For example,
414: here we write the (fictitious) `combine' instruction with `bar' as
415: its read-only source operand and `foo' as its read-write destination:
416:
417: asm ("combine %2,%0" : "=r" (foo) : "0" (foo), "g" (bar));
418:
419: The constraint `"0"' for operand 1 says that it must occupy the same
420: location as operand 0. A digit in a constraint is allowed only in an
421: input operand, and it must refer to an output operand.
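
Since `combine' is fictitious, here is a sketch of the same operand layout using a real instruction. It assumes an x86 target and AT&T assembler syntax, where `addl' adds its first operand into its second; on any other machine the sketch falls back to plain C. The keyword is written `__asm__' so that the code is also accepted in ANSI mode.

```c
/* Add BAR into FOO.  Operand 0 is the write-only output, operand 1 is
   the old value of `foo' forced into the same location by the "0"
   constraint, and operand 2 is the read-only source.  */
int
combine (int foo, int bar)
{
#if defined (__x86_64__) || defined (__i386__)
  __asm__ ("addl %2,%0" : "=r" (foo) : "0" (foo), "g" (bar));
#else
  foo += bar;                   /* portable fallback, no assembler */
#endif
  return foo;
}
```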
422:
423: Only a digit in the constraint can guarantee that one operand will be
424: in the same place as another. The mere fact that `foo' is the value
425: of both operands is not enough to guarantee that they will be in the
426: same place in the generated assembler code. The following would not
427: work:
428:
429: asm ("combine %2,%0" : "=r" (foo) : "r" (foo), "g" (bar));
430:
431: Various optimizations or reloading could cause operands 0 and 1 to be
432: in different registers; GNU CC knows no reason not to do so. For
433: example, the compiler might find a copy of the value of `foo' in one
434: register and use it for operand 1, but generate the output operand 0
435: in a different register (copying it afterward to `foo''s own
436: address). Of course, since the register for operand 1 is not even
437: mentioned in the assembler code, the result will not work, but GNU CC
438: can't tell that.
439:
440: Unless an output operand has the `&' constraint modifier, GNU CC may
441: allocate it in the same register as an unrelated input operand, on
442: the assumption that the inputs are consumed before the outputs are
443: produced. This assumption may be false if the assembler code
444: actually consists of more than one instruction. In such a case, use
445: `&' for each output operand that may not overlap an input. *Note
446: Modifiers::.
447:
448: Some instructions clobber specific hard registers. To describe this,
449: write a third colon after the input operands, followed by the names
450: of the clobbered hard registers (given as strings). Here is a
451: realistic example for the VAX:
452:
453: asm volatile ("movc3 %0,%1,%2"
454: : /* no outputs */
455: : "g" (from), "g" (to), "g" (count)
456: : "r0", "r1", "r2", "r3", "r4", "r5");
457:
458: You can put multiple assembler instructions together in a single
459: `asm' template, separated either with newlines (written as `\n') or
460: with semicolons if the assembler allows such semicolons. The GNU
461: assembler allows semicolons and all Unix assemblers seem to do so.
462: The input operands are guaranteed not to use any of the clobbered
463: registers, and neither will the output operands' addresses, so you
464: can read and write the clobbered registers as many times as you like.
465: Here is an example of multiple instructions in a template; it assumes
466: that the subroutine `_foo' accepts arguments in registers 9 and 10:
467:
468: asm ("movl %0,r9;movl %1,r10;call _foo"
469: : /* no outputs */
470: : "g" (from), "g" (to)
471: : "r9", "r10");
472:
473: If you want to test the condition code produced by an assembler
474: instruction, you must include a branch and a label in the `asm'
475: construct, as follows:
476:
477: asm ("clr %0;frob %1;beq 0f;mov #1,%0;0:"
478: : "=g" (result)
479: : "g" (input));
480:
481: This assumes your assembler supports local labels, as the GNU
482: assembler and most Unix assemblers do.
483:
484: Usually the most convenient way to use these `asm' instructions is to
485: encapsulate them in macros that look like functions. For example,
486:
487: #define sin(x) \
488: ({ double __value, __arg = (x); \
489: asm ("fsinx %1,%0": "=f" (__value): "f" (__arg)); \
490: __value; })
491:
492: Here the variable `__arg' is used to make sure that the instruction
493: operates on a proper `double' value, and to accept only those
494: arguments `x' which can convert automatically to a `double'.
495:
496: Another way to make sure the instruction operates on the correct data
497: type is to use a cast in the `asm'. This is different from using a
498: variable `__arg' in that it converts more different types. For
499: example, if the desired type were `int', casting the argument to
500: `int' would accept a pointer with no complaint, while assigning the
501: argument to an `int' variable named `__arg' would warn about using a
502: pointer unless the caller explicitly casts it.
1.1.1.2 root 503:
1.1.1.3 root 504: If an `asm' has output operands, GNU CC assumes for optimization
505: purposes that the instruction has no side effects except to change
506: the output operands. This does not mean that instructions with a
507: side effect cannot be used, but you must be careful, because the
508: compiler may eliminate them if the output operands aren't used, or
509: move them out of loops, or replace two with one if they constitute a
510: common subexpression. Also, if your instruction does have a side
511: effect on a variable that otherwise appears not to change, the old
512: value of the variable may be reused later if it happens to be found
513: in a register.
1.1.1.2 root 514:
1.1.1.3 root 515: You can prevent an `asm' instruction from being deleted, moved or
516: combined by writing the keyword `volatile' after the `asm'. For
517: example:
1.1.1.2 root 518:
1.1.1.3 root 519: #define set_priority(x) \
520: asm volatile ("set_priority %0": /* no outputs */ : "g" (x))
521:
522: (However, an instruction without output operands will not be deleted
523: or moved, regardless, unless it is unreachable.)
524:
525: It is a natural idea to look for a way to give access to the
526: condition code left by the assembler instruction. However, when we
527: attempted to implement this, we found no way to make it work
528: reliably. The problem is that output operands might need reloading,
1.1.1.4 root 529: which would result in additional following "store" instructions. On
530: most machines, these instructions would alter the condition code
1.1.1.3 root 531: before there was time to test it. This problem doesn't arise for
1.1.1.4 root 532: ordinary "test" and "compare" instructions because they don't have
533: any output operands.
1.1.1.3 root 534:
535: If you are writing a header file that should be includable in ANSI C
536: programs, write `__asm__' instead of `asm'. *Note Alternate
537: Keywords::.
1.1.1.2 root 538:
539:
540:
1.1.1.3 root 541: File: gcc.info, Node: Asm Labels, Next: Explicit Reg Vars, Prev: Extended Asm, Up: Extensions
542:
543: Controlling Names Used in Assembler Code
544: ========================================
545:
546: You can specify the name to be used in the assembler code for a C
547: function or variable by writing the `asm' (or `__asm__') keyword
548: after the declarator as follows:
549:
550: int foo asm ("myfoo") = 2;
551:
552: This specifies that the name to be used for the variable `foo' in the
553: assembler code should be `myfoo' rather than the usual `_foo'.
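
A minimal sketch of the declaration in use. The accessor `read_foo' is a hypothetical name; C code continues to refer to the variable as `foo', and only the symbol seen by the assembler and linker changes.

```c
/* The C name is `foo'; the assembler-level name is `myfoo'.  */
int foo __asm__ ("myfoo") = 2;

int
read_foo (void)
{
  return foo;
}
```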
1.1.1.2 root 554:
1.1.1.3 root 555: On systems where an underscore is normally prepended to the name of a
556: C function or variable, this feature allows you to define names for
557: the linker that do not start with an underscore.
1.1.1.2 root 558:
1.1.1.3 root 559: You cannot use `asm' in this way in a function *definition*; but you
560: can get the same effect by writing a declaration for the function
561: before its definition and putting `asm' there, like this:
1.1.1.2 root 562:
1.1.1.3 root 563: extern func () asm ("FUNC");
564:
565: func (x, y)
566: int x, y;
567: ...
568:
569: It is up to you to make sure that the assembler names you choose do
570: not conflict with any other assembler symbols. Also, you must not
571: use a register name; that would produce completely invalid assembler
572: code. GNU CC does not as yet have the ability to store static
573: variables in registers. Perhaps that will be added.
1.1.1.2 root 574:
575:
576:
1.1.1.3 root 577: File: gcc.info, Node: Explicit Reg Vars, Next: Alternate Keywords, Prev: Asm Labels, Up: Extensions
578:
579: Variables in Specified Registers
580: ================================
581:
582: GNU C allows you to put a few global variables into specified
583: hardware registers. You can also specify the register in which an
584: ordinary register variable should be allocated.
585:
586: * Global register variables reserve registers throughout the
587: program. This may be useful in programs such as programming
588: language interpreters which have a couple of global variables
589: that are accessed very often.
1.1.1.2 root 590:
1.1.1.3 root 591: * Local register variables in specific registers do not reserve
592: the registers. The compiler's data flow analysis is capable of
593: determining where the specified registers contain live values,
594: and where they are available for other uses. These local
595: variables are sometimes convenient for use with the extended
596: `asm' feature (*note Extended Asm::.).
1.1.1.2 root 597:
1.1.1.3 root 598: * Menu:
599:
600: * Global Reg Vars::
601: * Local Reg Vars::
1.1.1.2 root 602:
1.1 root 603:
604:
1.1.1.3 root 605: File: gcc.info, Node: Global Reg Vars, Next: Local Reg Vars, Prev: Explicit Reg Vars, Up: Explicit Reg Vars
1.1 root 606:
1.1.1.3 root 607: Defining Global Register Variables
608: ----------------------------------
609:
610: You can define a global register variable in GNU C like this:
1.1 root 611:
1.1.1.3 root 612: register int *foo asm ("a5");
1.1 root 613:
1.1.1.3 root 614: Here `a5' is the name of the register which should be used. Choose a
615: register which is normally saved and restored by function calls on
616: your machine, so that library routines will not clobber it.
617:
618: Naturally the register name is cpu-dependent, so you would need to
619: conditionalize your program according to cpu type. The register `a5'
620: would be a good choice on a 68000 for a variable of pointer type. On
1.1.1.4 root 621: machines with register windows, be sure to choose a "global" register
622: that is not affected magically by the function call mechanism.
1.1.1.3 root 623:
624: In addition, operating systems on one type of cpu may differ in how
625: they name the registers; then you would need additional conditionals.
626: For example, some 68000 operating systems call this register `%a5'.
627:
628: Eventually there may be a way of asking the compiler to choose a
629: register automatically, but first we need to figure out how it should
630: choose and how to enable you to guide the choice. No solution is
631: evident.
632:
633: Defining a global register variable in a certain register reserves
634: that register entirely for this use, at least within the current
635: compilation. The register will not be allocated for any other
636: purpose in the functions in the current compilation. The register
637: will not be saved and restored by these functions. Stores into this
638: register are never deleted even if they would appear to be dead, but
639: references may be deleted or moved or simplified.
640:
641: It is not safe to access the global register variables from signal
642: handlers, or from more than one thread of control, because the system
643: library routines may temporarily use the register for other things
644: (unless you recompile them specially for the task at hand).
645:
646: It is not safe for one function that uses a global register variable
647: to call another such function `foo' by way of a third function `lose'
648: that was compiled without knowledge of this variable (i.e. in a
649: different source file in which the variable wasn't declared). This
650: is because `lose' might save the register and put some other value
651: there. For example, you can't expect a global register variable to
652: be available in the comparison-function that you pass to `qsort',
653: since `qsort' might have put something else in that register. (If
654: you are prepared to recompile `qsort' with the same global register
655: variable, you can solve this problem.)
656:
657: If you want to recompile `qsort' or other source files which do not
658: actually use your global register variable, so that they will not use
659: that register for any other purpose, then it suffices to specify the
660: compiler option `-ffixed-REG'. You need not actually add a global
661: register declaration to their source code.
662:
663: A function which can alter the value of a global register variable
664: cannot safely be called from a function compiled without this
665: variable, because it could clobber the value the caller expects to
666: find there on return. Therefore, the function which is the entry
667: point into the part of the program that uses the global register
668: variable must explicitly save and restore the value which belongs to
669: its caller.
670:
671: On most machines, `longjmp' will restore to each global register
672: variable the value it had at the time of the `setjmp'. On some
673: machines, however, `longjmp' will not change the value of global
674: register variables. To be portable, the function that called
675: `setjmp' should make other arrangements to save the values of the
676: global register variables, and to restore them in a `longjmp'. This
1.1.1.4 root 677: way, the same thing will happen regardless of what `longjmp' does.
1.1.1.3 root 678:
679: All global register variable declarations must precede all function
680: definitions. If such a declaration could appear after function
681: definitions, the declaration would be too late to prevent the
682: register from being used for other purposes in the preceding functions.
683:
684: Global register variables may not have initial values, because an
685: executable file has no means to supply initial contents for a register.
1.1 root 686:
687:
688:
1.1.1.3 root 689: File: gcc.info, Node: Local Reg Vars, Prev: Global Reg Vars, Up: Explicit Reg Vars
690:
691: Specifying Registers for Local Variables
692: ----------------------------------------
693:
694: You can define a local register variable with a specified register
695: like this:
696:
697: register int *foo asm ("a5");
1.1 root 698:
1.1.1.3 root 699: Here `a5' is the name of the register which should be used. Note
700: that this is the same syntax used for defining global register
701: variables, but for a local variable it would appear within a function.
1.1 root 702:
1.1.1.3 root 703: Naturally the register name is cpu-dependent, but this is not a
704: problem, since specific registers are most often useful with explicit
705: assembler instructions (*note Extended Asm::.). Both of these things
706: generally require that you conditionalize your program according to
707: cpu type.
708:
709: In addition, operating systems on one type of cpu may differ in how
710: they name the registers; then you would need additional conditionals.
711: For example, some 68000 operating systems call this register `%a5'.
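
The conditionals described above might look like the following sketch.
The macro `LOCAL_REG', the function `sum_longs' and the particular
registers (`r12', `x19') are hypothetical choices for illustration;
`__x86_64__' and `__aarch64__' are macros that GNU C predefines on those
targets. The spelling `__asm__' is used so that the declaration still
works under `-ansi'.

```c
/* Choose a register name according to cpu type; on any other
   compiler or cpu, fall back to an ordinary register variable.  */
#if defined (__GNUC__) && defined (__x86_64__)
#define LOCAL_REG(decl) register decl __asm__ ("r12")
#elif defined (__GNUC__) && defined (__aarch64__)
#define LOCAL_REG(decl) register decl __asm__ ("x19")
#else
#define LOCAL_REG(decl) register decl
#endif

/* Adds up the N elements of V, keeping the running total in the
   register selected above.  */
long
sum_longs (const long *v, int n)
{
  LOCAL_REG (long total);
  int i;

  total = 0;
  for (i = 0; i < n; i++)
    total += v[i];
  return total;
}
```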
712:
713: Eventually there may be a way of asking the compiler to choose a
714: register automatically, but first we need to figure out how it should
715: choose and how to enable you to guide the choice. No solution is
716: evident.
717:
718: Defining such a register variable does not reserve the register; it
719: remains available for other uses in places where flow analysis
720: determines that the variable's value is not live. However, these
1.1.1.4 root 721: registers are made unavailable for use in the reload pass. I would
722: not be surprised if excessive use of this feature leaves the compiler
723: too few available registers to compile certain functions.
1.1 root 724:
725:
726:
1.1.1.3 root 727: File: gcc.info, Node: Alternate Keywords, Prev: Explicit Reg Vars, Up: Extensions
728:
729: Alternate Keywords
730: ==================
731:
732: The option `-traditional' disables certain keywords; `-ansi' disables
733: certain others. This causes trouble when you want to use GNU C
734: extensions, or ANSI C features, in a general-purpose header file that
735: should be usable by all programs, including ANSI C programs and
736: traditional ones. The keywords `asm', `typeof' and `inline' cannot
737: be used since they won't work in a program compiled with `-ansi',
738: while the keywords `const', `volatile', `signed', `typeof' and
739: `inline' won't work in a program compiled with `-traditional'.
740:
741: The way to solve these problems is to put `__' at the beginning and
742: end of each problematical keyword. For example, use `__asm__'
743: instead of `asm', `__const__' instead of `const', and `__inline__'
744: instead of `inline'.
1.1 root 745:
1.1.1.3 root 746: Other C compilers won't accept these alternative keywords; if you
747: want to compile with another compiler, you can define the alternate
748: keywords as macros to replace them with the customary keywords. It
749: looks like this:
1.1 root 750:
1.1.1.3 root 751: #ifndef __GNUC__
752: #define __asm__ asm
753: #endif
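
A slightly fuller sketch of such a header, under some assumptions: the
function `twice' is hypothetical, and `__inline__' is mapped to nothing
for other compilers on the guess that they may not support `inline' at
all.

```c
/* Fallback definitions for compilers other than GNU C.  */
#ifndef __GNUC__
#define __asm__ asm
#define __const__ const
#define __volatile__ volatile
#define __signed__ signed
/* Assumption: the other compiler may lack `inline', so drop it.  */
#define __inline__
#endif

/* Usable with or without `-ansi' or `-traditional' in GNU C, because
   only the alternate keywords appear here.  */
static __inline__ int
twice (__const__ int x)
{
  return 2 * x;
}
```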
1.1 root 754:
755:
756:
1.1.1.3 root 757: File: gcc.info, Node: Bugs, Next: Portability, Prev: Extensions, Up: Top
1.1 root 758:
1.1.1.3 root 759: Reporting Bugs
760: **************
1.1 root 761:
1.1.1.3 root 762: Your bug reports play an essential role in making GNU CC reliable.
1.1 root 763:
1.1.1.5 ! root 764: When you encounter a problem, the first thing to do is to see if it
! 765: is already known. *Note Trouble::. Also look in *Note
! 766: Incompatibilities::. If it isn't known, then you should report the
! 767: problem.
! 768:
1.1.1.3 root 769: Reporting a bug may help you by bringing a solution to your problem,
1.1.1.5 ! root 770: or it may not. (If it does not, look in the service directory; see
! 771: *Note Service::.) In any case, the principal function of a bug
1.1.1.3 root 772: report is to help the entire community by making the next version of
773: GNU CC work better. Bug reports are your contribution to the
774: maintenance of GNU CC.
1.1 root 775:
1.1.1.3 root 776: In order for a bug report to serve its purpose, you must include the
777: information that makes it possible to fix the bug.
1.1 root 778:
1.1.1.3 root 779: * Menu:
780:
781: * Criteria: Bug Criteria. Have you really found a bug?
782: * Reporting: Bug Reporting. How to report a bug effectively.
783:
784:
1.1 root 785:
1.1.1.3 root 786: File: gcc.info, Node: Bug Criteria, Next: Bug Reporting, Prev: Bugs, Up: Bugs
787:
788: Have You Found a Bug?
789: =====================
790:
791: If you are not sure whether you have found a bug, here are some
792: guidelines:
793:
794: * If the compiler gets a fatal signal, for any input whatever,
795: that is a compiler bug. Reliable compilers never crash.
1.1 root 796:
1.1.1.3 root 797: * If the compiler produces invalid assembly code, for any input
798: whatever (except an `asm' statement), that is a compiler bug,
799: unless the compiler reports errors (not just warnings) which
800: would ordinarily prevent the assembler from being run.
1.1 root 801:
1.1.1.3 root 802: * If the compiler produces valid assembly code that does not
803: correctly execute the input source code, that is a compiler bug.
804:
805: However, you must double-check to make sure, because you may
806: have run into an incompatibility between GNU C and traditional C
807: (*note Incompatibilities::.). These incompatibilities might be
808: considered bugs, but they are inescapable consequences of
809: valuable features.
810:
811: Or you may have a program whose behavior is undefined, which
812: happened by chance to give the desired results with another C
813: compiler.
814:
815: For example, in many nonoptimizing compilers, you can write `x;'
816: at the end of a function instead of `return x;', with the same
817: results. But the value of the function is undefined if `return'
818: is omitted; it is not a bug when GNU CC produces different
819: results.
820:
821: Problems often result from expressions with two increment
822: operators, as in `f (*p++, *p++)'. Your previous compiler might
823: have interpreted that expression the way you intended; GNU CC
824: might interpret it another way. Neither compiler is wrong. The
825: bug is in your code.
826:
827: After you have localized the error to a single source line, it
828: should be easy to check for these things. If your program is
829: correct and well defined, you have found a compiler bug.
830:
831: * If the compiler produces an error message for valid input, that
832: is a compiler bug.
833:
834: Note that the following is not valid input, and the error
835: message for it is not a bug:
836:
837: int foo (char);
838:
839: int
840: foo (x)
841: char x;
842: { ... }
843:
844: The prototype says to pass a `char', while the definition says
845: to pass an `int' and treat the value as a `char'. This is what
846: the ANSI standard says, and it makes sense.
847:
848: * If the compiler does not produce an error message for invalid
849: input, that is a compiler bug. However, you should note that
1.1.1.4 root 850: your idea of "invalid input" might be my idea of "an extension"
851: or "support for traditional practice".
1.1.1.3 root 852:
853: * If you are an experienced user of C compilers, your suggestions
854: for improvement of GNU CC are welcome in any case.
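
The two undefined-behavior cases named in the guidelines above can be
rewritten so that every compiler must give the same result. This is only
a sketch; the function names `last', `diff' and `first_minus_second' are
invented for the illustration.

```c
/* Undefined if written as `int last (int x) { x; }': without
   `return', the value of the function is not defined.  */
int
last (int x)
{
  return x;
}

static int
diff (int a, int b)
{
  return a - b;
}

/* `diff (*p++, *p++)' leaves the order of the two fetches up to the
   compiler.  Fetching into named variables fixes the order.  */
int
first_minus_second (const int *p)
{
  int a = *p++;
  int b = *p++;

  return diff (a, b);
}
```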
1.1 root 855:
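
For the prototype example in the guidelines above, one way to make the
input valid is to write the definition in prototype form as well, so that
both declarations say to pass a `char'. The function body here is a
hypothetical placeholder.

```c
int foo (char);

/* The definition now agrees with the prototype: the argument is
   passed as a `char' rather than promoted to `int'.  */
int
foo (char x)
{
  return x == 'a';
}
```

The other choice, not shown, is to keep the old-style definition and
declare the function without a prototype.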
856:
857:
1.1.1.3 root 858: File: gcc.info, Node: Bug Reporting, Prev: Bug Criteria, Up: Bugs
859:
860: How to Report Bugs
861: ==================
1.1 root 862:
1.1.1.3 root 863: Send bug reports for GNU C to one of these addresses:
1.1 root 864:
1.1.1.3 root 865: [email protected]
866: {ucbvax|mit-eddie|uunet}!prep.ai.mit.edu!bug-gcc
1.1 root 867:
1.1.1.3 root 868: *Do not send bug reports to `info-gcc', or to the newsgroup
869: `gnu.gcc'.* Most users of GNU CC do not want to receive bug reports.
870: Those who do have asked to be on `bug-gcc'.
871:
872: The mailing list `bug-gcc' has a newsgroup which serves as a
873: repeater. The mailing list and the newsgroup carry exactly the same
874: messages. Often people think of posting bug reports to the newsgroup
875: instead of mailing them. This appears to work, but it has one
876: problem which can be crucial: a newsgroup posting does not contain a
877: mail path back to the sender. Thus, if I need to ask for more
878: information, I may be unable to reach you. For this reason, it is
879: better to send bug reports to the mailing list.
880:
881: As a last resort, send bug reports on paper to:
882:
883: GNU Compiler Bugs
884: 545 Tech Sq
885: Cambridge, MA 02139
886:
887: The fundamental principle of reporting bugs usefully is this: *report
888: all the facts*. If you are not sure whether to state a fact or leave
889: it out, state it!
890:
891: Often people omit facts because they think they know what causes the
892: problem and they conclude that some details don't matter. Thus, you
893: might assume that the name of the variable you use in an example does
894: not matter. Well, probably it doesn't, but one cannot be sure.
895: Perhaps the bug is a stray memory reference which happens to fetch
896: from the location where that name is stored in memory; perhaps, if
897: the name were different, the contents of that location would fool the
898: compiler into doing the right thing despite the bug. Play it safe
899: and give a specific, complete example. That is the easiest thing for
900: you to do, and the most helpful.
901:
902: Keep in mind that the purpose of a bug report is to enable me to fix
903: the bug if it is not known. It isn't very important what happens if
904: the bug is already known. Therefore, always write your bug reports
905: on the assumption that the bug is not known.
906:
1.1.1.4 root 907: Sometimes people give a few sketchy facts and ask, "Does this ring a
908: bell?" Those bug reports are useless, and I urge everyone to *refuse
909: to respond to them* except to chide the sender to report bugs properly.
1.1.1.3 root 910:
911: To enable me to fix the bug, you should include all these things:
912:
913: * The version of GNU CC. You can get this by running it with the
914: `-v' option.
915:
916: Without this, I won't know whether there is any point in looking
917: for the bug in the current version of GNU CC.
918:
919: * A complete input file that will reproduce the bug. If the bug
920: is in the C preprocessor, send me a source file and any header
921: files that it requires. If the bug is in the compiler proper
922: (`cc1'), run your source file through the C preprocessor by
923: doing `gcc -E SOURCEFILE > OUTFILE', then include the contents
924: of OUTFILE in the bug report. (Any `-I', `-D' or `-U' options
925: that you used in actual compilation should also be used when
926: doing this.)
927:
928: A single statement is not enough of an example. In order to
929: compile it, it must be embedded in a function definition; and
930: the bug might depend on the details of how this is done.
931:
932: Without a real example I can compile, all I can do about your
933: bug report is wish you luck. It would be futile to try to guess
934: how to provoke the bug. For example, bugs in register
935: allocation and reloading frequently depend on every little
936: detail of the function they happen in.
937:
938: * The command arguments you gave GNU CC to compile that example
939: and observe the bug. For example, did you use `-O'? To
940: guarantee you won't omit something important, list them all.
941:
942: If I were to try to guess the arguments, I would probably guess
943: wrong and then I would not encounter the bug.
944:
945: * The names of the files that you used for `tm.h' and `md' when
946: you installed the compiler.
947:
948: * The type of machine you are using, and the operating system name
949: and version number.
950:
951: * A description of what behavior you observe that you believe is
1.1.1.4 root 952: incorrect. For example, "It gets a fatal signal," or, "There is
953: an incorrect assembler instruction in the output."
1.1.1.3 root 954:
955: Of course, if the bug is that the compiler gets a fatal signal,
956: then I will certainly notice it. But if the bug is incorrect
957: output, I might not notice unless it is glaringly wrong. I
958: won't study all the assembler code from a 50-line C program just
959: on the off chance that it might be wrong.
960:
961: Even if the problem you experience is a fatal signal, you should
962: still say so explicitly. Suppose something strange is going on:
963: for example, your copy of the compiler is out of sync, or you have
964: encountered a bug in the C library on your system. (This has
965: happened!) Your copy might crash and mine would not. If you
966: told me to expect a crash, then when mine fails to crash, I
967: would know that the bug was not happening for me. If you had
968: not told me to expect a crash, then I would not be able to draw
969: any conclusion from my observations.
970:
971: Often the observed symptom is incorrect output when your program
972: is run. Sad to say, this is not enough information for me
973: unless the program is short and simple. If you send me a large
974: program, I don't have time to figure out how it would work if
975: compiled correctly, much less which line of it was compiled
976: wrong. So you will have to do that. Tell me which source line
977: it is, and what incorrect result happens when that line is
978: executed. A person who understands the test program can find
979: this as easily as a bug in the program itself.
980:
981: * If you send me examples of output from GNU CC, please use `-g'
982: when you make them. The debugging information includes source
983: line numbers which are essential for correlating the output with
984: the input.
985:
986: * If you wish to suggest changes to the GNU CC source, send me
987: context diffs. Even if you only discuss something in the GNU CC
988: source, refer to it by context, not by line number.
989:
990: The line numbers in my development sources don't match those in
991: your sources. Your line numbers would convey no useful
992: information to me.
993:
994: * Additional information from a debugger might enable me to find a
995: problem on a machine which I do not have available myself.
996: However, you need to think when you collect this information if
997: you want it to have any chance of being useful.
998:
999: For example, many people send just a backtrace, but that is
1000: never useful by itself. A simple backtrace with arguments
1001: conveys little about GNU CC because the compiler is largely
1002: data-driven; the same functions are called over and over for
1003: different RTL insns, doing different things depending on the
1004: details of the insn.
1005:
1006: Most of the arguments listed in the backtrace are useless
1007: because they are pointers to RTL list structure. The numeric
1008: values of the pointers, which the debugger prints in the
1009: backtrace, have no significance whatever; all that matters is
1010: the contents of the objects they point to (and most of the
1011: contents are other such pointers).
1012:
1013: In addition, most compiler passes consist of one or more loops
1014: that scan the RTL insn sequence. The most vital piece of
1015: information about such a loop--which insn it has reached--is
1016: usually in a local variable, not in an argument.
1017:
1018: What you need to provide in addition to a backtrace are the
1019: values of the local variables for several stack frames up. When
1020: a local variable or an argument is an RTX, first print its value
1021: and then use the GDB command `pr' to print the RTL expression
1022: that it points to. (If GDB doesn't run on your machine, use
1023: your debugger to call the function `debug_rtx' with the RTX as
1024: an argument.) In general, whenever a variable is a pointer, its
1025: value is no use without the data it points to.
1026:
1027: In addition, include a debugging dump from just before the pass
1028: in which the crash happens. Most bugs involve a series of
1029: insns, not just one.
1030:
1031: Here are some things that are not necessary:
1032:
1033: * A description of the envelope of the bug.
1034:
1035: Often people who encounter a bug spend a lot of time
1036: investigating which changes to the input file will make the bug
1037: go away and which changes will not affect it.
1038:
1039: This is often time-consuming and not very useful, because the
1040: way I will find the bug is by running a single example under the
1041: debugger with breakpoints, not by pure deduction from a series
1042: of examples. I recommend that you save your time for something
1043: else.
1044:
1045: Of course, if you can find a simpler example to report *instead*
1046: of the original one, that is a convenience for me. Errors in
1047: the output will be easier to spot, running under the debugger
1048: will take less time, etc. Most GNU CC bugs involve just one
1049: function, so the most straightforward way to simplify an example
1050: is to delete all the function definitions except the one where
1051: the bug occurs. Those earlier in the file may be replaced by
1052: external declarations if the crucial function depends on them.
1053: (Exception: inline functions may affect compilation of functions
1054: defined later in the file.)
1055:
1056: However, simplification is not vital; if you don't want to do
1057: this, report the bug anyway and send me the entire test case you
1058: used.
1059:
1060: * A patch for the bug.
1061:
1062: A patch for the bug does help me if it is a good one. But don't
1063: omit the necessary information, such as the test case, on the
1064: assumption that a patch is all I need. I might see problems
1065: with your patch and decide to fix the problem another way, or I
1066: might not understand it at all.
1067:
1068: Sometimes with a program as complicated as GNU CC it is very
1069: hard to construct an example that will make the program follow a
1070: certain path through the code. If you don't send me the
1071: example, I won't be able to construct one, so I won't be able to
1072: verify that the bug is fixed.
1073:
1074: And if I can't understand what bug you are trying to fix, or why
1075: your patch should be an improvement, I won't install it. A test
1076: case will help me to understand.
1077:
1078: * A guess about what the bug is or what it depends on.
1079:
1080: Such guesses are usually wrong. Even I can't guess right about
1081: such things without first using the debugger to find the facts.
1082:
1083:
1084:
1085: File: gcc.info, Node: Portability, Next: Interface, Prev: Bugs, Up: Top
1086:
1087: GNU CC and Portability
1088: **********************
1089:
1090: The main goal of GNU CC was to make a good, fast compiler for
1091: machines in the class that the GNU system aims to run on: 32-bit
1092: machines that address 8-bit bytes and have several general registers.
1093: Elegance, theoretical power and simplicity are only secondary.
1094:
1095: GNU CC gets most of the information about the target machine from a
1096: machine description which gives an algebraic formula for each of the
1097: machine's instructions. This is a very clean way to describe the
1098: target. But when the compiler needs information that is difficult to
1099: express in this fashion, I have not hesitated to define an ad-hoc
1100: parameter to the machine description. The purpose of portability is
1101: to reduce the total work needed on the compiler; it was not of
1102: interest for its own sake.
1103:
1104: GNU CC does not contain machine-dependent code, but it does contain
1105: code that depends on machine parameters such as endianness (whether
1106: the most significant byte has the highest or lowest address of the
1107: bytes in a word) and the availability of autoincrement addressing.
1108: In the RTL-generation pass, it is often necessary to have multiple
1109: strategies for generating code for a particular kind of syntax tree,
1110: strategies that are usable for different combinations of parameters.
1111: Often I have not tried to address all possible cases, but only the
1112: common ones or only the ones that I have encountered. As a result, a
1113: new target may require additional strategies. You will know if this
1114: happens because the compiler will call `abort'. Fortunately, the new
1115: strategies can be added in a machine-independent fashion, and will
1116: affect only the target machines that need them.
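
To make the endianness parameter concrete, here is a small run-time
probe. It only illustrates the definition given above; it is not how
GNU CC itself records the parameter, and the function name is invented
for this sketch.

```c
#include <string.h>

/* Returns 1 if the most significant byte of a word has the lowest
   address, 0 if it has the highest.  Assumes `unsigned int' is wider
   than one byte.  */
int
most_significant_byte_first (void)
{
  unsigned int word = 1;
  unsigned char first;

  /* Copy the byte at the lowest address of the word.  */
  memcpy (&first, &word, 1);
  return first == 0;
}
```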
1117:
1118: