1: Info file gcc.info, produced by Makeinfo, -*- Text -*- from input
2: file gcc.texinfo.
3:
4: This file documents the use and the internals of the GNU compiler.
5:
6: Copyright (C) 1988 Free Software Foundation, Inc.
7:
8: Permission is granted to make and distribute verbatim copies of this
9: manual provided the copyright notice and this permission notice are
10: preserved on all copies.
11:
12: Permission is granted to copy and distribute modified versions of
13: this manual under the conditions for verbatim copying, provided also
14: that the section entitled ``GNU CC General Public License'' is
15: included exactly as in the original, and provided that the entire
16: resulting derived work is distributed under the terms of a permission
17: notice identical to this one.
18:
19: Permission is granted to copy and distribute translations of this
20: manual into another language, under the above conditions for modified
21: versions, except that the section entitled ``GNU CC General Public
22: License'' and this permission notice may be included in translations
23: approved by the Free Software Foundation instead of in the original
24: English.
25:
26:
27:
28: File: gcc.info, Node: Simple Constraints, Next: Multi-Alternative, Prev: Constraints, Up: Constraints
29:
30: Simple Constraints
31: ------------------
32:
33: The simplest kind of constraint is a string full of letters, each of
34: which describes one kind of operand that is permitted. Here are the
35: letters that are allowed:
36:
37: `m'
38: A memory operand is allowed, with any kind of address that the
39: machine supports in general.
40:
41: `o'
42: A memory operand is allowed, but only if the address is
43: "offsetable". This means that a small integer (actually,
44: the width in bytes of the operand, as determined by its machine
45: mode) may be added to the address and the result is also a valid
46: memory address.
47:
48: For example, an address which is constant is offsetable; so is
49: an address that is the sum of a register and a constant (as long
50: as a slightly larger constant is also within the range of
51: address-offsets supported by the machine); but an autoincrement
52: or autodecrement address is not offsetable. More complicated
53: indirect/indexed addresses may or may not be offsetable
54: depending on the other addressing modes that the machine supports.
55:
56: Note that in an output operand which can be matched by another
57: operand, the constraint letter `o' is valid only when
58: accompanied by both `<' (if the target machine has predecrement
59: addressing) and `>' (if the target machine has preincrement
60: addressing).
61:
62: When the constraint letter `o' is used, the reload pass may
63: generate instructions which copy a nonoffsetable address into an
64: index register. The idea is that the register can be used as a
65: replacement offsetable address. But this method requires that
66: there be patterns to copy any kind of address into a register.
67: Auto-increment and auto-decrement addresses are an exception;
68: there need not be an instruction that can copy such an address
69: into a register, because reload handles these cases specially.
70:
71: Most older machine designs have ``load address'' instructions
72: which do just what is needed here. Some RISC machines do not
73: advertise such instructions, but the possible addresses on these
74: machines are very limited, so it is easy to fake them.
75:
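As an illustration of the offsetability rule, here is a small C sketch. It is not taken from the compiler: the address kinds, the 16-bit displacement range and the name `offsetable_p_sketch' are all assumptions made for this example.

```c
#include <stdbool.h>

/* Hypothetical model of a memory address; not GCC's RTL. */
enum addr_kind {
    ADDR_CONSTANT,       /* absolute constant address */
    ADDR_REG_PLUS_DISP,  /* register plus constant displacement */
    ADDR_AUTOINC,        /* autoincrement addressing */
    ADDR_AUTODEC         /* autodecrement addressing */
};

struct address {
    enum addr_kind kind;
    long disp;           /* used only for ADDR_REG_PLUS_DISP */
};

/* Assumed displacement range of an imaginary machine. */
#define MIN_DISP (-32768L)
#define MAX_DISP 32767L

/* Return true if ADDR is still a valid address after a small
   nonnegative OFFSET (for instance the operand width in bytes)
   is added to it.  */
bool offsetable_p_sketch(struct address addr, long offset)
{
    switch (addr.kind) {
    case ADDR_CONSTANT:
        return true;
    case ADDR_REG_PLUS_DISP:
        /* The slightly larger displacement must also be in range. */
        return addr.disp >= MIN_DISP && addr.disp + offset <= MAX_DISP;
    case ADDR_AUTOINC:
    case ADDR_AUTODEC:
        return false;
    }
    return false;
}
```

With these assumed limits, a displacement of 100 is offsetable for a four-byte operand, while a displacement of 32766 is not, because 32770 falls outside the range.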
76: `<'
77: A memory operand with autodecrement addressing (either
78: predecrement or postdecrement) is allowed.
79:
80: `>'
81: A memory operand with autoincrement addressing (either
82: preincrement or postincrement) is allowed.
83:
84: `r'
85: A register operand is allowed provided that it is in a general
86: register.
87:
88: `d', `a', `f', ...
89: Other letters can be defined in machine-dependent fashion to
90: stand for particular classes of registers. `d', `a' and `f' are
91: defined on the 68000/68020 to stand for data, address and
92: floating point registers.
93:
94: `i'
95: An immediate integer operand (one with constant value) is allowed.
96: This includes symbolic constants whose values will be known only
97: at assembly time.
98:
99: `n'
100: An immediate integer operand with a known numeric value is
101: allowed. Many systems cannot support assembly-time constants
102: for operands less than a word wide. Constraints for these
103: operands should use `n' rather than `i'.
104:
105: `I', `J', `K', ...
106: Other letters in the range `I' through `M' may be defined in a
107: machine-dependent fashion to permit immediate integer operands
108: with explicit integer values in specified ranges. For example,
109: on the 68000, `I' is defined to stand for the range of values 1
110: to 8. This is the range permitted as a shift count in the shift
111: instructions.
112:
113: `F'
114: An immediate floating operand (expression code `const_double')
115: is allowed.
116:
117: `G', `H'
118: `G' and `H' may be defined in a machine-dependent fashion to
119: permit immediate floating operands in particular ranges of values.
120:
121: `s'
122: An immediate integer operand whose value is not an explicit
123: integer is allowed.
124:
125: This might appear strange; if an insn allows a constant operand
126: with a value not known at compile time, it certainly must allow
127: any known value. So why use `s' instead of `i'? Sometimes it
128: allows better code to be generated.
129:
130: For example, on the 68000 in a fullword instruction it is
131: possible to use an immediate operand; but if the immediate value
132: is between -32 and 31, better code results from loading the
133: value into a register and using the register. This is because
134: the load into the register can be done with a `moveq'
135: instruction. We arrange for this to happen by defining the
136: letter `K' to mean ``any integer outside the range -32 to 31'',
137: and then specifying `Ks' in the operand constraints.
138:
139: `g'
140: Any register, memory or immediate integer operand is allowed,
141: except for registers that are not general registers.
142:
143: `N' (a digit)
144: An operand that matches operand number N is allowed. If a digit
145: is used together with letters, the digit should come last.
146:
147: This is called a "matching constraint" and what it really means
148: is that the assembler has only a single operand that fills two
149: roles considered separate in the RTL insn. For example, an add
150: insn has two input operands and one output operand in the RTL,
151: but on most machines an add instruction really has only two
152: operands, one of them an input-output operand.
153:
154: Matching constraints work only in circumstances like that add
155: insn. More precisely, the matching constraint must appear in an
156: input-only operand and the operand that it matches must be an
157: output-only operand with a lower number.
158:
159: For operands to match in a particular case usually means that
160: they are identical-looking RTL expressions. But in a few
161: special cases specific kinds of dissimilarity are allowed. For
162: example, `*x' as an input operand will match `*x++' as an output
163: operand. For proper results in such cases, the output template
164: should always use the output-operand's number when printing the
165: operand.
166:
167: `p'
168: An operand that is a valid memory address is allowed. This is
169: for ``load address'' and ``push address'' instructions.
170:
171: If `p' is used in the constraint, the test-function in the
172: `match_operand' must be `address_operand'.
173:
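The range letters and the `Ks' idiom described above can be sketched in C as follows. This is only an illustration using the 68000 ranges quoted in the text; the function names and the `struct immediate' type are hypothetical and do not come from the compiler's sources.

```c
#include <stdbool.h>

/* Return true if VALUE is an acceptable explicit integer for the
   constraint LETTER, using the ranges the text gives for the 68000. */
bool const_ok_for_letter_sketch(long value, char letter)
{
    switch (letter) {
    case 'I':   /* shift counts: 1 to 8 */
        return value >= 1 && value <= 8;
    case 'K':   /* any integer outside the range -32 to 31 */
        return value < -32 || value > 31;
    default:    /* letter not defined in this sketch */
        return false;
    }
}

/* Hypothetical model of an immediate operand: either an explicit
   integer or a symbolic constant with no explicit value. */
struct immediate {
    bool is_symbolic;
    long value;     /* meaningful only when is_symbolic is false */
};

/* Return true if IMM satisfies the constraint string "Ks": a
   symbolic constant (`s') or an explicit integer accepted by `K'. */
bool satisfies_Ks_sketch(struct immediate imm)
{
    if (imm.is_symbolic)
        return true;
    return const_ok_for_letter_sketch(imm.value, 'K');
}
```

An explicit value such as 5 fails `Ks' here, so it would be reloaded into a register, which is the effect the `moveq' example is after.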
174: In order to have valid assembler code, each operand must satisfy its
175: constraint. But a failure to do so does not prevent the pattern from
176: applying to an insn. Instead, it directs the compiler to modify the
177: code so that the constraint will be satisfied. Usually this is done
178: by copying an operand into a register.
179:
180: Contrast, therefore, the two instruction patterns that follow:
181:
182: (define_insn ""
183: [(set (match_operand:SI 0 "general_operand" "r")
184: (plus:SI (match_dup 0)
185: (match_operand:SI 1 "general_operand" "r")))]
186: ""
187: "...")
188:
189: which has two operands, one of which must appear in two places, and
190:
191: (define_insn ""
192: [(set (match_operand:SI 0 "general_operand" "r")
193: (plus:SI (match_operand:SI 1 "general_operand" "0")
194: (match_operand:SI 2 "general_operand" "r")))]
195: ""
196: "...")
197:
198: which has three operands, two of which are required by a constraint
199: to be identical. If we are considering an insn of the form
200:
201: (insn N PREV NEXT
202: (set (reg:SI 3)
203: (plus:SI (reg:SI 6) (reg:SI 109)))
204: ...)
205:
206: the first pattern would not apply at all, because this insn does not
207: contain two identical subexpressions in the right place. The pattern
208: would say, ``That does not look like an add instruction; try other
209: patterns.'' The second pattern would say, ``Yes, that's an add
210: instruction, but there is something wrong with it.'' It would direct
211: the reload pass of the compiler to generate additional insns to make
212: the constraint true. The results might look like this:
213:
214: (insn N2 PREV N
215: (set (reg:SI 3) (reg:SI 6))
216: ...)
217:
218: (insn N N2 NEXT
219: (set (reg:SI 3)
220: (plus:SI (reg:SI 3) (reg:SI 109)))
221: ...)
222:
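The fix-up shown above can be modelled by a few lines of C. This is a deliberately simplified sketch: the `struct insn' type and the function `fix_matching_add' are invented for the example, registers are plain integers, and the real reload pass handles many cases that are ignored here.

```c
/* Hypothetical, simplified insn forms; real RTL is far richer. */
enum op { OP_COPY, OP_ADD };

struct insn {
    enum op op;
    int dest;   /* destination register number */
    int src1;   /* first source register */
    int src2;   /* second source register; -1 for OP_COPY */
};

/* Rewrite the add DEST = SRC1 + SRC2 so that its first source is the
   same register as its destination, as the matching constraint "0"
   on operand 1 demands.  The resulting insns are stored in OUT (room
   for two) and their count is returned.  Returns 0 when DEST equals
   SRC2 but not SRC1: this sketch does not handle that case, because
   the copy would overwrite SRC2. */
int fix_matching_add(int dest, int src1, int src2, struct insn out[2])
{
    int n = 0;

    if (dest != src1) {
        if (dest == src2)
            return 0;
        /* Extra insn: copy the first source into the destination. */
        out[n].op = OP_COPY;
        out[n].dest = dest;
        out[n].src1 = src1;
        out[n].src2 = -1;
        n++;
    }
    /* The add itself, now with matching operands 0 and 1. */
    out[n].op = OP_ADD;
    out[n].dest = dest;
    out[n].src1 = dest;
    out[n].src2 = src2;
    n++;
    return n;
}
```

Called with registers 3, 6 and 109, the function produces the same two insns as the example: a copy of register 6 into register 3, then an add of register 109 into register 3.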
223: It is up to you to make sure that each operand, in each pattern, has
224: constraints that can handle any RTL expression that could be present
225: for that operand. (When multiple alternatives are in use, each
226: pattern must, for each possible combination of operand expressions,
227: have at least one alternative which can handle that combination of
228: operands.) The constraints don't need to *allow* any possible
229: operand--when this is the case, they do not constrain--but they must
230: at least point the way to reloading any possible operand so that it
231: will fit.
232:
233: * If the constraint accepts whatever operands the predicate
234: permits, there is no problem: reloading is never necessary for
235: this operand.
236:
237: For example, an operand whose constraints permit everything
238: except registers is safe provided its predicate rejects registers.
239:
240: An operand whose predicate accepts only constant values is safe
241: provided its constraints include the letter `i'. If any
242: possible constant value is accepted, then nothing less than `i'
243: will do; if the predicate is more selective, then the
244: constraints may also be more selective.
245:
246: * Any operand expression can be reloaded by copying it into a
247: register. So if an operand's constraints allow some kind of
248: register, it is certain to be safe. It need not permit all
249: classes of registers; the compiler knows how to copy a register
250: into another register of the proper class in order to make an
251: instruction valid.
252:
253: * A nonoffsetable memory reference can be reloaded by copying the
254: address into a register. So if the constraint uses the letter
255: `o', all memory references are taken care of.
256:
257: * A constant operand can be reloaded by storing it in memory; it
258: then becomes an offsetable memory reference. So if the
259: constraint uses the letters `o' or `m', constant operands are
260: not a problem.
261:
262: If the operand's predicate can recognize registers, but the
263: constraint does not permit them, it can make the compiler crash.
264: When this operand happens to be a register, the reload pass will be
265: stymied, because it does not know how to copy a register temporarily
266: into memory.
267:
268:
269:
270: File: gcc.info, Node: Multi-Alternative, Next: Class Preferences, Prev: Simple Constraints, Up: Constraints
271:
272: Multiple Alternative Constraints
273: --------------------------------
274:
275: Sometimes a single instruction has multiple alternative sets of
276: possible operands. For example, on the 68000, a logical-or
277: instruction can combine a register or an immediate value into memory,
278: or it can combine any kind of operand into a register; but it cannot
279: combine one memory location into another.
280:
281: These constraints are represented as multiple alternatives. An
282: alternative can be described by a series of letters for each operand.
283: The overall constraint for an operand is made from the letters for
284: this operand from the first alternative, a comma, the letters for
285: this operand from the second alternative, a comma, and so on until
286: the last alternative. Here is how it is done for fullword logical-or
287: on the 68000:
288:
289: (define_insn "iorsi3"
290: [(set (match_operand:SI 0 "general_operand" "=%m,d")
291: (ior:SI (match_operand:SI 1 "general_operand" "0,0")
292: (match_operand:SI 2 "general_operand" "dKs,dmKs")))]
293: ...)
294:
295: The first alternative has `m' (memory) for operand 0, `0' for operand
296: 1 (meaning it must match operand 0), and `dKs' for operand 2. The
297: second alternative has `d' (data register) for operand 0, `0' for
298: operand 1, and `dmKs' for operand 2. The `=' and `%' in the
299: constraint for operand 0 are not part of any alternative; their
300: meaning is explained in the next section.
301:
302: If all the operands fit any one alternative, the instruction is valid.
303: Otherwise, for each alternative, the compiler counts how many
304: instructions must be added to copy the operands so that that
305: alternative applies. The alternative requiring the least copying is
306: chosen. If two alternatives need the same amount of copying, the one
307: that comes first is chosen. These choices can be altered with the
308: `?' and `!' characters:
309:
310: `?'
311: Disparage slightly the alternative that the `?' appears in, as a
312: choice when no alternative applies exactly. The compiler
313: regards this alternative as one unit more costly for each `?'
314: that appears in it.
315:
316: `!'
317: Disparage severely the alternative that the `!' appears in.
318: When operands must be copied into registers, the compiler will
319: never choose this alternative as the one to strive for.
320:
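The selection rule just described can be sketched in C. This is only a simplified reading of the text, not the compiler's actual algorithm: the `struct alternative' type and the function name are hypothetical, and the cost is taken to be one unit per added insn plus one unit per `?'.

```c
/* Hypothetical summary of one alternative for a given insn. */
struct alternative {
    int copies_needed;   /* insns that must be added to make it fit */
    int question_marks;  /* number of `?' characters it contains */
    int has_bang;        /* nonzero if it contains `!' */
};

/* Return the index of the alternative to use among the N entries of
   ALTS, or -1 if none can be chosen. */
int choose_alternative_sketch(const struct alternative *alts, int n)
{
    int best = -1;
    int best_cost = 0;
    int i;

    /* An alternative that fits exactly makes the insn valid as is;
       `?' and `!' play no part in that case. */
    for (i = 0; i < n; i++)
        if (alts[i].copies_needed == 0)
            return i;

    for (i = 0; i < n; i++) {
        int cost;

        if (alts[i].has_bang)
            continue;   /* never a goal when copying is needed */
        cost = alts[i].copies_needed + alts[i].question_marks;
        /* Strict comparison keeps the earliest alternative on ties. */
        if (best < 0 || cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return best;
}
```

For instance, two alternatives that each need one copy tie, so the first is chosen; adding a `?' to the first makes the second cheaper.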
321: When an insn pattern has multiple alternatives in its constraints,
322: often the appearance of the assembler code is determined mostly by which
323: alternative was matched. When this is so, the C code for writing the
324: assembler code can use the variable `which_alternative', which is the
325: ordinal number of the alternative that was actually satisfied (0 for
326: the first, 1 for the second alternative, etc.). For example:
327:
328: (define_insn ""
329: [(set (match_operand:SI 0 "general_operand" "r,m")
330: (const_int 0))]
331: ""
332: "*
333: return (which_alternative == 0
334: ? \"clrreg %0\" : \"clrmem %0\");
335: ")
336:
337:
338:
339: File: gcc.info, Node: Class Preferences, Next: Modifiers, Prev: Multi-Alternative, Up: Constraints
340:
341: Register Class Preferences
342: --------------------------
343:
344: The operand constraints have another function: they enable the
345: compiler to decide which kind of hardware register a pseudo register
346: is best allocated to. The compiler examines the constraints that
347: apply to the insns that use the pseudo register, looking for the
348: machine-dependent letters such as `d' and `a' that specify classes of
349: registers. The pseudo register is put in whichever class gets the
350: most ``votes''. The constraint letters `g' and `r' also vote: they
351: vote in favor of a general register. The machine description says
352: which registers are considered general.
353:
354: Of course, on some machines all registers are equivalent, and no
355: register classes are defined. Then none of this complexity is
356: relevant.
357:
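The voting scheme can be pictured with the following C sketch. It is an assumption-laden illustration, not the compiler's code: the class names, the one-vote-per-letter weighting and the tie-breaking rule are all invented for the example.

```c
#include <stddef.h>

/* Hypothetical register classes, loosely modelled on the 68000. */
enum reg_class { CLASS_GENERAL, CLASS_DATA, CLASS_ADDR, CLASS_FLOAT,
                 N_CLASSES };

/* Tally one vote for each class letter found in the N constraint
   strings that apply to a pseudo register, and return the class with
   the most votes.  Ties go to the class listed first in the enum, an
   arbitrary choice made for this sketch. */
enum reg_class preferred_class_sketch(const char *const *constraints,
                                      size_t n)
{
    int votes[N_CLASSES] = {0};
    enum reg_class best = CLASS_GENERAL;
    size_t i;
    int c;

    for (i = 0; i < n; i++) {
        const char *p;

        for (p = constraints[i]; *p != '\0'; p++) {
            switch (*p) {
            case 'g':
            case 'r':
                votes[CLASS_GENERAL]++;   /* vote for a general register */
                break;
            case 'd':
                votes[CLASS_DATA]++;
                break;
            case 'a':
                votes[CLASS_ADDR]++;
                break;
            case 'f':
                votes[CLASS_FLOAT]++;
                break;
            default:
                break;                    /* other characters do not vote */
            }
        }
    }
    for (c = 0; c < N_CLASSES; c++)
        if (votes[c] > votes[best])
            best = (enum reg_class)c;
    return best;
}
```

A pseudo register used under the constraints `d', `dm' and `a' collects two votes for data registers and one for address registers, so it lands in the data class.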
358:
359:
360: File: gcc.info, Node: Modifiers, Next: No Constraints, Prev: Class Preferences, Up: Constraints
361:
362: Constraint Modifier Characters
363: ------------------------------
364:
365: `='
366: Means that this operand is write-only for this instruction: the
367: previous value is discarded and replaced by output data.
368:
369: `+'
370: Means that this operand is both read and written by the
371: instruction.
372:
373: When the compiler fixes up the operands to satisfy the
374: constraints, it needs to know which operands are inputs to the
375: instruction and which are outputs from it. `=' identifies an
376: output; `+' identifies an operand that is both input and output;
377: all other operands are assumed to be input only.
378:
379: `&'
380: Means (in a particular alternative) that this operand is written
381: before the instruction is finished using the input operands.
382: Therefore, this operand may not lie in a register that is used
383: as an input operand or as part of any memory address.
384:
385: `&' applies only to the alternative in which it is written. In
386: constraints with multiple alternatives, sometimes one
387: alternative requires `&' while others do not. See, for example,
388: the `movdf' insn of the 68000.
389:
390: `&' does not obviate the need to write `='.
391:
392: `%'
393: Declares the instruction to be commutative for this operand and
394: the following operand. This means that the compiler may
395: interchange the two operands if that is the cheapest way to make
396: all operands fit the constraints. This is often used in
397: patterns for addition instructions that really have only two
398: operands: the result must go in one of the arguments. Here, for
399: example, is how the 68000 halfword-add instruction is defined:
400:
401: (define_insn "addhi3"
402: [(set (match_operand:HI 0 "general_operand" "=m,r")
403: (plus:HI (match_operand:HI 1 "general_operand" "%0,0")
404: (match_operand:HI 2 "general_operand" "di,g")))]
405: ...)
406:
407: Note that in previous versions of GNU CC the `%' constraint
408: modifier always applied to operands 1 and 2 regardless of which
409: operand it was written in. The usual custom was to write it in
410: operand 0. Now it must be in operand 1 if the operands to be
411: exchanged are 1 and 2.
412:
413: `#'
414: Says that all following characters, up to the next comma, are to
415: be ignored as a constraint. They are significant only for
416: choosing register preferences.
417:
418: `*'
419: Says that the following character should be ignored when
420: choosing register preferences. `*' has no effect on the meaning
421: of the constraint as a constraint.
422:
423: Here is an example: the 68000 has an instruction to sign-extend
424: a halfword in a data register, and can also sign-extend a value
425: by copying it into an address register. While either kind of
426: register is acceptable, the constraints on an address-register
427: destination are less strict, so it is best if register
428: allocation makes an address register its goal. Therefore, `*'
429: is used so that the `d' constraint letter (for data register) is
430: ignored when computing register preferences.
431:
432: (define_insn "extendhisi2"
433: [(set (match_operand:SI 0 "general_operand" "=*d,a")
434: (sign_extend:SI
435: (match_operand:HI 1 "general_operand" "0,g")))]
436: ...)
437:
438:
439:
440: File: gcc.info, Node: No Constraints, Prev: Modifiers, Up: Constraints
441:
442: Not Using Constraints
443: ---------------------
444:
445: Some machines are so clean that operand constraints are not required.
446: For example, on the Vax, an operand valid in one context is valid in
447: any other context. On such a machine, every operand constraint would
448: be `g', excepting only operands of ``load address'' instructions
449: which are written as if they referred to a memory location's contents
450: but actually refer to its address. They would have constraint `p'.
451:
452: For such machines, instead of writing `g' and `p' for all the
453: constraints, you can choose to write a description with empty
454: constraints. Then you write `""' for the constraint in every
455: `match_operand'. Address operands are identified by writing an
456: `address' expression around the `match_operand', not by their
457: constraints.
458:
459: When the machine description has just empty constraints, certain
460: parts of compilation are skipped, making the compiler faster.
461:
462:
463:
464: File: gcc.info, Node: Standard Names, Next: Pattern Ordering, Prev: Constraints, Up: Machine Desc
465:
466: Standard Names for Patterns Used in Generation
467: ==============================================
468:
469: Here is a table of the instruction names that are meaningful in the
470: RTL generation pass of the compiler. Giving one of these names to an
471: instruction pattern tells the RTL generation pass that it can use the
472: pattern to accomplish a certain task.
473:
474: `movM'
475: Here M is a two-letter machine mode name, in lower case. This
476: instruction pattern moves data with that machine mode from
477: operand 1 to operand 0. For example, `movsi' moves full-word
478: data.
479:
480: If operand 0 is a `subreg' with mode M of a register whose
481: natural mode is wider than M, the effect of this instruction is
482: to store the specified value in the part of the register that
483: corresponds to mode M. The effect on the rest of the register
484: is undefined.
485:
486: This class of patterns is special in several ways. First of
487: all, each of these names *must* be defined, because there is no
488: other way to copy a datum from one place to another.
489:
490: Second, these patterns are not used solely in the RTL generation
491: pass. Even the reload pass can generate move insns to copy
492: values from stack slots into temporary registers. When it does
493: so, one of the operands is a hard register and the other is an
494: operand that can have a reload.
495:
496: Therefore, when given such a pair of operands, the pattern must
497: generate RTL which needs no temporary registers--no registers
498: other than the operands. For example, if you support the
499: pattern with a `define_expand', then in such a case you mustn't
500: call `force_reg' or any other such function which might generate
501: new pseudo registers.
502:
503: This requirement exists even for subword modes on a RISC machine
504: where fetching those modes from memory normally requires several
505: insns and some temporary registers. Look in `spur.md' to see
506: how the requirement is satisfied.
507:
508: The variety of operands that have reloads depends on the rest of
509: the machine description, but typically on a RISC machine these
510: can only be pseudo registers that did not get hard registers,
511: while on other machines explicit memory references will get
512: optional reloads.
513:
514: In addition, the constraints must allow any hard register to be
515: moved to any other hard register (provided that
516: `HARD_REGNO_MODE_OK' permits mode M in each of the registers).
517:
518: `movstrictM'
519: Like `movM' except that if operand 0 is a `subreg' with mode M
520: of a register whose natural mode is wider, the `movstrictM'
521: instruction is guaranteed not to alter any of the register
522: except the part which belongs to mode M.
523:
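The difference between `movM' and `movstrictM' on a `subreg' can be illustrated in C. The sketch below assumes a 32-bit register whose `HImode' part is the low-order half; the function names are hypothetical, and sign extension is just one example of what a plain move might do to the rest of the register.

```c
#include <stdint.h>

/* Model of a `movstricthi': replace the low 16 bits of REG with
   VALUE and leave the upper 16 bits exactly as they were. */
uint32_t movstrict_hi_model(uint32_t reg, uint16_t value)
{
    return (reg & UINT32_C(0xFFFF0000)) | (uint32_t)value;
}

/* One behaviour a plain `movhi' is allowed to have: the low 16 bits
   receive VALUE, and the upper half is overwritten, here by sign
   extension.  The old contents of REG are not preserved. */
uint32_t mov_hi_sign_extending_model(uint32_t reg, uint16_t value)
{
    (void)reg;
    return (uint32_t)value
           | ((value & 0x8000u) ? UINT32_C(0xFFFF0000) : 0u);
}
```

Both functions agree on the low half of the result; only the strict version promises anything about the upper half.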
524: `addM3'
525: Add operand 2 and operand 1, storing the result in operand 0.
526: All operands must have mode M. This can be used even on
527: two-address machines, by means of constraints requiring operands
528: 1 and 0 to be the same location.
529:
530: `subM3', `mulM3', `umulM3', `divM3', `udivM3', `modM3', `umodM3', `andM3', `iorM3', `xorM3'
531: Similar, for other arithmetic operations.
532:
533: There are special considerations for register classes for
534: logical-and instructions, affecting also the macro
535: `PREFERRED_RELOAD_CLASS'. They apply not only to the patterns
536: with these standard names, but to any patterns that will match
537: such an instruction. *Note Register Classes::.
538:
539: `mulhisi3'
540: Multiply operands 1 and 2, which have mode `HImode', and store a
541: `SImode' product in operand 0.
542:
543: `mulqihi3', `mulsidi3'
544: Similar widening-multiplication instructions of other widths.
545:
546: `umulqihi3', `umulhisi3', `umulsidi3'
547: Similar widening-multiplication instructions that do unsigned
548: multiplication.
549:
550: `divmodM4'
551: Signed division that produces both a quotient and a remainder.
552: Operand 1 is divided by operand 2 to produce a quotient stored
553: in operand 0 and a remainder stored in operand 3.
554:
555: `udivmodM4'
556: Similar, but does unsigned division.
557:
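The operand numbering of `divmodM4' can be sketched in C with the standard `div' function. The function name `divmod_si_model' is hypothetical, and the rounding of negative quotients (toward zero) is that of C's `div', which is an assumption here since the text does not specify it.

```c
#include <stdlib.h>

/* Model of a `divmodsi4': divide operand 1 by operand 2, storing the
   quotient in operand 0 and the remainder in operand 3.  OP2 must be
   nonzero. */
void divmod_si_model(int *op0, int op1, int op2, int *op3)
{
    div_t r = div(op1, op2);   /* one operation yields both results */

    *op0 = r.quot;
    *op3 = r.rem;
}
```

Dividing 17 by 5 this way stores 3 in operand 0 and 2 in operand 3.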
558: `divmodMN4'
559: Like `divmodM4' except that only the dividend has mode M; the
560: divisor, quotient and remainder have mode N. For example, the
561: Vax has a `divmoddisi4' instruction (but it is omitted from the
562: machine description, because it is so slow that it is faster to
563: compute remainders by the circumlocution that the compiler will
564: use if this instruction is not available).
565:
566: `ashlM3'
567: Arithmetic-shift operand 1 left by a number of bits specified by
568: operand 2, and store the result in operand 0. Operand 2 has
569: mode `SImode', not mode M.
570:
571: `ashrM3', `lshlM3', `lshrM3', `rotlM3', `rotrM3'
572: Other shift and rotate instructions.
573:
574: Logical and arithmetic left shift are the same. Machines that
575: do not allow negative shift counts often have only one
576: instruction for shifting left. On such machines, you should
577: define a pattern named `ashlM3' and leave `lshlM3' undefined.
578:
579: There are special considerations for register classes for shift
580: instructions, affecting also the macro `PREFERRED_RELOAD_CLASS'.
581: They apply not only to the patterns with these standard names,
582: but to any patterns that will match such an instruction. *Note
583: Register Classes::.
584:
585: `negM2'
586: Negate operand 1 and store the result in operand 0.
587:
588: `absM2'
589: Store the absolute value of operand 1 into operand 0.
590:
591: `sqrtM2'
592: Store the square root of operand 1 into operand 0.
593:
594: `ffsM2'
595: Store into operand 0 one plus the index of the least significant
596: 1-bit of operand 1. If operand 1 is zero, store zero. M is the
597: mode of operand 0; operand 1's mode is specified by the
598: instruction pattern, and the compiler will convert the operand
599: to that mode before generating the instruction.
600:
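The value computed by `ffsM2' can be stated in C as follows. This is a plain reference model written for this manual (the name `ffs_model' is hypothetical), not the code the compiler generates.

```c
/* Return one plus the index of the least significant 1-bit of X,
   or zero if X is zero. */
int ffs_model(unsigned int x)
{
    int pos = 1;

    if (x == 0)
        return 0;
    while ((x & 1u) == 0) {
        x >>= 1;    /* discard a zero bit and move to the next one */
        pos++;
    }
    return pos;
}
```

For example, the value 12 (binary 1100) gives 3, because its lowest 1-bit has index 2.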
601: `one_cmplM2'
602: Store the bitwise-complement of operand 1 into operand 0.
603:
604: `cmpM'
605: Compare operand 0 and operand 1, and set the condition codes.
606: The RTL pattern should look like this:
607:
608: (set (cc0) (minus (match_operand:M 0 ...)
609: (match_operand:M 1 ...)))
610:
611: Each such definition in the machine description, for integer
612: mode M, must have a corresponding `tstM' pattern, because
613: optimization can simplify the compare into a test when operand 1
614: is zero.
615:
616: `tstM'
617: Compare operand 0 against zero, and set the condition codes.
618: The RTL pattern should look like this:
619:
620: (set (cc0) (match_operand:M 0 ...))
621:
622: `movstrM'
623: Block move instruction. The addresses of the destination and
624: source strings are the first two operands, and both are in mode
625: `Pmode'. The number of bytes to move is the third operand, in
626: mode M.
627:
628: `cmpstrM'
629: Block compare instruction, with operands like `movstrM' except
630: that the two memory blocks are compared byte by byte in
631: lexicographic order. The effect of the instruction is to set
632: the condition codes.
633:
634: `floatMN2'
635: Convert operand 1 (valid for fixed point mode M) to floating
636: point mode N and store in operand 0 (which has mode N).
637:
638: `fixMN2'
639: Convert operand 1 (valid for floating point mode M) to fixed
640: point mode N as a signed number and store in operand 0 (which
641: has mode N). This instruction's result is defined only when the
642: value of operand 1 is an integer.
643:
644: `fixunsMN2'
645: Convert operand 1 (valid for floating point mode M) to fixed
646: point mode N as an unsigned number and store in operand 0 (which
647: has mode N). This instruction's result is defined only when the
648: value of operand 1 is an integer.
649:
650: `ftruncM2'
651: Convert operand 1 (valid for floating point mode M) to an
652: integer value, still represented in floating point mode M, and
653: store it in operand 0 (valid for floating point mode M).
654:
655: `fix_truncMN2'
656: Like `fixMN2' but works for any floating point value of mode M
657: by converting the value to an integer.
658:
659: `fixuns_truncMN2'
660: Like `fixunsMN2' but works for any floating point value of mode
661: M by converting the value to an integer.
662:
663: `truncMN2'
664: Truncate operand 1 (valid for mode M) to mode N and store in
665: operand 0 (which has mode N). Both modes must be fixed point or
666: both floating point.
667:
668: `extendMN2'
669: Sign-extend operand 1 (valid for mode M) to mode N and store in
670: operand 0 (which has mode N). Both modes must be fixed point or
671: both floating point.
672:
673: `zero_extendMN2'
674: Zero-extend operand 1 (valid for mode M) to mode N and store in
675: operand 0 (which has mode N). Both modes must be fixed point.
676:
677: `extv'
678: Extract a bit-field from operand 1 (a register or memory
679: operand), where operand 2 specifies the width in bits and
680: operand 3 the starting bit, and store it in operand 0. Operand
681: 0 must have `SImode'. Operand 1 may have mode `QImode' or
682: `SImode'; often `SImode' is allowed only for registers.
683: Operands 2 and 3 must be valid for `SImode'.
684:
685: The RTL generation pass generates this instruction only with
686: constants for operands 2 and 3.
687:
688: The bit-field value is sign-extended to a full word integer
689: before it is stored in operand 0.
690:
691: `extzv'
692: Like `extv' except that the bit-field value is zero-extended.
693:
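The difference between `extv' and `extzv' can be shown with a C model. The sketch assumes a 32-bit operand and counts the starting bit from the least significant end; how bits are numbered on a real machine depends on the machine description, so treat that as an assumption. The function names are hypothetical.

```c
#include <stdint.h>

/* Model of `extzv': extract WIDTH bits of WORD starting at bit START
   and zero-extend them.  Requires 1 <= WIDTH <= 32 and
   START + WIDTH <= 32. */
uint32_t extract_zero_model(uint32_t word, unsigned width, unsigned start)
{
    uint32_t mask = (width == 32) ? UINT32_C(0xFFFFFFFF)
                                  : ((UINT32_C(1) << width) - 1u);

    return (word >> start) & mask;
}

/* Model of `extv': the same field, sign-extended to a full word. */
int32_t extract_signed_model(uint32_t word, unsigned width, unsigned start)
{
    uint32_t field = extract_zero_model(word, width, start);
    uint32_t sign = UINT32_C(1) << (width - 1);

    /* Flipping the field's top bit and subtracting its weight copies
       that bit into all higher positions. */
    return (int32_t)((int64_t)(field ^ sign) - (int64_t)sign);
}
```

Extracting the four bits at position 4 of the word 0xF0 gives 15 when zero-extended and -1 when sign-extended.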
694: `insv'
695: Store operand 3 (which must be valid for `SImode') into a
696: bit-field in operand 0, where operand 1 specifies the width in
697: bits and operand 2 the starting bit. Operand 0 may have mode
698: `QImode' or `SImode'; often `SImode' is allowed only for
699: registers. Operands 1 and 2 must be valid for `SImode'.
700:
701: The RTL generation pass generates this instruction only with
702: constants for operands 1 and 2.
703:
704: `sCOND'
705: Store zero or nonzero in the operand according to the condition
706: codes. Value stored is nonzero iff the condition COND is true.
707: COND is the name of a comparison operation expression code, such
708: as `eq', `lt' or `leu'.
709:
710: You specify the mode that the operand must have when you write
711: the `match_operand' expression. The compiler automatically sees
712: which mode you have used and supplies an operand of that mode.
713:
714: The value stored for a true condition must have 1 as its low bit.
715: Otherwise the instruction is not suitable and must be omitted
716: from the machine description. You must tell the compiler
717: exactly which value is stored by defining the macro
718: `STORE_FLAG_VALUE'.
719:
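The requirement on the stored value can be illustrated with a short C sketch. The value given to `STORE_FLAG_VALUE' below is only an assumption for the example, the comparison is done directly rather than through condition codes, and the function names are hypothetical.

```c
/* Value assumed for this sketch; the real value is machine-specific. */
#define STORE_FLAG_VALUE 1

/* Model of an `slt' pattern: store the flag value if A is less than
   B, and zero otherwise. */
int store_lt_model(int a, int b)
{
    return (a < b) ? STORE_FLAG_VALUE : 0;
}

/* Return nonzero if V is acceptable as the value stored for a true
   condition, that is, if its low bit is 1. */
int flag_value_ok(int v)
{
    return (v & 1) == 1;
}
```

Under this test a value of 1 is acceptable and a value of 2 is not.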
720: `bCOND'
721: Conditional branch instruction. Operand 0 is a `label_ref' that
722: refers to the label to jump to. Jump if the condition codes
723: meet condition COND.
724:
725: `call'
726: Subroutine call instruction returning no value. Operand 0 is
727: the function to call; operand 1 is the number of bytes of
728: arguments pushed (in mode `SImode', except it is normally a
729: `const_int'); operand 2 is the number of registers used as
730: operands.
731:
732: On most machines, operand 2 is not actually stored into the RTL
733: pattern. It is supplied for the sake of some RISC machines
734: which need to put this information into the assembler code; they
735: can put it in the RTL instead of operand 1.
736:
737: Operand 0 should be a `mem' RTX whose address is the address of
738: the function.
739:
740: `call_value'
741: Subroutine call instruction returning a value. Operand 0 is the
742: hard register in which the value is returned. There are three
743: more operands, the same as the three operands of the `call'
744: instruction (but with numbers increased by one).
745:
746: Subroutines that return `BLKmode' objects use the `call' insn.
747:
748: `return'
749: Subroutine return instruction. This instruction pattern name
750: should be defined only if a single instruction can do all the
751: work of returning from a function.
752:
753: `casesi'
754: Instruction to jump through a dispatch table, including bounds
755: checking. This instruction takes five operands:
756:
757: 1. The index to dispatch on, which has mode `SImode'.
758:
759: 2. The lower bound for indices in the table, an integer
760: constant.
761:
762: 3. The upper bound for indices in the table, an integer
763: constant.
764:
765: 4. A label to jump to if the index has a value outside the
766: bounds. (If the machine-description macro
767: `CASE_DROPS_THROUGH' is defined, then an out-of-bounds
768: index drops through to the code following the jump table
769: instead of jumping to this label. In that case, this label
770: is not actually used by the `casesi' instruction, but it is
771: always provided as an operand.)
772:
773: 5. A label that precedes the table itself.
774:
775: The table is an `addr_vec' or `addr_diff_vec' inside of a
776: `jump_insn'. The number of elements in the table is one plus
777: the difference between the upper bound and the lower bound.
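
As a rough illustration of the semantics just described (a sketch only, not the compiler's own code), the following C fragment models which table entry a `casesi' dispatch selects and how many entries the table holds. The function names `casesi_slot' and `casesi_table_size' are hypothetical, and -1 stands in for "jump to the out-of-bounds label".

```c
/* Hypothetical model of a `casesi' dispatch; names are illustrative.
   Returns the zero-based table slot selected by INDEX, or -1 when
   INDEX is outside [LOWER, UPPER], which stands for the jump to the
   out-of-bounds label (operand 4).  */
long
casesi_slot (long index, long lower, long upper)
{
  if (index < lower || index > upper)
    return -1;
  return index - lower;
}

/* Number of entries the dispatch table must contain:
   one plus the difference between the bounds.  */
long
casesi_table_size (long lower, long upper)
{
  return upper - lower + 1;
}
```

For bounds 3 and 7 the table has five entries, and an index of 7 selects the last of them.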
778:
779: `tablejump'
780: Instruction to jump to a variable address. This is a low-level
781: capability which can be used to implement a dispatch table when
782: there is no `casesi' pattern.
783:
784: This pattern requires two operands: the address or offset, and a
785: label which should immediately precede the jump table. If the
786: macro `CASE_VECTOR_PC_RELATIVE' is defined then the first
787: operand is an offset which counts from the address of the table;
788: otherwise, it is an absolute address to jump to.
789:
790: The `tablejump' insn is always the last insn before the jump
791: table it uses. Its assembler code normally has no need to use
792: the second operand, but you should incorporate it in the RTL
793: pattern so that the jump optimizer will not delete the table as
794: unreachable code.
795:
796:
797:
798: File: gcc.info, Node: Pattern Ordering, Next: Dependent Patterns, Prev: Standard Names, Up: Machine Desc
799:
800: When the Order of Patterns Matters
801: ==================================
802:
803: Sometimes an insn can match more than one instruction pattern. Then
804: the pattern that appears first in the machine description is the one
805: used. Therefore, more specific patterns (patterns that will match
806: fewer things) and faster instructions (those that will produce better
807: code when they do match) should usually go first in the description.
808:
809: In some cases the effect of ordering the patterns can be used to hide
810: a pattern when it is not valid. For example, the 68000 has an
811: instruction for converting a fullword to floating point and another
812: for converting a byte to floating point. An instruction converting
813: an integer to floating point could match either one. We put the
814: pattern to convert the fullword first to make sure that one will be
815: used rather than the other. (Otherwise a large integer might be
816: generated as a single-byte immediate quantity, which would not work.)
817: Instead of using this pattern ordering it would be possible to make
818: the pattern for convert-a-byte smart enough to deal properly with any
819: constant value.
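
The first-match rule can be pictured with a small C sketch. This is a simplified model under stated assumptions, not how the recognizer is implemented: `struct pattern', `first_match' and the two tables are hypothetical, and both patterns are assumed to accept any integer constant, as in the 68000 example above.

```c
#include <stddef.h>

/* Hypothetical stand-in for an instruction pattern.  */
struct pattern
{
  const char *name;
  int operand_bytes;            /* width the instruction really handles */
  int (*matches) (long value);  /* recognition test */
};

/* A constant carries no machine mode, so both patterns accept it.  */
static int
any_integer (long value)
{
  (void) value;
  return 1;
}

/* Return the first pattern in PATS that matches VALUE, or NULL.  */
const struct pattern *
first_match (const struct pattern *pats, size_t n, long value)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (pats[i].matches (value))
      return &pats[i];
  return NULL;
}

/* Fullword pattern first: the safe ordering.  */
const struct pattern fullword_first[2] = {
  { "float_fullword", 4, any_integer },
  { "float_byte", 1, any_integer }
};

/* Byte pattern first: a large constant would be given a one-byte slot.  */
const struct pattern byte_first[2] = {
  { "float_byte", 1, any_integer },
  { "float_fullword", 4, any_integer }
};
```

With `fullword_first' the constant 100000 is handled by the four-byte pattern; with `byte_first' the one-byte pattern wins even though the value does not fit.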
820:
821:
822:
823: File: gcc.info, Node: Dependent Patterns, Next: Jump Patterns, Prev: Pattern Ordering, Up: Machine Desc
824:
825: Interdependence of Patterns
826: ===========================
827:
828: Every machine description must have a named pattern for each of the
829: conditional branch names `bCOND'. The recognition template must
830: always have the form
831:
832: (set (pc)
833: (if_then_else (COND (cc0) (const_int 0))
834: (label_ref (match_operand 0 "" ""))
835: (pc)))
836:
837: In addition, every machine description must have an anonymous pattern
838: for each of the possible reverse-conditional branches. These
839: patterns look like
840:
841: (set (pc)
842: (if_then_else (COND (cc0) (const_int 0))
843: (pc)
844: (label_ref (match_operand 0 "" ""))))
845:
846: They are necessary because jump optimization can turn
847: direct-conditional branches into reverse-conditional branches.
848:
849: The compiler does more with RTL than just create it from patterns and
850: recognize the patterns: it can perform arithmetic expression codes
851: when constant values for their operands can be determined. As a
852: result, sometimes having one pattern can require other patterns. For
853: example, the Vax has no `and' instruction, but it has `and not'
854: instructions. Here is the definition of one of them:
855:
856: (define_insn "andcbsi2"
857: [(set (match_operand:SI 0 "general_operand" "")
858: (and:SI (match_dup 0)
859: (not:SI (match_operand:SI
860: 1 "general_operand" ""))))]
861: ""
862: "bicl2 %1,%0")
863:
864: If operand 1 is an explicit integer constant, an instruction
865: constructed using that pattern can be simplified into an `and' like
866: this:
867:
868: (set (reg:SI 41)
869: (and:SI (reg:SI 41)
870: (const_int 0xffff7fff)))
871:
872: (where the integer constant is the one's complement of what appeared
873: in the original instruction).
874:
875: To avoid a fatal error, the compiler must have a pattern that
876: recognizes such an instruction. Here is what is used:
877:
878: (define_insn ""
879: [(set (match_operand:SI 0 "general_operand" "")
880: (and:SI (match_dup 0)
881: (match_operand:SI 1 "general_operand" "")))]
882: "GET_CODE (operands[1]) == CONST_INT"
883: "*
884: { operands[1]
885: = gen_rtx (CONST_INT, VOIDmode, ~INTVAL (operands[1]));
886: return \"bicl2 %1,%0\";
887: }")
888:
889: Whereas a pattern to match a general `and' instruction is impossible
890: to support on the Vax, this pattern is possible because it matches
891: only a constant second argument: a special case that can be output as
892: an `and not' instruction.
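
The arithmetic behind this trick can be checked with a short C sketch. It assumes that `bicl2 SRC,DST' clears in DST the bits that are set in SRC; the function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Model of the Vax `bicl2 src,dst' operation (an assumption of this
   sketch): clear in DST every bit that is set in SRC.  */
uint32_t
bicl2 (uint32_t src, uint32_t dst)
{
  return dst & ~src;
}

/* An `and' with a constant MASK, carried out the way the anonymous
   pattern does it: complement the constant, then use `bicl2'.  */
uint32_t
and_const_via_bicl2 (uint32_t mask, uint32_t dst)
{
  return bicl2 (~mask, dst);
}
```

Complementing the constant twice restores it, so `and_const_via_bicl2 (0xffff7fff, x)' yields the same value as `x & 0xffff7fff'.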
893:
894: A ``compare'' instruction whose RTL looks like this:
895:
896: (set (cc0) (minus OPERAND (const_int 0)))
897:
898: may be simplified by optimization into a ``test'' like this:
899:
900: (set (cc0) OPERAND)
901:
902: So in the machine description, each ``compare'' pattern for an
903: integer mode must have a corresponding ``test'' pattern that will
904: match the result of such simplification.
905:
906: In some cases machines support instructions identical except for the
907: machine mode of one or more operands. For example, there may be
908: ``sign-extend halfword'' and ``sign-extend byte'' instructions whose
909: patterns are
910:
911: (set (match_operand:SI 0 ...)
912: (extend:SI (match_operand:HI 1 ...)))
913:
914: (set (match_operand:SI 0 ...)
915: (extend:SI (match_operand:QI 1 ...)))
916:
917: Constant integers do not specify a machine mode, so an instruction to
918: extend a constant value could match either pattern. The pattern it
919: actually will match is the one that appears first in the file. For
920: correct results, this must be the one for the widest possible mode
921: (`HImode', here). If the pattern matches the `QImode' instruction,
922: the results will be incorrect if the constant value does not actually
923: fit that mode.
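
A small worked example in C shows the kind of wrong result at stake. The two functions are hypothetical models of sign-extending the low 16 bits and the low 8 bits of a constant; they are not part of the compiler.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of C, as a halfword extend would.  */
int32_t
extend_from_hi (int32_t c)
{
  return ((c & 0xffff) ^ 0x8000) - 0x8000;
}

/* Sign-extend the low 8 bits of C, as a byte extend would.  */
int32_t
extend_from_qi (int32_t c)
{
  return ((c & 0xff) ^ 0x80) - 0x80;
}
```

The constant 300 fits in a halfword but not in a byte: extending it as a halfword gives 300, while extending it as a byte gives 44.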
924:
925: Such instructions to extend constants are rarely generated because
926: they are optimized away, but they do occasionally happen in
927: nonoptimized compilations.
928:
929: When an instruction has the constraint letter `o', the reload pass
930: may generate instructions which copy a nonoffsetable address into an
931: index register. The idea is that the register can be used as a
932: replacement offsetable address. In order for these generated
933: instructions to work, there must be patterns to copy any kind of
934: valid address into a register.
935:
936: Most older machine designs have ``load address'' instructions which
937: do just what is needed here. Some RISC machines do not advertise
938: such instructions, but the possible addresses on these machines are
939: very limited, so it is easy to fake them.
940:
941: Auto-increment and auto-decrement addresses are an exception; there
942: need not be an instruction that can copy such an address into a
943: register, because reload handles these cases in a different manner.
944:
945:
946:
947: File: gcc.info, Node: Jump Patterns, Next: Peephole Definitions, Prev: Dependent Patterns, Up: Machine Desc
948:
949: Defining Jump Instruction Patterns
950: ==================================
951:
952: GNU CC assumes that the machine has a condition code. A comparison
953: insn sets the condition code, recording the results of both signed
954: and unsigned comparison of the given operands. A separate branch
955: insn tests the condition code and branches or not according to its
956: value. The branch insns come in distinct signed and unsigned
957: flavors. Many common machines, such as the Vax, the 68000 and the
958: 32000, work this way.
959:
960: Some machines have distinct signed and unsigned compare instructions,
961: and only one set of conditional branch instructions. The easiest way
962: to handle these machines is to treat them just like the others until
963: the final stage where assembly code is written. At this time, when
964: outputting code for the compare instruction, peek ahead at the
965: following branch using `NEXT_INSN (insn)'. (The variable `insn'
966: refers to the insn being output, in the output-writing code in an
967: instruction pattern.) If the RTL says that is an unsigned branch,
968: output an unsigned compare; otherwise output a signed compare. When
969: the branch itself is output, you can treat signed and unsigned
970: branches identically.
971:
972: The reason you can do this is that GNU CC always generates a pair of
973: consecutive RTL insns, one to set the condition code and one to test
974: it, and keeps the pair inviolate until the end.
975:
976: To go with this technique, you must define the machine-description
977: macro `NOTICE_UPDATE_CC' to do `CC_STATUS_INIT'; in other words, no
978: compare instruction is superfluous.
979:
980: Some machines have compare-and-branch instructions and no condition
981: code. A similar technique works for them. When it is time to
982: ``output'' a compare instruction, record its operands in two static
983: variables. When outputting the branch-on-condition-code instruction
984: that follows, actually output a compare-and-branch instruction that
985: uses the remembered operands.
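
The remembered-operands technique might be sketched in C as follows. This is only an illustration under assumptions: the function names, the use of strings for operands, and the `cbCOND' mnemonic are all hypothetical, and real output code would work on RTL operands rather than strings.

```c
#include <stdio.h>

/* Operands remembered from the most recent compare (hypothetical).  */
static const char *saved_op0;
static const char *saved_op1;

/* "Output" a compare: emit nothing, only record the operands.
   Returns the number of characters written, which is always 0.  */
int
output_compare (const char *op0, const char *op1)
{
  saved_op0 = op0;
  saved_op1 = op1;
  return 0;
}

/* Output the branch that follows: write one compare-and-branch
   instruction into BUF using the remembered operands.  The `cb'
   mnemonic is invented for this sketch.  Returns the length that
   snprintf reports.  */
int
output_branch (char *buf, size_t size, const char *cond, const char *label)
{
  return snprintf (buf, size, "cb%s %s,%s,%s",
                   cond, saved_op0, saved_op1, label);
}
```

Calling `output_compare ("r1", "r2")' followed by `output_branch' with condition `eq' and label `L5' produces the single line `cbeq r1,r2,L5'.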
986:
987: It also works to define patterns for compare-and-branch instructions.
988: In optimizing compilation, the pair of compare and branch
989: instructions will be combined according to these patterns. But this
990: does not happen if optimization is not requested. So you must use
991: one of the solutions above in addition to any special patterns you
992: define.
993:
994:
995:
996: File: gcc.info, Node: Peephole Definitions, Next: Expander Definitions, Prev: Jump Patterns, Up: Machine Desc
997:
998: Defining Machine-Specific Peephole Optimizers
999: =============================================
1000:
1001: In addition to instruction patterns the `md' file may contain
1002: definitions of machine-specific peephole optimizations.
1003:
1004: The combiner does not notice certain peephole optimizations when the
1005: data flow in the program does not suggest that it should try them.
1006: For example, sometimes two consecutive insns related in purpose can
1007: be combined even though the second one does not appear to use a
1008: register computed in the first one. A machine-specific peephole
1009: optimizer can detect such opportunities.
1010:
1011: A definition looks like this:
1012:
1013: (define_peephole
1014: [INSN-PATTERN-1
1015: INSN-PATTERN-2
1016: ...]
1017: "CONDITION"
1018: "TEMPLATE"
1019: "MACHINE-SPECIFIC INFO")
1020:
1021: The last string operand may be omitted if you are not using any
1022: machine-specific information in this machine description. If
1023: present, it must obey the same rules as in a `define_insn'.
1024:
1025: In this skeleton, INSN-PATTERN-1 and so on are patterns to match
1026: consecutive instructions. The optimization applies to a sequence of
1027: instructions when INSN-PATTERN-1 matches the first one,
1028: INSN-PATTERN-2 matches the next, and so on.
1029:
1030: INSN-PATTERN-1 and so on look *almost* like the second operand of
1031: `define_insn'. There is one important difference: this pattern is an
1032: RTX, not a vector. If the `define_insn' pattern would be a vector of
1033: one element, the INSN-PATTERN should be just that element, no vector.
1034: If the `define_insn' pattern would have multiple elements then the
1035: INSN-PATTERN must place the vector inside an explicit `parallel' RTX.
1036:
1037: The operands of the instructions are matched with `match_operand'
1038: and `match_dup', as usual. What is not usual is that the operand
1039: numbers apply to all the instruction patterns in the definition. So,
1040: you can check for identical operands in two instructions by using
1041: `match_operand' in one instruction and `match_dup' in the other.
1042:
1043: The operand constraints used in `match_operand' patterns do not have
1044: any direct effect on the applicability of the optimization, but they
1045: will be validated afterward, so write constraints that are sure to
1046: fit whenever the optimization is applied. It is safe to use `"g"'
1047: for each operand.
1048:
1049: Once a sequence of instructions matches the patterns, the CONDITION
1050: is checked. This is a C expression which makes the final decision
1051: whether to perform the optimization (do so if the expression is
1052: nonzero). If CONDITION is omitted (in other words, the string is
1053: empty) then the optimization is applied to every sequence of
1054: instructions that matches the patterns.
1055:
1056: The defined peephole optimizations are applied after register
1057: allocation is complete. Therefore, the optimizer can check which
1058: operands have ended up in which kinds of registers, just by looking
1059: at the operands.
1060:
1061: The way to refer to the operands in CONDITION is to write
1062: `operands[I]' for operand number I (as matched by `(match_operand I
1063: ...)'). Use the variable `insn' to refer to the last of the insns
1064: being matched; use `PREV_INSN' to find the preceding insns (but be
1065: careful to skip over any `note' insns that intervene).
1066:
1067: When optimizing computations with intermediate results, you can use
1068: CONDITION to match only when the intermediate results are not used
1069: elsewhere. Use the C expression `dead_or_set_p (INSN, OP)', where
1070: INSN is the insn in which you expect the value to be used for the
1071: last time (from the value of `insn', together with use of
1072: `PREV_INSN'), and OP is the intermediate value (from `operands[I]').
1073:
1074: Applying the optimization means replacing the sequence of
1075: instructions with one new instruction. The TEMPLATE controls
1076: ultimate output of assembler code for this combined instruction. It
1077: works exactly like the template of a `define_insn'. Operand numbers
1078: in this template are the same ones used in matching the original
1079: sequence of instructions.
1080:
1081: The result of a defined peephole optimizer does not need to match any
1082: of the instruction patterns, and it does not have an opportunity to
1083: match them. The peephole optimizer definition itself serves as the
1084: instruction pattern to control how the instruction is output.
1085:
1086: Defined peephole optimizers are run in the last jump optimization
1087: pass, so the instructions they produce are never combined or
1088: rearranged automatically in any way.
1089:
1090: Here is an example, taken from the 68000 machine description:
1091:
1092: (define_peephole
1093: [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4)))
1094: (set (match_operand:DF 0 "register_operand" "f")
1095: (match_operand:DF 1 "register_operand" "ad"))]
1096: "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
1097: "*
1098: {
1099: rtx xoperands[2];
1100: xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1);
1101: #ifdef MOTOROLA
1102: output_asm_insn (\"move.l %1,(sp)\", xoperands);
1103: output_asm_insn (\"move.l %1,-(sp)\", operands);
1104: return \"fmove.d (sp)+,%0\";
1105: #else
1106: output_asm_insn (\"movel %1,sp@\", xoperands);
1107: output_asm_insn (\"movel %1,sp@-\", operands);
1108: return \"fmoved sp@+,%0\";
1109: #endif
1110: }
1111: ")
1112:
1113: The effect of this optimization is to change
1114:
1115: jbsr _foobar
1116: addql #4,sp
1117: movel d1,sp@-
1118: movel d0,sp@-
1119: fmoved sp@+,fp0
1120:
1121: into
1122:
1123: jbsr _foobar
1124: movel d1,sp@
1125: movel d0,sp@-
1126: fmoved sp@+,fp0
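
To see why the two sequences are equivalent, here is a toy C model of the stack traffic after the call returns. It is a sketch under assumptions: the stack is an array of 32-bit words, `sp' is a word index that decreases on a push, and `struct machine' and both function names are hypothetical.

```c
#include <stdint.h>

/* Toy machine state: a word-addressed stack and one float register
   held as two 32-bit halves.  All names are illustrative.  */
struct machine
{
  uint32_t stack[8];
  int sp;                       /* index of the top stack word */
  uint32_t fp0_hi, fp0_lo;
};

/* fmoved sp@+,fp0: pop two words into fp0.  */
static void
pop_double (struct machine *m)
{
  m->fp0_hi = m->stack[m->sp];
  m->fp0_lo = m->stack[m->sp + 1];
  m->sp += 2;
}

/* Original sequence: addql #4,sp; movel d1,sp@-; movel d0,sp@-.  */
void
run_original (struct machine *m, uint32_t d0, uint32_t d1)
{
  m->sp += 1;                   /* discard the argument word */
  m->stack[--m->sp] = d1;
  m->stack[--m->sp] = d0;
  pop_double (m);
}

/* Optimized sequence: movel d1,sp@; movel d0,sp@-.  */
void
run_peephole (struct machine *m, uint32_t d0, uint32_t d1)
{
  m->stack[m->sp] = d1;         /* overwrite the argument word */
  m->stack[--m->sp] = d0;
  pop_double (m);
}
```

Both functions leave the same value in `fp0' and the same final `sp'; the optimized form reuses the argument slot instead of popping it and pushing again.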
1127:
1128: