2:
3: File: internals, Node: Multi-Alternative, Next: Class Preferences, Prev: Simple Constraints, Up: Constraints
4:
5: Multiple Alternative Constraints
6: --------------------------------
7:
8: Sometimes a single instruction has multiple alternative sets of possible
9: operands. For example, on the 68000, a logical-or instruction can combine a
10: register or an immediate value into memory, or it can combine any kind of
11: operand into a register; but it cannot combine one memory location into
12: another.
13:
14: These constraints are represented as multiple alternatives. An alternative
15: can be described by a series of letters for each operand. The overall
16: constraint for an operand is made from the letters for this operand from
17: the first alternative, a comma, the letters for this operand from the
18: second alternative, a comma, and so on until the last alternative. Here is
19: how it is done for fullword logical-or on the 68000:
20:
21: (define_insn "iorsi3"
22: [(set (match_operand:SI 0 "general_operand" "=m,d")
23: (ior:SI (match_operand:SI 1 "general_operand" "%0,0")
24: (match_operand:SI 2 "general_operand" "dKs,dmKs")))]
25: ...)
26:
27: The first alternative has `m' (memory) for operand 0, `0' for operand 1
28: (meaning it must match operand 0), and `dKs' for operand 2. The second
29: alternative has `d' (data register) for operand 0, `0' for operand 1, and
30: `dmKs' for operand 2. The `=' in operand 0 and the `%' in operand 1 are
31: not part of any alternative; see the node on constraint modifier characters.
32:
33: If all the operands fit any one alternative, the instruction is valid.
34: Otherwise, for each alternative, the compiler counts how many instructions
35: must be added to copy the operands so that that alternative applies. The
36: alternative requiring the least copying is chosen. If two alternatives
37: need the same amount of copying, the one that comes first is chosen. These
38: choices can be altered with the `?' and `!' characters:
39:
40: `?'
41: Disparage slightly the alternative that the `?' appears in, as a
42: choice when no alternative applies exactly. The compiler regards this
43: alternative as one unit more costly for each `?' that appears in it.
44:
45: `!'
46: Disparage severely the alternative that the `!' appears in. When
47: operands must be copied into registers, the compiler will never choose
48: this alternative as the one to strive for.
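The selection rule described above can be sketched in C. This is only an illustration under stated assumptions, not the compiler's actual reload code: the structure `alternative' and the function `choose_alternative' are hypothetical names, and the number of copy insns for each alternative is assumed to have been counted already.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one constraint alternative.  */
struct alternative
{
  int copies_needed;   /* insns needed to copy operands so that it applies */
  int question_marks;  /* number of `?' characters written in it */
  int has_bang;        /* nonzero if it contains `!' */
};

/* Return the index of the alternative to use, or -1 if none is eligible.  */
int
choose_alternative (const struct alternative *alts, size_t n)
{
  size_t i;
  int best = -1, best_cost = 0;

  assert (alts != NULL || n == 0);

  /* If the operands already fit some alternative, the insn is valid as is.  */
  for (i = 0; i < n; i++)
    if (alts[i].copies_needed == 0)
      return (int) i;

  for (i = 0; i < n; i++)
    {
      int cost;

      if (alts[i].has_bang)
        continue;               /* `!': never strive for this one */

      /* Each `?' makes the alternative one unit more costly.  */
      cost = alts[i].copies_needed + alts[i].question_marks;

      /* Strict comparison, so a tie goes to the earlier alternative.  */
      if (best < 0 || cost < best_cost)
        {
          best = (int) i;
          best_cost = cost;
        }
    }
  return best;
}
```

With alternatives needing 2 copies, 1 copy plus one `?', and 1 copy plus `!', the first is chosen: the first two tie at cost 2, and the third is excluded.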
49:
50: When an insn pattern has multiple alternatives in its constraints, often
51: the appearance of the assembler code is determined mostly by which alternative
52: was matched. When this is so, the C code for writing the assembler code
53: can use the variable `which_alternative', which is the ordinal number of
54: the alternative that was actually satisfied (0 for the first, 1 for the
55: second alternative, etc.). For example:
56:
57: (define_insn ""
58: [(set (match_operand:SI 0 "general_operand" "r,m")
59: (const_int 0))]
60: ""
61: "*
62: return (which_alternative == 0
63: ? \"clrreg %0\" : \"clrmem %0\");
64: ")
65:
66:
67: File: internals, Node: Class Preferences, Next: Modifiers, Prev: Multi-Alternative, Up: Constraints
68:
69: Register Class Preferences
70: --------------------------
71:
72: The operand constraints have another function: they enable the compiler to
73: decide which kind of hardware register a pseudo register is best allocated
74: to. The compiler examines the constraints that apply to the insns that use
75: the pseudo register, looking for the machine-dependent letters such as `d'
76: and `a' that specify classes of registers. The pseudo register is put in
77: whichever class gets the most ``votes''. The constraint letters `g' and
78: `r' also vote: they vote in favor of a general register. The machine
79: description says which registers are considered general.
80:
81: Of course, on some machines all registers are equivalent, and no register
82: classes are defined. Then none of this complexity is relevant.
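The voting scheme can be pictured with a short C sketch. It is a simplification under several assumptions: the names `tally_votes' and `preferred_class' are hypothetical, only the 68000 letters `d' and `a' are treated as machine-dependent class letters, constraint modifiers are ignored, and a tie is broken in favor of the lower-numbered class (the text above does not say how ties are resolved).

```c
#include <assert.h>
#include <stddef.h>

enum vote_class { VOTE_DATA, VOTE_ADDRESS, VOTE_GENERAL, VOTE_NCLASSES };

/* Add the votes cast by one constraint string to VOTES.  */
void
tally_votes (const char *constraint, int votes[VOTE_NCLASSES])
{
  const char *p;

  assert (constraint != NULL && votes != NULL);
  for (p = constraint; *p != '\0'; p++)
    switch (*p)
      {
      case 'd': votes[VOTE_DATA]++; break;
      case 'a': votes[VOTE_ADDRESS]++; break;
      case 'g':
      case 'r': votes[VOTE_GENERAL]++; break;   /* general register */
      default: break;                           /* other letters do not vote */
      }
}

/* Return the class with the most votes (assumption: ties go to the
   lower-numbered class).  */
int
preferred_class (const int votes[VOTE_NCLASSES])
{
  int best = 0, i;

  assert (votes != NULL);
  for (i = 1; i < VOTE_NCLASSES; i++)
    if (votes[i] > votes[best])
      best = i;
  return best;
}
```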
83:
84:
85: File: internals, Node: Modifiers, Next: No Constraints, Prev: Class Preferences, Up: Constraints
86:
87: Constraint Modifier Characters
88: ------------------------------
89:
90: `='
91: Means that this operand is write-only for this instruction: the
92: previous value is discarded and replaced by output data.
93:
94: `+'
95: Means that this operand is both read and written by the instruction.
96:
97: When the compiler fixes up the operands to satisfy the constraints, it
98: needs to know which operands are inputs to the instruction and which
99: are outputs from it. `=' identifies an output; `+' identifies an
100: operand that is both input and output; all other operands are assumed
101: to be input only.
102:
103: `&'
104: Means (in a particular alternative) that this operand is written
105: before the instruction is finished using the input operands.
106: Therefore, this operand may not lie in a register that is used as an
107: input operand or as part of any memory address.
108:
109: `&' applies only to the alternative in which it is written. In
110: constraints with multiple alternatives, sometimes one alternative
111: requires `&' while others do not. See, for example, the `movdf' insn
112: of the 68000.
113:
114: `&' does not obviate the need to write `='.
115:
116: `%'
117: Declares the instruction to be commutative for this operand and the
118: following operand. This means that the compiler may interchange the
119: two operands if that is the cheapest way to make all operands fit the
120: constraints. This is often used in patterns for addition instructions
121: that really have only two operands: the result must go in one of the
122: arguments. Here, for example, is how the 68000 halfword-add
123: instruction is defined:
124:
125: (define_insn "addhi3"
126: [(set (match_operand:HI 0 "general_operand" "=m,r")
127: (plus:HI (match_operand:HI 1 "general_operand" "%0,0")
128: (match_operand:HI 2 "general_operand" "di,g")))]
129: ...)
130:
131: Note that in previous versions of GNU CC the `%' constraint modifier
132: always applied to operands 1 and 2 regardless of which operand it was
133: written in. The usual custom was to write it in operand 0. Now it
134: must be in operand 1 if the operands to be exchanged are 1 and 2.
135:
136: `#'
137: Says that all following characters, up to the next comma, are to be
138: ignored as a constraint. They are significant only for choosing
139: register preferences.
140:
141: `*'
142: Says that the following character should be ignored when choosing
143: register preferences. `*' has no effect on the meaning of the
144: constraint as a constraint.
145:
146: Here is an example: the 68000 has an instruction to sign-extend a
147: halfword in a data register, and can also sign-extend a value by
148: copying it into an address register. While either kind of register is
149: acceptable, the constraints on an address-register destination are
150: less strict, so it is best if register allocation makes an address
151: register its goal. Therefore, `*' is used so that the `d' constraint
152: letter (for data register) is ignored when computing register
153: preferences.
154:
155: (define_insn "extendhisi2"
156: [(set (match_operand:SI 0 "general_operand" "=*d,a")
157: (sign_extend:SI
158: (match_operand:HI 1 "general_operand" "0,g")))]
159: ...)
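To see why the `&' modifier described above matters, consider an instruction that writes part of its output before it has finished reading its inputs. The C sketch below simulates such an instruction on a toy register file. It is only an illustration: `move_double' and `NREGS' are hypothetical, and the routine is loosely modelled on a two-word move, not taken from any machine description.

```c
#include <assert.h>

#define NREGS 8

/* Copy the register pair (SRC, SRC+1) to (DST, DST+1) one word at a
   time.  The first output word is written before the second input
   word is read, which is the situation `&' describes.  */
void
move_double (int regs[NREGS], int dst, int src)
{
  assert (dst >= 0 && dst + 1 < NREGS);
  assert (src >= 0 && src + 1 < NREGS);

  regs[dst] = regs[src];          /* output written early */
  regs[dst + 1] = regs[src + 1];  /* input read late */
}
```

When the pairs do not overlap, the copy is correct. When DST is SRC+1, the first store overwrites the second input word before it is read, and both output words receive the first input word. Marking the output with `&' tells the compiler not to allocate it in a register that is also used as an input.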
160:
161:
162: File: internals, Node: No Constraints, Prev: Modifiers, Up: Constraints
163:
164: Not Using Constraints
165: ---------------------
166:
167: Some machines are so clean that operand constraints are not required. For
168: example, on the Vax, an operand valid in one context is valid in any other
169: context. On such a machine, every operand constraint would be `g',
170: excepting only operands of ``load address'' instructions which are written
171: as if they referred to a memory location's contents but actually refer to its
172: address. They would have constraint `p'.
173:
174: For such machines, instead of writing `g' and `p' for all the constraints,
175: you can choose to write a description with empty constraints. Then you
176: write `""' for the constraint in every `match_operand'. Address operands
177: are identified by writing an `address' expression around the
178: `match_operand', not by their constraints.
179:
180: When the machine description has just empty constraints, certain parts of
181: compilation are skipped, making the compiler faster.
182:
183:
184: File: internals, Node: Standard Names, Next: Pattern Ordering, Prev: Constraints, Up: Machine Desc
185:
186: Standard Names for Patterns Used in Generation
187: ==============================================
188:
189: Here is a table of the instruction names that are meaningful in the RTL
190: generation pass of the compiler. Giving one of these names to an
191: instruction pattern tells the RTL generation pass that it can use the
192: pattern to accomplish a certain task.
193:
194: `movM'
195: Here M is a two-letter machine mode name, in lower case. This
196: instruction pattern moves data with that machine mode from operand 1
197: to operand 0. For example, `movsi' moves full-word data.
198:
199: If operand 0 is a `subreg' with mode M of a register whose natural
200: mode is wider than M, the effect of this instruction is to store the
201: specified value in the part of the register that corresponds to mode
202: M. The effect on the rest of the register is undefined.
203:
204: This class of patterns is special in several ways. First of all, each
205: of these names *must* be defined, because there is no other way to
206: copy a datum from one place to another.
207:
208: Second, these patterns are not used solely in the RTL generation pass.
209: Even the reload pass can generate move insns to copy values from
210: stack slots into temporary registers. When it does so, one of the
211: operands is a hard register and the other is an operand that can have
212: a reload.
213:
214: Therefore, when given such a pair of operands, the pattern must
215: generate RTL which needs no temporary registers---no registers other
216: than the operands. For example, if you support the pattern with a
217: `define_expand', then in such a case you mustn't call `force_reg' or
218: any other such function which might generate new pseudo registers.
219:
220: This requirement exists even for subword modes on a RISC machine where
221: fetching those modes from memory normally requires several insns and
222: some temporary registers. Look in `spur.md' to see how the
223: requirement is satisfied.
224:
225: The variety of operands that have reloads depends on the rest of the
226: machine description, but typically on a RISC machine these can only be
227: pseudo registers that did not get hard registers, while on other
228: machines explicit memory references will get optional reloads.
229:
230: `movstrictM'
231: Like `movM' except that if operand 0 is a `subreg' with mode M of a
232: register whose natural mode is wider, the `movstrictM' instruction is
233: guaranteed not to alter any of the register except the part which
234: belongs to mode M.
235:
236: `addM3'
237: Add operand 2 and operand 1, storing the result in operand 0. All
238: operands must have mode M. This can be used even on two-address
239: machines, by means of constraints requiring operands 1 and 0 to be the
240: same location.
241:
242: `subM3', `mulM3', `umulM3', `divM3', `udivM3', `modM3', `umodM3', `andM3', `iorM3', `xorM3'
243: Similar, for other arithmetic operations.
244:
245: `andcbM3'
246: Bitwise logical-and operand 1 with the complement of operand 2 and
247: store the result in operand 0.
248:
249: `mulhisi3'
250: Multiply operands 1 and 2, which have mode `HImode', and store a
251: `SImode' product in operand 0.
252:
253: `mulqihi3', `mulsidi3'
254: Similar widening-multiplication instructions of other widths.
255:
256: `umulqihi3', `umulhisi3', `umulsidi3'
257: Similar widening-multiplication instructions that do unsigned
258: multiplication.
259:
260: `divmodM4'
261: Signed division that produces both a quotient and a remainder.
262: Operand 1 is divided by operand 2 to produce a quotient stored in
263: operand 0 and a remainder stored in operand 3.
264:
265: `udivmodM4'
266: Similar, but does unsigned division.
267:
268: `divmodMN4'
269: Like `divmodM4' except that only the dividend has mode M; the divisor,
270: quotient and remainder have mode N. For example, the Vax has a
271: `divmoddisi4' instruction (but it is omitted from the machine
272: description, because it is so slow that it is faster to compute
273: remainders by the circumlocution that the compiler will use if this
274: instruction is not available).
275:
276: `ashlM3'
277: Arithmetic-shift operand 1 left by a number of bits specified by
278: operand 2, and store the result in operand 0. Operand 2 has mode
279: `SImode', not mode M.
280:
281: `ashrM3', `lshlM3', `lshrM3', `rotlM3', `rotrM3'
282: Other shift and rotate instructions.
283:
284: Logical and arithmetic left shift are the same. Machines that do not
285: allow negative shift counts often have only one instruction for
286: shifting left. On such machines, you should define a pattern named
287: `ashlM3' and leave `lshlM3' undefined.
288:
289: `negM2'
290: Negate operand 1 and store the result in operand 0.
291:
292: `absM2'
293: Store the absolute value of operand 1 into operand 0.
294:
295: `sqrtM2'
296: Store the square root of operand 1 into operand 0.
297:
298: `ffsM2'
299: Store into operand 0 one plus the index of the least significant 1-bit
300: of operand 1. If operand 1 is zero, store zero. M is the mode of
301: operand 0; operand 1's mode is specified by the instruction pattern,
302: and the compiler will convert the operand to that mode before
303: generating the instruction.
304:
305: `one_cmplM2'
306: Store the bitwise-complement of operand 1 into operand 0.
307:
308: `cmpM'
309: Compare operand 0 and operand 1, and set the condition codes. The RTL
310: pattern should look like this:
311:
312: (set (cc0) (minus (match_operand:M 0 ...)
313: (match_operand:M 1 ...)))
314:
315: Each such definition in the machine description, for integer mode M,
316: must have a corresponding `tstM' pattern, because optimization can
317: simplify the compare into a test when operand 1 is zero.
318:
319: `tstM'
320: Compare operand 0 against zero, and set the condition codes. The RTL
321: pattern should look like this:
322:
323: (set (cc0) (match_operand:M 0 ...))
324:
325: `movstrM'
326: Block move instruction. The addresses of the destination and source
327: strings are the first two operands, and both are in mode `Pmode'. The
328: number of bytes to move is the third operand, in mode M.
329:
330: `cmpstrM'
331: Block compare instruction, with operands like `movstrM' except that
332: the two memory blocks are compared byte by byte in lexicographic
333: order. The effect of the instruction is to set the condition codes.
334:
335: `floatMN2'
336: Convert operand 1 (valid for fixed point mode M) to floating point
337: mode N and store in operand 0 (which has mode N).
338:
339: `fixMN2'
340: Convert operand 1 (valid for floating point mode M) to fixed point
341: mode N as a signed number and store in operand 0 (which has mode N).
342: This instruction's result is defined only when the value of operand 1
343: is an integer.
344:
345: `fixunsMN2'
346: Convert operand 1 (valid for floating point mode M) to fixed point
347: mode N as an unsigned number and store in operand 0 (which has mode
348: N). This instruction's result is defined only when the value of
349: operand 1 is an integer.
350:
351: `ftruncM2'
352: Convert operand 1 (valid for floating point mode M) to an integer
353: value, still represented in floating point mode M, and store it in
354: operand 0 (valid for floating point mode M).
355:
356: `fix_truncMN2'
357: Like `fixMN2' but works for any floating point value of mode M by
358: converting the value to an integer.
359:
360: `fixuns_truncMN2'
361: Like `fixunsMN2' but works for any floating point value of mode M by
362: converting the value to an integer.
363:
364: `truncMN'
365: Truncate operand 1 (valid for mode M) to mode N and store in operand 0
366: (which has mode N). Both modes must be fixed point or both floating
367: point.
368:
369: `extendMN'
370: Sign-extend operand 1 (valid for mode M) to mode N and store in
371: operand 0 (which has mode N). Both modes must be fixed point or both
372: floating point.
373:
374: `zero_extendMN'
375: Zero-extend operand 1 (valid for mode M) to mode N and store in
376: operand 0 (which has mode N). Both modes must be fixed point.
377:
378: `extv'
379: Extract a bit-field from operand 1 (a register or memory operand),
380: where operand 2 specifies the width in bits and operand 3 the starting
381: bit, and store it in operand 0. Operand 0 must have `SImode'.
382: Operand 1 may have mode `QImode' or `SImode'; often `SImode' is
383: allowed only for registers. Operands 2 and 3 must be valid for
384: `SImode'.
385:
386: The RTL generation pass generates this instruction only with constants
387: for operands 2 and 3.
388:
389: The bit-field value is sign-extended to a full word integer before it
390: is stored in operand 0.
391:
392: `extzv'
393: Like `extv' except that the bit-field value is zero-extended.
394:
395: `insv'
396: Store operand 3 (which must be valid for `SImode') into a bit-field in
397: operand 0, where operand 1 specifies the width in bits and operand 2
398: the starting bit. Operand 0 may have mode `QImode' or `SImode'; often
399: `SImode' is allowed only for registers. Operands 1 and 2 must be
400: valid for `SImode'.
401:
402: The RTL generation pass generates this instruction only with constants
403: for operands 1 and 2.
404:
405: `sCOND'
406: Store zero or nonzero in the operand according to the condition codes.
407: Value stored is nonzero iff the condition COND is true. COND is the
408: name of a comparison operation expression code, such as `eq', `lt' or
409: `leu'.
410:
411: You specify the mode that the operand must have when you write the
412: `match_operand' expression. The compiler automatically sees which
413: mode you have used and supplies an operand of that mode.
414:
415: The value stored for a true condition must have 1 as its low bit.
416: Otherwise the instruction is not suitable and must be omitted from the
417: machine description. You must tell the compiler exactly which value
418: is stored by defining the macro `STORE_FLAG_VALUE'.
419:
420: `bCOND'
421: Conditional branch instruction. Operand 0 is a `label_ref' that
422: refers to the label to jump to. Jump if the condition codes meet
423: condition COND.
424:
425: `call'
426: Subroutine call instruction. Operand 1 is the number of bytes of
427: arguments pushed (in mode `SImode'), and operand 0 is the function to
428: call. Operand 0 should be a `mem' RTX whose address is the address of
429: the function.
430:
431: `return'
432: Subroutine return instruction. This instruction pattern name should
433: be defined only if a single instruction can do all the work of
434: returning from a function.
435:
436: `casesi'
437: Instruction to jump through a dispatch table, including bounds checking.
438: This instruction takes five operands:
439:
440: 1. The index to dispatch on, which has mode `SImode'.
441:
442: 2. The lower bound for indices in the table, an integer constant.
443:
444: 3. The upper bound for indices in the table, an integer constant.
445:
446: 4. A label to jump to if the index has a value outside the bounds. (If the
447: machine-description macro `CASE_DROPS_THROUGH' is defined, then
448: an out-of-bounds index drops through to the code following the
449: jump table instead of jumping to this label. In that case, this
450: label is not actually used by the `casesi' instruction, but it is
451: always provided as an operand.)
452:
453: 5. A label that precedes the table itself.
454:
455: The table is an `addr_vec' or `addr_diff_vec' inside of a `jump_insn'.
456: The number of elements in the table is one plus the difference between
457: the upper bound and the lower bound.
458:
459: `tablejump'
460: Instruction to jump to a variable address. This is a low-level
461: capability which can be used to implement a dispatch table when there
462: is no `casesi' pattern.
463:
464: This pattern requires two operands: the address or offset, and a label
465: which should immediately precede the jump table. If the macro
466: `CASE_VECTOR_PC_RELATIVE' is defined then the first operand is an
467: offset which counts from the address of the table; otherwise, it is an
468: absolute address to jump to.
469:
470: The `tablejump' insn is always the last insn before the jump table it
471: uses. Its assembler code normally has no need to use the second
472: operand, but you should incorporate it in the RTL pattern so that the
473: jump optimizer will not delete the table as unreachable code.
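The bounds check and table size described for `casesi' above amount to the following arithmetic, shown here as a small C sketch. The function names `table_length' and `casesi_slot' are hypothetical, and returning -1 stands in for a jump to the out-of-bounds label.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in the dispatch table: one plus the difference
   between the upper bound and the lower bound.  */
size_t
table_length (long lower, long upper)
{
  assert (lower <= upper);
  return (size_t) (upper - lower) + 1;
}

/* Return the table slot selected by INDEX, or -1 if INDEX lies outside
   the bounds (the case in which the out-of-bounds label is used).  */
long
casesi_slot (long index, long lower, long upper)
{
  assert (lower <= upper);
  if (index < lower || index > upper)
    return -1;
  return index - lower;
}
```

For bounds 3 and 7 the table has five elements; index 5 selects slot 2, and index 8 is out of bounds.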
474:
475:
476: File: internals, Node: Pattern Ordering, Next: Dependent Patterns, Prev: Standard Names, Up: Machine Desc
477:
478: When the Order of Patterns Matters
479: ==================================
480:
481: Sometimes an insn can match more than one instruction pattern. Then the
482: pattern that appears first in the machine description is the one used.
483: Therefore, more specific patterns (patterns that will match fewer things)
484: and faster instructions (those that will produce better code when they do
485: match) should usually go first in the description.
486:
487: In some cases the effect of ordering the patterns can be used to hide a
488: pattern when it is not valid. For example, the 68000 has an instruction
489: for converting a fullword to floating point and another for converting a
490: byte to floating point. An instruction converting an integer to floating
491: point could match either one. We put the pattern to convert the fullword
492: first to make sure that one will be used rather than the other. (Otherwise
493: a large integer might be generated as a single-byte immediate quantity,
494: which would not work.) Instead of using this pattern ordering it would be
495: possible to make the pattern for convert-a-byte smart enough to deal
496: properly with any constant value.
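The first-match rule can be illustrated with a toy recognizer in C. Everything in this sketch is hypothetical (the structure `pattern', the function `recognize', and the pattern identifiers); it only shows that when two patterns both accept an insn, the one listed first is the one used.

```c
#include <assert.h>
#include <stddef.h>

enum pattern_id { PAT_FLOAT_FULLWORD, PAT_FLOAT_BYTE };

struct pattern
{
  enum pattern_id id;
  int (*matches) (long value);  /* nonzero if the pattern accepts VALUE */
};

/* A constant integer carries no machine mode, so in this toy model
   both conversion patterns accept any integer.  */
int
matches_any_integer (long value)
{
  (void) value;
  return 1;
}

/* Return the id of the first pattern in PATS that matches VALUE,
   or -1 if none does.  */
int
recognize (const struct pattern *pats, size_t n, long value)
{
  size_t i;

  assert (pats != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (pats[i].matches (value))
      return (int) pats[i].id;
  return -1;
}
```

Listing the fullword pattern first makes `recognize' return `PAT_FLOAT_FULLWORD' for a large constant; reversing the order would select the byte pattern instead, which is the failure the ordering is meant to prevent.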
497:
498:
499: File: internals, Node: Dependent Patterns, Next: Jump Patterns, Prev: Pattern Ordering, Up: Machine Desc
500:
501: Interdependence of Patterns
502: ===========================
503:
504: Every machine description must have a named pattern for each of the
505: conditional branch names `bCOND'. The recognition template must always
506: have the form
507:
508: (set (pc)
509: (if_then_else (COND (cc0) (const_int 0))
510: (label_ref (match_operand 0 "" ""))
511: (pc)))
512:
513: In addition, every machine description must have an anonymous pattern for
514: each of the possible reverse-conditional branches. These patterns look like
515:
516: (set (pc)
517: (if_then_else (COND (cc0) (const_int 0))
518: (pc)
519: (label_ref (match_operand 0 "" ""))))
520:
521: They are necessary because jump optimization can turn direct-conditional
522: branches into reverse-conditional branches.
523:
524: The compiler does more with RTL than just create it from patterns and
525: recognize the patterns: it can evaluate arithmetic expression codes when
526: constant values for their operands can be determined. As a result,
527: sometimes having one pattern can require other patterns. For example, the
528: Vax has no `and' instruction, but it has `and not' instructions. Here is
529: the definition of one of them:
530:
531: (define_insn "andcbsi2"
532: [(set (match_operand:SI 0 "general_operand" "")
533: (and:SI (match_dup 0)
534: (not:SI (match_operand:SI
535: 1 "general_operand" ""))))]
536: ""
537: "bicl2 %1,%0")
538:
539: If operand 1 is an explicit integer constant, an instruction constructed
540: using that pattern can be simplified into an `and' like this:
541:
542: (set (reg:SI 41)
543: (and:SI (reg:SI 41)
544: (const_int 0xffff7fff)))
545:
546: (where the integer constant is the one's complement of what appeared in the
547: original instruction).
548:
549: To avoid a fatal error, the compiler must have a pattern that recognizes
550: such an instruction. Here is what is used:
551:
552: (define_insn ""
553: [(set (match_operand:SI 0 "general_operand" "")
554: (and:SI (match_dup 0)
555: (match_operand:SI 1 "general_operand" "")))]
556: "GET_CODE (operands[1]) == CONST_INT"
557: "*
558: { operands[1]
559: = gen_rtx (CONST_INT, VOIDmode, ~INTVAL (operands[1]));
560: return \"bicl2 %1,%0\";
561: }")
562:
563: Whereas a pattern to match a general `and' instruction is impossible to
564: support on the Vax, this pattern is possible because it matches only a
565: constant second argument: a special case that can be output as an `and not'
566: instruction.
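The arithmetic behind this pair of patterns can be checked with a few lines of C. The function names below are hypothetical; `bic' models what `bicl2' computes on 32-bit values, and `bic_operand_for_and' is the complement that the output code above applies to the constant.

```c
#include <assert.h>
#include <stdint.h>

/* What `bicl2 MASK,DST' computes: clear in DST the bits set in MASK.  */
uint32_t
bic (uint32_t dst, uint32_t mask)
{
  return dst & ~mask;
}

/* Given the constant of a simplified `and', return the operand that
   must be printed in the `bicl2' instruction: its one's complement.  */
uint32_t
bic_operand_for_and (uint32_t and_constant)
{
  return ~and_constant;
}
```

For the constant 0xffff7fff shown above, the operand printed is 0x00008000, and `bic (x, 0x00008000)' gives the same result as `x & 0xffff7fff' for any X.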
567:
568: A ``compare'' instruction whose RTL looks like this:
569:
570: (set (cc0) (minus OPERAND (const_int 0)))
571:
572: may be simplified by optimization into a ``test'' like this:
573:
574: (set (cc0) OPERAND)
575:
576: So in the machine description, each ``compare'' pattern for an integer mode
577: must have a corresponding ``test'' pattern that will match the result of
578: such simplification.
579:
580: In some cases machines support instructions identical except for the
581: machine mode of one or more operands. For example, there may be
582: ``sign-extend halfword'' and ``sign-extend byte'' instructions whose
583: patterns are
584:
585: (set (match_operand:SI 0 ...)
586: (sign_extend:SI (match_operand:HI 1 ...)))
587:
588: (set (match_operand:SI 0 ...)
589: (sign_extend:SI (match_operand:QI 1 ...)))
590:
591: Constant integers do not specify a machine mode, so an instruction to
592: extend a constant value could match either pattern. The pattern it
593: actually will match is the one that appears first in the file. For correct
594: results, this must be the one for the widest possible mode (`HImode',
595: here). If the pattern matches the `QImode' instruction, the results will
596: be incorrect if the constant value does not actually fit that mode.
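A small C sketch shows how the wrong pattern corrupts a constant. The helper `sign_extend_low_bits' is hypothetical; it keeps only the low BITS bits of a value and sign-extends them, which is what extending from a mode of that width amounts to.

```c
#include <assert.h>

/* Sign-extend the low BITS bits of VALUE, discarding the rest.  */
long
sign_extend_low_bits (long value, int bits)
{
  unsigned long mask, sign, low;

  assert (bits > 0 && bits < 32);
  mask = (1UL << bits) - 1;
  sign = 1UL << (bits - 1);
  low = (unsigned long) value & mask;

  /* If the sign bit of the narrow value is set, the result is negative.  */
  return (low & sign) ? (long) low - (long) (mask + 1) : (long) low;
}
```

Treated as a halfword (16 bits), the constant 300 is extended to 300. Treated as a byte (8 bits), it becomes 44, because 300 does not fit in `QImode'.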
597:
598: Such instructions to extend constants are rarely generated because they are
599: optimized away, but they do occasionally happen in nonoptimized compilations.
600:
601:
602: File: internals, Node: Jump Patterns, Next: Peephole Definitions, Prev: Dependent Patterns, Up: Machine Desc
603:
604: Defining Jump Instruction Patterns
605: ==================================
606:
607: GNU CC assumes that the machine has a condition code. A comparison insn
608: sets the condition code, recording the results of both signed and unsigned
609: comparison of the given operands. A separate branch insn tests the
610: condition code and branches or not according to its value. The branch insns
611: come in distinct signed and unsigned flavors. Many common machines, such
612: as the Vax, the 68000 and the 32000, work this way.
613:
614: Some machines have distinct signed and unsigned compare instructions, and
615: only one set of conditional branch instructions. The easiest way to handle
616: these machines is to treat them just like the others until the final stage
617: where assembly code is written. At this time, when outputting code for the
618: compare instruction, peek ahead at the following branch using `NEXT_INSN
619: (insn)'. (The variable `insn' refers to the insn being output, in the
620: output-writing code in an instruction pattern.) If the RTL says that it is an
621: unsigned branch, output an unsigned compare; otherwise output a signed
622: compare. When the branch itself is output, you can treat signed and
623: unsigned branches identically.
624:
625: The reason you can do this is that GNU CC always generates a pair of
626: consecutive RTL insns, one to set the condition code and one to test it,
627: and keeps the pair inviolate until the end.
628:
629: To go with this technique, you must define the machine-description macro
630: `NOTICE_UPDATE_CC' to do `CC_STATUS_INIT'; in other words, no compare
631: instruction is superfluous.
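The peek-ahead technique can be sketched in C on a toy insn chain. This is not GNU CC code: the structure `toy_insn', its `next' field (which stands in for `NEXT_INSN (insn)'), the condition names, and the function `use_unsigned_compare' are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_COMPARE, TOY_BRANCH };
enum toy_cond { COND_EQ, COND_NE, COND_LT, COND_GE, COND_GT, COND_LE,
                COND_LTU, COND_GEU, COND_GTU, COND_LEU };

struct toy_insn
{
  enum toy_kind kind;
  enum toy_cond cond;      /* meaningful only for TOY_BRANCH */
  struct toy_insn *next;   /* stands in for NEXT_INSN (insn) */
};

/* Called when outputting COMPARE: look at the branch that follows and
   return nonzero if an unsigned compare should be output.  */
int
use_unsigned_compare (const struct toy_insn *compare)
{
  const struct toy_insn *branch;

  assert (compare != NULL && compare->kind == TOY_COMPARE);
  branch = compare->next;

  /* The compare and its branch are assumed to be adjacent.  */
  assert (branch != NULL && branch->kind == TOY_BRANCH);

  return (branch->cond == COND_LTU || branch->cond == COND_GEU
          || branch->cond == COND_GTU || branch->cond == COND_LEU);
}
```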
632:
633: Some machines have compare-and-branch instructions and no condition code.
634: A similar technique works for them. When it is time to ``output'' a
635: compare instruction, record its operands in two static variables. When
636: outputting the branch-on-condition-code instruction that follows, actually
637: output a compare-and-branch instruction that uses the remembered operands.
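The remembered-operands technique might look like the following C sketch. The function names, the use of plain register numbers as operands, and the `cb' mnemonic are all hypothetical; a real machine description would keep `rtx' operands and emit its own assembler syntax.

```c
#include <assert.h>
#include <stdio.h>

/* Operands of the most recent compare, recorded instead of output.  */
static int saved_op0 = -1;
static int saved_op1 = -1;

/* ``Output'' a compare: emit nothing, just remember the operands.  */
void
output_compare (int op0, int op1)
{
  saved_op0 = op0;
  saved_op1 = op1;
}

/* Output the branch that follows as one compare-and-branch
   instruction.  Writes the text into BUF and returns its length.  */
int
output_branch (char *buf, size_t size, const char *cond, const char *label)
{
  assert (saved_op0 >= 0 && saved_op1 >= 0);  /* a compare must come first */
  return snprintf (buf, size, "cb%s r%d,r%d,%s",
                   cond, saved_op0, saved_op1, label);
}
```

Calling `output_compare (1, 2)' and then `output_branch' with condition "eq" and label "L5" produces the single line `cbeq r1,r2,L5'.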
638:
639: It also works to define patterns for compare-and-branch instructions. In
640: optimizing compilation, the pair of compare and branch instructions will be
641: combined according to these patterns. But this does not happen if
642: optimization is not requested. So you must use one of the solutions above
643: in addition to any special patterns you define.
644:
645:
646: File: internals, Node: Peephole Definitions, Next: Expander Definitions, Prev: Jump Patterns, Up: Machine Desc
647:
648: Defining Machine-Specific Peephole Optimizers
649: =============================================
650:
651: In addition to instruction patterns the `md' file may contain definitions
652: of machine-specific peephole optimizations.
653:
654: The combiner does not notice certain peephole optimizations when the data
655: flow in the program does not suggest that it should try them. For example,
656: sometimes two consecutive insns related in purpose can be combined even
657: though the second one does not appear to use a register computed in the
658: first one. A machine-specific peephole optimizer can detect such
659: opportunities.
660:
661: A definition looks like this:
662:
663: (define_peephole
664: [INSN-PATTERN-1
665: INSN-PATTERN-2
666: ...]
667: "CONDITION"
668: "TEMPLATE")
669:
670: In this skeleton, INSN-PATTERN-1 and so on are patterns to match
671: consecutive instructions. The optimization applies to a sequence of
672: instructions when INSN-PATTERN-1 matches the first one, INSN-PATTERN-2
673: matches the next, and so on.
674:
675: INSN-PATTERN-1 and so on look *almost* like the second operand of
676: `define_insn'. There is one important difference: this pattern is an RTX,
677: not a vector. If the `define_insn' pattern would be a vector of one
678: element, the INSN-PATTERN should be just that element, no vector. If the
679: `define_insn' pattern would have multiple elements then the INSN-PATTERN
680: must place the vector inside an explicit `parallel' RTX.
681:
682: The operands of the instructions are matched with `match_operand' and
683: `match_dup', as usual. What is not usual is that the operand numbers
684: apply to all the instruction patterns in the definition. So, you can check
685: for identical operands in two instructions by using `match_operand' in one
686: instruction and `match_dup' in the other.
687:
688: The operand constraints used in `match_operand' patterns do not have any
689: direct effect on the applicability of the optimization, but they will be
690: validated afterward, so write constraints that are sure to fit whenever the
691: optimization is applied. It is safe to use `"g"' for each operand.
692:
693: Once a sequence of instructions matches the patterns, the CONDITION is
694: checked. This is a C expression which makes the final decision whether to
695: perform the optimization (do so if the expression is nonzero). If
696: CONDITION is omitted (in other words, the string is empty) then the
697: optimization is applied to every sequence of instructions that matches the
698: patterns.
699:
700: The defined peephole optimizations are applied after register allocation is
701: complete. Therefore, the optimizer can check which operands have ended up
702: in which kinds of registers, just by looking at the operands.
703:
704: The way to refer to the operands in CONDITION is to write `operands[I]' for
705: operand number I (as matched by `(match_operand I ...)'). Use the variable
706: `insn' to refer to the last of the insns being matched; use `PREV_INSN' to
707: find the preceding insns (but be careful to skip over any `note' insns that
708: intervene).
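
The backward walk over `note' insns described above might be sketched as
follows. This is a minimal, self-contained model and not compiler source:
the type `struct insn', the `INSN_KIND_...' codes and the function
`prev_nonnote_insn_sketch' are hypothetical stand-ins for the compiler's
rtx chain and `PREV_INSN'.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kinds of entries in the insn chain.  */
enum insn_kind { INSN_KIND_INSN, INSN_KIND_NOTE };

/* Hypothetical stand-in for an insn.  The real compiler uses rtx
   objects and the PREV_INSN accessor instead of a `prev' field.  */
struct insn
{
  enum insn_kind kind;
  struct insn *prev;
};

/* Return the nearest entry before INSN that is not a note,
   or NULL if there is none.  */
struct insn *
prev_nonnote_insn_sketch (struct insn *insn)
{
  struct insn *p = insn->prev;

  while (p != NULL && p->kind == INSN_KIND_NOTE)
    p = p->prev;
  return p;
}
```

In a real CONDITION the same loop would be written with `PREV_INSN' and a
test of the insn's code.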
709:
710: When optimizing computations with intermediate results, you can use
711: CONDITION to match only when the intermediate results are not used
712: elsewhere. Use the C expression `dead_or_set_p (INSN, OP)', where INSN is
713: the insn in which you expect the value to be used for the last time (from
714: the value of `insn', together with use of `PREV_INSN'), and OP is the
715: intermediate value (from `operands[I]').
716:
717: Applying the optimization means replacing the sequence of instructions with
718: one new instruction. The TEMPLATE controls ultimate output of assembler
719: code for this combined instruction. It works exactly like the template of
720: a `define_insn'. Operand numbers in this template are the same ones used
721: in matching the original sequence of instructions.
722:
723: The result of a defined peephole optimizer does not need to match any of
724: the instruction patterns, and it does not have an opportunity to match
725: them. The peephole optimizer definition itself serves as the instruction
726: pattern to control how the instruction is output.
727:
728: Defined peephole optimizers are run in the last jump optimization pass, so
729: the instructions they produce are never combined or rearranged
730: automatically in any way.
731:
732: Here is an example, taken from the 68000 machine description:
733:
734: (define_peephole
735: [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4)))
736: (set (match_operand:DF 0 "register_operand" "f")
737: (match_operand:DF 1 "register_operand" "ad"))]
738: "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
739: "*
740: {
741: rtx xoperands[2];
742: xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1);
743: #ifdef MOTOROLA
744: output_asm_insn (\"move.l %1,(sp)\", xoperands);
745: output_asm_insn (\"move.l %1,-(sp)\", operands);
746: return \"fmove.d (sp)+,%0\";
747: #else
748: output_asm_insn (\"movel %1,sp@\", xoperands);
749: output_asm_insn (\"movel %1,sp@-\", operands);
750: return \"fmoved sp@+,%0\";
751: #endif
752: }
753: ")
754:
755: The effect of this optimization is to change
756:
757: jbsr _foobar
758: addql #4,sp
759: movel d1,sp@-
760: movel d0,sp@-
761: fmoved sp@+,fp0
762:
763: into
764:
765: jbsr _foobar
766: movel d1,sp@
767: movel d0,sp@-
768: fmoved sp@+,fp0
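
To see why the two sequences are equivalent, here is a small C model of
the stack effects. It is only a sketch: `struct machine_sketch' and the
functions below are hypothetical, the stack pointer is an index counted in
longwords instead of a byte address, and the final `fmoved', which is the
same in both sequences, is left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine state: a few longwords of stack and a stack
   pointer expressed as an index into them.  The stack grows downward.  */
struct machine_sketch
{
  uint32_t stack[8];
  int sp;
};

/* Before: addql #4,sp ; movel d1,sp@- ; movel d0,sp@-  */
void
sequence_before (struct machine_sketch *m, uint32_t d0, uint32_t d1)
{
  m->sp += 1;                   /* addql #4,sp: pop one longword */
  m->stack[--m->sp] = d1;       /* movel d1,sp@- */
  m->stack[--m->sp] = d0;       /* movel d0,sp@- */
}

/* After: movel d1,sp@ ; movel d0,sp@-  */
void
sequence_after (struct machine_sketch *m, uint32_t d0, uint32_t d1)
{
  m->stack[m->sp] = d1;         /* movel d1,sp@ */
  m->stack[--m->sp] = d0;       /* movel d0,sp@- */
}

/* Return nonzero if A and B have the same stack pointer and contents.  */
int
same_state (const struct machine_sketch *a, const struct machine_sketch *b)
{
  return a->sp == b->sp
         && memcmp (a->stack, b->stack, sizeof a->stack) == 0;
}
```

In this model both sequences leave the same memory contents and the same
stack pointer; the second simply avoids popping a longword only to push
another one in its place.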
769:
770:
771: File: internals, Node: Expander Definitions, Prev: Peephole Definitions, Up: Machine Desc
772:
773: Defining RTL Sequences for Code Generation
774: ==========================================
775:
776: On some target machines, some standard pattern names for RTL generation
777: cannot be handled with a single insn, but a sequence of RTL insns can
778: represent them. For these target machines, you can write a `define_expand'
779: to specify how to generate the sequence of RTL.
780:
781: A `define_expand' is an RTL expression that looks almost like a
782: `define_insn'; but, unlike the latter, a `define_expand' is used only for
783: RTL generation and it can produce more than one RTL insn.
784:
785: A `define_expand' RTX has four operands:
786:
787: * The name. Each `define_expand' must have a name, since the only use
788: for it is to refer to it by name.
789:
790: * The RTL template. This is just like the RTL template for a
791: `define_peephole' in that it is a vector of RTL expressions each being
792: one insn.
793:
794: * The condition, a string containing a C expression. This expression is
795: used to express how the availability of this pattern depends on
796: subclasses of target machine, selected by command-line options when
797: GNU CC is run. This is just like the condition of a `define_insn'
798: that has a standard name.
799:
800: * The preparation statements, a string containing zero or more C
801: statements which are to be executed before RTL code is generated from
802: the RTL template.
803:
804: Usually these statements prepare temporary registers for use as
805: internal operands in the RTL template, but they can also generate RTL
806: insns directly by calling routines such as `emit_insn', etc. Any such
807: insns precede the ones that come from the RTL template.
808:
809: The RTL template, in addition to controlling generation of RTL insns, also
810: describes the operands that need to be specified when this pattern is used.
811: In particular, it gives a predicate for each operand.
812:
813: A true operand, which needs to be specified in order to generate RTL from
814: the pattern, should be described with a `match_operand' in its first
815: occurrence in the RTL template. This enters information on the operand's
816: predicate into the tables that record such things. GNU CC uses the
817: information to preload the operand into a register if that is required for
818: valid RTL code. If the operand is referred to more than once, subsequent
819: references should use `match_dup'.
820:
821: The RTL template may also refer to internal ``operands'' which are
822: temporary registers or labels used only within the sequence made by the
823: `define_expand'. Internal operands are substituted into the RTL template
824: with `match_dup', never with `match_operand'. The values of the internal
825: operands are not passed in as arguments by the compiler when it requests
826: use of this pattern. Instead, they are computed within the pattern, in the
827: preparation statements. These statements compute the values and store them
828: into the appropriate elements of `operands' so that `match_dup' can find
829: them.
830:
831: There are two special macros defined for use in the preparation statements:
832: `DONE' and `FAIL'. Use them with a following semicolon, as a statement.
833:
834: `DONE'
835: Use the `DONE' macro to end RTL generation for the pattern. The only
836: RTL insns resulting from the pattern on this occasion will be those
837: already emitted by explicit calls to `emit_insn' within the
838: preparation statements; the RTL template will not be generated.
839:
840: `FAIL'
841: Make the pattern fail on this occasion. When a pattern fails, it
842: means that the pattern was not truly available. The calling routines
843: in the compiler will try other strategies for code generation using
844: other patterns.
845:
846: Failure is currently supported only for binary operations (addition,
847: multiplication, shifting, etc.).
848:
849: Do not emit any insns explicitly with `emit_insn' before failing.
850:
851: Here is an example, the definition of left-shift for the SPUR chip:
852:
853: (define_expand "ashlsi3"
854: [(set (match_operand:SI 0 "register_operand" "")
855: (ashift:SI
856: (match_operand:SI 1 "register_operand" "")
857: (match_operand:SI 2 "nonmemory_operand" "")))]
858: ""
859: "
860: {
861: if (GET_CODE (operands[2]) != CONST_INT
862: || (unsigned) INTVAL (operands[2]) > 3)
863: FAIL;
864: }")
865:
866: This example uses `define_expand' so that it can generate an RTL insn for
867: shifting when the shift-count is in the supported range of 0 to 3 but fail
868: in other cases where machine insns aren't available. When it fails, the
869: compiler tries another strategy using different patterns (such as a
870: library call).
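
The range test in the preparation statements relies on a common C idiom:
converting the count to `unsigned' turns every negative value into a large
positive one, so a single comparison rejects counts below 0 as well as
counts above 3. A minimal sketch, in which the function name
`shift_count_in_range' is hypothetical and a plain `int' stands in for the
value of `INTVAL (operands[2])':

```c
/* Hypothetical helper: return nonzero if COUNT lies in the range 0 to 3.
   The conversion to unsigned maps negative counts to values far above 3,
   so one comparison checks both ends of the range.  */
int
shift_count_in_range (int count)
{
  return (unsigned) count <= 3;
}
```

The pattern above uses the opposite sense of this test (`> 3') to decide
when to `FAIL'.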
871:
872: If the compiler were able to handle nontrivial condition-strings in
873: patterns with names, then it would be possible to use a `define_insn' in
874: that case. Here is another case (zero-extension on the 68000) which makes
875: more use of the power of `define_expand':
876:
877: (define_expand "zero_extendhisi2"
878: [(set (match_operand:SI 0 "general_operand" "")
879: (const_int 0))
880: (set (strict_low_part
881: (subreg:HI
882: (match_operand:SI 0 "general_operand" "")
883: 0))
884: (match_operand:HI 1 "general_operand" ""))]
885: ""
886: "operands[1] = make_safe_from (operands[1], operands[0]);")
887:
888: Here two RTL insns are generated, one to clear the entire output operand
889: and the other to copy the input operand into its low half. This sequence
890: is incorrect if the input operand refers to [the old value of] the output
891: operand, so the preparation statement makes sure this isn't so. The
892: function `make_safe_from' copies `operands[1]' into a temporary
893: register if it refers to `operands[0]'. It does this by emitting another
894: RTL insn.
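
The hazard can be illustrated with a small C model. This is only a sketch
under stated assumptions: the two functions are hypothetical, a 32-bit
location stands for the output operand, and the input is modelled as the
low half of another (possibly the same) location. Unlike
`make_safe_from', the safe variant here always copies the input, instead
of copying only when it refers to the output.

```c
#include <stdint.h>

/* Model of the two generated insns: clear *OUT, then store the halfword
   read from *IN into its low half.  The result is wrong if IN and OUT
   are the same location, because the clear destroys the input first.  */
uint32_t
zero_extend_unsafe (uint32_t *out, const uint32_t *in)
{
  *out = 0;
  *out = (*out & 0xffff0000u) | (*in & 0xffffu);
  return *out;
}

/* The same sequence, but with the input copied to a temporary first.
   This is the role `make_safe_from' plays in the pattern above.  */
uint32_t
zero_extend_safe (uint32_t *out, const uint32_t *in)
{
  uint32_t tmp = *in;

  *out = 0;
  *out = (*out & 0xffff0000u) | (tmp & 0xffffu);
  return *out;
}
```

When input and output are distinct locations, both functions give the
same result; they differ only when the two overlap.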
895:
896: Finally, a third example shows the use of an internal operand.
897: Zero-extension on the SPUR chip is done by `and'-ing the result against a
898: halfword mask. But this mask cannot be represented by a `const_int'
899: because the constant value is too large to be legitimate on this machine.
900: So it must be copied into a register with `force_reg' and then the register
901: used in the `and'.
902:
903: (define_expand "zero_extendhisi2"
904: [(set (match_operand:SI 0 "register_operand" "")
905: (and:SI (subreg:SI
906: (match_operand:HI 1 "register_operand" "")
907: 0)
908: (match_dup 2)))]
909: ""
910: "operands[2]
911: = force_reg (SImode, gen_rtx (CONST_INT,
912: VOIDmode, 65535)); ")
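
In C terms, the arithmetic this expansion performs is a single masking
operation. The function below is a hypothetical illustration; the local
variable `mask' plays the part of the register that `force_reg' loads
with 65535.

```c
#include <stdint.h>

/* Zero-extend the low halfword of REG by and-ing it with a halfword
   mask held in a separate variable.  */
uint32_t
zero_extend_halfword_sketch (uint32_t reg)
{
  uint32_t mask = 65535;        /* internal operand 2 in the pattern */

  return reg & mask;
}
```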
913:
914:
915: File: internals, Node: Machine Macros, Next: Config, Prev: Machine Desc, Up: Top
916:
917: Machine Description Macros
918: **************************
919:
920: The other half of the machine description is a C header file conventionally
921: given the name `tm-MACHINE.h'. The file `tm.h' should be a link to it.
922: The header file `config.h' includes `tm.h' and most compiler source files
923: include `config.h'.
924:
925: * Menu:
926:
927: * Run-time Target:: Defining -m options like -m68000 and -m68020.
928: * Storage Layout:: Defining sizes and alignments of data types.
929: * Registers:: Naming and describing the hardware registers.
930: * Register Classes:: Defining the classes of hardware registers.
931: * Stack Layout:: Defining which way the stack grows and by how much.
932: * Library Names:: Specifying names of subroutines to call automatically.
933: * Addressing Modes:: Defining addressing modes valid for memory operands.
934: * Condition Code:: Defining how insns update the condition code.
935: * Assembler Format:: Defining how to write insns and pseudo-ops to output.
936: * Misc:: Everything else.
937:
938:
939:
940: File: internals, Node: Run-time Target, Next: Storage Layout, Prev: Machine Macros, Up: Machine Macros
941:
942: Run-time Target Specification
943: =============================
944:
945: `CPP_PREDEFINES'
946: Define this to be a string constant containing `-D' options to define
947: the predefined macros that identify this machine and system.
948:
949: For example, on the Sun, one can use the value
950:
951: "-Dmc68000 -Dsun -Dunix"
952:
953: `extern int target_flags;'
954: This declaration should be present.
955:
956: `TARGET_...'
957: This series of macros is to allow compiler command arguments to enable
958: or disable the use of optional features of the target machine. For
959: example, one machine description serves both the 68000 and the 68020;
960: a command argument tells the compiler whether it should use 68020-only
961: instructions or not. This command argument works by means of a macro
962: `TARGET_68020' that tests a bit in `target_flags'.
963:
964: Define a macro `TARGET_FEATURENAME' for each such option. Its
965: definition should test a bit in `target_flags'; for example:
966:
967: #define TARGET_68020 (target_flags & 1)
968:
969: One place where these macros are used is in the condition-expressions
970: of instruction patterns. Note how `TARGET_68020' appears frequently
971: in the 68000 machine description file, `m68k.md'. Another place they
972: are used is in the definitions of the other macros in the
973: `tm-MACHINE.h' file.
974:
975: `TARGET_SWITCHES'
976: This macro defines names of command options to set and clear bits in
977: `target_flags'. Its definition is an initializer with a subgrouping
978: for each command option.
979:
980: Each subgrouping contains a string constant that defines the option
981: name, and a number, which contains the bits to set in `target_flags'.
982: A negative number says to clear bits instead; the negative of the
983: number gives the bits to clear. The actual option name is made by
984: prefixing `-m' to the specified name.
985:
986: One of the subgroupings should have a null string. The number in this
987: grouping is the default value for `target_flags'. Any target options
988: act starting with that value.
989:
990: Here is an example which defines `-m68000' and `-m68020' with opposite
991: meanings, and picks the latter as the default:
992:
993: #define TARGET_SWITCHES \
994: { { "68020", 1}, \
995: { "68000", -1}, \
996: { "", 1}}
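
To make the meaning of the table concrete, here is a sketch of how such
a table could be interpreted. It is not the compiler's actual option
parser: `struct target_switch', `switches_sketch', `target_flags_sketch'
and both functions are hypothetical names chosen for this illustration.

```c
#include <string.h>

/* Hypothetical copy of the table from the example above.  */
struct target_switch
{
  const char *name;
  int value;
};

static const struct target_switch switches_sketch[] =
  { { "68020", 1 }, { "68000", -1 }, { "", 1 } };

#define N_SWITCHES (sizeof switches_sketch / sizeof switches_sketch[0])

int target_flags_sketch;

/* Take the default flags from the entry whose name is the null string.  */
void
init_target_flags_sketch (void)
{
  size_t i;

  for (i = 0; i < N_SWITCHES; i++)
    if (switches_sketch[i].name[0] == '\0')
      target_flags_sketch = switches_sketch[i].value;
}

/* Apply one command option such as "-m68000".
   Return 1 if the option was recognized, 0 otherwise.  */
int
apply_target_option_sketch (const char *option)
{
  size_t i;

  if (strncmp (option, "-m", 2) != 0)
    return 0;
  for (i = 0; i < N_SWITCHES; i++)
    {
      const struct target_switch *s = &switches_sketch[i];

      if (s->name[0] != '\0' && strcmp (option + 2, s->name) == 0)
        {
          if (s->value < 0)
            target_flags_sketch &= ~(-s->value);  /* clear bits */
          else
            target_flags_sketch |= s->value;      /* set bits */
          return 1;
        }
    }
  return 0;
}
```

With this table, the default leaves bit 1 set, `-m68000' clears it, and
`-m68020' sets it again.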
997:
998: Sometimes certain combinations of command options do not make sense on a
999: particular target machine. You can define a macro `OVERRIDE_OPTIONS' to
1000: take account of this. This macro, if defined, is executed once just after
1001: all the command options have been parsed.
1002:
1003:
1004: File: internals, Node: Storage Layout, Next: Registers, Prev: Run-time Target, Up: Machine Macros
1005:
1006: Storage Layout
1007: ==============
1008:
1009: Note that the definitions of the macros in this table which are sizes or
1010: alignments measured in bits do not need to be constant. They can be C
1011: expressions that refer to static variables, such as `target_flags'.
1012: *Note Run-time Target::.
1013:
1014: `BITS_BIG_ENDIAN'
1015: Define this macro if the most significant bit in a byte has the lowest
1016: number. This means that bit-field instructions count from the most
1017: significant bit. If the machine has no bit-field instructions, this
1018: macro is irrelevant.
1019:
1020: `BYTES_BIG_ENDIAN'
1021: Define this macro if the most significant byte in a word has the
1022: lowest number.
1023:
1024: `WORDS_BIG_ENDIAN'
1025: Define this macro if, in a multiword object, the most significant word
1026: has the lowest number.
1027:
1028: `BITS_PER_UNIT'
1029: Number of bits in an addressable storage unit (byte); normally 8.
1030:
1031: `BITS_PER_WORD'
1032: Number of bits in a word; normally 32.
1033:
1034: `UNITS_PER_WORD'
1035: Number of storage units in a word; normally 4.
1036:
1037: `POINTER_SIZE'
1038: Width of a pointer, in bits.
1039:
1040: `PARM_BOUNDARY'
1041: Alignment required for function parameters on the stack, in bits.
1042:
1043: `STACK_BOUNDARY'
1044: Define this macro if you wish to preserve a certain alignment for the
1045: stack pointer at all times. The definition is a C expression for the
1046: desired alignment (measured in bits).
1047:
1048: `FUNCTION_BOUNDARY'
1049: Alignment required for a function entry point, in bits.
1050:
1051: `BIGGEST_ALIGNMENT'
1052: Biggest alignment that any data type can require on this machine, in
1053: bits.
1054:
1055: `EMPTY_FIELD_ALIGNMENT'
1056: Alignment in bits to be given to a structure bit field that follows an
1057: empty field such as `int : 0;'.
1058:
1059: `STRUCTURE_SIZE_BOUNDARY'
1060: Number of bits which any structure or union's size must be a multiple
1061: of. Each structure or union's size is rounded up to a multiple of this.
1062:
1063: If you do not define this macro, the default is the same as
1064: `BITS_PER_UNIT'.
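
The rounding can be written as a one-line computation. The function
below is a hypothetical illustration of the arithmetic, not the
compiler's own layout code; both arguments are measured in bits and the
boundary is assumed to be positive.

```c
/* Round SIZE_IN_BITS up to a multiple of BOUNDARY.
   BOUNDARY must be greater than zero.  */
unsigned
round_up_size_sketch (unsigned size_in_bits, unsigned boundary)
{
  return (size_in_bits + boundary - 1) / boundary * boundary;
}
```

For instance, with a boundary of 16 bits a structure whose members
occupy 24 bits is given a size of 32 bits.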
1065:
1066: `STRICT_ALIGNMENT'
1067: Define this if instructions will fail to work if given data not on the
1068: nominal alignment. If instructions will merely go slower in that
1069: case, do not define this macro.
1070:
1071: