1: @c Copyright (C) 1988, 1989, 1992, 1993 Free Software Foundation, Inc.
2: @c This is part of the GCC manual.
3: @c For copying conditions, see the file gcc.texi.
4:
5: @ifset INTERNALS
6: @node Machine Desc
7: @chapter Machine Descriptions
8: @cindex machine descriptions
9:
10: A machine description has two parts: a file of instruction patterns
11: (@file{.md} file) and a C header file of macro definitions.
12:
13: The @file{.md} file for a target machine contains a pattern for each
14: instruction that the target machine supports (or at least each instruction
15: that is worth telling the compiler about). It may also contain comments.
16: A semicolon causes the rest of the line to be a comment, unless the semicolon
17: is inside a quoted string.
18:
19: See the next chapter for information on the C header file.
20:
21: @menu
22: * Patterns:: How to write instruction patterns.
23: * Example:: An explained example of a @code{define_insn} pattern.
24: * RTL Template:: The RTL template defines what insns match a pattern.
25: * Output Template:: The output template says how to make assembler code
26: from such an insn.
27: * Output Statement:: For more generality, write C code to output
28: the assembler code.
29: * Constraints:: When not all operands are general operands.
30: * Standard Names:: Names mark patterns to use for code generation.
31: * Pattern Ordering:: When the order of patterns makes a difference.
32: * Dependent Patterns:: Having one pattern may make you need another.
33: * Jump Patterns:: Special considerations for patterns for jump insns.
34: * Insn Canonicalizations::Canonicalization of Instructions
35: * Peephole Definitions::Defining machine-specific peephole optimizations.
36: * Expander Definitions::Generating a sequence of several RTL insns
37: for a standard operation.
38: * Insn Splitting:: Splitting Instructions into Multiple Instructions
39: * Insn Attributes:: Specifying the value of attributes for generated insns.
40: @end menu
41:
42: @node Patterns
43: @section Everything about Instruction Patterns
44: @cindex patterns
45: @cindex instruction patterns
46:
47: @findex define_insn
48: Each instruction pattern contains an incomplete RTL expression, with pieces
49: to be filled in later, operand constraints that restrict how the pieces can
50: be filled in, and an output pattern or C code to generate the assembler
51: output, all wrapped up in a @code{define_insn} expression.
52:
53: A @code{define_insn} is an RTL expression containing four or five operands:
54:
55: @enumerate
56: @item
57: An optional name. The presence of a name indicates that this instruction
58: pattern can perform a certain standard job for the RTL-generation
59: pass of the compiler. This pass knows certain names and will use
60: the instruction patterns with those names, if the names are defined
61: in the machine description.
62:
63: The absence of a name is indicated by writing an empty string
64: where the name should go. Nameless instruction patterns are never
65: used for generating RTL code, but they may permit several simpler insns
66: to be combined later on.
67:
68: Names that are not thus known and used in RTL-generation have no
69: effect; they are equivalent to no name at all.
70:
71: @item
72: The @dfn{RTL template} (@pxref{RTL Template}) is a vector of incomplete
73: RTL expressions which show what the instruction should look like. It is
74: incomplete because it may contain @code{match_operand},
75: @code{match_operator}, and @code{match_dup} expressions that stand for
76: operands of the instruction.
77:
78: If the vector has only one element, that element is the template for the
79: instruction pattern. If the vector has multiple elements, then the
80: instruction pattern is a @code{parallel} expression containing the
81: elements described.
82:
83: @item
84: @cindex pattern conditions
85: @cindex conditions, in patterns
86: A condition. This is a string which contains a C expression that is
87: the final test to decide whether an insn body matches this pattern.
88:
89: @cindex named patterns and conditions
90: For a named pattern, the condition (if present) may not depend on
91: the data in the insn being matched, but only the target-machine-type
92: flags. The compiler needs to test these conditions during
93: initialization in order to learn exactly which named instructions are
94: available in a particular run.
95:
96: @findex operands
97: For nameless patterns, the condition is applied only when matching an
98: individual insn, and only after the insn has matched the pattern's
99: recognition template. The insn's operands may be found in the vector
100: @code{operands}.
101:
102: @item
103: The @dfn{output template}: a string that says how to output matching
104: insns as assembler code. @samp{%} in this string specifies where
105: to substitute the value of an operand. @xref{Output Template}.
106:
107: When simple substitution isn't general enough, you can specify a piece
108: of C code to compute the output. @xref{Output Statement}.
109:
110: @item
111: Optionally, a vector containing the values of attributes for insns matching
112: this pattern. @xref{Insn Attributes}.
113: @end enumerate
114:
115: @node Example
116: @section Example of @code{define_insn}
117: @cindex @code{define_insn} example
118:
119: Here is an actual example of an instruction pattern, for the 68000/68020.
120:
121: @example
122: (define_insn "tstsi"
123: [(set (cc0)
124: (match_operand:SI 0 "general_operand" "rm"))]
125: ""
126: "*
127: @{ if (TARGET_68020 || ! ADDRESS_REG_P (operands[0]))
128: return \"tstl %0\";
129: return \"cmpl #0,%0\"; @}")
130: @end example
131:
132: This is an instruction that sets the condition codes based on the value of
133: a general operand. It has no condition, so any insn whose RTL description
134: has the form shown may be handled according to this pattern. The name
135: @samp{tstsi} means ``test a @code{SImode} value'' and tells the RTL generation
136: pass that, when it is necessary to test such a value, an insn to do so
137: can be constructed using this pattern.
138:
139: The output control string is a piece of C code which chooses which
140: output template to return based on the kind of operand and the specific
141: type of CPU for which code is being generated.
142:
143: @samp{"rm"} is an operand constraint. Its meaning is explained below.
144:
145: @node RTL Template
146: @section RTL Template
147: @cindex RTL insn template
148: @cindex generating insns
149: @cindex insns, generating
150: @cindex recognizing insns
151: @cindex insns, recognizing
152:
153: The RTL template is used to define which insns match the particular pattern
154: and how to find their operands. For named patterns, the RTL template also
155: says how to construct an insn from specified operands.
156:
157: Construction involves substituting specified operands into a copy of the
158: template. Matching involves determining the values that serve as the
159: operands in the insn being matched. Both of these activities are
160: controlled by special expression types that direct matching and
161: substitution of the operands.
162:
163: @table @code
164: @findex match_operand
165: @item (match_operand:@var{m} @var{n} @var{predicate} @var{constraint})
166: This expression is a placeholder for operand number @var{n} of
167: the insn. When constructing an insn, operand number @var{n}
168: will be substituted at this point. When matching an insn, whatever
169: appears at this position in the insn will be taken as operand
170: number @var{n}; but it must satisfy @var{predicate} or this instruction
171: pattern will not match at all.
172:
173: Operand numbers must be chosen consecutively counting from zero in
174: each instruction pattern. There may be only one @code{match_operand}
175: expression in the pattern for each operand number. Usually operands
176: are numbered in the order of appearance in @code{match_operand}
177: expressions.
178:
179: @var{predicate} is a string that is the name of a C function that accepts two
180: arguments, an expression and a machine mode. During matching, the
181: function will be called with the putative operand as the expression and
182: @var{m} as the mode argument (if @var{m} is not specified,
183: @code{VOIDmode} will be used, which normally causes @var{predicate} to accept
184: any mode). If it returns zero, this instruction pattern fails to match.
185: @var{predicate} may be an empty string; then it means no test is to be done
186: on the operand, so anything which occurs in this position is valid.
187:
188: Most of the time, @var{predicate} will reject modes other than @var{m}---but
189: not always. For example, the predicate @code{address_operand} uses
190: @var{m} as the mode of memory ref that the address should be valid for.
191: Many predicates accept @code{const_int} nodes even though their mode is
192: @code{VOIDmode}.
193:
194: @var{constraint} controls reloading and the choice of the best register
195: class to use for a value, as explained later (@pxref{Constraints}).
196:
197: People are often unclear on the difference between the constraint and the
198: predicate. The predicate helps decide whether a given insn matches the
199: pattern. The constraint plays no role in this decision; instead, it
200: controls various decisions in the case of an insn which does match.
201:
202: @findex general_operand
203: On CISC machines, the most common @var{predicate} is
204: @code{"general_operand"}. This function checks that the putative
205: operand is either a constant, a register or a memory reference, and that
206: it is valid for mode @var{m}.
207:
208: @findex register_operand
209: For an operand that must be a register, @var{predicate} should be
210: @code{"register_operand"}. Using @code{"general_operand"} would be
211: valid, since the reload pass would copy any non-register operands
212: through registers, but this would make GNU CC do extra work, it would
213: prevent invariant operands (such as constants) from being removed from
214: loops, and it would prevent the register allocator from doing the best
215: possible job. On RISC machines, it is usually most efficient to allow
216: @var{predicate} to accept only objects that the constraints allow.
217:
218: @findex immediate_operand
219: For an operand that must be a constant, you must be sure to either use
220: @code{"immediate_operand"} for @var{predicate}, or make the instruction
221: pattern's extra condition require a constant, or both. You cannot
222: expect the constraints to do this work! If the constraints allow only
223: constants, but the predicate allows something else, the compiler will
224: crash when that case arises.
225:
226: @findex match_scratch
227: @item (match_scratch:@var{m} @var{n} @var{constraint})
228: This expression is also a placeholder for operand number @var{n}
229: and indicates that operand must be a @code{scratch} or @code{reg}
230: expression.
231:
232: When matching patterns, this is completely equivalent to
233:
234: @smallexample
235: (match_operand:@var{m} @var{n} "scratch_operand" @var{constraint})
236: @end smallexample
237:
238: but, when generating RTL, it produces a (@code{scratch}:@var{m})
239: expression.
240:
241: If the last few expressions in a @code{parallel} are @code{clobber}
242: expressions whose operands are either a hard register or
243: @code{match_scratch}, the combiner can add them when necessary.
244: @xref{Side Effects}.
245:
246: @findex match_dup
247: @item (match_dup @var{n})
248: This expression is also a placeholder for operand number @var{n}.
249: It is used when the operand needs to appear more than once in the
250: insn.
251:
252: In construction, @code{match_dup} acts just like @code{match_operand}:
253: the operand is substituted into the insn being constructed. But in
254: matching, @code{match_dup} behaves differently. It assumes that operand
255: number @var{n} has already been determined by a @code{match_operand}
256: appearing earlier in the recognition template, and it matches only an
257: identical-looking expression.
258:
259: @findex match_operator
260: @item (match_operator:@var{m} @var{n} @var{predicate} [@var{operands}@dots{}])
261: This pattern is a kind of placeholder for a variable RTL expression
262: code.
263:
264: When constructing an insn, it stands for an RTL expression whose
265: expression code is taken from that of operand @var{n}, and whose
266: operands are constructed from the patterns @var{operands}.
267:
268: When matching an expression, it matches an expression if the function
269: @var{predicate} returns nonzero on that expression @emph{and} the
270: patterns @var{operands} match the operands of the expression.
271:
272: Suppose that the function @code{commutative_operator} is defined as
273: follows, to match any expression whose operator is one of the
274: commutative arithmetic operators of RTL and whose mode is @var{mode}:
275:
276: @smallexample
277: int
278: commutative_operator (x, mode)
279: rtx x;
280: enum machine_mode mode;
281: @{
282: enum rtx_code code = GET_CODE (x);
283: if (GET_MODE (x) != mode)
284: return 0;
285: return (GET_RTX_CLASS (code) == 'c'
286: || code == EQ || code == NE);
287: @}
288: @end smallexample
289:
290: Then the following pattern will match any RTL expression consisting
291: of a commutative operator applied to two general operands:
292:
293: @smallexample
294: (match_operator:SI 3 "commutative_operator"
295: [(match_operand:SI 1 "general_operand" "g")
296: (match_operand:SI 2 "general_operand" "g")])
297: @end smallexample
298:
299: Here the vector @code{[@var{operands}@dots{}]} contains two patterns
300: because the expressions to be matched all contain two operands.
301:
302: When this pattern does match, the two operands of the commutative
303: operator are recorded as operands 1 and 2 of the insn. (This is done
304: by the two instances of @code{match_operand}.) Operand 3 of the insn
305: will be the entire commutative expression: use @code{GET_CODE
306: (operands[3])} to see which commutative operator was used.
307:
308: The machine mode @var{m} of @code{match_operator} works like that of
309: @code{match_operand}: it is passed as the second argument to the
310: predicate function, and that function is solely responsible for
311: deciding whether the expression to be matched ``has'' that mode.
312:
313: When constructing an insn, argument 3 of the gen-function will specify
314: the operation (i.e. the expression code) for the expression to be
315: made. It should be an RTL expression, whose expression code is copied
316: into a new expression whose operands are arguments 1 and 2 of the
317: gen-function. The subexpressions of argument 3 are not used;
318: only its expression code matters.
319:
320: When @code{match_operator} is used in a pattern for matching an insn,
321: it is usually best if the operand number of the @code{match_operator}
322: is higher than that of the actual operands of the insn. This improves
323: register allocation because the register allocator often looks at
324: operands 1 and 2 of insns to see if it can do register tying.
325:
326: There is no way to specify constraints in @code{match_operator}. The
327: operand of the insn which corresponds to the @code{match_operator}
328: never has any constraints because it is never reloaded as a whole.
329: However, if parts of its @var{operands} are matched by
330: @code{match_operand} patterns, those parts may have constraints of
331: their own.
332:
333: @findex match_op_dup
334: @item (match_op_dup:@var{m} @var{n}[@var{operands}@dots{}])
335: Like @code{match_dup}, except that it applies to operators instead of
336: operands. When constructing an insn, operand number @var{n} will be
337: substituted at this point. But in matching, @code{match_op_dup} behaves
338: differently. It assumes that operand number @var{n} has already been
339: determined by a @code{match_operator} appearing earlier in the
340: recognition template, and it matches only an identical-looking
341: expression.
342:
343: @findex match_parallel
344: @item (match_parallel @var{n} @var{predicate} [@var{subpat}@dots{}])
345: This pattern is a placeholder for an insn that consists of a
346: @code{parallel} expression with a variable number of elements. This
347: expression should only appear at the top level of an insn pattern.
348:
349: When constructing an insn, operand number @var{n} will be substituted at
350: this point. When matching an insn, it matches if the body of the insn
351: is a @code{parallel} expression with at least as many elements as the
352: vector of @var{subpat} expressions in the @code{match_parallel}, if each
353: @var{subpat} matches the corresponding element of the @code{parallel},
354: @emph{and} the function @var{predicate} returns nonzero on the
355: @code{parallel} that is the body of the insn. It is the responsibility
356: of the predicate to validate elements of the @code{parallel} beyond
357: those listed in the @code{match_parallel}.@refill
358:
359: A typical use of @code{match_parallel} is to match load and store
360: multiple expressions, which can contain a variable number of elements
361: in a @code{parallel}. For example,
362: @c the following is *still* going over. need to change the code.
363: @c also need to work on grouping of this example. --mew 1feb93
364:
365: @smallexample
366: (define_insn ""
367: [(match_parallel 0 "load_multiple_operation"
368: [(set (match_operand:SI 1 "gpc_reg_operand" "=r")
369: (match_operand:SI 2 "memory_operand" "m"))
370: (use (reg:SI 179))
371: (clobber (reg:SI 179))])]
372: ""
373: "loadm 0,0,%1,%2")
374: @end smallexample
375:
376: This example comes from @file{a29k.md}. The function
377: @code{load_multiple_operation} is defined in @file{a29k.c} and checks
378: that subsequent elements in the @code{parallel} are the same as the
379: @code{set} in the pattern, except that they are referencing subsequent
380: registers and memory locations.
381:
382: An insn that matches this pattern might look like:
383:
384: @smallexample
385: (parallel
386: [(set (reg:SI 20) (mem:SI (reg:SI 100)))
387: (use (reg:SI 179))
388: (clobber (reg:SI 179))
389: (set (reg:SI 21)
390: (mem:SI (plus:SI (reg:SI 100)
391: (const_int 4))))
392: (set (reg:SI 22)
393: (mem:SI (plus:SI (reg:SI 100)
394: (const_int 8))))])
395: @end smallexample
396:
397: @findex match_par_dup
398: @item (match_par_dup @var{n} [@var{subpat}@dots{}])
399: Like @code{match_op_dup}, but for @code{match_parallel} instead of
400: @code{match_operator}.
401:
402: @findex address
403: @item (address (match_operand:@var{m} @var{n} "address_operand" ""))
404: This complex of expressions is a placeholder for an operand number
405: @var{n} in a ``load address'' instruction: an operand which specifies
406: a memory location in the usual way, but for which the actual operand
407: value used is the address of the location, not the contents of the
408: location.
409:
410: @code{address} expressions never appear in RTL code, only in machine
411: descriptions. And they are used only in machine descriptions that do
412: not use the operand constraint feature. When operand constraints are
413: in use, the letter @samp{p} in the constraint serves this purpose.
414:
415: @var{m} is the machine mode of the @emph{memory location being
416: addressed}, not the machine mode of the address itself. That mode is
417: always the same on a given target machine (it is @code{Pmode}, which
418: normally is @code{SImode}), so there is no point in mentioning it;
419: thus, no machine mode is written in the @code{address} expression. If
420: some day support is added for machines in which addresses of different
421: kinds of objects appear differently or are used differently (such as
422: the PDP-10), different formats would perhaps need different machine
423: modes and these modes might be written in the @code{address}
424: expression.
425: @end table
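
Returning to @code{match_parallel}: the kind of check that a predicate such
as @code{load_multiple_operation} performs on the trailing elements of a
@code{parallel} can be pictured with the standalone C sketch below. It is
only an illustration under stated assumptions. The structure
@code{load_elt} and the function @code{consecutive_loads_p} are
hypothetical names invented here, and plain integers stand in for the
@code{rtx} objects that a real predicate would inspect.

```c
#include <stddef.h>

/* Hypothetical, simplified picture of one
   (set (reg) (mem (plus base offset))) element.  A real predicate
   examines rtx objects instead of this structure.  */
struct load_elt
{
  int dest_regno;   /* register being loaded */
  int base_regno;   /* register holding the base address */
  long offset;      /* byte offset from the base address */
};

/* Return nonzero if each element after the first loads the next
   register from the next WORD_SIZE-byte memory location.  */
static int
consecutive_loads_p (const struct load_elt *elts, size_t count,
                     long word_size)
{
  if (count == 0)
    return 0;

  for (size_t i = 1; i < count; i++)
    {
      if (elts[i].dest_regno != elts[0].dest_regno + (int) i
          || elts[i].base_regno != elts[0].base_regno
          || elts[i].offset != elts[0].offset + (long) i * word_size)
        return 0;
    }
  return 1;
}
```

With a word size of 4, the three @code{set} elements of the sample insn
shown earlier (registers 20, 21 and 22, loaded from offsets 0, 4 and 8
relative to register 100) would pass this check.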
426:
427: @node Output Template
428: @section Output Templates and Operand Substitution
429: @cindex output templates
430: @cindex operand substitution
431:
432: @cindex @samp{%} in template
433: @cindex percent sign
434: The @dfn{output template} is a string which specifies how to output the
435: assembler code for an instruction pattern. Most of the template is a
436: fixed string which is output literally. The character @samp{%} is used
437: to specify where to substitute an operand; it can also be used to
438: identify places where different variants of the assembler require
439: different syntax.
440:
441: In the simplest case, a @samp{%} followed by a digit @var{n} says to output
442: operand @var{n} at that point in the string.
443:
444: @samp{%} followed by a letter and a digit says to output an operand in an
445: alternate fashion. Four letters have standard, built-in meanings described
446: below. The machine description macro @code{PRINT_OPERAND} can define
447: additional letters with nonstandard meanings.
448:
449: @samp{%c@var{digit}} can be used to substitute an operand that is a
450: constant value without the syntax that normally indicates an immediate
451: operand.
452:
453: @samp{%n@var{digit}} is like @samp{%c@var{digit}} except that the value of
454: the constant is negated before printing.
455:
456: @samp{%a@var{digit}} can be used to substitute an operand as if it were a
457: memory reference, with the actual operand treated as the address. This may
458: be useful when outputting a ``load address'' instruction, because often the
459: assembler syntax for such an instruction requires you to write the operand
460: as if it were a memory reference.
461:
462: @samp{%l@var{digit}} is used to substitute a @code{label_ref} into a jump
463: instruction.
464:
465: @samp{%=} outputs a number which is unique to each instruction in the
466: entire compilation. This is useful for making local labels to be
467: referred to more than once in a single template that generates multiple
468: assembler instructions.
469:
470: @samp{%} followed by a punctuation character specifies a substitution that
471: does not use an operand. Only one case is standard: @samp{%%} outputs a
472: @samp{%} into the assembler code. Other nonstandard cases can be
473: defined in the @code{PRINT_OPERAND} macro. You must also define
474: which punctuation characters are valid with the
475: @code{PRINT_OPERAND_PUNCT_VALID_P} macro.
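
To make the simplest substitution rules concrete, here is a small
standalone C sketch that handles only @samp{%@var{digit}} and @samp{%%}.
It is not the compiler's own output routine. The function name
@code{expand_template} is hypothetical, and operands are represented as
ready-made strings, whereas the compiler prints @code{rtx} operands.
Letters and other punctuation after @samp{%} are rejected here rather than
passed to @code{PRINT_OPERAND}.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for template expansion: copy TMPL into OUT,
   replacing %N (N a single digit) with OPERANDS[N] and %% with a
   literal %.  Returns 0 on success, or -1 if OUT is too small, an
   operand number is out of range, or an unsupported % sequence
   appears.  */
static int
expand_template (const char *tmpl, const char *const *operands,
                 size_t n_operands, char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return -1;

  for (const char *p = tmpl; *p != '\0'; p++)
    {
      const char *piece = p;
      size_t len = 1;

      if (*p == '%')
        {
          p++;
          if (*p == '%')
            piece = p;          /* %% yields a single % */
          else if (*p >= '0' && *p <= '9'
                   && (size_t) (*p - '0') < n_operands)
            {
              piece = operands[*p - '0'];
              len = strlen (piece);
            }
          else
            return -1;          /* also catches a trailing lone % */
        }

      if (used + len >= out_size)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }

  out[used] = '\0';
  return 0;
}
```

For instance, with the made-up operand strings @samp{d0} and @samp{a1} as
operands 0 and 1, the template @samp{movl %1,%0} would expand to
@samp{movl a1,d0}.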
476:
477: @cindex \
478: @cindex backslash
479: The template may generate multiple assembler instructions. Write the text
480: for the instructions, with @samp{\;} between them.
481:
482: @cindex matching operands
483: When the RTL contains two operands which are required by constraint to match
484: each other, the output template must refer only to the lower-numbered operand.
485: Matching operands are not always identical, and the rest of the compiler
486: arranges to put the proper RTL expression for printing into the lower-numbered
487: operand.
488:
489: One use of nonstandard letters or punctuation following @samp{%} is to
490: distinguish between different assembler languages for the same machine; for
491: example, Motorola syntax versus MIT syntax for the 68000. Motorola syntax
492: requires periods in most opcode names, while MIT syntax does not. For
493: example, the opcode @samp{movel} in MIT syntax is @samp{move.l} in Motorola
494: syntax. The same file of patterns is used for both kinds of output syntax,
495: but the character sequence @samp{%.} is used in each place where Motorola
496: syntax wants a period. The @code{PRINT_OPERAND} macro for Motorola syntax
497: defines the sequence to output a period; the macro for MIT syntax defines
498: it to do nothing.
499:
500: @cindex @code{#} in template
501: As a special case, a template consisting of the single character @code{#}
502: instructs the compiler to first split the insn, and then output the
503: resulting instructions separately. This helps eliminate redundancy in the
504: output templates. If you have a @code{define_insn} that needs to emit
505: multiple assembler instructions, and there is a matching @code{define_split}
506: already defined, then you can simply use @code{#} as the output template
507: instead of writing an output template that emits the multiple assembler
508: instructions.
509:
510: If @code{ASSEMBLER_DIALECT} is defined, you can use
511: @samp{@{option0|option1|option2@}} constructs in the templates. These
512: describe multiple variants of assembler language syntax.
513: @xref{Instruction Output}.
514:
515: @node Output Statement
516: @section C Statements for Assembler Output
517: @cindex output statements
518: @cindex C statements for assembler output
519: @cindex generating assembler output
520:
521: Often a single fixed template string cannot produce correct and efficient
522: assembler code for all the cases that are recognized by a single
523: instruction pattern. For example, the opcodes may depend on the kinds of
524: operands; or some unfortunate combinations of operands may require extra
525: machine instructions.
526:
527: If the output control string starts with a @samp{@@}, then it is actually
528: a series of templates, each on a separate line. (Blank lines and
529: leading spaces and tabs are ignored.) The templates correspond to the
530: pattern's constraint alternatives (@pxref{Multi-Alternative}). For example,
531: if a target machine has a two-address add instruction @samp{addr} to add
532: into a register and another @samp{addm} to add a register to memory, you
533: might write this pattern:
534:
535: @smallexample
536: (define_insn "addsi3"
537: [(set (match_operand:SI 0 "general_operand" "=r,m")
538: (plus:SI (match_operand:SI 1 "general_operand" "0,0")
539: (match_operand:SI 2 "general_operand" "g,r")))]
540: ""
541: "@@
542: addr %2,%0
543: addm %2,%0")
544: @end smallexample
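
The correspondence between constraint alternatives and template lines can
be sketched in standalone C as follows. This is an illustration only, not
the compiler's implementation. The function name
@code{select_alternative} is hypothetical, and its first argument is
assumed to be the text that follows the leading @samp{@@}.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: TEMPLATES is the text after the leading '@',
   and WHICH is the number of the matched constraint alternative,
   counting from 0.  Copy the corresponding template line into OUT.
   Blank lines and leading spaces and tabs are skipped.  Returns 0 on
   success, or -1 if there is no such alternative or OUT is too
   small.  */
static int
select_alternative (const char *templates, int which,
                    char *out, size_t out_size)
{
  const char *p = templates;
  int index = 0;

  while (*p != '\0')
    {
      /* Skip leading spaces and tabs, and blank lines entirely.  */
      while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
      if (*p == '\0')
        break;

      size_t len = strcspn (p, "\n");
      if (index == which)
        {
          if (len >= out_size)
            return -1;
          memcpy (out, p, len);
          out[len] = '\0';
          return 0;
        }
      index++;
      p += len;
    }
  return -1;
}
```

Applied to the two template lines of the @samp{addsi3} pattern above,
alternative 0 would select @samp{addr %2,%0} and alternative 1 would
select @samp{addm %2,%0}.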
545:
546: @cindex @code{*} in template
547: @cindex asterisk in template
548: If the output control string starts with a @samp{*}, then it is not an
549: output template but rather a piece of C program that should compute a
550: template. It should execute a @code{return} statement to return the
551: template-string you want. Most such templates use C string literals, which
552: require doublequote characters to delimit them. To include these
553: doublequote characters in the string, prefix each one with @samp{\}.
554:
555: The operands may be found in the array @code{operands}, whose C data type
556: is @code{rtx []}.
557:
558: It is very common to select different ways of generating assembler code
559: based on whether an immediate operand is within a certain range. Be
560: careful when doing this, because the result of @code{INTVAL} is an
561: integer on the host machine. If the host machine has more bits in an
562: @code{int} than the target machine has in the mode in which the constant
563: will be used, then some of the bits you get from @code{INTVAL} will be
564: superfluous. For proper results, you must carefully disregard the
565: values of those bits.
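
One way to disregard the superfluous bits is to reduce the host value to
the width of the target mode before testing its range. The standalone C
sketch below shows the idea. The helper name
@code{trunc_to_target_width} and the choice of @code{long long} as the
host integer type are assumptions made for this example. Neither is part
of GNU CC.

```c
/* Hypothetical helper (not part of GNU CC): keep only the low BITS
   bits of VALUE and sign-extend them to the full host width.  BITS
   must be between 1 and 63.  */
static long long
trunc_to_target_width (long long value, int bits)
{
  unsigned long long mask = (1ULL << bits) - 1;
  unsigned long long low = (unsigned long long) value & mask;
  unsigned long long sign = 1ULL << (bits - 1);

  /* XOR-and-subtract sign-extends without shifting a negative
     value.  */
  return (long long) (low ^ sign) - (long long) sign;
}
```

For example, the host value @code{0x1FFFF} carries a stray bit above bit
15. Reduced to a 16-bit mode it is simply @minus{}1, so a range test
should be applied to the reduced value rather than to the raw host
integer.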
566:
567: @findex output_asm_insn
568: It is possible to output an assembler instruction and then go on to output
569: or compute more of them, using the subroutine @code{output_asm_insn}. This
570: receives two arguments: a template-string and a vector of operands. The
571: vector may be @code{operands}, or it may be another array of @code{rtx}
572: that you declare locally and initialize yourself.
573:
574: @findex which_alternative
575: When an insn pattern has multiple alternatives in its constraints, often
576: the appearance of the assembler code is determined mostly by which alternative
577: was matched. When this is so, the C code can test the variable
578: @code{which_alternative}, which is the ordinal number of the alternative
579: that was actually satisfied (0 for the first, 1 for the second alternative,
580: etc.).
581:
582: For example, suppose there are two opcodes for storing zero, @samp{clrreg}
583: for registers and @samp{clrmem} for memory locations. Here is how
584: a pattern could use @code{which_alternative} to choose between them:
585:
586: @smallexample
587: (define_insn ""
588: [(set (match_operand:SI 0 "general_operand" "=r,m")
589: (const_int 0))]
590: ""
591: "*
592: return (which_alternative == 0
593: ? \"clrreg %0\" : \"clrmem %0\");
594: ")
595: @end smallexample
596:
597: The example above, where the assembler code to generate was
598: @emph{solely} determined by the alternative, could also have been specified
599: as follows, having the output control string start with a @samp{@@}:
600:
601: @smallexample
602: @group
603: (define_insn ""
604: [(set (match_operand:SI 0 "general_operand" "=r,m")
605: (const_int 0))]
606: ""
607: "@@
608: clrreg %0
609: clrmem %0")
610: @end group
611: @end smallexample
612: @end ifset
613:
614: @c Most of this node appears by itself (in a different place) even
615: @c when the INTERNALS flag is clear. Passages that require the full
616: @c manual's context are conditionalized to appear only in the full manual.
617: @ifset INTERNALS
618: @node Constraints
619: @section Operand Constraints
620: @cindex operand constraints
621: @cindex constraints
622:
623: Each @code{match_operand} in an instruction pattern can specify a
624: constraint for the type of operands allowed.
625: @end ifset
626: @ifclear INTERNALS
627: @node Constraints
628: @section Constraints for @code{asm} Operands
629: @cindex operand constraints, @code{asm}
630: @cindex constraints, @code{asm}
631: @cindex @code{asm} constraints
632: Here are specific details on what constraint letters you can use with
633: @code{asm} operands.
634: @end ifclear
635: Constraints can say whether
636: an operand may be in a register, and which kinds of register; whether the
637: operand can be a memory reference, and which kinds of address; whether the
638: operand may be an immediate constant, and which possible values it may
639: have. Constraints can also require two operands to match.
640:
641: @ifset INTERNALS
642: @menu
643: * Simple Constraints:: Basic use of constraints.
644: * Multi-Alternative:: When an insn has two alternative constraint-patterns.
645: * Class Preferences:: Constraints guide which hard register to put things in.
646: * Modifiers:: More precise control over effects of constraints.
647: * Machine Constraints:: Existing constraints for some particular machines.
648: * No Constraints:: Describing a clean machine without constraints.
649: @end menu
650: @end ifset
651:
652: @ifclear INTERNALS
653: @menu
654: * Simple Constraints:: Basic use of constraints.
655: * Multi-Alternative:: When an insn has two alternative constraint-patterns.
656: * Modifiers:: More precise control over effects of constraints.
657: * Machine Constraints:: Special constraints for some particular machines.
658: @end menu
659: @end ifclear
660:
661: @node Simple Constraints
662: @subsection Simple Constraints
663: @cindex simple constraints
664:
665: The simplest kind of constraint is a string full of letters, each of
666: which describes one kind of operand that is permitted. Here are
667: the letters that are allowed:
668:
669: @table @asis
670: @cindex @samp{m} in constraint
671: @cindex memory references in constraints
672: @item @samp{m}
673: A memory operand is allowed, with any kind of address that the machine
674: supports in general.
675:
676: @cindex offsettable address
677: @cindex @samp{o} in constraint
678: @item @samp{o}
679: A memory operand is allowed, but only if the address is
680: @dfn{offsettable}. This means that a small integer (actually,
681: the width in bytes of the operand, as determined by its machine mode)
682: may be added to the address and the result is also a valid memory
683: address.
684:
685: @cindex autoincrement/decrement addressing
686: For example, an address which is constant is offsettable; so is an
687: address that is the sum of a register and a constant (as long as a
688: slightly larger constant is also within the range of address-offsets
689: supported by the machine); but an autoincrement or autodecrement
690: address is not offsettable. More complicated indirect/indexed
691: addresses may or may not be offsettable depending on the other
692: addressing modes that the machine supports.
693:
694: Note that in an output operand which can be matched by another
695: operand, the constraint letter @samp{o} is valid only when accompanied
696: by both @samp{<} (if the target machine has predecrement addressing)
697: and @samp{>} (if the target machine has preincrement addressing).
698:
699: @cindex @samp{V} in constraint
700: @item @samp{V}
701: A memory operand that is not offsettable. In other words, anything that
702: would fit the @samp{m} constraint but not the @samp{o} constraint.
703:
704: @cindex @samp{<} in constraint
705: @item @samp{<}
706: A memory operand with autodecrement addressing (either predecrement or
707: postdecrement) is allowed.
708:
709: @cindex @samp{>} in constraint
710: @item @samp{>}
711: A memory operand with autoincrement addressing (either preincrement or
712: postincrement) is allowed.
713:
714: @cindex @samp{r} in constraint
715: @cindex registers in constraints
716: @item @samp{r}
717: A register operand is allowed provided that it is in a general
718: register.
719:
720: @cindex @samp{d} in constraint
721: @item @samp{d}, @samp{a}, @samp{f}, @dots{}
722: Other letters can be defined in machine-dependent fashion to stand for
723: particular classes of registers. @samp{d}, @samp{a} and @samp{f} are
724: defined on the 68000/68020 to stand for data, address and floating
725: point registers.
726:
727: @cindex constants in constraints
728: @cindex @samp{i} in constraint
729: @item @samp{i}
730: An immediate integer operand (one with constant value) is allowed.
731: This includes symbolic constants whose values will be known only at
732: assembly time.
733:
734: @cindex @samp{n} in constraint
735: @item @samp{n}
736: An immediate integer operand with a known numeric value is allowed.
737: Many systems cannot support assembly-time constants for operands less
738: than a word wide. Constraints for these operands should use @samp{n}
739: rather than @samp{i}.
740:
741: @cindex @samp{I} in constraint
742: @item @samp{I}, @samp{J}, @samp{K}, @dots{} @samp{P}
743: Other letters in the range @samp{I} through @samp{P} may be defined in
744: a machine-dependent fashion to permit immediate integer operands with
745: explicit integer values in specified ranges. For example, on the
746: 68000, @samp{I} is defined to stand for the range of values 1 to 8.
747: This is the range permitted as a shift count in the shift
748: instructions.
749:
750: @cindex @samp{E} in constraint
751: @item @samp{E}
752: An immediate floating operand (expression code @code{const_double}) is
753: allowed, but only if the target floating point format is the same as
754: that of the host machine (on which the compiler is running).
755:
756: @cindex @samp{F} in constraint
757: @item @samp{F}
758: An immediate floating operand (expression code @code{const_double}) is
759: allowed.
760:
761: @cindex @samp{G} in constraint
762: @cindex @samp{H} in constraint
763: @item @samp{G}, @samp{H}
764: @samp{G} and @samp{H} may be defined in a machine-dependent fashion to
765: permit immediate floating operands in particular ranges of values.
766:
767: @cindex @samp{s} in constraint
768: @item @samp{s}
769: An immediate integer operand whose value is not an explicit integer is
770: allowed.
771:
772: This might appear strange; if an insn allows a constant operand with a
773: value not known at compile time, it certainly must allow any known
774: value. So why use @samp{s} instead of @samp{i}? Sometimes it allows
775: better code to be generated.
776:
777: For example, on the 68000 in a fullword instruction it is possible to
778: use an immediate operand; but if the immediate value is between -128
779: and 127, better code results from loading the value into a register and
780: using the register. This is because the load into the register can be
781: done with a @samp{moveq} instruction. We arrange for this to happen
782: by defining the letter @samp{K} to mean ``any integer outside the
783: range -128 to 127'', and then specifying @samp{Ks} in the operand
784: constraints.
785:
786: @cindex @samp{g} in constraint
787: @item @samp{g}
788: Any register, memory or immediate integer operand is allowed, except for
789: registers that are not general registers.
790:
791: @cindex @samp{X} in constraint
792: @item @samp{X}
793: @ifset INTERNALS
794: Any operand whatsoever is allowed, even if it does not satisfy
795: @code{general_operand}. This is normally used in the constraint of
796: a @code{match_scratch} when certain alternatives will not actually
797: require a scratch register.
798: @end ifset
799: @ifclear INTERNALS
800: Any operand whatsoever is allowed.
801: @end ifclear
802:
803: @cindex @samp{0} in constraint
804: @cindex digits in constraint
805: @item @samp{0}, @samp{1}, @samp{2}, @dots{} @samp{9}
806: An operand that matches the specified operand number is allowed. If a
807: digit is used together with letters within the same alternative, the
808: digit should come last.
809:
810: @cindex matching constraint
811: @cindex constraint, matching
812: This is called a @dfn{matching constraint} and what it really means is
813: that the assembler has only a single operand that fills two roles
814: @ifset INTERNALS
815: considered separate in the RTL insn. For example, an add insn has two
816: input operands and one output operand in the RTL, but on most CISC
817: @end ifset
818: @ifclear INTERNALS
819: which @code{asm} distinguishes. For example, an add instruction uses
820: two input operands and an output operand, but on most CISC
821: @end ifclear
822: machines an add instruction really has only two operands, one of them an
823: input-output operand:
824:
825: @smallexample
826: addl #35,r12
827: @end smallexample
828:
829: Matching constraints are used in these circumstances.
830: More precisely, the two operands that match must include one input-only
831: operand and one output-only operand. Moreover, the digit must be a
832: smaller number than the number of the operand that uses it in the
833: constraint.
834:
835: @ifset INTERNALS
836: For operands to match in a particular case usually means that they
837: are identical-looking RTL expressions. But in a few special cases
838: specific kinds of dissimilarity are allowed. For example, @code{*x}
839: as an input operand will match @code{*x++} as an output operand.
840: For proper results in such cases, the output template should always
841: use the output-operand's number when printing the operand.
842: @end ifset
843:
844: @cindex load address instruction
845: @cindex push address instruction
846: @cindex address constraints
847: @cindex @samp{p} in constraint
848: @item @samp{p}
849: An operand that is a valid memory address is allowed. This is
850: for ``load address'' and ``push address'' instructions.
851:
852: @findex address_operand
853: @samp{p} in the constraint must be accompanied by @code{address_operand}
854: as the predicate in the @code{match_operand}. This predicate interprets
855: the mode specified in the @code{match_operand} as the mode of the memory
856: reference for which the address would be valid.
857:
858: @cindex extensible constraints
859: @cindex @samp{Q} in constraint
860: @item @samp{Q}, @samp{R}, @samp{S}, @dots{} @samp{U}
861: Letters in the range @samp{Q} through @samp{U} may be defined in a
862: machine-dependent fashion to stand for arbitrary operand types.
863: @ifset INTERNALS
864: The machine description macro @code{EXTRA_CONSTRAINT} is passed the
865: operand as its first argument and the constraint letter as its
866: second argument.
867:
868: A typical use for this would be to distinguish certain types of
869: memory references that affect other insn operands.
870:
871: Do not define these constraint letters to accept register references
872: (@code{reg}); the reload pass does not expect this and would not handle
873: it properly.
874: @end ifset
875: @end table
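
@ifset INTERNALS
As a minimal sketch of how these letters combine, consider the following
pattern for a hypothetical two-address machine. The @samp{add} mnemonic
and its operand order are assumptions made for illustration; they are not
taken from any real machine description.

@smallexample
;; Hypothetical two-address add: operand 0 = operand 0 + operand 2.
(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "0")
                 (match_operand:SI 2 "general_operand" "rmi")))]
  ""
  "add %2,%0")
@end smallexample

@noindent
Here operand 0 must be a general register, operand 1 must be the same
register as operand 0 (a matching constraint), and operand 2 may be a
general register, a memory reference or an immediate integer. The
@samp{=} marks operand 0 as an output (@pxref{Modifiers}).
@end ifset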
876:
877: @ifset INTERNALS
878: In order to have valid assembler code, each operand must satisfy
879: its constraint. But a failure to do so does not prevent the pattern
880: from applying to an insn. Instead, it directs the compiler to modify
881: the code so that the constraint will be satisfied. Usually this is
882: done by copying an operand into a register.
883:
884: Contrast, therefore, the two instruction patterns that follow:
885:
886: @smallexample
887: (define_insn ""
888: [(set (match_operand:SI 0 "general_operand" "=r")
889: (plus:SI (match_dup 0)
890: (match_operand:SI 1 "general_operand" "r")))]
891: ""
892: "@dots{}")
893: @end smallexample
894:
895: @noindent
896: which has two operands, one of which must appear in two places, and
897:
898: @smallexample
899: (define_insn ""
900: [(set (match_operand:SI 0 "general_operand" "=r")
901: (plus:SI (match_operand:SI 1 "general_operand" "0")
902: (match_operand:SI 2 "general_operand" "r")))]
903: ""
904: "@dots{}")
905: @end smallexample
906:
907: @noindent
908: which has three operands, two of which are required by a constraint to be
909: identical. If we are considering an insn of the form
910:
911: @smallexample
912: (insn @var{n} @var{prev} @var{next}
913: (set (reg:SI 3)
914: (plus:SI (reg:SI 6) (reg:SI 109)))
915: @dots{})
916: @end smallexample
917:
918: @noindent
919: the first pattern would not apply at all, because this insn does not
920: contain two identical subexpressions in the right place. The pattern would
921: say, ``That does not look like an add instruction; try other patterns.''
922: The second pattern would say, ``Yes, that's an add instruction, but there
923: is something wrong with it.'' It would direct the reload pass of the
924: compiler to generate additional insns to make the constraint true. The
925: results might look like this:
926:
927: @smallexample
928: (insn @var{n2} @var{prev} @var{n}
929: (set (reg:SI 3) (reg:SI 6))
930: @dots{})
931:
932: (insn @var{n} @var{n2} @var{next}
933: (set (reg:SI 3)
934: (plus:SI (reg:SI 3) (reg:SI 109)))
935: @dots{})
936: @end smallexample
937:
938: It is up to you to make sure that each operand, in each pattern, has
939: constraints that can handle any RTL expression that could be present for
940: that operand. (When multiple alternatives are in use, each pattern must,
941: for each possible combination of operand expressions, have at least one
942: alternative which can handle that combination of operands.) The
943: constraints don't need to @emph{allow} any possible operand---when this is
944: the case, they do not constrain---but they must at least point the way to
945: reloading any possible operand so that it will fit.
946:
947: @itemize @bullet
948: @item
949: If the constraint accepts whatever operands the predicate permits,
950: there is no problem: reloading is never necessary for this operand.
951:
952: For example, an operand whose constraints permit everything except
953: registers is safe provided its predicate rejects registers.
954:
955: An operand whose predicate accepts only constant values is safe
956: provided its constraints include the letter @samp{i}. If any possible
957: constant value is accepted, then nothing less than @samp{i} will do;
958: if the predicate is more selective, then the constraints may also be
959: more selective.
960:
961: @item
962: Any operand expression can be reloaded by copying it into a register.
963: So if an operand's constraints allow some kind of register, it is
964: certain to be safe. It need not permit all classes of registers; the
965: compiler knows how to copy a register into another register of the
966: proper class in order to make an instruction valid.
967:
968: @cindex nonoffsettable memory reference
969: @cindex memory reference, nonoffsettable
970: @item
971: A nonoffsettable memory reference can be reloaded by copying the
972: address into a register. So if the constraint uses the letter
973: @samp{o}, all memory references are taken care of.
974:
975: @item
976: A constant operand can be reloaded by allocating space in memory to
977: hold it as preinitialized data. Then the memory reference can be used
978: in place of the constant. So if the constraint uses the letters
979: @samp{o} or @samp{m}, constant operands are not a problem.
980:
981: @item
982: If the constraint permits a constant and a pseudo register used in an insn
983: was not allocated to a hard register and is equivalent to a constant,
984: the register will be replaced with the constant. If the predicate does
985: not permit a constant and the insn is re-recognized for some reason, the
986: compiler will crash. Thus the predicate must always recognize any
987: objects allowed by the constraint.
988: @end itemize
989:
990: If the operand's predicate can recognize registers, but the constraint does
991: not permit them, it can make the compiler crash. When this operand happens
992: to be a register, the reload pass will be stymied, because it does not know
993: how to copy a register temporarily into memory.
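
As an illustration of this last point, the first operand description
below is unsafe, and the second is a sketch of one way to repair it.
It assumes that a predicate named @code{memory_operand}, accepting only
memory references, is available to the machine description.

@smallexample
;; Unsafe: the predicate accepts registers, but the constraint
;; allows only memory.
(match_operand:SI 1 "general_operand" "m")

;; Safer: the predicate rejects whatever the constraint cannot handle.
(match_operand:SI 1 "memory_operand" "m")
@end smallexample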
994: @end ifset
995:
996: @node Multi-Alternative
997: @subsection Multiple Alternative Constraints
998: @cindex multiple alternative constraints
999:
1000: Sometimes a single instruction has multiple alternative sets of possible
1001: operands. For example, on the 68000, a logical-or instruction can combine
1002: a register or an immediate value into memory, or it can combine any kind of
1003: operand into a register; but it cannot combine one memory location into
1004: another.
1005:
1006: These constraints are represented as multiple alternatives. An alternative
1007: can be described by a series of letters for each operand. The overall
1008: constraint for an operand is made from the letters for this operand
1009: from the first alternative, a comma, the letters for this operand from
1010: the second alternative, a comma, and so on until the last alternative.
1011: @ifset INTERNALS
1012: Here is how it is done for fullword logical-or on the 68000:
1013:
1014: @smallexample
1015: (define_insn "iorsi3"
1016: [(set (match_operand:SI 0 "general_operand" "=m,d")
1017: (ior:SI (match_operand:SI 1 "general_operand" "%0,0")
1018: (match_operand:SI 2 "general_operand" "dKs,dmKs")))]
1019: @dots{})
1020: @end smallexample
1021:
1022: The first alternative has @samp{m} (memory) for operand 0, @samp{0} for
1023: operand 1 (meaning it must match operand 0), and @samp{dKs} for operand
1024: 2. The second alternative has @samp{d} (data register) for operand 0,
1025: @samp{0} for operand 1, and @samp{dmKs} for operand 2. The @samp{=} and
1026: @samp{%} in the constraints apply to all the alternatives; their
1027: meaning is explained in a later section (@pxref{Modifiers}).
1028: @end ifset
1029:
1030: @c FIXME Is this ? and ! stuff of use in asm()? If not, hide unless INTERNAL
1031: If all the operands fit any one alternative, the instruction is valid.
1032: Otherwise, for each alternative, the compiler counts how many instructions
1033: must be added to copy the operands so that that alternative applies.
1034: The alternative requiring the least copying is chosen. If two alternatives
1035: need the same amount of copying, the one that comes first is chosen.
1036: These choices can be altered with the @samp{?} and @samp{!} characters:
1037:
1038: @table @code
1039: @cindex @samp{?} in constraint
1040: @cindex question mark
1041: @item ?
1042: Disparage slightly the alternative that the @samp{?} appears in,
1043: as a choice when no alternative applies exactly. The compiler regards
1044: this alternative as one unit more costly for each @samp{?} that appears
1045: in it.
1046:
1047: @cindex @samp{!} in constraint
1048: @cindex exclamation point
1049: @item !
1050: Disparage severely the alternative that the @samp{!} appears in.
1051: This alternative can still be used if it fits without reloading,
1052: but if reloading is needed, some other alternative will be used.
1053: @end table
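
@ifset INTERNALS
The following sketch shows the effect of @samp{?} under the simplified
cost model just described. The pattern is hypothetical, and @samp{d} is
assumed to denote a data register class as on the 68000.

@smallexample
(define_insn ""
  [(set (match_operand:SI 0 "general_operand" "=?d,m")
        (match_operand:SI 1 "general_operand" "g,d"))]
  ""
  "@dots{}")
@end smallexample

@noindent
Suppose both operands are memory references, so that neither alternative
applies exactly. The first alternative needs one added instruction (to
copy the result from a data register into operand 0) and is charged one
more unit for the @samp{?}, for a total of two. The second alternative
needs one added instruction (to copy operand 1 into a data register),
for a total of one, so it is chosen. Without the @samp{?} the two
alternatives would tie and the first would be chosen.
@end ifset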
1054:
1055: @ifset INTERNALS
1056: When an insn pattern has multiple alternatives in its constraints, often
1057: the appearance of the assembler code is determined mostly by which
1058: alternative was matched. When this is so, the C code for writing the
1059: assembler code can use the variable @code{which_alternative}, which is
1060: the ordinal number of the alternative that was actually satisfied (0 for
1061: the first, 1 for the second alternative, etc.). @xref{Output Statement}.
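
A minimal sketch of such an output statement follows. The mnemonics
@samp{clrreg} and @samp{clrmem} are hypothetical and stand for whatever
instructions the target would use to clear a register and a memory
location.

@smallexample
(define_insn ""
  [(set (match_operand:SI 0 "general_operand" "=r,m")
        (const_int 0))]
  ""
  "*
  return (which_alternative == 0
          ? \"clrreg %0\" : \"clrmem %0\");
  ")
@end smallexample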
1062: @end ifset
1063:
1064: @ifset INTERNALS
1065: @node Class Preferences
1066: @subsection Register Class Preferences
1067: @cindex class preference constraints
1068: @cindex register class preference constraints
1069:
1070: @cindex voting between constraint alternatives
1071: The operand constraints have another function: they enable the compiler
1072: to decide which kind of hardware register a pseudo register is best
1073: allocated to. The compiler examines the constraints that apply to the
1074: insns that use the pseudo register, looking for the machine-dependent
1075: letters such as @samp{d} and @samp{a} that specify classes of registers.
1076: The pseudo register is put in whichever class gets the most ``votes''.
1077: The constraint letters @samp{g} and @samp{r} also vote: they vote in
1078: favor of a general register. The machine description says which registers
1079: are considered general.
1080:
1081: Of course, on some machines all registers are equivalent, and no register
1082: classes are defined. Then none of this complexity is relevant.
1083: @end ifset
1084:
1085: @node Modifiers
1086: @subsection Constraint Modifier Characters
1087: @cindex modifiers in constraints
1088: @cindex constraint modifier characters
1089:
1090: @table @samp
1091: @cindex @samp{=} in constraint
1092: @item =
1093: Means that this operand is write-only for this instruction: the previous
1094: value is discarded and replaced by output data.
1095:
1096: @cindex @samp{+} in constraint
1097: @item +
1098: Means that this operand is both read and written by the instruction.
1099:
1100: When the compiler fixes up the operands to satisfy the constraints,
1101: it needs to know which operands are inputs to the instruction and
1102: which are outputs from it. @samp{=} identifies an output; @samp{+}
1103: identifies an operand that is both input and output; all other operands
1104: are assumed to be input only.
1105:
1106: @cindex @samp{&} in constraint
1107: @item &
1108: Means (in a particular alternative) that this operand is written
1109: before the instruction is finished using the input operands.
1110: Therefore, this operand may not lie in a register that is used as an
1111: input operand or as part of any memory address.
1112:
1113: @samp{&} applies only to the alternative in which it is written. In
1114: constraints with multiple alternatives, sometimes one alternative
1115: requires @samp{&} while others do not. See, for example, the
1116: @samp{movdf} insn of the 68000.
1117:
1118: @samp{&} does not obviate the need to write @samp{=}.
1119:
1120: @cindex @samp{%} in constraint
1121: @item %
1122: Declares the instruction to be commutative for this operand and the
1123: following operand. This means that the compiler may interchange the
1124: two operands if that is the cheapest way to make all operands fit the
1125: constraints.
1126: @ifset INTERNALS
1127: This is often used in patterns for addition instructions
1128: that really have only two operands: the result must go in one of the
1129: arguments. Here for example, is how the 68000 halfword-add
1130: instruction is defined:
1131:
1132: @smallexample
1133: (define_insn "addhi3"
1134: [(set (match_operand:HI 0 "general_operand" "=m,r")
1135: (plus:HI (match_operand:HI 1 "general_operand" "%0,0")
1136: (match_operand:HI 2 "general_operand" "di,g")))]
1137: @dots{})
1138: @end smallexample
1139: @end ifset
1140:
1141: @cindex @samp{#} in constraint
1142: @item #
1143: Says that all following characters, up to the next comma, are to be
1144: ignored as a constraint. They are significant only for choosing
1145: register preferences.
1146:
1147: @ifset INTERNALS
1148: @cindex @samp{*} in constraint
1149: @item *
1150: Says that the following character should be ignored when choosing
1151: register preferences. @samp{*} has no effect on the meaning of the
1152: constraint as a constraint, and no effect on reloading.
1153:
1154: Here is an example: the 68000 has an instruction to sign-extend a
1155: halfword in a data register, and can also sign-extend a value by
1156: copying it into an address register. While either kind of register is
1157: acceptable, the constraints on an address-register destination are
1158: less strict, so it is best if register allocation makes an address
1159: register its goal. Therefore, @samp{*} is used so that the @samp{d}
1160: constraint letter (for data register) is ignored when computing
1161: register preferences.
1162:
1163: @smallexample
1164: (define_insn "extendhisi2"
1165: [(set (match_operand:SI 0 "general_operand" "=*d,a")
1166: (sign_extend:SI
1167: (match_operand:HI 1 "general_operand" "0,g")))]
1168: @dots{})
1169: @end smallexample
1170: @end ifset
1171: @end table
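
@ifset INTERNALS
The following sketches, written for a hypothetical machine, illustrate
@samp{&} and @samp{+}. The patterns and their elided output templates
are assumptions made for illustration only.

@smallexample
;; Doubleword add emitted as two instructions, low word first.
;; The low word of operand 0 is written while the high words of
;; operands 1 and 2 are still needed, so operand 0 is marked
;; with & and will not overlap either input.
(define_insn ""
  [(set (match_operand:DI 0 "register_operand" "=&r")
        (plus:DI (match_operand:DI 1 "register_operand" "r")
                 (match_operand:DI 2 "register_operand" "r")))]
  ""
  "@dots{}")

;; Store into the low byte of operand 0.  The rest of the register
;; is preserved, so the operand is both read and written.
(define_insn ""
  [(set (strict_low_part (match_operand:QI 0 "register_operand" "+r"))
        (match_operand:QI 1 "general_operand" "g"))]
  ""
  "@dots{}")
@end smallexample
@end ifset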
1172:
1173: @node Machine Constraints
1174: @subsection Constraints for Particular Machines
1175: @cindex machine specific constraints
1176: @cindex constraints, machine specific
1177:
1178: Whenever possible, you should use the general-purpose constraint letters
1179: in @code{asm} arguments, since they will convey meaning more readily to
1180: people reading your code. Failing that, use the constraint letters
1181: that usually have very similar meanings across architectures. The most
1182: commonly used constraints are @samp{m} and @samp{r} (for memory and
1183: general-purpose registers respectively; @pxref{Simple Constraints}), and
1184: @samp{I}, usually the letter indicating the most common
1185: immediate-constant format.
1186:
1187: For each machine architecture, the @file{config/@var{machine}.h} file
1188: defines additional constraints. These constraints are used by the
1189: compiler itself for instruction generation, as well as for @code{asm}
1190: statements; therefore, some of the constraints are not particularly
1191: interesting for @code{asm}. The constraints are defined through these
1192: macros:
1193:
1194: @table @code
1195: @item REG_CLASS_FROM_LETTER
1196: Register class constraints (usually lower case).
1197:
1198: @item CONST_OK_FOR_LETTER_P
1199: Immediate constant constraints, for non-floating point constants of
1200: word size or smaller precision (usually upper case).
1201:
1202: @item CONST_DOUBLE_OK_FOR_LETTER_P
1203: Immediate constant constraints, for all floating point constants and for
1204: constants of greater than word size precision (usually upper case).
1205:
1206: @item EXTRA_CONSTRAINT
1207: Special cases of registers or memory. This macro is not required, and
1208: is only defined for some machines.
1209: @end table
1210:
1211: Inspecting these macro definitions in the compiler source for your
1212: machine is the best way to be certain you have the right constraints.
1213: However, here is a summary of the machine-dependent constraints
1214: available on some particular machines.
1215:
1216: @table @emph
1217: @item AMD 29000 family---@file{a29k.h}
1218: @table @code
1219: @item l
1220: Local register 0
1221:
1222: @item b
1223: Byte Pointer (@samp{BP}) register
1224:
1225: @item q
1226: @samp{Q} register
1227:
1228: @item h
1229: Special purpose register
1230:
1231: @item A
1232: First accumulator register
1233:
1234: @item a
1235: Other accumulator register
1236:
1237: @item f
1238: Floating point register
1239:
1240: @item I
1241: Constant greater than 0, less than 0x100
1242:
1243: @item J
1244: Constant greater than 0, less than 0x10000
1245:
1246: @item K
1247: Constant whose high 24 bits are on (1)
1248:
1249: @item L
1250: 16 bit constant whose high 8 bits are on (1)
1251:
1252: @item M
1253: 32 bit constant whose high 16 bits are on (1)
1254:
1255: @item N
1256: 32 bit negative constant that fits in 8 bits
1257:
1258: @item O
1259: The constant 0x80000000 or, on the 29050, any 32 bit constant
1260: whose low 16 bits are 0.
1261:
1262: @item P
1263: 16 bit negative constant that fits in 8 bits
1264:
1265: @item G
1266: @itemx H
1267: A floating point constant (in @code{asm} statements, use the machine
1268: independent @samp{E} or @samp{F} instead)
1269: @end table
1270:
1271: @item IBM RS6000---@file{rs6000.h}
1272: @table @code
1273: @item b
1274: Address base register
1275:
1276: @item f
1277: Floating point register
1278:
1279: @item h
1280: @samp{MQ}, @samp{CTR}, or @samp{LINK} register
1281:
1282: @item q
1283: @samp{MQ} register
1284:
1285: @item c
1286: @samp{CTR} register
1287:
1288: @item l
1289: @samp{LINK} register
1290:
1291: @item x
1292: @samp{CR} register (condition register) number 0
1293:
1294: @item y
1295: @samp{CR} register (condition register)
1296:
1297: @item I
1298: Signed 16 bit constant
1299:
1300: @item J
1301: Constant whose low 16 bits are 0
1302:
1303: @item K
1304: Constant whose high 16 bits are 0
1305:
1306: @item L
1307: Constant suitable as a mask operand
1308:
1309: @item M
1310: Constant larger than 31
1311:
1312: @item N
1313: Exact power of 2
1314:
1315: @item O
1316: Zero
1317:
1318: @item P
1319: Constant whose negation is a signed 16 bit constant
1320:
1321: @item G
1322: Floating point constant that can be loaded into a register with one
1323: instruction per word
1324:
1325: @item Q
1326: Memory operand that is an offset from a register (@samp{m} is preferable
1327: for @code{asm} statements)
1328: @end table
1329:
1330: @item Intel 386---@file{i386.h}
1331: @table @code
1332: @item q
1333: @samp{a}, @samp{b}, @samp{c}, or @samp{d} register
1334:
1335: @item f
1336: Floating point register
1337:
1338: @item t
1339: First (top of stack) floating point register
1340:
1341: @item u
1342: Second floating point register
1343:
1344: @item a
1345: @samp{a} register
1346:
1347: @item b
1348: @samp{b} register
1349:
1350: @item c
1351: @samp{c} register
1352:
1353: @item d
1354: @samp{d} register
1355:
1356: @item D
1357: @samp{di} register
1358:
1359: @item S
1360: @samp{si} register
1361:
1362: @item I
1363: Constant in range 0 to 31 (for 32 bit shifts)
1364:
1365: @item J
1366: Constant in range 0 to 63 (for 64 bit shifts)
1367:
1368: @item K
1369: @samp{0xff}
1370:
1371: @item L
1372: @samp{0xffff}
1373:
1374: @item M
1375: 0, 1, 2, or 3 (shifts for @code{lea} instruction)
1376:
1377: @item G
1378: Standard 80387 floating point constant
1379: @end table
1380:
1381: @item Intel 960---@file{i960.h}
1382: @table @code
1383: @item f
1384: Floating point register (@code{fp0} to @code{fp3})
1385:
1386: @item l
1387: Local register (@code{r0} to @code{r15})
1388:
1389: @item b
1390: Global register (@code{g0} to @code{g15})
1391:
1392: @item d
1393: Any local or global register
1394:
1395: @item I
1396: Integers from 0 to 31
1397:
1398: @item J
1399: 0
1400:
1401: @item K
1402: Integers from -31 to 0
1403:
1404: @item G
1405: Floating point 0
1406:
1407: @item H
1408: Floating point 1
1409: @end table
1410:
1411: @item MIPS---@file{mips.h}
1412: @table @code
1413: @item d
1414: General-purpose integer register
1415:
1416: @item f
1417: Floating-point register (if available)
1418:
1419: @item h
1420: @samp{Hi} register
1421:
1422: @item l
1423: @samp{Lo} register
1424:
1425: @item x
1426: @samp{Hi} or @samp{Lo} register
1427:
1428: @item y
1429: General-purpose integer register
1430:
1431: @item z
1432: Floating-point status register
1433:
1434: @item I
1435: Signed 16 bit constant (for arithmetic instructions)
1436:
1437: @item J
1438: Zero
1439:
1440: @item K
1441: Zero-extended 16-bit constant (for logic instructions)
1442:
1443: @item L
1444: Constant with low 16 bits zero (can be loaded with @code{lui})
1445:
1446: @item M
1447: 32 bit constant which requires two instructions to load (a constant
1448: which is not @samp{I}, @samp{K}, or @samp{L})
1449:
1450: @item N
1451: Negative 16 bit constant
1452:
1453: @item O
1454: Exact power of two
1455:
1456: @item P
1457: Positive 16 bit constant
1458:
1459: @item G
1460: Floating point zero
1461:
1462: @item Q
1463: Memory reference that can be loaded with more than one instruction
1464: (@samp{m} is preferable for @code{asm} statements)
1465:
1466: @item R
1467: Memory reference that can be loaded with one instruction
1468: (@samp{m} is preferable for @code{asm} statements)
1469:
1470: @item S
1471: Memory reference in external OSF/rose PIC format
1472: (@samp{m} is preferable for @code{asm} statements)
1473: @end table
1474:
1475: @item Motorola 680x0---@file{m68k.h}
1476: @table @code
1477: @item a
1478: Address register
1479:
1480: @item d
1481: Data register
1482:
1483: @item f
1484: 68881 floating-point register, if available
1485:
1486: @item x
1487: Sun FPA (floating-point) register, if available
1488:
1489: @item y
1490: First 16 Sun FPA registers, if available
1491:
1492: @item I
1493: Integer in the range 1 to 8
1494:
1495: @item J
1496: 16 bit signed number
1497:
1498: @item K
1499: Signed number whose magnitude is greater than 0x80
1500:
1501: @item L
1502: Integer in the range -8 to -1
1503:
1504: @item G
1505: Floating point constant that is not a 68881 constant
1506:
1507: @item H
1508: Floating point constant that can be used by Sun FPA
1509: @end table
1510:
1511: @need 1000
1512: @item SPARC---@file{sparc.h}
1513: @table @code
1514: @item f
1515: Floating-point register
1516:
1517: @item I
1518: Signed 13 bit constant
1519:
1520: @item J
1521: Zero
1522:
1523: @item K
1524: 32 bit constant with the low 12 bits clear (a constant that can be
1525: loaded with the @code{sethi} instruction)
1526:
1527: @item G
1528: Floating-point zero
1529:
1530: @item H
1531: Signed 13 bit constant, sign-extended to 32 or 64 bits
1532:
1533: @item Q
1534: Memory reference that can be loaded with one instruction (@samp{m} is
1535: more appropriate for @code{asm} statements)
1536:
1537: @item S
1538: Constant, or memory address
1539:
1540: @item T
1541: Memory address aligned to an 8-byte boundary
1542:
1543: @item U
1544: Even register
1545: @end table
1546: @end table
1547:
1548: @ifset INTERNALS
1549: @node No Constraints
1550: @subsection Not Using Constraints
1551: @cindex no constraints
1552: @cindex not using constraints
1553:
1554: Some machines are so clean that operand constraints are not required. For
1555: example, on the Vax, an operand valid in one context is valid in any other
1556: context. On such a machine, every operand constraint would be @samp{g},
1557: excepting only operands of ``load address'' instructions which are
1558: written as if they referred to a memory location's contents but actually
1559: refer to its address. They would have constraint @samp{p}.
1560:
1561: @cindex empty constraints
1562: For such machines, instead of writing @samp{g} and @samp{p} for all
1563: the constraints, you can choose to write a description with empty constraints.
1564: Then you write @samp{""} for the constraint in every @code{match_operand}.
1565: Address operands are identified by writing an @code{address} expression
1566: around the @code{match_operand}, not by their constraints.
1567:
1568: When the machine description has just empty constraints, certain parts
1569: of compilation are skipped, making the compiler faster. However,
1570: few machines can actually do without constraints; all machine descriptions
1571: now in existence use constraints.
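
For example, a three-operand add on such a machine might be sketched as
follows. The output template resembles Vax assembler syntax and is given
only for illustration; it is not copied from an existing machine
description.

@smallexample
(define_insn "addsi3"
  [(set (match_operand:SI 0 "general_operand" "")
        (plus:SI (match_operand:SI 1 "general_operand" "")
                 (match_operand:SI 2 "general_operand" "")))]
  ""
  "addl3 %1,%2,%0")
@end smallexample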
1572: @end ifset
1573:
1574: @ifset INTERNALS
1575: @node Standard Names
1576: @section Standard Pattern Names For Generation
1577: @cindex standard pattern names
1578: @cindex pattern names
1579: @cindex names, pattern
1580:
1581: Here is a table of the instruction names that are meaningful in the RTL
1582: generation pass of the compiler. Giving one of these names to an
1583: instruction pattern tells the RTL generation pass that it can use the
1584: pattern to accomplish a certain task.
1585:
1586: @table @asis
1587: @cindex @code{mov@var{m}} instruction pattern
1588: @item @samp{mov@var{m}}
1589: Here @var{m} stands for a two-letter machine mode name, in lower case.
1590: This instruction pattern moves data with that machine mode from operand
1591: 1 to operand 0. For example, @samp{movsi} moves full-word data.
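
A minimal sketch of such a pattern, assuming a machine with a single
general-purpose move instruction; the mnemonic @samp{movl} and the
@samp{g} constraints are assumptions chosen for illustration:

@smallexample
;; Operand 1 is the source, operand 0 the destination.
(define_insn "movsi"
  [(set (match_operand:SI 0 "general_operand" "=g")
        (match_operand:SI 1 "general_operand" "g"))]
  ""
  "movl %1,%0")
@end smallexample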
1592:
1593: If operand 0 is a @code{subreg} with mode @var{m} of a register whose
1594: own mode is wider than @var{m}, the effect of this instruction is
1595: to store the specified value in the part of the register that corresponds
1596: to mode @var{m}. The effect on the rest of the register is undefined.
1597:
1598: This class of patterns is special in several ways. First of all, each
1599: of these names @emph{must} be defined, because there is no other way
1600: to copy a datum from one place to another.
1601:
1602: Second, these patterns are not used solely in the RTL generation pass.
1603: Even the reload pass can generate move insns to copy values from stack
1604: slots into temporary registers. When it does so, one of the operands is
1605: a hard register and the other is an operand that may need to be reloaded
1606: into a register.
1607:
1608: @findex force_reg
1609: Therefore, when given such a pair of operands, the pattern must generate
1610: RTL which needs no reloading and needs no temporary registers---no
1611: registers other than the operands. For example, if you support the
1612: pattern with a @code{define_expand}, then in such a case the
1613: @code{define_expand} mustn't call @code{force_reg} or any other such
1614: function which might generate new pseudo registers.
1615:
1616: This requirement exists even for subword modes on a RISC machine where
1617: fetching those modes from memory normally requires several insns and
1618: some temporary registers. Look in @file{spur.md} to see how the
1619: requirement can be satisfied.
1620:
1621: @findex change_address
1622: During reload a memory reference with an invalid address may be passed
1623: as an operand. Such an address will be replaced with a valid address
1624: later in the reload pass. In this case, nothing may be done with the
1625: address except to use it as it stands. If it is copied, it will not be
1626: replaced with a valid address. No attempt should be made to make such
1627: an address into a valid address and no routine (such as
1628: @code{change_address}) that will do so may be called. Note that
1629: @code{general_operand} will fail when applied to such an address.
1630:
1631: @findex reload_in_progress
1632: The global variable @code{reload_in_progress} (which must be explicitly
1633: declared if required) can be used to determine whether such special
1634: handling is required.
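
The restrictions above might be honored along the following lines.  This
is only a sketch for a hypothetical load/store machine: it assumes that
@code{reload_in_progress} and @code{reload_completed} have been declared
where the expander code can see them, and that some @code{define_insn}
recognizes the resulting @code{set}.

@smallexample
(define_expand "movsi"
  [(set (match_operand:SI 0 "general_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  ""
  "
@{
  /* Before reload, a memory-to-memory move may go through a new
     pseudo register.  During and after reload, create no new
     registers and leave the operands exactly as they stand.  */
  if (! reload_in_progress && ! reload_completed
      && GET_CODE (operands[0]) == MEM
      && GET_CODE (operands[1]) == MEM)
    operands[1] = force_reg (SImode, operands[1]);
@}")
@end smallexample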
1635:
1636: The variety of operands that have reloads depends on the rest of the
1637: machine description, but typically on a RISC machine these can only be
1638: pseudo registers that did not get hard registers, while on other
1639: machines explicit memory references will get optional reloads.
1640:
1641: If a scratch register is required to move an object to or from memory,
1642: it can be allocated using @code{gen_reg_rtx} prior to reload. But this
1643: is impossible during and after reload. If there are cases needing
1644: scratch registers after reload, you must define
1645: @code{SECONDARY_INPUT_RELOAD_CLASS} and perhaps also
1646: @code{SECONDARY_OUTPUT_RELOAD_CLASS} to detect them, and provide
1647: patterns @samp{reload_in@var{m}} or @samp{reload_out@var{m}} to handle
1648: them. @xref{Register Classes}.
1649:
1650: The constraints on a @samp{mov@var{m}} must permit moving any hard
1651: register to any other hard register provided that
1652: @code{HARD_REGNO_MODE_OK} permits mode @var{m} in both registers and
1653: @code{REGISTER_MOVE_COST} applied to their classes returns a value of 2.
1654:
1655: It is obligatory to support floating point @samp{mov@var{m}}
1656: instructions into and out of any registers that can hold fixed point
1657: values, because unions and structures (which have modes @code{SImode} or
1658: @code{DImode}) can be in those registers and they may have floating
1659: point members.
1660:
1661: There may also be a need to support fixed point @samp{mov@var{m}}
1662: instructions in and out of floating point registers. Unfortunately, I
1663: have forgotten why this was so, and I don't know whether it is still
1664: true. If @code{HARD_REGNO_MODE_OK} rejects fixed point values in
1665: floating point registers, then the constraints of the fixed point
1666: @samp{mov@var{m}} instructions must be designed to avoid ever trying to
1667: reload into a floating point register.
1668:
1669: @cindex @code{reload_in} instruction pattern
1670: @cindex @code{reload_out} instruction pattern
1671: @item @samp{reload_in@var{m}}
1672: @itemx @samp{reload_out@var{m}}
1673: Like @samp{mov@var{m}}, but used when a scratch register is required to
1674: move between operand 0 and operand 1. Operand 2 describes the scratch
1675: register. See the discussion of the @code{SECONDARY_RELOAD_CLASS}
1676: macro (@pxref{Register Classes}).
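
The general shape of such a pattern might be as follows.  This is a
sketch only: the register classes @samp{f} and @samp{r} and the two-step
move are assumptions about a hypothetical machine whose floating point
registers cannot be loaded directly from memory.

@smallexample
(define_expand "reload_insi"
  [(set (match_operand:SI 0 "register_operand" "=f")
        (match_operand:SI 1 "general_operand" "m"))
   ;; Operand 2 is the scratch register supplied by reload.
   (clobber (match_operand:SI 2 "register_operand" "=&r"))]
  ""
  "
@{
  /* Hypothetical sequence: go through the scratch register.  */
  emit_move_insn (operands[2], operands[1]);
  emit_move_insn (operands[0], operands[2]);
  DONE;
@}")
@end smallexample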
1677:
1678: @cindex @code{movstrict@var{m}} instruction pattern
1679: @item @samp{movstrict@var{m}}
1680: Like @samp{mov@var{m}} except that if operand 0 is a @code{subreg}
1681: with mode @var{m} of a register whose natural mode is wider,
1682: the @samp{movstrict@var{m}} instruction is guaranteed not to alter
1683: any of the register except the part which belongs to mode @var{m}.
1684:
1685: @cindex @code{load_multiple} instruction pattern
1686: @item @code{load_multiple}
1687: Load several consecutive memory locations into consecutive registers.
1688: Operand 0 is the first of the consecutive registers, operand 1
1689: is the first memory location, and operand 2 is a constant: the
1690: number of consecutive registers.
1691:
1692: Define this only if the target machine really has such an instruction;
1693: do not define this if the most efficient way of loading consecutive
1694: registers from memory is to do them one at a time.
1695:
1696: On some machines, there are restrictions as to which consecutive
1697: registers can be loaded from memory, such as particular starting or
1698: ending register numbers or only a range of valid counts. For those
1699: machines, use a @code{define_expand} (@pxref{Expander Definitions})
1700: and make the pattern fail if the restrictions are not met.
1701:
1702: Write the generated insn as a @code{parallel} with elements being a
1703: @code{set} of one register from the appropriate memory location (you may
1704: also need @code{use} or @code{clobber} elements). Use a
1705: @code{match_parallel} (@pxref{RTL Template}) to recognize the insn. See
1706: @file{a29k.md} and @file{rs6000.md} for examples of the use of this insn
1707: pattern.
1708:
1709: @cindex @code{store_multiple} instruction pattern
1710: @item @code{store_multiple}
1711: Similar to @samp{load_multiple}, but store several consecutive registers
1712: into consecutive memory locations. Operand 0 is the first of the
1713: consecutive memory locations, operand 1 is the first register, and
1714: operand 2 is a constant: the number of consecutive registers.
1715:
1716: @cindex @code{add@var{m}3} instruction pattern
1717: @item @samp{add@var{m}3}
1718: Add operand 2 and operand 1, storing the result in operand 0. All operands
1719: must have mode @var{m}. This can be used even on two-address machines, by
1720: means of constraints requiring operands 1 and 0 to be the same location.
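
For instance, a two-address machine might describe its add instruction
roughly as follows.  The matching constraint @samp{0} forces operand 1
into the same location as operand 0; the mnemonic and the other
constraint letters are assumptions for illustration.

@smallexample
(define_insn "addsi3"
  [(set (match_operand:SI 0 "general_operand" "=g")
        ;; "0" means: must be the same location as operand 0.
        (plus:SI (match_operand:SI 1 "general_operand" "0")
                 (match_operand:SI 2 "general_operand" "g")))]
  ""
  "addl %2,%0")
@end smallexample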
1721:
1722: @cindex @code{sub@var{m}3} instruction pattern
1723: @cindex @code{mul@var{m}3} instruction pattern
1724: @cindex @code{div@var{m}3} instruction pattern
1725: @cindex @code{udiv@var{m}3} instruction pattern
1726: @cindex @code{mod@var{m}3} instruction pattern
1727: @cindex @code{umod@var{m}3} instruction pattern
1728: @cindex @code{smin@var{m}3} instruction pattern
1729: @cindex @code{smax@var{m}3} instruction pattern
1730: @cindex @code{umin@var{m}3} instruction pattern
1731: @cindex @code{umax@var{m}3} instruction pattern
1732: @cindex @code{and@var{m}3} instruction pattern
1733: @cindex @code{ior@var{m}3} instruction pattern
1734: @cindex @code{xor@var{m}3} instruction pattern
1735: @item @samp{sub@var{m}3}, @samp{mul@var{m}3}
1736: @itemx @samp{div@var{m}3}, @samp{udiv@var{m}3}, @samp{mod@var{m}3}, @samp{umod@var{m}3}
1737: @itemx @samp{smin@var{m}3}, @samp{smax@var{m}3}, @samp{umin@var{m}3}, @samp{umax@var{m}3}
1738: @itemx @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3}
1739: Similar, for other arithmetic operations.
1740:
1741: @cindex @code{mulhisi3} instruction pattern
1742: @item @samp{mulhisi3}
1743: Multiply operands 1 and 2, which have mode @code{HImode}, and store
1744: a @code{SImode} product in operand 0.
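
One plausible way to write the RTL is to sign-extend both
@code{HImode} operands inside an @code{SImode} multiplication.  The
constraints and the mnemonic below are assumptions for a hypothetical
machine:

@smallexample
(define_insn "mulhisi3"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (mult:SI
          (sign_extend:SI (match_operand:HI 1 "register_operand" "%0"))
          (sign_extend:SI (match_operand:HI 2 "general_operand" "dm"))))]
  ""
  "muls %2,%0")
@end smallexample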
1745:
1746: @cindex @code{mulqihi3} instruction pattern
1747: @cindex @code{mulsidi3} instruction pattern
1748: @item @samp{mulqihi3}, @samp{mulsidi3}
1749: Similar widening-multiplication instructions of other widths.
1750:
1751: @cindex @code{umulqihi3} instruction pattern
1752: @cindex @code{umulhisi3} instruction pattern
1753: @cindex @code{umulsidi3} instruction pattern
1754: @item @samp{umulqihi3}, @samp{umulhisi3}, @samp{umulsidi3}
1755: Similar widening-multiplication instructions that do unsigned
1756: multiplication.
1757:
1758: @cindex @code{divmod@var{m}4} instruction pattern
1759: @item @samp{divmod@var{m}4}
1760: Signed division that produces both a quotient and a remainder.
1761: Operand 1 is divided by operand 2 to produce a quotient stored
1762: in operand 0 and a remainder stored in operand 3.
1763:
1764: For machines with an instruction that produces both a quotient and a
1765: remainder, provide a pattern for @samp{divmod@var{m}4} but do not
1766: provide patterns for @samp{div@var{m}3} and @samp{mod@var{m}3}. This
1767: allows optimization in the relatively common case when both the quotient
1768: and remainder are computed.
1769:
1770: If an instruction that just produces a quotient or just a remainder
1771: exists and is more efficient than the instruction that produces both,
1772: write the output routine of @samp{divmod@var{m}4} to call
1773: @code{find_reg_note} and look for a @code{REG_UNUSED} note on the
1774: quotient or remainder and generate the appropriate instruction.
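
A sketch of such a pattern, loosely modeled on 68020-style divide
instructions; the mnemonics and constraints are assumptions and may
differ on a real target:

@smallexample
(define_insn "divmodsi4"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (div:SI (match_operand:SI 1 "register_operand" "0")
                (match_operand:SI 2 "general_operand" "dm")))
   (set (match_operand:SI 3 "register_operand" "=d")
        (mod:SI (match_dup 1) (match_dup 2)))]
  ""
  "*
@{
  /* If the remainder is never used, emit the cheaper
     quotient-only instruction.  */
  if (find_reg_note (insn, REG_UNUSED, operands[3]))
    return \"divs.l %2,%0\";
  else
    return \"divsl.l %2,%3:%0\";
@}")
@end smallexample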
1775:
1776: @cindex @code{udivmod@var{m}4} instruction pattern
1777: @item @samp{udivmod@var{m}4}
1778: Similar, but does unsigned division.
1779:
1780: @cindex @code{ashl@var{m}3} instruction pattern
1781: @item @samp{ashl@var{m}3}
1782: Arithmetic-shift operand 1 left by a number of bits specified by operand
1783: 2, and store the result in operand 0. Here @var{m} is the mode of
1784: operand 0 and operand 1; operand 2's mode is specified by the
1785: instruction pattern, and the compiler will convert the operand to that
1786: mode before generating the instruction.
1787:
1788: @cindex @code{ashr@var{m}3} instruction pattern
1789: @cindex @code{lshl@var{m}3} instruction pattern
1790: @cindex @code{lshr@var{m}3} instruction pattern
1791: @cindex @code{rotl@var{m}3} instruction pattern
1792: @cindex @code{rotr@var{m}3} instruction pattern
1793: @item @samp{ashr@var{m}3}, @samp{lshl@var{m}3}, @samp{lshr@var{m}3}, @samp{rotl@var{m}3}, @samp{rotr@var{m}3}
1794: Other shift and rotate instructions, analogous to the
1795: @code{ashl@var{m}3} instructions.
1796:
1797: Logical and arithmetic left shift are the same. Machines that do not
1798: allow negative shift counts often have only one instruction for
1799: shifting left. On such machines, you should define a pattern named
1800: @samp{ashl@var{m}3} and leave @samp{lshl@var{m}3} undefined.
1801:
1802: @cindex @code{neg@var{m}2} instruction pattern
1803: @item @samp{neg@var{m}2}
1804: Negate operand 1 and store the result in operand 0.
1805:
1806: @cindex @code{abs@var{m}2} instruction pattern
1807: @item @samp{abs@var{m}2}
1808: Store the absolute value of operand 1 into operand 0.
1809:
1810: @cindex @code{sqrt@var{m}2} instruction pattern
1811: @item @samp{sqrt@var{m}2}
1812: Store the square root of operand 1 into operand 0.
1813:
1814: The @code{sqrt} built-in function of C always uses the mode which
1815: corresponds to the C data type @code{double}.
1816:
1817: @cindex @code{ffs@var{m}2} instruction pattern
1818: @item @samp{ffs@var{m}2}
1819: Store into operand 0 one plus the index of the least significant 1-bit
1820: of operand 1. If operand 1 is zero, store zero. @var{m} is the mode
1821: of operand 0; operand 1's mode is specified by the instruction
1822: pattern, and the compiler will convert the operand to that mode before
1823: generating the instruction.
1824:
1825: The @code{ffs} built-in function of C always uses the mode which
1826: corresponds to the C data type @code{int}.
1827:
1828: @cindex @code{one_cmpl@var{m}2} instruction pattern
1829: @item @samp{one_cmpl@var{m}2}
1830: Store the bitwise-complement of operand 1 into operand 0.
1831:
1832: @cindex @code{cmp@var{m}} instruction pattern
1833: @item @samp{cmp@var{m}}
1834: Compare operand 0 and operand 1, and set the condition codes.
1835: The RTL pattern should look like this:
1836:
1837: @smallexample
1838: (set (cc0) (compare (match_operand:@var{m} 0 @dots{})
1839: (match_operand:@var{m} 1 @dots{})))
1840: @end smallexample
1841:
1842: @cindex @code{tst@var{m}} instruction pattern
1843: @item @samp{tst@var{m}}
1844: Compare operand 0 against zero, and set the condition codes.
1845: The RTL pattern should look like this:
1846:
1847: @smallexample
1848: (set (cc0) (match_operand:@var{m} 0 @dots{}))
1849: @end smallexample
1850:
1851: @samp{tst@var{m}} patterns should not be defined for machines that do
1852: not use @code{(cc0)}. Doing so would confuse the optimizer since it
1853: would no longer be clear which @code{set} operations were comparisons.
1854: The @samp{cmp@var{m}} patterns should be used instead.
1855:
1856: @cindex @code{movstr@var{m}} instruction pattern
1857: @item @samp{movstr@var{m}}
1858: Block move instruction. The addresses of the destination and source
1859: strings are the first two operands, and both are in mode @code{Pmode}.
1860: The number of bytes to move is the third operand, in mode @var{m}.
1861:
1862: The fourth operand is the known shared alignment of the source and
1863: destination, in the form of a @code{const_int} rtx. Thus, if the
1864: compiler knows that both source and destination are word-aligned,
1865: it may provide the value 4 for this operand.
1866:
1867: These patterns need not give special consideration to the possibility
1868: that the source and destination strings might overlap.
1869:
1870: @cindex @code{cmpstr@var{m}} instruction pattern
1871: @item @samp{cmpstr@var{m}}
1872: Block compare instruction, with five operands. Operand 0 is the output;
1873: it has mode @var{m}. The remaining four operands are like the operands
1874: of @samp{movstr@var{m}}. The two memory blocks specified are compared
1875: byte by byte in lexicographic order. The effect of the instruction is
1876: to store a value in operand 0 whose sign indicates the result of the
1877: comparison.
1878:
1879: @cindex @code{strlen@var{m}} instruction pattern
@item @samp{strlen@var{m}}
1880: Compute the length of a string, with four operands.
1881: Operand 0 is the result (of mode @var{m}), operand 1 is
1882: a @code{mem} referring to the first character of the string,
1883: operand 2 is the character to search for (normally zero),
1884: and operand 3 is a constant describing the known alignment
1885: of the beginning of the string.
1886:
1887: @cindex @code{float@var{mn}2} instruction pattern
1888: @item @samp{float@var{m}@var{n}2}
1889: Convert signed integer operand 1 (valid for fixed point mode @var{m}) to
1890: floating point mode @var{n} and store in operand 0 (which has mode
1891: @var{n}).
1892:
1893: @cindex @code{floatuns@var{mn}2} instruction pattern
1894: @item @samp{floatuns@var{m}@var{n}2}
1895: Convert unsigned integer operand 1 (valid for fixed point mode @var{m})
1896: to floating point mode @var{n} and store in operand 0 (which has mode
1897: @var{n}).
1898:
1899: @cindex @code{fix@var{mn}2} instruction pattern
1900: @item @samp{fix@var{m}@var{n}2}
1901: Convert operand 1 (valid for floating point mode @var{m}) to fixed
1902: point mode @var{n} as a signed number and store in operand 0 (which
1903: has mode @var{n}). This instruction's result is defined only when
1904: the value of operand 1 is an integer.
1905:
1906: @cindex @code{fixuns@var{mn}2} instruction pattern
1907: @item @samp{fixuns@var{m}@var{n}2}
1908: Convert operand 1 (valid for floating point mode @var{m}) to fixed
1909: point mode @var{n} as an unsigned number and store in operand 0 (which
1910: has mode @var{n}). This instruction's result is defined only when the
1911: value of operand 1 is an integer.
1912:
1913: @cindex @code{ftrunc@var{m}2} instruction pattern
1914: @item @samp{ftrunc@var{m}2}
1915: Convert operand 1 (valid for floating point mode @var{m}) to an
1916: integer value, still represented in floating point mode @var{m}, and
1917: store it in operand 0 (valid for floating point mode @var{m}).
1918:
1919: @cindex @code{fix_trunc@var{mn}2} instruction pattern
1920: @item @samp{fix_trunc@var{m}@var{n}2}
1921: Like @samp{fix@var{m}@var{n}2} but works for any floating point value
1922: of mode @var{m} by converting the value to an integer.
1923:
1924: @cindex @code{fixuns_trunc@var{mn}2} instruction pattern
1925: @item @samp{fixuns_trunc@var{m}@var{n}2}
1926: Like @samp{fixuns@var{m}@var{n}2} but works for any floating point
1927: value of mode @var{m} by converting the value to an integer.
1928:
1929: @cindex @code{trunc@var{mn}} instruction pattern
1930: @item @samp{trunc@var{m}@var{n}}
1931: Truncate operand 1 (valid for mode @var{m}) to mode @var{n} and
1932: store in operand 0 (which has mode @var{n}). Both modes must be fixed
1933: point or both floating point.
1934:
1935: @cindex @code{extend@var{mn}} instruction pattern
1936: @item @samp{extend@var{m}@var{n}}
1937: Sign-extend operand 1 (valid for mode @var{m}) to mode @var{n} and
1938: store in operand 0 (which has mode @var{n}). Both modes must be fixed
1939: point or both floating point.
1940:
1941: @cindex @code{zero_extend@var{mn}} instruction pattern
1942: @item @samp{zero_extend@var{m}@var{n}}
1943: Zero-extend operand 1 (valid for mode @var{m}) to mode @var{n} and
1944: store in operand 0 (which has mode @var{n}). Both modes must be fixed
1945: point.
1946:
1947: @cindex @code{extv} instruction pattern
1948: @item @samp{extv}
1949: Extract a bit field from operand 1 (a register or memory operand), where
1950: operand 2 specifies the width in bits and operand 3 the starting bit,
1951: and store it in operand 0. Operand 0 must have mode @code{word_mode}.
1952: Operand 1 may have mode @code{byte_mode} or @code{word_mode}; often
1953: @code{word_mode} is allowed only for registers. Operands 2 and 3 must
1954: be valid for @code{word_mode}.
1955:
1956: The RTL generation pass generates this instruction only with constants
1957: for operands 2 and 3.
1958:
1959: The bit-field value is sign-extended to a full word integer
1960: before it is stored in operand 0.
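
In RTL, this operation is naturally written with @code{sign_extract},
whose operands are the location, the width and the starting bit, in that
order.  The following sketch assumes a 32-bit machine with a Vax-like
bit-field instruction; the constraints and assembler template are
illustrative.

@smallexample
(define_insn "extv"
  [(set (match_operand:SI 0 "general_operand" "=g")
        (sign_extract:SI
          (match_operand:QI 1 "general_operand" "rm")  ; field location
          (match_operand:SI 2 "general_operand" "g")   ; width in bits
          (match_operand:SI 3 "general_operand" "g")))] ; starting bit
  ""
  "extv %3,%2,%1,%0")
@end smallexample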
1961:
1962: @cindex @code{extzv} instruction pattern
1963: @item @samp{extzv}
1964: Like @samp{extv} except that the bit-field value is zero-extended.
1965:
1966: @cindex @code{insv} instruction pattern
1967: @item @samp{insv}
1968: Store operand 3 (which must be valid for @code{word_mode}) into a bit
1969: field in operand 0, where operand 1 specifies the width in bits and
1970: operand 2 the starting bit. Operand 0 may have mode @code{byte_mode} or
1971: @code{word_mode}; often @code{word_mode} is allowed only for registers.
1972: Operands 1 and 2 must be valid for @code{word_mode}.
1973:
1974: The RTL generation pass generates this instruction only with constants
1975: for operands 1 and 2.
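
A sketch of the corresponding RTL, again assuming a 32-bit machine with
a Vax-like instruction: the bit field appears as a @code{zero_extract}
on the left side of the @code{set}.  Constraints and mnemonic are
assumptions.

@smallexample
(define_insn "insv"
  [(set (zero_extract:SI
          (match_operand:QI 0 "general_operand" "+g")  ; field location
          (match_operand:SI 1 "general_operand" "g")   ; width in bits
          (match_operand:SI 2 "general_operand" "g"))  ; starting bit
        (match_operand:SI 3 "general_operand" "g"))]   ; value to store
  ""
  "insv %3,%2,%1,%0")
@end smallexample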
1976:
1977: @cindex @code{s@var{cond}} instruction pattern
1978: @item @samp{s@var{cond}}
1979: Store zero or nonzero in the operand according to the condition codes.
1980: Value stored is nonzero if and only if the condition @var{cond} is true.
1981: @var{cond} is the name of a comparison operation expression code, such
1982: as @code{eq}, @code{lt} or @code{leu}.
1983:
1984: You specify the mode that the operand must have when you write the
1985: @code{match_operand} expression. The compiler automatically sees
1986: which mode you have used and supplies an operand of that mode.
1987:
1988: The value stored for a true condition must have 1 as its low bit, or
1989: else must be negative. Otherwise the instruction is not suitable and
1990: you should omit it from the machine description. You describe to the
1991: compiler exactly which value is stored by defining the macro
1992: @code{STORE_FLAG_VALUE} (@pxref{Misc}). If a description cannot be
1993: found that can be used for all the @samp{s@var{cond}} patterns, you
1994: should omit those operations from the machine description.
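
As an illustration, assume a machine whose set-on-condition instruction
writes all ones into a byte when the condition holds.  Such a machine
would define @code{STORE_FLAG_VALUE} as @minus{}1, and the pattern might
be sketched like this (mnemonic and constraint are assumptions):

@smallexample
;; Stores -1 (all bits set) in operand 0 if the last comparison
;; found equality, and 0 otherwise.
(define_insn "seq"
  [(set (match_operand:QI 0 "general_operand" "=d")
        (eq:QI (cc0) (const_int 0)))]
  ""
  "seq %0")
@end smallexample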
1995:
1996: These operations may fail, but should do so only in relatively
1997: uncommon cases; if they would fail for common cases involving
1998: integer comparisons, it is best to omit these patterns.
1999:
2000: If these operations are omitted, the compiler will usually generate code
2001: that copies the constant one to the target and branches around an
2002: assignment of zero to the target. If this code is more efficient than
2003: the potential instructions used for the @samp{s@var{cond}} pattern
2004: followed by those required to convert the result into a 1 or a zero in
2005: @code{SImode}, you should omit the @samp{s@var{cond}} operations from
2006: the machine description.
2007:
2008: @cindex @code{b@var{cond}} instruction pattern
2009: @item @samp{b@var{cond}}
2010: Conditional branch instruction. Operand 0 is a @code{label_ref} that
2011: refers to the label to jump to. Jump if the condition codes meet
2012: condition @var{cond}.
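
On a machine that uses @code{(cc0)}, a branch-if-equal pattern might be
sketched as follows; the mnemonic is an assumption.

@smallexample
(define_insn "beq"
  [(set (pc)
        (if_then_else (eq (cc0) (const_int 0))
                      (label_ref (match_operand 0 "" ""))
                      (pc)))]
  ""
  "jeq %l0")
@end smallexample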
2013:
2014: Some machines do not follow the model assumed here where a comparison
2015: instruction is followed by a conditional branch instruction. In that
2016: case, the @samp{cmp@var{m}} (and @samp{tst@var{m}}) patterns should
2017: simply store the operands away and generate all the required insns in a
2018: @code{define_expand} (@pxref{Expander Definitions}) for the conditional
2019: branch operations. All calls to expand @samp{b@var{cond}} patterns are
2020: immediately preceded by calls to expand either a @samp{cmp@var{m}}
2021: pattern or a @samp{tst@var{m}} pattern.
2022:
2023: Machines that use a pseudo register for the condition code value, or
2024: where the mode used for the comparison depends on the condition being
2025: tested, should also use the above mechanism. @xref{Jump Patterns}.
2026:
2027: The above discussion also applies to @samp{s@var{cond}} patterns.
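
The ``store the operands away'' technique might be sketched as below.
The variables @code{my_compare_op0} and @code{my_compare_op1} are
hypothetical; they stand for @code{rtx} variables declared in the
machine's C source file.  A separate unnamed @code{define_insn}, not
shown, would have to recognize the combined compare-and-branch insn.

@smallexample
(define_expand "cmpsi"
  [(set (cc0)
        (compare (match_operand:SI 0 "register_operand" "")
                 (match_operand:SI 1 "register_operand" "")))]
  ""
  "
@{
  /* Emit nothing yet; just remember the operands.  */
  my_compare_op0 = operands[0];
  my_compare_op1 = operands[1];
  DONE;
@}")

(define_expand "beq"
  [(set (pc)
        (if_then_else (eq (match_dup 1) (match_dup 2))
                      (label_ref (match_operand 0 "" ""))
                      (pc)))]
  ""
  "
@{
  /* Supply the operands saved by the preceding cmpsi.  */
  operands[1] = my_compare_op0;
  operands[2] = my_compare_op1;
@}")
@end smallexample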
2028:
2029: @cindex @code{call} instruction pattern
2030: @item @samp{call}
2031: Subroutine call instruction returning no value. Operand 0 is the
2032: function to call; operand 1 is the number of bytes of arguments pushed
2033: (in mode @code{SImode}, except it is normally a @code{const_int});
2034: operand 2 is the number of registers used as operands.
2035:
2036: On most machines, operand 2 is not actually stored into the RTL
2037: pattern. It is supplied for the sake of some RISC machines which need
2038: to put this information into the assembler code; they can put it in
2039: the RTL instead of operand 1.
2040:
2041: Operand 0 should be a @code{mem} RTX whose address is the address of the
2042: function. Note, however, that this address can be a @code{symbol_ref}
2043: expression even if it would not be a legitimate memory address on the
2044: target machine. If it is also not a valid argument for a call
2045: instruction, the pattern for this operation should be a
2046: @code{define_expand} (@pxref{Expander Definitions}) that places the
2047: address into a register and uses that register in the call instruction.
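
A minimal sketch, assuming a machine whose call instruction accepts any
valid memory address and does not need operand 2; the mode of the
@code{mem}, the constraints and the mnemonic are assumptions.

@smallexample
(define_insn "call"
  [(call (match_operand:QI 0 "memory_operand" "m")   ; function
         (match_operand:SI 1 "general_operand" "g"))] ; argument bytes
  ""
  "jsr %0")
@end smallexample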
2048:
2049: @cindex @code{call_value} instruction pattern
2050: @item @samp{call_value}
2051: Subroutine call instruction returning a value. Operand 0 is the hard
2052: register in which the value is returned. There are three more
2053: operands, the same as the three operands of the @samp{call}
2054: instruction (but with numbers increased by one).
2055:
2056: Subroutines that return @code{BLKmode} objects use the @samp{call}
2057: insn.
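
The RTL differs from that of @samp{call} only in the enclosing
@code{set}.  In this sketch the constraint @samp{=rf} and the mnemonic
are assumptions for a hypothetical machine that returns values in either
a general or a floating point register.

@smallexample
(define_insn "call_value"
  [(set (match_operand 0 "" "=rf")                     ; return value
        (call (match_operand:QI 1 "memory_operand" "m")
              (match_operand:SI 2 "general_operand" "g")))]
  ""
  "jsr %1")
@end smallexample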
2058:
2059: @cindex @code{call_pop} instruction pattern
2060: @cindex @code{call_value_pop} instruction pattern
2061: @item @samp{call_pop}, @samp{call_value_pop}
2062: Similar to @samp{call} and @samp{call_value}, except used if defined and
2063: if @code{RETURN_POPS_ARGS} is non-zero. They should emit a @code{parallel}
2064: that contains both the function call and a @code{set} to indicate the
2065: adjustment made to the frame pointer.
2066:
2067: For machines where @code{RETURN_POPS_ARGS} can be non-zero, the use of these
2068: patterns increases the number of functions for which the frame pointer
2069: can be eliminated, if desired.
2070:
2071: @cindex @code{untyped_call} instruction pattern
2072: @item @samp{untyped_call}
2073: Subroutine call instruction returning a value of any type. Operand 0 is
2074: the function to call; operand 1 is a memory location where the result of
2075: calling the function is to be stored; operand 2 is a @code{parallel}
2076: expression where each element is a @code{set} expression that indicates
2077: the saving of a function return value into the result block.
2078:
2079: This instruction pattern should be defined to support
2080: @code{__builtin_apply} on machines where special instructions are needed
2081: to call a subroutine with arbitrary arguments or to save the value
2082: returned. This instruction pattern is required on machines that have
2083: multiple registers that can hold a return value (i.e.
2084: @code{FUNCTION_VALUE_REGNO_P} is true for more than one register).
2085:
2086: @cindex @code{return} instruction pattern
2087: @item @samp{return}
2088: Subroutine return instruction. This instruction pattern name should be
2089: defined only if a single instruction can do all the work of returning
2090: from a function.
2091:
2092: Like the @samp{mov@var{m}} patterns, this pattern is also used after the
2093: RTL generation phase. In this case it is to support machines where
2094: multiple instructions are usually needed to return from a function, but
2095: some class of functions only requires one instruction to implement a
2096: return. Normally, the applicable functions are those which do not need
2097: to save any registers or allocate stack space.
2098:
2099: @findex reload_completed
2100: @findex leaf_function_p
2101: For such machines, the condition specified in this pattern should only
2102: be true when @code{reload_completed} is non-zero and the function's
2103: epilogue would only be a single instruction. For machines with register
2104: windows, the routine @code{leaf_function_p} may be used to determine if
2105: a register window push is required.
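
A sketch of the named pattern for such a machine.  The function
@code{my_simple_epilogue_p} is hypothetical; it stands for whatever test
the machine's C source file provides to tell that the epilogue is a
single instruction.

@smallexample
(define_insn "return"
  [(return)]
  ;; Valid only after reload, when the frame layout is known.
  "reload_completed && my_simple_epilogue_p ()"
  "rts")
@end smallexample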
2106:
2107: Machines that have conditional return instructions should define patterns
2108: such as
2109:
2110: @smallexample
2111: (define_insn ""
2112: [(set (pc)
2113: (if_then_else (match_operator
2114: 0 "comparison_operator"
2115: [(cc0) (const_int 0)])
2116: (return)
2117: (pc)))]
2118: "@var{condition}"
2119: "@dots{}")
2120: @end smallexample
2121:
2122: where @var{condition} would normally be the same condition specified on the
2123: named @samp{return} pattern.
2124:
2125: @cindex @code{untyped_return} instruction pattern
2126: @item @samp{untyped_return}
2127: Untyped subroutine return instruction. This instruction pattern should
2128: be defined to support @code{__builtin_return} on machines where special
2129: instructions are needed to return a value of any type.
2130:
2131: Operand 0 is a memory location where the result of calling a function
2132: with @code{__builtin_apply} is stored; operand 1 is a @code{parallel}
2133: expression where each element is a @code{set} expression that indicates
2134: the restoring of a function return value from the result block.
2135:
2136: @cindex @code{nop} instruction pattern
2137: @item @samp{nop}
2138: No-op instruction. This instruction pattern name should always be defined
2139: to output a no-op in assembler code. @code{(const_int 0)} will do as an
2140: RTL pattern.
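
For example, assuming the assembler spells the instruction @samp{nop}:

@smallexample
(define_insn "nop"
  [(const_int 0)]
  ""
  "nop")
@end smallexample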
2141:
2142: @cindex @code{indirect_jump} instruction pattern
2143: @item @samp{indirect_jump}
2144: An instruction to jump to an address which is operand zero.
2145: This pattern name is mandatory on all machines.
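
A sketch for a machine with 32-bit addresses; the mnemonic and the use
of the @samp{p} constraint are assumptions.

@smallexample
(define_insn "indirect_jump"
  [(set (pc) (match_operand:SI 0 "address_operand" "p"))]
  ""
  "jmp %a0")
@end smallexample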
2146:
2147: @cindex @code{casesi} instruction pattern
2148: @item @samp{casesi}
2149: Instruction to jump through a dispatch table, including bounds checking.
2150: This instruction takes five operands:
2151:
2152: @enumerate
2153: @item
2154: The index to dispatch on, which has mode @code{SImode}.
2155:
2156: @item
2157: The lower bound for indices in the table, an integer constant.
2158:
2159: @item
2160: The total range of indices in the table---the largest index
2161: minus the smallest one (both inclusive).
2162:
2163: @item
2164: A label that precedes the table itself.
2165:
2166: @item
2167: A label to jump to if the index has a value outside the bounds.
2168: (If the machine-description macro @code{CASE_DROPS_THROUGH} is defined,
2169: then an out-of-bounds index drops through to the code following
2170: the jump table instead of jumping to this label. In that case,
2171: this label is not actually used by the @samp{casesi} instruction,
2172: but it is always provided as an operand.)
2173: @end enumerate
2174:
2175: The table is an @code{addr_vec} or @code{addr_diff_vec} inside of a
2176: @code{jump_insn}. The number of elements in the table is one plus the
2177: difference between the upper bound and the lower bound.
2178:
2179: @cindex @code{tablejump} instruction pattern
2180: @item @samp{tablejump}
2181: Instruction to jump to a variable address. This is a low-level
2182: capability which can be used to implement a dispatch table when there
2183: is no @samp{casesi} pattern.
2184:
2185: This pattern requires two operands: the address or offset, and a label
2186: which should immediately precede the jump table. If the macro
2187: @code{CASE_VECTOR_PC_RELATIVE} is defined then the first operand is an
2188: offset which counts from the address of the table; otherwise, it is an
2189: absolute address to jump to. In either case, the first operand has
2190: mode @code{Pmode}.
2191:
2192: The @samp{tablejump} insn is always the last insn before the jump
2193: table it uses. Its assembler code normally has no need to use the
2194: second operand, but you should incorporate it in the RTL pattern so
2195: that the jump optimizer will not delete the table as unreachable code.
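
A sketch of such a pattern for a machine with absolute dispatch tables
and 32-bit addresses (register constraint and mnemonic are assumptions).
The @code{use} of the label is what keeps the table alive.

@smallexample
(define_insn "tablejump"
  [(set (pc) (match_operand:SI 0 "register_operand" "r"))
   ;; Operand 1 is not printed, but it must be mentioned here.
   (use (label_ref (match_operand 1 "" "")))]
  ""
  "jmp (%0)")
@end smallexample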
2196:
2197: @cindex @code{save_stack_block} instruction pattern
2198: @cindex @code{save_stack_function} instruction pattern
2199: @cindex @code{save_stack_nonlocal} instruction pattern
2200: @cindex @code{restore_stack_block} instruction pattern
2201: @cindex @code{restore_stack_function} instruction pattern
2202: @cindex @code{restore_stack_nonlocal} instruction pattern
2203: @item @samp{save_stack_block}
2204: @itemx @samp{save_stack_function}
2205: @itemx @samp{save_stack_nonlocal}
2206: @itemx @samp{restore_stack_block}
2207: @itemx @samp{restore_stack_function}
2208: @itemx @samp{restore_stack_nonlocal}
2209: Most machines save and restore the stack pointer by copying it to or
2210: from an object of mode @code{Pmode}. Do not define these patterns on
2211: such machines.
2212:
2213: Some machines require special handling for stack pointer saves and
2214: restores. On those machines, define the patterns corresponding to the
2215: non-standard cases by using a @code{define_expand} (@pxref{Expander
2216: Definitions}) that produces the required insns. The three types of
2217: saves and restores are:
2218:
2219: @enumerate
2220: @item
2221: @samp{save_stack_block} saves the stack pointer at the start of a block
2222: that allocates a variable-sized object, and @samp{restore_stack_block}
2223: restores the stack pointer when the block is exited.
2224:
2225: @item
2226: @samp{save_stack_function} and @samp{restore_stack_function} do a
2227: similar job for the outermost block of a function and are used when the
2228: function allocates variable-sized objects or calls @code{alloca}. Only
2229: the epilogue uses the restored stack pointer, allowing a simpler save or
2230: restore sequence on some machines.
2231:
2232: @item
2233: @samp{save_stack_nonlocal} is used in functions that contain labels
2234: branched to by nested functions. It saves the stack pointer in such a
2235: way that the inner function can use @samp{restore_stack_nonlocal} to
2236: restore the stack pointer. The compiler generates code to restore the
2237: frame and argument pointer registers, but some machines require saving
2238: and restoring additional data such as register window information or
2239: stack backchains. Place insns in these patterns to save and restore any
2240: such required data.
2241: @end enumerate
2242:
2243: When saving the stack pointer, operand 0 is the save area and operand 1
2244: is the stack pointer. The mode used to allocate the save area is the
2245: mode of operand 0. You must specify an integral mode, or
2246: @code{VOIDmode} if no save area is needed for a particular type of save
2247: (either because no save is needed or because a machine-specific save
2248: area can be used).  For restore operations, operand 0 is the stack
2249: pointer and operand 1 is the save area.  If @samp{save_stack_block} is
2250: defined, operand 0 must not be @code{VOIDmode} since these saves can be
2251: arbitrarily nested.
2252:
2253: A save area is a @code{mem} that is at a constant offset from
2254: @code{virtual_stack_vars_rtx} when the stack pointer is saved for use by
2255: nonlocal gotos and a @code{reg} in the other two cases.
2256:
2257: @cindex @code{allocate_stack} instruction pattern
2258: @item @samp{allocate_stack}
2259: Subtract (or add if @code{STACK_GROWS_DOWNWARD} is undefined) operand 0 from
2260: the stack pointer to create space for dynamically allocated data.
2261:
2262: Do not define this pattern if all that must be done is the subtraction.
2263: Some machines require other operations such as stack probes or
2264: maintaining the back chain. Define this pattern to emit those
2265: operations in addition to updating the stack pointer.
2266: @end table
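
The arithmetic behind the @samp{casesi} and @samp{tablejump} descriptions
above can be modeled in plain C.  This is only an illustrative sketch, not
code from GNU CC: the function names are hypothetical, and the
@code{pc_relative} argument merely stands in for whether
@code{CASE_VECTOR_PC_RELATIVE} is defined.

```c
#include <stdint.h>

/* Number of entries in a dispatch table covering LOWER..UPPER inclusive:
   one plus the difference between the bounds.  */
unsigned long
jump_table_length (long lower, long upper)
{
  return (unsigned long) (upper - lower) + 1;
}

/* Model of the address a tablejump transfers to.  When PC_RELATIVE is
   nonzero, OPERAND is an offset counted from the address of the table;
   otherwise OPERAND is already the absolute target address.  */
uintptr_t
tablejump_target (uintptr_t table_address, uintptr_t operand, int pc_relative)
{
  return pc_relative ? table_address + operand : operand;
}
```

For instance, a table for case values 3 through 10 has eight entries.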
2267:
2268: @node Pattern Ordering
2269: @section When the Order of Patterns Matters
2270: @cindex Pattern Ordering
2271: @cindex Ordering of Patterns
2272:
2273: Sometimes an insn can match more than one instruction pattern. Then the
2274: pattern that appears first in the machine description is the one used.
2275: Therefore, more specific patterns (patterns that will match fewer things)
2276: and faster instructions (those that will produce better code when they
2277: do match) should usually go first in the description.
2278:
2279: In some cases the effect of ordering the patterns can be used to hide
2280: a pattern when it is not valid. For example, the 68000 has an
2281: instruction for converting a fullword to floating point and another
2282: for converting a byte to floating point. An instruction converting
2283: an integer to floating point could match either one. We put the
2284: pattern to convert the fullword first to make sure that one will
2285: be used rather than the other. (Otherwise a large integer might
2286: be generated as a single-byte immediate quantity, which would not work.)
2287: Instead of using this pattern ordering it would be possible to make the
2288: pattern for convert-a-byte smart enough to deal properly with any
2289: constant value.
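
The first-match rule can be pictured with a small stand-alone C model.
This is a sketch under simplifying assumptions, not the recognizer GNU CC
actually generates: the pattern names, the predicates and the two tables
are all hypothetical.  The byte pattern is deliberately written too
loosely, like the convert-a-byte pattern described above.

```c
#include <stddef.h>

/* A toy "pattern": a name plus a predicate saying whether it accepts
   a given constant.  */
struct pattern
{
  const char *name;
  int (*matches) (long value);
};

/* Accepts any constant that fits in a 32-bit fullword.  */
int
fits_fullword (long value)
{
  return value >= -2147483647L - 1 && value <= 2147483647L;
}

/* Carelessly written: accepts any constant, even one too big for a byte.  */
int
accepts_anything (long value)
{
  (void) value;
  return 1;
}

/* The same two patterns in both possible orders.  */
const struct pattern fullword_first[] = {
  { "float_fullword", fits_fullword },
  { "float_byte", accepts_anything }
};
const struct pattern byte_first[] = {
  { "float_byte", accepts_anything },
  { "float_fullword", fits_fullword }
};

/* Return the first pattern in TABLE that matches VALUE, or NULL if none
   does.  Later entries are never consulted once one has matched.  */
const struct pattern *
first_match (const struct pattern *table, size_t count, long value)
{
  size_t i;
  for (i = 0; i < count; i++)
    if (table[i].matches (value))
      return &table[i];
  return NULL;
}
```

With @code{fullword_first}, the constant 100000 selects the fullword
pattern; with @code{byte_first}, the same constant selects the byte
pattern, which is the failure that the ordering is meant to prevent.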
2290:
2291: @node Dependent Patterns
2292: @section Interdependence of Patterns
2293: @cindex Dependent Patterns
2294: @cindex Interdependence of Patterns
2295:
2296: Every machine description must have a named pattern for each of the
2297: conditional branch names @samp{b@var{cond}}. The recognition template
2298: must always have the form
2299:
2300: @example
2301: (set (pc)
2302: (if_then_else (@var{cond} (cc0) (const_int 0))
2303: (label_ref (match_operand 0 "" ""))
2304: (pc)))
2305: @end example
2306:
2307: @noindent
2308: In addition, every machine description must have an anonymous pattern
2309: for each of the possible reverse-conditional branches. Their templates
2310: look like
2311:
2312: @example
2313: (set (pc)
2314: (if_then_else (@var{cond} (cc0) (const_int 0))
2315: (pc)
2316: (label_ref (match_operand 0 "" ""))))
2317: @end example
2318:
2319: @noindent
2320: They are necessary because jump optimization can turn direct-conditional
2321: branches into reverse-conditional branches.
2322:
2323: It is often convenient to use the @code{match_operator} construct to
2324: reduce the number of patterns that must be specified for branches. For
2325: example,
2326:
2327: @example
2328: (define_insn ""
2329: [(set (pc)
2330: (if_then_else (match_operator 0 "comparison_operator"
2331: [(cc0) (const_int 0)])
2332: (pc)
2333: (label_ref (match_operand 1 "" ""))))]
2334: "@var{condition}"
2335: "@dots{}")
2336: @end example
2337:
2338: In some cases machines support instructions identical except for the
2339: machine mode of one or more operands. For example, there may be
2340: ``sign-extend halfword'' and ``sign-extend byte'' instructions whose
2341: patterns are
2342:
2343: @example
2344: (set (match_operand:SI 0 @dots{})
2345: (extend:SI (match_operand:HI 1 @dots{})))
2346:
2347: (set (match_operand:SI 0 @dots{})
2348: (extend:SI (match_operand:QI 1 @dots{})))
2349: @end example
2350:
2351: @noindent
2352: Constant integers do not specify a machine mode, so an instruction to
2353: extend a constant value could match either pattern. The pattern it
2354: actually will match is the one that appears first in the file. For correct
2355: results, this must be the one for the widest possible mode (@code{HImode},
2356: here). If the pattern matches the @code{QImode} instruction, the results
2357: will be incorrect if the constant value does not actually fit that mode.
2358:
2359: Such instructions to extend constants are rarely generated because they are
2360: optimized away, but they do occasionally happen in nonoptimized
2361: compilations.
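
The following stand-alone C sketch shows why the narrower pattern gives
wrong results for a constant that does not fit.  It only models the
arithmetic of sign extension; the function name is hypothetical and is
not part of GNU CC.

```c
#include <stdint.h>

/* Sign-extend the low BITS bits of VALUE to 32 bits, the way a
   sign-extend instruction with a BITS-wide source would.  BITS must be
   between 1 and 32.  */
int32_t
sign_extend_low_bits (int32_t value, unsigned bits)
{
  uint64_t mask = (UINT64_C (1) << bits) - 1;
  uint64_t sign = UINT64_C (1) << (bits - 1);
  uint64_t low = (uint64_t) (uint32_t) value & mask;
  return (int32_t) ((int64_t) (low ^ sign) - (int64_t) sign);
}
```

Extending the constant 300 as a 16-bit (@code{HImode}) value yields 300,
but extending it as an 8-bit (@code{QImode}) value yields 44, because
only the low eight bits survive.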
2362:
2363: If a constraint in a pattern allows a constant, the reload pass may
2364: replace a register with a constant permitted by the constraint in some
2365: cases. Similarly for memory references. You must ensure that the
2366: predicate permits all objects allowed by the constraints to prevent the
2367: compiler from crashing.
2368:
2369: Because of this substitution, you should not provide separate patterns
2370: for increment and decrement instructions. Instead, they should be
2371: generated from the same pattern that supports register-register add
2372: insns by examining the operands and generating the appropriate machine
2373: instruction.
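
As a rough illustration of examining the operands in one add pattern,
the C sketch below picks an instruction form from the second operand.
Everything in it is hypothetical (the enumerations, the function name
and the idea that the machine has separate increment and decrement
instructions); it is a stand-alone model, not an output routine from any
real machine description.

```c
/* Operand kinds in this toy model.  */
enum operand_kind { OP_REGISTER, OP_CONSTANT };

/* Instruction forms the single add pattern can produce.  */
enum add_form { FORM_INC, FORM_DEC, FORM_ADD_CONST, FORM_ADD_REG };

/* Choose the machine instruction for "dest = dest + src", where the
   source operand has kind KIND and, if it is a constant, value VALUE.  */
enum add_form
select_add_form (enum operand_kind kind, long value)
{
  if (kind == OP_CONSTANT && value == 1)
    return FORM_INC;
  if (kind == OP_CONSTANT && value == -1)
    return FORM_DEC;
  if (kind == OP_CONSTANT)
    return FORM_ADD_CONST;
  return FORM_ADD_REG;
}
```

Because the choice is made when the insn is output, it still comes out
right if reload has replaced a register operand with the constant 1.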
2374:
2375: @node Jump Patterns
2376: @section Defining Jump Instruction Patterns
2377: @cindex jump instruction patterns
2378: @cindex defining jump instruction patterns
2379:
2380: For most machines, GNU CC assumes that the machine has a condition code.
2381: A comparison insn sets the condition code, recording the results of both
2382: signed and unsigned comparison of the given operands. A separate branch
2383: insn tests the condition code and branches or not according to its value.
2384: The branch insns come in distinct signed and unsigned flavors. Many
2385: common machines, such as the Vax, the 68000 and the 32000, work this
2386: way.
2387:
2388: Some machines have distinct signed and unsigned compare instructions, and
2389: only one set of conditional branch instructions. The easiest way to handle
2390: these machines is to treat them just like the others until the final stage
2391: where assembly code is written. At this time, when outputting code for the
2392: compare instruction, peek ahead at the following branch using
2393: @code{next_cc0_user (insn)}. (The variable @code{insn} refers to the insn
2394: being output, in the output-writing code in an instruction pattern.) If
2395: the RTL says that it is an unsigned branch, output an unsigned compare;
2396: otherwise output a signed compare. When the branch itself is output, you
2397: can treat signed and unsigned branches identically.
2398:
2399: The reason you can do this is that GNU CC always generates a pair of
2400: consecutive RTL insns, possibly separated by @code{note} insns, one to
2401: set the condition code and one to test it, and keeps the pair inviolate
2402: until the end.
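
The peek-ahead technique can be sketched in stand-alone C as follows.
This is a simplified model, not GNU CC source: the insn structure, the
function @code{next_real_user} (which only imitates the role of
@code{next_cc0_user}) and the mnemonics @samp{cmpu} and @samp{cmps} are
all assumptions made for the illustration.

```c
#include <stddef.h>

enum insn_kind
{
  INSN_NOTE, INSN_COMPARE, INSN_BRANCH_SIGNED, INSN_BRANCH_UNSIGNED
};

struct insn
{
  enum insn_kind kind;
  struct insn *next;
};

/* Return the first insn after INSN that is not a note, or NULL.  */
struct insn *
next_real_user (struct insn *insn)
{
  struct insn *p = insn->next;
  while (p != NULL && p->kind == INSN_NOTE)
    p = p->next;
  return p;
}

/* Choose the mnemonic for a compare by peeking at the branch that
   uses its result.  */
const char *
compare_mnemonic (struct insn *compare)
{
  struct insn *user = next_real_user (compare);
  if (user != NULL && user->kind == INSN_BRANCH_UNSIGNED)
    return "cmpu";
  return "cmps";
}
```

The model relies on the same guarantee as the real technique: only
notes may come between the compare and the branch that uses it.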
2403:
2404: To go with this technique, you must define the machine-description macro
2405: @code{NOTICE_UPDATE_CC} to do @code{CC_STATUS_INIT}; in other words, no
2406: compare instruction is superfluous.
2407:
2408: Some machines have compare-and-branch instructions and no condition code.
2409: A similar technique works for them. When it is time to ``output'' a
2410: compare instruction, record its operands in two static variables. When
2411: outputting the branch-on-condition-code instruction that follows, actually
2412: output a compare-and-branch instruction that uses the remembered operands.
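
A minimal stand-alone C sketch of this bookkeeping is shown below.  The
function names, the use of plain register numbers as operands and the
@samp{cb@var{cond}} assembler syntax are hypothetical; a real machine
description would keep @code{rtx} values and call
@code{output_asm_insn} instead.

```c
#include <stdio.h>

/* Operands of the most recent compare, remembered until the branch.  */
static int saved_op0, saved_op1;

/* "Output" a compare: emit nothing, just remember the operands
   (register numbers in this toy model).  */
void
output_compare (int op0, int op1)
{
  saved_op0 = op0;
  saved_op1 = op1;
}

/* Output the branch as one compare-and-branch instruction that uses the
   remembered operands.  Returns the length of the text written to BUF.  */
int
output_branch (char *buf, size_t size, const char *cond, const char *label)
{
  return snprintf (buf, size, "cb%s r%d,r%d,%s",
                   cond, saved_op0, saved_op1, label);
}
```

Calling @code{output_compare (3, 5)} and then @code{output_branch} with
the condition @samp{lt} and the label @samp{L7} writes the single line
@samp{cblt r3,r5,L7}.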
2413:
2414: It also works to define patterns for compare-and-branch instructions.
2415: In optimizing compilation, the pair of compare and branch instructions
2416: will be combined according to these patterns. But this does not happen
2417: if optimization is not requested. So you must use one of the solutions
2418: above in addition to any special patterns you define.
2419:
2420: In many RISC machines, most instructions do not affect the condition
2421: code and there may not even be a separate condition code register. On
2422: these machines, the restriction that the definition and use of the
2423: condition code be adjacent insns is not necessary and can prevent
2424: important optimizations. For example, on the IBM RS/6000, there is a
2425: delay for taken branches unless the condition code register is set three
2426: instructions earlier than the conditional branch. The instruction
2427: scheduler cannot perform this optimization if it is not permitted to
2428: separate the definition and use of the condition code register.
2429:
2430: On these machines, do not use @code{(cc0)}, but instead use a register
2431: to represent the condition code. If there is a specific condition code
2432: register in the machine, use a hard register. If the condition code or
2433: comparison result can be placed in any general register, or if there are
2434: multiple condition registers, use a pseudo register.
2435:
2436: @findex prev_cc0_setter
2437: @findex next_cc0_user
2438: On some machines, the type of branch instruction generated may depend on
2439: the way the condition code was produced; for example, on the 68k and
2440: Sparc, setting the condition code directly from an add or subtract
2441: instruction does not clear the overflow bit the way that a test
2442: instruction does, so a different branch instruction must be used for
2443: some conditional branches. For machines that use @code{(cc0)}, the set
2444: and use of the condition code must be adjacent (separated only by
2445: @code{note} insns) allowing flags in @code{cc_status} to be used.
2446: (@xref{Condition Code}.)  Also, each of the comparison and branch insns
2447: can be found from the other by using the functions @code{prev_cc0_setter}
2448: and @code{next_cc0_user}.
2449:
2450: However, this is not true on machines that do not use @code{(cc0)}. On
2451: those machines, no assumptions can be made about the adjacency of the
2452: compare and branch insns and the above methods cannot be used. Instead,
2453: we use the machine mode of the condition code register to record
2454: different formats of the condition code register.
2455:
2456: Registers used to store the condition code value should have a mode that
2457: is in class @code{MODE_CC}. Normally, it will be @code{CCmode}. If
2458: additional modes are required (as for the add example mentioned above in
2459: the Sparc), define the macro @code{EXTRA_CC_MODES} to list the
2460: additional modes required (@pxref{Condition Code}). Also define
2461: @code{EXTRA_CC_NAMES} to list the names of those modes and
2462: @code{SELECT_CC_MODE} to choose a mode given an operand of a compare.
2463:
2464: If it is known during RTL generation that a different mode will be
2465: required (for example, if the machine has separate compare instructions
2466: for signed and unsigned quantities, like most IBM processors), it can
2467: be specified at that time.
2468:
2469: If the cases that require different modes would be made by instruction
2470: combination, the macro @code{SELECT_CC_MODE} determines which machine
2471: mode should be used for the comparison result. The patterns should be
2472: written using that mode. To support the case of the add on the Sparc
2473: discussed above, we have the pattern
2474:
2475: @smallexample
2476: (define_insn ""
2477: [(set (reg:CC_NOOV 0)
2478: (compare:CC_NOOV
2479: (plus:SI (match_operand:SI 0 "register_operand" "%r")
2480: (match_operand:SI 1 "arith_operand" "rI"))
2481: (const_int 0)))]
2482: ""
2483: "@dots{}")
2484: @end smallexample
2485:
2486: The @code{SELECT_CC_MODE} macro on the Sparc returns @code{CC_NOOVmode}
2487: for comparisons whose argument is a @code{plus}.
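
In outline, the selection behaves like the following stand-alone C
function.  This is a sketch only, not the actual macro definition: the
enumerations and the function name are hypothetical stand-ins for RTL
codes and machine modes.

```c
/* RTL codes and condition-code modes used by this toy model.  */
enum rtx_code { CODE_REG, CODE_PLUS, CODE_AND };
enum cc_mode { MODE_CC_PLAIN, MODE_CC_NOOV };

/* Choose the mode for a comparison whose first operand has code
   OPERAND_CODE.  A plus does not leave the overflow bit in the state a
   test instruction would, so its result is recorded in a separate
   "no overflow" mode.  */
enum cc_mode
select_cc_mode (enum rtx_code operand_code)
{
  return operand_code == CODE_PLUS ? MODE_CC_NOOV : MODE_CC_PLAIN;
}
```

A branch pattern can then test the mode of the condition code register
to decide which branch instruction is safe to use.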
2488:
2489: @node Insn Canonicalizations
2490: @section Canonicalization of Instructions
2491: @cindex canonicalization of instructions
2492: @cindex insn canonicalization
2493:
2494: There are often cases where multiple RTL expressions could represent an
2495: operation performed by a single machine instruction. This situation is
2496: most commonly encountered with logical, branch, and multiply-accumulate
2497: instructions. In such cases, the compiler attempts to convert these
2498: multiple RTL expressions into a single canonical form to reduce the
2499: number of insn patterns required.
2500:
2501: In addition to algebraic simplifications, the following canonicalizations
2502: are performed:
2503:
2504: @itemize @bullet
2505: @item
2506: For commutative and comparison operators, a constant is always made the
2507: second operand. If a machine only supports a constant as the second
2508: operand, only patterns that match a constant in the second operand need
2509: be supplied.
2510:
2511: @cindex @code{neg}, canonicalization of
2512: @cindex @code{not}, canonicalization of
2513: @cindex @code{mult}, canonicalization of
2514: @cindex @code{plus}, canonicalization of
2515: @cindex @code{minus}, canonicalization of
2516: For these operators, if only one operand is a @code{neg}, @code{not},
2517: @code{mult}, @code{plus}, or @code{minus} expression, it will be the
2518: first operand.
2519:
2520: @cindex @code{compare}, canonicalization of
2521: @item
2522: For the @code{compare} operator, a constant is always the second operand
2523: on machines where @code{cc0} is used (@pxref{Jump Patterns}). On other
2524: machines, there are rare cases where the compiler might want to construct
2525: a @code{compare} with a constant as the first operand. However, these
2526: cases are not common enough for it to be worthwhile to provide a pattern
2527: matching a constant as the first operand unless the machine actually has
2528: such an instruction.
2529:
2530: An operand of @code{neg}, @code{not}, @code{mult}, @code{plus}, or
2531: @code{minus} is made the first operand under the same conditions as
2532: above.
2533:
2534: @item
2535: @code{(minus @var{x} (const_int @var{n}))} is converted to
2536: @code{(plus @var{x} (const_int @var{-n}))}.
2537:
2538: @item
2539: Within address computations (i.e., inside @code{mem}), a left shift is
2540: converted into the appropriate multiplication by a power of two.
2541:
2542: @cindex @code{ior}, canonicalization of
2543: @cindex @code{and}, canonicalization of
2544: @cindex De Morgan's law
2545: De Morgan's Law is used to move bitwise negation inside a bitwise
2546: logical-and or logical-or operation. If this results in only one
2547: operand being a @code{not} expression, it will be the first one.
2548:
2549: A machine that has an instruction that performs a bitwise logical-and of one
2550: operand with the bitwise negation of the other should specify the pattern
2551: for that instruction as
2552:
2553: @example
2554: (define_insn ""
2555: [(set (match_operand:@var{m} 0 @dots{})
2556: (and:@var{m} (not:@var{m} (match_operand:@var{m} 1 @dots{}))
2557: (match_operand:@var{m} 2 @dots{})))]
2558: "@dots{}"
2559: "@dots{}")
2560: @end example
2561:
2562: @noindent
2563: Similarly, a pattern for a ``NAND'' instruction should be written
2564:
2565: @example
2566: (define_insn ""
2567: [(set (match_operand:@var{m} 0 @dots{})
2568: (ior:@var{m} (not:@var{m} (match_operand:@var{m} 1 @dots{}))
2569: (not:@var{m} (match_operand:@var{m} 2 @dots{}))))]
2570: "@dots{}"
2571: "@dots{}")
2572: @end example
2573:
2574: In both cases, it is not necessary to include patterns for the many
2575: logically equivalent RTL expressions.
2576:
2577: @cindex @code{xor}, canonicalization of
2578: @item
2579: The only possible RTL expressions involving both bitwise exclusive-or
2580: and bitwise negation are @code{(xor:@var{m} @var{x} @var{y})}
2581: and @code{(not:@var{m} (xor:@var{m} @var{x} @var{y}))}.@refill
2582:
2583: @item
2584: The sum of three items, one of which is a constant, will only appear in
2585: the form
2586:
2587: @example
2588: (plus:@var{m} (plus:@var{m} @var{x} @var{y}) @var{constant})
2589: @end example
2590:
2591: @item
2592: On machines that do not use @code{cc0},
2593: @code{(compare @var{x} (const_int 0))} will be converted to
2594: @var{x}.@refill
2595:
2596: @cindex @code{zero_extract}, canonicalization of
2597: @cindex @code{sign_extract}, canonicalization of
2598: @item
2599: Equality comparisons of a group of bits (usually a single bit) with zero
2600: will be written using @code{zero_extract} rather than the equivalent
2601: @code{and} or @code{sign_extract} operations.
2602:
2603: @end itemize
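
Two of the rules above (a constant goes second in a commutative
operation, and @code{minus} of a constant becomes @code{plus} of its
negation) can be sketched in stand-alone C.  The expression structure
and the function name are hypothetical and far simpler than real RTL;
the sketch handles only the top level of an expression.

```c
#include <stddef.h>

enum expr_code { EXPR_CONST, EXPR_REG, EXPR_PLUS, EXPR_MINUS };

struct expr
{
  enum expr_code code;
  long value;               /* constant value or register number */
  struct expr *op0, *op1;   /* operands of EXPR_PLUS and EXPR_MINUS */
};

/* Apply two canonicalizations to the top level of E, in place:
   (minus x (const_int n)) becomes (plus x (const_int -n)), and a plus
   with exactly one constant operand gets that constant second.  */
void
canonicalize (struct expr *e)
{
  if (e->code == EXPR_MINUS && e->op1->code == EXPR_CONST)
    {
      e->code = EXPR_PLUS;
      e->op1->value = -e->op1->value;
    }
  if (e->code == EXPR_PLUS
      && e->op0->code == EXPR_CONST && e->op1->code != EXPR_CONST)
    {
      struct expr *tmp = e->op0;
      e->op0 = e->op1;
      e->op1 = tmp;
    }
}
```

After canonicalization, a pattern only has to recognize the one
remaining form, which is why the machine description can omit the
equivalent variants.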
2604:
2605: @node Peephole Definitions
2606: @section Machine-Specific Peephole Optimizers
2607: @cindex peephole optimizer definitions
2608: @cindex defining peephole optimizers
2609:
2610: In addition to instruction patterns, the @file{.md} file may contain
2611: definitions of machine-specific peephole optimizations.
2612:
2613: The combiner does not notice certain peephole optimizations when the data
2614: flow in the program does not suggest that it should try them. For example,
2615: sometimes two consecutive insns related in purpose can be combined even
2616: though the second one does not appear to use a register computed in the
2617: first one. A machine-specific peephole optimizer can detect such
2618: opportunities.
2619:
2620: @need 1000
2621: A definition looks like this:
2622:
2623: @smallexample
2624: (define_peephole
2625: [@var{insn-pattern-1}
2626: @var{insn-pattern-2}
2627: @dots{}]
2628: "@var{condition}"
2629: "@var{template}"
2630: "@var{optional insn-attributes}")
2631: @end smallexample
2632:
2633: @noindent
2634: The last string operand may be omitted if you are not using any
2635: machine-specific information in this machine description. If present,
2636: it must obey the same rules as in a @code{define_insn}.
2637:
2638: In this skeleton, @var{insn-pattern-1} and so on are patterns to match
2639: consecutive insns. The optimization applies to a sequence of insns when
2640: @var{insn-pattern-1} matches the first one, @var{insn-pattern-2} matches
2641: the next, and so on.@refill
2642:
2643: Each of the insns matched by a peephole must also match a
2644: @code{define_insn}. Peepholes are checked only at the last stage just
2645: before code generation, and only optionally. Therefore, any insn which
2646: would match a peephole but no @code{define_insn} will cause a crash in code
2647: generation in an unoptimized compilation, or at various optimization
2648: stages.
2649:
2650: The operands of the insns are matched with @code{match_operand},
2651: @code{match_operator}, and @code{match_dup}, as usual. What is not
2652: usual is that the operand numbers apply to all the insn patterns in the
2653: definition. So, you can check for identical operands in two insns by
2654: using @code{match_operand} in one insn and @code{match_dup} in the
2655: other.
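
The shared operand numbering can be pictured with a stand-alone C model
of a two-insn peephole.  This is a sketch under heavy simplification,
not how GNU CC implements matching: the insn structure holds only
register numbers, and the structure and function names are hypothetical.

```c
/* A toy insn: copy register SRC into register DEST.  */
struct move_insn
{
  int dest, src;
};

/* Try to match two consecutive insns against the toy peephole
     [(set (match_operand 0) (match_operand 1))
      (set (match_operand 2) (match_dup 0))]
   Operand numbers are shared by both insn patterns, so the match_dup in
   the second pattern refers to the operand 0 bound by the first.
   On success fill OPERANDS[0..2] and return 1; otherwise return 0.  */
int
match_peephole (const struct move_insn *first,
                const struct move_insn *second, int operands[3])
{
  operands[0] = first->dest;       /* match_operand 0 */
  operands[1] = first->src;        /* match_operand 1 */
  if (second->src != operands[0])  /* match_dup 0 must be identical */
    return 0;
  operands[2] = second->dest;      /* match_operand 2 */
  return 1;
}
```

Here a copy into register 1 followed by a copy out of register 1
matches, while a second insn that reads some other register does not.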
2656:
2657: The operand constraints used in @code{match_operand} patterns do not have
2658: any direct effect on the applicability of the peephole, but they will
2659: be validated afterward, so make sure your constraints are general enough
2660: to apply whenever the peephole matches. If the peephole matches
2661: but the constraints are not satisfied, the compiler will crash.
2662:
2663: It is safe to omit constraints in all the operands of the peephole; or
2664: you can write constraints which serve as a double-check on the criteria
2665: previously tested.
2666:
2667: Once a sequence of insns matches the patterns, the @var{condition} is
2668: checked. This is a C expression which makes the final decision whether to
2669: perform the optimization (we do so if the expression is nonzero). If
2670: @var{condition} is omitted (in other words, the string is empty) then the
2671: optimization is applied to every sequence of insns that matches the
2672: patterns.
2673:
2674: The defined peephole optimizations are applied after register allocation
2675: is complete. Therefore, the peephole definition can check which
2676: operands have ended up in which kinds of registers, just by looking at
2677: the operands.
2678:
2679: @findex prev_nonnote_insn
2680: The way to refer to the operands in @var{condition} is to write
2681: @code{operands[@var{i}]} for operand number @var{i} (as matched by
2682: @code{(match_operand @var{i} @dots{})}). Use the variable @code{insn}
2683: to refer to the last of the insns being matched; use
2684: @code{prev_nonnote_insn} to find the preceding insns.
2685:
2686: @findex dead_or_set_p
2687: When optimizing computations with intermediate results, you can use
2688: @var{condition} to match only when the intermediate results are not used
2689: elsewhere. Use the C expression @code{dead_or_set_p (@var{insn},
2690: @var{op})}, where @var{insn} is the insn in which you expect the value
2691: to be used for the last time (from the value of @code{insn}, together
2692: with use of @code{prev_nonnote_insn}), and @var{op} is the intermediate
2693: value (from @code{operands[@var{i}]}).@refill
2694:
2695: Applying the optimization means replacing the sequence of insns with one
2696: new insn. The @var{template} controls ultimate output of assembler code
2697: for this combined insn. It works exactly like the template of a
2698: @code{define_insn}. Operand numbers in this template are the same ones
2699: used in matching the original sequence of insns.
2700:
2701: The result of a defined peephole optimizer does not need to match any of
2702: the insn patterns in the machine description; it does not even have an
2703: opportunity to match them. The peephole optimizer definition itself serves
2704: as the insn pattern to control how the insn is output.
2705:
2706: Defined peephole optimizers are run as assembler code is being output,
2707: so the insns they produce are never combined or rearranged in any way.
2708:
2709: Here is an example, taken from the 68000 machine description:
2710:
2711: @smallexample
2712: (define_peephole
2713: [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4)))
2714: (set (match_operand:DF 0 "register_operand" "=f")
2715: (match_operand:DF 1 "register_operand" "ad"))]
2716: "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
2717: "*
2718: @{
2719: rtx xoperands[2];
2720: xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1);
2721: #ifdef MOTOROLA
2722: output_asm_insn (\"move.l %1,(sp)\", xoperands);
2723: output_asm_insn (\"move.l %1,-(sp)\", operands);
2724: return \"fmove.d (sp)+,%0\";
2725: #else
2726: output_asm_insn (\"movel %1,sp@@\", xoperands);
2727: output_asm_insn (\"movel %1,sp@@-\", operands);
2728: return \"fmoved sp@@+,%0\";
2729: #endif
2730: @}
2731: ")
2732: @end smallexample
2733:
2734: @need 1000
2735: The effect of this optimization is to change
2736:
2737: @smallexample
2738: @group
2739: jbsr _foobar
2740: addql #4,sp
2741: movel d1,sp@@-
2742: movel d0,sp@@-
2743: fmoved sp@@+,fp0
2744: @end group
2745: @end smallexample
2746:
2747: @noindent
2748: into
2749:
2750: @smallexample
2751: @group
2752: jbsr _foobar
2753: movel d1,sp@@
2754: movel d0,sp@@-
2755: fmoved sp@@+,fp0
2756: @end group
2757: @end smallexample
2758:
2759: @ignore
2760: @findex CC_REVERSED
2761: If a peephole matches a sequence including one or more jump insns, you must
2762: take account of the flags such as @code{CC_REVERSED} which specify that the
2763: condition codes are represented in an unusual manner. The compiler
2764: automatically alters any ordinary conditional jumps which occur in such
2765: situations, but the compiler cannot alter jumps which have been replaced by
2766: peephole optimizations. So it is up to you to alter the assembler code
2767: that the peephole produces. Supply C code to write the assembler output,
2768: and in this C code check the condition code status flags and change the
2769: assembler code as appropriate.
2770: @end ignore
2771:
2772: @var{insn-pattern-1} and so on look @emph{almost} like the second
2773: operand of @code{define_insn}. There is one important difference: the
2774: second operand of @code{define_insn} consists of one or more RTX's
2775: enclosed in square brackets. Usually, there is only one: then the same
2776: action can be written as an element of a @code{define_peephole}. But
2777: when there are multiple actions in a @code{define_insn}, they are
2778: implicitly enclosed in a @code{parallel}. Then you must explicitly
2779: write the @code{parallel}, and the square brackets within it, in the
2780: @code{define_peephole}. Thus, if an insn pattern looks like this,
2781:
2782: @smallexample
2783: (define_insn "divmodsi4"
2784: [(set (match_operand:SI 0 "general_operand" "=d")
2785: (div:SI (match_operand:SI 1 "general_operand" "0")
2786: (match_operand:SI 2 "general_operand" "dmsK")))
2787: (set (match_operand:SI 3 "general_operand" "=d")
2788: (mod:SI (match_dup 1) (match_dup 2)))]
2789: "TARGET_68020"
2790: "divsl%.l %2,%3:%0")
2791: @end smallexample
2792:
2793: @noindent
2794: then the way to mention this insn in a peephole is as follows:
2795:
2796: @smallexample
2797: (define_peephole
2798: [@dots{}
2799: (parallel
2800: [(set (match_operand:SI 0 "general_operand" "=d")
2801: (div:SI (match_operand:SI 1 "general_operand" "0")
2802: (match_operand:SI 2 "general_operand" "dmsK")))
2803: (set (match_operand:SI 3 "general_operand" "=d")
2804: (mod:SI (match_dup 1) (match_dup 2)))])
2805: @dots{}]
2806: @dots{})
2807: @end smallexample
2808:
2809: @node Expander Definitions
2810: @section Defining RTL Sequences for Code Generation
2811: @cindex expander definitions
2812: @cindex code generation RTL sequences
2813: @cindex defining RTL sequences for code generation
2814:
2815: On some target machines, some standard pattern names for RTL generation
2816: cannot be handled with a single insn, but a sequence of RTL insns can
2817: represent them. For these target machines, you can write a
2818: @code{define_expand} to specify how to generate the sequence of RTL.
2819:
2820: @findex define_expand
2821: A @code{define_expand} is an RTL expression that looks almost like a
2822: @code{define_insn}; but, unlike the latter, a @code{define_expand} is used
2823: only for RTL generation and it can produce more than one RTL insn.
2824:
2825: A @code{define_expand} RTX has four operands:
2826:
2827: @itemize @bullet
2828: @item
2829: The name. Each @code{define_expand} must have a name, since the only
2830: use for it is to refer to it by name.
2831:
2832: @findex define_peephole
2833: @item
2834: The RTL template. This is just like the RTL template for a
2835: @code{define_peephole} in that it is a vector of RTL expressions
2836: each being one insn.
2837:
2838: @item
2839: The condition, a string containing a C expression. This expression is
2840: used to express how the availability of this pattern depends on
2841: subclasses of target machine, selected by command-line options when
2842: GNU CC is run. This is just like the condition of a
2843: @code{define_insn} that has a standard name.
2844:
2845: @item
2846: The preparation statements, a string containing zero or more C
2847: statements which are to be executed before RTL code is generated from
2848: the RTL template.
2849:
2850: Usually these statements prepare temporary registers for use as
2851: internal operands in the RTL template, but they can also generate RTL
2852: insns directly by calling routines such as @code{emit_insn}, etc.
2853: Any such insns precede the ones that come from the RTL template.
2854: @end itemize
2855:
2856: Every RTL insn emitted by a @code{define_expand} must match some
2857: @code{define_insn} in the machine description. Otherwise, the compiler
2858: will crash when trying to generate code for the insn or trying to optimize
2859: it.
2860:
2861: The RTL template, in addition to controlling generation of RTL insns,
2862: also describes the operands that need to be specified when this pattern
2863: is used. In particular, it gives a predicate for each operand.
2864:
2865: A true operand, which needs to be specified in order to generate RTL from
2866: the pattern, should be described with a @code{match_operand} in its first
2867: occurrence in the RTL template. This enters information on the operand's
2868: predicate into the tables that record such things. GNU CC uses the
2869: information to preload the operand into a register if that is required for
2870: valid RTL code. If the operand is referred to more than once, subsequent
2871: references should use @code{match_dup}.
2872:
2873: The RTL template may also refer to internal ``operands'' which are
2874: temporary registers or labels used only within the sequence made by the
2875: @code{define_expand}. Internal operands are substituted into the RTL
2876: template with @code{match_dup}, never with @code{match_operand}. The
2877: values of the internal operands are not passed in as arguments by the
2878: compiler when it requests use of this pattern. Instead, they are computed
2879: within the pattern, in the preparation statements. These statements
2880: compute the values and store them into the appropriate elements of
2881: @code{operands} so that @code{match_dup} can find them.
2882:
2883: There are two special macros defined for use in the preparation statements:
2884: @code{DONE} and @code{FAIL}. Use them with a following semicolon,
2885: as a statement.
2886:
2887: @table @code
2888:
2889: @findex DONE
2890: @item DONE
2891: Use the @code{DONE} macro to end RTL generation for the pattern. The
2892: only RTL insns resulting from the pattern on this occasion will be
2893: those already emitted by explicit calls to @code{emit_insn} within the
2894: preparation statements; the RTL template will not be generated.
2895:
2896: @findex FAIL
2897: @item FAIL
2898: Make the pattern fail on this occasion. When a pattern fails, it means
2899: that the pattern was not truly available. The calling routines in the
2900: compiler will try other strategies for code generation using other patterns.
2901:
2902: Failure is currently supported only for binary (addition, multiplication,
2903: shifting, etc.) and bitfield (@code{extv}, @code{extzv}, and @code{insv})
2904: operations.
2905: @end table
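
As a sketch of how @code{DONE} is typically used, suppose a hypothetical
machine has no memory-to-memory move insn.  Both that restriction and
the choice of the @code{movsi} name are assumptions made for
illustration; the pattern is not taken from an actual port.

@smallexample
(define_expand "movsi"
  [(set (match_operand:SI 0 "general_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  ""
  "
@{
  /* Assumed restriction: no memory-to-memory move insn.  */
  if (GET_CODE (operands[0]) == MEM
      && GET_CODE (operands[1]) == MEM)
    @{
      rtx temp = force_reg (SImode, operands[1]);
      emit_insn (gen_rtx (SET, VOIDmode, operands[0], temp));
      DONE;
    @}
@}")
@end smallexample

@noindent
In the memory-to-memory case, all the insns are emitted explicitly and
@code{DONE} suppresses the RTL template.  In every other case the
preparation statements fall through and the template is generated as
usual.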
2906:
2907: Here is an example, the definition of left-shift for the SPUR chip:
2908:
2909: @smallexample
2910: @group
2911: (define_expand "ashlsi3"
2912: [(set (match_operand:SI 0 "register_operand" "")
2913: (ashift:SI
2914: @end group
2915: @group
2916: (match_operand:SI 1 "register_operand" "")
2917: (match_operand:SI 2 "nonmemory_operand" "")))]
2918: ""
2919: "
2920: @end group
2921: @end smallexample
2922:
2923: @smallexample
2924: @group
2925: @{
2926: if (GET_CODE (operands[2]) != CONST_INT
2927: || (unsigned) INTVAL (operands[2]) > 3)
2928: FAIL;
2929: @}")
2930: @end group
2931: @end smallexample
2932:
2933: @noindent
2934: This example uses @code{define_expand} so that it can generate an RTL insn
2935: for shifting when the shift-count is in the supported range of 0 to 3 but
2936: fail in other cases where machine insns aren't available. When it fails,
2937: the compiler tries another strategy using different patterns (such as a
2938: library call).
2939:
2940: If the compiler were able to handle nontrivial condition-strings in
2941: patterns with names, then it would be possible to use a
2942: @code{define_insn} in that case. Here is another case (zero-extension
2943: on the 68000) which makes more use of the power of @code{define_expand}:
2944:
2945: @smallexample
2946: (define_expand "zero_extendhisi2"
2947: [(set (match_operand:SI 0 "general_operand" "")
2948: (const_int 0))
2949: (set (strict_low_part
2950: (subreg:HI
2951: (match_dup 0)
2952: 0))
2953: (match_operand:HI 1 "general_operand" ""))]
2954: ""
2955: "operands[1] = make_safe_from (operands[1], operands[0]);")
2956: @end smallexample
2957:
2958: @noindent
2959: @findex make_safe_from
2960: Here two RTL insns are generated, one to clear the entire output operand
2961: and the other to copy the input operand into its low half. This sequence
2962: is incorrect if the input operand refers to [the old value of] the output
2963: operand, so the preparation statement makes sure this isn't so. The
2964: function @code{make_safe_from} copies the @code{operands[1]} into a
2965: temporary register if it refers to @code{operands[0]}. It does this
2966: by emitting another RTL insn.
2967:
2968: Finally, a third example shows the use of an internal operand.
2969: Zero-extension on the SPUR chip is done by @code{and}-ing the result
2970: against a halfword mask. But this mask cannot be represented by a
2971: @code{const_int} because the constant value is too large to be legitimate
2972: on this machine. So it must be copied into a register with
2973: @code{force_reg} and then the register used in the @code{and}.
2974:
2975: @smallexample
2976: (define_expand "zero_extendhisi2"
2977: [(set (match_operand:SI 0 "register_operand" "")
2978: (and:SI (subreg:SI
2979: (match_operand:HI 1 "register_operand" "")
2980: 0)
2981: (match_dup 2)))]
2982: ""
2983: "operands[2]
2984: = force_reg (SImode, gen_rtx (CONST_INT,
2985: VOIDmode, 65535)); ")
2986: @end smallexample
2987:
2988: @strong{Note:} If the @code{define_expand} is used to serve a
2989: standard binary or unary arithmetic operation or a bitfield operation,
2990: then the last insn it generates must not be a @code{code_label},
2991: @code{barrier} or @code{note}. It must be an @code{insn},
2992: @code{jump_insn} or @code{call_insn}. If you don't need a real insn
2993: at the end, emit an insn to copy the result of the operation into
2994: itself. Such an insn will generate no code, but it can avoid problems
2995: in the compiler.@refill
2996:
2997: @node Insn Splitting
2998: @section Defining How to Split Instructions
2999: @cindex insn splitting
3000: @cindex instruction splitting
3001: @cindex splitting instructions
3002:
3003: There are two cases where you should specify how to split a pattern into
3004: multiple insns. On machines that have instructions requiring delay
3005: slots (@pxref{Delay Slots}) or that have instructions whose output is
3006: not available for multiple cycles (@pxref{Function Units}), the compiler
3007: phases that optimize these cases need to be able to move insns into
3008: one-instruction delay slots. However, some insns may generate more than one
3009: machine instruction. These insns cannot be placed into a delay slot.
3010:
3011: Often you can rewrite the single insn as a list of individual insns,
3012: each corresponding to one machine instruction. The disadvantage of
3013: doing so is that it will cause the compilation to be slower and require
3014: more space. If the resulting insns are too complex, it may also
3015: suppress some optimizations. The compiler splits the insn if there is a
3016: reason to believe that it might improve instruction or delay slot
3017: scheduling.
3018:
3019: The insn combiner phase also splits putative insns. If three insns are
3020: merged into one insn with a complex expression that cannot be matched by
3021: some @code{define_insn} pattern, the combiner phase attempts to split
3022: the complex pattern into two insns that are recognized. Usually it can
3023: break the complex pattern into two patterns by splitting out some
3024: subexpression. However, in some other cases, such as performing an
3025: addition of a large constant in two insns on a RISC machine, the way to
3026: split the addition into two insns is machine-dependent.
3027:
3028: @cindex define_split
3029: The @code{define_split} definition tells the compiler how to split a
3030: complex insn into several simpler insns. It looks like this:
3031:
3032: @smallexample
3033: (define_split
3034: [@var{insn-pattern}]
3035: "@var{condition}"
3036: [@var{new-insn-pattern-1}
3037: @var{new-insn-pattern-2}
3038: @dots{}]
3039: "@var{preparation statements}")
3040: @end smallexample
3041:
3042: @var{insn-pattern} is a pattern that needs to be split and
3043: @var{condition} is the final condition to be tested, as in a
3044: @code{define_insn}. When an insn matching @var{insn-pattern} and
3045: satisfying @var{condition} is found, it is replaced in the insn list
3046: with the insns given by @var{new-insn-pattern-1},
3047: @var{new-insn-pattern-2}, etc.
3048:
3049: The @var{preparation statements} are similar to those statements that
3050: are specified for @code{define_expand} (@pxref{Expander Definitions})
3051: and are executed before the new RTL is generated to prepare for the
3052: generated code or emit some insns whose pattern is not fixed. Unlike
3053: those in @code{define_expand}, however, these statements must not
3054: generate any new pseudo-registers. Once reload has completed, they also
3055: must not allocate any space in the stack frame.
3056:
3057: Patterns are matched against @var{insn-pattern} in two different
3058: circumstances. If an insn needs to be split for delay slot scheduling
3059: or insn scheduling, the insn is already known to be valid, which means
3060: that it must have been matched by some @code{define_insn} and, if
3061: @code{reload_completed} is non-zero, is known to satisfy the constraints
3062: of that @code{define_insn}. In that case, the new insn patterns must
3063: also be insns that are matched by some @code{define_insn} and, if
3064: @code{reload_completed} is non-zero, must also satisfy the constraints
3065: of those definitions.
3066:
3067: As an example of this usage of @code{define_split}, consider the following
3068: pattern from @file{a29k.md}, which splits a @code{sign_extend} from
3069: @code{HImode} to @code{SImode} into a pair of shift insns:
3070:
3071: @smallexample
3072: (define_split
3073: [(set (match_operand:SI 0 "gen_reg_operand" "")
3074: (sign_extend:SI (match_operand:HI 1 "gen_reg_operand" "")))]
3075: ""
3076: [(set (match_dup 0)
3077: (ashift:SI (match_dup 1)
3078: (const_int 16)))
3079: (set (match_dup 0)
3080: (ashiftrt:SI (match_dup 0)
3081: (const_int 16)))]
3082: "
3083: @{ operands[1] = gen_lowpart (SImode, operands[1]); @}")
3084: @end smallexample
3085:
3086: When the combiner phase tries to split an insn pattern, it is always the
3087: case that the pattern is @emph{not} matched by any @code{define_insn}.
3088: The combiner pass first tries to split a single @code{set} expression
3089: and then the same @code{set} expression inside a @code{parallel}, but
3090: followed by a @code{clobber} of a pseudo-reg to use as a scratch
3091: register. In these cases, the combiner expects exactly two new insn
3092: patterns to be generated. It will verify that these patterns match some
3093: @code{define_insn} definitions, so you need not do this test in the
3094: @code{define_split} (of course, there is no point in writing a
3095: @code{define_split} that will never produce insns that match).
3096:
3097: Here is an example of this use of @code{define_split}, taken from
3098: @file{rs6000.md}:
3099:
3100: @smallexample
3101: (define_split
3102: [(set (match_operand:SI 0 "gen_reg_operand" "")
3103: (plus:SI (match_operand:SI 1 "gen_reg_operand" "")
3104: (match_operand:SI 2 "non_add_cint_operand" "")))]
3105: ""
3106: [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3)))
3107: (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))]
3108: "
3109: @{
3110: int low = INTVAL (operands[2]) & 0xffff;
3111: int high = (unsigned) INTVAL (operands[2]) >> 16;
3112:
3113: if (low & 0x8000)
3114: high++, low |= 0xffff0000;
3115:
3116: operands[3] = gen_rtx (CONST_INT, VOIDmode, high << 16);
3117: operands[4] = gen_rtx (CONST_INT, VOIDmode, low);
3118: @}")
3119: @end smallexample
3120:
3121: Here the predicate @code{non_add_cint_operand} matches any
3122: @code{const_int} that is @emph{not} a valid operand of a single add
3123: insn. The add with the smaller displacement is written so that it
3124: can be substituted into the address of a subsequent operation.
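
To make the arithmetic in the preparation statements concrete, here is
the computation traced by hand for one arbitrarily chosen constant,
@code{0x12348000}.  The trace assumes that @code{int} is 32 bits wide;
the constant itself is illustrative only.

@smallexample
operands[2]              = 0x12348000

low  = operands[2] & 0xffff   = 0x8000
high = operands[2] >> 16      = 0x1234

low & 0x8000 is nonzero, so:
high = 0x1234 + 1             = 0x1235
low  = 0x8000 | 0xffff0000    = 0xffff8000   (that is, -32768)

operands[3] = high << 16      = 0x12350000
operands[4] = low             = -32768

0x12350000 + (-32768)         = 0x12348000
@end smallexample

@noindent
The adjustment of @code{high} compensates for the low half being treated
as a signed 16-bit quantity, so that the two constants still sum to the
original value.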
3125:
3126: An example that uses a scratch register, from the same file, generates
3127: an equality comparison of a register and a large constant:
3128:
3129: @smallexample
3130: (define_split
3131: [(set (match_operand:CC 0 "cc_reg_operand" "")
3132: (compare:CC (match_operand:SI 1 "gen_reg_operand" "")
3133: (match_operand:SI 2 "non_short_cint_operand" "")))
3134: (clobber (match_operand:SI 3 "gen_reg_operand" ""))]
3135: "find_single_use (operands[0], insn, 0)
3136: && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ
3137: || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)"
3138: [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4)))
3139: (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))]
3140: "
3141: @{
3142: /* Get the constant we are comparing against, C, and see what it
3143: looks like sign-extended to 16 bits. Then see what constant
3144: could be XOR'ed with C to get the sign-extended value. */
3145:
3146: int c = INTVAL (operands[2]);
3147: int sextc = (c << 16) >> 16;
3148: int xorv = c ^ sextc;
3149:
3150: operands[4] = gen_rtx (CONST_INT, VOIDmode, xorv);
3151: operands[5] = gen_rtx (CONST_INT, VOIDmode, sextc);
3152: @}")
3153: @end smallexample
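
The same kind of hand trace may help with the preparation statements of
this pattern.  The constant @code{0x1234abcd} is chosen arbitrarily for
illustration, and @code{int} is again assumed to be 32 bits wide.

@smallexample
c     = 0x1234abcd
sextc = 0xffffabcd     (low 16 bits of c, sign-extended)
xorv  = c ^ sextc
      = 0xedcb0000

operands[4] = xorv
operands[5] = sextc

r == c  exactly when  (r ^ xorv) == sextc
@end smallexample

@noindent
Because @code{c} and @code{sextc} agree in their low 16 bits, the low
half of @code{xorv} is always zero, and @code{sextc} always fits in a
signed 16-bit field.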
3154:
3155: To avoid confusion, don't write a single @code{define_split} that
3156: accepts some insns that match some @code{define_insn} as well as some
3157: insns that don't. Instead, write two separate @code{define_split}
3158: definitions, one for the insns that are valid and one for the insns that
3159: are not valid.
3160:
3161: @node Insn Attributes
3162: @section Instruction Attributes
3163: @cindex insn attributes
3164: @cindex instruction attributes
3165:
3166: In addition to describing the instructions supported by the target machine,
3167: the @file{md} file also defines a group of @dfn{attributes} and a set of
3168: values for each. Every generated insn is assigned a value for each attribute.
3169: One possible attribute would be the effect that the insn has on the machine's
3170: condition code. This attribute can then be used by @code{NOTICE_UPDATE_CC}
3171: to track the condition codes.
3172:
3173: @menu
3174: * Defining Attributes:: Specifying attributes and their values.
3175: * Expressions:: Valid expressions for attribute values.
3176: * Tagging Insns:: Assigning attribute values to insns.
3177: * Attr Example:: An example of assigning attributes.
3178: * Insn Lengths:: Computing the length of insns.
3179: * Constant Attributes:: Defining attributes that are constant.
3180: * Delay Slots:: Defining delay slots required for a machine.
3181: * Function Units:: Specifying information for insn scheduling.
3182: @end menu
3183:
3184: @node Defining Attributes
3185: @subsection Defining Attributes and their Values
3186: @cindex defining attributes and their values
3187: @cindex attributes, defining
3188:
3189: @findex define_attr
3190: The @code{define_attr} expression is used to define each attribute required
3191: by the target machine. It looks like:
3192:
3193: @smallexample
3194: (define_attr @var{name} @var{list-of-values} @var{default})
3195: @end smallexample
3196:
3197: @var{name} is a string specifying the name of the attribute being defined.
3198:
3199: @var{list-of-values} is either a string that specifies a comma-separated
3200: list of values that can be assigned to the attribute, or a null string to
3201: indicate that the attribute takes numeric values.
3202:
3203: @var{default} is an attribute expression that gives the value of this
3204: attribute for insns that match patterns whose definition does not include
3205: an explicit value for this attribute. @xref{Attr Example}, for more
3206: information on the handling of defaults. @xref{Constant Attributes},
3207: for information on attributes that do not depend on any particular insn.
3208:
3209: @findex insn-attr.h
3210: For each defined attribute, a number of definitions are written to the
3211: @file{insn-attr.h} file. For cases where an explicit set of values is
3212: specified for an attribute, the following are defined:
3213:
3214: @itemize @bullet
3215: @item
3216: A @samp{#define} is written for the symbol @samp{HAVE_ATTR_@var{name}}.
3217:
3218: @item
3219: An enumeration type is defined for @samp{attr_@var{name}} with
3220: elements of the form @samp{@var{upper-name}_@var{upper-value}} where
3221: the attribute name and value are first converted to upper case.
3222:
3223: @item
3224: A function @samp{get_attr_@var{name}} is defined that is passed an insn and
3225: returns the attribute value for that insn.
3226: @end itemize
3227:
3228: For example, if the following is present in the @file{md} file:
3229:
3230: @smallexample
3231: (define_attr "type" "branch,fp,load,store,arith" @dots{})
3232: @end smallexample
3233:
3234: @noindent
3235: the following lines will be written to the file @file{insn-attr.h}.
3236:
3237: @smallexample
3238: #define HAVE_ATTR_type
3239: enum attr_type @{TYPE_BRANCH, TYPE_FP, TYPE_LOAD,
3240: TYPE_STORE, TYPE_ARITH@};
3241: extern enum attr_type get_attr_type ();
3242: @end smallexample
3243:
3244: If the attribute takes numeric values, no @code{enum} type will be
3245: defined and the function to obtain the attribute's value will return
3246: @code{int}.
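
For instance, a numeric attribute might be declared as follows.  The
name @code{length} and the default value of 4 are only illustrative.

@smallexample
(define_attr "length" "" (const_int 4))
@end smallexample

@noindent
By the rule just stated, one would then expect @file{insn-attr.h} to
declare the access function along these lines, with no accompanying
@code{enum}:

@smallexample
extern int get_attr_length ();
@end smallexample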
3247:
3248: @node Expressions
3249: @subsection Attribute Expressions
3250: @cindex attribute expressions
3251:
3252: RTL expressions used to define attributes use the codes described above
3253: plus a few specific to attribute definitions, to be discussed below.
3254: Attribute value expressions must have one of the following forms:
3255:
3256: @table @code
3257: @cindex @code{const_int} and attributes
3258: @item (const_int @var{i})
3259: The integer @var{i} specifies the value of a numeric attribute. @var{i}
3260: must be non-negative.
3261:
3262: The value of a numeric attribute can be specified either with a
3263: @code{const_int} or as an integer represented as a string in
3264: @code{const_string}, @code{eq_attr} (see below), and @code{set_attr}
3265: (@pxref{Tagging Insns}) expressions.
3266:
3267: @cindex @code{const_string} and attributes
3268: @item (const_string @var{value})
3269: The string @var{value} specifies a constant attribute value.
3270: If @var{value} is specified as @samp{"*"}, it means that the default value of
3271: the attribute is to be used for the insn containing this expression.
3272: @samp{"*"} obviously cannot be used in the @var{default} expression
3273: of a @code{define_attr}.@refill
3274:
3275: If the attribute whose value is being specified is numeric, @var{value}
3276: must be a string containing a non-negative integer (normally
3277: @code{const_int} would be used in this case). Otherwise, it must
3278: contain one of the valid values for the attribute.
3279:
3280: @cindex @code{if_then_else} and attributes
3281: @item (if_then_else @var{test} @var{true-value} @var{false-value})
3282: @var{test} specifies an attribute test, whose format is defined below.
3283: The value of this expression is @var{true-value} if @var{test} is true,
3284: otherwise it is @var{false-value}.
3285:
3286: @cindex @code{cond} and attributes
3287: @item (cond [@var{test1} @var{value1} @dots{}] @var{default})
3288: The first operand of this expression is a vector containing an even
3289: number of expressions and consisting of pairs of @var{test} and @var{value}
3290: expressions. The value of the @code{cond} expression is that of the
3291: @var{value} corresponding to the first true @var{test} expression. If
3292: none of the @var{test} expressions are true, the value of the @code{cond}
3293: expression is that of the @var{default} expression.
3294: @end table
3295:
3296: @var{test} expressions can have one of the following forms:
3297:
3298: @table @code
3299: @cindex @code{const_int} and attribute tests
3300: @item (const_int @var{i})
3301: This test is true if @var{i} is non-zero and false otherwise.
3302:
3303: @cindex @code{not} and attributes
3304: @cindex @code{ior} and attributes
3305: @cindex @code{and} and attributes
3306: @item (not @var{test})
3307: @itemx (ior @var{test1} @var{test2})
3308: @itemx (and @var{test1} @var{test2})
3309: These tests are true if the indicated logical function is true.
3310:
3311: @cindex @code{match_operand} and attributes
3312: @item (match_operand:@var{m} @var{n} @var{pred} @var{constraints})
3313: This test is true if operand @var{n} of the insn whose attribute value
3314: is being determined has mode @var{m} (this part of the test is ignored
3315: if @var{m} is @code{VOIDmode}) and the function specified by the string
3316: @var{pred} returns a non-zero value when passed operand @var{n} and mode
3317: @var{m} (this part of the test is ignored if @var{pred} is the null
3318: string).
3319:
3320: The @var{constraints} operand is ignored and should be the null string.
3321:
3322: @cindex @code{le} and attributes
3323: @cindex @code{leu} and attributes
3324: @cindex @code{lt} and attributes
3325: @cindex @code{gt} and attributes
3326: @cindex @code{gtu} and attributes
3327: @cindex @code{ge} and attributes
3328: @cindex @code{geu} and attributes
3329: @cindex @code{ne} and attributes
3330: @cindex @code{eq} and attributes
3331: @cindex @code{plus} and attributes
3332: @cindex @code{minus} and attributes
3333: @cindex @code{mult} and attributes
3334: @cindex @code{div} and attributes
3335: @cindex @code{mod} and attributes
3336: @cindex @code{abs} and attributes
3337: @cindex @code{neg} and attributes
3338: @cindex @code{lshift} and attributes
3339: @cindex @code{ashift} and attributes
3340: @cindex @code{lshiftrt} and attributes
3341: @cindex @code{ashiftrt} and attributes
3342: @item (le @var{arith1} @var{arith2})
3343: @itemx (leu @var{arith1} @var{arith2})
3344: @itemx (lt @var{arith1} @var{arith2})
3345: @itemx (ltu @var{arith1} @var{arith2})
3346: @itemx (gt @var{arith1} @var{arith2})
3347: @itemx (gtu @var{arith1} @var{arith2})
3348: @itemx (ge @var{arith1} @var{arith2})
3349: @itemx (geu @var{arith1} @var{arith2})
3350: @itemx (ne @var{arith1} @var{arith2})
3351: @itemx (eq @var{arith1} @var{arith2})
3352: These tests are true if the indicated comparison of the two arithmetic
3353: expressions is true. Arithmetic expressions are formed with
3354: @code{plus}, @code{minus}, @code{mult}, @code{div}, @code{mod},
3355: @code{abs}, @code{neg}, @code{and}, @code{ior}, @code{xor}, @code{not},
3356: @code{lshift}, @code{ashift}, @code{lshiftrt}, and @code{ashiftrt}
3357: expressions.@refill
3358:
3359: @findex get_attr
3360: @code{const_int} and @code{symbol_ref} are always valid terms (@pxref{Insn
3361: Lengths}, for additional forms). @code{symbol_ref} is a string
3362: denoting a C expression that yields an @code{int} when evaluated by the
3363: @samp{get_attr_@dots{}} routine. It should normally be a global
3364: variable.@refill
3365:
3366: @findex eq_attr
3367: @item (eq_attr @var{name} @var{value})
3368: @var{name} is a string specifying the name of an attribute.
3369:
3370: @var{value} is a string that is either a valid value for attribute
3371: @var{name}, a comma-separated list of values, or @samp{!} followed by a
3372: value or list. If @var{value} does not begin with a @samp{!}, this
3373: test is true if the value of the @var{name} attribute of the current
3374: insn is in the list specified by @var{value}. If @var{value} begins
3375: with a @samp{!}, this test is true if the attribute's value is
3376: @emph{not} in the specified list.
3377:
3378: For example,
3379:
3380: @smallexample
3381: (eq_attr "type" "load,store")
3382: @end smallexample
3383:
3384: @noindent
3385: is equivalent to
3386:
3387: @smallexample
3388: (ior (eq_attr "type" "load") (eq_attr "type" "store"))
3389: @end smallexample
3390:
3391: If @var{name} specifies an attribute of @samp{alternative}, it refers to the
3392: value of the compiler variable @code{which_alternative}
3393: (@pxref{Output Statement}) and the values must be small integers. For
3394: example,@refill
3395:
3396: @smallexample
3397: (eq_attr "alternative" "2,3")
3398: @end smallexample
3399:
3400: @noindent
3401: is equivalent to
3402:
3403: @smallexample
3404: (ior (eq (symbol_ref "which_alternative") (const_int 2))
3405: (eq (symbol_ref "which_alternative") (const_int 3)))
3406: @end smallexample
3407:
3408: Note that, for most attributes, an @code{eq_attr} test is simplified in cases
3409: where the value of the attribute being tested is known for all insns matching
3410: a particular pattern. This is by far the most common case.@refill
3411:
3412: @findex attr_flag
3413: @item (attr_flag @var{name})
3414: The value of an @code{attr_flag} expression is true if the flag
3415: specified by @var{name} is true for the @code{insn} currently being
3416: scheduled.
3417:
3418: @var{name} is a string specifying one of a fixed set of flags to test.
3419: Test the flags @code{forward} and @code{backward} to determine the
3420: direction of a conditional branch. Test the flags @code{very_likely},
3421: @code{likely}, @code{very_unlikely}, and @code{unlikely} to determine
3422: if a conditional branch is expected to be taken.
3423:
3424: If the @code{very_likely} flag is true, then the @code{likely} flag is also
3425: true. Likewise for the @code{very_unlikely} and @code{unlikely} flags.
3426:
3427: This example describes a conditional branch delay slot which
3428: can be nullified for forward branches that are taken (annul-true) or
3429: for backward branches which are not taken (annul-false).
3430:
3431: @smallexample
3432: (define_delay (eq_attr "type" "cbranch")
3433: [(eq_attr "in_branch_delay" "true")
3434: (and (eq_attr "in_branch_delay" "true")
3435: (attr_flag "forward"))
3436: (and (eq_attr "in_branch_delay" "true")
3437: (attr_flag "backward"))])
3438: @end smallexample
3439:
3440: The @code{forward} and @code{backward} flags are false if the current
3441: @code{insn} being scheduled is not a conditional branch.
3442:
3443: The @code{very_likely} and @code{likely} flags are true if the
3444: @code{insn} being scheduled is not a conditional branch. The
3445: @code{very_unlikely} and @code{unlikely} flags are false if the
3446: @code{insn} being scheduled is not a conditional branch.
3447:
3448: @code{attr_flag} is only used during delay slot scheduling and has no
3449: meaning to other passes of the compiler.
3450: @end table
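
As an illustration of the test forms described above, the following
sketch derives one attribute from another using a negated
@code{eq_attr} list.  It assumes a @code{type} attribute with the values
@samp{branch,fp,load,store,arith}.  The definition of
@code{in_branch_delay} shown here is hypothetical and serves only to
show the syntax.

@smallexample
(define_attr "in_branch_delay" "false,true"
  (if_then_else (eq_attr "type" "!branch,fp")
                (const_string "true")
                (const_string "false")))
@end smallexample

@noindent
Since the value string begins with @samp{!}, the test is true for insns
whose @code{type} is neither @code{branch} nor @code{fp}, so the
attribute takes the value @code{true} for loads, stores and arithmetic
insns.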
3451:
3452: @node Tagging Insns
3453: @subsection Assigning Attribute Values to Insns
3454: @cindex tagging insns
3455: @cindex assigning attribute values to insns
3456:
3457: The value assigned to an attribute of an insn is primarily determined by
3458: which pattern is matched by that insn (or which @code{define_peephole}
3459: generated it). Every @code{define_insn} and @code{define_peephole} can
3460: have an optional last argument to specify the values of attributes for
3461: matching insns. The value of any attribute not specified in a particular
3462: insn is set to the default value for that attribute, as specified in its
3463: @code{define_attr}. Extensive use of default values for attributes
3464: permits the specification of the values for only one or two attributes
3465: in the definition of most insn patterns, as seen in the example in the
3466: next section.@refill
3467:
3468: The optional last argument of @code{define_insn} and
3469: @code{define_peephole} is a vector of expressions, each of which defines
3470: the value for a single attribute. The most general way of assigning an
3471: attribute's value is to use a @code{set} expression whose first operand is an
3472: @code{attr} expression giving the name of the attribute being set. The
3473: second operand of the @code{set} is an attribute expression
3474: (@pxref{Expressions}) giving the value of the attribute.@refill
3475:
3476: When the attribute value depends on the @samp{alternative} attribute
3477: (i.e., which is the applicable alternative in the constraint of the
3478: insn), the @code{set_attr_alternative} expression can be used. It
3479: allows the specification of a vector of attribute expressions, one for
3480: each alternative.
3481:
3482: @findex set_attr
3483: When the generality of arbitrary attribute expressions is not required,
3484: the simpler @code{set_attr} expression can be used, which allows
3485: specifying a string giving either a single attribute value or a list
3486: of attribute values, one for each alternative.
3487:
3488: The form of each of the above specifications is shown below. In each case,
3489: @var{name} is a string specifying the attribute to be set.
3490:
3491: @table @code
3492: @item (set_attr @var{name} @var{value-string})
3493: @var{value-string} is either a string giving the desired attribute value,
3494: or a string containing a comma-separated list giving the values for
3495: succeeding alternatives. The number of elements must match the number
3496: of alternatives in the constraint of the insn pattern.
3497:
3498: Note that it may be useful to specify @samp{*} for some alternative, in
3499: which case the attribute will assume its default value for insns matching
3500: that alternative.
3501:
3502: @findex set_attr_alternative
3503: @item (set_attr_alternative @var{name} [@var{value1} @var{value2} @dots{}])
3504: Depending on the alternative of the insn, the value will be one of the
3505: specified values. This is a shorthand for using a @code{cond} with
3506: tests on the @samp{alternative} attribute.
3507:
3508: @findex attr
3509: @item (set (attr @var{name}) @var{value})
3510: The first operand of this @code{set} must be the special RTL expression
3511: @code{attr}, whose sole operand is a string giving the name of the
3512: attribute being set. @var{value} is the value of the attribute.
3513: @end table
3514:
3515: The following shows three different ways of representing the same
3516: attribute value specification:
3517:
3518: @smallexample
3519: (set_attr "type" "load,store,arith")
3520:
3521: (set_attr_alternative "type"
3522: [(const_string "load") (const_string "store")
3523: (const_string "arith")])
3524:
3525: (set (attr "type")
3526: (cond [(eq_attr "alternative" "0") (const_string "load")
3527: (eq_attr "alternative" "1") (const_string "store")]
3528: (const_string "arith")))
3529: @end smallexample
3530:
3531: @need 1000
3532: @findex define_asm_attributes
3533: The @code{define_asm_attributes} expression provides a mechanism to
3534: specify the attributes assigned to insns produced from an @code{asm}
3535: statement. It has the form:
3536:
3537: @smallexample
3538: (define_asm_attributes [@var{attr-sets}])
3539: @end smallexample
3540:
3541: @noindent
3542: where @var{attr-sets} is specified the same as for both the
3543: @code{define_insn} and the @code{define_peephole} expressions.
3544:
3545: These values will typically be the ``worst case'' attribute values. For
3546: example, they might indicate that the condition code will be clobbered.
3547:
3548: A specification for a @code{length} attribute is handled specially. The
3549: way to compute the length of an @code{asm} insn is to multiply the
3550: length specified in the expression @code{define_asm_attributes} by the
3551: number of machine instructions specified in the @code{asm} statement,
3552: determined by counting the number of semicolons and newlines in the
3553: string. Therefore, the value of the @code{length} attribute specified
3554: in a @code{define_asm_attributes} should be the maximum possible length
3555: of a single machine instruction.
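
A minimal sketch of such a definition follows.  It assumes a machine
whose longest instruction is 4 bytes and a @code{cc} attribute that has
a @code{clobber} value; both are assumptions made for illustration.

@smallexample
(define_asm_attributes
  [(set_attr "length" "4")
   (set_attr "cc" "clobber")])
@end smallexample

@noindent
With these settings, every @code{asm} insn is treated as clobbering the
condition code, and its length is taken to be 4 bytes for each machine
instruction counted in the @code{asm} string.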
3556:
3557: @node Attr Example
3558: @subsection Example of Attribute Specifications
3559: @cindex attribute specifications example
3560: @cindex attribute specifications
3561:
3562: The judicious use of defaulting is important in the efficient use of
3563: insn attributes. Typically, insns are divided into @dfn{types} and an
3564: attribute, customarily called @code{type}, is used to represent this
3565: value. This attribute is normally used only to define the default value
3566: for other attributes. An example will clarify this usage.
3567:
3568: Assume we have a RISC machine with a condition code and in which only
3569: full-word operations are performed in registers. Let us assume that we
3570: can divide all insns into loads, stores, (integer) arithmetic
3571: operations, floating point operations, and branches.
3572:
3573: Here we will concern ourselves with determining the effect of an insn on
3574: the condition code and will limit ourselves to the following possible
3575: effects: The condition code can be set unpredictably (clobbered), not
3576: be changed, be set to agree with the results of the operation, or be
3577: changed only if the item previously set into the condition code has been
3578: modified.
3579:
3580: Here is part of a sample @file{md} file for such a machine:
3581:
3582: @smallexample
3583: (define_attr "type" "load,store,arith,fp,branch" (const_string "arith"))
3584:
3585: (define_attr "cc" "clobber,unchanged,set,change0"
3586: (cond [(eq_attr "type" "load")
3587: (const_string "change0")
3588: (eq_attr "type" "store,branch")
3589: (const_string "unchanged")
3590: (eq_attr "type" "arith")
3591: (if_then_else (match_operand:SI 0 "" "")
3592: (const_string "set")
3593: (const_string "clobber"))]
3594: (const_string "clobber")))
3595:
3596: (define_insn ""
3597: [(set (match_operand:SI 0 "general_operand" "=r,r,m")
3598: (match_operand:SI 1 "general_operand" "r,m,r"))]
3599: ""
3600: "@@
3601: move %0,%1
3602: load %0,%1
3603: store %0,%1"
3604: [(set_attr "type" "arith,load,store")])
3605: @end smallexample
3606:
3607: Note that we assume in the above example that arithmetic operations
3608: performed on quantities smaller than a machine word clobber the condition
3609: code since they will set the condition code to a value corresponding to the
3610: full-word result.
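
To show the defaults at work, here is a further pattern that could be
added to the same sample file.  It is a sketch only: the assembler
mnemonic @samp{add} and the operand syntax are hypothetical.

@smallexample
(define_insn "addsi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""
  "add %0,%1,%2")
@end smallexample

@noindent
This pattern has no attribute vector, so @code{type} takes its default
value, @code{arith}.  The default expression for @code{cc} then tests
whether operand 0 has mode @code{SImode}.  It does, so @code{cc} is
@code{set}.  An otherwise identical pattern operating on @code{HImode}
operands would receive @code{clobber}.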
3611:
3612: @node Insn Lengths
3613: @subsection Computing the Length of an Insn
3614: @cindex insn lengths, computing
3615: @cindex computing the length of an insn
3616:
3617: For many machines, multiple types of branch instructions are provided, each
3618: for different length branch displacements. In most cases, the assembler
3619: will choose the correct instruction to use. However, when the assembler
3620: cannot do so, GCC can, provided that a special attribute, the @samp{length}
3621: attribute, is defined. This attribute must be defined to have numeric
3622: values by specifying a null string as the value list in its @code{define_attr}.
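
A minimal sketch of such a definition, assuming that most insns on the
target are four bytes long (the default value is an assumption chosen
for illustration):

@smallexample
;; The empty value list marks "length" as a numeric attribute.
(define_attr "length" "" (const_int 4))
@end smallexample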
3623:
3624: In the case of the @samp{length} attribute, two additional forms of
3625: arithmetic terms are allowed in test expressions:
3626:
3627: @table @code
3628: @cindex @code{match_dup} and attributes
3629: @item (match_dup @var{n})
3630: This refers to the address of operand @var{n} of the current insn, which
3631: must be a @code{label_ref}.
3632:
3633: @cindex @code{pc} and attributes
3634: @item (pc)
3635: This refers to the address of the @emph{current} insn. It might have
3636: been more consistent with other usage to make this the address of the
3637: @emph{next} insn but this would be confusing because the length of the
3638: current insn is to be computed.
3639: @end table
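
As a sketch of how the two terms might be combined, a hypothetical
machine whose short branch reaches targets within roughly 128 bytes of
the current insn could compute the branch length as follows. The
displacement range and the two lengths are assumptions made for
illustration, as is the use of @code{minus} and @code{and} in the test;
the expression would appear in the attribute specification of a branch
pattern:

@smallexample
;; Assumed: a 2-byte branch when the target is near, else 6 bytes.
(set (attr "length")
     (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -128))
                        (lt (minus (match_dup 0) (pc)) (const_int 128)))
                   (const_int 2)
                   (const_int 6)))
@end smallexample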
3640:
3641: @cindex @code{addr_vec}, length of
3642: @cindex @code{addr_diff_vec}, length of
3643: For normal insns, the length will be determined by the value of the
3644: @samp{length} attribute. In the case of @code{addr_vec} and
3645: @code{addr_diff_vec} insn patterns, the length is computed as
3646: the number of vectors multiplied by the size of each vector.
3647:
3648: Lengths are measured in addressable storage units (bytes).
3649:
3650: The following macros can be used to refine the length computation:
3651:
3652: @table @code
3653: @findex FIRST_INSN_ADDRESS
3654: @item FIRST_INSN_ADDRESS
3655: When the @code{length} insn attribute is used, this macro specifies the
3656: value to be assigned to the address of the first insn in a function. If
3657: not specified, 0 is used.
3658:
3659: @findex ADJUST_INSN_LENGTH
3660: @item ADJUST_INSN_LENGTH (@var{insn}, @var{length})
3661: If defined, modifies the length assigned to instruction @var{insn} as a
3662: function of the context in which it is used. @var{length} is an lvalue
3663: that contains the initially computed length of the insn and should be
3664: updated with the correct length of the insn. If updating is required,
3665: @var{insn} must not be a varying-length insn.
3666:
3667: This macro will normally not be required. A case in which it is
3668: required is the ROMP. On this machine, the size of an @code{addr_vec}
3669: insn must be increased by two to compensate for the fact that alignment
3670: may be required.
3671: @end table
3672:
3673: @findex get_attr_length
3674: The routine @code{get_attr_length}, which returns the value of the
3675: @code{length} attribute, can be used by the output routine to
3676: determine the form of the branch instruction to be written, as the
3677: example below illustrates.
3678:
3679: As an example of the specification of variable-length branches, consider
3680: the IBM 360. If we adopt the convention that a register will be set to
3681: the starting address of a function, we can jump to labels within 4k of
3682: the start using a four-byte instruction. Otherwise, we need a six-byte
3683: sequence to load the address from memory and then branch to it.
3684:
3685: On such a machine, a pattern for a branch instruction might be specified
3686: as follows:
3687:
3688: @smallexample
3689: (define_insn "jump"
3690: [(set (pc)
3691: (label_ref (match_operand 0 "" "")))]
3692: ""
3693: "*
3694: @{
3695: return (get_attr_length (insn) == 4
3696: ? \"b %l0\" : \"l r15,=a(%l0); br r15\");
3697: @}"
3698: [(set (attr "length") (if_then_else (lt (match_dup 0) (const_int 4096))
3699: (const_int 4)
3700: (const_int 6)))])
3701: @end smallexample
3702:
3703: @node Constant Attributes
3704: @subsection Constant Attributes
3705: @cindex constant attributes
3706:
3707: A special form of @code{define_attr}, where the expression for the
3708: default value is a @code{const} expression, indicates an attribute that
3709: is constant for a given run of the compiler. Constant attributes may be
3710: used to specify which variety of processor is used. For example,
3711:
3712: @smallexample
3713: (define_attr "cpu" "m88100,m88110,m88000"
3714: (const
3715: (cond [(symbol_ref "TARGET_88100") (const_string "m88100")
3716: (symbol_ref "TARGET_88110") (const_string "m88110")]
3717: (const_string "m88000"))))
3718:
3719: (define_attr "memory" "fast,slow"
3720: (const
3721: (if_then_else (symbol_ref "TARGET_FAST_MEM")
3722: (const_string "fast")
3723: (const_string "slow"))))
3724: @end smallexample
3725:
3726: The routine generated for constant attributes has no parameters as it
3727: does not depend on any particular insn. RTL expressions used to define
3728: the value of a constant attribute may use the @code{symbol_ref} form,
3729: but may not use either the @code{match_operand} form or @code{eq_attr}
3730: forms involving insn attributes.
3731:
3732: @node Delay Slots
3733: @subsection Delay Slot Scheduling
3734: @cindex delay slots, defining
3735:
3736: The insn attribute mechanism can be used to specify the requirements for
3737: delay slots, if any, on a target machine. An instruction is said to
3738: require a @dfn{delay slot} if some instructions that are physically
3739: after the instruction are executed as if they were located before it.
3740: Classic examples are branch and call instructions, which often execute
3741: the following instruction before the branch or call is performed.
3742:
3743: On some machines, conditional branch instructions can optionally
3744: @dfn{annul} instructions in the delay slot. This means that the
3745: instruction will not be executed for certain branch outcomes. Both
3746: instructions that annul if the branch is true and instructions that
3747: annul if the branch is false are supported.
3748:
3749: Delay slot scheduling differs from instruction scheduling in that
3750: determining whether an instruction needs a delay slot is dependent only
3751: on the type of instruction being generated, not on data flow between the
3752: instructions. See the next section for a discussion of data-dependent
3753: instruction scheduling.
3754:
3755: @findex define_delay
3756: An insn's need for one or more delay slots is indicated
3757: via the @code{define_delay} expression. It has the following form:
3758:
3759: @smallexample
3760: (define_delay @var{test}
3761: [@var{delay-1} @var{annul-true-1} @var{annul-false-1}
3762: @var{delay-2} @var{annul-true-2} @var{annul-false-2}
3763: @dots{}])
3764: @end smallexample
3765:
3766: @var{test} is an attribute test that indicates whether this
3767: @code{define_delay} applies to a particular insn. If so, the number of
3768: required delay slots is determined by the length of the vector specified
3769: as the second argument. An insn placed in delay slot @var{n} must
3770: satisfy attribute test @var{delay-n}. @var{annul-true-n} is an
3771: attribute test that specifies which insns may be annulled if the branch
3772: is true. Similarly, @var{annul-false-n} specifies which insns in the
3773: delay slot may be annulled if the branch is false. If annulling is not
3774: supported for that delay slot, @code{(nil)} should be coded.@refill
3775:
3776: For example, in the common case where branch and call insns require
3777: a single delay slot, which may contain any insn other than a branch or
3778: call, the following would be placed in the @file{md} file:
3779:
3780: @smallexample
3781: (define_delay (eq_attr "type" "branch,call")
3782: [(eq_attr "type" "!branch,call") (nil) (nil)])
3783: @end smallexample
3784:
3785: Multiple @code{define_delay} expressions may be specified. In this
3786: case, each such expression specifies different delay slot requirements
3787: and there must be no insn for which tests in two @code{define_delay}
3788: expressions are both true.
3789:
3790: For example, if we have a machine that requires one delay slot for branches
3791: but two for calls, no delay slot can contain a branch or call insn,
3792: and any valid insn in the delay slot for the branch can be annulled if the
3793: branch is true, we might represent this as follows:
3794:
3795: @smallexample
3796: (define_delay (eq_attr "type" "branch")
3797: [(eq_attr "type" "!branch,call")
3798: (eq_attr "type" "!branch,call")
3799: (nil)])
3800:
3801: (define_delay (eq_attr "type" "call")
3802: [(eq_attr "type" "!branch,call") (nil) (nil)
3803: (eq_attr "type" "!branch,call") (nil) (nil)])
3804: @end smallexample
3805: @c the above is *still* too long. --mew 4feb93
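
As a further sketch, consider a separate hypothetical machine on which a
conditional branch (given the assumed type @samp{cbranch}, which is not
part of the examples above) has one delay slot whose insn can be
annulled only if the branch is false:

@smallexample
(define_delay (eq_attr "type" "cbranch")
  [(eq_attr "type" "!branch,cbranch,call")   ; may occupy the slot
   (nil)                                     ; no annul-if-true
   (eq_attr "type" "!branch,cbranch,call")]) ; annul-if-false
@end smallexample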
3806:
3807: @node Function Units
3808: @subsection Specifying Function Units
3809: @cindex function units, for scheduling
3810:
3811: On most RISC machines, there are instructions whose results are not
3812: available for a specific number of cycles. Common cases are instructions
3813: that load data from memory. On many machines, a pipeline stall will result
3814: if the data is referenced too soon after the load instruction.
3815:
3816: In addition, many newer microprocessors have multiple function units, usually
3817: one for integer and one for floating point, and often will incur pipeline
3818: stalls when a result that is needed is not yet ready.
3819:
3820: The descriptions in this section allow the specification of how much
3821: time must elapse between the execution of an instruction and the time
3822: when its result is used. It also allows specification of when the
3823: execution of an instruction will delay execution of similar instructions
3824: due to function unit conflicts.
3825:
3826: For the purposes of the specifications in this section, a machine is
3827: divided into @dfn{function units}, each of which executes a specific
3828: class of instructions in first-in-first-out order. Function units that
3829: accept one instruction each cycle and allow a result to be used in the
3830: succeeding instruction (usually via forwarding) need not be specified.
3831: Classic RISC microprocessors will normally have a single function unit,
3832: which we can call @samp{memory}. The newer ``superscalar'' processors
3833: will often have function units for floating point operations, usually at
3834: least a floating point adder and multiplier.
3835:
3836: @findex define_function_unit
3837: Each usage of a function unit by a class of insns is specified with a
3838: @code{define_function_unit} expression, which looks like this:
3839:
3840: @smallexample
3841: (define_function_unit @var{name} @var{multiplicity} @var{simultaneity}
3842: @var{test} @var{ready-delay} @var{issue-delay}
3843: [@var{conflict-list}])
3844: @end smallexample
3845:
3846: @var{name} is a string giving the name of the function unit.
3847:
3848: @var{multiplicity} is an integer specifying the number of identical
3849: units in the processor. If more than one unit is specified, they will
3850: be scheduled independently. Only truly independent units should be
3851: counted; a pipelined unit should be specified as a single unit. (The
3852: only common example of a machine that has multiple function units for a
3853: single instruction class that are truly independent and not pipelined
3854: is the CDC 6600, with its two multiply and two increment units.)
3855:
3856: @var{simultaneity} specifies the maximum number of insns that can be
3857: executing in each instance of the function unit simultaneously or zero
3858: if the unit is pipelined and has no limit.
3859:
3860: All @code{define_function_unit} definitions referring to function unit
3861: @var{name} must have the same name and values for @var{multiplicity} and
3862: @var{simultaneity}.
3863:
3864: @var{test} is an attribute test that selects the insns we are describing
3865: in this definition. Note that an insn may use more than one function
3866: unit and a function unit may be specified in more than one
3867: @code{define_function_unit}.
3868:
3869: @var{ready-delay} is an integer that specifies the number of cycles
3870: after which the result of the instruction can be used without
3871: introducing any stalls.
3872:
3873: @var{issue-delay} is an integer that specifies the number of cycles
3874: after the instruction matching the @var{test} expression begins using
3875: this unit until a subsequent instruction can begin. A cost of @var{N}
3876: indicates an @var{N-1} cycle delay. A subsequent instruction may also
3877: be delayed if an earlier instruction has a longer @var{ready-delay}
3878: value. This blocking effect is computed using the @var{simultaneity},
3879: @var{ready-delay}, @var{issue-delay}, and @var{conflict-list} terms.
3880: For a normal non-pipelined function unit, @var{simultaneity} is one, the
3881: unit is taken to block for the @var{ready-delay} cycles of the executing
3882: insn, and smaller values of @var{issue-delay} are ignored.
3883:
3884: @var{conflict-list} is an optional list giving detailed conflict costs
3885: for this unit. If specified, it is a list of condition test expressions
3886: to be applied to insns chosen to execute in @var{name} following the
3887: particular insn matching @var{test} that is already executing in
3888: @var{name}. For each insn in the list, @var{issue-delay} specifies the
3889: conflict cost; for insns not in the list, the cost is zero. If not
3890: specified, @var{conflict-list} defaults to all instructions that use the
3891: function unit.
3892:
3893: Typical uses of this vector are where a floating point function unit can
3894: pipeline either single- or double-precision operations, but not both, or
3895: where a memory unit can pipeline loads, but not stores, etc.
3896:
3897: As an example, consider a classic RISC machine where the result of a
3898: load instruction is not available for two cycles (a single ``delay''
3899: instruction is required) and where only one load instruction can be
3900: executing at a time. This would be specified as:
3901:
3902: @smallexample
3903: (define_function_unit "memory" 1 1 (eq_attr "type" "load") 2 0)
3904: @end smallexample
3905:
3906: For the case of a floating point function unit that can pipeline either
3907: single or double precision, but not both, the following could be specified:
3908:
3909: @smallexample
3910: (define_function_unit
3911: "fp" 1 0 (eq_attr "type" "sp_fp") 4 4 [(eq_attr "type" "dp_fp")])
3912: (define_function_unit
3913: "fp" 1 0 (eq_attr "type" "dp_fp") 4 4 [(eq_attr "type" "sp_fp")])
3914: @end smallexample
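
The other typical use mentioned above, a memory unit that can pipeline
loads but not stores, might be sketched as follows. This describes a
separate hypothetical machine from the earlier @samp{memory} example,
and the cycle counts are assumptions chosen for illustration:

@smallexample
;; Loads are pipelined: no issue delay, result ready after 2 cycles.
(define_function_unit "memory" 1 0 (eq_attr "type" "load") 2 0)
;; A store holds up insns that follow it on the unit (issue-delay 3).
(define_function_unit "memory" 1 0 (eq_attr "type" "store") 1 3)
@end smallexample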
3915:
3916: @strong{Note:} The scheduler attempts to avoid function unit conflicts
3917: and uses all the specifications in the @code{define_function_unit}
3918: expression. It has recently come to our attention that these
3919: specifications may not allow modeling of some of the newer
3920: ``superscalar'' processors that have insns using multiple pipelined
3921: units. These insns will cause a potential conflict for the second unit
3922: used during their execution and there is no way of representing that
3923: conflict. We welcome any examples of how function unit conflicts work
3924: in such processors and suggestions for their representation.
3925: @end ifset