1: This is Info file gcc.info, produced by Makeinfo-1.54 from the input
2: file gcc.texi.
3:
4: This file documents the use and the internals of the GNU compiler.
5:
6: Published by the Free Software Foundation 675 Massachusetts Avenue
7: Cambridge, MA 02139 USA
8:
9: Copyright (C) 1988, 1989, 1992, 1993 Free Software Foundation, Inc.
10:
11: Permission is granted to make and distribute verbatim copies of this
12: manual provided the copyright notice and this permission notice are
13: preserved on all copies.
14:
15: Permission is granted to copy and distribute modified versions of
16: this manual under the conditions for verbatim copying, provided also
17: that the sections entitled "GNU General Public License" and "Protect
18: Your Freedom--Fight `Look And Feel'" are included exactly as in the
19: original, and provided that the entire resulting derived work is
20: distributed under the terms of a permission notice identical to this
21: one.
22:
23: Permission is granted to copy and distribute translations of this
24: manual into another language, under the above conditions for modified
25: versions, except that the sections entitled "GNU General Public
26: License" and "Protect Your Freedom--Fight `Look And Feel'", and this
27: permission notice, may be included in translations approved by the Free
28: Software Foundation instead of in the original English.
29:
30:
31: File: gcc.info, Node: Standard Names, Next: Pattern Ordering, Prev: Constraints, Up: Machine Desc
32:
33: Standard Pattern Names For Generation
34: =====================================
35:
36: Here is a table of the instruction names that are meaningful in the
37: RTL generation pass of the compiler. Giving one of these names to an
38: instruction pattern tells the RTL generation pass that it can use the
39: pattern to accomplish a certain task.
40:
41: `movM'
42: Here M stands for a two-letter machine mode name, in lower case.
43: This instruction pattern moves data with that machine mode from
44: operand 1 to operand 0. For example, `movsi' moves full-word data.
45:
46: If operand 0 is a `subreg' with mode M of a register whose own
47: mode is wider than M, the effect of this instruction is to store
48: the specified value in the part of the register that corresponds
49: to mode M. The effect on the rest of the register is undefined.
50:
51: This class of patterns is special in several ways. First of all,
52: each of these names *must* be defined, because there is no other
53: way to copy a datum from one place to another.
54:
55: Second, these patterns are not used solely in the RTL generation
56: pass. Even the reload pass can generate move insns to copy values
57: from stack slots into temporary registers. When it does so, one
58: of the operands is a hard register and the other is an operand
59: that may need to be reloaded into a register.
60:
61: Therefore, when given such a pair of operands, the pattern must
62: generate RTL which needs no reloading and needs no temporary
63: registers--no registers other than the operands. For example, if
64: you support the pattern with a `define_expand', then in such a
65: case the `define_expand' mustn't call `force_reg' or any other such
66: function which might generate new pseudo registers.
67:
68: This requirement exists even for subword modes on a RISC machine
69: where fetching those modes from memory normally requires several
70: insns and some temporary registers. Look in `spur.md' to see how
71: the requirement can be satisfied.
72:
73: During reload a memory reference with an invalid address may be
74: passed as an operand. Such an address will be replaced with a
75: valid address later in the reload pass. In this case, nothing may
76: be done with the address except to use it as it stands. If it is
77: copied, it will not be replaced with a valid address. No attempt
78: should be made to make such an address into a valid address and no
79: routine (such as `change_address') that will do so may be called.
80: Note that `general_operand' will fail when applied to such an
81: address.
82:
83: The global variable `reload_in_progress' (which must be explicitly
84: declared if required) can be used to determine whether such special
85: handling is required.
86:
87: The variety of operands that have reloads depends on the rest of
88: the machine description, but typically on a RISC machine these can
89: only be pseudo registers that did not get hard registers, while on
90: other machines explicit memory references will get optional
91: reloads.
92:
93: If a scratch register is required to move an object to or from
94: memory, it can be allocated using `gen_reg_rtx' prior to reload.
95: But this is impossible during and after reload. If there are
96: cases needing scratch registers after reload, you must define
97: `SECONDARY_INPUT_RELOAD_CLASS' and perhaps also
98: `SECONDARY_OUTPUT_RELOAD_CLASS' to detect them, and provide
99: patterns `reload_inM' or `reload_outM' to handle them. *Note
100: Register Classes::.
101:
102: The constraints on a `movM' must permit moving any hard register
103: to any other hard register provided that `HARD_REGNO_MODE_OK'
104: permits mode M in both registers and `REGISTER_MOVE_COST' applied
105: to their classes returns a value of 2.
106:
107: It is obligatory to support floating point `movM' instructions
108: into and out of any registers that can hold fixed point values,
109: because unions and structures (which have modes `SImode' or
110: `DImode') can be in those registers and they may have floating
111: point members.
112:
113: There may also be a need to support fixed point `movM'
114: instructions in and out of floating point registers.
115: Unfortunately, I have forgotten why this was so, and I don't know
116: whether it is still true. If `HARD_REGNO_MODE_OK' rejects fixed
117: point values in floating point registers, then the constraints of
118: the fixed point `movM' instructions must be designed to avoid
119: ever trying to reload into a floating point register.
120:
121: `reload_inM'
122: `reload_outM'
123: Like `movM', but used when a scratch register is required to move
124: between operand 0 and operand 1. Operand 2 describes the scratch
125: register. See the discussion of the `SECONDARY_RELOAD_CLASS'
126: macro in *note Register Classes::..
127:
128: `movstrictM'
129: Like `movM' except that if operand 0 is a `subreg' with mode M of
130: a register whose natural mode is wider, the `movstrictM'
131: instruction is guaranteed not to alter any of the register except
132: the part which belongs to mode M.
133:
134: `load_multiple'
135: Load several consecutive memory locations into consecutive
136: registers. Operand 0 is the first of the consecutive registers,
137: operand 1 is the first memory location, and operand 2 is a
138: constant: the number of consecutive registers.
139:
140: Define this only if the target machine really has such an
141: instruction; do not define this if the most efficient way of
142: loading consecutive registers from memory is to do them one at a
143: time.
144:
145: On some machines, there are restrictions as to which consecutive
146: registers can be stored into memory, such as particular starting or
147: ending register numbers or only a range of valid counts. For those
148: machines, use a `define_expand' (*note Expander Definitions::.)
149: and make the pattern fail if the restrictions are not met.
150:
151: Write the generated insn as a `parallel' with elements being a
152: `set' of one register from the appropriate memory location (you may
153: also need `use' or `clobber' elements). Use a `match_parallel'
154: (*note RTL Template::.) to recognize the insn. See `a29k.md' and
155: `rs6000.md' for examples of the use of this insn pattern.
156:
157: `store_multiple'
158: Similar to `load_multiple', but store several consecutive registers
159: into consecutive memory locations. Operand 0 is the first of the
160: consecutive memory locations, operand 1 is the first register, and
161: operand 2 is a constant: the number of consecutive registers.
162:
163: `addM3'
164: Add operand 2 and operand 1, storing the result in operand 0. All
165: operands must have mode M. This can be used even on two-address
166: machines, by means of constraints requiring operands 1 and 0 to be
167: the same location.
168:
169: `subM3', `mulM3'
170: `divM3', `udivM3', `modM3', `umodM3'
171: `sminM3', `smaxM3', `uminM3', `umaxM3'
172: `andM3', `iorM3', `xorM3'
173: Similar, for other arithmetic operations.
174:
175: `mulhisi3'
176: Multiply operands 1 and 2, which have mode `HImode', and store a
177: `SImode' product in operand 0.
178:
179: `mulqihi3', `mulsidi3'
180: Similar widening-multiplication instructions of other widths.
181:
182: `umulqihi3', `umulhisi3', `umulsidi3'
183: Similar widening-multiplication instructions that do unsigned
184: multiplication.
185:
186: `divmodM4'
187: Signed division that produces both a quotient and a remainder.
188: Operand 1 is divided by operand 2 to produce a quotient stored in
189: operand 0 and a remainder stored in operand 3.
190:
191: For machines with an instruction that produces both a quotient and
192: a remainder, provide a pattern for `divmodM4' but do not provide
193: patterns for `divM3' and `modM3'. This allows optimization in the
194: relatively common case when both the quotient and remainder are
195: computed.
196:
197: If an instruction that just produces a quotient or just a remainder
198: exists and is more efficient than the instruction that produces
199: both, write the output routine of `divmodM4' to call
200: `find_reg_note' and look for a `REG_UNUSED' note on the quotient
201: or remainder and generate the appropriate instruction.
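
The operand roles of `divmodM4' can be sketched in C as follows. This is
only an illustration: the names `divmod_result' and `divmod_si' are
hypothetical, and the rounding shown (truncation toward zero, remainder
taking the sign of the dividend) is simply what C99 integer division
gives, not something this manual specifies.

```c
#include <assert.h>

/* Hypothetical model of an SImode `divmodM4' pattern: operand 1 is
   divided by operand 2, the quotient goes to operand 0 and the
   remainder to operand 3.  */
struct divmod_result {
  int quotient;   /* operand 0 */
  int remainder;  /* operand 3 */
};

static struct divmod_result divmod_si(int op1, int op2)
{
  struct divmod_result r;

  assert(op2 != 0);         /* division by zero is not modelled */
  r.quotient = op1 / op2;   /* C99: truncates toward zero */
  r.remainder = op1 % op2;  /* C99: sign follows the dividend */
  return r;
}
```

Both results come from one call, which mirrors the case where a single
machine instruction produces the quotient and the remainder together.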
202:
203: `udivmodM4'
204: Similar, but does unsigned division.
205:
206: `ashlM3'
207: Arithmetic-shift operand 1 left by a number of bits specified by
208: operand 2, and store the result in operand 0. Here M is the mode
209: of operand 0 and operand 1; operand 2's mode is specified by the
210: instruction pattern, and the compiler will convert the operand to
211: that mode before generating the instruction.
212:
213: `ashrM3', `lshlM3', `lshrM3', `rotlM3', `rotrM3'
214: Other shift and rotate instructions, analogous to the `ashlM3'
215: instructions.
216:
217: Logical and arithmetic left shift are the same. Machines that do
218: not allow negative shift counts often have only one instruction for
219: shifting left. On such machines, you should define a pattern named
220: `ashlM3' and leave `lshlM3' undefined.
221:
222: `negM2'
223: Negate operand 1 and store the result in operand 0.
224:
225: `absM2'
226: Store the absolute value of operand 1 into operand 0.
227:
228: `sqrtM2'
229: Store the square root of operand 1 into operand 0.
230:
231: The `sqrt' built-in function of C always uses the mode which
232: corresponds to the C data type `double'.
233:
234: `ffsM2'
235: Store into operand 0 one plus the index of the least significant
236: 1-bit of operand 1. If operand 1 is zero, store zero. M is the
237: mode of operand 0; operand 1's mode is specified by the instruction
238: pattern, and the compiler will convert the operand to that mode
239: before generating the instruction.
240:
241: The `ffs' built-in function of C always uses the mode which
242: corresponds to the C data type `int'.
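
As a rough illustration of the value `ffsM2' is described as storing,
here is a small C model. The function names are hypothetical and the
loop is chosen for clarity, not speed; a real pattern would map to a
machine instruction.

```c
#include <assert.h>

/* Hypothetical model of `ffsM2': one plus the index of the least
   significant 1-bit of OP1, or zero if OP1 is zero.  */
static int ffs_model(unsigned int op1)
{
  int result = 1;

  if (op1 == 0)
    return 0;
  while ((op1 & 1u) == 0) {
    op1 >>= 1;
    result++;
  }
  return result;
}

/* A few sample values, checked with assert.  */
static void ffs_model_examples(void)
{
  assert(ffs_model(0u) == 0);   /* no bit set */
  assert(ffs_model(1u) == 1);   /* bit 0 set */
  assert(ffs_model(8u) == 4);   /* bit 3 set */
}
```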
243:
244: `one_cmplM2'
245: Store the bitwise-complement of operand 1 into operand 0.
246:
247: `cmpM'
248: Compare operand 0 and operand 1, and set the condition codes. The
249: RTL pattern should look like this:
250:
251: (set (cc0) (compare (match_operand:M 0 ...)
252: (match_operand:M 1 ...)))
253:
254: `tstM'
255: Compare operand 0 against zero, and set the condition codes. The
256: RTL pattern should look like this:
257:
258: (set (cc0) (match_operand:M 0 ...))
259:
260: `tstM' patterns should not be defined for machines that do not use
261: `(cc0)'. Doing so would confuse the optimizer since it would no
262: longer be clear which `set' operations were comparisons. The
263: `cmpM' patterns should be used instead.
264:
265: `movstrM'
266: Block move instruction. The addresses of the destination and
267: source strings are the first two operands, and both are in mode
268: `Pmode'. The number of bytes to move is the third operand, in
269: mode M.
270:
271: The fourth operand is the known shared alignment of the source and
272: destination, in the form of a `const_int' rtx. Thus, if the
273: compiler knows that both source and destination are word-aligned,
274: it may provide the value 4 for this operand.
275:
276: These patterns need not give special consideration to the
277: possibility that the source and destination strings might overlap.
278:
279: `cmpstrM'
280: Block compare instruction, with five operands. Operand 0 is the
281: output; it has mode M. The remaining four operands are like the
282: operands of `movstrM'. The two memory blocks specified are
283: compared byte by byte in lexicographic order. The effect of the
284: instruction is to store a value in operand 0 whose sign indicates
285: the result of the comparison.
286:
`strlenM'
287: Compute the length of a string, with four operands. Operand 0 is
288: the result (of mode M), operand 1 is a `mem' referring to the
289: first character of the string, operand 2 is the character to
290: search for (normally zero), and operand 3 is a constant describing
291: the known alignment of the beginning of the string.
292:
293: `floatMN2'
294: Convert signed integer operand 1 (valid for fixed point mode M) to
295: floating point mode N and store in operand 0 (which has mode N).
296:
297: `floatunsMN2'
298: Convert unsigned integer operand 1 (valid for fixed point mode M)
299: to floating point mode N and store in operand 0 (which has mode N).
300:
301: `fixMN2'
302: Convert operand 1 (valid for floating point mode M) to fixed point
303: mode N as a signed number and store in operand 0 (which has mode
304: N). This instruction's result is defined only when the value of
305: operand 1 is an integer.
306:
307: `fixunsMN2'
308: Convert operand 1 (valid for floating point mode M) to fixed point
309: mode N as an unsigned number and store in operand 0 (which has
310: mode N). This instruction's result is defined only when the value
311: of operand 1 is an integer.
312:
313: `ftruncM2'
314: Convert operand 1 (valid for floating point mode M) to an integer
315: value, still represented in floating point mode M, and store it in
316: operand 0 (valid for floating point mode M).
317:
318: `fix_truncMN2'
319: Like `fixMN2' but works for any floating point value of mode M by
320: converting the value to an integer.
321:
322: `fixuns_truncMN2'
323: Like `fixunsMN2' but works for any floating point value of mode M
324: by converting the value to an integer.
325:
326: `truncMN'
327: Truncate operand 1 (valid for mode M) to mode N and store in
328: operand 0 (which has mode N). Both modes must be fixed point or
329: both floating point.
330:
331: `extendMN'
332: Sign-extend operand 1 (valid for mode M) to mode N and store in
333: operand 0 (which has mode N). Both modes must be fixed point or
334: both floating point.
335:
336: `zero_extendMN'
337: Zero-extend operand 1 (valid for mode M) to mode N and store in
338: operand 0 (which has mode N). Both modes must be fixed point.
339:
340: `extv'
341: Extract a bit field from operand 1 (a register or memory operand),
342: where operand 2 specifies the width in bits and operand 3 the
343: starting bit, and store it in operand 0. Operand 0 must have mode
344: `word_mode'. Operand 1 may have mode `byte_mode' or `word_mode';
345: often `word_mode' is allowed only for registers. Operands 2 and 3
346: must be valid for `word_mode'.
347:
348: The RTL generation pass generates this instruction only with
349: constants for operands 2 and 3.
350:
351: The bit-field value is sign-extended to a full word integer before
352: it is stored in operand 0.
353:
354: `extzv'
355: Like `extv' except that the bit-field value is zero-extended.
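
The difference between `extv' and `extzv' can be sketched in C. The
sketch below makes two assumptions that are not stated above: the word
is 32 bits wide, and bits are numbered from the least significant end
(on a real target the numbering depends on the machine description).
The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `extzv': extract WIDTH bits of OP1 starting at
   bit START (counted from the least significant bit) and zero-extend.  */
static uint32_t extzv_model(uint32_t op1, unsigned width, unsigned start)
{
  uint32_t field;

  assert(width >= 1 && width <= 32 && start <= 32 - width);
  field = op1 >> start;
  if (width < 32)
    field &= (UINT32_C(1) << width) - 1;
  return field;
}

/* Hypothetical model of `extv': the same field, sign-extended.  */
static int32_t extv_model(uint32_t op1, unsigned width, unsigned start)
{
  uint32_t field = extzv_model(op1, width, start);
  uint32_t sign = UINT32_C(1) << (width - 1);

  /* (field ^ sign) - sign copies the field's top bit into all the
     higher bits of the word.  */
  return (int32_t)((field ^ sign) - sign);
}
```

For the 4-bit field at bit 4 of the value 0xF0, the zero-extended
result is 15 while the sign-extended result is -1.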
356:
357: `insv'
358: Store operand 3 (which must be valid for `word_mode') into a bit
359: field in operand 0, where operand 1 specifies the width in bits and
360: operand 2 the starting bit. Operand 0 may have mode `byte_mode' or
361: `word_mode'; often `word_mode' is allowed only for registers.
362: Operands 1 and 2 must be valid for `word_mode'.
363:
364: The RTL generation pass generates this instruction only with
365: constants for operands 1 and 2.
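
A C sketch of the store that `insv' performs may help. As an
assumption for illustration only, the word is taken to be 32 bits wide
and bits are numbered from the least significant end; the name
`insv_model' is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `insv': return OP0 with the WIDTH-bit field at
   bit START replaced by the low WIDTH bits of OP3.  All other bits of
   OP0 are left unchanged.  */
static uint32_t insv_model(uint32_t op0, unsigned width, unsigned start,
                           uint32_t op3)
{
  uint32_t mask;

  assert(width >= 1 && width <= 32 && start <= 32 - width);
  mask = (width < 32) ? (UINT32_C(1) << width) - 1 : UINT32_MAX;
  return (op0 & ~(mask << start)) | ((op3 & mask) << start);
}
```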
366:
367: `sCOND'
368: Store zero or nonzero in the operand according to the condition
369: codes. Value stored is nonzero iff the condition COND is true.
370: COND is the name of a comparison operation expression code, such
371: as `eq', `lt' or `leu'.
372:
373: You specify the mode that the operand must have when you write the
374: `match_operand' expression. The compiler automatically sees which
375: mode you have used and supplies an operand of that mode.
376:
377: The value stored for a true condition must have 1 as its low bit,
378: or else must be negative. Otherwise the instruction is not
379: suitable and you should omit it from the machine description. You
380: describe to the compiler exactly which value is stored by defining
381: the macro `STORE_FLAG_VALUE' (*note Misc::.). If a description
382: cannot be found that can be used for all the `sCOND' patterns, you
383: should omit those operations from the machine description.
384:
385: These operations may fail, but should do so only in relatively
386: uncommon cases; if they would fail for common cases involving
387: integer comparisons, it is best to omit these patterns.
388:
389: If these operations are omitted, the compiler will usually
390: generate code that copies the constant one to the target and
391: branches around an assignment of zero to the target. If this code
392: is more efficient than the potential instructions used for the
393: `sCOND' pattern followed by those required to convert the result
394: into a 1 or a zero in `SImode', you should omit the `sCOND'
395: operations from the machine description.
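
The rule for the stored value, and the extra step needed to turn it
into a 1 or a zero, can be sketched in C. Everything here is a
hypothetical model: `slt_model' stands in for an `slt' pattern, and
the flag value is passed as a parameter only so that several choices
of `STORE_FLAG_VALUE' can be tried.

```c
#include <assert.h>

/* The rule quoted above: the value stored for a true condition must
   have 1 as its low bit, or else must be negative.  */
static int store_flag_value_ok(int value)
{
  return (value & 1) != 0 || value < 0;
}

/* Hypothetical model of an `slt' pattern: store the flag value if
   A < B holds, otherwise store zero.  */
static int slt_model(int a, int b, int store_flag_value)
{
  assert(store_flag_value_ok(store_flag_value));
  return a < b ? store_flag_value : 0;
}

/* Convert the raw stored value into a 1 or a zero.  When the flag
   value is already 1 this step costs nothing; otherwise it stands for
   the extra instructions mentioned above.  */
static int normalize_flag(int raw)
{
  return raw != 0;
}
```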
396:
397: `bCOND'
398: Conditional branch instruction. Operand 0 is a `label_ref' that
399: refers to the label to jump to. Jump if the condition codes meet
400: condition COND.
401:
402: Some machines do not follow the model assumed here where a
403: comparison instruction is followed by a conditional branch
404: instruction. In that case, the `cmpM' (and `tstM') patterns should
405: simply store the operands away and generate all the required insns
406: in a `define_expand' (*note Expander Definitions::.) for the
407: conditional branch operations. All calls to expand `bCOND'
408: patterns are immediately preceded by calls to expand either a
409: `cmpM' pattern or a `tstM' pattern.
410:
411: Machines that use a pseudo register for the condition code value,
412: or where the mode used for the comparison depends on the condition
413: being tested, should also use the above mechanism. *Note Jump
414: Patterns::
415:
416: The above discussion also applies to `sCOND' patterns.
417:
418: `call'
419: Subroutine call instruction returning no value. Operand 0 is the
420: function to call; operand 1 is the number of bytes of arguments
421: pushed (in mode `SImode', except it is normally a `const_int');
422: operand 2 is the number of registers used as operands.
423:
424: On most machines, operand 2 is not actually stored into the RTL
425: pattern. It is supplied for the sake of some RISC machines which
426: need to put this information into the assembler code; they can put
427: it in the RTL instead of operand 1.
428:
429: Operand 0 should be a `mem' RTX whose address is the address of the
430: function. Note, however, that this address can be a `symbol_ref'
431: expression even if it would not be a legitimate memory address on
432: the target machine. If it is also not a valid argument for a call
433: instruction, the pattern for this operation should be a
434: `define_expand' (*note Expander Definitions::.) that places the
435: address into a register and uses that register in the call
436: instruction.
437:
438: `call_value'
439: Subroutine call instruction returning a value. Operand 0 is the
440: hard register in which the value is returned. There are three more
441: operands, the same as the three operands of the `call' instruction
442: (but with numbers increased by one).
443:
444: Subroutines that return `BLKmode' objects use the `call' insn.
445:
446: `call_pop', `call_value_pop'
447: Similar to `call' and `call_value', except used if defined and if
448: `RETURN_POPS_ARGS' is non-zero. They should emit a `parallel'
449: that contains both the function call and a `set' to indicate the
450: adjustment made to the frame pointer.
451:
452: For machines where `RETURN_POPS_ARGS' can be non-zero, the use of
453: these patterns increases the number of functions for which the
454: frame pointer can be eliminated, if desired.
455:
456: `untyped_call'
457: Subroutine call instruction returning a value of any type.
458: Operand 0 is the function to call; operand 1 is a memory location
459: where the result of calling the function is to be stored; operand
460: 2 is a `parallel' expression where each element is a `set'
461: expression that indicates the saving of a function return value
462: into the result block.
463:
464: This instruction pattern should be defined to support
465: `__builtin_apply' on machines where special instructions are needed
466: to call a subroutine with arbitrary arguments or to save the value
467: returned. This instruction pattern is required on machines that
468: have multiple registers that can hold a return value (i.e.
469: `FUNCTION_VALUE_REGNO_P' is true for more than one register).
470:
471: `return'
472: Subroutine return instruction. This instruction pattern name
473: should be defined only if a single instruction can do all the work
474: of returning from a function.
475:
476: Like the `movM' patterns, this pattern is also used after the RTL
477: generation phase. In this case it is to support machines where
478: multiple instructions are usually needed to return from a
479: function, but some class of functions only requires one
480: instruction to implement a return. Normally, the applicable
481: functions are those which do not need to save any registers or
482: allocate stack space.
483:
484: For such machines, the condition specified in this pattern should
485: only be true when `reload_completed' is non-zero and the function's
486: epilogue would only be a single instruction. For machines with
487: register windows, the routine `leaf_function_p' may be used to
488: determine if a register window push is required.
489:
490: Machines that have conditional return instructions should define
491: patterns such as
492:
493: (define_insn ""
494: [(set (pc)
495: (if_then_else (match_operator
496: 0 "comparison_operator"
497: [(cc0) (const_int 0)])
498: (return)
499: (pc)))]
500: "CONDITION"
501: "...")
502:
503: where CONDITION would normally be the same condition specified on
504: the named `return' pattern.
505:
506: `untyped_return'
507: Untyped subroutine return instruction. This instruction pattern
508: should be defined to support `__builtin_return' on machines where
509: special instructions are needed to return a value of any type.
510:
511: Operand 0 is a memory location where the result of calling a
512: function with `__builtin_apply' is stored; operand 1 is a
513: `parallel' expression where each element is a `set' expression
514: that indicates the restoring of a function return value from the
515: result block.
516:
517: `nop'
518: No-op instruction. This instruction pattern name should always be
519: defined to output a no-op in assembler code. `(const_int 0)' will
520: do as an RTL pattern.
521:
522: `indirect_jump'
523: An instruction to jump to an address which is operand zero. This
524: pattern name is mandatory on all machines.
525:
526: `casesi'
527: Instruction to jump through a dispatch table, including bounds
528: checking. This instruction takes five operands:
529:
530: 1. The index to dispatch on, which has mode `SImode'.
531:
532: 2. The lower bound for indices in the table, an integer constant.
533:
534: 3. The total range of indices in the table--the largest index
535: minus the smallest one (both inclusive).
536:
537: 4. A label that precedes the table itself.
538:
539: 5. A label to jump to if the index has a value outside the
540: bounds. (If the machine-description macro
541: `CASE_DROPS_THROUGH' is defined, then an out-of-bounds index
542: drops through to the code following the jump table instead of
543: jumping to this label. In that case, this label is not
544: actually used by the `casesi' instruction, but it is always
545: provided as an operand.)
546:
547: The table is an `addr_vec' or `addr_diff_vec' inside of a
548: `jump_insn'. The number of elements in the table is one plus the
549: difference between the upper bound and the lower bound.
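
The five operands of `casesi' can be modelled in C as below. This is a
sketch under assumptions: labels are represented by plain integers,
the function name is hypothetical, and the single unsigned comparison
is one common way to check both bounds at once, not something this
manual requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of `casesi'.  INDEX is operand 0, LOWER_BOUND is
   operand 1, RANGE is operand 2, TABLE stands for the dispatch table
   that follows the label in operand 3, and DEFAULT_LABEL is operand 4.
   TABLE must hold RANGE + 1 entries.  */
static int casesi_model(int index, int lower_bound, unsigned range,
                        const int *table, int default_label)
{
  unsigned offset;

  assert(table != NULL);
  /* Subtracting in unsigned arithmetic makes an index below the lower
     bound wrap to a very large offset, so one comparison rejects
     indices on either side of the table.  */
  offset = (unsigned)index - (unsigned)lower_bound;
  if (offset > range)
    return default_label;
  return table[offset];
}
```

With a lower bound of 5 and a range of 2, the table has three entries
and serves the indices 5, 6 and 7; any other index selects the default
label.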
550:
551: `tablejump'
552: Instruction to jump to a variable address. This is a low-level
553: capability which can be used to implement a dispatch table when
554: there is no `casesi' pattern.
555:
556: This pattern requires two operands: the address or offset, and a
557: label which should immediately precede the jump table. If the
558: macro `CASE_VECTOR_PC_RELATIVE' is defined then the first operand
559: is an offset which counts from the address of the table;
560: otherwise, it is an absolute address to jump to. In either case,
561: the first operand has mode `Pmode'.
562:
563: The `tablejump' insn is always the last insn before the jump table
564: it uses. Its assembler code normally has no need to use the
565: second operand, but you should incorporate it in the RTL pattern so
566: that the jump optimizer will not delete the table as unreachable
567: code.
568:
569: `save_stack_block'
570: `save_stack_function'
571: `save_stack_nonlocal'
572: `restore_stack_block'
573: `restore_stack_function'
574: `restore_stack_nonlocal'
575: Most machines save and restore the stack pointer by copying it to
576: or from an object of mode `Pmode'. Do not define these patterns on
577: such machines.
578:
579: Some machines require special handling for stack pointer saves and
580: restores. On those machines, define the patterns corresponding to
581: the non-standard cases by using a `define_expand' (*note Expander
582: Definitions::.) that produces the required insns. The three types
583: of saves and restores are:
584:
585: 1. `save_stack_block' saves the stack pointer at the start of a
586: block that allocates a variable-sized object, and
587: `restore_stack_block' restores the stack pointer when the
588: block is exited.
589:
590: 2. `save_stack_function' and `restore_stack_function' do a
591: similar job for the outermost block of a function and are
592: used when the function allocates variable-sized objects or
593: calls `alloca'. Only the epilogue uses the restored stack
594: pointer, allowing a simpler save or restore sequence on some
595: machines.
596:
597: 3. `save_stack_nonlocal' is used in functions that contain labels
598: branched to by nested functions. It saves the stack pointer
599: in such a way that the inner function can use
600: `restore_stack_nonlocal' to restore the stack pointer. The
601: compiler generates code to restore the frame and argument
602: pointer registers, but some machines require saving and
603: restoring additional data such as register window information
604: or stack backchains. Place insns in these patterns to save
605: and restore any such required data.
606:
607: When saving the stack pointer, operand 0 is the save area and
608: operand 1 is the stack pointer. The mode used to allocate the
609: save area is the mode of operand 0. You must specify an integral
610: mode, or `VOIDmode' if no save area is needed for a particular
611: type of save (either because no save is needed or because a
612: machine-specific save area can be used). Operand 0 is the stack
613: pointer and operand 1 is the save area for restore operations. If
614: `save_stack_block' is defined, operand 0 must not be `VOIDmode'
615: since these saves can be arbitrarily nested.
616:
617: A save area is a `mem' that is at a constant offset from
618: `virtual_stack_vars_rtx' when the stack pointer is saved for use by
619: nonlocal gotos and a `reg' in the other two cases.
620:
621: `allocate_stack'
622: Subtract (or add if `STACK_GROWS_DOWNWARD' is undefined) operand 0
623: from the stack pointer to create space for dynamically allocated
624: data.
625:
626: Do not define this pattern if all that must be done is the
627: subtraction. Some machines require other operations such as stack
628: probes or maintaining the back chain. Define this pattern to emit
629: those operations in addition to updating the stack pointer.
630:
631:
632: File: gcc.info, Node: Pattern Ordering, Next: Dependent Patterns, Prev: Standard Names, Up: Machine Desc
633:
634: When the Order of Patterns Matters
635: ==================================
636:
637: Sometimes an insn can match more than one instruction pattern. Then
638: the pattern that appears first in the machine description is the one
639: used. Therefore, more specific patterns (patterns that will match
640: fewer things) and faster instructions (those that will produce better
641: code when they do match) should usually go first in the description.
642:
643: In some cases the effect of ordering the patterns can be used to hide
644: a pattern when it is not valid. For example, the 68000 has an
645: instruction for converting a fullword to floating point and another for
646: converting a byte to floating point. An instruction converting an
647: integer to floating point could match either one. We put the pattern
648: to convert the fullword first to make sure that one will be used rather
649: than the other. (Otherwise a large integer might be generated as a
650: single-byte immediate quantity, which would not work.) Instead of using
651: this pattern ordering it would be possible to make the pattern for
652: convert-a-byte smart enough to deal properly with any constant value.
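
The first-match rule and the two remedies described above can be
sketched in C. The sketch is a toy model, not how the compiler is
implemented: a "pattern" is reduced to a predicate on an integer
constant, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A hypothetical stand-in for an instruction pattern: a predicate
   saying whether the pattern accepts a given integer constant.  */
typedef int (*pattern_pred)(long constant);

/* A naive pattern accepts any constant, whatever its width.  */
static int accepts_any(long constant)
{
  (void)constant;
  return 1;
}

/* A "smart" byte pattern accepts only values that fit in a signed
   byte.  */
static int accepts_signed_byte(long constant)
{
  return constant >= -128 && constant <= 127;
}

/* Return the index of the first pattern that matches CONSTANT, or -1
   if none does.  Earlier entries take priority, as in a machine
   description.  */
static int first_match(const pattern_pred *patterns, size_t count,
                       long constant)
{
  size_t i;

  assert(patterns != NULL);
  for (i = 0; i < count; i++)
    if (patterns[i](constant))
      return (int)i;
  return -1;
}
```

If a naive byte pattern is listed before the fullword pattern, a
constant such as 100000 selects the byte pattern at index 0, which is
the failure described above. Listing the fullword pattern first, or
replacing the byte pattern's predicate with `accepts_signed_byte',
avoids it.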
653:
654:
655: File: gcc.info, Node: Dependent Patterns, Next: Jump Patterns, Prev: Pattern Ordering, Up: Machine Desc
656:
657: Interdependence of Patterns
658: ===========================
659:
660: Every machine description must have a named pattern for each of the
661: conditional branch names `bCOND'. The recognition template must always
662: have the form
663:
664: (set (pc)
665: (if_then_else (COND (cc0) (const_int 0))
666: (label_ref (match_operand 0 "" ""))
667: (pc)))
668:
669: In addition, every machine description must have an anonymous pattern
670: for each of the possible reverse-conditional branches. Their templates
671: look like
672:
673: (set (pc)
674: (if_then_else (COND (cc0) (const_int 0))
675: (pc)
676: (label_ref (match_operand 0 "" ""))))
677:
678: They are necessary because jump optimization can turn direct-conditional
679: branches into reverse-conditional branches.
680:
681: It is often convenient to use the `match_operator' construct to
682: reduce the number of patterns that must be specified for branches. For
683: example,
684:
685: (define_insn ""
686: [(set (pc)
687: (if_then_else (match_operator 0 "comparison_operator"
688: [(cc0) (const_int 0)])
689: (pc)
690: (label_ref (match_operand 1 "" ""))))]
691: "CONDITION"
692: "...")
693:
694: In some cases machines support instructions identical except for the
695: machine mode of one or more operands. For example, there may be
696: "sign-extend halfword" and "sign-extend byte" instructions whose
697: patterns are
698:
699: (set (match_operand:SI 0 ...)
700: (extend:SI (match_operand:HI 1 ...)))
701:
702: (set (match_operand:SI 0 ...)
703: (extend:SI (match_operand:QI 1 ...)))
704:
705: Constant integers do not specify a machine mode, so an instruction to
706: extend a constant value could match either pattern. The pattern it
707: actually will match is the one that appears first in the file. For
708: correct results, this must be the one for the widest possible mode
709: (`HImode', here). If the pattern matches the `QImode' instruction, the
710: results will be incorrect if the constant value does not actually fit
711: that mode.
712:
713: Such instructions to extend constants are rarely generated because
714: they are optimized away, but they do occasionally happen in nonoptimized
715: compilations.
716:
717: If a constraint in a pattern allows a constant, the reload pass may
718: replace a register with a constant permitted by the constraint in some
719: cases. Similarly for memory references. You must ensure that the
720: predicate permits all objects allowed by the constraints to prevent the
721: compiler from crashing.
722:
723: Because of this substitution, you should not provide separate
724: patterns for increment and decrement instructions. Instead, they
725: should be generated from the same pattern that supports
726: register-register add insns by examining the operands and generating
727: the appropriate machine instruction.
728:
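As a rough illustration of "examining the operands", the C sketch below
picks an assembler template for a single add pattern.  It is only a toy
model: `struct toy_operand', `toy_select_add_template' and the
mnemonics `incl', `decl' and `addl' are assumptions made for this
example and are not taken from any real machine description.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an insn operand; not the real GNU CC `rtx' type.  */
struct toy_operand
{
  int is_const;   /* nonzero if the operand is a constant integer */
  long value;     /* the constant's value; ignored for registers */
};

/* Pick an assembler template for one add pattern by examining the
   source operand.  The mnemonics are hypothetical.  */
const char *
toy_select_add_template (const struct toy_operand *src)
{
  assert (src != NULL);
  if (src->is_const && src->value == 1)
    return "incl %0";        /* add of +1: use the increment insn */
  if (src->is_const && src->value == -1)
    return "decl %0";        /* add of -1: use the decrement insn */
  return "addl %2,%0";       /* anything else: general add */
}
```

Because the choice is made when the assembler code is written, a
constant substituted late by reload is still handled by the same
pattern.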
729:
730: File: gcc.info, Node: Jump Patterns, Next: Insn Canonicalizations, Prev: Dependent Patterns, Up: Machine Desc
731:
732: Defining Jump Instruction Patterns
733: ==================================
734:
735: For most machines, GNU CC assumes that the machine has a condition
736: code. A comparison insn sets the condition code, recording the results
737: of both signed and unsigned comparison of the given operands. A
738: separate branch insn tests the condition code and branches or not
739: according to its value. The branch insns come in distinct signed and
740: unsigned flavors. Many common machines, such as the Vax, the 68000 and
741: the 32000, work this way.
742:
743: Some machines have distinct signed and unsigned compare
744: instructions, and only one set of conditional branch instructions. The
745: easiest way to handle these machines is to treat them just like the
746: others until the final stage where assembly code is written. At this
747: time, when outputting code for the compare instruction, peek ahead at
748: the following branch using `next_cc0_user (insn)'. (The variable
749: `insn' refers to the insn being output, in the output-writing code in
750: an instruction pattern.) If the RTL says that it is an unsigned branch,
751: output an unsigned compare; otherwise output a signed compare. When
752: the branch itself is output, you can treat signed and unsigned branches
753: identically.
754:
755: The reason you can do this is that GNU CC always generates a pair of
756: consecutive RTL insns, possibly separated by `note' insns, one to set
757: the condition code and one to test it, and keeps the pair inviolate
758: until the end.
759:
760: To go with this technique, you must define the machine-description
761: macro `NOTICE_UPDATE_CC' to do `CC_STATUS_INIT'; in other words, no
762: compare instruction is superfluous.
763:
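The peek-ahead technique described above might be modelled as follows.
This is a self-contained sketch, not GNU CC code: the enumeration
`toy_cond' stands in for the comparison code found in the branch that
`next_cc0_user (insn)' would return, and the mnemonics `cmp' and `cmpu'
are hypothetical.

```c
#include <assert.h>

/* Toy comparison codes; the last four model unsigned comparisons.  */
enum toy_cond
{
  TOY_EQ, TOY_NE, TOY_GT, TOY_GE, TOY_LT, TOY_LE,
  TOY_GTU, TOY_GEU, TOY_LTU, TOY_LEU
};

/* Return nonzero if COND is one of the unsigned comparisons.  */
int
toy_unsigned_cond_p (enum toy_cond cond)
{
  return (cond == TOY_GTU || cond == TOY_GEU
          || cond == TOY_LTU || cond == TOY_LEU);
}

/* Choose the compare template from the condition tested by the
   branch that follows the compare.  */
const char *
toy_select_compare (enum toy_cond branch_cond)
{
  assert (branch_cond <= TOY_LEU);
  return toy_unsigned_cond_p (branch_cond) ? "cmpu %0,%1" : "cmp %0,%1";
}
```

In a real machine description the condition would be read from the RTL
of the following branch insn rather than passed in as an argument.
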
764: Some machines have compare-and-branch instructions and no condition
765: code. A similar technique works for them. When it is time to "output"
766: a compare instruction, record its operands in two static variables.
767: When outputting the branch-on-condition-code instruction that follows,
768: actually output a compare-and-branch instruction that uses the
769: remembered operands.
770:
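A minimal sketch of the "remember the operands" approach, assuming
operands can be represented as plain strings.  The function names, the
`saved_op0' and `saved_op1' variables, and the output syntax are all
invented for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* The two static variables that remember the compare operands.  */
static const char *saved_op0;
static const char *saved_op1;

/* "Output" a compare insn: emit nothing, just record the operands.  */
void
toy_output_compare (const char *op0, const char *op1)
{
  saved_op0 = op0;
  saved_op1 = op1;
}

/* Output the branch that follows as one compare-and-branch insn built
   from the remembered operands.  Returns the number of characters
   written, as reported by snprintf.  */
int
toy_output_branch (char *buf, size_t size, const char *cond,
                   const char *label)
{
  assert (saved_op0 != NULL && saved_op1 != NULL);
  return snprintf (buf, size, "b%s %s,%s,%s",
                   cond, saved_op0, saved_op1, label);
}
```

Calling `toy_output_compare ("r1", "r2")' and then `toy_output_branch'
with condition `eq' and label `L5' writes `beq r1,r2,L5' into the
buffer.
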
771: It also works to define patterns for compare-and-branch instructions.
772: In optimizing compilation, the pair of compare and branch instructions
773: will be combined according to these patterns. But this does not happen
774: if optimization is not requested. So you must use one of the solutions
775: above in addition to any special patterns you define.
776:
777: In many RISC machines, most instructions do not affect the condition
778: code and there may not even be a separate condition code register. On
779: these machines, the restriction that the definition and use of the
780: condition code be adjacent insns is not necessary and can prevent
781: important optimizations. For example, on the IBM RS/6000, there is a
782: delay for taken branches unless the condition code register is set three
783: instructions earlier than the conditional branch. The instruction
784: scheduler cannot perform this optimization if it is not permitted to
785: separate the definition and use of the condition code register.
786:
787: On these machines, do not use `(cc0)', but instead use a register to
788: represent the condition code. If there is a specific condition code
789: register in the machine, use a hard register. If the condition code or
790: comparison result can be placed in any general register, or if there are
791: multiple condition registers, use a pseudo register.
792:
793: On some machines, the type of branch instruction generated may
794: depend on the way the condition code was produced; for example, on the
795: 68k and Sparc, setting the condition code directly from an add or
796: subtract instruction does not clear the overflow bit the way that a test
797: instruction does, so a different branch instruction must be used for
798: some conditional branches. For machines that use `(cc0)', the set and
799: use of the condition code must be adjacent (separated only by `note'
800: insns) allowing flags in `cc_status' to be used. (*Note Condition
801: Code::.) Also, the comparison and branch insns can be located from
802: each other by using the functions `prev_cc0_setter' and `next_cc0_user'.
803:
804: However, this is not true on machines that do not use `(cc0)'. On
805: those machines, no assumptions can be made about the adjacency of the
806: compare and branch insns and the above methods cannot be used. Instead,
807: we use the machine mode of the condition code register to record
808: different formats of the condition code register.
809:
810: Registers used to store the condition code value should have a mode
811: that is in class `MODE_CC'. Normally, it will be `CCmode'. If
812: additional modes are required (as for the add example on the Sparc
813: mentioned above), define the macro `EXTRA_CC_MODES' to list the additional
814: modes required (*note Condition Code::.). Also define `EXTRA_CC_NAMES'
815: to list the names of those modes and `SELECT_CC_MODE' to choose a mode
816: given an operand of a compare.
817:
818: If it is known during RTL generation that a different mode will be
819: required (for example, if the machine has separate compare instructions
820: for signed and unsigned quantities, like most IBM processors), that mode can
821: be specified at that time.
822:
823: If the cases that require different modes would be created by
824: instruction combination, the macro `SELECT_CC_MODE' determines which
825: machine mode should be used for the comparison result. The patterns
826: should be written using that mode. To support the case of the add on
827: the Sparc discussed above, we have the pattern
828:
829: (define_insn ""
830: [(set (reg:CC_NOOV 0)
831: (compare:CC_NOOV
832: (plus:SI (match_operand:SI 0 "register_operand" "%r")
833: (match_operand:SI 1 "arith_operand" "rI"))
834: (const_int 0)))]
835: ""
836: "...")
837:
838: The `SELECT_CC_MODE' macro on the Sparc returns `CC_NOOVmode' for
839: comparisons whose argument is a `plus'.
840:
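The mode selection just described can be pictured with the toy C
function below.  It models only the single rule stated here (a `plus'
operand selects `CC_NOOVmode'); the enumerations and the function name
are assumptions for this sketch, and a real `SELECT_CC_MODE' is a macro
that operates on RTL and may distinguish more cases.

```c
#include <assert.h>

/* Toy RTL codes for the first operand of a compare.  */
enum toy_rtx_code { TOY_REG, TOY_CONST_INT, TOY_PLUS };

/* Toy condition-code modes.  */
enum toy_cc_mode { TOY_CCmode, TOY_CC_NOOVmode };

/* Choose the condition-code mode from the code of the expression
   being compared: a `plus' leaves the overflow bit unreliable, so it
   gets the "no overflow" mode; everything else gets the plain mode.  */
enum toy_cc_mode
toy_select_cc_mode (enum toy_rtx_code first_operand_code)
{
  return first_operand_code == TOY_PLUS ? TOY_CC_NOOVmode : TOY_CCmode;
}
```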
841:
842: File: gcc.info, Node: Insn Canonicalizations, Next: Peephole Definitions, Prev: Jump Patterns, Up: Machine Desc
843:
844: Canonicalization of Instructions
845: ================================
846:
847: There are often cases where multiple RTL expressions could represent
848: an operation performed by a single machine instruction. This situation
849: is most commonly encountered with logical, branch, and
850: multiply-accumulate instructions. In such cases, the compiler attempts
851: to convert these multiple RTL expressions into a single canonical form
852: to reduce the number of insn patterns required.
853:
854: In addition to algebraic simplifications, the following canonicalizations
855: are performed:
856:
857: * For commutative and comparison operators, a constant is always
858: made the second operand. If a machine only supports a constant as
859: the second operand, only patterns that match a constant in the
860: second operand need be supplied.
861:
862: For these operators, if only one operand is a `neg', `not',
863: `mult', `plus', or `minus' expression, it will be the first
864: operand.
865:
866: * For the `compare' operator, a constant is always the second operand
867: on machines where `cc0' is used (*note Jump Patterns::.). On other
868: machines, there are rare cases where the compiler might want to
869: construct a `compare' with a constant as the first operand.
870: However, these cases are not common enough for it to be worthwhile
871: to provide a pattern matching a constant as the first operand
872: unless the machine actually has such an instruction.
873:
874: An operand of `neg', `not', `mult', `plus', or `minus' is made the
875: first operand under the same conditions as above.
876:
877: * `(minus X (const_int N))' is converted to `(plus X (const_int
878: -N))'.
879:
880: * Within address computations (i.e., inside `mem'), a left shift is
881: converted into the appropriate multiplication by a power of two.
882:
883: * De Morgan's Law is used to move bitwise negation inside a bitwise
884: logical-and or logical-or operation. If this results in only one
885: operand being a `not' expression, it will be the first one.
886:
887: A machine that has an instruction that performs a bitwise
888: logical-and of one operand with the bitwise negation of the other
889: should specify the pattern for that instruction as
890:
891: (define_insn ""
892: [(set (match_operand:M 0 ...)
893: (and:M (not:M (match_operand:M 1 ...))
894: (match_operand:M 2 ...)))]
895: "..."
896: "...")
897:
898: Similarly, a pattern for a "NAND" instruction should be written
899:
900: (define_insn ""
901: [(set (match_operand:M 0 ...)
902: (ior:M (not:M (match_operand:M 1 ...))
903: (not:M (match_operand:M 2 ...))))]
904: "..."
905: "...")
906:
907: In both cases, it is not necessary to include patterns for the many
908: logically equivalent RTL expressions.
909:
910: * The only possible RTL expressions involving both bitwise
911: exclusive-or and bitwise negation are `(xor:M X Y)' and `(not:M
912: (xor:M X Y))'.
913:
914: * The sum of three items, one of which is a constant, will only
915: appear in the form
916:
917: (plus:M (plus:M X Y) CONSTANT)
918:
919: * On machines that do not use `cc0', `(compare X (const_int 0))'
920: will be converted to X.
921:
922: * Equality comparisons of a group of bits (usually a single bit)
923: with zero will be written using `zero_extract' rather than the
924: equivalent `and' or `sign_extract' operations.
925:
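The De Morgan rewriting in the list above can be checked on ordinary
integers.  The C sketch below is only an arithmetic illustration of why
`(not (and X Y))' and `(ior (not X) (not Y))' denote the same value, so
that a single pattern in the canonical form suffices; the function
names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Non-canonical NAND: (not (and X Y)).  */
uint32_t
toy_nand_direct (uint32_t x, uint32_t y)
{
  return ~(x & y);
}

/* Canonical NAND: (ior (not X) (not Y)), negation moved inside.  */
uint32_t
toy_nand_canonical (uint32_t x, uint32_t y)
{
  return ~x | ~y;
}

/* And-with-complement in canonical order: (and (not X) Y), with the
   `not' expression as the first operand.  */
uint32_t
toy_andnot (uint32_t x, uint32_t y)
{
  return ~x & y;
}
```

For any inputs, `toy_nand_direct' and `toy_nand_canonical' return the
same value, which is the reason patterns for the equivalent forms need
not be written.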
926:
927: File: gcc.info, Node: Peephole Definitions, Next: Expander Definitions, Prev: Insn Canonicalizations, Up: Machine Desc
928:
929: Machine-Specific Peephole Optimizers
930: ====================================
931:
932: In addition to instruction patterns, the `md' file may contain
933: definitions of machine-specific peephole optimizations.
934:
935: The combiner does not notice certain peephole optimizations when the
936: data flow in the program does not suggest that it should try them. For
937: example, sometimes two consecutive insns related in purpose can be
938: combined even though the second one does not appear to use a register
939: computed in the first one. A machine-specific peephole optimizer can
940: detect such opportunities.
941:
942: A definition looks like this:
943:
944: (define_peephole
945: [INSN-PATTERN-1
946: INSN-PATTERN-2
947: ...]
948: "CONDITION"
949: "TEMPLATE"
950: "OPTIONAL INSN-ATTRIBUTES")
951:
952: The last string operand may be omitted if you are not using any
953: machine-specific information in this machine description. If present,
954: it must obey the same rules as in a `define_insn'.
955:
956: In this skeleton, INSN-PATTERN-1 and so on are patterns to match
957: consecutive insns. The optimization applies to a sequence of insns when
958: INSN-PATTERN-1 matches the first one, INSN-PATTERN-2 matches the next,
959: and so on.
960:
961: Each of the insns matched by a peephole must also match a
962: `define_insn'. Peepholes are checked only at the last stage just
963: before code generation, and only optionally. Therefore, any insn which
964: would match a peephole but no `define_insn' will cause a crash in code
965: generation in an unoptimized compilation, or at various optimization
966: stages.
967:
968: The operands of the insns are matched with `match_operand',
969: `match_operator', and `match_dup', as usual. What is not usual is that
970: the operand numbers apply to all the insn patterns in the definition.
971: So, you can check for identical operands in two insns by using
972: `match_operand' in one insn and `match_dup' in the other.
973:
974: The operand constraints used in `match_operand' patterns do not have
975: any direct effect on the applicability of the peephole, but they will
976: be validated afterward, so make sure your constraints are general enough
977: to apply whenever the peephole matches. If the peephole matches but
978: the constraints are not satisfied, the compiler will crash.
979:
980: It is safe to omit constraints in all the operands of the peephole;
981: or you can write constraints which serve as a double-check on the
982: criteria previously tested.
983:
984: Once a sequence of insns matches the patterns, the CONDITION is
985: checked. This is a C expression which makes the final decision whether
986: to perform the optimization (we do so if the expression is nonzero). If
987: CONDITION is omitted (in other words, the string is empty) then the
988: optimization is applied to every sequence of insns that matches the
989: patterns.
990:
991: The defined peephole optimizations are applied after register
992: allocation is complete. Therefore, the peephole definition can check
993: which operands have ended up in which kinds of registers, just by
994: looking at the operands.
995:
996: The way to refer to the operands in CONDITION is to write
997: `operands[I]' for operand number I (as matched by `(match_operand I
998: ...)'). Use the variable `insn' to refer to the last of the insns
999: being matched; use `prev_nonnote_insn' to find the preceding insns.
1000:
1001: When optimizing computations with intermediate results, you can use
1002: CONDITION to match only when the intermediate results are not used
1003: elsewhere. Use the C expression `dead_or_set_p (INSN, OP)', where INSN
1004: is the insn in which you expect the value to be used for the last time
1005: (from the value of `insn', together with use of `prev_nonnote_insn'),
1006: and OP is the intermediate value (from `operands[I]').
1007:
1008: Applying the optimization means replacing the sequence of insns with
1009: one new insn. The TEMPLATE controls ultimate output of assembler code
1010: for this combined insn. It works exactly like the template of a
1011: `define_insn'. Operand numbers in this template are the same ones used
1012: in matching the original sequence of insns.
1013:
1014: The result of a defined peephole optimizer does not need to match
1015: any of the insn patterns in the machine description; it does not even
1016: have an opportunity to match them. The peephole optimizer definition
1017: itself serves as the insn pattern to control how the insn is output.
1018:
1019: Defined peephole optimizers are run as assembler code is being
1020: output, so the insns they produce are never combined or rearranged in
1021: any way.
1022:
1023: Here is an example, taken from the 68000 machine description:
1024:
1025: (define_peephole
1026: [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4)))
1027: (set (match_operand:DF 0 "register_operand" "=f")
1028: (match_operand:DF 1 "register_operand" "ad"))]
1029: "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
1030: "*
1031: {
1032: rtx xoperands[2];
1033: xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1);
1034: #ifdef MOTOROLA
1035: output_asm_insn (\"move.l %1,(sp)\", xoperands);
1036: output_asm_insn (\"move.l %1,-(sp)\", operands);
1037: return \"fmove.d (sp)+,%0\";
1038: #else
1039: output_asm_insn (\"movel %1,sp@\", xoperands);
1040: output_asm_insn (\"movel %1,sp@-\", operands);
1041: return \"fmoved sp@+,%0\";
1042: #endif
1043: }
1044: ")
1045:
1046: The effect of this optimization is to change
1047:
1048: jbsr _foobar
1049: addql #4,sp
1050: movel d1,sp@-
1051: movel d0,sp@-
1052: fmoved sp@+,fp0
1053:
1054: into
1055:
1056: jbsr _foobar
1057: movel d1,sp@
1058: movel d0,sp@-
1059: fmoved sp@+,fp0
1060:
1061: INSN-PATTERN-1 and so on look *almost* like the second operand of
1062: `define_insn'. There is one important difference: the second operand
1063: of `define_insn' consists of one or more RTX's enclosed in square
1064: brackets. Usually, there is only one: then the same action can be
1065: written as an element of a `define_peephole'. But when there are
1066: multiple actions in a `define_insn', they are implicitly enclosed in a
1067: `parallel'. Then you must explicitly write the `parallel', and the
1068: square brackets within it, in the `define_peephole'. Thus, if an insn
1069: pattern looks like this,
1070:
1071: (define_insn "divmodsi4"
1072: [(set (match_operand:SI 0 "general_operand" "=d")
1073: (div:SI (match_operand:SI 1 "general_operand" "0")
1074: (match_operand:SI 2 "general_operand" "dmsK")))
1075: (set (match_operand:SI 3 "general_operand" "=d")
1076: (mod:SI (match_dup 1) (match_dup 2)))]
1077: "TARGET_68020"
1078: "divsl%.l %2,%3:%0")
1079:
1080: then the way to mention this insn in a peephole is as follows:
1081:
1082: (define_peephole
1083: [...
1084: (parallel
1085: [(set (match_operand:SI 0 "general_operand" "=d")
1086: (div:SI (match_operand:SI 1 "general_operand" "0")
1087: (match_operand:SI 2 "general_operand" "dmsK")))
1088: (set (match_operand:SI 3 "general_operand" "=d")
1089: (mod:SI (match_dup 1) (match_dup 2)))])
1090: ...]
1091: ...)
1092: