Annotation of researchv10dc/cmd/gcc/internals-4, revision 1.1.1.1

1.1       root        1: 
                      2: 
                      3: File: internals,  Node: Side Effects,  Next: Incdec,  Prev: RTL Declarations,  Up: RTL
                      4: 
                      5: Side Effect Expressions
                      6: =======================
                      7: 
                      8: The expression codes described so far represent values, not actions.  But
                      9: machine instructions never produce values; they are meaningful only for
                     10: their side effects on the state of the machine.  Special expression codes
                     11: are used to represent side effects.
                     12: 
                     13: The body of an instruction is always one of these side effect codes; the
                     14: codes described above, which represent values, appear only as the operands
                     15: of these.
                     16: 
                     17: `(set LVAL X)'
                     18:      Represents the action of storing the value of X into the place
                     19:      represented by LVAL.  LVAL must be an expression representing a place
                     20:      that can be stored in: `reg' (or `subreg' or `strict_low_part'),
                     21:      `mem', `pc' or `cc0'.
                     22: 
                     23:      If LVAL is a `reg', `subreg' or `mem', it has a machine mode; then X
                     24:      must be valid for that mode.
                     25: 
                     26:      If LVAL is a `reg' whose machine mode is less than the full width of
                     27:      the register, then it means that the part of the register specified by
                     28:      the machine mode is given the specified value and the rest of the
                     29:      register receives an undefined value.  Likewise, if LVAL is a `subreg'
                     30:      whose machine mode is narrower than `SImode', the rest of the register
                     31:      can be changed in an undefined way.
                     32: 
                     33:      If LVAL is a `strict_low_part' of a `subreg', then the part of the
                     34:      register specified by the machine mode of the `subreg' is given the
                     35:      value X and the rest of the register is not changed.
                     36: 
                     37:      If LVAL is `(cc0)', it has no machine mode, and X may have any mode. 
                     38:      This represents a ``test'' or ``compare'' instruction.
                     39: 
                     40:      If LVAL is `(pc)', we have a jump instruction, and the possibilities
                     41:      for X are very limited.  It may be a `label_ref' expression
                     42:      (unconditional jump).  It may be an `if_then_else' (conditional jump),
                     43:      in which case either the second or the third operand must be `(pc)'
                     44:      (for the case which does not jump) and the other of the two must be a
                     45:      `label_ref' (for the case which does jump).  X may also be a `mem' or
                     46:      `(plus:SI (pc) Y)', where Y may be a `reg' or a `mem'; these unusual
                     47:      patterns are used to represent jumps through branch tables.
                     48: 
                     49: `(return)'
                     50:      Represents a return from the current function, on machines where this
                     51:      can be done with one instruction, such as Vaxes.  On machines where a
                     52:      multi-instruction ``epilogue'' must be executed in order to return
                     53:      from the function, returning is done by jumping to a label which
                     54:      precedes the epilogue, and the `return' expression code is never used.
                     55: 
                     56: `(call FUNCTION NARGS)'
                     57:      Represents a function call.  FUNCTION is a `mem' expression whose
                     58:      address is the address of the function to be called.  NARGS is an
                     59:      expression representing the number of words of argument.
                     60: 
                     61:      Each machine has a standard machine mode which FUNCTION must have. 
                     62:      The machine description defines macro `FUNCTION_MODE' to expand into
                     63:      the requisite mode name.  The purpose of this mode is to specify what
                     64:      kind of addressing is allowed, on machines where the allowed kinds of
                     65:      addressing depend on the machine mode being addressed.
                     66: 
                     67: `(clobber X)'
                     68:      Represents the storing or possible storing of an unpredictable,
                     69:      undescribed value into X, which must be a `reg' or `mem' expression.
                     70: 
                     71:      One place this is used is in string instructions that store standard
                     72:      values into particular hard registers.  It may not be worth the
                     73:      trouble to describe the values that are stored, but it is essential to
                     74:      inform the compiler that the registers will be altered, lest it
                     75:      attempt to keep data in them across the string instruction.
                     76: 
                     77:      X may also be null---a null C pointer, no expression at all.  Such a
                     78:      `(clobber (null))' expression means that all memory locations must be
                     79:      presumed clobbered.
                     80: 
                     81:      Note that the machine description classifies certain hard registers as
                     82:      ``call-clobbered''.  All function call instructions are assumed by
                     83:      default to clobber these registers, so there is no need to use
                     84:      `clobber' expressions to indicate this fact.  Also, each function call
                     85:      is assumed to have the potential to alter any memory location.
                     86: 
                     87: `(use X)'
                     88:      Represents the use of the value of X.  It indicates that the value in
                     89:      X at this point in the program is needed, even though it may not be
                     90:      apparent why this is so.  Therefore, the compiler will not attempt to
                     91:      delete instructions whose only effect is to store a value in X.  X
                     92:      must be a `reg' expression.
                     93: 
                     94: `(parallel [X0 X1 ...])'
                     95:      Represents several side effects performed in parallel.  The square
                     96:      brackets stand for a vector; the operand of `parallel' is a vector of
                     97:      expressions.  X0, X1 and so on are individual side
                     98:      effects---expressions of code `set', `call', `return', `clobber' or
                     99:      `use'.
                    100: 
                    101:      ``In parallel'' means that first all the values used in the individual
                    102:      side-effects are computed, and second all the actual side-effects are
                    103:      performed.  For example,
                    104: 
                    105:           (parallel [(set (reg:SI 1) (mem:SI (reg:SI 1)))
                    106:                      (set (mem:SI (reg:SI 1)) (reg:SI 1))])
                    107: 
                    108:      says unambiguously that the values of hard register 1 and the memory
                    109:      location addressed by it are interchanged.  In both places where
                    110:      `(reg:SI 1)' appears as a memory address it refers to the value in
                    111:      register 1 *before* the execution of the instruction.
                    112: 
                    113: `(sequence [INSNS ...])'
                    114:      Represents a sequence of insns.  Each of the INSNS that appears in the
                    115:      vector is suitable for appearing in the chain of insns, so it must be
                    116:      an `insn', `jump_insn', `call_insn', `code_label', `barrier' or `note'.
                    117: 
                    118:      A `sequence' RTX never appears in an actual insn.  It represents the
                    119:      sequence of insns that result from a `define_expand' *before* those
                    120:      insns are passed to `emit_insn' to insert them in the chain of insns. 
                    121:      When actually inserted, the individual sub-insns are separated out and
                    122:      the `sequence' is forgotten.
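
The ``in parallel'' rule given for `parallel' above can be sketched in C.
This is only an illustration under stated assumptions: the `toy_machine'
structure, its one register and its eight-word memory are hypothetical
stand-ins, not anything defined by the compiler.  It mimics the register
and memory interchange from the example: every value and address is read
from the old state first, and only then are the stores performed.

```c
#include <assert.h>

/* Hypothetical toy machine state: one hard register and a small
   word-addressed memory.  Not part of the compiler. */
struct toy_machine {
    int reg1;
    int mem[8];
};

/* Mimic
     (parallel [(set (reg:SI 1) (mem:SI (reg:SI 1)))
                (set (mem:SI (reg:SI 1)) (reg:SI 1))])
   Step 1: compute every value and address from the OLD state.
   Step 2: perform all the stores. */
void toy_parallel_swap(struct toy_machine *m)
{
    int addr = m->reg1;          /* address used by both mem references */
    int new_reg = m->mem[addr];  /* old value of (mem:SI (reg:SI 1)) */
    int new_mem = m->reg1;       /* old value of (reg:SI 1) */

    m->reg1 = new_reg;
    m->mem[addr] = new_mem;
}

/* Returns 1 if register and memory contents were interchanged. */
int toy_parallel_swap_demo(void)
{
    struct toy_machine m = { 3, { 0, 0, 0, 42, 0, 0, 0, 0 } };

    toy_parallel_swap(&m);
    return m.reg1 == 42 && m.mem[3] == 3;
}
```

Had the two stores been performed one after the other, the second would
have used the new value of the register as its address, which is exactly
what `parallel' rules out.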
                    123: 
                    124: Three expression codes appear in place of a side effect, as the body of an
                    125: insn, though strictly speaking they do not describe side effects as such:
                    126: 
                    127: `(asm_input S)'
                    128:      Represents literal assembler code as described by the string S.
                    129: 
                    130: `(addr_vec:M [LR0 LR1 ...])'
                    131:      Represents a table of jump addresses.  The vector elements LR0, etc.,
                    132:      are `label_ref' expressions.  The mode M specifies how much space is
                    133:      given to each address; normally M would be `Pmode'.
                    134: 
                    135: `(addr_diff_vec:M BASE [LR0 LR1 ...])'
                    136:      Represents a table of jump addresses expressed as offsets from BASE. 
                    137:      The vector elements LR0, etc., are `label_ref' expressions and so is
                    138:      BASE.  The mode M specifies how much space is given to each
                    139:      address-difference.
                    140: 
                    141: 
                    142: File: internals,  Node: Incdec,  Next: Assembler,  Prev: Side Effects,  Up: RTL
                    143: 
                    144: Embedded Side-Effects on Addresses
                    145: ==================================
                    146: 
                    147: Four special side-effect expression codes appear as memory addresses.
                    148: 
                    149: `(pre_dec:M X)'
                    150:      Represents the side effect of decrementing X by a standard amount and
                    151:      represents also the value that X has after being decremented.  X must
                    152:      be a `reg' or `mem', but most machines allow only a `reg'.  M must be
                    153:      the machine mode for pointers on the machine in use.  The amount X is
                    154:      decremented by is the length in bytes of the machine mode of the
                    155:      containing memory reference of which this expression serves as the
                    156:      address.  Here is an example of its use:
                    157: 
                    158:           (mem:DF (pre_dec:SI (reg:SI 39)))
                    159: 
                    160:      This says to decrement pseudo register 39 by the length of a `DFmode'
                    161:      value and use the result to address a `DFmode' value.
                    162: 
                    163: `(pre_inc:M X)'
                    164:      Similar, but specifies incrementing X instead of decrementing it.
                    165: 
                    166: `(post_dec:M X)'
                    167:      Represents the same side effect as `pre_dec' but a different
                    168:      value.  The value represented here is the value X has before being
                    169:      decremented.
                    170: 
                    171: `(post_inc:M X)'
                    172:      Similar, but specifies incrementing X instead of decrementing it.
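
The difference between the four codes above can be sketched in C.  This is
a minimal illustration, not compiler code: the helper names are
hypothetical, the register is modeled as a plain integer, and SIZE stands
for the length in bytes of the mode of the containing memory reference (8
is assumed for `DFmode' in the usage note below).  Each helper applies the
side effect to the register and returns the address that the memory
reference actually uses.

```c
#include <stddef.h>

/* Hypothetical helpers.  *x models the address register X; size is the
   length in bytes of the mode of the enclosing mem. */

/* (pre_dec X): decrement first, then address with the new value. */
size_t toy_pre_dec(size_t *x, size_t size)
{
    *x -= size;
    return *x;
}

/* (pre_inc X): increment first, then address with the new value. */
size_t toy_pre_inc(size_t *x, size_t size)
{
    *x += size;
    return *x;
}

/* (post_dec X): address with the old value, then decrement. */
size_t toy_post_dec(size_t *x, size_t size)
{
    size_t old = *x;

    *x -= size;
    return old;
}

/* (post_inc X): address with the old value, then increment. */
size_t toy_post_inc(size_t *x, size_t size)
{
    size_t old = *x;

    *x += size;
    return old;
}
```

With a register holding 100 and an assumed 8-byte `DFmode', `toy_pre_dec'
yields address 92 and leaves 92 in the register, while `toy_post_inc'
yields address 100 and leaves 108 in the register.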
                    173: 
                    174: These embedded side effect expressions must be used with care.  Instruction
                    175: patterns may not use them.  Until the `flow' pass of the compiler, they may
                    176: occur only to represent pushes onto the stack.  The `flow' pass finds cases
                    177: where registers are incremented or decremented in one instruction and used
                    178: as an address shortly before or after; these cases are then transformed to
                    179: use pre- or post-increment or -decrement.
                    180: 
                    181: Explicit popping of the stack could be represented with these embedded side
                    182: effect operators, but that would not be safe; the instruction combination
                    183: pass could move the popping past pushes, thus changing the meaning of the
                    184: code.
                    185: 
                    186: An instruction that can be represented with an embedded side effect could
                    187: also be represented using `parallel' containing an additional `set' to
                    188: describe how the address register is altered.  This is not done because
                    189: machines that allow these operations at all typically allow them wherever a
                    190: memory address is called for.  Describing them as additional parallel
                    191: stores would require doubling the number of entries in the machine
                    192: description.
                    193: 
                    194: 
                    195: File: internals,  Node: Assembler,  Next: Insns,  Prev: Incdec,  Up: RTL
                    196: 
                    197: Assembler Instructions as Expressions
                    198: =====================================
                    199: 
                    200: The RTX code `asm_operands' represents a value produced by a user-specified
                    201: assembler instruction.  It is used to represent an `asm' statement with
                    202: arguments.  An `asm' statement with a single output operand, like this:
                    203: 
                    204:      asm ("foo %1,%2,%0" : "a" (outputvar) : "g" (x + y), "di" (*z));
                    205: 
                    206: is represented using a single `asm_operands' RTX which represents the value
                    207: that is stored in `outputvar':
                    208: 
                    209:      (set RTX-FOR-OUTPUTVAR
                    210:           (asm_operands "foo %1,%2,%0" "a" 0
                    211:                         [RTX-FOR-ADDITION-RESULT RTX-FOR-*Z]
                    212:                         [(asm_input:M1 "g")
                    213:                          (asm_input:M2 "di")]))
                    214: 
                    215: Here the operands of the `asm_operands' RTX are the assembler template
                    216: string, the output-operand's constraint, the index-number of the output
                    217: operand among the output operands specified, a vector of input operand
                    218: RTX's, and a vector of input-operand modes and constraints.  The mode M1 is
                    219: the mode of the sum `x+y'; M2 is that of `*z'.
                    220: 
                    221: When an `asm' statement has multiple output values, its insn has several
                    222: such `set' RTX's inside of a `parallel'.  Each `set' contains an
                    223: `asm_operands'; all of these share the same assembler template and vectors,
                    224: but each contains the constraint for the respective output operand.  They
                    225: are also distinguished by the output-operand index number, which is 0, 1,
                    226: ... for successive output operands.
                    227: 
                    228: 
                    229: File: internals,  Node: Insns,  Next: Calls,  Prev: Assembler,  Up: RTL
                    230: 
                    231: Insns
                    232: =====
                    233: 
                    234: The RTL representation of the code for a function is a doubly-linked chain
                    235: of objects called "insns".  Insns are expressions with special codes that
                    236: are used for no other purpose.  Some insns are actual instructions; others
                    237: represent dispatch tables for `switch' statements; others represent labels
                    238: to jump to or various sorts of declarative information.
                    239: 
                    240: In addition to its own specific data, each insn must have a unique
                    241: id-number that distinguishes it from all other insns in the current
                    242: function, and chain pointers to the preceding and following insns.  These
                    243: three fields occupy the same position in every insn, independent of the
                    244: expression code of the insn.  They could be accessed with `XEXP' and
                    245: `XINT', but instead three special macros are always used:
                    246: 
                    247: `INSN_UID (I)'
                    248:      Accesses the unique id of insn I.
                    249: 
                    250: `PREV_INSN (I)'
                    251:      Accesses the chain pointer to the insn preceding I.  If I is the first
                    252:      insn, this is a null pointer.
                    253: 
                    254: `NEXT_INSN (I)'
                    255:      Accesses the chain pointer to the insn following I.  If I is the last
                    256:      insn, this is a null pointer.
                    257: 
                    258: The `NEXT_INSN' and `PREV_INSN' pointers must always correspond: if INSN is
                    259: not the first insn,
                    260: 
                    261:      NEXT_INSN (PREV_INSN (INSN)) == INSN
                    262: 
                    263: is always true.
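
The correspondence rule can be sketched as a consistency check over a
doubly-linked chain.  This is an illustration only: `struct toy_insn' and
its fields are hypothetical stand-ins for a real insn and for the
`INSN_UID', `PREV_INSN' and `NEXT_INSN' accessors, which are not used
here.

```c
#include <stddef.h>

/* Hypothetical stand-in for an insn: only the three standard fields. */
struct toy_insn {
    int uid;                /* plays the role of INSN_UID */
    struct toy_insn *prev;  /* plays the role of PREV_INSN */
    struct toy_insn *next;  /* plays the role of NEXT_INSN */
};

/* Returns 1 if every link in the chain starting at FIRST is matched by
   the opposite link of its neighbor, 0 otherwise. */
int toy_chain_is_consistent(const struct toy_insn *first)
{
    const struct toy_insn *i;

    if (first != NULL && first->prev != NULL)
        return 0;           /* the first insn has a null back pointer */

    for (i = first; i != NULL; i = i->next) {
        if (i->prev != NULL && i->prev->next != i)
            return 0;
        if (i->next != NULL && i->next->prev != i)
            return 0;
    }
    return 1;
}

/* Builds a three-insn chain, checks it, then breaks one back pointer.
   Returns 1 if the checker accepts the good chain and rejects the bad. */
int toy_chain_demo(void)
{
    struct toy_insn a = { 1, NULL, NULL };
    struct toy_insn b = { 2, NULL, NULL };
    struct toy_insn c = { 3, NULL, NULL };

    a.next = &b;
    b.prev = &a;
    b.next = &c;
    c.prev = &b;
    if (!toy_chain_is_consistent(&a))
        return 0;

    c.prev = &a;            /* now the insn after b does not point back to b */
    return !toy_chain_is_consistent(&a);
}
```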
                    264: 
                    265: Every insn has one of the following six expression codes:
                    266: 
                    267: `insn'
                    268:      The expression code `insn' is used for instructions that do not jump
                    269:      and do not do function calls.  Insns with code `insn' have four
                    270:      additional fields beyond the three mandatory ones listed above.  These
                    271:      four are described in a table below.
                    272: 
                    273: `jump_insn'
                    274:      The expression code `jump_insn' is used for instructions that may jump
                    275:      (or, more generally, may contain `label_ref' expressions). 
                    276:      `jump_insn' insns have the same extra fields as `insn' insns, accessed
                    277:      in the same way.
                    278: 
                    279: `call_insn'
                    280:      The expression code `call_insn' is used for instructions that may do
                    281:      function calls.  It is important to distinguish these instructions
                    282:      because they imply that certain registers and memory locations may be
                    283:      altered unpredictably.
                    284: 
                    285:      `call_insn' insns have the same extra fields as `insn' insns, accessed
                    286:      in the same way.
                    287: 
                    288: `code_label'
                    289:      A `code_label' insn represents a label that a jump insn can jump to. 
                    290:      It contains one special field of data in addition to the three
                    291:      standard ones.  It is used to hold the "label number", a number that
                    292:      identifies this label uniquely among all the labels in the compilation
                    293:      (not just in the current function).  Ultimately, the label is
                    294:      represented in the assembler output as an assembler label `LN' where N
                    295:      is the label number.
                    296: 
                    297: `barrier'
                    298:      Barriers are placed in the instruction stream after unconditional jump
                    299:      instructions to indicate that the jumps are unconditional.  They
                    300:      contain no information beyond the three standard fields.
                    301: 
                    302: `note'
                    303:      `note' insns are used to represent additional debugging and
                    304:      declarative information.  They contain two nonstandard fields, an
                    305:      integer which is accessed with the macro `NOTE_LINE_NUMBER' and a
                    306:      string accessed with `NOTE_SOURCE_FILE'.
                    307: 
                    308:      If `NOTE_LINE_NUMBER' is positive, the note represents the position of
                    309:      a source line and `NOTE_SOURCE_FILE' is the source file name that the
                    310:      line came from.  These notes control generation of line number data in
                    311:      the assembler output.
                    312: 
                    313:      Otherwise, `NOTE_LINE_NUMBER' is not really a line number but a code
                    314:      with one of the following values (and `NOTE_SOURCE_FILE' must contain
                    315:      a null pointer):
                    316: 
                    317:      `NOTE_INSN_DELETED'
                    318:           Such a note is completely ignorable.  Some passes of the compiler
                    319:           delete insns by altering them into notes of this kind.
                    320: 
                    321:      `NOTE_INSN_BLOCK_BEG'
                    322:      `NOTE_INSN_BLOCK_END'
                    323:           These types of notes indicate the position of the beginning and
                    324:           end of a level of scoping of variable names.  They control the
                    325:           output of debugging information.
                    326: 
                    327:      `NOTE_INSN_LOOP_BEG'
                    328:      `NOTE_INSN_LOOP_END'
                    329:           These types of notes indicate the position of the beginning and
                    330:           end of a `while' or `for' loop.  They enable the loop optimizer
                    331:           to find loops quickly.
                    332: 
                    333: Here is a table of the extra fields of `insn', `jump_insn' and `call_insn'
                    334: insns:
                    335: 
                    336: `PATTERN (I)'
                    337:      An expression for the side effect performed by this insn.
                    338: 
                    339: `REG_NOTES (I)'
                    340:      A list (chain of `expr_list' expressions) giving information about the
                    341:      usage of registers in this insn.  This list is set up by the flow
                    342:      analysis pass; it is a null pointer until then.
                    343: 
                    344: `LOG_LINKS (I)'
                    345:      A list (chain of `insn_list' expressions) of previous ``related''
                    346:      insns: insns which store into registers values that are used for the
                    347:      first time in this insn.  (An additional constraint is that neither a
                    348:      jump nor a label may come between the related insns).  This list is
                    349:      set up by the flow analysis pass; it is a null pointer until then.
                    350: 
                    351: `INSN_CODE (I)'
                    352:      An integer that says which pattern in the machine description matches
                    353:      this insn, or -1 if the matching has not yet been attempted.
                    354: 
                    355:      Such matching is never attempted and this field is not used on an insn
                    356:      whose pattern consists of a single `use', `clobber', `asm_input', `addr_vec'
                    357:      or `addr_diff_vec' expression.
                    358: 
                    359: The `LOG_LINKS' field of an insn is a chain of `insn_list' expressions. 
                    360: Each of these has two operands: the first is an insn, and the second is
                    361: another `insn_list' expression (the next one in the chain).  The last
                    362: `insn_list' in the chain has a null pointer as second operand.  The
                    363: significant thing about the chain is which insns appear in it (as first
                    364: operands of `insn_list' expressions).  Their order is not significant.
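
Since only membership matters in such a chain, a natural operation is to
ask whether a given insn appears in it.  The following is a minimal
sketch under stated assumptions: `struct toy_insn_list' is a hypothetical
model of an `insn_list' expression in which the first operand is reduced
to the insn's unique id.

```c
#include <stddef.h>

/* Hypothetical model of an insn_list expression: the first operand is
   an insn (reduced here to its unique id), the second is the next
   insn_list in the chain or a null pointer. */
struct toy_insn_list {
    int insn_uid;
    const struct toy_insn_list *next;
};

/* Returns 1 if the insn with id UID appears anywhere in LINKS.
   The position in the chain is deliberately ignored. */
int toy_links_contain(const struct toy_insn_list *links, int uid)
{
    for (; links != NULL; links = links->next)
        if (links->insn_uid == uid)
            return 1;
    return 0;
}

/* Returns 1 if lookups on a two-element chain behave as expected. */
int toy_links_demo(void)
{
    struct toy_insn_list last = { 7, NULL };
    struct toy_insn_list first = { 12, &last };

    return toy_links_contain(&first, 7)
           && toy_links_contain(&first, 12)
           && !toy_links_contain(&first, 9);
}
```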
                    365: 
                    366: The `REG_NOTES' field of an insn is a similar chain but of `expr_list'
                    367: expressions instead of `insn_list'.  There are six kinds of register
                    368: notes, which are distinguished by the machine mode of the `expr_list',
                    369: which in a register note is really understood as being an `enum reg_note'.
                    370: The first operand OP of the `expr_list' is data whose meaning depends on
                    371: the kind of note.  Here are the six kinds:
                    372: 
                    373: `REG_DEAD'
                    374:      The register OP dies in this insn; that is to say, altering the value
                    375:      immediately after this insn would not affect the future behavior of
                    376:      the program.
                    377: 
                    378: `REG_INC'
                    379:      The register OP is incremented (or decremented; at this level there is
                    380:      no distinction) by an embedded side effect inside this insn.  This
                    381:      means it appears in a `POST_INC', `PRE_INC', `POST_DEC' or `PRE_DEC'
                    382:      RTX.
                    383: 
                    384: `REG_EQUIV'
                    385:      The register that is set by this insn will be equal to OP at run time,
                    386:      and could validly be replaced in all its occurrences by OP. 
                    387:      (``Validly'' here refers to the data flow of the program; simple
                    388:      replacement may make some insns invalid.)
                    389: 
                    390:      The value which the insn explicitly copies into the register may look
                    391:      different from OP, but they will be equal at run time.
                    392: 
                    393:      For example, when a constant is loaded into a register that is never
                    394:      assigned any other value, this kind of note is used.
                    395: 
                    396:      When a parameter is copied into a pseudo-register at entry to a
                    397:      function, a note of this kind records that the register is equivalent
                    398:      to the stack slot where the parameter was passed.  Although in this
                    399:      case the register may be set by other insns, it is still valid to
                    400:      replace the register by the stack slot throughout the function.
                    401: 
                    402: `REG_EQUAL'
                    403:      The register that is set by this insn will be equal to OP at run time
                    404:      at the end of this insn (but not necessarily elsewhere in the function).
                    405: 
                    406:      The RTX OP is typically an arithmetic expression.  For example, when a
                    407:      sequence of insns such as a library call is used to perform an
                    408:      arithmetic operation, this kind of note is attached to the insn that
                    409:      produces or copies the final value.  It tells the CSE pass how to
                    410:      think of that value.
                    411: 
                    412: `REG_RETVAL'
                    413:      This insn copies the value of a library call, and OP is the first insn
                    414:      that was generated to set up the arguments for the library call.
                    415: 
                    416:      Flow analysis uses this note to delete all of a library call whose
                    417:      result is dead.
                    418: 
                    419: `REG_WAS_0'
                    420:      The register OP contained zero before this insn.  You can rely on this
                    421:      note if it is present; its absence implies nothing.
                    422: 
                    423: (The only difference between the expression codes `insn_list' and
                    424: `expr_list' is that the first operand of an `insn_list' is assumed to be an
                    425: insn and is printed in debugging dumps as the insn's unique id; the first
                    426: operand of an `expr_list' is printed in the ordinary way as an expression.)
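
Looking up a register note of a particular kind can be sketched as a walk
along the `expr_list' chain.  This is an illustration only: the
enumeration, the structure and the function below are hypothetical, the
kind is stored in an ordinary field rather than in the machine-mode field,
and OP is reduced to a register number.

```c
#include <stddef.h>

/* Hypothetical mirror of the kinds of register notes listed above. */
enum toy_reg_note {
    TOY_REG_DEAD,
    TOY_REG_INC,
    TOY_REG_EQUIV,
    TOY_REG_EQUAL,
    TOY_REG_RETVAL,
    TOY_REG_WAS_0
};

/* Hypothetical model of one expr_list element in a REG_NOTES chain. */
struct toy_reg_note_list {
    enum toy_reg_note kind;  /* kept in the mode field in the real RTX */
    int op;                  /* stand-in for OP: a register number */
    const struct toy_reg_note_list *next;
};

/* Returns the first note of kind KIND whose operand is OP, or a null
   pointer if the chain holds no such note. */
const struct toy_reg_note_list *
toy_find_note(const struct toy_reg_note_list *notes,
              enum toy_reg_note kind, int op)
{
    for (; notes != NULL; notes = notes->next)
        if (notes->kind == kind && notes->op == op)
            return notes;
    return NULL;
}

/* Returns 1 if lookups on a two-note chain behave as expected. */
int toy_notes_demo(void)
{
    struct toy_reg_note_list inc = { TOY_REG_INC, 39, NULL };
    struct toy_reg_note_list dead = { TOY_REG_DEAD, 5, &inc };

    return toy_find_note(&dead, TOY_REG_INC, 39) == &inc
           && toy_find_note(&dead, TOY_REG_DEAD, 5) == &dead
           && toy_find_note(&dead, TOY_REG_DEAD, 39) == NULL;
}
```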
                    427: 
                    428: 
                    429: File: internals,  Node: Calls,  Next: Sharing,  Prev: Insns,  Up: RTL
                    430: 
                    431: RTL Representation of Function-Call Insns
                    432: =========================================
                    433: 
                    434: Insns that call subroutines have the RTL expression code `call_insn'. 
                    435: These insns must satisfy special rules, and their bodies must use a special
                    436: RTL expression code, `call'.
                    437: 
                    438: A `call' expression has two operands, as follows:
                    439: 
                    440:      (call (mem:FM ADDR) NBYTES)
                    441: 
                    442: Here NBYTES is an operand that represents the number of bytes of argument
                    443: data being passed to the subroutine, FM is a machine mode (which must be
                    444: the same as the definition of the `FUNCTION_MODE' macro in the machine
                    445: description) and ADDR represents the address of the subroutine.
                    446: 
                    447: For a subroutine that returns no value, the `call' RTX as shown above is
                    448: the entire body of the insn.
                    449: 
                    450: For a subroutine that returns a value whose mode is not `BLKmode', the
                    451: value is returned in a hard register.  If this register's number is R, then
                    452: the body of the call insn looks like this:
                    453: 
                    454:      (set (reg:M R)
                    455:           (call (mem:FM ADDR) NBYTES))
                    456: 
                    457: This RTL expression makes it clear (to the optimizer passes) that the
                    458: appropriate register receives a useful value in this insn.
                    459: 
                    460: Immediately after RTL generation, if the value of the subroutine is
                    461: actually used, this call insn is always followed closely by an insn which
                    462: refers to the register R.  This remains true through all the optimizer
                    463: passes until cross jumping occurs.
                    464: 
                    465: The following insn has one of two forms.  Either it copies the value into a
                    466: pseudo-register, like this:
                    467: 
                    468:      (set (reg:M P) (reg:M R))
                    469: 
                    470: or (in the case where the calling function will simply return whatever
                    471: value the call produced, and no operation is needed to do this):
                    472: 
                    473:      (use (reg:M R))
                    474: 
                    475: Between the call insn and this following insn there may intervene only a
                    476: stack-adjustment insn (and perhaps some `note' insns).
                    477: 
                    478: When a subroutine returns a `BLKmode' value, it is handled by passing to
                    479: the subroutine the address of a place to store the value.  So the call insn
                    480: itself does not ``return'' any value, and it has the same RTL form as a
                    481: call that returns nothing.
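
As a concrete sketch of the sequence described above, the bodies of the two
insns might look like this.  The hard register number 0, the pseudo-register
number 57, the modes, the `const_int' used for NBYTES and the symbol name `f'
are all illustrative assumptions; the real values depend on the target
machine and on the function being compiled.

     ;; Body of the call insn: the value comes back in hard register 0.
     (set (reg:SI 0)
          (call (const_int 8) (mem:QI (symbol_ref "f"))))

     ;; Body of the following insn: copy the value into pseudo-register 57.
     (set (reg:SI 57) (reg:SI 0))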
                    482: 
                    483: 
                    484: File: internals,  Node: Sharing,  Prev: Calls,  Up: RTL
                    485: 
                    486: Structure Sharing Assumptions
                    487: =============================
                    488: 
                    489: The compiler assumes that certain kinds of RTL expressions are unique;
                    490: there do not exist two distinct objects representing the same value.  In
                    491: other cases, it makes an opposite assumption: that no RTL expression object
                    492: of a certain kind appears in more than one place in the containing structure.
                    493: 
                    494: These assumptions refer to a single function; except for the RTL objects
                    495: that describe global variables and external functions, no RTL objects are
                    496: common to two functions.
                    497: 
                    498:    * Each pseudo-register has only a single `reg' object to represent it,
                    499:      and therefore only a single machine mode.
                    500: 
                    501:    * For any symbolic label, there is only one `symbol_ref' object
                    502:      referring to it.
                    503: 
                    504:    * There is only one `const_int' expression with value zero, and only one
                    505:      with value one.
                    506: 
                    507:    * There is only one `pc' expression.
                    508: 
                    509:    * There is only one `cc0' expression.
                    510: 
                    511:    * There is only one `const_double' expression with mode `SFmode' and
                    512:      value zero, and only one with mode `DFmode' and value zero.
                    513: 
                    514:    * No `label_ref' appears in more than one place in the RTL structure; in
                    515:      other words, it is safe to do a tree-walk of all the insns in the
                    516:      function and assume that each time a `label_ref' is seen it is
                    517:      distinct from all others that are seen.
                    518: 
                    519:    * Only one `mem' object is normally created for each static variable or
                    520:      stack slot, so these objects are frequently shared in all the places
                    521:      they appear.  However, separate but equal objects for these variables
                    522:      are occasionally made.
                    523: 
                    524:    * No RTL object appears in more than one place in the RTL structure
                    525:      except as described above.  Many passes of the compiler rely on this
                    526:      by assuming that they can modify RTL objects in place without unwanted
                    527:      side-effects on other insns.
                    528: 
                    529:    * During initial RTL generation, shared structure is freely introduced. 
                    530:      After all the RTL for a function has been generated, all shared
                    531:      structure is copied by `unshare_all_rtl' in `emit-rtl.c', after which
                    532:      the above rules are guaranteed to be followed.
                    533: 
                    534:    * During the combiner pass, shared structure with an insn can exist
                    535:      temporarily.  However, the shared structure is copied before the
                    536:      combiner is finished with the insn.  This is done by
                    537:      `copy_substitutions' in `combine.c'.
                    538: 
                    539: 
                    540: File: internals,  Node: Machine Desc,  Next: Machine Macros,  Prev: RTL,  Up: Top
                    541: 
                    542: Machine Descriptions
                    543: ********************
                    544: 
                    545: A machine description has two parts: a file of instruction patterns (`.md'
                    546: file) and a C header file of macro definitions.
                    547: 
                    548: The `.md' file for a target machine contains a pattern for each instruction
                    549: that the target machine supports (or at least each instruction that is
                    550: worth telling the compiler about).  It may also contain comments.  A
                    551: semicolon causes the rest of the line to be a comment, unless the semicolon
                    552: is inside a quoted string.
                    553: 
                    554: See the next chapter for information on the C header file.
                    555: 
                    556: * Menu:
                    557: 
                    558: * Patterns::            How to write instruction patterns.
                    559: * Example::             An explained example of a `define_insn' pattern.
                    560: * RTL Template::        The RTL template defines what insns match a pattern.
                    561: * Output Template::     The output template says how to make assembler code
                    562:                           from such an insn.
                    563: * Output Statement::    For more generality, write C code to output 
                    564:                           the assembler code.
                    565: * Constraints::         When not all operands are general operands.
                    566: * Standard Names::      Names mark patterns to use for code generation.
                    567: * Pattern Ordering::    When the order of patterns makes a difference.
                    568: * Dependent Patterns::  Having one pattern may make you need another.
                    569: * Jump Patterns::       Special considerations for patterns for jump insns.
                    570: * Peephole Definitions::Defining machine-specific peephole optimizations.
                    571: * Expander Definitions::Generating a sequence of several RTL insns
                    572:                          for a standard operation.
                    573: 
                    574: 
                    575: 
                    576: File: internals,  Node: Patterns,  Next: Example,  Prev: Machine Desc,  Up: Machine Desc
                    577: 
                    578: Everything about Instruction Patterns
                    579: =====================================
                    580: 
                    581: Each instruction pattern contains an incomplete RTL expression, with pieces
                    582: to be filled in later, operand constraints that restrict how the pieces can
                    583: be filled in, and an output pattern or C code to generate the assembler
                    584: output, all wrapped up in a `define_insn' expression.
                    585: 
                    586: A `define_insn' is an RTL expression containing four operands:
                    587: 
                    588:   1. An optional name.  The presence of a name indicates that this instruction
                    589:      pattern can perform a certain standard job for the RTL-generation pass
                    590:      of the compiler.  This pass knows certain names and will use the
                    591:      instruction patterns with those names, if the names are defined in the
                    592:      machine description.
                    593: 
                    594:      The absence of a name is indicated by writing an empty string where
                    595:      the name should go.  Nameless instruction patterns are never used for
                    596:      generating RTL code, but they may permit several simpler insns to be
                    597:      combined later on.
                    598: 
                    599:      Names that are not thus known and used in RTL-generation have no
                    600:      effect; they are equivalent to no name at all.
                    601: 
                    602:   2. The "RTL template" (*Note RTL Template::.) is a vector of incomplete RTL
                    603:      expressions which show what the instruction should look like.  It is
                    604:      incomplete because it may contain `match_operand' and `match_dup'
                    605:      expressions that stand for operands of the instruction.
                    606: 
                    607:      If the vector has only one element, that element is what the
                    608:      instruction should look like.  If the vector has multiple elements,
                    609:      then the instruction looks like a `parallel' expression containing
                    610:      that many elements as described.
                    611: 
                    612:   3. A condition.  This is a string which contains a C expression that is the
                    613:      final test to decide whether an insn body matches this pattern.
                    614: 
                    615:      For a named pattern, the condition (if present) may not depend on the
                    616:      data in the insn being matched, but only on the target-machine-type
                    617:      flags.  The compiler needs to test these conditions during
                    618:      initialization in order to learn exactly which named instructions are
                    619:      available in a particular run.
                    620: 
                    621:      For nameless patterns, the condition is applied only when matching an
                    622:      individual insn, and only after the insn has matched the pattern's
                    623:      recognition template.  The insn's operands may be found in the vector
                    624:      `operands'.
                    625: 
                    626:   4. The "output template": a string that says how to output matching insns
                    627:      as assembler code.  `%' in this string specifies where to substitute
                    628:      the value of an operand.  *note Output Template::.
                    629: 
                    630:      When simple substitution isn't general enough, you can specify a piece
                    631:      of C code to compute the output.  *note Output Statement::.
                    632: 
                    633: 
                    634: File: internals,  Node: Example,  Next: RTL Template,  Prev: Patterns,  Up: Machine Desc
                    635: 
                    636: Example of `define_insn'
                    637: ========================
                    638: 
                    639: Here is an actual example of an instruction pattern, for the 68000/68020.
                    640: 
                    641:      (define_insn "tstsi"
                    642:        [(set (cc0)
                    643:              (match_operand:SI 0 "general_operand" "rm"))]
                    644:        ""
                    645:        "*
                    646:      { if (TARGET_68020 || ! ADDRESS_REG_P (operands[0]))
                    647:          return \"tstl %0\";
                    648:        return \"cmpl #0,%0\"; }")
                    649: 
                    650: This is an instruction that sets the condition codes based on the value of
                    651: a general operand.  It has no condition, so any insn whose RTL description
                    652: has the form shown may be handled according to this pattern.  The name
                    653: `tstsi' means ``test a `SImode' value'' and tells the RTL generation pass
                    654: that, when it is necessary to test such a value, an insn to do so can be
                    655: constructed using this pattern.
                    656: 
                    657: The output control string is a piece of C code which chooses which output
                    658: template to return based on the kind of operand and the specific type of
                    659: CPU for which code is being generated.
                    660: 
                    661: `"rm"' is an operand constraint.  Its meaning is explained below.
                    662: 
                    663: 
                    664: File: internals,  Node: RTL Template,  Next: Output Template,  Prev: Example,  Up: Machine Desc
                    665: 
                    666: RTL Template for Generating and Recognizing Insns
                    667: =================================================
                    668: 
                    669: The RTL template is used to define which insns match the particular pattern
                    670: and how to find their operands.  For named patterns, the RTL template also
                    671: says how to construct an insn from specified operands.
                    672: 
                    673: Construction involves substituting specified operands into a copy of the
                    674: template.  Matching involves determining the values that serve as the
                    675: operands in the insn being matched.  Both of these activities are
                    676: controlled by special expression types that direct matching and
                    677: substitution of the operands.
                    678: 
                    679: `(match_operand:M N TESTFN CONSTRAINT)'
                    680:      This expression is a placeholder for operand number N of the insn. 
                    681:      When constructing an insn, operand number N will be substituted at
                    682:      this point.  When matching an insn, whatever appears at this position
                    683:      in the insn will be taken as operand number N; but it must satisfy
                    684:      TESTFN or this instruction pattern will not match at all.
                    685: 
                    686:      Operand numbers must be chosen consecutively counting from zero in
                    687:      each instruction pattern.  There may be only one `match_operand'
                    688:      expression in the pattern for each operand number, and they must
                    689:      appear in order of increasing operand number.
                    690: 
                    691:      TESTFN is a string that is the name of a C function that accepts two
                    692:      arguments, a machine mode and an expression.  During matching, the
                    693:      function will be called with M as the mode argument and the putative
                    694:      operand as the other argument.  If it returns zero, this instruction
                    695:      pattern fails to match.  TESTFN may be an empty string; then it means
                    696:      no test is to be done on the operand.
                    697: 
                    698:      Most often, TESTFN is `"general_operand"'.  It checks that the
                    699:      putative operand is either a constant, a register or a memory
                    700:      reference, and that it is valid for mode M.
                    701: 
                    702:      For an operand that must be a register, TESTFN should be
                    703:      `"register_operand"'.  This prevents GNU CC from creating insns that
                    704:      have memory references in these operands, insns which would only have
                    705:      to be taken apart in the reload pass.
                    706: 
                    707:      For an operand that must be a constant, either TESTFN should be
                    708:      `"immediate_operand"', or the instruction pattern's extra condition
                    709:      should check for constants, or both.
                    710: 
                    711:      CONSTRAINT is explained later (*Note Constraints::.).
                    712: 
                    713: `(match_dup N)'
                    714:      This expression is also a placeholder for operand number N.  It is
                    715:      used when the operand needs to appear more than once in the insn.
                    716: 
                    717:      In construction, `match_dup' behaves exactly like `match_operand': the
                    718:      operand is substituted into the insn being constructed.  But in
                    719:      matching, `match_dup' behaves differently.  It assumes that operand
                    720:      number N has already been determined by a `match_operand' appearing
                    721:      earlier in the recognition template, and it matches only an
                    722:      identical-looking expression.
                    723: 
                    724: `(address (match_operand:M N "address_operand" ""))'
                    725:      This complex of expressions is a placeholder for an operand number N
                    726:      in a ``load address'' instruction: an operand which specifies a memory
                    727:      location in the usual way, but for which the actual operand value used
                    728:      is the address of the location, not the contents of the location.
                    729: 
                    730:      `address' expressions never appear in RTL code, only in machine
                    731:      descriptions.  And they are used only in machine descriptions that do
                    732:      not use the operand constraint feature.  When operand constraints are
                    733:      in use, the letter `p' in the constraint serves this purpose.
                    734: 
                    735:      M is the machine mode of the *memory location being addressed*, not
                    736:      the machine mode of the address itself.  That mode is always the same
                    737:      on a given target machine (it is `Pmode', which normally is `SImode'),
                    738:      so there is no point in mentioning it; thus, no machine mode is
                    739:      written in the `address' expression.  If some day support is added for
                    740:      machines in which addresses of different kinds of objects appear
                    741:      differently or are used differently (such as the PDP-10), different
                    742:      formats would perhaps need different machine modes and these modes
                    743:      might be written in the `address' expression.
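
A pattern that uses both `match_operand' and `match_dup' might look like the
following sketch.  The opcode `addl', the choice of a nameless pattern and
the empty constraint strings are assumptions made for illustration; they are
not taken from any real machine description.

     (define_insn ""
       [(set (match_operand:SI 0 "register_operand" "")
             (plus:SI (match_dup 0)
                      (match_operand:SI 1 "general_operand" "")))]
       ""
       "addl %1,%0")

When an insn is matched against this template, operand 0 is taken from the
destination of the `set'.  The `(match_dup 0)' then succeeds only if the
first operand of the `plus' looks identical to operand 0, so the pattern
describes an add whose destination is also one of its sources.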
                    744: 
                    745: 
                    746: File: internals,  Node: Output Template,  Next: Output Statement,  Prev: RTL Template,  Up: Machine Desc
                    747: 
                    748: Output Templates and Operand Substitution
                    749: =========================================
                    750: 
                    751: The "output template" is a string which specifies how to output the
                    752: assembler code for an instruction pattern.  Most of the template is a fixed
                    753: string which is output literally.  The character `%' is used to specify
                    754: where to substitute an operand; it can also be used to identify places
                    755: where different variants of the assembler require different syntax.
                    756: 
                    757: In the simplest case, a `%' followed by a digit N says to output operand N
                    758: at that point in the string.
                    759: 
                    760: `%' followed by a letter and a digit says to output an operand in an
                    761: alternate fashion.  Four letters have standard, built-in meanings described
                    762: below.  The machine description macro `PRINT_OPERAND' can define additional
                    763: letters with nonstandard meanings.
                    764: 
                    765: `%cDIGIT' can be used to substitute an operand that is a constant value
                    766: without the syntax that normally indicates an immediate operand.
                    767: 
                    768: `%nDIGIT' is like `%cDIGIT' except that the value of the constant is
                    769: negated before printing.
                    770: 
                    771: `%aDIGIT' can be used to substitute an operand as if it were a memory
                    772: reference, with the actual operand treated as the address.  This may be
                    773: useful when outputting a ``load address'' instruction, because often the
                    774: assembler syntax for such an instruction requires you to write the operand
                    775: as if it were a memory reference.
                    776: 
                    777: `%lDIGIT' is used to substitute a `label_ref' into a jump instruction.
                    778: 
                    779: `%' followed by a punctuation character specifies a substitution that does
                    780: not use an operand.  Only one case is standard: `%%' outputs a `%' into the
                    781: assembler code.  Other nonstandard cases can be defined in the
                    782: `PRINT_OPERAND' macro.
                    783: 
                    784: The template may generate multiple assembler instructions.  Write the text
                    785: for the instructions, with `\;' between them.
                    786: 
                    787: When the RTL contains two operands which are required by constraint to match
                    788: each other, the output template must refer only to the lower-numbered
                    789: operand.  Matching operands are not always identical, and the rest of the
                    790: compiler arranges to put the proper RTL expression for printing into the
                    791: lower-numbered operand.
                    792: 
                    793: One use of nonstandard letters or punctuation following `%' is to
                    794: distinguish between different assembler languages for the same machine; for
                    795: example, Motorola syntax versus MIT syntax for the 68000.  Motorola syntax
                    796: requires periods in most opcode names, while MIT syntax does not.  For
                    797: example, the opcode `movel' in MIT syntax is `move.l' in Motorola syntax. 
                    798: The same file of patterns is used for both kinds of output syntax, but the
                    799: character sequence `%.' is used in each place where Motorola syntax wants a
                    800: period.  The `PRINT_OPERAND' macro for Motorola syntax defines the sequence
                    801: to output a period; the macro for MIT syntax defines it to do nothing.
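
The simplest substitution rules above can be modelled with a short C routine.
This is only a toy sketch, not GNU CC's own code: the function name
`expand_template' is hypothetical, operands are plain strings rather than
`rtx' values, and only `%%' and `%' followed by a single digit are handled.
The letter forms such as `%c' and `%a', and anything defined by
`PRINT_OPERAND', are left out.

```c
#include <stddef.h>

/* Toy model of output-template expansion; an illustration only.
   Copies TMPL into OUT, replacing `%N' with OPERANDS[N] and `%%'
   with a single `%'.  Any other `%' sequence is dropped.  The
   result is always null-terminated and never exceeds OUTSIZE.  */
void
expand_template (const char *tmpl, const char *const operands[],
                 int noperands, char *out, size_t outsize)
{
  size_t n = 0;
  const char *p;

  if (outsize == 0)
    return;

  for (p = tmpl; *p != '\0' && n + 1 < outsize; p++)
    {
      if (*p != '%')
        {
          /* Ordinary text is output literally.  */
          out[n++] = *p;
          continue;
        }

      p++;                      /* Look at the character after `%'.  */
      if (*p == '\0')
        break;                  /* A trailing `%' is ignored.  */

      if (*p == '%')
        out[n++] = '%';
      else if (*p >= '0' && *p <= '9' && *p - '0' < noperands)
        {
          /* Substitute the text of operand number N.  */
          const char *s = operands[*p - '0'];
          while (*s != '\0' && n + 1 < outsize)
            out[n++] = *s++;
        }
    }

  out[n] = '\0';
}
```

Given the operand strings `d0' and `a0@' as operands 0 and 1, the template
`movel %1,%0' expands to `movel a0@,d0' under this model.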
                    802: 
                    803: 
                    804: File: internals,  Node: Output Statement,  Next: Constraints,  Prev: Output Template,  Up: Machine Desc
                    805: 
                    806: C Statements for Generating Assembler Output
                    807: ============================================
                    808: 
                    809: Often a single fixed template string cannot produce correct and efficient
                    810: assembler code for all the cases that are recognized by a single
                    811: instruction pattern.  For example, the opcodes may depend on the kinds of
                    812: operands; or some unfortunate combinations of operands may require extra
                    813: machine instructions.
                    814: 
                    815: If the output control string starts with a `*', then it is not an output
                    816: template but rather a piece of C program that should compute a template. 
                    817: It should execute a `return' statement to return the template-string you
                    818: want.  Most such templates use C string literals, which require doublequote
                    819: characters to delimit them.  To include these doublequote characters in the
                    820: string, prefix each one with `\'.
                    821: 
                    822: The operands may be found in the array `operands', whose C data type is
                    823: `rtx []'.
                    824: 
                    825: It is possible to output an assembler instruction and then go on to output
                    826: or compute more of them, using the subroutine `output_asm_insn'.  This
                    827: receives two arguments: a template-string and a vector of operands.  The
                    828: vector may be `operands', or it may be another array of `rtx' that you
                    829: declare locally and initialize yourself.
                    830: 
                    831: When an insn pattern has multiple alternatives in its constraints, often
                    832: the appearance of the assembler code is determined mostly by which alternative
                    833: was matched.  When this is so, the C code can test the variable
                    834: `which_alternative', which is the ordinal number of the alternative that
                    835: was actually satisfied (0 for the first, 1 for the second alternative, etc.).
                    836: 
                    837: For example, suppose there are two opcodes for storing zero, `clrreg' for
                    838: registers and `clrmem' for memory locations.  Here is how a pattern could
                    839: use `which_alternative' to choose between them:
                    840: 
                    841:      (define_insn ""
                    842:        [(set (match_operand:SI 0 "general_operand" "r,m")
                    843:              (const_int 0))]
                    844:        ""
                    845:        "*
                    846:        return (which_alternative == 0
                    847:                ? \"clrreg %0\" : \"clrmem %0\");
                    848:        ")
                    849: 
                    850: 
                    851: File: internals,  Node: Constraints,  Next: Standard Names,  Prev: Output Statement,  Up: Machine Desc
                    852: 
                    853: Operand Constraints
                    854: ===================
                    855: 
                    856: Each `match_operand' in an instruction pattern can specify a constraint for
                    857: the type of operands allowed.  Constraints can say whether an operand may
                    858: be in a register, and which kinds of register; whether the operand can be a
                    859: memory reference, and which kinds of address; whether the operand may be an
                    860: immediate constant, and which possible values it may have.  Constraints can
                    861: also require two operands to match.
                    862: 
                    863: * Menu:
                    864: 
                    865: * Simple Constraints::  Basic use of constraints.
                    866: * Multi-Alternative::   When an insn has two alternative constraint-patterns.
                    867: * Class Preferences::   Constraints guide which hard register to put things in.
                    868: * Modifiers::           More precise control over effects of constraints.
                    869: * No Constraints::      Describing a clean machine without constraints.
                    870: 
                    871: 
                    872: 
                    873: File: internals,  Node: Simple Constraints,  Next: Multi-Alternative,  Prev: Constraints,  Up: Constraints
                    874: 
                    875: Simple Constraints
                    876: ------------------
                    877: 
                    878: The simplest kind of constraint is a string full of letters, each of which
                    879: describes one kind of operand that is permitted.  Here are the letters that
                    880: are allowed:
                    881: 
                    882: `m'
                    883:      A memory operand is allowed, with any kind of address that the machine
                    884:      supports in general.
                    885: 
                    886: `o'
                    887:      A memory operand is allowed, but only if the address is "offsetable". 
                    888:      This means that a small integer (actually, the width in bytes
                    889:      of the operand, as determined by its machine mode) may be added to the
                    890:      address and the result is also a valid memory address.
                    891: 
                    892:      For example, an address which is constant is offsetable; so is an
                    893:      address that is the sum of a register and a constant (as long as a
                    894:      slightly larger constant is also within the range of address-offsets
                    895:      supported by the machine); but an autoincrement or autodecrement
                    896:      address is not offsetable.  More complicated indirect/indexed
                    897:      addresses may or may not be offsetable depending on the other
                    898:      addressing modes that the machine supports.
                    899: 
                    900:      Note that in an output operand which can be matched by another
                    901:      operand, the constraint letter `o' is valid only when accompanied by
                    902:      both `<' (if the target machine has predecrement addressing) and `>'
                    903:      (if the target machine has preincrement addressing).
                    904: 
                    905: `<'
                    906:      A memory operand with autodecrement addressing (either predecrement or
                    907:      postdecrement) is allowed.
                    908: 
                    909: `>'
                    910:      A memory operand with autoincrement addressing (either preincrement or
                    911:      postincrement) is allowed.
                    912: 
                    913: `r'
                    914:      A register operand is allowed provided that it is in a general register.
                    915: 
                    916: `d', `a', `f', ...
                    917:      Other letters can be defined in machine-dependent fashion to stand for
                    918:      particular classes of registers.  `d', `a' and `f' are defined on the
                    919:      68000/68020 to stand for data, address and floating point registers.
                    920: 
                    921: `i'
                    922:      An immediate integer operand (one with constant value) is allowed. 
                    923:      This includes symbolic constants whose values will be known only at
                    924:      assembly time.
                    925: 
                    926: `n'
                    927:      An immediate integer operand with a known numeric value is allowed. 
                    928:      Many systems cannot support assembly-time constants for operands less
                    929:      than a word wide.  Constraints for these operands should use `n'
                    930:      rather than `i'.
                    931: 
                    932: `I', `J', `K', ...
                    933:      Other letters in the range `I' through `M' may be defined in a
                    934:      machine-dependent fashion to permit immediate integer operands with
                    935:      explicit integer values in specified ranges.  For example, on the
                    936:      68000, `I' is defined to stand for the range of values 1 to 8.  This
                    937:      is the range permitted as a shift count in the shift instructions.
                    938: 
                    939: `F'
                    940:      An immediate floating operand (expression code `const_double') is
                    941:      allowed.
                    942: 
                    943: `G', `H'
                    944:      `G' and `H' may be defined in a machine-dependent fashion to permit
                    945:      immediate floating operands in particular ranges of values.
                    946: 
                    947: `s'
                    948:      An immediate integer operand whose value is not an explicit integer is
                    949:      allowed.
                    950: 
                    951:      This might appear strange; if an insn allows a constant operand with a
                    952:      value not known at compile time, it certainly must allow any known
                    953:      value.  So why use `s' instead of `i'?  Sometimes it allows better
                    954:      code to be generated.
                    955: 
                    956:      For example, on the 68000 in a fullword instruction it is possible to
                    957:      use an immediate operand; but if the immediate value is between -32
                    958:      and 31, better code results from loading the value into a register and
                    959:      using the register.  This is because the load into the register can be
                    960:      done with a `moveq' instruction.  We arrange for this to happen by
                    961:      defining the letter `K' to mean ``any integer outside the range -32 to
                    962:      31'', and then specifying `Ks' in the operand constraints.
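
                              A pattern using this technique might be sketched as follows.  The
                              pattern itself is hypothetical; only the use of `Ks' comes from
                              the description above.  An explicit integer between -32 and 31
                              satisfies none of `r', `K' and `s', so the compiler would copy it
                              into a register, while larger integers and symbolic constants
                              remain immediate operands:

                                   (define_insn ""
                                     [(set (match_operand:SI 0 "general_operand" "r")
                                           (and:SI (match_operand:SI 1 "general_operand" "0")
                                                   (match_operand:SI 2 "general_operand" "rKs")))]
                                     ""
                                     "...")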
                    963: 
                    964: `g'
                    965:      Any register, memory or immediate integer operand is allowed, except
                    966:      for registers that are not general registers.
                    967: 
                    968: `N' (a digit)
                    969:      An operand that matches operand number N is allowed.  If a digit is
                    970:      used together with letters, the digit should come last.
                    971: 
                    972:      This is called a "matching constraint" and what it really means is
                    973:      that the assembler has only a single operand that fills two roles
                    974:      considered separate in the RTL insn.  For example, an add insn has two
                    975:      input operands and one output operand in the RTL, but on most machines
                    976:      an add instruction really has only two operands, one of them an
                    977:      input-output operand.
                    978: 
                    979:      Matching constraints work only in circumstances like that add insn. 
                    980:      More precisely, the matching constraint must appear in an input-only
                    981:      operand and the operand that it matches must be an output-only operand
                    982:      with a lower number.
                    983: 
                    984:      For operands to match in a particular case usually means that they are
                    985:      identical-looking RTL expressions.  But in a few special cases
                    986:      specific kinds of dissimilarity are allowed.  For example, `*x' as an
                    987:      input operand will match `*x++' as an output operand.  For proper
                    988:      results in such cases, the output template should always use the
                    989:      output-operand's number when printing the operand.
                    990: 
                    991: `p'
                    992:      An operand that is a valid memory address is allowed.  This is for
                    993:      ``load address'' and ``push address'' instructions.
                    994: 
                    995:      If `p' is used in the constraint, the test-function in the
                    996:      `match_operand' must be `address_operand'.
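
                              For instance, a ``load address'' pattern might be sketched like
                              this.  The machine modes shown and the choice of the register
                              class `a' (address registers, as defined on the 68000) are
                              assumptions made for illustration:

                                   (define_insn ""
                                     [(set (match_operand:SI 0 "general_operand" "a")
                                           (match_operand:QI 1 "address_operand" "p"))]
                                     ""
                                     "...")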
                    997: 
                    998: In order to have valid assembler code, each operand must satisfy its
                    999: constraint.  But a failure to do so does not prevent the pattern from
                   1000: applying to an insn.  Instead, it directs the compiler to modify the code
                   1001: so that the constraint will be satisfied.  Usually this is done by copying
                   1002: an operand into a register.
                   1003: 
                   1004: Contrast, therefore, the two instruction patterns that follow:
                   1005: 
                   1006:      (define_insn ""
                   1007:        [(set (match_operand:SI 0 "general_operand" "r")
                   1008:              (plus:SI (match_dup 0)
                   1009:                       (match_operand:SI 1 "general_operand" "r")))]
                   1010:        ""
                   1011:        "...")
                   1012: 
                   1013: which has two operands, one of which must appear in two places, and
                   1014: 
                   1015:      (define_insn ""
                   1016:        [(set (match_operand:SI 0 "general_operand" "r")
                   1017:              (plus:SI (match_operand:SI 1 "general_operand" "0")
                   1018:                       (match_operand:SI 2 "general_operand" "r")))]
                   1019:        ""
                   1020:        "...")
                   1021: 
                   1022: which has three operands, two of which are required by a constraint to be
                   1023: identical.  If we are considering an insn of the form
                   1024: 
                   1025:      (insn N PREV NEXT
                   1026:        (set (reg:SI 3)
                   1027:             (plus:SI (reg:SI 6) (reg:SI 109)))
                   1028:        ...)
                   1029: 
                   1030: the first pattern would not apply at all, because this insn does not
                   1031: contain two identical subexpressions in the right place.  The pattern would
                   1032: say, ``That does not look like an add instruction; try other patterns.''
                   1033: The second pattern would say, ``Yes, that's an add instruction, but there
                   1034: is something wrong with it.''  It would direct the reload pass of the
                   1035: compiler to generate additional insns to make the constraint true.  The
                   1036: results might look like this:
                   1037: 
                   1038:      (insn N2 PREV N
                   1039:        (set (reg:SI 3) (reg:SI 6))
                   1040:        ...)
                   1041:      
                   1042:      (insn N N2 NEXT
                   1043:        (set (reg:SI 3)
                   1044:             (plus:SI (reg:SI 3) (reg:SI 109)))
                   1045:        ...)
                   1046: 
                   1047: Because insns that don't fit the constraints are fixed up by loading
                   1048: operands into registers, every instruction pattern's constraints must
                   1049: permit the case where all the operands are in registers.  It need not
                   1050: permit all classes of registers; the compiler knows how to copy registers
                   1051: into other registers of the proper class in order to make an instruction
                   1052: valid.  But if no registers are permitted, the compiler will be stymied: it
                   1053: does not know how to save a register in memory in order to make an
                   1054: instruction valid.  Instruction patterns that reject registers can be made
                   1055: valid by attaching a condition-expression that refuses to match an insn at
                   1056: all if the crucial operand is a register.
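
                         As a sketch of such a condition-expression, the hypothetical pattern
                         below permits only a memory operand as operand 0 and therefore refuses
                         to match when that operand is a register.  It assumes that the
                         condition is written as a C expression which can examine the operands
                         through a vector named `operands', and that `GET_CODE' and `REG' are
                         available to it; none of this is stated in this section.

                              (define_insn ""
                                [(set (match_operand:SI 0 "general_operand" "m")
                                      (match_operand:SI 1 "general_operand" "r"))]
                                "GET_CODE (operands[0]) != REG"
                                "...")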
                   1057: 
                   1058: 
