|
|
1.1 root 1: /* Perform instruction reorganizations for delay slot filling.
2: Copyright (C) 1992, 1993 Free Software Foundation, Inc.
3: Contributed by Richard Kenner ([email protected]).
4: Hacked by Michael Tiemann ([email protected]).
5:
6: This file is part of GNU CC.
7:
8: GNU CC is free software; you can redistribute it and/or modify
9: it under the terms of the GNU General Public License as published by
10: the Free Software Foundation; either version 2, or (at your option)
11: any later version.
12:
13: GNU CC is distributed in the hope that it will be useful,
14: but WITHOUT ANY WARRANTY; without even the implied warranty of
15: MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
16: GNU General Public License for more details.
17:
18: You should have received a copy of the GNU General Public License
19: along with GNU CC; see the file COPYING. If not, write to
20: the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */
21:
22: /* Instruction reorganization pass.
23:
24: This pass runs after register allocation and final jump
25: optimization. It should be the last pass to run before peephole.
26: It serves primarily to fill delay slots of insns, typically branch
27: and call insns. Other insns typically involve more complicated
28: interactions of data dependencies and resource constraints, and
29: are better handled by scheduling before register allocation (by the
30: function `schedule_insns').
31:
32: The Branch Penalty is the number of extra cycles that are needed to
33: execute a branch insn. On an ideal machine, branches take a single
34: cycle, and the Branch Penalty is 0. Several RISC machines approach
35: branch delays differently:
36:
37: The MIPS and AMD 29000 have a single branch delay slot. Most insns
38: (except other branches) can be used to fill this slot. When the
39: slot is filled, two insns execute in two cycles, reducing the
40: branch penalty to zero.
41:
42: The Motorola 88000 conditionally exposes its branch delay slot,
43: so code is shorter when it is turned off, but will run faster
44: when useful insns are scheduled there.
45:
46: The IBM ROMP has two forms of branch and call insns, both with and
47: without a delay slot. Much like the 88k, insns not using the delay
48: slot can be shorted (2 bytes vs. 4 bytes), but will run slowed.
49:
50: The SPARC always has a branch delay slot, but its effects can be
51: annulled when the branch is not taken. This means that failing to
52: find other sources of insns, we can hoist an insn from the branch
53: target that would only be safe to execute knowing that the branch
54: is taken.
55:
56: The HP-PA always has a branch delay slot. For unconditional branches
57: its effects can be annulled when the branch is taken. The effects
58: of the delay slot in a conditional branch can be nullified for forward
59: taken branches, or for untaken backward branches. This means
60: we can hoist insns from the fall-through path for forward branches or
61: steal insns from the target of backward branches.
62:
63: Three techniques for filling delay slots have been implemented so far:
64:
65: (1) `fill_simple_delay_slots' is the simplest, most efficient way
66: to fill delay slots. This pass first looks for insns which come
67: from before the branch and which are safe to execute after the
68: branch. Then it searches after the insn requiring delay slots or,
69: in the case of a branch, for insns that are after the point at
70: which the branch merges into the fallthrough code, if such a point
71: exists. When such insns are found, the branch penalty decreases
72: and no code expansion takes place.
73:
74: (2) `fill_eager_delay_slots' is more complicated: it is used for
75: scheduling conditional jumps, or for scheduling jumps which cannot
76: be filled using (1). A machine need not have annulled jumps to use
77: this strategy, but it helps (by keeping more options open).
78: `fill_eager_delay_slots' tries to guess the direction the branch
79: will go; if it guesses right 100% of the time, it can reduce the
80: branch penalty as much as `fill_simple_delay_slots' does. If it
81: guesses wrong 100% of the time, it might as well schedule nops (or
82: on the m88k, unexpose the branch slot). When
83: `fill_eager_delay_slots' takes insns from the fall-through path of
84: the jump, usually there is no code expansion; when it takes insns
85: from the branch target, there is code expansion if it is not the
86: only way to reach that target.
87:
88: (3) `relax_delay_slots' uses a set of rules to simplify code that
89: has been reorganized by (1) and (2). It finds cases where
90: conditional test can be eliminated, jumps can be threaded, extra
91: insns can be eliminated, etc. It is the job of (1) and (2) to do a
92: good job of scheduling locally; `relax_delay_slots' takes care of
93: making the various individual schedules work well together. It is
94: especially tuned to handle the control flow interactions of branch
95: insns. It does nothing for insns with delay slots that do not
96: branch.
97:
98: On machines that use CC0, we are very conservative. We will not make
99: a copy of an insn involving CC0 since we want to maintain a 1-1
100: correspondence between the insn that sets and uses CC0. The insns are
101: allowed to be separated by placing an insn that sets CC0 (but not an insn
102: that uses CC0; we could do this, but it doesn't seem worthwhile) in a
103: delay slot. In that case, we point each insn at the other with REG_CC_USER
104: and REG_CC_SETTER notes. Note that these restrictions affect very few
105: machines because most RISC machines with delay slots will not use CC0
106: (the RT is the only known exception at this point).
107:
108: Not yet implemented:
109:
110: The Acorn Risc Machine can conditionally execute most insns, so
111: it is profitable to move single insns into a position to execute
112: based on the condition code of the previous insn.
113:
114: The HP-PA can conditionally nullify insns, providing a similar
115: effect to the ARM, differing mostly in which insn is "in charge". */
116:
117: #include <stdio.h>
118: #include "config.h"
119: #include "rtl.h"
120: #include "insn-config.h"
121: #include "conditions.h"
122: #include "hard-reg-set.h"
123: #include "basic-block.h"
124: #include "regs.h"
125: #include "insn-flags.h"
126: #include "recog.h"
127: #include "flags.h"
128: #include "output.h"
129: #include "obstack.h"
130: #include "insn-attr.h"
131:
132: #ifdef DELAY_SLOTS
133:
134: #define obstack_chunk_alloc xmalloc
135: #define obstack_chunk_free free
136:
137: #ifndef ANNUL_IFTRUE_SLOTS
138: #define eligible_for_annul_true(INSN, SLOTS, TRIAL, FLAGS) 0
139: #endif
140: #ifndef ANNUL_IFFALSE_SLOTS
141: #define eligible_for_annul_false(INSN, SLOTS, TRIAL, FLAGS) 0
142: #endif
143:
144: /* Insns which have delay slots that have not yet been filled. */
145:
146: static struct obstack unfilled_slots_obstack;
147: static rtx *unfilled_firstobj;
148:
149: /* Define macros to refer to the first and last slot containing unfilled
150: insns. These are used because the list may move and its address
151: should be recomputed at each use. */
152:
153: #define unfilled_slots_base \
154: ((rtx *) obstack_base (&unfilled_slots_obstack))
155:
156: #define unfilled_slots_next \
157: ((rtx *) obstack_next_free (&unfilled_slots_obstack))
158:
159: /* This structure is used to indicate which hardware resources are set or
160: needed by insns so far. */
161:
162: struct resources
163: {
164: char memory; /* Insn sets or needs a memory location. */
165: char volatil; /* Insn sets or needs a volatile memory loc. */
166: char cc; /* Insn sets or needs the condition codes. */
167: HARD_REG_SET regs; /* Which registers are set or needed. */
168: };
169:
170: /* Macro to clear all resources. */
171: #define CLEAR_RESOURCE(RES) \
172: do { (RES)->memory = (RES)->volatil = (RES)->cc = 0; \
173: CLEAR_HARD_REG_SET ((RES)->regs); } while (0)
174:
175: /* Indicates what resources are required at the beginning of the epilogue. */
176: static struct resources start_of_epilogue_needs;
177:
178: /* Indicates what resources are required at function end. */
179: static struct resources end_of_function_needs;
180:
181: /* Points to the label before the end of the function. */
182: static rtx end_of_function_label;
183:
184: /* This structure is used to record liveness information at the targets or
185: fallthrough insns of branches. We will most likely need the information
186: at targets again, so save them in a hash table rather than recomputing them
187: each time. */
188:
189: struct target_info
190: {
191: int uid; /* INSN_UID of target. */
192: struct target_info *next; /* Next info for same hash bucket. */
193: HARD_REG_SET live_regs; /* Registers live at target. */
194: int block; /* Basic block number containing target. */
195: int bb_tick; /* Generation count of basic block info. */
196: };
197:
198: #define TARGET_HASH_PRIME 257
199:
200: /* Define the hash table itself. */
201: static struct target_info **target_hash_table;
202:
203: /* For each basic block, we maintain a generation number of its basic
204: block info, which is updated each time we move an insn from the
205: target of a jump. This is the generation number indexed by block
206: number. */
207:
208: static int *bb_ticks;
209:
210: /* Mapping between INSN_UID's and position in the code since INSN_UID's do
211: not always monotonically increase. */
212: static int *uid_to_ruid;
213:
214: /* Highest valid index in `uid_to_ruid'. */
215: static int max_uid;
216:
217: static void mark_referenced_resources PROTO((rtx, struct resources *, int));
218: static void mark_set_resources PROTO((rtx, struct resources *, int, int));
219: static int stop_search_p PROTO((rtx, int));
220: static int resource_conflicts_p PROTO((struct resources *,
221: struct resources *));
222: static int insn_references_resource_p PROTO((rtx, struct resources *, int));
223: static int insn_sets_resources_p PROTO((rtx, struct resources *, int));
224: static rtx find_end_label PROTO((void));
225: static rtx emit_delay_sequence PROTO((rtx, rtx, int, int));
226: static rtx add_to_delay_list PROTO((rtx, rtx));
227: static void delete_from_delay_slot PROTO((rtx));
228: static void delete_scheduled_jump PROTO((rtx));
229: static void note_delay_statistics PROTO((int, int));
230: static rtx optimize_skip PROTO((rtx));
231: static int get_jump_flags PROTO((rtx, rtx));
232: static int rare_destination PROTO((rtx));
233: static int mostly_true_jump PROTO((rtx, rtx));
234: static rtx get_branch_condition PROTO((rtx, rtx));
235: static int condition_dominates_p PROTO((rtx, rtx));
236: static rtx steal_delay_list_from_target PROTO((rtx, rtx, rtx, rtx,
237: struct resources *,
238: struct resources *,
239: struct resources *,
240: int, int *, int *, rtx *));
241: static rtx steal_delay_list_from_fallthrough PROTO((rtx, rtx, rtx, rtx,
242: struct resources *,
243: struct resources *,
244: struct resources *,
245: int, int *, int *));
246: static void try_merge_delay_insns PROTO((rtx, rtx));
247: static int redundant_insn_p PROTO((rtx, rtx, rtx));
248: static int own_thread_p PROTO((rtx, rtx, int));
249: static int find_basic_block PROTO((rtx));
250: static void update_block PROTO((rtx, rtx));
251: static int reorg_redirect_jump PROTO((rtx, rtx));
252: static void update_reg_dead_notes PROTO((rtx, rtx));
253: static void update_live_status PROTO((rtx, rtx));
254: static rtx next_insn_no_annul PROTO((rtx));
255: static void mark_target_live_regs PROTO((rtx, struct resources *));
256: static void fill_simple_delay_slots PROTO((rtx, int));
257: static rtx fill_slots_from_thread PROTO((rtx, rtx, rtx, rtx, int, int,
258: int, int, int, int *));
259: static void fill_eager_delay_slots PROTO((rtx));
260: static void relax_delay_slots PROTO((rtx));
261: static void make_return_insns PROTO((rtx));
262: static int redirect_with_delay_slots_safe_p PROTO ((rtx, rtx, rtx));
263: static int rereict_with_delay_list_safe_p PROTO ((rtx, rtx, rtx));
264:
265: /* Given X, some rtl, and RES, a pointer to a `struct resource', mark
266: which resources are references by the insn. If INCLUDE_CALLED_ROUTINE
267: is TRUE, resources used by the called routine will be included for
268: CALL_INSNs. */
269:
270: static void
271: mark_referenced_resources (x, res, include_delayed_effects)
272: register rtx x;
273: register struct resources *res;
274: register int include_delayed_effects;
275: {
276: register enum rtx_code code = GET_CODE (x);
277: register int i, j;
278: register char *format_ptr;
279:
280: /* Handle leaf items for which we set resource flags. Also, special-case
281: CALL, SET and CLOBBER operators. */
282: switch (code)
283: {
284: case CONST:
285: case CONST_INT:
286: case CONST_DOUBLE:
287: case PC:
288: case SYMBOL_REF:
289: case LABEL_REF:
290: return;
291:
292: case SUBREG:
293: if (GET_CODE (SUBREG_REG (x)) != REG)
294: mark_referenced_resources (SUBREG_REG (x), res, 0);
295: else
296: {
297: int regno = REGNO (SUBREG_REG (x)) + SUBREG_WORD (x);
298: int last_regno = regno + HARD_REGNO_NREGS (regno, GET_MODE (x));
299: for (i = regno; i < last_regno; i++)
300: SET_HARD_REG_BIT (res->regs, i);
301: }
302: return;
303:
304: case REG:
305: for (i = 0; i < HARD_REGNO_NREGS (REGNO (x), GET_MODE (x)); i++)
306: SET_HARD_REG_BIT (res->regs, REGNO (x) + i);
307: return;
308:
309: case MEM:
310: /* If this memory shouldn't change, it really isn't referencing
311: memory. */
312: if (! RTX_UNCHANGING_P (x))
313: res->memory = 1;
314: res->volatil = MEM_VOLATILE_P (x);
315:
316: /* Mark registers used to access memory. */
317: mark_referenced_resources (XEXP (x, 0), res, 0);
318: return;
319:
320: case CC0:
321: res->cc = 1;
322: return;
323:
324: case UNSPEC_VOLATILE:
325: case ASM_INPUT:
326: /* Traditional asm's are always volatile. */
327: res->volatil = 1;
328: return;
329:
330: case ASM_OPERANDS:
331: res->volatil = MEM_VOLATILE_P (x);
332:
333: /* For all ASM_OPERANDS, we must traverse the vector of input operands.
334: We can not just fall through here since then we would be confused
335: by the ASM_INPUT rtx inside ASM_OPERANDS, which do not indicate
336: traditional asms unlike their normal usage. */
337:
338: for (i = 0; i < ASM_OPERANDS_INPUT_LENGTH (x); i++)
339: mark_referenced_resources (ASM_OPERANDS_INPUT (x, i), res, 0);
340: return;
341:
342: case CALL:
343: /* The first operand will be a (MEM (xxx)) but doesn't really reference
344: memory. The second operand may be referenced, though. */
345: mark_referenced_resources (XEXP (XEXP (x, 0), 0), res, 0);
346: mark_referenced_resources (XEXP (x, 1), res, 0);
347: return;
348:
349: case SET:
350: /* Usually, the first operand of SET is set, not referenced. But
351: registers used to access memory are referenced. SET_DEST is
352: also referenced if it is a ZERO_EXTRACT or SIGN_EXTRACT. */
353:
354: mark_referenced_resources (SET_SRC (x), res, 0);
355:
356: x = SET_DEST (x);
357: if (GET_CODE (x) == SIGN_EXTRACT || GET_CODE (x) == ZERO_EXTRACT)
358: mark_referenced_resources (x, res, 0);
359: else if (GET_CODE (x) == SUBREG)
360: x = SUBREG_REG (x);
361: if (GET_CODE (x) == MEM)
362: mark_referenced_resources (XEXP (x, 0), res, 0);
363: return;
364:
365: case CLOBBER:
366: return;
367:
368: case CALL_INSN:
369: if (include_delayed_effects)
370: {
371: /* A CALL references memory, the frame pointer if it exists, the
372: stack pointer, any global registers and any registers given in
373: USE insns immediately in front of the CALL.
374:
375: However, we may have moved some of the parameter loading insns
376: into the delay slot of this CALL. If so, the USE's for them
377: don't count and should be skipped. */
378: rtx insn = PREV_INSN (x);
379: rtx sequence = 0;
380: int seq_size = 0;
381: int i;
382:
383: /* If we are part of a delay slot sequence, point at the SEQUENCE. */
384: if (NEXT_INSN (insn) != x)
385: {
386: sequence = PATTERN (NEXT_INSN (insn));
387: seq_size = XVECLEN (sequence, 0);
388: if (GET_CODE (sequence) != SEQUENCE)
389: abort ();
390: }
391:
392: res->memory = 1;
393: SET_HARD_REG_BIT (res->regs, STACK_POINTER_REGNUM);
394: if (frame_pointer_needed)
395: {
396: SET_HARD_REG_BIT (res->regs, FRAME_POINTER_REGNUM);
397: #if FRAME_POINTER_REGNUM != HARD_FRAME_POINTER_REGNUM
398: SET_HARD_REG_BIT (res->regs, HARD_FRAME_POINTER_REGNUM);
399: #endif
400: }
401:
402: for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
403: if (global_regs[i])
404: SET_HARD_REG_BIT (res->regs, i);
405:
406: /* Skip any labels between the CALL_INSN and possible USE insns. */
407: while (GET_CODE (insn) == CODE_LABEL)
408: insn = PREV_INSN (insn);
409:
410: for ( ; (insn && GET_CODE (insn) == INSN
411: && GET_CODE (PATTERN (insn)) == USE);
412: insn = PREV_INSN (insn))
413: {
414: for (i = 1; i < seq_size; i++)
415: {
416: rtx slot_pat = PATTERN (XVECEXP (sequence, 0, i));
417: if (GET_CODE (slot_pat) == SET
418: && rtx_equal_p (SET_DEST (slot_pat),
419: XEXP (PATTERN (insn), 0)))
420: break;
421: }
422: if (i >= seq_size)
423: mark_referenced_resources (XEXP (PATTERN (insn), 0), res, 0);
424: }
425: }
426:
427: /* ... fall through to other INSN processing ... */
428:
429: case INSN:
430: case JUMP_INSN:
431:
432: #ifdef INSN_REFERENCES_ARE_DELAYED
433: if (! include_delayed_effects
434: && INSN_REFERENCES_ARE_DELAYED (x))
435: return;
436: #endif
437:
438: /* No special processing, just speed up. */
439: mark_referenced_resources (PATTERN (x), res, include_delayed_effects);
440: return;
441: }
442:
443: /* Process each sub-expression and flag what it needs. */
444: format_ptr = GET_RTX_FORMAT (code);
445: for (i = 0; i < GET_RTX_LENGTH (code); i++)
446: switch (*format_ptr++)
447: {
448: case 'e':
449: mark_referenced_resources (XEXP (x, i), res, include_delayed_effects);
450: break;
451:
452: case 'E':
453: for (j = 0; j < XVECLEN (x, i); j++)
454: mark_referenced_resources (XVECEXP (x, i, j), res,
455: include_delayed_effects);
456: break;
457: }
458: }
459:
460: /* Given X, a part of an insn, and a pointer to a `struct resource', RES,
461: indicate which resources are modified by the insn. If INCLUDE_CALLED_ROUTINE
462: is nonzero, also mark resources potentially set by the called routine.
463:
464: If IN_DEST is nonzero, it means we are inside a SET. Otherwise,
465: objects are being referenced instead of set.
466:
467: We never mark the insn as modifying the condition code unless it explicitly
468: SETs CC0 even though this is not totally correct. The reason for this is
469: that we require a SET of CC0 to immediately precede the reference to CC0.
470: So if some other insn sets CC0 as a side-effect, we know it cannot affect
471: our computation and thus may be placed in a delay slot. */
472:
473: static void
474: mark_set_resources (x, res, in_dest, include_delayed_effects)
475: register rtx x;
476: register struct resources *res;
477: int in_dest;
478: int include_delayed_effects;
479: {
480: register enum rtx_code code;
481: register int i, j;
482: register char *format_ptr;
483:
484: restart:
485:
486: code = GET_CODE (x);
487:
488: switch (code)
489: {
490: case NOTE:
491: case BARRIER:
492: case CODE_LABEL:
493: case USE:
494: case CONST_INT:
495: case CONST_DOUBLE:
496: case LABEL_REF:
497: case SYMBOL_REF:
498: case CONST:
499: case PC:
500: /* These don't set any resources. */
501: return;
502:
503: case CC0:
504: if (in_dest)
505: res->cc = 1;
506: return;
507:
508: case CALL_INSN:
509: /* Called routine modifies the condition code, memory, any registers
510: that aren't saved across calls, global registers and anything
511: explicitly CLOBBERed immediately after the CALL_INSN. */
512:
513: if (include_delayed_effects)
514: {
515: rtx next = NEXT_INSN (x);
516: rtx prev = PREV_INSN (x);
517:
518: res->cc = res->memory = 1;
519: for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
520: if (call_used_regs[i] || global_regs[i])
521: SET_HARD_REG_BIT (res->regs, i);
522:
523: /* If X is part of a delay slot sequence, then NEXT should be
524: the first insn after the sequence. */
525: if (NEXT_INSN (prev) != x)
526: next = NEXT_INSN (NEXT_INSN (prev));
527:
528: /* Skip any possible labels between the CALL_INSN and CLOBBERs. */
529: while (GET_CODE (next) == CODE_LABEL)
530: next = NEXT_INSN (next);
531:
532: for (; (next && GET_CODE (next) == INSN
533: && GET_CODE (PATTERN (next)) == CLOBBER);
534: next = NEXT_INSN (next))
535: mark_set_resources (XEXP (PATTERN (next), 0), res, 1, 0);
536:
537: /* Check for a NOTE_INSN_SETJMP. If it exists, then we must
538: assume that this call can clobber any register. */
539: if (next && GET_CODE (next) == NOTE
540: && NOTE_LINE_NUMBER (next) == NOTE_INSN_SETJMP)
541: SET_HARD_REG_SET (res->regs);
542: }
543:
544: /* ... and also what it's RTL says it modifies, if anything. */
545:
546: case JUMP_INSN:
547: case INSN:
548:
549: /* An insn consisting of just a CLOBBER (or USE) is just for flow
550: and doesn't actually do anything, so we ignore it. */
551:
552: #ifdef INSN_SETS_ARE_DELAYED
553: if (! include_delayed_effects
554: && INSN_SETS_ARE_DELAYED (x))
555: return;
556: #endif
557:
558: x = PATTERN (x);
559: if (GET_CODE (x) != USE && GET_CODE (x) != CLOBBER)
560: goto restart;
561: return;
562:
563: case SET:
564: /* If the source of a SET is a CALL, this is actually done by
565: the called routine. So only include it if we are to include the
566: effects of the calling routine. */
567:
568: mark_set_resources (SET_DEST (x), res,
569: (include_delayed_effects
570: || GET_CODE (SET_SRC (x)) != CALL),
571: 0);
572:
573: mark_set_resources (SET_SRC (x), res, 0, 0);
574: return;
575:
576: case CLOBBER:
577: mark_set_resources (XEXP (x, 0), res, 1, 0);
578: return;
579:
580: case SEQUENCE:
581: for (i = 0; i < XVECLEN (x, 0); i++)
582: if (! (INSN_ANNULLED_BRANCH_P (XVECEXP (x, 0, 0))
583: && INSN_FROM_TARGET_P (XVECEXP (x, 0, i))))
584: mark_set_resources (XVECEXP (x, 0, i), res, 0,
585: include_delayed_effects);
586: return;
587:
588: case POST_INC:
589: case PRE_INC:
590: case POST_DEC:
591: case PRE_DEC:
592: mark_set_resources (XEXP (x, 0), res, 1, 0);
593: return;
594:
595: case ZERO_EXTRACT:
596: mark_set_resources (XEXP (x, 0), res, in_dest, 0);
597: mark_set_resources (XEXP (x, 1), res, 0, 0);
598: mark_set_resources (XEXP (x, 2), res, 0, 0);
599: return;
600:
601: case MEM:
602: if (in_dest)
603: {
604: res->memory = 1;
605: res->volatil = MEM_VOLATILE_P (x);
606: }
607:
608: mark_set_resources (XEXP (x, 0), res, 0, 0);
609: return;
610:
611: case REG:
612: if (in_dest)
613: for (i = 0; i < HARD_REGNO_NREGS (REGNO (x), GET_MODE (x)); i++)
614: SET_HARD_REG_BIT (res->regs, REGNO (x) + i);
615: return;
616: }
617:
618: /* Process each sub-expression and flag what it needs. */
619: format_ptr = GET_RTX_FORMAT (code);
620: for (i = 0; i < GET_RTX_LENGTH (code); i++)
621: switch (*format_ptr++)
622: {
623: case 'e':
624: mark_set_resources (XEXP (x, i), res, in_dest, include_delayed_effects);
625: break;
626:
627: case 'E':
628: for (j = 0; j < XVECLEN (x, i); j++)
629: mark_set_resources (XVECEXP (x, i, j), res, in_dest,
630: include_delayed_effects);
631: break;
632: }
633: }
634:
635: /* Return TRUE if this insn should stop the search for insn to fill delay
636: slots. LABELS_P indicates that labels should terminate the search.
637: In all cases, jumps terminate the search. */
638:
639: static int
640: stop_search_p (insn, labels_p)
641: rtx insn;
642: int labels_p;
643: {
644: if (insn == 0)
645: return 1;
646:
647: switch (GET_CODE (insn))
648: {
649: case NOTE:
650: case CALL_INSN:
651: return 0;
652:
653: case CODE_LABEL:
654: return labels_p;
655:
656: case JUMP_INSN:
657: case BARRIER:
658: return 1;
659:
660: case INSN:
661: /* OK unless it contains a delay slot or is an `asm' insn of some type.
662: We don't know anything about these. */
663: return (GET_CODE (PATTERN (insn)) == SEQUENCE
664: || GET_CODE (PATTERN (insn)) == ASM_INPUT
665: || asm_noperands (PATTERN (insn)) >= 0);
666:
667: default:
668: abort ();
669: }
670: }
671:
672: /* Return TRUE if any resources are marked in both RES1 and RES2 or if either
673: resource set contains a volatile memory reference. Otherwise, return FALSE. */
674:
675: static int
676: resource_conflicts_p (res1, res2)
677: struct resources *res1, *res2;
678: {
679: if ((res1->cc && res2->cc) || (res1->memory && res2->memory)
680: || res1->volatil || res2->volatil)
681: return 1;
682:
683: #ifdef HARD_REG_SET
684: return (res1->regs & res2->regs) != HARD_CONST (0);
685: #else
686: {
687: int i;
688:
689: for (i = 0; i < HARD_REG_SET_LONGS; i++)
690: if ((res1->regs[i] & res2->regs[i]) != 0)
691: return 1;
692: return 0;
693: }
694: #endif
695: }
696:
697: /* Return TRUE if any resource marked in RES, a `struct resources', is
698: referenced by INSN. If INCLUDE_CALLED_ROUTINE is set, return if the called
699: routine is using those resources.
700:
701: We compute this by computing all the resources referenced by INSN and
702: seeing if this conflicts with RES. It might be faster to directly check
703: ourselves, and this is the way it used to work, but it means duplicating
704: a large block of complex code. */
705:
706: static int
707: insn_references_resource_p (insn, res, include_delayed_effects)
708: register rtx insn;
709: register struct resources *res;
710: int include_delayed_effects;
711: {
712: struct resources insn_res;
713:
714: CLEAR_RESOURCE (&insn_res);
715: mark_referenced_resources (insn, &insn_res, include_delayed_effects);
716: return resource_conflicts_p (&insn_res, res);
717: }
718:
719: /* Return TRUE if INSN modifies resources that are marked in RES.
720: INCLUDE_CALLED_ROUTINE is set if the actions of that routine should be
721: included. CC0 is only modified if it is explicitly set; see comments
722: in front of mark_set_resources for details. */
723:
724: static int
725: insn_sets_resource_p (insn, res, include_delayed_effects)
726: register rtx insn;
727: register struct resources *res;
728: int include_delayed_effects;
729: {
730: struct resources insn_sets;
731:
732: CLEAR_RESOURCE (&insn_sets);
733: mark_set_resources (insn, &insn_sets, 0, include_delayed_effects);
734: return resource_conflicts_p (&insn_sets, res);
735: }
736:
737: /* Find a label at the end of the function or before a RETURN. If there is
738: none, make one. */
739:
740: static rtx
741: find_end_label ()
742: {
743: rtx insn;
744:
745: /* If we found one previously, return it. */
746: if (end_of_function_label)
747: return end_of_function_label;
748:
749: /* Otherwise, see if there is a label at the end of the function. If there
750: is, it must be that RETURN insns aren't needed, so that is our return
751: label and we don't have to do anything else. */
752:
753: insn = get_last_insn ();
754: while (GET_CODE (insn) == NOTE
755: || (GET_CODE (insn) == INSN
756: && (GET_CODE (PATTERN (insn)) == USE
757: || GET_CODE (PATTERN (insn)) == CLOBBER)))
758: insn = PREV_INSN (insn);
759:
760: /* When a target threads its epilogue we might already have a
761: suitable return insn. If so put a label before it for the
762: end_of_function_label. */
763: if (GET_CODE (insn) == BARRIER
764: && GET_CODE (PREV_INSN (insn)) == JUMP_INSN
765: && GET_CODE (PATTERN (PREV_INSN (insn))) == RETURN)
766: {
767: rtx temp = PREV_INSN (PREV_INSN (insn));
768: end_of_function_label = gen_label_rtx ();
769: LABEL_NUSES (end_of_function_label) = 0;
770:
771: /* Put the label before an USE insns that may proceed the RETURN insn. */
772: while (GET_CODE (temp) == USE)
773: temp = PREV_INSN (temp);
774:
775: emit_label_after (end_of_function_label, temp);
776: }
777:
778: else if (GET_CODE (insn) == CODE_LABEL)
779: end_of_function_label = insn;
780: else
781: {
782: /* Otherwise, make a new label and emit a RETURN and BARRIER,
783: if needed. */
784: end_of_function_label = gen_label_rtx ();
785: LABEL_NUSES (end_of_function_label) = 0;
786: emit_label (end_of_function_label);
787: #ifdef HAVE_return
788: if (HAVE_return)
789: {
790: /* The return we make may have delay slots too. */
791: rtx insn = gen_return ();
792: insn = emit_jump_insn (insn);
793: emit_barrier ();
794: if (num_delay_slots (insn) > 0)
795: obstack_ptr_grow (&unfilled_slots_obstack, insn);
796: }
797: #endif
798: }
799:
800: /* Show one additional use for this label so it won't go away until
801: we are done. */
802: ++LABEL_NUSES (end_of_function_label);
803:
804: return end_of_function_label;
805: }
806:
807: /* Put INSN and LIST together in a SEQUENCE rtx of LENGTH, and replace
808: the pattern of INSN with the SEQUENCE.
809:
810: Chain the insns so that NEXT_INSN of each insn in the sequence points to
811: the next and NEXT_INSN of the last insn in the sequence points to
812: the first insn after the sequence. Similarly for PREV_INSN. This makes
813: it easier to scan all insns.
814:
815: Returns the SEQUENCE that replaces INSN. */
816:
817: static rtx
818: emit_delay_sequence (insn, list, length, avail)
819: rtx insn;
820: rtx list;
821: int length;
822: int avail;
823: {
824: register int i = 1;
825: register rtx li;
826: int had_barrier = 0;
827:
828: /* Allocate the the rtvec to hold the insns and the SEQUENCE. */
829: rtvec seqv = rtvec_alloc (length + 1);
830: rtx seq = gen_rtx (SEQUENCE, VOIDmode, seqv);
831: rtx seq_insn = make_insn_raw (seq);
832: rtx first = get_insns ();
833: rtx last = get_last_insn ();
834:
835: /* Make a copy of the insn having delay slots. */
836: rtx delay_insn = copy_rtx (insn);
837:
838: /* If INSN is followed by a BARRIER, delete the BARRIER since it will only
839: confuse further processing. Update LAST in case it was the last insn.
840: We will put the BARRIER back in later. */
841: if (NEXT_INSN (insn) && GET_CODE (NEXT_INSN (insn)) == BARRIER)
842: {
843: delete_insn (NEXT_INSN (insn));
844: last = get_last_insn ();
845: had_barrier = 1;
846: }
847:
848: /* Splice our SEQUENCE into the insn stream where INSN used to be. */
849: NEXT_INSN (seq_insn) = NEXT_INSN (insn);
850: PREV_INSN (seq_insn) = PREV_INSN (insn);
851:
852: if (insn == last)
853: set_new_first_and_last_insn (first, seq_insn);
854: else
855: PREV_INSN (NEXT_INSN (seq_insn)) = seq_insn;
856:
857: if (insn == first)
858: set_new_first_and_last_insn (seq_insn, last);
859: else
860: NEXT_INSN (PREV_INSN (seq_insn)) = seq_insn;
861:
862: /* Build our SEQUENCE and rebuild the insn chain. */
863: XVECEXP (seq, 0, 0) = delay_insn;
864: INSN_DELETED_P (delay_insn) = 0;
865: PREV_INSN (delay_insn) = PREV_INSN (seq_insn);
866:
867: for (li = list; li; li = XEXP (li, 1), i++)
868: {
869: rtx tem = XEXP (li, 0);
870: rtx note;
871:
872: /* Show that this copy of the insn isn't deleted. */
873: INSN_DELETED_P (tem) = 0;
874:
875: XVECEXP (seq, 0, i) = tem;
876: PREV_INSN (tem) = XVECEXP (seq, 0, i - 1);
877: NEXT_INSN (XVECEXP (seq, 0, i - 1)) = tem;
878:
879: /* Remove any REG_DEAD notes because we can't rely on them now
880: that the insn has been moved. */
881: for (note = REG_NOTES (tem); note; note = XEXP (note, 1))
882: if (REG_NOTE_KIND (note) == REG_DEAD)
883: XEXP (note, 0) = const0_rtx;
884: }
885:
886: NEXT_INSN (XVECEXP (seq, 0, length)) = NEXT_INSN (seq_insn);
887:
888: /* If the previous insn is a SEQUENCE, update the NEXT_INSN pointer on the
889: last insn in that SEQUENCE to point to us. Similarly for the first
890: insn in the following insn if it is a SEQUENCE. */
891:
892: if (PREV_INSN (seq_insn) && GET_CODE (PREV_INSN (seq_insn)) == INSN
893: && GET_CODE (PATTERN (PREV_INSN (seq_insn))) == SEQUENCE)
894: NEXT_INSN (XVECEXP (PATTERN (PREV_INSN (seq_insn)), 0,
895: XVECLEN (PATTERN (PREV_INSN (seq_insn)), 0) - 1))
896: = seq_insn;
897:
898: if (NEXT_INSN (seq_insn) && GET_CODE (NEXT_INSN (seq_insn)) == INSN
899: && GET_CODE (PATTERN (NEXT_INSN (seq_insn))) == SEQUENCE)
900: PREV_INSN (XVECEXP (PATTERN (NEXT_INSN (seq_insn)), 0, 0)) = seq_insn;
901:
902: /* If there used to be a BARRIER, put it back. */
903: if (had_barrier)
904: emit_barrier_after (seq_insn);
905:
906: if (i != length + 1)
907: abort ();
908:
909: return seq_insn;
910: }
911:
912: /* Add INSN to DELAY_LIST and return the head of the new list. The list must
913: be in the order in which the insns are to be executed. */
914:
915: static rtx
916: add_to_delay_list (insn, delay_list)
917: rtx insn;
918: rtx delay_list;
919: {
920: /* If we have an empty list, just make a new list element. If
921: INSN has it's block number recorded, clear it since we may
922: be moving the insn to a new block. */
923:
924: if (delay_list == 0)
925: {
926: struct target_info *tinfo;
927:
928: for (tinfo = target_hash_table[INSN_UID (insn) % TARGET_HASH_PRIME];
929: tinfo; tinfo = tinfo->next)
930: if (tinfo->uid == INSN_UID (insn))
931: break;
932:
933: if (tinfo)
934: tinfo->block = -1;
935:
936: return gen_rtx (INSN_LIST, VOIDmode, insn, NULL_RTX);
937: }
938:
939: /* Otherwise this must be an INSN_LIST. Add INSN to the end of the
940: list. */
941: XEXP (delay_list, 1) = add_to_delay_list (insn, XEXP (delay_list, 1));
942:
943: return delay_list;
944: }
945:
946: /* Delete INSN from the the delay slot of the insn that it is in. This may
947: produce an insn without anything in its delay slots. */
948:
949: static void
950: delete_from_delay_slot (insn)
951: rtx insn;
952: {
953: rtx trial, seq_insn, seq, prev;
954: rtx delay_list = 0;
955: int i;
956:
957: /* We first must find the insn containing the SEQUENCE with INSN in its
958: delay slot. Do this by finding an insn, TRIAL, where
959: PREV_INSN (NEXT_INSN (TRIAL)) != TRIAL. */
960:
961: for (trial = insn;
962: PREV_INSN (NEXT_INSN (trial)) == trial;
963: trial = NEXT_INSN (trial))
964: ;
965:
966: seq_insn = PREV_INSN (NEXT_INSN (trial));
967: seq = PATTERN (seq_insn);
968:
969: /* Create a delay list consisting of all the insns other than the one
970: we are deleting (unless we were the only one). */
971: if (XVECLEN (seq, 0) > 2)
972: for (i = 1; i < XVECLEN (seq, 0); i++)
973: if (XVECEXP (seq, 0, i) != insn)
974: delay_list = add_to_delay_list (XVECEXP (seq, 0, i), delay_list);
975:
976: /* Delete the old SEQUENCE, re-emit the insn that used to have the delay
977: list, and rebuild the delay list if non-empty. */
978: prev = PREV_INSN (seq_insn);
979: trial = XVECEXP (seq, 0, 0);
980: delete_insn (seq_insn);
981: add_insn_after (trial, prev);
982:
983: if (GET_CODE (trial) == JUMP_INSN
984: && (simplejump_p (trial) || GET_CODE (PATTERN (trial)) == RETURN))
985: emit_barrier_after (trial);
986:
987: /* If there are any delay insns, remit them. Otherwise clear the
988: annul flag. */
989: if (delay_list)
990: trial = emit_delay_sequence (trial, delay_list, XVECLEN (seq, 0) - 2, 0);
991: else
992: INSN_ANNULLED_BRANCH_P (trial) = 0;
993:
994: INSN_FROM_TARGET_P (insn) = 0;
995:
996: /* Show we need to fill this insn again. */
997: obstack_ptr_grow (&unfilled_slots_obstack, trial);
998: }
999:
1000: /* Delete INSN, a JUMP_INSN. If it is a conditional jump, we must track down
1001: the insn that sets CC0 for it and delete it too. */
1002:
1003: static void
1004: delete_scheduled_jump (insn)
1005: rtx insn;
1006: {
1007: /* Delete the insn that sets cc0 for us. On machines without cc0, we could
1008: delete the insn that sets the condition code, but it is hard to find it.
1009: Since this case is rare anyway, don't bother trying; there would likely
1010: be other insns that became dead anyway, which we wouldn't know to
1011: delete. */
1012:
1013: #ifdef HAVE_cc0
1014: if (reg_mentioned_p (cc0_rtx, insn))
1015: {
1016: rtx note = find_reg_note (insn, REG_CC_SETTER, NULL_RTX);
1017:
1018: /* If a reg-note was found, it points to an insn to set CC0. This
1019: insn is in the delay list of some other insn. So delete it from
1020: the delay list it was in. */
1021: if (note)
1022: {
1023: if (! FIND_REG_INC_NOTE (XEXP (note, 0), NULL_RTX)
1024: && sets_cc0_p (PATTERN (XEXP (note, 0))) == 1)
1025: delete_from_delay_slot (XEXP (note, 0));
1026: }
1027: else
1028: {
1029: /* The insn setting CC0 is our previous insn, but it may be in
1030: a delay slot. It will be the last insn in the delay slot, if
1031: it is. */
1032: rtx trial = previous_insn (insn);
1033: if (GET_CODE (trial) == NOTE)
1034: trial = prev_nonnote_insn (trial);
1035: if (sets_cc0_p (PATTERN (trial)) != 1
1036: || FIND_REG_INC_NOTE (trial, 0))
1037: return;
1038: if (PREV_INSN (NEXT_INSN (trial)) == trial)
1039: delete_insn (trial);
1040: else
1041: delete_from_delay_slot (trial);
1042: }
1043: }
1044: #endif
1045:
1046: delete_insn (insn);
1047: }
1048:
1049: /* Counters for delay-slot filling. */
1050:
1051: #define NUM_REORG_FUNCTIONS 2
1052: #define MAX_DELAY_HISTOGRAM 3
1053: #define MAX_REORG_PASSES 2
1054:
1055: static int num_insns_needing_delays[NUM_REORG_FUNCTIONS][MAX_REORG_PASSES];
1056:
1057: static int num_filled_delays[NUM_REORG_FUNCTIONS][MAX_DELAY_HISTOGRAM+1][MAX_REORG_PASSES];
1058:
1059: static int reorg_pass_number;
1060:
1061: static void
1062: note_delay_statistics (slots_filled, index)
1063: int slots_filled, index;
1064: {
1065: num_insns_needing_delays[index][reorg_pass_number]++;
1066: if (slots_filled > MAX_DELAY_HISTOGRAM)
1067: slots_filled = MAX_DELAY_HISTOGRAM;
1068: num_filled_delays[index][slots_filled][reorg_pass_number]++;
1069: }
1070:
1071: #if defined(ANNUL_IFFALSE_SLOTS) || defined(ANNUL_IFTRUE_SLOTS)
1072:
1073: /* Optimize the following cases:
1074:
1075: 1. When a conditional branch skips over only one instruction,
1076: use an annulling branch and put that insn in the delay slot.
1077: Use either a branch that annuls when the condition if true or
1078: invert the test with a branch that annuls when the condition is
1079: false. This saves insns, since otherwise we must copy an insn
1080: from the L1 target.
1081:
1082: (orig) (skip) (otherwise)
1083: Bcc.n L1 Bcc',a L1 Bcc,a L1'
1084: insn insn insn2
1085: L1: L1: L1:
1086: insn2 insn2 insn2
1087: insn3 insn3 L1':
1088: insn3
1089:
1090: 2. When a conditional branch skips over only one instruction,
1091: and after that, it unconditionally branches somewhere else,
1092: perform the similar optimization. This saves executing the
1093: second branch in the case where the inverted condition is true.
1094:
1095: Bcc.n L1 Bcc',a L2
1096: insn insn
1097: L1: L1:
1098: Bra L2 Bra L2
1099:
1100: INSN is a JUMP_INSN.
1101:
1102: This should be expanded to skip over N insns, where N is the number
1103: of delay slots required. */
1104:
1105: static rtx
1106: optimize_skip (insn)
1107: register rtx insn;
1108: {
1109: register rtx trial = next_nonnote_insn (insn);
1110: rtx next_trial = next_active_insn (trial);
1111: rtx delay_list = 0;
1112: rtx target_label;
1113: int flags;
1114:
1115: flags = get_jump_flags (insn, JUMP_LABEL (insn));
1116:
1117: if (trial == 0
1118: || GET_CODE (trial) != INSN
1119: || GET_CODE (PATTERN (trial)) == SEQUENCE
1120: || recog_memoized (trial) < 0
1121: || (! eligible_for_annul_false (insn, 0, trial, flags)
1122: && ! eligible_for_annul_true (insn, 0, trial, flags)))
1123: return 0;
1124:
1125: /* There are two cases where we are just executing one insn (we assume
1126: here that a branch requires only one insn; this should be generalized
1127: at some point): Where the branch goes around a single insn or where
1128: we have one insn followed by a branch to the same label we branch to.
1129: In both of these cases, inverting the jump and annulling the delay
1130: slot give the same effect in fewer insns. */
1131: if ((next_trial == next_active_insn (JUMP_LABEL (insn)))
1132: || (next_trial != 0
1133: && GET_CODE (next_trial) == JUMP_INSN
1134: && JUMP_LABEL (insn) == JUMP_LABEL (next_trial)
1135: && (simplejump_p (next_trial)
1136: || GET_CODE (PATTERN (next_trial)) == RETURN)))
1137: {
1138: if (eligible_for_annul_false (insn, 0, trial, flags))
1139: {
1140: if (invert_jump (insn, JUMP_LABEL (insn)))
1141: INSN_FROM_TARGET_P (trial) = 1;
1142: else if (! eligible_for_annul_true (insn, 0, trial, flags))
1143: return 0;
1144: }
1145:
1146: delay_list = add_to_delay_list (trial, NULL_RTX);
1147: next_trial = next_active_insn (trial);
1148: update_block (trial, trial);
1149: delete_insn (trial);
1150:
1151: /* Also, if we are targeting an unconditional
1152: branch, thread our jump to the target of that branch. Don't
1153: change this into a RETURN here, because it may not accept what
1154: we have in the delay slot. We'll fix this up later. */
1155: if (next_trial && GET_CODE (next_trial) == JUMP_INSN
1156: && (simplejump_p (next_trial)
1157: || GET_CODE (PATTERN (next_trial)) == RETURN))
1158: {
1159: target_label = JUMP_LABEL (next_trial);
1160: if (target_label == 0)
1161: target_label = find_end_label ();
1162:
1163: /* Recompute the flags based on TARGET_LABEL since threading
1164: the jump to TARGET_LABEL may change the direction of the
1165: jump (which may change the circumstances in which the
1166: delay slot is nullified). */
1167: flags = get_jump_flags (insn, target_label);
1168: if (eligible_for_annul_true (insn, 0, trial, flags))
1169: reorg_redirect_jump (insn, target_label);
1170: }
1171:
1172: INSN_ANNULLED_BRANCH_P (insn) = 1;
1173: }
1174:
1175: return delay_list;
1176: }
1177: #endif
1178:
1179:
1180: /* Encode and return branch direction and prediction information for
1181: INSN assuming it will jump to LABEL.
1182:
1183: Non conditional branches return no direction information and
1184: are predicted as very likely taken. */
1185: static int
1186: get_jump_flags (insn, label)
1187: rtx insn, label;
1188: {
1189: int flags;
1190:
1191: /* get_jump_flags can be passed any insn with delay slots, these may
1192: be INSNs, CALL_INSNs, or JUMP_INSNs. Only JUMP_INSNs have branch
1193: direction information, and only if they are conditional jumps.
1194:
1195: If LABEL is zero, then there is no way to determine the branch
1196: direction. */
1197: if (GET_CODE (insn) == JUMP_INSN
1198: && condjump_p (insn)
1199: && INSN_UID (insn) <= max_uid
1200: && label != 0
1201: && INSN_UID (label) <= max_uid)
1202: flags
1203: = (uid_to_ruid[INSN_UID (label)] > uid_to_ruid[INSN_UID (insn)])
1204: ? ATTR_FLAG_forward : ATTR_FLAG_backward;
1205: /* No valid direction information. */
1206: else
1207: flags = 0;
1208:
1209: /* If insn is a conditional branch call mostly_true_jump to get
1210: determine the branch prediction.
1211:
1212: Non conditional branches are predicted as very likely taken. */
1213: if (GET_CODE (insn) == JUMP_INSN
1214: && condjump_p (insn))
1215: {
1216: int prediction;
1217:
1218: prediction = mostly_true_jump (insn, get_branch_condition (insn, label));
1219: switch (prediction)
1220: {
1221: case 2:
1222: flags |= (ATTR_FLAG_very_likely | ATTR_FLAG_likely);
1223: break;
1224: case 1:
1225: flags |= ATTR_FLAG_likely;
1226: break;
1227: case 0:
1228: flags |= ATTR_FLAG_unlikely;
1229: break;
1230: case -1:
1231: flags |= (ATTR_FLAG_very_unlikely | ATTR_FLAG_unlikely);
1232: break;
1233:
1234: default:
1235: abort();
1236: }
1237: }
1238: else
1239: flags |= (ATTR_FLAG_very_likely | ATTR_FLAG_likely);
1240:
1241: return flags;
1242: }
1243:
1244: /* Return 1 if INSN is a destination that will be branched to rarely (the
1245: return point of a function); return 2 if DEST will be branched to very
1246: rarely (a call to a function that doesn't return). Otherwise,
1247: return 0. */
1248:
1249: static int
1250: rare_destination (insn)
1251: rtx insn;
1252: {
1253: int jump_count = 0;
1254: rtx next;
1255:
1256: for (; insn; insn = next)
1257: {
1258: if (GET_CODE (insn) == INSN && GET_CODE (PATTERN (insn)) == SEQUENCE)
1259: insn = XVECEXP (PATTERN (insn), 0, 0);
1260:
1261: next = NEXT_INSN (insn);
1262:
1263: switch (GET_CODE (insn))
1264: {
1265: case CODE_LABEL:
1266: return 0;
1267: case BARRIER:
1268: /* A BARRIER can either be after a JUMP_INSN or a CALL_INSN. We
1269: don't scan past JUMP_INSNs, so any barrier we find here must
1270: have been after a CALL_INSN and hence mean the call doesn't
1271: return. */
1272: return 2;
1273: case JUMP_INSN:
1274: if (GET_CODE (PATTERN (insn)) == RETURN)
1275: return 1;
1276: else if (simplejump_p (insn)
1277: && jump_count++ < 10)
1278: next = JUMP_LABEL (insn);
1279: else
1280: return 0;
1281: }
1282: }
1283:
1284: /* If we got here it means we hit the end of the function. So this
1285: is an unlikely destination. */
1286:
1287: return 1;
1288: }
1289:
1290: /* Return truth value of the statement that this branch
1291: is mostly taken. If we think that the branch is extremely likely
1292: to be taken, we return 2. If the branch is slightly more likely to be
1293: taken, return 1. If the branch is slightly less likely to be taken,
1294: return 0 and if the branch is highly unlikely to be taken, return -1.
1295:
1296: CONDITION, if non-zero, is the condition that JUMP_INSN is testing. */
1297:
1298: static int
1299: mostly_true_jump (jump_insn, condition)
1300: rtx jump_insn, condition;
1301: {
1302: rtx target_label = JUMP_LABEL (jump_insn);
1303: rtx insn;
1304: int rare_dest = rare_destination (target_label);
1305: int rare_fallthrough = rare_destination (NEXT_INSN (jump_insn));
1306:
1307: /* If this is a branch outside a loop, it is highly unlikely. */
1308: if (GET_CODE (PATTERN (jump_insn)) == SET
1309: && GET_CODE (SET_SRC (PATTERN (jump_insn))) == IF_THEN_ELSE
1310: && ((GET_CODE (XEXP (SET_SRC (PATTERN (jump_insn)), 1)) == LABEL_REF
1311: && LABEL_OUTSIDE_LOOP_P (XEXP (SET_SRC (PATTERN (jump_insn)), 1)))
1312: || (GET_CODE (XEXP (SET_SRC (PATTERN (jump_insn)), 2)) == LABEL_REF
1313: && LABEL_OUTSIDE_LOOP_P (XEXP (SET_SRC (PATTERN (jump_insn)), 2)))))
1314: return -1;
1315:
1316: if (target_label)
1317: {
1318: /* If this is the test of a loop, it is very likely true. We scan
1319: backwards from the target label. If we find a NOTE_INSN_LOOP_BEG
1320: before the next real insn, we assume the branch is to the top of
1321: the loop. */
1322: for (insn = PREV_INSN (target_label);
1323: insn && GET_CODE (insn) == NOTE;
1324: insn = PREV_INSN (insn))
1325: if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG)
1326: return 2;
1327:
1328: /* If this is a jump to the test of a loop, it is likely true. We scan
1329: forwards from the target label. If we find a NOTE_INSN_LOOP_VTOP
1330: before the next real insn, we assume the branch is to the loop branch
1331: test. */
1332: for (insn = NEXT_INSN (target_label);
1333: insn && GET_CODE (insn) == NOTE;
1334: insn = PREV_INSN (insn))
1335: if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_VTOP)
1336: return 1;
1337: }
1338:
1339: /* Look at the relative rarities of the fallthough and destination. If
1340: they differ, we can predict the branch that way. */
1341:
1342: switch (rare_fallthrough - rare_dest)
1343: {
1344: case -2:
1345: return -1;
1346: case -1:
1347: return 0;
1348: case 0:
1349: break;
1350: case 1:
1351: return 1;
1352: case 2:
1353: return 2;
1354: }
1355:
1356: /* If we couldn't figure out what this jump was, assume it won't be
1357: taken. This should be rare. */
1358: if (condition == 0)
1359: return 0;
1360:
1361: /* EQ tests are usually false and NE tests are usually true. Also,
1362: most quantities are positive, so we can make the appropriate guesses
1363: about signed comparisons against zero. */
1364: switch (GET_CODE (condition))
1365: {
1366: case CONST_INT:
1367: /* Unconditional branch. */
1368: return 1;
1369: case EQ:
1370: return 0;
1371: case NE:
1372: return 1;
1373: case LE:
1374: case LT:
1375: if (XEXP (condition, 1) == const0_rtx)
1376: return 0;
1377: break;
1378: case GE:
1379: case GT:
1380: if (XEXP (condition, 1) == const0_rtx)
1381: return 1;
1382: break;
1383: }
1384:
1385: /* Predict backward branches usually take, forward branches usually not. If
1386: we don't know whether this is forward or backward, assume the branch
1387: will be taken, since most are. */
1388: return (target_label == 0 || INSN_UID (jump_insn) > max_uid
1389: || INSN_UID (target_label) > max_uid
1390: || (uid_to_ruid[INSN_UID (jump_insn)]
1391: > uid_to_ruid[INSN_UID (target_label)]));;
1392: }
1393:
1394: /* Return the condition under which INSN will branch to TARGET. If TARGET
1395: is zero, return the condition under which INSN will return. If INSN is
1396: an unconditional branch, return const_true_rtx. If INSN isn't a simple
1397: type of jump, or it doesn't go to TARGET, return 0. */
1398:
1399: static rtx
1400: get_branch_condition (insn, target)
1401: rtx insn;
1402: rtx target;
1403: {
1404: rtx pat = PATTERN (insn);
1405: rtx src;
1406:
1407: if (GET_CODE (pat) == RETURN)
1408: return target == 0 ? const_true_rtx : 0;
1409:
1410: else if (GET_CODE (pat) != SET || SET_DEST (pat) != pc_rtx)
1411: return 0;
1412:
1413: src = SET_SRC (pat);
1414: if (GET_CODE (src) == LABEL_REF && XEXP (src, 0) == target)
1415: return const_true_rtx;
1416:
1417: else if (GET_CODE (src) == IF_THEN_ELSE
1418: && ((target == 0 && GET_CODE (XEXP (src, 1)) == RETURN)
1419: || (GET_CODE (XEXP (src, 1)) == LABEL_REF
1420: && XEXP (XEXP (src, 1), 0) == target))
1421: && XEXP (src, 2) == pc_rtx)
1422: return XEXP (src, 0);
1423:
1424: else if (GET_CODE (src) == IF_THEN_ELSE
1425: && ((target == 0 && GET_CODE (XEXP (src, 2)) == RETURN)
1426: || (GET_CODE (XEXP (src, 2)) == LABEL_REF
1427: && XEXP (XEXP (src, 2), 0) == target))
1428: && XEXP (src, 1) == pc_rtx)
1429: return gen_rtx (reverse_condition (GET_CODE (XEXP (src, 0))),
1430: GET_MODE (XEXP (src, 0)),
1431: XEXP (XEXP (src, 0), 0), XEXP (XEXP (src, 0), 1));
1432:
1433: return 0;
1434: }
1435:
1436: /* Return non-zero if CONDITION is more strict than the condition of
1437: INSN, i.e., if INSN will always branch if CONDITION is true. */
1438:
1439: static int
1440: condition_dominates_p (condition, insn)
1441: rtx condition;
1442: rtx insn;
1443: {
1444: rtx other_condition = get_branch_condition (insn, JUMP_LABEL (insn));
1445: enum rtx_code code = GET_CODE (condition);
1446: enum rtx_code other_code;
1447:
1448: if (rtx_equal_p (condition, other_condition)
1449: || other_condition == const_true_rtx)
1450: return 1;
1451:
1452: else if (condition == const_true_rtx || other_condition == 0)
1453: return 0;
1454:
1455: other_code = GET_CODE (other_condition);
1456: if (GET_RTX_LENGTH (code) != 2 || GET_RTX_LENGTH (other_code) != 2
1457: || ! rtx_equal_p (XEXP (condition, 0), XEXP (other_condition, 0))
1458: || ! rtx_equal_p (XEXP (condition, 1), XEXP (other_condition, 1)))
1459: return 0;
1460:
1461: return comparison_dominates_p (code, other_code);
1462: }
1463:
1464: /* Return non-zero if redirecting JUMP to NEWLABEL does not invalidate
1465: any insns already in the delay slot of JUMP. */
1466:
1467: static int
1468: redirect_with_delay_slots_safe_p (jump, newlabel, seq)
1469: rtx jump, newlabel, seq;
1470: {
1471: int flags, slots, i;
1472: rtx pat = PATTERN (seq);
1473:
1474: /* Make sure all the delay slots of this jump would still
1475: be valid after threading the jump. If they are still
1476: valid, then return non-zero. */
1477:
1478: flags = get_jump_flags (jump, newlabel);
1479: for (i = 1; i < XVECLEN (pat, 0); i++)
1480: if (! (
1481: #ifdef ANNUL_IFFALSE_SLOTS
1482: (INSN_ANNULLED_BRANCH_P (jump)
1483: && INSN_FROM_TARGET_P (XVECEXP (pat, 0, i)))
1484: ? eligible_for_annul_false (jump, i - 1,
1485: XVECEXP (pat, 0, i), flags) :
1486: #endif
1487: #ifdef ANNUL_IFTRUE_SLOTS
1488: (INSN_ANNULLED_BRANCH_P (jump)
1489: && ! INSN_FROM_TARGET_P (XVECEXP (pat, 0, i)))
1490: ? eligible_for_annul_true (jump, i - 1,
1491: XVECEXP (pat, 0, i), flags) :
1492: #endif
1493: eligible_for_delay (jump, i -1, XVECEXP (pat, 0, i), flags)))
1494: break;
1495:
1496: return (i == XVECLEN (pat, 0));
1497: }
1498:
1499: /* Return non-zero if redirecting JUMP to NEWLABEL does not invalidate
1500: any insns we wish to place in the delay slot of JUMP. */
1501:
1502: static int
1503: redirect_with_delay_list_safe_p (jump, newlabel, delay_list)
1504: rtx jump, newlabel, delay_list;
1505: {
1506: int flags, i;
1507: rtx li;
1508:
1509: /* Make sure all the insns in DELAY_LIST would still be
1510: valid after threading the jump. If they are still
1511: valid, then return non-zero. */
1512:
1513: flags = get_jump_flags (jump, newlabel);
1514: for (li = delay_list, i = 0; li; li = XEXP (li, 1), i++)
1515: if (! (
1516: #ifdef ANNUL_IFFALSE_SLOTS
1517: (INSN_ANNULLED_BRANCH_P (jump)
1518: && INSN_FROM_TARGET_P (XEXP (li, 0)))
1519: ? eligible_for_annul_false (jump, i - 1, XEXP (li, 0), flags) :
1520: #endif
1521: #ifdef ANNUL_IFTRUE_SLOTS
1522: (INSN_ANNULLED_BRANCH_P (jump)
1523: && ! INSN_FROM_TARGET_P (XEXP (li, 0)))
1524: ? eligible_for_annul_true (jump, i - 1, XEXP (li, 0), flags) :
1525: #endif
1526: eligible_for_delay (jump, i - 1, XEXP (li, 0), flags)))
1527: break;
1528:
1529: return (li == NULL);
1530: }
1531:
1532:
1533: /* INSN branches to an insn whose pattern SEQ is a SEQUENCE. Given that
1534: the condition tested by INSN is CONDITION and the resources shown in
1535: OTHER_NEEDED are needed after INSN, see whether INSN can take all the insns
1536: from SEQ's delay list, in addition to whatever insns it may execute
1537: (in DELAY_LIST). SETS and NEEDED are denote resources already set and
1538: needed while searching for delay slot insns. Return the concatenated
1539: delay list if possible, otherwise, return 0.
1540:
1541: SLOTS_TO_FILL is the total number of slots required by INSN, and
1542: PSLOTS_FILLED points to the number filled so far (also the number of
1543: insns in DELAY_LIST). It is updated with the number that have been
1544: filled from the SEQUENCE, if any.
1545:
1546: PANNUL_P points to a non-zero value if we already know that we need
1547: to annul INSN. If this routine determines that annulling is needed,
1548: it may set that value non-zero.
1549:
1550: PNEW_THREAD points to a location that is to receive the place at which
1551: execution should continue. */
1552:
1553: static rtx
1554: steal_delay_list_from_target (insn, condition, seq, delay_list,
1555: sets, needed, other_needed,
1556: slots_to_fill, pslots_filled, pannul_p,
1557: pnew_thread)
1558: rtx insn, condition;
1559: rtx seq;
1560: rtx delay_list;
1561: struct resources *sets, *needed, *other_needed;
1562: int slots_to_fill;
1563: int *pslots_filled;
1564: int *pannul_p;
1565: rtx *pnew_thread;
1566: {
1567: rtx temp;
1568: int slots_remaining = slots_to_fill - *pslots_filled;
1569: int total_slots_filled = *pslots_filled;
1570: rtx new_delay_list = 0;
1571: int must_annul = *pannul_p;
1572: int i;
1573:
1574: /* We can't do anything if there are more delay slots in SEQ than we
1575: can handle, or if we don't know that it will be a taken branch.
1576:
1577: We know that it will be a taken branch if it is either an unconditional
1578: branch or a conditional branch with a stricter branch condition. */
1579:
1580: if (XVECLEN (seq, 0) - 1 > slots_remaining
1581: || ! condition_dominates_p (condition, XVECEXP (seq, 0, 0)))
1582: return delay_list;
1583:
1584: for (i = 1; i < XVECLEN (seq, 0); i++)
1585: {
1586: rtx trial = XVECEXP (seq, 0, i);
1587: int flags;
1588:
1589: if (insn_references_resource_p (trial, sets, 0)
1590: || insn_sets_resource_p (trial, needed, 0)
1591: || insn_sets_resource_p (trial, sets, 0)
1592: #ifdef HAVE_cc0
1593: /* If TRIAL sets CC0, we can't copy it, so we can't steal this
1594: delay list. */
1595: || find_reg_note (trial, REG_CC_USER, NULL_RTX)
1596: #endif
1597: /* If TRIAL is from the fallthrough code of an annulled branch insn
1598: in SEQ, we cannot use it. */
1599: || (INSN_ANNULLED_BRANCH_P (XVECEXP (seq, 0, 0))
1600: && ! INSN_FROM_TARGET_P (trial)))
1601: return delay_list;
1602:
1603: /* If this insn was already done (usually in a previous delay slot),
1604: pretend we put it in our delay slot. */
1605: if (redundant_insn_p (trial, insn, new_delay_list))
1606: continue;
1607:
1608: /* We will end up re-vectoring this branch, so compute flags
1609: based on jumping to the new label. */
1610: flags = get_jump_flags (insn, JUMP_LABEL (XVECEXP (seq, 0, 0)));
1611:
1612: if (! must_annul
1613: && ((condition == const_true_rtx
1614: || (! insn_sets_resource_p (trial, other_needed, 0)
1615: && ! may_trap_p (PATTERN (trial)))))
1616: ? eligible_for_delay (insn, total_slots_filled, trial, flags)
1617: : (must_annul = 1,
1618: eligible_for_annul_false (insn, total_slots_filled, trial, flags)))
1619: {
1620: temp = copy_rtx (trial);
1621: INSN_FROM_TARGET_P (temp) = 1;
1622: new_delay_list = add_to_delay_list (temp, new_delay_list);
1623: total_slots_filled++;
1624:
1625: if (--slots_remaining == 0)
1626: break;
1627: }
1628: else
1629: return delay_list;
1630: }
1631:
1632: /* Show the place to which we will be branching. */
1633: *pnew_thread = next_active_insn (JUMP_LABEL (XVECEXP (seq, 0, 0)));
1634:
1635: /* Add any new insns to the delay list and update the count of the
1636: number of slots filled. */
1637: *pslots_filled = total_slots_filled;
1638: *pannul_p = must_annul;
1639:
1640: if (delay_list == 0)
1641: return new_delay_list;
1642:
1643: for (temp = new_delay_list; temp; temp = XEXP (temp, 1))
1644: delay_list = add_to_delay_list (XEXP (temp, 0), delay_list);
1645:
1646: return delay_list;
1647: }
1648:
1649: /* Similar to steal_delay_list_from_target except that SEQ is on the
1650: fallthrough path of INSN. Here we only do something if the delay insn
1651: of SEQ is an unconditional branch. In that case we steal its delay slot
1652: for INSN since unconditional branches are much easier to fill. */
1653:
1654: static rtx
1655: steal_delay_list_from_fallthrough (insn, condition, seq,
1656: delay_list, sets, needed, other_needed,
1657: slots_to_fill, pslots_filled, pannul_p)
1658: rtx insn, condition;
1659: rtx seq;
1660: rtx delay_list;
1661: struct resources *sets, *needed, *other_needed;
1662: int slots_to_fill;
1663: int *pslots_filled;
1664: int *pannul_p;
1665: {
1666: int i;
1667: int flags;
1668:
1669: flags = get_jump_flags (insn, JUMP_LABEL (insn));
1670:
1671: /* We can't do anything if SEQ's delay insn isn't an
1672: unconditional branch. */
1673:
1674: if (! simplejump_p (XVECEXP (seq, 0, 0))
1675: && GET_CODE (PATTERN (XVECEXP (seq, 0, 0))) != RETURN)
1676: return delay_list;
1677:
1678: for (i = 1; i < XVECLEN (seq, 0); i++)
1679: {
1680: rtx trial = XVECEXP (seq, 0, i);
1681:
1682: /* If TRIAL sets CC0, stealing it will move it too far from the use
1683: of CC0. */
1684: if (insn_references_resource_p (trial, sets, 0)
1685: || insn_sets_resource_p (trial, needed, 0)
1686: || insn_sets_resource_p (trial, sets, 0)
1687: #ifdef HAVE_cc0
1688: || sets_cc0_p (PATTERN (trial))
1689: #endif
1690: )
1691:
1692: break;
1693:
1694: /* If this insn was already done, we don't need it. */
1695: if (redundant_insn_p (trial, insn, delay_list))
1696: {
1697: delete_from_delay_slot (trial);
1698: continue;
1699: }
1700:
1701: if (! *pannul_p
1702: && ((condition == const_true_rtx
1703: || (! insn_sets_resource_p (trial, other_needed, 0)
1704: && ! may_trap_p (PATTERN (trial)))))
1705: ? eligible_for_delay (insn, *pslots_filled, trial, flags)
1706: : (*pannul_p = 1,
1707: eligible_for_annul_true (insn, *pslots_filled, trial, flags)))
1708: {
1709: delete_from_delay_slot (trial);
1710: delay_list = add_to_delay_list (trial, delay_list);
1711:
1712: if (++(*pslots_filled) == slots_to_fill)
1713: break;
1714: }
1715: else
1716: break;
1717: }
1718:
1719: return delay_list;
1720: }
1721:
1722: /* Try merging insns starting at THREAD which match exactly the insns in
1723: INSN's delay list.
1724:
1725: If all insns were matched and the insn was previously annulling, the
1726: annul bit will be cleared.
1727:
1728: For each insn that is merged, if the branch is or will be non-annulling,
1729: we delete the merged insn. */
1730:
1731: static void
1732: try_merge_delay_insns (insn, thread)
1733: rtx insn, thread;
1734: {
1735: rtx trial, next_trial;
1736: rtx delay_insn = XVECEXP (PATTERN (insn), 0, 0);
1737: int annul_p = INSN_ANNULLED_BRANCH_P (delay_insn);
1738: int slot_number = 1;
1739: int num_slots = XVECLEN (PATTERN (insn), 0);
1740: rtx next_to_match = XVECEXP (PATTERN (insn), 0, slot_number);
1741: struct resources set, needed;
1742: rtx merged_insns = 0;
1743: int i;
1744: int flags;
1745:
1746: flags = get_jump_flags (delay_insn, JUMP_LABEL (delay_insn));
1747:
1748: CLEAR_RESOURCE (&needed);
1749: CLEAR_RESOURCE (&set);
1750:
1751: /* If this is not an annulling branch, take into account anything needed in
1752: NEXT_TO_MATCH. This prevents two increments from being incorrectly
1753: folded into one. If we are annulling, this would be the correct
1754: thing to do. (The alternative, looking at things set in NEXT_TO_MATCH
1755: will essentially disable this optimization. This method is somewhat of
1756: a kludge, but I don't see a better way.) */
1757: if (! annul_p)
1758: mark_referenced_resources (next_to_match, &needed, 1);
1759:
1760: for (trial = thread; !stop_search_p (trial, 1); trial = next_trial)
1761: {
1762: rtx pat = PATTERN (trial);
1763:
1764: next_trial = next_nonnote_insn (trial);
1765:
1766: /* TRIAL must be a CALL_INSN or INSN. Skip USE and CLOBBER. */
1767: if (GET_CODE (trial) == INSN
1768: && (GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER))
1769: continue;
1770:
1771: if (GET_CODE (next_to_match) == GET_CODE (trial)
1772: #ifdef HAVE_cc0
1773: /* We can't share an insn that sets cc0. */
1774: && ! sets_cc0_p (pat)
1775: #endif
1776: && ! insn_references_resource_p (trial, &set, 1)
1777: && ! insn_sets_resource_p (trial, &set, 1)
1778: && ! insn_sets_resource_p (trial, &needed, 1)
1779: && (trial = try_split (pat, trial, 0)) != 0
1780: && rtx_equal_p (PATTERN (next_to_match), PATTERN (trial))
1781: /* Have to test this condition if annul condition is different
1782: from (and less restrictive than) non-annulling one. */
1783: && eligible_for_delay (delay_insn, slot_number - 1, trial, flags))
1784: {
1785: next_trial = next_nonnote_insn (trial);
1786:
1787: if (! annul_p)
1788: {
1789: update_block (trial, thread);
1790: delete_insn (trial);
1791: INSN_FROM_TARGET_P (next_to_match) = 0;
1792: }
1793: else
1794: merged_insns = gen_rtx (INSN_LIST, VOIDmode, trial, merged_insns);
1795:
1796: if (++slot_number == num_slots)
1797: break;
1798:
1799: next_to_match = XVECEXP (PATTERN (insn), 0, slot_number);
1800: if (! annul_p)
1801: mark_referenced_resources (next_to_match, &needed, 1);
1802: }
1803:
1804: mark_set_resources (trial, &set, 0, 1);
1805: mark_referenced_resources (trial, &needed, 1);
1806: }
1807:
1808: /* See if we stopped on a filled insn. If we did, try to see if its
1809: delay slots match. */
1810: if (slot_number != num_slots
1811: && trial && GET_CODE (trial) == INSN
1812: && GET_CODE (PATTERN (trial)) == SEQUENCE
1813: && ! INSN_ANNULLED_BRANCH_P (XVECEXP (PATTERN (trial), 0, 0)))
1814: {
1815: rtx pat = PATTERN (trial);
1816:
1817: for (i = 1; i < XVECLEN (pat, 0); i++)
1818: {
1819: rtx dtrial = XVECEXP (pat, 0, i);
1820:
1821: if (! insn_references_resource_p (dtrial, &set, 1)
1822: && ! insn_sets_resource_p (dtrial, &set, 1)
1823: && ! insn_sets_resource_p (dtrial, &needed, 1)
1824: #ifdef HAVE_cc0
1825: && ! sets_cc0_p (PATTERN (dtrial))
1826: #endif
1827: && rtx_equal_p (PATTERN (next_to_match), PATTERN (dtrial))
1828: && eligible_for_delay (delay_insn, slot_number - 1, dtrial, flags))
1829: {
1830: if (! annul_p)
1831: {
1832: update_block (dtrial, thread);
1833: delete_from_delay_slot (dtrial);
1834: INSN_FROM_TARGET_P (next_to_match) = 0;
1835: }
1836: else
1837: merged_insns = gen_rtx (INSN_LIST, SImode, dtrial,
1838: merged_insns);
1839:
1840: if (++slot_number == num_slots)
1841: break;
1842:
1843: next_to_match = XVECEXP (PATTERN (insn), 0, slot_number);
1844: }
1845: }
1846: }
1847:
1848: /* If all insns in the delay slot have been matched and we were previously
1849: annulling the branch, we need not any more. In that case delete all the
1850: merged insns. Also clear the INSN_FROM_TARGET_P bit of each insn the
1851: the delay list so that we know that it isn't only being used at the
1852: target. */
1853: if (next_to_match == 0 && annul_p)
1854: {
1855: for (; merged_insns; merged_insns = XEXP (merged_insns, 1))
1856: {
1857: if (GET_MODE (merged_insns) == SImode)
1858: {
1859: update_block (XEXP (merged_insns, 0), thread);
1860: delete_from_delay_slot (XEXP (merged_insns, 0));
1861: }
1862: else
1863: {
1864: update_block (XEXP (merged_insns, 0), thread);
1865: delete_insn (XEXP (merged_insns, 0));
1866: }
1867: }
1868:
1869: INSN_ANNULLED_BRANCH_P (delay_insn) = 0;
1870:
1871: for (i = 0; i < XVECLEN (PATTERN (insn), 0); i++)
1872: INSN_FROM_TARGET_P (XVECEXP (PATTERN (insn), 0, i)) = 0;
1873: }
1874: }
1875:
1876: /* See if INSN is redundant with an insn in front of TARGET. Often this
1877: is called when INSN is a candidate for a delay slot of TARGET.
1878: DELAY_LIST are insns that will be placed in delay slots of TARGET in front
1879: of INSN. Often INSN will be redundant with an insn in a delay slot of
1880: some previous insn. This happens when we have a series of branches to the
1881: same label; in that case the first insn at the target might want to go
1882: into each of the delay slots.
1883:
1884: If we are not careful, this routine can take up a significant fraction
1885: of the total compilation time (4%), but only wins rarely. Hence we
1886: speed this routine up by making two passes. The first pass goes back
1887: until it hits a label and sees if it find an insn with an identical
1888: pattern. Only in this (relatively rare) event does it check for
1889: data conflicts.
1890:
1891: We do not split insns we encounter. This could cause us not to find a
1892: redundant insn, but the cost of splitting seems greater than the possible
1893: gain in rare cases. */
1894:
1895: static int
1896: redundant_insn_p (insn, target, delay_list)
1897: rtx insn;
1898: rtx target;
1899: rtx delay_list;
1900: {
1901: rtx target_main = target;
1902: rtx ipat = PATTERN (insn);
1903: rtx trial, pat;
1904: struct resources needed, set;
1905: int i;
1906:
1907: /* Scan backwards looking for a match. */
1908: for (trial = PREV_INSN (target); trial; trial = PREV_INSN (trial))
1909: {
1910: if (GET_CODE (trial) == CODE_LABEL)
1911: return 0;
1912:
1913: if (GET_RTX_CLASS (GET_CODE (trial)) != 'i')
1914: continue;
1915:
1916: pat = PATTERN (trial);
1917: if (GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER)
1918: continue;
1919:
1920: if (GET_CODE (pat) == SEQUENCE)
1921: {
1922: /* Stop for a CALL and its delay slots because it is difficult to
1923: track its resource needs correctly. */
1924: if (GET_CODE (XVECEXP (pat, 0, 0)) == CALL_INSN)
1925: return 0;
1926:
1927: /* Stop for an INSN or JUMP_INSN with delayed effects and its delay
1928: slots because it is difficult to track its resource needs
1929: correctly. */
1930:
1931: #ifdef INSN_SETS_ARE_DELAYED
1932: if (INSN_SETS_ARE_DELAYED (XVECEXP (pat, 0, 0)))
1933: return 0;
1934: #endif
1935:
1936: #ifdef INSN_REFERENCES_ARE_DELAYED
1937: if (INSN_REFERENCES_ARE_DELAYED (XVECEXP (pat, 0, 0)))
1938: return 0;
1939: #endif
1940:
1941: /* See if any of the insns in the delay slot match, updating
1942: resource requirements as we go. */
1943: for (i = XVECLEN (pat, 0) - 1; i > 0; i--)
1944: if (GET_CODE (XVECEXP (pat, 0, i)) == GET_CODE (insn)
1945: && rtx_equal_p (PATTERN (XVECEXP (pat, 0, i)), ipat))
1946: break;
1947:
1948: /* If found a match, exit this loop early. */
1949: if (i > 0)
1950: break;
1951: }
1952:
1953: else if (GET_CODE (trial) == GET_CODE (insn) && rtx_equal_p (pat, ipat))
1954: break;
1955: }
1956:
1957: /* If we didn't find an insn that matches, return 0. */
1958: if (trial == 0)
1959: return 0;
1960:
1961: /* See what resources this insn sets and needs. If they overlap, or
1962: if this insn references CC0, it can't be redundant. */
1963:
1964: CLEAR_RESOURCE (&needed);
1965: CLEAR_RESOURCE (&set);
1966: mark_set_resources (insn, &set, 0, 1);
1967: mark_referenced_resources (insn, &needed, 1);
1968:
1969: /* If TARGET is a SEQUENCE, get the main insn. */
1970: if (GET_CODE (target) == INSN && GET_CODE (PATTERN (target)) == SEQUENCE)
1971: target_main = XVECEXP (PATTERN (target), 0, 0);
1972:
1973: if (resource_conflicts_p (&needed, &set)
1974: #ifdef HAVE_cc0
1975: || reg_mentioned_p (cc0_rtx, ipat)
1976: #endif
1977: /* The insn requiring the delay may not set anything needed or set by
1978: INSN. */
1979: || insn_sets_resource_p (target_main, &needed, 1)
1980: || insn_sets_resource_p (target_main, &set, 1))
1981: return 0;
1982:
1983: /* Insns we pass may not set either NEEDED or SET, so merge them for
1984: simpler tests. */
1985: needed.memory |= set.memory;
1986: IOR_HARD_REG_SET (needed.regs, set.regs);
1987:
1988: /* This insn isn't redundant if it conflicts with an insn that either is
1989: or will be in a delay slot of TARGET. */
1990:
1991: while (delay_list)
1992: {
1993: if (insn_sets_resource_p (XEXP (delay_list, 0), &needed, 1))
1994: return 0;
1995: delay_list = XEXP (delay_list, 1);
1996: }
1997:
1998: if (GET_CODE (target) == INSN && GET_CODE (PATTERN (target)) == SEQUENCE)
1999: for (i = 1; i < XVECLEN (PATTERN (target), 0); i++)
2000: if (insn_sets_resource_p (XVECEXP (PATTERN (target), 0, i), &needed, 1))
2001: return 0;
2002:
2003: /* Scan backwards until we reach a label or an insn that uses something
2004: INSN sets or sets something insn uses or sets. */
2005:
2006: for (trial = PREV_INSN (target);
2007: trial && GET_CODE (trial) != CODE_LABEL;
2008: trial = PREV_INSN (trial))
2009: {
2010: if (GET_CODE (trial) != INSN && GET_CODE (trial) != CALL_INSN
2011: && GET_CODE (trial) != JUMP_INSN)
2012: continue;
2013:
2014: pat = PATTERN (trial);
2015: if (GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER)
2016: continue;
2017:
2018: if (GET_CODE (pat) == SEQUENCE)
2019: {
2020: /* If this is a CALL_INSN and its delay slots, it is hard to track
2021: the resource needs properly, so give up. */
2022: if (GET_CODE (XVECEXP (pat, 0, 0)) == CALL_INSN)
2023: return 0;
2024:
2025: /* If this this is an INSN or JUMP_INSN with delayed effects, it
2026: is hard to track the resource needs properly, so give up. */
2027:
2028: #ifdef INSN_SETS_ARE_DELAYED
2029: if (INSN_SETS_ARE_DELAYED (XVECEXP (pat, 0, 0)))
2030: return 0;
2031: #endif
2032:
2033: #ifdef INSN_REFERENCES_ARE_DELAYED
2034: if (INSN_REFERENCES_ARE_DELAYED (XVECEXP (pat, 0, 0)))
2035: return 0;
2036: #endif
2037:
2038: /* See if any of the insns in the delay slot match, updating
2039: resource requirements as we go. */
2040: for (i = XVECLEN (pat, 0) - 1; i > 0; i--)
2041: {
2042: rtx candidate = XVECEXP (pat, 0, i);
2043:
2044: /* If an insn will be annulled if the branch is false, it isn't
2045: considered as a possible duplicate insn. */
2046: if (rtx_equal_p (PATTERN (candidate), ipat)
2047: && ! (INSN_ANNULLED_BRANCH_P (XVECEXP (pat, 0, 0))
2048: && INSN_FROM_TARGET_P (candidate)))
2049: {
2050: /* Show that this insn will be used in the sequel. */
2051: INSN_FROM_TARGET_P (candidate) = 0;
2052: return 1;
2053: }
2054:
2055: /* Unless this is an annulled insn from the target of a branch,
2056: we must stop if it sets anything needed or set by INSN. */
2057: if ((! INSN_ANNULLED_BRANCH_P (XVECEXP (pat, 0, 0))
2058: || ! INSN_FROM_TARGET_P (candidate))
2059: && insn_sets_resource_p (candidate, &needed, 1))
2060: return 0;
2061: }
2062:
2063:
2064: /* If the insn requiring the delay slot conflicts with INSN, we
2065: must stop. */
2066: if (insn_sets_resource_p (XVECEXP (pat, 0, 0), &needed, 1))
2067: return 0;
2068: }
2069: else
2070: {
2071: /* See if TRIAL is the same as INSN. */
2072: pat = PATTERN (trial);
2073: if (rtx_equal_p (pat, ipat))
2074: return 1;
2075:
2076: /* Can't go any further if TRIAL conflicts with INSN. */
2077: if (insn_sets_resource_p (trial, &needed, 1))
2078: return 0;
2079: }
2080: }
2081:
2082: return 0;
2083: }
2084:
2085: /* Return 1 if THREAD can only be executed in one way. If LABEL is non-zero,
2086: it is the target of the branch insn being scanned. If ALLOW_FALLTHROUGH
2087: is non-zero, we are allowed to fall into this thread; otherwise, we are
2088: not.
2089:
2090: If LABEL is used more than one or we pass a label other than LABEL before
2091: finding an active insn, we do not own this thread. */
2092:
2093: static int
2094: own_thread_p (thread, label, allow_fallthrough)
2095: rtx thread;
2096: rtx label;
2097: int allow_fallthrough;
2098: {
2099: rtx active_insn;
2100: rtx insn;
2101:
2102: /* We don't own the function end. */
2103: if (thread == 0)
2104: return 0;
2105:
2106: /* Get the first active insn, or THREAD, if it is an active insn. */
2107: active_insn = next_active_insn (PREV_INSN (thread));
2108:
2109: for (insn = thread; insn != active_insn; insn = NEXT_INSN (insn))
2110: if (GET_CODE (insn) == CODE_LABEL
2111: && (insn != label || LABEL_NUSES (insn) != 1))
2112: return 0;
2113:
2114: if (allow_fallthrough)
2115: return 1;
2116:
2117: /* Ensure that we reach a BARRIER before any insn or label. */
2118: for (insn = prev_nonnote_insn (thread);
2119: insn == 0 || GET_CODE (insn) != BARRIER;
2120: insn = prev_nonnote_insn (insn))
2121: if (insn == 0
2122: || GET_CODE (insn) == CODE_LABEL
2123: || (GET_CODE (insn) == INSN
2124: && GET_CODE (PATTERN (insn)) != USE
2125: && GET_CODE (PATTERN (insn)) != CLOBBER))
2126: return 0;
2127:
2128: return 1;
2129: }
2130:
2131: /* Find the number of the basic block that starts closest to INSN. Return -1
2132: if we couldn't find such a basic block. */
2133:
2134: static int
2135: find_basic_block (insn)
2136: rtx insn;
2137: {
2138: int i;
2139:
2140: /* Scan backwards to the previous BARRIER. Then see if we can find a
2141: label that starts a basic block. Return the basic block number. */
2142:
2143: for (insn = prev_nonnote_insn (insn);
2144: insn && GET_CODE (insn) != BARRIER;
2145: insn = prev_nonnote_insn (insn))
2146: ;
2147:
2148: /* The start of the function is basic block zero. */
2149: if (insn == 0)
2150: return 0;
2151:
2152: /* See if any of the upcoming CODE_LABELs start a basic block. If we reach
2153: anything other than a CODE_LABEL or note, we can't find this code. */
2154: for (insn = next_nonnote_insn (insn);
2155: insn && GET_CODE (insn) == CODE_LABEL;
2156: insn = next_nonnote_insn (insn))
2157: {
2158: for (i = 0; i < n_basic_blocks; i++)
2159: if (insn == basic_block_head[i])
2160: return i;
2161: }
2162:
2163: return -1;
2164: }
2165:
2166: /* Called when INSN is being moved from a location near the target of a jump.
2167: We leave a marker of the form (use (INSN)) immediately in front
2168: of WHERE for mark_target_live_regs. These markers will be deleted when
2169: reorg finishes.
2170:
2171: We used to try to update the live status of registers if WHERE is at
2172: the start of a basic block, but that can't work since we may remove a
2173: BARRIER in relax_delay_slots. */
2174:
2175: static void
2176: update_block (insn, where)
2177: rtx insn;
2178: rtx where;
2179: {
2180: int b;
2181:
2182: /* Ignore if this was in a delay slot and it came from the target of
2183: a branch. */
2184: if (INSN_FROM_TARGET_P (insn))
2185: return;
2186:
2187: emit_insn_before (gen_rtx (USE, VOIDmode, insn), where);
2188:
2189: /* INSN might be making a value live in a block where it didn't use to
2190: be. So recompute liveness information for this block. */
2191:
2192: b = find_basic_block (insn);
2193: if (b != -1)
2194: bb_ticks[b]++;
2195: }
2196:
2197: /* Similar to REDIRECT_JUMP except that we update the BB_TICKS entry for
2198: the basic block containing the jump. */
2199:
2200: static int
2201: reorg_redirect_jump (jump, nlabel)
2202: rtx jump;
2203: rtx nlabel;
2204: {
2205: int b = find_basic_block (jump);
2206:
2207: if (b != -1)
2208: bb_ticks[b]++;
2209:
2210: return redirect_jump (jump, nlabel);
2211: }
2212:
2213: /* Called when INSN is being moved forward into a delay slot of DELAYED_INSN.
2214: We check every instruction between INSN and DELAYED_INSN for REG_DEAD notes
2215: that reference values used in INSN. If we find one, then we move the
2216: REG_DEAD note to INSN.
2217:
2218: This is needed to handle the case where an later insn (after INSN) has a
2219: REG_DEAD note for a register used by INSN, and this later insn subsequently
2220: gets moved before a CODE_LABEL because it is a redundant insn. In this
2221: case, mark_target_live_regs may be confused into thinking the register
2222: is dead because it sees a REG_DEAD note immediately before a CODE_LABEL. */
2223:
2224: static void
2225: update_reg_dead_notes (insn, delayed_insn)
2226: rtx insn, delayed_insn;
2227: {
2228: rtx p, link, next;
2229:
2230: for (p = next_nonnote_insn (insn); p != delayed_insn;
2231: p = next_nonnote_insn (p))
2232: for (link = REG_NOTES (p); link; link = next)
2233: {
2234: next = XEXP (link, 1);
2235:
2236: if (REG_NOTE_KIND (link) != REG_DEAD
2237: || GET_CODE (XEXP (link, 0)) != REG)
2238: continue;
2239:
2240: if (reg_referenced_p (XEXP (link, 0), PATTERN (insn)))
2241: {
2242: /* Move the REG_DEAD note from P to INSN. */
2243: remove_note (p, link);
2244: XEXP (link, 1) = REG_NOTES (insn);
2245: REG_NOTES (insn) = link;
2246: }
2247: }
2248: }
2249:
2250: /* Marks registers possibly live at the current place being scanned by
2251: mark_target_live_regs. Used only by next two function. */
2252:
2253: static HARD_REG_SET current_live_regs;
2254:
2255: /* Marks registers for which we have seen a REG_DEAD note but no assignment.
2256: Also only used by the next two functions. */
2257:
2258: static HARD_REG_SET pending_dead_regs;
2259:
2260: /* Utility function called from mark_target_live_regs via note_stores.
2261: It deadens any CLOBBERed registers and livens any SET registers. */
2262:
2263: static void
2264: update_live_status (dest, x)
2265: rtx dest;
2266: rtx x;
2267: {
2268: int first_regno, last_regno;
2269: int i;
2270:
2271: if (GET_CODE (dest) != REG
2272: && (GET_CODE (dest) != SUBREG || GET_CODE (SUBREG_REG (dest)) != REG))
2273: return;
2274:
2275: if (GET_CODE (dest) == SUBREG)
2276: first_regno = REGNO (SUBREG_REG (dest)) + SUBREG_WORD (dest);
2277: else
2278: first_regno = REGNO (dest);
2279:
2280: last_regno = first_regno + HARD_REGNO_NREGS (first_regno, GET_MODE (dest));
2281:
2282: if (GET_CODE (x) == CLOBBER)
2283: for (i = first_regno; i < last_regno; i++)
2284: CLEAR_HARD_REG_BIT (current_live_regs, i);
2285: else
2286: for (i = first_regno; i < last_regno; i++)
2287: {
2288: SET_HARD_REG_BIT (current_live_regs, i);
2289: CLEAR_HARD_REG_BIT (pending_dead_regs, i);
2290: }
2291: }
2292:
2293: /* Similar to next_insn, but ignores insns in the delay slots of
2294: an annulled branch. */
2295:
2296: static rtx
2297: next_insn_no_annul (insn)
2298: rtx insn;
2299: {
2300: if (insn)
2301: {
2302: /* If INSN is an annulled branch, skip any insns from the target
2303: of the branch. */
2304: if (INSN_ANNULLED_BRANCH_P (insn)
2305: && NEXT_INSN (PREV_INSN (insn)) != insn)
2306: while (INSN_FROM_TARGET_P (NEXT_INSN (insn)))
2307: insn = NEXT_INSN (insn);
2308:
2309: insn = NEXT_INSN (insn);
2310: if (insn && GET_CODE (insn) == INSN
2311: && GET_CODE (PATTERN (insn)) == SEQUENCE)
2312: insn = XVECEXP (PATTERN (insn), 0, 0);
2313: }
2314:
2315: return insn;
2316: }
2317:
2318: /* Set the resources that are live at TARGET.
2319:
2320: If TARGET is zero, we refer to the end of the current function and can
2321: return our precomputed value.
2322:
2323: Otherwise, we try to find out what is live by consulting the basic block
2324: information. This is tricky, because we must consider the actions of
2325: reload and jump optimization, which occur after the basic block information
2326: has been computed.
2327:
2328: Accordingly, we proceed as follows::
2329:
2330: We find the previous BARRIER and look at all immediately following labels
2331: (with no intervening active insns) to see if any of them start a basic
2332: block. If we hit the start of the function first, we use block 0.
2333:
2334: Once we have found a basic block and a corresponding first insns, we can
2335: accurately compute the live status from basic_block_live_regs and
2336: reg_renumber. (By starting at a label following a BARRIER, we are immune
2337: to actions taken by reload and jump.) Then we scan all insns between
2338: that point and our target. For each CLOBBER (or for call-clobbered regs
2339: when we pass a CALL_INSN), mark the appropriate registers are dead. For
2340: a SET, mark them as live.
2341:
2342: We have to be careful when using REG_DEAD notes because they are not
2343: updated by such things as find_equiv_reg. So keep track of registers
2344: marked as dead that haven't been assigned to, and mark them dead at the
2345: next CODE_LABEL since reload and jump won't propagate values across labels.
2346:
2347: If we cannot find the start of a basic block (should be a very rare
2348: case, if it can happen at all), mark everything as potentially live.
2349:
2350: Next, scan forward from TARGET looking for things set or clobbered
2351: before they are used. These are not live.
2352:
2353: Because we can be called many times on the same target, save our results
2354: in a hash table indexed by INSN_UID. */
2355:
2356: static void
2357: mark_target_live_regs (target, res)
2358: rtx target;
2359: struct resources *res;
2360: {
2361: int b = -1;
2362: int i;
2363: struct target_info *tinfo;
2364: rtx insn, next;
2365: rtx jump_insn = 0;
2366: rtx jump_target;
2367: HARD_REG_SET scratch;
2368: struct resources set, needed;
2369: int jump_count = 0;
2370:
2371: /* Handle end of function. */
2372: if (target == 0)
2373: {
2374: *res = end_of_function_needs;
2375: return;
2376: }
2377:
2378: /* We have to assume memory is needed, but the CC isn't. */
2379: res->memory = 1;
2380: res->volatil = 0;
2381: res->cc = 0;
2382:
2383: /* See if we have computed this value already. */
2384: for (tinfo = target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME];
2385: tinfo; tinfo = tinfo->next)
2386: if (tinfo->uid == INSN_UID (target))
2387: break;
2388:
2389: /* Start by getting the basic block number. If we have saved information,
2390: we can get it from there unless the insn at the start of the basic block
2391: has been deleted. */
2392: if (tinfo && tinfo->block != -1
2393: && ! INSN_DELETED_P (basic_block_head[tinfo->block]))
2394: b = tinfo->block;
2395:
2396: if (b == -1)
2397: b = find_basic_block (target);
2398:
2399: if (tinfo)
2400: {
2401: /* If the information is up-to-date, use it. Otherwise, we will
2402: update it below. */
2403: if (b == tinfo->block && b != -1 && tinfo->bb_tick == bb_ticks[b])
2404: {
2405: COPY_HARD_REG_SET (res->regs, tinfo->live_regs);
2406: return;
2407: }
2408: }
2409: else
2410: {
2411: /* Allocate a place to put our results and chain it into the
2412: hash table. */
2413: tinfo = (struct target_info *) oballoc (sizeof (struct target_info));
2414: tinfo->uid = INSN_UID (target);
2415: tinfo->block = b;
2416: tinfo->next = target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME];
2417: target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME] = tinfo;
2418: }
2419:
2420: CLEAR_HARD_REG_SET (pending_dead_regs);
2421:
2422: /* If we found a basic block, get the live registers from it and update
2423: them with anything set or killed between its start and the insn before
2424: TARGET. Otherwise, we must assume everything is live. */
2425: if (b != -1)
2426: {
2427: regset regs_live = basic_block_live_at_start[b];
2428: int offset, j;
2429: REGSET_ELT_TYPE bit;
2430: int regno;
2431: rtx start_insn, stop_insn;
2432:
2433: /* Compute hard regs live at start of block -- this is the real hard regs
2434: marked live, plus live pseudo regs that have been renumbered to
2435: hard regs. */
2436:
2437: #ifdef HARD_REG_SET
2438: current_live_regs = *regs_live;
2439: #else
2440: COPY_HARD_REG_SET (current_live_regs, regs_live);
2441: #endif
2442:
2443: for (offset = 0, i = 0; offset < regset_size; offset++)
2444: {
2445: if (regs_live[offset] == 0)
2446: i += REGSET_ELT_BITS;
2447: else
2448: for (bit = 1; bit && i < max_regno; bit <<= 1, i++)
2449: if ((regs_live[offset] & bit)
2450: && (regno = reg_renumber[i]) >= 0)
2451: for (j = regno;
2452: j < regno + HARD_REGNO_NREGS (regno,
2453: PSEUDO_REGNO_MODE (i));
2454: j++)
2455: SET_HARD_REG_BIT (current_live_regs, j);
2456: }
2457:
2458: /* Get starting and ending insn, handling the case where each might
2459: be a SEQUENCE. */
2460: start_insn = (b == 0 ? get_insns () : basic_block_head[b]);
2461: stop_insn = target;
2462:
2463: if (GET_CODE (start_insn) == INSN
2464: && GET_CODE (PATTERN (start_insn)) == SEQUENCE)
2465: start_insn = XVECEXP (PATTERN (start_insn), 0, 0);
2466:
2467: if (GET_CODE (stop_insn) == INSN
2468: && GET_CODE (PATTERN (stop_insn)) == SEQUENCE)
2469: stop_insn = next_insn (PREV_INSN (stop_insn));
2470:
2471: for (insn = start_insn; insn != stop_insn;
2472: insn = next_insn_no_annul (insn))
2473: {
2474: rtx link;
2475: rtx real_insn = insn;
2476:
2477: /* If this insn is from the target of a branch, it isn't going to
2478: be used in the sequel. If it is used in both cases, this
2479: test will not be true. */
2480: if (INSN_FROM_TARGET_P (insn))
2481: continue;
2482:
2483: /* If this insn is a USE made by update_block, we care about the
2484: underlying insn. */
2485: if (GET_CODE (insn) == INSN && GET_CODE (PATTERN (insn)) == USE
2486: && GET_RTX_CLASS (GET_CODE (XEXP (PATTERN (insn), 0))) == 'i')
2487: real_insn = XEXP (PATTERN (insn), 0);
2488:
2489: if (GET_CODE (real_insn) == CALL_INSN)
2490: {
2491: /* CALL clobbers all call-used regs that aren't fixed except
2492: sp, ap, and fp. Do this before setting the result of the
2493: call live. */
2494: for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
2495: if (call_used_regs[i]
2496: && i != STACK_POINTER_REGNUM && i != FRAME_POINTER_REGNUM
2497: && i != ARG_POINTER_REGNUM
2498: #if HARD_FRAME_POINTER_REGNUM != FRAME_POINTER_REGNUM
2499: && i != HARD_FRAME_POINTER_REGNUM
2500: #endif
2501: #if ARG_POINTER_REGNUM != FRAME_POINTER_REGNUM
2502: && ! (i == ARG_POINTER_REGNUM && fixed_regs[i])
2503: #endif
2504: #ifdef PIC_OFFSET_TABLE_REGNUM
2505: && ! (i == PIC_OFFSET_TABLE_REGNUM && flag_pic)
2506: #endif
2507: )
2508: CLEAR_HARD_REG_BIT (current_live_regs, i);
2509:
2510: /* A CALL_INSN sets any global register live, since it may
2511: have been modified by the call. */
2512: for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
2513: if (global_regs[i])
2514: SET_HARD_REG_BIT (current_live_regs, i);
2515: }
2516:
2517: /* Mark anything killed in an insn to be deadened at the next
2518: label. Ignore USE insns; the only REG_DEAD notes will be for
2519: parameters. But they might be early. A CALL_INSN will usually
2520: clobber registers used for parameters. It isn't worth bothering
2521: with the unlikely case when it won't. */
2522: if ((GET_CODE (real_insn) == INSN
2523: && GET_CODE (PATTERN (real_insn)) != USE
2524: && GET_CODE (PATTERN (real_insn)) != CLOBBER)
2525: || GET_CODE (real_insn) == JUMP_INSN
2526: || GET_CODE (real_insn) == CALL_INSN)
2527: {
2528: for (link = REG_NOTES (real_insn); link; link = XEXP (link, 1))
2529: if (REG_NOTE_KIND (link) == REG_DEAD
2530: && GET_CODE (XEXP (link, 0)) == REG
2531: && REGNO (XEXP (link, 0)) < FIRST_PSEUDO_REGISTER)
2532: {
2533: int first_regno = REGNO (XEXP (link, 0));
2534: int last_regno
2535: = (first_regno
2536: + HARD_REGNO_NREGS (first_regno,
2537: GET_MODE (XEXP (link, 0))));
2538:
2539: for (i = first_regno; i < last_regno; i++)
2540: SET_HARD_REG_BIT (pending_dead_regs, i);
2541: }
2542:
2543: note_stores (PATTERN (real_insn), update_live_status);
2544:
2545: /* If any registers were unused after this insn, kill them.
2546: These notes will always be accurate. */
2547: for (link = REG_NOTES (real_insn); link; link = XEXP (link, 1))
2548: if (REG_NOTE_KIND (link) == REG_UNUSED
2549: && GET_CODE (XEXP (link, 0)) == REG
2550: && REGNO (XEXP (link, 0)) < FIRST_PSEUDO_REGISTER)
2551: {
2552: int first_regno = REGNO (XEXP (link, 0));
2553: int last_regno
2554: = (first_regno
2555: + HARD_REGNO_NREGS (first_regno,
2556: GET_MODE (XEXP (link, 0))));
2557:
2558: for (i = first_regno; i < last_regno; i++)
2559: CLEAR_HARD_REG_BIT (current_live_regs, i);
2560: }
2561: }
2562:
2563: else if (GET_CODE (real_insn) == CODE_LABEL)
2564: {
2565: /* A label clobbers the pending dead registers since neither
2566: reload nor jump will propagate a value across a label. */
2567: AND_COMPL_HARD_REG_SET (current_live_regs, pending_dead_regs);
2568: CLEAR_HARD_REG_SET (pending_dead_regs);
2569: }
2570:
2571: /* The beginning of the epilogue corresponds to the end of the
2572: RTL chain when there are no epilogue insns. Certain resources
2573: are implicitly required at that point. */
2574: else if (GET_CODE (real_insn) == NOTE
2575: && NOTE_LINE_NUMBER (real_insn) == NOTE_INSN_EPILOGUE_BEG)
2576: IOR_HARD_REG_SET (current_live_regs, start_of_epilogue_needs.regs);
2577: }
2578:
2579: COPY_HARD_REG_SET (res->regs, current_live_regs);
2580: tinfo->block = b;
2581: tinfo->bb_tick = bb_ticks[b];
2582: }
2583: else
2584: /* We didn't find the start of a basic block. Assume everything
2585: in use. This should happen only extremely rarely. */
2586: SET_HARD_REG_SET (res->regs);
2587:
2588: /* Now step forward from TARGET looking for registers that are set before
2589: they are used. These are dead. If we pass a label, any pending dead
2590: registers that weren't yet used can be made dead. Stop when we pass a
2591: conditional JUMP_INSN; follow the first few unconditional branches. */
2592:
2593: CLEAR_RESOURCE (&set);
2594: CLEAR_RESOURCE (&needed);
2595:
2596: for (insn = target; insn; insn = next)
2597: {
2598: rtx this_jump_insn = insn;
2599:
2600: next = NEXT_INSN (insn);
2601: switch (GET_CODE (insn))
2602: {
2603: case CODE_LABEL:
2604: AND_COMPL_HARD_REG_SET (pending_dead_regs, needed.regs);
2605: AND_COMPL_HARD_REG_SET (res->regs, pending_dead_regs);
2606: CLEAR_HARD_REG_SET (pending_dead_regs);
2607: continue;
2608:
2609: case BARRIER:
2610: case NOTE:
2611: continue;
2612:
2613: case INSN:
2614: if (GET_CODE (PATTERN (insn)) == USE)
2615: {
2616: /* If INSN is a USE made by update_block, we care about the
2617: underlying insn. Any registers set by the underlying insn
2618: are live since the insn is being done somewhere else. */
2619: if (GET_RTX_CLASS (GET_CODE (XEXP (PATTERN (insn), 0))) == 'i')
2620: mark_set_resources (XEXP (PATTERN (insn), 0), res, 0, 1);
2621:
2622: /* All other USE insns are to be ignored. */
2623: continue;
2624: }
2625: else if (GET_CODE (PATTERN (insn)) == CLOBBER)
2626: continue;
2627: else if (GET_CODE (PATTERN (insn)) == SEQUENCE)
2628: {
2629: /* An unconditional jump can be used to fill the delay slot
2630: of a call, so search for a JUMP_INSN in any position. */
2631: for (i = 0; i < XVECLEN (PATTERN (insn), 0); i++)
2632: {
2633: this_jump_insn = XVECEXP (PATTERN (insn), 0, i);
2634: if (GET_CODE (this_jump_insn) == JUMP_INSN)
2635: break;
2636: }
2637: }
2638: }
2639:
2640: if (GET_CODE (this_jump_insn) == JUMP_INSN)
2641: {
2642: if (jump_count++ < 10
2643: && (simplejump_p (this_jump_insn)
2644: || GET_CODE (PATTERN (this_jump_insn)) == RETURN))
2645: {
2646: next = next_active_insn (JUMP_LABEL (this_jump_insn));
2647: if (jump_insn == 0)
2648: {
2649: jump_insn = insn;
2650: jump_target = JUMP_LABEL (this_jump_insn);
2651: }
2652: }
2653: else
2654: break;
2655: }
2656:
2657: mark_referenced_resources (insn, &needed, 1);
2658: mark_set_resources (insn, &set, 0, 1);
2659:
2660: COPY_HARD_REG_SET (scratch, set.regs);
2661: AND_COMPL_HARD_REG_SET (scratch, needed.regs);
2662: AND_COMPL_HARD_REG_SET (res->regs, scratch);
2663: }
2664:
2665: /* If we hit an unconditional branch, we have another way of finding out
2666: what is live: we can see what is live at the branch target and include
2667: anything used but not set before the branch. The only things that are
2668: live are those that are live using the above test and the test below.
2669:
2670: Don't try this if we expired our jump count above, since that would
2671: mean there may be an infinite loop in the function being compiled. */
2672:
2673: if (jump_insn && jump_count < 10)
2674: {
2675: struct resources new_resources;
2676: rtx stop_insn = next_active_insn (jump_insn);
2677:
2678: mark_target_live_regs (next_active_insn (jump_target), &new_resources);
2679: CLEAR_RESOURCE (&set);
2680: CLEAR_RESOURCE (&needed);
2681:
2682: /* Include JUMP_INSN in the needed registers. */
2683: for (insn = target; insn != stop_insn; insn = next_active_insn (insn))
2684: {
2685: mark_referenced_resources (insn, &needed, 1);
2686:
2687: COPY_HARD_REG_SET (scratch, needed.regs);
2688: AND_COMPL_HARD_REG_SET (scratch, set.regs);
2689: IOR_HARD_REG_SET (new_resources.regs, scratch);
2690:
2691: mark_set_resources (insn, &set, 0, 1);
2692: }
2693:
2694: AND_HARD_REG_SET (res->regs, new_resources.regs);
2695: }
2696:
2697: COPY_HARD_REG_SET (tinfo->live_regs, res->regs);
2698: }
2699:
2700: /* Scan a function looking for insns that need a delay slot and find insns to
2701: put into the delay slot.
2702:
2703: NON_JUMPS_P is non-zero if we are to only try to fill non-jump insns (such
2704: as calls). We do these first since we don't want jump insns (that are
2705: easier to fill) to get the only insns that could be used for non-jump insns.
2706: When it is zero, only try to fill JUMP_INSNs.
2707:
2708: When slots are filled in this manner, the insns (including the
2709: delay_insn) are put together in a SEQUENCE rtx. In this fashion,
2710: it is possible to tell whether a delay slot has really been filled
2711: or not. `final' knows how to deal with this, by communicating
2712: through FINAL_SEQUENCE. */
2713:
2714: static void
2715: fill_simple_delay_slots (first, non_jumps_p)
2716: rtx first;
2717: int non_jumps_p;
2718: {
2719: register rtx insn, pat, trial, next_trial;
2720: register int i, j;
2721: int num_unfilled_slots = unfilled_slots_next - unfilled_slots_base;
2722: struct resources needed, set;
2723: register int slots_to_fill, slots_filled;
2724: rtx delay_list;
2725:
2726: for (i = 0; i < num_unfilled_slots; i++)
2727: {
2728: int flags;
2729: /* Get the next insn to fill. If it has already had any slots assigned,
2730: we can't do anything with it. Maybe we'll improve this later. */
2731:
2732: insn = unfilled_slots_base[i];
2733: if (insn == 0
2734: || INSN_DELETED_P (insn)
2735: || (GET_CODE (insn) == INSN
2736: && GET_CODE (PATTERN (insn)) == SEQUENCE)
2737: || (GET_CODE (insn) == JUMP_INSN && non_jumps_p)
2738: || (GET_CODE (insn) != JUMP_INSN && ! non_jumps_p))
2739: continue;
2740:
2741: if (GET_CODE (insn) == JUMP_INSN)
2742: flags = get_jump_flags (insn, JUMP_LABEL (insn));
2743: else
2744: flags = get_jump_flags (insn, NULL_RTX);
2745: slots_to_fill = num_delay_slots (insn);
2746: if (slots_to_fill == 0)
2747: abort ();
2748:
2749: /* This insn needs, or can use, some delay slots. SLOTS_TO_FILL
2750: says how many. After initialization, first try optimizing
2751:
2752: call _foo call _foo
2753: nop add %o7,.-L1,%o7
2754: b,a L1
2755: nop
2756:
2757: If this case applies, the delay slot of the call is filled with
2758: the unconditional jump. This is done first to avoid having the
2759: delay slot of the call filled in the backward scan. Also, since
2760: the unconditional jump is likely to also have a delay slot, that
2761: insn must exist when it is subsequently scanned.
2762:
2763: This is tried on each insn with delay slots as some machines
2764: have insns which perform calls, but are not represented as
2765: CALL_INSNs. */
2766:
2767: slots_filled = 0;
2768: delay_list = 0;
2769:
2770: if ((trial = next_active_insn (insn))
2771: && GET_CODE (trial) == JUMP_INSN
2772: && simplejump_p (trial)
2773: && eligible_for_delay (insn, slots_filled, trial, flags)
2774: && no_labels_between_p (insn, trial))
2775: {
2776: slots_filled++;
2777: delay_list = add_to_delay_list (trial, delay_list);
2778: /* Remove the unconditional jump from consideration for delay slot
2779: filling and unthread it. */
2780: if (unfilled_slots_base[i + 1] == trial)
2781: unfilled_slots_base[i + 1] = 0;
2782: {
2783: rtx next = NEXT_INSN (trial);
2784: rtx prev = PREV_INSN (trial);
2785: if (prev)
2786: NEXT_INSN (prev) = next;
2787: if (next)
2788: PREV_INSN (next) = prev;
2789: }
2790: }
2791:
2792: /* Now, scan backwards from the insn to search for a potential
2793: delay-slot candidate. Stop searching when a label or jump is hit.
2794:
2795: For each candidate, if it is to go into the delay slot (moved
2796: forward in execution sequence), it must not need or set any resources
2797: that were set by later insns and must not set any resources that
2798: are needed for those insns.
2799:
2800: The delay slot insn itself sets resources unless it is a call
2801: (in which case the called routine, not the insn itself, is doing
2802: the setting). */
2803:
2804: if (slots_filled < slots_to_fill)
2805: {
2806: CLEAR_RESOURCE (&needed);
2807: CLEAR_RESOURCE (&set);
2808: mark_set_resources (insn, &set, 0, 0);
2809: mark_referenced_resources (insn, &needed, 0);
2810:
2811: for (trial = prev_nonnote_insn (insn); ! stop_search_p (trial, 1);
2812: trial = next_trial)
2813: {
2814: next_trial = prev_nonnote_insn (trial);
2815:
2816: /* This must be an INSN or CALL_INSN. */
2817: pat = PATTERN (trial);
2818:
2819: /* USE and CLOBBER at this level was just for flow; ignore it. */
2820: if (GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER)
2821: continue;
2822:
2823: /* Check for resource conflict first, to avoid unnecessary
2824: splitting. */
2825: if (! insn_references_resource_p (trial, &set, 1)
2826: && ! insn_sets_resource_p (trial, &set, 1)
2827: && ! insn_sets_resource_p (trial, &needed, 1)
2828: #ifdef HAVE_cc0
2829: /* Can't separate set of cc0 from its use. */
2830: && ! (reg_mentioned_p (cc0_rtx, pat)
2831: && ! sets_cc0_p (cc0_rtx, pat))
2832: #endif
2833: )
2834: {
2835: trial = try_split (pat, trial, 1);
2836: next_trial = prev_nonnote_insn (trial);
2837: if (eligible_for_delay (insn, slots_filled, trial, flags))
2838: {
2839: /* In this case, we are searching backward, so if we
2840: find insns to put on the delay list, we want
2841: to put them at the head, rather than the
2842: tail, of the list. */
2843:
2844: update_reg_dead_notes (trial, insn);
2845: delay_list = gen_rtx (INSN_LIST, VOIDmode,
2846: trial, delay_list);
2847: update_block (trial, trial);
2848: delete_insn (trial);
2849: if (slots_to_fill == ++slots_filled)
2850: break;
2851: continue;
2852: }
2853: }
2854:
2855: mark_set_resources (trial, &set, 0, 1);
2856: mark_referenced_resources (trial, &needed, 1);
2857: }
2858: }
2859:
2860: /* If all needed slots haven't been filled, we come here. */
2861:
2862: /* Try to optimize case of jumping around a single insn. */
2863: #if defined(ANNUL_IFFALSE_SLOTS) || defined(ANNUL_IFTRUE_SLOTS)
2864: if (slots_filled != slots_to_fill
2865: && delay_list == 0
2866: && GET_CODE (insn) == JUMP_INSN && condjump_p (insn))
2867: {
2868: delay_list = optimize_skip (insn);
2869: if (delay_list)
2870: slots_filled += 1;
2871: }
2872: #endif
2873:
2874: /* Try to get insns from beyond the insn needing the delay slot.
2875: These insns can neither set or reference resources set in insns being
2876: skipped, cannot set resources in the insn being skipped, and, if this
2877: is a CALL_INSN (or a CALL_INSN is passed), cannot trap (because the
2878: call might not return).
2879:
2880: If this is a conditional jump, see if it merges back to us early
2881: enough for us to pick up insns from the merge point. Don't do
2882: this if there is another branch to our label unless we pass all of
2883: them.
2884:
2885: Another similar merge is if we jump to the same place that a
2886: later unconditional jump branches to. In that case, we don't
2887: care about the number of uses of our label. */
2888:
2889: if (slots_filled != slots_to_fill
2890: && (GET_CODE (insn) != JUMP_INSN
2891: || (condjump_p (insn) && ! simplejump_p (insn)
2892: && JUMP_LABEL (insn) != 0)))
2893: {
2894: rtx target = 0;
2895: int maybe_never = 0;
2896: int passed_label = 0;
2897: int target_uses;
2898: struct resources needed_at_jump;
2899:
2900: CLEAR_RESOURCE (&needed);
2901: CLEAR_RESOURCE (&set);
2902:
2903: if (GET_CODE (insn) == CALL_INSN)
2904: {
2905: mark_set_resources (insn, &set, 0, 1);
2906: mark_referenced_resources (insn, &needed, 1);
2907: maybe_never = 1;
2908: }
2909: else
2910: {
2911: mark_set_resources (insn, &set, 0, 1);
2912: mark_referenced_resources (insn, &needed, 1);
2913: if (GET_CODE (insn) == JUMP_INSN)
2914: {
2915: /* Get our target and show how many more uses we want to
2916: see before we hit the label. */
2917: target = JUMP_LABEL (insn);
2918: target_uses = LABEL_NUSES (target) - 1;
2919: }
2920:
2921: }
2922:
2923: for (trial = next_nonnote_insn (insn); trial; trial = next_trial)
2924: {
2925: rtx pat, trial_delay;
2926:
2927: next_trial = next_nonnote_insn (trial);
2928:
2929: if (GET_CODE (trial) == CODE_LABEL)
2930: {
2931: passed_label = 1;
2932:
2933: /* If this is our target, see if we have seen all its uses.
2934: If so, indicate we have passed our target and ignore it.
2935: All other labels cause us to stop our search. */
2936: if (trial == target && target_uses == 0)
2937: {
2938: target = 0;
2939: continue;
2940: }
2941: else
2942: break;
2943: }
2944: else if (GET_CODE (trial) == BARRIER)
2945: break;
2946:
2947: /* We must have an INSN, JUMP_INSN, or CALL_INSN. */
2948: pat = PATTERN (trial);
2949:
2950: /* Stand-alone USE and CLOBBER are just for flow. */
2951: if (GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER)
2952: continue;
2953:
2954: /* If this already has filled delay slots, get the insn needing
2955: the delay slots. */
2956: if (GET_CODE (pat) == SEQUENCE)
2957: trial_delay = XVECEXP (pat, 0, 0);
2958: else
2959: trial_delay = trial;
2960:
2961: /* If this is a jump insn to our target, indicate that we have
2962: seen another jump to it. If we aren't handling a conditional
2963: jump, stop our search. Otherwise, compute the needs at its
2964: target and add them to NEEDED. */
2965: if (GET_CODE (trial_delay) == JUMP_INSN)
2966: {
2967: if (target == 0)
2968: break;
2969: else if (JUMP_LABEL (trial_delay) == target)
2970: target_uses--;
2971: else
2972: {
2973: mark_target_live_regs
2974: (next_active_insn (JUMP_LABEL (trial_delay)),
2975: &needed_at_jump);
2976: needed.memory |= needed_at_jump.memory;
2977: IOR_HARD_REG_SET (needed.regs, needed_at_jump.regs);
2978: }
2979: }
2980:
2981: /* See if we have a resource problem before we try to
2982: split. */
2983: if (target == 0
2984: && GET_CODE (pat) != SEQUENCE
2985: && ! insn_references_resource_p (trial, &set, 1)
2986: && ! insn_sets_resource_p (trial, &set, 1)
2987: && ! insn_sets_resource_p (trial, &needed, 1)
2988: #ifdef HAVE_cc0
2989: && ! (reg_mentioned_p (cc0_rtx, pat) && ! sets_cc0_p (pat))
2990: #endif
2991: && ! (maybe_never && may_trap_p (pat))
2992: && (trial = try_split (pat, trial, 0))
2993: && eligible_for_delay (insn, slots_filled, trial, flags))
2994: {
2995: next_trial = next_nonnote_insn (trial);
2996: delay_list = add_to_delay_list (trial, delay_list);
2997:
2998: #ifdef HAVE_cc0
2999: if (reg_mentioned_p (cc0_rtx, pat))
3000: link_cc0_insns (trial);
3001: #endif
3002:
3003: if (passed_label)
3004: update_block (trial, trial);
3005: delete_insn (trial);
3006: if (slots_to_fill == ++slots_filled)
3007: break;
3008: continue;
3009: }
3010:
3011: mark_set_resources (trial, &set, 0, 1);
3012: mark_referenced_resources (trial, &needed, 1);
3013:
3014: /* Ensure we don't put insns between the setting of cc and the
3015: comparison by moving a setting of cc into an earlier delay
3016: slot since these insns could clobber the condition code. */
3017: set.cc = 1;
3018:
3019: /* If this is a call or jump, we might not get here. */
3020: if (GET_CODE (trial) == CALL_INSN
3021: || GET_CODE (trial) == JUMP_INSN)
3022: maybe_never = 1;
3023: }
3024:
3025: /* If there are slots left to fill and our search was stopped by an
3026: unconditional branch, try the insn at the branch target. We can
3027: redirect the branch if it works. */
3028: if (slots_to_fill != slots_filled
3029: && trial
3030: && GET_CODE (trial) == JUMP_INSN
3031: && simplejump_p (trial)
3032: && (target == 0 || JUMP_LABEL (trial) == target)
3033: && (next_trial = next_active_insn (JUMP_LABEL (trial))) != 0
3034: && ! (GET_CODE (next_trial) == INSN
3035: && GET_CODE (PATTERN (next_trial)) == SEQUENCE)
3036: && ! insn_references_resource_p (next_trial, &set, 1)
3037: && ! insn_sets_resource_p (next_trial, &set, 1)
3038: && ! insn_sets_resource_p (next_trial, &needed, 1)
3039: #ifdef HAVE_cc0
3040: && ! reg_mentioned_p (cc0_rtx, PATTERN (next_trial))
3041: #endif
3042: && ! (maybe_never && may_trap_p (PATTERN (next_trial)))
3043: && (next_trial = try_split (PATTERN (next_trial), next_trial, 0))
3044: && eligible_for_delay (insn, slots_filled, next_trial, flags))
3045: {
3046: rtx new_label = next_active_insn (next_trial);
3047:
3048: if (new_label != 0)
3049: new_label = get_label_before (new_label);
3050: else
3051: new_label = find_end_label ();
3052:
3053: delay_list
3054: = add_to_delay_list (copy_rtx (next_trial), delay_list);
3055: slots_filled++;
3056: reorg_redirect_jump (trial, new_label);
3057:
3058: /* If we merged because we both jumped to the same place,
3059: redirect the original insn also. */
3060: if (target)
3061: reorg_redirect_jump (insn, new_label);
3062: }
3063: }
3064:
3065: if (delay_list)
3066: unfilled_slots_base[i]
3067: = emit_delay_sequence (insn, delay_list,
3068: slots_filled, slots_to_fill);
3069:
3070: if (slots_to_fill == slots_filled)
3071: unfilled_slots_base[i] = 0;
3072:
3073: note_delay_statistics (slots_filled, 0);
3074: }
3075:
3076: #ifdef DELAY_SLOTS_FOR_EPILOGUE
3077: /* See if the epilogue needs any delay slots. Try to fill them if so.
3078: The only thing we can do is scan backwards from the end of the
3079: function. If we did this in a previous pass, it is incorrect to do it
3080: again. */
3081: if (current_function_epilogue_delay_list)
3082: return;
3083:
3084: slots_to_fill = DELAY_SLOTS_FOR_EPILOGUE;
3085: if (slots_to_fill == 0)
3086: return;
3087:
3088: slots_filled = 0;
3089: CLEAR_RESOURCE (&needed);
3090: CLEAR_RESOURCE (&set);
3091:
3092: for (trial = get_last_insn (); ! stop_search_p (trial, 1);
3093: trial = PREV_INSN (trial))
3094: {
3095: if (GET_CODE (trial) == NOTE)
3096: continue;
3097: pat = PATTERN (trial);
3098: if (GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER)
3099: continue;
3100:
3101: if (! insn_references_resource_p (trial, &set, 1)
3102: && ! insn_sets_resource_p (trial, &needed, 1)
3103: #ifdef HAVE_cc0
3104: /* Don't want to mess with cc0 here. */
3105: && ! reg_mentioned_p (cc0_rtx, pat)
3106: #endif
3107: )
3108: {
3109: trial = try_split (pat, trial, 1);
3110: if (ELIGIBLE_FOR_EPILOGUE_DELAY (trial, slots_filled))
3111: {
3112: /* Here as well we are searching backward, so put the
3113: insns we find on the head of the list. */
3114:
3115: current_function_epilogue_delay_list
3116: = gen_rtx (INSN_LIST, VOIDmode, trial,
3117: current_function_epilogue_delay_list);
3118: mark_referenced_resources (trial, &end_of_function_needs, 1);
3119: update_block (trial, trial);
3120: delete_insn (trial);
3121:
3122: /* Clear deleted bit so final.c will output the insn. */
3123: INSN_DELETED_P (trial) = 0;
3124:
3125: if (slots_to_fill == ++slots_filled)
3126: break;
3127: continue;
3128: }
3129: }
3130:
3131: mark_set_resources (trial, &set, 0, 1);
3132: mark_referenced_resources (trial, &needed, 1);
3133: }
3134:
3135: note_delay_statistics (slots_filled, 0);
3136: #endif
3137: }
3138:
3139: /* Try to find insns to place in delay slots.
3140:
3141: INSN is the jump needing SLOTS_TO_FILL delay slots. It tests CONDITION
3142: or is an unconditional branch if CONDITION is const_true_rtx.
3143: *PSLOTS_FILLED is updated with the number of slots that we have filled.
3144:
3145: THREAD is a flow-of-control, either the insns to be executed if the
3146: branch is true or if the branch is false, THREAD_IF_TRUE says which.
3147:
3148: OPPOSITE_THREAD is the thread in the opposite direction. It is used
3149: to see if any potential delay slot insns set things needed there.
3150:
3151: LIKELY is non-zero if it is extremely likely that the branch will be
3152: taken and THREAD_IF_TRUE is set. This is used for the branch at the
3153: end of a loop back up to the top.
3154:
3155: OWN_THREAD and OWN_OPPOSITE_THREAD are true if we are the only user of the
3156: thread. I.e., it is the fallthrough code of our jump or the target of the
3157: jump when we are the only jump going there.
3158:
3159: If OWN_THREAD is false, it must be the "true" thread of a jump. In that
3160: case, we can only take insns from the head of the thread for our delay
3161: slot. We then adjust the jump to point after the insns we have taken. */
3162:
3163: static rtx
3164: fill_slots_from_thread (insn, condition, thread, opposite_thread, likely,
3165: thread_if_true, own_thread, own_opposite_thread,
3166: slots_to_fill, pslots_filled)
3167: rtx insn;
3168: rtx condition;
3169: rtx thread, opposite_thread;
3170: int likely;
3171: int thread_if_true;
3172: int own_thread, own_opposite_thread;
3173: int slots_to_fill, *pslots_filled;
3174: {
3175: rtx new_thread;
3176: rtx delay_list = 0;
3177: struct resources opposite_needed, set, needed;
3178: rtx trial;
3179: int lose = 0;
3180: int must_annul = 0;
3181: int flags;
3182:
3183: /* Validate our arguments. */
3184: if ((condition == const_true_rtx && ! thread_if_true)
3185: || (! own_thread && ! thread_if_true))
3186: abort ();
3187:
3188: flags = get_jump_flags (insn, JUMP_LABEL (insn));
3189:
3190: /* If our thread is the end of subroutine, we can't get any delay
3191: insns from that. */
3192: if (thread == 0)
3193: return 0;
3194:
3195: /* If this is an unconditional branch, nothing is needed at the
3196: opposite thread. Otherwise, compute what is needed there. */
3197: if (condition == const_true_rtx)
3198: CLEAR_RESOURCE (&opposite_needed);
3199: else
3200: mark_target_live_regs (opposite_thread, &opposite_needed);
3201:
3202: /* If the insn at THREAD can be split, do it here to avoid having to
3203: update THREAD and NEW_THREAD if it is done in the loop below. Also
3204: initialize NEW_THREAD. */
3205:
3206: new_thread = thread = try_split (PATTERN (thread), thread, 0);
3207:
3208: /* Scan insns at THREAD. We are looking for an insn that can be removed
3209: from THREAD (it neither sets nor references resources that were set
3210: ahead of it and it doesn't set anything needs by the insns ahead of
3211: it) and that either can be placed in an annulling insn or aren't
3212: needed at OPPOSITE_THREAD. */
3213:
3214: CLEAR_RESOURCE (&needed);
3215: CLEAR_RESOURCE (&set);
3216:
3217: /* If we do not own this thread, we must stop as soon as we find
3218: something that we can't put in a delay slot, since all we can do
3219: is branch into THREAD at a later point. Therefore, labels stop
3220: the search if this is not the `true' thread. */
3221:
3222: for (trial = thread;
3223: ! stop_search_p (trial, ! thread_if_true) && (! lose || own_thread);
3224: trial = next_nonnote_insn (trial))
3225: {
3226: rtx pat, old_trial;
3227:
3228: /* If we have passed a label, we no longer own this thread. */
3229: if (GET_CODE (trial) == CODE_LABEL)
3230: {
3231: own_thread = 0;
3232: continue;
3233: }
3234:
3235: pat = PATTERN (trial);
3236: if (GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER)
3237: continue;
3238:
3239: /* If TRIAL conflicts with the insns ahead of it, we lose. Also,
3240: don't separate or copy insns that set and use CC0. */
3241: if (! insn_references_resource_p (trial, &set, 1)
3242: && ! insn_sets_resource_p (trial, &set, 1)
3243: && ! insn_sets_resource_p (trial, &needed, 1)
3244: #ifdef HAVE_cc0
3245: && ! (reg_mentioned_p (cc0_rtx, pat)
3246: && (! own_thread || ! sets_cc0_p (pat)))
3247: #endif
3248: )
3249: {
3250: /* If TRIAL is redundant with some insn before INSN, we don't
3251: actually need to add it to the delay list; we can merely pretend
3252: we did. */
3253: if (redundant_insn_p (trial, insn, delay_list))
3254: {
3255: if (own_thread)
3256: {
3257: update_block (trial, thread);
3258: delete_insn (trial);
3259: }
3260: else
3261: new_thread = next_active_insn (trial);
3262:
3263: continue;
3264: }
3265:
3266: /* There are two ways we can win: If TRIAL doesn't set anything
3267: needed at the opposite thread and can't trap, or if it can
3268: go into an annulled delay slot. */
3269: if (condition == const_true_rtx
3270: || (! insn_sets_resource_p (trial, &opposite_needed, 1)
3271: && ! may_trap_p (pat)))
3272: {
3273: old_trial = trial;
3274: trial = try_split (pat, trial, 0);
3275: if (new_thread == old_trial)
3276: new_thread = trial;
3277: pat = PATTERN (trial);
3278: if (eligible_for_delay (insn, *pslots_filled, trial, flags))
3279: goto winner;
3280: }
3281: else if (0
3282: #ifdef ANNUL_IFTRUE_SLOTS
3283: || ! thread_if_true
3284: #endif
3285: #ifdef ANNUL_IFFALSE_SLOTS
3286: || thread_if_true
3287: #endif
3288: )
3289: {
3290: old_trial = trial;
3291: trial = try_split (pat, trial, 0);
3292: if (new_thread == old_trial)
3293: new_thread = trial;
3294: pat = PATTERN (trial);
3295: if ((thread_if_true
3296: ? eligible_for_annul_false (insn, *pslots_filled, trial, flags)
3297: : eligible_for_annul_true (insn, *pslots_filled, trial, flags)))
3298: {
3299: rtx temp;
3300:
3301: must_annul = 1;
3302: winner:
3303:
3304: #ifdef HAVE_cc0
3305: if (reg_mentioned_p (cc0_rtx, pat))
3306: link_cc0_insns (trial);
3307: #endif
3308:
3309: /* If we own this thread, delete the insn. If this is the
3310: destination of a branch, show that a basic block status
3311: may have been updated. In any case, mark the new
3312: starting point of this thread. */
3313: if (own_thread)
3314: {
3315: update_block (trial, thread);
3316: delete_insn (trial);
3317: }
3318: else
3319: new_thread = next_active_insn (trial);
3320:
3321: temp = own_thread ? trial : copy_rtx (trial);
3322: if (thread_if_true)
3323: INSN_FROM_TARGET_P (temp) = 1;
3324:
3325: delay_list = add_to_delay_list (temp, delay_list);
3326:
3327: if (slots_to_fill == ++(*pslots_filled))
3328: {
3329: /* Even though we have filled all the slots, we
3330: may be branching to a location that has a
3331: redundant insn. Skip any if so. */
3332: while (new_thread && ! own_thread
3333: && ! insn_sets_resource_p (new_thread, &set, 1)
3334: && ! insn_sets_resource_p (new_thread, &needed, 1)
3335: && ! insn_references_resource_p (new_thread,
3336: &set, 1)
3337: && redundant_insn_p (new_thread, insn,
3338: delay_list))
3339: new_thread = next_active_insn (new_thread);
3340: break;
3341: }
3342:
3343: continue;
3344: }
3345: }
3346: }
3347:
3348: /* This insn can't go into a delay slot. */
3349: lose = 1;
3350: mark_set_resources (trial, &set, 0, 1);
3351: mark_referenced_resources (trial, &needed, 1);
3352:
3353: /* Ensure we don't put insns between the setting of cc and the comparison
3354: by moving a setting of cc into an earlier delay slot since these insns
3355: could clobber the condition code. */
3356: set.cc = 1;
3357:
3358: /* If this insn is a register-register copy and the next insn has
3359: a use of our destination, change it to use our source. That way,
3360: it will become a candidate for our delay slot the next time
3361: through this loop. This case occurs commonly in loops that
3362: scan a list.
3363:
3364: We could check for more complex cases than those tested below,
3365: but it doesn't seem worth it. It might also be a good idea to try
3366: to swap the two insns. That might do better.
3367:
3368: We can't do this if the next insn modifies our destination, because
3369: that would make the replacement into the insn invalid. We also can't
3370: do this if it modifies our source, because it might be an earlyclobber
3371: operand. This latter test also prevents updating the contents of
3372: a PRE_INC. */
3373:
3374: if (GET_CODE (trial) == INSN && GET_CODE (pat) == SET
3375: && GET_CODE (SET_SRC (pat)) == REG
3376: && GET_CODE (SET_DEST (pat)) == REG)
3377: {
3378: rtx next = next_nonnote_insn (trial);
3379:
3380: if (next && GET_CODE (next) == INSN
3381: && GET_CODE (PATTERN (next)) != USE
3382: && ! reg_set_p (SET_DEST (pat), next)
3383: && ! reg_set_p (SET_SRC (pat), next)
3384: && reg_referenced_p (SET_DEST (pat), PATTERN (next)))
3385: validate_replace_rtx (SET_DEST (pat), SET_SRC (pat), next);
3386: }
3387: }
3388:
3389: /* If we stopped on a branch insn that has delay slots, see if we can
3390: steal some of the insns in those slots. */
3391: if (trial && GET_CODE (trial) == INSN
3392: && GET_CODE (PATTERN (trial)) == SEQUENCE
3393: && GET_CODE (XVECEXP (PATTERN (trial), 0, 0)) == JUMP_INSN)
3394: {
3395: /* If this is the `true' thread, we will want to follow the jump,
3396: so we can only do this if we have taken everything up to here. */
3397: if (thread_if_true && trial == new_thread)
3398: delay_list
3399: = steal_delay_list_from_target (insn, condition, PATTERN (trial),
3400: delay_list, &set, &needed,
3401: &opposite_needed, slots_to_fill,
3402: pslots_filled, &must_annul,
3403: &new_thread);
3404: else if (! thread_if_true)
3405: delay_list
3406: = steal_delay_list_from_fallthrough (insn, condition,
3407: PATTERN (trial),
3408: delay_list, &set, &needed,
3409: &opposite_needed, slots_to_fill,
3410: pslots_filled, &must_annul);
3411: }
3412:
3413: /* If we haven't found anything for this delay slot and it is very
3414: likely that the branch will be taken, see if the insn at our target
3415: increments or decrements a register with an increment that does not
3416: depend on the destination register. If so, try to place the opposite
3417: arithmetic insn after the jump insn and put the arithmetic insn in the
3418: delay slot. If we can't do this, return. */
3419: if (delay_list == 0 && likely && new_thread && GET_CODE (new_thread) == INSN)
3420: {
3421: rtx pat = PATTERN (new_thread);
3422: rtx dest;
3423: rtx src;
3424:
3425: trial = new_thread;
3426: pat = PATTERN (trial);
3427:
3428: if (GET_CODE (trial) != INSN || GET_CODE (pat) != SET
3429: || ! eligible_for_delay (insn, 0, trial, flags))
3430: return 0;
3431:
3432: dest = SET_DEST (pat), src = SET_SRC (pat);
3433: if ((GET_CODE (src) == PLUS || GET_CODE (src) == MINUS)
3434: && rtx_equal_p (XEXP (src, 0), dest)
3435: && ! reg_overlap_mentioned_p (dest, XEXP (src, 1)))
3436: {
3437: rtx other = XEXP (src, 1);
3438: rtx new_arith;
3439: rtx ninsn;
3440:
3441: /* If this is a constant adjustment, use the same code with
3442: the negated constant. Otherwise, reverse the sense of the
3443: arithmetic. */
3444: if (GET_CODE (other) == CONST_INT)
3445: new_arith = gen_rtx (GET_CODE (src), GET_MODE (src), dest,
3446: negate_rtx (GET_MODE (src), other));
3447: else
3448: new_arith = gen_rtx (GET_CODE (src) == PLUS ? MINUS : PLUS,
3449: GET_MODE (src), dest, other);
3450:
3451: ninsn = emit_insn_after (gen_rtx (SET, VOIDmode, dest, new_arith),
3452: insn);
3453:
3454: if (recog_memoized (ninsn) < 0
3455: || (insn_extract (ninsn),
3456: ! constrain_operands (INSN_CODE (ninsn), 1)))
3457: {
3458: delete_insn (ninsn);
3459: return 0;
3460: }
3461:
3462: if (own_thread)
3463: {
3464: update_block (trial, thread);
3465: delete_insn (trial);
3466: }
3467: else
3468: new_thread = next_active_insn (trial);
3469:
3470: ninsn = own_thread ? trial : copy_rtx (trial);
3471: if (thread_if_true)
3472: INSN_FROM_TARGET_P (ninsn) = 1;
3473:
3474: delay_list = add_to_delay_list (ninsn, NULL_RTX);
3475: (*pslots_filled)++;
3476: }
3477: }
3478:
3479: if (delay_list && must_annul)
3480: INSN_ANNULLED_BRANCH_P (insn) = 1;
3481:
3482: /* If we are to branch into the middle of this thread, find an appropriate
3483: label or make a new one if none, and redirect INSN to it. If we hit the
3484: end of the function, use the end-of-function label. */
3485: if (new_thread != thread)
3486: {
3487: rtx label;
3488:
3489: if (! thread_if_true)
3490: abort ();
3491:
3492: if (new_thread && GET_CODE (new_thread) == JUMP_INSN
3493: && (simplejump_p (new_thread)
3494: || GET_CODE (PATTERN (new_thread)) == RETURN)
3495: && redirect_with_delay_list_safe_p (insn,
3496: JUMP_LABEL (new_thread),
3497: delay_list))
3498: new_thread = follow_jumps (JUMP_LABEL (new_thread));
3499:
3500: if (new_thread == 0)
3501: label = find_end_label ();
3502: else if (GET_CODE (new_thread) == CODE_LABEL)
3503: label = new_thread;
3504: else
3505: label = get_label_before (new_thread);
3506:
3507: reorg_redirect_jump (insn, label);
3508: }
3509:
3510: return delay_list;
3511: }
3512:
3513: /* Make another attempt to find insns to place in delay slots.
3514:
3515: We previously looked for insns located in front of the delay insn
3516: and, for non-jump delay insns, located behind the delay insn.
3517:
3518: Here only try to schedule jump insns and try to move insns from either
3519: the target or the following insns into the delay slot. If annulling is
3520: supported, we will be likely to do this. Otherwise, we can do this only
3521: if safe. */
3522:
3523: static void
3524: fill_eager_delay_slots (first)
3525: rtx first;
3526: {
3527: register rtx insn;
3528: register int i;
3529: int num_unfilled_slots = unfilled_slots_next - unfilled_slots_base;
3530:
3531: for (i = 0; i < num_unfilled_slots; i++)
3532: {
3533: rtx condition;
3534: rtx target_label, insn_at_target, fallthrough_insn;
3535: rtx delay_list = 0;
3536: int own_target;
3537: int own_fallthrough;
3538: int prediction, slots_to_fill, slots_filled;
3539:
3540: insn = unfilled_slots_base[i];
3541: if (insn == 0
3542: || INSN_DELETED_P (insn)
3543: || GET_CODE (insn) != JUMP_INSN
3544: || ! condjump_p (insn))
3545: continue;
3546:
3547: slots_to_fill = num_delay_slots (insn);
3548: if (slots_to_fill == 0)
3549: abort ();
3550:
3551: slots_filled = 0;
3552: target_label = JUMP_LABEL (insn);
3553: condition = get_branch_condition (insn, target_label);
3554:
3555: if (condition == 0)
3556: continue;
3557:
3558: /* Get the next active fallthough and target insns and see if we own
3559: them. Then see whether the branch is likely true. We don't need
3560: to do a lot of this for unconditional branches. */
3561:
3562: insn_at_target = next_active_insn (target_label);
3563: own_target = own_thread_p (target_label, target_label, 0);
3564:
3565: if (condition == const_true_rtx)
3566: {
3567: own_fallthrough = 0;
3568: fallthrough_insn = 0;
3569: prediction = 2;
3570: }
3571: else
3572: {
3573: fallthrough_insn = next_active_insn (insn);
3574: own_fallthrough = own_thread_p (NEXT_INSN (insn), NULL_RTX, 1);
3575: prediction = mostly_true_jump (insn, condition);
3576: }
3577:
3578: /* If this insn is expected to branch, first try to get insns from our
3579: target, then our fallthrough insns. If it is not, expected to branch,
3580: try the other order. */
3581:
3582: if (prediction > 0)
3583: {
3584: delay_list
3585: = fill_slots_from_thread (insn, condition, insn_at_target,
3586: fallthrough_insn, prediction == 2, 1,
3587: own_target, own_fallthrough,
3588: slots_to_fill, &slots_filled);
3589:
3590: if (delay_list == 0 && own_fallthrough)
3591: {
3592: /* Even though we didn't find anything for delay slots,
3593: we might have found a redundant insn which we deleted
3594: from the thread that was filled. So we have to recompute
3595: the next insn at the target. */
3596: target_label = JUMP_LABEL (insn);
3597: insn_at_target = next_active_insn (target_label);
3598:
3599: delay_list
3600: = fill_slots_from_thread (insn, condition, fallthrough_insn,
3601: insn_at_target, 0, 0,
3602: own_fallthrough, own_target,
3603: slots_to_fill, &slots_filled);
3604: }
3605: }
3606: else
3607: {
3608: if (own_fallthrough)
3609: delay_list
3610: = fill_slots_from_thread (insn, condition, fallthrough_insn,
3611: insn_at_target, 0, 0,
3612: own_fallthrough, own_target,
3613: slots_to_fill, &slots_filled);
3614:
3615: if (delay_list == 0)
3616: delay_list
3617: = fill_slots_from_thread (insn, condition, insn_at_target,
3618: next_active_insn (insn), 0, 1,
3619: own_target, own_fallthrough,
3620: slots_to_fill, &slots_filled);
3621: }
3622:
3623: if (delay_list)
3624: unfilled_slots_base[i]
3625: = emit_delay_sequence (insn, delay_list,
3626: slots_filled, slots_to_fill);
3627:
3628: if (slots_to_fill == slots_filled)
3629: unfilled_slots_base[i] = 0;
3630:
3631: note_delay_statistics (slots_filled, 1);
3632: }
3633: }
3634:
3635: /* Once we have tried two ways to fill a delay slot, make a pass over the
3636: code to try to improve the results and to do such things as more jump
3637: threading. */
3638:
3639: static void
3640: relax_delay_slots (first)
3641: rtx first;
3642: {
3643: register rtx insn, next, pat;
3644: register rtx trial, delay_insn, target_label;
3645:
3646: /* Look at every JUMP_INSN and see if we can improve it. */
3647: for (insn = first; insn; insn = next)
3648: {
3649: rtx other;
3650:
3651: next = next_active_insn (insn);
3652:
3653: /* If this is a jump insn, see if it now jumps to a jump, jumps to
3654: the next insn, or jumps to a label that is not the last of a
3655: group of consecutive labels. */
3656: if (GET_CODE (insn) == JUMP_INSN
3657: && condjump_p (insn)
3658: && (target_label = JUMP_LABEL (insn)) != 0)
3659: {
3660: target_label = follow_jumps (target_label);
3661: target_label = prev_label (next_active_insn (target_label));
3662:
3663: if (target_label == 0)
3664: target_label = find_end_label ();
3665:
3666: if (next_active_insn (target_label) == next)
3667: {
3668: delete_jump (insn);
3669: continue;
3670: }
3671:
3672: if (target_label != JUMP_LABEL (insn))
3673: reorg_redirect_jump (insn, target_label);
3674:
3675: /* See if this jump branches around a unconditional jump.
3676: If so, invert this jump and point it to the target of the
3677: second jump. */
3678: if (next && GET_CODE (next) == JUMP_INSN
3679: && (simplejump_p (next) || GET_CODE (PATTERN (next)) == RETURN)
3680: && next_active_insn (target_label) == next_active_insn (next)
3681: && no_labels_between_p (insn, next))
3682: {
3683: rtx label = JUMP_LABEL (next);
3684:
3685: /* Be careful how we do this to avoid deleting code or
3686: labels that are momentarily dead. See similar optimization
3687: in jump.c.
3688:
3689: We also need to ensure we properly handle the case when
3690: invert_jump fails. */
3691:
3692: ++LABEL_NUSES (target_label);
3693: if (label)
3694: ++LABEL_NUSES (label);
3695:
3696: if (invert_jump (insn, label))
3697: {
3698: delete_insn (next);
3699: next = insn;
3700: }
3701:
3702: if (label)
3703: --LABEL_NUSES (label);
3704:
3705: if (--LABEL_NUSES (target_label) == 0)
3706: delete_insn (target_label);
3707:
3708: continue;
3709: }
3710: }
3711:
3712: /* If this is an unconditional jump and the previous insn is a
3713: conditional jump, try reversing the condition of the previous
3714: insn and swapping our targets. The next pass might be able to
3715: fill the slots.
3716:
3717: Don't do this if we expect the conditional branch to be true, because
3718: we would then be making the more common case longer. */
3719:
3720: if (GET_CODE (insn) == JUMP_INSN
3721: && (simplejump_p (insn) || GET_CODE (PATTERN (insn)) == RETURN)
3722: && (other = prev_active_insn (insn)) != 0
3723: && condjump_p (other)
3724: && no_labels_between_p (other, insn)
3725: && 0 < mostly_true_jump (other,
3726: get_branch_condition (other,
3727: JUMP_LABEL (other))))
3728: {
3729: rtx other_target = JUMP_LABEL (other);
3730: target_label = JUMP_LABEL (insn);
3731:
3732: /* Increment the count of OTHER_TARGET, so it doesn't get deleted
3733: as we move the label. */
3734: if (other_target)
3735: ++LABEL_NUSES (other_target);
3736:
3737: if (invert_jump (other, target_label))
3738: reorg_redirect_jump (insn, other_target);
3739:
3740: if (other_target)
3741: --LABEL_NUSES (other_target);
3742: }
3743:
3744: /* Now look only at cases where we have filled a delay slot. */
3745: if (GET_CODE (insn) != INSN
3746: || GET_CODE (PATTERN (insn)) != SEQUENCE)
3747: continue;
3748:
3749: pat = PATTERN (insn);
3750: delay_insn = XVECEXP (pat, 0, 0);
3751:
3752: /* See if the first insn in the delay slot is redundant with some
3753: previous insn. Remove it from the delay slot if so; then set up
3754: to reprocess this insn. */
3755: if (redundant_insn_p (XVECEXP (pat, 0, 1), delay_insn, 0))
3756: {
3757: delete_from_delay_slot (XVECEXP (pat, 0, 1));
3758: next = prev_active_insn (next);
3759: continue;
3760: }
3761:
3762: /* Now look only at the cases where we have a filled JUMP_INSN. */
3763: if (GET_CODE (XVECEXP (PATTERN (insn), 0, 0)) != JUMP_INSN
3764: || ! condjump_p (XVECEXP (PATTERN (insn), 0, 0)))
3765: continue;
3766:
3767: target_label = JUMP_LABEL (delay_insn);
3768:
3769: if (target_label)
3770: {
3771: /* If this jump goes to another unconditional jump, thread it, but
3772: don't convert a jump into a RETURN here. */
3773: trial = follow_jumps (target_label);
3774: trial = prev_label (next_active_insn (trial));
3775: if (trial == 0 && target_label != 0)
3776: trial = find_end_label ();
3777:
3778: if (trial != target_label
3779: && redirect_with_delay_slots_safe_p (delay_insn, trial, insn))
3780: {
3781: reorg_redirect_jump (delay_insn, trial);
3782: target_label = trial;
3783: }
3784:
3785: /* If the first insn at TARGET_LABEL is redundant with a previous
3786: insn, redirect the jump to the following insn process again. */
3787: trial = next_active_insn (target_label);
3788: if (trial && GET_CODE (PATTERN (trial)) != SEQUENCE
3789: && redundant_insn_p (trial, insn, 0))
3790: {
3791: trial = next_active_insn (trial);
3792: if (trial == 0)
3793: target_label = find_end_label ();
3794: else
3795: target_label = get_label_before (trial);
3796: reorg_redirect_jump (delay_insn, target_label);
3797: next = insn;
3798: continue;
3799: }
3800:
3801: /* Similarly, if it is an unconditional jump with one insn in its
3802: delay list and that insn is redundant, thread the jump. */
3803: if (trial && GET_CODE (PATTERN (trial)) == SEQUENCE
3804: && XVECLEN (PATTERN (trial), 0) == 2
3805: && GET_CODE (XVECEXP (PATTERN (trial), 0, 0)) == JUMP_INSN
3806: && (simplejump_p (XVECEXP (PATTERN (trial), 0, 0))
3807: || GET_CODE (PATTERN (XVECEXP (PATTERN (trial), 0, 0))) == RETURN)
3808: && redundant_insn_p (XVECEXP (PATTERN (trial), 0, 1), insn, 0))
3809: {
3810: target_label = JUMP_LABEL (XVECEXP (PATTERN (trial), 0, 0));
3811: if (target_label == 0)
3812: target_label = find_end_label ();
3813:
3814: if (redirect_with_delay_slots_safe_p (delay_insn, target_label,
3815: insn))
3816: {
3817: reorg_redirect_jump (delay_insn, target_label);
3818: next = insn;
3819: continue;
3820: }
3821: }
3822: }
3823:
3824: if (! INSN_ANNULLED_BRANCH_P (delay_insn)
3825: && prev_active_insn (target_label) == insn
3826: #ifdef HAVE_cc0
3827: /* If the last insn in the delay slot sets CC0 for some insn,
3828: various code assumes that it is in a delay slot. We could
3829: put it back where it belonged and delete the register notes,
3830: but it doesn't seem worthwhile in this uncommon case. */
3831: && ! find_reg_note (XVECEXP (pat, 0, XVECLEN (pat, 0) - 1),
3832: REG_CC_USER, NULL_RTX)
3833: #endif
3834: )
3835: {
3836: int i;
3837:
3838: /* All this insn does is execute its delay list and jump to the
3839: following insn. So delete the jump and just execute the delay
3840: list insns.
3841:
3842: We do this by deleting the INSN containing the SEQUENCE, then
3843: re-emitting the insns separately, and then deleting the jump.
3844: This allows the count of the jump target to be properly
3845: decremented. */
3846:
3847: /* Clear the from target bit, since these insns are no longer
3848: in delay slots. */
3849: for (i = 0; i < XVECLEN (pat, 0); i++)
3850: INSN_FROM_TARGET_P (XVECEXP (pat, 0, i)) = 0;
3851:
3852: trial = PREV_INSN (insn);
3853: delete_insn (insn);
3854: emit_insn_after (pat, trial);
3855: delete_scheduled_jump (delay_insn);
3856: continue;
3857: }
3858:
3859: /* See if this is an unconditional jump around a single insn which is
3860: identical to the one in its delay slot. In this case, we can just
3861: delete the branch and the insn in its delay slot. */
3862: if (next && GET_CODE (next) == INSN
3863: && prev_label (next_active_insn (next)) == target_label
3864: && simplejump_p (insn)
3865: && XVECLEN (pat, 0) == 2
3866: && rtx_equal_p (PATTERN (next), PATTERN (XVECEXP (pat, 0, 1))))
3867: {
3868: delete_insn (insn);
3869: continue;
3870: }
3871:
3872: /* See if this jump (with its delay slots) branches around another
3873: jump (without delay slots). If so, invert this jump and point
3874: it to the target of the second jump. We cannot do this for
3875: annulled jumps, though. Again, don't convert a jump to a RETURN
3876: here. */
3877: if (! INSN_ANNULLED_BRANCH_P (delay_insn)
3878: && next && GET_CODE (next) == JUMP_INSN
3879: && (simplejump_p (next) || GET_CODE (PATTERN (next)) == RETURN)
3880: && next_active_insn (target_label) == next_active_insn (next)
3881: && no_labels_between_p (insn, next))
3882: {
3883: rtx label = JUMP_LABEL (next);
3884: rtx old_label = JUMP_LABEL (delay_insn);
3885:
3886: if (label == 0)
3887: label = find_end_label ();
3888:
3889: if (redirect_with_delay_slots_safe_p (delay_insn, label, insn))
3890: {
3891: /* Be careful how we do this to avoid deleting code or labels
3892: that are momentarily dead. See similar optimization in
3893: jump.c */
3894: if (old_label)
3895: ++LABEL_NUSES (old_label);
3896:
3897: if (invert_jump (delay_insn, label))
3898: {
3899: delete_insn (next);
3900: next = insn;
3901: }
3902:
3903: if (old_label && --LABEL_NUSES (old_label) == 0)
3904: delete_insn (old_label);
3905: continue;
3906: }
3907: }
3908:
3909: /* If we own the thread opposite the way this insn branches, see if we
3910: can merge its delay slots with following insns. */
3911: if (INSN_FROM_TARGET_P (XVECEXP (pat, 0, 1))
3912: && own_thread_p (NEXT_INSN (insn), 0, 1))
3913: try_merge_delay_insns (insn, next);
3914: else if (! INSN_FROM_TARGET_P (XVECEXP (pat, 0, 1))
3915: && own_thread_p (target_label, target_label, 0))
3916: try_merge_delay_insns (insn, next_active_insn (target_label));
3917:
3918: /* If we get here, we haven't deleted INSN. But we may have deleted
3919: NEXT, so recompute it. */
3920: next = next_active_insn (insn);
3921: }
3922: }
3923:
3924: #ifdef HAVE_return
3925:
3926: /* Look for filled jumps to the end of function label. We can try to convert
3927: them into RETURN insns if the insns in the delay slot are valid for the
3928: RETURN as well. */
3929:
3930: static void
3931: make_return_insns (first)
3932: rtx first;
3933: {
3934: rtx insn, jump_insn, pat;
3935: rtx real_return_label = end_of_function_label;
3936: int slots, i;
3937:
3938: /* See if there is a RETURN insn in the function other than the one we
3939: made for END_OF_FUNCTION_LABEL. If so, set up anything we can't change
3940: into a RETURN to jump to it. */
3941: for (insn = first; insn; insn = NEXT_INSN (insn))
3942: if (GET_CODE (insn) == JUMP_INSN && GET_CODE (PATTERN (insn)) == RETURN)
3943: {
3944: real_return_label = get_label_before (insn);
3945: break;
3946: }
3947:
3948: /* Show an extra usage of REAL_RETURN_LABEL so it won't go away if it
3949: was equal to END_OF_FUNCTION_LABEL. */
3950: LABEL_NUSES (real_return_label)++;
3951:
3952: /* Clear the list of insns to fill so we can use it. */
3953: obstack_free (&unfilled_slots_obstack, unfilled_firstobj);
3954:
3955: for (insn = first; insn; insn = NEXT_INSN (insn))
3956: {
3957: int flags;
3958:
3959: /* Only look at filled JUMP_INSNs that go to the end of function
3960: label. */
3961: if (GET_CODE (insn) != INSN
3962: || GET_CODE (PATTERN (insn)) != SEQUENCE
3963: || GET_CODE (XVECEXP (PATTERN (insn), 0, 0)) != JUMP_INSN
3964: || JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) != end_of_function_label)
3965: continue;
3966:
3967: pat = PATTERN (insn);
3968: jump_insn = XVECEXP (pat, 0, 0);
3969:
3970: /* If we can't make the jump into a RETURN, redirect it to the best
3971: RETURN and go on to the next insn. */
3972: if (! reorg_redirect_jump (jump_insn, NULL_RTX))
3973: {
3974: reorg_redirect_jump (jump_insn, real_return_label);
3975: continue;
3976: }
3977:
3978: /* See if this RETURN can accept the insns current in its delay slot.
3979: It can if it has more or an equal number of slots and the contents
3980: of each is valid. */
3981:
3982: flags = get_jump_flags (jump_insn, JUMP_LABEL (jump_insn));
3983: slots = num_delay_slots (jump_insn);
3984: if (slots >= XVECLEN (pat, 0) - 1)
3985: {
3986: for (i = 1; i < XVECLEN (pat, 0); i++)
3987: if (! (
3988: #ifdef ANNUL_IFFALSE_SLOTS
3989: (INSN_ANNULLED_BRANCH_P (jump_insn)
3990: && INSN_FROM_TARGET_P (XVECEXP (pat, 0, i)))
3991: ? eligible_for_annul_false (jump_insn, i - 1,
3992: XVECEXP (pat, 0, i), flags) :
3993: #endif
3994: #ifdef ANNUL_IFTRUE_SLOTS
3995: (INSN_ANNULLED_BRANCH_P (jump_insn)
3996: && ! INSN_FROM_TARGET_P (XVECEXP (pat, 0, i)))
3997: ? eligible_for_annul_true (jump_insn, i - 1,
3998: XVECEXP (pat, 0, i), flags) :
3999: #endif
4000: eligible_for_delay (jump_insn, i -1, XVECEXP (pat, 0, i), flags)))
4001: break;
4002: }
4003: else
4004: i = 0;
4005:
4006: if (i == XVECLEN (pat, 0))
4007: continue;
4008:
4009: /* We have to do something with this insn. If it is an unconditional
4010: RETURN, delete the SEQUENCE and output the individual insns,
4011: followed by the RETURN. Then set things up so we try to find
4012: insns for its delay slots, if it needs some. */
4013: if (GET_CODE (PATTERN (jump_insn)) == RETURN)
4014: {
4015: rtx prev = PREV_INSN (insn);
4016:
4017: delete_insn (insn);
4018: for (i = 1; i < XVECLEN (pat, 0); i++)
4019: prev = emit_insn_after (PATTERN (XVECEXP (pat, 0, i)), prev);
4020:
4021: insn = emit_jump_insn_after (PATTERN (jump_insn), prev);
4022: emit_barrier_after (insn);
4023:
4024: if (slots)
4025: obstack_ptr_grow (&unfilled_slots_obstack, insn);
4026: }
4027: else
4028: /* It is probably more efficient to keep this with its current
4029: delay slot as a branch to a RETURN. */
4030: reorg_redirect_jump (jump_insn, real_return_label);
4031: }
4032:
4033: /* Now delete REAL_RETURN_LABEL if we never used it. Then try to fill any
4034: new delay slots we have created. */
4035: if (--LABEL_NUSES (real_return_label) == 0)
4036: delete_insn (real_return_label);
4037:
4038: fill_simple_delay_slots (first, 1);
4039: fill_simple_delay_slots (first, 0);
4040: }
4041: #endif
4042:
4043: /* Try to find insns to place in delay slots. */
4044:
4045: void
4046: dbr_schedule (first, file)
4047: rtx first;
4048: FILE *file;
4049: {
4050: rtx insn, next, epilogue_insn = 0;
4051: int i;
4052: #if 0
4053: int old_flag_no_peephole = flag_no_peephole;
4054:
4055: /* Execute `final' once in prescan mode to delete any insns that won't be
4056: used. Don't let final try to do any peephole optimization--it will
4057: ruin dataflow information for this pass. */
4058:
4059: flag_no_peephole = 1;
4060: final (first, 0, NO_DEBUG, 1, 1);
4061: flag_no_peephole = old_flag_no_peephole;
4062: #endif
4063:
4064: /* If the current function has no insns other than the prologue and
4065: epilogue, then do not try to fill any delay slots. */
4066: if (n_basic_blocks == 0)
4067: return;
4068:
4069: /* Find the highest INSN_UID and allocate and initialize our map from
4070: INSN_UID's to position in code. */
4071: for (max_uid = 0, insn = first; insn; insn = NEXT_INSN (insn))
4072: {
4073: if (INSN_UID (insn) > max_uid)
4074: max_uid = INSN_UID (insn);
4075: if (GET_CODE (insn) == NOTE
4076: && NOTE_LINE_NUMBER (insn) == NOTE_INSN_EPILOGUE_BEG)
4077: epilogue_insn = insn;
4078: }
4079:
4080: uid_to_ruid = (int *) alloca ((max_uid + 1) * sizeof (int *));
4081: for (i = 0, insn = first; insn; i++, insn = NEXT_INSN (insn))
4082: uid_to_ruid[INSN_UID (insn)] = i;
4083:
4084: /* Initialize the list of insns that need filling. */
4085: if (unfilled_firstobj == 0)
4086: {
4087: gcc_obstack_init (&unfilled_slots_obstack);
4088: unfilled_firstobj = (rtx *) obstack_alloc (&unfilled_slots_obstack, 0);
4089: }
4090:
4091: for (insn = next_active_insn (first); insn; insn = next_active_insn (insn))
4092: {
4093: rtx target;
4094:
4095: INSN_ANNULLED_BRANCH_P (insn) = 0;
4096: INSN_FROM_TARGET_P (insn) = 0;
4097:
4098: /* Skip vector tables. We can't get attributes for them. */
4099: if (GET_CODE (insn) == JUMP_INSN
4100: && (GET_CODE (PATTERN (insn)) == ADDR_VEC
4101: || GET_CODE (PATTERN (insn)) == ADDR_DIFF_VEC))
4102: continue;
4103:
4104: if (num_delay_slots (insn) > 0)
4105: obstack_ptr_grow (&unfilled_slots_obstack, insn);
4106:
4107: /* Ensure all jumps go to the last of a set of consecutive labels. */
4108: if (GET_CODE (insn) == JUMP_INSN && condjump_p (insn)
4109: && JUMP_LABEL (insn) != 0
4110: && ((target = prev_label (next_active_insn (JUMP_LABEL (insn))))
4111: != JUMP_LABEL (insn)))
4112: redirect_jump (insn, target);
4113: }
4114:
4115: /* Indicate what resources are required to be valid at the end of the current
4116: function. The condition code never is and memory always is. If the
4117: frame pointer is needed, it is and so is the stack pointer unless
4118: EXIT_IGNORE_STACK is non-zero. If the frame pointer is not needed, the
4119: stack pointer is. Registers used to return the function value are
4120: needed. Registers holding global variables are needed. */
4121:
4122: end_of_function_needs.cc = 0;
4123: end_of_function_needs.memory = 1;
4124: CLEAR_HARD_REG_SET (end_of_function_needs.regs);
4125:
4126: if (frame_pointer_needed)
4127: {
4128: SET_HARD_REG_BIT (end_of_function_needs.regs, FRAME_POINTER_REGNUM);
4129: #if HARD_FRAME_POINTER_REGNUM != FRAME_POINTER_REGNUM
4130: SET_HARD_REG_BIT (end_of_function_needs.regs, HARD_FRAME_POINTER_REGNUM);
4131: #endif
4132: #ifdef EXIT_IGNORE_STACK
4133: if (! EXIT_IGNORE_STACK)
4134: #endif
4135: SET_HARD_REG_BIT (end_of_function_needs.regs, STACK_POINTER_REGNUM);
4136: }
4137: else
4138: SET_HARD_REG_BIT (end_of_function_needs.regs, STACK_POINTER_REGNUM);
4139:
4140: if (current_function_return_rtx != 0
4141: && GET_CODE (current_function_return_rtx) == REG)
4142: mark_referenced_resources (current_function_return_rtx,
4143: &end_of_function_needs, 1);
4144:
4145: for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
4146: if (global_regs[i])
4147: SET_HARD_REG_BIT (end_of_function_needs.regs, i);
4148:
4149: /* The registers required to be live at the end of the function are
4150: represented in the flow information as being dead just prior to
4151: reaching the end of the function. For example, the return of a value
4152: might be represented by a USE of the return register immediately
4153: followed by an unconditional jump to the return label where the
4154: return label is the end of the RTL chain. The end of the RTL chain
4155: is then taken to mean that the return register is live.
4156:
4157: This sequence is no longer maintained when epilogue instructions are
4158: added to the RTL chain. To reconstruct the original meaning, the
4159: start of the epilogue (NOTE_INSN_EPILOGUE_BEG) is regarded as the
4160: point where these registers become live (start_of_epilogue_needs).
4161: If epilogue instructions are present, the registers set by those
4162: instructions won't have been processed by flow. Thus, those
4163: registers are additionally required at the end of the RTL chain
4164: (end_of_function_needs). */
4165:
4166: start_of_epilogue_needs = end_of_function_needs;
4167:
4168: while (epilogue_insn = next_nonnote_insn (epilogue_insn))
4169: mark_set_resources (epilogue_insn, &end_of_function_needs, 0, 1);
4170:
4171: /* Show we haven't computed an end-of-function label yet. */
4172: end_of_function_label = 0;
4173:
4174: /* Allocate and initialize the tables used by mark_target_live_regs. */
4175: target_hash_table
4176: = (struct target_info **) alloca ((TARGET_HASH_PRIME
4177: * sizeof (struct target_info *)));
4178: bzero (target_hash_table, TARGET_HASH_PRIME * sizeof (struct target_info *));
4179:
4180: bb_ticks = (int *) alloca (n_basic_blocks * sizeof (int));
4181: bzero (bb_ticks, n_basic_blocks * sizeof (int));
4182:
4183: /* Initialize the statistics for this function. */
4184: bzero (num_insns_needing_delays, sizeof num_insns_needing_delays);
4185: bzero (num_filled_delays, sizeof num_filled_delays);
4186:
4187: /* Now do the delay slot filling. Try everything twice in case earlier
4188: changes make more slots fillable. */
4189:
4190: for (reorg_pass_number = 0;
4191: reorg_pass_number < MAX_REORG_PASSES;
4192: reorg_pass_number++)
4193: {
4194: fill_simple_delay_slots (first, 1);
4195: fill_simple_delay_slots (first, 0);
4196: fill_eager_delay_slots (first);
4197: relax_delay_slots (first);
4198: }
4199:
4200: /* Delete any USE insns made by update_block; subsequent passes don't need
4201: them or know how to deal with them. */
4202: for (insn = first; insn; insn = next)
4203: {
4204: next = NEXT_INSN (insn);
4205:
4206: if (GET_CODE (insn) == INSN && GET_CODE (PATTERN (insn)) == USE
4207: && GET_RTX_CLASS (GET_CODE (XEXP (PATTERN (insn), 0))) == 'i')
4208: next = delete_insn (insn);
4209: }
4210:
4211: /* If we made an end of function label, indicate that it is now
4212: safe to delete it by undoing our prior adjustment to LABEL_NUSES.
4213: If it is now unused, delete it. */
4214: if (end_of_function_label && --LABEL_NUSES (end_of_function_label) == 0)
4215: delete_insn (end_of_function_label);
4216:
4217: #ifdef HAVE_return
4218: if (HAVE_return && end_of_function_label != 0)
4219: make_return_insns (first);
4220: #endif
4221:
4222: obstack_free (&unfilled_slots_obstack, unfilled_firstobj);
4223:
4224: /* It is not clear why the line below is needed, but it does seem to be. */
4225: unfilled_firstobj = (rtx *) obstack_alloc (&unfilled_slots_obstack, 0);
4226:
4227: /* Reposition the prologue and epilogue notes in case we moved the
4228: prologue/epilogue insns. */
4229: reposition_prologue_and_epilogue_notes (first);
4230:
4231: if (file)
4232: {
4233: register int i, j, need_comma;
4234:
4235: for (reorg_pass_number = 0;
4236: reorg_pass_number < MAX_REORG_PASSES;
4237: reorg_pass_number++)
4238: {
4239: fprintf (file, ";; Reorg pass #%d:\n", reorg_pass_number + 1);
4240: for (i = 0; i < NUM_REORG_FUNCTIONS; i++)
4241: {
4242: need_comma = 0;
4243: fprintf (file, ";; Reorg function #%d\n", i);
4244:
4245: fprintf (file, ";; %d insns needing delay slots\n;; ",
4246: num_insns_needing_delays[i][reorg_pass_number]);
4247:
4248: for (j = 0; j < MAX_DELAY_HISTOGRAM; j++)
4249: if (num_filled_delays[i][j][reorg_pass_number])
4250: {
4251: if (need_comma)
4252: fprintf (file, ", ");
4253: need_comma = 1;
4254: fprintf (file, "%d got %d delays",
4255: num_filled_delays[i][j][reorg_pass_number], j);
4256: }
4257: fprintf (file, "\n");
4258: }
4259: }
4260: }
4261: }
4262: #endif /* DELAY_SLOTS */
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.