|
|
1.1 ! root 1: 1. Better optimization. ! 2: ! 3: * More cse ! 4: ! 5: The techniques for doing full global cse are described in the ! 6: red dragon book. It is likely to be slow and use a lot of memory, ! 7: but it might be worth offering as an additional option. ! 8: ! 9: It is probably possible to extend cse to a few very frequent cases ! 10: without so much expense. ! 11: ! 12: For example, it is not very hard to handle cse through if-then ! 13: statements with no else clauses. Here's how to do it. On reaching a ! 14: label, notice that the label's use-count is 1 and that the last ! 15: preceding jump jumps conditionally to this label. Now you know it ! 16: is a simple if-then statement. Remove from the hash table ! 17: all the expressions that were entered since that jump insn ! 18: and you can continue with cse. ! 19: ! 20: It is probably not hard to handle cse from the end of a loop ! 21: around to the beginning, and a few loops would be greatly sped ! 22: up by this. ! 23: ! 24: * Iteration variables and strength reduction. ! 25: ! 26: The red dragon book describes standard techniques for these kinds ! 27: of loop optimization. But be careful! These optimization techniques ! 28: don't always make the code better. You need to avoid performing ! 29: the standard transformations unless they are greatly worth while. ! 30: ! 31: In many common cases it is possible to deduce that an iteration ! 32: variable is always positive during the loop. This information ! 33: may make it possible to use decrement-and-branch instructions ! 34: whose branch conditions are inconvenient. For example, the 68000 ! 35: `dbra' instruction branches if the value was not equal to zero. ! 36: Therefore, it is not applicable to `for (i = 10; i >= 0; i--)' ! 37: unless the compiler can know that I will never be negative ! 38: before it is decremented. ! 39: ! 40: * Special local optimizations. ! 41: ! 42: The instruction combiner finds only certain classes of local optimizations. ! 43: For example, it cannot use the 68020 instruction `cmp2' because it would ! 44: not think to combine the instructions that would be equivalent to a `cmp2'. ! 45: ! 46: In order to take advantage of such instructions, the combiner would need ! 47: special hints as to which instructions to consider combining. To be ! 48: generally useful, this feature would have to be controlled somehow ! 49: by new information in the machine description. ! 50: ! 51: * Smarter reload pass. ! 52: ! 53: The reload pass as currently written can reload values only into registers ! 54: that are reserved for reloading. This means that in order to use a ! 55: register for reloading it must spill everything out of that register. ! 56: ! 57: It would be straightforward, though complicated, for reload1.c to keep ! 58: track, during its scan, of which hard registers were available at each ! 59: point in the function, and use for reloading even registers that were ! 60: free only at the point they were needed. This would avoid much spilling ! 61: and make better code. ! 62: ! 63: * Change the type of a variable. ! 64: ! 65: Sometimes a variable is declared as `int', it is assigned only once ! 66: from a value of type `char', and then it is used only by comparison ! 67: against constants. On many machines, better code would result if ! 68: the variable had type `char'. If the compiler could detect this ! 69: case, it could change the declaration of the variable and change ! 70: all the places that use it. ! 71: ! 72: * Order of subexpressions. ! 73: ! 74: It might be possible to make better code by paying attention ! 75: to the order in which to generate code for subexpressions of an expression. ! 76: ! 77: * Better code for switch statements. ! 78: ! 79: If a switch statement has only a few cases, a sequence of conditional ! 80: branches is generated for it, rather than a jump table. It would ! 81: be better to output a binary tree of branches. ! 82: ! 83: * Distributive law. ! 84: ! 85: *(X + 4 * (Y + C)) compiles better as *(X + 4*C + 4*Y) ! 86: on some machines because of known addressing modes. ! 87: It may be tricky to determine when, and for which machines, ! 88: to use each alternative. ! 89: ! 90: 2. Simpler porting. ! 91: ! 92: Right now, describing the target machine's instructions is done ! 93: cleanly, but describing its addressing mode is done with several ! 94: ad-hoc macro definitions. Porting would be much easier if there were ! 95: an RTL description for addressing modes like that for instructions. ! 96: Tools analogous to genflags and genrecog would generate macros from ! 97: this description. ! 98: ! 99: There would be one pattern in the address-description file for each ! 100: kind of addressing, and this pattern would have: ! 101: ! 102: * the RTL expression for the address ! 103: * C code to verify its validity (since that may depend on ! 104: the exact data). ! 105: * C code to print the address in assembler language. ! 106: * C code to convert the address into a valid one, if it is not valid. ! 107: (This would replace LEGITIMIZE_ADDRESS). ! 108: * Register constraints for all indeterminates that appear ! 109: in the RTL expression. ! 110: ! 111: 3. Other languages. ! 112: ! 113: Front ends for Pascal, Fortran, Algol, Cobol and Ada are desirable. ! 114: ! 115: Pascal requires the implementation of functions within functions. ! 116: Some of the mechanisms for this already exist. ! 117: ! 118: 4. Generalize the machine model. ! 119: ! 120: 4.A. Parameters in registers. ! 121: ! 122: One some machines, conventions are that some parameters are passed ! 123: in general registers. The compiler currently cannot handle this. ! 124: ! 125: This requires changes in the code in expr.c for function calls. ! 126: For function entry, changes are required in stmt.c, and in ! 127: layout_parms, and perhaps also in final and in register allocation, ! 128: but the last should be minor. ! 129: ! 130: Where stmt.c now copies the stack slot into a pseudo register, ! 131: instead copy the special argument register into a pseudo register. ! 132: Use the pseudo register throughout the body of the function to ! 133: represent the parameter. That way, parameters can still be spilled ! 134: to the stack. ! 135: ! 136: 4.B. Jump-execute-next. ! 137: ! 138: Many recent machines have jumps which execute the following instruction ! 139: before the instruction jumped to. To take advantage of this capability ! 140: requires a new compiler pass that would reorder instructions when possible. ! 141: After reload is a good place for it. ! 142: ! 143: 5. Add a profiling feature like Berkeley's -pg, ! 144: or other debugging and measurement features. ! 145:
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.