Annotation of gcc/PROJECTS, revision 1.1.1.2

1.1       root        1: 1. Better optimization.
                      2: 
                      3: * More cse
                      4: 
                      5: The techniques for doing full global cse are described in the
                      6: red dragon book.  It is likely to be slow and use a lot of memory,
                      7: but it might be worth offering as an additional option.
                      8: 
                      9: It is probably possible to extend cse to a few very frequent cases
                     10: without so much expense.
                     11: 
                     12: For example, it is not very hard to handle cse through if-then
                     13: statements with no else clauses.  Here's how to do it.  On reaching a
                     14: label, notice that the label's use-count is 1 and that the last
                     15: preceding jump jumps conditionally to this label.  Now you know it
                     16: is a simple if-then statement.  Remove from the hash table
                     17: all the expressions that were entered since that jump insn
                     18: and you can continue with cse.
                     19: 
                     20: It is probably not hard to handle cse from the end of a loop
                     21: around to the beginning, and a few loops would be greatly sped
                     22: up by this.
                     23: 
                     24: * Iteration variables and strength reduction.
                     25: 
                     26: The red dragon book describes standard techniques for these kinds
                     27: of loop optimization.  But be careful!  These optimization techniques
                     28: don't always make the code better.  You need to avoid performing
                     29: the standard transformations unless they are greatly worth while.
                     30: 
                     31: In many common cases it is possible to deduce that an iteration
                     32: variable is always positive during the loop.  This information
                     33: may make it possible to use decrement-and-branch instructions
                     34: whose branch conditions are inconvenient.  For example, the 68000
                     35: `dbra' instruction branches if the value was not equal to zero.
                     36: Therefore, it is not applicable to `for (i = 10; i >= 0; i--)'
                     37: unless the compiler can know that I will never be negative
                     38: before it is decremented.
                     39: 
1.1.1.2 ! root       40: * Using constraints on values.
1.1       root       41: 
1.1.1.2 ! root       42: Many operations could be simplified based on knowledge of the
        !            43: minimum and maximum possible values of a register at any particular time.
        !            44: These limits could come from the data types in the tree, via rtl generation,
        !            45: or they can be deduced from operations that are performed.  For example,
        !            46: the result of an `and' operation one of whose operands is 7 must be in
        !            47: the range 0 to 7.  Compare instructions also tell something about the
        !            48: possible values of the operand, in the code beyond the test.
        !            49: 
        !            50: Value constraints can be used to determine the results of a further
        !            51: comparison.  They can also indicate that certain `and' operations are
        !            52: redundant.  Constraints might permit a decrement and branch
        !            53: instruction that checks zeroness to be used when the user has
        !            54: specified to exit if negative.
1.1       root       55: 
                     56: * Smarter reload pass.
                     57: 
                     58: The reload pass as currently written can reload values only into registers
                     59: that are reserved for reloading.  This means that in order to use a
                     60: register for reloading it must spill everything out of that register.
                     61: 
                     62: It would be straightforward, though complicated, for reload1.c to keep
                     63: track, during its scan, of which hard registers were available at each
                     64: point in the function, and use for reloading even registers that were
                     65: free only at the point they were needed.  This would avoid much spilling
                     66: and make better code.
                     67: 
                     68: * Change the type of a variable.
                     69: 
                     70: Sometimes a variable is declared as `int', it is assigned only once
                     71: from a value of type `char', and then it is used only by comparison
                     72: against constants.  On many machines, better code would result if
                     73: the variable had type `char'.  If the compiler could detect this
                     74: case, it could change the declaration of the variable and change
                     75: all the places that use it.
                     76: 
                     77: * Order of subexpressions.
                     78: 
                     79: It might be possible to make better code by paying attention
                     80: to the order in which to generate code for subexpressions of an expression.
                     81: 
                     82: * Better code for switch statements.
                     83: 
                     84: If a switch statement has only a few cases, a sequence of conditional
                     85: branches is generated for it, rather than a jump table.  It would
                     86: be better to output a binary tree of branches.
                     87: 
                     88: * Distributive law.
                     89: 
1.1.1.2 ! root       90: The C expression *(X + 4 * (Y + C)) compiles better on certain
        !            91: machines if rewritten as *(X + 4*C + 4*Y) because of known addressing
        !            92: modes.  It may be tricky to determine when, and for which machines, to
        !            93: use each alternative.
        !            94: 
        !            95: * Jump-execute-next.
        !            96: 
        !            97: Many recent machines have jumps which optionally execute the following
        !            98: instruction before the instruction jumped to, either conditionally or
        !            99: unconditionally.  To take advantage of this capability requires a new
        !           100: compiler pass that would reorder instructions when possible.  After
        !           101: reload may be a good place for it.
        !           102: 
        !           103: On some machines, the result of a load from memory is not available
        !           104: until after the following instruction.  The easiest way to support
        !           105: these machines is to output each RTL load instruction as two assembler
        !           106: instructions, the second being a no-op.  Putting useful instructions
        !           107: after the load instructions may be a similar task to putting them
        !           108: after jump instructions.
        !           109: 
        !           110: * Pipeline scheduling.
        !           111: 
        !           112: On many machines, code gets faster if instructions are reordered
        !           113: so that pipelines are kept full.  Doing the best possible job of this
        !           114: requires knowing which functional units each kind of instruction executes
        !           115: in and how long the functional unit stays busy with it.  Then the
        !           116: goal is to reorder the instructions to keep many functional units
        !           117: busy but never feed them so fast they must wait.
1.1       root      118: 
                    119: 2. Simpler porting.
                    120: 
                    121: Right now, describing the target machine's instructions is done
                    122: cleanly, but describing its addressing mode is done with several
                    123: ad-hoc macro definitions.  Porting would be much easier if there were
                    124: an RTL description for addressing modes like that for instructions.
                    125: Tools analogous to genflags and genrecog would generate macros from
                    126: this description.
                    127: 
                    128: There would be one pattern in the address-description file for each
                    129: kind of addressing, and this pattern would have:
                    130: 
                    131:   * the RTL expression for the address
                    132:   * C code to verify its validity (since that may depend on
                    133:     the exact data).
                    134:   * C code to print the address in assembler language.
                    135:   * C code to convert the address into a valid one, if it is not valid.
                    136:     (This would replace LEGITIMIZE_ADDRESS).
                    137:   * Register constraints for all indeterminates that appear
                    138:     in the RTL expression.
                    139: 
                    140: 3. Other languages.
                    141: 
1.1.1.2 ! root      142: Front ends for Pascal, Fortran, Algol, Cobol, Modula-2 and Ada are
        !           143: desirable.
1.1       root      144: 
1.1.1.2 ! root      145: Pascal, Modula-2 and Ada require the implementation of functions
        !           146: within functions.  Some of the mechanisms for this already exist.
1.1       root      147: 
                    148: 4. Generalize the machine model.
                    149: 
1.1.1.2 ! root      150: Some new compiler features may be needed to do a good job on machines
        !           151: where static data needs to be addressed using base registers.
1.1       root      152: 
1.1.1.2 ! root      153: Some machines have two stacks in different areas of memory, one used
        !           154: for scalars and another for large objects.  The compiler does not
        !           155: now have a way to understand this.

unix.superglobalmegacorp.com

This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.