--- gcc/PROJECTS	2018/04/24 16:37:52	1.1
+++ gcc/PROJECTS	2018/04/24 16:51:24	1.1.1.5
@@ -1,5 +1,15 @@
 1. Better optimization.
 
+* Constants in unused inline functions
+
+It would be nice to delay output of string constants so that string
+constants mentioned in unused inline functions are never generated.
+Perhaps this would also take care of string constants in dead code.
+
+The difficulty is in finding a clean way for the RTL which refers
+to the constant (currently, only by an assembler symbol name)
+to point to the constant and cause it to be output.
+
 * More cse
 
 The techniques for doing full global cse are described in the
@@ -21,32 +31,61 @@ It is probably not hard to handle cse fr
 around to the beginning, and a few loops would be greatly sped
 up by this.
 
-* Iteration variables and strength reduction.
+* Support more general tail-recursion among different functions.
 
-The red dragon book describes standard techniques for these kinds
-of loop optimization.  But be careful!  These optimization techniques
-don't always make the code better.  You need to avoid performing
-the standard transformations unless they are greatly worth while.
-
-In many common cases it is possible to deduce that an iteration
-variable is always positive during the loop.  This information
-may make it possible to use decrement-and-branch instructions
-whose branch conditions are inconvenient.  For example, the 68000
-`dbra' instruction branches if the value was not equal to zero.
-Therefore, it is not applicable to `for (i = 10; i >= 0; i--)'
-unless the compiler can know that I will never be negative
-before it is decremented.
-
-* Special local optimizations.
-
-The instruction combiner finds only certain classes of local optimizations.
-For example, it cannot use the 68020 instruction `cmp2' because it would
-not think to combine the instructions that would be equivalent to a `cmp2'.
-
-In order to take advantage of such instructions, the combiner would need
-special hints as to which instructions to consider combining.  To be
-generally useful, this feature would have to be controlled somehow
-by new information in the machine description.
+This might be possible under certain circumstances, such as when
+the argument lists of the functions have the same lengths.
+Perhaps it could be done with a special declaration.
+
+You would need to verify in the calling function that it does not
+use the addresses of any local variables and does not use setjmp.
+
+* Put short static variables at low addresses and use short addressing mode?
+Useful on the 68000/68020 and perhaps on the 32000 series,
+provided one has a linker that works with the feature.
+This is said to make a 15% speedup on the 68000.
+This brings to mind Hayes' changes for Stanford MIPS.
+
+* Detect dead stores into memory?
+
+A store into memory is dead if it is followed by another store into
+the same location; and, in between, there is no reference to anything
+that might be that location (including no reference to a variable
+address).
+
+* Loop optimization.
+
+Strength reduction and iteration variable elimination could be
+smarter.  They should know how to decide which iteration variables are
+not worth making explicit because they can be computed as part of an
+address calculation.  Based on this information, they should decide
+when it is desirable to eliminate one iteration variable and create
+another in its place.
+
+It should be possible to compute what the value of an iteration
+variable will be at the end of the loop, and eliminate the variable
+within the loop by computing that value at the loop end.
+
+When a loop has a simple increment that adds 1,
+instead of jumping in after the increment,
+decrement the loop count and jump to the increment.
+This allows aob insns to be used.
+
+* Using constraints on values.
+
+Many operations could be simplified based on knowledge of the
+minimum and maximum possible values of a register at any particular time.
+These limits could come from the data types in the tree, via rtl generation,
+or they can be deduced from operations that are performed.  For example,
+the result of an `and' operation one of whose operands is 7 must be in
+the range 0 to 7.  Compare instructions also tell something about the
+possible values of the operand, in the code beyond the test.
+
+Value constraints can be used to determine the results of a further
+comparison.  They can also indicate that certain `and' operations are
+redundant.  Constraints might permit a decrement and branch
+instruction that checks zeroness to be used when the user has
+specified to exit if negative.
 
 * Smarter reload pass.
 
@@ -74,18 +113,61 @@ all the places that use it.
 It might be possible to make better code by paying attention
 to the order in which to generate code for subexpressions of an expression.
 
-* Better code for switch statements.
+* More code motion.
+
+Consider hoisting common code up past conditional branches or
+tablejumps.
 
-If a switch statement has only a few cases, a sequence of conditional
-branches is generated for it, rather than a jump table.  It would
-be better to output a binary tree of branches.
+* Trace scheduling.
+
+This technique is said to be able to figure out which way a jump
+will usually go, and rearrange the code to make that path the
+faster one.
 
 * Distributive law.
 
-*(X + 4 * (Y + C)) compiles better as *(X + 4*C + 4*Y)
-on some machines because of known addressing modes.
-It may be tricky to determine when, and for which machines,
-to use each alternative.
+The C expression *(X + 4 * (Y + C)) compiles better on certain
+machines if rewritten as *(X + 4*C + 4*Y) because of known addressing
+modes.  It may be tricky to determine when, and for which machines, to
+use each alternative.
+
+Some work has been done on this, in combine.c.
+
+* Jump-execute-next.
+
+Many recent machines have jumps which optionally execute the following
+instruction before the instruction jumped to, either conditionally or
+unconditionally.  To take advantage of this capability requires a new
+compiler pass that would reorder instructions when possible.  After
+reload may be a good place for it.
+
+On some machines, the result of a load from memory is not available
+until after the following instruction.  The easiest way to support
+these machines is to output each RTL load instruction as two assembler
+instructions, the second being a no-op.  Putting useful instructions
+after the load instructions may be a similar task to putting them
+after jump instructions.
+
+* Pipeline scheduling.
+
+On many machines, code gets faster if instructions are reordered
+so that pipelines are kept full.  Doing the best possible job of this
+requires knowing which functional units each kind of instruction executes
+in and how long the functional unit stays busy with it.  Then the
+goal is to reorder the instructions to keep many functional units
+busy but never feed them so fast they must wait.
+
+* Can optimize by changing if (x) y; else z; into z; if (x) y;
+if z and x do not interfere and z has no effects not undone by y.
+This is desirable if z is faster than jumping.
+
+* For a two-insn loop on the 68020, such as
+  foo:	movb	a2@+,a3@+
+	jne	foo
+it is better to insert dbeq d0,foo before the jne.
+d0 can be a junk register.  The challenge is to fit this into
+a portable framework: when can you detect this situation and
+still be able to allocate a junk register?
 
 2. Simpler porting.
 
@@ -110,36 +192,163 @@ kind of addressing, and this pattern wou
 
 3. Other languages.
 
-Front ends for Pascal, Fortran, Algol, Cobol and Ada are desirable.
+Front ends for Pascal, Fortran, Algol, Cobol, Modula-2 and Ada are
+desirable.
 
-Pascal requires the implementation of functions within functions.
-Some of the mechanisms for this already exist.
+Pascal, Modula-2 and Ada require the implementation of functions
+within functions.  Some of the mechanisms for this already exist.
 
 4. Generalize the machine model.
 
-4.A. Parameters in registers.
-
-One some machines, conventions are that some parameters are passed
-in general registers.  The compiler currently cannot handle this.
-
-This requires changes in the code in expr.c for function calls.
-For function entry, changes are required in stmt.c, and in
-layout_parms, and perhaps also in final and in register allocation,
-but the last should be minor.
-
-Where stmt.c now copies the stack slot into a pseudo register,
-instead copy the special argument register into a pseudo register.
-Use the pseudo register throughout the body of the function to
-represent the parameter.  That way, parameters can still be spilled
-to the stack.
-
-4.B. Jump-execute-next.
-
-Many recent machines have jumps which execute the following instruction
-before the instruction jumped to.  To take advantage of this capability
-requires a new compiler pass that would reorder instructions when possible.
-After reload is a good place for it.
+* Some new compiler features may be needed to do a good job on machines
+where static data needs to be addressed using base registers.
 
-5. Add a profiling feature like Berkeley's -pg,
-or other debugging and measurement features.
+* Some machines have two stacks in different areas of memory, one used
+for scalars and another for large objects.  The compiler does not
+now have a way to understand this.
+
+5. Precompilation of header files.
+
+In the future, many programs will use thousands of lines of header files.
+Compiling the headers might be slower than compiling the guts of any one
+source file.  Here is a scheme for precompiling header files to make
+compilation faster for a sequence of headers which is often used.
+
+A precompiled header corresponds to a sequence of header files.  The
+preprocessor recognizes when the input starts with a sequence of
+`#include' commands and searches a data base for a precompiled header
+corresponding to that sequence.  The modtimes of all these files are
+stored in the data base so that one can tell whether the precompiled
+header is still valid.
+
+For robustness, each directory should have its own collection of
+precompiled headers and its own data base of them.  Probably each
+precompiled header would be a file and the data base would be one
+more file.
+
+The data base records the entire collection of predefined macros and
+their definitions, except for __FILE__, __LINE__ and __DATE__, for
+each precompiled header.  If this collection does not match the setup
+at the start of the current compilation (including the results of -D
+and -U switches), the precompiled header is inapplicable.  It might
+be possible to have distinct precompiled headers for the same sequence
+of header files but different collections of predefined macros.
+
+The state of any option that affects macro processing, such as -ansi
+or -traditional, must also be recorded, and the precompiled header is
+valid only if these options match.
+
+The precompiled header contains an ordered series of strings.  Some
+strings are marked "unconditional"; these must be compiled each time
+the precompiled header is used.  Other strings have keys, which
+are identifiers.  A string with keys must be compiled if at least one
+of its keys is mentioned in the input.  The order these strings appear
+in the precompiled header is called their intrinsic order.
+
+The C preprocessor reads in the precompiled header file and scans all
+the strings, making for each key an entry in the same symbol table
+used for macros, pointing at a list of all the strings for which it is
+a key.  Each string must have a flag (one flag per string, not one per
+key per string).  The same code in `rescan' that detects references to
+macros would detect a reference to a key and flag all of the strings
+that it belongs to as needing to be output.
+
+Each of these strings is immediately recursively macroexpanded (i.e.
+`rescan' is called), but the output from this is discarded.  The
+expansion is to detect any other keys mentioned in the string, and to
+define any macros for which the string contains a #define.  The key's
+symbol table entry is then deleted to save time if the key is
+encountered again, and to avoid an infinite recursion.
+
+The unconditional strings are macroexpanded with `rescan' (but the
+output is discarded) at some time before anything is actually output.
+
+At the end of compilation, before any of the actual input text is
+output, the list of strings is scanned in the intrinsic order, and
+each string that is unconditional or flagged is output verbatim,
+except that any #define lines are discarded.
+
+Precompiled headers would be constructed by explicit request with a
+special tool.  The first step is to run cpp on the sequence of header
+files' contents.  This would use a new option that would cause all
+#define lines to be output unchanged as well as defining the macro.
+The second step is to divide the output into strings, some keyed and
+some unconditional.  This division is done without changing the order
+of the text being divided up.
+
+JNC@lcs.mit.edu has some ideas on this subject also.
+
+6. Other possibly nice features.
+
+* cpp could have a #provide directive.
+#provide would have the same syntax as #include,
+and it would nullify any future #include directive
+with the same argument.  Thus, the file foo.h
+could contain #provide <foo> to prevent itself from
+being included twice.
+
+This is much cleaner than the alternative sometimes implemented,
+which is to require the user to use something other than #include
+in order to ensure inclusion only once.
+
+7. Better documentation of how GCC works and how to port it.
+
+Here is an outline proposed by Allan Adler.
+
+I.    Overview of this document
+II.   The machines on which GCC is implemented
+    A. Prose description of those characteristics of target machines and
+       their operating systems which are pertinent to the implementation
+       of GCC.
+	i. target machine characteristics
+	ii. comparison of this system of machine characteristics with
+	    other systems of machine specification currently in use
+    B. Tables of the characteristics of the target machines on which
+       GCC is implemented.
+    C. A priori restrictions on the values of characteristics of target
+       machines, with special reference to those parts of the source code
+       which entail those restrictions
+	i. restrictions on individual characteristics
+	ii. restrictions involving relations between various characteristics
+    D. The use of GCC as a cross-compiler
+	i. cross-compilation to existing machines
+	ii. cross-compilation to non-existent machines
+    E. Assumptions which are made regarding the target machine
+	i.  assumptions regarding the architecture of the target machine
+	ii. assumptions regarding the operating system of the target machine
+	iii. assumptions regarding software resident on the target machine
+	iv. where in the source code these assumptions are in effect made
+III.   A systematic approach to writing the files tm.h and xm.h
+    A. Macros which require special care or skill
+    B. Examples, with special reference to the underlying reasoning
+IV.    A systematic approach to writing the machine description file md
+    A. Minimal viable sets of insn descriptions
+    B. Examples, with special reference to the underlying reasoning
+V.     Uses of the file aux-output.c
+VI.    Specification of what constitutes correct performance of an 
+       implementation of GCC
+    A. The components of GCC
+    B. The itinerary of a C program through GCC
+    C. A system of benchmark programs
+    D. What your RTL and assembler should look like with these benchmarks
+    E. Fine tuning for speed and size of compiled code
+VII.   A systematic procedure for debugging an implementation of GCC
+    A. Use of GDB
+	i. the macros in the file .gdbinit for GCC
+	ii. obstacles to the use of GDB
+	    a. functions implemented as macros can't be called in GDB
+    B. Debugging without GDB
+	i. How to turn off the normal operation of GCC and access specific
+	   parts of GCC
+    C. Debugging tools
+    D. Debugging the parser
+	i. how machine macros and insn definitions affect the parser
+    E. Debugging the recognizer
+	i. how machine macros and insn definitions affect the recognizer
+
+ditto for other components
+
+VIII. Data types used by GCC, with special reference to restrictions not 
+      specified in the formal definition of the data type
+IX.   References to the literature for the algorithms used in GCC