--- gcc/PROJECTS	2018/04/24 16:38:49	1.1.1.2
+++ gcc/PROJECTS	2018/04/24 16:52:25	1.1.1.6
@@ -1,5 +1,22 @@
+0. Improved efficiency.
+
+* Parse and output array initializers an element at a time, freeing
+storage after each, instead of parsing the whole initializer first and
+then outputting.  This would reduce memory usage for large
+initializers.
+
 1. Better optimization.
 
+* Constants in unused inline functions
+
+It would be nice to delay output of string constants so that string
+constants mentioned in unused inline functions are never generated.
+Perhaps this would also take care of string constants in dead code.
+
+The difficulty is in finding a clean way for the RTL which refers
+to the constant (currently, only by an assembler symbol name)
+to point to the constant and cause it to be output.
+
 * More cse
 
 The techniques for doing full global cse are described in the
@@ -21,21 +38,45 @@ It is probably not hard to handle cse fr
 around to the beginning, and a few loops would be greatly sped
 up by this.
 
-* Iteration variables and strength reduction.
+* Support more general tail-recursion among different functions.
 
-The red dragon book describes standard techniques for these kinds
-of loop optimization.  But be careful!  These optimization techniques
-don't always make the code better.  You need to avoid performing
-the standard transformations unless they are greatly worth while.
-
-In many common cases it is possible to deduce that an iteration
-variable is always positive during the loop.  This information
-may make it possible to use decrement-and-branch instructions
-whose branch conditions are inconvenient.  For example, the 68000
-`dbra' instruction branches if the value was not equal to zero.
-Therefore, it is not applicable to `for (i = 10; i >= 0; i--)'
-unless the compiler can know that I will never be negative
-before it is decremented.
+This might be possible under certain circumstances, such as when
+the argument lists of the functions have the same lengths.
+Perhaps it could be done with a special declaration.
+
+You would need to verify in the calling function that it does not
+use the addresses of any local variables and does not use setjmp.
+
+* Put short statics vars at low addresses and use short addressing mode?
+Useful on the 68000/68020 and perhaps on the 32000 series,
+provided one has a linker that works with the feature.
+This is said to make a 15% speedup on the 68000.
+This brings to mind Hayes' changes for Stanford MIPS.
+
+* Detect dead stores into memory?
+
+A store into memory is dead if it is followed by another store into
+the same location; and, in between, there is no reference to anything
+that might be that location (including no reference to a variable
+address).
+
+* Loop optimization.
+
+Strength reduction and iteration variable elimination could be
+smarter.  They should know how to decide which iteration variables are
+not worth making explicit because they can be computed as part of an
+address calculation.  Based on this information, they should decide
+when it is desirable to eliminate one iteration variable and create
+another in its place.
+
+It should be possible to compute what the value of an iteration
+variable will be at the end of the loop, and eliminate the variable
+within the loop by computing that value at the loop end.
+
+When a loop has a simple increment that adds 1,
+instead of jumping in after the increment,
+decrement the loop count and jump to the increment.
+This allows aob insns to be used.
 
 * Using constraints on values.
 
@@ -79,11 +120,16 @@ all the places that use it.
 It might be possible to make better code by paying attention
 to the order in which to generate code for subexpressions of an expression.
 
-* Better code for switch statements.
+* More code motion.
+
+Consider hoisting common code up past conditional branches or
+tablejumps.
+
+* Trace scheduling.
 
-If a switch statement has only a few cases, a sequence of conditional
-branches is generated for it, rather than a jump table.  It would
-be better to output a binary tree of branches.
+This technique is said to be able to figure out which way a jump
+will usually go, and rearrange the code to make that path the
+faster one.
 
 * Distributive law.
 
@@ -92,6 +138,8 @@ machines if rewritten as *(X + 4*C + 4*Y
 modes.  It may be tricky to determine when, and for which machines, to
 use each alternative.
 
+Some work has been done on this, in combine.c.
+
 * Jump-execute-next.
 
 Many recent machines have jumps which optionally execute the following
@@ -116,6 +164,25 @@ in and how long the functional unit stay
 goal is to reorder the instructions to keep many functional units
 busy but never feed them so fast they must wait.
 
+* Can optimize by changing if (x) y; else z; into z; if (x) y;
+if z and x do not interfere and z has no effects not undone by y.
+This is desirable if z is faster than jumping.
+
+* For a two-insn loop on the 68020, such as
+  foo:	movb	a2@+,a3@+
+	jne	foo
+it is better to insert dbeq d0,foo before the jne.
+d0 can be a junk register.  The challenge is to fit this into
+a portable framework: when can you detect this situation and
+still be able to allocate a junk register?
+
+* For the 80387 floating point, perhaps it would be possible to use 3
+or 4 registers in the stack to hold register variables.  (It would be
+necessary to keep track of how those slots move in the stack as other
+pushes and pops are done.)  This is probably very tricky, but if
+you are a GCC wizard and you care about the speed of floating point on
+an 80386, you might want to work on it.
+
 2. Simpler porting.
 
 Right now, describing the target machine's instructions is done
@@ -147,9 +214,142 @@ within functions.  Some of the mechanism
 
 4. Generalize the machine model.
 
-Some new compiler features may be needed to do a good job on machines
+* Some new compiler features may be needed to do a good job on machines
 where static data needs to be addressed using base registers.
 
-Some machines have two stacks in different areas of memory, one used
+* Some machines have two stacks in different areas of memory, one used
 for scalars and another for large objects.  The compiler does not
 now have a way to understand this.
+
+5. Precompilation of header files.
+
+In the future, many programs will use thousands of lines of header files.
+Compiling the headers might be slower than compiling the guts of any one
+source file.  Here is a scheme for precompiling header files to make
+compilation faster for a sequence of headers which is often used.
+
+A precompiled header corresponds to a sequence of header files.  The
+preprocessor recognizes when the input starts with a sequence of
+`#include' commands and searches a data base for a precompiled header
+corresponding to that sequence.  The modtimes of all these files are
+stored in the data base so that one can tell whether the precompiled
+header is still valid.
+
+For robustness, each directory should have its own collection of
+precompiled headers and its own data base of them.  Probably each
+precompiled header would be a file and the data base would be one
+more file.
+
+The data base records the entire collection of predefined macros and
+their definitions, except for __FILE__, __LINE__ and __DATE__, for
+each precompiled header.  If this collection does not match the setup
+at the start of the current compilation (including the results of -D
+and -U switches), the precompiled header is inapplicable.  It might
+be possible to have distinct precompiled headers for the same sequence
+of header files but different collections of predefined macros.
+
+The state of any option that affects macro processing, such as -ansi
+or -traditional, must also be recorded, and the precompiled header is
+valid only if these options match.
+
+The precompiled header contains an ordered series of strings.  Some
+strings are marked "unconditional"; these must be compiled each time
+the precompiled header is used.  Other strings are have keys, which
+are identifiers.  A string with keys must be compiled if at least one
+of its keys is mentioned in the input.  The order these strings appear
+in the precompiled header is called their intrinsic order.
+
+The C preprocessor reads in the precompiled header file and scan all
+the strings, making for each key an entry in the same symbol table
+used for macros, pointing at a list of all the strings for which it is
+a key.  Each string must have a flag (one flag per string, not one per
+key per string).  The same code in `rescan' that detects references to
+macros would detect a reference to a key and flag all of the strings
+that it belongs to as needing to be output.
+
+Each of these strings is immediately recursively macroexpanded (i.e.
+`rescan' is called), but the output from this is discarded.  The
+expansion is to detect any other keys mentioned in the string, and to
+define any macros for which the string contains a #define.  The key's
+symbol table entry is be deleted to save time if the key is
+encountered again, and to avoid an infinite recursion.
+
+The unconditional strings are macroexpanded with `rescan' (but the
+output is discarded) at some time before anything is actually output.
+
+At the end of compilation, before any of the actual input text is
+output, the list of strings is scanned in the intrinsic order, and
+each string that is unconditional or flagged is output verbatim,
+except that any #define lines are discarded.
+
+Precompiled headers would be constructed by explicit request with a
+special tool.  The first step is to run cpp on the sequence of header
+files' contents.  This would use a new option that would cause all
+#define lines to be output unchanged as well as defining the macro.
+The second step is to divide the output into strings, some keyed and
+some unconditional.  This division is done without changing the order
+of the text being divided up.
+
+JNC@lcs.mit.edu has some ideas on this subject also.
+
+6. Better documentation of how GCC works and how to port it.
+
+Here is an outline proposed by Allan Adler.
+
+I.    Overview of this document
+II.   The machines on which GCC is implemented
+    A. Prose description of those characteristics of target machines and
+       their operating systems which are pertinent to the implementation
+       of GCC.
+	i. target machine characteristics
+	ii. comparison of this system of machine characteristics with
+	    other systems of machine specification currently in use
+    B. Tables of the characteristics of the target machines on which
+       GCC is implemented.
+    C. A priori restrictions on the values of characteristics of target 
+       machines, with special reference to those parts of the source code
+       which entail those restrictions
+	i. restrictions on individual characteristics 
+        ii. restrictions involving relations between various characteristics
+    D. The use of GCC as a cross-compiler 
+	i. cross-compilation to existing machines
+	ii. cross-compilation to non-existent machines
+    E. Assumptions which are made regarding the target machine
+	i.  assumptions regarding the architecture of the target machine
+	ii. assumptions regarding the operating system of the target machine
+	iii. assumptions regarding software resident on the target machine
+	iv. where in the source code these assumptions are in effect made
+III.   A systematic approach to writing the files tm.h and xm.h
+    A. Macros which require special care or skill
+    B. Examples, with special reference to the underlying reasoning
+IV.    A systematic approach to writing the machine description file md
+    A. Minimal viable sets of insn descriptions
+    B. Examples, with special reference to the underlying reasoning
+V.     Uses of the file aux-output.c
+VI.    Specification of what constitutes correct performance of an 
+       implementation of GCC
+    A. The components of GCC
+    B. The itinerary of a C program through GCC
+    C. A system of benchmark programs
+    D. What your RTL and assembler should look like with these benchmarks
+    E. Fine tuning for speed and size of compiled code
+VII.   A systematic procedure for debugging an implementation of GCC
+    A. Use of GDB
+	i. the macros in the file .gdbinit for GCC
+	ii. obstacles to the use of GDB
+	    a. functions implemented as macros can't be called in GDB
+    B. Debugging without GDB
+	i. How to turn off the normal operation of GCC and access specific
+	   parts of GCC
+    C. Debugging tools
+    D. Debugging the parser
+	i. how machine macros and insn definitions affect the parser
+    E. Debugging the recognizer
+	i. how machine macros and insn definitions affect the recognizer
+
+ditto for other components
+
+VIII. Data types used by GCC, with special reference to restrictions not 
+      specified in the formal definition of the data type
+IX.   References to the literature for the algorithms used in GCC
+