43BSD/ingres/doc/other/design.roff - annotate

Return to design.roff CVS log
Up to [CSRG BSD Unix] / 43BSD / ingres / doc / other
Annotation of 43BSD/ingres/doc/other/design.roff, revision 1.1.1.1

1.1       root        1: .ds
                      2: .po 10
                      3: .de ph
                      4: .sp 1
                      5: ..
                      6: .de se
                      7: .sp 1
                      8: ..
                      9: .de bb
                     10: .in+8
                     11: .sp
                     12: ..
                     13: .de be
                     14: .in -8
                     15: .sp
                     16: ..
                     17: .tr & 
                     18: .m3 3
                     19: .m1 +1
                     20: .he ''''
                     21: .bp
                     22: .ce14
                     23: THE DESIGN AND IMPLEMENTATION
                     24: OF INGRES
                     25: 
                     26: by
                     27: 
                     28: MICHAEL STONEBRAKER, EUGENE WONG AND PETER KREPS
                     29: DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
                     30: UNIVERSITY OF CALIFORNIA, BERKELEY, CA.
                     31: 
                     32: and
                     33: 
                     34: GERALD HELD
                     35: TANDEM COMPUTERS, INC.
                     36: CUPERTINO, CA.
                     37: .sp 5
                     38: 
                     39: 
                     40: 
                     41: 
                     42: ABSTRACT
                     43: .ph
                     44: This paper describes the CURRENTLY OPERATIONAL
                     45: version of the INGRES data base management system.
                     46: This multi-user system gives a relational view of
                     47: data, supports two high level non-procedural data
                     48: sublanguages and runs as a collection of user
                     49: processes on top of the UNIX operating system for
                     50: Digital Equipment Corporation PDP 11/40, 11/45 and 11/70
                     51: computers.
                     52: Stressed here are the design decisions and
                     53: tradeoffs related to 1) structuring the
                     54: system into processes, 2) embedding one
                     55: command language in a general purpose programming language,
                     56: 3) the algorithms implemented to process interactions,
                     57: 4) the access methods implemented 5) the concurrency
                     58: and recovery control provided 6) support for views, protection
                     59: and integrity constraints and 7) the data
                     60: structures used for system catalogs and role
                     61: of the data base administrator.
                     62: .bp
                     63: .fo 'DESIGN OF INGRES (I)'-%-'12/5/75'
                     64: 1  INTRODUCTION
                     65: 
                     66: INGRES (Interactive Graphics and Retrieval System) is
                     67: a relational data base system which is 
                     68: implemented
                     69: on top of
                     70: the UNIX operating system developed at Bell Telephone
                     71: Laboratories [RITC74] for Digital Equipment
                     72: Corporation PDP 11/40, 11/45 and 11/70 computer systems.
                     73: The implementation of INGRES is primarily programmed in "C", a high
                     74: level language in which UNIX itself is written.  Parsing is done
                     75: with the assistance of YACC, a compiler-compiler available
                     76: on UNIX [JOHN74].
                     77: 
                     78: The advantages of a relational model for data
                     79: base management systems have been extensively discussed
                     80: in the literature, [CODD70,CODD74,DATE74] and hardly require further elaboration.
                     81: In choosing the relational model, we were particularly 
                     82: motivated by (a) the high degree of data independence that such
                     83: a model affords, and (b) the possibility
                     84: of providing a high level and entirely procedure-free
                     85: facility for data definition, retrieval, update, access
                     86: control, support of views, and integrity
                     87: verification.
                     88: 
                     89: In this paper we will describe the design decisions made in INGRES.
                     90: In particular, we will stress the design and implementation of:
                     91: 
                     92: .ss
                     93: a) the embedding of all INGRES commands in the general purpose 
                     94: programming language "C"
                     95: 
                     96: b) the access methods implemented
                     97: 
                     98: c) the catalog structure and the role of the data base
                     99: administrator
                    100: 
                    101: d) support for views, protection and integrity constraints
                    102: 
                    103: e) the decomposition procedure implemented
                    104: 
                    105: f) implementation of updates and consistency of secondary indices
                    106: 
                    107: g) recovery and concurrency control
                    108: 
                    109: .ds
                    110: Except where noted to the contrary, this paper describes the 
                    111: INGRES system operational in January, 1976.
                    112: 
                    113: To this end we first briefly describe in Section 1.2  
                    114: the primary query language supported, QUEL,
                    115: and the utility commands accepted by the current system.   
                    116: The second user interface, CUPID, is a graphics
                    117: oriented, casual user language which is also
                    118: operational and described in [MCDO75a, MCDO75b].
                    119: It will not be discussed further in this paper.
                    120: Then in Section
                    121: 1.3 we describe the relevant factors in the UNIX environment which have affected our 
                    122: design decisions.
                    123: 
                    124: In Section 2 we discuss the structure of the four
                    125: processes (see Section 1.3 for a discussion of
                    126: this UNIX notion)
                    127: into which INGRES is divided and the reasoning behind the 
                    128: choice implemented.  The EQUEL (Embedded QUEL) precompiler, which allows the substitution of 
                    129: a user-supplied C program for the "front end" process is also 
                    130: discussed.
                    131: This program has the effect of embedding all
                    132: of INGRES in the general purpose
                    133: programming language "C".
                    134: Then in Section 3 we indicate the data structures which are implemented in
                    135: INGRES, the catalog (system) relations which exist and the role 
                    136: of the data base administrator with respect to all relations in a data base.
                    137: The implemented access methods, their calling conventions, and the
                    138: actual layout of data pages in secondary storage
                    139: where appropriate, are also presented.
                    140: 
                    141: Sections 4, 5 and 6 discuss respectively the various functions of each of the 
                    142: three "core" processes in the system.  Also discussed are the design and 
                    143: implementation strategy of each process.  Lastly, Section 7 draws 
                    144: conclusions, suggests future extensions and indicates the nature of the
                    145: current applications run on INGRES.
                    146: 
                    147: 1.2 QUEL AND THE OTHER INGRES UTILITY COMMANDS
                    148: 
                    149: QUEL (QUEry Language) has points in common with Data Language/ALPHA
                    150: [CODD71], SQUARE
                    151: [BOYC73] and SEQUEL [CHAM74] in that it is a complete [CODD72] query language which frees the
                    152: programmer from concern for how data structures are implemented and what
                    153: algorithms are operating on stored data.  As such it facilitates a considerable
                    154: degree of data independence [STON74a].
                    155: 
                    156: The QUEL examples in this section all concern
                    157: the following relation.
                    158: 
                    159: .nf
                    160: .ne 7
                    161:             NAME     DEPT     SALARY      MANAGER      AGE
                    162: .ss
                    163: 
                    164:             Smith    toy      10000       Jones        25
                    165: EMPLOYEE    Jones    toy      15000       Johnson      32
                    166:             Adams    candy    12000       Baker        36
                    167:             Johnson  toy      14000       Harding      29
                    168:             Baker    admin    20000       Harding      47
                    169:             Harding  admin    40000       none         58
                    170: .ds
                    171: .fi
                    172: 
                    173: Indicated here is an EMPLOYEE relation with domains NAME, DEPT, SALARY,
                    174: MANAGER and AGE.  Each employee has a manager (except for Harding who
                    175: is presumably the company president), a salary, an age, and is in a department.
                    176: 
                    177: A QUEL interaction includes at least one RANGE statement of the form:
                    178: 
                    179:        RANGE OF variable-list IS relation-name
                    180: 
                    181: The symbols declared in the range statement are variables which will be used 
                    182: as arguments for tuples.  These are called TUPLE VARIABLES.  The purpose
                    183: of this statement is to specify the relation over which each variable ranges.
                    184: 
                    185: Moreover, an interaction includes one or more statements of the form:
                    186: 
                    187:        Command [Result-name] ( Target-list )
                    188:                [ WHERE Qualification ]
                    189: 
                    190: Here, Command is either RETRIEVE, APPEND, REPLACE,
                    191: or DELETE.
                    192: For RETRIEVE and APPEND,
                    193: Result-name is the name of the relation which qualifying tuples
                    194: will be retrieved into or appended to.
                    195: For REPLACE and DELETE, Result-name is the name of a tuple variable
                    196: which, through the qualification, identifies tuples to be modified or
                    197: deleted.
                    198: The Target-list is a list of the form
                    199: 
                    200:        Result-domain = Function ...
                    201: 
                    202: Here, the Result-domain's are domain names in the result relation
                    203: which are to be assigned the value of the corresponding
                    204: function.
                    205: 
                    206: The following suggest valid QUEL interactions.  A complete description of the
                    207: language is presented in [HELD75a].
                    208: 
                    209: Example 1.1     Find the birth year of employee Jones
                    210: 
                    211: .ss
                    212:        RANGE OF E IS EMPLOYEE
                    213:        RETRIEVE INTO W (BYEAR = 1975 - E.AGE)
                    214:        WHERE E.NAME = "Jones"
                    215: .ds
                    216: 
                    217: Here, E is a tuple variable which ranges over the EMPLOYEE relation and all tuples
                    218: in that relation are found which satisfy the qualification E.NAME = "Jones".
                    219: The result of the query is a new relation, W, which has a single domain,
                    220: BYEAR, that has been calculated for each qualifying tuple.
                    221: If the result relation is omitted, qualifying tuples are 
                    222: written in display format on the user's
                    223: terminal or returned to a calling
                    224: program in a prescribed format as discussed in Section 2.
                    225: Also, in the Target-list, the "Result-domain =" may be omitted
                    226: if Function is the right hand side is an existing domain (i.e. NAME = E.NAME may
                    227: be written as E.NAME -- see example 1.6).
                    228: 
                    229: Example 1.2    Insert the tuple (Jackson,candy,13000,Baker,30) into EMPLOYEE.
                    230: 
                    231: .ss
                    232:        APPEND TO EMPLOYEE(NAME = "Jackson", DEPT = "candy",
                    233:                SALARY = 13000, MGR = "Baker", AGE = 30)
                    234: .ds
                    235: 
                    236: Here, the result relation EMPLOYEE is modified by adding the indicated tuple
                    237: to the relation.  
                    238: If not all domains are specified,
                    239: the remainder default to zero for numeric domains
                    240: and null for character strings.
                    241: 
                    242: Example 1.3    If a second relation DEPT(DEPT, FLOOR#) contains
                    243: the floor# of each department that an
                    244: employee might work in, then one can fire
                    245: everybody on the first floor as follows:
                    246: 
                    247: .ss
                    248:        RANGE OF E IS EMPLOYEE
                    249:        RANGE OF D IS DEPT
                    250:        DELETE E WHERE   E.DEPT = D.DEPT 
                    251:                     AND  D.FLOOR# = 1
                    252: .ds
                    253: 
                    254: Here E specifies that the EMPLOYEE relation is to be modified.
                    255: All tuples are to be removed which have a value for DEPT
                    256: which is the same as some department of the first floor.
                    257: 
                    258: Example 1.4    Give a 10 percent raise to Jones if he works on the first floor
                    259: 
                    260: .ss
                    261: 
                    262:        RANGE OF E IS EMPLOYEE
                    263:        RANGE of D is DEPT
                    264:        REPLACE E(SALARY BY 1.1 * E.SALARY)
                    265:        WHERE E.NAME = "Jones" AND
                    266:               E.DEPT = D.DEPT AND D.FLOOR# = 1
                    267: 
                    268: .ds
                    269: 
                    270: Here, E.SALARY is to be replaced by 1.1*E.SALARY for those tuples in EMPLOYEE
                    271: where the qualification is true.
                    272: (Note that the keywords IS and BY may be used interchangeably with "=" in
                    273: any QUEL statement.)
                    274: 
                    275: Also, QUEL contains aggregation operators including COUNT, SUM, MAX, MIN,
                    276: and AVG.
                    277: Two examples of the use of aggregation follow.
                    278: 
                    279: Example 1.5    Replace the salary of all toy department employees by the average
                    280: toy department salary.
                    281: 
                    282: .ss
                    283:        RANGE OF E IS EMPLOYEE
                    284:        REPLACE E(SALARY BY AVG(E.SALARY WHERE E.DEPT = "toy") )
                    285:                WHERE E.DEPT = "toy"
                    286: .ds
                    287: 
                    288: Here, AVG is to be taken of the salary domain for those tuples satisfying 
                    289: the qualification E.DEPT = "toy".  Note that AVG(E.SALARY WHERE E.DEPT= "toy") is
                    290: scalar valued (in this instance, $13,000)
                    291: and consequently will be called an AGGREGATE.  More general 
                    292: aggregations are possible as suggested by the following example.
                    293: 
                    294: Example 1.6     Find those departments whose average salary exceeds the company
                    295: wide average salary, both averages to be taken only for those
                    296: employees whose salary exceeds $10000.
                    297: 
                    298: .ss
                    299:        RANGE OF E IS EMPLOYEE
                    300:        RETRIEVE INTO HIGHPAY (E.DEPT)
                    301:        WHERE   AVG(E.SALARY BY E.DEPT WHERE E.SALARY > 10000)
                    302:                >
                    303:                AVG(E.SALARY WHERE E.SALARY > 10000)
                    304: .ds
                    305: 
                    306: Here, AVG(E.SALARY BY E.DEPT WHERE E.SALARY>10000) is an AGGREGATE FUNCTION and takes
                    307: a value for each value of E.DEPT.  This value is the aggregate  AVG(E.SALARY
                    308: WHERE E.SALARY>10000 AND E.DEPT = value).  (For the toy, candy 
                    309: and admin departments this value is respectively 14,500, 12,000 and 30,000.)
                    310: The qualification expression for the
                    311: statement is then true for departments for which this aggregate function exceeds
                    312: the aggregate  AVG(E.SALARY WHERE E.SALARY>10000).  
                    313: 
                    314: In addition to the above QUEL commands 
                    315: INGRES also supports a variety of utility commands
                    316: These utility commands can be classified into 
                    317: seven
                    318: major categories.
                    319: 
                    320: a) invocation of INGRES
                    321: 
                    322:    INGRES data-base-name
                    323: 
                    324: This command executed from UNIX "logs in" a user to a given data base. 
                    325: (A data base is simply a named collection of
                    326: relations with a given data base administrator who has
                    327: powers not available to ordinary users.)
                    328: Thereafter, the user may issue all other commands
                    329: (except those executed directly from UNIX)
                    330: within the environment of the invoked data base.
                    331: 
                    332: b) creation and destruction of data bases
                    333: 
                    334:    CREATEDB data-base-name
                    335:    DESTROYDB data-base-name
                    336: 
                    337: These two commands are called from UNIX.  The invoker of CREATEDB 
                    338: must be authorized to create data bases (in a manner to be described 
                    339: presently) and he automatically becomes the data base administrator.
                    340: DESTROYDB successfully destroys a data base only if invoked by the data
                    341: base administrator.
                    342: 
                    343: c) creation and destruction of relations
                    344: 
                    345: .nf
                    346:    CREATE relname(domain-name IS format, domain-name IS format,...)
                    347:    DESTROY relname
                    348: .fi
                    349: 
                    350: These commands create and destroy relations within the current data base.
                    351: The invoker of the CREATE command becomes the "owner" of the relation 
                    352: created.  A user may only destroy a relation that he owns.
                    353: The current formats accepted by INGRES
                    354: are 1, 2 and 4 byte integers, 4 and 8
                    355: byte floating point numbers and fixed length ASCII character strings between
                    356: 1 and 255 bytes.
                    357: 
                    358: d) bulk copy of data
                    359: 
                    360: .ss
                    361:    COPY relname(domain-name IS format, domain-name IS format,...) 
                    362:      direction  "filename"
                    363: 
                    364:    PRINT relname
                    365: .ds
                    366: 
                    367: The command COPY transfers an entire relation to or from a UNIX file whose
                    368: name is "filename".  Direction is either "TO" or "FROM".  The format for
                    369: each domain is a description of how it appears (or is to appear) in the 
                    370: UNIX file.  The relation relname must exist and have domain names identical 
                    371: to the ones appearing in the COPY command.  However, the formats need not agree and
                    372: COPY will automatically convert data types.  Also, support is provided 
                    373: for dummy and variable length fields in a UNIX file.
                    374: 
                    375: PRINT copies a relation onto the user's terminal  
                    376: formatting it
                    377: as a report.  In this sense, it is stylized version of COPY.
                    378: 
                    379: e) storage structure modification
                    380: 
                    381:    MODIFY relname TO storage-structure ON (key1, key2,...)
                    382:    INDEX ON relname IS indexname(key1, key2,...)
                    383: 
                    384: The MODIFY command changes the storage structure of a relation from one 
                    385: access method to another.  The five access methods currently supported are 
                    386: discussed in Section 2.  The indicated keys are domains in relname which are 
                    387: concatenated left to right to form a combined key which is used in the 
                    388: organization of tuples in all but one of the access methods.
                    389: Only the owner of a relation may modify its storage structure.
                    390: 
                    391: INDEX creates a secondary index for a relation .  It has domains of 
                    392: key1, key2,...,pointer.  The domain, pointer, is the address of a tuple 
                    393: in the indexed relation having the given values for key1, key2,....
                    394: An index named AGEINDEX for EMPLOYEE would be the
                    395: following binary relation
                    396: 
                    397: .ss
                    398: .nf
                    399:                AGE            POINTER
                    400: 
                    401:                25      address of Smith's tuple
                    402:                32      address of Jones' tuple
                    403:   AGEINDEX     36      address of Adams' tuple
                    404:                29      address of Johnson's tuple
                    405:                47      address of Baker's tuple
                    406:                58      address of Harding's tuple
                    407: 
                    408: 
                    409: .ds
                    410: .fi
                    411: The relation indexname is in turn treated and accessed just like any other 
                    412: relation except that it is automatically updated when the relation it indexes
                    413: is updated.  This is discussed further in Section 6.
                    414: Naturally, only the owner of a relation may create and destroy secondary
                    415: indexes for it.
                    416: 
                    417: f) consistency and integrity control
                    418: 
                    419:    INTEGRITY CONSTRAINT is qualification
                    420:    INTEGRITY CONSTRAINT LIST relname
                    421:    INTEGRITY CONSTRAINT OFF relname
                    422:    INTEGRITY CONSTRAINT OFF (integer, ... ,integer)
                    423:    RESTORE data-base-name
                    424: 
                    425: The first four commands support the insertion, listing, deletion 
                    426: and selective deletion of integrity constraints which are to be enforced for all
                    427: interactions with a relation.  The mechanism for handling this enforcement 
                    428: is dicussed in Section 4.  The last command restores a data base to a 
                    429: consistent state after a system crash.  It must be executed from UNIX and its 
                    430: operation is discussed in Section 6.
                    431: The RESTORE command is only available
                    432: to the data-base administrator.
                    433: 
                    434: g) miscellaneous
                    435: 
                    436:    HELP [relname|manual-section]
                    437:    SAVE relname UNTIL expiration-date
                    438:    RELKILLER  data-base-name
                    439: 
                    440: HELP provides information about the system or the data base invoked.  When 
                    441: called with an optional argument which is a command name, HELP will 
                    442: return the appropriate page from the INGRES reference manual [ZOOK75].
                    443: When called with a relation name as an argument, it returns all information 
                    444: about that relation.  With no argument at all it returns information about
                    445: all relations in the current data base.
                    446: 
                    447: SAVE is the mechanism by which a user can declare his intention to keep a 
                    448: relation until a specified time.  RELKILLER is a UNIX command which can be 
                    449: invoked by a data base administrator to delete all relations  
                    450: whose "expiration-dates" have passed.  This should be done when
                    451: space in a data base is exhausted.
                    452: (The data base administrator can also remove any
                    453: relations from his data base using the DESTROY command,
                    454: regardless of who their owners are.)
                    455: 
                    456: Two comments should be noted at this time.
                    457: 
                    458: a) The system currently accepts the language specified as QUEL1 in [HELD75a].
                    459: Extension is in progress to accept QUELn.
                    460: 
                    461: b) The system currently does not accept views or protection statements.  Although
                    462: the algorithms have been specified [STON74b,STON75], they have not yet been implemented.
                    463: For this reason, no syntax for these statements is given in this section;
                    464: however the subject is dicussed further in Section 4.
                    465: 
                    466: 1.3  THE UNIX ENVIRONMENT
                    467: 
                    468: Two points concerning UNIX are worthy of mention in this section.
                    469: 
                    470: a) The UNIX file system
                    471: 
                    472: UNIX supports a tree structured file system similar to that of MULTICS.
                    473: Each file is either a directory (containing references to descendant 
                    474: files in the file system) or a data file.  Each data file can be viewed 
                    475: as an array 1 byte wide and 2**24 bytes long.  
                    476: (It is expected that this maximum length
                    477: will be increased by the UNIX implementors.)
                    478: Addressing in a file is similar
                    479: to referencing such an array.  Physically, each file is divided into 512 byte 
                    480: blocks (pages).  In response to a read request, UNIX moves one or more
                    481: pages from secondary memory to UNIX core buffers then returns to the user the 
                    482: the actual byte string desired.  If the same page is referenced again  
                    483: (by the same or another user) while
                    484: it is still in a core buffer, no disk I/O takes place.   
                    485: 
                    486: It is important to
                    487: note that UNIX pages data from
                    488: the file system into and out of system buffers
                    489: using a "least
                    490: recently used" replacement algorithm.  In this way the entire file system 
                    491: is managed as a large virtual store.
                    492: 
                    493: In part because the INGRES designers believe that a data base system 
                    494: should appear as a user job to UNIX and in part because they believe
                    495: that the operating system should deal with all space management issues 
                    496: for the mix of jobs being run, INGRES contains NO facilities to 
                    497: do its own memory management.
                    498: 
                    499: Each file in UNIX can be granted by its owner any combination of the 
                    500: following protection clauses:
                    501: 
                    502: .ss
                    503:  a) owner read
                    504:  b) owner write
                    505:  c) non owner read
                    506:  d) non owner write
                    507:  e) execute
                    508:  f) special execute
                    509: .ds
                    510: 
                    511: When INGRES is initially generated, a UNIX user named INGRES is 
                    512: created.  All data files managed by the INGRES system are 
                    513: owned by this "super-user" and have their protection status 
                    514: set to "owner read, owner write, no other access".  Consequently,
                    515: only the INGRES super-user can directly tamper with INGRES 
                    516: files.  (The protection system is currently being altered to 
                    517: optionally require the consent of the data base administrator 
                    518: before unrestricted access by the super-user is allowed.)
                    519: 
                    520: The INGRES object code is stored in files whose protection status is 
                    521: set to "special execute, no other access".  When a user invokes the 
                    522: INGRES system (by executing command a) above), UNIX creates the 
                    523: INGRES processes operating temporarily with a user-id of INGRES.  When 
                    524: a user exits from INGRES these processes are destroyed and the user is 
                    525: restored to operating with his own user-id.
                    526: 
                    527: Using this mechanism, the only way
                    528: a user may access an INGRES data base is to execute
                    529: INGRES object code.  This "safety latch" effectively isolates users from 
                    530: tampering directly with INGRES data.
                    531: 
                    532: b) The UNIX process structure
                    533: 
                    534: A process in UNIX is an address space (64K bytes or less on an 11/40,
                    535: 128K bytes or less on 11/45's and 11/70's) which is associated with a user-id 
                    536: and is the unit of work scheduled by the UNIX scheduler.  Processes 
                    537: may "fork" subprocesses; consequently, a parent process can be the root of 
                    538: a process subtree.
                    539: Furthermore, a process can request that UNIX
                    540: execute a file in a descendant process.
                    541: Such processes may communicate with each other
                    542: via an inter-process communication facility called
                    543: "pipes".  A pipe
                    544: may be declared as a one direction
                    545: communication link which is written into
                    546: by one process and read by a second one.
                    547: UNIX maintains synchronization
                    548: of pipes so no messages are lost.
                    549: Each process has a "standard input device" and a "standard output device".
                    550: These are usually the user's terminal but may be redirected by
                    551: the user to be files, pipes to other
                    552: processes, or other devices.
                    553: .ph
                    554: Lastly UNIX provides a facility for processes executing
                    555: re-entrant code to share
                    556: procedure segments if possible.
                    557: INGRES takes advantage of this facility
                    558: so the core space overhead of multiple
                    559: concurrent users is only that
                    560: required by data segments.
                    561: .ph
                    562: We turn in the next section to the process structure in which INGRES runs.
                    563: 
                    564: .bp
                    565: .fo 'DESIGN OF INGRES (II)'-%-'12/5/75'
                    566: 2  THE INGRES PROCESS STRUCTURE
                    567: 
                    568: INGRES can be invoked in two ways:  First, it can be directly invoked 
                    569: from UNIX by executing INGRES data-base-name; second it can be invoked by 
                    570: executing a program written using the EQUEL precompiler.  We discuss each 
                    571: in turn and then comment briefly on why two mechanisms exist.
                    572: 
                    573: 2.1 INVOCATION FROM UNIX
                    574: 
                    575: Issuing INGRES as a UNIX command
                    576: causes the process structure shown in Figure 1 to 
                    577: be created.
                    578: 
                    579: .ne 15
                    580: .ls 1
                    581: .nf
                    582: ________     ________     ________     ________     ________
                    583: |      |     |      |  A  |      |  B  |      |  C  |      |
                    584: | user |---->|      |---->|      |---->|      |---->|      |
                    585: | term-|     |      |     |      |     |      |     |      |
                    586: | inal |<----|      |<----|      |<----|      |<----|      |
                    587: |      |     |      |  F  |      |  E  |      |  D  |      |
                    588: ________     ________     ________     ________     ________
                    589: 
                    590:              process      process      process      process
                    591:                 1            2            3            4
                    592: 
                    593: 
                    594:                  INGRES Process Structure
                    595: 
                    596:                       Figure 1
                    597: .ls 2
                    598: 
                    599: .fi
                    600: Process 1 is an interactive terminal monitor which allows the user to 
                    601: formulate, print, edit and execute collections of INGRES commands.  It 
                    602: maintains a workspace with which the user interacts until he is 
                    603: satisfied with his interaction.
                    604: The contents of this workspace are passed
                    605: down pipe A as a string of ASCII characters
                    606: when execution is desired.
                    607: .ph
                    608: As noted above,
                    609: UNIX allows a user to alter the standard input and output devices for 
                    610: his processes when executing a command.  As a result the invoker of INGRES 
                    611: may direct the terminal monitor to take input from a user file  
                    612: (in which case he runs a "canned" collection of interactions)
                    613: and direct
                    614: output to another device (such as the line printer) or a file.
                    615: 
                    616: The current terminal monitor accepts the following commands.  Anything else 
                    617: is simply appended to the user's workspace.
                    618: 
                    619: .nf
                    620: .ss
                    621:          #  :  Erase the previous character.  Successive uses of this
                    622:                instruction will erase back to, but not beyond, the
                    623:                beginning of the current line.
                    624: 
                    625:           @  :  Erase the current line.  Successive uses of  this  in-
                    626:                struction are ignored.
                    627: 
                    628:        \\r   :  Erase the entire interaction (reset the workspace).  The
                    629:                former contents of the workspace are irretrieveably lost.
                    630: 
                    631:        \\p   :  Print the current workspace.  Its contents are printed
                    632:                on the user's terminal.
                    633: 
                    634:        \\e   :  Enter the UNIX text editor and begin accepting editor
                    635:                commands.  The editor allows sophisticated editing of the
                    636:                user's workspace.  This command is executed by simply
                    637:                "forking" a subprocess and executing the UNIX editor in it.
                    638: 
                    639:        \\g   :  Process the current query (go).  The contents of the
                    640:                workspace are transmitted to process 2.
                    641: 
                    642:        \\q   :  Exit from INGRES.
                    643: 
                    644: .fi
                    645: 
                    646: .ds
                    647: Process 2 contains a lexical analyzer, a parser, query modification routines 
                    648: for integrity control (and in the future support of views and protection) 
                    649: and concurrency control.  When process 2 finishes, it passes a string of tokens 
                    650: to process 3 through pipe B.
                    651: Process 2 is discussed in Section 4.
                    652: 
                    653: Process 3 accepts this token string and contains execution routines for 
                    654: the commands
                    655: RETRIEVE, REPLACE, DELETE and APPEND.  Any update is  
                    656: turned into a RETRIEVE command to isolate
                    657: tuples to be changed.  Revised copies of modified tuples
                    658: are spooled into a special file.
                    659: This
                    660: file is then processed by a
                    661: "deferred update processor" in
                    662: process 4 which is discussed in Section 6.
                    663: .ph
                    664: Basically process 3 performs two functions for RETRIEVE commands.
                    665: a) A multivariable query is DECOMPOSED into a sequence of 
                    666: interactions involving only a single variable.
                    667: b) A one variable query is executed by a one variable query processor (OVQP).
                    668: OVQP in turn performs its function by making calls on the access methods.
                    669: These two functions are discussed in Section 5;
                    670: the access method are indicated in Section 3.
                    671: 
                    672: In process 4 resides all code to support utility commands (CREATE, DESTROY,
                    673: INDEX, etc.).  Process 3 simply passes to process 4 any commands which process 4 
                    674: will execute.
                    675: Process
                    676: 4 is organized as a collection of overlays which accomplish the various 
                    677: functions.  The structure of this process will be discussed in Section 6.
                    678: 
                    679: Error messages are passed back through pipes D, E and F
                    680: to process 1 which returns
                    681: them to the user. If the command is a RETRIEVE with no result
                    682: relation specified, process 3 returns qualifying tuples 
                    683: in a stylized format directly to the
                    684: "standard output device" of process 1.
                    685: Unless redirected, this is the user's terminal.
                    686: 
                    687: We now turn to the operation of INGRES when invoked by code from the 
                    688: precompiler.
                    689: 
                    690: 2.2  EQUEL
                    691: 
                    692: Although QUEL alone provides the flexibility for most data 
                    693: management requirements, there are many
                    694: applications which require a customized user
                    695: interface in place of the QUEL language.
                    696: For this as well as other reasons,
                    697: it is often useful to have the flexibility of a general purpose
                    698: programming language in addition to the
                    699: data base facilities of QUEL.
                    700: To this end, a new language, EQUEL (Embedded QUEL), has
                    701: been implemented which consists of QUEL embedded in the general purpose
                    702: programming language "C".
                    703: 
                    704: In this section we describe the EQUEL language and indicate how it
                    705: operates in the INGRES environment.
                    706: 
                    707: In the design of EQUEL, the following goals were set:
                    708: .in+5
                    709: .ti-3
                    710: 1)&The new language must have the full capabilities of both
                    711: "C" and QUEL.
                    712: .ti-3
                    713: 2)&The C program should have the capability for processing
                    714: each tuple individually which satisfies the qualification
                    715: in a QUEL RETRIEVE statement.
                    716: (this is the "piped" return facility described in
                    717: Data Language/ALPHA [CODD71]).
                    718: .ti-3
                    719: 3)&The implementation should make as much use as possible
                    720: of the existing C and QUEL language processors. 
                    721: (The implementation cost of EQUEL should be small).
                    722: .in-5
                    723: 
                    724: With these goals in mind, EQUEL was defined as follows:
                    725: .in+5
                    726: .ti-3
                    727: 1)&Any C language statement is a valid EQUEL statement.
                    728: .ti-3
                    729: 2)&Any QUEL statement (or INGRES utility command)
                    730: is a valid EQUEL statement as long as it is
                    731: prefixed by two number signs ("##").
                    732: .ti-3
                    733: 3)&C program variables may be used in QUEL statements in place
                    734: of relation names, domain names, target list elements,
                    735: or domain values.
                    736: The declaration statements of C variables used in this manner
                    737: must also be prefixed by double number signs.
                    738: .ti-3
                    739: 4)&RETRIEVE statements without a result relation have the form
                    740:        RETRIEVE (Target-list) 
                    741:                 [WHERE Qualification] C-Block
                    742: .br
                    743: which results in the C-Block
                    744: being executed once for each qualifying tuple.
                    745: .in-5
                    746: 
                    747: Two short examples illustrate EQUEL syntax.
                    748: 
                    749: Example 2.1  The following section of code implements a small
                    750: front end to INGRES which performs only one query.
                    751: It reads in the name of an employee and prints out the
                    752: employee's salary in a suitable format.
                    753: It continues to do this as long as there are more names
                    754: to be read in.
                    755: The functions READ and PRINT are assumed to have the obvious meaning.
                    756: 
                    757: .nf
                    758: .ss
                    759: .in +4
                    760: main()
                    761: {
                    762: ## char NAME[20];
                    763: ## int SAL;
                    764: while (READ(NAME))
                    765:        {
                    766: ##     RANGE OF X IS EMP
                    767: ##     RETRIEVE (SAL = X.SALARY) 
                    768: ##     WHERE X.NAME = NAME
                    769:                {
                    770:                PRINT("The salary of ",NAME," is ",SAL);
                    771:                }
                    772:        }
                    773: }
                    774: 
                    775: .fi
                    776: .ds
                    777: In this example the C-variable NAME is used in the qualification of
                    778: the QUEL statement and for each qualifying tuple,
                    779: the C-variable SAL is set to the appropriate value and then
                    780: the Print statement is executed.
                    781: (note: in C "{" and "}" are equivalent to BEGIN and END in ALGOL).
                    782: 
                    783: .in -4
                    784: 
                    785: 
                    786: Example 2.2  Read in a relation name and two domain names.
                    787: Then for each of a collection of values which the second
                    788: domain is to assume,
                    789: do some processing on all values
                    790: which the first domain assumes.
                    791: (We assume the functions READ and PROCESS exist and have the
                    792: obvious meanings.)
                    793: A more elaborate version of this program could serve as a simple 
                    794: report generator.
                    795: 
                    796: .nf
                    797: .ss
                    798: .in +4
                    799: ## int VALUE;
                    800: ## char RELNAME[13], DOMNAME[13], DOMVAL[80];
                    801: ## char DOMNAME_2[13];
                    802: READ(RELNAME);
                    803: READ(DOMNAME);
                    804: READ(DOMNAME_2);
                    805: ## RANGE OF X IS RELNAME
                    806: while (READ(DOMVAL))
                    807:         {
                    808: ##      RETRIEVE (VALUE = X.DOMNAME)
                    809: ##          WHERE X.DOMNAME_2 = DOMVAL
                    810:                 {
                    811:                 PROCESS(VALUE);
                    812:                 }
                    813:         }
                    814: .fi
                    815: .in -4
                    816: .ds
                    817: 
                    818: Any RANGE declaration (in this case the one for X) is assumed by 
                    819: INGRES to hold until redefined.
                    820: Hence, only one RANGE statement is required regardless of the number 
                    821: of times the RETRIEVE statement is executed.
                    822: 
                    823: In order to implement EQUEL, a translator (pre-compiler)
                    824: was written which converts an EQUEL program into a valid C-program
                    825: with QUEL statements converted to appropriate C-code and calls to
                    826: INGRES.
                    827: The resulting C-program is then compiled by the normal
                    828: C-compiler producing an executable module.
                    829: Moreover, when an EQUEL program is run, the executable module
                    830: produced by the C-compiler is used as the 
                    831: front end process in place of the interactive terminal monitor as noted 
                    832: in Figure 2.
                    833: .ne 15
                    834: .ls 1
                    835: 
                    836:              ________     ________     ________     ________
                    837:              |      |  A  |      |  B  |      |  C  |      |
                    838:              |      |---->|      |---->|      |---->|      |
                    839:              |      |     |      |     |      |     |      |
                    840:              |      |<----|      |<----|      |<----|      |
                    841:              |      |  F  |      |  E  |      |  D  |      |
                    842:              ________     ________     ________     ________
                    843: 
                    844:                 C         process      process      process
                    845:              program         2            3            4
                    846: 
                    847: 
                    848:                          The Forked Process Structure
                    849: 
                    850:                                 Figure 2
                    851: 
                    852: .ls 2
                    853: 
                    854: 
                    855: During execution of the front-end program,
                    856: data base requests (QUEL statements in the EQUEL program)
                    857: are passed through pipe A and processed by INGRES.
                    858: If tuples must be returned for tuple at a time processing,
                    859: then they are returned through a special data pipe set up between process 3 and the C program.
                    860: A condition code is also returned
                    861: through pipe F to indicate success or the type of error encountered.
                    862: 
                    863: Consequently, the EQUEL translator must perform the following five 
                    864: functions:
                    865: 
                    866: .in+5
                    867: 1) insert system calls to "spawn" at run time the process structure 
                    868: shown in Figure 2.
                    869: 
                    870: 2) note C-variable declarations prefaced by ## as legal for inclusion 
                    871: in INGRES commands.
                    872: 
                    873: 3) process other lines prefaced by ##.  These are parsed to isolate
                    874: C-variables.  In adddition, C statements are inserted to write the 
                    875: line down pipe A in ASCII format, modified so that 
                    876: values are substituted for any C-variables.
                    877: The rationale for not completely parsing a QUEL statement in EQUEL 
                    878: is given in [ALLM76].
                    879: 
                    880: 4) insert C statements to read pipe F for completion information 
                    881: and call the procedure IIerror.
                    882: The user may define IIerror himself or have EQUEL include a standard 
                    883: version which prints the error message (for abnormal terminations) 
                    884: and continues.
                    885: 
                    886: 5) If data is to be returned through the data pipe (by a RETRIEVE with no result
                    887: relation specified), EQUEL must also:
                    888: 
                    889: .in+5
                    890: a) insert C statements to read the data pipe for a tuple formatted as type/value
                    891: pairs.
                    892: 
                    893: b) insert C statements to substitute values into C-variables declared in the
                    894: target list.
                    895: If necessary, values are converted to the types of the declared C-variables.
                    896: 
                    897: c) insert C statements to pass control to the C-block following the RETRIEVE.
                    898: 
                    899: d) insert C statements following 
                    900: the block to return to step a) if there are more tuples.
                    901: .in-10
                    902: 
                    903: 
                    904: 2.3 COMMENTS ON THE PROCESS STRUCTURE
                    905: 
                    906: The process structure shown in Figures 1 and 2 is the fourth different process 
                    907: structure implemented.  The following considerations suggested this final choice:
                    908: 
                    909: a) Simple control flow.  Previous process structures had a more complex 
                    910: interconnection of processes which made debugging harder.   
                    911: 
                    912: b) Commands are passed to the right only.  Process 3 must issue commands 
                    913: to various overlays in process 4 to execute interactions as discussed in 
                    914: Section 5.  Hence, process 3 must be to the left of process 4.
                    915: 
                    916: c) The utility commands are expected to be called relatively infrequently 
                    917: compared to the activity in process 2 and 3.  Hence, it appears appropriate 
                    918: to overlay little used code in a single process.
                    919: The alternative is to create additional processes (and pipes)
                    920: which are quiescent most of the time.  This would
                    921: require added space in UNIX core tables for no particular advantage.
                    922: 
                    923: d) The first 3 processes are used frequently. 
                    924: Overlaying code in these processes was
                    925: tried in a previous version and slowed the
                    926: system considerably.
                    927: 
                    928: e) To run on an 11/40, the 64K address space limitation must be adhered to.  Processes
                    929: 2 and 3 are nearly their maximum size and hence cannot be combined. 
                    930: (For 11/45 and 11/70 versions we may experiment with such a combination.) 
                    931: 
                    932: f) The C program which replaces the terminal monitor as a front 
                    933: end must run with a user-id different from that of INGRES for protection reasons. 
                    934: (Otherwise it could tamper directly with data
                    935: managed by INGRES.)
                    936: Hence, either it must be overlayed into a process or run in its 
                    937: own process.  For efficiency and convenience, the latter was chosen.
                    938: 
                    939: g) The interactive terminal monitor could have been written (albeit 
                    940: clumsily) in EQUEL.  Such a strategy would have avoided the 
                    941: existence of two process structures which differ only by the
                    942: treatment of the data pipe.  This was not done because response time 
                    943: would have degraded and because EQUEL does type 
                    944: conversion to predefined types.  This feature would unnecessarily 
                    945: complicate the terminal monitor.
                    946: 
                    947: h) The processes are all synchronized (i.e. each waits for an
                    948: error return from the next process to the right before continuing to accept
                    949: input from the process to the left.
                    950: This is done because it simplifies the flow of
                    951: control.  Moreover in many instances the
                    952: various processes MUST be synchronized.
                    953: Future versions of INGRES may attempt to exploit parallelism 
                    954: where possible.
                    955: 
                    956: .fo 'DESIGN OF INGRES (III)'-%-'12/5/75'
                    957: .bp
                    958: 3  DATA STRUCTURES AND ACCESS METHODS
                    959: .ph
                    960: We begin this section with a discussion of the files
                    961: that INGRES manipulates and their contents.
                    962: Then we sketch the language used to access all
                    963: non directory files.
                    964: Finally, the five possible file formats are indicated.
                    965: .se
                    966: 3.1 THE INGRES FILE STRUCTURE
                    967: .ph
                    968: Figure 3 indicates the subtree of the UNIX
                    969: file system that INGRES manipulates.
                    970: 
                    971: .ss
                    972: 
                    973: 
                    974: 
                    975: 
                    976: 
                    977: 
                    978: 
                    979: 
                    980: 
                    981: 
                    982: 
                    983: 
                    984: 
                    985: 
                    986: 
                    987: 
                    988: 
                    989: 
                    990:        
                    991: .ce 3
                    992: The INGRES Subtree
                    993: 
                    994: Figure 3
                    995: 
                    996: .ds
                    997: 
                    998: The root of this subtree
                    999: is a directory made for the UNIX user "INGRES".
                   1000: It has six descendant directories.  The AUX
                   1001: directory contains descendant files containing tables which
                   1002: control the spawning of processes shown in
                   1003: Figures 1 and 2, and an authorization list of users who are allowed to
                   1004: create data bases.  Only the INGRES "super-user"
                   1005: may modify these files (by using the UNIX editor).
                   1006: BIN and SOURCE are directories indicating descendant
                   1007: files of respectively object and source code.
                   1008: TMP contains temporary files containing
                   1009: the workspaces used by the interactive
                   1010: terminal monitor.
                   1011: DOC is the root of a subtree with system documentation and the reference manual.
                   1012: Lastly there is a directory entry in DATADIR for each data base that
                   1013: exists in INGRES.
                   1014: These directories contain the data base files 
                   1015: in a given data base as descendants.
                   1016: .ph
                   1017: These data base files are of four types:
                   1018: .ph
                   1019: a)  an administration file.  This contains the user-id
                   1020: of the data base administrator (DBA) and 
                   1021: initialization information.
                   1022: .ph
                   1023: b)  System relations.  These relations have
                   1024: predefined names 
                   1025: and are created for every data base.
                   1026: They are owned by the DBA and 
                   1027: constitute the system catalogs.
                   1028: They may be queried by a knowledgeable user
                   1029: issuing RETRIEVE statements,
                   1030: however, they may be updated only by the INGRES utility commands
                   1031: (or directly by the INGRES "super-user" in an emergency).
                   1032: (When protection statements are implemented the DBA
                   1033: will be able to selectively restrict RETRIEVE access
                   1034: to these relations if he wishes.)
                   1035: The form and content of some of these relations will be presently discussed.
                   1036: .ph
                   1037: c)  DBA relations.  These are relations owned by the DBA and are
                   1038: shared in that any user may
                   1039: access them.
                   1040: When protection
                   1041: is implemented the DBA can "authorize" other users by inserting protection
                   1042: predicates (which will be in one of the system relations) and
                   1043: "deauthorize" them by removing such predicates.
                   1044: .ph
                   1045: d)  Other relations.  These are relations
                   1046: created by other users (by RETRIEVE into W or CREATE)
                   1047: and are NOT SHARED.
                   1048: .ph
                   1049: Three comments should be made at this time.
                   1050: .ph
                   1051: a)  The DBA has the following power not available
                   1052: to ordinary users:
                   1053: .bb
                   1054: 1) the ability to create shared relations and to specify access control
                   1055: for them
                   1056: .ph
                   1057: 2) the ability to run RELKILLER
                   1058: .ph
                   1059: 3) the ability to destroy any relations in his data base (except the
                   1060: system catalogs)
                   1061: .be
                   1062: This system allows "one level sharing" in that only the DBA has
                   1063: the powers in a) and he cannot delegate any of these
                   1064: powers to others (as in the file systems of most
                   1065: time-sharing systems).  This strategy was implemented for
                   1066: three reasons:
                   1067: .bb
                   1068: 1) The need for added generality was not perceived.
                   1069: Moreover, added generality would have created
                   1070: tedious problems (such as making revocation 
                   1071: of access privileges
                   1072: non trivial).
                   1073: .ph
                   1074: 2) It seems appropriate to entrust to the DBA the duty (and power) to resolve
                   1075: the policy decision which must be made when
                   1076: space is exhausted and some relations must be destroyed (or archived).  
                   1077: This policy decision becomes much harder (or impossible)
                   1078: if a data base is not in the control of one user.
                   1079: .ph
                   1080: 3) Someone must be entrusted with the policy
                   1081: decision concerning which relations to physically store and which to
                   1082: define as "views".  This "data base design" problem is
                   1083: best centralized in a single DBA.
                   1084: .be
                   1085: b) Except for the single administration file in each data base 
                   1086: every file
                   1087: is treated as a relation.
                   1088: Storing system catalogs as relations
                   1089: has the following advantages:
                   1090: .bb
                   1091: 1) Code is economized by
                   1092: sharing routines for accessing both catalog and data relations.
                   1093: .ph
                   1094: 2) Since several storage structures are supported for 
                   1095: accessing
                   1096: data relations quickly and flexibly under various interaction mixes, these same
                   1097: storage choices may be utilized to enhance access to catalog information.
                   1098: .ph
                   1099: 3) The ability to execute QUEL statements to examine (and patch)
                   1100: system relations where necessary has greatly
                   1101: aided system debugging.
                   1102: .be
                   1103: c)  Each relation is stored in a separate file, i.e., no attempt
                   1104: is made to "cluster" tuples from different relations 
                   1105: which may be accessed together
                   1106: on the same
                   1107: (or a nearby) page. 
                   1108: This decision is based on the following reasoning.
                   1109: .bb
                   1110: 1) The access methods would be more complicated if clustering were
                   1111: supported.
                   1112: .ph
                   1113: 2) UNIX has a small (512 byte) page size.
                   1114: Hence it is expected that the number of tuples which can be grouped
                   1115: on the same page is small.
                   1116: Moreover, logically adjacent pages in a UNIX file are NOT
                   1117: NECESSARILY physically adjacent.
                   1118: Hence clustering tuples on "nearby" pages has no meaning in
                   1119: UNIX; the next logical page in a file
                   1120: may be further away (in terms of disk arm motion) than a page in a
                   1121: different file.  In keeping with the design decision of NOT
                   1122: modifying UNIX, these considerations were incorporated
                   1123: in the design decision not to support clustering.
                   1124: .ph
                   1125: 3) Clustering of tuples only makes sense if associated
                   1126: tuples can be linked together using "sets" [CODA71] or
                   1127: "links" [TSIC75].  Incorporating these access paths into the 
                   1128: decomposition scheme would have greatly increased its complexity.
                   1129: .be
                   1130: .se
                   1131: 3.2  SYSTEM CATALOGS
                   1132: .ph
                   1133: We turn now to a discussion of the system catalogs. 
                   1134: We discuss two relations in detail and
                   1135: indicate briefly the contents of the others.
                   1136: .ph
                   1137: The RELATION relation contains one tuple
                   1138: for every relation in the data base (including all the system
                   1139: relations.)  The domains
                   1140: of this relation are:
                   1141: .in +24
                   1142: .na
                   1143: .ti8
                   1144: relid          the name of the relation
                   1145: .ti8
                   1146: owner          the UNIX user-id of the relation owner; 
                   1147: when appended to relid it produces a unique file name for storing
                   1148: the relation.
                   1149: .ti8
                   1150: spec           indicates one of 5 possible storage
                   1151: schemes or else a special code indicating a virtual
                   1152: relation (or "view").
                   1153: .ti8
                   1154: indexd         flag set if secondary index exists for this
                   1155: relation.  (This flag and the following two are
                   1156: present to improve performance by avoiding catalog
                   1157: lookup's when possible during query modification
                   1158: and one variable query processing.)
                   1159: .ti8
                   1160: protect                flag set if this relation has protection predicates.
                   1161: .ti 8
                   1162: integ          flag set if there are integrity constraints.
                   1163: .ti 8
                   1164: save           scheduled life time of relation.
                   1165: .ti 8
                   1166: tuples         number of tuples in relation.
                   1167: .ti 8
                   1168: atts           number of domains in relation.
                   1169: .ti 8
                   1170: width          width (in bytes) of a tuple.
                   1171: .ti 8
                   1172: prim           number of primary file pages for this relation.
                   1173: .in -24
                   1174: .ad
                   1175: .ph
                   1176: The ATTRIBUTE catalog contains information relating to
                   1177: individual domains of relations.
                   1178: Tuples of the ATTRIBUTE catalog contain the following items
                   1179: for each domain of every relation in the data base:
                   1180: 
                   1181: .in +24
                   1182: .na
                   1183: .ti8
                   1184: relid          name of relation in which attribute appears
                   1185: .ti8
                   1186: owner          relation owner
                   1187: .ti8
                   1188: domain-name    domain name
                   1189: .ti8
                   1190: domainno       domain number (position) in relation.  In processing
                   1191: interactions INGRES uses this number to reference this domain.
                   1192: .ti8
                   1193: offset         offset in bytes from beginning of tuple to beginning of domain.
                   1194: .ti8
                   1195: type           data type of domain (integer, floating point or character string).
                   1196: .ti8
                   1197: length         length (in bytes) of domain; 
                   1198: .ti8
                   1199: keyno          if this domain is part of a key, then "keyno"
                   1200: indicates the ordering of this domain within the key.
                   1201: .in -24
                   1202: .ad
                   1203: .ph
                   1204: These two catalogs together provide information about the structure and 
                   1205: content of each relation in the data base.  
                   1206: No doubt items will continue to be added or deleted as the system undergoes
                   1207: further development. 
                   1208: The first planned extensions are the minimum and maximum values 
                   1209: assumed by the domain.
                   1210: These will be used by a more sophisticated decomposition scheme being 
                   1211: developed, which is discussed
                   1212: briefly in the next section and in detail in [WONG76].
                   1213: The representation of the catalogs as relations
                   1214: has allowed this restructuring to occur very easily.
                   1215: .ph
                   1216: Several other system relations exist which provide auxiliary
                   1217: information about relations.
                   1218: The INDEX catalog contains a tuple for every
                   1219: secondary index in the data base.
                   1220: Since secondary indices are themselves relations
                   1221: they are independently cataloged in the RELATION and ATTRIBUTE
                   1222: relations.  However, the INDEX catalog provides the association
                   1223: between a primary relation and the secondary indices for it
                   1224: including which domains of the primary relation are in the index.
                   1225: .ph
                   1226: The PROTECTION and INTEGRITY catalogs contain respectively
                   1227: the protection and integrity predicates for each relation in the 
                   1228: data base.
                   1229: These predicates are stored in a partially processed 
                   1230: form as character strings.
                   1231: (This mechanism exists for INTEGRITY and will be implemented
                   1232: in the same way for PROTECTION.)
                   1233: The VIEW catalog will contain, for each virtual relation, a partially
                   1234: processed QUEL-like description which can be used to 
                   1235: construct the view from its component physical relations.
                   1236: The use of these last three catalogs will be described in 
                   1237: Section 4.
                   1238: The existence of any of this auxiliary information for
                   1239: a given relation is signalled by the appropriate flag(s) in
                   1240: the RELATION catalog.
                   1241: .ph
                   1242: Yet another set of system relations are those used by the
                   1243: graphics sub-system to catalog and process maps, which (like
                   1244: everything else) are stored as relations in the data base.
                   1245: This topic has been discussed separately in [GO75].
                   1246: .se
                   1247: 3.3 ACCESS METHODS INTERFACE (AMI).
                   1248: .ph
                   1249: We will now discuss in more detail the AMI which handles
                   1250: all actual accessing of data from relations.
                   1251: The AMI language is implemented as a set of functions whose calling 
                   1252: conventions are indicated below. 
                   1253: 
                   1254: .ph
                   1255: Each access method must do two things to support the
                   1256: following calls.  First it must provide SOME
                   1257: linear ordering of the tuples in a relation so
                   1258: that the concept of "next tuple" is well defined.
                   1259: Second it must assign to each tuple a tuple-id (TID) which
                   1260: uniquely identifies a tuple.
                   1261: .ph
                   1262: The nine implemented calls are as follows:
                   1263: .ph
                   1264: a)  openr(descriptor, mode, relation_name)
                   1265: .ph
                   1266: Before a relation may be accessed it must be "opened".
                   1267: This function opens the UNIX file for the relation and fills in
                   1268: a "descriptor"
                   1269: with information about the relation from the RELATION and ATTRIBUTE
                   1270: catalogs.
                   1271: The descriptor, which must be declared in the calling program, is used
                   1272: in subsequent calls on AMI routines as an input parameter 
                   1273: to indicate what relation is involved.
                   1274: Consequently, the AMI data accessing routines need not themselves check the
                   1275: system catalogs for the description of a relation.
                   1276: "Mode" specifies whether the relation is being opened for update or
                   1277: for retrieval only.
                   1278: .ph
                   1279: b)  get(descriptor, tid, limit_tid, tuple, next_flag)
                   1280: .ph
                   1281: This function retrieves into 'tuple' a single tuple from the relation indicated by 'descriptor'.
                   1282: 'tid' and 'limit_tid' are tuple-identifiers.
                   1283: There are two modes of retrieval, "scan" and "direct".
                   1284: In "scan" mode "get" is intended to be called successively to 
                   1285: retrieve all tuples within a range of tuple-id's. 
                   1286: An initial value of 'tid' sets the low end of the range desired
                   1287: and 'limit_tid' sets the high end.
                   1288: Each time "get" is called with 'next_flag' = TRUE, the 
                   1289: tuple following
                   1290: 'tid' is retrieved and its tuple-id placed into 'tid' in readiness
                   1291: for the next call.
                   1292: Reaching 'limit_tid' is indicated by a special return code.
                   1293: The initial setting of 'tid' and 'limit_tid' is done
                   1294: by the "find" function.
                   1295: In "direct" mode ('next_flag' = FALSE) the function retrieves
                   1296: the tuple with tuple-id = 'tid'.
                   1297: .ph
                   1298: c)  find(descriptor, key, tid, match_mode)
                   1299: .ph
                   1300: "Find" places in 'tid' the tuple-id at the low or high end
                   1301: of the range of tuples which match the key value supplied.
                   1302: The matching condition to be applied depends on 'match-mode'.
                   1303: .ph
                   1304: If the relation does not have a keyed storage structure
                   1305: or if the key supplied does not correspond to the correct key domains,
                   1306: the 'tid' returned will be as if no key were supplied.
                   1307: The objective of "find" is to restrict the scan of a relation
                   1308: by eliminating from consideration those tuples known from
                   1309: their placement in the relation not to satisfy the
                   1310: matching condition with the key.
                   1311: Calls to "find" occur in pairs, one to set the low end of a scan,
                   1312: the other for the high end,
                   1313: and the two tuple-id's obtained are used in subsequent calls on "get".
                   1314: .ph
                   1315: Two functions are available for determining the access characteristics
                   1316: of the storage structure
                   1317: of a primary data relation or secondary index,
                   1318: respectively.
                   1319: .ph
                   1320: d)  paramd(descriptor, access_characteristics_structure)
                   1321: .br
                   1322: e)  parami(descriptor, access_characteristics_structure)
                   1323: .ph
                   1324: These functions fill in the 'access_characteristics_structure' with 
                   1325: information regarding the type of key which may be constructed
                   1326: to optimize access to the given relation.
                   1327: This includes whether exact key values or ranges of key values can be used,
                   1328: and whether a partially specified key may be used.
                   1329: This will determine the 'match-mode' used in a subsequent call to "find".
                   1330: The ordering of domains in the key is also indicated.
                   1331: These functions relieve optimization routines executed
                   1332: during the processing of an interaction
                   1333: of the need to know directly about specific storage structures.
                   1334: .ph
                   1335: Other AMI functions provide a facility for updating relations.
                   1336: .ph
                   1337: f)  insert(descriptor, tuple)
                   1338: .ph
                   1339: The tuple is added to the relation in  its "proper" place
                   1340: according to its key value and the storage mode of the relation.
                   1341: .ph
                   1342: g)  replace(descriptor, tid, new_tuple)
                   1343: .br
                   1344: h)  delete(descriptor, tid)
                   1345: .ph
                   1346: The tuple indicated by 'tid' is either replaced by new values
                   1347: or deleted from the relation altogether.
                   1348: The tuple-id of the affected tuple will have been obtained by a previous
                   1349: "get".
                   1350: .ph
                   1351: Finally, when all access to a relation is complete it must be closed:
                   1352: .ph
                   1353: i)  closer(descriptor)
                   1354: .ph
                   1355: This closes the relation's UNIX file and rewrites the information
                   1356: in the descriptor back into the system catalogs if there has been
                   1357: any change.
                   1358: .se
                   1359: 3.4 STORAGE STRUCTURES AVAILABLE.
                   1360: .ph
                   1361: We will now describe the five storage structures currently available in INGRES.
                   1362: Four of the schemes are keyed, i.e., the storage location of a tuple within the
                   1363: file is a function of the value of the tuple's key domains.
                   1364: These schemes allow rapid access to specific portions of a relation
                   1365: when key values are supplied. The remaining non-keyed scheme stores tuples
                   1366: in the file independently of their values
                   1367: and provides a low-overhead storage structure, especially attractive when
                   1368: the entire relation must be accessed anyway.
                   1369: .ph
                   1370: The non-keyed storage structure in INGRES is a randomly ordered sequential
                   1371: file.
                   1372: Fixed-length tuples are
                   1373: simply placed sequentially in the file in the order supplied.
                   1374: New tuples added to the relation are merely appended to the end
                   1375: of the file. 
                   1376: The unique tuple-identifier for each tuple is its byte-offset within
                   1377: the file.
                   1378: This mode is intended mainly for 
                   1379: .ph
                   1380: a) very small relations, for which the overhead of other schemes
                   1381: is unwarranted
                   1382: .ph
                   1383: b) transitional storage of data being moved into or out of the system by COPY;
                   1384: .ph
                   1385: c) certain temporary relations created as intermediate results during query
                   1386: processing.
                   1387: .be
                   1388: In the remaining schemes, the key-value of a tuple determines
                   1389: the page of the file on which the tuple will
                   1390: be placed. 
                   1391: The schemes share a common "page-structure" for managing
                   1392: tuples on file pages as shown in Figure 4.
                   1393: .ne 15
                   1394: 
                   1395: 
                   1396: 
                   1397: 
                   1398: 
                   1399: 
                   1400: 
                   1401: 
                   1402: 
                   1403: 
                   1404: 
                   1405: 
                   1406: 
                   1407: 
                   1408: 
                   1409: .ce 2
                   1410: Page Layout for Keyed Storage Structures
                   1411: Figure 4
                   1412: .ph
                   1413: A tuple must fit entirely on a single page.
                   1414: Its unique identifier (TID) consists of a page number
                   1415: (the ordering of its page in the UNIX file)
                   1416: plus a "line
                   1417: number" indicating its position on the page.
                   1418: A "line table", which grows upwards from the bottom of the page,
                   1419: contains as an entry for each tuple, a pointer to the beginning of
                   1420: the tuple.
                   1421: In this way a page can be reorganized without affecting TID's.
                   1422: .ph
                   1423: Initially the file will contain all its tuples on a number of 
                   1424: "primary" pages.
                   1425: If the relation grows and these pages fill, "overflow"
                   1426: pages are allocated and chained by pointers to the primary pages
                   1427: with which they are associated.
                   1428: Within a chained group of pages no
                   1429: special ordering of tuples is maintained.
                   1430: Thus in a keyed access which locates a particular primary page,
                   1431: tuples matching the key may be on any page in the chain.
                   1432: .ph
                   1433: As discussed in [HELD75b] two modes of key-to-address transformation
                   1434: are used -- randomizing and order preserving.
                   1435: In a "hash" file tuples are distributed randomly throughout
                   1436: the primary pages of the file according to a hashing function
                   1437: on a key.
                   1438: This mode is well suited for situations in which access
                   1439: is to be conditioned on a specific
                   1440: key value.
                   1441: .ph
                   1442: As an order-preserving mode, a scheme similar to IBM's ISAM [IBM66]
                   1443: is used.
                   1444: The relation is sorted to produce the ordering on a particular key.
                   1445: A multi-level directory is created which records the high key on each
                   1446: primary page.
                   1447: The directory, which is static, resides on several pages within the file itself,
                   1448: following the primary pages.
                   1449: A primary page and its overflow pages are not maintained in sort order.  This
                   1450: decision is discussed in the section on concurrency.
                   1451: The "ISAM-like" mode is useful in cases where the key value is likely
                   1452: to be specified as falling within a range of values, since
                   1453: a near ordering of the keys is preserved.
                   1454: The index compression scheme discussed in [HELD75b] is
                   1455: currently under implementation.
                   1456: .ph
                   1457: In the above mentioned keyed modes, fixed length tuples are stored.
                   1458: In addition, both schemes can be used in conjunction with data compression
                   1459: techniques [GOTT75] in cases
                   1460: where increased storage utilization outweighs the added cost of
                   1461: encoding and decoding data during access. 
                   1462: These modes are known as "compressed hash" and "compressed ISAM".
                   1463: 
                   1464: The current compression scheme suppresses blanks and portions of a 
                   1465: tuple which match the preceding tuple.  This compression is applied 
                   1466: to each page independently.  Other schemes are being experimented with.
                   1467: .ph
                   1468: 
                   1469: 3.5  ADDITION OF NEW ACCESS METHODS
                   1470: 
                   1471: .ph
                   1472: One of the goals of the AMI design was to insulate
                   1473: higher level software from the actual functioning
                   1474: of the access methods and thereby to make it
                   1475: easy to add different ones.
                   1476: .ph
                   1477: In order to add a new access method one need
                   1478: only extend the AMI routines to handle the new case.
                   1479: If the new method uses the same page layout and TID scheme,
                   1480: only find, parami, and paramd need to be extended.
                   1481: Otherwise new procedures to perform these functions must be
                   1482: supplied for use by get, insert, replace and delete.
                   1483: .bp
                   1484: .fo 'DESIGN OF INGRES (IV)'-%-'12/5/75'
                   1485: 4  THE STRUCTURE OF PROCESS 2
                   1486: 
                   1487: Process 2 contains code to perform four main functions
                   1488: .br
                   1489: a)  a lexical analyzer
                   1490: .br
                   1491: b)  a parser (written in YACC [JOHN74])
                   1492: .br
                   1493: c)  query modification routines to support
                   1494: protection, views and integrity control
                   1495: .br
                   1496: d)  concurrency control
                   1497: .in 0
                   1498: 
                   1499: These are discussed in turn
                   1500: 
                   1501: 4.1  LEXICAL ANALYSIS AND PARSING
                   1502: .ph
                   1503: The lexical analysis and parsing phases of
                   1504: INGRES have been organized around the
                   1505: YACC translator writing system
                   1506: available in UNIX [JOHN74].  YACC takes
                   1507: as input a description of a grammar consisting
                   1508: of BNF-like parsing rules (productions) and
                   1509: precedence rules,
                   1510: plus action statements associated with
                   1511: each production.
                   1512: It produces a set of tables to be
                   1513: interpreted by a parse table interpreter which is combined
                   1514: with locally-supplied lexical analysis
                   1515: and parsing action routines to produce a complete translator.
                   1516: .ph
                   1517: The interpreter uses a bottom-up
                   1518: LR(1) parsing approach.
                   1519: The lexical analyzer is called to obtain successive symbols
                   1520: from the input stream as the interpreter attempts to
                   1521: match the input with productions in the grammar.
                   1522: When a production is matched YACC performs a reduction and
                   1523: executes the action statement associated with
                   1524: the production.  YACC has a mechanism for recovering
                   1525: from errors to continue parsing input in its
                   1526: entirety.
                   1527: .ph
                   1528: While the YACC parse table interpreter checks
                   1529: the syntactic correctness of the input
                   1530: commands, the action statements check for sementic
                   1531: consistency and correctness and prepare the commands for
                   1532: further processing.  The system
                   1533: catalogs are used to check that relation and
                   1534: domain names, formats, and so on, are
                   1535: specified appropriately.  
                   1536: .ph
                   1537: For utility commands a command indicator
                   1538: and the parameters for the
                   1539: command are sent directly to process 3
                   1540: for transmission to process 4.  Section 6 discusses these
                   1541: commands and their implementation.
                   1542: .ph
                   1543: For QUEL commands, the input is translated to a tree-structured
                   1544: internal form which will be used in the
                   1545: remaining analysis and processing.  
                   1546: Moreover, the qualification part
                   1547: is converted to conjunctive normal form.
                   1548: The parse tree is now ready to undergo
                   1549: what has been termed "query-modification," to be described in
                   1550: Section 4.2 and 4.3.
                   1551: .br
                   1552: 
                   1553: 4.2  INTEGRITY
                   1554: .ph
                   1555: Query modification includes adding integrity and
                   1556: protection predicates to the original query
                   1557: and changing references to virtual relations into references
                   1558: to the appropriate physical relations.
                   1559: At the present time only a simple integrity
                   1560: scheme has been implemented.
                   1561: .ph
                   1562: In [STON75] algorithms of
                   1563: several levels of complexity are presented for
                   1564: performing integrity control on updates.
                   1565: In the present system only the simplest case,
                   1566: involving single-variable, aggregate-free
                   1567: integrity assertions, has been implemented and is
                   1568: described in detail in [SCHO75].
                   1569: .ph
                   1570: Briefly, integrity assertions are entered
                   1571: in the form of QUEL qualification
                   1572: clauses to be applied to interactions updating
                   1573: the relation over which the variable in the
                   1574: assertion ranges.  A parse
                   1575: tree is created for the qualification and a representation of this
                   1576: tree stored in the INTEGRITY catalog together with
                   1577: an indication of the relation and specific domains
                   1578: involved.  At query
                   1579: modification time, updates are checked
                   1580: for any possible integrity assertions
                   1581: on the affected domains.  Relevant assertions are
                   1582: retrieved, re-built into tree
                   1583: form and grafted onto the update tree
                   1584: so as to AND the assertions with the existing
                   1585: qualification of the interaction.
                   1586: .ph
                   1587: 4.3  PROTECTION AND VIEWS
                   1588: .ph
                   1589: Algorithms for the support of views are also given
                   1590: in [STON75].  Basically a view is any relation
                   1591: which could be created from existing relations by
                   1592: the use of a RETRIEVE command.  Such view definitions
                   1593: will be treated in a manner
                   1594: somewhat analogous to that used for integrity
                   1595: control.  They will be allowed in INGRES
                   1596: to support EQUEL programs written for obsolete
                   1597: versions of the data base and for user
                   1598: convenience.
                   1599: .ph
                   1600: Protection will be handled according to the
                   1601: algorithm described in [STON74b].  Like integrity control
                   1602: this algorithm involves adding
                   1603: qualifications to the user's interaction.
                   1604: In the remainder of this section we distinguish this protection
                   1605: scheme from the one in [CHAM75] and indicate the
                   1606: rationale behind its use.
                   1607: .ph
                   1608: Consider the following two views:
                   1609: 
                   1610: .in 10
                   1611: .ss
                   1612: RANGE OF E IS EMPLOYEE
                   1613: .br
                   1614: DEFINE RESTRICTION-1 (E.NAME, E. SALARY, E.AGE)
                   1615: .br
                   1616: WHERE E.DEPT = "toy"
                   1617: 
                   1618: DEFINE RESTRICTION-2 (E.NAME, E.DEPT, E.SALARY)
                   1619: .br
                   1620: WHERE E.AGE < 50
                   1621: .ds
                   1622: 
                   1623: .in 0
                   1624: and the following two access control statements:
                   1625: 
                   1626: .in 10
                   1627: .ss
                   1628: RANGE OF E IS EMPLOYEE
                   1629: .br
                   1630: PROTECT (E.NAME, E.SALARY, E.AGE) 
                   1631: .br
                   1632: WHERE E.DEPT = "toy"
                   1633: 
                   1634: PROTECT (E.NAME, E.SALARY, E.DEPT) 
                   1635: .br
                   1636: WHERE E.AGE < 50
                   1637: .ds
                   1638: 
                   1639: .in 0
                   1640: .ph
                   1641: Acess control could be based on
                   1642: views as suggested in [CHAM75]
                   1643: and a given user could be authorized to use
                   1644: views RESTRICTION-1 and RESTRICTION-2. 
                   1645: To find
                   1646: the salary of Harding he could interrogate RESTRICTION-1 as
                   1647: follows:
                   1648: 
                   1649: .in 10
                   1650: .ss
                   1651: RANGE OF R IS RESTRICTION-1
                   1652: .br
                   1653: RETRIEVE (R.SALARY) WHERE
                   1654: .br
                   1655: R.NAME = "Harding"
                   1656: .ds
                   1657: .in 0
                   1658: 
                   1659: .ph
                   1660: Failing to find Harding in RESTRICTION-1 he would
                   1661: have to then interrogate RESTRICTION-2.  After two queries
                   1662: be would be returned the appropriate salary if
                   1663: Harding was under 50 or in the toy department.
                   1664: .ph
                   1665: Under the INGRES scheme the user can issue
                   1666: .in 10
                   1667: 
                   1668: .ss
                   1669: RANGE OF E IS EMPLOYEE
                   1670: .br
                   1671: RETRIEVE (E.SALARY) WHERE
                   1672: .br
                   1673: E.NAME = "Harding"
                   1674: .ds
                   1675: .in 0
                   1676: .ph
                   1677: which will be modified by the access control
                   1678: algorithm to
                   1679: .in 10
                   1680: 
                   1681: .ss
                   1682: RANGE OF E IS EMPLOYEE
                   1683: .br
                   1684: RETRIEVE (E.SALARY) WHERE
                   1685: .br
                   1686: E.NAME = "Brown"
                   1687:     AND
                   1688: .br
                   1689: (E.AGE < 50 OR E.DEPT = "toy")
                   1690: .ds
                   1691: .in 0
                   1692: .ph
                   1693: In this way the user need not manually sequence through
                   1694: his views to obtain desired data but automatically
                   1695: obtains such data if permitted.  Note
                   1696: clearly that the portion of EMPLOYEE to which the
                   1697: user has access (the union of RESTRICTION-1 and RESTRICTION-2) is
                   1698: not a relation and hence
                   1699: cannot be defined as a single view.
                   1700: 
                   1701: To summarize, access control restrictions are handled automatically by the
                   1702: INGRES algorithms but must be dealt with by a
                   1703: user sequencing through his views in a "view
                   1704: oriented" access control scheme.
                   1705: 
                   1706: 4.4  CONCURRENCY CONTROL
                   1707: 
                   1708: In any multiuser system provisions must be included
                   1709: to ensure that multiple concurrent updates are
                   1710: executed in a manner such that some level
                   1711: of data integrity can be guaranteed.  The
                   1712: following two (somewhat facetious) updates illustrate the problem.
                   1713: 
                   1714: .nf
                   1715: .ss
                   1716: 
                   1717:                RANGE OF E is EMPLOYEE
                   1718:        U1      REPLACE E(DEPT = "toy") WHERE
                   1719:                        E.DEPT = "candy"
                   1720: 
                   1721:                RANGE OF F is EMPLOYEE
                   1722:        U2      REPLACE F(DEPT = "candy") WHERE
                   1723:                        F.DEPT = "toy"
                   1724: 
                   1725: .ds
                   1726: .fi
                   1727:        If U1 and U2 are executed concurrently with
                   1728: no controls, some employees may end up in
                   1729: each department and the particular result
                   1730: may not be repeatable if the data base is
                   1731: backed up and the interactions reexecuted.
                   1732: 
                   1733:        The control which must be provided is to guarantee
                   1734: that some data base operation is "atomic" (i.e. occurs
                   1735: in such a fashion that it appears instantaneous and before
                   1736: or after any other data base operation).
                   1737: This atomic unit will be called a transaction.
                   1738: .ph
                   1739: In INGRES there are three basic choices available for
                   1740: defining a transaction.
                   1741: 
                   1742: .br
                   1743: a)  something smaller than one INGRES command
                   1744: .br
                   1745: b)  one INGRES command
                   1746: .br
                   1747: c)  a collection of INGRES commands.
                   1748: .in 0
                   1749: .br
                   1750: 
                   1751: If a) is chosen INGRES could not guarantee that two concurrently executing update
                   1752: commands gave the same result as if they were executed sequentially (in either order)
                   1753: in one collection of INGRES processes.
                   1754: In fact the outcome could fail to be repeatable,
                   1755: as noted in the example above.
                   1756: This situation is clearly undesirable.
                   1757: 
                   1758: Option c) is in the opinion of the INGRES designers
                   1759: impossible to support.
                   1760: The following transaction could be declared in an EQUEL program.
                   1761: 
                   1762: .in 5
                   1763: .br
                   1764: .ss
                   1765: BEGIN TRANSACTION
                   1766: .in 10
                   1767: FIRST QUEL UPDATE
                   1768: .br
                   1769: SYSTEM CALLS TO CREATE AND DESTROY FILES
                   1770: .br
                   1771: SYSTEM CALLS TO FORK A SECOND COLLECTION OF INGRES PROCESSES TO WHICH
                   1772: COMMANDS ARE PASSED
                   1773: .br
                   1774: SYSTEM CALLS TO READ FROM A TERMINAL
                   1775: .br
                   1776: SYSTEM CALLS TO READ FROM A TAPE
                   1777: .br
                   1778: SECOND QUEL UPDATE (whose form depends on previous two
                   1779: system calls)
                   1780: .br
                   1781: .in 5
                   1782: END TRANSACTION
                   1783: .ds
                   1784: 
                   1785: .in 0
                   1786: Suppose T1 is the above transaction and runs concurrently with
                   1787: a transaction T2 involving commands of the same
                   1788: form.  The second update of each transaction may well "conflict" with the first
                   1789: update of the other.
                   1790: Note that there is no way to tell apriori that T1 and T2 conflict
                   1791: because the form of the second update is
                   1792: not known in advance.  Hence a deadlock situation
                   1793: can arise which can
                   1794: only be resolved by aborting one
                   1795: transaction (an undesirable policy in the eyes of the INGRES designers)
                   1796: or attempting to back out one transaction.
                   1797: The overhead of backing out through the intermediate system
                   1798: calls appears prohibitive (if it is possible at all).
                   1799: Restricting a transaction to have no system calls (and hence
                   1800: no I/0) cripples the power of a transaction in order to make deadlock resolution
                   1801: possible and was judged undesirable.
                   1802: Thus option b) was chosen.
                   1803: 
                   1804: The implementation of b) can be achieved by physical
                   1805: locks on data items, pages, tuples, domains, relations, etc.
                   1806: [GRAY75] or by predicate locks [STON74c].
                   1807: The current implementation is by relatively crude physical
                   1808: locks (on domains of a relation) and avoids deadlock by not
                   1809: allowing an interaction to proceed to process 3 until it can lock all
                   1810: required resources.
                   1811: Because of a problem with the current design of certain access
                   1812: method calls, all domains
                   1813: of a relation must currently be locked (i.e. a whole relation is locked)
                   1814: to perform an update.  This situation will soon be rectified.
                   1815: 
                   1816: The choice of avoiding deadlock rather than
                   1817: detecting and resolving it is made primarily for implementation
                   1818: simplicity.  The choice of a crude locking unit
                   1819: reflects a minicomputer environment where core storage for
                   1820: a large lock lable is not available.  In the future we
                   1821: plan to experimentally implement a crude and (thereby low CPU overhead)
                   1822: version of a predicate locking scheme previously described in [STON74c].
                   1823: Such an approach may provide considerable concurrency at an
                   1824: acceptable overhead in lock table space and CPU time,
                   1825: although such a statement is highly speculative.
                   1826: 
                   1827: Once the concurrency processor has
                   1828: assembled locks on all domains needed by an interaction, it may
                   1829: proceed to process 3 for unemcumbered execution.
                   1830: 
                   1831: To conclude this section we briefly indicate the reasoning behind
                   1832: not sorting a page and its overflow pages in the "ISAM-like" access method.  This 
                   1833: topic is also discussed in [HELD75c].
                   1834: 
                   1835: Basically, maintenance of the sort order of these pages may require the access
                   1836: method to lock more than one page when it inserts a tuple.
                   1837: Clearly deadlock might be possible given concurrent updates and 
                   1838: locks for physical pages would be required 
                   1839: (at least once
                   1840: a more sophisticated predicate locking scheme is tried such as [STON74c]).
                   1841: To avoid both problems these pages remain unsorted and
                   1842: the access method need only be able to read-modify-write a single
                   1843: page as an atomic operation.
                   1844: Although such an atomic operation is not currently in UNIX
                   1845: (and not needed by the current primitive scheme) it is a minor
                   1846: addition.
                   1847: 
                   1848: 
                   1849: .bp
                   1850: .fo 'DESIGN OF INGRES (V)'-%-'12/12/75'
                   1851: 5  PROCESS 3
                   1852: 
                   1853: .ph
                   1854: As noted in Section 2 this process performs the following two
                   1855: functions which will be discussed in turn:
                   1856: .ph
                   1857: a)  the DECOMPOSITION of queries involving more than one variable
                   1858: into sequences of one-variable queries.  Partial results are
                   1859: accumulated until the entire query is evaluated.  This program is called DECOMP.
                   1860: It also turns any updates into the appropriate queries to isolate 
                   1861: qualifying tuples and spools new values into a special file 
                   1862: for deferred update.
                   1863: .ph
                   1864: b)  the processing of a single variable queries.
                   1865: The program is called
                   1866: the one variable query processor (OVQP).
                   1867: 
                   1868: .ph
                   1869: 5.1  DECOMP
                   1870: 
                   1871: Because INGRES allows interactions which are defined on the
                   1872: cross product of perhaps several relations, efficient execution of
                   1873: this step is of crucial importance in order to search as small a
                   1874: portion of the appropriate cross product space as
                   1875: possible.
                   1876: DECOMP uses three techniques in processing
                   1877: interactions.  We describe each technique then
                   1878: give the actual algorithm implemented.  Finally, we
                   1879: indicate the role of a more sophisticated decomposition scheme under
                   1880: design.
                   1881: .ph
                   1882: a)  Tuple substitution
                   1883: 
                   1884: The basic technique used by DECOMP to reduce a query
                   1885: to fewer variables is tuple substitution.
                   1886: One variable (out of possibly many) in the
                   1887: query is selected for substitution.  The AMI
                   1888: language is used to scan the relation
                   1889: associated with the variable one tuple at a time.
                   1890: For each tuple, the values of domains in that relation
                   1891: are substituted into the query.
                   1892: In the resulting modified query, all
                   1893: previous references to the substituted variable have now
                   1894: been replaced by values (constants), and the query has
                   1895: thus been reduced to one less variable.  Precomposition
                   1896: is repeated (recursively) on the modified query until
                   1897: only one variable remains, at which point the OVQP is called to continue processing.
                   1898: .ph
                   1899: b)  One-Variable Detachment
                   1900: 
                   1901: If the qualification Q of the query is of the form
                   1902: .br
                   1903: .ce
                   1904: Q1(V1)  AND  Q2(V1,...,Vn)
                   1905: .br
                   1906: for some
                   1907: tuple variable V1, the following two steps can be
                   1908: executed:
                   1909: 
                   1910: 1)  Issue the query
                   1911: .in 5
                   1912: .ss
                   1913: 
                   1914: RETRIEVE INTO W (TL[V1])
                   1915: .br
                   1916: WHERE Q1[V1]
                   1917: .ds
                   1918: .in 0
                   1919: .ph
                   1920: Here TL[V1] are those domains required in the remainder
                   1921: of the query.  Note that this is a one variable query and may
                   1922: be passed directly to OVQP.
                   1923: 
                   1924: 2)  Replace R1, the relation over which V1 ranges,
                   1925: by W in the range declaration and delete
                   1926: Q1[V1] from Q.
                   1927: 
                   1928: .ph
                   1929: The query formed in 1) is called a "one-variable,
                   1930: detachable sub-query" (OVDSQ) and the technique
                   1931: for forming and executing it "one-variable detachment" (OVD).
                   1932: This step has the effect of reducing the size
                   1933: of the relation over which V1 ranges
                   1934: by restriction and projection.
                   1935: Hence, it may reduce the complexity of the
                   1936: processing to follow.
                   1937: .ph
                   1938: Moreover, the opportunity exists in the process
                   1939: of creating new relations through OVD, to choose storage structures
                   1940: (and particularly keys) which will prove
                   1941: helpful in further processing.
                   1942: .ph
                   1943: 
                   1944: c)  Reformatting
                   1945: 
                   1946: When a tuple variable is selected for substitution, a large
                   1947: number of queries each with one less variable will
                   1948: be executed.  If b) is a possible operation after
                   1949: the substitution for some 
                   1950: remaining variable, V1, then the relation over which
                   1951: V1 ranges, R1, can be reformatted to have domains
                   1952: used in Q1(V1) as a key.  This will expedite
                   1953: b) each time it is executed during tuple substitution.
                   1954: 
                   1955: We can now state the complete decomposition algorithm.
                   1956: .ph
                   1957: a)  If number of variables
                   1958: in query is 0 or 1 call OVQP and  stop; else go on.
                   1959: 
                   1960: b)  Find all variables,
                   1961: {V1,...Vn},
                   1962: for which the query contains a 
                   1963: one-variable clause.  Perform
                   1964: OVD to create new ranges for each of these
                   1965: variables.
                   1966: The new relation for each variable, Vi, is stored
                   1967: as a hash file with key, Ki, chosen as follows.
                   1968: .bb
                   1969: 1) For each j select from the remaining multi-variable clauses in the query
                   1970: the collection, C(ij), which have the form
                   1971: .ti +8
                   1972: Vi.di = Vj.dj  
                   1973: .br
                   1974: where di,dj are domains of Vi and Vj.
                   1975: .br
                   1976: 2)  Form the key Ki to be the concatenation
                   1977: of domains di1,di2,...of Vi appearing in clauses
                   1978: in C(ij).
                   1979: .br
                   1980: 3) If more than one j exists, for which C(ij) is non empty, one C(ij)
                   1981: is chosen arbitrarily for forming the key.
                   1982: If C(ij) is empty for all j, 
                   1983: the relation is stored as an unsorted table.
                   1984: .ph
                   1985: .be
                   1986: c)  Choose the variable, Vs, with the smallest number of tuples
                   1987: as the next one for which to perform
                   1988: tuple substitution.
                   1989: .ph
                   1990: d)  For each tuple variable Vj for which C(js)
                   1991: is non null, reformat the storage structure
                   1992: of the relation Rj which it ranges over, if necessary,
                   1993: so that the key of Rj is the concatenation of
                   1994: domains dj1,... appearing
                   1995: in C(js). 
                   1996: This ensures that when the clauses in C(js) become one-variable
                   1997: after substituting for Vs, subsequent calls to OVQP to restrict
                   1998: further the range of Vj will be done as efficiently as possible.
                   1999: .ph
                   2000: e)  Perform the following two steps for all
                   2001: tuples in the range of the variable selected in (c):
                   2002: .bb
                   2003: 1) substitute values from tuple into query.
                   2004: .br
                   2005: 2) call decomposition algorithm recursively on a
                   2006: copy of resulting query which now has been reduced by one variable.
                   2007: .ph
                   2008: .be
                   2009: The following comments on the algorithm
                   2010: are appropriate:
                   2011: .ph
                   2012: a)  OVD is almost always assured of speeding
                   2013: processing.  Not only is it possible to
                   2014: wisely choose the storage structure
                   2015: of a temporary relation but also the
                   2016: cardinality of this relation
                   2017: may be much less than the one it
                   2018: replaces as the range for a tuple variable.
                   2019: It only fails if little or no
                   2020: reduction takes place and
                   2021: reformating is unproductive.
                   2022: 
                   2023: It should be noted that a temporary relation is created rather 
                   2024: than a list of qualifying tuple-id's.  The basic tradeoff 
                   2025: is that OVD must copy qualifying tuples but can remove duplicates 
                   2026: created during the projection.  Storing tuple-id's avoids the copy 
                   2027: operation at the expense of reaccessing qualifying tuples and 
                   2028: retaining duplicates.  It is clear that cases exist where each 
                   2029: strategy is superior.  The INGRES designers have chosen OVD because it 
                   2030: does not appear to offer worse performance than the alternative, 
                   2031: allows a more accurate choice of the variable with the smallest 
                   2032: range in step c) above and results in cleaner code.
                   2033: 
                   2034: b)  Tuple substitution is done when necessary on the
                   2035: variable associated with the smallest
                   2036: number of tuples.  This has the effect of
                   2037: reducing the number of eventual calls on OVQP.
                   2038: 
                   2039: c)  Reformatting is done (if necessary)
                   2040: with the knowledge that it will replace a collection
                   2041: of complete sequential scans of a relation by a 
                   2042: collection of limited scans.  This
                   2043: will almost always reduce processing time.
                   2044: 
                   2045: d)  It is believed that this algorithm
                   2046: efficiently handles a large class of
                   2047: interactions.  Moreover, the algorithm does not
                   2048: require excessive CPU overhead to perform.  There are,
                   2049: however, cases where a more elaborate algorithm is 
                   2050: needed.  The following comment applies to these cases.
                   2051: 
                   2052: e)  Suppose that we have two or more strategies ST0, ST1, ... ,STn,
                   2053: each one being better than the previous one but also requiring
                   2054: a greater overhead.
                   2055: Suppose we begin an interaction on ST0 and run it for an amount
                   2056: of time equal to a fraction of the
                   2057: estimated overhead of ST1.
                   2058: At the end of that time, by simply counting
                   2059: the number of tuples of the first
                   2060: substitution variable which
                   2061: have already been processed, we can get an
                   2062: estimate for the total
                   2063: processing time using ST0. If this is significantly
                   2064: greater than the overhead of ST1, then we switch to ST1.
                   2065: Otherwise we stay and complete processing the interaction
                   2066: using ST0.  Obviously, the procedure can be
                   2067: repeated on ST1 to call ST2 if necessary,
                   2068: and so forth.  
                   2069: 
                   2070: The algorithm detailed in this section is ST0.  A more sophisticated algorithm
                   2071: ST1 is currently under development and is discussed in [WONG76].
                   2072: .se
                   2073: 5.2 ONE VARIABLE QUERY PROCESSOR (OVQP)
                   2074: .ph
                   2075: This program is concerned solely with the
                   2076: efficient accessing of tuples from a single
                   2077: relation given a particular one-variable query. 
                   2078: The initial portion of this program, known
                   2079: as STRATEGY, determines what key (if any) may
                   2080: be profitably used
                   2081: to access the relation, what the value(s) of that
                   2082: key will be used in calls to the AMI routine "find", and
                   2083: whether the access
                   2084: may be accomplished
                   2085: directly through the AMI to the storage structure of the
                   2086: relation itself or if a secondary index on the relation should be used.
                   2087: If access is to be through a secondary index then STRATEGY must
                   2088: choose which ONE of possibly many indices to use.
                   2089: .ph
                   2090: Then, the tuples retrieved
                   2091: according to the access strategy selected are processed
                   2092: by the SCAN
                   2093: portion of OVQP.  This program
                   2094: evaluates each tuple against the
                   2095: qualification part of the query, creates
                   2096: target list values for qualifying tuples, and
                   2097: disposes of the target list appropriately.
                   2098: .ph
                   2099: Since SCAN is relatively straightforward, we
                   2100: discuss only the policy decisions made in
                   2101: STRATEGY.
                   2102: .ph
                   2103: First STRATEGY examines the qualification
                   2104: for clauses which specify the value of a domain,
                   2105: i.e. clauses of the form:
                   2106: 
                   2107:        V.domain op constant
                   2108: .br
                   2109: where "op" is one of {=, <, >, <=, >=}.
                   2110: Such clauses are termed "simple" clauses and are
                   2111: organized into a list
                   2112: .ph
                   2113: Obviously a non-simple clause may be equivalent to a simple one.
                   2114: .br
                   2115: For example
                   2116:        E.SALARY/2 = 10000 is equivalent to E.SALARY = 20000.
                   2117: .br
                   2118: However, recognizing and converting such clauses
                   2119: requires a general algebraic symbol
                   2120: manipulator.  This issue has been
                   2121: avoided by ignoring all non-simple
                   2122: clauses.  STRATEGY must now select one of
                   2123: two accessing strategies:
                   2124: .br
                   2125: a)  issuing two AMI find commands on the primary
                   2126: relation followed by a sequential scan of the
                   2127: relation between the limits specified
                   2128: .br
                   2129: b)  issuing two AMI find commands on some index relation
                   2130: followed by a sequential scan of the index
                   2131: between the limits specified.  For each
                   2132: tuple retrieved the "pointer" domain
                   2133: is obtained and is a tuple-id of a tuple in
                   2134: the primary relation.  This tuple
                   2135: is fetched and examined.
                   2136: .ph
                   2137: Key information about the primary 
                   2138: relation is obtained using
                   2139: the AMI function "paramd".
                   2140: Names of indices are obtained from the
                   2141: index catalog and keying information about indices in obtained with
                   2142: the function "parami".
                   2143: .ph
                   2144: STRATEGY now checks if a simple
                   2145: clause is available to limit the scan of
                   2146: the primary relation or an index relation.
                   2147: If a relation is hashed the simple
                   2148: clause must specify equality as the
                   2149: operator in order to be
                   2150: useful.  ISAM structures on the other hand
                   2151: allow ranges of values and less than
                   2152: all keys may be specified as long as
                   2153: the first one is present for structures
                   2154: with combined keys.
                   2155: .ph
                   2156: STRATEGY checks for such a simple clause for a relation in the following
                   2157: order:
                   2158: .bb
                   2159: a.  hashed primary relation
                   2160: .br
                   2161: b.  hashed index
                   2162: .br
                   2163: c.  ISAM primary relation
                   2164: .br
                   2165: d.  ISAM index
                   2166: .be
                   2167: .in 0
                   2168: The rationale for this ordering is related to the
                   2169: expected number of page accesses required to retrieve
                   2170: a tuple from the source relation in each case.
                   2171: .ph
                   2172: In case a) the key value provided locates a desired
                   2173: source tuple in one access,
                   2174: (ignoring overflows).  
                   2175: In case b) the
                   2176: key value locates an appropriate index relation
                   2177: tuple in one access but an additional access in required to retrieve the proper
                   2178: source tuple.  For the ISAM scheme, the
                   2179: directory must be examined.  The number of accesses
                   2180: incurred in this look-up is at least 2.
                   2181: Again, with an index, 
                   2182: an additional
                   2183: access is required, making the total at least 3
                   2184: in case d).
                   2185: .bp
                   2186: .fo 'DESIGN OF INGRES (VI)'-%-'12/5/75'
                   2187: 6  UTILITIES IN PROCESS 4
                   2188: .se
                   2189: 6.1  IMPLEMENTATION OF UTILITY COMMANDS
                   2190: .ph
                   2191: We have indicated in Section 1 several data base utilities
                   2192: available to users.  We will now briefly describe their 
                   2193: implementation and indicate their configuration within the
                   2194: INGRES system.
                   2195: .ph
                   2196: The commands are organized into several overlay programs.
                   2197: Since an overlay may have more than one
                   2198: entry point, it can contain
                   2199: more than one utility command.
                   2200: In fact, the utilities
                   2201: are grouped where possible to minimize overlaying.
                   2202: The overlays all contain a common main program known as
                   2203: the "controller", which reads pipe C and writes
                   2204: completion messages into pipe D.
                   2205: The processing of a utility command occurs as
                   2206: follows.
                   2207: .ph
                   2208: First, the parser recognizes a utility command in a user interaction.
                   2209: This name is looked up in the INGRES "process-table",
                   2210: which has an entry for each command name in the language.
                   2211: Each entry has an "overlay-id" and a "function-id".
                   2212: The first indicates the overlay program containing the
                   2213: command, and, the second indicates the proper entry point
                   2214: within that overlay.
                   2215: .ph
                   2216: These id's are passed down pipe B to process 3
                   2217: followed by the parameters of the command specified.
                   2218: Process 3 determines from the "overlay-id"
                   2219: if the command is a utility command intended for process 4.
                   2220: If so, the information is simply written on pipe C.
                   2221: .ph
                   2222: At this point, some overlay is occupying process 4, having remained
                   2223: from a previous command.  Its copy of the controller
                   2224: reads pipe C to obtain the overlay-id.
                   2225: If a different overlay from the present one is
                   2226: indicated, the controller overlays process 4 and
                   2227: control passes to the controller of the new overlay.
                   2228: .ph
                   2229: Most of the utilities update or read the system
                   2230: relations using AMI calls.  MODIFY contains a sort
                   2231: routine which puts tuples in collating sequence
                   2232: according to the concatenation of the
                   2233: desired keys (which need not
                   2234: be of the same data type).
                   2235: Pages are initially loaded to approximately 80%
                   2236: of capacity.  The sort routine
                   2237: is a recursive N-way merge-sort
                   2238: where N is maximum number of files process 4
                   2239: can have open at once (currently 8).
                   2240: The index building occurs in an obvious way.  To convert
                   2241: to hash structures MODIFY
                   2242: must specify the number of primary pages to be
                   2243: allocated.  This parameter is used by the AMI in its
                   2244: hash scheme (which is a standard modulo division method).
                   2245: This is done by a rule of thumb.
                   2246: .ph
                   2247: It should be noted that a user who
                   2248: creates an empty hash relation using the CREATE command and then
                   2249: copies a large UNIX file into it using COPY
                   2250: will create a very inefficient structure.  This is because
                   2251: a relatively small default number of primary pages will have
                   2252: been specified by CREATE and overflow chains
                   2253: will be long.  A better strategy is
                   2254: to COPY into an unsorted table so that
                   2255: MODIFY can subsequently make a good guess
                   2256: at the number of primary pages to allocate.
                   2257: .se
                   2258: 6.2  DEFERRED UPDATE AND RECOVERY
                   2259: 
                   2260: Any updates (APPEND, DELETE, REPLACE) are processed
                   2261: by writing the tuples to be added changed or modified into
                   2262: a temporary file.  When process 3 finishes,
                   2263: it calls process 4 to actually perform the modifications
                   2264: requested as a final step in processing.  Deferred update
                   2265: is done for four reasons.
                   2266: 
                   2267: a)  Secondary index considerations. Suppose the following
                   2268: QUEL statement is executed;
                   2269: .in 10
                   2270: .ss
                   2271: 
                   2272: RANGE OF E IS EMPLOYEE
                   2273: .br
                   2274: REPLACE E(SALARY = 1.1*E.SALARY)  WHERE
                   2275: .br
                   2276:      E.SALARY > 20000
                   2277: .ds
                   2278: .in 0
                   2279: 
                   2280: Suppose further that there is a secondary index on the
                   2281: salary domain and the primary relation is keyed on another domain.
                   2282: 
                   2283: OVQP in finding the employees who qualify for the raise
                   2284: will use the secondary index.  If one employee (say Smith qualifies and his
                   2285: tuple is modified and the secondary index updated) then the scan of the secondary
                   2286: index will find his tuple a second time (in fact an arbitrary number of times).
                   2287: Either secondary indexes cannot be used to identify qualifying
                   2288: tuples when range qualifications are present (a rather
                   2289: unnatural restriction) or secondary
                   2290: indices must be updated in deferred mode.
                   2291: 
                   2292: .ph
                   2293: b)  Primary relation considerations.  
                   2294: Suppose the following QUEL statement is executed
                   2295: 
                   2296: .in 10
                   2297: .ss
                   2298: RANGE OF E, M IS EMPLOYEE
                   2299: .br
                   2300: REPLACE E(SALARY = .9*E.SALARY)  WHERE
                   2301: .br
                   2302: .in 15
                   2303: E. MGR = M.NAME
                   2304: .br
                   2305: .in 17
                   2306: AND
                   2307: .br
                   2308: .in 15
                   2309: E.SALARY > M.SALARY
                   2310: .ds
                   2311: .in 0
                   2312: 
                   2313: for the following EMPLOYEE relation
                   2314: 
                   2315: .nf
                   2316:                NAME    SAL     MANAGE  
                   2317: .ss
                   2318:                Smith   10k      Jones     
                   2319:                Jones    8k                
                   2320:                Brown  9.5k      Smith     
                   2321: .fi
                   2322: .ds
                   2323: Logically Smith should get the pay cut but Brown should
                   2324: not.  However, if Smith's tuple is updated before Brown
                   2325: is checked for the pay cut, Brown will qualify.  This
                   2326: undesirable situation must be avoided by deferred update.
                   2327: 
                   2328: c)  Functionality of updates.
                   2329: Suppose the following QUEL statement is executed.
                   2330: 
                   2331: .in 10
                   2332: .ss
                   2333: RANGE of E,M is EMPLOYEE
                   2334: .br
                   2335: REPLACE E(SALARY = M.SALARY)
                   2336: .ds
                   2337: .in 0
                   2338: 
                   2339: This update attempts to assign to each employee
                   2340: the salary of every other employee, i.e.
                   2341: a single data item is to be replaced by multiple values.
                   2342: Stated differently, the REPLACE statement does not specify a function.
                   2343: This non-functionality
                   2344: can only be checked if deferred update is performed.
                   2345: .ph
                   2346: d)  Recovery is easier.  
                   2347: The deferred update file provides a log of updates
                   2348: to be made.  Recovery is provided upon system
                   2349: crash by the RESTORE command.
                   2350: In this case the deferred update routine is requested
                   2351: to destroy the temporary file if it has
                   2352: not yet started processing it.
                   2353: If it has begun processing, it reprocesses the
                   2354: entire update file which is done in such a way that the
                   2355: effect is the same as if it were processed exactly once from
                   2356: start to finish.
                   2357: 
                   2358: Hence the update is "backed out" if deferred updating has not yet
                   2359: begun; otherwise it is processed to conclusion.
                   2360: The software is designed so the update file can be optionally spooled onto tape and
                   2361: recovered from tape.  This added feature should soon be operational.
                   2362: 
                   2363: If a user from the terminal monitor (or a C-program)
                   2364: wishes to stop a command he can issue a "break" character.
                   2365: In this case all processes reset execept the deferred update
                   2366: program which recovers in the same manner as above.
                   2367: 
                   2368: All update commands do deferred update; however the INGRES
                   2369: utilities have not yet been modified to do likewise.
                   2370: When this is completed INGRES will
                   2371: recover from all crashes which leave the disk intact.
                   2372: In the meantime there can be disk-intact crashes which
                   2373: cannot be recovered in this manner (if they happen in such a
                   2374: way that the system catalogs are left inconsistent).
                   2375: 
                   2376: The INGRES "super-user" can checkpoint a data base(s)
                   2377: onto tape using the UNIX backup scheme.  Since INGRES
                   2378: logs all interactions, a consistent system can always be
                   2379: obtained (albeit slowly) by restoring the last checkpoint
                   2380: and running the log of interactions  (or the tape(s)
                   2381: of deferred updates if it exists).
                   2382: 
                   2383: 
                   2384: .bp
                   2385: .fo 'DESIGN OF INGRES (VII)'-%-'12/5/75'
                   2386: 7  CONCLUSION AND FUTURE EXTENSIONS
                   2387: 
                   2388: The system described herein is in use at 5
                   2389: installations and is being brought up at 8 others.
                   2390: It forms the basis of an
                   2391: accounting system, a system for managing student records,
                   2392: a geo-data system, a system for maintaining
                   2393: a wiring diagram for a large telelphone company and assorted
                   2394: other smaller applications.  These
                   2395: applications have been running for periods of
                   2396: up to nine months.
                   2397: 
                   2398: 7.1  PERFORMANCE
                   2399: 
                   2400: At this time no detailed performance measurements have
                   2401: been made.  However, on our system (an 11/40 mainframe with 80k words of
                   2402: core) about 4-6 simultaneous INGRES users can be supported with
                   2403: reasonable response time (assuming they are doing interactions which 
                   2404: affect a small number of tuples and for which a fast access path 
                   2405: exists.  Of course, a user has the ability to execute interactions 
                   2406: in INGRES which require hours to process).  This hardware configuration
                   2407: costs about
                   2408: $60,000.  Larger 11/70 installations should be
                   2409: able to run substantially more INGRES users.
                   2410: 
                   2411: The sizes of the processes in INGRES are indicated below.
                   2412: Since the access methods are loaded
                   2413: with processes 2 and 3 and with many of the utilities
                   2414: their contribution to the respective process sizes has been noted separately.
                   2415: .bb
                   2416: access methods (AM)            11 K (bytes)
                   2417: .br
                   2418: terminal monitor               10 K
                   2419: .br
                   2420: EQUEL                          30 K + AM
                   2421: .br
                   2422: process 2                      45 K + AM
                   2423: .br
                   2424: process 3 (query processor)    45 K + AM
                   2425: .br
                   2426: utilities (8 overlays)          160 K + AM
                   2427: 
                   2428: .in 0
                   2429: 7.2  USER FEEDBACK
                   2430: 
                   2431: The feedback from internal and external users
                   2432: has been overwhelmingly positive.
                   2433: In this section we indicate features that have been
                   2434: suggested for future systems.
                   2435: 
                   2436: a)  higher performance
                   2437: 
                   2438: Earlier versions of INGRES were very slow.
                   2439: The current version should alleviate this problem.
                   2440: 
                   2441: b)  recursion
                   2442: 
                   2443: QUEL does not support recursion.  Hence, recursion must be
                   2444: tediously programmed in C using the precompiler.
                   2445: This has been suggested as a desired extension.
                   2446: 
                   2447: 
                   2448: c)  other language extensions
                   2449: 
                   2450: These include user defined functions (especially counters),
                   2451: multiple target lists for a single qualification statement and if-then-else
                   2452: control structures in QUEL.
                   2453: These extensions are all so a user can avoid using the precompiler.
                   2454: 
                   2455: d)  report generator
                   2456: 
                   2457: PRINT is only a very primitive report generator.
                   2458: The need for augmented facilities in this area is clear.
                   2459: It should be written in EQUEL.
                   2460: 
                   2461: e)   bulk copy
                   2462: 
                   2463: The COPY routine fails to handle easily all situations that have arisen.
                   2464: 
                   2465: 7.3  FUTURE EXTENSIONS
                   2466: 
                   2467: Noted throughout the paper are areas where
                   2468: system improvement is in progress, planned or desired by users.
                   2469: Other areas of extension include the
                   2470: following
                   2471: 
                   2472: a)  A multi-computer system version of INGRES to operate
                   2473: on distributed data bases
                   2474: 
                   2475: b)  Further performance enhancements
                   2476: 
                   2477: c)  A higher level user language including recursion and user defined
                   2478: functions
                   2479: 
                   2480: d)  Better data definition and integrity features
                   2481: 
                   2482: e)  A data base administrator advisor.  This program would run at idle 
                   2483: priority and issue queries against a statistics relation to be kept by 
                   2484: INGRES.  It could then offer advice to a DBA concerning the choice of 
                   2485: access methods and the selection of indices.  This topic is discussed 
                   2486: further in [HELD75b].
                   2487: 
                   2488: ACKNOWLEDGEMENT
                   2489: .ph
                   2490: Besides the authors, the following
                   2491: persons have played an active role
                   2492: in the design and implementation of INGRES:
                   2493: Eric Allman, Rick Berman, Jim Ford, Angela Go,
                   2494: Nancy McDonald, Peter Rubinstein, Iris Schoenberg, Nick Whyte,
                   2495: Carol Williams, Karel Youssefi, Bill Zook.
                   2496: .ph
                   2497: Support for the project has come from
                   2498: ARO23007, DAHC04-74-G0087,
                   2499: NESCN0039-76-C-0022,
                   2500: NSFF44620-71-C-0087,
                   2501: JSEPDCR75-03839, NSFENG74-06651-AO1,
                   2502: and a grant from the Sloan Foundation.
                   2503: .wh 0 tm
                   2504: .de tm
                   2505: .sp 6
                   2506: ..
                   2507: .pl 60
                   2508: .po 5
                   2509: REFERENCES
                   2510: 
                   2511: 
                   2512: .ls 2
                   2513: .in+9
                   2514: .ti-9
                   2515: ALLM76 
                   2516: Allman, E., Stonebraker, M., and Held, G. 
                   2517: "Embedding a Relational Data Sub-language in a General Purpose Programming Language.", 
                   2518: Proc. ACM SIGPLAN-SIGMOD Workshop on Data, 
                   2519: Salt Lake City, Utah, March, 1976.
                   2520: .ti-9
                   2521: ASTR76 
                   2522: Astrahan, M. et. al., "SYSTEM R: A Relational Approach to
                   2523: Data Base Management," TODS, Vol 1, No. 2.
                   2524: .ti-9
                   2525: BOYC73 
                   2526: Boyce, R. et. al., 
                   2527: "Specifying Queries as Relational Expressions: SQUARE", 
                   2528: IBM Research, San Jose, Ca., RJ 1291, Oct. 1973.
                   2529: .ti-9
                   2530: CHAM74 
                   2531: Chamberlin, D. and Boyce, R., 
                   2532: "SEQUEL: A Structured English Query Language", 
                   2533: Proc. 1974 ACM-SIGFIDET Workshop on Data Description, Access and Control, Ann Arbor, Mich., May 1974.
                   2534: .ti-9
                   2535: CHAM75 
                   2536: Chamberlin, D., Gray, J.N., and Traiger, I.L., 
                   2537: "Views, Authorization and Locking in a Relational Data Base System", 
                   2538: Proc. 1975 National Computer Conference, AFIPS Press, May 1975.
                   2539: .ti-9
                   2540: CODA71 
                   2541: Committee on Data Systems Languages, 
                   2542: "CODASYL Data Base Task Group Report", 
                   2543: ACM, New York, 1971.
                   2544: .ti-9
                   2545: CODD70 
                   2546: Codd, E.F., 
                   2547: "A Relational Model of Data for Large Shared Data Banks", 
                   2548: CACM, Vol. 13 No. 6, pp. 377-387, June, 1970.
                   2549: .ti-9
                   2550: CODD71 
                   2551: Codd, E.F., 
                   2552: "A Data Base Sublanguage Founded on the Relational Calculus", 
                   2553: Proc. 1971 ACM-SIGFIDET Workshop on Data Description, Access and Control, San Diego, Ca., Nov. 1971.
                   2554: .ti-9
                   2555: CODD72 
                   2556: Codd, E.F., 
                   2557: "Relational Completeness of Data Base Sublanguages", 
                   2558: Courant Computer Science Symposium 6, May 1972.
                   2559: .ti-9
                   2560: CODD74 
                   2561: Codd, E.F. and Date, C.J., 
                   2562: "Interactive Support for Non-Programmers, The Relational and Network Approaches", 
                   2563: Proc. 1974 ACM-SIGFIDET Workshop on Data Description, Access and Control, Ann Arbor, Mich., May 1974.
                   2564: .ti-9
                   2565: DATE74 
                   2566: Date, C.J. and Codd, E.F., 
                   2567: "The Relational and Network Approaches: Comparison of the Aplication Programming Interfaces", 
                   2568: Proc. 1974 ACM-SIGFIDET Workshop on Data Description, Access and Control, Ann Arbor, Mich., May 1974.
                   2569: .ti-9
                   2570: GRAY75   Gray, J.N., Lorie, R.A., and Putzolu, G.R.
                   2571: "Granularity of Locks in a Shared Data Base"
                   2572: Proc. International Conference on Very Large
                   2573: Data Bases, Framingham, Mass., September, 1975.
                   2574: .ti-9
                   2575: GO75   
                   2576: Go, A., Stonebraker, M., and Williams, C., 
                   2577: "An Approach to Implementing a Geo-data System", 
                   2578: Proc. ACM SIGGRAPH/SIGMOD Conference on Data Bases in Interactive Design,
                   2579: Waterloo, Ontario, Sept. 1975.
                   2580: .ti-9
                   2581: GOTT75   Gottlieb, D., et. al.  "A Classification of
                   2582: Compression Methods and their
                   2583: Usefulness in a Large Data Processing Center"
                   2584: Proc. 1975 National Computer Conference,
                   2585: AFIPS Press, May, 1975.
                   2586: .ti-9
                   2587: HELD75a        
                   2588: Held, G.D., Stonebraker, M. and Wong, E., 
                   2589: "INGRES - A Relational Data Base Management System", 
                   2590: Proc. 1975 National Computer Conference, AFIPS Press, 1975.
                   2591: .ti-9
                   2592: HELD75b        
                   2593: Held, G.D., 
                   2594: "Storage Structures for Relational Data Base Management Systems", 
                   2595: Ph.D. Thesis, Dept. of Electrical Engineering and Computer Science, Univ. of California, Berkeley, 1975.
                   2596: .ti-9
                   2597: HELD75c        
                   2598: Held, G., and Stonebraker M., 
                   2599: "B-Trees Re-examined", 
                   2600: to appear in CACM.
                   2601: .ti-9
                   2602: IBM66  
                   2603: IBM Corp., 
                   2604: "OS ISAM Logic", 
                   2605: IBM, White Plains, N.Y., GY28-6618.
                   2606: .ti-9
                   2607: JOHN74 
                   2608: Johnson, S.C., 
                   2609: "YACC, Yet Another Compiler-Compiler", 
                   2610: UNIX Programmer's Manual, Bell Telephone Labs, Murray Hill, N.J., July 1974.
                   2611: .ti-9
                   2612: MCDO75a        
                   2613: McDonald, N. and Stonebraker, M., 
                   2614: "Cupid -- The Friendly Query Language", 
                   2615: Proc. ACM-Pacific-75, San Francisco, California, April 1975.
                   2616: .ti-9
                   2617: MCDO75b        
                   2618: McDonald, Nancy., 
                   2619: "CUPID: A Graphics Oriented Facility for Support of Non-programmer Interactions with a Data Base", 
                   2620: Ph.D. Thesis, Dept. of Electrical Engineering and Computer Science,
                   2621: University of California, Berkeley, 1975.
                   2622: .ti-9
                   2623: RITC74 
                   2624: Ritchie, D.M. and Thompson, K. 
                   2625: "The UNIX Tine-Sharing System,"
                   2626: CACM,
                   2627: Vol. 17, No. 3., March, 1974.
                   2628: .ti-9
                   2629: SCHO75 
                   2630: Schoenberg, Iris, 
                   2631: "Implementation of Integrity Constraints in the Relational Data Base Management System, INGRES", 
                   2632: M.S. Thesis, Dept. of Electrical Engineering and Computer Science,
                   2633: University of California, Berkeley, 1975.
                   2634: .ti-9
                   2635: STON74a        
                   2636: Stonebraker, M., "A Functional View of Data Independence,"
                   2637: Proc. 1974 SIGFIDET Workshop on Data
                   2638: Description, Access and Control, Ann Arbor, Mich., May 1974.
                   2639: .ti-9
                   2640: STON74b        
                   2641: Stonebraker, M. and Wong, E., 
                   2642: "Access Control in a Relational Data Base Management System by Query Modification", 
                   2643: Proc. 1974 ACM National Conference, San Diego, Ca., Nov. 1974
                   2644: .ti-9
                   2645: STON74c  Stonebraker, M., "High Level Integrity
                   2646: Assurance in Relational Data Base Systems", 
                   2647: University of California, Electronics Research
                   2648: Laboratory, Memo ERL-M473, August, 1974.
                   2649: .ti-9
                   2650: STON75 
                   2651: Stonebraker, M., "Implementation of Integrity Constraints and
                   2652: Views by Query Modification", Proc 1975 SIGMOD Workshop
                   2653: on Management of Data, San Jose, Ca., May 1975.
                   2654: .ti-9
                   2655: STON76 
                   2656: Stonebraker, M. and Rubinstein, P., "The INGRES Protection System,"
                   2657: Proc. 1976 ACM National Conference, Houston, Texas, October, 1976.
                   2658: .ti-9
                   2659: TSIC75 
                   2660: Tsichritzis, D., 
                   2661: "A Network Framework for Relational Implementation", 
                   2662: University of Toronto, Computer Systems Research Group Report CSRG-51, Feb. 1975
                   2663: .ti-9
                   2664: WONG76 
                   2665: Wong, E. and Youssefi, K., 
                   2666: "Decomposition-A Strategy for Query Processing,"
                   2667: TODS, (this issue).
                   2668: .ti-9
                   2669: ZOOK76 
                   2670: Zook, W., et. al.,
                   2671: "INGRES - Reference Manual (Version 5)",
                   2672: University of California, Electronics Research
                   2673: Laboratory, Memo. No. ERL-M585, April, 1976.
unix.superglobalmegacorp.com
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.