42BSD/ingres/doc/other/maintain.nr - annotate

Annotation of 42BSD/ingres/doc/other/maintain.nr, revision 1.1

1.1     ! root        1: .ds HE 'DATA BASE'INGRES'Page %'
        !             2: .so nmacs
        !             3: .hx
        !             4: .rs
        !             5: .sp 8
        !             6: .ce
        !             7: CREATING AND MAINTAINING A DATABASE USING INGRES
        !             8: .sp 15
        !             9: .ce
        !            10: by
        !            11: .ce
        !            12: Robert Epstein
        !            13: .sp 18
        !            14: .ce 2
        !            15: Memorandum No. ERL - M77-71
        !            16: December 16, 1977
        !            17: .sp 3
        !            18: .ce 4
        !            19: Electronics Research Laboratory
        !            20: College of Engineering
        !            21: University of California, Berkeley
        !            22: 94720
        !            23: .bp 1
        !            24: .ce
        !            25: .bl + 5
        !            26: CREATING AND MAINTAINING A DATABASE USING INGRES
        !            27: .sp 3
        !            28: 1.   INTRODUCTION
        !            29: .sp
        !            30: In this paper we describe how to create,
        !            31: structure and maintain relations
        !            32: in INGRES.
        !            33: It is assumed that the reader
        !            34: is familiar with INGRES
        !            35: and understands QUEL, the INGRES
        !            36: query language.
        !            37: It is strongly suggested that the
        !            38: document "A Tutorial on INGRES"
        !            39: (ERL M77/25) be read first.
        !            40: .sp 1
        !            41: This paper is divided into six sections:
        !            42: .in +5
        !            43: .sp 1
        !            44: 1.  Introduction
        !            45: .sp 1
        !            46: 2.  Creating a Relation
        !            47: .sp 1
        !            48: 3.  Using Copy
        !            49: .sp 1
        !            50: 4.  Storage Structures
        !            51: .sp 1
        !            52: 5.  Secondary Indices
        !            53: .sp 1
        !            54: 6.  Recovery and Data Update
        !            55: .in -5
        !            56: .sp 1
        !            57: To create a new data base you must be a valid
        !            58: INGRES user and have "create data base"
        !            59: permission.
        !            60: These permissions are granted by the
        !            61: "ingres" superuser.
        !            62: If you meet those two requirements you can create
        !            63: a data base using the command to the
        !            64: Unix shell:
        !            65: .sp 1
        !            66: % creatdb mydata
        !            67: .sp 1
        !            68: where "mydata" is the name of the data base.
        !            69: You become the "data base administrator" (DBA) for
        !            70: mydata.
        !            71: As the DBA you have certain special
        !            72: powers.
        !            73: 
        !            74: .in +4
        !            75: .ti -5
        !            76: 1.  Any relation created by you can be accessed
        !            77: by anyone else using "mydata".
        !            78: If any other user creates a relation it is
        !            79: strictly private and cannot
        !            80: be accessed by the DBA or any other user.
        !            81: 
        !            82: .ti -5
        !            83: 2.  You can use the "-u" flag in ingres
        !            84: and printr. 
        !            85: This enables you to use ingres on "mydata"
        !            86: with someone else's id.
        !            87: Refer to the INGRES reference
        !            88: manual under sections
        !            89: ingres(unix) and users(files)
        !            90: for details.
        !            91: 
        !            92: .ti -5
        !            93: 3.  You can run sysmod, restore and purge
        !            94: on "mydata".
        !            95: 
        !            96: .ti -5
        !            97: 4.  The data base by default is created to
        !            98: allow for multiple concurrent users.
        !            99: If only one user will ever use the
        !           100: data base at a time,
        !           101: the data base administrator can
        !           102: turn off the concurrency control.
        !           103: Refer to creatdb(unix) in the INGRES
        !           104: reference manual.
        !           105: .in -4
        !           106: .sp 1
        !           107: Once a data base has been created you should
        !           108: immediately run
        !           109: .sp 1
        !           110: % sysmod mydata
        !           111: .sp 1
        !           112: This program will convert
        !           113: the system relations to their
        !           114: "best" structure for use in INGRES.
        !           115: Sysmod will be explained further in
        !           116: section 4.
        !           117: .sp 1
        !           118: As a DBA or as a user you can create and
        !           119: structure new relations in any data
        !           120: base to which you have access.
        !           121: The remainder of this paper describes how this is done.
        !           122: .bp
        !           123: 2.   CREATING NEW RELATIONS IN INGRES
        !           124: .sp
        !           125: There are two ways to create new relations in INGRES.
        !           126: .sp 1
        !           127: .ti +5
        !           128: create
        !           129: .br
        !           130: .ti +5
        !           131: retrieve into
        !           132: .sp 1
        !           133: "Retrieve into" is used to form a new relation from one or
        !           134: more existing relations.
        !           135: "Create" is used to create
        !           136: a new relation with no tuples in it.
        !           137: .sp 1
        !           138: example 1:
        !           139: .sp 1
        !           140: .ti +5
        !           141: range of p is parts
        !           142: .ti +5
        !           143: range of s is supply
        !           144: .ti +5
        !           145: retrieve into newsupply(
        !           146: .ti +19
        !           147: number = s.snum,
        !           148: .ti +19
        !           149: p.pname,
        !           150: .ti +19
        !           151: s.shipdate)
        !           152: .ti +5
        !           153: where s.pnum = p.pnum
        !           154: .sp 1
        !           155: example 2:
        !           156: .sp 1
        !           157: .ti +5
        !           158: create newsupply(
        !           159: .ti +12
        !           160: number = i2,
        !           161: .ti +12
        !           162: pname = c20,
        !           163: .ti +12
        !           164: shipdate = c8)
        !           165: .sp 1
        !           166: In example 1 INGRES creates a new relation called
        !           167: "newsupply", computing what the format
        !           168: of each domain should be.
        !           169: The query is then run and newsupply is
        !           170: modified to "cheapsort".
        !           171: (This will be covered in more detail in section 4.)
        !           172: .sp 1
        !           173: In example 2 "newsupply"
        !           174: is created and the name and
        !           175: format for each domain is given.
        !           176: The format types which are
        !           177: allowed are:
        !           178: .sp 1
        !           179: .in +5
        !           180: .nf
        !           181: i1             1  byte integer
        !           182: i2             2   "      "
        !           183: i4             4   "      "
        !           184: f4             4  byte floating point number
        !           185: f8             8   "      "       "     "
        !           186: c1,c2,..,c255  1,2,..,255 byte character
        !           187: .in -5
        !           188: .fi
        !           189: .sp 1
        !           190: In example 2, the width of an individual
        !           191: tuple is 30 bytes
        !           192: (2 + 20 + 8), and the
        !           193: relation has three domains.
        !           194: Beware that INGRES
        !           195: has limits.
        !           196: A relation cannot have more
        !           197: than 49 domains and the tuple width
        !           198: cannot exceed 498 bytes.
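The width and limit rules above can be checked mechanically. The following is a minimal sketch in Python, not part of INGRES; the function names are hypothetical, and only the format list and the two limits (49 domains, 498 bytes) come from the text.

```python
# Limits quoted in the text above.
MAX_DOMAINS = 49
MAX_TUPLE_WIDTH = 498

def format_width(fmt):
    """Return the storage width in bytes of a format such as i2, f8 or c20."""
    kind, size = fmt[0], int(fmt[1:])
    allowed = {"i": (1, 2, 4), "f": (4, 8), "c": range(1, 256)}
    if kind not in allowed or size not in allowed[kind]:
        raise ValueError("unknown format: " + fmt)
    return size

def tuple_width(domains):
    """domains is a list of (name, format) pairs; return the tuple width."""
    if len(domains) > MAX_DOMAINS:
        raise ValueError("more than 49 domains")
    width = sum(format_width(fmt) for _, fmt in domains)
    if width > MAX_TUPLE_WIDTH:
        raise ValueError("tuple wider than 498 bytes")
    return width

# The relation from example 2.
newsupply = [("number", "i2"), ("pname", "c20"), ("shipdate", "c8")]
print(tuple_width(newsupply))  # 30
```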
        !           199: .sp 1
        !           200: UNIX allocates space on a disk
        !           201: in units of 512 byte pages.
        !           202: INGRES gets a performance advantage
        !           203: by doing I/O in one block units.
        !           204: Therefore relations are divided into 512 byte pages.
        !           205: INGRES never splits a tuple
        !           206: between two pages.
        !           207: Thus some space can be wasted.
        !           208: There is an overhead of 12 bytes per page plus
        !           209: 2 bytes for every tuple on the page.
        !           210: The formulas are:
        !           211: .sp 1
        !           212: .ti +5
        !           213: number tuples per page = 500/(tuple width + 2)
        !           214: .sp 1
        !           215: .ti +5
        !           216: wasted space = 500 - number of tuples per page
        !           217: .ti +5
        !           218: *(tuple width +2)
        !           219: .sp 1
        !           220: For our example there are
        !           221: .sp 1
        !           222: .ti +5
        !           223: 15 = 500/(30 + 2)
        !           224: .sp 1
        !           225: .ti +5
        !           226: 20 = 500 - 15 * (30 + 2)
        !           227: .sp 1
        !           228: 15 tuples per page and 20 bytes
        !           229: wasted per page.
        !           230: These computations are valid
        !           231: only for uncompressed relations.
        !           232: We will return to this subject
        !           233: in section 4 when we discuss compression.
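The two formulas can be written out as a short Python sketch. The constant and function names are illustrative; the numbers (512-byte pages, 12 bytes of page overhead, 2 bytes per tuple) are those given in the text. The division is an integer division because a tuple is never split between pages.

```python
PAGE_SIZE = 512       # bytes per page
PAGE_OVERHEAD = 12    # bytes of overhead per page
TUPLE_OVERHEAD = 2    # bytes of overhead per tuple

def tuples_per_page(width):
    """Number of whole uncompressed tuples that fit on one page."""
    usable = PAGE_SIZE - PAGE_OVERHEAD  # 500 bytes
    return usable // (width + TUPLE_OVERHEAD)

def wasted_space(width):
    """Bytes left unused on a full page."""
    usable = PAGE_SIZE - PAGE_OVERHEAD
    return usable - tuples_per_page(width) * (width + TUPLE_OVERHEAD)

# A 30-byte tuple (the width of newsupply) and the 498-byte maximum.
print(tuples_per_page(30), wasted_space(30))    # 15 20
print(tuples_per_page(498), wasted_space(498))  # 1 0
```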
        !           234: .sp 1
        !           235: If you forget a domain name or
        !           236: format, use the "help" command.
        !           237: For example if you gave the INGRES
        !           238: command:
        !           239: .sp 1
        !           240: .ti +5
        !           241: help newsupply
        !           242: .sp 1
        !           243: the following would be printed:
        !           244: .nf
        !           245: 
        !           246: Relation:              newsupply
        !           247: Owner:                 bob
        !           248: Tuple width:           30
        !           249: Saved until:           Thu Nov 10 16:17:06 1977
        !           250: Number of tuples:      0
        !           251: Storage structure:     paged heap
        !           252: Relation type:         user relation
        !           253: 
        !           254:  attribute name    type  length  keyno.
        !           255: 
        !           256:  number            i       2
        !           257:  pname             c      20
        !           258:  shipdate          c       8
        !           259: 
        !           260: .fi
        !           261: Notice that every relation has an expiration
        !           262: date.
        !           263: This is set to be one week
        !           264: from the time when it was
        !           265: created.
        !           266: The "save" command
        !           267: can be used to save the relation longer.
        !           268: See "save(quel)" and "purge(unix)" in the 
        !           269: INGRES reference manual.
        !           270: 
        !           271: .bp
        !           272: 3.   COPYING DATA TO AND FROM INGRES
        !           273: .sp
        !           274: Once a relation is created, there are two mechanisms for
        !           275: inserting new data:
        !           276: .sp
        !           277: .in +5
        !           278: append command
        !           279: .br
        !           280: copy command
        !           281: .in -5
        !           282: .sp
        !           283: Append is used to insert tuples one at a time,
        !           284: or for filling one relation from other relations.
        !           285: .sp
        !           286: Copy is used for copying data from a UNIX file into
        !           287: a relation. 
        !           288: It is used for copying data from another program, or for copying
        !           289: data from another system.  
        !           290: It is also the most convenient way to copy any data
        !           291: larger than a few tuples.
        !           292: .sp
        !           293: Let's begin by creating a simple relation
        !           294: and loading data into it.
        !           295: .sp
        !           296: Example:
        !           297: .sp
        !           298: .ti +5
        !           299: .nf
        !           300: create donation (name = c10, amount = f4, ext = i2)
        !           301: .sp
        !           302: .fi
        !           303: Now suppose we have two people to enter.  
        !           304: The simplest procedure is probably
        !           305: to run the two queries in INGRES using
        !           306: the append command.
        !           307: .sp
        !           308: .ti 5
        !           309: .nf
        !           310: append to donation (name="frank",amount = 5,ext = 204)
        !           311: .sp 1
        !           312: .ti 5
        !           313: append to donation (name="harry",ext = 209,amount = 4.50)
        !           314: .fi
        !           315: .sp
        !           316: Note that the order in which the domains are given
        !           317: does not matter.
        !           318: INGRES matches by recognizing attribute names and
        !           319: does not care in what order attributes
        !           320: are listed.
        !           321: Here is what the relation "donation" looks like now:
        !           322: .nf
        !           323: 
        !           324: donation relation
        !           325: 
        !           326: |name      |amount    |ext   |
        !           327: |----------------------------|
        !           328: |frank     |5.000     |204   |
        !           329: |harry     |4.500     |209   |
        !           330: |----------------------------|
        !           331: .fi
        !           332: .sp
        !           333: We now have two people entered into
        !           334: the donation relation.
        !           335: Suppose we had fifty more to enter.
        !           336: Using the append command is far too tedious
        !           337: since so much typing is involved for each tuple.
        !           338: The copy command will better suit our purposes.
        !           339: .sp
        !           340: Copy can take data from a regular
        !           341: Unix file in a variety of formats and
        !           342: append it to a relation.
        !           343: To use the copy command first create
        !           344: a Unix file (typically using "ed") containing
        !           345: the data.
        !           346: .sp
        !           347: For example, let's put five new names in a file
        !           348: using the editor.
        !           349: .sp
        !           350: .nf
        !           351: .tr Z.
        !           352: % ed
        !           353: a
        !           354: bill,3.50,302
        !           355: sam,10.00,410
        !           356: susan,,100
        !           357: sally,.5,305
        !           358: george,4.00,302
        !           359: Z
        !           360: w newdom
        !           361: 68
        !           362: q
        !           363: %
        !           364: .tr ZZ
        !           365: .sp 
        !           366: .fi
        !           367: The format of the above file is a
        !           368: name followed by a comma, followed 
        !           369: by the amount, then a comma, then the extension,
        !           370: and finally a newline.
        !           371: Null entries, for example the amount
        !           372: for susan, are perfectly
        !           373: legal and default to zero
        !           374: for numerical domains and
        !           375: blanks for character domains.
        !           376: .sp
        !           377: To use copy we enter INGRES and give the copy command.
        !           378: .sp
        !           379: .in +5
        !           380: .nf
        !           381: copy donation (name = c0, amount = c0, ext = c0)
        !           382:        from "/mnt/bob/newdom"
        !           383: .fi
        !           384: .sp
        !           385: .in -5
        !           386: Here is how the copy command works:
        !           387: .sp
        !           388: .ti +5
        !           389: copy relname (list of what to copy) from "full pathname"
        !           390: .sp
        !           391: In the case above we wrote:
        !           392: .sp
        !           393: .ti +5
        !           394: copy donation (. . .) from "/mnt/bob/newdom"
        !           395: .sp
        !           396: Although amount and ext are stored in the relation
        !           397: as f4 (floating point) and i2 (integer), in the
        !           398: Unix file they were entered as characters.
        !           399: In specifying the format of the domain,
        !           400: copy accepts:
        !           401: .sp
        !           402: .ti +5
        !           403: domain = format
        !           404: .sp
        !           405: where domain is the domain name and
        !           406: the format in the UNIX file is one of
        !           407: .sp
        !           408: .in +5
        !           409: .nf
        !           410: i1, i2, i4         (true binary integer of size 1, 2, or 4)
        !           411: .br
        !           412: f4, f8             (true binary floating point of size 4 or 8)
        !           413: .br
        !           414: c1, c2, c3,...c255 (a fixed length character string)
        !           415: .br
        !           416: c0                 (a variable length character string
        !           417:                    delimited by a comma, tab or new line)
        !           418: .sp
        !           419: .in -5
        !           420: .fi
        !           421: In the example we use
        !           422: .sp
        !           423: .ti +5
        !           424: name = c0, amount = c0, ext = c0
        !           425: .sp
        !           426: This means that each of the domains
        !           427: was stored in the Unix file as
        !           428: variable length character
        !           429: strings.
        !           430: Copy takes the first comma,
        !           431: tab, or new line character
        !           432: as the end of the string.
        !           433: This is by far the most
        !           434: common use of copy
        !           435: when the data is being entered
        !           436: into a relation
        !           437: for the first time.
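To make the c0 rules concrete, here is a minimal Python sketch of how one line of the file could be turned into a donation tuple. It is an illustration, not the INGRES implementation: the function name is hypothetical, it handles one line at a time, and it assumes the three domains name = c10, amount = f4, ext = i2 from the example. Empty fields default to zero or blanks as described earlier.

```python
import re

def parse_c0_line(line):
    """Split one line on comma or tab and convert to (name, amount, ext)."""
    name, amount, ext = re.split(r"[,\t]", line.rstrip("\n"))
    return (
        name.ljust(10),                   # c10: blank padded, blank if empty
        float(amount) if amount else 0.0, # f4: empty field becomes zero
        int(ext) if ext else 0,           # i2: empty field becomes zero
    )

print(parse_c0_line("susan,,100\n"))  # ('susan     ', 0.0, 100)
```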
        !           438: .sp
        !           439: Copy can also be used to copy data from a relation
        !           440: into a Unix file.
        !           441: For example:
        !           442: .sp
        !           443: .in +5
        !           444: .nf
        !           445: copy donation (name = c10, amount = c10, ext = c5)
        !           446:         into "/mnt/bob/data"
        !           447: .fi
        !           448: .in -5
        !           449: .sp
        !           450: This will cause the following to happen:
        !           451: .sp 1
        !           452: .in +4
        !           453: .ti -5
        !           454: 1.  If the file /mnt/bob/data already exists it will
        !           455: be destroyed.
        !           456: .ti -5
        !           457: .sp 1
        !           458: 2.  The file is created in mode 600 (read/write by you only)
        !           459: .sp 1
        !           460: .ti -5
        !           461: 3.  Name will be copied as a 10 character field,
        !           462: immediately followed by amount,
        !           463: immediately followed by ext.
        !           464: Amount will be converted to a character
        !           465: field 10 characters wide.
        !           466: Ext will be converted to a character
        !           467: field 5 characters wide.
        !           468: .in -4
        !           469: .sp 1
        !           470: The file "/mnt/bob/data" would be a stream of characters
        !           471: looking like this:
        !           472: 
        !           473: .tr Z 
        !           474: .nf
        !           475: frankZZZZZZZZZZ5.000ZZ204harryZZZZZZZZZZ4.500ZZ209bill
        !           476: ZZZZZZZZZZZ3.500ZZ302samZZZZZZZZZZZ10.000ZZ410susanZZZ
        !           477: ZZZZZZZ0.000ZZ100sallyZZZZZZZZZZ0.500ZZ305georgeZZZZZZ
        !           478: ZZZ4.000ZZ302
        !           479: .fi
        !           480: .tr ZZ
        !           481: 
        !           482: .sp
        !           483: The output was broken into four lines to
        !           484: make it fit on this page.
        !           485: In actuality the file
        !           486: is a single line.
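The fixed-width layout can be reproduced with a short Python sketch. The function name is hypothetical, and the three decimal places for amount are inferred from the sample output rather than from a stated rule.

```python
def fixed_record(name, amount, ext):
    """Format one tuple as name = c10, amount = c10, ext = c5."""
    # name is left justified; the numeric fields are right justified.
    return f"{name:<10.10}{amount:>10.3f}{ext:>5d}"

rows = [("frank", 5.0, 204), ("harry", 4.5, 209)]
# Records are written back to back with no separator or newline.
stream = "".join(fixed_record(*row) for row in rows)
print(len(stream))  # 50
```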
        !           487: Another example:
        !           488: .sp
        !           489: .in +5
        !           490: .nf
        !           491: copy donation (name = c0, colon = d1, ext = c0, comma = d1,
        !           492:        amount = c0, nl = d1) into "/mnt/bob/data"
        !           493: .fi
        !           494: .in -5
        !           495: .sp
        !           496: In this example "c0" is interpreted to mean "use
        !           497: the appropriate character format".
        !           498: For character domains it is the
        !           499: width of the domain.
        !           500: Numeric domains are converted to characters
        !           501: according to the INGRES defaults
        !           502: (see ingres(unix)).
        !           503: .sp
        !           504: The statements:
        !           505: .sp
        !           506: .in +5
        !           507: colon = d1
        !           508: .br
        !           509: comma = d1
        !           510: .br
        !           511: nl = d1
        !           512: .in -5
        !           513: .sp
        !           514: are used to insert one colon,
        !           515: comma, and newline into the file.
        !           516: The format "d1" is interpreted to mean
        !           517: one dummy character.
        !           518: When copying into a Unix file,
        !           519: a selected set of characters can be inserted into
        !           520: the file using this
        !           521: "dummy domain" specification.
        !           522: Here is what the file "/mnt/bob/data" would look like:
        !           523: 
        !           524: .nf
        !           525: frank     :   204,     5.000
        !           526: harry     :   209,     4.500
        !           527: bill      :   302,     3.500
        !           528: sam       :   410,    10.000
        !           529: susan     :   100,     0.000
        !           530: sally     :   305,     0.500
        !           531: george    :   302,     4.000
        !           532: 
        !           533: .fi
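The listing above can be reproduced with a short Python sketch. The function name is hypothetical, and the field widths used here (6 characters for ext, 10 for amount with three decimal places) are read off the sample output; the actual INGRES defaults are described in ingres(unix).

```python
def delimited_record(name, amount, ext):
    """Format one tuple as name, colon, ext, comma, amount, newline."""
    return f"{name:<10}:{ext:>6d},{amount:>10.3f}\n"

print(delimited_record("frank", 5.0, 204), end="")
```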
        !           534: .sp
        !           535: If you wanted a file with the true binary representation
        !           536: of the numbers you would use:
        !           537: .sp
        !           538: .ti +5
        !           539: copy donation (name = c10, amount = f4, ext = i2) into "/mnt/bob/data"
        !           540: .sp
        !           541: This would create a file with the exact
        !           542: copy of each tuple,
        !           543: one after the other.
        !           544: This is frequently desirable for
        !           545: temporary backup purposes
        !           546: and it guarantees that floating
        !           547: point domains will be exact.
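A binary record of this shape can be sketched with Python's struct module. This is only an illustration of the layout: it assumes little-endian byte order and IEEE single precision, which is not necessarily the floating point format of the machine INGRES ran on, and the function names are hypothetical.

```python
import struct

# name = c10, amount = f4, ext = i2, packed with no padding between fields.
RECORD = struct.Struct("<10sfh")

def pack_tuple(name, amount, ext):
    """Return the 16-byte binary image of one donation tuple."""
    return RECORD.pack(name.ljust(10).encode("ascii"), amount, ext)

def unpack_tuple(data):
    """Inverse of pack_tuple."""
    name, amount, ext = RECORD.unpack(data)
    return name.decode("ascii"), amount, ext

print(RECORD.size)  # 16
```

Because the stored bytes are copied unchanged, unpacking and repacking a record yields the same bytes, which is the sense in which the binary form keeps floating point domains exact.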
        !           548: .sp 2
        !           549: TYPICAL ERRORS
        !           550: .sp 1
        !           551: There are 17 different errors
        !           552: that can occur in copy.
        !           553: We will go through the most
        !           554: common ones.
        !           555: .sp 1
        !           556: Suppose you have a file with
        !           557: .sp 1
        !           558: bill,3.5,302
        !           559: .br
        !           560: sam,10,410,
        !           561: .br
        !           562: susan,3,100
        !           563: .sp 1
        !           564: and run the copy command
        !           565: .sp 1
        !           566: .in +5
        !           567: .nf
        !           568: copy donation (name = c0, amount = c0, ext = c0)
        !           569:        from "/mnt/bob/data"
        !           570: .fi
        !           571: .in -5
        !           572: .sp 1
        !           573: You would get the error message
        !           574: .sp 1
        !           575: .nf
        !           576: 5809: COPY: bad input string for domain amount. Input was "susan".
        !           577: There were 2 tuples successfully copied from /mnt/bob/data into
        !           578: donation.
        !           579: .fi
        !           580: .sp 1
        !           581: What happened is that line 2 had an extra
        !           582: comma.
        !           583: The first two tuples were copied correctly.
        !           584: For the next tuple, name = "" (blank), amount =
        !           585: "susan", and ext = "3".  
        !           586: Since "susan" is not a proper floating point
        !           587: number, an error was generated and
        !           588: processing was stopped after two tuples.
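The failure is easier to see when the file is treated as a stream of delimited strings rather than as lines. The Python sketch below is a hypothetical model of that behavior, not the INGRES code: it splits the whole input on comma, tab and newline, then consumes three fields per tuple.

```python
import re

def copy_from(text):
    """Return (tuples copied, offending fields or None)."""
    fields = re.split(r"[,\t\n]", text)
    if fields and fields[-1] == "":
        fields.pop()  # nothing follows the final delimiter
    copied = []
    for i in range(0, len(fields) - 2, 3):
        name, amount, ext = fields[i:i + 3]
        try:
            row = (name, float(amount or 0), int(ext or 0))
        except ValueError:
            return copied, (name, amount, ext)  # stop at the first bad tuple
        copied.append(row)
    return copied, None

copied, bad = copy_from("bill,3.5,302\nsam,10,410,\nsusan,3,100\n")
print(len(copied), bad)  # 2 ('', 'susan', '3')
```

The extra comma on line 2 produces an empty string between the comma and the newline, so every later field is shifted by one position.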
        !           589: .sp 1
        !           590: If you tried to copy from a file
        !           591: containing a line such as
        !           592: .sp 1
        !           593: nancy,5.0,35000
        !           594: .sp 1
        !           595: you would get the error message
        !           596: .sp 1
        !           597: .nf
        !           598: 5809: COPY: bad input string for domain ext. Input was "35000".
        !           599: There were 0 tuples successfully copied from /mnt/bob/data into
        !           600: donation.
        !           601: .fi
        !           602: .sp 1
        !           603: Here, since ext is an i2 (integer) domain,
        !           604: it cannot exceed the value 32767.
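The range check can be sketched in Python as follows. The function name is hypothetical, and the lower bounds assume signed two's complement integers; the text itself only states the upper limit of 32767 for i2.

```python
def to_int(text, size):
    """Convert a character string to an integer of 1, 2 or 4 bytes."""
    value = int(text)
    bits = 8 * size
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError("bad input string: " + text)
    return value

print(to_int("32767", 2))  # 32767
```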
        !           605: .sp 1
        !           606: There are numerous other error messages,
        !           607: most of which are self-explanatory.
        !           608: .sp 1
        !           609: In addition there are three non-fatal warnings
        !           610: which may appear on a copy "from".
        !           611: .sp 1
        !           612: If you are copying from
        !           613: a file into a relation which
        !           614: is ISAM or hash, a count
        !           615: of the number of duplicate
        !           616: tuples will appear (if there were
        !           617: any).
        !           618: This will never appear on a "heap"
        !           619: because no duplicate checking
        !           620: is performed.
        !           621: .sp 1
        !           622: INGRES does not allow
        !           623: control characters
        !           624: (such as "bell" etc.)
        !           625: to be stored.
        !           626: If copy reads any control characters, it converts them
        !           627: to blanks and reports the number
        !           628: of domains that had control characters in them.
        !           629: .sp 1
        !           630: If you are copying using the c0
        !           631: option, copy will
        !           632: report if any character strings were
        !           633: longer than their domains
        !           634: and had to be truncated.
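The last two warnings can be modelled with a small Python sketch. The function name is hypothetical, and treating codes below 32 and code 127 as the control characters is an assumption; the text does not define the set.

```python
def clean_field(text, width):
    """Return (stored value, had control characters, was truncated)."""
    def is_control(ch):
        return ord(ch) < 32 or ord(ch) == 127
    had_control = any(is_control(ch) for ch in text)
    # Control characters are replaced by blanks, not dropped.
    cleaned = "".join(" " if is_control(ch) else ch for ch in text)
    truncated = len(cleaned) > width
    return cleaned[:width].ljust(width), had_control, truncated

print(clean_field("fr\x07ank", 10))  # ('fr ank    ', True, False)
```

A caller would count the fields for which each flag is true and report the two totals after the copy finishes.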
        !           635: .sp 2
        !           636: SPECIAL FEATURES
        !           637: .sp 1
        !           638: .ti +3
        !           639: There are a few special functions that
        !           640: make copy a little
        !           641: easier to use.
        !           642: .sp 1
        !           643: .nr in 6n
        !           644: .ti -4
        !           645: 1.  Bulk copy
        !           646: .sp 1
        !           647: If you ask for:
        !           648: .sp 1
        !           649: .ti +4
        !           650: copy relname () from "file"
        !           651: .ti +8
        !           652: or
        !           653: .ti +4
        !           654: copy relname () into "file"
        !           655: .sp 1
        !           656: copy expands the statement to mean:
        !           657: .sp 1
        !           658: .in +5
        !           659: copy each domain in its proper
        !           660: order according to its proper
        !           661: format.  
        !           662: .sp 1
        !           663: .in -5
        !           664: So, if you said
        !           665: .sp 1
        !           666: .ti +4
        !           667: copy donation () into "/mnt/bob/donation"
        !           668: .sp 1
        !           669: it would be the same as asking for:
        !           670: .sp 1
        !           671: .ti +4
        !           672: .nf
        !           673: copy donation (name = c10, amount = f4, ext = i2)
        !           674:        into "/mnt/bob/donation"
        !           675: .fi
        !           676: .sp 1
        !           677: This provides a convenient way to copy
        !           678: whole relations to and from INGRES.
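The expansion of the empty domain list can be sketched in Python. The function name and the way the relation description is passed in are hypothetical; INGRES itself takes the names and formats from its system relations.

```python
def expand_bulk_copy(relname, domains, direction, path):
    """Build the full copy statement that an empty domain list stands for."""
    # domains is a list of (name, format) pairs in their stored order.
    spec = ", ".join(f"{name} = {fmt}" for name, fmt in domains)
    return f'copy {relname} ({spec}) {direction} "{path}"'

donation = [("name", "c10"), ("amount", "f4"), ("ext", "i2")]
print(expand_bulk_copy("donation", donation, "into", "/mnt/bob/donation"))
```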
        !           679: .sp 1
        !           680: .ti -4
        !           681: 2.  Dummy Domains
        !           682: .sp 1
        !           683: If you are copying data
        !           684: from another computer or program,
        !           685: frequently there will be
        !           686: a portion of data that you will want to
        !           687: ignore.
        !           688: This can be done using the
        !           689: dummy domain specifications
        !           690: d0, d1, d2 ... d511.
        !           691: For example
        !           692: .sp 1
        !           693: .ti +4
        !           694: .nf
        !           695: copy rel (dom1 = c5, dummy = d2, dom2 = i4,
        !           696:        dumb = d0) from "/mnt/me/data"
        !           697: .fi
        !           698: .sp 1
        !           699: The first five characters
        !           700: are put in dom1,
        !           701: the next two characters are ignored.
        !           702: The next four bytes are
        !           703: an i4 (integer) and go in dom2,
        !           704: and the remaining delimited string
        !           705: is ignored.
        !           706: The name given to a dummy specifier is
        !           707: ignored.
        !           708: .sp 1
        !           709: As mentioned previously,
        !           710: dummy domains can be used on a copy
        !           711: "into" a Unix file for inserting
        !           712: special characters.
        !           713: The list of recognizable names includes:
        !           714: .sp 1
        !           715: .in +5
        !           716: .nf
        !           717: nl        newline
        !           718: tab       tab character
        !           719: sp        space
        !           720: nul       a zero byte
        !           721: null      a zero byte
        !           722: comma     ,
        !           723: dash      -
        !           724: colon     :
        !           725: lparen    (
        !           726: rparen    )
        !           727: .fi
        !           728: .in -5
        !           729: .sp 1
        !           730: .ti -4
        !           731: 3.  Truncation
        !           732: .sp 1
        !           733: It is not uncommon to have a mistake occur
        !           734: and need to start over.
        !           735: The simplest way to do that
        !           736: is to "truncate" the relation.
        !           737: This is done by the command:
        !           738: .sp 1
        !           739: .ti +4
        !           740: modify relname to truncated
        !           741: .sp 1
        !           742: This has the effect of removing
        !           743: all tuples in relname,
        !           744: releasing all disk space,
        !           745: and making relname a heap again.
        !           746: It is the logical equivalent of
        !           747: a destroy followed by a create
        !           748: (but with a lot less typing).
        !           749: .sp 1
        !           750: Since formatting mistakes are possible
        !           751: with copy,
        !           752: it is not generally a good idea to
        !           753: copy data into a relation that already
        !           754: has valid data in it.
        !           755: The best procedure is to create a
        !           756: temporary relation with the same domains
        !           757: as the existing relation.
        !           758: Copy data into the temporary relation
        !           759: and then append it to the real relation.
        !           760: For example:
        !           761: 
        !           762: .in +8
        !           763: .nf
        !           764: create tempdom(name=c10,amount=f4,ext=i2)
        !           765: 
        !           766: copy tempdom(name=c0,amount=c0,ext=c0)
        !           767: from "/mnt/bob/data"
        !           768: 
        !           769: range of td is tempdom
        !           770: append to donation(td.all)
        !           771: .fi
        !           772: .in -8
        !           773: .sp 1
        !           774: 4.  Specifying Delimiters
        !           775: .sp 1
        !           776: Sometimes it is desirable to specify
        !           777: what the delimiting character should be
        !           778: on a copy "from" a file.
        !           779: This can be done by specifying:
        !           780: 
        !           781: .ti +8
        !           782: domain = c0delim
        !           783: 
        !           784: where "delim" is a valid delimiter
        !           785: taken from the list of recognizable names.
        !           786: This list was summarized on the
        !           787: previous page under "dummy domains".
        !           788: For example:
        !           789: 
        !           790: .ti +8
        !           791: copy donation (name = c0nl) from "/mnt/me/data"
        !           792: 
        !           793: will copy names from the file to the relation.
        !           794: Only a new line will delimit the names so
        !           795: any commas or tabs will be passed along as
        !           796: part of the name.
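The difference between a plain c0 field and a c0nl field can be sketched in Python.
This is only an illustration, not the INGRES copy code:
the function read_field is hypothetical, and the default delimiter set
(comma, tab, newline) is an assumption inferred from the description above.

```python
# Hypothetical sketch of how a c0 field might be split off the input.
DEFAULT_DELIMS = ",\t\n"   # assumed default delimiters for c0

def read_field(data, delims=DEFAULT_DELIMS):
    """Return (field, rest): text up to the first delimiter, and what follows it."""
    for i, ch in enumerate(data):
        if ch in delims:
            return data[:i], data[i + 1:]
    return data, ""   # no delimiter found: the whole input is the field

line = "Smith, John\n"
print(read_field(line))          # default c0: stops at the comma
print(read_field(line, "\n"))    # c0nl: stops only at the newline
```

With the default delimiters the name is cut at the comma;
with newline as the only delimiter the comma stays part of the name.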
        !           797: 
        !           798: When copying "into" a Unix file,
        !           799: the "delim" is actually written into the
        !           800: file,
        !           801: so on a copy "into" the specification:
        !           802: 
        !           803: .ti +8
        !           804: copy donation (name = c0nl) into "/mnt/me/file"
        !           805: 
        !           806: will cause "name" to be written followed by a new line
        !           807: character.
        !           808: .nr in 0
        !           809: .bp
        !           810: 4.   CHOOSING THE BEST STORAGE STRUCTURES
        !           811: .sp 1
        !           812: .sp
        !           813: We now turn to the issue of efficiency.
        !           814: Once you have created a relation
        !           815: and inserted your
        !           816: data using either copy or append,
        !           817: INGRES can process any query
        !           818: on the relation.
        !           819: There are several things you can do
        !           820: to improve the speed at which INGRES
        !           821: can process a query.
        !           822: .sp
        !           823: INGRES can store a relation in three different
        !           824: internal
        !           825: structures.
        !           826: These are called "heap",
        !           827: "isam", and "hash".
        !           828: First we will briefly describe each
        !           829: structure and then later expand our
        !           830: discussion.
        !           831: .sp
        !           832: HEAP
        !           833: .sp 1
        !           834: When a relation is first created, it is
        !           835: created as a "heap".
        !           836: There are two important properties about a heap:
        !           837: duplicate tuples are not removed,
        !           838: and nothing is known about the location of the tuples.
        !           839: If you ran the query:
        !           840: .sp 1
        !           841: .ti +5
        !           842: range of d is donation
        !           843: .br
        !           844: .ti +5
        !           845: retrieve (d.amount) where d.name = "bill"
        !           846: .sp 1
        !           847: INGRES would have to read every tuple in the
        !           848: relation looking for those with name "bill".
        !           849: If the relation is small this isn't a 
        !           850: serious matter.
        !           851: But if the relation is very large, this can take
        !           852: minutes (or even hours!).
        !           853: .sp 1
        !           854: HASH
        !           855: .sp 1
        !           856: A relation whose structure is "hash" can give fast
        !           857: access to searches on certain domains.
        !           858: (Those domains are usually referred to as
        !           859: "keyed domains".)
        !           860: In addition, a "hashed" relation contains
        !           861: no duplicate tuples.
        !           862: For example, suppose the donation relation is stored hashed on
        !           863: domain "name".  
        !           864: Then the query:
        !           865: .sp 1
        !           866: .ti +5
        !           867: retrieve (d.amount) where d.name = "bill"
        !           868: .sp 1
        !           869: will run quickly
        !           870: since INGRES knows approximately where on
        !           871: disk the tuple is stored.
        !           872: If the relation contains only a few tuples you
        !           873: won't notice the difference between a "heap"
        !           874: and a "hash" structure.
        !           875: But as the relation becomes larger, the
        !           876: difference in speed becomes
        !           877: much more noticeable.
        !           878: .sp 1
        !           879: ISAM
        !           880: .sp 1
        !           881: An isam structure is one where the relation is
        !           882: sorted on one or more domains,
        !           883: (also called keyed domains).
        !           884: Duplicates are also removed on "isam relations".
        !           885: When new tuples are appended they are
        !           886: placed "approximately" in their sorted position in the
        !           887: relation.
        !           888: (The "approximately" will be explained a bit
        !           889: later.)
        !           890: .sp 1
        !           891: Suppose donation is isam on name.
        !           892: To process the query
        !           893: .sp 1
        !           894: .ti +5
        !           895: retrieve (d.amount) where d.name = "bill"
        !           896: .sp 1
        !           897: INGRES will determine where in the sorted order
        !           898: the name "bill" would be and read only
        !           899: those portions of the relation.
        !           900: .sp 1
        !           901: Since the relation is approximately sorted,
        !           902: an isam structure is also efficient for
        !           903: processing the query:
        !           904: .ti +5
        !           905: .sp 1
        !           906: retrieve (d.amount) where d.name >= "b" and d.name < "g"
        !           907: .sp 1
        !           908: This query would retrieve the amounts for all names beginning
        !           909: with "b" through "f".
        !           910: The entire relation would not have to be
        !           911: searched since it is isam on name.
        !           912: .sp 2
        !           913: SPECIFYING THE STORAGE STRUCTURE
        !           914: .sp
        !           915: Any user created relation can be converted
        !           916: to any storage structure using the
        !           917: "modify" command.
        !           918: For example
        !           919: .sp
        !           920: .ti 5
        !           921: modify donation to hash on name
        !           922: .br
        !           923: or
        !           924: .br
        !           925: .ti 5
        !           926: modify donation to isam on name
        !           927: .sp
        !           928: or even
        !           929: .sp
        !           930: .ti 5
        !           931: modify donation to heap
        !           932: .sp 2
        !           933: PRIMARY AND OVERFLOW PAGES
        !           934: .sp
        !           935: At this point it is necessary to introduce the
        !           936: concepts of primary and overflow pages on
        !           937: hash and isam structures.
        !           938: Both hash and isam are techniques for assigning
        !           939: specific tuples to specific pages of a relation
        !           940: based on the tuple's keyed domains.
        !           941: Thus each page will contain only a certain
        !           942: specified subset of the relation.
        !           943: 
        !           944: When a new tuple is appended to a hash or isam
        !           945: relation, INGRES
        !           946: first determines what page it belongs to,
        !           947: and then looks for room on that page.
        !           948: If there is space then the tuple
        !           949: is placed on that page.
        !           950: If not,
        !           951: then an "overflow" page is created and
        !           952: the tuple is placed there.
        !           953: 
        !           954: The overflow page is linked to the
        !           955: original page.
        !           956: The original page is called the "primary"
        !           957: page.
        !           958: If the overflow page became full,
        !           959: then INGRES
        !           960: would connect an overflow page to it.
        !           961: We would then have one primary page
        !           962: linked to an overflow page,
        !           963: linked to another overflow page.
        !           964: Overflow pages are dynamically added as
        !           965: needed.
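The chaining of overflow pages described above can be modeled with a short Python sketch.
It is an illustration under stated assumptions, not the INGRES implementation:
the Page class, the function names, and the page capacity of four tuples are all hypothetical.

```python
# Hypothetical model of one primary page and its chain of overflow pages.
PAGE_CAPACITY = 4   # assumed number of tuples per page, for illustration only

class Page:
    def __init__(self):
        self.tuples = []
        self.overflow = None   # link to the next overflow page, if any

def append_tuple(primary, tup):
    """Place tup on the first page in the chain with room, adding a page if needed."""
    page = primary
    while len(page.tuples) >= PAGE_CAPACITY:
        if page.overflow is None:
            page.overflow = Page()   # chain a new overflow page
        page = page.overflow
    page.tuples.append(tup)

def chain_length(primary):
    """Count the primary page plus all overflow pages linked to it."""
    n, page = 0, primary
    while page is not None:
        n, page = n + 1, page.overflow
    return n

primary = Page()
for i in range(9):
    append_tuple(primary, ("name%d" % i, i))
print(chain_length(primary))   # 9 tuples at 4 per page occupy 3 pages
```

Every page in the chain has to be read to find all tuples that belong to the primary page,
which is why long chains slow down retrieval.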
        !           966: .sp 2
        !           967: SPECIFYING FREE SPACE
        !           968: .sp
        !           969: The modify command also lets you specify how much
        !           970: room to leave for the relation to grow.
        !           971: As was mentioned in "create",
        !           972: relations are divided into pages.
        !           973: A "fillfactor" can be used to specify how
        !           974: full to make each primary page.
        !           975: This decision should be based
        !           976: only on whether more tuples 
        !           977: will be appended to the relation.
        !           978: For example:
        !           979: .sp
        !           980: .ti 5
        !           981: .nf
        !           982: modify donation to isam on name where fillfactor = 100
        !           983: .fi
        !           984: .sp
        !           985: This tells modify to make each page 100% full
        !           986: if at all possible.
        !           987: .sp
        !           988: .ti 5
        !           989: .nf
        !           990: modify donation to isam on name where fillfactor = 25
        !           991: .fi
        !           992: .sp
        !           993: This will leave each page 25% full or, in other words,
        !           994: 75% empty.
        !           995: We would do this if we had roughly 1/4 of the
        !           996: data already loaded and it was fairly well distributed
        !           997: about the alphabet.
        !           998: .sp
        !           999: Keep in mind that if you don't specify the fillfactor,
        !          1000: INGRES will typically default to a reasonable choice.
        !          1001: Also when a page becomes full, INGRES
        !          1002: automatically creates an "overflow"
        !          1003: page so it is never the case that a relation
        !          1004: will be unable to expand.  
        !          1005: .sp
        !          1006: When modifying a relation
        !          1007: to hash, an additional
        !          1008: parameter "minpages" can
        !          1009: be specified.
        !          1010: Modify will guarantee
        !          1011: that at least "minpages" primary pages will be allocated
        !          1012: for the relation.
        !          1013: .sp
        !          1014: Modify computes how many primary pages will be
        !          1015: needed to store the existing tuples at 
        !          1016: the specified fillfactor
        !          1017: assuming that no overflow pages will be necessary originally.
        !          1018: If that number is less than
        !          1019: minpages, then minpages is used instead.
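The arithmetic can be sketched in Python as follows.
This is a rough model only: the function primary_pages, the parameter tuples_per_page,
and the rounding rules are assumptions made for illustration,
and the actual computation inside modify may differ in detail.

```python
import math

def primary_pages(num_tuples, tuples_per_page, fillfactor, minpages=1):
    """Primary pages needed to hold num_tuples at the given fillfactor (percent)."""
    # Tuples placed on each page at this fillfactor (at least one per page).
    per_page = max(1, tuples_per_page * fillfactor // 100)
    needed = math.ceil(num_tuples / per_page)
    # If the computed number is less than minpages, minpages is used instead.
    return max(needed, minpages)

print(primary_pages(1000, 20, 100))                # 50
print(primary_pages(1000, 20, 50))                 # 100
print(primary_pages(1000, 20, 50, minpages=150))   # 150
```

Halving the fillfactor doubles the number of primary pages,
and minpages only matters when it exceeds the computed number.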
        !          1020: .sp
        !          1021: For example:
        !          1022: .sp
        !          1023: .ti 5
        !          1024: .nf
        !          1025: modify donation to hash on name where fillfactor = 50,
        !          1026: .ti 10
        !          1027: minpages = 1
        !          1028: .sp 1
        !          1029: .ti 5
        !          1030: modify donation to hash on name where minpages = 150
        !          1031: .fi
        !          1032: .sp
        !          1033: In the first case we guarantee that no more
        !          1034: pages than are necessary will be
        !          1035: used for 50% occupancy.
        !          1036: The second case is typically
        !          1037: used for modifying an empty or near
        !          1038: empty relation.
        !          1039: If the approximate maximum
        !          1040: size of the relation is known in advance,
        !          1041: minpages
        !          1042: can be used to guarantee that the relation will
        !          1043: have its expected maximum size.
        !          1044: .sp
        !          1045: There is one other option available for hash called
        !          1046: "maxpages".
        !          1047: Its syntax is the same as minpages.
        !          1048: It can be used to specify the maximum
        !          1049: number of primary pages to use.
        !          1050: .sp
        !          1051: COMPRESSION
        !          1052: .sp 1
        !          1053: The three storage structures
        !          1054: (heap, hash, isam) can optionally
        !          1055: have "compression" applied
        !          1056: to them.
        !          1057: To do this, refer to the
        !          1058: storage structures as cheap, chash, and cisam.
        !          1059: Compression reduces
        !          1060: the amount of space needed to store each tuple
        !          1061: internally.
        !          1062: The current compression technique is to
        !          1063: suppress trailing blanks in 
        !          1064: character domains.
        !          1065: Using compression will never
        !          1066: require more space and typically
        !          1067: it can save disk space and improve
        !          1068: performance.
        !          1069: Here is an example:
        !          1070: .sp 1
        !          1071: .nf
        !          1072: .ti +5
        !          1073: modify donation to cisam on name where fillfactor = 100
        !          1074: .fi
        !          1075: .sp 1
        !          1076: This will make donation a compressed isam 
        !          1077: structure and fill every page as
        !          1078: full as possible.
        !          1079: With compression, each tuple
        !          1080: can have a different compressed
        !          1081: length.
        !          1082: Thus the number of tuples
        !          1083: that can fit on one page will
        !          1084: depend on how successfully
        !          1085: they can be compressed.
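A small Python sketch shows why compressed tuples vary in length.
It models only the suppression of trailing blanks described above;
the function compressed_length is hypothetical,
and any bookkeeping INGRES stores with a compressed tuple is deliberately ignored.

```python
def compressed_length(tup):
    """Length of a tuple once trailing blanks in character domains are dropped."""
    total = 0
    for value in tup:
        if isinstance(value, str):
            total += len(value.rstrip(" "))   # character domain: drop trailing blanks
        else:
            total += len(value)               # other domains are given as raw bytes
    return total

# One donation tuple: name = c10, amount = f4, ext = i2.
row = ("bill      ", bytes(4), bytes(2))
print(sum(len(v) for v in row))   # 16 bytes uncompressed
print(compressed_length(row))     # 10 bytes once the six trailing blanks are dropped
```

A tuple whose name fills all ten characters would not shrink at all,
so the number of tuples per page depends on the data.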
        !          1086: 
        !          1087: Compressed relations can be more expensive to update.
        !          1088: In particular if a replace is done on one or
        !          1089: more domains and the compressed tuple is no
        !          1090: longer the same length,
        !          1091: then INGRES must look for a new place to put the tuple.
        !          1092: .sp 2
        !          1093: TWO VARIATIONS ON A THEME
        !          1094: .sp
        !          1095: As mentioned, duplicates are not removed
        !          1096: from a relation stored
        !          1097: as a heap.
        !          1098: Frequently it is desirable
        !          1099: to remove duplicates and sort
        !          1100: a heap relation.
        !          1101: One way of doing this is to modify the
        !          1102: relation to isam specifying
        !          1103: the order in which to sort
        !          1104: the relation.
        !          1105: An alternative to this is to use either
        !          1106: "heapsort" or "cheapsort".
        !          1107: For example
        !          1108: .sp
        !          1109: .ti 5
        !          1110: .nf
        !          1111: modify donation to heapsort on name, ext
        !          1112: .fi
        !          1113: .sp
        !          1114: This will sort the relation by
        !          1115: name then ext.
        !          1116: The tuples are further sorted on the
        !          1117: remaining domains,
        !          1118: in the order they were listed in the
        !          1119: original create statement.
        !          1120: So in this case the relation will be
        !          1121: sorted on name then ext and then amount.
        !          1122: Duplicate tuples are always removed.
        !          1123: The relation will be left
        !          1124: as a heap.
        !          1125: Heapsort and cheapsort are intended
        !          1126: for sorting a temporary relation before printing and
        !          1127: destroying it.
        !          1128: Sorting this way is more efficient than modifying
        !          1129: to isam because
        !          1130: with isam INGRES creates a
        !          1131: "directory" containing
        !          1132: key information about each page.
        !          1133: The relation will NOT be kept sorted
        !          1134: when further updates occur.
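The ordering rule for heapsort can be illustrated with a short Python sketch.
The function heapsort_relation and the sample tuples are made up for illustration;
the sketch only mirrors the rule stated above and says nothing about how INGRES sorts internally.

```python
CREATE_ORDER = ["name", "amount", "ext"]   # domain order from the create statement

def heapsort_relation(tuples, sort_on):
    """Sort on the named domains, then on the remaining ones; drop duplicates."""
    rest = [d for d in CREATE_ORDER if d not in sort_on]
    order = [CREATE_ORDER.index(d) for d in list(sort_on) + rest]
    unique = set(tuples)   # duplicate tuples are always removed
    return sorted(unique, key=lambda t: tuple(t[i] for i in order))

donation = [("bill", 50.0, 12), ("ann", 10.0, 7),
            ("bill", 20.0, 12), ("ann", 10.0, 7)]
for row in heapsort_relation(donation, ["name", "ext"]):
    print(row)
```

The two tuples for "bill" tie on name and ext,
so the remaining domain, amount, decides their order.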
        !          1135: .sp
        !          1136: Examples:
        !          1137: .sp
        !          1138: .nr in 2n
        !          1139: Here is a collection of examples
        !          1140: and comments as to the efficiency of 
        !          1141: each query.
        !          1142: The queries are based on the
        !          1143: relations:
        !          1144: .(l
        !          1145: parts(pnum, pname, color, weight, qoh)
        !          1146: .br
        !          1147: supply(snum, pnum, jnum, shipdate, quan)
        !          1148: .sp 1
        !          1149: range of p is parts
        !          1150: .br
        !          1151: range of s is supply
        !          1152: .sp 1
        !          1153: modify parts to hash on pnum
        !          1154: .br
        !          1155: modify supply to hash on snum,jnum
        !          1156: .)l
        !          1157: .ti +5
        !          1158: .sp 1
        !          1159: retrieve (p.all) where p.pnum = 10
        !          1160: .sp 1
        !          1161: INGRES will recognize that parts is
        !          1162: hashed on pnum and go directly to the
        !          1163: page where parts with number 10 would be stored.
        !          1164: .sp 1
        !          1165: .ti +5
        !          1166: retrieve (p.all) where p.pname = "tape drive"
        !          1167: .sp 1
        !          1168: INGRES will read the entire relation
        !          1169: looking for matching pnames.
        !          1170: .sp 1
        !          1171: .ti +5
        !          1172: retrieve (p.all) where p.pnum < 10  and p.pnum > 5
        !          1173: .sp 1
        !          1174: INGRES will read the entire relation
        !          1175: because no exact value for pnum
        !          1176: was given.
        !          1177: .sp 1
        !          1178: .ti +5
        !          1179: retrieve (s.shipdate) where s.snum = 471 and s.jnum = 1008
        !          1180: .sp 1
        !          1181: INGRES will recognize that supply is hashed on the
        !          1182: combination of snum and jnum and will go directly
        !          1183: to the correct page.
        !          1184: .ti +5
        !          1185: .sp 1
        !          1186: retrieve (s.shipdate) where s.snum = 471
        !          1187: .sp 1
        !          1188: INGRES will read the entire
        !          1189: relation.
        !          1190: Supply is hashed on the
        !          1191: combination of snum and jnum.
        !          1192: Unless INGRES is given a unique
        !          1193: value for both, it cannot
        !          1194: take advantage of the storage
        !          1195: structure.
        !          1196: .sp 1
        !          1197: .ti +5
        !          1198: retrieve (p.pname, s.shipdate) where
        !          1199: .ti +5
        !          1200: .br
        !          1201: p.pnum = s.pnum and s.snum = 471 and s.jnum = 1008
        !          1202: .sp 1
        !          1203: INGRES will take advantage of both
        !          1204: storage structures.
        !          1205: It will first find all
        !          1206: s.pnum and s.shipdate
        !          1207: where s.snum = 471 and
        !          1208: s.jnum = 1008.
        !          1209: After that it will look for all 
        !          1210: p.pname where p.pnum is equal to
        !          1211: the correct value.
        !          1212: .sp 1
        !          1213: This example illustrates the idea that it is 
        !          1214: frequently a good idea to hash a
        !          1215: relation on the domains where it is
        !          1216: "joined" with another relation.
        !          1217: For example, in this
        !          1218: case it is very common to ask
        !          1219: for p.pnum = s.pnum.
        !          1220: .sp 1
        !          1221: To summarize:
        !          1222: .sp 1
        !          1223: To take advantage of a hash
        !          1224: structure,
        !          1225: INGRES needs an exact value
        !          1226: for each key domain.
        !          1227: An exact value is anything
        !          1228: such as:
        !          1229: .ti +5
        !          1230: .sp 1
        !          1231: s.snum = 471
        !          1232: .br
        !          1233: .ti +5
        !          1234: s.pnum = p.pnum
        !          1235: .sp 1
        !          1236: An exact value is not
        !          1237: .sp 1
        !          1238: .ti +5
        !          1239: s.snum >= 471
        !          1240: .br
        !          1241: .ti +5
        !          1242: (s.snum = 10 or s.snum = 20)
        !          1243: .sp 1
        !          1244: Now let's consider some
        !          1245: cases using isam
        !          1246: .sp 1
        !          1247: .in +5
        !          1248: modify supply to isam on snum,shipdate
        !          1249: .br
        !          1250: retrieve (s.all) where s.snum = 471
        !          1251: .br
        !          1252: and s.shipdate > "75-12-31"
        !          1253: .br
        !          1254: and s.shipdate < "77-01-01"
        !          1255: .sp 1
        !          1256: .in -5
        !          1257: Since supply is sorted first on snum and then
        !          1258: on shipdate, INGRES
        !          1259: can take full advantage of the
        !          1260: isam structure to locate the
        !          1261: portions of supply which satisfy
        !          1262: the query.
        !          1263: .sp 1
        !          1264: .ti +5
        !          1265: retrieve (s.all) where s.snum = 471
        !          1266: .sp 1
        !          1267: Unlike hash, an isam structure
        !          1268: can still be used if only the first key is
        !          1269: provided.
        !          1270: .sp 1
        !          1271: .ti +5
        !          1272: retrieve (s.all) where s.snum > 400 and s.snum < 500
        !          1273: .sp 1
        !          1274: Again INGRES will take advantage of the structure.
        !          1275: .sp 1
        !          1276: .ti +5
        !          1277: retrieve (s.all) where s.shipdate >= "75-12-31" and
        !          1278: .ti +5
        !          1279: s.shipdate <= "77-01-01"
        !          1280: .sp 1
        !          1281: Here INGRES will read the entire relation.
        !          1282: This is because the first key (snum) is not
        !          1283: provided in the query.
        !          1284: .sp 1
        !          1285: To summarize:
        !          1286: .sp 1
        !          1287: Isam can provide improved access
        !          1288: on either exact values or ranges of
        !          1289: values.
        !          1290: It is useful as long as at least
        !          1291: the first key is provided.
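The two summaries can be restated as a pair of Python predicates.
They are a simplified restatement of the rules in the text, not the INGRES query processor;
the function names and the way a qualification is represented
(a set of domains with exact values and a set of domains with range conditions) are assumptions.

```python
def hash_usable(keys, exact):
    """Hash helps only if every key domain has an exact value."""
    return all(k in exact for k in keys)

def isam_usable(keys, exact, ranged):
    """Isam helps if at least the first key has an exact value or a range."""
    return keys[0] in exact or keys[0] in ranged

# supply hashed on snum, jnum
print(hash_usable(["snum", "jnum"], {"snum", "jnum"}))       # True
print(hash_usable(["snum", "jnum"], {"snum"}))               # False

# supply isam on snum, shipdate
print(isam_usable(["snum", "shipdate"], {"snum"}, set()))    # True
print(isam_usable(["snum", "shipdate"], set(), {"snum"}))    # True
print(isam_usable(["snum", "shipdate"], set(), {"shipdate"}))  # False
```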
        !          1292: .sp 1
        !          1293: To locate where the tuples are
        !          1294: in an isam relation,
        !          1295: INGRES searches the isam directory for that
        !          1296: relation.
        !          1297: When a relation is modified to isam,
        !          1298: the tuples are first sorted and duplicates
        !          1299: are removed.
        !          1300: Next, the relation is
        !          1301: filled (according to the fillfactor) starting
        !          1302: at page 0, 1, 2... for as many
        !          1303: pages as are needed.
        !          1304: .sp 1
        !          1305: Now the directory is built.
        !          1306: The key domains from the first
        !          1307: tuple on each page are collected and
        !          1308: organized into a directory (stored in the relation
        !          1309: on disk).
        !          1310: The directory is never changed
        !          1311: until the next time a modify is done.
        !          1312: .sp 1
        !          1313: Whenever a tuple is added to the relation,
        !          1314: the directory is searched to find
        !          1315: which page the new tuple belongs on.
        !          1316: Within that page, the individual
        !          1317: tuples are NOT kept sorted.
        !          1318: This is what is meant by "approximately" sorted.
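The directory search can be sketched in Python with a binary search over the first key of each page.
This is a simplified model: the functions build_directory and find_page are hypothetical,
each page is reduced to a list of names, keys are assumed unique, and overflow pages are ignored.

```python
import bisect

def build_directory(pages):
    """Collect the key of the first tuple on each primary page."""
    return [page[0] for page in pages]

def find_page(directory, key):
    """Index of the last page whose first key is not greater than key."""
    return max(0, bisect.bisect_right(directory, key) - 1)

pages = [["adams", "allen"], ["baker", "bill"], ["clark", "davis"],
         ["evans", "ford"], ["green", "hall"], ["irwin", "jones"]]
directory = build_directory(pages)

print(find_page(directory, "bill"))    # exact value: only page 1 is read

# Range d.name >= "b" and d.name < "g": read from the page that could hold "b"
# through the page that could hold names just below "g".
first = find_page(directory, "b")
last = find_page(directory, "g")
print(list(range(first, last + 1)))    # pages 0 through 3 of the 6 pages
```

Because the directory is fixed until the next modify,
a newly appended name is routed to a page by the same search and simply added to that page,
which is why tuples within a page are not kept in order.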
        !          1319: .sp 2
        !          1320: .nr in 0
        !          1321: HEAP v. HASH v. ISAM
        !          1322: .sp 1
        !          1323: Let's now compare the relative advantages and disadvantages
        !          1324: of each option.
        !          1325: A relation is always created as a heap.
        !          1326: A heap is the most efficient
        !          1327: structure to use to initially
        !          1328: fill a relation using copy or append.
        !          1329: .sp 1
        !          1330: Space from deleted tuples of a heap
        !          1331: is only reused on the last page.
        !          1332: No duplicate checking is done on
        !          1333: a heap relation.
        !          1334: .sp 1
        !          1335: Hash is advantageous for locating tuples
        !          1336: referenced in a qualification by an exact
        !          1337: value.
        !          1338: The primary page for tuples with a specific
        !          1339: value can be easily computed.
        !          1340: .sp 1
        !          1341: Isam is useful for both exact values and ranges of values.
        !          1342: Since the isam directory must be searched to
        !          1343: locate tuples, it is never as efficient as hash.
        !          1344: .sp 2
        !          1345: OVERFLOW PAGES
        !          1346: .sp 1
        !          1347: When a tuple is to be inserted
        !          1348: and there is no more room on the
        !          1349: primary page of a relation, then an
        !          1350: overflow page is created.
        !          1351: As more tuples are inserted, additional overflow
        !          1352: pages are added as needed.
        !          1353: Overflow pages, while necessary, decrease
        !          1354: the system performance for
        !          1355: retrieves and updates.
        !          1356: .sp 1
        !          1357: For example, let's suppose that supply
        !          1358: is hashed on snum and has 10 primary pages.
        !          1359: Suppose the value snum = 3 falls on page 7.
        !          1360: To find all snum = 3 requires INGRES to search
        !          1361: primary page 7 and all overflow pages of page 7
        !          1362: (if any).
        !          1363: As more overflow pages are added the time
        !          1364: needed to search for
        !          1365: snum = 3 will increase.
        !          1366: Since duplicates are removed on isam and hash,
        !          1367: this search must be performed on appends and
        !          1368: replaces also.
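The cost described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the page capacity, the modulus used as a hash function, and the names primary_page, insert and lookup are all hypothetical and are not taken from INGRES. With this stand-in hash function snum = 3 lands on page 3 rather than page 7, which does not change the point.

```python
# Hypothetical model of a hashed relation: each primary page heads a chain
# of pages, and every page after the first in a chain is an overflow page.
PAGE_CAPACITY = 4      # tuples per page (assumed, kept tiny for the demo)
PRIMARY_PAGES = 10     # as in the supply example


def primary_page(snum):
    """Map a key to its primary page; a stand-in for the real hash function."""
    return snum % PRIMARY_PAGES


def insert(relation, tup):
    """Add a tuple to its chain, creating an overflow page if the last page is full."""
    chain = relation[primary_page(tup[0])]
    # Duplicates are removed on hash, so the whole chain is searched first.
    if tup in (t for page in chain for t in page):
        return
    if len(chain[-1]) == PAGE_CAPACITY:
        chain.append([])  # new overflow page
    chain[-1].append(tup)


def lookup(relation, snum):
    """Return the matching tuples and the number of pages that had to be read."""
    chain = relation[primary_page(snum)]
    matches = [t for page in chain for t in page if t[0] == snum]
    return matches, len(chain)


# Every chain starts as a single empty primary page.
supply = {p: [[]] for p in range(PRIMARY_PAGES)}
for jnum in range(9):
    insert(supply, (3, jnum))      # nine tuples with snum = 3

found, pages_read = lookup(supply, 3)
print(len(found), pages_read)
```

In this model every lookup and every insert for snum = 3 reads the primary page and each of its overflow pages, so the work grows with the length of the chain.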
        !          1369: .sp 1
        !          1370: When a hash or isam relation has too many overflow pages
        !          1371: it should be remodified to hash
        !          1372: or isam again.
        !          1373: This will clear up the relation
        !          1374: and eliminate as many overflow pages as possible.
        !          1375: .sp 2
        !          1376: UNIQUE KEYS
        !          1377: .sp 1
        !          1378: When choosing key domains for a relation
        !          1379: it is desirable to have each set of
        !          1380: key domains
        !          1381: as unique as possible.
        !          1382: For example, employee id numbers  
        !          1383: typically have no
        !          1384: duplicate values, while
        !          1385: something like color
        !          1386: is likely to have only a few distinct
        !          1387: values, and something like
        !          1388: sex, to the best of our knowledge, has only two
        !          1389: values.
        !          1390: .sp 1
        !          1391: If a relation is hashed on domain sex then you can expect to have all
        !          1392: males on one primary page and all its
        !          1393: overflow pages and a corresponding
        !          1394: situation with females.
        !          1395: With a hash relation there is no solution to this
        !          1396: problem.
        !          1397: A trade-off must be made between the
        !          1398: most desirable key domains to use in a
        !          1399: qualification versus the uniqueness of the
        !          1400: key values.
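The effect of key uniqueness on overflow chains can be illustrated with a small Python sketch. The page capacity, the number of primary pages, the integer coding of the keys and the modulus used as a hash function are all assumptions made for the illustration; none of them comes from INGRES.

```python
from collections import Counter

PRIMARY_PAGES = 10   # assumed number of primary pages
PAGE_CAPACITY = 4    # assumed tuples per page


def chain_lengths(keys):
    """Pages (primary plus overflow) needed by each hash chain that holds tuples."""
    tuples_per_chain = Counter(key % PRIMARY_PAGES for key in keys)
    # Ceiling division: a chain with n tuples needs ceil(n / PAGE_CAPACITY) pages.
    return {page: -(-n // PAGE_CAPACITY) for page, n in tuples_per_chain.items()}


employee_ids = list(range(40))   # 40 tuples, every key distinct
sex_codes = [0, 1] * 20          # 40 tuples, only two key values

by_id = chain_lengths(employee_ids)
by_sex = chain_lengths(sex_codes)
print(max(by_id.values()), max(by_sex.values()))
```

Under these assumptions the unique keys spread over all ten primary pages with no overflow, while the two-valued key fills only two chains of five pages each and leaves the other eight primary pages empty.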
        !          1401: .sp 1
        !          1402: Since isam structure can be used if at least
        !          1403: the first key is provided, extra
        !          1404: key domains can sometimes be added to increase uniqueness.
        !          1405: For example, suppose the supply
        !          1406: relation has only 10 unique supplier numbers
        !          1407: but thousands of tuples.
        !          1408: Choosing an isam structure with the keys snum and jnum
        !          1409: will probably give many more unique keys.
        !          1410: However, the directory size will
        !          1411: be larger and consequently it will
        !          1412: take longer to search.
        !          1413: When providing additional keys
        !          1414: just for the sake of increasing
        !          1415: uniqueness,
        !          1416: try to use the smallest possible domains.
        !          1417: .sp 2
        !          1418: SYSTEM RELATIONS
        !          1419: .sp 1
        !          1420: INGRES uses three relations
        !          1421: ("relation", "attribute", and "indexes") to maintain
        !          1422: and organize a data base.
        !          1423: The "relation" relation has one tuple for
        !          1424: each relation in the data base.
        !          1425: The "attribute" relation has one tuple
        !          1426: for each attribute in each
        !          1427: relation.
        !          1428: The "indexes" relation
        !          1429: has one tuple for each secondary
        !          1430: index.
        !          1431: .sp 1
        !          1432: INGRES accesses these relations
        !          1433: in a very well defined manner.
        !          1434: A program called "sysmod" should be used
        !          1435: to modify these relations to hash on the
        !          1436: appropriate domains.
        !          1437: To use sysmod the data base
        !          1438: administrator types
        !          1439: .sp 1
        !          1440: % sysmod data-base-name
        !          1441: .sp 1
        !          1442: Sysmod should be run
        !          1443: initially after the data base is created and subsequently
        !          1444: as relations are created and the data
        !          1445: base grows.
        !          1446: It is insufficient to run
        !          1447: sysmod only once and forget about it.
        !          1448: Rerunning sysmod will cause the
        !          1449: system relations to be remodified.
        !          1450: This will typically remove
        !          1451: most overflow pages and improve
        !          1452: system response time
        !          1453: for everything.
        !          1454: .bp
        !          1455: 5.  SECONDARY INDICES
        !          1456: .sp 1
        !          1457: Using an isam or hash structure
        !          1458: provides a fast way to find
        !          1459: tuples in a relation given values for the key
        !          1460: domains.
        !          1461: Sometimes this is not enough.
        !          1462: For example, suppose we have
        !          1463: the donation relation
        !          1464: .sp 1
        !          1465: .ti +5
        !          1466: donation(name, amount, ext)
        !          1467: .sp 1
        !          1468: hashed on name.
        !          1469: This will provide fast access
        !          1470: to queries where the qualification has
        !          1471: an exact value for name.
        !          1472: What if we also will be doing
        !          1473: queries giving exact values for ext?
        !          1474: .sp 1
        !          1475: Donation can be hashed either on name
        !          1476: or ext, so we would have to choose which is more common
        !          1477: and hash donation on that domain.
        !          1478: The other domain (say ext) can have
        !          1479: a secondary index.
        !          1480: A secondary index is a relation which contains
        !          1481: each "ext" together with the exact
        !          1482: location of where the tuple is in the relation
        !          1483: donation.
        !          1484: .sp 1
        !          1485: The command to create a secondary
        !          1486: index is:
        !          1487: .sp 1
        !          1488: .ti +5
        !          1489: index on donation is donext (ext)
        !          1490: .sp 1
        !          1491: The general format is:
        !          1492: .sp 1
        !          1493: .ti +5
        !          1494: index on relation_name is secondary_index_name (domains)
        !          1495: .sp 1
        !          1496: Here we are asking INGRES
        !          1497: to create a secondary index on the relation
        !          1498: donation.
        !          1499: The domain being indexed is "ext".
        !          1500: Indices are formed in three steps:
        !          1501: .sp 1
        !          1502: .in +4
        !          1503: .ti -5
        !          1504: 1.  "Donext" is created as a heap.
        !          1505: .br
        !          1506: .ti -5
        !          1507: 2.  For each
        !          1508: tuple in donation, a tuple is inserted
        !          1509: in "donext" with the value for ext and the
        !          1510: exact location of the corresponding tuple in
        !          1511: donation.
        !          1512: .br
        !          1513: .ti -5
        !          1514: 3.  By default "donext" is modified to isam.
        !          1515: .in -4
        !          1516: .sp 1
        !          1517: Now if you run the query
        !          1518: .sp 1
        !          1519: .ti +5
        !          1520: range of d is donation
        !          1521: .ti +5
        !          1522: retrieve(d.amount) where d.ext = 207
        !          1523: .sp 1
        !          1524: INGRES will automatically look first in
        !          1525: "donext" to find ext = 207.
        !          1526: When it finds one it then goes directly
        !          1527: to the tuple in the donation relation.
        !          1528: Since "donext" is isam on ext, search for
        !          1529: ext = 207 can typically be
        !          1530: done rapidly.
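The three steps and the lookup just described can be sketched in Python. The sample tuples, the page layout, the (page, slot) form of a tuple location and the function names are hypothetical; a sorted list searched with bisect stands in for the isam structure of "donext".

```python
import bisect

# The donation relation as a list of pages; a tuple's location is (page, slot).
# The data and the page layout are made up for illustration.
donation = [
    [("frank", 10.0, 207), ("alice", 2.5, 311)],
    [("bob", 50.0, 207), ("carol", 4.0, 120)],
]
AMOUNT = 1  # position of the amount domain within a donation tuple
EXT = 2     # position of the ext domain within a donation tuple


def build_index(pages, domain):
    """Collect (value, location) pairs for every tuple, then sort them by value."""
    entries = [(tup[domain], (page_no, slot))
               for page_no, page in enumerate(pages)
               for slot, tup in enumerate(page)]
    entries.sort()
    return entries


def retrieve_amounts(index, pages, ext):
    """Find ext in the index first, then go directly to each tuple in donation."""
    i = bisect.bisect_left(index, (ext,))
    amounts = []
    while i < len(index) and index[i][0] == ext:
        page_no, slot = index[i][1]
        amounts.append(pages[page_no][slot][AMOUNT])
        i += 1
    return amounts


donext = build_index(donation, EXT)
print(retrieve_amounts(donext, donation, 207))
```

Only the index entries for ext = 207 and the tuples they point to are touched; the rest of donation is never read.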
        !          1531: .sp 1
        !          1532: If you run the query
        !          1533: .sp 1
        !          1534: .ti 5
        !          1535: retrieve(d.amount) where d.name = "frank"
        !          1536: .sp 1
        !          1537: then INGRES will continue to use the hash
        !          1538: structure of the relation "donation"
        !          1539: to locate the qualifying tuples.
        !          1540: .sp 1
        !          1541: Since secondary indices are themselves relations,
        !          1542: they also can be either hash, isam, chash or cisam.
        !          1543: It never makes sense to make a secondary index a heap.
        !          1544: .sp 1
        !          1545: The decision as to which structure to give
        !          1546: them involves the same issues
        !          1547: as were discussed before:
        !          1548: .sp 1
        !          1549: Will the domains be referenced by exact value?
        !          1550: .br
        !          1551: Will they be referenced by ranges of value?
        !          1552: .br
        !          1553: etc.
        !          1554: .sp 1
        !          1555: In this case the "ext" domain
        !          1556: will be referenced by exact values, and
        !          1557: since the relation is nearly full we will do:
        !          1558: .sp 1
        !          1559: .ti +5
        !          1560: modify donext to hash on ext where fillfactor = 100
        !          1561: .ti +5
        !          1562: and minpages = 1
        !          1563: .sp 1
        !          1564: Secondary indices provide a way for INGRES
        !          1565: to access tuples based on domains
        !          1566: that are not key domains.
        !          1567: A relation can have any number of secondary
        !          1568: indices and in addition
        !          1569: each secondary index can be an index
        !          1570: on up to six domains of the primary relation.
        !          1571: .sp 1
        !          1572: Whenever a tuple is replaced, deleted
        !          1573: or appended to a primary relation,
        !          1574: all secondary indices must
        !          1575: also be updated.  
        !          1576: Thus secondary indices
        !          1577: are "not free". 
        !          1578: They increase
        !          1579: the cost of updating the
        !          1580: primary relation, but
        !          1581: can decrease the cost of finding tuples
        !          1582: in the primary relation.
        !          1583: .sp 1
        !          1584: Whether a secondary index will improve
        !          1585: performance or not strongly
        !          1586: depends on the uniqueness of the
        !          1587: values of the domains being
        !          1588: indexed.
        !          1589: The primary concern is whether searching
        !          1590: through the secondary index is
        !          1591: more efficient than simply
        !          1592: reading the entire primary relation.
        !          1593: In general it is if the
        !          1594: number of tuples which satisfy the
        !          1595: qualification is less than the total number of pages
        !          1596: (both primary and overflow) in the primary
        !          1597: relation.
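This rule of thumb is simple enough to write down directly. The Python function below is only a restatement of the sentence above; its name and the sample sizes are made up, and it ignores the cost of reading the secondary index itself.

```python
def index_probably_helps(qualifying_tuples, primary_pages, overflow_pages):
    """Compare the worst case of one page fetch per qualifying tuple
    with a sequential read of every page in the primary relation."""
    return qualifying_tuples < primary_pages + overflow_pages


# Made-up sizes: a relation with 40 primary and 10 overflow pages.
print(index_probably_helps(3, 40, 10))
print(index_probably_helps(200, 40, 10))
```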
        !          1598: .sp 1
        !          1599: For example if we frequently want to find
        !          1600: all people who donated less than
        !          1601: five dollars, consider creating
        !          1602: .sp 1
        !          1603: .ti +5
        !          1604: index on donation is donamount (amount)
        !          1605: .sp 1
        !          1606: By default donamount will be isam
        !          1607: on amount.
        !          1608: If INGRES processes the query:
        !          1609: .sp 1
        !          1610: .ti +5
        !          1611: retrieve(d.name) where d.amount < 5.0
        !          1612: .sp 1
        !          1613: it will locate d.amount < 5.0 in the secondary
        !          1614: index and for each tuple it
        !          1615: finds will fetch the corresponding
        !          1616: tuple in donation.
        !          1617: The tuples in donamount are sorted by
        !          1618: amount but the tuples
        !          1619: in donation are not.
        !          1620: Thus in general each tuple fetch from
        !          1621: donation via donamount will be on a
        !          1622: different page.
        !          1623: Retrieval using the secondary index can then cause more page
        !          1624: reads than simply reading all of donation sequentially!
        !          1625: So in this example it would
        !          1626: be a bad idea to create the secondary
        !          1627: index.
        !          1628: .bp
        !          1629: 6.  RECOVERY AND DATA UPDATE
        !          1630: .sp 1
        !          1631: INGRES has been carefully designed
        !          1632: to protect the integrity of a data base
        !          1633: against certain classes
        !          1634: of system failures.
        !          1635: To do this INGRES
        !          1636: processes changes to a relation
        !          1637: using what we call "deferred
        !          1638: update" or "batch file update".
        !          1639: In addition there are two INGRES
        !          1640: programs "restore" and "purge" that can be used to check
        !          1641: out a data base after a system failure.
        !          1642: We will first discuss how deferred updates are created
        !          1643: and processed, and second we will discuss
        !          1644: the use of purge and restore.
        !          1645: .sp 1
        !          1646: DEFERRED UPDATE (Batch update)
        !          1647: .in +4
        !          1648: .sp 1
        !          1649: .ti -5
        !          1650: An append, replace or delete command is run in four steps:
        !          1651: .sp 1
        !          1652: .ti -5
        !          1653: 1.  An empty batch file is created.
        !          1654: .ti -5
        !          1655: 2.  The command is run to completion
        !          1656: and each change to the result relation is written into
        !          1657: the batch file.
        !          1658: .ti -5
        !          1659: 3.  The batch file is read and the
        !          1660: relation and its secondary indices (if any)
        !          1661: are actually updated.
        !          1662: .ti -5
        !          1663: 4.  The batch file is destroyed and INGRES
        !          1664: returns to the user.
        !          1665: .sp 1
        !          1666: .in -4
        !          1667: Deferred update defers all actual
        !          1668: updating until the very end of
        !          1669: the query.
        !          1670: There are three advantages to doing this.
        !          1671: .sp 1
        !          1672: 1.  Provides recovery from system failures
        !          1673: .sp 1
        !          1674: If the system "crashes" during an update,
        !          1675: the INGRES recovery program will decide to either
        !          1676: run the update to completion or else
        !          1677: "back out" the update, leaving the
        !          1678: relation as it looked before the update
        !          1679: was started.
        !          1680: .sp 1
        !          1681: 2.  Prevents infinite queries
        !          1682: .sp 1
        !          1683: If "donation" were a heap and the query
        !          1684: .sp 1
        !          1685: .ti +4
        !          1686: range of d is donation
        !          1687: .ti +4
        !          1688: append to donation(d.all)
        !          1689: .sp 1
        !          1690: were run without deferred update,
        !          1691: it would terminate only when it ran
        !          1692: out of space on disk!
        !          1693: This is because INGRES would start reading the
        !          1694: relation from the beginning and
        !          1695: appending each tuple at the end.
        !          1696: It would soon start reading the tuples it
        !          1697: had just previously appended and
        !          1698: continue indefinitely to
        !          1699: "chase its tail".
        !          1700: .sp 1
        !          1701: While this query is certainly not
        !          1702: typical, it illustrates the point.
        !          1703: There are certain classes of queries
        !          1704: where problems occur if WHEN
        !          1705: an update actually occurs
        !          1706: is not precisely defined.
        !          1707: With deferred update we can
        !          1708: guarantee consistent and logical
        !          1709: results.
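The "chase its tail" behaviour can be imitated in Python with a list standing in for a heap relation. Both function names are hypothetical, and the limit argument is an assumption that plays the role of running out of disk space so that the direct version terminates.

```python
def append_all_direct(relation, limit):
    """Direct update: append while scanning, so the scan never reaches the end.
    The limit stands in for the disk filling up."""
    i = 0
    while i < len(relation) and len(relation) < limit:
        relation.append(relation[i])   # the scan will later read this tuple too
        i += 1
    return relation


def append_all_deferred(relation):
    """Deferred update: run the query to completion first, then apply the batch."""
    batch = list(relation)    # each change is written to the batch
    relation.extend(batch)    # only afterwards is the relation updated
    return relation


print(len(append_all_direct([1, 2, 3], limit=20)))
print(append_all_deferred([1, 2, 3]))
```

The direct version stops only because of the artificial limit, while the deferred version appends each original tuple exactly once.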
        !          1710: .sp 1
        !          1711: 3.  Speeds up processing of secondary indices
        !          1712: .sp 1
        !          1713: Secondary indices can be updated
        !          1714: faster if they are done one at a time
        !          1715: instead of all at once.
        !          1716: It also ensures protection against
        !          1717: the secondary index becoming inconsistent
        !          1718: with its primary relation.
        !          1719: .sp 1
        !          1720: TURNING DEFERRED UPDATE OFF
        !          1721: .sp 1
        !          1722: If you are not persuaded by any of
        !          1723: these arguments, INGRES
        !          1724: allows you to turn deferred update off!
        !          1725: Indeed there are certain cases when
        !          1726: it is appropriate (although
        !          1727: certainly not essential) to perform
        !          1728: updates directly, that is, the relation is updated
        !          1729: while the query is being processed.
        !          1730: .sp 1
        !          1731: To use direct update, you must be given
        !          1732: permission by the INGRES
        !          1733: super user.
        !          1734: Then when invoking INGRES
        !          1735: specify the "-b" flag which turns
        !          1736: off batch update.
        !          1737: .sp 1
        !          1738: .ti +4
        !          1739: % ingres mydate -b
        !          1740: .sp 1
        !          1741: INGRES will use direct update on any relation without
        !          1742: secondary indices.  
        !          1743: It will still silently use
        !          1744: deferred update if a relation
        !          1745: has any secondary indices.
        !          1746: By using the "-b" flag you are
        !          1747: sacrificing points 1 and 2 above.
        !          1748: In most cases you SHOULD NOT
        !          1749: use the -b flag.
        !          1750: .sp 1
        !          1751: If you are using INGRES
        !          1752: to interactively enter
        !          1753: or change one tuple at
        !          1754: a time, it is slightly
        !          1755: more efficient to have deferred
        !          1756: update turned off.
        !          1757: If the system crashes during an
        !          1758: update the person entering the data
        !          1759: will be aware of the situation
        !          1760: and can check whether the tuple
        !          1761: was updated or not.
        !          1762: .sp 1
        !          1763: RESTORE
        !          1764: .sp 1
        !          1765: INGRES is designed to recover
        !          1766: from the common types of system
        !          1767: crashes which leave the Unix file
        !          1768: system intact.
        !          1769: It can recover from updates, creates,
        !          1770: destroys, modifies and index commands.
        !          1771: .sp 1
        !          1772: INGRES is designed to "fail safe".
        !          1773: If any inconsistencies are
        !          1774: discovered or any failures
        !          1775: are returned from Unix,
        !          1776: INGRES will generate a system error
        !          1777: message (SYSERR) and exit.
        !          1778: .sp 1
        !          1779: Whenever Unix crashes while INGRES
        !          1780: is running or whenever an INGRES
        !          1781: syserr occurs, it is
        !          1782: generally a good idea to have the data
        !          1783: base administrator run the command
        !          1784: .sp 1
        !          1785: .ti +5
        !          1786: % restore data_base_name
        !          1787: .sp 1
        !          1788: The restore program performs the
        !          1789: following functions:
        !          1790: .in +4
        !          1791: .sp 1
        !          1792: .ti -5
        !          1793: 1.  Looks for batch update files.
        !          1794: If any are found, it examines each
        !          1795: one to see if it is complete.
        !          1796: If the system crash occurred while
        !          1797: the batch file was being read
        !          1798: and the data base being updated,
        !          1799: then restore will complete
        !          1800: the update.
        !          1801: Otherwise the batch file was not
        !          1802: completed and it is simply destroyed;
        !          1803: the effect is as though the query had never been run.
        !          1804: .sp 1
        !          1805: .ti -5
        !          1806: 2.  Checks for uncompleted modify commands.
        !          1807: This step is crucial.
        !          1808: It guarantees that you will either have the
        !          1809: relation as it existed before
        !          1810: the modify, or restore will complete
        !          1811: the modify command.
        !          1812: Modify works by creating a new copy
        !          1813: of the relation in the new structure.
        !          1814: Then when it is ready to replace the old
        !          1815: relation, it stores the new information in a
        !          1816: "modify batch file".  
        !          1817: This enables restore to determine the state of
        !          1818: uncompleted modifies.
        !          1819: .sp 1
        !          1820: .ti -5
        !          1821: 3.  Checks consistency of system
        !          1822: relations.
        !          1823: This check is used to complete "destroy"
        !          1824: commands, back out "create" commands,
        !          1825: and back out or complete "index"
        !          1826: commands that were interrupted by a
        !          1827: system crash.
        !          1828: .sp 1
        !          1829: .ti -5
        !          1830: 4.  Purges temporary relations and files.
        !          1831: Restore executes the "purge" program to
        !          1832: remove temporary relations and temporary
        !          1833: files created by the system.
        !          1834: Purge will be discussed in more detail a bit later.
        !          1835: .in -4
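The decision made in step 1 can be sketched as follows. The layout of a batch file is not described in this paper, so the dictionary used here, with a "complete" flag, a list of "changes" and a count of changes already "applied", is purely an assumption made for illustration, as is the function name.

```python
def restore_batch(relation, batch):
    """Return the relation as restore would leave it, under the assumed layout."""
    if not batch["complete"]:
        # The crash came while the query was still writing the batch file:
        # the file is destroyed, as though the query had never been run.
        return list(relation)
    # The crash came while the batch file was being read and the relation
    # updated: the remaining changes are applied to complete the update.
    remaining = batch["changes"][batch["applied"]:]
    return list(relation) + remaining


# Crash after one of three appends had been applied to the relation.
half_done = {"complete": True, "changes": ["t1", "t2", "t3"], "applied": 1}
print(restore_batch(["t0", "t1"], half_done))

# Crash while the batch file was still being written.
unfinished = {"complete": False, "changes": ["t1"], "applied": 0}
print(restore_batch(["t0"], unfinished))
```

Either way the relation ends up in one of the two states the text promises: the update has been run to completion, or it has not been run at all.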
        !          1836: .sp 1
        !          1837: Restore cannot tell the user which queries have run and
        !          1838: which have not.
        !          1839: It can only identify those queries which were in the
        !          1840: process of being run when the crash occurred.
        !          1841: When batching queries together,
        !          1842: it is a good idea to save the output in a file.
        !          1843: By having the monitor print out each query or set of
        !          1844: queries,
        !          1845: the user can later identify which queries were run.
        !          1846: .sp 1
        !          1847: Restore has several options to increase its
        !          1848: usability.
        !          1849: They are specified by "flags".
        !          1850: The options include:
        !          1851: .sp 1
        !          1852: .in +4
        !          1853: .nf
        !          1854: -a            ask before doing anything
        !          1855: -f            passed to purge. used to remove temporary files.
        !          1856: -p            passed to purge.  used to destroy expired
        !          1857:               relations.
        !          1858: no database   restores all data bases for which you are the
        !          1859:               dba.
        !          1860: .fi
        !          1861: .in -4
        !          1862: .sp 1
        !          1863: Of these options the "-a" is the most
        !          1864: important.
        !          1865: It can happen that a Unix crash can cause a page of
        !          1866: the system catalogues to be
        !          1867: incorrect.
        !          1868: This might cause restore to destroy
        !          1869: a relation.
        !          1870: In fact, you might want
        !          1871: to "patch" the system relations to correct
        !          1872: the problem.
        !          1873: No restore program can account
        !          1874: for all possibilities.
        !          1875: It is therefore no replacement
        !          1876: (fortunately) for a human.
        !          1877: .sp 1
        !          1878: If "-a" is specified, restore
        !          1879: will state what it wants to do and then ask
        !          1880: for permission.
        !          1881: It reads standard input and
        !          1882: accepts "y" to mean go ahead and anything
        !          1883: else to mean no.
        !          1884: For example, to have restore ask you before
        !          1885: doing anything
        !          1886: .sp 1
        !          1887: .ti +5
        !          1888: restore -a mydatabase
        !          1889: .sp 1
        !          1890: To have it take "no" for all its questions
        !          1891: .sp 1
        !          1892: .ti +5
        !          1893: restore -a mydatabase </dev/null
        !          1894: .sp 1
        !          1895: Using the -a flag,
        !          1896: restore might ask for permission
        !          1897: to perform some cleanup;
        !          1898: for example,
        !          1899: if it finds an attribute for which there
        !          1900: is no corresponding relation,
        !          1901: or if it finds a secondary index for which
        !          1902: there is no primary relation,
        !          1903: etc.
        !          1904: .sp 1
        !          1905: To date, we have never had a system
        !          1906: crash which INGRES
        !          1907: could not recover from.
        !          1908: This does not mean that it will never happen, but
        !          1909: rather that it shouldn't
        !          1910: be too great 
        !          1911: a concern for you.
        !          1912: It should be mentioned that restore is not
        !          1913: a substitute for periodic
        !          1914: backups, nor does it
        !          1915: ever perform such a function.
        !          1916: .sp 1
        !          1917: PURGE
        !          1918: .sp 1
        !          1919: Purge can be used to report expired relations,
        !          1920: destroy temporary system relations,
        !          1921: remove extraneous files,
        !          1922: and destroy expired relations.
        !          1923: To use purge you must be the DBA
        !          1924: for the data base.
        !          1925: .sp 1
        !          1926: .ti +5
        !          1927: % purge mydatabase
        !          1928: .sp 1
        !          1929: Purge has several options,
        !          1930: specified by flags, that are
        !          1931: worth noting:
        !          1932: .nr in 4n
        !          1933: .sp 1
        !          1934: .nf
        !          1935: -f   (default is off) remove all extraneous files.
        !          1936:      Each file is reported and then removed.  If "-f"
        !          1937:      is not specified then the file is only reported.
        !          1938: .sp 1
        !          1939: -p   (default is off) destroy all expired relations.
        !          1940:      Each expired relation is reported and if "-p"
        !          1941:      was specified the relation is destroyed.
        !          1942: .fi
        !          1943: .nr in 0
        !          1944: .sp 1
        !          1945: Purge always destroys relations and files
        !          1946: which are known to be INGRES
        !          1947: system temporaries.
        !          1948: When processing multi-variable
        !          1949: queries and queries with aggregate functions, 
        !          1950: INGRES will usually create temporary relations
        !          1951: with intermediate results.
        !          1952: These relations always begin with the
        !          1953: characters "_SYS".  
        !          1954: Other INGRES commands create temporary files which also
        !          1955: begin with "_SYS".
        !          1956: Under normal processing they are
        !          1957: always destroyed. 
        !          1958: If a system crash occurs, they might be left.
        !          1959: Purge will always clean up the temporary
        !          1960: system files.
        !          1961: It cleans up the user relations only
        !          1962: when specifically asked to.