|
|
1.1 root 1: .ds HE 'DATA BASE'INGRES'Page %'
2: .so nmacs
3: .hx
4: .rs
5: .sp 8
6: .ce
7: CREATING AND MAINTAINING A DATABASE USING INGRES
8: .sp 15
9: .ce
10: by
11: .ce
12: Robert Epstein
13: .sp 18
14: .ce 2
15: Memorandum No. ERL - M77-71
16: December 16, 1977
17: .sp 3
18: .ce 4
19: Electronics Research Laboratory
20: College of Engineering
21: University of California, Berkeley
22: 94720
23: .bp 1
24: .ce
25: .bl + 5
26: CREATING AND MAINTAINING A DATABASE USING INGRES
27: .sp 3
28: 1. INTRODUCTION
29: .sp
30: In this paper we describe how to create,
31: structure and maintain relations
32: in INGRES.
33: It is assumed that the reader
34: is familiar with INGRES
35: and understands QUEL, the INGRES
36: query language.
37: It is strongly suggested that the
38: document "A Tutorial on INGRES"
39: (ERL M77/25) be read first.
40: .sp 1
41: This paper is divided into six sections
42: .in +5
43: .sp 1
44: 1. Introduction
45: .sp 1
46: 2. Creating a Relation
47: .sp 1
48: 3. Using Copy
49: .sp 1
50: 4. Storage Structures
51: .sp 1
52: 5. Secondary Indices
53: .sp 1
54: 6. Recovery and Data Update
55: .in -5
56: .sp 1
57: To create a new data base you must be a valid
58: INGRES user and have "create data base"
59: permission.
60: These permissions are granted by the
61: "ingres" superuser.
62: If you pass those two requirements you can create
63: a data base using the command to the
64: Unix shell:
65: .sp 1
66: % creatdb mydata
67: .sp 1
68: where "mydata" is the name of the data base.
69: You become the "data base administrator" (DBA) for
70: mydata.
71: As the DBA you have certain special
72: powers.
73:
74: .in +4
75: .ti -5
76: 1. Any relation created by you can be accessed
77: by anyone else using "mydata".
78: If any other user creates a relation it is
79: strictly private and cannot
80: be accessed by the DBA or any other user.
81:
82: .ti -5
83: 2. You can use the "-u" flag in ingres
84: and printr.
85: This enables you to use ingres on "mydata"
86: with someone else's id.
87: Refer to the INGRES reference
88: manual under sections
89: ingres(unix) and users(files)
90: for details.
91:
92: .ti -5
93: 3. You can run sysmod, restore and purge
94: on "mydata".
95:
96: .ti -5
97: 4. The data base by default is created to
98: allow for multiple concurrent users.
99: If only one user will ever use the
100: data base at a time,
101: the data base administrator can
102: turn off the concurrency control.
103: Refer to creatdb(unix) in the INGRES
104: reference manual.
105: .in -4
106: .sp 1
107: Once a data base has been created you should
108: immediately run
109: .sp 1
110: % sysmod mydata
111: .sp 1
112: This program will convert
113: the system relations to their
114: "best" structure for use in INGRES.
115: Sysmod will be explained further in
116: section 4.
117: .sp 1
118: As a DBA or as a user you can create and
119: structure new relations in any data
120: base to which you have access.
121: The remainder of this paper describes how this is done.
122: .bp
123: 2. CREATING NEW RELATIONS IN INGRES
124: .sp
125: There are two ways to create new relations in INGRES.
126: .sp 1
127: .ti +5
128: create
129: .br
130: .ti +5
131: retrieve into
132: .sp 1
133: "Retrieve into" is used to form a new relation from one or
134: more existing relations.
135: "Create" is used to create
136: a new relation with no tuples in it.
137: .sp 1
138: example 1:
139: .sp 1
140: .ti +5
141: range of p is parts
142: .ti +5
143: range of s is supply
144: .ti +5
145: retrieve into newsupply(
146: .ti +19
147: number = s.snum,
148: .ti +19
149: p.pname,
150: .ti +19
151: s.shipdate)
152: .ti +5
153: where s.pnum = p.pnum
154: .sp 1
155: example 2:
156: .sp 1
157: .ti +5
158: create newsupply(
159: .ti +12
160: number = i2,
161: .ti +12
162: pname = c20,
163: .ti +12
164: shipdate = c8)
165: .sp 1
166: In example 1 INGRES creates a new relation called
167: "newsupply", computing what the format
168: of each domain should be.
169: The query is then run and newsupply is
170: modified to "cheapsort".
171: (This will be covered in more detail in section 4.)
172: .sp 1
173: In example 2 "newsupply"
174: is created and the name and
175: format for each domain is given.
176: The format types which are
177: allowed are:
178: .sp 1
179: .in +5
180: .nf
181: i1 1 byte integer
182: i2 2 " "
183: i4 4 " "
184: f4 4 byte floating point number
185: f8 8 " " " "
186: c1,c2,..,c255 1,2,..,255 byte character
187: .in -5
188: .fi
189: .sp 1
190: In example 2, the width of an individual
191: tuple is 30 bytes
192: (2 + 20 + 8), and the
193: relation has three domains.
194: Beware that INGRES
195: has limits.
196: A relation cannot have more
197: than 49 domains and the tuple width
198: cannot exceed 498 bytes.
199: .sp 1
200: UNIX allocates space on a disk
201: in units of 512 byte pages.
202: INGRES gets a performance advantage
203: by doing I/O in one block units.
204: Therefore relations are divided into 512 byte pages.
205: INGRES never splits a tuple
206: between two pages.
207: Thus some space can be wasted.
208: There is an overhead of 12 bytes per page plus
209: 2 bytes for every tuple on the page.
210: The formulas are:
211: .sp 1
212: .ti +5
213: number tuples per page = 500/(tuple width + 2)
214: .sp 1
215: .ti +5
216: wasted space = 500 - number of tuples per page
217: .ti +5
218: *(tuple width +2)
219: .sp 1
220: For our example there are
221: .sp 1
222: .ti +5
223: 22 = 500/(20 + 2)
224: .sp 1
225: .ti +5
226: 16 = 500 - 22 * (20 + 2)
227: .sp 1
228: 22 tuples per page and 16 bytes
229: wasted per page.
230: These computations are valid
231: only for uncompressed relations.
232: We will return to this subject
233: in section 4 when we discuss compression.
234: .sp 1
235: If you forget a domain name or
236: format, use the "help" command.
237: For example if you gave the INGRES
238: command:
239: .sp 1
240: .ti +5
241: help newsupply
242: .sp 1
243: the following would be printed:
244: .nf
245:
246: Relation: newsupply
247: Owner: bob
248: Tuple width: 30
249: Saved until: Thu Nov 10 16:17:06 1977
250: Number of tuples: 0
251: Storage structure: paged heap
252: Relation type: user relation
253:
254: attribute name type length keyno.
255:
256: number i 2
257: pname c 20
258: shipdate c 8
259:
260: .fi
261: Notice that every relation has an expiration
262: date.
263: This is set to be one week
264: from the time when it was
265: created.
266: The "save" command
267: can be used to save the relation longer.
268: See "save(quel)" and "purge(unix)" in the
269: INGRES reference manual.
270:
271: .bp
272: 3. COPYING DATA TO AND FROM INGRES
273: .sp
274: Once a relation is created, there are two mechanisms for
275: inserting new data:
276: .sp
277: .in +5
278: append command
279: .br
280: copy command
281: .in -5
282: .sp
283: Append is used to insert tuples one at a time,
284: or for filling one relation from other relations.
285: .sp
286: Copy is used for copying data from a UNIX file into
287: a relation.
288: It is used for copying data from another program, or for copying
289: data from another system.
290: It is also the most convenient way to copy any data
291: larger than a few tuples.
292: .sp
293: Let's begin by creating a simple relation
294: and loading data into it.
295: .sp
296: Example:
297: .sp
298: .ti +5
299: .nf
300: create donation (name = c10, amount = f4, ext = i2)
301: .sp
302: .fi
303: Now suppose we have two people to enter.
304: The simplest procedure is probably
305: to run the two queries in INGRES using
306: the append command.
307: .sp
308: .ti 5
309: .nf
310: append to donation (name="frank",amount = 5,ext = 204)
311: .sp 1
312: .ti 5
313: append to donation (name="harry",ext = 209,amount = 4.50)
314: .fi
315: .sp
316: Note that the order in which the domains are given
317: does not matter.
318: INGRES matches by recognizing attribute names and
319: does not care in what order attributes
320: are listed.
321: Here is what the relation "donation" looks like now:
322: .nf
323:
324: donation relation
325:
326: |name |amount |ext |
327: |----------------------------|
328: |frank |5.000 |204 |
329: |harry |4.500 |209 |
330: |----------------------------|
331: .fi
332: .sp
333: We now have two people entered into
334: the donation relation.
335: Suppose we had fifty more to enter.
336: Using the append command is far too tedious
337: since so much typing is involved for each tuple.
338: The copy command will better suit our purposes.
339: .sp
340: Copy can take data from a regular
341: Unix file in a variety of formats and
342: append it to a relation.
343: To use the copy command first create
344: a Unix file (typically using "ed") containing
345: the data.
346: .sp
347: For example, let's put five new names in a file
348: using the editor.
349: .sp
350: .nf
351: .tr Z.
352: % ed
353: a
354: bill,3.50,302
355: sam,10.00,410
356: susan,,100
357: sally,.5,305
358: george,4.00,302
359: Z
360: w newdom
361: 68
362: q
363: %
364: .tr ZZ
365: .sp
366: .fi
367: The format of the above file is a
368: name followed by a comma, followed
369: by the amount, then a comma, then the extension,
370: and finally a newline.
371: Null entries, for example the amount
372: for susan, are perfectly
373: legal and default to zero
374: for numerical domains and
375: blanks for character domains.
376: .sp
377: To use copy we enter INGRES and give the copy command.
378: .sp
379: .in +5
380: .nf
381: copy donation (name = c0, amount = c0, ext = c0)
382: from "/mnt/bob/newdom"
383: .fi
384: .sp
385: .in -5
386: Here is how the copy command works:
387: .sp
388: .ti +5
389: copy relname (list of what to copy) from "full pathname"
390: .sp
391: In the case above we wrote:
392: .sp
393: .ti +5
394: copy donation (. . .) from "/mnt/bob/newdom"
395: .sp
396: Although amount and ext are stored in the relation
397: as f4 (floating point) and i2 (integer), in the
398: Unix file they were entered as characters.
399: In specifying the format of the domain,
400: copy accepts:
401: .sp
402: .ti +5
403: domain = format
404: .sp
405: where domain is the domain name and
406: the format in the UNIX file is one of
407: .sp
408: .in +5
409: .nf
410: i1, i2, i4 (true binary integer of size 1, 2, or 4)
411: .br
412: f4, f8 (true binary float point of size 4 or 8)
413: .br
414: c1, c2, c3,...c255 (a fixed length character string)
415: .br
416: c0 (a variable length character string de-
417: limited by a comma, tab or new line)
418: .sp
419: .in -5
420: .fi
421: In the example we use
422: .sp
423: .ti +5
424: name = c0, amount = c0, extension = c0
425: .sp
426: This means that each of the domains
427: was stored in the Unix file as
428: variable length character
429: strings.
430: Copy takes the first comma,
431: tab, or new line character
432: as the end of the string.
433: This by far is the most
434: common use of copy
435: when the data is being entered
436: into a relation
437: for the first time.
438: .sp
439: Copy can also be used to copy data from a relation
440: into a Unix file.
441: For example:
442: .sp
443: .in +5
444: .nf
445: copy donation (name = c10, amount = c10, ext = c5)
446: into "/mnt/bob/data"
447: .fi
448: .in -5
449: .sp
450: This will cause the following to happen:
451: .sp 1
452: .in +4
453: .ti -5
454: 1. If the file /mnt/bob/data already exists it will
455: be destroyed.
456: .ti -5
457: .sp 1
458: 2. The file is created in mode 600 (read/write by you only)
459: .sp 1
460: .ti -5
461: 3. Name will be copied as a 10 character field,
462: immediately followed by amount,
463: immediately followed by ext.
464: Amount will be converted to a character
465: field 10 characters wide.
466: Ext will be converted to a character
467: field 5 characters wide.
468: .in -4
469: .sp 1
470: The file "/mnt/bob/data" would be a stream of characters
471: looking like this:
472:
473: .tr Z
474: .nf
475: frankZZZZZZZZZZ5.000ZZ204harryZZZZZZZZZZ4.500ZZ209bill
476: ZZZZZZZZZZZ3.500ZZ302samZZZZZZZZZZZ10.000ZZ410susanZZZ
477: ZZZZZZZ0.000ZZ100sallyZZZZZZZZZZ0.500ZZ305georgeZZZZZZ
478: ZZZ4.000ZZ302
479: .fi
480: .tr ZZ
481:
482: .sp
483: The output was broken into four lines to
484: make it fit on this page.
485: In actuality the file
486: is a single line.
487: Another example:
488: .sp
489: .in +5
490: .nf
491: copy (name = c0, colon = d1, ext = c0, comma = d1
492: amt = c0, nl = d1) into "/mnt/bob/data"
493: .fi
494: .in -5
495: .sp
496: In this example "c0" is interpreted to mean "use
497: the appropriate character format".
498: For character domains it is the
499: width of the domain.
500: Numeric domains are converted to characters
501: according to the INGRES defaults
502: (see ingres(unix)).
503: .sp
504: The statements:
505: .sp
506: .in +5
507: colon = d1
508: .br
509: comma = d1
510: .br
511: nl = d1
512: .in -5
513: .sp
514: are used to insert one colon,
515: comma, and newline into the file.
516: The format "d1" is interpreted to mean
517: one dummy character.
518: When copying into a Unix file,
519: a selected set of characters can be inserted into
520: the file using this
521: "dummy domain" specification.
522: Here is what the file "/mnt/bob/data" would look like:
523:
524: .nf
525: frank : 204, 5.000
526: harry : 209, 4.500
527: bill : 302, 3.500
528: sam : 410, 10.000
529: susan : 100, 0.000
530: sally : 305, 0.500
531: george : 302, 4.000
532:
533: .fi
534: .sp
535: If you wanted a file with the true binary representation
536: of the numbers you would use:
537: .sp
538: .ti +5
539: copy (name = c10, amount = f4, ext = i2)
540: .sp
541: This would create a file with the exact
542: copy of each tuple,
543: one after the other.
544: This is frequently desireable for
545: temporary backup purposes
546: and it guarantees that floating
547: point domains will be exact.
548: .sp 2
549: TYPICAL ERRORS
550: .sp 1
551: There are 17 different errors
552: that can occur in copy.
553: We will go through the most
554: common ones.
555: .sp 1
556: Suppose you have a file with
557: .sp 1
558: bill,3.5,302
559: .br
560: sam,10,410,
561: .br
562: susan,3,100
563: .sp 1
564: and run the copy command
565: .sp 1
566: .in +5
567: .nf
568: copy donation (name = c0, amount = c0, ext = c0)
569: from "/mnt/bob/data"
570: .fi
571: .in -5
572: .sp 1
573: You would get the error message
574: .sp 1
575: .nf
576: 5809: COPY: bad input string for domain amount. Input was "susan".
577: There were 2 tuples sucessfully copied from /mnt/bob/data into
578: donation.
579: .fi
580: .sp 1
581: What happened is that line 2 had an extra
582: comma.
583: The first two tuples were copied correctly.
584: For the next tuple, name = "" (blank), amount =
585: "susan", and ext = "3".
586: Since "susan" is not a proper floating point
587: number, an error was generated and
588: processing was stopped after two tuples.
589: .sp 1
590: If you tried to copy the file with a file
591: such as
592: .sp 1
593: nancy,5.0,35000
594: .sp 1
595: you would get the error message
596: .sp 1
597: .nf
598: 5809: COPY: bad input string for domain ext. Input was "35000".
599: There were 0 tuples successfully copied from /mnt/bob/data into
600: donation.
601: .fi
602: .sp 1
603: Here, since ext is an i2 (integer) domain,
604: it cannot exceed the value 32767.
605: .sp 1
606: There are numerous other error messages,
607: most of which are self-explanatory.
608: .sp 1
609: In addition there are three, non-fatal warnings
610: which may appear on a copy "from".
611: .sp 1
612: If you are copying from
613: a file into a relation which
614: is ISAM or hash, a count
615: of the number of duplicate
616: tuples will appear, (if there were
617: any).
618: This will never appear on a "heap"
619: because no duplicate checking
620: is performed.
621: .sp 1
622: INGRES does not allow
623: control characters
624: (such as "bell" etc.)
625: to be stored.
626: If copy reads any control characters, it converts them
627: to blanks and reports the number
628: of domains that had control characters in them.
629: .sp 1
630: If you are copying using the c0
631: option, copy will
632: report if any character strings were
633: longer than their domains
634: and had to be truncated.
635: .sp 2
636: SPECIAL FEATURES
637: .sp 1
638: .ti +3
639: There are a few special functions that
640: make copy a little
641: easier to use
642: .sp 1
643: .nr in 6n
644: .ti -4
645: 1. Bulk copy
646: .sp 1
647: If you ask for:
648: .sp 1
649: .ti +4
650: copy relname () from "file"
651: .ti +8
652: or
653: .ti +4
654: copy relname () into "file"
655: .sp 1
656: copy expands the statement to mean:
657: .sp 1
658: .in +5
659: copy each domain in its proper
660: order according to its proper
661: format.
662: .sp 1
663: .in -5
664: So, if you said
665: .sp 1
666: .ti +4
667: copy donation () into "/mnt/bob/donation"
668: .sp 1
669: it would be the same as asking for:
670: .sp 1
671: .ti +4
672: .nf
673: copy donation (name = c10, amount = f4, ext = i2)
674: into "/mnt/bob/donation"
675: .fi
676: .sp 1
677: This provides a convenient way to copy
678: whole relations to and from INGRES.
679: .sp 1
680: .ti -4
681: 2. Dummy Domains
682: .sp 1
683: If you are copying data
684: from another computer or program,
685: frequently there will be
686: a portion of data that you will want to
687: ignore.
688: This can be done using the
689: dummy domain specifications
690: d0, d1, d2 ... d511.
691: For example
692: .sp 1
693: .ti +4
694: .nf
695: copy rel (dom1 = c5, dummy = d2, dom2 = i4,
696: dumb = d0) from "/mnt/me/data"
697: .fi
698: .sp 1
699: The first five characters
700: are put in dom1,
701: the next two characters are ignored.
702: The next four bytes are
703: an i4 (integer) and go in dom2,
704: and the remaining delimited string
705: is ignored.
706: The name given to a dummy specifier is
707: ignored.
708: .sp 1
709: As mentioned previously,
710: dummy domains can be used on a copy
711: "into" a Unix file for inserting
712: special characters.
713: The list of recognizable names includes:
714: .sp 1
715: .in +5
716: .nf
717: nl newline
718: tab tab character
719: sp space
720: nul a zero byte
721: null a zero byte
722: comma ,
723: dash -
724: colon :
725: lparen (
726: rparen )
727: .fi
728: .in -5
729: .sp 1
730: .ti -4
731: 3. Truncation
732: .sp 1
733: It is not uncommon to have a mistake occur
734: and need to start over.
735: The simplest way to do that
736: is to "truncate" the relation.
737: This is done by the command:
738: .sp 1
739: .ti +4
740: modify relname to truncated
741: .sp 1
742: This has the effect of removing
743: all tuples in relname,
744: releasing all disk space,
745: and making relname a heap again.
746: It is the logical equivalent of
747: a destroy followed by a create
748: (but with a lot less typing).
749: .sp 1
750: Since formatting mistakes are possible
751: with copy,
752: it is not generally a good idea to
753: copy data into a relation that already
754: has valid data in it.
755: The best procedure is to create a
756: temporary relation with the same domains
757: as the existing relation.
758: Copy data into the temporary relation
759: and then append it to the real relation.
760: For example:
761:
762: .in +8
763: .nf
764: create tempdom(name=c10,amount=f4,ext=i2)
765:
766: copy tempdom(name=c0,amount=c0,ext=c0)
767: from "/mnt/bob/data"
768:
769: range of td is tempdom
770: append to donation(td.all)
771: .fi
772: .in -8
773: .sp 1
774: 4. Specifing Delimitors.
775: .sp 1
776: Sometimes it is desirable to specify
777: what the delimiting character should be
778: on a copy "from" a file.
779: This can be done by specifing:
780:
781: .ti +8
782: domain = c0delim
783:
784: where "delim" is a valid delimitor
785: taken from the list of recognizable names.
786: This list was summarized on the
787: previous page under "dummy domains".
788: For example:
789:
790: .ti +8
791: copy donation (name = c0nl) from "/mnt/me/data"
792:
793: will copy names from the file to the relation.
794: Only a new line will delimit the names so
795: any commas or tabs will be passed along as
796: part of the name.
797:
798: When copying "into" a Unix file,
799: the "delim" is actually written into the
800: file,
801: so on a copy "into" the specification:
802:
803: .ti +8
804: copy donation (name = c0nl) into "/mnt/me/file"
805:
806: will cause "name" to be written followed by a new line
807: character.
808: .nr in 0
809: .bp
810: 4. CHOOSING THE BEST STORAGE STRUCTURES
811: .sp 1
812: .sp
813: We now turn to the issue of efficiency.
814: Once you have created a relation
815: and inserted your
816: data using either copy or append,
817: INGRES can process any query
818: on the relation.
819: There are several things you can do
820: to improve the speed at which INGRES
821: can process a query.
822: .sp
823: INGRES can store a relation in three different
824: internal
825: structures.
826: These are called "heap",
827: "isam", and "hash".
828: First we will briefly describe each
829: structure and then later expand our
830: discussion.
831: .sp
832: HEAP
833: .sp 1
834: When a relation is first created, it is
835: created as a "heap".
836: There are two important properties about a heap:
837: duplicate tuples are not removed,
838: and nothing is known about the location of the tuples.
839: If you ran the query:
840: .sp 1
841: .ti +5
842: range of d is donation
843: .br
844: .ti +5
845: retrieve (d.amount) where d.name = "bill"
846: .sp 1
847: INGRES would have to read every tuple in the
848: relation looking for those with name "bill".
849: If the relation is small this isn't a
850: serious matter.
851: But if the relation is very large, this can take
852: minutes (or even hours!).
853: .sp 1
854: HASH
855: .sp 1
856: A relation whose structure is "hash" can give fast
857: access to searches on certain domains.
858: (Those domains are usually referred to as
859: "keyed domains".)
860: In addition, a "hashed" relation contains
861: no duplicate tuples.
862: For example, suppose the donation relation is stored hashed on
863: domain "name".
864: Then the query:
865: .sp 1
866: .ti +5
867: retrieve (d.amount) where d.name = "bill"
868: .sp 1
869: will run quickly
870: since INGRES knows approximately where on
871: disk the tuple is stored.
872: If the relation contains only a few tuples you
873: won't notice the difference between a "heap"
874: and a "hash" structure.
875: But as the relation becomes larger, the
876: difference in speed becomes
877: much more noticeable.
878: .sp 1
879: ISAM
880: .sp 1
881: An isam structure is one where the relation is
882: sorted on one or more domains,
883: (also called keyed domains).
884: Duplicates are also removed on "isam relations".
885: When new tuples are appended they are
886: placed "approximately" in their sorted position in the
887: relation.
888: (The "approximately" will be explained a bit
889: later.)
890: .sp 1
891: Suppose donation is isam on name.
892: To process the query
893: .sp 1
894: .ti +5
895: retrieve (d.amount) where d.name = "bill"
896: .sp 1
897: INGRES will determine where in the sorted order
898: the name "bill" would be and read only
899: those portions of the relation.
900: .sp 1
901: Since the relation is approximately sorted,
902: an isam structure is also efficient for
903: processing the query:
904: .ti +5
905: .sp 1
906: retrieve (d.amount) where d.name >= "b" and d.name < "g"
907: .sp 1
908: This query would retrieve all names beginning
909: with "b" through "f".
910: The entire relation would not have to be
911: searched since it is isam on name.
912: .sp 2
913: SPECIFYING THE STORAGE STRUCTURE
914: .sp
915: Any user created relation can be converted
916: to any storage structure using the
917: "modify" command.
918: For example
919: .sp
920: .ti 5
921: modify donation to hash on name
922: .br
923: or
924: .br
925: .ti 5
926: modify donation to isam on name
927: .sp
928: or even
929: .sp
930: .ti 5
931: modify donation to heap
932: .sp 2
933: PRIMARY AND OVERFLOW PAGES
934: .sp
935: At this point it is necessary to introduce the
936: concepts of primary and overflow pages on
937: hash and isam structures.
938: Both hash and isam are techniques for assigning
939: specific tuples to specific pages of a relation
940: based on the tuple's keyed domains.
941: Thus each page will contain only a certain
942: specified subset of the relation.
943:
944: When a new tuple is appended to a hash or isam
945: relation, INGRES
946: first determines what page it belongs to,
947: and then looks for room on that page.
948: If there is space then the tuple
949: is placed on that page.
950: If not,
951: then an "overflow" page is created and
952: the tuple is placed there.
953:
954: The overflow page is linked to the
955: original page.
956: The original page is called the "primary"
957: page.
958: If the overflow page became full,
959: then INGRES
960: would connect an overflow page to it.
961: We would then have one primary page
962: linked to an overflow page,
963: linked to another overflow page.
964: Overflow pages are dynamically added as
965: needed.
966: .sp 2
967: SPECIFYING FREE SPACE
968: .sp
969: The modify command also lets you specify how much
970: room to leave for the relation to grow.
971: As was mentiond in "create",
972: relations are divided into pages.
973: A "fillfactor" can be used to specify how
974: full to make each primary page.
975: This decision should be based
976: only on whether more tuples
977: will be appended to the relation.
978: For example:
979: .sp
980: .ti 5
981: .nf
982: modify donation to isam on name where fillfactor = 100
983: .fi
984: .sp
985: This tells modify to make each page 100% full
986: if at all possible.
987: .sp
988: .ti 5
989: .nf
990: modify donation to isam on name where fillfactor = 25
991: .fi
992: .sp
993: This will leave each page 25% full or, in other words,
994: 75% empty.
995: We would do this if we had roughly 1/4 of the
996: data already loaded and it was fairly well distributed
997: about the alphabet.
998: .sp
999: Keep in mind that if you don't specify the fillfactor,
1000: INGRES will typically default to a reasonable choice.
1001: Also when a page becomes full, INGRES
1002: automatically creates an "overflow"
1003: page so it is never the case that a relation
1004: will be unable to expand.
1005: .sp
1006: When modifying a relation
1007: to hash, an additional
1008: parameter "minpages" can
1009: be specified.
1010: Modify will guarantee
1011: that at least "minpage" primary pages will be allocated
1012: for the relation.
1013: .sp
1014: Modify computes how may primary pages will be
1015: needed to store the existing tuples at
1016: the specified fillfactor
1017: assuming that no overflow pages will be necessary originally.
1018: If that number is less than
1019: minpages, then minpages is used instead.
1020: .sp
1021: For example:
1022: .sp
1023: .ti 5
1024: .nf
1025: modify donation to hash on name where fillfactor = 50,
1026: .ti 10
1027: minpages = 1
1028: .sp 1
1029: .ti 5
1030: modify donation to hash on name where minpages = 150
1031: .fi
1032: .sp
1033: In the first case we guarantee that no more
1034: pages than are necessary will be
1035: used for 50% occupancy.
1036: The second case is typically
1037: used for modifying an empty or near
1038: empty relation.
1039: If the approximate maximum
1040: size of the relation is known in advance,
1041: minpages
1042: can be used to guarantee that the relation will
1043: have its expected maximum size.
1044: .sp
1045: There is one other option available for hash called
1046: "maxpages".
1047: Its syntax is the same as minpages.
1048: It can be used to specify the maximum
1049: number of primary pages to use.
1050: .sp
1051: COMPRESSION
1052: .sp 1
1053: The three storage structures
1054: (heap, hash, isam) can optionally
1055: have "compression" applied
1056: to them.
1057: To do this, refer to the
1058: storage structures as cheap, chash, and cisam.
1059: Compression reduces
1060: the amount of space needed to store each tuple
1061: internally.
1062: The current compression technique is to
1063: suppress trailing blanks in
1064: character domains.
1065: Using compression will never
1066: require more space and typically
1067: it can save disk space and improve
1068: performance.
1069: Here is an example:
1070: .sp 1
1071: .nf
1072: .ti +5
1073: modify donation to cisam on name where fillfactor = 100
1074: .fi
1075: .sp 1
1076: This will make donation a compressed isam
1077: structure and fill every page as
1078: full as possible.
1079: With compression, each tuple
1080: can have a different compressed
1081: length.
1082: Thus the number of tuples
1083: that can fit on one page will
1084: depend on how successfully
1085: they can be compressed.
1086:
1087: Compressed relations can be more expensive to update.
1088: In particular if a replace is done on one or
1089: more domains and the compressed tuple is no
1090: longer the same length,
1091: then INGRES must look for a new place to put the tuple.
1092: .sp 2
1093: TWO VARIATIONS ON A THEME
1094: .sp
1095: As mentioned, duplicates are not removed
1096: from a relation stored
1097: as a heap.
1098: Frequently it is desirable
1099: to remove duplicates and sort
1100: a heap relation.
1101: One way of doing this is to modify the
1102: relation to isam specifying
1103: the order in which to sort
1104: the relation.
1105: An alternative to this is to use either
1106: "heapsort" or "cheapsort".
1107: For example
1108: .sp
1109: .ti 5
1110: .nf
1111: modify donation to heapsort on name, ext
1112: .fi
1113: .sp
1114: This will sort the relation by
1115: name then ext.
1116: The tuples are further sorted on the
1117: remaining domains,
1118: in the order they were listed in the
1119: original create statement.
1120: So in this case the relation will be
1121: sorted on name then ext and then amount.
1122: Duplicate tuples are always removed.
1123: The relation will be left
1124: as a heap.
1125: Heapsort and cheapsort are intended
1126: for sorting a temporary relation before printing and
1127: destroying it.
1128: It is more efficient than modifying
1129: to isam because
1130: with isam INGRES creates a
1131: "directory" containing
1132: key information about each page.
1133: The relation will NOT be kept sorted
1134: when further updates occur.
1135: .sp
1136: Examples:
1137: .sp
1138: .nr in 2n
1139: Here are a collection of examples
1140: and comments as to the efficiency of
1141: each query.
1142: The queries are based on the
1143: relations:
1144: .(l
1145: parts(pnum, pname, color, weight, qoh)
1146: .br
1147: supply(snum, pnum, jnum, shipdate, quan)
1148: .sp 1
1149: range of p is parts
1150: .br
1151: range of s is supply
1152: .sp 1
1153: modify parts to hash on pnum
1154: .br
1155: modify supply to hash on snum,jnum
1156: .)l
1157: .ti +5
1158: .sp 1
1159: retrieve (p.all) where p.pnum = 10
1160: .sp 1
1161: INGRES will recognize that parts is
1162: hashed on pnum and go directly to the
1163: page where parts with number 10 would be stored.
1164: .sp 1
1165: .ti +5
1166: retrieve (p.all) where p.pname = "tape drive"
1167: .sp 1
1168: INGRES will read the entire relation
1169: looking for matching pnames.
1170: .sp 1
1171: .ti +5
1172: retrieve (p.all) where p.pnum < 10 and p.pnum > 5
1173: .sp 1
1174: INGRES will read the entire relation
1175: because no exact value for pnum
1176: was given.
1177: .sp 1
1178: .ti +5
1179: retrieve (s.shipdate) where s.snum = 471 and s.jnum = 1008
1180: .sp 1
1181: INGRES will recognize that supply is hashed on the
1182: combination of snum and jnum and will go directly
1183: to the correct page.
1184: .ti +5
1185: .sp 1
1186: retrieve (s.shipdate) where s.snum = 471
1187: .sp 1
1188: INGRES will read the entire
1189: relation.
1190: Supply is hashed on the
1191: combination of snum and jnum.
1192: Unless INGRES is given a unique
1193: value for both, it cannot
1194: take advantage of the storage
1195: structure.
1196: .sp 1
1197: .ti +5
1198: retrieve (p.pname, s.shipdate) where
1199: .ti +5
1200: .br
1201: p.pnum = s.pnum and s.snum = 471 and s.jnum = 1008
1202: .sp 1
1203: INGRES will take advantage of both
1204: storage structures.
1205: It will first find all
1206: s.pnum and s.shipdate
1207: where s.snum = 471 and
1208: s.jnum = 1008.
1209: After that it will look for all
1210: p.pname where p.pnum is equal to
1211: the correct value.
1212: .sp 1
1213: This example illustrates the idea that it is
1214: frequently a good idea to hash a
1215: relation on the domains where it is
1216: "joined" with another relation.
1217: For example, in this
1218: case it is very common to ask
1219: for p.pnum = s.pnum
1220: .sp 1
1221: To summarize:
1222: .sp 1
1223: To take advantage of a hash
1224: structure,
1225: INGRES needs an exact value
1226: for each key domain.
1227: An exact value is anything
1228: such as:
1229: .ti +5
1230: .sp 1
1231: s.snum = 471
1232: .br
1233: .ti +5
1234: s.pnum = p.pnum
1235: .sp 1
1236: An exact value is not
1237: .sp 1
1238: .ti +5
1239: s.snum >= 471
1240: .br
1241: .ti +5
1242: (s.snum = 10 or s.snum = 20)
1243: .sp 1
1244: Now let's consider some
1245: cases using isam
1246: .sp 1
1247: .in +5
1248: modify supply to isam on snum,shipdate
1249: .br
1250: retrieve (s.all) where s.snum = 471
1251: .br
1252: and s.shipdate > "75-12-31"
1253: .br
1254: and s.shipdate < "77-01-01"
1255: .sp 1
1256: .in -5
1257: Since supply is sorted first on snum and then
1258: on shipdate, INGRES
1259: can take full advantage of the
1260: isam structure to locate the
1261: portions of supply which satisfy
1262: the query.
1263: .sp 1
1264: .ti +5
1265: retrieve (s.all) where s.snum = 47l
1266: .sp 1
1267: Unlike hash, an isam structure
1268: can still be used if only the first key is
1269: provided.
1270: .sp 1
1271: .ti +5
1272: retrieve (s.all) where s.snum > 400 and s.snum < 500
1273: .sp 1
1274: Again INGRES will take advantage of the structure.
1275: .sp 1
1276: .ti +5
1277: retrieve (s.all) where s.shipdate >= "75-12-31" and
1278: .ti +5
1279: s.shipdate <= "77-01-01"
1280: .sp 1
1281: Here INGRES will read the entire relation.
1282: This is because the first key (snum) is not
1283: provided in the query.
1284: .sp 1
1285: To summarize:
1286: .sp 1
1287: Isam can provide improved access
1288: on either exact values or ranges of
1289: values.
1290: It is useful as long as at least
1291: the first key is provided.
1292: .sp 1
1293: To locate where the tuples are
1294: in an isam relation,
1295: INGRES searches the isam directory for that
1296: relation.
1297: When a relation is modified to isam,
1298: the tuples are first sorted and duplicates
1299: are removed.
1300: Next, the relation is
1301: filled (according to the fillfactor) starting
1302: at page 0, 1, 2... for as many
1303: pages as are needed.
1304: .sp 1
1305: Now the directory is built.
1306: The key domains from the first
1307: tuple on each page are collected and
1308: organized into a directory (stored in the relation
1309: on disk).
1310: The directory is never changed
1311: until the next time a modify is done.
1312: .sp 1
1313: Whenever a tuple is added to the relation,
1314: the directory is searched to find
1315: which page the new tuple belongs on.
1316: Within that page, the individual
1317: tuples are NOT kept sorted.
1318: This is what is meant by "approximately" sorted.
1319: .sp 2
1320: .nr in 0
1321: HEAP v. HASH v. ISAM
1322: .sp 1
1323: Let's now compare the relative advantages and disadvantages
1324: of each option.
1325: A relation is always created as a heap.
1326: A heap is the most efficient
1327: structure to use to initially
1328: fill a relation using copy or append.
1329: .sp 1
1330: Space from deleted tuples of a heap
1331: is only reused on the last page.
1332: No duplicate checking is done on
1333: a heap relation.
1334: .sp 1
1335: Hash is advantageous for locating tuples
1336: referenced in a qualification by an exact
1337: value.
1338: The primary page for tuples with a specific
1339: value can be easily computed.
1340: .sp 1
1341: Isam is useful for both exact values and ranges of values.
1342: Since the isam directory must be searched to
1343: locate tuples, it is never as efficient as hash.
1344: .sp 2
1345: OVERFLOW PAGES
1346: .sp 1
1347: When a tuple is to be inserted
1348: and there is no more room on the
1349: primary page of a relation, then an
1350: overflow page is created.
1351: As more tuples are inserted, additional overflow
1352: pages are added as needed.
1353: Overflow pages, while necessary, decrease
1354: the system performance for
1355: retrieves and updates.
1356: .sp 1
1357: For example, let's suppose that supply
1358: is hashed on snum and has 10 primary pages.
1359: Suppose the value snum = 3 falls on page 7.
1360: To find all snum = 3 requires INGRES to search
1361: primary page 7 and all overflow pages of page 7
1362: (if any).
1363: As more overflow pages are added the time
1364: needed to search for
1365: snum = 3 will increase.
1366: Since duplicates are removed on isam and hash,
1367: this search must be performed on appends and
1368: replaces also.
1369: .sp 1
1370: When a hash or isam relation has too many overflow pages
1371: it should be remodified to hash
1372: or isam again.
1373: This will clear up the relation
1374: and eliminate as many overflow pages as possible.
1375: .sp 2
1376: UNIQUE KEYS
1377: .sp 1
1378: When choosing key domains for a relation
1379: it is desirable to have each set of
1380: key domains
1381: as unique as possible.
1382: For example, employee id numbers
1383: typically have no
1384: duplicate values, while
1385: something like color
1386: is likely to have only a few distinct
1387: values, and something like
1388: sex, to the best of our knowledge, has only two
1389: values.
1390: .sp 1
1391: If a relation is hashed on domain sex then you can expect to have all
1392: males on one primary page and all its
1393: overflow pages and a corresponding
1394: situation with females.
1395: With a hash relation there is no solution to this
1396: problem.
1397: A trade-off must be made between the
1398: most desirable key domains to use in a
1399: qualification versus the uniqueness of the
1400: key values.
1401: .sp 1
1402: Since isam structure can be used if at least
1403: the first key is provided, extra
1404: key domains can sometimes be added to increase uniqueness.
1405: For example, suppose the supply
1406: relation has only 10 unique supplier numbers
1407: but thousands of tuples.
1408: Choosing an isam structure with the keys snum and jnum
1409: will probably give many more unique keys.
1410: However, the directory size will
1411: be larger and consequently it will
1412: take longer to search.
1413: When providing additional keys
1414: just for the sake of increasing
1415: uniqueness,
1416: try to use the smallest possible domains.
1417: .sp 2
1418: SYSTEM RELATIONS
1419: .sp 1
1420: INGRES uses three relations
1421: ("relation", "attribute", and "indexes") to maintain
1422: and organize a data base.
1423: The "relation" relation has one tuple for
1424: each relation in the data base.
1425: The "attribute" relation has one tuple
1426: for each attribute in each
1427: relation.
1428: The "indexes" relation
1429: has one tuple for each secondary
1430: index.
1431: .sp 1
1432: INGRES accesses these relations
1433: in a very well defined manner.
1434: A program called "sysmod" should be used
1435: to modify these relations to hash on the
1436: appropriate domains.
1437: To use sysmod the data base
1438: administrator types
1439: .sp 1
1440: % sysmod data-base-name
1441: .sp 1
1442: Sysmod should be run
1443: initially after the data base is created and subsequently
1444: as relations are created and the data
1445: base grows.
1446: It is insufficient to run
1447: sysmod only once and forget about it.
1448: Rerunning sysmod will cause the
1449: system relations to be remodified.
1450: This will typically remove
1451: most overflow pages and improve
1452: system response time
1453: for everything.
1454: .bp
1455: 5. SECONDARY INDICES
1456: .sp 1
1457: Using an isam or hash structure
1458: provides a fast way to find
1459: tuples in a relation given values for the key
1460: domains.
1461: Sometimes this is not enough.
1462: For example, suppose we have
1463: the donation relation
1464: .sp 1
1465: .ti +5
1466: donation(name, amount, ext)
1467: .sp 1
1468: hashed on name.
1469: This will provide fast access
1470: to queries where the qualification has
1471: an exact value for name.
1472: What if we also will be doing
1473: queries giving exact values for ext?
1474: .sp 1
1475: Donation can be hashed either on name
1476: or ext, so we would have to choose which is more common
1477: and hash donation on that domain.
1478: The other domain (say ext) can have
1479: a secondary index.
1480: A secondary index is a relation which contains
1481: each "ext" together with the exact
1482: location of where the tuple is in the relation
1483: donation.
1484: .sp 1
1485: The command to create a secondary
1486: index is:
1487: .sp 1
1488: .ti +5
1489: index on donation is donext (ext)
1490: .sp 1
1491: The general format is:
1492: .sp 1
1493: .ti +5
1494: index on relation_name is secondary_index_name (domains)
1495: .sp 1
1496: Here we are asking INGRES
1497: to create a secondary index on the relation
1498: donation.
1499: The domain being indexed is "ext".
1500: Indices are formed in three steps:
1501: .sp 1
1502: .in +4
1503: .ti -5
1504: 1. "Donext" is created as a heap.
1505: .br
1506: .ti -5
1507: 2. For each
1508: tuple in donation, a tuple is inserted
1509: in "donext" with the value for ext and the
1510: exact location of the corresponding tuple in
1511: donation.
1512: .br
1513: .ti -5
1514: 3. By default "donext" is modified to isam.
1515: .in -4
1516: .sp 1
1517: Now if you run the query
1518: .sp 1
1519: .ti +5
1520: range of d is donation
1521: .ti +5
1522: retrieve(d.amount) where d.ext = 207
1523: .sp 1
1524: INGRES will automatically look first in
1525: "donext" to find ext = 207.
1526: When it finds one it then goes directly
1527: to the tuple in the donation relation.
1528: Since "donext" is isam on ext, search for
1529: ext = 207 can typically be
1530: done rapidly.
1531: .sp 1
1532: If you run the query
1533: .sp 1
1534: .ti 5
1535: retrieve(d.amount) where d.name = "frank"
1536: .sp 1
1537: then INGRES will continue to use the hash
1538: structure of the relation "donation"
1539: to locate the qualifying tuples.
1540: .sp 1
1541: Since secondary indices are themselves relations,
1542: they also can be either hash, isam, chash or cisam.
1543: It never makes sense to a secondary index a heap.
1544: .sp 1
1545: The decision as to what structure to make
1546: them on involves the same issues
1547: as were discussed before:
1548: .sp 1
1549: Will the domains be referenced by exact value?
1550: .br
1551: Will they be referenced by ranges of value?
1552: .br
1553: etc.
1554: .sp 1
1555: In this case the "ext" domain
1556: will be referenced by exact values, and
1557: since the relation is nearly full we will do:
1558: .sp 1
1559: .ti +5
1560: modify donext to hash on ext where fillfactor = 100
1561: .ti +5
1562: and minpages = 1
1563: .sp 1
1564: Secondary indices provide a way for INGRES
1565: to access tuples based on domains
1566: that are not key domains.
1567: A relation can have any number of secondary
1568: indices and in addition
1569: each secondary index can be an index
1570: on up to six domains of the primary relation.
1571: .sp 1
1572: Whenever a tuple is replaced, deleted
1573: or appended to a primary relation,
1574: all secondary indices must
1575: also be updated.
1576: Thus secondary indices
1577: are "not free".
1578: They increase
1579: the cost of updating the
1580: primary relation, but
1581: can decrease the cost of finding tuples
1582: in the primary relation.
1583: .sp 1
1584: Whether a secondary index will improve
1585: performance or not strongly
1586: depends on the uniqueness of the
1587: values of the domains being
1588: indexed.
1589: The primary concern is whether searching
1590: through the secondary index is
1591: more efficient than simply
1592: reading the entire primary relation.
1593: In general it is if the
1594: number of tuples which satisfy the
1595: qualification is less than the number of total pages
1596: (both primary and overflow) in the primary
1597: relation.
1598: .sp 1
1599: For example if we frequently want to find
1600: all people who donated less than
1601: five dollars, consider creating
1602: .sp 1
1603: .ti +5
1604: index on donation is donamount (amount)
1605: .sp 1
1606: By default donamount will be isam
1607: on amount.
1608: IF INGRES processes the query:
1609: .sp 1
1610: .ti +5
1611: retrieve(d.name) where d.amount < 5.0
1612: .sp 1
1613: it will locate d.amount < 5.0 in the secondary
1614: index and for each tuple it
1615: finds will fetch the corresponding
1616: tuple in donation.
1617: The tuples in donamount are sorted by
1618: amount but the tuples
1619: in donation are not.
1620: Thus in general each tuple fetch from
1621: donation via donamount will be on a
1622: different page.
1623: Retrieval using the secondary index can then cause more page
1624: reads than simply reading all of donation sequentially!
1625: So in this example it would
1626: be a bad idea to create the secondary
1627: index.
1628: .bp
1629: 6. RECOVERY AND DATA UPDATE
1630: .sp 1
1631: INGRES has been carefully designed
1632: to protect the integrity of a data base
1633: against certain classes
1634: of system failures.
1635: To do this INGRES
1636: processes changes to a relation
1637: using what we call "deferred
1638: update" or "batch file update".
1639: In addition there are two INGRES
1640: programs "restore" and "purge" that can be used to check
1641: out a data base after a system failure.
1642: We will first discuss how deferred updates are created
1643: and processed, and second we will discuss
1644: the use of purge and restore.
1645: .sp 1
1646: DEFERRED UPDATE (Batch update)
1647: .in +4
1648: .sp 1
1649: .ti -5
1650: An append, replace or delete command is run in four steps:
1651: .sp 1
1652: .ti -5
1653: 1. An empty batch file is created.
1654: .ti -5
1655: 2. The command is run to completion
1656: and each change to the result relation is written into
1657: the batch file.
1658: .ti -5
1659: 3. The batch file is read and the
1660: relation and its secondary indices (if any)
1661: are actually updated.
1662: .ti -5
1663: 4. The batch file is destroyed and INGRES
1664: returns back to the user.
1665: .sp 1
1666: .in -4
1667: Deferred update defers all actual
1668: updating until the very end of
1669: the query.
1670: There are three advantages to doing this.
1671: .sp 1
1672: l. Provides recovery from system failures
1673: .sp 1
1674: If the system "crashes" during an update,
1675: the INGRES recovery program will decide to either
1676: run the update to completion or else
1677: "back out" the update, leaving the
1678: relation as it looked before the update
1679: was started.
1680: .sp 1
1681: 2. Prevents infinite queries
1682: .sp 1
1683: If "donation" were a heap and the query
1684: .sp 1
1685: .ti +4
1686: range of d is donation
1687: .ti +4
1688: append to donation(d.all)
1689: .sp 1
1690: were run without deferred update,
1691: it would terminate only when it ran
1692: out of space on disk!
1693: This is because INGRES would start reading the
1694: relation from the beginning and
1695: appending each tuple at the end.
1696: It would soon start reading the tuples it
1697: had just previously appended and
1698: continue indefinitely to
1699: "chase its tail".
1700: .sp 1
1701: While this query is certainly not
1702: typical, it illustrates the point.
1703: There are certain classes of queries
1704: where problems occur if WHEN
1705: an update actually occurs
1706: is not precisely defined.
1707: With deferred update we can
1708: guarantee consistent and logical
1709: results.
1710: .sp 1
1711: 3. Speeds up processing of secondary indices
1712: .sp 1
1713: Secondary indices can be updated
1714: faster if they are done one at a time
1715: instead of all at once.
1716: It also insures protection against
1717: the secondary index becoming inconsistent
1718: with its primary relation.
1719: .sp 1
1720: TURNING DEFERRED UPDATE OFF
1721: .sp 1
1722: If you are not persuaded by any of
1723: these arguments, INGRES
1724: allows you to turn deferred update off!
1725: Indeed there are certain cases when
1726: it is appropriate (although
1727: certainly not essential) to perform
1728: updates directly, that is, the relation is updated
1729: while the query is being processed.
1730: .sp 1
1731: To use direct update, you must be given
1732: permission by the INGRES
1733: super user.
1734: Then when invoking INGRES
1735: specify the "-b" flag which turns
1736: off batch update.
1737: .sp 1
1738: .ti +4
1739: % ingres mydate -b
1740: .sp 1
1741: INGRES will use direct update on any relation without
1742: secondary indices.
1743: It will still silently use
1744: deferred update if a relation
1745: has any secondary indices.
1746: By using the "-b" flag you are
1747: sacrificing points 1 and 2 above.
1748: In most cases you SHOULD NOT
1749: use the -b flag.
1750: .sp 1
1751: If you are using INGRES
1752: to interactively enter
1753: or change one tuple at
1754: a time, it is slightly
1755: more efficient to have deferred
1756: update turned off.
1757: If the system crashes during an
1758: update the person entering the data
1759: will be aware of the situation
1760: and can check whether the tuple
1761: was updated or not.
1762: .sp 1
1763: RESTORE
1764: .sp 1
1765: INGRES is designed to recover
1766: from the common types of system
1767: crashes which leave the Unix file
1768: system intact.
1769: It can recover from updates, creates,
1770: destroys, modifies and index commands.
1771: .sp 1
1772: INGRES is designed to "fail safe".
1773: If any inconsistancies are
1774: discovered or any failures
1775: are returned from Unix,
1776: INGRES will generate a system error
1777: message (SYSERR) and exit.
1778: .sp 1
1779: Whenever Unix crashes while INGRES
1780: is running or whenever an INGRES
1781: syserr occurs, it is
1782: generally a good idea to have the date
1783: base administrator run the command
1784: .sp 1
1785: .ti +5
1786: % restore data_base_name
1787: .sp 1
1788: The restore program performs the
1789: following functions:
1790: .in +4
1791: .sp 1
1792: .ti -5
1793: 1. Looks for batch update files.
1794: If any are found, it examines each
1795: one to see if it is complete.
1796: If the system crash occured while
1797: the batch file was being read
1798: and the data base being updated,
1799: then restore will complete
1800: the update.
1801: Otherwise the batch file was not
1802: completed and it is simply destroyed;
1803: the effect is as though the query had never been run.
1804: .sp 1
1805: .ti -5
1806: 2. Checks for uncompleted modify commands.
1807: This step is crucial.
1808: It guarantees that you will either have the
1809: relation as it existed before
1810: the modify, or restore will complete
1811: the modify command.
1812: Modify works by creating a new copy
1813: of the relation in the new structure.
1814: Then when it is ready to replace the old
1815: relation, it stores the new information in a
1816: "modify batch file".
1817: This enables restore to determine the state of
1818: uncompleted modifies.
1819: .sp 1
1820: .ti -5
1821: 3. Checks consistency of system
1822: relations.
1823: This check is used to complete "destory"
1824: commands, back out "create" commands,
1825: and back out or complete "index"
1826: commands that were interrupted by a
1827: system crash.
1828: .sp 1
1829: .ti -5
1830: 4. Purges temporary relations and files.
1831: Restore executes the "purge" program to
1832: remove temporary relations and temporary
1833: files created by the system.
1834: Purge will be discussed in more detail a bit later.
1835: .in -4
1836: .sp 1
1837: Restore cannot tell the user which queries have run and
1838: which have not.
1839: It can only identify those queries which were in the
1840: process of being run when the crash occured.
1841: When batching queries together,
1842: it is a good idea to save the output in a file.
1843: By having the monitor print out each query or set of
1844: queries,
1845: the user can later identify which queries were run.
1846: .sp 1
1847: Restore has several options to increase its
1848: usability.
1849: They are specified by "flags".
1850: The options include:
1851: .sp 1
1852: .in +4
1853: .nf
1854: -a ask before doing anything
1855: -f passed to purge. used to remove temporary files.
1856: -p passed to purge. used to destory expired rela-
1857: tions.
1858: no database restores all data bases for which you are the
1859: dba.
1860: .fi
1861: .in -4
1862: .sp 1
1863: Of these options the "-a" is the most
1864: important.
1865: It can happen that a Unix crash can cause a page of
1866: the system catalogues to be
1867: incorrect.
1868: This might cause restore to destory
1869: a relation.
1870: In fact, you might want
1871: to "patch" the system relations to correct
1872: the problem.
1873: No restore program can account
1874: for all possibilities.
1875: It is therefore no replacement
1876: (fortunately) for a human.
1877: .sp 1
1878: If "-a" is specified, restore
1879: will state what it wants to do and then ask
1880: for permission.
1881: It reads standard input and
1882: accepts "y" to mean go ahead and anything
1883: else to mean no.
1884: For example, to have restore ask you before
1885: doing anything
1886: .sp 1
1887: .ti +5
1888: restore -a mydatabase
1889: .sp 1
1890: To have it take "no" for all its questions
1891: .sp 1
1892: .ti +5
1893: restore -a mydatabase </dev/null
1894: .sp 1
1895: Using the -a flag,
1896: restore might ask for permission
1897: to perform some cleanup;
1898: for example,
1899: if it finds an attribute for which there
1900: is no corresponding relation,
1901: or if it finds a secondary index for which
1902: there is no primary relation,
1903: etc.
1904: .sp 1
1905: To date, we have never had a system
1906: crash which INGRES
1907: could not recover from.
1908: This does not mean that it will never happen, but
1909: rather that it shouldn't
1910: be too great
1911: a concern for you.
1912: It should be mentioned that restore is not
1913: a substitution for doing periodic
1914: backing up, nor does it
1915: ever perform such a function.
1916: .sp 1
1917: PURGE
1918: .sp 1
1919: Purge can be used to report expired relations,
1920: destroy temporary system relations,
1921: remove extraneous files,
1922: and destory expired relations.
1923: To use purge you must be the DBA
1924: for the data base.
1925: .sp 1
1926: .ti +5
1927: % purge mydatabase
1928: .sp 1
1929: Purge has several options which are
1930: specified by flags which are
1931: worth noting:
1932: .nr in 4n
1933: .sp 1
1934: .nf
1935: -f (default is off) remove all extraneous files.
1936: Each file is reported and then removed. If "-f"
1937: is not specified then the file is only reported.
1938: .sp 1
1939: -p (default is off) destroy all expired relations.
1940: Each expired relation is reported and if "-p"
1941: was specified the relation is destroyed.
1942: .fi
1943: .nr in 0
1944: .sp 1
1945: Purge always destroys relations and files
1946: which are known to be INGRES
1947: system temporaries.
1948: When processing multi-variable
1949: queries and queries with aggregate functions,
1950: INGRES will usually create temporary relations
1951: with intermediate results.
1952: These relations always begin with the
1953: characters "_SYS".
1954: Other INGRES commands create temporary files which also
1955: begin with "_SYS".
1956: Under normal processing they are
1957: always destroyed.
1958: If a system crash occurs, they might be left.
1959: Purge will always clean up the temporary
1960: system files.
1961: It cleans up the user relations only
1962: when specifically asked to.
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.