Annotation of 43BSDTahoe/etc/named/doc/DynamicUpdate, revision 1.1.1.1

1.1       root        1:  
                      2:  
                      3:                Description of Dynamic Update and T_UNSPEC Code
                      4:  
                      5:  
                      6:  
                      7:  
                      8:                          Added by Mike Schwartz
                      9:           University of Washington Computer Science Department
                     10:                                   11/86
                     11:                        [email protected]
                     12:  
                     13:  
                     14:  
                     15:  
                     16: I have incorporated 2 new features into BIND:
                     17:        1. Code to allow (unauthenticated) dynamic updates: surrounded by
                     18:           #ifdef ALLOW_UPDATES
                     19:        2. Code to allow data of unspecified type: surrounded by
                     20:           #ifdef ALLOW_T_UNSPEC
                     21:  
                     22: Note that you can have one or the other or both (or neither) of these
                     23: modifications running, by appropriately modifying the makefiles.  Also,
                     24: the external interface isn't changed (other than being extended), i.e.,
                     25: a BIND server that allows dynamic updates and/or T_UNSPEC data can
                     26: still talk to a 'vanilla' server using the 'vanilla' operations.
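
As a rough illustration of the conditional compilation described above, the sketch below reports which optional features a given build includes. Only the two macro names (ALLOW_UPDATES and ALLOW_T_UNSPEC) come from the text; the function enabled_features and the FEATURE_* constants are hypothetical and do not appear in BIND.

```c
/* Hypothetical bit flags; only the macros tested below come from the text. */
#define FEATURE_UPDATES  0x1
#define FEATURE_T_UNSPEC 0x2

/* Return a bit mask of the optional features compiled into this build. */
int enabled_features(void)
{
    int mask = 0;
#ifdef ALLOW_UPDATES
    mask |= FEATURE_UPDATES;    /* built with -DALLOW_UPDATES */
#endif
#ifdef ALLOW_T_UNSPEC
    mask |= FEATURE_T_UNSPEC;   /* built with -DALLOW_T_UNSPEC */
#endif
    return mask;
}
```

Built with neither flag, the function returns 0, which corresponds to a 'vanilla' server.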
                     27:  
                     28: The description that follows is broken into 3 parts: a functional
                     29: description of the dynamic update facility, a functional description of
                     30: the T_UNSPEC facility, and a discussion of the implementation of
                     31: dynamic updates.  The implementation description is mostly intended for
                     32: those who want to make future enhancements (especially the addition of
                     33: a good authentication mechanism).  If you make enhancements, I would be
                     34: interested in hearing about them.
                     35:  
                     36:  
                     37:  
                     38:  
                     39:  
                     40:                        1. Dynamic Update Facility
                     41:  
                     42: I added this code in conjunction with my research into naming in large
                     43: heterogeneous systems.  For the purposes of this research, I ignored
                     44: security issues.  In other words, no authentication/authorization
                     45: mechanism exists to control updates.  Authentication will hopefully be
                     46: addressed at some future point (although probably not by me). In the
                     47: meantime, BIND Internet name servers (as opposed to "private" name
                     48: server networks operating with their own port numbers, as I use in my
                     49: research) should be compiled *without* -DALLOW_UPDATES, so that the
                     50: integrity of the Internet name database won't be compromised by this
                     51: code.
                     52:  
                     53:  
                     54: There are 5 different dynamic update interfaces:
                     55:        UPDATEA  - add a resource record
                     56:        UPDATED  - delete a specific resource record
                     57:        UPDATEDA - delete all named resource records
                     58:        UPDATEM  - modify a specific resource record
                     59:        UPDATEMA - modify all named resource records
                     60:  
                     61: These all work through the normal resolver interface, i.e., these
                     62: interfaces are opcodes, and the data in the buffers passed to
                     63: res_mkquery must conform to what is expected for the particular
                     64: operation (see the #ifdef ALLOW_UPDATES extensions to nstest.c for
                     65: example usage).
                     66:  
                     67: UPDATEM is logically equivalent to an UPDATED followed by an UPDATEA,
                     68: except that the updates occur atomically at the primary server (as
                     69: usual with Domain servers, secondaries may become temporarily
                     70: inconsistent).  The difference between UPDATED and UPDATEDA is that the
                     71: latter allows you to delete all RRs associated with a name; similarly
                     72: for UPDATEM and UPDATEMA.  The reason for the UPDATE{D,M}A interfaces
                     73: is two-fold:
                     74:  
                     75:        1. Sometimes you want to delete/modify some data, but you know you'll
                     76:           only have a single RR for that data; in such a case, it's more
                     77:           convenient to delete/modify the RR by just giving the name;
                     78:           otherwise, you would have to first look it up, and then
                     79:           delete/modify it.
                     80:  
                     81:        2. It is sometimes useful to be able to delete/modify multiple RRs
                     82:           this way, since one can then perform the operation atomically.
                     83:           Otherwise, one would have to delete/modify the RRs one-by-one.
                     84:  
                     85: One additional point to note about UPDATEMA is that it will return a
                     86: success status if there were *zero* or more RRs associated with the given
                     87: name (and the RR add succeeds), whereas UPDATEM, UPDATED, and UPDATEDA
                     88: will return a success status if there were *one* or more RRs associated
                     89: with the given name.  The reason for the difference is to handle the
                     90: (probably common) case where what you want to do is set a particular
                     91: name to contain a single RR, irrespective of whether or not it was
                     92: already set.
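
The success rules above can be summarized in a small sketch. This is a model of the described behavior, not code from BIND: the enum names, the function update_succeeds, and the add_ok parameter are hypothetical. The text does not state the success condition for UPDATEA, nor whether UPDATEM also depends on the add succeeding, so both of those cases are assumptions here.

```c
/* Hypothetical names for the five update operations listed above. */
enum update_op { OP_UPDATEA, OP_UPDATED, OP_UPDATEDA, OP_UPDATEM, OP_UPDATEMA };

/*
 * Return 1 if the operation would report success, 0 otherwise.
 * matching_rrs: how many existing RRs match the request.
 * add_ok:       whether the add step (where there is one) succeeds.
 */
int update_succeeds(enum update_op op, int matching_rrs, int add_ok)
{
    switch (op) {
    case OP_UPDATEA:    /* assumption: a plain add needs no existing RRs */
        return add_ok;
    case OP_UPDATED:    /* deletes need at least one RR to delete */
    case OP_UPDATEDA:
        return matching_rrs >= 1;
    case OP_UPDATEM:    /* needs a match; requiring add_ok is an assumption */
        return matching_rrs >= 1 && add_ok;
    case OP_UPDATEMA:   /* "set" semantics: zero existing RRs is fine */
        return add_ok;
    }
    return 0;
}
```

For example, update_succeeds(OP_UPDATEMA, 0, 1) yields 1, whereas update_succeeds(OP_UPDATEM, 0, 1) yields 0, which is the difference the paragraph above describes.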
                     93:  
                     94:  
                     95:  
                     96:  
                     97:                        2. T_UNSPEC Facility
                     98:  
                     99: Type T_UNSPEC allows you to store data whose layout BIND doesn't
                    100: understand.  Data of this type is not marshalled (i.e., converted
                    101: between host and network representation, as is done, for example, with
                    102: Internet addresses) by BIND, so it is up to the client to make sure
                    103: things work out ok w.r.t. heterogeneous data representations.  The way
                    104: I use this type is to have the client marshal data, store it, retrieve
                    105: it, and demarshal it.  This way I can store arbitrary data in BIND
                    106: without having to add new code for each specific type.
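
A minimal sketch of the client-side marshalling this implies is shown below. The record layout (struct printer_info), the function names, and the choice of big-endian byte order are all hypothetical; the only point taken from the text is that the client, not BIND, converts between host and wire representations.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical client record; BIND would only ever see the packed bytes. */
struct printer_info {
    uint32_t queue_len;
    uint16_t port;
    char     name[16];          /* NUL-padded text */
};

#define PRINTER_INFO_WIRE_LEN (4 + 2 + 16)

/* Pack *in into buf in big-endian order.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t marshal_printer(const struct printer_info *in,
                       unsigned char *buf, size_t buflen)
{
    if (buflen < PRINTER_INFO_WIRE_LEN)
        return 0;
    buf[0] = (unsigned char)(in->queue_len >> 24);
    buf[1] = (unsigned char)(in->queue_len >> 16);
    buf[2] = (unsigned char)(in->queue_len >> 8);
    buf[3] = (unsigned char)(in->queue_len);
    buf[4] = (unsigned char)(in->port >> 8);
    buf[5] = (unsigned char)(in->port);
    memcpy(buf + 6, in->name, sizeof in->name);
    return PRINTER_INFO_WIRE_LEN;
}

/* Unpack a buffer produced by marshal_printer.
 * Returns 0 on success, -1 if the length is wrong. */
int demarshal_printer(const unsigned char *buf, size_t len,
                      struct printer_info *out)
{
    if (len != PRINTER_INFO_WIRE_LEN)
        return -1;
    out->queue_len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                     ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    out->port = (uint16_t)(((unsigned)buf[4] << 8) | buf[5]);
    memcpy(out->name, buf + 6, sizeof out->name);
    return 0;
}
```

The client would store the packed buffer as the data of a T_UNSPEC record and call demarshal_printer on whatever the server hands back.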
                    107:  
                    108: T_UNSPEC data is dumped in an ASCII-encoded, checksummed format so
                    109: that, although it's not human-readable, it at least doesn't fill the
                    110: dump file with unprintable characters.
                    111:  
                    112: Type T_UNSPEC is important for my research environment, where
                    113: potentially lots of people want to store data in the name service, and
                    114: each person's data looks different.  Instead of having BIND understand
                    115: the format of each of their data types, the clients define marshalling
                    116: routines and pass buffers of marshalled data to BIND; BIND never tries
                    117: to demarshal the data...it just holds on to it, and gives it back to
                    118: the client when the client requests it, and the client must then
                    119: demarshal it.
                    120:  
                    121: The Xerox Network System's name service (the Clearinghouse) works this
                    122: way.  The reason 'vanilla' BIND understands the format of all the data
                    123: it holds is probably that BIND is tailored for a very specific
                    124: application, and wants to make sure the data it holds makes sense (and,
                    125: for some types, BIND needs to take additional action depending on the
                    126: data's semantics).  For more general purpose name services (like the
                    127: Clearinghouse and my usage of BIND), this approach is less tractable.
                    128:  
                    129: See the #ifdef ALLOW_T_UNSPEC extensions to nstest.c for example usage of
                    130: this type.
                    131:  
                    132:  
                    133:  
                    134:  
                    135:  
                    136:  
                    137:                3. Dynamic Update Implementation Description
                    138:  
                    139: This section is divided into 3 subsections: General Discussion,
                    140: Miscellaneous Points, and Known Defects.
                    141:  
                    142:  
                    143:  
                    144:  
                    145:                3.1 General Discussion
                    146:  
                    147: The basic scheme is this: When an update message arrives, a call is
                    148: made to InitDynUpdate, which first looks up the SOA record for the zone
                    149: the update affects.  If this is the primary server for that zone, we do
                    150: the update and then update the zone serial number (so that secondaries
                    151: will refresh later).  If this is a secondary server, we forward the
                    152: update to the primary, and if that's successful, we update our copy
                    153: afterwards.  If it's neither, we refuse the update.  (One might think
                    154: to try to propagate the update to an authoritative server; I figured
                    155: that updates will probably be most likely within an administrative
                    156: domain anyway; this could be changed if someone has strong feelings
                    157: about it).
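
The decision described above can be modelled roughly as follows. This is a sketch of the control flow only, not the body of InitDynUpdate: the types zone_role and zone_model, the function handle_update, and the forward_ok parameter are hypothetical. Leaving the secondary's serial number untouched is also an assumption, since the text only says the secondary updates its copy.

```c
/* Hypothetical role of this server for the zone an update affects. */
enum zone_role { ROLE_NONE, ROLE_PRIMARY, ROLE_SECONDARY };

/* Hypothetical stand-in for a zone: just enough state to show the flow. */
struct zone_model {
    enum zone_role role;
    unsigned long  serial;   /* SOA serial number */
    int            applied;  /* number of updates applied locally */
};

/*
 * Model one incoming update.  forward_ok says whether forwarding to the
 * primary succeeded (only consulted on a secondary).
 * Returns 0 on success, -1 if the update is refused or forwarding failed.
 */
int handle_update(struct zone_model *z, int forward_ok)
{
    switch (z->role) {
    case ROLE_PRIMARY:
        z->applied++;        /* do the update locally */
        z->serial++;         /* so that secondaries refresh later */
        return 0;
    case ROLE_SECONDARY:
        if (!forward_ok)
            return -1;       /* primary down or refused: no local change */
        z->applied++;        /* update our copy only after the primary did */
        return 0;
    case ROLE_NONE:
        break;               /* neither primary nor secondary: refuse */
    }
    return -1;
}
```

Note that a secondary with an unreachable primary leaves its data unchanged, which is the "primary is a critical point" property discussed next.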
                    158:  
                    159: Note that this mechanism disallows updates when the primary is
                    160: down, preserving the Domain scheme's consistency requirements,
                    161: but making the primary a critical point for updates.  This seemed
                    162: reasonable to me because
                    163:        1. Alternative schemes must deal with potentially complex
                    164:           situations involving merging of inconsistent secondary
                    165:           updates
                    166:        2. Updates are presumed to be rare relative to read accesses,
                    167:           so this increased restrictiveness for updates over reads is
                    168:           probably not critical
                    169:  
                    170: I have placed comments throughout the code, so it shouldn't be
                    171: too hard to see what I did.  The majority of the processing is in
                    172: doupdate() and InitDynUpdate().  Also, I added a field to the zone
                    173: struct, to keep track of when zones get updated, so that only changed
                    174: zones get checkpointed.
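
The idea of checkpointing only changed zones can be sketched as below. The struct zone_sketch, its changed flag, and the function checkpoint_changed are hypothetical; the text says only that a field was added to the zone struct, so representing it as a simple flag (rather than, say, a timestamp) is an assumption.

```c
#include <stddef.h>

/* Hypothetical, cut-down zone record. */
struct zone_sketch {
    const char *origin;   /* zone name */
    int         changed;  /* set when a dynamic update touches the zone */
};

/*
 * Call dump() for every zone whose changed flag is set, then clear the
 * flag.  dump may be NULL, in which case zones are only counted.
 * Returns the number of zones checkpointed.
 */
int checkpoint_changed(struct zone_sketch *zones, size_t nzones,
                       void (*dump)(const struct zone_sketch *))
{
    int dumped = 0;
    size_t i;

    for (i = 0; i < nzones; i++) {
        if (!zones[i].changed)
            continue;              /* untouched zone: skip the dump */
        if (dump != NULL)
            dump(&zones[i]);
        zones[i].changed = 0;      /* on-disk copy is current again */
        dumped++;
    }
    return dumped;
}
```

A second call immediately after the first does no work, since every flag has been cleared.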
                    175:  
                    176:  
                    177:  
                    178:  
                    179:  
                    180:                3.2 Miscellaneous Points
                    181:  
                    182: I use ns_maint to call zonedump() if the database changes, to
                    183: provide a checkpointing mechanism.  I use the zone refresh times to
                    184: set up ns_maint interrupts if there are either secondaries or
                    185: primaries.  Hence, if there is a secondary, this interrupt can cause
                    186: zoneref (as before), and if there is a primary, this interrupt can
                    187: cause doadump.  I also checkpoint if needed before shutting down.
                    188:  
                    189: You can force a server to checkpoint any changed zones by sending the
                    190: maint signal (SIGALRM) to the process.  Otherwise it just checkpoints
                    191: during maint interrupts, or when being shut down (with SIGTERM).
                    192: Sending it the dump signal causes the database to be dumped into the
                    193: (single) dump file, but doesn't checkpoint (i.e., update the boot
                    194: files).  Note that the boot files will be overwritten with checkpoint
                    195: files, so if you want to preserve the comments, you should keep copies
                    196: of the original boot files separate from the versions that are actually
                    197: used.
                    198:  
                    199: I disallow T_SOA updates, for several reasons:
                    200:        - T_SOA deletes at the primary won't be discovered by the secondaries
                    201:          until they try to request them at maint time, which will cause
                    202:          a failure
                    203:        - the corresponding NS record would have to be deleted at the same
                    204:          time (atomically) to avoid various problems
                    205:        - T_SOA updates would have to be done in the right order, or else
                    206:          the primary and secondaries will be out-of-sync for that zone.
                    207: My feeling is that changing the zone topology is a weighty enough thing
                    208: to do that it should involve changing the load file and reloading all
                    209: affected servers.
                    210:  
                    211: There are a lot of places where BIND exits due to catastrophic failures
                    212: (mainly malloc failures).  I don't try to dump the database in these
                    213: places because it's probably inconsistent anyway.  It's probably better
                    214: to depend on the most recent dump.
                    215:  
                    216:  
                    217:  
                    218:  
                    219:  
                    220:                3.3 Known Defects
                    221:  
                    222: 1. I put the following comment in nlookup (db_lookup.c):
                    223:  
                    224:        Note: at this point, if np->n_data is NULL, we could be in one
                    225:        of two situations: Either we have come across a name for which
                    226:        all the RRs have been (dynamically) deleted, or else we have
                    227:        come across a name which has no RRs associated with it because
                    228:        it is just a place holder (e.g., EDU).  In the former case, we
                    229:        would like to delete the namebuf, since it is no longer of use,
                    230:        but in the latter case we need to hold on to it, so future
                    231:        lookups that depend on it don't fail.  The only way I can see
                    232:        of doing this is to always leave the namebufs around (although
                    233:        then the memory usage continues to grow whenever names are
                    234:        added, and can never shrink back down completely when all their
                    235:        associated RRs are deleted).
                    236:  
                    237:    Thus, there is a problem that the memory usage will keep growing for
                    238:    the situation described.  You might just choose to ignore this
                    239:    problem (since I don't see any good way out), since things probably
                    240:    won't grow fast anyway (how many names are created and then deleted
                    241:    during a single server incarnation, after all?)
                    242:  
                    243:    The problem is that one can't delete old namebufs because one would
                    244:    want to do it from db_update, but db_update calls nlookup to do the
                    245:    actual work, and can't do it there, since we need to maintain place
                    246:    holders.  One could make db_update not call nlookup, so we know it's
                    247:    ok to delete the namebuf (since we know the call is part of a delete
                    248:    call); but then there is code with a lot of overlapping functionality
                    249:    in the 2 routines.
                    250:  
                    251:    This also causes another problem:  If you create a name and then do
                    252:    UPDATEDA, all its RRs get deleted, but the name remains; then, if you
                    253:    do a lookup on that name later, the name is found in the hash table,
                    254:    but no RRs are found for it.  It then forwards the query to itself (for
                    255:    some reason), and then somehow decides there is no such domain, and then
                    256:    returns (with the correct answer, but after going through extra work).
                    257:    But the name remains, and each time it is looked up, we go through
                    258:    these same steps.  This should be fixed, but I don't have time right
                    259:    now (and the right answer seems to come back anyway, so it's good
                    260:    enough for now).
                    261:  
                    262: 2. There are 2 problems that crop up when you store data (other than
                    263:    T_SOA and T_NS records) in the root:
                    264:    a. Can't get primary to doaxfr RRs other than SOA and NS to
                    265:       secondary.
                    266:    b. Upon checkpoint (zonedump), this data sometimes comes out after other
                    267:       data in the root, so that (since the SOA and NS records have null
                    268:       names), they will get interpreted as being records under the
                    269:       other names upon the next boot up.  For example, if you have a
                    270:       T_A record called ABC, the checkpoint may look like:
                    271:         $ORIGIN .
                    272:         ABC     IN      A       128.95.1.3
                    273:         99999999        IN      NS      UW-BORNEO.
                    274:         IN      SOA     UW-BORNEO. SCHWARTZ.CS.WASHINGTON.EDU.
                    275:         ( 50 3600 300 3600000 3600 )
                    276:       Then when booting up the next time, the SOA and NS records get
                    277:       interpreted as being called "ABC" rather than the null root
                    278:       name.
                    279:  
                    280: 3. The secondary server caches the T_A RR for the primary, and hence when
                    281:    it tries to ns_forw an update, it won't find the address of the primary
                    282:    using nslookup unless that T_A RR is *also* stored in the main hashtable
                    283:    (by putting it in a named.db file as well as the named.ca file).
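
Defect 2b comes down to how a loader picks the owner name for a record whose name field is empty. The sketch below assumes the usual master-file convention that a line beginning with white space inherits the owner of the previous record. The function owner_for_line is hypothetical and much simpler than BIND's actual loader (it ignores $ORIGIN, comments, and parenthesized continuation lines).

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Work out the owner name for one zone-file line.
 * A line that is empty or starts with white space has no name field,
 * so it inherits last_owner.  Otherwise the first token becomes the
 * new owner and is copied into last_owner (capacity cap, at least 1).
 * Returns last_owner in either case.
 */
const char *owner_for_line(const char *line, char *last_owner, size_t cap)
{
    size_t n = 0;

    if (line[0] == '\0' || line[0] == ' ' || line[0] == '\t')
        return last_owner;         /* blank name field: inherit */

    while (line[n] != '\0' && !isspace((unsigned char)line[n]) &&
           n + 1 < cap) {
        last_owner[n] = line[n];   /* copy the explicit owner name */
        n++;
    }
    last_owner[n] = '\0';
    return last_owner;
}
```

Fed the checkpoint shown in 2b, with the NS and SOA lines carrying an empty name field, the function returns "ABC" for all three records, which is the misinterpretation described there.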
                    284:  
