1: .\" Copyright (c) 1983, 1986 The Regents of the University of California.
2: .\" All rights reserved.
3: .\"
4: .\" Redistribution and use in source and binary forms are permitted
5: .\" provided that the above copyright notice and this paragraph are
6: .\" duplicated in all such forms and that any documentation,
7: .\" advertising materials, and other materials related to such
8: .\" distribution and use acknowledge that the software was developed
9: .\" by the University of California, Berkeley. The name of the
10: .\" University may not be used to endorse or promote products derived
11: .\" from this software without specific prior written permission.
12: .\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
13: .\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
14: .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
15: .\"
16: .\" @(#)a.t 6.4 (Berkeley) 3/7/89
17: .\"
18: .nr H2 1
19: .\".ds RH "Gateways and routing
20: .br
21: .ne 2i
22: .NH
23: \s+2Gateways and routing issues\s0
24: .PP
25: The system has been designed with the expectation that it will
26: be used in an internetwork environment. The ``canonical''
27: environment was envisioned to be a collection of local area
28: networks connected at one or more points through hosts with
29: multiple network interfaces (one on each local area network),
30: and possibly a connection to a long haul network (for example,
31: the ARPANET). In such an environment, issues of
32: gatewaying and packet routing become very important. Certain
33: of these issues, such as congestion
34: control, have been handled in a simplistic manner or specifically
35: not addressed.
36: Instead, where possible, the network system
37: attempts to provide simple mechanisms upon which more involved
38: policies may be implemented. As some of these problems become
39: better understood, the solutions developed will be incorporated
40: into the system.
41: .PP
42: This section will describe the facilities provided for packet
43: routing. The simplistic mechanisms provided for congestion
44: control are described in chapter 12.
45: .NH 2
46: Routing tables
47: .PP
48: The network system maintains a set of routing tables for
49: selecting a network interface to use in delivering a
50: packet to its destination. These tables are of the form:
51: .DS
52: .ta \w'struct 'u +\w'u_long 'u +\w'sockaddr rt_gateway; 'u
53: struct rtentry {
54: u_long rt_hash; /* hash key for lookups */
55: struct sockaddr rt_dst; /* destination net or host */
56: struct sockaddr rt_gateway; /* forwarding agent */
57: short rt_flags; /* see below */
58: short rt_refcnt; /* no. of references to structure */
59: u_long rt_use; /* packets sent using route */
60: struct ifnet *rt_ifp; /* interface to give packet to */
61: };
62: .DE
63: .PP
64: The routing information is organized in two separate tables, one
65: for routes to a host and one for routes to a network. The
66: distinction between hosts and networks is necessary so
67: that a single mechanism may be used
68: for both broadcast and multi-drop type networks, and
69: also for networks built from point-to-point links (e.g.,
70: DECnet [DEC80]).
71: .PP
72: Each table is organized as a hashed set of linked lists.
73: Two 32-bit hash values are calculated by routines defined for
74: each address family; one based on the destination being
75: a host, and one assuming the target is the network portion
76: of the address. Each hash value is used to
77: locate a hash chain to search (by taking the value modulo the
78: hash table size) and the entire 32-bit value is then
79: used as a key in scanning the list of routes. Lookups are
80: applied first to the routing
81: table for hosts, then to the routing table for networks.
82: If both lookups fail, a final lookup is made for a ``wildcard''
83: route (by convention, network 0).
84: The first appropriate route discovered is used.
85: By doing this, routes to a specific host on a network may be
86: present as well as routes to the network. This also allows a
87: ``fall back'' network route to be defined to a ``smart'' gateway
88: which may then perform more intelligent routing.
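.PP
The lookup order described above can be sketched as follows.
This is an illustration only, not the kernel code: the names
\fIlk_route\fP, \fIlk_scan\fP, \fIlk_lookup\fP and \fIlk_insert\fP,
the table size, and the use of the address itself as its hash value
are all assumptions made to keep the example small.
In the system proper the hash values come from routines supplied by
each address family.
.DS
```c
#include <stddef.h>

#define LK_HASHSIZ 8                /* assumed table size, illustration only */

struct lk_route {                   /* simplified stand-in for struct rtentry */
    unsigned long hash;             /* full hash value, used as the search key */
    unsigned long dst;              /* destination host or network */
    struct lk_route *next;          /* next entry on the same hash chain */
};

struct lk_route *lk_hosts[LK_HASHSIZ];  /* routes to hosts */
struct lk_route *lk_nets[LK_HASHSIZ];   /* routes to networks */

/* Link a route at the head of its chain, so newer entries are found first. */
void lk_insert(struct lk_route **table, struct lk_route *rt)
{
    struct lk_route **head = &table[rt->hash % LK_HASHSIZ];

    rt->next = *head;
    *head = rt;
}

/* Pick a chain by hash modulo table size, then compare the full key. */
struct lk_route *lk_scan(struct lk_route **table, unsigned long hash,
                         unsigned long dst)
{
    struct lk_route *rt;

    for (rt = table[hash % LK_HASHSIZ]; rt != NULL; rt = rt->next)
        if (rt->hash == hash && rt->dst == dst)
            return rt;
    return NULL;
}

/* Host table first, then network table, then the wildcard (network 0). */
struct lk_route *lk_lookup(unsigned long host, unsigned long net)
{
    struct lk_route *rt;

    if ((rt = lk_scan(lk_hosts, host, host)) != NULL)
        return rt;
    if ((rt = lk_scan(lk_nets, net, net)) != NULL)
        return rt;
    return lk_scan(lk_nets, 0, 0);
}
```
.DE
.PP
With a host route, a network route and a wildcard route installed,
a destination that matches none of the first two falls through to the
wildcard entry, which is the ``fall back'' behavior described above.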
89: .PP
90: Each routing table entry contains a destination (the desired final destination),
91: a gateway to which to send the packet,
92: and various flags which indicate the route's status and type (host or
93: network). A count
94: of the number of packets sent using the route is kept, along
95: with a count of ``held references'' to the dynamically
96: allocated structure to ensure that memory reclamation
97: occurs only when the route is not in use. Finally, a pointer to
98: a network interface is kept; packets sent using
99: the route should be handed to this interface.
100: .PP
101: Routes are typed in two ways: either as host or network, and as
102: ``direct'' or ``indirect''. The host/network
103: distinction determines how to compare the \fIrt_dst\fP field
104: during lookup. If the route is to a network, only a packet's
105: destination network is compared to the \fIrt_dst\fP entry stored
106: in the table. If the route is to a host, the addresses must
107: match bit for bit.
108: .PP
109: The distinction between ``direct'' and ``indirect'' routes indicates
110: whether the destination is directly connected to the source.
111: This is needed when performing local network encapsulation. If
112: a packet is destined for a peer at a host or network which is
113: not directly connected to the source, the internetwork packet
114: header will
115: contain the address of the eventual destination, while
116: the local network header will address the intervening
117: gateway. Should the destination be directly connected, these addresses
118: are likely to be identical, or a mapping between the two exists.
119: The RTF_GATEWAY flag indicates that the route is to an ``indirect''
120: gateway agent, and that the local network header should be filled in
121: from the \fIrt_gateway\fP field instead of
122: from the final internetwork destination address.
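.PP
The effect of the RTF_GATEWAY flag on local network encapsulation
can be sketched as below.
The structure \fIhop_route\fP, the function \fIhop_link_dest\fP, and
the representation of addresses as plain integers are assumptions of
the sketch; the flag value is defined locally so that the fragment
stands on its own.
.DS
```c
#define RTF_GATEWAY 0x2     /* defined locally for this sketch */

struct hop_route {          /* simplified stand-in for struct rtentry */
    unsigned long dst;      /* destination net or host */
    unsigned long gateway;  /* forwarding agent */
    int flags;
};

/*
 * Return the address the local network header should carry.
 * The internetwork header always carries final_dst.
 */
unsigned long hop_link_dest(const struct hop_route *rt,
                            unsigned long final_dst)
{
    if (rt->flags & RTF_GATEWAY)
        return rt->gateway;     /* indirect: address the intervening gateway */
    return final_dst;           /* direct: address the destination itself */
}
```
.DE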
123: .PP
124: It is assumed that multiple routes to the same destination will not
125: be present; if several are installed, only the most recently installed
126: is used.
127: .PP
128: Routing redirect control messages are used to dynamically
129: modify existing routing table entries as well as dynamically
130: create new routing table entries. On hosts where exhaustive
131: routing information is too expensive to maintain (e.g., workstations),
132: the
133: combination of wildcard routing entries and routing redirect
134: messages can be used to provide a simple routing management
135: scheme without the use of a higher level policy process.
136: Current connections may be rerouted after notification of the protocols
137: by means of their \fIpr_ctlinput\fP entries.
138: Statistics are kept by the routing table routines
139: on the use of routing redirect messages and their
140: effect on the routing tables. These statistics may be viewed using
141: .IR netstat (1).
142: .PP
143: Status information other than routing redirect control messages
144: may be used in the future, but at present it is ignored.
145: Likewise, more intelligent ``metrics'' may be used to describe
146: routes in the future, possibly based on bandwidth and monetary
147: costs.
148: .NH 2
149: Routing table interface
150: .PP
151: A protocol accesses the routing tables through
152: three routines,
153: one to allocate a route, one to free a route, and one
154: to process a routing redirect control message.
155: The routine \fIrtalloc\fP performs route allocation; it is
156: called with a pointer to the following structure containing
157: the desired destination:
158: .DS
159: ._f
160: struct route {
161: struct rtentry *ro_rt;
162: struct sockaddr ro_dst;
163: };
164: .DE
165: The route returned is assumed ``held'' by the caller until
166: released with an \fIrtfree\fP call. Protocols which implement
167: virtual circuits, such as TCP, hold onto routes for the duration
168: of the circuit's lifetime, while connectionless protocols,
169: such as UDP, allocate and free routes whenever their destination address
170: changes.
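.PP
The ``held'' convention amounts to reference counting, which may be
sketched as follows.
The names \fIref_rtentry\fP, \fIref_route\fP, \fIref_hold\fP and
\fIref_release\fP are hypothetical, as are the \fIin_table\fP and
\fIfreed\fP fields; the latter merely records that the memory would
have been reclaimed.
The sketch assumes that an entry still linked into a routing table is
never reclaimed, whatever its count.
.DS
```c
#include <stddef.h>

struct ref_rtentry {
    short refcnt;       /* no. of references to structure */
    int in_table;       /* assumed: still linked into a routing table */
    int freed;          /* stands in for returning the memory */
};

struct ref_route {
    struct ref_rtentry *ro_rt;
};

/* Take a reference on an already located entry, as an allocation would. */
void ref_hold(struct ref_route *ro, struct ref_rtentry *rt)
{
    rt->refcnt++;
    ro->ro_rt = rt;
}

/* Drop the caller's reference; reclaim only when no holder remains. */
void ref_release(struct ref_route *ro)
{
    struct ref_rtentry *rt = ro->ro_rt;

    if (rt == NULL)
        return;
    ro->ro_rt = NULL;
    rt->refcnt--;
    if (rt->refcnt <= 0 && !rt->in_table)
        rt->freed = 1;
}
```
.DE
.PP
A virtual circuit protocol would call the hold routine once when the
circuit is set up and the release routine when it is torn down; a
connectionless protocol would release and hold again each time its
destination address changes.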
171: .PP
172: The routine \fIrtredirect\fP is called to process a routing redirect
173: control message. It is called with a destination address,
174: the new gateway to that destination, and the source of the redirect.
175: Redirects are accepted only from the current router for the destination.
176: If a non-wildcard route
177: exists to the destination, the gateway entry in the route is modified
178: to point at the new gateway supplied. Otherwise, a new routing
179: table entry is inserted reflecting the information supplied. Routes
180: to interfaces and routes to gateways which are not directly accessible
181: from the host are ignored.
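.PP
The rules applied to a redirect can be sketched as below.
This is a simplified model and not the \fIrtredirect\fP source: the
flat array \fIrd_table\fP, the names \fIrd_entry\fP, \fIrd_find\fP and
\fIrd_redirect\fP, and the \fIreachable\fP argument (which stands for
the test that the new gateway is directly accessible) are assumptions.
The check for routes to interfaces is left out.
.DS
```c
#include <stddef.h>

#define RD_MAX 8                    /* assumed table capacity */

struct rd_entry {
    unsigned long dst;              /* destination */
    unsigned long gateway;          /* current router for dst */
    int wildcard;                   /* nonzero for the wildcard route */
};

struct rd_entry rd_table[RD_MAX];
int rd_count;

/* Find the route now in use for dst: a specific entry, else the wildcard. */
struct rd_entry *rd_find(unsigned long dst)
{
    struct rd_entry *wild = NULL;
    int i;

    for (i = 0; i < rd_count; i++) {
        if (!rd_table[i].wildcard && rd_table[i].dst == dst)
            return &rd_table[i];
        if (rd_table[i].wildcard)
            wild = &rd_table[i];
    }
    return wild;
}

/* Returns 1 if the tables were changed, 0 if the redirect was ignored. */
int rd_redirect(unsigned long dst, unsigned long gateway,
                unsigned long src, int reachable)
{
    struct rd_entry *rt = rd_find(dst);

    if (rt == NULL || rt->gateway != src)
        return 0;                   /* not from the current router */
    if (!reachable)
        return 0;                   /* new gateway not directly accessible */
    if (!rt->wildcard) {
        rt->gateway = gateway;      /* modify the existing route */
        return 1;
    }
    if (rd_count == RD_MAX)
        return 0;                   /* no room in this toy table */
    rd_table[rd_count].dst = dst;   /* only a wildcard matched: add a route */
    rd_table[rd_count].gateway = gateway;
    rd_table[rd_count].wildcard = 0;
    rd_count++;
    return 1;
}
```
.DE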
182: .NH 2
183: User level routing policies
184: .PP
185: Routing policies implemented in user processes manipulate the
186: kernel routing tables through two \fIioctl\fP calls. The
187: commands SIOCADDRT and SIOCDELRT add and delete routing entries,
188: respectively; the tables are read through the /dev/kmem device.
189: The decision to place policy decisions in a user process implies
190: that routing table updates may lag a bit behind the identification of
191: new routes, or the failure of existing routes, but this period
192: of instability is normally very short with proper implementation
193: of the routing process. Advisory information, such as ICMP
194: error messages and IMP diagnostic messages, may be read from
195: raw sockets (described in the next section).
196: .PP
197: Several routing policy processes have already been implemented. The
198: system standard
199: ``routing daemon'' uses a variant of the Xerox NS Routing Information
200: Protocol [Xerox82] to maintain up-to-date routing tables in our local
201: environment. Interaction with other existing routing protocols,
202: such as the Internet EGP (Exterior Gateway Protocol), has been
203: accomplished using a similar process.