Annotation of 43BSDReno/share/doc/smm/15.net/6.t, revision 1.1.1.1

1.1       root        1: .\" Copyright (c) 1983, 1986 The Regents of the University of California.
                      2: .\" All rights reserved.
                      3: .\"
                      4: .\" Redistribution and use in source and binary forms are permitted
                      5: .\" provided that the above copyright notice and this paragraph are
                      6: .\" duplicated in all such forms and that any documentation,
                      7: .\" advertising materials, and other materials related to such
                      8: .\" distribution and use acknowledge that the software was developed
                      9: .\" by the University of California, Berkeley.  The name of the
                     10: .\" University may not be used to endorse or promote products derived
                     11: .\" from this software without specific prior written permission.
                     12: .\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
                     13: .\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
                     14: .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
                     15: .\"
                     16: .\"    @(#)6.t 6.4 (Berkeley) 3/7/89
                     17: .\"
                     18: .nr H2 1
                     19: .\".ds RH "Internal layering
                     20: .br
                     21: .ne 2i
                     22: .NH
                     23: \s+2Internal layering\s0
                     24: .PP
                     25: The internal structure of the network system is divided into
                     26: three layers.  These
                     27: layers correspond to the services provided by the socket
                     28: abstraction, those provided by the communication protocols,
                     29: and those provided by the hardware interfaces.  The communication
                     30: protocols are normally layered into two or more individual
                     31: cooperating layers, though they are collectively viewed
                     32: in the system as one layer providing services supportive
                     33: of the appropriate socket abstraction.
                     34: .PP
                     35: The following sections describe the properties of each layer
                     36: in the system and the interfaces to which each must conform.
                     37: .NH 2
                     38: Socket layer
                     39: .PP
                     40: The socket layer deals with the interprocess communication
                     41: facilities provided by the system.  A socket is a bidirectional
                     42: endpoint of communication which is ``typed'' by the semantics
                     43: of communication it supports.  The system calls described in
                     44: the \fIBerkeley Software Architecture Manual\fP [Joy86]
                     45: are used to manipulate sockets.
                     46: .PP
                     47: A socket consists of the following data structure:
                     48: .DS
                     49: ._f
                     50: struct socket {
                     51:        short   so_type;                /* generic type */
                     52:        short   so_options;             /* from socket call */
                     53:        short   so_linger;              /* time to linger while closing */
                     54:        short   so_state;               /* internal state flags */
                     55:        caddr_t so_pcb;                 /* protocol control block */
                     56:        struct  protosw *so_proto;      /* protocol handle */
                     57:        struct  socket *so_head;        /* back pointer to accept socket */
                     58:        struct  socket *so_q0;          /* queue of partial connections */
                     59:        short   so_q0len;               /* partials on so_q0 */
                     60:        struct  socket *so_q;           /* queue of incoming connections */
                     61:        short   so_qlen;                /* number of connections on so_q */
                     62:        short   so_qlimit;              /* max number queued connections */
                     63:        struct  sockbuf so_rcv;         /* receive queue */
                     64:        struct  sockbuf so_snd;         /* send queue */
                     65:        short   so_timeo;               /* connection timeout */
                     66:        u_short so_error;               /* error affecting connection */
                     67:        u_short so_oobmark;             /* chars to oob mark */
                     68:        short   so_pgrp;                /* pgrp for signals */
                     69: };
                     70: .DE
                     71: .PP
                     72: Each socket contains two data queues, \fIso_rcv\fP and \fIso_snd\fP,
                     73: and a pointer to routines which provide supporting services. 
                     74: The type of the socket,
                     75: \fIso_type\fP, is defined at socket creation time and used in selecting
                     76: those services which are appropriate to support it.  The supporting
                     77: protocol is selected at socket creation time and recorded in
                     78: the socket data structure for later use.  Protocols are defined
                     79: by a table of procedures, the \fIprotosw\fP structure, which will
                     80: be described in detail later.  A pointer to a protocol-specific
                     81: data structure,
                     82: the ``protocol control block,'' is also present in the socket structure.
                     83: Protocols control this data structure, which normally includes a
                     84: back pointer to the parent socket structure to allow easy
                     85: lookup when returning information to a user 
                     86: (for example, placing an error number in the \fIso_error\fP
                     87: field).  The other entries in the socket structure are used in
                     88: queuing connection requests, validating user requests, storing
                     89: socket characteristics (e.g.
                     90: options supplied at the time a socket is created), and maintaining
                     91: a socket's state.
                     92: .PP
                     93: Processes ``rendezvous at a socket'' in many instances.  For instance,
                     94: when a process wishes to extract data from a socket's receive queue
                     95: and it is empty, or lacks sufficient data to satisfy the request,
                     96: the process blocks, supplying the address of the receive queue as
                     97: a ``wait channel'' to be used in notification.  When data arrives
                     98: for the process and is placed in the socket's queue, the blocked
                     99: process is identified by the fact that it is waiting ``on the queue.''
                    100: .NH 3
                    101: Socket state
                    102: .PP
                    103: A socket's state is defined from the following:
                    104: .DS
                    105: .ta \w'#define 'u +\w'SS_ISDISCONNECTING    'u +\w'0x000     'u
                    106: #define        SS_NOFDREF      0x001   /* no file table ref any more */
                    107: #define        SS_ISCONNECTED  0x002   /* socket connected to a peer */
                    108: #define        SS_ISCONNECTING 0x004   /* in process of connecting to peer */
                    109: #define        SS_ISDISCONNECTING      0x008   /* in process of disconnecting */
                    110: #define        SS_CANTSENDMORE 0x010   /* can't send more data to peer */
                    111: #define        SS_CANTRCVMORE  0x020   /* can't receive more data from peer */
                    112: #define        SS_RCVATMARK    0x040   /* at mark on input */
                    113: 
                    114: #define        SS_PRIV 0x080   /* privileged */
                    115: #define        SS_NBIO 0x100   /* non-blocking ops */
                    116: #define        SS_ASYNC        0x200   /* async i/o notify */
                    117: .DE
                    118: .PP
                    119: The state of a socket is manipulated both by the protocols
                    120: and the user (through system calls).
                    121: When a socket is created, the state is defined based on the type of socket.
                    122: It may change as control actions are performed, for example connection
                    123: establishment.
                    124: It may also change according to the type of
                    125: input/output the user wishes to perform, as indicated by options
                    126: set with \fIfcntl\fP.  ``Non-blocking'' I/O  implies that
                    127: a process should never be blocked to await resources.  Instead, any
                    128: call which would block returns prematurely
                    129: with the error EWOULDBLOCK, or the service request may be partially
                    130: fulfilled, e.g. a request for more data than is present.
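.PP
The decision described above can be sketched as follows.
This is an illustration only, not the kernel's receive routine:
the structure \fImini_socket\fP, the field \fIrcv_cc\fP, the routine
\fImini_rcv_action\fP and the \fIrcv_action\fP values are all hypothetical
names introduced here; only the two state bits come from the table above.

```c
#include <assert.h>
#include <stddef.h>

/* State bits copied from the table above. */
#define SS_CANTRCVMORE  0x020   /* can't receive more data from peer */
#define SS_NBIO         0x100   /* non-blocking ops */

/* Hypothetical, cut-down socket: only what this sketch needs. */
struct mini_socket {
    short          so_state;    /* internal state flags */
    unsigned short rcv_cc;      /* stands in for so_rcv.sb_cc */
};

/* Hypothetical outcomes of a receive request. */
enum rcv_action {
    RCV_DELIVER,        /* hand over *avail characters now */
    RCV_EOF,            /* nothing queued and no more can arrive */
    RCV_WOULDBLOCK,     /* caller gets the error EWOULDBLOCK */
    RCV_SLEEP           /* caller blocks on the receive queue */
};

/*
 * Decide what a request for `want' characters should do.  A request
 * for more data than is present is partially fulfilled; an empty
 * queue either fails at once (non-blocking) or puts the caller to sleep.
 */
enum rcv_action
mini_rcv_action(const struct mini_socket *so, unsigned short want,
    unsigned short *avail)
{
    assert(so != NULL && avail != NULL);
    *avail = 0;
    if (so->rcv_cc > 0) {
        *avail = so->rcv_cc < want ? so->rcv_cc : want;
        return (RCV_DELIVER);
    }
    if (so->so_state & SS_CANTRCVMORE)
        return (RCV_EOF);
    if (so->so_state & SS_NBIO)
        return (RCV_WOULDBLOCK);
    return (RCV_SLEEP);
}
```

With SS_NBIO set and an empty queue the sketch yields RCV_WOULDBLOCK;
with four characters queued and ten requested it delivers four.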
                    131: .PP
                    132: If a process requested ``asynchronous'' notification of events
                    133: related to the socket, the SIGIO signal is posted to the process
                    134: when such events occur.
                    135: An event is a change in the socket's state;
                    136: examples of such occurrences are: space
                    137: becoming available in the send queue, new data available in the
                    138: receive queue, connection establishment or disestablishment, etc. 
                    139: .PP
                    140: A socket may be marked ``privileged'' if it was created by the
                    141: super-user.  Only privileged sockets may
                    142: bind addresses in privileged portions of an address space
                    143: or use ``raw'' sockets to access lower levels of the network.
                    144: .NH 3
                    145: Socket data queues
                    146: .PP
                    147: A socket's data queue contains a pointer to the data stored in
                    148: the queue and other entries related to the management of
                    149: the data.  The following structure defines a data queue:
                    150: .DS
                    151: ._f
                    152: struct sockbuf {
                    153:        u_short sb_cc;          /* actual chars in buffer */
                    154:        u_short sb_hiwat;       /* max actual char count */
                    155:        u_short sb_mbcnt;       /* chars of mbufs used */
                    156:        u_short sb_mbmax;       /* max chars of mbufs to use */
                    157:        u_short sb_lowat;       /* low water mark */
                    158:        short   sb_timeo;       /* timeout */
                    159:        struct  mbuf *sb_mb;    /* the mbuf chain */
                    160:        struct  proc *sb_sel;   /* process selecting read/write */
                    161:        short   sb_flags;       /* flags, see below */
                    162: };
                    163: .DE
                    164: .PP
                    165: Data is stored in a queue as a chain of mbufs.
                    166: The actual count of data characters as well as high and low water marks are
                    167: used by the protocols in controlling the flow of data.
                    168: The amount of buffer space (characters of mbufs and associated data pages)
                    169: is also recorded along with the limit on buffer allocation.
                    170: The socket routines cooperate in implementing the flow control
                    171: policy by blocking a process when it requests to send data and
                    172: the high water mark has been reached, or when it requests to
                    173: receive data and less than the low water mark is present
                    174: (assuming non-blocking I/O has not been specified).*
                    175: .FS
                    176: * The low-water mark is always presumed to be 0
                    177: in the current implementation.
                    178: .FE
                    179: .PP
                    180: When a socket is created, the supporting protocol ``reserves'' space
                    181: for the send and receive queues of the socket.
                    182: The limit on buffer allocation is set somewhat higher than the limit
                    183: on data characters
                    184: to account for the granularity of buffer allocation.
                    185: The actual storage associated with a
                    186: socket queue may fluctuate during a socket's lifetime, but it is assumed
                    187: that this reservation will always allow a protocol to acquire enough memory
                    188: to satisfy the high water marks.
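.PP
The two limits can be combined into a single figure for the room left in a queue.
The sketch below is a simplified illustration under stated assumptions:
\fImini_sockbuf\fP, \fImini_sbspace\fP and \fImini_send_must_wait\fP are
hypothetical names, only four of the \fIsockbuf\fP fields are kept,
and the rule that a sender waits exactly when no room remains is an assumption
made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the sockbuf fields shown above. */
struct mini_sockbuf {
    unsigned short sb_cc;       /* actual chars in buffer */
    unsigned short sb_hiwat;    /* max actual char count */
    unsigned short sb_mbcnt;    /* chars of mbufs used */
    unsigned short sb_mbmax;    /* max chars of mbufs to use */
};

/*
 * Room left in the queue: the smaller of the room under the limit on
 * data characters and the room under the limit on buffer allocation,
 * never less than zero.
 */
int
mini_sbspace(const struct mini_sockbuf *sb)
{
    int by_chars, by_mbufs, space;

    assert(sb != NULL);
    by_chars = (int)sb->sb_hiwat - (int)sb->sb_cc;
    by_mbufs = (int)sb->sb_mbmax - (int)sb->sb_mbcnt;
    space = by_chars < by_mbufs ? by_chars : by_mbufs;
    return (space > 0 ? space : 0);
}

/* Nonzero when a sender would have to wait (assumed rule: no room left). */
int
mini_send_must_wait(const struct mini_sockbuf *sb)
{
    return (mini_sbspace(sb) == 0);
}
```

Because the buffer-allocation limit is set above the data limit,
the data limit is normally the one that governs;
the allocation limit takes over only when many small buffers are in use.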
                    189: .PP
                    190: The timeout and select values are manipulated by the socket routines
                    191: in implementing various portions of the interprocess communications
                    192: facilities and will not be described here.
                    193: .PP
                    194: Data queued at a socket is stored in one of two styles.
                    195: Stream-oriented sockets queue data with no addresses, headers
                    196: or record boundaries.
                    197: The data are in mbufs linked through the \fIm_next\fP field.
                    198: Buffers containing access rights may be present within the chain
                    199: if the underlying protocol supports passage of access rights.
                    200: Record-oriented sockets, including datagram sockets,
                    201: queue data as a list of packets; the sections of packets are distinguished
                    202: by the types of the mbufs containing them.
                    203: The mbufs which comprise a record are linked through the \fIm_next\fP field;
                    204: records are linked from the \fIm_act\fP field of the first mbuf
                    205: of one packet to the first mbuf of the next.
                    206: Each packet begins with an mbuf containing the ``from'' address
                    207: if the protocol provides it,
                    208: then any buffers containing access rights, and finally any buffers
                    209: containing data.
                    210: If a record contains no data,
                    211: no data buffers are required unless neither address nor access rights
                    212: are present.
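.PP
The record layout just described can be walked as in the following sketch.
It uses a hypothetical, cut-down buffer structure, \fImini_mbuf\fP,
with made-up type names; only the roles of \fIm_next\fP and \fIm_act\fP
are taken from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer types for this sketch only. */
enum mini_mtype { MT_ADDR_SKETCH, MT_RIGHTS_SKETCH, MT_DATA_SKETCH };

/* Hypothetical, cut-down mbuf. */
struct mini_mbuf {
    struct mini_mbuf *m_next;   /* next buffer in the same record */
    struct mini_mbuf *m_act;    /* first buffer of the next record */
    enum mini_mtype   m_type;   /* what this buffer holds */
    int               m_len;    /* characters held in this buffer */
};

/* Count records by following m_act from the first mbuf of each record. */
int
mini_count_records(const struct mini_mbuf *head)
{
    int n = 0;

    for (; head != NULL; head = head->m_act)
        n++;
    return (n);
}

/* Sum the data characters of one record, skipping address and rights. */
int
mini_record_datalen(const struct mini_mbuf *rec)
{
    const struct mini_mbuf *m;
    int len = 0;

    for (m = rec; m != NULL; m = m->m_next) {
        assert(m->m_len >= 0);
        if (m->m_type == MT_DATA_SKETCH)
            len += m->m_len;
    }
    return (len);
}
```

Only the first buffer of a record carries a meaningful \fIm_act\fP pointer in this sketch,
which is why the record count never follows \fIm_next\fP.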
                    213: .PP
                    214: A socket queue has a number of flags used in synchronizing access
                    215: to the data and in acquiring resources:
                    216: .DS
                    217: ._d
                    218: #define        SB_LOCK 0x01    /* lock on data queue (so_rcv only) */
                    219: #define        SB_WANT 0x02    /* someone is waiting to lock */
                    220: #define        SB_WAIT 0x04    /* someone is waiting for data/space */
                    221: #define        SB_SEL  0x08    /* buffer is selected */
                    222: #define        SB_COLL 0x10    /* collision selecting */
                    223: .DE
                    224: The last two flags are manipulated by the system in implementing
                    225: the select mechanism.
                    226: .NH 3
                    227: Socket connection queuing
                    228: .PP
                    229: In dealing with connection-oriented sockets (e.g. SOCK_STREAM)
                    230: the two ends are considered distinct.  One end is termed
                    231: \fIactive\fP, and generates connection requests.  The other
                    232: end is called \fIpassive\fP and accepts connection requests.
                    233: .PP
                    234: From the passive side, a socket is marked with
                    235: SO_ACCEPTCONN when a \fIlisten\fP call is made, 
                    236: creating two queues of sockets: \fIso_q0\fP for connections
                    237: in progress and \fIso_q\fP for connections already made and
                    238: awaiting user acceptance.
                    239: As a protocol is preparing incoming connections, it creates
                    240: a socket structure queued on \fIso_q0\fP by calling the routine
                    241: \fIsonewconn\fP().  When the connection
                    242: is established, the socket structure is then transferred
                    243: to \fIso_q\fP, making it available for an \fIaccept\fP.
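.PP
The movement of a socket from \fIso_q0\fP to \fIso_q\fP might be sketched as below.
This is not the code of \fIsonewconn\fP; the structure \fIconn_socket\fP
and the routines \fIconn_new\fP, \fIconn_established\fP and \fIconn_accept\fP
are hypothetical.
Two further assumptions are made: queued sockets are chained through their own
\fIso_q0\fP and \fIso_q\fP fields, and a new connection is refused once the two
queues together hold \fIso_qlimit\fP entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the socket fields used for connection queuing. */
struct conn_socket {
    struct conn_socket *so_head;    /* back pointer to accept socket */
    struct conn_socket *so_q0;      /* queue of partial connections */
    struct conn_socket *so_q;       /* queue of incoming connections */
    short so_q0len, so_qlen, so_qlimit;
};

/* Append so to the tail of head's partial (q == 0) or completed queue. */
static void
conn_insque(struct conn_socket *head, struct conn_socket *so, int q)
{
    struct conn_socket *prev = head;

    assert(head != NULL && so != NULL);
    so->so_head = head;
    if (q == 0) {
        while (prev->so_q0 != NULL)
            prev = prev->so_q0;
        prev->so_q0 = so;
        so->so_q0 = NULL;
        head->so_q0len++;
    } else {
        while (prev->so_q != NULL)
            prev = prev->so_q;
        prev->so_q = so;
        so->so_q = NULL;
        head->so_qlen++;
    }
}

/* A connection request arrived: queue a new socket on so_q0. */
int
conn_new(struct conn_socket *head, struct conn_socket *so)
{
    if (head->so_q0len + head->so_qlen >= head->so_qlimit)
        return (-1);        /* assumed policy: refuse when full */
    conn_insque(head, so, 0);
    return (0);
}

/* The connection completed: move so from so_q0 to so_q. */
int
conn_established(struct conn_socket *so)
{
    struct conn_socket *head = so->so_head, *prev;

    for (prev = head; prev != NULL && prev->so_q0 != NULL;
        prev = prev->so_q0) {
        if (prev->so_q0 == so) {
            prev->so_q0 = so->so_q0;
            so->so_q0 = NULL;
            head->so_q0len--;
            conn_insque(head, so, 1);
            return (0);
        }
    }
    return (-1);            /* so was not on a partial queue */
}

/* An accept takes the first completed connection, if there is one. */
struct conn_socket *
conn_accept(struct conn_socket *head)
{
    struct conn_socket *so = head->so_q;

    if (so == NULL)
        return (NULL);
    head->so_q = so->so_q;
    so->so_q = NULL;
    so->so_head = NULL;
    head->so_qlen--;
    return (so);
}
```

A socket that is still on \fIso_q0\fP is never returned by \fIconn_accept\fP;
it becomes visible only after \fIconn_established\fP has moved it.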
                    244: .PP
                    245: If an SO_ACCEPTCONN socket is closed with sockets on either
                    246: \fIso_q0\fP or \fIso_q\fP, these sockets are dropped,
                    247: with notification to the peers as appropriate.
                    248: .NH 2
                    249: Protocol layer(s)
                    250: .PP
                    251: Each socket is created in a communications domain,
                    252: which usually implies both an addressing structure (address family)
                    253: and a set of protocols which implement various socket types within the domain
                    254: (protocol family).
                    255: Each domain is defined by the following structure:
                    256: .DS
                    257: .ta .5i +\w'struct  'u +\w'(*dom_externalize)();   'u
                    258: struct domain {
                    259:        int     dom_family;             /* PF_xxx */
                    260:        char    *dom_name;
                    261:        int     (*dom_init)();          /* initialize domain data structures */
                    262:        int     (*dom_externalize)();   /* externalize access rights */
                    263:        int     (*dom_dispose)();       /* dispose of internalized rights */
                    264:        struct  protosw *dom_protosw, *dom_protoswNPROTOSW;
                    265:        struct  domain *dom_next;
                    266: };
                    267: .DE
                    268: .PP
                    269: At boot time, each domain configured into the kernel
                    270: is added to a linked list of domains.
                    271: The initialization procedure of each domain is then called.
                    272: After that time, the domain structure is used to locate protocols
                    273: within the protocol family.
                    274: It may also contain procedure references
                    275: for externalization of access rights at the receiving socket
                    276: and the disposal of access rights that are not received.
                    277: .PP
                    278: Protocols are described by a set of entry points and certain
                    279: socket-visible characteristics, some of which are used in
                    280: deciding which socket type(s) they may support.  
                    281: .PP
                    282: An entry in the ``protocol switch'' table exists for each
                    283: protocol module configured into the system.  It has the following form:
                    284: .DS
                    285: .ta .5i +\w'struct  'u +\w'domain *pr_domain;    'u
                    286: struct protosw {
                    287:        short   pr_type;                /* socket type used for */
                    288:        struct  domain *pr_domain;      /* domain protocol a member of */
                    289:        short   pr_protocol;            /* protocol number */
                    290:        short   pr_flags;               /* socket visible attributes */
                    291: /* protocol-protocol hooks */
                    292:        int     (*pr_input)();          /* input to protocol (from below) */
                    293:        int     (*pr_output)();         /* output to protocol (from above) */
                    294:        int     (*pr_ctlinput)();       /* control input (from below) */
                    295:        int     (*pr_ctloutput)();      /* control output (from above) */
                    296: /* user-protocol hook */
                    297:        int     (*pr_usrreq)();         /* user request */
                    298: /* utility hooks */
                    299:        int     (*pr_init)();           /* initialization routine */
                    300:        int     (*pr_fasttimo)();       /* fast timeout (200ms) */
                    301:        int     (*pr_slowtimo)();       /* slow timeout (500ms) */
                    302:        int     (*pr_drain)();          /* flush any excess space possible */
                    303: };
                    304: .DE
                    305: .PP
                    306: A protocol is called through the \fIpr_init\fP entry before any other.
                    307: Thereafter it is called every 200 milliseconds through the
                    308: \fIpr_fasttimo\fP entry and
                    309: every 500 milliseconds through the \fIpr_slowtimo\fP entry for timer-based actions.
                    310: The system will call the \fIpr_drain\fP entry if it is low on space;
                    311: this routine should throw away any non-critical data.
                    312: .PP
                    313: Protocols pass data between themselves as chains of mbufs using
                    314: the \fIpr_input\fP and \fIpr_output\fP routines.  \fIPr_input\fP
                    315: passes data up (towards
                    316: the user) and \fIpr_output\fP passes it down (towards the network); control
                    317: information passes up and down on \fIpr_ctlinput\fP and \fIpr_ctloutput\fP.
                    318: The protocol is responsible for the space occupied by any of the
                    319: arguments to these entries and must either pass it onward or dispose of it.
                    320: (On output, the lowest level reached must free buffers storing the arguments;
                    321: on input, the highest level is responsible for freeing buffers.)
                    322: .PP
                    323: The \fIpr_usrreq\fP routine interfaces protocols to the socket
                    324: code and is described below.
                    325: .PP
                    326: The \fIpr_flags\fP field is constructed from the following values:
                    327: .DS
                    328: .ta \w'#define 'u +\w'PR_CONNREQUIRED   'u +8n
                    329: #define        PR_ATOMIC       0x01            /* exchange atomic messages only */
                    330: #define        PR_ADDR 0x02            /* addresses given with messages */
                    331: #define        PR_CONNREQUIRED 0x04            /* connection required by protocol */
                    332: #define        PR_WANTRCVD     0x08            /* want PRU_RCVD calls */
                    333: #define        PR_RIGHTS       0x10            /* passes capabilities */
                    334: .DE
                    335: Protocols which are connection-based specify the PR_CONNREQUIRED
                    336: flag so that the socket routines will never attempt to send data
                    337: before a connection has been established.  If the PR_WANTRCVD flag
                    338: is set, the socket routines will notify the protocol when the user
                    339: has removed data from the socket's receive queue.  This allows
                    340: the protocol to implement acknowledgement on user receipt, and
                    341: also update windowing information based on the amount of space
                    342: available in the receive queue.  The PR_ADDR flag indicates that any
                    343: data placed in the socket's receive queue will be preceded by the
                    344: address of the sender.  The PR_ATOMIC flag specifies that each \fIuser\fP
                    345: request to send data must be performed in a single \fIprotocol\fP send
                    346: request; it is the protocol's responsibility to maintain record
                    347: boundaries on data to be sent.  The PR_RIGHTS flag indicates that the
                    348: protocol supports the passing of capabilities;  this is currently
                    349: used only by the protocols in the UNIX protocol family.
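.PP
The way the socket routines might consult these flags before a send can be sketched as follows.
The routine \fImini_send_check\fP and its parameters are hypothetical,
and the choice of ENOTCONN and EMSGSIZE as the errors returned is an
assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <errno.h>

/* Flag and state bits copied from the tables in this section. */
#define PR_ATOMIC       0x01    /* exchange atomic messages only */
#define PR_CONNREQUIRED 0x04    /* connection required by protocol */
#define SS_ISCONNECTED  0x002   /* socket connected to a peer */

/*
 * Return 0 if a send of `len' characters may be handed to the
 * protocol, otherwise an error number.  `hiwat' stands in for the
 * high water mark of the send queue.
 */
int
mini_send_check(short pr_flags, short so_state, int len, int hiwat)
{
    assert(len >= 0 && hiwat >= 0);
    /* Connection-based protocols never see data before a connection. */
    if ((pr_flags & PR_CONNREQUIRED) && !(so_state & SS_ISCONNECTED))
        return (ENOTCONN);
    /* An atomic request must fit in one protocol send request. */
    if ((pr_flags & PR_ATOMIC) && len > hiwat)
        return (EMSGSIZE);
    return (0);
}
```

For example, a send on an unconnected socket whose protocol sets PR_CONNREQUIRED
is rejected here before the protocol is ever called.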
                    350: .PP
                    351: When a socket is created, the socket routines scan the protocol
                    352: table for the domain
                    353: looking for an appropriate protocol to support the type of
                    354: socket being created.  The \fIpr_type\fP field contains one of the
                    355: possible socket types (e.g. SOCK_STREAM), while the \fIpr_domain\fP
                    356: is a back pointer to the domain structure.
                    357: The \fIpr_protocol\fP field contains the protocol number of the
                    358: protocol, normally a well-known value.
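.PP
A minimal sketch of this scan is given below.
The structures \fImini_domain\fP and \fImini_protosw\fP keep only the fields
needed for the search, and the routine name \fImini_findtype\fP is hypothetical;
the numeric family and type values used with it are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

struct mini_domain;

/* Hypothetical subset of the protocol switch entry. */
struct mini_protosw {
    short               pr_type;        /* socket type used for */
    struct mini_domain *pr_domain;      /* domain protocol a member of */
    short               pr_protocol;    /* protocol number */
};

/* Hypothetical subset of the domain structure. */
struct mini_domain {
    int                  dom_family;    /* PF_xxx */
    struct mini_protosw *dom_protosw, *dom_protoswNPROTOSW;
    struct mini_domain  *dom_next;
};

/*
 * Find the first protocol in the given family that supports the
 * requested socket type.  Returns NULL if the family is not
 * configured or no protocol of that type exists.
 */
struct mini_protosw *
mini_findtype(struct mini_domain *domains, int family, int type)
{
    struct mini_domain *dp;
    struct mini_protosw *pr;

    /* Locate the domain on the linked list built at boot time. */
    for (dp = domains; dp != NULL; dp = dp->dom_next)
        if (dp->dom_family == family)
            break;
    if (dp == NULL)
        return (NULL);
    assert(dp->dom_protosw <= dp->dom_protoswNPROTOSW);
    /* Scan that domain's slice of the protocol switch table. */
    for (pr = dp->dom_protosw; pr < dp->dom_protoswNPROTOSW; pr++)
        if (pr->pr_type == type)
            return (pr);
    return (NULL);
}
```

The pair \fIdom_protosw\fP and \fIdom_protoswNPROTOSW\fP is treated here as a
half-open range, so the second pointer is never dereferenced.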
                    359: .NH 2
                    360: Network-interface layer
                    361: .PP
                    362: Each network interface configured into a system defines a
                    363: path through which packets may be sent and received.
                    364: Normally a hardware device is associated with this interface,
                    365: though there is no requirement for this (for example, all
                    366: systems have a software ``loopback'' interface used for 
                    367: debugging and performance analysis).
                    368: In addition to manipulating the hardware device, an interface
                    369: module is responsible
                    370: for encapsulation and decapsulation of any link-layer header
                    371: information required to deliver a message to its destination.
                    372: The selection of which interface to use in delivering packets
                    373: is a routing decision carried out at a
                    374: higher level than the network-interface layer.
                    375: An interface may have addresses in one or more address families.
                    376: The address is set at boot time using an \fIioctl\fP on a socket
                    377: in the appropriate domain; this operation is implemented by the protocol
                    378: family, after verifying the operation through the device \fIioctl\fP entry.
                    379: .PP
                    380: An interface is defined by the following structure:
                    381: .DS
                    382: .ta .5i +\w'struct   'u +\w'ifaddr *if_addrlist;   'u
                    383: struct ifnet {
                    384:        char    *if_name;               /* name, e.g. ``en'' or ``lo'' */
                    385:        short   if_unit;                /* sub-unit for lower level driver */
                    386:        short   if_mtu;                 /* maximum transmission unit */
                    387:        short   if_flags;               /* up/down, broadcast, etc. */
                    388:        short   if_timer;               /* time 'til if_watchdog called */
                    389:        struct  ifaddr *if_addrlist;    /* list of addresses of interface */
                    390:        struct  ifqueue if_snd;         /* output queue */
                    391:        int     (*if_init)();           /* init routine */
                    392:        int     (*if_output)();         /* output routine */
                    393:        int     (*if_ioctl)();          /* ioctl routine */
                    394:        int     (*if_reset)();          /* bus reset routine */
                    395:        int     (*if_watchdog)();       /* timer routine */
                    396:        int     if_ipackets;            /* packets received on interface */
                    397:        int     if_ierrors;             /* input errors on interface */
                    398:        int     if_opackets;            /* packets sent on interface */
                    399:        int     if_oerrors;             /* output errors on interface */
                    400:        int     if_collisions;          /* collisions on csma interfaces */
                    401:        struct  ifnet *if_next;
                    402: };
                    403: .DE
                    404: Each interface address has the following form:
                    405: .DS
                    406: .ta \w'#define 'u +\w'struct   'u +\w'struct   'u +\w'sockaddr ifa_addr;   'u-\w'struct   'u
                    407: struct ifaddr {
                    408:        struct  sockaddr ifa_addr;      /* address of interface */
                    409:        union {
                    410:                struct  sockaddr ifu_broadaddr;
                    411:                struct  sockaddr ifu_dstaddr;
                    412:        } ifa_ifu;
                    413:        struct  ifnet *ifa_ifp;         /* back-pointer to interface */
                    414:        struct  ifaddr *ifa_next;       /* next address for interface */
                    415: };
                    416: .ta \w'#define 'u +\w'ifa_broadaddr   'u +\w'ifa_ifu.ifu_broadaddr        'u
                    417: #define        ifa_broadaddr   ifa_ifu.ifu_broadaddr   /* broadcast address */
                    418: #define        ifa_dstaddr     ifa_ifu.ifu_dstaddr     /* other end of p-to-p link */
                    419: .DE
                    420: The protocol generally maintains this structure as part of a larger
                    421: structure containing additional information concerning the address.
                    422: .PP
                    423: Each interface has a send queue and routines used for 
                    424: initialization, \fIif_init\fP, and output, \fIif_output\fP.
                    425: If the interface resides on a system bus, the routine \fIif_reset\fP
                    426: will be called after a bus reset has been performed. 
                    427: An interface may also
                    428: specify a timer routine, \fIif_watchdog\fP;
                    429: if \fIif_timer\fP is non-zero, it is decremented once per second
                    430: until it reaches zero, at which time the watchdog routine is called.
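.PP
The countdown can be sketched as a routine run once per second over the list of interfaces.
The names \fImini_ifnet\fP, \fImini_if_slowtimo\fP, \fImini_count_watchdog\fP
and \fImini_watchdog_calls\fP are hypothetical, and passing the unit number to the
watchdog routine is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the ifnet fields involved in the timer. */
struct mini_ifnet {
    short   if_unit;                    /* sub-unit for lower level driver */
    short   if_timer;                   /* time 'til if_watchdog called */
    int     (*if_watchdog)(int unit);   /* timer routine */
    struct  mini_ifnet *if_next;
};

/* Example watchdog used only to make the sketch observable. */
int mini_watchdog_calls;

int
mini_count_watchdog(int unit)
{
    (void)unit;
    mini_watchdog_calls++;
    return (0);
}

/*
 * Run once per second.  A timer of zero is idle; a non-zero timer is
 * decremented, and the watchdog is called when it reaches zero.
 */
void
mini_if_slowtimo(struct mini_ifnet *list)
{
    struct mini_ifnet *ifp;

    for (ifp = list; ifp != NULL; ifp = ifp->if_next) {
        assert(ifp->if_timer >= 0);
        if (ifp->if_timer == 0 || --ifp->if_timer != 0)
            continue;
        if (ifp->if_watchdog != NULL)
            (void)(*ifp->if_watchdog)(ifp->if_unit);
    }
}
```

An interface that wants a periodic watchdog would have to reload \fIif_timer\fP
from within its watchdog routine, since the sketch leaves the timer at zero after it fires.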
                    431: .PP
                    432: The state of an interface and certain characteristics are stored in
                    433: the \fIif_flags\fP field.  The following values are possible:
                    434: .DS
                    435: ._d
                    436: #define        IFF_UP  0x1     /* interface is up */
                    437: #define        IFF_BROADCAST   0x2     /* broadcast is possible */
                    438: #define        IFF_DEBUG       0x4     /* turn on debugging */
                    439: #define        IFF_LOOPBACK    0x8     /* is a loopback net */
                    440: #define        IFF_POINTOPOINT 0x10    /* interface is point-to-point link */
                    441: #define        IFF_NOTRAILERS  0x20    /* avoid use of trailers */
                    442: #define        IFF_RUNNING     0x40    /* resources allocated */
                    443: #define        IFF_NOARP       0x80    /* no address resolution protocol */
                    444: .DE
                    445: If the interface is connected to a network which supports transmission
                    446: of \fIbroadcast\fP packets, the IFF_BROADCAST flag will be set and
                    447: the \fIifa_broadaddr\fP field will contain the address to be used in
                    448: sending or accepting a broadcast packet.  If the interface is associated
                    449: with a point-to-point hardware link (for example, a DEC DMR-11), the
                    450: IFF_POINTOPOINT flag will be set and \fIifa_dstaddr\fP will contain the
                    451: address of the host on the other side of the connection.  These addresses
                    452: and the local address of the interface, \fIifa_addr\fP, are used in
                    453: filtering incoming packets.  The interface sets IFF_RUNNING after
                    454: it has allocated system resources and posted an initial read on the
                    455: device it manages.  This state bit is used to avoid multiple allocation
                    456: requests when an interface's address is changed.  The IFF_NOTRAILERS
                    457: flag indicates the interface should refrain from using a \fItrailer\fP
                    458: encapsulation on outgoing packets, or (where per-host negotiation
                    459: of trailers is possible) that trailer encapsulations should not be requested;
                    460: \fItrailer\fP protocols are described
                    461: in section 14.  The IFF_NOARP flag indicates the interface should not
                    462: use an ``address resolution protocol'' in mapping internetwork addresses
                    463: to local network addresses.
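.PP
The flag tests implied above might be written as follows.
This is a sketch only: the flag values are those listed earlier, while
\fIxx_peer_kind\fP, \fIxx_needs_init\fP and the \fIxx_peer\fP enumeration
are hypothetical helpers introduced for the example.

```c
#define	IFF_UP		0x1	/* interface is up */
#define	IFF_BROADCAST	0x2	/* broadcast is possible */
#define	IFF_POINTOPOINT	0x10	/* interface is point-to-point link */
#define	IFF_RUNNING	0x40	/* resources allocated */

/* Which auxiliary address field is meaningful for an interface. */
enum xx_peer { XX_NONE, XX_BROADADDR, XX_DSTADDR };

/* Choose between ifa_dstaddr and ifa_broadaddr from the flag bits. */
enum xx_peer
xx_peer_kind(int flags)
{
	if (flags & IFF_POINTOPOINT)
		return XX_DSTADDR;	/* other end of p-to-p link */
	if (flags & IFF_BROADCAST)
		return XX_BROADADDR;	/* broadcast address */
	return XX_NONE;
}

/* Nonzero when resources have not yet been allocated. */
int
xx_needs_init(int flags)
{
	return (flags & IFF_RUNNING) == 0;
}
```

A driver's initialization routine could consult \fIxx_needs_init\fP before
allocating, which is the use of IFF_RUNNING described above.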
                    464: .PP
                    465: Various statistics are also stored in the interface structure.  Users
                    466: may view these with the \fInetstat\fP(1) program.
                    467: .PP
                    468: The interface address and flags may be set with the SIOCSIFADDR and
                    469: SIOCSIFFLAGS \fIioctl\fP\^s.  SIOCSIFADDR is used initially to define each
                    470: interface's address; SIOCSIFFLAGS can be used to mark
                    471: an interface down and perform site-specific configuration.
                    472: The destination address of a point-to-point link is set with SIOCSIFDSTADDR.
                    473: Corresponding operations exist to read each value.
                    474: Protocol families may also support operations to set and read the broadcast
                    475: address.
                    476: In addition, the SIOCGIFCONF \fIioctl\fP retrieves a list of interface
                    477: names and addresses for all interfaces and protocols on the host.
                    478: .NH 3
                    479: UNIBUS interfaces
                    480: .PP
                    481: All hardware-related interfaces currently reside on the UNIBUS.
                    482: Consequently a common set of utility routines for dealing
                    483: with the UNIBUS has been developed.  Each UNIBUS interface
                    484: utilizes a structure of the following form:
                    485: .DS
                    486: .ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR];    'u
                    487: struct ifubinfo {
                    488:        short   iff_uban;                       /* uba number */
                    489:        short   iff_hlen;                       /* local net header length */
                    490:        struct  uba_regs *iff_uba;              /* uba regs, in vm */
                    491:        short   iff_flags;                      /* used during uballoc's */
                    492: };
                    493: .DE
                    494: Additional structures are associated with each receive and transmit buffer,
                    495: normally one each per interface; for read,
                    496: .DS
                    497: .ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR];    'u
                    498: struct ifrw {
                    499:        caddr_t ifrw_addr;                      /* virt addr of header */
                    500:        short   ifrw_bdp;                       /* unibus bdp */
                    501:        short   ifrw_flags;                     /* type, etc. */
                    502: #define        IFRW_W  0x01                            /* is a transmit buffer */
                    503:        int     ifrw_info;                      /* value from ubaalloc */
                    504:        int     ifrw_proto;                     /* map register prototype */
                    505:        struct  pte *ifrw_mr;                   /* base of map registers */
                    506: };
                    507: .DE
                    508: and for write,
                    509: .DS
                    510: .ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR];    'u
                    511: struct ifxmt {
                    512:        struct  ifrw ifrw;
                    513:        caddr_t ifw_base;                       /* virt addr of buffer */
                    514:        struct  pte ifw_wmap[IF_MAXNUBAMR];     /* base pages for output */
                    515:        struct  mbuf *ifw_xtofree;              /* pages being dma'd out */
                    516:        short   ifw_xswapd;                     /* mask of clusters swapped */
                    517:        short   ifw_nmr;                        /* number of entries in wmap */
                    518: };
                    519: .ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR];    'u
                    520: #define        ifw_addr        ifrw.ifrw_addr
                    521: #define        ifw_bdp ifrw.ifrw_bdp
                    522: #define        ifw_flags       ifrw.ifrw_flags
                    523: #define        ifw_info        ifrw.ifrw_info
                    524: #define        ifw_proto       ifrw.ifrw_proto
                    525: #define        ifw_mr  ifrw.ifrw_mr
                    526: .DE
                    527: One of each of these structures is conveniently packaged for interfaces
                    528: with single buffers for each direction, as follows:
                    529: .DS
                    530: .ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR];    'u
                    531: struct ifuba {
                    532:        struct  ifubinfo ifu_info;
                    533:        struct  ifrw ifu_r;
                    534:        struct  ifxmt ifu_xmt;
                    535: };
                    536: .ta \w'#define 'u +\w'ifw_xtofree 'u
                    537: #define        ifu_uban        ifu_info.iff_uban
                    538: #define        ifu_hlen        ifu_info.iff_hlen
                    539: #define        ifu_uba         ifu_info.iff_uba
                    540: #define        ifu_flags       ifu_info.iff_flags
                    541: #define        ifu_w           ifu_xmt.ifrw
                    542: #define        ifu_xtofree     ifu_xmt.ifw_xtofree
                    543: .DE
                    544: .PP
                    545: The \fIifubinfo\fP structure contains the general information needed
                    546: to characterize the I/O-mapped buffers for the device.
                    547: In addition, there is a structure describing each buffer, including
                    548: UNIBUS resources held by the interface.
                    549: Sufficient memory pages and bus map registers are allocated to each buffer
                    550: upon initialization according to the maximum packet size and header length.
                    551: The kernel virtual address of the buffer is held in \fIifrw_addr\fP,
                    552: and the map registers begin
                    553: at \fIifrw_mr\fP.  UNIBUS map register \fIifrw_mr\fP\^[\-1]
                    554: maps the local network header
                    555: ending on a page boundary.  UNIBUS data paths are
                    556: reserved for read and for
                    557: write, given by \fIifrw_bdp\fP.  The prototype of the map
                    558: registers for read and for write is saved in \fIifrw_proto\fP.
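.PP
The placement of the header can be worked through with simple arithmetic.
The sketch below assumes a 512-byte page (the VAX value) and a header no
longer than one page; \fIXX_NBPG\fP, \fIxx_header_addr\fP and
\fIxx_data_addr\fP are hypothetical names, and \fIbase\fP stands for the
start of the page mapped by the register preceding \fIifrw_mr\fP.

```c
#include <stddef.h>

#define	XX_NBPG	512		/* assumed page size in bytes */

/*
 * Address at which an hlen-byte header must start so that it ends
 * exactly on the boundary of the page beginning at base.
 */
char *
xx_header_addr(char *base, size_t hlen)
{
	return base + (XX_NBPG - hlen);
}

/* The data then begin on the next page boundary. */
char *
xx_data_addr(char *base)
{
	return base + XX_NBPG;
}
```

With a 14-byte header, for example, the header begins 498 bytes into its
page and the packet data start at the beginning of the following page.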
                    559: .PP
                    560: When a write transfer is not at least half a page long and on a page boundary,
                    561: the data are simply copied into the pages already mapped on the UNIBUS
                    562: and the transfer is started.
                    563: If a write transfer is at least half a page long and on a page
                    564: boundary, UNIBUS page table entries are swapped to reference
                    565: the pages, and then the initial pages are
                    566: remapped from \fIifw_wmap\fP when the transfer completes.
                    567: The mbufs containing the mapped pages are placed on the \fIifw_xtofree\fP
                    568: queue to be freed after transmission.
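.PP
The choice between copying and remapping can be expressed as a small
predicate.
This is a simplified sketch under the two conditions stated above; the
actual routine may apply further tests.
A 512-byte page is assumed, and \fIxx_write_how\fP and its enumeration
are hypothetical names.

```c
#include <stddef.h>

#define	XX_NBPG	512		/* assumed page size in bytes */

enum xx_how { XX_COPY, XX_REMAP };

/*
 * Decide how one piece of an outgoing packet is moved.
 * off is the byte offset of the piece within the output buffer,
 * len is its length in bytes.
 */
enum xx_how
xx_write_how(size_t off, size_t len)
{
	if (off % XX_NBPG == 0 && len >= XX_NBPG / 2)
		return XX_REMAP;	/* swap page table entries */
	return XX_COPY;			/* copy into the mapped pages */
}
```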
                    569: .PP
                    570: When read transfers give at least half a page of data to be input, page
                    571: frames are allocated from a network page list and traded
                    572: with the pages already containing the data, mapping the allocated
                    573: pages to replace the input pages for the next UNIBUS data input.
                    574: .PP
                    575: The following utility routines are available for use in
                    576: writing network interface drivers; all use the
                    577: structures described above.
                    578: .LP
                    579: if_ubaminit(ifubinfo, uban, hlen, nmr, ifr, nr, ifx, nx);
                    580: .br
                    581: if_ubainit(ifuba, uban, hlen, nmr);
                    582: .IP
                    583: \fIif_ubaminit\fP allocates resources on UNIBUS adapter \fIuban\fP,
                    584: storing the information in the \fIifubinfo\fP, \fIifrw\fP and \fIifxmt\fP
                    585: structures referenced.
                    586: The \fIifr\fP and \fIifx\fP parameters are pointers to arrays
                    587: of \fIifrw\fP and \fIifxmt\fP structures whose dimensions
                    588: are \fInr\fP and \fInx\fP, respectively.
                    589: \fIif_ubainit\fP is a simpler, backwards-compatible interface used
                    590: for hardware with single buffers of each type.
                    591: Both routines are called only at boot time or after a UNIBUS reset.
                    592: One data path (buffered or unbuffered,
                    593: depending on the \fIifu_flags\fP field) is allocated for each buffer.
                    594: The \fInmr\fP parameter indicates
                    595: the number of UNIBUS mapping registers required to map a maximal
                    596: sized packet onto the UNIBUS, while \fIhlen\fP specifies the size
                    597: of a local network header, if any, which should be mapped separately
                    598: from the data (see the description of trailer protocols in section 14).
                    599: Sufficient UNIBUS mapping registers and pages of memory are allocated
                    600: to initialize the input data path for an initial read.  For the output
                    601: data path, mapping registers and pages of memory are also allocated
                    602: and mapped onto the UNIBUS.  The pages associated with the output
                    603: data path are held in reserve in the event a write requires copying
                    604: non-page-aligned data (see \fIif_wubaput\fP below).
                    605: If \fIif_ubainit\fP is called with memory pages already allocated,
                    606: they will be used instead of allocating new ones (this normally
                    607: occurs after a UNIBUS reset).
                    608: A 1 is returned when allocation and initialization are successful,
                    609: 0 otherwise.
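.PP
A driver must supply a value for \fInmr\fP.
One way to derive it, assuming a 512-byte page and one mapping register
per page, is to round the maximum packet size up to whole pages, as in
the sketch below; \fIxx_btoc\fP and \fIXX_NBPG\fP are hypothetical names.

```c
#define	XX_NBPG	512		/* assumed page size in bytes */

/*
 * Number of pages, and so of mapping registers, needed to
 * cover len bytes.
 */
int
xx_btoc(int len)
{
	return (len + XX_NBPG - 1) / XX_NBPG;
}
```

For a network whose largest packet is 1500 bytes, this gives 3 registers.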
                    610: .LP
                    611: m = if_ubaget(ifubinfo, ifr, totlen, off0, ifp);
                    612: .br
                    613: m = if_rubaget(ifuba, totlen, off0, ifp);
                    614: .IP
                    615: \fIif_ubaget\fP and \fIif_rubaget\fP pull input data
                    616: out of an interface receive buffer and into an mbuf chain.
                    617: The first routine takes pointers to the \fIifubinfo\fP structure
                    618: for the interface and the \fIifrw\fP structure for the receive buffer;
                    619: the second may be used for single-buffered devices.
                    620: \fItotlen\fP specifies the length of data to be obtained, not counting the
                    621: local network header.  If \fIoff0\fP is non-zero, it indicates
                    622: a byte offset to a trailing local network header which should be
                    623: copied into a separate mbuf and prepended to the front of the resultant mbuf
                    624: chain.  When the data amount to at least a half a page,
                    625: the previously mapped data pages are remapped
                    626: into the mbufs and swapped with fresh pages, thus avoiding
                    627: any copy.
                    628: The receiving interface is recorded as \fIifp\fP, a pointer to an \fIifnet\fP
                    629: structure, for the use of the receiving network protocol.
                    630: A 0 return value indicates a failure to allocate resources.
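.PP
The effect of a non-zero \fIoff0\fP can be shown on a flat buffer.
The sketch below is a model only: it copies bytes between two character
arrays rather than building an mbuf chain or remapping pages, and
\fIxx_reorder\fP is a hypothetical name.

```c
#include <string.h>

/*
 * Copy totlen bytes from buf to out.  When off0 is non-zero, the
 * trailing header (the bytes from off0 to the end) is placed first,
 * followed by the data that preceded it in the receive buffer.
 * Returns 0 on success, -1 if off0 lies beyond the data.
 */
int
xx_reorder(const char *buf, size_t totlen, size_t off0, char *out)
{
	size_t hdr;

	if (off0 > totlen)
		return -1;
	if (off0 == 0) {
		memcpy(out, buf, totlen);	/* no trailer: straight copy */
		return 0;
	}
	hdr = totlen - off0;			/* length of trailing header */
	memcpy(out, buf + off0, hdr);		/* header goes to the front */
	memcpy(out + hdr, buf, off0);		/* then the leading data */
	return 0;
}
```

Given the seven bytes ``DATAhdr'' and an offset of 4, the model produces
``hdrDATA''.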
                    631: .LP
                    632: if_ubaput(ifubinfo, ifx, m);
                    633: .br
                    634: if_wubaput(ifuba, m);
                    635: .IP
                    636: \fIif_ubaput\fP and \fIif_wubaput\fP map a chain of mbufs
                    637: onto a network interface in preparation for output.
                    638: The first routine is used by devices with multiple transmit buffers.
                    639: The chain includes any local network
                    640: header, which is copied so that it resides in the mapped and
                    641: aligned I/O space.
                    642: Data that are page-aligned in memory and also fall on a page boundary in the output buffer
                    643: are mapped to the UNIBUS in place of the normal buffer page,
                    644: and the corresponding mbuf is placed on a queue to be freed after transmission.
                    645: Data in any other mbufs, which hold non-page-sized
                    646: portions, are copied to the I/O space and those mbufs are then freed.
                    647: Pages mapped from a previous output operation (no longer needed)
                    648: are unmapped.
