Annotation of 43BSD/contrib/news/doc/standard.mn, revision 1.1.1.1

1.1       root        1: .ds h0 "Standard for Interchange of USENET Messages
                      2: .ds h1
                      3: .ds h2 %
                      4: .ds f0 "\*(vr
                      5: .ds f1
                      6: .ds f2 "January 17, 1986
                      7: .mt
                      8: Standard for Interchange of USENET Messages
                      9: .au
                     10: Mark R. Horton
                     11: .ai
                     12: Bell Laboratories
                     13: Columbus, OH  43213
                     14: .au
                     15: Revised for 2.10.3 by Rick Adams
                     16: .hn
                     17: Introduction
                     18: .pg
                     19: This document defines the standard format for the interchange
                     20: of network news articles among USENET sites.
                     21: It describes the format for articles themselves,
                     22: and gives partial standards for transmission of news.
                     23: The news transmission is not entirely standardized
                     24: in order to give a good deal of flexibility
                     25: to the individual hosts to choose transmission hardware and software,
                     26: whether to batch news,
                     27: and so on.
                     28: .pg
                     29: There are five sections to this document.
                     30: Section two defines the format.
                     31: Section three defines the valid control messages.
                     32: Section four specifies some valid transmission methods.
                     33: Section five describes the overall news propagation algorithm.
                     34: .hn
                     35: Article Format
                     36: .pg
                     37: The primary consideration in choosing an article format is
                     38: that it fit in with existing tools as well as possible.
                     39: Existing tools include both implementations of mail and news.
                     40: (The
                     41: .i notesfiles
                     42: system from the University of Illinois
                     43: is considered a news implementation.)
                     44: A standard format for mail messages has existed for many years on the ARPANET,
                     45: and this format meets most of the needs of USENET.
                     46: Since the ARPANET format is extensible,
                     47: extensions to meet the additional needs of USENET
                     48: are easily made within the ARPANET standard.
                     49: Therefore,
                     50: the rule is adopted that all USENET news articles
                     51: must be formatted as valid ARPANET mail messages,
                     52: according to the ARPANET standard RFC 822.
                     53: This standard is more restrictive than the ARPANET standard,
                     54: placing additional requirements on each article
                     55: and forbidding use of certain ARPANET features.
                     56: However,
                     57: it should always be possible to use a tool
                     58: expecting an ARPANET message to process a news article.
                     59: In any situation where this standard conflicts with the ARPANET standard,
                     60: RFC 822 should be considered correct and this standard in error.
                     61: .pg
                     62: An example message is included to illustrate the fields.
                     63: .sd
                     64: From: [email protected] (Jerry Schwarz)
                     65: Path: cbosgd!mhuxj!mhuxt!eagle!jerry
                     66: Newsgroups: net.general
                     67: Subject: Usenet Etiquette -- Please Read
                     68: Message-ID: <[email protected]>
                     69: Date: Friday, 19 Nov 82 16:14:55 EST
                     70: Followup-To: net.news
                     71: Expires: Saturday, 1 Jan 83 00:00:00 EST
                     72: Organization: Bell Labs, Murray Hill
                     73: 
                     74: The body of the article comes here, after a blank line.
                     75: .ed
                     76: Here is an example of a message in the old format
                     77: (before the existence of this standard).
                     78: It is recommended that implementations also accept articles
                     79: in this format to ease upward conversion.
                     80: .sd
                     81: From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
                     82: Newsgroups: net.general
                     83: Title: Usenet Etiquette -- Please Read
                     84: Article-I.D.: eagle.642
                     85: Posted: Fri Nov 19 16:14:55 1982
                     86: Received: Fri Nov 19 16:59:30 1982
                     87: Expires: Mon Jan  1 00:00:00 1990
                     88: 
                     89: The body of the article comes here, after a blank line.
                     90: .ed
                     91: Some news systems transmit news in the
                     92: .pa A
                     93: format,
                     94: which looks like this:
                     95: .sd
                     96: Aeagle.642
                     97: net.general
                     98: cbosgd!mhuxj!mhuxt!eagle!jerry
                     99: Fri Nov 19 16:14:55 1982
                    100: Usenet Etiquette - Please Read
                    101: The body of the article comes here, with no blank line.
                    102: .ed
                    103: .pg
                    104: An article consists of several header lines,
                    105: followed by a blank line,
                    106: followed by the body of the message.
                    107: The header lines consist of a keyword,
                    108: a colon,
                    109: a blank,
                    110: and some additional information.
                    111: This is a subset of the ARPANET standard,
                    112: simplified to allow simpler software to handle it.
                    113: The
                    114: .hf From
                    115: line may optionally include a full name,
                    116: in the format above,
                    117: or use the ARPANET angle bracket syntax.
                    118: To keep the implementations simple,
                    119: other formats
                    120: (for example,
                    121: with part of the machine address after the close parenthesis)
                    122: are not allowed.
                    123: The ARPANET convention of continuation header lines
                    124: (beginning with a blank or tab)
                    125: is allowed.
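.pg
As an illustration only, the header layout described above (keyword, colon, blank, value, with continuation lines beginning with a blank or tab) might be read as in the following sketch. The function name parse_article and the choice to join continuation lines with a single space are assumptions made for this example and are not part of the standard.

```python
def parse_article(text):
    """Split an article into (headers, body).

    headers is a list of (keyword, value) pairs in their original order.
    Hypothetical helper: the name and return shape are illustrative only.
    """
    # The first blank line separates the header lines from the body.
    head, _, body = text.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            # Continuation line: fold it into the previous header's value.
            keyword, value = headers[-1]
            headers[-1] = (keyword, value + " " + line.strip())
        else:
            keyword, sep, value = line.partition(":")
            if not sep:
                raise ValueError("not a header line: %r" % line)
            headers.append((keyword, value.strip()))
    return headers, body
```

Unrecognized keywords are kept in the list as they are, which matches the rule that unknown headers are passed through unchanged.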
                    126: .pg
                    127: Certain headers are required,
                    128: and certain other headers are optional.
                    129: Any unrecognized headers are allowed,
                    130: and will be passed through unchanged.
                    131: The required headers are
                    132: .hf From ,
                    133: .hf Date ,
                    134: .hf Newsgroups ,
                    135: .hf Subject ,
                    136: .hf Message-ID ,
                    137: and
                    138: .hf Path .
                    139: The optional headers are
                    140: .hf Followup-To ,
                    141: .hf Expires ,
                    142: .hf Reply-To ,
                    143: .hf Sender ,
                    144: .hf References ,
                    145: .hf Control ,
                    146: .hf Distribution ,
                    147: .hf Keywords ,
                    148: .hf Summary ,
                    149: and
                    150: .hf Organization .
                    151: .hn 2
                    152: Required Headers
                    153: .hn 3
                    154: From
                    155: .pg
                    156: The
                    157: .hf From
                    158: line contains the electronic mailing address of the person who sent the message,
                    159: in the ARPA internet syntax.
                    160: It may optionally also contain the full name of the person,
                    161: in parentheses,
                    162: after the electronic address.
                    163: The electronic address is the same as the entity responsible
                    164: for originating the article,
                    165: unless the
                    166: .hf Sender
                    167: header is present,
                    168: in which case the
                    169: .hf From
                    170: header might not be verified.
                    171: Note that in all site and domain names,
                    172: upper and lower case are considered the same,
                    173: thus
                    174: .cf [email protected] ,
                    175: .cf [email protected] ,
                    176: and
                    177: .cf [email protected]
                    178: are all equivalent.
                    179: User names may or may not be case sensitive; for example,
                    180: .cf [email protected]
                    181: might be different from
                    182: .cf [email protected] .
                    183: Programs should avoid changing the case of electronic addresses
                    184: when forwarding news or mail.
                    185: .pg
                    186: RFC 822 specifies that all text in parentheses is to be interpreted as a comment.
                    187: It is common in ARPANET mail to place the full name of the user
                    188: in a comment at the end of the
                    189: .hf From
                    190: line.
                    191: This standard specifies a more rigid syntax.
                    192: The full name is not considered a comment,
                    193: but an optional part of the header line.
                    194: Either the full name is omitted, 
                    195: or it appears in parentheses after the electronic address
                    196: of the person posting the article,
                    197: or it appears before an electronic address enclosed in angle brackets.
                    198: Thus,
                    199: the three permissible forms are:
                    200: .sd
                    201: From: [email protected]
                    202: From: [email protected] (Mark Horton)
                    203: From: Mark Horton <[email protected]>
                    204: .ed
                    205: Full names may contain any printing ASCII characters from space through tilde,
                    206: with the exceptions that they may not contain
                    207: \&\*(lq(\*(rq (left parenthesis),
                    208: \&\*(lq)\*(rq (right parenthesis),
                    209: \&\*(lq<\*(rq (left angle bracket),
                    210: or \*(lq>\*(rq (right angle bracket).
                    211: Additional restrictions may be placed on full names by the mail standard;
                    212: in particular,
                    213: the characters
                    214: \&\*(lq,\*(rq (comma),
                    215: \&\*(lq:\*(rq (colon),
                    216: and \*(lq;\*(rq (semicolon) are inadvisable in full names.
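.pg
The three permissible forms can be recognized with a small amount of code. The sketch below is one possible approach and not a required implementation; the name parse_from, the address pattern (any run of characters without white space, parentheses, or angle brackets), and the sample address user@site.example are assumptions chosen for the example.

```python
import re

# Address: no white space, parentheses, or angle brackets (an assumption;
# the full internet address syntax is not checked here).
_ADDR = r"[^\s()<>]+"
# Full name: printing ASCII from space through tilde, minus ( ) < >.
_NAME = r"[ -'*-;=?-~]+"

_FROM_FORMS = [
    re.compile(r"^(?P<addr>%s)$" % _ADDR),                             # address only
    re.compile(r"^(?P<addr>%s) \((?P<name>%s)\)$" % (_ADDR, _NAME)),   # address (Full Name)
    re.compile(r"^(?P<name>%s) <(?P<addr>%s)>$" % (_NAME, _ADDR)),     # Full Name <address>
]


def parse_from(value):
    """Return (address, full_name or None); raise ValueError for other forms."""
    value = value.strip()
    for pattern in _FROM_FORMS:
        m = pattern.match(value)
        if m:
            return m.group("addr"), m.groupdict().get("name")
    raise ValueError("unsupported From form: %r" % value)
```

The case of the address is returned untouched, in keeping with the advice that programs avoid changing the case of electronic addresses.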
                    217: .hn 3
                    218: Date
                    219: .pg
                    220: The
                    221: .hf Date
                    222: line (formerly
                    223: .hf Posted )
                    224: is the date,
                    225: in a format that must be acceptable both to the ARPANET
                    226: and to the
                    227: .i getdate (3)
                    228: routine,
                    229: that the article was originally posted to the network.
                    230: This date remains unchanged as the article is propagated
                    231: throughout the network.
                    232: One format that is acceptable to both is
                    233: .sd c
                    234: \f2Wdy\fP, \f2DD\fP\ \f2Mon\fP\ \f2YY\fP \f2HH\fP:\f2MM\fP:\f2SS\fP \f2TIMEZONE\fP
                    235: .ed
                    236: Several examples of valid dates appear in the sample
                    237: article above.
                    238: Note in particular that
                    239: .i ctime (3)
                    240: format:
                    241: .sd c
                    242: \f2Wdy\fP \f2Mon\fP \f2DD\fP \f2HH\fP:\f2MM\fP:\f2SS\fP \f2YYYY\fP
                    243: .ed
                    244: is
                    245: .i not
                    246: acceptable because it is not a valid ARPANET date.
                    247: However,
                    248: since older software still generates this format,
                    249: news implementations are encouraged to accept this format
                    250: and translate it into an acceptable format.
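.pg
A translation of the older ctime format into the acceptable format could look like the sketch below. Because the ctime format carries no time zone, the zone has to come from somewhere else; here it is simply a parameter, which is an assumption of this example, as is the function name ctime_to_usenet. The day and month names are parsed with the English abbreviations of the default C locale.

```python
from datetime import datetime


def ctime_to_usenet(ctime_date, timezone):
    """Rewrite 'Wdy Mon DD HH:MM:SS YYYY' as 'Wdy, DD Mon YY HH:MM:SS TIMEZONE'.

    timezone is supplied by the caller, since ctime output does not include one.
    """
    # ctime pads single-digit days with an extra blank; collapse runs of blanks.
    normalized = " ".join(ctime_date.split())
    parsed = datetime.strptime(normalized, "%a %b %d %H:%M:%S %Y")
    return parsed.strftime("%a, %d %b %y %H:%M:%S ") + timezone
```

For instance, ctime_to_usenet("Fri Nov 19 16:14:55 1982", "EST") gives "Fri, 19 Nov 82 16:14:55 EST".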
                    251: .pg
                    252: The contents of the
                    253: .i TIMEZONE
                    254: field are currently subject to revision.
                    255: Eventually,
                    256: we hope to accept all possible worldwide time zone abbreviations,
                    257: including the usual American zones
                    258: (PST,
                    259: PDT,
                    260: MST,
                    261: MDT,
                    262: CST,
                    263: CDT,
                    264: EST,
                    265: EDT),
                    266: the other North American zones
                    267: (Bering through Newfoundland),
                    268: European zones,
                    269: Australian zones,
                    270: and so on.
                    271: Lacking a complete list at present
                    272: (and unsure if an unambiguous list exists),
                    273: authors of software are encouraged to keep this code flexible,
                    274: and in particular not to assume
                    275: that time zone names are exactly three letters long.
                    276: Implementations are free to edit this field,
                    277: keeping the time the same,
                    278: but changing the time zone
                    279: (with an appropriate adjustment to the local time shown)
                    280: to a known time zone.
                    281: It is recommended that times in message headers be transmitted in GMT
                    282: and displayed in the local time zone.
                    283: .hn 3
                    284: Newsgroups
                    285: .pg
                    286: The
                    287: .hf Newsgroups
                    288: line specifies which newsgroup or newsgroups the article belongs in.
                    289: Multiple newsgroups may be specified, separated by a comma.
                    290: Newsgroups specified must all be the names of existing newsgroups,
                    291: as no new newsgroups will be created by simply posting to them.
                    292: .pg
                    293: Wildcards
                    294: .i e\f1.\fPg ., (
                    295: the word
                    296: .ng all )
                    297: are never allowed in a
                    298: .hf Newsgroups
                    299: line.
                    300: For example,
                    301: a newsgroup
                    302: .ng net.all
                    303: is illegal,
                    304: although a newsgroup name
                    305: .ng net.sport.football
                    306: is permitted.
                    307: .pg
                    308: If an article is received with a
                    309: .hf Newsgroups
                    310: line listing some valid newsgroups and some invalid newsgroups,
                    311: a site should not remove invalid newsgroups from the list.
                    312: Instead,
                    313: the invalid newsgroups should be ignored.
                    314: For example,
                    315: suppose site
                    316: .cn A
                    317: subscribes to the classes
                    318: .ng btl.all
                    319: and 
                    320: .ng net.all ,
                    321: and exchanges news articles with site
                    322: .cn B ,
                    323: which subscribes to
                    324: .ng net.all
                    325: but not
                    326: .ng btl.all .
                    327: Suppose
                    328: .cn A
                    329: receives an article with
                    330: .sd c
                    331: Newsgroups: net.micro,btl.general
                    332: .ed
                    333: This article is passed on to
                    334: .cn B
                    335: because
                    336: .cn B
                    337: receives
                    338: .ng net.micro ,
                    339: but
                    340: .cn B
                    341: does not receive
                    342: .ng btl.general .
                    343: .cn A
                    344: must leave the
                    345: .hf Newsgroups
                    346: line unchanged.
                    347: If it were to remove
                    348: .ng btl.general ,
                    349: the edited header could eventually reenter the
                    350: .ng btl.all
                    351: class,
                    352: resulting in an article that is not shown to users subscribing to
                    353: .ng btl.general .
                    354: Also,
                    355: followups from outside
                    356: .ng btl.all
                    357: would not be shown to such users.
                    358: .hn 3
                    359: Subject
                    360: .pg
                    361: The
                    362: .hf Subject
                    363: line
                    364: (formerly
                    365: .hf Title )
                    366: tells what the article is about.
                    367: It should be suggestive enough of the contents of the article
                    368: to enable a reader to make a decision whether to read the article
                    369: based on the subject alone.
                    370: If the article is submitted in response to another article
                    371: .i e\f1.\fPg ., (
                    372: is a
                    373: .i followup )
                    374: the default subject should begin with the four characters \*(lqRe: \*(rq
                    375: and the
                    376: .hf References
                    377: line is required.
                    378: (The user might wish to edit the subject of the followup,
                    379: but the default should begin with \*(lqRe: \*(rq.)
                    380: .hn 3
                    381: Message-ID
                    382: .pg
                    383: The
                    384: .hf Message-ID
                    385: line gives the article a unique identifier.
                    386: The same message ID may not be reused during the lifetime of any previous article
                    387: with the same message ID.
                    388: (It is recommended that no message ID be reused for at least two years.)
                    389: Message IDs have the syntax
                    390: .sd c
                    391: <\f2string not containing blank or \*(lq>\*(rq\fP>
                    392: .ed
                    393: In order to conform to RFC 822,
                    394: the message ID must have the format
                    395: .sd c
                    396: <\f2unique\fP@\f2full_domain_name\fP>
                    397: .ed
                    398: where
                    399: .i "full_domain_name"
                    400: is the full name of the host at which the article entered the network,
                    401: including a domain that host is in,
                    402: and
                    403: .i unique
                    404: is any string of printing ASCII characters,
                    405: not including
                    406: \*(lq<\*(rq (left angle bracket),
                    407: \*(lq>\*(rq (right angle bracket),
                    408: or \*(lq@\*(rq (at sign).
                    409: For example,
                    410: the
                    411: .i unique
                    412: part could be an integer representing a sequence number
                    413: for articles submitted to the network,
                    414: or a short string derived from the date and time the article was created.
                    415: For example,
                    416: a valid message ID for an article submitted from site
                    417: .cn ucbvax
                    418: in domain
                    419: .cf Berkeley.EDU
                    420: would be
                    421: .cf <[email protected]> .
                    422: Programmers are urged not to make assumptions
                    423: about the content of message ID fields from other hosts,
                    424: but to treat them as unknown character strings.
                    425: It is not safe,
                    426: for example,
                    427: to assume that a message ID will be under 14 characters,
                    428: nor that it is unique in the first 14 characters.
                    429: .pg
                    430: The angle brackets are considered part of the message ID.
                    431: Thus,
                    432: in references to the message ID,
                    433: such as the
                    434: .pa ihave/sendme
                    435: and
                    436: .b cancel
                    437: control messages,
                    438: the angle brackets are included.
                    439: White space characters
                    440: .i e\f1.\fPg ., (
                    441: blank and tab)
                    442: are not allowed in a message ID.
                    443: All characters between the angle brackets must be printing ASCII characters.
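.pg
The syntax rules for message IDs can be checked mechanically. The following is a minimal sketch under the rules stated above (angle brackets included, a unique part and a domain part separated by an at sign, printing ASCII only, no blanks); the function name is_valid_message_id and the sample IDs in the usage note are hypothetical.

```python
def is_valid_message_id(message_id):
    """Check the <unique@full_domain_name> shape; the content stays opaque."""
    if len(message_id) < 5 or message_id[0] != "<" or message_id[-1] != ">":
        return False
    unique, at, domain = message_id[1:-1].partition("@")
    if not (unique and at and domain):
        return False

    def allowed(part):
        # Printing ASCII without blank, and without '<', '>' or '@'.
        return all("!" <= ch <= "~" and ch not in "<>@" for ch in part)

    return allowed(unique) and allowed(domain)
```

A string such as "<123@site.example>" passes, while the same string without the angle brackets, or with a blank inside, does not. Note that the check places no limit on length, since no assumption about the length of a message ID is safe.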
                    444: .hn 3
                    445: Path
                    446: .pg
                    447: This line shows the path the article took to reach the current system.
                    448: When a system forwards the message,
                    449: it should add its own name to the list of systems in the
                    450: .hf Path
                    451: line.
                    452: The names may be separated by any punctuation character or characters,
                    453: thus
                    454: .cf cbosgd!mhuxj!mhuxt ,
                    455: .cf "cbosgd, mhuxj, mhuxt" ,
                    456: and
                    457: .cf "@cbosgd.uucp,@mhuxj.uucp,@mhuxt.uucp"
                    458: and even
                    459: .cf "teklabs, zehntel, sri-unix@cca!decvax"
                    460: are valid entries.
                    461: (The latter path indicates a message that passed through
                    462: .cn decvax ,
                    463: .cn cca ,
                    464: .cn sri-unix ,
                    465: .cn zehntel ,
                    466: and
                    467: .cn teklabs ,
                    468: in that order.)
                    469: Additional names should be added from the left,
                    470: for example,
                    471: the most recently added name in the fourth example was
                    472: .cn teklabs .
                    473: Letters,
                    474: digits,
                    475: periods and hyphens are considered part of site names;
                    476: other punctuation characters,
                    477: including blanks,
                    478: are considered separators.
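.pg
Since only letters, digits, periods and hyphens belong to site names, a path can be split without knowing which separator a given site used. The sketch below shows one way to do this; the function name path_sites is an assumption of the example.

```python
import re


def path_sites(path):
    """Return the site names in a Path value, most recently added first."""
    # Letters, digits, periods and hyphens form names; all else separates.
    return re.findall(r"[A-Za-z0-9.-]+", path)
```

Applied to the value "teklabs, zehntel, sri-unix@cca!decvax", this yields teklabs, zehntel, sri-unix, cca and decvax, in that order; reading the list from right to left gives the order in which the message passed through the sites.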
                    479: .pg
                    480: Normally,
                    481: the rightmost name will be the name of the originating system.
                    482: However,
                    483: it is also permissible to include an extra entry on the right,
                    484: which is the name of the sender.
                    485: This is for upward compatibility with older systems.
                    486: .pg
                    487: The
                    488: .hf Path
                    489: line is not used for replies,
                    490: and should not be taken as a mailing address.
                    491: It is intended to show the route
                    492: the message travelled to reach the local site.
                    493: There are several uses for this information.
                    494: One is to monitor USENET routing for performance reasons.
                    495: Another is to establish a path to reach new sites.
                    496: Perhaps the most important is to cut down on redundant USENET traffic
                    497: by failing to forward a message to a site that is
                    498: known to have already received it.
                    499: In particular,
                    500: when site
                    501: .cn A
                    502: sends an article to site
                    503: .cn B ,
                    504: the
                    505: .hf Path
                    506: line includes
                    507: .cn A ,
                    508: so that site
                    509: .cn B
                    510: will not immediately send the article back to site
                    511: .cn A .
                    512: The site name each site uses to identify itself should be
                    513: the same as the name by which its neighbors know it,
                    514: in order to make this optimization possible.
                    515: .pg
                    516: A site adds its own name to the front of a path
                    517: when it receives a message from another site.
                    518: Thus, if a message with path
                    519: .cf A!X!Y!Z
                    520: is passed from site
                    521: .cn A
                    522: to site
                    523: .cn B ,
                    524: .cn B
                    525: will add its own name to the path when it receives the message from
                    526: .cn A ,
                    527: .i e\f1.\fPg .,
                    528: .cf \*(lqB!A!X!Y!Z\*(rq .
                    529: If
                    530: .cn B
                    531: then passes the message on to
                    532: .cn C ,
                    533: the message sent to
                    534: .cn C
                    535: will contain the path
                    536: .cf B!A!X!Y!Z ,
                    537: and when
                    538: .cn C
                    539: receives it,
                    540: .cn C
                    541: will change it to
                    542: .cf C!B!A!X!Y!Z .
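.pg
The two uses of the path described above, adding the local name on receipt and declining to forward to a site already named in the path, might be sketched as follows. The function names add_site and already_visited are hypothetical, the exclamation point is only one of the permitted separators, and the comparison ignores case on the assumption that site names follow the same case rule as in addresses.

```python
import re


def add_site(path, site):
    """Prepend the local site name when a message is received."""
    return site + "!" + path


def already_visited(path, neighbor):
    """True if neighbor appears in the path, so forwarding can be skipped."""
    sites = re.findall(r"[A-Za-z0-9.-]+", path)
    # The rightmost entry may be a user name on older systems; this sketch
    # does not try to tell it apart from a site name.
    return neighbor.lower() in (s.lower() for s in sites)
```

With the example above, add_site("A!X!Y!Z", "B") produces "B!A!X!Y!Z", and already_visited("B!A!X!Y!Z", "A") is true, so B would not send the article back to A.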
                    543: .pg
                    544: Special upward compatibility note:
                    545: Since the
                    546: .hf From ,
                    547: .hf Sender ,
                    548: and
                    549: .hf Reply-To
                    550: lines are in internet format,
                    551: and since many USENET sites do not yet have mailers
                    552: capable of understanding internet format,
                    553: it would break the reply capability to completely sever the connection
                    554: between the
                    555: .hf Path
                    556: header and the reply function.
                    557: It is recognized that the path is not always a valid reply string
                    558: in older implementations,
                    559: and no requirement to fix this problem is placed on implementations.
                    560: However,
                    561: the existing convention of placing the site name and an
                    562: .cf !
                    563: at the front of the path,
                    564: and of starting the path with the site name,
                    565: an
                    566: .cf ! ,
                    567: and the user name,
                    568: should be maintained when possible.
                    569: .hn 2
                    570: Optional Headers
                    571: .hn 3
                    572: Reply-To
                    573: .pg
                    574: This line has the same format as
                    575: .hf From .
                    576: If present,
                    577: mailed replies to the author should be sent to the name given here.
                    578: Otherwise,
                    579: replies are mailed to the name on the
                    580: .hf From
                    581: line.
                    582: (This does not prevent additional copies from being sent to recipients
                    583: named by the replier,
                    584: or on
                    585: .hf To
                    586: or
                    587: .hf Cc
                    588: lines.)
                    589: The full name may be optionally given,
                    590: in parentheses,
                    591: as in the
                    592: .hf From
                    593: line.
                    594: .hn 3
                    595: Sender
                    596: .pg
                    597: This field is present only if the submitter manually enters a
                    598: .hf From
                    599: line.
                    600: It is intended to record the entity responsible
                    601: for submitting the article to the network,
                    602: and should be verified by the software at the submitting site.
                    603: .pg
                    604: For example,
                    605: if John Smith is visiting CCA and wishes to post an article to the network,
                    606: using friend Sarah Jones' account,
                    607: the message might read
                    608: .sd
                    609: From: [email protected] (John Smith)
                    610: Sender: [email protected] (Sarah Jones)
                    611: .ed
                    612: If a gateway program enters a mail message into the network at site
                    613: .cn sri-unix ,
                    614: the lines might read
                    615: .sd
                    616: From: [email protected]
                    617: Sender: [email protected]
                    618: .ed
                    619: The primary purpose of this field is to be able to track down articles
                    620: to determine how they were entered into the network.
                    621: The full name may be optionally given,
                    622: in parentheses,
                    623: as in the
                    624: .hf From
                    625: line.
                    626: .hn 3
                    627: Followup-To
                    628: .pg
                    629: This line has the same format as
                    630: .hf Newsgroups .
                    631: If present,
                    632: follow-up articles are to be posted
                    633: to the newsgroup or newsgroups listed here.
                    634: If this line is not present,
                    635: followups are posted to the newsgroup or newsgroups listed in the
                    636: .hf Newsgroups
                    637: line,
                    638: except that followups to
                    639: .ng net.general
                    640: should instead go to
                    641: .ng net.followup .
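.pg
The choice of follow-up newsgroups described above can be sketched in Python.
This is a minimal illustration, not part of the standard:
the function names and the dictionary of headers are hypothetical,
and newsgroup lists are assumed to be comma separated,
as in the examples in this document.
.sd
```python
def split_groups(value):
    """Split a comma-separated newsgroup list into names."""
    return [g.strip() for g in value.split(",") if g.strip()]


def followup_newsgroups(headers):
    """Return the newsgroups a follow-up to this article should go to.

    `headers` is a hypothetical dict of header name -> header text.
    """
    followup_to = headers.get("Followup-To")
    if followup_to:
        # An explicit Followup-To line always wins.
        return split_groups(followup_to)
    groups = []
    for group in split_groups(headers["Newsgroups"]):
        # Special case from the text: net.general redirects to net.followup.
        if group == "net.general":
            group = "net.followup"
        if group not in groups:  # avoid listing a group twice
            groups.append(group)
    return groups
```
.ed
Dropping duplicate names after the redirection is a design choice of this sketch,
not something the text requires.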
                    642: .hn 3
                    643: Expires
                    644: .pg
                    645: This line,
                    646: if present,
                    647: is in a legal USENET date format.
                    648: It specifies a suggested expiration date for the article.
                    649: If not present,
                    650: the local default expiration date is used.
                    651: .pg
                    652: This field is intended to be used to clean up
                    653: articles with a limited usefulness,
                    654: or to keep important articles around for longer than usual.
                    655: For example,
                    656: a message announcing an upcoming seminar
                    657: could have an expiration date the day after the seminar,
                    658: since the message is not useful after the seminar is over.
                    659: Since local sites have local policies for expiration of news
                    660: (depending on available disk space,
                    661: for instance),
                    662: users are discouraged from providing expiration dates for articles
                    663: unless there is a natural expiration date associated with the topic.
                    664: System software should almost never provide a default
                    665: .hf Expires
                    666: line.
                    667: Leave it out and allow local policies to be used
                    668: unless there is a good reason not to.
                    669: .hn 3
                    670: References
                    671: .pg
                    672: This field lists the message ID's of any articles prompting
                    673: the submission of this article.
                    674: It is required for all follow-up articles,
                    675: and forbidden when a new subject is raised.
                    676: Implementations should provide a follow-up command,
                    677: which allows a user to post a follow-up article.
                    678: This command should generate a
                    679: .hf Subject
                    680: line which is the same as that of the original article,
                    681: except that if the original subject does not begin
                    682: with \*(lqRe: \*(rq or \*(lqre: \*(rq,
                    683: the four characters \*(lqRe: \*(rq are inserted before the subject.
                    684: If there is no
                    685: .hf References
                    686: line on the original header,
                    687: the
                    688: .hf References
                    689: line should contain the message ID of the original article
                    690: (including the angle brackets).
                    691: If the original article does have a
                    692: .hf References
                    693: line,
                    694: the followup article should have a
                    695: .hf References
                    696: line containing the text of the original
                    697: .hf References
                    698: line,
                    699: a blank,
                    700: and the message ID of the original article.
                    701: .pg
                    702: The purpose of the
                    703: .hf References
                    704: header is to allow articles to be grouped into conversations
                    705: by the user interface program.
                    706: This allows conversations within a newsgroup to be kept together,
                    707: and potentially users might shut off entire conversations
                    708: without unsubscribing to a newsgroup.
                    709: User interfaces need not make use of this header,
                    710: but all automatically generated followups should generate the
                    711: .hf References
                    712: line for the benefit of systems that do use it,
                    713: and manually generated followups
                    714: .i e\f1.\fPg ., (
                    715: typed in well after the original article has been printed by the machine)
                    716: should be encouraged to include them as well.
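.pg
The rules for the headers of a follow-up might be implemented along the following lines.
This is only a sketch under stated assumptions:
the function name and the dictionary representation of the original headers are hypothetical,
and the message ID in the original is assumed to already include its angle brackets.
.sd
```python
def followup_headers(original):
    """Build the Subject and References lines for a follow-up.

    `original` is a hypothetical dict holding the headers of the
    article being followed up.
    """
    subject = original["Subject"]
    # Insert the four characters "Re: " unless they (or "re: ") are there.
    if not subject.startswith(("Re: ", "re: ")):
        subject = "Re: " + subject

    message_id = original["Message-ID"]
    old_references = original.get("References")
    if old_references:
        # Original References text, a blank, then the original's ID.
        references = old_references + " " + message_id
    else:
        references = message_id
    return {"Subject": subject, "References": references}
```
.ed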
                    717: .hn 3
                    718: Control
                    719: .pg
                    720: If an article contains a
                    721: .hf Control
                    722: line,
                    723: the article is a control message.
                    724: Control messages are used for communication among USENET host machines,
                    725: not to be read by users.
                    726: Control messages are distributed by the same newsgroup mechanism
                    727: as ordinary messages.
                    728: The body of the
                    729: .hf Control
                    730: header line is the message to the host.
                    731: .pg
                    732: For upward compatibility,
                    733: messages that match the newsgroup pattern
                    734: .ng all.all.ctl
                    735: should also be interpreted as control messages.
                    736: If no
                    737: .hf Control
                    738: header is present on such messages,
                    739: the subject is used as the control message.
                    740: However,
                    741: messages on newsgroups matching this pattern do not conform to this standard.
                    742: .hn 3
                    743: Distribution
                    744: .pg
                    745: This line is used to alter the distribution scope of the message.
                    746: It has the same format as the
                    747: .hf Newsgroups
                    748: line.
                    749: User subscriptions are still controlled by
                    750: .hf Newsgroups ,
                    751: but the message is sent to all systems subscribing to the newsgroups
                    752: on the
                    753: .hf Distribution
                    754: line instead of the
                    755: .hf Newsgroups
                    756: line.
                    757: Thus, 
                    758: a car for sale in New Jersey might have headers including
                    759: .sd
                    760: Newsgroups: net.auto,net.wanted
                    761: Distribution: nj.all
                    762: .ed
                    763: so that it would only go to persons subscribing to
                    764: .ng net.auto
                    765: or
                    766: .ng net.wanted
                    767: within New Jersey.
                    768: The intent of this header is to restrict
                    769: the distribution of a newsgroup further,
                    770: not to increase it.
                    771: A local newsgroup,
                    772: such as
                    773: .ng nj.crazy-eddie ,
                    774: will probably not be propagated by sites outside New Jersey
                    775: that do not show such a newsgroup as valid.
                    776: Wildcards in newsgroup names in the
                    777: .hf Distribution
                    778: line are allowed.
                    779: Followup articles should default to the same
                    780: .hf Distribution
                    781: line as the original article,
                    782: but the user can change it to a more limited one,
                    783: or escalate the distribution
                    784: if it was originally restricted
                    785: and a more widely distributed reply is appropriate.
                    786: .hn 3
                    787: Organization
                    788: .pg
                    789: The text of this line is a short phrase describing the organization
                    790: to which the sender belongs,
                    791: or to which the machine belongs.
                    792: The intent of this line is to help identify the person posting the message,
                    793: since site names are often cryptic enough to make it hard
                    794: to recognize the organization by the electronic address.
                    795: .hn 3
                    796: Keywords
                    797: .pg
                    798: A few well-selected keywords identifying this article should be on
                    799: this line. They help the reader decide whether this article is
                    800: interesting.
                    801: .hn 3
                    802: Summary
                    803: .pg
                    804: This line (or lines) should contain a brief summary of the article. It is
                    805: usually used as part of a followup to another article. Again, it is
                    806: very useful to the reader in determining whether to read the article.
                    807: .hn 1
                    808: Control Messages
                    809: .pg
                    810: This section lists the control messages currently defined.
                    811: The body of the
                    812: .hf Control
                    813: header is the control message.
                    814: Messages are a sequence of zero or more words,
                    815: separated by white space (blanks or tabs).
                    816: The first word is the name of the control message;
                    817: the remaining words are parameters to the message.
                    818: The remainder of the header and the body of the message
                    819: are also potential parameters;
                    820: for example,
                    821: the
                    822: .hf From
                    823: line might suggest an address to which a response is to be mailed.
                    824: .pg
                    825: Implementors and administrators may choose to allow control messages
                    826: to be carried out automatically,
                    827: or to queue them for manual processing.
                    828: However,
                    829: manually processed messages should be dealt with promptly.
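.pg
Splitting a control message into its name and parameters is straightforward.
The sketch below assumes the body of the
Control
header has already been extracted as a string;
the function name is hypothetical.
.sd
```python
def parse_control(control_body):
    """Split a Control header body into (name, parameters).

    Words are separated by runs of white space (blanks or tabs).
    An empty message yields (None, []).
    """
    words = control_body.split()
    if not words:
        return None, []
    return words[0], words[1:]
```
.ed
A caller could dispatch on the returned name and either act on the message at once
or queue it for manual processing.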
                    830: .hn 2
                    831: Cancel
                    832: .pg l
                    833: .sd
                    834: cancel <message ID>
                    835: .ed
                    836: If an article with the given message ID is present on the local system,
                    837: the article is cancelled.
                    838: This mechanism allows a user to cancel an article
                    839: after the article has been distributed over the network.
                    840: .pg
                    841: If the system is unable to cancel the article as requested, it should not
                    842: forward the cancellation request to its neighbor systems.
                    843: .pg
                    844: Only the author of the article or the local super user
                    845: is allowed to use this message.
                    846: The verified sender of a message is the
                    847: .hf Sender
                    848: line,
                    849: or if no
                    850: .hf Sender
                    851: line is present,
                    852: the
                    853: .hf From
                    854: line.
                    855: The verified sender of the cancel message must be the same
                    856: as either the
                    857: .hf Sender
                    858: or
                    859: .hf From
                    860: field of the original message.
                    861: A verified sender in the cancel message is allowed to match an unverified
                    862: .hf From
                    863: in the original message.
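.pg
The sender check for cancel messages can be sketched as follows.
Several points here are assumptions of the sketch rather than requirements of this standard:
headers are held in hypothetical dictionaries,
addresses are compared as exact strings after removing any parenthesized full name,
and the separate rule allowing the local super user to cancel is not modelled.
.sd
```python
def address_only(value):
    """Drop an optional '(Full Name)' comment and surrounding blanks."""
    return value.split("(", 1)[0].strip()


def verified_sender(headers):
    """The Sender line if present, otherwise the From line."""
    return headers.get("Sender") or headers["From"]


def may_cancel(cancel_headers, original_headers):
    """True if the cancel message's verified sender matches either the
    Sender or the From field of the original message."""
    who = address_only(verified_sender(cancel_headers))
    for name in ("Sender", "From"):
        value = original_headers.get(name)
        if value is not None and address_only(value) == who:
            return True
    return False
```
.ed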
                    864: .hn 2
                    865: Ihave/Sendme
                    866: .pg l
                    867: .sd
                    868: ihave <message ID list> <remotesys>
                    869: sendme <message ID list> <remotesys>
                    870: .ed
                    871: This message is part of the
                    872: .pa ihave/sendme
                    873: protocol,
                    874: which allows one site
                    875: (say
                    876: .cn A )
                    877: to tell another site
                    878: .cn B ) (
                    879: that a particular message has been received on
                    880: .cn A .
                    881: Suppose that site
                    882: .cn A
                    883: receives article
                    884: .cf ucbvax.1234 ,
                    885: and wishes to transmit the article to site
                    886: .cn B .
                    887: .cn A
                    888: sends the control message
                    889: .cf "ihave ucbvax.1234 A"
                    890: to site
                    891: .cn B
                    892: (by posting it to newsgroup
                    893: .bi B ). \f3to.\fP
                    894: .cn B
                    895: responds with the control message
                    896: .cf "sendme ucbvax.1234 B"
                    897: (on newsgroup
                    898: .bi A ) \f3to.\fP
                    899: if it has not already received the article.
                    900: Upon receiving the
                    901: .pa sendme
                    902: message,
                    903: .cn A
                    904: sends the article to
                    905: .cn B .
                    906: .pg
                    907: This protocol can be used to cut down on redundant traffic between sites.
                    908: It is optional and should be used
                    909: only if the particular situation makes it worthwhile.
                    910: Frequently,
                    911: the outcome is that,
                    912: since most original messages are short,
                    913: and since there is a high overhead to start sending a new message with UUCP,
                    914: it costs as much to send the
                    915: .pa ihave
                    916: as it would cost to send the article itself.
                    917: .pg
                    918: One possible solution to this overhead problem is to batch requests.
                    919: Several message ID's may be announced or requested in one message.
                    920: If no message ID's are listed in the control message,
                    921: the body of the message should be scanned for message ID's,
                    922: one per line.
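.pg
A receiver might collect the message IDs of a batched
ihave
or
sendme
message as sketched below.
The function name is hypothetical,
and the sketch assumes that the last word of the control message is the remote system name,
following the forms shown at the start of this section.
.sd
```python
def ihave_sendme_ids(control_body, article_body):
    """Return (message_ids, remotesys) for an ihave or sendme message.

    IDs listed in the control message are used if there are any;
    otherwise the article body is scanned, one ID per line.
    """
    words = control_body.split()
    if not words or words[0] not in ("ihave", "sendme"):
        raise ValueError("not an ihave/sendme control message")
    params = words[1:]
    # Assumption: the final parameter names the remote system.
    remotesys = params[-1] if params else None
    ids = params[:-1]
    if not ids:
        ids = [line.strip() for line in article_body.splitlines()
               if line.strip()]
    return ids, remotesys
```
.ed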
                    923: .hn 2
                    924: Newgroup
                    925: .sd
                    926: newgroup <groupname>
                    927: .ed
                    928: .pg
                    929: This control message creates a new newsgroup with the name given.
                    930: Since no articles may be posted or forwarded until a newsgroup is created,
                    931: this message is required before a newsgroup can be used.
                    932: The body of the message is expected to be a short paragraph
                    933: describing the intended use of the newsgroup.
                    934: .hn 2
                    935: Rmgroup
                    936: .sd
                    937: rmgroup <groupname>
                    938: .ed
                    939: .pg
                    940: This message removes a newsgroup with the given name.
                    941: Since the newsgroup is removed from every site on the network,
                    942: this command should be used carefully by a responsible administrator.
                    943: .hn 2
                    944: Sendsys
                    945: .sd
                    946: sendsys        (no arguments)
                    947: .ed
                    948: .pg
                    949: The
                    950: .i sys
                    951: file,
                    952: listing all neighbors and which newsgroups are sent to each neighbor,
                    953: will be mailed to the author of the control message
                    954: .hf Reply-to , (
                    955: if present,
                    956: otherwise
                    957: .hf From ).
                    958: This information is considered public information,
                    959: and it is a requirement of membership in USENET
                    960: that this information be provided on request,
                    961: either automatically in response to this control message,
                    962: or manually,
                    963: by mailing the requested information to the author of the message.
                    964: This information is used to keep the map of USENET up to date,
                    965: and to determine where netnews is sent.
                    966: .pg
                    967: The format of the file mailed back to the author
                    968: should be the same as that of the
                    969: .i sys
                    970: file.
                    971: This format has one line per neighboring site
                    972: (plus one line for the local site),
                    973: containing four colon separated fields.
                    974: The first field has the site name of the neighbor,
                    975: the second field has a newsgroup pattern
                    976: describing the newsgroups sent to the neighbor.
                    977: The third and fourth fields are not defined by this standard.
                    978: A sample response:
                    979: .sd
                    980: From cbosgd!mark  Sun Mar 27 20:39:37 1983
                    981: Subject: response to your sendsys request
                    982: To: [email protected]
                    983: 
                    984: Responding-System: cbosgd.UUCP
                    985: cbosgd:osg,cb,btl,bell,net,to,test
                    986: ucbvax:net,to.ucbvax:L:
                    987: cbosg:net,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews/cbosg
                    988: cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb
                    989: sescent:net,bell,btl,cb,to.sescent:F:/usr/spool/outnews/sescent
                    990: npois:net,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois
                    991: mhuxi:net,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi
                    992: .ed
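.pg
A program that builds a map from such responses might read each line of the
sys
file as sketched below.
The function name and the returned dictionary keys are hypothetical.
The sketch assumes the newsgroup pattern is a comma-separated list, as in the sample,
and it pads lines that carry fewer than four fields, such as the local site line above.
.sd
```python
def parse_sys_line(line):
    """Split one line of a sys file into its colon-separated fields."""
    fields = line.rstrip().split(":", 3)
    fields += [""] * (4 - len(fields))  # pad short lines to four fields
    site, pattern, third, fourth = fields
    return {
        "site": site,
        "newsgroups": [g for g in pattern.split(",") if g],
        # The third and fourth fields are not defined by this standard,
        # so they are passed through untouched.
        "field3": third,
        "field4": fourth,
    }
```
.ed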
                    993: .hn 2
                    994: Senduuname
                    995: .pg l
                    996: .sd
                    997: senduuname     (no arguments)
                    998: .ed
                    999: The
                   1000: .i uuname (1)
                   1001: program is run,
                   1002: and the output is mailed to the author of the control message
                   1003: .hf Reply-to , (
                   1004: if present,
                   1005: otherwise
                   1006: .hf From ).
                   1007: This program lists all UUCP neighbors of the local site.
                   1008: This information is used to make maps of the UUCP network.
                   1009: The
                   1010: .i sys
                   1011: file is
                   1012: .b not
                   1013: the same as the UUCP
                   1014: .i L.sys
                   1015: file.
                   1016: The
                   1017: .i L.sys
                   1018: file should
                   1019: .b never
                   1020: be transmitted to another party
                   1021: without the consent of the sites whose passwords are listed therein.
                   1022: .pg
                   1023: It is optional for a site to provide this information.
                   1024: Some reply should be made to the author of the control message,
                   1025: so that a transmission error won't be blamed.
                   1026: It is also permissible for a site to run the
                   1027: .i uuname
                   1028: program
                   1029: (or in some other way determine the UUCP neighbors)
                   1030: and edit the output,
                   1031: either automatically or manually,
                   1032: before mailing the reply back to the author.
                   1033: The file should contain one site per line,
                   1034: beginning with the UUCP site name.
                   1035: Additional information may be included,
                   1036: separated from the site name by a blank or tab.
                   1037: The phone number or password for the site should
                   1038: .b not
                   1039: be included,
                   1040: as the reply is considered to be in the public domain.
                   1041: (The
                   1042: .i uuname
                   1043: program will send only the site name and not the entire contents of the
                   1044: .i L.sys
                   1045: file;
                   1046: thus,
                   1047: phone numbers and passwords are not transmitted.)
                   1048: .pg
                   1049: The purpose of this message is to generate and maintain UUCP mail routing maps.
                   1050: Thus, connections over which mail can be sent using the
                   1051: .cf site!user
                   1052: syntax should be included,
                   1053: regardless of whether the link is actually a UUCP link at the physical level.
                   1054: If a mail router should use it,
                   1055: it should be included.
                   1056: Since all information sent in response to this message is optional,
                   1057: sites are free to edit the list,
                   1058: deleting secret or private links they do not wish to publicize.
                   1059: .hn 2
                   1060: Version
                   1061: .pg l
                   1062: .sd
                   1063: version        (no arguments)
                   1064: .ed
                   1065: The name and version of the software running on the local system
                   1066: are to be mailed back to the author of the article
                   1067: .hf Reply-to "" (
                   1068: if present,
                   1069: otherwise
                   1070: .hf From ).
                   1071: .hn 1
                   1072: Transmission Methods
                   1073: .pg
                   1074: USENET is not a physical network,
                   1075: but rather a logical network
                   1076: resting on top of several existing physical networks.
                   1077: These networks include,
                   1078: but are not limited to,
                   1079: UUCP,
                   1080: the ARPANET,
                   1081: an Ethernet,
                   1082: the BLICN network,
                   1083: an NSC Hyperchannel,
                   1084: and a BERKNET.
                   1085: What is important is that two neighboring systems on USENET
                   1086: have some method to get a new article,
                   1087: in the format listed here,
                   1088: from one system to the other,
                   1089: and once on the receiving system,
                   1090: processed by the netnews software on that system.
                   1091: (On
                   1092: .ux
                   1093: systems,
                   1094: this usually means the
                   1095: .i rnews
                   1096: program being run with the article on the standard input.)
                   1097: .pg
                   1098: It is not a requirement that USENET sites have mail systems
                   1099: capable of understanding the ARPA Internet mail syntax,
                   1100: but it is strongly recommended.
                   1101: Since
                   1102: .hf From ,
                   1103: .hf Reply-To ,
                   1104: and
                   1105: .hf Sender
                   1106: lines use the Internet syntax, 
                   1107: replies will be difficult or impossible without an internet mailer.
                   1108: A site without an internet mailer can attempt to use the
                   1109: .hf Path
                   1110: header line for replies,
                   1111: but this field is not guaranteed to be a working path for replies.
                   1112: In any event,
                   1113: any site generating or forwarding news messages
                   1114: must have an internet address that allows it
                   1115: to receive mail from sites with internet mailers,
                   1116: and it must include its internet address on its From line.
                   1117: .hn 2
                   1118: Remote Execution
                   1119: .pg
                   1120: Some networks permit direct remote command execution.
                   1121: On these networks,
                   1122: news may be forwarded by spooling the
                   1123: .i rnews
                   1124: command with the article on the standard input.
                   1125: For example,
                   1126: if the remote system is called
                   1127: .cn remote ,
                   1128: news would be sent over a UUCP link with the command
                   1129: .sd c
                   1130: uux \- remote!rnews
                   1131: .ed
                   1132: and on a Berknet,
                   1133: .sd c
                   1134: net \-mremote rnews
                   1135: .ed
                   1136: It is important that the article be sent via a reliable mechanism,
                   1137: normally involving the possibility of spooling,
                   1138: rather than direct real-time remote execution.
                   1139: This is because,
                   1140: if the remote system is down,
                   1141: a direct execution command will fail,
                   1142: and the article will never be delivered.
                   1143: If the article is spooled,
                   1144: it will eventually be delivered when both systems are up.
                   1145: .hn 2
                   1146: Transfer by Mail
                   1147: .pg
                   1148: On some systems,
                   1149: direct remote spooled execution is not possible.
                   1150: However,
                   1151: most systems support electronic mail,
                   1152: and a news article can be sent as mail.
                   1153: One approach is to send a mail message
                   1154: which is identical to the news message:
                   1155: the mail headers are the news headers,
                   1156: and the mail body is the news body.
                   1157: By convention,
                   1158: this mail is sent to the user
                   1159: .i newsmail
                   1160: on the remote machine.
                   1161: .pg
                   1162: One problem with this method is that it may not be possible to convince
                   1163: the mail system that the
                   1164: .hf From
                   1165: line of the message is valid,
                   1166: since the mail message was generated by a program
                   1167: on a system different from the source of the news article.
                   1168: Another problem is that error messages caused by the mail transmission
                   1169: would be sent to the originator of the news article,
                   1170: who has no control over news transmission between two cooperating hosts
                   1171: and does not know who to contact.
                   1172: Transmission error messages should be directed to a responsible
                   1173: contact person on the sending machine.
                   1174: .pg
                   1175: A solution to these problems is to encapsulate the news article
                   1176: into a mail message,
                   1177: such that the entire article
                   1178: (headers and body)
                   1179: is part of the body of the mail message.
                   1180: The convention here is that such mail is sent to user
                   1181: .i rnews
                   1182: on the remote system.
                   1183: A mail message body is generated by prepending the letter
                   1184: .qp N
                   1185: to each line of the news article,
                   1186: and then attaching whatever mail headers are convenient to generate.
                   1187: The
                   1188: .qp N 's
                   1189: are attached to prevent any special lines in the news article
                   1190: from interfering with mail transmission,
                   1191: and to prevent any extra lines inserted by the mailer
                   1192: (headers,
                   1193: blank lines,
                   1194: etc.)
                   1195: from becoming part of the news article.
                   1196: A program on the receiving machine receives mail to
                   1197: .i rnews ,
                   1198: extracting the article itself and invoking the
                   1199: .i rnews
                   1200: program.
                   1201: An example in this format might look like this:
                   1202: .sd
                   1203: Date: Monday, 3-Jan-83 08:33:47 MST
                   1204: From: [email protected]
                   1205: Subject: network news article
                   1206: To: [email protected]
                   1207: 
                   1208: NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek
                   1209: NFrom: [email protected] (Derek Andrew)
                   1210: NNewsgroups: net.test
                   1211: NSubject: necessary test
                   1212: NMessage-ID: <[email protected]>
                   1213: NDate: Monday, 3 Jan 83 00:59:15 MST
                   1214: N
                   1215: NThis really is a test.  If anyone out there more than 6 
                   1216: Nhops away would kindly confirm this note I would
                   1217: Nappreciate it.  We suspect that our news postings
                   1218: Nare not getting out into the world.
                   1219: N
                   1220: 
                   1221: .ed
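.pg
The encapsulation shown above is simple enough to sketch in a few lines of Python.
The function names are hypothetical.
The sketch assumes the mail headers have already been removed by the caller,
and it discards any body line that does not begin with
N
as having been inserted by a mailer.
.sd
```python
def encapsulate(article):
    """Prepend the letter N to every line of a news article."""
    return "".join("N" + line for line in article.splitlines(keepends=True))


def extract(mail_body):
    """Recover the news article from an encapsulated mail body.

    Lines not starting with N (blank lines or other text added in
    transit) are dropped; the leading N is removed from the rest.
    """
    kept = [line[1:]
            for line in mail_body.splitlines(keepends=True)
            if line.startswith("N")]
    return "".join(kept)
```
.ed
The output of
extract
would then be handed to the
rnews
program on its standard input.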
                   1222: .pg
                   1223: Using mail solves the spooling problem,
                   1224: since mail must always be spooled if the destination host is down.
                   1225: However,
                   1226: it adds more overhead to the transmission process
                   1227: (to encapsulate and extract the article)
                   1228: and makes it harder for software to give different priorities
                   1229: to news and mail.
                   1230: .hn 2
                   1231: Batching
                   1232: .pg
                   1233: Since news articles are usually short,
                   1234: and since a large number of messages
                   1235: are often sent between two sites in a day,
                   1236: it may make sense to batch news articles.
                   1237: Several articles can be combined into one large article,
                   1238: using conventions agreed upon in advance by the two sites.
                   1239: One such batching scheme is described here;
                   1240: its use is still considered experimental.
                   1241: .pg
                   1242: News articles are combined into a script, each preceded by a header line of the form:
                   1243: .sd
                   1244: #! rnews 1234
                   1245: .ed
                   1246: where
                   1247: .i 1234
                   1248: is the length,
                   1249: in bytes,
                   1250: of the article.
                   1251: Each such line is followed by an article containing the given number of bytes.
                   1252: (The newline at the end of each line of the article is counted as one byte,
                   1253: for purposes of this count, even if it is stored as
                   1254: .qc "CARRIAGE RETURN\s+2><\s-2LINE FEED" \&.)
                   1255: For example,
                   1256: a batch of articles might look like this:
                   1257: .sd
                   1258: #! rnews 207
                   1259: From: [email protected] (Jerry Schwarz)
                   1260: Path: cbosgd!mhuxj!mhuxt!eagle!jerry
                   1261: Newsgroups: net.general
                   1262: Subject: Usenet Etiquette -- Please Read
                   1263: Message-ID: <[email protected]>
                   1264: Date: Friday, 19 Nov 82 16:14:55 EST
                   1265: 
                   1266: Here is an important message about USENET Etiquette.
                   1267: #! rnews 203
                   1268: From: [email protected] (Jerry Schwarz)
                   1269: Path: cbosgd!mhuxj!mhuxt!eagle!jerry
                   1270: Newsgroups: net.followup
                   1271: Subject: Notes on Etiquette article
                   1272: Message-ID: <[email protected]>
                   1273: Date: Friday, 19 Nov 82 17:24:12 EST
                   1274: 
                   1275: There was something I forgot to mention in the last message.
                   1276: .ed
                   1277: Batched news is recognized because the first character in the message is
                   1278: .qp # .
                   1279: The message is then passed to the unbatcher for interpretation.
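An unbatcher for this scheme might be sketched as below.
The names unbatch and make_batch are hypothetical, and the sketch assumes
the batch is already held in memory as bytes with each line ending in a
single newline byte, so that the byte counts agree with the rule above.

```python
def unbatch(batch):
    """Split a batch (bytes) into the articles it contains."""
    articles = []
    pos = 0
    while pos < len(batch):
        eol = batch.find(b"\n", pos)
        if eol == -1:
            raise ValueError("batch header line is not terminated")
        fields = batch[pos:eol].split()
        # A header line has the form "#! rnews <length>".
        if (len(fields) != 3 or fields[0] != b"#!"
                or fields[1] != b"rnews" or not fields[2].isdigit()):
            raise ValueError("expected '#! rnews <length>' header")
        start = eol + 1
        end = start + int(fields[2])
        if end > len(batch):
            raise ValueError("article is shorter than its declared length")
        articles.append(batch[start:end])
        pos = end  # the next header, if any, starts right after the article
    return articles


def make_batch(articles):
    """Inverse of unbatch: prefix each article with its header line."""
    return b"".join(b"#! rnews %d\n" % len(a) + a for a in articles)
```

Because the unbatcher relies on the byte count rather than scanning for the
next header, an article that itself contains a line starting with "#!" is
passed through intact.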
                   1280: .hn 1
                   1281: The News Propagation Algorithm
                   1282: .pg
                   1283: This section describes the overall scheme of USENET and the algorithm
                   1284: followed by sites in propagating news to the entire network.
                   1285: Since all sites are affected by incorrectly formatted articles
                   1286: and by propagation errors,
                   1287: it is important for the method to be standardized.
                   1288: .pg
                   1289: USENET is a directed graph.
                   1290: Each node in the graph is a host computer,
                   1291: and each arc in the graph is a transmission path
                   1292: from one host to another host.
                   1293: Each arc is labelled with a newsgroup pattern,
                   1294: specifying which newsgroup classes are forwarded along that link.
                   1295: Most arcs are bidirectional,
                   1296: that is,
                   1297: if site
                   1298: .cn A
                   1299: sends a class of newsgroups to site
                   1300: .cn B ,
                   1301: then site
                   1302: .cn B
                   1303: usually sends the same class of newsgroups to site
                   1304: .cn A .
                   1305: This bidirectionality is not,
                   1306: however,
                   1307: required.
                   1308: .pg
                   1309: USENET is made up of many subnetworks.
                   1310: Each subnet has a name,
                   1311: such as
                   1312: .ng net
                   1313: or
                   1314: .ng btl .
                   1315: The special subnet
                   1316: .ng net
                   1317: is defined to be USENET,
                   1318: although the union of all subnets may be a superset of USENET
                   1319: (because of sites that get local newsgroup classes but do not get
                   1320: .ng net.all ).
                   1321: Each subnet is a connected graph,
                   1322: that is,
                   1323: a path exists from every node to every other node in the subnet.
                   1324: In addition,
                   1325: the entire graph is
                   1326: (theoretically)
                   1327: connected.
                   1328: (In practice,
                   1329: some political considerations have caused some sites
                   1330: to be unable to post articles reaching the rest of the network.)
                   1331: .pg
                   1332: An article is posted on one machine to a list of newsgroups.
                   1333: That machine accepts it locally,
                   1334: then forwards it to all its neighbors that are interested
                   1335: in at least one of the newsgroups of the message.
                   1336: (Site
                   1337: .cn A
                   1338: deems site
                   1339: .cn B
                   1340: to be \*(lqinterested\*(rq in a newsgroup
                   1341: if the newsgroup matches the pattern on the arc from
                   1342: .cn A
                   1343: to
                   1344: .cn B .
                   1345: This pattern is stored in a file on the
                   1346: .cn A
                   1347: machine.)
                   1348: The sites receiving the incoming article examine it
                   1349: to make sure they really want the article,
                   1350: accept it locally,
                   1351: and then in turn forward the article to all
                   1352: .i their
                   1353: interested neighbors.
                   1354: This process continues until the entire network has seen the article.
                   1355: .pg
                   1356: An important part of the algorithm is the prevention of loops.
                   1357: Left unchecked, the above process would cause a message to loop along a cycle forever.
                   1358: In particular,
                   1359: when site
                   1360: .cn A
                   1361: sends an article to site
                   1362: .cn B ,
                   1363: site
                   1364: .cn B
                   1365: will send it back to site
                   1366: .cn A ,
                   1367: which will send it to site
                   1368: .cn B ,
                   1369: and so on.
                   1370: One solution to this is the history mechanism.
                   1371: Each site keeps track of all articles it has seen
                   1372: (by their message ID)
                   1373: and whenever an article comes in that it has already seen,
                   1374: the incoming article is discarded immediately.
                   1375: This solution is sufficient to prevent loops,
                   1376: but additional optimizations can be made to avoid sending articles to sites
                   1377: that will simply throw them away.
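The history mechanism can be illustrated with a small simulation.
This is a sketch under simplifying assumptions: the Site class and link
function are hypothetical, every neighbor is treated as interested in
every article, and the history is an in-memory set rather than a file.

```python
class Site:
    """One host, with a history of the message IDs it has already seen."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []   # sites this host forwards to
        self.history = set()  # message IDs seen so far
        self.accepted = []    # articles accepted locally, in arrival order

    def receive(self, message_id):
        if message_id in self.history:
            return False      # already seen: discard immediately
        self.history.add(message_id)
        self.accepted.append(message_id)
        for neighbor in self.neighbors:
            neighbor.receive(message_id)
        return True


def link(a, b):
    """Connect two sites with a bidirectional arc."""
    a.neighbors.append(b)
    b.neighbors.append(a)
```

With three sites linked in a triangle, an article posted at one site is
accepted exactly once by each site, and every later copy is discarded on
arrival, so forwarding stops even though the graph contains cycles.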
                   1378: .pg
                   1379: One optimization is that an article should never be sent to a machine
                   1380: listed in the
                   1381: .hf Path
                   1382: line of the header.
                   1383: When a machine name is in the
                   1384: .hf Path
                   1385: line,
                   1386: the message is known to have passed through the machine.
                   1387: Another optimization is that,
                   1388: if the article originated on site
                   1389: .cn A ,
                   1390: then site
                   1391: .cn A
                   1392: has already seen the article.
                   1393: (Origination can be determined by the
                   1394: .hf Posting-Version
                   1395: line.)
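The Path optimization might be sketched as follows.
The function name neighbors_to_send is hypothetical, and the sketch assumes,
as in the examples in this document, that the final component of the Path
line is the poster's login name rather than a site name.

```python
def neighbors_to_send(path, neighbors):
    """Return the neighbors that do not already appear in the Path line.

    path is the value of the Path header, for example
    "cbosgd!mhuxj!harpo!utah-cs!sask!derek".
    Assumption: the last component is a login name, not a site.
    """
    sites_seen = set(path.split("!")[:-1])
    return [n for n in neighbors if n not in sites_seen]
```

For instance, with the path above and neighbors harpo, ihnp4, and sask,
only ihnp4 would be sent the article.
The history check is still needed on the receiving side, since this test
only avoids some of the redundant transmissions.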
                   1396: .pg
                   1397: Thus,
                   1398: if an article is posted to newsgroup
                   1399: .ng net.misc ,
                   1400: it will match the pattern
                   1401: .ng net.all
                   1402: (where
                   1403: .ng all
                   1404: is a metasymbol that matches any string),
                   1405: and will be forwarded to all sites that subscribe to
                   1406: .ng net.all
                   1407: (as determined by what their neighbors send them).
                   1408: These sites make up the
                   1409: .ng net
                   1410: subnetwork.
                   1411: An article posted to
                   1412: .ng btl.general
                   1413: will reach all sites receiving
                   1414: .ng btl.all ,
                   1415: but will not reach sites that do not get
                   1416: .ng btl.all .
                   1417: In effect,
                   1418: the article reaches the
                   1419: .ng btl
                   1420: subnetwork.
                   1421: An article posted to newsgroups
                   1422: .ng net.micro,btl.general
                   1423: will reach all sites subscribing to either of the two classes.
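The matching of newsgroups against patterns can be sketched as below.
The function names matches and interested are hypothetical.
The sketch assumes that "all" acts as a metasymbol only when it stands as a
whole dot-separated component, and that lists of newsgroups and of patterns
are separated by commas; it does not cover any other pattern syntax.

```python
import re


def matches(newsgroup, pattern):
    """True if newsgroup matches pattern, where "all" matches any string."""
    parts = [".*" if part == "all" else re.escape(part)
             for part in pattern.split(".")]
    return re.fullmatch(r"\.".join(parts), newsgroup) is not None


def interested(newsgroups, patterns):
    """True if at least one listed newsgroup matches at least one pattern.

    Both arguments are comma-separated lists.
    """
    return any(matches(group, pattern)
               for group in newsgroups.split(",")
               for pattern in patterns.split(","))
```

Under these assumptions an article posted to net.micro,btl.general is of
interest both to a site whose pattern is net.all and to one whose pattern
is btl.all, but not to a site whose only pattern is fa.all.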
