1.1 root 1: .so tmac.ilib
2: .TH RSG 1 "The University of Arizona \- 5/16/83"
3: .SH NAME
4: rsg \- generate random sentences
5: .SH SYNOPSIS
6: \f3rsg\fP [\f3\-s\fI n\fR] [\f3\-l\fI n\fR] [\f3\-t\fR]
7: .SH DESCRIPTION
8: \fIRsg\fR generates randomly selected sentences from a grammar specified by
9: the user.
10: .PP
11: The following options may appear in any order:
12: .IP "\f3\-s\fI n\fR"
13: Set the seed for random generation to \fIn\fR.
14: The default seed is 0.
15: .IP "\f3\-l\fI n\fR"
16: Terminate generation if the number of symbols remaining to be processed
17: exceeds \fIn\fR. There is no default limit.
18: .IP \f3\-t\fR
19: Trace the generation of sentences. Trace output is written to
20: standard error.
21: .PP
22: \fIRsg\fR works interactively, allowing the user to build, test, modify,
23: and save grammars. Input to \fIrsg\fR consists of various kinds of
24: specifications, which can be intermixed:
25: .PP
26: \fIProductions\fR define nonterminal symbols in a syntax similar to
27: the rewriting rules of BNF with various alternatives consisting
28: of the concatenation of nonterminal and terminal symbols.
29: .PP
30: \fIGeneration specifications\fR cause the generation of a specified
31: number of sentences from the language defined by a given nonterminal
32: symbol.
33: .PP
34: \fIGrammar output specifications\fR cause the definition of a
35: specified nonterminal or the entire current grammar to be written
36: to a given file.
37: .PP
38: \fISource specifications\fR cause subsequent input to be read from
39: a specified file.
40: .PP
41: In addition, any line beginning with \*M#\fR is considered to be
42: a comment, while any line beginning with \*M=\fR causes the rest
43: of that line to be used as a prompt to the user whenever \fIrsg\fR
44: is ready for input (there normally is no prompt). A line consisting
45: of a single \*M=\fR stops prompting.
46: .SH \0\0\0Productions
47: Examples of productions are:
48: .DS
49: <expr>::=<term>|<term>+<expr>
50: <term>::=<element>|<element>*<term>
51: <element>::=x|y|z|(<expr>)
52: .DE
53: Productions may occur in any order. The definition for a nonterminal
54: symbol can be changed by specifying a new production for it.
55: .PP
56: Several special devices facilitate the definition of
57: grammars, including eight built-in nonterminal symbols:
58: .nf
59: .sp 1
60: .ta .5i 1.5i
61: symbol definition
62: .sp .5
63: \*M<lb> <
64: <rb> >
65: <vb> |
66: <nl>\fR newline
67: \*M<>\fR empty string
68: \*M<&lcase>\fR any single lowercase letter
69: \*M<&ucase>\fR any single uppercase letter
70: \*M<&digit>\fR any single digit
71: .sp 1
72: .fi
73: In addition, if the string between a \*M<\fR and \*M>\fR
74: begins and
75: ends with a single quotation mark, that construction stands for
76: any single character between the quotation marks. For example,
77: .DS
78: <'xyz'>
79: .DE
80: is equivalent to
81: .DS
82: x|y|z
83: .DE
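.PP
The quoted form can be read as shorthand for a list of one-character
alternatives. The following Python sketch illustrates that reading only;
the function name \fIexpand_quoted\fR is hypothetical and is not part
of \fIrsg\fR.

```python
def expand_quoted(symbol):
    """Return the alternatives for a symbol such as <'xyz'>, else None.

    Illustration only: this mirrors the rule described above, not the
    actual rsg implementation.
    """
    body = symbol[1:-1]  # text between the enclosing < and >
    if len(body) >= 2 and body[0] == "'" and body[-1] == "'":
        return list(body[1:-1])  # one alternative per character
    return None


print("|".join(expand_quoted("<'xyz'>")))
```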
84: Finally, if the name of a nonterminal symbol between the \*M<\fR and
85: \*M>\fR begins with \*M?\fR, the user is queried during generation
86: to supply a string for that nonterminal symbol. For example, in
87: .DS
88: <expr>::=<term>|<term>+<expr>|<?expr>
89: .DE
90: if the third alternative is encountered during generation, the user is
91: asked to provide a string for \*M<expr>\fR.
92: .SH \0\0\0Generation Specifications
93: A generation specification consists of a nonterminal symbol
94: followed by a nonnegative integer. An example is
95: .DS
96: <expr>10
97: .DE
98: which specifies the generation of 10 \*M<expr>\fRs. If the
99: integer is omitted, it is assumed to be 1. Generated sentences
100: are written to standard output.
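.PP
As a rough illustration of what a generation specification does, the
following Python sketch expands the example grammar by repeatedly
replacing the leftmost pending symbol with a randomly chosen alternative.
It is a sketch under stated assumptions, not the \fIrsg\fR implementation:
the names \fIGRAMMAR\fR, \fITOKEN\fR, and \fIgenerate\fR are hypothetical,
alternatives are assumed to be equally likely, and Python's random number
generator will not reproduce the sequence \fIrsg\fR produces for a given
seed. The \fIlimit\fR argument imitates the \f3\-l\fR option.

```python
import random
import re

# Hypothetical in-memory form of the example productions: each
# nonterminal maps to its list of alternatives.
GRAMMAR = {
    "expr": ["<term>", "<term>+<expr>"],
    "term": ["<element>", "<element>*<term>"],
    "element": ["x", "y", "z", "(<expr>)"],
}

# A symbol is either a nonterminal such as <expr> or one terminal character.
TOKEN = re.compile(r"<[^<>]+>|[^<>]")


def generate(start, rng, limit=None):
    """Generate one sentence from the nonterminal named `start`.

    Raises RuntimeError, carrying the partial sentence, when more than
    `limit` symbols remain to be processed. An undefined nonterminal
    surfaces as a KeyError in this sketch.
    """
    pending = ["<" + start + ">"]  # symbols remaining, leftmost first
    out = []
    while pending:
        if limit is not None and len(pending) > limit:
            raise RuntimeError("limit exceeded: " + "".join(out))
        symbol = pending.pop(0)
        if symbol.startswith("<"):
            alternative = rng.choice(GRAMMAR[symbol[1:-1]])
            pending[0:0] = TOKEN.findall(alternative)
        else:
            out.append(symbol)
    return "".join(out)


# Rough analogue of the specification <expr>10 with seed 0 and a limit.
rng = random.Random(0)
for _ in range(10):
    try:
        print(generate("expr", rng, limit=200))
    except RuntimeError as err:
        print(err)
```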
101: .SH \0\0\0Grammar Output Specifications
102: A grammar output specification consists of a nonterminal symbol,
103: followed by \*M\->\fR, followed by a file name. Such a specification
104: causes the current definition of the nonterminal symbol to be
105: written to the given file. If the file is omitted, standard output
106: is assumed. If the nonterminal symbol is omitted, the entire grammar
107: is written out. Thus,
108: .DS
109: \->
110: .DE
111: causes the entire grammar to be written to standard output.
112: .SH \0\0\0Source Specifications
113: A source specification consists of \*M@\fR followed by a file name.
114: Subsequent input is read from that file. When an end of file is encountered,
115: input reverts to the previous file. Input files can be nested.
116: .SH DIAGNOSTICS
117: Syntactically erroneous input lines are noted, but ignored.
118: .PP
119: Specifications for a file that cannot be opened are noted and treated as
120: erroneous.
121: .PP
122: If an undefined nonterminal symbol is encountered during generation,
123: an error message that identifies the undefined symbol is produced,
124: followed by the partial sentence generated to that point. Exceeding
125: the limit of symbols remaining to be generated as specified by
126: the \f3\-l\fR option is handled similarly.
127: .SH CAVEATS
128: Generation may fail to terminate because of a loop in the rewriting
129: rules or, more seriously, because of the progressive accumulation
130: of nonterminal symbols. The latter problem can be identified
131: by using the \f3\-t\fR option and controlled by using the \f3\-l\fR
132: option. The problem often can be circumvented by duplicating alternatives
133: that lead to fewer rather than more nonterminal symbols. For
134: example, changing
135: .DS
136: <expr>::=<term>|<term>+<expr>
137: .DE
138: to
139: .DS
140: <expr>::=<term>|<term>|<term>+<expr>
141: .DE
142: increases the probability of selecting \*M<term>\fR from 1/2 to 2/3.
143: See the second reference listed below for a discussion of the general
144: problem.
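.PP
The arithmetic behind the example can be checked with a short Python
sketch. It assumes, as the probabilities quoted above imply, that each
listed alternative is equally likely to be chosen; the function name
\fIselection_probability\fR is hypothetical.

```python
from fractions import Fraction


def selection_probability(alternatives, wanted):
    """Probability of picking `wanted` when every listed alternative
    is equally likely (duplicates count once per occurrence)."""
    return Fraction(alternatives.count(wanted), len(alternatives))


before = ["<term>", "<term>+<expr>"]
after = ["<term>", "<term>", "<term>+<expr>"]

print(selection_probability(before, "<term>"))  # 1/2
print(selection_probability(after, "<term>"))   # 2/3
```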
145: .SH SEE ALSO
146: .Ib
147: pp. 211-219, 301-302.
148: .PP
149: Wetherell, C. S. ``Probabilistic Languages: A Review and Some Open
150: Questions'', \fIComputing Surveys\fR, Vol. 12, No. 4 (1980), pp. 361-379.
151: .SH AUTHOR
152: Ralph E. Griswold