1.1 root 1: .th SPEAK I 8/15/73
2: .if t .ds A \o"a\(ga"
3: .if n .ds A a`
4: .if t .ds v \|\(bv
5: .sh NAME
6: speak \*- word to voice translator
7: .sh SYNOPSIS
8: .bd speak
9: [
10: .bd \*-epsv
11: ] [ vocabulary [ output ] ]
12: .sh DESCRIPTION
13: .it Speak
14: turns a stream of words
15: into utterances and outputs them to a voice synthesizer,
16: or to a specified output file.
17: It has facilities for maintaining a vocabulary.
18: It receives, from the standard input
19: .s3
20: .lp +5 3
21: \*- working lines: text of words separated by blanks
22: .lp +5 3
23: \*- phonetic lines: strings of phonemes for one word preceded
24: and separated by commas.
25: The phonemes may be followed by comma-percent then a `replacement
26: part' \*- an ASCII string with no spaces.
27: The phonetic code is given in vsp(VII).
28: .lp +5 3
29: \*- empty lines
30: .lp +5 3
31: \*- command lines: beginning with
32: .bd !.
33: The following command lines
34: are recognized:
35: .s3
36: .lp +15 10
37: \fB!r\fR file replace coded vocabulary from file
38: .lp +15 10
39: \fB!w\fR file write coded vocabulary on file
40: .lp +15 10
41: \fB!p\fR print parsing for working word
42: .lp +15 10
43: \fB!l\fR list vocabulary on standard output with phonetics
44: .lp +15 10
45: \fB!c\fR word copy phonetics from working word to
46: specified word
47: .lp +15 10
48: \fB!d\fR print phonetics for working word
49: .s3
50: .i0
51: Each working line replaces its predecessor.
52: Its first word is the `working word'.
53: Each phonetic line replaces the phonetics stored for the
54: working word.
55: In particular, a phonetic line of comma only deletes the
56: entry for the working word.
57: Each working line, phonetic line or empty line
58: causes the working line to be uttered.
59: The process terminates at the end of input.
60: .s3
61: Unknown words are pronounced by rules, and failing that,
62: are spelled.
63: Spelling is done by taking each character of
64: the word, prefixing it with *, and looking it up.
65: Unspellable words burp.
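The spelling fallback described above can be sketched roughly as follows. This is only an illustration, not the program's actual code: the dictionary `vocabulary`, the function name `spell`, and the phonetic strings are all hypothetical placeholders (they are not real vsp(VII) codes).

```python
def spell(word, vocabulary):
    """Look up each character of `word` under the key '*' + character.

    `vocabulary` is a hypothetical dict standing in for the stored table.
    Returns a list of phonetic strings, or None if any character has no
    entry (the case in which the text says the word "burps").
    """
    phonetics = []
    for ch in word:
        entry = vocabulary.get("*" + ch)
        if entry is None:
            return None          # unspellable
        phonetics.append(entry)
    return phonetics


# Placeholder entries for illustration only.
demo_vocabulary = {"*a": "PH_A", "*b": "PH_B"}
spelled = spell("ab", demo_vocabulary)
unspellable = spell("ax", demo_vocabulary)
```

Here `spelled` holds one entry per character, while `unspellable` is None because no `*x` entry exists in the demo table.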
66: .s3
67: .it Speak
68: is initialized with a coded vocabulary stored in file
69: /usr/lib/speak.m.
70: The vocabulary option substitutes a different file for
71: /usr/lib/speak.m.
72: .s3
73: A set of single letter options may
74: appear in any order preceded by
75: .bd \*-.
76: Their meanings are:
77: .s3
78: .lp +8 4
79: \fB\*-e\fR suppress English steps (4\*-8) below
80: .lp +8 4
81: \fB\*-p\fR suppress pronunciation by rule
82: .lp +8 4
83: \fB\*-s\fR suppress spelling
84: .lp +8 4
85: \fB\*-v\fR suppress voice output
86: .s3
87: .i0
88: .ne4
89: The steps of pronunciation by rule are:
90: .s3
91: .lp +5 5
92: (1) If there were no lower case letters in the working line,
93: fold all upper case letters to lower.
94: .lp +5 5
95: (2) Fold an initial cap to lower case, and try again.
96: .lp +5 5
97: (3) If the word has only one letter, or has no lower case vowels, quit.
98: .lp +5 5
99: (4) If there is a final
100: .it s,
101: strip it.
102: .lp +5 5
103: (5) Replace final \*-ie by \*-y.
104: .lp +5 5
105: (6) If any changes have been made, try whole word again.
106: .lp +5 5
107: (7) Locate probable long vowels and capitalize them.
108: Mark probable silent \fIe\fR's.
109: .lp +5 5
110: (8) Put back the
111: .it s
112: stripped in (4), if any.
113: .lp +5 5
114: (9) Place # before and after word.
115: .lp +5 5
116: (10) Prefix word with
117: .it %,
118: and look up longest initial match
119: in the stored table of words; if none, quit.
120: .lp +5 5
121: (11) Use phonemes from the stored phonetic string as
122: pronunciation, and replace the matched stuff by the
123: replacement part of the phonetic string.
124: .lp +5 5
125: (12) If anything remains, go to (10).
126: .s3
127: .i0
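Steps (9) through (12) above amount to a repeated longest-prefix lookup with substitution. A minimal sketch follows, under several assumptions that are not in the manual: the table is modelled as a dict from `%`-prefixed keys to a hypothetical `(phonemes, replacement)` pair, the names `pronounce_by_table` and `max_steps` are invented, and the iteration cap is a safeguard added for the sketch only.

```python
def pronounce_by_table(word, table, max_steps=100):
    """Sketch of steps (9)-(12); returns a phoneme list, or None on failure.

    `table` is a hypothetical dict: '%'-prefixed key -> (phonemes, replacement).
    """
    rest = "#" + word + "#"                  # step (9)
    phonemes = []
    for _ in range(max_steps):               # cap is an assumption of this sketch
        if not rest:
            return phonemes                  # step (12): nothing remains
        probe = "%" + rest                   # step (10): prefix with %
        match = None
        # Try the longest initial substring first; a bare "%" never counts.
        for n in range(len(probe), 1, -1):
            if probe[:n] in table:
                match = probe[:n]
                break
        if match is None:
            return None                      # step (10): no match, quit
        phon, replacement = table[match]
        phonemes.append(phon)                # step (11): use stored phonemes
        # Step (11): replace the matched text by the replacement part.
        rest = replacement + rest[len(match) - 1:]
    return None


# Placeholder table for illustration; the phoneme strings are not real codes.
demo_table = {
    "%#c": ("k", ""),
    "%a": ("a", ""),
    "%at#": ("at", ""),
}
```

With this demo table, the word `cat` is consumed in two lookups (`%#c`, then `%at#`), and the longer key `%at#` is preferred over `%a`.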
128: Long vowels are located this way in step (7):
129: .s3
130: .lp +5 5
131: (1) A \fIu\fR appearing in context
132: [^aeiou]u[^aeiouwxy][aieouy].
133: (The notation is just a regular expression \*A la ed(I).)
134: .ft I
135: (pustUlous)
136: .ft R
137: .lp +5 5
138: (2) One of [aeo] appearing in the context
139: [aeo][^aehiouwxy][ie][aou]
140: or in the context [aeo][^aehiouwxy]ien is assumed long.
141: The digram \fIth\fR behaves as a single letter in this test.
142: .ft I
143: (rAdium, facEtious, quOtient, carpAthian)
144: .ft R
145: .lp +5 5
146: (3) If the first vowel in the word is \fIi\fR followed by one of
147: \fIaou,\fR it is assumed long.
148: .ft I
149: (Iodine, dIameter, trIumph)
150: .ft R
151: .lp +5 5
152: (4) If the only vowel in the word is final \fIe,\fR the vowel is
153: assumed long.
154: .ft I
155: (bE, shE)
156: .ft R
157: .lp +5 5
158: (5) If the only vowels in the word appear in the pattern
159: [aeiouy][^aeiouwxy]S, where S is one of the suffixes
160: .br
161: .dt
162: \*-al \*-le \*-re \*-y
163: .br
164: .lp +5 0
165: then the first vowel is assumed long.
166: .ft I
167: (glObal, tAble, lUcre, lAdy)
168: .ft R
169: .lp +5 5
170: (6) If no suffix was found in (5),
171: as many of these suffixes as possible are isolated from
172: right to left.
173: Stripping stops once \fIe\fR has been stripped,
174: and \fIe\fR is not stripped before a suffix beginning with \fIe\fR.
175: Each suffix is marked by inserting \*v just before the first letter, or
176: just after \fIe\fR in those suffixes that begin with \fIe\fR.
177: .br
178: .dt
179: \*-able \*-ably \*-e \*-ed \*-en
180: .br
181: \*-er \*-ery \*-est \*-ful \*-ly
182: .br
183: \*-ing \*-less \*-ment \*-ness \*-or
184: .lp +5 0
185: .ft I
186: (care\*vful\*vly, maj\*vor, fine\*vry, state\*v, caree\*vr)
187: .ft R
188: .lp +5 5
189: (7) If the word, exclusive of suffixes, ends in \fIi\fR or \fIy\fR,
190: and contains no earlier vowel, then \fIi\fR or \fIy\fR is assumed long.
191: .ft I
192: (pY \fR(from pie),\fI crY\*ving, lIe\*vd)
193: .ft R
194: .lp +5 5
195: (8) If the first suffix begins with one of [aeio],
196: then the vowel [aeiouy] in an immediately
197: preceding pattern [^aeo][aeiouy][^aeiouwxy]
198: is assumed long.
199: The digram \fIth\fR behaves as a single letter in this test.
200: .ft I
201: (cAre\*vful\*vly, bAthe\*vd, mAj\*vor, pOt\*vable, port\*vable)
202: .ft R
203: .lp +5 5
204: (9) In these exceptional cases no long letter is assumed in the
205: preceding step:
206: .lp +10 5
207: (i) before \fIg\fR, if there are any earlier vowels
208: .ft I
209: (postage\*v, stAge\*v, college\*v)
210: .ft R
211: .lp +10 5
212: (ii) \fIe\fR is not long before \fIl\fR
213: .ft I
214: (travele\*vd)
215: .ft R
216: .lp +5 5
217: (10) If the first suffix begins with one of [aeio],
218: and the word exclusive of suffixes ends in
219: [aeiouyAEIOUY]th, then
220: the digram \fIth\fR is capitalized.
221: .ft I
222: (breaTH\*ving, blITHe\*vly)
223: .ft R
224: .lp +5 5
225: (11) An attempt is made to
226: recognize silent \fIe\fR in the middle of compound words.
227: Such an \fIe\fR is marked by a following \*v, and preceding vowels, other than
228: \fIe\fR, are assumed long as in step (8).
229: Silent \fIe\fR is marked in the context
230: [bdgmnprst][bdgpt]le[^aeioruy\*v]S,
231: where S is any
232: string that contains [aeiouy] but does not contain \*v or the end of the word.
233: Silent \fIe\fR is also marked in the context
234: [^aeiu][aiou][^aeiouwxy]e[^aeinoruy]S.
235: .ft I
236: (simple\*vton, fAce\*vguard, cAve\*vman, cavernous)
237: .ft R
238: .s0
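Since the long-vowel contexts above are written as regular expressions, rule (1) for \fIu\fR can be tried out directly. The following is a sketch only: it uses Python's `re` module as a stand-in for the ed(I) notation, the name `mark_long_u` is hypothetical, and it covers rule (1) alone, with no suffix handling or interaction with the other rules.

```python
import re

# Context from rule (1): [^aeiou]u[^aeiouwxy][aieouy]
# Lookbehind and lookahead are used so that only the u itself is rewritten.
LONG_U = re.compile(r"(?<=[^aeiou])u(?=[^aeiouwxy][aieouy])")


def mark_long_u(word):
    """Capitalize each u that appears in the rule (1) context."""
    return LONG_U.sub("U", word)


example = mark_long_u("pustulous")
```

Applied to the manual's own example, only the second \fIu\fR of `pustulous` is capitalized: the first is followed by two consonants and the third is preceded by a vowel.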
239: .sh FILES
240: /usr/lib/speak.m
241: .sh "SEE ALSO"
242: vs(VII), vs(IV)
243: .sh DIAGNOSTICS
244: `?' for unknown command with
245: .bd !,
246: or for
247: unreadable or unwritable vocabulary file
248: .sh BUGS
249: Vocabulary overflow is unchecked.
250: Excessively long words cause dumps.
251: Space is not reclaimed from deleted entries.