1: .\"
2: .\" Copyright (c) 1982 Regents of the University of California
3: .\" @(#)asdocs1.me 1.7 2/9/83
4: .\"
5: .EQ
6: delim $$
7: .EN
8: .(l C
9: .i "\*(VS \*(AM"
10: .sp 2.0v
11: John F. Reiser
12: Bell Laboratories,
13: Holmdel, NJ
14: .sp 1.0v
15: .i and
16: .sp 1.0v
17: Robert R. Henry\**
18: .(f
19: \**Preparation of this paper supported in part
20: by the National Science Foundation under grant MCS #78-07291.
21: .)f
22: Electronics Research Laboratory
23: University of California
24: Berkeley, CA 94720
25: .sp 1.0v
26: November 5, 1979
27: .sp 1.0v
28: .i Revised
29: \*(TD
30: .)l
31: .SH 1 Introduction
32: .pp
33: This document describes the usage and input syntax
34: of the \*(UX \*(VX-11 assembler
35: .i as .
36: .i As
37: is designed for assembling the code produced by the
38: \*(CL compiler;
39: certain concessions have been made to handle code written
40: directly by people,
41: but in general little sympathy has been extended.
42: This document is intended only for the writer of a compiler or a maintainer
43: of the assembler.
44: .SH 2 "Assembler Revisions since November 5, 1979"
45: .pp
46: There has been one major change to
47: .i as
48: since the last release.
49: .i As
50: has been updated to assemble the new instructions and
51: data formats for
52: .q G
53: and
54: .q H
55: floating point numbers,
56: as well as the new queue instructions.
57: .SH 2 "Features Supported, but No Longer Encouraged as of \*(TD"
58: .pp
59: The following feature of
60: .i as
61: is supported, but no longer encouraged.
62: .ip -
63: The colon operator for field initialization is likely to disappear.
64: .SH 1 "Usage"
65: .pp
66: .i As
67: is invoked with these command arguments:
68: .br
69: .sp 0.25v
70: as
71: [
72: .b \-LVWJR
73: ]
74: [
75: .b \-d $n$
76: ]
77: [
78: .b \-DTS
79: ]
80: [
81: .b \-t
82: .i directory
83: ]
84: [
85: .b \-o
86: .i output
87: ]
88: [ $name sub 1$ ] $...$
89: [ $name sub n$ ]
90: .br
91: .sp 0.25v
92: .pp
93: The
94: .b \-L
95: flag instructs the assembler to save labels beginning with a
96: .q L
97: in the symbol table portion of the
98: .i output
99: file.
100: Labels are not saved by default,
101: as the default action of the link editor
102: .i ld
103: is to discard them anyway.
104: .pp
105: The
106: .b \-V
107: flag tells the assembler to place its interpass temporary
108: file into virtual memory.
109: In normal circumstances,
110: the system manager will decide where the temporary file should lie.
111: Our experiments
112: with very large temporary files show that placing the temporary
113: file into virtual memory will save about 13% of the assembly time,
114: where the size of the temporary file is about 350K bytes.
115: Most assembler sources will not be this long.
116: .pp
117: The
118: .b \-W
119: flag turns off all warning error reporting.
120: .pp
121: The
122: .b \-J
123: flag forces \*(UX style pseudo\-branch
124: instructions with destinations further away than a
125: byte displacement to be
126: turned into jump instructions with 4 byte offsets.
127: The
128: .b \-J
129: flag buys you nothing if
130: .b \-d2
131: is set.
132: (See \(sc8.4 and the future work described in \(sc11.)
133: .pp
134: The
135: .b \-R
136: flag effectively turns
137: .q "\fB.data\fP $n$"
138: directives into
139: .q "\fB.text\fP $n$"
140: directives.
141: This obviates the need to run editor scripts on assembler source to
142: .q "read\-only"
143: fix initialized data segments.
144: Uninitialized data (via
145: .b .lcomm
146: and
147: .b .comm
148: directives)
149: is still assembled into the data or bss segments.
150: .pp
151: The
152: .b \-d
153: flag specifies the number of bytes
154: which the assembler should allow for a displacement when the value of the
155: displacement expression is undefined in the first pass.
156: The possible values of
157: .i n
158: are 1, 2, or 4;
159: the assembler uses 4 bytes
160: if
161: .b \-d
162: is not specified.
163: See \(sc8.2.
164: .pp
165: Provided the
166: .b \-V
167: flag is not set,
168: the
169: .b \-t
170: flag causes the assembler to place its single temporary file
171: in the
172: .i directory
173: instead of in
174: .i /tmp .
175: .pp
176: The
177: .b \-o
178: flag causes the output to be placed on the file
179: .i output .
180: By default,
181: the output of the assembler is placed in the file
182: .i a.out
183: in the current directory.
184: .pp
185: The input to the assembler is normally taken from the standard input.
186: If file arguments occur,
187: then the input is taken sequentially from the files
188: $name sub 1$,
189: $name sub 2~...~name sub n$.
190: This is not to say that the files are assembled separately;
191: $name sub 1$ is effectively concatenated to $name sub 2$,
192: so multiple definitions cannot occur amongst the input sources.
193: .pp
195: The
196: .b \-D
197: (debug),
198: .b \-T
199: (token trace),
200: and the
201: .b \-S
202: (symbol table)
203: flags enable assembler trace information,
204: provided that the assembler has been compiled with
205: the debugging code enabled.
206: The information printed is long and boring,
207: but useful when debugging the assembler.
208: .SH 1 "Lexical conventions"
209: .pp
210: Assembler tokens include identifiers (alternatively,
211: .q symbols
212: or
213: .q names ),
214: constants,
215: and operators.
216: .SH 2 "Identifiers"
217: .pp
218: An identifier consists of a sequence of alphanumeric characters
219: (including
220: period
221: .q "\fB\|.\|\fP" ,
222: underscore
223: .q "\*(US" ,
224: and
225: dollar
226: .q "\*(DL" ).
227: The first character may not be numeric.
228: Identifiers may be (practically) arbitrarily long;
229: all characters are significant.
230: .SH 2 "Constants"
231: .SH 3 "Scalar constants"
232: .pp
233: All scalar (non floating point)
234: constants are (potentially) 128 bits wide.
235: Such constants are interpreted as two's complement numbers.
236: Note that 64 bit (quad words) and 128 bit (octal word) integers
237: are only partially supported by the \*(VX hardware.
238: In addition,
239: 128 bit integers are only supported by the extended \*(VX architecture.
240: .i As
241: supports 64 and 128 bit integers
242: only so they can be used as immediate constants
243: or to fill initialized data space.
244: .i As
245: cannot perform arithmetic on constants larger than 32 bits.
246: .pp
247: Scalar constants are initially evaluated to a full 128 bits,
248: but are pared down by discarding high order copies of the sign bit
249: and categorizing the number as a long, quad or octal integer.
250: Numbers with less precision than 32 bits are treated as 32 bit quantities.
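.pp
As a rough illustration of the paring described above,
the following Python sketch classifies a signed value by the
narrowest two's complement width that holds it.
The function name classify_scalar is hypothetical and is not part of
.i as ;
the assembler's own treatment of boundary cases may differ.

```python
def classify_scalar(value):
    """Classify a signed integer as 'long', 'quad' or 'octal'.

    Hypothetical helper: models discarding redundant high order
    copies of the sign bit from a 128 bit two's complement value.
    """
    if not -(1 << 127) <= value < (1 << 127):
        raise ValueError("constant does not fit in 128 bits")
    # Narrowest two's complement width: the significant bits plus one
    # sign bit. For a negative value, ~value has the same number of
    # significant bits as the value itself.
    magnitude = value if value >= 0 else ~value
    width = magnitude.bit_length() + 1
    if width <= 32:
        return "long"   # anything narrower is still a 32 bit quantity
    if width <= 64:
        return "quad"
    return "octal"
```

Under these assumptions, \-1 and 2147483647 are classified as long,
while 2147483648 needs 33 bits and is classified as a quad.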
251: .pp
252: The digits are
253: .q 0123456789abcdefABCDEF
254: with the obvious values.
255: .pp
256: An octal constant consists of a sequence of digits with a leading zero.
257: .pp
258: A decimal constant consists of a sequence of digits without a leading zero.
259: .pp
260: A hexadecimal constant consists of the characters
261: .q 0x
262: (or
263: .q 0X )
264: followed by a sequence of digits.
265: .pp
266: A single-character constant consists of a single quote
267: .q "\|\(fm\|"
268: followed by an \*(AC character,
269: including \*(AC newline.
270: The constant's value is the code for the
271: given character.
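.pp
The lexical rules above can be sketched in Python as follows.
This is only an illustration under stated assumptions:
the function name parse_scalar is hypothetical,
octal constants are assumed to be restricted to the digits 0 through 7,
and backslash escapes in character constants are not handled.

```python
OCTAL = "01234567"
DECIMAL = "0123456789"
HEX = "0123456789abcdefABCDEF"

def parse_scalar(token):
    """Return the integer value of a scalar constant token (hypothetical helper)."""
    # Single-character constant: a single quote followed by one character.
    if len(token) == 2 and token[0] == "'":
        return ord(token[1])
    if token[:2] in ("0x", "0X"):
        digits, base, allowed = token[2:], 16, HEX       # hexadecimal
    elif token.startswith("0"):
        digits, base, allowed = token, 8, OCTAL          # leading zero: octal
    else:
        digits, base, allowed = token, 10, DECIMAL       # otherwise decimal
    if not digits or any(c not in allowed for c in digits):
        raise ValueError("malformed scalar constant: %r" % token)
    return int(digits, base)
```

For example, parse_scalar("017") yields 15 and parse_scalar("0x1F") yields 31.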
272: .SH 3 "Floating Point Constants"
273: .pp
274: Floating point constants are internally represented
275: in the \*(VX floating point format
276: that is specified by the lexical form of the constant.
277: Using the meta notation that
278: [dec] is a decimal digit (\c
279: .q "0123456789" ),
280: [expt] is a type specification character (\c,
281: .q "fFdDhHgG" ),
282: [expe] is an exponent delimiter and type specification character (\c,
283: .q "eEfFdDhHgG" ),
284: $x sup roman "*"$ means 0 or more occurrences of $x$,
285: $x sup +$ means 1 or more occurrences of $x$,
286: then the general lexical form of a floating point number is:
287: .ce 1
288: 0[expe]([+-])$roman "[dec]" sup +$(.)($roman "[dec]" sup roman "*"$)([expt]([+-])($roman "[dec]" sup +$))
289: .ce 0
290: The standard semantic interpretation is used for the
291: signed integer, fraction and signed power of 10 exponent.
292: If the exponent delimiter is specified,
293: it must be either an
294: .q e
295: or
296: .q E ,
297: or must agree with the initial type specification character that is used.
298: The type specification character specifies
299: the type and representation of the constructed number, as follows:
300: .(b
301: .TS
302: center;
303: c l c
304: c l n.
305: type character	floating representation	size (bits)
306: _
307: f, F	F format floating	32
308: d, D	D format floating	64
309: g, G	G format floating	64
310: h, H	H format floating	128
311: .TE
312: .)b
313: Note that
314: .q G
315: and
316: .q H
317: format floating point numbers are not supported
318: by all implementations of the \*(VX architecture.
319: .i As
320: does not require the augmented architecture in order to run.
321: .pp
322: The assembler uses the library routine
323: .i atof()
324: to convert
325: .q F
326: and
327: .q D
328: numbers,
329: and uses its own conversion routine
330: (derived from
331: .i atof ,
332: and believed to be numerically accurate)
333: to convert
334: .q G
335: and
336: .q H
337: floating point numbers.
338: .pp
339: Collectively,
340: all floating point numbers,
341: together with quad and octal scalars are called
342: .i Bignums .
343: When
344: .i as
345: requires a Bignum,
346: a 32 bit scalar quantity may also be used.
347: .SH 3 "String Constants"
348: .pp
349: A string constant is defined using
350: the same syntax and semantics as the \*(CL language uses.
351: Strings begin and end with a
352: .q "''"
353: (double quote).
354: The \*(DM assembler conventions for flexible string quoting are
355: not implemented.
356: All \*(CL backslash conventions are observed;
357: the backslash conventions
358: peculiar to the \*(PD assembler are not observed.
359: Strings are known by their value and their length;
360: the assembler does not implicitly end strings with a null byte.
361: .SH 2 "Operators"
362: .pp
363: There are several single-character
364: operators;
365: see \(sc6.1.
366: .SH 2 "Blanks"
367: .pp
368: Blank and tab characters
369: may be interspersed freely between tokens,
370: but may not be used within tokens (except character constants).
371: A blank or tab is required to separate adjacent
372: identifiers or constants not otherwise separated.
373: .SH 2 "Scratch Mark Comments"
374: .pp
375: The character
376: .q "#"
377: introduces a comment,
378: which extends through the end of the line on which it appears.
379: Comments starting in column 1,
380: having the format
381: .q "# $expression~~string$" ,
382: are interpreted as an indication that the assembler is now assembling
383: file
384: .i string
385: at line
386: .i expression .
387: Thus, one can use the \*(CL preprocessor on an assembly language source file,
388: and use the
389: .i #include
390: and
391: .i #define
392: preprocessor directives.
393: (Note that there may not be an assembler comment starting in column
394: 1 if the assembler source is given to the \*(CL preprocessor,
395: as it will be interpreted by the preprocessor in a way not intended.)
396: Comments are otherwise ignored by the assembler.
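.pp
To make the column 1 convention concrete,
here is a minimal Python sketch of recognizing such a line.
It assumes that the expression is a plain decimal integer and that the
string may be enclosed in double quotes, as the \*(CL preprocessor emits it;
the name parse_line_marker is hypothetical and the assembler's own
scanner may accept more general forms.

```python
import re

# '#' in column 1, a decimal line number, then a file name that may be
# wrapped in double quotes (both are assumptions of this sketch).
_LINE_MARKER = re.compile(r'^#\s*(\d+)\s+"?([^"\s]+)"?\s*$')

def parse_line_marker(line):
    """Return (line_number, file_name) for a marker comment, else None."""
    match = _LINE_MARKER.match(line)
    if match is None:
        return None          # an ordinary comment or source line
    return int(match.group(1)), match.group(2)
```

With this sketch, a line such as # 12 "prog.s" yields (12, "prog.s"),
while an ordinary comment yields None.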
397: .SH 2 "\*(CL Style Comments"
398: .pp
399: The assembler will recognize \*(CL style comments,
400: introduced with the prologue
401: .b "/*"
402: and ending with the epilogue
403: .b "*/" .
404: \*(CL style comments may extend across multiple lines,
405: and are the preferred comment style
406: to use if one chooses to use the \*(CL preprocessor.
407: .SH 1 "Segments and Location Counters"
408: .pp
409: Assembled code and data fall into three segments: the text segment,
410: the data segment,
411: and the bss segment.
412: The \*(UX operating system makes
413: some assumptions about the content of these segments;
414: the assembler does not.
415: Within the text and data segments there are a number of sub-segments,
416: distinguished by number (\c
417: .q "\fBtext\fP 0" ,
418: .q "\fBtext\fP 1" ,
419: $...$
420: .q "\fBdata\fP 0" ,
421: .q "\fBdata\fP 1" ,
422: $...$).
423: Currently there are four subsegments each in text and data.
424: The subsegments are for programming convenience only.
425: .pp
426: Before writing the output file,
427: the assembler zero-pads each text subsegment to a multiple of four
428: bytes and then concatenates the subsegments in order to form the text segment;
429: an analogous operation is done for the data segment.
430: Requesting that the loader define symbols and storage regions is the only
431: action allowed by the assembler with respect to the bss segment.
432: Assembly begins in
433: .q "\fBtext\fP 0" .
434: .pp
435: Associated with each (sub)segment is an implicit location counter which
436: begins at zero and is incremented by 1 for each byte assembled into the
437: (sub)segment.
438: There is no way to explicitly reference a location counter.
439: Note that the location counters of subsegments other than
440: .q "\fBtext\fP 0"
441: and
442: .q "\fBdata\fP 0"
443: behave peculiarly due to the concatenation used to form
444: the text and data segments.
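.pp
The padding and concatenation described above,
and its effect on where each subsegment finally lands,
can be sketched in Python.
This is an illustration only:
the function name concatenate_subsegments is hypothetical,
and subsegments are modelled simply as byte strings.

```python
def concatenate_subsegments(subsegments):
    """Zero-pad each subsegment to a multiple of four bytes and join them.

    Returns (segment, bases), where bases[i] is the offset within the
    finished segment at which subsegment i begins.
    """
    segment = bytearray()
    bases = []
    for body in subsegments:
        bases.append(len(segment))           # where this subsegment starts
        segment += body
        segment += b"\0" * (-len(body) % 4)  # pad to a multiple of four
    return bytes(segment), bases
```

For subsegments of 3, 2, 0 and 5 bytes the bases are 0, 4, 8 and 8,
so a location counter value in any subsegment other than the first
is offset by the padded sizes of the subsegments before it.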