Annotation of researchv10no/cmd/sml/src/mips/opcodes.nw, revision 1.1

1.1     ! root        1: \chapter{Handling the MIPS opcodes}
        !             2: \section{Introduction}
        !             3: 
        !             4: This file generates the code necessary to handle MIPS instructions
        !             5: in a natural, mnemonic way from within ML.
        !             6: All MIPS instructions occupy 32 bits, and since ML has no simple
        !             7: 32~bit data type, we use pairs of integers to represent MIPS instructions.
        !             8: A pair [[(hi,lo)]] of 16-bit integers holds the most and least significant
        !             9: halfwords of the MIPS word.
        !            10: ML integers are 31 bits, so this is more than adequate.
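To make the pair representation concrete, here is a minimal sketch in
Python rather than ML. The helper names [[split_word]] and [[join_word]]
are hypothetical and appear nowhere in the generated code; the sketch only
illustrates how a 32-bit word maps onto a [[(hi,lo)]] pair.

```python
def split_word(word):
    """Split a 32-bit word into its (hi, lo) 16-bit halfwords."""
    if not 0 <= word < (1 << 32):
        raise ValueError("expected an unsigned 32-bit value")
    return (word >> 16) & 0xFFFF, word & 0xFFFF


def join_word(hi, lo):
    """Rebuild the 32-bit word from a (hi, lo) pair of halfwords."""
    return (hi << 16) | lo
```

Each half lies between 0 and 65535, so both halves fit comfortably in a
31-bit ML integer.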
        !            11: 
        !            12: The biggest hassle in converting between these integer pairs and more
        !            13: mnemonic representations is that it is too easy to make mistakes
        !            14: (especially typographical errors) in writing the code.
        !            15: For that reason, I have added an extra level of indirection to the
        !            16: whole business by putting all of the instruction descriptions in
        !            17: tables.
        !            18: These tables are read by an awk script, which writes two ML files:
        !            19: {\tt opcodes.sml} and {\tt mipsdecode.sml}.
        !            20: The {\tt opcodes.sml} file contains the code needed to convert from
        !            21: a mnemonic like [[add(3,4,9)]] (add the contents of register~4 to
        !            22: the contents of register~9, placing the result in register~3) to
        !            23: the integer pair representation of the actual bits in that add instruction
        !            24: (in this case [[(137,6176)]]).
        !            25: The {\tt mipsdecode.sml} file contains a [[decode]] function that converts
        !            26: from the integer pair representation of instructions to a string
        !            27: representation.
        !            28: The string representation is a little hokey at the moment (that is,
        !            29: it's different from the one used in the MIPS book), but it represents
        !            30: a nice compromise between being readable and easy to generate.
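The pair quoted above for [[add(3,4,9)]] can be checked by hand. The
following Python sketch assumes the operand order [[rd rs rt]] listed for
[[add]] in the instruction table, the field positions from the fields
table, and the [[funct]] value octal~40 that [[add]] has in the
[[special]] table. The function name [[encode_add]] is hypothetical.

```python
def encode_add(rd, rs, rt):
    """Assemble a MIPS add instruction and return it as a (hi, lo) pair."""
    special = 0        # op' value of the special opcode (row 0, column 0)
    add_funct = 0o40   # add sits in row 4, column 0 of the special table
    word = (special << 26) | (rs << 21) | (rt << 16) | (rd << 11) | add_funct
    return word >> 16, word & 0xFFFF
```

With [[rd]]=3, [[rs]]=4 and [[rt]]=9 the high halfword is $4\times32+9=137$
and the low halfword is $3\times2048+32=6176$.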
        !            31: 
        !            32: I have contemplated generating a third file to test the whole
        !            33: business.
        !            34: The idea would be to have a function that would write out (to files)
        !            35: two
        !            36: parallel representations of the same instruction stream (presumably
        !            37: one copy of each known instruction).
        !            38: One representation would be the binary one understood by the MIPS.
        !            39: The other representation would be a string representation.
        !            40: We could then use a tool like {\tt gdb} or {\tt adb} to print out
        !            41: the binary as an instruction sequence (i.e. convert back to
        !            42: a second string representation) and compare the string representations
        !            43: to see if they make sense.
        !            44: 
        !            45: \paragraph{Possible bugs}
        !            46: This code should be gone over with care to make sure that negative
        !            47: operands (e.g. in [[offset]]) won't break the code.
        !            48: 
        !            49: 
        !            50: @
        !            51: We need a special line in the Makefile to handle this file, since
        !            52: it writes both an awk program and that program's input.  The input
        !            53: is in module {\tt @<<opcodes table@>>} so the line is
        !            54: $$\hbox{[[     $(NOTANGLE) '-Ropcodes table' opcodes.nw > opcodes]]}$$
        !            55: The input is nothing but a sequence of tables, each labelled, and
        !            56: processed one after another according to the label.
        !            57: The label is always a single word on a line by itself.
        !            58: Tables end with blank lines.
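The input format just described (a one-word label, some rows, then a blank
line) is simple enough to sketch in a few lines of Python. This is not the
awk program itself, and the name [[read_tables]] is hypothetical. One
simplification is labelled in the code: any one-word line seen outside a
table is treated as a label, whereas the awk patterns match specific
label names.

```python
def read_tables(lines):
    """Group input lines into {label: [row, ...]}, each row a list of words."""
    tables = {}
    current = None                  # label of the table being read, if any
    for line in lines:
        words = line.split()
        if not words:               # a blank line ends the current table
            current = None
        elif current is None and len(words) == 1:
            current = words[0]      # assumption: any lone word starts a table
            tables[current] = []
        elif current is not None:
            tables[current].append(words)
    return tables
```

A one-word row inside a table (such as [[break]] in the instruction list)
is kept as a row, because a label is only recognised between tables.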
        !            59: @ The opcode-to-pair code is written to the standard output, in 
        !            60: [[structure Opcodes]].
        !            61: The pair-to-string code is written to [["mipsdecode.sml"]], in
        !            62: [[structure MipsDecode]].
        !            63: 
        !            64: We begin by defining bitwise-and and shift functions.
        !            65: We make pessimistic assumptions about shifting, trying always to
        !            66: keep the arguments between 0 and 31 inclusive.
        !            67: <<BEGIN>>=
        !            68: print "structure Opcodes = struct"
        !            69: print "val andb = Bits.andb"
        !            70: print "fun lshift(op1,amt) = "
        !            71: print "    if amt<0 then Bits.rshift(op1,0-amt)"
        !            72: print "    else Bits.lshift(op1,amt)"
        !            73: print "nonfix sub"     # bug fixes; want [[sub]] to be a MIPS opcode
        !            74: print "nonfix div"     # bug fixes; want [[div]] to be a MIPS opcode
        !            75: 
        !            76: decode = "mipsdecode.sml";
        !            77: print "structure MipsDecode = struct" > decode
        !            78: print "val andb = Bits.andb" > decode
        !            79: print "fun rshift(op1,amt) = " > decode
        !            80: print "    if amt<0 then Bits.lshift(op1,0-amt)" > decode
        !            81: print "    else Bits.rshift(op1,amt)" > decode
        !            82: <<END>>=
        !            83: <<write out the definitions of the decoding functions>>
        !            84: print "end (* Opcodes *)"
        !            85: print "end (* MipsDecode *)" > decode
        !            86: @ The sections BEGIN and END are drawn from 
        !            87:  our universal model of an awk program:
        !            88: <<*>>=
        !            89: BEGIN {
        !            90:   <<BEGIN>>
        !            91: }
        !            92: <<functions>>
        !            93: <<statements>>
        !            94: END {
        !            95:   <<END>>
        !            96: }
        !            97: @ \section{The opcode tables}
        !            98: The numeric codes for all the MIPS opcodes are described in three
        !            99: tables in the MIPS book on page~A-87.
        !           100: Normal opcodes are six bits, and appear in the [[opcode]] field of the
        !           101: instruction.
        !           102: Two opcodes [[special]] and [[bcond]] stand for several instructions.
        !           103: These instructions are decoded by checking the bit-pattern in the
        !           104: [[funct]] and [[cond]] fields of the instructions, respectively.
        !           105: 
        !           106: The tables show which opcodes correspond to which bit-patterns.
        !           107: For example, [[slti]] corresponds to an [[opcode]] value of octal~12.
        !           108: The table headed [[opcode]] gives the mnemonics for all six-bit patterns
        !           109: in the [[opcode]] field.
        !           110: The [[special]] table shows patterns for the [[funct]] field, used with
        !           111: the [[special]] opcode.
        !           112: The [[bcond]] table shows five-bit patterns for the [[cond]] field,
        !           113: used with the [[bcond]] opcode.
        !           114: In all tables, stars ([[*]]) stand for unused fields.
        !           115: 
        !           116: Each table is terminated with a blank line.
        !           117: <<opcodes table>>=
        !           118:                            opcode
        !           119: special        bcond   j       jal     beq     bne     blez    bgtz
        !           120: addi   addiu   slti    sltiu   andi    ori     xori    lui
        !           121: cop0   cop1    cop2    cop3    *       *       *       *
        !           122: *      *       *       *       *       *       *       *
        !           123: lb     lh      lwl     lw      lbu     lhu     lwr     *
        !           124: sb     sh      swl     sw      *       *       swr     *
        !           125: lwc0   lwc1    lwc2    lwc3    *       *       *       *
        !           126: swc0   swc1    swc2    swc3    *       *       *       *
        !           127: 
        !           128:                            special
        !           129: sll    *       srl     sra     sllv    *       srlv    srav
        !           130: jr     jalr    *       *       syscall break   *       *
        !           131: mfhi   mthi    mflo    mtlo    *       *       *       *
        !           132: mult   multu   div     divu    *       *       *       *
        !           133: add    addu    sub     subu    and'    or      xor     nor
        !           134: *      *       slt     sltu    *       *       *       *
        !           135: *      *       *       *       *       *       *       *
        !           136: *      *       *       *       *       *       *       *
        !           137: 
        !           138:                            bcond
        !           139: bltz   bgez    *       *       *       *       *       *
        !           140: *      *       *       *       *       *       *       *
        !           141: bltzal bgezal  *       *       *       *       *       *
        !           142: *      *       *       *       *       *       *       *
        !           143: 
        !           144: 
        !           145: @ The instruction codes for Coprocessor 1 (floating point)
        !           146: are taken from page B-28 of the MIPS book.
        !           147: <<opcodes table>>=
        !           148:                            cop1
        !           149: add_fmt        sub_fmt mul_fmt div_fmt *       abs_fmt mov_fmt neg_fmt
        !           150: *      *       *       *       *       *       *       *
        !           151: *      *       *       *       *       *       *       *
        !           152: *      *       *       *       *       *       *       *
        !           153: cvt_s  cvt_d   *       *       cvt_w   *       *       *
        !           154: *      *       *       *       *       *       *       *
        !           155: c_f    c_un    c_eq    c_ueq   c_olt   c_ult   c_ole   c_ule
        !           156: c_sf   c_ngle  c_seq   c_ngl   c_lt    c_nge   c_le    c_ngt
        !           157: 
        !           158: @
        !           159: Now we have to deal with reading these tables, and extracting the
        !           160: information stored therein.
        !           161: First of all, for each mnemonic [[$i]] we store the corresponding bit
        !           162: pattern (as an integer, [[code]]) in the array [[numberof[$i] ]].
        !           163: Then, we store the type of the mnemonic (ordinary [[OPCODE]], 
        !           164: [[SPECIAL]], [[BCOND]], or [[COP1]]) in the array [[typeof[$i] ]].
        !           165: Finally, we store the inverse (a map from type and bit pattern to mnemonic)
        !           166: in the [[opcode]] array.
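The bookkeeping can be sketched in Python as follows. The dictionaries
[[numberof]], [[typeof]] and [[opcode]] stand in for the awk arrays of the
same names; the function [[store_table]] is hypothetical, and the code is
computed from the row and column of each entry as eight times the row plus
the column. Only a few table rows are reproduced here.

```python
OPCODE, SPECIAL = 1, 2          # type constants, as set in the awk BEGIN section

numberof, typeof, opcode = {}, {}, {}


def store_table(table_type, rows):
    """Record every mnemonic of one table; a star marks a reserved slot."""
    for major, row in enumerate(rows):              # row gives the major octal digit
        for minor, name in enumerate(row.split()):  # column gives the minor digit
            code = minor + 8 * major
            if name != "*":
                numberof[name] = code
                typeof[name] = table_type
                opcode[(table_type, code)] = name
            else:
                opcode[(table_type, code)] = "reserved"


store_table(OPCODE, ["special bcond j jal beq bne blez bgtz",
                     "addi addiu slti sltiu andi ori xori lui"])
store_table(SPECIAL, ["sll * srl sra sllv * srlv srav"])
```

After these calls [[numberof["slti"]]] is octal~12, in agreement with the
example given for the [[opcode]] table.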
        !           167: <<store opcode information>>=
        !           168: if ($i != "*") {
        !           169:        numberof[$i] = code
        !           170:        typeof[$i] = type
        !           171:        opcode[type,code] = $i
        !           172: } else {
        !           173:        opcode[type,code] = "reserved"
        !           174: }
        !           175: @ The types are just constants set at the beginning.
        !           176: <<BEGIN>>=
        !           177: OPCODE = 1 ; SPECIAL = 2 ; BCOND = 3 ; COP1 = 4
        !           178: @ We determine the type by scanning the header word that precedes
        !           179: each table.
        !           180: Once we see the appropriate table header, we set one of [[opcodes]],
        !           181: [[specials]], [[bconds]], and [[cop1s]], so that determining the type is easy:
        !           182: <<set [[type]]>>=
        !           183: type = OPCODE * opcodes + SPECIAL * specials + BCOND * bconds + COP1 * cop1s
        !           184: @ Seeing the right table header causes us to set the right variable.
        !           185: We also remember the line number, because we use the positions of later
        !           186: lines to help extract the bit patterns from the table.
        !           187: <<statements>>=
        !           188: NF == 1 && $1 == "opcode" {
        !           189:        startline = NR
        !           190:        opcodes = 1
        !           191:        next
        !           192: }
        !           193: NF == 1 && $1 == "special" {
        !           194:        startline = NR
        !           195:        specials = 1
        !           196:        next
        !           197: }
        !           198: NF == 1 && $1 == "bcond" {
        !           199:        startline = NR
        !           200:        bconds = 1
        !           201:        next
        !           202: }
        !           203: NF == 1 && $1 == "cop1" {
        !           204:        startline = NR
        !           205:        cop1s = 1
        !           206:        next
        !           207: }
        !           208: @ Any time we see a blank line, that ends the appropriate table.
        !           209: <<statements>>=
        !           210: NF == 0 {opcodes = 0; specials = 0; bconds = 0; cop1s = 0
        !           211:        <<blank line resets>>
        !           212: }
        !           213: @ Here is the code that actually extracts the bit patterns from
        !           214: the opcode tables.
        !           215: The code is the same for each of the four tables.
        !           216: 
        !           217: The call [[insist_fields(8)]] issues an error message and returns false (0)
        !           218: unless there are exactly 8 fields on the input line.
        !           219: <<statements>>=
        !           220: opcodes || specials || bconds || cop1s {
        !           221:        if (!insist_fields(8)) next
        !           222:        <<set [[type]]>>
        !           223:        major = NR - startline - 1              # major octal digit from row
        !           224:        for (i=1; i<= NF; i++) {
        !           225:                minor = i-1                     # minor octal digit from column
        !           226:                code = minor + 8 * major
        !           227:                <<store opcode information>>
        !           228:        }
        !           229: }
        !           230: @ \section{The instruction fields}
        !           231: Now that we've dealt with the opcodes, we'll handle other fields of
        !           232: the instruction.
        !           233: This table tells us the position of each field within the word,
        !           234: so that if we know a bit-pattern for each field, we can assemble
        !           235: all the fields into an instruction.
        !           236: 
        !           237: Not all fields are used in all instructions.
        !           238: Later we'll have a table that indicates exactly which fields are used in
        !           239: which instructions.
        !           240: For now, we just list the fields and their positions with the
        !           241: understanding that some fields will overlap.
        !           242: 
        !           243: The table is taken from the MIPS book, page A-3.
        !           244: The numbers are the numbers of the starting and ending bit positions,
        !           245: where 0 is the least and 31 the most significant bit.
        !           246: The names are exactly those used in the book except [[op']] has been
        !           247: substituted for [[op]] since [[op]] is a reserved word in ML.
        !           248: 
        !           249: If a field is signed, we put a [[+]]~sign as the first character
        !           250: of its name.
        !           251: The sign information is used only in decoding (I think).
        !           252: <<opcodes table>>=
        !           253:                        fields
        !           254: op' 26 31
        !           255: rs 21 25
        !           256: rt 16 20
        !           257: +immed 0 15
        !           258: +offset 0 15
        !           259: base 21 25
        !           260: target 0 25
        !           261: rd 11 15
        !           262: shamt 6 10
        !           263: funct 0 5
        !           264: cond 16 20
        !           265: <<floating point load/store fields>>
        !           266: <<floating point computation fields>>
        !           267: 
        !           268: @ From page B-5.  Most fields are the same as the CPU instruction formats.
        !           269: <<floating point load/store fields>>=
        !           270: ft 16 20
        !           271: @ From page B-6.  Many fields are reused from earlier specifications.
        !           272: The computational instructions all have a one bit in position 25.
        !           273: Instead of trying to insert special code to handle that, we cheat:
        !           274: that bit is the top bit of [[fmt]], so each format code has 16 added to it.
        !           275: Thus:
        !           276: <<floating point computation fields>>=
        !           277: fmt 21 25
        !           278: fs 11 15
        !           279: fd 6 10
        !           280: <<write format info>>=
        !           281: print "val S_fmt = 16+0"
        !           282: print "val D_fmt = 16+1"
        !           283: print "val W_fmt = 16+4"
        !           284: 
        !           285: @ The setup for the fields is similar to that used for the opcodes.
        !           286: <<statements>>=
        !           287: NF == 1 && $1 == "fields" {
        !           288:        startline = NR
        !           289:        fields = 1
        !           290:        <<write format info>>
        !           291:        next
        !           292: }
        !           293: <<blank line resets>>=
        !           294: fields = 0
        !           295: <<statements>>=
        !           296: fields {
        !           297:        if (!insist_fields(3)) next
        !           298:        fieldname = $1;  low = $2; high = $3
        !           299:        <<look for sign in [[fieldname]] and set [[signed]]>>
        !           300:        fieldnames[fieldname]= 1        # remember all the field names
        !           301: 
        !           302:        <<write to standard output a function to convert bit-pattern to pair>>
        !           303:        <<write to [[decode]] a function to extract field from pair>>
        !           304: 
        !           305: }
        !           306: <<look for sign in [[fieldname]] and set [[signed]]>>=
        !           307: if (substr(fieldname,1,1)=="+") {
        !           308:        signed = 1
        !           309:        fieldname = substr(fieldname,2)
        !           310: } else {
        !           311:        signed = 0
        !           312: }
        !           313: @
        !           314: The idea is that for each of these fields, we want to write a function
        !           315: that will take an integer argument and shift it by the right amount.
        !           316: Since we have to represent the 32-bit quantities as pairs of integers,
        !           317: we actually use two functions, one for the high half and one for the low.
        !           318: So, for example, for the [[rd]] field we will produce two function definitions,
        !           319: [[rdHI]] and [[rdLO]].
        !           320: 
        !           321: The awk function [[function_definition]] is used to compute ML function
        !           322: definitions.
        !           323: It takes as arguments the name of the function and the number of arguments
        !           324: to that function.
        !           325: The arguments are numbered [[A1]], [[A2]], et cetera.
        !           326: 
        !           327: The functions themselves are all tedious combinations of ands and shifts.
        !           328: At one time I had convinced myself that this worked.
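To see what the generated [[HI]] and [[LO]] functions are meant to compute,
here is a Python sketch of the same arithmetic. The names [[field_hi]] and
[[field_lo]] are hypothetical; [[lshift]] mimics the generated ML helper,
which shifts right when the amount is negative. Like the generated code,
the sketch does not mask the high halfword, so it assumes the argument
fits in its field.

```python
def lshift(value, amount):
    """Shift left, or right when the amount is negative."""
    return value >> -amount if amount < 0 else value << amount


def field_lo(value, low, high):
    """Contribution of a field occupying bits low..high to the low halfword."""
    if low >= 16:
        return 0                       # the field lies wholly in the high half
    return lshift(value, low) & 65535


def field_hi(value, low, high):
    """Contribution of the same field to the high halfword."""
    if high < 16:
        return 0                       # the field lies wholly in the low half
    return lshift(value, low - 16)
```

For [[rd]] (bits 11 to 15) the value 3 contributes 6144 to the low halfword
and nothing to the high one; a field such as [[target]] (bits 0 to 25)
contributes to both.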
        !           329: <<write to standard output a function to convert bit-pattern to pair>>=
        !           330: if (low >= 16) {
        !           331:        printf "%s", function_definition(fieldname "LO",1); print "0"
        !           332: } else {
        !           333:        printf "%s", function_definition(fieldname "LO",1)
        !           334:         printf "andb(lshift(A1,%d),65535)\n", low
        !           335: }
        !           336: if (high < 16) {
        !           337:        printf "%s", function_definition(fieldname "HI",1); print "0"
        !           338: } else {
        !           339:        printf "%s", function_definition(fieldname "HI",1)
        !           340:         printf "lshift(A1,%s)\n", mlnumber(low - 16)
        !           341: }
        !           342: @ The inverse operation is
        !           343: to extract a bit pattern from a pair.
        !           344: We'll want that if we ever care to decode instructions.
        !           345: This time, the function to extract e.g.\ field [[rd]] from a pair
        !           346: is the function [[THErd]] applied to that pair.
        !           347: 
        !           348: The functions work first by extracting from the low part, then
        !           349: from the high part, and adding everything together.
        !           350: If the field is signed, we make the value negative if it is too high.
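The extraction can be sketched in Python along the same lines. The name
[[extract]] is hypothetical, as is the [[signed]] keyword argument; the
masks and shift amounts follow the expressions printed by the awk code,
and [[rshift]] mimics the generated helper that shifts left when the
amount is negative.

```python
def rshift(value, amount):
    """Shift right, or left when the amount is negative."""
    return value << -amount if amount < 0 else value >> amount


def extract(hi, lo, low, high, signed=False):
    """Recover the field occupying bits low..high from a (hi, lo) pair."""
    if low >= 16:
        lo_part = 0
    else:
        lo_part = rshift(lo, low) & (2 ** (min(15, high) - low + 1) - 1)
    if high < 16:
        hi_part = 0
    else:
        hi_part = rshift(hi & (2 ** (high - 16 + 1) - 1), low - 16)
    n = lo_part + hi_part
    if signed and n >= 2 ** (high - low):
        n -= 2 ** (high - low + 1)     # too high for a signed field: wrap around
    return n
```

Applied to the pair [[(137,6176)]], this recovers 3 for [[rd]], 4 for
[[rs]], 9 for [[rt]] and 32 for [[funct]]. A 16-bit signed field holding
65532 comes out as $-4$.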
        !           351: <<write to [[decode]] a function to extract field from pair>>=
        !           352: printf "%s", function_definition("THE" fieldname,2) > decode
        !           353: if (signed) printf "let val n = " > decode
        !           354: <<print expression for unsigned value>>
        !           355: if (signed) {
        !           356:        printf "in if n < %d then n else n - %d\nend\n",
        !           357:                2**(high-low), 2**(high-low+1) > decode
        !           358: }
        !           359: 
        !           360: <<print expression for unsigned value>>=
        !           361: if (low >= 16) {
        !           362:        printf "0" > decode
        !           363: } else {
        !           364:         printf "andb(rshift(A2,%d),%d)", low,
        !           365:                        (2**(min(15,high)-low+1)-1) > decode
        !           366: }
        !           367: printf " + " > decode
        !           368: if (high < 16) {
        !           369:        printf "0\n" > decode
        !           370: } else {
        !           371:         printf "rshift(andb(A1,%d),%s)\n", (2**(high-16+1)-1),
        !           372:                        mlnumber(low - 16) > decode
        !           373: }
        !           374: @ ML uses a strange minus sign ([[~]] instead of [[-]]), 
        !           375: so we print numbers that might be negative like this:
        !           376: <<functions>>=
        !           377: function mlnumber(n, s) {
        !           378:        if (n<0) s = sprintf("~%d", -n)
        !           379:        else s = sprintf("%d", n)
        !           380:        return s
        !           381: }
        !           382: @ For reasons best known to its designers, awk has no [[min]] function.
        !           383: <<functions>>=
        !           384: function min(x,y){
        !           385:        if (x<y) return x
        !           386:        else return y
        !           387: }
        !           388: @ \section{The list of instructions and their formats}
        !           389: This is the section that tells which fields are used in what instructions,
        !           390: and in what order the fields appear.
        !           391: The information is from Appendix A
        !           392: of the MIPS book and should be proofread.
        !           393: 
        !           394: To cut down on the number of ML functions generated, we can comment out
        !           395: instructions with a [[#]] in the first column.
        !           396: This means that no code will be generated for the instruction, and
        !           397: it won't appear in the [[structure Opcodes]].
        !           398: <<opcodes table>>=
        !           399:                        instructions
        !           400: add rd rs rt
        !           401: addi rt rs immed
        !           402: addiu rt rs immed
        !           403: addu rd rs rt
        !           404: and' rd rs rt
        !           405: andi rt rs immed
        !           406: beq rs rt offset
        !           407: bgez rs offset
        !           408: bgezal rs offset
        !           409: bgtz rs offset
        !           410: blez rs offset
        !           411: bltz rs offset
        !           412: bltzal rs offset
        !           413: bne rs rt offset
        !           414: break
        !           415: div rs rt
        !           416: divu rs rt
        !           417: j target
        !           418: jal target
        !           419: jalr rs rd
        !           420: jr rs
        !           421: lb rt offset base
        !           422: lbu rt offset base
        !           423: lh rt offset base
        !           424: lb rt offset base
        !           425: lhu rt offset base
        !           426: lui rt immed
        !           427: lw rt offset base
        !           428: lwl rt offset base
        !           429: lwr rt offset base
        !           430: mfhi rd
        !           431: mflo rd
        !           432: mthi rs
        !           433: mtlo rs
        !           434: mult rs rt
        !           435: multu rs rt
        !           436: nor rd rs rt
        !           437: or rd rs rt
        !           438: ori rt rs immed
        !           439: sb rt offset base
        !           440: sh rt offset base
        !           441: sll rd rt shamt
        !           442: sllv rd rt rs
        !           443: slt rd rs rt
        !           444: slti rt rs immed
        !           445: sltiu rt rs immed
        !           446: sltu rd rs rt
        !           447: sra rd rt shamt
        !           448: srav rd rt rs
        !           449: srl rd rt shamt
        !           450: srlv rd rt rs
        !           451: sub rd rs rt
        !           452: subu rd rs rt
        !           453: sw rt offset base
        !           454: swl rt offset base
        !           455: swr rt offset base
        !           456: syscall
        !           457: xor rd rs rt
        !           458: xori rt rs immed
        !           459: <<floating point instructions>>
        !           460: 
        !           461: 
        !           462: @ We define only those floating point instructions we seem likely to need.
        !           463: To distinguish them as floating point we append an f to their names.
        !           464: <<floating point instructions>>=
        !           465: add_fmt fmt fd fs ft
        !           466: div_fmt fmt fd fs ft
        !           467: lwc1 ft offset base
        !           468: mul_fmt fmt fd fs ft
        !           469: neg_fmt fmt fd fs
        !           470: sub_fmt fmt fd fs ft
        !           471: swc1 ft offset base
        !           472: c_seq fmt fs ft
        !           473: c_lt fmt fs ft
        !           474: @
        !           475:  Here is a terrible hack to enable us to construct branch on coprocessor~1
        !           476: true or false.
        !           477: We will use [[fun bc1f offset = cop1(0,offset)]] and
        !           478:        [[fun bc1t offset = cop1(1,offset)]].
        !           479: <<floating point instructions>>=
        !           480: cop1 rs rt offset
        !           481: @
        !           482: 
        !           483: 
        !           484: @ For each instruction, we define an ML function with the appropriate
        !           485: number of arguments.
        !           486: When that function is given an integer in each argument,
        !           487: it converts the whole thing to one MIPS instruction, represented as an
        !           488: integer pair.
        !           489: 
        !           490: The implementation is a bit of a grubby mess.
        !           491: Doing the fields is straightforward enough, but
        !           492: for each mnemonic we have to do something different based
        !           493: on its type, because each type of opcode goes in a different
        !           494: field.
        !           495: Moreover, for mnemonics of type [[SPECIAL]], [[BCOND]], and [[COP1]] we
        !           496: have to generate [[special]], [[bcond]], and [[cop1]] in the [[op']] field.
        !           497: Finally, we have to do it all twice; once for the high order
        !           498: halfword and once for the low order halfword.
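The dispatch on the type of the mnemonic can be sketched in Python as
follows. The names [[place]] and [[assemble]] are hypothetical, only a few
fields are listed, and the type is passed as a string rather than looked
up in [[typeof]]. Unlike the generated ML, this sketch masks every value
to the width of its field (an assumption of mine, made so that a negative
[[offset]] cannot spill into neighbouring fields).

```python
FIELDS = {"op'": (26, 31), "rs": (21, 25), "rt": (16, 20), "rd": (11, 15),
          "funct": (0, 5), "cond": (16, 20), "immed": (0, 15), "offset": (0, 15)}
OP = {"special": 0, "bcond": 1}     # op' values read off the opcode table


def place(value, field):
    """Mask a value to the width of its field and shift it into position."""
    low, high = FIELDS[field]
    return (value & ((1 << (high - low + 1)) - 1)) << low


def assemble(kind, code, operand_fields, args):
    """Build the (hi, lo) pair for one instruction of the given kind."""
    word = sum(place(arg, field) for field, arg in zip(operand_fields, args))
    if kind == "opcode":
        word += place(code, "op'")
    elif kind == "special":
        word += place(OP["special"], "op'") + place(code, "funct")
    elif kind == "bcond":
        word += place(OP["bcond"], "op'") + place(code, "cond")
    else:
        raise ValueError("bad operator kind: " + kind)
    return word >> 16, word & 0xFFFF
```

For instance, [[assemble("special", 0o40, ["rd", "rs", "rt"], [3, 4, 9])]]
gives the pair [[(137, 6176)]] for [[add(3,4,9)]].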
        !           499: <<compute function that generates this instruction>>=
        !           500:        printf "%s", function_definition(opname, NF-1)
        !           501:        printf "("      # open parenthesis for pair
        !           502:        for (i=2; i<= NF; i++) {
        !           503:                if (!($i in fieldnames)) <<bad field name>>
        !           504:                printf "%sHI(A%d)+", $i, i-1
        !           505:        }
        !           506:        if (typeof[opname]==OPCODE) {
        !           507:                printf "op'HI(%d)", numberof[opname]
        !           508:        } else if (typeof[opname]==SPECIAL) {
        !           509:                printf "op'HI(%d)+", numberof["special"]
        !           510:                printf "functHI(%d)", numberof[opname]
        !           511:        } else if (typeof[opname]==BCOND) {
        !           512:                printf "op'HI(%d)+", numberof["bcond"]
        !           513:                printf "condHI(%d)", numberof[opname]
        !           514:        } else if (typeof[opname]==COP1) {
        !           515:                printf "op'HI(%d)+", numberof["cop1"]
        !           516:                printf "functHI(%d)", numberof[opname]
        !           517:        } else <<bad operator name>>
        !           518:        printf ", "
        !           519:        for (i=2; i<= NF; i++) {
        !           520:                if (!($i in fieldnames)) <<bad field name>>
        !           521:                printf "%sLO(A%d)+", $i, i-1
        !           522:        }
        !           523:        if (typeof[opname]==OPCODE) {
        !           524:                printf "op'LO(%d)", numberof[opname]
        !           525:        } else if (typeof[opname]==SPECIAL) {
        !           526:                printf "op'LO(%d)+", numberof["special"]
        !           527:                printf "functLO(%d)", numberof[opname]
        !           528:        } else if (typeof[opname]==BCOND) {
        !           529:                printf "op'LO(%d)+", numberof["bcond"]
        !           530:                printf "condLO(%d)", numberof[opname]
        !           531:        } else if (typeof[opname]==COP1) {
        !           532:                printf "op'LO(%d)+", numberof["cop1"]
        !           533:                printf "functLO(%d)", numberof[opname]
        !           534:        } else <<bad operator name>>
        !           535:        printf ")\n"
        !           536: @
        !           537: Setup is as before.
        !           538: <<statements>>=
        !           539: NF == 1 && $1 == "instructions" {
        !           540:        startline = NR
        !           541:        instructions = 1
        !           542:        next
        !           543: }
        !           544: <<blank line resets>>=
        !           545: instructions= 0
        !           546: <<statements>>=
        !           547: instructions && $0 !~ /^#/ {
        !           548:        opname = $1
        !           549: 
        !           550:        <<compute string displayed when this instruction is decoded>>
        !           551: ########       gsub("[^a-z']+"," ")   ### ill-advised
        !           552: 
        !           553:        <<compute function that generates this instruction>>
        !           554: }
        !           555: 
        !           556: @ \paragraph{Decoding instructions}
        !           557: When we've decoded an instruction, we have to display some sort of
        !           558: string representation that tells us what the instruction is.
        !           559: Ideally we should display either just what the assembler expects,
        !           560: or perhaps just what dbx displays when asked about actual instructions
        !           561: in memory images.
        !           562: 
        !           563: For now, we just give the mnemonic for the instruction, followed
        !           564: by a description of each field (followed by a newline).
        !           565: The fields are described as name-value pairs.
        !           566: 
        !           567: We rely on the fact that for any field, e.g.\ [[rd]], the string
        !           568: representation of that field's value is available as [[Srd]] (the field name preceded by [[S]]).
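
To see what such a string looks like, here is a minimal sketch that runs
the same string-building steps on one made-up table line.
The shell wrapper [[display_of]] and the sample line [[add rs rt rd]] are
assumptions for illustration only; the real field names and their order
come from the instruction tables.

```shell
# Hypothetical wrapper: feed one instruction-table line to the same awk
# statements that build displayof[opname], and print the resulting ML text.
display_of() {
  printf '%s\n' "$1" | awk '{
    opname = $1
    temp = "\"" opname " \""                  # mnemonic as an ML string literal
    for (i = 2; i <= NF; i++) {
      temp = sprintf("%s ^ \"%s = \" ^ S%s", temp, $i, $i)
      if (i < NF) temp = sprintf("%s ^ \",\" ", temp)
    }
    print temp " ^ \"\\n\""                   # terminate with an ML newline
  }'
}

# Sample input only; not taken from the real tables.
display_of "add rs rt rd"
```

The result is an ML expression that concatenates the mnemonic, then one
name-value piece per field separated by commas, then a newline.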
        !           569: <<compute string displayed when this instruction is decoded>>=
        !           570: temp = "\"" opname " \""
        !           571: for (i=2; i<=NF; i++) {
        !           572:        temp = sprintf( "%s ^ \"%s = \" ^ S%s", temp, $i, $i)
        !           573:        if (i<NF) temp = sprintf("%s ^ \",\" ", temp)
        !           574: }
        !           575: displayof[opname]=temp " ^ \"\\n\""
        !           576: 
        !           577: @ The implementation of the decoding function is split into several parts.
        !           578: First, we have to be able to extract any field from an instruction.
        !           579: Then, we have to be able to decode four kinds of opcodes:
        !           580: [[OPCODE]]s, [[BCOND]]s,  [[SPECIAL]]s, and [[COP1]]s.
        !           581: The main function is the one that does ordinary opcodes.
        !           582: The others are auxiliary.
        !           583: <<write out the definitions of the decoding functions>>=
        !           584: printf "%s", function_definition("decode",2) > decode
        !           585: print "let" > decode
        !           586:   <<write definitions of integer and string representations of each field>>
        !           587:   <<write expression that decodes the [[funct]] field for [[special]]s>>
        !           588:   <<write expression that decodes the [[cond]] field for [[bcond]]s>>
        !           589:   <<write expression that decodes the [[funct]] field for [[cop1]]s>>
        !           590: print "in" > decode
        !           591:   <<write [[case]] expression that decodes the [[op']] field for each instruction>>
        !           592: print "end" > decode
        !           593: @ We give each field its own name for an integer version, and its name
        !           594: preceded by [[S]] for its string version.
        !           595: These values are all computed just once, from the arguments to the
        !           596: enclosing function ([[decode]]).
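
As a rough illustration, the sketch below prints the bindings for two fields.
The wrapper [[field_bindings]] and the way it fills [[fieldnames]] are
hypothetical stand-ins; only the two [[printf]] statements follow the chunk.
Note that awk does not promise any particular order when iterating over an
array, which is harmless here because the bindings for different fields
are independent.

```shell
# Hypothetical wrapper: print the ML bindings for the field names given
# as arguments.
field_bindings() {
  printf '%s\n' "$@" | awk '
    { fieldnames[$1] = 1 }          # stand-in for the real field table
    END {
      for (f in fieldnames) {       # iteration order is unspecified in awk
        printf "val %s = THE%s(A1,A2)\n", f, f
        printf "val S%s = Integer.makestring %s\n", f, f
      }
    }'
}

# Sample field names only.
field_bindings rs rt
```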
        !           597: <<write definitions of integer and string representations of each field>>=
        !           598: for (f in fieldnames) {
        !           599:        printf "val %s = THE%s(A1,A2)\n", f, f  > decode
        !           600:        printf "val S%s = Integer.makestring %s\n", f, f  > decode
        !           601: }
        !           602: @ The next three chunks are very much of a piece.
        !           603: Each writes out an enormous [[case]] expression that matches integers
        !           604: (bit patterns) to strings.
        !           605: The fundamental operation is printing out a decimal value and a string
        !           606: for each opcode:
        !           607: <<if [[name]] is known, display a case for it>>=
        !           608: if (name != ""  && name != "reserved") {
        !           609:        <<print space or bar ([[|]])>>
        !           610:        disp = displayof[name]
        !           611:        if (disp=="") disp="\"" name "(??? unknown format???)\\n\""
        !           612:        printf "%d => %s\n", code, disp > decode
        !           613: }
        !           614: @ Cases must be separated by vertical bars.
        !           615: We do the separation by putting a vertical bar before each case except
        !           616: the first.
        !           617: We use a hack to detect the first case: we assume that code~0 is always
        !           618: defined, so it is always printed first; otherwise a stray bar would lead the [[case]].
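
If the assumption about code~0 ever became a nuisance, one alternative
would be to remember whether a case has already been printed.
The sketch below is not what the chunk does; the wrapper [[case_arms]],
the flag [[printed]], and the sample input are hypothetical, and the
display string is reduced to the bare mnemonic.

```shell
# Hypothetical wrapper: read "code name" pairs and print ML case arms,
# choosing the separator with a flag instead of testing code != 0.
case_arms() {
  awk '
    $2 != "" && $2 != "reserved" {
      if (printed) printf " | "     # every arm after the first gets a bar
      else printf "   "             # the first arm gets padding instead
      printed = 1
      printf "%d => \"%s\\n\"\n", $1, $2
    }'
}

# Sample input only: code 0 is deliberately absent.
printf '%s\n' "2 j" "3 jal" | case_arms
```

With this variant the first arm printed is never preceded by a bar,
whichever code it happens to carry.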
        !           619: <<print space or bar ([[|]])>>=
        !           620: if (code!=0) printf " | "  > decode # hack but it works
        !           621: else printf "   " > decode
        !           622: <<write expression that decodes the [[funct]] field for [[special]]s>>=
        !           623: print "val do_special ="  > decode
        !           624: print "(case funct of" > decode
        !           625: for (code=0; code<256; code++) {
        !           626:        name = opcode[SPECIAL,code]
        !           627:        <<if [[name]] is known, display a case for it>>
        !           628: }
        !           629: printf " | _ => \"unknown special\\n\"\n" > decode
        !           630: print "   ) " > decode
        !           631: <<write expression that decodes the [[cond]] field for [[bcond]]s>>=
        !           632: print "val do_bcond =" > decode
        !           633: print "(case cond of" > decode
        !           634: for (code=0; code<256; code++) {
        !           635:        name = opcode[BCOND,code]
        !           636:        <<if [[name]] is known, display a case for it>>
        !           637: }
        !           638: printf " | _ => \"unknown bcond\\n\"\n" > decode
        !           639: print "   ) " > decode
        !           640: <<write expression that decodes the [[funct]] field for [[cop1]]s>>=
        !           641: print "val do_cop1 =" > decode
        !           642: print "(case funct of" > decode
        !           643: for (code=0; code<256; code++) {
        !           644:        name = opcode[COP1,code]
        !           645:        <<if [[name]] is known, display a case for it>>
        !           646: }
        !           647: printf " | _ => \"unknown cop1\\n\"\n" > decode
        !           648: print "   ) " > decode
        !           649: @ The major expression is a little more complicated, because it has to
        !           650: check for [[special]], [[bcond]], and [[cop1]] and handle those separately.
        !           651: <<write [[case]] expression that decodes the [[op']] field for each instruction>>=
        !           652: print "(case op' of" > decode
        !           653: for (code=0; code<256; code++) {
        !           654:        name = opcode[OPCODE,code]
        !           655:        if (name=="special") {
        !           656:                <<print space or bar ([[|]])>>
        !           657:                printf "%d => %s\n", code, "do_special" > decode
        !           658:        } else if (name=="bcond") {
        !           659:                <<print space or bar ([[|]])>>
        !           660:                printf "%d => %s\n", code, "do_bcond" > decode
        !           661:        } else if (name=="cop1") {
        !           662:                <<print space or bar ([[|]])>>
        !           663:                printf "%d => %s\n", code, "do_cop1" > decode
        !           664:        } else <<if [[name]] is known, display a case for it>>
        !           665: }
        !           666: printf " | _ => \"unknown opcode\\n\"\n" > decode
        !           667: print "   ) " > decode
        !           668: @ \section{Testing}
        !           669: One day someone will have to modify the instruction handler so that
        !           670: it generates a test invocation of each instruction.
        !           671: Then the results can be handed to something like adb or dbx and we can
        !           672: see whether the system agrees with us about what we're generating.
        !           673: 
        !           674: @ \section{Defining ML functions}
        !           675: The awk function [[function_definition]] is used to
        !           676: come up with ML function definitions.
        !           677: It takes as arguments the name of the function and the number of arguments
        !           678: to that function, and returns a string containing the initial part of
        !           679: the function definition.
        !           680: Writing an expression following that string will result in a complete
        !           681: ML function.
        !           682: 
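For a concrete picture of the strings returned, the sketch below wraps a
copy of the awk function in a shell function so that it can be called
from the command line.
The wrapper [[ml_prefix]], the variables [[fname]] and [[nargs]], and the
sample names are hypothetical; the awk function body follows the
definition given in this section.

```shell
# Hypothetical wrapper: call the awk function with a name and an argument
# count, and print the ML prefix it returns.
ml_prefix() {
  awk -v fname="$1" -v nargs="$2" '
    function function_definition(name, argc,  i, temp) {
      if (argc == 0) {
        temp = sprintf("val %s = ", name)       # no arguments: a value
      } else {
        temp = sprintf("fun %s(", name)
        for (i = 1; i < argc; i++) temp = sprintf("%sA%d,", temp, i)
        temp = sprintf("%sA%d) = ", temp, argc) # arguments are A1..An
      }
      return temp
    }
    BEGIN { print function_definition(fname, nargs + 0) }'
}

# Sample calls only.
ml_prefix decode 2
ml_prefix add 3
ml_prefix nop 0
```

These calls should yield [[fun decode(A1,A2) = ]], [[fun add(A1,A2,A3) = ]],
and [[val nop = ]], each ending in a space so that an expression can
follow directly.
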
        !           683: If we ever wanted to define these things as C preprocessor macros instead,
        !           684: we could do it by substituting [[macro_definition]].
        !           685: I'm not sure it would ever make sense to do so, but I'm leaving the
        !           686: code here anyway.
        !           687: <<functions>>=
        !           688: function function_definition(name, argc,  i, temp) {
        !           689:        if (argc==0) {
        !           690:                temp = sprintf("val %s = ", name)
        !           691:        } else {
        !           692:                temp = sprintf( "fun %s(", name)
        !           693:                for (i=1; i< argc; i++) temp = sprintf("%sA%d,", temp,i)
        !           694:                temp = sprintf( "%sA%d) = ", temp, argc)
        !           695:        }
        !           696:        return temp
        !           697: }
        !           698: <<useless functions>>=
        !           699: function macro_definition(name, argc,  i, temp) {
        !           700:        if (argc==0) {
        !           701:                temp = sprintf("#define %s ", name)
        !           702:        } else {
        !           703:                temp = sprintf( "#define %s(", name)
        !           704:                for (i=1; i< argc; i++) temp = sprintf("%sA%d,", temp,i)
        !           705:                temp = sprintf( "%sA%d) ", temp, argc)
        !           706:        }
        !           707:        return temp
        !           708: }
        !           709: @ \section{Handling error conditions}
        !           710: Here are a bunch of uninteresting functions and modules
        !           711: that handle error conditions.
        !           712: <<bad operator name>>=
        !           713: {
        !           714:        print "unknown opcode", opname, "on line", NR > stderr
        !           715:        next
        !           716: }
        !           717: <<bad field name>>=
        !           718: {
        !           719:        print "unknown field", $i, "on line", NR > stderr
        !           720:        next
        !           721: }
        !           722: <<BEGIN>>=
        !           723: stderr="/dev/tty"
        !           724: <<functions>>=
        !           725: function insist_fields(n) {
        !           726:        if (NF != n) {
        !           727:                print "Must have", n, "fields on line",NR ":", $0 > stderr
        !           728:                return 0
        !           729:        } else {
        !           730:                return 1
        !           731:        }
        !           732: }
        !           733: @ \section{Leftover junk}
        !           734: Like a pack rat, I never throw out anything that might be useful again later.
        !           735: <<junk>>=
        !           736: function thetype(n) {
        !           737:        if (n==OPCODE) return "OPCODE"
        !           738:        else if (n==SPECIAL) return "SPECIAL"
        !           739:        else if (n==BCOND) return "BCOND"
        !           740:        else if (n==COP1) return "COP1"
        !           741:        else return "BADTYPE"
        !           742: }
        !           743: <<decoding junk>>=
        !           744: for (f in fieldnames) {
        !           745:        printf "^ \"\\n%s = \" ^ Integer.makestring %s\n",f,f > decode
        !           746: }
        !           747: printf "^\"\\n\"\n" > decode
