 +---------------------------------------------------------------------------+
 |  wm-FPU-emu   an FPU emulator for 80386 and 80486SX microprocessors.      |
 |                                                                           |
 |  Copyright (C) 1992 W. Metzenthen, 22 Parker St, Ormond, Vic 3163,        |
 |  Australia. E-mail [email protected]                                      |
 |                                                                           |
 |  This program is free software; you can redistribute it and/or modify     |
 |  it under the terms of the GNU General Public License version 2 as        |
 |  published by the Free Software Foundation.                               |
 |                                                                           |
 |  This program is distributed in the hope that it will be useful,          |
 |  but WITHOUT ANY WARRANTY; without even the implied warranty of           |
 |  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the             |
 |  GNU General Public License for more details.                             |
 |                                                                           |
 |  You should have received a copy of the GNU General Public License        |
 |  along with this program; if not, write to the Free Software              |
 |  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.                |
 |                                                                           |
 +---------------------------------------------------------------------------+


***NOTE***  THIS SHOULD BE REGARDED AS AN ALPHA TEST VERSION
            (although the beta version may be identical)


wm-FPU-emu is an FPU emulator for Linux. It is derived from wm-emu387
which is my 80387 emulator for djgpp (gcc under msdos); wm-emu387 was
in turn based upon emu387 which was written by DJ Delorie for djgpp.
The interface to the Linux kernel is based upon the original Linux
math emulator by Linus Torvalds.

My target FPU for wm-FPU-emu is that described in the Intel486
Programmer's Reference Manual (1992 edition). Numerous facets of the
functioning of the FPU are not well covered in the Reference Manual;
in the absence of clear details I have made guesses about the most
reasonable behaviour.

wm-FPU-emu does not implement all of the behaviour of the 80486 FPU.
See "Limitations" later in this file for a partial list of some
differences. I believe that the missing features are never used by
normal C or FORTRAN programs.

Please report bugs, etc to me at:
       [email protected]


--Bill Metzenthen
  Oct 1992

----------------------- Internals of wm-FPU-emu -----------------------

Numeric algorithms:
(1) Add, subtract, and multiply. Nothing remarkable in these.
(2) Divide has been tuned to get reasonable performance. The algorithm
    is not the obvious one which most people seem to use, but is designed
    to take advantage of the characteristics of the 80386. I expect that
    it has been invented many times before I discovered it, but I have not
    seen it. It is based upon one of those ideas which one carries around
    for years without ever bothering to check it out.
(3) The sqrt function has been tuned to get good performance. It is based
    upon Newton's classic method. Performance was improved by capitalizing
    upon the properties of Newton's method, and the code is once again
    structured taking account of the 80386 characteristics.
(4) The trig, log, and exp functions are based in each case upon quasi-
    "optimal" polynomial approximations. My definition of "optimal" was
    based upon getting good accuracy with reasonable speed.

The code of the emulator is complicated slightly by the need to
account for a limited form of re-entrancy. Normally, the emulator will
emulate each FPU instruction to completion without interruption.
However, it may happen that when the emulator is accessing the user
memory space, swapping may be needed. In this case the emulator may be
temporarily suspended while disk i/o takes place. During this time
another process may use the emulator, thereby changing some static
variables (eg FPU_st0_ptr, etc). The code which accesses user memory
is confined to five files:
    fpu_entry.c
    reg_ld_str.c
    load_store.c
    get_address.c
    errors.c

----------------------- Limitations of wm-FPU-emu -----------------------

There are a number of differences between the current wm-FPU-emu
(version ALPHA 0.7) and the 80486 FPU (apart from bugs). Some of the
more important differences are listed below:

Internal computations do not use de-normal numbers (but External
de-normals ARE recognised and generated). The design of wm-FPU-emu
allows a larger exponent range than the 80486 FPU for internal
computations.

All computations are performed at full 64 bit precision (the PC bits
of the FPU control word are ignored). Under Linux, the FPU normally
runs at 64 bits precision.

The precision flag (PE of the FPU status word) is not implemented.
Does anyone write code which uses this feature?

The Roundup flag (C1) is not implemented.

The functions which load/store the FPU state are partially implemented,
but the implementation should be sufficient for handling FPU errors etc
in 32 bit protected mode.

----------------------- Performance of wm-FPU-emu -----------------------

Speed.
-----

The speed of floating point computation with the emulator will depend
upon instruction mix. Relative performance is best for the instructions
which require most computation. The simple instructions are adversely
affected by the fpu instruction trap overhead.


Timing: Some simple timing tests have been made on the emulator functions.
The times include load/store instructions. All times are in microseconds
measured on a 33MHz 386 with 64k cache. The Turbo C tests were under
ms-dos, the next two columns are for emulators running with the djgpp
ms-dos extender. The final column is for wm-FPU-emu in Linux 0.97,
using libm4.0 (hard).

function      Turbo C        djgpp 1.06        WM-emu387      wm-FPU-emu

   +          60.5           154.8             76.5           139.4
   -          61.1-65.5      157.3-160.8       76.2-79.5      142.9-144.7
   *          71.0           190.8             79.6           146.6
   /          61.2-75.0      261.4-266.9       75.3-91.6      142.2-158.1

 sin()        310.8          4692.0            319.0          398.5
 cos()        284.4          4855.2            308.0          388.7
 tan()        495.0          8807.1            394.9          504.7
 atan()       328.9          4866.4            601.1          419.5-491.9

 sqrt()       128.7          crashed           145.2          227.0
 log()        413.1-419.1    5103.4-5354.21    254.7-282.2    409.4-437.1
 exp()        479.1          6619.2            469.1          850.8


The performance under Linux is improved by the use of look-ahead code.
The following results show the improvement which is obtained under
Linux due to the look-ahead code. Also given are the times for the
original Linux emulator with the 4.1 'soft' lib.

 [ Linus' note: I changed look-ahead to be the default under linux, as
   there was no reason not to use it after I had edited it to be
   disabled during tracing ]

              wm-FPU-emu w      original w
              look-ahead        'soft' lib
   +          106.4             190.2
   -          108.6-111.6       192.4-216.2
   *          113.4             193.1
   /          108.8-124.4       700.1-706.2

 sin()        390.5             2642.0
 cos()        381.5             2767.4
 tan()        496.5             3153.3
 atan()       367.2-435.5       2439.4-3396.8

 sqrt()       195.1             4732.5
 log()        358.0-387.5       3359.2-3390.3
 exp()        619.3             4046.4


----------------------- Accuracy of wm-FPU-emu -----------------------


Accuracy: The following table gives the accuracy of the sqrt(), trig
and log functions. Each function was tested at about 400 points. Ideal
results would be 64 bits. The reduced accuracy of cos() and tan() for
arguments greater than pi/4 can be thought of as being due to the
precision of the argument x; e.g. an argument of pi/2-(1e-10) which is
accurate to 64 bits can result in a relative accuracy in cos() of about
64 + log2(cos(x)) = 31 bits. Results for the Turbo C emulator are given
in the last column.


Function      Tested x range            Worst result (bits)        Turbo C

sqrt(x)       1 .. 2                    64.1                       63.2
atan(x)       1e-10 .. 200              62.6                       62.8
cos(x)        0 .. pi/2-(1e-10)         63.2 (x <= pi/4)           62.4
                                        35.2 (x = pi/2-(1e-10))    31.9
sin(x)        1e-10 .. pi/2             63.0                       62.8
tan(x)        1e-10 .. pi/2-(1e-10)     62.4 (x <= pi/4)           62.1
                                        35.2 (x = pi/2-(1e-10))    31.9
exp(x)        0 .. 1                    63.1                       62.9
log(x)        1+1e-6 .. 2               62.4                       62.1