.\" Copyright (c) 1986 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms are permitted
.\" provided that the above copyright notice and this paragraph are
.\" duplicated in all such forms and that any documentation,
.\" advertising materials, and other materials related to such
.\" distribution and use acknowledge that the software was developed
.\" by the University of California, Berkeley.  The name of the
.\" University may not be used to endorse or promote products derived
.\" from this software without specific prior written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\"	@(#)4.t	6.2 (Berkeley) 3/7/89
.\"
.ds RH Performance
.NH
Performance
.PP
Ultimately, the proof of the effectiveness of the
algorithms described in the previous section
is the long term performance of the new file system.
.PP
Our empirical studies have shown that the inode layout policy has
been effective.
When running the ``list directory'' command on a large directory
that itself contains many directories (to force the system
to access inodes in multiple cylinder groups),
the number of disk accesses for inodes is cut by a factor of two.
The improvements are even more dramatic for large directories
containing only files,
disk accesses for inodes being cut by a factor of eight.
This is most encouraging for programs such as spooling daemons that
access many small files,
since these programs tend to flood the
disk request queue on the old file system.
.PP
Table 2 summarizes the measured throughput of the new file system.
Several comments need to be made about the conditions under which these
tests were run.
The test programs measure the rate at which user programs can transfer
data to or from a file without performing any processing on it.
These programs must read and write enough data to
ensure that buffering in the
operating system does not affect the results.
They are also run at least three times in succession;
the first to get the system into a known state
and the second two to ensure that the
experiment has stabilized and is repeatable.
The tests used and their results are
discussed in detail in [Kridle83]\(dg.
.FS
\(dg A UNIX command that is similar to the reading test that we used is
``cp file /dev/null'', where ``file'' is eight megabytes long.
.FE
The systems were running multi-user but were otherwise quiescent.
There was no contention for either the CPU or the disk arm.
The only difference between the UNIBUS and MASSBUS tests
was the controller.
All tests used an AMPEX Capricorn 330 megabyte Winchester disk.
As Table 2 shows, all file system test runs were on a VAX 11/750.
All file systems had been in production use for at least
a month before being measured.
The same number of system calls were performed in all tests;
the basic system call overhead was a negligible portion of
the total running time of the tests.
.KF
.DS B
.TS
box;
c c|c s s
c c|c c c.
Type of	Processor and	Read
File System	Bus Measured	Speed	Bandwidth	% CPU
_
old 1024	750/UNIBUS	29 Kbytes/sec	29/983 3%	11%
new 4096/1024	750/UNIBUS	221 Kbytes/sec	221/983 22%	43%
new 8192/1024	750/UNIBUS	233 Kbytes/sec	233/983 24%	29%
new 4096/1024	750/MASSBUS	466 Kbytes/sec	466/983 47%	73%
new 8192/1024	750/MASSBUS	466 Kbytes/sec	466/983 47%	54%
.TE
.ce 1
Table 2a \- Reading rates of the old and new UNIX file systems.
.TS
box;
c c|c s s
c c|c c c.
Type of	Processor and	Write
File System	Bus Measured	Speed	Bandwidth	% CPU
_
old 1024	750/UNIBUS	48 Kbytes/sec	48/983 5%	29%
new 4096/1024	750/UNIBUS	142 Kbytes/sec	142/983 14%	43%
new 8192/1024	750/UNIBUS	215 Kbytes/sec	215/983 22%	46%
new 4096/1024	750/MASSBUS	323 Kbytes/sec	323/983 33%	94%
new 8192/1024	750/MASSBUS	466 Kbytes/sec	466/983 47%	95%
.TE
.ce 1
Table 2b \- Writing rates of the old and new UNIX file systems.
.DE
.KE
.PP
Unlike the old file system,
the transfer rates for the new file system do not
appear to change over time.
The throughput rate is tied much more strongly to the
amount of free space that is maintained.
The measurements in Table 2 were based on a file system
with a 10% free space reserve.
Synthetic work loads suggest that throughput deteriorates
to about half the rates given in Table 2 when the file
systems are full.
.PP
The percentage of bandwidth given in Table 2 is a measure
of the effective utilization of the disk by the file system.
An upper bound on the transfer rate from the disk is calculated
by multiplying the number of bytes on a track by the number
of revolutions of the disk per second.
The bandwidth is the data rate that the file system
is able to achieve, expressed as a percentage of this upper bound.
Using this metric, the old file system is only
able to use about 3\-5% of the disk bandwidth,
while the new file system uses up to 47%
of the bandwidth.
.PP
Both reads and writes are faster in the new system than in the old system.
The biggest factor in this speedup is the larger
block size used by the new file system.
The overhead of allocating blocks in the new system is greater
than the overhead of allocating blocks in the old system;
however, fewer blocks need to be allocated in the new system
because they are bigger.
The net effect is that the cost per byte allocated is about
the same for both systems.
.PP
In the new file system, the reading rate is always at least
as fast as the writing rate.
This is to be expected since the kernel must do more work when
allocating blocks than when simply reading them.
Note that the write rates are about the same
as the read rates in the 8192 byte block file system;
the write rates are slower than the read rates in the 4096 byte block
file system.
The slower write rates occur because
the kernel has to do twice as many disk allocations per second,
making the processor unable to keep up with the disk transfer rate.
.PP
In contrast, the old file system is about 50%
faster at writing files than reading them.
This is because the write system call is asynchronous and
the kernel can generate disk transfer
requests much faster than they can be serviced;
hence disk transfers queue up in the disk buffer cache.
Because the disk buffer cache is sorted by minimum seek distance,
the average seek between the scheduled disk writes is much
less than it would be if the data blocks were written out
in the random disk order in which they are generated.
However, when the file is read,
the read system call is processed synchronously so
the disk blocks must be retrieved from the disk in the
non-optimal seek order in which they are requested.
This forces the disk scheduler to do long
seeks, resulting in a lower throughput rate.
.PP
In the new system, the blocks of a file are better
ordered on the disk.
Even though reads are still synchronous,
the requests are presented to the disk in a much better order.
Even though the writes are still asynchronous,
they are already presented to the disk in minimum seek
order so there is no gain to be had by reordering them.
Hence the disk seek latencies that limited the old file system
have little effect in the new file system.
The cost of allocation is the factor in the new system that
causes writes to be slower than reads.
.PP
The performance of the new file system is currently
limited by the memory-to-memory copy operations
required to move data from disk buffers in the
system's address space to data buffers in the user's
address space.  These copy operations account for
about 40% of the time spent performing an input/output operation.
If the buffers in both address spaces were properly aligned,
this transfer could be performed without copying by
using the VAX virtual memory management hardware.
This would be especially desirable when transferring
large amounts of data.
We did not implement this because it would change the
user interface to the file system in two major ways:
user programs would be required to allocate buffers on page boundaries,
and data would disappear from buffers after being written.
.PP
Greater disk throughput could be achieved by rewriting the disk drivers
to chain together kernel buffers.
This would allow contiguous disk blocks to be read
in a single disk transaction.
Many disks used with UNIX systems contain either
32 or 48 512-byte sectors per track.
Each track holds exactly two or three 8192 byte file system blocks,
or four or six 4096 byte file system blocks.
The inability to use contiguous disk blocks
effectively limits the performance
on these disks to less than 50% of the available bandwidth.
If the next block for a file cannot be laid out contiguously,
then the minimum spacing to the next allocatable
block on any platter is between a sixth and a half of a revolution.
The implication of this is that the best possible layout without
contiguous blocks uses only half of the bandwidth of any given track.
If each track contains an odd number of sectors,
then it is possible to resolve the rotational delay to any number of sectors
by finding a block that begins at the desired
rotational position on another track.
Block chaining has not been implemented because it
would require rewriting all the disk drivers in the system,
and the current throughput rates are already limited by the
speed of the available processors.
.PP
Currently only one block is allocated to a file at a time.
A technique used by the DEMOS file system,
when it finds that a file is growing rapidly,
is to preallocate several blocks at once,
releasing them when the file is closed if they remain unused.
By batching up allocations, the system can reduce the
overhead of allocating at each write,
and it can cut down on the number of disk writes needed to
keep the block pointers on the disk
synchronized with the block allocation [Powell79].
This technique was not included because block allocation
currently accounts for less than 10% of the time spent in
a write system call and, once again, the
current throughput rates are already limited by the speed
of the available processors.
.ds RH Functional enhancements
.sp 2
.ne 1i
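.PP
As an illustrative aside, the bandwidth percentages in Table 2 and the
blocks-per-track figures quoted in this section can be recomputed with
a few lines of arithmetic.
The sketch below is not part of the original measurements.
It assumes the 983 Kbytes/sec upper bound that appears in the
Bandwidth column of Table 2, and the function names are hypothetical,
chosen only for this example.
.DS L
```python
# Upper bound on the disk transfer rate in Kbytes/sec, taken from the
# denominators printed in the Bandwidth column of Table 2 (assumption:
# the same bound applies to every row, as the table suggests).
MAX_KBYTES_PER_SEC = 983


def bandwidth_percent(measured_kbytes_per_sec, upper_bound=MAX_KBYTES_PER_SEC):
    """Measured rate as a whole-number percentage of the upper bound."""
    return round(100 * measured_kbytes_per_sec / upper_bound)


def blocks_per_track(sectors_per_track, block_size, sector_size=512):
    """Number of whole file system blocks that fit on one track."""
    return (sectors_per_track * sector_size) // block_size


if __name__ == "__main__":
    # Measured read and write speeds from Tables 2a and 2b.
    for rate in (29, 221, 233, 466, 48, 142, 215, 323):
        print(rate, "Kbytes/sec is", bandwidth_percent(rate), "% of the bound")
    # Track geometries mentioned in the text: 32 or 48 sectors of 512 bytes.
    for sectors in (32, 48):
        for block in (8192, 4096):
            print(sectors, "sectors hold", blocks_per_track(sectors, block),
                  "blocks of", block, "bytes")
```
.DE
The rounded percentages agree with the Bandwidth column of Table 2,
and the track computation reproduces the two or three 8192 byte blocks
(four or six 4096 byte blocks) per track cited above.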