.\" Copyright (c) 1990 Regents of the University of California.
.\" All rights reserved.  The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\"	@(#)crash.8	5.1 (Berkeley) 6/29/90
.\"
.TH CRASH 8 "June 29, 1990"
.UC 7
.SH NAME
crash \- what happens when the system crashes
.SH DESCRIPTION
This section explains what happens when the system crashes
and (very briefly) how to analyze crash dumps.
.PP
When the system crashes voluntarily it prints a message of the form
.IP
panic: why i gave up the ghost
.LP
on the console, takes a dump on a mass storage peripheral,
and then invokes an automatic reboot procedure as
described in
.IR reboot (8).
Unless some unexpected inconsistency is encountered in the state
of the file systems due to hardware or software failure, the system
will then resume multi-user operations.
.PP
The system has a large number of internal consistency checks; if one
of these fails, then it will panic with a very short message indicating
which one failed.
In many instances, this will be the name of the routine which detected
the error, or a two-word description of the inconsistency.
A full understanding of most panic messages requires perusal of the
source code for the system.
.PP
The most common cause of system failures is hardware failure, which
can reflect itself in different ways.  Here are the messages which
are most likely, with some hints as to causes.
Left unstated in all cases is the possibility that hardware or software
error produced the message in some unexpected way.
.TP
.B iinit
This cryptic panic message results from a failure to mount the root filesystem
during the bootstrap process.
Either the root filesystem has been corrupted,
or the system is attempting to use the wrong device as root filesystem.
Usually, an alternate copy of the system binary or an alternate root
filesystem can be used to bring up the system to investigate.
.TP
.B Can't exec /etc/init
This is not a panic message, as reboots are likely to be futile.
Late in the bootstrap procedure, the system was unable to locate
and execute the initialization process,
.IR init (8).
The root filesystem is incorrect or has been corrupted, or the mode
or type of /etc/init forbids execution.
.TP
.B IO err in push
.ns
.TP
.B hard IO err in swap
The system encountered an error trying to write to the paging device
or an error in reading critical information from a disk drive.
The offending disk should be fixed if it is broken or unreliable.
.TP
.B realloccg: bad optim
.ns
.TP
.B ialloc: dup alloc
.ns
.TP
.B alloccgblk: cyl groups corrupted
.ns
.TP
.B ialloccg: map corrupted
.ns
.TP
.B free: freeing free block
.ns
.TP
.B free: freeing free frag
.ns
.TP
.B ifree: freeing free inode
.ns
.TP
.B alloccg: map corrupted
These panic messages are among those that may be produced
when filesystem inconsistencies are detected.
The problem generally results from a failure to repair damaged filesystems
after a crash, hardware failures, or other conditions that should not
normally occur.
A filesystem check will normally correct the problem.
.TP
.B timeout table overflow
.ns
This really shouldn't be a panic, but until the data structure
involved is made to be extensible, running out of entries causes a crash.
If this happens, make the timeout table bigger.
.TP
.B "trap type %d, code = %x, v = %x"
An unexpected trap has occurred within the system; the trap types are:
.sp
.nf
0	bus error
1	address error
2	illegal instruction
3	divide by zero
4	\fIchk\fP instruction
5	\fItrapv\fP instruction
6	privileged instruction
7	trace trap
8	MMU fault
9	simulated software interrupt
10	format error
11	FP coprocessor fault
12	coprocessor fault
13	simulated AST
.fi
.sp
The favorite trap type in system crashes is trap type 8,
indicating a wild reference.
``code'' (hex) is the concatenation of the MMU status register
(see <hp300/cpu.h>)
in the high 16 bits and the 68020 special status word
(see the 68020 manual, page 6-17)
in the low 16.
``v'' (hex) is the virtual address which caused the fault.
Additionally, the kernel will dump about a screenful of semi-useful
information.
``pid'' (decimal) is the process id of the process running at the
time of the exception.
Note that if we panic in an interrupt routine,
this process may not be related to the panic.
``ps'' (hex) is the 68020 processor status register ``ps''.
``pc'' (hex) is the value of the program counter saved
on the hardware exception frame.
It may
.I not
be the PC of the instruction causing the fault.
``sfc'' and ``dfc'' (hex) are the 68020 source/destination function codes.
They should always be one.
``p0'' and ``p1'' are the VAX-like region registers.
They are of the form:
.sp
	<length> '@' <kernel VA>
.sp
where both are in hex.
Following these values is a dump of the processor registers (hex).
Finally, there is a dump of the stack (user/kernel) at the time of the offense.
.TP
.B init died
The system initialization process has exited.  This is bad news, as no new
users will then be able to log in.  Rebooting is the only fix, so the
system just does it right away.
.TP
.B out of mbufs: map full
The network has exhausted its private page map for network buffers.
This usually indicates that buffers are being lost, and rather than
allow the system to slowly degrade, it reboots immediately.
The map may be made larger if necessary.
.PP
That completes the list of panic types you are likely to see.
.PP
When the system crashes it writes (or at least attempts to write)
an image of memory into the back end of the dump device,
usually the same as the primary swap
area.  After the system is rebooted, the program
.IR savecore (8)
runs and preserves a copy of this core image and the current
system in a specified directory for later perusal.  See
.IR savecore (8)
for details.
.PP
To analyze a dump you should begin by running
.IR adb (1)
with the
.B \-k
flag on the system load image and core dump.
If the core image is the result of a panic,
the panic message is printed.
Normally the command
``$c''
will provide a stack trace from the point of
the crash and this will provide a clue as to
what went wrong.
A more complete discussion
of system debugging is impossible here.
See, however,
``Using ADB to Debug the UNIX Kernel''.
.SH "SEE ALSO"
adb(1),
reboot(8)
.br
.I "MC68020 32-bit Microprocessor User's Manual"
.br
.I "Using ADB to Debug the UNIX Kernel"
.br
.I "4.3BSD for the HP300"
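.SH EXAMPLES
The layout of the ``trap type %d, code = %x, v = %x'' message described
above can be illustrated with a small off-line decoder.
This is only a sketch: the function name decode_trap and the sample
values are hypothetical and are not part of the kernel or of adb;
only the trap type table and the high/low 16-bit split of ``code''
are taken from the description above.
.sp
.nf
```python
# Hypothetical helper for reading a "trap type" panic line by hand.
# The trap names mirror the table in DESCRIPTION.
TRAP_TYPES = {
    0: "bus error",
    1: "address error",
    2: "illegal instruction",
    3: "divide by zero",
    4: "chk instruction",
    5: "trapv instruction",
    6: "privileged instruction",
    7: "trace trap",
    8: "MMU fault",
    9: "simulated software interrupt",
    10: "format error",
    11: "FP coprocessor fault",
    12: "coprocessor fault",
    13: "simulated AST",
}


def decode_trap(trap_type, code, v):
    """Split the three values printed by the trap panic message."""
    return {
        "trap": TRAP_TYPES.get(trap_type, "unknown"),
        # High 16 bits of code: MMU status register.
        "mmu_status": (code >> 16) & 0xFFFF,
        # Low 16 bits of code: 68020 special status word.
        "ssw": code & 0xFFFF,
        # v is the faulting virtual address, passed through unchanged.
        "fault_va": v,
    }


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    info = decode_trap(8, 0x00040145, 0x1C)
    print(info["trap"], hex(info["mmu_status"]),
          hex(info["ssw"]), hex(info["fault_va"]))
```
.fi
.sp
Interpreting the individual bits of the two 16-bit fields still requires
<hp300/cpu.h> and the 68020 manual.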