.\" Copyright (c) 1980 Regents of the University of California.
.\" All rights reserved.  The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\"	@(#)crash.8v	6.2 (Berkeley) 5/20/86
.\"
.TH CRASH 8V "May 20, 1986"
.UC 4
.SH NAME
crash \- what happens when the system crashes
.SH DESCRIPTION
This section explains what happens when the system crashes
and (very briefly) how to analyze crash dumps.
.PP
When the system crashes voluntarily it prints a message of the form
.IP
panic: why i gave up the ghost
.LP
on the console, takes a dump on a mass storage peripheral,
and then invokes an automatic reboot procedure as
described in
.IR reboot (8).
(If auto-reboot is disabled on the front panel of the machine the system
will simply halt at this point.)
Unless some unexpected inconsistency is encountered in the state
of the file systems due to hardware or software failure, the system
will then resume multi-user operations.
.PP
The system has a large number of internal consistency checks; if one
of these fails, then it will panic with a very short message indicating
which one failed.
In many instances, this will be the name of the routine which detected
the error, or a two-word description of the inconsistency.
A full understanding of most panic messages requires perusal of the
source code for the system.
.PP
The most common cause of system failures is hardware failure, which
can reflect itself in different ways.  Here are the messages which
are most likely, with some hints as to causes.
Left unstated in all cases is the possibility that hardware or software
error produced the message in some unexpected way.
.TP
.B iinit
This cryptic panic message results from a failure to mount the root filesystem
during the bootstrap process.
Either the root filesystem has been corrupted,
or the system is attempting to use the wrong device as root filesystem.
Usually, an alternate copy of the system binary or an alternate root
filesystem can be used to bring up the system to investigate.
.TP
.B Can't exec /etc/init
This is not a panic message, as reboots are likely to be futile.
Late in the bootstrap procedure, the system was unable to locate
and execute the initialization process,
.IR init (8).
The root filesystem is incorrect or has been corrupted, or the mode
or type of /etc/init forbids execution.
.TP
.B IO err in push
.ns
.TP
.B hard IO err in swap
The system encountered an error trying to write to the paging device
or an error in reading critical information from a disk drive.
The offending disk should be fixed if it is broken or unreliable.
.TP
.B realloccg: bad optim
.ns
.TP
.B ialloc: dup alloc
.ns
.TP
.B alloccgblk: cyl groups corrupted
.ns
.TP
.B ialloccg: map corrupted
.ns
.TP
.B free: freeing free block
.ns
.TP
.B free: freeing free frag
.ns
.TP
.B ifree: freeing free inode
.ns
.TP
.B alloccg: map corrupted
These panic messages are among those that may be produced
when filesystem inconsistencies are detected.
The problem generally results from a failure to repair damaged filesystems
after a crash, hardware failures, or other conditions that should not
normally occur.
A filesystem check will normally correct the problem.
.TP
.B timeout table overflow
.ns
This really shouldn't be a panic, but until the data structure
involved is made to be extensible, running out of entries causes a crash.
If this happens, make the timeout table bigger.
.TP
.B KSP not valid
.ns
.TP
.B SBI fault
.ns
.TP
.B CHM? in kernel
These indicate either a serious bug in the system or, more often,
a glitch or failing hardware.
If SBI faults recur, check out the hardware or call
field service.  If the other faults recur, there is likely a bug somewhere
in the system, although these can be caused by a flaky processor.
Run processor microdiagnostics.
.TP
.B machine check %x:
.I description
.ns
.TP
.I \0\0\0machine dependent machine-check information
.ns
Machine checks are different on each type of CPU.
Most of the internal processor registers are saved at the time of the fault
and are printed on the console.
For most processors, there is one line that summarizes the type of machine
check.
Often, the nature of the problem is apparent from this message
and/or the contents of key registers.
The VAX Hardware Handbook should be consulted,
and, if necessary, your friendly field service people should be informed
of the problem.
.TP
.B trap type %d, code=%x, pc=%x
An unexpected trap has occurred within the system; the trap types are:
.sp
.nf
0	reserved addressing fault
1	privileged instruction fault
2	reserved operand fault
3	bpt instruction fault
4	xfc instruction fault
5	system call trap
6	arithmetic trap
7	ast delivery trap
8	segmentation fault
9	protection fault
10	trace trap
11	compatibility mode fault
12	page fault
13	page table fault
.fi
.sp
The favorite trap types in system crashes are trap types 8 and 9,
indicating
a wild reference.  The code is the referenced address, and the pc at the
time of the fault is printed.  These problems tend to be easy to track
down if they are kernel bugs since the processor stops cold, but random
flakiness seems to cause this sometimes.
The debugger can be used to locate the instruction and subroutine
corresponding to the PC value.
If that is insufficient to suggest the nature of the problem,
more detailed examination of the system status at the time of the trap
usually can produce an explanation.
.TP
.B init died
The system initialization process has exited.  This is bad news, as no new
users will then be able to log in.  Rebooting is the only fix, so the
system just does it right away.
.TP
.B out of mbufs: map full
The network has exhausted its private page map for network buffers.
This usually indicates that buffers are being lost, and rather than
allow the system to slowly degrade, it reboots immediately.
The map may be made larger if necessary.
.PP
That completes the list of panic types you are likely to see.
.PP
When the system crashes it writes (or at least attempts to write)
an image of memory into the back end of the dump device,
usually the same as the primary swap
area.  After the system is rebooted, the program
.IR savecore (8)
runs and preserves a copy of this core image and the current
system in a specified directory for later perusal.  See
.IR savecore (8)
for details.
.PP
To analyze a dump you should begin by running
.IR adb (1)
with the
.B \-k
flag on the system load image and core dump.
If the core image is the result of a panic,
the panic message is printed.
Normally the command
``$c''
will provide a stack trace from the point of
the crash and this will provide a clue as to
what went wrong.
A more complete discussion
of system debugging is impossible here.
See, however,
``Using ADB to Debug the UNIX Kernel''.
.SH "SEE ALSO"
adb(1),
reboot(8)
.br
.I "VAX 11/780 System Maintenance Guide"
and
.I "VAX Hardware Handbook"
for more information about machine checks.
.br
.I "Using ADB to Debug the UNIX Kernel"
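.SH EXAMPLES
The trap type table in the DESCRIPTION section lends itself to a small
lookup for decoding saved console output.
What follows is a minimal sketch in Python and is not part of the system.
The names TRAP_TYPES, TRAP_LINE and decode_trap, and the sample console
line, are hypothetical; only the message format and the trap numbers are
taken from this page.
.nf
```python
import re

# Trap type numbers and names, copied from the table in DESCRIPTION.
TRAP_TYPES = {
    0: "reserved addressing fault",
    1: "privileged instruction fault",
    2: "reserved operand fault",
    3: "bpt instruction fault",
    4: "xfc instruction fault",
    5: "system call trap",
    6: "arithmetic trap",
    7: "ast delivery trap",
    8: "segmentation fault",
    9: "protection fault",
    10: "trace trap",
    11: "compatibility mode fault",
    12: "page fault",
    13: "page table fault",
}

# Matches the documented format "trap type %d, code=%x, pc=%x".
# The %x fields are assumed to be bare hexadecimal with no 0x prefix.
TRAP_LINE = re.compile(
    "trap type ([0-9]+), code=([0-9a-fA-F]+), pc=([0-9a-fA-F]+)"
)


def decode_trap(line):
    """Return (trap name, code, pc) for a console trap line, or None."""
    match = TRAP_LINE.search(line)
    if match is None:
        return None
    number = int(match.group(1))
    name = TRAP_TYPES.get(number, "unknown trap type")
    return name, int(match.group(2), 16), int(match.group(3), 16)


if __name__ == "__main__":
    # Hypothetical console line; the code and pc values are made up.
    print(decode_trap("trap type 9, code=80001234, pc=8000abcd"))
```
.fi
For trap types 8 and 9 the decoded code field is the wild address that was
referenced, and the pc field is the value to look up with the debugger.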