Annotation of 43BSDTahoe/man/man8/vax/crash.8, revision 1.1

1.1     ! root        1: .\" Copyright (c) 1980 Regents of the University of California.
        !             2: .\" All rights reserved.  The Berkeley software License Agreement
        !             3: .\" specifies the terms and conditions for redistribution.
        !             4: .\"
        !             5: .\"    @(#)crash.8v    6.2 (Berkeley) 5/20/86
        !             6: .\"
        !             7: .TH CRASH 8V "May 20, 1986"
        !             8: .UC 4
        !             9: .SH NAME
        !            10: crash \- what happens when the system crashes
        !            11: .SH DESCRIPTION
        !            12: This section explains what happens when the system crashes
        !            13: and (very briefly) how to analyze crash dumps.
        !            14: .PP
        !            15: When the system crashes voluntarily it prints a message of the form
        !            16: .IP
        !            17: panic: why i gave up the ghost
        !            18: .LP
        !            19: on the console, takes a dump on a mass storage peripheral,
        !            20: and then invokes an automatic reboot procedure as
        !            21: described in
        !            22: .IR reboot (8).
        !            23: (If auto-reboot is disabled on the front panel of the machine the system
        !            24: will simply halt at this point.)
        !            25: Unless some unexpected inconsistency is encountered in the state
        !            26: of the file systems due to hardware or software failure, the system
        !            27: will then resume multi-user operations.
        !            28: .PP
        !            29: The system has a large number of internal consistency checks; if one
        !            30: of these fails, then it will panic with a very short message indicating
        !            31: which one failed.
        !            32: In many instances, this will be the name of the routine which detected
        !            33: the error, or a two-word description of the inconsistency.
        !            34: A full understanding of most panic messages requires perusal of the
        !            35: source code for the system.
        !            36: .PP
        !            37: The most common cause of system failures is hardware failure, which
        !            38: can reflect itself in different ways.  Here are the messages which
        !            39: are most likely, with some hints as to causes.
        !            40: Left unstated in all cases is the possibility that hardware or software
        !            41: error produced the message in some unexpected way.
        !            42: .TP
        !            43: .B iinit
        !            44: This cryptic panic message results from a failure to mount the root filesystem
        !            45: during the bootstrap process.
        !            46: Either the root filesystem has been corrupted,
        !            47: or the system is attempting to use the wrong device as root filesystem.
        !            48: Usually, an alternate copy of the system binary or an alternate root
        !            49: filesystem can be used to bring up the system to investigate.
        !            50: .TP
        !            51: .B Can't exec /etc/init
        !            52: This is not a panic message, as reboots are likely to be futile.
        !            53: Late in the bootstrap procedure, the system was unable to locate
        !            54: and execute the initialization process,
        !            55: .IR init (8).
        !            56: The root filesystem is incorrect or has been corrupted, or the mode
        !            57: or type of /etc/init forbids execution.
        !            58: .TP
        !            59: .B IO err in push
        !            60: .ns
        !            61: .TP
        !            62: .B hard IO err in swap
        !            63: The system encountered an error trying to write to the paging device
        !            64: or an error in reading critical information from a disk drive.
        !            65: The offending disk should be fixed if it is broken or unreliable.
        !            66: .TP
        !            67: .B realloccg: bad optim
        !            68: .ns
        !            69: .TP
        !            70: .B ialloc: dup alloc
        !            71: .ns
        !            72: .TP
        !            73: .B alloccgblk: cyl groups corrupted
        !            74: .ns
        !            75: .TP
        !            76: .B ialloccg: map corrupted
        !            77: .ns
        !            78: .TP
        !            79: .B free: freeing free block
        !            80: .ns
        !            81: .TP
        !            82: .B free: freeing free frag
        !            83: .ns
        !            84: .TP
        !            85: .B ifree: freeing free inode
        !            86: .ns
        !            87: .TP
        !            88: .B alloccg: map corrupted
        !            89: These panic messages are among those that may be produced
        !            90: when filesystem inconsistencies are detected.
        !            91: The problem generally results from a failure to repair damaged filesystems
        !            92: after a crash, hardware failures, or other conditions that should not
        !            93: normally occur.
        !            94: A filesystem check will normally correct the problem.
        !            95: .TP
        !            96: .B timeout table overflow
        !            97: .ns
        !            98: This really shouldn't be a panic, but until the data structure
        !            99: involved is made to be extensible, running out of entries causes a crash.
        !           100: If this happens, make the timeout table bigger.
        !           101: .TP
        !           102: .B KSP not valid
        !           103: .ns
        !           104: .TP
        !           105: .B SBI fault
        !           106: .ns
        !           107: .TP
        !           108: .B CHM? in kernel
        !           109: These indicate either a serious bug in the system or, more often,
        !           110: a glitch or failing hardware.
        !           111: If SBI faults recur, check out the hardware or call
        !           112: field service.  If the other faults recur, there is likely a bug somewhere
        !           113: in the system, although these can be caused by a flaky processor.
        !           114: Run processor microdiagnostics.
        !           115: .TP
        !           116: .B machine check %x:
        !           117: .I description
        !           118: .ns
        !           119: .TP
        !           120: .I \0\0\0machine dependent machine-check information
        !           121: .ns
        !           122: Machine checks are different on each type of CPU.
        !           123: Most of the internal processor registers are saved at the time of the fault
        !           124: and are printed on the console.
        !           125: For most processors, there is one line that summarizes the type of machine
        !           126: check.
        !           127: Often, the nature of the problem is apparent from this message
        !           128: and/or the contents of key registers.
        !           129: The VAX Hardware Handbook should be consulted,
        !           130: and, if necessary, your friendly field service people should be informed
        !           131: of the problem.
        !           132: .TP
        !           133: .B trap type %d, code=%x, pc=%x
        !           134: An unexpected trap has occurred within the system; the trap types are:
        !           135: .sp
        !           136: .nf
        !           137: 0      reserved addressing fault
        !           138: 1      privileged instruction fault
        !           139: 2      reserved operand fault
        !           140: 3      bpt instruction fault
        !           141: 4      xfc instruction fault
        !           142: 5      system call trap
        !           143: 6      arithmetic trap
        !           144: 7      ast delivery trap
        !           145: 8      segmentation fault
        !           146: 9      protection fault
        !           147: 10     trace trap
        !           148: 11     compatibility mode fault
        !           149: 12     page fault
        !           150: 13     page table fault
        !           151: .fi
        !           152: .sp
        !           153: The favorite trap types in system crashes are trap types 8 and 9,
        !           154: indicating
        !           155: a wild reference.  The code is the referenced address, and the pc at the
        !           156: time of the fault is printed.  These problems tend to be easy to track
        !           157: down if they are kernel bugs since the processor stops cold, but random
        !           158: flakiness seems to cause this sometimes.
        !           159: The debugger can be used to locate the instruction and subroutine
        !           160: corresponding to the PC value.
        !           161: If that is insufficient to suggest the nature of the problem,
        !           162: more detailed examination of the system status at the time of the trap
        !           163: usually can produce an explanation.
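As an illustration of reading the trap message above, the following is a minimal sketch in C, not part of the system itself. It assumes the console line follows the format shown in this entry (trap type %d, code=%x, pc=%x). The function names trap_name and parse_trap_line are hypothetical helpers introduced only for this example; the trap names are copied from the table above.

```c
#include <stddef.h>
#include <stdio.h>

/* Trap type names indexed by trap number, copied from the table above. */
static const char *const trap_names[] = {
    "reserved addressing fault",   /* 0 */
    "privileged instruction fault",/* 1 */
    "reserved operand fault",      /* 2 */
    "bpt instruction fault",       /* 3 */
    "xfc instruction fault",       /* 4 */
    "system call trap",            /* 5 */
    "arithmetic trap",             /* 6 */
    "ast delivery trap",           /* 7 */
    "segmentation fault",          /* 8 */
    "protection fault",            /* 9 */
    "trace trap",                  /* 10 */
    "compatibility mode fault",    /* 11 */
    "page fault",                  /* 12 */
    "page table fault"             /* 13 */
};

#define NTRAPS (sizeof trap_names / sizeof trap_names[0])

/* Hypothetical helper: map a trap type number to its description. */
const char *trap_name(int type)
{
    if (type < 0 || (size_t)type >= NTRAPS)
        return "unknown trap type";
    return trap_names[type];
}

/*
 * Hypothetical helper: parse a console line of the assumed form
 * "trap type %d, code=%x, pc=%x".  Spaces around '=' are tolerated.
 * Returns 1 if all three fields were read, 0 otherwise.
 */
int parse_trap_line(const char *line, int *type,
                    unsigned int *code, unsigned int *pc)
{
    return sscanf(line, "trap type %d, code = %x, pc = %x",
                  type, code, pc) == 3;
}
```

For trap types 8 and 9, the parsed code field is the referenced address and the pc field is the value to look up with the debugger.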
        !           164: .TP
        !           165: .B init died
        !           166: The system initialization process has exited.  This is bad news, as no new
        !           167: users will then be able to log in.  Rebooting is the only fix, so the
        !           168: system just does it right away.
        !           169: .TP
        !           170: .B out of mbufs: map full
        !           171: The network has exhausted its private page map for network buffers.
        !           172: This usually indicates that buffers are being lost, and rather than
        !           173: allow the system to slowly degrade, it reboots immediately.
        !           174: The map may be made larger if necessary.
        !           175: .PP
        !           176: That completes the list of panic types you are likely to see.
        !           177: .PP
        !           178: When the system crashes it writes (or at least attempts to write)
        !           179: an image of memory into the back end of the dump device,
        !           180: usually the same as the primary swap
        !           181: area.  After the system is rebooted, the program
        !           182: .IR savecore (8)
        !           183: runs and preserves a copy of this core image and the current
        !           184: system in a specified directory for later perusal.  See
        !           185: .IR savecore (8)
        !           186: for details.
        !           187: .PP
        !           188: To analyze a dump you should begin by running
        !           189: .IR adb (1)
        !           190: with the 
        !           191: .B \-k
        !           192: flag on the system load image and core dump.
        !           193: If the core image is the result of a panic,
        !           194: the panic message is printed.
        !           195: Normally the command
        !           196: ``$c''
        !           197: will provide a stack trace from the point of
        !           198: the crash and this will provide a clue as to
        !           199: what went wrong.
        !           200: A more complete discussion
        !           201: of system debugging is impossible here.
        !           202: See, however,
        !           203: ``Using ADB to Debug the UNIX Kernel''.
        !           204: .SH "SEE ALSO"
        !           205: adb(1),
        !           206: reboot(8)
        !           207: .br
        !           208: .I "VAX 11/780 System Maintenance Guide"
        !           209: and
        !           210: .I "VAX Hardware Handbook"
        !           211: for more information about machine checks.
        !           212: .br
        !           213: .I "Using ADB to Debug the UNIX Kernel"
