Annotation of 43BSDReno/share/man/man8/man8.hp300/crash.8, revision 1.1.1.1

1.1       root        1: .\" Copyright (c) 1990 Regents of the University of California.
                      2: .\" All rights reserved.  The Berkeley software License Agreement
                      3: .\" specifies the terms and conditions for redistribution.
                      4: .\"
                      5: .\"    @(#)crash.8     5.1 (Berkeley) 6/29/90
                      6: .\"
                      7: .TH CRASH 8 "June 29, 1990"
                      8: .UC 7
                      9: .SH NAME
                     10: crash \- what happens when the system crashes
                     11: .SH DESCRIPTION
                     12: This section explains what happens when the system crashes
                     13: and (very briefly) how to analyze crash dumps.
                     14: .PP
                     15: When the system crashes voluntarily it prints a message of the form
                     16: .IP
                     17: panic: why i gave up the ghost
                     18: .LP
                     19: on the console, takes a dump on a mass storage peripheral,
                     20: and then invokes an automatic reboot procedure as
                     21: described in
                     22: .IR reboot (8).
                     23: Unless some unexpected inconsistency is encountered in the state
                     24: of the file systems due to hardware or software failure, the system
                     25: will then resume multi-user operations.
                     26: .PP
                     27: The system has a large number of internal consistency checks; if one
                     28: of these fails, then it will panic with a very short message indicating
                     29: which one failed.
                     30: In many instances, this will be the name of the routine which detected
                     31: the error, or a two-word description of the inconsistency.
                     32: A full understanding of most panic messages requires perusal of the
                     33: source code for the system.
                     34: .PP
                     35: The most common cause of system failures is hardware failure, which
                     36: can manifest itself in different ways.  Here are the messages which
                     37: are most likely, with some hints as to causes.
                     38: Left unstated in all cases is the possibility that hardware or software
                     39: error produced the message in some unexpected way.
                     40: .TP
                     41: .B iinit
                     42: This cryptic panic message results from a failure to mount the root filesystem
                     43: during the bootstrap process.
                     44: Either the root filesystem has been corrupted,
                     45: or the system is attempting to use the wrong device as the root filesystem.
                     46: Usually, an alternate copy of the system binary or an alternate root
                     47: filesystem can be used to bring up the system to investigate.
                     48: .TP
                     49: .B Can't exec /etc/init
                     50: This is not a panic message, as reboots are likely to be futile.
                     51: Late in the bootstrap procedure, the system was unable to locate
                     52: and execute the initialization process,
                     53: .IR init (8).
                     54: The root filesystem is incorrect or has been corrupted, or the mode
                     55: or type of /etc/init forbids execution.
                     56: .TP
                     57: .B IO err in push
                     58: .ns
                     59: .TP
                     60: .B hard IO err in swap
                     61: The system encountered an error trying to write to the paging device
                     62: or an error in reading critical information from a disk drive.
                     63: The offending disk should be fixed if it is broken or unreliable.
                     64: .TP
                     65: .B realloccg: bad optim
                     66: .ns
                     67: .TP
                     68: .B ialloc: dup alloc
                     69: .ns
                     70: .TP
                     71: .B alloccgblk: cyl groups corrupted
                     72: .ns
                     73: .TP
                     74: .B ialloccg: map corrupted
                     75: .ns
                     76: .TP
                     77: .B free: freeing free block
                     78: .ns
                     79: .TP
                     80: .B free: freeing free frag
                     81: .ns
                     82: .TP
                     83: .B ifree: freeing free inode
                     84: .ns
                     85: .TP
                     86: .B alloccg: map corrupted
                     87: These panic messages are among those that may be produced
                     88: when filesystem inconsistencies are detected.
                     89: The problem generally results from a failure to repair damaged filesystems
                     90: after a crash, hardware failures, or other conditions that should not
                     91: normally occur.
                     92: A filesystem check will normally correct the problem.
                     93: .TP
                     94: .B timeout table overflow
                     95: .ns
                     96: This really shouldn't be a panic, but until the data structure
                     97: involved is made to be extensible, running out of entries causes a crash.
                     98: If this happens, make the timeout table bigger.
                     99: .TP
                    100: .B "trap type %d, code = %x, v = %x"
                    101: An unexpected trap has occurred within the system; the trap types are:
                    102: .sp
                    103: .nf
                    104: 0      bus error
                    105: 1      address error
                    106: 2      illegal instruction
                    107: 3      divide by zero
                    108: 4      \fIchk\fP instruction
                    109: 5      \fItrapv\fP instruction
                    110: 6      privileged instruction
                    111: 7      trace trap
                    112: 8      MMU fault
                    113: 9      simulated software interrupt
                    114: 10     format error
                    115: 11     FP coprocessor fault
                    116: 12     coprocessor fault
                    117: 13     simulated AST
                    118: .fi
                    119: .sp
                    120: The favorite trap type in system crashes is trap type 8,
                    121: indicating a wild reference.
                    122: ``code'' (hex) is the concatenation of the MMU status register
                    123: (see <hp300/cpu.h>)
                    124: in the high 16 bits and the 68020 special status word
                    125: (see the 68020 manual, page 6-17)
                    126: in the low 16.
                    127: ``v'' (hex) is the virtual address which caused the fault.
                    128: Additionally, the kernel will dump about a screenful of semi-useful
                    129: information.
                    130: ``pid'' (decimal) is the process id of the process running at the
                    131: time of the exception.
                    132: Note that if we panic in an interrupt routine,
                    133: this process may not be related to the panic.
                    134: ``ps'' (hex) is the 68020 processor status register ``ps''.
                    135: ``pc'' (hex) is the value of the program counter saved
                    136: on the hardware exception frame.
                    137: It may
                    138: .I not
                    139: be the PC of the instruction causing the fault.
                    140: ``sfc'' and ``dfc'' (hex) are the 68020 source/destination function codes.
                    141: They should always be one.
                    142: ``p0'' and ``p1'' are the VAX-like region registers.
                    143: They are of the form:
                    144: .sp
                    145:        <length> '@' <kernel VA>
                    146: .sp
                    147: where both are in hex.
                    148: Following these values is a dump of the processor registers (hex).
                    149: Finally, there is a dump of the stack (user/kernel) at the time of the offense.
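As an illustration of the trap message fields described above, the following C fragment is a minimal sketch, not kernel code. It maps a trap type to the name given in the table and splits ``code'' into its two 16-bit halves. The function names trap_name, mmu_status and special_status are hypothetical, chosen only for this example, and the sample value in the usage note is made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Trap type names, copied from the table in this manual page. */
static const char *const trap_names[] = {
    "bus error",                    /* 0 */
    "address error",                /* 1 */
    "illegal instruction",          /* 2 */
    "divide by zero",               /* 3 */
    "chk instruction",              /* 4 */
    "trapv instruction",            /* 5 */
    "privileged instruction",       /* 6 */
    "trace trap",                   /* 7 */
    "MMU fault",                    /* 8 */
    "simulated software interrupt", /* 9 */
    "format error",                 /* 10 */
    "FP coprocessor fault",         /* 11 */
    "coprocessor fault",            /* 12 */
    "simulated AST",                /* 13 */
};

/* Hypothetical helper: name for a trap type, or "unknown" if out of range. */
const char *trap_name(int type)
{
    size_t n = sizeof trap_names / sizeof trap_names[0];

    if (type < 0 || (size_t)type >= n)
        return "unknown";
    return trap_names[type];
}

/* Hypothetical helper: high 16 bits of ``code'', the MMU status register. */
unsigned mmu_status(uint32_t code)
{
    return (unsigned)((code >> 16) & 0xffffu);
}

/* Hypothetical helper: low 16 bits of ``code'', the 68020 special status word. */
unsigned special_status(uint32_t code)
{
    return (unsigned)(code & 0xffffu);
}
```

For a made-up message such as ``trap type 8, code = 8a040145, v = 0'', trap_name(8) yields "MMU fault", mmu_status(0x8a040145) yields 0x8a04 and special_status(0x8a040145) yields 0x0145. Interpreting the individual bits of either half still requires <hp300/cpu.h> and the 68020 manual.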
                    150: .TP
                    151: .B init died
                    152: The system initialization process has exited.  This is bad news, as no new
                    153: users will then be able to log in.  Rebooting is the only fix, so the
                    154: system just does it right away.
                    155: .TP
                    156: .B out of mbufs: map full
                    157: The network has exhausted its private page map for network buffers.
                    158: This usually indicates that buffers are being lost, and rather than
                    159: allow the system to slowly degrade, it reboots immediately.
                    160: The map may be made larger if necessary.
                    161: .PP
                    162: That completes the list of panic types you are likely to see.
                    163: .PP
                    164: When the system crashes it writes (or at least attempts to write)
                    165: an image of memory into the back end of the dump device,
                    166: usually the same as the primary swap
                    167: area.  After the system is rebooted, the program
                    168: .IR savecore (8)
                    169: runs and preserves a copy of this core image and the current
                    170: system in a specified directory for later perusal.  See
                    171: .IR savecore (8)
                    172: for details.
                    173: .PP
                    174: To analyze a dump you should begin by running
                    175: .IR adb (1)
                    176: with the 
                    177: .B \-k
                    178: flag on the system load image and core dump.
                    179: If the core image is the result of a panic,
                    180: the panic message is printed.
                    181: Normally the command
                    182: ``$c''
                    183: will provide a stack trace from the point of
                    184: the crash and this will provide a clue as to
                    185: what went wrong.
                    186: A more complete discussion
                    187: of system debugging is impossible here.
                    188: See, however,
                    189: ``Using ADB to Debug the UNIX Kernel''.
                    190: .SH "SEE ALSO"
                    191: adb(1),
                    192: reboot(8)
                    193: .br
                    194: .I "MC68020 32-bit Microprocessor User's Manual"
                    195: .br
                    196: .I "Using ADB to Debug the UNIX Kernel"
                    197: .br
                    198: .I "4.3BSD for the HP300"
