Annotation of 43BSDReno/sys/hp300/DOC/debug, revision 1.1.1.1

1.1       root        1: Some quick notes on the HPBSD VM layout and kernel debugging.
                      2: 
                      3: Physical memory:
                      4: 
                      5: Physical memory always ends at the top of the 32-bit address space; i.e. the
                      6: last addressable byte is at 0xFFFFFFFF.  Hence, the start of physical memory
                      7: varies depending on how much memory is installed.  The kernel variable "lowram"
                      8: contains the starting location of memory as provided by the ROM.
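
For a feel of the arithmetic, here is a minimal Python sketch (not kernel
code).  It assumes memory is installed as one contiguous block ending at
0xFFFFFFFF; the helper name and the 8Mb figure are made up for illustration.
The real value of lowram is whatever the ROM reports.

```python
ADDRESS_SPACE = 1 << 32  # 32-bit physical address space, last byte at 0xFFFFFFFF

def expected_lowram(mem_bytes):
    """Start of physical memory if mem_bytes are installed contiguously
    below the top of the address space (hypothetical helper)."""
    if not 0 < mem_bytes <= ADDRESS_SPACE:
        raise ValueError("memory size out of range")
    return ADDRESS_SPACE - mem_bytes

# Assumed example: a machine with 8Mb installed.
print(hex(expected_lowram(8 * 1024 * 1024)))  # 0xff800000
```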
                      9: 
                     10: The low 128k (I think) of the physical address space is occupied by the ROM.
                     11: This is accessible via /dev/mem *only* if the kernel is compiled with DEBUG.
                     12: [ Maybe it should always be accessible? ]
                     13: 
                     14: Virtual address spaces:
                     15: 
                     16: The hardware page size is 4096 bytes.  The hardware uses a two-level lookup.
                     17: At the highest level is a one page segment table which maps a page table which
                     18: maps the address space.  Each 4 byte segment table entry (described in
                     19: hp300/pte.h) contains the page number of a single page of 4 byte page table
                     20: entries.  Each PTE maps a single page of address space.  Hence, each STE maps
                     21: 4Mb of address space and one page containing 1024 STEs is adequate to map the
                     22: entire 4Gb address space.
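
The two-level lookup described above amounts to cutting a 32-bit virtual
address into three fields.  A small Python sketch, assuming only the sizes
given in the text (4096-byte pages, 4-byte entries); the function and
constant names are invented for the example:

```python
PAGE_SHIFT = 12          # 4096-byte pages
ENTRIES_PER_PAGE = 1024  # 4096 bytes / 4-byte entries
SEG_SHIFT = 22           # each STE maps 1024 pages = 4Mb

def split_va(va):
    """Return (segment table index, page table index, byte offset)."""
    ste_index = va >> SEG_SHIFT
    pte_index = (va >> PAGE_SHIFT) & (ENTRIES_PER_PAGE - 1)
    offset = va & ((1 << PAGE_SHIFT) - 1)
    return ste_index, pte_index, offset

print(split_va(0x00001234))  # (0, 1, 564)
```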
                     23: 
                     24: Both page and segment table entries look similar.  Both have the page frame
                     25: in the upper part and control bits in the lower.  This is the opposite of
                     26: the VAX.  It is easy to convert the page frame number in an STE/PTE to a
                     27: physical address: simply mentally mask out the low 12 bits.  For example,
                     28: if a PTE contains 0xFF880019, the physical memory location mapped starts at
                     29: 0xFF880000.
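
The same masking, written out as a Python sketch.  The mask name is an
assumption chosen for the example, and no meaning is attached here to the
individual control bits (see hp300/pte.h for those).

```python
FRAME_MASK = 0xFFFFF000    # upper 20 bits: page frame (name assumed)
CONTROL_MASK = 0x00000FFF  # lower 12 bits: control bits

def entry_to_phys(entry):
    """Physical address of the page an STE or PTE points at."""
    return entry & FRAME_MASK

def entry_control_bits(entry):
    """The low 12 control bits, left uninterpreted."""
    return entry & CONTROL_MASK

print(hex(entry_to_phys(0xFF880019)))  # 0xff880000
```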
                     30: 
                     31: Kernel address space:
                     32: 
                     33: The kernel resides in its own virtual address space independent of all user
                     34: processes.  When the processor is in supervisor mode (i.e. interrupt or 
                     35: exception handling) it uses the kernel virtual mapping.  The kernel segment
                     36: table is called Sysseg and is allocated statically in hp300/locore.s.  The
                     37: kernel page table is called Systab and is also allocated statically in
                     38: hp300/locore.s and consists of the usual assortment of SYSMAPs.
                     39: The size of Systab (Syssize) depends on the configured size of the various
                     40: maps but as currently configured is 9216 PTEs.  Both segment and page tables
                     41: are initialized at bootup in hp300/locore.s.  The segment table never changes
                     42: (except for bits maintained by the hardware).  Portions of the page table
                     43: change as needed.  The kernel is mapped into the address space starting at 0.
                     44: 
                     45: Theoretically, any address in the range 0 to Syssize * 4096 (0x2400000 as
                     46: currently configured) is valid.  However, certain addresses are more common
                     47: in dumps than others.  Those are (for the current configuration):
                     48: 
                     49:        0         - 0x800000    kernel text and permanent data structures
                     50:        0x917000  - 0x91a000    u-area; 1st page is user struct, last k-stack
                     51:        0x1b1b000 - 0x2400000   user page tables, also kmem_alloc()ed data
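
When staring at addresses from a dump, a quick classifier for the ranges
above can help.  This Python sketch hard-codes the values quoted for the
current configuration, so it will be wrong for a differently configured
kernel.  Treating each range as half-open (start included, end excluded) is
an assumption, as are the names.

```python
PAGE_SIZE = 4096
SYSSIZE = 9216                      # PTEs in the kernel page table
KERNEL_LIMIT = SYSSIZE * PAGE_SIZE  # 0x2400000

# (start, end, description), taken from the table in the text
REGIONS = [
    (0x0000000, 0x0800000, "kernel text and permanent data"),
    (0x0917000, 0x091A000, "u-area"),
    (0x1B1B000, 0x2400000, "user page tables / kmem_alloc() data"),
]

def classify_kva(va):
    """Name the region a kernel VA falls in (hypothetical helper)."""
    if not 0 <= va < KERNEL_LIMIT:
        return "outside kernel map"
    for start, end, name in REGIONS:
        if start <= va < end:
            return name
    return "other kernel VA"

print(classify_kva(0x918000))  # u-area
```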
                     52: 
                     53: User address space:
                     54: 
                     55: The user text and data are loaded starting at VA 0.  The user's stack starts
                     56: at 0xFFF00000 and grows toward lower addresses.  The pages above the user
                     57: stack are used by the kernel.  From 0xFFF00000 to 0xFFF03000 is the u-area.
                     58: The 3 PTEs for this range map (read-only) the same memory as does 0x917000
                     59: to 0x91a000 in the kernel address space.  This address range is never used
                     60: by the kernel, but exists for utilities that assume that the u-area sits
                     61: above the user stack.  The pages from 0xFFF03000 up are not used.  They
                     62: exist so that the user stack is in the same location as in HPUX.
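
Since the user-space u-area pages alias the kernel ones, an address in one
range converts to the other by a fixed offset.  A Python sketch, assuming
the kernel u-area address quoted for the current configuration (0x917000);
the helper name is invented:

```python
PAGE_SIZE = 4096
USER_UAREA = 0xFFF00000    # u-area as seen from the user address space
KERNEL_UAREA = 0x917000    # same memory in the kernel map (current config)
UAREA_SIZE = 3 * PAGE_SIZE

def user_uarea_to_kernel(va):
    """Map a user-space u-area address to its kernel alias."""
    off = va - USER_UAREA
    if not 0 <= off < UAREA_SIZE:
        raise ValueError("address is not in the user u-area range")
    return KERNEL_UAREA + off

print(hex(user_uarea_to_kernel(0xFFF02FFC)))  # 0x919ffc
```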
                     63: 
                     64: The user segment table is allocated along with the page tables from Usrptmap.
                     65: They are contiguous in kernel VA space with the page tables coming before
                     66: the segment table.  Hence, a process has p_szpt+1 pages allocated starting
                     67: at kernel VA p_p0br.
                     68: 
                     69: The user segment table is typically very sparse since each entry maps 4Mb.
                     70: There are usually only two valid STEs, one at the start mapping the text/data
                     71: portion of the page table, and one at the end mapping the stack/u-area.  For
                     72: example if the segment table was at 0xFFFFA000 there would be valid entries
                     73: at 0xFFFFA000 and 0xFFFFAFFC.
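
The two addresses in that example fall out of the 4Mb-per-STE arithmetic.
A Python sketch (the function name is made up; the base address is the one
used in the example above):

```python
STE_SIZE = 4    # bytes per segment table entry
SEG_SHIFT = 22  # each STE maps 4Mb

def ste_address(segtab_base, va):
    """Address of the STE that covers virtual address va."""
    return segtab_base + (va >> SEG_SHIFT) * STE_SIZE

base = 0xFFFFA000
print(hex(ste_address(base, 0x00000000)))  # text/data: 0xffffa000
print(hex(ste_address(base, 0xFFF00000)))  # stack/u-area: 0xffffaffc
```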
                     74: 
                     75: Random notes:
                     76: 
                     77: An important thing to note is that there are no hardware length registers
                     78: on the HP.  This implies that we cannot "pack" data and stack PTEs into the
                     79: same page table page.  Hence, every user page table has at least 2 pages
                     80: (3 if you count the segment table).
                     81: 
                     82: The HP maintains the p0br/p0lr and p1br/p1lr PCB fields the same as the
                     83: VAX even though they have no meaning to the hardware.  This also keeps many
                     84: utilities happy.
                     85: 
                     86: There is no separate interrupt stack (right now) on the HPs.  Interrupt
                     87: processing is handled on the kernel stack of the "current" process.
                     88: 
                     89: Following is a list of things you might want to be able to do with a kernel
                     90: core dump.  One thing you should always have is a ps listing from the core
                     91: file.  Just do:
                     92: 
                     93:        ps klaw vmunix.? vmcore.?
                     94: 
                     95: Exception related panics (i.e. those detected in hp300/trap.c) will dump
                     96: out various useful information before panicking.  If available, you should
                     97: get this out of the /usr/adm/messages file.  Finally, you should be in adb:
                     98: 
                     99:        adb -k vmunix.? vmcore.?
                    100: 
                    101: Adb -k will allow you to examine the kernel address space more easily.
                    102: It automatically maps kernel VAs in the range 0 to 0x2400000 to physical
                    103: addresses.  Since the kernel and user address spaces overlap (i.e. both
                    104: start at 0), adb can't let you examine the address space of the "current"
                    105: process as it does on the VAX.
                    106: --------
                    107: 
                    108: 1. Find out what the current process was at the time of the crash:
                    109: 
                    110: If you have the dump info from /usr/adm/messages, it should contain the
                    111: PID of the active process.  If you don't have this info you can just look
                    112: at location "Umap".  This is the PTE for the first page of the u-area; i.e.
                    113: the user structure.  Forget about the last 3 hex digits and compare the top
                    114: 5 to the ADDR column in the ps listing.
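
The comparison can be done mechanically.  This Python sketch assumes the
ADDR column is a hex page frame number as described in the text; the exact
formatting ps uses is an assumption, so the comparison is made on integers
rather than strings.  The function name is invented.

```python
def umap_matches_addr(umap_pte, addr_column):
    """True if the frame in the Umap PTE equals a ps ADDR value (hex text)."""
    return (umap_pte >> 12) == int(addr_column, 16)

# Made-up values: a Umap PTE and two ADDR entries from a ps listing.
print(umap_matches_addr(0xFF880019, "ff880"))  # True
print(umap_matches_addr(0xFF880019, "ff881"))  # False
```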
                    115: 
                    116: 2. Locating a process' user structure:
                    117: 
                    118: Get the ADDR field of the desired process from the ps listing.  This is the
                    119: page frame number of the process' user structure.  Tack 3 zeros on to the
                    120: end to get the physical address.  Note that this doesn't give you the kernel
                    121: stack since it is in a different page than the user-structure and pages of
                    122: the u-area are not physically contiguous.
                    123: 
                    124: 3. Locating a process' proc structure:
                    125: 
                    126: First find the process' user structure as described above.  Find the u_procp
                    127: field at offset 0x200 from the beginning.  This gives you the kernel VA of
                    128: the proc structure.
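
If you have the user structure page as raw bytes, pulling out u_procp is a
single big-endian longword read (the 68020 is big-endian).  A Python sketch;
how the 4096 bytes are obtained from the dump is left to the caller, and the
helper names are assumptions.  The 0x200 offset is the one given above.

```python
import struct

U_PROCP_OFFSET = 0x200  # offset of u_procp, per the text

def read_long(buf, offset):
    """Read one big-endian 32-bit longword from buf."""
    return struct.unpack_from(">I", buf, offset)[0]

def proc_va_from_user_page(user_page):
    """Kernel VA of the proc structure, from the first u-area page."""
    return read_long(user_page, U_PROCP_OFFSET)
```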
                    129: 
                    130: 4. Locating a process' page table:
                    131: 
                    132: First find the process' user structure as described above.  The first part
                    133: of the user structure is the PCB.  The second longword (third field) of the
                    134: PCB is pcb_ustp, a pointer to the user segment table.  This pointer is
                    135: actually the page frame number.  Again adding 3 zeros yields the physical
                    136: address.  You can now use the values in the segment table to locate the
                    137: page tables.  For example, to locate the first page of the text/data part
                    138: of the page table, use the first STE (longword) in the segment table.
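
The same steps as a Python sketch, working on raw page contents.  It assumes
the caller has already read the user structure page and the segment table
page out of the dump; the helper names are made up, and the offset of
pcb_ustp (second longword) is taken from the text.

```python
import struct

PCB_USTP_OFFSET = 4      # second longword of the PCB, per the text
FRAME_MASK = 0xFFFFF000  # page frame part of an STE/PTE

def segtab_phys(user_page):
    """Physical address of the user segment table."""
    ustp = struct.unpack_from(">I", user_page, PCB_USTP_OFFSET)[0]
    return (ustp << 12) & 0xFFFFFFFF  # frame number -> address ("add 3 zeros")

def first_pt_page_phys(segtab_page):
    """Physical address of the first text/data page table page."""
    ste = struct.unpack_from(">I", segtab_page, 0)[0]
    return ste & FRAME_MASK
```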
                    139: 
                    140: 5. Locating a process' kernel stack:
                    141: 
                    142: First find the process' page table as described above.  The kernel stack
                    143: is near the end of the user address space.  So, locate the last entry in the
                    144: user segment table (base+0xFFC) and use that entry to find the last page of
                    145: the user page table.  Look at the last 256 entries of this page
                    146: (pagebase+0xC00).  The first is the PTE for the user-structure.  The second
                    147: was intended to be a read-only page to protect the user structure from the
                    148: kernel stack.  Currently it is read/write and actually allocated.  Hence
                    149: it can wind up being a second page for the kernel stack.  The third is the
                    150: kernel stack.  The last 253 should be zero.  Hence, indirecting through the
                    151: third of these last 256 PTEs will give you the kernel stack page.
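
A Python sketch of that last step, given the last page of the user page
table as 4096 raw bytes.  The function names are invented, and raising an
error when the trailing PTEs are non-zero is just one possible way to flag a
surprising page table.

```python
import struct

FRAME_MASK = 0xFFFFF000

def uarea_ptes(last_pt_page):
    """Return (user struct PTE, second PTE, kernel stack PTE)."""
    ptes = struct.unpack(">1024I", last_pt_page)  # 1024 big-endian PTEs
    tail = ptes[-256:]                            # the last 256 entries
    if any(tail[3:]):
        raise ValueError("expected the last 253 PTEs to be zero")
    return tail[0], tail[1], tail[2]

def kstack_phys(last_pt_page):
    """Physical address of the kernel stack page."""
    return uarea_ptes(last_pt_page)[2] & FRAME_MASK
```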
                    152: 
                    153: An alternate way to do this is to use the p_addr field of the proc structure
                    154: which is found as described above.  The p_addr field is at offset 0x10 in the
                    155: proc structure and points to the first of the PTEs mentioned above (i.e. the
                    156: user structure PTE).
                    157: 
                    158: 6. Interpreting the info in a "trap type N..." panic:
                    159: 
                    160: As mentioned, when the kernel crashes out of hp300/trap.c it will dump some
                    161: useful information.  This dates back to the days when I was debugging the
                    162: exception handling code and had no kernel adb or even kernel crash dump code.
                    163: "trap type" (decimal) is as defined in hp300/trap.h; it doesn't really
                    164: correlate with anything useful.  "code" (hex) is only useful for MMU
                    165: (trap type 8) errors.  It is the concatenation of the MMU status register
                    166: (see hp300/cpu.h) in the high 16 bits and the 68020 special status word
                    167: (see the 020 manual page 6-17) in the low 16.  "v" (hex) is the virtual
                    168: address which caused the fault.  "pid" (decimal) is the ID of the process
                    169: running at the time of the exception.  Note that if we panic in an interrupt
                    170: routine, this process may not be related to the panic.  "ps" (hex) is the
                    171: value of the 68020 status register (see page 1-4 of 020 manual) at the time
                    172: of the crash.  If the 0x2000 bit is on, we were in supervisor (kernel) mode
                    173: at the time, otherwise we were in user mode.  "pc" (hex) is the value of the
                    174: PC saved on the hardware exception frame.  It may *not* be the PC of the
                    175: instruction causing the fault (see the 020 manual for details).  The 0x2000
                    176: bit of "ps" dictates whether this is a kernel or user VA.  "sfc" and "dfc"
                    177: are the 68020 source/destination function codes.  They should always be one.
                    178: "p0" and "p1" are the VAX-like region registers.  They are of the form:
                    179: 
                    180:        <length> '@' <kernel VA>
                    181: 
                    182: where both are in hex.  Following these values are a dump of the processor
                    183: registers (hex).  Check the address registers for values close to "v", the
                    184: fault address.  Most faults are caused by dereferences of bogus pointers.
                    185: Most such dereferences are the result of 020 instructions using the:
                    186: 
                    187:        <address-register> '@' '(' offset ')'
                    188: 
                    189: addressing mode.  This can help you track down the faulting instruction (since
                    190: the PC may not point to it).  Note that the value of a7 (the stack pointer) is
                    191: ALWAYS the user SP.  This is brain-dead I know.  Finally, there is a dump of the
                    192: stack (user/kernel) at the time of the offense.  Before kernel crash dumps,
                    193: this was very useful.
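
Picking apart the numeric fields can be scripted.  A Python sketch covering
"code", "ps" and the p0/p1 pairs; the function names are made up, and the
exact spacing of the printed p0/p1 text is an assumption (surrounding blanks
are tolerated).  The individual bits of the MMU status register and special
status word are not decoded here.

```python
PS_SUPERVISOR = 0x2000  # supervisor bit of the 68020 status register

def decode_code(code):
    """Split "code" into (MMU status register, special status word)."""
    return (code >> 16) & 0xFFFF, code & 0xFFFF

def was_kernel_mode(ps):
    """True if the saved status register shows supervisor mode."""
    return bool(ps & PS_SUPERVISOR)

def parse_region(text):
    """Parse a "<length>@<kernel VA>" pair (both hex) into two ints."""
    length, va = text.split("@")
    return int(length, 16), int(va, 16)
```

For example, with made-up values, decode_code(0x00040145) gives
(0x0004, 0x0145) and was_kernel_mode(0x2704) is true, so "pc" and "v"
would be kernel addresses in that case.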
                    194: 
                    195: 7. Converting a kernel virtual address to a physical address:
                    196: 
                    197: Adb -k already does this for you, but sometimes you want to know what the
                    198: resulting physical address is rather than what is there.  Doing this is
                    199: simply a matter of indexing into the kernel page table.  In theory we would
                    200: first have to do a lookup in the kernel segment table, but we know that the
                    201: kernel page table is physically contiguous so this isn't necessary.  The
                    202: base of the system page table is "Sysmap", so to convert an address V just
                    203: divide the address by 4096 to get the page number, multiply that by 4 (the
                    204: size of a PTE in bytes) to get a byte offset, and add that to "Sysmap".
                    205: This gives you the address of the PTE mapping V.  You can then get the
                    206: physical address by masking out the low 12 bits of the contents of that PTE.
                    207: To wit:
                    208: 
                    209:        *(Sysmap+(VA%1000*4))&fffff000
                    210: 
                    211: where VA is the virtual address in question.
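
The same calculation spelled out as a Python sketch.  The table base is
passed in as a parameter; the 0x5000 used below is a made-up value, not the
real address of Sysmap.  Note that the adb expression above yields the
physical page base; adding the low 12 bits of VA back in, as the second
helper does, is an extra step for getting the exact byte address.

```python
PTE_SIZE = 4
PAGE_SHIFT = 12
FRAME_MASK = 0xFFFFF000

def pte_address(table_base, va):
    """Address of the PTE that maps va, given the page table base."""
    return table_base + (va >> PAGE_SHIFT) * PTE_SIZE

def phys_from_pte(pte, va):
    """Physical byte address for va, given the contents of its PTE."""
    return (pte & FRAME_MASK) | (va & 0xFFF)

print(hex(pte_address(0x5000, 0x917000)))        # 0x745c
print(hex(phys_from_pte(0xFF880019, 0x917123)))  # 0xff880123
```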
                    212: 
                    213: This technique should also work for user virtual addresses if you replace
                    214: "Sysmap" with the value of the appropriate process' P0BR.  This works
                    215: because a user's page table is *virtually* contiguous in the kernel
                    216: starting at P0BR, and adb will handle translating the kernel virtual addresses
                    217: for you.
