1: .\" Copyright (c) 1986 Regents of the University of California.
2: .\" All rights reserved. The Berkeley software License Agreement
3: .\" specifies the terms and conditions for redistribution.
4: .\"
5: .\" @(#)sys.ufs.t 1.5 (Berkeley) 4/11/86
6: .\"
7: .NH
8: Changes in the filesystem
9: .PP
10: The major change in the filesystem was the addition of a name translation
11: cache.
12: A table of recent name-to-inode translations is maintained by \fInamei\fP,
13: and used as a lookaside cache when translating each component of each
14: file pathname.
15: Each \fInamecache\fP entry contains the parent directory's device and inode,
16: the length of the name, and the name itself, and is hashed on the name.
17: It also contains a pointer to the inode for the file whose name it contains.
18: Unlike most inode pointers, which hold a ``hard'' reference
19: by incrementing the reference count,
20: the name cache holds a ``soft'' reference, a pointer to an inode
21: that may be reused.
22: In order to validate the inode from a name cache reference,
23: each inode is assigned a unique ``capability'' when it is brought
24: into memory.
25: When the inode entry is reused for another file,
26: or when the name of the file is changed,
27: this capability is changed.
28: This allows the inode cache to be handled normally,
29: releasing inodes at the head of the LRU list without regard for name
30: cache references,
31: and allows multiple names for the same inode to be in the cache simultaneously
32: without complicating the invalidation procedure.
33: An additional feature of this scheme is that when opening
34: a file, it is possible to determine whether the file was previously open.
35: This is useful when beginning execution of a file, to check whether
36: the file might be open for writing, and for similar situations.
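The soft-reference check described above can be sketched in a few lines of C.
This is a minimal illustration, not the kernel code: the structures are cut
down to the fields the check needs, and the names used here
(inode_new_capability, cache_enter, cache_lookup, next_id) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, simplified in-core inode: only the fields needed here. */
struct inode {
    unsigned long i_number;  /* inode number of the file currently held */
    long          i_id;      /* capability, replaced when the slot is reused */
};

/* Hypothetical name cache entry holding a soft reference. */
struct namecache {
    struct inode *nc_ip;     /* soft reference: no reference count is taken */
    long          nc_id;     /* capability recorded when the entry was made */
};

static long next_id = 1;     /* assumed source of fresh capabilities */

/* Give the inode slot a new capability, invalidating every cached name. */
void inode_new_capability(struct inode *ip)
{
    ip->i_id = next_id++;
}

/* Record a soft reference to ip in the cache entry. */
void cache_enter(struct namecache *ncp, struct inode *ip)
{
    ncp->nc_ip = ip;
    ncp->nc_id = ip->i_id;
}

/* Return the inode if the soft reference is still valid, else NULL. */
struct inode *cache_lookup(const struct namecache *ncp)
{
    if (ncp->nc_ip != NULL && ncp->nc_ip->i_id == ncp->nc_id)
        return ncp->nc_ip;
    return NULL;
}
```

Because validity is decided by comparing capabilities, reusing or renaming an
inode needs no walk over the cache entries that point at it.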
37: .PP
38: Other changes that are visible throughout the filesystem
39: include greater use of the ILOCK and IUNLOCK macros rather than the
40: subroutine equivalents.
41: If the IACC, IUPD or ICHG flags are set,
42: the inode times are updated on each \fIirele\fP,
43: not only when the reference count reaches zero.
44: This is accomplished with the ITIMES macro;
45: the inode is marked as modified with the new IMOD flag,
46: which causes it to be written to disk when released, or on the next sync.
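The time update can be pictured as a small function, written here in place
of the macro. This is a sketch under stated assumptions: the flag values are
illustrative rather than the kernel's bit assignments, the structure holds
only the fields involved, and the function name itimes is hypothetical.

```c
#include <time.h>

/* Flag values are illustrative, not the kernel's actual bit assignments. */
#define IUPD 0x01   /* file has been modified */
#define IACC 0x02   /* file has been accessed */
#define ICHG 0x04   /* inode has been changed */
#define IMOD 0x08   /* inode must be written back to disk */

/* Hypothetical, simplified inode. */
struct inode {
    int    i_flag;
    time_t i_atime, i_mtime, i_ctime;
};

/* Fold pending time updates into the inode and mark it modified. */
void itimes(struct inode *ip, time_t now)
{
    if ((ip->i_flag & (IUPD | IACC | ICHG)) == 0)
        return;                       /* nothing pending */
    ip->i_flag |= IMOD;               /* written on release or at next sync */
    if (ip->i_flag & IACC)
        ip->i_atime = now;
    if (ip->i_flag & IUPD)
        ip->i_mtime = now;
    if (ip->i_flag & ICHG)
        ip->i_ctime = now;
    ip->i_flag &= ~(IACC | IUPD | ICHG);
}
```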
47: .PP
48: The remainder of this section describes the filesystem changes that are
49: localized to individual files.
50: .XP ufs_alloc.c
51: The algorithm for extending file fragments was changed
52: to take advantage of the observation that fragments that were once extended
53: were frequently extended again, that is, that the file was being written
54: in fragments.
55: Therefore, the first time a given fragment is allocated,
56: a best-fit strategy is used.
57: Thereafter, when this fragment is to be extended,
58: a full-sized block is allocated, the fragment removed from it,
59: and the remainder freed for use in subsequent expansion.
60: As this policy may result in increased fragmentation,
61: it is not used when the filesystem becomes excessively
62: fragmented (i.e. when the number of free fragments falls to 2%
63: of the minfree value);
64: the policy is stored in the superblock and may be changed with \fItunefs\fP.
65: The \fIfserr\fP routine was converted to use \fIlog\fP rather than \fIprintf\fP.
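The choice between the two strategies might be sketched as follows.
The names are hypothetical, and the sketch assumes that an excessively
fragmented filesystem simply falls back to best fit; it shows only the
decision, not the allocator itself.

```c
/* Hypothetical names; the kernel's own identifiers are not given above. */
enum alloc_strategy { BEST_FIT, FULL_BLOCK };

/*
 * First allocation of a fragment: best fit.
 * Later extension: take a full block, unless the filesystem is
 * already too fragmented (assumed here to fall back to best fit).
 */
enum alloc_strategy choose_strategy(int is_extension, int too_fragmented)
{
    if (!is_extension || too_fragmented)
        return BEST_FIT;
    return FULL_BLOCK;
}

/* Fragments freed again after carving the needed ones out of a full block. */
int frags_returned(int frags_per_block, int frags_needed)
{
    return frags_per_block - frags_needed;
}
```

With eight fragments per block, for instance, extending a file to three
fragments leaves five free fragments adjacent to it for the next extension.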
66: .XP ufs_bio.c
67: I/O operations traced now include the size where relevant.
68: .XP ufs_inode.c
69: The size of the buffer hash table was increased substantially
70: and changed to a power of two to allow the modulus to be computed with a mask
71: operation.
72: \fIIget\fP invalidates the capability in each inode that is flushed
73: from the inode cache for reuse.
74: The new \fIigrab\fP routine is used instead of \fIiget\fP
75: when fetching an inode from a name cache reference;
76: it waits for the inode to be unlocked if necessary,
77: and removes it from the free list if it was free.
78: The caller must check that the inode is still valid after the \fIigrab\fP.
79: A bug was fixed in \fIitrunc\fP that allowed old contents to creep back into
80: a file.
81: When truncating to a location within a block,
82: \fIitrunc\fP must clear the remainder of the block.
83: Otherwise, if the file is extended by seeking past the end of file
84: and then writing, the old contents reappear.
85: .\" \fIItrunc\fP also waits for
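The \fIitrunc\fP fix amounts to zeroing the tail of the last block that is
kept. A user-level sketch, assuming a tiny illustrative block size and a
hypothetical helper name (clear_block_tail), might look like this:

```c
#include <stddef.h>
#include <string.h>

#define BSIZE 16  /* tiny illustrative block size, not a real filesystem's */

/* Zero the part of the final block that lies beyond the new file length. */
void clear_block_tail(unsigned char block[BSIZE], size_t new_length)
{
    size_t offset = new_length % BSIZE;   /* bytes still in use in the block */

    /* A length on a block boundary leaves no partial block to clear. */
    if (offset != 0)
        memset(block + offset, 0, BSIZE - offset);
}
```

Without this step, stale bytes past the new end of file stay in the block
and become visible again once the file grows over them.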
86: .XP ufs_mount.c
87: The \fImount\fP system call was modified to return different error numbers
88: for different types of errors.
89: \fIMount\fP now examines the superblock more carefully
90: before using the size field it contains as the amount to copy into a new buffer.
91: If a mount fails for a reason other than the device already being
92: mounted, the device is closed again.
93: When performing the name lookup for the mount point,
94: \fImount\fP must prevent the name translation from being left
95: in the name cache;
96: \fIumount\fP must flush all name translations for the device.
97: A bug in \fIgetmdev\fP caused an inode to remain locked
98: if the specified device was not a block special file; this has been fixed.
99: .XP ufs_namei.c
100: This file was previously called ufs_nami.c.
101: The \fInamei\fP function has a new calling convention
102: with its arguments, associated context, and side effects
103: encapsulated in a single structure.
104: It has been extensively modified to implement the name cache
105: and to cache directory offsets for each process.
106: It may now return ENAMETOOLONG when appropriate,
107: and returns EINVAL if the 8th bit is set on one of the pathname
108: characters.
109: Directories may be truncated if the last one or more blocks
110: contain no entries;
111: this is done when files are being created, since the entire directory
112: must be searched at that time anyway.
113: An entry is provided for invalidating the entire name cache
114: when the 32-bit prototype for capabilities wraps around.
115: This is expected to happen after 13 months of operation,
116: assuming 100 name lookups per second, all of which miss the cache.
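The two new error returns mentioned above can be illustrated with a check
on a single pathname component. This is one plausible form of the test, not
the \fInamei\fP code; the function name check_component is hypothetical and
the MAXNAMLEN value of 255 is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>

#define MAXNAMLEN 255  /* assumed component length limit for this sketch */

/*
 * Examine one pathname component (up to '/' or end of string).
 * Returns 0 if acceptable, EINVAL if a character has its 8th bit set,
 * or ENAMETOOLONG if the component exceeds MAXNAMLEN characters.
 */
int check_component(const char *name)
{
    size_t len;

    for (len = 0; name[len] != '\0' && name[len] != '/'; len++) {
        if ((unsigned char)name[len] & 0x80)
            return EINVAL;
    }
    return len > MAXNAMLEN ? ENAMETOOLONG : 0;
}
```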
117: .XP
118: A change in filesystem semantics is the introduction
119: of ``sticky'' directories.
120: If the ISVTX (sticky text) bit is set in the mode of a directory,
121: files may only be removed from that directory by the owner of the file,
122: the owner of the directory, or the superuser.
123: This is enforced by \fInamei\fP when the lookup operation is DELETE.
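The sticky-directory rule reduces to a short predicate. The sketch below
models only the additional restriction; the usual requirement of write
permission on the directory is assumed to be checked elsewhere, and the
function name may_delete is hypothetical.

```c
#define ISVTX 01000  /* sticky bit in the file mode */

/*
 * Return 1 if the sticky rule allows uid to remove a file from a
 * directory, 0 if it does not.  uid 0 is taken to be the superuser.
 */
int may_delete(unsigned dir_mode, unsigned dir_owner,
               unsigned file_owner, unsigned uid)
{
    if ((dir_mode & ISVTX) == 0)
        return 1;                 /* not sticky: no extra restriction */
    return uid == 0 || uid == file_owner || uid == dir_owner;
}
```

A world-writable directory such as /tmp is the typical case: with the bit
set, users can still create files there but cannot remove one another's.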
124: .XP ufs_subr.c
125: The strategy for \fIsyncip\fP, the internal routine implementing \fIfsync\fP,
126: has been modified for large files (those larger than half of the buffer
127: cache).
128: For large files all modified buffers for the device are written out.
129: The old algorithm could run for a very long time on a very large file,
130: which might not actually have many data blocks.
131: The \fIupdate\fP routine now saves some work by calling \fIiupdate\fP
132: only for modified inodes.
133: The C replacements for the special VAX instructions have been collected
134: in this file.
135: .XP ufs_syscalls.c
136: When doing an open with flags O_CREAT and O_EXCL (create only if the file
137: did not exist), it is now considered to be an error if the target exists
138: and is a symbolic link, even if the symbolic link refers to a nonexistent
139: file.
140: This behavior is desirable for reasons of security
141: in programs that create files with predictable names.
142: \fIRename\fP follows the policy of \fInamei\fP in disallowing removal
143: of the target of a rename if the target directory is ``sticky''
144: and the user is not the owner of the target or the target directory.
145: A serious bug in the open code which allowed directories and other unwritable
146: files to be truncated has been corrected.
147: Interrupted opens no longer lose file descriptors.
148: The \fIlseek\fP call returns an ESPIPE error when seeking on sockets
149: (including pipes) for backward compatibility.
150: The error returned from \fIreadlink\fP when reading something other than
151: a symbolic link was changed from ENXIO to EINVAL.
152: Several calls that previously failed silently on read-only filesystems
153: (\fIchmod\fP, \fIchown\fP, \fIfchmod\fP, \fIfchown\fP and \fIutimes\fP)
154: now return EROFS.
155: The \fIrename\fP code was reworked to avoid several races
156: and to invalidate the name cache.
157: It marks a directory being renamed with IRENAME
158: to avoid races due to concurrent renames of the same directory.
159: \fIMkdir\fP now sets the size of all new directories to DIRBLKSIZE.
160: \fIRmdir\fP purges the name cache of entries for the removed directory.
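The O_CREAT and O_EXCL behavior described above is what lets a program
create a file with a predictable name safely. A small user-level sketch
follows; the helper name create_exclusive is hypothetical, and the open
flags and mode are the standard ones.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to create path exclusively.
 * Returns 0 on success, or the errno value from open on failure.
 * If path already exists, even as a symbolic link whose target is
 * missing, the open fails rather than following the link.
 */
int create_exclusive(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

If another user has planted a symbolic link under the expected name, the
call fails with EEXIST instead of creating a file wherever the link points.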
161: .XP ufs_xxx.c
162: The routines \fIuchar\fP and \fIschar\fP are no longer used
163: and have been removed.
164: .XP quota_kern.c
165: The quota hash size was changed to a power of 2 so that the modulus could
166: be computed with a mask.
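The mask trick works because, for a table size that is a power of two,
the remainder is just the low-order bits of the key. A minimal sketch,
with an illustrative table size and hypothetical function names:

```c
#define HASHSIZE 64  /* illustrative size; must be a power of two */

/* Compile-time check: the array size is negative unless HASHSIZE is 2^n. */
typedef char hashsize_is_power_of_two[(HASHSIZE & (HASHSIZE - 1)) == 0 ? 1 : -1];

/* Bucket index using the general modulus (a division). */
unsigned hash_mod(unsigned key)
{
    return key % HASHSIZE;
}

/* Same bucket index using a mask, valid only for power-of-two sizes. */
unsigned hash_mask(unsigned key)
{
    return key & (HASHSIZE - 1);
}
```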
167: .XP quota_ufs.c
168: If a user has run out of warnings and had the hard limit enforced
169: while logged in,
170: but has then brought his allocation below the hard limit,
171: the quota system reverts to enforcing the soft limit,
172: and resets the warning count;
173: users previously were required to log out and in again to
174: get this effect.