Annotation of 43BSDReno/share/doc/ps1/09.lint/lint.ms, revision 1.1.1.1

1.1       root        1: .\"    @(#)lint        6.1 (Berkeley) 5/7/86
                      2: .\"
                      3: .EH 'PS1:9-%''Lint, a C Program Checker'
                      4: .OH 'Lint, a C Program Checker''PS1:9-%'
                      5: .\".RP
                      6: .ND "July 26, 1978"
                      7: .OK
                      8: .\"Program Portability
                      9: .\"Strong Type Checking
                     10: .TL
                     11: Lint, a C Program Checker
                     12: .AU "MH 2C-559" 3968
                     13: S. C. Johnson
                     14: .AI
                     15: .MH
                     16: .AB
                     17: .PP
                     18: .I Lint
                     19: is a command which examines C source programs,
                     20: detecting
                     21: a number of bugs and obscurities.
                     22: It enforces the type rules of C more strictly than
                     23: the C compilers.
                     24: It may also be used to enforce a number of portability
                     25: restrictions involved in moving
                     26: programs between different machines and/or operating systems.
                     27: Another option detects a number of wasteful, or error prone, constructions
                     28: which nevertheless are, strictly speaking, legal.
                     29: .PP
                     30: .I Lint
                     31: accepts multiple input files and library specifications, and checks them for consistency.
                     32: .PP
                     33: The separation of function between
                     34: .I lint
                     35: and the C compilers has both historical and practical
                     36: rationale.
                     37: The compilers turn C programs into executable files rapidly
                     38: and efficiently.
                     39: This is possible in part because the
                     40: compilers do not do sophisticated
                     41: type checking, especially between
                     42: separately compiled programs.
                     43: .I Lint
                     44: takes a more global, leisurely view of the program,
                     45: looking much more carefully at the compatibilities.
                     46: .PP
                     47: This document discusses the use of
                     48: .I lint ,
                     49: gives an overview of the implementation, and gives some hints on the
                     50: writing of machine independent C code.
                     51: .AE
                     52: .CS 10 2 12 0 0 5
                     53: .SH
                     54: Introduction and Usage
                     55: .PP
                     56: Suppose there are two C
                     57: .[
                     58: Kernighan Ritchie Programming Prentice 1978
                     59: .]
                     60: source files,
                     61: .I file1.c
                     62: and
                     63: .I file2.c  ,
                     64: which are ordinarily compiled and loaded together.
                     65: Then the command
                     66: .DS
                     67: lint  file1.c  file2.c
                     68: .DE
                     69: produces messages describing inconsistencies and inefficiencies
                     70: in the programs.
                     71: .I Lint
                     72: enforces the typing rules of C more strictly
                     73: than the C compilers do,
                     74: for both historical and practical reasons.
                     75: The command
                     76: .DS
                     77: lint  \-p  file1.c  file2.c
                     78: .DE
                     79: produces, in addition to the above messages, further messages
                     80: which relate to the portability of the programs to other operating
                     81: systems and machines.
                     82: Replacing the
                     83: .B \-p
                     84: by
                     85: .B \-h
                     86: will produce messages about various error-prone or wasteful constructions
                     87: which, strictly speaking, are not bugs.
                     88: Saying
                     89: .B \-hp
                     90: gets the whole works.
                     91: .PP
                     92: The next several sections describe the major messages;
                     93: the document closes with sections
                     94: discussing the implementation and giving suggestions
                     95: for writing portable C.
                     96: An appendix gives a summary of the
                     97: .I lint
                     98: options.
                     99: .SH
                    100: A Word About Philosophy
                    101: .PP
                    102: Many of the facts which
                    103: .I lint
                    104: needs may be impossible to
                    105: discover.
                    106: For example, whether a given function in a program ever gets called
                    107: may depend on the input data.
                    108: Deciding whether
                    109: .I exit
                    110: is ever called is equivalent to solving the famous ``halting problem,'' known to be
                    111: recursively undecidable.
                    112: .PP
                    113: Thus, most of the
                    114: .I lint
                    115: algorithms are a compromise.
                    116: If a function is never mentioned, it can never be called.
                    117: If a function is mentioned,
                    118: .I lint
                    119: assumes it can be called; this is not necessarily so, but in practice is quite reasonable.
                    120: .PP
                    121: .I Lint
                    122: tries to give information with a high degree of relevance.
                    123: Messages of the form ``\fIxxx\fR might be a bug''
                    124: are easy to generate, but are acceptable only in proportion
                    125: to the fraction of real bugs they uncover.
                    126: If this fraction of real bugs is too small, the messages lose their credibility
                    127: and serve merely to clutter up the output,
                    128: obscuring the more important messages.
                    129: .PP
                    130: Keeping these issues in mind, we now consider in more detail
                    131: the classes of messages which
                    132: .I lint
                    133: produces.
                    134: .SH
                    135: Unused Variables and Functions
                    136: .PP
                    137: As sets of programs evolve and develop,
                    138: previously used variables and arguments to
                    139: functions may become unused;
                    140: it is not uncommon for external variables, or even entire
                    141: functions, to become unnecessary, and yet
                    142: not be removed from the source.
                    143: These ``errors of commission'' rarely cause working programs to fail, but they are a source
                    144: of inefficiency, and make programs harder to understand
                    145: and change.
                    146: Moreover, information about such unused variables and functions can occasionally
                    147: serve to discover bugs; if a function does a necessary job, and
                    148: is never called, something is wrong!
                    149: .PP
                    150: .I Lint
                    151: complains about variables and functions which are defined but not otherwise
                    152: mentioned.
                    153: An exception is variables which are declared through explicit
                    154: .B extern
                    155: statements but are never referenced; thus the statement
                    156: .DS
                    157: extern  float  sin(\|);
                    158: .DE
                    159: will evoke no comment if
                    160: .I sin
                    161: is never used.
                    162: Note that this agrees with the semantics of the C compiler.
                    163: In some cases, these unused external declarations might be of some interest; they
                    164: can be discovered by adding the
                    165: .B \-x
                    166: flag to the
                    167: .I lint
                    168: invocation.
                    169: .PP
                    170: Certain styles of programming
                    171: require many functions to be written with similar interfaces;
                    172: frequently, some of the arguments may be unused
                    173: in many of the calls.
                    174: The
                    175: .B \-v
                    176: option is available to suppress the printing of
                    177: complaints about unused arguments.
                    178: When
                    179: .B \-v
                    180: is in effect, no messages are produced about unused
                    181: arguments except for those
                    182: arguments which are unused and also declared as
                    183: register arguments; this can be considered
                    184: an active (and preventable) waste of the register
                    185: resources of the machine.
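.PP
As an illustration of the style described above, here is a minimal sketch in which two handlers share one interface and each ignores one of its arguments.
The names handler, on_code, on_text, and dispatch are hypothetical, and the (void) casts are a later idiom for marking an argument as deliberately unused; the text does not say how a given lint treats them.

```c
/* Hypothetical handlers sharing one interface: (code, text). */
typedef int (*handler)(int code, const char *text);

/* Uses only `code`; `text` is an unused argument. */
static int on_code(int code, const char *text)
{
    (void)text;   /* marks the argument as deliberately unused */
    return code * 2;
}

/* Uses only `text`; `code` is an unused argument. */
static int on_text(int code, const char *text)
{
    (void)code;
    return text[0];
}

/* Select a handler by index and call it through the common interface. */
int dispatch(int which, int code, const char *text)
{
    static const handler table[] = { on_code, on_text };
    return table[which != 0](code, text);
}
```

Each handler would draw an unused-argument complaint unless the complaints are suppressed, which is the situation the option is meant for.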
                    186: .PP
                    187: There is one case where information about unused, or
                    188: undefined, variables is more distracting
                    189: than helpful.
                    190: This is when
                    191: .I lint
                    192: is applied to some, but not all, files out of a collection
                    193: which are to be loaded together.
                    194: In this case, many of the functions and variables defined
                    195: may not be used, and, conversely,
                    196: many functions and variables defined elsewhere may be used.
                    197: The
                    198: .B \-u
                    199: flag may be used to suppress the spurious messages which might otherwise appear.
                    200: .SH
                    201: Set/Used Information
                    202: .PP
                    203: .I Lint
                    204: attempts to detect cases where a variable is used before it is set.
                    205: This is very difficult to do well;
                    206: many algorithms take a good deal of time and space,
                    207: and still produce messages about perfectly valid programs.
                    208: .I Lint
                    209: detects local variables (automatic and register storage classes)
                    210: whose first use appears physically earlier in the input file than the first assignment to the variable.
                    211: It assumes that taking the address of a variable constitutes a ``use,'' since the actual use
                    212: may occur at any later time, in a data dependent fashion.
                    213: .PP
                    214: The restriction to the physical appearance of variables in the file makes the
                    215: algorithm very simple and quick to implement,
                    216: since the true flow of control need not be discovered.
                    217: It does mean that
                    218: .I lint
                    219: can complain about some programs which are legal,
                    220: but these programs would probably be considered bad on stylistic grounds (e.g., they might
                    221: contain at least two \fBgoto\fR's).
                    222: Because static and external variables are initialized to 0,
                    223: no meaningful information can be discovered about their uses.
                    224: The algorithm deals correctly, however, with initialized automatic variables, and variables
                    225: which are used in the expression which first sets them.
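.PP
The physical-order rule can be illustrated with a small sketch; the function names are hypothetical.
The first function is well defined, but its first use of n appears earlier in the file than the assignment, so a checker working purely on textual order would presumably complain.
The second shows the cases the text says are handled correctly.

```c
/* Legal, but the use of n is physically earlier than its assignment. */
int answer_with_gotos(void)
{
    int n;
    goto set;
use:
    return n + 1;   /* first use in textual order */
set:
    n = 41;         /* first assignment in textual order */
    goto use;
}

/* An initialized automatic variable, then used in the expression that sets it. */
int set_then_used(void)
{
    int k = 20;
    k = k + 1;
    return k * 2;
}
```

Note that the first function needs exactly the two goto statements the text mentions.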
                    226: .PP
                    227: The set/used information also permits recognition of those local variables which are set
                    228: and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs.
                    229: .SH
                    230: Flow of Control
                    231: .PP
                    232: .I Lint
                    233: attempts to detect unreachable portions of the programs which it processes.
                    234: It will complain about unlabeled statements immediately following
                    235: \fBgoto\fR, \fBbreak\fR, \fBcontinue\fR, or \fBreturn\fR statements.
                    236: An attempt is made to detect loops which can never be left at the bottom, detecting the
                    237: special cases
                    238: \fBwhile\fR( 1 ) and \fBfor\fR(;;) as infinite loops.
                    239: .I Lint
                    240: also complains about loops which cannot be entered at the top;
                    241: some valid programs may have such loops, but at best they are bad style,
                    242: at worst bugs.
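.PP
A short sketch of two of the constructions just described, using hypothetical function names: an unlabeled statement directly after a return, and a for(;;) loop that can be left only through a break.

```c
/* The assignment after the return can never be reached. */
int clamp_to_limit(int x, int limit)
{
    if (x > limit) {
        return limit;
        x = limit;      /* unreachable: unlabeled statement after return */
    }
    return x;
}

/* for(;;) cannot be left at the bottom; only the break exits it. */
int count_up_to(int limit)
{
    int i = 0;
    for (;;) {
        if (++i >= limit)
            break;
    }
    return i;
}
```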
                    243: .PP
                    244: .I Lint
                    245: has an important area of blindness in the flow of control algorithm:
                    246: it has no way of detecting functions which are called and never return.
                    247: Thus, a call to
                    248: .I exit
                    249: may cause unreachable code which
                    250: .I lint
                    251: does not detect; the most serious effects of this are in the
                    252: determination of returned function values (see the next section).
                    253: .PP
                    254: One form of unreachable statement is not usually complained about by
                    255: .I lint ;
                    256: a
                    257: .B break
                    258: statement that cannot be reached causes no message.
                    259: Programs generated by
                    260: .I yacc ,
                    261: .[
                    262: Johnson Yacc 1975
                    263: .]
                    264: and especially
                    265: .I lex ,
                    266: .[
                    267: Lesk Lex
                    268: .]
                    269: may have literally hundreds of unreachable
                    270: .B break
                    271: statements.
                    272: The
                    273: .B \-O
                    274: flag in the C compiler will often eliminate the resulting object code inefficiency.
                    275: Thus, these unreached statements are of little importance,
                    276: there is typically nothing the user can do about them, and the
                    277: resulting messages would clutter up the
                    278: .I lint
                    279: output.
                    280: If these messages are desired,
                    281: .I lint
                    282: can be invoked with the
                    283: .B \-b
                    284: option.
                    285: .SH
                    286: Function Values
                    287: .PP
                    288: Sometimes functions return values which are never used;
                    289: sometimes programs incorrectly use function ``values''
                    290: which have never been returned.
                    291: .I Lint
                    292: addresses this problem in a number of ways.
                    293: .PP
                    294: Locally, within a function definition,
                    295: the appearance of both
                    296: .DS
                    297: return(  \fIexpr\fR  );
                    298: .DE
                    299: and
                    300: .DS
                    301: return ;
                    302: .DE
                    303: statements is cause for alarm;
                    304: .I lint
                    305: will give the message
                    306: .DS
                    307: function \fIname\fR contains return(e) and return
                    308: .DE
                    309: The most serious difficulty with this is detecting when a function return is implied
                    310: by flow of control reaching the end of the function.
                    311: This can be seen with a simple example:
                    312: .DS
                    313: .ta .5i 1i 1.5i
                    314: \fRf ( a ) {
                    315:        if ( a ) return ( 3 );
                    316:        g (\|);
                    317:        }
                    318: .DE
                    319: Notice that, if \fIa\fR tests false, \fIf\fR will call \fIg\fR and then return
                    320: with no defined return value; this will trigger a complaint from
                    321: .I lint .
                    322: If \fIg\fR, like \fIexit\fR, never returns,
                    323: the message will still be produced when in fact nothing is wrong.
                    324: .PP
                    325: In practice, some potentially serious bugs have been discovered by this feature;
                    326: it also accounts for a substantial fraction of the ``noise'' messages produced
                    327: by
                    328: .I lint .
                    329: .PP
                    330: On a global scale,
                    331: .I lint
                    332: detects cases where a function returns a value, but this value is sometimes,
                    333: or always, unused.
                    334: When the value is always unused, it may constitute an inefficiency in the function definition.
                    335: When the value is sometimes unused, it may represent bad style (e.g., not testing for
                    336: error conditions).
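.PP
The "sometimes unused" case might look like the following sketch, in which parse_digit, careless, and careful are hypothetical names.
One caller tests the status that is returned and the other silently drops it.

```c
/* Returns 0 and stores the digit on success, -1 on failure. */
int parse_digit(char ch, int *out)
{
    if (ch < '0' || ch > '9')
        return -1;
    *out = ch - '0';
    return 0;
}

/* Ignores the returned status: a failure goes unnoticed. */
int careless(char ch)
{
    int d = 0;
    parse_digit(ch, &d);
    return d;
}

/* Tests the returned status before using the result. */
int careful(char ch)
{
    int d;
    if (parse_digit(ch, &d) != 0)
        return -1;
    return d;
}
```

With bad input, careless quietly yields 0 while careful reports the error.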
                    337: .PP
                    338: The dual problem, using a function value when the function does not return one,
                    339: is also detected.
                    340: This is a serious problem.
                    341: Amazingly, this bug has been observed on a couple of occasions
                    342: in ``working'' programs; the desired function value just happened to have been computed
                    343: in the function return register!
                    344: .SH
                    345: Type Checking
                    346: .PP
                    347: .I Lint
                    348: enforces the type checking rules of C more strictly than the compilers do.
                    349: The additional checking is in four major areas:
                    350: across certain binary operators and implied assignments,
                    351: at the structure selection operators,
                    352: between the definition and uses of functions,
                    353: and in the use of enumerations.
                    354: .PP
                    355: There are a number of operators which have an implied balancing between types of the operands.
                    356: The assignment, conditional ( ?\|: ), and relational operators
                    357: have this property; the argument
                    358: of a \fBreturn\fR statement,
                    359: and expressions used in initialization also suffer similar conversions.
                    360: In these operations,
                    361: \fBchar\fR, \fBshort\fR, \fBint\fR, \fBlong\fR, \fBunsigned\fR, \fBfloat\fR, and \fBdouble\fR types may be freely intermixed.
                    362: The types of pointers must agree exactly,
                    363: except that arrays of \fIx\fR's can, of course, be intermixed with pointers to \fIx\fR's.
                    364: .PP
                    365: The type checking rules also require that, in structure references, the
                    366: left operand of the \(em> be a pointer to structure, the left operand of the \fB.\fR
                    367: be a structure, and the right operand of these operators be a member
                    368: of the structure implied by the left operand.
                    369: Similar checking is done for references to unions.
                    370: .PP
                    371: Strict rules apply to function argument and return value
                    372: matching.
                    373: The types \fBfloat\fR and \fBdouble\fR may be freely matched,
                    374: as may the types \fBchar\fR, \fBshort\fR, \fBint\fR, and \fBunsigned\fR.
                    375: Also, pointers can be matched with the associated arrays.
                    376: Aside from this, all actual arguments must agree in type with their declared counterparts.
                    377: .PP
                    378: With enumerations, checks are made that enumeration variables or members are not mixed
                    379: with other types, or other enumerations,
                    380: and that the only operations applied are =, initialization, ==, !=, and function arguments and return values.
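.PP
A rough sketch of the enumeration rules, with hypothetical types and function names.
A C compiler accepts all three functions; under the rules above, only the last one, which adds members of two different enumerations, would presumably be questioned.

```c
enum color { RED, GREEN, BLUE };
enum fruit { APPLE, PEAR };

/* Comparison with == between values of the same enumeration. */
int same_color(enum color a, enum color b)
{
    return a == b;
}

/* Initialization and return of an enumeration value. */
enum color default_color(void)
{
    enum color c = GREEN;
    return c;
}

/* Arithmetic across two enumerations: legal C, but it mixes the types. */
int mixed_sum(enum color c, enum fruit f)
{
    return c + f;
}
```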
                    381: .SH
                    382: Type Casts
                    383: .PP
                    384: The type cast feature in C was introduced largely as an aid
                    385: to producing more portable programs.
                    386: Consider the assignment
                    387: .DS
                    388: p = 1 ;
                    389: .DE
                    390: where
                    391: .I p
                    392: is a character pointer.
                    393: .I Lint
                    394: will quite rightly complain.
                    395: Now, consider the assignment
                    396: .DS
                    397: p = (char \(**)1 ;
                    398: .DE
                    399: in which a cast has been used to
                    400: convert the integer to a character pointer.
                    401: The programmer obviously had a strong motivation
                    402: for doing this, and has clearly signaled his intentions.
                    403: It seems harsh for
                    404: .I lint
                    405: to continue to complain about this.
                    406: On the other hand, if this code is moved to another
                    407: machine, such code should be looked at carefully.
                    408: The
                    409: .B \-c
                    410: flag controls the printing of comments about casts.
                    411: When
                    412: .B \-c
                    413: is in effect, casts are treated as though they were assignments
                    414: subject to complaint; otherwise, all legal casts are passed without comment,
                    415: no matter how strange the type mixing seems to be.
                    416: .SH
                    417: Nonportable Character Use
                    418: .PP
                    419: On the PDP-11, characters are signed quantities, with a range
                    420: from \-128 to 127.
                    421: On most of the other C implementations, characters take on only positive
                    422: values.
                    423: Thus,
                    424: .I lint
                    425: will flag certain comparisons and assignments as being
                    426: illegal or nonportable.
                    427: For example, the fragment
                    428: .DS
                    429: char c;
                    430:        ...
                    431: if( (c = getchar(\|)) < 0 ) ....
                    432: .DE
                    433: works on the PDP-11, but
                    434: will fail on machines where characters always take
                    435: on positive values.
                    436: The real solution is to declare
                    437: .I c
                    438: an integer, since
                    439: .I getchar
                    440: is actually returning
                    441: integer values.
                    442: In any case,
                    443: .I lint
                    444: will say
                    445: ``nonportable character comparison''.
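.PP
The difference can be made concrete with a small sketch.
The function names are hypothetical, and the example assumes the usual case in which the end-of-file value is \-1.
The test through a char gives an answer that depends on the machine, while the test through an int does not.

```c
#include <limits.h>
#include <stdio.h>

/* 1 if plain char is a signed type on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Mirrors the fragment: store the value in a char, then test it for < 0. */
int eof_test_with_char(int value)
{
    char c = (char)value;
    return c < 0;
}

/* The portable form: keep the value in an int and compare with EOF. */
int eof_test_with_int(int value)
{
    int c = value;
    return c == EOF;
}
```

On a machine with unsigned characters, eof_test_with_char never reports end of file.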
                    446: .PP
                    447: A similar issue arises with bitfields; when assignments
                    448: of constant values are made to bitfields, the field may
                    449: be too small to hold the value.
                    450: This is especially true because
                    451: on some machines bitfields are considered as signed
                    452: quantities.
                    453: While it may seem unintuitive to consider
                    454: that a two bit field declared of type
                    455: .B int
                    456: cannot hold the value 3, the problem disappears
                    457: if the bitfield is declared to have type
                    458: .B unsigned .
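.PP
A minimal sketch of the bitfield case, using a hypothetical struct and helper names.
The signed field is declared explicitly as signed int, since whether a plain int bitfield is signed varies between implementations; a two's complement representation is assumed for the stated range.

```c
struct flags {
    signed int   s : 2;   /* two-bit signed field: holds -2 .. 1 */
    unsigned int u : 2;   /* two-bit unsigned field: holds 0 .. 3 */
};

/* 1 if the unsigned two-bit field stores v unchanged. */
int unsigned_field_holds(unsigned int v)
{
    struct flags f;
    f.u = v;              /* reduced modulo 4 when v does not fit */
    return f.u == v;
}

/* 1 if the signed two-bit field stores v unchanged.
   For values outside -2 .. 1 the stored result is implementation-defined. */
int signed_field_holds(int v)
{
    struct flags f;
    f.s = v;
    return f.s == v;
}
```

The value 3 fits the unsigned field, while the signed field tops out at 1.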
                    459: .SH
                    460: Assignments of longs to ints
                    461: .PP
                    462: Bugs may arise from the assignment of a
                    463: .B long
                    464: to
                    465: an
                    466: .B int ,
                    467: which loses accuracy.
                    468: This may happen in programs
                    469: which have been incompletely converted to use
                    470: .B typedefs .
                    471: When a
                    472: .B typedef
                    473: variable
                    474: is changed from \fBint\fR to \fBlong\fR,
                    475: the program can stop working because
                    476: some intermediate results may be assigned
                    477: to \fBints\fR, losing accuracy.
                    478: Since there are a number of legitimate reasons for
                    479: assigning \fBlongs\fR to \fBints\fR, the detection
                    480: of these assignments is enabled
                    481: by the
                    482: .B \-a
                    483: flag.
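.PP
One way to make such a narrowing explicit is sketched below.
The typedef counter_t and both helpers are hypothetical, standing in for a type that was recently widened from int to long.

```c
#include <limits.h>

typedef long counter_t;   /* hypothetical typedef, widened from int */

/* 1 if v can be stored in an int without losing information. */
int fits_in_int(counter_t v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* Narrow v to int when it fits; otherwise return the caller's fallback. */
int narrow_or(counter_t v, int fallback)
{
    if (!fits_in_int(v))
        return fallback;
    return (int)v;
}
```

On machines where int and long have the same size the check always succeeds, which is one reason the assignment is often legitimate.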
                    484: .SH
                    485: Strange Constructions
                    486: .PP
                    487: Several perfectly legal, but somewhat strange, constructions
                    488: are flagged by
                    489: .I lint ;
                    490: the messages should encourage better code quality and clearer style, and
                    491: may even point out bugs.
                    492: The
                    493: .B \-h
                    494: flag is used to enable these checks.
                    495: For example, in the statement
                    496: .DS
                    497: \(**p++ ;
                    498: .DE
                    499: the \(** does nothing; this provokes the message ``null effect'' from
                    500: .I lint .
                    501: The program fragment
                    502: .DS
                    503: unsigned x ;
                    504: if( x < 0 ) ...
                    505: .DE
                    506: is clearly somewhat strange; the
                    507: test will never succeed.
                    508: Similarly, the test
                    509: .DS
                    510: if( x > 0 ) ...
                    511: .DE
                    512: is equivalent to
                    513: .DS
                    514: if( x != 0 )
                    515: .DE
                    516: which may not be the intended action.
                    517: .I Lint
                    518: will say ``degenerate unsigned comparison'' in these cases.
                    519: If one says
                    520: .DS
                    521: if( 1 != 0 ) ....
                    522: .DE
                    523: .I lint
                    524: will report
                    525: ``constant in conditional context'', since the comparison
                    526: of 1 with 0 gives a constant result.
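.PP
The unsigned comparisons above can be checked directly with a small sketch; the function names are hypothetical.

```c
/* Never true: an unsigned value cannot be less than zero. */
int below_zero(unsigned x)
{
    return x < 0;
}

/* For an unsigned value, x > 0 says exactly the same as x != 0. */
int above_zero(unsigned x)
{
    return x > 0;
}

int nonzero(unsigned x)
{
    return x != 0;
}
```

A programmer who wrote above_zero expecting it to reject "negative" inputs will find that the largest unsigned value passes the test.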
                    527: .PP
                    528: Another construction
                    529: detected by
                    530: .I lint
                    531: involves
                    532: operator precedence.
                    533: Bugs which arise from misunderstandings about the precedence
                    534: of operators can be accentuated by spacing and formatting,
                    535: making such bugs extremely hard to find.
                    536: For example, the statements
                    537: .DS
                    538: if( x&077 == 0 ) ...
                    539: .DE
                    540: or
                    541: .DS
                    542: x<\h'-.3m'<2 + 40
                    543: .DE
                    544: probably do not do what was intended.
                    545: The best solution is to parenthesize such expressions,
                    546: and
                    547: .I lint
                    548: encourages this by an appropriate message.
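.PP
The two precedence traps can be sketched as follows, with hypothetical function names.
The shift example uses 3 rather than 40 so that the shift count stays within the width of the type, and the parenthesized forms show only one plausible intent.

```c
/* Parsed as x & (077 == 0), that is x & 0, which is always 0. */
int low_bits_clear_as_written(int x)
{
    return x & 077 == 0;
}

/* What was probably meant: test the low six bits. */
int low_bits_clear_intended(int x)
{
    return (x & 077) == 0;
}

/* Parsed as x << (2 + 3), since + binds tighter than <<. */
unsigned shift_as_written(unsigned x)
{
    return x << 2 + 3;
}

/* One plausible intent: shift first, then add. */
unsigned shift_intended(unsigned x)
{
    return (x << 2) + 3;
}
```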
                    549: .PP
                    550: Finally, when the
                    551: .B \-h
                    552: flag is in force
                    553: .I lint
                    554: complains about variables which are redeclared in inner blocks
                    555: in a way that conflicts with their use in outer blocks.
                    556: This is legal, but is considered by many (including the author) to
                    557: be bad style, usually unnecessary, and frequently a bug.
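.PP
A minimal sketch of such a redeclaration, with a hypothetical function name.
The inner total hides the outer one, so the update inside the block is lost when the block ends.

```c
int shadow_demo(void)
{
    int total = 1;
    {
        int total = 10;   /* redeclared: hides the outer total */
        total += 5;       /* changes only the inner variable */
    }
    return total;         /* the outer total is still 1 */
}
```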
                    558: .SH
                    559: Ancient History
                    560: .PP
                    561: There are several forms of older syntax which are being officially
                    562: discouraged.
                    563: These fall into two classes, assignment operators and initialization.
                    564: .PP
                    565: The older forms of assignment operators (e.g., =+, =\-, . . . )
                    566: could cause ambiguous expressions, such as
                    567: .DS
                    568: a  =\-1 ;
                    569: .DE
                    570: which could be taken as either
                    571: .DS
                    572: a =\-  1 ;
                    573: .DE
                    574: or
                    575: .DS
                    576: a  =  \-1 ;
                    577: .DE
                    578: The situation is especially perplexing if this
                    579: kind of ambiguity arises as the result of a macro substitution.
                    580: The newer, and preferred, operators (+=, \-=, etc.)
                    581: have no such ambiguities.
                    582: To spur the abandonment of the older forms,
                    583: .I lint
                    584: complains about these old fashioned operators.
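.PP
For comparison, here is how a current C compiler reads the two spellings; the function names are hypothetical.
With the old operators gone from the language, the first form is an assignment of \-1, not a decrement.

```c
/* Read by a current compiler as a = -1. */
int assign_minus_one(int a)
{
    a =-1;
    return a;
}

/* The newer operator is unambiguous: subtract 1 from a. */
int decrement(int a)
{
    a -= 1;
    return a;
}
```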
                    585: .PP
                    586: A similar issue arises with initialization.
                    587: The older language allowed
                    588: .DS
                    589: int  x  \fR1 ;
                    590: .DE
                    591: to initialize
                    592: .I x
                    593: to 1.
                    594: This also caused syntactic difficulties: for example,
                    595: .DS
                    596: int  x  ( \-1 ) ;
                    597: .DE
                    598: looks somewhat like the beginning of a function declaration:
                    599: .DS
                    600: int  x  ( y ) {  . . .
                    601: .DE
                    602: and the compiler must read a fair ways past
                    603: .I x
                    604: in order to be sure what the declaration really is.
                    605: Again, the problem is even more perplexing when the
                    606: initializer involves a macro.
                    607: The current syntax places an equals sign between the
                    608: variable and the initializer:
                    609: .DS
                    610: int  x  =  \-1 ;
                    611: .DE
                    612: This is free of any possible syntactic ambiguity.
                    613: .SH
                    614: Pointer Alignment
                    615: .PP
                    616: Certain pointer assignments may be reasonable on some machines,
                    617: and illegal on others, due entirely to
                    618: alignment restrictions.
                    619: For example, on the PDP-11, it is reasonable
                    620: to assign integer pointers to double pointers, since
                    621: double precision values may begin on any integer boundary.
                    622: On the Honeywell 6000, double precision values must begin
                    623: on even word boundaries;
                    624: thus, not all such assignments make sense.
                    625: .I Lint
                    626: tries to detect cases where pointers are assigned to other
                    627: pointers, and such alignment problems might arise.
                    628: The message ``possible pointer alignment problem''
                    629: results from this situation whenever either the
                    630: .B \-p
                    631: or
                    632: .B \-h
                    633: flag is in effect.
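.PP
The following sketch shows one present-day way to sidestep the problem;
it is not something lint does, and the helper names are assumptions
made for this example.
Instead of assigning an arbitrary pointer to a double pointer and
dereferencing it, the bytes are copied into a properly aligned object.

```c
#include <string.h>

/*
 * Hypothetical helper: read a double that starts at an arbitrary
 * byte address.  The questionable form would be
 *     double *dp = (double *)p;  ...  *dp
 * which may fault on machines with strict alignment rules.
 */
double read_double_at(const void *p)
{
    double d;

    memcpy(&d, p, sizeof d);    /* byte copy: no alignment requirement */
    return d;
}

/* Store a double at byte offset 1 of a buffer and read it back. */
double misaligned_round_trip(double value)
{
    unsigned char buf[sizeof(double) + 1];

    memcpy(buf + 1, &value, sizeof value);
    return read_double_at(buf + 1);
}
```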
                    634: .SH
                    635: Multiple Uses and Side Effects
                    636: .PP
                    637: In complicated expressions, the best order in which to evaluate
                    638: subexpressions may be highly machine dependent.
                    639: For example, on machines (like the PDP-11) in which the stack
                    640: runs backwards, function arguments will probably be best evaluated
                    641: from right-to-left; on machines with a stack running forward,
                    642: left-to-right seems most attractive.
                    643: Function calls embedded as arguments of other functions
                    644: may or may not be treated similarly to ordinary arguments.
                    645: Similar issues arise with other operators which have side effects,
                    646: such as the assignment operators and the increment and decrement operators.
                    647: .PP
                    648: In order that the efficiency of C on a particular machine not be
                    649: unduly compromised, the C language leaves the order
                    650: of evaluation of complicated expressions up to the
                    651: local compiler, and, in fact, the various C compilers have considerable
                    652: differences in the order in which they will evaluate complicated
                    653: expressions.
                    654: In particular, if any variable is changed by a side effect, and
                    655: also used elsewhere in the same expression, the result is explicitly undefined.
                    656: .PP
                    657: .I Lint
                    658: checks for the important special case where
                    659: a simple scalar variable is affected.
                    660: For example, the statement
                    661: .DS
                    662: \fIa\fR[\fIi\|\fR] = \fIb\fR[\fIi\fR++] ;
                    663: .DE
                    664: will draw the complaint:
                    665: .DS
                    666: warning: \fIi\fR evaluation order undefined
                    667: .DE
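.PP
One way to remove the complaint, sketched here under the assumption
that the programmer meant to copy element i and then advance i,
is to give the side effect a statement of its own.
The function name is invented for illustration.

```c
/*
 * Copy b[i] to a[i] and then advance i, with the order spelled out.
 * Returns the new value of i.
 */
int copy_then_advance(int a[], const int b[], int i)
{
    a[i] = b[i];    /* i is only read in this statement */
    i++;            /* the side effect is a separate statement */
    return i;
}
```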
                    668: .SH
                    669: Implementation
                    670: .PP
                    671: .I Lint
                    672: consists of two programs and a driver.
                    673: The first program is a version of the
                    674: Portable C Compiler
                    675: .[
                    676: Johnson Ritchie BSTJ Portability Programs System
                    677: .]
                    678: .[
                    679: Johnson portable compiler  1978
                    680: .]
                    681: which is the basis of the
                    682: IBM 370, Honeywell 6000, and Interdata 8/32 C compilers.
                    683: This compiler does lexical and syntax analysis on the input text,
                    684: constructs and maintains symbol tables, and builds trees for expressions.
                    685: Instead of writing an intermediate file which is passed to
                    686: a code generator, as the other compilers
                    687: do,
                    688: .I lint
                    689: produces an intermediate file which consists of lines of ascii text.
                    690: Each line contains an external variable name,
                    691: an encoding of the context in which it was seen (use, definition, declaration, etc.),
                    692: a type specifier, and a source file name and line number.
                    693: The information about variables local to a function or file
                    694: is collected
                    695: by accessing the symbol table, and examining the expression trees.
                    696: .PP
                    697: Comments about local problems are produced as detected.
                    698: The information about external names is collected
                    699: onto an intermediate file.
                    700: After all the source files and library descriptions have
                    701: been collected, the intermediate file is sorted
                    702: to bring all information collected about a given external
                    703: name together.
                    704: The second, rather small, program then reads the lines
                    705: from the intermediate file and compares all of the
                    706: definitions, declarations, and uses for consistency.
                    707: .PP
                    708: The driver controls this
                    709: process, and is also responsible for making the options available
                    710: to both passes of
                    711: .I lint .
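.PP
The sort-then-compare idea of the second pass can be sketched as follows.
The record layout and function names are assumptions made for this
example; the actual intermediate file format is not reproduced here,
and only the type field is compared.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified record; lint's real line format differs. */
struct ext_record {
    const char *name;   /* external name */
    const char *type;   /* type specifier, as text */
    const char *file;   /* source file in which it was seen */
    int line;           /* line number in that file */
};

/* Order by name, then by type, so equal entries end up adjacent. */
static int by_name_then_type(const void *pa, const void *pb)
{
    const struct ext_record *a = pa;
    const struct ext_record *b = pb;
    int c = strcmp(a->name, b->name);

    return c != 0 ? c : strcmp(a->type, b->type);
}

/*
 * Sort the records, then count adjacent pairs that share a name
 * but disagree on the type.
 */
int count_type_conflicts(struct ext_record *recs, size_t n)
{
    size_t i;
    int conflicts = 0;

    qsort(recs, n, sizeof recs[0], by_name_then_type);
    for (i = 1; i < n; i++) {
        if (strcmp(recs[i - 1].name, recs[i].name) == 0 &&
            strcmp(recs[i - 1].type, recs[i].type) != 0)
            conflicts++;
    }
    return conflicts;
}
```

Sorting first means the comparison needs only a single pass over neighbouring records, which is why the second program can be small.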
                    712: .SH
                    713: Portability
                    714: .PP
                    715: C on the Honeywell and IBM systems is used, in part, to write system code for the host operating system.
                    716: This means that the implementation of C tends to follow local conventions rather than
                    717: adhere strictly to
                    718: .UX
                    719: system conventions.
                    720: Despite these differences, many C programs have been successfully moved to GCOS and the various IBM
                    721: installations with little effort.
                    722: This section describes some of the differences between the implementations, and
                    723: discusses the
                    724: .I lint
                    725: features which encourage portability.
                    726: .PP
                    727: Uninitialized external variables are treated differently in different
                    728: implementations of C.
                    729: Suppose two files both contain a declaration without initialization, such as
                    730: .DS
                    731: int a ;
                    732: .DE
                    733: outside of any function.
                    734: The
                    735: .UX
                    736: loader will resolve these declarations, and cause only a single word of storage
                    737: to be set aside for \fIa\fR.
                    738: Under the GCOS and IBM implementations, this is not feasible (for various stupid reasons!)
                    739: so each such declaration causes a word of storage to be set aside and called \fIa\fR.
                    740: When loading or library editing takes place, this causes fatal conflicts which prevent
                    741: the proper operation of the program.
                    742: If
                    743: .I lint
                    744: is invoked with the \fB\-p\fR flag,
                    745: it will detect such multiple definitions.
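.PP
A conventional way to avoid such multiple definitions is to keep exactly
one definition and make every other mention a declaration marked extern.
The sketch below puts both in a single file for brevity;
in practice the extern declaration would sit in a shared header.
The function name is invented for illustration.

```c
/* What a shared header would contain: a declaration, no storage. */
extern int a;

/* What exactly one source file would contain: the definition. */
int a;

/* Any file that sees the extern declaration can use the variable. */
int bump_a(void)
{
    a = a + 1;
    return a;
}
```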
                    746: .PP
                    747: A related difficulty comes from the amount of information retained about external names during the
                    748: loading process.
                    749: On the
                    750: .UX
                    751: system, externally known names have seven significant characters, with the upper/lower
                    752: case distinction kept.
                    753: On the IBM systems, there are eight significant characters, but the case distinction
                    754: is lost.
                    755: On GCOS, there are only six characters, of a single case.
                    756: This leads to situations where programs run on the
                    757: .UX
                    758: system, but encounter loader
                    759: problems on the IBM or GCOS systems.
                    760: .I Lint
                    761: .B \-p
                    762: causes all external symbols to be mapped to one case and truncated to six characters,
                    763: providing a worst-case analysis.
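.PP
The worst-case mapping can be sketched as below.
The choice of upper case and the function names are assumptions made
for this example; the text says only that names are mapped to one case
and truncated to six characters.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define WORST_CASE_LEN 6   /* six significant characters, as on GCOS */

/*
 * Map a name to its worst-case form: at most six characters,
 * all in one case.  out must have room for WORST_CASE_LEN + 1 bytes.
 */
void worst_case_name(const char *name, char *out)
{
    size_t i;

    for (i = 0; i < WORST_CASE_LEN && name[i] != '\0'; i++)
        out[i] = (char)toupper((unsigned char)name[i]);
    out[i] = '\0';
}

/* Return nonzero if two distinct names collide after the mapping. */
int names_collide(const char *n1, const char *n2)
{
    char m1[WORST_CASE_LEN + 1], m2[WORST_CASE_LEN + 1];

    worst_case_name(n1, m1);
    worst_case_name(n2, m2);
    return strcmp(n1, n2) != 0 && strcmp(m1, m2) == 0;
}
```

Under this mapping, readbuf1 and readbuf2 are the same external symbol, as are Count and count.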
                    764: .PP
                    765: A number of differences arise in the area of character handling: characters in the
                    766: .UX
                    767: system are eight bit ascii, while they are eight bit ebcdic on the IBM, and
                    768: nine bit ascii on GCOS.
                    769: Moreover, character strings go from high to low bit positions (``left to right'')
                    770: on GCOS and IBM, and low to high (``right to left'') on the PDP-11.
                    771: This means that code attempting to construct strings
                    772: out of character constants, or attempting to use characters as indices
                    773: into arrays, must be looked at with great suspicion.
                    774: .I Lint
                    775: is of little help here, except to flag multi-character character constants.
                    776: .PP
                    777: Of course, the word sizes are different!
                    778: This causes less trouble than might be expected, at least when
                    779: moving from the
                    780: .UX
                    781: system (16 bit words) to the IBM (32 bits) or GCOS (36 bits).
                    782: The main problems are likely to arise in shifting or masking.
                    783: C now supports a bit-field facility, which can be used to write much of
                    784: this code in a reasonably portable way.
                    785: Frequently, portability of such code can be enhanced by
                    786: slight rearrangements in coding style.
                    787: Many of the incompatibilities seem to have the flavor of writing
                    788: .DS
                    789: x &= 0177700 ;
                    790: .DE
                    791: to clear the low order six bits of \fIx\fR.
                    792: This suffices on the PDP-11, but fails badly on GCOS and IBM.
                    793: If the bit field feature cannot be used, the same effect can be obtained by
                    794: writing
                    795: .DS
                    796: x &= \(ap 077 ;
                    797: .DE
                    798: which will work on all these machines.
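.PP
The difference between the two masks can be checked with a small sketch.
The function names are invented for illustration, and unsigned
operands are used so that the result does not depend on how negative
numbers are represented.

```c
/* Clear the low-order six bits; correct for any width of unsigned. */
unsigned clear_low_six(unsigned x)
{
    return x & ~077u;
}

/* The 16-bit habit: this mask also clears every bit above bit 15. */
unsigned clear_low_six_16bit_only(unsigned x)
{
    return x & 0177700u;
}
```

For a value that fits in 16 bits the two agree; with all bits set in a wider word, only the first preserves the high-order bits.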
                    799: .PP
                    800: The right shift operator is arithmetic shift on the PDP-11, and logical shift on most
                    801: other machines.
                    802: To obtain a logical shift on all machines, the left operand can be
                    803: typed \fBunsigned\fR.
                    804: Characters are considered signed integers on the PDP-11, and unsigned on the other machines.
                    805: This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware
                    806: which has infiltrated itself into the C language.
                    807: If there were a good way to discover the programs which would be affected, C could be changed;
                    808: in any case,
                    809: .I lint
                    810: is no help here.
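.PP
Both problems can be avoided in the program itself by stating the
intended type explicitly.
The following is a minimal sketch with invented function names;
it assumes the shift count is smaller than the width of unsigned.

```c
/* Right shift with an unsigned left operand: always a logical shift. */
unsigned logical_shift_right(int v, int n)
{
    return (unsigned)v >> n;
}

/* Read a character as a non-negative value, whatever plain char is. */
int char_as_unsigned(char c)
{
    return (unsigned char)c;
}
```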
                    811: .PP
                    812: The above discussion may have made the problem of portability seem
                    813: bigger than it in fact is.
                    814: The issues involved here are rarely subtle or mysterious, at least to the
                    815: implementor of the program, although they can involve some work to straighten out.
                    816: The most serious bar to the portability of
                    817: .UX
                    818: system utilities has been the inability to mimic
                    819: essential
                    820: .UX
                    821: system functions on the other systems.
                    822: The inability to seek to a random character position in a text file, or to establish a pipe
                    823: between processes, has involved far more rewriting
                    824: and debugging than any of the differences in C compilers.
                    825: On the other hand,
                    826: .I lint
                    827: has been very helpful
                    828: in moving the
                    829: .UX
                    830: operating system and associated
                    831: utility programs to other machines.
                    832: .SH
                    833: Shutting Lint Up
                    834: .PP
                    835: There are occasions when
                    836: the programmer is smarter than
                    837: .I lint .
                    838: There may be valid reasons for ``illegal'' type casts,
                    839: functions with a variable number of arguments, etc.
                    840: Moreover, as specified above, the flow of control information
                    841: produced by
                    842: .I lint
                    843: often has blind spots, causing occasional spurious
                    844: messages about perfectly reasonable programs.
                    845: Thus, some way of communicating with
                    846: .I lint ,
                    847: typically to shut it up, is desirable.
                    848: .PP
                    849: The form which this mechanism should take is not at all clear.
                    850: New keywords would require current and old compilers to
                    851: recognize these keywords, if only to ignore them.
                    852: This has both philosophical and practical problems.
                    853: New preprocessor syntax suffers from similar problems.
                    854: .PP
                    855: What was finally done was to cause a number of words
                    856: to be recognized by
                    857: .I lint
                    858: when they were embedded in comments.
                    859: This required minimal preprocessor changes;
                    860: the preprocessor just had to agree to pass comments
                    861: through to its output, instead of deleting them
                    862: as had been previously done.
                    863: Thus,
                    864: .I lint
                    865: directives are invisible to the compilers, and
                    866: the effect on systems with the older preprocessors
                    867: is merely that the
                    868: .I lint
                    869: directives don't work.
                    870: .PP
                    871: The first directive is concerned with flow of control information;
                    872: if a particular place in the program cannot be reached,
                    873: but this is not apparent to
                    874: .I lint ,
                    875: this can be asserted by the directive
                    876: .DS
                    877: /* NOTREACHED */
                    878: .DE
                    879: at the appropriate spot in the program.
                    880: Similarly, if it is desired to turn off
                    881: strict type checking for
                    882: the next expression, the directive
                    883: .DS
                    884: /* NOSTRICT */
                    885: .DE
                    886: can be used; the situation reverts to the
                    887: previous default after the next expression.
                    888: The
                    889: .B \-v
                    890: flag can be turned on for one function by the directive
                    891: .DS
                    892: /* ARGSUSED */
                    893: .DE
                    894: Complaints about variable number of arguments in calls to a function
                    895: can be turned off by the directive
                    896: .DS
                    897: /* VARARGS */
                    898: .DE
                    899: preceding the function definition.
                    900: In some cases, it is desirable to check the
                    901: first several arguments, and leave the later arguments unchecked.
                    902: This can be done by following the VARARGS keyword immediately
                    903: with a digit giving the number of arguments which should be checked; thus,
                    904: .DS
                    905: /* VARARGS2 */
                    906: .DE
                    907: will cause the first two arguments to be checked, the others unchecked.
                    908: Finally, the directive
                    909: .DS
                    910: /* LINTLIBRARY */
                    911: .DE
                    912: at the head of a file identifies this file as
                    913: a library declaration file; this topic is worth a
                    914: section by itself.
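.PP
A sketch of how the first few directives might appear in a source file.
The functions are hypothetical, and the example is written in
present-day C using stdarg.h, which postdates this paper;
only the placement of the comments is the point.

```c
#include <stdarg.h>
#include <stdlib.h>

/* exit() never returns, but this is not apparent from the text alone. */
void give_up(int status)
{
    exit(status);
    /* NOTREACHED */
}

/* The second argument is deliberately ignored in this function. */
/* ARGSUSED */
int on_signal(int signo, int flags)
{
    return signo;
}

/* Check the first two arguments of each call; leave the rest unchecked. */
/* VARARGS2 */
int sum_from(int base, int count, ...)
{
    va_list ap;
    int total = base;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* add each extra int argument */
    va_end(ap);
    return total;
}
```

Because the directives are ordinary comments, the compiler sees nothing unusual in this file.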
                    915: .SH
                    916: Library Declaration Files
                    917: .PP
                    918: .I Lint
                    919: accepts certain library directives, such as
                    920: .DS
                    921: \-ly
                    922: .DE
                    923: and tests the source files for compatibility with these libraries.
                    924: This is done by accessing library description files whose
                    925: names are constructed from the library directives.
                    926: These files all begin with the directive
                    927: .DS
                    928: /* LINTLIBRARY */
                    929: .DE
                    930: which is followed by a series of dummy function
                    931: definitions.
                    932: The critical parts of these definitions
                    933: are the declaration of the function return type,
                    934: whether the dummy function returns a value, and
                    935: the number and types of arguments to the function.
                    936: The VARARGS and ARGSUSED directives can
                    937: be used to specify features of the library functions.
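.PP
A library description file in this style might look like the sketch below.
The library and its function names are hypothetical, and the
definitions are written in present-day C; the bodies exist only to
record return types, whether a value is returned, and the arguments.

```c
/* LINTLIBRARY */

/* Returns a value: callers may use the result. */
int ly_lookup(const char *name) { return 0; }

/* Returns a pointer value. */
char *ly_text(int token) { return (char *)0; }

/* Returns no value: using the "result" should draw a complaint. */
void ly_reset(void) { }

/* Only the first argument of each call is checked. */
/* VARARGS1 */
int ly_report(const char *format, ...) { return 0; }
```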
                    938: .PP
                    939: .I Lint
                    940: library files are processed almost exactly like ordinary
                    941: source files.
                    942: The only difference is that functions which are defined on a library file,
                    943: but are not used on a source file, draw no complaints.
                    944: .I Lint
                    945: does not simulate a full library search algorithm,
                    946: and complains if the source files contain a redefinition of
                    947: a library routine (this is a feature!).
                    948: .PP
                    949: By default,
                    950: .I lint
                    951: checks the programs it is given against a standard library
                    952: file, which contains descriptions of the programs which
                    953: are normally loaded when
                    954: a C program
                    955: is run.
                    956: When the
                    957: .B \-p
                    958: flag is in effect, another file is checked containing
                    959: descriptions of the standard I/O library routines
                    960: which are expected to be portable across various machines.
                    961: The
                    962: .B \-n
                    963: flag can be used to suppress all library checking.
                    964: .SH
                    965: Bugs, etc.
                    966: .PP
                    967: .I Lint
                    968: was a difficult program to write, partially
                    969: because it is closely connected with matters of programming style,
                    970: and partially because users usually don't notice bugs which cause
                    971: .I lint
                    972: to miss errors which it should have caught.
                    973: (By contrast, if
                    974: .I lint
                    975: incorrectly complains about something that is correct, the
                    976: programmer reports that immediately!)
                    977: .PP
                    978: A number of areas remain to be further developed.
                    979: The checking of structures and arrays is rather inadequate;
                    980: size
                    981: incompatibilities go unchecked,
                    982: and no attempt is made to match up structure and union
                    983: declarations across files.
                    984: Some stricter checking of the use of the
                    985: .B typedef
                    986: is clearly desirable, but what checking is appropriate, and how
                    987: to carry it out, is still to be determined.
                    988: .PP
                    989: .I Lint
                    990: shares the preprocessor with the C compiler.
                    991: At some point it may be appropriate for a
                    992: special version of the preprocessor to be constructed
                    993: which checks for things such as unused macro definitions,
                    994: macro arguments which have side effects which are
                    995: not expanded at all, or are expanded more than once, etc.
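.PP
The hazard of an argument that is expanded more than once can be seen
in a small sketch.
The macro and function names are invented for illustration;
the counter records how often the argument expression actually runs.

```c
/* A classic macro that mentions each argument more than once. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

static int calls;           /* how many times next_value() has run */

static int next_value(void)
{
    calls++;
    return 10;
}

/* Returns the number of times the macro argument was evaluated. */
int evaluations_in_max(void)
{
    int m;

    calls = 0;
    m = MAX(next_value(), 0);   /* expands next_value() twice */
    (void)m;
    return calls;
}
```

Here the argument runs once for the comparison and once more to produce the result, so a side effect in it happens twice.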
                    996: .PP
                    997: The central problem with
                    998: .I lint
                    999: is the packaging of the information which it collects.
                   1000: There are many options which
                   1001: serve only to turn off, or slightly modify,
                   1002: certain features.
                   1003: There are pressures to add even more of these options.
                   1004: .PP
                   1005: In conclusion, it appears that the general notion of having two
                   1006: programs is a good one.
                   1007: The compiler concentrates on quickly and accurately turning the
                   1008: program text into bits which can be run;
                   1009: .I lint
                   1010: concentrates on issues
                   1011: of portability, style, and efficiency.
                   1012: .I Lint
                   1013: can afford to be wrong, since incorrectness and over-conservatism
                   1014: are merely annoying, not fatal.
                   1015: The compiler can be fast since it knows that
                   1016: .I lint
                   1017: will cover its flanks.
                   1018: Finally, the programmer can
                   1019: concentrate at one stage
                   1020: of the programming process solely on the algorithms,
                   1021: data structures, and correctness of the
                   1022: program, and then later retrofit,
                   1023: with the aid of
                   1024: .I lint ,
                   1025: the desirable properties of universality and portability.
                   1026: .SG MH-1273-SCJ-unix
                   1027: .\".bp
                   1028: .[
                   1029: $LIST$
                   1030: .]
                   1031: .bp
                   1032: .SH
                   1033: Appendix:   Current Lint Options
                   1034: .PP
                   1035: The command currently has the form
                   1036: .DS
                   1037: lint\fR [\fB\-\fRoptions ] files... library-descriptors...
                   1038: .DE
                   1039: The options are
                   1040: .IP \fBh\fR
                   1041: Perform heuristic checks
                   1042: .IP \fBp\fR
                   1043: Perform portability checks
                   1044: .IP \fBv\fR
                   1045: Don't report unused arguments
                   1046: .IP \fBu\fR
                   1047: Don't report unused or undefined externals
                   1048: .IP \fBb\fR
                   1049: Report unreachable
                   1050: .B break
                   1051: statements
                   1052: .IP \fBx\fR
                   1053: Report unused external declarations
                   1054: .IP \fBa\fR
                   1055: Report assignments of
                   1056: .B long
                   1057: to
                   1058: .B int
                   1059: or shorter
                   1060: .IP \fBc\fR
                   1061: Complain about questionable casts
                   1062: .IP \fBn\fR
                   1063: No library checking is done
                   1064: .IP \fBs\fR
                   1065: Same as
                   1066: .B h
                   1067: (for historical reasons)
