1: .\" @(#)lint 6.1 (Berkeley) 5/7/86
2: .\"
3: .EH 'PS1:9-%''Lint, a C Program Checker'
4: .OH 'Lint, a C Program Checker''PS1:9-%'
5: .\".RP
6: .ND "July 26, 1978"
7: .OK
8: .\"Program Portability
9: .\"Strong Type Checking
10: .TL
11: Lint, a C Program Checker
12: .AU "MH 2C-559" 3968
13: S. C. Johnson
14: .AI
15: .MH
16: .AB
17: .PP
18: .I Lint
19: is a command which examines C source programs,
20: detecting
21: a number of bugs and obscurities.
22: It enforces the type rules of C more strictly than
23: the C compilers.
24: It may also be used to enforce a number of portability
25: restrictions involved in moving
26: programs between different machines and/or operating systems.
27: Another option detects a number of wasteful, or error prone, constructions
28: which nevertheless are, strictly speaking, legal.
29: .PP
30: .I Lint
31: accepts multiple input files and library specifications, and checks them for consistency.
32: .PP
33: The separation of function between
34: .I lint
35: and the C compilers has both historical and practical
36: rationale.
37: The compilers turn C programs into executable files rapidly
38: and efficiently.
39: This is possible in part because the
40: compilers do not do sophisticated
41: type checking, especially between
42: separately compiled programs.
43: .I Lint
44: takes a more global, leisurely view of the program,
45: looking much more carefully at the compatibilities.
46: .PP
47: This document discusses the use of
48: .I lint ,
49: gives an overview of the implementation, and gives some hints on the
50: writing of machine independent C code.
51: .AE
52: .CS 10 2 12 0 0 5
53: .SH
54: Introduction and Usage
55: .PP
56: Suppose there are two C
57: .[
58: Kernighan Ritchie Programming Prentice 1978
59: .]
60: source files,
61: .I file1.c
62: and
63: .I file2.c ,
64: which are ordinarily compiled and loaded together.
65: Then the command
66: .DS
67: lint file1.c file2.c
68: .DE
69: produces messages describing inconsistencies and inefficiencies
70: in the programs.
71: The program enforces the typing rules of C
72: more strictly than the C compilers
73: (for both historical and practical reasons)
74: enforce them.
75: The command
76: .DS
77: lint \-p file1.c file2.c
78: .DE
79: will produce, in addition to the above messages, further messages
80: which relate to the portability of the programs to other operating
81: systems and machines.
82: Replacing the
83: .B \-p
84: by
85: .B \-h
86: will produce messages about various error-prone or wasteful constructions
87: which, strictly speaking, are not bugs.
88: Saying
89: .B \-hp
90: gets the whole works.
91: .PP
92: The next several sections describe the major messages;
93: the document closes with sections
94: discussing the implementation and giving suggestions
95: for writing portable C.
96: An appendix gives a summary of the
97: .I lint
98: options.
99: .SH
100: A Word About Philosophy
101: .PP
102: Many of the facts which
103: .I lint
104: needs may be impossible to
105: discover.
106: For example, whether a given function in a program ever gets called
107: may depend on the input data.
108: Deciding whether
109: .I exit
110: is ever called is equivalent to solving the famous ``halting problem,'' known to be
111: recursively undecidable.
112: .PP
113: Thus, most of the
114: .I lint
115: algorithms are a compromise.
116: If a function is never mentioned, it can never be called.
117: If a function is mentioned,
118: .I lint
119: assumes it can be called; this is not necessarily so, but in practice is quite reasonable.
120: .PP
121: .I Lint
122: tries to give information with a high degree of relevance.
123: Messages of the form ``\fIxxx\fR might be a bug''
124: are easy to generate, but are acceptable only in proportion
125: to the fraction of real bugs they uncover.
126: If this fraction of real bugs is too small, the messages lose their credibility
127: and serve merely to clutter up the output,
128: obscuring the more important messages.
129: .PP
130: Keeping these issues in mind, we now consider in more detail
131: the classes of messages which
132: .I lint
133: produces.
134: .SH
135: Unused Variables and Functions
136: .PP
137: As sets of programs evolve and develop,
138: previously used variables and arguments to
139: functions may become unused;
140: it is not uncommon for external variables, or even entire
141: functions, to become unnecessary, and yet
142: not be removed from the source.
143: These ``errors of commission'' rarely cause working programs to fail, but they are a source
144: of inefficiency, and make programs harder to understand
145: and change.
146: Moreover, information about such unused variables and functions can occasionally
147: serve to discover bugs; if a function does a necessary job, and
148: is never called, something is wrong!
149: .PP
150: .I Lint
151: complains about variables and functions which are defined but not otherwise
152: mentioned.
153: An exception is variables which are declared through explicit
154: .B extern
155: statements but are never referenced; thus the statement
156: .DS
157: extern float sin(\|);
158: .DE
159: will evoke no comment if
160: .I sin
161: is never used.
162: Note that this agrees with the semantics of the C compiler.
163: In some cases, these unused external declarations might be of some interest; they
164: can be discovered by adding the
165: .B \-x
166: flag to the
167: .I lint
168: invocation.
169: .PP
170: Certain styles of programming
171: require many functions to be written with similar interfaces;
172: frequently, some of the arguments may be unused
173: in many of the calls.
174: The
175: .B \-v
176: option is available to suppress the printing of
177: complaints about unused arguments.
178: When
179: .B \-v
180: is in effect, no messages are produced about unused
181: arguments except for those
182: arguments which are unused and also declared as
183: register arguments; this can be considered
184: an active (and preventable) waste of the register
185: resources of the machine.
186: .PP
187: There is one case where information about unused, or
188: undefined, variables is more distracting
189: than helpful.
190: This is when
191: .I lint
192: is applied to some, but not all, files out of a collection
193: which are to be loaded together.
194: In this case, many of the functions and variables defined
195: may not be used, and, conversely,
196: many functions and variables defined elsewhere may be used.
197: The
198: .B \-u
199: flag may be used to suppress the spurious messages which might otherwise appear.
200: .SH
201: Set/Used Information
202: .PP
203: .I Lint
204: attempts to detect cases where a variable is used before it is set.
205: This is very difficult to do well;
206: many algorithms take a good deal of time and space,
207: and still produce messages about perfectly valid programs.
208: .I Lint
209: detects local variables (automatic and register storage classes)
210: whose first use appears physically earlier in the input file than the first assignment to the variable.
211: It assumes that taking the address of a variable constitutes a ``use,'' since the actual use
212: may occur at any later time, in a data dependent fashion.
213: .PP
214: The restriction to the physical appearance of variables in the file makes the
215: algorithm very simple and quick to implement,
216: since the true flow of control need not be discovered.
217: It does mean that
218: .I lint
219: can complain about some programs which are legal,
220: but these programs would probably be considered bad on stylistic grounds (e.g., they might
221: contain at least two \fBgoto\fR's).
222: Because static and external variables are initialized to 0,
223: no meaningful information can be discovered about their uses.
224: The algorithm deals correctly, however, with initialized automatic variables, and variables
225: which are used in the expression which first sets them.
226: .PP
227: The set/used information also permits recognition of those local variables which are set
228: and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs.
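.PP
As a hedged illustration of the physical-order rule, consider the sketch below.
The function name
.I order_demo
is hypothetical, and modern prototype syntax is assumed so that the fragment compiles with current compilers.
The function is legal and well defined, yet the first use of
.I x
appears earlier in the file than the first assignment, so it is the kind of code that would presumably draw a complaint.
.DS
```c
/* Legal C: at run time x is set before it is read, but the read
   appears physically earlier in the file than the assignment. */
int order_demo(void)
{
    int x;

    goto set;
use:
    return x + 1;       /* first use, textually before the assignment */
set:
    x = 41;             /* first assignment, textually after the use */
    goto use;
}
```
.DE
Rewriting the function so that the assignment precedes the use removes both
\fBgoto\fR's and, with them, the reason for the message.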
229: .SH
230: Flow of Control
231: .PP
232: .I Lint
233: attempts to detect unreachable portions of the programs which it processes.
234: It will complain about unlabeled statements immediately following
235: \fBgoto\fR, \fBbreak\fR, \fBcontinue\fR, or \fBreturn\fR statements.
236: An attempt is made to detect loops which can never be left at the bottom, detecting the
237: special cases
238: \fBwhile\fR( 1 ) and \fBfor\fR(;;) as infinite loops.
239: .I Lint
240: also complains about loops which cannot be entered at the top;
241: some valid programs may have such loops, but at best they are bad style,
242: at worst bugs.
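.PP
A minimal sketch of an unreachable statement of the kind described above follows.
The function
.I first_negative
is a hypothetical example, written with modern prototype syntax; the assignment after the
.B continue
is unlabeled and can never be executed.
.DS
```c
/* Returns the index of the first negative element of v, or -1. */
int first_negative(const int *v, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (v[i] < 0)
            return i;
        continue;
        i = n;          /* unlabeled statement after continue: never executed */
    }
    return -1;
}
```
.DE
The function still behaves correctly; the dead assignment is merely clutter, which is why a message about it is useful.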
243: .PP
244: .I Lint
245: has an important area of blindness in the flow of control algorithm:
246: it has no way of detecting functions which are called and never return.
247: Thus, a call to
248: .I exit
249: may cause unreachable code which
250: .I lint
251: does not detect; the most serious effects of this are in the
252: determination of returned function values (see the next section).
253: .PP
254: One form of unreachable statement is not usually complained about by
255: .I lint ;
256: a
257: .B break
258: statement that cannot be reached causes no message.
259: Programs generated by
260: .I yacc ,
261: .[
262: Johnson Yacc 1975
263: .]
264: and especially
265: .I lex ,
266: .[
267: Lesk Lex
268: .]
269: may have literally hundreds of unreachable
270: .B break
271: statements.
272: The
273: .B \-O
274: flag in the C compiler will often eliminate the resulting object code inefficiency.
275: Thus, these unreached statements are of little importance,
276: there is typically nothing the user can do about them, and the
277: resulting messages would clutter up the
278: .I lint
279: output.
280: If these messages are desired,
281: .I lint
282: can be invoked with the
283: .B \-b
284: option.
285: .SH
286: Function Values
287: .PP
288: Sometimes functions return values which are never used;
289: sometimes programs incorrectly use function ``values''
290: which have never been returned.
291: .I Lint
292: addresses this problem in a number of ways.
293: .PP
294: Locally, within a function definition,
295: the appearance of both
296: .DS
297: return( \fIexpr\fR );
298: .DE
299: and
300: .DS
301: return ;
302: .DE
303: statements is cause for alarm;
304: .I lint
305: will give the message
306: .DS
307: function \fIname\fR contains return(e) and return
308: .DE
309: The most serious difficulty with this is detecting when a function return is implied
310: by flow of control reaching the end of the function.
311: This can be seen with a simple example:
312: .DS
313: .ta .5i 1i 1.5i
314: \fRf ( a ) {
315: if ( a ) return ( 3 );
316: g (\|);
317: }
318: .DE
319: Notice that, if \fIa\fR tests false, \fIf\fR will call \fIg\fR and then return
320: with no defined return value; this will trigger a complaint from
321: .I lint .
322: If \fIg\fR, like \fIexit\fR, never returns,
323: the message will still be produced when in fact nothing is wrong.
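.PP
When \fIg\fR does return, one way to make the function consistent is to supply a value on every path.
The sketch below assumes that 0 is an acceptable result for the fall-through case, and the counter
.I g_calls
is a hypothetical addition that merely records that \fIg\fR ran.
.DS
```c
static int g_calls = 0;     /* hypothetical: records that g ran */

static void g(void)
{
    g_calls++;
}

int f(int a)
{
    if (a)
        return 3;
    g();
    return 0;               /* explicit value on the fall-through path */
}
```
.DE
With the explicit return, every exit from \fIf\fR yields a defined value.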
324: .PP
325: In practice, some potentially serious bugs have been discovered by this feature;
326: it also accounts for a substantial fraction of the ``noise'' messages produced
327: by
328: .I lint .
329: .PP
330: On a global scale,
331: .I lint
332: detects cases where a function returns a value, but this value is sometimes,
333: or always, unused.
334: When the value is always unused, it may constitute an inefficiency in the function definition.
335: When the value is sometimes unused, it may represent bad style (e.g., not testing for
336: error conditions).
337: .PP
338: The dual problem, using a function value when the function does not return one,
339: is also detected.
340: This is a serious problem.
341: Amazingly, this bug has been observed on a couple of occasions
342: in ``working'' programs; the desired function value just happened to have been computed
343: in the function return register!
344: .SH
345: Type Checking
346: .PP
347: .I Lint
348: enforces the type checking rules of C more strictly than the compilers do.
349: The additional checking is in four major areas:
350: across certain binary operators and implied assignments,
351: at the structure selection operators,
352: between the definition and uses of functions,
353: and in the use of enumerations.
354: .PP
355: There are a number of operators which have an implied balancing between types of the operands.
356: The assignment, conditional ( ?\|: ), and relational operators
357: have this property; the argument
358: of a \fBreturn\fR statement,
359: and expressions used in initialization also suffer similar conversions.
360: In these operations,
361: \fBchar\fR, \fBshort\fR, \fBint\fR, \fBlong\fR, \fBunsigned\fR, \fBfloat\fR, and \fBdouble\fR types may be freely intermixed.
362: The types of pointers must agree exactly,
363: except that arrays of \fIx\fR's can, of course, be intermixed with pointers to \fIx\fR's.
364: .PP
365: The type checking rules also require that, in structure references, the
366: left operand of the \(em> be a pointer to structure, the left operand of the \fB.\fR
367: be a structure, and the right operand of these operators be a member
368: of the structure implied by the left operand.
369: Similar checking is done for references to unions.
370: .PP
371: Strict rules apply to function argument and return value
372: matching.
373: The types \fBfloat\fR and \fBdouble\fR may be freely matched,
374: as may the types \fBchar\fR, \fBshort\fR, \fBint\fR, and \fBunsigned\fR.
375: Also, pointers can be matched with the associated arrays.
376: Aside from this, all actual arguments must agree in type with their declared counterparts.
377: .PP
378: With enumerations, checks are made that enumeration variables or members are not mixed
379: with other types, or other enumerations,
380: and that the only operations applied are =, initialization, ==, !=, and function arguments and return values.
381: .SH
382: Type Casts
383: .PP
384: The type cast feature in C was introduced largely as an aid
385: to producing more portable programs.
386: Consider the assignment
387: .DS
388: p = 1 ;
389: .DE
390: where
391: .I p
392: is a character pointer.
393: .I Lint
394: will quite rightly complain.
395: Now, consider the assignment
396: .DS
397: p = (char \(**)1 ;
398: .DE
399: in which a cast has been used to
400: convert the integer to a character pointer.
401: The programmer obviously had a strong motivation
402: for doing this, and has clearly signaled his intentions.
403: It seems harsh for
404: .I lint
405: to continue to complain about this.
406: On the other hand, if this code is moved to another
407: machine, such code should be looked at carefully.
408: The
409: .B \-c
410: flag controls the printing of comments about casts.
411: When
412: .B \-c
413: is in effect, casts are treated as though they were assignments
414: subject to complaint; otherwise, all legal casts are passed without comment,
415: no matter how strange the type mixing seems to be.
416: .SH
417: Nonportable Character Use
418: .PP
419: On the PDP-11, characters are signed quantities, with a range
420: from \-128 to 127.
421: On most of the other C implementations, characters take on only positive
422: values.
423: Thus,
424: .I lint
425: will flag certain comparisons and assignments as being
426: illegal or nonportable.
427: For example, the fragment
428: .DS
429: char c;
430: ...
431: if( (c = getchar(\|)) < 0 ) ....
432: .DE
433: works on the PDP-11, but
434: will fail on machines where characters always take
435: on positive values.
436: The real solution is to declare
437: .I c
438: an integer, since
439: .I getchar
440: is actually returning
441: integer values.
442: In any case,
443: .I lint
444: will say
445: ``nonportable character comparison''.
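.PP
A sketch of the recommended fix follows.
It is an illustration only: the function
.I count_chars
is hypothetical, and it reads from an explicit stream with
.I getc
rather than from the standard input, purely so that it is easy to exercise.
The essential point is that
.I c
is declared as an integer.
.DS
```c
#include <stdio.h>

/* Counts the characters remaining on fp. */
long count_chars(FILE *fp)
{
    int c;              /* int, not char, so that EOF stays distinguishable */
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```
.DE
Because the end-of-file value is compared as an integer, the loop terminates whether characters are signed or unsigned on the machine at hand.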
446: .PP
447: A similar issue arises with bitfields; when assignments
448: of constant values are made to bitfields, the field may
449: be too small to hold the value.
450: This is especially true because
451: on some machines bitfields are considered as signed
452: quantities.
453: While it may seem unintuitive to consider
454: that a two bit field declared of type
455: .B int
456: cannot hold the value 3, the problem disappears
457: if the bitfield is declared to have type
458: .B unsigned .
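.PP
The point can be sketched as follows; the structure
.I flags
and the function
.I store_mode
are hypothetical names chosen for the example.
.DS
```c
struct flags {
    unsigned int mode : 2;  /* unsigned: holds 0 through 3 */
};

/* Stores v in a two bit unsigned field and returns what the field holds. */
unsigned store_mode(unsigned v)
{
    struct flags f;

    f.mode = v;             /* values are reduced modulo 4 on assignment */
    return f.mode;
}
```
.DE
With the unsigned declaration the value 3 survives the assignment; values too large for the field are reduced modulo 4.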
459: .SH
460: Assignments of longs to ints
461: .PP
462: Bugs may arise from the assignment of
463: .B long
464: to
465: an
466: .B int ,
467: which loses accuracy.
468: This may happen in programs
469: which have been incompletely converted to use
470: .B typedefs .
471: When a
472: .B typedef
473: variable
474: is changed from \fBint\fR to \fBlong\fR,
475: the program can stop working because
476: some intermediate results may be assigned
477: to \fBints\fR, losing accuracy.
478: Since there are a number of legitimate reasons for
479: assigning \fBlongs\fR to \fBints\fR, the detection
480: of these assignments is enabled
481: by the
482: .B \-a
483: flag.
484: .SH
485: Strange Constructions
486: .PP
487: Several perfectly legal, but somewhat strange, constructions
488: are flagged by
489: .I lint ;
490: the messages are meant to encourage better code quality and clearer style, and
491: may even point out bugs.
492: The
493: .B \-h
494: flag is used to enable these checks.
495: For example, in the statement
496: .DS
497: \(**p++ ;
498: .DE
499: the \(** does nothing; this provokes the message ``null effect'' from
500: .I lint .
501: The program fragment
502: .DS
503: unsigned x ;
504: if( x < 0 ) ...
505: .DE
506: is clearly somewhat strange; the
507: test will never succeed.
508: Similarly, the test
509: .DS
510: if( x > 0 ) ...
511: .DE
512: is equivalent to
513: .DS
514: if( x != 0 )
515: .DE
516: which may not be the intended action.
517: .I Lint
518: will say ``degenerate unsigned comparison'' in these cases.
519: If one says
520: .DS
521: if( 1 != 0 ) ....
522: .DE
523: .I lint
524: will report
525: ``constant in conditional context'', since the comparison
526: of 1 with 0 gives a constant result.
527: .PP
528: Another construction
529: detected by
530: .I lint
531: involves
532: operator precedence.
533: Bugs which arise from misunderstandings about the precedence
534: of operators can be accentuated by spacing and formatting,
535: making such bugs extremely hard to find.
536: For example, the statements
537: .DS
538: if( x&077 == 0 ) ...
539: .DE
540: or
541: .DS
542: x<\h'-.3m'<2 + 40
543: .DE
544: probably do not do what was intended.
545: The best solution is to parenthesize such expressions,
546: and
547: .I lint
548: encourages this by an appropriate message.
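.PP
The following sketch spells out the parenthesization; the function names are hypothetical.
Since == binds more tightly than &, and + more tightly than <<,
the unparenthesized forms compute something quite different from what their spacing suggests.
.DS
```c
/* What `x&077 == 0` actually computes: == binds tighter than &. */
int low_six_clear_as_parsed(int x)
{
    return x & (077 == 0);
}

/* What was probably intended. */
int low_six_clear(int x)
{
    return (x & 077) == 0;
}

/* `x<<2 + 40` parses as x << (2 + 40); the likely intent is this. */
int scale_and_offset(int x)
{
    return (x << 2) + 40;
}
```
.DE
Note that the form as parsed yields 0 for every \fIx\fR, since 077 == 0 is itself 0.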
549: .PP
550: Finally, when the
551: .B \-h
552: flag is in force
553: .I lint
554: complains about variables which are redeclared in inner blocks
555: in a way that conflicts with their use in outer blocks.
556: This is legal, but is considered by many (including the author) to
557: be bad style, usually unnecessary, and frequently a bug.
558: .SH
559: Ancient History
560: .PP
561: There are several forms of older syntax which are being officially
562: discouraged.
563: These fall into two classes, assignment operators and initialization.
564: .PP
565: The older forms of assignment operators (e.g., =+, =\-, . . . )
566: could cause ambiguous expressions, such as
567: .DS
568: a =\-1 ;
569: .DE
570: which could be taken as either
571: .DS
572: a =\- 1 ;
573: .DE
574: or
575: .DS
576: a = \-1 ;
577: .DE
578: The situation is especially perplexing if this
579: kind of ambiguity arises as the result of a macro substitution.
580: The newer, and preferred operators (+=, \-=, etc. )
581: have no such ambiguities.
582: To spur the abandonment of the older forms,
583: .I lint
584: complains about these old fashioned operators.
585: .PP
586: A similar issue arises with initialization.
587: The older language allowed
588: .DS
589: int x \fR1 ;
590: .DE
591: to initialize
592: .I x
593: to 1.
594: This also caused syntactic difficulties: for example,
595: .DS
596: int x ( \-1 ) ;
597: .DE
598: looks somewhat like the beginning of a function declaration:
599: .DS
600: int x ( y ) { . . .
601: .DE
602: and the compiler must read a fair ways past
603: .I x
604: in order to be sure what the declaration really is.
605: Again, the problem is even more perplexing when the
606: initializer involves a macro.
607: The current syntax places an equals sign between the
608: variable and the initializer:
609: .DS
610: int x = \-1 ;
611: .DE
612: This is free of any possible syntactic ambiguity.
613: .SH
614: Pointer Alignment
615: .PP
616: Certain pointer assignments may be reasonable on some machines,
617: and illegal on others, due entirely to
618: alignment restrictions.
619: For example, on the PDP-11, it is reasonable
620: to assign integer pointers to double pointers, since
621: double precision values may begin on any integer boundary.
622: On the Honeywell 6000, double precision values must begin
623: on even word boundaries;
624: thus, not all such assignments make sense.
625: .I Lint
626: tries to detect cases where pointers are assigned to other
627: pointers, and such alignment problems might arise.
628: The message ``possible pointer alignment problem''
629: results from this situation whenever either the
630: .B \-p
631: or
632: .B \-h
633: flag is in effect.
634: .SH
635: Multiple Uses and Side Effects
636: .PP
637: In complicated expressions, the best order in which to evaluate
638: subexpressions may be highly machine dependent.
639: For example, on machines (like the PDP-11) in which the stack
640: runs backwards, function arguments will probably be best evaluated
641: from right-to-left; on machines with a stack running forward,
642: left-to-right seems most attractive.
643: Function calls embedded as arguments of other functions
644: may or may not be treated similarly to ordinary arguments.
645: Similar issues arise with other operators which have side effects,
646: such as the assignment operators and the increment and decrement operators.
647: .PP
648: In order that the efficiency of C on a particular machine not be
649: unduly compromised, the C language leaves the order
650: of evaluation of complicated expressions up to the
651: local compiler, and, in fact, the various C compilers have considerable
652: differences in the order in which they will evaluate complicated
653: expressions.
654: In particular, if any variable is changed by a side effect, and
655: also used elsewhere in the same expression, the result is explicitly undefined.
656: .PP
657: .I Lint
658: checks for the important special case where
659: a simple scalar variable is affected.
660: For example, the statement
661: .DS
662: \fIa\fR[\fIi\|\fR] = \fIb\fR[\fIi\fR++] ;
663: .DE
664: will draw the complaint:
665: .DS
666: warning: \fIi\fR evaluation order undefined
667: .DE
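.PP
One way to avoid the complaint, sketched here under the assumption that the intent was to copy one element and then advance the index,
is to separate the use from the side effect.
The function
.I copy_then_advance
is a hypothetical name.
.DS
```c
/* Copies b[*ip] to a[*ip], then advances *ip; the index is read
   and incremented in separate statements, so the order is fixed. */
void copy_then_advance(int *a, const int *b, int *ip)
{
    a[*ip] = b[*ip];
    (*ip)++;
}
```
.DE
The result no longer depends on the order in which a particular compiler evaluates the two sides of the assignment.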
668: .SH
669: Implementation
670: .PP
671: .I Lint
672: consists of two programs and a driver.
673: The first program is a version of the
674: Portable C Compiler
675: .[
676: Johnson Ritchie BSTJ Portability Programs System
677: .]
678: .[
679: Johnson portable compiler 1978
680: .]
681: which is the basis of the
682: IBM 370, Honeywell 6000, and Interdata 8/32 C compilers.
683: This compiler does lexical and syntax analysis on the input text,
684: constructs and maintains symbol tables, and builds trees for expressions.
685: Instead of writing an intermediate file which is passed to
686: a code generator, as the other compilers
687: do,
688: .I lint
689: produces an intermediate file which consists of lines of ASCII text.
690: Each line contains an external variable name,
691: an encoding of the context in which it was seen (use, definition, declaration, etc.),
692: a type specifier, and a source file name and line number.
693: The information about variables local to a function or file
694: is collected
695: by accessing the symbol table, and examining the expression trees.
696: .PP
697: Comments about local problems are produced as detected.
698: The information about external names is collected
699: onto an intermediate file.
700: After all the source files and library descriptions have
701: been collected, the intermediate file is sorted
702: to bring all information collected about a given external
703: name together.
704: The second, rather small, program then reads the lines
705: from the intermediate file and compares all of the
706: definitions, declarations, and uses for consistency.
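.PP
The sort-and-compare step can be pictured with the rough sketch below.
It is not the actual second pass: the record layout, the context codes, the textual type field, and the function names are all assumptions made for illustration,
and the only check shown is that every record for a given name carries the same type.
.DS
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one line of the intermediate file. */
struct extrec {
    const char *name;   /* external name */
    char context;       /* assumed codes: D definition, d declaration, u use */
    const char *type;   /* type specifier, kept as text in this sketch */
    const char *file;   /* source file name */
    int line;           /* source line number */
};

/* Comparison function for qsort: orders records by external name. */
static int by_name(const void *p, const void *q)
{
    const struct extrec *a = p;
    const struct extrec *b = q;

    return strcmp(a->name, b->name);
}

/* Sorts the records by name, then returns the number of names
   whose records do not all agree in type. */
int count_type_conflicts(struct extrec *r, size_t n)
{
    size_t i, j;
    int conflicts = 0;

    qsort(r, n, sizeof r[0], by_name);
    for (i = 0; i < n; i = j) {
        int bad = 0;

        /* r[i] through r[j-1] all describe the same name */
        for (j = i + 1; j < n && strcmp(r[j].name, r[i].name) == 0; j++)
            if (strcmp(r[j].type, r[i].type) != 0)
                bad = 1;
        conflicts += bad;
    }
    return conflicts;
}
```
.DE
Sorting first means that each name can be checked in a single pass over adjacent records, which is presumably why the second program can be rather small.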
707: .PP
708: The driver controls this
709: process, and is also responsible for making the options available
710: to both passes of
711: .I lint .
712: .SH
713: Portability
714: .PP
715: C on the Honeywell and IBM systems is used, in part, to write system code for the host operating system.
716: This means that the implementation of C tends to follow local conventions rather than
717: adhere strictly to
718: .UX
719: system conventions.
720: Despite these differences, many C programs have been successfully moved to GCOS and the various IBM
721: installations with little effort.
722: This section describes some of the differences between the implementations, and
723: discusses the
724: .I lint
725: features which encourage portability.
726: .PP
727: Uninitialized external variables are treated differently in different
728: implementations of C.
729: Suppose two files both contain a declaration without initialization, such as
730: .DS
731: int a ;
732: .DE
733: outside of any function.
734: The
735: .UX
736: loader will resolve these declarations, and cause only a single word of storage
737: to be set aside for \fIa\fR.
738: Under the GCOS and IBM implementations, this is not feasible (for various stupid reasons!)
739: so each such declaration causes a word of storage to be set aside and called \fIa\fR.
740: When loading or library editing takes place, this causes fatal conflicts which prevent
741: the proper operation of the program.
742: If
743: .I lint
744: is invoked with the \fB\-p\fR flag,
745: it will detect such multiple definitions.
746: .PP
747: A related difficulty comes from the amount of information retained about external names during the
748: loading process.
749: On the
750: .UX
751: system, externally known names have seven significant characters, with the upper/lower
752: case distinction kept.
753: On the IBM systems, there are eight significant characters, but the case distinction
754: is lost.
755: On GCOS, there are only six characters, of a single case.
756: This leads to situations where programs run on the
757: .UX
758: system, but encounter loader
759: problems on the IBM or GCOS systems.
760: .I Lint
761: .B \-p
762: causes all external symbols to be mapped to one case and truncated to six characters,
763: providing a worst-case analysis.
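.PP
The worst-case mapping can be sketched as follows.
The function names are hypothetical, and folding to lower case is an assumption; the text above says only that names are mapped to one case.
.DS
```c
#include <ctype.h>
#include <string.h>

/* Writes the worst-case form of name into out: one case, six characters. */
void worst_case_name(const char *name, char out[7])
{
    int i;

    for (i = 0; i < 6 && name[i] != 0; i++)
        out[i] = (char)tolower((unsigned char)name[i]);
    out[i] = 0;
}

/* Returns 1 if two external names become identical after the mapping. */
int names_collide(const char *a, const char *b)
{
    char fa[7], fb[7];

    worst_case_name(a, fa);
    worst_case_name(b, fb);
    return strcmp(fa, fb) == 0;
}
```
.DE
Two names that are distinct in the source but collide under this mapping are candidates for loader trouble on the more restrictive systems.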
764: .PP
765: A number of differences arise in the area of character handling: characters in the
766: .UX
767: system are eight bit ASCII, while they are eight bit EBCDIC on the IBM, and
768: nine bit ASCII on GCOS.
769: Moreover, character strings go from high to low bit positions (``left to right'')
770: on GCOS and IBM, and low to high (``right to left'') on the PDP-11.
771: This means that code attempting to construct strings
772: out of character constants, or attempting to use characters as indices
773: into arrays, must be looked at with great suspicion.
774: .I Lint
775: is of little help here, except to flag multi-character character constants.
776: .PP
777: Of course, the word sizes are different!
778: This causes less trouble than might be expected, at least when
779: moving from the
780: .UX
781: system (16 bit words) to the IBM (32 bits) or GCOS (36 bits).
782: The main problems are likely to arise in shifting or masking.
783: C now supports a bit-field facility, which can be used to write much of
784: this code in a reasonably portable way.
785: Frequently, portability of such code can be enhanced by
786: slight rearrangements in coding style.
787: Many of the incompatibilities seem to have the flavor of writing
788: .DS
789: x &= 0177700 ;
790: .DE
791: to clear the low order six bits of \fIx\fR.
792: This suffices on the PDP-11, but fails badly on GCOS and IBM.
793: If the bit field feature cannot be used, the same effect can be obtained by
794: writing
795: .DS
796: x &= \(ap 077 ;
797: .DE
798: which will work on all these machines.
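.PP
As a small sketch of the portable form (the function name
.I clear_low_six
is hypothetical), the mask is built by complementing the bits to be cleared, so no word size is written into the constant:
.DS
```c
/* Clears the low-order six bits of x without assuming a word size. */
unsigned clear_low_six(unsigned x)
{
    return x & ~077u;
}
```
.DE
On a 16 bit machine this agrees with the constant 0177700; on wider machines it preserves the high-order bits that the fixed constant would clear.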
799: .PP
800: The right shift operator is arithmetic shift on the PDP-11, and logical shift on most
801: other machines.
802: To obtain a logical shift on all machines, the left operand can be
803: typed \fBunsigned\fR.
804: Characters are considered signed integers on the PDP-11, and unsigned on the other machines.
805: This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware
806: which has infiltrated itself into the C language.
807: If there were a good way to discover the programs which would be affected, C could be changed;
808: in any case,
809: .I lint
810: is no help here.
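.PP
The suggestion about right shifts can be sketched as below; the function name
.I logical_shift_right
is hypothetical, and the shift count is assumed to be non-negative and less than the width of an
.B unsigned .
.DS
```c
/* Right shift with zero fill, whatever the machine does for signed values. */
unsigned logical_shift_right(int x, int n)
{
    return (unsigned)x >> n;
}
```
.DE
Because the left operand has been converted to \fBunsigned\fR, vacated bit positions are filled with zeros on every machine.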
811: .PP
812: The above discussion may have made the problem of portability seem
813: bigger than it in fact is.
814: The issues involved here are rarely subtle or mysterious, at least to the
815: implementor of the program, although they can involve some work to straighten out.
816: The most serious bar to the portability of
817: .UX
818: system utilities has been the inability to mimic
819: essential
820: .UX
821: system functions on the other systems.
822: The inability to seek to a random character position in a text file, or to establish a pipe
823: between processes, has involved far more rewriting
824: and debugging than any of the differences in C compilers.
825: On the other hand,
826: .I lint
827: has been very helpful
828: in moving the
829: .UX
830: operating system and associated
831: utility programs to other machines.
832: .SH
833: Shutting Lint Up
834: .PP
835: There are occasions when
836: the programmer is smarter than
837: .I lint .
838: There may be valid reasons for ``illegal'' type casts,
839: functions with a variable number of arguments, etc.
840: Moreover, as specified above, the flow of control information
841: produced by
842: .I lint
843: often has blind spots, causing occasional spurious
844: messages about perfectly reasonable programs.
845: Thus, some way of communicating with
846: .I lint ,
847: typically to shut it up, is desirable.
848: .PP
849: The form which this mechanism should take is not at all clear.
850: New keywords would require current and old compilers to
851: recognize these keywords, if only to ignore them.
852: This has both philosophical and practical problems.
853: New preprocessor syntax suffers from similar problems.
854: .PP
855: What was finally done was to cause a number of words
856: to be recognized by
857: .I lint
858: when they were embedded in comments.
859: This required minimal preprocessor changes;
860: the preprocessor just had to agree to pass comments
861: through to its output, instead of deleting them
862: as had been previously done.
863: Thus,
864: .I lint
865: directives are invisible to the compilers, and
866: the effect on systems with the older preprocessors
867: is merely that the
868: .I lint
869: directives don't work.
870: .PP
871: The first directive is concerned with flow of control information;
872: if a particular place in the program cannot be reached,
873: but this is not apparent to
874: .I lint ,
875: this can be asserted by the directive
876: .DS
877: /* NOTREACHED */
878: .DE
879: at the appropriate spot in the program.
880: Similarly, if it is desired to turn off
881: strict type checking for
882: the next expression, the directive
883: .DS
884: /* NOSTRICT */
885: .DE
886: can be used; the situation reverts to the
887: previous default after the next expression.
888: The
889: .B \-v
890: flag can be turned on for one function by the directive
891: .DS
892: /* ARGSUSED */
893: .DE
894: Complaints about a variable number of arguments in calls to a function
895: can be turned off by the directive
896: .DS
897: /* VARARGS */
898: .DE
899: preceding the function definition.
900: In some cases, it is desirable to check the
901: first several arguments, and leave the later arguments unchecked.
902: This can be done by following the VARARGS keyword immediately
903: with a digit giving the number of arguments which should be checked; thus,
904: .DS
905: /* VARARGS2 */
906: .DE
907: causes the first two arguments to be checked and leaves the others unchecked.
908: Finally, the directive
909: .DS
910: /* LINTLIBRARY */
911: .DE
912: at the head of a file identifies this file as
913: a library declaration file; this topic is worth a
914: section by itself.
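.PP
As an illustration, the following sketch shows how three of these directives might be placed in a source file.
The functions
.I checked_div ,
.I on_signal ,
and
.I sum_values
are hypothetical, and the code assumes prototype syntax and
.I <stdarg.h>
so that current compilers accept it; only the comment directives themselves come from the description above.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example: exit() never returns, but lint's flow
   analysis cannot see that, so the end of the function is marked. */
int checked_div(int num, int den)
{
    if (den != 0)
        return num / den;
    fputs("division by zero\n", stderr);
    exit(1);
    /* NOTREACHED */
}

/* Hypothetical callback whose interface is fixed elsewhere;
   neither argument is needed here. */
/* ARGSUSED */
int on_signal(int signo, void *context)
{
    return 0;
}

/* Hypothetical variadic function: scale and count are checked at
   each call, the remaining count int arguments are not. */
/* VARARGS2 */
int sum_values(int scale, int count, ...)
{
    va_list ap;
    int i, sum = 0;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        sum += va_arg(ap, int);
    va_end(ap);
    return scale * sum;
}
```

.PP
In a call such as sum_values(2, 3, 1, 2, 3), only the first two arguments would be compared against the definition.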
915: .SH
916: Library Declaration Files
917: .PP
918: .I Lint
919: accepts certain library directives, such as
920: .DS
921: \-ly
922: .DE
923: and tests the source files for compatibility with these libraries.
924: This is done by accessing library description files whose
925: names are constructed from the library directives.
926: These files all begin with the directive
927: .DS
928: /* LINTLIBRARY */
929: .DE
930: which is followed by a series of dummy function
931: definitions.
932: The critical parts of these definitions
933: are the declaration of the function return type,
934: whether the dummy function returns a value, and
935: the number and types of arguments to the function.
936: The VARARGS and ARGSUSED directives can
937: be used to specify features of the library functions.
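.PP
A library description file along these lines might look like the following sketch.
The library and its functions
.I blk_seek , (
.I blk_name ,
.I blk_close ,
and
.I blk_printf )
are invented for illustration, and the dummy definitions assume prototype syntax for the benefit of current compilers; a file of the period would have used old-style parameter declarations.

```c
/* LINTLIBRARY */

/* Hypothetical library description. The bodies are dummies: only
   the return type, whether a value is returned, and the number and
   types of the arguments are of interest. */

/* ARGSUSED */
long blk_seek(int fd, long offset) { return 0L; }

/* ARGSUSED */
char *blk_name(int fd) { return (char *)0; }

/* Returns no value. */
/* ARGSUSED */
void blk_close(int fd) { }

/* Only fd and fmt are checked at each call. */
/* VARARGS2 */
int blk_printf(int fd, const char *fmt, ...) { return 0; }
```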
938: .PP
939: .I Lint
940: library files are processed almost exactly like ordinary
941: source files.
942: The only difference is that functions which are defined in a library file,
943: but are not used in a source file, draw no complaints.
944: .I Lint
945: does not simulate a full library search algorithm,
946: and complains if the source files contain a redefinition of
947: a library routine (this is a feature!).
948: .PP
949: By default,
950: .I lint
951: checks the programs it is given against a standard library
952: file, which contains descriptions of the routines which
953: are normally loaded when
954: a C program
955: is run.
956: When the
957: .B \-p
958: flag is in effect, another file is checked containing
959: descriptions of the standard I/O library routines
960: which are expected to be portable across various machines.
961: The
962: .B \-n
963: flag can be used to suppress all library checking.
964: .SH
965: Bugs, etc.
966: .PP
967: .I Lint
968: was a difficult program to write, partially
969: because it is closely connected with matters of programming style,
970: and partially because users usually don't notice bugs which cause
971: .I lint
972: to miss errors which it should have caught.
973: (By contrast, if
974: .I lint
975: incorrectly complains about something that is correct, the
976: programmer reports that immediately!)
977: .PP
978: A number of areas remain to be further developed.
979: The checking of structures and arrays is rather inadequate;
980: size
981: incompatibilities go unchecked,
982: and no attempt is made to match up structure and union
983: declarations across files.
984: Some stricter checking of the use of
985: .B typedef
986: is clearly desirable, but what checking is appropriate, and how
987: to carry it out, is still to be determined.
988: .PP
989: .I Lint
990: shares the preprocessor with the C compiler.
991: At some point it may be appropriate for a
992: special version of the preprocessor to be constructed
993: which checks for things such as unused macro definitions,
994: macro arguments with side effects which are
995: not expanded at all, or are expanded more than once, etc.
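.PP
The hazard with macro arguments can be seen in a small sketch.
The macros SQUARE and FIRST and the function
.I next_value
are hypothetical; they show an argument with a side effect that is expanded twice in one case and not at all in the other.

```c
static int calls = 0;

/* Hypothetical function with a side effect: it counts its calls. */
static int next_value(void)
{
    return ++calls;
}

#define SQUARE(x)   ((x) * (x))   /* expands its argument twice */
#define FIRST(a, b) (a)           /* never expands its second argument */

/* Returns how many times next_value() ran inside SQUARE. */
int calls_in_square(void)
{
    calls = 0;
    (void)SQUARE(next_value());
    return calls;
}

/* Returns how many times next_value() ran inside FIRST. */
int calls_in_first(void)
{
    calls = 0;
    (void)FIRST(0, next_value());
    return calls;
}
```

.PP
Here the side effect happens twice under SQUARE and never under FIRST, although each call site mentions
.I next_value
exactly once.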
996: .PP
997: The central problem with
998: .I lint
999: is the packaging of the information which it collects.
1000: There are many options which
1001: serve only to turn off, or slightly modify,
1002: certain features.
1003: There are pressures to add even more of these options.
1004: .PP
1005: In conclusion, it appears that the general notion of having two
1006: programs is a good one.
1007: The compiler concentrates on quickly and accurately turning the
1008: program text into bits which can be run;
1009: .I lint
1010: concentrates on issues
1011: of portability, style, and efficiency.
1012: .I Lint
1013: can afford to be wrong, since incorrectness and over-conservatism
1014: are merely annoying, not fatal.
1015: The compiler can be fast since it knows that
1016: .I lint
1017: will cover its flanks.
1018: Finally, the programmer can
1019: concentrate at one stage
1020: of the programming process solely on the algorithms,
1021: data structures, and correctness of the
1022: program, and then later retrofit,
1023: with the aid of
1024: .I lint ,
1025: the desirable properties of universality and portability.
1026: .SG MH-1273-SCJ-unix
1027: .\".bp
1028: .[
1029: $LIST$
1030: .]
1031: .bp
1032: .SH
1033: Appendix: Current Lint Options
1034: .PP
1035: The command currently has the form
1036: .DS
1037: \fBlint\fR [\fB\-\fRoptions ] files... library-descriptors...
1038: .DE
1039: The options are
1040: .IP \fBh\fR
1041: Perform heuristic checks
1042: .IP \fBp\fR
1043: Perform portability checks
1044: .IP \fBv\fR
1045: Don't report unused arguments
1046: .IP \fBu\fR
1047: Don't report unused or undefined externals
1048: .IP \fBb\fR
1049: Report unreachable
1050: .B break
1051: statements
1052: .IP \fBx\fR
1053: Report unused external declarations
1054: .IP \fBa\fR
1055: Report assignments of
1056: .B long
1057: to
1058: .B int
1059: or shorter
1060: .IP \fBc\fR
1061: Complain about questionable casts
1062: .IP \fBn\fR
1063: No library checking is done
1064: .IP \fBs\fR
1065: Same as
1066: .B h
1067: (for historical reasons)
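.PP
As a rough illustration of the constructions that the
.B b
and
.B a
options are described as reporting, consider the following sketch.
The functions
.I is_vowel
and
.I low_part
are hypothetical, and prototype syntax is assumed for current compilers.

```c
/* With the b option: the break after return cannot be reached. */
int is_vowel(int c)
{
    switch (c) {
    case 'a': case 'e': case 'i': case 'o': case 'u':
        return 1;
        break;              /* unreachable break */
    default:
        return 0;
    }
}

/* With the a option: a long value is assigned to an int. */
int low_part(long big)
{
    int small;

    small = big % 1000L;    /* long assigned to int */
    return small;
}
```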