1: \input texinfo @c -*-texinfo-*-
2: @c %**start of header
3: @setfilename standards.info
4: @settitle GNU Coding Standards
5: @c %**end of header
6:
7: @ifinfo
8: @format
9: START-INFO-DIR-ENTRY
10: * Standards:: GNU Project coding standards
11: END-INFO-DIR-ENTRY
12: @end format
13: @end ifinfo
14:
15: @setchapternewpage off
16:
17: @ifinfo
18: Copyright (C) 1992, 1993 Free Software Foundation
19: Permission is granted to make and distribute verbatim copies of
20: this manual provided the copyright notice and this permission notice
21: are preserved on all copies.
22:
23: @ignore
24: Permission is granted to process this file through TeX and print the
25: results, provided the printed document carries copying permission
26: notice identical to this one except for the removal of this paragraph
27: (this paragraph not being relevant to the printed manual).
28: @end ignore
29:
30: Permission is granted to copy and distribute modified versions of this
31: manual under the conditions for verbatim copying, provided that the entire
32: resulting derived work is distributed under the terms of a permission
33: notice identical to this one.
34:
35: Permission is granted to copy and distribute translations of this manual
36: into another language, under the above conditions for modified versions,
37: except that this permission notice may be stated in a translation approved
38: by the Free Software Foundation.
39: @end ifinfo
40:
41: @titlepage
42: @sp 10
43: @titlefont{GNU Coding Standards}
44: @author{Richard Stallman}
45: @author{last updated 03 Feb 1993}
46: @c Note date also appears below.
47: @page
48:
49: @vskip 0pt plus 1filll
50: Copyright @copyright{} 1992, 1993 Free Software Foundation
51:
52: Permission is granted to make and distribute verbatim copies of
53: this manual provided the copyright notice and this permission notice
54: are preserved on all copies.
55:
56: Permission is granted to copy and distribute modified versions of this
57: manual under the conditions for verbatim copying, provided that the entire
58: resulting derived work is distributed under the terms of a permission
59: notice identical to this one.
60:
61: Permission is granted to copy and distribute translations of this manual
62: into another language, under the above conditions for modified versions,
63: except that this permission notice may be stated in a translation approved
64: by the Free Software Foundation.
65: @end titlepage
66:
67: @ifinfo
68: @node Top, Reading Non-Free Code, (dir), (dir)
69: @top Version
70:
71: Last updated 03 Feb 1993.
72: @c Note date also appears above.
73: @end ifinfo
74:
75: @menu
76: * Reading Non-Free Code:: Referring to Proprietary Programs
77: * Contributions:: Accepting Contributions
78: * Change Logs:: Recording Changes
79: * Compatibility:: Compatibility with Other Implementations
80: * Makefile Conventions:: Makefile Conventions
81: * Configuration:: How Configuration Should Work
82: * Source Language:: Using Languages Other Than C
83: * Formatting:: Formatting Your Source Code
84: * Comments:: Commenting Your Work
85: * Syntactic Conventions:: Clean Use of C Constructs
86: * Names:: Naming Variables and Functions
87: * Using Extensions:: Using Non-standard Features
88: * Semantics:: Program Behaviour for All Programs
89: * Errors:: Formatting Error Messages
90: * Libraries:: Library Behaviour
91: * Portability:: Portability As It Applies to GNU
92: * User Interfaces:: Standards for Command Line Interfaces
93: * Documentation:: Documenting Programs
94: * Releases:: Making Releases
95: @end menu
96:
97: @node Reading Non-Free Code
98: @chapter Referring to Proprietary Programs
99:
100: Don't in any circumstances refer to Unix source code for or during
101: your work on GNU! (Or to any other proprietary programs.)
102:
103: If you have a vague recollection of the internals of a Unix program,
104: this does not absolutely mean you can't write an imitation of it, but
105: do try to organize the imitation internally along different lines,
106: because this is likely to make the details of the Unix version
107: irrelevant and dissimilar to your results.
108:
109: For example, Unix utilities were generally optimized to minimize
110: memory use; if you go for speed instead, your program will be very
111: different. You could keep the entire input file in core and scan it
112: there instead of using stdio. Use a smarter algorithm discovered more
113: recently than the Unix program. Eliminate use of temporary files. Do
114: it in one pass instead of two (we did this in the assembler).
115:
116: Or, on the contrary, emphasize simplicity instead of speed. For some
117: applications, the speed of today's computers makes simpler algorithms
118: adequate.
119:
120: Or go for generality. For example, Unix programs often have static
121: tables or fixed-size strings, which make for arbitrary limits; use
122: dynamic allocation instead. Make sure your program handles NULs and
123: other funny characters in the input files. Add a programming language
124: for extensibility and write part of the program in that language.
125:
126: Or turn some parts of the program into independently usable libraries.
127: Or use a simple garbage collector instead of tracking precisely when
128: to free memory, or use a new GNU facility such as obstacks.
129:
130:
131: @node Contributions
132: @chapter Accepting Contributions
133:
134: If someone else sends you a piece of code to add to the program you are
135: working on, we need legal papers to use it---the same sort of legal
136: papers we will need to get from you. @emph{Each} significant
137: contributor to a program must sign some sort of legal papers in order
138: for us to have clear title to the program. The main author alone is not
139: enough.
140:
141: So, before adding in any contributions from other people, tell us
142: so we can arrange to get the papers. Then wait until we tell you
143: that we have received the signed papers, before you actually use the
144: contribution.
145:
146: This applies both before you release the program and afterward. If
147: you receive diffs to fix a bug, and they make a significant change, we
148: need legal papers for it.
149:
150: You don't need papers for changes of a few lines here or there, since
151: they are not significant for copyright purposes. Also, you don't need
152: papers if all you get from the suggestion is some ideas, not actual code
153: which you use. For example, if you write a different solution to the
154: problem, you don't need to get papers.
155:
156: I know this is frustrating; it's frustrating for us as well. But if
157: you don't wait, you are going out on a limb---for example, what if the
158: contributor's employer won't sign a disclaimer? You might have to take
159: that code out again!
160:
161: The very worst thing is if you forget to tell us about the other
162: contributor. We could be very embarrassed in court some day as a
163: result.
164:
165: @node Change Logs
166: @chapter Change Logs
167:
168: Keep a change log for each directory, describing the changes made to
169: source files in that directory. The purpose of this is so that people
170: investigating bugs in the future will know about the changes that
171: might have introduced the bug. Often a new bug can be found by
172: looking at what was recently changed. More importantly, change logs
173: can help eliminate conceptual inconsistencies between different parts
174: of a program; they can give you a history of how the conflicting
175: concepts arose.
176:
177: Use the Emacs command @kbd{M-x add-change-log-entry} to start a new entry in the
178: change log. An entry should have an asterisk, the name of the changed
179: file, and then in parentheses the names of the changed functions,
180: variables or whatever, followed by a colon. Then describe the changes
181: you made to that function or variable.
182:
183: Separate unrelated entries with blank lines. When two entries
184: represent parts of the same change, so that they work together, then
185: don't put blank lines between them. Then you can omit the file name
186: and the asterisk when successive entries are in the same file.
187:
188: Here are some examples:
189:
190: @example
191: * register.el (insert-register): Return nil.
192: (jump-to-register): Likewise.
193:
194: * sort.el (sort-subr): Return nil.
195:
196: * tex-mode.el (tex-bibtex-file, tex-file, tex-region):
197: Restart the tex shell if process is gone or stopped.
198: (tex-shell-running): New function.
199:
200: * expr.c (store_one_arg): Round size up for move_block_to_reg.
201: (expand_call): Round up when emitting USE insns.
202: * stmt.c (assign_parms): Round size up for move_block_from_reg.
203: @end example
204:
205: There's no need to describe here the full purpose of the changes or how
206: they work together. It is better to put this explanation in comments in
207: the code. That's why just ``New function'' is enough; there is a
208: comment with the function in the source to explain what it does.
209:
210: However, sometimes it is useful to write one line to describe the
211: overall purpose of a large batch of changes.
212:
213: You can think of the change log as a conceptual ``undo list'' which
214: explains how earlier versions were different from the current version.
215: People can see the current version; they don't need the change log
216: to tell them what is in it. What they want from a change log is a
217: clear explanation of how the earlier version differed.
218:
219: When you change the calling sequence of a function in a simple
220: fashion, and you change all the callers of the function, there is no
221: need to make individual entries for all the callers. Just write in
222: the entry for the function being called, ``All callers changed.''
223:
224: When you change just comments or doc strings, it is enough to write an
225: entry for the file, without mentioning the functions. Write just,
226: ``Doc fix.'' There's no need to keep a change log for documentation
227: files. This is because documentation is not susceptible to bugs that
228: are hard to fix. Documentation does not consist of parts that must
229: interact in a precisely engineered fashion; to correct an error, you
230: need not know the history of the erroneous passage.
231:
232:
233: @node Compatibility
234: @chapter Compatibility with Other Implementations
235:
236: With certain exceptions, utility programs and libraries for GNU should
237: be upward compatible with those in Berkeley Unix, and upward compatible
238: with @sc{ANSI} C if @sc{ANSI} C specifies their behavior, and upward
239: compatible with @sc{POSIX} if @sc{POSIX} specifies their behavior.
240:
241: When these standards conflict, it is useful to offer compatibility
242: modes for each of them.
243:
244: @sc{ANSI} C and @sc{POSIX} prohibit many kinds of extensions. Feel
245: free to make the extensions anyway, and include a @samp{--ansi} or
246: @samp{--compatible} option to turn them off. However, if the extension
247: has a significant chance of breaking any real programs or scripts,
248: then it is not really upward compatible. Try to redesign its
249: interface.
250:
251: When a feature is used only by users (not by programs or command
252: files), and it is done poorly in Unix, feel free to replace it
253: completely with something totally different and better. (For example,
254: vi is replaced with Emacs.) But it is nice to offer a compatible
255: feature as well. (There is a free vi clone, so we offer it.)
256:
257: Additional useful features not in Berkeley Unix are welcome.
258: Additional programs with no counterpart in Unix may be useful,
259: but our first priority is usually to duplicate what Unix already
260: has.
261:
262: @comment The makefile standards are in a separate file that is also
263: @comment included by make.texinfo. Done by [email protected] on 1/6/93.
264: @include make-stds.texi
265:
266: @node Configuration
267: @chapter How Configuration Should Work
268:
269: Each GNU distribution should come with a shell script named
270: @code{configure}. This script is given arguments which describe the
271: kind of machine and system you want to compile the program for.
272:
273: The @code{configure} script must record the configuration options so
274: that they affect compilation.
275:
276: One way to do this is to make a link from a standard name such as
277: @file{config.h} to the proper configuration file for the chosen system.
278: If you use this technique, the distribution should @emph{not} contain a
279: file named @file{config.h}. This is so that people won't be able to
280: build the program without configuring it first.
281:
282: Another thing that @code{configure} can do is to edit the Makefile. If
283: you do this, the distribution should @emph{not} contain a file named
284: @file{Makefile}. Instead, include a file @file{Makefile.in} which
285: contains the input used for editing. Once again, this is so that people
286: won't be able to build the program without configuring it first.
287:
288: If @code{configure} does write the @file{Makefile}, then @file{Makefile}
289: should have a target named @file{Makefile} which causes @code{configure}
290: to be rerun, setting up the same configuration that was set up last
291: time. The files that @code{configure} reads should be listed as
292: dependencies of @file{Makefile}.
293:
294: All the files which are output from the @code{configure} script should
295: have comments at the beginning explaining that they were generated
296: automatically using @code{configure}. This is so that users won't think
297: of trying to edit them by hand.
298:
299: The @code{configure} script should write a file named @file{config.status}
300: which describes which configuration options were specified when the
301: program was last configured. This file should be a shell script which,
302: if run, will recreate the same configuration.
303:
304: The @code{configure} script should accept an option of the form
305: @samp{--srcdir=@var{dirname}} to specify the directory where sources are found
306: (if it is not the current directory). This makes it possible to build
307: the program in a separate directory, so that the actual source directory
308: is not modified.
309:
310: If the user does not specify @samp{--srcdir}, then @code{configure} should
311: check both @file{.} and @file{..} to see if it can find the sources. If
312: it finds the sources in one of these places, it should use them from
313: there. Otherwise, it should report that it cannot find the sources, and
314: should exit with nonzero status.
315:
316: Usually the easy way to support @samp{--srcdir} is by editing a
317: definition of @code{VPATH} into the Makefile. Some rules may need to
318: refer explicitly to the specified source directory. To make this
319: possible, @code{configure} can add to the Makefile a variable named
320: @code{srcdir} whose value is precisely the specified directory.
321:
322: The @code{configure} script should also take an argument which specifies the
323: type of system to build the program for. This argument should look like
324: this:
325:
326: @example
327: @var{cpu}-@var{company}-@var{system}
328: @end example
329:
330: For example, a Sun 3 might be @samp{m68k-sun-sunos4.1}.
331:
332: The @code{configure} script needs to be able to decode all plausible
333: alternatives for how to describe a machine. Thus, @samp{sun3-sunos4.1}
334: would be a valid alias. So would @samp{sun3-bsd4.2}, since SunOS is
335: basically @sc{BSD} and no other @sc{BSD} system is used on a Sun. For many
336: programs, @samp{vax-dec-ultrix} would be an alias for
337: @samp{vax-dec-bsd}, simply because the differences between Ultrix and
338: @sc{BSD} are rarely noticeable, but a few programs might need to distinguish
339: them.
340:
341: There is a shell script called @file{config.sub} that you can use
342: as a subroutine to validate system types and canonicalize aliases.
343:
344: Other options are permitted to specify in more detail the software
345: or hardware present on the machine:
346:
347: @table @samp
348: @item --with-@var{package}
349: The package @var{package} will be installed, so configure this package
350: to work with @var{package}.
351:
352: Possible values of @var{package} include @samp{x}, @samp{gnu-as} (or
353: @samp{gas}), @samp{gnu-ld}, @samp{gnu-libc}, and @samp{gdb}.
354:
355: @item --nfp
356: The target machine has no floating point processor.
357:
358: @item --gas
359: The target machine assembler is GAS, the GNU assembler.
360: This is obsolete; use @samp{--with-gnu-as} instead.
361:
362: @item --x
363: The target machine has the X Window System installed.
364: This is obsolete; use @samp{--with-x} instead.
365: @end table
366:
367: All @code{configure} scripts should accept all of these ``detail''
368: options, whether or not they make any difference to the particular
369: package at hand. In particular, they should accept any option that
370: starts with @samp{--with-}. This is so users will be able to configure
371: an entire GNU source tree at once with a single set of options.
372:
373: Packages that perform part of compilation may support cross-compilation.
374: In such a case, the host and target machines for the program may be
375: different. The @code{configure} script should normally treat the
376: specified type of system as both the host and the target, thus producing
377: a program which works for the same type of machine that it runs on.
378:
379: The way to build a cross-compiler, cross-assembler, or what have you, is
380: to specify the option @samp{--host=@var{hosttype}} when running
381: @code{configure}. This specifies the host system without changing the
382: type of target system. The syntax for @var{hosttype} is the same as
383: described above.
384:
385: Programs for which cross-operation is not meaningful need not accept the
386: @samp{--host} option, because configuring an entire operating system for
387: cross-operation is not a meaningful thing.
388:
389: Some programs have ways of configuring themselves automatically. If
390: your program is set up to do this, your @code{configure} script can simply
391: ignore most of its arguments.
392:
393:
394: @node Source Language
395: @chapter Using Languages Other Than C
396:
397: Using a language other than C is like using a non-standard feature: it
398: will cause trouble for users. Even if GCC supports the other language,
399: users may find it inconvenient to have to install the compiler for that
400: other language in order to build your program. So please write in C.
401:
402: There are three exceptions to this rule:
403:
404: @itemize @bullet
405: @item
406: It is okay to use a special language if the same program contains an
407: interpreter for that language.
408:
409: Thus, it is not a problem that GNU Emacs contains code written in Emacs
410: Lisp, because it comes with a Lisp interpreter.
411:
412: @item
413: It is okay to use another language in a tool specifically intended for
414: use with that language.
415:
416: This is okay because the only people who want to build the tool will be
417: those who have installed the other language anyway.
418:
419: @item
420: If an application is not of extremely widespread interest, then perhaps
421: it's not important if the application is inconvenient to install.
422: @end itemize
423:
424: @node Formatting
425: @chapter Formatting Your Source Code
426:
427: It is important to put the open-brace that starts the body of a C
428: function in column zero, and avoid putting any other open-brace or
429: open-parenthesis or open-bracket in column zero. Several tools look
430: for open-braces in column zero to find the beginnings of C functions.
431: These tools will not work on code not formatted that way.
432:
433: It is also important for function definitions to start the name of the
434: function in column zero. This helps people to search for function
435: definitions, and may also help certain tools recognize them. Thus,
436: the proper format is this:
437:
438: @example
439: static char *
440: concat (s1, s2)        /* Name starts in column zero here */
441:      char *s1, *s2;
442: @{                     /* Open brace in column zero here */
443:   @dots{}
444: @}
445: @end example
446:
447: @noindent
448: or, if you want to use @sc{ANSI} C, format the definition like this:
449:
450: @example
451: static char *
452: concat (char *s1, char *s2)
453: @{
454:   @dots{}
455: @}
456: @end example
457:
458: In @sc{ANSI} C, if the arguments don't fit nicely on one line,
459: split it like this:
460:
461: @example
462: int
463: lots_of_args (int an_integer, long a_long, short a_short,
464:               double a_double, float a_float)
465: @dots{}
466: @end example
467:
468: For the body of the function, we prefer code formatted like this:
469:
470: @example
471: if (x < foo (y, z))
472:   haha = bar[4] + 5;
473: else
474:   @{
475:     while (z)
476:       @{
477:         haha += foo (z, z);
478:         z--;
479:       @}
480:     return ++x + bar ();
481:   @}
482: @end example
483:
484: We find it easier to read a program when it has spaces before the
485: open-parentheses and after the commas. Especially after the commas.
486:
487: When you split an expression into multiple lines, split it
488: before an operator, not after one. Here is the right way:
489:
490: @example
491: if (foo_this_is_long && bar > win (x, y, z)
492:     && remaining_condition)
493: @end example
494:
495: Try to avoid having two operators of different precedence at the same
496: level of indentation. For example, don't write this:
497:
498: @example
499: mode = (inmode[j] == VOIDmode
500:         || GET_MODE_SIZE (outmode[j]) > GET_MODE_SIZE (inmode[j])
501:         ? outmode[j] : inmode[j]);
502: @end example
503:
504: Instead, use extra parentheses so that the indentation shows the nesting:
505:
506: @example
507: mode = ((inmode[j] == VOIDmode
508:          || (GET_MODE_SIZE (outmode[j]) > GET_MODE_SIZE (inmode[j])))
509:         ? outmode[j] : inmode[j]);
510: @end example
511:
512: Insert extra parentheses so that Emacs will indent the code properly.
513: For example, the following indentation looks nice if you do it by hand,
514: but Emacs would mess it up:
515:
516: @example
517: v = rup->ru_utime.tv_sec*1000 + rup->ru_utime.tv_usec/1000
518:     + rup->ru_stime.tv_sec*1000 + rup->ru_stime.tv_usec/1000;
519: @end example
520:
521: But adding a set of parentheses solves the problem:
522:
523: @example
524: v = (rup->ru_utime.tv_sec*1000 + rup->ru_utime.tv_usec/1000
525:      + rup->ru_stime.tv_sec*1000 + rup->ru_stime.tv_usec/1000);
526: @end example
527:
528: Format do-while statements like this:
529:
530: @example
531: do
532:   @{
533:     a = foo (a);
534:   @}
535: while (a > 0);
536: @end example
537:
538: Please use formfeed characters (control-L) to divide the program into
539: pages at logical places (but not within a function). It does not matter
540: just how long the pages are, since they do not have to fit on a printed
541: page. The formfeeds should appear alone on lines by themselves.
542:
543:
544: @node Comments
545: @chapter Commenting Your Work
546:
547: Every program should start with a comment saying briefly what it is for.
548: Example: @samp{fmt - filter for simple filling of text}.
549:
550: Please put a comment on each function saying what the function does,
551: what sorts of arguments it gets, and what the possible values of
552: arguments mean and are used for. It is not necessary to duplicate in
553: words the meaning of the C argument declarations, if a C type is being
554: used in its customary fashion. If there is anything nonstandard about
555: its use (such as an argument of type @code{char *} which is really the
556: address of the second character of a string, not the first), or any
557: possible values that would not work the way one would expect (such as,
558: that strings containing newlines are not guaranteed to work), be sure
559: to say so.
560:
561: Also explain the significance of the return value, if there is one.
562:
563: Please put two spaces after the end of a sentence in your comments, so
564: that the Emacs sentence commands will work. Also, please write
565: complete sentences and capitalize the first word. If a lower-case
566: identifier comes at the beginning of a sentence, don't capitalize it!
567: Changing the spelling makes it a different identifier. If you don't
568: like starting a sentence with a lower case letter, write the sentence
569: differently (e.g. ``The identifier lower-case is @dots{}'').
570:
571: The comment on a function is much clearer if you use the argument
572: names to speak about the argument values. The variable name itself
573: should be lower case, but write it in upper case when you are speaking
574: about the value rather than the variable itself. Thus, ``the inode
575: number @var{node_num}'' rather than ``an inode''.
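
As an illustration of the commenting advice above, here is a minimal sketch of a function comment. The function count_char and its argument names are hypothetical, invented for this example. The comment refers to argument values in upper case, states the one nonstandard point about the arguments (the buffer is length-counted, not NUL-terminated), and explains the return value.

```c
#include <stddef.h>

/* Return the number of times the character C occurs in the first
   LENGTH bytes of BUFFER.  BUFFER need not be NUL-terminated and
   may contain NUL bytes; only LENGTH determines where it ends.  */

size_t
count_char (const char *buffer, size_t length, int c)
{
  size_t count = 0;
  size_t i;

  for (i = 0; i < length; i++)
    if (buffer[i] == (char) c)
      count++;
  return count;
}
```

The comment does not repeat the function name or restate the C types, since the reader can see both in the definition.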
576:
577: There is usually no purpose in restating the name of the function in
578: the comment before it, because the reader can see that for himself.
579: There might be an exception when the comment is so long that the function
580: itself would be off the bottom of the screen.
581:
582: There should be a comment on each static variable as well, like this:
583:
584: @example
585: /* Nonzero means truncate lines in the display;
586:    zero means continue them.  */
587:
588: int truncate_lines;
589: @end example
590:
591: Every @samp{#endif} should have a comment, except in the case of short
592: conditionals (just a few lines) that are not nested. The comment should
593: state the condition of the conditional that is ending, @emph{including
594: its sense}. @samp{#else} should have a comment describing the condition
595: @emph{and sense} of the code that follows. For example:
596:
597: @example
598: #ifdef foo
599: @dots{}
600: #else /* not foo */
601: @dots{}
602: #endif /* not foo */
603: @end example
604:
605: @noindent
606: but, by contrast, write the comments this way for a @samp{#ifndef}:
607:
608: @example
609: #ifndef foo
610: @dots{}
611: #else /* foo */
612: @dots{}
613: #endif /* foo */
614: @end example
615:
616:
617: @node Syntactic Conventions
618: @chapter Clean Use of C Constructs
619:
620: Please explicitly declare all arguments to functions.
621: Don't omit them just because they are ints.
622:
623: Declarations of external functions and functions to appear later
624: in the source file should all go in one place near the beginning of
625: the file (somewhere before the first function definition in the file),
626: or else should go in a header file. Don't put extern declarations
627: inside functions.
628:
629: It used to be common practice to use the same local variables (with
630: names like @code{tem}) over and over for different values within one
631: function. Instead of doing this, it is better to declare a separate local
632: variable for each distinct purpose, and give it a name which is
633: meaningful. This not only makes programs easier to understand, it also
634: facilitates optimization by good compilers. You can also move the
635: declaration of each local variable into the smallest scope that includes
636: all its uses. This makes the program even cleaner.
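
A small sketch of this advice follows. The function sum_magnitudes and its variable names are hypothetical. Each local variable has one purpose and a meaningful name, and the variable used only inside the loop is declared inside the loop body.

```c
#include <stddef.h>

/* Return the sum of the absolute values of the LENGTH integers
   in VALUES.  */

long
sum_magnitudes (const int *values, size_t length)
{
  long total = 0;
  size_t i;

  for (i = 0; i < length; i++)
    {
      /* Declared in the smallest block that contains all its uses,
         and named for what it holds, instead of reusing a general
         purpose temporary declared at the top of the function.  */
      long magnitude = values[i];

      if (magnitude < 0)
        magnitude = -magnitude;
      total += magnitude;
    }
  return total;
}
```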
637:
638: Don't use local variables or parameters that shadow global identifiers.
639:
640: Don't declare multiple variables in one declaration that spans lines.
641: Start a new declaration on each line, instead. For example, instead
642: of this:
643:
644: @example
645: int    foo,
646:        bar;
647: @end example
648:
649: @noindent
650: write either this:
651:
652: @example
653: int foo, bar;
654: @end example
655:
656: @noindent
657: or this:
658:
659: @example
660: int foo;
661: int bar;
662: @end example
663:
664: @noindent
665: (If they are global variables, each should have a comment preceding it
666: anyway.)
667:
668: When you have an if-else statement nested in another if statement,
669: always put braces around the if-else. Thus, never write like this:
670:
671: @example
672: if (foo)
673:   if (bar)
674:     win ();
675:   else
676:     lose ();
677: @end example
678:
679: @noindent
680: always like this:
681:
682: @example
683: if (foo)
684:   @{
685:     if (bar)
686:       win ();
687:     else
688:       lose ();
689:   @}
690: @end example
691:
692: If you have an if statement nested inside of an else statement,
693: either write @code{else if} on one line, like this,
694:
695: @example
696: if (foo)
697:   @dots{}
698: else if (bar)
699:   @dots{}
700: @end example
701:
702: @noindent
703: with its then-part indented like the preceding then-part, or write the
704: nested if within braces like this:
705:
706: @example
707: if (foo)
708:   @dots{}
709: else
710:   @{
711:     if (bar)
712:       @dots{}
713:   @}
714: @end example
715:
716: Don't declare both a structure tag and variables or typedefs in the
717: same declaration. Instead, declare the structure tag separately
718: and then use it to declare the variables or typedefs.
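
For instance, the rule might be followed as in this sketch. The names point, screen_point and origin are hypothetical. The tag is declared by itself, and the typedef and the variable are then declared separately, instead of writing something like "typedef struct point @{ @dots{} @} screen_point;" in a single declaration.

```c
/* A position on the screen, in character cells.  */
struct point
{
  int x;
  int y;
};

/* A typedef declared separately, using the tag.  */
typedef struct point screen_point;

/* The upper left corner of the screen.  As a global with no
   initializer, it starts out as all zeros.  */
struct point origin;
```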
719:
720: Try to avoid assignments inside if-conditions. For example, don't
721: write this:
722:
723: @example
724: if ((foo = (char *) malloc (sizeof *foo)) == 0)
725:   fatal ("virtual memory exhausted");
726: @end example
727:
728: @noindent
729: instead, write this:
730:
731: @example
732: foo = (char *) malloc (sizeof *foo);
733: if (foo == 0)
734:   fatal ("virtual memory exhausted");
735: @end example
736:
737: Don't make the program ugly to placate lint. Please don't insert any
738: casts to void. Zero without a cast is perfectly fine as a null
739: pointer constant.
740:
741:
742: @node Names
743: @chapter Naming Variables and Functions
744:
745: Please use underscores to separate words in a name, so that the Emacs
746: word commands can be useful within them. Stick to lower case; reserve
747: upper case for macros and enum constants, and for name-prefixes that
748: follow a uniform convention.
749:
750: For example, you should use names like @code{ignore_space_change_flag};
751: don't use names like @code{iCantReadThis}.
752:
753: Variables that indicate whether command-line options have been
754: specified should be named after the meaning of the option, not after
755: the option-letter. A comment should state both the exact meaning of
756: the option and its letter. For example,
757:
758: @example
759: /* Ignore changes in horizontal whitespace (-b). */
760: int ignore_space_change_flag;
761: @end example
762:
763: When you want to define names with constant integer values, use
764: @code{enum} rather than @samp{#define}. GDB knows about enumeration
765: constants.
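
A brief sketch of the enum approach is shown here. The type space_mode, its constants and the function space_mode_name are hypothetical names chosen for this example. Because the values are enumeration constants and not macros, a debugger can show them by name.

```c
/* How differences in whitespace are treated when comparing lines.  */
enum space_mode
{
  SPACE_EXACT,
  SPACE_IGNORE_CHANGE,
  SPACE_IGNORE_ALL
};

/* Return a printable name for MODE.  */

const char *
space_mode_name (enum space_mode mode)
{
  switch (mode)
    {
    case SPACE_EXACT:
      return "exact";
    case SPACE_IGNORE_CHANGE:
      return "ignore-change";
    case SPACE_IGNORE_ALL:
      return "ignore-all";
    }
  return "unknown";
}
```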
766:
767: Use file names of 14 characters or less, to avoid creating gratuitous
768: problems on System V.
769:
770:
771: @node Using Extensions
772: @chapter Using Non-standard Features
773:
774: Many GNU facilities that already exist support a number of convenient
775: extensions over the comparable Unix facilities. Whether to use these
776: extensions in implementing your program is a difficult question.
777:
778: On the one hand, using the extensions can make a cleaner program.
779: On the other hand, people will not be able to build the program
780: unless the other GNU tools are available. This might cause the
781: program to work on fewer kinds of machines.
782:
783: With some extensions, it might be easy to provide both alternatives.
784: For example, you can define functions with a ``keyword'' @code{INLINE}
785: and define that as a macro to expand into either @code{inline} or
786: nothing, depending on the compiler.
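
One possible way to write such a macro is sketched below. It assumes the compiler is detected through the predefined macro __GNUC__, and it uses GCC's alternate keyword __inline__ in place of plain inline so that it also works when GCC is run in strict ANSI mode. The function square is a hypothetical example.

```c
/* Expand INLINE into an inline keyword when the compiler is known
   to have one, and into nothing otherwise.  */
#ifdef __GNUC__
#define INLINE __inline__
#else
#define INLINE
#endif

/* Return the square of X.  */

static INLINE int
square (int x)
{
  return x * x;
}
```

With another compiler the function is simply an ordinary static function, so the program still builds, only without the optimization.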
787:
788: In general, perhaps it is best not to use the extensions if you can
789: straightforwardly do without them, but to use the extensions if they
790: are a big improvement.
791:
792: An exception to this rule is for the large, established programs (such as
793: Emacs) which run on a great variety of systems. Such programs would
794: be broken by use of GNU extensions.
795:
796: Another exception is for programs that are used as part of
797: compilation: anything that must be compiled with other compilers in
798: order to bootstrap the GNU compilation facilities. If these require
799: the GNU compiler, then no one can compile them without having them
800: installed already. That would be no good.
801:
802: Since most computer systems do not yet implement @sc{ANSI} C, using the
803: @sc{ANSI} C features is effectively using a GNU extension, so the
804: same considerations apply. (Except for @sc{ANSI} features that we
805: discourage, such as trigraphs---don't ever use them.)
806:
807: @node Semantics
808: @chapter Program Behaviour for All Programs
809:
810: Avoid arbitrary limits on the length or number of @emph{any} data
811: structure, including filenames, lines, files, and symbols, by allocating
812: all data structures dynamically. In most Unix utilities, ``long lines
813: are silently truncated''. This is not acceptable in a GNU utility.
814:
815: Utilities reading files should not drop NUL characters, or any other
816: nonprinting characters @emph{including those with codes above 0177}. The
817: only sensible exceptions would be utilities specifically intended for
818: interface to certain types of printers that can't handle those characters.
819:
820: Check every system call for an error return, unless you know you wish to
821: ignore errors. Include the system error text (from @code{perror} or
822: equivalent) in @emph{every} error message resulting from a failing
823: system call, as well as the name of the file if any and the name of the
824: utility. Just ``cannot open foo.c'' or ``stat failed'' is not
825: sufficient.
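
The following sketch shows one way to build such a message, assuming the standard @code{strerror} function supplies the system error text. The names @code{format_system_error} and @code{open_or_report} are hypothetical helpers, not part of any GNU library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Store "PROGRAM: FILE: system error text" in BUF, which has room
   for SIZE bytes.  ERRNUM is an errno value.  */
static void
format_system_error (char *buf, size_t size, const char *program,
                     const char *file, int errnum)
{
  snprintf (buf, size, "%s: %s: %s", program, file, strerror (errnum));
}

/* Open FILE for reading.  On failure, report the utility name, the
   file name and the system error text, and return NULL.  */
static FILE *
open_or_report (const char *program, const char *file)
{
  FILE *fp = fopen (file, "r");

  if (fp == NULL)
    {
      char message[512];

      format_system_error (message, sizeof message, program, file, errno);
      fprintf (stderr, "%s\n", message);
    }
  return fp;
}
```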
826:
827: Check every call to @code{malloc} or @code{realloc} to see if it
828: returned zero. Check @code{realloc} even if you are making the block
829: smaller; in a system that rounds block sizes to a power of 2,
830: @code{realloc} may get a different block if you ask for less space.
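
A common way to make these checks hard to forget is to route every allocation through small wrappers. The sketch below assumes that printing a message and exiting is an acceptable response to exhaustion; the names @code{xmalloc} and @code{xrealloc} and the message text are illustrative.

```c
#include <stdio.h>
#include <stdlib.h>

/* Report memory exhaustion and give up.  */
static void
memory_exhausted (void)
{
  fprintf (stderr, "virtual memory exhausted\n");
  exit (1);
}

/* Like malloc, but never return zero.  */
static void *
xmalloc (size_t size)
{
  void *block = malloc (size ? size : 1);

  if (block == NULL)
    memory_exhausted ();
  return block;
}

/* Like realloc, but never return zero.  The result is checked even
   when the block is being made smaller, and the caller must use the
   returned pointer because the block may have moved.  */
static void *
xrealloc (void *old, size_t size)
{
  void *block = realloc (old, size ? size : 1);

  if (block == NULL)
    memory_exhausted ();
  return block;
}
```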
831:
832: In Unix, @code{realloc} can destroy the storage block if it returns
833: zero. GNU @code{realloc} does not have this bug: if it fails, the
834: original block is unchanged. Feel free to assume the bug is fixed. If
835: you wish to run your program on Unix, and wish to avoid lossage in this
836: case, you can use the GNU @code{malloc}.
837:
838: You must expect @code{free} to alter the contents of the block that was
839: freed. Anything you want to fetch from the block, you must fetch before
840: calling @code{free}.
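
Freeing a linked list is the usual place where this matters. In the sketch below, @code{struct node} and @code{free_list} are hypothetical; the point is only that the link is fetched before the node is freed.

```c
#include <stdlib.h>

struct node
{
  struct node *next;
  int value;
};

/* Free every node in the list starting at HEAD and return the number
   of nodes freed.  */
static int
free_list (struct node *head)
{
  int count = 0;

  while (head != NULL)
    {
      /* Fetch the link first; after free, head->next may be garbage.  */
      struct node *next = head->next;

      free (head);
      head = next;
      count++;
    }
  return count;
}
```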
841:
842: Use @code{getopt_long} to decode arguments, unless the argument syntax
843: makes this unreasonable.
844:
845: When static storage is to be written in during program execution, use
846: explicit C code to initialize it. Reserve C initialized declarations
847: for data that will not be changed.
848:
849: Try to avoid low-level interfaces to obscure Unix data structures (such
850: as file directories, utmp, or the layout of kernel memory), since these
851: are less likely to work compatibly. If you need to find all the files
852: in a directory, use @code{readdir} or some other high-level interface.
853: These will be supported compatibly by GNU.
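
For instance, a directory can be scanned through the @sc{POSIX} @code{opendir}, @code{readdir} and @code{closedir} functions without knowing how directories are stored. The function name @code{count_entries} below is a hypothetical example.

```c
#include <dirent.h>
#include <string.h>

/* Return the number of entries in directory DIRNAME, not counting
   "." and "..", or -1 if the directory cannot be opened.  */
static long
count_entries (const char *dirname)
{
  DIR *dir = opendir (dirname);
  struct dirent *entry;
  long count = 0;

  if (dir == NULL)
    return -1;

  while ((entry = readdir (dir)) != NULL)
    {
      if (strcmp (entry->d_name, ".") == 0
          || strcmp (entry->d_name, "..") == 0)
        continue;
      count++;
    }

  closedir (dir);
  return count;
}
```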
854:
855: By default, the GNU system will provide the signal handling functions of
856: @sc{BSD} and of @sc{POSIX}. So GNU software should be written to use
857: these.
858:
859: In error checks that detect ``impossible'' conditions, just abort.
860: There is usually no point in printing any message. These checks
861: indicate the existence of bugs. Whoever wants to fix the bugs will have
862: to read the source code and run a debugger. So explain the problem with
863: comments in the source. The relevant data will be in variables, which
864: are easy to examine with the debugger, so there is no point moving them
865: elsewhere.
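
A small sketch of this practice follows. The enumeration @code{state} and the function @code{state_name} are hypothetical; what matters is that the impossible case is explained in a comment and handled by calling @code{abort}, with no message.

```c
#include <stdlib.h>

enum state { STATE_IDLE, STATE_RUNNING, STATE_DONE };

/* Return a printable name for state S.  */
static const char *
state_name (enum state s)
{
  switch (s)
    {
    case STATE_IDLE:
      return "idle";
    case STATE_RUNNING:
      return "running";
    case STATE_DONE:
      return "done";
    }

  /* Every enumerator is handled above, so getting here means some
     caller passed a corrupt value.  That is a bug; S is still in a
     variable where the debugger can examine it.  */
  abort ();
}
```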
866:
867:
868: @node Errors
869: @chapter Formatting Error Messages
870:
871: Error messages from compilers should look like this:
872:
873: @example
874: @var{source-file-name}:@var{lineno}: @var{message}
875: @end example
876:
877: Error messages from other noninteractive programs should look like this:
878:
879: @example
880: @var{program}:@var{source-file-name}:@var{lineno}: @var{message}
881: @end example
882:
883: @noindent
884: when there is an appropriate source file, or like this:
885:
886: @example
887: @var{program}: @var{message}
888: @end example
889:
890: @noindent
891: when there is no relevant source file.
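
The three layouts above can be produced by one small routine, sketched here. The name @code{format_error} and its calling convention (a null pointer meaning ``not applicable'') are assumptions made for the example.

```c
#include <stdio.h>

/* Store an error message in BUF, which has room for SIZE bytes.
   PROGRAM is NULL for a compiler-style message; FILE is NULL when
   there is no relevant source file, and LINENO is then ignored.
   At least one of PROGRAM and FILE must be non-null.  */
static void
format_error (char *buf, size_t size, const char *program,
              const char *file, int lineno, const char *message)
{
  if (program != NULL && file != NULL)
    snprintf (buf, size, "%s:%s:%d: %s", program, file, lineno, message);
  else if (file != NULL)
    snprintf (buf, size, "%s:%d: %s", file, lineno, message);
  else
    snprintf (buf, size, "%s: %s", program, message);
}
```

The message text passed in should follow the capitalization and punctuation conventions of this chapter; the routine does not alter it.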
892:
893: In an interactive program (one that is reading commands from a
894: terminal), it is better not to include the program name in an error
895: message. The place to indicate which program is running is in the
896: prompt or with the screen layout. (When the same program runs with
897: input from a source other than a terminal, it is not interactive and
898: would do best to print error messages using the noninteractive style.)
899:
900: The string @var{message} should not begin with a capital letter when
901: it follows a program name and/or filename. Also, it should not end
902: with a period.
903:
904: Error messages from interactive programs, and other messages such as
905: usage messages, should start with a capital letter. But they should not
906: end with a period.
907:
908:
909: @node Libraries
910: @chapter Library Behaviour
911:
912: Try to make library functions reentrant. If they need to do dynamic
913: storage allocation, at least try to avoid any nonreentrancy aside from
914: that of @code{malloc} itself.
915:
916: Here are certain name conventions for libraries, to avoid name
917: conflicts.
918:
919: Choose a name prefix for the library, more than two characters long.
920: All external function and variable names should start with this
921: prefix. In addition, there should only be one of these in any given
922: library member. This usually means putting each one in a separate
923: source file.
924:
925: An exception can be made when two external symbols are always used
926: together, so that no reasonable program could use one without the
927: other; then they can both go in the same file.
928:
929: External symbols that are not documented entry points for the user
930: should have names beginning with @samp{_}. They should also contain
931: the chosen name prefix for the library, to prevent collisions with
932: other libraries. These can go in the same files with user entry
933: points if you like.
934:
935: Static functions and variables can be used as you like and need not
936: fit any naming convention.
937:
938:
939: @node Portability
940: @chapter Portability As It Applies to GNU
941:
942: Much of what is called ``portability'' in the Unix world refers to
943: porting to different Unix versions. This is a secondary consideration
944: for GNU software, because its primary purpose is to run on top of one
945: and only one kernel, the GNU kernel, compiled with one and only one C
946: compiler, the GNU C compiler. The amount and kinds of variation among
947: GNU systems on different cpu's will be like the variation among Berkeley
948: 4.3 systems on different cpu's.
949:
950: All users today run GNU software on non-GNU systems. So supporting a
951: variety of non-GNU systems is desirable; simply not paramount.
952: The easiest way to achieve portability to a reasonable range of systems
953: is to use Autoconf. It's unlikely that your program needs to know more
954: information about the host machine than Autoconf can provide, simply
955: because most of the programs that need such knowledge have already been
956: written.
957:
958: It is difficult to be sure exactly what facilities the GNU kernel
959: will provide, since it isn't finished yet. Therefore, assume you can
960: use anything in 4.3; just avoid using the format of semi-internal data
961: bases (e.g., directories) when there is a higher-level alternative
962: (readdir).
963:
964: You can freely assume any reasonably standard facilities in the C
965: language, libraries or kernel, because we will find it necessary to
966: support these facilities in the full GNU system, whether or not we
967: have already done so. The fact that there may exist kernels or C
968: compilers that lack these facilities is irrelevant as long as the GNU
969: kernel and C compiler support them.
970:
971: It remains necessary to worry about differences among cpu types, such
972: as the difference in byte ordering and alignment restrictions. It's
973: unlikely that 16-bit machines will ever be supported by GNU, so there
974: is no point in spending any time to consider the possibility that an
975: int will be less than 32 bits.
976:
977: You can assume that all pointers have the same format, regardless
978: of the type they point to, and that this is really an integer.
979: There are some weird machines where this isn't true, but they aren't
980: important; don't waste time catering to them. Besides, eventually
981: we will put function prototypes into all GNU programs, and that will
982: probably make your program work even on weird machines.
983:
984: Since some important machines (including the 68000) are big-endian,
985: it is important not to assume that the address of an int object
986: is also the address of its least-significant byte. Thus, don't
987: make the following mistake:
988:
989: @example
990: int c;
991: @dots{}
992: while ((c = getchar()) != EOF)
993: write(file_descriptor, &c, 1);
994: @end example
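
One way to avoid the mistake is to copy the value into a one-byte object before taking its address, as in this sketch. The function name @code{copy_chars} and its parameters are hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* Copy all characters from IN to the file descriptor FD.  Return the
   number of characters copied, or -1 if a write fails.  */
static long
copy_chars (FILE *in, int fd)
{
  int c;
  long count = 0;

  while ((c = getc (in)) != EOF)
    {
      /* Writing from a char object works whatever the byte order,
         since the object is exactly the byte to be written.  */
      unsigned char byte = (unsigned char) c;

      if (write (fd, &byte, 1) != 1)
        return -1;
      count++;
    }
  return count;
}
```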
995:
996: You can assume that it is reasonable to use a meg of memory. Don't
997: strain to reduce memory usage unless it can get to that level. If
998: your program creates complicated data structures, just make them in
999: core and give a fatal error if malloc returns zero.
1000:
1001: If a program works by lines and could be applied to arbitrary
1002: user-supplied input files, it should keep only a line in memory, because
1003: this is not very hard and users will want to be able to operate on input
1004: files that are bigger than will fit in core all at once.
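
The sketch below reads one line at a time into a buffer that grows as needed, so only the current line is held in memory and there is no fixed limit on its length. The name @code{read_line}, the initial size of 64 and the choice to exit on exhaustion are assumptions. The length is returned explicitly so that lines containing NUL characters are not cut short.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line from IN, including its newline if any, into *BUF,
   growing the buffer as needed.  *CAP is the current buffer size;
   start with *BUF == NULL and *CAP == 0.  Return the line length,
   or -1 at end of file.  */
static long
read_line (FILE *in, char **buf, size_t *cap)
{
  size_t len = 0;
  int c;

  while ((c = getc (in)) != EOF)
    {
      /* Keep room for this character and a terminating NUL.  */
      if (len + 2 > *cap)
        {
          size_t new_cap = *cap ? *cap * 2 : 64;
          char *new_buf = realloc (*buf, new_cap);

          if (new_buf == NULL)
            {
              fprintf (stderr, "virtual memory exhausted\n");
              exit (1);
            }
          *buf = new_buf;
          *cap = new_cap;
        }
      (*buf)[len++] = (char) c;
      if (c == '\n')
        break;
    }

  if (len == 0)
    return -1;
  (*buf)[len] = '\0';
  return (long) len;
}
```

The same buffer can be passed to every call, so a long input file costs no more memory than its longest line.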
1005:
1006:
1007: @node User Interfaces
1008: @chapter Standards for Command Line Interfaces
1009:
1010: Please don't make the behavior of a utility depend on the name used
1011: to invoke it. It is useful sometimes to make a link to a utility
1012: with a different name, and that should not change what it does.
1013:
1014: Instead, use a run time option or a compilation switch or both
1015: to select among the alternate behaviors.
1016:
1017: Likewise, please don't make the behavior of the program depend on the
1018: type of output device it is used with. Device independence is an
1019: important principle of the system's design; do not compromise it
1020: merely to save someone from typing an option now and then.
1021:
1022: If you think one behavior is most useful when the output is to a
1023: terminal, and another is most useful when the output is a file or a
1024: pipe, then it is usually best to make the default behavior the one that
1025: is useful with output to a terminal, and have an option for the other
1026: behavior.
1027:
1028: Compatibility requires certain programs to depend on the type of output
1029: device. It would be disastrous if @code{ls} or @code{sh} did not do so
1030: in the way all users expect. In some of these cases, we supplement the
1031: program with a preferred alternate version that does not depend on the
1032: output device type. For example, we provide a @code{dir} program much
1033: like @code{ls} except that its default output format is always
1034: multi-column format.
1035:
1036: It is a good idea to follow the @sc{POSIX} guidelines for the
1037: command-line options of a program. The easiest way to do this is to use
1038: @code{getopt} to parse them. Note that the GNU version of @code{getopt}
1039: will normally permit options anywhere among the arguments unless the
1040: special argument @samp{--} is used. This is not what @sc{POSIX}
1041: specifies; it is a GNU extension.
1042:
1043: Please define long-named options that are equivalent to the
1044: single-letter Unix-style options. We hope to make GNU more user
1045: friendly this way. This is easy to do with the GNU function
1046: @code{getopt_long}.
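
As a rough illustration, the sketch below gives two single-letter options long-named equivalents with @code{getopt_long}. The option set (@samp{-v}, @samp{-o}), @code{struct options} and @code{parse_options} are hypothetical and not taken from any existing program.

```c
#include <getopt.h>
#include <stddef.h>

/* Settings collected from the command line.  */
struct options
{
  int verbose;                  /* Print extra information (-v).  */
  const char *output;           /* Output file name (-o), or NULL.  */
};

/* Each long option maps to the letter of its short equivalent.  */
static const struct option long_options[] =
{
  {"verbose", no_argument, NULL, 'v'},
  {"output", required_argument, NULL, 'o'},
  {NULL, 0, NULL, 0}
};

/* Decode the options in ARGV into *OPTS.  Return the index of the
   first non-option argument, or -1 if an option is invalid.  */
static int
parse_options (int argc, char **argv, struct options *opts)
{
  int c;

  opts->verbose = 0;
  opts->output = NULL;
  optind = 1;

  while ((c = getopt_long (argc, argv, "vo:", long_options, NULL)) != -1)
    switch (c)
      {
      case 'v':
        opts->verbose = 1;
        break;
      case 'o':
        opts->output = optarg;
        break;
      default:
        return -1;
      }
  return optind;
}
```

Because both spellings produce the same return value, the @code{switch} needs only one case per option.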
1047:
1048: One of the advantages of long-named options is that they can be
1049: consistent from program to program. For example, users should be able
1050: to expect the ``verbose'' option of any GNU program which has one, to be
1051: spelled precisely @samp{--verbose}. To achieve this uniformity, look at
1052: the table of common long-option names when you choose the option names
1053: for your program. The table is in the file @file{longopts.table}.
1054:
1055: If you use names not already in the table, please send
1056: @samp{gnu@@prep.ai.mit.edu} a list of them, with their meanings, so we
1057: can update the table.
1058:
1059: It is usually a good idea for file names given as ordinary arguments
1060: to be input files only; any output files would be specified using
1061: options (preferably @samp{-o}). Even if you allow an output file name
1062: as an ordinary argument for compatibility, try to provide a suitable
1063: option as well. This will lead to more consistency among GNU
1064: utilities, so that there are fewer idiosyncrasies for users to
1065: remember.
1066:
1067: Programs should support an option @samp{--version} which prints the
1068: program's version number on standard output and exits successfully, and
1069: an option @samp{--help} which prints option usage information on
1070: standard output and exits successfully. These options should inhibit
1071: the normal function of the command; they should do nothing except print
1072: the requested information.
1073:
1074: @node Documentation
1075: @chapter Documenting Programs
1076:
1077: Please use Texinfo for documenting GNU programs. See the Texinfo
1078: manual, either the hardcopy or the version in the GNU Emacs Info
1079: subsystem (@kbd{C-h i}). See existing GNU Texinfo files (e.g. those
1080: under the @file{man/} directory in the GNU Emacs Distribution) for
1081: examples.
1082:
1083: The title page of the manual should state the version of the program
1084: which the manual applies to. The Top node of the manual should also
1085: contain this information. If the manual is changing more frequently
1086: than or independent of the program, also state a version number for
1087: the manual in both of these places.
1088:
1089: The manual should document all command-line arguments and all
1090: commands. It should give examples of their use. But don't organize
1091: the manual as a list of features. Instead, organize it by the
1092: concepts a user will have before reaching that point in the manual.
1093: Address the goals that a user will have in mind, and explain how to
1094: accomplish them. Don't use Unix man pages as a model for how to
1095: write GNU documentation; they are a bad example to follow.
1096:
1097: The manual should have a node named @samp{@var{program} Invocation},
1098: @samp{@var{program} Invoke} or @samp{Invoking @var{program}}, where
1099: @var{program} stands for the name of the program being described, as you
1100: would type it in the shell to run the program. This node (together with
1101: its subnodes if any) should describe the program's command line
1102: arguments and how to run it (the sort of information people would look
1103: in a man page for). Start with an @samp{@@example} containing a
1104: template for all the options and arguments that the program uses.
1105:
1106: Alternatively, put a menu item in some menu whose item name fits one of
1107: the above patterns. This identifies the node which that item points to
1108: as the node for this purpose, regardless of the node's actual name.
1109:
1110: There will be automatic features for specifying a program name and
1111: quickly reading just this part of its manual.
1112:
1113: If one manual describes several programs, it should have such a node for
1114: each program described.
1115:
1116: In addition to its manual, the package should have a file named
1117: @file{NEWS} which contains a list of user-visible changes worth
1118: mentioning. In each new release, add items to the front of the file and
1119: identify the version they pertain to. Don't discard old items; leave
1120: them in the file after the newer items. This way, a user upgrading from
1121: any previous version can see what is new.
1122:
1123: If the @file{NEWS} file gets very long, move some of the older items
1124: into a file named @file{ONEWS} and put a note at the end referring the
1125: user to that file.
1126:
1127: It is ok to supply a man page for the program as well as a Texinfo
1128: manual if you wish to. But keep in mind that supporting a man page
1129: requires continual effort, each time the program is changed. Any time
1130: you spend on the man page is time taken away from more useful things you
1131: could contribute.
1132:
1133: Thus, even if a user volunteers to donate a man page, you may find this
1134: gift costly to accept. Unless you have time on your hands, it may be
1135: better to refuse the man page unless the same volunteer agrees to take
1136: full responsibility for maintaining it---so that you can wash your hands
1137: of it entirely. If the volunteer ceases to do the job, then don't feel
1138: obliged to pick it up yourself; it may be better to withdraw the man
1139: page until another volunteer offers to carry on with it.
1140:
1141: Alternatively, if you expect the discrepancies to be small enough that
1142: the man page remains useful, put a prominent note near the beginning of
1143: the man page explaining that you don't maintain it and that the Texinfo
1144: manual is more authoritative, and describing how to access the Texinfo
1145: documentation.
1146:
1147: @node Releases
1148: @chapter Making Releases
1149:
1150: Package the distribution of Foo version 69.96 in a tar file named
1151: @file{foo-69.96.tar}. It should unpack into a subdirectory named
1152: @file{foo-69.96}.
1153:
1154: Building and installing the program should never modify any of the files
1155: contained in the distribution. This means that all the files that form
1156: part of the program in any way must be classified into @dfn{source
1157: files} and @dfn{non-source files}. Source files are written by humans
1158: and never changed automatically; non-source files are produced from
1159: source files by programs under the control of the Makefile.
1160:
1161: Naturally, all the source files must be in the distribution. It is okay
1162: to include non-source files in the distribution, provided they are
1163: up-to-date and machine-independent, so that building the distribution
1164: normally will never modify them. We commonly include non-source files
1165: produced by Bison, Lex, @TeX{}, and Makeinfo; this helps avoid
1166: unnecessary dependencies between our distributions, so that users can
1167: install whichever packages they want to install.
1168:
1169: Non-source files that might actually be modified by building and
1170: installing the program should @strong{never} be included in the
1171: distribution. So if you do distribute non-source files, always make
1172: sure they are up to date when you make a new distribution.
1173:
1174: Make sure that the directory into which the distribution unpacks and
1175: all of its subdirectories are world-writable (octal mode 777).
1176: This is so that old versions of @code{tar} which preserve the
1177: ownership and permissions of the files from the tar archive will be
1178: able to extract all the files even if the user is unprivileged.
1179:
1180: Make sure that no file name in the distribution is more than 14
1181: characters long. Likewise, no file created by building the program
1182: should have a name longer than 14 characters. The reason for this is
1183: that some systems adhere to a foolish interpretation of the POSIX
1184: standard, and refuse to open a longer name, rather than truncating as
1185: they did in the past.
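
A check of this kind is easy to automate before making a release. The sketch below assumes file names are given as slash-separated paths; @code{names_fit} and @code{MAX_NAME_LENGTH} are hypothetical names.

```c
#include <string.h>

enum { MAX_NAME_LENGTH = 14 };

/* Return 1 if every component of the slash-separated PATH is at most
   MAX_NAME_LENGTH characters long, and 0 otherwise.  */
static int
names_fit (const char *path)
{
  while (*path != '\0')
    {
      size_t len = strcspn (path, "/");

      if (len > (size_t) MAX_NAME_LENGTH)
        return 0;
      path += len;
      if (*path == '/')
        path++;
    }
  return 1;
}
```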
1186:
1187: Don't include any symbolic links in the distribution itself. If the tar
1188: file contains symbolic links, then people cannot even unpack it on
1189: systems that don't support symbolic links. Also, don't use multiple
1190: names for one file in different directories, because certain file
1191: systems cannot handle this and that prevents unpacking the
1192: distribution.
1193:
1194: Try to make sure that all the file names will be unique on MS-DOG. A
1195: name on MS-DOG consists of up to 8 characters, optionally followed by a
1196: period and up to three characters. MS-DOG will truncate extra
1197: characters both before and after the period. Thus,
1198: @file{foobarhacker.c} and @file{foobarhacker.o} are not ambiguous; they
1199: are truncated to @file{foobarha.c} and @file{foobarha.o}, which are
1200: distinct.
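
The truncation rule can be sketched as a small function, which is handy for checking a list of file names for collisions. This is a simplification under stated assumptions: the name is split at its first period, and case folding and characters that the file system forbids are ignored. The name @code{dos_name} is hypothetical.

```c
#include <string.h>

/* Store in OUT the form of NAME after truncation to at most 8
   characters before the period and at most 3 after it.  OUT must
   have room for at least 13 bytes.  */
static void
dos_name (const char *name, char *out)
{
  const char *dot = strchr (name, '.');
  size_t base_len = dot ? (size_t) (dot - name) : strlen (name);
  size_t ext_len = dot ? strlen (dot + 1) : 0;

  if (base_len > 8)
    base_len = 8;
  if (ext_len > 3)
    ext_len = 3;

  memcpy (out, name, base_len);
  out += base_len;
  if (dot != NULL)
    {
      *out++ = '.';
      memcpy (out, dot + 1, ext_len);
      out += ext_len;
    }
  *out = '\0';
}
```

Two names in the distribution collide if they give the same result; for example, @file{foobarhacker1.c} and @file{foobarhacker2.c} both become @file{foobarha.c}.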
1201:
1202: Include in your distribution a copy of the @file{texinfo.tex} you used
1203: to test print any @file{*.texinfo} files.
1204:
1205: Likewise, if your program uses small GNU software packages like regex,
1206: getopt, obstack, or termcap, include them in the distribution file.
1207: Leaving them out would make the distribution file a little smaller at
1208: the expense of possible inconvenience to a user who doesn't know what
1209: other files to get.
1210: @bye