
Why are some functions declared extern and header file not included in source in Git source code?

I wanted to see the source code of a real world application to understand good programming practices etc. So I chose Git and downloaded the source for version 1.8.4.

After browsing through various files at random, something caught my attention in these two files: strbuf.h and strbuf.c

These two files apparently define an API with this documentation.

I have two questions:

  1. Why are the function declarations at lines 16, 17, 18, and 19, and the global variable at line 6, in 'strbuf.h' declared extern?

  2. Why is "strbuf.h" not #included in strbuf.c?

As a novice programmer, I have always been taught that function definitions go in a .c file, whereas function declarations, macros, inlines, etc. go in a .h file, which is then #included in every .c file that wants to use them.

Can anyone please explain this?

rsjethani asked Aug 11 '13 11:08


1 Answer

strbuf.c includes cache.h and cache.h includes strbuf.h, so your premise for question 2 (that strbuf.c does not include strbuf.h) is wrong: it does include it, just not directly.

extern applied to functions

The extern keyword is never required for function declarations, but it does have an effect: it declares that the identifier naming the function (i.e., the function's name) has the same linkage as any previously visible declaration, or if no such declaration is visible, that the identifier has external linkage. This rather confusing phrasing really means that, given:

static int foo(void);
extern int foo(void);

the second declaration of foo also declares it static, giving it internal linkage. If you write:

static int foo(void);
int foo(void); /* wrong in 1990s era C */

you have declared it first as having internal linkage, and then second as having external linkage, and in pre-1999 versions of C [1], that produces undefined behavior. In one sense, then, the extern keyword adds some safety (at the price of confusion) as it can mean static when necessary. But you could always write static again, and extern is not a panacea:

extern int foo(void);
static int foo(void); /* ERROR */

This third form is still erroneous. The first extern declaration has no previous visible declaration, so foo has external linkage, and then the second static declaration gives foo internal linkage, producing undefined behavior.

In short, extern is not required on function declarations. Some people just prefer it for style reasons.

(Note: I'm leaving out extern inline in C99, which is kind of weird, and implementations vary. See http://www.greenend.org.uk/rjk/2003/03/inline.html for more details.)
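As a minimal sketch of the first case above (static followed by extern), the fragment below should compile cleanly with foo keeping internal linkage. The wrapper name call_foo is hypothetical and exists only so the file-local function can be reached from outside.

```c
/* `static` gives the identifier foo internal linkage. */
static int foo(void);

/* `extern` does not override that: with the static declaration
   visible, this redeclaration inherits internal linkage. */
extern int foo(void);

/* The definition is still file-local. */
static int foo(void)
{
    return 42;
}

/* Hypothetical wrapper with external linkage, so other code can
   reach the file-local foo indirectly. */
int call_foo(void)
{
    return foo();
}
```

Swapping the order of the first two declarations turns this into the erroneous third form.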

extern applied to variable declarations

The extern keyword on a variable declaration has multiple different effects. First, as with function declarations, it affects the linkage of the identifier. Second, for an identifier outside any function (a "global variable" in one of the two usual senses), it causes the declaration to be a declaration, rather than a definition, provided the variable is not also initialized.

For variables inside a function (i.e., with "block scope"), such as somevar in:

void f(void) {
    extern int somevar;
    ...
}

the extern keyword causes the identifier to have some linkage (internal or external) instead of "no linkage" (as for automatic-duration local variables). In the process, it also causes the variable itself to have static duration, rather than automatic. (Automatic-duration variables never have linkage, and always have block scope, rather than file scope.)
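A small sketch of the block-scope case, assuming the variable is defined later in the same file (it could equally be defined in another translation unit). The function name counter and the initial value 10 are made up for illustration.

```c
/* The declaration inside the function has block scope, but the
   object it names has linkage and static duration. */
int counter(void)
{
    extern int somevar;   /* declaration only: no storage is created here */
    return ++somevar;     /* modifies the single file-scope object */
}

/* The one definition of somevar; its value persists across calls. */
int somevar = 10;
```

Each call to counter() changes the same object, which is what distinguishes this from an automatic local.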

As with function declarations, the linkage extern assigns is internal if there is a previous visible internal-linkage declaration, and external otherwise. So the x inside f() here has internal linkage, despite the extern keyword:

static int x;
void f(void) {
    extern int x; /* note: don't do this */
    ...
}

The only reason to write this kind of code is to confuse other programmers, so don't do it. :-)

In general, the reason to annotate "global" (i.e., file-scope, static-duration, external-linkage) variables with the extern keyword is to prevent that particular declaration from becoming a definition. C compilers that use the so-called "def/ref" model get indigestion at link time when the same name is defined more than once. Thus, if file1.c says int globalvar; and file2.c also says int globalvar;, both are definitions and the program may fail to link (although most Unix-like systems use the so-called "common model" by default, which makes this work anyway). If you are declaring such a variable in a header file, which is likely to be included from many different .c files, use extern to make that declaration "just a declaration".
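To make the declaration/definition split concrete, here is a single-file sketch that stands in for what would normally be three files. The file names in the comments, the function read_globalvar, and the value 42 are all hypothetical.

```c
/* --- would live in a header, e.g. globals.h --- */
extern int globalvar;      /* declaration only: allocates nothing */

/* --- would live in exactly one .c file, e.g. globals.c --- */
int globalvar = 42;        /* the single definition */

/* --- would live in any other .c file that includes the header --- */
int read_globalvar(void)
{
    return globalvar;      /* resolved against the one definition at link time */
}
```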

One, and only one, of those .c files can then declare the variable again, leaving off the extern keyword and/or including an initializer. Or, some people prefer a style in which the header file uses something like this:

/* foo.h */
#ifndef EXTERN
# define EXTERN extern
#endif
EXTERN int globalvar;

In this case, one (and only one) of those .c files can contain the sequence:

#define EXTERN
#include "foo.h"

Here, since EXTERN is defined, the #ifndef turns off the subsequent #define and the line EXTERN int globalvar; expands to just int globalvar; so that this becomes a definition rather than a declaration. Personally, I dislike this coding style, although it does satisfy the "don't repeat yourself" principle. Mostly I find the uppercase EXTERN misleading, and this pattern is unhelpful with initialization. Those who favor it usually wind up adding a second macro to hide the initializers:

#ifndef EXTERN
# define EXTERN extern
# define INIT_VAL(x) /*nothing*/
#else
# define INIT_VAL(x) = x
#endif

EXTERN int globalvar INIT_VAL(42);

but even this falls apart when the item to be initialized needs a compound initializer (e.g., a struct that should be initialized to { 42, 23, 17, "hike!" }).
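For reference, this is roughly what the one defining .c file sees, with the header contents pasted inline instead of #included so the sketch stands alone. The accessor get_globalvar is a hypothetical addition.

```c
/* The defining file defines EXTERN (as empty) before the header. */
#define EXTERN

/* --- header contents, shown inline --- */
#ifndef EXTERN
# define EXTERN extern
# define INIT_VAL(x) /*nothing*/
#else
# define INIT_VAL(x) = x
#endif

/* In this file the line below expands to: int globalvar = 42; */
EXTERN int globalvar INIT_VAL(42);

/* Hypothetical accessor, just to show the variable is usable. */
int get_globalvar(void)
{
    return globalvar;
}
```

The compound-initializer problem comes from the preprocessor: the commas inside { 42, 23, 17, "hike!" } are not protected by the braces, so INIT_VAL would receive four arguments instead of one.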

(Note: I've deliberately glossed over the whole "tentative definition" thing here. A definition without an initializer is only "tentatively defined" until the end of the translation unit. This allows certain kinds of forward-references that are otherwise too difficult to express. It's not normally very important.)
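One kind of forward reference that tentative definitions allow is two file-local objects that point at each other. The sketch below is only an illustration; the struct, the variable names, and ring_sum are hypothetical.

```c
struct node {
    struct node *next;
    int value;
};

/* Tentative definition: names `second` so that `first` can refer to it. */
static struct node second;

static struct node first = { &second, 1 };

/* The real definition, with an initializer, completes `second`. */
static struct node second = { &first, 2 };

/* Walks the two-element ring once and adds up the values. */
int ring_sum(void)
{
    return first.value + first.next->value;
}
```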

including the header that declares function f in code that defines function f

This is always a good idea, for one simple reason: the compiler will compare the declaration of f() in the header against the definition of f() in the code. If the two do not match (for any reason—typically a mistake in initial coding, or a failure to update one of the two during maintenance, but occasionally simply due to Cat Walked On Keyboard Syndrome or similar), the compiler can catch the mistake at compile time.
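A sketch of the check in action, again with the header contents shown inline; the header name count.h and the function count_char are hypothetical.

```c
#include <stddef.h>

/* --- contents of a hypothetical header, count.h --- */
size_t count_char(const char *s, char c);

/* --- the defining .c file, after including that header --- */

/* Because the prototype above is visible, the compiler checks this
   definition against it. Changing the signature here to, say,
   int count_char(const char *s, int c) would be diagnosed as a
   conflicting declaration instead of silently compiling. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}
```

Without the header in scope, a mismatched definition would compile on its own and the mistake would only surface, if at all, as misbehavior in the callers.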


[1] The 1999 C standard says that omitting the extern keyword in a function declaration means the same thing as using the extern keyword there. This is much simpler to describe, and means you get defined (and sensible) behavior instead of undefined (and therefore maybe-good, maybe-bad) behavior.

torek answered Oct 24 '22 17:10