
What does it mean to have a struct without a definition?

Recently, I ran across the following code in my system stdio.h:

struct _IO_FILE_plus;
extern struct _IO_FILE_plus _IO_2_1_stdin_;
extern struct _IO_FILE_plus _IO_2_1_stdout_;
extern struct _IO_FILE_plus _IO_2_1_stderr_;

I'm used to seeing pointers to structs that are forward-declared like this: extern struct _IO_FILE *stdin;, but having a bare struct seems very odd, since you can't use the struct or pass it to functions. Is this just a no-op?

Joshua Nelson asked Oct 07 '19 03:10



1 Answer

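The code struct _IO_FILE_plus; declares the struct tag _IO_FILE_plus as an incomplete type. The compiler then knows that such a type exists wherever it sees the name used, but it does not yet know the size or the members; those are described by a definition that lives elsewhere, possibly in a file the compiler never sees when building your code.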

The extern modifier indicates that the symbol named is an external symbol that exists in some other compilation unit. Code such as:

extern struct _IO_FILE_plus _IO_2_1_stdin_;

is also a declaration of the symbol, in this case _IO_2_1_stdin_, telling the compiler that the symbol is an external symbol that is defined and exists in some other compilation unit (file) and what the type of the symbol is, in this case a struct _IO_FILE_plus.

Normally, a declaration that uses an incomplete struct type uses a pointer to the struct, since the size and layout of the struct cannot be determined from a bare declaration such as struct _IO_FILE_plus;.

In this case, however, the declaration is extern, so it reserves no storage. It works as long as the source code contains no statement that requires the compiler to know the size or layout of the struct.

So if you had source such as these statements:

struct _IO_FILE_plus *myIo = malloc(sizeof(struct _IO_FILE_plus));

struct _IO_FILE_plus myIo = _IO_2_1_stdin_;  // no pointers here, struct assignment

these would generate an error because the compiler needs the definition of the struct _IO_FILE_plus in order to determine the result of sizeof() or the amount of memory to copy for struct assignment in these statements.

However if you have a statement such as:

struct _IO_FILE_plus *myIO = &_IO_2_1_stdin_;

this will compile because the compiler only needs to know how to find the address of the external variable and to put that address into a pointer variable. The address of the external variable is resolved by the linker, or by the dynamic loader when the symbol lives in a shared library.

If the external does not exist then you will get a link error such as "undefined reference" (GCC/ld) or "unresolved external symbol" (MSVC).
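The rules above can be checked in a single file. The following is a minimal sketch, not glibc code: struct widget, the_widget, widget_handle() and widget_id() are hypothetical names chosen for illustration. The extern declaration and the address-of are accepted while the type is still incomplete; the statements that need the size or layout are left in a comment because a compiler rejects them at that point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not glibc code. */
struct widget;                     /* incomplete type: size and members unknown */
extern struct widget the_widget;   /* declares an object, reserves no storage */

/* Fine: taking the address needs neither the size nor the layout. */
struct widget *widget_handle(void)
{
    return &the_widget;
}

/* Each of these is rejected while struct widget is incomplete:
 *   size_t n = sizeof(struct widget);     needs the size
 *   struct widget copy = the_widget;      needs the size to copy
 *   int id = the_widget.id;               needs the member layout
 */

/* In a real project the definition would live in another source file. */
struct widget { int id; };
struct widget the_widget = { 42 };

/* From here on the type is complete, so member access compiles. */
int widget_id(const struct widget *w)
{
    assert(w != NULL);
    return w->id;
}
```

Everything above the definition of struct widget corresponds to what a user of the header sees; everything from the definition onward corresponds to the file that owns the object.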

API Library Example

One way this may be useful: you have several different objects or devices represented by proxy objects, and a function library whose users should be able to choose the target object or device for each function call.

So in your library you expose these objects or proxy objects as externals, but you keep their internals secret by providing only the declaration.

Then in the function interface you require a pointer to the appropriate object or proxy object to be used with the function.

The nice thing about this approach is that other parties with access to your library internals can provide additional proxy objects of their own that work with your library.

This works especially well when the struct definition contains pointers to hook functions that your library would invoke to perform device specific operations that the third party knows about but you don't have to. The hook functions have a defined interface with a set of expected results and how that is done is up to the provider of the hook function.

So the library source file contains:

struct _IO_FILE_plus {
    unsigned char  buffer[1024];
    int bufptr1;
    //  …  other struct member definitions
    int (*hookOne)(struct _IO_FILE_plus *obj);   // third party hook function pointer
    int (*hookTwo)(struct _IO_FILE_plus *obj);   // third party hook function pointer
};

struct _IO_FILE_plus _IO_2_1_stdin_ = { {0}, 0, …. };
struct _IO_FILE_plus _IO_2_1_stdout_ = { {0}, 0, …. };
struct _IO_FILE_plus _IO_2_1_stderr_ = { {0}, 0, …. };

int  funcOne (struct _IO_FILE_plus *obj, int aThing)
{
    int  iResult = 0;   // default result when no hook is installed

    if (obj->hookOne) iResult = obj->hookOne(obj);

    // do other funcOne() stuff using the object, obj, provided

    return iResult;
}


int  funcTwo (struct _IO_FILE_plus *obj, double aThing)
{
    int  iResult = 0;   // default result when no hook is installed

    if (obj->hookTwo) iResult = obj->hookTwo(obj);

    // do other funcTwo() stuff using the object, obj, provided

    return iResult;
}

The library source file compiles fine because the compiler has the full definition of the struct available. Then in the header file provided with the library you have these statements:

struct _IO_FILE_plus ;

extern struct _IO_FILE_plus _IO_2_1_stdin_ ;
extern struct _IO_FILE_plus _IO_2_1_stdout_ ;
extern struct _IO_FILE_plus _IO_2_1_stderr_ ;

extern int  funcOne (struct _IO_FILE_plus *obj, int aThing);
extern int  funcTwo (struct _IO_FILE_plus *obj, double aThing);

These all work because none of these source statements require the actual definition of the struct to be available to the compiler. The compiler only needs to know that there are such symbols defined somewhere.

In the source file using these you could have a statement like:

int k = funcOne(&_IO_2_1_stdin_, 5);

and again this only requires the compiler to know that the symbol exists and at some point the address of that symbol will be available.
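To make the split between header, client, and library concrete, here is a condensed single-file sketch. All names (struct device, dev_console, dev_run(), dev_doubler, double_hook()) are hypothetical and stand in for the identifiers above. Returning the value unchanged when no hook is installed is also an assumption of this sketch. The client function is placed before the struct definition on purpose, so it is compiled with only the incomplete type in view.

```c
#include <assert.h>
#include <stddef.h>

/* ---- public header (hypothetical): declarations only ---- */
struct device;
extern struct device dev_console;
int dev_run(struct device *dev, int value);

/* ---- client code: compiles with only the declarations above ---- */
int client_demo(void)
{
    return dev_run(&dev_console, 5);
}

/* ---- library internals: the full definition ---- */
struct device {
    int calls;                                    /* times dev_run() was invoked */
    int (*hook)(struct device *dev, int value);   /* optional device-specific step */
};

struct device dev_console = { 0, NULL };          /* no hook installed */

int dev_run(struct device *dev, int value)
{
    assert(dev != NULL);
    dev->calls++;
    /* Assumption of this sketch: pass the value through when no hook is set. */
    return dev->hook ? dev->hook(dev, value) : value;
}

/* ---- third party that has been given the definition ---- */
static int double_hook(struct device *dev, int value)
{
    (void)dev;            /* unused in this sketch */
    return value * 2;
}

struct device dev_doubler = { 0, double_hook };
```

A caller would use dev_run(&dev_doubler, 5) exactly like dev_run(&dev_console, 5); the library never needs to know what double_hook() does.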

And as part of the library design there may well be C Preprocessor macros used to hide some of this plumbing further. So you may have macros such as:

#define DO_FUNCONE(io,iVal)  funcOne(&(io), (iVal))

#define DO_FUNCONE_STDIN(iVal)  funcOne(&_IO_2_1_stdin_,(iVal))

#define IO_STDIN  (&_IO_2_1_stdin_)
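As a rough illustration of how such macros behave, the sketch below applies the same three patterns to a hypothetical struct counter with an external object counter_main and a function counter_add(); none of these names come from a real library. The macros only ever take the address of the object, so they expand correctly in code that sees nothing but the incomplete type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical public header: incomplete type, external object, one function. */
struct counter;
extern struct counter counter_main;
int counter_add(struct counter *c, int n);

/* Macros that hide the address-of plumbing from the caller. */
#define COUNTER_ADD(c, n)     counter_add(&(c), (n))
#define COUNTER_MAIN          (&counter_main)
#define COUNTER_ADD_MAIN(n)   counter_add(COUNTER_MAIN, (n))

/* Client code: expands to pointer-only calls, so no definition is needed. */
int demo_macros(void)
{
    COUNTER_ADD(counter_main, 2);    /* counter_add(&(counter_main), (2)) */
    return COUNTER_ADD_MAIN(3);      /* counter_add((&counter_main), (3)) */
}

/* Hypothetical library internals. */
struct counter { int total; };
struct counter counter_main = { 0 };

int counter_add(struct counter *c, int n)
{
    assert(c != NULL);
    c->total += n;
    return c->total;                 /* running total after this addition */
}
```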

However, a statement like the following will not compile, because passing the external by value requires the compiler to copy the struct, and it cannot do that without the definition:

int k = doFuncOne (_IO_2_1_stdin_);  // compiler error. definition of struct _IO_FILE_plus not available

where the function definition of the function doFuncOne() looks like:

// compiler error. definition of struct _IO_FILE_plus not available
int doFuncOne (struct _IO_FILE_plus obj)  // notice this is struct and not pointer to struct
{
    // do some setup then call funcOne().
    return funcOne(&obj, 33);
}

However a change to the interface of the function doFuncOne() would allow it to compile:

// following would compile as only declaration is needed by the compiler.
int doFuncOne (struct _IO_FILE_plus *obj)  // notice this is now pointer to struct
{
    // do some setup then call funcOne().
    return funcOne(obj, 33);
}

The library could provide a version of the function funcOne(), say funcOneStruct(), which allowed an argument of the struct rather than pointer to the struct because the compiler has a definition of the struct available when compiling the source files of the library. However people using the library would be unable to use that function because the users of the library have only the declaration of the struct available to them and not the definition of the struct.

Such a function may be useful for third party developers who have a definition of the struct available to them perhaps to clone one of the existing objects provided by the library.
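A by-value function and a clone helper of that kind might look like the following sketch. It assumes a hypothetical private header that gives third parties the full definition of a struct sensor; the names sensor_default, sensor_doubled_gain() and sensor_clone() are invented for illustration. Code that sees only the forward declaration could not compile any of this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private header: the full definition, shared only with
 * the library itself and trusted third parties. */
struct sensor {
    int id;
    int gain;
};

struct sensor sensor_default = { 1, 10 };

/* By-value parameter: the compiler must know the size of struct sensor. */
int sensor_doubled_gain(struct sensor s)
{
    s.gain *= 2;          /* modifies the local copy only */
    return s.gain;
}

/* Clone an existing object; struct assignment copies every member. */
struct sensor sensor_clone(const struct sensor *src, int new_id)
{
    assert(src != NULL);
    struct sensor copy = *src;
    copy.id = new_id;
    return copy;
}
```

Because both functions copy the struct, the object handed in is never modified, which is what makes them suitable for cloning a library-provided object.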

Richard Chambers answered Oct 18 '22 18:10