When zeroing a struct such as sockaddr_in, sockaddr_in6 and addrinfo before use, which is correct: memset, an initializer or either?

Whenever I look at real code or example socket code in books, man pages and websites, I almost always see something like:

struct sockaddr_in foo;
memset(&foo, 0, sizeof foo);
/* or bzero(), which POSIX marks as LEGACY, and is not in standard C */
foo.sin_port = htons(42);

instead of:

struct sockaddr_in foo = { 0 };
/* if at least one member is initialized, all others are set to
   zero (as though they had static storage duration) as per
   ISO/IEC 9899:1999 6.7.8 Initialization */
foo.sin_port = htons(42);

or:

struct sockaddr_in foo = { .sin_port = htons(42) }; /* New in C99 */ 

or:

static struct sockaddr_in foo;
/* static storage duration will also behave as if
   all members are explicitly assigned 0 */
foo.sin_port = htons(42);

The same can also be found for setting struct addrinfo hints to zero before passing it to getaddrinfo, for example.
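For the getaddrinfo case, the initializer form might look like the sketch below. The helper name resolve_numeric_port is hypothetical, and the choice of AI_NUMERICHOST and AI_NUMERICSERV is only an assumption made so that the example needs no DNS or services lookup; the feature-test macro is there for strict -std=c99 builds on glibc.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* exposes getaddrinfo under strict -std=c99 */
#endif

#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a numeric IPv4 address and numeric port with
   getaddrinfo and return the port in host byte order, or -1 on failure. */
int resolve_numeric_port(const char *host, const char *service)
{
    /* Designated initializer: every member not named here is set to zero
       (or a null pointer), so no memset of hints is needed. */
    struct addrinfo hints = {
        .ai_flags = AI_NUMERICHOST | AI_NUMERICSERV,
        .ai_family = AF_INET,
        .ai_socktype = SOCK_STREAM,
    };
    struct addrinfo *res = NULL;

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* We asked for AF_INET only, so the first result is a sockaddr_in. */
    assert(res != NULL && res->ai_family == AF_INET);

    struct sockaddr_in sin;
    memcpy(&sin, res->ai_addr, sizeof sin);
    int port = ntohs(sin.sin_port);

    freeaddrinfo(res);
    return port;
}
```

Calling resolve_numeric_port("127.0.0.1", "42") exercises the same path as the memset version of the hints setup; only the way the unused members become zero differs.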

Why is this? As far as I understand, the examples that do not use memset are likely to be equivalent to the one that does, if not better. I realize that there are differences:

  • memset will set all bits to zero, which is not necessarily the correct bit representation for setting each member to 0.
  • memset will also set padding bits to zero.
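A minimal sketch of those two differences, using a made-up struct padded (not one of the socket structs) whose layout is assumed to contain padding on most ABIs. The function names are hypothetical. The code only checks what each approach guarantees: the initializer guarantees zero-valued members, while memset guarantees zero bytes, padding included.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct; most ABIs insert padding between 'tag' and 'value'. */
struct padded {
    char tag;
    long value;
    void *ptr;
};

/* Returns 1 if every member compares equal to 0 (or NULL), else 0. */
static int members_are_zero(const struct padded *p)
{
    return p->tag == 0 && p->value == 0 && p->ptr == NULL;
}

/* Initializer: members get the value zero for their type.
   The contents of any padding bytes are not specified. */
int initializer_zeroes_members(void)
{
    struct padded p = { 0 };
    return members_are_zero(&p);
}

/* memset: every byte of the object, padding included, becomes zero.
   That the members then read back as 0/NULL holds on mainstream
   platforms, but C does not promise it for pointer members. */
int memset_zeroes_bytes_and_members(void)
{
    struct padded p;
    memset(&p, 0, sizeof p);

    const unsigned char *bytes = (const unsigned char *)&p;
    for (size_t i = 0; i < sizeof p; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return members_are_zero(&p);
}

/* Runs both checks; aborts if either approach leaves a member non-zero. */
void check_both(void)
{
    assert(initializer_zeroes_members());
    assert(memset_zeroes_bytes_and_members());
}
```

Note that the byte-by-byte loop is only meaningful after memset; running it on the initializer version would be inspecting padding whose value is unspecified.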

Is either of these differences relevant or required behavior when setting these structs to zero, so that using an initializer instead would be wrong? If so, why, and which standard or other source verifies this?

If both are correct, why does memset/bzero tend to appear instead of an initializer? Is it just a matter of style? If so, that's fine, I don't think we need a subjective answer on which is better style.

The usual practice is to use an initializer in preference to memset precisely because all bits zero is not usually desired and instead we want the correct representation of zero for the type(s). Is the opposite true for these socket related structs?

In my research I found that POSIX only seems to require sockaddr_in6 (and not sockaddr_in) to be zeroed at http://www.opengroup.org/onlinepubs/000095399/basedefs/netinet/in.h.html but makes no mention of how it should be zeroed (memset or initializer?). I realise BSD sockets predate POSIX and it is not the only standard, so are there compatibility considerations for legacy systems or modern non-POSIX systems?

Personally, I prefer from a style (and perhaps good practice) point of view to use an initializer and avoid memset entirely, but I am reluctant because:

  • Other source code and semi-canonical texts like UNIX Network Programming use bzero (eg. page 101 on 2nd ed. and page 124 in 3rd ed. (I own both)).
  • I am well aware that they are not identical, for reasons stated above.
Chris Young asked May 21 '09 18:05


1 Answer

One problem with the partial initializers approach (that is '{ 0 }') is that GCC will warn you that the initializer is incomplete (if the warning level is high enough; I usually use '-Wall' and often '-Wextra'). With the designated initializer approach, that warning should not be given, but C99 is still not widely used - though these parts are fairly widely available, except, perhaps, in the world of Microsoft.

I used to favour this approach:

static const struct sockaddr_in zero_sockaddr_in; 

Followed by:

struct sockaddr_in foo = zero_sockaddr_in; 

The omission of the initializer in the static constant means everything is zero - but the compiler won't witter (shouldn't witter). The assignment uses the compiler's innate memory copy which won't be slower than a function call unless the compiler is seriously deficient.
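Put together, the zero-constant approach could be sketched as follows. The helper make_ipv4_any and its choice of INADDR_ANY are assumptions added for illustration; only zero_sockaddr_in and the structure assignment come from the answer above.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* No initializer: static storage duration zero-initializes every member,
   and the compiler has no partial initializer to warn about. */
static const struct sockaddr_in zero_sockaddr_in;

/* Hypothetical helper: build an IPv4 wildcard address for the given port. */
struct sockaddr_in make_ipv4_any(unsigned short port)
{
    /* Structure copy from the zeroed constant instead of memset or { 0 }. */
    struct sockaddr_in sa = zero_sockaddr_in;
    assert(sa.sin_port == 0 && sa.sin_addr.s_addr == 0);

    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    sa.sin_port = htons(port);
    return sa;
}
```

This relies on C's rules for objects with static storage duration; the same declaration without an initializer is not valid C++, so the trick is C-only.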


GCC has changed over time

GCC versions 4.4.2 to 4.6.0 generate different warnings from GCC 4.7.1. Specifically, GCC 4.7.1 recognizes the = { 0 } initializer as a 'special case' and doesn't complain, whereas GCC 4.6.0 etc did complain.

Consider file init.c:

struct xyz
{
    int x;
    int y;
    int z;
};

struct xyz xyz0;                // No explicit initializer; no warning
struct xyz xyz1 = { 0 };        // Shorthand, recognized by 4.7.1 but not 4.6.0
struct xyz xyz2 = { 0, 0 };     // Missing an initializer; always a warning
struct xyz xyz3 = { 0, 0, 0 };  // Fully initialized; no warning

When compiled with GCC 4.4.2 (on Mac OS X), the warnings are:

$ /usr/gcc/v4.4.2/bin/gcc -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9: warning: missing initializer
init.c:9: warning: (near initialization for ‘xyz1.y’)
init.c:10: warning: missing initializer
init.c:10: warning: (near initialization for ‘xyz2.z’)
$

When compiled with GCC 4.5.1, the warnings are:

$ /usr/gcc/v4.5.1/bin/gcc -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9:8: warning: missing initializer
init.c:9:8: warning: (near initialization for ‘xyz1.y’)
init.c:10:8: warning: missing initializer
init.c:10:8: warning: (near initialization for ‘xyz2.z’)
$

When compiled with GCC 4.6.0, the warnings are:

$ /usr/gcc/v4.6.0/bin/gcc -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9:8: warning: missing initializer [-Wmissing-field-initializers]
init.c:9:8: warning: (near initialization for ‘xyz1.y’) [-Wmissing-field-initializers]
init.c:10:8: warning: missing initializer [-Wmissing-field-initializers]
init.c:10:8: warning: (near initialization for ‘xyz2.z’) [-Wmissing-field-initializers]
$

When compiled with GCC 4.7.1, the warnings are:

$ /usr/gcc/v4.7.1/bin/gcc -O3 -g -std=c99 -Wall -Wextra  -c init.c
init.c:10:8: warning: missing initializer [-Wmissing-field-initializers]
init.c:10:8: warning: (near initialization for ‘xyz2.z’) [-Wmissing-field-initializers]
$

The compilers above were compiled by me. The Apple-provided compilers are nominally GCC 4.2.1 and Clang:

$ /usr/bin/clang -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9:23: warning: missing field 'y' initializer [-Wmissing-field-initializers]
struct xyz xyz1 = { 0 };
                      ^
init.c:10:26: warning: missing field 'z' initializer [-Wmissing-field-initializers]
struct xyz xyz2 = { 0, 0 };
                         ^
2 warnings generated.
$ clang --version
Apple clang version 4.1 (tags/Apple/clang-421.11.65) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin11.4.2
Thread model: posix
$ /usr/bin/gcc -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9: warning: missing initializer
init.c:9: warning: (near initialization for ‘xyz1.y’)
init.c:10: warning: missing initializer
init.c:10: warning: (near initialization for ‘xyz2.z’)
$ /usr/bin/gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$

As noted by SecurityMatt in a comment below, the advantage of memset() over copying a structure from memory is that the copy from memory is more expensive, requiring access to two memory locations (source and destination) instead of just one. By comparison, setting the values to zeroes doesn't have to access the memory for source, and on modern systems, the memory is a bottleneck. So, memset() coding should be faster than copy for simple initializers (where the same value, normally all zero bytes, is being placed in the target memory). If the initializers are a complex mix of values (not all zero bytes), then the balance may be changed in favour of an initializer, for notational compactness and reliability if nothing else.

There isn't a single cut and dried answer...there probably never was, and there isn't now. I still tend to use initializers, but memset() is often a valid alternative.

Jonathan Leffler answered Oct 12 '22 12:10