Calling a free() wrapper: dereferencing type-punned pointer will break strict-aliasing rules

I've tried to read up on the other questions here on SO with similar titles, but they are all a tiny bit too complex for me to be able to apply the solution (or even explanation) to my own issue, which seems to be of a simpler nature.

In my case, I have a wrapper around free() which sets the pointer to NULL after freeing it:

void myfree(void **ptr)
{
    free(*ptr);
    *ptr = NULL;
}

In the project I'm working on, it is called like this:

myfree((void **)&a);

This makes gcc (4.2.1 on OpenBSD) emit the warning "dereferencing type-punned pointer will break strict-aliasing rules" if I crank up the optimization level to -O3 and add -Wall (not otherwise).

Calling myfree() the following way does not make the compiler emit that warning:

myfree((void *)&a);

And so I wonder if we ought to change the way we call myfree() to this instead.

I believe that I'm invoking undefined behaviour with the first way of calling myfree(), but I haven't been able to wrap my head around why. Also, of all the compilers I have access to (clang and gcc), on all systems (OpenBSD, Mac OS X and Linux), this is the only compiler and system combination that actually gives me that warning (and I know that emitting warnings is optional).

Printing the value of the pointer before, inside and after the call to myfree(), with both ways of calling it, gives me identical results (but that may not mean anything if it's undefined behaviour):

#include <stdio.h>
#include <stdlib.h>

void myfree(void **ptr)
{
    printf("(in myfree) ptr = %p\n", *ptr);
    free(*ptr);
    *ptr = NULL;
}

int main(void)
{
    int *a, *b;

    a = malloc(100 * sizeof *a);
    b = malloc(100 * sizeof *b);

    printf("(before myfree) a = %p\n", (void *)a);
    printf("(before myfree) b = %p\n", (void *)b);

    myfree((void **)&a);  /* line 21 */
    myfree((void *)&b);

    printf("(after myfree) a = %p\n", (void *)a);
    printf("(after myfree) b = %p\n", (void *)b);

    return EXIT_SUCCESS;
}

Compiling and running it:

$ cc -O3 -Wall free-test.c
free-test.c: In function 'main':
free-test.c:21: warning: dereferencing type-punned pointer will break strict-aliasing rules

$ ./a.out
(before myfree) a = 0x15f8fcf1d600
(before myfree) b = 0x15f876b27200
(in myfree) ptr = 0x15f8fcf1d600
(in myfree) ptr = 0x15f876b27200
(after myfree) a = 0x0
(after myfree) b = 0x0

I'd like to understand what is wrong with the first call to myfree() and I'd like to know if the second call is correct. Thanks.

Kusalananda asked Jul 25 '16 13:07




3 Answers

Since a is an int* and not a void*, &a is a pointer to an int*, and it cannot validly be used as a pointer to a void*. (Suppose void* were wider than a pointer to an integer, something which the C standard allows; storing a void* through such a pointer would then write past the end of a.) As a result, neither of your alternatives -- myfree((void**)&a) and myfree((void*)&a) -- is correct. (Casting to void* does not trigger the strict-aliasing warning, but it still leads to undefined behaviour.)
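For comparison, here is a minimal sketch of a call that stays within the rules while keeping the void ** wrapper from the question: copy the pointer into a real void * object, pass that object's address, and copy the result back. The helper name free_ints is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Same wrapper as in the question. */
void myfree(void **ptr)
{
    free(*ptr);
    *ptr = NULL;
}

/* Hypothetical helper: frees an int buffer through a genuine void * object. */
int *free_ints(int *a)
{
    void *tmp = a;        /* int * to void * is a well-defined conversion */

    myfree(&tmp);         /* &tmp really is a void **, so no cast is needed */
    assert(tmp == NULL);  /* myfree() reset the temporary, not a itself */
    return tmp;           /* void * back to int *; the caller stores the NULL */
}
```

A caller would write a = free_ints(a);. This is clumsier than the original call, which is part of why returning the pointer from the wrapper tends to be the more practical choice.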

A better solution (imho) is to force the user to insert a visible assignment:

void* myfree(void* p) {
    free(p);
    return 0;
}

a = myfree(a);

With clang and gcc, you can use an attribute to indicate that the return value of myfree must be used, so that the compiler will warn you if you forget the assignment. Or you could use a macro:

#define myfree(a) ((a) = myfree(a))
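As a rough sketch of the attribute route, assuming gcc or clang (warn_unused_result is a compiler extension, not standard C), the wrapper could be declared as follows. The example() function is only a hypothetical caller.

```c
#include <assert.h>
#include <stdlib.h>

/* gcc/clang extension: warn when a caller discards the return value. */
__attribute__((warn_unused_result))
void *myfree(void *p)
{
    free(p);
    return NULL;
}

/* Hypothetical caller showing the intended pattern. */
void example(void)
{
    int *a = malloc(100 * sizeof *a);

    a = myfree(a);      /* fine: the result is assigned back */
    assert(a == NULL);
    /* A bare "myfree(a);" here would be flagged by the compiler. */
}
```

The attribute only produces a diagnostic; it does not stop a caller from assigning the result to the wrong variable.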

rici answered Sep 27 '22 19:09


Here's a suggestion that:

  1. Does not violate the strict aliasing rule.
  2. Makes the call more natural.

void* myfree(void *ptr)
{
    free(ptr);
    return NULL;
}

#define MYFREE(ptr) ((ptr) = myfree(ptr))

You can use the macro simply as:

int* a = malloc(sizeof(int)*10);

...

MYFREE(a);

R Sahu answered Sep 27 '22 18:09


There are basically a few ways to have a function work with and modify a pointer in a fashion agnostic to the pointer's target type:

  1. Pass the pointer into the function as void* and return it as void*, applying appropriate conversions in both directions at the call site. This approach has the disadvantage of tying up the function's return value, precluding its use for other purposes, and also precludes the possibility of performing the pointer update within a lock.

  2. Pass a pointer to a function which accepts two void*, casts one of them into a pointer of the appropriate type and the other to a double-indirect pointer of that type, and possibly a second function that can read a passed-in pointer as a void*, and use those functions to read and write the pointer in question. This should be 100% portable, but is likely to be very inefficient.

  3. Use pointer variables and fields of type void* elsewhere and cast them to real pointer types whenever they're actually used, thus allowing pointers of type void** to be used to modify the pointer variables.

  4. Use memcpy to read or modify pointers of unknown type, given double-indirect pointers which identify them.

  5. Document that code is written in a dialect of C, popular in the 1990s, which treated "void**" as a double-indirect pointer to any type, and use compilers and/or settings that support that dialect. The C Standard allows for implementations to use different representations for pointers to things of different types, and because those implementations couldn't support a universal double-indirect pointer type, and because implementations which could easily allow void** to be used that way already did so before the Standard was written, there was no perceived need for the Standard to describe that behavior.
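To make option 4 concrete, here is a minimal sketch using memcpy. The name myfree_any is invented for illustration, and the code assumes that every pointer type passed to it has the same size and representation as void*, which the C Standard does not guarantee. It avoids the aliasing problem because memcpy accesses the caller's pointer as bytes rather than through an lvalue of type void*.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical wrapper. pp must be the address of an object pointer variable.
 * Assumption: that variable has the same size and representation as void *.
 */
void myfree_any(void *pp)
{
    void *p;
    void *null = NULL;

    memcpy(&p, pp, sizeof p);        /* read the caller's pointer as bytes */
    free(p);
    memcpy(pp, &null, sizeof null);  /* overwrite it with a null pointer */
}

/* Hypothetical caller. */
void example(void)
{
    int *a = malloc(100 * sizeof *a);

    myfree_any(&a);   /* int ** converts to void * without a cast */
    assert(a == NULL);
}
```

Because the parameter is a plain void*, the compiler will also accept myfree_any(a) by mistake, so this approach trades one hazard for another.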

The ability to have a universal double-indirect pointer type was and is extremely useful on the 90%+ of implementations that could (and did) readily support it, and the authors of the Standard certainly knew that, but the authors were far less interested in describing behaviors that sensible compiler writers would support anyway, than in mandating behaviors which would be on the whole beneficial even on platforms where they could not be cheaply supported (e.g. mandating that even on a platform whose unsigned math instructions wrap mod 65535, a compiler must generate whatever code is needed to make calculations wrap mod 65536). I'm not sure why modern compiler writers fail to recognize that.

Perhaps if programmers start overtly writing for sane dialects of C, the maintainers of standards might recognize that such dialects have value. [Note that from an aliasing perspective, treating void** as a universal double-indirect pointer will have far less severe performance costs than forcing programmers to use any of the alternatives 2-4 above; any claims by compiler writers that treating void** as a universal double-indirect pointer would kill performance should thus be treated skeptically].

supercat answered Sep 27 '22 17:09