I came across a situation where it would be useful to have unnecessary calls to realloc optimized out. However, it seems that neither Clang nor GCC does such a thing (see Compiler Explorer, godbolt.org), although I do see optimizations being made with multiple calls to malloc.
The example:
void *myfunc() {
    void *data;
    data = malloc(100);
    data = realloc(data, 200);
    return data;
}
I expected it to be optimized to something like the following:
void *myfunc() {
    return malloc(200);
}
Why does neither Clang nor GCC optimize it out? Are they not allowed to do so?
Compiler optimization is generally implemented using a sequence of optimizing transformations, algorithms which take a program and transform it to produce a semantically equivalent output program that uses fewer resources or executes faster.
Compilers are free to optimize code so long as they can guarantee the semantics of the code are not changed.
At -O2, GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. As compared to -O, this option increases both compilation time and the performance of the generated code.
The degree to which the compiler optimizes the code it generates is controlled by the -O flag. In the absence of any version of the -O flag, the compiler generates straightforward code with no instruction reordering or other attempt at performance improvement; optimization is enabled with -O, -O2, and higher levels.
Are they not allowed to do so?
Maybe, but the optimization may have been left undone in this case because of corner-case functional differences.
If only 150 bytes of allocatable memory remain,

    data = malloc(100);
    data = realloc(data, 200);

returns NULL, with 100 bytes consumed (and leaked, since the only pointer to them was overwritten) and 50 remaining, whereas

    data = malloc(200);

returns NULL with 0 bytes consumed (none leaked) and 150 remaining.

Different functionality in this narrow case may prevent the optimization.
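The leak described above comes from assigning realloc's result straight back to the only pointer to the block. A minimal sketch of the usual defensive pattern (the helper name is my own, not from the question) keeps the old pointer alive until the resize is known to have succeeded:

```c
#include <stdlib.h>

/* Hypothetical helper: like myfunc() from the question, but it does
 * not lose the original 100-byte block when realloc fails. */
void *myfunc_safe(void) {
    void *data = malloc(100);
    if (data == NULL)
        return NULL;

    /* Writing `data = realloc(data, 200);` would leak the 100-byte
     * block on failure: realloc returns NULL, the old block stays
     * allocated, and the only pointer to it has been overwritten. */
    void *grown = realloc(data, 200);
    if (grown == NULL) {
        free(data);   /* the original block can still be released */
        return NULL;
    }
    return grown;     /* 200 usable bytes on success */
}
```

On failure this variant consumes 0 bytes on return, matching the behavior of a single malloc(200) more closely than the original code does.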
Are compilers allowed to optimize-out realloc?
Perhaps - I would expect it is allowed. Yet it may not be worth the effort to enhance the compiler to determine when it can.
A successful

    malloc(n);
    ...
    realloc(p, 2*n)

differs from

    malloc(2*n);

when the intervening code at ... may have set some of the memory. It might be beyond the compiler's design to ensure that ..., even if it is empty code, did not set any memory.
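To make the point concrete, here is an illustrative sketch (my own, not from the answer) of an intervening write: realloc must preserve the old contents, so before merging the pair into one malloc(200) the compiler would have to prove that the gap performs no observable use of the first allocation (no writes that could be lost, no escapes of the pointer):

```c
#include <stdlib.h>
#include <string.h>

/* Writes into the first allocation between the malloc and the
 * realloc; the resized block must still contain those bytes. */
char *grow_with_contents(void) {
    char *p = malloc(100);
    if (p == NULL)
        return NULL;

    strcpy(p, "hello");           /* intervening write */

    char *q = realloc(p, 200);    /* must still hold "hello" */
    if (q == NULL) {
        free(p);
        return NULL;
    }
    return q;
}
```

In this simple case a merged malloc(200) followed by the same strcpy would behave identically, but proving that for arbitrary intervening code is exactly the analysis burden the answer is pointing at.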
A compiler which bundles its own self-contained versions of malloc/calloc/free/realloc could legitimately perform the indicated optimization if the authors thought doing so was worth the effort. A compiler that chains to externally-supplied functions could still perform such optimizations if it documented that it did not regard the precise sequence of calls to such functions as an observable side-effect, but such treatment could be a bit more tenuous.
If no storage is allocated or deallocated between the malloc() and realloc(), the size of the realloc() is known when the malloc() is performed, and the realloc() size is larger than the malloc() size, then it may make sense to consolidate the malloc() and realloc() operations into a single larger allocation. If the state of memory could change in the interim, however, then such an optimization might cause the failure of operations that should have succeeded. For example, given the sequence:
void *p1 = malloc(2000000000);
void *p2 = malloc(2);
free(p1);
p2 = realloc(p2, 2000000000);
a system might not have 2000000000 bytes available for p2 until after p1 is freed. If it were to change the code to:
void *p1 = malloc(2000000000);
void *p2 = malloc(2000000000);
free(p1);
that would result in the allocation of p2 failing. Because the Standard never guarantees that allocation requests will succeed, such behavior would not be non-conforming. On the other hand, the following would also be a "conforming" implementation:
void *malloc(size_t size) { return 0; }
void *calloc(size_t nmemb, size_t size) { return 0; }
void free(void *p) { }
void *realloc(void *p, size_t size) { return 0; }
Such an implementation might arguably be regarded as more "efficient" than most others, but one would have to be rather obtuse to regard it as being very useful except, perhaps, in rare situations where the above functions are called on code paths that are never executed.
I think the Standard would clearly allow the optimization, at least in cases that are as simple as those in the original question. Even in cases where it might cause operations to fail that could otherwise have succeeded, the Standard would still allow it. Most likely, the reason that many compilers don't perform the optimization is that the authors didn't think the benefits would be sufficient to justify the effort required to identify cases where it would be safe and useful.