I have a function to read a file using the read()
system call and return a char
pointer with the data read from the file. The function reallocates space if necessary. After a specific point the read fails with the error "Bad Address". The minimum code which fails is show below:
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
const unsigned BUFSIZE = 8192;
typedef struct
{
char* buffer;
long size;
} string_t;
string_t read_file(const char* path)
{
string_t error = { .buffer = NULL, .size = -1 };
int fd = open(path, O_RDONLY);
if (fd == -1) {
perror("open() failed in read_file");
return error;
}
string_t s;
s.buffer = malloc(BUFSIZE * sizeof(char));
s.size = 0;
int nread = 0;
long total_read = 0;
while ((nread = read(fd, s.buffer + total_read, BUFSIZE)) != 0) {
if (nread == -1) {
if (errno == EINTR) {
perror("error EINTR");
continue;
} else {
perror("read() failed in read_file");
close(fd);
return error;
}
} else {
printf("%ld %ld %d\n", total_read, s.size, nread);
total_read += nread;
s.size = total_read;
if (nread == BUFSIZE) {
if (realloc(s.buffer, s.size + BUFSIZE) == NULL) {
perror("out of memory...");
close(fd);
return error;
}
}
}
}
close(fd);
s.buffer[s.size] = 0;
return s;
}
int main()
{
const char* path = "/usr/share/dict/cracklib-small";
string_t s = read_file(path);
if (s.size == -1) {
printf("error\n");
return 1;
}
printf("%s\n", s.buffer);
free(s.buffer);
return 0;
}
Running this gives the following:
0 0 8192
8192 8192 8192
16384 16384 8192
24576 24576 8192
32768 32768 8192
40960 40960 8192
49152 49152 8192
57344 57344 8192
65536 65536 8192
73728 73728 8192
81920 81920 8192
90112 90112 8192
98304 98304 8192
106496 106496 8192
114688 114688 8192
122880 122880 8192
131072 131072 8192
read() failed in read_file: Bad address
error
Valgrind shows:
==4299== Syscall param read(buf) points to unaddressable byte(s)
==4299== at 0x5184F00: __read_nocancel (in /usr/lib/libc-2.21.so)
==4299== by 0x400A58: read_file (file_helpers.c:31)
==4299== by 0x400AA3: main (file_helpers.c:64)
==4299== Address 0x7568040 is 0 bytes after a block of size 8,192 free'd
==4299== at 0x4C2C29E: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4299== by 0x400A12: read_file (file_helpers.c:46)
==4299== by 0x400AA3: main (file_helpers.c:64)
==4299==
I can see it complains about the realloc but I do not understand what is causing the bad address error. Is the buffer to which read()
writes getting corrupt somehow after the last realloc()
?
wc -c
reports the file on which the function is run to have 477238 bytes.
Your usage of realloc()
looks wrong at first sight.
if (realloc(s.buffer, s.size + BUFSIZE) == NULL)
This statement checks if realloc()
is successfuil or not. If failure, this handles the case. Fine.
What if realloc()
is successful?
As per the man page,
The
realloc()
function returns a pointer to the newly allocated memory, which is suitably aligned for any kind of variable and may be different fromptr
, or NULL if the request fails. Ifsize
was equal to0
, eitherNULL
or a pointer suitable to be passed tofree()
is returned. Ifrealloc()
fails the original block is left untouched; it is not freed or moved.
That means, you're losing the newly allocated memory, and using the free()
d memory afterwards.
I guess, you'd have expected that realloc()
will resize the s.buffer
itself, but, I'm afraid, that is not the case here.
Solution:
You should collect the return value of realloc()
in a temporary variable, check against NULL
, and if not NULL
, assign the same back to the pointer s.buffer
.
FWIW, don't use the original pointer itself to collect the return value of realloc()
, in case if it fails, you'll lose the actual memory also.
The issue might be with how you are using realloc
. realloc(s.buffer, s.size + BUFSIZE) == NULL
is not the correct way to use realloc
. The return value of realloc
taken from here is:
Upon successful completion with a size not equal to 0, realloc() shall return a pointer to the (possibly moved) allocated space. If size is 0, either a null pointer or a unique pointer that can be successfully passed to free() shall be returned. If there is not enough available memory, realloc() shall return a null pointer and set errno to ENOMEM.
The key part is the fact that realloc
can move the allocated space and its data. So you cannot just check whether the return value is NULL. You want to do something more along the lines of:
void *tmp = realloc(s.buffer, s.size + BUFSIZE);
if (tmp == NULL) {
perror("out of memory...");
free(s.buffer); // Release the memory to avoid a leak
close(fd);
return error;
}
s.buffer = tmp;
In other words update the pointer with the return value of realloc
. If the data was not moved, realloc
returns the memory addressed passed into it; if it was moved, realloc
will return a new address.
Update:
Another issue that you probably aren't experiencing is how you handle the return value of read
. If read
returned less than was requested than you won't realloc
more memory and further reads might read past the buffer. You might not experiencing this right now as it will only show if read
doesn't fail and doesn't read less than the requested amount of data. A solution, which includes the realloc
fix is below:
int nread = 0;
long total_read = 0;
int space_remaining = BUFSIZE;
while ((nread = read(fd, s.buffer + total_read, space_remaining)) != 0) {
if (nread == -1) {
if (errno == EINTR) {
perror("error EINTR");
continue;
} else {
perror("read() failed in read_file");
close(fd);
return error;
}
} else {
printf("%ld %ld %d\n", total_read, s.size, nread);
total_read += nread;
s.size = total_read;
space_remaining -= nread;
if (space_remaining == 0) {
void *tmp = realloc(s.buffer, s.size + BUFSIZE);
if (tmp == NULL) {
perror("out of memory...");
free(s.buffer); // Release the memory to avoid a leak
close(fd);
return error;
}
s.buffer = tmp;
space_remaining = BUFSIZE;
}
}
}
The variable space_remaining
is used to keep track how much space is left in the buffer. This is the amount read and is reset when the buffer's size is increased. Since you are realloc
-ing more space you do not want to do the typical (BUFSIZE-total_read)
pattern that has previously been suggested, although this is the typical pattern one sees.
Again you won't see this problem if read
always returns the requested amount of data.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With