Here is the description of my problem:
I want to read a big file, about 6.3GB, entirely into memory using the read
system call in C, but an error occurs.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <limits.h>
int main(int argc, char* argv[]) {
    if (argc < 2)
        return 1;
    int _fd = open(argv[1], O_RDONLY, (mode_t) 0400);
    if (_fd == -1)
        return 1;
    off_t size = lseek(_fd, 0, SEEK_END);
    /* off_t is not necessarily long long, so cast to match %lld */
    printf("total size: %lld\n", (long long) size);
    lseek(_fd, 0, SEEK_SET);
    char *buffer = malloc(size);
    assert(buffer);
    ssize_t ret = read(_fd, buffer, size);
    if (ret != size) {
        printf("read fail, %lld, reason:%s\n", (long long) ret, strerror(errno));
        printf("int max: %d\n", INT_MAX);
    }
    free(buffer);
    close(_fd);
    return 0;
}
And compile it with:
gcc read_test.c
then run with:
./a.out bigfile
output:
total size: 6685526352
read fail, 2147479552, reason:Success
int max: 2147483647
The system environment is
3.10.0_1-0-0-8 #1 SMP Thu Oct 29 13:04:32 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
There are two things I don't understand:
errno is not correctly set.

The read system call can return a smaller number than the requested size for multiple reasons. A positive, nonzero return value is not an error; errno is not set in this case and its value is indeterminate. On Linux in particular, a single read call transfers at most 0x7ffff000 (2,147,479,552) bytes, which is exactly the count in your output. You should keep reading in a loop until read returns 0 for end of file or -1 for an error. It is a very common bug to rely on read to read a complete block in a single call, even from regular files. Use fread for simpler semantics.
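The loop described above could be sketched roughly as follows. The helper name read_full is hypothetical, not a standard function, and the sketch assumes the requested count fits in ssize_t (true for the 6.3GB file here on 64-bit Linux). It keeps calling read until the buffer is full, end of file is reached, or an error occurs, and it retries when a signal interrupts the call.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `count` bytes from `fd` into `buf`,
 * calling read() again after every short read.
 * Returns the number of bytes read (less than `count` only if end of
 * file was reached first), or -1 on error with errno set by read().
 * Assumes count <= SSIZE_MAX so the total fits in the return type. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n == 0)
            break;              /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error: errno describes it */
        }
        done += (size_t)n;      /* short read: keep going */
    }
    return (ssize_t)done;
}
```

In the program from the question, the single read call would become read_full(_fd, buffer, size), and errno would be inspected only when the result is -1.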
You print the value of INT_MAX, which is irrelevant to your issue. The sizes of off_t and size_t are the interesting ones. On your platform, 64-bit GNU/Linux, you are lucky that both off_t and size_t are 64 bits wide. ssize_t has the same size as size_t by definition. On other 64-bit platforms, off_t might be smaller than size_t, preventing correct assessment of the file size, or size_t might be smaller than off_t, letting malloc allocate a block smaller than the file size. Note that in this case, read will be passed the same smaller size because size would be silently truncated in both calls.
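One way to guard against the silent truncation described above is to check the off_t value before it is converted to size_t. This is only a sketch: alloc_for_file is a hypothetical helper name, and returning NULL for every failure is an arbitrary choice made for brevity.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: allocate a buffer for a file of `size` bytes,
 * refusing sizes that would not survive the off_t -> size_t conversion.
 * On success, stores the converted length in *out_len and returns the
 * buffer (to be released with free()); returns NULL otherwise. */
void *alloc_for_file(off_t size, size_t *out_len)
{
    if (size < 0)
        return NULL;                    /* e.g. lseek() returned -1 */
    if ((uintmax_t)size > SIZE_MAX)
        return NULL;                    /* malloc(size) would truncate */
    *out_len = (size_t)size;
    /* Ask for at least one byte so an empty file does not yield the
     * implementation-defined result of malloc(0). */
    return malloc(*out_len ? *out_len : 1);
}
```

On 64-bit GNU/Linux the second check can never fail, since both types are 64 bits wide; it matters only on platforms where size_t is narrower than off_t.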