
pread for very large files

I am reading a large file using pread as follows:

ssize_t s = pread(fd, buff, count, offset);
if (s != (ssize_t) count)
  fprintf(stderr, "s = %ld != count = %ld\n", s, count);
assert(s == (ssize_t ) count);

The above code works fine for smaller files (up to 1.5GB). For larger files, however, the number of bytes returned differs from the requested count.

In particular, for a 2.4GB file, count is set to 2520133890 and the assertion fails, with the fprintf printing:

s = 2147479552 != count = 2520133890

What makes this puzzling is that I am working on a 64-bit system and hence, sizeof(ssize_t) = 8.

What is the cause of this failure and how do I resolve this so that I can read the whole file in one go?

John Elaine asked Feb 07 '23 07:02



1 Answer

It looks like you are on Linux: the value returned by pread, 2147479552, is 0x7ffff000, so the answer is in man 2 read:

On Linux, read() (and similar system calls) will transfer at most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes actually transferred. (This is true on both 32-bit and 64-bit systems.)

So you need to call pread at least twice to get all of your data. This restriction is unrelated to _FILE_OFFSET_BITS=64, O_LARGEFILE, sizeof(off_t) and the like; it is imposed by rw_verify_area in the Linux kernel:

/*
 * rw_verify_area doesn't like huge counts. We limit
 * them to something that fits in "int" so that others
 * won't have to do range checks all the time.
 */
int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t count)
...
return count > MAX_RW_COUNT ? MAX_RW_COUNT : count;

#define MAX_RW_COUNT (INT_MAX & PAGE_CACHE_MASK)
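
Because of this cap, a single call cannot return the whole 2520133890 bytes; the usual workaround is to call pread in a loop until the requested count has been transferred. Below is a minimal sketch of that idea. The helper name pread_full is hypothetical (it is not part of libc or of the original post), and the sketch assumes count does not exceed SSIZE_MAX so the total fits in the return type.

```c
#define _POSIX_C_SOURCE 200809L  /* expose pread() under strict -std= modes */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original post): try to read `count`
 * bytes from `fd` starting at `offset`, looping because a single pread()
 * may transfer fewer bytes than requested (e.g. the Linux 0x7ffff000 cap).
 * Returns the number of bytes read (less than `count` only if end of file
 * is reached first), or -1 with errno set on error.
 * Assumption: count <= SSIZE_MAX.
 */
ssize_t pread_full(int fd, void *buf, size_t count, off_t offset)
{
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = pread(fd, p + done, count - done, offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* real error; errno was set by pread() */
        }
        if (n == 0)
            break;          /* end of file before `count` bytes were read */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

With the code from the question, replacing the pread call with pread_full(fd, buff, count, offset) should let the original assertion hold, since the loop simply issues a second pread for the bytes beyond the first 0x7ffff000.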

fghj answered Feb 15 '23 10:02
