 

c handle large file

Tags:

c

large-files

I need to parse a file that could be many gigabytes in size, and I would like to do this in C. Can anyone suggest methods to accomplish this?

The file that I need to open and parse is a hard drive dump taken from my Mac's hard drive. However, I plan on running my program on 64-bit Ubuntu 10.04. Given the large file size, the more optimized the method, the better.

romejoe asked Sep 11 '10 02:09



2 Answers

On both *nix and Windows, the I/O routines that deal with file sizes and offsets have extensions that support files larger than 2 GB or 4 GB. Naturally, the underlying file system must also support a file that large. On Windows, for instance, NTFS does, but FAT32 caps a single file at just under 4 GB. This is generally known as "large file support".

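The two routines that matter most here are fseek() and ftell(), which give you random access to the whole file. Both take or return the offset as a long, which is 64 bits on 64-bit Linux but only 32 bits on 32-bit *nix builds and on Windows. In those cases the large-file variants are fseeko() and ftello() on POSIX systems (with _FILE_OFFSET_BITS defined as 64) and _fseeki64() and _ftelli64() in Microsoft's C runtime. Otherwise, the ordinary fopen(), fread(), and friends can read a file of any size sequentially, as long as the underlying OS and stdio implementation support large files.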
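As a rough illustration of the random-access case, here is a minimal sketch using the POSIX fseeko() and ftello() calls. The helper names file_size and read_at are made up for this example, and the feature-test macros at the top assume a glibc-style toolchain such as the one on Ubuntu.

```c
/* Both macros must appear before any #include.
 * _FILE_OFFSET_BITS=64 makes off_t 64 bits wide even on 32-bit builds. */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: size of the file in bytes, or -1 on error. */
off_t file_size(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    off_t size = -1;
    if (fseeko(fp, 0, SEEK_END) == 0)
        size = ftello(fp);      /* off_t, so offsets past 2 GB are fine */

    fclose(fp);
    return size;
}

/* Hypothetical helper: read up to len bytes starting at offset.
 * Returns the number of bytes actually read (0 on seek failure). */
size_t read_at(FILE *fp, off_t offset, void *buf, size_t len)
{
    assert(fp != NULL && buf != NULL);

    if (fseeko(fp, offset, SEEK_SET) != 0)
        return 0;
    return fread(buf, 1, len, fp);
}
```

On 64-bit Linux the macros are not strictly needed, since long and off_t are already 64 bits there, but keeping them means the same source also works if it is ever built for a 32-bit target.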

RBerteig answered Oct 02 '22 18:10



Assuming you're on a 64-bit Linux/BSD/Mac/not-Windows system (and seriously, who isn't these days?), mmap performs extremely well. It essentially lets you map a whole file into a process's address space and lets the kernel handle caching and paging for you.
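To make the idea concrete, here is a minimal sketch that maps a file read-only and scans it as if it were one big in-memory array. The function name count_byte is invented for this example, and it assumes the path refers to a regular file (such as a disk image saved to a file), since fstat() does not report a useful size for raw block devices.

```c
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count how often `needle` occurs in the file.
 * Returns -1 if the file cannot be opened, stat'ed, or mapped. */
long long count_byte(const char *path, unsigned char needle)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap() rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const unsigned char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (data == MAP_FAILED)
        return -1;

    long long count = 0;
    for (size_t i = 0; i < len; i++) {
        if (data[i] == needle)  /* pages are faulted in on first touch */
            count++;
    }

    munmap((void *)data, len);
    return count;
}
```

Nothing is read up front here: the kernel pages data in as the loop touches it and is free to evict pages again under memory pressure, so the file can be far larger than physical RAM.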

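And if you MUST use Windows, the same concept exists there as file mapping (CreateFileMapping() and MapViewOfFile()), made by the friendly folks at Redmond. Note that for either of these, you will want to be running on a 64-bit system: a 32-bit process has at most 4 GB of address space, so that is the ABSOLUTE ceiling for a single mapping, and the practical limit is lower because the mapping shares that space with everything else in the process.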

Clark Gaebel answered Oct 02 '22 20:10
