
What is the best way to truncate the beginning of a file in C?

Tags:

c

There are many similar questions, but after googling around quite a bit I found nothing that answers this specifically. Here goes:

Say we have a file (could be binary, and much bigger too):

abcdefghijklmnopqrztuvwxyz

What is the best way in C to "move" the rightmost portion of this file to the left, truncating the beginning of the file? For example, "front truncating" 7 bytes would change the file on disk to:

hijklmnopqrztuvwxyz

I must avoid temporary files, and would prefer not to use a large buffer to read the whole file into memory. One possible method I thought of is to use fopen with the "rb+" flag and repeatedly fseek back and forth, reading and writing to copy bytes from the offset to the beginning, then call SetEndOfFile to truncate at the end. That seems like a lot of seeking (possibly inefficient).

Another way would be to fopen the same file twice, and use fgetc and fputc with the respective file pointers. Is this even possible?

If there are other ways, I'd love to read all of them.

snapfractalpop asked Dec 09 '11 23:12


3 Answers

You could mmap the file into memory and then memmove the contents. You would have to truncate the file separately.
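A minimal sketch of that idea, assuming a POSIX system. The helper name chop_front_mmap is made up for illustration, and the msync call is there only as a portability precaution:

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: drop the first n bytes of the file at path, in place.
 * Returns 0 on success, -1 on error. */
int chop_front_mmap(const char *path, off_t n)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (n < 0 || n > st.st_size) {
        close(fd);
        errno = EINVAL;
        return -1;
    }

    size_t size = (size_t)st.st_size;
    size_t keep = size - (size_t)n;

    /* Only map when there is something to move; mmap rejects length 0. */
    if (n > 0 && keep > 0) {
        char *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) {
            close(fd);
            return -1;
        }
        memmove(map, map + n, keep);          /* regions overlap, so memmove */
        int sync_rc = msync(map, size, MS_SYNC);
        munmap(map, size);
        if (sync_rc < 0) {
            close(fd);
            return -1;
        }
    }

    /* The separate truncation step: cut the now-duplicated tail off. */
    int rc = ftruncate(fd, (off_t)keep);
    close(fd);
    return rc;
}
```

Calling chop_front_mmap("data.bin", 7) would turn the 26-byte example into its last 19 bytes. The mapping consumes address space rather than a malloc'd buffer, but on a 32-bit system a very large file may still fail to map in one piece.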

Neil answered Oct 19 '22 10:10


You don't have to use an enormous buffer, and the kernel is going to do the hard work for you, but yes: reading a buffer-full from further along in the file and writing it nearer the beginning is the way to do it if you can't afford the simpler job of creating a new file, copying what you want into that file, and then copying the new (temporary) file over the old one. I wouldn't rule out the possibility that copying what you want to a new file, and then either moving the new file into the place of the old or copying the new over the old, will be faster than the shuffling process you describe. If the number of bytes to be removed were a disk block size, rather than 7 bytes, the situation might be different, but probably not. The only disadvantage is that the copying approach requires more intermediate disk space.

Your outlined approach will require the use of truncate() or ftruncate() to shorten the file to the proper length, assuming you are on a POSIX system. If you don't have truncate(), then you will need to do the copying.
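Here is one way the shuffle could look, as a sketch assuming POSIX. It uses pread() and pwrite(), which take explicit offsets, so there is no separate seek call between each read and write. The function name chop_front_shuffle and the 64 KiB buffer size are arbitrary choices for illustration:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: shift everything after the first n bytes down to
 * offset 0 using a fixed-size buffer, then shorten the file.
 * Returns 0 on success, -1 on error. */
int chop_front_shuffle(const char *path, off_t n)
{
    char buf[65536];                 /* modest buffer, size is arbitrary */
    int fd = open(path, O_RDWR);     /* no O_TRUNC: keep the contents */
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (n < 0 || n > st.st_size) {
        close(fd);
        errno = EINVAL;
        return -1;
    }

    off_t src = n;                   /* where the next chunk is read from */
    off_t dst = 0;                   /* where it is written to */
    for (;;) {
        ssize_t got = pread(fd, buf, sizeof buf, src);
        if (got < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        if (got == 0)                /* end of file */
            break;

        ssize_t done = 0;
        while (done < got) {         /* pwrite may write less than asked */
            ssize_t put = pwrite(fd, buf + done, (size_t)(got - done),
                                 dst + done);
            if (put < 0) {
                if (errno == EINTR)
                    continue;
                close(fd);
                return -1;
            }
            done += put;
        }
        src += got;
        dst += got;
    }

    int rc = ftruncate(fd, dst);     /* dst is now the new file length */
    close(fd);
    return rc;
}
```

Because the write offset always trails the read offset, a chunk is never overwritten before it has been read. If the process dies part-way through, the file is left half-shuffled, which is the price of not using a temporary file.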

Note that opening the file twice will work OK if you are careful not to clobber the file when opening for writing - using "r+b" mode with fopen(), or avoiding O_TRUNC with open().
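To illustrate the two-stream variant from the question, here is a rough sketch assuming POSIX (for fileno() and ftruncate()). The name chop_front_two_streams is invented for the example, and the long offsets limit it to files that fit in a long; fseeko()/ftello() would be the usual way around that:

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open the same file twice, read with one stream and
 * write with the other. Returns 0 on success, -1 on error. */
int chop_front_two_streams(const char *path, long n)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(path, "r+b");  /* "r+b" does not clobber the file */
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Reject n beyond the end of the file, then position the reader. */
    if (n < 0 || fseek(in, 0, SEEK_END) != 0 || ftell(in) < n ||
        fseek(in, n, SEEK_SET) != 0) {
        fclose(in);
        fclose(out);
        return -1;
    }

    long kept = 0;
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF) {
            fclose(in);
            fclose(out);
            return -1;
        }
        kept++;
    }
    if (ferror(in) || fflush(out) != 0) {
        fclose(in);
        fclose(out);
        return -1;
    }

    /* Flush first, then cut the file down to the bytes that were kept. */
    int rc = ftruncate(fileno(out), (off_t)kept);
    fclose(in);
    fclose(out);
    return rc;
}
```

The stdio buffers mean fgetc and fputc are not one system call per byte. The writer only ever touches offsets below the reader's position, so buffered writes do not overwrite data that has yet to be read.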

Jonathan Leffler answered Oct 19 '22 10:10


If you are using Linux, since kernel 3.15 you can use

#define _GNU_SOURCE
#include <fcntl.h>

int fallocate(int fd, int mode, off_t offset, off_t len);

with the FALLOC_FL_COLLAPSE_RANGE flag.

http://manpages.ubuntu.com/manpages/disco/en/man2/fallocate.2.html

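Note that not all file systems support it, but most modern ones such as ext4 and XFS do. On those file systems, offset and len must be multiples of the file system block size, so this call cannot by itself remove an arbitrary count such as 7 bytes.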
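A short sketch of how the call might be wrapped, assuming Linux with glibc. The wrapper name collapse_front is hypothetical. The kernel rejects the request with EINVAL when the range is not block-aligned or reaches the end of the file, and with EOPNOTSUPP when the file system lacks support, so the caller needs a fallback:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* needed for the fallocate() declaration */
#endif
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>        /* FALLOC_FL_COLLAPSE_RANGE */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: ask the file system to remove the first n bytes.
 * Returns 0 on success, -1 on error with errno set by fallocate()
 * (for example EINVAL or EOPNOTSUPP). */
int collapse_front(const char *path, off_t n)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Collapse the range [0, n): the data after it moves to offset 0 and
     * the file shrinks by n bytes, without copying in user space. */
    int rc = fallocate(fd, FALLOC_FL_COLLAPSE_RANGE, 0, n);

    int saved = errno;           /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return rc;
}
```

When this returns -1, falling back to one of the copy-and-truncate approaches is probably the sensible choice.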

Francesquini answered Oct 19 '22 09:10