This question came to mind when I was trying to solve this problem.
I have a hard drive with a capacity of 120 GB, of which 100 GB is occupied by a single huge file, so 20 GB is still free.
My question is: how can we split this huge file into smaller ones, say 1 GB each? If I had ~100 GB of free space, this would probably be possible with a simple copy loop. But with only 20 GB free, we can write at most 20 one-gigabyte files before running out of space. I have no idea how to delete content from the big file while reading from it.
Any solution?
It seems I have to truncate the file by 1 GB each time I finish writing one piece, but that boils down to this question:
Is it possible to truncate a part of a file? How exactly?
I would like to see an algorithm (or an outline of one) that works in C or C++ (preferably standard C and C++), so I can understand the lower-level details. I am not looking for a magic function, script, or command that does the job for me.
According to this question (Partially truncating a stream), on a POSIX-compliant system you should be able to use a call to int ftruncate(int fildes, off_t length) to resize an existing file.
Modern implementations will probably resize the file "in place" (though this is unspecified in the documentation). The only gotcha is that you may have to do some extra work to ensure that off_t is a 64-bit type (the POSIX standard allows 32-bit off_t types).
You should take steps to handle error conditions in case a call fails for some reason, since any unhandled failure could result in the loss of your 100 GB file.
Pseudocode (assume, and take steps to ensure, all data types are large enough to avoid overflow):
open (string filename)              // opens a file, returns a file descriptor
file_size (descriptor file)         // returns the absolute size of the specified file
seek (descriptor file, position p)  // moves the file offset to the specified absolute position
copy_to_new_file (descriptor file, string newname)
    // creates the file specified by newname, copies data from the current
    // offset of the given descriptor into the new file until EOF is reached

set descriptor = open ("MyHugeFile")
set gigabyte = 2^30                 // 1024 * 1024 * 1024 bytes
set filesize = file_size (descriptor)
set blocks = (filesize + gigabyte - 1) / gigabyte   // round up

loop (i = blocks; i > 0; --i)
    set truncpos = gigabyte * (i - 1)
    seek (descriptor, truncpos)
    copy_to_new_file (descriptor, "MyHugeFile" + i)
    ftruncate (descriptor, truncpos)
Obviously some of this pseudocode is analogous to functions found in the standard library. In other cases, you will have to write your own.
There is no standard function for this job.
On Linux you can use ftruncate, while on Windows you can use _chsize or SetEndOfFile. A simple #ifdef will make your code cross-platform.
Also read this Q&A.