This may be a stupid question, but Google and MATLAB documentation have failed me. I have a rather large binary file (>10 GB) that I need to open and delete the last forty million bytes or so. Is there a way to do this without reading the entire file to memory in chunks and printing it out to a new file? It took 6 hours to generate the file, so I'm cringing at the thought of re-reading the whole thing.
EDIT:
The file is 14,440,000,000 bytes in size. I need to chop it to 14,400,000,000.
There is no ftruncate() in Matlab, but you've got access to the full Java standard library in the JVM embedded in Matlab, and can use java.io.RandomAccessFile or the Java NIO classes to truncate a file.
Here's a Matlab function that calls to Java to lop the last n bytes off a file. Should have minimal I/O cost.
function remove_last_n_bytes_from_file(file, n)
jFile = java.io.RandomAccessFile(file, 'rw');
currentLength = jFile.length();
wantLength = currentLength - n;
fprintf('Truncating file %s: Resizing to %d to remove %d bytes\n', file, wantLength, n);
jFile.setLength(wantLength);
jFile.close();
You could also do it as a one-liner.
java.io.RandomAccessFile('/path/to/my/file.bin', 'rw').setLength(n);
I found Perl is much quicker to do this than MATLAB.
Here are two examples from Perl Cookbook:
truncate(HANDLE, $length)
or die "Couldn't truncate: $!\n";
truncate("/tmp/$$.pid", $length)
or die "Couldn't truncate: $!\n";
You can run Perl script from MATLAB with PERL function.
Since you don't want to read the file into MATLAB (understandably), you are dealing with system level commands. MATLAB has a facility to call system commands using the "system" command
system
So now your problem is reduced to finding the shell command in your OS that will do it for you. Or you can write a program using truncate() (unix -- KennyTM) or SetEndOfFile (windows)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With