
How can you concatenate two huge files with very little spare disk space? [closed]

Suppose that you have two huge files (several GB) that you want to concatenate together, but that you have very little spare disk space (let's say a couple hundred MB). That is, given file1 and file2, you want to end up with a single file which is the result of concatenating file1 and file2 together byte-for-byte, and delete the original files.

You can't do the obvious cat file2 >> file1; rm file2, since you'd run out of disk space during the append, before file2 could be deleted.

Solutions on any and all platforms with free or non-free tools are welcome; this is a hypothetical problem I thought up while I was downloading a Linux ISO the other day, and the download got interrupted partway through due to a wireless hiccup.

Adam Rosenfield asked Nov 14 '08 16:11


2 Answers

time spent figuring out clever solution involving disk-sector shuffling and file-chain manipulation: 2-4 hours

time spent acquiring/writing software to do in-place copy and truncate: 2-20 hours

times median $50/hr programmer rate: $200-$1200

cost of 1TB USB drive: $100-$200

ability to understand the phrase "opportunity cost": priceless

Steven A. Lowe answered Sep 22 '22 21:09


I think the difficulty is determining how the space can be recovered from the original files.

I think the following might work:

  1. Allocate a sparse file of the combined size.
  2. Copy 100 MB from the end of the second file to the end of the new file.
  3. Truncate 100 MB off the end of the second file.
  4. Repeat steps 2 and 3 until the second file is used up, moving the destination offset back by 100 MB each time.
  5. Repeat steps 2 to 4 with the first file.

This all relies on sparse file support, and on file truncation freeing space immediately.

If you actually wanted to do this, you should investigate the dd command, which can do the copying step.
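
The sparse-file steps above could be sketched roughly as follows in Python instead of dd. This is only an illustration under assumptions: the function names and the out_path argument are made up for the example, and it only saves space if the file system creates holes when a file is extended with truncate and frees blocks as soon as a file is shortened.

```python
import os

CHUNK = 100 * 1024 * 1024  # 100 MB per step, as in the steps above


def move_tail_chunks(src_path, dst, dst_base, chunk=CHUNK):
    """Move src into dst at offset dst_base, working backwards from the end.

    Each chunk is written to dst, flushed, and only then cut off src.
    (Hypothetical helper, not from the original answer.)
    """
    remaining = os.path.getsize(src_path)
    with open(src_path, "r+b") as src:
        while remaining > 0:
            n = min(chunk, remaining)
            start = remaining - n
            src.seek(start)
            data = src.read(n)
            dst.seek(dst_base + start)   # same relative position in the output
            dst.write(data)
            dst.flush()
            os.fsync(dst.fileno())       # make sure the copy is on disk first
            src.truncate(start)          # then give the space back
            remaining = start
    os.remove(src_path)


def concat_via_sparse(file1, file2, out_path, chunk=CHUNK):
    """Build out_path = file1 + file2, consuming both inputs."""
    size1 = os.path.getsize(file1)
    size2 = os.path.getsize(file2)
    with open(out_path, "wb") as dst:
        # Extending an empty file leaves a hole on file systems with
        # sparse-file support; elsewhere this may allocate the full size.
        dst.truncate(size1 + size2)
        move_tail_chunks(file2, dst, size1, chunk)  # step 4
        move_tail_chunks(file1, dst, 0, chunk)      # step 5
```

A call such as concat_via_sparse("file1", "file2", "combined") should need roughly one chunk of spare space at any moment, provided the assumptions about the file system hold.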

Someone in another answer gave a neat solution that doesn't require sparse files, but does copy file2 twice:

  1. Copy 100 MB chunks from the end of file2 to a new file3, truncating file2 as you go. The chunks end up in file3 in reverse order.
  2. Copy 100 MB chunks from the end of file3 onto the end of file1, truncating file3 as you go. The chunks end up in their original order at the end of file1.
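
A minimal Python sketch of this two-pass idea is shown below. The function names and the tmp_path argument are hypothetical, and it assumes that truncating a file frees its space immediately. One detail the steps leave implicit: if the size of file2 is not a multiple of the chunk size, the short chunk ends up at the end of file3, so the second pass has to take that short chunk first to keep the chunk boundaries aligned.

```python
import os

CHUNK = 100 * 1024 * 1024  # 100 MB chunks, as in the steps above


def move_chunks_from_end(src_path, dst_path, chunk, first):
    """Append chunks taken from the end of src to dst, truncating src.

    `first` is the size of the first chunk taken; later chunks use `chunk`.
    (Hypothetical helper, not from the original answer.)
    """
    remaining = os.path.getsize(src_path)
    n = min(first, remaining)
    with open(src_path, "r+b") as src, open(dst_path, "ab") as dst:
        while remaining > 0:
            start = remaining - n
            src.seek(start)
            dst.write(src.read(n))
            dst.flush()
            os.fsync(dst.fileno())   # copy is on disk before we free space
            src.truncate(start)
            remaining = start
            n = min(chunk, remaining)
    os.remove(src_path)


def concat_via_temp(file1, file2, tmp_path, chunk=CHUNK):
    """Append file2 to file1 in place, using tmp_path as the reversed copy."""
    size2 = os.path.getsize(file2)
    # Pass 1: file2 -> tmp, chunks land in reverse order; the short chunk
    # (the head of file2) is taken last and so sits at the end of tmp.
    move_chunks_from_end(file2, tmp_path, chunk, first=chunk)
    # Pass 2: tmp -> file1; take the short chunk first so the remaining
    # chunk boundaries line up, restoring the original order.
    move_chunks_from_end(tmp_path, file1, chunk, first=(size2 % chunk) or chunk)
```

After concat_via_temp("file1", "file2", "file3") returns, file1 should hold the concatenation and both file2 and file3 should be gone.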

Douglas Leeder answered Sep 25 '22 21:09