I created a C++ executable for merging files. I have 43 files of 100 MB each, about 4.3 GB in total.
Two cases:
One: if the files are named 1, 2, 3, 4, 5, 6, ..., 43, merging takes about 2 minutes.
Two: if the files are named This File.ova0, This File.ova1, ..., This File.ova42, merging takes about 7 minutes.
These are exactly the same files; I only renamed them. Any idea what's wrong?
This is the C++ code:
#include <cstdio>
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include "boost/filesystem.hpp"
namespace bfs = boost::filesystem;
#pragma warning(disable : 4244)
typedef std::vector<std::string> FileVector;
int main(int argc, char **argv)
{
    int bucketSize = 3024 * 3024;
    FileVector Files;
    //Check all command-line params to see if they exist..
    for(int i = 1; i < argc; i++)
    {
        if(!bfs::exists(argv[i]))
        {
            std::cerr << "Failed to locate required part file: " << argv[i] << std::endl;
            return 1;
        }
        //Store this file and continue on..
        std::cout << "ADDING " << argv[i] << std::endl;
        Files.push_back(argv[i]);
    }
    //Prepare to combine all the files..
    FILE *FinalFile = fopen("abc def.ova", "ab");
    if(!FinalFile)
    {
        std::cerr << "Failed to open output file" << std::endl;
        return 1;
    }
    for(size_t i = 0; i < Files.size(); i++)
    {
        FILE *ThisFile = fopen(Files[i].c_str(), "rb");
        if(!ThisFile)
        {
            std::cerr << "Failed to open part file: " << Files[i] << std::endl;
            fclose(FinalFile);
            return 1;
        }
        char *dataBucket = new char[bucketSize];
        std::cout << "Combining " << Files[i].c_str() << "..." << std::endl;
        //Read the file in chunks so we do not chew up all the memory..
        while(size_t read_size = fread(dataBucket, 1, bucketSize, ThisFile))
        {
            //FILE *FinalFile = fopen("abc def.ova", "ab");
            //::fseek(FinalFile, 0, SEEK_END);
            fwrite(dataBucket, 1, read_size, FinalFile);
            //fclose(FinalFile);
        }
        delete [] dataBucket;
        fclose(ThisFile);
    }
    fclose(FinalFile);
    return 0;
}
I run it through a .bat file like this:
@ECHO OFF
Combiner.exe "This File.ova0" "This File.ova1" "This File.ova2"
PAUSE
or
@ECHO OFF
Combiner.exe 1 2 3
PAUSE
Both .bat files list every file name through the last one; I only show 3 files here because otherwise the lines would be too long.
Thank you
By default, Windows caches file data that is read from and written to disk. This means that read operations take file data from an area of system memory known as the system file cache rather than from the physical disk. Correspondingly, write operations write file data to the system file cache rather than to the disk; this type of cache is referred to as a write-back cache. Caching is managed per file object. More information: File Caching
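Because of this caching, how long a merge takes can depend on whether the part files are already in the system file cache (for example from an earlier run), so it helps to time each part separately before blaming the file names. Below is a minimal sketch of that idea, not a definitive implementation. The helper name `mergeFiles`, its parameters, and the 1 MiB default bucket size are assumptions made for illustration; unlike the code in the question, it truncates the output file instead of appending to it.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): copies every part into
// `output`, logs how long each part took, and returns the bytes copied.
// Assumption: the output is truncated first, not opened in append mode.
std::size_t mergeFiles(const std::vector<std::string>& parts,
                       const std::string& output,
                       std::size_t bucketSize = 1024 * 1024)
{
    std::ofstream out(output, std::ios::binary | std::ios::trunc);
    if (!out)
        throw std::runtime_error("cannot open output: " + output);

    // One buffer reused for all parts instead of one allocation per file.
    std::vector<char> bucket(bucketSize);
    std::size_t total = 0;

    for (const std::string& part : parts) {
        std::ifstream in(part, std::ios::binary);
        if (!in)
            throw std::runtime_error("cannot open part: " + part);

        const auto start = std::chrono::steady_clock::now();
        std::size_t partBytes = 0;
        while (true) {
            in.read(bucket.data(), static_cast<std::streamsize>(bucket.size()));
            const std::streamsize got = in.gcount();
            if (got <= 0)
                break;  // end of this part (or a read error)
            out.write(bucket.data(), got);
            partBytes += static_cast<std::size_t>(got);
        }
        const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now() - start);

        // Per-part timing makes a cached (fast) read easy to tell apart
        // from a read that really had to go to the physical disk.
        std::clog << part << ": " << partBytes << " bytes in "
                  << elapsed.count() << " ms" << std::endl;
        total += partBytes;
    }
    return total;
}
```

Running the same merge twice, once with each naming scheme and in both orders, and comparing the per-part times should show whether the difference follows the names or simply whichever run came first.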