I want to write 2 TB of data into one file; in the future it might be a petabyte.
The data consists entirely of '1' characters. For example, 2 TB of data would look like "1111111111111......11111" (each byte is the character '1').
Here is my approach:
File.open("data",File::RDWR||File::CREAT) do |file|
  2*1024*1024*1024*1024.times do
    file.write('1')
  end
end
That means file.write is called once for every byte of the 2 TB. From a Ruby point of view, is there a better way to implement this?
You have a few problems:
File::RDWR||File::CREAT always evaluates to File::RDWR. You mean File::RDWR|File::CREAT (| rather than ||).
2*1024*1024*1024*1024.times do runs the loop 1024 times, then multiplies the result of the loop by the stuff on the left. You mean (2*1024*1024*1024*1024).times do.
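Both problems are easy to check in isolation. This is only a small sketch with made-up variable names (wrong_flags, right_flags, runs, result) and a loop count of 3 instead of 1024 so it finishes instantly:

```ruby
# File::RDWR and File::CREAT are integer flags, and every Integer is truthy
# in Ruby, so `||` just returns its left operand.
wrong_flags = File::RDWR || File::CREAT
right_flags = File::RDWR | File::CREAT   # bitwise OR keeps both flags

puts(wrong_flags == File::RDWR)            # CREAT was silently dropped
puts((right_flags & File::CREAT) != 0)     # CREAT is present here

# A method call binds tighter than `*`, so this is 2 * (3.times { ... }).
# Integer#times returns its receiver, so the block runs 3 times and the
# whole expression evaluates to 2 * 3.
runs = 0
result = 2 * 3.times { runs += 1 }
puts runs     # 3
puts result   # 6

# With parentheses the product is computed first, then used as the loop count.
runs = 0
(2 * 3).times { runs += 1 }
puts runs     # 6
```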
Regarding your question, I get a significant speedup by writing 1024 bytes at a time:
File.open("data",File::RDWR|File::CREAT) do |file|
  buf = "1" * 1024
  (2*1024*1024*1024).times do
    file.write(buf)
  end
end
You might experiment and find a better buffer size than 1024.
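If you want to try different buffer sizes, something along these lines might help. It is a rough sketch, not a tuned implementation: write_ones is a hypothetical helper name, "data_demo" is a throwaway file, the chunk sizes are arbitrary picks, and it opens the file in "wb" mode (which truncates) rather than RDWR|CREAT. It only writes 16 MiB per run so the comparison finishes quickly; timings will vary by machine and disk.

```ruby
require "benchmark"

# Hypothetical helper: writes total_bytes copies of '1' to path,
# chunk_size bytes per write call.
def write_ones(path, total_bytes, chunk_size = 1024 * 1024)
  buf = "1" * chunk_size
  full_chunks, remainder = total_bytes.divmod(chunk_size)
  File.open(path, "wb") do |file|
    full_chunks.times { file.write(buf) }
    # Handle totals that are not a multiple of the chunk size.
    file.write(buf[0, remainder]) if remainder > 0
  end
end

# Time a few candidate chunk sizes on a small demo file.
demo_bytes = 16 * 1024 * 1024
[1024, 64 * 1024, 1024 * 1024].each do |size|
  secs = Benchmark.realtime { write_ones("data_demo", demo_bytes, size) }
  puts "chunk #{size} bytes: #{secs.round(3)}s"
end
```

For the real job you would call write_ones("data", 2 * 1024**4, best_size) with whichever size came out ahead.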
I don't know which OS you are using, but the fastest approach would be to use a system copy to concatenate files into one big file; you can script that. For example, start with a string like "1" and echo it to a file:
echo "1" > file1
You can then concatenate this file with itself a number of times into a new file. On Windows you have to use the /b parameter for a binary copy to do that.
copy /b file1+file1 file2
gives you a file2 of 12 bytes (including the CR)
copy /b file2+file2 file1
gives you 24 bytes, etc.
I will leave the math (and the fun of Rubying this) to you, but you will reach your size quickly enough, and probably faster than with the accepted answer.
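For what it's worth, here is one way the doubling idea could look in Ruby. Treat it as a sketch under assumptions: double_until is a made-up helper, the file names file1 and file2 just mirror the example above, the seed is a bare "1" with no newline, and the last step trims the overshoot with File.truncate. It needs room for both files at once, and I have not measured it against the buffered write, so the speed claim is untested here.

```ruby
# Hypothetical helper: grows a file of `seed` bytes by repeatedly writing
# two copies of the current file into the other file, then swapping roles.
def double_until(seed, target_bytes, a = "file1", b = "file2")
  File.binwrite(a, seed)
  src, dst = a, b
  doublings = 0
  while File.size(src) < target_bytes
    File.open(dst, "wb") do |out|
      # IO.copy_stream accepts a file name as source and an IO as destination.
      2.times { IO.copy_stream(src, out) }
    end
    src, dst = dst, src
    doublings += 1
  end
  File.truncate(src, target_bytes)   # cut off whatever overshoots the target
  File.delete(dst) if File.exist?(dst)
  [src, doublings]
end

# Small demonstration: 1000 bytes needs 10 doublings (1 -> 1024), then a trim.
path, doublings = double_until("1", 1000)
puts "#{path}: #{File.size(path)} bytes after #{doublings} doublings"
```

Starting from one byte, reaching 2 TB (2**41 bytes) takes 41 doublings, but each round copies the whole file so far, so the total bytes written still add up to about the final size.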