
Creating a Random File in C#

I am creating a file of a specified size - I don't care what data is in it, although random would be nice. Currently I am doing this:

    var sizeInMB = 3; // Up to many Gb
    using (FileStream stream = new FileStream(fileName, FileMode.Create))
    {
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            while (writer.BaseStream.Length <= sizeInMB * 1000000)
            {
                writer.Write("a"); //This could be random. Also, larger strings improve performance obviously
            }
            writer.Close();
        }
    }

This isn't efficient or even the right way to go about it. Any higher performance solutions?

Thanks for all the answers.

Edit

Ran some tests on the following methods for a 2Gb File (time in ms):

Method 1: Jon Skeet

    byte[] data = new byte[sizeInMb * 1024 * 1024];
    Random rng = new Random();
    rng.NextBytes(data);
    File.WriteAllBytes(fileName, data);

N/A - Out of Memory Exception for 2Gb File

Method 2: Jon Skeet

    byte[] data = new byte[8192];
    Random rng = new Random();
    using (FileStream stream = File.OpenWrite(fileName))
    {
        for (int i = 0; i < sizeInMB * 128; i++)
        {
            rng.NextBytes(data);
            stream.Write(data, 0, data.Length);
        }
    }

@1K - 45,868, 23,283, 23,346

@128K - 24,877, 20,585, 20,716

@8Kb - 30,426, 22,936, 22,936

Method 3 - Hans Passant (Super Fast but data isn't random)

    using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        fs.SetLength(sizeInMB * 1024 * 1024);
    }

257, 287, 3, 3, 2, 3 etc.

Jason asked Dec 13 '10 18:12




1 Answer

Well, a very simple solution:

    byte[] data = new byte[sizeInMb * 1024 * 1024];
    Random rng = new Random();
    rng.NextBytes(data);
    File.WriteAllBytes(fileName, data);

A slightly more memory efficient version :)

    // Note: block size must be a factor of 1MB to avoid rounding errors :)
    const int blockSize = 1024 * 8;
    const int blocksPerMb = (1024 * 1024) / blockSize;
    byte[] data = new byte[blockSize];
    Random rng = new Random();
    using (FileStream stream = File.OpenWrite(fileName))
    {
        // There
        for (int i = 0; i < sizeInMb * blocksPerMb; i++)
        {
            rng.NextBytes(data);
            stream.Write(data, 0, data.Length);
        }
    }

However, if you do this several times in very quick succession, creating a new instance of Random each time, you may get duplicate data. See my article on randomness for more information. You could avoid this by using System.Security.Cryptography.RandomNumberGenerator, or by reusing the same instance of Random multiple times, with the caveat that it's not thread-safe.
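For what it's worth, here is a minimal sketch of the block-wise approach using System.Security.Cryptography.RandomNumberGenerator instead of Random. The class name RandomFileWriter, the method name Write, and the 128 KB default block size are illustrative assumptions, not part of the answer above. The sketch takes the size as a long in bytes, so multi-gigabyte sizes do not overflow int, and it opens the file with FileMode.Create because File.OpenWrite does not truncate an existing file.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class RandomFileWriter
{
    // Hypothetical helper (not from the original answer): writes exactly
    // sizeInBytes random bytes to fileName, one block at a time.
    public static void Write(string fileName, long sizeInBytes, int blockSize = 128 * 1024)
    {
        if (sizeInBytes < 0) throw new ArgumentOutOfRangeException(nameof(sizeInBytes));
        if (blockSize <= 0) throw new ArgumentOutOfRangeException(nameof(blockSize));

        byte[] buffer = new byte[blockSize];

        // One generator instance is reused for every block.
        // FileMode.Create truncates an existing file; File.OpenWrite would not.
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        using (FileStream stream = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            long remaining = sizeInBytes;
            while (remaining > 0)
            {
                // The last block may be shorter than the buffer.
                int count = (int)Math.Min(remaining, buffer.Length);
                rng.GetBytes(buffer);
                stream.Write(buffer, 0, count);
                remaining -= count;
            }
        }
    }
}
```

A call such as RandomFileWriter.Write(fileName, 2L * 1024 * 1024 * 1024) would request a 2 GB file; the L suffix keeps the multiplication in long arithmetic. A cryptographic generator is likely slower than Random, so this trades some speed for avoiding the duplicate-seed problem.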

Jon Skeet answered Oct 13 '22 04:10
