Generating a random binary file

Tags: linux, bash


That's because on most systems /dev/random uses random data gathered from the environment, such as static from peripheral devices. Its pool of truly random data (entropy) is very limited, and when the pool runs dry, reads block until more data becomes available.

Retry your test with /dev/urandom (notice the u), and you'll see a significant speedup.

See Wikipedia for more info: /dev/random does not always output truly random data, but evidently on your system it does, which is why it blocks.

Example with /dev/urandom:

$ time dd if=/dev/urandom of=/dev/null bs=1 count=1024
1024+0 records in
1024+0 records out
1024 bytes (1.0 kB) copied, 0.00675739 s, 152 kB/s

real    0m0.011s
user    0m0.000s
sys 0m0.012s

Try /dev/urandom instead:

$ time dd if=/dev/urandom of=random-file bs=1 count=1024
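
One caveat that applies to both timings above (my note, not part of the original answers): bs=1 forces dd to issue a separate one-byte read() and write() for every single byte, so much of the measured time is syscall overhead rather than random-number generation. For realistically sized files, use a larger block size:

$ time dd if=/dev/urandom of=random-file bs=1M count=100   # 100 MiB in 1 MiB blocks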

From: http://stupefydeveloper.blogspot.com/2007/12/random-vs-urandom.html

The main difference between random and urandom is how they pull random data from the kernel. random always takes data from the entropy pool; if the pool is empty, random blocks the operation until the pool has been refilled enough. urandom will generate data using SHA (or sometimes another algorithm, such as MD5) when the kernel entropy pool is empty, so urandom never blocks.
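
A quick way to observe the blocking behavior described above (Linux-specific; this proc file does not exist on macOS) is to check the kernel's estimate of how many bits of entropy are in the pool. When the number nears zero, reads from /dev/random block:

$ cat /proc/sys/kernel/random/entropy_avail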


I wrote a script to test the speed of various hashing functions. For this I wanted files of "random" data, and I didn't want to use the same file twice so that no function had a kernel-cache advantage over the others. I found that both /dev/random and /dev/urandom were painfully slow, so I chose to use dd to copy data off my hard disk starting at random offsets. I would NEVER suggest using this for anything security-related, but if all you need is noise it doesn't matter where you get it. On a Mac, use something like /dev/disk0; on Linux, use /dev/sda.

Here is the complete test script:

tests=3
kilobytes=102400
commands=(md5 shasum)   # macOS tool names; see the Linux note below
count=0
test_num=0
time_file=/tmp/time.out
file_base=/tmp/rand

while (( test_num < tests )); do
    ((test_num++))
    for cmd in "${commands[@]}"; do
        ((count++))
        file=$file_base$count
        touch "$file"
        # slowest
        #/usr/bin/time dd if=/dev/random of="$file" bs=1024 count=$kilobytes >/dev/null 2>$time_file
        # slow
        #/usr/bin/time dd if=/dev/urandom of="$file" bs=1024 count=$kilobytes >/dev/null 2>$time_file
        # less slow: read from a random offset on the raw disk
        /usr/bin/time sudo dd if=/dev/disk0 skip=$((RANDOM*4096)) of="$file" bs=1024 count=$kilobytes >/dev/null 2>$time_file
        echo "dd took $(tail -n1 $time_file | awk '{print $1}') seconds"
        echo -n "$(printf "%7s" $cmd)ing $file: "
        /usr/bin/time "$cmd" "$file" >/dev/null
        rm "$file"
    done
done
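
To run the same test on Linux (an adaptation on my part, not from the original answer), swap in the Linux tool names and a Linux block device:

commands=(md5sum sha1sum)
# and inside the loop, replace the dd line with:
/usr/bin/time sudo dd if=/dev/sda skip=$((RANDOM*4096)) of="$file" bs=1024 count=$kilobytes >/dev/null 2>$time_file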

Here is the "less slow" /dev/disk0 results:

dd took 6.49 seconds
    md5ing /tmp/rand1:         0.45 real         0.29 user         0.15 sys
dd took 7.42 seconds
 shasuming /tmp/rand2:         0.93 real         0.48 user         0.10 sys
dd took 6.82 seconds
    md5ing /tmp/rand3:         0.45 real         0.29 user         0.15 sys
dd took 7.05 seconds
 shasuming /tmp/rand4:         0.93 real         0.48 user         0.10 sys
dd took 6.53 seconds
    md5ing /tmp/rand5:         0.45 real         0.29 user         0.15 sys
dd took 7.70 seconds
 shasuming /tmp/rand6:         0.92 real         0.49 user         0.10 sys

Here are the "slow" /dev/urandom results:

dd took 12.80 seconds
    md5ing /tmp/rand1:         0.45 real         0.29 user         0.15 sys
dd took 13.00 seconds
 shasuming /tmp/rand2:         0.58 real         0.48 user         0.09 sys
dd took 12.86 seconds
    md5ing /tmp/rand3:         0.45 real         0.29 user         0.15 sys
dd took 13.18 seconds
 shasuming /tmp/rand4:         0.59 real         0.48 user         0.10 sys
dd took 12.87 seconds
    md5ing /tmp/rand5:         0.45 real         0.29 user         0.15 sys
dd took 13.47 seconds
 shasuming /tmp/rand6:         0.58 real         0.48 user         0.09 sys

Here are the "slowest" /dev/random results:

dd took 13.07 seconds
    md5ing /tmp/rand1:         0.47 real         0.29 user         0.15 sys
dd took 13.03 seconds
 shasuming /tmp/rand2:         0.70 real         0.49 user         0.10 sys
dd took 13.12 seconds
    md5ing /tmp/rand3:         0.47 real         0.29 user         0.15 sys
dd took 13.19 seconds
 shasuming /tmp/rand4:         0.59 real         0.48 user         0.10 sys
dd took 12.96 seconds
    md5ing /tmp/rand5:         0.45 real         0.29 user         0.15 sys
dd took 12.84 seconds
 shasuming /tmp/rand6:         0.59 real         0.48 user         0.09 sys

You'll notice that /dev/random and /dev/urandom were not much different in speed; /dev/disk0, however, took about half the time.

PS. I reduced the number of tests and removed all but two commands for the sake of "brevity" (not that I succeeded in being brief).
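
One more option worth mentioning (my addition, not from the original thread, and assuming the openssl CLI is installed): openssl rand generates bytes with a userspace CSPRNG, so it avoids the kernel devices entirely while still producing high-quality noise. For example, to write 100 MiB of random data:

$ openssl rand -out /tmp/rand-openssl 104857600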


Old thread, but like Tobbe mentioned, I needed something like this, only better (faster).

So... here is a shell way of doing the same thing, just much quicker than random/urandom. It's useful when creating really big files. It's admittedly not fully random, but probably close enough, depending on your needs.

dd if=/dev/mem of=test1G.bin bs=1M count=1024
touch test100G.bin
seq 1 100 | xargs -Inone cat test1G.bin >> test100G.bin

This will create a 100GB file from the contents of your RAM (the first 1GB; I assume you have that much RAM :) ). Note that it's also probably unsafe to share this file, since it may contain all kinds of sensitive data such as your passwords, so use it only for your own purposes :) Oh, and you need to run it as root, for the very same reason.
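
A caveat and an alternative (my addition, not from the original post): on most modern Linux kernels, CONFIG_STRICT_DEVMEM restricts reads from /dev/mem, so the dd above may fail after the first megabyte or so. The same build-one-block-then-concatenate trick works with /dev/urandom as the source; generating the first gigabyte is slower, but the 99 remaining copies are plain file copies and no root is needed:

dd if=/dev/urandom of=test1G.bin bs=1M count=1024
touch test100G.bin
seq 1 100 | xargs -Inone cat test1G.bin >> test100G.bin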