
Is there a more efficient way to generate a random file in Perl?

This is my first Perl script. Ever:

#!/usr/bin/perl

if ($#ARGV < 1) { die("usage: <file_name> <size_in_bytes>\n"); }

open(FILE,">" . $ARGV[0]) or die "Can't open file for writing\n";

# you can control the range of characters here
my $minimum = 32;
my $range = 96;

for ($i=0; $i< $ARGV[1]; $i++) {
    print FILE chr(int(rand($range)) + $minimum);
}

close(FILE);

Its purpose is to generate a file of a specified size filled with random characters.

It works, but it is pretty slow: it takes a few seconds to write a 10MB random file.
Does anyone have suggestions or tips on how to make it faster or better? Also, feel free to point out common newbie mistakes.

quantumSoup asked Aug 10 '10 00:08


2 Answers

  1. You could get more than one character out of each call to rand, by asking it for a larger number and splitting that number into several values.
  2. Collect several characters together before calling print. Printing one character at a time is inefficient.

 

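# Assumes $num_bytes, $range, $minimum and FILE are set up as in the question.
for (my $bytes = 0; $bytes < $num_bytes; $bytes += 4) {
    # One rand() call yields four base-$range digits.
    my $rand = int(rand($range ** 4));
    my $string = '';
    for (1..4) {
        $string .= chr($rand % $range + $minimum);
        $rand = int($rand / $range);
    }
    # Trim the last chunk so the file is not padded to a multiple of 4.
    print FILE substr($string, 0, $num_bytes - $bytes);
}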
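
The second tip can be taken further than four characters at a time. The following is a minimal sketch, not the code from the answer above: the helper names random_chunk and write_random and the 4096-character chunk size are assumptions made for illustration. It builds each chunk with map and join and calls print once per chunk. Perl already buffers output, so it is worth timing this against the original loop before relying on it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original answers): returns $len random
# characters with codes in [$minimum, $minimum + $range).
sub random_chunk {
    my ($len, $minimum, $range) = @_;
    return join '', map { chr(int(rand($range)) + $minimum) } 1 .. $len;
}

# Hypothetical helper: writes $num_bytes random characters to the open
# filehandle $fh, using at most $chunk_size characters per print call.
sub write_random {
    my ($fh, $num_bytes, $minimum, $range, $chunk_size) = @_;
    my $remaining = $num_bytes;
    while ($remaining > 0) {
        my $len = $remaining < $chunk_size ? $remaining : $chunk_size;
        print {$fh} random_chunk($len, $minimum, $range);
        $remaining -= $len;
    }
    return;
}

# Example: write 10,000 characters to an in-memory filehandle.
open(my $fh, '>', \my $buffer) or die "Can't open in-memory file ($!)";
write_random($fh, 10_000, 32, 96, 4096);
close($fh);
print length($buffer), " characters written\n";
```

To write to a real file, open it as in the question and pass that filehandle to write_random instead of the in-memory one.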

mob answered Oct 30 '22 14:10


If you need random numbers from a range, I'm not aware of a more efficient way. Here is your script adjusted to my liking:

#!/usr/bin/perl

use warnings;
use strict;

die("usage: $0 <size_in_bytes> <file_name>\n") unless @ARGV == 2;

my ($num_bytes, $fname) = @ARGV;

open(FILE, ">", $fname) or die "Can't open $fname for writing ($!)";

my $minimum = 32;
my $range = 96;

for (1 .. $num_bytes) {
    print FILE pack( "c", int(rand($range)) + $minimum);
}

close(FILE);

I use pack("c") when I really need binary output. chr() might be fine too, but IIRC its result depends on the character encoding your environment is using (think ASCII vs. UTF-8).
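
For the range used in this script (codes 32 to 127), the two functions can be compared directly. This is a small check written for illustration, not part of the original answer; it only covers this range and says nothing about codes above 127.

```perl
use strict;
use warnings;

# Count the codes in the script's range for which pack("c") and chr() differ.
my ($minimum, $range) = (32, 96);
my $mismatches = 0;
for my $code ($minimum .. $minimum + $range - 1) {
    $mismatches++ if pack("c", $code) ne chr($code);
}
print "mismatches: $mismatches\n";
```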

BTW, if you really need a binary file that is also written correctly on Windows, you might want to add binmode FILE; after the open.
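
As a sketch of where the binmode call goes, assuming a lexical filehandle instead of the bareword FILE (the helper name write_binary and the temporary file are made up for this example):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: writes the given bytes to $fname in binary mode.
sub write_binary {
    my ($fname, $bytes) = @_;
    open(my $fh, '>', $fname) or die "Can't open $fname for writing ($!)";
    binmode($fh);    # disables newline translation, which matters on Windows
    print {$fh} $bytes;
    close($fh) or die "Can't close $fname ($!)";
    return;
}

# Example: three bytes, including a line feed and a carriage return.
my (undef, $fname) = tempfile(UNLINK => 1);
write_binary($fname, pack("c*", 10, 13, 65));
my $size = -s $fname;
print "$size bytes on disk\n";
```

Without binmode, a Windows build of Perl would normally translate the line feed into a carriage return plus line feed, so the file would end up larger than the data written.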

Otherwise, if you don't need to restrict the output to a character range, you can simply run dd if=/dev/random of=$filename bs=1 count=$size_of_the_output (or, on Linux, use the faster but crypto-unsafe /dev/urandom). But that would be much slower, because /dev/random tries to deliver truly random bits as they become available. If there are not enough of them (e.g. your platform doesn't have a hardware RNG), performance really suffers compared to libc's blazingly fast pseudo-random number generator (which Perl uses internally to implement rand()).
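
The same idea can be sketched in Perl itself by reading a block of bytes from the device. This assumes a platform that provides /dev/urandom; the helper name urandom_bytes is hypothetical and the code is an illustration, not a benchmarked replacement for the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns $num_bytes bytes read from $source.
# The default source, /dev/urandom, is an assumption about the platform.
sub urandom_bytes {
    my ($num_bytes, $source) = @_;
    $source = '/dev/urandom' unless defined $source;
    open(my $in, '<:raw', $source) or die "Can't open $source ($!)";
    my $data = '';
    while (length($data) < $num_bytes) {
        # Append to $data, asking only for the bytes still missing.
        my $got = read($in, $data, $num_bytes - length($data), length($data));
        die "Read from $source failed ($!)" unless defined $got;
        last if $got == 0;    # unexpected end of input
    }
    close($in);
    return $data;
}

# Example: only runs where the device is readable.
if (-r '/dev/urandom') {
    my $data = urandom_bytes(1024);
    print length($data), " bytes read\n";
}
```

The result contains arbitrary byte values, so it is only suitable when the output does not have to stay within a printable range.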

Dummy00001 answered Oct 30 '22 12:10