
shell script runs out of memory

I have written the following random-number generator shell script:

for i in $(seq 1 $1) # repeat as many times as the first argument ($1) specifies
do 
echo "$i $((RANDOM%$2))" #print the current iteration number and a random number in [0, $2)
done

I run it like this:

./generator.sh 1000000000 101 > data.txt

to generate 1B rows of an id and a random number in [0,100] and store this data in file data.txt.

My desired output is:

1 39
2 95
3 61
4 27
5 85
6 44
7 49
8 75
9 52
10 66
...

It works fine for a small number of rows, but with 1B I get the following OOM error:

./generator.sh: xrealloc: ../bash/subst.c:5179: cannot allocate 18446744071562067968 bytes (4299137024 bytes allocated)

Which part of my program creates the error? How could I write the data.txt file line-by-line? I have tried replacing the echo line with:

echo "$i $((RANDOM%$2))" >> $3

where $3 is data.txt, but I see no difference.

vefthym asked Dec 06 '22 23:12


2 Answers

The problem is your for loop:

for i in $(seq 1 $1) 

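Bash first expands $(seq 1 $1) in full: it runs seq to completion and holds the entire output in memory (for 1 to 10^9 that is roughly 9.9 GB of text) before the loop body runs even once. Only then is that very big list passed to for.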
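The ordering can be seen with a small experiment. In this sketch, noisy_seq is a hypothetical stand-in for seq that logs each number to stderr as it is produced, and eager_loop (also a made-up name) logs each number as the loop consumes it. Every "produced" message appears before the first "consumed" one, because the command substitution has to finish before for receives any words.

```shell
#!/usr/bin/env bash
# noisy_seq is a hypothetical stand-in for seq: it prints 1..3 on stdout
# and logs each number on stderr at the moment it is produced.
noisy_seq() {
    for n in 1 2 3; do
        echo "$n"
        echo "produced $n" >&2
    done
}

# eager_loop iterates over the command substitution, as the question's
# script does, and logs each number as it is consumed.
eager_loop() {
    for i in $(noisy_seq); do
        echo "consumed $i" >&2
    done
}

eager_loop
```

With three numbers the cost is invisible; with a billion, the same behaviour means the whole list must fit in memory first.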

With a while loop, however, we can read the output of seq line by line through a pipe, which needs only a small, constant amount of memory:

seq 1 1000000000 | while read -r i; do
        echo "$i"
done
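
Applied to the generator from the question, the same pattern might look like the sketch below. The function name generate_rows is an assumption made for illustration; its two parameters play the roles of $1 (row count) and $2 (modulus) in the original script.

```shell
#!/usr/bin/env bash
# generate_rows COUNT MODULUS  (hypothetical helper name)
# Prints COUNT lines of "<id> <random number in [0, MODULUS)>".
# seq streams its output through the pipe, so the loop reads one
# line at a time instead of receiving the whole list up front.
generate_rows() {
    seq 1 "$1" | while read -r i; do
        echo "$i $((RANDOM % $2))"
    done
}

# With two arguments the script behaves like the original, e.g.
#   ./generator.sh 1000000000 101 > data.txt
if [ "$#" -eq 2 ]; then
    generate_rows "$1" "$2"
fi
```

Because the loop is part of a pipeline, bash runs it in a subshell, so variables assigned inside it are not visible after done. That does not matter here, since the loop only prints.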

Martin Tournoij answered Dec 30 '22 10:12


$(seq 1 $1) computes the whole list before the loop starts iterating over it, so bash needs memory for the entire list of 10^9 numbers, which is a lot.

I am not sure whether you can make seq run lazily, i.e., produce the next number only when it is needed. You can use a simple arithmetic for loop instead:

for ((i=1; i<=$1; ++i)) # count from 1 so the ids match the desired output
do
  echo "$i $((RANDOM%$2))"
done
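
On the second part of the question: the loop already emits one line per iteration, and the allocation fails while bash is still building the word list, before the first iteration, which would explain why appending with >> inside the loop made no difference. If the output file should be passed as an argument, one option is to redirect the whole loop once instead of reopening the file for every line. A minimal sketch, in which write_rows and its third parameter are assumptions made for illustration:

```shell
#!/usr/bin/env bash
# write_rows COUNT MODULUS OUTFILE  (hypothetical helper name)
# Writes COUNT lines of "<id> <random number in [0, MODULUS)>" to OUTFILE.
write_rows() {
    local count=$1 modulus=$2 outfile=$3
    local i
    # The arithmetic for loop keeps only the counter in memory.
    for ((i = 1; i <= count; ++i)); do
        echo "$i $((RANDOM % modulus))"
    done > "$outfile"   # the file is opened once for the whole loop
}

# With three arguments, e.g.: ./generator.sh 1000000000 101 data.txt
if [ "$#" -eq 3 ]; then
    write_rows "$1" "$2" "$3"
fi
```

Note that > truncates an existing file, whereas the >> in the question would append to it.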

Hari Menon answered Dec 30 '22 09:12