How would I pick a random character from a string/array with Bash?

Tags: bash, random
I have an example where I am attempting to randomly generate a letter in a set of letters (i.e. var="abcdefghijklmnopqrstuvwxyz").

How would I have:

var="abcdefghijklmnopqrstuvwxyz"

echo "${var}"

yield: some_letter

The point is, I’m trying to automatically generate a random letter from a selection of my choosing. Whether it be in an array or a string does not matter.

Matthew Bonnette asked Dec 01 '22 10:12


2 Answers

var="abcdefghijklmnopqrstuvwxyz"
echo "${var:$(( RANDOM % ${#var} )):1}" # pick a 1 char substring starting at a random position

This works because:

  • ${var:START:LEN} is a parameter expansion that expands to a substring of $var, LEN characters long, starting at the zero-based offset START
  • ${#var} is a parameter expansion that expands to the length of the contents of the string variable var
  • $(( )) creates an arithmetic context, in which context non-numeric strings are assumed to refer to variable names (so one can use RANDOM instead of $RANDOM).
  • $RANDOM, each time it is evaluated, expands to a random integer between 0 and 32767.
  • $RANDOM % ${#var} takes the remainder of dividing that random integer by the number of characters in var; consequently, the result is between 0 and (length-of-var - 1) and is almost uniformly distributed (if length-of-var doesn't divide 32768 evenly, some of the characters have a very slightly higher chance of being chosen than others).

Thus, ${var:$(( RANDOM % ${#var} )):1} will, each time it is evaluated, pick a random position inside the string and expand to the single character at that position.
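
Since the question allows either a string or an array, here is a minimal sketch that wraps the same expansion in two helpers. The function names random_char and random_element are made up for this illustration, and both inherit the slight modulo bias described above.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print one random character of the string passed as $1.
random_char() {
  local s=$1
  # Nothing to pick from an empty string (this also avoids a division by zero).
  (( ${#s} > 0 )) || return 1
  printf '%s\n' "${s:$(( RANDOM % ${#s} )):1}"
}

# Hypothetical helper: print one random element of its arguments.
random_element() {
  (( $# > 0 )) || return 1
  local items=("$@")
  # Array subscripts are already an arithmetic context, so RANDOM needs no $.
  printf '%s\n' "${items[RANDOM % ${#items[@]}]}"
}

random_char "abcdefghijklmnopqrstuvwxyz"   # prints one letter
random_element red green blue              # prints one of the three words
```

For an existing array, call it as random_element "${my_array[@]}".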

Charles Duffy answered Dec 04 '22 03:12


For most practical cases, Charles Duffy's solution is the way forward. However, if the character has to be picked uniformly, using RANDOM becomes slightly more complicated (see the explanation below). A simpler option is shuf, which generates a random permutation of a given range and lets you take only the first number, as in shuf -i 0-25 -n1, so you could use

var="abcdefghijklmnopqrstuvwxyz"
echo "${var:$(shuf -i 0-$((${#var}-1)) -n1):1}"

The idea here is to pick a letter from the string var by using the parameter expansion ${var:m:n}, which expands to the substring of length n starting at position m. The length is set to 1 and the starting position is produced by the command shuf -i 0-$((${#var}-1)) -n1, which shuffles the range from 0 to ${#var}-1 and prints the first number, where ${#var} is the length of the string in var.
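
If the choices live in an array, shuf can also pick the element directly instead of an index. A minimal sketch, assuming GNU coreutils shuf (its -e option treats each argument as an input line); the function name random_element_shuf is made up for this illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print one of its arguments, chosen by shuf.
# Assumes GNU coreutils shuf is installed.
random_element_shuf() {
  (( $# > 0 )) || return 1
  # -e: use the arguments as input lines; -n1: output only one of them.
  shuf -n1 -e -- "$@"
}

letters=(a b c d e f)
random_element_shuf "${letters[@]}"
```

This starts an external process for every pick, so it is slower than the RANDOM-based expansion when called in a tight loop.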


Why not use RANDOM:

The variable RANDOM generates a pseudo-random integer between 0 and 32767. This implies that if you want a uniform random number between 0 and n-1, you cannot simply take the value modulo n: unless n divides 32768 evenly, the first 32768%n results (0 through 32768%n - 1) have a slightly higher chance of being drawn. This is easily seen with the following script:

% for i in {0..32767}; do echo $((i%5)); done | sort -g | uniq -c
   6554 0
   6554 1
   6554 2
   6553 3  < smaller chance to hit 3
   6553 4  < smaller chance to hit 4

Another classic approach is to map the range of the random number generator onto the requested range by scaling the random value as n*RANDOM/32768. Unfortunately, this only works for a random number generator that produces real numbers, and RANDOM produces integers. With integer division, the scaling merely moves the bias to different values:

% for i in {0..32767}; do echo $((5*i/32768)); done | sort -g | uniq -c
   6554 0
   6554 1
   6553 2  < smaller chance to hit 2
   6554 3
   6553 4  < smaller chance to hit 4

If you want to use RANDOM, the best way is to reject the values at the top of the range that cause the bias and draw again (rejection sampling), which you can do with a simple while loop:

var="abcdefghijklmnopqrstuvwxyz"
n=${#var}
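idx=32769; while (( idx >= (32768/n)*n )); do idx=$RANDOM; done  # redraw until idx is below the largest multiple of n
char=${var:$(( idx % n )):1}  # reduce the accepted value to an index in 0..n-1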

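note: in principle the while loop has no upper bound on its number of iterations. In practice each draw is rejected with probability (32768 % n)/32768, which is below n/32768, so for short strings the loop almost always finishes on the first draw.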
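
The rejection loop can be wrapped in a reusable function. This is a minimal sketch, not a definitive implementation: the names uniform_index and uniform_idx are made up for this illustration, and the result is returned in a global variable so that no command substitution (and therefore no subshell) is needed.

```bash
#!/usr/bin/env bash

# Hypothetical helper: store a uniformly distributed index in 0..n-1 in the
# global variable uniform_idx. Fails if n is outside 1..32768.
uniform_index() {
  local n=$1 range=32768 limit
  (( n >= 1 && n <= range )) || return 1
  # Largest multiple of n that fits in the range; values at or above it
  # would make some results more likely than others, so they are redrawn.
  limit=$(( range / n * n ))
  uniform_idx=$RANDOM
  while (( uniform_idx >= limit )); do uniform_idx=$RANDOM; done
  # Accepted values cover every result equally often; reduce to 0..n-1.
  uniform_idx=$(( uniform_idx % n ))
}

var="abcdefghijklmnopqrstuvwxyz"
uniform_index "${#var}" && echo "${var:uniform_idx:1}"
```

Uniform here means uniform with respect to the values RANDOM produces; it does not improve the quality of the underlying generator.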

Comment: this answer does not assess how good the random number generator behind RANDOM is. It only cites the comment in the source:

source bash 4.4.18 (variables.c)

/* A linear congruential random number generator based on the example
   one in the ANSI C standard. This one isn't very good, but a more
   complicated one is overkill.
*/

kvantour answered Dec 04 '22 04:12