Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generate random numbers from 1-100 from a generator of 1-50

in a recent interview i was asked the following question:

Print random numbers from 1-100 using the given getrnd50() method which generates the random numbers from 1-50. Each random number should be printed only once and in random order. Use of no other random number generator is allowed and i was not allowed to change the definition of getrnd50().

I came up with the following code which gives the correct output.

import java.util.Random;

public class Test {

public static void main(String[] args) {
    int[] rs = new int[100];
    int count = 0;
    int k;
    while (count != 100) {

        // I decided to simply multiply the result of `getrnd50()` by 2. 
        // But since that would generate only even numbers,

        k = getrnd50() * 2;

        // I decided to randomly subtract 1 from the numbers. 
        // Which i accomlished as follows.          

        if (getrnd50() <= 25) // 25 is to half the possibilities.
            k--;

        // Every number is to be stored in its own unique location 
        // in the array `rs`, the location is `(number-1)`. 
        // Every time a number is generated it is checked whether it has 
        // already been generated. If not, it is saved in its position, printed and 
        // `count` is incremented.

        if (rs[k-1] == 0) {
            rs[k-1] = k;
            count++;
            System.out.print(k + " ");
        }
    }
}
// This is the method and i am not supposed to touch it.
static int getrnd50() {
    Random rand = new Random();
    return (1 + rand.nextInt(50));
}

}

While it was accepted in that round, in the next round the interviewer tells me that getrnd50() is a costly method and even in best case scenario i have to call it twice for every number generated. i.e. 200 times for 1-100. In worst case scenario it would be infinity and tens of thousand in average case. He asks me to optimize the code so as to significantly improve the average case.

He gave me a hint when i expressed my inability to do it, he said:

To consider the number of numbers generated while generating a new number. For ex. if count becomes 99 i don't have to call getrnd50() I can simply find the remaining number and print it.

While i understood his drift i had no idea how it would help me, so obviously i got rejected. Now i am curious to know the answer. Help me! Thanx in advance!

Note: if anyone is feeling lazy to to write a lengthy code just point out the numer generation part, the rest is easy. Also we are not bound to follow the hint.

like image 278
Surender Thakran Avatar asked Jan 02 '13 14:01

Surender Thakran


People also ask

How do you generate a random number from a list of numbers?

Note: If you want to generate random number based on a list, you can use this formula =INDEX($I$2:$I$7, RANDBETWEEN(1, 6)), and press Enter key.


3 Answers

The key is to not check if you have generated the number before, which gets very expensive when looking for only 1 remaining number, but to generate the numbers 1-100 in order, and then shuffle.

In your code, when you have generated 99 out of the 100 numbers, you will loop around, generating random numbers, until you find that 1 remaining number. That's why the average case in your version is so bad.

If instead you just shuffle an array, you only need to have as many random numbers as you have shuffle operations, and only as many shuffle operations as you need numbers output.

(For full details on shuffling, look up the Fisher-Yates shuffle, specifically the inside-out variant which can generate a shuffled array in place)

To generate the random numbers, you need a variable generator, rather than a fixed 1-50 one. You can approach this in a variety of ways, but be very careful of introducing skew into the results, if you really want the output to have a good distribution across the possible states.

For example, I would recommend using an integral number of bits, with shifting, rather than attempting to use a modulo. This does involve a certain amount of looping if the values are outside of the desired range, but without being able to modify the original random number generation, your hands are somewhat tied.

static int bits = 0;
static int r100 = 0;

static int randomInt(int range)
{
    int ret;

    int bitsneeded = 32 - Integer.numberOfLeadingZeros(range - 1);

    do {
            while(bits < bitsneeded)
            {
                    int r = (getrnd50()-1) * 50 + getrnd50()-1;
                    if(r < 2048)
                    {
                            r100 <<= 11;
                            r100 |= r;
                            bits += 11;
                    }
            }
            ret = r100 & ((1 << bitsneeded) - 1);
            bits -= bitsneeded;
            r100 >>=  bitsneeded;
    } while(ret >= range); 

        return ret + 1;
}

This implementation will use something in the region of 150 random numbers for your 100 value shuffled array. This is worse than the modulo version, but better than 2x the input range, which was the best case of the original version. There is, if the random generation was truly random, still a worst-case scenario of infinity, but random generation doesn't typically work like that. If it did, I'm not sure unskewed results are realistic given the constraints.

For illustration, as the results are subtle, here's a graph of my suggested random routine, versus a modulo version:

Graph of random generators

So in summary, I think that while your random generation is a bit inefficient, and could be improved, the really big win that interviewer was looking for, is in not needing so many random numbers in the first place, by doing a shuffle rather than repeated searching with an ever decreasing probability.

like image 151
JasonD Avatar answered Nov 03 '22 01:11

JasonD


Since 100 / 50 is an integer, this is quite easy. Since 50 / (100 / 50) is an integer, it's even easier.

If you didn't quite get that, here is some sample code:

int rnd1 = getrnd50();
int rnd2 = getrnd50();
if (rnd1 % 2 == 0)
{
    rnd2 += 50;
}
return rnd2;

Here is an outline:

  • Two numbers, chosen randomly between 1 and 50, called a and b.
  • If a is even, add 50 to b.
  • Return b.

You can make this a one-liner if you want:

return getrnd50() + getrnd50() % 2 * 50;

That's a little too obfuscated though.

Edit: I see the question was really asking for a shuffled list, not a sequence of random integers.

This can be done by creating a list from 1 to 100, and doing 100 random swaps, like a Fisher-Yates shuffle. I imagine that with a Fisher-Yates shuffle, the absolute minimum number of calls is 93 (given with the formula ceil(log50(100!))), but with a much simpler algorithm you can use 200.

The simple algorithm would involve swapping each of the 100 elements with a random element from the 100. The number to choose would be generated from 1-100 with the above generator.

For example:

for (int i = 0; i < 100; i++)
{
    swap(i, getrnd100() - 1); // - 1 for zero base!
}

Here is some complete code:

int[] result = new int[100];
for (int i = 0; i < 100; i++)
{
    result[i] = i + 1;
}
for (int i = 0; i < 100; i++)
{
    int j = (getrnd50() + getrnd50() % 2 * 50) - 1;
    int tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
}
return result;

(Disclaimer: I don't know Java, and I haven't tested it.)

Best case 200, worst case 200, average case 200.

like image 22
Kendall Frey Avatar answered Nov 02 '22 23:11

Kendall Frey


Here is how you could answer it. It exploits the fact that,

  • assuming you are using shuffle to get an O(n) swapping of "cards", the modulus decreases in a shuffle. i.e. start with an int[] of every values and shuffle it like Collections.shuffle() does.
  • you have more randomness than you need if you call getrnd50() twice, esp when you have less than 50 values left to swap with.

EDIT: For those not familar with how shuffle works, I have added the code for shuffling

import java.util.*;
import java.lang.*;

class Main {
    public static void main(String... args) {
        int samples = 100;

        // all the numbers [1, 100]
        int[] nums = new int[samples];
        for (int i = 0; i < samples; i++) nums[i] = i + 1;

        for (int i = samples - 1; i > 0; i--) {
            int swapWith = nextInt(i + 1);

            // swap nums[i] and nums[swapWith]
            if (swapWith == i) continue;
            int tmp = nums[swapWith];
            nums[swapWith] = nums[i];
            nums[i] = tmp;
        }
        System.out.println("calls/sample " + (double) calls / samples);
        System.out.println(Arrays.toString(nums));

        int[] count49 = new int[49];
        for (int i = 0; i < 49 * 10000; i++)
            count49[nextInt(49) - 1]++;
        int[] count54 = new int[54];
        for (int i = 0; i < 54 * 10000; i++)
            count54[nextInt(54) - 1]++;
        System.out.println("Histogram check (49): " + Arrays.toString(count49));
        System.out.println("Histogram check (54): " + Arrays.toString(count54));

    }

    // keep track of the range of values.
    static int maxRandom = 1;
    // some random value [0, maxRandom)
    static int rand100 = 0;

    static int nextInt(int n) {
        while (maxRandom < 10 * n * n) {
            maxRandom *= 50;
            rand100 = rand100 * 50 + getrnd50() - 1;
        }
        int ret = rand100 % n;
        maxRandom = (maxRandom + n - 1) / n;
        rand100 /= n;
        return ret + 1;
    }

    static final Random rand = new Random();
    static int calls = 0;

    static int getrnd50() {
        calls++;
        return (1 + rand.nextInt(50));
    }
}

prints

calls/sample 0.94

[1, 37, 4, 98, 76, 53, 26, 55, 9, 78, 57, 58, 47, 12, 44, 25, 82, 2, 42, 30, 88, 81, 64, 99, 16, 28, 34, 29, 51, 36, 13, 94, 80, 66, 19, 38, 20, 8, 40, 89, 72, 56, 75, 96, 35, 100, 95, 17, 74, 69, 11, 31, 86, 92, 6, 27, 22, 70, 63, 32, 93, 84, 71, 15, 23, 5, 14, 62, 49, 43, 87, 65, 83, 33, 45, 52, 39, 91, 60, 73, 68, 24, 97, 46, 50, 18, 79, 48, 77, 67, 59, 10, 7, 54, 90, 85, 21, 61, 41, 3]

Histogram check (49): [10117, 10158, 10059, 10188, 10338, 9959, 10313, 10278, 10166, 9828, 10105, 10159, 10250, 10152, 9949, 9855, 10026, 10040, 9982, 10112, 10021, 10082, 10029, 10052, 9996, 10057, 9849, 9990, 9914, 9835, 10029, 9738, 9953, 9828, 9896, 9931, 9995, 10034, 10067, 9745, 9873, 9903, 9913, 9841, 9823, 9859, 9941, 10007, 9765]

Histogram check (54): [10124, 10251, 10071, 10020, 10196, 10170, 10123, 10096, 9966, 10225, 10262, 10036, 10029, 9862, 9994, 9960, 10070, 10127, 10021, 10166, 10077, 9983, 10118, 10163, 9986, 9988, 10008, 9965, 9967, 9950, 9965, 9870, 10172, 9952, 9972, 9828, 9754, 10152, 9943, 9996, 9779, 10014, 9937, 9931, 9794, 9708, 9978, 9894, 9803, 9904, 9915, 9927, 10000, 9838]

In this case, 100 numbers need less than 100 calls to getrnd50

If you have 1000 values to shuffle

calls/sample 1.509
like image 30
Peter Lawrey Avatar answered Nov 02 '22 23:11

Peter Lawrey