I have the following problem:
I want to generate a 100x100 grid (numpy.ndarray) filled with numbers from a given list ([-1,0,1,2]), distributed randomly over the grid. The numbers must also keep the following ratios: 0 must occupy 10% of the grid, while each of the remaining numbers occupies 30%, so the ratios sum to 100%. Using np.random.choice() I was able to generate random numbers drawn with the associated probabilities. However, I need 0 to make up exactly 10% of the entire grid, and each non-zero number exactly 30%. With np.random.choice() this is not always the case (especially if the sample size is small), because I have only assigned probabilities, not exact ratios:
import numpy as np
numbers = np.random.choice([-1,0,1,2],(100,100),p=[0.3,0.1,0.3,0.3])
print(np.count_nonzero(numbers == 0) / numbers.size)  # fraction of zeros: must always be exactly 0.1, but is not guaranteed!
Another idea I had was to initially set the entire matrix to np.zeros((100,100)) and then fill only 90% of it with non-zero elements. However, I don't know how to approach this so that the numbers are distributed randomly on the grid, i.e., at random locations/indices.
Edit: The ratio of each individual non-zero number in the grid depends only on how many cells I want to be empty, i.e., 0 in this case. All non-zero elements must have the same ratio. For example, if I want 20% of the grid to be zeros, each remaining number will have a ratio of (1 - ratio_of_zero)/amount_of_non-zero_elements.
This should do what you want it to (suggested by RemcoGerlich), though I don't know how efficient this method is:
import numpy as np
# Constants
SHAPE = (100, 100)
LENGTH = SHAPE[0] * SHAPE[1]
REST = [-1, 1, 2]
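ZERO_PROB = 10  # percentage of cells that must be 0
NUM_ZERO = round(LENGTH * (ZERO_PROB / 100))
# Divide the remaining cells (not the remaining percentage) among the non-zero
# values, so the counts always add up to LENGTH
NUM_REST, LEFTOVER = divmod(LENGTH - NUM_ZERO, len(REST))
# Make base 1D array
base_arr = [0 for _ in range(NUM_ZERO)]
for n, i in enumerate(REST):
    # If the cells don't divide evenly, the first LEFTOVER values get one extra
    base_arr += [i for _ in range(NUM_REST + (n < LEFTOVER))]
base_arr = np.array(base_arr)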
# Give it a random order
np.random.shuffle(base_arr)
# Finally, reshape the array
result_arr = base_arr.reshape(SHAPE)
Regarding your comment: how flexible this needs to be depends on how many of the numbers are to have different probabilities, I suppose. You could use a for loop that builds an array of the right length for each number and appends it to base_arr. This could of course also be a function that takes these values as arguments, rather than a script with hard-coded constants like this.
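As a rough sketch of that function idea: the name make_grid, its parameters, and the rule for handing out leftover cells are my own assumptions, not something from the question. It uses np.repeat to build the flat array and a NumPy Generator to shuffle it.

```python
import numpy as np

def make_grid(shape, values, zero_ratio, rng=None):
    """Return an array of `shape` with an exact share of zeros.

    Hypothetical helper: the name, the parameters, and the leftover rule
    below are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    size = int(np.prod(shape))
    num_zero = round(size * zero_ratio)
    # Split the remaining cells evenly among the non-zero values; if they
    # don't divide evenly, the first `extra` values get one more cell each
    per_value, extra = divmod(size - num_zero, len(values))
    counts = [per_value + (i < extra) for i in range(len(values))]
    # Build the flat array with exact counts, then shuffle it in place
    flat = np.repeat([0, *values], [num_zero, *counts])
    rng.shuffle(flat)
    return flat.reshape(shape)

# Example: 20% zeros, the rest shared between -1, 1 and 2
grid = make_grid((100, 100), [-1, 1, 2], zero_ratio=0.2, rng=np.random.default_rng(0))
values, counts = np.unique(grid, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
```

Note that with 20% zeros the remaining 8000 cells cannot be split into three equal parts, so the non-zero counts can differ by one cell; whether that is acceptable depends on your use case.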
Edited based on comment.
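The other idea from the question (start with np.zeros and fill 90% of the cells) should also work. Here is a minimal sketch, assuming the variable names below (they are mine) and that drawing distinct flat indices with Generator.choice(..., replace=False) is random enough for your purposes:

```python
import numpy as np

rng = np.random.default_rng()
shape = (100, 100)
rest = [-1, 1, 2]   # the non-zero values
zero_ratio = 0.1    # share of cells that stay 0

grid = np.zeros(shape, dtype=int)
size = grid.size
num_nonzero = size - round(size * zero_ratio)

# Pick distinct flat indices at random; these are the cells to fill
idx = rng.choice(size, size=num_nonzero, replace=False)

# Deal the chosen cells out to the non-zero values in near-equal chunks
# (np.array_split tolerates a count that does not divide evenly)
for value, chunk in zip(rest, np.array_split(idx, len(rest))):
    grid.flat[chunk] = value

print(np.count_nonzero(grid == 0) / grid.size)
```

I have not compared its speed against the shuffle approach.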