Given a 2D NumPy array dist with shape (200, 200), where each entry represents the joint probability of (x1, x2) for all x1, x2 ∈ {0, 1, ..., 199}, how do I sample bivariate data x = (x1, x2) from this probability distribution using the NumPy or SciPy API?
Indexing a two-dimensional array: to access an element, use two indices, one for the row and one for the column. Both indices start at 0, so the element in the fourth row and second column is at row index 3 and column index 1.
To create a NumPy array, use the function np.array(). For a simple array, pass it a list; you can optionally specify the data type of the elements.
2D arrays are also called matrices and can be represented as a collection of rows and columns. NumPy is a Python library that adds support for large multidimensional arrays and matrices, along with high-level mathematical functions that operate on them.
This solution works with probability distributions of any number of dimensions, assuming the array is a valid probability distribution (its entries must be non-negative and sum to 1). It flattens the distribution, samples a flat index from it, and converts that index back into coordinates in the original array shape.
import numpy as np
# `array` is the joint probability array (dist in the question)
# Create a flat copy of the array
flat = array.flatten()
# Then, sample an index from the 1D array with the
# probability distribution from the original array
sample_index = np.random.choice(a=flat.size, p=flat)
# Take this index and adjust it so it matches the original array
adjusted_index = np.unravel_index(sample_index, array.shape)
print(adjusted_index)
Also, to get multiple samples, add a size keyword argument to the np.random.choice call, and modify adjusted_index before printing it:
adjusted_index = np.array(list(zip(*adjusted_index)))
This is necessary because np.unravel_index, given an array of flat indices, returns a tuple with one array of indices per coordinate dimension, so this zips them into an array of coordinate tuples. (In Python 3, zip returns an iterator, hence the list call.) This is also much more efficient than repeating the first snippet once per sample.
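As a minimal, self-contained sketch of the multi-sample version: the helper name sample_joint and the toy 3x4 pmf below are illustrative and not part of the answer, and the sketch assumes the newer np.random.default_rng generator in place of the legacy np.random.choice call.

```python
import numpy as np

def sample_joint(pmf, n, rng=None):
    """Draw n index tuples from an N-dimensional joint pmf (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    pmf = np.asarray(pmf, dtype=float)
    flat = pmf.ravel()  # 1D view of the probabilities
    # Sample n flat indices, each with probability flat[k]
    flat_idx = rng.choice(flat.size, size=n, p=flat)
    # Convert flat indices to one coordinate array per dimension
    coords = np.unravel_index(flat_idx, pmf.shape)
    # Stack into an (n, ndim) array: one row per sample
    return np.column_stack(coords)

# Toy joint pmf (assumed for illustration); entries sum to 1
toy = np.array([[0.10, 0.05, 0.05, 0.00],
                [0.20, 0.10, 0.00, 0.10],
                [0.05, 0.05, 0.20, 0.10]])
samples = sample_joint(toy, n=5, rng=np.random.default_rng(0))
print(samples.shape)  # (5, 2)
```

Each row of samples is one (x1, x2) pair; for the 200x200 case, pass dist in place of toy.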
Relevant documentation:
np.random.choice
np.unravel_index
Here's a way, but I'm sure there's a much more elegant solution using scipy.
numpy.random doesn't deal with 2D pmfs, so you have to do some reshaping gymnastics to go this way.
import numpy as np
# construct a toy joint pmf
dist = np.random.random(size=(200, 200))  # here's your joint pmf
dist /= dist.sum()  # it has to be normalized
# generate the set of all x,y pairs represented by the pmf;
# moving the coordinate axis last gives pairs[i, j] == (i, j),
# whereas a plain .T would give (j, i) and swap x and y
pairs = np.moveaxis(np.indices(dimensions=(200, 200)), 0, -1)
# make n random selections from the flattened pmf without replacement
# whether you want replacement depends on your application
# (use replace=True for independent draws)
n = 50
inds = np.random.choice(np.arange(200**2), p=dist.reshape(-1), size=n, replace=False)
# inds is the set of n randomly chosen indices into the flattened dist array,
# so the random x,y selections come from selecting the associated
# elements of the flattened pairs array
selections = pairs.reshape(-1, 2)[inds]
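One way to sanity-check either approach is to draw many samples from a small pmf and compare the empirical frequencies with the target. The sketch below assumes a toy 2x3 pmf and a fixed seed, both chosen only for illustration; a non-square, asymmetric pmf is used so that an accidental swap of x1 and x2 would show up.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy asymmetric joint pmf (assumed for illustration); entries sum to 1
pmf = np.array([[0.1, 0.2, 0.3],
                [0.2, 0.1, 0.1]])

n = 200_000
# Sample flat indices with replacement, then recover (x1, x2) coordinates
flat_idx = rng.choice(pmf.size, size=n, p=pmf.ravel())
x1, x2 = np.unravel_index(flat_idx, pmf.shape)

# Count how often each (x1, x2) cell was drawn
counts = np.zeros(pmf.shape)
np.add.at(counts, (x1, x2), 1)
empirical = counts / n

# Largest deviation between empirical frequencies and the target pmf
max_abs_err = np.abs(empirical - pmf).max()
print(max_abs_err)
```

With this many draws the deviation should be small, on the order of the sampling noise, and it shrinks as n grows.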