
Python: Generate random values from empirical distribution

In Java, I usually rely on the org.apache.commons.math3.random.EmpiricalDistribution class to do the following:

  • Derive a probability distribution from observed data.
  • Generate random values from this distribution.

Is there any Python library that provides the same functionality? It seems like scipy.stats.gaussian_kde.resample does something similar, but I'm not sure if it implements the same procedure as the Java type I'm familiar with.

asked Feb 16 '16 by Carlos Gavidia-Calderon

People also ask

How do we simulate random numbers from empirical discrete distributions?

Random values are generated by applying the probability integral transform to the empirical CDF using a uniformly distributed random variable U on the interval [0, 1]. If U corresponds to the CDF probability of a particular empirical observation, that observation is selected.
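As a rough illustration (not from the original page; the helper name sample_empirical is made up), the same inverse-transform idea can be written in a few lines of numpy:

import numpy as np

rng = np.random.default_rng(0)
observed = rng.normal(size=1000)  # stands in for the observed data

def sample_empirical(data, n):
    # Inverse-transform sampling from the empirical CDF of `data`:
    # draw U ~ Uniform[0, 1] and return the smallest order statistic
    # whose empirical CDF value is >= U.
    sorted_data = np.sort(data)
    cdf = np.arange(1, sorted_data.size + 1) / sorted_data.size
    u = rng.uniform(0.0, 1.0, size=n)
    idx = np.searchsorted(cdf, u)
    return sorted_data[np.minimum(idx, sorted_data.size - 1)]

new_values = sample_empirical(observed, 10)

For a plain empirical CDF this is equivalent to resampling the observations with replacement (e.g. rng.choice(data, size=n)).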

How do you simulate an empirical distribution?

Find the random number on the vertical (CDF) axis and move right until you intersect the distribution curve, then go down to the horizontal axis and record the value there; that value is the sample. For example, with a random number of 0.44, the sample is whatever x-value the curve takes at a cumulative probability of 0.44.
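Numerically, this graphical lookup amounts to inverting the empirical CDF; a minimal sketch (assuming linear interpolation between observed points, which is an extra assumption beyond the description above) is:

import numpy as np

rng = np.random.default_rng(0)
data = np.sort(rng.normal(size=1000))          # observed sample, sorted
cdf = np.arange(1, data.size + 1) / data.size  # empirical CDF values

# "Enter" at u on the vertical axis and read the sample off the
# horizontal axis; np.interp performs the same lookup numerically.
u = 0.44
sample = np.interp(u, cdf, data)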

How do you find the empirical distribution in Python?

The EDF is calculated by ordering the unique observations in the data sample and computing, for each one, the cumulative probability: the number of observations less than or equal to that value divided by the total number of observations. That is: EDF(x) = (number of observations <= x) / n.
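A direct numpy translation of that definition (a sketch, not a library API; the function name edf is made up) could be:

import numpy as np

def edf(sample, x):
    # EDF(x) = (number of observations <= x) / n
    sample = np.asarray(sample)
    return np.count_nonzero(sample <= x) / sample.size

rng = np.random.default_rng(0)
data = rng.normal(size=1000)
print(edf(data, 0.0))  # roughly 0.5 for a standard normal sample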


1 Answer

import numpy as np
import scipy.stats
import matplotlib.pyplot as plt

# This represents the original "empirical" sample -- I fake it by
# sampling from a normal distribution
orig_sample_data = np.random.normal(size=10000)

# Generate a KDE from the empirical sample
sample_pdf = scipy.stats.gaussian_kde(orig_sample_data)

# Sample new datapoints from the KDE
new_sample_data = sample_pdf.resample(10000).T[:,0]

# Histogram of initial empirical sample
cnts, bins, p = plt.hist(orig_sample_data, label='original sample', bins=100,
                         histtype='step', linewidth=1.5, density=True)

# Histogram of datapoints sampled from KDE
plt.hist(new_sample_data, label='sample from KDE', bins=bins,
         histtype='step', linewidth=1.5, density=True)

# Visualize the kde itself
y_kde = sample_pdf(bins)
plt.plot(bins, y_kde, label='KDE')
plt.legend()
plt.show(block=False)

[resulting plot: overlaid histograms of the original sample and the KDE resample, with the KDE curve]

new_sample_data should be drawn from roughly the same distribution as the original data (to the degree that the KDE is a good approximation to the original distribution).
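One way to check that claim (this check is not part of the original answer) is a two-sample Kolmogorov-Smirnov test between the original and resampled data; note that the KDE's smoothing bandwidth means the match is only approximate:

import scipy.stats

# Large p-value: no evidence the two samples come from different
# distributions; small statistic: the empirical CDFs are close.
stat, pvalue = scipy.stats.ks_2samp(orig_sample_data, new_sample_data)
print(f"KS statistic = {stat:.4f}, p-value = {pvalue:.4f}")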

answered Sep 30 '22 by abeboparebop