
Fit a multivariate Gaussian distribution to a given dataset

I need to fit a multivariate Gaussian distribution, i.e. obtain the mean vector and covariance matrix of the nearest multivariate Gaussian, for a given dataset of audio features in Python. The audio features (MFCC coefficients) are an N x 13 matrix, where N is around 4K. Can someone please outline the packages and technique to fit the Gaussian to this data in Python?

Global Sink asked Dec 01 '14 14:12



1 Answer

Use the NumPy package: numpy.mean and numpy.cov give you the Gaussian parameter estimates. Since you have 13 attributes (columns) and N observations (rows), set rowvar=0 (equivalently, rowvar=False) when calling numpy.cov on your N x 13 matrix, or pass the transpose of the matrix as the function argument instead.

If your data are in a NumPy array named data:

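import numpy as np
mean = np.mean(data, axis=0)      # mean vector, shape (13,)
cov = np.cov(data, rowvar=False)  # covariance matrix, shape (13, 13); same as rowvar=0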
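
For a version that runs end to end, here is a minimal sketch. It uses randomly generated numbers as a stand-in for the N x 13 MFCC matrix, so the data, the seed, and the names n_samples, n_features and cov_mle are assumptions for illustration only. It also shows that numpy.cov divides by N - 1 by default, and that passing bias=True divides by N instead, which gives the maximum-likelihood estimate.

```python
import numpy as np

# Synthetic stand-in for the N x 13 MFCC matrix (hypothetical data, not the asker's).
rng = np.random.default_rng(0)
n_samples, n_features = 4000, 13
data = rng.normal(size=(n_samples, n_features))

# Mean vector: average over observations (rows), one value per feature.
mean = np.mean(data, axis=0)

# Covariance matrix: rowvar=False tells np.cov that columns are the variables.
# By default np.cov divides by N - 1 (the unbiased estimate).
cov = np.cov(data, rowvar=False)

# The maximum-likelihood estimate divides by N instead; bias=True selects that.
cov_mle = np.cov(data, rowvar=False, bias=True)

print(mean.shape, cov.shape)  # (13,) (13, 13)
```

With N around 4K the two covariance estimates differ only by a factor of (N - 1) / N, so for this dataset either choice should give nearly the same result.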
bogatron answered Nov 20 '22 11:11
