How to convert log probability into simple probability between 0 and 1 values using python

Tags:

I am using Gaussian mixture model for speaker identification. I use this code to predict the speaker for each voice clip.

for path in file_paths:   
    path = path.strip()   
    print (path)
    sr,audio = read(source + path)
    vector   = extract_features(audio,sr)
    #print(vector)
    log_likelihood = np.zeros(len(models))
    #print(len(log_likelihood))

    for i in range(len(models)):
        gmm1   = models[i]  #checking with each model one by one
        #print(gmm1)
        scores = np.array(gmm1.score(vector)) 
        #print(scores)
        #print(len(scores))
        log_likelihood[i] = scores.sum()
        print(log_likelihood)
        winner = np.argmax(log_likelihood)
        #print(winner)
    print ("\tdetected as - ", speakers[winner])

and it gives me the output like this:

[ 311.79769716    0.            0.            0.            0.        ]
[  311.79769716 -5692.56559902     0.             0.             0.        ]
[  311.79769716 -5692.56559902 -6170.21460788     0.             0.        ]
[  311.79769716 -5692.56559902 -6170.21460788 -6736.73192695     0.        ]
[  311.79769716 -5692.56559902 -6170.21460788 -6736.73192695 -6753.00196447]
    detected as -  bart

Here score function gives me the log probability for each speaker. Now i want to decide threshold value, for that i need these log probability value into simple probability value (between 0 to 1). How can i do that? I am using python software.

623

asked Jan 26 '18 16:01

Sandeep

2 Answers

You have to take exponent (np.exp()) of the log probabilities to get the actual probabilities back. It's because logarithm is the inverse of exponentiation: e^log(p) = p, where p are the probabilities.

Below is an example:

# some input array
In [9]: a
Out[9]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# converting to probabilities using "softmax"
In [10]: probs = np.exp(a) / (np.exp(a)).sum()

# sanity check
In [11]: probs.sum()
Out[11]: 1.0

# obtaining log probabilities
In [12]: log_probs = np.log(probs)

In [13]: log_probs
Out[13]: 
array([-8.45855173, -7.45855173, -6.45855173, -5.45855173, -4.45855173,
       -3.45855173, -2.45855173, -1.45855173, -0.45855173])

# In most cases, it won't sum to 1.0
In [14]: log_probs.sum()
Out[14]: -40.126965551706405

# get the probabilities back
In [15]: probabilities = np.exp(log_probs)

In [16]: probabilities.sum()   # check passed
Out[16]: 1.0

In [17]: probabilities
Out[17]: 
array([  2.12078996e-04,   5.76490482e-04,   1.56706360e-03,
         4.25972051e-03,   1.15791209e-02,   3.14753138e-02,
         8.55587737e-02,   2.32572860e-01,   6.32198578e-01])

175

answered Oct 06 '22 16:10

kmario23

The GMM module's score_sample from sklearn gives the probability density and they won't sum to 0, rather integrate to 1.

data = 10 * np.random.rand(100)
model = mixture.GMM(n_components=1).fit(data[:, None])
xfit = np.linspace(-5, 15, 5000)
logprob, _ = model.score_samples(xfit[:, None])
dx = xfit[1] - xfit[0]
print(dx * np.sum(np.exp(logprob)))
# 0.999773872653

You can also calculate the probability of a data point belonging to a multivariate normal distribution.,

Source: https://github.com/scikit-learn/scikit-learn/issues/4202

answered Oct 06 '22 16:10

Timoth Dev A

Related questions
                            
                                Jupyter: install new modules
                            
                                How can I keep leading zeros in a column, when I export to CSV?
                            
                                Python OOP: how to share a MongoDB connection with all classes
                            
                                Using Beautiful Soup to find specific class
                            
                                Update pip3 for Python 3.6?
                            
                                Can't import moviepy.editor
                            
                                Login to website using python requests
                            
                                Finding median of entire pandas Data frame
                            
                                Python pandas datareader no longer works for yahoo-finance changed url
                            
                                Matplotlib: how to make imshow read x,y coordinates from other numpy arrays?
                            
                                filter object becomes empty after iteration? [duplicate]
                            
                                Python access dictionary inside list of a dictionary
                            
                                Django - disable one of system checks
                            
                                python:pandas - How to combine first two rows of pandas dataframe to dataframe header?
                            
                                Error when executing compiled file " No module named 'scipy._lib.messagestream' "after using pyinstaller
                            
                                Matplotlib how to set legend's font type
                            
                                Load xlsx file from drive in colaboratory
                            
                                how to featureUnion numerical and text features in python sklearn properly
                            
                                Compute Euclidean distance between rows of two pandas dataframes
                            
                                Git/Heroku - How to hide my SECRET_KEY?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to convert log probability into simple probability between 0 and 1 values using python

Tags:

python

logarithm

gaussian

probability-distribution

gmm

Sandeep

People also ask

2 Answers

kmario23

Timoth Dev A

Recent Activity

Donate For Us