
Negative Values: Evaluate Gensim LDA with Topic Coherence

I'm currently trying to evaluate my topic models with gensim's CoherenceModel:

from gensim.models.coherencemodel import CoherenceModel

cm_u_mass = CoherenceModel(model=model1, corpus=corpus1, coherence='u_mass')
coherence_u_mass = cm_u_mass.get_coherence()

print('\nCoherence Score: ', coherence_u_mass)

The output contains only negative values. Is this correct? Can anybody provide a formula or explain how u_mass works?

Nils_Denter asked Dec 24 '22 07:12


1 Answer

A quick look at the original article shows that UMass coherence is computed from logarithms of probabilities. A probability is at most 1, so its logarithm is at most 0, which is why negative scores are expected.

As for the formula you asked about, it is given as equation 4 in the same article.
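For orientation, here is the form in which UMass coherence is commonly written (usually attributed to Mimno et al., 2011). This is a sketch from memory, not a copy of the article's equation, so the notation and the smoothing term there may differ. The symbols are assumptions of this sketch: $t$ is a topic, $v_1, \dots, v_M$ are its top $M$ words ordered by probability, $D(v)$ is the number of documents containing $v$, and $D(v_m, v_l)$ is the number of documents containing both words.

```latex
C_{\mathrm{UMass}}(t) = \sum_{m=2}^{M} \sum_{l=1}^{m-1} \log \frac{D(v_m, v_l) + 1}{D(v_l)}
```

Each term is the log of a smoothed estimate of the conditional probability $P(v_m \mid v_l)$. Because $D(v_m, v_l) \le D(v_l)$, the ratio is usually below 1 and the log is negative. With this particular $+1$ smoothing, a single term can be slightly positive when two words always co-occur. Implementations may smooth or aggregate differently (for example, averaging over pairs instead of summing), so absolute values are not necessarily comparable across libraries.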

As I understand it, the closer the UMass coherence is to 0, the better the topic coherence.
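To make the "closer to 0 is better" reading concrete, here is a minimal pure-Python sketch of a UMass-style score on a toy corpus. It is not gensim's code: the function name `umass_coherence`, the tiny `epsilon` smoothing, the averaging over word pairs, and the example documents are all assumptions chosen for illustration.

```python
import math


def umass_coherence(topic_words, documents, epsilon=1e-12):
    """Mean log conditional probability over ordered word pairs of one topic.

    Hypothetical helper for illustration: document frequencies are counted
    on the given documents, and epsilon only guards against log(0).
    """
    doc_sets = [set(doc) for doc in documents]
    num_docs = len(doc_sets)

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for m in range(1, len(topic_words)):
        for l in range(m):
            w_m, w_l = topic_words[m], topic_words[l]
            d_l = doc_freq(w_l)
            if d_l == 0:
                continue  # conditioning word never occurs; skip the pair
            p_joint = doc_freq(w_m, w_l) / num_docs
            p_l = d_l / num_docs
            # log of (smoothed) P(w_m | w_l); at most about 0
            scores.append(math.log((p_joint + epsilon) / p_l))
    return sum(scores) / len(scores) if scores else 0.0


# Toy corpus (made up for this example).
docs = [
    ["cat", "dog", "pet"],
    ["cat", "pet"],
    ["dog", "bone"],
    ["stock", "market"],
    ["market", "price", "stock"],
]

coherent = umass_coherence(["pet", "cat", "dog"], docs)
mixed = umass_coherence(["pet", "stock", "bone"], docs)
print("coherent topic:", coherent)
print("mixed topic:   ", mixed)
```

Both scores come out negative. The topic whose words share documents lands much closer to 0 than the topic whose words never co-occur, which is the behaviour described above.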

Hope this helps.

Francisco Nicolai Manaut answered Feb 02 '23 10:02