Here are the last parts of the code:
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2)
print lda
Console output:
INFO : adding document #0 to Dictionary(0 unique tokens)
INFO : built Dictionary(18 unique tokens) from 5 documents (total 20 corpus positions)
INFO : using serial LDA version on this node
INFO : running online LDA training, 2 topics, 1 passes over the supplied corpus of 5 documents, updating model once every 5 documents
WARNING : too few updates, training might not converge; consider increasing the number of passes to improve accuracy
INFO : PROGRESS: iteration 0, at document #5/5
INFO : 2/5 documents converged within 50 iterations
INFO : topic #0: 0.079*cute + 0.076*broccoli + 0.070*adopted + 0.069*yesterday + 0.069*eat + 0.069*sister + 0.068*kitten + 0.068*kittens + 0.067*bananas + 0.067*chinchillas
INFO : topic #1: 0.082*broccoli + 0.079*cute + 0.071*piece + 0.070*munching + 0.069*spinach + 0.068*hamster + 0.068*ate + 0.067*banana + 0.066*breakfast + 0.066*smoothie
INFO : topic diff=0.470477, rho=1.000000
<gensim.models.ldamodel.LdaModel object at 0x10f1f4050>
So I'm wondering if I'm able to save the resulting topics that it generated to a readable format. I've tried the .save()
method, but it always outputs something unreadable.
A latent Dirichlet allocation (LDA) model is a topic model that discovers underlying topics in a collection of documents and infers per-topic word probabilities. You can visualize the LDA topics as word clouds by displaying words sized according to their topic word probabilities.
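As a minimal sketch of that idea (not part of gensim), the third-party wordcloud and matplotlib packages can render one topic's word probabilities as a word cloud; note that generate_from_frequencies expects a word-to-weight mapping in recent wordcloud versions, and that the pair order returned by show_topic differs between gensim versions:

import matplotlib.pyplot as plt
from wordcloud import WordCloud

# In the gensim version used in this post, show_topic() yields
# (probability, word) pairs (the loops further down index them the same way);
# newer gensim versions return (word, probability), so swap the unpacking if needed.
freqs = dict((word, prob) for prob, word in lda.show_topic(0, 20))

wc = WordCloud(background_color='white').generate_from_frequencies(freqs)
plt.imshow(wc)
plt.axis('off')
plt.show()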
Latent Dirichlet Allocation (LDA) is a popular algorithm for topic modeling, with excellent implementations in Python's Gensim package. The challenge, however, is extracting topics that are clear, well separated, and meaningful.
chunksize - number of documents to consider at once (affects memory consumption)
update_every - update the model every update_every chunks, i.e. every update_every * chunksize documents (essentially a memory-consumption optimization)
passes - how many times the algorithm passes over the whole corpus
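For example, here is a minimal sketch of where these parameters go in the LdaModel call from the question (the values are arbitrary placeholders, not recommendations):

lda = LdaModel(corpus=corpus,
               id2word=dictionary,
               num_topics=2,
               chunksize=100,     # documents held in memory per training chunk
               update_every=1,    # update the model after every chunk
               passes=10)         # full sweeps over the corpus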
Here is how to save a model for gensim LDA:
from gensim import corpora, models, similarities
# create corpus and dictionary
corpus = ...
dictionary = ...
# train the model; this might take some time
model = models.LdaModel(corpus=corpus, id2word=dictionary, num_topics=200, passes=5, alpha='auto')
# save model to disk (no need to use pickle module)
model.save('lda.model')
To print topics, here are a few ways:
# later on, load trained model from file
model = models.LdaModel.load('lda.model')
# print all topics
model.show_topics(topics=200, topn=20)
# print a single topic, e.g. topic 109
model.print_topic(109, topn=20)
# another way
for i in range(model.num_topics):
    print model.print_topic(i)
# and another way, only prints top words
for t in range(model.num_topics):
    print 'topic {}: '.format(t) + ', '.join([v[1] for v in model.show_topic(t, 20)])
You just need to use lda.show_topics(topics=-1), or any number of topics you want (topics=10, topics=15, topics=1000, ...). I usually just do:
logfile = open('.../yourfile.txt', 'a')
print>>logfile, lda.show_topics(topics=-1, topn=10)
All these parameters and others are available in the gensim documentation.
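If you want something more readable than dumping show_topics() wholesale, here is a small sketch (plain Python 2, matching the rest of this post) that writes one topic per line; the file name is a placeholder, and it assumes the (probability, word) pair order used in the loops above:

out = open('topics.txt', 'w')  # placeholder file name
for t in range(lda.num_topics):
    words = ', '.join(word for prob, word in lda.show_topic(t, 10))
    out.write('topic {}: {}\n'.format(t, words))
out.close()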
You may also use the pickle module.
import pickle
# your code
pickle.dump(lda, open(filename, 'wb'))
# you may load it back again
lda_copy = pickle.load(open(filename, 'rb'))