I've built a parallel topic model using MALLET, and I want to get the top words for each document.
To do that, I'm trying to get a word-topic probability matrix.
How would I achieve this?
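When you train topics with MALLET, you can pass the option --word-topic-counts-file followed by a file name. MALLET then writes one line per word type to that file: the word's index, the word itself, and a topic:count pair for each topic the word was assigned to. Note that these are raw counts, not probabilities, so to get P(word | topic) you divide each count by the total for its topic (optionally after adding the smoothing parameter beta). If you would rather have (topic, word, weight) values, one per line, train-topics also has --topic-word-weights-file, which writes unnormalized weights that you normalize the same way. You can later read either file in C, Java, R, or any other language to create the matrix you want.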
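For example, here is a minimal Python sketch of reading the counts file and normalizing it into a topic-by-word probability matrix. It assumes each line looks like `typeIndex word topic:count topic:count ...`, which is the layout I have seen from --word-topic-counts-file; check a few lines of your own output first. The function names and the default `beta` value are my own choices, not part of MALLET.

```python
from collections import defaultdict


def read_word_topic_counts(lines):
    """Parse lines of the assumed form 'typeIndex word topic:count ...'.

    Returns the vocabulary (in file order) and a {topic: {word: count}} mapping.
    """
    counts = defaultdict(dict)
    vocab = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        word = fields[1]
        vocab.append(word)
        for pair in fields[2:]:
            topic, count = pair.split(":")
            counts[int(topic)][word] = int(count)
    return vocab, counts


def topic_word_probabilities(vocab, counts, beta=0.01):
    """Return matrix[topic][word_index] = P(word | topic).

    beta is an additive smoothing value (an assumption; use the beta your
    model was trained with, or 0 for plain relative frequencies).
    """
    num_topics = max(counts) + 1 if counts else 0
    matrix = []
    for topic in range(num_topics):
        topic_counts = counts.get(topic, {})
        row = [topic_counts.get(word, 0) + beta for word in vocab]
        total = sum(row)
        matrix.append([value / total if total else 0.0 for value in row])
    return matrix


# Usage (the file name is hypothetical):
# with open("word-topic-counts.txt", encoding="utf-8") as f:
#     vocab, counts = read_word_topic_counts(f)
# matrix = topic_word_probabilities(vocab, counts)
```

Each row of the result sums to 1, so sorting a row in descending order gives the top words for that topic. To get top words per document you would still need to combine this with the per-document topic proportions.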