Topic modeling with latent Dirichlet allocation.
lda implements latent Dirichlet allocation (LDA) using collapsed Gibbs
sampling. LDA is described in Blei et al. (2003) and Pritchard et al. (2000).
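As a rough illustration of what collapsed Gibbs sampling does, here is a minimal pure-Python sketch. It is not the code lda itself runs. The function name `collapsed_gibbs_lda`, the toy documents, and the hyperparameter values `alpha` and `eta` are assumptions made for this example. Each pass removes one token's topic assignment from the counts, resamples its topic from the conditional distribution given all other assignments, and adds it back.

```python
import random


def collapsed_gibbs_lda(docs, n_topics, vocab_size, n_iter=200,
                        alpha=0.1, eta=0.01, seed=0):
    """Toy collapsed Gibbs sampler; docs is a list of lists of word ids."""
    rng = random.Random(seed)
    ndz = [[0] * n_topics for _ in docs]               # document-topic counts
    nzw = [[0] * vocab_size for _ in range(n_topics)]  # topic-word counts
    nz = [0] * n_topics                                # tokens assigned to each topic
    z = []                                             # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(n_topics)
            z[d].append(k)
            ndz[d][k] += 1
            nzw[k][w] += 1
            nz[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                ndz[d][k] -= 1
                nzw[k][w] -= 1
                nz[k] -= 1
                # Unnormalized p(topic | all other assignments).
                weights = [
                    (ndz[d][t] + alpha) * (nzw[t][w] + eta)
                    / (nz[t] + vocab_size * eta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                z[d][i] = k
                ndz[d][k] += 1
                nzw[k][w] += 1
                nz[k] += 1

    # Smoothed, normalized estimates of both distributions.
    doc_topic = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(ndz, docs)
    ]
    topic_word = [
        [(c + eta) / (nz[t] + vocab_size * eta) for c in nzw[t]]
        for t in range(n_topics)
    ]
    return doc_topic, topic_word


# Four toy documents over a four-word vocabulary (word ids 0..3).
toy_docs = [[0, 0, 1, 1], [0, 1, 0], [2, 3, 3], [3, 2, 2, 3]]
doc_topic, topic_word = collapsed_gibbs_lda(toy_docs, n_topics=2, vocab_size=4)
```

Each row of `doc_topic` and of `topic_word` sums to one. The sketch trades speed for readability and keeps every count in plain Python lists.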
pip install lda
lda.LDA implements latent Dirichlet allocation (LDA). The interface follows
conventions found in scikit-learn.
>>> import numpy as np
>>> import lda
>>> X = np.array([[1,1], [2, 1], [3, 1], [4, 1], [5, 8], [6, 1]])
>>> model = lda.LDA(n_topics=2, n_iter=100, random_state=1)  # n_iter value is illustrative
>>> doc_topic = model.fit_transform(X)  # estimate of document-topic distributions
>>> model.components_  # estimate of topic-word distributions; model.topic_word_ is an alias
Python 2.7 or Python 3.3+ is required. Additional packages, including NumPy, are also required.
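The array X in the example above is assumed here to be a document-term count matrix, with one row per document and one column per vocabulary word. The following sketch shows one way such counts might be built from raw text. The sample sentences and the whitespace tokenization are illustrative only, and real corpora usually need more careful preprocessing.

```python
from collections import Counter

# Two hypothetical documents, tokenized naively on whitespace.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
]
tokenized = [doc.split() for doc in docs]

# Fixed column order: one column per distinct word, sorted alphabetically.
vocab = sorted({word for tokens in tokenized for word in tokens})

# One row of integer counts per document.
rows = []
for tokens in tokenized:
    counts = Counter(tokens)
    rows.append([counts[word] for word in vocab])

# Assumption: np.array(rows) would then play the role of X.
```

Keeping `vocab` alongside the matrix lets each column of the topic-word estimate be mapped back to a word.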
lda aims for simplicity. If you are working with very large corpora or want
to use faster and more sophisticated topic models, consider using hca or
MALLET. hca is written in C and MALLET is written in Java. Unlike
lda, hca can use more than one processor at a time.
- Documentation: http://pythonhosted.org/lda
- Source code: https://github.com/ariddell/lda/
- Issue tracker: https://github.com/ariddell/lda/issues
lda is licensed under Version 2.0 of the Mozilla Public License.