Latent Variable Discovery and Topic Modeling
Topic modeling is a popular unsupervised latent variable learning problem. Representing each document by its empirical distribution over words, topic modeling is defined as estimating k distributions over the vocabulary such that every document can be written as a convex combination of these distributions. The k distributions, stacked together, are called the topic matrix. This problem is closely related to Nonnegative Matrix Factorization (NMF). Recently, it has been shown that NMF can be solved efficiently under the so-called separability assumption on the topic matrix. We aim to provide a suite of computationally and statistically efficient algorithms for this problem.
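As a rough illustration of the separability idea (a minimal NumPy sketch, not the algorithms from the papers below): under separability, each topic has an "anchor" word that occurs only under that topic, so when each word's row of the word-by-document matrix is normalized to a probability vector, the anchor rows are the extreme points of the convex hull of all rows. A simple greedy farthest-point selection then recovers them on a toy corpus:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy separable topic matrix: 3 topics over 9 words.
# Words 0, 1, 2 are anchor words -- each appears under exactly one
# topic, which is the separability assumption.
k, vocab, n_docs = 3, 9, 200
topics = np.zeros((k, vocab))
for t in range(k):
    topics[t, t] = 0.4        # anchor word, exclusive to topic t
    topics[t, k:] = 0.1       # six shared words, common to all topics

weights = rng.dirichlet(np.ones(k), size=n_docs)  # doc-topic mixtures
docs = weights @ topics                           # row d: word distribution of doc d

# Row-normalize the word-by-document matrix: row w is word w's
# distribution over documents.  Under separability, the anchor words'
# rows are the extreme points of the convex hull of all rows.
X = docs.T
rows = X / X.sum(axis=1, keepdims=True)

# Greedy farthest-point selection of k extreme points (a simplified
# stand-in for the data-dependent and random-projection selection
# rules analyzed in the papers below).
anchors = [int(np.argmax(np.linalg.norm(rows - rows.mean(axis=0), axis=1)))]
while len(anchors) < k:
    dist = np.min(
        [np.linalg.norm(rows - rows[a], axis=1) for a in anchors], axis=0
    )
    anchors.append(int(np.argmax(dist)))

print(sorted(anchors))  # recovers the anchor words 0, 1, 2
```

Once the anchor words are identified, the remaining rows can be expressed as convex combinations of the anchor rows, which yields the document-topic weights and, in turn, the topic matrix.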
- W. Ding, M. H. Rohban, P. Ishwar, V. Saligrama, "Topic Discovery Through Data Dependent and Random Projections," International Conference on Machine Learning (ICML), 2013 (accepted, to appear), arXiv:1303.3664 [stat.ML].
- W. Ding, M. H. Rohban, P. Ishwar, V. Saligrama, "A New Geometric Approach to Latent Topic Modeling and Discovery," ICASSP, 2013, arXiv:1301.0858 [stat.ML].