Question

I want to use Latent Semantic Analysis for a small app I'm building, but I don't want to build up the matrices myself. (Partly because the documents I have wouldn't make a very good training collection, since they're kinda short and heterogeneous, and partly because I just got a new computer and I'm finding it a pain to install the linear algebra libraries and the like that I would need.)

Are there any "default"/pre-built LSA implementations available? For example, things I'm looking for include:

  • Default U, S, V matrices (i.e., if D is a term-document matrix from some training set, then D = U S V^T is its singular value decomposition), so that given any query vector q, I can use these matrices to compute the LSA projection of q myself.
  • Some black-box LSA algorithm that, given a query vector q, returns the LSA projection of q.

Solution

You'd probably be interested in the Gensim framework for Python; notably, it includes an example of building the appropriate matrices from English Wikipedia.
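Once truncated U and S matrices are available from some tool, the projection step from the question's first bullet needs no linear algebra library. Below is a minimal sketch of the usual "fold-in" formula, q_hat = S^{-1} U^T q, which follows from D = U S V^T by treating the query as a pseudo-document. The function name `fold_in` and the toy matrices are made up for illustration and are not part of any library; some LSA variants also skip the S^{-1} scaling, so treat that choice as an assumption.

```python
def fold_in(q, U, s):
    """Project a term-space vector q into a k-dimensional LSA space.

    q: list of m term weights (same term order as the rows of U)
    U: m x k matrix as a list of rows (truncated left singular vectors)
    s: list of the k largest singular values
    Returns the list S^{-1} U^T q of length k.
    """
    if len(U) != len(q):
        raise ValueError("q must have one entry per row of U")
    k = len(s)
    # Component j is the dot product of column j of U with q, divided by s[j].
    return [
        sum(U[i][j] * q[i] for i in range(len(q))) / s[j]
        for j in range(k)
    ]


# Toy example (hypothetical values): 3 terms, 2 latent dimensions.
U = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]
s = [2.0, 0.5]
q = [4.0, 1.0, 7.0]

q_hat = fold_in(q, U, s)
print(q_hat)
```

The third term has no weight in either latent dimension here, so its count in q does not affect the result.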

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow