Using TF-IDF with other features in SKLearn
30-10-2019
Question
What is the best or correct way to combine text analysis with other features? For example, I have a dataset with some text but also other features or categories. scikit-learn's TF-IDF vectorizer transforms text data into a sparse matrix, and I can pass that sparse matrix directly to a Naive Bayes classifier, for example. But how do I also take the other features into account? Should I convert the TF-IDF representation of the text to a dense array and combine the features and the text into one DataFrame? Or can I keep the sparse matrix as a separate column, for example? What is the correct way to do this?
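One possible approach, sketched here under assumptions rather than offered as the definitive answer, is to let scikit-learn's `ColumnTransformer` apply `TfidfVectorizer` to the text column and other transformers to the remaining columns, then stack the results into a single feature matrix that stays sparse. The column names (`text`, `category`, `num_links`), the toy rows, and the labels below are hypothetical and exist only for illustration.

```python
import pandas as pd
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical toy data: one text column, one categorical, one numeric.
df = pd.DataFrame({
    "text": [
        "cheap pills buy now",
        "meeting at noon",
        "buy cheap watches",
        "lunch meeting tomorrow",
    ],
    "category": ["promo", "work", "promo", "work"],
    "num_links": [19, 15, 17, 22],
})
y = [1, 0, 1, 0]  # hypothetical labels

preprocess = ColumnTransformer(
    transformers=[
        # A plain string selects the column as 1-D input, which is what
        # text vectorizers expect.
        ("tfidf", TfidfVectorizer(), "text"),
        # A list selects a 2-D slice, as OneHotEncoder and scalers expect.
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["category"]),
        # MinMaxScaler keeps training values in [0, 1]; MultinomialNB
        # needs non-negative features.
        ("num", MinMaxScaler(), ["num_links"]),
    ],
    # Keep the stacked output sparse even on this tiny, fairly dense example.
    sparse_threshold=1.0,
)

model = Pipeline([("features", preprocess), ("clf", MultinomialNB())])
model.fit(df, y)

# Inspect the combined feature matrix produced by the fitted transformer.
X = model.named_steps["features"].transform(df)
print(sparse.issparse(X), X.shape)
print(model.predict(df))
```

With this layout there is no need to convert the TF-IDF output to a dense array or to store it in a DataFrame column. If the features are built by hand instead, `scipy.sparse.hstack` can join the TF-IDF matrix with the other feature arrays in a similar way.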
No correct solution
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange