Using TF-IDF with other features in SKLearn

https://datascience.stackexchange.com/questions/22813

python
scikit-learn
pandas
tfidf

30-10-2019

Question

What is the best/correct way to combine text analysis with other features? For example, I have a dataset with some text but also other features/categories. scikit-learn's TF-IDF vectoriser (TfidfVectorizer) transforms text data into sparse matrices, which I can pass directly to a Naive Bayes classifier, for example. But how do I also take the other features into account? Should I convert the TF-IDF representation of the text to a dense array and combine the features and the text into one DataFrame? Or can I keep the sparse matrix as a separate column? What is the correct way to do this?
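For illustration, here is a minimal sketch of one common way to do this in scikit-learn: route each column through its own transformer with ColumnTransformer, so the TF-IDF output never has to be densified or stored in a DataFrame. The toy data, the column names (text, category, length, label) and the choice of LogisticRegression are assumptions made for the example and are not from the question. LogisticRegression is swapped in because the standardised numeric column can be negative, which MultinomialNB does not accept.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "text": ["cheap pills buy now", "meeting at noon",
             "buy cheap watches", "lunch at noon tomorrow"],
    "category": ["promo", "work", "promo", "personal"],
    "length": [19, 15, 17, 22],
    "label": [1, 0, 1, 0],
})

preprocess = ColumnTransformer(
    transformers=[
        # A plain string selects one column as 1-D input,
        # which is what TfidfVectorizer expects.
        ("tfidf", TfidfVectorizer(), "text"),
        # Lists select 2-D input for the other transformers.
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["category"]),
        ("num", StandardScaler(), ["length"]),
    ]
)

# Preprocessing and classifier live in one estimator, so the same
# transformations are applied at fit and predict time.
model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])

X = df.drop(columns="label")
model.fit(X, df["label"])
preds = model.predict(X)

# The combined feature matrix: TF-IDF, one-hot and numeric columns side by side.
features = model.named_steps["prep"].transform(X)
```

If the features are already separate matrices rather than DataFrame columns, scipy.sparse.hstack can join a TF-IDF matrix with other arrays in a similar way while keeping the result sparse.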

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange