LibSVM and non-numerical data

https://stackoverflow.com/questions/4279421

machine-learning
svm
categorization
libsvm
document-classification

28-09-2019

Question

I'm interested in doing text categorization using LibSVM. How do you recommend I convert the terms/words to numerical data, so LibSVM can understand it?

Thank you!

Answer

In text categorization, people tend to build histograms of the words used in the domain: each document becomes a vector of per-word counts over a fixed vocabulary. Sometimes they also look at combinations of two adjacent words (these are called bigrams) and put those in the histogram as well. But it really depends on your data and your objectives.
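As a rough illustration of the histogram idea, here is a minimal Python sketch that counts words and bigrams per document and writes each document as a line in LibSVM's sparse text format (`<label> <index>:<value> ...`, with indices starting at 1 and listed in ascending order). The function names, the whitespace tokenizer, the `_` used to join bigrams, and the two toy documents are all assumptions made up for this example, not part of LibSVM itself.

```python
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split; real text usually needs more care.
    return text.lower().split()

def extract_terms(text, use_bigrams=True):
    # Single words, optionally followed by adjacent word pairs (bigrams).
    words = tokenize(text)
    terms = list(words)
    if use_bigrams:
        terms += [a + "_" + b for a, b in zip(words, words[1:])]
    return terms

def build_vocabulary(documents):
    # Map every term seen in the training documents to a feature index.
    vocab = {}
    for text in documents:
        for term in extract_terms(text):
            if term not in vocab:
                vocab[term] = len(vocab) + 1  # LibSVM feature indices start at 1
    return vocab

def to_libsvm_line(label, text, vocab):
    # Histogram of known terms; terms outside the vocabulary are ignored.
    counts = Counter(t for t in extract_terms(text) if t in vocab)
    features = sorted((vocab[t], c) for t, c in counts.items())
    return " ".join([str(label)] + [f"{i}:{c}" for i, c in features])

# Hypothetical toy data: (label, text) pairs.
docs = [(1, "cheap pills cheap"), (-1, "meeting at noon")]
vocab = build_vocabulary(text for _, text in docs)
lines = [to_libsvm_line(label, text, vocab) for label, text in docs]

for line in lines:
    print(line)
```

The printed lines can be saved to a file and passed to `svm-train`. Raw counts are used here for simplicity; whether to scale or reweight them, and whether bigrams help at all, again depends on your data.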

License: CC-BY-SA with attribution

Not affiliated with StackOverflow