Question

I'm interested in doing text categorization using LibSVM. How do you recommend I convert the terms/words to numerical data, so LibSVM can understand it?

Thank you!

Answer

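In text categorization, people tend to build histograms (counts) of the words used in the domain. Sometimes they also count combinations of two adjacent words, called bigrams, and add those to the histogram. Each distinct word or bigram then becomes one numeric feature, and its value is how often it occurs in the document. But it really depends on your data and your objectives.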
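As a rough illustration of the histogram idea, here is a minimal sketch in plain Python that turns documents into lines in LibSVM's sparse text format (`label index:value ...`, with indices in ascending order). The helper names (`tokenize`, `build_vocabulary`, `to_libsvm_line`), the whitespace tokenizer, and the toy documents and labels are assumptions made for this example; they are not part of LibSVM.

```python
from collections import Counter


def tokenize(text, use_bigrams=False):
    """Lowercase, split on whitespace, optionally append adjacent word pairs."""
    words = text.lower().split()
    if use_bigrams:
        # A bigram is a pair of neighbouring words, joined here with "_".
        words = words + [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words


def build_vocabulary(documents, use_bigrams=False):
    """Map each distinct term to a 1-based feature index."""
    vocabulary = {}
    for text in documents:
        for term in tokenize(text, use_bigrams):
            if term not in vocabulary:
                vocabulary[term] = len(vocabulary) + 1
    return vocabulary


def to_libsvm_line(label, text, vocabulary, use_bigrams=False):
    """Format one document as 'label index:count ...' with ascending indices."""
    counts = Counter(
        vocabulary[term]
        for term in tokenize(text, use_bigrams)
        if term in vocabulary  # terms not in the vocabulary are dropped
    )
    features = " ".join(f"{index}:{count}" for index, count in sorted(counts.items()))
    return f"{label} {features}".rstrip()


# Hypothetical toy data: two documents with class labels +1 and -1.
documents = ["the cat sat on the mat", "the dog chased the cat"]
labels = [1, -1]

vocabulary = build_vocabulary(documents)
lines = [to_libsvm_line(y, d, vocabulary) for y, d in zip(labels, documents)]
print("\n".join(lines))
# 1 1:2 2:1 3:1 4:1 5:1
# -1 1:2 2:1 6:1 7:1
```

Passing `use_bigrams=True` to both `build_vocabulary` and `to_libsvm_line` adds word pairs to the same histogram. Build the vocabulary from the training documents only and reuse it for test documents, so that feature indices stay consistent. Raw counts are only one option; whether to scale or reweight them (for example with TF-IDF) depends on the data.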

License: CC-BY-SA with attribution
Not affiliated with StackOverflow