SVM-perf package from Cornell University

https://stackoverflow.com//questions/22037638

21-12-2019

Question

I'm using SVM-perf (http://svmlight.joachims.org) for a binary classification problem. I don't have much experience with this package, so I would like help with the following questions:

(1) My features are all discrete/nominal. Is there a special way to represent the feature vectors, such as converting the nominal values into continuous values, or do we just replace the nominal values with dummy numbers like 1, 2, 3, etc.?

(2) If the answer to the first question is that we replace nominal values with dummy numbers, should we start numbering feature values from 1, so that we have 1:1 but never 1:0, because otherwise the learner will treat a zero-valued feature as non-existent? Is that correct?

(3) How do we choose the best -c value and the values of the other parameters? Is it only by trial and error, or are there other approaches for deciding on these parameters?

Solution 2

Here is another useful and informative discussion about representing nominal features for SVM classifiers.

Other tips

On question (1): to use categorical features in an SVM, encode them with dummy variables, e.g. one-hot encoding. Introduce one dimension for every level of the category. For a feature with levels A, B and C this looks like:

A -> [1,0,0]
B -> [0,1,0]
C -> [0,0,1]
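
On question (2): see the answer to question (1) and use one dimension per categorical level instead of numbering the levels 1, 2, 3. With this encoding every feature value is 0 or 1, so a zero simply means that the level is absent, and in the sparse input format zero-valued features can be left out.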
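
The encoding described above can be sketched in a few lines of Python. This is only an illustration, not part of the package: the helper names `build_index` and `to_svmlight_line` and the toy data are made up for the example. It assumes the sparse `<label> <feature>:<value>` input format, with feature ids starting at 1 and listed in increasing order, and it assumes every level in the data was seen when the index was built.

```python
def build_index(rows):
    """Assign a 1-based feature id to every (column, level) pair seen in rows."""
    index = {}
    for row in rows:
        for col, level in enumerate(row):
            # setdefault keeps the id of a pair that was already seen
            index.setdefault((col, level), len(index) + 1)
    return index


def to_svmlight_line(label, row, index):
    """Format one example as '<label> <id>:1 ...' with ids in increasing order.

    Only the active one-hot dimensions are written; all other dimensions
    are implicitly zero in the sparse format.
    """
    ids = sorted(index[(col, level)] for col, level in enumerate(row))
    return " ".join([f"{label:+d}"] + [f"{i}:1" for i in ids])


# Toy data (hypothetical): two nominal features per example, labels in {+1, -1}.
rows = [("A", "red"), ("B", "red"), ("C", "blue")]
labels = [1, -1, 1]

index = build_index(rows)
lines = [to_svmlight_line(y, r, index) for y, r in zip(labels, rows)]
print("\n".join(lines))
```

Each nominal value turns into exactly one `id:1` entry, so the question of writing `1:0` never comes up.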
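
On question (3): -c and the other parameters are typically chosen by trying a set of candidate values in a cross-validation setting and keeping the ones that score best on the held-out folds.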
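
A minimal sketch of such a search, assuming k-fold cross-validation over a list of candidate -c values. The function names and the `evaluate` callback are hypothetical: `evaluate(c, train, test)` is expected to train with the given -c on `train` and return a score on `test` (higher is better), for example by writing the folds to files and running the package's learn and classify programs. The sketch also assumes the examples have already been shuffled.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        # the first n % k folds get one extra example
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def select_c(examples, candidates, evaluate, k=5):
    """Return (best_c, mean_scores) using k-fold cross-validation.

    evaluate(c, train, test) is a user-supplied (hypothetical) callback
    that trains with parameter c and returns a score on the held-out fold.
    """
    folds = k_fold_indices(len(examples), k)
    mean_scores = {}
    for c in candidates:
        scores = []
        for held_out in folds:
            held = set(held_out)
            train = [ex for i, ex in enumerate(examples) if i not in held]
            test = [examples[i] for i in held_out]
            scores.append(evaluate(c, train, test))
        mean_scores[c] = sum(scores) / len(scores)
    # keep the candidate with the highest mean held-out score
    best_c = max(mean_scores, key=mean_scores.get)
    return best_c, mean_scores
```

Candidates are often spaced on a logarithmic grid, e.g. `[0.01, 0.1, 1, 10, 100]`, and the search can be refined around the best value afterwards.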

License: CC-BY-SA with attribution

Not affiliated with StackOverflow