Question

I'm using SVM-perf (http://svmlight.joachims.org) for a binary classification problem. I don't have much experience with this package, so I would appreciate help with the following questions:

(1) My features are all discrete/nominal. Is there a special way to represent the feature vectors, such as converting the nominal values into continuous values, or do we just replace the nominal values with dummy numbers like 1, 2, 3, etc.?

(2) If the answer to the first question is that we replace nominal values with dummy numbers, then my second question is about numbering: do we start numbering feature values from 1, so that we have 1:1 but never 1:0, because otherwise the learner will treat a zero-valued feature as non-existent? Is that correct?

(3) How do we choose the best -c value and the values of the remaining parameters? Is it only by trial and error, or are there other approaches for deciding on these parameters?

Other tips

  1. To use categorical features in an SVM, you must encode them using dummy variables, e.g. one-hot encoding: introduce one dimension for every level of the category. Something like this for a feature with levels A, B and C:

    A -> [1,0,0]
    B -> [0,1,0]
    C -> [0,0,1]

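  2. See the answer to the previous question: use one dimension per categorical level. Each dimension is then either 1 (level present) or 0 (level absent), so leaving zero-valued features out of the sparse input line loses no information.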

  3. Typically this is done by cross-validation: train with each candidate value and keep the one that performs best on the held-out data.
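
To make the three points concrete, here is a minimal Python sketch, not tied to any particular SVM package. It assumes the sparse `<label> <feature>:<value>` input format used by SVM-light, with feature numbers starting at 1 and listed in ascending order. All function names are hypothetical, and `train_and_score` is a placeholder callback that you would implement yourself, for example by writing each fold to disk, running the learner with the given -c value, and returning the validation accuracy.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def build_feature_index(rows: Sequence[Sequence[str]]) -> Dict[Tuple[int, str], int]:
    """Assign one feature number (starting at 1) to each (column, level) pair."""
    index: Dict[Tuple[int, str], int] = {}
    for row in rows:
        for col, level in enumerate(row):
            index.setdefault((col, level), len(index) + 1)
    return index


def to_svmlight_line(label: int, row: Sequence[str],
                     index: Dict[Tuple[int, str], int]) -> str:
    """One-hot encode a row; only the active (value 1) features are written."""
    # Levels never seen while building the index are skipped (all-zero block).
    ids = sorted(index[(col, level)]
                 for col, level in enumerate(row) if (col, level) in index)
    return " ".join([f"{label:+d}"] + [f"{i}:1" for i in ids])


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k (train, validation) index pairs, round-robin."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [
        ([j for f, fold in enumerate(folds) if f != i for j in fold], folds[i])
        for i in range(k)
    ]


def pick_c(n: int, k: int, candidates: Sequence[float],
           train_and_score: Callable[[float, List[int], List[int]], float]) -> float:
    """Return the candidate C with the highest mean validation score."""
    splits = k_fold_indices(n, k)

    def mean_score(c: float) -> float:
        return sum(train_and_score(c, tr, va) for tr, va in splits) / len(splits)

    return max(candidates, key=mean_score)


# Toy data: two nominal features per example (made up for illustration).
rows = [["A", "red"], ["B", "red"], ["C", "blue"]]
labels = [1, -1, 1]
index = build_feature_index(rows)
lines = [to_svmlight_line(y, r, index) for y, r in zip(labels, rows)]
print("\n".join(lines))
```

Candidate values for C are usually spaced on a logarithmic grid (for example 0.01, 0.1, 1, 10, 100), and the same loop can be extended to other parameters by scoring tuples of values instead of a single number.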

License: CC-BY-SA with attribution
Not affiliated with StackOverflow