Question

Documentation here.

I'm wondering how important the coef0 parameter is for SVCs under the polynomial and sigmoid kernels. As I understand it, it is the intercept term, just a constant as in linear regression that offsets the function from zero. However, to my knowledge, the SVM (scikit uses libsvm) should find this value itself.

What's a good general range to test over, if there is one? For example, with C, a safe choice is generally 10^-5 ... 10^5, going up in exponential steps.

But for coef0, the value seems highly data dependent and I'm not sure how to automate choosing good ranges for each grid search on each dataset. Any pointers?


Solution

First, the sigmoid function is rarely a valid kernel. In fact, for almost no parameter values is it known to induce a valid kernel (in Mercer's sense).

Second, coef0 is not an intercept term. It is a parameter of the kernel projection, and it can be used to overcome one of the important issues with the polynomial kernel. In general, coef0=0 should be just fine. However, as the degree p -> inf, the polynomial kernel separates more and more strongly the pairs of points x, y for which <x,y> is smaller than 1 from the pairs a, b for which <a,b> is larger than 1. This is because powers of values smaller than one get closer and closer to 0, while the same powers of values larger than one grow to infinity. You can use coef0 to "shift" your data so that there is no such distinction: adding 1 - min <x,y> ensures that no value being raised to the power is smaller than 1. If you really feel the need to tune this parameter, I would suggest searching in the range [min(1 - min <x,y>, 0), max <x,y>], where the min and max are computed over all pairs of points in the training set.
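The range suggested above can be computed directly from the training data. The following is only a minimal sketch, not part of scikit-learn: the helper name coef0_search_range and the toy matrix X are made up for illustration. It also assumes gamma=1, because scikit-learn's polynomial kernel is (gamma * <x,y> + coef0)^degree, so for any other gamma the inner products would need to be scaled first.

```python
import numpy as np


def coef0_search_range(X):
    """Hypothetical helper: derive a coef0 shift and search interval from X.

    Assumes gamma=1, so the kernel argument is <x, y> + coef0.
    """
    gram = X @ X.T                 # all pairwise inner products <x, y>
    shift = 1.0 - gram.min()       # adding this makes every argument >= 1
    low = min(shift, 0.0)          # lower end of the suggested interval
    high = gram.max()              # upper end: largest inner product
    return shift, low, high


# Toy training set, chosen only for illustration.
X = np.array([[0.5, 0.0],
              [0.0, 0.5],
              [2.0, 1.0]])

shift, low, high = coef0_search_range(X)
coef0_grid = np.linspace(low, high, num=6)  # candidate values for a grid search
print(shift, low, high)
print(coef0_grid)
```

The resulting coef0_grid could then be passed as the "coef0" entry of param_grid in GridSearchCV around SVC(kernel="poly"), next to the usual grid for C. Note that the full Gram matrix needs O(n^2) memory, so for a large training set it may be more practical to compute the min and max on a random subsample.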

Licensed under: CC-BY-SA with attribution