Question

I am dealing with a text classification problem (sentiment analysis). I would like to know if there is any option in scikit-learn to add a "weight" (as a measure of importance) to a feature. I checked the documentation and found the `coef_` attribute of SVC, described below:

    coef_ : array, shape = [n_class-1, n_features]
        Weights assigned to the features (coefficients in the primal problem).
        This is only available in the case of a linear kernel. coef_ is a
        readonly property derived from dual_coef_ and support_vectors_.

However, this attribute seems to be read-only.
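
For context, here is a minimal sketch (with made-up toy data, purely for illustration) of how `coef_` is typically read after fitting a linear-kernel SVC; it is a learned attribute that you inspect rather than assign to:

    from sklearn.svm import SVC

    # Toy, linearly separable data for illustration only.
    X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    y = [0, 0, 1, 1]

    clf = SVC(kernel="linear")
    clf.fit(X, y)

    # For a binary problem this has shape (1, n_features) and is read-only.
    print(clf.coef_)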


Solution

The coef_ vector is a view on the parameters learned by the machine learning algorithm. It does not make sense to set it by hand, as it is tuned automatically from the data during fitting. What you can do instead is:

  • set class_weight if you have prior knowledge about some classes being more important than others (see the sketch after this list)

  • set sample_weight if you have prior knowledge about some samples (rows in the dataset) being more important than others

  • rescale the features to give some of them more variance than others, for instance if you use an RBF kernel and would like to make some features more important than others (usually it's best to scale all features to unit variance, though)

  • use a custom precomputed kernel if you use kernels and want to encode special prior knowledge this way.
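
As a short sketch of the first two options (toy data and weight values invented for illustration), class_weight weights whole classes while sample_weight weights individual rows at fit time; rescaling a column is a crude way to act on the third point:

    import numpy as np
    from sklearn.svm import SVC

    # Toy data for illustration only.
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([0, 0, 1, 1])

    # Option 1: penalize mistakes on class 1 five times harder than on class 0.
    clf = SVC(kernel="linear", class_weight={0: 1.0, 1: 5.0})
    clf.fit(X, y)

    # Option 2: weight individual samples when fitting.
    clf = SVC(kernel="linear")
    clf.fit(X, y, sample_weight=np.array([1.0, 1.0, 1.0, 10.0]))

    # Option 3 (crude): inflate one feature's scale so it matters more
    # under an RBF kernel.
    X_rescaled = X.copy()
    X_rescaled[:, 0] *= 3.0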

For text classification, the data is high-dimensional and a kernel usually just wastes resources for little or no added predictive accuracy, so the last two points are probably not relevant to your specific problem.
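
A hedged sketch of what that usually looks like in practice (the tiny corpus and labels below are invented, and get_feature_names_out assumes a reasonably recent scikit-learn version): sparse TF-IDF features with a linear model, where class_weight is still available and the learned, read-only coef_ can be inspected per token:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Invented mini-corpus for illustration only.
    docs = [
        "great movie, loved it",
        "terrible plot, awful acting",
        "loved the acting",
        "awful, terrible movie",
    ]
    labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

    model = make_pipeline(TfidfVectorizer(), LinearSVC(class_weight="balanced"))
    model.fit(docs, labels)

    # The learned weight of each vocabulary term.
    vec = model.named_steps["tfidfvectorizer"]
    svc = model.named_steps["linearsvc"]
    for term, weight in zip(vec.get_feature_names_out(), svc.coef_[0]):
        print(term, round(float(weight), 3))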

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow