An SVM is not Naive Bayes: feature values are not counters but coordinates in a multidimensional real-valued space, so 0s carry exactly as much information as 1s (which also answers your concern about removing 0 values: don't do it). There is no reason to ever normalize data to [0.001, 1] for an SVM.
The only issue here is that column-wise normalization is not a good idea for tf-idf, as it will degenerate your features to tf: for a particular i-th dimension, tf-idf is simply the tf value in [0,1] multiplied by a constant idf, so the normalization effectively multiplies it by idf^-1. I would consider one of the alternative preprocessing methods:
- normalizing each dimension, so it has mean 0 and variance 1
- decorrelation by setting x = C^(-1/2) * x, where C is the data covariance matrix
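
As a rough illustration of the points above, here is a minimal NumPy sketch. The toy `tf` matrix, the `idf` values, and all variable names are made up for the example (they are not from your data), and the column-wise normalization is assumed to be division by the column maximum. It first checks that this normalization cancels the idf factor, then applies the two alternatives to a feature matrix `X`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 documents, 3 terms.
tf = rng.random((6, 3))            # term frequencies in [0, 1)
idf = np.array([1.5, 2.0, 3.0])    # one constant idf per term (column)
tfidf = tf * idf

# Column-wise max normalization divides out the constant idf of each column,
# so the result is identical to max-normalized tf.
colnorm_tfidf = tfidf / tfidf.max(axis=0)
colnorm_tf = tf / tf.max(axis=0)
idf_cancelled = np.allclose(colnorm_tfidf, colnorm_tf)

X = tfidf  # feature matrix, one row per sample

# Alternative 1: give each dimension mean 0 and variance 1.
standardized = (X - X.mean(axis=0)) / X.std(axis=0)

# Alternative 2: decorrelate with x <- C^(-1/2) x, where C is the data covariance.
centered = X - X.mean(axis=0)
C = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)                       # C is symmetric
C_inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
whitened = centered @ C_inv_sqrt                           # applies C^(-1/2) to every row
```

The whitening step assumes C has full rank; with more features than samples (common for text) some eigenvalues will be zero or near zero, and you would need to regularize or drop those directions first.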