Question

Is it possible to use supervised term weighting models with a KNN classifier? I wonder how to represent the vectors of the test documents, given that test documents are unlabeled and supervised term weighting models require labeled documents to calculate the weights. Could anyone help, please?


Solution

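Yes. You can use weights based on class information, because they belong to terms rather than to documents: they are computed once from the labeled training set and can then be applied to any document, labeled or not.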

  1. Compute the collection-level values for each term on the training set (e.g. IDF). This can include class-based information, such as the maximum chi^2 value of each term over all classes.
  2. For a test document, combine these values with the document's own term frequencies: for instance, multiply each term's TF by its IDF and by its max chi^2, both taken from the training set. No test labels are needed.
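
The two steps above could be sketched roughly as follows, in plain Python. This is only an illustration under several assumptions that are not part of the answer: the toy corpus, the function names (`fit_term_weights`, `vectorize`, `knn_predict`), the smoothed IDF `log(N/df) + 1`, cosine similarity as the KNN measure, and `k = 3` are all made up for the example.

```python
import math
from collections import Counter

def fit_term_weights(train_docs, train_labels):
    """Step 1: learn one weight per term (IDF * max chi^2) from the training set only."""
    n = len(train_docs)
    doc_sets = [set(doc) for doc in train_docs]
    class_sizes = Counter(train_labels)
    weights = {}
    for term in set().union(*doc_sets):
        df = sum(term in s for s in doc_sets)
        idf = math.log(n / df) + 1.0  # assumed smoothing, keeps weights positive
        max_chi2 = 0.0
        for cls, size in class_sizes.items():
            # 2x2 contingency table of term presence vs. class membership
            a = sum(1 for s, y in zip(doc_sets, train_labels) if term in s and y == cls)
            b = df - a          # term present, other classes
            c = size - a        # term absent, this class
            d = n - a - b - c   # term absent, other classes
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
            max_chi2 = max(max_chi2, chi2)
        weights[term] = idf * max_chi2
    return weights

def vectorize(doc, weights):
    """Step 2: TF of this document times the training-set weight. No label is used."""
    tf = Counter(doc)
    # Terms never seen in training have no weight and are dropped.
    return {t: f * weights[t] for t, f in tf.items() if t in weights}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_predict(test_doc, train_vecs, train_labels, weights, k=3):
    """Majority vote among the k training documents most similar to the test document."""
    query = vectorize(test_doc, weights)
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus: tokenized documents and their labels.
train_docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["match", "referee", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "stock"],
    ["price", "bank", "stock"],
]
train_labels = ["sport", "sport", "sport", "finance", "finance", "finance"]

weights = fit_term_weights(train_docs, train_labels)
train_vecs = [vectorize(doc, weights) for doc in train_docs]

# The test document is unlabeled; "unknownword" is simply ignored.
prediction = knn_predict(["goal", "referee", "unknownword"],
                         train_vecs, train_labels, weights)
print(prediction)  # sport
```

The point of the sketch is that `vectorize` never looks at a label: the class information enters only through `weights`, which is fitted on the training set and reused unchanged for training and test documents alike.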

Regards,

Licensed under: CC-BY-SA with attribution