text classification : comparing classification reports

https://datascience.stackexchange.com/questions/74702

11-12-2020
|

質問

I have a 4-leabelled text classification problem.

Could someone help me choose among the below text classifiers ?

I was advised to select the second one ( the one which uses both unigrams and bigrams ) but I cannot really see why.

解決

Okay so keeping it very short and precisely in context of your question-

Accuracy tells us, out of all the documents how many are classified correctly.

Precision tells us out of all documents which are predicted in a category, how often its correct.

Uni -gram- "nasa", "is" "space" , "agency" bi-gram- "nasa is", "space agency"

Now lets go over the numbers, in both the cases accuracy and precision doesn't have significant difference.

But as we can see bi-grams can give me much more information and hence can have better performance on unseen data. Try to test the model on unseen data/validation set and compare the difference.May be Try tri-grams etc also.

ライセンス： CC-BY-SA と帰属

所属していません datascience.stackexchange