Algorithm for Multi-Class Classification of News Articles

Question 1

There are many ways to attack this problem, from CRFs to Random Forests.

With your limited training data, I would suggest going with a high-bias model such as a linear SVM. Start by training a one-vs-all model for each class, then predict the class whose model returns the highest score. This will give you a baseline for how hard your problem is with the given training data.
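To make the one-vs-all idea concrete, here is a minimal sketch in plain Python. It is not a tuned implementation: the toy documents, the function names, and the hyperparameters (epochs, lr, reg) are all assumptions made up for illustration, and the binary model is a simple linear SVM trained by subgradient descent on the hinge loss. Note that an SVM gives raw decision scores rather than probabilities, so the prediction picks the class with the highest score.

```python
from collections import Counter

# Toy corpus invented for illustration only; real news data is far noisier.
TOY_DOCS = [
    ("team wins match", "sports"),
    ("striker scores goal", "sports"),
    ("parliament passes bill", "politics"),
    ("minister calls election", "politics"),
    ("phone gets faster chip", "tech"),
    ("software update fixes bug", "tech"),
]


def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def linear_score(model, feats):
    """Decision value b + w.x of one binary model on a feature dict."""
    weights, bias = model
    return bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())


def train_binary_svm(examples, epochs=50, lr=0.1, reg=0.01):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.

    examples: list of (feature_dict, y) with y in {+1, -1}.
    Returns (weights, bias).
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in examples:
            margin = y * linear_score((weights, bias), feats)
            # Regularization step: shrink every weight slightly.
            for f in weights:
                weights[f] *= 1.0 - lr * reg
            # Hinge-loss step: only examples inside the margin push the model.
            if margin < 1.0:
                for f, v in feats.items():
                    weights[f] = weights.get(f, 0.0) + lr * y * v
                bias += lr * y
    return weights, bias


def train_one_vs_all(docs):
    """Train one binary SVM per class: that class (+1) against the rest (-1)."""
    featurized = [(featurize(text), label) for text, label in docs]
    classes = sorted({label for _, label in docs})
    return {
        c: train_binary_svm([(f, 1 if label == c else -1) for f, label in featurized])
        for c in classes
    }


def predict_one_vs_all(models, text):
    """Return the class whose binary model gives the highest decision score."""
    feats = featurize(text)
    return max(models, key=lambda c: linear_score(models[c], feats))


models = train_one_vs_all(TOY_DOCS)
print(predict_one_vs_all(models, "team scores goal"))
```

In practice you would replace the whitespace tokenizer with proper preprocessing and hold out part of the data to measure accuracy; the held-out accuracy of this kind of baseline is what tells you how hard the problem is.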

Question 2

I would suggest using Naive Bayes classification. There is a tool called LingPipe where this is already implemented. All you need to do is refer to its classification tutorial:

http://alias-i.com/lingpipe/demos/tutorial/classify/read-me.html

There you will find a small sample program, Classifynews.java. Run it to train a classifier on the training data and then evaluate it on the test data. The sample data set it uses is "20 newsgroups":

http://qwone.com/~jason/20Newsgroups/

The workflow is to train on the labeled training data, optionally save the result as an intermediate model, and then run the test data through that model. Naive Bayes is a good choice when the training data is small.
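The train-then-apply workflow can be sketched as follows. This is not LingPipe's API or the tutorial's code; it is a small standalone multinomial Naive Bayes with add-one (Laplace) smoothing, written in plain Python, and the function names and toy sentences are assumptions for illustration only. The dictionary returned by the training step plays the role of the intermediate model.

```python
import math
from collections import Counter, defaultdict

# Toy training set invented for illustration; in practice use the 20 newsgroups data.
TOY_TRAIN = [
    ("the team won the match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("parliament passed the new bill", "politics"),
    ("the minister called an election", "politics"),
]


def train_naive_bayes(docs):
    """Count documents per class and words per class; return them as the model."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "n_docs": len(docs),
    }


def classify(model, text):
    """Return the class with the highest log posterior for the given text."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class_docs in model["class_counts"].items():
        # Log prior: fraction of training documents in this class.
        score = math.log(n_class_docs / model["n_docs"])
        total_words = sum(model["word_counts"][label].values())
        for word in text.lower().split():
            if word not in model["vocab"]:
                continue  # words never seen in training are ignored
            count = model["word_counts"][label][word]
            # Add-one smoothing keeps unseen (class, word) pairs from zeroing the score.
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TOY_TRAIN)           # training step (intermediate model)
print(classify(model, "striker won the match"))  # applying the model to new text
```

Working in log space avoids numeric underflow on long articles, and the model dictionary could be serialized (for example with pickle) if you want to train once and classify later.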

The accuracy of Naive Bayes generally improves as the training set grows, so try to include more newsgroups. Good luck. Try this and let me know how it goes.