Depending on whether the classes overlap, I propose two different approaches instead of joining the feature vectors:
If classes are not overlapping (that is, no document is in two or more classes at the same time), you would rather build a single ARFF file and then use the AttributeSelection filter (Ranker search and InfoGainAttributeEval evaluator suggested) to determine which features best discriminate among all the classes.

If classes are overlapping, you could build twelve one-against-the-rest classifiers, each one with its own vocabulary. You could apply attribute selection to each independent problem as well, finding the features that best discriminate a single class from all of the rest.
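To make the two options concrete, here is a rough, library-free Python sketch of the idea behind information-gain ranking. It is not Weka's implementation and does not use the Weka API: the function names (entropy, info_gain, rank_terms, one_vs_rest_rankings), the toy documents, and the simplification that each feature is just "term present or absent" are all my own assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(docs, labels, term):
    """Information gain of the binary feature 'term occurs in the document'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    # Weighted entropy that remains after splitting on the term.
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder


def rank_terms(docs, labels):
    """Non-overlapping case: rank the whole vocabulary against all classes."""
    vocab = sorted(set().union(*docs))
    return sorted(vocab, key=lambda t: info_gain(docs, labels, t), reverse=True)


def one_vs_rest_rankings(docs, label_sets):
    """Overlapping case: one binary 'class vs. rest' ranking per class."""
    classes = sorted(set().union(*label_sets))
    return {c: rank_terms(docs, [c in ls for ls in label_sets])
            for c in classes}


# Hypothetical toy data: each document is a set of terms.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "law"}]

# Case 1: exactly one class per document.
labels = ["sports", "sports", "politics", "politics"]
print(rank_terms(docs, labels))

# Case 2: a document may belong to several classes.
label_sets = [{"sports"}, {"sports", "politics"}, {"politics"}, {"politics"}]
print(one_vs_rest_rankings(docs, label_sets))
```

In the first case there is a single ranking shared by all classes; in the second, each class gets its own ranking and therefore its own vocabulary, which is the point of building separate one-against-the-rest problems.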