Question

For a homework assignment I need to fit several classification models to a data set and compare their lift charts to determine the most effective model. The models produce a binary result (or a probability of that binary result); let's call the two outcomes YES and NO. Lift charts are easy to generate for models with continuous output, since it's easy to sort the data set in descending order of confidence.

I am having trouble doing that with models that generate only a binary result, for example k-NN and ClassificationTree. I know in principle how to derive a confidence value for each, but I don't know how to do it with these libraries.

For the classification tree I would set the confidence to the proportion of YES records among the training data that fall through the same path in the tree. However, with the tree model in MATLAB, I don't know which tree path each record falls through.

Similarly, for k-NN I would use the proportion of YES records among the k nearest neighbors as the probability, but the model doesn't tell me which records the k neighbors are, and I'd prefer not to search for them myself.

Any help with one or both of these problems (or a better way of producing lift charts in MATLAB) is greatly appreciated.


Solution

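I was able to find the answer to my own question. MATLAB's predict function returns, as its second output, a score for each class in the model. For classification trees and k-NN models these scores are the posterior class probabilities, so they can be used to rank records for a lift chart: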

[class, score] = predict(mdl, new_observation);
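To show how those scores can feed a lift chart, here is a minimal sketch that fits a tree and a k-NN model and plots cumulative gains curves for both. It assumes the Statistics and Machine Learning Toolbox; the synthetic data, the holdout split, the choice of 15 neighbors, and all variable names are made up for illustration and are not from the original assignment.

```matlab
% Sketch: cumulative gains (lift) curves from predict scores.
% Assumes Statistics and Machine Learning Toolbox. Data are synthetic.
rng(1);                                      % reproducible random numbers
n = 400;
X = randn(n, 2);
isYes = X(:,1) + 0.5*randn(n,1) > 0;         % noisy ground truth
labels = {'NO'; 'YES'};
Y = categorical(labels(isYes + 1));          % n-by-1 categorical response

idxTrain = (1:n)' <= 300;                    % simple holdout split (assumption)
Xtr = X(idxTrain,:);   Ytr = Y(idxTrain);
Xte = X(~idxTrain,:);  Yte = Y(~idxTrain);

models = {fitctree(Xtr, Ytr), fitcknn(Xtr, Ytr, 'NumNeighbors', 15)};
names  = {'Tree', 'k-NN'};

figure; hold on
for i = 1:numel(models)
    mdl = models{i};
    [~, score] = predict(mdl, Xte);          % one score column per class
    yesCol = find(mdl.ClassNames == 'YES');  % columns follow mdl.ClassNames
    [~, order] = sort(score(:, yesCol), 'descend');
    hits = cumsum(double(Yte(order) == 'YES'));   % YES records captured so far
    plot((1:numel(hits))' / numel(hits), hits / hits(end));
end
plot([0 1], [0 1], 'k--');                   % baseline: random ordering
legend([names, {'Random'}], 'Location', 'southeast');
xlabel('Fraction of records contacted');
ylabel('Fraction of YES records captured');
```

Trees and k-NN produce only a handful of distinct score values, so many records tie. The sort breaks ties by original row order here, which makes the curve piecewise rather than smooth; averaging within tied groups would be a reasonable refinement.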
Licensed under: CC-BY-SA with attribution