How am I supposed to use RandomizedLogisticRegression in Scikit-learn?

https://stackoverflow.com/questions/20246513

05-08-2022

Question

I have simply failed to understand the documentation for this class. I can fit data with it and get scores for the features, but is that all this class is supposed to do?

I can't see how I can use it to actually perform regression using the model that was fit. The example in the documentation above is simply creating an instance of the class, so I can't see how that is supposed to help.

There are methods that perform a 'transform' operation, but there is no mention of what kind of transform that is.

So is it possible to use this class to get actual predictions on new test data, and is it possible to use it in cross-validation to compare its performance with the other methods I'm using?

I've used the highest ranking features in other classifiers, but I'm not sure if more than that is possible with this classifier.

Update: I've found the use for fit_transform in the feature selection part of the documentation:

When the goal is to reduce the dimensionality of the data to use with another classifier, they expose a transform method to select the non-zero coefficient

Unless I get an answer that says I'm wrong, I'll assume that this classifier indeed does not do prediction. I'll wait before I answer my own question.

Solution

Randomized LR is supposed to be a feature selection method, not a classifier in and of itself. Its API matches that of a standard scikit-learn transformer:

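from sklearn.linear_model import RandomizedLogisticRegression

randomlr = RandomizedLogisticRegression()
# fit needs the training labels; transform keeps only the selected columns
X_train = randomlr.fit_transform(X_train, y_train)
X_test = randomlr.transform(X_test)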

Then fit a model to X_train and do classification on X_test as usual.
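
Because the selector is a transformer, it can also be chained with a classifier in a Pipeline and scored with cross-validation, which covers the comparison the question asks about. Here is a minimal sketch of that idea. Note that RandomizedLogisticRegression was deprecated in scikit-learn 0.19 and removed in 0.21, so this sketch swaps in a different selector as an assumption: SelectFromModel wrapped around an L1-penalized LinearSVC. That is plain L1-based selection, not the randomized (stability selection) procedure. The synthetic data and the values of C, cv and max_iter are illustrative choices only.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Synthetic data for illustration only: 200 samples, 30 features, 5 informative.
X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           n_redundant=0, random_state=0)

# Stand-in selector (an assumption, not RandomizedLogisticRegression itself):
# features whose L1-penalized coefficients are non-zero are kept.
selector = SelectFromModel(
    LinearSVC(penalty="l1", dual=False, C=0.1, random_state=0)
)

# Selection happens inside the pipeline, so each CV fold selects features
# from its own training split only.
pipeline = Pipeline([
    ("select", selector),
    ("clf", LogisticRegression(max_iter=1000)),
])

scores = cross_val_score(pipeline, X, y, cv=5)
print("mean CV accuracy: %.3f" % scores.mean())

# Fit on all data to see how many features the selector keeps.
pipeline.fit(X, y)
n_selected = int(pipeline.named_steps["select"].get_support().sum())
print("features kept:", n_selected)
```

Putting the selector inside the pipeline, rather than transforming X once up front, keeps the held-out fold from influencing which features are chosen.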

Licensed under: CC-BY-SA with attribution