Question

I have a dataset where a few variables must be included regardless of statistical relevance, but I want to run score selection (i.e., best subsets) on the rest of my variables. Is there a way to tell the score selection method to keep specific variables in every model it fits? The variables that must be kept regardless of statistical significance are prefixed with "kp_".

proc logistic data=work.data;
    model y(event='1') = kp_x1 kp_x2 x3 x4 x5 x6 x7 / selection=score best=3;
run;

Solution

The include=n option in the model statement forces the first n variables listed after the equals sign into every model that is fit, so the variables you want to keep must be listed first.

With your code, for example, to keep the *kp_x1* and *kp_x2* variables, you would write:

proc logistic data=work.data;
  /* include=2 keeps the first two listed variables (kp_x1, kp_x2) in every model */
  model y(event='1') = kp_x1 kp_x2 x3 x4 x5 x6 x7 / selection=score best=3 include=2;
run;
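
Because include= counts from the left of the variable list, the order of the variables matters. As a rough sketch, suppose a third variable also had to be kept. The name kp_x8 below is hypothetical and does not appear in the question. It would need to be moved to the front of the list alongside the other kept variables, and the count raised to 3:

```sas
proc logistic data=work.data;
  /* kp_x8 is a hypothetical extra variable, used only for illustration. */
  /* All variables to keep are listed first; include=3 covers exactly those three. */
  model y(event='1') = kp_x1 kp_x2 kp_x8 x3 x4 x5 x6 x7 / selection=score best=3 include=3;
run;
```

If kp_x8 were instead left at the end of the list, include=3 would force x3 into every model rather than kp_x8.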
Licensed under: CC-BY-SA with attribution