Question

I need to tune the hyperparameters of a LightGBM model, so the question is: what is the best way to select them? I tried RandomizedSearchCV. The search ran for 10 hours, but it did not help: the resulting accuracy was the same as with parameters I had entered manually at random. I roughly understand what the parameters do (which ones control overfitting, which ones affect accuracy, and which ones affect training speed), but it is not clear how to tune them by hand: one at a time, in pairs, or several at once?

Below is an example of how I implemented the selection of parameters:

import numpy as np
import lightgbm
from sklearn.model_selection import KFold, RandomizedSearchCV

SEED = 4
NFOLDS = 2
kf = KFold(n_splits=NFOLDS, shuffle=False)

# Candidate values per parameter; single-element lists are held fixed.
parameters = {
    'num_leaves': np.arange(100, 500, 100),
    'min_child_weight': np.arange(0.01, 1, 0.01),
    'feature_fraction': np.arange(0.1, 0.4, 0.01),
    'bagging_fraction': np.arange(0.3, 0.5, 0.01),
    'min_data_in_leaf': np.arange(100, 1500, 10),
    'objective': ['binary'],
    'max_depth': [-1],
    'learning_rate': np.arange(0.001, 0.02, 0.001),
    'boosting_type': ['gbdt'],
    'bagging_seed': np.arange(10, 42, 5),
    'metric': ['auc'],
    'verbosity': [1],
    'reg_alpha': np.arange(0.3, 1, 0.2),
    'reg_lambda': np.arange(0.37, 0.39, 0.001),
    'random_state': [425],
    'n_estimators': [500],
}

# train and label are the training features and the binary target.
model = lightgbm.LGBMClassifier()
RSCV = RandomizedSearchCV(model, parameters, scoring='roc_auc', cv=kf, n_iter=30, verbose=50)
RSCV.fit(train, label)

No correct solution

Licensed under: CC-BY-SA with attribution