Early stopping¶
This example shows how to use early stopping to reduce the time it takes to run a pipeline. This option is only available for models that allow in-training evaluation (XGBoost, LightGBM and CatBoost).
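As a quick illustration before the full pipeline below, the early stopping budget is passed through bo_params. This is a minimal sketch of that option only; the fractional form is the one demonstrated in this example, while the integer form is an assumption about the API, not shown here.

# Minimal sketch of the early stopping option (not a full pipeline).
# A float below 1 is read as a fraction of the model's iterations;
# passing an integer as an absolute number of rounds is an ASSUMPTION.
bo_params = {"early_stopping": 0.1, "cv": 1}  # stop after 10% of iterations without improvement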
Import the breast cancer dataset from sklearn.datasets. This is a small and easy-to-model dataset whose goal is to predict whether or not a patient has breast cancer.
Load the data¶
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
Run the pipeline¶
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, n_jobs=2, verbose=2, warnings=False, random_state=1)
<< ================== ATOM ================== >>
Algorithm task: binary classification.
Parallel processing with 2 cores.

Dataset stats ====================== >>
Shape: (569, 31)
Scaled: False
Outlier values: 174 (1.2%)
---------------------------------------
Train set size: 456
Test set size: 113
---------------------------------------
|    | dataset   | train     | test     |
|---:|:----------|:----------|:---------|
|  0 | 212 (1.0) | 167 (1.0) | 45 (1.0) |
|  1 | 357 (1.7) | 289 (1.7) | 68 (1.5) |
In [4]:
# Train the models using early stopping. An early stopping value of 0.1 means
# that the model stops if it hasn't improved over the last 10% of its iterations
atom.run(
models="LGB",
metric="ap",
n_calls=7,
n_initial_points=3,
bo_params={"early_stopping": 0.1, "cv": 1},
)
Training ===================================== >>
Models: LGB
Metric: average_precision


Running BO for LightGBM...
Initial point 1 ---------------------------------
Parameters --> {'n_estimators': 499, 'learning_rate': 0.73, 'max_depth': 1, 'num_leaves': 40, 'min_child_weight': 5, 'min_child_samples': 18, 'subsample': 0.7, 'colsample_bytree': 0.8, 'reg_alpha': 100.0, 'reg_lambda': 10.0}
Early stop at iteration 50 of 499.
Evaluation --> average_precision: 0.6304  Best average_precision: 0.6304
Time iteration: 0.061s   Total time: 0.092s
Initial point 2 ---------------------------------
Parameters --> {'n_estimators': 170, 'learning_rate': 0.11, 'max_depth': 4, 'num_leaves': 25, 'min_child_weight': 11, 'min_child_samples': 28, 'subsample': 0.7, 'colsample_bytree': 0.6, 'reg_alpha': 100.0, 'reg_lambda': 10.0}
Early stop at iteration 18 of 170.
Evaluation --> average_precision: 0.6304  Best average_precision: 0.6304
Time iteration: 0.047s   Total time: 0.434s
Initial point 3 ---------------------------------
Parameters --> {'n_estimators': 364, 'learning_rate': 0.4, 'max_depth': 1, 'num_leaves': 30, 'min_child_weight': 17, 'min_child_samples': 27, 'subsample': 0.9, 'colsample_bytree': 0.5, 'reg_alpha': 0.0, 'reg_lambda': 1.0}
Early stop at iteration 42 of 364.
Evaluation --> average_precision: 0.9774  Best average_precision: 0.9774
Time iteration: 0.047s   Total time: 0.528s
Iteration 4 -------------------------------------
Parameters --> {'n_estimators': 238, 'learning_rate': 0.49, 'max_depth': 2, 'num_leaves': 29, 'min_child_weight': 18, 'min_child_samples': 25, 'subsample': 0.9, 'colsample_bytree': 0.4, 'reg_alpha': 0.0, 'reg_lambda': 10.0}
Early stop at iteration 30 of 238.
Evaluation --> average_precision: 0.9911  Best average_precision: 0.9911
Time iteration: 0.031s   Total time: 3.708s
Iteration 5 -------------------------------------
Parameters --> {'n_estimators': 31, 'learning_rate': 0.07, 'max_depth': 5, 'num_leaves': 21, 'min_child_weight': 18, 'min_child_samples': 28, 'subsample': 0.8, 'colsample_bytree': 0.5, 'reg_alpha': 0.0, 'reg_lambda': 100.0}
Evaluation --> average_precision: 0.9920  Best average_precision: 0.9920
Time iteration: 0.047s   Total time: 4.310s
Iteration 6 -------------------------------------
Parameters --> {'n_estimators': 42, 'learning_rate': 0.55, 'max_depth': 3, 'num_leaves': 39, 'min_child_weight': 11, 'min_child_samples': 12, 'subsample': 0.8, 'colsample_bytree': 0.4, 'reg_alpha': 0.01, 'reg_lambda': 100.0}
Evaluation --> average_precision: 0.9991  Best average_precision: 0.9991
Time iteration: 0.031s   Total time: 4.899s
Iteration 7 -------------------------------------
Parameters --> {'n_estimators': 238, 'learning_rate': 1.0, 'max_depth': 2, 'num_leaves': 40, 'min_child_weight': 1, 'min_child_samples': 10, 'subsample': 0.8, 'colsample_bytree': 0.3, 'reg_alpha': 100.0, 'reg_lambda': 100.0}
Early stop at iteration 24 of 238.
Evaluation --> average_precision: 0.6304  Best average_precision: 0.9991
Time iteration: 0.047s   Total time: 5.655s


Results for LightGBM:
Bayesian Optimization ---------------------------
Best parameters --> {'n_estimators': 42, 'learning_rate': 0.55, 'max_depth': 3, 'num_leaves': 39, 'min_child_weight': 11, 'min_child_samples': 12, 'subsample': 0.8, 'colsample_bytree': 0.4, 'reg_alpha': 0.01, 'reg_lambda': 100.0}
Best evaluation --> average_precision: 0.9991
Time elapsed: 6.391s
Fit ---------------------------------------------
Train evaluation --> average_precision: 0.9975
Test evaluation --> average_precision: 0.9885
Time elapsed: 0.056s
-------------------------------------------------
Total time: 6.447s


Final results ========================= >>
Duration: 6.447s
------------------------------------------
LightGBM --> average_precision: 0.9885
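Beyond the log above, the run can also be inspected programmatically. A minimal sketch, assuming this ATOM version exposes the usual results overview and per-model attributes; the best_params attribute name is an assumption, not taken from this example.

# Hedged sketch: inspect the run programmatically.
# atom.results is ATOM's overview of all trained models;
# best_params on the model is ASSUMED to hold the BO's best parameters.
print(atom.results)          # scores and timings per model
print(atom.lgb.best_params)  # assumed attribute for the winning hyperparameters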
Analyze the results¶
In [5]:
# Plot the evaluation on the train and test set during training
# Note that the metric is provided by the estimator's package, not ATOM!
atom.lgb.plot_evals(title="LightGBM's evaluation curve", figsize=(11, 9))
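The raw scores behind the curve can also be read directly from the model. A minimal sketch, assuming the in-training scores are stored in an evals attribute on the model; that attribute name is a hedged assumption, not shown in this example.

# Hedged sketch: the evals attribute (ASSUMED name) would hold the
# per-iteration scores that plot_evals draws for the train and test set.
evals = atom.lgb.evals
print(evals["train"][-5:])  # last five in-training scores on the train set
print(evals["test"][-5:])   # last five in-training scores on the test set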