Early stopping¶
This example shows how to use early stopping to reduce the time it takes to run a pipeline. This option is only available for models that support in-training evaluation (XGBoost, LightGBM and CatBoost).
Import the breast cancer dataset from sklearn.datasets. This is a small, easy-to-train dataset whose goal is to predict whether or not a patient has breast cancer.
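For context, early stopping means monitoring a validation metric during boosting and halting once it stops improving. A minimal sketch of that mechanism in plain lightgbm, outside ATOM (the callback shown assumes lightgbm >= 3.3; older versions pass early_stopping_rounds to fit instead):

import lightgbm as lgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=1)

model = lgb.LGBMClassifier(n_estimators=500)
model.fit(
    X_train,
    y_train,
    eval_set=[(X_val, y_val)],
    # Halt when the validation score hasn't improved for 50 rounds
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)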
Load the data¶
In [6]:
# Import packages
from sklearn.datasets import load_breast_cancer
from atom import ATOMClassifier
In [7]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
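A quick sanity check of the loaded arrays (a minimal sketch; ATOM prints similar dataset stats automatically on initialization):

import numpy as np

print(X.shape)         # (569, 30): 569 samples, 30 features
print(np.bincount(y))  # class counts per label: [212 357]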
Run the pipeline¶
In [8]:
# Initialize atom
atom = ATOMClassifier(X, y, n_jobs=2, verbose=2, warnings=False, random_state=1)
<< ================== ATOM ================== >>
Algorithm task: binary classification.
Parallel processing with 2 cores.

Dataset stats ==================== >>
Shape: (569, 31)
Scaled: False
Outlier values: 174 (1.2%)
-------------------------------------
Train set size: 456
Test set size: 113
-------------------------------------
|    |   dataset |     train |      test |
| -- | --------- | --------- | --------- |
|  0 | 212 (1.0) | 167 (1.0) |  45 (1.0) |
|  1 | 357 (1.7) | 289 (1.7) |  68 (1.5) |
In [9]:
# Train the models using early stopping. An early stopping value of 0.1 means
# that the model stops if it hasn't improved in the last 10% of its iterations
atom.run(
    models="LGB",
    metric="ap",
    n_calls=7,
    n_initial_points=3,
    bo_params={"early_stopping": 0.1},
)
        
Training ========================= >>
Models: LGB
Metric: average_precision
Running BO for LightGBM...
| call             | n_estimators | learning_rate | max_depth | num_leaves | min_child_weight | min_child_samples | subsample | colsample_bytree | reg_alpha | reg_lambda | average_precision | best_average_precision | early_stopping |    time | total_time |
| ---------------- | ------------ | ------------- | --------- | ---------- | ---------------- | ----------------- | --------- | ---------------- | --------- | ---------- | ----------------- | ---------------------- | -------------- | ------- | ---------- |
| Initial point 1  |          499 |         0.733 |         1 |         40 |            0.001 |                18 |       0.7 |              0.8 |       100 |         10 |            0.6374 |                 0.6374 |         50/499 |  0.069s |     0.132s |
| Initial point 2  |          170 |         0.112 |         4 |         25 |              0.1 |                28 |       0.7 |              0.7 |       100 |         10 |            0.6374 |                 0.6374 |         18/170 |  0.038s |     0.486s |
| Initial point 3  |          364 |        0.4032 |         1 |         30 |               10 |                27 |       0.9 |              0.6 |         0 |          1 |            0.9833 |                 0.9833 |         48/364 |  0.022s |     0.702s |
| Iteration 4      |          306 |        0.0835 |         1 |         30 |              100 |                27 |       0.9 |              0.8 |         0 |        0.1 |            0.6374 |                 0.9833 |         31/306 |  0.038s |     3.127s |
| Iteration 5      |          477 |        0.2785 |         7 |         29 |           0.0001 |                12 |       0.7 |              0.6 |         0 |          1 |             0.995 |                  0.995 |         84/477 |  0.078s |     4.180s |
| Iteration 6      |          500 |          0.01 |         9 |         21 |           0.0001 |                15 |       0.6 |              0.6 |         0 |        0.1 |               1.0 |                    1.0 |        500/500 |  0.172s |     5.078s |
| Iteration 7      |          410 |        0.0136 |         9 |         24 |              0.1 |                27 |       0.6 |              0.5 |         0 |        0.1 |            0.9978 |                    1.0 |        410/410 |  0.125s |     5.862s |
Bayesian Optimization ---------------------------
Best call --> Iteration 6
Best parameters --> {'n_estimators': 500, 'learning_rate': 0.01, 'max_depth': 9, 'num_leaves': 21, 'min_child_weight': 0.0001, 'min_child_samples': 15, 'subsample': 0.6, 'colsample_bytree': 0.6, 'reg_alpha': 0, 'reg_lambda': 0.1}
Best evaluation --> average_precision: 1.0
Time elapsed: 6.565s
Fit ---------------------------------------------
Train evaluation --> average_precision: 1.0
Test evaluation --> average_precision: 0.9964
Time elapsed: 0.250s
-------------------------------------------------
Total time: 6.831s
Final results ==================== >>
Duration: 6.831s
-------------------------------------
LightGBM --> average_precision: 0.9964
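The 0.1 used above is a fraction of the total number of boosting rounds. ATOM's early_stopping parameter is also documented to accept an integer, read as an absolute number of rounds without improvement; a sketch assuming that variant:

atom.run(
    models="LGB",
    metric="ap",
    n_calls=7,
    n_initial_points=3,
    # Integer form: stop after 30 consecutive rounds without improvement
    bo_params={"early_stopping": 30},
)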
Analyze the results¶
In [10]:
# Plot the evaluation on the train and test set during training
# Note that the metric is provided by the estimator's package, not ATOM!
atom.lgb.plot_evals(title="LightGBM's evaluation curve", figsize=(11, 9))
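Besides the plot, the trial results can be inspected programmatically. A minimal sketch using atom.results and the model's best_params attribute (attribute names taken from ATOM's public API; verify them against the version you run):

# One row per trained model, with train/test scores and timings
print(atom.results)

# Hyperparameters the BO selected for LightGBM
print(atom.lgb.best_params)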