Example: Pruning¶
This advanced example shows how to combine hyperparameter tuning with pruning.
We use the breast cancer dataset from sklearn.datasets, a small and easy-to-train dataset whose goal is to predict whether or not a patient has breast cancer.
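Pruning stops unpromising trials early, based on intermediate scores reported during training, so the budget is spent on promising configurations. As a dependency-free illustration of the idea (a simple median rule on a toy objective, not Optuna's actual HyperbandPruner), consider:

```python
import statistics

def run_trials(objective, n_trials, n_steps):
    """Toy hyperparameter search with median pruning.

    At every step, a trial is pruned if its intermediate score falls
    below the median of the scores earlier trials reported at that
    same step. This mimics the general idea behind Optuna's pruners.
    """
    history = {}  # step -> intermediate scores seen so far
    results = []
    for trial in range(n_trials):
        state = "COMPLETE"
        score = 0.0
        step = 0
        for step in range(1, n_steps + 1):
            score = objective(trial, step)
            seen = history.setdefault(step, [])
            if seen and score < statistics.median(seen):
                state = "PRUNED"  # stop this trial early
                seen.append(score)
                break
            seen.append(score)
        results.append((trial, step, score, state))
    return results

# Toy objective: each later trial is slightly worse, so it is
# pruned at the first checkpoint where it trails the median.
results = run_trials(lambda t, s: 0.5 - 0.01 * t + 0.02 * s,
                     n_trials=5, n_steps=10)
for trial, step, score, state in results:
    print(f"trial {trial}: stopped at step {step}/10 "
          f"(score={score:.3f}, {state})")
```

Only the first trial runs all 10 steps here; the weaker ones are cut off after a single step, which is exactly the behaviour the `PRUNED` rows in the run below exhibit.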
Load the data¶
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from optuna.pruners import HyperbandPruner
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
Run the pipeline¶
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, verbose=2, random_state=1)
<< ================== ATOM ================== >>
Algorithm task: binary classification.

Dataset stats ==================== >>
Shape: (569, 31)
Memory: 138.96 kB
Scaled: False
Outlier values: 167 (1.2%)
-------------------------------------
Train set size: 456
Test set size: 113
-------------------------------------
|   |     dataset |       train |        test |
| - | ----------- | ----------- | ----------- |
| 0 |   212 (1.0) |   170 (1.0) |    42 (1.0) |
| 1 |   357 (1.7) |   286 (1.7) |    71 (1.7) |
In [4]:
# Use ht_params to specify a custom pruner
# Note that pruned trials show the number of iterations they completed
atom.run(
models="SGD",
metric="f1",
n_trials=25,
ht_params={
"distributions": ["penalty", "max_iter"],
"pruner": HyperbandPruner(),
}
)
Training ========================= >>
Models: SGD
Metric: f1

Running hyperparameter tuning for StochasticGradientDescent...
| trial | penalty | max_iter |     f1 | best_f1 | time_trial | time_ht |    state |
| ----- | ------- | -------- | ------ | ------- | ---------- | ------- | -------- |
|     0 |      l1 |      650 | 0.9735 |  0.9735 |     2.288s |  2.288s | COMPLETE |
|     1 | elast.. |     1050 | 0.9739 |  0.9739 |     3.783s |  6.072s | COMPLETE |
|     2 | elast.. |      500 | 0.9558 |  0.9739 |     1.788s |  7.859s | COMPLETE |
|     3 |    none |      700 | 0.9825 |  0.9825 |     2.370s | 10.229s | COMPLETE |
|     4 |      l1 |   3/1400 | 0.9821 |  0.9825 |     0.032s | 10.261s |   PRUNED |
|     5 |    none |   9/1400 | 0.9821 |  0.9825 |     0.046s | 10.307s |   PRUNED |
|     6 |      l2 |   3/1200 | 0.9825 |  0.9825 |     0.025s | 10.332s |   PRUNED |
|     7 |      l2 |   1/1250 | 0.9358 |  0.9825 |     0.019s | 10.351s |   PRUNED |
|     8 |    none |    1/600 | 0.9912 |  0.9912 |     0.023s | 10.374s |   PRUNED |
|     9 |      l1 |    9/600 | 0.9381 |  0.9912 |     0.048s | 10.422s |   PRUNED |
|    10 |      l1 |   3/1000 | 0.9828 |  0.9912 |     0.029s | 10.451s |   PRUNED |
|    11 | elast.. |   1/1200 |  0.955 |  0.9912 |     0.021s | 10.473s |   PRUNED |
|    12 |      l2 |    1/550 | 0.9541 |  0.9912 |     0.021s | 10.494s |   PRUNED |
|    13 | elast.. |   1/1100 | 0.9636 |  0.9912 |     0.020s | 10.514s |   PRUNED |
|    14 |    none |    9/900 | 0.9565 |  0.9912 |     0.047s | 10.561s |   PRUNED |
|    15 |      l1 |    1/850 | 0.9735 |  0.9912 |     0.023s | 10.584s |   PRUNED |
|    16 | elast.. |    1/750 | 0.9298 |  0.9912 |     0.022s | 10.606s |   PRUNED |
|    17 |      l1 |     1250 | 0.9912 |  0.9912 |     4.543s | 15.149s | COMPLETE |
|    18 |    none |      700 | 0.9825 |  0.9912 |     0.004s | 15.153s | COMPLETE |
|    19 |    none |    3/750 | 0.9739 |  0.9912 |     0.025s | 15.178s |   PRUNED |
|    20 |    none |    1/750 | 0.9492 |  0.9912 |     0.024s | 15.202s |   PRUNED |
|    21 |      l2 |     1150 | 0.9828 |  0.9912 |     4.169s | 19.371s | COMPLETE |
|    22 |    none |    1/850 | 0.9643 |  0.9912 |     0.024s | 19.395s |   PRUNED |
|    23 |      l1 |    3/900 | 0.9649 |  0.9912 |     0.029s | 19.424s |   PRUNED |
|    24 |      l1 |      650 | 0.9735 |  0.9912 |     0.005s | 19.429s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 17
Best parameters:
 --> penalty: l1
 --> max_iter: 1250
Best evaluation --> f1: 0.9912
Time elapsed: 19.429s
Fit ---------------------------------------------
Train evaluation --> f1: 0.9983
Test evaluation --> f1: 0.9353
Time elapsed: 6.245s
-------------------------------------------------
Total time: 25.673s

Final results ==================== >>
Total time: 25.733s
-------------------------------------
StochasticGradientDescent --> f1: 0.9353
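For pruned trials, the max_iter column reads completed/budgeted iterations: 3/1400 means the trial was stopped after 3 of its 1400 allotted iterations. The checkpoints 1, 3, 9 recurring in the table reflect the geometric resource schedule of Hyperband-style pruners, which only let a trial continue past each rung if it survives a promotion decision there. A simplified sketch of how such rungs are spaced (the parameter values here are illustrative assumptions, not settings read from the run above):

```python
def hyperband_rungs(min_resource=1, max_resource=100, reduction_factor=3):
    """Resource checkpoints (rungs) of a Hyperband-style schedule.

    Rungs grow geometrically by reduction_factor; a trial is only
    promoted past a rung if it ranks well enough there. This is a
    simplified illustration, not Optuna's exact bracket logic.
    """
    rungs = []
    resource = min_resource
    while resource < max_resource:
        rungs.append(resource)
        resource *= reduction_factor
    return rungs

# With the defaults, pruning decisions happen at 1, 3, 9, 27, 81
# iterations -- matching the 1/…, 3/…, 9/… counts in the table above.
print(hyperband_rungs())
```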
Analyze the results¶
In [5]:
atom.plot_trials()
In [6]:
atom.plot_hyperparameter_importance()