Example: Hyperparameter tuning¶
This advanced example shows how to optimize a model's hyperparameters for multi-metric runs.
Import the breast cancer dataset from sklearn.datasets. This is a small, easy-to-train dataset where the goal is to predict whether a patient has breast cancer.
Load the data¶
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from optuna.distributions import IntDistribution
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
Run the pipeline¶
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, n_jobs=4, verbose=2, random_state=1)
<< ================== ATOM ================== >>
Algorithm task: binary classification.
Parallel processing with 4 cores.

Dataset stats ==================== >>
Shape: (569, 31)
Memory: 138.96 kB
Scaled: False
Outlier values: 167 (1.2%)
-------------------------------------
Train set size: 456
Test set size: 113
-------------------------------------
|   |   dataset |     train |     test |
| - | --------- | --------- | -------- |
| 0 | 212 (1.0) | 170 (1.0) | 42 (1.0) |
| 1 | 357 (1.7) | 286 (1.7) | 71 (1.7) |
In [4]:
# Train a MultiLayerPerceptron model on two metrics
# using a custom number of hidden layers
atom.run(
    models="MLP",
    metric=["f1", "ap"],
    n_trials=10,
    est_params={"activation": "relu"},
    ht_params={
        "distributions": {
            "hidden_layer_1": IntDistribution(2, 4),
            "hidden_layer_2": IntDistribution(10, 20),
            "hidden_layer_3": IntDistribution(10, 20),
            "hidden_layer_4": IntDistribution(2, 4),
        }
    },
)
Training ========================= >>
Models: MLP
Metric: f1, average_precision

Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | average_precision | best_average_precision | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------ | ------ | ------ | ------ | ------ | ------- | -------- |
| 0 | 3 | 17 | 10 | 2 | 0.9455 | 0.9455 | 0.9837 | 0.9837 | 0.943s | 0.943s | COMPLETE |
| 1 | 2 | 11 | 12 | 3 | 0.9739 | 0.9739 | 0.9988 | 0.9988 | 0.929s | 1.872s | COMPLETE |
| 2 | 3 | 15 | 14 | 4 | 0.9913 | 0.9913 | 1.0 | 1.0 | 0.935s | 2.807s | COMPLETE |
| 3 | 2 | 19 | 10 | 4 | 0.9649 | 0.9913 | 0.9867 | 1.0 | 0.926s | 3.732s | COMPLETE |
| 4 | 3 | 16 | 11 | 2 | 0.9655 | 0.9913 | 0.998 | 1.0 | 0.925s | 4.657s | COMPLETE |
| 5 | 4 | 20 | 13 | 4 | 0.9821 | 0.9913 | 0.9994 | 1.0 | 0.940s | 5.597s | COMPLETE |
| 6 | 4 | 19 | 10 | 2 | 0.9825 | 0.9913 | 0.9901 | 1.0 | 0.943s | 6.540s | COMPLETE |
| 7 | 2 | 19 | 11 | 3 | 0.7703 | 0.9913 | 0.9991 | 1.0 | 0.943s | 7.483s | COMPLETE |
| 8 | 4 | 15 | 17 | 2 | 0.9913 | 0.9913 | 0.9997 | 1.0 | 0.930s | 8.413s | COMPLETE |
| 9 | 4 | 19 | 10 | 4 | 0.9739 | 0.9913 | 0.9813 | 1.0 | 0.915s | 9.327s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 2
Best parameters:
 --> hidden_layer_sizes: (3, 15, 14, 4)
Best evaluation --> f1: 0.9913   average_precision: 1.0
Time elapsed: 9.327s
Fit ---------------------------------------------
Train evaluation --> f1: 0.993   average_precision: 0.998
Test evaluation --> f1: 0.9861   average_precision: 0.995
Time elapsed: 1.305s
-------------------------------------------------
Total time: 10.633s

Final results ==================== >>
Total time: 10.675s
-------------------------------------
MultiLayerPerceptron --> f1: 0.9861   average_precision: 0.995
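Note that est_params passes fixed parameters straight to the estimator for every trial, while ht_params customizes the tuning itself. The same distributions mechanism accepts other Optuna distribution types as well. Below is a hypothetical sketch, not run in this notebook: the names alpha and learning_rate_init come from sklearn's MLPClassifier, and it's an assumption here that atom exposes them for tuning.

# Hypothetical sketch: tune the regularization strength and initial
# learning rate instead of the layer sizes (assumes atom accepts these
# sklearn parameter names in the distributions dict)
from optuna.distributions import FloatDistribution

atom.run(
    models="MLP",
    metric=["f1", "ap"],
    n_trials=10,
    ht_params={
        "distributions": {
            "alpha": FloatDistribution(1e-5, 1e-2, log=True),
            "learning_rate_init": FloatDistribution(1e-4, 1e-1, log=True),
        }
    },
)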
In [5]:
# For multi-metric runs, the selected best trial is the first in the Pareto front
atom.mlp.best_trial
Out[5]:
FrozenTrial(number=2, state=TrialState.COMPLETE, values=[0.9913043478260869, 1.0000000000000002], datetime_start=datetime.datetime(2022, 11, 23, 17, 35, 3, 837565), datetime_complete=datetime.datetime(2022, 11, 23, 17, 35, 4, 771413), params={'hidden_layer_1': 3, 'hidden_layer_2': 15, 'hidden_layer_3': 14, 'hidden_layer_4': 4}, user_attrs={'params': {'hidden_layer_1': 3, 'hidden_layer_2': 15, 'hidden_layer_3': 14, 'hidden_layer_4': 4}, 'estimator': MLPClassifier(hidden_layer_sizes=(3, 15, 14, 4), random_state=1)}, system_attrs={'nsga2:generation': 0}, intermediate_values={}, distributions={'hidden_layer_1': IntDistribution(high=4, log=False, low=2, step=1), 'hidden_layer_2': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_3': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_4': IntDistribution(high=4, log=False, low=2, step=1)}, trial_id=2, value=None)
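Optuna's multi-objective studies keep every Pareto-optimal trial, not just the selected one. A minimal sketch, assuming the model makes its Optuna study available through the study attribute (Optuna's Study.best_trials returns the full Pareto front):

# List every trial on the Pareto front with its [f1, average_precision] values
for trial in atom.mlp.study.best_trials:
    print(trial.number, trial.values)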
In [6]:
atom.plot_pareto_front()
In [7]:
# If you are unhappy with the results, it's possible to continue the study
atom.mlp.hyperparameter_tuning(n_trials=5)
Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | average_precision | best_average_precision | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------ | ------ | ------ | ------ | ------ | ------- | -------- |
| 10 | 4 | 18 | 13 | 4 | 1.0 | 1.0 | 1.0 | 1.0 | 0.977s | 10.304s | COMPLETE |
| 11 | 2 | 14 | 19 | 2 | 0.9492 | 1.0 | 0.9899 | 1.0 | 0.924s | 11.228s | COMPLETE |
| 12 | 2 | 11 | 10 | 4 | 0.7703 | 1.0 | 0.99 | 1.0 | 0.919s | 12.147s | COMPLETE |
| 13 | 2 | 12 | 15 | 2 | 0.9643 | 1.0 | 0.9813 | 1.0 | 0.914s | 13.061s | COMPLETE |
| 14 | 3 | 11 | 16 | 4 | 0.7703 | 1.0 | 0.9724 | 1.0 | 0.926s | 13.987s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 10
Best parameters:
 --> hidden_layer_sizes: (4, 18, 13, 4)
Best evaluation --> f1: 1.0   average_precision: 1.0
Time elapsed: 13.987s
In [8]:
# The trials attribute gives an overview of the trial results
atom.mlp.trials
Out[8]:
| trial | params | estimator | score | time_trial | time_ht | state |
| ----- | ------ | --------- | ----- | ---------- | ------- | ----- |
| 0 | {'hidden_layer_sizes': (3, 17, 10, 2)} | MLPClassifier(hidden_layer_sizes=(3, 17, 10, 2... | [0.9454545454545454, 0.9837236558914353] | 0.942854 | 0.942854 | COMPLETE |
| 1 | {'hidden_layer_sizes': (2, 11, 12, 3)} | MLPClassifier(hidden_layer_sizes=(2, 11, 12, 3... | [0.9739130434782608, 0.9988003322156944] | 0.928844 | 1.871698 | COMPLETE |
| 2 | {'hidden_layer_sizes': (3, 15, 14, 4)} | MLPClassifier(hidden_layer_sizes=(3, 15, 14, 4... | [0.9913043478260869, 1.0000000000000002] | 0.934848 | 2.806546 | COMPLETE |
| 3 | {'hidden_layer_sizes': (2, 19, 10, 4)} | MLPClassifier(hidden_layer_sizes=(2, 19, 10, 4... | [0.9649122807017544, 0.9867431480369178] | 0.925841 | 3.732387 | COMPLETE |
| 4 | {'hidden_layer_sizes': (3, 16, 11, 2)} | MLPClassifier(hidden_layer_sizes=(3, 16, 11, 2... | [0.9655172413793103, 0.9980213692125051] | 0.92484 | 4.657227 | COMPLETE |
| 5 | {'hidden_layer_sizes': (4, 20, 13, 4)} | MLPClassifier(hidden_layer_sizes=(4, 20, 13, 4... | [0.9821428571428572, 0.999389732649834] | 0.939853 | 5.59708 | COMPLETE |
| 6 | {'hidden_layer_sizes': (4, 19, 10, 2)} | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 2... | [0.9824561403508771, 0.990093407959159] | 0.942856 | 6.539936 | COMPLETE |
| 7 | {'hidden_layer_sizes': (2, 19, 11, 3)} | MLPClassifier(hidden_layer_sizes=(2, 19, 11, 3... | [0.7702702702702703, 0.9990764494418141] | 0.942856 | 7.482792 | COMPLETE |
| 8 | {'hidden_layer_sizes': (4, 15, 17, 2)} | MLPClassifier(hidden_layer_sizes=(4, 15, 17, 2... | [0.9913043478260869, 0.9996975196612221] | 0.929844 | 8.412636 | COMPLETE |
| 9 | {'hidden_layer_sizes': (4, 19, 10, 4)} | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 4... | [0.9739130434782608, 0.9813127743443262] | 0.914831 | 9.327467 | COMPLETE |
| 10 | {'hidden_layer_sizes': (4, 18, 13, 4)} | MLPClassifier(hidden_layer_sizes=(4, 18, 13, 4... | [1.0, 1.0000000000000002] | 0.976887 | 10.304354 | COMPLETE |
| 11 | {'hidden_layer_sizes': (2, 14, 19, 2)} | MLPClassifier(hidden_layer_sizes=(2, 14, 19, 2... | [0.9491525423728813, 0.9899476963066745] | 0.923839 | 11.228193 | COMPLETE |
| 12 | {'hidden_layer_sizes': (2, 11, 10, 4)} | MLPClassifier(hidden_layer_sizes=(2, 11, 10, 4... | [0.7702702702702703, 0.9900232191286547] | 0.918835 | 12.147028 | COMPLETE |
| 13 | {'hidden_layer_sizes': (2, 12, 15, 2)} | MLPClassifier(hidden_layer_sizes=(2, 12, 15, 2... | [0.9642857142857142, 0.9812621686248989] | 0.91383 | 13.060858 | COMPLETE |
| 14 | {'hidden_layer_sizes': (3, 11, 16, 4)} | MLPClassifier(hidden_layer_sizes=(3, 11, 16, 4... | [0.7702702702702703, 0.9723670235061694] | 0.92584 | 13.986698 | COMPLETE |
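Since trials is a pandas DataFrame, standard pandas selection applies. A small sketch, assuming (as the table above suggests) that the score column holds [f1, average_precision] per trial:

# Find the number of the trial with the highest f1
best_f1_trial = atom.mlp.trials["score"].apply(lambda s: s[0]).idxmax()
print(best_f1_trial)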
In [9]:
# Select a custom best trial...
atom.mlp.best_trial = 2
# ...and check that the best parameters are now those in the selected trial
atom.mlp.best_params
Out[9]:
{'hidden_layer_sizes': (3, 15, 14, 4)}
In [10]:
# Lastly, fit the model on the complete training set
# using the new combination of hyperparameters
atom.mlp.fit()
Fit ---------------------------------------------
Train evaluation --> f1: 0.9948   average_precision: 0.9994
Test evaluation --> f1: 0.9861   average_precision: 0.997
Time elapsed: 2.648s
Analyze the results¶
In [11]:
atom.plot_trials()
In [12]:
atom.plot_parallel_coordinate()