Example: Hyperparameter tuning¶
This example shows how to optimize a model's hyperparameters for multi-metric runs.
Import the breast cancer dataset from sklearn.datasets. This is a small and easy-to-train dataset whose goal is to predict whether a patient has breast cancer or not.
Load the data¶
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from optuna.distributions import IntDistribution
from atom import ATOMClassifier
UserWarning: The pandas version installed (1.5.3) does not match the supported pandas version in Modin (1.5.2). This may cause undesired side effects!
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
Run the pipeline¶
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, n_jobs=4, verbose=2, random_state=1)
<< ================== ATOM ================== >>
Algorithm task: binary classification.
Parallel processing with 4 cores.
Parallelization backend: loky

Dataset stats ==================== >>
Shape: (569, 31)
Train set size: 456
Test set size: 113
-------------------------------------
Memory: 141.24 kB
Scaled: False
Outlier values: 167 (1.2%)
In [4]:
# Train a MultiLayerPerceptron model on two metrics
# using a custom number of hidden layers
atom.run(
models="MLP",
metric=["f1", "ap"],
n_trials=10,
est_params={"activation": "relu"},
ht_params={
"distributions": {
"hidden_layer_1": IntDistribution(2, 4),
"hidden_layer_2": IntDistribution(10, 20),
"hidden_layer_3": IntDistribution(10, 20),
"hidden_layer_4": IntDistribution(2, 4),
}
}
)
Training ========================= >>
Models: MLP
Metric: f1, average_precision

Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | average_precision | best_average_precision | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------- | ------- | ----------------- | ---------------------- | ---------- | ------- | -------- |
| 0 | 3 | 17 | 10 | 2 | 0.9455 | 0.9455 | 0.9837 | 0.9837 | 0.627s | 0.627s | COMPLETE |
| 1 | 2 | 11 | 12 | 3 | 0.9739 | 0.9739 | 0.9988 | 0.9988 | 0.704s | 1.331s | COMPLETE |
| 2 | 3 | 15 | 14 | 4 | 0.9913 | 0.9913 | 1.0 | 1.0 | 0.633s | 1.964s | COMPLETE |
| 3 | 2 | 19 | 10 | 4 | 0.9649 | 0.9913 | 0.9867 | 1.0 | 0.623s | 2.587s | COMPLETE |
| 4 | 3 | 16 | 11 | 2 | 0.9655 | 0.9913 | 0.998 | 1.0 | 0.617s | 3.204s | COMPLETE |
| 5 | 4 | 20 | 13 | 4 | 0.9821 | 0.9913 | 0.9994 | 1.0 | 0.621s | 3.826s | COMPLETE |
| 6 | 4 | 19 | 10 | 2 | 0.9825 | 0.9913 | 0.9901 | 1.0 | 0.863s | 4.689s | COMPLETE |
| 7 | 2 | 19 | 11 | 3 | 0.7703 | 0.9913 | 0.9991 | 1.0 | 0.882s | 5.571s | COMPLETE |
| 8 | 4 | 15 | 17 | 2 | 0.9913 | 0.9913 | 0.9997 | 1.0 | 1.109s | 6.680s | COMPLETE |
| 9 | 4 | 19 | 10 | 4 | 0.9739 | 0.9913 | 0.9813 | 1.0 | 1.071s | 7.751s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 2
Best parameters:
 --> hidden_layer_sizes: (3, 15, 14, 4)
Best evaluation --> f1: 0.9913   average_precision: 1.0
Time elapsed: 7.751s
Fit ---------------------------------------------
Train evaluation --> f1: 0.993   average_precision: 0.998
Test evaluation --> f1: 0.9861   average_precision: 0.995
Time elapsed: 1.476s
-------------------------------------------------
Total time: 9.227s

Final results ==================== >>
Total time: 9.276s
-------------------------------------
MultiLayerPerceptron --> f1: 0.9861   average_precision: 0.995
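Note that est_params={"activation": "relu"} fixes the activation function for every trial, while the custom distributions in ht_params override the model's default search space for hidden_layer_sizes, so every trial samples four hidden layers.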
In [5]:
# For multi-metric runs, the selected best trial is the first in the Pareto front
atom.mlp.best_trial
Out[5]:
FrozenTrial(number=2, state=TrialState.COMPLETE, values=[0.9913043478260869, 1.0000000000000002], datetime_start=datetime.datetime(2023, 2, 24, 20, 10, 51, 71612), datetime_complete=datetime.datetime(2023, 2, 24, 20, 10, 51, 703672), params={'hidden_layer_1': 3, 'hidden_layer_2': 15, 'hidden_layer_3': 14, 'hidden_layer_4': 4}, user_attrs={'params': {'hidden_layer_1': 3, 'hidden_layer_2': 15, 'hidden_layer_3': 14, 'hidden_layer_4': 4}, 'estimator': MLPClassifier(hidden_layer_sizes=(3, 15, 14, 4), random_state=1)}, system_attrs={'nsga2:generation': 0}, intermediate_values={}, distributions={'hidden_layer_1': IntDistribution(high=4, log=False, low=2, step=1), 'hidden_layer_2': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_3': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_4': IntDistribution(high=4, log=False, low=2, step=1)}, trial_id=2, value=None)
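The trial's values attribute holds one score per metric, in the order they were passed to the metric parameter (here f1 first, average_precision second).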
In [6]:
atom.plot_pareto_front()
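With two metrics, several trials can be Pareto-optimal. A minimal sketch to list them, assuming the model exposes its underlying optuna study through the study attribute (best_trials is optuna's Pareto front for multi-objective studies):

# List every trial on the Pareto front with its scores per metric
for trial in atom.mlp.study.best_trials:
    print(trial.number, trial.values)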
In [7]:
# If you are unhappy with the results, it's possible to continue the study
atom.mlp.hyperparameter_tuning(n_trials=5)
Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | average_precision | best_average_precision | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------- | ------- | ----------------- | ---------------------- | ---------- | ------- | -------- |
| 10 | 4 | 18 | 13 | 4 | 1.0 | 1.0 | 1.0 | 1.0 | 0.724s | 8.475s | COMPLETE |
| 11 | 2 | 14 | 19 | 2 | 0.9492 | 1.0 | 0.9899 | 1.0 | 0.709s | 9.183s | COMPLETE |
| 12 | 2 | 11 | 10 | 4 | 0.7703 | 1.0 | 0.99 | 1.0 | 0.835s | 10.018s | COMPLETE |
| 13 | 2 | 12 | 15 | 2 | 0.9643 | 1.0 | 0.9813 | 1.0 | 0.779s | 10.797s | COMPLETE |
| 14 | 3 | 11 | 16 | 4 | 0.7703 | 1.0 | 0.9724 | 1.0 | 0.741s | 11.537s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 10
Best parameters:
 --> hidden_layer_sizes: (4, 18, 13, 4)
Best evaluation --> f1: 1.0   average_precision: 1.0
Time elapsed: 11.537s
In [8]:
# The trials attribute gives an overview of the trial results
atom.mlp.trials
Out[8]:
| trial | params | estimator | score | time_trial | time_ht | state |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | {'hidden_layer_sizes': (3, 17, 10, 2)} | MLPClassifier(hidden_layer_sizes=(3, 17, 10, 2... | [0.9454545454545454, 0.9837236558914353] | 0.627226 | 0.627226 | COMPLETE |
| 1 | {'hidden_layer_sizes': (2, 11, 12, 3)} | MLPClassifier(hidden_layer_sizes=(2, 11, 12, 3... | [0.9739130434782608, 0.9988003322156944] | 0.704077 | 1.331303 | COMPLETE |
| 2 | {'hidden_layer_sizes': (3, 15, 14, 4)} | MLPClassifier(hidden_layer_sizes=(3, 15, 14, 4... | [0.9913043478260869, 1.0000000000000002] | 0.633055 | 1.964358 | COMPLETE |
| 3 | {'hidden_layer_sizes': (2, 19, 10, 4)} | MLPClassifier(hidden_layer_sizes=(2, 19, 10, 4... | [0.9649122807017544, 0.9867431480369178] | 0.62296 | 2.587318 | COMPLETE |
| 4 | {'hidden_layer_sizes': (3, 16, 11, 2)} | MLPClassifier(hidden_layer_sizes=(3, 16, 11, 2... | [0.9655172413793103, 0.9980213692125051] | 0.617179 | 3.204497 | COMPLETE |
| 5 | {'hidden_layer_sizes': (4, 20, 13, 4)} | MLPClassifier(hidden_layer_sizes=(4, 20, 13, 4... | [0.9821428571428572, 0.999389732649834] | 0.621078 | 3.825575 | COMPLETE |
| 6 | {'hidden_layer_sizes': (4, 19, 10, 2)} | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 2... | [0.9824561403508771, 0.990093407959159] | 0.863342 | 4.688917 | COMPLETE |
| 7 | {'hidden_layer_sizes': (2, 19, 11, 3)} | MLPClassifier(hidden_layer_sizes=(2, 19, 11, 3... | [0.7702702702702703, 0.9990764494418141] | 0.881802 | 5.570719 | COMPLETE |
| 8 | {'hidden_layer_sizes': (4, 15, 17, 2)} | MLPClassifier(hidden_layer_sizes=(4, 15, 17, 2... | [0.9913043478260869, 0.9996975196612221] | 1.109416 | 6.680135 | COMPLETE |
| 9 | {'hidden_layer_sizes': (4, 19, 10, 4)} | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 4... | [0.9739130434782608, 0.9813127743443262] | 1.070974 | 7.751109 | COMPLETE |
| 10 | {'hidden_layer_sizes': (4, 18, 13, 4)} | MLPClassifier(hidden_layer_sizes=(4, 18, 13, 4... | [1.0, 1.0000000000000002] | 0.723657 | 8.474766 | COMPLETE |
| 11 | {'hidden_layer_sizes': (2, 14, 19, 2)} | MLPClassifier(hidden_layer_sizes=(2, 14, 19, 2... | [0.9491525423728813, 0.9899476963066745] | 0.708648 | 9.183414 | COMPLETE |
| 12 | {'hidden_layer_sizes': (2, 11, 10, 4)} | MLPClassifier(hidden_layer_sizes=(2, 11, 10, 4... | [0.7702702702702703, 0.9900232191286547] | 0.83476 | 10.018174 | COMPLETE |
| 13 | {'hidden_layer_sizes': (2, 12, 15, 2)} | MLPClassifier(hidden_layer_sizes=(2, 12, 15, 2... | [0.9642857142857142, 0.9812621686248989] | 0.778707 | 10.796881 | COMPLETE |
| 14 | {'hidden_layer_sizes': (3, 11, 16, 4)} | MLPClassifier(hidden_layer_sizes=(3, 11, 16, 4... | [0.7702702702702703, 0.9723670235061694] | 0.740589 | 11.53747 | COMPLETE |
In [9]:
# Select a custom best trial...
atom.mlp.best_trial = 2
# ...and check that the best parameters are now those in the selected trial
atom.mlp.best_params
Out[9]:
{'hidden_layer_sizes': (3, 15, 14, 4)}
In [10]:
# Lastly, fit the model on the complete training set
# using the new combination of hyperparameters
atom.mlp.fit()
Fit ---------------------------------------------
Train evaluation --> f1: 0.9948   average_precision: 0.9994
Test evaluation --> f1: 0.9861   average_precision: 0.997
Time elapsed: 3.028s
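As a quick sanity check after refitting, the model can be scored on additional metrics. A minimal sketch, assuming ATOM's evaluate method, which scores the trained models on a set of common metrics for the task:

# Score the trained model(s) on the test set across several metrics
atom.evaluate()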
Analyze the results¶
In [11]:
atom.plot_trials()
In [12]:
atom.plot_parallel_coordinate()