Example: Hyperparameter tuning¶
This example shows how to optimize a model's hyperparameters for multi-metric runs.
Import the breast cancer dataset from sklearn.datasets. This is a small and easy-to-train dataset whose goal is to predict whether a patient has breast cancer or not.
Load the data¶
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from optuna.distributions import IntDistribution
from atom import ATOMClassifier
UserWarning: The pandas version installed (1.5.3) does not match the supported pandas version in Modin (1.5.2). This may cause undesired side effects!
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
Run the pipeline¶
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, n_jobs=4, verbose=2, random_state=1)
<< ================== ATOM ================== >>
Algorithm task: binary classification.
Parallel processing with 4 cores.
Parallelization backend: loky

Dataset stats ==================== >>
Shape: (569, 31)
Train set size: 456
Test set size: 113
-------------------------------------
Memory: 141.24 kB
Scaled: False
Outlier values: 167 (1.2%)
In [4]:
# Train a MultiLayerPerceptron model on two metrics
# using a custom number of hidden layers
atom.run(
models="MLP",
metric=["f1", "ap"],
n_trials=10,
est_params={"activation": "relu"},
ht_params={
"distributions": {
"hidden_layer_1": IntDistribution(2, 4),
"hidden_layer_2": IntDistribution(10, 20),
"hidden_layer_3": IntDistribution(10, 20),
"hidden_layer_4": IntDistribution(2, 4),
}
}
)
Training ========================= >>
Models: MLP
Metric: f1, average_precision

Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | average_precision | best_average_precision | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------- | ------- | ----------------- | ---------------------- | ---------- | ------- | -------- |
| 0 | 3 | 17 | 10 | 2 | 0.9455 | 0.9455 | 0.9837 | 0.9837 | 0.627s | 0.627s | COMPLETE |
| 1 | 2 | 11 | 12 | 3 | 0.9739 | 0.9739 | 0.9988 | 0.9988 | 0.704s | 1.331s | COMPLETE |
| 2 | 3 | 15 | 14 | 4 | 0.9913 | 0.9913 | 1.0 | 1.0 | 0.633s | 1.964s | COMPLETE |
| 3 | 2 | 19 | 10 | 4 | 0.9649 | 0.9913 | 0.9867 | 1.0 | 0.623s | 2.587s | COMPLETE |
| 4 | 3 | 16 | 11 | 2 | 0.9655 | 0.9913 | 0.998 | 1.0 | 0.617s | 3.204s | COMPLETE |
| 5 | 4 | 20 | 13 | 4 | 0.9821 | 0.9913 | 0.9994 | 1.0 | 0.621s | 3.826s | COMPLETE |
| 6 | 4 | 19 | 10 | 2 | 0.9825 | 0.9913 | 0.9901 | 1.0 | 0.863s | 4.689s | COMPLETE |
| 7 | 2 | 19 | 11 | 3 | 0.7703 | 0.9913 | 0.9991 | 1.0 | 0.882s | 5.571s | COMPLETE |
| 8 | 4 | 15 | 17 | 2 | 0.9913 | 0.9913 | 0.9997 | 1.0 | 1.109s | 6.680s | COMPLETE |
| 9 | 4 | 19 | 10 | 4 | 0.9739 | 0.9913 | 0.9813 | 1.0 | 1.071s | 7.751s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 2
Best parameters:
 --> hidden_layer_sizes: (3, 15, 14, 4)
Best evaluation --> f1: 0.9913   average_precision: 1.0
Time elapsed: 7.751s
Fit ---------------------------------------------
Train evaluation --> f1: 0.993   average_precision: 0.998
Test evaluation --> f1: 0.9861   average_precision: 0.995
Time elapsed: 1.476s
-------------------------------------------------
Total time: 9.227s

Final results ==================== >>
Total time: 9.276s
-------------------------------------
MultiLayerPerceptron --> f1: 0.9861   average_precision: 0.995
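Note that est_params={"activation": "relu"} fixes the activation function for every trial, while the custom distributions in ht_params override the model's default search space for hidden_layer_sizes, so every trial samples four hidden layers.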
In [5]:
# For multi-metric runs, the selected best trial is the first in the Pareto front
atom.mlp.best_trial
Out[5]:
FrozenTrial(number=2, state=TrialState.COMPLETE, values=[0.9913043478260869, 1.0000000000000002], datetime_start=datetime.datetime(2023, 2, 24, 20, 10, 51, 71612), datetime_complete=datetime.datetime(2023, 2, 24, 20, 10, 51, 703672), params={'hidden_layer_1': 3, 'hidden_layer_2': 15, 'hidden_layer_3': 14, 'hidden_layer_4': 4}, user_attrs={'params': {'hidden_layer_1': 3, 'hidden_layer_2': 15, 'hidden_layer_3': 14, 'hidden_layer_4': 4}, 'estimator': MLPClassifier(hidden_layer_sizes=(3, 15, 14, 4), random_state=1)}, system_attrs={'nsga2:generation': 0}, intermediate_values={}, distributions={'hidden_layer_1': IntDistribution(high=4, log=False, low=2, step=1), 'hidden_layer_2': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_3': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_4': IntDistribution(high=4, log=False, low=2, step=1)}, trial_id=2, value=None)
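The trial's values attribute holds one score per metric, in the order they were passed to the metric parameter (here f1 first, average_precision second).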
In [6]:
atom.plot_pareto_front()
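With two metrics, several trials can be Pareto-optimal. A minimal sketch to list them, assuming the model exposes its underlying optuna study through the study attribute (best_trials is optuna's Pareto front for multi-objective studies):

# List every trial on the Pareto front with its scores per metric
for trial in atom.mlp.study.best_trials:
    print(trial.number, trial.values)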
In [7]:
# If you are unhappy with the results, it's possible to continue the study
atom.mlp.hyperparameter_tuning(n_trials=5)
Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | average_precision | best_average_precision | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------- | ------- | ----------------- | ---------------------- | ---------- | ------- | -------- |
| 10 | 4 | 18 | 13 | 4 | 1.0 | 1.0 | 1.0 | 1.0 | 0.724s | 8.475s | COMPLETE |
| 11 | 2 | 14 | 19 | 2 | 0.9492 | 1.0 | 0.9899 | 1.0 | 0.709s | 9.183s | COMPLETE |
| 12 | 2 | 11 | 10 | 4 | 0.7703 | 1.0 | 0.99 | 1.0 | 0.835s | 10.018s | COMPLETE |
| 13 | 2 | 12 | 15 | 2 | 0.9643 | 1.0 | 0.9813 | 1.0 | 0.779s | 10.797s | COMPLETE |
| 14 | 3 | 11 | 16 | 4 | 0.7703 | 1.0 | 0.9724 | 1.0 | 0.741s | 11.537s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 10
Best parameters:
 --> hidden_layer_sizes: (4, 18, 13, 4)
Best evaluation --> f1: 1.0   average_precision: 1.0
Time elapsed: 11.537s
In [8]:
# The trials attribute gives an overview of the trial results
atom.mlp.trials
Out[8]:
| trial | params | estimator | score | time_trial | time_ht | state |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | {'hidden_layer_sizes': (3, 17, 10, 2)} | MLPClassifier(hidden_layer_sizes=(3, 17, 10, 2... | [0.9454545454545454, 0.9837236558914353] | 0.627226 | 0.627226 | COMPLETE |
| 1 | {'hidden_layer_sizes': (2, 11, 12, 3)} | MLPClassifier(hidden_layer_sizes=(2, 11, 12, 3... | [0.9739130434782608, 0.9988003322156944] | 0.704077 | 1.331303 | COMPLETE |
| 2 | {'hidden_layer_sizes': (3, 15, 14, 4)} | MLPClassifier(hidden_layer_sizes=(3, 15, 14, 4... | [0.9913043478260869, 1.0000000000000002] | 0.633055 | 1.964358 | COMPLETE |
| 3 | {'hidden_layer_sizes': (2, 19, 10, 4)} | MLPClassifier(hidden_layer_sizes=(2, 19, 10, 4... | [0.9649122807017544, 0.9867431480369178] | 0.62296 | 2.587318 | COMPLETE |
| 4 | {'hidden_layer_sizes': (3, 16, 11, 2)} | MLPClassifier(hidden_layer_sizes=(3, 16, 11, 2... | [0.9655172413793103, 0.9980213692125051] | 0.617179 | 3.204497 | COMPLETE |
| 5 | {'hidden_layer_sizes': (4, 20, 13, 4)} | MLPClassifier(hidden_layer_sizes=(4, 20, 13, 4... | [0.9821428571428572, 0.999389732649834] | 0.621078 | 3.825575 | COMPLETE |
| 6 | {'hidden_layer_sizes': (4, 19, 10, 2)} | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 2... | [0.9824561403508771, 0.990093407959159] | 0.863342 | 4.688917 | COMPLETE |
| 7 | {'hidden_layer_sizes': (2, 19, 11, 3)} | MLPClassifier(hidden_layer_sizes=(2, 19, 11, 3... | [0.7702702702702703, 0.9990764494418141] | 0.881802 | 5.570719 | COMPLETE |
| 8 | {'hidden_layer_sizes': (4, 15, 17, 2)} | MLPClassifier(hidden_layer_sizes=(4, 15, 17, 2... | [0.9913043478260869, 0.9996975196612221] | 1.109416 | 6.680135 | COMPLETE |
| 9 | {'hidden_layer_sizes': (4, 19, 10, 4)} | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 4... | [0.9739130434782608, 0.9813127743443262] | 1.070974 | 7.751109 | COMPLETE |
| 10 | {'hidden_layer_sizes': (4, 18, 13, 4)} | MLPClassifier(hidden_layer_sizes=(4, 18, 13, 4... | [1.0, 1.0000000000000002] | 0.723657 | 8.474766 | COMPLETE |
| 11 | {'hidden_layer_sizes': (2, 14, 19, 2)} | MLPClassifier(hidden_layer_sizes=(2, 14, 19, 2... | [0.9491525423728813, 0.9899476963066745] | 0.708648 | 9.183414 | COMPLETE |
| 12 | {'hidden_layer_sizes': (2, 11, 10, 4)} | MLPClassifier(hidden_layer_sizes=(2, 11, 10, 4... | [0.7702702702702703, 0.9900232191286547] | 0.83476 | 10.018174 | COMPLETE |
| 13 | {'hidden_layer_sizes': (2, 12, 15, 2)} | MLPClassifier(hidden_layer_sizes=(2, 12, 15, 2... | [0.9642857142857142, 0.9812621686248989] | 0.778707 | 10.796881 | COMPLETE |
| 14 | {'hidden_layer_sizes': (3, 11, 16, 4)} | MLPClassifier(hidden_layer_sizes=(3, 11, 16, 4... | [0.7702702702702703, 0.9723670235061694] | 0.740589 | 11.53747 | COMPLETE |
In [9]:
# Select a custom best trial...
atom.mlp.best_trial = 2
# ...and check that the best parameters are now those in the selected trial
atom.mlp.best_params
Out[9]:
{'hidden_layer_sizes': (3, 15, 14, 4)}
In [10]:
# Lastly, fit the model on the complete training set
# using the new combination of hyperparameters
atom.mlp.fit()
Fit ---------------------------------------------
Train evaluation --> f1: 0.9948   average_precision: 0.9994
Test evaluation --> f1: 0.9861   average_precision: 0.997
Time elapsed: 3.028s
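As a quick sanity check after refitting, the model can be scored on additional metrics. A minimal sketch, assuming ATOM's evaluate method, which scores the trained models on a set of common metrics for the task:

# Score the trained model(s) on the test set across several metrics
atom.evaluate()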
Analyze the results¶
In [11]:
atom.plot_trials()
In [12]:
atom.plot_parallel_coordinate()