Example: Hyperparameter tuning
This advanced example shows how to optimize a model's hyperparameters in a multi-metric run.
Import the breast cancer dataset from sklearn.datasets. This is a small, easy-to-train dataset whose goal is to predict whether a patient has breast cancer.
Load the data
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from optuna.distributions import IntDistribution
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
Run the pipeline
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, n_jobs=4, verbose=2, random_state=1)
<< ================== ATOM ================== >>

Configuration ==================== >>
Algorithm task: Binary classification.
Parallel processing with 4 cores.

Dataset stats ==================== >>
Shape: (569, 31)
Train set size: 456
Test set size: 113
-------------------------------------
Memory: 141.24 kB
Scaled: False
Outlier values: 167 (1.2%)
In [4]:
# Train a MultiLayerPerceptron model on two metrics
# using a custom number of hidden layers
atom.run(
models="MLP",
metric=["f1", "ap"],
n_trials=10,
est_params={"activation": "relu"},
ht_params={
"distributions": {
"hidden_layer_1": IntDistribution(2, 4),
"hidden_layer_2": IntDistribution(10, 20),
"hidden_layer_3": IntDistribution(10, 20),
"hidden_layer_4": IntDistribution(2, 4),
}
},
errors='raise'
)
Training ========================= >>
Models: MLP
Metric: f1, ap

Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | ap | best_ap | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------- | ------- | ------- | ------- | ---------- | ------- | -------- |
| 0 | 3 | 17 | 10 | 2 | 0.9464 | 0.9464 | 0.9844 | 0.9844 | 9.522s | 9.522s | COMPLETE |
| 1 | 2 | 11 | 12 | 3 | 0.9744 | 0.9744 | 0.9991 | 0.9991 | 9.369s | 18.891s | COMPLETE |
| 2 | 3 | 15 | 14 | 4 | 0.9915 | 0.9915 | 0.9978 | 0.9991 | 11.460s | 30.351s | COMPLETE |
| 3 | 2 | 19 | 10 | 4 | 0.9655 | 0.9915 | 0.9878 | 0.9991 | 11.359s | 41.709s | COMPLETE |
| 4 | 3 | 16 | 11 | 2 | 0.9661 | 0.9915 | 0.9981 | 0.9991 | 0.653s | 42.362s | COMPLETE |
| 5 | 4 | 20 | 13 | 4 | 0.9739 | 0.9915 | 0.9989 | 0.9991 | 0.610s | 42.972s | COMPLETE |
| 6 | 4 | 19 | 10 | 2 | 0.9828 | 0.9915 | 0.9907 | 0.9991 | 0.606s | 43.578s | COMPLETE |
| 7 | 2 | 19 | 11 | 3 | 0.7733 | 0.9915 | 0.9997 | 0.9997 | 0.601s | 44.179s | COMPLETE |
| 8 | 4 | 15 | 17 | 2 | 0.9915 | 0.9915 | 0.9997 | 0.9997 | 0.606s | 44.785s | COMPLETE |
| 9 | 4 | 19 | 10 | 4 | 0.9828 | 0.9915 | 0.9822 | 0.9997 | 0.610s | 45.395s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 8
Best parameters:
 --> hidden_layer_sizes: (4, 15, 17, 2)
Best evaluation --> f1: 0.9915   ap: 0.9997
Time elapsed: 45.395s
Fit ---------------------------------------------
Train evaluation --> f1: 0.9965   ap: 0.9991
Test evaluation --> f1: 0.9718   ap: 0.9938
Time elapsed: 1.740s
-------------------------------------------------
Time: 47.135s

Final results ==================== >>
Total time: 47.340s
-------------------------------------
MultiLayerPerceptron --> f1: 0.9718   ap: 0.9938
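As a side note, the four integer distributions passed to `ht_params` span only a modest search space, so 10 trials already sample a meaningful fraction of it. A quick pure-Python sanity check, with the bounds copied from the cell above (optuna's `IntDistribution` bounds are inclusive on both ends):

```python
# Sketch: size of the search space spanned by the four integer
# distributions above (low and high bounds are both inclusive).
bounds = {
    "hidden_layer_1": (2, 4),
    "hidden_layer_2": (10, 20),
    "hidden_layer_3": (10, 20),
    "hidden_layer_4": (2, 4),
}

n_combinations = 1
for low, high in bounds.values():
    n_combinations *= high - low + 1

print(n_combinations)  # 3 * 11 * 11 * 3 = 1089
```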
In [5]:
# For multi-metric runs, the selected best trial is the first in the Pareto front
atom.mlp.best_trial
Out[5]:
FrozenTrial(number=8, state=1, values=[0.9914529914529915, 0.9997077732320282], datetime_start=datetime.datetime(2024, 3, 5, 12, 9, 40, 773486), datetime_complete=datetime.datetime(2024, 3, 5, 12, 9, 41, 379399), params={'hidden_layer_1': 4, 'hidden_layer_2': 15, 'hidden_layer_3': 17, 'hidden_layer_4': 2}, user_attrs={'estimator': MLPClassifier(hidden_layer_sizes=(4, 15, 17, 2), random_state=1)}, system_attrs={'nsga2:generation': 0}, intermediate_values={}, distributions={'hidden_layer_1': IntDistribution(high=4, log=False, low=2, step=1), 'hidden_layer_2': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_3': IntDistribution(high=20, log=False, low=10, step=1), 'hidden_layer_4': IntDistribution(high=4, log=False, low=2, step=1)}, trial_id=8, value=None)
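A Pareto front contains the trials that no other trial beats on both metrics at once. As a minimal pure-Python sketch of that idea (using a handful of rounded (f1, ap) scores copied from the tuning table above; this is an illustration, not ATOM's internal code):

```python
# Example (f1, ap) scores per trial, copied from the tuning table above.
trials = {
    0: (0.9464, 0.9844),
    1: (0.9744, 0.9991),
    2: (0.9915, 0.9978),
    7: (0.7733, 0.9997),
    8: (0.9915, 0.9997),
}

def pareto_front(scores):
    """Return the trials not dominated by any other trial.

    A trial is dominated when another trial is at least as good on
    both metrics and strictly better on at least one.
    """
    front = []
    for t, (f1, ap) in scores.items():
        dominated = any(
            f1_o >= f1 and ap_o >= ap and (f1_o > f1 or ap_o > ap)
            for o, (f1_o, ap_o) in scores.items()
            if o != t
        )
        if not dominated:
            front.append(t)
    return front

print(pareto_front(trials))  # [8] -- trial 8 dominates the rest here
```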
In [6]:
atom.plot_pareto_front()
In [7]:
# If you're unhappy with the results, it's possible to continue the study
atom.mlp.hyperparameter_tuning(n_trials=5)
Running hyperparameter tuning for MultiLayerPerceptron...

| trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | f1 | best_f1 | ap | best_ap | time_trial | time_ht | state |
| ----- | -------------- | -------------- | -------------- | -------------- | ------- | ------- | ------- | ------- | ---------- | ------- | -------- |
| 10 | 4 | 18 | 13 | 4 | 0.9831 | 0.9915 | 0.9997 | 0.9997 | 0.643s | 46.038s | COMPLETE |
| 11 | 2 | 14 | 19 | 2 | 0.9421 | 0.9915 | 0.9899 | 0.9997 | 0.602s | 46.641s | COMPLETE |
| 12 | 2 | 11 | 10 | 4 | 0.7733 | 0.9915 | 0.99 | 0.9997 | 0.622s | 47.262s | COMPLETE |
| 13 | 2 | 12 | 15 | 2 | 0.9558 | 0.9915 | 0.9985 | 0.9997 | 0.614s | 47.876s | COMPLETE |
| 14 | 3 | 11 | 16 | 4 | 0.7733 | 0.9915 | 0.9721 | 0.9997 | 0.622s | 48.498s | COMPLETE |

Hyperparameter tuning ---------------------------
Best trial --> 8
Best parameters:
 --> hidden_layer_sizes: (4, 15, 17, 2)
Best evaluation --> f1: 0.9915   ap: 0.9997
Time elapsed: 48.498s
In [8]:
# The trials attribute gives an overview of the trial results
atom.mlp.trials
Out[8]:
trial | hidden_layer_1 | hidden_layer_2 | hidden_layer_3 | hidden_layer_4 | estimator | f1 | best_f1 | ap | best_ap | time_trial | time_ht | state |
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
0 | 3 | 17 | 10 | 2 | MLPClassifier(hidden_layer_sizes=(3, 17, 10, 2... | 0.946429 | 0.991453 | 0.984402 | 0.999708 | 9.522182 | 9.522182 | COMPLETE |
1 | 2 | 11 | 12 | 3 | MLPClassifier(hidden_layer_sizes=(2, 11, 12, 3... | 0.974359 | 0.991453 | 0.999128 | 0.999708 | 9.368656 | 18.890838 | COMPLETE |
2 | 3 | 15 | 14 | 4 | MLPClassifier(hidden_layer_sizes=(3, 15, 14, 4... | 0.991453 | 0.991453 | 0.997842 | 0.999708 | 11.459907 | 30.350745 | COMPLETE |
3 | 2 | 19 | 10 | 4 | MLPClassifier(hidden_layer_sizes=(2, 19, 10, 4... | 0.965517 | 0.991453 | 0.987805 | 0.999708 | 11.358701 | 41.709446 | COMPLETE |
4 | 3 | 16 | 11 | 2 | MLPClassifier(hidden_layer_sizes=(3, 16, 11, 2... | 0.966102 | 0.991453 | 0.998086 | 0.999708 | 0.652744 | 42.362190 | COMPLETE |
5 | 4 | 20 | 13 | 4 | MLPClassifier(hidden_layer_sizes=(4, 20, 13, 4... | 0.973913 | 0.991453 | 0.998855 | 0.999708 | 0.610210 | 42.972400 | COMPLETE |
6 | 4 | 19 | 10 | 2 | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 2... | 0.982759 | 0.991453 | 0.990748 | 0.999708 | 0.605815 | 43.578215 | COMPLETE |
7 | 2 | 19 | 11 | 3 | MLPClassifier(hidden_layer_sizes=(2, 19, 11, 3... | 0.773333 | 0.991453 | 0.999708 | 0.999708 | 0.601119 | 44.179334 | COMPLETE |
8 | 4 | 15 | 17 | 2 | MLPClassifier(hidden_layer_sizes=(4, 15, 17, 2... | 0.991453 | 0.991453 | 0.999708 | 0.999708 | 0.605913 | 44.785247 | COMPLETE |
9 | 4 | 19 | 10 | 4 | MLPClassifier(hidden_layer_sizes=(4, 19, 10, 4... | 0.982759 | 0.991453 | 0.982168 | 0.999708 | 0.610058 | 45.395305 | COMPLETE |
10 | 4 | 18 | 13 | 4 | MLPClassifier(hidden_layer_sizes=(4, 18, 13, 4... | 0.983051 | 0.991453 | 0.999708 | 0.999708 | 0.642812 | 46.038117 | COMPLETE |
11 | 2 | 14 | 19 | 2 | MLPClassifier(hidden_layer_sizes=(2, 14, 19, 2... | 0.942149 | 0.991453 | 0.989914 | 0.999708 | 0.602385 | 46.640502 | COMPLETE |
12 | 2 | 11 | 10 | 4 | MLPClassifier(hidden_layer_sizes=(2, 11, 10, 4... | 0.773333 | 0.991453 | 0.990024 | 0.999708 | 0.621954 | 47.262456 | COMPLETE |
13 | 2 | 12 | 15 | 2 | MLPClassifier(hidden_layer_sizes=(2, 12, 15, 2... | 0.955752 | 0.991453 | 0.998518 | 0.999708 | 0.613556 | 47.876012 | COMPLETE |
14 | 3 | 11 | 16 | 4 | MLPClassifier(hidden_layer_sizes=(3, 11, 16, 4... | 0.773333 | 0.991453 | 0.972070 | 0.999708 | 0.622326 | 48.498338 | COMPLETE |
In [9]:
# Select a custom best trial...
atom.mlp.best_trial = 2
# ...and check that the best parameters are now those in the selected trial
atom.mlp.best_params
Out[9]:
{'hidden_layer_sizes': (3, 15, 14, 4)}
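The four tuned `hidden_layer_*` parameters end up as the single `hidden_layer_sizes` tuple that scikit-learn's MLPClassifier expects. Using trial 2's parameters from the table above, the mapping can be sketched in plain Python (an illustration, not ATOM's internal code):

```python
# Parameters of trial 2, copied from the trials table above.
params = {
    "hidden_layer_1": 3,
    "hidden_layer_2": 15,
    "hidden_layer_3": 14,
    "hidden_layer_4": 4,
}

# Collapse the per-layer parameters into one tuple, in layer order.
hidden_layer_sizes = tuple(params[f"hidden_layer_{i}"] for i in range(1, 5))
print(hidden_layer_sizes)  # (3, 15, 14, 4)
```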
In [10]:
# Lastly, fit the model on the complete training set
# using the new combination of hyperparameters
atom.mlp.fit()
Fit ---------------------------------------------
Train evaluation --> f1: 0.9983   ap: 0.9998
Test evaluation --> f1: 0.9718   ap: 0.9947
Time elapsed: 3.541s
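For comparison, an equivalent fit can be sketched with scikit-learn alone, using the selected architecture. This uses its own random split and none of ATOM's preprocessing or settings, so the scores won't match the output above exactly:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score, average_precision_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Architecture from the selected trial, activation from est_params above.
mlp = MLPClassifier(
    hidden_layer_sizes=(3, 15, 14, 4), activation="relu", random_state=1
)
mlp.fit(X_train, y_train)

print("f1:", f1_score(y_test, mlp.predict(X_test)))
print("ap:", average_precision_score(y_test, mlp.predict_proba(X_test)[:, 1]))
```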
Analyze the results
In [11]:
atom.plot_trials()
In [12]:
atom.plot_parallel_coordinate()