Multi-metric runs
This example shows how to evaluate an atom's pipeline on multiple metrics.
Import the breast cancer dataset from sklearn.datasets. This is a small, easy-to-train dataset whose goal is to predict whether a patient has breast cancer.
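For reference, the two metrics used in this example, f1 and recall, can both be written in terms of true positives, false positives and false negatives. The sketch below is a plain-Python illustration, not ATOM's or scikit-learn's implementation. The counts are hypothetical: they were picked because they give the same values as the test scores reported further down (f1 0.9784, recall 1.0), not taken from an actual confusion matrix.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were predicted as positive."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are actually positive."""
    return tp / (tp + fp)


def f1(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r)


# Hypothetical counts (assumption): 68 true positives,
# 3 false positives, 0 false negatives.
tp, fp, fn = 68, 3, 0

print(f"recall: {recall(tp, fn):.4f}")
print(f"f1:     {f1(tp, fp, fn):.4f}")
```

Because recall ignores false positives, a model can reach a perfect recall while its f1 stays below 1, which is why looking at both metrics can be informative.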
Load the data
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
Run the pipeline
In [3]:
atom = ATOMClassifier(X, y, n_jobs=1, verbose=2, warnings=False, random_state=1)
<< ================== ATOM ================== >>
Algorithm task: binary classification.

Dataset stats ====================== >>
Shape: (569, 31)
Scaled: False
Outlier values: 174 (1.2%)
---------------------------------------
Train set size: 456
Test set size: 113
---------------------------------------
|    | dataset   | train     | test     |
|---:|:----------|:----------|:---------|
|  0 | 212 (1.0) | 167 (1.0) | 45 (1.0) |
|  1 | 357 (1.7) | 289 (1.7) | 68 (1.5) |
In [4]:
# For every step of the Bayesian optimization (BO), both metrics
# are calculated, but only the first is used for optimization!
atom.run(
    models=["lSVM", "QDA"],
    metric=("f1", "recall"),
    n_calls=10,
    n_initial_points=4,
    n_bootstrap=6,
)
Training ===================================== >>
Models: lSVM, QDA
Metric: f1, recall

Running BO for Linear-SVM...
Initial point 1 ---------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 46.003, 'penalty': 'l1', 'dual': False}
Evaluation --> f1: 0.9656  Best f1: 0.9656  recall: 0.9724  Best recall: 0.9724
Time iteration: 0.195s  Total time: 0.205s
Initial point 2 ---------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 0.015, 'penalty': 'l1', 'dual': False}
Evaluation --> f1: 0.9640  Best f1: 0.9656  recall: 0.9724  Best recall: 0.9724
Time iteration: 0.084s  Total time: 0.422s
Initial point 3 ---------------------------------
Parameters --> {'loss': 'hinge', 'C': 2.232, 'penalty': 'l2'}
Evaluation --> f1: 0.9723  Best f1: 0.9723  recall: 0.9791  Best recall: 0.9791
Time iteration: 0.073s  Total time: 0.521s
Initial point 4 ---------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 0.037, 'penalty': 'l2'}
Evaluation --> f1: 0.9761  Best f1: 0.9761  recall: 0.9861  Best recall: 0.9861
Time iteration: 0.075s  Total time: 0.625s
Iteration 5 -------------------------------------
Parameters --> {'loss': 'hinge', 'C': 98.62, 'penalty': 'l2'}
Evaluation --> f1: 0.9657  Best f1: 0.9761  recall: 0.9756  Best recall: 0.9861
Time iteration: 0.074s  Total time: 0.951s
Iteration 6 -------------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 99.908, 'penalty': 'l2'}
Evaluation --> f1: 0.9686  Best f1: 0.9761  recall: 0.9620  Best recall: 0.9861
Time iteration: 0.093s  Total time: 1.282s
Iteration 7 -------------------------------------
Parameters --> {'loss': 'hinge', 'C': 0.001, 'penalty': 'l2'}
Evaluation --> f1: 0.9621  Best f1: 0.9761  recall: 0.9654  Best recall: 0.9861
Time iteration: 0.071s  Total time: 1.661s
Iteration 8 -------------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 0.172, 'penalty': 'l2'}
Evaluation --> f1: 0.9762  Best f1: 0.9762  recall: 0.9827  Best recall: 0.9861
Time iteration: 0.065s  Total time: 2.021s
Iteration 9 -------------------------------------
Parameters --> {'loss': 'hinge', 'C': 0.078, 'penalty': 'l2'}
Evaluation --> f1: 0.9828  Best f1: 0.9828  recall: 0.9930  Best recall: 0.9930
Time iteration: 0.064s  Total time: 2.368s
Iteration 10 ------------------------------------
Parameters --> {'loss': 'hinge', 'C': 0.171, 'penalty': 'l2'}
Evaluation --> f1: 0.9778  Best f1: 0.9828  recall: 0.9897  Best recall: 0.9930
Time iteration: 0.064s  Total time: 2.715s

Results for Linear-SVM:
Bayesian Optimization ---------------------------
Best parameters --> {'loss': 'hinge', 'C': 0.078, 'penalty': 'l2'}
Best evaluation --> f1: 0.9828  recall: 0.993
Time elapsed: 2.988s
Fit ---------------------------------------------
Train evaluation --> f1: 0.9914  recall: 0.9965
Test evaluation --> f1: 0.9784  recall: 1.0
Time elapsed: 0.029s
Bootstrap ---------------------------------------
Evaluation --> f1: 0.9747 ± 0.007  recall: 0.9926 ± 0.0112
Time elapsed: 0.107s
-------------------------------------------------
Total time: 3.125s

Running BO for Quadratic Discriminant Analysis...
Initial point 1 ---------------------------------
Parameters --> {'reg_param': 1.0}
Evaluation --> f1: 0.9227  Best f1: 0.9227  recall: 0.9895  Best recall: 0.9895
Time iteration: 0.040s  Total time: 0.044s
Initial point 2 ---------------------------------
Parameters --> {'reg_param': 0.9}
Evaluation --> f1: 0.9021  Best f1: 0.9227  recall: 0.8305  Best recall: 0.9895
Time iteration: 0.040s  Total time: 0.114s
Initial point 3 ---------------------------------
Parameters --> {'reg_param': 0.1}
Evaluation --> f1: 0.9626  Best f1: 0.9626  recall: 0.9793  Best recall: 0.9895
Time iteration: 0.045s  Total time: 0.188s
Initial point 4 ---------------------------------
Parameters --> {'reg_param': 1.0}
Evaluation --> f1: 0.9210  Best f1: 0.9626  recall: 0.9861  Best recall: 0.9895
Time iteration: 0.036s  Total time: 0.252s
Iteration 5 -------------------------------------
Parameters --> {'reg_param': 0.2}
Evaluation --> f1: 0.9640  Best f1: 0.9640  recall: 0.9724  Best recall: 0.9895
Time iteration: 0.035s  Total time: 0.422s
Iteration 6 -------------------------------------
Parameters --> {'reg_param': 0.7}
Evaluation --> f1: 0.9381  Best f1: 0.9640  recall: 0.8962  Best recall: 0.9895
Time iteration: 0.035s  Total time: 0.598s
Iteration 7 -------------------------------------
Parameters --> {'reg_param': 0.8}
Evaluation --> f1: 0.9152  Best f1: 0.9640  recall: 0.8544  Best recall: 0.9895
Time iteration: 0.040s  Total time: 0.799s
Iteration 8 -------------------------------------
Parameters --> {'reg_param': 0.4}
Evaluation --> f1: 0.9600  Best f1: 0.9640  recall: 0.9551  Best recall: 0.9895
Time iteration: 0.037s  Total time: 0.985s
Iteration 9 -------------------------------------
Parameters --> {'reg_param': 0.5}
Evaluation --> f1: 0.9554  Best f1: 0.9640  recall: 0.9308  Best recall: 0.9895
Time iteration: 0.035s  Total time: 1.166s
Iteration 10 ------------------------------------
Parameters --> {'reg_param': 0.6}
Evaluation --> f1: 0.9517  Best f1: 0.9640  recall: 0.9240  Best recall: 0.9895
Time iteration: 0.039s  Total time: 1.350s

Results for Quadratic Discriminant Analysis:
Bayesian Optimization ---------------------------
Best parameters --> {'reg_param': 0.2}
Best evaluation --> f1: 0.964  recall: 0.9724
Time elapsed: 1.641s
Fit ---------------------------------------------
Train evaluation --> f1: 0.9692  recall: 0.9792
Test evaluation --> f1: 0.9784  recall: 1.0
Time elapsed: 0.014s
Bootstrap ---------------------------------------
Evaluation --> f1: 0.9722 ± 0.0051  recall: 0.9877 ± 0.0101
Time elapsed: 0.045s
-------------------------------------------------
Total time: 1.700s

Final results ========================= >>
Duration: 4.826s
------------------------------------------
Linear-SVM                      --> f1: 0.9747 ± 0.007  recall: 0.9926 ± 0.0112 !
Quadratic Discriminant Analysis --> f1: 0.9722 ± 0.0051  recall: 0.9877 ± 0.0101
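The QDA part of the log illustrates what "only the first metric is used for optimization" means in practice: the reported best parameters are those of the trial with the highest f1, even though another trial reached a higher recall. The snippet below is a minimal sketch of that selection rule in plain Python, using the ten QDA evaluations copied from the log. It is an illustration only and does not reproduce ATOM's internal code; the names `trials`, `best_by_f1` and `best_by_recall` are made up for this example.

```python
# (reg_param, f1, recall) for the ten QDA trials, copied from the log
trials = [
    (1.0, 0.9227, 0.9895),
    (0.9, 0.9021, 0.8305),
    (0.1, 0.9626, 0.9793),
    (1.0, 0.9210, 0.9861),
    (0.2, 0.9640, 0.9724),
    (0.7, 0.9381, 0.8962),
    (0.8, 0.9152, 0.8544),
    (0.4, 0.9600, 0.9551),
    (0.5, 0.9554, 0.9308),
    (0.6, 0.9517, 0.9240),
]

# Selecting on the first metric (f1) gives the trial that the log
# reports under "Best parameters"
best_by_f1 = max(trials, key=lambda t: t[1])

# Selecting on the second metric (recall) would give a different trial
best_by_recall = max(trials, key=lambda t: t[2])

print("best by f1:    ", best_by_f1)
print("best by recall:", best_by_recall)
```

The trial chosen on f1 uses reg_param 0.2 with a recall of 0.9724, while the highest recall in the log (0.9895) belongs to reg_param 1.0, whose f1 is clearly lower. The order of the metrics passed to the run therefore matters.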
Analyze the results
In [5]:
# Each of these columns in the results dataframe contains a list of
# scores, one per metric (in the same order as passed to the run)
atom.results[["metric_bo", "metric_train", "metric_test"]]
Out[5]:
| | metric_bo | metric_train | metric_test |
|---|---|---|---|
| lSVM | [0.982845769640169, 0.9930429522081065] | [0.9913941480206541, 0.9965397923875432] | [0.9784172661870503, 1.0] |
| QDA | [0.963998965582012, 0.9723532970356927] | [0.9691780821917808, 0.9792387543252595] | [0.9784172661870503, 1.0] |
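Since every cell holds one score per metric, it can be convenient to attach the metric names to the values. The following sketch does this with plain dictionaries, with the numbers copied from the table. The variables `metrics`, `metric_test` and `labelled` are hypothetical names for this illustration, and the only assumption is that the scores follow the order of the metrics given to the run.

```python
# Metric names in the order they were passed to the run
metrics = ("f1", "recall")

# Test scores per model, copied from the metric_test column
metric_test = {
    "lSVM": [0.9784172661870503, 1.0],
    "QDA": [0.9784172661870503, 1.0],
}

# Pair every score with its metric name
labelled = {
    model: dict(zip(metrics, scores))
    for model, scores in metric_test.items()
}

for model, scores in labelled.items():
    print(model, scores)
```

With the scores labelled this way, a single metric can be looked up by name, for example `labelled["QDA"]["recall"]`.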
In [6]:
# Some plots allow us to choose the metric we want to show
with atom.canvas():
    atom.plot_bo(metric="f1", title="BO performance for f1")
    atom.plot_bo(metric="recall", title="BO performance for recall")
In [7]:
atom.plot_results(metric="recall")