Multi-metric runs¶
This example shows how to evaluate an atom's pipeline on multiple metrics.
Import the breast cancer dataset from sklearn.datasets. This is a small, easy-to-train dataset whose goal is to predict whether or not a patient has breast cancer.
Load the data¶
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
Run the pipeline¶
In [3]:
atom = ATOMClassifier(X, y, n_jobs=1, verbose=2, warnings=False, random_state=1)
<< ================== ATOM ================== >>
Algorithm task: binary classification.

Dataset stats ====================== >>
Shape: (569, 31)
Scaled: False
Outlier values: 174 (1.2%)
---------------------------------------
Train set size: 456
Test set size: 113
---------------------------------------

|    | dataset   | train     | test     |
|---:|:----------|:----------|:---------|
|  0 | 212 (1.0) | 167 (1.0) | 45 (1.0) |
|  1 | 357 (1.7) | 289 (1.7) | 68 (1.5) |
In [4]:
# For every step of the BO, both metrics are calculated,
# but only the first is used for optimization!
atom.run(
models=["lSVM", "QDA"],
metric=("f1", "recall"),
n_calls=10,
n_initial_points=4,
n_bootstrap=6,
)
Training ===================================== >>
Models: lSVM, QDA
Metric: f1, recall


Running BO for Linear-SVM...
Initial point 1 ---------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 46.003, 'penalty': 'l1', 'dual': False}
Evaluation --> f1: 0.9656  Best f1: 0.9656   recall: 0.9724  Best recall: 0.9724
Time iteration: 0.209s   Total time: 0.209s
Initial point 2 ---------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 0.015, 'penalty': 'l1', 'dual': False}
Evaluation --> f1: 0.9640  Best f1: 0.9656   recall: 0.9724  Best recall: 0.9724
Time iteration: 0.076s   Total time: 0.410s
Initial point 3 ---------------------------------
Parameters --> {'loss': 'hinge', 'C': 2.232, 'penalty': 'l2'}
Evaluation --> f1: 0.9723  Best f1: 0.9723   recall: 0.9791  Best recall: 0.9791
Time iteration: 0.063s   Total time: 0.492s
Initial point 4 ---------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 0.037, 'penalty': 'l2'}
Evaluation --> f1: 0.9761  Best f1: 0.9761   recall: 0.9861  Best recall: 0.9861
Time iteration: 0.077s   Total time: 0.600s
Iteration 5 -------------------------------------
Parameters --> {'loss': 'hinge', 'C': 98.62, 'penalty': 'l2'}
Evaluation --> f1: 0.9657  Best f1: 0.9761   recall: 0.9756  Best recall: 0.9861
Time iteration: 0.061s   Total time: 0.878s
Iteration 6 -------------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 99.908, 'penalty': 'l2'}
Evaluation --> f1: 0.9686  Best f1: 0.9761   recall: 0.9620  Best recall: 0.9861
Time iteration: 0.063s   Total time: 1.186s
Iteration 7 -------------------------------------
Parameters --> {'loss': 'hinge', 'C': 0.001, 'penalty': 'l2'}
Evaluation --> f1: 0.9621  Best f1: 0.9761   recall: 0.9654  Best recall: 0.9861
Time iteration: 0.063s   Total time: 1.501s
Iteration 8 -------------------------------------
Parameters --> {'loss': 'squared_hinge', 'C': 0.172, 'penalty': 'l2'}
Evaluation --> f1: 0.9762  Best f1: 0.9762   recall: 0.9827  Best recall: 0.9861
Time iteration: 0.056s   Total time: 1.838s
Iteration 9 -------------------------------------
Parameters --> {'loss': 'hinge', 'C': 0.078, 'penalty': 'l2'}
Evaluation --> f1: 0.9828  Best f1: 0.9828   recall: 0.9930  Best recall: 0.9930
Time iteration: 0.068s   Total time: 2.206s
Iteration 10 ------------------------------------
Parameters --> {'loss': 'hinge', 'C': 0.171, 'penalty': 'l2'}
Evaluation --> f1: 0.9778  Best f1: 0.9828   recall: 0.9897  Best recall: 0.9930
Time iteration: 0.070s   Total time: 2.549s

Results for Linear-SVM:
Bayesian Optimization ---------------------------
Best parameters --> {'loss': 'hinge', 'C': 0.078, 'penalty': 'l2'}
Best evaluation --> f1: 0.9828   recall: 0.993
Time elapsed: 2.805s
Fit ---------------------------------------------
Train evaluation --> f1: 0.9914   recall: 0.9965
Test evaluation --> f1: 0.9784   recall: 1.0
Time elapsed: 0.031s
Bootstrap ---------------------------------------
Evaluation --> f1: 0.9747 ± 0.007   recall: 0.9926 ± 0.0112
Time elapsed: 0.094s
-------------------------------------------------
Total time: 2.930s


Running BO for Quadratic Discriminant Analysis...
Initial point 1 ---------------------------------
Parameters --> {'reg_param': 1.0}
Evaluation --> f1: 0.9227  Best f1: 0.9227   recall: 0.9895  Best recall: 0.9895
Time iteration: 0.048s   Total time: 0.048s
Initial point 2 ---------------------------------
Parameters --> {'reg_param': 0.9}
Evaluation --> f1: 0.9021  Best f1: 0.9227   recall: 0.8305  Best recall: 0.9895
Time iteration: 0.022s   Total time: 0.096s
Initial point 3 ---------------------------------
Parameters --> {'reg_param': 0.1}
Evaluation --> f1: 0.9626  Best f1: 0.9626   recall: 0.9793  Best recall: 0.9895
Time iteration: 0.031s   Total time: 0.159s
Initial point 4 ---------------------------------
Parameters --> {'reg_param': 1.0}
Evaluation --> f1: 0.9210  Best f1: 0.9626   recall: 0.9861  Best recall: 0.9895
Time iteration: 0.032s   Total time: 0.228s
Iteration 5 -------------------------------------
Parameters --> {'reg_param': 0.2}
Evaluation --> f1: 0.9640  Best f1: 0.9640   recall: 0.9724  Best recall: 0.9895
Time iteration: 0.040s   Total time: 0.402s
Iteration 6 -------------------------------------
Parameters --> {'reg_param': 0.7}
Evaluation --> f1: 0.9381  Best f1: 0.9640   recall: 0.8962  Best recall: 0.9895
Time iteration: 0.036s   Total time: 0.586s
Iteration 7 -------------------------------------
Parameters --> {'reg_param': 0.8}
Evaluation --> f1: 0.9152  Best f1: 0.9640   recall: 0.8544  Best recall: 0.9895
Time iteration: 0.031s   Total time: 0.746s
Iteration 8 -------------------------------------
Parameters --> {'reg_param': 0.4}
Evaluation --> f1: 0.9600  Best f1: 0.9640   recall: 0.9551  Best recall: 0.9895
Time iteration: 0.031s   Total time: 0.927s
Iteration 9 -------------------------------------
Parameters --> {'reg_param': 0.5}
Evaluation --> f1: 0.9554  Best f1: 0.9640   recall: 0.9308  Best recall: 0.9895
Time iteration: 0.027s   Total time: 1.243s
Iteration 10 ------------------------------------
Parameters --> {'reg_param': 0.6}
Evaluation --> f1: 0.9517  Best f1: 0.9640   recall: 0.9240  Best recall: 0.9895
Time iteration: 0.036s   Total time: 1.432s

Results for Quadratic Discriminant Analysis:
Bayesian Optimization ---------------------------
Best parameters --> {'reg_param': 0.2}
Best evaluation --> f1: 0.964   recall: 0.9724
Time elapsed: 1.586s
Fit ---------------------------------------------
Train evaluation --> f1: 0.9692   recall: 0.9792
Test evaluation --> f1: 0.9784   recall: 1.0
Time elapsed: 0.011s
Bootstrap ---------------------------------------
Evaluation --> f1: 0.9722 ± 0.0051   recall: 0.9877 ± 0.0101
Time elapsed: 0.036s
-------------------------------------------------
Total time: 1.633s


Final results ========================= >>
Duration: 4.579s
------------------------------------------
Linear-SVM                      --> f1: 0.9747 ± 0.007   recall: 0.9926 ± 0.0112 !
Quadratic Discriminant Analysis --> f1: 0.9722 ± 0.0051   recall: 0.9877 ± 0.0101
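As the comment above notes, every metric is scored at every step of the BO, but only the first one drives the optimization. The same pattern exists in plain scikit-learn's multi-metric cross-validation, where all scorers are computed but only one can be the refit target. A minimal sketch (this is an ATOM-independent illustration, not ATOM's internals; the hyperparameters mirror the best Linear-SVM point found above):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_validate
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)

# Both metrics are computed for every fold, just like every BO step
# above reports both f1 and recall.
scores = cross_validate(
    LinearSVC(C=0.078, dual=False, max_iter=5000),
    X, y, cv=5, scoring=("f1", "recall"),
)
print(scores["test_f1"].mean(), scores["test_recall"].mean())
```

When such a multi-metric dict is fed to a search object (e.g. `GridSearchCV`), scikit-learn likewise requires `refit` to name a single metric, which is the analogue of ATOM optimizing only on the first metric passed.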
Analyze the results¶
In [5]:
# The columns in the results dataframe contain a list of
# scores, one for each metric (in the same order as called)
atom.results[["metric_bo", "metric_train", "metric_test"]]
Out[5]:
|      | metric_bo | metric_train | metric_test |
|:-----|:----------|:-------------|:------------|
| lSVM | [0.982845769640169, 0.9930429522081065] | [0.9913941480206541, 0.9965397923875432] | [0.9784172661870503, 1.0] |
| QDA  | [0.963998965582012, 0.9723532970356927] | [0.9691780821917808, 0.9792387543252595] | [0.9784172661870503, 1.0] |
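Because each cell holds one score per metric, in the order the metrics were passed to `run`, a single metric can be pulled out by position with ordinary pandas. A small sketch (the `results` frame below is a hand-built reconstruction of the `metric_test` column shown above, not a call into ATOM):

```python
import pandas as pd

# Reconstruction of the metric_test column from the table above
results = pd.DataFrame(
    {"metric_test": [[0.9784172661870503, 1.0], [0.9784172661870503, 1.0]]},
    index=["lSVM", "QDA"],
)

# Index 0 -> f1, index 1 -> recall: the order passed to atom.run
recall_test = results["metric_test"].map(lambda scores: scores[1])
print(recall_test)
```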
In [6]:
# Some plots allow us to choose the metric we want to show
with atom.canvas():
atom.plot_bo(metric="f1", title="BO performance for f1")
atom.plot_bo(metric="recall", title="BO performance for recall")
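The `canvas` context manager gathers the consecutive plot calls into one figure. As a rough, ATOM-independent analogue of that side-by-side layout, the same result can be sketched with plain matplotlib (the trace values are the first five lSVM BO evaluations from the log above):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for non-interactive use
import matplotlib.pyplot as plt

# First five BO evaluations for Linear-SVM, taken from the log above
f1_trace = [0.9656, 0.9640, 0.9723, 0.9761, 0.9657]
recall_trace = [0.9724, 0.9724, 0.9791, 0.9861, 0.9756]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(range(1, 6), f1_trace, marker="o")
ax1.set(title="BO performance for f1", xlabel="call", ylabel="f1")
ax2.plot(range(1, 6), recall_trace, marker="o")
ax2.set(title="BO performance for recall", xlabel="call", ylabel="recall")
fig.tight_layout()
```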
In [7]:
atom.plot_results(metric="recall")