Example: In-training validation
This example shows how to keep track of the model's performance during training.
We use the breast cancer dataset from sklearn.datasets: a small, easy-to-fit dataset where the task is to predict whether a patient has breast cancer.
Load the data
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
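In-training validation means scoring the model on a held-out set after every training iteration, rather than only once at the end. Atom automates this per model, but as an illustration of the underlying mechanism, here is a minimal sketch using plain scikit-learn (the variable names and the choice of `SGDClassifier` are ours, not part of atom's API):

```python
# Sketch: score a held-out set after every training epoch.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# SGD is sensitive to feature scale, so standardize first
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

model = SGDClassifier(random_state=1)
scores = []
for epoch in range(20):
    # One pass over the training data per call
    model.partial_fit(X_train, y_train, classes=[0, 1])
    # Evaluate on the held-out set after this epoch
    scores.append(roc_auc_score(y_test, model.decision_function(X_test)))
```

The resulting `scores` list is exactly the kind of per-iteration curve that atom records and plots for the models that support it.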
Run the pipeline
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, verbose=2, random_state=1)
<< ================== ATOM ================== >>

Configuration ==================== >>
Algorithm task: Binary classification.

Dataset stats ==================== >>
Shape: (569, 31)
Train set size: 456
Test set size: 113
-------------------------------------
Memory: 141.24 kB
Scaled: False
Outlier values: 167 (1.2%)
In [4]:
# Not all models support in-training validation
# You can check which ones do using the available_models method
atom.available_models(validation=True)
Out[4]:
| | acronym | fullname | estimator | module | handles_missing | needs_scaling | accepts_sparse | native_multilabel | native_multioutput | validation | supports_engines |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CatB | CatBoost | CatBoostClassifier | catboost.core | True | True | True | False | False | n_estimators | catboost |
| 1 | LGB | LightGBM | LGBMClassifier | lightgbm.sklearn | True | True | True | False | False | n_estimators | lightgbm |
| 2 | MLP | MultiLayerPerceptron | MLPClassifier | sklearn.neural_network._multilayer_perceptron | False | True | True | True | False | max_iter | sklearn |
| 3 | PA | PassiveAggressive | PassiveAggressiveClassifier | sklearn.linear_model._passive_aggressive | False | True | True | False | False | max_iter | sklearn |
| 4 | Perc | Perceptron | Perceptron | sklearn.linear_model._perceptron | False | True | False | False | False | max_iter | sklearn |
| 5 | SGD | StochasticGradientDescent | SGDClassifier | sklearn.linear_model._stochastic_gradient | False | True | True | False | False | max_iter | sklearn |
| 6 | XGB | XGBoost | XGBClassifier | xgboost.sklearn | True | True | True | False | False | n_estimators | xgboost |
In [5]:
# Run the models normally
atom.run(models=["MLP", "LGB"], metric="auc")
Training ========================= >>
Models: MLP, LGB
Metric: auc

Results for MultiLayerPerceptron:
Fit ---------------------------------------------
Train evaluation --> auc: 0.9997
Test evaluation --> auc: 0.9936
Time elapsed: 1.825s
-------------------------------------------------
Time: 1.825s

Results for LightGBM:
Fit ---------------------------------------------
Train evaluation --> auc: 1.0
Test evaluation --> auc: 0.9775
Time elapsed: 0.417s
-------------------------------------------------
Time: 0.417s

Final results ==================== >>
Total time: 2.246s
-------------------------------------
MultiLayerPerceptron --> auc: 0.9936 !
LightGBM --> auc: 0.9775
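For the sklearn models in the table above, the validation column points at `max_iter`. As a hedged illustration of where those per-iteration scores come from, scikit-learn's `MLPClassifier` records the validation accuracy of each iteration in its `validation_scores_` attribute when `early_stopping` is enabled. This sketch uses scikit-learn directly, not atom's API:

```python
# Sketch: MLPClassifier tracks validation accuracy per iteration
# when early_stopping=True (scored on an internal validation split).
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

mlp = MLPClassifier(
    early_stopping=True,        # hold out part of the train set
    validation_fraction=0.1,    # size of that internal split
    n_iter_no_change=10,        # stop if no improvement for 10 iters
    max_iter=500,
    random_state=1,
)
mlp.fit(X, y)

# One validation accuracy score per completed training iteration
curve = mlp.validation_scores_
```

Note that sklearn scores with accuracy here, while atom evaluates the metric you passed to `run` (auc in this example).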
Analyze the results
In [6]:
atom.plot_evals(title="In-training validation scores")
In [7]:
# Plot the validation on the train and test set
atom.lgb.plot_evals(dataset="train+test", title="LightGBM's in-training validation")