Models
Predefined models
ATOM provides many models for classification and regression tasks
that can be used to fit the data in the pipeline. After fitting, a
class containing the underlying estimator is attached to atom as an
attribute. We refer to these "subclasses" as models. Apart from the
estimator, the models contain a variety of attributes and methods that
can help you understand how the underlying estimator performed. They
can be accessed using their acronyms, e.g., `atom.LGB` to access the LightGBM model. The available models and their corresponding acronyms are:
| Model (acronym) | Description |
| --------------- | ----------- |
| AdaBoost (AdaB) | Adaptive Boosting. |
| ARIMA (ARIMA) | Autoregressive Integrated Moving Average. |
| AutoARIMA (AutoARIMA) | Automatic Autoregressive Integrated Moving Average. |
| AutoETS (AutoETS) | ETS model with automatic fitting capabilities. |
| AutomaticRelevanceDetermination (ARD) | Automatic Relevance Determination. |
| Bagging (Bag) | Bagging model (with decision tree as base estimator). |
| BATS (BATS) | BATS forecaster with multiple seasonality. |
| BayesianRidge (BR) | Bayesian ridge regression. |
| BernoulliNB (BNB) | Bernoulli Naive Bayes. |
| CatBoost (CatB) | Cat Boosting Machine. |
| CategoricalNB (CatNB) | Categorical Naive Bayes. |
| ComplementNB (CNB) | Complement Naive Bayes. |
| Croston (Croston) | Croston's method for forecasting. |
| DecisionTree (Tree) | Single Decision Tree. |
| Dummy (Dummy) | Dummy classifier/regressor. |
| DynamicFactor (DF) | Dynamic Factor. |
| ElasticNet (EN) | Linear Regression with elasticnet regularization. |
| ETS (ETS) | Error-Trend-Seasonality model. |
| ExponentialSmoothing (ES) | Holt-Winters Exponential Smoothing forecaster. |
| ExtraTree (ETree) | Extremely Randomized Tree. |
| ExtraTrees (ET) | Extremely Randomized Trees. |
| GaussianNB (GNB) | Gaussian Naive Bayes. |
| GaussianProcess (GP) | Gaussian process. |
| GradientBoostingMachine (GBM) | Gradient Boosting Machine. |
| HistGradientBoosting (hGBM) | Histogram-based Gradient Boosting Machine. |
| HuberRegression (Huber) | Huber regressor. |
| KNearestNeighbors (KNN) | K-Nearest Neighbors. |
| Lasso (Lasso) | Linear Regression with lasso regularization. |
| LeastAngleRegression (Lars) | Least Angle Regression. |
| LightGBM (LGB) | Light Gradient Boosting Machine. |
| LinearDiscriminantAnalysis (LDA) | Linear Discriminant Analysis. |
| LinearSVM (lSVM) | Linear Support Vector Machine. |
| LogisticRegression (LR) | Logistic Regression. |
| MSTL (MSTL) | Multiple Seasonal-Trend decomposition using LOESS. |
| MultiLayerPerceptron (MLP) | Multi-layer Perceptron. |
| MultinomialNB (MNB) | Multinomial Naive Bayes. |
| NaiveForecaster (NF) | Naive Forecaster. |
| OrdinaryLeastSquares (OLS) | Linear Regression. |
| OrthogonalMatchingPursuit (OMP) | Orthogonal Matching Pursuit. |
| PassiveAggressive (PA) | Passive Aggressive. |
| Perceptron (Perc) | Linear Perceptron classification. |
| PolynomialTrend (PT) | Polynomial Trend forecaster. |
| Prophet (Prophet) | Prophet forecaster by Facebook. |
| QuadraticDiscriminantAnalysis (QDA) | Quadratic Discriminant Analysis. |
| RadiusNearestNeighbors (RNN) | Radius Nearest Neighbors. |
| RandomForest (RF) | Random Forest. |
| Ridge (Ridge) | Linear least squares with l2 regularization. |
| SARIMAX (SARIMAX) | Seasonal Autoregressive Integrated Moving Average with eXogenous factors. |
| STL (STL) | Seasonal-Trend decomposition using LOESS. |
| StochasticGradientDescent (SGD) | Stochastic Gradient Descent. |
| SupportVectorMachine (SVM) | Support Vector Machine. |
| TBATS (TBATS) | TBATS forecaster with multiple seasonality. |
| Theta (Theta) | Theta method for forecasting. |
| VAR (VAR) | Vector Autoregressive. |
| VARMAX (VARMAX) | Vector Autoregressive Moving-Average with eXogenous regressors. |
| XGBoost (XGB) | Extreme Gradient Boosting. |
Warning
The model classes cannot be initialized directly by the user! Use them only through atom.
Tip
The acronyms are case-insensitive, e.g., `atom.lgb` also calls the LightGBM model.
Model selection
Although ATOM allows running all models for a given task using `atom.run(models=None)`, it's usually smarter to select only a subset of models. Every model has a series of tags that indicate special characteristics of the model. Use a model's `get_tags` method to see its tags, or the `available_models` method to get an overview of all models and their tags. The tags differ per task, but can include:
- acronym: Model's acronym (used to call the model).
- fullname: Name of the model's class.
- estimator: Name of the model's underlying estimator.
- module: The estimator's module.
- handles_missing: Whether the model can handle missing values without preprocessing. If False, consider using the Imputer class before training the models.
- needs_scaling: Whether the model requires feature scaling. If True, automated feature scaling is applied.
- accepts_sparse: Whether the model accepts sparse input.
- uses_exogenous: Whether the model uses exogenous variables.
- multiple_seasonality: Whether the model can handle more than one seasonality period.
- native_multilabel: Whether the model has native support for multilabel tasks.
- native_multioutput: Whether the model has native support for multioutput tasks.
- validation: Whether the model has in-training validation.
- supports_engines: Engines supported by the model.
To filter for specific tags, specify the column name with the desired value in the arguments of `available_models`, e.g., `atom.available_models(accepts_sparse=True)` to get all models that accept sparse input, or `atom.available_models(supports_engines="cuml")` to get all models that support the cuML engine.
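For example, a minimal sketch of tag-based model selection (the dataset is illustrative, and `available_models` is assumed to return a pandas DataFrame with one column per tag):

```python
from atom import ATOMClassifier
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
atom = ATOMClassifier(X, y, verbose=0)

# Overview of all models and their tags
overview = atom.available_models()

# Keep only models that accept sparse input and need no feature scaling
subset = atom.available_models(accepts_sparse=True, needs_scaling=False)
print(subset["acronym"].tolist())

# Inspect the tags of a single model after running it
atom.run(models="LGB")
print(atom.lgb.get_tags())
```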
Custom models
It is also possible to create your own models in ATOM's pipeline. For example, imagine we want to use sklearn's RANSACRegressor estimator (note that it is not included in ATOM's predefined models). There are two ways to achieve this:
- Using ATOMModel (recommended). With this approach, you can pass the required model characteristics to the pipeline.

```python
from atom import ATOMRegressor, ATOMModel
from sklearn.datasets import load_diabetes
from sklearn.linear_model import RANSACRegressor

# Wrap the estimator and declare its characteristics
ransac = ATOMModel(RANSACRegressor, name="RANSAC", needs_scaling=True)

X, y = load_diabetes(return_X_y=True, as_frame=True)

atom = ATOMRegressor(X, y)
atom.run(ransac)
```
- Using the estimator's class or an instance of the class. This approach also calls ATOMModel under the hood, but leaves its parameters at their default values.

```python
from atom import ATOMRegressor
from sklearn.datasets import load_diabetes
from sklearn.linear_model import RANSACRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)

atom = ATOMRegressor(X, y)
atom.run(RANSACRegressor)  # wrapped in ATOMModel with default parameters
```
Additional things to take into account:
- Custom models can be accessed through their acronym like any other model, e.g., `atom.ransac` in the example above.
- Custom models are not restricted to sklearn estimators, but they should follow sklearn's API, i.e., have a fit and predict method.
- Parameter customization (for the initializer) is only possible for custom models that provide an estimator with a `set_params()` method, i.e., a child class of BaseEstimator.
- Hyperparameter tuning for custom models is ignored unless appropriate dimensions are provided through `ht_params`, as shown in the sketch below.
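A minimal sketch of tuning the custom RANSAC model from the example above, assuming ATOM's optuna-based tuning accepts a mapping of parameter names to optuna distributions under the `distributions` key of `ht_params` (the chosen distributions are illustrative):

```python
from atom import ATOMRegressor, ATOMModel
from optuna.distributions import FloatDistribution, IntDistribution
from sklearn.datasets import load_diabetes
from sklearn.linear_model import RANSACRegressor

ransac = ATOMModel(RANSACRegressor, name="RANSAC", needs_scaling=True)

X, y = load_diabetes(return_X_y=True, as_frame=True)
atom = ATOMRegressor(X, y)

# Provide explicit search dimensions; without them, hyperparameter
# tuning for custom models is skipped
atom.run(
    models=ransac,
    n_trials=10,
    ht_params={
        "distributions": {
            "min_samples": FloatDistribution(0.1, 0.9),
            "max_trials": IntDistribution(50, 500),
        }
    },
)
```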
Deep learning
Deep learning models can be used through ATOM's custom models as long as they follow sklearn's API. For example, models implemented with the Keras package should use the scikeras wrappers KerasClassifier or KerasRegressor.
Many deep learning use cases, for example in computer vision, use datasets with more than 2 dimensions, e.g., image data can have shape (n_samples, length, width, rgb). Luckily, scikeras has a workaround to work with such datasets. See this example to learn how to use ATOM to train and validate a convolutional neural network on an image dataset.
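A minimal sketch of wrapping a Keras network for ATOM, assuming tensorflow and scikeras are installed (the architecture, model name, and hyperparameters are illustrative):

```python
import tensorflow as tf
from atom import ATOMClassifier, ATOMModel
from scikeras.wrappers import KerasClassifier
from sklearn.datasets import load_breast_cancer

def build_model(meta):
    # scikeras fills `meta` with dataset properties at fit time
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(meta["n_features_in_"],)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

# Wrap the scikeras estimator like any other custom model
nn = ATOMModel(
    KerasClassifier(model=build_model, epochs=5, verbose=0),
    name="NN",
    needs_scaling=True,
)

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
atom = ATOMClassifier(X, y)
atom.run(nn)
```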
Warning
Models implemented with keras can only use custom hyperparameter tuning when `n_jobs=1` or `ht_params={"cv": 1}`. Using n_jobs > 1 and cv > 1 raises a PicklingError due to incompatibilities of the APIs.
Ensembles
Ensemble models use multiple estimators to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. ATOM implements two ensemble techniques: voting and stacking. Click here to see an example that uses ensemble models.
If the ensemble's underlying estimator is a model that used automated feature scaling, it's added as a Pipeline containing the Scaler and the estimator. If an mlflow experiment is active, the ensembles start their own run, just like the predefined models do.
Warning
Combining models trained on different branches into one ensemble is not allowed and will raise an exception.
Voting
The idea behind voting is to combine the predictions of conceptually different models to make new predictions. Such a technique can be useful for a set of equally well-performing models in order to balance out their individual weaknesses. Read more in sklearn's documentation.
A voting model is created from a trainer through the `voting` method. The voting model is added automatically to the list of models in the trainer, under the Vote acronym. The underlying estimator is a custom adaptation of VotingClassifier or VotingRegressor, depending on the task. The differences between ATOM's and sklearn's implementations are:
- ATOM's implementation doesn't fit estimators if they're already fitted.
- ATOM's instance is considered fitted at initialization when all underlying estimators are.
- ATOM's VotingClassifier doesn't implement a LabelEncoder to encode the target column.
The two estimators are customized in this way to save time and computational resources, since the classes are always initialized with fitted estimators. As a consequence, the VotingClassifier cannot use sklearn's built-in LabelEncoder for the target column, since it can't be fitted when initializing the class. For the vast majority of use cases, these changes have no effect.
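A minimal sketch of creating a voting ensemble (the dataset and the selected models are illustrative; `voting()` is assumed to combine all models in the trainer by default):

```python
from atom import ATOMClassifier
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

atom = ATOMClassifier(X, y)
atom.run(models=["LR", "RF", "LGB"])

# Combine the fitted models; the ensemble is added under the Vote acronym
atom.voting()
print(atom.vote.evaluate())
```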
Stacking
Stacking is a method for combining estimators to reduce their biases. More precisely, the predictions of each individual estimator are stacked together and used as input to a final estimator to compute the prediction. Read more in sklearn's documentation.
A stacking model is created from a trainer through the `stacking` method. The stacking model is added automatically to the list of models in the trainer, under the Stack acronym. The underlying estimator is a StackingClassifier or StackingRegressor, depending on the task.
Tip
By default, the final estimator is trained on the training set. Note that this is the same data on which the other estimators are fitted, increasing the chance of overfitting. If possible, it's recommended to use `train_on_test=True` in combination with a holdout set for model evaluation.
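A minimal sketch along those lines (the `holdout_size` argument and the `rows="holdout"` option of `evaluate` are assumptions about ATOM's data-splitting API):

```python
from atom import ATOMClassifier
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Reserve part of the data as a holdout set for final evaluation
atom = ATOMClassifier(X, y, holdout_size=0.1)
atom.run(models=["LR", "RF", "LGB"])

# Stack the fitted models; train the final estimator on the test set
atom.stacking(train_on_test=True)

# Score the ensemble on the untouched holdout set
print(atom.stack.evaluate(rows="holdout"))
```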