Models
Predefined models
ATOM provides many models for classification and regression tasks that can be used to fit the data in the pipeline. After fitting, a class containing the underlying estimator is attached to the trainer as an attribute. We refer to these "subclasses" as models. Apart from the estimator, the models contain a variety of attributes and methods to help you understand how the underlying estimator performed. They can be accessed using their acronyms, e.g. `atom.LGB` to access the LightGBM model. The available models and their corresponding acronyms are:
- "Dummy" for Dummy Estimator
- "GP" for Gaussian Process
- "GNB" for Gaussian Naive Bayes
- "MNB" for Multinomial Naive Bayes
- "BNB" for Bernoulli Naive Bayes
- "CatNB" for Categorical Naive Bayes
- "CNB" for Complement Naive Bayes
- "OLS" for Ordinary Least Squares
- "Ridge" for Ridge Estimator
- "Lasso" for Lasso Regression
- "EN" for ElasticNet Regression
- "Lars" for Least Angle Regression
- "BR" for Bayesian Ridge
- "ARD" for Automated Relevance Determination
- "Huber" for Huber Regression
- "Perc" for Perceptron
- "LR" for Logistic Regression
- "LDA" for Linear Discriminant Analysis
- "QDA" for Quadratic Discriminant Analysis
- "KNN" for K-Nearest Neighbors
- "RNN" for Radius Nearest Neighbors
- "Tree" for Decision Tree
- "Bag" for Bagging
- "ET" for Extra-Trees
- "RF" for Random Forest
- "AdaB" for AdaBoost
- "GBM" for Gradient Boosting Machine
- "hGBM" for HistGBM
- "XGB" for XGBoost
- "LGB" for LightGBM
- "CatB" for CatBoost
- "lSVM" for Linear-SVM
- "kSVM" for Kernel-SVM
- "PA" for Passive Aggressive
- "SGD" for Stochastic Gradient Descent
- "MLP" for Multi-layer Perceptron
Tip
The acronyms are case-insensitive, e.g. `atom.lgb` also calls the LightGBM model.
Warning
The models cannot be initialized directly by the user! Use them only through the trainers.
Custom models
It is also possible to create your own models in ATOM's pipeline. For example, imagine we want to use sklearn's RANSACRegressor estimator (note that it is not included in ATOM's predefined models). There are two ways to achieve this:
- Using ATOMModel (recommended). With this approach you can pass the required model characteristics to the pipeline.
```python
from atom import ATOMRegressor, ATOMModel
from sklearn.linear_model import RANSACRegressor

ransac = ATOMModel(
    models=RANSACRegressor,
    acronym="RANSAC",
    fullname="Random Sample Consensus",
    needs_scaling=True,
)

atom = ATOMRegressor(X, y)
atom.run(ransac)
```
- Using the estimator's class or an instance of the class. This approach will also call ATOMModel under the hood, but it will leave its parameters to their default values.
```python
from atom import ATOMRegressor
from sklearn.linear_model import RANSACRegressor

atom = ATOMRegressor(X, y)
atom.run(RANSACRegressor)
```
Additional things to take into account:
- Custom models can be accessed through their acronym like any other model, e.g. `atom.ransac` in the example above.
- Custom models are not restricted to sklearn estimators, but they should follow sklearn's API, i.e. have a fit and predict method.
- Parameter customization (for the initializer) is only possible for custom models whose estimator has a `set_params()` method, i.e. it's a child class of BaseEstimator.
- Hyperparameter optimization for custom models is ignored unless appropriate dimensions are provided through `bo_params`.
- If the estimator has an `n_jobs` and/or `random_state` parameter that is left to its default value, it will automatically adopt the values from the trainer it's called from.
Deep learning
Deep learning models can be used through ATOM's custom models as long as they follow sklearn's API. For example, models implemented with the Keras package should use the scikeras wrappers KerasClassifier or KerasRegressor.
Many deep learning use cases, for example in computer vision, work with datasets of more than two dimensions, e.g. image data can have shape (n_samples, length, width, rgb). Such data structures are not suited to be stored in a two-dimensional pandas dataframe, but since ATOM requires a dataframe for its internal API, datasets with more than two dimensions are stored in a single column called "multidim feature", where every row contains one (multidimensional) sample. Note that the data cleaning, feature engineering and some plotting methods are unavailable in this case.
See in this example how to use ATOM to train and validate a Convolutional Neural Network implemented with Keras.
Warning
Keras models can only use custom hyperparameter tuning when n_jobs=1 or bo_params={"cv": 1}. Using n_jobs > 1 and cv > 1 raises a PicklingError due to incompatibilities between the APIs.
Ensembles
Ensemble models use multiple estimators to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. ATOM implements two ensemble techniques: voting and stacking. Click here to see an example that uses ensemble models.
If the ensemble's underlying estimator is a model that used automated feature scaling,
it's added as a Pipeline containing the scaler
and estimator. If an
mlflow experiment is active, the ensembles start
their own run, just like the predefined models do.
Warning
Combining models trained on different branches into one ensemble is not allowed and will raise an exception.
Voting
The idea behind voting is to combine the predictions of conceptually different models to make new predictions. Such a technique can be useful for a set of equally well performing models in order to balance out their individual weaknesses. Read more in sklearn's documentation.
A voting model is created from a trainer through the voting
method. The voting model is added automatically to the list of
models in the pipeline, under the Vote
acronym. The underlying
estimator is a custom adaptation of VotingClassifier
or VotingRegressor
depending on the task. The differences between ATOM's and sklearn's
implementation are:
- ATOM's implementation doesn't fit estimators if they're already fitted.
- ATOM's instance is considered fitted at initialization when all underlying estimators are.
- ATOM's VotingClassifier doesn't implement a LabelEncoder to encode the target column.
The two estimators are customized in this way to save time and computational resources, since the classes are always initialized with fitted estimators. As a consequence, the VotingClassifier cannot use sklearn's built-in LabelEncoder for the target column, since it can't be fitted when initializing the class. For the vast majority of use cases, the changes have no effect. If you want to export the estimator and retrain it on different data, just make sure to clone the underlying estimators first.
Stacking
Stacking is a method for combining estimators to reduce their biases. More precisely, the predictions of each individual estimator are stacked together and used as input to a final estimator to compute the prediction. Read more in sklearn's documentation.
A stacking model is created from a trainer through the stacking
method. The stacking model is added automatically to the list of
models in the pipeline, under the Stack
acronym. The underlying
estimator is a custom adaptation of StackingClassifier
or StackingRegressor
depending on the task. The only difference between ATOM's and sklearn's
implementation is that ATOM's implementation doesn't fit estimators if
they're already fitted. The two estimators are customized in this way to
save time and computational resources, since the classes are always
initialized with fitted estimators. For the vast majority of use cases,
the changes will have no effect. If you want to export the estimator and
retrain it on different data, just make sure to clone
the underlying estimators first.