Predicting
After running a successful pipeline, it's possible you would like to
apply all used transformations onto new data, or make predictions using
one of the trained models. Just like a sklearn estimator, you can call
the prediction methods from a fitted trainer, e.g. atom.predict(X)
.
Calling the method without specifying a model will use the winning model
in the pipeline (under attribute winner
). To use a different model,
simply call the method from a model, e.g. atom.AdaB.predict(X)
.
All prediction methods transform the provided data through all
transformers in the current branch before making the predictions.
By default, this excludes transformers that should only be applied
on the training set, like outlier pruning and balancing the dataset.
Use the method's pipeline
parameter to customize which
transformations to apply with every call.
The available prediction methods are a selection of the most common methods for estimators in sklearn's API:
transform | Transform new data through all transformers in a branch. |
predict | Transform new data through all transformers in a branch and return class predictions. |
predict_proba | Transform new data through all transformers in a branch and return class probabilities. |
predict_log_proba | Transform new data through all transformers in a branch and return class log-probabilities. |
decision_function | Transform new data through all transformers in a branch and return confidence scores. |
score | Transform new data through all transformers in a branch and return a metric score. |
Except for transform, the prediction methods can be calculated on the train
and test set. You can access them through the model's prediction attributes,
e.g. atom.mnb.predict_train
or atom.mnb.predict_test
. Keep in mind that
the results are not calculated until the attribute is called for the first
time. This mechanism avoids having to calculate attributes that are never
used, saving time and memory.
Note
Many of the plots use the prediction attributes. This can considerably increase the size of the instance for large datasets. Use the reset_predictions method if you need to free some memory!