Release history
Version 5.2.0
New features
- Two new plot methods: plot_terminator_improvement and plot_timeline.
Enhancements
- Data splits in every trial are now properly stratified according to the selected strategy.
- Performance optimization for multiple methods using smart caching.
- Improved visualizations for plots with logarithmic hyperparameters.
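
To illustrate what a stratified split means in the first enhancement above, here is a minimal standard-library sketch. It is not the package's implementation: the helper name stratified_split, the 25% test size and the toy labels are assumptions made for the example. It only shows that every class keeps the same proportion in both subsets.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.25, seed=0):
    """Return (train_idx, test_idx) so each class keeps its share in both sets.

    Hypothetical helper for illustration only, not part of the package.
    """
    rng = random.Random(seed)

    # Group the row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy data: 80 samples of class 0 and 20 samples of class 1.
labels = [0] * 80 + [1] * 20
train_idx, test_idx = stratified_split(labels, test_size=0.25)
```

With the toy labels, the test set holds 25 rows, 5 of them from the minority class, so the 80/20 class ratio is preserved in both subsets.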
Bug fixes
- Fixed a bug where parameters in a trial would not match with those displayed.
Version 5.1.2
API changes
- The default strategy for the encode method has changed from "LeaveOneOut" to "Target"-encoding. LeaveOneOut is no longer a supported strategy.
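
As a rough illustration of the new default, target encoding replaces each category with the mean of the target observed for that category. Leave-one-out encoding does the same but excludes the current row's own target from the mean. The sketch below is a simplified, unsmoothed version written with the standard library. The function name target_encode and the toy data are assumptions, and library implementations commonly add smoothing toward the global mean, which is omitted here.

```python
from collections import defaultdict


def target_encode(categories, targets):
    """Map each category to the mean target value observed for it.

    Hypothetical helper for illustration only; no smoothing is applied.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1

    # Per-category mean of the target.
    means = {cat: sums[cat] / counts[cat] for cat in sums}
    return [means[cat] for cat in categories], means


colors = ["red", "red", "blue", "blue", "blue"]
y = [1, 0, 1, 1, 0]
encoded, mapping = target_encode(colors, y)
```

Here "red" is encoded as 0.5 (one positive out of two rows) and "blue" as two thirds (two positives out of three rows).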
Bug fixes
- Fixed a bug where stratification failed for datasets where the target column was not placed last.
- Fixed a bug where transformers with no get_feature_names_out method could fail.
- Fixed a bug where the FeatureSelector class could fail when transforming a dataset with a different column order than seen at fit time.
Version 5.1.1
API changes
- The infrequent_to_value parameter in the Encoder class is replaced with infrequent_to_value to be consistent with sklearn's naming convention.
Enhancements
- Added the kwargs parameter to the save_data method.
Bug fixes
- Fixed an installation issue for systems without an x86 architecture.
- Fixed a bug where Voting would fail for certain metrics.
- Fixed a bug where the time metric in mlflow was always zero.
- Fixed a bug where shap plots wouldn't display the full column names.
- Fixed a bug where column names were not properly propagated during transformation.
Version 5.1.0
New features
- Support for multilabel classification, multiclass-multilabel classification and multioutput regression tasks. Read more in the user guide.
- New backend parameter to choose a parallel execution backend.
- New parallel parameter to train multiple models simultaneously.
- Integration with DAGsHub to store your mlflow experiments. Read more in the user guide.
- New serve method to deploy models to a REST API endpoint.
- New get_best_threshold method to calculate the optimal threshold for binary and multilabel tasks.
- New get_sample_weight method to calculate the sample weights for a balanced data set.
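
As an illustration of what balanced sample weights look like, the sketch below uses the common "balanced" heuristic, where each sample is weighted by n_samples / (n_classes * count of its class). Whether get_sample_weight uses exactly this formula is an assumption. The helper name balanced_sample_weight and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_sample_weight(labels):
    """Weight each sample by n_samples / (n_classes * count of its class).

    Hypothetical helper for illustration only, not the package's method.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return [n_samples / (n_classes * counts[label]) for label in labels]


# Toy data: three samples of class 0 and one sample of class 1.
y = [0, 0, 0, 1]
weights = balanced_sample_weight(y)
```

With these weights, each class contributes the same total weight (2.0 here), so the minority class is not drowned out during training.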
API changes
- The ATOMLoader class is deprecated in favor of the load method.
- The errors attribute for runners is deprecated.
Enhancements
- Added three new notebook examples.
- Added the drop_chars parameter to the Cleaner class.
- Added the errors parameter to the trainers.
- Rework of the dependencies, making the base package more lightweight.
- The logging entries for external libraries are redirected to atom's file handler.
Bug fixes
- Fixed multiple errors that appeared after sklearn's 1.2 update.
- Fixed a bug where hyperparameter tuning could fail for multi-metric runs.
- Fixed a bug where trials would try to report the same step multiple times.
- Fixed a bug where custom models could skip in-training validation.
- Fixed an issue where the bootstrapping estimators were trained using partial_fit.
Version 5.0.1
Bug fixes
- Fixed installation issue.
- Updated package dependencies.
Version 5.0.0
New features
- Completely new hyperparameter tuning process.
- Completely reworked plotting interface.
- Accelerate your pipelines with sklearnex.
- New FeatureGrouper class to extract statistical features from similar groups.
- New create_app method to create a nice front-end for model predictions.
- New inverse_transform method for atom and models.
- New linear model: OrthogonalMatchingPursuit.
- The plot_results method now accepts time metrics.
API changes
- The gpu parameter is deprecated in favor of device and engine.
- Refactor of the Cleaner, Discretizer, Encoder and FeatureSelector classes.
- Refactor of all shap plots.
- Refactor of the apply method.
- The plot_scatter_matrix method is renamed to plot_relationships.
- The kSVM model is renamed to SVM.
- Multidimensional datasets are no longer supported. Check the deep learning section of the user guide for guidance with such datasets.
- The greater_is_better, needs_proba and needs_threshold parameters are deprecated. Metric functions are now created using make_scorer's default parameters.
- The drop method is removed from atom. Use the reworked apply method instead.
- The prediction methods can no longer be called from atom.
- The dashboard method for models is now called create_dashboard.
Enhancements
- New examples for plotting, automated feature scaling, pruning and advanced hyperparameter tuning.
- The Normalizer class can now be accelerated with GPU.
- The Scaler class now ignores binary columns (only 0s and 1s).
- The models parameter in plot and utility methods now accepts model indices.
- The transform method now also transforms only y when X has a default value.
- The prediction methods now return pandas objects.
- After unpickling, dependency versions are checked against the original versions.
- Automatic generation of documentation from docstrings.
- Improvements in documentation display for mobile phones.
- New feature_importance attribute for models.
- Added a visualization for automated feature scaling to plot_pipeline.
Bug fixes
- The FeatureExtractor class no longer raises a warning for highly fragmented dataframes.
- Fixed a bug where models could not call the score function.
- The Encoder class no longer fails when the user provides ordinal values that are not present during fitting.
- Fixed a bug with the max_nan_rows parameter in the Imputer class.
- Fixed a bug where Tokenizer could fail when no ngrams were found.