Release history
Version 6.1.0
New features
- Support for metadata routing, to pass data such as
sample_weight
to estimators, scorers, and CV splitters. Read more in the user guide. - Two new plots: plot_data_splits and plot_cv_splits.
- New set_threshold method to change the threshold of a binary classifier.
- New
pos_label
attribute to specify the positive label for binary and multilabel classification tasks.
API changes
- The
threshold
parameter in the evaluate method is deprecated in favour of the set_threshold method. - Stratification over multiple columns is no longer possible.
Enhancements
- The Imputer class now supports custom strategies for numerical columns by passing a function in place of a strategy name.
- Refactor of the cross-validation splitting strategy.
- Documentation improvements.
Bug fixes
- Fix a bug in
conda
dependencies for Windows and macOS.
Version 6.0.1
New features
- Support for Python 3.12.
Enhancements
- Packaging license file.
Version 6.0.0
New features
- Completely new module for time series. Read more in the user guide.
- Support for Python 3.11 and drop support for Python 3.8 and Python 3.9.
- New data engines. Read more in the user guide.
- Added the
dask
parallelization backend. - Improved memory optimizations. Read more in the user guide.
- Added the
iterative
strategy for numerical imputation. - Added the
hdbscan
strategy to the Pruner class. - Use the
ignore
parameter to ignore columns in the dataset. - New update_traces method to further customize your plots.
API changes
- The plot_results method is divided into plot_results and plot_bootstrap and accepts any metric.
- The FeatureGrouper class no longer accepts a
name
parameter. Provide the group names directly through thegroup
parameter as dict. - Rework of the register method.
- The
multioutput
attribute is deprecated. Multioutput meta-estimators are now assigned automatically. - Model tags have to be separated from the acronym by an underscore.
- The
engine
parameter is now a dict. - The
automl
method is deprecated.
Enhancements
- Transformations only on
y
are now accepted, e.g.,atom.scale(columns=-1)
. - The Imputer class has many more strategies for numerical columns designed for time series.
- The evaluate method highlights the highest score per metric.
- Full support for pandas nullable dtypes.
- The dataset can now be provided as callable.
- The FeatureExtractor class can extract features from the dataset's index.
- Subplots can now share axes on the canvas.
- The save and save_data
methods now accept pathlib.Path objects as
filename
. - Cleaner representation on hover for the plot_timeline method.
- The
cv
key inht_params
now accepts a custom cross-validation generator. - Improved error message for incorrect stratification of multioutput datasets.
- Rework of the shrink method.
Bug fixes
- Fixed a bug where the cross_validate method could fail for pipelines that changed the number of rows.
- Fixed a bug where the Pruner class didn't drop all outlier clusters.
- Fixed a bug where the pipeline could fail for transformers that returned a series.
- Fixed a bug where the pipeline could fail for transformers that reset its internal attributes during fitting.
- Fixed a bug where the register method failed in Databricks.
- Fixed a bug where tuning hyperparameter for a
base_estimator
inside a custom meta-estimator would fail. - Fixed a bug where the data properties'
@setter
could fail for numpy arrays. - Fixed a bug where reference lines for some plots didn't lie exactly on the unity line.