Release history

Version 6.1.0

New features

Support for metadata routing, to pass data such as sample_weight to estimators, scorers, and CV splitters. Read more in the user guide.
Two new plots: plot_data_splits and plot_cv_splits.
New set_threshold method to change the threshold of a binary classifier.
New pos_label attribute to specify the positive label for binary and multilabel classification tasks.
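To make the threshold and positive-label notes above concrete, here is a minimal sketch in plain Python of how a decision threshold turns positive-class probabilities into labels. It does not use the library and is not its implementation; the helper name `apply_threshold` and all of its parameters are hypothetical.

```python
def apply_threshold(probabilities, threshold=0.5, pos_label=1, neg_label=0):
    """Map positive-class probabilities to labels (hypothetical helper).

    A sample gets `pos_label` when its probability is greater than or
    equal to `threshold`, otherwise it gets `neg_label`.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return [pos_label if p >= threshold else neg_label for p in probabilities]


probs = [0.15, 0.40, 0.65, 0.90]
default_labels = apply_threshold(probs)                 # uses the 0.5 default
strict_labels = apply_threshold(probs, threshold=0.75)  # fewer positives
named_labels = apply_threshold(probs, pos_label="yes", neg_label="no")
```

Raising the threshold trades recall for precision on the positive class, which is the usual reason to move it away from the default.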

API changes

The threshold parameter in the evaluate method is deprecated in favour of the set_threshold method.
Stratification over multiple columns is no longer possible.

Enhancements

The Imputer class now supports custom strategies for numerical columns by passing a function in place of a strategy name.
Refactor of the cross-validation splitting strategy.
Documentation improvements.
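The custom imputation strategy mentioned above can be pictured as a function that computes a fill value from the observed entries of a column. The sketch below is plain Python and only illustrates the idea; the helper `impute` is hypothetical, and the signature the Imputer class actually expects from a custom function is an assumption, not something stated in these notes.

```python
import math
from statistics import median


def impute(values, strategy):
    """Fill missing entries (None or NaN) in a numerical column.

    `strategy` is any callable that receives the list of observed
    values and returns the value used to fill the gaps.
    """
    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    observed = [v for v in values if not is_missing(v)]
    fill = strategy(observed)
    return [fill if is_missing(v) else v for v in values]


column = [1.0, None, 3.0, float("nan"), 8.0]
filled_median = impute(column, median)                      # a standard strategy
filled_custom = impute(column, lambda obs: min(obs) - 1.0)  # a custom function
```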

Bug fixes

New features

Enhancements

New features

API changes

The plot_results method is split into plot_results and plot_bootstrap, and now accepts any metric.
The FeatureGrouper class no longer accepts a name parameter. Provide the group names directly through the group parameter as a dict.
Rework of the register method.
The multioutput attribute is deprecated. Multioutput meta-estimators are now assigned automatically.
Model tags have to be separated from the acronym by an underscore.
The engine parameter is now a dict.
The automl method is deprecated.

Enhancements

Transformations only on y are now accepted, e.g., atom.scale(columns=-1).
The Imputer class has many more strategies for numerical columns designed for time series.
The evaluate method highlights the highest score per metric.
Full support for pandas nullable dtypes.
The dataset can now be provided as a callable.
The FeatureExtractor class can extract features from the dataset's index.
Subplots can now share axes on the canvas.
The save and save_data methods now accept pathlib.Path objects as the filename.
Cleaner representation on hover for the plot_timeline method.
The cv key in ht_params now accepts a custom cross-validation generator.
Improved error message for incorrect stratification of multioutput datasets.
Rework of the shrink method.
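For the custom cross-validation generator mentioned in the list above, the sketch below shows one possible shape: a small splitter class that follows scikit-learn's `split` / `get_n_splits` protocol and yields train and test index pairs. The class `ExpandingWindowSplit` is hypothetical, and whether the cv key in ht_params requires exactly this protocol is an assumption. Indices are returned as plain lists to keep the example free of dependencies; real splitters usually return numpy arrays.

```python
class ExpandingWindowSplit:
    """Hypothetical splitter with a growing train set and fixed-size test set."""

    def __init__(self, n_splits=3, test_size=2):
        self.n_splits = n_splits
        self.test_size = test_size

    def get_n_splits(self, X=None, y=None, groups=None):
        # Number of (train, test) pairs that split() will yield.
        return self.n_splits

    def split(self, X, y=None, groups=None):
        n_samples = len(X)
        first_test_start = n_samples - self.n_splits * self.test_size
        if first_test_start <= 0:
            raise ValueError("not enough samples for the requested splits")
        for i in range(self.n_splits):
            start = first_test_start + i * self.test_size
            # Train on everything before the test window, test on the window.
            yield list(range(start)), list(range(start, start + self.test_size))


cv = ExpandingWindowSplit(n_splits=3, test_size=2)
splits = list(cv.split(range(10)))
```

Each test window starts where the previous one ended, so no test sample is ever used for training in the same split. This ordering matters for time series data.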

Bug fixes

Fixed a bug where the cross_validate method could fail for pipelines that changed the number of rows.
Fixed a bug where the Pruner class didn't drop all outlier clusters.
Fixed a bug where the pipeline could fail for transformers that returned a series.
Fixed a bug where the pipeline could fail for transformers that reset their internal attributes during fitting.
Fixed a bug where the register method failed in Databricks.
Fixed a bug where tuning hyperparameters for a base_estimator inside a custom meta-estimator would fail.
Fixed a bug where the data properties' @setter could fail for numpy arrays.
Fixed a bug where reference lines for some plots didn't lie exactly on the unity line.