Release history


Version 5.2.0

⭐ New features

🚀 Enhancements

  • Data splits in every trial are now properly stratified according to the selected strategy.
  • Performance optimization for multiple methods using smart caching.
  • Improved visualizations for plots with logarithmic hyperparameters.
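
As an illustration of what a stratified split preserves, here is a minimal sketch using scikit-learn's train_test_split directly. It is not atom's internal implementation; the toy data and variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Imbalanced toy target (assumed data): 90 negatives, 10 positives.
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 90 + [1] * 10)

# stratify=y keeps the class ratio the same in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fraction of positives in each subset; both match the 10% of the full set.
train_ratio = y_train.mean()
test_ratio = y_test.mean()
```

Without the stratify argument, a small test set drawn from an imbalanced target can end up with too few positives, or none at all, which makes per-trial scores unreliable.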

🐛 Bug fixes

  • Fixed a bug where the parameters in a trial did not match those displayed.

Version 5.1.2

📝 API changes

  • The default strategy for the encode method has changed from "LeaveOneOut" to "Target" encoding. LeaveOneOut is no longer a supported strategy.

🐛 Bug fixes

  • Fixed a bug where stratification failed for datasets where the target column was not placed last.
  • Fixed a bug where transformers with no get_feature_names_out method could fail.
  • Fixed a bug where the FeatureSelector class could fail when transforming a dataset with different column order than seen at fit time.

Version 5.1.1

📝 API changes

  • The Encoder class parameter for handling infrequent categories is now named infrequent_to_value, to be consistent with sklearn's naming convention.

🚀 Enhancements

  • Added the kwargs parameter to the save_data method.

🐛 Bug fixes

  • Fixed an installation issue for systems without an x86 architecture.
  • Fixed a bug where Voting would fail for certain metrics.
  • Fixed a bug where the time metric in mlflow was always zero.
  • Fixed a bug where shap plots wouldn't display the full column names.
  • Fixed a bug where column names were not properly propagated during transformation.

Version 5.1.0

⭐ New features

  • Support for multilabel classification, multiclass-multilabel classification and multioutput regression tasks. Read more in the user guide.
  • New backend parameter to choose a parallel execution backend.
  • New parallel parameter to train multiple models simultaneously.
  • Integration with DAGsHub to store your mlflow experiments. Read more in the user guide.
  • New serve method to deploy models to a REST API endpoint.
  • New get_best_threshold method to calculate the optimal threshold for binary and multilabel tasks.
  • New get_sample_weight method to calculate the sample weights for a balanced data set.
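
To show what sample weights for a balanced dataset look like, here is a minimal sketch using scikit-learn's compute_sample_weight. Whether get_sample_weight uses the same formula is an assumption; the toy target is made up for the example.

```python
import numpy as np
from sklearn.utils.class_weight import compute_sample_weight

# Assumed toy target: three samples of class 0, one of class 1.
y = np.array([0, 0, 0, 1])

# "balanced" assigns n_samples / (n_classes * class_count) to each sample.
weights = compute_sample_weight("balanced", y)

# The summed weight per class is equal, so each class contributes equally.
class_totals = {int(c): float(weights[y == c].sum()) for c in np.unique(y)}
```

Most scikit-learn estimators accept such an array through the sample_weight argument of fit.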

📝 API changes

  • The ATOMLoader class is deprecated in favor of the load method.
  • The errors attribute for runners is deprecated.

🚀 Enhancements

  • Added three new notebook examples.
  • Added the drop_chars parameter to the Cleaner class.
  • Added the errors parameter to the trainers.
  • Rework of the dependencies, making the base package more lightweight.
  • The logging entries for external libraries are redirected to atom's file handler.

🐛 Bug fixes

  • Fixed multiple errors that appeared after sklearn's 1.2 update.
  • Fixed a bug where hyperparameter tuning could fail for multi-metric runs.
  • Fixed a bug where trials would try to report the same step multiple times.
  • Fixed a bug where custom models could skip in-training validation.
  • Fixed an issue where the bootstrapping estimators were trained using partial_fit.

Version 5.0.1

🐛 Bug fixes

  • Fixed an installation issue.
  • Updated package dependencies.

Version 5.0.0

⭐ New features

📝 API changes

  • The gpu parameter is deprecated in favor of device and engine.
  • Refactor of the Cleaner, Discretizer, Encoder and FeatureSelector classes.
  • Refactor of all shap plots.
  • Refactor of the apply method.
  • The plot_scatter_matrix method is renamed to plot_relationships.
  • The kSVM model is renamed to SVM.
  • Multidimensional datasets are no longer supported. Check the deep learning section of the user guide for guidance with such datasets.
  • The greater_is_better, needs_proba and needs_threshold parameters are deprecated. Metric functions are now created using make_scorer's default parameters.
  • The drop method is removed from atom. Use the reworked apply method instead.
  • The prediction methods can no longer be called from atom.
  • The dashboard method for models is now called create_dashboard.
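
For the metric change above, this sketch shows what a scorer built with make_scorer's default parameters does in plain scikit-learn. The dataset and model are assumptions chosen for illustration, and how atom wraps metric functions internally is not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, make_scorer

# Assumed toy classification problem and model.
X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# With default parameters the scorer calls model.predict and treats
# higher values as better.
scorer = make_scorer(f1_score)
score = scorer(model, X, y)

# Equivalent direct computation for comparison.
expected = f1_score(y, model.predict(X))
```

A metric that needs probabilities, or one where lower is better, requires an explicitly configured scorer rather than the defaults.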

🚀 Enhancements

🐛 Bug fixes

  • The FeatureExtractor class no longer raises a warning for highly fragmented dataframes.
  • Fixed a bug where models could not call the score function.
  • The Encoder class no longer fails when the user provides ordinal values that are not present during fitting.
  • Fixed a bug with the max_nan_rows parameter in the Imputer class.
  • Fixed a bug where Tokenizer could fail when no ngrams were found.