
Release history


Version 4.8.0

  • The Encoder class now directly handles unknown categories encountered during fitting.
  • The Balancer and Encoder classes now accept custom estimators for the strategy parameter.
  • The new merge method enables the user to merge multiple atom instances into one.
  • The dtype shrinking is moved from atom's initializers to the shrink method.
  • ATOM's custom pipeline now handles transformers fitted on a subset of the dataset.
  • The column parameter in the distribution method is renamed to columns for consistency with the rest of the API.
  • The mae criterion for the GBM model hyperparameter tuning is deprecated to be consistent with sklearn's API.
  • Branches are now case-insensitive.
  • Renaming a branch using an existing name now raises an exception.
  • Fixed a bug where columns of type category broke the Imputer class.
  • Fixed a bug where predictions of the Stacking ensemble crashed for branches with multiple transformers.
  • The tables in the documentation now adapt to dark mode.

Version 4.7.3

  • Fixed a bug where the conda-forge recipe couldn't install properly.

Version 4.7.2

  • Fixed a bug where the pipeline failed for custom transformers that returned sparse matrices.
  • Package requirements files are added to the installer.

Version 4.7.1

  • Fixed a bug where the pip installer failed.
  • Fixed a bug where the selection of categorical columns also included datetime columns.

Version 4.7.0

  • Launched our new Slack channel!
  • The new FeatureExtractor class extracts useful features from datetime columns.
  • The new plot_det method plots a binary classifier's detection error tradeoff curve.
  • The partial dependence plot can now draw Individual Conditional Expectation (ICE) lines.
  • The full traceback of exceptions encountered during training is now saved to the logger.
  • ATOMClassifier and ATOMRegressor now convert the dtypes of the input data to the minimal allowed type for memory efficiency.
  • The scoring method is renamed to evaluate to clarify its purpose.
  • The column parameter in the apply method is renamed to columns for consistency with the rest of the API.
  • Minor documentation improvements.
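The dtype conversion mentioned above can be sketched in plain Python. The helper below is purely illustrative and hypothetical, not ATOM's implementation (which operates on pandas dtypes): it picks the smallest integer type whose range covers a column's values.

```python
# Illustrative sketch of dtype shrinking: choose the smallest integer
# type that can hold a column's value range. Helper name is hypothetical.

INT_BOUNDS = [
    ("int8", -2**7, 2**7 - 1),
    ("int16", -2**15, 2**15 - 1),
    ("int32", -2**31, 2**31 - 1),
    ("int64", -2**63, 2**63 - 1),
]

def shrink_int_dtype(values):
    """Return the name of the smallest int type covering the values."""
    lo, hi = min(values), max(values)
    for name, tmin, tmax in INT_BOUNDS:
        if tmin <= lo and hi <= tmax:
            return name
    raise OverflowError("values exceed int64 range")

print(shrink_int_dtype([0, 120]))      # int8
print(shrink_int_dtype([-40_000, 1]))  # int32
```

The same walk-the-ladder idea applies to floats and unsigned integers; shrinking dtypes this way can cut a dataset's memory footprint substantially without changing its values.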

Version 4.6.0

  • Added the full_train method to retrieve an estimator trained on the complete dataset.
  • The score method is now also able to calculate custom metrics on new data.
  • Refactor of the Imputer class.
  • Refactor of the Encoder class to avoid errors for unknown classes and allow the input of missing values.
  • The clean method no longer automatically encodes the target column for regression tasks.
  • Creating a branch using a model's acronym as its name now raises an exception.
  • Fixed a bug where CatBoost failed when early_stopping < 1.
  • Fixed a bug where created pipelines had duplicated names.

Version 4.5.0

  • Support of NLP pipelines. Read more in the user guide.
  • Integration of mlflow to track all models in the pipeline. Read more in the user guide.
  • The new Gauss class transforms features to a more Gaussian-like distribution.
  • New cross_validate method to evaluate the robustness of a pipeline using cross-validation.
  • New reset method to go back to atom's initial state.
  • Added the Dummy model to compare other models with a simple baseline.
  • New plot_wordcloud and plot_ngrams methods for text visualization.
  • Plots can now return the figure object when display=None.
  • The Pruner class can now drop outliers based on a selection of multiple strategies.
  • The new shuffle parameter in atom's initializer determines whether to shuffle the dataset.
  • The trainers no longer require you to specify a model using the models parameter. If left to default, all predefined models for that task are used.
  • The apply method now accepts args and kwargs for the function.
  • Refactor of the evaluate method.
  • Refactor of the export_pipeline method.
  • The parameters in the Cleaner class have been refactored to better describe their function.
  • The train_sizes parameter in train_sizing now accepts integer values to automatically create equally distributed splits in the training set.
  • Refactor of plot_pipeline to show models in the diagram as well.
  • Refactor of the bagging parameter to the (more appropriate) name n_bootstrap.
  • New option to exclude columns from a transformer by adding ! before their name.
  • Fixed a bug where the Pruner class failed if there were categorical columns in the dataset.
  • Completely reworked documentation website.
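The ! exclusion option can be sketched as below. The function name and exact resolution rules are hypothetical assumptions, shown only to illustrate the prefix semantics, not ATOM's code.

```python
def select_columns(requested, available):
    """Resolve a column selection where names prefixed with '!' are excluded.

    Illustrative only: if the request contains nothing but exclusions,
    start from all available columns and drop the excluded ones.
    """
    excluded = {name[1:] for name in requested if name.startswith("!")}
    included = [name for name in requested if not name.startswith("!")]
    base = included or list(available)
    return [col for col in base if col not in excluded]

cols = ["age", "income", "target"]
print(select_columns(["!target"], cols))  # ['age', 'income']
```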

Version 4.4.0

  • The drop method now allows the user to drop columns as part of the pipeline.
  • New apply method to add data transformations as a function to the pipeline.
  • Added the status method to save an overview of atom's branches and models to the logger.
  • Improved the output messages for the Imputer class.
  • The dataset's columns can now be called directly from atom.
  • The distribution and plot_distribution methods now ignore missing values.
  • Fixed a bug where transformations could fail when columns were added to the dataset after initializing the pipeline.
  • Fixed a bug where the Cleaner class didn't drop columns consisting entirely of missing values when drop_min_cardinality=True.
  • Fixed a bug where the winning model wasn't displayed correctly.
  • Refactored the way transformers are added or removed from predicting methods.
  • Improved documentation.

Version 4.3.0

  • Possibility to add custom transformers to the pipeline.
  • The export_pipeline utility method exports atom's current pipeline to a sklearn object.
  • Use AutoML to automate the search for an optimized pipeline.
  • New magic methods make atom behave similarly to sklearn's Pipeline.
  • All training approaches can now be combined in the same atom instance.
  • New plot_scatter_matrix, plot_distribution and plot_qq plots for data inspection.
  • Complete rework of all the shap plots to be consistent with their new API.
  • Improvements for the Scaler and Pruner classes.
  • The acronym for custom models now defaults to the capital letters in the class' __name__.
  • Possibility to apply transformations on only a subset of the columns.
  • Plots and methods now accept winner as model name.
  • Fixed a bug where custom metrics didn't show the correct name.
  • Fixed a bug where timers were not displayed correctly.
  • Further compatibility with deep learning datasets.
  • Large refactoring for performance optimization.
  • Cleaner output of messages to the logger.
  • Plots no longer show a default title.
  • Added the AutoML example notebook.
  • Minor bug fixes.
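The default-acronym rule for custom models (the capital letters in the class' __name__) can be illustrated with a small sketch. The helper name and its fallback for all-lowercase names are assumptions, not ATOM's code.

```python
def default_acronym(model_class):
    """Illustrative: derive a default acronym from the capitals in __name__."""
    caps = "".join(ch for ch in model_class.__name__ if ch.isupper())
    # Hypothetical fallback when the name contains no capital letters.
    return caps or model_class.__name__.upper()

class RandomSubspaceEnsemble:
    pass

print(default_acronym(RandomSubspaceEnsemble))  # RSE
```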

Version 4.2.1

  • Bug fix for a memory leak in successive halving and train sizing pipelines.
  • The XGBoost, LightGBM and CatBoost packages can now be installed through the installer's extras_require under the name models, e.g. pip install -U atom-ml[models].
  • Improved documentation.

Version 4.2.0

  • Possibility to add custom models to the pipeline using ATOMModel.
  • Compatibility with deep learning models.
  • New branch system for different data pipelines. Read more in the user guide.
  • Use the canvas contextmanager to draw multiple plots in one figure.
  • New voting and stacking ensemble techniques.
  • New get_class_weight utility method.
  • New Sequential Feature Selection strategy for the FeatureSelector.
  • Added the sample_weight parameter to the score method.
  • New ways to initialize the data in the training instances.
  • The n_rows parameter in ATOMLoader is deprecated in favour of the new input formats.
  • The test_size parameter now also allows integer values.
  • Renamed categories to classes to be consistent with sklearn's API.
  • The class property now returns a pd.DataFrame of the number of rows per target class in the train, test and complete dataset.
  • Possibility to add custom parameters to an estimator's fit method through est_params.
  • The successive halving and train sizing approaches now both allow subsequent runs from atom without losing the information from previous runs.
  • Bug fix where ATOMLoader wouldn't encode the target column during transformation.
  • Added the Deep learning, Ensembles and Utilities example notebooks.
  • Compatibility with Python 3.9.

Version 4.1.0

  • New est_params parameter to customize the parameters in every model's estimator.
  • Following skopt's API, the n_random_starts parameter to specify the number of random trials is deprecated in favour of n_initial_points.
  • The Balancer class now allows you to use any of the strategies from imblearn.
  • New utility attributes to inspect the dataset.
  • Four new models: CatNB, CNB, ARD and RNN.
  • Added the models section to the documentation.
  • Small changes in log outputs.
  • Bug fixes and performance improvements.

Version 4.0.1

  • Bug fix where the FeatureGenerator was not deterministic for a fixed random state.
  • Bug fix where subsequent runs with the same metric failed.
  • Added the license file to the package's installer.
  • Typo fixes in documentation.

Version 4.0.0

  • Bayesian optimization package changed from GpyOpt to skopt.
  • Complete revision of the model's hyperparameters.
  • Four SHAP plots can now be called directly from an ATOM pipeline.
  • Two new plots for regression tasks.
  • New plot_pipeline and pipeline attribute to access all transformers.
  • Possibility to determine transformer parameters per method.
  • New calibration method and plot.
  • Metrics can now be added as scorers or functions with signature metric(y, y_pred, **kwargs).
  • Implementation of multi-metric runs.
  • Possibility to choose which metric to plot.
  • Early stopping for models that allow in-training evaluation.
  • Added the ATOMLoader function to load any saved pickle instance.
  • The "remove" strategy in the data cleaning parameters is deprecated in favour of "drop".
  • Implemented the DFS strategy in FeatureGenerator.
  • All training classes now inherit from BaseEstimator.
  • Added multiple new example notebooks.
  • Tests coverage up to 100%.
  • Completely new documentation page.
  • Bug fixes and performance improvements.
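The metric(y, y_pred, **kwargs) signature mentioned above can be illustrated with a hand-rolled metric. The function below is a hypothetical example (a Huber-style error), not part of ATOM; any callable with this signature would fit.

```python
def huber_like_error(y, y_pred, delta=1.0, **kwargs):
    """A custom metric following the metric(y, y_pred, **kwargs) signature.

    Quadratic penalty for small errors, linear for errors beyond delta.
    """
    total = 0.0
    for yt, yp in zip(y, y_pred):
        err = abs(yt - yp)
        total += 0.5 * err**2 if err <= delta else delta * (err - 0.5 * delta)
    return total / len(y)

print(huber_like_error([1.0, 2.0, 3.0], [1.0, 2.5, 5.0]))  # ≈ 0.5417
```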