
Nomenclature


This documentation uses a consistent set of terms for concepts related to this package. The most frequent ones are described below.

dataframe-like
Any object from which a pd.DataFrame can be created. This includes an iterable, a dict whose values are 1d-arrays, a two-dimensional list, tuple, np.array or scipy.sparse.matrix and, most commonly, a dataframe. This is the standard input format for any dataset.
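
As a rough illustration of what dataframe-like means in practice, the sketch below builds the same small table from three of the listed input types. The variable names and values are made up for this example, and only plain pandas and numpy calls are used, not any part of this package.

```python
import numpy as np
import pandas as pd

# Three dataframe-like inputs describing the same 3x2 table (illustrative data).
as_dict = {"a": np.array([1, 2, 3]), "b": np.array([4, 5, 6])}  # dict of 1d-arrays
as_list = [[1, 4], [2, 5], [3, 6]]  # two-dimensional list
as_array = np.array(as_list)  # two-dimensional np.array

frames = [
    pd.DataFrame(as_dict),
    pd.DataFrame(as_list, columns=["a", "b"]),
    pd.DataFrame(as_array, columns=["a", "b"]),
]

# Every input yields a frame holding the same values.
same_values = all(f.to_numpy().tolist() == as_list for f in frames)
```

Inputs without column names (lists, tuples, arrays) need the names supplied separately, which is why `columns` is passed explicitly here.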

atom
Instance of the ATOMClassifier or ATOMRegressor classes (note that the examples use it as the default variable name).

ATOM
Refers to this package.

branch
Collection of transformers fitted to a specific dataset. See the branches section.

BO
Bayesian optimization algorithm used for hyperparameter tuning.

categorical columns
Refers to all columns of type object or category.
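
One way to pick out such columns with plain pandas is `select_dtypes`. This is a minimal sketch on a made-up frame; it is not necessarily how the package selects them internally.

```python
import pandas as pd

# Illustrative frame: one numerical, one object and one category column.
df = pd.DataFrame(
    {
        "age": [25, 32, 47],
        "city": pd.Series(["Paris", "Oslo", "Lima"], dtype="object"),
        "grade": pd.Categorical(["a", "b", "a"]),
    }
)

# Keep only the columns whose dtype is object or category.
categorical_columns = df.select_dtypes(include=["object", "category"]).columns.tolist()
```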

class
Unique value in a column, e.g. a binary classifier has 2 classes in the target column.

estimator
An object which manages the estimation and decoding of an algorithm. The algorithm is estimated as a deterministic function of a set of parameters, a dataset and a random state.

missing values
All values in the missing attribute, as well as None, NaN, +inf and -inf.
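
The definition above can be sketched as a small predicate. The function name `is_missing` and the placeholder strings in `extra_missing` are assumptions made for this example; they stand in for whatever the missing attribute contains and are not taken from the package.

```python
import math


def is_missing(value, extra_missing=("", "?", "NA")):
    """Return True if value counts as missing under the definition above.

    extra_missing is a hypothetical stand-in for the missing attribute.
    """
    if value is None:
        return True
    # NaN, +inf and -inf are all floats that fail the finiteness check.
    if isinstance(value, float) and not math.isfinite(value):
        return True
    return value in extra_missing


values = [1.5, None, float("nan"), float("-inf"), "?", "abc"]
flags = [is_missing(v) for v in values]
```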

model
Instance of a model in the pipeline.

outlier
Sample that contains one or more outlier values. Note that the Pruner class can use a different definition for outliers depending on the chosen strategy.

outlier value
Value that lies more than 3 standard deviations away from the mean of its column, i.e. |z-score| > 3.
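
The z-score rule can be sketched with the standard library alone. The helper name `outlier_value_mask` is hypothetical, and the choice of the population standard deviation is an assumption, since the definition does not say which variant is meant.

```python
import statistics


def outlier_value_mask(column, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # assumption: population standard deviation
    if std == 0:
        # A constant column has no spread, so nothing can be flagged.
        return [False] * len(column)
    return [abs((v - mean) / std) > threshold for v in column]


# Nineteen typical values and one extreme value (illustrative data).
column = [10.0] * 19 + [100.0]
mask = outlier_value_mask(column)

# Indices of the samples that contain an outlier value in this column.
outlier_rows = [i for i, flagged in enumerate(mask) if flagged]
```

With several columns, a sample would count as an outlier as soon as any one of its values is flagged this way.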

pipeline
Dataset, transformers and models in a specific branch.

scorer
A non-estimator callable object which evaluates an estimator on given test data, returning a number. Unlike evaluation metrics, a greater returned number must correspond with a better score. See sklearn's documentation.
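
To make the "greater is better" rule concrete, here is a minimal sketch of a scorer that uses the `scorer(estimator, X, y)` call convention from sklearn. The `MeanPredictor` class and the data are hypothetical and written only for this example. The mean absolute error is an evaluation metric where lower is better, so the scorer returns its negative.

```python
class MeanPredictor:
    """Toy estimator (hypothetical) that always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def neg_mae_scorer(estimator, X, y):
    """Score an estimator on test data; a greater value means a better fit."""
    predictions = estimator.predict(X)
    mae = sum(abs(p - t) for p, t in zip(predictions, y)) / len(y)
    return -mae  # negate the error so that higher is better


X = [[0], [1], [2]]
y = [1.0, 2.0, 3.0]
score = neg_mae_scorer(MeanPredictor().fit(X, y), X, y)
```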

sequence
A one-dimensional array of type list, tuple, np.array or pd.Series. This is the standard input format for a dataset's target column.

target
Name of the dependent variable, passed as y to an estimator's fit method.

task
One of the three supervised machine learning approaches that ATOM supports:

trainer
Instance of a class that trains and evaluates the models (implements a run method). The following classes are considered trainers:

transformer
An estimator implementing a transform method. This encompasses all data cleaning and feature engineering classes.
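
As a minimal sketch of that interface, the class below learns column means in `fit` and subtracts them in `transform`. `MeanCenterer` is a hypothetical example written in plain Python, not one of the package's own classes.

```python
class MeanCenterer:
    """Toy transformer (hypothetical) that centers every column on zero."""

    def fit(self, X, y=None):
        # Learn one mean per column from the rows of X.
        n_rows = len(X)
        self.means_ = [sum(col) / n_rows for col in zip(*X)]
        return self

    def transform(self, X):
        # Subtract the learned column means from every row.
        return [[v - m for v, m in zip(row, self.means_)] for row in X]


X = [[1.0, 10.0], [3.0, 30.0]]
X_centered = MeanCenterer().fit(X).transform(X)
```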
