Scaler


class atom.data_cleaning.Scaler(strategy="standard", verbose=0, logger=None, **kwargs) [source]

Apply one of sklearn's scalers. Categorical columns are ignored. This class can be accessed from atom through the scale method. Read more in the user guide.

Parameters: strategy: str, optional (default="standard")
Strategy with which to scale the data. Each strategy applies one of sklearn's scalers.
verbose: int, optional (default=0)
Verbosity level of the class. Possible values are:
  • 0 to not print anything.
  • 1 to print basic information.
logger: str, Logger or None, optional (default=None)
  • If None: Doesn't save a logging file.
  • If str: Name of the log file. Use "auto" for automatic naming.
  • Else: Python logging.Logger instance.

**kwargs
Additional keyword arguments passed to the strategy estimator.

Tip

Use atom's scaled attribute to check if the feature set is scaled.


Attributes

Attributes: estimator: sklearn estimator
Estimator instance with which the data is scaled.


Methods

fit Fit to data.
fit_transform Fit to data, then transform it.
get_params Get parameters for this estimator.
log Write information to the logger and print to stdout.
save Save the instance to a pickle file.
set_params Set the parameters of this estimator.
transform Transform the data.


method fit(X, y=None) [source]

Fit the underlying scaler to the data. With the default "standard" strategy, this computes the mean and standard deviation used for scaling.

Parameters:

X: dict, list, tuple, np.ndarray or pd.DataFrame
Feature set with shape=(n_samples, n_features).

y: int, str, sequence or None, optional (default=None)
Does nothing. Implemented for continuity of the API.

Returns: self: Scaler
Fitted instance of self.
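
As a rough illustration of what fitting computes under the default "standard" strategy, the sketch below derives a per-column mean and standard deviation in plain Python. It is not ATOM's implementation, which delegates to an sklearn estimator. `column_stats` is a hypothetical helper, and the use of the population standard deviation is an assumption chosen to mirror sklearn's StandardScaler.

```python
from statistics import fmean, pstdev


def column_stats(rows):
    """Return (means, stds) per column for a list of equal-length rows."""
    columns = list(zip(*rows))  # transpose rows into columns
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population std (ddof=0)
    return means, stds


# Toy training set with shape=(n_samples, n_features) = (3, 2).
X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
means, stds = column_stats(X_train)
```

These statistics are what a fitted scaler stores, so later calls to transform can reuse them without looking at the training data again.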


method fit_transform(X, y=None) [source]

Fit to data, then transform it.

Parameters:

X: dict, list, tuple, np.ndarray or pd.DataFrame
Feature set with shape=(n_samples, n_features).

y: int, str, sequence or None, optional (default=None)
Does nothing. Implemented for continuity of the API.

Returns: X: pd.DataFrame
Scaled feature set.


method get_params(deep=True) [source]

Get parameters for this estimator.

Parameters:

deep: bool, optional (default=True)
If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns: params: dict
Dictionary of the parameter names mapped to their values.


method log(msg, level=0) [source]

Write a message to the logger and print it to stdout.

Parameters:

msg: str
Message to write to the logger and print to stdout.

level: int, optional (default=0)
Minimum verbosity level to print the message.
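
The interaction between the verbose setting and the level parameter can be sketched as follows. This is a minimal stand-in, not ATOM's code: `LevelGatedLogger` is a hypothetical class, and forwarding every message to the logger regardless of verbosity is an assumption.

```python
import logging


class LevelGatedLogger:
    """Hypothetical object with `verbose` and `logger` settings."""

    def __init__(self, verbose=0, logger=None):
        self.verbose = verbose
        self.logger = logger  # a logging.Logger instance or None

    def log(self, msg, level=0):
        # Print only when the configured verbosity reaches the message level.
        if self.verbose >= level:
            print(msg)
        # Forward to the logger when one is configured (assumption).
        if self.logger is not None:
            self.logger.info(msg)


demo = LevelGatedLogger(verbose=1)
demo.log("Scaling features...", level=1)  # printed: verbose (1) >= level (1)
demo.log("Per-column details", level=2)   # suppressed: verbose (1) < level (2)
```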


method save(filename="auto") [source]

Save the instance to a pickle file.

Parameters: filename: str, optional (default="auto")
Name of the file. Use "auto" for automatic naming.


method set_params(**params) [source]

Set the parameters of this estimator.

Parameters: **params: dict
Estimator parameters.

Returns: self: Scaler
Estimator instance.


method transform(X, y=None) [source]

Scale the data with the fitted estimator. With the default "standard" strategy, this standardizes the data by centering and scaling.

Parameters:

X: dict, list, tuple, np.ndarray or pd.DataFrame
Feature set with shape=(n_samples, n_features).

y: int, str, sequence or None, optional (default=None)
Does nothing. Implemented for continuity of the API.

Returns: X: pd.DataFrame
Scaled feature set.
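
For the default "standard" strategy, the transformation amounts to subtracting each column's mean and dividing by its standard deviation, using statistics learned during fit. The plain-Python sketch below illustrates the arithmetic only. `standardize` is a hypothetical helper, not part of ATOM, and mapping zero-variance columns to 0.0 is an assumption.

```python
def standardize(rows, means, stds):
    """Center each column on its mean and divide by its std."""
    return [
        # A column with zero std is constant; map its values to 0.0.
        [(value - m) / s if s else 0.0 for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Statistics as they would be learned from the training set [[1, 10], [3, 30]].
means, stds = [2.0, 20.0], [1.0, 10.0]

# New data is scaled with the training statistics, not its own.
X_new = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
X_scaled = standardize(X_new, means, stds)
```

Reusing the training statistics keeps training and unseen data on the same scale, which is why fit and transform are separate steps.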


Example

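from atom import ATOMRegressor

# X and y are the feature set and target, loaded beforehand.
atom = ATOMRegressor(X, y)
atom.scale()

or

from atom.data_cleaning import Scaler

# Fit on the training set, then scale it with the fitted instance.
scaler = Scaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)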