Scaler

class atom.data_cleaning.Scaler(strategy="standard", verbose=0, logger=None, **kwargs) [source]

Apply one of sklearn's scalers. Categorical columns are ignored. This class can be accessed from atom through the scale method. Read more in the user guide.
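To illustrate what "categorical columns are ignored" means in practice, the sketch below standardizes only the numeric columns of a small column-oriented table and passes the other columns through untouched. It is plain Python rather than ATOM's implementation; the function `scale_numeric_columns` and the sample data are hypothetical.

```python
from statistics import fmean, pstdev


def scale_numeric_columns(table):
    """Standardize numeric columns; return all other columns unchanged."""
    scaled = {}
    for name, values in table.items():
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if is_numeric:
            mean, std = fmean(values), pstdev(values)
            # Constant columns have std == 0, so fall back to a divisor of 1.
            scaled[name] = [(v - mean) / (std or 1.0) for v in values]
        else:
            scaled[name] = list(values)  # categorical column: left as-is
    return scaled


data = {"age": [20, 30, 40], "city": ["Oslo", "Rome", "Lima"]}
result = scale_numeric_columns(data)
```

Here `result["city"]` is identical to the input, while `result["age"]` is centered on zero.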

Parameters:

strategy: str, optional (default="standard")
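Strategy with which to scale the data. Choose from:

"standard": Scale with StandardScaler.
"minmax": Scale with MinMaxScaler.
"maxabs": Scale with MaxAbsScaler.
"robust": Scale with RobustScaler.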

verbose: int, optional (default=0)
Verbosity level of the class. Possible values are:

0 to not print anything.
1 to print basic information.

logger: str, Logger or None, optional (default=None)

If None: No log file is saved.
If str: Name of the log file. Use "auto" for automatic naming.
Else: Python logging.Logger instance.

**kwargs
Additional keyword arguments passed to the strategy estimator.
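As a rough guide to how the choice of strategy changes the result, the following sketch scales a single numeric column in four ways. It assumes the available strategies correspond to sklearn's StandardScaler, MinMaxScaler, MaxAbsScaler and RobustScaler with their default settings; `scale_column` is a hypothetical pure-Python helper, not part of ATOM or sklearn.

```python
from statistics import fmean, median, pstdev, quantiles


def scale_column(values, strategy="standard"):
    """Scale one numeric column, mimicking the named scaler's defaults."""
    if strategy == "standard":  # zero mean, unit variance
        center, spread = fmean(values), pstdev(values)
    elif strategy == "minmax":  # map the column onto the range [0, 1]
        center, spread = min(values), max(values) - min(values)
    elif strategy == "maxabs":  # divide by the largest absolute value
        center, spread = 0.0, max(abs(v) for v in values)
    elif strategy == "robust":  # median and interquartile range
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        center, spread = median(values), q3 - q1
    else:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    # A zero spread (constant column) falls back to a divisor of 1.
    return [(v - center) / (spread or 1.0) for v in values]


column = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = {
    s: scale_column(column, s) for s in ("standard", "minmax", "maxabs", "robust")
}
```

The "robust" variant is usually the safer choice when a column contains outliers, since the median and quartiles are barely affected by extreme values.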

Tip

Use atom's scaled attribute to check if the feature set is scaled.

Attributes

Attributes:

estimator: sklearn estimator
Estimator instance used to scale the data.

Methods

fit	Fit to data.
fit_transform	Fit to data, then transform it.
get_params	Get parameters for this estimator.
log	Write information to the logger and print to stdout.
save	Save the instance to a pickle file.
set_params	Set the parameters of this estimator.
transform	Transform the data.

method fit(X, y=None) [source]

Fit the scaler to the data. For the default "standard" strategy, this computes the mean and standard deviation used for scaling.

Parameters:

X: dict, list, tuple, np.ndarray or pd.DataFrame
Feature set with shape=(n_samples, n_features).

y: int, str, sequence or None, optional (default=None)
Does nothing. Implemented for continuity of the API.

Returns:

self: Scaler
Fitted instance of self.
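The fit step for the "standard" strategy can be sketched as follows: compute one mean and one standard deviation per column, store them, and return the instance. `TinyStandardScaler` and its attributes `mean_` and `scale_` are hypothetical stand-ins written in plain Python, not the class documented here.

```python
from statistics import fmean, pstdev


class TinyStandardScaler:
    """Hypothetical stand-in that mimics only the fit step."""

    def fit(self, X, y=None):
        # X is a list of rows; zip(*X) iterates over its columns.
        columns = list(zip(*X))
        self.mean_ = [fmean(col) for col in columns]
        # Population standard deviation; constant columns fall back to 1.0.
        self.scale_ = [pstdev(col) or 1.0 for col in columns]
        return self  # returning self allows call chaining


X_train = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaler = TinyStandardScaler().fit(X_train)
```

Because `y` plays no role in scaling, the sketch accepts it only to keep the same call signature.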

method fit_transform(X, y=None) [source]

Fit to data, then transform it.

Parameters:

X: dict, list, tuple, np.ndarray or pd.DataFrame
Feature set with shape=(n_samples, n_features).

y: int, str, sequence or None, optional (default=None)
Does nothing. Implemented for continuity of the API.

Returns:

X: pd.DataFrame
Scaled feature set.

method get_params(deep=True) [source]

Get parameters for this estimator.

Parameters:

deep: bool, optional (default=True)
If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params: dict
Dictionary of the parameter names mapped to their values.

method log(msg, level=0) [source]

Write a message to the logger and print it to stdout.

Parameters:

msg: str
Message to write to the logger and print to stdout.

level: int, optional (default=0)
Minimum verbosity level to print the message.
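One way to read the `level` parameter: the message is printed only when the instance's verbosity is at least `level`. The sketch below also assumes the message is written to the logger regardless of verbosity. The standalone function `log` and the `ListHandler` class are hypothetical and use only Python's standard `logging` module.

```python
import logging


def log(msg, level=0, verbose=0, logger=None):
    """Print msg if verbose >= level; forward it to the logger if one is set."""
    printed = verbose >= level
    if printed:
        print(msg)
    if logger is not None:
        logger.info(msg)
    return printed


class ListHandler(logging.Handler):
    """Collects log messages in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("scaler_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# With verbose=0 and level=1 nothing is printed, but the logger still records it.
shown = log("Scaling features...", level=1, verbose=0, logger=logger)
```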

method save(filename="auto") [source]

Save the instance to a pickle file.

Parameters:

filename: str, optional (default="auto")
Name of the file. Use "auto" for automatic naming.
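A minimal sketch of saving to and restoring from a pickle file with the standard library is shown below. How "auto" resolves to a file name is an assumption here (a default name passed by the caller), and `save`, `default_name` and the `state` dictionary standing in for a fitted instance are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save(obj, filename="auto", directory=".", default_name="Scaler"):
    """Pickle obj to disk and return the path that was written."""
    # Assumption: "auto" resolves to a default name such as the class name.
    if filename == "auto":
        filename = default_name
    path = Path(directory) / filename
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return path


state = {"strategy": "minmax", "verbose": 0}  # stand-in for a fitted instance

with tempfile.TemporaryDirectory() as tmp:
    path = save(state, directory=tmp)
    with open(path, "rb") as f:
        restored = pickle.load(f)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.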

method set_params(**params) [source]

Set the parameters of this estimator.

Parameters:

**params: dict
Estimator parameters.

Returns:

self: Scaler
Estimator instance.

method transform(X, y=None) [source]

Scale the data with the fitted scaler. For the default "standard" strategy, this centers and scales each feature.

Parameters:

X: dict, list, tuple, np.ndarray or pd.DataFrame
Feature set with shape=(n_samples, n_features).

y: int, str, sequence or None, optional (default=None)
Does nothing. Implemented for continuity of the API.

Returns:

X: pd.DataFrame
Scaled feature set.
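The transform step reuses statistics learned during fit instead of recomputing them from the data it receives. The sketch below shows this for the "standard" strategy with plain lists; the function `transform` and the example statistics are hypothetical, and the real method returns a pd.DataFrame rather than nested lists.

```python
def transform(X, means, scales):
    """Center and scale each row with statistics learned beforehand."""
    return [
        [(value - mean) / scale for value, mean, scale in zip(row, means, scales)]
        for row in X
    ]


# Per-column statistics as they might come out of a previous fit on training data.
means, scales = [2.0, 10.0], [0.5, 4.0]

X_new = [[3.0, 18.0], [2.0, 6.0]]
X_scaled = transform(X_new, means, scales)
```

Note that the scaled new data does not need to have zero mean itself, because the statistics come from the data seen during fit.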

Example

from atom import ATOMRegressor

# X and y are your own feature set and target column
atom = ATOMRegressor(X, y)
atom.scale()

or

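from atom.data_cleaning import Scaler

# X_train and X_test are placeholders for your own feature sets
scaler = Scaler()
scaler.fit(X_train)  # learn the scaling statistics from the training set
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)  # reuse the training statistics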