Gauss
Transform the data to follow a Gaussian distribution. This transformation is useful for modeling issues related to heteroscedasticity (non-constant variance), or other situations where normality is desired. Missing values are disregarded in fit and maintained in transform. Categorical columns are ignored. This class can be accessed from atom through the gauss method. Read more in the user guide.
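The missing-value behaviour described above can be pictured with a small stand-alone sketch. It does not use atom or the class's real estimator: the helper names `fit_stats` and `transform` are hypothetical, and a plain standardization stands in for the actual transformation. The point is only that the fitted statistics come from the observed values and that missing entries stay in place afterwards.

```python
import math
import statistics


def fit_stats(column: list[float]) -> tuple[float, float]:
    """Learn mean and standard deviation from the observed values only."""
    observed = [v for v in column if not math.isnan(v)]
    return statistics.fmean(observed), statistics.pstdev(observed)


def transform(column: list[float], mean: float, std: float) -> list[float]:
    """Rescale observed values; missing entries pass through unchanged."""
    return [v if math.isnan(v) else (v - mean) / std for v in column]


# Hypothetical numerical column with one missing value.
column = [1.0, float("nan"), 3.0, 5.0]
mean, std = fit_stats(column)           # statistics ignore the NaN
result = transform(column, mean, std)   # the NaN keeps its position
```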
Parameters: |
strategy: str, optional (default="yeo-johnson") The transforming strategy. Choose from: "yeo-johnson", "box-cox" or "quantile".
Verbosity level of the class. Possible values are:
Seed used by the quantile strategy. If None, the random number generator is the RandomState used by numpy.random.
**kwargs |
Info
The yeo-johnson and box-cox strategies apply zero-mean, unit-variance normalization after transforming. Use the kwargs parameter to change this behaviour.
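To make "zero-mean, unit-variance normalization after transforming" concrete, here is a minimal pure-Python sketch of a Yeo-Johnson transform followed by standardization. It is an illustration, not the class's implementation: the function names are hypothetical, and lambda is fixed by hand, whereas an estimator such as scikit-learn's PowerTransformer fits it from the data and exposes a `standardize` flag for the second step. That this flag is what the kwargs would reach is an assumption.

```python
import math
import statistics


def yeo_johnson(x: float, lmbda: float) -> float:
    """Yeo-Johnson power transform of a single value for a fixed lambda."""
    if x >= 0:
        if abs(lmbda) < 1e-12:
            return math.log1p(x)
        return ((x + 1.0) ** lmbda - 1.0) / lmbda
    if abs(lmbda - 2.0) < 1e-12:
        return -math.log1p(-x)
    return -(((-x + 1.0) ** (2.0 - lmbda)) - 1.0) / (2.0 - lmbda)


def standardize(values: list[float]) -> list[float]:
    """Rescale values to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical right-skewed sample. Lambda is fixed here for illustration;
# a real estimator would estimate it from the data.
sample = [0.1, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
transformed = [yeo_johnson(v, lmbda=0.0) for v in sample]
normalized = standardize(transformed)
```

Skipping the `standardize` step leaves the values on the scale produced by the power transform itself.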
Tip
Use atom's plot_distribution method to visualize the transformation.
Warning
Note that the quantile strategy performs a non-linear transformation. This may distort linear correlations between variables measured at the same scale but renders variables measured at different scales more directly comparable.
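The effect described in this warning can be reproduced with a simplified, rank-based version of a quantile-to-normal mapping. This is a sketch under stated assumptions: the helpers `quantile_to_normal` and `pearson` are hypothetical, ties are not handled, and the real quantile strategy interpolates between estimated quantiles rather than using raw ranks.

```python
import math
from statistics import NormalDist


def quantile_to_normal(values: list[float]) -> list[float]:
    """Map each value to the standard-normal quantile of its rank (no ties)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        result[idx] = NormalDist().inv_cdf((rank + 0.5) / n)
    return result


def pearson(a: list[float], b: list[float]) -> float:
    """Pearson linear correlation coefficient of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Two hypothetical variables on the same scale, dominated by one outlier.
a = [1.0, 2.0, 3.0, 4.0, 100.0]
b = [2.0, 1.0, 4.0, 3.0, 100.0]
corr_before = pearson(a, b)
corr_after = pearson(quantile_to_normal(a), quantile_to_normal(b))

# A variable on a very different scale but with the same ordering as `a`.
c = [1.0, 10.0, 100.0, 1000.0, 10000.0]
same_output = quantile_to_normal(a) == quantile_to_normal(c)
```

In this sketch the linear correlation between `a` and `b` drops after the mapping because only the ordering of the values survives, while `a` and `c` end up with identical outputs despite their different scales.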
Attributes
Attributes: |
estimator: sklearn estimator Estimator's instance with which the data is transformed. |
Methods
fit | Fit to data. |
fit_transform | Fit to data, then transform it. |
get_params | Get parameters for this estimator. |
log | Write information to the logger and print to stdout. |
save | Save the instance to a pickle file. |
set_params | Set the parameters of this estimator. |
transform | Transform the data. |
Fit to data.
Parameters: |
X: dict, list, tuple, np.ndarray or pd.DataFrame
y: int, str, sequence or None, optional (default=None) |
Returns: |
self: Gauss Fitted instance of self. |
Fit to data, then transform it.
Parameters: |
X: dict, list, tuple, np.ndarray or pd.DataFrame
y: int, str, sequence or None, optional (default=None) |
Returns: |
X: pd.DataFrame Transformed feature set. |
Get parameters for this estimator.
Parameters: |
deep: bool, optional (default=True) |
Returns: |
params: dict Dictionary of the parameter names mapped to their values. |
Write a message to the logger and print it to stdout.
Parameters: |
msg: str
level: int, optional (default=0) |
Save the instance to a pickle file.
Parameters: |
filename: str, optional (default="auto") Name of the file. Use "auto" for automatic naming. |
Set the parameters of this estimator.
Parameters: |
**params: dict Estimator parameters. |
Returns: |
self: Gauss Estimator instance. |
Apply the transformations to the data.
Parameters: |
X: dict, list, tuple, np.ndarray or pd.DataFrame
y: int, str, sequence or None, optional (default=None) |
Returns: |
X: pd.DataFrame Transformed feature set. |
Example
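# Through atom, using the gauss method (X and y are your own feature set and target)
from atom import ATOMRegressor
atom = ATOMRegressor(X, y)
atom.gauss()

# As a stand-alone transformer: fit on the training set, then transform the data
from atom.data_cleaning import Gauss
gauss = Gauss()
gauss.fit(X_train)
X = gauss.transform(X)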