Balancer
Balance the number of samples per class in the target column. Use only for classification tasks. This class can be accessed from atom through the balance method. Read more in the user guide.
Parameters: |
strategy: str or estimator, optional (default="ADASYN")
Number of cores to use for parallel processing.
Verbosity level of the class. Possible values are:
Seed used by the random number generator. If None, the random number generator is the RandomState instance used by numpy.random.
**kwargs |
Tip
Use atom's classes attribute for an overview of the target class distribution per data set.
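To make the idea of a per-data-set class overview concrete, here is a minimal sketch using only the standard library. It does not reproduce atom's classes attribute; the helper class_distribution and the y_train and y_test lists are hypothetical and serve only to show the kind of table such an overview contains.

```python
from collections import Counter


def class_distribution(**datasets):
    """Return {class: {dataset_name: count}} for each named list of labels.

    Hypothetical helper, not part of atom.
    """
    # Collect every class that appears in any of the data sets
    classes = sorted({label for labels in datasets.values() for label in labels})
    counts = {name: Counter(labels) for name, labels in datasets.items()}
    # A Counter returns 0 for classes that are absent from a data set
    return {cls: {name: counts[name][cls] for name in datasets} for cls in classes}


# Hypothetical target columns for a training and a test set
y_train = [0] * 8 + [1] * 2
y_test = [0] * 3 + [1] * 1

overview = class_distribution(train=y_train, test=y_test)
for cls, row in overview.items():
    print(cls, row)
```

A strongly skewed row in such an overview is the usual signal that balancing the training set may be worthwhile.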
Attributes
Attributes: |
<strategy>: imblearn estimator
mapping: dict |
Methods
fit_transform | Same as transform. |
get_params | Get parameters for this estimator. |
log | Write information to the logger and print to stdout. |
save | Save the instance to a pickle file. |
set_params | Set the parameters of this estimator. |
transform | Transform the data. |
Oversample or undersample the data.
Parameters: |
X: dict, list, tuple, np.ndarray or pd.DataFrame
|
Returns: |
X: pd.DataFrame
y: pd.Series |
Get parameters for this estimator.
Parameters: |
deep: bool, optional (default=True) |
Returns: |
params: dict Dictionary of the parameter names mapped to their values. |
Write a message to the logger and print it to stdout.
Parameters: |
msg: str
level: int, optional (default=0) |
Save the instance to a pickle file.
Parameters: |
filename: str, optional (default="auto") Name of the file. Use "auto" for automatic naming. |
Set the parameters of this estimator.
Parameters: |
**params: dict Estimator parameters. |
Returns: |
self: Balancer Estimator instance. |
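The get_params and set_params methods follow the familiar estimator convention in which set_params returns the instance itself, so calls can be chained. The sketch below illustrates that convention with a hypothetical SketchEstimator class; its seed parameter and its validation logic are assumptions made for illustration and are not taken from Balancer.

```python
class SketchEstimator:
    """Hypothetical estimator illustrating the get_params/set_params convention."""

    def __init__(self, strategy="ADASYN", seed=None):
        self.strategy = strategy
        self.seed = seed  # hypothetical parameter name

    def get_params(self, deep=True):
        # `deep` is accepted for API compatibility; this sketch has no
        # nested estimators, so it has no effect here.
        return {"strategy": self.strategy, "seed": self.seed}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter: {name!r}")
            setattr(self, name, value)
        # Returning self allows chaining, e.g. est.set_params(...).get_params()
        return self


est = SketchEstimator()
params = est.set_params(strategy="NearMiss", seed=42).get_params()
print(params)
```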
Oversample or undersample the data.
Parameters: |
X: dict, list, tuple, np.ndarray or pd.DataFrame
|
Returns: |
X: pd.DataFrame
y: pd.Series |
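As a rough illustration of what oversampling does to the returned X and y, the sketch below duplicates randomly chosen minority samples until every class is as large as the largest one. This is naive random oversampling written with the standard library only; it is not what Balancer or its default ADASYN strategy does (ADASYN generates synthetic samples rather than copies). The function random_oversample and its seed argument are hypothetical.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=None):
    """Duplicate random samples of the smaller classes until all classes match.

    Hypothetical helper for illustration; not part of atom or imblearn.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())

    X_out, y_out = list(X), list(y)
    for cls, n in counts.items():
        # Row indices of the original samples that belong to this class
        idx = [i for i, label in enumerate(y) if label == cls]
        for _ in range(target - n):
            j = rng.choice(idx)
            X_out.append(X[j])
            y_out.append(cls)
    return X_out, y_out


X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(X, y, seed=1)
print(Counter(y_bal))
```

The original rows come first in the output, followed by the added duplicates, so the returned X and y stay aligned row by row.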
Example
# X and y are assumed to be an already loaded feature set and target column.

# Through atom's balance method
from atom import ATOMClassifier

atom = ATOMClassifier(X, y)
atom.balance(strategy="NearMiss", sampling_strategy=0.7, n_neighbors=10)

# As a standalone class
from atom.data_cleaning import Balancer

balancer = Balancer(strategy="NearMiss", sampling_strategy=0.7, n_neighbors=10)
X_train, y_train = balancer.transform(X_train, y_train)
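The example passes sampling_strategy=0.7 to an under-sampling strategy. In imblearn, a float sampling_strategy for under-sampling is the desired ratio of minority to majority samples after resampling, and it applies to binary targets only. The sketch below works through that arithmetic with the standard library; the helper undersample_targets is hypothetical, and the truncation to an integer and the error handling are simplifying assumptions rather than the library's exact code.

```python
from collections import Counter


def undersample_targets(y, ratio):
    """Return the per-class sample counts an under-sampler would aim for.

    Hypothetical helper: `ratio` = n_minority / n_majority after resampling.
    """
    counts = Counter(y)
    if len(counts) != 2:
        raise ValueError("A float ratio is only defined for binary targets.")

    (majority, n_maj), (minority, n_min) = counts.most_common()
    # Solve ratio = n_min / n_maj_after for the new majority size
    n_maj_after = int(n_min / ratio)
    if n_maj_after > n_maj:
        raise ValueError("Ratio would require adding majority samples.")
    return {majority: n_maj_after, minority: n_min}


# Hypothetical target with 300 majority and 70 minority samples
y = [0] * 300 + [1] * 70
targets = undersample_targets(y, ratio=0.7)
print(targets)  # majority class is cut down to 100 samples, 70 / 100 = 0.7
```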