Balancer
Balance the number of samples per class in the target column. When oversampling, the newly created samples have an increasing integer index for numerical indices, and an index of the form [estimator]_N for non-numerical indices, where N stands for the N-th sample in the data set. Use only for classification tasks. This class can be accessed from atom through the balance method. Read more in the user guide.
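The index rule for oversampled rows can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper `new_sample_index` is hypothetical, and it assumes that integer indices continue from the current maximum and that N continues counting from the number of rows already in the data set.

```python
def new_sample_index(index, n_new, estimator_name):
    """Return index labels for n_new oversampled rows (hypothetical helper).

    Assumptions: integer indices continue from the current maximum, and
    other indices use "<estimator>_N", with N counting rows in the data set.
    """
    if all(isinstance(i, int) and not isinstance(i, bool) for i in index):
        # Numerical index: keep increasing from the largest existing label.
        start = max(index) + 1 if index else 0
        return list(range(start, start + n_new))
    # Non-numerical index: label each new row as the N-th sample overall.
    start = len(index) + 1
    return [f"{estimator_name}_{n}" for n in range(start, start + n_new)]


print(new_sample_index([0, 1, 2], 2, "adasyn"))        # [3, 4]
print(new_sample_index(["a", "b", "c"], 2, "adasyn"))  # ['adasyn_4', 'adasyn_5']
```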
Parameters: |
strategy: str or estimator, optional (default="ADASYN")
n_jobs: Number of cores to use for parallel processing.
verbose: Verbosity level of the class. Possible values are:
random_state: Seed used by the random number generator. If None, the random number generator is the RandomState instance used by np.random.
**kwargs |
Tip
Use atom's classes attribute for an overview of the target class distribution per data set.
Warning
The clustercentroids estimator is unavailable because of incompatibilities of the APIs.
Attributes
Attributes: |
<strategy>: imblearn estimator
mapping: dict |
Methods
fit_transform | Same as transform. |
get_params | Get parameters for this estimator. |
log | Write information to the logger and print to stdout. |
save | Save the instance to a pickle file. |
set_params | Set the parameters of this estimator. |
transform | Transform the data. |
fit_transform
Balance the data.
Parameters: |
X: dataframe-like
|
Returns: |
X: pd.DataFrame
y: pd.Series |
get_params
Get parameters for this estimator.
Parameters: |
deep: bool, optional (default=True) |
Returns: |
params: dict Parameter names mapped to their values. |
log
Write a message to the logger and print it to stdout.
Parameters: |
msg: str
level: int, optional (default=0) |
save
Save the instance to a pickle file.
Parameters: |
filename: str, optional (default="auto") Name of the file. Use "auto" for automatic naming. |
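Saving to a pickle file and loading it back can be sketched as follows. This is not the library's code: the helpers `save` and `load` are hypothetical, the stand-in object is a `SimpleNamespace` rather than a real Balancer, and resolving "auto" to the class name is an assumption about what automatic naming does.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save(instance, filename="auto", directory="."):
    """Pickle `instance` to disk and return the path (hypothetical helper)."""
    # Assumption: "auto" resolves to the class name; the real rule may differ.
    if filename == "auto":
        filename = type(instance).__name__
    path = Path(directory) / filename
    with open(path, "wb") as f:
        pickle.dump(instance, f)
    return path


def load(path):
    """Read a pickled instance back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a fitted instance; a real Balancer would be passed instead.
instance = SimpleNamespace(strategy="NearMiss")

with tempfile.TemporaryDirectory() as tmp:
    path = save(instance, directory=tmp)
    restored = load(path)
    print(path.name, restored.strategy)
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.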
set_params
Set the parameters of this estimator.
Parameters: |
**params: dict Estimator parameters. |
Returns: |
self: Balancer Estimator instance. |
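The get_params and set_params pair follows the familiar scikit-learn convention: one returns a dict of parameter names and values, the other updates them and returns the instance. A minimal sketch of that convention, using a hypothetical `SketchEstimator` class rather than the real Balancer:

```python
class SketchEstimator:
    """Hypothetical stand-in that follows the scikit-learn parameter convention."""

    def __init__(self, strategy="ADASYN", random_state=None):
        self.strategy = strategy
        self.random_state = random_state

    def get_params(self, deep=True):
        # Parameter names mapped to their current values. `deep` is accepted
        # for signature compatibility; this sketch has no nested estimators.
        return {"strategy": self.strategy, "random_state": self.random_state}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter: {name!r}")
            setattr(self, name, value)
        return self  # returning the instance allows chained calls


est = SketchEstimator().set_params(strategy="NearMiss")
print(est.get_params())  # {'strategy': 'NearMiss', 'random_state': None}
```

Because set_params returns the instance, it can be chained directly after construction, as in the last lines of the sketch.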
transform
Balance the data.
Parameters: |
X: dataframe-like
|
Returns: |
X: pd.DataFrame
y: pd.Series |
Example
# Through atom's balance method (X and y are the features and target column)
from atom import ATOMClassifier
atom = ATOMClassifier(X, y)
atom.balance(strategy="NearMiss", sampling_strategy=0.7, n_neighbors=10)

# Or as a stand-alone transformer on an existing training set
from atom.data_cleaning import Balancer
balancer = Balancer(strategy="NearMiss", sampling_strategy=0.7, n_neighbors=10)
X_train, y_train = balancer.transform(X_train, y_train)
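To get a feel for what sampling_strategy=0.7 asks for, the sketch below computes the class counts an undersampler would aim for on a binary target. It assumes imblearn's convention that a float is the desired ratio of minority to majority samples after resampling. The helper `undersample_target_counts` is hypothetical and only counts rows; it does not select samples the way NearMiss does.

```python
from collections import Counter


def undersample_target_counts(y, sampling_strategy):
    """Class counts after undersampling a binary target to the given ratio."""
    counts = Counter(y)
    if len(counts) != 2:
        raise ValueError("A float ratio is only defined for binary targets.")
    (majority, n_maj), (minority, n_min) = counts.most_common()
    # ratio = n_minority / n_majority_after, so solve for the majority size.
    # The sketch clamps at the current size instead of raising an error.
    n_maj_after = min(n_maj, int(n_min / sampling_strategy))
    return {majority: n_maj_after, minority: n_min}


y = [0] * 100 + [1] * 35
print(Counter(y))                          # Counter({0: 100, 1: 35})
print(undersample_target_counts(y, 0.7))   # {0: 50, 1: 35}
```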