Discretizer
Bin continuous data into intervals. For each feature, the bin edges are computed during fit and, together with the number of bins, they define the intervals. Categorical columns are ignored. It can be accessed from atom through the discretize method. Read more in the user guide.
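The fit/transform split described above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper names `fit_quantile_edges` and `assign_bins` are hypothetical, and the choice of right-closed intervals is an assumption.

```python
from bisect import bisect_left
from statistics import quantiles


def fit_quantile_edges(values, n_bins):
    """Return the n_bins - 1 interior cut points for quantile binning (hypothetical helper)."""
    return quantiles(values, n=n_bins, method="inclusive")


def assign_bins(values, edges):
    """Map each value to the index of its interval, assuming right-closed intervals."""
    return [bisect_left(edges, v) for v in values]


# "fit": the edges are learned once from the training values of one feature
train = [1, 2, 3, 4, 5, 6, 7, 8, 9]
edges = fit_quantile_edges(train, n_bins=4)

# "transform": the stored edges are reused on new values
binned = assign_bins([0, 4, 10], edges)
```

Because the edges are stored at fit time, new data is binned with the same intervals as the training data, whatever its own distribution looks like.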
Parameters: |
strategy: str, optional (default="quantile") Strategy used to define the widths of the bins. Choose from:
Label names with which to replace the binned intervals.
Train estimator on GPU (instead of CPU). Not for strategy="custom".
Verbosity level of the class. Choose from:
|
Tip
The transformation returns categorical columns. Use the Encoder class to convert them back to numerical types.
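As a rough picture of what converting the binned labels back to numbers means, the sketch below applies a simple ordinal mapping in plain Python. It does not use the Encoder class; `ordinal_encode` is a hypothetical helper and the label order is an assumption made for illustration.

```python
def ordinal_encode(column, order):
    """Replace each label by its position in `order` (hypothetical helper)."""
    mapping = {label: i for i, label in enumerate(order)}
    return [mapping[label] for label in column]


# Labels as they might come out of a discretized column
binned_column = ["child", "adult", "child"]
encoded = ordinal_encode(binned_column, order=["child", "adult"])
```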
Warning
If strategy="custom", the columns returned can contain NaN if the values lie outside the specified bin edges.
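To make the warning concrete, here is a minimal plain-Python sketch of custom binning with labels. It is not the library's code: `cut_custom` is a hypothetical helper, and treating each interval as right-closed, (left, right], is an assumption.

```python
import math
from bisect import bisect_left


def cut_custom(values, edges, labels):
    """Label each value by its (left, right] interval; values outside the edges become NaN."""
    if len(labels) != len(edges) - 1:
        raise ValueError("need exactly one label per interval")
    out = []
    for v in values:
        if edges[0] < v <= edges[-1]:
            # bisect_left gives the index of the interval's right edge
            out.append(labels[bisect_left(edges, v) - 1])
        else:
            # no interval covers this value
            out.append(math.nan)
    return out


ages = [5, 18, 42, 130]
result = cut_custom(ages, edges=[0, 18, 120], labels=["child", "adult"])
```

Here 130 lies above the last edge, so it has no interval and ends up as NaN. Widening the outer edges so they cover the full range of the data avoids this.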
Attributes
Attributes: |
feature_names_in_: np.array
n_features_in_: int |
Methods
fit | Fit to data. |
fit_transform | Fit to data, then transform it. |
get_params | Get parameters for this estimator. |
log | Write information to the logger and print to stdout. |
save | Save the instance to a pickle file. |
set_params | Set the parameters of this estimator. |
transform | Transform the data. |
Fit to data.
Parameters: |
X: dataframe-like
y: int, str, sequence or None, optional (default=None) |
Returns: |
Discretizer Fitted instance of self. |
Fit to data, then transform it. Note that leaving y=None can lead to errors if the strategy encoder requires target values.
Parameters: |
X: dataframe-like
y: int, str, sequence or None, optional (default=None) |
Returns: |
pd.DataFrame Transformed feature set. |
Get parameters for this estimator.
Parameters: |
deep: bool, optional (default=True) |
Returns: |
dict Parameter names mapped to their values. |
Write a message to the logger and print it to stdout.
Parameters: |
msg: str
level: int, optional (default=0) |
Save the instance to a pickle file.
Parameters: |
filename: str, optional (default="auto") Name of the file. Use "auto" for automatic naming. |
Set the parameters of this estimator.
Parameters: |
**params: dict Estimator parameters. |
Returns: |
Discretizer Estimator instance. |
Bin the data into intervals.
Parameters: |
X: dataframe-like
y: int, str, sequence or None, optional (default=None) |
Returns: |
pd.DataFrame Transformed feature set. |
Example
# Through atom
from atom import ATOMClassifier

atom = ATOMClassifier(X, y)
atom.discretize(strategy="custom", bins=[0, 18, 120], labels=["child", "adult"])

# As a stand-alone class
from atom.data_cleaning import Discretizer

discretizer = Discretizer(
    strategy="custom",
    bins=[0, 18, 120],
    labels=["child", "adult"],
)
discretizer.fit(X_train)
X = discretizer.transform(X)