FeatureExtractor
Create new features by extracting datetime elements (day, month, year, etc.) from the provided columns. Columns of dtype datetime64 are used as is. Categorical columns that can be successfully converted to a datetime format (less than 30% NaT values after conversion) are also used. This class can be accessed from atom through the feature_extraction method. Read more in the user guide.
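The conversion rule described above can be sketched with the standard library alone. This is only an illustration of the idea, not the class's implementation: the helper names (to_datetimes, is_datetime_column, extract) are hypothetical, None stands in for NaT, and the 30% limit is taken from the description.

```python
from datetime import datetime


def to_datetimes(values, fmt):
    """Parse each string with strptime; values that fail become None (stand-in for NaT)."""
    parsed = []
    for value in values:
        try:
            parsed.append(datetime.strptime(value, fmt))
        except (TypeError, ValueError):
            parsed.append(None)
    return parsed


def is_datetime_column(parsed, max_missing=0.3):
    """Accept the column only if less than 30% of the values failed to convert."""
    missing = sum(value is None for value in parsed)
    return missing / len(parsed) < max_missing


def extract(parsed, features=("day", "month", "year")):
    """Build one new column per requested datetime element."""
    return {
        name: [getattr(value, name) if value is not None else None for value in parsed]
        for name in features
    }


column = ["15/03/2021", "02/11/2020", "not a date", "28/07/2019"]
parsed = to_datetimes(column, "%d/%m/%Y")
accepted = is_datetime_column(parsed)  # 1 of 4 values failed (25%), below the limit
new_features = extract(parsed)
```

Here the column is accepted because only one value in four fails to convert, and new_features holds one list per extracted element.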
Parameters:
features: str or sequence, optional (default=["day", "month", "year"]) Datetime elements to extract as new features.
fmt: str, sequence or None, optional (default=None) Format (strptime syntax, e.g. "%d/%m/%Y") used to convert categorical columns to datetime.
drop_columns: bool, optional (default=True) Whether to drop the original columns after the features are extracted from them.
Warning
Decision tree-based algorithms build their split rules on one feature at a time. They therefore cannot process cyclic features correctly, since the sin/cos pair should be treated as a single coordinate system.
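To make the warning concrete, the sketch below encodes months on the unit circle. It assumes the usual sine/cosine formulation with a period of 12; the helper names (cyclic_encode, distance) are hypothetical and not part of the class.

```python
import math


def cyclic_encode(value, period):
    """Map a periodic value onto the unit circle as a (sin, cos) pair."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def distance(a, b):
    """Euclidean distance between two (sin, cos) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


january, may, june, december = (cyclic_encode(m, 12) for m in (1, 5, 6, 12))

ordinal_gap = abs(12 - 1)                  # ordinal encoding puts December and January 11 apart
cyclic_gap = distance(december, january)   # neighbours on the circle
half_year_gap = distance(december, june)   # opposite sides of the circle

# The sine alone cannot tell January from May: both have sin = 0.5.
# Only the (sin, cos) pair identifies the month, which a single-feature split cannot use.
same_sine = math.isclose(january[0], may[0])
```

December and January end up close together on the circle, but a split on the sine column alone treats January and May as identical.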
Methods
fit_transform | Same as transform. |
get_params | Get parameters for this estimator. |
log | Write information to the logger and print to stdout. |
save | Save the instance to a pickle file. |
set_params | Set the parameters of this estimator. |
transform | Transform the data. |
fit_transform
Extract the new features.
Parameters:
X: dict, list, tuple, np.ndarray or pd.DataFrame
y: int, str, sequence or None, optional (default=None)
Returns:
X: pd.DataFrame Dataframe containing the new features.
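get_params
Get parameters for this estimator.
Parameters:
deep: bool, optional (default=True) If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:
params: dict Dictionary of the parameter names mapped to their values.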
log
Write a message to the logger and print it to stdout.
Parameters:
msg: str Message to write to the logger and print to stdout.
level: int, optional (default=0)
save
Save the instance to a pickle file.
Parameters:
filename: str, optional (default="auto") Name of the file. Use "auto" for automatic naming.
set_params
Set the parameters of this estimator.
Parameters:
**params: dict Estimator parameters.
Returns:
self: FeatureExtractor Estimator instance.
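The get_params and set_params pair follows the scikit-learn estimator convention. The sketch below mimics that convention with a hypothetical stand-in class (SketchEstimator) so it runs without any dependency; it is not the class's actual code, and the deep argument is accepted only for API compatibility.

```python
import inspect


class SketchEstimator:
    """Hypothetical stand-in that mimics the scikit-learn parameter convention."""

    def __init__(self, features=("day", "month", "year"), fmt=None, drop_columns=True):
        self.features = features
        self.fmt = fmt
        self.drop_columns = drop_columns

    def get_params(self, deep=True):
        # Parameter names are read from the __init__ signature.
        # This sketch has no nested estimators, so `deep` changes nothing.
        names = list(inspect.signature(self.__init__).parameters)
        return {name: getattr(self, name) for name in names}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self  # returning the instance allows call chaining


estimator = SketchEstimator().set_params(features=["day"], fmt="%d/%m/%Y")
params = estimator.get_params()
```

Because set_params returns the instance, construction and configuration can be chained in one expression.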
transform
Extract the new features.
Parameters:
X: dict, list, tuple, np.ndarray or pd.DataFrame
y: int, str, sequence or None, optional (default=None)
Returns:
X: pd.DataFrame Dataframe containing the new features.
Example
# Through atom (X and y are assumed to be an existing feature set and target)
from atom import ATOMClassifier

atom = ATOMClassifier(X, y)
atom.feature_extraction(features=["day", "month"], fmt="%d/%m/%Y")

# As a standalone class
from atom.feature_engineering import FeatureExtractor

feature_extractor = FeatureExtractor(features=["day", "month"], fmt="%d/%m/%Y")
X = feature_extractor.transform(X)