Stochastic Gradient Descent (SGD)
Stochastic Gradient Descent is a simple yet very efficient approach to fitting linear classifiers and regressors under convex loss functions. Even though SGD has been around in the machine learning community for a long time, it has received a considerable amount of attention just recently in the context of large-scale learning.
Corresponding estimators are:
- SGDClassifier for classification tasks.
- SGDRegressor for regression tasks.
Read more in sklearn's documentation.
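To make the update rule concrete, here is a minimal pure-Python sketch of SGD for a linear classifier with hinge loss and an L2 penalty. It is an illustration only, not scikit-learn's implementation: the names sgd_hinge and predict, the fixed step size eta, and the toy data are assumptions made for this example.

```python
import random


def sgd_hinge(X, y, alpha=1e-4, eta=0.1, epochs=50, seed=0):
    """Fit w, b for a linear classifier with hinge loss and L2 penalty.

    Labels in y must be -1 or +1. One sample is used per update.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in random order
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            margin = y[i] * score
            # Gradient step on the L2 penalty: shrink the weights slightly.
            w = [wj * (1.0 - eta * alpha) for wj in w]
            # Gradient step on the hinge loss: only active when margin < 1.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    """Return the predicted label (-1 or +1) for one sample."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (made up for this sketch).
X = [[2.0, 1.0], [3.0, 2.5], [-1.5, -2.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w, b = sgd_hinge(X, y)
```

Because each update touches a single sample, the cost per step does not depend on the size of the dataset, which is why the approach suits large-scale learning.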
Hyperparameters
- By default, the estimator adopts the default parameters provided by its package. See the user guide on how to customize them.
- The l1_ratio parameter is only used when penalty="elasticnet".
- The eta0 parameter is only used when learning_rate!="optimal".
- The n_jobs and random_state parameters are set equal to those of the trainer.
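The role of l1_ratio can be sketched with the elastic net penalty as it is described in scikit-learn's SGD user guide: a convex combination of the L1 and L2 terms, scaled by alpha. The helper name elastic_net_penalty is hypothetical and exists only for this illustration.

```python
def elastic_net_penalty(w, alpha=1e-4, l1_ratio=0.15):
    """Elastic net regularization term for a weight vector w.

    l1_ratio=0 gives the pure L2 penalty, l1_ratio=1 the pure L1 penalty.
    """
    l1 = sum(abs(wj) for wj in w)
    l2 = sum(wj * wj for wj in w)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


weights = [1.0, -2.0]
pure_l1 = elastic_net_penalty(weights, alpha=1.0, l1_ratio=1.0)
pure_l2 = elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.0)
mixed = elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.5)
```

With any other penalty the mixing term is not part of the objective, which is why l1_ratio has no effect outside penalty="elasticnet".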
Dimensions:
loss: str
penalty: str, default="l2"
alpha: float, default=1e-4
l1_ratio: float, default=0.15
epsilon: float, default=0.1
learning_rate: str, default="optimal"
eta0: float, default=1e-2
power_t: float, default=0.5
average: bool, default=False
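How learning_rate, eta0 and power_t interact can be sketched with the schedules documented in scikit-learn's SGD user guide. The function below is a hypothetical helper, not part of any library. The t0 argument stands in for a value that scikit-learn derives from a heuristic, so its default here is an assumption, and the "adaptive" schedule is left out for brevity.

```python
def learning_rate(schedule, t, eta0=1e-2, power_t=0.5, alpha=1e-4, t0=1.0):
    """Step size at update number t (t >= 1) for a given schedule."""
    if schedule == "constant":
        return eta0
    if schedule == "invscaling":
        return eta0 / t ** power_t
    if schedule == "optimal":
        # eta0 is not used here: the step depends on alpha and t only.
        return 1.0 / (alpha * (t0 + t))
    raise ValueError(f"Unknown schedule: {schedule!r}")


step_constant = learning_rate("constant", t=100)
step_invscaling = learning_rate("invscaling", t=4)
step_optimal = learning_rate("optimal", t=10)
```

The "optimal" branch shows why eta0 is ignored under the default schedule.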
Attributes
Data attributes
Attributes:
dataset: pd.DataFrame
train: pd.DataFrame
test: pd.DataFrame
X: pd.DataFrame
y: pd.Series
X_train: pd.DataFrame
y_train: pd.Series
X_test: pd.DataFrame
y_test: pd.Series
shape: tuple
columns: list
n_columns: int
features: list
n_features: int
target: str
Utility attributes
Attributes:
bo: pd.DataFrame Information of every step taken by the BO. Columns include:
best_call: str
best_params: dict
estimator: class
time_bo: str
metric_bo: float or list
time_fit: str
metric_train: float or list
metric_test: float or list
metric_bootstrap: np.array
mean_bootstrap: float or list
std_bootstrap: float or list
Training results. Columns include:
Prediction attributes
The prediction attributes are not calculated until they are accessed for the first time. This mechanism avoids calculating attributes that are never used, saving time and memory.
Prediction attributes:
predict_train: np.array
predict_test: np.array
decision_function_train: np.array
decision_function_test: np.array
score_train: np.float64
score_test: np.float64
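The lazy behaviour of the prediction attributes can be sketched with functools.cached_property from the standard library. This only illustrates the general pattern and is not ATOM's actual implementation. The LazyModel class, its counter and the toy predictor are hypothetical.

```python
from functools import cached_property


class LazyModel:
    """Toy model whose predictions are computed on first access only."""

    def __init__(self, predict_fn, X_test):
        self._predict_fn = predict_fn
        self._X_test = X_test
        self.n_computations = 0  # counts how often predictions are computed

    @cached_property
    def predict_test(self):
        # Runs once; the result is then stored on the instance.
        self.n_computations += 1
        return [self._predict_fn(x) for x in self._X_test]


model = LazyModel(lambda x: int(x > 0), [-1.0, 0.5, 2.0])
before = model.n_computations   # nothing computed yet
first = model.predict_test      # triggers the computation
second = model.predict_test     # served from the cache
```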
Methods
The majority of the plots and prediction methods can be called directly from the models, e.g. atom.sgd.plot_permutation_importance() or atom.sgd.predict(X). The remaining utility methods can be found hereunder.
- calibrate: Calibrate the model.
- clear: Clear attributes from the model.
- cross_validate: Evaluate the model using cross-validation.
- delete: Delete the model from the trainer.
- dashboard: Create an interactive dashboard to analyze the model.
- evaluate: Get the model's scores for the provided metrics.
- export_pipeline: Export the model's pipeline to a sklearn-like Pipeline object.
- full_train: Train the estimator on the complete dataset.
- rename: Change the model's tag.
- save_estimator: Save the estimator to a pickle file.
- transform: Transform new data through the model's branch.
calibrate
Applies probability calibration on the model. The estimator is trained via cross-validation on a subset of the training data, using the rest to fit the calibrator. The new classifier will replace the estimator attribute. If there is an active mlflow experiment, a new run is started using the name [model_name]_calibrate. Since the estimator changed, the model is cleared. Only available for classifiers.
Parameters:
**kwargs
    Additional keyword arguments for sklearn's CalibratedClassifierCV. Using cv="prefit" will use the trained model and fit the calibrator on the test set. Use this only if you have another, independent set for testing.
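As a rough idea of what the calibration step does, the following sketch applies sklearn's CalibratedClassifierCV directly to an SGDClassifier. It does not go through ATOM, and the synthetic dataset and the choice of cv=3 are assumptions for the example.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic binary classification data (assumption for this sketch).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# With hinge loss, SGDClassifier has no predict_proba of its own.
base = SGDClassifier(loss="hinge", random_state=0)

# Each fold fits the estimator on one part and the calibrator on the rest.
calibrated = CalibratedClassifierCV(base, cv=3).fit(X, y)

# The calibrated classifier exposes class probabilities.
proba = calibrated.predict_proba(X[:5])
```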
clear
Reset attributes to their initial state, deleting potentially large data arrays. Use this method to free some memory before saving the class. The cleared attributes per model are:
cross_validate
Evaluate the model using cross-validation. This method cross-validates the whole pipeline on the complete dataset. Use it to assess the robustness of the solution's performance.
Parameters:
**kwargs
    Additional keyword arguments for sklearn's cross_validate function. If the scoring method is not specified, it uses the trainer's metric.
Returns:
scores: dict
    Return of sklearn's cross_validate function.
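The returned dictionary has the layout of sklearn's cross_validate function. The sketch below calls that function directly on a small scaler plus SGD pipeline to show the shape of the result. The dataset, the pipeline steps, and the recall metric are assumptions and do not reproduce ATOM's own pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (assumption for this sketch).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Cross-validate the whole pipeline so the scaler is refitted in every fold.
pipeline = make_pipeline(StandardScaler(), SGDClassifier(random_state=0))
scores = cross_validate(pipeline, X, y, cv=3, scoring="recall")

# One entry per fold under keys such as "fit_time" and "test_score".
test_scores = scores["test_score"]
```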
delete
Delete the model from the trainer. If it's the last model in the trainer, the metric is reset. Use this method to drop unwanted models from the pipeline or to free some memory before saving. The model is not removed from any active mlflow experiment.
dashboard
Create an interactive dashboard to analyze the model. The dashboard allows you to investigate SHAP values, permutation importances, interaction effects, partial dependence plots, all kinds of performance plots, and even individual decision trees. By default, the dashboard opens in an external dash app.
Parameters:
dataset: str, optional (default="test")
filename: str or None, optional (default=None)
**kwargs
Returns:
dashboard: ExplainerDashboard
    Created dashboard object.
evaluate
Get the model's scores for the provided metrics.
Parameters:
metric: str, func, scorer, sequence or None, optional (default=None)
dataset: str, optional (default="test")
Threshold between 0 and 1 to convert predicted probabilities to class labels. Only used when:
sample_weight: sequence or None, optional (default=None)
Returns:
score: pd.Series
    Scores of the model.
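The threshold mentioned in the parameters converts predicted probabilities into class labels. A minimal sketch of that conversion for a binary task follows. The helper to_labels is hypothetical, and treating a probability equal to the threshold as the positive class is an assumption of this example.

```python
def to_labels(probabilities, threshold=0.5):
    """Map positive-class probabilities to 0/1 labels."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie between 0 and 1")
    return [int(p >= threshold) for p in probabilities]


probabilities = [0.2, 0.5, 0.9]
default_labels = to_labels(probabilities)
strict_labels = to_labels(probabilities, threshold=0.6)
```

Raising the threshold trades recall for precision, so metrics computed on class labels change with it while probability-based metrics do not.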
export_pipeline
Export the model's pipeline to a sklearn-like Pipeline object. If the model used automated feature scaling, the scaler is added to the pipeline. The returned pipeline is already fitted on the training set.
Info
ATOM's Pipeline class behaves the same as a sklearn Pipeline, and additionally:
- Accepts transformers that change the target column.
- Accepts transformers that drop rows.
- Accepts transformers that are only fitted on a subset of the provided dataset.
- Always outputs pandas objects.
- Uses transformers that are only applied on the training set (see the balance or prune methods) to fit the pipeline, not to make predictions on unseen data.
Parameters:
memory: bool, str, Memory or None, optional (default=None)
    Used to cache the fitted transformers of the pipeline.
verbose: int or None, optional (default=None)
Returns:
Pipeline
    Current branch as a sklearn-like Pipeline object.
full_train
In some cases it might be desirable to use all available data to train a final model. Note that doing this means that the estimator can no longer be evaluated on the test set. The newly retrained estimator will replace the estimator attribute. If there is an active mlflow experiment, a new run is started with the name [model_name]_full_train. Since the estimator changed, the model is cleared.
Parameters:
include_holdout: bool, optional (default=False)
    Whether to include the holdout data set (if available) in the training of the estimator. Note that if True, it means the model can't be evaluated.
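The trade-off described here can be sketched in plain sklearn: score the model on the test set first, then refit a copy on all rows. This is an illustration under assumed data and split sizes, not ATOM's internal code.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Synthetic data and an 80/20 split (assumptions for this sketch).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit on the training set and record the last unbiased test score.
model = SGDClassifier(random_state=0).fit(X_train, y_train)
test_score = model.score(X_test, y_test)

# Refit an unfitted copy on the complete dataset. The test rows are now
# part of the training data, so they can no longer be used for evaluation.
final_model = clone(model).fit(X, y)
```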
rename
Change the model's tag. The acronym always stays at the beginning of the model's name. If the model is being tracked by mlflow, the name of the corresponding run is also changed.
Parameters:
name: str or None, optional (default=None)
    New tag for the model. If None, the tag is removed.
save_estimator
Save the estimator to a pickle file.
Parameters:
filename: str, optional (default="auto")
    Name of the file. Use "auto" for automatic naming.
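For reference, a pickle round trip of a fitted SGD estimator looks roughly as follows with the standard library. The file name, the temporary directory and the synthetic data are assumptions. The sketch does not show how ATOM names or writes the file.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic data and a fitted estimator (assumptions for this sketch).
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
estimator = SGDClassifier(random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "SGDClassifier.pkl"  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(estimator, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored estimator makes the same predictions as the original.
same = bool((restored.predict(X) == estimator.predict(X)).all())
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.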
transform
Transform new data through the model's branch. Transformers that are only applied on the training set are skipped. If the model used feature scaling, the data is also scaled.
Parameters:
X: dataframe-like
verbose: int or None, optional (default=None)
Returns:
pd.DataFrame
pd.Series
Example
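from atom import ATOMClassifier
from sklearn.datasets import load_breast_cancer
# Example data; any feature set X and target column y will do.
X, y = load_breast_cancer(return_X_y=True)
atom = ATOMClassifier(X, y)
atom.run(models="SGD", metric="recall", bo_params={"cv": 3})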