plot_distribution
method plot_distribution(columns=0, distributions=None, show=None, title=None, legend="upper right", figsize=None, filename=None, display=True)
Plot column distributions.
- For numerical columns, plot the probability density distribution. Additionally, it's possible to plot any of the scipy.stats distributions fitted to the column.
- For categorical columns, plot the class distribution. Only one categorical column can be plotted at the same time.
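To make the numerical case concrete, the following is a minimal sketch of a Gaussian kernel density estimate, the kind of curve shown when no distributions are requested. It is written in plain Python for illustration only: the function name gaussian_kde and the hand-picked bandwidth are assumptions of this sketch, not atom's implementation, and libraries usually choose the bandwidth automatically.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from `samples`.

    Illustrative helper (not part of atom): every sample contributes one
    Gaussian bump of width `bandwidth`, and the bumps are averaged.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum the Gaussian kernels centred on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up values standing in for one numerical column.
values = [1.0, 2.0, 2.5, 4.0]
density = gaussian_kde(values, bandwidth=0.5)
print(density(2.0))
```

A smaller bandwidth gives a spikier curve and a larger one a smoother curve. The estimate always integrates to 1, which is what makes it comparable to a fitted distribution's density.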
Tip
Use atom's distribution method to check which distribution fits the column best.
Parameters | columns: int, str, slice or sequence, default=0
Columns to plot. It's only possible to plot one categorical
column. If more than one categorical column is selected,
all categorical columns are ignored.
distributions: str, sequence or None, default=None
Names of the scipy.stats distributions to fit to the
columns. If None, a Gaussian kde distribution is
shown. Only for numerical columns.
show: int or None, default=None
Number of classes (ordered by number of occurrences) to
show in the plot. If None, it shows all classes. Only for
categorical columns.
title: str, dict or None, default=None
Title for the plot.
legend: str, dict or None, default="upper right"
Legend for the plot. See the user guide for
an extended description of the choices.
figsize: tuple or None, default=None
Figure's size in pixels, formatted as (x, y). If None, it
adapts the size to the plot's type.
filename: str or None, default=None
Save the plot using this name. Use "auto" for automatic
naming. The type of the file depends on the provided name
(.html, .png, .pdf, etc...). If filename has no file type,
the plot is saved as html. If None, the plot is not saved.
display: bool or None, default=True
Whether to render the plot. If None, it returns the figure.
|
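The effect of the show parameter on a categorical column can be sketched with the standard library. The helper top_classes below is hypothetical and only mirrors the described behavior (classes ordered by number of occurrences, all classes when show is None). It is not atom's code, and the sample values are made up.

```python
from collections import Counter


def top_classes(values, show=None):
    """Return (class, count) pairs ordered by number of occurrences.

    Hypothetical helper: `show` limits the result to the most frequent
    classes; None keeps every class.
    """
    return Counter(values).most_common(show)


animals = ["dog", "cat", "dog", "bird", "dog", "cat"]
print(top_classes(animals, show=2))
print(top_classes(animals))
```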
Returns | go.Figure or None
Plot object. Only returned if display=None.
|
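Taken together, filename and display decide what happens to the finished figure. The sketch below restates those rules as a small function. resolve_output is a hypothetical name used only for illustration, and the "auto" option is deliberately left out because the automatic naming rule is not described in this section.

```python
from pathlib import Path


def resolve_output(filename=None, display=True):
    """Return (save_path, returns_figure) for the given arguments.

    Hypothetical helper restating the documented rules, not atom's code.
    """
    if filename == "auto":
        # The automatic naming rule is not specified here, so it is not modelled.
        raise ValueError("automatic naming is not modelled in this sketch")

    save_path = None  # None means the plot is not saved
    if filename is not None:
        save_path = Path(filename)
        if not save_path.suffix:
            # No file type given: fall back to html.
            save_path = save_path.with_suffix(".html")

    # The figure is only returned when display is None.
    returns_figure = display is None
    return save_path, returns_figure


print(resolve_output("distribution", display=None))
print(resolve_output("distribution.png"))
```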
See Also
Plot a correlation matrix.
Plot a quantile-quantile plot.
Plot pairwise relationships in a dataset.
Example
>>> from atom import ATOMClassifier
>>> import numpy as np
>>> from sklearn.datasets import load_breast_cancer
>>> X, y = load_breast_cancer(return_X_y=True, as_frame=True)
>>> # Add a categorical feature
>>> animals = ["cat", "dog", "bird", "lion", "zebra"]
>>> probabilities = [0.001, 0.1, 0.2, 0.3, 0.399]
>>> X["animals"] = np.random.choice(animals, size=len(X), p=probabilities)
>>> atom = ATOMClassifier(X, y)
>>> atom.plot_distribution(columns=[0, 1])
>>> atom.plot_distribution(columns=0, distributions=["norm", "invgauss"])
>>> atom.plot_distribution(columns="animals", legend="lower right")