plot_distribution


method plot_distribution(columns=0, distributions=None, show=None, title=None, legend="upper right", figsize=None, filename=None, display=True)
Plot column distributions.

  • For numerical columns, plot the probability density distribution. Optionally, any of the scipy.stats distributions can be fitted to the column and plotted alongside it.
  • For categorical columns, plot the class distribution. Only one categorical column can be plotted at the same time.

Tip

Use atom's distribution method to check which distribution fits the column best.

Parameters

columns: int, str, slice or sequence, default=0
Columns to plot. It's only possible to plot one categorical column. If more than one categorical column is selected, all categorical columns are ignored.

distributions: str, sequence or None, default=None
Names of the scipy.stats distributions to fit to the columns. If None, a Gaussian kde distribution is shown. Only for numerical columns.
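
To make the default behaviour more concrete, the following is a minimal, standard-library sketch of what a Gaussian kernel density estimate computes. It is an illustration of the concept only, not ATOM's implementation; the helper name gaussian_kde_sketch, the sample values, and the choice of Scott's rule of thumb for the bandwidth are all assumptions made for this example.

```python
import math
import statistics


def gaussian_kde_sketch(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian KDE built from `samples`.

    Hypothetical helper for illustration; not part of ATOM or scipy.
    """
    n = len(samples)
    if bandwidth is None:
        # Assumption: Scott's rule of thumb for 1-D data, h = sigma * n^(-1/5)
        bandwidth = statistics.stdev(samples) * n ** (-1 / 5)

    # Normalization constant so that the estimate integrates to 1
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of one Gaussian bump centered on every sample
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up values standing in for a numerical column
values = [1.0, 1.2, 1.9, 2.1, 2.4, 3.0, 3.3, 4.8]
density = gaussian_kde_sketch(values)

# Riemann sum over a wide grid to check the estimate behaves like a density
step = 0.1
grid = [i * step for i in range(-50, 110)]
area = sum(density(x) for x in grid) * step
```

The curve is the average of one Gaussian bump per observation, so it is smooth and integrates to roughly one over a sufficiently wide range.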

show: int or None, default=None
Number of classes (ordered by number of occurrences) to show in the plot. If None, it shows all classes. Only for categorical columns.
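
The effect of show can be pictured with a small standard-library sketch: count the classes, order them by number of occurrences, and keep the first few. The helper top_classes and the sample data are hypothetical and only illustrate the selection described above, not how ATOM implements it.

```python
from collections import Counter


def top_classes(values, show=None):
    """Return (class, count) pairs ordered by number of occurrences.

    Hypothetical helper for illustration. With show=None, all classes
    are returned, mirroring the parameter's documented default.
    """
    return Counter(values).most_common(show)


# Made-up values standing in for a categorical column
animals = ["zebra", "lion", "zebra", "bird", "lion", "zebra", "dog"]

top_two = top_classes(animals, show=2)
all_classes = top_classes(animals)
```

Here top_two keeps only the two most frequent classes, while all_classes contains every class in the column.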

title: str, dict or None, default=None
Title for the plot.

legend: str, dict or None, default="upper right"
Legend for the plot. See the user guide for an extended description of the choices.

  • If None: No legend is shown.
  • If str: Location where to show the legend.
  • If dict: Legend configuration.

figsize: tuple or None, default=None
Figure's size in pixels, formatted as (x, y). If None, it adapts the size to the plot's type.

filename: str or None, default=None
Save the plot using this name. Use "auto" for automatic naming. The type of the file depends on the provided name (.html, .png, .pdf, etc.). If filename has no file type, the plot is saved as html. If None, the plot is not saved.

display: bool or None, default=True
Whether to render the plot. If None, it returns the figure.

Returns

go.Figure or None
Plot object. Only returned if display=None.


See Also

plot_correlation

Plot a correlation matrix.

plot_qq

Plot a quantile-quantile plot.

plot_relationships

Plot pairwise relationships in a dataset.


Example

>>> from atom import ATOMClassifier
>>> import numpy as np
>>> from sklearn.datasets import load_breast_cancer

>>> X, y = load_breast_cancer(return_X_y=True, as_frame=True)

>>> # Add a categorical feature
>>> animals = ["cat", "dog", "bird", "lion", "zebra"]
>>> probabilities = [0.001, 0.1, 0.2, 0.3, 0.399]
>>> X["animals"] = np.random.choice(animals, size=len(X), p=probabilities)

>>> atom = ATOMClassifier(X, y)
>>> atom.plot_distribution(columns=[0, 1])
>>> atom.plot_distribution(columns=0, distributions=["norm", "invgauss"])
>>> atom.plot_distribution(columns="animals", legend="lower right")