plot_ngrams


method plot_ngrams(ngram="bigram", index=None, show=10, title=None, figsize=None, filename=None, display=True)

Plot n-gram frequencies. The text for the plot is extracted from the column named Corpus. If there is no column with that name, an exception is raised. If the documents are not tokenized, they are split into words on spaces.

Parameters:

ngram: str or int, optional (default="bigram")
Number of contiguous words to search for (size of n-gram). Choose from: words (1), bigrams (2), trigrams (3), quadgrams (4).

index: int, tuple, slice or None, optional (default=None)
Indices of the documents in the corpus to include in the search. If a tuple (n, m), it selects documents n until m. If None, it selects all rows in the dataset.

show: int, optional (default=10)
Number of n-grams (ordered by number of occurrences) to show in the plot.

title: str or None, optional (default=None)
Plot's title. If None, the title is left empty.

figsize: tuple, optional (default=None)
Figure's size, formatted as (x, y). If None, it adapts the size to the number of n-grams shown.

filename: str or None, optional (default=None)
Name of the file. Use "auto" for automatic naming. If None, the figure is not saved.

display: bool or None, optional (default=True)
Whether to render the plot. If None, it returns the matplotlib figure.

Returns:

fig: matplotlib.figure.Figure
Plot object. Only returned if display=None.
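
To make the counting concrete, here is a minimal sketch of how n-gram frequencies like the ones plotted by this method could be computed. It is not ATOM's implementation: the helper `count_ngrams`, the `NGRAM_SIZES` mapping and the sample corpus are hypothetical and only mirror the behavior described above (name or integer for the n-gram size, splitting untokenized documents on spaces, keeping the `show` most frequent n-grams).

```python
from collections import Counter

# Hypothetical name-to-size mapping mirroring the options listed above.
NGRAM_SIZES = {"words": 1, "bigrams": 2, "trigrams": 3, "quadgrams": 4}


def count_ngrams(corpus, ngram="bigrams", show=10):
    """Return the `show` most frequent n-grams in a list of documents."""
    n = NGRAM_SIZES[ngram] if isinstance(ngram, str) else int(ngram)
    counts = Counter()
    for doc in corpus:
        # Untokenized documents are split on spaces; tokenized ones are used as-is.
        tokens = doc.split(" ") if isinstance(doc, str) else list(doc)
        # Slide a window of size n over the tokens of this document only,
        # so n-grams never span two documents.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(show)


corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    ["the", "cat", "sat", "down"],  # an already tokenized document
]
top = count_ngrams(corpus, ngram="bigrams", show=3)
print(top)
```

In this toy corpus the bigram ("the", "cat") occurs three times and ("cat", "sat") twice, so they rank first and second. A bar chart of such counts is, in spirit, what the plot displays.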


Example

from atom import ATOMClassifier

# X_text must contain a column named Corpus; y_text holds the target labels
atom = ATOMClassifier(X_text, y_text)
atom.textclean()
atom.plot_ngrams("bigrams")