Violin plot

plot_utils.violin_plot(X, fig=None, ax=None, figsize=None, dpi=100, nan_warning=False, showmeans=True, showextrema=False, showmedians=False, vert=True, data_names=[], rot=45, name_ax_label=None, data_ax_label=None, sort_by=None, title=None, **violinplot_kwargs)

Generate violin plots for each data set within X.

Parameters:
  • X (pandas.DataFrame, pandas.Series, numpy.ndarray, dict, or list of lists) –

    The data to be visualized. It can be of the following types:

    • pandas.DataFrame:
      • Each column contains a set of data

    • pandas.Series:
      • Contains only one set of data

    • numpy.ndarray:
      • 1D numpy array: only one set of data

      • 2D numpy array: each column contains a set of data

      • Higher dimensional numpy array: not allowed

    • dict:
      • Each key-value pair is one set of data

    • list of lists:
      • Each sub-list is a data set

    Note that NaN values in the data are excluded from the plot. Set nan_warning to True to be warned when NaN values are present.

  • fig (matplotlib.figure.Figure or None) – Figure object. If None, a new figure will be created.

  • ax (matplotlib.axes._subplots.AxesSubplot or None) – Axes object. If None, a new axes object will be created.

  • figsize ((float, float)) – Figure size in inches, as a tuple of two numbers. The figure size of fig (if not None) will override this parameter.

  • dpi (float) – Figure resolution. The dpi of fig (if not None) will override this parameter.

  • nan_warning (bool) – Whether to show a warning if there are NaN values in the data.

  • showmeans (bool) – Whether to show the mean values of each data group.

  • showextrema (bool) – Whether to show the extrema of each data group.

  • showmedians (bool) – Whether to show the median values of each data group.

  • vert (bool) – Whether to show the violins as vertical.

  • data_names (list<str>, [], or None) –

    The name of each data set, shown as that data set's tick label on the name axis. If [] or None, the names are determined automatically from X. If X is a:

    • numpy.ndarray:
      • data_names = ['data_0', 'data_1', 'data_2', ...]

    • pandas.Series:
      • data_names = X.name

    • pandas.DataFrame:
      • data_names = list(X.columns)

    • dict:
      • data_names = list(X.keys())

  • rot (float) – The rotation (in degrees) of the data_names when shown as the tick labels. If vert is False, rot has no effect.

  • name_ax_label (str) – The label of the “name axis”. (“Name axis” is the axis along which different violins are presented.)

  • data_ax_label (str) – The label of the “data axis”. (“Data axis” is the axis along which the data values are presented.)

  • sort_by ({'name', 'mean', 'median', None}) – How to order the data groups of X in the violin plot. None means no sorting: the violins keep the order in which the groups are provided. 'mean' and 'median' sort the violins by the mean or median value of each data group. 'name' sorts the violins by group name.

  • title (str) – The title of the plot.

  • **violinplot_kwargs (dict) – Other keyword arguments to be passed to matplotlib.pyplot.violinplot().
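
The NaN handling and the sort_by options described above can be pictured with a small standard-library sketch. This is not the library's implementation: prepare_groups is a hypothetical helper, the dict input is just one of the accepted forms of X, and ascending order is an assumption, since the parameter description does not state a sort direction.

```python
import math
from statistics import mean, median


def prepare_groups(data, sort_by=None):
    """Hypothetical helper: drop NaNs from each group, then order the groups."""
    # Exclude NaN values before any statistic is computed.
    cleaned = {
        name: [v for v in values if not math.isnan(v)]
        for name, values in data.items()
    }
    if sort_by is None:
        names = list(cleaned)  # keep the order in which groups were provided
    elif sort_by == "name":
        names = sorted(cleaned)
    elif sort_by == "mean":
        # Ascending order is an assumption, not something the docs specify.
        names = sorted(cleaned, key=lambda n: mean(cleaned[n]))
    elif sort_by == "median":
        names = sorted(cleaned, key=lambda n: median(cleaned[n]))
    else:
        raise ValueError("sort_by must be 'name', 'mean', 'median', or None")
    return [(name, cleaned[name]) for name in names]


data = {
    "b": [1.0, 2.0, 30.0],
    "a": [5.0, float("nan"), 7.0],
    "c": [0.0, 1.0, 2.0],
}

for option in (None, "name", "mean", "median"):
    order = [name for name, _ in prepare_groups(data, sort_by=option)]
    print(option, order)
```

In this example the group "b" has a large mean but a small median, so 'mean' and 'median' give different orders.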

Returns:

  • fig (matplotlib.figure.Figure) – The figure object that was created by, or passed into, this function.

  • ax (matplotlib.axes._subplots.AxesSubplot) – The axes object that was created by, or passed into, this function.
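
The returned fig and ax are ordinary matplotlib objects, so they can be customized further after the call. As a rough illustration of the kind of figure involved, the sketch below does not call violin_plot at all. It uses matplotlib directly with the default flags from the signature above (showmeans=True, showextrema=False, showmedians=False, rot=45, dpi=100). That violin_plot does something similar internally is an assumption, and the data, names, and axis labels are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up example data: three data sets with different means.
rng = np.random.default_rng(0)
names = ["data_0", "data_1", "data_2"]
data = [rng.normal(loc=mu, size=200) for mu in (0.0, 1.0, 2.0)]

fig, ax = plt.subplots(figsize=(5, 3), dpi=100)

# Same flags as the documented defaults of violin_plot (assumed to be forwarded).
parts = ax.violinplot(data, showmeans=True, showextrema=False, showmedians=False)

# Label the "name axis" with one rotated tick label per violin.
ax.set_xticks(list(range(1, len(data) + 1)))
ax.set_xticklabels(names, rotation=45)
ax.set_xlabel("data set")   # plays the role of name_ax_label
ax.set_ylabel("value")      # plays the role of data_ax_label
ax.set_title("Violin plot")
fig.tight_layout()
```

Going by the signature and return values documented here, the corresponding call would presumably look like fig, ax = plot_utils.violin_plot(data, data_names=names), after which fig.savefig(...) or further ax methods can be used as usual.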