Multiple histograms

plot_utils.hist_multi(X, bins=10, fig=None, ax=None, figsize=None, dpi=100, nan_warning=False, showmeans=True, showmedians=False, vert=True, data_names=[], rot=45, name_ax_label=None, data_ax_label=None, sort_by=None, title=None, show_vals=True, show_pct_diff=False, baseline_data_index=0, legend_loc='best', show_counts_on_data_ax=True, **extra_kwargs)

Generate multiple histograms, one for each data set within X.

Parameters:
  • X (pandas.DataFrame, pandas.Series, numpy.ndarray, dict, or list of lists) –

    The data to be visualized. It can be of the following types:

    • pandas.DataFrame:
      • Each column contains a set of data

    • pandas.Series:
      • Contains only one set of data

    • numpy.ndarray:
      • 1D numpy array: only one set of data

      • 2D numpy array: each column contains a set of data

      • Higher dimensional numpy array: not allowed

    • dict:
      • Each key-value pair is one set of data

    • list of lists:
      • Each sub-list is a data set

    Note that NaN values in the data are silently excluded (see nan_warning to be warned when this happens).

  • bins (int or sequence or str) – If an integer is given, the whole range of the data (i.e., all the numbers within X) is divided into that many segments. If a sequence or a str is given, it is passed to the bins argument of matplotlib.pyplot.hist().

  • fig (matplotlib.figure.Figure or None) – Figure object. If None, a new figure will be created.

  • ax (matplotlib.axes._subplots.AxesSubplot or None) – Axes object. If None, a new axes will be created.

  • figsize ((float, float)) – Figure size in inches, as a tuple of two numbers. The figure size of fig (if not None) will override this parameter.

  • dpi (float) – Figure resolution. The dpi of fig (if not None) will override this parameter.

  • nan_warning (bool) – Whether to show a warning if there are NaN values in the data.

  • showmeans (bool) – Whether to show the mean values of each data group.

  • showmedians (bool) – Whether to show the median values of each data group.

  • vert (bool) – Whether to show the “base” of the histograms as vertical.

  • data_names (list<str>, [], or None) –

    The name of each data set, shown as that data set's axis tick label. If [] or None, the names are determined automatically. If X is a:

    • numpy.ndarray:
      • data_names = [‘data_0’, ‘data_1’, ‘data_2’, …]

    • pandas.Series:
      • data_names = X.name

    • pandas.DataFrame:
      • data_names = list(X.columns)

    • dict:
      • data_names = list(X.keys())

  • rot (float) – The rotation (in degrees) of the data_names when shown as the tick labels. If vert is False, rot has no effect.

  • name_ax_label (str) – The label of the “name axis”. (“Name axis” is the axis along which different histograms are presented.)

  • data_ax_label (str) – The label of the “data axis”. (“Data axis” is the axis along which the data values are presented.)

  • sort_by ({‘name’, ‘mean’, ‘median’, None}) – Option to sort the different data groups in X in the plot. None means no sorting, keeping the histograms in the order provided; ‘mean’ and ‘median’ sort the histograms by the mean or median value of each data group; ‘name’ sorts the histograms by the names of the groups.

  • title (str) – The title of the plot.

  • show_vals (bool) – Whether to show mean and/or median values along the mean/median bars. Only effective if showmeans and/or showmedians are turned on.

  • show_pct_diff (bool) – Whether to show percent difference of mean and/or median values between different data sets. Only effective when show_vals is set to True.

  • baseline_data_index (int) – Which data set is considered the “baseline” when showing percent differences.

  • legend_loc (str) – The location specification for the legend.

  • show_counts_on_data_ax (bool) – Whether to show counts beside the histograms.

  • **extra_kwargs (dict) – Other keyword arguments to be passed to matplotlib.pyplot.bar().
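
The descriptions of X and bins above can be illustrated with a small standard-library sketch. It is not the library's implementation: the helper names drop_nan and shared_bin_edges are hypothetical, and equal-width segments are an assumption about how an integer bins value splits the overall data range.

```python
import math


def drop_nan(values):
    """Return the values with NaN entries removed (hypothetical helper)."""
    return [v for v in values if not math.isnan(v)]


def shared_bin_edges(datasets, bins):
    """Split the overall range of all data sets into `bins` segments.

    Assumption: the segments are of equal width and the same edges are
    shared by every data set.
    """
    all_values = [v for values in datasets.values() for v in drop_nan(values)]
    lo, hi = min(all_values), max(all_values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]


# A dict input: each key-value pair is one data set.
X = {"a": [1.0, 2.0, float("nan"), 4.0], "b": [0.0, 5.0, 10.0]}
edges = shared_bin_edges(X, bins=5)
print(edges)  # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```

Because the edges are computed over all the numbers in X rather than per data set, the resulting histograms can be compared bin by bin.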

Returns:

  • fig (matplotlib.figure.Figure) – The figure object created by, or passed into, this function.

  • ax (matplotlib.axes._subplots.AxesSubplot) – The axes object created by, or passed into, this function.
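
To make the sort_by, show_pct_diff, and baseline_data_index parameters more concrete, here is a rough standard-library sketch of the ordering and percent-difference ideas they describe. The helpers sort_groups and pct_diff_of_means are hypothetical. The percent-difference formula, (value - baseline) / baseline * 100, and the reading of baseline_data_index as an index into the order the data sets were passed in are both assumptions, not confirmed behavior of hist_multi.

```python
from statistics import mean, median


def sort_groups(datasets, sort_by=None):
    """Return data set names ordered as the sort_by option describes."""
    if sort_by is None:
        return list(datasets)  # keep the order as provided
    if sort_by == "name":
        return sorted(datasets)
    stat = {"mean": mean, "median": median}[sort_by]
    return sorted(datasets, key=lambda name: stat(datasets[name]))


def pct_diff_of_means(datasets, names, baseline_data_index=0):
    """Percent difference of each mean relative to the baseline mean.

    Assumption: difference = (mean - baseline) / baseline * 100.
    """
    baseline = mean(datasets[names[baseline_data_index]])
    return {
        name: (mean(datasets[name]) - baseline) * 100 / baseline
        for name in names
    }


X = {"b": [10, 20, 30], "a": [40, 50, 60], "c": [1, 2, 3]}
order = sort_groups(X, sort_by="mean")
diffs = pct_diff_of_means(X, list(X), baseline_data_index=0)
print(order)  # ['c', 'b', 'a']
print(diffs)  # {'b': 0.0, 'a': 150.0, 'c': -90.0}
```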