Correlation matrix
- plot_utils.correlation_matrix(X, color_map='RdBu_r', fig=None, ax=None, figsize=None, dpi=100, variable_names=None, rot=45, scatter_plots=False)
Plot correlation matrix of a dataset X, whose columns are different variables (or a sample of a certain random variable).
- Parameters:
X (numpy.ndarray or pandas.DataFrame) – The data set.
color_map (str or matplotlib.colors.Colormap) – The color scheme used to show high, low, and strongly negative correlations. Valid names are listed in https://matplotlib.org/users/colormaps.html. Using diverging color maps is recommended: PiYG, PRGn, BrBG, PuOr, RdGy, RdBu, RdYlBu, RdYlGn, Spectral, coolwarm, bwr, seismic.
fig (matplotlib.figure.Figure or None) – Figure object. If None, a new figure will be created.
ax (matplotlib.axes._subplots.AxesSubplot or None) – Axes object. If None, a new axes will be created.
figsize ((float, float)) – Figure size in inches, as a tuple of two numbers. The figure size of fig (if not None) will override this parameter.
dpi (float) – Figure resolution. The dpi of fig (if not None) will override this parameter.
variable_names (list<str>) – Names of the variables in X. If X is a pandas DataFrame, this argument is not needed: the column names of X are automatically used as variable names. If X is a numpy array and this argument is not provided, then X’s column indices are used. The length of variable_names should match the number of columns in X; if not, a warning will be issued (not an error).
rot (float) – The rotation of the x axis labels, in degrees.
scatter_plots (bool) – Whether or not to show the scatter plots of pairs of variables.
- Returns:
correlations (pandas.DataFrame) – The correlation matrix.
fig (matplotlib.figure.Figure) – The figure object being created or being passed into this function.
ax (matplotlib.axes._subplots.AxesSubplot) – The axes object being created or being passed into this function.
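The behavior described above can be approximated with pandas and matplotlib alone. The sketch below is not the plot_utils implementation: the helper name sketch_correlation_matrix is hypothetical, the scatter_plots, fig and ax options are omitted, and falling back to column indices after the length-mismatch warning is an assumption. It only illustrates how the variable names, the diverging color map and the three return values fit together.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def sketch_correlation_matrix(X, color_map="RdBu_r", figsize=None, dpi=100,
                              variable_names=None, rot=45):
    """Hypothetical stand-in, NOT the plot_utils implementation."""
    if isinstance(X, pd.DataFrame):
        data = X  # column names serve as variable names
    else:
        X = np.asarray(X)
        names = variable_names
        if names is not None and len(names) != X.shape[1]:
            # Assumption: warn (no error) and fall back to column indices.
            warnings.warn("variable_names length does not match X's columns")
            names = None
        data = pd.DataFrame(X, columns=names)  # names=None gives 0, 1, 2, ...

    correlations = data.corr()  # pairwise Pearson correlation of the columns

    fig, ax = plt.subplots(figsize=figsize, dpi=dpi)
    # Fix the color range to [-1, 1] so a diverging map is centered on zero.
    image = ax.imshow(correlations.to_numpy(), cmap=color_map, vmin=-1, vmax=1)
    fig.colorbar(image, ax=ax)

    labels = [str(c) for c in correlations.columns]
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels, rotation=rot)
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    return correlations, fig, ax


# Synthetic data: "b" is nearly a multiple of "a", "c" is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, 2 * a + rng.normal(scale=0.1, size=200),
                     rng.normal(size=200)])
correlations, fig, ax = sketch_correlation_matrix(X, variable_names=["a", "b", "c"])
```

With plot_utils installed, the corresponding call per the signature documented above would be `correlations, fig, ax = plot_utils.correlation_matrix(X, variable_names=['a', 'b', 'c'])`.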