1. Problem Statement
Correlation matrices are often large, complex and visually
off-putting. The objectives of the visualisation are to:
- Present a correlation matrix in a way which is straightforward to engage with
- Make it easy to locate the material assumptions
- Make it easy to identify possible inconsistencies between correlation assumptions
2. Suggested Approach
We suggest a hybrid of the following techniques – see example below:
- Bar charts to illustrate the materiality of individual risks, measured by undiversified capital requirements. Colour is used to collate risks into categories.
- Shading of alternate rows and columns to lead the eye to the row and column headings, and borders around correlations within each category that align to the bar charts.
- A table of values to show the correlation assumptions – this can be triangular because the matrix is symmetric, and the values of 1.0 on the diagonal are omitted. The typography is designed to emphasise visual differences between zero, positive and negative values.
- Ellipses to visualise the sign and magnitude of each correlation, in the space created by restricting the numerical assumptions to a triangle. These help with seeing patterns.
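The split between the triangle of values and the triangle of ellipses can be sketched in a few lines of Python. This is only an illustration of the layout rule, not the workbook's implementation; the function names `cell_kind`, `layout` and `count_value_cells` are hypothetical.

```python
def cell_kind(row, col):
    """Decide what a cell of the hybrid display shows (0-based indices)."""
    if row > col:
        return "value"    # lower-left triangle: numerical assumption
    if row < col:
        return "ellipse"  # upper-right triangle: ellipse for the same pair
    return "blank"        # leading diagonal: the 1.0 values are omitted


def layout(n):
    """Return an n x n grid of cell kinds for n risk factors."""
    return [[cell_kind(i, j) for j in range(n)] for i in range(n)]


def count_value_cells(n):
    """Count the distinct correlation assumptions shown as numbers."""
    return sum(kind == "value" for row in layout(n) for kind in row)
```

For n risk factors the number of value cells is n(n-1)/2, so a matrix of 25 risk factors shows 300 numbers and 300 matching ellipses.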
3. Rationale and Commentary
The illustrative matrix in section 2 parameterises an economic capital model for a hypothetical mid-size insurance group writing life and non-life business. (There is nothing specific to insurance groups here, and other kinds of financial institution could make use of the same ideas.) There are 25 risk factors, categorised into market, credit, life, non-life and operational risk, leading to 300 separate correlation assumptions once symmetry of the matrix is taken into account.
Other than the ellipses, this is a hybrid of techniques that are likely to be familiar, so we start by discussing the ellipses. The convention is that:
- Zero correlations are white circles (appropriately enough).
- Positive correlations are non-circular ellipses slanting diagonally upwards and are red. Larger correlations are darker coloured and more cigar-shaped, converging to a diagonal line for correlations of +1.0.
- Negative correlations follow the same pattern but are blue and slant diagonally downwards, becoming darker coloured and more cigar-shaped towards -1.0.
- Use of red and blue (rather than red and green) is intentional, for the benefit of those with red-green colour blindness.
- There are lots of zero correlations (which are significant in their own right).
- The non-zero correlations seem to have been set primarily within each category, with comparatively few non-zero correlations between risks in different categories.
- Non-life risks are moderately correlated with each other, but life risks are not.
- Risks within the operational risk category are fairly heavily correlated with each other (lots of bold), but operational risk isn’t financially significant.
- There is ‘visual continuity’ in the ellipses, with values that are numerically similar being visually similar.
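One way to obtain ellipses that follow the convention above is to draw a contour of a standardised bivariate normal distribution with correlation r, namely the curve x² - 2rxy + y² = 1 - r². The sketch below assumes that parameterisation, which may differ from the geometry used in the example. The function name `ellipse_spec` and the returned field names are hypothetical, and the `intensity` field is simply |r|, an assumed way of driving the darkness of the colour.

```python
import math


def ellipse_spec(r):
    """Return drawing parameters for a correlation r in [-1, 1].

    Assumes the contour x^2 - 2*r*x*y + y^2 = 1 - r^2, whose axes lie
    along the two diagonals and which always fits in the square [-1, 1]^2.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    along_up = math.sqrt(1.0 + r)    # semi-axis along the upward diagonal
    along_down = math.sqrt(1.0 - r)  # semi-axis along the downward diagonal
    if r > 0:
        hue, angle = "red", 45       # slants upwards
    elif r < 0:
        hue, angle = "blue", -45     # slants downwards
    else:
        hue, angle = "white", 0      # circle, so the angle is irrelevant
    return {
        "semi_major": max(along_up, along_down),
        "semi_minor": min(along_up, along_down),
        "angle_deg": angle,
        "hue": hue,
        "intensity": abs(r),         # assumed driver of colour darkness
    }
```

Under this parameterisation a zero correlation gives a circle of radius 1, and the minor axis shrinks to zero as r approaches +1.0 or -1.0, so the ellipse degenerates to a diagonal line. Because the axes vary continuously with r, numerically similar values produce visually similar shapes.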
The immediate criticism is that this is visually off-putting. Several other points can be made, as discussed in the following table:
| Criticism | Possible remedy |
| There is no indication of the financial significance of the risks. | Show pre-diversification capital requirements beside row/column headings. |
| All values are formatted to 2 decimal places, which makes them all look very similar to each other. | Show zeros without decimal places and all other values to 2 decimal places, to make the zeros and non-zero values stand out from each other. |
| It’s hard to see which values are negative, because minus signs don’t use much ink/many pixels, and so don’t stand out. | Use () rather than minus signs to present negative values. |
| The numerically large correlations (0.5/0.75 and negatives) don’t really stand out. | Use bold formatting to highlight any correlations with absolute values ≥ 0.5. |
| The symmetry of the matrix leads to repetition, in that the upper-right triangle has the same values as the lower-left triangle. | Remove the upper-right triangle of values and use the space created to represent the correlations as ellipses. |
| The values on the leading diagonal are always going to be 1.0, and don’t carry any information as such, but look important. | Remove them. |
| If we look at a correlation towards the middle of the matrix, it’s hard to look up the row and column headings. | Shade alternate rows and columns. |
| It isn’t particularly easy to identify which correlations are within a category, and which are between categories. | Use coloured borders to identify blocks of correlations within a category. |
Addressing each criticism leads to the approach in section 2.
4. Applicability and Alternatives
The motivating example is a matrix of assumptions, but the ellipse technique can also be applied to empirical data, and to the correlations between simulated results for risk factors, or the financial impact of risk factors (e.g. P&L or changes in own funds).
There is no requirement to show ellipses in the top-right triangle:
- Some users may prefer to see numbers (despite the repetition).
- Others may prefer to dispense with ellipses and show the correlations as a heat-map.
- Or it may be preferable to see an indication of the financial significance of each correlation, for example a (different) heat-map showing the ranked impact on overall economic capital of changing each assumption by 1%.
- It is also common to set key correlations using detailed analysis and use completion rules (also called imputation rules) to derive the remaining correlations. The top-right triangle could be used to visualise the distinction between analytical and imputed correlations, possibly with an overlay to indicate the degree of statistical confidence in analytical correlations. For example, the cells for imputed correlations could be left blank, with analytical correlations shown using circles coloured to indicate confidence.
- Present sub-matrices, for example by legal entity, geography or risk category.
- Filter out or coalesce financially insignificant assumptions.
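The ranked sensitivities behind the financial-significance heat-map mentioned above could be computed along the following lines. This sketch rests on two assumptions that are not stated in the text: that overall capital is aggregated with the variance-covariance formula (the square root of the sum of c_i × c_j × r_ij over all pairs), and that "changing each assumption by 1%" means an absolute increase of 0.01. The function names are hypothetical.

```python
import math


def aggregate_capital(capitals, corr):
    """Variance-covariance aggregation (an assumed aggregation method)."""
    n = len(capitals)
    total = sum(
        capitals[i] * capitals[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(total)


def ranked_sensitivities(capitals, corr, bump=0.01):
    """Rank each correlation by the capital impact of increasing it by `bump`."""
    base = aggregate_capital(capitals, corr)
    impacts = []
    for i in range(len(capitals)):
        for j in range(i):                    # lower triangle only
            bumped = [row[:] for row in corr]
            bumped[i][j] += bump              # keep the matrix symmetric
            bumped[j][i] += bump
            change = aggregate_capital(capitals, bumped) - base
            impacts.append(((i, j), change))
    # Largest absolute impact first
    impacts.sort(key=lambda item: abs(item[1]), reverse=True)
    return impacts
```

The sketch does not check that a bumped matrix remains positive semi-definite, which a production version would probably want to do.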
5. Implementation
The case study was produced using Excel and an example workbook is at TODO LINK.
The bars in each row are produced using Excel conditional formatting data bars, and the bars in each column use Excel sparklines. At the time of writing, Excel does not have a built-in feature to produce the ellipses, but it’s reasonably straightforward to produce them using VBA.
The typography was implemented using the custom numeric format "0.00;(0.00);0" (without the surrounding quotation marks).
The bold formatting and row/column shading use conditional formatting.
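For readers working outside Excel, the typography rules can be mimicked with a small function. The sketch below is in Python and reproduces the effect of the custom format "0.00;(0.00);0" together with the bold rule for absolute values of 0.5 or more. The function name `format_correlation` and the parameter `bold_threshold` are hypothetical.

```python
def format_correlation(value, bold_threshold=0.5):
    """Return (text, is_bold) for one correlation assumption."""
    if value > 0:
        text = f"{value:.2f}"      # positives: two decimal places
    elif value < 0:
        text = f"({-value:.2f})"   # negatives: brackets, not a minus sign
    else:
        text = "0"                 # zeros: no decimal places
    return text, abs(value) >= bold_threshold
```

For example, a correlation of -0.5 is rendered as "(0.50)" in bold, while a zero is rendered as a plain "0".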
The R packages ellipse and corrplot produce correlation matrix ellipses. Corrplot implements several other approaches to correlation matrix visualisation, and there is a walkthrough of using it here.