Construction of Minimum Spanning Trees From Financial Returns Using Rank Correlation
Abstract
The construction of minimum spanning trees (MSTs) from correlation ma-
trices is an often used method to study relationships in the financial markets.
However most of the work on this topic tends to use the Pearson correlation
coefficient, which relies on the assumption of normality and can be brittle
to the presence of outliers, neither of which is ideal for the study of finan-
cial returns. In this paper we study the inference of MSTs from daily US,
UK and German financial returns using Pearson and two rank correlation
methods, Spearman and Kendall’s τ . MSTs constructed using these rank
methods tend to be more stable and maintain more edges over the dataset
than those constructed using Pearson correlation. The edge agreement be-
tween the Pearson and rank MSTs varies significantly depending on the state
of the markets, but the rank MSTs generally show strong agreement at all
times. Deviation from univariate normality can be related to changes in the
correlation matrices but is more difficult to connect to changes in the MSTs.
Irrespective of the coefficient, the trees tend to have similar topologies. Portfolios
constructed from the MST correlation matrices have a smaller turnover than
those from the full covariance matrix for the larger markets, but not for the
smaller German market. Using a bootstrap method we find that the corre-
lation matrices constructed using the rank correlations are more robust, but
there is little difference between the robustness of the MSTs.
Keywords: networks, correlation, finance, minimum spanning trees
∗ Corresponding author. Email address: [email protected] (Tristan Millington)
• Sort edges of the distance graph in ascending order and place in a list
• For each edge (i, j) in the list, in order
– If i and j are not in the same component in the tree, add this edge
into the tree
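The procedure above is Kruskal's algorithm, and can be sketched as follows. This is a minimal illustration rather than the code used for the experiments: the correlation-to-distance mapping d = sqrt(2(1 − C)) is an assumption (it is the mapping commonly used in this literature), and the function names and the small matrix are hypothetical.

```python
import math

def correlation_to_distance(c):
    # Assumed mapping: d = sqrt(2 * (1 - c)), so highly correlated pairs are close.
    return math.sqrt(max(0.0, 2.0 * (1.0 - c)))

def kruskal_mst(corr):
    """Return the MST edges (i, j) of the distance graph built from corr."""
    p = len(corr)
    parent = list(range(p))  # union-find forest tracking tree components

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Sort all edges of the distance graph in ascending order of distance.
    edges = sorted(
        (correlation_to_distance(corr[i][j]), i, j)
        for i in range(p) for j in range(i + 1, p)
    )
    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # i and j are in different components
            parent[root_i] = root_j
            tree.append((i, j))
            if len(tree) == p - 1:
                break
    return tree

# Illustrative 4-asset correlation matrix (made-up values).
corr = [
    [1.0, 0.8, 0.3, 0.2],
    [0.8, 1.0, 0.4, 0.1],
    [0.3, 0.4, 1.0, 0.7],
    [0.2, 0.1, 0.7, 1.0],
]
mst_edges = kruskal_mst(corr)
```

For p assets the tree always contains p − 1 edges, so most of the p(p − 1)/2 correlations are discarded.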
Various ‘stylised facts’ are known about these trees, for instance
• Branches of the trees tend to contain companies in the same sector [1]
• The trees shrink and have different structures during times of market
stress compared to market calm [15] [16]
• The trees tend to have a ‘scale free’ structure, with nodes of high degree
(hubs) occurring more than would be expected from a random graph
[17] [18] [15]
While most of the focus has been on the US markets, MST based models
have been applied to markets from other countries (e.g. Japan [25], the UK
[26], Italy [27] and South Korea [28]), to cryptocurrencies [29] [30] and to
networks from neuroscience [31] [32].
While the interpretability of the Pearson correlation coefficient is a clear
advantage, it assumes normality, which most asset return distributions
do not follow [33], and it is sensitive to outliers. There are of course
correlation measures that do not suffer from these issues, namely those based on
rank. Rank correlation methods calculate the correlation between the ranks
of variables, which tends to remove the effects of outliers while still giving a
measure of the degree to which two variables increase or decrease together.
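To illustrate the effect of ranking, the sketch below computes the Spearman correlation as the Pearson correlation of the ranks and compares the two on a toy series with a single outlier. The data and function names are hypothetical, and tied values are given their average rank, which is an assumption about tie handling.

```python
import math

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(average_ranks(x), average_ranks(y))

# Two series that move together, except for one extreme observation in y.
x = [1, 2, 3, 4, 5, 6]
y_outlier = [1, 2, 3, 4, 5, -60]

pearson_value = pearson(x, y_outlier)    # negative: dominated by the outlier
spearman_value = spearman(x, y_outlier)  # positive: the outlier is only one rank
```

In this toy case the single outlier flips the sign of the Pearson coefficient, while the Spearman coefficient stays positive because the outlier can only move to the lowest rank.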
Most of the literature which studies the correlations between asset re-
turns tends to use the Pearson correlation coefficient and so in this paper
we compare networks inferred from stock returns using Pearson, Spearman
and Kendall’s τ correlation in order to see if the robustness of these rank
correlations can improve our understanding of the stock markets.
Previous work [34] has briefly mentioned that MSTs constructed using
Spearman correlation from volatility measures of stocks are more robust,
but they did not explicitly compare the two correlation coefficients. In a pa-
per more broadly looking at the effects of weighting observations, Pozzi et al.
[35] compare Pearson correlation and Kendall’s τ . They find that matrices
constructed using Kendall’s τ tend to contain more information than those
constructed using Pearson correlation, and are affected less when they weight
observations. A paper closer in theme to ours is by Musmeci
et al. [36], who take a multilayer network approach. Each layer is composed
of a Planar Maximally Filtered Graph, constructed using a different method.
Four methods are used to quantify relationships between assets: Pearson
correlation, Kendall’s τ, tail dependence and partial correlation. They find that
these layers tend to have significant differences, with between 30% and 70%
of the edges being unique to each layer. Pearson and Kendall’s τ tend to
be the methods that agree the most, with a correlation of around 0.7 on the
degree of nodes. Interestingly they find that the level of agreement drops
during times of crisis, showing that these different methods tend to pick up
different signals from the markets and indicating that being mindful of mul-
tiple methods of quantifying relationships is valuable when taking a network
approach to financial returns. The final example we found is by Shirokikh et
al. [37] who use a thresholding model with Spearman correlation, but they
do not compare how this model differs from a Pearson based one.
To the best of our knowledge however, there has been no work which
compares Pearson and Spearman correlation, or compares different rank cor-
relations. Furthermore we are unaware of any work that compares Pearson
and rank correlation MSTs. Therefore we undertake a detailed study on
these fronts.
The Pearson correlation between two variables r_i and r_j is defined as
follows
$$C_{ij} = \frac{\sum_{t=1}^{n} (r_i(t) - \bar{r}_i)(r_j(t) - \bar{r}_j)}{\sqrt{\sum_{t=1}^{n} (r_i(t) - \bar{r}_i)^2 \sum_{t=1}^{n} (r_j(t) - \bar{r}_j)^2}} \qquad (2)$$
the τ-b formulation, defined as
$$\tau = \frac{n_c - n_d}{\sqrt{(n_0 - n_1)(n_0 - n_2)}} \qquad (3)$$
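Equation (3) can be evaluated directly by counting pairs. The sketch below assumes the usual τ-b definitions, which are not spelled out in the text here: n_c and n_d are the numbers of concordant and discordant pairs, n_0 = n(n − 1)/2, and n_1 and n_2 count the tied pairs in each variable. It uses a simple O(n²) loop and a hypothetical function name.

```python
import math
from collections import Counter
from itertools import combinations

def kendall_tau_b(x, y):
    n = len(x)
    concordant = discordant = 0
    for (xa, ya), (xb, yb) in combinations(zip(x, y), 2):
        sign = (xa - xb) * (ya - yb)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; such pairs count for neither.
    n0 = n * (n - 1) // 2
    n1 = sum(t * (t - 1) // 2 for t in Counter(x).values())  # tied pairs in x
    n2 = sum(u * (u - 1) // 2 for u in Counter(y).values())  # tied pairs in y
    return (concordant - discordant) / math.sqrt((n0 - n1) * (n0 - n2))

tau_no_ties = kendall_tau_b([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, -60])
tau_with_ties = kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3])
```

In the first example 10 of the 15 pairs are concordant and 5 are discordant, giving τ = 1/3. In the second, each variable has one tied pair, so the denominator shrinks from 6 to 5.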
3. Results and Analysis
3.1. Correlation Matrix Analysis
Firstly we analyze the full correlation matrices with no filtration. A
starting point is to look at the correlation coefficient for the same set of values.
Figure 1 shows a set of scatter plots comparing the correlations. From this
we can see there is a degree of agreement between all, and generally larger
correlations are more likely to be similar. However there is a ‘fat’ middle when
comparing the rank correlations to the Pearson correlation, where there can
be significant disagreement. Spearman and τ seem to be very similar, with
there being a strong relationship between the two, although the τ correlations
are slightly smaller than the Spearman ones.
The largest eigenvalue of the correlation matrix is a measure of the inten-
sity of the correlation present, and in matrices inferred from financial returns
tends to be significantly larger than the second largest [12] [13]. Generally
this largest eigenvalue is larger during times of stress and smaller during
times of calm [44] [12]. Firstly we study how this varies over time for each
correlation measure. This is shown in Figure 2. For all of the networks there
is a similar shape, with it peaking during times of market stress and dropping
during times of calm. The Spearman and Pearson correlations have relatively
similar values, although the Spearman has a smaller range. The τ correla-
tion is much smaller than the other two at all times, and also has a smaller
range. Times of stress and volatility tend to bring more outliers in returns
data, which could be the cause of the difference in largest eigenvalue. Figure
1 has shown us that the τ correlation coefficients tend to be smaller than the
others, which could explain why the largest eigenvalue is consistently smaller
too.
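As an illustration of why the largest eigenvalue tracks the overall level of correlation, the sketch below estimates it with power iteration. This is an assumption about method (any symmetric eigensolver would do), the function name is hypothetical, and the starting vector is assumed not to be orthogonal to the leading eigenvector, which holds for matrices with mostly positive entries.

```python
import math

def largest_eigenvalue(matrix, iterations=1000, tol=1e-12):
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p  # uniform starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
        # Rayleigh quotient v^T M v for the normalised vector v.
        estimate = sum(
            v[i] * sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)
        )
        converged = abs(estimate - eigenvalue) < tol
        eigenvalue = estimate
        if converged:
            break
    return eigenvalue

def equicorrelation(p, rho):
    # Toy matrix in which every pair of assets has the same correlation rho.
    return [[1.0 if i == j else rho for j in range(p)] for i in range(p)]

calm = largest_eigenvalue(equicorrelation(4, 0.2))    # 1 + 3 * 0.2
stress = largest_eigenvalue(equicorrelation(4, 0.5))  # 1 + 3 * 0.5
```

For an equicorrelation matrix the largest eigenvalue is 1 + (p − 1)ρ, so uniformly smaller coefficients give a smaller largest eigenvalue.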
(a) Pearson vs Spearman - US (b) Pearson vs τ - US (c) Spearman vs τ - US
Figure 1: Relationship of the correlation coefficients for each country across the entirety
of each dataset. There is a large degree of agreement and larger correlations are more
likely to be similar, but the ‘fat’ middle is notable when comparing the Pearson and rank
correlations. The rank correlations themselves are relatively similar, with the τ correlations
being smaller than the Spearman correlations.
(a) US (b) UK (c) DE
Figure 2: Largest eigenvalue (λmax ) in the networks over time. From this we can see the
Spearman correlation has a slightly smaller largest eigenvalue than the Pearson correlation,
while the τ correlation is much smaller. The rank methods also have a smaller range than
the Pearson correlation. The volatility of the markets at times of stress is likely to lead
to more outliers, so the robustness of the rank correlations to these could be causing the
reduced variance of the largest eigenvalue.
all countries. Other authors have noted that by some measures the markets
could be considered more stable during these times [44] [45], but we have not
found that this has been mentioned in the context of MSTs. For Germany
the results are harder to interpret, as the German MSTs are much smaller,
meaning small changes are much more significant as an overall fraction and
so we report the mean and standard deviation (s.d.) of these MST differences.
The mean differences are 0.138 (s.d. 0.089) for the Pearson MST,
0.131 (s.d. 0.080) for the Spearman and 0.128 (s.d. 0.077) for the τ MSTs.
This indicates that the rank MSTs do change less than the Pearson ones, but
the difference is small.
This therefore shows that the Spearman and τ MSTs tend to be more sta-
ble than Pearson MSTs for all of the countries. This is particularly prominent
at the start of the financial crisis for the US and the UK, with the Pearson
MSTs showing a large spike in difference, while the Spearman and τ MSTs
show little or no change in difference. In this particular situation we would
expect the heavy tails to affect the Pearson correlation between two assets
more than the rank methods, and this should change the edges selected by
the MST construction procedure.
Next we measure how the MSTs have changed from the first inferred
tree using the multi step survival ratio [15] - the fraction of edges that have
been consistently maintained for each tree from the first to the current. This
measures the life of an edge and shows us how the tree evolves. This is
(a) US (b) UK (c) DE
Figure 3: Edge difference between adjacent MSTs. From this we can see the MSTs inferred
using rank correlation are far more stable over time than those inferred using
Pearson correlation. While all of the trees seem to become more similar during the financial
crisis (2009 - 2011), the Pearson MSTs show a big reconfiguration as the crisis starts, while
the τ MSTs show no spike before dropping.
defined as
$$\frac{1}{p-1} \left| E(t) \cap E(t-1) \cap \dots \cap E(t-k+1) \cap E(t-k) \right| \qquad (4)$$
where E(t) is the edge set at that moment in time, k goes from 1 to t − 1
and |S| is the cardinality of the set S. A plot of this is shown in Figure
4. From this we can see that quite rapidly the trees differ from the original
for all countries, with 70% of the edges changing within 2 years. For our
experiments, what is particularly interesting is that for the US and the UK
the rank MSTs maintain slightly more edges than the Pearson MSTs, but the
difference between the τ and Spearman MSTs is very small. For Germany,
the Spearman MSTs actually maintain the most edges for the longest time
period, followed by the Pearson MSTs and then the τ MSTs.
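The multi step survival ratio of Equation (4) amounts to a running intersection of edge sets. The sketch below is a minimal illustration with hypothetical names and toy trees. Edges are treated as undirected, and the denominator is taken from the size of the first tree (p − 1 for a spanning tree on p nodes).

```python
def undirected(edges):
    """Store each edge as a frozenset so (i, j) and (j, i) compare equal."""
    return {frozenset(edge) for edge in edges}

def multi_step_survival(edge_sets):
    """Fraction of the first tree's edges present in every tree up to each step."""
    surviving = undirected(edge_sets[0])
    n_edges = len(surviving)
    ratios = []
    for edges in edge_sets:
        surviving = surviving & undirected(edges)
        ratios.append(len(surviving) / n_edges)
    return ratios

# Three consecutive toy trees on four nodes.
trees = [
    [(0, 1), (1, 2), (2, 3)],
    [(1, 0), (1, 2), (1, 3)],
    [(0, 1), (0, 2), (2, 3)],
]
survival = multi_step_survival(trees)
```

Once an edge drops out it never counts again, even if it reappears later (as edge (2, 3) does in the third toy tree), so the ratio can only decrease.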
Over the entire dataset for the US, the Pearson MSTs maintain 4 edges,
the Spearman MSTs 7 edges and the τ MSTs 8 edges. For the UK the
Pearson MSTs maintain 3 edges, the Spearman MSTs 5 edges and the τ
MSTs 4 edges. For Germany all three maintain 2 edges. All of these edges
are intrasector aside from one in the German τ MSTs, which is BASF to
Bayer (Materials to Healthcare).
There is of course the question of how the difference between the MSTs
changes over time. We measure the fraction of edges that differ between
the three MSTs and plot it in Figure 5. From this we can see there is a
significant difference in the presence of edges between the rank MSTs and
the Pearson MSTs for all countries. The difference does seem to increase
during the banking and financial crises, with notable peaks occurring during
(a) US (b) UK (c) DE
Figure 4: Multi-step survival ratio for the MSTs. This gives us a measure of how long the
edges persist for. Most of the edges disappear very rapidly, with around 70% changing
within 2 years. The rank MSTs seem to be slightly more stable than the Pearson MSTs,
maintaining more edges from the initial tree.
Figure 5: Fraction of the edges that differ between the various MSTs. The US has the
largest difference, followed by the UK and then Germany, which indicates that the size of the MST
influences the edge difference. In general for the US and the UK around 40% of the edges
differ, although this does increase during the financial crisis, particularly in 2009. Since
the German tree is so much smaller there is a larger range in these values, but on average
around 30% of the edges differ between the Pearson and rank MSTs.
2008 and 2009. There seems to be relatively little difference between the two
rank MSTs, with less than 10% of the edges being different for every country
for most of the time period. This difference between the rank methods does
not seem to be particularly affected by market conditions.
To measure this we look at how the centrality of a sector varies by calcu-
lating the mean of the centrality of all the nodes in said sector. This reduces
the effect of the different numbers of companies in each sector. Then to
make comparisons between the differently sized MSTs easier, we normalize
the sector centrality so the sum of all sector centralities in an MST is 1.
Expressed mathematically, we calculate the centrality of sector s from the set
of all sectors S as follows
$$\frac{1}{\sum_{j \in S} C_j} \, \frac{1}{|s|} \sum_{i \in s} c_i \qquad (5)$$
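Equation (5) can be sketched for degree centrality as follows. We assume here that c_i is the centrality of node i and that C_j is the mean centrality of sector j before normalisation. The raw degree is used for c_i, since any constant scaling cancels in the normalisation. The tickers, sector labels and function name are hypothetical.

```python
from collections import defaultdict

def sector_degree_centrality(tree_edges, sector_of):
    """Mean node degree per sector, normalised so that the sectors sum to 1."""
    degree = defaultdict(int)
    for i, j in tree_edges:
        degree[i] += 1
        degree[j] += 1
    members = defaultdict(list)
    for node, sector in sector_of.items():
        members[sector].append(node)
    # Averaging within each sector reduces the effect of differing sector sizes.
    mean_centrality = {
        sector: sum(degree[node] for node in nodes) / len(nodes)
        for sector, nodes in members.items()
    }
    total = sum(mean_centrality.values())
    return {sector: value / total for sector, value in mean_centrality.items()}

# Toy star-shaped tree with one hub.
edges = [("A", "B"), ("A", "C"), ("A", "D")]
sectors = {"A": "Technology", "B": "Technology", "C": "Financials", "D": "Financials"}
centrality = sector_degree_centrality(edges, sectors)
```

Here the Technology sector contains the hub, so its mean degree is 2 against 1 for Financials, giving normalised centralities of 2/3 and 1/3.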
2017 is more intense in the Pearson MSTs than the rank ones, but the rest
of the centrality seems similar.
In general it seems the degree centrality of most sectors is relatively sim-
ilar in all three MSTs, with only small differences occurring. Furthermore
the sector centralities of the rank MSTs are virtually identical.
(a) US - Pearson (b) US - Spearman (c) US - τ
sector also becomes very central in all three from 2016, but has the largest
centrality in the Pearson MSTs. The Utilities sector is not central for most of
the dataset in any of the MSTs, although it has a peak during 2016 in the Pearson
MSTs that is absent from the rank MSTs. The Communications and Consumer
Staples sectors are relatively similarly expressed across all MSTs, and are
not very central. The centrality of the Materials sector varies similarly in all
MSTs, but has a higher peak value in the Pearson MSTs; the Consumer
Discretionary sector behaves similarly, but has its peak value in the Spearman MSTs.
For Germany the Financials, Industrials and Materials sectors are re-
garded as important throughout most of the dataset for every MST. How-
ever the Materials and Industrials sectors are more central in the Pearson and
Spearman MSTs than the τ . At the end of the time period, the Technology
sector has a higher peak in the Pearson MSTs, followed by the Spearman
MSTs, but is not particularly important in the τ MSTs. The Consumer
Staples, Consumer Discretionary and Healthcare sectors are not particularly
central in any of the MSTs, and are relatively similarly expressed.
Compared to the sector degree centrality, the sector betweenness central-
ity shows a much greater range, and greater disagreement between the MSTs,
notably between the rank MSTs. However in general the sectors that are re-
garded as important in the degree MSTs are also regarded as important in
the betweenness MSTs. This could imply that companies tend to be placed
in different positions in the different MSTs, even if they have a similar degree
centrality.
where $M^x = \sum_{i=1}^{p} \sum_{j=1}^{p} C^x_{ij}$ (i.e. the sum of all the correlations in the
network), $C_i$ is the ith column of the correlation matrix and $C^x$ and $C^y$ are
correlation matrices created from different correlation coefficients. Normal-
ising the correlations ensures that times where the correlations are higher
do not distort our measurements. This is done for both the full correlation
matrices (Figure 8) and the MST filtered correlation matrices (Figure 9).
To clarify, an MST filtered correlation matrix is the correlation matrix con-
structed from the MST, where edges in the MST are given the weight of their
correlation from the original full correlation matrix and all other correlations
are set to zero.
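The construction of an MST filtered correlation matrix can be sketched as below. The function name is hypothetical, and the treatment of the diagonal is an assumption: the description above only covers the pairwise correlations, so the sketch exposes a `keep_diagonal` flag rather than fixing one choice.

```python
def mst_filtered_matrix(corr, tree_edges, keep_diagonal=False):
    """Keep correlations on MST edges and set all other pairwise entries to zero."""
    p = len(corr)
    filtered = [[0.0] * p for _ in range(p)]
    for i, j in tree_edges:
        filtered[i][j] = corr[i][j]
        filtered[j][i] = corr[j][i]
    if keep_diagonal:  # assumption: optionally retain the unit self-correlations
        for i in range(p):
            filtered[i][i] = corr[i][i]
    return filtered

corr = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
filtered = mst_filtered_matrix(corr, [(0, 1), (1, 2)])
```

The weaker 0.3 correlation between the first and third assets is not an MST edge, so it is zeroed while the 0.8 and 0.4 entries are retained.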
For all countries there is a positive relationship between the KS distance
and the distance between nodes in the full correlation matrices when we
compare the Pearson and rank networks, indicating that a departure from
Gaussianity does increase the difference between the different correlation
coefficients. There appears to be no relationship when comparing the rank
correlation networks, which is to be expected.
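For reference, a Kolmogorov-Smirnov distance from a univariate Gaussian can be sketched as the largest gap between the empirical distribution function of an asset's returns and a fitted normal distribution function. Fitting the mean and standard deviation from the same sample is an assumption made for this sketch, as are the names and the synthetic data.

```python
from statistics import NormalDist, mean, pstdev

def ks_distance_from_normal(returns):
    fitted = NormalDist(mean(returns), pstdev(returns))
    ordered = sorted(returns)
    n = len(ordered)
    distance = 0.0
    for k, value in enumerate(ordered):
        cdf = fitted.cdf(value)
        # The empirical distribution function jumps from k/n to (k+1)/n at value.
        distance = max(distance, abs(cdf - k / n), abs((k + 1) / n - cdf))
    return distance

# Synthetic data: evenly spaced normal quantiles, then one extreme observation.
n = 200
gaussian_like = [NormalDist().inv_cdf((k + 0.5) / n) for k in range(n)]
with_outlier = gaussian_like[:-1] + [50.0]

ks_gaussian = ks_distance_from_normal(gaussian_like)
ks_outlier = ks_distance_from_normal(with_outlier)
```

The single extreme value inflates the fitted standard deviation, so the fitted Gaussian no longer matches the bulk of the sample and the distance grows.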
However if we look at the MST figures, the results seem a bit different.
For all countries it seems there is relatively little relationship between node
difference and KS distance. However since the procedure to create the MST
discards the majority of correlations, this could imply that the trees tend to
keep correlations that are unaffected by this deviation. Furthermore, since
the MSTs only keep large correlations, it could be that the deviations affect
smaller correlations more.
To quantify these we show the Spearman correlation between node dif-
ference and KS distance in Table 1. For all of the full correlation matrices
there is some positive correlation between deviation from Gaussianity and
the difference between the rank and Pearson correlations. The US and UK
also have some very mild negative correlation when comparing the τ and Spearman
full correlation matrices. None of the countries have any correlation between the
node difference and KS distance with the MSTs. From the scatter plots this is mostly
to be expected. The results if we use unweighted MSTs are very similar.
Our second experiment on this front is to use quantile normalisation to
make the distributions of the asset returns normal. We then look at how this
changes the differences between the trees. We use 200 quantiles for this, and
plot the differences between the MSTs in Figure 10. If we compare this to
Figure 3 we can see the differences between the Pearson and rank methods
have been reduced, but that they are still larger than the differences between
the rank MSTs.
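The idea behind quantile normalisation can be sketched with a rank-based transform that maps each return to the matching quantile of a standard normal. This is a simplified stand-in and not the procedure used above: it assigns one quantile per observation rather than using 200 quantiles, and it breaks ties by position. The function name and data are hypothetical.

```python
from statistics import NormalDist

def gaussianise(returns):
    """Replace each value by the standard normal quantile of its rank."""
    n = len(returns)
    order = sorted(range(n), key=lambda k: returns[k])
    standard_normal = NormalDist()
    transformed = [0.0] * n
    for rank, k in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        transformed[k] = standard_normal.inv_cdf((rank + 0.5) / n)
    return transformed

returns = [0.01, -0.5, 0.02, 0.0, 3.0]
normalised = gaussianise(returns)
```

The ordering of the observations is preserved, so rank correlations are unchanged by this transform, while the extreme value 3.0 is pulled in to the 0.9 quantile (about 1.28). Only the Pearson coefficient is affected.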
This therefore implies that, in contrast to the previous results, the depar-
ture from univariate Gaussianity does cause differences between the MSTs
as well as the full correlation matrices. However it does not explain all of the
differences between the MSTs. This would imply that overall there are non-
(a) US Pearson - Spearman (b) US Pearson - τ (c) US Spearman - τ
Figure 8: Scatter plots of the difference for nodes in the full correlation matrix against
the KS distance for all 3 countries. There appears to be a positive relationship between
node difference and KS distance when comparing the Pearson and rank correlation matrices,
indicating that a deviation from univariate Gaussianity does cause differences.
(a) US Pearson - Spearman (b) US Pearson - τ (c) US Spearman - τ
Figure 9: Scatter plots of the difference for nodes in the MSTs against the KS distance
for all 3 countries. There seems to be significantly less of a relationship between the node
difference and the KS distance in the MSTs compared to the full correlation matrices.
This could be due to the MST procedure discarding the changed relationships.
Network    Pearson - Spearman    Pearson - τ    Spearman - τ
US Full          0.221              0.244          -0.114
UK Full          0.409              0.415          -0.104
DE Full          0.331              0.352          -0.075
US MST          -0.005             -0.005          -0.016
UK MST           0.000             -0.008          -0.006
DE MST          -0.013              0.008           0.021
Table 1: Spearman correlation between node difference and the Kolmogorov-Smirnov dis-
tance for the node from a univariate Gaussian. In general a departure from univariate
Gaussianity tends to cause differences in the full correlation matrices, but not in the
MSTs, potentially due to the filtration procedure.
Figure 10: Edge difference between the trees over time when quantile normalisation is used
to make the asset returns data normal. Comparing this to Figure 3 there is a reduction in
this difference between the Pearson and rank MSTs, but it is still much higher than the
difference between the rank MSTs. This would imply that it is not just a deviation from
normality which causes differences between the MSTs.
linear relationships present in the dataset that also drive differences between
the MSTs, as well as non-normalities.
(a) US (b) UK (c) DE
From these we can see that irrespective of the coefficient, the MSTs have
similar structure. All of the trees have a heavy tailed degree distribution,
with a high number of nodes with only one edge and a
small number of nodes with a large degree. The structure of the trees does
tend to be dependent on market state for all countries. The average shortest
path length is slightly longer for the τ MSTs than for the Pearson or Spearman MSTs,
but the τ correlation tends to be slightly smaller for the same pair of assets (see
Figure 1), which would explain the longer paths, as the corresponding edges will
have a greater distance.
Σ∗ = αΣ + (1 − α)tr(Σ)I (8)
where α = 0.9. This is also applied to the full covariance matrix to assist
comparisons. With these resulting covariance matrices, we create minimum
risk portfolios by solving the following optimization problem
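$$\begin{aligned} \underset{w}{\text{minimize}} \quad & w^T \Sigma^* w \\ \text{subject to} \quad & \mathbf{1}^T w = 1 \\ & w_i > 0 \end{aligned} \qquad (9)$$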
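Equations (8) and (9) can be sketched as follows. The shrinkage step follows Equation (8) as written. The solver is an assumption: we use projected gradient descent onto the simplex purely for illustration, which relaxes the strict constraint w_i > 0 to w_i ≥ 0, and any quadratic programming solver could be used instead. All function names are hypothetical.

```python
def shrink(cov, alpha=0.9):
    """Equation (8): alpha * cov + (1 - alpha) * trace(cov) * I."""
    p = len(cov)
    trace = sum(cov[i][i] for i in range(p))
    return [
        [alpha * cov[i][j] + ((1 - alpha) * trace if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]

def project_to_simplex(v):
    """Euclidean projection onto {w : sum(w) = 1, w >= 0}."""
    cumulative, theta = 0.0, 0.0
    for k, value in enumerate(sorted(v, reverse=True), start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k
        if value - candidate > 0:
            theta = candidate
    return [max(a - theta, 0.0) for a in v]

def min_variance_weights(cov, steps=5000):
    p = len(cov)
    # The gradient 2 * cov * w has Lipschitz constant at most
    # 2 * (largest absolute row sum), so this step size is safe.
    step = 1.0 / (2.0 * max(sum(abs(c) for c in row) for row in cov))
    w = [1.0 / p] * p
    for _ in range(steps):
        grad = [2.0 * sum(cov[i][j] * w[j] for j in range(p)) for i in range(p)]
        w = project_to_simplex([w[i] - step * grad[i] for i in range(p)])
    return w

# Toy diagonal covariance: the optimum is proportional to 1 / variance.
cov = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 4.0]]
weights = min_variance_weights(cov)
shrunk_weights = min_variance_weights(shrink(cov))
```

For the unshrunk diagonal example the weights converge to (4/7, 2/7, 1/7). Shrinkage adds the same amount to every variance, which pulls the weights towards equal allocation.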
(a) US (b) UK (c) DE
Figure 15: Out of sample Sharpe ratio of the portfolios. The full covariance matrix
generally has a higher Sharpe ratio than the MST covariance matrices, but the difference
is not particularly large. For the larger markets this also comes at the cost of a larger
portfolio turnover.
Figure 15. The results for all four portfolios are relatively similar and highly
affected by market conditions, but in general the full correlation matrices
have a higher Sharpe ratio than the MST filtered ones. Next we look at the
turnover of the portfolios. Since we have found that the rank MSTs tend to
be more stable than the Pearson MSTs, we look at how this translates into
improving portfolio stability. We measure this by using the L1 norm of the
difference between two portfolios adjacent in time
$$\sum_{i=1}^{p} |w_{t,i} - w_{t-1,i}| \qquad (10)$$
This is shown in Figure 16. There is a reduction in mean turnover for the
MST portfolios for the US and the UK, but not for Germany. This could
be due to the smaller size of the German market, causing n to be much
larger than p, so that the full covariance matrix is estimated
much more accurately. For all countries the τ MSTs have the lowest turnover,
followed by the Spearman and then the Pearson MSTs. This shows that
rank MST covariance matrices may be useful in reducing portfolio turnover
when the investor is considering a large number of assets.
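The turnover measure of Equation (10) is a one-line computation, sketched here with hypothetical names and weights.

```python
def turnover(previous_weights, current_weights):
    """L1 norm of the change in portfolio weights between adjacent periods."""
    return sum(abs(c - p) for p, c in zip(previous_weights, current_weights))

def mean_turnover(weight_history):
    """Average turnover over a sequence of portfolios ordered in time."""
    changes = [turnover(a, b) for a, b in zip(weight_history, weight_history[1:])]
    return sum(changes) / len(changes)

history = [
    [0.5, 0.3, 0.2],
    [0.4, 0.4, 0.2],
    [0.4, 0.4, 0.2],
]
single_step = turnover(history[0], history[1])
average = mean_turnover(history)
```

Moving 10% of the portfolio from the first asset to the second gives a turnover of 0.2, since both the sale and the purchase are counted.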
(a) US (b) UK (c) DE
Figure 16: Turnover of the portfolios constructed using the MST correlation matrices.
The MST portfolios tend to have a lower turnover than the full covariance portfolios for
the US and UK markets, but not for the German market. Out of the three MSTs, the τ
portfolios have the lowest turnover.
time, and if the end of the dataset is reached we wrap round and start back
at the beginning. This tends to be more appropriate for time series data
compared to the classic bootstrap due to look ahead effects and potential
autocorrelation. With these pseudo-datasets we calculate the correlations
between assets and construct MSTs from these correlation matrices. Once
we have this set of MSTs, we can compare the edges present in them. Ideally,
if there were no noise, the data were purely stationary and the methods were robust, all
these MSTs would be the same. Of course this is not the case in real life.
To run the bootstrap we take the first 1008 days of data and create 1000
bootstrapped datasets of 504 days.
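A block bootstrap with wrap-around can be sketched as below. This is an illustration under stated assumptions rather than the exact scheme used: blocks have a fixed, hypothetical length `block_length` (the block length is not given in this part of the text), block starts are drawn uniformly, and each element of `data` is assumed to be one day of returns for all assets so that cross-sectional dependence is preserved.

```python
import random

def circular_block_bootstrap(data, sample_length, block_length, rng):
    """Resample consecutive blocks of days, wrapping round at the end of the data."""
    n = len(data)
    sample = []
    while len(sample) < sample_length:
        start = rng.randrange(n)
        for offset in range(block_length):
            sample.append(data[(start + offset) % n])  # wrap to the beginning
            if len(sample) == sample_length:
                break
    return sample

# Toy example: ten "days" labelled 0..9, resampled into a series of 25 days.
rng = random.Random(0)
days = list(range(10))
pseudo_dataset = circular_block_bootstrap(days, 25, 4, rng)
```

With the sizes quoted above, `data` would hold the first 1008 days and `sample_length` would be 504, with the call repeated 1000 times.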
Using these bootstrapped datasets we measure the mean and standard
deviation of the difference between the full correlation matrices, the MST fil-
tered correlation matrices (i.e. weighted MSTs) and the fraction of difference
in edge presence across the MSTs (i.e. unweighted MSTs). To measure the
difference between the full and MST filtered correlation matrices we take a
similar approach to Section 3.4: we normalise the entries of the correlation
matrices to sum to 1 and take the sum of the absolute difference for each
entry
$$\sum_{i=1}^{p} \sum_{j=1}^{p} \left| \frac{C^x_{ij}}{M^x} - \frac{C^y_{ij}}{M^y} \right| \qquad (11)$$
where $M^x = \sum_{i=1}^{p} \sum_{j=1}^{p} C^x_{ij}$. The results are shown in Table 2. If we look
at the full correlation matrices, the US and Germany both see a reduction
in the mean difference when using rank correlation as opposed to Pearson
correlation. For the UK there is a reduction in mean difference for the
Spearman MSTs, but not for the τ MSTs when compared to the Pearson MSTs.
However if we look at the mean difference between the MSTs, the results
differ. If we look at the weighted edges there is a slight reduction in
difference for the US and Germany, but particularly for the US this is not large.
However for the unweighted MSTs there seems to be little to no difference
between the MSTs for the US and Germany, and in fact an increase for the
UK. From this we would conclude that the MST construction procedure has
a larger effect on the robustness of the trees than the correlation coefficient
chosen.

Method      MST Weighted      MST Unweighted    Full
            Mean    S.D.      Mean    S.D.      Mean    S.D.
US
Pearson     0.835   0.218     0.722   0.056     0.234   0.094
Spearman    0.830   0.209     0.721   0.053     0.175   0.074
τ           0.824   0.210     0.720   0.053     0.174   0.071
UK
Pearson     0.896   0.214     0.750   0.054     0.296   0.137
Spearman    0.904   0.220     0.749   0.056     0.247   0.123
τ           0.890   0.220     0.747   0.056     0.248   0.123
DE
Pearson     0.732   0.249     0.690   0.064     0.226   0.101
Spearman    0.700   0.235     0.690   0.063     0.128   0.058
τ           0.665   0.231     0.684   0.062     0.121   0.048
Table 2: Mean and standard deviation (s.d.) of the difference between full correlation
matrices and MSTs constructed from the bootstrapped datasets. For all of the countries
there is a reduction in the mean difference between full correlation matrices when the rank
correlation method is used, but there is little reduction for the MST networks, weighted
or unweighted.
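The normalised difference of Equation (11) can be sketched as follows, with hypothetical names and toy matrices. The same function applies to full and MST filtered matrices, since both are plain p × p arrays.

```python
def normalised_difference(cx, cy):
    """Sum of absolute differences after scaling each matrix to sum to 1."""
    mx = sum(sum(row) for row in cx)
    my = sum(sum(row) for row in cy)
    return sum(
        abs(a / mx - b / my)
        for row_x, row_y in zip(cx, cy)
        for a, b in zip(row_x, row_y)
    )

cx = [[1.0, 0.5], [0.5, 1.0]]
cy = [[1.0, 0.2], [0.2, 1.0]]
difference = normalised_difference(cx, cy)

# Scaling every entry by the same factor leaves the measure at zero.
scaled = [[2.0 * value for value in row] for row in cx]
scale_invariant = normalised_difference(cx, scaled)
```

Because each matrix is divided by its own total, a uniform rise in correlations (as seen during market stress) does not register as a difference; only changes in the relative pattern do.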
4. Conclusion
In this paper we have used the Pearson, Spearman and Kendall’s τ corre-
lation coefficients to infer correlation matrices from stock returns from three
countries (the US, UK and Germany), constructed minimum spanning trees
from these matrices and compared the robustness and evolution of the trees
over time.
Looking at the evolution of the trees over the dataset we have found
the MSTs constructed using the rank correlations (Spearman and Kendall’s
τ ) change less than Pearson MSTs (notably during times of market stress)
and have edges that are maintained for a longer time period over the dataset.
Despite this, the trees tend to have a similar topology, irrespective of the coefficient,
and this topology tends to vary in a similar way over time for all three
methods. Perhaps unsurprisingly, the structure of the rank MSTs is very
similar, while they both differ more from the Pearson MSTs.
In general all of the MSTs tend to show broad agreement on which sectors
are regarded as important over the entire dataset, but there can be significant
disagreements at particular points in time. The rank MSTs show more agreement
with each other than either does with the Pearson MSTs. The agreement
using degree centrality is higher than using betweenness centrality, indicat-
ing that companies tend to be found in different places on the trees in the
Pearson MSTs compared to the rank ones.
We then attempt to connect departures from univariate Gaussianity for
individual companies to changes in their expression in the MSTs and full
correlation matrices. These deviations are correlated with changes in the ex-
pression of a company in the full correlation matrices, but not in the MSTs.
Furthermore we then use a quantile normalisation method to enforce uni-
variate Gaussianity on each company, and then run the analysis again. From
this we find that there is a reduction in difference between the rank and
Pearson MSTs, but there is still a much larger difference between them than
between the Spearman and τ MSTs, indicating that there could be non-linear
relationships involved too.
These MSTs can also be applied for use in portfolio selection. We con-
struct minimum risk portfolios from the MST correlation matrices and com-
pare the resulting portfolios to those produced using the full covariance ma-
trix. The portfolios constructed from the MST correlation matrices tend to
have a lower turnover compared to those constructed using the full matri-
ces for the larger markets, but tend to have a slightly lower Sharpe ratio.
In particular for the portfolios constructed from the MSTs, the τ portfolios have
the lowest turnover, while their Sharpe ratio is indistinguishable from those
constructed from the Pearson MST correlation matrices.
Finally we use a bootstrap to test the consistency of the correlation ma-
trices inferred and the edges selected in the MSTs. We find that the full cor-
relation matrices constructed using rank correlations mostly have a smaller
difference than the full Pearson correlation matrices, but there is relatively
little difference in the MSTs, indicating the MST construction procedure has
a larger influence on this than the correlation coefficient chosen.
Overall it may be worth constructing MSTs using different correlation
coefficients for a given dataset to give a different picture, but the MST con-
struction procedure has the greatest influence on the results. Generally the
Spearman and Kendall’s τ correlation coefficients tend to give similar results,
indicating that if computational resources are constrained then calculating
the Spearman correlation is sufficient. Future work could proceed in several
directions. A comparison of mutual information MSTs to these correlation
MSTs to see how they differ could be interesting, or exploring different filtra-
tion models, for instance the Planar Maximally Filtered Graph. Alternatively
these comparisons could be performed with returns data from other coun-
tries or assets, perhaps from data that is highly correlated and volatile, for
instance for returns from cryptocurrencies or developing nations.
Funding
TM acknowledges PhD studentship funding from the School of Electronics
and Computer Science, University of Southampton.
References
[1] R. N. Mantegna, Hierarchical structure in financial markets, The Eu-
ropean Physical Journal B - Condensed Matter and Complex Systems
11 (1) (1999) 193–197.
[5] T. Millington, M. Niranjan, Partial correlation financial net-
works, Applied Network Science 5 (1) (2020) 11. doi:10.1007/
s41109-020-0251-z.
[9] R. S. Tsay, Analysis of financial time series, Vol. 543, John Wiley &
Sons, 2005.
[15] J.-P. Onnela, A. Chakraborti, K. Kaski, J. Kertesz, A. Kanto, Dynam-
ics of market correlations: Taxonomy and portfolio analysis, Physical
Review E 68 (5) (2003) 056110.
[16] Y. Zhang, G. H. T. Lee, J. C. Wong, J. L. Kok, M. Prusty, S. A. Cheong,
Will the US economy recover in 2010? A minimal spanning tree study,
Physica A: Statistical Mechanics and its Applications 390 (11) (2011)
2020–2050.
[17] G. Bonanno, G. Caldarelli, F. Lillo, R. N. Mantegna, Topology of
correlation-based minimal spanning trees in real and model markets,
Phys. Rev. E 68 (2003) 046130. doi:10.1103/PhysRevE.68.046130.
[18] N. Vandewalle, F. Brisbois, X. Tordoir, et al., Non-random topology of
stock markets, Quantitative Finance 1 (3) (2001) 372–374.
[19] J.-P. Onnela, A. Chakraborti, K. Kaski, J. Kertész, Dynamic asset trees
and Black Monday, Physica A: Statistical Mechanics and its Applications
324 (1) (2003) 247–252, Proceedings of the International Econophysics
Conference.
[20] A. Hüttner, J.-F. Mai, S. Mineo, Portfolio selection based on graphs:
Does it align with Markowitz-optimal portfolios?, Dependence Modeling
6 (1) (2018) 63–87.
[21] G. Peralta, A. Zareei, A network approach to portfolio selection, Journal
of Empirical Finance 38 (2016) 157 – 180.
[22] H. Kaya, Eccentricity in asset management, Available at SSRN 2350429
(2013).
[23] F. Pozzi, T. Di Matteo, T. Aste, Spread of risk across financial markets:
better to invest in the peripheries, Scientific reports 3 (2013) 1665.
[24] T. Aste, W. Shaw, T. D. Matteo, Correlation structure and dynamics
in volatile markets, New Journal of Physics 12 (8) (2010) 085009. doi:
10.1088/1367-2630/12/8/085009.
[25] W.-S. Jung, O. Kwon, F. Wang, T. Kaizoji, H.-T. Moon, H. E. Stanley,
Group dynamics of the Japanese market, Physica A: Statistical Mechanics
and its Applications 387 (2) (2008) 537–542.
doi:10.1016/j.physa.2007.09.022.
[26] R. Coelho, S. Hutzler, P. Repetowicz, P. Richmond, Sector analysis
for a FTSE portfolio of stocks, Physica A: Statistical Mechanics and its
Applications 373 (2007) 615–626. doi:10.1016/j.physa.2006.02.050.
[27] P. Coletti, Comparing minimum spanning trees of the Italian stock market
using returns and volumes, Physica A: Statistical Mechanics and its
Applications 463 (2016) 246–261. doi:10.1016/j.physa.2016.07.029.
[28] W.-S. Jung, S. Chae, J.-S. Yang, H.-T. Moon, Characteristics of the
Korean stock market correlations, Physica A: Statistical Mechanics and
its Applications 361 (1) (2006) 263–271. doi:10.1016/j.physa.2005.06.081.
[29] D. Stosic, D. Stosic, T. B. Ludermir, T. Stosic, Collective behavior of
cryptocurrency price changes, Physica A: Statistical Mechanics and its
Applications 507 (2018) 499–509.
[30] J. Y. Song, W. Chang, J. W. Song, Cluster analysis on the structure
of the cryptocurrency market via bitcoin–ethereum filtering, Physica
A: Statistical Mechanics and its Applications 527 (2019) 121339. doi:
https://doi.org/10.1016/j.physa.2019.121339.
[31] P. Tewarie, A. Hillebrand, M. Schoonheim, B. van Dijk, J. Geurts,
F. Barkhof, C. Polman, C. Stam, Functional brain network analysis
using minimum spanning trees in multiple sclerosis: An MEG source-space
study, NeuroImage 88 (2014) 308–318.
doi:10.1016/j.neuroimage.2013.10.022.
[32] P. Tewarie, E. van Dellen, A. Hillebrand, C. Stam, The minimum span-
ning tree: An unbiased method for brain network analysis, NeuroImage
104 (2015) 177 – 188. doi:https://doi.org/10.1016/j.neuroimage.
2014.10.015.
[33] R. Cont, Empirical properties of asset returns: stylized facts and
statistical issues, Quantitative Finance 1 (2) (2001) 223–236. doi:
10.1080/713665670.
[34] S. Miccichè, G. Bonanno, F. Lillo, R. N. Mantegna, Degree stability
of a minimum spanning tree of price return and volatility, Physica A:
Statistical Mechanics and its Applications 324 (1) (2003) 66–73, Proceedings
of the International Econophysics Conference.
doi:10.1016/S0378-4371(03)00002-5.
[45] A. Kocheturov, M. Batsyn, P. M. Pardalos, Dynamics of cluster struc-
tures in a financial market network, Physica A: Statistical Mechanics
and its Applications 413 (2014) 523 – 533. doi:https://doi.org/10.
1016/j.physa.2014.06.077.