Article
A New Goodness of Fit Test for Multivariate Normality and
Comparative Simulation Study
Jurgita Arnastauskaitė 1,2, * , Tomas Ruzgas 2 and Mindaugas Bražėnas 3
Abstract: The testing of multivariate normality remains a significant scientific problem. Although it
is being extensively researched, it is still unclear how to choose the best test based on the sample size,
variance, covariance matrix, and other factors. In order to contribute to this field, a new goodness of fit
test for multivariate normality is introduced. This test is based on the mean absolute deviation of the
empirical distribution density from the theoretical distribution density. The new test was compared
with the most popular tests in terms of empirical power. The power of the tests was estimated for the
selected alternative distributions and examined by the Monte Carlo modeling method for the chosen
sample sizes and dimensions. Based on the modeling results, it can be concluded that the new test is
one of the most powerful tests for checking multivariate normality, especially for smaller samples. In
addition, the assumption of normality of two real data sets was checked.
Keywords: multivariate normality; power of tests; squared radii; skewness; kurtosis

Citation: Arnastauskaitė, J.; Ruzgas, T.; Bražėnas, M. A New Goodness of Fit Test for Multivariate Normality and Comparative Simulation Study. Mathematics 2021, 9, 3003. https://doi.org/10.3390/math9233003

Academic Editor: María Purificación Galindo Villardón

Received: 12 October 2021; Accepted: 19 November 2021; Published: 23 November 2021

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Much multivariate data is collected by monitoring natural and social processes. IBM estimates that we all generate 175 zettabytes of data every day. Moreover, data are being collected at a rapidly increasing rate; it is estimated that 90% of all data has been generated in the last two years. The need to extract useful information from continuously generated data sets drives demand for data specialists and the development of robust analysis methods.

Data analytics is inconceivable without testing the goodness of fit hypothesis. The primary task of a data analyst is to become familiar with the data sets received. This usually starts by identifying the distribution of the data. Then, the assumption that the data follow a normal distribution is usually tested. Since 1990, many tests have been developed to test this assumption, mostly for univariate data.

It is important to use powerful goodness of fit tests when checking the assumption of normality, because the alternative distribution is generally not known. Based on the outcome of normality verification, one can choose suitable analysis methods (parametric or non-parametric) for further investigation. From the end of the 20th century to the present day, multivariate goodness of fit tests have been developed by a number of authors [1–14]. Some of the most popular and commonly used multivariate tests are the Chi-Square [8], Cramér-von Mises [2], Anderson-Darling [2], and Royston [3] tests.

Checking the assumption of normality is more complex for multivariate data than for univariate data. Additional data processing is required (e.g., standardization). The development of multivariate tests is also more demanding because the properties of invariance and consistency must be verified, whereas for univariate tests the invariance property is always satisfied. The invariance and consistency properties are presented in Section 2 and are discussed in more detail in [2,12,15].
The study aims to perform a power analysis of multivariate goodness of fit tests for the assumption of normality, to compare the performance of the proposed test with other well-known tests, and to apply the multivariate tests to real data. The power estimation procedure is discussed in [16].
Scientific novelty. The power analysis of multivariate goodness of fit hypothesis testing for different data sets was performed. The goodness of fit tests were selected as representatives of popular techniques that have been analyzed experimentally by other researchers. In addition, we propose a new multivariate test based on the mean absolute deviation of the empirical distribution density from the theoretical distribution density. In this test, the density estimate is derived using an inversion formula, which is presented in Section 3.
The rest of the paper is organized as follows. Section 2 defines the tests for the
comparative multivariate test power study. Section 3 presents details of our proposed
test. Section 4 presents the data distributions used for experimental test power evaluation.
Section 5 presents and discusses the results of simulation modeling. Section 6 discusses
the application of multivariate goodness of fit hypothesis tests to real data. Finally, the
conclusions and recommendations are given in Section 7.
$$S = \frac{1}{n}\sum_{j=1}^{n}\left(X_j - \bar{X}\right)\left(X_j - \bar{X}\right)^T.$$

An invariant test has a statistic which satisfies condition (1). It might seem that a test based on a standardized sample

$$Y_j = S^{-\frac{1}{2}}\left(X_j - \bar{X}\right),$$

is invariant; however, Henze and Zirkler [2] note that this is not always the case. In practice, for a given sample $X_1, X_2, \ldots, X_n$ the alternative distribution is not known. In such a case it is important to use a test for which the probability of correctly rejecting $H_0$ tends to one as $n \to \infty$. Such a test is said to be consistent. For a more elaborate discussion of these properties we refer the reader to [2]. Other important notation is given in Appendix A.
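As a concrete illustration of the standardization step, the following sketch (our illustration, not code from the paper) computes $S$ and the standardized sample $Y_j = S^{-1/2}(X_j - \bar{X})$ with NumPy; the symmetric inverse square root is taken via an eigendecomposition:

```python
import numpy as np

def standardize_sample(X):
    """Return Y_j = S^{-1/2} (X_j - X_bar) for an (n, p) sample X."""
    n, p = X.shape
    centered = X - X.mean(axis=0)
    # S = (1/n) * sum_j (X_j - X_bar)(X_j - X_bar)^T
    S = centered.T @ centered / n
    # Symmetric inverse square root of S via the eigendecomposition
    vals, vecs = np.linalg.eigh(S)
    S_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return centered @ S_inv_sqrt

rng = np.random.default_rng(0)
X = rng.multivariate_normal([1.0, -2.0], [[2.0, 0.5], [0.5, 1.0]], size=500)
Y = standardize_sample(X)
# The standardized sample has zero mean and identity covariance (up to rounding)
print(np.allclose(Y.mean(axis=0), 0.0), np.allclose(Y.T @ Y / 500, np.eye(2)))
```

By construction, the standardized sample has exactly zero mean and identity sample covariance, which is why invariance of a statistic computed from $Y_j$ seems plausible at first sight.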
where $N_{n,l} = \sum_{j=1}^{n} \mathbf{1}\{a_{l-1} < D_{n,j} \le a_l\}$ $(l = 1, 2, \ldots, k;\ a_0 = 0,\ a_k = +\infty)$. $M_{n,k}$ takes the equivalent form [8]:

$$M_{n,k} = k \sum_{l=1}^{k}\left(G_n(a_l) - G_n(a_{l-1})\right)^2,$$

where $G_p(\cdot)$ is the probability distribution function of $\chi^2(p)$, and the cell boundaries are chosen so that $G_p(a_l) - G_p(a_{l-1}) = k^{-1}$, $l = 1, 2, \ldots, k$; $G_p(+\infty) = 1$.

$$CM = \frac{1}{12n} + \sum_{j=1}^{n}\left(G_p\left(D_{(j)}\right) - \frac{2j-1}{2n}\right)^2, \tag{3}$$

where $m_i = \frac{1}{n}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^i$ and $\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j$.
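For intuition, the chi-square statistic over equal-probability cells of the squared radii can be sketched as follows (our illustration, written in the standard Pearson form $\sum_l (N_l - n/k)^2/(n/k)$; the choice $k = 5$ and the use of SciPy for the $\chi^2(p)$ quantiles are assumptions, not prescriptions from the paper):

```python
import numpy as np
from scipy.stats import chi2

def chi_square_radii_statistic(X, k=5):
    """Pearson chi-square statistic over k equal-probability cells of the
    squared Mahalanobis radii D_{n,j}; the boundaries a_l satisfy
    G_p(a_l) - G_p(a_{l-1}) = 1/k under the chi^2(p) reference distribution."""
    n, p = X.shape
    centered = X - X.mean(axis=0)
    S = centered.T @ centered / n
    D = np.einsum('ij,jk,ik->i', centered, np.linalg.inv(S), centered)
    # a_0 = 0 < a_1 < ... < a_k = +inf with G_p(a_l) = l/k
    edges = chi2.ppf(np.arange(k + 1) / k, df=p)
    N_l, _ = np.histogram(D, bins=edges)
    return float(np.sum((N_l - n / k) ** 2 / (n / k)))

rng = np.random.default_rng(1)
stat = chi_square_radii_statistic(rng.standard_normal((400, 3)))
print(stat)  # compare against chi^2 upper quantiles to decide on H0
```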
where $Z_1^T = (z_{11}, \ldots, z_{1p})$ and $Z_2^T = (z_{21}, \ldots, z_{2p})$ are defined as

$$Z_1 = \delta \log\left(y + \sqrt{y^2 - 1}\right) \quad \text{and} \quad Z_2 = \sqrt{9\alpha}\left(\sqrt[3]{\frac{\chi}{2\alpha}} - 1 + \frac{1}{9\alpha}\right),$$

where

$$\delta = \frac{1}{\sqrt{\log(w^2)}}, \quad w^2 = -1 + \sqrt{2(\beta - 1)}, \quad \beta = \frac{3\left(n^2 + 27n - 70\right)(n+1)(n+3)}{(n-2)(n+5)(n+7)(n+9)}, \quad y = s\sqrt{\frac{\left(w^2 - 1\right)(n+1)(n+3)}{12(n-2)}},$$

$$Z_j = \frac{W_j - l}{\sigma},$$

where $\gamma$, $l$ and $\sigma$ are derived from polynomial approximations. The polynomial coefficients are provided for different sample sizes in [3,5]:
where $e$ is the equivalent degrees of freedom and $\Phi(\cdot)$ is the cumulative distribution function for the standard normal distribution, such that

$$e = \frac{p}{1 + (p-1)\bar{c}},$$

$$\psi_j = \left\{\Phi^{-1}\left[\frac{\Phi\left(-Z_j\right)}{2}\right]\right\}^2, \quad j = 1, 2, \ldots, d.$$

Let $R$ be the correlation matrix and $r_{ij}$ the correlation between the $i$th and $j$th observations. Then, the extra term $\bar{c}$ is found by

$$\bar{c} = \sum_i \sum_j \frac{c_{ij}}{p(p-1)}, \quad i \ne j,$$

where

$$c_{ij} = \begin{cases} g\left(r_{ij}, n\right) & \text{for } i \ne j, \\ 1 & \text{for } i = j. \end{cases}$$

When $g(0, n) = 0$ and $g(1, n) = 1$, $g(\cdot)$ can be defined as

$$g(r, n) = r^{l}\left(1 - \frac{\mu}{v}(1 - r)^{l}\right),$$

where $l$, $\mu$ and $v$ are the unknown parameters, which are estimated by Ross modeling [4]. It was found that $l = 0.715$ and $\mu = 5$ for sample sizes $10 \le n \le 2000$, and $v$ is a cubic function
where $\hat{P}(t)$ is the characteristic function of the multivariate standard normal, $\hat{Q}(t)$ is the empirical characteristic function of the standardised observations, and $\varphi_h(t)$ is a kernel (weighting) function

$$\varphi_h(t) = \left(2\pi h^2\right)^{-p/4} e^{-\frac{t^T t}{2h^2}},$$

$$E_n = n\left(\frac{2}{n}\sum_{j=1}^{n} E\|\tilde{Y}_{n,j} - N_1\| - E\|N_1 - N_2\| - \frac{1}{n^2}\sum_{j,k=1}^{n}\|\tilde{Y}_{n,j} - \tilde{Y}_{n,k}\|\right), \tag{8}$$

where $\tilde{Y}_{n,j} = \sqrt{n/(n-1)}\, Y_{n,j}$, $Y_{n,j} = S_n^{-\frac{1}{2}}\left(X_j - \bar{X}_n\right)$, $j = 1, \ldots, n$, are called scattering residues. $N_1$ and $N_2$ are independent random vectors distributed according to the standard normal distribution. $E\|N_1 - N_2\| = 2\Gamma\left(\frac{p+1}{2}\right)/\Gamma\left(\frac{p}{2}\right)$, where $\Gamma(\cdot)$ is a Gamma function. The null hypothesis is rejected when $E_n$ acquires large values.
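A sketch of the energy statistic (8) follows; this is our illustration, using the closed form $E\|y - N_1\| = \sqrt{2}\,\Gamma(\frac{p+1}{2})/\Gamma(\frac{p}{2})\cdot {}_1F_1(-\frac{1}{2}; \frac{p}{2}; -\|y\|^2/2)$ from the energy-test literature [18]; the use of `scipy.special.hyp1f1` is an assumption of this sketch:

```python
import numpy as np
from math import gamma, sqrt
from scipy.special import hyp1f1

def energy_statistic(X):
    """Energy statistic E_n of Equation (8) for an (n, p) sample."""
    n, p = X.shape
    centered = X - X.mean(axis=0)
    S = centered.T @ centered / n
    vals, vecs = np.linalg.eigh(S)
    Y = centered @ (vecs @ np.diag(vals ** -0.5) @ vecs.T)
    Y = sqrt(n / (n - 1)) * Y                       # scattering residues
    e_nn = 2 * gamma((p + 1) / 2) / gamma(p / 2)    # E||N1 - N2||
    sq = (Y ** 2).sum(axis=1)
    # E||y - N1|| for fixed y, via the confluent hypergeometric function
    e_yn = sqrt(2) * gamma((p + 1) / 2) / gamma(p / 2) * hyp1f1(-0.5, p / 2, -sq / 2)
    pair = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2)
    return float(n * (2 * e_yn.mean() - e_nn - pair.mean()))

rng = np.random.default_rng(2)
E_n = energy_statistic(rng.standard_normal((150, 2)))
print(E_n)  # nonnegative; large values argue against H0
```

Since the statistic is $n$ times an energy distance between the empirical measure and the standard normal, it is nonnegative by construction.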
$$HZ = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} e^{-\frac{h^2}{2}D_{ij}} - 2\left(1 + h^2\right)^{-\frac{p}{2}}\sum_{i=1}^{n} e^{-\frac{h^2 D_i}{2\left(1 + h^2\right)}} + n\left(1 + 2h^2\right)^{-\frac{p}{2}}, \tag{10}$$

where $D_{ij} = \left(X_i - X_j\right)^T S^{-1}\left(X_i - X_j\right)$ and $D_i = \left(X_i - \bar{X}\right)^T S^{-1}\left(X_i - \bar{X}\right)$.

$D_i$ gives the squared Mahalanobis distance of the $i$th observation to the centroid and $D_{ij}$ gives the Mahalanobis distance between the $i$th and $j$th observations. If the sample follows a multivariate normal distribution, the test statistic is approximately log-normally distributed with mean [1]

$$\mu = 1 - a^{-\frac{p}{2}}\left(1 + \frac{p h^2}{a} + \frac{p(p+2)h^4}{2a^2}\right),$$

where $a = 1 + 2h^2$, and variance [1]

$$\sigma^2 = 2\left(1 + 4h^2\right)^{-\frac{p}{2}} + 2a^{-p}\left(1 + \frac{2p h^4}{a^2} + \frac{3p(p+2)h^8}{4a^4}\right) - 4 w_h^{-\frac{p}{2}}\left(1 + \frac{3p h^4}{2 w_h} + \frac{p(p+2)h^8}{2 w_h^2}\right),$$

where $w_h = \left(1 + h^2\right)\left(1 + 3h^2\right)$.
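The statistic (10) can be sketched directly from its definition (our illustration; the smoothing parameter $h^2 = \frac{1}{2}((2p+1)n/4)^{2/(p+4)}$ is the common default choice from [1,5] and is an assumption here):

```python
import numpy as np

def henze_zirkler(X):
    """Henze-Zirkler statistic HZ of Equation (10) for an (n, p) sample."""
    n, p = X.shape
    centered = X - X.mean(axis=0)
    S_inv = np.linalg.inv(centered.T @ centered / n)
    diff = centered[:, None, :] - centered[None, :, :]
    Dij = np.einsum('ijk,kl,ijl->ij', diff, S_inv, diff)      # pairwise D_ij
    Di = np.einsum('ij,jk,ik->i', centered, S_inv, centered)  # D_i to centroid
    h2 = 0.5 * ((2 * p + 1) * n / 4) ** (2 / (p + 4))         # h^2 (default)
    term1 = np.exp(-h2 / 2 * Dij).sum() / n
    term2 = 2 * (1 + h2) ** (-p / 2) * np.exp(-h2 * Di / (2 * (1 + h2))).sum()
    return float(term1 - term2 + n * (1 + 2 * h2) ** (-p / 2))

rng = np.random.default_rng(4)
hz = henze_zirkler(rng.standard_normal((200, 3)))
print(hz)  # nonnegative; large values argue against H0
```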
A drawback of the Henze-Zirkler test is that, when H0 is rejected, the cause of the violation of normality is generally not straightforward to identify. Thus, many biomedical researchers would prefer a more informative and equally or more powerful test than the Henze-Zirkler test [5].
$$V_i = V_{in}\left(\hat{\theta}_n\right) = \frac{N_{in} - n/r}{\sqrt{n/r}}, \quad i = 1, \ldots, r,$$

where $N_{in}$ is the number of random vectors $X_1, \ldots, X_n$ falling into $E_{in}\left(\hat{\theta}_n\right)$, $i = 1, \ldots, r$. Then the limiting covariance matrix of the standardized frequencies $V_n\left(\hat{\theta}_n\right)$ is $\Sigma_l = I -$
where $q$ is the number of clusters (i.e., components, classes) of the mixture, and $p_k$ $(k = 1, \ldots, q)$ are the a priori probabilities, which satisfy

$$p_k > 0, \quad \sum_{k=1}^{q} p_k = 1. \tag{14}$$
$$x_\tau = \tau^T x, \tag{15}$$

$$f \leftrightarrow \left\{ f_\tau,\ \tau \in R^p \right\}. \tag{16}$$

It is quite natural to try to find the multivariate density $f$ using the density estimates $\hat{f}_\tau$ of univariate observational projections [20]. In the case of the Gaussian mixture model, the projection of the observations (15) is also distributed according to a Gaussian mixture model:

$$f_\tau(x) = f_\tau(x, \theta_\tau) = \sum_{k=1}^{q} p_{k,\tau}\, \varphi_{k,\tau}(x), \tag{17}$$

where $\varphi_{k,\tau}(x) = \varphi\left(x; m_{k,\tau}, \sigma^2_{k,\tau}\right)$ is a univariate Gaussian density. The parameter set $\theta$ of the multivariate mixture and the distribution parameters of the data projections $\theta_\tau = \left\{ p_{k,\tau}, m_{k,\tau}, \sigma^2_{k,\tau},\ k = 1, \ldots, q \right\}$ are related by the equations:

$$p_{j,\tau} = p_j, \quad m_{j,\tau} = \tau^T M_j, \quad \sigma^2_{j,\tau} = \tau^T R_j \tau. \tag{18}$$
where

$$\psi(t) = E e^{i t^T x}, \tag{20}$$

$\psi(t)$ denotes the characteristic function of the random variable $X$. Setting $u = |t|$, $\tau = t/|t|$ and changing the variables to a spherical coordinate system, we obtain

$$f(x) = \frac{1}{(2\pi)^p}\int_{\tau: |\tau| = 1} ds \int_0^{\infty} e^{-i u \tau^T x}\, \psi(u\tau)\, u^{p-1}\, du, \tag{21}$$
where the first integral is the surface integral over the unit sphere. The characteristic function of the projection of the observed random variable is

$$\psi_\tau(u) = E e^{i u \tau^T X}, \tag{22}$$

where $\#T$ denotes the size of set $T$. Using the $p$-variate ball volume formula

$$V_p(R) = \frac{\pi^{p/2} R^p}{\Gamma\left(\frac{p}{2} + 1\right)} = \begin{cases} \dfrac{\pi^{p/2} R^p}{\left(\frac{p}{2}\right)!}, & \text{when } p \bmod 2 \equiv 0, \\[6pt] \dfrac{2^{\frac{p+1}{2}} \pi^{\frac{p-1}{2}} R^p}{p!!}, & \text{when } p \bmod 2 \equiv 1, \end{cases} \tag{25}$$
Computer simulation studies have shown that the density estimates obtained using the inversion formula are not smooth. Therefore, in Formula (24), an additional multiplier $e^{-h u^2}$ is used. This multiplier smooths the estimate $\hat{f}(x)$ with the Gaussian kernel function. Moreover, this form of the multiplier allows the integral value to be calculated analytically. Monte Carlo studies have shown that its use significantly reduces the error of estimates.
Formula (24) can be used to estimate the characteristic function of the projected data. Let us consider two approaches. The first one is based on the density approximation by the Gaussian distribution mixture model. In this case, the parametric estimate of the characteristic function is used:

$$\hat{\psi}_\tau(u) = \sum_{k=1}^{\hat{q}_\tau} \hat{p}_{k,\tau}\, e^{i u \hat{m}_{k,\tau} - u^2 \hat{\sigma}^2_{k,\tau}/2}. \tag{27}$$

$$\hat{f}(x) = \frac{A(p)}{\#T}\sum_{\tau \in T}\sum_{k=1}^{\hat{q}_\tau} \hat{p}_{k,\tau} \int_0^{\infty} e^{i u \left(\hat{m}_{k,\tau} - \tau^T x\right) - u^2\left(h + \hat{\sigma}^2_{k,\tau}/2\right)} u^{p-1}\, du = \frac{A(p)}{\#T}\sum_{\tau \in T}\sum_{k=1}^{\hat{q}_\tau} \hat{p}_{k,\tau} \left(\hat{\sigma}^2_{k,\tau} + 2h\right)^{-\frac{p}{2}} I_{p-1}\left(\frac{\hat{m}_{k,\tau} - \tau^T x}{\sqrt{\hat{\sigma}^2_{k,\tau} + 2h}}\right), \tag{28}$$

where

$$I_j(y) = \operatorname{Re} \int_0^{\infty} e^{i y t - t^2/2}\, t^j\, dt. \tag{29}$$
We note that only the real part of the expression is considered here (the sum of the imaginary parts must be equal to zero); in other words, the density estimate $\hat{f}(x)$ can acquire only real values. The chosen form of the smoothing multiplier $e^{-h u^2}$ allows relating the smoothing parameter $h$ to the variances of the projection clusters, i.e., in the calculations the variances are simply increased by $2h$. Next, the expression (29) is evaluated.
Let

$$C_j(y) = \int_0^{\infty} \cos(yt)\, e^{-t^2/2}\, t^j\, dt, \tag{30}$$

$$S_j(y) = \int_0^{\infty} \sin(yt)\, e^{-t^2/2}\, t^j\, dt, \tag{31}$$

then (29) can be written as

$$\int_0^{\infty} e^{i y t - t^2/2}\, t^j\, dt = C_j(y) + i S_j(y). \tag{32}$$
S j (y) is expressed analogously. With respect to the limitations of the j index, the
following recursive equations are obtained:
From (35) and (38) it follows that S0 satisfies the differential equation
By equating the coefficients of the same powers, its values are obtained:
which gives us

$$S_0(y) = \sum_{l=0}^{\infty} \frac{(-1)^l y^{2l+1}}{(2l+1)!!} = y - \frac{y^3}{3!!} + \frac{y^5}{5!!} - \frac{y^7}{7!!} + \ldots. \tag{42}$$
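As a quick numerical sanity check of the series (42) (our illustration, not part of the original derivation), the partial sums can be compared with direct numerical integration of (31) for $j = 0$:

```python
import math

def S0_series(y, terms=60):
    """S_0(y) = sum_{l>=0} (-1)^l y^(2l+1) / (2l+1)!!, Equation (42)."""
    total, term = 0.0, y
    for l in range(terms):
        total += term
        term *= -y * y / (2 * l + 3)   # next term of the alternating series
    return total

def S0_numeric(y, upper=12.0, steps=200000):
    """Trapezoidal evaluation of S_0(y) = int_0^inf sin(yt) e^{-t^2/2} dt."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * math.sin(y * t) * math.exp(-t * t / 2)
    return total * h

print(abs(S0_series(1.5) - S0_numeric(1.5)))  # close to zero
```

The recursion `term *= -y*y/(2*l+3)` follows from the ratio of consecutive series terms, $(-1)^{l+1}y^{2l+3}/(2l+3)!!$ over $(-1)^l y^{2l+1}/(2l+1)!!$.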
I j ( y ) = C j ( y ). (44)
One of the disadvantages of the inversion formula method (defined by (24)) is that the Gaussian distribution mixture model (13) described by this estimate (for $f_k = \varphi_k$) does not represent the density accurately, except around observations. When approximating the density under study with a mixture of Gaussian distributions, the estimation of the density using the inversion formula often becomes complicated due to a large number of components. Thus, we merge components with small a priori probabilities into one noise cluster.
We have developed and examined a modification of the algorithm which is based on the use of a multivariate Gaussian distribution mixture model. The parametric estimate of the characteristic function of the uniform distribution density is defined as

$$\hat{\psi}(u) = \frac{2}{(b-a)u} \sin\left(\frac{(b-a)u}{2}\right) e^{\frac{i u (a+b)}{2}}, \tag{45}$$

in the inversion Formula (19). In the density estimate calculation Formula (24), the estimate of the characteristic function is constructed as a union of the characteristic functions of a mixture of Gaussian distributions and a uniform distribution with corresponding a priori probabilities:

$$\hat{\psi}_\tau(u) = \sum_{k=1}^{\hat{q}_\tau} \hat{p}_{k,\tau}\, e^{i u \hat{m}_{k,\tau} - u^2 \hat{\sigma}^2_{k,\tau}/2} + \hat{p}_{0,\tau}\, \frac{2}{(b-a)u} \sin\left(\frac{(b-a)u}{2}\right) e^{\frac{i u (a+b)}{2}}, \tag{46}$$

where the second term describes the uniformly distributed noise cluster, $\hat{p}_0$ is the noise cluster weight, and $a = a(\tau)$, $b = b(\tau)$. Based on the established estimates of the parameters of the uniform distribution and data projections, it is possible to define the range

$$a = \min_j \tau^T x_j - \frac{\max_j \tau^T x_j - \min_j \tau^T x_j}{2(n-1)}, \tag{47}$$

$$b = \max_j \tau^T x_j + \frac{\max_j \tau^T x_j - \min_j \tau^T x_j}{2(n-1)}. \tag{48}$$
By inserting (46) into (24) we obtain

$$\hat{f}(x) = \frac{A(p)}{\#T}\sum_{\tau \in T}\Bigg[\sum_{k=1}^{\hat{q}_\tau} \hat{p}_{k,\tau} \int_0^{\infty} e^{i u\left(\hat{m}_{k,\tau} - \tau^T x\right) - u^2\left(h + \hat{\sigma}^2_{k,\tau}/2\right)} u^{p-1}\, du + \frac{2\hat{p}_{0,\tau}}{b-a}\int_0^{\infty} e^{i u\left(\frac{a+b}{2} - \tau^T x\right) - u^2 h} \sin\left(\frac{(b-a)u}{2}\right) u^{p-2}\, du\Bigg]. \tag{49}$$
Let

$$J_j(y, t) = \operatorname{Re} \int_0^{\infty} e^{i y u - u^2/2}\, \sin(tu)\, u^j\, du. \tag{51}$$

By integrating, we get

$$\int_0^{\infty} e^{i y u - \frac{u^2}{2}} \sin(tu)\, u^j\, du = \int_0^{\infty} \left(\cos(yu) + i\sin(yu)\right)\sin(tu)\, e^{-\frac{u^2}{2}}\, u^j\, du = \int_0^{\infty}\left(\frac{\sin((y+t)u) + \sin((t-y)u)}{2} + i\,\frac{\cos((y-t)u) - \cos((y+t)u)}{2}\right) e^{-\frac{u^2}{2}}\, u^j\, du = \frac{1}{2}S_j(y+t) + \frac{1}{2}S_j(t-y) + \frac{i}{2}C_j(y-t) - \frac{i}{2}C_j(y+t), \tag{52}$$

where $C_j(y)$ and $S_j(y)$ are defined in (30) and (31). Then the integral (51) evaluates to

$$J_j(y, t) = \frac{1}{2}S_j(y+t) + \frac{1}{2}S_j(t-y). \tag{53}$$
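Identity (53) is easy to verify numerically, since the integrands on both sides agree pointwise through the product-to-sum formula (our check, not from the paper):

```python
import numpy as np

u = np.linspace(0.0, 12.0, 200001)
du = u[1] - u[0]

def trap(f):
    """Trapezoidal rule on the uniform grid u."""
    return float((f.sum() - 0.5 * (f[0] + f[-1])) * du)

def S_j(y, j):
    """S_j(y) = int_0^inf sin(yt) e^{-t^2/2} t^j dt, Equation (31)."""
    return trap(np.sin(y * u) * np.exp(-u ** 2 / 2) * u ** j)

def J_direct(y, t, j):
    """Real part of the integral in (51): int cos(yu) sin(tu) e^{-u^2/2} u^j du."""
    return trap(np.cos(y * u) * np.sin(t * u) * np.exp(-u ** 2 / 2) * u ** j)

y, t, j = 0.7, 1.3, 2
lhs = J_direct(y, t, j)
rhs = 0.5 * S_j(y + t, j) + 0.5 * S_j(t - y, j)
print(abs(lhs - rhs))  # close to zero
```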
The above procedure is called the modified inversion formula density estimate. Our proposed normality test is based on the distance function

$$T = \int_{R^p} \left| f(z) - \hat{f}(z) \right| dG(z), \tag{54}$$

For moderate sample sizes (≥32), $T$ does not depend on the sample size but does depend on the data dimension. It is convenient to use the test statistic $T^* = -\log(T)$, which had the lowest sensitivity based on the exploratory study. Under the null hypothesis, the statistic $T^*$ approximately follows the Johnson $S_U$ distribution, which is specified by the shape ($\delta > 0$, $\gamma$), scale ($\lambda > 0$) and location ($\xi$) parameters and has the density function

$$f(X) = \frac{\delta}{\lambda\sqrt{2\pi}}\, g'\!\left(\frac{X - \xi}{\lambda}\right) \exp\left(-0.5\left(\gamma + \delta\, g\!\left(\frac{X - \xi}{\lambda}\right)\right)^2\right), \quad X \in (-\infty, +\infty),$$

where $g(y) = \ln\left(y + \sqrt{y^2 + 1}\right)$ and $g'(y) = \frac{1}{\sqrt{y^2 + 1}}$.
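The Johnson $S_U$ density above can be evaluated directly; the sketch below (our illustration, with arbitrarily chosen parameter values) checks numerically that it integrates to one:

```python
import math

def johnson_su_pdf(x, gamma_, delta, xi, lam):
    """Johnson S_U density: shape (gamma_, delta), location xi, scale lam."""
    y = (x - xi) / lam
    g = math.asinh(y)                     # g(y) = ln(y + sqrt(y^2 + 1))
    g_prime = 1.0 / math.sqrt(y * y + 1.0)
    return (delta / (lam * math.sqrt(2 * math.pi)) * g_prime
            * math.exp(-0.5 * (gamma_ + delta * g) ** 2))

# Numerical check that the density integrates to 1 over the real line
h, lo = 0.001, -50.0
total = sum(johnson_su_pdf(lo + i * h, 0.5, 1.2, 0.0, 1.0)
            for i in range(100001)) * h
print(total)  # approximately 1.0
```

The check works because $\gamma + \delta\, g((X-\xi)/\lambda)$ is a standard normal variable under the $S_U$ model, so the density is a change of variables of the normal density.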
In the middle of the twentieth century, N. L. Johnson [24] proposed certain systems of curves derived by the method of translation which retain most of the advantages and eliminate some of the drawbacks of the systems first based on this method. Johnson introduced the log-normal ($S_L$), bounded ($S_B$), and unbounded ($S_U$) systems. The bounded system's range of variation covers the area between the bounding line $\beta_2 - \beta_1 - 1 = 0$ and the Pearson Type III distribution, where the $(\beta_1, \beta_2)$ points are obtained from the distribution moments defined by Wicksell [25]:

$$\mu_r'(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{r(z-\gamma)/\delta}\, e^{-\frac{1}{2}z^2}\, dz = e^{\frac{1}{2}r^2\delta^{-2} - r\gamma\delta^{-1}}.$$

It follows that

$$\beta_1 = \left(e^{\delta^{-2}} - 1\right)\left(e^{\delta^{-2}} + 2\right)^2, \quad \sqrt{\beta_1} > 0,$$

$$\beta_2 = e^{4\delta^{-2}} + 2e^{3\delta^{-2}} + 3e^{2\delta^{-2}} - 3.$$

The $S_U$ system is bounded at one end only (Pearson Type V). The $S_L$ system lies between the $S_B$ and $S_U$ systems. These regions are indicated in Figure 1. The $S_U$ system is presented in detail in [24].
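The limiting behaviour of these moment ratios is easy to check (our illustration): as $\delta$ grows, $\omega = e^{\delta^{-2}} \to 1$ and $(\beta_1, \beta_2) \to (0, 3)$, the point occupied by the normal distribution:

```python
import math

def johnson_sl_moment_ratios(delta):
    """(beta_1, beta_2) of the log-normal S_L line, with w = exp(delta^-2)."""
    w = math.exp(delta ** -2)
    beta1 = (w - 1) * (w + 2) ** 2
    beta2 = w ** 4 + 2 * w ** 3 + 3 * w ** 2 - 3
    return beta1, beta2

b1, b2 = johnson_sl_moment_ratios(50.0)
print(b1, b2)  # b1 near 0, b2 near 3 (the normal point)
```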
For the statistic $T^*$, the invariance and consistency properties were checked. The invariance property is confirmed because standardized data were used. The consistency property is confirmed experimentally (see Section 5).
4. Statistical Distributions
The overviewed normality tests are assessed by a simulation study of 11 statistical distributions grouped into four groups: symmetric, asymmetric, mixed and normal mixture distributions [5]. A description of these distribution groups is given in the following subsections.
is such that the first m variates (i.e., Xk1 , Xk2 , . . . , Xkm ) follow the standard normal distri-
bution and distribution of the remaining variates is one of the non-normal distributions
(Laplace(0,1), χ2 (5), t(5), Beta(1,1), Beta(1,2), Beta(2,2)). The experimental research covers the
cases for m = p − 1, m = p/2 and m = 1.
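One of these alternatives can be generated as in the sketch below (our illustration; Laplace(0,1) is used for the non-normal block and the sample sizes are arbitrary):

```python
import numpy as np

def mixed_alternative_sample(n, p, m, rng):
    """n p-variate vectors whose first m coordinates are standard normal and
    whose remaining p - m coordinates follow a Laplace(0, 1) distribution."""
    X = np.empty((n, p))
    X[:, :m] = rng.standard_normal((n, m))
    X[:, m:] = rng.laplace(0.0, 1.0, size=(n, p - m))
    return X

rng = np.random.default_rng(3)
X = mixed_alternative_sample(4000, 4, m=2, rng=rng)
# Excess kurtosis: around 0 for the normal coordinates, around 3 for Laplace
kurt = ((X - X.mean(0)) ** 4).mean(0) / X.var(0) ** 2 - 3
print(np.round(kurt, 2))
```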
In order to supplement and emphasize the results presented in Tables 2–5, generalized line diagrams were drawn using the Trellis display [26] multivariate data visualization method. The resulting graph, shown in Figure 2, indicates that the New test is significantly more powerful than the other tests, while the power of the Mar1 test is the lowest. Figure 2 also shows that the power of the tests increases as the sample size increases. As the dimension increases, the power of eight tests (AD, CHI2, CVM, Energy, HZ, New, Mar1 and NRR) decreases, while the power of the other tests (DH, DN, LV, Mar2 and Roy) increases slightly. For small sample sizes, the most powerful tests are New, Roy and DH. For large sample sizes, the most powerful tests are New, Energy, HZ and LV.
Figure 2. The summary of average empirical power of all examined distribution groups by sample size and dimensionality.
6. Examples
6.1. Survival Data
The data set collected in 2001–2020 by the Head of the Department of Urology Clinic of the Lithuanian University of Health Sciences [27] illustrates the practical application. This dataset consists of study data from 2423 patients and two continuous attributes (patient age and prostate-specific antigen (PSA)). The assumption of normality was verified by filtering patients' age and PSA by year of death (i.e., deaths during the first 1, 2, 3, 4, 5, 6, 7, 10, and 15 years). The filtered data were standardized, and the power and p-value were calculated for the multivariate tests. The significance level α = 0.05 was used for the study. Based on the obtained results, it was found that all the applied multivariate tests rejected the null hypothesis H0 of normality. The power of the CHI2, DH, Energy, HZ, LV, New, Mar, NRR and Roy tests was 0.999 and the p-value was <0.0001, except for the DN test, whose power was 0.576 and p-value 0.026.
7. Conclusions
In this study, the comprehensive comparison of the power of 13 multivariate goodness
of fit tests was performed for groups of symmetric, asymmetric, mixed, and normal mixture
distributions. Two-dimensional, five-dimensional, and ten-dimensional data sets were
generated to estimate the test power empirically.
A new multivariate goodness of fit test based on inversion formula was proposed.
Based on the obtained modeling results, it was determined that the most powerful tests for the groups of symmetric, asymmetric, mixed and normal mixture distributions are the proposed test and the Roy multivariate test. From two real data examples, it was concluded that our proposed test is stable even when applied to real data sets.
Author Contributions: Data curation, J.A., T.R.; Formal analysis, J.A., T.R.; Investigation, J.A., T.R.;
Methodology, J.A., T.R.; Software, J.A., T.R.; Supervision, T.R.; Writing—original draft, J.A., M.B.;
writing—review and editing, J.A., M.B. All authors have read and agreed to the published version of
the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Generated data sets were used in the study (see in Section 4).
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
R p is p-variate set of real numbers,
T
Xk = Xk1 , Xk2 , . . . , Xkp ∈ R p , k = 1, 2, . . . , n is p-variate vector,
#T denotes the size of set T,
p is dimension,
h is smoothness parameter,
D(j), j = 1, 2, . . . , n are the order statistics,
G p (·) is the probability distribution function of χ2 ( p),
s is skewness,
k is kurtosis,
n is sample size,
x is sample mean,
σ2 is sample variance,
z is a standardized value,
d is number of variables,
e is the equivalent degrees of freedom,
Φ(·) is the cumulative distribution function for the standard normal distribution,
R is the correlation matrix,
rij is the correlation between ith and jth observations,
Vi is a vector of standardized cell frequencies,
Nin is the number of random vectors,
J = J (θ ) is the Fisher information matrix,
Q is the p( p + 1)/2 × p( p + 1)/2 covariance matrix of w,
P̂(t) is the characteristic function of the multivariate standard normal,
Q̂(t) is the empirical characteristic function of the standardised observations,
ϕ β (t) is a kernel (weighting) function,
Γ(·) is a Gamma function,
F̂ (k) is an auto-covariance function,
Dij is Mahalanobis distance between ith and jth observations,
Wj is the normality transformation,
f k ( X, θk ) is a distribution of the kth class,
θ is a set of parameters θ = p1 , . . . , pq , θ1 , . . . , θq ,
ψ(t) is the characteristic function of the random variable X.
pk (k = 1, . . . , q) is the a priori probability,
R is the radius of the ball (bounded sphere),
qς is a quantile of standardized normal distribution
δ and γ are shape parameters,
λ is scale parameter,
ξ is location parameter.
References
1. Henze, N.; Zirkler, B. A class of invariant consistent tests for multivariate normality. Commun. Stat. Theory Methods 1990, 19,
3595–3617. [CrossRef]
2. Henze, N. Invariant tests for multivariate normality: A critical review. Stat. Pap. 2002, 43, 467–506. [CrossRef]
3. Royston, J.P. An extension of Shapiro and Wilk’s W test for normality to large samples. Appl. Stat. 1982, 31, 115–124. [CrossRef]
4. Ross, G.J.S.; Hawkins, D. MLP: Maximum Likelihood Program; Rothamsted Experimental Station: Harpenden, UK, 1980.
5. Korkmaz, S.; Goksuluk, D.; Zararsiz, G. MVN: An R Package for Assessing Multivariate Normality. R J. 2014, 6, 151–162.
[CrossRef]
6. Doornik, J.A.; Hansen, H. An Omnibus Test for Univariate and Multivariate Normality. Oxf. Bull. Econ. Stat. 2008, 70, 927–939.
7. Voinov, V.; Pya, N.; Makarov, R.; Voinov, Y. New invariant and consistent chi-squared type goodness-of-fit tests for multivariate
normality and a related comparative simulation study. Commun. Stat. Theory Methods 2016, 45, 3249–3263. [CrossRef]
8. Moore, D.S.; Stubblebine, J.B. Chi-square tests for multivariate normality with application to common stock prices. Commun. Stat.
Theory Methods 1981, A10, 713–738. [CrossRef]
9. Gnanadesikan, R.; Kettenring, J.R. Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 1972, 28,
81–124.
10. Górecki, T.; Horváth, L.; Kokoszka, P. Tests of Normality of Functional Data. Int. Stat. Rev. 2020, 88, 677–697. [CrossRef]
11. Pinto, L.P.; Mingoti, S.A. On hypothesis tests for covariance matrices under multivariate normality. Pesqui. Operacional. 2015, 35,
123–142. [CrossRef]
12. Dörr, P.; Ebner, B.; Henze, N. A new test of multivariate normality by a double estimation in a characterizing PDE. Metrika 2021,
84, 401–427. [CrossRef]
13. Zhou, M.; Shao, Y. A Powerful Test for Multivariate Normality. J. Appl. Stat. 2014, 41, 351–363. [CrossRef]
14. Kolkiewicz, A.; Rice, G.; Xie, Y. Projection pursuit based tests of normality with functional data. J. Stat. Plan. Inference 2021, 211,
326–339. [CrossRef]
15. Ebner, B.; Henze, N. Tests for multivariate normality: A critical review with emphasis on weighted L²-statistics. TEST 2020, 29, 845–892. [CrossRef]
16. Arnastauskaitė, J.; Ruzgas, T.; Bražėnas, M. An Exhaustive Power Comparison of Normality Tests. Mathematics 2021, 9, 788.
[CrossRef]
17. Mardia, K. Measures of Multivariate Skewness and Kurtosis with Applications. Biometrika 1970, 57, 519–530. [CrossRef]
18. Székely, G.J.; Rizzo, M.L. Energy statistics: A class of statistics based on distances. J. Stat. Plan. Inference 2013, 143, 1249–1272.
[CrossRef]
19. Lobato, I.; Velasco, C. A simple Test of Normality for Time Series. Econom. Theory 2004, 20, 671–689. [CrossRef]
20. Ruzgas, T. The Nonparametric Estimation of Multivariate Distribution Density Applying Clustering Procedures. Ph.D. Thesis,
Institute of Mathematics and Informatics, Vilnius, Lithuania, 2007; p. 161.
21. Marron, J.S.; Wand, M.P. Exact Mean Integrated Squared Error. Ann. Stat. 1992, 20, 712–736. [CrossRef]
22. Kavaliauskas, M.; Rudzkis, R.; Ruzgas, T. The Projection-based Multivariate Distribution Density Estimation. Acta Comment.
Univ. Tartu. Math. 2004, 8, 135–141.
23. Epps, T.W.; Pulley, L.B. A test for normality based on the empirical characteristic function. Biometrika 1983, 70, 723–726. [CrossRef]
24. Johnson, N.L. Systems of Frequency Curves Generated by Methods of Translation. Biometrika 1949, 36, 149–176. [CrossRef]
[PubMed]
25. Wicksell, S.D. The construction of the curves of equal frequency in case of type A correlation. Ark. Mat. Astr. Fys. 1917, 12, 1–19.
26. Theus, M. High Dimensional Data Visualization. In Handbook of Data Visualization; Springer: Berlin/Heidelberg, Germany, 2008;
pp. 5–7.
27. Milonas, D.; Ruzgas, T.; Venclovas, Z.; Jievaltas, M.; Joniau, S. The significance of prostate specific antigen persistence in prostate
cancer risk groups on long-term oncological outcomes. Cancers 2021, 13, 2453. [CrossRef]
28. Martuzevicius, D.; Prasauskas, T.; Setyan, A.; O’Connell, G.; Cahours, X.; Julien, R.; Colard, S. Characterization of the Spatial and
Temporal Dispersion Differences Between Exhaled E-Cigarette Mist and Cigarette Smoke. Nicotine Tob. Res. 2019, 21, 1371–1377.
[CrossRef]