Horizon Academic Research Journal Vol. 4 No. 1
Publisher
Primedia E-launch LLC
PO Box 2727, Orlando, Florida, 32802, United States
ISBN 979-8-89238-890-0
Copyright ©2023, by Sunrise International Education Inc., 641 S St NW, Ste. 300,
Washington, DC 20001. All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without the prior written permission of
the publisher.
TABLE OF CONTENTS
Image Classification of Stars and Galaxies Using Different Machine Learning Models (Emma Leifer), p. 1
Life and Death: The Effect of Biases and Heuristics on Medical Decision Making (Rajeev Krishnamurthy), p. 27
Art Therapy’s Effectiveness and its Role in Treating Neurological Conditions (Simryn Patel), p. 108
Detecting Causality by Using Alexander Quandles and Alexander-Conway Polynomial (Nikhila Pasam), p. 129
Self-Supervised Dementia Prediction From MRI Scans With Metadata Integration (Zile Huang), p. 150
Sexual Dimorphic Nature of the Amygdala and its Contribution to Females’ Susceptibility to Depression (Nardos Shewadeg Gebresenbet), p. 165
Black Sea Grain Initiative: A Game-Theoretic Analysis (Paari Dhanasekaran), p. 179
Automated Pneumonia Detection From Chest X-ray Images Using Machine Learning (Lurvïsh Polodoo), p. 192
The Effects of Classical Music Intervention on the Neuropsychiatric and Cognitive Mechanisms of Alzheimer’s Disease Patients (Nyneishia Janarthanan), p. 236
Comparing the Effectiveness of Support Vector Classifier and Stochastic Gradient Descent in Hate-Speech Detection (Dania Ali), p. 265
The WHO Is Not The Global Health Government States Think It Is (Dilay Kuyucak), p. 275
Detecting Distributed Denial Of Service Attacks (DDoS) Using Machine Learning Models (Isha Singhal), p. 303
Can Behavioural Economics Help Explain Gender Disparities in Labour Markets? (Jumaina Fatima), p. 334
Using Data-Efficient Image Transformers for Diabetic Retinopathy Severity Classification (Veda Fernandes), p. 345
Using Behavioral Economics Insights to Determine the Likely Causes of the High Rate of Unemployment in Refugee Camps and What Can Be Done to Alleviate It (Baraka Muhoza), p. 364
Abstract
The correct classification of astronomical objects - such as stars and
galaxies - is essential to the field of astronomy. Today, however, with
the advent of powerful next generation telescopes, the quantity of im-
ages being collected far exceeds the amount that can be catalogued by
astronomers through their own observations and analyses alone. As just
one example, the Large Synoptic Survey Telescope (“LSST”), opening in
August 2024, will catalogue around 40 billion images of stars and galax-
ies. Certain machine learning models have proven efficient and accurate in
classifying astronomical images. In this paper, we tested the ability of four
different machine learning models to classify images of stars and galaxies
accurately without inputs of additional measurements of the brightness,
size, or shape of stars or galaxies: a convolutional neural network (“CNN”)
model, a logistic regression model, a random forest classifier, and a small
neural network. We discuss and compare the architecture and performance
of each model. We found that our neural network model, trained on a
data set preprocessed with Principal Component Analysis (“PCA”),
performed the best, achieving
an accuracy of 84 percent out of sample. We thus demonstrate that using
such machine learning models can be an effective way to classify images of
stars and galaxies, substantially reducing the time required to catalogue
them.
∗ Advised by: Guillermo Goldsztein of the Georgia Institute of Technology
† Trinity School, New York, New York
1 Introduction
Sir William Herschel, the famous astronomer and composer, wrote in 1789,
“The method I have taken of analyzing the heavens, if I may so express myself,
is perhaps the only one by which we can arrive at a knowledge of their
construction” [Her89]. Herschel catalogued the objects he observed in the night
sky, providing “a short description of each nebula or cluster of stars, as well
as its situation with respect to some known object” in the hope that he would
“engage the attention of Astronomers,” and “induce them to undertake the
necessary observations” [Her89]. Herschel’s work led to the understanding that
the stars he observed make up the Milky Way [Cea20a]. Today, the correct
classification of stars, galaxies, and quasars remains essential to the field of as-
tronomy. The advent of powerful telescopes, however, has provided astronomers
with vast amounts of data on enormous numbers of astronomical objects, orders
of magnitude more than they can catalogue through their own observations and
analyses. As just one example, the Large Synoptic Survey Telescope (“LSST”),
opening in August 2024, will catalogue around 40 billion images of stars and
galaxies [Cea20a]. Moreover, classification of images of stars and galaxies based
on morphology alone – according to which “unresolved” point sources are clas-
sified as stars, and “resolved” sources for which a shape can be determined are
classified as galaxies – has led to inaccurate classifications [Cea20a]. Under
this approach, quasars (galaxies with an extremely luminous central region)
are sometimes mistaken for stars, particularly when they are very distant [Cea20a].
The point source/resolved image classification is also no longer sufficient given
that the newest telescopes capture more and more unresolved images of very
faint distant galaxies [Fea12].
To reduce misclassification of stars and galaxies in scenarios where images are
morphologically similar, differences in their spectral energy distributions (SEDs)
can be utilized. In particular, stellar emission spectra usually peak around a
particular frequency, while the spectra emitted by galaxies are more evenly
distributed [Facteb]. Methods based on SEDs have been effective but have
certain limitations, including that they do not incorporate information about
the data set, such as the expected relative numbers of stars and galaxies [Fea12].
The use of statistical models to distinguish stars from galaxies has emerged as a
very useful supplement to methodologies based on image morphology and SEDs.
Machine learning techniques can be used to construct statistical models based
on an existing dataset with known classifications [HR20, Imp23, Kim18, KB16,
Oea92]. Our study focuses on a subclass of machine learning techniques
known as “supervised machine learning,” which we show to be an efficient
and accurate method for analyzing the reams of data generated by telescopes
such as LSST to classify stars, galaxies, and quasars [Iea19, Kim18, Sea15, Sowte].
In this paper, we describe several different machine learning algorithms that
can be used to classify images of stars and galaxies without prior feature extraction.
In other words, our models are trained on 3,986 images of stars and galaxies, each
64 x 64 pixels, without inputs of additional measurements of the brightness,
size, or shape of stars or galaxies. We built four different models to use
as classifiers: a neural network model, a convolutional neural network (“CNN”)
model, a logistic regression model, and a random forest classifier. In our paper,
we discuss and compare the architecture and performance of each model. We
found that our neural network model trained on a data set preprocessed using
a technique known as Principal Component Analysis (“PCA”) performed the
best, achieving an accuracy of 84 percent out of sample.
1.1 Stars and Galaxies
A star is a massive ball of hot gas [oAfEtec, Geateb]. Typically, stars consist
primarily of hydrogen and helium gas, with small amounts of other elements
– many stars are about 73 percent hydrogen gas, 25 percent helium gas, and
2 percent other elements [Fac23]. Individual stars have their own unique life
cycles, which can range from a few million years to billions of years [Geateb,
ALMte]. Stars are categorized and differentiated based on factors such as their
mass, temperature, and brightness. The ages and compositions of stars in a
galaxy can provide information about the history, dynamics, age, and evolution
of the galaxy [Chi01]. A galaxy is a massive cluster of gas, dust, and billions of
stars with their respective solar systems, all bound together by gravity (a “solar”
or “star” system is a group of astronomical objects like planets or meteors that
orbit a star) [ED23, oAfEteb, Geatea].
The problem of classifying stars and galaxies is complicated by the fact
that images of certain galaxies, those with particularly bright regions at their
centers, resemble stars. The extremely bright centers of these galaxies, with a
brightness that cannot be accounted for by the presence of stars alone, are called
“active galactic nuclei” or “AGNs” [Hubte, Pogte, Tel21]. The brightest AGNs
are known as “quasars,” which have a massive black hole at their centers [KS98,
Pogte, ESAte, VBea01]. Gas and dust falling into the central black hole emit
electromagnetic radiation as they are subjected to the extreme gravitational pull
of the black hole [ESAte]. Quasars, which can be thousands of times brighter
than the entire Milky Way galaxy, are some of the brightest objects in the
universe [ESAte]. As a result, distant quasars can be mistaken for stars, even
using modern machine learning techniques [oVSOte, Cea20a, Sch63]. The images
below show quasars misidentified as stars in a recent machine learning study,
when the algorithm was applied to previously catalogued images in the test data
set [Cea20b].
Figure 1: Images of quasars from recent machine learning study that were mis-
classified as stars [Dea01].
known as “feature extraction” or “feature generation.” For example, an image
might be pre-processed by running it through a set of filters to highlight the
boundaries or edges, before the image is fed into the model. For a machine
learning model, this preprocessing would not be necessary.
It is helpful to define several terms in introducing our machine learning
model. A data point, also known as an “example,” is a collection of input values,
known as “features,” and an output value, known as the “target variable” or
as a “label” in a classification problem, similar to an (x,y) coordinate, where
the “x” can have a number of values, i.e., (features (x), target variable (y)).
In our model, the input value is an image of a star or galaxy, the features are
the pixels in that image, and the target variable or label is the classification of
the image as a star or galaxy. The data points or “examples” in our model are
the classified images. “Parameters” are numerical weights used in the model to
combine the features (input values) to produce the output value. Parameters indicate
the relevance of certain characteristics of the image, such as, in our example,
the brightness in a particular location or the strength of the boundary in a
particular direction at each pixel in the image.
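The terms defined above can be made concrete with a minimal sketch. The 4 x 4 image, the random weights, and all variable names are hypothetical and chosen only for illustration; the 0/1 labels follow the star/galaxy convention used for our data set.

```python
import random

# Hypothetical 4 x 4 grayscale "image"; the images in the paper are 64 x 64.
image = [
    [0.0, 0.1, 0.1, 0.0],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.8, 0.9, 0.1],
    [0.0, 0.1, 0.1, 0.0],
]

# Features: the pixel values, flattened into one list.
features = [pixel for row in image for pixel in row]

# Label (target variable): 0 for "star", 1 for "galaxy".
label = 0

# One data point ("example") pairs the features with the label.
example = (features, label)

# Parameters: one numerical weight per feature, plus a bias term.
# Here they are random; training would adjust them to fit the data.
random.seed(0)
weights = [random.uniform(-0.1, 0.1) for _ in features]
bias = 0.0

# The model combines features and parameters into a single output score.
score = sum(w * x for w, x in zip(weights, features)) + bias
```

In this sketch the model has 17 parameters (16 weights and one bias); a model applied to a 64 x 64 image in the same way would have 4,097.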
“Hyperparameters” are settings of the model selected in advance by the user
that determine the mathematical relationships among the inputs into the model
and the definition of the error that should be minimized by the model. In
essence, the user determines in advance the definition of the optimal outcome
being sought. The selection of hyperparameters can also determine the number
of parameters used by the model. In general, models with fewer parameters are
preferred to those with more parameters because models with more parameters
are more likely to be “overfit” to the training data, i.e., they will not generalize
well to unseen data. Models with too many parameters will find patterns in
data sets that are not actually relevant to predicting the target variable.
Initially, the data are divided into three parts – the “training set,” the “val-
idation set,” and the “test set.” The “training set” is the subset of data used
to fit the model – meaning to determine the optimal parameters or weights to
predict the correct target variable, i.e., label for the image. Once we have a
trained model, possibly a set of trained models with different hyperparameters,
we need to determine the best model to use and assess its performance on un-
seen data. To select the best model, we run the “validation set” through each
of our trained models. From this process, a set of error metrics or scores is
obtained that is used to select the model that performed the best according to
those scores. Finally, the performance of the model must be assessed on com-
pletely unseen data – the “test set.” The performance on the test set should
be representative of the model’s true performance going forwards. A test set
is necessary in light of the fact that the error metrics on the training data are
biased because the model was fit to the training data, and the validation errors
are biased because we selected the best performing model on the validation set.
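The three-way split and the model selection step described above could be sketched as follows. This is an illustrative outline, not the code used in the study: the function names, the 20 percent validation and test fractions, and the toy "models" in the usage note are assumptions.

```python
import random

def split_dataset(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the examples and split them into training, validation, and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    n_train = len(shuffled) - n_val - n_test
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)

def select_best(models, val):
    """Return the name of the model with the highest validation accuracy."""
    return max(models, key=lambda name: accuracy(models[name], val))

# Toy data: one feature per example, label 1 if the feature is odd.
examples = [([i], i % 2) for i in range(100)]
train, val, test = split_dataset(examples)

# Two hypothetical already-trained models, represented as plain functions.
models = {
    "parity": lambda features: features[0] % 2,
    "always_zero": lambda features: 0,
}
best_name = select_best(models, val)
test_accuracy = accuracy(models[best_name], test)
```

The test set is touched only once, after the best model has been chosen on the validation set, which is what keeps the final accuracy estimate unbiased.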
Machine learning can be “supervised,” “unsupervised,” or “semi-supervised,”
depending upon how much information about the data set is provided to the
model. For example, in a “supervised” machine learning model, the user pro-
vides the model with a training set that includes both the features (e.g., images
of stars and galaxies) and the target variable (classification of the images as stars
or galaxies). In an “unsupervised” model, the user provides the model with a
training set that includes only features (no target variable) and asks the model
to differentiate the objects in the data set based on the most relevant features.
In a “semi-supervised” machine learning classification model, the training set is
composed mostly of unlabeled data with a small amount of labeled data.
2 Related Works
Our research is related to several prior studies applying machine learning to
star-galaxy classification. In 1992, S. C. Odewahn et al. were the first to use
neural networks – a machine learning method – to classify astronomical images
as stars or galaxies [Oea92]. They achieved successful classification rates that
varied with the size of the images. Our research is also related to a 2016 study by
Edward J. Kim and Robert J. Brunner on star-galaxy image classification using
CNNs [KB16]. Kim and Brunner used CNNs to classify 48 x 48 pixelated images
of stars and galaxies captured by the Canada-France-Hawaii Telescope Lensing
Survey. Kim and Brunner demonstrated that CNN models can accurately clas-
sify images of stars and galaxies without extra data extracted from each image
by experts. Kim and Brunner also compared the performance of CNN models
to that of Trees for Probabilistic Classifications (“TPC” models). To minimize
overfitting (discussed in Section 4), Kim and Brunner used data augmentation
and regularization techniques. To increase the amount of data in their training
set (data augmentation) they created rotations, reflections, translations, and
blurrings of original images. They used a regularization technique known as
“drop out.”
Another related study was conducted by Edward J. Kim in 2018 at the Uni-
versity of Illinois [Kim18]. Kim lays out multiple machine learning techniques
that can be used to classify images of stars and galaxies. Kim discusses the
use of a Bayesian combination technique to improve the performance of any
single classification method. This study demonstrates further that CNNs per-
form accurately as classification models. Finally, this study shows that a semi-
supervised machine learning classification model performs well when a relatively
small amount of labeled data is available. More recently, in 2020, Ryan Hausen
and Brant E. Robertson developed Morpheus, a machine learning model that
simultaneously detects and classifies objects in astronomical images [HR20].
The ARIES conducts research in astronomy and astrophysics and uses optical
telescopes on site to gather empirical data. At ARIES, research on star clusters,
stellar variability, and the nuclei of active galaxies and related phenomena is
carried out using the DFOT observational data.
The DFOT is located on a mountain peak in Nainital, India, known as
the Devasthal (“Abode of God”). The Devasthal offers particularly good astronomical
viewing conditions as a result of its high altitude (approximately 2450
meters) and distance from urban areas, which helps to minimize light pollution.
Astronomical “seeing” is determined by the amount of disruption to light
caused by fluctuations in the refractive index of the earth’s atmosphere. The
location of DFOT atop the Devasthal allows for sub-arcsec seeing – the ability
of the DFOT to resolve details smaller than 1 arc-second (1/3600 of a degree).
Similarly, the high altitude of the site reduces “extinction,” the scattering or
absorption of light by particles between the source and the telescope, allowing
more light to reach the DFOT [oAfEtea].
The DFOT, which was installed in 2010, has two charge-coupled device
(“CCD”) cameras that capture images of the sky. A CCD camera is a very
small microchip that contains a grid of elements that sense light (each of which
is called a “pixel”) [Obste, Dea01]. One of the DFOT’s CCD cameras captures
images of 2048 x 2048 pixels, and the other captures images of 512 x 512 pixels.
When the light collected by the telescope is focused onto the CCD, each pixel
is assigned a number that corresponds to a shade of grey (color images are
obtained when several pictures are taken – through a red, green, and blue filter
– and are then overlaid). Figures 2 and 3 below show images from our dataset
of galaxies and stars, respectively, taken by the DFOT:
Figure 2: Images of galaxies from our dataset taken by the DFOT [Agr21].
Figure 3: Images of stars from our dataset taken by the DFOT [Agr21].
3.2 Dataset
We obtained our dataset on Kaggle, where it is publicly available – Kaggle
is an online platform through which users can find and publish datasets on a
wide range of topics. The images in our data set were taken by the DFOT’s
CCD camera with a 2048 x 2048 pixel grid. The original images – 2048 x 2048
pixels in size – were reduced to 64 x 64 cutouts, where each cutout showed a
single astronomical object. Image segmentation was then performed to isolate
the astronomical object (star or galaxy) in each image. Image segmentation
splits up an image into different regions based on characteristics of the image in
those regions. Image segmentation helps pinpoint objects or boundaries in the
image making analysis more efficient; it also makes the process of finding the
coordinates of each isolated object easier [Labte].
After image segmentation, the coordinates of each isolated object were then
identified and inputted into the Sloan Digital Sky Survey (SDSS) to find the
corresponding label of star or galaxy for each image. The SDSS is a publicly
available database that contains three-dimensional maps of the universe, with
catalogues that contain both photometric and spectroscopic data [Surtea,
Surteb, Surtec]. Photometric data are obtained when light passes through multiple
colored filters [Factea]. Spectroscopic data are obtained when light is spread
into its component wavelengths by a “dispersive element” (e.g., a prism) [Sowte]. Researchers can use
the SDSS database to identify astronomical objects by inputting the coordinates
of their objects.
Cutout images in our dataset were separated into two directories – a “star”
directory and a “galaxy” directory – based on SDSS classification after their
coordinates were inputted into the SDSS. Our dataset was composed only of
the 64 x 64 pixel cutout images.
Figure 4: The for loop that assigned 0’s to “star” images and 1’s to “galaxy”
images.
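A labeling loop of the kind described in the caption of Figure 4 might look like the sketch below. It is not the code shown in that figure: the file names and the `files_by_class` dictionary are hypothetical stand-ins for the contents of the “star” and “galaxy” directories.

```python
# Hypothetical file listings; in practice these would be read from the
# "star" and "galaxy" directories of the data set.
files_by_class = {
    "star": ["star_001.jpg", "star_002.jpg"],
    "galaxy": ["galaxy_001.jpg"],
}

# Numerical label for each class: 0 for stars, 1 for galaxies.
LABELS = {"star": 0, "galaxy": 1}

# Pair every image file with the label of the directory it came from.
labeled = []
for class_name, file_names in files_by_class.items():
    for file_name in file_names:
        labeled.append((file_name, LABELS[class_name]))
```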
galaxies), our validation set had 797 images (607 stars and 190 galaxies), and
our test set had 798 images (629 stars and 169 galaxies):
Table 1: Division of full data set into training set, validation set, and test set.
Table 2: Division of the full data set, showing numbers of stars and galaxies in
the balanced training set.
network to find higher order patterns and structures in the data, because the
neural network can recombine transformations it has done in prior layers into
more complicated features. For example, a neural network could start by iden-
tifying edges, and then slowly determine that a particular combination of edges
represents a star or a galaxy.
Within each layer, each input feature, and successive transformations of that
feature, is referred to as a “node” in a neural network – “nodes” can be thought
of as corresponding to neurons in the brain. The layer with the raw features,
the first layer, is known as the “input” layer. The intermediate layers, which are
the transformations of the raw features, are known as the “hidden layers.” The
final layer, which contains the predictions, is known as the “output layer.” The
output layer in a classification problem will have nodes that correspond to the
probability of each category. “Deep learning” algorithms are neural networks
that contain multiple hidden layers.
The number and types of layers used, as well as the number of nodes in
each layer, are all considered “hyperparameters,” and are determined by the
programmer before the model is trained. Once the hyperparameters have been
set and the model has been fit, each node represents a combination of the
features that went into it. Layers other than the input layer use an “activation
function” to compare the input data to a threshold value – the input layer does
not require an activation function simply because all input data must be passed
on to successive layers for a model to develop. If the data a node receives
are below a threshold value, the node will not pass the data on to the next
layer – essentially the neuron will not fire. Figure 6 below shows a physical
representation of a deep learning neural network with its different layers.
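The flow of data from the input layer through a hidden layer to the output layer can be illustrated with a very small network. The sketch below uses hand-picked weights and two common activation functions (ReLU in the hidden layer, sigmoid at the output); all values and names are illustrative assumptions, not the architecture of our model.

```python
import math

def relu(x):
    """Pass positive values through; output zero otherwise (the node does not 'fire')."""
    return max(0.0, x)

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each node combines every input."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# Input layer: three raw features (hypothetical values).
inputs = [0.5, -1.0, 2.0]

# Hidden layer with two nodes; the weights are chosen by hand for illustration.
hidden = dense(inputs, [[1.0, 0.0, 1.0], [-1.0, 1.0, 0.0]], [0.0, 0.0], relu)

# Output layer with one node, giving the predicted probability of class 1.
output = dense(hidden, [[1.0, 1.0]], [-2.0], sigmoid)
```

Here the second hidden node receives a negative weighted sum (-1.5) and therefore outputs zero, while the first passes its value (2.5) on to the output layer.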
4.2 Convolutional Neural Network (CNN)
For one of our models, we used a type of deep learning neural network known
as a convolutional neural network (CNN). CNNs, which are designed to process
pixel data from images, contain a specific architecture that is used for image
processing and recognition. CNNs contain a minimum of two hidden layers – a
convolutional layer and a pooling layer (discussed below in Sections 4.3 and 4.4)
– that are used to process and extract patterns in image datasets. In addition to
the convolutional and pooling layers, it is very common for CNNs also to have
one, if not several, “dense” fully connected layers (discussed below in Section
4.5).
In our study, we tested the performance of CNNs with and without a dense
layer, and also the performance of CNNs with different numbers of nodes in their
dense layer. To build our CNN model, we drew on existing libraries of code
publicly available online (shown in Figure 7 below along with other libraries
used in this project).
Figure 8: A convolution being computed over an entire image [Bae23].
horizontal edges versus vertical edges in an image. Each image inputted into a
model will undergo a convolution with one or more filters to construct multiple
transformed images, known as “feature maps.” The kernel size of a convolutional
filter as well as the number of feature maps produced are hyperparameters in a
CNN. For our model, we set our convolutional layer to include 32 feature maps.
In section 4.4 below, we discuss how different kernel sizes affect a CNN model’s
performance. Additionally, for the activation function of our convolutional layer,
we used the “ReLU” (“Rectified Linear Unit”) activation function. ReLU is a
piecewise linear function that returns zero for inputs less than or equal to zero,
and returns the input unchanged if it is positive. Figure 9 shows a graph of
the ReLU function.
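To illustrate how a convolutional filter followed by ReLU produces a feature map, the following sketch slides a hand-written 2 x 2 kernel over a toy 4 x 4 image. It assumes a square kernel, no padding, and a stride of 1, and, as in most deep learning libraries, the kernel is applied without flipping. The image and kernel values are hypothetical; in a trained CNN the kernel weights are learned.

```python
def relu(x):
    """Return zero for inputs less than or equal to zero, else the input itself."""
    return x if x > 0 else 0.0

def convolve2d(image, kernel):
    """Slide a square kernel over the image (no padding, stride 1), then apply ReLU."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(k)
                for b in range(k)
            )
            row.append(relu(total))
        feature_map.append(row)
    return feature_map

# Hypothetical 4 x 4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
```

The resulting 3 x 3 feature map is nonzero only in its middle column, which is where the vertical edge sits in the image.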
4.4 Max Pooling Layer
In CNN models, a pooling layer is a hidden layer that is placed after a convolutional
layer. Pooling layers are used to reduce the number of parameters in a
model, thereby decreasing the model’s runtime and the likelihood of
overfitting. There are two types of pooling layers: “max” pooling layers and
“average” pooling layers. A max pooling layer is generally used in CNN models
that will be used for object recognition; it is useful for identifying distinctive
features in an image such as edges and corners. By contrast, an average pooling
layer is generally used in CNN models that will be used for object detection
and image segmentation. Pooling layers reduce the number of parameters in a
CNN model by reducing the dimensions of feature maps generated by the con-
volutional layer. A max pooling layer, similar to a convolutional layer, scans its
input, a feature map, taking the maximum value from different regions of the
map. Figure 10 below illustrates the way a max pooling layer works.
Figure 10: Output of a max pooling layer, showing reduction of the dimensions
of the original feature map [Kho21].
A filter size of 2 x 2 indicates that the max pool will survey a 2 x 2 region
of the feature map and take only the maximum value. A “stride size” of 2 in
a given direction indicates that the max pool filter will move two units right or
two units down to survey another region. We built a max pooling layer and set
the stride and filter, both hyperparameters, to have a size of 2 x 2.
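A max pooling layer with a 2 x 2 filter and a stride of 2 can be sketched in a few lines. The function name and the toy feature map are illustrative; the sketch assumes no padding, so any rows or columns that do not fill a complete window are dropped.

```python
def max_pool(feature_map, size=2, stride=2):
    """Take the maximum of each size x size window, moving by `stride` each step."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Toy 4 x 4 feature map (hypothetical values).
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool(feature_map)
```

The 4 x 4 input is reduced to a 2 x 2 output, so the next layer receives a quarter as many values.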
dense layer are connected to every node in the preceding layer, the dense layer
is able to find relationships between separate parts of the image. In our model,
we included a single hidden dense layer before our output layer.
Because dense layers can receive inputs only in the form of one-dimensional
arrays, before adding a hidden dense layer in our model, it was necessary to
include a “flattening layer.” A flattening layer will compress multi-dimensional
arrays that describe images into single dimensional arrays.
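As a brief illustration of what a flattening layer does, the sketch below turns a stack of two small feature maps into a single one-dimensional list. The values and the function name are hypothetical.

```python
def flatten(feature_maps):
    """Compress a list of 2-D feature maps into a single 1-D list."""
    return [value for fmap in feature_maps for row in fmap for value in row]

# Two hypothetical 2 x 2 feature maps, e.g. the output of a pooling layer.
feature_maps = [
    [[4, 2], [2, 8]],
    [[0, 1], [3, 5]],
]

flat = flatten(feature_maps)
```

Each value in `flat` then becomes one input to every node of the dense layer that follows.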
Figure 11: Graph of the sigmoid function for our model, showing a decision
boundary of 0.5 [Sah21].
Different types of loss functions are used in different contexts. For predicting
continuous variables or outcomes, we often use the mean squared error, while in
classification problems, “cross-entropy loss” is commonly used. In problems with
two outcomes, this function is called the “binary cross-entropy loss function.” If
there are multiple outcomes, the categorical cross-entropy loss function is used.
Because we have a binary problem (predicting stars or galaxies), we used the
binary cross-entropy loss function, shown in Figure 12 below.
For binary classification problems, each data point (or image in our case)
exists in one of two possible classes (a class associated with 0 or with 1, as
discussed in Section 3.3 above). In the equation in Figure 12, y corresponds to
the target class label (0 or 1), and p corresponds to the predicted probability
of class 1. As demonstrated by this equation, the loss function is minimized
when the model assigns the highest possible probability to the correct
class.
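For reference, the standard form of the binary cross-entropy loss for a single data point is L = -[y log(p) + (1 - y) log(1 - p)], with y and p as defined above; we assume here that this matches the equation in Figure 12. The sketch below evaluates it for a few predictions. The small constant `eps` is an implementation detail added to avoid taking the logarithm of zero.

```python
import math

def binary_cross_entropy(y, p, eps=1e-12):
    """Loss for one data point with true label y (0 or 1) and predicted probability p of class 1."""
    p = min(max(p, eps), 1.0 - eps)  # keep p strictly between 0 and 1
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

# True label is 1 (galaxy): the loss shrinks as the predicted probability rises.
confident_right = binary_cross_entropy(1, 0.9)
unsure = binary_cross_entropy(1, 0.5)
confident_wrong = binary_cross_entropy(1, 0.1)
```

A confident correct prediction gives a small loss, an undecided prediction of 0.5 gives log(2), and a confident wrong prediction is penalized most heavily.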
After we have chosen a loss function, we need to select a method for the
neural network to update itself once it knows the scores produced by the loss
function. To optimize our model, we used the “Adam” optimizer (“Adaptive
moment estimation”), a variant of stochastic gradient descent that adapts
the size of each weight update using running averages of past gradients.
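The Adam update rule can be sketched for a single weight as follows. This is a minimal illustration on a toy loss, not the optimizer configuration used for our model: the loss function, the learning rate of 0.1, and the number of steps are assumptions, while the decay rates `beta1`, `beta2` and the constant `eps` are the commonly used default values.

```python
import math

def adam_minimize(grad, w, steps=500, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a one-dimensional loss with Adam, given its gradient function."""
    m = 0.0  # running average of gradients (first moment)
    v = 0.0  # running average of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # correct the bias from starting at zero
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Toy loss (w - 3)^2, whose gradient is 2 * (w - 3); the minimum is at w = 3.
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w=0.0)
```

In a neural network the same update is applied to every weight, with the gradient of the loss computed on a batch of training examples.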
After creating the baseline architecture for our CNN model, we tested the
performance of different CNN models with different combinations of hyperparameters.
We adjusted the kernel size and the number of nodes in our dense layer, and
also tested the performance of a model without any hidden dense layer at all.
Next, using a hidden dense layer with 64 nodes, we tested how adjusting the
hyperparameter “kernel size” would affect the performance of our model. We
tested the performance of models with kernel sizes of 2, 3, 4, 5, and 6. We
found that the CNN model with a kernel size of 6 performed best, achieving an
accuracy of 82.6 percent on its validation set. Table 4 below shows the relative
performance of CNNs with different kernel sizes.
Table 4: Relative performance we obtained for CNNs with different kernel sizes.
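To see how kernel size interacts with the rest of the architecture, it can help to work through the layer sizes and parameter counts. The sketch below does this arithmetic under several assumptions that are not stated in the text: single-channel (grayscale) 64 x 64 inputs, no padding, a convolution stride of 1, one convolutional layer with 32 feature maps, one 2 x 2 max pooling layer, a 64-node dense layer, and a single output node. The function name is hypothetical.

```python
def cnn_shapes(image_size=64, kernel_size=3, n_filters=32, pool=2, dense_nodes=64):
    """Layer sizes and parameter counts for a small CNN (see assumptions above)."""
    conv_out = image_size - kernel_size + 1   # no padding, stride 1
    pool_out = conv_out // pool               # 2 x 2 pooling with stride 2
    flat = pool_out * pool_out * n_filters    # length of the flattened array
    # Each filter has kernel_size^2 weights plus one bias (one input channel).
    conv_params = n_filters * (kernel_size * kernel_size + 1)
    # Each dense node has one weight per flattened input plus one bias.
    dense_params = dense_nodes * (flat + 1)
    output_params = dense_nodes + 1
    return {
        "conv_out": conv_out,
        "pool_out": pool_out,
        "flat": flat,
        "total_params": conv_params + dense_params + output_params,
    }

small_kernel = cnn_shapes(kernel_size=3)
large_kernel = cnn_shapes(kernel_size=6)
```

Under these assumptions, a larger kernel adds a few weights to the convolutional layer but shrinks the feature maps, so the dense layer, which dominates the parameter count, becomes smaller.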
Table 5: Performance of our logistic regression model and number of parameters
in the model.
Table 6: RFC performance with minimum samples per leaf fixed at 1, when the
number of trees is adjusted.
Table 7: RFC performance with a fixed number of 100 trees, when the minimum
samples per leaf is adjusted.
Figure 14: Graph showing the “Cumulative Variance” and “Explained Variance
By Component” obtained after we applied PCA to our data.
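The quantities plotted in such a graph can be illustrated on a toy data set with only two features, where the principal components can be found in closed form. This is a simplified sketch: the data points and function name are hypothetical, and for image data with thousands of pixel features one would normally rely on a numerical library rather than the 2 x 2 formula used here.

```python
import math

def explained_variance_2d(points):
    """Explained variance ratio of each principal component for 2-feature data."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the 2 x 2 sample covariance matrix.
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Eigenvalues of a symmetric 2 x 2 matrix, largest first.
    half_trace = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = [half_trace + radius, half_trace - radius]
    total = sum(eigenvalues)
    ratios = [ev / total for ev in eigenvalues]
    cumulative = [sum(ratios[: i + 1]) for i in range(len(ratios))]
    return ratios, cumulative

# Two strongly correlated features (hypothetical values).
points = [(1, 1.0), (2, 2.1), (3, 2.9), (4, 4.2), (5, 4.8)]
ratios, cumulative = explained_variance_2d(points)
```

Because the two features are almost perfectly correlated, the first component alone explains more than 99 percent of the variance, which is the situation in which discarding later components loses little information.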
worse on the test data than on the training data (100 percent training accuracy,
compared to 80 percent test set accuracy), the training set accuracy (88 percent)
of this smaller model was much closer to the accuracy of the validation and test
sets (85 percent and 84 percent respectively). Table 8 below summarizes our
results.
Table 8: Summary of the results of the neural network model applied to our data
after preprocessing with PCA, which gave the best results of the four models
we tested.
ical image datasets. Additionally, future work could include analyzing images
that our models misclassified, to determine whether there is a pattern to the
misclassification of certain images by models of totally different structure. It
is possible that by analyzing images misclassified by our models, we might find
that certain images were initially misclassified by the SDSS database. Finally,
the model could be trained and tested on a data set where quasars are distin-
guished as a separate category from galaxies, to determine the ability of the
model to distinguish quasars from stars and galaxies. Such an analysis could
potentially also help address the question whether the misclassifications in our
model resulted, at least in part, from the inability to distinguish stars from
quasars.
References
[Agr21] D. Agrawal. Star-galaxy classification data. Kaggle,
https://www.kaggle.com/datasets/divyansh22/dummy-astronomy-data,
June 12, 2021.
[Bae23] Baeldung. What is the purpose of a feature map in a convolutional
neural network? https://www.baeldung.com/cs/cnn-feature-map,
May 24, 2023.
[Cea20a] A.O. Clarke et al. Identifying galaxies, quasars, and stars with machine
learning: A new catalogue of classifications for 111 million SDSS
sources without spectra. Astronomy and Astrophysics, 639, A84,
2020.
[ED23] K. Erickson and H. Doyle. How many solar systems are in our galaxy?
National Aeronautics and Space Administration Science Space
Place, https://spaceplace.nasa.gov/other-solar-systems/en/, August
20, 2023.
[Facteb] Australia Telescope National Facility. Spectroscopy:
Unlocking the secret in starlight.
https://www.atnf.csiro.au/outreach/education/senior/astrophysics/spectroscopytop.html,
No date.
[Iea19] Z. Ivezic et al. LSST: From science drivers to reference design and
anticipated data products. The Astrophysical Journal 873(2), 111,
March 2019.
[KS98] B.K. Kennedy and D.P. Schneider. Quasar discovered with
x-rays is long ago and far away. Pennsylvania State University.
https://science.psu.edu/news/quasar-discovered-x-rays-long-ago-and-far-away,
March 26, 1998.
[Sea11] R. Sagar et al. The new 130-cm optical telescope
at Devasthal, Nainital. Current Science, 101, 1020.
https://www.jstor.org/stable/24079276, 2011.
[Surtea] Sloan Digital Sky Survey. The Sloan Digital Sky Survey.
https://classic.sdss.org/home.php, No date.
[Surteb] Sloan Digital Sky Survey. The Sloan Digital Sky Survey.
https://www.sdss4.org, No date.
[Surtec] Sloan Digital Sky Survey. The Sloan Digital Sky Survey-V: Pioneering
panoptic spectroscopy - SDSS-V. https://www.sdss.org/, No date.
[VBea01] D.E. Vanden Berk et al. Composite quasar spectra from the sloan
digital sky survey. The Astronomical Journal 122(2), 549, 2001.
Life and Death: The Effect of Biases and
Heuristics on Medical Decision Making
Rajeev Krishnamurthy 1†
Abstract
Doctors’ decision making is affected by a variety of cognitive shortcuts and
biases. Five biases and heuristics extremely relevant to medical decision
making are the availability heuristic, anchoring, sunk cost bias, omission bias,
and status quo bias. By conducting literature reviews involving the analysis
and evaluation of largely quantitative data, this paper analyses these five
biases and the extent to which they affect doctors, as well as the roles they
play in medicine. Finally, this paper recommends a range of policies which aim
to alleviate the negative effects of these heuristics and biases on medical
decision making.
1 Introduction
The average doctor-patient consultation takes a mere 18 minutes [ea20a], so it is
perhaps not surprising that misdiagnosis is the largest single cause of adverse
medical events in the USA, accounting for 34% of the country’s total medicolegal
claims [LB11]. The current medical system incentivizes doctors to process the
highest number of patients possible, and, in order to accomplish this, they
unconsciously utilize a number of cognitive shortcuts, or heuristics. While this
does indeed speed up the medical processes, it also makes doctors vulnerable to a
number of errors and biases – heuristics offer the easiest path from a problem to
its solution for the brain, not the most methodical or careful one [ea08]. This can
cause anything from a doctor over-diagnosing epilepsy because he took a course
on it a week earlier, to one continuing an incorrect medical treatment because of
previous investment of time and money into it. Furthermore, the patients whom
doctors are treating may have biases too – which can influence doctors and which
they must compensate for.
This paper will provide an overview of heuristics and biases within the medical
establishment, using both hypothetical and real-world examples to illustrate their
causes and effects. By reviewing a wide range of previous literature, the
mechanisms behind these biases can be explored thoroughly, and various policies
1 Advised by: Dr. Edoardo Gallo of the University of Cambridge †The
International School Bangalore
intended to reduce the effects of biases on and increase the accuracy of doctors’
decision-making will be evaluated. Finally, I will provide systemic
recommendations which aim to significantly alleviate the negative impact that
biases and heuristics have on the medical establishment. This paper will focus on
five cognitive drivers: the availability heuristic, anchoring, sunk cost bias, status
quo bias, and omission bias. It contains four main sections: Section 2 focuses
on the availability heuristic, Section 3 on anchoring, Section 4 on the sunk cost
bias, and Section 5 on the status quo and omission biases. Each section will
consist of three subsections: the first will define
the bias or heuristic covered, the second will be a literature review, and the third
will comprise its implications for medical decision making, and recommendations
which could potentially alleviate its harmful effects.
2 Availability
2.1 Defining the availability heuristic
Li et al. define the availability heuristic as “the tendency to overestimate the
likelihood of events when they readily come to mind”. It is an example of base rate
neglect – a phenomenon that occurs when people tend to ignore statistical
averages in favor of new information [KT73]. A real-world example of this is as
follows: students who were asked to recall 12 examples of their own assertive
behavior rated themselves as less assertive than students who were asked to
recall 12 examples of their unassertive behavior [ea91]. Twelve examples of
the stated behavior were not easily available to the students, leading them to
underrate themselves. In medicine, the availability heuristic could present as a
physician who spent years specializing in tuberculosis being more likely than
generalist peers to misdiagnose similarly presenting disorders as tuberculosis
[ea20b].
terms – a relative increase of 15 percent – in the 10 days following a diagnosis
[Ly21]. In the following 50 days, however, no statistically significant change was
found. Ly acknowledged that, given the study’s 95 percent confidence
intervals, an increase below the 5 percent level could not be ruled out [Ly21]. Ly
concluded that “These results are consistent with the availability heuristic
influencing physician decision making in relation to pulmonary embolism
diagnoses”.
Li et al. approached their study with a different method – it involved 46 internal
medicine residents, divided into two groups, with one being the experimental (EG)
and the other the control (CG) [ea20b]. Prior to the experiment, the EG was asked
to analyze an article on dengue fever, and then completed a test on it. The control
group, however, did not receive any of this information and directly participated
in Stage 2 of the study, which occurred six hours later [ea20b]. Li and his
colleagues mention that “great care was taken to ensure that stage 2 appeared to
be an unrelated study” [ea20b]. The participants were presented with and asked
to diagnose eight clinical cases – one of which was dengue fever, three of which
appeared similar to it but were actually different conditions, and the remainder of
which were unrelated to dengue fever. Finally, in the third stage, participants
received three experimental cases and one filler that they had previously
diagnosed and were encouraged to reflect on their previous diagnoses and change
them if they felt they were incorrect in order to test whether reflection would
compensate for availability heuristic-caused errors [ea20b]. Participants were
assigned a score of 1 for each correct diagnosis and 0 for each incorrect one,
and the mean scores of each group were compared.
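The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the case labels, and the answers are hypothetical and are not taken from the study, which reports only the resulting group means.

```python
from statistics import mean


def mean_accuracy(diagnoses, truths):
    """Score 1 for each correct diagnosis and 0 for each incorrect one, then average."""
    scores = [1 if given == truth else 0 for given, truth in zip(diagnoses, truths)]
    return mean(scores)


# Hypothetical cases and answers, invented for illustration only.
truths = ["dengue", "flu", "malaria", "typhoid"]
experimental_group = ["dengue", "dengue", "malaria", "dengue"]
control_group = ["dengue", "flu", "malaria", "dengue"]

eg_score = mean_accuracy(experimental_group, truths)  # 2 of 4 correct
cg_score = mean_accuracy(control_group, truths)       # 3 of 4 correct
print(eg_score, cg_score)
```

In this made-up example the experimental group over-assigns the primed diagnosis, which lowers its mean score relative to the control group; that is the pattern such a comparison of means is designed to detect.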
In the second stage of the study, the CG significantly outperformed the EG in
the experimental cases, 0.80 to 0.66, and slightly underperformed it in the filler
cases, 0.59 to 0.64 [ea20b]. The EG misdiagnosed significantly more cases as
dengue fever than the CG. Additionally, the participants did not show a statistically
significant difference in accuracy after performing reflective reasoning [ea20b]. Li
and his colleagues concluded that “the availability bias seemed to account for the
bulk of diagnostic errors and was not well repaired by reflective reasoning”
[ea20b].
et al. – a man was incorrectly diagnosed with COVID-19 despite three negative
tests, resulting in him being given excessive doses of antibiotics and requiring
supplemental oxygen before being correctly diagnosed and eventually discharged
[ea22b]. They described the availability bias as a “significant contributor to poor
patient outcomes” and encouraged physicians to be aware of it in order to avoid
“inadvertently affecting patient outcomes” [ea22b].
Using reflection and taking additional time to diagnose is not an effective
method against this heuristic, resulting in no statistically significant improvement
in the accuracy of diagnosis [ea20b]. A possible workaround could be to consult
with another doctor who has not seen or diagnosed a recent case of the disease,
as only diagnoses of the exact disease cause availability bias, not ones similar to it
[Ly21]. However, this would likely not be cost-effective, and it might be difficult to
find an unbiased doctor in the case of a common condition, as a result of the
relatively long-lasting nature of the bias [Ly21]. Additionally, as the availability
bias is an example of base rate neglect [KT73], consulting base rates and ensuring
that statistical overdiagnosis is not taking place could be an effective tool for
doctors to mitigate the effects of the availability heuristic.
Finally, the rise of artificial intelligence raises the possibility of
consulting neural networks in the future to ensure that opportunities for
differential diagnosis are presented and base rate neglect is avoided [ea21].
Patient details and symptoms reported would be processed by the system, which
would present several diagnoses to the doctor, considering their rates of
occurrence in the general population as well as their likelihood based on patient
history and the symptoms presented. By presenting base rates to doctors, base
rate neglect would be mitigated, and, as a result of the system itself theoretically
not being subject to human heuristics and cognitive shortcuts, the “second opinion”
provided by it would provide an effective antidote to the doctor’s availability bias.
However, this would not be a silver bullet – as mentioned previously, taking
additional time to reflect after diagnosis did not mitigate the bias’s effects on
doctors, so the system would likely need to provide fairly forceful suggestions to
doctors and play a relatively large role in the decision making process in order to
have an impact. Moreover, at the end of the day, systems are merely an aid to
doctors, and a biased doctor will inevitably make biased decisions – while using
AI as an aid might improve the accuracy of diagnoses, systemic training in order
to make doctors less susceptible to the effects of the availability heuristic would
still be necessary.
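The idea of weighing each candidate diagnosis by both its base rate and its fit to the presented symptoms can be illustrated with a simple Bayes' rule calculation. This sketch is not the neural-network system envisaged above; it substitutes a much simpler technique, assumes the candidate conditions are mutually exclusive and exhaustive, and uses hypothetical names and numbers throughout.

```python
def rank_diagnoses(base_rates, likelihoods):
    """Rank candidate diagnoses by posterior probability using Bayes' rule.

    base_rates:  P(condition) in the relevant population (assumed values).
    likelihoods: P(observed symptoms | condition) (assumed values).
    """
    joint = {name: base_rates[name] * likelihoods[name] for name in base_rates}
    total = sum(joint.values())
    posteriors = {name: value / total for name, value in joint.items()}
    return sorted(posteriors.items(), key=lambda item: item[1], reverse=True)


# Hypothetical numbers: the rare condition fits the symptoms better,
# but the common condition is far more prevalent.
base_rates = {"common condition": 0.10, "rare condition": 0.001}
likelihoods = {"common condition": 0.30, "rare condition": 0.90}

ranking = rank_diagnoses(base_rates, likelihoods)
for name, posterior in ranking:
    print(f"{name}: {posterior:.3f}")
```

Even though the rare condition matches the symptoms three times as well, the common condition remains far more probable once base rates are taken into account, which is precisely the information a doctor under the availability heuristic tends to neglect.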
Additionally, precautions would need to be taken while integrating AI into the
medical decision-making process. It has been demonstrated that, when trained on
biased datasets, AI-based systems can produce biased results [ea19]. However,
measures to alleviate these inherent biases do exist [ea19], and would need to be
integrated into a hypothetical medical system in order to provide relatively
unbiased advice to help mitigate the effects of the availability heuristic upon
doctors.
3 Anchoring
3.1 Defining Anchoring
Anchoring was originally described by Tversky and Kahneman as the tendency of
people to make estimates “by starting from an initial value that is adjusted to yield
the final answer”; this adjustment is “typically insufficient” [TK74]. An example of
this is as follows: when estimating the percentage of African countries in the
United Nations, participants who were anchored with the value “65” gave a median
estimate 20 percentage points higher than participants anchored with the value
“10” [TK74]. Dargahi et al. define anchoring in the context of
medicine as “the excessive weighting of initial information and the inability to
adjust the initial diagnostic hypothesis when further information becomes
available” [ea22a]. A hypothetical example of this could be a doctor misdiagnosing
a patient with depression because the patient seemed depressed to him upon first
impressions, and the doctor did not make a sufficient adjustment away from the
first impression.
decreased consistently with increased experience, the frequency of anchoring-
induced errors “seemed... independent of training and level of ability”. They
recommended that “physicians should encourage independent review of their
conclusions and realize that knowledge provides no shield against premature
closure”, and that anchoring might be able to be avoided with “good interrater
ability” [ea85].
4 Sunk Cost Bias
4.1 Defining sunk cost bias
Bornstein et al. define sunk cost bias as occurring “when a decision maker
continues to invest resources into a previously selected action or plan even after
the plan has proven to be the suboptimal option” [ea99]. This, for example, could
take the form of sitting through a boring movie in order to “get more value” out of
your ticket. This may seem logical; however, by continuing to sit through the
movie, you are sacrificing your future enjoyment as well. Thus, regardless of the
sunk cost, the best option is always to switch immediately to the optimal course of
action. In medicine, sunk cost bias could take the form of a doctor continuing an
ineffective prescription because their patient has already spent time at their
office and money on the medicine.
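The reasoning behind ignoring sunk costs can be made concrete with a small sketch. Because the sunk cost has already been paid whichever option is chosen, it lowers the total of every option by the same amount and so cannot change which option is best. The option names and benefit values below are hypothetical and serve only to illustrate this point.

```python
def best_option(future_net_benefit, sunk_cost):
    """Return the option with the highest total value.

    The sunk cost is subtracted from every option equally, because it has
    already been incurred no matter which option is chosen.
    """
    totals = {name: value - sunk_cost for name, value in future_net_benefit.items()}
    return max(totals, key=totals.get)


# Hypothetical expected future benefits of two courses of action.
options = {"continue current treatment": 2.0, "switch to better treatment": 5.0}

choice_with_no_investment = best_option(options, sunk_cost=0.0)
choice_with_large_investment = best_option(options, sunk_cost=1000.0)
print(choice_with_no_investment, "|", choice_with_large_investment)
```

Both calls select the same option, however large the prior investment, which is why a decision maker free of sunk cost bias would compare only the future consequences of each course of action.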
[BBB12]. The result consistent with the sunk cost effect would have been
for the doctors in the scenarios with the most investment to recommend
continuation. However, the opposite transpired: the doctors in the scenario
with no investment were the most likely to recommend continuing the treatment,
an “overcompensation” for the sunk cost effect [BBB12]. In spite of this, 11% of
those surveyed stated that they would recommend continuing the treatment –
which the authors describe as “unrealistic optimism”. The authors hypothesized
that “the participants’ response to the scenario given in the study may not be
reflective of their behaviour when faced with a similar situation in practice” due
to the “close-ended nature of the available responses” and concluded that “further
research is necessary” [BBB12].
a doctor choosing not to prescribe a patient a new, improved medication as the
patient had been on the previous medication for several years; one of omission
could involve not treating a patient who is having a heart attack as they are being
treated for pneumothorax already.
physicians “unaware that their first treatment decision will be reviewed by
another”, or, finally, to ask physicians to “consider why the preferred option may
be wrong” [CS21].
6 Conclusion
While most doctors get the vast majority of their diagnoses and prescriptions right,
the consequences of failure are so severe that any rate of misdiagnosis and failure
to pursue optimal courses of action is too high. In order to make our medical
decision making process as sound as possible, the impact of a variety of cognitive
shortcuts and biases that doctors utilize, such as availability, anchoring, omission
bias, and status quo bias on the process must be minimized through personal and
systemic change. This would be on the parts of patients, doctors, and the medical
establishment, and would involve awareness campaigns, doctor training, and
additional measures intended to help doctors make more objective judgements.
However, the impacts of biases like the sunk cost effect remain unresearched and
unknown, and real-world observational studies must be conducted in order to
reveal their effects and develop recommendations.
References
[BBB12] Jennifer A. Braverman and J.S. Blumenthal-Barby. Assessment of the
sunk-cost effect in clinical decision-making. Social Science & Medicine,
2012.
[ea22a] Helen Dargahi et al. Anchoring errors in emergency medicine residents
and faculties. Medical Journal of the Islamic Republic of Iran, 2022.
[ea22b] Kwaku Kyere et al. Availability bias and the covid-19 pandemic: A case
study of legionella pneumonia. Cureus, 2022.
[KT73] Daniel Kahneman and Amos Tversky. On the psychology of prediction.
Psychological Review, 1973.
[LB11] David Levine and Alan Bleakley. Misdiagnosis: Analysis based on case
record review with proposals aimed to improve diagnostic processes.
Clinical Medicine, 2011.
[Ly21] Dan P. Ly. The influence of the availability heuristic on physicians in the
emergency department. Annals of Emergency Medicine, 2021.
[RB92] Ilana Ritov and Jonathan Baron. Status-quo and omission biases. Journal
of Risk and Uncertainty, 1992.
[RH21] Rita W. Rehana and Najia Huda. A common heuristic in medicine:
Anchoring. Annals of Medical and Health Sciences Research, 2021.
[SZ88] William Samuelson and Richard Zeckhauser. Status quo bias in decision
making. Journal of Risk and Uncertainty, 1988.
[TK74] Amos Tversky and Daniel Kahneman. Judgment under uncertainty:
Heuristics and biases. Judgment under Uncertainty, 1974.
Mesenchymal Stem Cell-Derived Exosomes and
Their Therapeutic Potential on Parkinson’s
Disease
Dorsa Arbabha 1†
Abstract
Parkinson’s disease (PD) is characterized by the degeneration of
dopaminergic neurons in the substantia nigra, resulting in dopamine depletion
and a spectrum of motor and non-motor symptoms in patients. Mesenchymal
stem cells (MSCs) have garnered attention for their therapeutic potential across
various diseases. They can differentiate into various cell types, including
dopaminergic cells, and secrete neurotrophic and anti-inflammatory factors
with robust neuroprotective properties. In PD, midbrain dopaminergic neurons
express miR-133b, a crucial regulator of tyrosine hydroxylase and dopamine
transporter synthesis. MSCs facilitate interactions with brain parenchymal cells
by transferring miR-133b via exosomes, promoting neurite outgrowth and
functional recovery. Notably, studies have demonstrated elevated dopamine
levels and its metabolites in the striatum of PD rats following treatment with
these exosomes. This review examines mesenchymal stem cell-derived
exosomes, their unique attributes, and their potential as a promising therapeutic
avenue for PD.
1 Introduction
1 Advised by Dr. Arij Daou, University of Chicago †Cagaloglu
Anatolian High School
thousands of patients every year, and a cure for this disease has yet to be found.
[Elb16]. There remains a demand for new and innovative treatments in the field.
This review aims to summarize how MSC-EXOs can be used in treating PD,
explaining the connection and science behind the existing successful studies
regarding them and laying the groundwork for future research and development
in this field.
with PD also face an elevated risk of developing dementia. Although surgical
procedures like deep brain stimulation (DBS) and various pharmaceutical
therapies have increased in recent decades, there remains a pressing need for the
development of effective and accessible disease-modifying medications.
a. Motor Symptoms
b. Non-Motor Symptoms
complex neuropsychiatric disorder. The indicators and symptoms fall into three
major categories: affect (such as anxiety and depression), perception and thought
(such as psychosis), and motivation (such as impulse control disorders and
apathy). [Wei22]. Another non-motor manifestation of PD is autonomic
dysfunction, which includes gastrointestinal dysfunction, cardiovascular
dysregulation, urinary disturbance, sexual dysfunction, thermoregulatory aberrance,
and pupillo-motor and tear abnormalities. [Che20b]. The regulation of sleep and
wakefulness depends on the coordinated and highly complex operation of
numerous brain regions and neurotransmitters, many of which have been
demonstrated to be compromised in people with PD. Given this
pathophysiological context, it is not unexpected that sleep and wakefulness
problems are virtually always present in PD patients. [Ste20]. A survey study in
1988 revealed that 98 percent of PD patients had disabilities at night or upon
waking since the onset of their disease, and disturbed wakefulness regulation was
shown to be a prominent feature in up to 30 percent of PD patients. [lee88].
Additionally, sensory symptoms are prevalent in PD, which generally tend to affect
the side of the body that was first or more severely disrupted by the motor
fluctuations. These symptoms include musculoskeletal pain, dystonic pain,
akathisia, CNP, olfactory disturbance, and visual dysfunctions. [Zhu16]. Other
underappreciated non-motor aspects of PD are anosmia, the loss of smell, and
ageusia, the loss of the sense of taste. [Tar17]. Many PD patients also suffer from
cognitive deficits, with a systematic review revealing that 36 percent of newly
diagnosed patients suffer from cognitive impairment. [Aar05]. This condition,
known as Parkinson’s disease dementia (PD-D), is primarily associated with older
age at the disease onset or time of evaluation. [Han17]. In addition to cognitive
dysfunctions, the clinical features of PD-D include behavioral symptoms,
autonomic dysfunctions, sleep disorders, and parkinsonism. [Sez19].
It has been observed that men are more susceptible to PD compared to women.
The meta-analysis results of 7 door-to-door studies show that the ratio of male-
to-female PD cases is 1.49, with a 95 percent confidence interval between 1.24 and
1.95. [Woo04]. Several factors have been proposed as potential explanations for
this gender difference, including the protective effects of estrogens, the higher
frequency and intensity of occupational toxin exposure, more prevalent minor
head trauma in males, and recessive susceptibility genes of the X chromosome.
[Wir11a]. Studies also show that the risk of death in PD differs between men and
women, estimated at 2 percent for men and 1.3 percent for women. Overall, the
mortality risk was estimated at 1.7 percent after age 40, and the gender difference
decreased with the increase in age. [Elb02].
and predispose to the development of PD. For example, the Parkin gene, one of
the genes responsible for recessive familial PD, is involved in the formation and
progression of cancer. [Xu14]. [Elb16].
For over three decades, numerous studies have shown an inverse relationship
between smoking and PD. A meta-analysis of 44 case-control and four cohort
studies has revealed that ever-smokers had a 60 percent lower risk of PD than
never-smokers. [Her02]. Similarly, a study of individual data from eight
case-control studies and three cohort studies from the US corroborated this
relationship. [Rit07]. Remarkably, this inverse relationship has been observed
even among subjects exposed to pesticides. [Gal05]. [Bre16]. According to another
cohort study, the duration of smoking appeared to be more significant than the
intensity of smoking. This finding is important because it shows that several years
of smoking may be necessary before noticing a decreased risk of PD. [Che10a].
However, cohort studies of PD patients indicate that smoking has little impact on
the progression of the illness. [Alv04].
The idea that PD and pesticide exposure were related first surfaced when many
cases of Parkinsonism followed intravenous MPTP injections in the early 1980s. In
dopaminergic cells, MPTP is converted into 1-methyl-4-phenylpyridinium (MPP+), a
mitochondrial respiratory chain inhibitor with neurotoxic characteristics. The
molecule resembles the chemical structure of paraquat, a nonselective herbicide that
has been in use since the 1960s and remains extensively employed. Given these
observations, numerous research studies have delved into the connection between
farming, pesticide exposure, and PD. [Elb16]. According to a meta-analysis of 46
studies, people exposed to pesticides have an approximately 1.6 times increased
chance of developing PD. Among different types of pesticides, herbicides and
insecticides have a stronger connection with PD despite significant study
heterogeneity. While fungicides exhibited a weaker correlation, less research has
investigated their potential link to the disease. [vdM12].
2.1.3 Pathology
The substantia nigra, pars compacta (SNpc), an essential component of the basal
ganglia, is the primary brain region damaged by PD. Dopamine is a necessary brain
monoamine that primarily serves as an inhibitory neurotransmitter, and this
region is predominantly made up of neurons that secrete it. In a healthy brain,
dopamine controls the excitability of striatal neurons, which are essential in
regulating the balance of bodily movement. However, in PD, dopamine levels
decrease, and SNpc dopamine neurons deteriorate. [Ger89]. Low levels of
dopamine result in reduced inhibition of striatal neurons’ activity, allowing them
to fire excessively. This underlying mechanism elucidates why individuals with PD
are unable to control their movements, experiencing tremors, stiffness, and
bradykinesia, which are the hallmarks of PD-related motor symptoms.
The intracellular buildup of Lewy bodies in dopamine neurons of the SNpc,
which contain misfolded aggregates of alpha-synuclein (SNCA) and other related
proteins, is one of the defining features of PD. [Car14]. Several molecular, genetic,
and biochemical studies have shown that post-mortem human brains from
patients with mixed dementia with Lewy bodies (DLB) and PD with dementia
(PDD) who were diagnosed neuropathologically are frequently found to contain a
variety of misfolded protein aggregates, including p-tau, A-beta, and SNCA.
(Stefanis, 2012). According to research, amyloid deposition in some PD patients’
brains has been associated with cognitive decline without dementia, indicating
that amyloid contributes to cognitive but not motor decline over time. [Iba17]. It
has also been discovered that the load and extent of A-beta pathology influence
cognitive deficits in PDD and DLB. [Bla16]. Alpha-synuclein or other misfolded
amyloid proteins can kill neurons by creating a pore in the membrane and
inducing neuroinflammation, excitotoxicity, oxidative stress, and energy failure.
[Mar12]. Oxidative stress has been associated with various general mitochondrial
abnormalities, including variations in the dynamics and shape of the mitochondria,
mutations in the mitochondrial DNA, and abnormalities in calcium homeostasis.
Dysfunctional mitochondria can result in decreased energy production, the
creation of reactive oxygen species, and the activation of stress-induced apoptosis.
[Sub13].
they are most closely connected with Alzheimer’s and can co-localize with alpha-
synuclein in Lewy bodies. [Hep16].
Medication is the most common treatment for PD, with patients
receiving different doses of several generic drugs. Although there is still no
definitive cure for PD altogether, some drugs can help slow the course of the
disease or alleviate some of the symptoms. PD drugs are generally categorized as
dopaminergic and non-dopaminergic drugs.
Most of the drug treatments mentioned have significant side effects and
provide only momentary relief, particularly for certain patient types.
They are also powerless to halt additional dopaminergic neuron loss. Therefore,
some clinicians turn to surgical procedures to lessen the motor symptoms when
medication is deemed ineffective, especially in the late stages of the disease.
As some PD cases have been shown to be caused by multiple genes associated
with the disease, despite most PD cases being sporadic in origin, researchers have
been investigating gene therapy strategies as a potentially viable treatment option.
[Cou12]. Despite efforts, further research and trials are needed in this field to
show the viability of this type of treatment.
Before stem cell therapy can be authorized as a viable treatment for people
with PD, additional research must assess its safety and effectiveness. One concern
lies in the self-replicating ability of stem cells, which carries the risk of tumor
formation after clinical transplantation. [Zha23]. In this regard, stem cell
exosomes could be used as an alternative option, as explained in the upcoming
sections of this review.
2.2 Mesenchymal Stem Cell-Derived Exosomes
Stem cells are a class of cells that have long-term self-renewal abilities
and can differentiate into other cell types with more specialized
functions. Through this differentiation process, these cells maintain their DNA
structure while exhibiting distinct gene expression patterns in their specialized
roles. [Kol13]. Stem cells are distributed throughout nearly every adult organ,
where they are responsible for replacing the cells lost within these organs and
responding to any injury or disease in the tissue. In their differentiation pathways,
there are intermediate or progenitor states. These progenitor cell states can
influence the behavior of the cells surrounding them. Additionally, stem cells can
be engineered and modified in vitro to be differentiated into desired cells.
Leveraging these unique properties, stem cells have been widely researched in
tissue engineering and cell therapy fields. [Bac18].
The potential of stem cells differentiating into specialized cell types is known
as stem cell potency. Potency defines the ability of stem cells to adopt a different
phenotype. Stem cells can be categorized by their potencies as totipotent,
pluripotent, and multipotent. [Kol13]. Totipotent stem cells are relatively rare and
initially present in low amounts in the zygote. These stem cells can differentiate
into every cell type in the body and the placenta. Pluripotent stem cells are found
in the blastocyst and can differentiate into all body cell types other than the
placenta. Multipotent stem cells are more specialized and are found in three germ
layers: the ectoderm, endoderm, and mesoderm. They differentiate into different
cell types according to the germ layer that they originate from. In contrast,
unipotent stem cells exhibit long-term self-renewal and can reproduce in large
amounts. However, these cells are committed to differentiating into one specific
cell type. [Arb23].
MSCs are stromal cells that exhibit multilineage differentiation and have the
ability of self-renewal, akin to other types of stem cells. MSCs can be extracted
from various tissues, including adipose tissue, bone marrow, menstrual blood,
endometrial polyps, and the umbilical cord. [Din07]. These sources are the most
useful for experimental and potential clinical applications because of their
ease of extraction and yield. MSCs also raise fewer ethical issues than
other stem cells, such as induced pluripotent stem cells and embryonic stem
cells, owing to this ease of harvest. [Din11].
Under particular in vitro circumstances, MSCs can develop into diverse lineages
of mesodermal, ectodermal, and endodermal cells, including bone, fat,
chondrocyte, muscle, neuron, islet cells, and liver cells. [Ois09]. Additionally,
genetic processes involving transcription factors control differentiation. Some
regulatory genes that cause progenitor cells to differentiate into a particular
lineage can govern differentiation to a specific phenotypic route. [Bac18]. A
microenvironment created with biomaterial scaffolds can offer MSCs the ideal
circumstances for proliferation and differentiation in addition to growth factors
and induction chemicals. [Ser04]. Research has also found that adult human
MSCs can easily and directly be developed into dopaminergic neurons. [Kha19].
[Ven17].
2.2.2 Exosomes
Figure 6: Virtually all cells release exosomes, most commonly identified by the
tetraspanins CD9, CD81, and CD63 on their surface. Exosomes carry molecules
such as proteins, RNA, or DNA and mediate cell-to-cell communication. [Neund].
Due to their extremely small size, they can easily pass through compartments
and membranes. This mediation of cell-to-cell interaction by exosomes plays a
significant role in human metabolism and health, including the development of
immunity and the maintenance of homeostasis, the onset of malignancy, and the
development of numerous diseases. Viruses and other evading particles can use
these vesicle pathways to spread their infections. One remarkable attribute of
exosomes lies in their ability to be harnessed for targeted interventions and drug
delivery. Furthermore, their capability to traverse the blood-brain barrier
positions them as an excellent drug delivery pathway. [Zho23].
Exosomes also play a crucial role in paracrine signaling and are the primary
determinant of stem cell efficacy. Cell-free exosome therapy can overcome
numerous drawbacks of stem cells, such as limited stability and inconvenient
storage. Exosomes exhibit high biocompatibility, eliminating the risk of host
rejection and enabling precise dose control. [Gur21]. One significant feature of
exosomes lies in
their ability to transfer RNA to recipient cells and affect their proteome, functions,
and RNA expression. These processes are essential for controlling immunological
responses or various other pathological reactions through intercellular
communication. [Har13]. These RNA molecules include messenger RNA (mRNA)
and microRNA (miRNA), which affect the protein synthesis of the recipient cells in
the process of cell-to-cell communication. [Xin12].
MSC-EXOs
MSCs can differentiate into neural cells and
secrete several neurotrophic and anti-inflammatory substances after
transplantation, showing strong neuroprotective capabilities in diseases such as
amyotrophic lateral sclerosis, multiple sclerosis, PD, and glaucoma. [Joh10].
It is currently widely accepted that MSCs exert their therapeutic benefits
primarily through secreted trophic factors. Many researchers consider exosomes
to be the paracrine effectors of MSCs, given their involvement in cell-to-cell
communication. They have been tested in various disease models, and the
results have shown that they perform similar tasks to MSCs, including reducing
the size of myocardial infarctions, enabling kidney injury repair, modifying
immunological responses, and encouraging tumor growth. [Yu14].
MSC-EXOs have also been investigated with regard to their miRNAs. It
has been discovered that most of the miRNAs included in MSC-EXOs are in their
precursor form. [Che10b]. MSCs influence other cells biologically by secreting
miRNAs through these exosomes. Exosomes from MSCs administered to neurons
and astrocytes cause target cells to produce miR-133b, which aids the functional
recovery process in spinal cord injury and PD. This discovery indicates that MSCs
control neurite outgrowth by delivering miR-133b to neurons and astrocytes via
exosome release. [Xin12].
A study has found that MSC treatment dramatically increased the levels of
miR-133b in the ipsilateral hemisphere of rats that had undergone middle cerebral
artery occlusion (MCAo). In vitro, exosomes from MSCs that had been exposed to
ipsilateral ischemic tissue extracts from rats that had undergone MCAo
showed a substantial increase in miR-133b levels, as did primary
cultured neurons and astrocytes treated with these exosomes. However, treatment of the astrocytes with
exosome-enriched fractions from MSCs transfected with a miR-133b inhibitor
dramatically reduced miR-133b levels. This study stands as the first evidence that
MSCs interact with brain parenchymal cells via exosome-mediated miR-133b
transfer, controlling the expression of particular genes in order to promote neurite
outgrowth and functional recovery. [Xin12]. The research team later showed that
intravenous injection of MSC-EXOs can increase axonal density and
synaptophysin-positive areas along the ischemic boundary zone of the cortex and
striatum and hasten functional recovery in the same model as above, confirming
that MSC-EXOs could significantly improve neurologic outcome and contribute to
neurovascular remodeling. [Xin13].
4 Discussion
Despite being preliminary in terms of clinical application, stem cells have shown
significant therapeutic potential in a variety of diseases. The biggest shortcoming
of stem cells is their high instability. Because of their high potency, they carry
the risk of tumor formation in clinical applications. For this reason, the focus
has shifted to utilizing their exosomes in regenerative medicine.
Exosomes derived from stem cells have been shown to carry therapeutic abilities
on par with those of stem cells. As they cannot multiply on their
own and are highly adaptive, being able to survive in a variety of
environments, they are a much more stable option than stem cells for clinical
application and have high potential to be optimized for drug use. These
nanoparticles can easily cross membranes throughout tissues thanks to their small
size. As mentioned, exosomes play an essential role in cell-to-cell
communication and are regarded in many studies as the drivers of the
therapeutic effects of stem cells by enabling their miRNA transmission. For these
reasons, there has been a surge of recent research on these microvesicles as
potential treatments for many diseases. Similarly, many scientists have been
researching the effect of exosomes on neurodegenerative diseases, particularly AD
and PD.
growth and the impacts of immunogenicity. Finally, the therapeutic implications
of exosome formation under intervention during disease are uncertain because of
their intricate structure. Stem cell-derived exosomes are currently the subject of
preliminary research; however, their pharmaceutical use is hampered by the
varied composition and functional activity of spontaneously produced exosomes.
Exosomes derived under different conditions have been found to carry
different functional factors. The precise function of exosomes is likewise largely
unknown. [Yua18]. There remains a critical need for research on the precise role
and components of exosomes to advance studies on therapeutic
delivery for different diseases.
5 Conclusion
Due to their ease of harvest and high potential in tissue regeneration and
remodeling, MSCs have become highly anticipated stem cells and are being studied in a
variety of cell therapies. They carry the potential of differentiating into several
different cell lines, including dopaminergic cells, whose deterioration is a
prominent issue in PD patients. MSCs have also been proven to secrete
neurotrophic and anti-inflammatory substances after transplantation, carrying
strong neuroprotective abilities.
References
[Aar05] D. Aarsland. A systematic review of prevalence studies of dementia in
parkinson’s disease. Movement Disorders : Official Journal of the
Movement Disorder Society, 20, 1255–1263, 2005.
[Abu22] A. H. Abusrair. Tremor in parkinson’s disease: From pathophysiology to
advanced therapies. Tremor and Other Hyperkinetic Movements (New
York, N.Y.), 12, 29. https://doi.org/10.5334/tohm.712, 2022.
[Alv04] G. Alves. Cigarette smoking in parkinson’s disease: Influence on disease
progression. Movement Disorders : Official
Journal of the Movement Disorder Society, 19(9), 1087–1092.
https://doi.org/10.1002/mds.20117, 2004.
[Ant08] A. Antonini. Comt inhibition with tolcapone in the treatment algorithm
of patients with parkinson’s disease (pd): Relevance for motor and non-
motor features. Neuropsychiatric Disease and Treatment, 4(1), 1–9.
https://doi.org/10.2147/ndt.s2404, 2008.
[Arb23] D. Arbabha. Application of stem cells and adipose-derived stem cell
exosomes on dermal wound healing. CellR4, 11(e3402).
https://doi.org/10.32113/cellr4202373402, 2023.
[Ari99] K. Arima. Cellular co-localization of phosphorylated tau- and
nacp/alpha-synuclein-epitopes in lewy bodies in sporadic parkinson’s
disease and in dementia with lewy bodies. Brain Research, 843(1–2),
53–61. https://doi.org/10.1016/s0006-8993(99)01848-x, 1999.
[Bac18] L. Bacakova. Stem cells: Their source, potency and use in regenerative
therapies with focus on adipose-derived stem cells—a review.
Biotechnology Advances, 36(4), 1111–1126.
https://doi.org/10.1016/j.biotechadv.2018.03.011, 2018.
[Baj10] A. Bajaj. Parkinson’s disease and cancer risk: A systematic review and
meta-analysis. Cancer Causes & Control : CCC, 21(5), 697–707.
https://doi.org/10.1007/s10552-009-9497-6, 2010.
[Bar69] A. Barbeau. L-dopa therapy in parkinson’s disease: A critical review of
nine years’ experience. Canadian Medical Association Journal, 101(13),
59–68, 1969.
[Bei14] J. M. Beitz. Parkinson’s disease: A review. Frontiers in Bioscience (Scholar
Edition), 6(1), 65–74. https://doi.org/10.2741/s415, 2014.
[Ber01] A. Berardelli. Pathophysiology of bradykinesia in parkinson’s disease.
Brain : A Journal of Neurology, 124(Pt 11), 2131–2146.
https://doi.org/10.1093/brain/124.11.2131, 2001.
[Bla16] J. W. Blaszczyk. Parkinson’s disease and neurodegeneration: Gaba-
collapse hypothesis. Frontiers in Neuroscience, 10, 269.
https://doi.org/10.3389/fnins.2016.00269, 2016.
[Bol20] M. Bologna. Pathophysiology of rigidity in parkinson’s disease: Another
step forward. Clinical Neurophysiology : Official Journal of the
International Federation of Clinical Neurophysiology, 131(8), 1971–
1972. https://doi.org/10.1016/j.clinph.2020.05.013, 2020.
[Bou08] G. Bouchez. Partial recovery of dopaminergic pathway after graft of
adult mesenchymal stem cells in a rat model of parkinson’s disease.
Neurochemistry International, 52(7), 1332–1342.
https://doi.org/10.1016/j.neuint.2008.02.003, 2008.
[Bre16] C. B. Breckenridge. Association between parkinson’s disease and
cigarette smoking, rural living, well-water consumption, farming and
pesticide use: Systematic review and meta-analysis. PloS One, 11(4),
e0151841. https://doi.org/10.1371/journal.pone.0151841, 2016.
[Bro00] D. J. Brooks. Dopamine agonists: Their role in the treatment of
parkinson’s disease. Journal of Neurology, Neurosurgery, and Psychiatry,
68(6), 685–689. https://doi.org/10.1136/jnnp.68.6.685, 2000.
[Car14] A. Cardinale. Protein misfolding and neurodegenerative diseases.
International Journal of Cell Biology, 2014, 217371.
https://doi.org/10.1155/2014/217371, 2014.
[Cha21] R. J. Chandler. Modelling the functional genomics of parkinson’s disease
in caenorhabditis elegans: Lrrk2 and beyond. Bioscience Reports,
41(9). https://doi.org/10.1042/BSR20203672, 2021.
drug addiction and neuroplasticity. Genome Medicine, 2(12), 92.
https://doi.org/10.1186/gm213, 2010.
[dRV16] J. P. de Rivero Vaccari. Exosome-mediated inflammasome signaling after
central nervous system injury. Journal of Neurochemistry, 136 Suppl 1(0
1), 39–48. https://doi.org/10.1111/jnc.13036, 2016.
[Dur04] F. Durif. Clozapine improves dyskinesias in parkinson disease: A
double-blind, placebo-controlled study. Neurology, 62(3), 381–388.
https://doi.org/10.1212/01.wnl.0000110317.52453.6c, 2004.
[Elb02] A. Elbaz. Risk tables for parkinsonism and parkinson’s disease. Journal
of Clinical Epidemiology, 55(1), 25–31. https://doi.org/10.1016/s0895-
4356(01)00425-5, 2002.
[Elb16] A. Elbaz. Epidemiology of parkinson’s disease. Revue Neurologique,
172(1), 14–26. https://doi.org/10.1016/j.neurol.2015.09.012, 2016.
[Fio08] R. Fiore. Microrna function in neuronal development, plasticity and
disease. Biochimica et Biophysica Acta, 1779(8), 471–478.
https://doi.org/10.1016/j.bbagrm.2007.12.006, 2008.
[Gal05] J. P. Galanaud. Cigarette smoking and parkinson’s disease: A case-
control study in a population characterized by a high prevalence of
pesticide exposure. Movement Disorders : Official Journal of the
Movement Disorder Society, 20(2), 181–189.
https://doi.org/10.1002/mds.20307, 2005.
[Ger89] D. C. German. Midbrain dopaminergic cell loss in parkinson’s disease:
Computer visualization. Annals of Neurology, 26(4), 507–514.
https://doi.org/10.1002/ana.410260403, 1989.
[Gur21] S. Gurunathan. A comprehensive review on factors influences
biogenesis, functions, therapeutic and clinical implications of
exosomes. International Journal of Nanomedicine, 16, 1281–1312.
https://doi.org/10.2147/IJN.S291956, 2021.
[Han17] H. A. Hanagasi. Dementia in parkinson’s disease. Journal of the
Neurological Sciences, 374, 26–31.
https://doi.org/10.1016/j.jns.2017.01.012, 2017.
[Har13] C. V. Harding. Exosomes: Looking back three decades and into the
future. The Journal of Cell Biology, 200(4), 367–371.
https://doi.org/10.1083/jcb.201212113, 2013.
[Hau99] R. A. Hauser. Long-term evaluation of bilateral fetal nigral
transplantation in parkinson disease. Archives of Neurology, 56(2), 179–
187. https://doi.org/10.1001/archneur.56.2.179, 1999.
[Hep16] D. H. Hepp. Distribution and load of amyloid-β pathology in parkinson
disease and dementia with lewy bodies. Journal
of Neuropathology and Experimental Neurology, 75(10), 936–945.
https://doi.org/10.1093/jnen/nlw070, 2016.
[Her02] M. A. Hernán. A meta-analysis of coffee drinking, cigarette smoking,
and the risk of parkinson’s disease. Annals of Neurology, 52(3), 276–284.
https://doi.org/10.1002/ana.10277, 2002.
[Her16] T. M. Herrington. Mechanisms of deep brain
stimulation. Journal of Neurophysiology, 115(1), 19–
38. https://doi.org/10.1152/jn.00281.2015, 2016.
[Huo17] P. Huot. Serotonergic approaches in parkinson’s disease: Translational
perspectives, an update. ACS Chemical Neuroscience, 8(5), 973–986.
https://doi.org/10.1021/acschemneuro.6b00440, 2017.
[Hur13] M. J. Hurley. Parkinson’s disease is associated with altered expression
of cav1 channels and calcium-binding proteins. Brain : A Journal of
Neurology, 136(Pt 7), 2077–2097.
https://doi.org/10.1093/brain/awt134, 2013.
[Iba17] C. F. Ibanez. Biology of gdnf and its receptors—relevance for disorders
of the central nervous system. Neurobiology of Disease, 97(Pt B), 80–89.
https://doi.org/10.1016/j.nbd.2016.01.021, 2017.
[Jan08] J. Jankovic. Clinical features and diagnosis. Journal of Neurology,
Neurosurgery, and Psychiatry, 79(4), 368–376.
https://doi.org/10.1136/jnnp.2007.131045, 2008.
[Joh10] T. V. Johnson. Neuroprotective effects of intravitreal mesenchymal stem
cell transplantation in experimental glaucoma. Investigative
Ophthalmology & Visual Science, 51(4), 2051–2059.
https://doi.org/10.1167/iovs.09-4509, 2010.
[Kal14] A. Kalani. Exosomes: Mediators of neurodegeneration, neuroprotection
and therapeutics. Molecular Neurobiology, 49(1), 590–600.
https://doi.org/10.1007/s12035-013-8544-1, 2014.
[Kal20] R. Kalluri. The biology, function, and biomedical applications of
exosomes. Science (New York, N.Y.), 367(6478).
https://doi.org/10.1126/science.aau6977, 2020.
[Kha19] M. Khademizadeh. Differentiation of adult human mesenchymal stem
cells into dopaminergic neurons. Research in Pharmaceutical Sciences,
14(3), 209–215. https://doi.org/10.4103/1735-5362.258487, 2019.
[Kim11] H. J. Kim. Stem cell potential in parkinson’s disease and molecular
factors for the generation of dopamine neurons. Biochimica et
Biophysica Acta, 1812(1), 1–11.
https://doi.org/10.1016/j.bbadis.2010.08.006, 2011.
[Kim18] S. M. Kim. Gait patterns in parkinson’s disease with or without
cognitive impairment. Dementia and Neurocognitive Disorders, 17(2),
57–65. https://doi.org/10.12779/dnd.2018.17.2.57, 2018.
[Kol13] G. Kolios. Introduction to stem cells and regenerative medicine.
Respiration; International Review of Thoracic Diseases, 85(1), 3–10.
https://doi.org/10.1159/000345615, 2013.
[Lai15] R. C. Lai. Mesenchymal stem cell exosomes. Seminars in
Cell & Developmental Biology, 40, 82–88.
https://doi.org/10.1016/j.semcdb.2015.03.001, 2015.
[lee88] A. J. Lees. The nighttime problems of parkinson’s disease. Clinical
Neuropharmacology, 11(6), 512–519.
https://doi.org/10.1097/00002826-198812000-00004, 1988.
[Li21] Q. Li. Exosomes derived from mir-188-3p-modified adipose-derived
mesenchymal stem cells protect parkinson’s disease. Molecular
Therapy. Nucleic Acids, 23, 1334–1344.
https://doi.org/10.1016/j.omtn.2021.01.022, 2021.
[Lim10] P. K. Lim. Neurogenesis: Role for micrornas and mesenchymal stem cells
in pathological states. Current Medicinal Chemistry, 17(20), 2159–2167.
https://doi.org/10.2174/092986710791299894, 2010.
[Lin11] O. Lindvall. Cell therapeutics in parkinson’s disease. Neurotherapeutics :
The Journal of the American Society for Experimental
NeuroTherapeutics, 8(4), 539–548. https://doi.org/10.1007/s13311-
0110069-6, 2011.
[Liu15] A. K. L. Liu. Nucleus basalis of meynert revisited: Anatomy, history and
differential involvement in alzheimer’s and parkinson’s disease. Acta
Neuropathologica, 129(4), 527–540. https://doi.org/10.1007/s00401-
015-1392-5, 2015.
[Liu22] S. F. Liu. Update on the application of mesenchymal stem cell-derived
exosomes in the treatment of parkinson’s disease: A systematic review.
Frontiers in Neurology, 13, 950715.
https://doi.org/10.3389/fneur.2022.950715, 2022.
[Mai17] P. Maiti. Current understanding of the molecular mechanisms in
parkinson’s disease: Targets for potential treatments. Translational
Neurodegeneration, 6, 28. https://doi.org/10.1186/s40035-017-0099z,
2017.
[Mar12] O. Marques. Alpha-synuclein: From secretion to dysfunction and death.
Cell Death & Disease, 3(7), e350. https://doi.org/10.1038/cddis.2012.94,
2012.
[Neund] Neuromics. Exosome isolation methods.
https://www.neuromics.com/exosome-isolation-methods, n.d.
[niand] nia.nih.gov. What is lewy body dementia? causes, symptoms, and
treatments. https://www.nia.nih.gov/health/what-lewybody-dementia-
causes-symptoms-and-treatments, n.d.
[Off07] D. Offen. Intrastriatal transplantation of mouse bone marrow-derived
stem cells improves motor behavior in a mouse model of parkinson’s
disease. Journal of Neural Transmission. Supplementum, 72, 133–143.
https://doi.org/10.1007/978-3-211-73574-916, 2007.
[Oh16] S. H. Oh. Mesenchymal stem cells inhibit transmission of α-synuclein by
modulating clathrin-mediated endocytosis in a parkinsonian model.
Cell Reports, 14(4), 835–849.
https://doi.org/10.1016/j.celrep.2015.12.075, 2016.
[Ois09] K. Oishi. Differential ability of somatic stem cells. Cell Transplantation,
18(5), 581–589. https://doi.org/10.1177/096368970901805614, 2009.
[Rie04] P. Riederer. Clinical applications of mao-inhibitors. Current Medicinal
Chemistry, 11(15), 2033–2043.
https://doi.org/10.2174/0929867043364775, 2004.
[Rit07] B. Ritz. Pooled analysis of tobacco use and risk of
parkinson disease. Archives of Neurology, 64(7), 990–
997. https://doi.org/10.1001/archneur.64.7.990, 2007.
[Roy04] L. Roybon. Stem cell therapy for parkinson’s disease: Where do we
stand? Cell and Tissue Research, 318(1), 261–273.
https://doi.org/10.1007/s00441-004-0946-y, 2004.
[Ser04] M. Seruya. Clonal population of adult stem cells: Life span and
differentiation potential. Cell Transplantation, 13(2), 93–101.
https://doi.org/10.3727/000000004773301762, 2004.
[Sez19] M. Sezgin. Parkinson’s disease dementia and lewy body disease.
Seminars in Neurology, 39(2), 274–282. https://doi.org/10.1055/s-
00391678579, 2019.
[SM08] S. Schraen-Maschke. Tau as a biomarker of neurodegenerative diseases.
Biomarkers in Medicine, 2(4), 363–384.
https://doi.org/10.2217/17520363.2.4.363, 2008.
[Sma07] N. R. Smalheiser. Exosomal transfer of proteins and rnas at synapses in
the nervous system. Biology Direct, 2, 35. https://doi.org/10.1186/1745-
6150-2-35, 2007.
[Ste12] L. Stefanis. α-synuclein in parkinson’s disease. Cold Spring
Harbor Perspectives in Medicine, 2(2), a009399.
https://doi.org/10.1101/cshperspect.a009399, 2012.
[Ste20] A. Stefani. Sleep in parkinson’s disease. Neuropsychopharmacology :
Official Publication of the American College of
Neuropsychopharmacology, 45(1), 121–128.
https://doi.org/10.1038/s41386-019-0448-y, 2020.
[Sub13] S. R. Subramaniam. Mitochondrial dysfunction and oxidative stress in
parkinson’s disease. Progress in Neurobiology, 106–107, 17–32.
https://doi.org/10.1016/j.pneurobio.2013.04.004, 2013.
[Sä08] K. Sääksjärvi. Prospective study of coffee consumption and risk of
parkinson’s disease. European Journal of Clinical Nutrition, 62(7), 908–
915. https://doi.org/10.1038/sj.ejcn.1602788, 2008.
[Tan99] C. M. Tanner. Parkinson disease in twins: An etiologic study. JAMA,
281(4), 341–346. https://doi.org/10.1001/jama.281.4.341, 1999.
[Tar17] A. Tarakad. Anosmia and ageusia in parkinson’s disease.
International Review of Neurobiology, 133, 541–556.
https://doi.org/10.1016/bs.irn.2017.05.028, 2017.
[Tha08] E. L. Thacker. Familial aggregation of parkinson’s disease: A
meta-analysis. Movement Disorders : Official Journal of the Movement
Disorder Society, 23(8), 1174–1183.
https://doi.org/10.1002/mds.22067, 2008.
[Tys17] O. B. Tysnes. Epidemiology of parkinson’s disease. Journal of Neural
Transmission (Vienna, Austria : 1996), 124(8), 901–905.
https://doi.org/10.1007/s00702-017-1686-y, 2017.
[vdM12] M. van der Mark. Is pesticide use related to parkinson disease? some
clues to heterogeneity in study results. Environmental Health
Perspectives, 120(3), 340–347. https://doi.org/10.1289/ehp.1103881,
2012.
[Ven17] K. Venkatesh. Mesenchymal stem cells as a source of dopaminergic
neurons: A potential cell based therapy for parkinson’s disease. Current
Stem Cell Research & Therapy, 12(4), 326–347.
https://doi.org/10.2174/1574888X12666161114122059, 2017.
[Ver15] A. Verstraeten. Progress in unraveling the genetic etiology of parkinson
disease in a genomic era. Trends in Genetics : TIG, 31(3),
140–149. https://doi.org/10.1016/j.tig.2015.01.004, 2015.
[Wei22] D. Weintraub. The neuropsychiatry of parkinson’s disease:
Advances and challenges. The Lancet. Neurology, 21(1), 89–102.
https://doi.org/10.1016/S1474-4422(21)00330-6, 2022.
Flow and Metabolism : Official Journal of the International Society of
Cerebral Blood Flow and Metabolism, 33(11), 1711–1715.
https://doi.org/10.1038/jcbfm.2013.152, 2013.
[Xu14] L. Xu. An emerging role of park2 in cancer. Journal of Molecular Medicine
(Berlin, Germany), 92(1), 31–42. https://doi.org/10.1007/s00109-013-
1107-0, 2014.
[Yu11] Y. M. Yu. Microrna mir-133b is essential for functional recovery after
spinal cord injury in adult zebrafish. The European Journal of
Neuroscience, 33(9), 1587–1597.
https://doi.org/10.1111/j.14609568.2011.07643.x, 2011.
[Yu14] B. Yu. Exosomes derived from mesenchymal stem cells. International
Journal of Molecular Sciences, 15(3), 4142–4157.
https://doi.org/10.3390/ijms15034142, 2014.
[Yua18] Y. Yuan. Stem cell-derived exosome in cardiovascular diseases: Macro
roles of micro particles. Frontiers in Pharmacology, 9, 547.
https://doi.org/10.3389/fphar.2018.00547, 2018.
[Zha23] K. Zhang. Stem cell-derived exosome versus stem cell therapy. Nature
Reviews Bioengineering, 1–2. https://doi.org/10.1038/s44222023-
00064-2, 2023.
[Zho23] Z. Zhou. Implications of crosstalk between exosome-mediated
ferroptosis and diseases for pathogenesis and treatment. Cells, 12(2).
https://doi.org/10.3390/cells12020311, 2023.
[Zhu16] M. Zhu. Sensory symptoms in parkinson’s disease: Clinical features,
pathophysiology, and treatment. Journal of Neuroscience Research, 94(8),
685–692. https://doi.org/10.1002/jnr.23729, 2016.
Life Challenges Faced By Chinese Workers In
Africa
Hengyi Chen∗
October 13, 2023
Abstract
The presence of Chinese workers in Africa has grown significantly in
recent years, driven by large-scale infrastructure projects and economic
initiatives. While attention has been devoted to the broader landscape
of Chinese foreign enterprises operating abroad, there is a notable lack
of comprehensive research into the specific challenges faced by Chinese
workers in Africa. This essay delves into the multifaceted challenges
experienced by Chinese workers in Africa, focusing on personal safety, health
concerns, and cultural conflicts. It then proposes several potential
solutions to address these challenges. Companies can incorporate crucial
provisions in employment contracts to ensure the welfare and quality of
life of their workers. Educational sessions for Chinese workers in Africa
can help bridge cultural gaps and enhance safety. Additionally, business
organizations such as chambers of commerce can play a collaborative role
in addressing these challenges and mediating conflicts.
1 Introduction
Chinese foreign enterprises operating abroad have certainly garnered substantial
attention from researchers and the media. This reporting typically focuses on
huge, state-owned companies and their strategic impact.1
Chinese state-owned businesses often engage in large-scale infrastructure
projects, resource extraction, and other ventures that have the potential to
shape the economic landscape of host countries. For example, China has
invested in Ethiopia’s railway system. The Chinese Export-Import Bank provided
85% of the funding for the $475 million Addis Ababa Light Rail, which serves
4 million of the city’s residents.2
In contrast to the focus on investment, it is surprising that researchers have
published relatively few essays about the circumstances of Chinese workers in
∗ Advised by: Dr. James Sundquist of Yale University
1 For example, see Haydn Shaughnessy, “Chinese Companies Are Transforming
Business—and the West Is Struggling To Keep Up”
2 Mariama Sow, “Figures of the week: Chinese investment in Africa”
Africa. Given the increasing presence of Chinese labor in various African
countries under the influence of transnational projects, such as the Belt and Road
Initiative, Chinese workers have become a large overseas labor force. As of 2019,
there were officially about one million Chinese workers employed overseas, with
many additional Chinese citizens working overseas on tourist visas or in other
unofficial capacities.3
When attention is paid to labor issues in Africa, it usually frames Chinese
labor as competing with Africans for jobs or focuses on African workers under
Chinese managers. For example, U.S. politicians, from Hillary Clinton to Rex
Tillerson, have criticized China for not hiring enough African workers.4
Although these angles are worth exploring, so are the challenges faced by Chinese
workers. The U.S. perspective still portrays China as a single entity, without
sympathy for the difficult situations many Chinese workers find themselves in.
One of the rare instances in which Chinese individuals take center
stage is Howard French’s work, China’s Second Continent. French introduces
the pattern of the migration waves of Chinese workers. The book captures these
individuals’ motivations, actions, and experiences in Africa.5 French provides
numerous interesting anecdotes about Chinese entrepreneurs and the opportunities
they see in Africa. Although the Asian arrivals have faced substantial hurdles
due to language barriers in communicating with local administrations, French’s
book is full of examples of persistent Chinese migrants who
have successfully explored possibilities in this new frontier.6 French highlights
the spirit of these individuals and their ability to identify and capitalize on
opportunities in Africa’s rapidly evolving economic landscape. He discusses the
sectors they invest in, such as infrastructure, manufacturing, and retail, and
how their activities impact local economies.
French’s account, while illuminating, contains three important gaps, and my
views diverge from his on each. First, French’s book primarily
focuses on entrepreneurs and does not explore common workers in the
same level of detail. Chinese workers in Africa face a wide range of challenges
that differ significantly from those of entrepreneurs. These challenges may
include issues related to labor rights, working conditions, cultural adaptation,
discrimination, and more. Neglecting to explore these aspects leaves a significant
gap in the narrative, as the experiences of workers are a vital part of the
broader story of Chinese engagement in Africa. Second, his use of anecdotes
shows the diversity of experiences but makes it challenging to understand the
common themes of Chinese experiences in Africa. Anecdotes by their nature
are selective and specific. They highlight individual stories or instances, but
these may not be representative of the broader population. Because the book relies
heavily on anecdotes, it can give readers a skewed view of the overall experi-
3 Jennifer Hillman and Alex Tippett, “Who Built That? Labor and the Belt and Road
Initiative”
4 Jenni Marsh, “Employed By China”
5 IPI, “Africa: China’s Second Continent”
6 Chris Hartman, ’China’s Second Continent’ tells the fascinating yet alarming story of
ences of Chinese workers. It may focus on exceptional cases or outliers, making
it difficult to discern the typical challenges faced by the majority. Finally, ten
years have passed since French’s book was published, and the landscape of
China-Africa relations has inevitably evolved. Trends in labor
migration, working conditions, and interactions between Chinese workers
and the local populace may have changed, necessitating a refreshed
assessment of the circumstances.
In this essay, I introduce the core challenges faced by Chinese
workers in Africa: personal safety, staying healthy, and cultural conflict. I show
that these three factors are the ones that most concern Chinese
workers. African food systems often suffer from inefficiencies, poor
infrastructure, and post-harvest losses. These issues can result in irregular and insufficient
food supplies, leading to food insecurity for Chinese workers. Inadequate food
safety measures can expose Chinese workers to health risks. Contaminated or
unsafe food can cause illness and undermine their well-being. Chinese
workers may also face substandard housing conditions, including overcrowding and
a lack of basic amenities. I then propose potential solutions to these living
problems, such as improved employment contracts, education sessions for
Chinese workers, and regular communication with local governments to help
both sides resolve problems.
2 Personal Safety
What is the primary challenge initially encountered by Chinese workers in
Africa? While research covers various dimensions, safety emerges
as one of the most central themes. The Chinese
Academy of Social Sciences notes that 84% of China’s Belt and Road
investments are in medium to high-risk countries. Three hundred and fifty serious
security incidents involving Chinese firms occurred between 2015 and 2017,
from kidnappings and terror attacks to anti-Chinese violence, according to
China’s Ministry of State Security.7 Ongoing conflicts, political uncertainties,
and economic trials in certain African nations have created an atmosphere
of escalated and unpredictable risks. Kate Bartlett, a journalist
based in Africa, reported that the killing of nine Chinese gold mine workers in
the conflict-ridden Central African Republic in March 2023 highlighted the risks
some projects face in volatile areas.8 Against this complex backdrop, Chinese
workers, frequently engaged in critical endeavors such as infrastructure
development and resource extraction, have encountered a multitude of challenges
directly tied to safety concerns, labor disputes, and interruptions
to their work stemming from local upheavals. A report last year by
the U.K.-based Business and Human Rights Resource Center found 181 human
rights allegations connected to Chinese investments in Africa between 2013 and
7 Paul Nantulya, “Chinese Security Firms Spread along the African Belt and Road”
8 Kate Bartlett, “How Chinese Private Security Companies in Africa Differ From Russia’s”
2020, with the highest number of incidents in Uganda, Kenya, Zimbabwe, and
the Democratic Republic of Congo.9
These challenges stem from a confluence of factors. The precarious nature of
the political climate in some regions not only jeopardizes the stability of existing
projects but also undermines the overall security of Chinese workers. Moreover,
the intricate web of economic hardships prevailing in certain African countries
can exacerbate the challenges faced by Chinese workers. Financial constraints
and resource limitations in these regions can impede the timely execution of
projects, thereby increasing the exposure of workers to potential hazards and
uncertainties. Moreover, most Chinese companies meet only the basic
standards of local laws, in contrast to some Western transnational enterprises and
mature local companies.10
Military takeovers are a further source of instability and danger. Accord-
ing to Cobus van Staden, senior researcher at the South African Institute of
International Affairs, “One of the contributing factors in all of this is the per-
ception you see in African countries that Chinese people keep lots of cash on
hand,” making Chinese workers favored targets for kidnappers.11 Chad, Mali,
Guinea, Sudan, and more recently Burkina Faso have all witnessed successful
military takeovers, The aftermath of these political shifts has heightened con-
cerns about the safety and security of Chinese workers operating within these
regions.12 These abrupt leadership changes, often accompanied by civil unrest
and power struggles, introduce an added layer of complexity to an already intri-
cate situation. After military takeovers, institutions and governance structures
can falter, leading to instability. This can result in greater lawlessness,
weakened law enforcement, and increased criminal activity, all of which
jeopardize the safety of foreign workers, including the Chinese workforce.
According to the United Nations, xenophobic violence in South Africa in 2008
resulted in the deaths of over 60 people and contributed to the displacement
of at least 100,000.13 An estimated 500,000 or more Chinese citizens live in
South Africa, where increasingly frequent violent crime casts a shadow over
their lives.14
Wang Wei, who has been employed at a Chinese company in Johannesburg
for half a decade, emphasized the heightened caution he has exercised when
venturing outside, particularly in the aftermath of the tragic murder of Zhong
Zhiwei. Zhong Zhiwei, the former president of the Township Association of
Shandong Province in South Africa, and his wife were tragically gunned down
in broad daylight in Johannesburg on August 13. Wang Wei is among the par-
ticipants in a collective effort led by a local Chinese association, urging the
South African government to swiftly apprehend and bring to justice the perpe-
trators responsible for the couple’s killing. “This is the least I can contribute,”
9 Kate Bartlett, “Are Rights Abuses Tarnishing China’s Image in Africa?”
10 Wenjie Zhao, “Research on the rights and interests protection of local workers in Chinese-
funded enterprises in Africa – taking Zimbabwe as an example”
11 Kate Bartlett, “Chinese Working in Africa Face Threat of Kidnapping”
12 Reuters, “Recent coups in West and Central Africa”
13 United Nations, “South Africa: UN experts condemn xenophobic violence and
racial discrimination against foreign nationals”
he remarked.15
Moreover, the prevalence of civil conflicts between local governments and var-
ious military factions further amplifies concerns. This prompts many enterprises
to enlist security guards to safeguard both their assets and their workforce.
According to the Private Security Industry Regulatory Authority (PSIRA), there
were 9,539 registered security companies in South Africa in 2022, along with
almost 2.5 million registered security officers.16 The need for security guards
to protect both properties and employees becomes a strategic necessity in an
environment characterized by uncertainty and potential risks. Phoenix
International, a Chinese think tank with strong state-owned enterprise (SOE)
ties, reports that no more than twenty Chinese security firms conduct
activities overseas protecting SOEs and other Chinese interests. By 2013, these
firms employed around 3,200 personnel, according to the Germany-based Mercator
Institute for China Studies. That is more than the number of United Nations
(UN) peacekeepers China furnishes, a figure that stood at 2,534 troops and
police as of June 2020.17
In the pursuit of profit maximization, certain enterprises may opt to cut
costs, particularly in areas such as safety measures and security provisions,
inadvertently putting the lives and well-being of their workers at risk. Instances
of accidents, injuries, or even loss of life can have profound effects on workforce
morale, productivity, and reputation. The negative impact on the enterprise’s
image, both locally and internationally, can overshadow any initial cost savings,
leading to the potential loss of business opportunities and investor confidence.
3 Staying Healthy
Chinese laborers are now more prevalent than ever in Africa’s dynamic panorama
of economic development and infrastructure projects. Beyond the difficulties
posed by safety concerns and operational complexities, a crucial but sometimes
under-appreciated concern looms large: the health risks that these workers
encounter on the job. These individuals are exposed to a variety of health
risks, such as infectious diseases and environmental contaminants, as they work
on various projects across the continent. This section examines the
multifaceted health risks that Chinese workers in Africa face, along with the
wider implications for project continuity, labor force sustainability, and
the pressing need for coordinated efforts to protect their safety.
Chinese immigrants frequently experience a difference in food culture when
arriving in African nations. This difference can be found in taste, ingredients,
and cooking techniques. Their typical eating habits may greatly differ from the
local food, which could cause discomfort and even intestinal problems. Cross-
cultural interaction includes learning new tastes and nutritional practices, but
if handled carelessly, it might pose acute health hazards. Finding familiar meals
15 GlobalTimes, “Chinese in S. Africa fear for safety amid rising murder cases”
16 Wise Move, “Top Security Companies in South Africa — Complete List 2023”
17 Paul Nantulya, “Chinese Security Contractors in Africa”
and maintaining dietary habits can be difficult, leading to physical and mental
exhaustion and impacting employee happiness and performance.
Beyond individual preferences, Africa’s larger food systems have significant
flaws and food safety issues. The ability of these systems to meet the rising
food demand of expanding populations has been strained for a number of
reasons, including extreme weather events, climate change, frequent outbreaks
of pests and diseases, and limited adoption of modern agricultural
technologies.18 This can raise concerns about the accessibility, quality, and
safety of food for Chinese workers as well as the surrounding community.
For example, Africa’s food security challenges are compounded by the war in
Ukraine, supply chain shortages, conflict, and drought. Together, these
pressures caused many staple food prices in Africa to increase by an average
of almost 25% between 2020 and 2022.19
The interplay of these factors can potentially lead to compromised food
safety. Inadequate food storage, lack of access to clean water for cooking and
washing produce, and challenges in sourcing ingredients that meet both dietary
and safety criteria can all contribute to health risks. Illnesses arising from food-
borne pathogens can disrupt work schedules, impede productivity, and even re-
sult in long-term health issues for workers. More importantly, the shortcomings
in the healthcare infrastructure and the vulnerability of the healthcare systems
are obvious issues that have a serious influence on the safety and well-being of
Chinese employees. When employees become unwell, the lack of hospitals and
medical services across different locations presents a difficult obstacle, empha-
sizing the vulnerabilities they face in their quest for professional prospects.
For those seeking medical care, the lack of hospitals and other medical facil-
ities—which is sometimes exacerbated by a shortage of medical personnel and
resources—creates a catastrophic scenario. Like locals, Chinese laborers en-
counter a dearth of easily accessible healthcare services. The travel time to the
closest medical institution can be rather long, and in rural regions, this might
result in delays in receiving life-saving medical care. The inadequate healthcare
systems in many African nations further exacerbate the situation. The effective-
ness of healthcare supply is compromised by a lack of resources, poor staffing,
and restricted access to necessary drugs and treatments. Chinese workers may
see a sharp contrast when navigating Africa’s healthcare system because they
are used to more developed healthcare systems. When they become ill, the chal-
lenges of communication gaps, new medical procedures, and disparate standards
of care may make an already difficult situation even worse.
In addition to physical illnesses, mental health conditions pose significant
challenges in Africa. It is worth noting that in certain African countries, men-
tal illnesses are sometimes attributed to spiritual causes, resulting in limited
attention and resources being allocated to address them. For these workers,
grappling with illness while navigating a foreign healthcare system can be emo-
tionally and physically taxing. The fear of misdiagnosis, inadequate treatment,
18 WHO, Food Safety
19 Danielle Resnick and Aloysius Uche Ordu, “Africa’s food security challenge”
or lack of proper medical attention can intensify their anxieties. A study from
the Chinese Center for Disease Control and Prevention, based on mental health
questionnaires and binary logistic regression models, found that 48.70% of 154
employees of Chinese-funded enterprises in Ethiopia had mental health
problems.20 Homesickness is a common emotional response among Chinese workers
in Africa. Being separated from their homeland,
families, and the familiar cultural and social context can be emotionally taxing.
They may yearn for the comforts of home, miss important family milestones,
and feel disconnected from their roots. This homesickness can lead to feelings
of sadness, anxiety, and a sense of isolation. Chinese workers often find them-
selves in environments where the local culture and language are different from
their own. Language barriers can hinder effective communication and limit their
ability to form meaningful relationships with locals. This can contribute to a
pervasive sense of loneliness and social isolation, making it difficult for them to
integrate into the local society and establish a support network. The demands
of their work, which often involve long hours and high-pressure responsibilities,
can further exacerbate feelings of loneliness and homesickness. The absence of a
strong social support system can make it challenging to cope with the stressors
inherent to their professional roles.
The issue of an excessive workload coupled with limited leisure time has
emerged as a recurring concern, evident in numerous research papers and survey
questionnaires. Chinese workers find that their rights are adequately
safeguarded neither by the People’s Republic of China’s labor laws nor by the
local regulations of the host African countries. This deficiency in legal
protection can
result in a situation where workers face challenges in asserting their rights and
achieving a work-life balance that promotes their well-being.
Finally, the health of Chinese workers in Africa is at risk from potential
exposure to various diseases, notably Ebola and malaria, which pose significant
health risks within certain regions. Ebola, a highly contagious and frequently
lethal disease, has caused alarm in several regions of Africa. Quarantines, re-
strictions on movement, and increased fear among workers can result from out-
breaks. Employers and employees must bear higher costs as a result of the
necessity to establish stringent health standards and safety measures to stop
the spread of Ebola. Another major threat in many African nations is malaria,
a disease transmitted by mosquitoes. Chinese employees are particularly
vulnerable to infection because they are often unfamiliar with how the disease
is transmitted and how to defend against it. Malaria can have a major negative
impact on health and on productivity by increasing absenteeism. The prevalence
of the illness highlights the significance of preventative measures, including
insecticide-treated bed nets, anti-malarial drugs, and appropriate sanitation,
to safeguard employees’ health. In a cross-sectional study of employees of
Chinese construction companies in Western Africa, ninety-six participants
(37.5%) reported contracting malaria more than once within a year.21
20 Shuo Chen, Mingfan Pang, Xiaopeng Qi, Lili Wang, Xiaochun Wang, “Analysis of mental
health status and influencing factors of employees in Chinese-funded enterprises in Ethiopia”
21 Li Zou, Ke Ning, Wenyu Deng, Xufei Zhang, Mohamamad Shahir Sharifi, Junfei Luo, Yin
Bai, Xiner Wang, Wenjuan Zhou, “Study on the use and effectiveness of malaria preventive
measures reported by employees of Chinese construction companies in Western Africa in 2021”
4 Culture Conflicts
Chinese workers in Africa often encounter significant cultural barriers, especially
when it comes to their working habits with colleagues. These challenges can
have wide-ranging implications for both the Chinese workers themselves and
their interactions with local African colleagues.
In countries such as Ghana, local governments and communities often ex-
ert pressure on Chinese-funded projects to ensure that a significant portion of
the workforce is composed of local hires. This requirement reflects a desire to
maximize job opportunities for the local population and distribute the economic
benefits of Chinese investments. For example, in the construction of the Bui
Dam, the agreement between Sinohydro, the Chinese state-owned behemoth
contracted to complete the project, and the Ghanaian government stipulated
that a certain proportion of the workforce would be local.22 Chinese workers
may find it challenging to adapt to working alongside local colleagues who have
different cultural backgrounds, work practices, and expectations. Chinese work-
ers in Africa may struggle with language barriers when working with local col-
leagues who may speak different languages or dialects. Miscommunication can
lead to misunderstandings, errors, and strained working relationships. Accord-
ing to Business Insider, only 130 million of the approximately 1 billion people
in Africa speak English (13%).23 Moreover, many Chinese employees are
themselves not fluent in English, which compounds the communication problems
they face while living and working in Africa.
Chinese and African cultures also often have different work ethics and expec-
tations regarding punctuality, productivity, and work hours. Chinese workers
may be accustomed to a more rigorous work schedule and faster pace, while
local colleagues may have a different approach. These differences in work habits
can lead to friction and misunderstandings. Among Chinese employees in
Tanzania, adaptation to the material environment is the most successful,
followed by adaptation to everyday culture, while adaptation to work culture
is poor. More specifically, the Chinese employees describe the overall
environment in African countries as kind and polite, but they find street
scenes such as begging, and what they perceive as a lazy working style,
unacceptable.24 As the Tanzanian example suggests, local work cultures in
African countries can differ significantly from what Chinese workers are used
to in terms of hierarchy, decision-making processes, and the pace of work.
This lack of adaptation can hinder collaboration and productivity.
In summary, Chinese workers in Africa face various cultural barriers when it
comes to their working habits with local colleagues. These challenges encompass
22 Pippa Morgan, Andrea Ghiselli, “Chinese workers on Africa’s infrastructure projects: The
their strategies”
24 Qingmin Li, “A study on cross-cultural adaptation of Chinese employees in Chinese-
issues related to local employment requirements, language barriers, differences
in work ethic, adaptation to local work culture, and cultural sensitivity. Over-
coming these barriers requires cultural awareness, effective communication, and
a willingness to adapt and collaborate across cultural boundaries, all of which
are essential for the success of Chinese-funded projects in Africa and the
harmonious coexistence of a diverse workforce.
5 Potential Solutions
As China’s presence in Africa continues to grow, finding effective solutions
to address these challenges becomes increasingly crucial. This section
explores innovative approaches and practical solutions aimed at improving the
lives and well-being of Chinese workers in Africa, while also promoting
sustainable development and fostering positive relations between China and
Africa. By addressing
these challenges comprehensively, we can pave the way for a more prosperous
and harmonious coexistence between Chinese workers and their African host
communities.
First, to effectively address the challenges faced by Chinese workers in Africa,
companies can play a pivotal role by implementing certain crucial provisions in
their employment contracts. To begin, it is imperative to establish compre-
hensive insurance coverage for employees, encompassing both health and life
insurance. Adequate coverage not only ensures access to healthcare in remote
areas but also offers financial security to workers and their families. Stipulat-
ing tax-free status for income earned in Africa can significantly enhance the
economic well-being of employees. Furthermore, companies should commit to
providing reliable transportation options and suitable living conditions to alle-
viate the challenges of remote work environments. Setting age limits for specific
job positions can ensure that employees are adequately equipped to handle the
physical demands of their roles, while also promoting safety and well-being. By
incorporating these conditions into employment contracts, Chinese companies
can proactively address the welfare and quality of life for their workers in Africa,
fostering a more conducive and harmonious working environment.
Second, implementing education sessions for Chinese workers in Africa is not
only feasible but also highly advisable, serving as a valuable recommendation
for companies operating on the continent. These sessions can be conducted by
local tutors or even experienced workers who have previously been employed in
the region. Such educational initiatives can cover a range of topics, including
language proficiency, cultural awareness, understanding forbidden zones, and
crucial safety instructions. By harnessing the expertise of local instructors or
experienced colleagues, companies can empower their workforce with the essen-
tial knowledge and skills needed to navigate the unique challenges of working in
African contexts. This proactive approach not only enhances the effectiveness
and efficiency of operations but also fosters a stronger sense of cultural integra-
tion and safety among Chinese workers, ultimately benefiting both employees
and the host communities.
From a business perspective, implementing educational sessions for Chinese
workers in Africa is a strategic investment that aligns with a company’s in-
terests on multiple fronts. These educational initiatives need not be costly,
making them a cost-effective way to equip workers with essential skills and
knowledge. By enhancing the attention span and engagement of employees,
companies can address the difficulty of lower productivity and efficiency often
associated with inadequate training and cultural adaptation. Lower turnover
rates among employees who have undergone such training are immensely ben-
eficial to businesses. The cost savings associated with reduced hiring and on-
boarding expenses, along with the preservation of valuable time and energy,
provide tangible benefits. Retaining experienced workers contributes to the
cultivation of expertise, ultimately leading to a more skilled and competent
workforce. Thus, these education sessions emerge as a win-win solution that
not only enhances workers’ capabilities but also proves highly advantageous for
a company’s long-term sustainability and profitability in the African context.
Third, the involvement of business organizations, such as chambers of com-
merce, in addressing the challenges faced by Chinese workers in Africa is a
valuable and collaborative solution. Chambers of commerce serve as associa-
tions or networks of business people dedicated to safeguarding and advancing
the interests of their members. These organizations often comprise business
owners sharing geographical or sectoral interests, and they can also have an
international scope. Companies operating in Africa can forge close partnerships
with chambers of commerce to navigate the complexities of the local business
environment effectively. These chambers often engage in regular dialogues with
local governments, facilitating the resolution of mutual challenges.
When conflicts or issues arise between African and Chinese workers, busi-
nesses can turn to chambers like the China-Africa Business Council (CABC)
for assistance. CABC can serve as a mediator and advisor, helping to find
amicable solutions to address societal conflicts within the workforce. One prac-
tical approach could involve conducting interviews or surveys among Chinese
employees to identify the most significant challenges they face in the business
context. This data can then be compiled into a comprehensive report, high-
lighting key concerns and areas that need attention. Subsequently, the CABC
can engage in discussions with national governments to propose further
investment in projects that result in a “win-win” situation for both Chinese
workers and local communities. For example, in regions where safety is a
concern, collaborative efforts can be undertaken to improve public security,
and the concern can be raised with the national government to initiate
measures aimed at enhancing safety and social well-being.
6 Conclusion
The life challenges faced by Chinese workers abroad are inadequately
documented and often overlooked by mainstream reporting, a significant issue
that deserves our attention. When these challenges are not thoroughly explored or
properly documented, they can have far-reaching consequences that affect indi-
viduals, communities, and even entire societies.
Whether it be hazardous working conditions, inadequate access to healthcare
services, or the need to adapt to a culturally diverse work environment, Chinese
workers confront many challenges that make it difficult for them to succeed. As
China continues to invest in its relationships with African countries, it should
also invest in the personal relationships between Africans and Chinese employ-
ees. Strengthening insurance requirements, providing additional training, and
improving communication through chambers of commerce will hopefully facili-
tate a mutually beneficial collaboration, addressing these challenges and creat-
ing an environment where both Chinese workers and African communities can
thrive and succeed together.
The recommendations above are even more urgently needed because of the
nearly 90% drop in Chinese workers in Africa due to COVID-19.25 Those who
remain are even more isolated than before and in even more need of assistance
from their employers. If Chinese workers are to return and continue building
China-Africa ties, they will need robust support from their home country.
References
[Afr14] Africa: China’s second continent. International Peace Institute, 2014.
[Bar23] Kate Bartlett. How Chinese private security companies in Africa differ
from Russia’s. VOA, 2023.
[Bus18] Stephanie Busari. Employed by China. CNN, 2018.
[Ho15] Ufrieda Ho. Chinese in South Africa learn to live with violence. South
China Morning Post, 2015.
[Hol23] Hereward Holland. Factbox: Recent coups in West and Central Africa.
Reuters, 2023.
[Kem18] Laurent Kemoe. How Africa can escape chronic food insecurity amid
climate change. IMF, 2018.
[Mor23] Pippa Morgan. Chinese workers on Africa’s infrastructure projects:
The link with host political regimes. Phys.org, 2023.
[Nan20] Paul Nantulya. Chinese security contractors in Africa. Carnegie
Endowment for International Peace, 2020.
[Nan21] Paul Nantulya. Chinese security firms spread along the African Belt
and Road. Africa Center for Strategic Studies, 2021.
[rep20] GT reporters. Chinese in S. Africa fear for safety amid rising murder
cases. Global Times, 2020.
25 China Africa Research Initiative, “DATA:CHINESE WORKERS IN AFRICA”
[Rig22] United Nations Human Rights. South Africa: UN experts condemn
xenophobic violence and racial discrimination against foreign nationals.
OHCHR, 2022.
Unlocking Quercetin’s Therapeutic Potential:
The Use of Innovative Drug Delivery Strategies
to Remedy Neurodegenerative Disorders
Andrew Lee ∗†
October 10, 2023
Abstract
Neurodegenerative disorders are a class of diseases characterized by
the degeneration of certain parts of the central and peripheral nervous
system, affecting millions of people worldwide. The use of quercetin for
the potential treatment of neurodegenerative disorders has been heavily
researched due to the flavonoid’s antioxidant, anti-inflammatory, metal-
ion chelating, and neuroprotective properties. However, free quercetin
struggles to make a significant clinical impact due to low aqueous sol-
ubility, chemical instability, and an unfavorable absorption profile, ulti-
mately leading to underwhelming levels of bioavailability. To ameliorate
these issues, drug delivery systems have been employed in modern re-
search, including polymer-based nanoparticles, lipid-based nanoparticles,
and metallic nanoparticles. This review aims to discuss modern research
on quercetin’s potential in neurodegenerative disorder treatment, partic-
ularly with these drug carriers, and identify the most promising configu-
rations for future investigation. Most notably, studies showed that drug
carriers for quercetin’s delivery increased the flavonoid’s bioavailability,
likely due to protective mechanisms against bodily chemical degrada-
tion. Additionally, many studies also found that drug carriers signifi-
cantly extended the duration of quercetin’s release within the body, al-
lowing for less frequent administration of the flavonoid during treatment
periods. Finally, drug delivery systems were shown to facilitate quercetin’s
crossing of the blood-brain barrier, an essential step in treating
neurodegenerative disorders. Though the use of quercetin-loaded drug
carriers for neurodegenerative disorder treatment is still a relatively new
topic of study, certain configurations have shown tremendous potential.
In particular, liposomal delivery systems are especially promising
candidates, and future studies should investigate their use in tandem with
PEGylation for quercetin’s neurodegenerative applications.
∗ Student at Tenafly High School in Tenafly, New Jersey
† Advised by: Dr. Paul Gehret of the University of Pennsylvania
1 Introduction
Quercetin (3,3’,4’,5,7-pentahydroxyflavone) is an abundant dietary polyphenol
and flavonoid found in a wide array of fruits, vegetables, and their
respective derivatives. It is especially concentrated in berries, leafy greens,
citrus fruits, onions, apples, red wine, and green tea [BL22] [SA07] [ADAP16]
[CCE+03] [SNS+13]. Throughout the past few years, quercetin has been heavily
researched and implemented in both food products and pharmaceuticals alike
due to the wide variety of promising health benefits exhibited [LW22] [PVK+22].
One of quercetin’s most prominent properties is its free radical scavenging
ability, which allows it to neutralize the unpaired electrons of free
radicals, species notorious for inducing inflammation and oxidative stress
[ADB+89]. These stressors gradually wear down the body until disorders and
diseases are either caused directly by the damage or arise because the body’s
weakened state predisposes it to them. The most common free radical-induced
maladies include viral infections [Aka01], neurodegenerative disorders
[LBBD17], diabetes [MSWI03], cardiovascular complications [ML97], and various
cancers [VRM+06] [DJ96] [RBS+00]. Owing to this scavenging ability, among other
factors, quercetin has been found to display anti-inflammatory [KVM+11],
antioxidant [HCS+18], anticancer [HDFY+17], antidiabetic [SVKP18],
antimicrobial [WYZ+18], antiviral [SLL+21], hepatoprotective [EAPA+17], and
neuroprotective effects in vivo [BGP+20].
In the past decade, quercetin has seen a rise in preclinical and clinical
research for potential treatments of neurodegenerative disorders, including
Alzheimer’s Disease [MPSS+17], Parkinson’s Disease [SWM+12], Huntington’s
Disease [CSD+13], Amyotrophic Lateral Sclerosis (ALS) [BMSD20], and Multiple
Sclerosis (MS) [AEG+23]. As a result of its free radical scavenging and
neuroprotective properties, quercetin has exhibited the ability not
necessarily to reverse the effects of neurodegenerative disorders but rather
to hinder and mitigate their progression within the body. As neurodegenerative
disorders are most often characterized by neurological degradation and are
closely linked to old age, slowing down their progression can play a
tremendous role in increasing
the life expectancy of those who suffer. However, despite the myriad
pharmaceutical benefits that quercetin possesses, the flavonoid alone has
struggled to make a significant clinical impact due to low aqueous solubility,
chemical instability, and an unfavorable absorption profile, ultimately leading
to underwhelming bioavailability [MMD+97]. Quercetin’s hydrophobic properties,
for one, make absorption into the bloodstream extremely difficult.
Additionally, the highly reactive and pH-sensitive nature of quercetin makes
it prone to chemical alterations, such as deprotonation, when passing through
the acidic environment of the gastrointestinal tract, which further reduces
the flavonoid’s solubility and bioavailability [ZZM21]. In attempts to
ameliorate these issues, recent studies and developments in quercetin
applications encompass a wide range of drug delivery systems for the flavonoid,
including lipid-based nanoparticles, polymer-based nanoparticles, and metallic
nanoparticles [VM19]. This review discusses how the numerous pharmacological
properties of quercetin can be harnessed through the use of novel drug delivery
systems for the potential treatment of neurodegenerative disorders.
the creation of new free radicals is eliminated as the double bonds and ketone
groups present in quercetin’s chemical structure establish a delocalization of
electrons, allowing for the loss of charge to be evenly distributed throughout
the molecule [WSRE04]. As a result, oxidative stress is unable to amass within
the body, lowering the risk of free radical-induced maladies.
2.3 Metal Ion Chelating Activity
Quercetin exhibits a strong metal ion chelating ability due to its unique chemical
structure. Metal ions, such as potassium and iron, have shown dietary benefits
in trace amounts and are essential for human life. However, an oversupply of
these metallic ions and notably even trace amounts of heavy metal ions, such
as lead and mercury, can be highly toxic and, in some cases, lethal [KYSO07].
Similar to free radicals, these metallic ions are highly unstable and can catalyze
the creation of reactive oxygen species (ROS), generating a system of oxidative
stress [Sta90]. To combat the detrimental effects of these metallic ions, metal
ion chelators such as quercetin can be employed. Quercetin possesses the ability
to act as a ligand and form coordinate covalent bonds with metal ions, creating
quercetin-metal complexes [RRD14]. By donating hydrogen ions from its hydroxyl
groups, quercetin is able to neutralize the charge of the metal ion (Fig. 2).
Multiple quercetin molecules often contribute to this effort, sequestering the
metal ion within the complex and shielding it from the redox reactions that
generate ROS (Fig. 2). Through these chelating mechanisms, quercetin is able
to curb the detrimental effects of metal ions.
As mitochondria are the energy-producing organelles in cells, including
neurons, mitochondrial dysfunction can be detrimental to neuronal health
[FBS+07]. Quercetin aids in the maintenance of mitochondrial membrane potential
by encouraging the expression of genes that induce mitochondrial biogenesis,
including peroxisome proliferator-activated receptor-gamma coactivator 1-alpha
(PGC-1α), and it discourages the buildup of oxidative stress and damage.
Together, these effects can drastically reduce neuronal apoptosis rates
[XWG+16]. Another way quercetin can promote neuronal health is by enhancing
autophagy, which prevents the buildup of neurotoxic proteins and other harmful
cellular components [WLL+11]. The accumulation of misfolded proteins within the
brain often characterizes neurodegenerative disorders, and by binding to and
modulating specific proteins such as AMP-activated protein kinase (AMPK) and
Beclin-1, quercetin can stimulate autophagy within neurons, further promoting
their health [WLL+11] [KAAS16]. Ultimately, the combination of quercetin’s
neuroprotective properties aids the flavonoid in combating neurodegenerative
disorders.
enhancing the flavonoid’s health benefits [KTMC22]. The most successful de-
livery systems concerning these efforts include polymer-based, lipid-based, and
metallic nanoparticles [VM19].
Figure 3: Percentage inhibition of nitroblue tetrazolium (NBT) reduction of free
catechin (CAT), free quercetin (QC), catechin-loaded PLGA nanoparticles, and
quercetin-loaded PLGA nanoparticles. Measurements taken at three different
concentrations: 7 µM, 21 µM, and 35 µM. Free PLGA nanoparticles were used
as control [PQF+ 12].
Figure 4: Fe2+ ion chelating activity of free quercetin and PLGA encapsulated
quercetin at 20 µM and 100 µM. Free PLGA nanoparticles were used as a
control. Chelating activity was measured after 0.25 hours, 4 hours, 12 hours,
24 hours, and 32 hours [PQF+ 12].
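The percentage-inhibition values reported in Figure 3 are relative measures against a control. As a minimal sketch, assuming the commonly used definition (control signal minus sample signal, divided by control signal, times 100), the calculation could look like the following. The function name and the absorbance readings are hypothetical and are not taken from [PQF+ 12], whose exact protocol may differ.

```python
def percent_inhibition(control_signal: float, sample_signal: float) -> float:
    """Percentage by which a sample lowers the control signal.

    Assumed definition (not confirmed by the source):
    100 * (control - sample) / control
    """
    if control_signal <= 0:
        raise ValueError("control_signal must be positive")
    return 100.0 * (control_signal - sample_signal) / control_signal


# Hypothetical absorbance readings, for illustration only.
control = 0.80
treated = 0.20
print(f"{percent_inhibition(control, treated):.1f}% inhibition")
```

Under this assumed definition, a sample that leaves the control signal unchanged gives 0% inhibition, and a sample that removes the signal entirely gives 100%.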
3.1.2 Chitosans & Polyethylene Glycols (PEGs)
In addition to PLA and PLGA, a variety of polymer-based nanoparticle de-
livery systems have shown potential for quercetin delivery, including chitosan
nanoparticles and polyethylene glycol conjugations (PEGs) [WSM+ 16]. Chitosans
are natural polysaccharides that share many of the properties
of PLA and PLGA, such as drug-protective and bioavailability-enhancing
abilities, making them strong candidates for drug delivery
applications [DJ18]. Additionally, due to their chemical structure, chitosans
are hydrophilic, achieving high aqueous solubility through hydrogen
bonding from their hydroxyl and amine groups. However, what sets chitosans
apart from other polymers as a drug delivery system is their mucoadhesive
properties, which can substantially boost quercetin’s bioavailability [SWK08].
Because the lone electron pair on each amine group can accept a proton,
chitosans carry a positive charge, giving the polymer an affinity for
negatively charged mucosal surfaces (e.g., the lining of the gastrointestinal
tract). Through electrostatic attraction, the chitosan nanoparticle becomes
tightly bound to the intestinal
epithelium, allowing for effective quercetin absorption into the bloodstream
and thus increasing the flavonoid’s bioavailability [SWK08]. Baksi et al.
demonstrated the benefit of this delivery system, measuring significantly lower
IC50 values in A549 and
MDA-MB-468 tumor cell lines for quercetin-chitosan nanoparticles, which were
prepared by ionic gelation, compared to free quercetin [BSB+ 18]. Additionally,
quercetin-loaded chitosan nanoparticles showed more significant reductions in
tumor volume and weight in vivo [BSB+ 18]. In another study, Mukhopadhyay
et al. showed tremendous drops in blood glucose levels in HT29 cell lines in
vitro using quercetin-succinylated chitosan-alginate core-shell-corona nanopar-
ticles [MMM+ 18]. While various particle size groups were tested, the smallest
group, approximately 91.58 nanometers in size, was found to be most efficient for
quercetin’s oral delivery [MMM+ 18]. On the other hand, polyethylene glycols
(PEGs), while sharing many of the same chemical properties as the other poly-
mers, play a unique role in drug delivery applications. PEGs, like chitosans,
are hydrophilic polymers that have high water solubility and biocompatibil-
ity, making them excellent candidates for drug delivery applications. However,
PEGs are non-biodegradable polymers, limiting their use alone as a drug car-
rier. To remedy this issue, researchers have begun to use novel PEG conjuga-
tions where PEG is used in concert with other nanoparticles to improve the
polymer’s biodegradability while being able to harness PEG’s unique chemical
properties for drug delivery applications. By attaching PEG chains to the surface
of other nanoparticles in a process called PEGylation, the polymer is further
able to protect and stabilize the delivery system, creating a stealth effect for the
nanoparticle [LJS+ 21]. Because these long PEG chains are large yet flexible,
they sterically shield the nanoparticle, physically obstructing other molecules from
binding to the carrier’s surface. Consistent with this, Li et al. found that longer attached
PEG chains correlate with fewer interactions between the nanocarrier and other
cells within the body [LJS+ 21]. Additionally, due to their hydrophilic nature,
these PEG chains create a film-like layer of water that encapsulates the carrier,
protecting it from protein adsorption and phagocytosis, which ultimately
increases the duration of quercetin’s bioavailability. Qureshi et al. used PEGylated
PLGA-quercetin nanoparticles, prepared through double
emulsion encapsulation, to reduce the viability of the MDA-MB-231 cell line
in vitro [QZW+ 16]. Additionally, in vitro tumor targeting and growth inhibition
were demonstrated when doxorubicin was co-delivered
with quercetin, illustrating the nanoparticle’s potential in targeted drug delivery
[QZW+ 16]. Ultimately, despite their lack of biodegradability, PEGs have
shown considerable promise in the field of targeted and controlled-release drug
delivery.
ministered the drug intranasally saw tremendous improvements in their platform
acquisition time [PWS+ 08]. This may indicate that orally administered
liposomes struggle to cross the blood-brain barrier, a difficulty that
intranasal delivery can remedy. In addition to liposomes, micelles are another
lipid-based nanoparticle that has been heavily researched for quercetin delivery.
Micelles are assemblies of amphiphilic molecules that structurally contain only
one layer of lipids. The hydrophilic heads of these lipid molecules face
outwards, creating a hydrophilic surface like
that of liposomes but a hydrophobic core more favorable for quercetin delivery
(Fig. 5) [Men79]. Even so, micelles struggle in vivo compared to liposomes due
to their thinner outer shell, which leads to decreased drug protective proper-
ties and stability as a nanoparticle. As such, micelles are sensitive to stimuli,
spontaneously dissociating when faced with rapid temperature and pH changes,
contributing to their unpredictable drug-release properties [WCZZ09] [GLL13].
However, micelles have shown promise in quercetin delivery when coupled with
polyethylene glycols (PEGs) through PEGylation. Lv et al. employed thin-film
hydration to create PEGylated quercetin-loaded micelles, administered them to
rats, and found that the blood plasma quercetin concentrations of the rats given
the PEGylated micelle conjugation were significantly greater. Even more
notable was that these rats also maintained quercetin within
their bloodstream for almost 50 hours, four times as long as those
administered free quercetin [LLL+ 17]. Using a similar formulation technique, Qi
et al. created PEGylated quercetin-loaded micelles and found significant
inhibition of H22 tumor growth, as well as the potential to shrink
existing tumors, in vivo [QGY+ 22]. These effects lasted up to 15 days after treatment,
illustrating the controlled-release potential of PEGylated micelles in
quercetin delivery [QGY+ 22]. Though often outclassed by polymeric nanoparticles
in quercetin delivery applications, lipid-based nanoparticles certainly have
their place within the field.
3.3 Metallic Nanoparticles
Metallic nanoparticles are another form of drug carrier that has been researched
for its potential in quercetin delivery. While most metals,
particularly heavy metals such as lead and mercury, have exhibited
non-biodegradable and cytotoxic properties within the body [KYSO07], certain
metals, such as iron and silver, have proven to be more biocompatible, making
them viable candidates for drug delivery [BMTC22] [JRM+ 08]. In addition to
increasing the solubility of quercetin through encapsulation, these biocompati-
ble metallic nanoparticles have also exhibited protective effects, increasing the
flavonoid’s stability [KMZM03]. Unfortunately, because metals are highly reactive
within the body, these metallic nanoparticles are susceptible to breakdown
under temperature and pH variations [LCK16]. However, this reactivity
can work in favor of metallic nanoparticles as the property makes them easy to
manipulate and engineer chemically. Most notably, functional groups and other
biomolecules, such as antibodies and peptides, can be attached to the nanopar-
ticle’s surface for targeted delivery to specific cells and molecules. Additionally,
other molecules such as polymers, surfactants, and ligands can be incorporated
into the surface of metallic nanoparticles, partially remedying their extreme reactivity
and enhancing the overall stability of the nanoparticle [ZLA+ 11]. Najafabadi
et al. used a novel iron drug carrier system, quercetin-conjugated superparamagnetic
iron oxide nanoparticles (QT-SPION), for the delivery of quercetin in vivo and, using
high-performance liquid chromatography (HPLC), showed marked
increases in quercetin concentrations within the brain tissue of rats [NKE+ 18].
Additionally, negligible effects on iron concentrations within the brain and blood
plasma were shown. Ultimately, the quercetin-iron oxide
nanoparticle improved crossing of the blood-brain barrier, as
shown by increased quercetin concentrations in the brain [NKE+ 18]. Metallic
nanoparticles have their benefits and downfalls in quercetin delivery, but they
have certainly shown promise in targeting neurodegenerative disorders.
ative disorders [AAAH86]. Additionally, quercetin’s anti-inflammatory, metal
ion chelating, and overall neuroprotective abilities play a role in slowing down
neurodegeneration [DJM13]. Finally, variations of quercetin-loaded nanopar-
ticles have demonstrated the ability to cross through the blood-brain barrier
by improving the bioavailability of the flavonoid [RMS+ 20]. Ultimately, it is
quercetin’s unique repertoire of chemical properties that allows it to combat
neurodegenerative disorders, including Alzheimer’s Disease, Parkinson’s Dis-
ease, Huntington’s Disease, Amyotrophic Lateral Sclerosis (ALS), and Multiple
Sclerosis (MS).
4.2 Parkinson’s Disease
Parkinson’s Disease is a disorder characterized by a neurological undersupply of
dopamine, most often due to the degeneration of dopamine-producing neurons
located in the brain’s substantia nigra [DP03]. As dopamine is responsible for
facilitating smooth and coordinated motor movements, an undersupply of the
neurotransmitter can cause difficulties in executing simple motor movements,
leading to bradykinesia, involuntary tremors, muscle rigidity, and a variety of
other symptoms [DP03]. While free radical-induced oxidative stress can play a
role in dopamine-producing neuronal death, the aggregation of misfolded protein
α-Synuclein within the brain can also be a monumental factor in the pathogen-
esis of Parkinson’s Disease [BWU12]. By enhancing the autophagy of misfolded
α-Synuclein proteins and aiding in the maintenance of mitochondrial membrane
potentials, quercetin can prevent the death of dopamine-producing neurons
and mitigate the onset of Parkinson’s. Wang et al. administered quercetin
in vitro to 6-hydroxydopamine-treated PC12 cells and showed increased
autophagy of dysfunctional mitochondria and α-Synuclein [WHH+ 21]. In
that same study, in vivo oral administration of quercetin over 14 days to
6-hydroxydopamine-lesioned parkinsonian rats reduced
reactive oxygen species (ROS) levels and levels of the oxidative stress marker
malondialdehyde (MDA), in addition to improving levels of the ROS metabolizer
superoxide dismutase (SOD) (Fig. 6), exhibiting promise for Parkinson’s treatment
[WHH+ 21]. Karuppagounder et al. orally administered quercetin to rotenone-induced
hemi-parkinsonian rats over a period of 4 days and found significant and
consistent increases in dopamine levels within the brain [KMP+ 13]. However,
though free quercetin showed promise in rat models of Parkinson’s Disease, the
flavonoid’s effectiveness would likely not be as pronounced in humans,
so large excess doses of quercetin would have to be administered
to attain the desired effect. Through the use of drug delivery systems, quercetin’s
properties can be more efficiently harnessed, enhancing its potential as a
treatment for Parkinson’s Disease.
4.3 Huntington’s Disease
Huntington’s Disease is a disorder characterized by the degeneration of nerve
cells in a part of the brain known as the basal ganglia, leading to a decline
of motor control and cognitive ability [ABF+ 00]. Symptomatically, Hunting-
ton’s Disease shares similarities with Parkinson’s Disease as they both hinder
smooth, coordinated movements, but while Parkinson’s is associated with the
undersupply of dopamine, the pathogenesis of Huntington’s has been found to be
genetic. Huntington’s Disease is caused by a mutation in the HTT gene, which
is responsible for encoding the huntingtin protein [PRY+ 19]. The resultant protein
is significantly longer as a result of the mutation, making it prone to
misfolding and aggregation. By encouraging the buildup of oxidative stress and
mitochondrial dysfunction, the accumulation of mutated huntingtin proteins can
become cytotoxic to neurons [PRY+ 19]. However, quercetin has shown potential
in combating Huntington’s Disease. By enhancing mutated huntingtin protein
autophagy and anti-inflammatory activity within the brain, quercetin can slow
the onset of Huntington’s. Additionally, the flavonoid can encourage the biogenesis
of mitochondria, helping to mitigate previously inflicted neuronal damage
[DMCD09]. Sandhir et al. orally administered quercetin to 3-nitropropionic
acid-induced rat models of Huntington’s Disease over the course of 21 days
and tested their motor control by measuring their performance
on a balance beam test. The researchers found that quercetin administration over
time led to improved motor control and balance, as measured by
faster balance beam completion times as well as fewer paw slips during the
test [SM13]. Chakraborty et al. found that oral quercetin administration over
four days produced similar improvements in motor movement in 3-nitropropionic
acid-induced rat models of Huntington’s, as measured by increases in stride
as well as higher rates of success in completing an obstacle course compared
to untreated counterparts [CSD+ 13]. Still, quercetin administration for
the treatment of Huntington’s can be improved. Though both studies showed
improvements in Huntington’s symptoms as a result of quercetin administration,
the frequency of administration was high: every day for 21 days in the former
study and twice a day for four days in the latter. By employing drug delivery
systems, the number of administrations could be reduced for similar or even
greater results in the realm of Huntington’s treatment.
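The dosing burden in the two studies above can be made concrete with simple arithmetic. The sketch below only multiplies the dosing frequency by the study duration stated in the text; the function and variable names are illustrative, and it makes no claim about the dose size used in either study.

```python
def total_administrations(doses_per_day: int, days: int) -> int:
    """Total number of administrations over a dosing schedule."""
    return doses_per_day * days


# Schedules as described in the text above.
sandhir_total = total_administrations(doses_per_day=1, days=21)     # once daily, 21 days
chakraborty_total = total_administrations(doses_per_day=2, days=4)  # twice daily, 4 days

print(sandhir_total, chakraborty_total)  # prints "21 8"
```

A controlled-release carrier would aim to lower these counts, although how far they could be lowered is not something the cited studies establish.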
cumulation due to excitotoxicity, and even gene mutations that can lead to
misfolded protein aggregations [TRM+ 15] [TSA18]. On the other hand, MS
is characterized by the degeneration of the neuronal myelin sheath, which is
responsible for protecting electrical impulses during transmission within the
central nervous system; the resulting slowed transmission can lead to
difficulties in simple movements and coordination [LH11]. The pathogenesis
of MS is primarily autoimmune, with degeneration mistakenly inflicted by the
immune system, but oxidative stress and neuroinflammation can play a role in
furthering the progression of the disorder [FBDB+ 09]. In attempts to treat
ALS and MS, quercetin has been employed for its free-radical scavenging, anti-
inflammatory, metal ion chelating, and neuroprotective properties. Bhatia et
al. administered quercetin in vitro and measured significant inhibitory effects
on SOD1 fibril aggregation with increasing concentrations of quercetin over a
30-hour period. Inhibitory effects were measured visually using TEM imagery
and numerically by ThT fluorescence [BMSD20]. As SOD1 fibril aggregation
can be a genetic factor in the pathogenesis of ALS, the inhibitory effects shown
can be useful in treating the disorder. Hendriks et al. administered quercetin
in vitro to isolated myelin taken from the brain tissue of adult mice and let
RAW 264.7 cells phagocytose the myelin for 90 minutes before adding
dihydrorhodamine 123 (DHR). As DHR can be used to detect reactive
oxygen species (ROS) formation, the researchers were able to show that
myelin treated with quercetin produced significantly less ROS
during phagocytosis than its untreated
counterpart [HDVVDP+ 03]. However, while there have been in vitro studies
of quercetin-based treatments for ALS and MS, few in vivo studies have been
conducted, potentially due to bioavailability struggles and underwhelming results.
While quercetin’s potential for ALS and MS treatment is evident, the use of drug
delivery systems could enhance quercetin’s bioavailability and its effectiveness
against ALS and MS in vivo.
5 Discussion
Quercetin, a dietary flavonoid, has been heavily researched for its potential
in the treatment of neurodegenerative disorders due to its antioxidative, anti-
inflammatory, metal-ion chelating, and neuroprotective properties. However,
quercetin struggles in bioavailability due to the flavonoid’s lack of aqueous solu-
bility, poor chemical stability, and an unfavorable absorption profile. As a result,
free quercetin struggles to pass through the blood-brain barrier and reach neu-
rons and other biomolecules, preventing the flavonoid from having any effect
on the progression of neurodegenerative disorders. However, the rise of tar-
geted and controlled-release drug delivery applications provides a remedy for
the struggles quercetin faces. Through the use of polymer-based nanoparticles,
lipid-based nanoparticles, and metallic nanoparticles, among other drug
delivery systems, the bioavailability struggles of quercetin can be mitigated,
allowing the flavonoid to reach and have an impact on neuronal degradation within
the nervous system. Additionally, the targeting properties of drug carriers, in
concert with their controlled release manipulability, further enhance quercetin’s
abilities, ultimately allowing for more efficient treatments of neurodegenerative
disorders.
The two most significant benefits that quercetin gains from drug delivery
encapsulations are improved bioavailability and controlled release properties.
Modern research has focused primarily on polymer-based nanoparticles, includ-
ing PLA and PLGA nanoparticles, as well as chitosans for quercetin’s neurode-
generative applications. However, while polymer-based nanoparticles, particu-
larly PLA and PLGA, can be easily engineered chemically by attaching ligands
and other biomolecules for targeted release quercetin delivery, their hydropho-
bicity leads to challenges with surface interactions in aqueous environments,
resulting in difficulties getting the nanoparticle to release the drug at the desired
rate. Notably, while PLGA can be engineered specifically to increase
the nanoparticle’s hydrophilicity by increasing the ratio of glycolyl groups to
lactyl groups, the polymeric nanoparticle still falls short of the level of surface
interactions that lipid-based nanoparticles possess. On the other hand, liposomes
and micelles, which only began to rise in popularity recently after they
were used in the creation of COVID-19 vaccines, have shown more
promise in quercetin delivery for the treatment of neurodegenerative disorders.
Despite prior research being much more limited compared to polymer-based
nanoparticles, lipid-based nanoparticles are the better candidate due to their
hydrophilic surfaces in addition to their surface engineerability through the
attachment of biomolecules for targeting applications. Though they struggle
to solve quercetin’s chemical instability, the problem can be remedied
by implementing PEGylation, which, in turn, can also further enhance the
nanoparticle’s bioavailability and controlled-release applications. Furthermore,
the use of PEGylation can also improve the likelihood of lipid-based nanoparticles
crossing the blood-brain barrier, an essential step in facilitating quercetin’s
interactions with neurons and other biomolecules. Ultimately, future research
should investigate the use of PEGylated nanoparticles, particularly micelles
and liposomes, to deliver quercetin. Additionally, while the drug delivery of
quercetin for the treatment of neurodegenerative disorders is still a relatively
new concept with a very limited research base, especially in vivo, the topic has
shown considerable potential and is likely to grow as a field in the
coming years.
References
[AAA+ 18] L. A. Abdulkhaleq, M. A. Assi, Rasedee Abdullah, M. Zamri-
Saad, Y. H. Taufiq-Yap, and M. N. M. Hezmee. The crucial
roles of inflammatory mediators in inflammation: A review.
Veterinary World, 2018.
Hariri. Quercetin nanoemulsion ameliorates neuronal dysfunction
in experimental Alzheimer’s disease model. Antioxidants,
1986.
[AABM+ 19] Marjan Abri Aghdam, Roya Bagheri, Jafar Mosafer, Behzad
Baradaran, Mahmoud Hashemzaei, Amir Baghbanzadeh,
Miguel de la Guardia, and Ahad Mokhtarzadeh. Recent
advances on thermosensitive and pH-sensitive liposomes
employed in controlled release. Journal of Controlled Release,
2019.
[ABF+ 00] Tajrena Alexi, Cesario V. Borlongan, Richard L. M. Faull,
Chris E. Williams, Ross G. Clark, Peter D. Gluckman, and
Paul E. Hughes. Neuroprotective strategies for basal ganglia
degeneration: Parkinson’s and Huntington’s diseases. Progress
in Neurobiology, 2000.
[AUID+ 07] Orhan Aktas, Oliver Ullrich, Carmen Infante-Duarte, Robert
Nitsch, and Frauke Zipp. Neuronal damage in brain inflamma-
tion. Archives of Neurology, 2007.
[BSB+ 18] Ruma Baksi, Devendra Pratap Singh, Swapnil P. Borse, Rita
Rana, Vipin Sharma, and Manish Nivsarkar. In vitro and in
vivo anticancer efficacy potential of quercetin loaded polymeric
nanoparticles. Biomedicine & Pharmacotherapy, 2018.
[CCE+ 03] Maria Careri, Claudio Corradini, Lisa Elviri, Isabella Nicoletti,
and Ingrid Zagnoni. Direct HPLC analysis of quercetin and trans-resveratrol
in red wine, grape, and winemaking byproducts.
Journal of Agricultural and Food Chemistry, 2003.
[Chi10] Salvatore Chirumbolo. The role of quercetin, flavonols and
flavones in modulating inflammatory cell function. Bentham
Science Publishers, 2010.
[FBDB+ 09] Josa M. Frischer, Stephan Bramow, Assunta Dal-Bianco, Claudia
F. Lucchinetti, Helmut Rauschka, Manfred Schmidbauer,
Henning Laursen, Per Soelberg Sorensen, and Hans Lassmann.
The relation between inflammation and neurodegeneration in
multiple sclerosis brains. Brain, 2009.
[HCS+ 18] Zhi-Qiang Huang, Pan Chen, Wei-Wei Su, Yong-Gang Wang,
Hao Wu, Wei Peng, and Pei-Bo Li. Antioxidant activity and
hepatoprotective potential of quercetin 7-rhamnoside in vitro
and in vivo. Molecules, 2018.
[HDFY+ 17] Mahmoud Hashemzaei, Amin Delarami Far, Arezoo Yari, Her-
avi Heravi, Kaveh Tabrizian, Seyed Mohammad Taghdisi, Sar-
venaz Ekhtiari Sadegh, Konstantinos Tsarouhas, Dimitrios
Kouretas, George Tzanakakis, Dragana Nikitovic, Nikita Yure-
vich Anisimov, Demetrios A. Spandidos, Aristides M. Tsat-
sakis, and Ramin Rezaee. Anticancer and apoptosis-inducing
effects of quercetin in vitro and in vivo. Oncology Reports, 2017.
[HZL+ 20] Yu Hai, Yuanxiao Zhang, Yingzhi Liang, Xiaoyu Ma, Xiao Qi,
Jianbo Xiao, Weiming Xue, Yane Luo, and Yianli Yue. Ad-
vance on the absorption, metabolism, and efficacy exertion of
quercetin and its important derivatives. Food Frontiers, 2020.
[IZS+ 20] Humaira Idrees, Syed Zohaib Javaid Zaidi, Aneela Sabir,
Rafi Ullah Khan, Xunli Zhang, and Sammer-ul Hassan. A re-
view of biodegradable natural polymer-based nanoparticles for
drug delivery applications. Nanomaterials, 2020.
[KAAS16] Hena Khanam, Abad Ali, Mohd Asif, and Shamsuzzaman. Neu-
rodegenerative diseases linked to misfolded proteins and their
therapeutic approaches: A review. European Journal of Medic-
inal Chemistry, 2016.
[KG99] Rao Manjeet K and B. Ghosh. Quercetin inhibits LPS-induced
nitric oxide and tumor necrosis factor-α production in murine
macrophages. International Journal of Immunopharmacology,
1999.
[KMP+ 13] S. S. Karuppagounder, S. K. Madathil, M. Pandey, R. Haobam,
U. Rajamma, and K. P. Mohanakumar. Quercetin up-regulates
mitochondrial complex-I activity to protect against
programmed cell death in rotenone model of Parkinson’s disease
in rats. Neuroscience, 2013.
[KMZM03] Do Kyung Kim, Maria Mikhaylova, Yu Zhang, and Mamoun
Muhammed. Protective coating of superparamagnetic iron ox-
ide nanoparticles. Chemistry of Materials, 2003.
[LBBD17] Sonia Losada-Barreiro and Carlos Bravo-Dı́az. Free radicals
and polyphenols: The redox chemistry of neurodegenerative
diseases. European Journal of Medicinal Chemistry, 2017.
[LBS+ 18] Marija Lesjak, Ivana Beara, Nataša Simin, Diandra Pintać,
Tatjana Majkić, Kristina Bekvalac, Dejan Orčić, and Neda
Mimica-Dukić. Antioxidant and anti-inflammatory activities
of quercetin and its derivatives. Journal of Functional Foods,
2018.
[LJS+ 21] Mengyi Li, Shuai Jiang, Johanna Simon, Marie-Luise Frey,
Manfred Wagner, Volker Mailänder, Daniel Crespy, and Katha-
rina Landfester. Brush conformation of polyethylene glycol de-
termines the stealth effect of nanocarriers in the low protein
adsorption regime. Nano Letters, 2021.
[LLL+ 17] Li Lv, Chunxia Liu, Zhengrong Li, Fangming Song, Guocheng
Li, and Xingzhen Huang. Pharmacokinetics of quercetin-loaded
methoxy poly(ethylene glycol)-b-poly(l-lactic acid) micelle after
oral administration in rats. BioMed Research International,
2017.
[LS13] Jingyan Li and Cristina Sabliov. PLA/PLGA nanoparticles for
delivery of drugs across the blood-brain barrier. Nanotechnology
Reviews, 2013.
[LZJS17] Ting Liu, Lingyun Zhang, Donghyun Joo, and Shao-Cong Sun.
NF-κB signaling in inflammation. Signal Transduction and Targeted
Therapy, 2017.
[Men79] Fredric M. Menger. The structure of micelles. Accounts of
Chemical Research, 1979.
[MPSS+ 17] Lina Clara Gayoso e Ibiapina Moreno, Elena Puerta, José Ed-
uardo Suárez-Santiago, Nereide Stela Santos-Magalhães,
Maria J. Ramirez, and Juan M. Irache. Effect of the oral administration
of nanoencapsulated quercetin on a mouse model
of Alzheimer’s disease. International Journal of Pharmaceutics,
2017.
[MS19] Rubin Thapa Magar and Jae Kyung Sohng. A review on struc-
ture, modifications and structure-activity relation of quercetin
and its derivatives. Korean Society for Microbiology and
Biotechnology, 2019.
[PGL+ 20] R. G. R. Pinheiro, A. Granja, J. A. Loureiro, M. C. Pereira,
M. Pinheiro, A. R. Neves, and S. Reis. Quercetin lipid nanoparticles
functionalized with transferrin for Alzheimer’s disease. European
Journal of Pharmaceutical Sciences, 2020.
[PQF+ 12] Hector Pool, David Quintanar, Juan De Dios Figueroa, Camila
Marinho Mano, J. Etelvino H. Bechara, Luis A. Godı́nez, and
Sandra Mendoza. Antioxidant effects of quercetin and catechin
encapsulated into PLGA nanoparticles. Journal of Nanomaterials,
2012.
[PRY+ 19] Sonia Podvin, Holly T. Reardon, Katrina Yin, Charles Mosier,
and Vivian Hook. Multiple clinical features of Huntington’s
disease correlate with mutant HTT gene CAG repeat lengths and
neurodegeneration. Journal of Neurology, 2019.
[PTK+ 20] Gopal Patel, Neeraj Singh Thakur, Varun Kushwah, Mahesh D.
Patil, Shivraj Hariram Nile, Sanyog Jain, Uttam Chand Baner-
jee, and Guoyin Kai. Liposomal delivery of mycophenolic acid
with quercetin for improved breast cancer therapy in SD rats.
Frontiers in Bioengineering and Biotechnology, 2020.
[PVK+ 22] Paraskevi Papakyriakopoulou, Nikolaos Velidakis, Elina Khat-
tab, Georgia Valsami, Ioannis Korakianitis, and Nikolaos P. E.
Kadoglou. Potential pharmaceutical applications of quercetin
in cardiovascular diseases. Pharmaceuticals, 2022.
[QGY+ 22] Xueju Qi, Cong Gao, Chuanjin Yin, Junting Fan, Xiaochen
Wu, Guohu Di, Jing Wang, and Chuanlong Guo. Development
of quercetin-loaded PVCL–PVA–PEG micelles and application in
inhibiting tumor angiogenesis through the PI3K/AKT/VEGF pathway.
Toxicology and Applied Pharmacology, 2022.
[QZW+ 16] Waseem Akhtar Qureshi, Ruifang Zhao, Hai Wang, Yanping
Ding, Ayesha Ihsan, Ayeesha Mujeeb, Guangjun Nie, and
Yuliang Zhao. Co-delivery of doxorubicin and quercetin via
mPEG–PLGA copolymer assembly for synergistic anti-tumor efficacy
and reducing cardio-toxicity. Science Bulletin, 2016.
[SLL+ 21] Yumei Sun, Chang Li, Zhonghua Li, Aishao Shangguan, Jinhe
Jiang, Wei Zheng, Shujun Zhang, and Qigai He. Quercetin as
an antiviral agent inhibits the pseudorabies virus in vitro and
in vivo. Virus Research, 2021.
[SLZ+ 16] Dongdong Sun, Nuan Li, Weiwei Zhang, Zhiwei Zhao, Zhipeng
Mou, Donghui Huang, Jie Liu, and Weiyun Wang. Design of
PLGA-functionalized quercetin nanoparticles for potential use in
Alzheimer’s disease. Colloids and Surfaces B: Biointerfaces,
2016.
[VHS+ 11] Fabiana T. M. C. Vicentini, Tianyuan He, Yuan Shao, Maria
J. V. Fonseca, Waldiceu A. Verri, Gary J. Fisher, and Yiru
Xu. Quercetin inhibits UV irradiation-induced inflammatory cytokine
production in primary human keratinocytes by suppressing
NF-κB pathway. Journal of Dermatological Science, 2011.
[WHH+ 21] Wen-Wen Wang, Ruiyu Han, Hai-Jun He, Jia Li, Si-Yan Chen,
Yingying Gu, and Chenglong Xie. Administration of quercetin
improves mitochondria quality control and protects the neurons
in 6-OHDA-lesioned Parkinson’s disease models. Aging, 2021.
[WLL+ 11] Kui Wang, Rui Liu, Jingyi Li, Jiali Mao, Yunlong Lei, Jin-
hua Wu, Jun Zeng, Tao Zhang, Hong Wu, Lijuan Chen, Can-
hua Huang, and Yuquan Wei. Quercetin induces protective
autophagy in gastric cancer cells: Involvement of Akt-mTOR-
and hypoxia-induced factor 1α-mediated signaling. Autophagy,
2011.
[WSM+ 16] Weiyou Wang, Cuixia Sun, Like Mao, Peihua Ma, Fuguo Liu,
Jie Yang, and Yanxiang Gao. The biological activities, chemi-
cal stability, metabolism and delivery systems of quercetin: A
review. Trends in Food Science & Technology, 2016.
[WSRE04] Robert J. Williams, Jeremy P. E. Spencer, and Catherine Rice-
Evans. Flavonoids: antioxidants or signalling molecules? Free
Radical Biology and Medicine, 2004.
[WYZ+ 18] Shengan Wang, Jiaying Yao, Bo Zhou, Maria T. Chaudry,
Mi Wang, Fenglin Xiao, Yao Li, and Wenzhe Yin. Bacterio-
static effect of quercetin as an antibiotic alternative in vivo and
its antibacterial mechanism in vitro. Journal of Food Protec-
tion, 2018.
[XWG+ 16] Li Xiang, Handong Wang, Yongyue Gao, Liwen Li, Chao Tang,
Guodao Wen, Youqing Yang, Zong Zhuang, Mengliang Zhou,
Lei Mao, and Youwu Fan. Quercetin induces mitochondrial
biogenesis in experimental traumatic brain injury via the PGC-1α
signaling pathway. American Journal of Translational Research,
2016.
[ZLA+ 11] Feng Zhang, Emma Lees, Faheem Amin, Pilar Rivera Gil, Fang
Yang, Paul Mulvaney, and Wolfgang J. Parak. Polymer-coated
nanoparticles: A universal tool for biolabelling experiments.
Small, 2011.
Art Therapy’s Effectiveness and its Role in
Treating Neurological Conditions
∗
Simryn Patel
October 14, 2023
Abstract
This paper explores the extensive psychological and neurological effects
of art therapy. Additionally, it offers art therapy as a form of treatment
for individuals suffering from PTSD and Alzheimer’s disease. Both of
these conditions pose unique dangers; PTSD patients suffer from psycho-
logical trauma and Alzheimer’s patients suffer from degenerative activity
in the brain. A major factor that reinforces art therapy’s credibility is its
ability to express nonverbal memories and emotions associated with them.
Additionally, certain parts of the brain are activated during art therapy,
which can prove useful in treating PTSD and Alzheimer’s. This paper considers
the mechanisms and processes of art therapy along with its effects
to emphasize its potential in treating symptoms and improving overall
well-being.
1 Introduction
After Frida Kahlo was in a full-body cast for three months due to a debilitating
bus accident, she resorted to art to pass the time and alleviate the pain. Once
she physically recovered, Kahlo completed many paintings that reflected her
traumatic experience. She said, “My painting carries with it the message of
pain” (Svoboda, 2022). Frida Kahlo is just one of many people
who have drawn on the strength of art therapy to treat certain neurological
conditions and improve well-being.
Art therapy is a type of therapy that uses creative methods of expression
with the guidance of an art therapist. There are many types of art therapy,
but I will be focusing on visual arts therapy such as painting and drawing. Art
therapy has certain psychological effects on patients, such as strengthening a
mind-body connection and improving well-being. Art-making also activates the
hippocampus, amygdala, visual cortex, and prefrontal cortex during the creative
process, which can help treat certain neurological conditions such as PTSD and
Alzheimer’s.
∗ Advised by: Dr. Ellen Robertson of the University of Cambridge
In this paper, I will refer to specific neurological conditions to demonstrate
the positive psychological and neurological effects of art therapy. Different types
of neurological conditions, ranging from strokes to Alzheimer’s disease, affect
up to one billion people worldwide. An estimated 6.8 million people die every
year as a result of these neurological disorders. I focus especially on art ther-
apy’s effect on PTSD and Alzheimer’s disease, but I make references to other
neurological conditions as well.
Overall, art therapy is an effective method for combating neurological disorders,
including PTSD and Alzheimer’s. What makes art therapy effective
are the psychological and neurological changes that take place in an individual
during a session. Psychological changes include the expression of emotions,
improved mood, and more. Neurological changes include changes to neural
connections, the prefrontal cortex, and more. Together, these
psychological and neurological changes account for the success
of the emerging field of art therapy.
that integrates verbal and nonverbal processes by involving both left and right
hemisphere functions. This technique designed by McNamee (2004) is done by
using both hands in an effort to stimulate memories and experiences that are
contained in both sides of the brain (Talwar, 2007).
Two-factor Theory of Emotions, once a person feels or experiences physiologi-
cal arousal, they then interpret the arousal to label it as an emotion (Yarwood,
n.d.). From a more neurological standpoint, sensory information is received by
the primary somatosensory cortex in the cerebral cortex. It is then transferred to
the amygdala, where the information is processed into emotions. Therefore, the
sensory information received during art therapy can lead to better access and
acknowledgment of previously blocked emotions (Czamanski-Cohen & Weihs,
2016).
Images produced during art therapy not only retrieve but also improve emotions
and attitudes. Holmes, Mathews, Dalgleish, and Mackintosh (2006) hypothesized
that images have the power to increase ratings of emotions
and can have a more positive effect than verbal processing. To test this
hypothesis, the researchers presented participants with numerous scenarios in
which it was initially ambiguous whether the outcome would be positive.
The participants were asked either to imagine these events or to
listen to the same descriptions while thinking about their meaning. The events
were in paragraph form, and the first half of each paragraph began with a
sentence that had a negative connotation. For example, a beginning may be “You
are at home alone watching TV. You were dozing and suddenly woke up under
the impression that you heard a frightening noise and then realize. . . ”. The
rest of the sentence is completed by both groups. For
example, a positive ending was “and then realize with relief that it was your
partner returning home.” Researchers found that the
participants in the imagery group reported more positive effects of the scenarios
and rated the descriptions as being more positive than their counterparts.
This may have occurred because participants in the
verbal condition focused more on the negative components of the paragraphs
(Holmes et al., 2006). So, images produced during art therapy may
foster positive attitudes and emotions.
Visual communication (patient expression through artwork and images) also
has a greater effect on the patient’s memories and helps retrieve them, as
evidenced by the picture superiority effect. In a 1973 study that explored
the effect of pictures versus words on memory, Paivio and Csapo presented
participants with pairs of words, pairs of pictures, or pairs of one word and one
picture. Then, participants were tested on their memory for the previously
presented stimuli. Results showed that memory recall was better for the pairs
with pictures than for those with words alone. According to Paivio’s “dual coding”
theory, images hold more power than words because pictures generate both a visual
and a verbal response, whereas words are less likely to generate images for the
participant (Paivio & Csapo, 1973). In any type of therapy, accessing memories
is important for expressing previous experiences. Because Paivio’s study showed
that pictures are useful for accessing memories, the pictures produced
during art therapy can facilitate a better expressive experience for the patient.
Another example of the picture superiority effect comes from patients with
Alzheimer’s disease who were unable to recall memories of loved ones when
hearing their names but were able to recognize them when presented with a picture
(Ally et al., 2009). In the same study, healthy adults (controls), patients
diagnosed with MCI (mild cognitive impairment that results in memory and
thinking problems due to old age), and patients diagnosed with mild Alzheimer’s
disease were assessed to measure memory for pictures versus words. In each
group, pictures yielded higher recognition accuracy, meaning that participants
recognized pictures more accurately than words. This confirms
the picture superiority effect (Ally et al., 2009). These results suggest that the use
of pictures as communication and treatment in the field of art therapy can be
crucial, especially for trauma and dementia patients, because art can help the
patient recall certain memories.
Figure 1: Picture Superiority Effect Study
Although images produced during the art-making phase are important, com-
bining this with an examination of the artwork for any hidden emotions or
messages can strengthen a mind-body connection (Cuellar, 2007). The mind-
body connection is a concept that suggests that processes of the mind, such as
thinking and feeling, are rooted in one’s sensory and motor experiences. Art uti-
lizes this connection as art-making itself is sensory (it induces body sensations
and emotions), and then interpreting artwork requires thinking and emotions
(Czamanski-Cohen & Weihs, 2016). Numerous studies provide evidence for the
mind-body connection, one of them focusing on the reduction of cortisol levels
and its connection to participants’ responses following visual art making. The
“mind” in this case is the response after the process, and the “body” is the cor-
tisol level of the participants. Cortisol is most commonly known as the body’s
stress hormone. In the study, 39 participants provided saliva samples to deter-
mine cortisol levels before and after making art. During the art-making process,
participants had artistic liberty and were not confined to a specific subject or
material. At the end of the session, participants provided written responses
about their experiences and provided another saliva sample. As shown by the
graph, average cortisol levels after art-making were significantly lower than be-
fore art-making. This result matched the participants’ written responses, as
most stated they felt relaxed, relieved, excited, and fulfilled. The mind-body
connection shown in this study confirms that art therapy affects attitude, well-
being, and emotions (mind) which can have an impact on health (body) and
vice-versa (Kaimal, Ray, & Muniz, 2016).
Additionally, colors play a large role in identifying the patient’s mood and
mental health, and they also transform the patient’s therapeutic experience.
In multiple studies, different people with varying emotional states interacted
with color in different ways. In one study, Wadeson (1971) noticed that people
diagnosed with depression used significantly less color in their paintings than
other patients. In a case study of a girl with leukemia, the patient, who
was not feeling well, used a great deal of red and black, which indicated an
overflow of negative feelings. That patient died six months later (Cotton, 1985).
These studies suggest that color can reflect the mental state of patients.
had previously been deprived of the color red (which was believed to induce
madness), but then they received a small red string. The results showed that
the patients became more animated and as a result increased their activity
and work output (Emery, 1929). These results suggest that the color a patient
uses is not only a form of expression but can also lead to changes in their mood
and behavior.
However, there can be many explanations for color depending on the patient.
In one study, a 31-year-old suicidal woman was asked to draw something and
provide an explanation of the meaning of the colors. For red specifically, she
stated that she felt strength, power, courage, and joy (Lev-Wiesel, n.d.). In
another study also examining the color red, researchers tested for
connections between dominance (male dominance and testosterone levels) and
the color red. Red was hypothesized to signify dominance,
and blue was hypothesized to signify relaxation. Participants were presented
with words whose meanings related either to dominance or to tranquility.
These words were displayed
in a blue or red font, and participants were asked to classify them as
dominance-related or rest-related while being timed. Results showed that
participants made fewer errors when categorizing dominance-related words shown
in red than when categorizing dominance-related words shown in blue
(Mentzel et al., 2017). Taken together, this experiment
and the case of the 31-year-old woman show some overlap in the meaning of red
(power, strength, and dominance), but individual experiences
may lead colors to carry different meanings for different people
(shown by red’s meaning as courage and joy for the woman). These colors give
the patient the opportunity to express moods and emotions that they cannot
express verbally. Therefore, once an art piece is complete with certain colors,
and the patient feels an initial sense of self-expression, an explanation about
the whole artwork may come more easily. This explanation along with the use
of colors makes the art therapy session more unique for the patient, as they
are able to give their own, personal explanation for the colors based on their
individual experiences.
4.1 The Hippocampus
Art therapy activates the hippocampus during the creative process. In a 2013
study, researchers examined the effect of hippocampal amnesia (due to lesions)
on creative thinking. This is relevant to art therapy’s effect on the hippocampus
because art therapy requires a great deal of creative thinking. In the study,
the participants (those with normal and damaged hippocampi) completed the
Torrance Tests of Creative Thinking (TTCT). In these tests, they were required
to complete both the verbal and figural parts of the experiment that tested
creative thinking. In the verbal form, they were given many prompts that
forced them to creatively problem-solve. Prompts included “Generate ways to
improve a toy so that it is more fun to play with,” “Generate alternative uses
for a common object (e.g., a cardboard box),” and “Generate hypotheses about
potential benefits or problems related to an improbable situation (e.g., if clouds
had strings attached to them).” Next, in the figural form, participants were
asked to complete an incomplete drawing. Examples include
ten incomplete line contours and 30 repeated parallel line segments. Once they
completed their work, the participants were asked to give their artwork a unique
title. After both sections were complete, the researchers scored and examined
each participant’s answers. Scores depended on the fluency and originality of
the answers. In the verbal section, participants with hippocampal
amnesia (the study refers to them as the AM group) scored significantly lower
than their healthy counterparts. For example, when asked to think of creative
uses for cardboard boxes, one healthy participant came up with 26 uses, 23 of
which were unique (e.g., building a suit of armor). On the other hand, one
amnesic participant came up with only 2 uses: recycling the boxes
and making a fort (Duff et al., 2013).
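To make the two scoring dimensions concrete, the following is a minimal sketch of how fluency and originality could be tallied for one participant’s answers. It is not the published TTCT scoring procedure, which is more detailed. The function name, the definition of fluency as the number of distinct ideas, the definition of originality as the number of ideas outside a reference list, and the reference list itself are all assumptions made for illustration.

```python
def _normalize(text):
    """Lower-case and trim a response so trivial variants count as one idea."""
    return text.strip().lower()


def score_alternative_uses(responses, common_uses):
    """Return (fluency, originality) for one participant's list of uses.

    Assumed definitions (not the official TTCT rules):
      fluency     = number of distinct ideas produced
      originality = number of those ideas that are not in `common_uses`
    """
    ideas = {_normalize(r) for r in responses if r.strip()}
    common = {_normalize(c) for c in common_uses}
    return len(ideas), len(ideas - common)


# Hypothetical reference list of "common" uses, invented for illustration only.
COMMON_USES = ["recycle it", "make a fort", "store things"]

# The two uses reported for one amnesic participant.
print(score_alternative_uses(["recycle it", "make a fort"], COMMON_USES))  # (2, 0)

# One use reported for a healthy participant, plus one common use.
print(score_alternative_uses(["build a suit of armor", "make a fort"], COMMON_USES))  # (2, 1)
```

Under these assumptions, a participant who lists only common uses can match another participant on fluency while scoring lower on originality, which is the kind of difference the study reports between the two groups.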
In the figural section, the healthy participants also scored higher than the
AM group. One prompt asked participants to create an image that included the
shape of a large black oval and to add new ideas around it so that the picture
told an exciting story. Given this prompt,
one healthy participant turned the oval into a drawing of a golf course complete
with signs for parking, a clubhouse, Tiger Woods, and more. Another healthy
participant turned the oval into a hot air balloon that takes people for rides
above the city. On the other hand, a participant from the AM group turned the
oval into a bug. Another participant from the AM group used the shape as an
egg and drew a chicken above it (Duff et al., 2013). This study showed that the
hippocampus is extremely important in the creative process, which suggests that
it can be activated during creativity in art therapy.
Figure 4: Drawing prompts
Further evidence for the activation of the hippocampus in art
therapy comes from King & Kaimal (2019), who measured brain activity through
electroencephalography (EEG), a non-invasive method that allows for free
movement. In one study involving EEG, patients worked with clay and drawing, and
there was activation in brain regions involved in memory processing and
meditative states (including the hippocampus) (King & Kaimal, 2019). Although
there is little evidence for direct causation, activation of the hippocampus
may be correlated with easier
access to memories and emotions during the creative process.
neural activity in the frontal lobe during the execution of artwork (Talwar,
2007). Similarly, in an experiment involving patients coloring, doodling, and
free-drawing, fNIRS scans (functional near-infrared spectroscopy that measures
brain activity) showed significant activation of the medial prefrontal cortex
(Kaimal et al., 2017). In another study by Zeki (2011), participants under-
went brain scans while being shown images of paintings. When participants
viewed paintings that they deemed beautiful, fMRI scans showed that blood
flow increased by almost 10 percent to the medial orbitofrontal cortex region
of the brain, a part of the prefrontal cortex associated with pleasure. The
increase in blood flow to the medial orbitofrontal cortex is similar to that seen
when looking at a loved one (ACRM, 2020). This activation of the prefrontal cortex
is essential in stimulating the reward center of the brain, which contributes to
feelings of accomplishment (Chau et al., 2018). In patients with neurological
conditions, these feelings of accomplishment and purpose due to the activation
of the prefrontal cortex can prove vital in recovery.
The activation of the prefrontal cortex (PFC) during art therapy also
contributes to the lateralization and stimulation of both hemispheres of the brain.
The PFC itself does not directly make the connections; rather, the corpus
callosum, a bundle of nerve fibers, facilitates communication between both
hemispheres of the brain by connecting the two separate prefrontal cortices
(Corpus Callosum - an Overview | ScienceDirect Topics, n.d.). The left hemisphere
is associated with language, speech, analytical thinking, and sequential process-
ing. The right hemisphere is associated with visual motor activities, intuition,
emotions, and sensory skills. The integration of both hemispheres of the brain is
essential for different cognitive processes including attention, decision-making,
and emotional regulation. For neurological conditions (such as PTSD), these
cognitive processes from lateralization can prove useful if they are strengthened
through art therapy, as will be discussed later. Art therapy can promote bilat-
eral stimulation through a technique in which the patient utilizes both dominant
and non-dominant hands in the art-making process. This technique engages
both hemispheres because the right hand is controlled by the left hemisphere of the
brain, and vice versa (Malchiodi, 2003).
Both hemispheres can also be stimulated by the two-part process of art
therapy. The first part is the art-making and the second part is the explanation
of the artwork. The left hemisphere allows for an explanation of the image
produced by (mostly) the right hemisphere from the first step (Talwar, 2007).
As stated above, the connection between both hemispheres in art therapy is
important for attention, decision-making, and emotional regulation. When this
connection is not present, there can be dire consequences for a patient who is
already struggling with a neurological condition.
The negative consequences of having no connection between the right and
left hemispheres of the brain can be seen, in one case, from the agenesis of the
corpus callosum (AgCC). This disorder presents itself at birth when the tissue
that connects the left and right sides of the brain is partially or completely
missing. The purpose of highlighting this extreme example is to show the ef-
fects of having no connection between both hemispheres in the brain to help
demonstrate why lateralization is important. In a study conducted by Lábadi
& Beke (2017), participants included 18 children between the ages of 6 and 8
with agenesis of the corpus callosum and 18 typically developing children who
were matched by IQ, age, gender, and education. Lábadi & Beke examined both
groups’ emotional and mental state recognition with a procedure called the
“Faces Test”. Each child was shown 20 photographs of an actress posing: 10
photos of basic emotions and 10 photos of complex mental states. Under each
photo, two words were typed, but only one described the emotion or mental
state the actress was depicting. The experimenter read the two words, and the
child was asked to choose the word that best represented the actress’s emotion
or mental state in the picture. A correct choice earned one point, and
an incorrect choice earned no points. The researchers
found that children with AgCC were less accurate and showed overall
poorer performance in recognizing emotional states than the control group. As
shown, the absence of the corpus callosum, and thus of any connection between
the hemispheres of the brain, limits these children’s understanding of complex
social cognitive functions (Lábadi & Beke, 2017).
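The scoring rule described above (one point for a correct choice, none for an incorrect one) can be sketched in a few lines. This is only an illustration under stated assumptions, not the researchers’ actual materials or software: the `Trial` structure, the function name, and the example word pairs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    correct_word: str  # word that matches the depicted emotion or mental state
    foil_word: str     # the second, non-matching word shown under the photo
    chosen_word: str   # word the child selected


def score_faces_test(trials):
    """Total score: one point per correct choice, zero for an incorrect one."""
    return sum(1 for t in trials if t.chosen_word == t.correct_word)


# Hypothetical trials, invented purely for illustration.
example_trials = [
    Trial("happy", "sad", "happy"),             # correct   -> 1 point
    Trial("afraid", "angry", "angry"),          # incorrect -> 0 points
    Trial("scheming", "admiring", "scheming"),  # correct   -> 1 point
]

print(score_faces_test(example_trials))  # 2
```

With 20 photographs per child, a score computed this way ranges from 0 to 20, and group accuracy can then be compared between the AgCC and control groups.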
However, when art therapy connects both hemispheres, there can be better
social awareness and behavior as a result (Lábadi & Beke, 2017). Improved
social relations through art therapy can not only help many neurobehavioral
conditions (including autism, ADHD, and obsessive-compulsive disorder) but
also give a sense of social purpose when one’s surroundings are better under-
stood. Therefore, art therapy promotes connections between both hemispheres
of the brain that can improve the well-being of patients struggling with neuro-
logical conditions.
4.3 Neuroplasticity
In addition to the connections involved in bilateral stimulation, new
neural connections and pathways can be formed when making art
(Konopka, 2014). Neuroplasticity is the brain’s ability to form and organize
neural networks, including after a learning experience or after an injury
(Puderbaugh & Emmady, 2023).
In a 2014 study, Belkofer, Vaughan Van Hecke & Konopka measured the
effects on the brain after 20 minutes of drawing. The study involved the use of
an EEG to investigate the differences in patterns of brain activity among artists
and non-artists. Results showed that for artists, there was strong activation in
the left posterior temporal, parietal, and occipital regions of the brain. For non-
artists, there was activation in the right parietal and right prefrontal areas of the
brain. The authors believed that the different areas activated between artists
and non-artists were due to the non-artists making new connections because of
learning (Belkofer et al., 2014). These new connections formed when completing
art can prove useful in Alzheimer’s patients, which will be explained in more
depth later on.
5 Neurological conditions - PTSD
5.1 What is PTSD?
PTSD results from exposure to emotionally disturbing or life-threatening events.
As a result, there can be lasting effects on someone’s mental, physical, emo-
tional, and social well-being. Traumatic experiences include but are not limited
to physical abuse, poverty, childhood neglect, and racism (What Is Trauma?,
2018).
had heightened awareness and increased BOLD signals to the medial prefrontal
cortex when presented with a fearful image. For the same stimuli, BOLD sig-
nals to the amygdala were measured as well. Researchers found that PTSD
patients exhibited exaggerated amygdala responses (Shin, 2005). Essentially, in
PTSD, the amygdala (the survival center) goes into overdrive as if the patient
were experiencing that trauma for the first time. At the same time, the pre-
frontal cortex also becomes suppressed so there is less capability to control any
emotions, such as fear (How Does Trauma Affect the Brain?, n.d.).
In another study, J. Douglas Bremner (1999) looked at the blood flow of
Vietnam combat veterans when they were exposed to combat-related and neu-
tral pictures/sounds. Researchers used positron emission tomography (PET)
which uses radioactive substances to measure blood flow. In the study, there
were Vietnam combat veterans with PTSD (n=10) and Vietnam combat veterans
without PTSD (n=10). Individuals were shown neutral slides (winter scenes
with nonverbal music) and combat slides (actual violent photographs from
Vietnam). Scans showed that when veterans with PTSD were exposed to traumatic
images, there was decreased blood flow in the medial prefrontal cortex. As men-
tioned in the previous paragraph, the hyperresponsivity of the amygdala results
in decreased function of the prefrontal cortex as a way to cope with trauma
(Bremner et al., 1999).
The prefrontal cortex works alongside the amygdala (as explained previously
with the Singer-Schachter theory of emotions) to process emotional stimuli. So,
since the prefrontal cortex is involved in memory, emotions, and social behavior,
PTSD patients have difficulties in these areas when recalling a traumatic event
(Kong et al., 2013).
5.2.3 Hippocampus
With PTSD, there are impairments and other effects on the hippocampus as
well. MRI (magnetic resonance imaging) was performed on male miners involved
in coal mine gas explosions. There were 14 with PTSD and 25 without. PTSD
patients showed a decreased gray matter volume in the hippocampus compared
to their counterparts as shown in the graph to the right. Impairments in the
hippocampus imply impairments in learning and memory (Zhang et al., 2014).
who worked in the field of mental health, and a recent experience with a client
reminded her of her own childhood neglect and rejection. She had attended
talk therapy previously but stated, “I need to work with the image; words
are not enough” (p. 31). At the end of the session, she drew a horse, which
represented freedom, strength, and wholeness. She rated how she felt as a 7
(Talwar, 2007). As shown, art therapy allows for self-expression which allows
for access to blocked traumatic memories.
6 Neurological Conditions - Alzheimer’s Disease
6.1 What is Alzheimer’s?
Alzheimer’s disease is a type of dementia that affects memory, behavior, and
cognition. Most people with Alzheimer’s are 65 and older. This disease worsens
with time and has no cure. After some time, there are difficulties with speaking,
swallowing, and walking, which can lead to difficulties living independently
and eventually death (What Is Alzheimer’s Disease? Symptoms & Causes |
Alz.Org, n.d.).
study measured an abstract feeling of well-being, participants clearly showed
increased positive attitudes. So, although art therapy cannot cure Alzheimer’s,
it can lead to an improved quality of life because of self-expression and creativity.
Art therapy, however, may be able to slow the progress of the early stages of
Alzheimer’s. A potential use of art therapy for Alzheimer’s patients could be to
increase neuroplasticity in an attempt to strengthen neural connections (Koch
& Spampinato, 2022). There is little empirical research on
this topic, but it should be pursued given art therapy’s effect on neuroplasticity
as mentioned earlier.
in favor of art therapy.
8 Conclusion
Art therapy, although a relatively new field, has the potential to make significant
developments in treating neurological conditions. The multitude of positive
psychological and neurological effects on the body as mentioned above supports
art therapy as an effective way to treat patients diagnosed with neurological
conditions such as PTSD and Alzheimer’s (Kinney & Rentz, 2005; Talwar,
2007).
Psychologically, art therapy works with images that do not solely require
verbal communication from the patient. This can help with PTSD, where ac-
cessing painful memories can be difficult (Meekums, 1999). Art as a way of
communicating can also help to uncover emotions, as shown by the pairing of
the Singer-Schachter theory of emotions with the two-part process of art ther-
apy (Yarwood, n.d.). Furthermore, memory can be strengthened through visual
communication (artwork) as shown by the picture superiority effect (Paivio &
Csapo, 1973). Physical materials/colors and images are also included in the
visual aspect of art therapy and can result in positive attitudes and emotions
(Holmes et al., 2006). In all, the visual aspect of art therapy is not only an alter-
native to talking, but it can contribute to better access to emotions, memories,
and more.
These psychological effects of art therapy can not only benefit PTSD and
Alzheimer’s patients but others as well. For example, patients with depression
can have improved moods. People with Parkinson’s can have strengthened
memory. People with anxiety may be able to regulate and control their emotions.
Many people suffer and die from these various neurological conditions every day,
but art therapy can help to minimize these struggles.
After the psychological effects of art therapy, I covered the neurological ef-
fects. I went on to explain the effects on the hippocampus and the prefrontal
cortex. Overall, the activation of certain areas during art therapy may be linked
to improving the function of these areas that are affected by certain neurological
conditions.
After the psychological and neurological effects of art therapy, I delved into
specific conditions that pertained to the positive results of art therapy. These
two conditions are PTSD and Alzheimer’s disease. PTSD results from exposure
to a traumatic experience and can result in stress, damage to the prefrontal
cortex, and impairments in the hippocampus (What Is Trauma?, 2018). Art
therapy can prove useful not only because it activates the areas that are impaired
during a traumatic event, but also because it can access the nonverbal traumatic
memory through artistic expression (Talwar, 2007). Alzheimer’s is a fatal type
of dementia that affects memory, behavior, and thinking and progresses over
time (What Is Alzheimer’s Disease? Symptoms & Causes | Alz.Org, n.d.). Art
therapy is not a cure for dementia, but it can improve well-being by improving
patients’ moods. There may also be connections with neuroplasticity, but there
has not been much empirical research done on this topic.
There are limitations to art therapy. Art therapy is not a cure for most
conditions. In PTSD, the trauma caused by the event can never be fully
undone, but art therapy can improve the way that the patient deals with
their trauma. In dementia, art therapy cannot cure brain atrophy but can
improve well-being. Another possible downside of art therapy is that there can
be negative consequences if treatment is suddenly stopped (Hattori et al., 2011).
In addition, some people may feel anger and reluctance toward
art therapy because they view themselves as bad at art and find it
frustrating. A limitation of some art therapy studies is that there may be a
selection or participation bias. This bias may be caused by a stigma associated
with art therapy and the idea that it is pseudoscience.
However, if art therapy is not attempted, people miss out on the opportuni-
ties for self-expression and neurological activation that are offered. Other types
of therapies, such as verbal-based therapies, that do not involve the creative
process may not result in the same outcomes as art therapy. Thankfully, art
therapy is a broad genre that includes many mediums and is adaptable to fit
a patient’s needs (Hu et al., 2021). Existing research already shows how art
therapy can help treat certain conditions, and there may be correlational evi-
dence that art therapy can indirectly prevent deaths. For example, patients who
are diagnosed with depression may be suicidal, and since art therapy improves
well-being, it may help to prevent suicides. Also, if someone is feeling suicidal
because of poor social relations, art therapy is a great resource to improve rela-
tionships and feelings of social acceptance. Students dealing with a lot of stress
can reduce their cortisol levels through art therapy and reduce the risk of dying
from heart disease (Stress Can Increase Your Risk for Heart Disease - Health
Encyclopedia - University of Rochester Medical Center, n.d.). The implications
of art therapy are numerous, but to uncover more we should encourage more
people to engage with this research.
References
[All09] Ally, B. A., Gold, C. A., & Budson, A. E. The picture superiority effect
in patients with Alzheimer’s disease and mild cognitive impairment.
Neuropsychologia, 2009.
[Arc] Stress disrupts the architecture of the developing brain.
[Arn09] A.F.T. Arnsten. Stress signaling pathways that impair prefrontal cor-
tex structure and function. Nature Reviews, 2009.
[Bel14] Van Hecke A. Konopka L. Belkofer, C. Effects of drawing on alpha
activity: A quantitative eeg study with implications for art therapy.
Art Therapy, 2014.
[Bel22] Van Hecke A. Konopka L. Belkofer, C. Research review shows self-
esteem has long-term benefits. UC Davis, 2022.
[Bre99] Staib L. H. Kaloupek D. Southwick S. M. Soufer R. & Charney D. S.
Bremner, J. D. Research review shows self-esteem has long-term ben-
efits. Biological Psychiatry, 1999.
[Kai17] Ayaz H. Herres J. Dieterich-Hartwell R. Makwana B. Kaiser D. H.
Nasser J. A. Kaimal, G. Functional near-infrared spectroscopy assess-
ment of reward perception based on visual self-expression: Coloring,
doodling, and free drawing. The Arts in Psychotherapy, 2017.
[Tal07] S. Talwar. Accessing traumatic memory through art making: An art
therapy trauma protocol (attp). The Arts in Psychotherapy, 2007.
Detecting Causality by Using Alexander
Quandles and Alexander-Conway Polynomial
∗
Nikhila Pasam
October 16, 2023
Abstract
The paper by Samantha Allen and Jacob H. Swenberg suggests that
the Jones polynomial is likely able to detect causality in (2+1)-dimensional
globally hyperbolic spacetimes, whereas the Alexander-Conway polynomial
cannot. The natural question is then what extra information needs to be
added to the Alexander-Conway polynomial so that it can also detect
causality. In this paper, I compute colorings by several Alexander
quandles for the connected sum of 2 Hopf links and for the Allen-Swenberg
link. The colorings do not distinguish between the two links, so they
cannot detect causality.
1 Introduction
In a globally hyperbolic spacetime X, which has (2+1) dimensions and is of
the form Σ × R, where Σ is a Cauchy surface homeomorphic to R², we can
define N_X as the space that contains all light rays within X. This space
can be represented as a solid torus. When a point P ∈ X is considered,
its light cone intersects Σ × R in a circular curve, defining a knot S_P in
the solid torus. According to the Low Conjecture, as proved by Chernov
and Nemirovski [VC20], two points P and Q are causally related if and only
if S_P and S_Q are linked within N_X. Therefore, link invariants that determine
whether S_P and S_Q are linked within N_X can detect causality. Findings by
Allen and Swenberg [Joy82] suggest that the Jones polynomial is likely able to
detect causality, while the Alexander-Conway polynomial may be insufficient.
They identified a link that relates to possibly causally connected events, which
the Alexander-Conway polynomial was unable to distinguish. In this paper,
I check whether the Alexander quandle can distinguish the two examples of
Allen and Swenberg.
∗ Mission San Jose High School, Advised by: Vladimir Chernov of Dartmouth College and
2 Quandles and Cocycles
2.1 Quandle [Ame07] or [Cro04]
A quandle is a set X with an operation ▷ satisfying the properties:
1. x ▷ x = x for all x ∈ X;
2. For all x, y ∈ X, there exists a unique z ∈ X such that x = z ▷ y;
3. (x ▷ y) ▷ z = (x ▷ z) ▷ (y ▷ z) for all x, y, z ∈ X, which is called
self-distributivity.
In the figure on the left, the arc labeled x crosses under the arc labeled y
from left to right; therefore, the resulting arc is labeled x ▷⁻¹ y. In the
diagram on the right, the arc labeled x crosses under the arc labeled y from
right to left; therefore, the resulting arc is labeled x ▷ y. To verify that the
knot quandle is a knot invariant, we check that the Reidemeister moves do not
change the quandle colorings.
If the arcs of a knot diagram are labeled by elements of a quandle, the labels
must satisfy a specific rule at each crossing. Before and after each
Reidemeister move, there must be a bijection between the labelings. By
comparing the number of labelings of two diagrams, we can sometimes tell
whether they represent the same knot: if the numbers are different, the
diagrams correspond to different knots; if the numbers are equal, no
conclusion can be drawn.
The Alexander quandle on Zn with parameter t is given by the operation
x ▷ y = tx + (1 − t)y
If we take the affine Alexander quandle over Zp, then the cocycle would be
(x − y)^(p^r) [CN10].
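The three quandle axioms can be checked by brute force for a small Alexander quandle. The following is a minimal sketch, assuming the operation x ▷ y = tx + (1 − t)y on Zn; the function names `alexander_op` and `is_quandle` are illustrative and do not come from the paper.

```python
from itertools import product


def alexander_op(t, n):
    """Return the operation x ▷ y = t*x + (1 - t)*y on Z_n."""
    return lambda x, y: (t * x + (1 - t) * y) % n


def is_quandle(op, n):
    """Check the three quandle axioms on Z_n by exhaustive search."""
    elems = range(n)
    # Axiom 1: x ▷ x = x
    idempotent = all(op(x, x) == x for x in elems)
    # Axiom 2: for each y, the map z -> z ▷ y is a bijection
    invertible = all(len({op(z, y) for z in elems}) == n for y in elems)
    # Axiom 3: (x ▷ y) ▷ z = (x ▷ z) ▷ (y ▷ z)
    distributive = all(
        op(op(x, y), z) == op(op(x, z), op(y, z))
        for x, y, z in product(elems, repeat=3)
    )
    return idempotent and invertible and distributive


# t = 2, n = 5 matches the coefficients 2 and 4 used in the tables.
print(is_quandle(alexander_op(2, 5), 5))
```

With t = 2 and n = 5 the coefficient 1 − t equals 4 modulo 5, which is why the crossing relations in the tables have the form 2Y + 4Y.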
2.2 Colorings
A coloring is an assignment of elements from a quandle X to the arcs of a
knot diagram, with the property that undercrossing is compatible with the ▷
operation.
2.3 Cocycles
Let X be a quandle, and take A = Zn for some n. We want to enhance the
coloring invariants using the notion of cocycle (which is part of the theory of
Cohomology). A cocycle ϕ of X with coefficients in A is a function
ϕ : X × X → A satisfying the condition (for all x, y, z ∈ X):
ϕ(x, z) + ϕ(x ▷ z, y ▷ z) = ϕ(x, y) + ϕ(x ▷ y, z).
Figure 3: Cocycle relation at crossing
For a coloring C, define $B_\phi(C) = \sum_{\text{crossings}} \pm\,\phi(x, y)$. This is called the
Boltzmann weight. The sign is +1 for a positive crossing and −1 for a negative
crossing. Then the cocycle invariant of the knot K (with diagram D) is given by
the formal sum of Boltzmann weights:
$\psi_\phi(K) = \sum_{C} B_\phi(C)$, where C runs over all colorings.
Now, we can apply the Alexander quandle operation. The results are shown in
the table below.
Crossings Alexander Quandle
1 Y1 = Y1 ▷ Y3 = 2Y1 + 4Y3
2 Y2 = Y3 ▷ Y4 = 2Y3 + 4Y4
3 Y3 = Y2 ▷ Y1 = 2Y2 + 4Y1
4 Y4 = Y4 ▷ Y2 = 2Y4 + 4Y2
After solving the above system of equations over Z5, we obtain the following
relations:
Y4 = Y2
Y1 = Y2
Y3 = Y2
Therefore, Y1 = Y2 = Y3 = Y4.
Since every coloring assigns the same color to all arcs, the number of colorings
is equal to the number of elements of Z5, which is 5.
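The count for the connected sum of 2 Hopf links can be reproduced by enumerating all assignments of colors to the four arcs. This is a minimal sketch that hard-codes the four crossing relations from the table above; the function name is illustrative.

```python
from itertools import product


def count_hopf_sum_colorings(t, n):
    """Count colorings of the 4-crossing diagram by the Alexander quandle on Z_n."""

    def op(x, y):
        return (t * x + (1 - t) * y) % n

    count = 0
    for y1, y2, y3, y4 in product(range(n), repeat=4):
        # Crossing relations 1 to 4 from the table.
        if (
            y1 == op(y1, y3)
            and y2 == op(y3, y4)
            and y3 == op(y2, y1)
            and y4 == op(y4, y2)
        ):
            count += 1
    return count


print(count_hopf_sum_colorings(2, 5))  # 5
```

Exhaustive search is practical here because there are only 5⁴ = 625 assignments to test.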
We can apply the same process to the Allen-Swenberg link. First, we label all
the crossings and arcs.
Crossings Alexander Quandle
1 Y45 = Y2 ▷ Y1 = 2Y2 + 4Y1
2 Y2 = Y3 ▷ Y4 = 2Y3 + 4Y4
3 Y38 = Y4 ▷ Y3 = 2Y4 + 4Y3
4 Y39 = Y38 ▷ Y4 = 2Y38 + 4Y4
5 Y4 = Y5 ▷ Y39 = 2Y5 + 4Y39
6 Y40 = Y39 ▷ Y5 = 2Y39 + 4Y5
7 Y5 = Y6 ▷ Y40 = 2Y6 + 4Y40
8 Y16 = Y40 ▷ Y6 = 2Y40 + 4Y6
9 Y6 = Y7 ▷ Y16 = 2Y7 + 4Y16
10 Y8 = Y17 ▷ Y7 = 2Y17 + 4Y7
11 Y18 = Y17 ▷ Y16 = 2Y17 + 4Y16
12 Y10 = Y11 ▷ Y16 = 2Y11 + 4Y16
13 Y12 = Y11 ▷ Y7 = 2Y11 + 4Y7
14 Y16 = Y15 ▷ Y10 = 2Y15 + 4Y10
15 Y7 = Y14 ▷ Y10 = 2Y14 + 4Y10
16 Y10 = Y9 ▷ Y12 = 2Y9 + 4Y12
17 Y41 = Y9 ▷ Y8 = 2Y9 + 4Y8
18 Y8 = Y15 ▷ Y18 = 2Y15 + 4Y18
19 Y12 = Y14 ▷ Y18 = 2Y14 + 4Y18
20 Y18 = Y13 ▷ Y12 = 2Y13 + 4Y12
21 Y19 = Y13 ▷ Y8 = 2Y13 + 4Y8
22 Y43 = Y24 ▷ Y30 = 2Y24 + 4Y30
23 Y29 = Y24 ▷ Y23 = 2Y24 + 4Y23
24 Y23 = Y25 ▷ Y29 = 2Y25 + 4Y29
25 Y30 = Y26 ▷ Y29 = 2Y26 + 4Y29
26 Y19 = Y20 ▷ Y30 = 2Y20 + 4Y30
27 Y21 = Y20 ▷ Y23 = 2Y20 + 4Y23
28 Y37 = Y25 ▷ Y21 = 2Y25 + 4Y21
29 Y27 = Y26 ▷ Y21 = 2Y26 + 4Y21
30 Y23 = Y22 ▷ Y37 = 2Y22 + 4Y37
31 Y21 = Y22 ▷ Y27 = 2Y22 + 4Y27
32 Y30 = Y28 ▷ Y37 = 2Y28 + 4Y37
33 Y29 = Y28 ▷ Y27 = 2Y28 + 4Y27
34 Y36 = Y37 ▷ Y27 = 2Y37 + 4Y27
35 Y27 = Y31 ▷ Y36 = 2Y31 + 4Y36
36 Y35 = Y36 ▷ Y31 = 2Y36 + 4Y31
37 Y31 = Y32 ▷ Y35 = 2Y32 + 4Y35
38 Y34 = Y35 ▷ Y32 = 2Y35 + 4Y32
39 Y32 = Y33 ▷ Y34 = 2Y33 + 4Y34
40 Y34 = Y33 ▷ Y3 = 2Y33 + 4Y3
41 Y1 = Y3 ▷ Y33 = 2Y3 + 4Y33
42 Y44 = Y42 ▷ Y41 = 2Y42 + 4Y41
43 Y41 = Y43 ▷ Y42 = 2Y43 + 4Y42
44 Y42 = Y44 ▷ Y1 = 2Y44 + 4Y1
45 Y1 = Y45 ▷ Y42 = 2Y45 + 4Y42
To calculate the number of solutions, I entered the system of equations into
Wolfram Mathematica. The solution shows that all the variables are equal to
each other; therefore, the number of colorings is equal to the number of
elements of Z5, which is 5. Since the coloring invariant of the connected sum
of 2 Hopf links is equal to the coloring invariant of the Allen-Swenberg link,
this invariant does not distinguish between the two links. The same
computation for other values of t and n gives:
t = 3, n = 5 → 5 monochromatic solutions
t = 4, n = 5 → 5 monochromatic solutions
t = 2, n = 7 → 7 monochromatic solutions
t = 3, n = 7 → 7 monochromatic solutions
t = 4, n = 7 → 7 monochromatic solutions
t = 5, n = 7 → 7 monochromatic solutions
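For a system as large as the 45 relations of the Allen-Swenberg link, enumeration is not feasible, but the relations are linear over Zp, so the number of colorings is p raised to the dimension of the solution space. The sketch below is not the Mathematica computation used in the paper; it is an assumed alternative using Gaussian elimination modulo a prime p, demonstrated on the four relations of the Hopf-link sum. Each relation is encoded as a hypothetical triple (out, a, b) of zero-based arc indices meaning Y_out = Y_a ▷ Y_b.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over Z_p (p prime) by Gaussian elimination."""
    m = [[v % p for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def count_colorings(relations, n_arcs, t, p):
    """Number of solutions of t*Y_a + (1 - t)*Y_b - Y_out = 0 over Z_p."""
    rows = []
    for out, a, b in relations:
        row = [0] * n_arcs
        row[a] += t
        row[b] += 1 - t
        row[out] -= 1
        rows.append(row)
    return p ** (n_arcs - rank_mod_p(rows, p))


# Relations of the connected sum of 2 Hopf links (arcs Y1..Y4 -> indices 0..3).
HOPF_SUM = [(0, 0, 2), (1, 2, 3), (2, 1, 0), (3, 3, 1)]
for t, p in [(3, 5), (4, 5), (2, 7), (3, 7), (4, 7), (5, 7)]:
    print(t, p, count_colorings(HOPF_SUM, 4, t, p))
```

The same function could be applied to the Allen-Swenberg relations by transcribing the 45 rows of the table into triples.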
5 Conclusion
The quandle coloring invariants of the connected sum of 2 Hopf links and of
the Allen-Swenberg link are the same for the values of n and t tested. The
results show that the Alexander quandle paired with the Alexander-Conway
polynomial does not contain enough information to detect causality. If we
take the affine Alexander quandle over Z5, then the cocycle would be
(x − y)^(5^r). However, cocycles would not help, since according to my
computations all arcs receive the same color in every coloring.
6 Acknowledgement
This research was conducted under the supervision of Professor Vladimir Cher-
nov of Dartmouth College and Professor Emanuele Zappala of Yale University
through the Horizon Academic Research Program in the summer of 2023. I
give thanks to Professors Vladimir Chernov and Emanuele Zappala for giving
me this opportunity and for their support as my mentors.
References
[Ame07] Kheira Ameur. Polynomial Quandle Cocycles, Their Knot Invariants
and Applications. PhD thesis, University of South Florida, 2007.
[Joy82] David Joyce. A classifying invariant of knots, the knot quandle. Jour-
nal of Pure and Applied Algebra, 23(1):37–65, 1982.
[Mat84] Matveev. Distributive groupoids in knot theory. Mathematics of the
USSR-Sbornik, 47(1):73–83, 1984.
Community Detection in Dynamic Face-to-Face Interaction
Networks: A Louvain Algorithm Approach
∗
Iroda Ibrohimova
October 13, 2023
Abstract
In this paper, we present a study that evaluates the suitability of the Louvain Algorithm in
the context of face-to-face interaction networks. Traditional community detection methods face
challenges in this context, necessitating specialized solutions. Our research addresses this gap,
offering a systematic approach that aggregates individual game data and applies the Louvain
Algorithm. The results show that the algorithm consistently identifies the original 34
communities, which supports its relevance to face-to-face interaction networks.
1 Introduction
Social networks are interconnected individuals or entities characterized by relationships, interac-
tions, and information flow. They indicate key webs of human interactions and help to understand
these interactions’ dynamics [New10]. These qualities make them potential grounds for applying
machine learning techniques to discover patterns and structures within communities [WF94]. With
its capacity to sift through vast datasets, machine learning offers a powerful tool to dissect and
comprehend the complexities of social networks, enabling insights into human behavior, influence
propagation, and community formation [Lea09], [CGP12].
Community detection is a widely used technique in graph analysis that partitions the
vertices of a graph into coherent "communities" based on their relatedness. It is a
valuable tool in various scientific and industrial fields, including biology, social networks, finance,
and literature analysis, where it helps uncover meaningful structural patterns. Comprehensive reviews
of the different formulations, methods, and applications of community detection are given by
Michele Coscia, Fosca Giannotti, Dino Pedreschi, and Santo Fortunato [For10], [KN11]. Various
measures have been proposed to evaluate the quality of the partition produced by a community
detection method [KS88], [NG04].
Among these measures, modularity stands out due to its widespread application. Introduced
by Newman [NG04], modularity quantifies the quality of community assignments by comparing the
proportion of edges within communities with the proportion expected if edges were placed at
random. However, modularity has constraints, including a resolution limit [FB07]. Nonetheless,
it remains a popular choice for practitioners, and resolution-limit-free variations have been
suggested.
Numerous efficient heuristics have been developed over the years, making the analysis of large-
scale networks feasible in practice. The Louvain method, a highly efficient heuristic proposed by
∗ Advised by: Dr Maria Konte
Blondel et al. [BGLL08], has gained prominence for its speed and the quality of results it provides
in practice.
Despite the widespread use of community detection methods, applying them to face-to-face
interaction networks remains challenging. These networks, characterized by limited data and direct,
physical interactions, differ significantly from virtual networks. This uniqueness necessitates tailored
approaches. In the context of face-to-face interaction networks, the Louvain algorithm can be an
effective method for community detection.
The objective of this study is to comprehensively assess the suitability of the Louvain Algorithm
in the context of face-to-face interaction networks. These networks present distinctive challenges
involving sparse data due to limited observation opportunities. Traditional community detection
methods may not readily adapt to these conditions, necessitating specialized solutions.
This research addresses the need for effective community detection in face-to-face networks.
The complexity arises from the need to extract meaningful patterns from a web of brief, physical
interactions, a task conventional methods struggle with. Furthermore, the urgency of solving this
problem stems from the increasing interest in comprehending real-world, physical interactions across
diverse contexts, from workplaces to social gatherings.
To address the challenges described above, we have developed a systematic approach. We aggre-
gated individual game data and applied the Louvain Algorithm to detect communities within each
game. Subsequently, we used the results to assess whether the algorithm consistently identifies the
original 34 communities. This process serves as a rigorous evaluation of the algorithm’s effectiveness
and underscores its relevance within face-to-face interaction networks.
Contributions: we make the following contributions in this paper:
1. Enhanced Resolution Parameter: Our study introduces a refined resolution parameter setting (8)
specifically tailored to the dynamic nature of face-to-face interaction networks.
2. Methodological Framework: The research establishes a comprehensive methodological framework
for analyzing face-to-face interaction networks.
3. Specialized Application of the Louvain Algorithm: This research employs the tailored
application of the Louvain Algorithm to face-to-face interaction networks, a domain with distinct
observational constraints and network characteristics.
2 Related Works
There have been notable efforts to address community detection within social networks, with a
focus on various methodologies [BBPSV18], [PLR19], [PBM20], [MMP19], [BCT+ 14]. For instance,
Barrat et al. studied temporal multilayer networks, shedding light on the social structure of face-
to-face interaction [BBPSV18]. Their work significantly advances our understanding of dynamic
network interactions, primarily revolving around temporal aspects. Similarly, Peralta, Loslever, and
Ramos contributed substantially to the field by examining community detection in social networks
and provided critical insights into the intricate interplay of social dynamics, with a focus on broader
virtual networks [PLR19].
In face-to-face proximity networks, Puglisi, Bullo, and Mantzaris [PBM20] dived deeper into the
application of stochastic block models and community detection, providing a significant step toward
understanding proximity-based social interactions. Their approach centers on a specific modeling
framework. Conducting a data-driven study on community detection in these networks, Morone et al.
[MMP19] offer valuable insights into the underlying structures, primarily focusing on the structural
aspects. Additionally, Barrat and colleagues [BCT+ 14] lay a significant groundwork by exploring
community detection within temporal multilayer networks. This work provides a foundation for
understanding social structures in face-to-face interactions, particularly temporal aspects.
In comparison, our study uniquely addresses the interplay of face-to-face interaction networks.
By employing a systematic approach that encompasses data aggregation, algorithm application, and
parameter optimization, we present a comprehensive framework tailored to this distinct domain.
3 Data Preparation
3.1 Dataset Description
The dataset used in this research paper was obtained from a series of face-to-face interaction games
called Resistance, conducted as part of the research study [Lesnd]. Face-to-face interactions were
extracted from videos of participants playing the game. Dynamically evolving networks were ex-
tracted from the free-form discussions using the ICAF algorithm. The extraction algorithm is a
collective classification algorithm that leverages computer vision techniques for eye gaze and head
pose extraction.
Each game had 5–9 participants and lasted 45–60 minutes. Each participant took part in exactly
one game. In total, the dataset had 34 games and 232 participants [KBSL21].
Figure 1: Given a group video conversation (left), we extracted face-to-face dynamic interaction
networks (right), representing the instantaneous interactions between participants. Participants are
nodes, and interactions are edges in the network.
The dataset is in CSV format, with one file for each game.
The networks are weighted, directed, and temporal. Each file contains columns with timestamp
information. Behavior was recorded every 1/3 second, for a total of 9650 time steps. In
addition, there are 65 rows in each file. Every third of a second, a directed edge is drawn
from one person (node 1) to another (node 2). The weight of this edge is the estimated
probability that person '1' is looking at person '2' or at the laptop.
Figure 2: The main elements and variables of the dataset (TIME, P1 TO P2, etc.).
4 Methodology
4.1 Network Construction
In constructing the network, participants and their interactions were represented as nodes and
edges. Each participant and the laptop were represented as nodes. Nodes were labeled according to
the player's order number. Edges were defined by the calculated probabilities of participants
looking at each other.
This approach relied on face-to-face interactions, with visual attention being an essential aspect
of engagement. To define edges, the weighted interaction values were considered: if the interaction
value between two nodes equaled 0, no connection (edge) was created. With higher values, the
distance between nodes decreased, placing participants closer to each other and forming clusters.
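The edge construction described above can be sketched with the standard library alone. This is an illustrative sketch, not the code used in the study: the column naming scheme `P1_TO_P2` / `P1_TO_LAPTOP` and the choice to aggregate by averaging over time steps are assumptions, and `aggregate_edges` is a hypothetical helper.

```python
import csv
import io
import re
from collections import defaultdict

# Hypothetical column naming scheme; the real files may differ.
EDGE_COLUMN = re.compile(r"^P(\d+)_TO_(P\d+|LAPTOP)$")


def aggregate_edges(csv_text):
    """Average per-time-step gaze probabilities into one weight per directed edge."""
    totals = defaultdict(float)
    n_rows = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        n_rows += 1
        for name, value in row.items():
            match = EDGE_COLUMN.match(name)
            if match:
                source = f"P{match.group(1)}"
                totals[(source, match.group(2))] += float(value)
    # An interaction value of 0 creates no edge.
    return {edge: total / n_rows for edge, total in totals.items() if total > 0}


SAMPLE = (
    "TIME,P1_TO_P2,P1_TO_LAPTOP,P2_TO_P1\n"
    "0.00,0.5,0.5,0.0\n"
    "0.33,1.0,0.0,0.0\n"
)
print(aggregate_edges(SAMPLE))
```

The resulting dictionary maps (source, target) pairs to weights and can be passed to a graph library as a weighted edge list.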
4.2 Graph Analysis and Visualization
NetworkX and Matplotlib were used for analyzing and visualizing these dynamic networks.
NetworkX is a Python library designed to create, manipulate, and study complex networks [HSS08].
In our study, it served as a foundational tool for understanding the interactions among
participants. It allowed us to represent these interactions as nodes (representing participants)
and edges (indicating connections between them), which facilitated the analysis of the
dynamic face-to-face interaction networks.
Matplotlib is a versatile data visualization library in Python [Hun07]. We used it to render
the network graphs derived from NetworkX. This enabled us to visually explore and communicate
the patterns and dynamics in the interactions, offering a clear and interpretable
representation of the complex dataset.
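The modularity of a partition is defined as [BGLL08]
$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j),$$
where:
Q is the modularity,
m is the total number of edges in the network (the total edge weight, for a weighted network),
A_ij is the weight of the edge between nodes i and j,
k_i and k_j are the degrees of nodes i and j, respectively,
c_i and c_j are the communities to which nodes i and j belong,
δ(c_i, c_j) is the Kronecker delta function, which is equal to 1 if c_i = c_j and 0 otherwise.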
In simpler terms, the modularity formula measures the difference between the actual number of
edges within communities and the expected number of edges if the edges were distributed randomly.
A higher modularity value indicates a better partition of the network into communities. The
modularity formula is a critical component of the Louvain method, as it is used to evaluate the
quality of the communities detected in every iteration of the algorithm [BGLL08].
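The modularity of a given partition can be computed directly from an edge list. The sketch below assumes an undirected weighted graph with each edge listed once, and uses the per-community form of the sum, which is equivalent to summing over all node pairs; the function name and toy graph are illustrative.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition.

    edges: list of (u, v, weight) for an undirected graph, each edge once.
    community: dict mapping node -> community label.
    """
    m = sum(w for _, _, w in edges)  # total edge weight
    degree = defaultdict(float)
    internal = defaultdict(float)  # weight of edges inside each community
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    total_degree = defaultdict(float)  # sum of degrees per community
    for node, k in degree.items():
        total_degree[community[node]] += k
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2 for c in total_degree
    )


# Two triangles joined by a single edge.
TRIANGLES = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(TRIANGLES, split), 4))  # 0.3571
```

Placing all six nodes in a single community gives a modularity of 0, so the two-triangle split is the better partition by this measure.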
The second phase of the Louvain method involves building a new network whose nodes are the
communities found during the first phase. To do this, the weights of the links between the new
nodes are given by the sum of the weights of the links between nodes in the corresponding two
communities. Links between nodes of the same community lead to self-loops for this community in
the new network. Once this second phase is completed, the first phase of the algorithm is reapplied
to the resulting weighted network and iterated. A combination of these two phases is denoted as
a 'pass.' By construction, the number of meta-communities decreases at each pass, resulting in a
hierarchy of communities [BGLL08].
Figure 3: Overview of the stages of the Louvain Community Detection Algorithm.
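The second (aggregation) phase can be sketched as follows. This is a simplified illustration under stated assumptions: the graph is undirected, each edge is listed once, community labels are comparable values, and the weight of a self-loop is taken to be the sum of the weights of the internal edges. The function name `aggregate` is hypothetical.

```python
from collections import defaultdict


def aggregate(edges, community):
    """Build the community-level network used in the next Louvain pass.

    edges: list of (u, v, weight), each undirected edge once.
    community: dict mapping node -> community label.
    Returns a dict mapping (label_a, label_b) with label_a <= label_b to the
    summed weight; keys with equal labels are self-loops.
    """
    new_weights = defaultdict(float)
    for u, v, w in edges:
        cu, cv = community[u], community[v]
        key = (cu, cv) if cu <= cv else (cv, cu)
        new_weights[key] += w
    return dict(new_weights)


# Two triangles joined by one edge, with each triangle as a community.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
labels = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(aggregate(edges, labels))
```

Each triangle collapses to one node with a self-loop of weight 3, and the bridging edge becomes a link of weight 1 between the two new nodes.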
Several advantages of the Louvain Algorithm fit the context of our research. First, edges in
our dataset represent the strength of interactions, and the Louvain Algorithm can efficiently
identify communities in networks with weighted connections [BGLL08]. Second, the Louvain
Algorithm is adept at handling interaction data involving clear directions, ensuring that it
accurately captures the interaction flow [BGLL08]. Finally, the Louvain Algorithm performs
better than many other algorithms in terms of modularity, which is crucial in identifying
communities [CGar].
markedly improved outcomes: this refined resolution allowed for a more granular delineation of
communities, capturing nuanced interaction patterns previously obscured. The adjustments facili-
tated the identification of more distinct groups and provided a deeper understanding of participant
affiliations.
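The effect of the resolution parameter can be explored with the Louvain implementation in NetworkX. The sketch below is an assumption about how such a run might look, not the study's code: it uses an undirected toy graph, the wrapper `detect_communities` is hypothetical, and the fixed seed is only there to make runs repeatable. In NetworkX, a resolution above 1 favors smaller communities.

```python
import networkx as nx


def detect_communities(weighted_edges, resolution=1.0, seed=0):
    """Run Louvain community detection on an undirected weighted edge list."""
    graph = nx.Graph()
    graph.add_weighted_edges_from(weighted_edges)
    return nx.community.louvain_communities(
        graph, weight="weight", resolution=resolution, seed=seed
    )


def clique(nodes):
    """All unit-weight edges among the given nodes."""
    return [(u, v, 1.0) for i, u in enumerate(nodes) for v in nodes[i + 1:]]


# Two groups of four fully connected nodes, joined by one weak link.
toy_edges = clique([0, 1, 2, 3]) + clique([4, 5, 6, 7]) + [(3, 4, 1.0)]

coarse = detect_communities(toy_edges, resolution=1.0)
fine = detect_communities(toy_edges, resolution=8.0)
print(len(coarse), len(fine))
```

Comparing the number of communities across several resolution values is one way to choose a setting for a given dataset.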
4.6 Visualizing
After applying the Louvain Algorithm, we visualized and analyzed the identified communities by
generating a graphical representation of the network. Each node (participant) was displayed,
with edges (interactions) connecting them. This visual layout provided an intuitive overview
of how participants interact. In addition, nodes were assigned distinct colors based on the
communities they belong to. This visual cue helps distinguish different groups of participants
who exhibit similar interaction patterns.
4.7 Clustering
In these dynamic networks, participants’ interactions are complex and nuanced. The Louvain Al-
gorithm identifies communities based on how participants interact more frequently with each other
compared to those outside their communities. However, this raw community information can still
be quite intricate, especially given the nature of our data.
Clustering takes the detected communities a step further by grouping participants who exhibit
similar interaction behavior into distinct and coherent clusters. This process provides a more precise,
more intuitive representation. By employing clustering, we effectively organized participants into
manageable groups, each characterized by shared interaction patterns.
5 Results
The Louvain Algorithm for Community Detection was applied to the dataset obtained from the
games that involved face-to-face interactions. Binary data were removed from the dataset,
participants were labeled in a continuous order, and inconsistencies, such as resolution, were
addressed to yield accurate data.
The application of the Louvain Algorithm revealed a total of 34 distinct communities (labeled
0 to 33) within the dynamic face-to-face interaction networks. These communities varied in
size, ranging from 5 to 9 participants.
6 Conclusion
After close examination, it was observed that communities exhibited different interaction patterns.
Some communities demonstrated a higher density of interactions, indicating a stronger cohesion
among members. In contrast, others exhibited a pattern characterized by irregular exchanges,
indicative of a more diffuse and loosely knit network structure.
Also, it should be noted that resolution 8 in the Louvain Community Detection Algorithm was
the most suitable resolution level for our data.
As the results confirmed that the Louvain Algorithm, given the right resolution, can accurately
detect communities in face-to-face interaction networks, we can say that this study provides a pivotal
step toward advancing our understanding of dynamic face-to-face interaction network communities.
References
[BBPSV18] A. Barrat, M. Barthelemy, R. Pastor-Satorras, and A. Vespignani. Community de-
tection in temporal multilayer networks, revealing the social structure of face-to-face
interaction. In Multilayer Networks, 2018.
[BCT+ 14] A. Barrat, C. Cattuto, A. E. Tozzi, P. Vanhems, and N. Voirin. High-resolution tem-
poral networks of face-to-face human interactions. In Advances in Neural Information
Processing Systems, pages 2268–2276, 2014.
[BGLL08] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre.
Fast unfolding of communities in large networks. Journal of statistical mechanics: theory
and experiment, (10):P10008, 2008.
[CGP12] Michele Coscia, Fosca Giannotti, and Dino Pedreschi. A classification for community
discovery methods in complex networks. Statistical Analysis and Data Mining: The
ASA Data Science Journal, 4(5):512–546, 2012.
[CGar] P. Chejara and W. W. Godfrey. Comparative analysis of community detection algo-
rithms. Technical report, Department of ICT, ABV-Indian Institute of Information
Technology and Management Gwalior, India, Year.
[FB07] Santo Fortunato and Marc Barthelemy. Resolution limit in community detection. Pro-
ceedings of the National Academy of Sciences, 104(1):36–41, 2007.
[KN11] Brian Karrer and Mark EJ Newman. Stochastic blockmodels and community structure
in networks. Physical Review E, 83(1):016107, 2011.
[KS88] David Krackhardt and Robert N Stern. Informal networks and organizational crises:
An experimental simulation. Social Psychology Quarterly, pages 123–140, 1988.
[Lea09] David Lazer and et al. Computational social science. Science, 323(5915):721–723, 2009.
[Lesnd] Jure Leskovec. Dynamic face-to-face interaction networks [dataset], n.d. SNAP:
Stanford Network Analysis Project. Retrieved August 2, 2023, from http://snap.
stanford.edu/data/comm-f2f-Resistance.html.
[NG04] Mark EJ Newman and Michelle Girvan. Finding and evaluating community structure
in networks. Physical Review E, 69(2):026113, 2004.
[PBM20] S. Puglisi, F. R. Bullo, and A. V. Mantzaris. Stochastic block models and community
detection in face-to-face proximity networks. Applied Network Science, 5(1):1–24, 2020.
Figure 4: The Sequential (standard) Louvain Algorithm code template.
Figure 5: Visualization obtained right after applying the code and before clustering the data.
Each color denotes a particular community.
Figure 6: Final graph showing all 34 communities (as they were initially grouped). This
visualization was obtained after applying clustering to the data points.
Self-Supervised Dementia Prediction From MRI
Scans With Metadata Integration
∗
Zile Huang
October 13, 2023
Abstract
We introduce metadata integration in the training process for demen-
tia diagnoses as weak label information using Weakly-Supervised Mod-
ified Knowledge Distillation with No Labels (WS-MDINO). Using WS-
MDINO, we fine-tuned the parameters of the original vision transformer
pre-trained with DINO on ImageNet. Our model achieved performance
equivalent to the state of the art, with 92% accuracy on the OASIS1
dataset under leave-one-out cross-validation. We visualized the
performance of the model by extracting average self-attention maps and
average brains from the dataset, showing that the model had learned
meaningful structural information about demented brains.
1 Introduction
Alzheimer’s Disease (AD) is a leading cause of dementia, affecting millions
worldwide. To date, it has no proper medical treatment and can only
be controlled with continuous medication [KMS+ 22]. Early diagnosis and early
intervention benefit both patients and caretakers, because treatment is then
most effective and least costly [RL19]. An automated model would greatly aid
the early detection of dementia by providing a fast, cheap, and accurate
reference for the diagnostic process. Past works have used
MRI scans of patients’ brains to develop image recognition models for AD
diagnosis [SN18, SMP+ 21, FDH+ 19, CGAA22, AR14, SJS+ 23, IZ18]. However, past
models have faced challenges such as poor interpretability, which is a weakness
of most deep learning and CNN architectures, and non-optimal integration of
clinically available metadata [SN18]. Many previous works did not perform
cross-validation because it is too computationally expensive (for each training
split the model needed to be re-trained completely) [FDH+ 19, IZ18]. To address
these limitations, we developed a model with a self-supervised method that can
incorporate the metadata as weak labels [CZWM+ 22] with a vision transformer
(ViT) backbone [DBK+ 20].
∗ Advised by: Jan Cross-Zamirski
While many previous works used the Convolutional Neural Network (CNN),
we use a small ViT with 8 × 8 pixel patch size (ViT-S/8) introduced by Dosovit-
skiy et al. [DBK+ 20]. Compared to traditional CNN models, ViTs on medical
datasets have been shown to capture long-range relationships in the image,
provide built-in insight into the performance of the model with self-attention
maps, and provide superior adaptive-learning with the self-attention mecha-
nism [MHSS21].
Even though ViTs require a significantly larger dataset than CNNs to achieve
these qualities [MHSS21], researchers can perform transfer learning from the
pre-trained weights on ImageNet [DDS+ 09], which consists of millions of la-
beled images. Past work on automatic AD diagnoses using a ViT achieved an
overall accuracy of 83.27%, with 85.07% specificity and 81.48% sensitivity on
the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset [HKK23].
While the typical training methods for both CNNs and ViTs are supervised,
this paper uses an unsupervised training approach, WS-MDINO, a modified
version of the DINO [CTM+ 21] training method that integrates the metadata
of the subjects into the training process. We trained a multilayer perceptron
classifier and a K-nearest-neighbor (KNN) classifier using the features
extracted from the ViT to produce the final prediction.
Compared to other models, our ViT trained with the WS-MDINO method:
• Is a multi-modal model that integrates the clinically available metadata of
the patients into the training process, achieving better overall performance
• Provides less noisy self-attention maps for the image data than supervised
ViTs [CTM+ 21]
• Allows more complicated validation methods, such as K-fold and leave-
one-out validation, without extra time and computing resources, because
the training process of the feature extractor is not supervised, and, thus,
does not require a train-test split.
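The last point can be illustrated with a small sketch: once features have been extracted by a frozen feature extractor, leave-one-out validation of a KNN classifier needs only nearest-neighbor lookups and no re-training. The code below is an assumed illustration with toy two-dimensional features and cosine similarity; it is not the paper's pipeline, and all names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two vectors (0.0 if either is the zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def leave_one_out_accuracy(features, labels, k=3):
    """Classify each sample by a majority vote of its k nearest other samples."""
    correct = 0
    for i, query in enumerate(features):
        neighbours = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: cosine_similarity(query, features[j]),
            reverse=True,
        )[:k]
        predicted = Counter(labels[j] for j in neighbours).most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / len(features)


# Toy stand-ins for extracted features (two well-separated groups).
toy_features = [
    (1.0, 0.1), (0.9, 0.2), (1.0, 0.0),
    (0.1, 1.0), (0.2, 0.9), (0.0, 1.0),
]
toy_labels = [0, 0, 0, 1, 1, 1]
print(leave_one_out_accuracy(toy_features, toy_labels, k=3))  # 1.0
```

Because the features are fixed, each held-out sample costs one sort over the remaining samples, which is why this kind of validation adds little computation.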
2 Background
2.1 AD classification
There are many past efforts to use machine learning to diagnose AD in early
stages based on MRI scans. An early work in automatic diagnoses used Struc-
ture Tensor Analysis to extract features from the MRI scan and used Support
Vector Machine to classify stages of dementia [AR14]. It achieved 88.6% two-
class (demented, non-demented) accuracy, 87.6% sensitivity, and 84.8% speci-
ficity. While this method required relatively less computational resources as
it did not use neural network, it could not effectively integrate the clinically
free metadata into the training process. It also lacked three-class classification.
Later works used Convolutional Neural Network for the image recognition task.
Fulton et al. [FDH+ 19] took the center 51 slices of the axial plane of the 3-D
image and trained a ResNet50 model for three-class classification (non-demented,
very-mildly-demented, mildly-demented), achieving 98.99% accuracy. However, this
result is not convincing, as the study did not use K-fold validation and it is
likely that slices of the same brain were assigned to both the training and
validation sets, causing data leakage. Using a train-test split and 5-fold cross
validation, we could not reproduce the results listed in the paper. Islam et al.
[IZ18] trained a separate CNN model for each of the sagittal, coronal, and axial
views of the brain, and combined the predictions of the models by voting. Their
proposed model achieved 93% accuracy, 93% sensitivity, and 94% specificity.
However, they did not use N-fold validation, as they considered it too
computationally expensive, which adds uncertainty to their reported performance.
Newer studies introduced the ViT approach [ZK22], achieving 86% accuracy on the
ADNI dataset with convolutional voxel values as the input. Compared to CNN
models, ViTs had better interpretability and could capture more long-range
relations in the image.
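The works above are compared on two-class accuracy, sensitivity, and specificity. As a reminder of how these three figures relate, the sketch below computes them from a list of true and predicted labels. The function name and the toy labels are assumptions made for illustration; "demented" is treated as the positive class.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (recall on positives) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of demented subjects detected
        "specificity": tn / (tn + fp),  # share of non-demented subjects cleared
    }


# Hypothetical labels: 1 = demented, 0 = non-demented.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(toy_true, toy_pred)
```

A model can reach high accuracy with low sensitivity when non-demented subjects dominate the dataset, which is why the three numbers are reported together.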
A comprehensive review [WTSDM+ 20] of machine learning models in
AD classification presented the challenges faced in past classification works.
It showed that many works only performed a single train-test split without
cross validation, making their reported performance less convincing. It also
showed that many past works, such as the work we failed to reproduce [FDH+ 19],
suffered from data leakage, knowingly or not, which led to an inaccurate
representation of the models’ performance. The review showed that many reported
results were not reproducible and that, with a proper train-test split and
validation method, most proposed models would be outperformed by a Support
Vector Machine (SVM) with image score.
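The data leakage described above arises when 2-D slices of the same subject land in both the training and the validation set. A minimal sketch of a subject-level split, which keeps every slice of a subject in a single fold, is given below. The function name, the subject identifiers, and the fold count are assumptions made for illustration and are not taken from any of the cited works.

```python
import random
from collections import defaultdict


def subject_level_folds(subject_ids, n_folds=5, seed=0):
    """Split slice indices into folds so that no subject spans two folds."""
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    fold_of_subject = {s: i % n_folds for i, s in enumerate(subjects)}
    folds = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        folds[fold_of_subject[subject]].append(index)
    return [folds[i] for i in range(n_folds)]


# Hypothetical dataset: 6 subjects with 3 slices each.
slice_subjects = [f"subj{n}" for n in range(6) for _ in range(3)]
folds = subject_level_folds(slice_subjects, n_folds=3)
subjects_per_fold = [{slice_subjects[i] for i in fold} for fold in folds]
```

Splitting on slice indices directly would instead scatter the three slices of a subject across folds, which is the leakage pattern the review warns about.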
Figure 1: Vision Transformer architecture - figure from the original paper
[DBK+ 20]
Where P(x) represents the probability distribution over the output, given by
the temperature softmax:
$$P(x)^{(i)} = \frac{\exp\left(g(x)^{(i)}/\tau\right)}{\sum_{k=1}^{K} \exp\left(g(x)^{(k)}/\tau\right)} \tag{2}$$
Where K is the dimensionality of the output and τ is the temperature (τ > 0),
which differs for the student and the teacher and is denoted τ_s and τ_t,
respectively. The teacher parameters are updated with an exponential moving
average (EMA) of the student parameters:
$$\theta_t \leftarrow \lambda\theta_t + (1 - \lambda)\theta_s \tag{3}$$
Where λ is the momentum hyper-parameter. While DINO also works with
other architectures such as ResNet, it performs best with a ViT backbone.
DINO with a ViT backbone presents clearer semantic segmentation information
than supervised ViTs and works excellently with k-NN classifiers using extracted
embeddings.
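The temperature softmax and the teacher update described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function names, the example logits, the temperatures 0.1 and 0.04, and the momentum 0.9 are all hypothetical values chosen only to show the behaviour.

```python
import math


def temperature_softmax(logits, tau):
    """Softmax of logits / tau; a smaller tau gives a sharper distribution."""
    if tau <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / tau for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ema_update(teacher_params, student_params, momentum):
    """theta_t <- momentum * theta_t + (1 - momentum) * theta_s, element-wise."""
    return [
        momentum * t + (1.0 - momentum) * s
        for t, s in zip(teacher_params, student_params)
    ]


example_logits = [2.0, 1.0, 0.1]  # hypothetical network outputs g(x)
student_probs = temperature_softmax(example_logits, tau=0.1)
teacher_probs = temperature_softmax(example_logits, tau=0.04)
new_teacher = ema_update([1.0, 0.0], [0.0, 1.0], momentum=0.9)
```

With a lower temperature the teacher distribution concentrates more mass on the largest logit than the student distribution does, and with a momentum close to 1 the teacher parameters move only slightly toward the student at each step.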
3 Methods
The implementation of our methods is available in a GitHub repository1 . We
summarize our training and evaluation in Figure 2.
Figure 2: Summary of data preprocessing, training, feature extraction, and
evaluation pipeline
$$\mathrm{Loss} = -\sum_{x \in V_s} P_t(v_t)\,\log\left(P_s(x)\right) \tag{4}$$
Table 1: Metadata columns and their data completeness
Column Name                                    Data Completeness
Identification (ID)                            Complete
Gender (M/F)                                   Complete
Dominant Hand (Hand)                           Complete
Education (Educ)                               Missing 201 rows
Socioeconomic Level (SES)                      Missing 220 rows
Mini Mental State Examination Score (MMSE)     Missing 201 rows
Clinical Dementia Rating (CDR)                 Complete
Estimated Total Intracranial Volume (eTIV)     Complete
Normalized Whole Brain Volume (nWBV)           Complete
Atlas Scaling Factor (ASF)                     Complete
Delay                                          Missing 416 rows
3.4 Data Augmentation
We used limited data augmentation to preserve important features of the
brains. After several trials with various data augmentations such as resizing
and translating, we concluded that, because brain images are so structurally
similar to one another, the model would treat the noise introduced by the data
augmentation as more significant than the actual information the images
carry. For this reason, we found that models generally perform better on
brain datasets like OASIS1 with little data augmentation.
Therefore, unlike the original DINO implementation,2 we avoided rotation
and translation to preserve the symmetrical structure of the brain. We also
avoided color jitter, solarization, and Gaussian noise so that the model treats
the input as single-channel with a black background, even though the gray-scale
channel is copied into the RGB channels to fit the ViT structure. For each
global crop, we resized the image to 256 × 256 pixels and center-cropped it
to 224 × 224 pixels.
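The global-crop preprocessing described above (resize, center crop, copy the gray channel into three channels) can be sketched without any image library. This is only an illustration: the helper names and the toy 208 × 176 input are assumptions, and nearest-neighbour resizing is used here as a simple stand-in for whatever interpolation the actual pipeline applies.

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list of pixel values."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def center_crop(image, crop_h, crop_w):
    """Keep the central crop_h x crop_w window of the image."""
    top = (len(image) - crop_h) // 2
    left = (len(image[0]) - crop_w) // 2
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def gray_to_rgb(image):
    """Copy the single gray channel into three identical channels."""
    return [[(p, p, p) for p in row] for row in image]


def global_crop(image, resize_to=256, crop_to=224):
    """Resize to resize_to x resize_to, center-crop to crop_to x crop_to, expand to RGB."""
    resized = resize_nearest(image, resize_to, resize_to)
    return gray_to_rgb(center_crop(resized, crop_to, crop_to))


# Hypothetical gray-scale slice with 208 rows and 176 columns.
toy_slice = [[(r + c) % 256 for c in range(176)] for r in range(208)]
crop = global_crop(toy_slice)
```

No rotation, translation, color jitter, solarization, or noise is applied, so the only change to the image content is the rescaling and the removal of a border.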
Method                              3-class Acc.   2-class Acc.   Sensitivity/Specificity
ResNet50                            76.9%          79.6%          46%/87%
WS-MDINO with MMSE labels           84%            89%            64%/96%
WS-MDINO with Compound labels       85%            92%            71%/98%
WS-MDINO with Real labels (CDR)     100%           100%           100%/100%
CNN with vote [IZ18]³               N/A            93%            93%/94%
2-D CNN [SMP+ 21]                   N/A            84%            N/A
3-D CNN [SMP+ 21]                   N/A            84%            N/A
Forward Neural Network [JKK17]      N/A            90%            92%/87%
process under weak supervision in Figures 3, 4, and 5. We show that
supervision with weak labels is effective, as subjects cluster over time in the
2-D representation. It is worth noting that, even though we used 7 weak labels
in total, our model clustered subjects into roughly 4 clusters. This shows that,
while our model learns from the weak labels, it also effectively learns from the
similarities between images of the same class.
5 Conclusion
WS-MDINO is a powerful method of integrating metadata as weak supervi-
sion for DINO training, allowing models to learn effectively from both images
and metadata. Capable of generating 3-class and 2-class predictions with high
accuracy, our dementia diagnosis model trained using WS-MDINO with com-
pound weak labels successfully captures important features of demented and
non-demented brains. Our model also provides insight into weakly supervised
training methods for datasets that are sensitive to data augmentations, such as
brain MRI scan datasets.
While the OASIS1 dataset used in this study is relatively small, there are
larger datasets for dementia diagnosis, such as ADNI,4 which consists of
thousands of subjects. Future work should test our method on such larger
datasets; this is beyond the scope of this study. It is also possible that a
stronger pseudo class could further improve the model’s performance. However,
it is important that the pseudo classes do not contain dataset-specific
information, which would decrease the model’s generalization ability. Thus,
we suggest keeping pseudo classes as simple as possible and following existing
studies on metadata’s influence on the subject, such as [RSH+ 13]. For future
work, WS-MDINO has the potential to seamlessly combine machine learning
approaches with classical approaches, which can be reflected in the creation
of one or multiple pseudo classes.
4 https://adni.loni.usc.edu/
Figure 5: Class representation in 2-D space using ImageNet weights fine-tuned
with WS-MDINO with compound label after 40 epochs
Figure 6: Average attention maps and brains from sagittal, coronal, and axial
view, produced from WS-MDINO using compound labels (left: average brain;
right: average self-attention map)
References
[AR14] M Archana and S Ramakrishnan. Detection of Alzheimer disease
in MR images using structure tensor. In 2014 36th Annual
International Conference of the IEEE Engineering in Medicine
and Biology Society, pages 1043–1046, 2014.
[CGAA22] Kwok Tai Chui, Brij B. Gupta, Wadee Alhalabi, and Fatma Salih
Alzahrani. An MRI scans-based Alzheimer’s disease detection via
convolutional neural network and transfer learning. Diagnostics,
12(7):1531, June 2022.
[CTM+ 21] Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou,
Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging
properties in self-supervised vision transformers, 2021.
[DPC17] Silvia Duong, Tejal Patel, and Feng Chang. Dementia. Canadian
Pharmacists Journal / Revue des Pharmaciens du Canada,
150(2):118–129, February 2017.
[FDH+ 19] Lawrence Fulton, Diane Dolezel, Jordan Harrop, Yan Yan,
and Christopher Fulton. Classification of Alzheimer’s disease
with and without imagery using gradient boosted machines and
ResNet-50. Brain Sciences, 9(9):212, August 2019.
[HKK23] Gia Minh Hoang, Ue-Hwan Kim, and Jae Gwan Kim. Vision
transformers for the prediction of mild cognitive impairment to
Alzheimer’s disease progression using mid-sagittal sMRI.
Frontiers in Aging Neuroscience, 15, April 2023.
[IZ18] Jyoti Islam and Yanqing Zhang. Brain MRI analysis for
Alzheimer’s disease diagnosis using an ensemble system of deep
convolutional neural networks. Brain Informatics, 5(2), May
2018.
[MWP+ 07] Daniel S. Marcus, Tracy H. Wang, Jamie Parker, John G. Cser-
nansky, John C. Morris, and Randy L. Buckner. Open access
series of imaging studies (OASIS): Cross-sectional MRI data in
young, middle aged, nondemented, and demented older adults.
Journal of Cognitive Neuroscience, 19(9):1498–1507, September
2007.
[SN18] Lauge Sørensen and Mads Nielsen. Ensemble support vector ma-
chine classification of dementia using structural MRI and mini-
mental state examination. Journal of Neuroscience Methods,
302:66–74, May 2018.
Sexual Dimorphic Nature of the Amygdala and
its Contribution to Females’ Susceptibility to
Depression
∗
Nardos Shewadeg Gebresenbet
October 12, 2023
Abstract
Depression is about twice as common in females as it is in males,
which raises questions about the root of this significant disparity. Numer-
ous studies have examined the role female hormonal changes at various
stages of life, including the pre-menstrual, prenatal, and postnatal phases,
have in this phenomenon. However, little emphasis is placed on the lim-
bic system’s contribution, particularly the amygdala. The processing of
affective information takes place mostly in the amygdala, which is why
affective disorders like depression have a significant impact on this brain
structure. Furthermore, the amygdala is a subject of interest in studies
of sex differences in the human brain due to its high concentration of sex
hormone receptors. The recent developments of neuroimaging technolo-
gies also provide an opportunity to examine the distinct functions of the
amygdala in males and females. This review’s objective is to investigate
the characteristics that make the amygdala a sexually dimorphic brain
structure by placing a particular emphasis on its volume and function.
It will also cover how the amygdala’s sexually dimorphic characteristics
contribute to the prevalence of depression in women.
1 Introduction
The limbic system comprises various brain areas crucial for processing emotional
memory, along with motivation, social processing, learning, and spatial memory
(Har00). The hippocampus and the amygdala are at the forefront of emotion
regulation, with the amygdala processing emotion and the hippocampus creat-
ing a declarative episodic memory of the emotional event (RL04). The amygdala
has received attention due to its essential role in processing emotionally salient
information and subsequently developing adaptive responses (OB15). It is
easily distinguishable in the temporal lobe due to the almond-shaped nucleus it
contains (All21) and the fact that it has 13 nuclei in total (Jan15; oMN). The
sophisticated neuronal connections between the amygdala and other regions of the
brain also make it a crucial structure in controlling both behavioral and
physiological responses (Ham05). Therefore, learning about the amygdala’s
structure and function is worthwhile since any damage would disrupt many
neurological processes.
∗ Advised by: Dr. Bridget Callaghan (Assist. Professor)
Numerous studies have asserted that the amygdala is a sexually dimorphic
structure (Nis81; Coo05; Uem12; Blu17; LO21); nevertheless, others have argued
that the difference between males and females is not that significant (Fra00;
Mar17). Notwithstanding the controversy, this review will focus on traits that
are considered to make the amygdala a sexually dimorphic organ. The amygdala’s
volume will be the first characteristic discussed. Across all ages,
males have a 9–12% larger average brain size than females (Kac19). This overall
difference also corresponds to differences in volume in individual brain areas,
such as the amygdala (Mar17). Other factors, in addition to intracranial volume
disparities, contribute to the sex-related size difference in the amygdala.
For example, the amygdala’s high number of sex hormone receptors renders it
highly influenced by sex hormones such as androgen and estrogen, which play
distinct roles in its volume (Ham05). Furthermore, the peak of amygdala
development differs between males and females (Uem12). The aforementioned
factors contribute to the alleged amygdala volume differential between males
and females. Little is known about the functional significance of the volume
differential. However, as neuroimaging technologies advance, there is room for
discovery.
The function of the amygdala is another feature that makes it a sexually
dimorphic organ. To begin with, the amygdala’s primary role in the nervous
system is processing threatening, fear-inducing stimuli and activating
fear-related behaviors to produce physiological and psychological responses
(Ši21). It is rarely activated by, and rarely generates responses to,
emotionally neutral stimuli (Dav02). Studies of amygdala function have revealed
differences between the two sexes. The discrepancy is particularly evident in
its response and hemispheric lateralization (Ham05). Although little is known
about the clinical consequences of this differential, some studies have shown
promise for future discovery (Sim14).
Apart from its sexual dimorphism, the amygdala’s considerable participation
in affective information processing (Ši21) is a notable trait. It helps
to categorize sensory data and assign them the proper degree of relevance in
order to elicit a response to the emotionally significant ones (Ši21). In light
of the characteristics mentioned, the amygdala is the most afflicted brain
structure in psychopathologies such as depression. Depression is one of the
most common mental disorders (Kal20). Many studies have been conducted to
determine the causes of this mental condition (Dav02; Fu09), and most of them
have discovered greater amygdala involvement (Rub16; Zha21; Ši21). As a result,
the amygdala is at the forefront of investigations into the neurological
etiologies of depression.
Given the amygdala’s role in depression and its sexually dimorphic trait, it
is a crucial brain structure to examine when discussing females’ propensity to
depression. According to current statistics, depression is 50% more common
in women than men (Org23). Many studies have attempted to investigate the
causes of this immense gap (Alb15; Li17; Kue17). However, their focus was on
the alterations that occur during distinct stages of female life, such as menstrua-
tion, pregnancy, and menopause, with minimal emphasis on sexually dimorphic
brain structures such as the amygdala. The primary goal of this review is to
define the characteristics that distinguish the amygdala as a sexually
dimorphic organ and to demonstrate how they can be relevant in determining
the causes of female susceptibility to depression.
2 Amygdala
2.1 Structure of the amygdala
The amygdala is a brain structure in the temporal lobe that is part of the limbic
system (OB15). It contains neuronal cells that transport electrical and chemical
impulses, as well as glial cells that support the neuronal cells (Cli23). The
amygdala was named after the Greek word for almond because of the almond shape
of its basal nuclei (All21). Also, due to its almond shape, it can be clearly
distinguished in its anatomic position, which is anterior to the hippocampus
(OB15). Although its size varies with the overall size of the brain, it is a
small structure located near regions that transport information from the senses
(OB15). It is a paired structure, with one in the left hemisphere and the other
in the right (OB15). It has 13 nuclei that fall into four major categories: the
basolateral group (which includes the lateral, basal, accessory basal, and
paralaminar nuclei); the superficial group (which encompasses the centromedial
and cortical nuclei); the medial and central nuclei (which share functional
similarities but have distinct roles at times), the anterior amygdaloid area,
and the amygdalohippocampal area (Ši21). Among these nuclei, the basolateral
nuclei, which emerge from the lateral amygdala, play a predominant role in the
whole amygdala (Ši21). They send efferent projections to other amygdala nuclei
and other cortical areas (Ši21).
ing emotions and producing adaptations due to its complex interactions with
sensory modalities, which are vital for processing both types of fears (Iso15).
Emotion regulation refers to the process by which the amygdala determines risks
at the unconscious level and modulates behavioral and physiological responses
at the cognitive level (Ši21). Furthermore, the amygdala is prominently
implicated in encoding both negative and positive valence emotions, after which
it assigns a label to each emotion (Ši21). After assigning labels, it generates
reactions to the emotionally relevant ones (Ši21). The categorical model assumed
that the amygdala was primarily involved in negative emotions; however,
advances in neuroimaging techniques revealed that the amygdala is also involved
in emotionally neutral stimuli (Bon15). It can be inferred that the amygdala is
only marginally involved in emotionally pleasant stimuli, but its involvement is
prominent in emotionally unpleasant stimuli. In addition, the amygdala has
sophisticated neural connections with other brain parts, namely sensory
structures and brain regions such as the hippocampus and hypothalamus (Ham05).
This link is especially crucial in information processing between the prefrontal
cortex and hypothalamus (oMN), and in memory formation in the hippocampus. It
also processes diverse types of emotions through its connections with structures
engaged in the senses (Cli23). As a result, abnormalities in the amygdala’s
functioning may result in difficulty with proper emotion regulation, which can
lead to psychopathologies.
Furthermore, the amygdala is one of the brain areas that exhibit lateral-
ity in functioning (Mar99). According to Mar99, in an experiment where the
cerebral blood flow variations in response to emotional valence were evaluated,
unpleasant stimuli substantially stimulated the left hemisphere of the amygdala.
Meanwhile, the right amygdala was involved in the recovery, non-detailed, and
shallow processing of emotional information (Mar99). This demonstrates that
in addition to playing similar roles in emotion processing (All21), the two hemi-
spheres of the amygdala have distinct functions in how emotion is processed.
Many studies on sex differences in the human brain have focused on the
amygdala (Nis81; Coo05; Uem12; Blu17; LO21). Other studies have claimed
that the amygdala is not a sexually dimorphic organ (Fra00; Mar17). Regard-
less of the debates, it is critical to examine sexual dimorphic features because
they may have ramifications for various psychopathologies. For example, a
study in rats discovered that exposure to testosterone in the neonatal period
triggers synaptogenesis during postnatal development, resulting in variations in
female and male amygdalae later in life (Nis81). Furthermore, another study by
Coo05 on rats revealed that gonadal steroid hormones have a strong influence
on the medial amygdala early in life, and that it is also lateralized before
puberty. After gonadal steroid hormone exposure, female rats had 80% more
excitatory synapses in the left hemisphere of the medial amygdala than males
(Coo05). Additionally, the amygdala’s high number of sex hormone receptors makes
it sexually dimorphic in adulthood due to hormone exposure during the
neonatal period (Gol01). The amygdala’s sexual dimorphism can be seen in its
volume and function.
years earlier than males. In addition, females had a slower growth rate, which
contributed to a smaller amygdala volume (Uem12). The peak age also differed
between the right and left amygdala; for males, the right amygdala peaked at
12.6 years while the left peaked at 11.1 years; for females, the right amygdala
peaked at 11.4 years, and the left peaked at 9.6 years. Further, the study by
Uem12 found that before the peak, the size difference between males and females
was not significant, but after it, there was a substantial difference.
The functional importance of the size differential between males and females
is not yet known (Ham05), but certain studies have shown promise. For instance,
according to a study by Qin16 conducted on 176 people between the ages of 19
and 30, 100 of whom were males and 76 of whom were females, whole brain size
as well as intracranial brain structure size were substantially associated with
function. It can be deduced from this that, with developments in neuroimaging
techniques such as functional magnetic resonance imaging (fMRI) and positron
emission tomography (PET), the functional significance of the amygdala size
difference between males and females may be identified.
evocative visuals varying in novelty and valence, and females’ amygdala response
was more persistent to negative valence visuals over repeated trials. According
to the study by And14, persistent responses in females relate to negative
affect, which has implications for affective disorders. Another study (Can02),
conducted on 12 females and males, found that females remembered emotional
events better. In this study, both males and females were exposed to neutral
and emotionally negative images; their brain activity was recorded using fMRI;
and their memories were assessed three weeks later. Both males and
females remembered more emotionally aversive images, and females remembered
the emotionally aversive images better than males, with more vivid and
intense recollections (Can02). According to the findings above, the difference
in amygdala response to aversive stimuli between males and females can be
demonstrated by a more persistent amygdala response as well as a stronger
recollection of aversive events in females.
The amygdala is one of the brain regions that has shown laterality, with the
left and right hemispheres being involved in information processing in distinct
ways (Mar99). This laterality has also been demonstrated to differ by sex, with
females’ and males’ left and right amygdala participating in emotion processing
differently (Ham05). Cah04 conducted a study on 11 males and females in
which both female and male subjects underwent an fMRI scan while viewing slides
ranging from emotionally neutral to extremely arousing. When memories were
assessed two weeks later, males showed a stronger activation of the right
hemisphere of the amygdala, whereas females showed a greater involvement of the
left hemisphere (Cah04). In addition, the blood oxygen level-dependent (BOLD)
signal was detected in the left hemisphere of the amygdala in females and the
right hemisphere in males (Cah04). Another study (Sch11) included 235 male
and female adolescents who were matched for age and handedness. The subjects
completed an emotional face perception fMRI test, and the results revealed that
boys had greater right amygdala involvement (Sch11). According to the study
by Sch11, sex-dependent hemisphere lateralization in teens is a forerunner of
emotional memory in adulthood. It also suggested that hemispheric
lateralization could have implications for the etiology of mental disorders (Sch11).
key brain structure responsible for processing emotions and feelings, any distur-
bance in the amygdala could result in a fault in cognitive reasoning, making us
vulnerable to psychopathologies like depression. Amygdala hyperactivation has
been identified in major depressive disorder (MDD) (Rub16; Zha21; Ši21). The
amygdala hyperactivation may potentially influence the initial judgment as well
as the response to incoming information, resulting in cognitive biases towards
unpleasant or emotionally salient information (Dav02). This mechanism is also
hypothesized to be caused by norepinephrine, which is found at abnormally ele-
vated levels in MDD (Dav02). Norepinephrine is involved in amygdala-mediated
learning and is impacted by glucocorticoid secretions, which are similarly ele-
vated in depression (Dav02).
A postmortem study conducted on 13 people with MDD and 10 healthy
controls by Rub16 revealed differences in amygdala structure. Depressed par-
ticipants exhibited a larger lateral nucleus and more total BLA neurovascular
cells than controls (Rub16). This study also has important implications for
how structural disturbances in the amygdala lead to depression. Another study
(Ram14) included 55 patients with MDD who met the DSM-IV criteria and 19
healthy controls. The subjects underwent a 3-T fMRI scan, and those with
MDD exhibited impaired intrinsic connectivity with other brain areas involved
in emotion processing and regulation (Ram14). According to the study, these
reduced intrinsic connections of the amygdala may be one of the causes of de-
creased sensations and perceptions, which leads to cognitive disturbances in
MDD (Ram14). The studies reviewed above provide concrete evidence for the
critical role of the amygdala in depression.
amygdala volumes have induced psychopathologies (Xu20; Zha21). Given that
the amygdala is the main structure engaged in emotion processing and is
involved in depression (Dav02; Ši21), any volumetric variation between males and
females can have medical consequences. As an example, one of the factors driving
hyperactivation of the amygdala in females when producing responses to
unpleasant stimuli (Dav02; Ham05) could be a smaller volume of the amygdala,
although further research is needed to confirm this. The smaller volume of the
amygdala, which results in fewer neuronal and glial cells, may disturb the amygdala’s
normal function in emotion processing. This may make women more prone to
faulty and disrupted cognitions, resulting in depression. Moreover, the differ-
ence in the peak of amygdala maturation between males and females is notable.
Since males reach their peak of maturity one and a half years later (Uem12),
their amygdala could have a higher potential to form efficient connections with
other brain regions. As a result, earlier amygdala maturation in females may
have a negative impact on emotion regulation and render them more susceptible
to depression.
Along with the volume difference, the functional difference in the amyg-
dala of males and females may increase female depression susceptibility. The
amygdala’s response and hemispheric lateralization are manifestations of the
differential. Primarily, females’ amygdala exhibits a persistent response to fear
or unpleasant stimuli compared to males (And14), and this response could re-
sult in amygdala hyperactivation. Thus, hyperactivation may cause a higher
metabolism in the body, resulting in needless physiological reactions and mal-
adaptive cognitions that lead to depression. Furthermore, women may be pre-
disposed to depression due to their more vivid and powerful recall of emotionally
charged memories (Dav02). Women employ their left hemisphere during emo-
tional memory processing (Cah04), and the left hemisphere is generally involved
in the detailed processing of aversive stimuli (Mar99), which can lead to the for-
mation of solid memories in women. Emotional memories are more important
to individuals and will be recalled more frequently than other memories. These
memories may also bring back the pain alongside the negative thoughts felt and
disrupt females’ mental health. It may also result in physiological and behav-
ioral responses such as insomnia or hypersomnia and social isolation.
6 Discussion
6.1 Implications
This review paper explored the amygdala’s sexual dimorphic characteristics and
their role in female depression vulnerability. To begin with, it defined the amyg-
dala by describing its anatomy and function, along with discussing the amyg-
dala’s role in depression. It demonstrated some of the amygdala’s sexually
dimorphic characteristics. The amygdala’s volume was the first attribute exam-
ined in this review, and it stated that females have a smaller amygdala volume.
Also, it implied that the decreased volume would result in fewer neuronal and
glial cells, obstructing proper functioning and triggering amygdala hyperactiv-
ity when exposed to adverse stimuli. The review also looked at the amygdala’s
earlier maturity in females. It suggested that this occurrence might render the
female amygdala less effective in connection with itself and other brain struc-
tures. This may jeopardize emotion regulation and predispose women to depres-
sion. The function of the amygdala was the other sexually dimorphic feature
examined. The first functional difference observed was in the amygdala’s sensi-
tivity to unpleasant stimuli, with females exhibiting a more persistent response.
According to the review, this may cause unnecessary physiological reactions in
our bodies, resulting in maladaptive cognitions. In addition, this review looked
at females’ acute and vivid recall of emotional memories. It also claimed that
women’s left hemisphere lateralization while processing emotion may contribute
to this. In essence, emotional memories will be more retained in females, bring-
ing back the pain or bad sentiments they had at the time. As a result, it will
have a negative impact on their mental health, rendering them susceptible to
depression.
6.2 Limitations
One limitation of this review was the scarcity of studies on the sexual dimorphic
feature of the amygdala. This narrowed the scope of the review and limited the
reasons for sexual dimorphism to the effects of sex hormones. Furthermore,
the focus of this review is the amygdala’s sexual dimorphism as a neurological
etiology for female depression vulnerability. However, since the amygdala has
a high density of sex hormone receptors, it is difficult to determine whether
its effects are neurological or sex hormone influenced. Another limitation is
that the notions stated in this review are only suggestions based on the studies
done so far. As a result, numerous studies are needed to determine whether
these may be plausible linkages between the amygdala’s sexual dimorphism and
female depression susceptibility.
7 Conclusion
This review aimed to investigate the amygdala’s sexual dimorphism and discover
how it affects women’s susceptibility to depression. As previously mentioned,
the amygdala exhibits sexually dimorphic characteristics in its volume and func-
tion. Nevertheless, more research is needed to have a broad understanding of
the extent of the discrepancies. Additionally, because the amygdala plays a role
in depression, it is imperative to have an extensive grasp of the distinctions
when examining the significant sex disparities in the prevalence of depression.
References
[Alb15] P. R. Albert. Why is depression more prevalent in women? Journal
of Psychiatry and Neuroscience, 2015.
[Bon15] L. Bonnet, A. Comte, L. Tatu, J.-L. Millot, T. Moulin, and E. Medeiros
De Bustos. The role of the amygdala in the perception of positive emotions:
An “intensity detector.” Frontiers in Behavioral Neuroscience, 2015.
[Dav02] R. J. Davidson, D. Pizzagalli, J. B. Nitschke, and K. Putnam. Depression:
Perspectives from affective neuroscience. Annual Review of Psychology,
2002.
[Mar99] H. J. Markowitsch. Differential contribution of right and left amygdala
to affective information processing. Behavioral Neurology, 1999.
[Wen17] D. J. Wen, J. S. Poh, S. N. Ni, Y.-S. Chong, H. Chen, K. Kwek, L. P. Shek,
P. D. Gluckman, M. V. Fortier, M. J. Meaney, and A. Qiu. Influences of
prenatal and postnatal maternal depression on amygdala volume and
microstructure in young children. Translational Psychiatry, 2017.
Black Sea Grain Initiative: a Game-Theoretic
Analysis
∗†
Paari Dhanasekaran
October 13, 2023
Abstract
In July 2023, the Black Sea Grain Deal expired just a year after its
inception. It has become a very popular topic due to its significant im-
plications for grain and staple food prices. There has been an abundance
of empirical analysis to understand the situation, but the recent develop-
ments in the Black Sea Grain Deal have not been examined using a game-
theoretic approach. This paper provides a game-theoretic viewpoint of the
Black Sea Grain Deal with the focus on the breakdown. Using principles
of game theory, I develop an infinitely repeated game with a defined set of
players, actions, and preferences expressed through payoffs. By analyzing
the game for sub-game perfect Nash equilibria, there is a clearer under-
standing of the breakdown of the Black Sea Grain Deal and its future
implications. I finish by discussing possible extensions and variations of
the model along with what conditions need to be met for a game-theoretic
approach to be viable in general international relations settings.
1 Introduction
In February 2022, Russia began its full-scale invasion of Ukraine. Ukraine’s
ability to export has been severely hampered by Russia’s invasion [UNC22].
Before the war, 90% of Ukrainian crop exports went through ports on the Azov
and Black Seas, which became inaccessible due to Russian aggression [OEC22].
In July 2022, Turkey, Russia, Ukraine, and the U.N. signed the Black Sea Grain
Deal. The deal allowed Ukraine to safely export grain, other food, and fertil-
izer from three Black Sea ports: Chornomorsk, Odesa, and Yuzhny/Pivdennyi
[UNC22]. Along with the Black Sea Grain Initiative, the U.N. established an
agreement with Russia “to facilitate the unimpeded exports to world markets
of Russian food and fertilizer (including the raw materials required to produce
fertilizers) to world markets” [Ped23]. The UN brokered these two deals with
the aim of lowering food prices. To some extent, the deal was successful in its
∗ Advised by: Jack Adeney of the California Institute of Technology
† Barrington High School
goal to reduce food prices [UNC22]. However, the deal had a finite term limit
of 120 days.
The deal first expired in November 2022. After consideration and
further discussion with the U.N., Russia agreed to continue the Black Sea Grain
Initiative for another 120 days. The FAO Price Index continued to decrease. In
March 2023, the U.N. met with Russia to discuss another extension. Moscow’s
agreement was contingent on the removal of Western sanctions. This led to a
stall in the deal in May 2023, which required further talks that led to another
60-day extension [Ped23]. In July 2023, Russia said it was suspending cooperation
with the deal once it reached its expiration date. Russia said that
the agreement concerning its food and fertilizer exports must be met first
before it would return to the deal. To that end, it demanded that the Russian
Agricultural Bank be reconnected to the SWIFT payment system and that
restrictions hampering its agricultural exports (i.e. shipping, insurance) be
lifted [Nic23] [Bon23].
This breakdown has been primarily studied from an empirical standpoint,
with a particular focus on sanctions and restrictions, to explain why the Black
Sea Grain Deal broke down [HS23] [Bon23]. However, there are still lessons to
be learned from a game-theoretic approach. To that end, I conduct a game-theoretic
analysis, where game theory is defined in [Rub94] as “a bag of analytical
tools designed to help us understand the phenomena that we observe
when decision-makers interact. The basic assumptions that underlie the theory
are that decision-makers pursue well-defined exogenous objectives (they
are rational) and take into account their knowledge or expectations of other
decision-makers’ behavior (they reason strategically)” (p.1). In response to complex,
real-world phenomena, game theory provides a simple and clean structure
that can be used to analyze well-defined equilibrium outcomes. It helps explain
situations in which each decision-maker’s outcome depends on the actions of
others, which makes it a powerful tool for analyzing international relations.
The Black Sea Grain Initiative has clear actors whose preferences inform the
actions available to them. Structuring a game around these elements can give
clear outcomes concerning the agreement’s breakdown. The merits of game-theoretic
analysis are discussed in more detail in the literature review.
2 Literature Review
The breakdown of the Black Sea Grain Deal is clearly an issue of interna-
tional relations. The literature has thoroughly explored the links between game
theory and international relations in general. [Cor01] explores the variety of
international relations scenarios in which a game-theoretic approach could be
utilized. With a focus on interaction between nation-states, the primary issues
are security and economics. Like most economic fields, game theory is founded
on the principle of individual rationality, meaning that players or actors
always play with the aspiration to maximize their own individual payoffs.
Applying this to an international relations context, countries will take the action
that most benefits themselves.
The area of [Cor01] that is of primary interest is the discussion of interna-
tional crises (p.193-195). International crises are characterized specifically “by
the events that take place when one or more nation-states perceive that their se-
curity is suddenly, immediately and seriously threatened by actions proposed or
performed by other nation-states or by events accidentally taking place in them”
(p.193). [Cor01] boils these crises down to two actions: confrontation and coop-
eration where the “threatening nation-state” attempts to force the “threatened
nation-state” to follow their demands. Simultaneously, the “threatened nation-
state” is trying to make the other nation stop their demands. Other papers
explore specific crises in detail.
[Zag14] analyzes several games with varying structures, actions and pay-
offs that attempt to model the Cuban Missile Crisis (CMC). Of the mod-
els which [Zag14] has examined, the one most similar to this paper’s aims is
Thomas Schelling’s 1966 Chicken Game (p.22-26) where the worst outcome is
mutual defection and so one player would yield to the other. Schelling believed
whatever side of the CMC pushed the issue first would force the other to ca-
pitulate and “swerve,” gaining the advantage. This led Schelling to attribute
the U.S.’s ‘victory’ to Kennedy threatening brinkmanship. [Zag14] explains how
Schelling’s model was later proven wrong using White House tapes. The tapes
showed Kennedy wanted to use blockades as a way to buy time for renegotia-
tion (p.24). [Zag14] shows the significance of the real-world context of the crises
being modelled. New discoveries and developments of understanding of a crisis
can debunk a model that was previously supported. [Zag14] demonstrates how
models have been developed to explain real-world events, not limited to but
including the Cuban Missile Crisis. This is very applicable in modelling and
understanding the Black Sea Grain Deal’s breakdown.
A key component of the situation around the Black Sea Grain Deal was
the several expirations and subsequent renegotiations that occurred. Frequent
renegotiation in international agreements makes models of repeated games a
suitable tool of analysis for agreements and their breakdowns. [Pea91], [Sla04],
and [GT20] thoroughly explore the technical aspects of repeated games with
discounted payoffs. [GT20] covers several variations of repeated games concern-
ing monitoring and information while [Pea91] primarily focuses on repeated
games concerning self-enforced agreements referencing several proofs and folk
theorems concerning repeated games, sufficient patience/discount factors and
defining repeated games and their equilibria. [GT20] and [Pea91] provide a vi-
tal, mathematical understanding of repeated games and their equilibria. [Sla04]
tackles specific strategies used in repeated games such as grim trigger, tit for
tat, limited retaliation, deviate once (DEV1L), Grim DEV1L, and Pavlov. He
takes it a step further and assesses whether combinations of some of the
aforementioned strategies could be supported as subgame perfect equilibria (2004).
[Kan08] specifically focuses on how repeated games are a setting that en-
courages mutual cooperation. [Kan08] highlights a key issue in international
agreements; oftentimes, there is no body powerful enough to enforce an interna-
tional agreement. With there being no explicit commitment device within the
terms of the Black Sea Grain Deal, there is no external force mandating both
sides to cooperate. So, it is best to model the deal in terms of a non-cooperative
game. [Kan08] adds that a long-term relationship with several interactions is
an environment most suitable to establish mutual cooperation, especially when
formal contracts are too costly or impossible to enforce.
3 Model
3.1 Model Setup
Although the Black Sea Grain Initiative was signed by the U.N., Russia,
Ukraine and Turkey, the model incorporates two players: Russia and the U.N.
Turkey has similar interests to the U.N., which makes their payoffs identical.
Because the U.N. has more power and influence than Turkey, it is the primary
player of the two. As a result, their combined preferences are modeled
as a single actor’s, which is simply called the U.N. Although the deal concerns
Ukrainian exports, Ukraine is essentially a bystander in the Black Sea Grain
Deal. Unlike the U.N., it does not have the power to control the restrictions
on Russia, meaning it has no actions available in this game. These
considerations also allow the use of more standardized games that would not
be viable with more than two players. Although in real terms negotiations
of international agreements can be very complex, for the sake of modelling I
believe it is appropriate to collapse each actor’s action space into two actions.
Both players face essentially binary choices: the U.N. can either offer
concessions to Russia or not, and Russia can either return to the
Black Sea Grain Deal or not.
For the U.N., cooperating entails renegotiating the deal and giving some
concessions. Defecting would mean the U.N. ends negotiations over the Black
Sea Grain Initiative and provides no concessions to Russia. For Russia, cooper-
ating entails returning to the Black Sea Grain Initiative. Defecting would mean
Russia does not return to the Grain Deal.
The mutually beneficial outcome would be both sides cooperating and renewing
a renegotiated Black Sea Grain Deal. Both sides are worse off if there
is no cooperation. The U.N.’s payoff becomes negative one because no cooperation
leads to lower grain exports and higher prices, which hurts its efforts to
combat food insecurity through the World Food Programme, along with the
welfare of member nations.
The U.N. and Russia would each benefit the most by exploiting the other. For the
U.N., exploiting would mean it does not renegotiate but Russia still decides to
cooperate and return to the Black Sea Grain Deal. For Russia, exploiting would
mean not returning to the Deal when the U.N. makes concessions to renegotiate.
For both sides, being exploited leads to the worst payoff.
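UN \ Russia       C          D
C               3, 3      -2, 5
D               4, -1     -1, 0
Figure 1. Model of the Prisoner’s Dilemma stage game used to analyze the
Black Sea Grain Initiative. Rows are the U.N.’s actions, columns are Russia’s
actions, and each cell lists the payoffs as (U.N., Russia).
UN \ Russia       C          D
C               3, 3      -2, 5
D               4, -1     -5, -5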
pure strategy Nash equilibria become (C,D) and (D,C). The implication of this
is that the U.N. and Russia would each rather be exploited by the other side than
have both sides not cooperate. If Russia chooses to defect (drop out of the deal),
the U.N. would obviously rather not give concessions. However, the equilibria
of the chicken game support the very opposite. In contrast, the prisoner’s
dilemma stage game reflects the strategic realities for the U.N. and Russia. This
is especially clear when examining each stage game’s equilibria.
First looking at (D, C) in the chicken game, the U.N. would not want to switch to
cooperating and Russia would not want to switch to defecting, as the payoffs
would be worse. The same concept applies to (C, D). As mentioned previously,
both pure strategy equilibria of the chicken game support the idea that Russia
and the United Nations would rather be taken advantage of than also defect,
which is very unrealistic. So after examining the pure strategy Nash equilibria,
the prisoner’s dilemma is a more accurate representation and model of how
Russia and the U.N. would behave compared to the chicken game.
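The comparison of pure strategy equilibria above can be checked mechanically. The following is a minimal sketch added for illustration, not part of the original analysis: the function and variable names are hypothetical, and the payoffs are copied from the two stage-game tables, listed as (U.N., Russia).

```python
from itertools import product

ACTIONS = ("C", "D")

# Payoffs are (U.N., Russia); the first action in each key is the U.N.'s.
PRISONERS_DILEMMA = {
    ("C", "C"): (3, 3), ("C", "D"): (-2, 5),
    ("D", "C"): (4, -1), ("D", "D"): (-1, 0),
}
# The chicken variant differs only in the mutual-defection payoff.
CHICKEN = {**PRISONERS_DILEMMA, ("D", "D"): (-5, -5)}


def pure_nash_equilibria(game):
    """Return the action profiles from which no player gains by deviating alone."""
    equilibria = []
    for un, ru in product(ACTIONS, repeat=2):
        un_payoff, ru_payoff = game[(un, ru)]
        # Hold Russia's action fixed and try every U.N. alternative.
        un_ok = all(game[(alt, ru)][0] <= un_payoff for alt in ACTIONS)
        # Hold the U.N.'s action fixed and try every Russian alternative.
        ru_ok = all(game[(un, alt)][1] <= ru_payoff for alt in ACTIONS)
        if un_ok and ru_ok:
            equilibria.append((un, ru))
    return equilibria


print(pure_nash_equilibria(PRISONERS_DILEMMA))
print(pure_nash_equilibria(CHICKEN))
```

Under these payoffs the prisoner’s dilemma has mutual defection as its only pure strategy equilibrium, while the chicken variant has the two asymmetric profiles, which matches the discussion above.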
Due to the Black Sea Grain Deal having finite extension lengths, interactions
surrounding renegotiation have already occurred multiple times. And with no
end to the Ukraine-Russia conflict in the foreseeable future, an infinitely re-
peated discount game, where the subsequent round’s payoffs are discounted by
a factor of δ, δ ∈ (0, 1), seems to be an appropriate method to analyze the recent
tension around the Black Sea Grain Deal.
Grain Initiative. This is because the U.N and Russia can clearly observe what
the other side is doing. The U.N can tell if Russia decides to cooperate or defect
from the Black Sea Grain Deal and Russia can tell if the U.N. has decided to
make concessions or not. Thus, it is reasonable that both sides have complete
information on the histories of play in the infinitely repeated stage game model.
Utilizing an infinitely repeated game has significant implications for equilibria.
Contrary to a game played for only one stage, any mutually beneficial
outcome can be supported as an equilibrium when sufficiently patient players
interact repeatedly. This fact is formally stated in folk theorems [Kan08].
Several folk theorems explore the idea of equilibria in infinitely repeated
discounted games. One of the more prevalent theorems is that any feasible,
individually rational payoff profile can be supported as an equilibrium if δ is
close to 1. However, more specific folk theorems have been developed. [GT20]
references Fudenberg and Maskin’s (1986)
folk theorem: “If the number of players is 2 or if the set feasible payoff vectors has
non-empty interior, then any payoff vector that is feasible and strictly individually
rational is a subgame perfect equilibrium of the discounted repeated game,
provided that players are sufficiently patient” [GT20]. Essentially, if the players
are patient enough, any feasible and strictly individually rational payoff vector
can be supported as an equilibrium. Strictly individually rational payoffs for any
player i are those that exceed player i’s min-max payoff [GT20]. The min-max
payoff is the lowest payoff the other player can hold player i to when player i
responds optimally; [Kan08] describes it as the payoff a player can guarantee
themselves in any equilibrium, which is like the worst-case scenario.
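To make the min-max benchmark concrete, the sketch below computes it for the prisoner’s dilemma stage game of Figure 1. It is an illustration added here under two assumptions: only pure actions are considered, and the helper names are hypothetical rather than taken from the cited sources.

```python
ACTIONS = ("C", "D")

# Stage-game payoffs from Figure 1, listed as (U.N., Russia).
STAGE_GAME = {
    ("C", "C"): (3, 3), ("C", "D"): (-2, 5),
    ("D", "C"): (4, -1), ("D", "D"): (-1, 0),
}


def minmax_payoffs(game):
    """Pure-action min-max payoffs: the opponent picks the action that
    minimizes the payoff the player gets from a best response."""
    un = min(max(game[(a, b)][0] for a in ACTIONS) for b in ACTIONS)
    russia = min(max(game[(a, b)][1] for b in ACTIONS) for a in ACTIONS)
    return un, russia


def strictly_individually_rational(game):
    """Action profiles whose payoffs strictly exceed both min-max payoffs."""
    un_floor, russia_floor = minmax_payoffs(game)
    return [
        profile for profile, (un, russia) in game.items()
        if un > un_floor and russia > russia_floor
    ]


print(minmax_payoffs(STAGE_GAME))
print(strictly_individually_rational(STAGE_GAME))
```

With these payoffs the min-max payoffs coincide with the mutual-defection payoffs, and mutual cooperation is the only pure action profile that gives both players strictly more.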
The folk theorem is very broad and lacks predictive power about specific
equilibria. It merely suggests that any individually rational payoff could be
an equilibrium payoff if the players are patient. Over time, the literature has
explored and established more specific folk theorems. [Pea91] explored several
of these folk theorems, of which Friedman’s (1971) theorem is especially
pertinent to the focus of this paper: “Let $G = (A_1, \ldots, A_N; \Pi_1, \ldots, \Pi_N)$ have a Nash equilibrium
$e = (e_1, \ldots, e_n) \in A$, and let $q = (q_1, \ldots, q_n) \in A$ satisfy $\Pi_i(q) > \Pi_i(e)$ for each
$i \in N$. Then for δ sufficiently close to 1, there is a sub-game perfect equilibrium
of $G^\infty(\delta)$ in which q is played every period on the equilibrium path” (Pearce,
1991). $\Pi_i$ denotes the payoff of player i. Overall, the theorem is very
significant as it supports repeated mutual cooperation as a potential subgame
perfect equilibrium, depending on the players’ patience. Similar to the Nash
equilibrium in a one-stage game, subgame perfect equilibria are strong indicators
of final outcomes for infinitely repeated games. So for the infinitely repeated
Prisoner’s dilemma stage game model, subgame perfect equilibria are key to
the analysis.
As defined in [Rub94]: “a subgame perfect equilibrium of an extensive game
with perfect information $\Gamma = \langle N, H, P, (\succsim_i) \rangle$ is a strategy profile $s^*$ such that for every
player $i \in N$ and every nonterminal history $h \in H \setminus Z$ for which $P(h) = i$ we
have
$$O_h(s^*_{-i}|_h, s^*_i|_h) \succsim_i|_h O_h(s^*_{-i}|_h, s_i|_h)$$
for every strategy $s_i$ of player i in the subgame $\Gamma(h)$,” (p.97). $O_h$ represents
the outcome of a certain strategy profile in the subgame that follows history h,
which in this model is evaluated by its payoff. What the definition is
conveying is that, in every subgame, player i does at least as well by following
the strategy $s^*_i$ as by deviating to some other strategy $s_i$, holding every other
player’s strategy $s^*_{-i}$ fixed. Essentially, a strategy profile is a subgame perfect
equilibrium if and only if no single player has a profitable unilateral deviation
in any subgame. This is why subgame perfect equilibria are associated with
credible threats: any punishment a strategy prescribes must itself be optimal to
carry out. This strongly applies to the Black Sea
Grain Initiative because both Russia and the U.N. have a threat to defect which
would severely punish the other player compared to both sides cooperating.
In an infinite stage game, there is a wide range of possible strategies
varying in complexity. Using the aforementioned folk theorems, it is possible
for mutual cooperation in every round to be a subgame perfect equilibrium in the
infinitely repeated discounted stage game model. There are several strategies
that focus on achieving mutual cooperation: naively cooperating every round or
playing tit-for-tat, where player i plays the same move their opponent played the
round before. However, a common strategy to achieve mutual cooperation is the
grim trigger strategy. The aim of a grim trigger is to use the threat of permanent
defection to enforce cooperation. A grim trigger strategy entails always choosing
to cooperate until the opposing player defects. Following that, the player using
a grim trigger would defect forever, never returning to cooperation. Based on
Friedman’s folk theorem, it is possible for mutual cooperation, which offers a
higher payoff than the one-stage Nash equilibrium, to be supported as a subgame
perfect equilibrium. [Sla04] outlines the strategy in a rather eloquent fashion:
$$s_i(h^t) = \begin{cases} C & \text{if } t = 0 \\ C & \text{if } a^\tau = (C, C) \text{ for } \tau = 0, 1, \ldots, t-1 \\ D & \text{otherwise} \end{cases}$$
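The grim trigger rule can be written as a function of the history of play. The sketch below is an illustration added here, not taken from [Sla04]; the helper names are hypothetical, payoffs follow Figure 1 as (U.N., Russia), and the repeated game is truncated to a finite number of rounds purely so that it can be simulated.

```python
# Stage-game payoffs from Figure 1, listed as (U.N., Russia).
STAGE_GAME = {
    ("C", "C"): (3, 3), ("C", "D"): (-2, 5),
    ("D", "C"): (4, -1), ("D", "D"): (-1, 0),
}


def grim_trigger(history):
    """Cooperate at t = 0 and while every past profile was (C, C); else defect."""
    return "C" if all(profile == ("C", "C") for profile in history) else "D"


def always_defect(history):
    """Comparison strategy that never cooperates."""
    return "D"


def play(un_strategy, russia_strategy, rounds):
    """Simulate the first `rounds` stages; both players observe the full history."""
    history = []
    for _ in range(rounds):
        profile = (un_strategy(history), russia_strategy(history))
        history.append(profile)
    return history


def discounted_payoffs(history, delta):
    """Discounted sum of stage payoffs over the simulated rounds."""
    un = sum(STAGE_GAME[p][0] * delta ** t for t, p in enumerate(history))
    russia = sum(STAGE_GAME[p][1] * delta ** t for t, p in enumerate(history))
    return un, russia


print(play(grim_trigger, grim_trigger, 4))
print(play(grim_trigger, always_defect, 3))
```

When both players use the grim trigger, play stays at (C, C); against a player who always defects, the grim trigger is exploited once and then defects in every later round.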
4 Results
To find whether mutual cooperation with a grim trigger is a subgame
perfect equilibrium, the payoffs of the strategy and its deviations need to be
considered. By the one-shot deviation principle, it suffices to check single-period
deviations: if there is even one profitable one-shot deviation, the strategy
profile cannot be subgame perfect [Rub94].
The payoff of always cooperating for either player would be $\sum_{t=0}^{\infty} 3\delta^t$, where
t increases by 1 with each stage of the game. For any δ ∈ (0, 1), $\sum_{t=0}^{\infty} \delta^t$
yields the discounted sum $\frac{1}{1-\delta}$. The U.N.’s payoff for cheating is $4 + \delta \sum_{t=0}^{\infty} (-1)\delta^t$.
Under the grim trigger, the U.N. would get a payoff of 4 in the round it deviates
because Russia would still cooperate while the U.N. defects. For future rounds,
however, Russia would defect forever, which means the U.N.’s optimal response
would be to also defect forever (yielding a payoff of -1 in each round, which is
discounted by a further factor of δ per round). Russia’s payoff for cheating is
$5 + \delta \sum_{t=0}^{\infty} \delta^t \cdot 0$, which equals 5. Under the grim trigger strategy, Russia would
get a payoff of 5 in the round it deviates because the U.N. would still cooperate.
For future rounds, the U.N. would defect forever, and Russia would do the same
as defecting yields a higher payoff than cooperating (0 vs -1). Russia would
therefore receive a payoff of 0 for all future rounds.
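For the grim trigger strategy to be a subgame perfect equilibrium, the payoff
of always cooperating has to be at least as great as that of deviating, that is,
cheating for one round and being punished thereafter. Using the payoffs above,
this can be represented by the following inequalities, the first for the U.N. and
the second for Russia:
$$\sum_{t=0}^{\infty} 3\delta^t \ge 4 + \delta \sum_{t=0}^{\infty} (-1)\delta^t$$
$$\sum_{t=0}^{\infty} 3\delta^t \ge 5 + \delta \sum_{t=0}^{\infty} \delta^t \cdot 0$$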
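The solution of the grim trigger condition does not appear in the text at this point. The derivation below is a reconstruction from the payoffs in Figure 1 and the discounted sum $\frac{1}{1-\delta}$, offered as a worked check rather than as the author's original presentation. Each player compares cooperating forever (3 per round) with deviating once and receiving the mutual-defection payoff afterwards.

```latex
\begin{align*}
\text{U.N.:}\quad \frac{3}{1-\delta} &\ge 4 - \frac{\delta}{1-\delta}
  \;\Longrightarrow\; 3 \ge 4(1-\delta) - \delta = 4 - 5\delta
  \;\Longrightarrow\; \delta \ge \tfrac{1}{5},\\[4pt]
\text{Russia:}\quad \frac{3}{1-\delta} &\ge 5 + \delta \cdot \frac{0}{1-\delta} = 5
  \;\Longrightarrow\; 3 \ge 5 - 5\delta
  \;\Longrightarrow\; \delta \ge \tfrac{2}{5}.
\end{align*}
```

Under these payoffs both conditions hold whenever δ ≥ 2/5, so Russia's patience is the binding constraint, which agrees with the ϵ = 0 case of the thresholds derived in the Practical Implications section.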
5 Practical Implications
First, it is important to truly understand what δ represents. δ is the factor
by which future payoffs are discounted. A higher delta means future payoffs are
more valuable when normalised to the value of present payoffs. Extending this
idea, δ represents the value placed on the future relative to the present. If δ
were to equal 1, that means the future resources/payoffs are equally valuable as
those in the present. A δ of 0 implies the future has no value. Since the discount
factor represents the value of the future, one should consider the possibility of
it varying. This variance can be determined by real-world context. Russia has
been embroiled in a war with Ukraine for the last year and a half. As a war
drags on, a country plausibly cares more about the present than the future. The
war is causing Russia to divert more resources to the present, lowering the
value it places on future resources, which corresponds to a lower discount
factor. This pattern is consistent with Russia’s behavior since the deal’s
inception in July 2022. With each term limit, Russia was gradually more
reluctant to extend. This was especially apparent between March and May 2023,
when Russia only agreed to a 60-day extension, half of the original 120-day
term. Russia permanently backing out can be explained by its discount factor
dropping below the threshold needed to support cooperation, so that the grim
trigger strategy no longer holds as an equilibrium.
Another factor that could change the δ threshold needed to support equilibrium
would be a change in payoffs. During repeated renegotiation and as time elapses,
payoffs can change [Jer88]. For example, if the payoff for cheating
the other player increased, both players would have a stronger incentive to
deviate, thus requiring a higher δ to keep them cooperating. An example
would be Russia being more incentivized to cheat the U.N. and never return to
the Black Sea Grain Deal. If Russia’s payoff for defecting increased by some
value ϵ, where ϵ > 0, the new inequality for Russia to mutually cooperate
becomes
$$\sum_{t=0}^{\infty} 3\delta^t \ge (5 + \epsilon) + \delta \sum_{t=0}^{\infty} \delta^t (0 + \epsilon).$$
Using the discounted sum, the inequality becomes $\frac{3}{1-\delta} \ge (5 + \epsilon) + \delta \left( \frac{\epsilon}{1-\delta} \right)$.
Multiplying both sides by $(1 - \delta)$ yields $3 \ge 5 - 5\delta + \epsilon - \delta\epsilon + \delta\epsilon$. Simplifying,
the inequality becomes $5\delta \ge 2 + \epsilon$. The solution to the inequality is $\delta \ge \frac{2+\epsilon}{5}$.
The minimum δ for the grim trigger to be stable increases by $\frac{\epsilon}{5}$.
Since the U.N.’s goal is to establish lasting, mutual cooperation for the Black
Sea Grain Initiative, it may offer more and more concessions to Russia over
time, increasing Russia’s payoff for cooperating by some ϵ > 0. This
changes the grim trigger inequality for Russia to become
$$\sum_{t=0}^{\infty} (3 + \epsilon)\delta^t \ge 5 + \delta \sum_{t=0}^{\infty} \delta^t \cdot 0.$$
Using the discounted sum, the inequality becomes $\frac{3+\epsilon}{1-\delta} \ge 5$. Multiplying
both sides by $(1 - \delta)$ yields $3 + \epsilon \ge 5 - 5\delta$. Rearranging yields the inequality
$5\delta \ge 2 - \epsilon$. The solution to the inequality is $\delta \ge \frac{2-\epsilon}{5}$. By increasing Russia’s
payoff for cooperation by ϵ, the minimum δ decreases by $\frac{\epsilon}{5}$.
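Both comparative statics above are instances of one formula: with cooperation payoff R, one-round deviation payoff T and punishment payoff P (T > P), the grim trigger condition R/(1 − δ) ≥ T + δP/(1 − δ) rearranges to δ ≥ (T − R)/(T − P). The sketch below evaluates it with exact fractions. It is an illustration added here; the function names and the `defect_bonus` and `cooperate_bonus` parameters are hypothetical labels for the two ϵ changes discussed in the text.

```python
from fractions import Fraction


def grim_trigger_threshold(reward, temptation, punishment):
    """Smallest delta satisfying reward/(1-d) >= temptation + d*punishment/(1-d).

    A result above 1 means no discount factor in (0, 1) sustains cooperation.
    """
    reward, temptation, punishment = map(Fraction, (reward, temptation, punishment))
    if temptation <= punishment:
        raise ValueError("deviation payoff must exceed the punishment payoff")
    # Clamp at 0: if cooperating already pays at least the temptation,
    # any discount factor sustains cooperation.
    return max(Fraction(0), (temptation - reward) / (temptation - punishment))


def russia_threshold(defect_bonus=0, cooperate_bonus=0):
    """Russia's threshold under the Figure 1 payoffs (3 cooperate, 5 cheat, 0 punish)."""
    return grim_trigger_threshold(
        reward=3 + cooperate_bonus,
        temptation=5 + defect_bonus,
        punishment=0 + defect_bonus,  # the text adds epsilon to both defection payoffs
    )


print(russia_threshold())                               # baseline
print(russia_threshold(defect_bonus=Fraction(1, 2)))    # larger gain from defecting
print(russia_threshold(cooperate_bonus=Fraction(1, 2))) # larger concessions
```

The three printed values correspond to (2 + ϵ)/5 and (2 − ϵ)/5 evaluated at ϵ = 0 and ϵ = 1/2, so a larger defection payoff raises the required patience while larger concessions lower it.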
6 Limitations and Possible Extensions of the Game-
Theoretic Approach
When utilizing game theory as an analytical tool, great
care and caution are needed. Steven J. Brams addresses this in his 2000 paper. Of the
common issues [Bra00] highlights, two apply most to the model outlined in this
paper: misspecifying the rules and confusing goals with rational choice.
[Bra00] emphasises that the rules outlined in a game-theoretic model should
reflect how the players would realistically act in the very situation being
modeled (p.222). Another
point [Bra00] highlights is that goals and rationality are not the same. For example,
a change in strategy from short-term to long-term is not a change in rationality but
rather a change in goals with the same underlying rationality (p.222). If
just a one-stage game were used, it would not realistically reflect the
decision-making of a large country or an international governing body (Russia and
the U.N.). Through the use of an infinite-stage game, Russia’s and the U.N.’s
long-term lens for decision making is reflected by the model.
The model setup and analysis assumed stable payoffs. While the payoffs of
the repeated stage game were discounted by δ, the stage game payoffs themselves
remained fixed throughout. As [Jer88] notes, preferences evolve over time. In
a constantly changing international environment, preferences, and subsequently
payoffs, are likely to change. While this was briefly explored in the Practical
Implications by increasing Russia’s payoffs to defect and cooperate by some
positive epsilon, more research needs to be done to exactly quantify the payoffs
and how they evolve over time.
While the infinitely repeated Prisoner’s dilemma provides a relatively com-
prehensive analysis of the current developments surrounding the Black Sea Grain
Initiative, future developments may require an adjustment or rethinking of the
model. Even now, there may be opportunities to expand and advance the current
model. While doing so, it is important to remember what the core purpose
of game-theoretic models is in international relations. They are supposed to
provide structure that aligns with players’ realistic thinking and actions, which
can then be analyzed and studied. Adding to or reinventing the model should only be
done after extensive and thorough consideration.
7 Conclusion
This paper provided a game-theoretic analysis of the Black Sea Grain Initiative
using an infinitely repeated Prisoner’s Dilemma stage game. It is possible
to structure a model such that cooperation primarily depends on Russia’s patience.
The war in Ukraine carrying on for over a year and a half has decreased
Russia’s valuation of the future. Slight increases in the payoff to defect
increase the minimum δ for Russia to cooperate. However, the U.N. can offer more
concessions to increase Russia’s payoff from cooperating, thereby decreasing
the minimum δ required. These findings point to a cause of the breakdown: a
lack of patience on Russia’s behalf. This lack of patience (a low valuation of
the future) makes Russia unwilling to extend the Deal as it is not as beneficial
to them. The findings also suggest a way to preserve cooperation
around the Black Sea Grain Deal: offering more concessions to Russia
to incentivize a return to the Deal. However, the U.N. has to
offer enough concessions that renewing the Black Sea Grain Deal is beneficial
to Russia, even with lower patience, in order for Russia to cooperate.
References
[Bon23] Courtney Bonnell. Russia halts landmark deal that allowed Ukraine
to export grain at time of growing hunger. AP News, 2023.
[GT20] Olivier Gossner and Tristan Tomala. Repeated games with complete
information. Complex Social and Behavioral Systems: Game Theory
and Agent-Based Models, 2020.
[HS23] Nigel Hunt and Jonathan Saul. Black Sea grain deal: What’s next
now that Russia has pulled out?, 2023.
[Jer88] Robert Jervis. Realism, game theory, and cooperation. World Politics,
1988.
[Nic23] Michelle Nichols. Russia could be ready for Black Sea grain deal talks,
but no evidence yet, US says. Reuters, 2023.
[Ped23] Raul (Pete) Pedrozo. The Black Sea Grain Initiative: Russia’s strategic
blunder or diplomatic coup? International Law Studies, 2023.
[UNC22] UNCTAD. The Black Sea Grain Initiative: What it is, and why it’s
important for the world, 2022.
[Zag14] Frank Zagare. A game-theoretic history of the Cuban missile crisis.
Economies, 2014.
Automated Pneumonia Detection From Chest
X-ray Images Using Machine Learning
∗
Lurvı̈sh Polodoo
October 17, 2023
Abstract
In this data science project, pneumonia detection was addressed using
Convolutional Neural Networks (CNNs) applied to chest X-ray images.
With the advancement of deep learning techniques, CNNs have emerged
as a powerful tool for image classification tasks. By leveraging the capa-
bilities of CNNs, this research aims to develop a robust and automated ap-
proach to classifying pneumonia from chest X-ray images, enabling timely
and accurate diagnosis. The study includes comprehensive dataset de-
tails, explores supervised learning principles, and delves into binary clas-
sification techniques. Additionally, the research thoroughly examines the
impact of different image dimensions on the model’s performance, while
utilizing regularization to prevent overfitting. The developed CNN model
achieves high accuracy on both the training and validation datasets, show-
casing its potential in pneumonia detection. In addition to the technical
aspects, potential applications in medical imaging are highlighted, lim-
itations are addressed, and areas for improvement are proposed in this
research. While the CNN model shows promise, it is designed as a valu-
able aid to medical professionals, enhancing early detection and screening
processes.
1 Introduction
Pneumonia is a common and potentially life-threatening respiratory infection
that disproportionately affects young children, leading to a significant number
of deaths globally. In 2019, it claimed the lives of 740,180 children under the
age of 5, accounting for 14% of all deaths in this age group [Wor22].
Early detection and accurate diagnosis are crucial for effective treatment and
management of this disease. To combat this pressing public health challenge,
this research focuses on employing advanced machine-learning techniques to
improve the efficiency and reliability of pneumonia detection from chest X-ray
images.
∗ Advised by: Guillermo Goldsztein, Georgia Institute of Technology
With the rapid progress in deep learning and CNNs, it is believed that an
automated approach can assist medical professionals in the early detection of
pneumonia cases, thereby reducing the risk of complications and saving lives.
The indispensable role of healthcare experts in diagnosis is acknowledged, and
it is emphasized that the model serves as a valuable tool to complement their
expertise, rather than replace it. In section 2, a detailed account of the imple-
mentation of the Convolutional Neural Network (CNN) model for pneumonia
detection using chest X-ray images is provided. It covers aspects such as the
dataset description and source, supervised learning, binary classification, and
the architecture of neural networks. Additionally, it explores the concept of
image preprocessing, specifically investigating the impact of different image di-
mensions on the model’s performance.
The discussion delves into vital concepts like generalization error and overfitting
through an exploration of model training in section 3. This exploration
emphasizes the implementation of regularization techniques, applied
to avert overfitting and improve the model’s ability to generalize to
unseen data. The following section then undertakes a
comprehensive analysis of the model, with a focus on the accuracy
of the pneumonia detection model.
Furthermore, this research explores the potential applications of the de-
veloped CNN model for pneumonia detection in medical imaging. Section 5
highlights the model’s significance in early pneumonia detection as well as its
limitations and potential areas for improvement. It emphasizes the importance
of conducting clinical validation studies to ensure real-world effectiveness and
safety. Additionally, the ethical implications of deploying AI models in health-
care are acknowledged, focusing on privacy, biases, and the ethical responsibility
of AI as a complementary tool to medical professionals’ expertise.
By improving the efficiency and reliability of pneumonia detection, this
project has the potential to make a meaningful contribution to medical imaging
and to healthcare more broadly.
2 Methodology
In this section, the implementation of the Convolutional Neural Network (CNN)
model for pneumonia detection is described. The full code implementation
is available on Kaggle [Pol23], and it includes the model architecture, data
preprocessing, and training process.
ing set includes 390 pneumonia and 234 normal images. Notably, the training
dataset exhibits class imbalance, with more pneumonia cases than normal cases.
Class imbalance can impact the model’s performance, leading to biased predic-
tions. To address this, techniques like data augmentation, resampling, or class
weights can be explored [Jap01]. By mitigating class imbalance and refining
the approach, AI-assisted systems for pneumonia detection can become more
reliable and accurate, enhancing healthcare diagnostics.
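One of the options named above, class weights, can be sketched as follows. This is a minimal illustration and not the procedure used in the paper's notebook: the function name `balanced_class_weights` and the "balanced" heuristic (total / (number of classes x class count)) are assumptions, and the counts are the 390/234 figures quoted above, used only as example input.

```python
def balanced_class_weights(counts):
    """Return one weight per class, inversely proportional to its frequency.

    Assumed heuristic: weight = total / (n_classes * class_count), so the
    rarer class receives the larger weight.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Example input only: the image counts quoted in the text above.
counts = {"pneumonia": 390, "normal": 234}
weights = balanced_class_weights(counts)

for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

With weights of this kind, errors on the less frequent class contribute more to the training loss, which counteracts the bias toward the majority class.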
from chest X-ray images, providing valuable support to medical professionals in
their diagnostic process.
defined as:
f(x) = max(0, x)
So, if the input x is negative, the ReLU function outputs 0, and if the input
x is positive (or equal to 0), it outputs x. This simple activation function
introduces non-linearity into the neural network, which is essential for
enabling the model to learn complex patterns and perform well in various
tasks, including image classification.
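The definition above can be written directly in code. The sketch below is a plain-Python illustration of f(x) = max(0, x); the function name `relu` and the sample inputs are chosen here for the example and are not taken from the paper's notebook.

```python
def relu(x):
    """Rectified Linear Unit: returns 0 for negative inputs, x otherwise."""
    return max(0.0, x)


# Illustrative inputs covering the negative, zero, and positive cases.
inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
outputs = [relu(x) for x in inputs]

print(outputs)
```

Negative inputs are clipped to zero while positive inputs pass through unchanged, which is what makes the function piecewise linear rather than linear.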
Figure 3: Graph of the Sigmoid Function [Pan19]
Upon resizing the images to a dimension of 50 x 50 pixels, a noticeable in-
crease in blurriness was observed. This blurriness adversely affected the quality
of the images and resulted in the omission of crucial pneumonia-related in-
formation. As a consequence, the model encountered challenges in accurately
detecting pneumonia cases with this excessively low dimension.
To address this concern, a higher image resolution was tested. An image
dimension of 250 x 250 pixels gave clear improvements over the 50 x 50
version. Some information was still lost relative to the original images, but
the differences between the original and the 250 x 250 images were difficult
to discern with the naked eye, which suggests that this dimension strikes a
practical balance between image quality and computational efficiency.
Excessively high dimensions can significantly prolong the training time of
the model, reducing the overall efficiency of obtaining results. Conversely,
dimensions that are too low can lose information that is vital for the
accurate detection of pneumonia.
In light of this experimentation, an image dimension of 250 x 250 pixels was
ultimately chosen, as it offered a favorable trade-off between accuracy and
efficiency. Image resizing is a compromise between preserving vital
information and limiting the computational cost of the model, and the
dimension was chosen to optimize the performance of the pneumonia detection
model while keeping training practical.
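The trade-off between the two dimensions can be made concrete with a small sketch. It is an illustration under stated assumptions: `resize_nearest` is a hypothetical helper that uses nearest-neighbour sampling as a stand-in for whatever interpolation the actual preprocessing used, and the 500 x 500 gradient is synthetic data, not a real X-ray.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2D grayscale image (list of rows) by nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# Synthetic 500 x 500 image: a gradient standing in for real pixel data.
original = [[(r + c) % 256 for c in range(500)] for r in range(500)]

small = resize_nearest(original, 50, 50)
medium = resize_nearest(original, 250, 250)

pixels_small = len(small) * len(small[0])     # input values per 50 x 50 image
pixels_medium = len(medium) * len(medium[0])  # input values per 250 x 250 image
ratio = pixels_medium / pixels_small

print(pixels_small, pixels_medium, ratio)
```

A 250 x 250 image carries 25 times as many input values as a 50 x 50 image, so it preserves far more detail while also increasing the amount of data the network must process per image.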
3 Model Training
During the training process of the pneumonia detection model, two critical
concepts were encountered: generalization error and overfitting. These concepts
are essential in machine learning as they directly impact the model’s ability to
perform well on new, unseen data.
may yield unreliable and inaccurate predictions, limiting its practicality and
effectiveness.
3.2 Overfitting
Overfitting is a common issue encountered during the training of machine learn-
ing models. It occurs when a model performs exceptionally well on the training
data but fails to generalize effectively to new, unseen data. In essence, the model
becomes too complex and starts memorizing the noise and outliers present in
the training data, instead of learning the essential patterns.
When a model overfits, it loses its ability to generalize, leading to poor
performance on test data. Overfitting is particularly problematic in image clas-
sification tasks, as the model may learn to recognize specific features present
in the training images rather than capturing the essential characteristics of the
disease it is supposed to detect.
The figure illustrates the impact of regularization on the model’s
performance. Without regularization, the training accuracy rapidly reaches
100%, while the validation accuracy struggles to surpass 75%. This
discrepancy between training and validation accuracies is a strong indication
of overfitting: the model becomes too specialized in fitting the training data
and fails to generalize well to new, unseen data.
However, with regularization at a strength of 0.02, the model’s ability to
generalize improves significantly. The training accuracy remains high, close
to 95%, while the validation accuracy also rises substantially.
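How a regularization strength enters the training objective can be sketched as below. This assumes the common L2 form, in which the strength multiplies the sum of squared weights; the weights and data loss are hypothetical numbers, and the function names are introduced only for this example.

```python
def l2_penalty(weights, strength):
    """Assumed L2 form: strength times the sum of squared weights."""
    return strength * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, strength):
    """Total loss = data-fitting loss + L2 penalty on the weights."""
    return data_loss + l2_penalty(weights, strength)


# Hypothetical values chosen only to show the arithmetic.
weights = [0.5, -1.0, 2.0]
data_loss = 0.30

loss_plain = regularized_loss(data_loss, weights, 0.0)
loss_l2 = regularized_loss(data_loss, weights, 0.02)

print(loss_plain, loss_l2)
```

Because large weights raise the total loss, the optimizer is pushed toward smaller weights, which limits how closely the model can fit noise in the training data.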
L2 Regularization Strength    Validation Accuracy
0.01                          67.63%
0.02                          81.57%
0.05                          69.39%

Accuracy                      Value
Training Accuracy             93.96%
Validation Accuracy           81.57%
These results showcase the model’s effective learning from the labeled data,
as evidenced by its high accuracy on the training dataset. Additionally, the
relatively high validation accuracy further demonstrates the model’s capability
to generalize well to previously unseen chest X-ray images.
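The comparison of regularization strengths reported above amounts to a small selection step, sketched here with the accuracies from the tables. The variable names are illustrative and the code is not taken from the paper's notebook.

```python
# Validation accuracy (%) for each L2 regularization strength, from the table.
validation_accuracy = {0.01: 67.63, 0.02: 81.57, 0.05: 69.39}

# Pick the strength with the highest validation accuracy.
best_strength = max(validation_accuracy, key=validation_accuracy.get)

# Gap between training and validation accuracy at the chosen strength.
training_accuracy = 93.96
gap = training_accuracy - validation_accuracy[best_strength]

print(best_strength, round(gap, 2))
```

The remaining gap of roughly 12 percentage points between training and validation accuracy is a simple indicator of how much overfitting is left after regularization.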
With these promising outcomes, the Convolutional Neural Network (CNN)
model holds substantial potential for advancing pneumonia detection in medical
imaging, promising more accurate and reliable diagnoses in the field. These
results open new avenues for further research and application of the model in
real-world medical scenarios, bringing us one step closer to enhanced healthcare
outcomes.
5 Applications, Limitations, and Potential Improvements
5.1 Applications of the Model
The developed Convolutional Neural Network (CNN) model for pneumonia clas-
sification using chest X-ray images has several potential applications in the field
of medical imaging. Some practical implications and potential uses of the model,
highlighting its significance in improving healthcare outcomes, are:
• Early Pneumonia Detection:
Timely and accurate diagnosis of pneumonia is crucial for effective treat-
ment and patient management. The CNN model can be utilized as a screen-
ing tool to assist radiologists and healthcare professionals in the early de-
tection of pneumonia [KC21]. By automating the classification process, the
model can expedite the identification of pneumonia cases, enabling prompt
intervention and reducing the risk of complications.
• Support for Clinical Decision-Making:
The CNN model can serve as an aid in clinical decision-making processes.
By providing an objective analysis of chest X-ray images, the model can
assist healthcare professionals in their diagnostic assessments [Sez23]. The
predictions made by the model can be used as a valuable reference, help-
ing physicians validate their initial interpretations and improve diagnostic
accuracy.
• Telemedicine and Remote Areas:
In remote areas or regions with limited access to healthcare facilities, the
CNN model can be employed as a diagnostic tool. By transmitting chest
X-ray images to a centralized location, the model can analyze and classify
the images remotely. This telemedicine application can bridge the gap
in healthcare services, providing access to expert opinions and facilitating
prompt diagnosis, even in underserved regions.
• Education and Training:
The CNN model can also be utilized as an educational tool for medical
students and healthcare professionals. By providing annotated predictions,
the model can aid in the learning process, allowing individuals to compare
their assessments with the model’s classifications. This interactive learning
approach can enhance the understanding of pneumonia patterns in chest
X-ray images and improve diagnostic skills.
to consider for further improvement. Firstly, to enhance the model’s generaliz-
ability across diverse patient populations and imaging conditions, it is essential
to augment the dataset’s size and diversity.
To build transparency and trust with medical professionals, integrating ex-
plainable AI methods is crucial [MKR21]. By providing interpretive insights
into the model’s decision-making process, clinicians can better understand and
trust the predictions.
In the context of deploying the model in real healthcare settings, conducting
clinical validation studies is vital. Collaborating with medical experts and con-
ducting prospective studies can validate the model’s effectiveness, safety, and
practicality for real-world use. Clinical validation is essential to ensure that the
model’s performance aligns with medical standards and guidelines.
In light of the rapid advancements in AI technology, addressing the ethical
implications of deploying AI models in healthcare becomes paramount. Several
concrete steps can be taken to ensure the responsible and ethical integration of
AI. Transparent algorithm development is essential, requiring clear documen-
tation of the model’s decision-making process to foster understanding among
medical professionals. To mitigate potential biases, robust strategies must be
implemented, accompanied by regular evaluation across diverse demographic
groups [SW22]. Incorporating these measures not only promotes the trustwor-
thy adoption of AI in healthcare but also fosters a collaborative environment
where AI augments and enhances the capabilities of medical professionals, ulti-
mately contributing to improved patient care.
By addressing these challenges and areas of improvement, a more powerful
and reliable AI-assisted tool for pneumonia detection can be developed, signifi-
cantly impacting healthcare outcomes and patient care.
6 Conclusion
In the context of this data science project, a Convolutional Neural Network
(CNN) model was developed for pneumonia detection using chest X-ray im-
ages. Capitalizing on the capabilities of deep learning and advanced image clas-
sification techniques, this model exhibits substantial potential to aid medical
professionals in promptly and accurately identifying pneumonia cases.
Throughout this research, fundamental machine learning concepts were ex-
plored, encompassing supervised learning and binary classification. Through the
application of binary classification within the framework of supervised learning,
a robust and automated model was crafted, capable of effectively discerning be-
tween pneumonia and normal (non-pneumonia) conditions within chest X-ray
images.
The methodology included evaluating the model’s accuracy on both the
training and validation datasets. The model reaches 93.96% accuracy on the
training set and 81.57% on the validation set, indicating that it learns from
labeled data and generalizes reasonably well to new instances, which supports
its practical usability.
While the CNN model showcases impressive performance, it was emphasized
that it is not intended to replace the expertise of medical professionals. The
expertise, experience, and clinical judgment of healthcare experts are irreplace-
able, and the model serves as an aid to complement their skills.
In conclusion, the CNN model for pneumonia detection represents a signifi-
cant advancement in the field of medical imaging. By fusing AI technology with
medical expertise, early and accurate pneumonia diagnosis can be achieved,
leading to improved healthcare outcomes and ultimately, saving lives.
7 Acknowledgements
I extend my sincere gratitude to Professor Guillermo Goldsztein for their exem-
plary mentorship and scholarly guidance throughout the course of this research.
Their insightful feedback and dedication to academic excellence have signifi-
cantly enriched the quality of this work.
I would also like to express my appreciation to Davida Kollmar and the Hori-
zon Academic Research Program for their invaluable support. The constructive
feedback and resources provided by this program have greatly contributed to
the refinement of this research.
References
[GWK+18] Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir
Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Gang Wang,
Jianfei Cai, and Tsuhan Chen. Recent advances in convolutional
neural networks. Pattern Recognition, 2018.
[KC21] Lingzhi Kong and Jinyong Cheng. Based on improved deep convo-
lutional neural network model pneumonia image classification. PloS
one, 16(11):e0258804, 2021.
[Kro08] Anders Krogh. What are artificial neural networks? Nature biotech-
nology, 26(2):195–197, 2008.
[Liu17] Danqing Liu. A practical guide to relu. start using and understand-
ing relu. . . — by danqing liu. https://medium.com/@danqing/
a-practical-guide-to-relu-b83ca804f1f7, 2017. Accessed on
30 July 2023.
[LLQ19] Jian Li, Xuanyuan Luo, and Mingda Qiao. On generalization error
bounds of noisy gradient methods for non-convex learning. arXiv
preprint arXiv:1902.00621, 2019.
[Mar] Brendan Martin. Binary classification – learndatasci.
https://www.learndatasci.com/glossary/binary-classification/.
Accessed on 30 July 2023.
[MKR21] Aniek F Markus, Jan A Kors, and Peter R Rijnbeek. The role
of explainability in creating trustworthy artificial intelligence for
health care: a comprehensive survey of the terminology, design
choices, and evaluation strategies. Journal of biomedical informat-
ics, 113:103655, 2021.
[Pol23] Lurvïsh Polodoo. Pneumonia detection using cnn.
https://www.kaggle.com/code/lurvish12/pneumonia-detection-using-cnn, 2023.
Accessed on 5 August 2023.
The Effects of Classical Music Intervention on the
Neuropsychiatric and Cognitive Mechanisms of
Alzheimer’s Disease Patients
Nyneishia Janarthanan
October 17, 2023
Abstract
Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder,
presenting a profound challenge to both neuropsychiatric and cognitive
well-being. As the sixth leading cause of death in the United States, AD
currently lacks a cure. This drawback draws attention to both the
pharmacological and nonpharmacological therapeutic interventions that could
be incorporated into an AD patient’s course of treatment. Among these is the
promise of classical music as a nonpharmacological intervention for AD
patients. Exploring the relationship between classical music and the
neuropsychiatric and cognitive mechanisms of AD reveals the effects of
classical music on memory, spatial reasoning, depression, sleep disorders,
and other AD symptoms. Concepts such as the Mozart effect offer a source of
solace for improving the quality of life of individuals diagnosed with AD.
Moreover, the activation of the brain and the alteration of various brain
structures give rise to the diverse effects of classical music in healthcare
and neurological settings.
1 Introduction
The use of classical music traces its origins to the middle of the 18th
century, serving not only as an art form but also as a wordless language of
its own. Classical music’s purpose has since evolved in many ways: music as a
pain reliever, the beneficial role of music in exercise and sport, musical
leisure activities in aging rehabilitation, etc. However, one field that often
remains overlooked in this regard is the effect of classical music on the
symptoms of Alzheimer’s Disease (AD). Even though neuroscience remains a
highly studied and researched field, many unanswered questions arise from the
topic of music’s effect on the brain and limbic system of an AD patient.
Alzheimer’s – a neurological disease resulting from neuronal degeneration –
ranks among the leading causes of death worldwide. Its hallmark symptoms
include memory loss, cognitive decline, disorientation, aggression,
depression, and a common inability to perform everyday tasks. The likelihood
of developing this disorder is steadily increasing and is expected to keep
rising in the future.
2 Alzheimer’s Disease
Alzheimer’s Disease (AD) – currently ranked as the sixth leading cause of death
in the United States – is a progressive neurodegenerative disorder. It lacks a
definitive cure and primarily impacts a patient’s cognitive functions such as
memory, behavior, and thinking. AD is the most prevalent form of dementia,
which is characterized by a gradual decline in two or more domains of
cognition such as memory, language, behavior, and executive function [A.18].
Neuritic plaques and neurofibrillary tangles are AD’s hallmark indications –
elements measured throughout the progression and regression of any disease
[R.20]. This condition was first comprehensively described by Alois Alzheimer
in 1906 as a “peculiar severe disease process of the cerebral cortex”.
Healthcare costs for AD are estimated to be approximately $500 billion yearly,
covering everything from treatments to routine checkups.
AD, nearly half of the cases are due to mutations in three genes: amyloid pre-
cursor protein (APP), Presenilin-1 (PSEN1) and Presenilin-2 (PSEN2) [J.15].
Presenilin (PSEN) mutations will be discussed in much more detail later. It is
crucial to understand that findings in early-onset familial cases will translate to
the sporadic (no specific family link) late-onset AD.
Aging is the main risk factor for AD, and it cannot be explained by the
popular amyloid hypothesis alone, which asserts that amyloid-beta plaques are
the central feature of the disease. An alternate perspective on aging and AD
centers on the APOE ε4 allele, which remains the most robust genetic risk
factor for sporadic AD. The risk conferred by APOE ε4 is mostly observed in
the 61-65 age group, which supports the statement that symptoms of late-onset
AD first appear around 65 years of age [R.97]. One copy of APOE ε4 is carried
by approximately 25% of individuals, but inheriting this allele does not mean
that a person will surely develop AD [212 1702 1 ]. It is important to note
that the APOE ε4, APOE ε3, and APOE ε2 alleles all play a significant role in
the onset and progression of AD, but APOE ε4 vastly increases the risk of the
disease compared to its counterparts [G.22]. Moreover, the pathogenesis of the
APOE genotypes has been researched well beyond amyloid-beta plaques and Tau
neurofibrillary tangles, providing potential answers to the age-related
progression of AD [T.21]. First, APOE ε4 is associated not only with AD but
also with other conditions such as age-related cognitive decline and Lewy Body
Dementia (LBD) [G.22]. Second, the APOE Cascade Hypothesis connects an
increased risk of AD with aging by stating that the biochemical and
biophysical characteristics of APOE ε4 at a cellular level cause a multitude
of downstream effects observed in AD [T.21].
The cascade – or the successive progression of APOE ε4 – begins at the
biochemical and cellular phase, as demonstrated in Figure 3. Properties of the
allele such as lipidation and receptor binding have harmful impacts on some
cell processes, which could accumulate into cellular stress and eventually
lead to the onset of age-related cognitive decline and AD. Aging and AD are
interconnected but distinct in nature. The number of neurons does not change
severely in aging, whereas neuronal and synapse loss is a key indication of
AD. Nevertheless, the link between aging and the increased risk of developing
AD with the APOE ε4 allele is a heavily researched topic within the field and
has proven to hold a major connection to sporadic AD.
Another widely discussed genetic factor for AD is mutation of the PSEN1 gene,
which encodes the Presenilin-1 (PS1) protein. In early-onset AD or familial
Alzheimer’s Disease (FAD), PSEN1 mutations account for nearly 90% of all
mutations recorded, illustrating the significance of this gene and its protein
products [rSJ17]. The presenilin hypothesis proposes that these deleterious
mutations reduce essential presenilin functions in the brain, triggering both
neurodegeneration and dementia in FAD [rSJ17]. Another component to discuss
with regard to PSEN1 and PS1 is the γ-secretase enzyme, whose catalytic
subunit is PS1. More specifically, γ-secretase cleaves different types of
transmembrane proteins embedded in the plasma membrane of a cell, including
the amyloid precursor protein (APP), a central element of AD. γ-Secretase
produces two types of amyloid-beta proteins in AD: Aβ42 and Aβ40. The only
difference between the two is that Aβ42 has two extra residues at its
C-terminus [Z.13]. It has been proposed in the past that Aβ42 and Aβ40 are
heavily responsible for AD since they accumulate into one of the hallmark
pathological indications of this disease, Aβ plaques. However, PSEN1 mutations
did not increase both proteins; instead, both decreased (especially Aβ40),
which elevated the Aβ42/Aβ40 ratio [rSJ17]. This Aβ42/Aβ40 ratio is a useful
diagnostic marker of AD since the ways in which PSEN1 mutations affect APP and
Aβ plaques are complex and not yet fully understood [C.06]. It is clear that
γ-secretases produce the final Aβ proteins involved in AD, but they also
regulate Notch signaling, which controls cell proliferation, cell fate,
differentiation, and cell death [R.12]. Therefore, pharmacological
interventions such as drug therapy attempt to alter Aβ protein production
without interfering with γ-secretase’s role in Notch signaling [S.12].
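The observation that the amyloid-beta 42/40 ratio rises even though both peptides decrease can be checked with a short calculation. The scaling factors a and b below are illustrative symbols introduced for this sketch, not quantities reported in [rSJ17]: a is the fraction of Aβ42 that remains after the mutation, and b is the fraction of Aβ40 that remains.

```latex
\[
\frac{[\mathrm{A}\beta_{42}]'}{[\mathrm{A}\beta_{40}]'}
  = \frac{a\,[\mathrm{A}\beta_{42}]}{b\,[\mathrm{A}\beta_{40}]}
  = \frac{a}{b}\cdot\frac{[\mathrm{A}\beta_{42}]}{[\mathrm{A}\beta_{40}]},
\qquad
0 < b < a < 1 \;\Longrightarrow\; \frac{a}{b} > 1 .
\]
```

Both peptides fall (a and b are below 1), but because Aβ40 falls proportionally more (b is smaller than a), the ratio is multiplied by a factor greater than 1 and therefore increases.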
The next genetic factor for AD that will be discussed is Down Syndrome
(DS) or Trisomy 21: a genetic disorder caused by the presence of an extra
copy of Chromosome 21 or a part of it. It is distinguished based on cran-
iofacial abnormalities, heart defects, cognitive impairments, and neurological
alterations [M.20]. With over 200,000 cases in the United States alone, DS
is one of the leading genetic risk factors for FAD. Furthermore, clinical and
biomarker changes in DS associated FAD demonstrate that many of the same
cortical regions are affected in both diseases such as the hippocampus and the
prefrontal cortex [M.21a]. As they progress, both diseases share similar cellular
dysfunctions such as impaired autophagy, reduced and/or damaged lysosomal
activity, and mitochondrial dysfunction [M.20].
By the age of 40, NFT and Aβ accumulation are present in the brains of
individuals with DS, which is sufficient to confirm a pathological diagnosis
of AD [C.18b]. As shown in Figure 4, progressive brain inflammation can emerge
as early as the late teenage years in DS, based on recorded intracellular
accumulations of Aβ. The early appearance of AD’s hallmark indications in
individuals with DS can be explained by the presence of neuron-derived
exosomes, which are tiny extracellular vesicles that contain elevated levels
of both Aβ peptides and the hyperphosphorylated Tau protein [C.18b]. Since
exosomes are blood biomarkers, their progression and development can be
monitored, which informs future AD diagnostics, prevention, and potential
treatments in the DS population as well as the general population.
Finally, sex is an important genetic risk factor for developing AD, with
almost two thirds of the late-onset AD population being women [dLMJBRD18].
This cannot be attributed simply to women’s greater longevity compared to men,
because AD pathology starts many years prior to the appearance of most
clinical symptoms [L.18]. However, there is increasing evidence that the
perimenopause to menopause transition (PTMT) – a midlife neuroendocrine
transition specific to women – is heavily responsible for the sex-specific
pathophysiological mechanisms underlying AD [dLMJBRD18]. PTMT is strongly
neurological in nature; it disrupts and alters the systems and mechanisms
regulating estrogen and impacts thermoregulation, circadian rhythm, sleep,
depression, and even cognition [dLMJBRD18]. During PTMT, estrogen,
progesterone, pituitary, hypothalamic, and ovarian hormone levels fluctuate
and decrease. Estrogen, specifically, is found in many areas of the brain
controlling memory and cognitive function, indicating its neurological
significance [E.18]. When the brain’s estrogen network disconnects from other
brain areas, the resulting hypometabolic state becomes a major source of
neurological dysfunction [L.18]. In fact, perimenopausal (PERI) and
postmenopausal (MENO) women show major declines in estrogen-dependent memory
tests compared to men, which is the first indication that PTMT can trigger
cognitive decline in the female population [dLMJBRD18]. Secondly, the MENO and
PERI groups showed higher rates of decline in the cerebral metabolic rate of
glucose (CMRglc) compared to males and premenopausal (PRE) women. Glucose is
necessary to provide the precursors for neurotransmitter synthesis and to fuel
adenosine triphosphate (ATP) production, which is the source of energy at the
cellular level [A.13]. With a noticeable decrease in glucose metabolism, the
neurological workings of the body are severely disrupted in PERI and MENO
individuals. In essence, decreased estrogen levels and the deterioration of
the pathway that affects CMRglc help explain the higher percentage of women
developing AD.
In addition to genetic risk factors, various environmental and predisposing
factors contribute to AD, such as Type 2 diabetes (T2D), also called Type 2
diabetes mellitus (T2DM), obesity, and cerebrovascular disease.
Firstly, the interplay between diabetes, obesity, and AD highlights the
complex relationship between lifestyle factors and the risk of cognitive
decline. Obesity is characterized by an excessive accumulation of body fat,
which is measured using the Body Mass Index (BMI). Obesity can, in turn,
trigger the development of T2D, and the risk of acquiring this disease grows
linearly with an increase in BMI [E.22b]. T2D – representing 90-95% of
diabetic cases – can be defined as a disease affecting metabolic activity,
characterized by the presence of chronic hyperglycemia due to pancreatic cell
failure [C.21]. By itself, hyperglycemia, or high blood glucose, can
contribute to molecular, biochemical, and histopathological lesions in AD
[ML14]. Yet, the main focus when researching the connection between T2D and AD
is insulin resistance – the body’s reduced responsiveness to the insulin
hormone, which results in an increase in blood sugar. The hyperglycemic status
of T2D patients due to insulin resistance disturbs neuronal homeostasis and
affects K-ATP channels, which increases Aβ peptide levels [C.21]. Also, an
increased level of glucose in the blood and the dysregulation of glucose drive
an unregulated non-enzymatic reaction between carbohydrates (such as sugars)
and the free amino groups (-NH2) of proteins, lipids, and nucleic acids, which
results in advanced glycation end-products (AGEs) [C.21]. High levels of AGEs
elicit inflammatory reactions in the brain and are associated with poorer
memory and higher hippocampal levels of insoluble Aβ42 [M.16b]. AGEs promote
Aβ plaque and neurofibrillary tangle formation more in AD patients with T2D
than in non-diabetic AD patients [C.21]. The two main hallmark indications of
AD – Aβ plaques and neurofibrillary tangles – will be discussed in detail in
the pathology section.
Another environmental risk factor for AD is cerebrovascular disease (CVD) – a
type of cardiovascular disease that harms the blood vessels supplying the
brain. CVD is the most frequent type of life-threatening injury to the brain
and is the fifth most common cause of death. CVD and AD share many of the same
risk factors such as the APOE ε4 allele, T2DM, obesity, and age [S.16]. These
include some of the genetic and environmental risk factors of AD previously
discussed, which demonstrates that CVD’s origin is pathologically and
environmentally similar to that of AD. Aβ plaques in AD accumulate in the
extracellular space around neurons; in CVD patients, Aβ builds up in the
cerebral arterioles and capillaries supplying the brain. Most AD patients have
Aβ angiopathy resulting from CVD, which predominantly affects the cerebral
leptomeninges, cortex, cerebellum, and the brain stem [S.16]. Capillary Aβ
angiopathy is detected in approximately 35-45% of AD cases, which provides
robust evidence supporting the hypothesis that CVD can contribute to the
symptoms distinct to AD due to the synergistic relationship of the diseases
[G.21].
symbols and the reduced color intensity in the legend illustrate that many
proteins, enzymes, and essential cell structures in AD turn defective or are
missing altogether, which essentially kills the whole neuron. Nearly all brain
regions are innervated by cholinergic neurons, which are responsible for
processes related to learning, memory, and attention. The progressive
degeneration of the basal forebrain cholinergic neurons (BFCNs), for instance,
is correlated with the harmful symptoms and memory deficits that AD imposes on
a patient [M.16a]. In fact, BFCNs provide the main cholinergic input to the
prefrontal cortices along with other crucial structures such as the amygdala
(responsible for evoking emotions) and the hippocampus, which plays a
significant role in long-term memory function and memory consolidation [M.22].
Pathologically, there are major depletions of the cholinergic synthetic enzyme
choline acetyltransferase (ChAT) and the cholinergic hydrolytic enzyme
acetylcholinesterase (AChE) in and around BFCNs [MM21]. ChAT synthesizes the
neurotransmitter acetylcholine (ACh), whereas AChE breaks down ACh into its
component parts. Together, these enzymes and the ACh neurotransmitter play an
essential role in the nervous system due to their ability to regulate cell
signaling and support effective communication among neighboring neurons. A
dramatic loss of ChAT and AChE activity in a considerable number of AD cases
supports the claim that BFCN degeneration is a strong foundation for the
cholinergic theory of AD.
The more prominent hallmarks of AD, those that contribute to the loss of
neurons and their synapses, are the extracellular neuritic plaques containing
the amyloid-beta (Aβ) protein and the intracellular neurofibrillary tangles
(NFTs) carrying the hyperphosphorylated Tau protein.
As a result, neuronal function is disrupted, which eventually contributes to the
cell’s death.
ROS. These species range from various types of anions to hydrogen peroxide,
many of which have unpaired valence electrons. Such ions and compounds will
readily accept an electron to transition into a stable state with a full octet
in their outer shell. However, until they reach stability, ROS at high
concentrations will react almost immediately with the four major
macromolecules in the body: lipids, proteins, carbohydrates, and nucleic acids
[KH12]. On one hand, ROS are crucial in physiological processes such as redox
regulation and transcription of DNA. On the other hand, they may also induce
undesirable effects and even irreversible outcomes such as the aggravation of
Alzheimer’s.
When ROS levels substantially decrease or increase past their optimal level
of appearance, there can be dangerous consequences such as lack of signaling
when they increase or overshoot signaling (signal is sent exceeding its target)
when they decrease. In the context of AD, increased production of ROS by the
interplay between AB plaques and NFTs raises the risk of oxidative stress:
a condition caused by an imbalance of ROS in cells and tissues that impairs the
body’s ability to detoxify these reactive substances [A.17]. Elevated levels of
ROS leading to oxidative stress can exacerbate age- and disease-dependent
mitochondrial dysfunction, reduce antioxidant defenses around synaptic
activity, and disrupt neuronal cell signaling, ultimately leading to cognitive
dysfunction [E.17]. Even in the absence of ROS, AB plaques and high levels of
Tau can independently worsen mitochondrial dysfunction and interrupt cell
communication, which contributes to a perilous cycle that disrupts
homeostatic control.
The intricate pathology of AD, in summary, is marked by the hallmark features
of neuronal and synaptic loss, AB plaques, NFTs, ROS, and oxidative stress.
The interplay among these factors drives the cognitive and behavioral decline
associated with AD. Ongoing research continues to shed light on novel aspects
of AD pathology, which holds a degree of promise for lessening the impact of
this neurological disorder.
that the deterioration of cholinergic neurons, which release the ACh
neurotransmitter, is primarily responsible for memory loss and learning
deficits. In addition, the ability of the cholinergic system to trigger
depression was suggested in clinical studies nearly 50 years ago [S.19a]. With
respect to changes in the cholinergic system, the hippocampal region is the
crossroads where cognitive deficits meet depressive symptoms [E.22a].
of nocturnal sleep [A.20]. Sleep disorders such as sleep-related breathing
disorders and restless leg syndrome negatively alter circadian fluctuations of
AB in the interstitial brain fluid and cerebrospinal fluid (CSF), which are
related to the production of AB plaques [G.18]. Such sleep abnormalities
increase the production of the pathological Tau protein and AB plaques. To
help AD patients adopt healthy sleep patterns, specific procedures to improve
sleep structure and quality are under continual development. A plethora of NPS
are associated with AD, and extensive, promising research is being done on
alleviating these symptoms.
On the cognitive side, the major symptom of Alzheimer’s is memory loss.
If asked to name a disease that affects memory, most doctors would probably
choose Alzheimer’s. The six major memory systems are episodic memory, semantic
memory, simple classical conditioning, procedural memory, working memory, and
priming. Of these, the deterioration of episodic memory is the most clinically
common cognitive symptom in AD [E.08]. Episodic memory is used when
consciously recalling a particular episode in one’s life, such as watching a
movie with a family member. A loss of episodic memory becomes dangerous when
AD patients forget whether certain medications have been taken or whether the
stove has been turned off [E.08]. Working memory and long-term, explicit memory
are impacted early in the course of this disease [H.13]. The first brain lesions
unique to AD appear in the poorly myelinated limbic neurons in areas affecting
memory, such as the hippocampus. For instance, hippocampal volume shrinks
from 2.5 mL to almost 1.6 mL in the brain of an AD patient, especially as the
disease progresses [H.13]. Even with the inclusion of various AD criteria and
hippocampal biomarkers, several barriers remain in neurological testing
for memory loss in AD.
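As a quick arithmetic check on the volumes quoted above, a drop from 2.5 mL to 1.6 mL corresponds to a loss of about 36 percent of hippocampal volume. The short Python sketch below computes that percentage. The function name `percent_reduction` is an illustrative choice and does not come from the cited study; the two volumes are the ones given in the text.

```python
def percent_reduction(before_ml: float, after_ml: float) -> float:
    """Return the percentage decrease from before_ml to after_ml."""
    if before_ml <= 0:
        raise ValueError("the starting volume must be positive")
    return (before_ml - after_ml) / before_ml * 100.0


# Volumes quoted in the text: 2.5 mL before and almost 1.6 mL in AD.
hippocampal_loss = percent_reduction(2.5, 1.6)
print(f"Hippocampal volume reduction: {hippocampal_loss:.0f}%")
```

The same helper could be applied to any pair of before and after measurements, provided both are expressed in the same unit.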
Other cognitive symptoms of AD are impaired problem solving and language
skills. Something as simple as following a recipe or paying the bills can
become difficult as AD worsens. Language impairments involve a decline in
sociolinguistic abilities, including grasping the meaning of words, fitting a
word or phrase to a situation, and word comprehension [K.15]. These two
prevalent cognitive symptoms appear early in AD, so they are occasionally used
to help diagnose a patient with AD.
3.1 Observed Effects of Music on the Brain
The human ability to perceive and enjoy music is a universal trait that
originated long ago and remains with us. Music is one of the most
powerful and diverse sensory, cognitive, and emotional experiences [T.17]. Our
brains light up when interpreting and perceiving music, and our bodies respond
in several ways, reflecting the powerful connection between music and our
well-being. Not only does music reduce feelings of separation and loneliness,
it also evokes cherished memories and maintains self-esteem, competence, and
independence [T.17]. From a cognitive standpoint, music boosts communicative
abilities, memory, self and environmental presence, and verbal and non-verbal
expression [M.21b]. These improvements originate from the organ responsible
for formulating the very essence of what a human being is: the brain. Beyond
encoding music, different parts of the brain are engaged depending on the type
of music relayed onward from the auditory cortex. This concept is demonstrated
below in Figure 10.
Figure 10: Brain Areas Engaged Based on Emotion Category of Music [P.19].
Different brain areas are engaged according to the emotion category of music,
shown in different colors: joyous music (red), tense music (yellow), and sad
music (blue). Based on statistically significant results, each type of musical
input is allocated to different parts of the brain connected by a bilateral
fronto-parietal network [P.19]. From a structural, cross-section view of
the cerebral cortex, it is clear that music can alter our perception and
emotional response based on the area it activates. For example, music with
a fast tempo and a major mode tends to evoke a positive or happy response,
whereas a slow tempo induces a negative or sad mood. From a functional
standpoint, listening to music over time increases the brain’s alpha waves,
which are associated with relaxation and a calm state of mind [V.18]. Alpha
waves are also closely tied to human cognition and emotions, which generates
distinct physiological and psychological effects on the body [V.18].
Additionally, music causes the release of certain neurotransmitters, which
evoke important emotions, memories, and feelings. For instance, dopamine
is released in the mesolimbic reward system while listening to music, which
heightens the body’s natural reward sensation. Serotonin, a neurotransmitter
involved in mood regulation and learning, also increases in the presence of
auditory stimuli. Higher concentrations of dopamine and serotonin in the
caudate-putamen and nucleus accumbens (areas linked to reward and motor
control) signify that music has a direct impact on the synaptic activity of
these brain areas and on the healthy release of neurotransmitters.
regulation, motivation, defensive behavior, and anxiety in response to specific
stimuli [D.06]. Typically, the hippocampus is associated with the processing
of unpleasant (permanently dissonant) music compared to pleasant (consonant)
music [D.06]. Just as some brain structures perceive components like
rhythm and pitch, the hippocampus recognizes harmonious sounds and separates
them from unstable and tense sounds.
In addition, neuroimaging studies indicate that musical activation overlaps
in the superior temporal gyrus (STG), middle temporal gyrus, middle frontal
gyrus, parietal lobe, supplementary motor area, and premotor cortex [X.19].
During music listening, both the left and right brain hemispheres are
activated, and the right temporal cortex is involved in the perception of
pitch patterns [X.19]. Clearly, music processing is neurologically widespread,
engaging different structures across an attentive brain.
set of benefits that cannot be derived from other types of music. Furthermore,
music with a long-term periodicity, whether of Mozart or other classical com-
posers, resonates within the brain to enhance spatial-temporal performance and
even decrease seizure activity [S.01]. For instance, the compositions of
Greek-American musician Yanni, which are similar to Mozart’s sonatas in tempo,
melody, harmony, and structure, were also effective and reproduced the
improvements in cognitive abilities like reasoning and memory [S.01]. However,
the effects of music may not depend on a specific piece. Even though robust
evidence demonstrates classical music’s ability to improve cognitive skills
such as memory and learning, music that subjects personally like also enhances
alpha wave and beta wave frequencies in the temporal brain regions [R.18].
Therefore, non-classical music such as rock, pop, hip-hop, rhythm and blues,
and even jazz could potentially generate the same results, although this would
depend greatly on the preferences of the individual.
classical music, as a non-pharmacological intervention for AD patients, could
ameliorate the neuropsychiatric and cognitive symptoms of AD.
Figure 12: Gerdner’s Mid-Range Theory of Music Intervention for Agitation
[L.05].
advanced stages of this deadly disease. From alleviating depression, apathy,
and agitation to improving sleep patterns, classical music therapy should be
encouraged and conducted by trained professionals in AD senior homes in order
to offer fragments of comfort in the midst of uncertainty.
symphonies of classical music are proven to provide solace and relief for both
the neuropsychiatric and cognitive mechanisms of AD.
5 Conclusion
Alzheimer’s remains one of the great medical mysteries in the healthcare
field, with unanswered questions regarding its etiology, pathology, and
diagnosis. Nonetheless, it is clear that various genetic and environmental
risk factors such as age, PSEN mutations, sex, obesity, T2D, and CVD
contribute to the development and progression of AD. In terms of AD’s
pathology, several distinctive signs such as AB plaques, NFTs, elevated ROS
levels, and oxidative stress are extremely prominent.
After a thorough examination of the effects of classical music on the
progression of Alzheimer’s, it is clear that implementing classical music
therapy in senior homes and assisted living care centers is of utmost
importance. The observed effects of classical music intervention include
enhanced memory, reduced NPS, and improved cognitive abilities. Classical
music therapy’s ability to alleviate NPS such as depression and agitation, as
well as cognitive symptoms including learning deficits and reduced spatial
reasoning, underscores the value of music as a non-pharmacological
intervention that effectively improves an AD patient’s quality of life
alongside existing treatments.
References
[2A 18] About Alzheimer’s disease: Promoting health and independence
for an aging population. Centers for Disease Control
and Prevention, 2018.
[702 1] Study reveals how APOEE4 gene may increase risk for dementia.
National Institute on Aging, 2021.
[C.18a] Gonzalez C. Armijo E. Bravo-Alegria J. Becerra-Calixto A.
Mays C. E. Soto C. Modeling amyloid beta and tau pathol-
ogy in human cerebral organoids. Mol Psychiatry, 2018.
[E.18] Morgan K. N. Derby C. A. Gleason C. E. Cognitive changes
with reproductive aging perimenopause and menopause. Ob-
stetrics and gynecology clinics of North America, 2018.
[J.03] Terry A. V. Buccafusco J. J. The cholinergic hypothesis of
age and Alzheimer’s disease-related cognitive deficits: Recent
challenges and their implications for novel drug development.
The Journal of Pharmacology and Experimental Therapeutics, 2003.
[J.14] Li X. De Beuckelaer A. Guo J. Ma F. Xu M. Liu J. The
gray matter volume of the amygdala is correlated with the
perception of melodic intervals: a voxel-based morphometry
study. PloS one, 2014.
[J.15] Guerreiro R. Bras J. The age factor in Alzheimer’s disease.
Genome medicine, 2015.
[J.17] Gomez Gallego M. Gomez Garcia J. Music therapy
and Alzheimer’s disease: Cognitive, psychological, and
behavioural effects. Neurology, 2017.
[K.15] Klimova B. Maresova P. Valis M. Hort J. Kuca K.
Alzheimer’s disease and language impairments: social
intervention and medical treatment. Clinical interventions in
aging, 2015.
[KH12] Brieger K. Schiavone S. Miller Jr. J.F. Krause KH. Reac-
tive oxygen species: from health to disease. Swiss Medical
Weekly, 2012.
[L.05] Gerdner L. Effects of individualized versus classical
relaxation music on the frequency of agitation in elderly persons
with Alzheimer’s disease and related disorders. Cambridge
University Press, 2005.
[L.18] Scheyer O. Rahman A. Hristov H. Berkowitz C. Isaacson R.
S. Diaz Brinton R. Mosconi L. Female sex and Alzheimer’s
risk: The menopause connection. The journal of prevention
of Alzheimer’s disease, 2018.
[L.23] Bleibel M. El Cheikh A. Sadier N. S. Abou-Abbas L. The
effect of music therapy on cognitive functions in patients with
Alzheimer’s disease: a systematic review of randomized
controlled trials. Alzheimer’s Research and Therapy, 2023.
[M.14] Pauwels E. K. Volterrani D. Mariani G. Kostkiewics M.
Mozart music and medicine. Medical principles and prac-
tice : international journal of the Kuwait University Health
Science Centre, 2014.
[M.16a] Ferreira-Vieira T. H. Guimaraes I. M. Silva F. R. Ribeiro F.
M. Alzheimer’s disease: Targeting the cholinergic system.
Current neuropharmacology, 2016.
[M.16b] Lubitz I. Ricny J. Atrakchi-Baranes D. Shemesh C. Kravitz
E. Liraz-Zaltsman S. Maksin-Matveev A. Cooper I. Leibowitz
A. Uribarri J. Schmeidler J. Cai W. Kristofikova
Z. Ripova D. LeRoith D. Schnaider-Beeri M. High dietary
advanced glycation end products are associated with
poorer spatial learning and accelerated AB deposition in an
Alzheimer mouse model. Aging cell, 2016.
[O.14] Gouras G. K. Olsson T. T. Hansson O. AB-amyloid peptides
and amyloid plaques in Alzheimer’s disease. Neurotherapeutics,
2014.
[S.16] Love S. Miners J. S. Cerebrovascular disease in aging and
Alzheimer’s disease. Acta neuropathologica, 2016.
The Effects of Classical Music Intervention on the
Neuropsychiatric and Cognitive Mechanisms of
Alzheimer’s Disease Patients
Nyneishia Janarthanan
October 17, 2023
Abstract
Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder that
presents a profound challenge to both neuropsychiatric and cognitive
well-being. AD is the sixth leading cause of death in the United States and
currently lacks a cure. This gap draws attention to both the pharmacological
and nonpharmacological therapeutic interventions that could be incorporated
into an AD patient’s course of treatment. Among these is the transformative
promise of classical music as a nonpharmacological intervention for AD
patients. Exploring the relationship between classical music and the
neuropsychiatric and cognitive mechanisms of AD unveils the effects of
classical music on memory, spatial reasoning, depression, sleep disorders, and
other AD symptoms. Concepts such as the Mozart effect offer a source of solace
for improving the quality of life of individuals diagnosed with AD. Moreover,
the activation of the brain and the alteration of various brain structures
give rise to the diverse effects of classical music in a healthcare and
neurological setting.
1 Introduction
Classical music traces its origins to the middle of the 18th century, serving
not only as an art form but also as a wordless language of its own. Since
then, its purpose has evolved in many ways: music as a pain reliever, the
beneficial role of music in exercise and sport, musical leisure activities in
aging rehabilitation, and so on. However, one field that often remains
overlooked in this regard is the effect of classical music on the symptoms of
Alzheimer’s Disease (AD). Even though neuroscience remains a highly studied
and researched field, a plethora of unanswered questions arise from the topic
of music’s effect on the brain and limbic system of an AD patient.
Alzheimer’s, a neurological disease resulting from neuronal degeneration,
ranks among the leading causes of death worldwide. Its hallmark symptoms
include memory loss, cognitive decline, disorientation, aggression,
depression, and a common inability to perform everyday tasks. The likelihood
of developing this disorder is steadily increasing and is expected to rise
considerably in the future.
2 Alzheimer’s Disease
Alzheimer’s Disease (AD), currently ranked as the sixth leading cause of death
in the United States, is a progressive neurodegenerative disorder. It lacks a
definitive cure and primarily impacts cognitive functions such as memory,
behavior, and thinking. AD is the most prevalent form of dementia, which is
characterized by a gradual decline in two or more domains of cognition such as
memory, language, behavior, and executive function [A.18]. Neuritic plaques
and neurofibrillary tangles are AD’s hallmark indications, that is, elements
measured throughout the progression and regression of a disease [R.20]. The
condition was first comprehensively described by Alois Alzheimer in 1906 as a
“peculiar severe disease process of the cerebral cortex”. Healthcare costs for
AD are estimated at approximately $500 billion yearly, covering everything
from treatments to routine checkups.
AD, nearly half of the cases are due to mutations in three genes: amyloid pre-
cursor protein (APP), Presenilin-1 (PSEN1) and Presenilin-2 (PSEN2) [J.15].
Presenilin (PSEN) mutations will be discussed in much more detail later. It is
crucial to understand that findings in early-onset familial cases will translate to
the sporadic (no specific family link) late-onset AD.
Aging is the main risk factor for AD, yet it cannot be explained simply by
the popular amyloid hypothesis, which asserts that amyloid-beta plaques are
the central feature of this disease. An alternative perspective on aging and
AD centers on the APOEE4 allele, which remains the most robust genetic risk
factor for sporadic AD. In AD, the risk conferred by APOEE4 is mostly observed
in the 61-65 age group, which supports the statement that symptoms of
late-onset AD first appear around 65 years of age [R.97]. One copy of APOEE4
is carried by approximately 25% of individuals, but inheriting this allele
does not mean that a person will surely develop AD
[212 1702 1 ]. It is important to note that the APOEE4, APOEE3, and APOEE2
alleles all play a significant role in the onset and progression of AD, but
APOEE4 vastly increases the risk of the disease compared to its counterparts
[G.22]. Moreover, the pathogenesis of the APOE genotypes has been researched
well beyond amyloid-beta plaques and Tau neurofibrillary tangles, providing
potential answers to the age-related progression of AD [T.21]. First, APOEE4
is associated not only with AD but also with other conditions such as
age-related cognitive decline and Lewy Body Dementia (LBD) [G.22]. Second,
the APOE Cascade Hypothesis connects the dots between an increased risk of
AD and aging by stating that the biochemical and biophysical characteristics
of APOEE4 at a cellular level cause a multitude of downstream effects observed
in AD [T.21].
The cascade – or the successive progression of APOEE4 – begins at the
biochemical and cellular phase as demonstrated in Figure 3. Properties of the
allele such as lipidation and receptor binding have harmful impacts on some cell
processes, which could accumulate into cellular stress and eventually lead to the
onset of age-related cognitive decline and AD. Aging and AD are interconnected
but are distinct in nature. The number of neurons does not change severely in
aging, but neuronal and synapse loss is a key indication of AD. Nevertheless,
the link between aging and the increased risk of developing AD with the APOEE4
allele is a heavily researched topic within the field and appears to hold a
major connection to sporadic AD.
Another genetic factor for AD that is widely discussed is mutation of the
PSEN1 gene, which encodes the Presenilin-1 (PS1) protein. In early-onset AD,
or familial Alzheimer’s Disease (FAD), PSEN1 mutations account for nearly 90%
of all mutations recorded in FAD, illustrating the significance of this gene
and its protein products [rSJ17]. The presenilin hypothesis proposes that
these deleterious mutations reduce essential presenilin functions in the
brain, triggering both neurodegeneration and dementia in FAD [rSJ17]. Another
component to discuss with regard to PSEN1 and PS1 is the γ-secretase enzyme,
whose catalytic subunit is PS1. More specifically, γ-secretase cleaves
different types of transmembrane proteins in the plasma membrane of a cell,
including the amyloid precursor protein (APP), a central element of AD.
γ-Secretases produce two types of amyloid-beta proteins in AD: AB42 and AB40.
The only difference between the two is that AB42 has two extra residues at its
C-terminus [Z.13]. It has been proposed in the past that AB42 and AB40 are
heavily responsible for AD since they accumulate into one of the hallmark
pathological indications for this disease: AB plaques. However, PSEN1
mutations did not increase both proteins; instead, both decreased in number
(especially AB40), which elevated the AB42/AB40 ratio [rSJ17]. This AB42/AB40
ratio is a useful diagnostic marker of AD, since the ways in which PSEN1
mutations affect APP and AB plaques are complex and not yet fully understood
[C.06]. It is clear that γ-secretases produce the final AB proteins involved
in AD, but they also take part in Notch signaling, which regulates cell
proliferation, cell fate, differentiation, and cell death [R.12]. Therefore,
pharmacological interventions such as drug therapy attempt to alter AB protein
production without interfering with the role of γ-secretases in
Notch signaling [S.12].
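The point about the ratio can be made concrete with a little arithmetic: if both peptides decrease but AB40 falls proportionally more than AB42, the AB42/AB40 ratio still rises. The Python sketch below uses hypothetical concentrations and percentage drops chosen only for illustration; they are not measurements from [rSJ17] or any other study, and the function name `ab_ratio` is likewise an assumption.

```python
def ab_ratio(ab42: float, ab40: float) -> float:
    """Return the AB42/AB40 ratio for two levels given in the same units."""
    if ab40 <= 0:
        raise ValueError("the AB40 level must be positive")
    return ab42 / ab40


# Hypothetical baseline levels in arbitrary units (illustration only).
baseline_ab42, baseline_ab40 = 10.0, 100.0

# Assumed effect of a mutation: AB42 drops by 20%, AB40 drops by 60%.
mutant_ab42 = baseline_ab42 * 0.8
mutant_ab40 = baseline_ab40 * 0.4

baseline_ratio = ab_ratio(baseline_ab42, baseline_ab40)
mutant_ratio = ab_ratio(mutant_ab42, mutant_ab40)

# Both absolute levels are lower, yet the ratio is higher.
print(f"baseline ratio: {baseline_ratio:.2f}, mutant ratio: {mutant_ratio:.2f}")
```

Under these assumed numbers the ratio doubles even though less of each peptide is present, which is why a ratio can carry information that the absolute levels alone do not.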
The next genetic factor for AD to be discussed is Down Syndrome (DS), or
Trisomy 21: a genetic disorder caused by the presence of an extra copy of
Chromosome 21 or a part of it. It is characterized by craniofacial
abnormalities, heart defects, cognitive impairments, and neurological
alterations [M.20]. With over 200,000 cases in the United States alone, DS
is one of the leading genetic risk factors for FAD. Furthermore, clinical and
biomarker changes in DS-associated FAD demonstrate that many of the same
cortical regions, such as the hippocampus and the prefrontal cortex, are
affected in both diseases [M.21a]. As they progress, both diseases share
similar cellular dysfunctions such as impaired autophagy, reduced and/or
damaged lysosomal activity, and mitochondrial dysfunction [M.20].
By the age of 40, NFT and AB accumulations are present in the brains
of individuals with DS, which is sufficient to confirm a pathological
diagnosis of AD [C.18b]. As shown in Figure 4, progressive brain inflammation
can emerge as early as the late teenage years in DS, based on recorded
intracellular accumulations of AB. The early appearance of AD’s hallmark
indications in individuals with DS can be explained by the presence of
neuron-derived exosomes, which are tiny extracellular vesicles that contain
elevated levels of both AB peptides and the hyperphosphorylated Tau protein
[C.18b]. Since exosomes are blood biomarkers, their progression and
development can be monitored, which informs future AD diagnostics, prevention,
and potential treatments in the DS population as well as the general population.
Finally, sex is an important genetic risk factor for developing AD, with
almost two thirds of the late-onset AD population being women [dLMJBRD18].
It cannot simply be stated that women are more likely to develop AD because
they live longer than men, since AD pathology starts many years before most
clinical symptoms appear [L.18]. Instead, there is increasing evidence that
the perimenopause to menopause transition (PTMT), a midlife neuroendocrine
transition specific to women, is heavily responsible for the sex-specific
pathophysiological mechanisms underlying AD [dLMJBRD18]. PTMT is strongly
neurological in nature; it disrupts and alters the systems and mechanisms
regulating estrogen and impacts thermoregulation, circadian rhythm, sleep,
depression, and even cognition [dLMJBRD18]. During PTMT, estrogen,
progesterone, pituitary, hypothalamic, and ovarian hormone levels fluctuate
and decrease. Estrogen, specifically, is found in a plethora of brain areas
controlling memory and cognitive function, indicating its neurological
significance [E.18]. When the brain’s estrogen network disconnects from other
brain areas, the resulting hypometabolic state sets the stage for neurological
dysfunction [L.18].
In fact, perimenopausal (PERI) and postmenopausal (MENO) women show
major declines in estrogen-dependent memory tests compared to men, which
is the first indication that PTMT can trigger cognitive decline in the female
population [dLMJBRD18]. Secondly, the MENO and PERI groups showed greater
declines in the cerebral metabolic rate for glucose consumption (CMRglc)
compared to males and premenopausal (PRE) women. Glucose is necessary to
provide the precursors for neurotransmitter synthesis and to fuel the
production of adenosine triphosphate (ATP), the main source of energy at the
cellular level [A.13]. With a noticeable decrease in glucose consumption, the
neurological workings of the body are severely disrupted in PERI and MENO
individuals. In essence, decreased estrogen levels and the deterioration of
the pathway that affects CMRglc help explain the higher percentage of women
developing AD.
In addition to genetic risk factors, various environmental predisposing
factors contribute to AD, such as Type 2 diabetes (T2D), also called Type 2
diabetes mellitus (T2DM), obesity, and cerebrovascular disease.
Firstly, the interplay between diabetes, obesity, and AD highlights the
complex relationship between lifestyle factors and the risk of cognitive
decline. Obesity is characterized by an excessive accumulation of body fat,
which is measured using the Body Mass Index (BMI). Obesity can, in turn,
trigger the development of T2D, and the risk of acquiring this disease grows
linearly with an increase in BMI [E.22b]. T2D, which represents 90-95% of
diabetic cases, can be defined as a disease affecting metabolic activity,
characterized by chronic hyperglycemia due to pancreatic cell failure [C.21].
By itself, hyperglycemia, or high blood glucose, can contribute to molecular,
biochemical, and histopathological lesions in AD [ML14]. Yet the main focus
when researching the connection between T2D and AD is insulin resistance: the
body’s reduced responsiveness to the hormone insulin, which results in an
increase in blood sugar. The hyperglycemic status of T2D patients due to
insulin resistance disturbs neuronal homeostasis and affects K-ATP channels,
which increases AB peptide levels [C.21]. Also, an increased level of glucose
in the blood and the dysregulation of glucose drive an unregulated
non-enzymatic reaction between sugars and the free amino groups (-NH2) of
proteins, lipids, and nucleic acids, which results in advanced glycation
end-products (AGEs) [C.21]. High levels of AGEs elicit inflammatory reactions
in the brain and are associated with poorer memory and higher hippocampal
levels of insoluble AB42 [M.16b]. AGEs promote AB plaque and neurofibrillary
tangle formation more in AD patients with T2D than in non-diabetic AD patients
[C.21]. The two main hallmark indications of AD, AB plaques and
neurofibrillary tangles, will be
discussed in detail in the pathology section.
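BMI, mentioned above as the measure of obesity, is conventionally computed as body weight in kilograms divided by the square of height in meters. A minimal Python sketch follows. The function names and the example person are hypothetical, and the cutoff of 30 kg/m^2 is the commonly used adult obesity threshold rather than a value taken from this paper.

```python
# Commonly used adult obesity cutoff in kg/m^2 (an assumption, not from the paper).
OBESITY_BMI_THRESHOLD = 30.0


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese(weight_kg: float, height_m: float) -> bool:
    """Return True when BMI meets or exceeds the assumed obesity cutoff."""
    return body_mass_index(weight_kg, height_m) >= OBESITY_BMI_THRESHOLD


# Hypothetical example person: 95 kg and 1.75 m tall.
example_bmi = body_mass_index(95.0, 1.75)
print(f"BMI: {example_bmi:.1f}, obese: {is_obese(95.0, 1.75)}")
```

Because BMI depends only on weight and height, it is a screening measure of body fat rather than a direct one, which is worth keeping in mind when reading the BMI and T2D relationship described above.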
Another environmental risk factor for AD is cerebrovascular disease (CVD),
a type of cardiovascular disease that harms the blood vessels supplying the
brain. CVD is the most frequent type of life-threatening injury to the brain
and is the fifth most common cause of death. CVD and AD share many of
the same risk factors, such as the APOEE4 allele, T2DM, obesity, and age
[S.16]. These include some of the genetic and environmental risk factors of
AD previously discussed, which demonstrates that CVD’s origin is
pathologically and environmentally similar to that of AD. In AD, AB plaques
accumulate in the extracellular space around neurons; in CVD patients, AB
builds up in the cerebral arterioles and capillaries supplying the brain.
Most AD patients have AB angiopathy resulting from CVD, which predominantly
affects the cerebral leptomeninges, cortex, cerebellum, and the brain stem
[S.16]. Capillary AB angiopathy is detected in roughly 35-45% of AD cases,
which provides robust evidence supporting the hypothesis that CVD can
contribute to the symptoms distinct to AD due to the synergistic relationship
of the diseases [G.21].
symbols and the reduced color intensity in the legend illustrate that many
proteins, enzymes, and essential cell structures in AD turn defective or are
missing altogether, which essentially kills the whole neuron. Nearly all brain
regions are innervated by cholinergic neurons, which are responsible for
processes related to learning, memory, and attention. The progressive
degeneration of the basal forebrain cholinergic neurons (BFCNs), for instance,
is correlated with the harmful symptoms and memory deficits that AD imposes on
a patient [M.16a]. In fact, BFCNs provide the main cholinergic input to
prefrontal cortices in the brain along with other crucial structures such as
the amygdala (responsible for evoking emotions) and the hippocampus, which
plays a significant role in long-term memory function and memory consolidation
[M.22]. Pathologically, there have been major depletions of the cholinergic
synthetic enzyme named choline acetyltransferase (ChAT) and the cholinergic
hydrolytic enzyme acetylcholinesterase (AChE) in and around BFCNs [MM21].
ChAT catalyzes the synthesis of ACh, whereas AChE breaks ACh down into its
component parts. Together, these enzymes and the ACh neurotransmitter play an
essential role in the nervous system because they regulate cell signaling and
support effective communication among neighboring neurons. A dramatic loss of
ChAT and AChE activity in a considerable number of AD cases supports the claim
that BFCN degeneration is a strong foundation for the cholinergic theory of AD.
The more prominent hallmarks of AD, which contribute to the loss of neurons
and their synapses, are the extracellular neuritic plaques containing the
amyloid-beta (AB) protein and the intracellular neurofibrillary tangles (NFTs)
carrying the hyperphosphorylated Tau protein.
As a result, neuronal function is disrupted, which eventually contributes to the
cell’s death.
ROS. These species range from various types of anions to hydrogen peroxide,
and many of them carry unpaired valence electrons. Such species will readily
accept an electron to reach a stable state with a full octet in their outer
shell. Until they reach stability, however, ROS at high concentrations will
react almost immediately with the four major macromolecules in the body:
lipids, proteins, carbohydrates, and nucleic acids [KH12]. On one hand, ROS
are crucial in physiological processes such as redox regulation and
transcription of DNA. On the other hand, they may also induce undesirable and
even irreversible outcomes, such as the protein aggregation seen in Alzheimer’s.
When ROS levels substantially decrease or increase past their optimal level
of appearance, there can be dangerous consequences such as lack of signaling
when they increase or overshoot signaling (signal is sent exceeding its target)
when they decrease. In the context of AD, increased production of ROS by the
interrelation between AB plaques and NFTs raises the risk of oxidative stress:
a condition caused by the imbalance of ROS in cells and tissues, impairing the
body’s ability to detoxify these reactive substances [A.17]. Elevated levels of
ROS leading to oxidative stress can exacerbate age and disease-dependent mi-
tochondrial dysfunction, reduce antioxidant defences around synaptic activity,
and disrupt neuronal cell signaling, ultimately leading to cognitive dysfunc-
tion [E.17]. Even in the absence of ROS, AB plaques and high levels of Tau can
independently worsen mitochondrial dysfunction and interrupt cell communica-
tion, which inevitably contributes to a perilous cycle in the body with disruptive
homeostatic control.
The intricate pathology of AD, in summary, is marked by hallmark features
of neuronal and synaptic loss, AB plaques, NFTs, ROS, and oxidative stress.
The interconnectedness between these factors wholly contributes to the cognitive
and behavioral decline associated with AD. Ongoing research continues to shed
light on novel aspects of AD pathology, which holds a degree of promise for
lessening the impact of this neurological disorder.
that the deterioration of cholinergic neurons, which release the ACh neurotransmitter,
is primarily responsible for memory loss and learning deficits. In
addition, a role for the cholinergic system in triggering depression was
suggested in clinical studies nearly 50 years ago [S.19a]. With respect
to changes in the cholinergic system, the hippocampal region is the crossroads
where cognitive deficits meet depressive symptoms [E.22a].
of nocturnal sleep [A.20]. Sleep disorders such as sleep breathing disorders
and restless leg syndrome negatively alter circadian fluctuations of AB in the
interstitial brain fluid and cerebrovascular fluid (CVF) related to the production
of AB plaques [G.18]. Such sleep abnormalities evoke the increased production
of the pathological Tau protein and AB plaques. To help AD patients
adopt healthy sleep patterns, specific procedures to
improve sleep structure and quality are constantly being developed. There
is a plethora of NPS associated with AD, and extensive, promising
research is being done on alleviating these symptoms.
Conversely, the major cognitive symptom in Alzheimer’s is memory loss.
If asked to name a disease that affects memory, most doctors would probably
choose Alzheimer’s. The six major memory systems include episodic, seman-
tic, simple classical, procedural, working, and priming memory. Of these ma-
jor categorizations, the deterioration of episodic memory is the most clinically
abundant cognitive symptom in AD [E.08]. Episodic memory is utilized when
consciously recalling a particular episode in one’s life, such as watching a movie
with a family member. Dangers arise from a loss in episodic memory when
AD patients forget if certain medications have been taken or even if the stove
is turned off or not [E.08]. Working memory and long-term, explicit memory
are impacted early in the course of this disease [H.13]. The first brain lesions
unique to AD appear in the poorly myelinated limbic neurons in areas affecting
memory, such as the hippocampus. For instance, hippocampal volume reduces
from 2.5 mL to almost 1.6 mL in the brain of an AD patient, especially as the
disease progresses [H.13]. Even with the inclusion of various AD criteria and
hippocampal biomarkers, there remain several barriers in neurological testing
for memory loss in AD.
Other cognitive symptoms of AD are impaired problem solving and language
skills. Something as simple as following a recipe or paying the bills can
become difficult as AD worsens. Language impairments arise from a decline in
sociolinguistic abilities, such as grasping the meaning of words, fitting a word
or phrase to a situation, and word comprehension [K.15]. These two prevalent
cognitive symptoms appear early on in AD, so they are occasionally used
to help diagnose a patient with AD.
3.1 Observed Effects of Music on the Brain
The ability of humans to perceive and enjoy music is a universal trait that
originated centuries ago and is still with us today. Music is one of the most
powerful and diverse sensory, cognitive, and emotional experiences [T.17]. Our
brains light up when interpreting and perceiving music, and our bodies respond
in several ways, reflecting the powerful connection between music and our well-
being. Not only does music reduce feelings of separation and loneliness, it
also evokes cherished memories and maintains self-esteem, competence, and
independence [T.17]. From a cognitive standpoint, music boosts communicative
abilities, memory, self and environmental presence, and verbal and non-verbal
expressions [M.21b]. The improved projection of all these skills originates from
the organ responsible for formulating the very essence of what a human being
is: the brain. On top of encoding music, various parts of the brain are also
engaged based on the type of music traveling from the auditory cortex to the
brain’s nerve signals. This concept is demonstrated below in Figure 10.
Figure 10: Brain Areas Engaged Based on Emotion Category of Music [P.19].
Different brain areas are engaged according to the emotion category of music
in different colors – joyous music (red), tense music (yellow), and sad music
(blue). Based on conclusive results and statistically significant data, the type
of musical input is allocated to different parts of the brain connected by a
bilateral fronto-parietal network [P.19]. From a structural, cross-section view of
the cerebral cortex, it is clear that music in the brain can alter our perception
and emotional response based on the area it activates. For example, music with
a fast tempo and a major mode tends to evoke a positive/happy response, but a
slow tempo induces a negative/sad mood. From a functional standpoint, listening
to music over time increases the brain’s alpha waves, which are associated
with relaxation and a calm state of mind [V.18]. The brain’s alpha waves are
also closely linked to human cognition and emotions, which generates
distinct physiological and psychological effects on the body [V.18].
Additionally, music causes the release of certain neurotransmitters, which
evokes important emotions, memories, and feelings. For instance, dopamine
is released in the mesolimbic reward system while listening to music, which
increases the body’s natural reward sensation. Serotonin, a neurotransmitter
involved in mood regulation and learning, also increases in the presence of auditory
stimuli. Higher concentrations of dopamine and serotonin in the caudate-putamen
and nucleus accumbens (areas linked to reward and motor control)
signify that music has a direct impact on the synaptic activity of these brain
areas and the amount of neurotransmitters released in a healthy manner.
regulation, motivation, defensive behavior, and anxiety in response to specific
stimuli [D.06]. Typically, the hippocampus is associated with the processing
of unpleasant (permanently dissonant) music compared to pleasant (consonant)
music [D.06]. Just as some brain structures can perceive components like
rhythm and pitch, the hippocampus recognizes harmonious sounds and separates
them from unstable and tense sounds.
To add, more neuroimaging studies indicate that an overlap in musical ac-
tivation occurs in the superior temporal gyrus (STG), middle temporal gyrus,
middle frontal gyrus, parietal lobe, supplementary motor area, and premotor
cortex [X.19]. During music listening, both the left and right brain hemispheres
are activated, and the right temporal cortex is even involved in the perception
of pitch patterns [X.19]. Clearly, the processing of music is a neurologically
widespread process, with differing structures involved in an attentive
brain.
set of benefits that cannot be derived from other types of music. Furthermore,
music with a long-term periodicity, whether of Mozart or other classical com-
posers, resonates within the brain to enhance spatial-temporal performance and
even decrease seizure activity [S.01]. For instance, Greek-American musician
Yanni’s compositions – similar to those of Mozart’s Sonatas in tempo, melody,
harmony, and structure – were also effective and reproduced the exact results
that improved cognitive abilities like reasoning and memory [S.01]. However,
the effects of music may not be dependent on a specific piece. Even though ro-
bust evidence demonstrates classical music’s ability to improve cognitive skills
such as memory and learning, music that is personally liked by subjects turns
out to enhance alpha wave and beta wave frequencies in the temporal brain
regions as well [R.18]. Therefore, non-classical music such as rock, pop, hip-hop,
rhythm and blues, and even jazz could potentially generate the same results,
although this would depend greatly on the preferences of the individual.
classical music, as a non-pharmacological intervention for AD patients, could
ameliorate the neuropsychiatric and cognitive symptoms of AD.
Figure 12: Gerdner’s Mid-Range Theory of Music Intervention for Agitation
[L.05].
advanced stages of this deadly disease. From alleviating depression, apathy,
and agitation to improving sleep patterns, classical music therapy should be
encouraged and conducted by trained professionals in AD senior homes in order
to offer fragments of comfort in the midst of uncertainty.
symphonies of classical music are proven to provide solace and relief for both
the neuropsychiatric and cognitive mechanisms of AD.
5 Conclusion
Alzheimer’s is one of the great medical mysteries in the healthcare field, with
unanswered questions regarding its etiology, pathology, and diagnosis. Nonetheless,
it is clear that various genetic and environmental risk factors such as age,
PSEN mutations, gender, obesity, T2D, and CVD are somewhat responsible for
the progression and development of AD. In terms of AD’s pathology, several
distinctive signs such as AB plaques, NFTs, elevated ROS levels, and oxidative
stress are extremely prominent.
After a thorough examination of the effects of classical music on the pro-
gression of Alzheimer’s, the implementation of classical music therapy in senior
homes and assisted living care centers is of utmost importance. The researched,
observed effects of classical music intervention result in enhanced memory, re-
duced NPS, and redeveloped cognitive abilities. Classical music therapy’s ability
to alleviate NPS such as depression and agitation as well as cognitive symptoms,
including learning defects and reduced spatial reasoning, underscores the value
of music as a non-pharmacological intervention that effectively improves an AD
patient’s quality of life alongside existing treatments.
References
[2A 18] About Alzheimer’s disease: Promoting health and independence
for an aging population. Centers for Disease Control
and Prevention, 2018.
[702 1] Study reveals how APOE4 gene may increase risk for dementia.
National Institute on Aging, 2021.
[C.18a] Gonzalez C. Armijo E. Bravo-Alegria J. Becerra-Calixto A.
Mays C. E. Soto C. Modeling amyloid beta and tau pathol-
ogy in human cerebral organoids. Mol Psychiatry, 2018.
[E.18] Morgan K. N. Derby C. A. Gleason C. E. Cognitive changes
with reproductive aging perimenopause and menopause. Ob-
stetrics and gynecology clinics of North America, 2018.
[J.03] Terry A. V. Buccafusco J. J. The cholinergic hypothesis of
age and Alzheimer’s disease-related cognitive deficits: Recent
challenges and their implications for novel drug development.
The Journal of Pharmacology and Experimental Therapeutics,
2003.
[J.14] Li X. De Beuckelaer A. Guo J. Ma F. Xu M. Liu J. The
gray matter volume of the amygdala is correlated with the
perception of melodic intervals: a voxel-based morphometry
study. PLoS ONE, 2014.
[J.15] Guerreiro R. Bras J. The age factor in Alzheimer’s disease.
Genome medicine, 2015.
[J.17] Gomez Gallego M. Gomez Garcia J. Music therapy
and Alzheimer’s disease: Cognitive, psychological, and behavioural
effects. Neurology, 2017.
[K.15] Klimova B. Maresova P. Valis M. Hort J. Kuca K.
Alzheimer’s disease and language impairments: social inter-
vention and medical treatment. Clinical interventions in ag-
ing, 2015.
[KH12] Brieger K. Schiavone S. Miller Jr. J.F. Krause KH. Reac-
tive oxygen species: from health to disease. Swiss Medical
Weekly, 2012.
[L.05] Gerdner L. Effects of individualized versus classical relaxation
music on the frequency of agitation in elderly persons
with Alzheimer’s disease and related disorders. Cambridge
University Press, 2005.
[L.18] Scheyer O. Rahman A. Hristov H. Berkowitz C. Isaacson R.
S. Diaz Brinton R. Mosconi L. Female sex and Alzheimer’s
risk: The menopause connection. The journal of prevention
of Alzheimer’s disease, 2018.
[L.23] Bleibel M. El Cheikh A. Sadier N. S. Abou-Abbas L. The effect
of music therapy on cognitive functions in patients with
Alzheimer’s disease: a systematic review of randomized controlled
trials. Alzheimer’s Research and Therapy, 2023.
[M.14] Pauwels E. K. Volterrani D. Mariani G. Kostkiewics M.
Mozart music and medicine. Medical principles and prac-
tice : international journal of the Kuwait University Health
Science Centre, 2014.
[M.16a] Ferreira-Vieira T. H. Guimaraes I. M. Silva F. R. Ribeiro F.
M. Alzheimer’s disease: Targeting the cholinergic system.
Current neuropharmacology, 2016.
[M.16b] Lubitz I. Ricny J. Atrakchi-Baranes D. Shemesh C. Kravitz
E. Liraz-Zaltsman S. Maksin-Matveev A. Cooper I. Lei-
bowitz A. Uribarri J. Schmeidler J. Cai W. Kristofikova
Z. Ripova D. LeRoith D. Schnaider-Beeri M. High dietary
advanced glycation end products are associated with
poorer spatial learning and accelerated AB deposition in an
Alzheimer mouse model. Aging cell, 2016.
[O.14] Gouras G. K. Olsson T. T. Hansson O. AB-amyloid peptides
and amyloid plaques in Alzheimer’s disease. Neurotherapeutics,
2014.
[S.16] Love S. Miners J. S. Cerebrovascular disease in aging and
Alzheimer’s disease. Acta neuropathologica, 2016.
Comparing the Effectiveness of Support Vector
Classifier and Stochastic Gradient Descent in
Hate-Speech Detection
∗
Dania Noman Ali
October 17, 2023
Abstract
The increased use of social media, now easily accessible to most people in
the world, has given rise to a multitude of problems, with cyberbullying
and online hate-speech standing out as significant issues. The ability
of users to maintain their anonymity and post things that would
be considered uncivil in a one-to-one real-life conversation has led to the
widespread dissemination of online hate-speech, posing significant societal
challenges and detrimental effects on individuals’ mental health. In
this paper, we explore two simple classifiers, Support Vector Classifier
(SVC) and Stochastic Gradient Descent (SGD), which are compared and
analysed through their accuracy scores to determine their effectiveness in
detecting hate-speech within the context of Twitter data. To train the
models, we use a publicly available dataset by Analytics Vidhya, which can be
found on Kaggle.com and contains 32k tweets, each labelled with a ‘1’
if it is sexist/racist or ‘0’ if it is not. The goal of this paper is to identify the
differences in hate-speech detection performance between the two classifiers.
1 Introduction
Cyberbullying, predominantly in the form of hate-speech, is a widespread phenomenon,
especially in the context of Twitter. The ability to share one’s
opinion with billions of people around the globe, with the option to stay completely
anonymous, has made social media a safe haven for the propagation
of hate-speech through remarks that may be sexist or racist, derogatory towards
certain ethnicities, targeted at religious minorities, and/or defamatory of an individual
based on their characteristics [1].
∗ Advised by: Maria Konte of Georgia Institute of Technology
Recently rebranded as ’X’, the social media platform has accrued
a substantial user base of approximately 450 million active users.
Official reports from Twitter indicate that these users collectively post
an average of approximately 500 million tweets per day. Each user
spends, on average, approximately 30.9 minutes per day on the
platform.
Notably, the scale of content generation on this platform is significant, with
users capable of generating up to 2400 posts per day. It is worth emphasizing
that this disproportionately large volume of user-generated content is dissemi-
nated with minimal to no pre-posting filtration or content moderation measures
in place, rendering the platform susceptible to the proliferation of hate speech
and other forms of harmful content. Social media companies invest millions of
dollars in dealing with such issues, mostly through manual moderation
and the deletion of posts/tweets that are deemed ‘offensive’ or that contain hateful references,
incitement, slurs and tropes, dehumanization, or hateful imagery [2]. A study
published in the European Sociological Review investigates the impact of perceived social
acceptability on online hate speech and suggests that interventions based
on descriptive norms, such as moderate censorship, can significantly reduce
hate speech and can guide future interventions in online communities to prevent the
spread of hate [3]. Therefore, the goal of this paper is to investigate and compare
the effectiveness of two classifiers, namely the Support Vector Classifier
and Stochastic Gradient Descent, in hate-speech detection.
2 Text Classification
Text classification is an important task in Natural Language Processing
(NLP) whose primary objective is the automated allocation of text into
predetermined categories. Examples of tasks that can be achieved by text
classification are:
• Sentiment Analysis
• Classifying Emails as Spam or Non-spam
Text classification falls under the category of supervised learning, which means it
relies on a dataset where each document is labelled with its respective category.
This labelled data is used to train a classifier, which can then categorize new
text documents accordingly. This process forms the basis for many text-related
tasks and applications. In this research paper, we delve into two
types of classifiers, Support Vector Classifier (SVC) and Stochastic Gradient
Descent (SGD). The aim of this paper is to find out which classifier performs
better in hate-speech detection.
Stochastic Gradient Descent
Figure 1: SGD updates frequently and with significant variation, resulting in
substantial fluctuations in the objective function.
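The per-example updates described in the caption can be illustrated with a minimal sketch of stochastic gradient descent on a one-parameter least-squares problem. The toy data, learning rate, and function name below are illustrative assumptions and are not taken from the paper's experiments; the point is only that the full-data loss does not fall monotonically when the parameter is updated after every single example.

```python
import random

def sgd_least_squares(xs, ys, lr=0.05, epochs=20, seed=0):
    """Fit y ~ w * x with one parameter update per training example (plain SGD)."""
    rng = random.Random(seed)
    w = 0.0
    losses = []  # full-data mean squared error, recorded after every update
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a new random order each epoch
        for i in order:
            grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]  # d/dw of (w*x_i - y_i)^2
            w -= lr * grad
            losses.append(sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
    return w, losses

# Hypothetical noisy data scattered around the line y = 3x.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [1.7, 2.9, 4.6, 5.8, 7.7]
w, losses = sgd_least_squares(xs, ys)

# Count how often a single update made the overall loss worse.
increases = sum(b > a for a, b in zip(losses, losses[1:]))
print(round(w, 2), len(losses), increases)
```

Because each step follows the gradient of one example rather than of the whole dataset, some steps move the parameter away from the overall optimum, which is the fluctuation the figure refers to.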
Figure 2: How an SVC Works - Source:
©SHUTTERSTOCK.COM/SIDHARTHA CARVALHO
tweet labels
would you please ask these shameless @user @user give 1
goodnight my friends... stay blessed and highly favored!!! thursday fitfam 0
black women demonic porn 1
vandals turned a jewish family’s menorah into a swastika” antisemitism hate 1
i’m pretty sure that warm weather and sun is my meditation sunshine meditate quietà 0
Table 1: Examples of tweets from the dataset and their corresponding labels.
A label of ‘1’ means the tweet contains hate-speech (defined as
sexist/racist remarks for the simplicity of this dataset), and a label of ‘0’ means it
does not.
3 Preparation of Dataset
The dataset used for this paper is from Kaggle.com, provided by Analytics
Vidhya, and is named Twitter Sentiment Analysis [8]. It contains around 32,000
tweets, where label ’1’ denotes that the tweet is racist/sexist and label ’0’ denotes
that it is not. The usernames in this dataset are replaced by
@user owing to copyright concerns. This dataset was chosen for its large
number of tweets, which allows a better test of the two classifiers.
Figure 3: Cleaned tweets after preprocessing. This step is essential for
preparing Twitter text data for subsequent analysis or modeling, as it eliminates
noise and unwanted characters from the tweets.
Figure 4: Example of how CountVectorizer() from sklearn works
Figure 5: Code that employs upsampling to mitigate class imbalance
minority (label ‘1’ - hate-speech tweets). The minority class was resampled
so that the minority and majority classes are of equal size. The upsampled
minority class is then combined with the original majority class, creating
a balanced training set. This rebalancing prevents model bias towards
the majority class and enhances the model’s ability to detect hate-speech.
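The rebalancing step described above can be sketched as follows. This is a minimal standard-library illustration of oversampling with replacement, not the code shown in Figure 5; the function name, the dictionary row format, and the toy data are all assumptions made for the example.

```python
import random

def upsample_minority(rows, label_key="label", minority=1, seed=0):
    """Duplicate minority rows (sampling with replacement) until classes match."""
    rng = random.Random(seed)
    majority_rows = [r for r in rows if r[label_key] != minority]
    minority_rows = [r for r in rows if r[label_key] == minority]
    # Draw as many minority samples as there are majority rows.
    resampled = [rng.choice(minority_rows) for _ in range(len(majority_rows))]
    balanced = majority_rows + resampled
    rng.shuffle(balanced)  # avoid all rows of one class sitting together
    return balanced

# Hypothetical toy data: 6 non-hate tweets (label 0) and 2 hate tweets (label 1).
rows = [{"tweet": f"t{i}", "label": 0} for i in range(6)]
rows += [{"tweet": "h0", "label": 1}, {"tweet": "h1", "label": 1}]
balanced = upsample_minority(rows)
counts = {c: sum(r["label"] == c for r in balanced) for c in (0, 1)}
print(counts)
```

Oversampling is usually applied to the training split only, since duplicating rows before splitting can place copies of the same tweet in both the training and the test data.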
Classifier                     Accuracy Score   F1 Score
Support Vector Classifier      94.8378          56.5408
Stochastic Gradient Descent    96.9179          96.9471
and y_test hold the feature matrix and labels of the testing data, dedicated
to assessing model performance. This separation ensures that the classifier
is evaluated on unseen data. Furthermore, by specifying
random_state=0, we establish reproducibility, enabling consistent and replicable results
across multiple runs.
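The effect of fixing the random seed can be illustrated with a small standard-library sketch. It mimics the idea behind a seeded train/test split rather than any particular library's implementation, and the function name, the 20% test fraction, and the toy data are assumptions made for the example.

```python
import random

def train_test_split_seeded(features, labels, test_fraction=0.2, seed=0):
    """Shuffle indices with a fixed seed, then cut off a held-out test set."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)  # same seed -> same shuffle every run
    n_test = int(round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [features[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test

# Hypothetical toy data: ten one-feature rows with alternating labels.
features = [[i] for i in range(10)]
labels = [i % 2 for i in range(10)]
first = train_test_split_seeded(features, labels)
second = train_test_split_seeded(features, labels)
print(first == second)
```

Calling the function twice with the same seed yields identical splits, which is what makes scores comparable across runs.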
4 Results
Scores of the Classifiers
The results indicate that the Stochastic Gradient Descent (SGD) Classifier
outperformed the Support Vector Classifier in terms of both Accuracy and F1
Scores. Specifically, the SGD achieved an accuracy score of 96.9179, which is
notably higher than the Support Vector Classifier’s accuracy score of 94.8378.
Similarly, in terms of the F1 score, the SGD Classifier achieved a significantly
higher score of 96.9471, while the Support Vector Classifier scored 56.5408. This
improvement in performance may be attributed to several factors.
One crucial difference lies in the pre-processing of the text data. The SGD
Classifier used a custom text cleaning function, clean_text, which performed
operations such as converting text to lowercase and removing special
characters, mentions, URLs, and other non-alphanumeric characters. In contrast,
the Support Vector Classifier relied on the tweet-preprocessor library
for pre-processing, which may not have been as extensive. Notably, the SGD’s
pre-processing included removing mentions and URLs, contributing to cleaner
text data.
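The cleaning steps listed above (lowercasing, then removing URLs, mentions, and non-alphanumeric characters) can be sketched as follows. The function name follows the paper's naming, but the body, including every regular expression, is an assumption about how such a function might look and is not the author's code.

```python
import re

def clean_text(text):
    """Lowercase a tweet and strip URLs, mentions, and non-alphanumerics."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)                    # @mentions
    text = re.sub(r"[^a-z0-9\s]", " ", text)             # remaining symbols
    return re.sub(r"\s+", " ", text).strip()             # collapse whitespace

# Hypothetical tweet in the style of the dataset.
print(clean_text("@user Stay BLESSED!!! see https://example.com #fitfam"))
# prints: stay blessed see fitfam
```

URLs are removed before the general symbol filter so that fragments such as "https" and "com" do not survive as stray tokens.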
Additionally, the SGD Classifier addressed the imbalance in the dataset
by resampling, which involved oversampling the minority class. This step is
crucial for dealing with imbalanced data sets and can significantly impact model
performance. In contrast, the Support Vector Classifier did not incorporate such
measures, which might have influenced the differences in model performance.
Another distinguishing factor is the choice of text vectorization. The Sup-
port Vector Classifier employed CountVectorizer() with binary representation,
while the Stochastic Gradient Descent used TfidfVectorizer(). The choice of
vectorizer can influence the representation of text features, with TF-IDF poten-
tially capturing term importance more effectively than CountVectorizer.
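To make the contrast between the two representations concrete, the sketch below builds a binary bag-of-words matrix and a textbook TF-IDF matrix for three toy documents using only the standard library. It is an illustration under stated assumptions: the function names and documents are hypothetical, and the TF-IDF formula is the plain tf × log(N/df) variant. scikit-learn's TfidfVectorizer uses a smoothed IDF and L2 normalisation by default, so its values would differ.

```python
import math
from collections import Counter

def binary_bow(docs):
    """Binary bag-of-words: 1 if the term occurs in the document, else 0."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    return vocab, [[1 if tok in toks else 0 for tok in vocab] for toks in tokenized]

def tfidf(docs):
    """Textbook TF-IDF: relative term frequency times log(N / document frequency)."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        rows.append([tf[tok] / len(toks) * math.log(n_docs / df[tok]) for tok in vocab])
    return vocab, rows

# Hypothetical toy corpus.
docs = ["good morning friends", "good night friends", "hate hate speech"]
vocab, binary = binary_bow(docs)
_, weighted = tfidf(docs)
print(vocab)
print(binary[0])
print([round(v, 3) for v in weighted[0]])
```

In the first document, the binary representation treats "good", "friends", and "morning" identically, whereas TF-IDF gives "morning" a higher weight because it appears in only one document.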
Furthermore, the two models were evaluated with different metrics. The Support
Vector Classifier relied on the accuracy score for model evaluation, while
the Stochastic Gradient Descent used the F1 score. The choice of evaluation
metric is essential, and the F1 score, used for the SGD Classifier, is particularly
suited to imbalanced datasets, which could have contributed to its higher
overall score.
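The difference between the two metrics on imbalanced data can be shown with a small worked example. The labels and predictions below are hypothetical and are not the paper's results; the function name is likewise an assumption. The example shows how a classifier can reach high accuracy while its F1 score for the minority class stays low.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and the F1 score of the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Hypothetical imbalanced test set: 93 non-hate tweets, 7 hate tweets.
y_true = [0] * 93 + [1] * 7
# A classifier that finds only 3 of the 7 hate tweets and raises 1 false alarm.
y_pred = [0] * 92 + [1] + [1] * 3 + [0] * 4
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(round(acc, 3), round(f1, 3))
# prints: 0.95 0.545
```

Here 95 of 100 predictions are correct, yet precision is 3/4 and recall is 3/7, giving F1 = 6/11. Accuracy is dominated by the majority class, while F1 reflects how well the minority class is detected.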
These differences in preprocessing, class imbalance handling, text vectoriza-
tion, and evaluation metrics collectively explain the superior performance of the
Stochastic Gradient Descent Classifier in this study.
5 Conclusion
In this research paper, we conducted an in-depth analysis of two classifiers, the
Support Vector Classifier (SVC) and the Stochastic Gradient Descent (SGD),
to assess their effectiveness and suitability in the context of hate-speech detec-
tion. In the present era, where the proliferation of hate-speech on social media
platforms poses a pressing concern, the need for robust classifiers is paramount.
Our study revealed that both classifiers demonstrated promising results in identifying
hate-speech; however, the Stochastic Gradient Descent (SGD) classifier
exhibited superior performance with an F1 score of 96.95, as opposed
to the Support Vector Classifier’s accuracy score of 94.84.
Several critical factors contributed to this discrepancy in performance. Firstly,
the preprocessing techniques applied for each classifier played a pivotal role.
The SGD classifier used an extensive custom text cleaning function, clean_text,
which not only converted text to lowercase but also removed
mentions, URLs, and various non-alphanumeric characters. Moreover, the SGD
classifier addressed class imbalance through resampling. In
contrast, the Support Vector Classifier relied on a pre-made function for preprocessing
and did not account for class imbalance.
The choice of text vectorization further distinguished the classifiers. The
Support Vector Classifier opted for CountVectorizer() with binary representa-
tion, while the SGD classifier made use of TfidfVectorizer(), a decision that
enhanced its capacity to capture term importance more effectively.
In summary, our study showcased the potential of both classifiers in the area of
hate-speech detection. Nonetheless, it is evident that the Stochastic Gradient Descent
(SGD) classifier, with its comprehensive preprocessing, class imbalance
handling, and TF-IDF vectorization, emerged as the more powerful
tool for this critical task. Further research and experimentation are needed in
order to refine our understanding of the most effective approach and to continue
addressing the ever-evolving challenge of hate-speech detection in the digital age.
References
[523] Ml: Stochastic gradient descent (sgd). 2023.
[623] Difference between batch gradient descent and stochastic gradient de-
scent. 2023.
The WHO Is Not The Global Health Government
States Think It Is
∗
Dilay Kuyucak
October 13, 2023
Abstract
With the start of the COVID-19 pandemic, the World Health Or-
ganization was put in the center of the discussion surrounding the global
response. States criticized the organization for its slow and insufficient re-
sponse and leniency towards the Chinese government. This paper argues
that the World Health Organization was unjustly criticized and delegit-
imized for three reasons: (1) unwillingness of member states to cooperate
and the WHO’s lack of authority to ensure compliance; (2) misunder-
standing – by states as well as the WHO itself – of the WHO’s founding
mission and its current role as an international organization; (3) the lack-
ing capacity of states and national healthcare systems to face a pandemic
due to the privatization of health related industries. It suggests that more
authority be given to the organization to ensure accurate and independent
decision-making.
1 Introduction
The World Health Organization (WHO) was first notified of cases of ‘viral pneu-
monia’ originating in Wuhan, China on 31 December 2019. Following the ex-
change of information with the Chinese government and the investigation of the
disease, the WHO declared the novel coronavirus a Public Health Emergency
of International Concern (PHEIC) on 30 January 2020 [WHO20c], and a pan-
demic on March 11 [WHO20a]. From then onward, all eyes turned to the WHO,
and not long afterwards some countries turned against the organization. The
Director General of the WHO, Dr. Tedros, was accused of being lenient towards
the Chinese government, as many believed his election was supported heavily
by China. Additionally, as the WHO continued to deny Taiwan a member state
status despite the country’s success during the pandemic, the Taiwanese gov-
ernment claimed that the WHO ignored their concerns about human-to-human
transmission in December 2019 [CC20], and had delayed the global pandemic
response to January 2020 in order to appease the Chinese government. These
∗ Advised by: Dr. David Rezvani of Dartmouth College
events resulted in the Trump Administration demanding reform in the organization’s
conduct, and later severing ties with the organization at the height of
the pandemic [Pos20], which was detrimental to the global response overall.
This paper will argue that the World Health Organization is unjustly crit-
icized and delegitimized for three reasons: (1) unwillingness of member states
to cooperate and the WHO’s lack of authority to ensure compliance; (2) mis-
understanding – by states as well as the WHO itself – of the WHO’s founding
mission and its current role as an international organization; (3) the lacking
capacity of states and national healthcare systems to face a pandemic due to
the privatization of health related industries. It will also argue that the efforts
to empower the WHO and create future pandemic plans are futile if states do
not establish strong government organizations and control systems. This paper
will not argue that the WHO’s performance during the COVID-19 pandemic
was satisfactory. It will argue that the circumstances surrounding its failure are
related to its design and operation. The first part of this paper will cover the
need for global health governance, the attempts to discredit the WHO, and its
overall performance during the COVID-19 pandemic. The second will discuss
the limitations of the WHO as an international organization with its role of
meta-governance, the states’ acting in self-interest, and how the design of the
organization impedes its effectiveness during global crises.
cember 2019, and had downplayed the severity of the virus [Don20]. These
claims were further supported by the fact that the WHO had excluded Taiwan
from early emergency meetings in January 2020, and had continued to mis-
report Taiwanese case numbers under China’s data. This resulted in the US
demanding reform and later withdrawing from its member position and cutting
funds. This was significant because the US was the organization’s top donor
and was expected to lead the global pandemic response. Many criticized this
decision, and members of the WHO, the media, and scientists came to the orga-
nization’s defense. The German Foreign Minister echoed this sentiment: “The
decision by US President Donald Trump to end cooperation with the World
Health Organization sends the wrong message at the wrong time. (. . . ) We
need a united response in a spirit of solidarity from all countries and the United
Nations, with a strong WHO at the center.” [Hei20] The irreplaceability of the
WHO was widely accepted; however, some agreed that the claims made by the
former US President were significant. In May 2020 the World Health Assembly
demanded a full independent review of the global response, as well as that of
the WHO [CC20].
Those who agreed with the Former President’s claims that the WHO had
been lenient towards China pointed out the fact that the organization praised
China’s measures early on, congratulating the government’s transparency and
mindfulness towards the outbreak, and that it relied exclusively on data pro-
vided by the Chinese government, ignoring cases reported by Taiwan [WHO20c].
Some also argued that the organization did not want to lose the funds provided
by China. This claim is weaker than it seems, as most of the WHO’s funds
come from the US, international organizations, and philanthropic foundations,
while China’s donations play a minor part [DB19]. Similarly, it can be
argued that the WHO’s treatment of China was the result of the International
Health Regulations (IHR) set in 2005, which place the responsibility of accu-
rate reporting of data on member states. These regulations were put in place to
counter uncooperative behavior from states, as China had refused to cooperate
during the 2002-2003 SARS outbreak. However, these new regulations gave lit-
tle to no supranational power to the WHO, and it had to rely on data reported
voluntarily by the states, with authority too limited to force its members
to cooperate. This was a way to preserve state sovereignty while still
providing the WHO with accurate health data. However, since data reporting
is voluntary, it was perhaps in the WHO’s best interest to keep relations
with China amicable to ensure a continuous flow of information during the
start of the pandemic, when the virus was still a mystery [Mel22]. As a result
of these controversies, the WHO experienced a loss of credibility, with many
turning to private efforts for accurate information, like the Bill and Melinda
Gates Foundation and the Johns Hopkins University’s COVID-19 tracker.
Ill-intentioned or not, the WHO provided guidance and relatively accurate
information during the first stages of the pandemic, despite the uncooperative-
ness of its member states. It took initiative to support the development and
distribution of tests, treatments and vaccines through the Access to COVID-19
Tools (ACT) Accelerator [WHO20b]. Although contributions to the WHO’s
budget were rising at the start of the pandemic, in February 2021 the ACT
Accelerator had only gathered 20 percent of its estimated need [BB22]. This
means that states did not make the necessary, albeit voluntary, contributions,
and the WHO had no other means of raising these funds. The WHO also created
COVAX to facilitate the allocation of vaccines, offering free doses to low-
and middle-income countries. However, 70 percent of vaccine doses were secured
by high- and upper-middle-income countries [Irw21]. Canada had reserved more
than four vaccine doses per person, while Brazil and India had fewer than one
for every two people [SW20]. As a result, the WHO’s efforts toward equal
access to COVID-19 tools were undermined by its member states’ selfish
behavior.
The WHO failed to promote solidarity amongst international actors, and
policy decisions were made in order to save the day through temporary con-
tainment measures like lockdowns, rather than to eradicate the problem with
accurate tracing and isolation. Explanations for the WHO’s poor performance
during the COVID-19 pandemic could include its underfunding, its lack of
authority over states, or its mishandling of the outbreak. Whatever the case
may be, the recent pandemic has shown that there are fundamental errors in
the operation of the World Health Organization and international cooperation.
However, it cannot be denied that a global health organization is the only way
to combat global health emergencies. The problem is that the World Health
Organization is not the first responder rushing to control the disease, as states
would like it to be, but rather the coordinator that promotes best practices to
be followed by states and organizations. States need to realize that in such a
system, domestic responsibility falls upon them.
templates for member states on how to devise their own national policies when
faced with health emergencies. Additionally, the WHO was limited to voluntary
donations, which came from wealthy countries who wanted to ensure their own
health security by making the WHO get involved only in specific cases which
might affect them [Rus11]. The WHO was expected to perform research on the
ground and take initiative during the COVID-19 pandemic, like it had in its early
days; however, states failed to recognize that the organization’s purpose had
evolved into a global coordinator. Most importantly, states failed to recognize
that this change was the result of their own decisions and actions.
Arguably the biggest strike on the WHO’s autonomy was the International
Health Regulations. The 2002-2003 SARS outbreak brought major changes to
the organization due to the Chinese government’s uncooperative actions. The
International Health Regulations’ implementation brought restrictions to the
WHO’s autonomy in fields like research and data collection, while also limiting
its authority over states. The WHO could collect data voluntarily given to it
and warn powerful states of impending dangers, but it could not force
states to follow any guidelines it provided. It could not shame states,
like it had done to China during the SARS outbreak, and demand that they
cooperate. Since the WHO had no way of imposing sanctions, it had to ensure
the collaboration of states at times of emergency like the COVID-19 pandemic.
These new restrictions also meant that the WHO could not respond to the crises
on the ground, but rather sit at a desk and try to nudge governments in the
right direction. Because its coordination function depends on states to
provide information, and because its leadership has been stripped of the
ability to confront states and call out uncooperative behavior, the WHO
performs poorly when faced with global crises [Ben20].
In addition to lacking authority, the WHO does not accept a role as a
governing body for international health. The WHO’s
mission is to be the technical body that provides health guidance and assis-
tance to countries. In other words, it is not a substitute for national health
systems. The WHO emphasizes scientific decision making as it is constituted
of a ‘transnational Hippocratic society’, which leaves out decisions regarding
law and international politics [Fid99]. This implies that the WHO sees itself as
transcending world politics, or at least aims to depoliticize its decisions
affecting the member states. The current Director-General of the WHO, Dr. Tedros
Adhanom Ghebreyesus, echoed this by saying “my focus is saving lives, we do
not do politics in the WHO” [WHO20d].
Not only does the WHO lack political authority; it does not want any.
However, this is a crucial mistake when confronting a pandemic, because
any suggestion it makes, such as travel restrictions, face mask use, or
national lockdown measures, is inherently political. Additionally, the WHO’s
dependence on member states means that it has to be political in its conduct
if it wants to continue to exist. It can be argued that during crises
expert opinion should transcend politics; however, this does not reflect the
current system these organizations operate in. The WHO cannot isolate itself
from the political decisions of its member states, as evidenced by the Trump
administration’s retreat and its consequences. In the end, the WHO itself mis-
understands its position as an inherently political institution, and its response to
political criticisms seems hollow. Instead of trying to brush off these criticisms
with emphasis on the importance of empirical data, the WHO should recognize
its political dimension and emphasize the limitations caused by its design.
Powerful states do not want to follow orders. The recent pandemic has shown
a clear hypocrisy in the conduct of powerful Western governments, mainly in
their responses to the COVID-19 pandemic. For example, rather than lead-
ing the world through this crisis the US opted for retreating from the WHO,
blaming its domestic failures on the organization. When the pandemic spread,
governments adopted an ‘every man for himself’ approach, engaging in competitive
politics, limiting exports, and hoarding medical supplies. Likewise, even in
an integrated system like the EU, no European country was willing to donate
medical supplies and resources to a struggling Italy [BO21]. Rational thinking
would suggest international cooperation would be the only solution to a global
problem, but as witnessed during the COVID-19 pandemic, governments do not
want to comply when it is not in their short-term interest. Such an approach
guarantees that long-term solutions are out of reach.
It is also important to remember that the WHO can only do its job of
surveillance and information gathering if governments supply it with the neces-
sary data. This would require countries themselves to have functioning health
monitoring systems, and the capacity to deal with newly emerging problems.
Countries affected severely by the 2002-2003 SARS outbreak were the ones
who were most prepared for the COVID-19 pandemic. Taiwan, for example,
had implemented a nationwide public health network, comprehensive univer-
sal healthcare for all citizens, and improved infection control practices [ea20].
These measures ensured its early response was adequate and Taiwan was one of
the most successful countries when dealing with the pandemic. By contrast, the
states that had seemed most prepared performed poorly during the pandemic.
They did not have the capabilities needed to follow the WHO’s suggestions,
yet the organization was still blamed. Taking the UK as an example, the
privatization of the National Health Service’s logistics led to massive
shortages of key equipment [HLH+ 20]. By mid-February 2020, the UK’s testing
system could only cope with five COVID-19 cases per week [DM20]. The extensive
use of lockdown measures by the UK was summarized by the UK Scientific Advisory
Group for Emergencies as follows: “From a government perspective, lockdown
had big advantages: it did not require any forward planning, there was no need
to build capacity in advance, and no direct financial cost. All lockdown took
was a government decree and a modicum of enforcement. It was a lazy solution
. . . as well as a hugely damaging one. Avoiding lockdown would have required
a lot more effort.” [Woo22]
Taking into consideration all of the above, it is reasonable that the WHO
would not be able to act as a global governing body for health. It has no
power over its member states; on the contrary, it depends on them for
information and funding. In addition to lacking authority, it does not
want to admit its political responsibility as an international organization. This
creates catastrophe, especially when we consider the member states will only
cooperate when it is in their self-interest, and surprisingly their self-interest is
not always the health of their peoples. It is also worth noting that in an ever-
globalized world, privatization of industries such as health will lead to shortages
and introduce multinational companies to the debate. During the production
and distribution of COVID-19 vaccines, one of the main struggles was that
these companies controlled when, where, and how much vaccine they would
produce, as is their right as the owners of the intellectual property [PSH20].
When so many actors are in play, blaming the WHO for its pandemic response
seems hypocritical, especially when states have done nothing but underfund the
organization and undermine its authority.
4 Conclusion
The COVID-19 pandemic showed the world that our governing and health sys-
tems are powerless when faced with global threats, both at the national and
global levels. While many countries seemed prepared for such a disaster the
reality proved otherwise. Amidst all of this the World Health Organization, an
international organization primarily focused on scientific processes and narrow
health system improvement projects, was put in the center as a liable author-
ity. States blamed the organization for their own slow and insufficient responses
while also accusing it of being lenient towards the Chinese government. These
were significant allegations directed at an impartial scientific institution.
However, these allegations were unsubstantiated for the three main reasons
discussed in this paper: the WHO and its member states failed to recognize
the need for political consideration in the WHO’s conduct, and states
consequently claimed that the organization was not as impartial as it claimed
to be; states not only withheld data and financial resources from the
organization, but actively undermined its efforts to provide the necessary
response to the pandemic with their greedy race to secure more vaccines; and
states’ own healthcare planning and capacities failed when faced with the
pandemic. Such allegations also disregard the WHO’s limitations as an
organization bound by the IHR. It had to rely on data provided voluntarily
and needed to keep relations amicable with countries in order not to lose its
funding and information flow. Amidst all of this, it had to provide a pandemic
blueprint for countries, which many could not follow due to their lack of
healthcare capacity. Additionally, not every suggestion it made was the
correct one, as the situation was very unclear during
the first year of the pandemic. Thus, the WHO became a scapegoat for all the
misguided policies countries decided to follow. Comments made by the US pres-
ident and others damaged the WHO’s credibility and impeded its work further
as countries reduced funding and did not contribute to pandemic efforts such
as the ACT Accelerator and COVAX. Privatization of health-related industries
also hindered states’ ability to provide adequate testing and healthcare.
In the future, for any improvement to come of the COVID-19 pandemic, the
WHO should be given more authority and funding in order for it to make
accurate and independent decisions. Its dependence on member states is the reason
it has to limit itself to the information provided by them and suggest policies
that are in their interests. If states are willing to sacrifice some sovereignty
in order to have a powerful World Health Organization, future global health
emergencies could be dealt with through strong international cooperation. The
current system is hollow: the WHO has no means to ensure that any of its
suggestions are implemented on the ground. For this reason, criticizing the
WHO for not being the savior during the pandemic is hypocritical, because the
critics are the reason it is not able to perform adequately. The WHO is not
faultless; it should acknowledge that it has an international responsibility
that is inherently political. The WHO
should be recognized as a tool that states put effort into, in order to reap the
benefits when faced with crises.
References
[BB22] Josephine Borghi and Garrett W. Brown. Taking systems thinking
to the global level: Using the WHO building blocks to describe and
appraise the global health system in relation to COVID-19. Global
Policy, 2022.
[CC20] Yu-Jie Chen and Jerome A. Cohen. Why does the WHO exclude
Taiwan? Council on Foreign Relations, 2020.
[DB19] Kristina Daugirdas and Gian Luca Burci. Financing the World
Health Organization: what lessons for multilateralism? International
Organizations Law Review 299, 2019.
[DM20] Laura Donnelly and Tom Morgan. UK abandoned testing because
system “could only cope with five coronavirus cases a week”.
https://www.telegraph.co.uk/news/2020/05/30/revealed-test-trace-abandoned-system-could-cope-five-coronavirus,
30 May 2020.
[Don20] President Donald J. Trump’s letter to Dr. Tedros Adhanom
Ghebreyesus. https://perma.cc/RYW8-XMGC, 2020.
[ea20] Cheryl Lin et al. Policy decisions and use of information technology
to fight coronavirus disease, Taiwan. Emerging Infectious Diseases,
2020.
[Fid99] David P. Fidler. International law and global public health.
University of Kansas Law Review, 1999.
[HLH+ 20] David Hall, John Lister, Cat Hobbs, Pascale Robinson, and Chris
Jarvis. Privatised and unprepared: the NHS supply chain. University
of Greenwich/We Own It,
https://weownit.org.uk/privatised-and-unprepared-nhs-supply-chain, 2020.
[Irw21] A. Irwin. What it will take to vaccinate the world against COVID-19.
Nature, 2021.
[MCF19] Marcos Cueto, Theodore M. Brown, and Elizabeth Fee. The World
Health Organization: A history. Cambridge University Press, 2019.
[Mel22] Margherita Melillo. The uneasy coexistence of expertise and politics
in the World Health Organization. International Organizations
Law Review, 2022.
[PSH20] Victoria Pilkington, Mirjam Keestra Sarai, and Andrew Hill. Global
COVID-19 vaccine inequity: failures in the first year of distribution
and potential solutions for the future. Frontiers in Public Health,
2020.
[Rus11] Simon Rushton. Global health security: security for whom? security
from what? Political Studies, 2011.
[WHO20b] WHO. The Access to COVID-19 Tools (ACT) Accelerator.
https://www.who.int/initiatives/act-accelerator, 2020.
Targeting the EGFR Pathway in Glioblastoma
Multiforme: A Review of Current Pre-clinical and
Clinical Trials with Tyrosine Kinase Inhibitors
∗
Ananya Bharathapudi
October 16, 2023
Abstract
With a median overall survival expectancy of 15 months or less [FC17],
Glioblastoma Multiforme (GBM) is the most common type of primary
brain tumor [SA18]. Despite extensive research on the pathophysiology
and clinical course of GBM, the malignancy remains one of the most
lethal cancers to date as the 10 year survival rate is 0.71 percent [TT18].
While established methods of treatment such as resection, radiotherapy,
and chemotherapy are effective in prolonging survival time, they are not
effective in preventing recurrence [HL06] which occurs in almost every
patient [OM14]. To better combat the dismal outcomes of GBM, novel
approaches are necessary given the increase in incidence as well as the
increase in tumor burden globally [GN20]. Gene therapy may serve as a
novel therapeutic, with initial clinical studies indicating promising
results [PK05]. This review will outline the most recent treatment
protocols for differing GBM subtypes, characterize the tyrosine kinase
epidermal growth factor receptor (EGFR) and its downstream signaling
pathway, and analyze ongoing and recently completed clinical
trials involving tyrosine kinase inhibitors in GBM.
1 Introduction
Glioblastoma multiforme (GBM) is the most common primary brain tumor in
adults, accounting for over 45 percent of all malignant primary CNS tumors.
The disease occurs in older adults with a median diagnostic age of 64 years
and peak incidence between 75-84 years. Incidence is higher in males than
in females as well as in white, non-Hispanics. GBM remains an incurable tu-
mor, with a median survival time of 15-20 months and 5-year survival rate of
approximately 5 percent due to the heterogeneous and complex nature of the
disease. Approximately 80 percent of GBM tumors are primary, rapidly de-
veloping de novo without precursor lesions such as lower-grade gliomas that
∗ Advised by: Paras Minhas, Stanford University
are common in secondary tumors. Of primary GBM tumors, 57 percent con-
tain EGFR gene amplification, encoding the epidermal growth factor receptor
(EGFR). EGFR is a transmembrane receptor tyrosine kinase that contains an
extracellular region composed of four domains and an intracellular region
composed of a tyrosine kinase domain as well as a C-terminal tail. Upon binding
of the epidermal growth factor ligand, EGFR dimerizes and autophosphory-
lates its C-terminal tail, which serves as a docking site for several secondary
messengers that induce cellular proliferation and resist apoptosis. Prominent
downstream pathways of EGFR include the RAS-RAF-MEK-ERK MAPK as
well as PI3K-AKT-mTOR pathways. Interestingly, approximately 26 percent
of primary GBM tumors contain EGFR activating mutations. The most common
EGFR variant, EGFRvIII, occurring in approximately 50 percent of all
EGFR-amplified GBM cases, involves the deletion of amino acids 6-273,
encompassing exons 2-7. This mutation results in an EGFR that contains a modified
extracellular domain which allows for constitutive activation of the receptor.
Clinically, patients with either increased EGFR expression or mutation are
likely to have increased tumor invasion with lower overall survival rates at 6
months, as compared to the median overall survival rate of 15 months for GBM
patients [BZ18]. Other common mutations in GBM patients include specific
genes that lead to increased development of malignancy, and guide prognosis.
Mutations in isocitrate dehydrogenase 1 and 2 (IDH1 and IDH2) are oncogenic,
promoting methylation in cancers as well as production of oncometabolites such
as 2-hydroxyglutarate (2-HG) [CA13, TZ14]. While the mutations themselves
promote undifferentiated cell proliferation, they are also associated with better
prognosis due to targeting therapies [CA13]. In addition, O6-methylguanine-DNA
methyltransferase (MGMT) promoter methylation status guides prognosis. The
methylation of this enzyme promoter makes tumor cells susceptible to DNA damage
caused by alkylating agents, such as temozolomide (TMZ) [HM05]. Activation
of the EGFR receptor leads to homodimerization and autophosphorylation of
several tyrosine residues on the C-terminal domain, eliciting downstream
activation of secondary messengers including protein kinase B (Akt) and
mammalian target of rapamycin (mTOR). Studies have found that the amplification
of EGFR is often seen in tandem with increased abundance and phosphorylation
of pleckstrin homology-like domain family A member proteins (PHLDA1
and PHLDA3), transcription factor SOX9, cell adhesion protein CTNND2
(δ-catenin), and cell cycle proteins CDK6 and CDKN2C. In this review, we will
cover the most recent pre-clinical and clinical studies concerning modulation
of the EGFR using
tyrosine kinase inhibitors and discuss potential synergistic strategies to possibly
decrease the high tumor burden of GBM.
2 Initial Diagnosis
Patients with suspected GBM typically present with progressive neurological
symptoms such as headaches, seizures, and memory loss [BF15]. In patients
with suspected GBM, contrast-enhanced MRI scans are conducted to exam-
ine areas of microvascular proliferation and focal necrosis that may represent
the histological characteristics of the disease [TA20]. Screening for systemic
malignancies is often not necessary when radiographic suspicion is high for
high-grade glioma. A definitive diagnosis requires tissue, which is collected
during maximal tumor resection or, in patients for whom resection is not
feasible, in a biopsy procedure [HM19]. In addition to scans and tissue
pathology, the detection of certain genetic alterations, such as those in
EGFR, through fluorescence in situ hybridization (FISH) may also aid in the
diagnosis of the disease [MC14].
bined with TMZ, followed by maintenance TMZ and TTFields [TA20]. If the
patient has poor functional status and a KPS < 60, then HRFT alone, or TMZ
alone, is given. Patients whose tumors have unmethylated MGMT are generally
resistant to TMZ adjuvant therapy [AI20]. In such cases, standard radiotherapy
is administered, provided the patient has a KPS score ≥ 60 [TA20]. At tumor
recurrence, the preferred line of therapy is surgery, as research has shown
that reoperation improves overall survival, though there is no standard line of
adjuvant treatment for recurring tumors [TA20]. Re-radiation, with a median
total dose of 30–36 Gy, may be an alternative treatment option [WM13];
however, it is not as highly recommended as surgery or systemic therapy, due to
the potential for increased toxicity [TA20, O.15]. Systemic therapy involves
administering chemotherapeutic as well as immunotherapeutic agents such as TMZ and
bevacizumab as well as alkylating agents like carmustine or other blood-brain
barrier (BBB) penetrant nitrosoureas. Unfortunately, results of studies testing
the effectiveness of such drugs during tumor recurrence have been discouraging
[O.15]. The attending physician
typically chooses the treatment method based on several factors including the
patient’s KPS score, tumor burden, methylation status of MGMT, epidermal
growth factor receptor (EGFR) status, and IDH status. TTFields may also be
used, though studies have shown that the majority of patients still do not survive
for over two years, which is why supportive care may present itself as the best
option, as it emphasizes improving quality of life and managing discomforting
symptoms [FC17, TA20].
weight loss and skin lesions when dosed at 25mg/kg. Their brain to plasma
drug concentration ratio was 6:1 in males and 7:1 in females, whereas mice
taking CM93 had a brain to plasma drug concentration ratio of 14:1 in males and
15:1 in females, suggesting that CM93 can penetrate the blood brain
barrier and, therefore, act in the brain [ea20]. Another preclinical
trial demonstrated similar results comparing CM93 to Gefitinib, another EGFR
TKI. In a pilot comparative assessment seven hours after a single dose of
30mg/kg of CM93 or 50mg/kg of Gefitinib was administered, CM93 had a kp
value of 28.3, whereas Gefitinib had a kp value of 0.55 [Ni21]. Unlike
Osimertinib, CM93 had little adverse effect on mouse skin; with Osimertinib, mice
lost more than 20 percent of their body weight, reaching their endpoint, and had
severe hair loss after three weeks. Mice treated with CM93, however, showed
no hair loss; this demonstrates CM93’s potential to improve patient quality of
life [ea20]. Another preclinical trial further examined CM93’s efficacy in vivo
using genetically engineered mice with GBM. Mice taking CM93 had a median
survival of 33 days while the control group had a median survival of 25.5 days.
In this model too, there was no significant hair loss or loss of body weight
observed in the CM93 group [Ni21].
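To put the CM93 figures above on a common scale, the short sketch below works out two simple ratios from the numbers quoted in the text: how many times larger CM93's kp value is than Gefitinib's, and the relative gain in median survival over control. It assumes that kp can be read as a brain to plasma concentration ratio, which this section does not define explicitly, and the function names are illustrative only. It is plain arithmetic on reported point estimates (obtained at different doses) and says nothing about statistical significance.

```python
def fold_difference(value, reference):
    """Return how many times larger `value` is than `reference`."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return value / reference


def relative_increase_percent(treated, control):
    """Return the percent increase of `treated` over `control`."""
    if control <= 0:
        raise ValueError("control must be positive")
    return (treated - control) / control * 100.0


# Values quoted in the text (CM93 at 30mg/kg, Gefitinib at 50mg/kg).
KP_CM93 = 28.3
KP_GEFITINIB = 0.55
MEDIAN_SURVIVAL_CM93_DAYS = 33.0
MEDIAN_SURVIVAL_CONTROL_DAYS = 25.5

kp_fold = fold_difference(KP_CM93, KP_GEFITINIB)
survival_gain = relative_increase_percent(
    MEDIAN_SURVIVAL_CM93_DAYS, MEDIAN_SURVIVAL_CONTROL_DAYS
)

print(f"CM93 kp is about {kp_fold:.1f} times that of Gefitinib")
print(f"Median survival gain with CM93: about {survival_gain:.1f} percent")
```

On these figures, the kp value of CM93 is roughly 51 times that of Gefitinib, and median survival is about 29 percent longer than in the control group.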
4.1.1 ERAS-801
The next trial, currently in phase one, uses a nonrandomized, sequential,
open-label design and includes patients with a diagnosis of IDH-wildtype
GBM. Patients with prior EGFR inhibitor treatment for GBM are excluded.
This clinical trial’s intervention is ERAS-801, a new EGFR tyrosine kinase
inhibitor. ERAS-801 targets the RAS/MAPK pathway and inhibits
EGFR [ERAb]. Targeting wildtype EGFR and mutant variants of EGFR by
small molecules and antibodies has been shown to improve patient outcomes
in NSCLC, CRC, and HNSCC; however, in CNS tumors the ability to target
wtEGFR and mutant EGFR remains an unmet need. The two main reasons
why current EGFR inhibitors lack efficacy are their inability to penetrate
the blood brain barrier and their weak inhibition of the EGFRvIII mutant protein.
ERAS-801, however, differs as it is designed to be selective, reversible, and
orally available, and it has a 3.7:1 brain to plasma ratio in mice, demonstrating
CNS penetrability. ERAS-801 is also able to target EGFR alterations such as
EGFRvIII. When a single oral dose of 10mg/kg of ERAS-801 was administered to mice,
ERAS-801’s kp value was 3.7, which was higher than Osimertinib’s (0.99),
Afatinib’s (0.25), Erlotinib’s (0.06), Gefitinib’s (0.36), and Dacomitinib’s (0.61);
all of the other named drugs are EGFR TKIs. Taken together, the
evidence suggests that ERAS-801 outperforms other inhibitors in terms of CNS
penetration. In preclinical studies, ERAS-801 showed efficacy against EGFR
through an IC50 of 0.3nM and high selectivity for EGFR based on a biochemical
screen of 484 kinases where ERAS-801 at 10 µM inhibited two non-EGFR
family kinases at greater than 90 percent. In in vitro cell-based assays,
ERAS-801 had an IC50 of 1.1nM against wildtype EGFR, an IC50 of 0.7 nM against
EGFRvIII, and an IC50 value of less than 3 µM in a 31 patient derived glioma
cell panel where 65 percent of glioma cell growth was inhibited by ERAS-801.
The patient derived glioma cell panel had the most common types of EGFR
alterations which include amplification, EGFRvIII, extracellular domain muta-
tions, and chromosome 7 polysomy. ERAS-801 also showed no activity against
astrocytes, the most common cell in the human brain. This suggests that ERAS-801
selectively inhibits EGFR without disturbing normal brain cells that are
not dependent on EGFR signaling. In vivo, ERAS-801’s high CNS penetration
resulted in a survival benefit. In an EGFRvIII mutant patient-derived GBM
model, the median survival time was 40 days for the control group, between 60
and 70 days for the 10mg/kg ERAS-801 group, around 80 days for the 25mg/kg
ERAS-801 group, and around 80 days for the 75mg/kg ERAS-801 group.
In four additional patient-derived glioma models that harbor EGFRvIII, EGFR
amplification, or chromosome 7 polysomy, ERAS-801 showed tumor growth
inhibition (TGI) in 93 percent of 14 patient-derived models. Taken together,
the evidence suggests ERAS-801’s efficacy in combatting GBM [ERAa].
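The kp comparison reported in this subsection can be tabulated with a few lines of code. The sketch below ranks the six inhibitors by the kp values quoted above and expresses ERAS-801's value as a multiple of each comparator. The dictionary and function names are illustrative, and only the ERAS-801 dose (10mg/kg) is stated in the text, so the comparison assumes the comparator values were obtained under comparable conditions.

```python
# kp values as quoted in the text. ERAS-801 was measured after a single
# 10mg/kg oral dose in mice; comparator dosing is not stated here.
KP_VALUES = {
    "ERAS-801": 3.7,
    "Osimertinib": 0.99,
    "Afatinib": 0.25,
    "Erlotinib": 0.06,
    "Gefitinib": 0.36,
    "Dacomitinib": 0.61,
}


def rank_by_kp(kp_values):
    """Return (drug, kp) pairs sorted from highest to lowest kp."""
    return sorted(kp_values.items(), key=lambda item: item[1], reverse=True)


def fold_over_others(kp_values, reference_drug):
    """Return the reference drug's kp divided by each other drug's kp."""
    reference_kp = kp_values[reference_drug]
    return {
        drug: reference_kp / kp
        for drug, kp in kp_values.items()
        if drug != reference_drug
    }


ranking = rank_by_kp(KP_VALUES)
folds = fold_over_others(KP_VALUES, "ERAS-801")

for drug, kp in ranking:
    print(f"{drug:12s} kp = {kp:.2f}")
for drug, fold in folds.items():
    print(f"ERAS-801 kp is {fold:.1f}x that of {drug}")
```

On the quoted values, ERAS-801 ranks first and Erlotinib last, with ERAS-801's kp ranging from a little under 4 times Osimertinib's to more than 60 times Erlotinib's.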
cromoles; first generation EGFR inhibitors had IC50 values of 10 micromoles in
the same setting, suggesting that Osimertinib has greater efficacy compared to
first generation EGFR TKIs [Pet16]. To examine whether Osimertinib inhibits
GBM cell growth through an off-target effect, two U87 cell lines stably
expressing wild-type or Cys797 mutant EGFR were constructed, revealing that the
Cys797 residue in the catalytic domain of EGFR is key to the inhibitory effect
of Osimertinib. While treatment significantly inhibited growth of cells expressing
wild-type EGFR, effects on growth were nearly abolished with Cys797 mutant
EGFR. An EdU assay evaluating Osimertinib’s inhibitory effect on GBM
proliferation showed that proliferation in the U87 and U251 lines was reduced to
25.59 percent and 37.37 percent respectively, suggesting Osimertinib’s strong
inhibition of GBM cell proliferation in a dose dependent manner. Furthermore, a
colony formation assay revealed that the number of colonies was reduced 767.82
percent by Osimertinib, and a methylcellulose colony assay confirmed these results,
suggesting Osimertinib’s ability to significantly inhibit GBM cell colony
formation. Flow cytometry also revealed that Osimertinib inhibits GBM cell
proliferation by arresting cell cycle progression
in the G1 phase in both cell types tested in the assay (U87 and U251).
Western blot analysis was used to test inhibition of EGFR/ERK pathway activation
by measuring Osimertinib’s effect on EGFR, AKT, STAT3, and ERK
phosphorylation in GBM cells. U87 and U251 GBM cells treated with different
concentrations of Osimertinib showed no significant changes in total EGFR
expression; however, phosphorylated EGFR levels gradually decreased with
increasing Osimertinib concentrations, which also lowered the level of ERK and
had no effect on AKT and STAT3 levels. Erlotinib, a well known TKI,
inhibited ERK phosphorylation for 24-48 hours, after which ERK reactivation was
observed. Osimertinib, on the other hand, can continuously suppress EGFR
and ERK phosphorylation and may therefore inhibit the growth of GBM cells
continuously by blocking the EGFR/ERK pathway. Also, when Osimertinib is
combined with the ERK inhibitor PD098059, the anti-proliferation and anti-invasion
activities of Osimertinib are enhanced. Results from EdU assays show that both
Osimertinib and PD098059 inhibited the proliferation of GBM cells; however,
compared to the monotherapies, the combination was observed to be more effec-
tive. PD098059 also enhanced the inhibitory effect of Osimertinib on GBM cell
invasion. Combined with another ERK inhibitor SCH772984, however, Osimer-
tinib showed effects on proliferation of GBM cells but not on cell invasion. This
data suggests that ERK inhibition could increase the sensitivity of GBM cells
to Osimertinib [LX19]. In vivo, orthotopic and heterotopic mice models, tumor
growth in the Osimertinib-treated group was slower, with a T/C of 0.0241; any
value below 0.4 is considered significant inhibition. Osimertinib was
effective in slowing the growth of intracranial tumors, and median survival
increased from 26 days in untreated mice to 42 days in treated mice [Cha20].
Another preclinical trial used in situ GBM nude mouse models treated with an
intraperitoneal injection and an oral administration of Osimertinib and
observed that immunofluorescence staining of GBM sections in the Osimertinib
treatment group was significantly higher than in the control group, suggesting
that Osimertinib inhibited proliferation and promoted GBM cell apoptosis in
vivo [LX19]. A completed clinical trial including patients with
IDH1 or IDH2 wildtype GBM involved patients taking 80 mg of Osimertinib
orally once a day until unacceptable side effects, death, or medical
complications occurred. Four of the six patients were assessed for response.
Of these four, one showed a partial response, two had stable disease, and
the last was refractory to treatment. Transient improvement in imaging was not
without side effects: two patients had thrombocytopenia, one developed grade 1
diarrhea and pneumonia, and the other developed grade 1 mucositis [Abo10].
Because Osimertinib penetrates the blood-brain barrier effectively, has in
vitro and in vivo data supporting its efficacy, and inhibits multiple
intracellular pathways, it may be a better treatment option than previously
tested EGFR-TKIs for GBM patients. Osimertinib is also irreversible and can
lead to prolonged survival and continuous ERK inhibition. Results show that
the combination of an EGFR inhibitor and an AKT/STAT3 pathway inhibitor may be
more effective than a monotherapy [LX19]. The clinical trial also shows that
Osimertinib may benefit select patients with recurrent MG and EGFR
alterations, underscoring the importance of characterizing EGFR alterations
before considering Osimertinib treatment for a given patient [Abo10].
4.2 BDTX-1535
The next ongoing clinical trial investigates the potential of BDTX-1535
monotherapy. Currently in phase I, the trial includes patients diagnosed with
wild-type IDH GBM or astrocytoma with molecular features of GBM; both must
be recurrent cancers. Its exclusion criteria include known resistance mutations
in tumor tissue or ctDNA, prior treatment with EGFR inhibitors, and brain
metastases or spinal cord compression requiring intervention. BDTX-1535, the
intervention, is a selective, highly potent, irreversible inhibitor of EGFR
alterations, including the amplifications, mutations, and splice variants seen
in GBM. A report summarizes more information about the drug and some key
preclinical studies that characterize BDTX-1535. If BDTX-1535 could overcome
Osimertinib resistance, it could address a pressing and growing need in
EGFR-mutant non-small cell lung cancer. BDTX-1535 is optimized against a broad
spectrum of EGFR mutations and for a "Goldilocks" wild-type selectivity
profile. Results have shown that in mice harboring NSCLC with the C797S
mutation, BDTX-1535 induced dose-dependent tumor shrinkage without a loss of
body weight. The mice treated with Osimertinib, however, resembled the
untreated control group. BDTX-1535 could also penetrate the blood-brain
barrier, addressing brain metastases and CNS tumors [BDT21].
72 patients were enrolled, all of whom had EGFR-mutant advanced non-small-cell
lung cancer with brain metastases. Patients were given 120 mg or 160 mg orally,
with safety and tolerability as the primary outcomes. Treatment-related
toxicities occurred in 13 (43.3 percent) of the patients in the 120 mg group
and 21 (50 percent) of the patients in the 160 mg group. The drug had an
objective response rate of 53.6 percent in the 120 mg group and 40.5 percent in
the 160 mg group. The median duration of response was 7.4 and 9.1 months in
the 120 mg and 160 mg groups respectively, while the median progression-free
survival was 7.4 months for both groups. Taken together, the data suggest that
Epitinib at 160 mg showed promising efficacy and was well tolerated; this dose
was also taken as the recommended phase II dose [ea22]. Another clinical trial
testing the safety of Epitinib in patients with EGFRm+ NSCLC recruited 36
patients in a dose-escalation phase at 7 dose levels, starting at 20 mg and
rising to 240 mg. Dose escalation followed a 3+3 design. The most common
adverse effects were rashes (60 percent), diarrhea (34.2 percent), elevated
AST (34.3 percent), and hyperbilirubinemia (28.6 percent). Drug exposure
increased proportionally until it plateaued at 160 mg and above. Of the 12
patients treated with 160 mg of Epitinib, 5 reached PR and showed tumor
shrinkage. Two progression events, in the liver and brain, were observed.
Taken together, this evidence supported further development of the drug [ZQ16].
4.4 Anlotinib
Currently in phase II, anlotinib is a multitarget TKI that blocks the migration
and proliferation of endothelial cells and reduces tumor microvascular density
by targeting VEGFRs, FGFRs, and PDGFRs [Anl]. A preclinical trial tested
whether anlotinib overcomes acquired resistance to EGFR TKIs in patients with
EGFR-mutant non-small cell lung cancer. The researchers evaluated the antitumor
effects of gefitinib + anlotinib in gefitinib-resistant lung adenocarcinoma
models in vitro and in vivo and investigated treatment with an EGFR TKI +
anlotinib in 24 patients with advanced EGFR-mutant NSCLC after acquired
resistance to EGFR TKIs. The results show that anlotinib reversed gefitinib
resistance in adenocarcinoma models by enhancing the antiproliferative and
proapoptotic effects of gefitinib. Similarly, EGFR-TKI + anlotinib therapy
showed an objective response rate of 20.8 percent and a disease control rate
of 95.8 percent. While median progression-free survival was 11.53 ± 2.41
months, median overall survival was not reached. In the clinical trial, one
grade 3 adverse event was noted, but there were no grade 4 or 5 adverse
events. The researchers conclude that EGFR TKI + anlotinib demonstrates
powerful antitumor activity in vitro and in vivo and that anlotinib can
overcome resistance to EGFR-TKIs in advanced EGFR-mutant NSCLC patients
[Zha21]. Another preclinical trial examined the effects
of anlotinib with temozolomide and the molecular mechanisms of anlotinib in
glioblastoma. The researchers examined cell viability using Cell Counting
Kit-8 and colony formation assays. Cells were treated with anlotinib at 0,
1.25, 2.5, 5, 10, and 20 µM, revealing that anlotinib induced cell death in a
dose-dependent manner in all GBM cell lines tested. To see long-term effects,
the researchers used a colony formation assay and found that the independent
colonies in the anlotinib-treated group were much smaller and significantly
reduced, indicating that anlotinib inhibited the proliferation of GBM cells in
a dose-dependent manner. The migratory ability of GBM cells was then tested
through a wound-healing assay, in which anlotinib decreased migration relative
to untreated control cells. Following that, Transwell migration and Matrigel
invasion assays revealed that anlotinib suppressed the migration and invasion
of glioblastoma cells in a concentration-dependent manner. Flow cytometry was
then used to analyze the effect of anlotinib treatment on the cell cycle
profile. After pretreatment with 0, 2, and 4 µM anlotinib for 24 hours, the
percentage of cells in the G2/M phase increased in a dose-dependent manner,
suggesting that anlotinib could induce G2/M phase arrest [XP22]. Since
previous studies have indicated
that arresting the cell cycle initiates an apoptotic program, anlotinib's
effect on apoptosis was examined, revealing that the percentage of apoptotic
cells was elevated in three human GBM cell lines. Compared to the control
group, anlotinib was able to induce apoptosis. Western blotting also showed
that anlotinib induced autophagy-related proteins, suggesting that anlotinib
initiated autophagic programs in GBM. The JAK2/STAT3 signaling pathway plays a
key role in angiogenesis; VEGFA, which anlotinib is also known to target, is a
downstream target gene of JAK2/STAT3 that promotes angiogenesis. A tube
formation assay was performed to evaluate anlotinib's effects on the sprouting
of new capillaries. Tube formation by human umbilical endothelial cells was
inhibited by U87/anlotinib supernatant, an effect that was enhanced by
S3I-201. Because VEGFA plays a crucial role in tumor angiogenesis and
anlotinib was able to decrease the VEGFA levels secreted by U87 cells, the
researchers decided to further explore the underlying molecular mechanisms of
anlotinib treatment in GBM cells. Western blot analysis of several key
signaling pathway proteins showed that the expression of cell motility-related
and proliferation-related proteins decreased after treatment with 2 µM
anlotinib, an effect that was further enhanced by 100 µM S3I-201. These
findings showed that anlotinib's influence on the JAK2/STAT3/VEGFA signaling
pathway could underlie its anti-angiogenic and anti-glioblastoma effects in GBM.
When anlotinib was combined with temozolomide, a wound-healing assay showed
that the combination inhibited cell migration more than either drug alone.
Flow cytometry was used to test whether the enhanced cytotoxicity was due to
cellular apoptosis, but the drugs alone increased apoptosis with greater
efficacy than the combination [XP22]. Changes to components of the
JAK2/STAT3/VEGFA signaling pathway were assessed, revealing that the
combination was more effective than either drug alone at suppressing
JAK2/STAT3/VEGFA signaling. The researchers then performed in vivo
bioluminescence imaging of nude mice every seven days, which suggested that
anlotinib delayed tumor growth compared to the control group. Staining also
revealed that anlotinib reduced the positivity of the proliferation index.
Western blotting further revealed that anlotinib reduced p-JAK2, p-STAT3, and
VEGFA in vivo, indicating that anlotinib was able to inhibit proliferation in
vivo. The researchers conclude that because anlotinib can suppress the
proliferation, migration, invasion, and angiogenesis of GBM cells in a
dose-dependent manner, it offers promise. Furthermore, its cooperative effect
with temozolomide in enhancing cytotoxicity and anti-angiogenesis offers only
stronger evidence of its promise. While the previous trial characterized
anlotinib as a VEGF
inhibitor, the next trial examines anlotinib combined with cranial radiotherapy
(CRT) in cancer patients with brain metastasis. Comparing the clinical effects
of anlotinib + CRT versus CRT alone in NSCLC patients with brain metastasis,
the researchers found no significant differences in clinical features between
the two groups of patients, in which 45 received CRT alone and 28 received
CRT + anlotinib. The researchers also compared the overall survival of
anlotinib + CRT with that of CRT alone. After clinical characteristics were
evaluated to establish a baseline, prognostic factors for intracranial
progression-free survival and overall survival underwent univariate and
multivariate analysis. The combined group had a longer median intracranial
progression-free survival than the CRT group (11 months versus 3 months);
however, there were no significant differences in overall survival,
extracranial progression-free survival, or systemic progression-free survival.
Univariate and multivariate analysis further revealed that the addition of
anlotinib to treatment was an independent favorable predictor, while an age
greater than 57 years and a KPS score of 80 or less were independent
unfavorable predictors of overall survival [He21]. Although the difference was
not statistically significant, patients treated with anlotinib and local CRT
had the longest median intracranial progression-free survival (27 months) and
overall survival (36 months), whereas the local CRT group had shorter median
intracranial progression-free survival and median overall survival of 11
months and 18 months respectively. The researchers conclude that anlotinib can
improve intracranial lesion control and survival prognosis in NSCLC patients
receiving CRT [He21].
5 Conclusion
With efficacy comparable to Osimertinib against T790M mutations (IC50 4.39 nM),
CM93 offers the most promise of all the drugs discussed above. Although its
inhibition of wt-EGFR (IC50 3300 nM) is weak, it is a selective inhibitor
of EGFR and effectively inhibits EGFRvIII (IC50 0.19 µM), the most
common EGFR mutation. CM93's higher median survival in mice and high
brain-to-plasma concentration suggest potentially improved prognosis and
efficacy in patients. The absence of skin lesions and body weight loss in mice
suggests improved quality of life for patients, and its tolerability at higher
doses makes it a promising drug for the future. Epitinib offers the least
promise of the drugs listed. Despite its efficacy and ability to penetrate the
BBB, its toxicity and adverse side effects in patients (rashes, diarrhea,
elevated AST, hyperbilirubinemia) suggest limited effectiveness. The two
progression events in the liver and brain observed in the clinical trial
evaluating Epitinib lower the drug's promise by adding a risk factor. The
scarcity of preclinical information available about this drug also limits its
promise, as it comes with
many unknowns. After CM93, BDTX-1535 and WSD0922-FU offer promise in
terms of improving patient quality of life. BDTX-1535 showed no body weight
loss in vivo, and WSD0922-FU showed no dose-related toxicities in in vivo
studies. Both show potential to overcome resistance to widely used
EGFR-targeted drugs (Osimertinib for BDTX-1535 and Cetuximab for WSD0922-FU).
WSD0922-FU's low IC50 values for EGFRm and EGFRvIII inhibition show its
promise for inhibiting different types of EGFR mutations, while BDTX-1535's
irreversible inhibition of various EGFR mutations offers similar promise. Both
can penetrate the blood-brain barrier and increase median survival time in
vivo. Although the two drugs have similar efficacy and safety profiles, the
lack of information about them introduces many unknowns, giving them less
promise than CM93, which offers not only more specific evidence of reduced
toxicity in mice
but also specific inhibition values for various EGFR mutations/variants. With
similar efficacy to CM93, ERAS-801 shows great potential to penetrate the
BBB and inhibit EGFR with low IC50 values (1.1 nM against wild-type EGFR,
0.3 nM against EGFRvIII), suggesting strong efficacy. Its selectivity and
lack of interference with astrocytes suggest fewer negative effects on
other parts of the brain. Although its efficacy and selectivity offer promise,
the available data do not address effects or potential toxicities in patients,
placing it below CM93 in terms of promise. Similar to ERAS-801, Anlotinib shows
strong efficacy, with its potential to induce G2/M phase arrest, inhibit in
vivo proliferation, and achieve a median progression-free survival of 11.53
months, but shows no evidence of potential to improve patient quality of life.
Its high median survival time suggests improvements in prognosis; however, if
Anlotinib, like Epitinib, comes with strong dose-related toxicities, those
toxicities may hinder improvements in a patient's condition, limiting its
promise. Osimertinib, while offering strong efficacy through its high kb value
(greater than 10) and its low IC50 values (184 nM for wt-EGFR, 1 nM for T790M
mutations, and 1.25-3 µM in GBM cell lines), shows limited promise despite its
ability to increase the median survival time of mice by 16 days. Osimertinib's
toxic side effects, taken together with the results from the clinical trial
evaluating the drug in four patients, suggest that the drug's toxicities
could hinder treatment and recovery. Its negative effects lower
patient quality of life, while drugs such as CM93 show the potential to
increase it. Taken together, the preclinical and clinical profiles of these
EGFR tyrosine kinase inhibitors suggest that CM93 shows the most promise,
followed by BDTX-1535 and WSD0922-FU, ERAS-801, and Anlotinib. Epitinib
and Osimertinib, while efficacious, lower patient quality of life, giving them
less promise.
References
[AA14] O’Neill E. Abraham AG. Pi3k/akt-mediated regulation of p53 in
cancer. Biochem Soc Trans, 2014.
[Anl] Anlotinib combined with dose-dense temozolomide for the first re-
current or progressive glioblastoma after stupp regimen. clinicaltri-
als.gov.
[BZ18] Bakas S et al. Binder ZA, Thorne AH. Epidermal growth factor
receptor extracellular domain mutations in glioblastoma present op-
portunities for clinical imaging and therapeutic development. Cancer
Cell, 2018.
[CA13] Colman H. Cohen AL, Holmen SL. Idh1 and idh2 mutations in
gliomas. Curr Neurol Neurosci Rep, 2013.
[CD09] Atkins MB. Cho D, Mier JW. Pi3k/akt/mtor pathway: A growth and
proliferation pathway. in: Bukowski rm, figlin ra, motzer rj, eds. renal
cell carcinoma: Molecular targets and clinical applications. Humana
Press, 2009.
[CE12] Dogrusoz U et al. Cerami E, Gao J. The cbio cancer genomics por-
tal: an open platform for exploring multidimensional cancer genomics
data. Cancer Discov, 2012.
[Cha20] G. et al. Chagoya. Efficacy of osimertinib against egfrviii+
glioblastoma. Oncotarget, 2020.
[CI98] Vaillancourt MT et al Cheney IW, Johnson DE. Suppression
of tumorigenicity of glioblastoma cells by adenovirus-mediated
mmac1/pten gene transfer. Cancer Res, 1998.
[CS16] Arcaro A. Crepo S, Kind M. The role of the pi3k/akt/mtor pathway
in brain tumor metastasis. J cancer Metastasis Treat, 2016.
[CSY18] Huang C-C Huang E-Y. Chou S-Y, Yen S-L. Galectin-1 is a poor
prognostic factor in patients with glioblastoma multiforme after ra-
diotherapy. BMC Cancer, 2018.
[DA13] Palti Y. Davies AM, Weinberg U. Tumor treating fields: a new fron-
tier in cancer therapy. Ann N Y Acad Sci, 2013.
[DF15] Lemaire L Benoit J-P Lagrace F. Danhier F, Messaoudi K. Combined
anti-galectin-1 and anti-egfr sirna-loaded chitosan-lipid nanocapsules
decrease temozolomide resistance in glioblastoma: in vivo evaluation.
Int J Pharm, 2015.
[ea20] Wang Q. et al. Cm93, a novel covalent small molecule inhibitor tar-
geting lung cancer with mutant egfr. bioRxiv, 2020.
[ea22] Zhou Q. et al. Safety and efficacy of epitinib for egfr-mutant non-
small cell lung cancer with brain metastases: Open-label multicentre
dose-expansion phase ib study. Clin Lung Cancer, 2022.
[ER04] Buzzai M et al. Elstrom RL, Baur DE. Akt stimulates aerobic gly-
colysis in cancer cells. Cancer, 2004.
[ERAa] 10-k. sec.gov.
[ERAb] A study to evaluate eras-801 in patients with recurrent glioblastoma.
Clinical Trials.gov.
[FC17] Osorio L et. al Fernandes C, Costa A. Current standards of care in
glioblastoma therapy. Codon Publications, 2017.
[FD17] Hopkins BD Bagrodia S-Cantley LC Abraham RT. Fruman DA,
Chiu H. The pi3k pathway in human disease. Cell, 2017.
[FD19] Alanhhas I-et al. Fabian D, Guillermo Prieto Eibl MD. Treatment of
glioblastoma (gbm) with the addition of tumor-treating fields (ttf):
A review. Cancers, 2019.
[GG19] Stieber VW Wang BCM Garrison LPJ. Guzauskas GF, Pollom EL.
Tumor treating fields and maintenance temozolomide for newly diag-
nosed glioblastoma: a cost-effectiveness study. J Med Econ, 2019.
[GN20] Mizzi S Meilak L Calleja N Zrinzo A Grech N, Dalli T. Rising
incidence of glioblastoma multiforme in a well-defined population.
Cureus, 2020.
[KE12] Twigger K et al. Karapanagitou EM, Roulstone V. Phase i/ii trial
of carboplatin and paclitaxel chemotherapy in combination with in-
travenous oncolytic reovirus in patients with advanced malignancies.
Clin Cancer Res, 2012.
[LE19] Huang LE. Friend or foe-idh1 mutations in glioma 10 years on. Car-
cinogenesis, 2019.
[Osib] 18f-fdg pet and osimertinib in evaluating glucose utilization in
patients with egfr activated recurrent glioblastoma. ClinicalTrials.gov.
[Pet16] Ballard Peter. Preclinical comparison of osimertinib with other egfr-
tkis in egfr-mutant nsclc brain metastases models, and early evidence
of clinical brain metastases activity. American Association for Cancer
Research, 2016.
[PK05] Yla-Herttuala S. Pulkkanen KJ. Gene therapy for malignant glioma:
current clinical status. Mol Ther, 2005.
[QQ13] Liu X et al. Qi Q, He K. Disrupting the p1ke-a/akt interaction inhibits
glioblastoma cell survival, migration, invasion and colony formation.
Oncogene, 2013.
[RC14] Ebner FH et al Roder C, Bisdas S. Maximizing the extent of resection
and survival benefit of patients in glioblastoma surgery: high-field
imri versus conventional and 5-ala assisted surgery. Eur J Surg Oncol
J Eur Soc Surg Oncol Br Assoc Surg Oncol, 2014.
[RR16] Kolarovszki B Richterová R. Genetic alterations of glioblastoma. in:
Agrawal a, ed. neurooncology. InTechOpen, 2016.
[SA05] Wagner e Levitzki A Shir A, Ogris M. Egf receptor-targeted synthetic
double-stranded rna eliminates glioblastoma, breast cancer, and
adenocarcinoma tumors in mice. PLOS Med, 2005.
[SA13] Karsy M. Sami A. Targeting the pi3k/akt/mtor signaling pathway in
glioblastoma: novel therapeutic agents and advances in understand-
ing. Tumor Biol, 2013.
[SA18] Luesakul U Muangsin N Neamati N. Shergalis A, Bankhead A 3rd.
Current challenges and opportunities in treating glioblastoma. Phar-
macol Rev, 2018.
[SF18] Assi HI. Saadeh FS, Mahfouz R. Egfr as a clinical marker in glioblas-
tomas and other gliomas. Int J Biol Markers, 2018.
[SO05] Heid I et al. Saydam O, Glauser DL. Herpes simplex virus 1 amplicon
vector-mediated sirna targeting epidermal growth factor receptor
inhibits growth of human glioma cells in vivo. Mol Ther, 2005.
[SR17] Kanner A et al. Stupp R, Taillibert S. Effect of tumor-treating fields
plus maintenance temozolomide vs. maintenance temozolomide alone
on survival in patients with glioblastoma: A randomized clinical trial.
JAMA, 2017.
[SY01] Hirose Y et al. Sonada Y, Ozawa T. Formation of intracranial tu-
mors by genetically modified human astrocytes defines four pathways
critical in the development of human anaplastic astrocytoma. Cancer
Res, 2001.
[TA20] Lopez GY Malinzak M Friedman HS Khrasaw M Tan AC, Ashley DM.
Management of glioblastoma: State of the art and future directions.
CA Cancer J Clin, 2020.
[WG11] Binder ZA Gallia GL Riggins GJ. Weber GL, Parat M-O. Abrogation
of pi3kca or pik3r1 reduces proliferation, migration, and invasion in
glioblastoma multiforme cells. Oncotarget, 2011.
[WP12] Reardon DA Ligon KL Alfred Yung WK Wen PY, Lee EQ. Cur-
rent clinical development of pi3k pathway inhibitors in glioblastoma.
Neuro Oncol, 2012.
[WS97] Li J et al. Wang SI, Puc J. Somatic mutations of pten in glioblastoma
multiforme. Cancer Res, 1997.
[XP22] Pan H Chen J Deng C. Xu P, Wang J. Anlotinib combined with temo-
zolomide suppresses glioblastoma growth via mediation of jak2/stat3
signaling pathway. Cancer Chemother Pharmacol, 2022.
[Zha21] C. et al. Zhang. Concurrent use of anlotinib overcomes acquired
resistance to egfr-tki in patients with advanced egfr-mutant non-small
cell lung cancer. Thorac Cancer, 2021.
Detecting Distributed Denial Of Service Attacks
(DDoS) Using Machine Learning Models
∗†
Isha Singhal
October 12, 2023
Abstract
The digital landscape of today’s world is vulnerable to the widespread
threat of Distributed Denial of Service (DDoS) attacks. These attacks
have the potential to seriously damage businesses’ finances and reputa-
tions by interfering with the availability of internet services. Traditional
methods of DDoS mitigation, such as rule-based approaches, struggle to
keep up with the evolving nature of attacks. In this paper, I train
and test several supervised machine learning algorithms for the
identification of DDoS attacks to determine the most effective one. I examine
DDoS attacks in depth, then obtain and preprocess a dataset, using principal
component analysis (PCA) to reduce the number of features in the model
from 80 to 20 while preserving 90% of the variance in the dataset. By removing
unnecessary features, PCA improved both model accuracy and
training speed. Overall, the Random Forest model trained with PCA had
the best results, obtaining 99.9% accuracy, precision, and recall. The pro-
posed approach exhibits encouraging results, demonstrating its potential
to improve DDoS attack detection and thus reinforce network security.
1 Introduction
Distributed Denial of Service (DDoS) attacks involve overloading a target
system with an excessive amount of traffic, preventing it from responding to
valid user requests. These attacks are executed from multiple computers or
machines, making them harder both to detect and to stop. They exploit
the fundamental need for the availability of online services and can lead
to severe operational, financial, and reputational consequences. Modern DDoS
attacks are sophisticated and large-scale, and conventional protection methods
like firewalls and intrusion detection systems often fall short in handling them.
The attackers are continuously changing their methods to bypass the defense
mechanisms put in place to prevent DDoS attacks, and researchers, in turn,
change their approach to prevent new attacks.
∗ Advised by: Dr. Maria Konte
† Student at Northville High School in Northville, Michigan
There are several reasons why DDoS attacks are difficult to defend against.
One reason is the scale and volume of the attack. These attacks
involve a volume of traffic that exceeds the target's capacity, making it
difficult to filter out attack traffic from real traffic. Attacks can reach
hundreds of gigabits per second.
Another reason is that DDoS attacks are launched from a multitude of
sources, usually through the use of botnets. Botnets are networks of computers
that are infected with malware and controlled by an attacker, which makes it
difficult to identify and block the attacking sources effectively.
DDoS attacks often imitate normal user behavior, making it difficult to figure
out whether the traffic is legitimate or not. Thus, it becomes hard to filter out
malicious traffic without blocking real users. Lastly, DDoS attackers find and
exploit weaknesses in network protocols, infrastructures, or applications, further
amplifying the detrimental effects of their attacks.
Machine learning techniques have emerged as a potential solution to detect
and prevent DDoS attacks because of their ability to adapt to evolving
attack patterns. In this paper, I use various supervised machine learning
algorithms to detect DDoS attacks. Using a dataset containing benign traffic
and DDoS attacks, I trained and tested these models and then analyzed the
results to determine which model is most effective at identifying
DDoS attacks. The ultimate goal is to improve the detection of DDoS attacks
in real-world, real-time systems, where intrusion detection systems can be
proactively deployed with the appropriate model for the detection and
mitigation of these attacks.
2 Background
2.1 Types of DDoS Attacks
There are several kinds of DDoS Attacks [Imp23]. The most common are listed
below:
3 Dataset
3.1 What Does the Dataset Include
This dataset [Tal23], which shares its feature set with IDS2017, IDS2018, DoS2017,
and DDoS 2019 CIC NIDS datasets, includes data from various kinds of DDoS
attacks, such as DrDoS, UDP, LDAP, NetBIOS, MSSQL, and many others. It
lists 80 features from over 400,000 DDoS attacks in order to provide a large
dataset.
Using pandas, the data analysis library of Python, I read the dataset into a
pandas data frame with its read_csv() method. In this function, index_col is an
optional parameter that specifies the column(s) to be used as the index of the
resulting data frame. By default, index_col is set to None, meaning that a new
index will be created for the data frame.
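The effect of index_col can be seen in a minimal sketch. The two-row CSV text and its column names ("Flow ID", "Flow Duration") are hypothetical stand-ins for the real dataset file, which is not reproduced here.

```python
import io

import pandas as pd

# Hypothetical two-flow excerpt standing in for the real CSV file.
csv_text = "Flow ID,Flow Duration,Label\nf1,120,Benign\nf2,15,UDP\n"

# Default (index_col=None): pandas creates a fresh integer index 0..n-1.
df_default = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the first column ("Flow ID") becomes the row index instead.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(df_default.index))
print(list(df_indexed.index))
```

With the default, every column in the file stays a regular column; with index_col=0, the chosen column is moved out of the columns and into the index.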
3.2.2 Forward segment size average (fwd seg avg):
Some DDoS attacks involve packet fragmentation, where attackers split their
payloads into smaller segments to evade detection. Analyzing the forward seg-
ment size average can help identify unusually small or fragmented packets that
might be part of such attacks.
3.2.4 Initial backward window bytes (bw win byt):
Some DDoS attacks involve asymmetric communication, where the attacker
sends large amounts of data to the server during the initial connection phase to
consume server resources and establish connections without intending to
complete them. Monitoring the Initial Backward Window Bytes in tandem with
the Initial Forward Window Bytes can help detect asymmetric communication
attempts.
3.3 Various Types of Attacks in the Dataset
Using the matplotlib Python library, I plotted the count of each type of
attack (the column is called "Label") present in this dataset against its name
to see the spread of the various DDoS attacks.
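The counting step behind such a plot can be sketched with the standard library alone. The label values below are a hypothetical sample, not rows from the dataset, and the text bars only stand in for the matplotlib bar chart.

```python
from collections import Counter

# Hypothetical "Label" column values; the real dataset has far more rows.
labels = ["Benign", "UDP", "LDAP", "UDP", "Benign", "UDP", "MSSQL"]

# Count how many flows carry each label.
label_counts = Counter(labels)

# most_common() sorts by descending count, a natural order for a bar chart.
for name, count in label_counts.most_common():
    print(f"{name:8s} {'#' * count} ({count})")
```

The resulting names and counts are the two sequences a bar chart needs for its x and y values.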
Figure 5: Plotting Attack Versus Benign
Figure 7: Label Encoding [Chu23]
Figure 9: Random Undersampling of Data [Bro21]
Using PCA, I determined the minimum number of components needed to retain
90% of the variance in the data.
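The selection rule itself is simple: sort the per-component explained-variance ratios and take the smallest prefix whose sum reaches the target. The sketch below illustrates that rule only; the ratios are hypothetical values, and the function name is introduced here for illustration rather than taken from the paper's code.

```python
from itertools import accumulate


def min_components_for_variance(explained_ratios, target=0.90):
    """Return the smallest k whose top-k ratios sum to at least target."""
    ordered = sorted(explained_ratios, reverse=True)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative >= target - 1e-12:  # small tolerance for float rounding
            return k
    raise ValueError("target variance is not reachable")


# Hypothetical explained-variance ratios for six components (they sum to 1).
ratios = [0.50, 0.25, 0.10, 0.08, 0.05, 0.02]
k = min_components_for_variance(ratios)
print(k)
```

The same cumulative-sum logic is what a scree plot with a cumulative curve shows visually.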
Figure 13
Figure 14: Graphing a Scree Plot [SK23b]
Figure 15
Figure 16: Splitting the Data [Bro20]
The salient features of the five machine learning models above, and the
reasons they were selected for DDoS identification, are detailed below.
5.3 KNN
DDoS attacks can exhibit non-linear patterns in network traffic data. KNN is
a non-parametric algorithm, which means it can capture complex, non-linear
relationships between features and target classes, making it potentially effective
in detecting such patterns.
KNN can be used for anomaly detection since it classifies data points based
on the majority class of their k-nearest neighbors. In the context of DDoS
detection, attacks are often considered anomalies compared to normal network
traffic, and KNN can help identify these anomalies.
KNN is relatively easy to implement and understand. It does not require a
complex training process, making it suitable for quick prototyping and imple-
mentation.
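The majority-vote rule described above can be sketched in a few lines of plain Python. The two-feature points and their labels are hypothetical, already-scaled values chosen for illustration; this is not the model trained in the paper.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Vote among the k closest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical scaled flows with two features each.
X = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85), (0.9, 0.1), (0.95, 0.2), (0.85, 0.15)]
y = ["Benign", "Benign", "Benign", "DDoS", "DDoS", "DDoS"]

print(knn_predict(X, y, (0.88, 0.12)))
print(knn_predict(X, y, (0.12, 0.88)))
```

There is no fitting step: the "model" is just the stored training data, which is why prediction cost grows with the size of the training set.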
5.4 AdaBoost
DDoS attacks can exhibit complex and non-linear patterns in network traffic
data. AdaBoost can combine multiple weak classifiers, typically decision trees,
to create a more powerful model capable of capturing complex relationships in
the data.
AdaBoost assigns higher weights to misclassified samples during training, so
each new weak learner concentrates on the hardest cases. With shallow decision
trees as weak learners, this also acts as implicit feature selection, focusing
the model on the most relevant features for DDoS detection.
Like other ensemble methods, AdaBoost can adapt to changes in the data
distribution. This makes it suitable for handling dynamic
DDoS attack patterns.
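How boosting shifts attention between rounds can be illustrated with one reweighting step of the classic binary AdaBoost formulation. This is a sketch under that assumption (labels in {-1, +1}, hypothetical predictions); library implementations may use a different variant, and the function name is introduced here for illustration.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost reweighting step for labels in {-1, +1}.

    Returns the weak learner's vote weight (alpha) and the new sample weights.
    """
    # Weighted error: total weight of the misclassified samples.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if not 0 < err < 0.5:
        raise ValueError("weak learner must have weighted error in (0, 0.5)")
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified samples (t * p == -1) are scaled up, correct ones down.
    raw = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [r / total for r in raw]


# Five samples with uniform starting weights; only the last one is misclassified.
y_true = [+1, +1, -1, -1, +1]
y_pred = [+1, +1, -1, -1, -1]
weights = [0.2] * 5
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(alpha, new_weights)
```

After the update the single misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.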
5.7 Calculating ROC-AUC Score
The ROC-AUC is the area under the ROC curve. It measures how well the model
can distinguish between two classes. The score ranges from 0 to 1, where 0.5
corresponds to random guessing. A score of 1 is the ideal case, where the TPR
(true positive rate) is 1 and the FPR (false positive rate) is 0, meaning that
the model correctly classifies all positives and negatives.
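The ROC-AUC can also be read as the probability that a randomly chosen
positive receives a higher score than a randomly chosen negative. The sketch
below computes it that way; it is a pairwise illustration on made-up scores,
assuming labels of 0 and 1, and is not the routine used to produce the
reported results.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of positive/negative pairs in which the
    positive is scored higher (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 0.75
```

A perfect ranking gives 1.0 and a fully inverted ranking gives 0.0, which is
why values near 0.5 indicate a model with no discriminating power.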
Comparing the ROC-AUC scores for the five models shows that Random Forest is
the best option among these classifiers, even though all of them perform well.
Bar graphs comparing their accuracies and ROC-AUC scores are shown below:
Figure 22: Plot of ROC-AUC Scores for Various Models
ROC curves for the five machine learning models are displayed below. The
models perform almost equally well in terms of the area under the ROC curve,
which is close to 1 for all of them.
Figure 24: ROC Curve for Logistic Regression
Figure 27: ROC Curve for AdaBoost
5.9 Creating Confusion Matrix
Furthermore, I checked the effectiveness of the models using a confusion
matrix, a table showing the four combinations of predicted and actual values:
true positives, false positives, true negatives and false negatives.
The lower the values in the false positive and false negative blocks, the
better the effectiveness of the model. The higher the values in the true
positive and true negative blocks, the better the effectiveness of the model.
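The four blocks can be counted directly from the actual and predicted labels.
The following is a minimal sketch on made-up labels, assuming a binary problem
in which 1 marks DDoS traffic; the function name is illustrative.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tn, fp, fn, tp) for a binary problem."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            tp += 1
        elif actual == positive:
            fn += 1  # an attack that was missed
        elif predicted == positive:
            fp += 1  # benign traffic flagged as an attack
        else:
            tn += 1
    return tn, fp, fn, tp


# Hypothetical labels: 1 = DDoS traffic, 0 = benign traffic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tn, fp, fn, tp = confusion_counts(y_true, y_pred)
print([[tn, fp], [fn, tp]])  # [[3, 1], [1, 3]]
```

The printed layout places actual classes in rows and predicted classes in
columns, so the off-diagonal entries are the two kinds of error.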
Figure 31: Confusion Matrix for the Random Forest Model
Figure 33: Confusion Matrix for the AdaBoost Model
5.10 Creating Classification Report
I also evaluated the performance of these models using the classification
report. A classification report is a performance evaluation summary in machine
learning. It shows the precision, recall, F1 score, and support of the trained
classification model.
The metrics of a classification report are defined as follows:
Precision is the number of true positives divided by the sum of the true
positives and false positives.
Precision = TruePositives / (TruePositives + FalsePositives)
Recall is the number of true positives divided by the sum of the true
positives and false negatives.
Recall = TruePositives / (TruePositives + FalseNegatives)
The F1 score is the harmonic mean of precision and recall: two times the
product of precision and recall, divided by their sum.
F1 score = 2 * (Precision * Recall) / (Precision + Recall)
Support is the number of actual occurrences of each class in the evaluated
data.
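As a worked example of the three formulas, the sketch below computes them from
confusion-matrix counts. The counts are hypothetical and the function name is
illustrative; returning 0.0 for an empty denominator is a convention assumed
here.

```python
def classification_metrics(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 90 attacks caught, 10 false alarms, 30 attacks missed.
precision, recall, f1 = classification_metrics(tp=90, fp=10, fn=30)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.9 0.75 0.818
```

Because the F1 score is a harmonic mean, it sits closer to the lower of the
two values, so a model cannot score well by maximising only one of them.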
The classification reports for the five supervised machine learning models
are:
Figure 36: Classification Report for Logistic Regression Model
Figure 39: Classification Report for AdaBoost Model
The closer the values of precision, recall, and F1 score are to 1, the better
the effectiveness of the model.
Finally, I evaluated the performance of all five supervised machine learning
models using the precision-recall curve.
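A precision-recall curve is traced by sweeping the decision threshold over the
model's scores and recording precision and recall at each step. The following
is a minimal sketch on made-up scores, assuming labels of 0 and 1 and the rule
"predict positive when score >= threshold"; the function name is illustrative.

```python
def precision_recall_points(y_true, scores):
    """(threshold, precision, recall) for each distinct score threshold,
    predicting positive when score >= threshold."""
    total_pos = sum(1 for t in y_true if t == 1)
    if total_pos == 0:
        raise ValueError("at least one positive sample is required")
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        # tp + fp >= 1 because at least one sample has score == thr.
        points.append((thr, tp / (tp + fp), tp / total_pos))
    return points


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
for thr, precision, recall in precision_recall_points(y_true, scores):
    print(thr, round(precision, 3), round(recall, 3))
```

Lowering the threshold can only raise recall, while precision may rise or
fall, which is what gives the curve its characteristic sawtooth shape.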
Figure 41: Precision Recall Curve for the Logistic Regression Model
Figure 42: Precision Recall Curve for the Random Forest Model
Figure 43: Precision Recall Curve for the KNN Model
Figure 45: Precision Recall Curve for the Decision Tree Model
References
[All22] Stephen Allwright. Using cross val score in sklearn, simply explained.
https://stephenallwright.com/cross_val_score-sklearn/, 2022.
[Hui23] Purva Huilgol. Precision and recall: Essential metrics for machine
learning (2023 update).
https://www.analyticsvidhya.com/blog/2020/09/precision-recall-machine-learning,
2023.
[Imp23] Imperva. DDoS attack types & mitigation methods: Imperva.
https://www.imperva.com/learn/ddos/ddos-attacks, 2023.
[Kha21a] Aman Kharwal. Classification report in machine learning: Aman Kharwal.
https://thecleverprogrammer.com/2021/07/07/classification-report-in-machine-learning,
2021.
[Kha21b] Aman Kharwal. StandardScaler in machine learning: Aman Kharwal.
https://thecleverprogrammer.com/2020/09/22/standardscaler-in-machine-learning,
2021.
[Man20] Sanchita Mangale. Scree plot.
https://sanchitamangale12.medium.com/scree-plot-733ed72c8608, 2020.
[Mik19] Bartosz Mikulski. PCA-how to choose the number of components?
https://www.mikulskibartosz.name/pca-how-to-choose-the-number-of-components,
2019.
[Naj23] Mohammad Najafimehr et al. DDoS attacks and machine-learning-based
detection methods: A survey and taxonomy.
https://onlinelibrary.wiley.com/doi/full/10.1002/eng2.12697, 2023.
[Nar21] Sarang Narkhede. Understanding confusion matrix.
https://towardsdatascience.com/understanding-confusion-matrix-a9ad42dcfd62,
2021.
[Net23] Palo Alto Networks. What is a denial of service attack (DoS)?
https://www.paloaltonetworks.com/cyberpedia/what-is-a-denial-of-service-attack-dos,
2023.
[One23] OneLogin. What is a DDoS attack: Types, prevention & remediation.
https://www.onelogin.com/learn/ddos-attack, 2023.
[Pan22] Pankaj. numpy.cumsum() in Python.
https://www.digitalocean.com/community/tutorials/numpy-cumsum-in-python, 2022.
[Sch23] Frank Schoonjans. ROC curve analysis.
https://www.medcalc.org/manual/roc-curves.php, 2023.
[SK23a] Paula Villasante Soriano and Cansu Kebabci. Principal component
analysis (PCA) in Python: Sklearn example.
https://statisticsglobe.com/principal-component-analysis-python, 2023.
[SK23b] Paula Villasante Soriano and Cansu Kebabci. Scree plot for PCA
explained: Tutorial, example & how to interpret.
https://statisticsglobe.com/scree-plot-pca, 2023.
[Ste20] Doug Steen. Precision-recall curves.
https://medium.com/@douglaspsteen/precision-recall-curves-d32e5b290248, 2020.
Can Behavioural Economics Help
Explain Gender Disparities in Labour
Markets?
∗
Jumaina Fatima
October 15, 2023
Abstract
The presence of pervasive gender disparities integrated into our labour
market outcomes (of promotion, pay, and hiring) poses a threat to our
current market functions. This paper determines the potential of be-
havioural economics in bridging these disparities in the context of labour
market outcomes. Drawing on insights from the subject, the paper ar-
gues that gender disparities in labour market outcomes can be a result
of a variety of behavioural biases and heuristics that lead to sub-optimal
decision-making.
Using a behavioural lens, the paper identifies and focuses on three
cognitive biases: the endowment effect, the overconfidence bias and the
status quo bias. It then proposes interventions and policy modifications
to overcome these ubiquitous biases and correct the present labour market
climate.
1 INTRODUCTION
Two hundred eighty-six years is the daunting estimate given by the World
Economic Forum’s Global Gender Gap Report 2021 for women to achieve economic
parity with men (World Economic Forum, 2021). This staggering and stagnant
figure for labour outcomes re-emphasises the irrationality of the
ever-widening gender gap.
Substantial evidence shows that having women in senior leadership roles
correlates strongly with firm growth, market share, revenues, return on
investment, productivity and profitability. Failing to hire and promote women
is therefore irrational and inefficient (Martha Fineman & Terence Dougherty,
2005). Among companies surveyed by the ILO that track the impact of gender
diversity in management, over two-thirds report profit increases of 5-20%
(ILO, 2019). Catalyst points out that companies with women on their boards
outperformed companies with zero women board directors by 84% in return on
sales, 60% in return on invested capital, and 46% in return on equity
(Catalyst, 2021). Enterprises also report improved business outcomes from
gender diversity initiatives: over 60% report higher profitability and
productivity, 56.8% report an increased ability to attract and retain talent,
54.4% report greater creativity, innovation and openness, 54.1% say that their
company’s reputation has been enhanced, and 36.5% are better able to gauge
consumer interest and demand (ILO, 2019).
∗ Advised by: Edoardo Gallo
These statistics imply that hiring, promoting and retaining more women in the
existing competitive global market is of the utmost importance. It is
imperative for organisations to view gender balance as a bottom-line issue,
not just a human resource issue (Deborah France-Massin, Director of the ILO
Bureau for Employers’ Activities, 2019).
Additionally, the International Labour Organization estimates that closing
the gender gap in participation by 25% before 2025 could increase global GDP
by US$5.3 trillion. Every 1% of female employment growth is associated with,
on average, annual GDP growth of 0.16% (Women in Business and Management:
The Business Case for Change. ILO, 2019).
Among those aged 25 to 54, the gender gap in labour force participation
stood at 29.2% in 2022, with female participation at 61.4% and male
participation at 90.6% (ILO, 2022). Globally, only about 18% of firms have a
female manager (Esteban Ortiz-Ospina and Max Roser, 2018) and, overall, women
hold 19.7% of board seats worldwide, with only 6.7% chairing boards (Deloitte,
2021).
These arguments and statistics establish two vital points. Firstly, the growth
of women’s labour force participation has had a significant impact on
organisations by contributing to an increase in profitability and in global
GDP. Secondly, by hindering the advancement of women through unstructured and
loosely targeted policies for attaining gender parity, current labour markets,
organisations and economies appear to be behaving irrationally.
Figure 2: Source: Global Gender Gap Report, 2022.
2 LITERATURE REVIEW
2.1 Behavioural Economics- The Psychological Branch of
Economics
The long-standing assumption of human rationality, made by neoclassical
economists to simplify their models, was challenged by Daniel Kahneman and
Amos Tversky, leading to the foundation of a more complex and comprehensive
branch of study. An amalgamation of psychology, neuroscience, and economics,
behavioural economics is an attempt to put the study of economic
decision-making onto a firm scientific basis (David Orrell, 2021).
“Behavioural Economics is the study of how people make decisions, not how they
should make decisions” (Thaler, 2015). This branch broadened the study of
economics, expanding our understanding of how people make decisions and
choices. Human behaviour, contrary to neoclassical economic theory, is not
always motivated by rationality. The human mind is subject to cognitive
heuristics and biases that limit our decision-making. The gender disparities
that we see today in the labour market are evidently irrational, as the data
presented earlier show. Applying a behavioural economic approach helps us
discern the outcomes that cognitive biases produce for women’s labour force
participation.
harder for them to advance, and white men are almost 50 per cent more likely
than men of colour to think this” (Women and Workplace, 2017).
Men feel threatened by efforts towards gender parity owing to ignorance; their
fear of being replaced and facing losses allows implicit biases to creep in
and affect outcomes. These implicit biases have a significant impact on
outcomes for women because men are more prominently represented in higher
managerial and leadership roles, accounting for 62% of C-suite roles and about
70% of senior management. An instance of implicit bias creeping into decisions
and strengthening the endowment effect is when a manager views a male employee
who is assertive and self-promoting as having “leadership potential”, while
viewing a female employee who exhibits the same behaviours as “pushy” or
“abrasive” (Rudman, L. A., & Glick, P., 2010). Similarly, research on gender
and leadership has found that female leaders who attempt to establish their
authority in a traditionally masculine (e.g., authoritative or directive)
manner are evaluated more harshly than their male peers (Eagly, Makhijani, &
Klonsky, 1992). Perhaps in response to this resistance, women have tended to
develop a more participative leadership style, which is consistent with
prescriptive gender roles for women (Eagly & Johnson, 1990) and is more
effective for them than traditionally male leadership styles (Eagly,
Johannesen-Schmidt, & van Engen, 2003; Eagly, Karau, & Makhijani, 1995).
Men often frame the issue of the gender gap as a need to protect themselves
against losses. Particularly when combined with the other two bounds, namely
bounded rationality and bounded willpower, the cumulative effect of these
cognitive and behavioural factors underscores how crucial it is to intervene
to correct this gender gap and the perceptions and behaviours surrounding it
(Heckbert, L., 2018).
asking for higher pay due to the undervaluation of their skills results in low
starting salaries. The gender gap in starting salaries is then amplified over years
through pay raises that mostly use starting salaries as a base and consequently
lead to larger gender pay gaps in the long run.
Previous studies on gender differences in initiating salary negotiation find
that, compared to men, women are more likely to feel anxious and less entitled
during negotiations (Bowles, Babcock and McGinn, 2005). If the expected
economic gains were large enough to outweigh the social costs, then the
rational course of action would be to initiate negotiations in spite of those
costs (Bowles, Babcock, and Lai, 2007). Two things can be inferred from the
study “Social Incentives for Gender Differences in the Propensity to Initiate
Negotiations: Sometimes It Does Hurt to Ask” by Bowles, Babcock, and Lai.
Firstly, women are overconfident in undervaluing the potential economic gains
that could result from defying social norms. Secondly, women’s greater
reluctance, compared to men, to initiate negotiations over resources such as
compensation may be traced to the higher social costs that they face when
doing so. A possible explanation for the lack of entitlement women feel during
a salary negotiation is the serious undervaluation of their work: they tend to
be overconfident in their own low estimate of their work’s worth. This
cognitive bias, like many other biases, stems from social norms and
stereotyping. Social norms play essential roles in people’s economic
behaviours (Kray, Galinsky, and Thompson, 2002; Li, De Oliveira, and Eckel,
2017). Traditional social norms prescribe that women are generally expected to
demand and accept less and give away more (Bowles, Babcock, and Lai, 2007).
Individuals who believe their performance is better than others’ are more
likely to ask for a pay raise. Equity theory suggests that people compare their
own input/outcome ratios with others’ input/outcome ratios, and try to restore
the balance if their ratios are higher than others’ (Huseman, Hatfield and
Miles, 1987). Rationally, women should negotiate for their work’s worth, but
overconfidence bias, combined with social norms, becomes a cognitive hindrance
to doing so. Identifying this bias and using behavioural interventions to
correct it can significantly improve labour market outcomes for women by
reducing the gender pay gap.
W., & Zeckhauser. R, 1988).
This bias is most visibly at play in women’s labour outcomes when it comes to
promotion to higher levels of leadership. According to a Peterson Institute
for International Economics study, only 5% of CEOs in the S&P 500 are women.
Similarly, a 2019 survey by McKinsey & Company found that women hold only 21%
of C-suite positions in the United States.
According to social role theory, women face stereotyped perceptions because
of their multiple social roles. Social role theory examines the causes of sex
differences and similarities in social behaviours. It also argues that the
gendered division of labour leads to the gender stereotypes which characterise
a society (Eagly, 1987). The ideologies of patriarchy and separate spheres
become the inherent status quo, leading to the underrepresentation of women in
leadership roles that they are more than competent to hold. The failure of
organisations to move away from these outdated notions and to take into
account the multifaceted roles that women play in society not only imposes
significant economic and social costs on women but also results in a huge loss
of economic opportunity for organisations.
If decision-makers prefer to promote individuals who resemble those who have
been successful leaders in the past (i.e., the status quo), this can lead to a
disproportionate underrepresentation of women in leadership positions and
leave little to no room for women to be considered for higher-level
promotions.
Merit-based pay and promotion programs, or meritocracy, have long been used by
organisations as affirmative action for diversity policies. Meritocracy has
been culturally accepted as a fair and legitimate distributive principle in
many advanced capitalist countries and organisations (Scully, 1997, 2000;
McNamee and Miller, 2004). However, a study by Castilla and Benard (2010)
found that companies that emphasised meritocracy in their promotion decisions
actually exhibited greater bias against women and minority groups than those
that did not. The study’s key hypothesis, which its results support, is that
managers making decisions on behalf of organisations that emphasise
meritocracy ironically show greater bias in favour of men over equally
performing women. This happens in part because the culture of meritocracy
unintentionally triggers managers’ stereotypes and other schemata while they
make employment decisions (Swidler, 1986; DiMaggio, 1997). This paradoxical
finding suggests that the emphasis on meritocracy may actually reinforce the
status quo bias. When an organisation is explicitly presented as meritocratic,
individuals in managerial positions favour a male employee over an equally
qualified female employee by awarding him a larger monetary reward (Castilla,
E. J., & Benard, S., 2010).
The patriarchal notion of men as the ultimate leaders of society, and
consequently of businesses, set in place the status quo bias that women now
need to fight in order to climb the corporate ladder to higher-level
positions, such as those of directors or CEOs. Female candidates do not
resemble the stereotypical notion of directors and leaders. Schein’s research
has shown that, in the UK, Germany, China, Japan and the US, men associate
the attributes needed for leadership with men but not with women (Schein,
Virginia E., et al., 2000). This was dubbed the “think manager, think male”
phenomenon. These ideologies are also repeatedly seen in the form of glass
ceilings. The glass ceiling refers to barriers, often invisible, that obstruct
women and other minority groups as they try to rise to upper management
positions (Morrison, 1980).
4 BEHAVIOURAL INTERVENTIONS
Keeping in mind the heuristics identified that play a significant role in the gender
disparities of labour markets, this section of the paper suggests behavioural
interventions that can potentially be used to curb, or at least, nudge the problem
at hand.
Firstly, this paper suggests the use of explicit rules during the hiring
process. As discussed in Section 3, women hesitate to lead or even initiate
negotiations over their salaries. They fear that being forthright and boldly
asking for their work’s worth would make them seem rude, less likeable and
less hirable. An experimental study (Lin Xiu et al., 2022) examined how
explicit pay raise rules affect men’s and women’s initiation of salary
negotiations differently. The results showed that when pay raise rules are
explicitly stated, women are less reluctant to ask for a pay raise. The
explicit rule effect seems to work particularly well for women with
above-average task performance. A clearly stated rule frees women from
concerns that their decision to ask might be perceived as socially less
acceptable and that starting salary negotiations conflicts with their
internalised social norms (Lin Xiu et al., 2022). This would let women infer
the value of their work without the constraints of social and gender norms. By
using the behavioural tactic of framing and explicitly stating that wages are
open to negotiation, organisations can not only empower women to start salary
negotiations but also increase women’s trust in the organisation’s pay raise
process, thereby retaining talent longer. Women would be assured that their
work and talent are valued and that their career advancement is supported.
Secondly, acknowledging the endowment effect, it is crucial that organisations
carefully communicate their stance on diversity and gender parity. This
pursuit is for equality. Better labour market outcomes for women do not mean
adverse outcomes for men; progress for one group does not come at the expense
of the other. “If men believe their organisations prioritise gender diversity
because it leads to better business results, they are significantly more
likely to think it matters. . . . [W]hen men think companies prioritise gender
diversity because it is ‘fair to all people,’ they are more likely to be
personally committed.” (Women and Workplace, 2017). It is important that
organisations make active efforts to curb any ignorance and misinformation on
the part of their male employees.
The ideal way to strive for better labour outcomes for women is through gender
parity and inclusion, but also by making sure that competent human capital is
fully utilised to accelerate economic development and amplify economic
prosperity. This could potentially be a better way of framing the pursuit of
gender parity in labour markets. If businesses provided concrete examples of
economic results achievable with increased diversity and then described that
economic opportunity in a manner that activated employees’ loss aversion
biases, this could help increase male employees’ prioritisation of diversity
in leadership (Heckbert, L., 2018). Numerous studies and reports support the
claim that increased participation of women in the labour market leads to
markedly better outcomes. Drawing on these studies, organisations can frame
the advancement of gender parity as an economic opportunity and a means of
improving business output.
Thirdly, joint evaluation of employees could help nudge out implicit biases
that people tend to harbour due to social norms. These evaluations could
provide evidence-based rebuttals to any inherent stereotypical ideologies one
might hold. Bohnet et al.’s research applied the behavioural economics finding
that “people make more reasoned choices when examining options jointly rather
than separately” to the process of employee evaluations. They found that when
employees are jointly evaluated, individual performance drives evaluation
decisions; when they are separately evaluated, group stereotypes drive such
decisions. Businesses could opt for joint evaluation at each of the hiring,
review and promotion stages as a normative best practice and fairness
mechanism.
Organisations could frame it as helping managers maximise profits and team
results by ensuring consistent selection of higher-performing candidates.
Framing such a procedure as a fairness mechanism should appeal to individuals’
bounded self-interest. Governments could nudge businesses toward adopting
joint evaluation procedures. This could be incorporated in a “comply or
explain” regulation, using an information-based strategy to indirectly alter
business behaviour (Heckbert, L., 2018).
5 CONCLUSION
Through the behavioural economics lens taken in this paper, the three
heuristics and their respective arguments have reaffirmed the irrationality
of the gender gaps in labour market outcomes.
The endowment effect clarifies that paving the path to positions of authority
for women, or at the least considering competent women for these roles, is
felt as a devastating loss by those already concentrated in high numbers at
the top of the ladder. As suggested in the interventions, this can be
addressed by framing the promotion and advancement of women as business and
economic opportunities.
This paper takes a unique approach to the application of the overconfidence
bias in the context of gender disparity in labour market outcomes.
Historically, women have been economically disadvantaged due to the
patriarchal norms embedded in the very essence of society. In such a
situation, where social norms are not in one’s favour, it is difficult to
value one’s work and skills, especially if one has been conditioned to
downplay one’s achievements and never demand more. Women are also hesitant to
negotiate their compensation in situations where they believe that the value
of their economic gains is less than the social cost of defying social norms.
Organisations can help women become better negotiators by adopting explicit
rules.
The resistance to hiring, promoting, and equitably compensating women
within organisations highlights the persistence of patriarchal notions that
reinforce men’s leadership dominance. This underlines the critical need
for organisations to overhaul their policies and strategies, embracing gender
diversity as a fundamental aspect of their organisational culture.
Behavioural economics offers a way of understanding irrationality through
structured and well-defined heuristics and biases. Combining these
perspectives helps ensure that the problem at hand is understood from
different angles, offering reasons for the inherent issue of gender
disparities in labour markets. Once the issues are thoroughly understood,
curbing and fixing them becomes much simpler.
References
[BBL07] Hannah Riley Bowles, Linda Babcock, and Lei Lai. Social incentives
for gender differences in the propensity to initiate negotiations: Sometimes
it does hurt to ask. Organizational Behavior and Human Decision Processes,
103(1):84–103, 2007.
[Hec18] Lori Anne Heckbert. Closing the gender gap in corporate advancement:
Insights and solutions from behavioral economics. Windsor Yearbook of Access
to Justice, 35:187–225, 2018.
[KKT+ 91a] Daniel Kahneman, Jack L Knetsch, Richard H Thaler, et al. The
endowment effect, loss aversion, and status quo bias. Journal of Economic
Perspectives, 5(1):193–206, 1991.
[RG21] Laurie A Rudman and Peter Glick. The Social Psychology of Gender: How
Power and Intimacy Shape Gender Relations. Guilford Publications, 2021.
[RXH] Yufei Ren, Lin Xiu, and Amy B Hietapelto. Gender differences in asking
for pay raises: The role of explicit rules.
[SMLL96] Virginia E Schein, Ruediger Mueller, Terri Lituchy, and Jiang Liu.
Think manager—think male: A global phenomenon? Journal of Organizational
Behavior, 17(1):33–41, 1996.
Using Data-Efficient Image Transformers for
Diabetic Retinopathy Severity Classification
†
∗
Veda Fernandes
October 17, 2023
Abstract
Roughly 10% of the global adult population is diabetic. Diabetes is
a metabolic condition which results in chronically high blood sugar
levels. Patients with diabetes are at substantially higher risk for several
serious health conditions including diabetic retinopathy (DR). DR is a
vision-threatening disease which affects 35% of diabetic patients and is
projected to affect 160 million people by 2045. Diabetic patients should
be screened for retinopathy every one to two years; however, in many
countries patients are not regularly screened and therefore not treated.
Globally, the lack of rapid and cost-effective screening strategies for DR
leads to underdiagnosis and loss of vision. Machine learning tools of-
fer a solution in developing automated models to diagnose DR from eye
fundus images. In published literature, convolutional neural networks
(CNNs) are the state-of-the-art model for classification of DR. More re-
cently, transformer models have been applied and shown superior perfor-
mance. Text transformer models have resulted in the proliferation of tools
such as ChatGPT, which provide contextual understanding and ability to
identify dependencies. In this study, we perform a head-to-head comparison
between CNN and vision transformer models for classifying DR. We
demonstrate that transformer models diagnose DR with substantially
higher accuracy, with improvements of up to 13% as measured by the F1
performance metric. Furthermore, we identify optimal training parameters for
diagnosis of DR, training a total of 19 machine learning models reaching a test
set F1 score performance of 90% on a dataset of 35,130 fundus images
with 20% of images withheld for independent testing.
1 Introduction
Diabetic retinopathy (DR) is a vision-threatening microvascular disease caused
by significant damage to the blood vessels in the retina and is one of the most
frequent complications of diabetes mellitus [NPS22]. DR is a leading cause of
preventable blindness and vision impairment among the working-age population,
with a prevalence of about 35% among those with diabetes mellitus. By 2045,
it is estimated that 783 million people will be diabetic [Fed21] and 160 million
people could be affected by DR [TTY+ 21]. The increase in DR prevalence by
2030 is notably concentrated in low- and middle-income countries in regions
such as Asia, South America and the MENA region [TW22]. This disease burden
will require effective DR screening strategies to align with the changing
demographic. About 56% of new cases could be prevented with timely monitoring
of severity and treatment [TMS20].
∗ Dubai International Academy Emirates Hills
† Advised by: Dr. Parsa Akbari, University of Cambridge
DR can be graded into 5 stages according to the morphological changes that
occur in the retina as the disease progresses: no DR, mild non-proliferative
diabetic retinopathy (NPDR), moderate NPDR, severe NPDR and proliferative
diabetic retinopathy (PDR) [DSS17] (Fig. 1). Screening is imperative to
identify the stage of DR: with timely referral, the progress of DR can be
slowed and severe vision impairment can be prevented [WSK+ 18]. Despite this,
there is a shortage of ophthalmologists to screen the millions of retinal
images generated for diabetic patients, especially in developing countries
[RLW+ 20] [WSK+ 18]. Here, technology can provide an alternative to
traditional screening by physicians, reducing the cost and manpower required
to screen patients for DR.
Ocular telemedicine is a concept which has been proposed to make DR
screening more cost efficient. This involves local clinics sending images of the
retina to a central ‘grading center’ where experts can grade the level of DR
severity [HSCA16]. Hand-held imaging devices are another solution and have
achieved high specificity and sensitivity compared to traditional retinal cam-
eras [PDKB22]. However, in both solutions, trained clinical professionals are
still required to analyze the retinal images. An automated system would con-
duct the initial screening of retinal images to detect signs of DR even when
ophthalmologists are unavailable.
An automated system must recognize the changes to retinal vasculature
caused by the various stages of DR as seen on fundus images. As seen in Fig. 1,
DR-affected retinal images show characteristic color and patchy variations,
due to morphological changes in the retina. These changes include lesions
called microaneurysms (MA), hard exudates (HEX), soft exudates (SE) and
hemorrhages (HEM), as well as an increase in blood vessels. In mild NPDR, MAs
are seen in one quadrant; moderate NPDR progresses to vessel blockage and the
presence of lesions. Severe NPDR presents with venous beading and a large
number of HEMs. This is a precursor to the ‘neovascularization’ of the PDR
stage [NPM+ 22], that is, the formation of new blood vessels on the retina,
which may eventually lead to blindness (Fig. 1).
In recent decades, artificial intelligence has been trained to classify DR stages
from fundus images. Early models used ML-based classifiers like Random Forest
(RF) [CSC+ 14], K-Nearest Neighbours (KNN) [NvGRA07], Support Vector
Machines (SVM) [SAFL10] and Artificial Neural Networks (ANN) [UDH+ 04].
These methods required efficient prior hand-engineered feature extraction, which
could introduce errors into complex fundus imaging. [NG18] evaluated 7 auto-
mated retinal image analysis (ARIA) systems to classify DR. These models had
Figure 1: Images of the retina showing stages of diabetic retinopathy, graded
into 5 classes ranging from 0-4, as per the EyePACS dataset [GPC+ 16].
Recently, transformer models have gained popularity in a variety of applications.
An example of this rising interest is ChatGPT, an ML tool which uses a
transformer architecture for Natural Language Processing (NLP) [Ope23].
Transformers are efficient at identifying the relationships between separate
elements within data and are capable of parallel processing, which means they
can be trained more rapidly [IEE+ 23]. The success of transformers in NLP
prompted the creation of vision transformers (ViT), which can be applied to
image data to carry out computer vision tasks [HGL+ 23]. These traits make
ViTs adept at tasks that require an understanding of context, making them
valuable not only in NLP but also in medical image classification and
segmentation [SKZ+ 23]. Consequently, ViTs may be a promising model
architecture for DR detection from fundus images.
[WHX+ 21] and [AAKJTT+ 21] demonstrated that attention-based ViTs
provide high accuracy for DR classification. [AKCS23] used an ensemble of
Vision Transformers (ViT), Data efficient image Transformers (DeiT), Bidirec-
tional Encoder representation for image Transformer (BEiT) and Class-Attention
in Image Transformers (CAIT) to stage DR severity. Transformers gained
prominence only recently and hence, there is limited literature on the effect
of hyperparameters of ViTs as compared to CNNs, which have been studied
much more rigorously. Therefore, this study aims to compare the performance
of a ViT and CNN architecture and find the optimal hyperparameters for a ViT
model to classify DR fundus images.
In this study, we benchmarked DeiT, a recent ViT model, against ResNet-
18, a classical CNN model, for binary classification of retinal fundus images
into no or mild nonproliferative DR and moderate nonproliferative or more
severe DR. This classification was chosen as it is recommended by the American
Diabetes Association and the International Council of Ophthalmology that cases
of moderate NPDR or more severe stages are referred to an ophthalmologist
[WSK+ 18] [SCD+ 17]. Therefore, the model would be used as a preliminary
screening tool to help refer patients while the specific diagnosis of the severity
of DR could be done by a medical professional upon consultation. We used the
EyePACS Dataset, consisting of 35,130 fundus images [GPC+ 16].
To address the gap in the literature on transformer models for DR
classification, we identified the optimal hyperparameters for the
DeiT architecture and compared its performance to ResNet-18. The models
were evaluated on their F1 score, the harmonic mean of the precision and
recall of the model, which penalizes extreme values of either [HST+ 22].
The results indicated that the transformer model showed superior performance
across a range of hyperparameters including learning rate and batch size.
The F1 scores showed that the DeiT model performed considerably better than
the ResNet-18 model and that a learning rate of 1E-04, a batch size of 32
and 6 epochs gave the best model performance.
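As a minimal sketch of the metric described above, the F1 score can be computed from counts of true positives, false positives and false negatives. The counts used here are hypothetical and are not taken from this study; they only illustrate how the harmonic mean penalizes an imbalance between precision and recall.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
balanced = f1_score(tp=80, fp=20, fn=20)  # precision 0.8, recall 0.8
skewed = f1_score(tp=50, fp=0, fn=50)     # precision 1.0, recall 0.5
print(round(balanced, 3), round(skewed, 3))
```

In the second case the arithmetic mean of precision and recall would be 0.75, whereas the F1 score is lower because the weaker of the two values dominates.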
2 Results
2.1 DeiT model performance showed 23% improvement in
F1 score with optimal learning rate
We investigated the impact of learning rates on the performance of two model
architectures: DeiT and ResNet-18. DeiT is a vision transformer while ResNet-18
is a convolutional neural network. The learning rate governs how quickly a
model learns. A higher learning rate allows the model to take larger steps
during optimization; however, this may cause the optimal solution to be
overshot. In contrast, smaller learning rates may eventually reach a desirable
result but are less time-efficient. To find the optimal value for the learning
rate, each model was trained with learning rates of 1E-03, 1E-04, 1E-05 and
1E-06. Both the ResNet-18 model and the DeiT model showed a similar trend in
performance based on learning rates (Fig. 2).
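The trade-off between large and small learning rates can be illustrated with a toy problem. The sketch below minimizes f(x) = x^2 with plain gradient descent; the function, the step count and the three learning rates are illustrative assumptions and are unrelated to the values used to train DeiT or ResNet-18.

```python
def gradient_descent(lr: float, steps: int = 20, x0: float = 1.0) -> float:
    """Minimize f(x) = x**2 with plain gradient descent; return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2 * x       # derivative of x**2
        x = x - lr * grad  # step against the gradient
    return x


too_large = gradient_descent(lr=1.1)    # every step overshoots the minimum at 0
moderate = gradient_descent(lr=0.1)     # shrinks steadily towards 0
too_small = gradient_descent(lr=0.001)  # barely moves within 20 steps
print(too_large, moderate, too_small)
```

With the largest rate the iterate moves further from the minimum at every step, while with the smallest rate it remains close to its starting point after the same number of steps.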
For the DeiT model, learning rates of 1E-04 and 1E-05 demonstrated superior
performance, showing a 40% improvement in test F1 score compared to higher
learning rates. However, learning rates lower than 1E-04 and 1E-05 reduced the
ability of the model to detect moderate NPDR or more severe cases, with a 7%
decrease in model performance at a learning rate of 1E-06. For
the ResNet-18 model, similar to the DeiT model, a learning rate of 1E-04 led
to better overall model performance. Below and above that value, the model
performance was much poorer.
A significant finding was that the DeiT model consistently outperformed
ResNet-18 across all learning rates (Fig. 2), with a 13% higher Test F1 score.
ViT generally excels in understanding the bigger context in images as compared
to CNN models, which could explain the results.
Figure 2: Graphs showing the effect of different learning rates on the performance
of the DeiT and ResNet-18 models. This figure shows the Test F1 scores
for the no or mild NPDR and moderate NPDR or more severe DR categories
and overall cost of the DeiT and ResNet-18 models. The DeiT model generally
performs better than the ResNet-18 model, with the F1 score being higher on
average and the precision being lower. It can also be seen that the optimal
learning rate for both models is 1E-04 as the F1 graphs peak while the cost
function is low at that point.
2.2 Batch size of 32 improved model test F1 score by 50%
Deep learning models are trained with the stochastic gradient descent algorithm
which performs each iteration of training using a single batch of data. Therefore,
at each training iteration, improvements in model performance are incremental
and dependent only on images present in the batch. Batch size is a critical
training parameter which determines the number of images the model is trained
or tested on in each iteration. Finding the correct batch size is important,
as this fundamentally affects the training of the model. Larger batches result
in a greater amount of information informing each training iteration; however,
larger batch sizes are computationally intensive and are slow to execute. Smaller
batch sizes are computationally efficient; however, less information informs each
training iteration. Furthermore, smaller batch sizes lead to fluctuations in model
training which may be beneficial in searching the model parameter space and
resulting in superior performance, or may lead to ineffective training iterations
and poor performance.
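To make the link between batch size and the number of training iterations concrete, the sketch below counts the parameter updates in one pass over the 28,102 training images reported in the Methods. It assumes that the final, partially filled batch is kept rather than dropped, which the study does not state.

```python
import math


def iterations_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of parameter updates in one pass over the training set."""
    # Assumption: the last, smaller batch is still used for an update.
    return math.ceil(num_images / batch_size)


TRAIN_IMAGES = 28_102  # training-set size reported in the Methods
for batch_size in (8, 16, 32, 64):
    print(batch_size, iterations_per_epoch(TRAIN_IMAGES, batch_size))
```

Doubling the batch size roughly halves the number of updates per epoch, so each update is informed by more images but the model is adjusted less often.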
We experimented with a range of values, starting at 8 and doubling at each
step, testing batch sizes of 8, 16, 32, and 64 to assess their impact on the
model’s performance. We found that larger batch sizes tended to improve model
performance, with a batch size of 32 giving the best overall performance (Fig.
3). While increasing the batch size to 64 improved the model’s ability to
identify no or mild DR, it reduced its ability to detect moderate NPDR or
more severe DR by 10%, which is counterproductive to the aim of this model.
Figure 3: Graph showing the effect of different batch sizes on the performance
of the DeiT model. This figure shows the Test F1 scores for the no or mild
NPDR and moderate NPDR or more severe DR categories and overall cost of
the DeiT and ResNet-18 models. The DeiT model generally performs better
than the ResNet-18 model, with the F1 score being higher on average and the
precision being lower. It can also be seen that the optimal learning rate for both
models is 1E-04 as the F1 graphs peak while the cost function is low at that
point.
2.3 Epoch of 6 improved test F1 score by 7%
Epochs control the number of times the model iterates through all the training
fundus images, where a single iteration over all images in the training set is
one epoch. Training the model across multiple epochs often results in superior
performance as the model parameters are further improved from the training
examples. The intention of model training is for the model parameters to be
tuned to detect patterns in the training set which will be predictive of DR in
future fundus images. However, training across too many epochs will result in
overfitting, because the resulting model is over-adjusted for the training set and
identifies patterns which are specific to the peculiarities of the training set but
do not generalize to new data. Overfitting is detected by assessing the model
on a testing set which is not utilized for model training.
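The use of a held-out set to detect overfitting can be sketched as follows. All scores below are hypothetical values invented for illustration and are not results from this study; the helper name `best_epoch` is likewise an assumption.

```python
def best_epoch(test_scores: dict) -> int:
    """Return the epoch count whose held-out score is highest."""
    return max(test_scores, key=test_scores.get)


# Hypothetical scores for illustration only; not results from this study.
train_f1 = {2: 0.70, 4: 0.80, 6: 0.88, 8: 0.93, 10: 0.96}
test_f1 = {2: 0.68, 4: 0.75, 6: 0.79, 8: 0.76, 10: 0.72}

chosen = best_epoch(test_f1)
# A widening gap between training and testing scores signals overfitting.
gap = {epoch: round(train_f1[epoch] - test_f1[epoch], 2) for epoch in train_f1}
print(chosen, gap)
```

In this made-up example the training score keeps rising with more epochs while the testing score peaks and then falls, which is the pattern that indicates overfitting.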
An ideal number of epochs avoids both overfitting, caused by too many epochs,
and underfitting, caused by too few. To determine the optimal number of
epochs, the DeiT model was trained for 2, 4, 6, 8 and 10 epochs. There was no
significant relationship between the number of epochs and model performance,
but the model achieved its highest test F1 scores, 90% and 70% for
the two test classes, with 6 epochs (Fig. 4). Model performance decreased by
an average of 7% above and below 6 epochs.
Figure 4: Graph showing the effect of different epochs on the performance for
the DeiT model. This figure shows the Test F1 scores for no or mild NPDR
and moderate NPDR or more severe DR categories and overall cost of the DeiT
models. The optimal epoch was identified as 6 epochs as it showed the best F1
and cost results.
3 Discussion
In this study we have demonstrated that vision transformer models show
superior predictive performance in diagnosis of DR compared to classical CNNs,
with a 13% higher test F1 score. We identified the optimal values for the
DeiT model hyperparameters such as learning rate, batch size and number of epochs.
We analyzed the test F1 score for 19 machine learning models and found that
a learning rate of 1E-04 gave the best performance. Additionally, training for
6 epochs and using a larger batch size of 32 gave better model performance. The
DeiT model effectively classified images into both classes. A challenge that we
encountered was the imbalance in the number of images in each category in the
EyePACS dataset. There were over 22,000 images in the no or mild NPDR cat-
egory and approximately 5400 images within the moderate DR or more severe
DR category. Although the classes were weighted to reduce the effect of the
imbalance, there was still a significant difference in the ability of the models to
classify the images - the model performed 20% better at classifying images into
no DR or mild NPDR than the moderate or more severe cases. However, overall,
the DeiT model performed better than the ResNet-18 model, demonstrating the
promise of vision transformers in DR classification applications. DeiT models
can potentially reduce the need for manual screening.
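The study does not state how the class weights were computed. One common choice, shown here purely as an assumption, is inverse-frequency weighting, where each class is weighted by the total number of images divided by the number of classes times the class count. The counts below are the rounded figures quoted above, and the class names are hypothetical labels.

```python
def inverse_frequency_weights(counts: dict) -> dict:
    """Weight each class by total / (num_classes * class_count)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


# Rounded counts quoted in the text; the class names are hypothetical.
approx_counts = {"no_or_mild": 22_000, "moderate_or_worse": 5_400}
weights = inverse_frequency_weights(approx_counts)
print(weights)
```

Under this scheme the minority class receives the larger weight, so misclassifying a moderate or more severe case contributes more to the loss than misclassifying a no or mild case.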
Previous works largely focused on CNNs, with only a few papers on the
applications of ViTs to DR classification [WHX+ 21] [AAKJTT+ 21] [AKCS23].
Recent research has shown that transformers provide significant advantages
compared to CNNs, including providing contextual understanding and making
connections between disparate features in the input image [MDB23]. However,
there have not been direct comparisons between the performance of ViT and
CNN models for diagnosis of DR. In this study, we aim to bridge this gap by
doing a comprehensive comparison between a classical CNN and a recent ViT
model.
Expanding on the work in this paper, DeiT models can be applied to multi-
class DR classification tasks based on the severity of clinical symptoms. How-
ever, to accomplish this work with high accuracy and to improve the feature
extraction capabilities, a larger dataset can be used, with a more balanced num-
ber of images for each class. More preprocessing techniques can be explored to
arrive at better convergence of the loss function. The high imbalance in classes
can be further addressed by custom data augmentation techniques such as rota-
tion, adjusting brightness, contrast etc. of the images [AKCS23]. Additionally,
variable illuminations and saturations of the images are a barrier to accuracy of
predictions. Luminosity normalization is a pre-processing technique that could
be applied to the images to improve the problem of variable illuminations. To
deal with zero pixels, using the generic cropping functions may cause a loss of
image data and disturb fundus geometry. Using a custom cropping window of
variable lengths depending on the resolution of the images will preserve crucial
information which will help in convergence of the loss function [RPC+ 20].
4 Methods
The PyTorch package in the Visual Studio Code environment was used to run
the models. The experiment was run on fundoscopy images from the EyePACS
database using two deep learning networks: DeiT, which is a vision transformer,
and ResNet-18, which is a convolutional neural network.
Fundoscopic examination is a routine clinical examination of the retina using
an ophthalmoscope, and is used to detect numerous eye diseases including dia-
betic retinopathy. EyePACS is a telemedicine healthcare provider which offers
diabetic retinopathy screening solutions in the United States. The EyePACS
dataset includes 5 million fundoscopy images of a diverse population of healthy
patients and those with various stages of diabetic retinopathy [GPC+ 16]. 35,130
EyePACS color fundus images were utilized for the analysis, with 28,102 images
set for training and 7,028 for testing. We ensured that all pairs of eyes from a
single patient were kept together in the training and testing sets to ensure the
testing set is independent from the training set.
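A patient-level split of this kind can be sketched as below. The image identifiers, the `patient_of` mapping and the 20% test fraction are assumptions made for illustration (the fraction roughly matches 7,028 of 35,130 images); the study's actual splitting code is not described.

```python
import random


def split_by_patient(image_ids, patient_of, test_fraction=0.2, seed=0):
    """Split images so that all images of one patient land in the same subset."""
    patients = sorted({patient_of[i] for i in image_ids})
    random.Random(seed).shuffle(patients)
    n_test = round(len(patients) * test_fraction)
    test_patients = set(patients[:n_test])
    train = [i for i in image_ids if patient_of[i] not in test_patients]
    test = [i for i in image_ids if patient_of[i] in test_patients]
    return train, test


# Hypothetical identifiers: one left and one right image per patient.
patient_of = {f"{p}_{eye}": p for p in range(10) for eye in ("left", "right")}
train, test = split_by_patient(list(patient_of), patient_of)
print(len(train), len(test))
```

Splitting on patients rather than on individual images prevents the two eyes of one person from appearing on both sides of the split.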
While the demographic variables for the EyePACS dataset are not published,
an EyePACS dataset of 9963 images from 4997 patients used in [GPC+ 16] had a
mean patient age of 54.1 (±11.3) years, with 62.2% women. The EyePACS
fundus images are labelled in five classes as Normal (0), mild (1), moderate
(2), severe (3), and proliferative DR (4) as shown in Fig. 1. Each image was
rated by a clinician for the stage of DR present according to the International
Clinical Diabetic Retinopathy severity scale (0-4) [GPC+ 16]. For our model, we
divided these classes into 2 categories based on recommended screening strate-
gies [WSK+ 18] [SCD+ 17].
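The grouping of the five grades into two categories can be written as a simple threshold on the severity grade. The function and constant names below are hypothetical; the threshold follows the referral criterion described in the text (moderate NPDR or worse).

```python
REFERABLE_THRESHOLD = 2  # grade 2 = moderate NPDR


def to_binary_label(grade: int) -> int:
    """Map a 0-4 grade to 0 (no or mild NPDR) or 1 (moderate NPDR or worse)."""
    if grade not in range(5):
        raise ValueError(f"unexpected DR grade: {grade}")
    return int(grade >= REFERABLE_THRESHOLD)


print([to_binary_label(g) for g in range(5)])  # → [0, 0, 1, 1, 1]
```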
The performance of a deep learning model will be dependent on the quantity
and integrity of the dataset used for training the model [ZLLS21]. The EyePACS
dataset images are of various sizes and have pre-existing image noise and incon-
sistencies including the presence of artefacts, images being out of focus, under-
exposed or overexposed [TPM23]. The images were acquired by various cameras
supported by the EyePACS platform, with field views of 40◦ − 50◦ [KOM+ 20]
and have different resolutions [TPM23]. There was no standard orientation of
the images and they could be inverted as well, making it difficult to tell left
from right eye images.
To improve the accuracy and reduce error rates of deep learning networks, a
set of operations for pre-processing the images was required to train the model
[ZLLS21]. The images were pre-processed by first converting them to
tensors, then resizing them to a 224 × 224 resolution and finally using the
Random Horizontal Flip transformation to add more variance to the dataset.
Training deep learning models for high performance requires efficient
optimization of the hyperparameters of the model. First we trained both DeiT and
ResNet-18 models across learning rates from 1E-06 to 1E-03. Increasing the
number of epochs or the batch size can help ensure the model is adequately
trained, thereby converging towards the optimal solution and improving the
accuracy. The DeiT model performed consistently better than the ResNet-18
model over a range of learning rates; thus the DeiT model was chosen for
further experimentation. We varied the batch size of the DeiT model from 4 to
64 and the number of epochs from 2 to 10. The run time of the model varied, as
increasing the number of epochs and decreasing the learning rate made the
model take longer to train. The metrics, including F1, precision and recall,
were logged for both the training and testing data. We generated a summary of
the overall performance of the model under various hyperparameters to be
visualized as graphs. We ran 13 different trials, making changes to the stated
hyperparameters to identify which gave the best model performance. Every three
iterations, the model performance was evaluated using the testing dataset of
7,028 images.
filters is doubled, and down-sampling is performed with a stride of 2. The
network ends with an average pooling layer followed by a 1000-way fully
connected layer with softmax. The shortcut connections are introduced to each
pair of 3 × 3 filters as shown in Fig. 5 [HZRS15].
The standard ResNet residual block is called the Identity Block (Fig. 6).
It is replaced by the Convolutional Block when the input activation does not
have the same dimensions as the output; usually a 1 × 1 convolution with a
stride of 2 is applied on the shortcut path to match dimensions (Fig. 7) [HZRS15].
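In the notation of [HZRS15], the two block types can be summarized as follows, where $x$ and $y$ are the input and output of the block, $\mathcal{F}$ is the residual mapping learned by the stacked convolutional layers with weights $\{W_i\}$, and $W_s$ is the projection on the shortcut path (the 1 × 1 convolution of the Convolutional Block):

```latex
% Identity Block: input and output dimensions are equal
y = \mathcal{F}(x, \{W_i\}) + x

% Convolutional Block: the shortcut is projected to match dimensions
y = \mathcal{F}(x, \{W_i\}) + W_s x
```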
Figure 6: Identity Block - When input and output dimensions are equal, no
additional layer in skip connection path
Figure 7: Convolutional Block - When input and output dimensions are not
equal, 1 × 1 convolutional layer added in skip connection path
elements within sentences of text without regard to their distance in the se-
quence [VSP+ 17]. Vision transformers applied these concepts to image process-
ing and classification [DBK+ 20]. However, compared to classical CNNs, ViT
models depend on pre-training using large amounts of data. Data-efficient im-
age transformers aim to address the requirement for large amounts of training
data by using a knowledge distillation technique to train a modified transformer,
built on the ViT architecture proposed by [DBK+ 20], which transfers knowledge
from a larger CNN to a smaller model. This reduces the training requirement
to about 3 days and also requires less infrastructure [TCD+ 20].
DeiTs are image transformers that introduce a novel distillation procedure
built upon the transformer block proposed by [DBK+ 20] and are trained on the
ImageNet dataset only [TCD+ 20]. This contrasts with ViT, which needs
pre-training on hundreds of millions of curated images from many datasets
to show high performance [DBK+ 20]. DeiTs are effective as they require a
lower volume of data and a smaller memory footprint for a given accuracy
[AKCS23], making them a less computationally expensive architecture.
The knowledge distillation procedure using attention is the central principle
of DeiT, where knowledge is transferred from the ‘teacher’ model to
the ‘student’ model [TCD+ 20]. The ‘teacher’ is RegNetY-16GF, a CNN
pre-trained on ImageNet. The student is a modified ViT architecture where the
output of the ‘teacher’ serves as a training target for the ‘student’.
In DeiT, a new distillation procedure is introduced where the teacher’s hard
decision is taken as a true label. DeiT decomposes each RGB image into a
series of N patches of 16 × 16 pixels each and projects each patch with a
linear layer into a 16 × 16 × 3 = 768 dimensional representation. A new
distillation token is included which interacts with the class and patch tokens
through the stack of transformer encoder layers (Fig. 8). The encoder layers
contain Multi-head Self Attention (MSA) and Feed Forward Network (FFN) modules
[TCD+ 20] [DBK+ 20]. The hard decision is defined below.
Let $y_t = \mathrm{argmax}_c Z_t(c)$ be the hard decision of the teacher. The task of the
distillation token is to reproduce the hard decision $y_t$ predicted by the ‘teacher’
and the class token has to reproduce the true label $y$. DeiT’s loss function is
given by:
$$\mathcal{L}_{\mathrm{global}}^{\mathrm{hardDistill}} = \frac{1}{2}\mathcal{L}_{CE}(\psi(Z_s), y) + \frac{1}{2}\mathcal{L}_{CE}(\psi(Z_s), y_t)$$
$Z_s$ and $Z_t$ are the logits of the student and teacher models, $\psi$ is the
softmax function, and $\mathcal{L}_{CE}$ is the cross-entropy loss. The distillation token and the
class token learn by back propagation, and the distillation allows the model to
learn from the teacher output [TCD+ 20]. This process is more efficient and
requires less computational power than other vision transformers.
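The hard-distillation loss can be sketched in plain Python as follows. This is an illustrative reading of the formula as written, with a single student logit vector used in both terms; the function names are hypothetical and this is not the training code used in the study or in [TCD+ 20].

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy of a probability vector against an integer class label."""
    return -math.log(probs[label])


def hard_distillation_loss(student_logits, teacher_logits, true_label):
    """0.5 * CE(softmax(Z_s), y) + 0.5 * CE(softmax(Z_s), y_t)."""
    # Hard decision of the teacher: index of its largest logit.
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    probs = softmax(student_logits)
    return 0.5 * cross_entropy(probs, true_label) + 0.5 * cross_entropy(probs, teacher_label)


# Hypothetical two-class logits, for illustration only.
agree = hard_distillation_loss([2.0, 0.0], [3.0, 1.0], true_label=0)
disagree = hard_distillation_loss([2.0, 0.0], [0.5, 1.5], true_label=0)
print(agree, disagree)
```

When the teacher's hard decision matches the true label, the loss reduces to the ordinary cross-entropy; when they differ, the student is pulled towards both labels with equal weight.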
The training and testing of the chosen DeiT model show its efficient
performance in classifying diabetic retinopathy fundus images. The model
therefore holds promise for application in telemedical or national screening
programs to screen for severity of DR.
Figure 8: DeiT distillation procedure - The distillation token interacts with
the class and patch token through the transformer encoders. The encoders in
DeiT consist of repeated layers of self-attention and feed-forward network (FFN)
blocks. The objective of the distillation token is to reproduce the teacher’s
prediction instead of the true label. The distillation and class tokens learn by
back propagation.
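The token layout processed by the encoders can be worked out with a few lines of arithmetic. The sketch below assumes a square 224 × 224 RGB input and 16 × 16 patches; the function name is hypothetical.

```python
def deit_token_layout(image_size: int = 224, patch_size: int = 16, channels: int = 3):
    """Return (patch tokens, total tokens, values per flattened patch)."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be a multiple of the patch size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    total_tokens = num_patches + 2  # plus the class token and the distillation token
    patch_dim = patch_size * patch_size * channels
    return num_patches, total_tokens, patch_dim


print(deit_token_layout())  # → (196, 198, 768)
```

Under these assumptions each image yields N = 196 patch tokens, and the class and distillation tokens bring the sequence length to 198.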
5 Conclusion
The incidence of diabetes in the global population has reached 10%. Patients
with diabetes are at high risk for diabetic retinopathy and should be tested
for DR every one or two years according to standard clinical guidelines. The
lack of rapid and cost-effective methods for diagnosis of DR is a major limiting
factor for providing appropriate patient care [WWC+ 22]. Our study has shown
that recent vision transformer methods have superior performance to a CNN
model for the classification of DR. We have identified optimal model training
parameters for the DeiT architecture. Our work demonstrates the ability of
ViTs to improve the accuracy of automated DR classification.
References
[AAKJTT+ 21] Nouar AlDahoul, Hezerul Abdul Karim, Myles Joshua
Toledo Tan, Mhd Adel Momo, and Jamie Ledesma Fermin.
Encoding retina image to words using ensemble of vision
transformers for diabetic retinopathy grading. F1000Research, 10,
September 2021.
[BR18] Mihalj Bakator and Dragica Radosav. Deep learning and med-
ical diagnosis: A review of literature. Multimodal Technologies
and Interaction, 2(3):47, August 2018.
[GL17] Rishab Gargeya and Theodore Leng. Automated identification
of diabetic retinopathy using deep learning. Ophthalmology,
124(7):962–969, July 2017.
[GPC+ 16] Varun Gulshan, Lily Peng, Marc Coram, Martin C Stumpe,
Derek Wu, Arunachalam Narayanaswamy, Subhashini Venu-
gopalan, Kasumi Widner, Tom Madams, Jorge Cuadros, Ra-
masamy Kim, Rajiv Raman, Philip C Nelson, Jessica L Mega,
and Dale R Webster. Development and validation of a deep
learning algorithm for detection of diabetic retinopathy in reti-
nal fundus photographs. JAMA, 316(22), December 2016.
[HGL+ 23] Kelei He, Chen Gan, Zhuoyuan Li, Islem Rekik, Zihao Yin, Wen
Ji, Yang Gao, Qian Wang, Junfeng Zhang, and Dinggang Shen.
Transformers in medical image analysis. Intelligent Medicine,
3(1):59–78, February 2023.
[HJS22] Abid Haleem, Mohd Javaid, and Ravi Pratap Singh. An era of
ChatGPT as a significant futuristic support tool: A study on
features, abilities, and challenges. BenchCouncil Transactions
on Benchmarks, Standards and Evaluations, 2(4), October 2022.
[HSCA16] Mark B Horton, Paolo S Silva, Jerry D Cavallerano, and
Lloyd Paul Aiello. Clinical components of telemedicine pro-
grams for diabetic retinopathy. Current Diabetes Reports,
16(12):129, December 2016.
[HST+ 22] Steven A Hicks, Inga Strümke, Vajira Thambawita, Malek
Hammou, Michael A Riegler, Pål Halvorsen, and Sravanthi
Parasa. On evaluation metrics for medical applications of arti-
ficial intelligence. Scientific Reports, 12(1), April 2022.
[HZRS15] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
Deep residual learning for image recognition. arXiv, 2015.
[HZRS16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
Identity mappings in deep residual networks. arXiv, March
2016.
[IEE+ 23] Saidul Islam, Hanae Elmekki, Ahmed Elsebai, Jamal Bentahar,
Najat Drawel, Gaith Rjoub, and Witold Pedrycz. A compre-
hensive survey on applications of transformers for deep learning
tasks. arXiv, June 2023.
[MDB23] José Maurício, Inês Domingues, and Jorge Bernardino.
Comparing vision transformers and convolutional neural networks
for image classification: A literature review. Applied Sciences,
13(9), April 2023.
[RPC+ 20] Hamza Riaz, Jisu Park, Hojong Choi, Hyunchul Kim, and Jung-
suk Kim. Deep and densely connected networks for classifica-
tion of diabetic retinopathy. Diagnostics (Basel), 10(1), January
2020.
[SKZ+ 23] Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muham-
mad Haris Khan, Munawar Hayat, Fahad Shahbaz Khan, and
Huazhu Fu. Transformers in medical imaging: A survey. Medical
Image Analysis, 88, August 2023.
[TPM23] Maria Tariq, Vasile Palade, and Yingliang Ma. Transfer learn-
ing based classification of diabetic retinopathy on the kaggle
EyePACS dataset, 2023.
[TTY+ 21] Zhen Ling Teo, Yih-Chung Tham, Marco Yu, Miao Li Chee,
Tyler Hyungtaek Rim, Ning Cheung, Mukharram M Bikbov,
Ya Xing Wang, Yating Tang, Yi Lu, Ian Y Wong, Daniel
Shu Wei Ting, Gavin Siew Wei Tan, Jost B Jonas, Charu-
mathi Sabanayagam, Tien Yin Wong, and Ching-Yu Cheng.
Global prevalence of diabetic retinopathy and projection of bur-
den through 2045: Systematic review and meta-analysis. Oph-
thalmology, 128(11), November 2021.
[TW22] Tien-En Tan and Tien Yin Wong. Diabetic retinopathy: Look-
ing forward to 2030. Frontiers in Endocrinology, 13:1077669,
2022.
digital retinal images: a tool for diabetic retinopathy screening.
Diabet. Med., 21(1):84–90, January 2004.
[VSP+ 17] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit,
Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin.
Attention is all you need. arXiv, June 2017.
[WHX+ 21] Jianfang Wu, Ruo Hu, Zhenghong Xiao, Jiaxu Chen, and
Jingwei Liu. Vision transformer-based recognition of diabetic
retinopathy grade. Medical Physics, 48(12), December 2021.
[WLZ18] Shaohua Wan, Yan Liang, and Yin Zhang. Deep convolutional
neural networks for diabetic retinopathy detection by image
classification. Computers & Electrical Engineering, 72:274–282,
November 2018.
[WSK+ 18] Tien Y Wong, Jennifer Sun, Ryo Kawasaki, Paisan Ruamvi-
boonsuk, Neeru Gupta, Van Charles Lansingh, Mauricio Maia,
Wanjiku Mathenge, Sunil Moreker, Mahi M K Muqit, Serge
Resnikoff, Juan Verdaguer, Peiquan Zhao, Frederick Ferris,
Lloyd P Aiello, and Hugh R Taylor. Guidelines on diabetic eye
care: The international council of ophthalmology recommenda-
tions for screening, follow-up, referral, and treatment based on
resource settings. Ophthalmology, 125(10), October 2018.
[WWC+ 22] Andrew M Williams, Jared M Weed, Patrick W Commiskey,
Gagan Kalra, and Evan L Waxman. Prevalence of diabetic
retinopathy and self-reported barriers to eye care among pa-
tients with diabetes in the emergency department: the diabetic
retinopathy screening in the emergency department (DRS-ED)
study. BMC Ophthalmology, 22(1):237, May 2022.
[YRK+ 12] Joanne W Y Yau, Sophie L Rogers, Ryo Kawasaki, Ecosse L
Lamoureux, Jonathan W Kowalski, Toke Bek, Shih-Jen Chen,
Jacqueline M Dekker, Astrid Fletcher, Jakob Grauslund, Steven
Haffner, Richard F Hamman, M Kamran Ikram, Takamasa
Kayama, Barbara E K Klein, Ronald Klein, Sannapaneni Kr-
ishnaiah, Korapat Mayurasakorn, Joseph P O’Hare, Trevor J
Orchard, Massimo Porta, Mohan Rema, Monique S Roy, Tarun
Sharma, Jonathan Shaw, Hugh Taylor, James M Tielsch, Ro-
hit Varma, Jie Jin Wang, Ningli Wang, Sheila West, Liang
Xu, Miho Yasuda, Xinzhi Zhang, Paul Mitchell, Tien Y Wong,
and Meta-Analysis for Eye Disease (META-EYE) Study Group.
Global prevalence and major risk factors of diabetic retinopathy.
Diabetes Care, 35(3):556–564, March 2012.
[YXK17] Shuang Yu, Di Xiao, and Yogesan Kanagasingam. Exudate de-
tection for diabetic retinopathy with convolutional neural net-
works. 2017 39th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society (EMBC), pages
1744–1747, 2017.
Using Behavioral Economics Insights to
Determine the Likely Causes of the High Rate of
Unemployment in Refugee Camps and What Can
Be Done to Alleviate It
∗
Baraka Muhoza
October 17, 2023
Abstract
Leaving one country and relocating to another because of wars, con-
flict, and natural disasters has an impact on many different areas, includ-
ing the labor market. As a result, despite the difficulties, people strive
to adjust to their new surroundings. This study focuses on the high un-
employment rate in refugee camps, which has a wide-ranging influence on
refugees. It applies behavioral economics to investigate the likely causes of
this problem and propose various solutions that can assist in mitigating it.
In this work, we look at the role of biases in the unemployment crisis, such
as the status quo bias, anchoring bias, conformity bias, and implicit dis-
crimination, all of which are underutilized in refugee camps. For example,
refugees choose to rely on donations as a default rather than examining
other choices and have contributed to the problem’s rise. However, taking
these biases and heuristics into account, as well as applying behavioral
economics insights to the design of prospective solutions, would help to
reduce the unemployment problem.
1 Introduction
According to Emanuel Cleaver, “Hope is the motivation that empowers the
unemployed enabling them to get out of bed every single morning with unbounded
enthusiasm as they look.” [Cle] Unemployment is one of the economic concerns
that various countries are seeking to address. It occurs when employees who
want to work are unable to find work, which has a detrimental influence on the
nation’s economy. When determining where to work or whether to look for work,
people utilize various approaches and make various decisions, demonstrating the
numerous heuristics and biases used in these situations. When the country has
∗ Advised by: Dr. Edoardo Gallo, University of Cambridge
a high percentage of unemployment, it has a variety of repercussions, includ-
ing a slowdown in economic activity and a fall in economic production, which
promote dependence on government spending and influence how people make
decisions. According to United Nations High Commissioner for Refugees (UN-
HCR) estimates, the number of displaced people has topped 60 million for the
first time in history [FRU16], disrupting labor markets in both host cities and
refugee camps. According to NISR and a survey conducted [MS17], the per-
centage of unemployed refugees living in Kiziba, Gihembe, and Kigeme refugee
camps was 51.15
As a result of the effects of unemployment in refugee camps described above,
several policies have been implemented to counteract the high level of unem-
ployment in the camp and the nation. First, UNHCR has provided various
alternatives, such as loans to refugees to assist them in gaining funds to es-
tablish their enterprises, and they have also enhanced education by providing
greater support in helping refugees seek higher education and find jobs. The
extent to which society reflects behavioral economics notions such as status
quo bias, conformity bias, and anchoring bias impacts whether unemployment
rises or diminishes. Using behavioral economics knowledge and skills will be a
significant step in resolving this issue since insights from behavioral economics
allow deviations from standard economic assumptions and have implications for
policy design.
Behavioral economics integrates elements of economics and psychology to un-
derstand how and why individuals behave the way they do and is concerned with
the relationship between economic agents’ rationality [MT00]. Inconsistency in
decision-making is caused by cognitive constraints, and as mentioned in
Kahneman’s Maps of Bounded Rationality: Psychology for Behavioral Economics,
“heuristics are cognitive shortcuts that the human brain develops to cope with
complex problems without calculation to make decisions easier” [Dan03].
Behavioral economics is vital in refugee camps because people there face
problems that require decision-making. Work is available in the camps, but
many people are unwilling to look for it and still wish to rely on UNHCR
funding, which creates bias and systemic problems. Furthermore, behavioral
economics is crucial in this case since understanding the heuristics and
biases at play can lead to better decision-making, which benefits both the
community and the labor market.
The bodies in charge of refugees (UNHCR) and other partners (CARITAS,
INKOMOKO) working in the camps have faced several challenges while attempting
to find solutions to the high rate of unemployment. First, they have not
determined why people continue to rely on donations and why few people take
loans even though loans are available. This shows that status quo bias and
anchoring bias have not been taken into account, which leads to policies that
do not counteract the effects of these biases. Therefore, using behavioral
economics in the labor market can lead not only to a change in job policy that
benefits the entire community by lowering unemployment, but also to better
decision-making that aids the development of self-reliance and the economy in
general.
This paper examines the following heuristics and biases: status quo bias,
anchoring bias, conformity bias, and implicit discrimination, all of which
contribute to the high unemployment rate among refugees. Status quo bias is
important in this setting because people are confronted with a variety of
options yet tend to stick with the defaults. In the refugee camp, there are
many alternatives to relying on donations, for instance, seeking work in the
camp or starting a business. People nevertheless continue to rely on
donations, which affects outcomes and increases dependency. Additionally,
anchoring bias occurs when people rely on the first piece of information they
receive, which affects the outcomes. In refugee camps, people are anchored by
their refugee status, which limits job searches and has an impact on exam
results when they are chosen, resulting in an increase in unemployment.
Moreover, conformity bias occurs when people conform to what others are
doing, which affects personal decisions and outcomes. In the refugee camp,
there are many things to do, such as starting a business, but people conform
to what others are doing, such as relying on donations, which affects both
their job search and their willingness to start a business. Lastly, implicit
discrimination matters in the refugee camp, where there are many
opportunities but people cannot access them due to discrimination. Refugees
are discriminated against for being refugees and are considered low-skilled
labor, which demotivates them from searching and applying for jobs and leads
to increases in unemployment, as discussed in Section 6. This paper also
recommends actions that should be taken to address this economic issue
effectively while minimizing the effects of biases and heuristics on the local
community.
This essay is structured as follows. Section 2 provides background
information on Congolese refugees in Rwanda and their labor-market situation.
Following that, Section 3 will explain how status quo bias affects
unemployment in refugee camps, Section 4 will explain how anchoring bias
affects unemployment in refugee camps, Section 5 will explain how conformity
bias affects unemployment in refugee camps, and Section 6 will explain how
implicit discrimination affects unemployment in refugee camps. Section 7 will
then present solutions to the biases discussed previously. Finally, Section 8
concludes with a variety of policies that have been put in place, as well as
policy recommendations that policymakers can use to minimize or eliminate the
effects of biases and heuristics in producing unemployment in refugee camps
and to implement practical methods to counteract it.
establishing and earning a living in the destination country is a challenging
process for many refugees [Yak08]. They abandon homes where they could
meet their fundamental needs. As of 1 September 2016, UNHCR’s Rwanda office
was assisting almost 75,000 Congolese refugees [FRU16], which shows a dramatic
increase in the number of refugees. After arriving, they begin to adjust to
the new situation of relying on UNHCR aid to meet their basic needs. In
addition to losing their homes, they encounter several problems, including
overcoming social and economic challenges and trauma, finding work, and
managing careers after leaving their home country [CPT06].
People try numerous ways to battle unemployment during these difficult
economic times, but their efforts are often unsuccessful because insights
from behavioral economics are not applied. For example, instead of applying
for loans that could provide the capital they need to start their own
business, which may be one of the best ways to combat unemployment, they may
see their refugee status as an anchor that prevents them from doing so. This
shows the effect of anchoring bias on refugee unemployment.
Furthermore, rather than seeking work or coming up with new ideas, fresh
graduates often follow in the footsteps of many unemployed individuals by
relying only on donations [HBL92], which inhibits job searching and
independent decisions and diminishes the likelihood of being hired. While all
of these factors contribute to unemployment in refugee camps, applying
diverse behavioral economics insights would result in a drop in unemployment
in our community.
while only 5.7 percent expressed a preference for the low-reliability option cur-
rently being experienced by the other group, though it came with a 30 percent
reduction in rates. The low-reliability group, however, quite liked their status
quo, 58.3 percent of them ranking it first. Only 5.8 percent of this group selected
the high-reliability option at a proposed 30 percent increase in rates [KT91].
The status quo has been a recurring issue throughout history, and it has
had an impact on both economic texts and daily life. The term “status quo” is
widely used when people choose to do nothing or to continue with their previous
decision [SZ88]. This means that when given multiple options, people often end
up choosing nothing and remaining with the defaults, even when they are
advised to make better decisions and to try out different options to pursue
change more effectively. Because each person in the camp has a variety of
options, they choose the ones that best suit them. However, decisions made by
locals differ from those made by people in refugee camps. For example, people
in the camps receive monthly donations from UNHCR to meet their basic needs,
but different partners working around the camp also offer refugees several
opportunities, mostly volunteering, which can provide a small income. In this
case, people opt to rely heavily on donations, demonstrating a bias in their
decision-making in favor of the status quo that denies them access to
alternative opportunities, such as working with a partner. A strong tendency
to rely on donations, in turn, affects the entire community as well as the
next generation because it develops dependency and discourages people from
looking for work or carrying out ideas they have.
Furthermore, perspectives from behavioral economics and psychology can help
people make better decisions and overcome status quo bias. In the camps,
status quo bias discourages people from accepting or pursuing new
employment. The donation serves as the status quo here, where a donation is
the monthly amount of money that each individual receives to help meet
necessities [HBL92]. It is no surprise that many refugees are dependent on
humanitarian aid for everyday survival [Hov11]. People have identified a
relationship between this donation and unemployment. Because the donation
serves as their income [HBL92], an increase in donations leads to an increase
in unemployment: when people receive large sums of money, they can meet all of
their basic needs and also cover other expenses, which discourages them from
looking for work or doing their jobs. Because income from donations deters
them from looking for a job, doing a job, or exploring other options that
could earn them more money, relying on it becomes the status quo. However,
they are also influenced by a variety of factors, including the labor market’s
complexity, the number of available options, and how well working conditions
match workers’ preferences and knowledge. People have limited computing power
when presented with a large number of options [LBI98]. This is not the case in
the refugee camp, where opportunities are limited relative to the number of
individuals who want them, discouraging some
applicants. Nonetheless, instead of waiting for a few opportunities to present
themselves, they could create them by starting their own business and seeking
other assistance.
There are numerous strategies to reduce bias induced by complexity and
misapplication of knowledge, including providing people with fuller
information. As previously said, few options are available in the camp and
little information is offered, so finding opportunities requires cognitive
effort, which increases the likelihood that individuals stick with the
defaults. Furthermore, the low level of education in the camp leads to a
limited understanding of the labor market, of application processes, and of
how different companies operate, which prevents refugees from being hired or
seeking better jobs.
A better-designed solution to aid employment services would be improved use
of technology in the camp. A lot of information is available online, such as
job openings and job requirements, but refugees do not access it, so they rely
solely on the information presented in the camp, which affects their job
search. Additionally, employers should be more succinct when presenting
information to job seekers, which could also help. Therefore, advancing
technology and improving the way information is presented in the camp may help
lessen status quo bias, widening access to jobs and thereby reducing
unemployment.
4 Anchoring bias
The anchoring effect was defined by Tversky and Kahneman (1974) as the
disproportionate influence on decision-makers to produce judgments biased
toward an initially supplied value. Tversky and Kahneman demonstrated this in
a study in which participants were asked to estimate the percentage of African
countries in the United Nations after being shown a random number between 0
and 100 obtained by spinning a wheel of fortune. Before making the absolute
judgment, participants were asked whether the actual answer was higher or
lower than this reference value (comparative judgment) [KT79]. People
frequently form estimates by starting from the first information they
acquire, which serves as the initial reference, to arrive at the final
answer. This also occurs in refugee camps, where a variety of information
about acquiring jobs and loans spreads throughout the group, but people rely
on the first information they receive to make the ultimate decision.
The anchor might be qualitative or quantitative. In the qualitative case, the
anchor is people’s view of a refugee as someone who is underprivileged or who
would experience discrimination. This affects their decisions, since it
deters them from looking for work or from taking an exam when the application
gives them the option of selecting refugee or another status. As a result,
one approach to change things is to set a default of not being identified as
a refugee. This could encourage them and, as a result, help them compete in
the job market, thus leading to a drop in unemployment.
The anchor could also be quantitative, as people like to base their
reservation wage on the pay of locals, using the salary of locals as a
reference point. For example, if a local teacher earns 130 per month, this
could be a job seeker’s initial value while looking for work abroad. An anchor
arises when two people consider doing the same work but earning different
wages. As a result, they anchor on different values, and because anchoring
bias has such a large impact on decision-making, fewer people take up the
available occupations or participate in the labor force.
Notably, because this anchor influences people’s decisions, one possible
solution for overcoming this bias could be the establishment of a new anchor,
which may entail collaboration among government, citizens, and
non-governmental organizations. As previously said, the anchor could be
qualitative or quantitative, depending on how it influences the community.
First, these actors could disseminate information or launch a campaign to
nudge people to make decisions based on the new reference point. Concerning
the anchor of a career, UNHCR and other partners could encourage refugees to
start their own businesses by sharing success stories of those who pursued
unrelated careers and by assisting them in obtaining loans that will allow
them to put their business ideas into action and thus move away from various
anchors.
5 Conformity bias
Conformity bias is the tendency to change one’s thoughts or behavior to fit
in with others [Nik23]. One study looked at how this played out in the
frenzied buying wave of culturally related goods in Korea, where brands like
Canada Goose and The North Face became essential commodities that were
consumed en masse. This appeared unreasonable because the goods were costly
and imposed a financial strain on some consumers. However, a person who does
not have one may face discrimination from those around them. Following a
study of Korean customers, researchers discovered that the desire for these
brands was not exclusively due to a liking for the brand or the culture that
these enterprises reflected. It was prompted by a severe “fear of missing
out” (FOMO). The desire to belong to the mainstream group (or the fear of
being excluded from the mainstream group) was a crucial factor in the
consumption of these very popular brands [KS19].
Accordingly, there are various reasons why people conform. First, there is
informational conformity, which occurs when we seek guidance and knowledge
from a group, such as a class. Additionally, there is normative conformity,
where individuals conform to align with the public [Nik23]. This applies in
refugee camps, as demonstrated by UNHCR: a survey found that when people were
considering taking out loans, many of them held off because they had heard
that doing so might cause them to lose their refugee status or interfere with
other services that they were receiving. This deters individuals who want to
take loans to implement business ideas that could affect the labor market
positively, and so leads to an increase in unemployment.
This conformity is linked to unemployment in several ways. UNHCR has
strengthened education in the camp, so each year the number of graduates
increases, but the number of jobs available does not. Creating work requires
only innovation, creativity, and putting into practice what they have
studied, but because youth want to conform, they are drawn to what others are
doing, which discourages job searching and increases unemployment. In
addition, because there is a high rate of unemployment in the camps, young
graduates are discouraged from looking for work or starting their own
businesses because they want to behave like their peers, which contributes to
a further increase in unemployment. There are several ways to combat this
conformity bias, including education and training that equip people with the
different skills needed in the labor market, as well as challenging them,
which could lead to an increase in self-awareness and thus an improvement in
economic outcomes, as people will make decisions based on what affects them.
Furthermore, developing independence in decision-making would encourage people
to start their own businesses, resulting in a reduction in unemployment in the
refugee camp and an increase in self-reliance.
Furthermore, insights from behavioral economics could be used to combat these
biases by changing the way information is presented to large groups of
people, since this affects their decision-making. Teaching people and
collaborating with different institutions could result in a better outcome
because people will pass reliable information on to one another, which can
reduce conformity.
6 Implicit discrimination
People may hold skewed opinions and unconsciously discriminate for a variety
of reasons, for example, the characteristics of a specific group, status, and
productivity. Implicit discrimination can be quantified with the implicit
association test, which relies on the test taker’s speed of reaction when
linking names, words, and images to reflect the strength of unconscious
mental associations [GS98].
Implicit discrimination has an impact on the issue of unemployment in
refugee camps. To begin with, there is a low possibility that refugees will be
hired outside of the camps due to a lack of Rwandan identification, which
affects the job market and discourages individuals who wish to apply for jobs
advertised there. Second, because the level of education in a refugee camp is
low and refugees are unable to acquire a university-level education, employers
regard them as low-skilled labor, which prevents employers from hiring them
and increases unemployment.
Implicit discrimination is a significant driver of discriminatory behavior in
the job market. There is significant racial inequality in the host country for
refugee camps: refugees are unlikely to be hired because they lack the
identification card required of citizens, and this discrimination demotivates
them and discourages their job search. Refugee employees struggle to obtain
work and perform poorly in the labor market as a result of implicit bias.
Implicit discrimination arises from ambiguity caused by faulty information or
by how an employer perceives an individual, as well as from the expectation
that immigrant labor is low-skilled, has a poor work ethic, and has low
productivity. According to Bertrand and Duflo [BD16], when individual
information is scarce, group membership might provide useful information
about predicted productivity.
To address this issue, various criteria should be considered in the hiring
process. To begin, technology could be used in recruiting systems:
candidates’ addresses and qualifications would be entered into a computer,
which would then determine whom to offer a job based on abilities and
capability, which might lead to a reduction in discrimination.
Finally, there is unfairness in the financial sector [KT86], since there is a
low likelihood that a refugee will be able to receive loans, which reduces the
availability of cash to build their own firm and so raises the level of
unemployment. Using behavioral economics insights can lead to better handling
of discrimination in labor economics, which can lead to improved economic
outcomes. When numerous opportunities are available but people are unable to
take advantage of them due to bias in the labor market, implicit
discrimination affects results. Advocacy and increased awareness may aid in
the elimination of certain biases.
value of a job search, but this job assistance is worthwhile. Evidence suggests
that people are weak at recognizing whether a search is effective, or that they
undervalue the value of a search [Spi10].
Because it informs how individuals understand the possibilities before
entering the job market, job search assistance is one of the better ways to
overcome biases such as status quo bias and anchoring, which are obstacles to
employment. In addition, officials can innovate in how information is
delivered to citizens. A behavioral obstacle to job search and employment is
that individuals may have biased salary expectations, which can be debiased by
carefully designed interventions [BI97].
Thus, incorporating insights from behavioral economics is important because
it suggests that the way job options are framed might affect how individuals
respond to the choices.
7.3 Framing
Framing is defined as the underlying or combined effect of both status quo
bias and anchoring bias on how individuals see information. Framing shifts a
decision either positively or negatively, and multiple options can be framed
in different ways. This also applies in refugee camps. As mentioned earlier,
people frame loans differently, which causes them to react differently and
affects the implementation of their business ideas. The same holds for refugee
status: people do not apply for jobs because they know there is a lower
probability that a refugee will be hired, which can be considered a loss. On
the other hand, it may be presented as a benefit if, in a given situation,
employers offer refugees extra opportunities for employment while also
encouraging them to pursue employment, in which case it would be a gain
[LBM12]. Framing influences choices by working within the constraints of
biases and behavioral habits rather than overcoming them. One area of study
reveals how job search assistance is structured. For example, framing
consequences as losses rather than gains has been found to alter behavior in
a variety of circumstances [Rot06]. As previously said, framing influences an
individual’s proclivity to take risks, such as applying for loans, starting a
new career, or doing unrelated work.
Several approaches may be taken to address this, including modifying the way
information is presented and offering employment counseling services in which
salary information might drive people to apply or seek work elsewhere, which
can lead to a drop in unemployment. Furthermore, the context and language in
which information is presented matter, because framing can reverse choice
preferences. Information presented in the right place and in a better way can
move people far from their anchor. For instance, if remaining unemployed is
framed as a loss, or if the donations they receive are cut down, individuals
could be driven to take up new occupations, either by starting their own
business or by looking for a job, and thus develop a positive attitude
toward job opportunities while also promoting job search.
refugees in different countries to share opportunities and experiences, moving
refugees far away from the anchor that refugees cannot access loans. By
networking, refugees will be able to lend money among themselves, easing the
implementation of their business ideas, and as a result, many businesses will
be opened.
8 Conclusion
A review of insights from behavioral economics and labor market policies
suggests various policies that can be used to combat unemployment in refugee
camps. It brought out solutions such as job search assistance, job training,
and framing, which could help in overcoming the biases that contribute to the
high unemployment rate: implicit discrimination, anchoring, conformity, and
status quo bias. Several frameworks have been put in place to reform existing
policies for reducing unemployment and to establish new ones. This study
collected and applied experimental investigations and concepts from previous
literature and practical research. They all demonstrate how heuristics
influence decision-making as well as economic outcomes. The proposed reforms
and adjustments in job training, job search assistance, and framing are
limited, and there may be other behavioral approaches that can be used to
improve employment and the labor market in refugee camps. My future research
will focus on evaluating existing policies and new solutions to unemployment.
References
[Ami06] Baruti Amisi. An exploration of the livelihood strategies of Durban
Congolese refugees. Geneva: UNHCR, 2006.
[BD16] Marianne Bertrand and Esther Duflo. Field experiments on
discrimination. Handbook of Economic Field Experiments 1, 2016.
[FRU16] United Nations High Commissioner for Refugees (UNHCR). Global
forced displacement hits record high. 2016.
[Gra05] Katarzyna Grabska. The analysis of the livelihood strategies of
Sudanese refugees with closed files in Egypt. Cairo, Egypt: American
University in Cairo, 2005.
[Gra06] Katarzyna Grabska. Marginalization in urban spaces of the global
south: Urban refugees in Cairo. Journal of Refugee Studies 19, 2006.
[GS98] Anthony G. Greenwald, Debbie E. McGhee, and Jordan L. K. Schwartz.
Measuring individual differences in implicit cognition: The implicit
association test. Journal of Personality and Social Psychology 74.6,
1998.
[HBL92] Barbara Harrell-Bond, Eftihia Voutira, and Mark Leopold. Counting
the refugees: Gifts, givers, patrons, and clients. Journal of Refugee
Studies, 1992.
[HD18] Sameena Hameed, Asad Sadiq, and Amad U. Din. The increased
vulnerability of refugee population to mental health disorders. Kansas
Journal of Medicine 11.1, 2018.
[Hov11] Hovil. The dilemmas of Congolese refugees in Rwanda. Citizenship and
displacement in the Great Lakes region. International Refugee Rights
Initiative, 2011.
[IL00] Sheena S. Iyengar and Mark R. Lepper. When choice is demotivating:
Can one desire too much of a good thing? Journal of Personality and
Social Psychology, 2000.
[KS19] Inwon Kang, Haixin Cui, and Jeyoung Son. Conformity consumption
behavior and FOMO. Sustainability 11.17, 2019.
[KT79] Daniel Kahneman and Amos Tversky. Prospect theory: An analysis
of decision under risk. Econometrica, 1979.
[KT86] Daniel Kahneman, Jack L. Knetsch, and Richard Thaler. Fairness
as a constraint on profit seeking: Entitlements in the market. The
American Economic Review, 1986.
[KT91] Daniel Kahneman, Jack L. Knetsch, and Richard H. Thaler. Anomalies:
The endowment effect, loss aversion, and status quo bias. Journal
of Economic Perspectives 5.1, 1991.
[LBI98] Linda Babcock, George Loewenstein, and Samuel Issacharoff. Creating
convergence: Debiasing biased litigants. Law and Social Inquiry, 1998.
[LBM12] Linda Babcock, William J. Congdon, Lawrence F. Katz, and Sendhil
Mullainathan. Notes on behavioral economics and labor market policy.
IZA Journal of Labor Policy, 2012.
[MS17] Katrin Marchand, Craig Loschmann, and Melissa Siegel. Forced
migration and labor market outcomes: The case of Congolese refugees
in Rwanda, 2017.
Are Champions Born Or Made?
∗
Yashvendra Singh
October 18, 2023
Abstract
Champions hold the world’s attention and their performances both
inspire and generate curiosity. Whether they are born champions or are
the product of scientific training mechanisms and tremendous hard work
is a debate that rages on with every convincing victory that throws up
a seemingly invincible winner. Sporting history is replete with examples
where the sporting fraternity was prompted to research the characteristics
and traits that marked such invincibility. Some studies suggested that the
complete domination of Kenyan and Ethiopian runners in the middle- and
long-distance events and Usain Bolt’s phenomenal success could be attributed
to physiological traits such as higher haemoglobin and muscle fibre types
suited to endurance running or speed. Many believed that Michael Phelps’s
wider wingspan and unique genetic disposition to produce less lactic acid gave
him an unfair advantage over his competition. There are many such examples
that keep bringing back the question: are champions made or born? Even the
more pragmatic researchers who emphasize scientific training, hard work and
personal motivation have not been able to dismiss the role of genetic
predisposition. The level of competition and hard work that these champions
endure to become winners makes this an interesting case study. This paper
analyses the complex interplay between the roles played by genetic
disposition and training in an athlete’s performance.
1 Introduction
The impact of genetics on sports performance is a hugely contentious debate in
the sporting fraternity. Some athletes, like Michael Phelps, were hailed as
supernatural and genetically blessed: his unusually wide wingspan,
double-jointed ankles, and a body that apparently produced half the lactic
acid of his fellow competitors gave him a huge biological advantage. Others,
like Caster Semenya, the two-time Olympic champion from South Africa, became
the subject of controversy. Her body allegedly produced higher testosterone
levels than most
∗ Advised by: Bridget Callaghan
women, a finding that led the Court of Arbitration for Sport to rule that she
would have to lower her testosterone levels through medication to compete in
the women’s category [Ing19,SEM16]. The 2010 edition of the British magazine
New Statesman’s annual list of “50 People That Matter” named her for
unintentionally instigating “an international and often ill-tempered debate
on gender politics, feminism, and race, becoming an inspiration to gender
campaigners around the world”.
The absolute domination of the Kenyan long-distance runners is another
trigger that sparked the debate on genetic endowment. Physiological advan-
tages of Africans have recently been studied by Weston et al, whose studies
revealed that “Africans had elevated citrate synthase and 3-hydroxyacyl CoA
dehydrogenase activity and enhanced resistance to fatigue in a treadmill trial
designed to imitate the stresses involved in 10 km running”. [AR99]. They also
demonstrated lower blood lactate concentrations at higher speeds. Another
study revealed that they had relatively higher haemoglobin and haematocrit,
greater metabolic efficiency, and a favourable skeletal-muscle-fibre composition
and oxidative enzyme profile, which gave them an advantage over equally
motivated and trained athletes [WR12]. Research has also shown that one of the
main contributors to strength and power, both essential for a sports champion,
is biomechanical, which again points to genetics. Since most sports demand
agility and brute force, “joint torque – this is how fast and/or powerfully
a joint can move based on the force that a muscle applies to it” is important
[Coy07]. This enhanced joint torque helps an athlete generate greater power
and speed in rotational movements, maintain better balance, move precisely
towards a given target even at odd angles, sustain endurance and recover
faster, all of which are crucial for an athlete’s performance [Mus23].
Interestingly, orthopaedic research on the reconstruction of joints and soft
tissue attachments has shown that the attachment site of a tendon is a crucial
determinant of the range of motion of a joint and of joint torque at various
positions [Yam07,Miz23]. These muscle attachment mechanisms and positions are
all genetic. Strength and endurance are also dependent on muscle fibre type
[Tes85]. It is well established that fast-twitch
muscle fibres produce more force and power than slow-twitch fibres, primarily
because they are larger, giving athletes with a higher proportion of them a
genetic advantage, especially in sprinting. This is often cited as one of the
attributes that made Usain Bolt unstoppable [DLCS76]. The type of muscle fibre
may thus have a direct bearing on an athlete’s performance: slow-twitch fibres
support long-distance running, while fast-twitch fibres support the quick,
powerful movements needed for sports like sprinting or weightlifting [TP85,ME19].
There is also evidence suggesting that “professional bodybuilders more
than likely have some sort of myostatin mutation that allows them to build and
maintain such muscle mass” [Sch04]. Furthermore, research findings that elite
marathon runners are better at dissipating heat than other runners, owing to
efficient tendon hysteresis, and have a higher maximum oxygen capacity again
point to genetic predisposition [FEMNes, MD94].
The question of who is genetically blessed has always remained an enigma. Let
us take India as an example, as it has the largest population of young people
in the world. Are Indians genetically better suited to chess than to soccer and
basketball? The question gains pertinence as an 18-year-old Indian chess
grandmaster, Rameshbabu Praggnanandhaa, takes on Magnus Carlsen in the final of
the FIDE World Cup in Baku, Azerbaijan. He is an exceptional talent, inspired by
fellow Indians like Viswanathan Anand, himself a former champion. While some
might wonder why the most populous country, with a population of 1.4 billion,
has made so little impact in soccer or basketball at the Olympic or World Cup
level and has only a handful of Olympic medals in athletics, others might argue
that India’s cricketing prowess sends confusing signals: two of its most
renowned players, Sachin Tendulkar and Sunil Gavaskar, became legends in their
craft despite very small physical frames. They were known to take on the might
of some of the fastest, physically well-endowed bowlers from other cricketing
nations. The world is replete with such examples, with footballers like Lionel
Messi and Diego Maradona making it to the very top despite a shorter frame,
belying the genetics argument to some extent.
Interestingly, in the same vein, while we see a distinct edge enjoyed by black
athletes in many sports requiring high speed and force, and their domination of
the popular NBA, we also see Asians and Caucasians dominate racket sports like
tennis and badminton, which also require high levels of agility and brute force.
Such contradictions trigger the counterargument that genetic predisposition is a
significant but not the sole prerequisite for excelling in a particular sport.
With the advent of technical tools that make sports training more scientific,
this debate leads to the larger debate on sports genomics. We attempt to analyse
whether champions can be trained and made or whether they need a certain
genetic predisposition for training to yield the desired results. The role
played by hard work, motivation, and a supportive team in an athlete’s success,
as elaborated in Ericsson’s theory of deliberate practice and its significance
in champion development, adds an interesting dimension that cannot be ignored
[Eri93].
advantage. Recreational sport may be for everyone, but competitive sport at
any level involves fierce competition, and one must have the physical endurance
to take on the challenge. This prerequisite can be set aside for chess and
other board games, which do not involve physical activity. Even so, the picture
is not as simple and straightforward as it seems, because different sports have
different physical requirements. As Vaeyens et al. (2008) argued, the nature of
the sport discipline itself defines to what extent the unidimensional components
intervene [Vae08]. Moreover, even within a specific sport discipline, the
physical requirements vary greatly depending on the position of the player on
the field. This position-specific adaptation has been observed in various
sports, including volleyball [She09], handball [Zap11], and rugby [Del13].
These studies show that the specific requirements vary by sport and position.
For decades, coaches have obsessed over “the tale of the tape,” measuring
height, weight and reach to determine a player’s suitability at a competitive
level. Research from UC Berkeley now suggests that the length of an athlete’s
arms relative to their height might be even more important than previously
believed in basketball [Bah18], making “wingspan” a key term in the NBA. The
same advantage was exploited to perfection by Michael Phelps in his sporting
career. Despite variations across sports and roles, the one thing most studies
agree on is that physicality matters in sport.
2.2 Technical
Technique is important in any sport. Grosser (1982) defines technique as
the ideal model of a movement relative to a specific sport activity [Roc86]. It
refers to the movements and postures adopted to maximise impact, optimise
performance, prevent injuries, and ensure consistency under pressure with
minimal wastage of effort and force. Technique is therefore a key to success in
sport. For champions, it is worked on scientifically and personalised after
careful analysis of their physical attributes, natural abilities and strengths.
Michael Phelps perfecting the deep catch or sculling technique to propel himself
faster and Michael Jordan using biomechanics to perfect his famous fadeaway are
both examples of athletes refining technique to gain a competitive advantage.
Michael Jordan’s accomplishments are attributed to his brilliant athleticism
and superior technique. “His planted foot was attached to the floor, making it
easier for him to explode away from his defender at just the right moment” is a
small example of a technique used to perfection. Another comes from the
legendary tennis player Roger Federer: “While others rely on instinct or
muscle memory to make their shots in moments of white-hot pressure, Federer
can delay the moment when he must commit to a shot until impossibly late”
[Fyl09].
2.3 Tactics and Strategy
Brute force and physical attributes mean nothing without a refined skill set.
That is why athletes add tactical elements to their training and work on a
winning strategy. Athletes must have a comprehensive understanding of the
strategic aspects of the game and of how these strategies withstand the test of
real-time competition, not just the security of the training arena. For this
they plan not only their own strategy but also carefully analyse the strategy
of their competitors, so that they keep a winning edge by anticipating
contingencies and preparing counters to them. This is crucial as it “requires
players to maintain high quality of perception, concentration and
decision-making for a long time, even when the player is physically and
psychologically overloaded” [PP18]. “Tactics therefore elaborates the strategic
intention of preparing the player or team in real conditions of a match and
solving situations in match. Tactics point to the possibilities of solving
certain sub-situations within the strategy. It focuses on the practical
implementation of such situations in the match” [SO18]. Tactical preparation is
the process of equipping an athlete with the knowledge, practical learning and
skills that enable the player to choose the optimal solution in each game
situation and apply it effectively. This is crucial for success.
3 Impact of Genetics on the Key Fundamentals
of Sporting Excellence:
3.1 Impact of Genetics on Physical Force and Strength:
It is widely acknowledged that a favourable genetic profile, when combined with
an optimal training environment, is important for elite athletic performance
[GL13]. As of 2009, more than 200 genetic variants had been associated with
physical performance, with more than 20 of them associated with elite athlete
status [BM09]. Given the extremely slim margins between victory and defeat,
this is undoubtedly a substantial advantage to possess, ensuring a favourable
head start.
Height, a basic physical trait that is critical for success in some sports,
is highly heritable, with about 80
The tremendous success of many Kenyan athletes has brought the focus back to
the role of genetics in an athlete’s success. Studies have shown that African
distance runners have reduced lactic acid accumulation in muscles, increased
resistance to fatigue, and increased oxidative enzyme activity, which gives
them the advantage of high levels of aerobic energy production [WAik].
Larsen et al. (2015) studied the anthropometric characteristics of elite Kenyan
distance runners and reported that they had longer legs (5
The dynamic cyclist Miguel “Big Mig” Induráin, who won five Tours de France
from 1991 to 1995 and the Giro d’Italia twice, was known to have a remarkably
large lung capacity and an exceptional heart that allowed his blood to transport
7 litres of oxygen around his body per minute, compared with the 3 to 4 litres
pumped in an average individual.
Basketball greats like Michael Jordan, whose 6’6” frame came with a wingspan
of 6’11”, took the use of reach to a completely different level. Dwight
Howard’s wingspan of 7 feet 5 inches on a 6 feet 11 inches frame made him
formidable. At 7 feet 1 inch and 325 pounds, Shaquille O’Neal, with size 22
feet, used his overpowering physical assets to dominate the court, as does
LeBron James, who stands 6 feet 8 inches tall and weighs 250 pounds. His
massive legs reportedly allow him to generate 700 pounds of pressure per leap,
making him faster than most point guards [Hay17].
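The relationship between arm span and height can be made concrete with the
figures quoted above. The following is a minimal sketch, not a method taken from
any of the cited studies: the helper names are hypothetical, and the ratio of
wingspan to height is assumed here as one simple way of expressing "reach
relative to height".

```python
def to_inches(feet, inches):
    """Convert a feet-and-inches measurement to total inches."""
    return feet * 12 + inches


def wingspan_to_height_ratio(wingspan_in, height_in):
    """Ratio above 1.0 means the arm span exceeds standing height."""
    return wingspan_in / height_in


# Heights and wingspans as quoted in the text (hypothetical data structure).
players = {
    "Michael Jordan": {"height": to_inches(6, 6), "wingspan": to_inches(6, 11)},
    "Dwight Howard": {"height": to_inches(6, 11), "wingspan": to_inches(7, 5)},
}

ratios = {
    name: wingspan_to_height_ratio(p["wingspan"], p["height"])
    for name, p in players.items()
}

for name, ratio in ratios.items():
    print(f"{name}: wingspan/height = {ratio:.3f}")
```

On these figures both players have an arm span roughly 6 to 7 percent greater
than their height, which is the kind of relative measure the wingspan argument
relies on.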
performance is a complex mix of both genetic and environmental factors. Since
every movement and technique is greatly affected by physical traits, by the
strength of the muscles used for movement (skeletal muscles) and by the
predominant type of fibres that compose them, genetics again becomes a focus.
These muscle fibres cannot be created artificially and are nature’s gift. They
are primarily of two types. Slow-twitch muscle fibres contract slowly and can
work for a longer duration without tiring, and hence are an asset for any sport
that needs endurance. Fast-twitch muscle fibres contract quickly but tire
rapidly; they are suited to sprinting and other activities that require power
or strength. Whether a trained athlete can stick to technique in a
high-pressure competitive environment also depends on other traits, such as
aerobic capacity, muscle mass, height, flexibility, coordination, intellectual
ability, and personality, all of which have a genetic component.
Basketball and soccer are two of several sports that combine anaerobic and
aerobic demands, in which athletes need power, speed, quickness, agility, and
strength [NSC17], and studies have revealed how genetic composition can have a
direct bearing on these attributes. There is no doubt that a motivated athlete
could train harder to overcome the odds and defy the genetic advantage of an
opponent, but champions sometimes need only a fraction of an advantage to take
the lead, and that minuscule advantage might be the defining difference.
The ability to generate maximal power during complex motor skills is of
paramount importance to successful athletic performance across many sports
[Cor11] and has a direct bearing on an athlete’s ability to implement technique
to perfection. This is why power training emphasises improving maximal power
production in dynamic, multi-joint movements. Muscle strength is directly
related to fibre composition, which is where genetics comes in. Studies have
consistently indicated a significant role for genes in the way an individual’s
body responds to exercise and strength training, which has a direct bearing on
whether an athlete can execute a given technique to perfection. A recent study
found that up to 72
Technique, tactics, and strategy are perfected through training and other
factors like diet and nutrition. Research on aerobic endurance shows that some
people respond more to training than others. Genetically gifted athletes are
likely to respond better to training than equally motivated but less
genetically blessed athletes, and their cells are likely to contain more
mitochondria, which produce adenosine triphosphate (ATP), the molecule used to
store and supply energy at the cellular level [MJ19a]. Tactics and strategy are
another key pillar of sporting success. A powerful athlete can implement a
strategy with brute force and impeccable precision and can also disrupt that of
an opponent, however well prepared. Any race or match is won because the winner
has the capacity to outperform the tactics and strategy of his or her
opponents. All athletes at the highest level come with the highest levels of
training and motivation, as one cannot compete at that level without them. In
the face of this intense competition, studies of similarities and differences
in athletic performance within families, including between twins, suggest that
genetic factors underlie 30 to 80 percent of the differences among individuals
in traits related to athletic performance [AI15,AI16,WN15,YX16], a percentage
that cannot be ignored.
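To illustrate how a twin comparison can yield a percentage of this kind, the
sketch below applies Falconer's classical approximation, h^2 = 2(r_MZ - r_DZ),
where r_MZ and r_DZ are the trait correlations within identical and fraternal
twin pairs. This is an assumption made for illustration: the text does not state
which estimation method the cited studies used, and the correlation values are
hypothetical, not data from any of them.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's approximation: h^2 = 2 * (r_MZ - r_DZ).

    r_mz: trait correlation within identical (monozygotic) twin pairs
    r_dz: trait correlation within fraternal (dizygotic) twin pairs
    The result is clamped to [0, 1], since it is read as a proportion.
    """
    h2 = 2.0 * (r_mz - r_dz)
    return max(0.0, min(1.0, h2))


# Hypothetical twin correlations for two unnamed performance traits.
example_correlations = {
    "trait_a": (0.80, 0.45),
    "trait_b": (0.60, 0.45),
}

estimates = {
    trait: falconer_heritability(r_mz, r_dz)
    for trait, (r_mz, r_dz) in example_correlations.items()
}

for trait, h2 in estimates.items():
    print(f"{trait}: estimated heritability = {h2:.2f}")
```

With these made-up inputs the estimates come out at about 0.70 and 0.30, which
shows how differences between twin types translate into figures within the 30
to 80 percent range quoted above.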
standard where the need for effortful concentration is minimised”. If the
individual persists and learns to adapt to situational demands, a stage could
come when the tasks become increasingly automated and the individual stops
making intentional adjustments. This is where expert performers differ: they do
not stop learning. “Expert performance continues
to improve as a function of more deliberate practice” [Eri03a]. “The challenge
for aspiring expert performers is to avoid the arrested development associated
with automaticity and to acquire new cognitive skills through their continued
learning and improvement”. For those who practise persistently and harness
their unique talents, “this modification of complex cognitive mechanisms demands
problem-solving skills and undivided concentration” [Eri02]. The key challenge
is to persist with deliberate practice and to continue to pursue perfection
in all eventualities with a focussed cognitive approach. Ericsson believed that
“As a result of deliberate practice, many biological characteristics, such as width
of bones, flexibility of joints, size of heart, metabolic characteristics of muscle
fibers, and so forth, can be changed after years of intense and carefully designed
training. Biochemical processes that preserve equilibrium during intense
training influence these anatomical changes” [Eri03b] (Ericsson, 2003c).
Deliberate practice also helps expert performers sharpen their “mental
representations that allow the expert performer to bypass the
information-processing constraints imposed by basic capacities”, he added.
Taking this cue, it could be inferred that the exemplary reactions and superior
force and speed exhibited by elite athletes, for example in returning an
opponent’s tennis ball, can be attributed to skilled anticipation of events
through the identification of early predictive cues, and not to superior
perceptual acuity or faster cognitive speed alone [ea08].
This theory of deliberate practice also stresses starting early, to give
potential players the years of practice needed to become an ace athlete. “In many
domains, such as music and sports, parents arrange for their children to start
practice at very young ages, sometimes as young as 3–4 years of age”. This
early start gives them a huge advantage over late starters, as they have longer
to sharpen their skills through deliberate practice. Studies have shown
that beginning deliberate training early in life yields more refined and accurate
adaptive responses and greater cognitive and neurological development.
“The foundations of brain architecture are established early in life through
a continuous series of dynamic interactions between genetic influences, environ-
mental conditions, and experiences” [Fri06,MM06]. This phase has a significant
impact on the brain architecture and “each one of our perceptual, cognitive,
and emotional capabilities is built upon the scaffolding provided by early life
experiences” [GL15].
According to Benjamin Bloom, a professor of education at the University of
Chicago and author of the book “Developing Talent in Young People”, which
examined the critical factors that contribute to talent, “all brilliant performers
had practiced intensively, had studied with devoted teachers, and had been
supported enthusiastically by their families throughout their developing years”.
His study included a retrospective look at the childhoods of 120 elite performers
who had won international competitions or awards in fields ranging from music
and the arts to mathematics and neurology [Blo85b]. His study focussed on
deliberate practice and motivated training and leaned overwhelmingly towards
the concept that experts are always made, not born. These studies make a
clear distinction between regular practice and deliberate practice, with the
latter focussed on refining practice to cover all shortcomings and on reacting
to unpredictable variables that may stand in the way of an ultimate
victory. “Not all practice makes perfect. You need a particular kind of prac-
tice—deliberate practice—to develop expertise. When most people practice,
they focus on the things they already know how to do. Deliberate practice is
different. It entails considerable, specific, and sustained efforts to do something
you can’t do well—or even at all. Research across domains shows that it is
only by working at what you can’t do that you turn into the expert you want
to become” [KAEC07]. Deliberate practice involves two kinds of learning: im-
proving the skills you already have and extending the reach and range of your
skills. The enormous concentration required to undertake these twin tasks
limits the amount of time you can spend doing them. The famous violinist Nathan
Milstein wrote: “Practice as much as you feel you can accomplish with
concentration.” The general belief of most experts is that even the most gifted
performers need a minimum of ten years (or 10,000 hours) of intense training
before they win international competitions. This makes it difficult, and
sometimes impossible, for late starters to catch up with competitors who
started earlier and maintained maximal levels of deliberate practice, as
rushing through the same volume of deliberate practice can lead to exhaustion
and injuries. The golfer Tiger Woods, who started deliberate practice very
young, is an example of this approach. The tennis great Federer himself
confessed that he did not see himself as a genius but worked hard at it.
a personalized sports nutrition plan was highlighted by the American College
of Sports Medicine which stated that “Nutrition plans need to be personalized
to the individual athlete. . . and take into account specificity and uniqueness
of responses to various strategies” [TD16]. These strategies encompass over-
all dietary patterns, macronutrient ratios, micronutrient requirements, eating
behaviours (e.g., nutrient timing), and the judicious use of supplements and
ergogenic aids.
Given the high stakes of building champions on the world stage, these studies
also examine the genetic variants that have a direct bearing on the way
athletes absorb, metabolise, utilise, and excrete nutrients [ND14]. Individuals
given actionable, gene-based dietary advice have been found more likely to
change their health behaviours, including their dietary choices and intakes
[HJ18], which is a welcome change. The positive outcomes in terms of muscle
power, endurance, agility, speed and physical strength have made this field
extremely popular, and an increasing number of athletes depend on individually
tailored dietary and other performance-related information based on their DNA
to stay competitive.
A fine example of this approach is caffeine and the CYP1A2 rs762551 SNP. In
individuals with the AA genotype (fast metabolisers), caffeine has been
reported to elicit a positive or “improved” performance response, whereas
individuals with the CYP1A2 AC or CC genotype may show either no effect or an
impaired response to caffeine intake [GN18]. Another example is the use of diet
and supplements to bring haemoglobin to the optimal level for athletes.
Low haemoglobin production decreases the oxygen-carrying capacity of the
blood, leading to a lack of oxygen in working muscles and resulting in impaired
muscle contraction and aerobic endurance [HJ01].
Increasing interest in epigenetics is also driving considerable research into
individual variation in response to exercise training and sport, and into
exercise genomics, a key factor in athletic training and performance
enhancement [VNipAAcFjs17].
6 Conclusion
Champions are a class apart and have always been the subject of fascination.
It comes as no surprise that their ‘making’ has intrigued many, leading to some
interesting debates on whether they are a product of nature or can be nurtured.
This debate is as old as sport itself. Its theoretical context can be traced back
to the accounts of Hippocrates (460–370 BC), the father of medicine, who in his
Book 1 (Dietetics) weighed the relative contributions of nature and nurture,
highlighting the importance of health and the role of a diet and exercise
“regimen” in maintaining it. “Eating alone will not keep a man well; he must also
take exercise. For food and exercise, while possessing opposite properties, yet
contribute mutually to maintain health. For it is the nature of exercise to use
up material, while of food and drink to restore them. And it is necessary, as it
appears, to determine exactly the powers of various exercises, both natural and
artificial exercises, and which of them contribute to the development of muscle”.
However, even though he seems to refer to the ‘nurture’ aspect as a requisite
for good health, in the same book he also talks about heritability, which
immediately takes us to an individual’s “genetic predisposition”. Centuries
later, Galton became the first academic to advocate a hereditary ceiling
to physical and mental capacities [F69,FS92] and formally objected to
“pretensions of natural equality” in his landmark paper “The history of twins
as a criterion of the relative powers of nature and nurture”.
It was Ericsson et al. who questioned this notion of inherited talent and
exceptional abilities with their theoretical framework of “deliberate practice”,
an alternative route to expert performance that limits the role of innate or
inherited characteristics in optimal performance [EK93]. Elite performance, they
claimed, is the “product of a decade or more of maximal efforts to improve
performance in a domain through an optimal distribution of deliberate practice”,
thus rejecting the Galtonian model of innate ability in the making of champions.
Ericsson proposed a very structured approach, claiming that a specific volume of
10,000 h of training, accumulated over a period of approximately 10 years, is
necessary for achieving expert levels [KA.06]. His theory caught the imagination
of a wider audience, leading to motivating publications like Outliers [M08] and
The Genius in All of Us [D.10]. These books fuelled a billion-dollar industry
of nutrition and guided fitness.
However, all these theories were questioned in light of the ground realities
of emerging champions, each more gifted than the last. Despite the widespread
appeal and popularity of Ericsson’s framework of deliberate practice, more
careful analysis of champions did not show an overwhelming impact of deliberate
practice time. Studies show that only 28
Considering the number of body systems that must interact (musculoskele-
tal, cardiovascular, respiratory, nervous, etc.), athletic performance is one of
the most complex human traits. Perhaps the first noticeable difference between
athletes of different specialties is in body morphology (i.e., height and body
composition), with specific body types naturally suited to specific sports. Be-
yond body morphology, endurance, strength, and power are primary factors
underlying athletic performance.
It is also important to note that strategy and tactics are not a paper or
board exercise; they must be implemented in a sporting arena that has its fair
share of real-time dynamics. Athletes must be at peak competency in their
mental, cardiovascular, respiratory, neuromuscular, metabolic, hormonal, and
thermoregulatory systems, each of which has an undeniable genetic influence.
While deliberate practice, environment and a support system are indeed
critically important in the development of elite athletic abilities, dismissing
altogether the innate abilities that result from genetic composition may not be
correct. In fact, heritability studies on physical performance and functional
adaptability have provided strong evidence of a significant genetic component
to various parameters that ultimately determine elite performance. Heritability
estimates linked to sporting performance, such
as 99
Ironically, these counter-findings came mostly at a time when CRISPR gene
editing began to make headlines, leading to heightened interest in gene doping
and genetic editing to make champions who would be no different from the
natural ones, and might even surpass them with an inserted genetic advantage.
This turn of events has lent yet another scientific dimension to the nature vs
nurture debate, one that is worrying sporting bodies as these genetic
interventions are difficult to trace [dA08]. Concerns over genetic modification
or “gene doping” for enhanced performance arise from impressive studies in
genetically modified rodents, where manipulation of individual genes has
increased muscle mass, muscle strength, or running endurance, depending on the
gene that was manipulated. Reviews of these animal studies conclude that such
genetic manipulations could also improve human athletic performance
[HL04, SA06, BA07], inadvertently strengthening the case for the advantages
offered by nature. The irony is that nature’s advantage could soon be nurtured.
Evidence from sporting champions clearly shows that both nature and nurture
are critical to their success. However, the studies reviewed here do show the
slight advantage that heritability offers, one that could make all the
difference at the highest sporting levels. It would be safe to conclude that
champions can be built, but only from among those who are favoured by nature,
bringing us back to the prophetic statement of Galton: “there is nothing in
what I am about to say that shall underrate the sterling value of nurture,
including all kinds of sanitary improvements; nay, I wish to claim them as
powerful auxiliaries to my cause; nevertheless, I look upon race as far more
important than nurture” [100,101]. He clearly implied that deliberate practice
and environmental factors are both critical to sporting excellence, but that
they do not in themselves produce elite athletes. The future debate may not be
as simple, as the scientific community looks to create nature in laboratories.
References
[AI15] Ahmetov II, Fedotovskaya ON. Current progress in sports genomics.
Adv Clin Chem. Review. PubMed: 26231489, 2015.
[CA12] Silva AJ et al. Costa AM, Breitenfeld L. Genetic inheritance
effects on endurance and muscle strength: an update. Sports
Med. 2012 Jun 1;42(6):449–58, 2012.
[CKT19] Czarnik-Kwaśniak, J., Kwaśniak, K., and Tabarkiewicz, J.
How genetic predispositions may have impact on injury and
success in sport. Eur J Clin Exp Med. 16, 366–375. doi:
10.15584/ejcem.2018.4.16, 2019.
[Con09] Hanton S. Connaughton, D. Mental toughness in sport:
Conceptual and practical issues. In S. Mellalieu S. Hanton
(Eds.), Advances in applied sport psychology: A review (pp.
317– 346). London: Routledge., 2009.
[Con10] Hanton S. Jones G. Connaughton, D. The develop-
ment and maintenance of mental toughness in the world’s
best performers. The Sport Psychologist, 24(2), 168– 193.
https://doi.org/10.1123/tsp.24.2.168., 2010.
[Cor11] McGuigan M.R. Newton R.U Cormie, P. Developing maximal
neuromuscular power. Sports Med 41, 125–146 (2011), 2011.
[Coy07] E. F. Coyle. Physiological regulation of marathon perfor-
mance. . Sports Medicine, 37(4-5), 306-311., 2007.
[CP02] Sewell D Clough P., Earle K. Mental toughness: the concept
and its measurement. Solutions in Sport Psychology ed. Cock-
erill I. M. (Boston, MA: Cengage Learning; ) 32–43., 2002.
[CP12a] Perry J. L. Crust L. Clough P., Earle K. Comment on “pro-
gressing measurement in mental toughness: a case example
of the mental toughness questionnaire 48” by gucciardi, han-
ton, and mallett. Sport Exerc. Perform. Psychol. 1 283–287.
10.1037/a0029771, 2012.
[CP12b] Strycharczyk D Clough P. Developing mental toughness: Im-
proving performance, wellbeing and positive behaviour in oth-
ers. . London: Kogan Page Publishers., 2012.
[Cru11] Lee Crust. Mental toughness in sport. International Journal
of Sport and Exercise Psychology Volume 5, Issue 3, 2011.
[D.10] Shenk D. The genius in all of us: new insights into genetics,
talent, and iq. New York: Knopf Doubleday Publishing Group;
2010., 2010.
[dA08] World Anti doping Agency. World anti-doping agency. wada
gene doping symposium calls for greater awareness, strength-
ened action against potential gene transfer misuse in sport.
WADA, 2008.
[Del13] Delahunt, E., Byrne R., Doolin R., McInerney R., Ruddock C., Green B.
Anthropometric profile and body composition of
Irish adolescent rugby union players. Journal of Strength and
Conditioning Research, 27, 3252–3258, 2013.
[EN16] Femia P-et al Eynon N, Ruiz JR. The actn3 r577x polymor-
phism across three groups of elite male european athletes. In:
Garatachea N, editor. PLoS ONE. 8. Vol. 7. 2012. Aug 16, p.
e43132., 2016.
the arts and sciences, sports, and games (pp. 1–50). Mahwah,
NJ: Erlbaum., 1996.
[FEMNes] Frank E. Marino, Mike I. Lambert, and Timothy D. Noakes.
Superior performance of African runners in warm humid but
not in cool environmental conditions. Journal of Applied
Physiology, Vol. 96, No. 1.
[HJ01] Haas JD, Brownlie T IV. Iron deficiency and reduced work
capacity: a critical review of the research to determine a causal
relationship. J Nutr. 131:676S–688S; discussion 688S–690S.
doi: 10.1093/jn/131.2.676S, 2001.
[McP11] McPhee, J. S., Perez-Schindler J., Degens H., Tomlinson D.,
Hennis P., Baar K., et al. HIF1A P582S gene association with
endurance training responses in young women. Eur. J. Appl.
Physiol. 111, 2339–2347. doi: 10.1007/s00421-011-1869-4,
2011.
[Mon98] Montgomery, H. E., Marshall R., Hemingway H., Myerson S.,
Clarkson P., Dollery C., et al. Human gene for physical
performance. Nature 393, 221–222. doi: 10.1038/30374, 1998.
[PZ3] Pokrywka, A., Kaliszewski P., Majorczyk E., and
Zembroń-Łacny A. Genes in sport and doping. Biol. Sport. 30,
155–161. doi: 10.5604/20831862.1059606, 2013.
[Tes85] Tesch, P.A., Wright J.E., Vogel J.A., et al. The influence of
muscle metabolic characteristics on physical performance.
Europ. J. Appl. Physiol. 54, 237–243, 1985.
[TP85] Tesch P.A., Karlsson J. Muscle fiber types and size in trained
and untrained muscles of elite athletes. Journal of Applied
Physiology, 1985.
[Uni21] Anglia Ruskin University. Genes play key role in exercise out-
comes. ScienceDaily, 2021.
[Vae08] Vaeyens, R., Lenoir M., Williams A.M., Philippaerts R.
Talent identification and development programs in sport.
Sports Medicine, 38(9), 703–714, 2008.
A Maase K Moran C North KN Pigozzi F Wang G. Webborn N,
Williams A. Direct-to-consumer genetic testing for
predicting sports performance and talent identification:
Consensus statement. Br J Sports Med. 2015 Dec;49(23):1486-91.
doi: 10.1136/bjsports-2015-095343. PubMed: 26582191. Free
full-text available from PubMed Central: PMC4680136, 2015.