1 s2.0 S1110866524000471 Main
ARTICLE INFO

Keywords: Q-learning; Genetic marker selection; Acne genetics; Reinforcement learning; Gene expression data; PubMed text data mining

ABSTRACT

Genetic markers for acne are being studied to create personalized treatments based on an individual’s genes, and the field is benefiting from the application of artificial intelligence (AI) techniques. One such AI tool, the Q-learning algorithm, is increasingly being utilized by medical researchers to delve into the genetics of acne. In contrast to previous methods, our research introduces a Q-learning model that is adaptable to diverse sample groups. This innovative approach involves preprocessing data by identifying differentially expressed genes and constructing gene-gene connectivity networks. The key advantage of using the Q-learning model lies in its ability to transform acne gene data into Markovian domains, which are essential for selecting relevant genetic markers. Performance evaluations of our Q-learning model have shown high accuracy and specificity, although there may be some sensitivity variations. Notably, this research has identified specific genes, such as CD86, AGPAT3, TMPRSS11D, DSG3, TNFRSF1B, PI3, C5AR1, and KRT16, as being acne-related through biological verification and text data mining. These findings underscore the potential of AI-driven Q-learning models to revolutionize the study of acne genetics. In conclusion, our Q-learning model offers a promising approach for the selection of acne-related genetic markers, despite minor sensitivity fluctuations. This research highlights the transformative potential of Q-learning in advancing our understanding of the genetics underlying acne, paving the way for more personalized and effective treatments in the future.
1. Introduction

Acne, a common skin problem affecting people of all ages, has emerged as a significant public health challenge. It boasts a global prevalence rate of 9.38 %, as reported by the Global Burden of Disease Study 2010 [33]. The comedones, papules, pustules, and, in severe cases, cysts and nodules characterize this multifaceted skin condition.

The European Union (EU) has been actively engaged in developing regulations and guidelines to govern the implementation of AI in healthcare to ensure patient safety, data privacy, and ethical considerations are upheld [31]. By adhering to these regulations and ethical guidelines, stakeholders in the medical field can harness the potential of AI technology while upholding legal and ethical standards. AI has had a significant impact on the field of dermatology, particularly in addressing acne and skin health concerns. By utilizing AI-powered image recognition and analysis tools, dermatologists can efficiently identify various acne lesions, assess their severity, and monitor their progression over time [11]. These AI systems process extensive visual data from patient images and clinical studies, aiding in early diagnosis and personalized treatment planning for individuals affected by acne.

Genetics has emerged as a crucial factor in determining an individual’s predisposition to acne [1,37,10,29]. AI and machine learning algorithms are instrumental in analyzing large-scale genetic data to identify potential genetic markers associated with acne susceptibility.
* Corresponding authors.
E-mail addresses: [email protected] (Y.C. Chua), [email protected] (H.W. Nies), [email protected] (I.I. Kamsani), [email protected] (H. Hashim),
[email protected] (Y. Yusoff), [email protected] (W.H. Chan), [email protected] (M.A. Remli), [email protected] (Y.H. Nies), [email protected]
(M.S. Mohamad).
https://doi.org/10.1016/j.eij.2024.100484
Received 1 October 2023; Received in revised form 17 May 2024; Accepted 29 May 2024
Available online 7 June 2024
1110-8665/© 2024 THE AUTHORS. Published by Elsevier BV on behalf of Faculty of Computers and Artificial Intelligence, Cairo University. This is an open access
article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Y.C. Chua et al. Egyptian Informatics Journal 26 (2024) 100484
Through the use of AI, researchers can navigate through vast genomic information more effectively, revealing valuable insights into the genetic foundations of acne.

Discovering these genetic markers through AI-driven research can enhance our understanding of acne’s origins and open doors to personalized treatment approaches. With AI’s assistance, dermatologists can potentially tailor acne treatment plans based on an individual’s genetic predisposition, optimizing therapeutic outcomes and minimizing adverse effects. This intersection of dermatology, genetics, and AI holds promise for advancing our ability to manage and treat skin conditions like acne effectively.

Scientists studying acne have looked for genetic markers without considering how diverse people can be. In the past, more studies focused on identifying genetic markers related to acne by examining differentially expressed genes (DEGs) via weighted gene co-expression network analysis (WGCNA) [5,18]. However, these methods have limitations because they rely on co-expression for gene functional inference. Genes with similar expression profiles may have different functions, or show inconsistencies that arise from regulation and post-transcription, so this approach may not be useful. The core issue lies in the rigidity with which these methods approach genetic marker identification, which may overlook significant heterogeneity. To address this, some researchers have proposed a more dynamic approach by leveraging reinforcement learning models, such as Q-learning. This dynamic approach considers the sequential nature of Markov Decision Processes (MDPs) to optimise active feature selection policies [13,14,30].

This research confronts the primary challenge of elucidating the interaction between genes and the actions of reinforcement learning agents without biological traits. These traits encompass various aspects, such as acne progression, mutations, copy number variations, and mRNA levels. The reinforcement learning model aims to optimise policies based on states reflecting action quality. Since a Markov decision process involves sequential actions with consequences unfolding over subsequent steps, immediate outcomes remain elusive. Moreover, gene expression data comprises numerous genes, including irrelevant ones that can negatively impact classification accuracy. The demand for biomarker testing, especially in cancer treatment and drug discovery, is increasing. Unfortunately, methodologies for processing acne gene expression data for reinforcement learning models remain limited. Therefore, this research addresses the lack of standardised data preprocessing when applying the Q-learning model to acne gene expression data. The method aims to let the model learn from gene expression data and select informative genetic markers linked to acne. Furthermore, the ethical considerations surrounding medical AI emphasise the need for transparency in AI algorithms, accountability for decision-making processes, and the importance of human oversight in healthcare settings [31]. Ethical guidelines such as those outlined in the EU’s Ethics Guidelines for Trustworthy AI are essential for ensuring that AI in
medicine upholds fundamental rights and values.

This study aims to show how Q-learning can improve the selection of genetic markers linked to acne. By harnessing the adaptability of this algorithm, the research aims to enhance the precision and specificity of marker identification while considering the subtle variations among different patient groups. This research aims to improve acne treatment by personalising it, leading to a breakthrough in acne genetics research.

2. Materials and methods

Fig. 1 presents the framework of the proposed method, including data pre-processing and Q-learning.

2.1. Data collection and pre-processing

This study uses GSE108110, GSE53795, and GSE6475 as input data, all of which are gene expression datasets obtained from the Gene Expression Omnibus (GEO) [19]. GSE53795 and GSE6475 have been instrumental in prior investigations [18], where they were used for a comprehensive analysis of datasets and pathways via weighted gene co-expression network analysis (WGCNA). GSE108110 and GSE53795 were investigated by Yang et al. [38] for inflammatory acne-related key biomarkers, signalling pathways, and immune infiltration in the acne lesion. These datasets are built upon gene expression data stored in quantified Affymetrix image (CEL) files. The data undergo rigorous preprocessing, a vital step to ensure accuracy and reliability, because missing or redundant data can significantly sway survival analysis and the interpretation of pivotal factors like diagnosis stages [25,21,24]. First, the datasets were profiled into probe sets as the raw data. Then, background correction and probe-level quantile normalization were applied to the raw data using the robust multichip average (RMA) algorithm [24]. Next, the Gene Entrez IDs were extracted with the R package hgu133plus2.db and used as the row names of the gene expression data, so that the rows were converted from probe identifiers to Gene Entrez IDs. The column values were likewise converted into gene expression values for each Gene Entrez ID [34–35]. Missing and repeated data can degrade the analysis result [25]. Following previous studies, records with a missing Gene Entrez ID were removed, whereas repeated records with the same Gene Entrez ID were merged by averaging their gene expression values from all samples [24,21]. Table 1 summarises the details of the data before and after data preprocessing. Once the rows with missing values and repeated Gene Entrez IDs have been cleaned, all datasets are carried forward to the identification of the differentially expressed genes, as summarized in Fig. 2.

Table 1
Summary of the data on handling missing data and repeated rows for Gene Entrez ID.

| Characteristics of the Data | GSE108110 | GSE53795 | GSE6475 |
| Number of Samples | 54 (18: non-lesional; 18: lesional, papules persisting for less than 48 h; 18: lesional, papules for 21 days) | 24 (12: lesional; 12: non-lesional) | 18 (6: lesional; 6: non-lesional; 6: normal) |
| Number of rows of Raw Data (probes) | 54,675 | 54,675 | 22,277 |
| Number of rows of data with missing values for Gene Entrez ID (genes) | 11,537 | 12,753 | 2466 |
| Number of rows of data BEFORE handling the duplicated Gene Entrez ID (genes) | 43,138 | 41,922 | 19,811 |
| Number of rows of data AFTER handling the duplicated Gene Entrez ID (genes) | 20,857 | 20,174 | 12,402 |

Fig. 2. Data collection and pre-processing before identifying and filtering the differentially expressed genes.

2.2. Q-learning reinforcement model

Fig. 3 shows how the Q-learning model learns from the gene expression data. We denote the gene expression matrix as $X = [x_1, x_2, \cdots, x_m]^T \in \mathbb{R}^{m \times k}$, where m is the number of samples and k is the number of differentially expressed genes. The differentially expressed gene data is further processed by generating the weighted co-expression network to obtain a gene-gene connectivity matrix [6]. Furthermore, the differentially expressed gene data is correlated with the acne trait using Pearson correlation in order to obtain a gene list with high gene significance. Hence, the Q-learning agent learns from the acne sample, and its reward function corresponds to the valid connection between genes based on the sample and provides the rewards based on the gene-gene connectivity matrix.

The Q-learning reinforcement model consists of an environment module, an agent module, and a Q-learning module. The role of the environment module is to provide a learnable gene co-expression environment for a Q-learning agent. The generated environment is based on the gene expression of the expressed genes in the acne samples. Thus, for the environment module, there are eight
variables and four functions. The first variable is “ggc”, which stores the gene-gene connectivity matrix. The “top_gene_list” variable stores the Gene Entrez IDs that have high gene significance, and the “index_top_genes” variable stores the indices of those Gene Entrez IDs. The “selection” variable stores the selected genes in an array. The “expression” variable stores the actively expressed genes from the acne sample in an array. The “terminated” variable is a boolean that records whether the terminated index has been selected. The “gene_num” variable stores the number of genes in the acne sample. The last variable, “gene_id”, is a list of the Gene Entrez IDs that exist in the acne sample. The “reset()” function resets the environment when the agent has selected the terminated index as an action. The “play_step()” function makes the selection of genes, whereas the “get_index_top_genes()” function obtains the index of every Gene Entrez ID that has high gene significance. The last function is “is_connective()”, which checks whether the gene-gene connectivity of the selected genes is greater than a small threshold, so that the agent can distinguish the value of the state.

Fig. 3. How Q-learning works in the proposed method.

Next, the role of the Q-learning module is to provide the Q-learning model to the agent for calculating and storing the Q-value for pairs of state and action, Q(s, a). The agent thus aims to identify the optimal action from the Q-value table and to choose the best action in a given state. The Q-learning module has three variables and two functions. The first variable is “Q”, an array that acts as the Q-value table storing the Q-values for the states and actions. The “gamma” variable is the discount factor, whereas the “learning_rate” variable is the learning rate used when updating the Q-value table with the Bellman equation. Both “gamma” and “learning_rate” are set to 0.9, because convergence with this value is much faster than with other values [39]. When the discount factor is low, agents are more eager for short-term benefits than for future return, and eventually get stuck in a local optimum and follow a long-path strategy. Moreover, the variance differs greatly across discount factors: the larger the discount factor, the smaller the variance area. The two functions of the Q-learning module are “update_Q_value()” and “get_Q_table()”. The “update_Q_value()” function updates the Q-value table using the Bellman equation, while the “get_Q_table()” function retrieves the Q-value table.

Lastly, the agent module behaves as the decision-maker. The agent retrieves a state from the environment and responds with an action, which is the gene to be selected, following an exploration and exploitation approach with a probability of executing a random action. The objective of the agent is to select genes based on the gene co-expression pattern for a list of potential genetic markers that have a high gene significance value among the genes. An agent with Q-learning aims to maximize the expected rewards. Hence, the penalty-reward function for the gene correlation is applied to the gene expression environment to provide a reward for the agent, so that it identifies genetic markers from the correlation pattern by referring to a list of potential genetic markers [9,26]. The agent module consists of 9 variables and 6 functions. The “epsilon” variable is the probability that governs whether the agent executes a random action or the action based on the Q-value table. The “model” variable is used to initiate the Q-learning model in the agent class. The “num_action” variable indicates the size of the space in which a random action can be executed. The “selected_gene” variable stores the selected gene index when the gene selection is executed, and the “previous_selected_gene” variable stores the selected gene index from the previous round of gene selection. The “state” variable is initiated as an array for storing the state extracted from the environment. The “state_index” variable stores the index of the particular state and is used to update the Q-value table. The “action” variable stores the gene index that determines the action of the agent and is also used to update the Q-value table. The last variable in the agent module is “reward”, which stores the reward calculated by the “get_rewards()” function and is also used to update the Q-value table. The “get_action()” function obtains the gene index that represents the action for the agent, based on the probability of executing a random action or the action based on the Q-value table. The “get_state()” function obtains the state value from the environment, whereas the “get_state_index()” function converts the given state into an index for storing in and retrieving from the Q-value table. The “get_rewards()” function calculates the gene-gene connectivity as a reward for the particular state and action by applying the reward-penalty function. The “get_selected_gene()” function obtains the selected gene during the selection of the gene. The last function of the agent module is the
“update_Q_table()” function, which is used to call the “update_Q_value()” function from the Q-learning module.

2.3. Markov decision process

A Markov Decision Process is a control process for modeling sequential decision-making in a stochastic situation with discrete stages [36]. In this process, a set of states, S, actions, A, and rewards, R, is involved in the interaction between agent and environment. In this Q-learning model, the agent retrieves a state from the environment and uses a Q-value table to determine the action that will be executed for the current state. After executing an action, the agent gains a reward as feedback from the environment and moves on to the next state to carry out the learning process again. In this way an optimal Q-value table is obtained, and the Q-value table provides the optimal solution for the discrete stages. The Q-value table consists of a set of states, S, actions, A, and the accumulated value of rewards, R.

The Q-learning agent conducts the selection of different genes across different samples. Hence, the state of the agent represents the degree of gene-gene connectivity between the selected genes and the genes that have high gene significance, considering whether the gene-gene connectivity is greater than a threshold. This helps the agent to distinguish which genes in the list of high gene significance genes are connected to the selected gene. The state, $s_i$, is indicated by the gene-gene connectivity between the selected genes and the genes in the list of high gene significance genes: if it is greater than the threshold used, the corresponding entry of the indicator vector $e_i$ is set to one; otherwise it is set to zero. Hence, the state is $s_i = e_i \in \{0, 1\}^{m \times 1}$, where i is the time step and m is the size of the list of high gene significance genes. In this research, the threshold used is 0.1, whereas the m used is 10.

For the selection of genes for identifying genetic markers, the action space consists of all the possible actions, which are the differentially expressed genes. In the Q-learning agent, the action space includes the index of differentially expressed genes, $A \in \{a_1, a_2, \ldots, a_L\}$, where A is the action space, a is the index of differentially expressed genes and L is the number of differentially expressed genes. In this research, the value of the action, a, consists of $\{0, 1, \ldots, L, L + 1\}$, where L + 1 represents the stop action.

$$
\text{Reward, } R = \begin{cases} -w \cdot \sum C_{jk}, & C_{jk} < \epsilon \\ \sum C_{jk}, & \text{otherwise} \end{cases}, \quad \text{where } j \in J,\ k \in K,\ J \subset K \tag{1}
$$

In the function above, j is a gene in the list of high gene significance genes, whereas J is the list of genes that have high gene significance. Furthermore, k is the selected gene and K is the list of differentially expressed genes, and $C_{jk}$ is the gene-gene connectivity of j and k. Hence, the reward-penalty function above shows that the reward depends on the gene-gene connectivity between the gene selected and the genes in the list of high gene significance genes. This is because the gene significance value can be used to identify potential genetic markers, as gene significance is a measurement of the correlation of the gene expression with an external trait [17]. Furthermore, in this reward-penalty function, the value of w used is 1.5, and the purpose of w is to enlarge the sparsity of the penalty for the genes whose gene-gene connectivity is less than the $\epsilon$ value [26]. The value of $\epsilon$ used is 0.01, to quantify the contribution of the gene as a genetic marker by considering whether the gene-gene connectivity is greater than a small threshold [26].

$$
Q_{new}(s_t, a_t) = Q(s_t, a_t) + \alpha \left( R + \gamma \cdot \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right) \tag{2}
$$

In the Q-value function above, $Q(s_t, a_t)$ is the Q-value for the current state, $s_t$, and the action taken in the current state, $a_t$; $\alpha$ is the learning rate; R is the reward for the current state, $s_t$, and the action taken in the current state, $a_t$; $\gamma$ is the discount factor; and $\max_{a} Q(s_{t+1}, a)$ is the maximum Q-value for the new state over all the possible actions. Lastly, the Q-value function, which is based on the Bellman equation, is used to update the Q-value table.

The objective of the agent is to select genes based on the gene co-expression pattern for a list of potential genetic markers that have a high gene significance value among the genes. An agent with Q-learning aims to maximize the expected rewards. Hence, the penalty-reward function for the gene correlation is applied to the gene expression environment to provide a reward for the agent, so that it identifies genetic markers from the correlation pattern by referring to a list of potential genetic markers [26].

2.4. Differentially expressed genes

The genetic markers are defined from the differentially expressed genes [26]. Hence, the differentially expressed genes are identified, and genes with no or low change in expression across samples are removed from the datasets [12,3,34–35]. Firstly, the sample information is obtained using the “GEOquery” library [7]. Next, a boxplot and a hierarchical tree are used to check the distribution of the gene expression data and to detect sample outliers, respectively. During the identification of differentially expressed genes, the median gene expression level is used to filter out lowly expressed genes across samples, namely genes that are not expressed in at least 2 samples. Next, the gene expression data is compared between two conditions, the non-acne condition and the acne condition, to identify the differentially expressed genes using the “makeContrasts()” function in the limma library [27]. Lastly, the differentially expressed genes are identified by the conditions adjusted P value < 0.05 and absolute value of the log2 fold change > 1.5 [4,15]. All the differentially expressed genes, including both up-regulated and down-regulated genes, are then extracted for further analysis. The differentially expressed gene expression data is passed to the Q-learning agent, where the differentially expressed genes serve as the actions. The same data is also used to generate the gene-gene connectivity matrix and the list of high gene significance genes for the Q-learning model.

2.5. Gene-gene connectivity matrix

To represent the dependency between genes, a gene-gene connectivity matrix is generated using the “TOMsimilarity()” function [12,17]. In this process, the Pearson correlation matrix is calculated from the differentially expressed genes and, with a soft threshold, yields a high topological overlap for two genes that have common neighborhoods. Hence, a weighted gene co-expression network is constructed, representing the gene-gene connectivity [12].

Fig. 4 shows the scale independence and the mean connectivity for the list of soft thresholds for the GSE108110 dataset. The 20th power is picked as the soft threshold for clustering the differentially expressed genes, because a soft threshold is selected when its scale-free fit index is high while the mean connectivity remains low. After picking the soft threshold, topological overlap network construction and module detection by average linkage hierarchical clustering are performed for the differentially expressed genes using the blockwiseModules() function with the picked soft threshold and Pearson correlation.

Fig. 5 plots the list of soft thresholds against the scale independence and the mean connectivity for the GSE53795 dataset. The 20th power was picked as the soft threshold for constructing the topological overlap network for GSE53795, because it has a high scale independence, close to 0.8, and a low mean connectivity. Then, topological overlap network construction and average linkage hierarchical clustering for gene module detection are performed on the differentially expressed genes using the blockwiseModules() function with the picked soft threshold and Pearson correlation.
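To make the notion of topological overlap concrete, the following is a minimal sketch in plain Python rather than the WGCNA implementation used in this study. It assumes the unsigned variant, in which the soft-threshold adjacency is $a_{ij} = |cor(g_i, g_j)|^{\beta}$ and the overlap is $(\sum_u a_{iu} a_{uj} + a_{ij}) / (\min(k_i, k_j) + 1 - a_{ij})$ with $k_i = \sum_u a_{iu}$. The function names and the toy expression matrix are hypothetical; only the power of 20 is taken from the text.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_adjacency(expr, power):
    """Unsigned soft-threshold adjacency |cor|**power with a zero diagonal.
    expr holds one row per gene and one column per sample."""
    g = len(expr)
    return [[0.0 if i == j else abs(pearson(expr[i], expr[j])) ** power
             for j in range(g)] for i in range(g)]

def topological_overlap(adj):
    """Unsigned topological overlap (assumed variant, see lead-in)."""
    g = len(adj)
    k = [sum(row) for row in adj]          # connectivity of each gene
    tom = [[1.0] * g for _ in range(g)]    # a gene overlaps fully with itself
    for i in range(g):
        for j in range(g):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j] for u in range(g))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom

# Toy data (made up): genes A, B, C move together, gene D does not.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],   # gene A
    [2.0, 4.1, 5.9, 8.2, 9.9],   # gene B
    [1.1, 2.2, 2.9, 4.1, 5.2],   # gene C
    [5.0, 1.0, 4.0, 2.0, 3.0],   # gene D
]
ggc = topological_overlap(soft_adjacency(expr, power=20))
```

In this toy matrix the overlap between genes A and B is far larger than between A and D, because A and B are strongly correlated and also share the neighbour C. A high power such as 20 pushes weak correlations towards zero, which is why the mean connectivity drops as the soft threshold grows.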
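Returning briefly to Section 2.3, Eqs. (1) and (2) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' code: Eq. (1) is read pair by pair (each connectivity below the threshold is penalised, every other one is rewarded), the binary encoding of the state vector into a table index is an assumption, and the function names and toy numbers are hypothetical. The values w = 1.5, threshold 0.01, learning rate 0.9 and discount factor 0.9 are those reported in the text.

```python
def state_index(bits):
    """Map a binary state vector e_i to a row of the Q-value table.
    Binary encoding is an assumption; the text does not specify the mapping."""
    idx = 0
    for b in bits:
        idx = idx * 2 + b
    return idx

def reward(connectivities, w=1.5, eps=0.01):
    """Eq. (1), assumed per-pair reading: C_jk below eps contributes
    -w * C_jk, any other C_jk contributes +C_jk."""
    return sum(-w * c if c < eps else c for c in connectivities)

def update_q(Q, s, a, r, s_next, alpha=0.9, gamma=0.9):
    """Eq. (2): move Q(s, a) towards r + gamma * max over a' of Q(s_next, a')."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]

# Toy run (made-up sizes): 3 high-significance genes give 2**3 states;
# 4 candidate genes plus 1 stop action give 5 actions.
n_states, n_actions = 2 ** 3, 5
Q = [[0.0] * n_actions for _ in range(n_states)]

s = state_index([0, 0, 0])          # nothing connected yet
s_next = state_index([1, 0, 1])     # chosen gene connects to top genes 1 and 3
r = reward([0.5, 0.005, 0.2])       # connectivity of the chosen gene to each top gene
new_q = update_q(Q, s, a=1, r=r, s_next=s_next)
```

With m = 10 high-significance genes, as used in this research, the same encoding would give a table of 1024 state rows, which is small enough to store as a dense array.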
Fig. 6 shows the soft thresholds of the GSE6475 dataset by their scale independence and mean connectivity. The 20th power was selected as the soft threshold for the differentially expressed genes to generate a topological overlap network, because it has a high scale independence, close to 0.8, and a low mean connectivity. After the selection of the soft threshold, network construction and module detection are performed using blockwiseModules() with the picked soft threshold and Pearson correlation. In this process, a topological overlap network is constructed and gene modules are identified by average linkage hierarchical clustering.

The differentially expressed gene data is then correlated with the acne trait to obtain the correlation between genes and the acne trait. This correlation is further processed with the “corPvalueStudent()” function to generate the gene significance for the differentially expressed genes [17]. In this process, the P-value of the correlation between the differentially expressed genes and the acne trait is calculated. To obtain a gene list of high gene significance, the correlation P-values are sorted in ascending order and the top 10 genes with the lowest P-values are extracted. This gene list of high gene significance is then passed to, and used by, the reward-penalty function in the Q-learning agent.

For the GSE108110 dataset, the gene list of high gene significance includes the genes with the Gene Entrez IDs 4321, 55509, 1000506776, 4826, 4329, 158830, 64761, 555619, 355, and 1200. The high gene significance genes used for the GSE53795 dataset are the genes with the Gene Entrez IDs 1786, 7264, 2769, 6283, 387695, 57016, 5266, 79155, 7378, and 338324. For the GSE6475 dataset, the gene list of high gene significance consists of the genes with the Gene Entrez IDs 2210, 4312, 10288, 3002, 5552, 57016, 5265, 1890, 3868, and 6702.

The higher the topological overlap of two genes, the higher their similarity, as they have common neighborhoods. Therefore, the data pre-processing reduces the redundancy of the dataset and prepares the data for configuring and inputting to the Q-learning algorithm.

2.6. Q-learning algorithm

Firstly, the differentially expressed genes matrix is passed into the environment module to generate a learnable environment for the Q-learning agent. The matrix is converted into an “actively expressed” pattern using the median of the gene expression level: a gene is marked as active, “1”, when its expression level is above the median expression level; otherwise the gene is marked as not expressed, “0”. In Q-learning, the data must form a Markovian domain so that it can be learned as a Markov Decision Process. Therefore, the input data needs to be configured as a Markovian domain. This configuration process includes configuring the data as an environment from which the Q-learning agent can retrieve a learnable state, and configuring the actions so that the Q-learning agent is able to execute an action in the environment. The actively expressed gene expression data is then configured as an environment. Then, the agent retrieves the state from the environment by comparing

Next, the agent executes an action, with epsilon setting the probability that the agent executes a random action rather than the action based on the Q-value table. The agent carries out exploration if it executes a random action; otherwise it carries out exploitation by executing the action that has the highest Q-value in the Q-value table.

After that, the agent calculates the reward based on the action taken: the gene-gene connectivity between the selected gene and the genes from the gene list of high gene significance is calculated as the reward by the reward-penalty function applied in the agent. Then, the next state is retrieved from the environment based on the action taken. Next, the agent updates the Q-value table. In this research, the number of learning episodes of the Q-learning agent for every sample is 200.

The Q-function is updated with an optimized Q-value until the learning iterations stop. The output of the model is the set of selected genes together with their selection frequency. The top 10 selected genes are then considered as the genetic markers.

2.7. Performance evaluation

To evaluate the performance of the Q-learning model, the genetic markers obtained from the model are assessed by classification with a logistic regression model and stratified five-fold cross-validation. In the classification, a ratio of 0.8:0.2 is used for splitting the data into the training set and test set. This ratio follows the one used in research on using reinforcement learning to select active gene signatures for identification in renal cell carcinoma [12]. Moreover, a ratio close to 0.8 empirically provides the best split between the training set and test set [8]. Thus, the genetic markers are used as classification features and evaluated for accuracy, sensitivity, specificity, and AUC. K-fold cross-validation divides the data into k groups and carries out the validation by using one of the folds as the test fold and the rest as training folds [16,23]. In this evaluation, the datasets are split into five folds using stratified five-fold cross-validation, in which the different datasets use different acne to non-acne ratios within the folds. For the GSE108110 dataset, the ratio of acne samples to non-acne samples is 0.33:0.67, because there are 18 acne-lesional samples and 36 non-acne lesional samples. For the GSE53795 dataset, there are 12 acne-lesional samples and 12 non-acne-lesional samples; hence, the ratio of acne samples to non-acne samples is 0.5:0.5. Lastly, for the GSE6475 dataset, the ratio of acne samples to non-acne samples is 0.33:0.67 because there are 6 acne-lesional samples and 18 non-acne lesional samples.

PubMed text data mining is used to provide biological context verification of the genes selected by the Q-learning agent. In the PubMed text data mining, the gene symbol of each gene selected by the Q-learning agent, which has been converted from the Gene Entrez ID, is used to find the relationship of that gene symbol with acne. This process is carried out by the get_pubmed_ids() function, which finds out whether the particular gene symbol appears in publications from PubMed. The keyword used for finding the genetic markers that have a relationship with acne in PubMed publications is “acne”. Thus, the result of the PubMed text data mining shows the direct relationship between the genetic markers
the gene-gene connectivity of the selected gene and the gene in the high to acne.
gene significance gene list with a value of ∊ used. If the value of the gene-
gene connectivity of the selected and the gene in the high gene signifi 3. Results
cance gene list is greater than the value of ∊ used, then the value of the
digit returns as 1. Hence, the number of digits for the state is 10 due to Table 2 shows the gene Entrez ID of the gene selected with its fre
there are 10 genes in the high gene significance gene list. Nevertheless, quency of being selected for three datasets, which are GSE108110,
the configuration of the action also is carried out. In this process, the GSE53795, and GSE6475. In this process, all the genes are recorded in
number of genes is calculated from the gene list that has been read from their gene Entrez ID. For the dataset of GSE108110, the top 10 genes
the gene expression data. Then, the agent generates an action space that selected are “21”, “100506779”, “942”, “5160”, “199”, “54440”,
contains the gene to be selected and evaluated by the Q-learning agent. “56894”, “91010”, “3109” and “128346”, and the gene with gene Entrez
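The environment configuration described in Section 2.6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `binarize_by_median` and `encode_state`, the use of a per-gene median, the connectivity values, and the threshold of 0.1 are all hypothetical, and `threshold` stands in for the ∊ value mentioned in the text.

```python
from statistics import median


def binarize_by_median(expression):
    """Map each gene's expression levels to 0/1 flags.

    `expression` maps a gene ID to its expression levels across samples.
    A level strictly above that gene's median is flagged 1 (active), else 0.
    Using a per-gene median is an assumption of this sketch.
    """
    active = {}
    for gene, levels in expression.items():
        cutoff = median(levels)
        active[gene] = [1 if level > cutoff else 0 for level in levels]
    return active


def encode_state(selected_gene, significant_genes, connectivity, threshold):
    """Return one digit per high-significance gene: 1 if connectivity > threshold."""
    return tuple(
        1 if connectivity.get((selected_gene, ref), 0.0) > threshold else 0
        for ref in significant_genes
    )


# High gene significance list for GSE6475, as given in the text (Entrez IDs).
significant_genes = [2210, 4312, 10288, 3002, 5552, 57016, 5265, 1890, 3868, 6702]

# Hypothetical connectivity values between candidate gene 5266 and two list members.
connectivity = {(5266, 5552): 0.42, (5266, 3868): 0.08}

state = encode_state(5266, significant_genes, connectivity, threshold=0.1)
print(state)  # ten digits, one per gene in the significance list

flags = binarize_by_median({5266: [2.0, 8.0, 5.0, 9.0]})
print(flags)
```

Representing the state as a tuple keeps it hashable, so it could serve directly as a key in a tabular Q-function; that choice is also an assumption of the sketch.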
Y.C. Chua et al. Egyptian Informatics Journal 26 (2024) 100484
Table 2
The frequency of the genes selected.

GSE108110                  GSE53795                   GSE6475
Gene Entrez ID  Frequency  Gene Entrez ID  Frequency  Gene Entrez ID  Frequency
21              1274       12              1364       120             523
100506779       181        3587            196        5266            389
942             148        9407            196        728             210
5160            147        7029            171        3868            193
199             139        5495            167        5552            165
54440           113        1830            163        6699            158
56894           85         6372            120        10261           156
91010           76         2237            114        11151           135
3109            69         2752            98         224             128
128346          59         7133            91         5320            115

Table 4
Comparative analysis of Q-learning and GSVA for the studied datasets in terms of AUC.

Dataset    p-value        95 % Confidence Interval  Difference
GSE108110  4.06 × 10^-8   (18.14, 20.56)            19.35
GSE53795   0.04           (−0.62, 5.94)             3.32
GSE6475    0.36           (−0.99, 2.11)             0.70

Table 5
Table of the conversion of Gene Entrez ID to Gene Symbol.

GSE108110                    GSE53795                     GSE6475
Gene Entrez ID  Gene Symbol  Gene Entrez ID  Gene Symbol  Gene Entrez ID  Gene Symbol
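The "Difference" column of Table 4 appears to be the absolute gap between the Q-learning and GSVA AUC values reported in Table 3. A minimal sketch of that arithmetic is given here; the function name `auc_comparison` and the 0.05 significance level are assumptions made for illustration, since the source does not state the level used.

```python
# AUC values (in %) as reported in Table 3; p-values as reported in Table 4.
auc_q_learning = {"GSE108110": 75.00, "GSE53795": 100.00, "GSE6475": 100.00}
auc_gsva = {"GSE108110": 94.35, "GSE53795": 96.68, "GSE6475": 99.30}
p_values = {"GSE108110": 4.06e-8, "GSE53795": 0.04, "GSE6475": 0.36}

ALPHA = 0.05  # assumed significance level; not stated in the source


def auc_comparison(q_auc, gsva_auc, p_vals, alpha=ALPHA):
    """Return {dataset: (Q-learning minus GSVA gap, absolute gap, p < alpha)}."""
    summary = {}
    for dataset in q_auc:
        signed_gap = round(q_auc[dataset] - gsva_auc[dataset], 2)
        summary[dataset] = (signed_gap, abs(signed_gap), p_vals[dataset] < alpha)
    return summary


comparison = auc_comparison(auc_q_learning, auc_gsva, p_values)
for dataset, (signed_gap, gap, below_alpha) in comparison.items():
    print(f"{dataset}: gap={signed_gap:+.2f}, |gap|={gap:.2f}, p<{ALPHA}: {below_alpha}")
```

The absolute gaps reproduce the tabulated differences (19.35, 3.32, and 0.70), while the signed gap makes explicit that GSE108110 is the dataset on which GSVA has the higher AUC.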
been identified genetic markers in acne vulgaris that are validated toward personalized diagnostic and therapeutic strategies [20]. The GSE108110 dataset achieved 81.80 % accuracy, 100 % specificity, and 75 % sensitivity. GSE53795 and GSE6475 both achieved 100 % accuracy, specificity, and sensitivity. However, in terms of AUC, this research outperformed GSVA only on GSE53795 and GSE6475.

Table 3
Summary of the accuracy, specificity, and sensitivity of the datasets.

           Q-Learning                                  GSVA [20]
Datasets   Accuracy  Specificity  Sensitivity  AUC    AUC
GSE108110  81.80 %   100 %        75 %         75 %   94.35 %
GSE53795   100 %     100 %        100 %        100 %  96.68 %
GSE6475    100 %     100 %        100 %        100 %  99.30 %

Table 4 presents the comparative analysis of Q-learning and GSVA for the GSE108110, GSE53795, and GSE6475 datasets. The table also states the differences in AUC values between the methods in percentage terms. The differences between the methods were statistically significant for GSE108110 and GSE53795, as supported by the p-values, whereas the difference for GSE6475 (p = 0.36) was not. Additionally, where Q-learning improved on GSVA, the minimum difference was 0.70 %.

For biological verification, the gene Entrez IDs of the genes most frequently selected by the Q-learning model are converted into gene symbols. Table 5 shows the gene symbols for the most frequently selected genes. In the biological validation, the gene symbols are used in PubMed text data mining to find which genes are related to acne. Table 6 shows the gene symbols with an acne publication record on PubMed. Eight gene symbols show a relationship with acne, as they have publication records about acne on PubMed. For GSE108110, the gene symbols CD86 and AGPAT3 have a relationship with acne. For GSE53795, the gene symbols TMPRSS11D, DSG3, and TNFRSF1B have a relationship with acne. For GSE6475, the gene symbols PI3, C5AR1, and KRT16 have a relationship with acne.

Table 6
Gene symbols with the acne publication record on PubMed.

Dataset    Gene Symbol  PubMed ID
GSE108110  CD86         26495013
           AGPAT3       32031713
GSE53795   TMPRSS11D    31838778
           DSG3         31337387
           TNFRSF1B     20556591
GSE6475    PI3          31838778
           C5AR1        31688984
           KRT16        36291580

4. Discussion

A limitation of the proposed method is that the only features used to identify the differentially expressed genes in the datasets are “patient_id” and “sample_type”. This is because the datasets used, GSE108110, GSE53795, and GSE6475, do not include further features such as the age or background of the patient. Without more information about the patient, factors other than the type of sample cannot be considered when identifying differentially expressed genes. For example, acne is a symptom of many endocrine disorders [22]. Furthermore, side effects of medication, such as taking medicine that contains lithium, can also cause acne [2]. Hence, it is hard to distinguish patients whose acne is due to endocrine disorders from patients whose acne is a side effect of medication by using gene expression level data alone. Therefore, the lack of patient information may limit the analysis.

The proposed method can select genes that are associated with acne from an action space that consists of many genes. For instance, the action space for GSE108110 consists of 749 gene candidates, the action space for GSE53795 consists of 1482 gene candidates and the
action space for GSE6475 consists of 370 genes. The gene symbols of the genes selected for GSE108110 are ABCA3, BZRAP1-AS1, CD86, PDHA1, AIF1, SASH3, AGPAT3, FMNL3, HLA-DMB and C1orf162. For GSE53795, the genes selected are SERPINA3, IL10RA, TMPRSS11D, TFDP2, PPM1B, DSG3, CXCL6, FEN1, GLUL, and TNFRSF1B. Furthermore, for GSE6475, the genes selected are ADD3, PI3, C5AR1, KRT16, SRGN, SPRR1B, IGSF6, CORO1A, ALDH3A2 and PLA2G2A. For GSE108110, the genes that have been biologically verified as having a relationship with acne are CD86 and AGPAT3. For GSE53795, the biologically verified genes are TMPRSS11D, DSG3, and TNFRSF1B, whereas for GSE6475 they are PI3, C5AR1, and KRT16.

According to the list of genes validated by PubMed text data mining, the publication on the type-1 macrophage marker (Gene Symbol: CD86), PubMed ID 26495013, shows that type-1 macrophages are highly phagocytic and play critical roles in infections and cancers to protect the host [28]. Liquiritigenin and isoliquiritin activated human monocytes by increasing CD86 expression and phagocytosis. By enhancing macrophage functions, this may be beneficial in a therapeutic strategy for skin inflammation disorders including acne. For TNF receptor superfamily member 1B (Gene Symbol: TNFRSF1B), the publication with PubMed ID 20556591 reports that the high frequency of the 196R allele in the functional M196R polymorphism of TNFR2 is a risk factor for acne vulgaris in Han Chinese [32]. For transmembrane serine protease 11D and peptidase inhibitor 3 (Gene Symbols: TMPRSS11D and PI3), the publication with PubMed ID 31838778 shows that these two genes and KRT16 (keratin 16) were found to characterize hidradenitis suppurativa/acne inversa (HS) from a molecular standpoint [40]. TMPRSS11D is released in the host defence system from the submucosal serous glands onto the mucous membrane. PI3 was found to be differentially regulated in HS by quantitative real-time PCR. PI3 functions as an antimicrobial peptide against Gram-positive and Gram-negative bacteria and fungal pathogens. It modulates a wide range of parameters that are critical for the inflammation process, such as NFκB pathway modulation, cytokine secretion and cell recruitment.

Fig. 7 shows the cumulative rewards against the number of episodes for the Q-learning agent. The purpose of this graph is to show the learning trend of the Q-learning model. A rising trend shows that the Q-learning model is gaining positive feedback as a reward for selecting genes that are more strongly connected to the high gene significance gene list, whereas a declining trend shows that the Q-learning model is receiving negative feedback as a penalty for selecting genes that are less strongly connected to the high gene significance gene list. This indicates that reinforcement learning for selecting genes associated with acne is being carried out.

The proposed Q-learning model has high accuracy and high specificity but is unstable in sensitivity. The proposed method can therefore correctly classify non-acne samples as non-acne samples and is less likely to predict a non-acne sample as an acne sample. However, its sensitivity is unstable, because it is low for GSE108110 but remains high for the other two datasets. Therefore, the proposed method is unstable in correctly classifying acne samples as acne samples and may produce false negative predictions.

5. Conclusion

The introduction of a Q-learning model for the selection of genetic markers associated with acne represents a significant advancement in the field of medical research. Q-learning, an AI-driven approach, allows the system to autonomously identify and select genes that play a crucial role in the development of acne. To achieve this, various data preprocessing techniques are applied, including handling missing data, deduplication of Gene Entrez IDs, and identification of differentially expressed genes. These processes help streamline the input data for the Q-learning model, making it more effective in identifying relevant genetic markers.

One of the key innovations in this research is the construction of a gene-gene connectivity matrix, which measures the relationships between different genes. This matrix, coupled with gene significance calculations, transforms the raw acne gene expression data into a format suitable for the Q-learning model to understand and learn from. Consequently, the Q-learning agent can efficiently pinpoint genetic markers associated with acne after this data preprocessing.

The study incorporates three different acne gene expression datasets from Gene Expression Omnibus (GEO), providing a comprehensive analysis. Notably, the results reveal specific genes associated with acne for each dataset, shedding light on the genetic factors contributing to this skin condition. For instance, genes like CD86 and AGPAT3 are linked to acne in one dataset, while TMPRSS11D, DSG3, and TNFRSF1B are implicated in another. These findings demonstrate the power of the proposed Q-learning model in uncovering genetic markers associated with acne across diverse datasets.

Moreover, the study employs logistic regression with five-fold cross-validation to classify the datasets, showcasing the practicality and robustness of the proposed model. It achieves high accuracy and specificity in identifying genetic markers selected by the Q-learning agent. However, it is important to note that the model exhibits some variability in sensitivity, indicating an area for potential improvement.

In summary, this research leverages AI-driven Q-learning to effectively identify genetic markers associated with acne. The model’s ability to autonomously select relevant genes, its innovative data preprocessing techniques, and its success in classifying acne gene expression datasets highlight its potential in advancing genetic marker identification not only for acne but potentially for other diseases as well. This work represents a significant step toward personalized treatments and a deeper understanding of the genetic basis of skin conditions.

Funding

This work was supported by the Ministry of Education Malaysia
through the Fundamental Research Grant Scheme (FRGS) 2021 (Grant number FRGS/1/2021/ICT03/UTM/02/2) and Sultan Mizan Antarctic Research Foundation (YPASM). The Universiti Teknologi Malaysia (UTM) also supported this work through UTMER 2021 (Grant number Q.J130000.3851.20J26) and UTMFR (Grant number Q.J130000.2551.20H71). The United Arab Emirates University also sponsored this work through the Strategic Research Program (Grant number 12R111) and the Research Start-up Program (Grant number 12M109).

CRediT authorship contribution statement

Yong Chi Chua: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Visualization, Writing – original draft. Hui Wen Nies: Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing. Izyan Izzati Kamsani: Formal analysis, Funding acquisition, Writing – review & editing. Haslina Hashim: Methodology, Validation, Writing – review & editing. Yusliza Yusoff: Conceptualization, Formal analysis, Validation, Writing – review & editing. Weng Howe Chan: Formal analysis, Investigation, Methodology, Validation, Writing – review & editing. Muhammad Akmal Remli: Funding acquisition, Project administration, Validation, Writing – review & editing. Yong Hui Nies: Formal analysis, Validation, Writing – review & editing. Mohd Saberi Mohamad: Funding acquisition, Project administration, Supervision, Validation, Writing – review & editing.

Acknowledgment

The authors would like to thank Ministry of Education Malaysia, Sultan Mizan Antarctic Research Foundation (YPASM), Universiti Teknologi Malaysia (UTM), United Arab Emirates University (UAEU), Universiti Malaysia Kelantan (UMK), and Universiti Kebangsaan Malaysia (UKM) for their support in making this research a success.

References

[1] Bungau AF, Radu AF, Bungau SG, Vesa CM, Tit DM, Endres LM. Oxidative stress and metabolic syndrome in acne vulgaris: pathogenetic connections and potential role of dietary supplements and phytochemicals. Biomed Pharmacother 2023;164:115003.
[2] Chaudhari D, Vohra RR, Ali MA, Nadeem H, Tarimci B, Garg T, et al. A Rare phenomenon of lithium-associated acne inversa: A case series and literature review. Cureus 2023;15(3).
[3] Chen B, Zheng Y, Liang Y. Analysis of potential genes and pathways involved in the pathogenesis of acne by bioinformatics. BioMed Res Int; 2019.
[4] Chen Y, Lun AT, Smyth GK. From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using rsubread and the edgeR quasi-likelihood pipeline. F1000Research 2016:5.
[5] Cheng Y, Chen T, Hu J. Genetic analysis of potential biomarkers and therapeutic targets in neuroinflammation from sporadic Creutzfeldt-Jakob disease. Sci Rep 2023;13(1):14122.
[6] Darvish Z, Ghanbari S, Afshar S, Tapak L, Amini P. Psoriasis associated hub genes revealed by weighted gene co-expression network analysis. Cell Journal (Yakhteh) 2023;25(6):418.
[7] Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics 2007;23(14):1846–7.
[8] Gholamy A, Kreinovich V, Kosheleva O. Why 70/30 or 80/20 relation between training and testing sets: A pedagogical explanation; 2018.
[9] Guttà C. Prognostication and prediction of cancer patient outcomes using AI-based classifiers; 2023.
[10] Heng AHS, Say YH, Sio YY, Ng YT, Chew FT. Gene variants associated with acne vulgaris presentation and severity: a systematic review and meta-analysis. BMC Med Genomics 2021;14(1):1–42.
[11] Holzinger A, Haibe-Kains B, Jurisica I. Why imaging data alone is not enough: AI-based integration of imaging, omics, and clinical data. Eur J Nucl Med Mol Imaging 2019;46(13):2722–30.
[12] Huang M, Ye X, Imakura A, Sakurai T. Sequential reinforcement active feature learning for gene signature identification in renal cell carcinoma. J Biomed Inform 2022;128:104049.
[13] Iftikhar A, Ghazanfar MA, Ayub M, Alahmari SA, Qazi N, Wall J. A Reinforcement learning recommender system using bi-clustering and Markov decision process. Expert Syst Appl 2023;121541.
[14] Janisch J, Pevný T, Lisý V. (2019, July). Classification with costly features using deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 3959-3966).
[15] Jia K, Wu Y, Ju J, Wang L, Shi L, Wu H, et al. The identification of gene signature and critical pathway associated with childhood-onset type 2 diabetes. PeerJ 2019;7:e6343.
[16] Karasiak N, Dejoux JF, Monteil C, Sheeren D. Spatial dependence between training and test sets: another pitfall of classification accuracy assessment in remote sensing. Mach Learn 2022;111(7):2715–40.
[17] Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf 2008;9(1):1–13.
[18] Li X, Jia Y, Wang S, Meng T, Zhu M. Identification of genes and pathways associated with acne using integrated bioinformatics methods. Dermatology 2019;235(6):445–55.
[19] Liang J, Chen Y, Wang Z, Wang Y, Mu S, Zhang D, et al. Exploring the association between rosacea and acne by integrated bioinformatics analysis; 2023.
[20] Lin Q, Cai B, Ke R, Chen L, Ni X, Liu H, et al. Integrative bioinformatics and experimental validation of hub genetic markers in acne vulgaris: Toward personalized diagnostic and therapeutic strategies. J Cosmet Dermatol 2024.
[21] Liu W, Wang W, Tian G, Xie W, Lei L, Liu J, et al. Topologically inferring pathway activity for precise survival outcome prediction: breast cancer as a case. Mol Biosyst 2017;13(3):537–48.
[22] Lolis MS, Bowe WP, Shalita AR. Acne and systemic disease. Medical Clinics 2009;93(6):1161–81.
[23] Lyu Z, Yu Y, Samali B, Rashidi M, Mohammadi M, Nguyen TN, et al. Back-propagation neural network optimized by K-fold cross-validation for prediction of torsional strength of reinforced Concrete beam. Materials 2022;15(4):1477.
[24] Mohammed A, Biegert G, Adamec J, Helikar T. Identification of potential tissue-specific cancer biomarkers and development of cancer versus normal genomic classifiers. Oncotarget 2017;8(49):85692.
[25] Nies HW, Mohamad MS, Zakaria Z, Chan WH, Remli MA, Nies YH. Enhanced directed random walk for the identification of breast cancer prognostic markers from multiclass expression data. Entropy 2021;23(9):1232.
[26] Qiu Y, Wang J, Lei J, Roeder K. Identification of cell-type-specific marker genes from co-expression patterns in tissue samples. Bioinformatics 2021;37(19):3228–34.
[27] Ritchie ME, Phipson B, Wu DI, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43(7):e47.
[28] Sekiguchi K, Koseki J, Tsuchiya K, Matsubara Y, Iizuka S, Imamura S, et al. Suppression of Propionibacterium acnes-induced dermatitis by a traditional Japanese medicine, jumihaidokuto, modifying macrophage functions. Evid-Based Complement Altern Med 2015;2015.
[29] Shen C, Wang QZ, Shen ZY, Yuan HY, Yu WJ, Chen XD, et al. Genetic association between the NLRP3 gene and acne vulgaris in a Chinese population. Clin Exp Dermatol 2019;44(2):184–9.
[30] Shim H, Hwang SJ, Yang E. Joint active feature acquisition and classification with variable-size set encoding. Adv Neural Information Process Syst 2018:31.
[31] Stöger K, Schneeberger D, Holzinger A. Medical artificial intelligence: the European legal perspective. Commun ACM 2021;64(11):34–6.
[32] Tian L, Xie H, Yang T, Hu Y, Li J, Wang W. TNFR 2 M196R polymorphism and acne vulgaris in Han Chinese: a case-control study. J Huazhong Univ Sci Technol [Medical Sciences] 2010;30(3):408–11.
[33] Vos T, Flaxman AD, Naghavi M, Lozano R, Michaud C, Ezzati M, et al. Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990–2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet 2012;380(9859):2163–96.
[34] Wang CC, Li CY, Cai JH, Sheu PCY, Tsai JJ, Wu MY, et al. Identification of prognostic candidate genes in breast cancer by integrated bioinformatic analysis. J Clin Med 2019;8(8):1160.
[35] Wang T, Bao X, Clavera I, Hoang J, Wen Y, Langlois E, et al. Benchmarking model-based reinforcement learning. arXiv preprint arXiv:1907.02057; 2019.
[36] White III CC, White DJ. Markov decision processes. Eur J Oper Res 1989;39(1):1–16.
[37] Wu C, Xu H, Bai D, Chen X, Gao J, Jiang X. Public perceptions on the application of artificial intelligence in healthcare: a qualitative meta-synthesis. BMJ Open 2023;13(1):e066322.
[38] Yang L, Shou YH, Yang YS, Xu JH. Elucidating the key biomarkers and immune infiltration in acne by integrated bioinformatics analysis; 2021.
[39] Zhou X. Optimal values selection of Q-learning Parameters in Stochastic Mazes. In Journal of Physics: Conference Series (Vol. 2386, No. 1, p. 012037). IOP Publishing; 2022.
[40] Zouboulis CC, Nogueira da Costa A, Makrantonaki E, Hou XX, Almansouri D, Dudley JT, et al. Alterations in innate immunity and epithelial cell differentiation are the molecular pillars of hidradenitis suppurativa. J Eur Acad Dermatol Venereol 2020;34(4):846–61.