On The Road To Explainable AI in Drug-Drug Interactions Prediction: A Systematic Review
ARTICLE INFO

Article history:
Received 12 March 2022
Received in revised form 15 April 2022
Accepted 15 April 2022
Available online 19 April 2022

Keywords:
Explainable artificial intelligence
Drug-drug interaction
Machine learning
Deep learning
Chemical structures
Natural language processing

ABSTRACT

Over the past decade, polypharmacy has become common in the treatment of multiple diseases. However, unwanted drug-drug interactions (DDIs) that might cause unexpected adverse drug events (ADEs) in multiple-regimen therapy remain a significant issue. Since artificial intelligence (AI) is ubiquitous today, many AI prediction models have been developed to predict DDIs and support clinicians in pharmacotherapy-related decisions. However, even though DDI prediction models have great potential for assisting physicians in polypharmacy decisions, there are still concerns regarding the reliability of AI models due to their black-box nature. Building AI models with explainable mechanisms can augment their transparency and address this issue. Explainable AI (XAI) promotes safety and clarity by showing how decisions are made in AI models, especially in critical tasks like DDI prediction. This review provides a comprehensive overview of AI-based DDI prediction, including the publicly available sources for AI-DDI studies, the methods used in data manipulation and feature preprocessing, the modeling methods, and the XAI mechanisms that promote trust in AI for critical tasks such as DDI prediction. Limitations and future directions of XAI in DDIs are also discussed.

© 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Contents
1. Introduction
2. Study selection
3. Dataset, input data, and features for AI-DDIs studies
    3.1. DDIs information retrieved from text-based sources
    3.2. Molecule-based input data and feature preprocessing for DDIs prediction
4. Conventional ML-based prediction models of DDIs
    4.1. Single ML algorithm-based predictive model
    4.2. Ensemble learning predictive model
5. Deep learning-based prediction model of DDIs
    5.1. Artificial neural network (ANN)
    5.2. Convolutional neural network (CNN)
        5.2.1. Conventional CNN
        5.2.2. Dependency-based CNN
        5.2.3. Deep CNN
    5.3. Graph convolutional neural network (GCNN)
⇑ Corresponding author at: Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 106, Taiwan.
E-mail addresses: [email protected], [email protected] (N.Q.K. Le).
1 These authors contributed equally to this work.
https://doi.org/10.1016/j.csbj.2022.04.021
2001-0370/© 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Thanh Hoa Vo, Ngan Thi Kim Nguyen, Quang Hien Kha et al. Computational and Structural Biotechnology Journal 20 (2022) 2112–2123
1. Introduction

Drug-drug interactions (DDIs) usually occur in polypharmacy, when the effects of one drug alter those of others in a combined regimen. Ideally, treatment yields synergistic action and therapeutic benefit. However, in the treatment of multiple diseases, adverse drug events (ADEs) that cause toxicity or reduce the treatment effect may also occur. These can eventually lead to increased morbidity and mortality in patients [1-3]. In addition, the frequent launch and approval of new drugs, and of new indications for marketed medicines, introduce more possible DDIs [4,5]. However, wet-lab experiments for verifying DDIs drain researchers' time and resources, which makes them difficult to run routinely and at scale. Therefore, artificial intelligence (AI) models have been applied to predict DDIs [6-9]. These models have been continuously studied and improved along with the expansion and completeness of drug-database resources to support clinical decisions.

However, since the introduction of AI models in DDI recognition, many efforts have sought to boost the predictive power of algorithms through more complex systems, turning these models into so-called "black-box AI" that hinders users' ability to explain how the models work [10]. Specifically, higher-performance models are associated with more sophisticated systems, whereas lower-performance tools with simple approaches are easier to comprehend [11]. Despite the various benefits of widespread industrial adoption of machine learning (ML) models, a critical domain such as healthcare should be treated more seriously because of its immense value to humans. Additionally, from a human-oriented research angle, the opacity of complicated models in making predictive decisions hampers their successful adoption in medical settings, as systems that cannot be interpreted are difficult to trust. Since the fundamental application of AI in drug treatment must first deal with DDIs, explainable DDI-AI models are pivotal for clinicians and patients to understand and trust their predictions. In response, the field of explainable artificial intelligence (XAI), which concentrates on methods to interpret ML models, has revived in recent years. XAI can facilitate clinical applications of DDI prediction models, which require robust yet human-understandable systems that provide clear justifications and promote safety, reliability, and transparency.

This review focuses on the advances of recently developed DDI prediction models regarding their data manipulation technique, feature selection process, modeling approach, and XAI method, and on the challenge of assuring the explainability and transparency of DDI prediction models without compromising their predictive power.

2. Study selection

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline was referenced when conducting the literature review [12]. We searched five electronic databases up to December 2021: Cochrane Library, PubMed, EMBASE, IEEE, and Scopus. The search strategy combined the Medical Subject Headings terms and free terms "drug drug interaction" or "drug-drug interaction", in combination with "artificial intelligence" or "machine learning" or "deep learning" or "neural network" and "prediction model".

The eligibility criteria consisted of DDI predictive models that were built using ML- and/or DL-based algorithms. The articles were screened and selected independently by two reviewers (N.T.K.N. and H.T.V.), and disagreements were resolved by the third reviewer (N.Q.K.L.). All the retrieved publications were entered into reference-manager software (EndNote X9, Excel 2018).

We identified 643 records through the Cochrane Library, IEEE, PubMed, EMBASE, and Scopus databases, and two records from the reference lists of review papers. After removing 215 duplicates, 116 records were excluded according to the screening of titles and abstracts. Of the 314 remaining research studies, 220 were removed after evaluating the selection criteria: (1) related to DDIs, (2) related to a predictive model, (3) focused on ML and/or DL. As a result, we had 94 different research studies. Fig. 1 shows the flow diagram of the systematic search. Table 1 shows the detailed information of the 94 selected studies.

The flowchart of the AI-based DDI prediction model is illustrated in Fig. 2. Following this flowchart, we conduct our review based on two main aspects: input data (DDI extraction and feature preprocessing) and AI algorithms (traditional machine learning and deep learning). The evolution of DDI prediction models separated by these two aspects is also shown in Fig. 3.

3. Dataset, input data, and features for AI-DDIs studies

In response to the growing number of pharmaceutical drugs entering the market over the past decades, many drug-related information databases have been updated and expanded to facilitate DDI prediction [13-15]. Generally, most DDI studies referred to datasets from DDIExtraction 2011 [16,17], DDIExtraction 2013 [18], and the DrugBank database [19]. These public sources provide various types of drug characteristics and DDI events that AI approaches can leverage for DDI discovery. Quantitative information about the DDIs is a necessary part of creating the described system. The data records are usually binary, encoded as 1 if there is an interaction between two drugs and 0 if there is no known interaction.

Depending on how each approach views DDI features, appropriate data extraction and feature preprocessing methods for DDI prediction tasks can be applied.

3.1. DDIs information retrieved from text-based sources

This method involves extracting DDI information from biomedical text, especially scientific literature, since these sources represent valuable information for the retrieval of knowledge about the interaction between drugs. The amount of biomedical literature, which holds a vast amount of DDIs, has been growing over the past years and facilitating many DDIs extracting
studies [20-22]. Aside from studies using publicly available DDI corpora [23,24], some studies have also used additional user-generated content to compensate for delayed updates of medical databases [25,26]. In addition, multi-source DDI corpora have been constructed from FDA adverse event reports [27,28], electronic health records (EHRs) [29,30], or by following specific annotation guidelines [31].

In these DDI extraction approaches, feature preprocessing is essential. Tokenization and lowercasing are the first vital steps in reducing the sparsity of the feature space. Many dimensionality-reduction text preprocessing techniques have also been used for DDI extraction. Compression techniques such as sentence pruning [32] and anaphora resolution [33] have been applied; Zhao used a syntax word embedding strategy [34] instead of the common word embedding technique; and some studies used Bidirectional Encoder Representations from Transformers (BERT), which relies on the attention mechanism to capture high-quality contextual information [35,36]. The domain-specific ontology approach uses the sequence of each entity's ancestors in the ontology to represent that entity [37]. Bokharaeian et al. [31] proposed clause dependency features to improve relation extraction performance. Also, Ben Abacha et al. [38] used a CRF-based algorithm trained on a set of linguistic and semantic features for drug name recognition. The DDI extraction task was then built on a hybrid of feature-based and kernel-based machine learning approaches. Moreover, the imbalanced class distribution problem has been considered in many articles, since it can diminish classification power [39,40]. Liu et al. used several rules to filter negative instances [41]; others added random negative sampling as part of an active learning algorithm [42] or used a focal loss function to mitigate this problem [43].

3.2. Molecule-based input data and feature preprocessing for DDIs prediction

Usually, DDI studies use information on chemical, molecular, and pharmacological properties to elucidate drug interactions. The chemical properties of drugs are typically described via the simplified molecular-input line-entry system (SMILES). This flexible chemical notation allows the generation of computer-readable input [44]. These SMILES structural representations of drugs are post-processed to capture features of drug pairs associated with DDI events [45]. Moreover, pharmacological properties such as targets [8,46], enzymes, transporters, genes and proteins [6,47], and interaction pathways like enzymes and transporters [48-61] can also be used to represent drug features through a set of descriptors. Network interaction mining [62-64] and molecular graph representations have also been used to describe substructures of drugs that come in distinctive shapes and sizes, or the structural relations between entities [65-68]. Additionally, to overcome the lack of overlap between chemical content and biological characteristics, many studies have applied combined structure-based input that includes both chemical and biological data, hybridizing cheminformatics and bioinformatics techniques to link chemical information and biological effects for DDI discovery [69-71].

Table 1
Input data type of all papers reviewed in this study.
TML: traditional machine learning; DL: deep learning; '-': the information was not reported in the original paper.

Many techniques have also been applied to cover multiple pharmacological facets of DDIs by admitting heterogeneous characterizations from various data sources that represent different drug characteristics and physiological effects [72-74]. Knowledge graph (KG)-based features integrated from multiple sources such as DrugBank, PharmGKB, and KEGG drugs [75] were used to overcome the limited-information issue in single-source methods. Along with this, some efforts have been made to address the problem of increased noise in the integrated similarity. The similarity selection heuristic ranks the similarity matrices by the entropy calculated for each matrix and computes their pairwise distances, so that the final selection minimizes redundancy [76,77].

Constructing classification features usually requires a similarity analysis of paired drugs. In most studies, chemical structural similarity was measured using the structures of the drug compounds in DrugBank, represented by their SMILES [6]. Structural representations of the drugs can be constructed using different molecular fingerprint generation techniques. The principle of these techniques is to represent a molecule as a bit vector in which each assigned bit position codes the presence or absence of a specific structural feature. Similarity between molecular fingerprints is calculated using different methods; one commonly applied technique uses the Tanimoto coefficient [8,48,78]. Besides, many studies combine various drug-drug similarity measures representing relations between chemical, molecular, physiological, or target pathways of drugs to gain more helpful information for the DDI prediction task [79,80]. On the other hand, the network-based feature processing method exploits the topological properties of the DDI network. Node2vec for Feature Network (FN) construction was used in [81] to present drug features as low-dimensional feature vectors.

4. Conventional ML-based prediction models of DDIs

Given advances in computer science and growing network pharmacology approaches, traditional ML-based models using multi-dimensional drug properties have been widely applied as a promising strategy to predict unknown DDIs [82,83].

4.1. Single ML algorithm-based predictive model

Support vector machine (SVM) was a common algorithm used to predict DDIs due to its high performance, with AUC values ranging broadly from 0.565 to 0.985 [6,19,54,84-87]. The number of features used also plays a role in the predictive model; e.g., one study applied a feature reduction method and achieved an increase of about 0.02 in the F-measure score (0.5786 vs. 0.5965) of the predictive model [86]. Kernel machines are a class of algorithms for pattern analysis whose best-known member is the SVM. Kernel classifiers used for classifying drug pairs include the all-paths graph (APG), k-band shortest path spectrum (kBSPS), and shallow linguistic (SL) kernels [17,31,88,89]. Notably, Thomas et al. [17] showed that SL and APG outperformed other methods, such as case-based reasoning and ensemble learning, based on F1-score (0.606 vs. 0.416 and 0.583, respectively). Also, Zhang et al. [90] used label propagation algorithms to handle the scenario where only a small portion of nodes in the undirected weighted network is labeled. Meanwhile, the logistic regression (LR) algorithm has been used less often to establish DDI prediction models. Xie et al. [91] integrated active learning, random negative sampling, and uncertainty sampling in clinical safety DDI information retrieval (DDI-IR) analysis using SVM and LR. In addition, the Drug-Entity-Topic (DET) model, which follows Bayes' rule, is an example of leveraging augmented text-mining features to improve prediction performance in terms of discrimination and calibration [73]. Due to the growing demand for adverse DDI (ADDI) signal detection, a Bayesian network framework and domain knowledge were combined to identify direct associations between a combination of medicines and the target ADEs [92]. Furthermore, the gradient boosting-based algorithm XGBoost was employed to achieve robust DDI prediction even for drugs whose interaction profiles were completely unseen during training [60]. XGBoost performed better than or comparably to other algorithms, such as SVM, random forest, and standard gradient boosting, in terms of predictive performance and speed in DDI prediction [49,60].

4.2. Ensemble learning predictive model

Ensemble methods use multiple learning algorithms to obtain better predictive performance than separate models in DDI prediction [17,33,48,72,93,94]. Combined ML algorithms using LibLINEAR, which consists of linear SVM, Naïve Bayes, and Voting Perceptron classifiers, outperformed the model trained on the original (unbalanced) training corpora based on F-score (70.4% vs. 69.0%) [95]. Similarly, a heterogeneous network-assisted inference (HNAI) framework consisting of five different ML algorithms, namely Naive Bayes (NB), decision tree (DT), k-nearest neighbors (k-NN), LR, and SVM, was proposed to detect unknown DDIs with an AUC of 0.67, higher than that of the separate algorithms (NB: 0.66, DT: 0.565, k-NN: 0.6, LR: 0.655, and SVM: 0.666) [6]. Other ensemble methods, including a genetic algorithm and LR in a classifier ensemble rule for DDI prediction, could obtain an AUC value up to 1 and accuracy > 90%, regardless of approved and unproved drug pairs being selected [48]. One of the significant concerns in developing a high-accuracy DDI prediction model is integrating heterogeneous drug features. Thus, Zhang et al. [62] proposed a multi-modal deep
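
The ensemble idea described in Section 4.2 can be illustrated with a minimal soft-voting sketch. It assumes that each base classifier outputs a probability that a drug pair interacts and that the ensemble simply averages those probabilities; the reviewed studies do not necessarily combine their models this way, and the function name `soft_vote`, the threshold, and the probabilities below are hypothetical values chosen for illustration only.

```python
from statistics import mean


def soft_vote(probabilities, threshold=0.5):
    """Average per-model interaction probabilities and threshold the mean.

    Returns the averaged score and a binary label
    (1 = interaction predicted, 0 = no known interaction predicted).
    """
    score = mean(probabilities)
    return score, int(score >= threshold)


# Hypothetical outputs of five base classifiers for a single drug pair.
# The model names mirror those listed in the text; the numbers are made up.
base_model_probs = {
    "NB": 0.62,
    "DT": 0.40,
    "k-NN": 0.55,
    "LR": 0.71,
    "SVM": 0.68,
}

score, label = soft_vote(list(base_model_probs.values()))
print(f"ensemble score = {score:.3f}, predicted label = {label}")
```

In this toy case the decision tree alone would predict no interaction, while the averaged score crosses the threshold. Weighted averaging or a learned meta-classifier are common alternatives when base models differ strongly in quality.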
DDIs extraction task. Sun et al. [112] proposed a DCNN model that uses a small convolution architecture operating directly at the word level of the raw biomedical text input to obtain embedding-based convolutional features. A softmax classifier then operates on these features to extract DDIs from biomedical literature.

5.3. Graph convolutional neural network (GCNN)

In many DDI prediction approaches, the molecular structure of drugs has been extensively exploited to extract the drug characteristics linked to DDI events. GCNNs were introduced to DDI prediction for non-Euclidean domains, where complex relationships and interdependencies between molecular structure representations of drugs, or interactions between drug targets, are represented as graphs [113]. The most fundamental part of a GCNN is a graph, a data structure consisting of two components: nodes and edges [101]. The nodes usually represent drugs, and edges are associated with interactions between nodes [114]. The first graph convolutional network was proposed by Bruna et al. [115] for applying neural networks to graph-structured data. Also, a model called SC-DDIS, introduced by Liu et al. [74], can learn the final embedding of drugs via a graph spectral CNN. Besides, it deals with multiple complex structured entities that consist of two graph types: a local graph for structured entities and a global graph to capture the interactions between structured entities. Wang et al. [85] proposed a graph-of-graphs neural network model called GoGNN that extracts features in both graphs in a hierarchical fashion to improve DDI prediction performance.

5.4. Recurrent neural network

RNNs are widely used in NLP [116,117] and mainly deal with sequential data. What makes RNNs differ from CNNs is their memory mechanism, which takes information from prior inputs to influence the current input and output. The DDI extraction task is considered a relation extraction task in NLP. Many have used the long short-term memory (LSTM) network to extract DDIs from literature [118-120]. Char-RNNs are more common for modeling morphologically richer languages [121] and were introduced for text classification [122]; nevertheless, Kavuluru et al. [123] also considered the role of character-level embedding in DDI extraction, and they used an LSTM on the character embedding to extract the word vectors.

Luo et al. [57] presented a model that used an LSTM for DDI prediction in diabetes using embedded drug-induced transcriptome data. The LSTM is a typical RNN architecture introduced by Hochreiter and Schmidhuber [124] to deal with the problem of long-term dependencies. In an LSTM, cells in the hidden layers contain an input gate, an output gate, and a forget gate to control the flow of information required for the prediction. Also, the gated recurrent unit (GRU) was introduced to address the short-term memory problem of RNN models [125]. However, unlike the LSTM, GRUs use hidden states and two gates, a reset gate and an update gate, to control the information retained for the prediction.

For the DDI extraction task, a hierarchical RNN was introduced by Zhang et al. [33]. This framework considers the shortest dependency path (SDP) between two entities and uses the RNN to learn the feature representation of the sentence sequence and the SDP for extracting DDIs. Zhou et al. [126] introduced an attention-based BiLSTM model to encode biomedical text sentences.

Besides, considering the difference between a DDI instance and a typical sentence, Jiang et al. [127] used a skeleton structure to represent DDI instances and an LSTM model to work with the structure (skeleton-LSTM). In their framework, a sentence is first tokenized into token units, each followed by a corresponding skeleton unit, the distance to the first drug, and the distance to the second drug. These units are input to the embedding layer of the skeleton-LSTM.

However, the traditional encoder-decoder architecture using an RNN or LSTM retains several drawbacks, as it can cause information loss, especially for long sentences. The attention mechanism has been applied to deal with this problem [128]. The model proposed by Yi et al. [129] used a bidirectional RNN layer to generate a sentence matrix as the word's semantic representation. An attention layer is then applied to create the final representation by combining several relevant sentences about the same drug pair, and a softmax classifier is used to classify specific DDIs. Zheng et al. [130] also introduced a model to classify DDIs from texts using a combined attention mechanism and an RNN with LSTM units.

6. Interpretability methods in XAI and XAI in DDIs prediction

The surge in the predictive performance of AI tools is achieved by increasing model complexity. This turns these models into black-box systems and causes uncertainty regarding their operating mechanism. This ambiguity hinders the wide adoption of AI models in critical domains like healthcare. As a result, eXplainable Artificial Intelligence (XAI) focuses on understanding the reasoning behind the predictions of AI models to accommodate the demand for transparency in AI tools. Interpretability methods for AI models can be classified based on the type of algorithm, the interpretation scale, and the data type [131]. Additionally, based on the purpose of interpretability, approaches can be categorized as white-box model creation, black-box model explanation, enhancement of model fairness, and predictive sensitivity testing [132].

In terms of methods to explain DL models, the gradient-based attribution method [133] attempts to explain predictions by attributing them to the network's input features. This method is often applied when predictions are made by a DNN system and can therefore be a potential approach for some black-box DNN models in DDI prediction, such as [110,112]. Moreover, DeepLIFT is a popular algorithm applied on top of DNN models that showed considerable advantages over gradient-based methods [134]. On the other hand, the Guided BackPropagation method can be applied to network structures [135]. Under this method, a convolutional layer with increased stride can replace max-pooling in a CNN to deal with accuracy loss. This approach suggests a potential application in some CNN-based DDI prediction models such as [111]. On top of this, the method in [136] was proposed for NLP-based neural networks. It uses rationales (small pieces of input text) and tries to produce the same prediction as the full-text input. Its architecture consists of two components, a generator and an encoder, that look for text subsets highly related to the prediction result. Since the DDI extraction task is conducted via NLP-based models [109,114], the above methods should be considered to promote the clarity of these models.

Apart from this, methods to create white-box models, such as linear, decision tree, and rule-based models, or sophisticated yet transparent models, have also been proposed in XAI. However, due to their limited predictive power, especially in NLP-based domains such as the DDI extraction task, these approaches have attracted less interest. Additionally, various methods have been proposed to tackle fairness in AI. Nevertheless, very few of these studies considered fairness in non-tabular data such as the text-based information used for DDI extraction. While many DDI studies applied the word embedding method [62,109], it was revealed that vectorized representations of text data could carry strong bias [137]. Therefore, methods to assure fairness should be given more consideration in DDI studies. Furthermore,
some methods aim to analyze the sensitivity of AI models to ensure the reliability of those tools. In adversarial example-based sensitivity analysis, Zügner et al. [138] used this approach to study graph-structured data. This method modifies node connections or node features to attack node classification models. Since graph-based methods are widely applied in DDI studies [67,68], approaches such as this suggest potential applications in DDI prediction models. Perturbations to the word embeddings [139] in RNNs should also be considered. Notably, the input reduction method of Feng et al. [140], which reveals oversensitivity in NLP models, is a possible approach for DDI extraction studies. The literature on exposing the weaknesses of DL models in NLP tasks is extensive; however, applications to DDI NLP models are still limited.

In the DDI study of Schwarz et al. [61], the authors attempted to make their model interpretable using attention scores computed at all layers of the model. Using these scores, the contribution of the similarity matrices to the drug representation vectors is determined, and the drug characteristics that lead to better encoding are selected. This approach leverages information that passes through all layers of the network.

Although traditional ML has performed effectively in extracting DDIs, even from unstructured package inserts (also known as drug product labels) [87], conventional ML-based methods still have several drawbacks. ML-based models learn from positive and negative data, which is difficult in real-world domains because of the lack of true negative DDIs, or a “gold standard” non-DDI set. Therefore, it is necessary to identify positive data from large amounts of unlabeled data containing both positive and negative samples, and to avoid biased sampling through random negative sampling and validation set updating. Additionally, it is unknown whether there is a DDI between two drugs in a negative-class dataset, because some new DDI pairs may not have been reported yet. Another issue is that different types of DDI data, such as clinical drug safety and pharmacokinetic data, have different targeted samples and proportions in DDI-relevant databases or articles. Also, it is time-consuming to build annotated corpora and determine optimal parameters in traditional ML-based methods. Hence, DNN models, including CNNs and sequential neural networks such as RNNs, have been regarded as an optimal solution for feature selection and DDI extraction without complicated feature engineering [120]. However, we believe that several paths should be investigated in future work. First, drug-related textual data sources, such as patent information, are essential. Second, it remains unknown how to use drug domain knowledge or semi-structured drug information, such as paragraphs that describe the pharmacodynamics or mechanism of action, protein binding, or experimental properties of a drug, in building predictive models.

In addition, DL, with its superior performance and capability to automatically generate hierarchical input for classification tasks, has gained huge research attention in the DDI prediction domain. Still, these DL methods are neither easily explainable nor commonly trusted by medical staff because of their explainability deficiency. In the DDI prediction field, only a few studies have considered the explainable aspect of their models, which leaves plenty of room to improve, innovate, and ensure both predictive performance and model interpretability in ML-based DDI prediction models. We therefore think that approaches to explain black-box models, methods to create high-accuracy white-box models, strategies to ensure model fairness, and strict sensitivity analyses of models in DDI prediction should be given more consideration in the coming years, to build trust in and fairness of these models' performance and bring them closer to clinical application. Since XAI aims to explain machine learning models, its application does not reduce the accuracy of current models. Also, further studies could examine whether XAI sacrifices accuracy in the DDI extraction task (NLP), where text-based approaches are usually used to replenish databases and the dependencies found can be refined against the initial sources. Addressing this may open a new road for the application of XAI in DDI prediction in the future, especially for the DDI extraction task using NLP.

8. Conclusion

The management of DDIs, which can cause ADEs and affect patients' health, plays a crucial role in pharmacovigilance and medical practice. The main contribution of this study is the establishment of a detailed taxonomy of existing models for predicting DDIs. Despite remarkable breakthroughs in DDI prediction over the past years, weaknesses in model interpretability have exposed considerable limits. We therefore believe that XAI in DDI prediction still holds much potential to unlock in future studies.

Thanh Hoa Vo: Conceptualization, Methodology, Formal analysis, Data curation, Writing – original draft, Writing – review & editing, Visualization. Ngan Thi Kim Nguyen: Methodology, Formal analysis, Validation, Writing – original draft, Writing – review & editing, Visualization. Quang Hien Kha: Validation, Data curation. Nguyen Quoc Khanh Le: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization, Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the Ministry of Science and Technology, Taiwan [grant number MOST110-2221-E-038-001-MY2].

References

[1] Askari M et al. Frequency and nature of drug-drug interactions in the intensive care unit. Pharmacoepidemiol Drug Saf 2013;22(4):430–7.
[2] Raschetti R et al. Suspected adverse drug events requiring emergency department visits or hospital admissions. Eur J Clin Pharmacol 1999;54(12):959–63.
[3] Budnitz DS et al. National surveillance of emergency department visits for outpatient adverse drug events. JAMA 2006;296(15):1858–66.
[4] Reis AM, Cassiani SH. Evaluation of three brands of drug interaction software for use in intensive care units. Pharm World Sci 2010;32(6):822–8.
[5] Vonbach P et al. Evaluation of frequently used drug interaction screening programs. Pharm World Sci 2008;30(4):367–74.
[6] Cheng F, Zhao Z. Machine learning-based prediction of drug–drug interactions by integrating drug phenotypic, therapeutic, chemical, and genomic properties. J Am Med Inform Assoc 2014;21(e2):e278–86.
[7] Ryu JY, Kim HU, Lee SY. Deep learning improves prediction of drug–drug and drug–food interactions. Proc Natl Acad Sci U S A 2018;115(18):E4304.
[8] Vilar S et al. Similarity-based modeling in large-scale prediction of drug-drug interactions. Nat Protoc 2014;9(9):2147–63.
[9] Vilar S, Uriarte E, Santana L, Tatonetti NP, Friedman C. Detection of drug-drug interactions by modeling interaction profile fingerprints. PLoS ONE 2013;8(3):e58321.
[10] Gunning, D., et al., XAI-Explainable artificial intelligence. Science Robotics, 2019. 4(37): p. eaay7120.
[11] David Gunning, M.S., Jaesik Choi, Timothy Miller, Simone Stumpf and Guang-Zhong Yang, XAI Explainable artificial intelligence. Sci. Robotics, 2019. eaay7120.
[12] Page MJ et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2020;2021:372.
[13] Wishart DS et al. Drugbank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 2006;1(34) (D668-72. 16381955.).
[14] Whirl-Carrillo M et al. An evidence-based framework for evaluating pharmacogenomics knowledge for personalized medicine. Clin Pharmacol Ther 2021.
[15] Kanehisa M et al. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res 2015;44(D1):D457–62.
[16] García Blasco, S., et al. Automatic drug-drug interaction detection: A machine learning approach with maximal frequent sequence extraction. in CEUR Workshop Proceedings. 2011. CEUR Workshop Proceedings.
[17] Thomas, P., et al., Relation extraction for drug-drug interactions using ensemble learning. 1st Challenge task on Drug-Drug Interaction Extraction (DDIExtraction 2011), 2011: p. 11-18.
[18] Björne, J., S. Kaewphan, and T. Salakoski. UTurku: drug named entity recognition and drug-drug interaction extraction using SVM classification and domain knowledge. in Second Joint Conference on Lexical and Computational Semantics (* SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 2013.
[19] Hailu, N., L. Hunter, and K.B. Cohen. UColorado_SOM: extraction of drug-drug interactions from biomedical text using knowledge-rich and knowledge-poor features. in Second Joint Conference on Lexical and Computational Semantics (* SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 2013.
[20] Zhang Y et al. A hybrid model based on neural networks for biomedical relation extraction. J Biomed Inform 2018;81:83–92.
[21] Lim S, Lee K, Kang J. Drug drug interaction extraction from the literature using a recursive neural network. PLoS ONE 2018;13(1):e0190926.
[22] Liu J et al. Drug-Drug Interaction Extraction Based on Transfer Weight Matrix and Memory Network. IEEE Access 2019;7:101260–8.
[23] Allahgholi M et al. ADDI: Recommending alternatives for drug-drug interactions with negative health effects. Comput Biol Med 2020;125:103969.
[24] Zhang, Y., et al., Extracting drug-enzyme relation from literature as evidence for drug drug interaction. J Biomed Semantics, 2016. 7: p. 11-11.
[25] Xu B et al. Incorporating User Generated Content for Drug Drug Interaction Extraction Based on Full Attention Mechanism. IEEE Trans Nanobioscience 2019;18(3):360–7.
[26] Xu B et al. Full-attention Based Drug Drug Interaction Extraction Exploiting User-generated Content. 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018.
[27] Liu N, Chen CB, Kumara S. Semi-Supervised Learning Algorithm for Identifying High-Priority Drug-Drug Interactions Through Adverse Event Reports. IEEE J Biomed Health Inform 2020;24(1):57–68.
[28] Zhang W et al. Predicting potential drug-drug interactions by integrating chemical, biological, phenotypic and network data. BMC Bioinf 2017;18(1):18.
[29] Pathak J, Kiefer RC, Chute CG. Using linked data for mining drug-drug interactions in electronic health records. Stud Health Technol Inform 2013;192:682–6.
[30] Duke JD et al. Literature based drug interaction prediction with clinical assessment using electronic medical records: novel myopathy associated drug interactions. PLoS Comput Biol 2012;8(8):e1002614.
[31] Bokharaeian B, Diaz A, Chitsaz H. Enhancing extraction of drug-drug interaction from literature using neutral candidates, negation, and clause dependency. PLoS ONE 2016;11(10):e0163480.
[32] Park C, Park J, Park S. AGCN: Attention-based graph convolutional networks for drug-drug interaction extraction. Expert Syst Appl 2020;159:113538.
[33] Zhang Y et al. Drug–drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths. Bioinformatics 2017;34(5):828–35.
[34] Zhao Z et al. Drug drug interaction extraction from biomedical literature using syntax convolutional neural network. Bioinformatics 2016;32(22):3444–53.
[35] Warikoo N, Chang Y-C, Hsu W-L. LBERT: Lexically aware Transformer-based Bidirectional Encoder Representation model for learning universal bio-entity relations. Bioinformatics 2020;37(3):404–12.
[36] Zhu Y et al. Extracting drug-drug interactions from texts with BioBERT and multiple entity-aware attentions. J Biomed Inform 2020;106:103451.
[37] Lamurias A et al. BO-LSTM: classifying relations via long short-term memory networks along biomedical ontologies. BMC Bioinf 2019;20(1):10.
[38] Abacha AB et al. Text mining for pharmacovigilance: Using machine learning for drug name recognition and drug-drug interaction extraction and classification. J Biomed Inform 2015;58:122–32.
[39] Chowdhury MFM, Lavelli A. Impact of less skewed distributions on efficiency and effectiveness of biomedical relation extraction. Proceedings of COLING 2012: Posters, 2012.
[40] Fatehifar M, Karshenas H. Drug-Drug interaction extraction using a position and similarity fusion-based attention mechanism. J Biomed Inform 2021;115:103707.
[41] Liu S et al. Drug-Drug Interaction Extraction via Convolutional Neural Networks. Comput Math Methods Med 2016;2016:6918381.
[42] Xie W et al. Integrated Random Negative Sampling and Uncertainty Sampling in Active Learning Improve Clinical Drug Safety Drug-Drug Interaction Information Retrieval. Front Pharmacol 2021;11(2225).
[43] Sun X et al. Drug-Drug Interaction Extraction via Recurrent Hybrid Convolutional Neural Networks with an Improved Focal Loss. Entropy (Basel) 2019;21(1).
[44] Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 1988;28(1):31–6.
[45] Hou, X.a.Y., Jiaying and Hu, Pingzhao, Predicting Drug-Drug Interactions Using Deep Neural Network, in Proceedings of the 2019 11th International Conference on Machine Learning and Computing. 2019, Association for Computing Machinery: New York, NY, USA. p. 168–172.
[46] Zhao XM et al. Prediction of drug combinations by integrating molecular and pharmacological data. PLoS Comput Biol 2011;7(12):e1002323.
[47] Luo, H., et al., DDI-CPI, a server that predicts drug-drug interactions through implementing the chemical-protein interactome. Nucleic Acids Res, 2014. 42(Web Server issue): p. W46-52.
[48] Mahadevan AA et al. A Predictive Model for Drug-Drug Interaction Using a Similarity Measure. 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). IEEE; 2019.
[49] Dang LH et al. Machine Learning-Based Prediction of Drug-Drug Interactions for Histamine Antagonist Using Hybrid Chemical Features. Cells 2021;10(11):3092.
[50] Deng Y et al. A multimodal deep learning framework for predicting drug-drug interaction events. Bioinformatics 2020;36(15):4316–22.
[51] Dhami DS et al. Drug-Drug Interaction Discovery: Kernel Learning from Heterogeneous Similarities. Smart Health 2018;9–10:88–100.
[52] Feng Y-H, Zhang S-W, Shi J-Y. DPDDI: a deep predictor for drug-drug interactions. BMC Bioinf 2020;21(1):419.
[53] Herrero-Zazo M, Lille M, Barlow DJ. Application of Machine Learning in Knowledge Discovery for Pharmaceutical Drug-drug Interactions. KDWeb 2016.
[54] Hunta S, Aunsri N, Yooyativong T. Integrated action crossing method for Drug-Drug Interactions prediction in noncommunicable diseases based on neural networks. 2017 International Conference on Digital Arts, Media and Technology (ICDAMT), 2017.
[55] Lee I, Nam H. Identification of drug-target interaction by a random walk with restart method on an interactome network. BMC Bioinf 2018;19(8):208.
[56] Lin S et al. MDF-SA-DDI: predicting drug–drug interaction events based on multi-source drug fusion, multi-source feature fusion and transformer self-attention mechanism. Brief Bioinform 2021.
[57] Luo Q et al. Novel deep learning-based transcriptome data analysis for drug-drug interaction prediction with an application in diabetes. BMC Bioinf 2021;22(1):318.
[58] Olha Marushchak RK. Designing of Information Model for Prediction of Drug-drug Interactions based on Calculation of Target and Therapeutic Similarity. 3rd International Conference on Informatics & Data-Driven Medicine. 2020. Sweden: CEUR Workshop Proceedings, 2020.
[59] Polak, S.a.B., J. and Mendyk, A, Neural System for in silico Drug-Drug Interaction Screening, in Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC’06). 2005. p. 75-80.
[60] Qian S, Liang S, Yu H. Leveraging genetic interactions for adverse drug-drug interaction prediction. PLoS Comput Biol 2019;15(5):e1007068.
[61] Schwarz K et al. AttentionDDI: Siamese attention-based deep learning method for drug–drug interaction predictions. BMC Bioinf 2021;22(1):412.
[62] Zhang Y et al. Predicting drug-drug interactions using multi-modal deep auto-encoders based network embedding and positive-unlabeled learning. Methods 2020;179:37–46.
[63] Udrescu ML, Udrescu A. Drug Repurposing Method Based on Drug-Drug Interaction Networks and Using Energy Model Layouts. In: Vanhaelen Q, editor. Computational Methods for Drug Repurposing. New York, NY: Springer New York; 2019. p. 185–201.
[64] Takarabe M et al. Network-based analysis and characterization of adverse drug-drug interactions. J Chem Inf Model 2011;51(11):2977–85.
[65] Nyamabo AK, Yu H, Shi J-Y. SSI–DDI: substructure–substructure interactions for drug–drug interaction prediction. Brief Bioinform 2021;22(6).
[66] Decker, M.R.K.a.M.C.a.J.B.J.a.M.U.a.O.B.a.S., Drug-Drug Interaction Prediction Based on Knowledge Graph Embeddings and Convolutional-LSTM Network, in 10th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB. 2019, Association for Computing Machinery, Inc: Niagara Falls, United States.
[67] Sun M, Wang F, Elemento O, Zhou J. Structure-Based Drug-Drug Interaction Detection via Expressive Graph Convolutional Networks and Deep Sets. Proceedings of the AAAI Conference on Artificial Intelligence. AAAI Press; 2020.
[68] Xuan Lin ZQ, Wang Z-J, Ma T, Xiangxiang Zeng KGNN. Knowledge Graph Neural Network for Drug-Drug Interaction Prediction. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020.
[69] Bo Peng, X.N. Deep Learning for High-Order Drug-Drug Interaction Prediction. in 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB ’19). 2019. NY, USA.
[70] Zhang W et al. SFLLN: A sparse feature learning ensemble method with linear neighborhood regularization for predicting drug–drug interactions. Inf Sci 2019;497:189–201.
[71] Shankar S et al. Predicting adverse drug reactions of two-drug combinations using structural and transcriptomic drug representations to train an artificial neural network. Chem Biol Drug Des 2021;97(3):665–73.
[72] Patrick MT et al. Advancement in predicting interactions between drugs used to treat psoriasis and its comorbidities by integrating molecular and clinical resources. J Am Med Inform Assoc 2021;28(6):1159–67.
[73] Yan S, Jiang X, Chen Y. Text Mining Driven Drug-Drug Interaction Detection. In: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine. p. 349–55.
[74] Liu T et al. Modeling polypharmacy effects with heterogeneous signed graph convolutional networks. Appl Intell 2021;51.
[75] Celebi R et al. Evaluation of knowledge graph embedding approaches for drug-drug interaction prediction in realistic settings. BMC Bioinf 2019;20(1):726.
[76] Olayan RS, Ashoor H, Bajic VB. DDR: efficient computational method to predict drug-target interactions using graph mining and machine learning approaches. Bioinformatics 2018;34(7):1164–73.
[77] Rohani N, Eslahchi C. Drug-Drug Interaction Predicting by Neural Network Using Integrated Similarity. Sci Rep 2019;9(1):13645.
[78] Bajusz D, Rácz A, Héberger K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J Cheminform 2015;7(1):20.
[79] Rohani N, Eslahchi C, Katanforoush A. ISCMF: Integrated similarity-constrained matrix factorization for drug–drug interaction prediction. Network Modeling Analysis in Health Informatics and Bioinformatics 2020;9(1):11.
[80] Lee G, Park C, Ahn J. Novel deep learning model for more accurate prediction of drug-drug interaction effects. BMC Bioinf 2019;20(1):415.
[81] Deepika SS, Geetha TV. A meta-learning framework using representation learning to predict drug-drug interaction. J Biomed Inform 2018;84:136–47.
[82] Javed R et al. An Efficient Pattern Recognition Based Method for Drug-Drug Interaction Diagnosis. 2021 1st International Conference on Artificial Intelligence and Data Analytics (CAIDA), 2021.
[83] Mei S, Zhang K. A machine learning framework for predicting drug–drug interactions. Sci Rep 2021;11(1):17619.
[84] Song D et al. Similarity-based machine learning support vector machine predictor of drug-drug interactions with improved accuracies. J Clin Pharm Ther 2019;44(2):268–75.
[85] Wang, H., et al., GoGNN: graph of graphs neural network for predicting structured entity interactions, in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. 2021: Yokohama, Yokohama, Japan. p. Article 183.
[86] Minard A-L et al. Feature selection for drug-drug interaction detection using machine-learning based approaches. Challenge Task on Drug-Drug Interaction Extraction (DDI) SEPLN, 2011.
[87] Boyce R, Gardner G, Harkema H. Using natural language processing to identify pharmacokinetic drug-drug interactions described in drug package inserts. In: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing. Montreal, Canada: Association for Computational Linguistics; 2012. p. 206–13.
[88] Dhami DS et al. Drug-drug interaction discovery: kernel learning from heterogeneous similarities. Smart Health 2018;9:88–100.
[89] Zhang Y et al. A Single Kernel-Based Approach to Extract Drug-Drug Interactions from Biomedical Literature. PLoS ONE 2012;7(11):e48901.
[90] Zhang P, Wang F, Hu J, Sorrentino R. Label Propagation Prediction of Drug-Drug Interactions Based on Clinical Side Effects. Sci Rep 2015;5:12339.
[91] Xie W et al. Integrated Random Negative Sampling and Uncertainty Sampling in Active Learning Improve Clinical Drug Safety Drug-Drug Interaction Information Retrieval. Front Pharmacol 2021;11:2225.
[92] Zhan C et al. Detecting high-quality signals of adverse drug-drug interactions from spontaneous reporting data. J Biomed Inform 2020;112:103603.
[93] Zhang Y, Lu Z. Exploring Semi-supervised Variational Autoencoders for Biomedical Relation Extraction. Methods 2019;166.
[94] Hung TNK et al. An AI-based Prediction Model for Drug-drug Interactions in Osteoporosis and Paget’s Diseases from SMILES. Mol Inform 2022:2100264.
[95] Bobić, T., J. Fluck, and M. Hofmann. SCAI: Extracting drug-drug interactions using a rich feature vector. in Second Joint Conference on Lexical and Computational Semantics (* SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 2013.
[96] Li D et al. A Topic-modeling Based Framework for Drug-drug Interaction Classification from Biomedical Text. AMIA Annu Symp Proc 2016;2016:789–98.
[97] Kumar Shukla P et al. Efficient prediction of drug-drug interaction using deep learning models. IET Syst Biol 2020;14(4):211–6.
[98] Sejnowski TJ. The unreasonable effectiveness of deep learning in artificial intelligence. Proc Natl Acad Sci 2020;117(48):30033–8.
[99] Esteva A et al. A guide to deep learning in healthcare. Nat Med 2019;25(1):24–9.
[100] Hou WJ, Ceesay B. Extraction of drug-drug interaction using neural embedding. J Bioinform Comput Biol 2018;16(6):1840027.
[101] Shtar G, Rokach L, Shapira B. Detecting drug-drug interactions using artificial neural networks and classic graph similarity measures. PLoS ONE 2019;14(8):e0219796.
[102] Masumshah R, Aghdam R, Eslahchi C. A neural network-based method for polypharmacy side effects prediction. BMC Bioinf 2021;22(1):385.
[103] Fukushima K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 1980;36(4):193–202.
[104] Suárez-Paniagua V, Segura-Bedmar I. Evaluation of pooling operations in convolutional architectures for drug-drug interaction extraction. BMC Bioinf 2018;19(8):209.
[105] Suárez-Paniagua V, Segura-Bedmar I, Martínez P. Exploring convolutional neural networks for drug-drug interaction extraction. Database (Oxford) 2017.
[106] Lapin M et al. Analysis and Optimization of Loss Functions for Multiclass, Top-k, and Multilabel Classification. IEEE Trans Pattern Anal Mach Intell 2018;40(7):1533–54.
[107] Chen Y et al. MUFFIN: multi-scale feature fusion for drug–drug interaction prediction. Bioinformatics 2021;37(17):2651–8.
[108] Wu H et al. Drug-drug interaction extraction via hybrid neural networks on biomedical literature. J Biomed Inform 2020;106:103432.
[109] Quan C et al. Multichannel Convolutional Neural Network for Biological Relation Extraction. Biomed Res Int 2016;2016:1850404.
[110] Liu S et al. Dependency-based convolutional neural network for drug-drug interaction extraction. 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2016.
[111] Zeng T et al. Deep convolutional neural networks for annotating gene expression patterns in the mouse brain. BMC Bioinf 2015;16(1):147.
[112] Sun X et al. Deep Convolution Neural Networks for Drug-Drug Interaction Extraction. 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018.
[113] Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics 2018;34(13):i457–66.
[114] Xiong W et al. Extracting Drug-drug Interactions with a Dependency-based Graph Convolution Neural Network. 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2019.
[115] Bruna J et al. Spectral networks and deep locally connected networks on graphs. 2nd International Conference on Learning Representations, 2014.
[116] Collobert R et al. Natural Language Processing (Almost) from Scratch. J Mach Learn Res 2011;12:2493–537.
[117] Sutskever, I., O. Vinyals, and Q.V. Le, Sequence to sequence learning with neural networks, in Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2. 2014, MIT Press: Montreal, Canada. p. 3104–3112.
[118] Zhang, S., et al. Bidirectional Long Short-Term Memory Networks for Relation Classification. in PACLIC. 2015.
[119] Sahu SK, Anand A. Drug-drug interaction extraction from biomedical texts using long short-term memory network. J Biomed Inform 2018;86:15–24.
[120] Wang W et al. Dependency-based long short term memory network for drug-drug interaction extraction. BMC Bioinf 2017;18(16):578.
[121] Kim Y et al. Character-aware neural language models. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. Phoenix, Arizona: AAAI Press; 2016. p. 2741–9.
[122] Zhang, X., J. Zhao, and Y. LeCun, Character-level convolutional networks for text classification, in Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1. 2015, MIT Press: Montreal, Canada. p. 649–657.
[123] Kavuluru R, Rios A, Tran T. Extracting Drug-Drug Interactions with Word and Character-Level Recurrent Neural Networks. In: IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics. p. 5–12.
[124] Hochreiter S, Schmidhuber J. Long Short-Term Memory. Neural Comput 1997;9(8):1735–80.
[125] Cho, K., et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches. in Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. 2014.
[126] Zhou D, Miao L, He Y. Position-aware deep multi-task learning for drug-drug interaction extraction. Artif Intell Med 2018;87:1–8.
[127] Jiang Z, Gu L, Jiang Q. Drug drug interaction extraction from literature using a skeleton long short term memory neural network. 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2017.
[128] Zaikis D, Vlahavas I. Drug-Drug Interaction Classification Using Attention Based Neural Networks. In: 11th Hellenic Conference on Artificial Intelligence. Athens, Greece: Association for Computing Machinery; 2020. p. 34–40.
[129] Yi Z et al. Drug-drug interaction extraction via recurrent neural network with multiple attention layers. International Conference on Advanced Data Mining and Applications. Springer; 2017.
[130] Zheng W et al. An attention-based effective neural model for drug-drug interactions extraction. BMC Bioinf 2017;18(1):445.
[131] Adadi A, Berrada M. Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access 2018:2169–3536.
[132] Guidotti, R., et al., A Survey of Methods for Explaining Black Box Models. ACM Comput. Surv., 2018. 51(5): p. Article 93.
[133] Simonyan, K., A. Vedaldi, and A. Zisserman, Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. CoRR, 2014. abs/1312.6034.
[134] Shrikumar, A., P. Greenside, and A. Kundaje, Learning important features through propagating activation differences, in Proceedings of the 34th International Conference on Machine Learning - Volume 70. 2017, JMLR.org: Sydney, NSW, Australia. p. 3145–3153.
[135] Springenberg, J.T., et al., Striving for Simplicity: The All Convolutional Net. CoRR, 2015. abs/1412.6806.
[136] Tao Lei RB, Jaakkola T. Rationalizing Neural Predictions. 2016 Conference on Empirical Methods in Natural Language Processing. Austin, Texas: Association for Computational Linguistics; 2016.
[137] Bolukbasi T et al. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Adv Neural Inform Processing Syst 2016;29.
[138] Zügner, D., A. Akbarnejad, and S. Günnemann, Adversarial Attacks on Neural Networks for Graph Data, in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018, Association for Computing Machinery: London, United Kingdom. p. 2847–2856.
[139] Miyato, T., A.M. Dai, and I. Goodfellow, Adversarial training methods for semi-supervised text classification. arXiv preprint arXiv:1605.07725, 2016.
[140] Feng, S., et al., Pathologies of neural models make interpretations difficult. arXiv preprint arXiv:1804.07781, 2018.
[141] Huang K et al. CASTER: Predicting Drug Interactions with Chemical Substructure Representation. In: Proceedings of the AAAI Conference on Artificial Intelligence. p. 702–9.
[142] Dewulf P, Stock M, De Baets B. Cold-Start Problems in Data-Driven Prediction of Drug-Drug Interaction Effects. Pharmaceuticals (Basel) 2021;14(5).
[143] Minard A-L et al. Feature selection for drug-drug interaction detection using machine-learning based approaches. In Challenge Task on Drug-Drug Interaction Extraction (DDI), SEPLN, 2011.
[144] Mahendran D, Nawarathna RD. An automated method to extract information in the biomedical literature about interactions between drugs. 2016 Sixteenth International Conference on Advances in ICT for Emerging Regions (ICTer), 2016.
[145] Karim, M.R., et al., Drug-Drug Interaction Prediction Based on Knowledge Graph Embeddings and Convolutional-LSTM Network, in Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. 2019, Association for Computing Machinery: Niagara Falls, NY, USA. p. 113–123.
[146] Liu, S.a.H., Ziyang and Qiu, Yang and Chen, Yi-Ping Phoebe and Zhang, Wen, Structural Network Embedding using Multi-modal Deep Auto-encoders for Predicting Drug-drug Interactions, in 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2019, IEEE.
[147] Wang, Y., et al., Dependency and AMR Embeddings for Drug-Drug Interaction Extraction from Biomedical Literature, in Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. 2017, Association for Computing Machinery: Boston, Massachusetts, USA. p. 36–43.