tgh-07-41
Abstract: Hepatocellular carcinoma (HCC) is a significant cause of morbidity and mortality worldwide.
Despite significant advancements in detection and treatment of HCC, its management remains a challenge.
Artificial intelligence (AI) has played a role in medicine for several decades; however, clinically applicable AI-driven
solutions have only started to emerge, owing to gradual improvement in the sensitivity and specificity of AI
and the implementation of convolutional neural networks. A review of the existing literature has been conducted
to determine the role of AI in HCC, and three main domains were identified in the search: detection,
characterisation and prediction. Implementation of AI models into detection of HCC has immense potential,
as AI excels at analysis and integration of large datasets. The use of biomarkers, with the rise of ‘-omics’, can
revolutionise the detection of HCC. Tumour characterisation (differentiation between benign masses, HCC,
and other malignant tumours, as well as staging and grading) using AI was shown to be superior to classical
statistical methods, based on radiological and pathological images. Finally, AI solutions for predicting
treatment outcomes and survival emerged in recent years with the potential to shape future HCC guidelines.
These AI algorithms based on a combination of clinical data and imaging-extracted features can also support
clinical decision making, especially treatment choice. However, AI research on HCC has several limitations
hindering its clinical adoption: small sample size, single-centre data collection, lack of collaboration and
transparency, lack of external validation, and model overfitting all result in low generalisability of the
results that currently exist. AI has potential to revolutionise detection, characterisation and prediction of
HCC; however, for AI solutions to reach widespread clinical adoption, interdisciplinary collaboration is
needed, to foster an environment in which AI solutions can be further improved, validated and included in
treatment algorithms. In conclusion, AI has a multifaceted role in HCC across all aspects of the disease and
its importance can increase in the near future, as more sophisticated technologies emerge.
Keywords: Hepatocellular carcinoma (HCC); artificial intelligence (AI); machine learning (ML); computer-aided
diagnosis; neural networks
© Translational Gastroenterology and Hepatology. All rights reserved. Transl Gastroenterol Hepatol 2022;7:41 | http://dx.doi.org/10.21037/tgh-20-242
Table 1 Glossary of terms related to the use of artificial intelligence in medicine and performance of AI-driven solutions
| Term | Definition |
| Artificial intelligence (AI) | The use of computers and related technology to emulate intelligent behaviour and critical thinking of human beings (12) |
| Machine learning (ML) | A discipline in which machines (computers) learn from data, with emphasis on computational algorithms, which are able to analyse billions of data points (13) |
| Supervised learning | Type of ML, which deals with predicting a known outcome, based on inputs, in the presence of an expert ‘supervisor’ (16) |
| Unsupervised learning | Type of ML, which deals with finding naturally occurring patterns without a pre-defined outcome, without the presence of an expert ‘supervisor’ (16) |
| Artificial neural networks (ANN) | Statistical systems, which mimic the complex architecture of biological networks of neurons, to derive outputs, based on interactions of weighted inputs and outputs (14) |
| Convolutional neural network (CNN) | Type of deep learning ANN, for processing data with grid pattern (e.g., radiological images) using multiple layers, including convolution and pooling layers performing feature extraction to produce final output (15) |
| Deep learning | A subset of ML, that uses representation learning (automatic discovery of representations from raw data for classification or detection) (17) |
| Area under the curve (AUC) | Algorithm performance measure, which can be established based on the receiver operating characteristic (ROC) curve. AUC takes values between 0 and 1, depending on average sensitivity and specificity for all analysed values of the threshold, with values approaching 1 indicating higher performance (18) |
| Accuracy | Algorithm performance measure, taking values between 0% and 100%, based on the number of true positive and true negative results, compared to the overall size of the population (18) |
| C-index (c-statistic) | Algorithm performance measure, describing the goodness of fit of the model, taking values between 0 and 1 (19) |
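The performance measures defined in Table 1 can be illustrated with a short worked example. The sketch below is not taken from any of the reviewed studies: the function names, the toy labels and the 0.5 decision threshold are all assumptions made for illustration. It computes accuracy, sensitivity and specificity from a confusion matrix, and AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which is equivalent to the area under the ROC curve.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for binary labels coded as 1 (disease) and 0 (no disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """(TP + TN) divided by the size of the population."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Proportion of true cases that are detected: TP / (TP + FN)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Proportion of non-cases that are correctly ruled out: TN / (TN + FP)."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly, ties counted as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 HCC cases and 3 controls with model scores.
labels = [1, 1, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in model_scores]  # assumed threshold of 0.5

print(accuracy(labels, predictions), sensitivity(labels, predictions),
      specificity(labels, predictions), auc(labels, model_scores))
```

On this toy data, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9, while accuracy, sensitivity and specificity at the 0.5 threshold are each 2/3. Unlike the other three measures, AUC does not depend on the chosen threshold.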
ML, deep learning, supervised learning, unsupervised learning) artificial neural network (artificial neural networks, neural network, ANN, NN) and convolutional neural network (convolutional neural network, CNN) combined with Boolean operators. Relevant keywords were identified using recent related publications on AI in HCC (20,21).
In total, 702 records were identified, and after removal of duplicates, 470 records remained. The titles and abstracts were screened by two authors independently (MK and AD), and a third author resolved any conflicts (TMHG). Inclusion criteria were: original research studies, English language, use of AI-driven solutions and HCC as primary pathology of interest. Exclusion criteria were: malignancies other than HCC as primary pathology of interest, age <18, non-original research studies (e.g., commentaries, letters to editors, reviews). Relevant original research studies were included in the analysis; however, conference abstracts and review articles were also screened. Manual screening of reference lists of included full texts was also performed by two authors (MK and AD) independently to look for any missing studies. After the list of accepted texts was finalised, full texts were classified into one of three domains that have been described in the introduction (detection, characterisation and prediction). Following classification into domains, data extraction was conducted; variables of interest were the number of participants/pathological slides/radiological images, AI algorithm used, type of validation, AUC (or if not available, alternative diagnostic accuracy measures such as sensitivity and specificity, c-index or F-score).
Detection
The summary of literature on HCC detection, based on pre-HCC disease models, imaging and biomarkers is presented in Table 2.
Pre-HCC disease models
HCC pathogenesis is strongly linked to chronic inflammatory disease of the liver, which allows for HCC detection based on the range of pre-malignant changes.
Table 2 Summary of literature on HCC detection, based on pre-HCC disease models, imaging and biomarkers
| Domain | Sub-category | Notes | N | AI algorithm | Type of validation | AUC (95% CI) | Limitations | Reference |
| Pre-HCC disease models | NAFLD/NASH | Serum and liver lipids in murine models | 15 (5 intervention, 10 control) | Random forest | Development only (no validation) | N/A | Animal (murine) model not replicated on humans | Chiappini et al. 2016 (22) |
| | | Population screening for NAFLD | 500 (146 cases, 354 controls) | Logistic regression | Internal validation (cross-validation) | 0.87 (0.83–0.90) | Retrospective study design and lack of external validation | Yip et al. 2017 (23) |
| | Cirrhosis/fibrosis/hepatitis B/hepatitis C | Progression of cirrhosis into HCC | 442 patients | Random forest | Internal validation | c-index 0.64 (0.60–0.90) | Study performed in a tertiary centre. Attrition bias. Low clinical adaptability potential | Singal et al. 2012 (24) |
| | | Progression of chronic HepC infection into fibrosis | 533 (349 normal, 184 fibrosis) | Random forest | Internal validation | 0.84 (0.82–0.86) | Narrow enrolment criteria, reducing generalisability of conclusions | Konerman et al. 2015 (25) |
| | | Progression of HepB/C into HCC | 6,561 (Reddy et al.); 146,218 (Ioannou et al.) | ANN | Internal validation (random sample split) | 0.911–0.962 (range, Reddy et al.); 0.89 (Ioannou et al.) | Retrospective and cross-sectional character of the studies and lack of external validation | Reddy et al. 2017 (26); Ioannou et al. 2019 (27) |
| | Miscellaneous | HCC development based on viral status and clinical data | 165 patients | Support vector machine | Internal validation (cross-validation) | 0.88 | Small sample size | Książek et al. 2019 (28) |
| Imaging | CT | HCC detection on multi-phase CT scans | 25 (Lee et al.); 21 (Okumura et al.) | Temporal subtraction, 3D global matching | Development only (no validation) | N/A | Small sample size, proof-of-concept character of both studies | Lee et al. 2015 (29); Okumura et al. 2011 (30) |
| | MRI | HCC nodule detection using SPIO-MRI in rat models | 40 images | ANN | Development only (no validation) | Classification accuracy 91.76% | Animal (rat) models used. No test-retest reproducibility | Guo et al. 2009 (31) |
| Biomarkers | N/A | Serum | 167 (Poon et al.); 1,582 (Sato et al.) | Multiple techniques combined | Development only (no validation) | Accuracy 75.9% (Poon et al.); 0.844–0.940 (range, Sato et al.) | Single centre, internally validated studies with small sample size | Poon et al. 2001 (32); Sato et al. 2019 (33) |
| | | Transcriptome | 3,981 (2,316 HCC, 1,665 non-tumorous tissue) | Multiple techniques combined | External validation | 0.91–0.96 (range) | Heterogeneity of data due to pooled analysis of 30 studies | Kaur et al. 2020 (34) |
| | | Gene co-expression | 57 (38 HCC, 19 normal samples) | PCA | Development only (no validation) | N/A | Use of retrospective databases and no clinical applicability | Zhang et al. 2017 (35) |
| | | miRNA | N/A | Deep belief nets | Internal validation (cross-validation) | F1-score 75.48% | Lack of external validation | Ibrahim et al. 2014 (36) |
| | | Genes | 95 (43 tumour, 52 non-tumour) | ANN | Internal validation (cross-validation) | N/A | Proof-of-concept study, use of retrospective gene databases | Gui et al. 2015 (37) |
| | | Urine | 15 samples | PCA, random forest | Internal validation | 0.903 | Small sample size | Liang et al. 2016 (38) |
| | | Biomarker identification from literature | N/A | Data mining | Internal validation (cross-validation) | F-score 0.89 | Use of impact factor as scoring tool, introducing systematic bias into the results | Chang et al. 2017 (39) |
HCC, hepatocellular carcinoma; AI, artificial intelligence; AUC, area under the curve; 95% CI, 95% confidence intervals; NAFLD, non-alcoholic fatty liver disease; NASH, non-alcoholic steatohepatitis; HepB, viral hepatitis type B; HepC, viral hepatitis type C; ANN, artificial neural network; PCA, principal component analysis; miRNA, microRNA; N/A, not applicable; CT, computed tomography; SPIO-MRI, superparamagnetic iron oxide magnetic resonance imaging.
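The "Type of validation" column in Table 2 distinguishes a random sample split from cross-validation. As a minimal sketch of the difference, assuming nothing about how the cited studies implemented it, the two hypothetical helpers below generate the index sets for each scheme. External validation cannot be produced by any split of a single dataset, because it requires data collected independently, for example at another centre.

```python
import random
from typing import List, Tuple

Split = Tuple[List[int], List[int]]  # (training indices, test indices)


def holdout_split(n_samples: int, test_fraction: float = 0.3, seed: int = 0) -> Split:
    """Random sample split: one shuffled partition into a training and a test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples: int, k: int = 5, seed: int = 0) -> List[Split]:
    """k-fold cross-validation: every sample is used for testing exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    splits = []
    for i, test_fold in enumerate(folds):
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        splits.append((train, test_fold))
    return splits


# Hypothetical cohort of 10 patients.
train_idx, test_idx = holdout_split(10, test_fraction=0.3)
print(len(train_idx), len(test_idx))
for train, test in k_fold_splits(10, k=5):
    print(sorted(test))
```

With a single split, the performance estimate rests on one test subset, which matters for the small cohorts listed in the table. Cross-validation averages over k test folds, but both schemes remain internal validation.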
Chronic viral hepatitis B and C infections are both associated with the development of HCC (40,41). Similarly, non-viral causes, including non-alcoholic fatty liver disease (NAFLD), non-alcoholic steatohepatitis (NASH), cirrhosis and fibrosis of the liver are also risk factors for developing HCC (42). The stepwise progression of these pathologies creates the potential for a screening window, during which high-risk individuals can be identified.
The use of AI-driven solutions in detection of NAFLD and NASH has not been comprehensively researched. In a 2016 study, Chiappini and colleagues investigated serum and liver lipids in NAFLD and NASH murine models, using supervised ML (random forest analysis). They identified unique signatures of NASH, opening new possibilities of pre-HCC change detection (22). Population screening for NAFLD has also been suggested, with an ML algorithm based on 23 clinical parameters achieving an AUC of 0.88 (95% CI, 0.84–0.91) in detecting NAFLD (23).
Substantially more research has been conducted into the association of chronic HepB and HepC infections and development of cirrhosis, fibrosis and eventually HCC. Progression of cirrhosis into HCC was studied as early as 2012; it was found that a supervised ML (random forest) model outperformed conventional regression analysis, achieving a c-index of 0.64 (95% CI, 0.6–0.69) (24). Konerman and colleagues used a very similar model to predict progression of chronic hepatitis C infection into fibrosis, with AUC of 0.86 (95% CI, 0.85–0.87) (25).
Throughout the years, as ML models have become more sophisticated, their diagnostic performance has improved. An ANN model by Reddy and Imler achieved an AUC as high as 0.962 in the prediction of malignant transformation from hepatitis B or C chronic infection (26). Similarly, another recurrent neural network model achieved an AUC of 0.89 (27). It is worth highlighting that in all cases, ML models have proven to be superior to the classical statistical regression model when analysing big data sets.
Moreover, the most recent model, by Książek et al., used patient characteristics, such as viral status, presence of comorbidities and laboratory results, to predict the development of HCC based on 23 quantitative and 26 qualitative features, achieving 88.5% accuracy (28).
Imaging
Detection is the most basic way of utilising AI in imaging, while more novel approaches focus on characterisation and stratification of suspicious imaging regions. While more advanced applications of AI in imaging will be discussed later in this review, it is important to consider the detection of HCC lesions on imaging, which were the precursors to current AI applications.
Lee et al. used computer-aided diagnosis (CAD) on multi-phase CT scans to detect HCC in a set of 15 moderate HCC (mean ⌀3.1 cm) and 10 small HCC (mean ⌀1.04 cm). Using a non-rigid registration model, which accounted for deformation between phases due to respiratory movements and heartbeats, they achieved 100% detection accuracy when compared with radiological diagnosis (29). Similar results were obtained using 3D non-linear image warping (30). Moreover, in 2009 Guo et al. looked into HCC nodule detection on rat models, using SPIO-enhanced MRI, achieving 91.67% classification accuracy (31).
Biomarkers
A variety of biomarkers have been researched, to equip clinicians with a reliable tool for HCC detection. Recent advances in bioinformatics and technology have revolutionised the way large biological datasets (‘-omics’ datasets) can be generated and analysed, allowing for the integration of multiple datasets (genome, proteome, transcriptome, etc.) (43). Combination of ‘omics’ with AI algorithms has led to the identification of suitable biomarkers with the potential to translate data into therapeutics (44). The use of biomarkers, identified with the aid of neural networks, combined with the classically used serological marker for HCC (alpha-fetoprotein), was shown to increase diagnostic sensitivity from 60% to 73.8% while maintaining the specificity of 88.2% (32).
Harnessing the power of ML has also allowed for new types of clinical data to be analysed in the hope of detecting HCC. Large-scale (2,316 HCC tumour samples and 1,665 non-tumorous tissue samples) transcriptomic profiling by Kaur et al. derived a 3-gene signature (FCN3, CLEC1B, and PRC1), which achieved an AUC of 0.91–0.96 on its validation set, which used peripheral blood samples containing mononuclear cells of both HCC and healthy patients (34). Further, each one of the three genes showed prognostic character, as their expression levels, when stratified to greater than mean and lesser than mean groups, correlated with overall survival, progression-free survival (PFS) and disease-free survival (DFS) on univariate analysis.
A wide array of HCC biomarkers have been investigated
using AI-driven techniques: gene co-expression patterns (35), miRNA (36), HCC-related genes (37) and serum biomarkers (33) have all been suggested as potential diagnostic signatures. Urine biomarkers have also been identified using ML approaches: Liang and colleagues identified five differential metabolites, from 37 urine samples (25 early-HCC and 12 controls), showing a sensitivity of 96.5% and specificity of 83% in differentiating between healthy and HCC patients, on an independent validation set (n=25) (38).
Another application of AI is data mining, which can be used to automatically screen existing literature to identify biomarker candidates, making it a useful tool for bioinformaticians (39).
Characterisation
The summary of literature on HCC characterisation, based on biomarkers, imaging and pathology is presented in Table 3.
Biomarkers
Although the primary use of biomarkers is the detection of HCC, ML approaches also allow for stratification of HCC patients based on their biomarker profile, which can have therapeutic implications, as the response to treatment can be dependent on HCC subtype.
Genomics and epigenomics (DNA methylation patterns) analysis using ML has allowed for accurate differentiation between early-stage (stage I) and late-stage (stages II-IV) HCC (45). Moreover, Estevez et al. have established a biomarker-based classification into HepB-HCC, HepC-HCC and non-viral HCC, using cytokine profile from serum samples (46). Such stratification can have clinical implications for management, as well as understanding differences in disease pathogenesis.
Imaging
AI can help analyse radiological features from ultrasound, computed tomography (CT) and magnetic resonance imaging (MRI), all of which are routinely used in the diagnosis and differentiation of liver pathology.
The first use of AI in HCC characterisation based on imaging focused on the region of interest (ROI) analysis and computer-aided diagnosis systems using US (47), CT (51) or MRI (55). These simple models employed analysis of features, such as lesion border or texture, to differentiate between benign and malignant liver tumour and thus identify HCC. Nowadays, more sophisticated AI-driven solutions are used in lesion differentiation, with greater accuracy and better differentiation, staging and stratification abilities.
For ultrasound scan, differentiation between cirrhotic liver and HCC using neural networks can be made with 94.5% classification accuracy (48). Differentiation between atypical HCC and focal nodular hyperplasia (FNH) on contrast-enhanced US has also been reported, achieving 94.4% classification accuracy when compared against pathology report analysis (biopsy or resection) and subsequent clinical follow-up (49). High differentiation accuracy has also been found for CT and MRI scans (52,64). What is more, Yamashita et al. have proven the feasibility of CNN assigning LI-RADS grades [an HCC CT/MRI scan probability classification used by the American College of Radiology (65)], to guide treatment decision-making (66).
AI models can also assist in imaging-based grading of HCC. Using contrast-enhanced ultrasound (CEUS), Sugimoto et al. established classification into well-differentiated, moderately differentiated and poorly differentiated HCC, with an AUC of 0.863–0.872 (50). Similar differentiation based on MRI scans was shown by Zhou et al. (56).
Furthermore, tumour segmentation algorithms have been employed to aid in management planning. Visualisation of the tumour can dictate decisions regarding tumour extent and resection. These algorithms, based on contrast-enhanced CT scans, can provide clinically useful 3D projections with a high degree of accuracy, as shown by Li et al. (53).
Most recently, a radiomics approach to imaging analysis has been proposed, involving a multi-step process to derive large datasets of radiological features, via image acquisition, segmentation, feature extraction and automated analysis of patterns by using high throughput computing (67). Studies utilising this methodology shed light on the future directions of AI in imaging for HCC, highlighting its potential for high accuracy tumour characterisation and classification on MRI (57) and multi-phase contrast-enhanced CT (54) by analysing textural features.
Pathology
AI has been used in pathology in order to precisely analyse the results of biopsies and resections to help with lesion characterisation and differentiation, using image-analysis
Table 3 Summary of literature on HCC characterisation, based on biomarkers, imaging and pathology
| Domain | Sub-category | Notes | N | AI algorithm | Type of validation | AUC (95% CI) | Limitations | Reference |
| Biomarkers | N/A | Genomics and epigenomics for HCC staging | 400 (173 early stage, 177 late stage, 50 normal) | Support vector machine | Internal validation (cross-validation) | 0.99 (0.98–0.99) | Biomarkers derived from tissue, which requires invasive approaches for sample isolation | Kaur et al. 2019 (45) |
| | | Cytokine profile for HepC-HCC, HepB-HCC and non-viral HCC differentiation | 411 (102 HCC, 309 normal) | Random forest | Development only (no validation) | 0.90 | Small sample size and comparison groups not adjusted for ethnicity | Estevez et al. 2017 (46) |
| Imaging | US | Differentiation between focal hepatic lesions (benign and malignant) | 51 images | ROI analysis | Development only (no validation) | N/A | Only two data validators (radiologist); data possibly not generalisable | Kim et al. 2009 (47) |
| | | Differentiation between cirrhosis and HCC | 189 images | ANN | Internal validation (cross-validation) | Accuracy 94.5% | Small sample size | Bharti et al. 2018 (48) |
| | | Differentiation between atypical HCC and focal nodular hyperplasia | 257 images | ANN | Internal validation (cross-validation) | F1-score 94.62% | Small sample size leading to lack of generalisability and network which is difficult to interpret | Huang et al. 2020 (49) |
| | | Grading based on tumour differentiation | 232 (76 well-differentiated HCC, 133 moderately differentiated HCC, 23 poorly differentiated HCC) | ANN | Development only (no validation) | Accuracy 87.5% | Use of 2D ultrasound and fine-needle biopsy specimen for establishing differentiation instead of surgical specimen | Sugimoto et al. 2016 (50) |
| | CT | Differentiation between focal hepatic lesions (benign and malignant) | 147 images | Region of interest (ROI) analysis | Internal validation (cross-validation) | Accuracy 84.96% | Small sample size | Mougiakakou et al. 2007 (51) |
| | | HCC diagnosis from nodular, diffuse and massive tumours | 165 (46 diffuse tumour, 43 nodular tumours, 76 massive tumours) | Convolutional neural networks (CNN) | Internal validation (random sample split) | Accuracy 98.4–99.7% (range) | Segmentation performance for diffuse tumour is not as good as other types, creating noise in the data | Li et al. 2020 (52) |
| | | Tumour segmentation on contrast-enhanced CT | 201 images | Fully convolutional neural networks | External validation | Accuracy 93.7% | Network which is difficult to interpret and restricted by GPU memory | Li et al. 2018 (53) |
| | | Differentiation between five phases of CT | 502 images | Random forest | External validation | Accuracy 84–98% (range) | Overlap between five phases on CT scan (no clear guidelines on start and end of each phase); decision based on expertise of principal investigators | Dercle et al. 2020 (54) |
| | MRI | Differentiation between focal hepatic lesions (benign and malignant) | 320 images | ANN, CNN | Development only (no validation) | Accuracy 93% | Single-centre character of the studies and sample size insufficient for neural network training | Zhang et al. 2009 (55) |
| | | Grading based on tumour differentiation | 100 (47 low grade HCC, 53 high grade HCC) | CNN | Internal validation (random sample split) | 0.73–0.83 (range) | Single-centre character of the study, small sample size and lack of external validation | Zhou et al. 2019 (56) |
| | | Focal lesion differentiation using texture and topological analysis | 150 (50 HCC, 50 metastatic tumours, 50 hepatic haemangioma) | ROI analysis | Internal validation (cross-validation) | 0.75–0.95 (range) | Large variation in tumour size, affecting classification accuracy for the outliers | Oyama et al. 2019 (57) |
| Pathology | N/A | Differentiation between HCC and cholangiocarcinoma | 106 whole-slide images | Deep learning | Internal validation (random sample split) | 0.842 | Methodology not reflective of clinical practice, reducing general applicability of the results | Kiani et al. 2020 (58) |
| | | Differentiation between healthy tissue and HCC | 1,773 image features | Random forest | External validation | 0.886 | Differences in aetiological factors between the two datasets used | Liao et al. 2020 (59) |
| | | Grading of HCC | 109 patients | ROI analysis, fractal dimensions | Internal validation (cross-validation) | Accuracy 95.97% (Atupelage et al. 2014); accuracy 90.51% (Atupelage et al. 2013) | Small sample size and lack of external validation. Inclusion of non-informative texture features into the classifiers | Atupelage et al. 2014 (60); Atupelage et al. 2013 (61) |
| | | HCC diagnosis using hyperspectral imaging analysis | 14 hyperspectral images | Deep learning | Internal validation (cross-validation) | 0.950 | Small sample size and single-centre character of the study | Wang et al. 2020 (62) |
| | | HCC grading using multiphoton microscopy | 217 images | CNN | Internal validation (cross-validation) | 0.941 (0.913–0.968) | Small sample size, insufficient for deep learning purpose | Lin et al. 2019 (63) |
HCC, hepatocellular carcinoma; AI, artificial intelligence; AUC, area under the curve; 95% CI, 95% confidence intervals; HepB, viral hepatitis type B; HepC, viral hepatitis type C; ANN, artificial neural network; ROI, region of interest; CNN, convolutional neural network; N/A, not applicable; CT, computed tomography; MRI, magnetic resonance imaging; US, ultrasound scan.
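Several imaging studies in Table 3 rely on ROI analysis of texture. To make the idea concrete, the sketch below crops a rectangular ROI from a grey-level image and computes three simple first-order texture descriptors (mean, variance and Shannon entropy of the intensity histogram). It is illustrative only: the helper names, the rectangular ROI and the choice of features are assumptions, not the feature sets used by the cited studies, and radiomics workflows typically extract far more features than this.

```python
import math
from collections import Counter
from typing import Dict, List

Image = List[List[int]]  # grey-level image as rows of integer intensities


def extract_roi(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Crop a rectangular region of interest from the image."""
    return [row[left:left + width] for row in image[top:top + height]]


def first_order_features(roi: Image) -> Dict[str, float]:
    """Mean, population variance and Shannon entropy (in bits) of ROI intensities."""
    pixels = [p for row in roi for p in row]
    if not pixels:
        raise ValueError("ROI is empty")
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "variance": variance, "entropy": entropy}


# Hypothetical 4x4 image with a heterogeneous "lesion" in the centre.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 3, 3, 0],
    [0, 0, 0, 0],
]
lesion = extract_roi(image, top=1, left=1, height=2, width=2)
print(first_order_features(lesion))
```

A homogeneous ROI has zero variance and zero entropy, whereas the mixed ROI above has higher values for both. Feature vectors of this kind are what a downstream classifier, such as a random forest or an ANN, would take as input.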
Table 4 Summary of literature on HCC prediction based on treatment outcomes and overall survival
| Domain | Sub-category | Notes | N | AI algorithm | Type of validation | AUC (95% CI) | Limitations | Reference |
| Treatment outcomes | TACE | Response to treatment based on clinical data and risk scoring systems | 282 | ANN | Internal validation (random sample split) | 0.83±0.06 | No independent external validation cohort | Mähringer-Kunz 2020 (68) |
| | | Response to treatment based on MRI and clinical data | 36 HCC patients | Random forest | Internal validation (cross-validation) | Accuracy 78% | Small patient cohort | Abajian et al. 2018 (69) |
| | | Response to treatment based on CT images | 105 patients | CNN | Internal validation (cross-validation) | Accuracy 74.2% (64–82%) | General model applied to multiple TACE chemotherapy regimens | Morshid et al. 2019 (70) |
| | | Response to treatment based on the contrast-enhanced US | 130 patients | Deep learning | Internal validation (random sample split) | 0.93 (0.80–0.98) | Limited sample size; single-centre retrospective data | Liu et al. 2020 (71) |
| | SBRT | Hepatobiliary toxicity prediction based on CT images | 125 patients and 2,644 images of human organs | CNN | Internal validation (cross-validation) | 0.85 | Limited liver SBRT database | Ibragimov et al. 2018 (72) |
| | RFA | Disease-free survival prediction | 252 1-year and 179 2-year DFS | ANN | Internal and external validation | 0.84 for 1-year, 0.75 for 2-year DFS prediction | Uneven 1-year and 2-year DFS group sizes | Wu et al. 2017 (73) |
| | Resection | Recurrence risk based on whole-slide histological analysis | 522 patients | CNN | Internal and external validation | c-index 0.70 | Overfitting (inferior performance on external validation set) | Saillard et al. 2020 (74) |
| | | Recurrence and progression-free survival based on immunological biomarkers | 221 patients | Random forest | Internal validation (cross-validation) | 0.80 | No external dataset validation | Zhou et al. 2019 (75) |
| | | Survival based on CT images | 470 patients | ANN | Internal and external validation | 0.803 | Most patients had hepatitis B-related HCC | Ji et al. 2019 (76) |
| | | | 167 patients | | Internal validation (cross-validation) | 0.825 | No external dataset validation | Wang et al. 2019 (77) |
| | | | 995 patients | Bayesian network | Internal validation (cross-validation) | Accuracy 57% | Lack of temporal information in the patient data | Xu et al. 2019 (78) |
| | | Microinvasion based on biomarkers and MRI | 160 patients | Logistic regression | Internal validation (random sample split) | 0.83 (0.71–0.95) | Normalisation of the signal intensities on MR images not performed | Feng et al. 2019 (79) |
| | | Survival based on BCLC criteria | 976 patients | Classification and Regression Tree | Internal validation (random sample split) | c-index 0.604 | Majority of patients had favourable liver function | Tsilimigras et al. 2020 (80) |
| | Transplantation | Recurrence based on clinical data and CT images | 133 patients | Classification and Regression Tree | Internal validation (random sample split) | c-index 0.789 (0.620–0.957) | Retrospective design | Guo et al. 2019 (81) |
| Overall survival | N/A | Survival based on biomarkers in HepB-HCC | 67 samples (40 patients) | Supervised ML | Internal validation (cross-validation) | Accuracy 78% | Small dataset | Ye et al. 2003 (82) |
| | | Survival based on DNA methylation patterns | 488 samples | Multiple techniques combined | External validation | Accuracy 63% | Reason for 40% of all patients being hard to predict remained unclear | Itzel et al. 2019 (83) |
| | | | 377 HCC, 50 control samples | | Internal validation (cross-validation) | 0.95 (mean 10-fold cross-validation score) | Limited validation outcomes reported | Dong et al. 2019 (84) |
| | | Survival based on gene-expression pathways | 355 patients | Support vector machine | Internal and external validation | c-index 0.83 | Class label of the TCGA HCC samples obtained using whole TCGA dataset | Fa et al. 2019 (85) |
| | | Survival based on clinical features | 165 patients | ANN | Internal validation (cross-validation) | 0.700 | No external dataset validation | Santos et al. 2015 (86) |
HCC, hepatocellular carcinoma; AI, artificial intelligence; AUC, area under the curve; 95% CI, 95% confidence intervals; HepB, viral hepatitis type B; ANN, artificial neural network; CNN, convolutional neural network; TACE, transarterial chemoembolization; SBRT, stereotactic body radiation therapy; RFA, radiofrequency ablation; BCLC criteria, Barcelona Clinic Liver Cancer criteria; DFS, disease-free survival; TCGA, The Cancer Genome Atlas; CT, computed tomography; MRI, magnetic resonance imaging.
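Several survival models in Table 4 are reported with a c-index rather than an AUC. As a hedged illustration of what that number measures, the sketch below implements a basic form of Harrell's concordance index for right-censored data: among all pairs of patients in which the one with the shorter follow-up time had an observed event, it counts the fraction in which the model assigned the higher risk to that patient. The function name, the toy data and the simplification of skipping pairs with tied times are assumptions made for illustration, and the cited studies may have used different estimators.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's c-index; events[i] is 1 if the event was observed, 0 if censored."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:  # patient i had the event before j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # higher predicted risk, earlier event
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: follow-up in months, event indicator, predicted risk.
follow_up = [5, 10, 15, 20]
observed = [1, 1, 0, 1]
predicted_risk = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(follow_up, observed, predicted_risk))
```

In this toy cohort there are five comparable pairs, four of which are ordered correctly, giving a c-index of 0.8. A value of 0.5 corresponds to random ordering, which puts figures such as 0.604 and 0.83 in Table 4 into context.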
analysis of whole-slide digitised histological slides (74). Two CNN algorithms reached similar efficiency; however, the combination of CNN with human input (tumour areas annotated by the pathologist) slightly outperformed the one without (AUC 0.78 vs. 0.75).
Immunological tumour biomarkers were also used as a tool for predicting survival, using three indices: overall survival (≤24 or >24 months), PFS (≤6 or >6 months) and recurrence/death, producing AUC between 0.76 and 0.8 and accuracy over 85% (75).
Moreover, groups led by Ji and Wang validated CT-based ANN and deep CNN to predict survival (76,77). The first group developed a novel three-feature radiomic signature of contrast-enhanced CT images, where performance was improved by combining it with clinical features (c-index 0.63–0.69 vs. 0.73–0.801). Wang and colleagues employed multi-phase CT radiomics features together with clinical models to yield a combined model with AUC of 0.82. A Bayesian network-based approach was also used to predict the probability of post-resection HCC recurrence, which considered respective recurrence evolution paths for clinical feature datasets (78).
At the same time, AI techniques were explored to provide information about the predictive power of particular biomarkers, which could guide decisions on liver resection. Feng et al. used MRI radiomics to predict microvascular invasion status of the tumours, an important factor for hepatectomy, reaching AUC of 0.83 for the validation dataset (79). Furthermore, an AI model was applied to determine the prognostic weight of factors comprising the Barcelona Clinic Liver Cancer (BCLC) guidelines, which selected alpha-fetoprotein and Charlson comorbidity score as the most important preoperative factors of overall survival among BCLC-0/A patients, and radiologic tumour burden score for BCLC-B patients (80). These results have the potential to shape the next iteration of BCLC guidelines.
AI was also used to predict DFS following liver transplantation. CT radiomics and clinical risk factors were combined to train the model, which yielded c-index of 0.79 when tested on the validation cohort (81).
Prediction of overall survival of HCC patients
ML algorithms have also been used for the general prediction of survival in the HCC patient population. The earliest studies used supervised ML to differentiate between metastatic and non-metastatic HCC in HepB positive patients based on gene expression, hence obtaining information about probable survival chance (82). The same work also identified osteopontin as a biomarker of metastatic HCC.
Recently, groups led by Itzel and Dong explored possibilities of using random gene sets and DNA methylation levels for survival prognostics (83,84), while Fa et al. followed disease-specific patterns in dysregulated gene-expression pathways instead of single genes (85).
More studies followed, training predictive algorithms on clinical features of HCC patients to forecast survival. Santos et al. combined a cluster-based oversampling method with a neural network model to account for small and incomplete datasets, improving the AUC score from 0.69 to 0.75 (86).
Discussion
AI solutions have been applied in all aspects of medicine in recent years and HCC is no exception. AI has led to advances in detection of HCC (based on pre-malignant changes, imaging and biomarkers) due to its ability to analyse large datasets and integrate information efficiently. Biomarkers identified by the integration of multiple ‘-omics’ datasets are especially promising, potentially leading to the identification of a biochemical tumour signature, revolutionising HCC detection in the future.
As AI algorithms have become more sophisticated, research emphasis shifted towards lesion characterisation, differentiation between various types of hepatic malignancies and stratification of patients into groups, based on the tumour stage or grade. Various datasets, such as radiological images or clinical and pathological data, can be used separately or in combination to provide accuracy superior to that of traditional statistical tools. What is more, AI-driven solutions can help in reducing interobserver variability when analysing imaging studies, leading to standardisation. Since most hepatocellular cancers develop on the background of chronic liver disease, future efforts in HCC characterisation should focus on accurate differentiation between pre-malignant changes and early malignancy, to provide the most clinical benefit.
AI methods for prediction of overall survival and treatment outcomes in HCC have emerged in the past two years and remain a dynamic area of study. Predictive potential of current models is higher for short-term outcomes than for long-term survival; however, this approach offers an array of novel predictive tools to shape HCC guidelines and support clinical decision making. In
the future, models combining radiomic data with clinical features to provide characterisation and prognosis for HCC patients are likely to be implemented in clinical practice to support decisions on treatment.
Limitations
AI has inherent limitations, and this holds true for its medical applications for HCC detection, characterisation and prediction. Most of the studies discussed in this review suffer from a small sample size, which is a major issue for deep learning algorithms, as they require large training datasets to perform well. What is more, many of the studies use single-institution data, often from tertiary care centres. As a result, despite achieving high AUC and accuracy, such AI algorithms often cannot be used outside a narrow context, ultimately hampering widespread clinical adoption of AI within HCC. Specific limitations also exist in each of the three domains discussed. Studies on detection focus on specific subtypes of HCC and narrow populations (e.g., HBV positive patients), rendering the proposed AI algorithms unable to perform screening for HCC on the level of the general population. Studies on characterisation are limited by lack of standardisation of biomarker assays, imaging techniques and histological specimen preparation, all of which contribute to difficulties in applying the results of the research in settings different from the original. Even though significant advancements in the implementation of HCC outcomes prediction have been made in recent years, there are many outstanding questions. Interpretation of algorithm outcomes remains a major challenge, as it is difficult to explain why the model fails to make accurate predictions for a proportion of cases.
This review is also not devoid of limitations. Firstly, the methodology of the study, being a literature review, limits the applicability of its conclusions. Moreover, the heterogeneity of quantitative data and AI methodologies has not allowed for pooled analysis of outcomes, but only a qualitative synthesis of evidence. The three-domain classification that was adopted for the purposes of this review is also imperfect, as more recent studies often discuss potential uses of AI across more than one domain, by utilising multiple types of data to inform clinical decisions. Moreover, selection bias might exist, as studies with negative findings might not be published. Finally, this review only aimed at assessing studies in English; however, high-quality studies in other languages might exist.
Future research directions
Consistency in reporting and transparency in publishing AI algorithms can significantly improve the clinical value of studies exploring AI models applied to HCC. Discrepancies in data standards and diagnostic devices used across different treatment centres contribute to overfitting of AI models, which needs to be overcome to facilitate the development of AI solutions with general applicability (89). External validation of the AI algorithms should also be favoured over retrospective internal validation, further increasing the applicability of the AI-driven solutions. International and interdisciplinary collaboration is instrumental in approaching this issue, as shown in a recent study that investigated an AI model in breast cancer diagnosis using data from both the UK and US (10). What is more, widespread availability of source code for algorithms can speed the process of AI development and validation, also contributing to larger applicability of these solutions. Finally, effective communication between computer scientists, engineers and clinicians is crucial for generating research which can redefine the current practice to address unmet clinical needs.
Conclusions
AI will revolutionise the way we detect and characterise HCC, as well as predict the course of its development; however, it is still experimental. In recent years, the rise of big data has caused AI-driven solutions utilising clinical data, radiological images, biomarkers and pathology results to emerge and gradually improve in accuracy; however, their widespread introduction into clinical practice has not occurred yet. Robust validation, large-scale studies, multicentre cooperation, advocacy for AI and education on AI amongst clinicians are all necessary for AI models to take the next step, so that in the future such models, using multiple data modalities, have the chance of influencing HCC guidelines and shaping clinical practice.
Acknowledgments
Funding: None.
Footnote
Reporting Checklist: The authors have completed the Narrative Review reporting checklist. Available at https://
24. Singal AG, Waljee AK, Mukherjee A, et al. Su2057 Machine Learning Algorithms Outperform Conventional Regression Models in Identifying Risk Factors for Hepatocellular Carcinoma in Patients With Cirrhosis. Gastroenterology 2012;142:S984.
25. Konerman MA, Zhang Y, Zhu J, et al. Improvement of predictive models of risk of disease progression in chronic hepatitis C by incorporating longitudinal data. Hepatology 2015;61:1832-41.
26. Reddy R, Imler TD. Artificial Neural Networks are Highly Predictive for Hepatocellular Carcinoma in Patients with Cirrhosis. Gastroenterology 2017;152:S1193.
27. Ioannou GN, Tang W, Beste L, et al. 498 - Deep Learning Models Accurately Predict Development of Hcc in 146,218 Patients with Chronic Hepatitis C. Gastroenterology 2019;156:S1201.
28. Książek W, Abdar M, Acharya UR, et al. A novel machine learning approach for early detection of hepatocellular carcinoma patients. Cogn Syst Res 2019;54:116-27.
29. Lee J, Kim KW, Kim SY, et al. Automatic detection method of hepatocellular carcinomas using the non-rigid registration method of multi-phase liver CT images. J Xray Sci Technol 2015;23:275-88.
30. Okumura E, Sanada S, Suzuki M, et al. Effectiveness of temporal and dynamic subtraction images of the liver for detection of small HCC on abdominal CT images: Comparison of 3D nonlinear image-warping and 3D global-matching techniques. Radiol Phys Technol 2011;4:109-20.
31. Guo D, Qiu T, Bian J, et al. A computer-aided diagnostic system to discriminate SPIO-enhanced magnetic resonance hepatocellular carcinoma by a neural network classifier. Comput Med Imaging Graph 2009;33:588-92.
32. Poon TCW, Chan ATC, Zee B, et al. Application of classification tree and neural network algorithms to the identification of serological liver marker profiles for the diagnosis of hepatocellular carcinoma. Oncology 2001;61:275-83.
33. Sato M, Morimoto K, Kajihara S, et al. Machine-learning Approach for the Development of a Novel Predictive Model for the Diagnosis of Hepatocellular Carcinoma. Sci Rep 2019;9:7704.
34. Kaur H, Dhall A, Kumar R, et al. Identification of Platform-Independent Diagnostic Biomarker Panel for Hepatocellular Carcinoma Using Large-Scale Transcriptomics Data. Front Genet 2020;10:1306.
35. Zhang C, Peng L, Zhang Y, et al. The identification of key genes and pathways in hepatocellular carcinoma by bioinformatics analysis of high-throughput data. Med Oncol 2017;34:101.
36. Ibrahim R, Yousri NA, Ismail MA, et al. Multi-level gene/MiRNA feature selection using deep belief nets and active learning. Annu Int Conf IEEE Eng Med Biol Soc 2014;2014:3957-60.
37. Gui T, Dong X, Li R, et al. Identification of Hepatocellular Carcinoma-Related Genes with a Machine Learning and Network Analysis. J Comput Biol 2015;22:63-71.
38. Liang Q, Liu H, Wang C, et al. Phenotypic Characterization Analysis of Human Hepatocarcinoma by Urine Metabolomics Approach. Sci Rep 2016;6:19763.
39. Chang NW, Dai HJ, Shih YY, et al. Biomarker identification of hepatocellular carcinoma using a methodical literature mining strategy. Database (Oxford) 2017;2017:bax082.
40. Di Bisceglie AM. Hepatitis C and hepatocellular carcinoma. Hepatology 1997;26:34S-38S.
41. Di Bisceglie AM. Hepatitis B and hepatocellular carcinoma. Hepatology 2009;49:S56.
42. Anstee QM, Reeves HL, Kotsiliti E, et al. From NASH to HCC: current concepts and future challenges. Nat Rev Gastroenterol Hepatol 2019;16:411-28.
43. Manzoni C, Kia DA, Vandrovcova J, et al. Genome, transcriptome and proteome: The rise of omics data and their integration in biomedical sciences. Brief Bioinform 2018;19:286-302.
44. Chen B, Garmire L, Calvisi DF, et al. Harnessing big ‘omics’ data and AI for drug discovery in hepatocellular carcinoma. Nat Rev Gastroenterol Hepatol 2020;17:238-51.
45. Kaur H, Bhalla S, Raghava GPS. Classification of early and late stage liver hepatocellular carcinoma patients from their genomics and epigenomics profiles. PLoS One 2019;14:e0221476.
46. Estevez J, Chen VL, Podlaha O, et al. Differential Serum Cytokine Profiles in Patients with Chronic Hepatitis B, C, and Hepatocellular Carcinoma. Sci Rep 2017;7:11867.
47. Kim SH, Lee JM, Kim KG, et al. Computer-aided image analysis of focal hepatic lesions in ultrasonography: Preliminary results. Abdom Imaging 2009;34:183-91.
48. Bharti P, Mittal D, Ananthasivan R. Characterization of chronic liver disease based on ultrasound images using the variants of grey-level difference matrix. Proc Inst Mech Eng H 2018;232:884-900.
49. Huang Q, Pan F, Li W, et al. Differential Diagnosis of Atypical Hepatocellular Carcinoma in Contrast-Enhanced Ultrasound Using Spatio-Temporal Diagnostic Semantics. IEEE J Biomed Health 2020. doi: 10.1109/
carcinoma patients after radiofrequency ablation. J Formos Med Assoc 2017;116:765-73.
74. Saillard C, Schmauch B, Laifa O, et al. Predicting survival after hepatocellular carcinoma resection using deep-learning on histological slides. Hepatology 2021;73:2077-8.
75. Zhou W, Chen H, Han W, et al. Prediction of hepatocellular carcinoma patient survival using machine learning classification rules. J Clin Oncol 2019;37:e15649-e15649.
76. Ji GW, Zhu FP, Xu Q, et al. Machine-learning analysis of contrast-enhanced CT radiomics predicts recurrence of hepatocellular carcinoma after resection: A multi-institutional study. EBioMedicine 2019;50:156-65.
77. Wang W, Chen Q, Iwamoto Y, et al. Deep Learning-Based Radiomics Models for Early Recurrence Prediction of Hepatocellular Carcinoma with Multi-phase CT Images and Clinical Data. Annu Int Conf IEEE Eng Med Biol Soc 2019;2019:4881-4.
78. Xu D, Sheng JQ, Hu PJH, et al. Predicting hepatocellular carcinoma recurrences: A data-driven multiclass classification method incorporating latent variables. J Biomed Inform 2019;96:103237.
79. Feng ST, Jia Y, Liao B, et al. Preoperative prediction of microvascular invasion in hepatocellular cancer: a radiomics model using Gd-EOB-DTPA-enhanced MRI. Eur Radiol 2019;29:4648-59.
80. Tsilimigras DI, Mehta R, Moris D, et al. Utilizing Machine Learning for Pre- and Postoperative Assessment of Patients Undergoing Resection for BCLC-0, A and B Hepatocellular Carcinoma: Implications for Resection Beyond the BCLC Guidelines. Ann Surg Oncol 2020;27:866-74.
81. Guo D, Gu D, Wang H, et al. Radiomics analysis enables recurrence prediction for hepatocellular carcinoma after liver transplantation. Eur J Radiol 2019;117:33-40.
82. Ye QH, Qin LX, Forgues M, et al. Predicting hepatitis B virus-positive metastatic hepatocellular carcinomas using gene expression profiling and supervised machine learning. Nat Med 2003;9:416-23.
83. Itzel T, Spang R, Maass T, et al. Random gene sets in predicting survival of patients with hepatocellular carcinoma. J Mol Med 2019;97:879-88.
84. Dong RZ, Yang X, Zhang XY, et al. Predicting overall survival of patients with hepatocellular carcinoma using a three-category method based on DNA methylation and machine learning. J Cell Mol Med 2019;23:3369-74.
85. Fa B, Luo C, Tang Z, et al. Pathway-based biomarker identification with crosstalk analysis for robust prognosis prediction in hepatocellular carcinoma. EBioMedicine 2019;44:250-60.
86. Santos MS, Abreu PH, García-Laencina PJ, et al. A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients. J Biomed Inform 2015;58:49-59.
87. Abajian A, Murali N, Savic LJ, et al. Predicting treatment response to image-guided therapies using machine learning: An example for trans-arterial treatment of hepatocellular carcinoma. J Vis Exp 2018;2018:e58382.
88. Peng J, Kang S, Ning Z, et al. Residual convolutional neural network for predicting response of transarterial chemoembolization in hepatocellular carcinoma from CT imaging. Eur Radiol 2020;30:413-24.
89. Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: A simple way to prevent neural networks from overfitting. J Mach Learn Res 2014;15:1929-58.
doi: 10.21037/tgh-20-242
Cite this article as: Kawka M, Dawidziuk A, Jiao LR, Gall
TMH. Artificial intelligence in the detection, characterisation
and prediction of hepatocellular carcinoma: a narrative review.
Transl Gastroenterol Hepatol 2022;7:41.