
Review

Artificial intelligence for the prevention and clinical management of hepatocellular carcinoma
Julien Calderaro1,2, Tobias Paul Seraphin3, Tom Luedde3, Tracey G. Simon4,5,*

Summary
Hepatocellular carcinoma (HCC) currently represents the fifth most common malignancy and the third-leading cause of cancer-related death worldwide, with incidence and mortality rates that are increasing. Recently, artificial intelligence (AI) has emerged as a unique opportunity to improve the full spectrum of HCC clinical care, by improving HCC risk prediction, diagnosis, and prognostication. AI approaches include computational search algorithms, machine learning (ML) and deep learning (DL) models. ML consists of a computer running repeated iterations of models, in order to progressively improve performance of a specific task, such as classifying an outcome. DL models are a subtype of ML, based on neural network structures that are inspired by the neuroanatomy of the human brain. A growing body of recent data now apply DL models to diverse data sources – including electronic health record data, imaging modalities, histopathology and molecular biomarkers – to improve the accuracy of HCC risk prediction, detection and prediction of treatment response. Despite the promise of these early results, future research is still needed to standardise AI data, and to improve both the generalisability and interpretability of results. If such challenges can be overcome, AI has the potential to profoundly change the way in which care is provided to patients with or at risk of HCC.
© 2022 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.

Keywords: Artificial intelligence; Machine learning; Deep learning; Liver cancer.

Received 7 November 2021; received in revised form 26 December 2021; accepted 14 January 2022
1 Assistance Publique-Hôpitaux de Paris, Henri Mondor University Hospital, Department of Pathology, Créteil, France; 2 Inserm U955 and Univ Paris Est Creteil, INSERM, IMRB, 94010, Creteil, France; 3 Department of Gastroenterology, Hepatology and Infectious Diseases, University Hospital Duesseldorf, Medical Faculty at Heinrich-Heine-University Duesseldorf, Duesseldorf, Germany; 4 Liver Center, Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; 5 Clinical and Translational Epidemiology Unit (CTEU), Massachusetts General Hospital, Boston, MA, USA

* Corresponding author. Address: Liver Center, Division of Gastroenterology, Massachusetts General Hospital, 55 Fruit Street, Wang 5th Floor Boston, MA 02114, USA; Tel.: 617-724-2401, fax: 617-724-5997.
E-mail address: tgsimon@mgh.harvard.edu (T.G. Simon).

https://doi.org/10.1016/j.jhep.2022.01.014

Introduction and definitions
With a global incidence of approximately 500,000 cases per year, hepatocellular carcinoma (HCC) represents the fifth most common malignancy and the third-leading cause of cancer-related death worldwide.1,2 The vast majority of HCC tumours arise on a background of cirrhosis, which in turn is most commonly caused by non-alcoholic fatty liver disease (NAFLD), alcohol-related liver disease, or HBV/HCV infection. Despite recent advances in treatment, including the use of atezolizumab plus bevacizumab for unresectable HCC, prognosis remains poor, with a 5-year survival rate of just 15%, due to delays in diagnosis and the limited efficacy of existing therapies.3,4 While liver transplantation can be curative for HCC in selected cases, this represents a limited and resource-intensive solution, and the vast majority of patients are not eligible for transplantation. Thus, identifying novel approaches to improve the early diagnosis of HCC and to predict therapeutic response and survival among patients with established HCC is of paramount importance.

Owing to the broad heterogeneity in HCC risk factors and pathogenesis, established strategies for prediction and prognostication are still limited. Recently, artificial intelligence (AI) has emerged as a unique opportunity to improve the full spectrum of HCC clinical care, by: i) improving the prediction of future HCC risk in patients with established liver disease; ii) improving the accuracy of HCC diagnosis in patients undergoing surveillance imaging or liver biopsies; and iii) improving prognostication in patients with established HCC.

AI is a broad field that includes computational search algorithms, machine learning (ML) and deep learning (DL) models (Fig. 1). ML consists of a computer running repeated iterations of models in order to progressively improve performance of a specific task, such as classifying an outcome. ML models are designed to improve with time, by incorporating additional input training data and thereby optimising the parameters of an algorithm. With time and training, the desired output becomes increasingly accurate. Based on how the training process is conducted, ML may be classified as supervised or unsupervised. Supervised ML algorithms perform training on a dataset that is labelled in relation to the class of interest, and this label is available to the algorithm while the model is being created, trained, and optimised. In contrast, unsupervised ML involves training on a dataset that lacks class labels, yielding clusters of output data that subsequently require additional interpretation.

DL represents a subtype of ML models which are constructed using neural networks (NNs) inspired by the neuroanatomy of the human brain. NNs consist of a network of interconnected computing units – termed “neurons” – that are organised in layers, such that signals travel from the first layer

Journal of Hepatology 2022 vol. 76 | 1348–1361


Fig. 1. Definitions of artificial intelligence (AI), machine learning (ML) and deep learning (DL).
Artificial intelligence (AI): Computer programs designed to mimic human intelligence and/or behavior, including learning, adapting and problem-solving.
Machine learning (ML): Subtype of AI, in which computer programs are enabled to “learn” from data and improve with experience.
Deep learning (DL): Subtype of ML inspired by the human brain, in which programs utilize the complex architecture of multi-layered neural networks to analyze large amounts of data.

(i.e. input data) to the last layer (i.e. output data) after passing through multiple, intervening hidden layers (Fig. 2). To train an NN, data are divided into a training set and a testing set. The training set characterises the architecture of the network and defines and adjusts the weights between neurons, in order to improve classification of the desired output. The testing set then evaluates the utility of the NN for identifying or predicting that output. This validation can be conducted internally or externally. Internal validation is commonly performed by k-fold cross validation within one dataset, by splitting that dataset into k parts and then training k times on k-1 parts, each time testing on the remaining part of the dataset. External validation is typically considered more robust, as it demonstrates model generalisability across populations.

Current limitations of DL approaches include overfitting of data, limited ‘explainability’ of data, and the possibility of poor generalisability, due to the inherent reliance of DL models on the size and diversity of their training dataset. In this review, we will outline the rapidly evolving role and challenges for AI in the prediction, diagnosis, and prognostication of HCC.

Key point
Due to the broad heterogeneity in risk factors for HCC and the lack of established strategies for prediction or prognostication, AI has recently emerged as a unique opportunity to improve the full spectrum of HCC clinical care.

AI for predicting incident HCC
Several previous case-control and cohort studies have developed predictive models for the development of HCC using clinical, demographic and/or laboratory risk factors, selected using conventional statistical approaches. However, these models have largely been criticised for their limited generalisability, modest accuracy, and lack of broad external validity. Moreover, HCC risk is notoriously

Fig. 2. General concept of pipelines using neural networks. Different input data are pre-processed in such a way that they can be used as input values for the training of a neural network. The neural network consists of one input layer, multiple hidden convolutional and/or multiple fully connected layers extracting features from the input data, and one output layer with nodes that refer to different labels. These networks can then – among others – be used to classify data or to predict therapeutic response or prognosis.
Panels: Input data (ultrasonography, electronic health records, CT and MRI, biomarkers, histopathology images); Model training and validation (preprocessing; input values, input layer, multiple hidden layers, output layer); Inference (prediction: diagnostics and classification, therapeutic response, survival prognosis).
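The internal validation scheme described in the text (k-fold cross validation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the helper names `k_fold_indices` and `cross_validate`, the toy labels, and the trivial majority-class "model" are hypothetical and are not taken from any study cited in this review. Real pipelines usually shuffle or stratify the samples before splitting, which is omitted here for brevity.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(n_samples, k, fit, score):
    """Train k times on k-1 folds, each time scoring on the held-out fold."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit(train_idx)
        scores.append(score(model, test_idx))
    return scores


# Toy binary labels (hypothetical): 1 = outcome present, 0 = absent.
labels = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]


def fit_majority(train_idx):
    """Stand-in 'model': always predict the majority class of the training folds."""
    ones = sum(labels[i] for i in train_idx)
    return 1 if 2 * ones > len(train_idx) else 0


def accuracy(model, test_idx):
    """Fraction of held-out samples whose label equals the constant prediction."""
    return sum(labels[i] == model for i in test_idx) / len(test_idx)


fold_scores = cross_validate(len(labels), 5, fit_majority, accuracy)
mean_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, mean_score)
```

Each sample is used for testing exactly once, so the mean of the fold scores summarises performance on data the model did not see during the corresponding training run. This remains an internal check; it does not replace validation in an external cohort.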


challenging to model because this risk can fluctuate widely in an individual over time, and such non-linear changes are difficult to estimate using rigid, conventional regression models. Recently, the rapid expansion of available electronic health record (EHR) data has provided an opportunity to leverage large-scale, longitudinal data elements for automatic feature selection over long-term follow-up, and thereby improve HCC risk prediction. To that end, several recent studies have applied AI approaches to longitudinal EHR data to improve prediction of incident HCC (Table 1). For example, in 2013, a supervised ML algorithm was found to have a c-statistic of 0.64 for predicting incident HCC in patients with cirrhosis of any aetiology, and this significantly outperformed a conventional system for HCC risk prediction.5 More recently, another model developed in patients with chronic hepatitis C infection in the U.S. Veterans Affairs cohort demonstrated an AUROC of 0.759 for incident HCC.6 In all cases, the models constructed by AI approaches significantly outperformed traditional regression models.

It has been posited that improved HCC risk prediction models leveraging AI techniques could be used to personalise HCC surveillance strategies by improving risk stratification of patients with chronic liver disease. For example, Ioannou and colleagues found that targeting patients with the uppermost 51% of their NN-derived HCC risk score would include 80% of patients who would develop HCC within the subsequent 3 years.6 Such an approach could be useful in resource-limited settings that do not have sufficient capacity for regular HCC surveillance in all at-risk patients. However, to date, the clinical utility of this and other AI-based scores for predicting risk of HCC is unclear, particularly as these data have limited generalisability, given their reliance on the size and diversity of the training dataset.

Table 1. Selected prior studies utilising AI to predict incident HCC.
Table 1 is printed rotated in the source; cell values are listed per column in extraction order, and their row alignment is not recoverable from the text.
Author, year: Ioannou GN; An C, 2021; Singal AG; Reddy R; Nam JY (years extracted separately: 2020; 2013; 2019; 2017)
Population: General population (Korea); Cirrhosis (HBV) on entecavir; Chronic HCV; Cirrhosis; Cirrhosis
AI classifier: Recurrent neural network; Artificial neural network; Random forest; Random forest; Deep neural network
Validation method: External validation; External validation (HALT-C trial); Internal validation; n.a.; n.a.
HCC cases (n)/total cohort (n): Validation: 88/1,050; Training: 165/6,092; Validation: n = 316; Training: 86/424; Training: 41/442; Training: 10,741/; Validation: 390/; Training: 1,799/ (denominators extracted separately: 331,694; 85,692; 48,151)
Accuracy: C-statistic 0.719 (training), 0.782 (validation); C-statistic (validation) 0.857; AUROC 0.873; C-statistic 0.64; AUROC 0.759; AUROC 0.96
Sensitivity/specificity: 80.5% (57.9% in the training set)/80.7% (46.8% in the validation set); 83.6%/99.9%; 71.8%/88.4%; Proportion testing positive at 90%: sensitivity = 0.663; n.a.
Improvement over traditional methods: Outperformed numerous conventional algorithms (PAGE-B, CU-HCC, ADRESS-HCC and THRI; all p <0.001); Outperformed HALT-C model for predicting HCC (IDI = 0.01, p = 0.04; NRI = 0.39, p <0.001); Outperformed conventional logistic regression models; n.a.; n.a.
AI, artificial intelligence; AUROC, area under the receiver-operating characteristic curve; HCC, hepatocellular carcinoma; IDI, integrated discrimination index; NRI, net reclassification index.

AI for diagnosing HCC: radiomics, histopathology and biomarkers
Numerous studies have tested the utility of AI for accurately detecting existing HCC, based on imaging modalities or biomarkers.

Radiomics: ultrasound
Current clinical guidelines recommend regular B-mode abdominal ultrasound surveillance for the identification of HCC in patients with cirrhosis.7–9 However, ultrasound has several well-described limitations when it comes to detecting focal liver lesions, including a high degree of dependence on operator experience, equipment quality, and patient body habitus, among others. For detection of HCC, the sensitivity of B-mode ultrasound is only 46-63%.9–11 To address this, several recent studies have tested the ability of AI frameworks to improve the diagnostic accuracy of ultrasound in this setting.
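Sensitivity, specificity and accuracy, the diagnostic metrics quoted throughout this section, all derive from the same 2x2 table of test results against a reference standard. The following Python sketch shows the arithmetic under stated assumptions: the class `ConfusionCounts`, the helper `count_outcomes` and all counts are hypothetical illustration values, not data from any cited study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # HCC present, test positive
    fn: int  # HCC present, test negative (missed lesion)
    tn: int  # HCC absent, test negative
    fp: int  # HCC absent, test positive (false alarm)


def count_outcomes(y_true, y_pred):
    """Tally a 2x2 table from paired reference labels and test results (1 = HCC)."""
    pairs = list(zip(y_true, y_pred))
    return ConfusionCounts(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
    )


def sensitivity(c):
    """Share of true HCC cases that the test detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Share of non-HCC cases that the test correctly clears."""
    return c.tn / (c.tn + c.fp)


def accuracy(c):
    """Share of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical cohort: 50 lesions with HCC, 100 without.
counts = ConfusionCounts(tp=30, fn=20, tn=90, fp=10)
print(sensitivity(counts), specificity(counts), accuracy(counts))
```

With these invented counts, accuracy (0.8) looks considerably better than sensitivity (0.6) because non-HCC lesions outnumber HCC lesions two to one. This is one reason to read sensitivity and specificity alongside any reported overall accuracy.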



Schmauch and colleagues designed a supervised DL model, using a training dataset of 367 ultrasound images together with their corresponding radiological reports, that could identify liver lesions as benign or malignant with a mean AUROC of 0.93 and 0.92, respectively.12 More recently, Yang and colleagues developed and externally validated a deep convolutional neural network (DCNN), using a large, multicentre, ultrasound imaging database from 13 hospital systems. The final model demonstrated an AUROC of 0.92 for distinguishing benign from malignant liver lesions, and showed comparable a) performance to the judgment of clinical radiologists (diagnostic accuracy, both 76.0%) and b) accuracy to contrast-enhanced CT (diagnostic accuracy, both 84.7%) that was only slightly inferior to MRI (87.9%).13

Similar approaches have also been applied to contrast-enhanced ultrasound (CEUS) imaging for the detection of HCC. For example, Guo and colleagues recently demonstrated that a DL algorithm applied to liver lesions seen by CEUS could increase the sensitivity, specificity, and overall accuracy of CEUS for detecting HCC.14 Others have used AI to apply additional pattern recognition classifiers to CEUS DCNN algorithms, to improve diagnosis of indeterminate focal liver lesions.15 However, to date, most prior CEUS studies have had small sample sizes and lacked standardised imaging data or external validation cohorts (to confirm the generalisability of models across populations).

CT and MRI
Another rapidly growing area of research is focused on improved characterisation of indeterminate liver lesions. In clinical practice, when an abdominal ultrasound shows a new liver lesion, a patient is typically referred for further imaging, with contrast-enhanced CT or MRI. Based on the fulfilment of specific radiologic criteria, certain liver lesions may be considered as having pathognomonic features of HCC, and thus do not require liver biopsy for further histological confirmation. However, liver nodules imaged by CT or MRI often demonstrate indeterminate features, for which current recommendations include either liver biopsy or close interval follow-up with serial imaging.7,9 This practice is sub-optimal, resulting in numerous imaging studies, patient stress, and the potential for delayed diagnoses of liver cancer. For this reason, a growing body of recent literature has explored AI approaches to improve risk stratification of indeterminate liver lesions, to facilitate earlier and more accurate detection of HCC.

In an early study focused on this issue, Preis and colleagues developed an NN to assess focal liver lesions identified by 18F-FDG-PET/CT (fluorine-18 fluorodeoxyglucose positron emission tomography/CT) evaluations, together with patient demographics and clinical characteristics of 98 patients; their model had an AUROC of 0.896 for the identification of focal liver lesions, outperforming the results of blinded radiologists.16 Mokrane and colleagues conducted a small retrospective study (n = 178) of patients with cirrhosis and indeterminate liver lesions, for whom diagnostic liver biopsy was recommended. Applying DL approaches, the authors constructed a radiomics signature based on 13,920 CT imaging classifiers, that achieved an AUROC of 0.70 for distinguishing HCC from non-HCC lesions. Importantly, the authors demonstrated that the signature was not influenced by segmentation or by contrast enhancement, which adds to its putative generalisability.17 Another retrospective study, by Yasaka et al. (n = 460), utilised CT imaging classifiers from 3 phases (non-contrast-enhanced, arterial, and delayed) to construct a 3-layer CNN for distinguishing a) HCC and non-HCC liver cancers from b) indeterminate liver lesions, haemangiomas and cysts; their CNN had a diagnostic accuracy of 0.84 with a median AUROC of 0.92.18 More recently, Shi and colleagues compared the performance of a triple-phase contrast-enhanced CT protocol coupled with a DL model, to a four-phase CT protocol, for distinguishing HCC from other focal liver lesions.19 The authors found that a DL model combined with triple-phase CT protocol without pre-contrast yielded similar diagnostic accuracy (85.6%) to a four-phase protocol (83.3%; p = 0.765). These findings suggest that reducing a patient’s radiation dose with a triple-phase CT protocol may not compromise accuracy, and thereby brings the field one step closer to optimising CT protocols for the accurate classification of liver lesions.

Given the wide variability of radiographic features of the liver and liver lesions, manual segmentation for radiomics-based assessments of HCC is both difficult and time-consuming. In 2017, the Liver Tumor Segmentation (LiTS) Challenge called upon investigators to develop AI-based algorithms that could automatically segment liver tumours, using a multinational dataset of 200 CT scans (130 training, 70 validation scans).20,21 All of the top-scoring automatic methods used fully convolutional NNs that separately segmented the liver and liver tumours. Segmentation quality was evaluated using Dice scores, and the best-scoring algorithm achieved a Dice score of 0.96, whereas for liver tumour segmentation the best algorithm achieved Dice scores between 0.67 and 0.70. While these findings are promising, there was notable variability in both the imaging characteristics of liver tumours and in their annotation, underscoring the need for universal, standardised methods for liver tumour segmentation.20,21

To date, AI has been applied less frequently to MRI imaging of HCC tumours, and given the technical difficulty and expense associated with manually designing MRI features, the majority of published studies have been conducted in relatively small populations. Nevertheless, a prior


study combined clinical data with MRI-based classifiers to distinguish HCC from metastases and from liver adenomas, cysts or haemangiomas, and demonstrated a sensitivity of 0.73 for identifying HCC, albeit with a specificity of just 0.56.22 Additionally, Hamm et al. developed an NN algorithm that successfully classified MRI liver lesions with a sensitivity of 92%, a specificity of 98%, and an overall accuracy of 92%.23 Zhang and colleagues tested an automated approach to segmentation of multi-parameter MR images in 20 patients with HCC, and demonstrated the feasibility of bypassing the time-consuming process of manually designing MRI-based features.24

More recently, Zhen et al. used CNNs to develop a novel DL system that incorporated enhanced MR images, unenhanced MR images and both structured and unstructured clinical data, from 1,210 patients with liver tumours, and an external validation set (n = 201).25 This DL system demonstrated excellent performance for classifying liver tumours – including HCC – with sensitivity and specificity on a par with that observed for experienced radiologists. Importantly, this DL model also showed excellent performance when combining unenhanced MR imaging with clinical data, suggesting that, with further validation, these models may permit patients to avoid contrast-related complications of MRI. Finally, Wang and colleagues recently described a DL model designed to address the limited interpretability of AI-based radiomics assessments of HCC.26 This innovative model provides feedback on the relative importance of various radiological input features, and thereby serves as an important proof of concept, demonstrating that “interpretable” DL models could one day be used to improve standardised HCC reporting systems and thereby clinical outcomes.

To date, published AI algorithms for radiomics assessments of HCC share important limitations, including relatively small input datasets, lack of sufficiently large or diverse cohorts for robust external validation and lack of standardisation of methods or analytical tools. It will be important to define the utility of AI-based prediction tools in prospective cohorts, and in pooled, large-scale and diverse populations.

Key point
AI reflects a broad and rapidly evolving field that includes ML and DL computational algorithms, which are iteratively repeated, in order to progressively improve model performance and classification over time.

Histopathology
Histopathology is a cornerstone in the management of many liver diseases, including autoimmune hepatitis and non-alcoholic steatohepatitis (for grading and staging). Although non-invasive criteria allow for the diagnosis of HCC in particular clinical settings, the histological examination of tumour samples is often required for masses with atypical features on imaging or to rule out a diagnosis of benign primary liver tumour, cholangiocarcinoma or even metastasis. However, precise histopathological characterisation of liver tumours can often prove challenging for hepatopathologists, and significant inter-observer disagreement may be observed. To address this, several recent studies have applied AI to assist with the diagnosis of liver tumours. Using 2 large data sets of H&E-stained digital slides, Liao et al. used a CNN to distinguish HCC from adjacent normal tissues, with AUCs above 0.90.27 Kiana et al. developed a tool able to classify image patches as HCC or cholangiocarcinoma. The model reached an accuracy of 0.88 on the validation set and, interestingly, the authors observed that the combination of the model and the pathologist outperformed both the model alone and the pathologist alone, suggesting that AI tools should be used to augment, rather than replace, the conventional histological diagnosis. They also showed how an incorrect prediction may negatively impact the final diagnosis made by pathologists, underscoring the need to be cautious with AI models aimed at automating diagnosis.28

It has been widely demonstrated that the histological appearance of human cancers, including HCC, contains a massive amount of information related to their underlying molecular alterations and/or to patient prognosis.29–31 In this line, Wang et al. trained a multitask DL NN for automated single-cell segmentation and classification on digital slides. This approach allowed the authors to extract quantitative image features related to individual cells as well as spatial relationships between neoplastic cells and infiltrating lymphocytes. Unsupervised consensus clustering of these features led to the identification of 3 subtypes associated with particular somatic genomic alterations and molecular pathways.32 Another study showed that DL could predict a subset of recurrent HCC genetic defects with AUCs ranging from 0.71 to 0.89.33

Recent pioneering studies have thus aimed to predict molecular signatures/alterations predictive of response to systemic therapies, by processing digital slides through NNs. In gastrointestinal cancers, for example, high performance is achieved for the prediction of microsatellite instability, a feature strongly associated with sensitivity to immunomodulating therapies.34 Two other pan-cancer studies also demonstrated that NN models were able to predict a wide range of molecular alterations or signatures, some of which are related to response to particular systemic therapies.35,36 For HCC, no molecular feature is currently used to predict response to the systemic therapies available for patients with advanced disease. However, Sangro et al. recently reported that responses to the anti-programmed death 1 receptor (PD1) antibody nivolumab were more frequently observed in patients with tumours showing overexpression of particular immune gene signatures.37 This was further confirmed by Haber et al., who also observed increased sensitivity to immunotherapy in HCCs in which interferon gamma and gene sets associated with antigen presentation were



upregulated.38 Immune cells are easily identified by DCNNs, and it is likely that DL will be able to predict this type of gene expression profile.

Most of these different studies share the same limitations, including the limited number of patients, sensitivity to staining protocols and lack of prospective validation. The standardisation of slide encoding and processing will also be key to enable comparisons of model performance. Finally, it will be critical to determine how predictions are impacted by artifacts such as tissue folds or stains. Automated quality control of slides may help to overcome these issues.

Molecular biology and biomarkers
The past 20 years have witnessed an explosion in the availability of large, complex data sets with genomic and molecular data from bulk tissues and from single cells. Consequently, AI algorithms leveraging integrative multiomics approaches have also been designed to improve the detection and characterisation of HCC tumours. Such integrated algorithms have shown promise for informing disease diagnosis and staging, and for the prediction of disease recurrence and therapeutic response.39,40

As an example, integrated multiomics analyses are increasingly used to assess individual variation in key patterns of hepatic gene expression, and to define intratumoural heterogeneity.41 Zeng and colleagues constructed a DL model based on RNA-sequencing (RNA-seq)-defined samples, and used those classified features to construct gene expression signatures for cancer.42 The DL-defined autoencoder was found to outperform numerous traditional analytical approaches based on principal component analysis or top varying genes.

In another study of HCC samples, Chaudhary and colleagues applied supervised and unsupervised DL approaches to RNA-seq, miRNA-seq and DNA methylation data, and identified 2 distinct HCC subpopulations with significant survival differences, with a C-statistic of 0.68 in the training dataset and 0.67-0.82 in 5 external validation sets.43 This algorithm has subsequently been applied to external HCC cohorts (n = 1,494), revealing consensus driver genes linked to HCC survival.44 Future work will need to demonstrate the utility of those signatures for informing therapeutic decision making.

Finally, single-cell RNA-seq technologies now permit thousands of single cells to be profiled simultaneously and in an unbiased fashion, which holds great promise for powerful DL approaches. Single-cell RNA-seq permits the identification of unique cellular subpopulations and their transcriptomic profiles, as well as complex gene regulatory networks.45 Within the liver, single-cell RNA-seq has been used to more comprehensively elucidate the cellular transcriptomes of non-alcoholic steatohepatitis and cirrhosis, and to identify novel cell types and cell-cell interactions.46–48 In HCC, it has permitted identification of new subsets of tumour-infiltrating lymphocytes, including clonally expanded exhausted CD8+ T cells and regulatory T cells, and tumour-associated macrophages.49,50 Collectively, these findings are helping to uncover the immunological landscape of chronic liver disease and HCC, with unprecedented resolution.

The field of single-cell RNA-seq is still in its infancy and key challenges remain, including the variation between methods in terms of data quality and sensitivity, as well as the noisiness and incompleteness of generated data.51–53 Specifically, low-abundance data is frequently lost, rendering an expressed transcript undetectable (a phenomenon called “dropout”).54 On the other hand, unnecessary amplification of noise risks artificially accentuating the significance of less relevant pathways.45 Several DL-based tools are currently available to address these issues in single-cell RNA-seq datasets, including DeepImpute and SAUCIE, which apply node/gene interaction structures, as well as adaptations of generative adversarial networks, which can generate single-cell RNA-seq data and ascertain individual cell types using NNs.55–57 It is hoped that further improvements in DL algorithms will help to improve the validity of single-cell RNA-seq datasets through imputation, by “denoising” with an autoencoder that predicts genes’ mean, standard deviation and likelihood of dropout, or by streamlining downstream data analyses.55,58

New technologies incorporating DL have recently been developed to integrate single-cell RNA-seq profiling with epigenetic and proteomic assays, in order to more comprehensively profile individual cells.59–61 Such multi-omics approaches have tremendous potential utility for uncovering novel biomarkers and therapeutic targets in HCC. However, universal, standardised methods and protocols must first be established, and much larger datasets will be needed, given that the accuracy of DL algorithms depends upon the size and quality of input data. This, in turn, will require collaboration between investigators and the sharing of algorithms, approaches and raw datasets.

Key point
A growing body of research has applied AI approaches to improve HCC risk prediction, and to more accurately detect and risk stratify existing HCC tumours, based on EHR data, radiomics approaches, and molecular or histopathological biomarkers.

AI for prognostication in established HCC
The development of robust prognostic scoring systems is key to improve patient risk stratification and to plan clinical trials testing neoadjuvant or adjuvant therapies (see Table 2). A DL algorithm based on a residual NN architecture was recently developed in a Korean multicentre study to predict HCC recurrence after transplantation. The features included age, tumour size, and serum levels of alpha-fetoprotein and PIVKA-II (prothrombin induced by vitamin K absence or antagonist-II); the authors showed the advantages of their model (MoRAL-AI, assessed by C-indices) in their external

Table 2. Selected prior studies utilising AI for HCC prognostication.

| Author, Year | HCC cases (n) | AI algorithm | Validation method | Input data | Test statistics | Highlight |
|---|---|---|---|---|---|---|
| Abajian A, 2018 | 36 | Logistic regression, random forest | Internal leave-one-out cross validation | MR images and clinical data | Accuracy: 78%; Sensitivity: 62.5%; Specificity: 82.1% | Prediction of TACE response. Successful implementation of AI methods for the combination of clinical and imaging data |
| Ji GW, 2019 | Training: 210; Validation: 107 internal, 153 external | RSF/MRMR | External validation | CT images and clinical data | C-statistic: 0.73 | Prediction of HCC recurrence after resection; outperformed conventional outcome prediction scores, e.g. BCLC stage |
| Nam JY, 2020 | Training: 349; Validation: 214 | Residual neural network | External validation | Clinical data | C-statistic: 0.75; Sensitivity: 76%; Specificity: 46% | Prediction of HCC recurrence after LT; outperformed conventional recurrence prediction scores, e.g. Milan criteria |
| Saillard C, 2020 | Training: 194; Validation: 328 | Artificial neural network | External validation | Digitised histopathology slides | C-statistic: 0.78 | Survival prediction after HCC resection; outperformed conventional clinical, biological or pathological parameters |
| Peng J, 2020 | Training: 562; Validation: 227 | Residual convolutional neural network | External validation | CT images | AUC: >0.95 | Prediction of TACE response. First study to predict complete/partial response and stable/progressive disease showing good accuracy |
| Oezdemir I, 2020 | 36 | Distance weighted discrimination method | Internal leave-one-out cross validation | Contrast-enhanced ultrasound images | Accuracy: 86%; Sensitivity: 89%; Specificity: 82% | Prediction of TACE response. First study providing proof of concept using AI methods with ultrasonography images |

AI, artificial intelligence; AUC, area under the curve; HCC, hepatocellular carcinoma; LT, liver transplantation; MRMR, maximum relevance minimum redundancy; RSF, random survival forest; TACE, transarterial chemoembolisation.
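The accuracy, sensitivity and specificity values reported in Table 2 all derive from a confusion matrix. The following is a minimal sketch of that arithmetic, not code from any of the cited studies. The function name and the counts are hypothetical; the counts were chosen only because, for a 36-patient cohort, they happen to reproduce percentages of the same magnitude as those reported for a small TACE response study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # correct calls among all patients
        "sensitivity": tp / (tp + fn),      # responders correctly identified
        "specificity": tn / (tn + fp),      # non-responders correctly identified
    }


# Hypothetical counts (assumption): 8 responders and 28 non-responders, 36 patients in total.
metrics = diagnostic_metrics(tp=5, fp=5, tn=23, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With so few patients, a single reclassified responder shifts sensitivity by more than ten percentage points, which is one reason the small single-centre results in the table call for external validation.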
validation cohort, compared to other state-of-the-art predictive models, like the Milan criteria.62

The morphological features of HCC have a major impact on patient prognosis, and several DL algorithms have thus been developed to improve the prediction of HCC recurrence/survival using CT scans, MRI or histopathological images. Saillard et al. built a model based on the processing of HCC digital slides that was able to predict the survival of patients with HCC treated by surgical resection with a higher accuracy than scores including all relevant clinical, biological and pathological features. Notably, they were validated in a series of cases for which slides were stained with different protocols, suggesting that such models may generalise well when tested in different clinical centres.63 A recent study from Yamashita et al. confirmed the capability of AI algorithms to predict outcomes based on digital histologic slides.64 Lu and Daigle used 3 state-of-the-art CNNs (VGG 16, Inception v3, ResNet50), pretrained on ImageNet for feature extraction using HCC histopathology slides from the TCGA-LIHC cohort, and selected features significantly associated with survival using multivariable Cox regression analysis. While this again highlights the possibility of performing outcome prediction using histopathology slides, the conclusions are limited by the missing adjustment for other prognostic factors, as well as the lack of an external validation cohort.65 Saito et al. applied classical ML methods to handcrafted whole slide image features from a relatively small cohort of 158 patients with HCC to develop a combined model, predicting HCC recurrence after resection with an accuracy of 89%. The next step will be to validate these promising results in a larger cohort.66

An exponentially growing number of studies also investigate the predictive performance of images from MRI or CT scans. Ji et al. combined several clinical and biological features (including serum alpha-fetoprotein, albumin-bilirubin [ALBI] grade and tumour margin status), and radiomics signatures to assess the risk of HCC recurrence after surgical resection.67 Other authors also aimed to process CT scan or MR images to predict microvascular invasion, cytokeratin 19 expression (progenitor phenotype) or early tumour recurrence.68–71 Several studies investigated the ability of AI methods to predict responses to transarterial chemoembolisation (TACE) in patients with advanced HCC. Abajian et al. used handcrafted radiomics features from MR images to train logistic regression and random forest models to classify patients treated with TACE as responders or non-responders. The models achieved a maximal overall accuracy of 78% but revealed the potential of ML algorithms in TACE response prediction.72 Classical ML, as well as DL techniques were used on CT image radiomics features by Liu et al. to develop AI-based prognostic risk factors for overall survival. Interestingly, these factors were shown to be independently associated with survival; yet, it is important to highlight that the study lacked external validation and a simple train-validate-test split approach was used, which may limit generalisability.73 Similarly, Zhang et al.’s DL score, based on a DenseNet-121 feature extraction architecture was also derived from CT images of patients with HCC treated with TACE plus sorafenib. The DL score was independently associated with overall survival, after controlling for known prognostic factors.74 Using residual CNNs, Peng et al. trained (562 patients) and externally validated (89 and 138 patients) an algorithm yielding AUCs of at least 0.94 for prediction of complete or partial response and stable or progressive disease after TACE therapy.75 A single study involving ultrasound was conducted by Oezdemir et al., who extracted handcrafted HCC microvascular features from CEUS images to predict response to TACE. The model achieved an accuracy of 86%, yet the results require further evaluation due to the small sample size (n = 36).76

Key point
Key limitations of existing AI algorithms include overfitting of data, limited ‘explainability’ of results, and the possibility of poor generalisability, due to the inherent reliance of ML and DL models on the size and diversity of their training datasets.

Current challenges limiting the use of AI for HCC risk prediction and prognostication

Need for standardisation of algorithms and software
Although AI holds many promises for the improvement of HCC detection and patient stratification, deployment of ML algorithms in clinical settings remains very rare. The safe translation of DL models will indeed require standardisation and robust evaluation using metrics that would ideally include patient outcomes and quality of care, as well as appropriate stakeholder engagement and oversight. To date, there are no standardised methods for AI-based data analysis or interpretation, and no universal approaches to address missing data, which is a fundamental concern in large-scale datasets. A significant number of published studies have investigated large series of patients with extensive benchmarking against expert performance, but, in the vast majority of cases, these studies were retrospective. Further, the performance of these models is likely to decrease when assessed prospectively using “real-world” data.

The establishment of consensus guidelines in reporting data from ML studies is also critical. A group is currently working on the definition of an AI-specific version of the STARD checklist (STARD-AI-Standards for Reporting of Diagnostic Accuracy Study-AI). These guidelines will aim to improve the completeness and transparency of studies investigating diagnostic test accuracy. Other recommendations will be needed for prognostic or theranostic biomarkers. Their performance should finally be compared to existing diagnostic, staging and predictive systems.
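Comparisons between AI-based prognostic models and conventional scores are usually expressed through the concordance index (the C-index or C-statistic quoted for several studies in Table 2). The sketch below is a simplified, illustrative implementation of Harrell's C-index for right-censored data, not the code used in any cited study. The function name and the toy data are assumptions, and pairs with tied follow-up times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times: follow-up time per patient
    events: 1 if the event (e.g. recurrence or death) was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is only comparable if the earlier patient had an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0      # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5      # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical values): four patients, one censored at 15 months.
c_index = concordance_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.6, 0.7, 0.2],
)
print(f"C-index: {c_index:.2f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking, so applying the same function to an AI score and to a conventional staging variable on the same external cohort gives a like-for-like comparison.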




[Figure 3: schematic of a neural network that receives an input image tile, transforms it through its layers and outputs "tumour" with 99% likelihood. Panel labels: (1) Transparency: knowing the structure of the network and the activation status of the neurons. (2) Semantics: meaning of the network components, here nuclei with hyperchromasia (node A) and irregular contours (node B). (3) Explanation: association of nuclei with hyperchromasia (node A) and irregular contours (node B); tile classified as tumour.]

Fig. 3. Explainable artificial intelligence: example of pathology. This virtual model is dedicated to the prediction of the tumour or non-tumour nature of
images from digital slides. The aim of explainable artificial intelligence is to better understand, through transparency, semantics and explanation, how the model
makes its predictions. Transparency (1) consists of having an in-depth knowledge of the structure of the neural network and the activation status of its different
neurons/nodes. Semantics (2) provides insights into the type of objects that result in the activation of particular parts of the network. Finally, explanation (3) enables
clinicians to understand how the association of different features impacts the final prediction.
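The three notions in Fig. 3 can be illustrated with a deliberately tiny toy model. This is a sketch under stated assumptions rather than a real pathology network: a single output unit receives the activations of two hidden nodes, labelled here after node A (hyperchromasia) and node B (irregular contours), and every weight, bias and activation value is hypothetical.

```python
import math

# Hypothetical parameters of the output unit (assumption, for illustration only).
WEIGHTS = {"node_A_hyperchromasia": 2.5, "node_B_irregular_contours": 1.8}
BIAS = -2.0


def explain_tile(activations):
    """Return the tumour probability and the contribution of each node.

    Transparency: weights, bias and activations are all visible.
    Semantics: each node is named after the feature that activates it.
    Explanation: weight * activation shows how much each node pushed the output.
    """
    contributions = {name: WEIGHTS[name] * activations[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))  # logistic output
    return probability, contributions


probability, contributions = explain_tile(
    {"node_A_hyperchromasia": 0.9, "node_B_irregular_contours": 0.8}
)
print(f"P(tumour) = {probability:.2f}")
for name, value in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:+.2f}")
```

Real DL models have far too many layers for this additive reading to hold directly, which is why post hoc methods, such as expert review of the most predictive tiles, are used instead.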

Need for data sharing/open-source algorithms
As the performance of AI models is highly dependent on the amount of data used for training, the availability of large data sets is key to fostering the development of research and its future impact on clinical care. To this end, the deposition and sharing of large datasets should be encouraged. This includes utilisation and sharing of large-scale data from EHRs across and between health systems. Moreover, sharing of individual-participant data (IPD) from clinical trials or purely academic research studies, a clear “ethical and scientific imperative”, has gained increasing traction and is now advocated by many scientists and organisations, and would assist in constructing datasets of sufficient size and detail to appropriately train and validate AI models.77 Moreover, a universal, standardised method for addressing and analysing missing data in AI models is necessary, and this is particularly important when considering shared datasets. The International Committee of Medical Journal Editors has thus implemented a clinical trial data policy that requires an IPD sharing statement for manuscripts reporting clinical trials. Although several repositories are now able to store IPD and make it available to third parties, the rate of sharing remains very low. The main obstacle is likely to be cultural, however other issues remain, such as patients’ anonymity and the residual risk of re-identification, cost of data storage/provision, and need for specific consent regarding sharing. However, the availability of IPD from clinical trials (including imaging and digital slides) testing systemic therapies will be key for the development of AI models able to predict response/survival.

Need for sufficiently diverse populations
To date, cohorts used to develop and train AI models focused on HCC risk prediction, diagnosis and prognostication have lacked sufficient racial, ethnic and socioeconomic diversity. This is a critical issue, given that the accuracy of AI-based algorithms depends upon the validity and size of their input data. Consequently, future studies will need to ensure that promising AI-based tools are thoughtfully validated in diverse cohorts that include racial and ethnic minorities as well as patients across the complete socioeconomic spectrum. This once again underscores the need for data sharing between investigators and across institutions, so that representative cohorts can be constructed.

Examples from other disciplines
Currently, approximately 150 AI-based medical devices have been approved by the FDA. Most of these



[Figure 4: two schematic panels plotting tumour load and drug efficacy against time (of survival) across Therapy 1, Therapy 2 and Therapy 3. (A) Current therapy concept: first-line therapy decision, followed by therapy evasion, radiologic progress and therapy adjustment, repeated until final therapy evasion. (B) Future, AI-supported therapy concept: AI-based first-line therapy decision, followed by AI-based progress prediction and early therapy adjustment, until final therapy evasion.]

Fig. 4. Artificial intelligence could support doctors in decision making in tumour therapy in the future. (A) Current oncologic therapy pattern. After an initial
first-line therapy, the tumour is evading therapy through resistance mechanisms. The following tumour growth is recognised during radiologic follow-up leading
to therapy adjustment. (B) Hypothetical, future, AI-supported therapy pattern. Initial, individualized first-line therapy decision, accounting for an AI-based
recommendation. After an AI algorithm predicts progression of a tumour, doctors decide to adjust therapy before the tumour can develop resistance to ther-
apy and grow again.

models were developed for the fields of radiology (e.g. CT scan image reconstruction or brain MRI interpretation), cardiology (e.g. electrocardiogram analysis, cardiac monitoring) and ophthalmology (detection of diabetic retinopathy). Interestingly the FDA has also very recently granted its first clearance for an AI-based pathology software application. The product analyses digital slides of prostatic biopsies, highlights areas that are most likely to contain cancer and flags them for further review by a pathologist (https://www.paige.ai/). This landmark approval marks the beginning of a new era in the use of AI-assisted diagnostics for pathology, and it is very likely that models aiming to assist HCC histological diagnosis/prognosis assessment will also be available soon. They are particularly needed to assist with the differentiation of benign vs. malignant hepatocellular tumours, and also for a more robust and standardised diagnosis of rare pathological entities, such as combined hepatocellular-cholangiocarcinoma or fibrolamellar carcinoma.

Explaining “the black box” of AI
A common issue for all existing and future AI applications is to make their decisions comprehensible to the user. The term “explainable AI” refers to a particular set of methods that allows users to comprehend how the AI models work and make their decisions. It thus provides feedback on the most important features involved in the predictions and helps to understand the potential biases. This transparency is critical to build up the trust needed to convince doctors to rely on these computer-aided devices they might be using in the future. The approaches most commonly used in DL consist of extremely complex layers of mathematical computation, and it is thus very difficult to gain insights into how the data are transformed throughout the whole network.

Explainable AI is however an active field of research and many aim to open the black boxes of NNs. The main strands of work are making the networks “transparent”, learning the semantics of


its different components and finally generating post hoc explanations. Transparency mainly consists of understanding the model structure and its function. Semantics of the different network components will provide insights on the meaning of particular neurons and the post hoc explanation finally analyses why a result is inferred (Fig. 3).78 For example, post hoc explanations of models processing digital histology slides can be established by getting a human expert to review the image areas associated with the highest predictive value. This type of approach was used in the study by Saillard et al., who built a model able to predict the survival of patients after resection of HCC. Interestingly, reviewing the tumoural tiles associated with a high risk of death showed an enrichment in several features (including macrotrabecular-massive subtype, cellular atypia) previously shown to be predictive of dismal clinical outcome.63 These results show that the models, at least in part, rely on known histological parameters. The authors also identified a new prognostic feature, i.e. the presence of vascular spaces. Together, these results underscore the importance of human/machine interactions and show that novel hypotheses can be generated with this type of approach. Altogether, addressing ‘explainability’ is a critical issue, and will be necessary to: i) gain the required confidence in AI models’ outputs, and ii) exploit NNs to discover key features that may have been overlooked.

Key point
There remains a great need to standardise and robustly evaluate AI algorithms in prospective studies and using large-scale “real-world” datasets, as well as to establish consensus guidelines to ensure accurate and comprehensive reporting of data from ML and DL studies.

Future applications of AI: towards tailored clinical trials
Prospective studies are needed to fully demonstrate the potential of AI to improve the clinical care of patients with HCC. In other medical areas, several AI-based randomised clinical trials have already been conducted. As such, in endoscopy, numerous randomised clinical trials have evaluated the impact of computer-aided systems on physicians’ performance in diagnosing intestinal adenoma or indicating blind spots of colonoscopy.79,80 The need to incorporate these new developments prompted the research community to extend the widely used SPIRIT and CONSORT guidelines for the use of AI methods in 2020.81,82 According to ClinicalTrials.gov (https://clinicaltrials.gov/), there are currently 6 ongoing trials involving AI for the management of HCC. A research group at the University of Hong Kong is comparing an algorithm designed to diagnose HCC from CT images against the standard diagnostic procedure that relies on the LI-RADS criteria (NCT04843176).83 A multicentre study from France is prospectively developing an AI algorithm in a non-randomised clinical trial. The research group uses clinical, biological and ultrasound data to stratify the risk of HCC emergence in high- and low-risk patients.84

Treatment with immune checkpoint inhibitors (ICIs) has represented a fundamental breakthrough in many cancers.85–87 In the palliative treatment of HCC, the IMbrave150 trial showed that the combination of atezolizumab and bevacizumab conferred a significant survival benefit compared to sorafenib.3 However, like in many previous trials in distinct entities, it became apparent that not all patients with HCC benefit from ICIs to a similar extent. While there are signals for HCC subgroups with a potentially higher benefit (e.g. viral hepatitis vs. non-viral liver disease88), there is still no biomarker that reliably predicts therapeutic response before or very early after starting ICI therapy in patients with HCC. Therefore, a significant fraction of patients will be subjected to the (low) risk of severe ICI-related toxicity without benefit, thereby being at an increased risk of tumour progression and worsened liver function, while the cost of ICI therapy is remarkably high. In this setting, AI-based response prediction could play a key role in improving patient outcomes and reducing healthcare expenditure.

Generating, training and applying an algorithm could involve a deep net trained on histologic data, e.g. from randomised clinical trials in immunotherapy, and/or the combination of different deep nets including histology, radiology, genomic and clinical information. Importantly, a DL-based algorithm could either be trained on data available before the start of therapy or on data extracted immediately after the initiation of therapy. Thus, it may, before the first radiological response evaluation, provide early predictions of whether a patient will benefit or should be switched to another therapeutic strategy. Beyond determining the ideal first-line therapy per patient, AI-based decision making could also provide a basis for a fundamental switch in the way that treatment changes are implemented into long-term palliative treatment of oncologic patients. Currently, a successful line of therapy is provided to a patient until radiological progression is evident (Fig. 4). However, it could be beneficial to establish a tool for the early prediction of treatment failure, recommending a switch to another therapy, even before full progression is documented on imaging. This tool could enable preemptive therapy adjustment in the interval between molecular resistance and imaging (Fig. 4). AI could represent the ideal toolbox to facilitate such a concept. Similar to a first-line decision, an algorithm would need to be trained within clinical trials, first proving that radiological progression can be reliably predicted, e.g. on an algorithm trained on radiology, but also on laboratory values and clinical parameters. Once a proof of concept for an AI algorithm is achieved, future clinical trials could compare a possible benefit from



early AI-based regimen switches to a conventional approach based on pure radiological progression within the standard clinical imaging intervals (e.g. 6, 8, or 12 weeks).

While these concepts are still hypothetical, it will be important to integrate AI-based algorithms into current and future clinical trials, in order to prove that they are valuable tools to predict responses to first-line therapy and to predict early progression. Implementing these steps will depend on access to biological samples and clinical data within large clinical trials, and will require acceptance of these concepts and further that these data are made accessible to the clinician scientists who are contributing patients to these trials. To that end, collaborative networks based on trust and united in the collective aim of improving patient outcomes need to be implemented not only between clinicians but also with industry. Nevertheless, it is paramount for any model developed and trained within the framework of a clinical trial to be thoroughly validated in diverse, real-world patient populations before clinical implementation, to address possible biases introduced by the trial’s inclusion criteria. Moreover, AI-based algorithms and any resultant clinical tools must also be constructed with appropriate stakeholder engagement and oversight, to ensure that validated algorithms are standardised according to protocol and that they are used in the correct clinical contexts, and further that data output is interpreted properly to maximise clinical benefit. Correctly interpreting data output from an AI-based clinical tool will in turn require appropriate training and awareness, both amongst the public and clinical providers.

Conclusion
It is hoped that AI will profoundly change the way we care for patients with HCC. Although significant progress has been made during the last decade, improvements in HCC risk prediction, diagnosis and response prediction are still critically needed. Several challenges remain to fully implement such technologies in clinical practice, including the need to develop robust approaches for structured data collection, sharing and storage, and the need to demonstrate the reliability and robustness of models. We know that AI can predict a very large set of clinically relevant features, and we must also now demonstrate that these approaches work in a clinical setting, by comparing model performance to that of conventional staging systems, and further through the careful design of large prospective trials.

Abbreviations
AI, artificial intelligence; CEUS, contrast-enhanced ultrasound; DCNN, deep convolutional neural network; DL, deep learning; EHR, electronic health record; HCC, hepatocellular carcinoma; ICI, immune checkpoint inhibitors; IPD, individual-participant data; ML, machine learning; NAFLD, non-alcoholic fatty liver disease; NN(s), neural network(s); TACE, transarterial chemoembolisation.

Financial support
NIH K23 DK122104 (TGS). Dana-Farber/Harvard Cancer Center GI SPORE Career Enhancement Award (TGS).

Conflict of interest
Dr. Simon has served as a consultant to Aetion and has received grants to the institution from Amgen, for work unrelated to this manuscript. Pr Calderaro serves as a consultant for Keen Eye, Crosscope and Owkin.
Please refer to the accompanying ICMJE disclosure forms for further details.

Authors’ contributions
Concept and literature review: all co-authors. Drafting of manuscript: all co-authors. Critical revision of the manuscript: all co-authors. Guarantor of the manuscript: Simon. All authors contributed to the critical revision of the manuscript for important intellectual content and approved the final version of the manuscript. The corresponding author (TGS) attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.jhep.2022.01.014.

References

[1] Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021;71:209–249.
[2] Baecker A, Liu X, La Vecchia C, Zhang Z-F. Worldwide incidence of hepatocellular carcinoma cases attributable to major risk factors. Eur J Cancer Prev 2018;27:205–212.
[3] Finn RS, Qin S, Ikeda M, Galle PR, Ducreux M, Kim T-Y, et al. Atezolizumab plus bevacizumab in unresectable hepatocellular carcinoma. N Engl J Med 2020;382:1894–1905.
[4] El-Serag HB, Kanwal F. Epidemiology of hepatocellular carcinoma in the United States: where are we? Where do we go? Hepatology 2014;60:1767–1775.
[5] Singal AG, Mukherjee A, Elmunzer BJ, Higgins PDR, Lok AS, Zhu J, et al. Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma. Am J Gastroenterol 2013;108:1723–1730.
[6] Ioannou GN, Tang W, Beste LA, Tincopa MA, Su GL, Van T, et al. Assessment of a deep learning model to predict hepatocellular carcinoma in patients with hepatitis C cirrhosis. JAMA Netw Open 2020;3:e2015626.
[7] Heimbach JK, Kulik LM, Finn RS, Sirlin CB, Abecassis MM, Roberts LR, et al. AASLD guidelines for the treatment of hepatocellular carcinoma. Hepatology 2018;67:358–380.
[8] Marrero JA, Kulik LM, Sirlin CB, Zhu AX, Finn RS, Abecassis MM. Diagnosis, staging, and management of hepatocellular carcinoma: 2018 practice guidance by the American Association for the Study of Liver Diseases. Hepatology 2018;68:723–750.
[9] European Association for the Study of the Liver. EASL clinical practice guidelines: management of hepatocellular carcinoma. J Hepatol 2018;69:182–236.


[10] Yu NC, Chaudhari V, Raman SS, Lassman C, Tong MJ, Busuttil RW, et al. CT and MRI improve detection of hepatocellular carcinoma, compared with ultrasound alone, in patients with cirrhosis. Clin Gastroenterol Hepatol 2011;9:161–167.
[11] Vecchiato F, D’Onofrio M, Malagò R, Martone E, Gallotti A, Faccioli N, et al. Detection of focal liver lesions: from the subjectivity of conventional ultrasound to the objectivity of volume ultrasound. Radiol Med 2009;114:792–801.
[12] Schmauch B, Herent P, Jehanno P, Dehaene O, Saillard C, Aubé C, et al. Diagnosis of focal liver lesions from ultrasound using deep learning. Diagn Interv Imaging 2019;100:227–233.
[13] Yang Q, Wei J, Hao X, Kong D, Yu X, Jiang T, et al. Improving B-mode ultrasound diagnostic performance for focal liver lesions using deep learning: a multicentre study. EBioMedicine 2020;56:102777.
[14] Guo L-H, Wang D, Qian Y-Y, Zheng X, Zhao C-K, Li X-L, et al. A two-stage multi-view learning framework based computer-aided diagnosis of liver tumors with contrast enhanced ultrasound images. Clin Hemorheol Microcirc 2018;69:343–354.
[15] Ta CN, Kono Y, Eghtedari M, Oh YT, Robbin ML, Barr RG, et al. Focal liver lesions: computer-aided diagnosis by using contrast-enhanced US cine recordings. Radiology 2018;286:1062–1071.
[16] Preis O, Blake MA, Scott JA. Neural network evaluation of PET scans of the liver: a potentially useful adjunct in clinical interpretation. Radiology 2011;258:714–721.
[17] Mokrane F-Z, Lu L, Vavasseur A, Otal P, Peron J-M, Luk L, et al. Radiomics machine-learning signature for diagnosis of hepatocellular carcinoma in cirrhotic patients with indeterminate liver nodules. Eur Radiol 2020;30:558–570.
[18] Yasaka K, Akai H, Abe O, Kiryu S. Deep learning with convolutional neural network for differentiation of liver masses at dynamic contrast-enhanced CT: a preliminary study. Radiology 2018;286:887–896.
[19] Shi W, Kuang S, Cao S, Hu B, Xie S, Chen S, et al. Deep learning assisted differentiation of hepatocellular carcinoma from focal liver lesions: choice of four-phase and three-phase CT imaging protocol. Abdom Radiol (NY) 2020;45:2688–2697.
[20] Christ P, Ettlinger F, Grün F, Lipkova J, Kaissis G. Lits - liver tumor segmentation challenge n.d. http://www.lits-challenge.com (accessed December 12, 2021).
[21] Chlebus G, Schenk A, Moltz JH, van Ginneken B, Hahn HK, Meine H. Automatic liver tumor segmentation in CT with fully convolutional neural networks and object-based postprocessing. Sci Rep 2018;8:15497.
[22] Jansen MJA, Kuijf HJ, Veldhuis WB, Wessels FJ, Viergever MA, Pluim JPW. Automatic classification of focal liver lesions based on MRI and risk factors. PLoS One 2019;14:e0217053.
[23] Hamm CA, Wang CJ, Savic LJ, Ferrante M, Schobert I, Schlachter T, et al. Deep learning for liver tumor diagnosis part I: development of a convolutional neural network classifier for multi-phasic MRI. Eur Radiol 2019;29:3338–3347.
[24] Zhang F, Yang J, Nezami N, Laage-Gaupp F, Chapiro J, De Lin M, et al. Liver tissue classification using an auto-context-based deep neural network with a multi-phase training framework. Patch Based Tech Med Imaging 2018;11075:59–66.
[25] Zhen S-H, Cheng M, Tao Y-B, Wang Y-F, Juengpanich S, Jiang Z-Y, et al. Deep learning for accurate diagnosis of liver tumor based on magnetic resonance imaging and clinical data. Front Oncol 2020;10:680.
[32] Wang H, Jiang Y, Li B, Cui Y, Li D, Li R. Single-cell spatial analysis of tumor and immune microenvironment on whole-slide image reveals hepatocellular carcinoma subtypes. Cancers (Basel) 2020;12:3562.
[33] Chen M, Zhang B, Topatana W, Cao J, Zhu H, Juengpanich S, et al. Classification and mutation prediction based on histopathology H&E images in liver cancer using deep learning. NPJ Precis Oncol 2020;4:14.
[34] Kather JN, Pearson AT, Halama N, Jäger D, Krause J, Loosen SH, et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat Med 2019;25:1054–1056.
[35] Kather JN, Heij LR, Grabsch HI, Loeffler C, Echle A, Muti HS, et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat Cancer 2020;1:789–799. https://doi.org/10.1038/s43018-020-0087-6.
[36] Fu Y, Jung AW, Torne RV, Gonzalez S, Vöhringer H, Shmatko A, et al. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat Cancer 2020;1:800–810.
[37] Sangro B, Melero I, Wadhawan S, Finn RS, Abou-Alfa GK, Cheng A-L, et al. Association of inflammatory biomarkers with clinical outcomes in nivolumab-treated patients with advanced hepatocellular carcinoma. J Hepatol 2020;73:1460–1469.
[38] Haber PK, Torres-Martin M, Dufour J-F, Verslype C, Marquardt J, Galle PR, et al. Molecular markers of response to anti-PD1 therapy in advanced hepatocellular carcinoma. J Clin Oncol 2021;39:4100–4100.
[39] Johannet P, Coudray N, Donnelly DM, Jour G, Illa-Bochaca I, Xia Y, et al. Using machine learning algorithms to predict immunotherapy response in patients with advanced melanoma. Clin Cancer Res 2021;27:131–140.
[40] Patel SK, George B, Rai V. Artificial intelligence to decode cancer mechanism: beyond patient stratification for precision oncology. Front Pharmacol 2020. https://doi.org/10.3389/fphar.2020.01177.
[41] Liu S, Yang Z, Li G, Li C, Luo Y, Gong Q, et al. Multi-omics analysis of primary cell culture models reveals genetic and epigenetic basis of intratumoral phenotypic diversity. Genomics Proteomics Bioinformatics 2019;17:576–589.
[42] Zeng WZD, Glicksberg BS, Li Y, Chen B. Selecting precise reference normal tissue samples for cancer research using a deep learning approach. BMC Med Genomics 2019;12:21.
[43] Chaudhary K, Poirion OB, Lu L, Garmire LX. Deep learning-based multi-omics integration robustly predicts survival in liver cancer. Clin Cancer Res 2018;24:1248–1259.
[44] Chaudhary K, Poirion OB, Lu L, Huang S, Ching T, Garmire LX. Multimodal meta-analysis of 1,494 hepatocellular carcinoma samples reveals significant impact of consensus driver genes on phenotypes. Clin Cancer Res 2019;25:463–472.
[45] Hwang B, Lee JH, Bang D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp Mol Med 2018;50:1–14.
[46] Xiong X, Kuang H, Ansari S, Liu T, Gong J, Wang S, et al. Landscape of intercellular crosstalk in healthy and NASH liver revealed by single-cell secretome gene analysis. Mol Cell 2019;75:644–660.e5.
[47] Ramachandran P, Dobie R, Wilson-Kanamori JR, Dora EF, Henderson BEP, Luu NT, et al. Resolving the fibrotic niche of human liver cirrhosis at single-cell level. Nature 2019;575:512–518.
[48] Aizarani N, Saviano A, Sagar, Mailly L, Durand S, Herman JS, et al. A human liver cell atlas reveals heterogeneity and epithelial progenitors. Nature 2019;572:199–204.
[49] Zheng C, Zheng L, Yoo J-K, Guo H, Zhang Y, Guo X, et al. Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. Cell
[26] Wang CJ, Hamm CA, Savic LJ, Ferrante M, Schobert I, Schlachter T, et al. 2017;169:1342–1356.e16.
Deep learning for liver tumor diagnosis part II: convolutional neural [50] Zhang Q, He Y, Luo N, Patel SJ, Han Y, Gao R, et al. Landscape and dynamics of
network interpretation using radiologic imaging features. Eur Radiol single immune cells in hepatocellular carcinoma. Cell 2019;179:829–845.e20.
2019;29:3348–3357. [51] Kim JK, Kolodziejczyk AA, Ilicic T, Teichmann SA, Marioni JC. Character-
[27] Liao H, Long Y, Han R, Wang W, Xu L, Liao M, et al. Deep learning-based izing noise structure in single-cell RNA-seq distinguishes genuine from
classification and mutation prediction from histopathological images of technical stochastic allelic expression. Nat Commun 2015;6:8687.
hepatocellular carcinoma. Clin Transl Med 2020;10:e102. [52] Jia C, Hu Y, Kelly D, Kim J, Li M, Zhang NR. Accounting for technical noise
[28] Kiani A, Uyumazturk B, Rajpurkar P, Wang A, Gao R, Jones E, et al. Impact in differential expression analysis of single-cell RNA sequencing data.
of a deep learning assistant on the histopathologic classification of liver Nucleic Acids Res 2017;45:10978–10988.
cancer. Npj Digital Med 2020;3. https://doi.org/10.1038/s41746-020- [53] Papalexi E, Satija R. Single-cell RNA sequencing to explore immune cell
0232-8. heterogeneity. Nat Rev Immunol 2018;18:35–45.
[29] Calderaro J, Couchy G, Imbeaud S, Amaddeo G, Letouzé E, Blanc J-F, et al. [54] Kharchenko PV, Silberstein L, Scadden DT. Bayesian approach to single-
Histological subtypes of hepatocellular carcinoma are related to gene mu- cell differential expression analysis. Nat Methods 2014;11:740–742.
tations and molecular tumour classification. J Hepatol 2017;67:727–738. [55] Arisdakessian C, Poirion O, Yunits B, Zhu X, Garmire LX. DeepImpute: an
[30] Calderaro J, Ziol M, Paradis V, Zucman-Rossi J. Molecular and histological accurate, fast, and scalable deep neural network method to impute single-
correlations in liver cancer. J Hepatol 2019;71:616–630. cell RNA-seq data. Genome Biol 2019;20:211.
[31] Ziol M, Poté N, Amaddeo G, Laurent A, Nault J-C, Oberti F, et al. Macro- [56] Amodio M, van Dijk D, Srinivasan K, Chen WS, Mohsen H, Moon KR, et al.
trabecular-massive hepatocellular carcinoma: a distinctive histological Exploring single-cell data with deep multitasking neural networks. Nat
subtype with clinical relevance. Hepatology 2018;68:103–112. Methods 2019;16:1139–1145.

1360 Journal of Hepatology 2022 vol. 76 | 1348–1361


[57] Marouf M, Machart P, Bansal V, Kilian C, Magruder DS, Krebs CF, et al. Realistic in silico generation and augmentation of single-cell RNA-seq data using generative adversarial networks. Nat Commun 2020;11:166.
[58] Eraslan G, Simon LM, Mircea M, Mueller NS, Theis FJ. Single-cell RNA-seq denoising using a deep count autoencoder. Nat Commun 2019;10:390.
[59] Genshaft AS, Li S, Gallant CJ, Darmanis S, Prakadan SM, Ziegler CGK, et al. Multiplexed, targeted profiling of single-cell proteomes and transcriptomes in a single reaction. Genome Biol 2016;17. https://doi.org/10.1186/s13059-016-1045-6.
[60] Dey SS, Kester L, Spanjaard B, Bienko M, van Oudenaarden A. Integrated genome and transcriptome sequencing of the same cell. Nat Biotechnol 2015;33:285–289.
[61] Macaulay IC, Haerty W, Kumar P, Li YI, Hu TX, Teng MJ, et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 2015;12:519–522.
[62] Nam JY, Lee J-H, Bae J, Chang Y, Cho Y, Sinn DH, et al. Novel model to predict HCC recurrence after liver transplantation obtained using deep learning: a multicenter study. Cancers 2020;12:2791. https://doi.org/10.3390/cancers12102791.
[63] Saillard C, Schmauch B, Laifa O, Moarii M, Toldo S, Zaslavskiy M, et al. Predicting survival after hepatocellular carcinoma resection using deep learning on histological slides. Hepatology 2020;72:2000–2013.
[64] Yamashita R, Long J, Saleem A, Rubin DL, Shen J. Deep learning predicts postsurgical recurrence of hepatocellular carcinoma from digital histopathologic images. Sci Rep 2021;11:2047.
[65] Lu L, Daigle Jr BJ. Prognostic analysis of histopathological images using pre-trained convolutional neural networks: application to hepatocellular carcinoma. PeerJ 2020;8:e8668.
[66] Saito A, Toyoda H, Kobayashi M, Koiwa Y, Fujii H, Fujita K, et al. Prediction of early recurrence of hepatocellular carcinoma after resection using digital pathology images assessed by machine learning. Mod Pathol 2021;34:417–425.
[67] Ji G-W, Zhu F-P, Xu Q, Wang K, Wu M-Y, Tang W-W, et al. Machine-learning analysis of contrast-enhanced CT radiomics predicts recurrence of hepatocellular carcinoma after resection: a multi-institutional study. EBioMedicine 2019;50:156–165.
[68] Song D, Wang Y, Wang W, Wang Y, Cai J, Zhu K, et al. Using deep learning to predict microvascular invasion in hepatocellular carcinoma based on dynamic contrast-enhanced MRI combined with clinical parameters. J Cancer Res Clin Oncol 2021. https://doi.org/10.1007/s00432-021-03617-3.
[69] Zhang Y, Lv X, Qiu J, Zhang B, Zhang L, Fang J, et al. Deep learning with 3D convolutional neural network for noninvasive prediction of microvascular invasion in hepatocellular carcinoma. J Magn Reson Imaging 2021;54:134–143.
[70] Jiang Y-Q, Cao S-E, Cao S, Chen J-N, Wang G-Y, Shi W-Q, et al. Preoperative identification of microvascular invasion in hepatocellular carcinoma by XGBoost and deep learning. J Cancer Res Clin Oncol 2021;147:821–833.
[71] Wang W, Chen Q, Iwamoto Y, Han X, Zhang Q, Hu H, et al. Deep learning-based radiomics models for early recurrence prediction of hepatocellular carcinoma with multi-phase CT images and clinical data. Conf Proc IEEE Eng Med Biol Soc 2019;2019:4881–4884.
[72] Abajian A, Murali N, Savic LJ, Laage-Gaupp FM, Nezami N, Duncan JS, et al. Predicting treatment response to intra-arterial therapies for hepatocellular carcinoma with the use of supervised machine learning-an artificial intelligence concept. J Vasc Interv Radiol 2018;29:850–857.e1.
[73] Liu Q-P, Xu X, Zhu F-P, Zhang Y-D, Liu X-S. Prediction of prognostic risk factors in hepatocellular carcinoma with transarterial chemoembolization using multi-modal multi-task deep learning. EClinicalMedicine 2020;23:100379.
[74] Zhang L, Xia W, Yan Z-P, Sun J-H, Zhong B-Y, Hou Z-H, et al. Deep learning predicts overall survival of patients with unresectable hepatocellular carcinoma treated by transarterial chemoembolization plus sorafenib. Front Oncol 2020;10:593292.
[75] Peng J, Kang S, Ning Z, Deng H, Shen J, Xu Y, et al. Residual convolutional neural network for predicting response of transarterial chemoembolization in hepatocellular carcinoma from CT imaging. Eur Radiol 2020;30:413–424.
[76] Oezdemir I, Wessner CE, Shaw C, Eisenbrey JR, Hoyt K. Tumor vascular networks depicted in contrast-enhanced ultrasound images as a predictor for transarterial chemoembolization treatment response. Ultrasound Med Biol 2020;46:2276–2286.
[77] Bauchner H, Golub RM, Fontanarosa PB. Data sharing: an ethical and scientific imperative. JAMA 2016;315:1237–1239.
[78] Xu F, Uszkoreit H, Du Y, Fan W, Zhao D, Zhu J. Explainable AI: a brief survey on history, research areas, approaches and challenges. Natural language processing and Chinese computing. Springer International Publishing; 2019. p. 563–574.
[79] Wang P, Berzin TM, Glissen Brown JR, Bharadwaj S, Becq A, Xiao X, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut 2019;68:1813–1819.
[80] Wu L, Zhang J, Zhou W, An P, Shen L, Liu J, et al. Randomised controlled trial of WISENSE, a real-time quality improving system for monitoring blind spots during esophagogastroduodenoscopy. Gut 2019;68:2161–2169.
[81] Cruz Rivera S, Liu X, Chan A-W, Denniston AK, Calvert MJ, et al., SPIRIT-AI and CONSORT-AI Working Group. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med 2020;26:1351–1363.
[82] Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK, SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med 2020;26:1364–1374.
[83] (md) CGB. Identifier NCT04843176. A prototype Artificial intelligence algorithm versus liver imaging reporting and data system (LI-RADS) criteria in diagnosing hepatocellular carcinoma on computed tomography: a randomized trial. National Library of Medicine (US); 2021.
[84] Gov C. NCT04802954. Risk stratification of hepatocarcinogenesis using a deep learning based clinical, biological and ultrasound model in high-risk patients (STARHE). 2021.
[85] Hodi FS, O’Day SJ, McDermott DF, Weber RW, Sosman JA, Haanen JB, et al. Improved survival with ipilimumab in patients with metastatic melanoma. N Engl J Med 2010;363:711–723.
[86] Borghaei H, Paz-Ares L, Horn L, Spigel DR, Steins M, Ready NE, et al. Nivolumab versus docetaxel in advanced nonsquamous non–small-cell lung cancer. N Engl J Med 2015;373:1627–1639.
[87] Garon EB, Rizvi NA, Hui R, Leighl N, Balmanoukian AS, Eder JP, et al. Pembrolizumab for the treatment of non–small-cell lung cancer. N Engl J Med 2015;372:2018–2028.
[88] Li X, Ramadori P, Pfister D, Seehawer M, Zender L, Heikenwalder M. The immunological and metabolic landscape in primary and metastatic liver cancer. Nat Rev Cancer 2021;21:541–557.

