Journal of Biomedical Informatics 42 (2009) 760–772

Contents lists available at ScienceDirect

Journal of Biomedical Informatics


journal homepage: www.elsevier.com/locate/yjbin

Methodological Review

What can natural language processing do for clinical decision support?


Dina Demner-Fushman a,*, Wendy W. Chapman b, Clement J. McDonald a
a U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
b Department of Biomedical Informatics, University of Pittsburgh, 200 Meyran Avenue, M-183 VALE, Pittsburgh, PA 15260, USA

Article info

Article history:
Received 21 November 2008
Available online 13 August 2009

Keywords:
Natural language processing
Decision support techniques
Clinical decision support systems
Review

Abstract

Computerized clinical decision support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural language processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed.

Published by Elsevier Inc.

* Corresponding author. Address: Communications Engineering Branch, Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, Bldg. 38A, Room 10S-1020, 8600 Rockville Pike MSC-3825, Bethesda, MD 20894, USA. Fax: +1 301 402 0341.
E-mail address: [email protected] (D. Demner-Fushman).

1532-0464/$ - see front matter Published by Elsevier Inc.
doi:10.1016/j.jbi.2009.08.007

1. Introduction

The goal of clinical decision support (CDS) is to “help health professionals make clinical decisions, deal with medical data about patients or with the knowledge of medicine necessary to interpret such data” [1]. Clinical decision support systems are defined as “any software designed to directly aid in clinical decision making in which characteristics of individual patients are matched to a computerized knowledge base for the purpose of generating patient-specific assessments or recommendations that are then presented to clinicians for consideration” [2]. A CDS system structure could be envisioned as a neural reflex arc: its receptors reside in, and are activated by, patient data; its integration center contains decision rules and the knowledge base; and its effectors are the patient-specific assessments and recommendations.

Patient data can be manually entered into a CDS system by clinicians seeking support, but then they only get support when they recognize the need and have time to find and enter the requisite data. Support is thus much more effective when the computerized system has access to electronic health record (EHR) data that can trigger reminders or alerts automatically as situations arise that require physician action.¹ The EHR will often carry some clinical information, for example, laboratory results, pharmacy orders, and discharge diagnoses, in a structured and coded form. Today, a major portion of the patient's clinical observations, including radiology reports, operative notes, and discharge summaries, is recorded as narrative text (dictated and transcribed, or directly entered into the system by care providers). And in some systems even laboratory and medication records are only available as part of the physician's notes. Moreover, in some cases the facts that should activate a CDS system can be found only in the free text. For example, the CDC technical instructions for tuberculosis screening and treatment² define a complete screening medical examination for tuberculosis as consisting “of a medical history, physical examination, chest radiography (when required), determination of immune response to Mycobacterium tuberculosis antigens (i.e., tuberculin skin testing, when required), and laboratory testing for human immunodeficiency virus infection . . . and M. tuberculosis (when required.)” Notably, medical history, physical examination, and chest radiography results are routinely obtained in free-text form. Indications for further tuberculosis screening could be identified in these clinical notes using NLP methods [3] at no additional cost.

¹ According to the Personalized Health Care (PHC) Initiative team at the Department of Health and Human Services, data collection supported by CDS tools is either active or passive. http://www.hhs.gov/healthit/ahic/materials/09_07/phc/background.pdf.
² http://www.cdc.gov/ncidod/dq/pdf/ti_tb_8_9_2007.pdf.

In principle, natural language processing could extract the facts needed to actuate many kinds of decision rules. In theory, NLP systems might also be able to represent clinical knowledge and CDS interventions in standardized formats [4–6]. That is, NLP could potentially enrich all three major components of CDS systems. The goal of this review is to present the current state of clinical NLP, its contributions to CDS, and the development of biomedical NLP
towards processing of clinical narrative and information resources required for further and more involved participation in the CDS process.

Since their introduction about 40 years ago, CDS systems have improved practitioner performance in approximately 60% of the reviewed cases [7]. Several features are correlated with decision support systems' ability to improve patient care: automatically and proactively providing decision support as part of clinician workflow; providing recommendations rather than assessments; and providing decision support at the time and location of decision making [8]. If CDS systems were to depend upon NLP, it would require reliable, high-quality NLP performance and modular, flexible, and fast systems. Some NLP applications have been integrated in both active and passive CDS. Active NLP CDS applications leverage existing information and push patient-specific information to users. Passive NLP CDS applications require input by the user to generate output.

Active NLP CDS includes alerting, monitoring, coding, and reminding. Active NLP CDS is considered quintessential by most interested parties. Passive NLP CDS has focused on providing knowledge and finding patient populations. This review presents the view of NLP CDS which includes active and passive support, and leaves out the related burgeoning areas of tutoring [9] and clinical text mining [10]. Both active and passive NLP CDS systems process a variety of textual sources, such as clinical records, biomedical literature, web pages, and suicide notes. A variety of NLP CDS systems have been targeted to clinicians, but other users are researchers, patients, administrators, students, and coders. Although the application types targeted to clinicians represent a diverse set of active tools, there appears to be a recent trend toward a higher volume of passive tools being targeted to clinicians and researchers. Fig. 1 demonstrates the types of active and passive CDS applications to which NLP tools have contributed and have potential to contribute in the future.

Fig. 1. NLP–CDS applications.

The scope and workflow of an idealized NLP system capable of supporting various clinical decisions and text types are discussed in Section 3. The rest of the review is organized as follows: Section 2 presents a short overview of CDS systems. Section 4 provides an overview of the fundamental NLP methods applied to clinical text. Section 5 describes integrated and end-to-end medical natural language processing systems. Section 6 presents methods and systems for generation of bottom-line advice to clinicians and the public, which are likely to be integrated with CDS systems in the near future. Section 7 presents research on direct applications of NLP in diagnosis, treatment, and other healthcare decisions. The review concludes with potential future directions in natural language processing for clinical decision support.

2. Current state of clinical decision support systems

Clinical decision support systems can be described along several axes [11,12]:

- access (integrated with an EHR or stand-alone systems)
- setting (used in ambulatory care or inpatient setting)
- task (targeting a specific clinical or administrative task such as diagnosis, immunization, or quality control)
- scope (general or targeting a specialty)
- timing (before, during, or after the clinical decision is made)
- output (active, for example, reminders, or passive)
- implementation (knowledge-based, statistical)

A thorough review of the systems is beyond the scope of this paper, but further information on CDS and CDS systems can be found in [1,2,4,5,11,12].

Much of the data that could support CDS is textual and therefore cannot be leveraged by a CDS system without natural language processing. For example, Aronsky and colleagues studied the usefulness of NLP for a CDS system that identified community-acquired pneumonia in emergency department patients and showed that performance was significantly better with the NLP output [13]. The following examples of integrated CDS systems demonstrate the possibilities for NLP in the context of CDS.

2.1. An outpatient reminder system

In the early 1970s, the Regenstrief Institute introduced prospective, protocol-driven reminders in the outpatient clinics of Wishard Memorial Hospital. The reminder system searched patients' charts for conditions specified initially in about 300 rules in two categories: ordering specific tests after starting certain drug therapy;
and changing drug therapy in response to abnormal test results [11]. For example, if a patient's serum aminophylline level had not been measured within a certain time after starting aminophylline therapy, the system generated a reminder for the responsible physician suggesting that the test be ordered. The messages suggesting specific actions and explaining the rationale with references to research articles were printed on paper before each patient encounter and attached to the front of the patient's chart [11,15,16].

In a summative 2-year evaluation of the expanded system, the clinicians who received reminders undertook the expected actions at a significantly higher rate (44–49% vs. 29%) than those who did not receive the reminders. Interestingly, the study participants never requested articles referenced in the reminders because of time pressures or because they knew the evidence that justified the reminder [11,15]. Reduction of time required for analysis of evidence presented in research articles is the goal of text summarization and question answering methods discussed in Section 6. Further details on over a quarter century of CDS experience at the Regenstrief Institute can be found in [11,15,16].

2.2. Inpatient reminder and diagnostic decision support systems

The HELP (Health Evaluation through Logical Processing) Hospital Information System installed in the Intermountain Healthcare hospitals provides examples of CDS in the inpatient setting. In addition to alert, critique, and suggestion systems similar to the Regenstrief reminder system, the HELP system provides diagnostic decision support. For example, a rule-based subsystem helps diagnose adverse drug events (ADE) through identifying patients with specific chemistry test results, drug level tests, orders for drugs that are commonly used to treat ADEs, and a program in which providers choose symptoms that may be caused by ADEs (e.g., rash, change in heart rate, respiratory rate, mental status, etc.) [17]. Manual entry of the symptoms that may be caused by ADEs could be facilitated or replaced by an NLP tool for extraction of symptoms from free text (for example, from patients' progress notes), particularly because several NLP systems (discussed in Section 5) are already in use at Intermountain Healthcare.

The Antibiotic Assistant [18], initially developed and implemented at LDS Hospital, identifies patients with potential nosocomial infections, alerts physicians to the possible need for anti-infective therapy, and suggests dose and regimen for individual patients. Evans et al. have shown that use of the Antibiotic Assistant decreased length of stay and morbidity and significantly reduced adverse drug events and cost [18]. Recently, the Antibiotic Assistant has been deployed in multiple IHC hospitals and a commercialized version is being marketed across the U.S.³ A key element in the diagnostic reasoning for the Antibiotic Assistant is whether there is radiographic evidence of bacterial pneumonia. Fiszman et al. [19] showed that application of an NLP system to identification of pneumonia performed better than the simple keyword-based method implemented in the Antibiotic Assistant, but integrating the NLP system in a production-level system proved too complex, and the Antibiotic Assistant still uses the internal keyword-based algorithm. If the NLP system were deployed, processing clinical reports in real time and storing the NLP annotations, as is the case at Columbia University (see Section 5.1), making use of the NLP system's output would be more straightforward for existing CDSs.

³ www.theradoc.com.

2.3. Decision support centered on CPOE

Many computerized provider order entry (CPOE) systems use controlled vocabularies to avoid unstructured narrative that can “result in confusion among the lab technicians and pharmacists who receive completed orders” [20]. This, however, does not preclude NLP contributions to CDS systems. A passive CPOE-centered NLP CDS system would have the advantage of receiving structured input in the form of patient-specific variables plugged into a text template. It would still need to formally represent knowledge in textual resources and find matching representations. For example, an NLP CDS system could contribute to educational information provided by the Vanderbilt University Medical Center HEO (Horizon Expert Orders, McKesson®), formerly the WizOrder system, which (in addition to recommendations about patient safety and quality of care issues traditionally provided by CPOE systems [16,17]) provides summaries of disease-specific national guidelines [21] by matching formal representations of guidelines relevant to a specific order. An NLP system could also contribute by processing free text collected by the HEO system, which ranges from comments to structured fields such as orderables for baby aspirin to completely free-text nursing orders (personal communication with Dr. Dominik Aronsky and Dr. Russ Waitman). For example, if the system can process a skin-care related order “turn every three hours”, it can then search the EHR for the last documented turn and issue a reminder if the elapsed time exceeds the ordered time.

Centering CDS on CPOE allows decision support at various stages of order entry (initiation, patient selection, order selection, order construction, and order completion) [21]. For example, decision support at Partners Healthcare originated at the Brigham and Women's Hospital as a set of passive tools focused on referential knowledge, anticipated order sets, guidelines, and feedback, and evolved into an active system integrated with the workflow. For example, if a physician prescribes a drug that lowers potassium, the system displays the patient's potassium lab values [22].

CPOE systems provide a mechanism for complex, interactive decision support based on protocols and guidelines [21,23]. Only decidable guidelines shown to be useful in clinical trials and tested against patients' cases should be provided for CDS in the form of preprogrammed suggestions [23]. Although CPOE systems often collect patient data in structured form, NLP systems could enrich CDS through linking data collected through CPOE with additional information contained in the free-text fields of the EHR and providing assistance in guideline generation, monitoring programmed guidelines for applicability, and generating updates.

3. NLP for CDS: scope and models

Fig. 2. Axes of NLP–CDS relations in clinical NLP models.

The existing NLP and CDS systems provide a solid foundation for the generalized models presented in this section. The models of NLP–CDS systems range from specialized systems dedicated to a specific task, to a set of NLP modules run by a CDS system, to stand-alone systems/services that take clinical text as input and generate output to be used in a CDS system. The implementation and expansion or retargeting of these models differ along the axes shown in Fig. 2. The three axes represent relationship types between the NLP and CDS modules: whether the NLP system (1) is integrated within a CDS or coupled with it to various degrees of tightness, (2) is governed by a CDS or implements knowledge and logic necessary to support decisions, or (3) has been developed
for a specific task or as a generalized tool that could be customized for different tasks.

Particular combinations of system features along the three axes will result in specific NLP–CDS models. Some of those are described in the next section and illustrated in Figs. 3 and 4.

Fig. 3. A coupled task-specific NLP system governed by the CDS module.

Fig. 4. An integrated self-governed multi-task NLP–CDS system.

3.1. NLP models

A coupled or integrated NLP system that performs one specific task can, for example, determine whether a chest radiograph report shows evidence of pneumonia or assign a pre-defined subset of ICD-9-CM codes to radiology reports for billing purposes [24]. Such a system might be activated when a new radiology report is submitted to an EHR. The coupled NLP system will be invoked by the EHR (see Fig. 3), whereas the integrated system will monitor the incoming reports and start the task as needed (see Fig. 4). The NLP system might be self-contained and resort to searching phrases and regular expressions associated with each code or use some of the basic tools described in Section 4. The ICD-9-CM codes obtained by the system and submitted to an EHR could be used to assist human coders while assigning codes or to enable quality control after code assignment. For example, a sophisticated NLP engine is used in a successful, widely deployed, commercial computer-assisted coding solution, which employs human review for quality assurance of the system output and in cases of low confidence in automatic coding.⁴

An NLP system governed by a CDS system comprises a suite of modules that can be selected from and aligned into a pipeline customizable for a variety of tasks. In this model, the CDS system drives and monitors the tightly integrated set of NLP modules and ensures application-specific workflow. Software systems such as the National Library of Medicine's Unified Medical Language System (UMLS), developed to facilitate clinical data processing and linking to biomedical knowledge [25], General Architecture for Text Engineering (GATE),⁵ systems based on the Unstructured Information Management Architecture (UIMA)⁶ and provided by the Open Health Natural Language Processing (OHNLP) Consortium,⁷ and LingPipe⁸ can be used in this type of customizable NLP–CDS system coupling.

A specialized NLP system is provided with information about the tasks and takes over the management of the process. The NLP system could be loosely coupled with a CDS system. For example, it may get a signal and text for performing a certain task from a CDS system, perform the task independently, and deliver the pre-specified output to the CDS system, which then incorporates the results into an EHR. Or the NLP system could be used as a module integrated directly in an EHR. Such a system could also use readily available software systems (UMLS, UIMA, GATE, etc.) for basic NLP tasks. A schematic representation of such a system, which seems to be the current model partially implemented at the leading clinical centers, is shown in Fig. 4.

The idealized system in Fig. 4 will have a module that monitors an EHR for insertion of new data into specific fields. When a radiology report of a patient admitted to a hospital after a pedestrian accident, for example, is entered into the EHR, the NLP system could activate the basic processing pipeline. Processing the following impression section: “Right lower lung opacity, which could be contusion or pneumonia,” the system will extract information about potential pneumonia or pulmonary contusion. The system will look up decision rules for suspected pneumonia that might, for example, contain instructions to retrieve the structured results of blood tests and evaluate the white blood cell count. If the count were high, the reminder message generated by the system would say the patient is more likely to have pneumonia than pulmonary contusion. The system could use the results of the text analysis to solicit more information (for example, find evidence for best approaches to management of both disorders) and present succinct summaries of the information. At this point, the NLP system can hand off the reminder text and summaries to the CDS system, or issue the reminder directly and insert the summaries into designated EHR fields.

The above idealized systems would have to deal with unique challenges faced by existing NLP systems that have to process text and document types ranging from informal notes typed into a patient's record by various caregivers to highly structured peer-reviewed publications in scientific journals. It is not clear whether an NLP system can be designed to handle all text types, applying common modules to both clinical notes and literature, or whether specialized clinical text processing systems should communicate with specialized biomedical literature processing systems using common representation and messaging.
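The idealized workflow described in this section for the system in Fig. 4 (a new impression text arrives, findings are extracted, a decision rule is looked up, a structured laboratory value is checked, and a reminder is produced) can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the design of any system reviewed here: the regular-expression patterns stand in for a real NLP pipeline, the names `Patient`, `extract_findings`, `pneumonia_rule`, and `on_new_report` are hypothetical, and the white blood cell threshold is an illustrative number rather than clinical guidance.

```python
import re
from dataclasses import dataclass
from typing import Optional, Set

# Hypothetical finding patterns. A real system would rely on the NLP building
# blocks of Section 4 (and would handle negation and uncertainty), not on
# bare regular expressions.
FINDING_PATTERNS = {
    "pneumonia": re.compile(r"\bpneumonia\b", re.IGNORECASE),
    "pulmonary contusion": re.compile(r"\bcontusion\b", re.IGNORECASE),
}

# Illustrative threshold in 10^9 cells/L; an assumption, not clinical guidance.
WBC_HIGH = 11.0


@dataclass
class Patient:
    patient_id: str
    wbc: Optional[float]  # latest structured white blood cell count, if any


def extract_findings(impression: str) -> Set[str]:
    """Return the names of all findings whose pattern occurs in the text."""
    return {name for name, pattern in FINDING_PATTERNS.items()
            if pattern.search(impression)}


def pneumonia_rule(patient: Patient, findings: Set[str]) -> Optional[str]:
    """Hypothetical decision rule for suspected pneumonia."""
    if "pneumonia" not in findings:
        return None  # rule does not apply
    if patient.wbc is None:
        return "Suspected pneumonia: no white blood cell count on file."
    if patient.wbc > WBC_HIGH and "pulmonary contusion" in findings:
        return "Elevated WBC: pneumonia more likely than pulmonary contusion."
    if patient.wbc > WBC_HIGH:
        return "Elevated WBC supports suspected pneumonia."
    return None  # count not elevated: no reminder in this sketch


def on_new_report(patient: Patient, impression: str) -> Optional[str]:
    """Entry point invoked when a new report is inserted into the EHR."""
    return pneumonia_rule(patient, extract_findings(impression))


if __name__ == "__main__":
    text = "Right lower lung opacity, which could be contusion or pneumonia."
    print(on_new_report(Patient("p1", wbc=15.2), text))
```

Keeping extraction (`extract_findings`) separate from the rule (`pneumonia_rule`) mirrors the hand-off between the NLP and CDS components discussed above; the sketch deliberately ignores negation, so a phrase such as "no evidence of pneumonia" would still match.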
⁴ http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_032028.html.
⁵ http://gate.ac.uk/.
⁶ http://incubator.apache.org/uima/index.html.
⁷ https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/OHNLP_Documentation_and_Downloads.
⁸ http://alias-i.com/lingpipe/.

3.2. Text and document types encountered in CDS

Successful processing of clinical narrative is the key to overall success of any NLP–CDS system. This type of text is particularly
challenging, because clinical notes are often entered by healthcare providers who have limited time and therefore frequently use domain-specific abbreviations and do not check spelling. Development of NLP processors of clinical text requires access to large volumes of such text, but privacy considerations present barriers to such access. Privacy issues have stimulated research in de-identification and anonymization of clinical records [26,27], in order to reduce the privacy constraints and generate the corpora needed to advance the field. A modest number of de-identified clinical narratives has been made available to the community [24]; however, much larger sets are needed to unleash the potential of NLP and provide access to clinical narratives to NLP researchers working outside of medical centers.

Because CDS involves not only patient-specific information from the clinical record but also general medical knowledge regarding best practices in diagnosing or treating conditions experienced by the patient, NLP beyond current capabilities is needed to find and formally represent publications containing guidelines, CDS rules, and actionable recommendations offered in free text in publicly available online databases (such as MEDLINE/PubMed,⁹ BioMed Central,¹⁰ and PubMed Central¹¹) that provide access to scientific literature. Another publicly available resource is “grey literature,” which is more likely to report preliminary, non-significant or negative results than peer-reviewed, commercially published literature. Taking grey literature into account when analyzing and summarizing best available evidence may provide a more complete and objective answer to the question under consideration [28]. For example, averaged over 39 manually conducted meta-analyses, treatment was shown to be more effective for preventing an undesirable health outcome when only published literature was analyzed than when the analysis also included abstracts of conference proceedings, Food and Drug Administration (FDA) documents, and unpublished resources [28]. Bringing together evidence from formal studies and grey literature is a promising avenue for NLP CDS.

CDS-specific language processing builds upon the fundamental clinical text processing briefly described in the next section. The biomedical natural language processing of different document and text types is discussed in parallel throughout the remainder of this review.

⁹ http://www.ncbi.nlm.nih.gov/sites/entrez.
¹⁰ http://www.biomedcentral.com/.
¹¹ http://www.pubmedcentral.nih.gov/.

4. NLP building blocks

Even sophisticated NLP systems are built on the foundation of recognizing words or phrases as medical terms that represent the domain concepts (named entity recognition) and understanding the relations between the identified concepts.

4.1. Text pre-processing

The pre-processing steps leading to term and relation identification usually include tokenization, part-of-speech tagging, and syntactic parsing. Processing of clinical notes often starts with spelling correction and context-specific expansion of abbreviations. Unlike many abbreviations in the literature, abbreviations in clinical notes are not often expressed in parenthetical phrases following the expansion; therefore, researchers sometimes treat abbreviation expansion as a word sense disambiguation problem. Similarly, spell checking algorithms that use NLP techniques, such as word sense disambiguation and named entity recognition, perform better than traditional string edit distance algorithms at correcting spelling in clinical notes [29–32].

Part-of-speech tagging is an essential step in natural language understanding. For example, the following phrase found in a patient's record can be interpreted differently depending on the assigned part-of-speech tag (the contextually correct tag in square brackets follows the word; the wrong tag assigned by the part-of-speech tagger is shown in parentheses): “hemorrhagic [adjective] corpus [adjective](noun) luteum [adjective](noun) cyst [noun], left [adjective](verb) ovary [noun](adjective).” Given the tagging provided by the part-of-speech tagger, we would interpret the phrase as a report about the cyst disappearing from the ovary, as opposed to the correct interpretation of a specific cyst found in the left ovary.

Several part-of-speech taggers (POS-taggers) were developed specifically for the biomedical domain. For example, the MedPost tagger, based on a hidden Markov model (HMM) and the Viterbi algorithm, was trained and tested on 5700 manually tagged sentences, achieving 97% accuracy on sentences extracted from various thematic MEDLINE subsets [33]. Despite an opinion that at the part-of-speech tagging level, sublanguage differences seem to vanish [34], there is evidence that POS-taggers trained and tested on formal text that does not include clinical documents do not achieve state-of-the-art performance. For example, training a POS-tagger on a relatively small set of clinical notes improves the performance of the POS-tagger trained on Penn Treebank from 90% to 95% in one study [35] and from 79% to 94% in another study [36].

One of the main causes of errors when porting part-of-speech taggers to new domains is assignment of tags to out-of-vocabulary words in the new domain. Errors at the part-of-speech level can propagate upward to create more errors at the syntactic processing level. Errors in syntactic analysis, which provides information necessary for semantic interpretation of both the clinical narrative and the biomedical literature text, can, in turn, cause errors in text understanding.

4.2. NER

Named entity recognition (NER) involves identifying the boundaries of the name in the text and understanding (and disambiguating) its meaning, often through mapping the entity to a unique concept identifier in an appropriate ontology [37].

4.2.1. Dictionary-based NER

By its nature, dictionary-based NER needs resources that at a minimum provide a list of names for a given entity type. For example, the NCI Dictionary of Genetics Terms¹² contains about 100 terms and their definitions that support genetics cancer information summaries. This and other dictionaries and thesauri are included in the UMLS, which preserves information from the original contributing sources and enriches it through linking and adding meta-information, such as semantic types. The UMLS 2009AA version¹³ includes 2,125,395 concepts with 8,006,171 distinct names contributed by 152 sources and merged into the Metathesaurus (UMLS Meta). Many NER methods (applied to both the clinical narrative and the biomedical literature text) utilize UMLS Meta and tools developed within the UMLS.

¹² http://www.cancer.gov/cancertopics/genetics-terms-alphalist.
¹³ http://www.nlm.nih.gov/research/umls/release_metadata.html#sb6_0.

The expanse and origins of the UMLS Metathesaurus present the need for customization to address specific needs and sublanguages, as demonstrated in the comparative study of UMLS content views to support NLP processing of biomedical literature and clinical text [38]. One of the issues faced by dictionary-based methods is
D. Demner-Fushman et al. / Journal of Biomedical Informatics 42 (2009) 760–772 765

whether the lexicon coverage is sufficient for specific sub-domains [39]. The coverage of the semantic lexicon could be increased by employing morphosemantics-based systems capable of generating definitions for unknown terms [40]. In clinical NLP, it might be desirable to use a local lexicon instead of or in addition to significant domain knowledge captured in the UMLS. Manual integration of local terminologies and the UMLS can be aided through WordNet-based mapping methods [41].

4.2.2. Statistical NER

One of the successful alternatives to dictionary-based methods is the supervised machine learning approach to biomedical NER. This approach usually requires a substantial manual annotation effort. For example, in one effort 1442 MEDLINE abstracts were manually tokenized and annotated for automatic recognition of malignancy named entities [42]. Reducing annotation effort for NER can be achieved through dynamic selection of sentences to be annotated [43] or through active learning [44]. These methods could be of value for annotation of clinical notes because the existing collections of annotated clinical notes are significantly smaller than those of the medical literature (most dataset owners report gold standards around 160 notes [45,46]). The F-scores achieved for statistical NER on these collections range from low 70s [46] to 86% [47]. It remains to be seen if a larger annotated collection of clinical notes will prove beneficial for statistical NER.

An in-depth overview of the dictionary-based, rule-based, statistical, and hybrid approaches to automatic named entity recognition in biomedical literature is provided in Krauthammer and Nenadic [48]. Meystre et al. review information extraction from clinical narrative [49].

4.3. Context extraction

The key functions of clinical decision support systems require understanding the context from which an event or a named entity is extracted. For example, supporting clinical diagnosis and treatment processes with best evidence will require not only recognizing a clinical condition, but determining whether the condition is present or absent. Several algorithms have been developed for negation identification [50–53]. Chapman et al. [54] developed a stand-alone algorithm, ConText, for identifying three contextual features: negation (for example, no pneumonia); historicity (the condition is recent, occurred in the past, or might occur in the future); and experiencer (the condition occurs in the patient or in someone else, such as parents abuse alcohol). In many cases it is desirable to detect the degree of certainty in the context (for example, suspected pneumonia). Solt and colleagues described an algorithm for determining whether a condition is absent, present, or uncertain [55], and Uzuner and colleagues compared rule-based and machine learning approaches to assertion classification [56]. Aramaki et al. developed an application for creating tables from clinical texts that places information in the context of whether the information is negated, may occur in the future, and is needed, planned, or recommended [57]. Denny and colleagues developed an application for identifying timing and status of colonoscopy testing from reports, including whether a test was described in the context of being refused or scheduled [58]. A comprehensive review of temporal reasoning with medical data can be found in [59].

4.4. Associations and relations extraction

A better understanding of the clinical narrative text might be gained through identification and extraction of meaningful relationships between the identified entities and events. Similarly to NER, relation extraction can be decomposed into relation detection and determination of the relation type. Several resources contain relation types for the biomedical domain, for example, the UMLS Semantic Network14 that defines binary relations allowed between the UMLS semantic types. Although annotation efforts sometimes include relations [46,60], it remains to be seen whether explicitly stated relations occur in clinical narrative regularly and frequently enough to be necessary or useful for clinical decision support and whether experience in relation extraction from the literature [61,62] can be leveraged in clinical text processing.

14 http://semanticnetwork.nlm.nih.gov/.

In the absence of explicitly stated relations, researchers rely on co-occurrence statistics of certain semantic types to infer a relationship. For example, Chen et al. [10] applied the Medical Language Extraction and Encoding System (MedLEE) and BioMedLEE [63] to discharge summaries and MEDLINE articles to identify disease and drug entities and the chi-square statistic to measure the significance of associations between diseases and drugs. The subsequent overview of the top five disease–drug associations by a medical expert across eight diseases confirmed the appropriateness of the method for extracting disease–drug associations from both text sources [10].

To date, the bulk of the relation extraction experience stems from processing of the literature. One of the leading systems in this research, SemRep, is a rule-based, symbolic natural language processing system developed to identify relations defined in the UMLS Semantic Network. SemRep relies on its “indicator” rules to map syntactic elements (such as verbs and nominalizations) to predicates in the Semantic Network, such as TREATS, CAUSES, and LOCATION_OF. Argument identification rules (which take into account coordination, relativization, and negation) then find syntactically allowable noun phrases to serve as arguments for indicators [64]. Propositional representation (predications) of the TREATS(DISORDER, PHARMACOLOGICAL_SUBSTANCE) relation identified using SemRep significantly outperformed the co-occurrence frequency method in finding evidence for treatment suggestions for over 50 diseases, gaining 0.17 in mean average precision [65].

5. Current state of clinical NLP systems

The currently existing systems roughly fall into two categories: general-purpose clinical NLP architectures (increasingly publicly available), and specialized systems developed for specific tasks.

5.1. General-purpose clinical NLP systems

The early vision of medical NLP was implemented in the Linguistic String Project (LSP) system that developed the basic components and the formal representation of clinical narrative, and implemented the transformation of the free-text clinical documents into a formal representation [66]. The LSP system evolved into the Medical Language Processor (MLP) that includes the English healthcare syntactic lexicon and medically tagged lexicon, the MLP parser, parsing with English medical grammar, selection with medical co-occurrence patterns, English transformation, syntactic regularization, mapping into medical information format structures, and a set of XML tools for browsing and display.15

15 The system, provided by Colorado-based Medical Language Processing, L.L.C. corporation, can be downloaded from http://mlp-xml.sourceforge.net/.

MedLEE is an NLP system that extracts information from clinical narratives and presents this information in structured form using a controlled vocabulary. MedLEE uses a lexicon to map terms into semantic classes and a semantic grammar to generate formal representation of sentences. It is in use at Columbia University Medical Center, and is one of the few natural language processing
systems integrated with clinical information systems. MedLEE has been successfully used to process radiology reports, discharge summaries, sign-out notes, pathology reports, electrocardiogram reports, and echocardiogram reports [10,67–70]. An in-depth overview of the system and a case scenario are provided in [71].

The Text Analytics architecture developed in collaboration between the Mayo Clinic and IBM is using Unstructured Information Management Architecture (UIMA) to identify clinically relevant entities in clinical notes. The entities are subsequently used for information retrieval and data mining [72]. The ongoing development of this architecture resulted in two specialized pipelines: medKAT/P, which extracts cancer characteristics from pathology reports, and cTAKES, which identifies disorders, drugs, anatomical sites, and procedures in clinical notes. Evaluated on a set of manually annotated colon cancer pathology reports, MedTAS/P achieved F1-scores in the 90% range in extraction of histology, anatomical entities, and primary tumors [73]. A lower score achieved for metastatic tumors was attributed to the small number of instances in the training and test sets [73]. cTAKES and HiTEx, described below, are the first generalized clinical NLP systems to be made publicly available.

Developed at the National Center for Biomedical Computing, Informatics for Integrating Biology & the Bedside (I2B2), the Health Information Text Extraction (HiTEx) tool based on GATE is a modular system that assembles a different pipeline for extracting specific findings from clinical narrative. For example, a pipeline to extract diagnoses is formed by applying sequentially a section splitter, section filter, sentence splitter, sentence tokenizer, POS-tagger, noun phrase finder, UMLS concept mapper, and negation finder [74]. A pipeline for extraction of family history from discharge summaries and outpatient clinic notes evaluated on 350 sentences achieved 85% precision and 87% recall in identifying diagnoses; 96% precision and 93% recall differentiating family history from patient history; and 92% precision and recall exactly assigning diagnoses to family members [75].

The MediClass (a “Medical Classifier”) system was designed to automatically detect clinical events in any electronic medical record by analyzing the coded and free-text portions of the record. It was assessed in detecting care delivery for smoking cessation; immunization adverse events; and subtypes of diabetic retinopathy. Although the system architecture remained constant for each clinical event detection task, new classification rules and terminology were defined for each task [76]. For example, to detect possible vaccine reactions in the clinical notes, MediClass developers identified the relevant concepts and the linguistic structures used in clinical notes to record and attribute an adverse event to an immunization or vaccine [77]. The identified terms and structures were encoded into rules of a MediClass knowledge module that defines the classification scheme for automatic detection of possible vaccine reactions. The scheme requires detecting an explicit mention of an immunization event and detecting or inferring at least one finding of an adverse event [77]. In 227 of 248 cases (92%), MediClass correctly detected a possible vaccine reaction [77].

5.2. Specialized clinical NLP systems

The evaluation of the general-purpose architectures in specific tasks and the use of the general-purpose systems as components or foundation of many task- or document-specific systems, make the line between these system types somewhat fuzzy. The differences are most probably not in the end-results but in the initial goals of developing a system to process free text for any task versus solving a specific clinical task. Independent of the starting point and reuse of the general-purpose components, solving a specific task at minimum requires developing a task-specific database and decision rules. Some examples of task-specific systems are provided in this section.

5.2.1. Clinical events monitoring

Clinical events monitoring is one of the most common and essential tasks of CDS systems. Particularly important are detection and prevention of adverse events. Murff et al. found the electronic discharge summaries to be an excellent source for detecting adverse events; however, practically useful automatic detection of those events could not be achieved using simple keyword queries to trigger an alert [78]. Building upon rule-based extraction of clinical conditions from radiology reports [67], Hripcsak et al. describe an NLP-based framework for adverse event discovery [79]. The event discovery process involved seven steps: (1) identification of the target event, for example, drug interactions; (2) selection of a clinical data repository for monitoring; (3) natural language processing to formally represent clinical narrative (the formal representation was generated by MedLEE [68]); (4) query generation for event detection and classification; (5) verification of the accuracy of detection and classification; (6) error analysis; (7) iterative improvement of steps 1, 3, and 4 based on findings in steps 5 and 6. The framework sensitivity ranged from 0.15 to 0.37 at 0.99 specificity levels in identifying 45 types of adverse events such as pulmonary embolism, wound dehiscence requiring repair, medication errors, and other serious adverse events [79].

5.2.2. Processing radiology reports

Radiology reports are probably the most studied type of clinical narrative. This extremely important source of clinical data provides information not otherwise available in the coded data and allows performing tasks from coding of the findings and impressions [67,80], to detection of imaging technique suggested for follow-up or repeated examinations [81], to decision support for nosocomial infections [19], to biosurveillance [82]. The complete description of systems developed for processing of radiology reports is beyond the scope of this review. This section outlines the scope and research directions in processing of radiology reports and omits most of the radiology report studies based on the general-purpose systems described above.

A family of systems with the initial goal of processing radiology reports was developed at the LDS Hospital (Intermountain Healthcare). The Special Purpose Radiology Understanding System (SPRUS) extracts and encodes the findings and the radiologists’ interpretations using information from a diagnostic expert system [80]. SPRUS was followed by the Natural language Understanding Systems (NLUS) and Symbolic Text Processor (SymText) systems that combine semantic knowledge stored and applied in the form of a Bayesian Network with syntactic analysis based on a set of augmented transition network (ATN) grammars [83,84]. SymText was deployed at LDS Hospital for semi-automatic coding of admit diagnoses to ICD-9 codes [85]. SymText was also used to automatically extract interpretations from Ventilation/Perfusion lung scan reports for monitoring diagnostic performance of radiologists [86]. The accuracy of the system in identifying pneumonia-related concepts and inferring the presence or absence of acute bacterial pneumonia was evaluated using 292 chest X-ray reports annotated by physicians and lay persons. The 95% recall, 78% precision, and 85% specificity achieved by the system were comparable to that of physicians and better than that of lay persons [19]. SymText evolved to MPLUS (M+), which also uses a semantic model based on Bayesian Networks (BNs), but differs from SymText in the size and modularity of its semantic BNs and in its use of a chart parser [87]. M+ was evaluated for the extraction of American College of Radiology utilization review codes from 600 head CT reports. The system achieved 87% recall, 98% specificity and 85% precision in classifying reports as positive (containing brain conditions) [87].
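The recall (sensitivity), specificity, and precision (positive predictive value) figures quoted for the systems in this section are all derived from the same 2×2 table of system output against a gold standard. The following Python sketch shows the arithmetic; the function name `binary_metrics` and the counts are hypothetical illustrations and are not taken from [87] or any other cited study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute evaluation metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one gold-standard
    positive, one gold-standard negative, and one predicted positive).
    """
    recall = tp / (tp + fn)        # sensitivity: gold positives that were found
    specificity = tn / (tn + fp)   # gold negatives that were correctly rejected
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for a report classifier (illustrative only).
metrics = binary_metrics(tp=80, fp=10, fn=20, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Specificity can remain high while precision is modest when positive cases are rare, so the three figures are best read together.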

M+ was also evaluated for classifying chief complaints into syndrome categories [88]. Currently, M+ has been redesigned as Onyx and is being applied to spoken dental exams [89]. These evolving NLP systems provide examples of successful retargeting to coding of other types of clinical reports.

Elkin et al. [82] presented a specialized tool based on the general-purpose architecture, which was used to code radiology reports into the SNOMED CT reference terminology. The subsequent processing was based on the SNOMED CT encoded rule for the identification of Pneumonias, Infiltrates or Consolidations or other types of pulmonary densities [82]. The rule consisted of 17 increasingly complex clauses, starting with “if pneumonia 233604007 Positive Assertion Explode Impression Section –> Positive Assertion Pneumonia”. Identification of pneumonias was evaluated on 400 reports and resulted in 100% recall (sensitivity), 97% precision (positive predictive value), and 98% specificity [82].

The REgenstrief data eXtraction tool (REX) coded raw version 2.x Health Level 7 (HL7) messages to a targeted small to medium sized sets of concepts for a particular purpose in a given kind of narrative text [90]. REX was applied to 39,000 chest X-rays performed at Wishard Hospital in a 21-month period to identify findings related to CHF, tuberculosis, pneumonia, suspected malignancy, compression fractures, and several other disorders. REX achieved 100% specificity for all conditions, 94–100% sensitivity, and 95–100% positive predictive value, outperforming human coders in sensitivity [90]. In contrast, mapping six types of radiology reports to a UMLS subset and then selectively recognizing most salient concepts using information retrieval techniques, resulted in 63% recall and 30% precision [91].

5.2.3. Processing emergency department reports

Topaz targets 55 clinical conditions relevant for detecting patients with an acute lower respiratory syndrome [92]. Topaz uses three methods for mapping text to the 55 conditions: index UMLS concepts with MetaMap [93]; create compound concepts from UMLS concepts or keywords and section titles (e.g., Section:Neck + UMLS concept for lymphadenopathy = Cervical Lymphadenopathy); and identify measurement-value pairs (e.g., “temp” + number > 38 degrees Celsius = Fever). Topaz is built on the GATE platform and implements ConText as a GATE module for determining whether indexed conditions are present or absent, experienced by the patient or someone else, and historical, recent, or hypothetical. After integrating potentially multiple mentions of a condition from a report, agreement between Topaz and physicians reading the report was 0.85 using weighted kappa.

5.2.4. Processing pathology reports

Surgical pathology reports are another trove of clinical data for locating information about appropriate human tissue specimens [94] and supporting cancer research. For example, a preprocessor integrated with MedLEE to abstract 13 types of findings related to risks of developing breast cancer achieved a sensitivity of 90.6% and a precision of 91.6% [69].

In MEDSYNDIKATE, an NLP system for extraction of medical information from pathology reports, the basic sentence-level understanding of the clinical narrative (that takes into consideration grammatical knowledge, conceptual knowledge, and the link between syntactic and conceptual representations) is followed by the text-level analysis that tracks reference relations to eliminate representation errors [95].

Liu et al. assessed the feasibility of utilizing an existing GATE pipeline for extraction of the Gleason score (a measure of tumor grade), tumor stage, and status of lymph node metastasis from free-text pathology reports [96]. The pipeline was evaluated on committing errors related to the text processing and extraction of values from the report, and errors related to semantic disagreement between the report and the gold standard. Each variable had a different profile of errors. Numerous system errors were observed for Gleason Score extraction that requires fine distinctions and TNM stage extraction requiring multiple discrete decisions. The authors conclude that the existing system could be used to aid manual annotation or could be extended for automatic annotation [96]. These findings second observations of Schadow and McDonald that general-purpose tools and vocabularies need to be adapted to the specific needs of surgical pathology reports [94]. The Cancer Text Information Extraction System (caTIES) system, built on a GATE framework, uses MetaMap [93] and NegEx [51] to annotate findings, diagnoses, and anatomic locations in pathology reports. caTIES provides researchers with the ability to query, browse and create orders for annotated tissue data and physical material across a network of federated sources using automatically annotated pathology reports.16

16 http://caties.cabig.upmc.edu/.

5.2.5. Processing a mixture of clinical note types

The above studies indicate that natural language processing acceptable for clinical decision support is better achieved using tools developed for specific tasks and document types. It is therefore not surprising that processing of a mixture of clinical notes is successful when the task is well-defined and a small knowledge base is developed specifically for the task. For example, Meystre and Haug created a subset of the UMLS Metathesaurus for 80 problems of interest to their longitudinal Electronic Medical Record and evaluated extraction of these problems from 160 randomly selected discharge summaries, radiology reports, pathology reports, progress notes, and other document types [97]. The evaluation demonstrated that using a general purpose entity extraction tool with a custom data subset, disambiguation, and negation detection achieves 89.2% recall and 75.3% precision [97].

The MediClass system [76] was configured to automatically assess delivery of evidence-based smoking-cessation care [98]. A group of clinicians and tobacco-cessation experts met over several weeks to encode the recommended treatment model using the concepts and the types of phrases that provide evidence for smoking-cessation medications, discussions, referral activities, quitting activities, smoking and readiness-to-quit assessments. The treatment model involves five steps, “5A’s”: (1) ask about smoking status; (2) advise to quit; (3) assess a patient’s willingness to quit; (4) assist the patient’s quitting efforts; and (5) arrange follow-up. Evaluated on 500 patient records containing structured data in addition to progress notes, patient instructions, medications, referrals, reasons for visit, and other smoking-related data, MediClass performance was judged adequate to replace human coders of the 5A’s of smoking-cessation care [98].

The InfoBot system under development at the National Library of Medicine identifies the elements of a well-formed clinical question [99] in clinical notes. It subsequently invokes a question answering module (the CQA 1.0 system described in Section 6.3) that extracts answers to the question about the best care plan for a given patient with the identified problems from the literature, and delivers documents containing the answers [100]. In a pilot evaluation by 16 NIH Clinical Center nurses, each evaluating 15 patient cases, documents containing answers were found to be relevant and useful in the majority of cases [101]. It remains to be seen if such automated methods of linking evidence to a patient’s record can achieve the accuracy of more controlled delivery implemented in Infobuttons, decision support tools that deliver information based on the context of the interaction between a clinician and an EHR [102]. Automatic linking of external knowledge bases and patients’ records will be useful if the NLP systems achieve acceptable accuracy in extraction of bottom-line advice and presentation of this information in an easily comprehensible form. Extraction of the bottom-line advice and answers to clinical questions are presented in the next section.

6. Providing evidence: personalized context-sensitive summarization and question answering

The need to link evidence to patients’ records was stated in the 1977 assessment of computer-based medical information systems undertaken because of increased concern over the quality and rising costs of medical care [103]. The assessment concluded that the quality and cost concerns could be addressed by medical information systems that will supply physicians with information and incorporate valid findings of medical research [103]. The results of medical research might soon become directly available through querying clinical research databases; however, to date, findings of medical research can be primarily found in the literature. Following the 1977 report, medical informatics research focused on understanding physicians’ information needs and enabling physicians’ access to the published results of clinical studies. This research provides a solid foundation for NLP aimed at satisfying physicians’ desiderata. The most desired features include comprehensive specific bottom-line recommendations that anticipate and directly answer clinical questions, rapid access, current information, and evidence-based rationale for recommendations [104].

One important summarization task is to provide an overview of the latest scientific evidence pertaining to a specific clinical situation. The secondary sources that compile evidence found in the scientific publication (such as Family Physicians Inquiry Network17 and BMJ Clinical Evidence18) deliver expert-generated support in the form of short answers to clinical questions followed by summaries. This model is partially implemented in several systems described in this section.

17 http://www.fpin.org/mc/page.do.
18 http://clinicalevidence.bmj.com/ceweb/about/history.jsp.

6.1. Clinical data and evidence summarization for clinicians

Unlike the comparatively better researched summarization and visualization of structured clinical data [105–108], summarization of clinical narrative is an evolving area of research. Afantenos et al. surveyed the potential of summarization technology in the medical domain [109]. Van Vleck et al. identified information physicians consider relevant to summarizing a patient’s medical history in the medical record. The following categories were identified as necessary to capturing a patient’s history: Labs and Tests, Problem and Treatment, History, Findings, Allergies, Meds, Plan, and Identifying Info [110]. Meng et al. approached generation of clinical notes as an extractive summarization problem [111]. In this approach, sentences containing patient information that needs to be repeated are extracted based on their rhetorical categories determined using semantic patterns. This extraction method compares favorably to the baseline extraction method (the position of a sentence in the note) on a test set of 162 sentences in urological clinical notes [111]. Cao et al. summarized patients’ discharge summaries into problem lists [70].

The PERSIVAL project (a prototype system, not currently in use) summarized medical scientific publications [112,113]. The summarization module of the PERSIVAL system generated summaries tailored for physicians and patients. Summaries generated for a physician contained information relevant to a specific patient’s record. Each publication was represented using a set of templates. Templates were then clustered into semantically related units in order to generate a summary [112,113].

Based on the semantic abstraction paradigm, Fiszman et al. are developing a summarization system that relies on SemRep for semantic interpretation of the biomedical literature. The system condenses SemRep predications and presents them in graphical format [114]. We hope to see in the future if the above method holds promise for summarization and visual presentation of clinical notes.

6.2. Clinical data and evidence summarization for patients

The online access to personal health and medical records and the overwhelming amount of health-related information available to patients (alternatively called health care consumers and lay users) pose many interesting questions. Hardcastle and Hallet studied which text segments of a patient record require explanation before being released to patients and what types of explanation are appropriate [115]. Elhadad and Sutaria presented an unsupervised method for building a lexicon of semantically equivalent pairs of technical and lay medical terms [116].

Ahlfeldt et al. surveyed issues related to communicating technical medical terms in everyday language for patients and generating patient-friendly texts [117]. The survey presents research on alleviating the lack of understanding of clinical documents caused by medical terminology. This research includes generation of patient vocabularies and matching those vocabularies and problem lists with standard terminologies; generation of terminological resources, corpora and annotation tool; development of natural consumer language generation systems; and customization of patient education materials [117]. Green presents the design of a discourse generator that plans the content and organization of lay-oriented genetic counseling documents to assist drafting letters that summarize the results for patients [118].

6.3. Clinical question answering

One of the principal purposes of CDS is answering questions [14]. Questions occurring in clinical situations could pertain to “information on particular patients; data on health and sickness within the local population; medical knowledge; local information on doctors available for referral; information on local social influences and expectations; and information on scientific, political, legal, social, management, and ethical changes affecting both how medicine is practiced and how doctors interact with individual patients” [119]. Some questions do not need NLP and can be answered directly by a known resource. For example, the NLM Go Local service19 (which connects users to health services in their local communities and directs users of the Go Local sites to MedlinePlus health information) was established to answer logistics questions by providing access to local information. Questions about particular patients are currently answered by manually browsing or searching the EHR. Answering these questions can be facilitated by summarization (which requires NLP if information is extracted from free-text fields) and visualization tools [105–108]. Facilitating access to medical knowledge by providing answers to clinical questions is an area of active NLP research [120]. The goal of clinical question answering systems is to satisfy medical knowledge questions providing answers in the form of short action items supported by strong evidence.

19 http://www.nlm.nih.gov/medlineplus/golocal/about.html.

Jacquemart and Zweigenbaum studied the feasibility of answering students’ questions in the domain of oral pathology using Web resources. Questions involving pathology, procedures, treatments,
examinations, indications, diagnosis and anatomy were used to develop eight broad semantic models comprised of 66 different syntactico-semantic patterns representing the questions. The triple-based model ([concept]–(relation)–[concept]) combined with which, why, and does modalities accounted for a vast majority of questions. The formally represented questions were used to query 10 different search engines. Search results were checked manually to find a passage answering the question in a consistent context [121].

The [concept]–(relation)–[concept] triples generated by SemRep can be used to generate conceptual condensates that summarize a set of documents [114], or answer specific questions, for example, finding the best pharmacotherapy for a given disease [65]. Within the EpoCare project, the same question type is answered by using an SVM to classify MEDLINE abstract sentences as containing an outcome (answer) or not and extracting the high-ranking sentences [122]. The CQA-1.0 system also implements an Evidence Based Medicine (EBM)-inspired approach to outcome extraction [120]. In addition to extracting outcomes from individual MEDLINE abstracts to answer a wide range of questions, the CQA-1.0 system aggregates answers to questions about the best drug therapy into 5–6 drug classes generated based on the individual pharmaceutical treatments extracted from each abstract. Each class is supported by the strongest patient-oriented outcome pertaining to each drug in the class. The EpoCare and CQA-1.0 systems rely on the Patient-Intervention-Comparison-Outcome (PICO) framework developed to help clinicians formulate clinical questions [99]. The MedQA system answers definitional questions by integrating information retrieval, extraction, and summarization techniques to automatically generate paragraph-level text [123].

7. Clinical NLP: direct applications of NLP in healthcare

In addition to processing text pertaining to patients and generated by clinicians and researchers, NLP methods have been applied directly to patients’ narratives for diagnostic and prognostic purposes.

The Linguistic Inquiry and Word Count (LIWC)20 tool was used

Clinical NLP is also used for medication compliance and drug abuse monitoring. Butler et al. explored usefulness of content analysis of Internet message board postings for detection of potentially abusable opioid analgesics [131]. In this study, attractiveness for abuse of OxyContin®, Vicodin®, and Kadian® determined automatically (using the total number of posts by product, total number of mentions by product (including synonyms and misspellings), total number of posts containing at least one mention of each product, total number of unique authors, and the number of unique authors of posts referencing any of the 3 target products) was compared to the known attractiveness of the products. The numbers of mentions of the products were significantly different and corresponded to the product attractiveness. Based on this and other metrics, the authors conclude that a systematic approach to post-marketing surveillance of Internet chatter related to pharmaceutical products is feasible [131]. Understanding patient compliance issues could help in clinical decisions. This understanding could be gained through processing of informal textual communications found in the publicly available blog postings and e-mail archives. For example, Malouf et al. analyzed 316,373 posts to 19 Internet discussion groups and other websites from 8731 distinct users and found associations (such as cognitive side effects, risks, and dosage related issues) the epilepsy patients and their caregivers have for different medications [132].

To the best of our knowledge, the applications described in this section are experimental rather than deployed and regularly used in clinical setting. The difficulties in translation of clinical NLP research into clinical practice and obstacles in determining the level of practical engagement of NLP systems are discussed in the next section.

8. Conclusions and thoughts on future work

Discussing the road ahead for the clinical decision support systems, Greenes notes that despite the demonstrated benefits and local successes of CDS systems, the past 45 years of CDS research have not been translated into widespread use and daily practice [14]. This observation can be expanded to NLP systems and methods for CDS. The strong foundation and local successes combined with the renewed community-wide interest to medical language
to explore personality expressed through a person’s linguistic style processing provide hope that mature NLP systems for CDS will be-
[124]. The LIWC tool (which calculates the percentage of words in come available to the wider community in the near future.
written text that match up to 82 language dimensions) was evalu- Most of the above presented methods and systems were devel-
ated in predicting post-bereavement improvements in mental and oped for specific users, document types and CDS goals. Future re-
physical health [125], predicting adjustment to cancer [126], differ- search might indicate if such systems could be easily retargeted
entiating between the Internet message board entries and homepag- for new users and goals and whether the retargeted systems can
es of pro-anorexics or recovering anorexics [127], and recognizing compete with those designed for specific tasks and clinical sys-
suicidal and non-suicidal individuals [128]. Pestian et al. demon- tems. Evaluation methods for measuring the impact of NLP meth-
strated that the sequential minimization optimization algorithm ods on healthcare in addition to reliable standardized evaluation of
can classify completer and simulated suicide notes as well as mental NLP systems need to be developed.
health professionals [129]. For several issues very important to the future development of
Another potential clinical NLP application is assessment of neu- NLP for CDS, there is currently only anecdotal evidence and sparse
rodegenerative impairments. Roark et al. studied automation of publications. For example, with few exceptions, we do not know
NLP methods for diagnosis of mild cognitive impairment (MCI). which of the reviewed NLP–CDS systems are actually implemented
Automatic psychometric evaluation included syntactic annotation or deployed, and what makes these systems worthwhile. We might
and analysis of spoken language samples elicited during neuropsy- speculate that, for example, MedLEE is successfully integrated with
chological exams of elderly subjects. Evaluation of syntactic com- a clinical information system because it was developed and
plexity of the narrative was based on analysis of dependency adapted, as needed, for specific users and CDS goals, but the reason
structures and deviations from the standard (for English) right- for its success could also be its sophisticated NLP. We could better
branching trees in parse trees of subjects’ utterances. Measures de- judge which features determine whether NLP–CDS systems are ap-
rived from automatic parses highly correlated with manually de- plied outside of the experimental setting if we had more data
rived measures, indicating that automatically derived measures points. We believe it would be valuable to have a special venue
may be useful for discriminating between healthy and MCI sub- for presenting case studies and analysis of applied NLP systems
jects. [130]. in the near future.
Priorities in NLP development will be determined by the readi-
20
http://www.liwc.net/. ness of intended users to adopt NLP. The early successes in NLP and
CDS led to high user expectations that were not always met. NLP researchers need to re-gain clinicians’ trust, which is achievable based on better understanding of the NLP strengths and weaknesses by clinicians, as well as significant progress in biomedical NLP. Reacquainting clinicians with NLP can be facilitated by NLP training, well-planned NLP experiments, careful and thoughtful evaluation of the results, high-quality implementation of NLP modules, semi-automated and easier methods for adapting NLP for other domains, and evaluations of NLP–CDS adequacy in satisfying user needs.

We believe NLP can contribute to decision support for all groups involved in the clinical process, but the development will probably focus on the areas for which there is higher demand. For example, if researchers are more eager consumers of NLP than clinicians, NLP research into text mining and literature summarization will continue dominating the field.

The NLP CDS tasks are so numerous and complex that this area of research will succeed in making practical impact only as a result of coordinated community-wide effort.

Acknowledgments

This work was partially supported by the Intramural Research Program of the National Library of Medicine, National Institutes of Health. We also thank Kevin Bretonnel Cohen for inspiration and valuable comments, and the anonymous reviewers for the detailed analysis and helpful comments.

References

[1] Shortliffe EH. Computer programs to support clinical decision making. JAMA 1987;258(1):61–6.
[2] Hunt DL, Haynes RB, Hanna SE, Smith K. Effects of computer-based clinical decision support systems on physician performance and patient outcomes: a systematic review. JAMA 1998;280(15):1339–46.
[3] Hripcsak G, Knirsch CA, Jain NL, Pablos-Mendez A. Automated tuberculosis detection. J Am Med Inform Assoc 1997;4(5):376–81.
[4] Osheroff JA, Teich JM, Middleton B, Steen EB, Wright A, Detmer DE. A roadmap for national action on clinical decision support. J Am Med Inform Assoc 2007;14(2):141–5.
[5] Sittig DF, Wright A, Osheroff JA, Middleton B, Teich JM, Ash JS, et al. Grand challenges in clinical decision support. J Biomed Inform 2008;41(2):387–92.
[6] Aspden P, Corrigan JM, Wolcott J, Erickson SM, editors. Patient safety: achieving a new standard for care. Washington, DC: The National Academies Press; 2004.
[7] Garg AX, Adhikari NK, McDonald H, Rosas-Arellano MP, Devereaux PJ, Beyene J, et al. Effects of computerized clinical decision support systems on practitioner performance and patient outcomes: a systematic review. JAMA 2005;293(10):1223–38.
[8] Kawamoto K, Houlihan CA, Balas EA, Lobach DF. Improving clinical practice using clinical decision support systems: a systematic review of trials to identify features critical to success. BMJ 2005;330(7494):765.
[9] Crowley RS, Tseytlin E, Jukic D. ReportTutor – an intelligent tutoring system that uses a natural language interface. Proc AMIA Symp 2005:171–5.
[10] Chen ES, Hripcsak G, Xu H, Markatou M, Friedman C. Automated acquisition of disease drug knowledge from biomedical and clinical documents: an initial study. J Am Med Inform Assoc 2008;15(1):87–98.
[11] Prokosch HU, McDonald CJ. The effect of computer reminders on the quality of care and resource use. In: Prokosch HU, Dudeck J, editors. Hospital information systems: design and development characteristics; impact and future architecture. Elsevier Science; 1995. p. 221–240.
[12] Berner ES, editor. Clinical decision support systems-theory and practice. New York: Springer; 2007.
[13] Aronsky D, Fiszman M, Chapman WW, Haug PJ. Combining decision support methodologies to diagnose pneumonia. Proc AMIA Symp 2001:12–6.
[14] Greenes RA, editor. Clinical decision support: the road ahead. Burlington, MA: Elsevier, Inc.; 2007.
[15] Mamlin BW, Overhage JM, Tierney W, Dexter P, McDonald CJ. Clinical decision support within the Regenstrief medical record system. In: Berner ES, editor. Clinical decision support systems, theory and practice. New York: Springer; 2007. p. 190–314.
[16] Biondich P, Mamlin BW, Tierney W, Overhage JM, McDonald CJ. Regenstrief medical informatics: experiences with clinical decision support systems. In: Greenes RA, editor. Clinical decision support: the road ahead. Burlington, MA: Elsevier, Inc.; 2007. p. 111–26.
[17] Haug PJ, Gardner RM, Evans RS, Rocha BH, Rocha RA. Clinical decision support at intermountain healthcare. In: Berner ES, editor. Clinical decision support systems, theory and practice. New York: Springer; 2007. p. 159–89.
[18] Evans RS, Pestotnik SL, Classen DC, Clemmer TP, Weaver LK, Orme Jr JF, et al. A computer-assisted management program for antibiotics and other antiinfective agents. N Engl J Med 1998;338(4):232–8.
[19] Fiszman M, Chapman WW, Aronsky D, Evans RS, Haug PJ. Automatic detection of acute bacterial pneumonia from chest X-ray reports. J Am Med Inform Assoc 2000;7(6):593–604.
[20] Dixon BE, Zafar A. Inpatient computerized provider order entry: findings from the AHRQ Health IT Portfolio (prepared by the AHRQ National Resource Center for Health IT). AHRQ Publication No. 09–0031-EF. Rockville, MD: Agency for Healthcare Research and Quality; 2009.
[21] Miller RA, Waitman LR, Chen S, Rosenbloom ST. Decision support during inpatient care provider order entry: the Vanderbilt experience. In: Berner ES, editor. Clinical decision support systems, theory and practice. New York: Springer; 2007. p. 215–48.
[22] Bates DW, Lo HG. Patients, doctors, and information technology: clinical decision support at Brigham and Women’s Hospital and Partners Healthcare. In: Greenes RA, editor. Clinical decision support: the road ahead. Burlington, MA: Elsevier, Inc.; 2007. p. 127–41.
[23] McDonald CJ, Overhage JM. Guidelines you can follow and can trust: an ideal and an example. JAMA 1994;271(11):872–3.
[24] Pestian J, Brew C, Matykiewicz P, Hovermale DJ, Johnson N, Cohen KB, et al. A shared task involving multi-label classification of clinical free text. In: ACL’07 workshop on biological, translational, and clinical language processing (BioNLP’07). Prague, Czech Republic; 2007. p. 36–40.
[25] Humphreys BL, Lindberg DA. The UMLS project: making the conceptual connection between users and the information they need. Bull Med Libr Assoc 1993;81(2):170–7.
[26] Friedlin FJ, McDonald CJ. A software tool for removing patient identifying information from clinical documents. J Am Med Inform Assoc 2008;15(5):601–10.
[27] Neamatullah I, Douglass MM, Lehman LW, Reisner A, Villarroel M, Long WJ, et al. Automated de-identification of free-text medical records. BMC Med Inform Decis Mak 2008;24(8):32.
[28] McAuley L, Pham B, Tugwell P, Moher D. Does the inclusion of grey literature influence estimates of intervention effectiveness reported in meta-analyses? Lancet 2000;356(9237):1228–31.
[29] Ruch P, Baud R, Geissbuhler A. Using lexical disambiguation and named-entity recognition to improve spelling correction in the electronic patient record. Artif Intell Med 2003;29(1–2):169–84.
[30] Xu H, Friedman C, Stetson PD. Methods for building sense inventories of abbreviations in clinical notes. AMIA Annu Symp Proc 2008:819.
[31] Pakhomov S, Pedersen T, Chute CG. Abbreviation and acronym disambiguation in clinical discourse. AMIA Annu Symp Proc 2005:589–93.
[32] Joshi M, Pakhomov S, Pedersen T, Chute CG. A comparative study of supervised learning as applied to acronym expansion in clinical reports. AMIA Annu Symp Proc 2006:399–403.
[33] Smith L, Rindflesch T, Wilbur WJ. MedPost: a part-of-speech tagger for bioMedical text. Bioinformatics 2004;20(14):2320–1.
[34] Wermter J, Hahn U. Really, is medical sublanguage that different? Experimental counter-evidence from tagging medical and newspaper corpora. Stud Health Technol Inform 2004;107(Pt 1):560–4.
[35] Pakhomov SV, Coden A, Chute CG. Developing a corpus of clinical notes manually annotated for part-of-speech. Int J Med Inform 2006;75(6):418–29.
[36] Liu K, Chapman W, Hwa R, Crowley RS. Heuristic sample selection to minimize reference standard training set for a part-of-speech tagger. J Am Med Inform Assoc 2007;14(5):641–50.
[37] Ananiadou S, Friedman C, Tsujii J. Introduction: named entity recognition in biomedicine. J Biomed Inform 2004;37(6):393–5.
[38] Demner-Fushman D, Mork JG, Shooshan SE, Aronson AR. UMLS Content Views Appropriate for NLP Processing of the Biomedical Literature vs. Clinical Text. AMIA Annu Symp Proc 2009.
[39] Johnson SB. A semantic lexicon for medical language processing. J Am Med Inform Assoc 1999;6(3):205–18.
[40] Deléger L, Namer F, Zweigenbaum P. Morphosemantic parsing of medical compound words: transferring a French analyzer to English. Int J Med Inform 2008.
[41] Mougin F, Burgun A, Bodenreider O. Using WordNet to improve the mapping of data elements to UMLS for data sources integration. AMIA Annu Symp Proc 2006:574–8.
[42] Jin Y, McDonald RT, Lerman K, Mandel MA, Carroll S, Liberman MY, et al. Automated recognition of malignancy mentions in biomedical literature. BMC Bioinformatics 2006;7:492.
[43] Tsuruoka Y, Tsujii J, Ananiadou S. Accelerating the annotation of sparse named entities by dynamic sentence selection. In: Proceedings of the workshop on current trends in biomedical natural language processing (BioNLP’08), June 19. Columbus, Ohio; 2008. p. 30–37.
[44] Tomanek K, Wermter J, Hahn U. An approach to text corpus construction which cuts annotation costs and maintains reusability of annotated data. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007). Jun 28–30, Prague, Czech Republic; 2007. p. 486–449.
[45] Ogren P, Savova G, Chute C. Constructing evaluation corpora for automated clinical named entity recognition. In: Proceedings of the Sixth International Language Resources and Evaluation (LREC’08). Marrakech, Morocco, May 28–30; 2008.
[46] Roberts A, Gaizauskas R, Hepple M, Demetriou G, Guo Y, Roberts I, et al. Building a semantically annotated corpus of clinical texts. J Biomed Inform 2009;42(5):950–66.
[47] Li D, Savova G, Kipper-Schuler K. Conditional random fields and support vector machines for disorder named entity recognition in clinical texts. In: Proceedings of the workshop on current trends in biomedical natural language processing (BioNLP’08), June 19. Columbus, Ohio; 2008. p. 94–95.
[48] Krauthammer M, Nenadic G. Term identification in the biomedical literature. J Biomed Inform 2004;37(6):512–26.
[49] Meystre SM, Savova GK, Kipper-Schuler KC, Hurdle JF. Extracting information from textual documents in the electronic health record: a review of recent research. Yearb Med Inform 2008:128–44.
[50] Mutalik PG, Deshpande A, Nadkarni PM. Use of general-purpose negation detection to augment concept indexing of medical documents: a quantitative study using the UMLS. J Am Med Inform Assoc 2001;8(6):598–609.
[51] Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. A simple algorithm for identifying negated findings and diseases in discharge summaries. J Biomed Inform 2001;34(5):301–10.
[52] Elkin PL, Brown SH, Bauer BA, Husser CS, Carruth W, Bergstrom LR, et al. A controlled trial of automated classification of negation from clinical notes. BMC Med Inform Decis Mak 2005;5(1):13.
[53] Huang Y, Lowe HJ. A Novel hybrid approach to automated negation detection in clinical radiology reports. J Am Med Inform Assoc 2007.
[54] Chapman WW, Chu D, Dowling JN. ConText: an Algorithm for identifying contextual features from clinical text. In: ACL’07 workshop on Biological, translational, and clinical language processing (BioNLP’07). Prague, Czech Republic; 2007. p. 81–88.
[55] Solt I, Tikk D, Gal V, Kardkovacs ZT. Semantic classification of diseases in discharge summaries using a context-aware rule-based classifier. J Am Med Inform Assoc 2009;16(4):580–4.
[56] Uzuner O, Zhang X, Sibanda T. Machine learning and rule-based approaches to assertion classification. J Am Med Inform Assoc 2009;16(1):109–15.
[57] Aramaki E, Miurua Y, Tonoike M, Ohkuma T, Mashuichi H, Ohe Kazuhiko. TEXT2TABLE: medical text summarization system based on named entity recognition and modality identification. In: Proc of the Workshop on BioNLP2009 at the NAACL Symposium; 2009. p. 185–192.
[58] Denny J, Peterson J, Choma N, Xu H, Miller R, Bastarache L, Peterson N. Development of a natural language processing system to identify timing and status of colonoscopy testing in electronic medical records. AMIA Annu Symp Proc 2009.
[59] Zhou L, Hripcsak G. Temporal reasoning with medical data – a review with emphasis on medical natural language processing. J Biomed Inform 2007;40(2):183–202.
[60] Roberts A, Gaizauskas R, Hepple M. Extracting clinical relationships from patient narratives. In: Proceedings of the workshop on current trends in biomedical natural language processing (BioNLP’08), June 19. Columbus, Ohio; 2008. p. 10–18.
[61] Cohen KB, Palmer M, Hunter L. Nominalization and alternations in biomedical language. PLoS ONE 2008;3(9):e3158.
[62] Cohen KB, Hunter L. A critical review of PASBio’s argument structures for biomedical verbs. BMC Bioinformatics 2006;7(Suppl 3):S5.
[63] Chen L, Friedman C. Extracting phenotypic information from the literature via natural language processing. Stud Health Technol Inform 2004;107(Pt 2):758–62.
[64] Rindflesch TC, Bodenreider O. Advanced library services: developing a biomedical knowledge repository to support advanced information management applications. September 2006. The Lister Hill National Center for Biomedical Communications. LHNCBC-TR-2006-001. Available from: http://lhncbc.nlm.nih.gov/lhc/docs/reports/2006/tr2006001.pdf.
[65] Fiszman M, Demner-Fushman D, Kilicoglu H, Rindflesch TC. Automatic summarization of MEDLINE citations for evidence-based medical treatment: a topic-oriented evaluation. J Biomed Inform 2009;42(5):801–13.
[66] Sager N, Lyman M, Bucknall C, Nhan N, Tick LJ. Natural language processing and the representation of clinical data. J Am Med Inform Assoc 1994;1(2):142–60.
[67] Friedman C, Alderson PO, Austin JH, Cimino JJ, Johnson SB. A general natural-language text processor for clinical radiology. J Am Med Inform Assoc 1994;1(2):161–74.
[68] Friedman C, Shagina L, Lussier Y, Hripcsak G. Automated encoding of clinical documents based on natural language processing. J Am Med Inform Assoc 2004;11(5):392–402.
[69] Xu H, Anderson K, Grann VR, Friedman C. Facilitating cancer research using natural language processing of pathology reports. Stud Health Technol Inform 2004;107(Pt 1):565–72.
[70] Cao H, Chiang MF, Cimino JJ, Friedman C, Hripcsak G. Automatic summarization of patient discharge summaries to create problem lists using medical language processing. Stud Health Technol Inform 2004;107(Pt 2):1540.
[71] Friedman C. Semantic text parsing for patient records. In: Chun H, Fuller SS, Friedman C, Hersh W, editors. Knowledge management and data mining in biomedicine. NY: Springer; 2005. p. 423–48.
[72] Pakhomov S, Buntrock J, Duffy P. High throughput modularized NLP system for clinical text. In: Proceedings of the 43rd annual meeting on association for computational linguistics (ACL’05). Interactive poster and demonstration sessions, June 25–30. Ann Arbor: Michigan; 2005. p. 25–28.
[73] Coden A, Savova G, Sominsky I, Tanenblatt M, Masanz J, Schuler K, et al. Automatically extracting cancer disease characteristics from pathology reports into a disease knowledge representation model. J Biomed Inform 2009;42(5):937–49.
[74] Zeng QT, Goryachev S, Weiss S, Sordo M, Murphy SN, Lazarus R. Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system. BMC Med Inform Decis Mak 2006;6:30.
[75] Goryachev S, Kim H, Zeng-Treitler Q. Identification and extraction of family history information from clinical reports. AMIA Annu Symp Proc 2008:247–51.
[76] Hazlehurst B, Frost HR, Sittig DF, Stevens VJ. MediClass: a system for detecting and classifying encounter-based clinical events in any electronic medical record. J Am Med Inform Assoc 2005;12(5):517–29.
[77] Hazlehurst B, Mullooly J, Naleway A, Crane B. Detecting possible vaccination reactions in clinical notes. AMIA Annu Symp Proc 2005:306–10.
[78] Murff HJ, Patel VL, Hripcsak G, Bates DW. Detecting adverse events for patient safety research: a review of current methodologies. J Biomed Inform 2003;36(1–2):131–43.
[79] Hripcsak G, Bakken S, Stetson PD, Patel VL. Mining complex clinical data for patient safety research: a framework for event discovery. J Biomed Inform 2003;36(1–2):120–30.
[80] Haug PJ, Ranum DL, Frederick PR. Computerized extraction of coded findings from free-text radiologic reports. Work in progress. Radiology 1990;174(2):543–548.
[81] Dang PA, Kalra MK, Blake MA, Schultz TJ, Halpern EF, Dreyer KJ. Extraction of recommendation features in radiology with natural language processing: exploratory study. AJR Am J Roentgenol 2008;191(2):313–20.
[82] Elkin PL, Froehling D, Wahner-Roedler D, Trusko B, Welsh G, Ma H, et al. NLP-based identification of pneumonia cases from free-text radiological reports. AMIA Annu Symp Proc 2008:172–6.
[83] Haug P, Koehler S, Lau LM, Wang P, Rocha R, Huff S. A natural language understanding system combining syntactic and semantic techniques. Proc Annu Symp Comput Appl Med Care 1994:247–51.
[84] Haug PJ, Koehler S, Lau LM, Wang P, Rocha R, Huff SM. Experience with a mixed semantic/syntactic parser. Proc Annu Symp Comput Appl Med Care 1995:284–8.
[85] Haug PJ, Christensen L, Gundersen M, Clemons B, Koehler S, Bauer K. A natural language parsing system for encoding admitting diagnoses. Proc AMIA Annu Fall Symp 1997:814–8.
[86] Fiszman M, Haug PJ, Frederick PR. Automatic extraction of PIOPED interpretations from ventilation/perfusion lung scan reports. Proc AMIA Symp 1998:860–4.
[87] Christensen LM, Haug PJ, Fiszman M. MPLUS: A probabilistic medical language understanding system. In: Proceedings of the workshop on natural language processing in the biomedical domain, Philadelphia. Association for Computational Linguistics, July 2002. p. 29–36.
[88] Chapman WW, Christensen LM, Wagner MM, Haug PJ, Ivanov O, Dowling JN, et al. Classifying free-text triage chief complaints into syndromic categories with natural language processing. Artif Intell Med 2005;33(1):31–40.
[89] Christensen L, Harkema H, Haug P, Irwin J, Chapman WW. ONYX: a system for the semantic analysis of clinical text. In: BioNLP workshop of the association for computational linguistics conference, Boulder, CO; 2009.
[90] Friedlin J, McDonald CJ. A natural language processing system to extract and code concepts relating to congestive heart failure from chest radiology reports. AMIA Annu Symp Proc 2006:269–73.
[91] Hersh W, Mailhot M, Arnott-Smith C, Lowe H. Selective automated indexing of findings and diagnoses in radiology reports. J Biomed Inform 2001;34(4):262–73.
[92] Chapman WW, Fiszman M, Dowling JN, Chapman BE, Rindflesch TC. Identifying respiratory findings in emergency department reports for biosurveillance using MetaMap. Medinfo 2004;2004:487–91.
[93] Aronson AR. Effective mapping of biomedical text to the UMLS metathesaurus: the MetaMap program. Proc AMIA Symp AMIA 2001:17–21.
[94] Schadow G, McDonald CJ. Extracting structured information from free text pathology reports. AMIA Annu Symp Proc 2003:584–8.
[95] Hahn U, Romacker M, Schulz S. MEDSYNDIKATE – a natural language system for the extraction of medical information from findings reports. Int J Med Inform 2002;67(1–3):63–74.
[96] Liu K, Mitchell KJ, Chapman WW, Crowley RS. Automating tissue bank annotation from pathology reports – comparison to a gold standard expert annotation set. AMIA Annu Symp Proc 2005:460–4.
[97] Meystre S, Haug PJ. Natural language processing to extract medical problems from electronic clinical documents: performance evaluation. J Biomed Inform 2006;39(6):589–99.
[98] Hazlehurst B, Sittig DF, Stevens VJ, Smith KS, Hollis JF, Vogt TM, et al. Natural language processing in the electronic medical record: assessing clinician adherence to tobacco treatment guidelines. Am J Prev Med 2005:434–9.
[99] Richardson W, Wilson MC, Nishikawa J, Hayward RSA. The well-built clinical question: a key to evidence-based decisions. ACP Journal Club 1995;123:A-12.
[100] Demner-Fushman D, Seckman C, Fisher C, Hauser S, Clayton J, Thoma G. A prototype system to support evidence-based practice. AMIA Annu Symp Proc 2008:151–5.
[101] Seckman C, Demner-Fushman D, Fisher C, Hauser SE, Thoma GR. InfoBot: a prototype system to support evidenced based practice. Building Connections for Patient Centered Records. University of Maryland School of Nursing 18th Annual Summer Institute in Nursing Informatics, July 2008. Baltimore, MD.
[102] Cimino JJ, Del Fiol G. Infobuttons and point of care access to knowledge. In: Greenes RA, editor. Clinical decision support: the road ahead. New York: Academic Press.
[103] Policy implications of medical information systems. Washington, DC: U.S. Government Printing Office; 1977.
[104] Ely JW, Osheroff JA, Chambliss ML, Ebell MH, Rosenbaum ME. Answering physicians’ clinical questions: obstacles and potential solutions. J Am Med Inform Assoc 2005;12(2):217–24.
[105] Downs S, Walker MG, Blum RL. Automated summarization of on-line medical records. In: Proceedings of the fifth world congress on medical informatics (Medinfo). Elsevier Science Publishers 1986:800–4.
[106] Shahar Y, Goren-Bar D, Boaz D, Tahan G. Distributed, intelligent, interactive visualization and exploration of time-oriented clinical data and their abstractions. Artif Intell Med 2006;38(2):115–35.
[107] Plaisant C, Lam SJ, Shneiderman B, Smith MS, Roseman DH, Marchand G, et al. Searching electronic health records for temporal patterns in patient histories: a case study with Microsoft Amalga. AMIA Annu Symp Proc 2008:601–5.
[108] Stacey M, McGregor C. Temporal abstraction in intelligent clinical data analysis: a survey. Artif Intell Med 2007;39(1):1–24.
[109] Afantenos S, Karkaletsis V, Stamatopoulos P. Summarization from medical documents: a survey. Artif Intell Med 2005;33(2):157–77.
[110] Van Vleck TT, Stein DM, Stetson PD, Johnson SB. Assessing data relevance for automated generation of a clinical summary. AMIA Annu Symp Proc 2007:761–5.
[111] Meng F, Taira RK, Bui AA, Kangarloo H, Churchill BM. Automatic generation of repeated patient information for tailoring clinical notes. Int J Med Inform 2005;74(7–8):663–73.
[112] McKeown K, Chang SF, Cimino JJ, Feiner S, Friedman C, Gravano L, et al. PERSIVAL, a system for personalized search and summarization over multimedia healthcare information. In: Proceedings of the joint conference on digital libraries (JCDL) 2001;0:331–340.
[113] Elhadad N, Kan MY, Klavans J, McKeown K. Customization in a unified framework for summarizing medical literature. Artif Intell Med 2005;33(2):179–98.
[114] Fiszman M, Rindflesch TC, Kilicoglu H. Summarization of an online medical encyclopedia. Stud Health Technol Inform 2004;107(Pt 2):506–10.
[115] Hardcastle D, Hallett C. Exploring the Use of NLP in the disclosure of electronic patient records. In: ACL’07 workshop on Biological, translational, and clinical language processing (BioNLP’07). Prague, Czech Republic; 2007. p. 161–162.
[116] Elhadad N, Sutaria K. Mining a lexicon of technical terms and lay equivalents. In: ACL’07 workshop on biological, translational, and clinical language processing (BioNLP’07). Prague, Czech Republic; 2007. p. 49–56.
[117] Åhlfeldt H, Borin L, Daumke P, Grabar N, Hallett C, Hardcastle D, et al. Literature review on patient-friendly documentation systems. The Open University, United Kingdom. May 2006. Technical Report No 2006/04. Available from: http://mcs.open.ac.uk/nlg/publications/TR2006_04.pdf.
[118] Green N. Representing normative arguments in genetic counseling. In: Proceedings of the AAAI-2006 spring symposium on argumentation for consumers of healthcare; 2006 Mar 27–29, Stanford, California; 2006.
[119] Smith R. What clinical information do doctors need? BMJ 1996;313:1062–1068.
[120] Demner-Fushman D, Lin J. Answering clinical questions with knowledge-based and statistical techniques. Comput Linguistics 2007;33(1):63–103.
[121] Jacquemart P, Zweigenbaum P. Towards a medical question-answering system: a feasibility study. Stud Health Technol Inform 2003;95:463–8.
[122] Niu Y, Zhu X, Hirst G. Using outcome polarity in sentence extraction for medical question-answering. AMIA Annu Symp Proc 2006:599–603.
[123] Lee M, Cimino J, Zhu HR, Sable C, Shanker V, Ely J, et al. Beyond information retrieval – medical question answering. AMIA Annu Symp Proc 2006:469–73.
[124] Pennebaker JW, Mehl MR, Niederhoffer KG. Psychological aspects of natural language use: our words, our selves. Annu Rev Psychol 2003;54:547–77.
[125] Pennebaker JW, Mayne TJ, Francis ME. Linguistic predictors of adaptive bereavement. J Pers Soc Psychol 1997;72(4):863–71.
[126] Owen JE, Giese-Davis J, Cordova M, Kronenwetter C, Golant M, Spiegel D. Self-report and linguistic indicators of emotional expression in narratives as predictors of adjustment to cancer. J Behav Med 2006;29(4):335–345. Epub 2006 Jul 15.
[127] Lyons EJ, Mehl MR, Pennebaker JW. Pro-anorexics and recovering anorexics differ in their linguistic internet self-presentation. J Psychosom Res 2006;60(3):253–6.
[128] Stirman SW, Pennebaker JW. Word use in the poetry of suicidal and nonsuicidal poets. Psychosom Med 2001;63(4):517–22.
[129] Pestian J, Matykiewicz P, Grupp-Phelan J, Lavanier SA, Combs J, Kowatch R. Using natural language processing to classify suicide notes. In: Proceedings of the workshop on current trends in biomedical natural language processing (BioNLP’08), Jun 19. Columbus, Ohio; 2008. p. 96–99.
[130] Roark B, Mitchell M, Hollingshead K. Syntactic complexity measures for detecting Mild Cognitive Impairment. In: ACL’07 workshop on biological, translational, and clinical language processing (BioNLP’07). Prague, Czech Republic; 2007. p. 1–8.
[131] Butler SF, Venuti SW, Benoit C, Beaulaurier RL, Houle B, Katz N. Internet surveillance: content analysis and monitoring of product-specific internet prescription opioid abuse-related postings. Clin J Pain 2007;23(7):619–28.
[132] Malouf R, Davidson B, Sherman A. Mining web texts for brand associations. In: Proceedings of the AAAI-2006 spring symposium on computational approaches to analyzing weblogs, Mar 27–29. Stanford, California; 2006. p. 125–126.
