E 2087 - 00 Rtiwodc
Designation: E 2087 – 00
INTRODUCTION
In 1839, William Farr stated in his First Annual Report of the Registrar-General of Births, Deaths,
and Marriages in England, “The nomenclature is of as much importance in this department of inquiry,
as weights and measures in the physical sciences, and should be settled without delay.” Since that time
this theme has been heard resounding from an increasingly large group of scientists (see Appendix
X1). Today, the need for controlled vocabularies to support health record systems has been widely
recognized (see Specification E 1238, Guide E 1239, Guide E 1384, Specification E 1633, and EN
12017). Controlled vocabularies provide systems with the means to aggregate data. This aggregation
of data can be done at multiple levels of granularity and therefore can enhance the clinical retrieval
of a problem oriented record, data pertaining to a classification for billing purposes, or outcomes data
for a given population. Maintenance of large-scale vocabularies has become a burdensome problem
as the size of term sets has escalated (IS 15188). Without a well-structured backbone, large-scale
vocabularies cannot scale to provide the level of interoperability required by today’s complex
electronic health record applications.
The solution rests with standards (1).² Over the past ten or more years, Medical Informatics
researchers have been studying controlled vocabulary issues directly. They have examined the
structure and content of existing vocabularies to determine why they seem unsuitable for particular
needs, and they have proposed solutions. In some cases, proposed solutions have been carried forward
into practice and new experience has been gained (2). As we prepare to enter the twenty-first century,
it seems appropriate to pause to reflect on this experience, and publish a standard set of goals for the
development of comparable, reusable, multipurpose, and maintainable controlled health vocabularies
(IS 12200, IS 12620) (3).
This specification represents the initial input taken from the ANSI-HISB Framework Paper by
Chute, et al (4), the Desiderata from Cimino (3), the ToMeLo Architecture and Terminology Paper by
Rossi-Mori and Zanstra, and the Compositionality Paper by Elkin, et al (5). Other useful references
include, “GALEN Generalized Architecture for Language, Encyclopedias and Nomenclatures in
Medicine: Univ. of Manchester” (6, 7) and “Unified Medical Language System (UMLS) Knowledge
Sources” (8).
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.
basis from which vocabulary developers will build robust, large-scale, reliable and maintainable terminologies.
1.3 This specification explicitly does not refer to classifications or coding systems (for example, a simple list of pairs of rubrics and codes) that are not designed to be used clinically.
2. Referenced Documents
2.1 ASTM Standards:
E 1238 Specification for Transferring Clinical Observations Between Independent Computer Systems³
E 1239 Guide for Description of Reservation/Registration-Admission, Discharge, Transfer (R-ADT) Systems for Automated Patient Care Information Systems³
E 1284 Guide for Construction of a Clinical Nomenclature for Support of Electronic Health Records³
E 1384 Guide for Content and Structure of the Electronic Health Record (EHR)³
E 1633 Specification for Coded Values Used in the Electronic Health Record³
E 1712 Specification for Representing Clinical Laboratory Procedure and Analyte Names³
³ Annual Book of ASTM Standards, Vol 14.01.
2.2 Other Standards:
ISO/DIS 860 International Harmonization of Concepts and Terms
EN 12017 Medical Informatics—Vocabulary
EN 12264 Medical Informatics—Categorical Structure of Syntax of Concepts—Model for Representation of Semantics
ICD-9-CM
IS 704 Principles and Methods of Terminology
IS 1087-1 Terminology—Vocabulary—Part 1: Theory and Application
IS 1087-2 Terminology—Vocabulary—Part 2: Computer Applications
IS 11179-3 Terminology—Data Registries
IS 12200 Terminology—Computer Applications—Machine Readable Terminology Interchange Format
IS 12620 Terminology—Computer Applications—Data Categories
IS 15188 Project Management for Terminology Standardization
IS 2382-4 Information Technology—Vocabulary—Part 4: Organization of Data
ISO TR 9789 Guidelines for the Organization and Representation of Data Elements for Data Interchange—Coding Methods and Principles
3. General Information
3.1 Basic characteristics of a terminology influence its utility and appropriateness in clinical applications.
3.1.1 Concept Orientation (3)—The basic unit of a vocabulary must be a concept, which is the embodiment of some specific meaning and not a code or a character string. Representations of a concept must correspond to one and only one meaning, and in a well-ordered vocabulary only one concept may have that same meaning (ISO/DIS 860). However, multiple terms (linguistic representations) may have the same meaning if they are explicit representations of the same concept. This implies non-redundancy, non-ambiguity, and non-vagueness.
3.1.1.1 Non-redundancy—Terminologies must be internally consistent. There must not be more than one concept in the terminology with the same meaning (IS 704, Guide E 1284). This does not exclude synonymy; rather, it requires that this be explicitly represented.
3.1.1.2 Non-Ambiguity—No concept should have two or more meanings. However, an entry term (some authors have referred to this as an “interface terminology”) can point to more than one concept (for example, MI as a myocardial infarction and mitral insufficiency).
3.1.1.3 Non-Vagueness—Concept names must be context free (some authors have referred to this as “context laden”); that is, a name must carry its full meaning without reference to its parent. For example, “diabetes mellitus” should not have the child concept “well controlled”; instead, the child concept’s name should be “diabetes mellitus, well controlled.”
3.2 Purpose and Scope—Any controlled vocabulary must have its purpose and scope clearly stated in operational terms so that its fitness for particular purposes can be assessed and evaluated (IS 15188). Where appropriate, it may be useful to illustrate the scope by examples or “use cases” as in database models and other specification tools. Criteria such as coverage and comprehensiveness can only be judged relative to the intended use and scope. For example, a vocabulary might be comprehensive and detailed enough for general practice with respect to cardiovascular signs, symptoms, and disorders, but inadequate for a specialist cardiology or cardiothoracic surgery unit. Conversely, a vocabulary sufficiently detailed to cope with cardiology and cardiothoracic surgery might be totally impractical in general practice.
3.3 Coverage (3)—Each segment of the healthcare process must have explicit in-depth coverage and not rely on broad summary categories that lump specific clinical concepts together. For example, it is often important to distinguish specific diagnoses from categories presently labeled Not Elsewhere Classified (NEC), or to differentiate disease severity such as indolent prostate cancer from widely metastatic disease. The extent to which the depth of coverage is incomplete must be explicitly specified for each domain (scope) and purpose as indicated in 3.2.
3.4 Comprehensiveness (9)—All segments of the healthcare process, such as physical findings, risk factors, or functional status, must be addressed for all related disciplines, across the breadth of medicine, surgery, nursing and dentistry. This criterion applies because decision support, risk adjustment, outcomes research, and useful guidelines require more than diagnoses and procedures. Examples include existing AHCPR guidelines and the HCFA mortality model. The extent to which the degree of comprehensiveness is incomplete must be explicitly specified for each domain (scope) and purpose as indicated in 3.2.
3.5 Mapping (10)—Government and payers mandate the form and classification schema for much clinical data exchange. Thus, comprehensive and detailed representations of patient data within computer-based patient records should be
able to be mapped to those classifications, such as ICD-9-CM. Multiple granularities are needed for clinical health care as well (ISO TR 9789). For example, an endocrinologist may specify more detail about a patient’s diabetes mellitus than a generalist working in an urgent care setting, even though both may be caring for the same patient. The degree to which the terminology is isolated from other classifications must be explicitly stated.
3.6 Systematic Definitions (4)—In order for users of the vocabulary to be certain that the meaning that they assign to concepts is identical to the meaning which the authors of the vocabulary have assigned, these definitions will need to be explicit and available to the users. Further, as relationships are built into vocabularies, multiple authors will need these definitions to ensure consistency in authorship.
3.7 Formal Definitions—A compositional system should contain formal definitions for non-atomic concepts and formal rules for inferring subsumption from the definitions (Specification E 1712).
3.8 Explicitness of Relations—The logical definition of subsumption should be defined. The formal behavior of all links/relations/attributes should be explicitly defined. The primary hierarchical relation should be subsumption (“kind of”) as defined by logical implication: “B is a kind of A” means “All Bs are As.” If a looser meaning such as “broader than/narrower than” is used, it should be explicitly stated.
3.9 Reference Terminology—The set of canonical concepts, their structure, relationships, and, if present, their systematic and formal definitions. These features define the core of the controlled health terminology.
3.10 Atomic Reference Terminology—A reference terminology consisting of only atomic concepts and their systematic and formal definitions. In this type of reference terminology, no two or more concepts can be combined to create a composite expression that has the same meaning as any other single concept contained in the atomic reference terminology.
3.11 Colloquial Terminology—The set of terms that consist of commonly used entry points and which map to one or more canonical terms within the vocabulary. These have been called “entry terms” or “interface terminologies” by different authors.
4. Structure of the Terminology Model
4.1 Terminology structures determine the ease with which practical and useful interfaces for term navigation, entry, or retrieval can be supported (IS 704, IS 1087-1, EN 12264). Terminologies that do not currently meet these criteria can be in compliance with this specification by putting mechanisms in place to move toward these goals.
4.2 Compositionality—Atomic concepts must be able to be combined to create composite concepts (11). A concept is a notion represented by language, which identifies one idea. For example, “colon cancer” comprises “neoplasm, malignant” and “large bowel” as atomic components. In a compositional system, concept representations can be divided into atomic and composite concept representations. Composite concept representations can be further divided into “named pre-coordinated concept representations” and “post-coordinated representation expressions.” Within a composite concept, it may be possible to separate the constituents into three categories: the kernel concept, modifier concept, and qualifier (also called “status”) concept. These terms are being specifically defined in a document on meta-terminology currently being written under the auspices of ISO TC 215 Working Group 3.
NOTE 1—The term “concept” in this specification is used to refer to the representation of a concept rather than the thought itself.
4.2.1 Atomic Concept—A representation of a concept that is not composed of other simpler concept representations within a particular terminology. In many cases “atomic concepts” will correspond to what philosophers call “natural kinds.” Such entities cannot be meaningfully decomposed. Concepts should be separable into their constituent components, to the extent that it is practical. These concepts should form the root basis of all concepts. For example, in the UMLS Metathesaurus, colon is a synonym for large bowel, and cancer is a synonym for neoplasm, malignant. Colon cancer is non-atomic, since it can be broken down into “large bowel” and “neoplasm, malignant.” Each of these two more atomic terms has a separate and unique Concept Unique Identifier (CUI).
4.2.2 Composite Concept—A concept composed as an expression made up of atomic concepts linked by semantic representations (such as roles, attributes, or links).
4.2.2.1 Pre-coordinated Concept—An entity that can be broken into parts without loss of meaning (can be meaningfully decomposed) when the atomic concepts are examined in aggregate. These are representations that are considered single concepts within the host vocabulary. Ideally, these concepts should have their equivalent composite concepts explicitly defined within the vocabulary (that is, the vocabulary should be normalized for content). For example, colon cancer is non-atomic; however, it has a single CUI, which means to the Metathesaurus that it represents a single concept. It has the same status in the vocabulary as the site “large bowel” and the diagnosis “neoplasm, malignant.”
4.2.2.2 Post-coordinated Concept—A composite concept that is not pre-coordinated and therefore must be represented as an expression of multiple concepts using the representation language. This is the attempt of a system to construct a set of concepts from within a controlled vocabulary to more completely represent a user’s query. For example, the concept “bacterial effusion, left knee” is not a unique term within the SNOMED-RT terminology. It represents a clinical concept that some patient has an infected left knee joint. As it cannot be represented by a single concept identifier, to fully capture the intended meaning a system would need to build a representation from multiple concept identifiers or lose information to free text.
4.2.3 Types of Atomic and Pre-coordinated Concepts—We can classify unique concept representations within a vocabulary into at least three distinct types: kernel concepts, modifiers, and qualifiers (which contain status concepts). This separation allows user interfaces to provide more readable and therefore more useful presentations of composite concepts.
4.2.3.1 Kernel Concept—An atomic or pre-coordinated concept, which represents one of the one or more main concepts within a pre-coordinated or post-coordinated composition.
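The requirements in 3.1.1 (concept orientation, non-redundancy, non-ambiguity) and the atomic, pre-coordinated, and post-coordinated distinctions in 4.2 can be illustrated with a small data model. The following is only a sketch under stated assumptions: the class names, the identifiers C001 through C005, and the attribute name has_site are hypothetical and are not taken from the UMLS or any other terminology, and name collision is used as a crude stand-in for "same meaning."

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A concept is identified by an opaque ID, not by any of its names."""
    cid: str
    preferred_name: str
    synonyms: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Composite:
    """A kernel concept refined by (attribute, value concept ID) pairs."""
    kernel: str
    refinements: frozenset


class Terminology:
    def __init__(self):
        self.concepts = {}     # cid -> Concept
        self.entry_terms = {}  # lowercased term -> set of cids
        self.definitions = {}  # pre-coordinated cid -> Composite

    def add(self, concept, entry_terms=()):
        # Non-redundancy (illustrative): reject a concept whose names
        # collide with the names of a concept already present.
        names = {concept.preferred_name, *concept.synonyms}
        for other in self.concepts.values():
            if names & {other.preferred_name, *other.synonyms}:
                raise ValueError(f"{concept.cid} duplicates {other.cid}")
        self.concepts[concept.cid] = concept
        # Entry terms are an interface layer and may be ambiguous.
        for term in (*names, *entry_terms):
            self.entry_terms.setdefault(term.lower(), set()).add(concept.cid)

    def lookup(self, term):
        """Return every concept ID an entry term may point to."""
        return self.entry_terms.get(term.lower(), set())

    def precoordinated_for(self, composite):
        """Return the pre-coordinated concept defined by this expression, if any."""
        for cid, definition in self.definitions.items():
            if definition == composite:
                return cid
        return None


t = Terminology()
t.add(Concept("C001", "neoplasm, malignant", frozenset({"cancer"})))
t.add(Concept("C002", "large bowel", frozenset({"colon"})))
t.add(Concept("C003", "colon cancer"))
t.add(Concept("C004", "myocardial infarction"), entry_terms=["MI"])
t.add(Concept("C005", "mitral insufficiency"), entry_terms=["MI"])

# "colon cancer" is pre-coordinated; its composite definition is recorded
# so that an equivalent post-coordinated expression can be recognized.
t.definitions["C003"] = Composite("C001", frozenset({("has_site", "C002")}))

ambiguous = t.lookup("MI")
same_as = t.precoordinated_for(Composite("C001", frozenset({("has_site", "C002")})))
```

In this sketch the entry term "MI" resolves to two concepts while each concept keeps exactly one identifier, and a post-coordinated expression built from "neoplasm, malignant" and "large bowel" is recognized as the pre-coordinated concept "colon cancer."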
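The reading of subsumption given in 3.8 (“B is a kind of A” means “All Bs are As”) implies that the relation is transitive, so a system can answer “kind of” questions by walking parent links. The sketch below assumes a simple mapping from each concept to its direct parents; the function names and the parent concept “endocrine disorder” are hypothetical and serve only as illustration.

```python
def ancestors(parents, concept):
    """Return every concept that subsumes `concept` by following parent links."""
    seen = set()
    stack = list(parents.get(concept, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def is_kind_of(parents, b, a):
    """True when every B is an A: B is A itself or A is an ancestor of B."""
    return a == b or a in ancestors(parents, b)


# Direct "kind of" links only; indirect ones are derived, not stored.
parents = {
    "diabetes mellitus, well controlled": ["diabetes mellitus"],
    "diabetes mellitus": ["endocrine disorder"],
    "colon cancer": ["neoplasm, malignant"],
}

derived = is_kind_of(parents, "diabetes mellitus, well controlled", "endocrine disorder")
unrelated = is_kind_of(parents, "colon cancer", "endocrine disorder")
```

Storing only direct links and deriving the rest keeps the formal behavior of the relation explicit. A looser “broader than/narrower than” link would not justify this kind of inference.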
4.2.3.2 Refining Kernel Concept—Constituents of a composite concept that refine the meaning of a kernel concept. For example, “stage 1a” in “having colon cancer stage 1a,” or “brittle, poorly controlled,” in “Brittle, poorly controlled diabetes mellitus.” In general, these concepts are expressed as a link plus a value (“attribute-value pair”). Terminologies must support a logical structure that can support temporal duration and trend. Attributes must themselves be elements of a terminology and fit into a practical model that extends a terminology. For example, cancers may be further defined by their stage and histology, by whether they have been symptomatic for a specifiable time, and by whether they may progress over a given interval. Attributes are required to capture important data features for structured data entry and are pertinent to secondary data uses such as aggregation and retrieval. Kernel concepts can be refined in many ways, including a clinical sense, a temporal sense, and by status terms (for example, “recurrent”).
4.3 Normalization of Content—Normalization is the process of supporting and mapping alternative words and shorthand terms for composite concepts. All pre-coordinated concepts must be mapped to or logically recognizable by all possible equivalent post-coordinated concepts. There should be mechanisms for identifying this synonymy for user-created (“new”) post-coordinated concepts as well (that is, when there is no pre-coordinated concept for this notion in the vocabulary). This functionality is critical to define explicitly equivalent meaning and to accommodate personal, regional, and discipline-specific preferences. Additionally, the incorporation of non-English terms as synonyms can achieve a simple form of multilingual support.
4.4 Normalization of Semantics—In compositional systems, there exists the possibility of representing the same concept with multiple potential sets of atoms that may be linked by different semantic links. In this case the vocabulary needs to be able to recognize this redundancy/synonymy (depending on your perspective). The extent to which normalization can be performed formally by the system should be clearly indicated. For example, the concept represented by the term “laparoscopic cholecystectomy” might be represented in the following two dissections:
4.4.1 “Surgical Procedure: Excision” {Has Site Gallbladder}, {Has Method Endoscopic} and
4.4.2 “Surgical Procedure: Excision” {Has Site Gallbladder}, {Using Device Endoscope}.
4.5 Multiple Hierarchies (12)—Concepts should be accessible through all reasonable hierarchical paths (that is, they must allow multiple semantic parents). For example, stomach cancer can be viewed as a neoplasm or as a gastrointestinal (GI) disease. A balance between number of parents (as siblings) and number of children in a hierarchy should be maintained. This feature assumes obvious advantages for natural navigation of terms (for retrieval and analysis), so a concept of interest can be found by following intuitive paths (users should not have to guess where a particular concept was instantiated).
4.5.1 Consistency of View (13)—A concept in multiple hierarchies must be the same concept in each case. The example of stomach cancer in 4.5 must not have changes in nuance or structure when arrived at via the cancer hierarchy as opposed to GI diseases. Inconsistent views could have catastrophic consequences for retrieval and decision support by inadvertently introducing variations in meaning that may be unrecognized and therefore be misleading to users of the system.
4.6 Explicit Uncertainty—Notions of “probable,” “suspected,” “history of,” or differential possibilities (that is, a differential diagnosis list) must be supported. Certain versus very uncertain information has an obvious impact on decision support and other secondary data uses. Similarly, in the case of incomplete syndromes, clinicians should be able to record the partial criteria consistent with the patient’s presentation. This criterion is listed separately as many current terminological systems fail to address this adequately.
4.7 Representation—Computer coding of concept identifiers must not place arbitrary restrictions on the terminology, such as numbers of digits, attributes, or composite elements. To do so subordinates the meaning and content of a terminology to the limitations of format, which in turn often results in the assignment of a concept to the wrong location because it might no longer “fit” where it belongs in a hierarchy. These reorganizations confuse people and machines alike, as intelligent navigation agents are led astray for arbitrary reasons. The long, sequential, alphanumeric tags used as concept identifiers in the UMLS project of the National Library of Medicine exemplify this principle.
5. Maintenance
5.1 Technical choices can impact the capacity of a terminology to evolve, change, and remain usable over time.
5.2 Context Free Identifiers (14)—Unique codes attached to concepts must not be tied to hierarchical position or other contexts; their format must not carry meaning. Because health knowledge is being updated constantly, how we categorize health concepts is likely to change (for example, peptic ulcer disease is now understood as an infectious disease, but this was not always so). For this reason, the code assigned to a concept must not be inextricably bound to a hierarchy position in the terminology, so that we need not change the code as we update our understanding of, in this case, the disease. Changing the code may make historical patient data confusing or erroneous. This notion is the same as non-semantic identifiers.
5.3 Persistence of Identifiers—Codes must not be reused when a term becomes obsolete or superseded. Consistency of patient description over time is not possible when concepts change codes; the problem is worse when codes can change meaning. This practice not only disrupts historical analyses of aggregate data, but it can be dangerous to the management of individual patients whose data might be subsequently misinterpreted. This encompasses the notion of concept permanence.
5.4 Version Control (15)—Updates and modifications must be referable to consistent version identifiers. Usage in patient records should carry this version information. Because the interpretation of coded patient data is a function of terminologies that exist at a point in time (16) (for example, AIDS patients were coded inconsistently before the introduction of the term AIDS), terminology representations should specify the state of the terminology system at the time a term is used.
Version information most easily accomplishes this, and it may be hidden from ordinary review (IS 15188, IS 12620, IS 1087-2, IS 11179-3, IS 2382-4).
5.4.1 Editorial Information—New and revised terms, concepts, and synonyms must have their date of entry or effect in the system, along with pointers to their source or authority, or both. Previous ways of representing a new entry should be recorded for historical retrieval purposes.
5.4.2 Obsolete Marking—Superseded entries should be so marked, together with their preferred successor. Because data may still exist in historical patient records using obsolete terms, future interpretation and aggregation are dependent upon a term being carried and cross-referenced to subsequent terms (for example, HTLV III to HIV).
5.5 Recognize Redundancy—Authors of these large-scale vocabularies will need mechanisms to identify redundancy when it occurs. This is essential for the safe evolution of any such vocabulary. This implies normalization of concepts and semantics, but specifically addresses the need for vocabulary systems to provide the tools and resources necessary to accomplish this task.
5.6 Language Independence—It would be desirable for terminologies to support non-English presentations. As health care confronts the global economy and multi-ethnic practice environments, routine terminology maintenance must incorporate multilingual support. While substantially lacking the power and utility of machine translation linguistics, this simplistic addition will enhance understanding and use in non-English speaking areas. Questions that need to be addressed: Have there been translations? What is the expected cost of translation?
5.7 Responsiveness—The interval between updates, or sub-versions, should be sufficiently short to accommodate new codes and repairs quickly. Ideally, updates should occur weekly.
6. Evaluation
6.1 As we seek to understand quality in the controlled vocabularies that we create or use, we need standard criteria for the evaluation of these systems. All evaluations should reflect and specifically identify the purpose and scope of the vocabulary being evaluated (17).
6.2 Purpose and Scope—Important dimensions along which scope should be defined include:
6.2.1 Clinical Area of Use, Disease Area of Patients, and Expected Profession of Users—What parts of health care is it intended to be used in and by whom?
6.2.2 Primary Use—Includes: reporting for remuneration, management planning, epidemiological research, indexing for bibliographic, Web-based retrieval, recording of clinical details for direct patient care, use for decision support, linking of record to decision support, etc.
6.2.3 Persistence and Extent of Use—Some vocabularies are intended, at least initially, primarily for a specific study or a specific site. If a vocabulary is intended to be persistent, there should be a means of updating or some kind of change management.
6.2.4 Degree of Automatic Inferencing Intended—Is it intended that classification be automatic? Is it intended that validation on input be possible and, if so, within what limits? If post-coordinated expressions are to be accepted, what can be inferred about them and what restrictions must be placed on them?
6.2.5 Transformations (Mappings) to Other Vocabularies—What transformations/mappings are supported for what intended purpose? For example, transformation for purposes of bibliographic retrieval may require less precision than transformation for clinical usage. What are the sensitivity and specificity of the mappings?
6.2.6 User/Developer Extensibility—Is it intended that the vocabulary be extended by users or applications developers? If so, within what limits? If not, what mechanisms are available for meeting new needs as they arise?
6.2.7 Natural Language Input or Output—Are they supported for analysis or input? To what level of competence are they supported, for example, stilted telegraphic presentation, idiomatic presentation, etc.?
6.2.8 Other Functions—What other functions are intended? Examples include linkage to specific decision support systems, linkage to post-marketing surveillance, etc.
6.2.9 Current Status—To what extent is the system intended to be finished or a work in progress? If different components of the terminology are at different stages of completion, how is this indicated?
6.3 Measures of Quality (Terminological Tools):
6.3.1 Interconnectivity (Mapping):
6.3.1.1 To what extent is the vocabulary mappable to other coding systems or reference terminologies?
6.3.1.2 To what extent can the vocabulary accommodate local terminological enhancements?
6.3.1.3 Can the vocabulary server respond to queries sent over a network (LAN, WAN)?
6.3.2 Precision and Recall:
6.3.2.1 What are the vocabulary’s precision and recall for mapping diagnoses, procedures, manifestations, anatomy, organisms, etc., against an established and nationally recognized standard query test set? This should be evaluated only within the intended scope and purpose of the vocabulary system.
6.3.2.2 Is a standard search engine used in the mapping process?
6.3.3 Usability:
6.3.3.1 Has the usability of the vocabulary been verified?
6.3.3.2 How have interface considerations been separated from vocabulary evaluation?
6.3.3.3 Is there support for user interfaces? Has an effective user interface been built? Is there a proof of concept? Has the vocabulary been shown to have an effective user interface for its intended use? If not, what questions or issues are outstanding? What is the evidence for speed of entry, accuracy, comprehensiveness, and the like in practice with different approaches?
6.3.3.4 Is there support for computer interfaces and system implementers? Is there a demonstrated proof of concept implementation in software? Can it be shown to be usable for the primary purpose indicated? Have there been cases where interfaces failed?
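The maintenance requirements in 5.2, 5.3, and 5.4.2 (context-free identifiers, no reuse of codes, and obsolete entries marked with a preferred successor) can be sketched as a small registry. This is a minimal illustration under stated assumptions, not a prescribed implementation: the class and method names, the identifier format, and the version labels "1985.1" and "1986.1" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    cid: str                       # opaque tag; carries no hierarchy information
    name: str
    added_in: str                  # terminology version in which the entry appeared
    retired_in: Optional[str] = None
    successor: Optional[str] = None


class Registry:
    def __init__(self):
        self._entries = {}
        self._next = 1

    def create(self, name, version):
        # Sequential, meaningless identifier with no fixed digit count.
        cid = f"C{self._next}"
        self._next += 1
        self._entries[cid] = Entry(cid, name, version)
        return cid

    def retire(self, cid, version, successor):
        # The entry is marked obsolete but kept, so its code is never reused.
        entry = self._entries[cid]
        entry.retired_in = version
        entry.successor = successor

    def current(self, cid):
        """Follow successor links from a possibly obsolete code to a live one."""
        seen = set()
        while self._entries[cid].successor is not None:
            if cid in seen:
                raise ValueError("successor links form a cycle")
            seen.add(cid)
            cid = self._entries[cid].successor
        return cid

    def entry(self, cid):
        return self._entries[cid]


reg = Registry()
htlv = reg.create("HTLV III", "1985.1")   # hypothetical version labels
hiv = reg.create("HIV", "1986.1")
reg.retire(htlv, "1986.1", successor=hiv)

live = reg.current(htlv)
```

A historical record coded with the retired identifier can still be interpreted, because the old entry remains in the registry and points to its successor. Storing the version alongside each coded record, as 5.4 asks, would be a separate step not shown here.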
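The precision and recall asked for in 6.3.2.1 are the usual retrieval measures: precision is the fraction of returned concepts that are correct, and recall is the fraction of correct concepts that were returned. The sketch below assumes that each test query has a gold-standard set of concept identifiers; the function name and the identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Compare the concepts a vocabulary returned with a gold-standard set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# One query from a hypothetical test set: four concepts returned,
# three expected, two in common.
precision, recall = precision_recall(
    retrieved={"C1", "C2", "C3", "C4"},
    relevant={"C1", "C2", "C5"},
)
```

Here precision is 2/4 and recall is 2/3. In line with 6.3.2.1, the test set used for such a calculation should fall within the intended scope and purpose of the vocabulary.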
6.3.4 Feasibility—If it is intended for use in an EPR (Electronic Patient Record), what are the options for information storage? Has feasibility been demonstrated?
6.4 Measures of Quality (Study Design)—Generalizability (applicability) of any study design reported (evaluating reported evaluations).
6.4.1 What is the vocabulary’s healthcare/clinical relevance?
6.4.2 What was the gold standard used in the evaluation?
6.4.3 If published population rates are used for comparison, was the study population comparable to the population from which the rates were derived?
6.4.4 Was the study appropriately blinded?
6.4.5 Was the test set selection randomized or shown in some sense to be a representative sample of the end user population?
6.4.6 Test Location:
6.4.6.1 Was it different from the developer’s location?
6.4.6.2 How was the test site suited to the study design? (This includes tools, resources, etc.)
6.4.6.3 With which of the following was the principal investigator associated?
University
Academic Medical Center
Corporation
Hospital
Government Agency
HMO
Private Practice
Academic Organization
6.4.6.4 Was the principal investigator independent of the vocabulary being evaluated? Was the principal investigator an appropriate individual to direct this research (in terms of credentials, backing from academic or professional bodies, and expertise)? Did the investigator have any conflicts of interest in performing this research?
6.4.7 Sample Size:
6.4.7.1 Was the sample size sufficient to show the anticipated effect, should one exist?
6.4.7.2 Who reviewed the statistical methods?
6.4.7.3 Were the specific aims clear?
6.4.8 Personnel:
6.4.8.1 Were the study personnel appropriate?
6.4.8.2 Were there sufficient resources to complete the project in a reasonable period of time?
6.4.9 Reviewers:
6.4.9.1 Number of reviewers if hand review was necessary.
6.4.9.2 Type of reviewer (physician, nurse, other clinician, coder, knowledge engineer).
6.4.9.3 Were the reviewers blinded to the other reviewers’ judgments? (Was there independence?)
APPENDIX
(Nonmandatory Information)
X1.1 The present coding practices rely on data methods and principles for terminology maintenance that have changed little since the adoption of the statistical bills of mortality in the mid-17th century (18). The most widely accepted standard for representing patient conditions, ICD-9-CM (19), is an intellectual descendant of this tradition. ICD-9-CM relies overwhelmingly on a tabular data structure with limited concept hierarchies and no explicit mechanism for synonymy, value restrictions, inheritance or semantic and non-semantic linkages. The maintenance environment for this healthcare classification is a word processor, and its distribution is nearly exclusively paper-based.
X1.2 Significant cognitive advances in disease and procedure representation took place in 1928 at the New York Academy of Medicine, resulting in industry-wide support for what became the Standard Nomenclature of Diseases and Operations. The profound technical innovation was the adoption of a multiaxial classification scheme (9,12). Now a pathologic process (Inflammation) could be combined with an anatomic site (Oropharynx Component: Tonsil) to form a diagnosis (Tonsillitis). The expressive power afforded by the compositional nature of a multiaxial terminological coding system tremendously increased the scope of tractable terminology, and, additionally, the level of granularity at which diagnoses could be encoded about patients (12,20).
X1.3 The College of American Pathologists (CAP) carried the torch further by creating the Systematized Nomenclature of Pathology (SNOP), and subsequently the Systematized Nomenclature of Medicine (SNOMED). In these systems, the number, scope, and size of the compositional structures has increased to the point where an astronomical number of terms can be synthesized from SNOMED atoms. One well-recognized limitation of this expressive power is the lack of syntactic grammar, compositional rules, and normalization of both the concepts and the semantics. Normalization is the process by which the system knows that two compositional constructs with the same meaning are indeed the same (for example, that the term “colon cancer” is equivalent to the composition of “malignant neoplasm” and the site “large bowel”). These are issues addressed by CAP in their efforts to make SNOMED a robust reference terminology for health care (12,20).
X1.4 Other initiatives of importance are the Clinical Terms v3 (Read Codes), which are maintained and disseminated by the National Health Service in the United Kingdom, and the GALEN effort, which expresses a very detailed formalism for term description. The Read Codes are composed of a large corpus of terms, now in its third revision, that is hierarchically
6
E 2087 – 00
designed and is slated for use throughout Great Britain. A and Clinical Terms Version 3 into a derivative work
development of interesting note is the newly signed agreement (SNOMED—Clinical Terms {SNOMED-CT}).
of CAP and the NHS to merge the content of SNOMED-RT
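The normalization problem described in this appendix (recognizing that the term “colon cancer” and the composition of “malignant neoplasm” with the site “large bowel” mean the same thing) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two axis names, the synonym table, the table of pre-coordinated terms, and the function names are hypothetical and are not drawn from SNOMED, the Read Codes, or any other terminology named here. A real system would need far richer structures; the sketch only shows the idea of reducing two different constructs to one canonical form before comparing them.

```python
from typing import NamedTuple, Union


class Composition(NamedTuple):
    """A composed expression with one value per axis (axes are illustrative)."""
    morphology: str   # pathologic process, e.g. "inflammation"
    topography: str   # anatomic site, e.g. "tonsil"


# Hypothetical synonym table: maps each surface form to one preferred atom.
PREFERRED = {
    "malignant neoplasm": "malignant neoplasm",
    "large bowel": "colon",
    "colon": "colon",
    "inflammation": "inflammation",
    "tonsil": "tonsil",
}

# Hypothetical pre-coordinated terms with their compositional definitions,
# already expressed in preferred atoms.
PRECOORDINATED = {
    "colon cancer": Composition("malignant neoplasm", "colon"),
    "tonsillitis": Composition("inflammation", "tonsil"),
}


def _atom(term: str) -> str:
    """Return the preferred atom for a surface form (KeyError if unknown)."""
    return PREFERRED[term.strip().lower()]


def normalize(expr: Union[str, Composition]) -> Composition:
    """Reduce a single term or a composed expression to canonical form."""
    if isinstance(expr, str):
        # A single term is expanded to its stored compositional definition.
        return PRECOORDINATED[expr.strip().lower()]
    # A composition has each axis value replaced by its preferred atom.
    return Composition(_atom(expr.morphology), _atom(expr.topography))


def equivalent(a: Union[str, Composition], b: Union[str, Composition]) -> bool:
    """Two constructs are equivalent when their canonical forms are equal."""
    return normalize(a) == normalize(b)
```

Under these assumed tables, equivalent("colon cancer", Composition("malignant neoplasm", "large bowel")) holds because both sides reduce to the same canonical pair, while a term or atom missing from the tables raises a KeyError rather than being silently treated as distinct.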
REFERENCES
(1) Masys, D. R., “Of Codes and Keywords: Standards for Biomedical Nomenclature,” Academy of
Medicine, 65, 1990, pp. 627-629.
(2) Solbrig, H., Final submission to the CorbaMED Request for Proposals on Lexical Query
Services (CorbaLex), OMG, http://www.omg.org/cgi-bin/doc?formal/99-3-6.pdf or
http://www.omg.org/cgi-bin/doc?formal/99-3-1.pdf, 1998.
(3) Cimino, J. J., “Desiderata for Controlled Medical Vocabularies in the Twenty-First
Century,” Methods of Information in Medicine, 1998.
(4) Chute, C. G., Cohn, S. P., Campbell, J. R., “A Framework for Comprehensive Health
Terminology Systems in the United States: Development Guidelines, Criteria for Selection and
Public Policy,” Journal of the American Medical Informatics Association, 1998.
(5) Elkin, P. L., Tuttle, M., Keck, K., Campbell, K., Atkin, G., Chute, C. G., “The Role of
Compositionality in Standardized Problem List Generation,” Medinfo, 1998.
(6) Rector, A. “Thesauri and formal classifications: Terminologies for people and machines,”
Methods of Information in Medicine, 37 (4-5), 1998, pp. 501-509.
(7) Rector, A. L., P. E. Zanstra, et al., “Reconciling Users’ Needs and Formal Requirements:
Issues in developing a Re-Usable Ontology for Medicine,” IEEE Transactions on Information
Technology in BioMedicine, 2(4), 1999, pp. 229-242.
(8) “Unified Medical Language System (UMLS) Knowledge Sources,” National Library of Medicine,
7th Experimental Edition, January 1998.
(9) Cote, R. A., Rothwell, D. J., “The Classification-Nomenclature Issues in Medicine: A
Return to Natural Language,” Medical Informatics 14(1), 1989, pp. 25-41.
(10) Rocha, R. A., Rocha, B. H., Huff, S. M., “Automated Translation Between Medical
Vocabularies Using a Frame-Based Interlingua,” Proceedings of the Annual Symposium on Computer
Applications in Medical Care, 690-4 1993.
(11) Bernauer, J., Franz, M., Schoop, D., Schoop, M., Pretschner, D. P., “The Compositional
Approach for Representing Medical Concept Systems,” Medinfo, 95;8, Pt (1), pp. 70-74.
(12) Campbell, K. E., Musen, M. A., “Representation of Clinical Data Using SNOMED III and
Conceptual Graphs,” Proceedings of the Annual Symposium on Computer Applications in Medical
Care, 1992, pp. 354-358.
(13) Rossi-Mori, A., Galeazzi, E., Gangemi, A., Pisanelli, D. M., Thornton, A. M., “Semantic
Standards for the Representation of Medical Records,” Medical Decision Making 4(Suppl), 1991,
pp. 576-580.
(14) Tuttle, M. S., Olson, N. E., Campbell, K. E., Sherertz, D. D., Nelson, S. I., Cole, W. G.,
“Formal Properties of the Metathesaurus,” Proceedings of the Annual Symposium on Computer
Applications in Medical Care, 1994, pp. 145-149.
(15) Campbell, K. E., Cohn, S. P., Chute, C. G., Rennels, G., Shortliffe, E. H., “Galapagos:
Computer-based Support for Evolution of a Convergent Medical Terminology,” Journal of the
American Medical Informatics Association, 1996, SympSuppl. pp. 269-273.
(16) Cimino, J. J., “Formal Descriptions and Adaptive Mechanisms for Changes in Controlled
Medical Vocabularies,” Methods of Information in Medicine, 35(3), 1996, pp. 211-217.
(17) Elkin, P. L., Chute, G. G., “ANSI-HISB Code Set Evaluation Criterion Survey,” Minutes
ANSI-HISB meeting, April 1998.
(18) Farr, William, “Regarding the Cullenian system of 1785,” First Annual Report of the
Registrar-General of Births, Deaths, and Marriages in England, London, 1839, p. 99.
(19) Evans, D. A., Cimino, J. J., Hersh, W. R., Huff, S. M., Bell, D. S., for the Canon Group,
“Toward a Medical-Concept Representation Language,” Journal of the American Medical
Informatics Association, 1, 1994, pp. 207-217.
(20) Musen, M. A., Wiechert, K. E., Miller, E. T., Campbell, K. E., Fagan, L. M., “Development
of a Controlled Medical Terminology: Knowledge Acquisition and Knowledge Representation,”
Methods of Information in Medicine, 34(1-2), March 1995, pp. 85-95.
ASTM International takes no position respecting the validity of any patent rights asserted in connection with any item mentioned
in this standard. Users of this standard are expressly advised that determination of the validity of any such patent rights, and the risk
of infringement of such rights, are entirely their own responsibility.
This standard is subject to revision at any time by the responsible technical committee and must be reviewed every five years and
if not revised, either reapproved or withdrawn. Your comments are invited either for revision of this standard or for additional standards
and should be addressed to ASTM International Headquarters. Your comments will receive careful consideration at a meeting of the
responsible technical committee, which you may attend. If you feel that your comments have not received a fair hearing you should
make your views known to the ASTM Committee on Standards, at the address shown below.
This standard is copyrighted by ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959,
United States. Individual reprints (single or multiple copies) of this standard may be obtained by contacting ASTM at the above
address or at 610-832-9585 (phone), 610-832-9555 (fax), or [email protected] (e-mail); or through the ASTM website
(www.astm.org).