
http://www.diva-portal.org
This is the published version of a paper published in Accreditation and Quality Assurance.
Citation for the original published paper (version of record):
Magnusson, B., Theodorsson, E. (2017)

Full method validation in Clinical chemistry.
Accreditation and Quality Assurance
https://doi.org/10.1007/s00769-017-1275-7
Access to the published version may require subscription.
N.B. When citing this work, cite the original published paper.
Permanent link to this version:

http://urn.kb.se/resolve?urn=urn:nbn:se:ri:diva-31248
Accred Qual Assur
DOI 10.1007/s00769-017-1275-7
GENERAL PAPER
Full method validation in clinical chemistry

Elvar Theodorsson1 • Bertil Magnusson2
Received: 18 January 2017 / Accepted: 6 June 2017

The Author(s) 2017. This article is an open access publication
Abstract Clinical chemistry is subject to the same principles and standards used in all branches of metrology in chemistry for validation of measurement methods. The use of measuring systems in clinical chemistry is, however, of exceptionally high volume, diverse and involves many laboratories and systems. Samples for measuring the same measurand from a certain patient are likely to encounter several measuring systems over time in the process of diagnosis and treatment of his/her diseases. Several challenges regarding method validation across several laboratories are therefore evident, but rarely addressed in current standards and accreditation practices. The purpose of this paper is to address some of these challenges, making a case that appropriate conventional method validation performed by the manufacturers fulfils only a part of the investigation needed to show that the measuring systems are fit for purpose in different healthcare circumstances. Method validation across several laboratories using verified commercially available measuring systems can only be performed by the laboratories (the users) themselves in their own circumstances, and needs to be emphasised more by the laboratories themselves and accreditation authorities alike.

Keywords Validation · Bias · Verification · Commutability · Diagnostic uncertainty

Introduction

The core of method validation in general, including that of "closed" measuring systems intended for healthcare, is the investigation of whether their properties are adequate for the intended use [1–5]. A single laboratory validation/verification is sufficient if the same measuring system is always used when analysing all samples from a population of patients. However, this is seldom the case in clinical chemistry. Patients are commonly diagnosed, and their treatment and monitoring initiated, at large university hospitals, to be continued at a smaller hospital and by one or two primary healthcare physicians (Fig. 1).

Even if the measuring systems, for example, for measuring the concentrations of glycated haemoglobin in whole blood from diabetics are validated and found fit for the intended use when investigating one or a handful of measuring systems in ideal situations under the control of manufacturers, they may not necessarily be fit for the intended use when the patient utilises the services of several laboratories using different measuring systems, in different real-life situations and even perhaps performs point-of-care measurements himself/herself. The manufacturers cannot be expected to shoulder the responsibility for their measuring systems in any constellation of laboratories and users. That responsibility rests with the users: the healthcare organisations.

Presented at the Eurachem Workshop on Method Validation, May 2016, Gent, Belgium.

Elvar Theodorsson (1), Bertil Magnusson (2)
1 Department of Clinical Chemistry and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
2 RISE, Research Institutes of Sweden, Borås, Sweden
Fig. 1 Illustration of the common situation where a patient (centre of illustration) is being treated by two primary healthcare physicians (bottom of illustration) and by specialists at two different hospitals where both primary healthcare centres and the hospitals measure the blood concentration of, e.g. glycated haemoglobin by different measuring systems. Each patient in the population may, furthermore, be cared for by different combinations of hospitals and primary healthcare physicians
This paper intends to provide a brief overview of validation practices in clinical chemistry and laboratory medicine and makes a case for extensions to these validation practices that should be, and need to be, performed by laboratory and other healthcare personnel during their use of the measuring systems (including pre- and postanalytical factors) in patient care. Practices in this vein have the potential to substantially contribute to minimising diagnostic uncertainty in the interest of the patients and healthcare providers alike.

Causes of variation/uncertainty in clinical chemistry

Before discussing this topic, it is worthwhile recalling the following concepts:

The measurement procedure is commonly called measurement method (as in the term method validation and in ISO/IEC 17025) or examination procedure (ISO 15189).

Diagnostic uncertainty is the uncertainty that physicians and other healthcare personnel ideally need to take into account when faced with challenges in diagnosis or when monitoring treatment effects. It is the combined uncertainty of all diagnostic measures taken, including anamnesis, physical examination, imaging and laboratory, and furthermore the uncertainty in the full diagnostic validation of the diagnostic measures, including diagnostic sensitivity, diagnostic specificity and diagnostic decision limits.

Analytical uncertainty is the combined uncertainty for a certain measurement result of a certain measurand for all measuring systems in a conglomerate of laboratories catering for a population of patients.

The total testing chain in clinical chemistry involves several possible sources of uncertainty, from the clinical decision to order a test, through the biological variation inherent in all mammals and the preanalytical, analytical and postanalytical phases, to the use of the test results over days, weeks and months for monitoring the effects of treatment (Figs. 2, 3).
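As a minimal sketch of what "combined uncertainty" can mean numerically, the Python snippet below combines relative standard uncertainties for the biological, preanalytical, analytical and postanalytical components in quadrature (root sum of squares). This assumes that the components are independent, which is a common simplification and not something stated in the text above; the function name and the example values are illustrative assumptions, not data from this paper.

```python
import math

def combined_uncertainty(components):
    """Combine independent standard uncertainties in quadrature.

    `components` maps a component name to its relative standard
    uncertainty in percent (hypothetical values, for illustration only).
    """
    return math.sqrt(sum(u ** 2 for u in components.values()))

# Hypothetical relative standard uncertainties (%), not taken from the paper.
example = {
    "biological": 6.0,
    "preanalytical": 2.0,
    "analytical": 3.0,
    "postanalytical": 1.0,
}

u_combined = combined_uncertainty(example)
print(f"combined relative standard uncertainty: {u_combined:.2f} %")
```

Because the terms add as squares, the largest component dominates: in this invented example, halving the analytical term changes the total far less than halving the biological term would.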
Fig. 2 Sources of uncertainty in the total testing chain in clinical chemistry
[Figure labels. Phases: clinical phase, preanalytical phase, analytical phase, postanalytical phase. Steps around the cycle: biological variation, test ordered, patient identification, patient preparation, taking sample, transporting sample, sample identification, calibration, quality control, measuring sample, interpretation in the laboratory, results conveyed to clinician, result interpreted in full clinical context, clinical response to result]
The clinical phase involves the knowledge and skills of the healthcare personnel in the use of biomarkers for diagnosing and monitoring treatment effects, including the understanding, e.g. of the effects of biological variation, drugs and interferences on the results. The preanalytical phase involves preparing the patient for sampling, e.g. making sure that samples to be compared are taken in a standardised manner. Biological variation is sometimes included in the preanalytical phase. The analytical phase, including the uncertainty in this phase (analytical uncertainty), involves all measuring systems and laboratories that a patient potentially encounters. The postanalytical phase deals with the interpretation of the measurement results in the context of the patient(s). Successful handling of the postanalytical phase is highly dependent on the knowledge and skills of the laboratory and other healthcare personnel. The clinical phase involves understanding of the pathophysiology of diseases and the strengths and weaknesses of individual biomarkers in diagnosis and in monitoring of treatment effects. Healthcare personnel acquire knowledge in this area during their basic training, but recurrent opportunities for continuous educational activities which include aspects of laboratory medicine are needed to optimise the clinical phase in the total testing chain. Engagement of laboratory personnel is crucial to make this happen in any healthcare organisation.

Understanding of the uncertainty caused by biological variation [6–10] (which is frequently in the order of twice the measurement uncertainty of a single measuring system) and its influence on diagnosis and monitoring is crucial. Biological variation is a homeostatic biological mechanism whereby the body keeps the concentration of the measurand varying around an individual set point which commonly differs amongst individuals. Knowledge of biological variation and skills in handling this uncertainty component must be an integral part of medical decision-making since biological variation can be regulated neither in humans nor in other living organisms.

Preanalytical variation is the variation caused by differences in patient preparation, in the techniques and equipment used when taking the sample and when transporting the sample to the laboratory. For example, the effect of gravitation on body fluids and molecules dissolved in them decreases the concentration of cells and large molecules by 8 % to 10 % about 30 min after a patient changes body position from vertical (standing up) to horizontal (lying down).

Besides staff at the medical wards, laboratory personnel are also responsible for assessing preanalytical issues such as haemolysis in the sample and errors in sample transport. It is crucial to register and regularly monitor such events for possible lack of conformance using computerised systems in order to monitor their incidence and prevalence, preferably as internationally agreed quality indicators [11], aiming to reduce preanalytical errors as much as possible. The Working Group on Laboratory Errors and Patient Safety of the International Federation of Clinical Chemistry and Laboratory Medicine has agreed on such quality indicators [12–15] which include misidentification errors, transcription errors, incorrect sample type, incorrect fill level, transportation and storage problems, contamination, haemolysed and clotted samples, data transcription errors and inappropriate turnaround times. Most importantly this register is crucial in deciding where and when to efficiently use the resources of the laboratory organisation for self-improvement and as an aid to their clinical colleagues in improving their knowledge and skills in preanalytics by educational activities, preferably delivered in person to individuals and groups. Since the influence of both biological and preanalytical variation on the patient's diagnosis is highly dependent on the knowledge and skills of all involved in the clinic and in the laboratory alike [11], these factors should be included in the evaluation of the overall uncertainty estimates.

Fig. 3 Components of diagnostic uncertainty when using chemical measurements in diagnostic medicine. Diagnostic uncertainty (D) is the combination of all the other uncertainty components (including A–C)
[Figure labels: patient sample with Ubiological and Upreanalytical (a); measurement procedure with Uanalytical (b); Upostanalytical (c); Udiagnostic (d)]

The analytical phase is usually conceived as fully in the hands of commercial producers of measuring systems and reagents, even though individual laboratories are crucial in monitoring the entire conglomerate of measuring systems.

The postanalytical uncertainty is caused by suboptimal technical facilities or routines in conveying the results to the healthcare personnel and/or lack of knowledge and skills in interpreting the results by the laboratory personnel and end-users [12, 16–18].

Standardisation and harmonisation in clinical chemistry

If measuring systems give different (biased) results for the same patient sample, this risks confusion amongst patients and their doctors. Furthermore, monitoring and treatment practices risk being implemented erroneously due to the bias, since clinical practice guidelines [19–21] that inform about proper actions for diagnosis and treatment are optimally based on unbiased test results (Fig. 4). Absence of bias can only be assumed in very rare cases. In many cases, guidelines are based on measurement results obtained with a single, non-standardised device. Even worse, for guidelines based on studies performed in the past it is often not known in what manner the measurement scale used in the study relates to measurement scales, calibrators and selectivity of current devices. This can be a problem even if the same measurement principle and method is used, due to uncontrolled method drift. It is also common that "old" cut-off points are used for measurement results obtained with "new" methods. Therefore, the uncertainty of reference intervals and clinical decision limits is essential when taking the postanalytical uncertainty into account.

A general comment concerns the definition of standardisation. In the field of clinical chemistry, some authors have developed the tendency to use definitions for standardisation and harmonisation that deviate from those generally used in measurement science or metrology. In fact, standardisation is defined in ISO/IEC Guide 2:2004 (Standardisation and Related Activities – General Vocabulary) as "activity of establishing, with regard to actual or potential problems, provisions for common and repeated use, aimed at the achievement of the optimum degree of order in a given context". Standardisation can be achieved in different ways, for example, by developing standards with consensus scales (e.g. the SI units or International Units of WHO standards).

Clinical practice guidelines [19–21] that inform about proper actions for diagnosis and treatment are based on unbiased test results. Standardisation aims at achieving equivalent results by applying calibrators traceable to SI
and the use of reference measurement procedures [22–26]. Standardisation is accomplished when equivalent results are obtained by different clinical laboratory tests conducted by different laboratories using valid traceability chains established between the measurement results and a stable endpoint, be it the SI, the value of internationally agreed reference material (RM) or a value obtained with a reference method.

Fig. 4 A bias of +5 arbitrary units in this case means that an increased number of healthy persons are falsely diagnosed as sick as shown by the increase in the dark triangular area in the figure to the right compared to the figure to the left
[Figure panels: measurement results using method A (left) and measurement results using method B (right); x-axis: arbitrary units; y-axis: frequency]

Standardisation is not possible when internationally agreed RM and corresponding reference measurement procedures are not available. Harmonisation is then the second best and in fact the only option. It aims at achieving equivalent results amongst different measurement procedures, commonly using fresh patient samples [27–31]. Unfortunately, less than 10 % of measurands (60 of more than 600) in a typical university hospital laboratory of clinical chemistry and laboratory medicine are as yet traceable to SI.

Standardised and harmonised clinical laboratory test results [24–26] improve the quality of healthcare by ensuring reliable screening and diagnosis and by supporting appropriate treatments. They also reduce the risk of diagnostic and treatment errors that may be caused by unnecessary variation in test results. They lower healthcare cost by avoiding false-positive or false-negative results from non-standardised/harmonised tests. Such results risk unnecessary follow-up diagnostic procedures and treatments.

Standardisation is the method of choice for obtaining equivalence of measurement results. It has the unique advantage that when measurement results provided by reference methods or values assigned to RM are traceable to the SI units, this allows maintenance of proper calibration over time and across locations. Standardisation has proven particularly successful for well-defined measurands existing in only a single molecular form (e.g. small molecules like creatinine and cholesterol) in clinical samples.

Harmonised methods work through consensus and are valid during a particular period in time. They do not share the ability of standardised methods to maintain trueness over extended periods of time. Harmonisation is usually based on the use of natural patient samples for comparing methods [28]. The advantage of harmonisation is that it is able to address the tests that as yet cannot be standardised (Fig. 5).

Complex large-molecular measurands that exist in several molecular forms (e.g. lutropin, follitropin, human chorionic gonadotropin) are difficult to standardise. Consensus is required on the unique definition of the measurands based on solid research findings and understanding of the clinically and metrologically relevant molecular forms that are needed both in RM and the patient samples. We are currently only in the very beginning of a long process of accomplishing this for all relevant measurands in laboratory medicine.

The use of a single central laboratory has been the rule when establishing laboratory result-based clinical guidelines [28]. Knowledge of how such guidelines perform amid the complex uncertainties of conglomerates of laboratories using different measuring systems is in its infancy.

Method validation in clinical chemistry

Single laboratory method validation is appropriate when a method is used for a specific purpose in one laboratory. Full method validation in a conglomerate of laboratories includes, in addition to the procedures of single laboratory validation, a study of the fitness for the intended use of measuring systems in a number of locations, with several operators, etc., including a study of the performance characteristics of the measuring systems over extended periods of time including the effects of lot-to-lot variations.
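One way to picture the additional study this implies is to summarise internal quality control results for the same stabilised control material across all measuring systems in a conglomerate. The Python sketch below computes, for each system, the mean, the coefficient of variation and the deviation of its mean from the mean of all system means, plus the overall coefficient of variation when all results are pooled. The system names, the data and the choice of the mean of system means as comparison value are assumptions made for illustration; the deviation reported is a between-system difference, not a bias against a reference method.

```python
from statistics import mean, stdev

def summarise_conglomerate(results):
    """Summarise QC results per measuring system and for the whole conglomerate.

    `results` maps a (hypothetical) system name to a list of results obtained
    on the same control material over an extended period.
    """
    system_means = {name: mean(values) for name, values in results.items()}
    grand_mean = mean(list(system_means.values()))  # mean of the system means

    per_system = {}
    for name, values in results.items():
        per_system[name] = {
            "mean": system_means[name],
            # within-system coefficient of variation, in percent
            "cv_percent": 100 * stdev(values) / system_means[name],
            # deviation of the system mean from the conglomerate mean, in percent
            "deviation_percent": 100 * (system_means[name] - grand_mean) / grand_mean,
        }

    # Pool every result to see the variation a patient could meet across systems.
    pooled = [v for values in results.values() for v in values]
    overall_cv = 100 * stdev(pooled) / mean(pooled)
    return per_system, overall_cv

# Hypothetical QC data (arbitrary units) for three measuring systems.
qc = {
    "hospital_A": [50.1, 49.8, 50.4, 49.9, 50.3],
    "hospital_B": [51.9, 52.3, 52.0, 51.7, 52.1],
    "primary_care_C": [48.9, 49.3, 49.0, 48.7, 49.1],
}

per_system, overall_cv = summarise_conglomerate(qc)
for name, stats in per_system.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
print("overall CV (%):", round(overall_cv, 2))
```

In this made-up data set each system is individually precise, but the pooled coefficient of variation is several times larger because the system means differ, which is the situation a patient moving between laboratories would encounter.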
Fig. 5 Standardisation using traceable and internationally agreed RM and appropriate reference measurement procedures is optimal. Unfortunately, only about 10 % of measurands in laboratory medicine today are traceable to SI (illustrated by the tip of the iceberg analogy on the right). The consensus process of harmonisation using natural patient samples can, however, always be used
[Figure labels: Standardisation, a vertical regulatory process; Harmonisation, a horizontal consensus process]
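The consequence of bias for diagnostic classification illustrated in Fig. 4 can be sketched numerically. The Python snippet below assumes that results for healthy persons are normally distributed and that a fixed upper decision limit is used; the mean, standard deviation and limit are invented for illustration, and only the bias of +5 arbitrary units follows the Fig. 4 caption.

```python
from statistics import NormalDist

def fraction_above_limit(mean, sd, limit, bias=0.0):
    """Fraction of healthy persons whose measured result exceeds `limit`.

    A constant `bias` shifts every measured result by the same amount.
    """
    measured = NormalDist(mu=mean + bias, sigma=sd)
    return 1.0 - measured.cdf(limit)

# Assumed healthy population and decision limit (arbitrary units).
HEALTHY_MEAN = 100.0
HEALTHY_SD = 10.0
DECISION_LIMIT = 120.0  # mean + 2 SD in this assumed example

unbiased = fraction_above_limit(HEALTHY_MEAN, HEALTHY_SD, DECISION_LIMIT)
biased = fraction_above_limit(HEALTHY_MEAN, HEALTHY_SD, DECISION_LIMIT, bias=5.0)

print(f"false positives, method A (no bias): {unbiased:.1%}")
print(f"false positives, method B (bias +5): {biased:.1%}")
```

Under these assumptions a bias of half a standard deviation roughly triples the share of healthy persons classified as sick, even though the precision of the method is unchanged.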
Full diagnostic method validation is an investigation of the diagnostic properties of the method (diagnostic sensitivity, diagnostic specificity and diagnostic decision limits, etc.) and the added value the method brings to the clinical diagnosis and monitoring of treatment effects. It is used for establishing the diagnostic properties of the method in health and disease [32–35], a major undertaking demanding that the diagnosis in question is independently established by methods other than the one being tested.

Diagnostic validation investigates to what extent a conglomerate of measuring systems that samples from a patient are likely to encounter can reproduce the conditions that existed during the original full diagnostic method validation.

The conglomerate of laboratories should minimise the analytical uncertainty since results can be produced and reported by any laboratory within the conglomerate. The contribution of pre- and postanalytical uncertainty also needs to be minimised by systematic monitoring of errors and other sources of uncertainty and collaboration with the clinically active personnel. The analytical uncertainty is preferably estimated using stabilised samples for internal quality control for measuring precision and using commutable samples, e.g. with split-sample techniques, for estimating bias as described below.

Precision

Precision is the quantitative expression of random error, usually given as coefficients of variation monitored under specific conditions. Repeatability conditions exist when the same examination procedure, same operators, same measuring system, same operating conditions and same location are used for replicate measurements on the same or similar objects over a short period of time, usually less than a working day of 8 h. Reproducibility conditions include the same or different measurement procedure, different location, and replicate measurements on the same or similar objects over an extended period, but may include other conditions involving changes. Intermediate precision includes conditions in between the extremes of repeatability and reproducibility. It is usually estimated by daily examinations over extended periods of time for at least 1 year. All sources of variation relevant to intermediate precision, e.g. lot number changes, should be included with an appropriate number of occurrences. The intermediate precision can refer to one measuring system or to all measuring systems in the conglomerate of laboratories.

Bias

Bias is an estimate of a systematic measurement error. The qualitative concept trueness (in this case lack of trueness) is quantitatively expressed as bias. It is optimally estimated using commutable certified RM or by comparing the average concentration measured in a natural patient sample with the method to be tested with the average concentration measured in the same sample using a reference method.

Commutability

Commutability is a property of a material/sample demonstrated by "the closeness of agreement between the relation
among the measurement results for a stated quantity in this material, obtained according to two given measurement procedures, and the relation obtained among the measurement results for other specified materials" [1] (Fig. 6). Commutability is thus "the equivalence of the mathematical relationship between the results of different measurement procedures for a RM and for representative samples from healthy and diseased individuals" [36]. Natural patient samples are by definition commutable.

Fig. 6 a Lack of commutability of a RM (grey dots and broken line) compared to natural patient samples (black dots and black solid line). Commutability in clinical chemistry describes the ability of a RM to react in the same way as patient specimens in laboratory measurements. b A commutable RM (grey dots and broken line) overlaps with natural patient samples (black dots and solid line)

When a traceability chain is established, it is crucial to include commutable materials in the procedures for determining the concentrations in secondary RM, working calibrators and product calibrators (Fig. 7) in order that the results ultimately measured in the patient samples are comparable. Omission or disregard of this fundamental necessity contributes to the bias frequently found between measuring systems and methods from different manufacturers even for traceable measurement methods.

If a RM is not commutable, the results from routine methods cannot be properly compared with the assigned value of the RM when determining a possible bias [37, 38]. Observed bias may in this case be either due to the non-commutability of the RM or due to the differing specificities of the methods. Non-commutable RM used in validation results in wrong estimation of bias [38, 39].

Proficiency testing

In proficiency testing, individual laboratory results are compared with a consensus value or assigned value. Since stabilised control materials (that may or may not be commutable) are commonly used, the averages of participants' results grouped by measuring system or method commonly differ. Therefore, participants' performances are commonly evaluated against an assigned value, which in clinical chemistry is most often determined as the participants' consensus value. This bias information is, however, valuable for monitoring the performance of individual measuring systems and methods. Furthermore, accreditation and certification organisations keep data from proficiency testing in high regard and find them essential for obtaining and maintaining accreditation and certification.

Participating in a proficiency testing programme applying singleton measurements of the samples will provide a check on the estimated uncertainty (the combination of precision and bias) instead of trueness. Optimal estimation of trueness requires replicate measurements and calculation of the average and the difference (bias) between the average and the assigned value.

Some organisations/companies running proficiency testing schemes occasionally use fresh patient samples in their surveys. This practice substantially decreases the bias between different measuring systems and methods because the manufacturers commonly use natural patient samples, which are commutable, in their efforts to establish and maintain traceability to certified RM and reference methods.

Split samples for estimation of bias within a conglomerate of laboratories

Running a proficiency testing scheme requires sophisticated logistics and computerisation outside the scope of
conglomerates of laboratories. However, the laboratory conglomerate always maintains logistics for sending patient samples between the laboratories, e.g. from a small laboratory analysing a limited number of measurands to a larger laboratory analysing a comprehensive selection of measurands. Let's imagine using this already established and well-maintained logistic function for estimating bias. In this case, a laboratory (adept) sends a patient sample that it has already analysed to a central laboratory (mentor) which measures the sample using its normal automation and measuring systems and methods. However, in this case the sample result is not reported to healthcare as a patient result but as a result for internal use in the laboratory conglomerate for estimation of the bias between the methods used by the mentor and adept laboratories.

Such a split-sample mentor-adept scheme evidently does not establish or maintain traceability of the measuring systems and methods in the conglomerate of laboratories. However, it provides valuable information about the calibration and other technical parameters of the different measuring systems that influence the trueness and thereby the uncertainty when measuring patient samples that are analysed at different locations/laboratories within the laboratory conglomerate. This bias information is then most commonly used to identify measuring systems that need re-calibration, maintenance or full-blown overhaul rather than for secondary adjustment of the calibration functions to reduce bias.

The advantages of natural patient samples are: (1) the material is commutable and has similar matrix properties, (2) they are available without cost for all laboratories accepting routine patient samples, (3) there is a general agreement that theoretically all measuring systems and reagents should result in identical results when analysing the same patient samples. This is not always the case.

Fig. 7 Traceability chain of RM involves reference measurement procedures and measurement procedures of lower metrological order including routine measurement procedures. If non-commutable RM is used for calibration in one or more of the measurement steps performed, there is a risk of bias and increased uncertainty in the traceability chain as shown at the bottom of the figure
[Figure rows. Material: primary reference, secondary reference, working calibrator, product calibrator (each marked "Commutable?"), patient sample (marked "Commutable!"). Measurement procedure: primary reference measurement, secondary reference measurement, manufacturers measurement, routine measurement in a clinical laboratory, leading to the patient result. Provider: BIPM, national metrology institutes, accredited reference laboratories; national metrology institutes, accredited reference laboratories; manufacturers laboratory; end user. Bottom curves: uncertainty for commutable material and uncertainty for noncommutable material]

Fitness for purpose/fitness for intended use evaluation

Fitness for purpose is "the property of data produced by a measurement process that enables a user of the data to make technically correct decisions for a stated purpose" [40]. When defining the concept, Thompson and Ramsey [40] referred to Tonks' study from 1963 [41], which proposed that the allowable limits of error for a measurand should be one quarter of the reference interval, expressed as a percentage of the mean of the reference interval. Thereby, the concept of "fitness for purpose" was from the outset coupled to the concept of "analytical quality specifications"/"analytical performance specifications" widely used in clinical chemistry [42–50].

In decision theory, fitness for purpose is "the property of a result when it provides the maximum utility" [5]. Decisions on fitness for purpose may therefore be based on informed professional judgement and an agreement between the laboratory and the users of the laboratory [5].
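Tonks' rule as summarised above is simple arithmetic and can be written out as a short Python sketch. The function name is hypothetical and the reference interval in the example is invented; the sketch takes the "mean of the reference interval" to be the midpoint of its limits, which is an assumption.

```python
def tonks_allowable_error_percent(lower, upper):
    """Allowable limit of error as a percentage of the reference-interval mean.

    One quarter of the reference interval width, divided by the midpoint of
    the interval (assumed here to represent its mean), times 100.
    """
    if upper <= lower:
        raise ValueError("upper limit must exceed lower limit")
    width = upper - lower
    midpoint = (upper + lower) / 2
    return 100 * (width / 4) / midpoint

# Invented reference interval of 3.5 to 5.0 (any unit).
allowable = tonks_allowable_error_percent(3.5, 5.0)
print(f"allowable limit of error: {allowable:.1f} %")
```

A narrow reference interval relative to its mean therefore gives a tight specification, while a wide interval tolerates a larger relative error.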
Estimating fitness for purpose has also been defined as reaching externally stated requirements of "target measurement uncertainty" [51] or "property of a result of a measurement when the uncertainty provides minimal total average costs" [5]. Such fitness for purpose evaluations may, for example, be performed in proficiency testing schemes, e.g. using z-scores.

Whereas evaluation of fitness for purpose/fitness for intended use has been narrowed to reaching an agreed "target measurement uncertainty" in some parts of the sciences of metrology [51] including VIM 2.34, it has maintained its original "maximum utility" [5] scope in clinical chemistry and is known as analytical quality or analytical performance specifications [48]. Fitness for purpose remains the property of results produced by measuring systems that enables a user of the data to make clinically correct decisions for a stated purpose.

Performance specifications – target measurement uncertainty

The Stockholm Conference held in 1999 on "Strategies to set global analytical quality specifications in laboratory medicine" advocated the following hierarchical structure for performance specifications: (1) evaluation of the effect of analytical performance on clinical outcomes in specific clinical settings; (2) evaluation of the effect of analytical performance on clinical decisions in general using (a) data based on components of biological variation, or (b) analysis of clinicians' opinions; (3) published professional recommendations from (a) national and international expert bodies, or (b) expert local groups or individuals; (4) performance goals set by (a) regulatory bodies, or (b) organisers of external quality assessment (EQA) schemes; and (5) goals based on the current state of the art as (a) demonstrated by data from EQA or proficiency testing schemes, or (b) found in current publications on methodology [52].

The conference "Defining analytical performance specifications", the 1st Strategic Conference of the European Federation of Clinical Chemistry and Laboratory Medicine in Milan 2014, maintained and simplified the criteria in an attempt to improve their application for various stakeholders [48]. Model 1. Based on the effect of analytical performance on clinical outcomes (1) Direct outcome studies: investigating the impact of analytical performance of the test on clinical outcomes; (2) Indirect outcome studies: investigating the impact of analytical performance of the test on clinical classifications or deci-

Optimal clinical/patient outcomes remain the "reasons for being" in clinical chemistry and should, whenever proper data are available, remain at the top of the list of performance specifications for laboratories, however tempting it may seem to regress to purely technical/metrological specifications including "target measurement uncertainty" and state of the art determined, e.g. by performance in proficiency testing schemes.

Optimal performance specifications should evidently cover the entire total testing process (Figs. 2, 3) including the pre- and postanalytical phases [11, 13, 53, 54]. Since clinical decision limits are based on studies where all phases of the total testing process have been involved, they are usually taken into account when model 1 (see above) is used. A primary task of laboratories and conglomerates of laboratories is to establish and maintain systems to minimise pre- and postanalytical errors and to monitor their occurrences. If and when pre- and postanalytical errors can be expressed as uncertainty components, they should evidently be included in performance specifications in the same manner as measurement uncertainties [48].

The European in vitro diagnostics IVD directive

In vitro diagnostic (IVD) medical devices are in Europe regulated by the IVD Directive 98/79/EC [55] which has been mandatory since December 2003.

ISO 17511:2013 (In vitro diagnostic medical devices – Measurement of quantities in biological samples – Metrological traceability of values assigned to calibrators and control materials) [56] is the standard showing how to achieve traceability in accordance with EU legislation. The fact that it is a harmonised standard means that it is recognised at EU level as describing how the legislation (IVD directive) should be implemented.

ISO 17511 [56] describes several different possible traceability chains, which can all be used to achieve standardisation (albeit only within a particular measurement system in the last case):

• Cases with primary reference measurement procedure and primary calibrator(s) giving metrological traceability to SI.
• Cases with international conventional reference measurement procedure (which is not primary) and international conventional calibrator(s) without metrological traceability to SI.
• Cases with international conventional reference mea-
sions and thereby on the probability of patient outcomes,
surement procedure (which is not primary) but no
e.g. by simulation or decision analysis. Model 2. Based on
international conventional calibrator and without metro-
components of biological variation of the measurand.
logical traceability to SI.
Model 3. Based on state of the art.
• Cases with international conventional calibrator (which is not primary) but no international conventional reference measurement procedure and without metrological traceability to SI.
• Cases with manufacturer’s selected measurement procedure but neither international conventional reference measurement procedure nor international conventional calibrator and without metrological traceability to SI.

Validation versus verification

The IVD directive [55] states ‘‘The traceability of values assigned to calibrators and/or control materials must be assured through available reference measurement procedures and/or available RM of a higher order’’ (98/79/EC, Annex I (A) (3), 2nd paragraph). ‘‘Higher order’’ is not defined in the directive, and there was no implementing legislation beyond assigning responsibility for assuring traceability to national notified bodies. Furthermore, harmonisation of the methods that are not traceable is not mentioned in the directive either.

One of the crucial advantages of the IVD directive is that it emphasises standardisation/traceability of measurement methods and puts the responsibility for validation on the shoulders of the manufacturers. The responsibility of the users/laboratories then becomes to verify the measurement methods, that is, to investigate to what extent the performance data obtained by manufacturers during method validation can be reproduced in the environments of the end-users.

Verification practices have commonly been established over time and are naturally influenced by accreditation and certification authorities. The EP15-A2 protocol from CLSI [57] is commonly used for this purpose and uses stabilised control material with assigned concentrations or certified RM. Another pragmatic method involving commutable materials is to measure a range of concentrations in at least 20 natural patient samples both by the established method and by the new method to estimate bias, and to measure at least two concentrations of stabilised control materials at least twice daily for at least 10 days to estimate repeatability and intermediate reproducibility.

Limitation of the IVD directive and current verification practices

The IVD directive [55] has done clinical chemistry in Europe a service in emphasising traceability and clarifying the responsibilities of metrology institutes and manufacturers of measuring systems. However, the IVD directive risks complacency amongst the users of the measuring systems and methods, since it puts the overwhelming responsibility for the overall quality of measurements in clinical chemistry on the shoulders of the manufacturers of measuring systems. Furthermore, it only demands the verification of each measuring system independently, and not as a part of a conglomerate of measuring systems all potentially reporting to the same client.

The manufacturers of measuring systems are usually in no position to do full method validations (as defined earlier in this paper) and are therefore unable to supply the end-users with information about the bias and reproducibility precision to be expected, and possibly verified, in typical conglomerates of laboratories for a certain population. The users of measuring systems in conglomerates of laboratories in clinical chemistry therefore need to look for analytical performance specifications/goals [46, 48–50, 58, 59] appropriate for the patient population their laboratories serve, preferably in close collaboration with their clinical colleagues. The priorities within the conglomerate of laboratories should then be to fulfil these analytical performance goals not only in the analytical phase of the total testing process, but also in the pre- and postanalytical phases. Using commutable control materials, including split natural patient samples, will serve well in this effort. The main purpose of bias control within a conglomerate of laboratories using commutable materials is to identify measuring systems in need of technical overhaul and primary calibration. Secondary adjustment of calibrations [60] is rarely required when calibrations are properly performed and the measuring systems are in optimal technical condition.

Conclusions

Samples for measuring the same measurand from a certain patient are likely to encounter several measuring systems over time in the process of diagnosis and treatment of his/her diseases. The conglomerate of laboratories serving a population of patients will serve the interest of their patients even better if they minimise even further the part of diagnostic uncertainty caused by analytical uncertainty and improve the traceability/harmonisation of the measuring systems. A full method validation is a study of fitness for purpose including all the measuring systems in a number of laboratories. Clinical decision limits and clinical guidelines will thereby be appropriately used.

Acknowledgements The authors acknowledge with gratitude the substantial contribution of the reviewers and editor to improving the quality of this manuscript.
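As an illustration of the pragmatic verification design described under ‘‘Validation versus verification’’ (at least 20 split patient samples to estimate bias, and stabilised controls measured at least twice daily for at least 10 days to estimate precision), the calculations might be sketched as follows. This is a minimal sketch under stated assumptions, not the EP15-A2 procedure: the function names are hypothetical, bias is taken as the mean paired difference between the new and the established method, and the precision components come from a balanced one-way analysis of variance with day as the grouping factor. The numbers in the usage part are invented for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def paired_bias(established, new):
    """Mean and SD of the differences (new - established) over split samples.

    Hypothetical helper: each patient sample is measured once by each method.
    """
    if len(established) != len(new):
        raise ValueError("each sample needs one result per method")
    diffs = [n - e for e, n in zip(established, new)]
    return mean(diffs), stdev(diffs)


def precision_components(runs_by_day):
    """Repeatability SD and intermediate precision SD from a balanced design.

    runs_by_day: one list of replicate control results per day,
    e.g. 10 days x 2 results. Uses a one-way ANOVA over days (an assumption).
    """
    k = len(runs_by_day)          # number of days
    n = len(runs_by_day[0])       # replicates per day
    if k < 2 or n < 2 or any(len(day) != n for day in runs_by_day):
        raise ValueError("need at least 2 days with equal replicate counts >= 2")
    day_means = [mean(day) for day in runs_by_day]
    grand_mean = mean(day_means)
    # Within-day mean square: repeatability variance
    ms_within = sum(
        (x - m) ** 2 for day, m in zip(runs_by_day, day_means) for x in day
    ) / (k * (n - 1))
    # Between-day mean square
    ms_between = n * sum((m - grand_mean) ** 2 for m in day_means) / (k - 1)
    # Between-day variance component, truncated at zero
    var_between = max(0.0, (ms_between - ms_within) / n)
    return sqrt(ms_within), sqrt(ms_within + var_between)


if __name__ == "__main__":
    # Invented values; a real study would use >= 20 samples and >= 10 days.
    old = [4.1, 5.6, 7.2, 9.8, 12.4]
    new = [4.3, 5.7, 7.5, 10.0, 12.9]
    bias, sd_diff = paired_bias(old, new)
    print(f"mean bias = {bias:.3f}, SD of differences = {sd_diff:.3f}")

    control = [[5.0, 5.2], [5.1, 5.1], [4.9, 5.0], [5.3, 5.2]]
    s_r, s_i = precision_components(control)
    print(f"repeatability SD = {s_r:.3f}, intermediate SD = {s_i:.3f}")
```

The resulting bias and intermediate precision estimates can then be compared with the analytical performance specifications chosen for the patient population in question.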
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

1. JCGM (2012) International vocabulary of metrology - Basic and general concepts and associated terms (VIM 3). Bureau International des Poids et Mesures. http://www.bipm.org/utils/common/documents/jcgm/JCGM_200_2012.pdf. Accessed 1 Feb 2017
2. Fearn T, Fisher SA, Thompson M, Ellison SLR (2002) A decision theory approach to fitness for purpose in analytical measurement. Analyst 127(6):818–824. doi:10.1039/B111465d
3. Thompson M, Fearn T (1996) What exactly is fitness for purpose in analytical measurement? Analyst 121:275–278
4. Magnusson B, Örnemark U (2014) Eurachem guide: the fitness for purpose of analytical methods - a laboratory guide to method validation and related topics, 2nd edn. Eurachem. www.eurachem.org
5. Thompson M, Ellison SLR (2006) Fitness for purpose - the integrating theme of the revised Harmonised protocol for proficiency testing in analytical chemistry laboratories. Accred Qual Assur 11(8–9):373–378. doi:10.1007/s00769-006-0137-5
6. Fraser CG, Cummings ST, Wilkinson SP, Neville RG, Knox JD, Ho O, MacWalter RS (1989) Biological variability of 26 clinical chemistry analytes in elderly people. Clin Chem 35(5):783–786
7. Simundic AM, Bartlett WA, Fraser CG (2015) Biological variation: a still evolving facet of laboratory medicine. Ann Clin Biochem 52(Pt 2):189–190. doi:10.1177/0004563214567478
8. Fraser CG, Hyltoft Peterson P (1993) Desirable standards for laboratory tests if they are to fulfill medical needs. Clin Chem 39:1447–1455
9. Ricos C, Alvarez V, Perich C, Fernandez-Calle P, Minchinela J, Cava F, Biosca C, Boned B, Domenech M, Garcia-Lario JV, Simon M, Fernandez PF, Diaz-Garzon J, Gonzalez-Lao E (2015) Rationale for using data on biological variation. Clin Chem Lab Med 53(6):863–870. doi:10.1515/cclm-2014-1142
10. Ricós C, Alvarez V, Cava F, García-Lario JV, Hernández A, Jiménez CV, Minchinela J, Perich C, Simón M (1999) Current databases on biological variation: pros, cons and progress. Scand J Clin Lab Invest 59(7):491–500
11. Plebani M, Sciacovelli L, Aita A, Pelloso M, Chiozza ML (2015) Performance criteria and quality indicators for the pre-analytical phase. Clin Chem Lab Med 53(6):943–948. doi:10.1515/cclm-2014-1124
12. Sciacovelli L, Plebani M, Garcia del Oino Castro I, Lippi G, Sumarac Z, Furtado Veira K, West JB, Ivanov A (2016) Laboratory Errors and Patient Safety (WG-LEPS). International federation of clinical chemistry and laboratory medicine (IFCC). http://217.148.121.44/MqiWeb/resources/doc/Quality_Indicators_Key_Processes.pdf. Accessed 16 Sept 2016
13. Plebani M, Sciacovelli L, Aita A, Padoan A, Chiozza ML (2014) Quality indicators to detect pre-analytical errors in laboratory testing. Clin Chim Acta 432:44–48. doi:10.1016/j.cca.2013.07.033
14. Plebani M, Sciacovelli L, Marinova M, Marcuccitti J, Chiozza ML (2013) Quality indicators in laboratory medicine: a fundamental tool for quality and patient safety. Clin Biochem 46(13–14):1170–1174. doi:10.1016/j.clinbiochem.2012.11.028
15. Plebani M, Chiozza ML, Sciacovelli L (2013) Towards harmonization of quality indicators in laboratory medicine. Clin Chem Lab Med 51(1):187–195. doi:10.1515/cclm-2012-0582
16. Skeie S, Perich C, Ricos C, Araczki A, Horvath AR, Oosterhuis WP, Bubner T, Nordin G, Delport R, Thue G, Sandberg S (2005) Postanalytical external quality assessment of blood glucose and hemoglobin A1c: an international survey. Clin Chem 51(7):1145–1153. doi:10.1373/clinchem.2005.048488
17. Kristoffersen AH, Thue G, Sandberg S (2006) Postanalytical external quality assessment of warfarin monitoring in primary healthcare. Clin Chem 52(10):1871–1878. doi:10.1373/clinchem.2006.071027
18. Favaloro EJ, Lippi G, Adcock DM (2008) Preanalytical and postanalytical variables: the leading causes of diagnostic error in hemostasis? Semin Thromb Hemost 34(7):612–634. doi:10.1055/s-0028-1104540
19. Wils J, Fonfrede M, Augereau C, Watine J (2014) Further comments on ‘‘Critical review of laboratory investigations in clinical practice guidelines: proposals for the description of investigation’’. Clin Chem Lab Med 52(8):e155–e157. doi:10.1515/cclm-2013-1045
20. Aakre KM, Langlois MR, Watine J, Barth JH, Baum H, Collinson P, Laitinen P, Oosterhuis WP (2013) Critical review of laboratory investigations in clinical practice guidelines: proposals for the description of investigation. Clin Chem Lab Med 51(6):1217–1226. doi:10.1515/cclm-2012-0574
21. Trenti T, Schunemann HJ, Plebani M (2016) Developing GRADE outcome-based recommendations about diagnostic tests: a key role in laboratory medicine policies. Clin Chem Lab Med 54(4):535–543. doi:10.1515/cclm-2015-0867
22. Armbruster D (2013) Accuracy controls: assessing trueness (bias). Clin Lab Med 33(1):125–137. doi:10.1016/j.cll.2012.10.002
23. Bais R, Armbruster D, Jansen RT, Klee G, Panteghini M, Passarelli J, Sikaris KA, Results IWGoAEfT (2013) Defining acceptable limits for the metrological traceability of specific measurands. Clin Chem Lab Med 51(5):973–979. doi:10.1515/cclm-2013-0122
24. Armbruster DA (2009) Measurement traceability and US IVD manufacturers: the impact of metrology. Accred Qual Assur 14(7):393–398. doi:10.1007/s00769-009-0535-6
25. Armbruster DA (2013) Implementation of traceability: is the IVD industry’s approach really fulfilling obligations? In: 7th CIRME international scientific meeting: metrological traceability & assay standardization, May 24th, 2013, Stresa
26. Armbruster D, Miller RR (2007) The Joint Committee for Traceability in Laboratory Medicine (JCTLM): a global approach to promote the standardisation of clinical laboratory test results. Clin Biochem Rev 28(3):105–113
27. Miller WG, Tate JR, Barth JH, Jones GR (2014) Harmonization: the sample, the measurement, and the report. Ann Lab Med 34(3):187–197. doi:10.3343/alm.2014.34.3.187
28. Miller WG, Eckfeldt JH, Passarelli J, Rosner W, Young IS (2014) Harmonization of test results: what are the challenges; how can we make it better? Clin Chem 60(7):923–927. doi:10.1373/clinchem.2012.201186
29. Miller WG, Myers GL (2013) Commutability still matters. Clin Chem 59(9):1291–1293. doi:10.1373/clinchem.2013.208785
30. Gantzer ML, Miller WG (2012) Harmonisation of measurement procedures: how do we get it done? Clin Biochem Rev 33(3):95–100
31. Miller WG, Myers GL, Lou Gantzer M, Kahn SE, Schonbrunner ER, Thienpont LM, Bunk DM, Christenson RH, Eckfeldt JH, Lo SF, Nubling CM, Sturgeon CM (2011) Roadmap for harmonization of clinical laboratory measurement procedures. Clin Chem 57(8):1108–1117. doi:10.1373/clinchem.2011.164012
32. Bossuyt PM, Reitsma JB, Linnet K, Moons KG (2012) Beyond diagnostic accuracy: the clinical utility of diagnostic tests. Clin Chem 58(12):1636–1643. doi:10.1373/clinchem.2012.182576
33. Bossuyt PM, Cohen JF, Gatsonis CA, Korevaar DA, Group S (2016) STARD 2015: updated reporting guidelines for all diagnostic accuracy studies. Ann Transl Med 4(4):85. doi:10.3978/j.issn.2305-5839.2016.02.06
34. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig L, Lijmer JG, Moher D, Rennie D, de Vet HC, Kressel HY, Rifai N, Golub RM, Altman DG, Hooft L, Korevaar DA, Cohen JF, Group S (2015) STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ 351:h5527. doi:10.1136/bmj.h5527
35. Moons KG, de Groot JA, Linnet K, Reitsma JB, Bossuyt PM (2012) Quantifying the added value of a diagnostic test or marker. Clin Chem 58(10):1408–1417. doi:10.1373/clinchem.2012.182550
36. (ISO) IOfS (2003) In vitro diagnostic medical devices - measurement of quantities in biological samples - metrological traceability of values assigned to calibrators and control materials. International Organization for Standardization (ISO), Geneva, Switzerland
37. Miller WG, Myers GL, Rej R (2006) Why commutability matters. Clin Chem 52(4):553–554. doi:10.1373/clinchem.2005.063511
38. Vesper HW, Miller WG, Myers GL (2007) Reference materials and commutability. Clin Biochem Rev 28(4):139–147
39. Franzini C, Ceriotti F (1998) Impact of reference materials on accuracy in clinical chemistry. Clin Biochem 31(6):449–457
40. Thompson M, Ramsey MH (1995) Quality concepts and practices applied to sampling - an exploratory-study. Analyst 120(2):261–270. doi:10.1039/an9952000261
41. Tonks DB (1963) A study of the accuracy and precision of clinical chemistry determinations in 170 Canadian laboratories. Clin Chem 9:217–233
42. Laessig RH (1990) Medical need for quality specifications within laboratory medicine. Ups J Med Sci 95(3):233–244
43. Fraser CG (2015) The 1999 Stockholm consensus conference on quality specifications in laboratory medicine. Clin Chem Lab Med 53(6):837–840. doi:10.1515/cclm-2014-0914
44. Dybkaer R (1993) Medical need for quality specifications in clinical laboratories. Truth, accuracy, error and uncertainty. Ups J Med Sci 98(3):215–220
45. Fraser CG (1990) Quality specifications in laboratory medicine. Ups J Med Sci 95(3):229–232
46. Ceriotti F, Fernandez-Calle P, Klee GG, Nordin G, Sandberg S, Streichert T, Vives-Corrons JL, Panteghini M, Task E, Finish Group on Allocation of laboratory tests to different models for performances (2016) Criteria for assigning laboratory measurands to models for analytical performance specifications defined in the 1st EFLM strategic conference. Clin Chem Lab Med. doi:10.1515/cclm-2016-0091
47. Horvath AR, Bossuyt PM, Sandberg S, John AS, Monaghan PJ, Verhagen-Kamerbeek WD, Lennartz L, Cobbaert CM, Ebert C, Lord SJ, Test Evaluation Working Group of the European Federation of Clinical C, Laboratory M (2015) Setting analytical performance specifications based on outcome studies - is it possible? Clin Chem Lab Med 53(6):841–848. doi:10.1515/cclm-2015-0214
48. Sandberg S, Fraser CG, Horvath AR, Jansen R, Jones G, Oosterhuis W, Petersen PH, Schimmel H, Sikaris K, Panteghini M (2015) Defining analytical performance specifications: consensus statement from the 1st strategic conference of the european federation of clinical chemistry and laboratory medicine. Clin Chem Lab Med 53(6):833–835. doi:10.1515/cclm-2015-0067
49. Oosterhuis WP, Sandberg S (2015) Proposal for the modification of the conventional model for establishing performance specifications. Clin Chem Lab Med 53(6):925–937. doi:10.1515/cclm-2014-1146
50. Thue G, Sandberg S (2015) Analytical performance specifications based on how clinicians use laboratory tests. Experiences from a post-analytical external quality assessment programme. Clin Chem Lab Med 53(6):857–862. doi:10.1515/cclm-2014-1280
51. De Bievre P (2007) Fitness for purpose is different from a performance specification. Accred Qual Assur 12(10):501. doi:10.1007/s00769-007-0312-3
52. Kallner A, McQueen M, Heuck C (1999) The Stockholm consensus conference on quality specifications in laboratory medicine, 25–26 April 1999. Scand J Clin Lab Invest 59(7):475–476
53. Plebani M, Sciacovelli L, Aita A, Chiozza ML (2014) Harmonization of pre-analytical quality indicators. Biochem Med (Zagreb) 24(1):105–113. doi:10.11613/BM.2014.012
54. Plebani M, Astion ML, Barth JH, Chen W, de Oliveira Galoro CA, Escuer MI, Ivanov A, Miller WG, Petinos P, Sciacovelli L, Shcolnik W, Simundic AM, Sumarac Z (2014) Harmonization of quality indicators in laboratory medicine. A preliminary consensus. Clin Chem Lab Med 52(7):951–958. doi:10.1515/cclm-2014-0142
55. EU (1998) Directive 98/79/EC of the European Parliament and of the Council of 27 October 1998 on in vitro diagnostic medical devices. Eur-Lex, http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31998L0079:EN:NOT
56. ISO (2003) 17511:2003 In vitro diagnostic medical devices - measurement of quantities in biological samples - metrological traceability of values assigned to calibrators and control materials
57. CLSI (2006) User verification of performance for precision and trueness; approved guideline EP15-A2. Clinical and Laboratory Standards Institute
58. Stepman HC, Stockl D, Twomey PJ, Thienpont LM (2013) A fresh look at analytical performance specifications from biological variation. Clin Chim Acta 421:191–192. doi:10.1016/j.cca.2013.03.018
59. Panteghini M, Sandberg S (2015) Defining analytical performance specifications 15 years after the Stockholm conference. Clin Chem Lab Med 53(6):829–832. doi:10.1515/cclm-2015-0303
60. Theodorsson E, Magnusson B, Leito I (2014) Bias in clinical chemistry. Bioanalysis 6(21):2855–2875. doi:10.4155/bio.14.249
