ISO - DIS - 13530 - (E) - Guide To Analytical Quality Control
INTERNATIONAL ORGANIZATION FOR STANDARDIZATION • МЕЖДУНАРОДНАЯ ОРГАНИЗАЦИЯ ПО СТАНДАРТИЗАЦИИ • ORGANISATION INTERNATIONALE DE NORMALISATION
ICS 13.060.45
To expedite distribution, this document is circulated as received from the committee secretariat.
ISO Central Secretariat work of editing and text composition will be undertaken at publication
stage.
THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENT AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE
REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT
INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO
WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Copyright notice
This ISO document is a Draft International Standard and is copyright-protected by ISO. Except as permitted
under the applicable laws of the user's country, neither this ISO draft nor any extract from it may be
reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, photocopying,
recording or otherwise, without prior written permission being secured.
Requests for permission to reproduce should be addressed to either ISO at the address below or ISO's
member body in the country of the requester.
Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Performance characteristics of analytical systems
4.1 Introduction
4.2 Scope of the method
4.3 Calibration
4.4 Limit of detection, limit of quantification
4.5 Interferences and matrix effects
4.6 Accuracy (trueness and precision) required of results
4.7 Uncertainty of measurement
4.8 Robustness
4.9 Fitness for purpose
5 Choosing analytical systems
5.1 General considerations
5.2 Stepwise procedure to select analytical techniques to be used in a measurement programme
5.3 Practical considerations
6 Initial tests to establish performance of analytical system
6.1 General
6.2 Precision tests
6.3 Recovery tests
7 Intralaboratory quality control
7.1 General
7.2 Terms relating to within-laboratory quality control
7.3 Control of accuracy
7.4 Control of trueness
7.5 Control of precision
7.6 Principles of applying control charts
7.7 Conclusions
7.8 Control charts with fixed quality criterions (target control charts)
8 Quality control in sampling
9 Interlaboratory quality control
10 Quality control for lengthy analytical procedures or analysis undertaken infrequently or at an ad hoc basis
10.1 Quality control for lengthy analytical procedures
10.2 Analysis undertaken infrequently or on an ad hoc basis
Annex A (informative) Evaluation of interference effects on analytical methods
A.1 General
A.2 Procedure
A.3 Experimental design
A.4 Interpretation and reporting of results
A.5 Worked examples of the use of interference testing
Annex B (informative) The nature and sources of analytical errors
B.1 General
Tables
Table A.1 — Example of experimental design for one batch of interference tests
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 13530 was prepared by Technical Committee ISO/TC 147, Water quality, Subcommittee SC 2, Physical,
chemical and biochemical methods.
This edition cancels and replaces ISO/TR 13530:1997, which has been technically revised.
1 Scope
This International Standard provides detailed and comprehensive guidance on a coordinated programme of
within-laboratory and between-laboratory quality control for ensuring that results of adequate and
specified accuracy are achieved in the analysis of waters and associated materials.
This International Standard and its annexes are applicable to the chemical and physicochemical analysis of
natural waters (including sea water), waste water, raw water intended for the production of potable water,
and potable water. It is not intended for application to the analysis of sludges and sediments (although
many of its general principles are applicable to such analysis) and it does not address the biological or
microbiological examination of water. Whilst sampling is an important aspect, this is only briefly considered.
Analytical quality control as described in this International Standard is intended for application to water
analysis carried out within a quality assurance programme. This International Standard does not address
the detailed requirements of quality assurance for water analysis.
The recommendations of this International Standard are in agreement with the recommendations of
established quality assurance documentation (e.g. ISO/IEC 17025).
This International Standard is applicable to the use of all analytical methods within its field of application,
although its detailed recommendations may require interpretation and adaptation to deal with certain types
of determinands (for example non-specific determinands such as suspended solids or biochemical oxygen
demand BOD). In the event of any disparity between the recommendations of this International Standard
and the requirements of a standard method of analysis, the requirements of the method should prevail.
The basis of the International Standard is to ensure the achievement of results of adequate accuracy by
adherence to the sequential stages of analytical quality control.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 3534-1, Statistics — Vocabulary and Symbols — Part 1: Probability and general statistical terms
ISO 3696, Water for analytical laboratory use — Specification and test methods
ISO 5667-1, Water quality — Sampling — Part 1: Guidance on the design of sampling programmes
ISO 5667-3, Water quality — Sampling — Part 3: Guidance on the preservation and handling of water
samples
ISO 5667-14, Water quality — Sampling — Part 14: Guidance on quality assurance of environmental water
sampling and handling
ISO 8466-1, Water quality — Calibration and evaluation of analytical methods and estimation of performance
characteristics — Part 1: Statistical evaluation of the linear calibration function
ISO 8466-2, Water quality — Calibration and evaluation of analytical methods and estimation of performance
characteristics — Part 2: Calibration strategy for non-linear second order calibration
ISO/IEC 17025, General requirements for the competence of testing and calibration laboratories
ISO/IEC Guide 43-1, Proficiency testing by interlaboratory comparisons — Part 1: Development and operation
of proficiency testing schemes
ILAC-G13:2000, Guidelines for the Requirements for the Competence of Providers of Proficiency Testing
Schemes
3 Terms and definitions
3.1
accuracy of measurement
the closeness of agreement between the result of a measurement and a true value of the measurand [9]
3.2
analytical run
a group of measurements or observations carried out together either simultaneously or sequentially without
interruption on the same instrument by the same analyst using the same reagents. An analytical run may
consist of more than one batch of analyses. During an analytical run the accuracy and precision of the
measuring system is expected to be stable [43]
3.3
bias
difference between the expectation of a test result or measurement result and a true value [ISO 3534-2]
3.4
batch of analyses
a group of measurements or observations of standards, samples and/or control solutions which have been
performed together in respect of all procedures, either simultaneously or sequentially, by the same analysts
using the same reagents, equipment and calibration
3.5
combined standard uncertainty
$u_c(y)$
standard uncertainty of the result y of a measurement when the result is obtained from the values of a number
of other quantities, equal to the positive square root of a sum of terms, the terms being the variances or
covariances of these other quantities weighted according to how the measurement result varies with these
quantities [8]
3.6
conventional true value
value attributed to a particular quantity and accepted, sometimes by convention, as having an uncertainty
appropriate for a given purpose [9]
EXAMPLES
a) At a given location, the value assigned to the quantity realised by a reference standard may be taken as a
conventional true value.
b) The CODATA (1986) recommended value for the Avogadro constant, $N_A$: $6.0221367 \times 10^{23}\ \mathrm{mol}^{-1}$.
NOTE 1 "Conventional true value" is sometimes called assigned value, best estimate of the value, conventional value
or reference value.
NOTE 2 Frequently, a number of results of measurements of a quantity is used to establish a conventional true value.
3.7
coverage factor
k
numerical factor used as a multiplier of the combined standard uncertainty in order to obtain an expanded
uncertainty [8]
3.8
error (of measurement)
the result of a measurement minus a true value of the measurand [9]
NOTE Since a true value cannot be determined, in practice a conventional true value is used.
3.9
expanded uncertainty
U
quantity defining an interval about the result of a measurement that may be expected to encompass a large
fraction of the distribution of values that could reasonably be attributed to the measurand [8]
NOTE 1 The fraction can be regarded as the coverage probability or level of confidence of the interval.
NOTE 2 To associate a specific level of confidence with the interval defined by the expanded uncertainty requires
explicit or implicit assumptions regarding the probability distribution characterised by the measurement result and its
combined standard uncertainty. The level of confidence that can be attributed to this interval can be known only to the
extent to which such assumptions can be justified.
NOTE 3 An expanded uncertainty $U$ is calculated from a combined standard uncertainty $u_c$ and a coverage factor $k$
using
$U = k \cdot u_c$
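The relationship in NOTE 3, together with the definition in 3.5, can be illustrated by a minimal Python sketch. It assumes uncorrelated input quantities, so covariance terms are omitted. The function names, the sensitivity coefficients, the choice k = 2 and all numerical values are illustrative assumptions and are not taken from this International Standard.

```python
import math


def combined_standard_uncertainty(components):
    """Root sum of squares of (sensitivity coefficient * standard uncertainty).

    components: iterable of (c_i, u_i) pairs for uncorrelated input quantities.
    """
    return math.sqrt(sum((c * u) ** 2 for c, u in components))


def expanded_uncertainty(u_c, k=2.0):
    """U = k * u_c; the default k = 2 is an assumed, commonly used value."""
    return k * u_c


# Hypothetical components in mg/l, each with a sensitivity coefficient of 1.
u_c = combined_standard_uncertainty([(1.0, 0.03), (1.0, 0.04)])
U = expanded_uncertainty(u_c, k=2.0)
print(f"u_c = {u_c:.3f} mg/l, U = {U:.3f} mg/l")
```

The coverage factor is kept as an explicit argument because the level of confidence associated with it depends on the assumed probability distribution (see NOTE 2).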
3.10
limit of detection
smallest amount or concentration of an analyte in the test sample that can be reliably distinguished from zero
[40]
3.11
limit of quantification
lowest concentration of a substance in a defined matrix where positive identification and quantitative
measurement can be achieved using a specified method [41]
3.12
precision
the closeness of agreement between independent test results obtained under stipulated conditions [ISO 3534]
NOTE 1 Precision depends only on the distribution of random errors and does not relate to the true value or the
specified value.
NOTE 2 The measure of precision is usually expressed in terms of imprecision and computed as a standard deviation
of the test results. Less precision is reflected by a larger standard deviation.
NOTE 3 "Independent test results" means results obtained in a manner not influenced by any previous result on the
same or similar test object. Quantitative measures of precision depend critically on the stipulated conditions. Repeatability
and reproducibility conditions are particular sets of extreme stipulated conditions.
3.13
random error
result of a measurement minus the mean that would result from an infinite number of measurements of the
same measurand carried out under repeatability conditions [9]
NOTE 2 Because only a finite number of measurements can be made, it is possible to determine only an estimate of
random error.
3.14
standard uncertainty
$u(x_i)$
uncertainty of the result $x_i$ of a measurement expressed as a standard deviation [8]
3.15
systematic error of measurement
mean that would result from an infinite number of measurements of the same measurand carried out under
repeatability conditions minus a true value of the measurand [9]
NOTE 2 Like true value, systematic error and its causes cannot be known.
NOTE 3 Alternative terms used in this document are trueness (see 3.17) and bias (see 3.3).
3.16
traceability
property of the result of a measurement or the value of a standard whereby it can be related to stated
references, usually national or international standards, through an unbroken chain of comparisons all having
stated uncertainties [9]
3.17
trueness
closeness of agreement between the expectation of a test result or a measurement result and a true value [ISO 3534-2]
3.18
true value
value consistent with the definition of a given particular quantity [9]
NOTE 3 The indefinite article "a" rather than the definite article "the" is used in conjunction with "true value" because
there may be many values consistent with the definition of a given particular quantity.
3.19
uncertainty (of measurement)
parameter associated with the result of a measurement, that characterises the dispersion of the values that
could reasonably be attributed to the measurand [9]
NOTE 1 The parameter may be, for example, a standard deviation (or a given multiple of it), or the width of a
confidence interval.
NOTE 2 Uncertainty of measurement comprises, in general, many components. Some of these components can be
evaluated from the statistical distribution of the results of a series of measurements and can be characterised by
experimental standard deviations. The other components, which can also be characterised by standard deviations, are
evaluated from assumed probability distributions based on experience or other information.
NOTE 3 It is understood that the result of the measurement is the best estimate of the value of the measurand and that
all components of uncertainty, including those arising from systematic effects, such as components associated with
corrections and reference standards, contribute to the dispersion.
4 Performance characteristics of analytical systems
4.1 Introduction
The validation of each method applied is described in ISO/IEC 17025. Primary validation is part of the
development of a new analytical method, and is performed during the standardization of the method.
Important points to be considered for the primary validation of an analytical method are the following.
⎯ Calibration;
⎯ Interferences;
⎯ Uncertainty of measurement;
⎯ Robustness.
4.2 Scope of the method
A clear definition should be given of the forms of the substance that are determined by the procedure and also,
when necessary to avoid ambiguity, those forms that are not capable of determination. At this point, it is worth
emphasizing that the analyst's selection of an analytical method should meet the user's definition of the
determinand. Non-specific determinands need the use of rigorously standardized analytical methods in order
to obtain reliable and comparable results.
Many substances exist in water in a variety of forms or 'species', and many analytical systems provide a
differential response to the various forms. For example, when a separation of 'dissolved' and 'particulate'
material is required, special care is necessary to define precisely the nature and pore-size of the filter to be
used.
A precise description of the types and natures of samples is important before the analytical system can be
chosen. The precautions to be taken when a sample is analysed will depend to a high degree on the sample.
The analyst needs information as complete as possible on sample types, concentration levels and possible
interferences. The scope should contain a clear statement of the types of sample and sample matrices for
which the procedure is suitable. If necessary, a statement should also be made of important sample types and
matrices for which the procedure is not suitable.
The range of application corresponds to the lowest and highest concentrations for which tests of precision and
bias have been carried out using the system without modification. Where the procedure can be extended to
examine samples containing concentrations greater than the upper limit, for example by analysis after
dilution, the extended procedure should be regarded as a different procedure whose performance
characteristics can be inferred from the values quoted for the original.
The concentration range of interest can have a marked effect on the choice of analytical technique; of primary
concern is the smallest concentration of interest. The lower limit of application (4.4.4) is required to be ≤ 30 %
of the relevant limit although other considerations could be applied, if appropriate.
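The 30 % criterion can be expressed as a simple check. The following Python sketch is only an illustration: the function name and the numerical values (a hypothetical relevant limit of 50 µg/l) are assumptions, and the fraction is left adjustable because other considerations may apply.

```python
def lower_limit_is_adequate(lower_limit_of_application, relevant_limit,
                            max_fraction=0.30):
    """Return True if the lower limit of application does not exceed
    max_fraction (default 30 %) of the relevant limit.

    Both arguments must be expressed in the same concentration unit.
    """
    if relevant_limit <= 0:
        raise ValueError("relevant_limit must be positive")
    return lower_limit_of_application <= max_fraction * relevant_limit


# Hypothetical values in ug/l: relevant limit 50, so the threshold is 15.
ok_method = lower_limit_is_adequate(10.0, 50.0)
weak_method = lower_limit_is_adequate(20.0, 50.0)
print(ok_method, weak_method)
```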
4.3 Calibration
Unless stated otherwise, calibration is performed using the procedure prescribed in the standardized
method or by applying ISO 8466-1 or ISO 8466-2.
The quality of water used in preparation of standard and blank solutions should be examined carefully. In
general water complying with purity grade 1 of ISO 3696 is used. The standard addition technique is used to
overcome multiplicative matrix effects on the calibration curve. Methods for the determination of non-specific
determinands are calibrated by the use of arbitrarily chosen standard solutions, prescribed and exactly
defined in the standardized method. Detailed instructions on calibration procedures are given in [32], [38],
ISO 8466-1 and ISO 8466-2.
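As an illustration of a linear calibration, the following Python sketch fits a straight line by ordinary least squares and derives the residual standard deviation and the standard deviation of the method (residual standard deviation divided by the slope), in the manner commonly associated with ISO 8466-1. It is a sketch under stated assumptions: the function name and the calibration data are hypothetical, and the checks that a full evaluation requires (linearity, homogeneity of variances) are omitted.

```python
import math


def linear_calibration(x, y):
    """Ordinary least-squares fit y = a + b * x.

    Returns the intercept a, the slope b (sensitivity), the residual
    standard deviation s_y and the standard deviation of the method
    s_x0 = s_y / b.
    """
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired calibration points")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    residual_ss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_y = math.sqrt(residual_ss / (n - 2))  # n - 2 degrees of freedom
    s_x0 = s_y / b
    return a, b, s_y, s_x0


# Hypothetical calibration data: concentration in mg/l, instrument response.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
resp = [0.1, 2.0, 4.1, 6.0, 8.1]
a, b, s_y, s_x0 = linear_calibration(conc, resp)

# Convert a measured response of 5.0 into a concentration using the fit.
x_sample = (5.0 - a) / b
print(f"a = {a:.3f}, b = {b:.3f}, s_x0 = {s_x0:.4f}, x = {x_sample:.2f} mg/l")
```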
4.4 Limit of detection, limit of quantification
In broad terms, the limit of detection is the smallest amount or concentration of an analyte in the test sample
that can be reliably distinguished from zero [40]. For analytical systems whose application range does not
include or approach the limit of detection, it does not need to be part of the validation. For example, when
the hardness of water is determined, the limit of detection is not of interest.
There has been much diversity in the way in which the limit of detection of an analytical system is defined.
Most approaches multiply either the within-batch standard deviation of results of blanks or the
standard deviation of the method, $s_{x0}$, by a factor. These statistical inferences depend on
the assumption of normality, which is at least questionable at low concentrations. Notwithstanding this, in
method validation a simple definition, leading to a quickly implemented estimation of the detection limit, shall
be applied.
4.4.1 Limit of detection for methods with normally distributed blank values
$x_{LD} = 3\,s_0$
where
$x_{LD}$ is the limit of detection;
$s_0$ is the precision estimate (standard deviation) described below.
The precision estimate $s_0$ shall be based on at least 10 independent complete determinations of analyte
concentration in a typical matrix blank or low-level material, with no censoring of zero or negative results. For
that number of determinations the factor of 3 corresponds to a significance level of α = 0.01.
With the recommended minimum degrees of freedom, the value of the limit of detection is quite uncertain, and
may easily be in error by a factor of 2. Where more rigorous estimates are required more complex calculations
should be applied. For special cases see respective standards ISO 11843 Parts 1 to 4 [2], [3], [4], [5].
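The calculation for normally distributed blank values can be sketched in Python as follows. The function name and the blank results are hypothetical; the sample standard deviation (n - 1 degrees of freedom) is assumed as the precision estimate, and negative results are deliberately retained because the text requires that zero or negative results are not censored.

```python
import statistics


def limit_of_detection(blank_results, factor=3.0, min_n=10):
    """Limit of detection as factor * s0, where s0 is the sample standard
    deviation of independent blank (or low-level) determinations."""
    if len(blank_results) < min_n:
        raise ValueError(f"at least {min_n} determinations are required")
    s0 = statistics.stdev(blank_results)
    return factor * s0


# Hypothetical blank determinations in mg/l; negative values are kept.
blanks = [0.02, -0.01, 0.00, 0.03, 0.01, -0.02, 0.02, 0.00, 0.01, -0.01]
x_ld = limit_of_detection(blanks)
print(f"x_LD = {x_ld:.4f} mg/l")
```

With only 10 determinations the estimate remains quite uncertain, so the result is better reported with one or two significant figures.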
The limit of detection $x_{LD}$ is defined as the concentration of the analyte at a signal/noise ratio S/N = 3.
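A signal/noise criterion can be converted into a concentration when the response is approximately linear near the blank. The following Python sketch assumes such a linear response; the function name, the way the noise is measured and the numerical values are hypothetical assumptions that this document does not specify.

```python
def lod_from_signal_to_noise(noise, sensitivity, ratio=3.0):
    """Concentration at which the net signal equals ratio * noise.

    noise: baseline noise in signal units (how it is measured is an
        assumption left to the user).
    sensitivity: slope of the calibration line, signal units per
        concentration unit.
    """
    if sensitivity <= 0:
        raise ValueError("sensitivity must be positive")
    return ratio * noise / sensitivity


# Hypothetical values: noise 0.2 signal units, slope 4.0 signal units per ug/l.
x_ld_sn = lod_from_signal_to_noise(noise=0.2, sensitivity=4.0)
print(f"x_LD = {x_ld_sn:.2f} ug/l")
```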
The limit of quantification represents by convention the lowest reportable result. Usually it is arbitrarily taken
as a fixed multiple of the detection limit [40].
For method validation, the limit of quantification $x_{LQ}$ shall be calculated as:
$x_{LQ} = 3\,x_{LD}$
To verify the limit of quantification, blank samples spiked at this concentration level shall be analysed
in the same manner as real samples.
Taking the estimated limit of quantification into account, a lower limit of application $x_{LA}$
($x_{LA} \geq x_{LQ}$) shall be defined. For
methods which need calibration, the lowest possible limit of application is equal to the lowest standard
concentration.
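The two rules above can be combined in a short Python sketch. It reads them as two lower bounds: the lower limit of application can be neither below the limit of quantification nor below the lowest calibration standard. The function names and the numerical values are hypothetical.

```python
def limit_of_quantification(x_ld, factor=3.0):
    """Limit of quantification as a fixed multiple of the detection limit."""
    return factor * x_ld


def lowest_admissible_limit_of_application(x_lq, calibration_standards):
    """Smallest value satisfying both conditions: not below x_lq and not
    below the lowest calibration standard."""
    if not calibration_standards:
        raise ValueError("at least one calibration standard is required")
    return max(x_lq, min(calibration_standards))


# Hypothetical values in mg/l.
x_lq = limit_of_quantification(0.05)
x_la = lowest_admissible_limit_of_application(x_lq, [0.2, 0.5, 1.0])
print(f"x_LQ = {x_lq:.2f} mg/l, lowest admissible x_LA = {x_la:.2f} mg/l")
```

A laboratory may choose a higher limit of application than the value returned here; the sketch only gives the lowest value compatible with both conditions.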
4.5 Interferences and matrix effects
An important source of systematic error in results is the presence of constituents of a sample other than the
determinand that cause an enhancement or a suppression of the analytical response. The evaluation of
interference effects on analytical systems is described in Annexes A and B, and the results of such evaluation
should provide estimates of error at or near the lower and upper concentration limits of the system. These
estimates should be available for each substance (interferent) of interest at a concentration slightly higher
than the greatest value expected in samples.
Most analytical techniques produce accurate results with standard solutions at the optimal concentration.
Relevant information should include the types of samples (fresh water, sea water, waste water, etc.) for which
the method is suitable.
4.6 Accuracy (trueness and precision) required of results
The general term accuracy is used to refer to trueness and precision combined. Accuracy is a measure of the
total displacement of a result from the true value, due to both random and systematic errors (see Annex B).
Precision can be determined at different levels. At one extreme is the repeatability, which is measured in a
single laboratory (one person, same equipment, short time). At the other extreme is the reproducibility, which
is calculated from the results of the interlaboratory comparison carried out within the method validation
process and published in method standards.
Trueness is closely connected with the demonstration of measurement traceability, which is a requirement of
ISO/IEC 17025. Procedures for demonstrating traceability and for using appropriate reference materials are
described in many guides, e.g. [27], [49]. In the chemical analysis of water, traceability is often difficult
to demonstrate, or can be demonstrated only in part, because of the use of “empirical” methods and the
complexity of the matrices.
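Estimates of bias and repeatability can be obtained from replicate analyses of a reference material, as in the following Python sketch. It assumes that the replicates were obtained under repeatability conditions and that the reference value can be treated as a conventional true value. The function name and the data are hypothetical.

```python
import statistics


def bias_and_repeatability(results, reference_value):
    """Estimate bias (mean minus reference value) and the repeatability
    standard deviation (sample standard deviation of the replicates)."""
    if len(results) < 2:
        raise ValueError("at least two replicate results are required")
    bias = statistics.mean(results) - reference_value
    s_r = statistics.stdev(results)
    return bias, s_r


# Hypothetical replicate results for a reference material certified at 10.0 mg/l.
replicates = [10.2, 9.9, 10.1, 10.3, 10.0]
bias, s_r = bias_and_repeatability(replicates, reference_value=10.0)
relative_bias_percent = 100.0 * bias / 10.0
print(f"bias = {bias:.2f} mg/l ({relative_bias_percent:.1f} %), s_r = {s_r:.3f} mg/l")
```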
Results from interlaboratory tests may be useful in choosing analytical systems. Consistently accurate results
in such tests often indicate an analytical approach which is robust. It is often important for a laboratory to
maintain a high level of quality in day-to-day routine operation. Strict adherence to a quality control
programme is necessary: reference substances should be used, where possible, to check trueness, and control
charts should be used to keep precision, and in special cases both precision and trueness, under control.
The nature of any known form of bias should be summarized. Comparison of results obtained using the
system under consideration with those using reference procedures, and also the results obtained by the
analysis of certified materials using the system under consideration are all relevant - but are often not
available.
The concentration of many determinands may change between sampling and analysis, and large systematic
errors may result.
Evidence concerning the magnitude of these systematic errors and the efficiency of measures to eliminate
them is required. Possible errors due to the inability of the system to measure all forms of the substance as
defined above, and any bias attributable to the methods of calibration and blank correction, should be known
and reported.
4.7 Uncertainty of measurement
ISO/IEC 17025 requires accredited laboratories to have procedures available for estimating measurement
uncertainty. Under certain conditions (“when it is relevant to the validity or application of the test results,
when a client's instruction so requires, or when the uncertainty affects compliance to a specification limit”)
they shall state the measurement uncertainty.
Uncertainties in an analytical process may be estimated using different procedures, depending on the
intended use of the result. In any case, when associating an uncertainty with a result, the analytical
laboratory should indicate the approach which has been chosen.
The formal approach to measurement uncertainty estimation calculates a measurement uncertainty estimate
from an equation, or mathematical model. The procedures described as method validation are designed to
ensure that the equation used to estimate the result, with due allowance for random errors of all kinds, is a
valid expression embodying all recognized and significant effects upon the result. The basic document in
which this approach is described is the “Guide to the expression of uncertainty in measurement” [8]. Because
this guide is hard to comply with in practice, several standards and guidelines [7] [26] [28] [44] have
been published to support the implementation of the concept of measurement uncertainty for routine
measurements in laboratories. Some of these guides describe reduced procedures (so-called
“top-down-models”) to estimate the uncertainty of measurement.
Their common aim is to provide a practical, understandable and common way of calculating measurement
uncertainty, based mainly on already existing quality control and validation data.
In the top-down-models there are in principle two different procedures to estimate the uncertainty of
measurement, see Annex C. If there is a choice between data from internal quality control and results of
collaborative studies, the internal quality control data should preferably be used to calculate measurement uncertainty.
Water analysis often deals with operational (empirical) methods, where no systematic errors occur. In these
cases (e.g. sum parameters), and also if the correction of systematic errors is possible, a useful approach to
estimating measurement uncertainty is to consider only those precision data which are larger than 1/3 of the
largest random error:
$$s_t = \sqrt{\sum s_x^2} \qquad \text{for all } s_x > \tfrac{1}{3}\, s_{x(\max)}$$
where
sx(max) is the biggest standard deviation (random error of the most imprecise analytical step).
If sampling is an integral part of the analysis, random errors shall include the sampling and sample
preservation procedure (see Clause 8).
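As an illustration only, the rule above can be sketched in Python. The function name `combined_random_sd`, the example standard deviations, and the reading that components not exceeding one third of the largest are dropped before a root-sum-of-squares combination are assumptions made for this sketch.

```python
import math

def combined_random_sd(component_sds, fraction=1 / 3):
    """Root-sum-of-squares of the components larger than fraction * largest component."""
    if not component_sds:
        raise ValueError("at least one standard deviation is required")
    s_max = max(component_sds)
    # Keep only the components that matter relative to the most imprecise step.
    kept = [s for s in component_sds if s > fraction * s_max]
    return math.sqrt(sum(s * s for s in kept))

# Hypothetical standard deviations (same units) of three analytical steps.
steps = [0.30, 0.12, 0.05]
s_t = combined_random_sd(steps)
print(f"s_t = {s_t:.3f}")  # 0.05 is below 0.30 / 3 and is therefore ignored
```

Dropping the small components changes the result very little, because variances add: a component one third the size of the largest contributes only about one ninth of its variance.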
4.8 Robustness
A measurement programme, for example a river survey, may often include a high number of very different
types of samples. For this reason, routine analytical laboratories often prefer robust, multipurpose analytical
techniques applicable to a broad range of samples.
A 'robust' or 'rugged' analytical procedure is one so designed that the accuracy of analytical
results is not appreciably affected by small deviations from the experimental design prescribed by the
analytical method. The use of robust procedures is of great help to achieve reliable results in routine
laboratories. The most robust procedure is the preferable choice, if the procedure meets the user's
requirements. There is no simple numerical value indicating the robustness, but results from interlaboratory
trials should be used to illustrate the robustness of a procedure. Special responsibility falls on experts
improving or standardizing methods to produce robust techniques.
The need for complete and clear specification of analytical procedures should be stressed. The method
should specify all details regarding analysis, equipment, calibration, calculation of results, etc., and also
include any details on sampling, sample handling and preservation, any digestion step or other specific
pretreatment of samples. Any optional operations should be specifically noted.
Fitness for purpose is the extent to which the performance of a method matches the criteria, agreed between
the analyst and the end-user of the data describing the end-user’s needs. For instance, the errors in data
should not be of a magnitude that would give rise to incorrect decisions more often than a defined small
probability, but they should not be so small that the analyst is involved in unnecessary expenditure. Analytical
fitness-for-purpose criteria may be expressed either in terms of acceptable combined measurement
uncertainty or acceptable individual performance characteristics.
Sampling of waters and effluents is carried out in order to provide information on their qualities. This
information may be used for different reasons, for example:
⎯ environmental monitoring;
The user's needs are of primary importance. It is the responsibility of the user to define precisely the
objectives of the measurement programme and to help to choose the measurement techniques to be used.
The following topics should be defined in the measuring programme:
In this clause it is emphasized that all analytical work should be based on a sound and precisely defined
measurement programme, providing the analyst with representative and stable samples. The inclusion of a
quality assurance system implies the production of data of stated quality. This is partially attained by analytical
quality control activities which keep random and systematic errors within prescribed limits.
In establishing a measurement programme, including decisions on sampling, analysis and treatment of data,
all aspects and parts of work are inter-connected. All important factors in selecting analytical techniques can
be discussed in a stepwise procedure. A check-list composed of sequential stages may be useful. However, it
is important to note that some decisions may require a change in this process at a later stage. The final
programme often is a compromise between what is desirable and what is practicable. It should be
emphasized that all analytical quality control activities should be performed in connection with an overall
quality assurance programme.
e) choose analytical methods (considering determinands, accuracy, concentrations, types of samples and
interferences) considering all points mentioned in Clause 4. A suitable analytical system includes the
correct implementation of
⎯ AQC activities such as use of reference substances, control charts and participation in interlaboratory
tests;
⎯ a quality assurance system, including laboratory and data audits, in order to maintain a stated level of
quality throughout the programme period;
When discussing the requirements with the user and selecting suitable analytical systems to fit the measuring
programme, the following practical points should be considered:
⎯ the frequency of sampling and the total number of samples on each occasion;
⎯ the maximum period between sampling and analysis, in relation to sample stability;
⎯ the maximum period between sampling and the user's need for the results;
⎯ applicability of the proposed method in the laboratory concerned with respect to cost, speed, etc.
Regarding these practical considerations, factors such as convenience, speed and cost may have a great
influence on the final selection of analytical systems. When analysis is required infrequently, it may be
necessary to adopt a different approach from that used for regular, frequent determinations. It is still essential
that the most appropriate action is taken to ensure control of the measurement process and to provide an
estimate of analytical accuracy (see Clause 10).
6.1 General
Once a method has been chosen for a particular application, it is necessary to test the performance of its
routine operation. The emphasis should be placed on an examination of the performance of the whole
analytical system, of which the method is only a part. All the components of the analytical system -
instrumentation, analysts, laboratory facilities, etc. - should be critically examined before routine analysis is
started.
This clause describes the approach recommended for the experimental estimation and, when necessary,
reduction of errors; this stage may be called 'secondary validation' or 'within-laboratory validation'. It should be
completed before samples are routinely analysed separately. The third phase of validation, called 'routine
quality control', is dealt with in Clause 7.
The estimation of systematic error should already have been made in the initial evaluation of the technique. It
will usually be impossible to check many of the most important sources of bias when a method is used
routinely for the first time. A check on some sources of bias, by means of a spiking recovery, is included at this
stage.
The estimation (and, when necessary, control) of random error is an essential precursor to routine analysis.
Preliminary tests provide the necessary evidence that the precision of routine data is adequate, and form the
basis for routine quality control.
The analytical system should be revalidated whenever there is evidence of a significant deterioration in
performance which cannot be corrected.
For this reason, precision should be estimated from analyses taken from separate batches, spread over a
suitable period, e.g. five days. The duration of this period is a matter of choice and depends on which sources
of random variation are to be assessed. Testing to give at least 10 degrees of freedom for each estimate of
standard deviation is recommended if a reliable estimate is to be obtained.
The approach described in this clause allows the total random error to be separated into random error arising
from variations within and between batches of analysis. This information is of value in indicating the dominant
sources of random error. Estimates of within-batch standard deviation are pooled from all batches and so
provide an indication of what is achievable on a regular basis.
The basic approach is to make n determinations on a representative group of samples in each of m batches of
analysis. In deciding on suitable values for n and m, care is required for two reasons.
Too few analyses will not provide a worthwhile estimate of standard deviation. The uncertainty of an estimate
of standard deviation depends on the number of associated degrees of freedom. Designs of test which are
likely to provide estimates of standard deviation with fewer than 10 degrees of freedom, i.e. fewer than 11
batches of analysis, may prove uninformative.
It is desirable to design the test so that a satisfactory estimate of the dominant source of error is obtained. For
example, if between-batch error is likely to be dominant, a design where n = 10 and m = 2 will give a relatively
precise estimate of the less important source of error, but will estimate the dominant source of error very
imprecisely. A more appropriate design would be for n to be made small and m large.
The experimental design recommended for general use is to make n = 2 and m = 8 to 11. Such a design
provides estimates of within- and between-batch standard deviations with approximately equal numbers of
degrees of freedom. This design should be modified as indicated by knowledge of the analytical technique. In
particular, when within-batch errors are assumed to be dominant, values such as n = 4 and m = 5 could be
chosen. This has the merit of reducing the number of batches of analysis which need to be conducted, whilst
(given the assumption is correct) providing a reasonably good estimate of total standard deviation. The
product m ⋅ n should not be less than 10 and should be preferably 20 or greater. Analysis of 11 batches of
duplicate samples will guarantee that the estimates of total standard deviation will have at least 10 degrees of
freedom.
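The trade-off between n and m can be made concrete with a short Python sketch. It assumes the usual one-way layout, in which the within-batch estimate has m(n − 1) degrees of freedom and the between-batch mean square has m − 1. The function name is hypothetical.

```python
def design_degrees_of_freedom(n, m):
    """Return (within-batch DF, between-batch DF) for n replicates in each of m batches."""
    if n < 2 or m < 2:
        raise ValueError("need n >= 2 replicates and m >= 2 batches")
    return m * (n - 1), m - 1

# Designs mentioned in the text above.
for n, m in [(2, 8), (2, 11), (4, 5), (10, 2)]:
    within, between = design_degrees_of_freedom(n, m)
    print(f"n={n:2d}  m={m:2d}  within DF={within:2d}  between DF={between:2d}  m*n={m * n}")
```

The n = 10, m = 2 design leaves a single degree of freedom for the between-batch estimate, which is why it estimates a dominant between-batch error so imprecisely.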
It is clearly essential that the solutions used for tests of precision show no appreciable changes in
concentration during the period in which portions of them are taken for analysis. The solutions should also be
sufficiently homogeneous that the concentration of the determinand is essentially the same in each portion of
a solution. Water samples may sometimes be inadequately stable to allow tests over several days (adequate
sample stability can sometimes be achieved by suitable preservation techniques, but these should be used
only if specified in the analytical methods of interest).
It is convenient to use standard solutions when estimating precision. Standards of any desired concentration
can be obtained, so that a range of concentrations is available for the estimates of precision; real samples of
the desired concentration may not be available. However, the analyst should also have estimates of the
precision for water samples as it should not, in general, be assumed that standard solutions and water
samples can be analysed with the same precision. Therefore, precision estimates for samples and standards
should normally be obtained.
For these tests, standard solutions and samples should be analysed, measured and evaluated in exactly the
same way as normal routine samples.
When the limit of detection is of interest, a matrix solution containing essentially none of the determinand
should also be included (see 4.4).
These various requirements may seem rather complex, but the worked example, given in Annex D, shows
that they may be simply resolved. Clearly, the greater the number of different solutions included in the tests,
the greater the information obtained on precision, but a compromise with the effort required will often be
necessary. As a guide to the minimum number of solutions, it is suggested that the following test samples
should be included in each batch of analysis.
Two standard solutions or samples at concentrations near the upper and lower concentration of interest.
When standard solutions are used, one water sample near the average concentration in samples should also
be included.
For testing matrix influences the water sample mentioned above should be spiked with a known quantity of
determinand. If estimation of precision at a variety of different concentrations is of key interest, the level of the
spike should be chosen so that the final concentration differs from those of the other solutions. Otherwise, it is
advisable to make as large an addition as the initial sample concentration and the range of the method will
allow.
To estimate the limit of detection, n replicate blank samples (natural or analytical blanks) should be analysed.
If precision at the blank level is known to be dependent on the sample matrix, it will be necessary either to use
a blank sample which contains the determinand (and risk a likely overestimation of limit of detection) or to take
steps to remove the determinand from a sample so that it may be used as a blank. When, as with some
chromatographic techniques, no response is obtained for a blank, it is recommended that a blank is spiked
with enough determinand to produce a measurable response. This can form the basis for an estimate of limit
of detection. The measured values should not, of course, be used for blank correction.
The simplest approach to the design of precision tests is to prepare all samples for analysis at the start of the
tests and use these without preparing fresh aliquots for each batch of analysis. This is satisfactory provided
there is no sample instability. The possibility that sample instability may be present rules out the direct
estimation of between-batch standard deviation and may call into question the assessment of within-batch
standard deviation. Further discussion is reported in [38].
6.2.3.1 Randomization
Randomization of the order of analyses should be used to eliminate the effects of any systematic changes in
factors that cannot be controlled, and which might otherwise cause false conclusions to be drawn.
Standard deviations should be calculated from the set of results for each sample. Thus, for each solution
analysed, n results are available from each batch (e.g. two, corresponding to the first and second portions of the
sample to be analysed). These results should, if necessary, be blank corrected using the analytical blank for
the appropriate batch.
It is useful to obtain estimates of the within-batch and between-batch standard deviations, sw and sb,
respectively. These two estimates are needed to allow an estimate of the total standard deviation, st, to be
obtained. A statistical technique known as 'analysis of variance' (see Annex D) is used. The theoretical basis
of the technique is described in statistical texts, but in the present context it may be taken simply as a
convenient means of calculating sw, sb and st. These estimates also provide important information about the
sources of error.
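The calculation of sw, sb and st can be sketched in Python as follows. Annex D is not reproduced here, so the sketch assumes the standard one-way analysis of variance with equal batch sizes: sw² = M0, sb² = (M1 − M0)/n (set to zero if negative) and st² = sw² + sb². The function name and the example data are hypothetical.

```python
import math

def anova_components(batches):
    """batches: list of m lists, each holding n replicate results for one solution."""
    m = len(batches)
    n = len(batches[0])
    if m < 2 or n < 2 or any(len(b) != n for b in batches):
        raise ValueError("need m >= 2 batches of equal size n >= 2")
    batch_means = [sum(b) / n for b in batches]
    grand_mean = sum(batch_means) / m
    # Within-batch mean square M0, with m(n - 1) degrees of freedom.
    m0 = sum((x - bm) ** 2 for b, bm in zip(batches, batch_means) for x in b) / (m * (n - 1))
    # Between-batch mean square M1, with m - 1 degrees of freedom.
    m1 = n * sum((bm - grand_mean) ** 2 for bm in batch_means) / (m - 1)
    sw2 = m0
    sb2 = max((m1 - m0) / n, 0.0)  # a negative estimate is treated as zero
    return {"M0": m0, "M1": m1, "sw": math.sqrt(sw2), "sb": math.sqrt(sb2),
            "st": math.sqrt(sw2 + sb2)}

# Hypothetical duplicate results (n = 2) from three batches (m = 3).
example = [[10.0, 10.2], [9.8, 10.0], [10.4, 10.6]]
for name, value in anova_components(example).items():
    print(f"{name} = {value:.4f}")
```

Three batches are used only to keep the example short; the designs recommended above call for considerably more.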
If the total standard deviation should not be significantly greater than some target value, Z, a variance ratio test,
at a specified significance level, is used. However, an estimated number of degrees of freedom, DF, for the total
standard deviation should first be calculated. For n results in each of m batches:
$$DF = \frac{m(m-1)\,[M_1 + (n-1)\,M_0]^2}{m\,M_1^2 + (m-1)(n-1)\,M_0^2}$$
where M0 and M1 are the within-batch and between-batch mean squares, respectively, obtained from the
analysis of variance (see Annex D). The calculated value for DF may be non integral; if this is so, the nearest
whole number should be used for DF in the following test.
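A minimal Python sketch of the degrees-of-freedom formula above is given for illustration. The function name and the mean squares in the example are hypothetical.

```python
def total_sd_degrees_of_freedom(m0, m1, n, m):
    """Estimated degrees of freedom of the total standard deviation.

    m0, m1: within-batch and between-batch mean squares; n results in each of m batches.
    """
    numerator = m * (m - 1) * (m1 + (n - 1) * m0) ** 2
    denominator = m * m1 ** 2 + (m - 1) * (n - 1) * m0 ** 2
    return numerator / denominator

# Hypothetical mean squares from duplicates (n = 2) in ten batches (m = 10).
df = total_sd_degrees_of_freedom(m0=1.0, m1=3.0, n=2, m=10)
print(f"DF = {df:.2f}, rounded to {round(df)} for the variance ratio test")
```

When M1 is much larger than M0, the result approaches m − 1, reflecting that the between-batch term then dominates the total.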
The variance ratio $F = s_t^2 / Z^2$ is calculated (clearly, this calculation is not needed if st ≤ Z) and compared with the
tabulated value $F_\alpha$ using DF and ∞ degrees of freedom for the numerator and denominator, respectively. The
value of α appropriate to each situation should be used; a value of 0,05 should usually be suitable for analytical
applications. st is taken to be significantly greater than Z if the calculated value for F is larger than the tabulated
value. If it is found that st is significantly greater than Z, steps should be taken to identify and eliminate the
important sources of error. If st is appreciably greater than Z but not significantly so, it is desirable either to carry
out more tests to obtain a better estimate of st or to attempt to reduce the most important sources of error.
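The comparison can be sketched in Python as below. The tabulated F value is deliberately passed in by the user, since it has to be read from statistical tables for the chosen α and DF; the value 1,83 used in the example is the tabulated F for α = 0,05 with 10 and ∞ degrees of freedom. The function name and the standard deviations are hypothetical.

```python
def variance_ratio_test(st, target, f_tabulated):
    """Return (F, significant) for F = st^2 / Z^2 against a tabulated F value."""
    if target <= 0:
        raise ValueError("the target standard deviation Z must be positive")
    f_value = (st / target) ** 2
    if st <= target:
        return f_value, False  # no test is needed when st does not exceed Z
    return f_value, f_value > f_tabulated

# Hypothetical case: st = 0.32, target Z = 0.25, DF = 10, alpha = 0.05.
f_value, significant = variance_ratio_test(st=0.32, target=0.25, f_tabulated=1.83)
print(f"F = {f_value:.3f}, significantly greater than target: {significant}")
```

In this example st is appreciably greater than Z but not significantly so, which is the situation in which further tests or a reduction of the main error sources is advised.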
Variance ratio tests may also be used to test whether within-batch standard deviations are significantly greater
than given target values. The general procedure is as for the total standard deviation, but the experimental
estimates of the within-batch variances have m(n-1) degrees of freedom.
Consider the example where n determinations of both 'spiked' and 'unspiked' samples are made in each of m
batches. To show that the mean recovery does not differ significantly (at the significance level α) from
(100 ± D) % (where D is the accepted limit for recovery in percent) of the amount added, the following procedure
is used.
The recovery should be calculated from the difference between the results for the n pairs of 'spiked' and
'unspiked' samples in each batch. The mean recovery, Rec, should be calculated from the m⋅n results. Also the
standard deviation, s, of the m mean recoveries should be calculated for the m batches. Let the amount added
for 'spiking' be d (in the same units as Rec). The mean recovery is then acceptable if:
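$$\mathrm{Rec} > (1{,}00 - 0{,}01\,D)\,d - \frac{s\,t_{2\alpha}}{\sqrt{m}} \qquad (\text{for } \mathrm{Rec} < 100\,\%)$$
$$\mathrm{Rec} < (1{,}00 + 0{,}01\,D)\,d + \frac{s\,t_{2\alpha}}{\sqrt{m}} \qquad (\text{for } \mathrm{Rec} > 100\,\%)$$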
where
The object of the test is to identify bias from certain sources occurring in the analysis of real samples. A
known quantity of the determinand is added to a real sample, forming the spiked solution, and the two are
analysed, the difference in concentrations found being used to calculate the recovery. This is repeated n times
and the mean differences compared statistically with the theoretically expected recovery.
Since the spiked solution is made up by adding a fixed quantity of standard solution to a fixed quantity of real
sample, the calculation of its recovery can be made as follows:
$$\mathrm{Rec} = \frac{s\,(v + V) - u\,V}{c\,v} \cdot 100\,\%$$
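The recovery formula above can be illustrated with a short Python sketch. The symbol definitions are not given in this part of the text, so the sketch assumes the only reading that balances the amounts: s and u are the results found for the spiked and unspiked sample, V is the volume of real sample, and v and c are the volume and concentration of the standard solution added. All parameter names and numbers are hypothetical.

```python
def spiking_recovery_percent(spiked_result, unspiked_result,
                             sample_volume, standard_volume, standard_conc):
    """Recovery in percent, allowing for dilution of the sample by the added standard."""
    added = standard_conc * standard_volume  # amount of determinand added (c * v)
    if added <= 0:
        raise ValueError("the added amount must be positive")
    # Amount found in the spiked solution minus the amount already in the sample.
    found = (spiked_result * (standard_volume + sample_volume)
             - unspiked_result * sample_volume)
    return 100.0 * found / added

# Hypothetical example: 1 ml of a 1000 mg/l standard added to 100 ml of sample.
rec = spiking_recovery_percent(spiked_result=14.6, unspiked_result=5.0,
                               sample_volume=100.0, standard_volume=1.0,
                               standard_conc=1000.0)
print(f"Rec = {rec:.2f} %")
```

Volumes and concentrations only need to be in mutually consistent units, because the units cancel in the ratio.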
It should be emphasized that Rec, calculated from m . n values (i.e. m batches, n replicate analyses in each
batch), is only an estimate of the true mean recovery.
The standard deviation, s, is calculated from the m daily mean recoveries (each calculated from two
replicates); s refers therefore to the standard deviation of m daily mean recoveries.
$$s_{\mathrm{Rec}} = \frac{s}{\sqrt{m}}$$
where m is the number of values on which s is based. The standard error is, in fact, the standard deviation of
an estimate of the mean (as opposed to the standard deviation of a single observation). The true mean can be
expected to lie within $\pm\, t_{0{,}05} \cdot s_{\mathrm{Rec}}$ of the estimated Rec with 95 % confidence.
In the test of acceptability, the normal requirement is that Rec is not outside the range of 95 % to 105 %, and
that a recovery is unsatisfactory when it is 95 % certain that it does not comply with that condition.
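A minimal Python sketch of this acceptability check follows. It assumes that the recoveries are expressed in percent, that the Student's t value for m − 1 degrees of freedom is supplied from tables, and that "95 % certain" is read as the whole confidence interval lying outside the 95 % to 105 % range. The function name and data are hypothetical.

```python
import math
import statistics

def assess_mean_recovery(batch_mean_recoveries, t_value, lower=95.0, upper=105.0):
    """Return (Rec, s_Rec, confidence interval, unsatisfactory) from m batch mean recoveries."""
    m = len(batch_mean_recoveries)
    if m < 2:
        raise ValueError("at least two batch mean recoveries are required")
    rec = statistics.mean(batch_mean_recoveries)
    s = statistics.stdev(batch_mean_recoveries)  # standard deviation of the m batch means
    s_rec = s / math.sqrt(m)                     # standard error of the mean recovery
    interval = (rec - t_value * s_rec, rec + t_value * s_rec)
    # Unsatisfactory only if the whole interval lies outside the accepted range.
    unsatisfactory = interval[1] < lower or interval[0] > upper
    return rec, s_rec, interval, unsatisfactory

# Hypothetical batch mean recoveries (%) from m = 5 batches; t = 2.776 for 4 degrees of freedom.
rec, s_rec, interval, bad = assess_mean_recovery([96.0, 98.0, 97.0, 99.0, 95.0], t_value=2.776)
print(f"Rec = {rec:.1f} %, s_Rec = {s_rec:.2f} %, "
      f"interval = {interval[0]:.1f} to {interval[1]:.1f} %, unsatisfactory: {bad}")
```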
NOTE Lower recovery rates can be accepted as long as they are reproducible.
It should be noted that the spiking recovery test is fairly limited in the information it yields. For example if bias is
found in the standard solution results, it is quite probable that it will also occur in the spiking recovery results and
yield no additional information. It only assumes importance when significant bias does not occur elsewhere. In
this case, the implication is a cause of bias in the real sample only, and this usually implies interference
proportional to the concentration of the determinand. (Clearly an interference effect of absolute magnitude would
not affect the difference between spiked and real samples). In the case of unsatisfactory spiking recovery, it is
advisable to check the precision of the real and spiked results, particularly if the spiked solution has not been
freshly prepared for each analysis. If either of the two solutions shows signs of deterioration, this could easily
produce an unsatisfactory spiking recovery.
7.1 General
The previous clause deals with the evaluation of the capabilities of an analytical technique in order to judge its
likely suitability for a particular application. This clause describes the procedure to be adopted when the
system is put into routine use, sometimes called “tertiary validation” or “internal quality control” (IQC) or
“routine analytical quality control” (routine AQC).
Having chosen an analytical system capable of being used to produce results of adequate accuracy, the next
stage is to establish control over the system and to monitor routine performance. The aim is to achieve a
continuing check on the errors in routine analysis and to provide a demonstration of satisfactory performance
of the method.
Control sample: Sample material whose analytical results are used to construct control charts, for
example, standard solutions, real samples, blank samples.
Analytical result: Value reported as defined in the method. It is derived from the response by
application of the calibration.
The total error associated with an analytical result has components of random and systematic errors. Both
sources of error can be controlled on a routine basis. Inaccuracy or analytical error is untrueness and
imprecision, i.e. a combination of random and systematic errors.
It is not sufficient for a laboratory to adopt a suitable method, check its performance initially and assume that
thereafter the results produced will be of adequate accuracy. The chosen method should be subject to routine
tests each time it is used to ensure that adequate performance is maintained. These tests are in addition to
the routine checks to ensure all components of the analytical system are performing correctly before analysis
is commenced, sometimes called “system suitability checks”.
The control of accuracy can be carried out using control charts. The simplest form of control chart is one in
which the results of the individual measurements made on a control sample are plotted against a time series.
This type of chart (for example see Figure 1) provides a check on random and systematic error (from the
spread of results and their displacement). It is an easy procedure to be used by the analyst because it is
simple to plot and no data processing is needed. It is useful when the size of analytical batches is variable or
when batches consist of a small number of determinations. Individual result charts are used widely and often
form the mainstay of a laboratory's approach to control charting.
Key
X data intervals
Y concentration found for standard solution (mg/l)
1 expected concentration
2 warning limit
3 action limit
However, this type of chart may produce false out-of-control values if random error does not follow the normal
distribution. For these reasons, a range of more specialized types of chart has been devised. These are
described below.
7.4.1 General
One way of assessing systematic error is to participate regularly in interlaboratory trials, but these are too
infrequent and the results take too long to process for routine day to day control.
As a routine procedure for controlling systematic error, the use of Shewhart control charts [46] based on single
results or the mean, spiking recovery and analysis of blanks is recommended.
NOTE A range control chart is most commonly used in conjunction with a mean chart constructed from the same
data. The combined use of mean and range charts gives greater control over both systematic and random errors than the
use of a single result control chart.
For trueness control, standard solutions, synthetic samples or certified real samples may be analysed using a
Shewhart chart of individual or mean values.
The analysis of standard solutions serves only as a check on calibration. If, however, solutions with a
synthetic or real matrix are used as control samples, the specificity of the analytical system under examination
can be checked, provided an independent estimation of the true value for the determinand is available. A
useful alternative is to use a typical sample matrix containing none of the determinand and to spike it with a
known amount of determinand.
The respective control sample should be analysed a fixed number of times (≥ 1) in each batch of samples and
the mean result entered in the mean control chart.
It is advisable to analyse certified reference samples (if suitable ones are available and not too expensive)
with routine samples as a check on trueness. A restricted check on systematic error by means of recovery
control charts is often made instead (7.4.3).
The choice of using a single result control chart or a mean control chart will depend on the particular
circumstances. The mean control chart gives greater control of trueness than the single result control chart,
but at the cost of losing control over within batch random error. See 7.5.2.
The recovery control chart is used as a check on systematic errors arising from matrix interferences. A
separate control chart for each type of matrix is required in water analysis, because samples of strongly
varying matrix composition, such as surface water, municipal and industrial waste water, may be subject to
errors of differing sizes and natures.
The recovery control chart, however, provides only a limited check on trueness because the recovery tests will
identify only systematic errors which are proportional to determinand concentration; bias of constant size may
go undetected.
The blank control chart represents a special application of the mean control chart.
The blank control chart may help to identify the following sources of error:
⎯ contamination of reagents;
It is appropriate therefore, to analyse a blank solution at the beginning and at the end of each batch of
samples. The blank values thus obtained are then entered on the blank control chart.
7.5.1 General
There are four ways of estimating the precision of analytical results in routine analysis:
⎯ use of the mean control chart (between batch errors only) (7.4.2);
A range control chart is used to control the within-batch precision of an analytical method. In addition, it allows
some assessment of errors caused by calibration drift. The standard deviation for a certain analytical result
can be estimated from an existing range control chart, provided the matrix of the sample under examination is
similar to that of control samples chosen for the range control chart. The range of the sample in question may
also be determined and entered on the control chart as well, in order to prove that an out-of-control situation
does not exist.
$$s = \frac{R}{d_2}$$
where
Table 1 — d2 values
n d2
2 1,128
3 1,693
4 2,059
5 2,326
6 2,534
7 2,704
8 2,847
9 2,970
10 3,078
where n is the number of replicate analyses in each
batch (see [16])
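The estimate of s from a range chart can be sketched in Python using the d2 values of Table 1. The definition of R is not reproduced in this part of the text; the sketch assumes it is the mean range of the control batches. The function name and the duplicate results are hypothetical.

```python
# d2 values from Table 1, indexed by the number n of replicates per batch.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}

def sd_from_ranges(batch_results):
    """Estimate the within-batch standard deviation from the mean range of equal-sized batches."""
    n = len(batch_results[0])
    if n not in D2 or any(len(b) != n for b in batch_results):
        raise ValueError("all batches must share one size n between 2 and 10")
    ranges = [max(b) - min(b) for b in batch_results]
    mean_range = sum(ranges) / len(ranges)
    return mean_range / D2[n]

# Hypothetical duplicate determinations (n = 2) from four batches.
duplicates = [[5.10, 5.00], [4.95, 5.09], [5.02, 4.94], [5.12, 5.00]]
s_within = sd_from_ranges(duplicates)
print(f"s = {s_within:.4f}")
```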
It is highly recommended that the analyst perform replicate analyses of the sample in question to obtain higher
reliability of the final result, especially in those cases where the contravention of a threshold value is to be
proved. From the data obtained, the standard deviation valid for the matrix in question can be estimated.
Additionally, the performance of replicate determinations offers two further advantages: firstly, coarse errors
(outliers) can be detected, and secondly, the analytical error can be reduced.
$$s = \sqrt{\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n - 1}}$$
The estimation of the standard deviation from the range control chart, or with replicate analysis, can help to
identify a matrix-dependent imprecision.
The application of the method of standard addition, whilst helping to control untrueness, can tend to degrade
precision compared with direct determination. This is the price paid for control over systematic error. The
method of standard addition should be applied with caution. It is essential that the linear range of the method
be established.
The difference control chart is a chart of the difference, D = R1 – R2, in the results of analysis of two portions of
the same sample. R1 and R2 are the results for the first and second portions respectively. It is essential always
to subtract the second result from the first and to plot the difference including sign. The expected value for the
chart is zero. In all other ways the chart is constructed as for a single result chart. This type of chart is useful
when a control solution of known or reproducible value is not available. It is also useful when sample
homogeneity is a major source of error. The main disadvantage of this type of chart is the dependence of the
standard deviation, and therefore the control limits, on the concentration. This problem may be overcome by
plotting the percentage difference instead of the absolute difference.
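The two quantities that may be plotted can be sketched in Python. The text does not say what the percentage difference is taken relative to; the sketch assumes the mean of the two results, and the function names and numbers are hypothetical.

```python
def signed_difference(r1, r2):
    """Difference D = R1 - R2, always first result minus second, sign retained."""
    return r1 - r2

def percent_difference(r1, r2):
    """Signed difference as a percentage of the mean of the two results (an assumption)."""
    pair_mean = (r1 + r2) / 2
    if pair_mean == 0:
        raise ValueError("percentage difference is undefined for a zero mean")
    return 100.0 * (r1 - r2) / pair_mean

# Two hypothetical duplicate pairs at concentrations a factor of ten apart.
for r1, r2 in [(10.2, 9.8), (102.0, 98.0)]:
    print(f"D = {signed_difference(r1, r2):+.1f}   D% = {percent_difference(r1, r2):+.1f} %")
```

The absolute differences differ tenfold between the two pairs, whereas the percentage differences are the same, which is why the percentage form is less dependent on concentration.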
The choice of control samples depends on the matrix, the analytical method and the accuracy required.
Advantages and disadvantages of the several types of control samples are described in [33] and [36].
For single result and mean control charts: Solutions of the determinand in water, preferably real samples, stable for at least one control period
For blank control charts: Purified water or water samples with a sufficiently small concentration of the determinand
For recovery control charts: Real samples with and without addition of the determinand
For range control charts: Real samples; in special cases (see 7.4.1), solutions of the determinand in water
7.6.2.1 Construction of single result, mean, blank and difference control charts
At least 20 mean control values, xi, are required for a trial period to estimate the following tentative control
parameters. They are obtained by analysing the control sample on at least 10 working days in duplicate (see
Clause 6).
⎯ control value xi (i.e. depending on the type of control chart: single result, mean of the replicate analyses,
single blank or single difference of the ith batch);
⎯ mean ($\bar{x}$);
⎯ upper warning limit and lower warning limit (UW, LW) = $\bar{x} \pm 2s$;
⎯ upper action limit and lower action limit (UA, LA) = $\bar{x} \pm 3s$;
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
The control chart is constructed in a coordinate system with the ordinate 'concentration' and the abscissa
'time of analysis' and/or 'batch number'. The numerical values for mean, warning limit and action limit are
plotted on the ordinate and drawn as lines parallel to the abscissa in the control chart.
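The tentative control parameters can be computed with a few lines of Python. This is a sketch only: the function names, the returned dictionary keys and the trial data are hypothetical, and s is taken to be the sample standard deviation of the control values.

```python
import statistics

def control_limits(control_values, minimum=20):
    """Mean, standard deviation, warning limits (2s) and action limits (3s) from trial-period values."""
    if len(control_values) < minimum:
        raise ValueError(f"at least {minimum} control values are required")
    mean = statistics.mean(control_values)
    s = statistics.stdev(control_values)
    return {"mean": mean, "s": s,
            "LW": mean - 2 * s, "UW": mean + 2 * s,
            "LA": mean - 3 * s, "UA": mean + 3 * s}

def classify(value, limits):
    """Place a new control value relative to the warning and action limits."""
    if value > limits["UA"] or value < limits["LA"]:
        return "outside action limits"
    if value > limits["UW"] or value < limits["LW"]:
        return "outside warning limits"
    return "within warning limits"

# Hypothetical trial period: 20 control values scattered around 10.0 mg/l.
trial = [10.0 + 0.1 * ((i % 5) - 2) for i in range(20)]
limits = control_limits(trial)
for key in ("LA", "LW", "mean", "UW", "UA"):
    print(f"{key:>4} = {limits[key]:.3f}")
print(classify(10.35, limits))
```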
The control value should be obtained at least once per batch of analyses. The frequency with which control
values are obtained within a batch lies in the responsibility of the laboratory and should be related to the risks
of important errors and the seriousness of their likely consequences. At regular intervals, the control chart
should be examined for changes in mean and standard deviation.
The Harmonised Guidelines [40] give recommendations for frequencies of analysis of control samples.
In the long-term operation of a control chart, the question arises whether or not to update the estimate of
mean and standard deviation used to generate the action and warning limits and, if so, how this might best be
done. The guiding principle should be that the chart is intended to detect (with known risks of making the
wrong decision) departures from the existing state of statistical control. Including the latest data in the overall
estimates of mean and standard deviation may not be sufficient to allow this aim to be fulfilled.
It is assumed that the last 60 data points are a homogeneous set and that the issue is whether or not these
points are of the same precision as that implied by the initial choice of control limits.
It is also assumed that the normal practice is to base the action and warning limits on a mean and standard
deviation derived from all available data points (including the latest). Data points corresponding to "out-of-
control" situations for which a definite cause has been identified should not, of course, be included in the
calculations.
Review the last 60 data points on the chart. If there are between 1 and 6 (inclusive) cases where the 2s
warning limits have been exceeded, there is no clear evidence that the precision of analysis has changed. No
revision of the chart is required except, as usual, the incorporation of new data points into the estimates of s
and x̄.
If there are either no cases where the warning limits have been exceeded or more than 6 cases, it may be
concluded with approximately 90 % confidence that the precision has changed (improved or degraded,
respectively) and that a revision of the action and warning limits is needed.
In this case, recalculate the control limits on the basis of the mean and standard deviation of the last 60 points
and proceed as usual.
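The review rule above may be expressed as a short check. The following Python sketch is illustrative only; the function name `review_last_points` and the returned labels are assumptions made for this example, while the window of 60 points and the thresholds of 0 and 6 exceedances are those stated in the text.

```python
def review_last_points(values, mean, s, window=60):
    """Count 2s warning-limit exceedances in the last `window` points.

    Returns (count, decision), where decision is one of:
      "unchanged" - 1 to 6 exceedances, no revision of the limits needed
      "improved"  - no exceedances, precision appears to have improved
      "degraded"  - more than 6 exceedances, precision appears degraded
    "improved" and "degraded" both call for recalculated limits.
    """
    if len(values) < window:
        raise ValueError("not enough data points for the review")
    recent = values[-window:]
    count = sum(1 for v in recent if abs(v - mean) > 2 * s)
    if count == 0:
        return count, "improved"
    if count > 6:
        return count, "degraded"
    return count, "unchanged"
```

Data points from out-of-control situations with an identified cause would be removed from `values` before the check is run.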
Whenever new control limits are calculated as a result of a change in precision, review the new standard
deviation (and where appropriate the bias implied by the new mean) against the accuracy targets which apply
to the analyses in question. Take corrective action if necessary.
The above procedure need not be carried out each time a new data point is generated. This check on the
validity of the current control limits might be worthwhile after, for example, 20 successive points have been
plotted - though any obvious changes in the operation of the chart would warrant immediate concern.
The design and the criteria of decision of the recovery control chart are similar to those of the mean control
chart.
For the construction of a recovery control chart it is recommended to run a trial period of tests.
$$\mathrm{Rec}_i = \frac{(x_a - x_0) \cdot 100}{c_a}$$
where
$x_a$ is the analytical result (for example concentration) of the determinand in the spiked sample;
$x_0$ is the analytical result (for example concentration) of the determinand in the original sample;
$c_a$ is the concentration or mass respectively of the spiked determinand. This assumes negligible dilution
of the sample by the spiked addition.
After completion of the trial period the following statistical characteristics are derived from the recoveries Reci
(n ≥ 20):
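Calculation:
$$\overline{\mathrm{Rec}} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{Rec}_i \quad (\%)$$
$$s_{\mathrm{Rec}} = \sqrt{\sum_{i=1}^{n} \frac{(\mathrm{Rec}_i - \overline{\mathrm{Rec}})^2}{n - 1}} \quad (\%)$$
$$UW = \overline{\mathrm{Rec}} + 2 s_{\mathrm{Rec}}$$
$$LW = \overline{\mathrm{Rec}} - 2 s_{\mathrm{Rec}}$$
$$UA = \overline{\mathrm{Rec}} + 3 s_{\mathrm{Rec}}$$
$$LA = \overline{\mathrm{Rec}} - 3 s_{\mathrm{Rec}}$$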
The recovery chart is constructed and maintained in the same way as described in 7.6.2.1. For the calculation
of the statistical parameters, x and s should be replaced with Rec and sRec, respectively.
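As an illustration of the recovery calculation, the following Python sketch computes individual recoveries and the chart limits from a trial period. The function names are hypothetical; the formulae are those given above, with s taken as the sample standard deviation (divisor n - 1).

```python
from statistics import mean, stdev


def recovery_percent(x_a, x_0, c_a):
    """Recovery (%) from spiked result x_a, unspiked result x_0, spike c_a."""
    return (x_a - x_0) * 100.0 / c_a


def recovery_chart_limits(recoveries):
    """Mean recovery with 2s warning and 3s action limits (all in %)."""
    if len(recoveries) < 20:
        raise ValueError("the trial period requires at least 20 recoveries")
    rec_bar = mean(recoveries)
    s_rec = stdev(recoveries)
    return {
        "mean": rec_bar,
        "UW": rec_bar + 2 * s_rec,
        "LW": rec_bar - 2 * s_rec,
        "UA": rec_bar + 3 * s_rec,
        "LA": rec_bar - 3 * s_rec,
    }
```

For example, a spiked result of 15, an unspiked result of 5 and a spike of 10 (same units) give a recovery of 100 %.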
At least 20 control values (n ≥ 20) are required for the pre-period. The control value is the relative range $R_{\mathrm{rel},j}$:
$$R_{\mathrm{rel},j} = \frac{x_{i,\max} - x_{i,\min}}{\bar{x}_i} \cdot 100\ \%$$
with:
$$\bar{x}_i = \frac{1}{n}\sum_{i=1}^{n} x_i$$
After the completion of a preliminary test period, the relative range values $R_{\mathrm{rel},j}$ (n ≥ 20) are used to calculate
the following statistical parameters:
$$\bar{R}_{\mathrm{rel}} = \frac{1}{n}\sum_{j=1}^{n} R_{\mathrm{rel},j} \quad (\%)$$
$$LA = \bar{R}_{\mathrm{rel}} \cdot D_{LA} \quad (\%)$$
Several calculation models may be used to estimate the action limits for this type of control chart. For
application in routine work it is recommended that only the upper action limits (UA) be calculated; warning
limits can be calculated additionally (see [33]). When performing replicate determinations (duplicate to six-fold),
the lower action limit (LA) is identical with the abscissa (zero-line).
The numerical values for the factors DUA and DLA for P=99,7 % are:
DLA (P=99,7 %) 0 0 0 0
NOTE For further numerical values for the factors DUA and DLA refer to [16].
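The relative range calculation can be sketched as follows. This Python example is a rough illustration under stated assumptions: the function names are hypothetical, the upper action limit is assumed to be formed by analogy with the lower one (mean relative range multiplied by the factor DUA), and the factor itself is to be supplied by the user from the tabulated values, since none is assumed here.

```python
def relative_range(replicates):
    """Relative range (%) of one set of replicate results."""
    x_bar = sum(replicates) / len(replicates)
    return (max(replicates) - min(replicates)) / x_bar * 100.0


def range_chart_parameters(pre_period, d_ua):
    """Mean relative range and upper action limit (both in %).

    pre_period: list of replicate sets, at least 20 of them
    d_ua: tabulated factor DUA for the number of replicates used
          (must be looked up; no value is built in here)
    """
    if len(pre_period) < 20:
        raise ValueError("the pre-period requires at least 20 control values")
    r_rel = [relative_range(r) for r in pre_period]
    r_rel_bar = sum(r_rel) / len(r_rel)
    return r_rel_bar, r_rel_bar * d_ua
```

With DLA equal to zero for duplicate to six-fold determinations, the lower action limit coincides with the zero line and needs no separate calculation.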
The quality control chart is intended to identify changes in random or systematic error.
The following criteria for out-of-control situations are recommended for use with Shewhart charts:
⎯ 10 out of 11 consecutive control values being on one side of the central line.
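The criterion of 10 out of 11 consecutive control values lying on one side of the central line lends itself to automated checking. The sketch below is one possible Python formulation; the function name is hypothetical and values lying exactly on the central line are counted on neither side, which is an assumption of this example.

```python
def ten_of_eleven_one_side(values, centre):
    """True if any 11 consecutive values have at least 10 on one side."""
    for start in range(len(values) - 10):
        window = values[start:start + 11]
        above = sum(1 for v in window if v > centre)
        below = sum(1 for v in window if v < centre)
        if above >= 10 or below >= 10:
            return True
    return False
```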
The following out-of-control situations apply to the range type of control chart if:
⎯ a range Rrelj falls below the lower action limit (valid only for LA > 0); or
A cyclic variation of ranges may be caused, for example, by regularly scheduled maintenance of an
analytical instrument or by re-preparation of reagents.
7.7 Conclusions
An out-of-control situation occurring on a control chart implies that an important error might apply to the
analysis of the routine samples. It is very important to immediately identify and eliminate the cause of the error
in order to maintain control over the performance of the analytical system. For fast and effective identification
of the source of analytical error, the approach described in the following subclauses is recommended.
7.7.1.1 Initial investigation to identify gross errors or deviations from the analytical procedure
The analysis of the control sample is repeated, strictly following the analytical method and avoiding possible
gross errors. If the new result of the control sample shows that the method is under control again, it may be
assumed that the method of analysis had not strictly been observed on the previous batch of analyses or that
a gross error had occurred. The entire batch should then be re-analysed.
If, however, the result of the analysis of the control sample is erroneous but reproducible, a systematic error is
very likely to exist.
To check for systematic errors, several different trueness control samples are analysed. To detect errors
depending on the reagents or the method, control samples should be used whose concentrations cover the
entire measuring range. As a minimum, a trueness control sample in the lower and one in the upper part of
the working range should be used. In the event of a systematic error with results predominantly being higher
or lower than the actual values, a step by step examination should be performed to find the reason for this
bias. Exchanging experimental parameters, such as reagents, apparatus or staff, might help to identify
this type of error quickly.
The precision can also be improved by a step-by-step approach to find the causes of random error.
The total precision of an analytical method can be improved by examining its individual procedural steps to
find the one which contributes most to the total error.
There could be errors which may not be detected by a statistical approach to quality control. In the majority of
such cases, this concerns errors influencing individual analyses in a batch, but not ones before or after. This
type of error can only be revealed by means of plausibility controls - checks on the observed value in relation
to expectations based on previous knowledge. Such knowledge may be based on chemical consideration, for
example checks on the equivalence of anions and cations in a sample, or a prior expectation, for example that
COD will be greater than BOD.
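The two plausibility checks mentioned above could be coded along the following lines. This Python sketch is illustrative only: the function name, the way the ion imbalance is expressed (difference divided by sum of equivalents) and the default tolerance of 5 % are assumptions of this example, not values taken from this International Standard.

```python
def plausibility_flags(cod, bod, cations_meq, anions_meq,
                       max_imbalance_pct=5.0):
    """Return a list of plausibility warnings for one sample.

    cod, bod: results in the same units
    cations_meq, anions_meq: sums of equivalents of cations and anions
    max_imbalance_pct: hypothetical tolerance for the ion balance
    """
    flags = []
    if cod <= bod:
        flags.append("COD is not greater than BOD")
    total = cations_meq + anions_meq
    if total > 0:
        imbalance = abs(cations_meq - anions_meq) / total * 100.0
        if imbalance > max_imbalance_pct:
            flags.append("anion/cation equivalents do not balance")
    return flags
```

An empty list means that neither check raised a concern; it does not by itself demonstrate that the result is correct.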
A successful approach to plausibility control requires that appropriate background information is available. The
procedure of plausibility control may be subdivided into two parts:
⎯ Information/harmonization;
Plausibility control may form a worthwhile additional check to supplement routine AQC. A large proportion of
failures on the basis of plausibility control (which is not mirrored by routine AQC) suggests an inadequate
routine system of quality control or a system which is not stable in its operation.
In the event of repeatedly occurring out-of-control situations being detected in the control charts, the initial
tests for implementation of analytical quality control, as described in Clause 6, should be performed with the
matrix in question, if the out-of-control situations cannot be remedied by simpler actions, such as exchange of
vessels, apparatus or reagents.
7.8 Control charts with fixed quality criteria (target control charts)
In contrast to the classical control charts of the Shewhart type described in 7.4, the target control charts
operate without statistically evaluated values. The bounds for this type of control chart are given by externally
prescribed and independent quality criteria. A target control chart (for the mean, the true value, the blank
value, the recovery rate, the range) is appropriate if
⎯ there is no normal distribution of the values from the control sample (for example, blank values);
⎯ the Shewhart or range control charts show persisting out-of-control situations;
⎯ there are not enough data available for the statistical evaluation of the bounds;
⎯ there are externally prescribed bounds which should be applied to ensure the quality of analytical values.
The control samples for the target control charts are the same as for the classical control charts as described
in 7.4 to 7.6.
⎯ standards of analytical methods and requirements for internal quality control (IQC);
⎯ the (at least) laboratory-specific precision and trueness of the analytical value, which have to be ensured;
⎯ the evaluation of data known within the laboratory for the same sample type (see 7.4).
The chart is constructed with an upper and lower bound. A pre-period is inapplicable. The target control chart
of the range needs only the upper bound.
The analytical method is out-of-control if the analytical value is higher or lower than the respective prescribed
bounds.
When target control charts are applied, the analytical method is formally out-of-control only if the analytical values
are outside the prescribed bounds. Nevertheless, trends in the analytical quality should be identified and steps
should be taken against them. Helpful hints are given in 7.7.
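The decision rule for a target control chart is simple enough to state in code. In the Python sketch below, the function name and the returned labels are hypothetical; the lower bound is optional because the target control chart of the range needs only the upper bound.

```python
def target_chart_status(value, upper, lower=None):
    """Compare a control value with externally prescribed bounds.

    lower may be omitted, as for the target control chart of the range.
    """
    if value > upper:
        return "out-of-control"
    if lower is not None and value < lower:
        return "out-of-control"
    return "in control"
```

A result of "in control" refers only to the formal criterion; trends within the bounds still need to be looked for separately.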
It is emphasized that, as with the analytical stage, the initial selection of soundly based sampling
procedures is of fundamental importance. Indeed, given the difficulty of assessing by practical tests
many of the potential errors which may arise during sampling, the need for careful initial selection of equipment
and procedures is probably even more crucial than in analysis. Similarly, control tests of sampling and
sample handling have the same basic objectives as their counterparts in analysis, namely to ensure that
any important deterioration of the accuracy of results arising from these steps is detected as rapidly as
possible, so that corrective action can be taken.
Guidance on quality control and quality assurance of sampling is given in ISO 5667-14.
⎯ collaborative studies for validation of a candidate method for standardization as specified in ISO 5725 [1];
⎯ interlaboratory tests to determine a consensus (certified) value for the composition of a reference test
material as specified in ISO Guide 35 [10];
⎯ proficiency testing as specified in ISO/IEC Guide 43 [11] and ISO 13528 [6];
⎯ collaborative studies to estimate the accuracy of data produced by a group of laboratories which share a
common interest using the Youden method [48], known as the paired sample technique.
For further details see ILAC Guide 13 [39] and IUPAC Harmonized Protocol [47] as well as guidelines [14],
[17], [18], [19], [29], [30].
Some multistage analytical procedures, for example the determination of trace organic contaminants, are
capable of producing relatively few results at a time. This raises the question of how to implement quality
control measures which were initially put into practice with high-throughput techniques. The argument that
because organic analyses are time-consuming they should not be subject to performance tests of the same
complexity as, for example, nutrient determinations is unsound.
An analytical result which takes hours to produce should be supported by performance and quality control
information of at least the same reliability as that associated with 'simple' determinations. Indeed, because
trace analysis is subject to greater uncertainty and is more costly to repeat, it can be argued that
proportionally more effort needs to be directed towards quality control. The maxim that relatively few results of
known and adequate accuracy are better than many results of unknown and probably inadequate accuracy
remains true.
The stated approach to tests of precision and recovery should not be regarded as an ideal only attainable
under favourable circumstances. Rather, it is the minimum of testing which will provide a modestly reliable
indication of performance. For trace analysis, there is a strong case for expanding the range of samples tested
to include checks on precision and recovery from samples of differing matrices. Where limit of detection is of
special interest, it is particularly important that a pooled estimate is obtained from many batches of tests.
Replicate determinations performed on a single occasion are likely to give an unreliable and probably
optimistic estimate.
Similarly, the approach to routine quality control should follow the recommendations given in Clause 7.
Particular attention should be paid to the implementation of recovery control charts or other means of
monitoring and controlling recovery through the whole process.
The procedures recommended for preliminary performance tests (Clause 6) and routine quality control
(Clause 7) are most easily put into practice for analyses which are carried out regularly and often. It is
necessary to consider what approach to quality control should be adopted for analyses which may be
performed infrequently or which may be undertaken only once. The same considerations apply to analyses
carried out over a short period in relatively few batches.
Two main features distinguish this type of analysis from frequent, regular determinations.
Firstly, any quality control activity is likely to take up a relatively large proportion of the total analytical effort
compared with routine analyses. This is inconvenient and expensive, but it is a consequence of organizing
analysis in this way. It should not be used as an excuse to avoid evaluation of the analytical system. Any
analytical system used to produce data should be tested to provide an estimate of its performance. Not to test
would be to provide data of unknown accuracy. This is unacceptable to users of analytical data. Tests as
described in Clause 6 are recommended as a means of providing background performance data for all
analytical systems.
Secondly, it is not possible to establish and maintain a state of statistical control in relatively few batches of
analysis. This is an important drawback of not carrying out frequent, regular batches of analysis. It may be a
reason to subcontract analytical work to laboratories which perform the
determination in question frequently. However, when analyses are carried out on a one-off or ad hoc basis the
following approach is recommended.
The proportion of samples analysed more than once should not be less than 20 % but could be as large as
100 % in the case of very small batches or highly important analyses. Single analysis of samples is an
acceptable approach only when a state of statistical control can be established and maintained.
The Harmonised Guidelines [40] recommend that all test materials (samples) are analysed in duplicate. In
addition the use of spiking or recovery tests or use of a formulated control material, with different
concentrations of analyte if appropriate, is recommended. If possible, procedural blanks should also be carried
out. As no control limits are available, the estimates of bias and precision obtained should be compared with
values derived from fitness for purpose.
Annex A
(informative)
A.1 General
One of the most commonly occurring types of bias in the analysis of water is the interference produced by
substances other than the determinand. The magnitude of the interference depends on the combined effects of
all individual interfering substances and of any other substances that modify the effects of those
substances. Interference may lead to positive or negative bias and the size of the effect may
depend on the concentration of the determinand as well as that of the interferent. "Interference" can be best
defined as follows:
"For a given analytical system, a substance is said to cause interference if its presence in the original sample
for analysis and/or in the sample during analysis leads to systematic error in the analytical result, whatever the
sign and magnitude of the error."
If the magnitude of the interference effects is to be assessed for the effects of individual or combinations of
substances, two general and important points follow:
⎯ In general, analytical methods suffering from as few interference effects as possible should be used.
⎯ It is advantageous to use analytical methods for which the principles and mechanisms are well known, so
that likely interferences and their magnitude can be predicted.
The final choice of analytical method generally depends on a number of other factors which need to be
considered along with the above points. If the main aim is, however, to minimize the bias of analytical results,
the importance of the above points cannot be over-emphasized.
A means of continually monitoring for the effect of interference caused by other substances is to carry out
spiking recovery tests. This is achieved by adding known amounts of the determinand to the sample under
examination and assessing the recovery of that addition by carrying out at least duplicate analyses. It is not
unreasonable to expect an achievement of 100 % ± 5 % recovery for most determinands.
A.2 Procedure
If the bias $B_{jk}$ due to the presence of a given concentration of an interfering substance (k) for a determinand
concentration of $c_j$ is to be assessed, replicate analyses should be carried out on a solution containing only
the substance and determinand at the specified concentration. The mean result $\bar{R}_{jk}$ is calculated and the
estimate of bias is then given by:
$$B_{jk} = \bar{R}_{jk} - c_j$$
If the calibration parameters for the method do not vary appreciably over a number of batches of analyses, the
above approach can be used. If this condition does not apply, a second approach should be adopted where n
portions of a second solution containing only the determinand at a concentration $c_j$ are also analysed at the
same time and in the same way as the first solution mentioned above. If the mean analytical results for the
solutions with and without the interfering substance are denoted by $\bar{R}_{jk}$ and $\bar{R}_j$ respectively, the estimate of
bias $B_{jk}$ is then given by:
$$B_{jk} = \bar{R}_{jk} - \bar{R}_j$$
The advantage of the second approach is that the bias, $B_{jk}$, is assessed from two results, the
difference between which should be attributable only to the interfering substance. With this approach there is
less need for an accurate calibration and it is therefore the most appropriate means of testing bias
caused by interfering substances.
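The two ways of estimating the bias described in this clause can be compared in a brief Python sketch. The function names are hypothetical; the first function uses the nominal determinand concentration, the second uses the mean result for a solution without the interfering substance analysed in the same batch.

```python
from statistics import mean


def bias_against_nominal(results_with, c_j):
    """First approach: mean result with interferent minus nominal c_j."""
    return mean(results_with) - c_j


def bias_against_reference(results_with, results_without):
    """Second approach: difference of the two mean results."""
    return mean(results_with) - mean(results_without)
```

If, for instance, the solution without the interfering substance itself reads slightly high because of a calibration error, the second estimate is unaffected by that error whereas the first includes it.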
To assess an interference effect and, as an analyst, be confident that an effect has been detected, it is
important to consider the number of analyses required to make it both practical and valid. The main
considerations are:
a) Since the magnitude of the interference effect may depend on the concentration of the determinand the
effect of any substance chosen for the test should be estimated for at least two determinand
concentrations. It is suggested that the lower and upper limits of the concentration range of interest are
studied if only two concentrations are tested. If there appear to be large discrepancies between these
results then additional intermediate concentrations should also be tested.
b) If a substance is present in a sample then that substance should be considered as a potential
interferent. In this situation, the methodology and analytical system should be reviewed to decide if the
effect of the substance can be considered negligible, so that it can be deleted from the list of possible
interfering substances. Substances with concentrations less than the determinand may also produce
significant interferences and should be tested. It is impractical to test for the effect of all substances
present in complicated matrices such as water and the number of potential interfering substances tested
can and should be reduced by using literature reviews and consideration of the methodology.
c) The effect of the substances identified above should be estimated experimentally at concentrations
slightly greater than the expected maximum level in samples. Substances causing appreciable
interference should be tested at other concentrations.
It should be remembered that any effects produced have only been estimated at the concentrations chosen
for the test for a particular sample matrix. If other sample types are analysed using that method, the
interference information obtained may not be applicable and other tests should be carried out to determine if
there is an effect and to what level. A second problem may be that an effect may occur at concentrations less
than the concentration level tested. Knowledge of the physical and chemical mechanisms should identify
substances which may cause interference at lower concentrations. Of course the major problem is that the
larger the number of substances to be tested the greater the analytical effort. The final difficulty to be
mentioned is that the magnitude of an effect caused by a substance may depend on the concentration of a
second substance. Again, detailed knowledge of the analytical method should provide sufficient information to
identify likely effects caused by interaction of those substances. It is useful to test the effects of a few
combinations of at least the major components of samples.
Table A.1 — Example of experimental design for one batch of interference tests
In this table Sjk denotes a sample with determinand concentration cj and the kth other substance present at a
defined concentration; k = 0 corresponds to no other added substance.
Other batches of similar design would be analysed until all the substances at all concentrations and
combinations of interest had been tested.
Another way of reducing random error is to carry out at least two analyses for each solution, Sjk. The more
analyses carried out the greater the reduction in random error. Generally duplicate analyses are acceptable.
This provides, in addition to reducing random error, an estimate of those errors from the tests themselves. It is
statistically possible to estimate the number of replicate analyses needed to obtain statistically acceptable
results, but this often makes the test impractical if large numbers of analyses are required.
$$B_{ik} = \bar{R}_{ik} - \bar{R}_i$$
where $\bar{R}_{ik}$ is the mean analytical result for the solution containing the other substance, and $\bar{R}_i$ is the result for
a solution not containing the other substance, but containing the same concentration of the determinand
(which may be zero).
When B ik is calculated using this equation, blank-correction is not needed. The differences, B ik , are the
primary experimental estimates of the interference effects and it is strongly recommended that the individual
differences be reported together with their limits at a defined confidence level for each concentration of
determinand (i.e. level of i). The precision of analytical results is assumed here to depend on the
concentration of determinand but to be unaffected by the presence of the other substances. Table A.2 gives
an example of a suitable format for presenting the results.
The results in the table have 100 (1-α) % confidence limits equal to the result ± $L_0$ and the result ± $L_i$ for the
determinand concentrations $c_0$ and $c_i$ respectively.
The method of presenting the results shown in Table A.2 allows rapid examination to identify those substances
causing statistically significant effects - i.e. those for which the interval $B_{0k} \pm L_0$ and/or $B_{ik} \pm L_i$ does not include zero. The
biases observed for such substances are directly recorded. They may easily be converted to relative effects if
desired, and may also be easily assessed at other confidence levels if required. Further, the table shows the
results for any substances whose apparent effects have not achieved statistical significance.
The calculation of the confidence limits L 0 and L i is performed as follows (assuming that precision is not affected
by the other substances, that m replicate analyses are made for each sample, that n other substances are tested
and that the means are normally distributed):
A.4.1 Estimate the variance, $s_{jk}^2$, for each solution from its m results:
$$s_{jk}^2 = \frac{\sum_{l} R_{jkl}^2 - \left(\sum_{l} R_{jkl}\right)^2 / m}{m - 1}$$
where subscript l refers to the lth replicate result for a sample.
A.4.2 Combine all estimates of variance with a given value of j, for each value of j, to obtain pooled estimates
of variance, $s_j^2$:
$$s_j^2 = \frac{\sum_{k} s_{jk}^2}{n + 1}$$
When m = 2 the following equation can be used:
$$s_j^2 = \frac{\sum_{k} (R_{jk1} - R_{jk2})^2}{2(n + 1)}$$
A.4.3 Form the confidence limits, $L_j$, for the differences $\bar{R}_{jk} - \bar{R}_j$, for each value of j:
$$L_j = t \left(2 s_j^2 / m\right)^{1/2}$$
where t is the tabulated value of the t-statistic at the desired confidence level (for example t = 2,45 for α = 0,05
and n = 5, m = 2).
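Steps A.4.1 to A.4.3 can be followed in a short Python sketch. The function names are hypothetical, and the value of t is passed in as a parameter to be taken from statistical tables (for example 2,45 in the case quoted above) rather than computed.

```python
def replicate_variance(results):
    """A.4.1: variance of the m replicate results for one solution."""
    m = len(results)
    sum_sq = sum(r * r for r in results)
    return (sum_sq - sum(results) ** 2 / m) / (m - 1)


def pooled_variance(variances):
    """A.4.2: pooled variance for one determinand level j.

    variances holds one estimate per solution, i.e. n + 1 values
    (n other substances plus the solution with none added).
    """
    return sum(variances) / len(variances)


def confidence_limit(s2_j, m, t):
    """A.4.3: confidence limit for a difference of two means of m results."""
    return t * (2.0 * s2_j / m) ** 0.5
```

With duplicate results of 1 and 3, for example, `replicate_variance` returns 2, which agrees with the simplified expression for m = 2, namely the squared difference divided by 2.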
$$B_{jk} = \bar{R}_{jk} - c_j$$
where
$$s_{jk}^2 = \frac{\sum_{l} R_{jkl}^2 - \left(\sum_{l} R_{jkl}\right)^2 / m}{m - 1}$$
For determinand concentration of 0 µg/l (i.e. j = 0):
a If the other substances had no effect, results would be expected to lie (95 % confidence limits) within the following ranges:
$$B_{jk} = \bar{R}_{jk} - \bar{R}_j$$
$$s_{jk}^2 = \frac{\sum_{l} R_{jkl}^2 - \left(\sum_{l} R_{jkl}\right)^2 / m}{m - 1}$$
For sodium fluoride (k = 1) and for an arsenic concentration of 0 µg As/l (j = 0):
Similarly, for sodium selenite (k = 2) and for an arsenic concentration of 0 µg As/l (j = 0):
$s_{02}^2$ = 0,000 4
and for sodium selenite (k = 2) and for an arsenic concentration of 20 µg As/l (j = 1):
$s_{12}^2$ = 0,000 7.
$$s_j^2 = \frac{\sum_{k} s_{jk}^2}{n + 1}$$
a If the other substances had no effect, results would be expected to lie (95 % confidence limits) within the following ranges:
Annex B
(informative)
B.1 General
The following clauses provide a succinct discussion of the nature and origin of errors in analytical results for
waters and effluents. Further information on many of the topics covered is given elsewhere in this International
Standard, and the subject is also discussed extensively in [38].
The total error, E, of an analytical result, R, is defined as the difference between that result and the true value,
T, i.e.
E = R – T.
Repeated analysis of identical portions of the same, homogeneous sample does not, in general, lead to a
series of identical results 1). Rather, the results are scattered about some central value. The scatter is
attributed to random error, so called because the sign and magnitude of the error of any particular result vary
at random and cannot be predicted exactly. Precision is said to improve as the scatter becomes smaller - i.e.
as random error decreases - and imprecision is therefore a synonym for random error.
Because random errors are always present in analytical results, statistical techniques are necessary if correct
inferences regarding true values are to be made from the results.
Terms such as 'repeatability' and 'reproducibility' have specialized meanings in the context of interlaboratory
collaborative trials. In this International Standard, random error is quantified in terms of the standard deviation,
σ. Since exact measurement of the standard deviation generally requires an infinite number of repeated
results, only estimates, s, of σ will usually be obtainable. The number of degrees of freedom (DF) of the
estimate provides an indication of its worth; as the number of degrees of freedom increases, the random error
of the estimate itself, s, decreases.
Systematic error (or bias) is present when there is a persistent tendency for results to be greater, or smaller,
than the true value. The mean of n analytical results for identical portions of a stable, homogeneous sample
approaches a definite, limiting value, µ, as n is increased indefinitely. When µ differs from the true value, T,
results are said to be subject to systematic error or bias, β, where:
β = µ - T
1) This may not be true when the discrimination of the analytical system is coarse. However, the apparent perfect
concordance of repeated results in such a situation is illusory, because samples differing in concentration will also give the
same results.
Because an indefinitely large number of determinations cannot be made on a single sample, the effect of
random error prevents exact determination of µ, and hence also of β. Only an estimate, x̄, of µ will generally
be available, so that only an estimate, b, of β can be obtained.
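The estimates discussed in this clause can be obtained from replicate results on a sample of known true value, as in the following Python sketch. The function name and the dictionary keys are hypothetical; b is the estimate of bias, s the estimate of the standard deviation, and the number of degrees of freedom is taken as n - 1.

```python
from statistics import mean, stdev


def estimate_errors(results, true_value):
    """Estimate bias b and standard deviation s from replicate results."""
    x_bar = mean(results)  # estimate of the limiting mean
    return {
        "x_bar": x_bar,
        "b": x_bar - true_value,  # estimate of the bias
        "s": stdev(results),      # estimate of the standard deviation
        "dof": len(results) - 1,  # degrees of freedom of s
    }
```

Both b and s are themselves subject to random error, which decreases as the number of replicate results increases.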
B.3 Sources of error
The distinction between random and systematic errors is important for two reasons: first, because they have
different effects on the use to be made of analytical results, and second, because they usually have different
origins.
Random errors arise from uncontrolled variations in the conditions of the analytical system 2) during different
analyses. Such variations include, for example, differences in the volume of sample or
reagent taken on different occasions, fluctuations in temperature - either in time, or across the different
sample positions in a heating bath, block or oven, fluctuations in instrumental conditions (for example in
temperatures, fluid flowrates, voltages and wavelengths) and operator-induced variations in reading scales.
Variations from batch to batch, in the extent to which the calibration function represents the true calibration
for that batch, also give rise to between-batch random errors, whereas a consistent calibration error across
many batches gives rise to systematic error - see below.
Whilst many of these factors causing random errors can often be more closely controlled to achieve better
precision, they can never be totally eliminated, so that all results are subject to some degree of random error.
There are five general sources of systematic error (if clear blunders by the analyst in following the written
method, and bias introduced by the sample collection itself are both excluded).
These are:
This is a potentially important source of error in many cases, and evidence should always be obtained - either
from the literature or by direct test - to ensure that unacceptable bias is not introduced by this factor. Effective
sample stabilization procedures are available for many determinands, but they should be compatible with the
analytical system being employed, and with the particular sample type being analysed.
Many substances exist in water in a variety of physical and/or chemical forms (or 'species'). For example, iron
can exist in both dissolved and particulate forms, and within each of those physical categories a variety of
chemical species may be present - for example free ions and complexes, including those of different oxidation
states, in the dissolved phase. An inability of the analytical system to determine some of the forms of interest
will give rise to a bias when those forms are present in samples.
2) The analytical system is the combination of all factors - analyst, equipment, method, reagents, etc. involved in
producing analytical results from samples.
Some determinands are overall properties of a sample, rather than a particular substance - for example
biochemical oxygen demand (BOD). Such determinands are called 'non-specific' and have to be carefully
defined by specifying the use of a particular analytical method. The so-called 'dissolved' fractions of, for
example trace metals, are also non-specific in the sense that the type and pore-size of filter to be used in their
determination should be clearly specified.
c) Interferences
Few analytical methods are completely specific for the determinand. Response to another substance (for
example, response to iron by a spectrophotometric procedure for manganese based on formaldoxime) will
give rise to biased results when that substance is present in samples, and it is important that the effects of all
such interferents likely to be present in samples are known before a new method is applied routinely.
In some cases, the effect of another substance is to alter the chemical state of the determinand such that it is
not measured by the method being used - for example, the presence of fluoride will cause aluminium
complexes to form, which may not be measured by an ion-selective electrode. Such an effect can be regarded
as an interference upon the determination of total dissolved aluminium, or as a failure to recover all forms of
dissolved aluminium. Although it more strictly falls into the latter category, the effect - and others like it - may
be most conveniently treated as an interference when data on performance characteristics are being obtained
or reported (see Annex A, A.4).
d) Biased calibration
Most methods require the use of a calibration function (explicit or implicit) to convert the primary analytical
response for a sample to the corresponding determinand concentration. If the samples and calibration
standards are treated in exactly the same manner (and provided that the materials used to prepare the
calibration standards are of adequate purity) no systematic error should arise from the calibration. (It has been
noted in B.3.1 that any variations in the correctness of the calibration from batch to batch will be manifested as
between-batch random errors).
If, however, samples and calibration standards are treated differently, this can represent a potentially serious
source of error. Thus, for example, a method prescribing some form of pre-concentration of the determinand
from samples, but employing direct calibration with standards not taken through the pre-concentration step,
will give rise to negative bias if the pre-concentration recovery is less than 100 %. In such cases, evidence
should be obtained on the accuracy of the prescribed calibration, or the difference in treatment of samples and
standards should be eliminated.
Impurity of the material used to prepare calibration standards is, of course, another potential cause of biased
results.
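The pre-concentration case described in d) can be illustrated with a minimal sketch. It assumes a linear response, a fractional recovery for the pre-concentration step, standards that bypass that step, and no other source of bias. The function name and the example recoveries are hypothetical and are not taken from this guide.

```python
def relative_bias_percent(recovery: float) -> float:
    """Relative bias (%) of results when samples pass through a
    pre-concentration step with the given fractional recovery while the
    calibration standards do not (assumed: linear response, no other bias)."""
    if recovery <= 0.0:
        raise ValueError("recovery must be a positive fraction, e.g. 0.95")
    # A sample with true concentration c is reported as recovery * c,
    # so the relative bias is (recovery - 1) expressed in percent.
    return (recovery - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical recoveries chosen only for illustration.
    for r in (1.00, 0.95, 0.80):
        print(f"recovery {r:.2f} -> relative bias {relative_bias_percent(r):+.1f} %")
```

A recovery below 1 always yields a negative value, in line with the negative bias described above; taking the standards through the same pre-concentration step removes the effect.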
e) Biased blank
The same considerations as in d) above apply to blanks. There is, however, another source of bias arising
from blank correction. If the water used for the blank contains the determinand, results for samples will be
biased low by an equivalent amount unless a correction for the determinand content of the blank water is
applied. Ideally, however, a source of blank water should be obtained, such that the determinand content is
negligible in comparison with the concentration in samples.
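The effect of determinand in the blank water can be sketched as follows. The model is an assumption made for illustration only: the sample reading is the true concentration plus a procedural contribution, and the blank reading is the same procedural contribution plus the determinand content of the blank water. All names and numbers are hypothetical.

```python
def blank_corrected_result(sample_reading: float,
                           blank_reading: float,
                           blank_water_content: float = 0.0) -> float:
    """Blank-corrected concentration.

    blank_water_content is the (separately determined) determinand
    concentration of the water used for the blank; adding it back
    compensates for the over-subtraction caused by the blank.
    """
    return sample_reading - blank_reading + blank_water_content


if __name__ == "__main__":
    # Hypothetical values in mg/l.
    true_conc, procedural, water_content = 5.0, 0.3, 0.2
    sample_reading = true_conc + procedural
    blank_reading = procedural + water_content

    uncorrected = blank_corrected_result(sample_reading, blank_reading)
    corrected = blank_corrected_result(sample_reading, blank_reading, water_content)
    print(f"without correction: {uncorrected:.2f} mg/l")
    print(f"with correction:    {corrected:.2f} mg/l")
```

Under these assumptions the uncorrected result is low by exactly the determinand content of the blank water, which is why blank water with a negligible content is preferred.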
Annex C
(informative)
C.1 Foreword
With an increasing reliance on measurement uncertainty as a key indicator of both fitness for purpose (4.9)
and reliability of results, analytical chemists will increasingly undertake measurement validation to support
uncertainty estimation. Measurement uncertainty is accordingly treated briefly in Annex C as a performance
characteristic of an analytical method.
The second procedure, described in detail in ISO/TS 21748 [7], uses the results of collaborative studies
established under the principles of ISO 5725 [1]. Here the reproducibility standard deviation is the decisive
figure. A second step is to check whether the internal precision data of the laboratory agree with the repeatability
standard deviation of the collaborative study. If they differ significantly, it is recommended that the internal
repeatability standard deviation, si, is used instead of the repeatability standard deviation, sr, of the
collaborative study. This can lead to higher or lower values of the measurement uncertainty.
C.3.1 General
In this model the within-laboratory reproducibility (Rw) is combined with estimates of the method and
laboratory bias. The details are described in the NORDTEST handbook [44].
Before calculating or estimating the measurement uncertainty, it is recommended that the needs of the
customers are established. Subsequently, the main aim of the actual uncertainty calculations will be to check
whether the laboratory is able to fulfil the customer demands on the analytical method in question. However,
customers are not used to specifying demands, so in many cases the demands need to be set in dialogue
with the customer. In cases where no demands have been established, a guiding principle could be that the
calculated expanded uncertainty, U, should be approximately equal to, or less than, 2 times the reproducibility,
sR.
The flow scheme presented in this section forms the basis for the method outlined in the NORDTEST
handbook [44]. The flow scheme, involving 6 defined steps, should be followed in all cases. For each step,
there may be one or several options for finding the desired information.
Before starting, always identify the main error sources to make sure that they are included in the calculations.
The measurement uncertainty for NH4-N will thus be reported as ± 6 % at this concentration level.
Summary table
Ammonium in water by ISO 11732: Measurement uncertainty U (95 % confidence interval) is estimated to be
± 6 %. The customer demand is ± 10 %. The calculations are based on control chart limits and interlaboratory
comparisons.
The combined uncertainty, uc, is calculated from the control sample limits and bias estimation from interlaboratory
comparisons. The sR from interlaboratory comparisons can also be used (see 6.2) if a higher uncertainty
estimation is acceptable.
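The combination step can be sketched as below. The sketch assumes the root-sum-of-squares combination of the within-laboratory reproducibility component, u(Rw), and the bias component, u(bias), with a coverage factor k = 2 for approximately 95 % confidence, as used in the NORDTEST approach [44]. The function names and the input values are hypothetical; they are not the figures behind the ± 6 % quoted above.

```python
import math


def expanded_uncertainty(u_rw: float, u_bias: float, k: float = 2.0) -> float:
    """Expanded uncertainty U = k * sqrt(u_rw**2 + u_bias**2).

    u_rw   -- standard uncertainty from within-laboratory reproducibility (%)
    u_bias -- standard uncertainty from method and laboratory bias (%)
    k      -- coverage factor (2 assumed for approximately 95 % confidence)
    """
    u_c = math.sqrt(u_rw ** 2 + u_bias ** 2)  # combined standard uncertainty
    return k * u_c


def meets_demand(u_expanded: float, demand: float) -> bool:
    """True if the expanded uncertainty does not exceed the customer demand."""
    return u_expanded <= demand


if __name__ == "__main__":
    # Hypothetical relative standard uncertainties in percent.
    U = expanded_uncertainty(u_rw=1.7, u_bias=2.5)
    print(f"U = {U:.1f} %, demand of 10 % met: {meets_demand(U, 10.0)}")
```

Keeping the comparison with the customer demand as a separate function mirrors the recommendation to establish that demand before calculating the uncertainty.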
Annex D
(informative)
Analysis of variance
D.1 General
This calculation identifies the different sources of variation and allows the estimation of total standard
deviation. It is a standard statistical operation and is most conveniently performed by computer. Details of
manual calculation are given in most statistical textbooks.
As stated above, analysis of variance is used to give two statistical parameters, the within- and between-batch
mean squares, M0 and M1, respectively. The mean squares are then compared to determine whether M1 is
significantly greater than M0, that is, whether there is a statistically significant between-batch source of error.
The results of an analysis of variance are usually presented in the form of a table, a general example of which
is given in Table D.1.
NOTE The formulas for ANOVA described in this annex are valid only if no exceptions occur, i.e. if no values
are missing. Missing values distort the assumed balanced design; other formulas should be employed for the
resulting unbalanced design.
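Within batches:
  Sum of squares: $S_0 = \sum_{i=1}^{m} \sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2$
  Degrees of freedom: m(n-1)
  Mean square: $M_0 = S_0 / [m(n-1)]$
  Mean square estimates: $\sigma_w^2$
Total:
  Sum of squares: $\sum_{i=1}^{m} \sum_{j=1}^{n} (x_{ij} - \bar{x})^2$, where $\bar{x}$ is the overall mean of all nm results
  Degrees of freedom: nm-1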
Where:
n is the number of replicate analyses within a batch;
m is the number of batches of analysis;
Calculate the batch means ($\bar{x}_i$) and standard deviations ($s_{wi}$) from the n repeats for each of the m batches.
$s_w^2 = \frac{1}{m} \sum_{i=1}^{m} s_{wi}^2$ is the best estimate of the true within-batch variance, $\sigma_w^2$, with m(n-1) degrees of freedom.
Calculate the variance of the batch means, $s_{bm}^2$.
Use an F-test to see if the between-batch variance is significant, i.e. if it is significantly larger than the
within-batch variance.
$F = s_{bm}^2 / (s_w^2 / n)$, which is the estimate of $[\sigma_b^2 + \sigma_w^2 / n] / (\sigma_w^2 / n)$, with (m-1) and m(n-1) degrees of freedom for
the numerator and denominator respectively.
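The calculation steps above can be sketched in Python using only the standard library. This is a sketch under stated assumptions: a balanced design (m batches of n replicates each, no missing values) and a non-zero within-batch variance. The function name and the returned dictionary keys are illustrative choices. The critical F value still has to be taken from a table or a statistics package.

```python
from statistics import mean, variance


def batch_f_statistic(batches):
    """Within/between-batch F statistic for a balanced design.

    batches -- list of m lists, each holding n replicate results.
    Returns the pooled within-batch variance, the variance of the
    batch means, F, and the degrees of freedom of numerator and
    denominator.
    """
    m = len(batches)
    n = len(batches[0]) if m else 0
    if m < 2 or n < 2 or any(len(b) != n for b in batches):
        raise ValueError("balanced design with m >= 2 and n >= 2 required")

    # Pooled within-batch variance: mean of the m batch variances.
    s_w2 = mean([variance(b) for b in batches])
    # Variance of the m batch means.
    s_bm2 = variance([mean(b) for b in batches])
    if s_w2 == 0:
        raise ValueError("within-batch variance is zero; F is undefined")

    return {
        "s_w2": s_w2,
        "s_bm2": s_bm2,
        "F": s_bm2 / (s_w2 / n),
        "df_num": m - 1,
        "df_den": m * (n - 1),
    }


if __name__ == "__main__":
    # Hypothetical duplicate results from three batches.
    print(batch_f_statistic([[10.0, 12.0], [14.0, 16.0], [9.0, 11.0]]))
```

The explicit check for a balanced design reflects the restriction stated in the NOTE in D.1: with missing values these formulas no longer apply.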
EXAMPLE 1
Pooled variance sw2= 135,6 is the estimate of σw2, the within-batch variance, with degrees of freedom,
DF = 10 .
The means are affected by the between-batch and within-batch error. In estimating the variance, the latter has
to be divided by the number of replicate analyses, 2 in this case.
$F = \dfrac{s_{bm}^2}{s_w^2 / 2} \sim \dfrac{\sigma_b^2 + \sigma_w^2 / 2}{\sigma_w^2 / 2}$
This is not significant with 9 and 10 degrees of freedom, and it can be concluded that the estimate of the
between-batch variance is not significantly different from zero. The hypothesis σb = 0 is therefore accepted, and
the total standard deviation is st = sw = 11,6.
EXAMPLE 2
F-test
F = sbm2/(sw2/2) = 3,61
which is significant with m-1 = 9 and m(n-1) = 10(2-1) = 10 degrees of freedom. Since the test statistic exceeds the
F-table value of 3,02 at the 95 % confidence level, the between-batch standard deviation is significant compared
with the within-batch standard deviation and can be estimated as
        Example 1    Example 2
sw      11,6         5,43
sb      0            6,20
st      11,6         8,24
The number of degrees of freedom associated with the 'total standard deviation' estimate is given by the
Satterthwaite formula
$DF_{st} = \dfrac{(s_t^2)^2}{\dfrac{(s_{bm}^2)^2}{m-1} + \dfrac{(n-1)(s_w^2)^2}{m n^2}}$
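The estimates for Example 2 and the Satterthwaite formula can be checked with the sketch below. It assumes that the between-batch variance is obtained as sb2 = sbm2 - sw2/n, which follows from the quantity that sbm2 estimates, and that st2 = sw2 + sb2. The function name is illustrative. The inputs in the usage lines are reconstructed from the rounded figures quoted for Example 2 (sw = 5,43, F = 3,61, m = 10, n = 2), so small rounding differences are to be expected.

```python
import math


def variance_components(s_w2: float, s_bm2: float, m: int, n: int):
    """Between-batch and total standard deviations, plus the
    Satterthwaite degrees of freedom of the total standard deviation.

    s_w2  -- pooled within-batch variance
    s_bm2 -- variance of the batch means
    m, n  -- number of batches and of replicates per batch
    """
    # s_bm2 estimates sigma_b^2 + sigma_w^2 / n (negative estimates set to 0).
    s_b2 = max(s_bm2 - s_w2 / n, 0.0)
    s_t2 = s_w2 + s_b2
    # Satterthwaite formula as given in the text above.
    df_t = s_t2 ** 2 / (s_bm2 ** 2 / (m - 1)
                        + (n - 1) * s_w2 ** 2 / (m * n ** 2))
    return math.sqrt(s_b2), math.sqrt(s_t2), df_t


if __name__ == "__main__":
    s_w2 = 5.43 ** 2            # within-batch variance from Example 2
    s_bm2 = 3.61 * s_w2 / 2     # back-calculated from F = 3,61 with n = 2
    s_b, s_t, df_t = variance_components(s_w2, s_bm2, m=10, n=2)
    print(f"sb = {s_b:.2f}, st = {s_t:.2f}, DF = {df_t:.1f}")
```

Because the degrees of freedom from this formula are generally not an integer, it is a common convention (assumed here, not stated in this annex) to round down before consulting statistical tables.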
Annex E
(informative)
A water sample was analysed in duplicate before and after spiking on each of 10 days. The results are given
in Table E.1.
The spiked solution was made up of 10 ml of standard solution, of 100 mg/l concentration, made up to 100 ml
with real sample.
If the true mean is 92,75 %, then 90 % of estimates of that mean would lie in the range
92,75 ± (1,833 × 2,058), where 1,833 is from Student's t distribution; that is, between 88,98 and 96,52.
The observed recovery is therefore not significantly (α = 0,05) outside the range 95 % to 105 %.
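The interval above can be reproduced with a short sketch. It assumes that 2,058 is the standard error of the mean recovery and that 1,833 is the tabulated Student's t value for 9 degrees of freedom (two-sided 90 %, one-sided 95 %); neither assumption is spelled out in the surviving text. The function names are illustrative.

```python
def recovery_interval(mean_recovery: float, std_error: float, t_value: float):
    """Confidence interval mean +/- t * standard error (all in percent)."""
    half_width = t_value * std_error
    return mean_recovery - half_width, mean_recovery + half_width


def overlaps(interval, lower_limit: float, upper_limit: float) -> bool:
    """True if the interval overlaps the acceptable range [lower, upper]."""
    low, high = interval
    return high >= lower_limit and low <= upper_limit


if __name__ == "__main__":
    ci = recovery_interval(92.75, 2.058, 1.833)
    print(f"{ci[0]:.2f} % to {ci[1]:.2f} %")
    print("overlaps 95 % to 105 %:", overlaps(ci, 95.0, 105.0))
```

The overlap of the interval with the range 95 % to 105 % is what supports the conclusion that the observed recovery is not significantly outside that range.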
Annex F
(informative)
The following table is based on the experience of a laboratory which uses GC-MS. Each week the system runs on 2
consecutive days and produces 60 results (including replicates, calibration and blank values). The remaining 3
days per week are used for preparation and evaluation of the analyses.
Step No.    Time after starting (weeks)    What to do to estimate precision    What to do to estimate trueness
Bibliography
[1] ISO 5725:1994, Parts 1 to 6, Accuracy (trueness and precision) of measurement methods and results
[3] ISO 11843-2:2000, Capability of detection — Part 2: Methodology in the linear calibration case
[4] ISO 11843-3:2003, Capability of detection — Part 3: Methodology for determination of the critical value
for the response variable when no calibration data are used
[5] ISO 11843-4:2003, Capability of detection — Part 4: Methodology for comparing the minimum
detectable value with a given value
[6] ISO 13528:2005, Statistical methods for use in proficiency testing by interlaboratory comparisons
[7] ISO/TS 21748:2004, Guide to the use of repeatability, reproducibility and trueness estimates in
measurement uncertainty estimation
[8] ISO (1993), Guide to the expression of uncertainty in measurement, first edition 1993, corrected and
reprinted 1995
[9] ISO Draft Guide 99999; ISO VIM 2004-05, International Vocabulary of basic and general terms in
metrology (VIM)
[10] ISO Guide 35, Certification of reference materials — General and statistical principles
[11] ISO/IEC Guide 43, Proficiency testing by interlaboratory comparisons — Part 1: Development and
operation of proficiency testing schemes; Part 2: Selection and use of proficiency testing schemes by
laboratory accreditation bodies
[12] APHA-AWWA-WEF (1998) Standard Methods for the Examination of Water and Wastewater, 20th edition,
APHA, Washington
[14] ASTM – American Society for Testing and Materials (1999) Standard Practice for Conducting an
Interlaboratory Study to Determine the Precision of a Test Method E691-99
[15] ASTM (1975) Standard Methods for the Examination of Water and Waste Water 14th edition, 104B,
26-33.
[16] ASTM Manual 7A Manual on Presentation of Data and Control Chart Analysis, Seventh Edition (2002).
[17] ASTM E1763-98 Standard Guide for Interpretation and Use of Results from Interlaboratory Testing in
Chemical Analysis Methods
[18] ASTM E1301-95e1 Standard Guide for Proficiency Testing by Interlaboratory Comparison
[19] ASTM E1601-98 Standard Practice for Conducting an Interlaboratory Study to Evaluate the
Performance of an Analysis Method
[20] BARWICK, V. J., ELLISON, S. L. R.: (VAM Project 3.2.1 Development and harmonisation of measurement
uncertainty principles, part (d)): Protocol for uncertainty evaluation from validation data (Jan. 2000)
[21] British Standards Institution (2003) Guide to data analysis and quality control using cusum techniques.
British Standard 5703, Parts 1 to 4
[22] CARDONE, M.J. (1986) New technique in chemical assay calculations. Parts 1 and 2, Analytical
Chemistry, 58, 433-445
[23] DAVIES, O.L. (editor) (1978) Design and Analysis of Industrial Experiments, Longman.
[24] DAVIES, O.L. and GOLDSMITH, P.L. (editors) (1972) Statistical methods in research and production. 4th
edition, revised. Edinburgh, Oliver and Boyd for ICI, pp 478
[25] EISENHART, C. (1963) Realistic evaluation of the precision and accuracy of instrument calibration
systems, J Res National Bureau of Standards, 67C, 161
[26] EURACHEM/CITAC Guide (2000), Quantifying Uncertainty in Analytical Measurement, 2nd edition
[29] European cooperation for Accreditation of Laboratories EAL (1996) Interlaboratory Comparisons
(EAL-P7)
[30] European Co-operation for Accreditation (2001) Use of proficiency testing as a tool for accreditation
(EA-03/04)
[31] FRANKE, J.P. and DE ZEEUW, R.A. (1978) Evaluation and Optimisation of the Standard Addition Method
for Absorption Spectrometry and Anodic Stripping Voltammetry, Analytical Chemistry, 50,1374-1380
[32] FUNK, W. et al (1985) Statistische Methoden in der Wasseranalytik, Begriffe, Strategien, Anwendungen,
Wiley-VCH, Weinheim
[33] FUNK, W., DAMMANN, V., DONNEVERT, G. (1995) Quality Assurance in Analytical Chemistry, Wiley-VCH,
Weinheim
[34] GARDNER, M.J. and GUNN, A.M. (1986) Optimising precision in standard additions determination, Z Anal
Chem, 325, 263-268
[35] GARDNER, M.J. and GUNN, A.M. (1988) Approaches to calibration in GFAAS: direct or standard additions,
Z Anal Chem, 330,103-106
[36] GARDNER, M.J., WILSON, A. L., and CHEESMAN, R. J. (1989) A Manual on Analytical Quality Control for
the Water Industry, NS30, WRc Medmenham.
[37] HUNT, D.T.E. and MORRIES, P. (1980) The determinand content of the blank water, Analyt Proc, 19,
407-411
[38] HUNT, D.T.E. and WILSON, A.L. (1986), The Chemical Analysis of Water. General Principles and
Techniques, Second edition, The Royal Society of Chemistry, London
[39] ILAC-G13:2000: Guidelines for the Requirements for the Competence of Providers of Proficiency
Testing Schemes (http://www.ilac.org/)
[40] IUPAC: Harmonized guidelines for single-laboratory validation of methods of analysis, Pure Appl.
Chem., 74 (2002), 835-855
[41] IUPAC: Glossary of Terms Relating to Pesticides, Pure Appl. Chem., Vol. 68, No.5, pp. 1167-1193,
1996, http://www.iupac.org/reports/1996/6805holland/l1.html
[42] MANCY, K.H. (1974), Design of water quality measurement programs. In: Design of Environmental
Information Systems, edited by R.A. Deininger. Ann Arbor Science Publishers Inc, Ann Arbor MI, USA,
pp. 173-197
[43] National Committee for Clinical Laboratory Standards, Internal Quality Control Testing: Principles and
Definitions, http://www.westgard.com/quest7.htm
[44] NORDTEST Report TR 537 (2003) Handbook for calculation of measurement uncertainty in
environmental laboratories; published by Nordtest, Finland
[45] RATZLAFF, K.L. (1979), Optimising Precision in Standard Addition Measurement, Anal. Chem. 51, pp.
232-235
[46] SHEWHART, W. (1931), The Economic Control of Quality of Manufactured Products. van Nostrand, New
York
[47] THOMPSON, M.; ELLISON, S.L.R., WOOD, R: The International Harmonized Protocol for the Proficiency
Testing of Analytical Chemistry Laboratories (IUPAC Technical Report); Pure Appl. Chem. 78 (2006),
No. 1, pp. 145-196
[48] YOUDEN, W.J. (1972) Graphical diagnosis of interlaboratory test results. Journal of Quality Technology,
Vol. 4, No. 1, 29-33
[49] EEE/RM/062: The selection and use of reference materials; A basic guide for laboratories and
accreditation bodies (2002). Available from the Eurachem secretariat or website
(http://www.eurachem.org/).