Data analysis and presentation

1991, Instrumental Analyses of Pollutants, Elsevier Science …

Chapter 8 Data Analysis and Presentation*

STUART C. BLACK
Nuclear Radiation Assessment Division, Environmental Monitoring System Laboratory, PO Box 93478, Las Vegas, Nevada 89193-3478, USA

* Notice: Although the work described in this chapter has been supported by the US Environmental Protection Agency, it has not been subjected to Agency review and therefore does not necessarily reflect the views of the Agency and no official endorsement should be inferred.

C. N. Hewitt (ed.), Instrumental Analysis of Pollutants © Elsevier Science Publishers Ltd 1991

8.1 INTRODUCTION

Analysis of data begins with a procedure that is commonly termed validation. For the purposes of this chapter, validation means a thorough check of the analytical method to ensure that an adequate quality control process has been used. For analysis of data from a research or a method development project, quality control must have been an integral part of the method used for analysis, as is required by good laboratory practice. If, on the other hand, the data to be analyzed originate from a monitoring or surveillance program used to estimate environmental contamination, then a comprehensive quality assurance plan covering both the sampling performed and the analytical procedure should exist, and validation then consists of verifying that the data quality objectives set forth at the beginning of the program have been met.

Since this handbook is devoted to sampling methods as well as instrumental methods of analysis, the following sections will address those quality control procedures that relate to the bias and precision of the data and to the adequacy of the sample that was analyzed. Obviously, the fact that a positive result has been obtained from an analysis of a sample has little significance unless the result can be related to something of interest and unless the uncertainty associated with that result is known.
The relationship may be to appropriate blank samples or to results obtained in control areas or in control samples. The uncertainty of a result is a combination of the systematic and random errors that are an integral part of the sampling and analytical procedures used to obtain that result. Common practice has been to use the terms precision and accuracy, implying that systematic errors (bias) and accuracy are synonymous. However, accuracy is some combination of precision and bias where the precision is high and the bias is low. Any other combination is inaccurate.

The final section of this chapter addresses the various means of presenting the data so that noninvolved persons can readily understand the results that were obtained from a project. However, for documenting the results of an experiment, a true data base is preferable. The data base may be in tabular form or in some form of electronic storage. It should include all results from the project together with the factors that were used to obtain the results, such as the analytical error, extraction efficiency, calibration data, blank values, volume corrections, and so forth. This will permit a complete reconstruction of the final results if that should ever become necessary or in case a different method of presentation were to be desired.

8.2 QUALITY CONTROL

Analysis of the output signal of an instrument requires comparison of the magnitude of that output signal to a reference signal that may be the instrumental background signal, the signal obtained when a reagent blank is analyzed, or an appropriate control.¹ Also required is an estimate of the error associated with the signal of interest. These two factors, signal comparison and associated error, can be derived specifically if an effective quality control (QC) program has been followed during the sample collection and analysis procedures.
An effective QC program will include blanks, duplicates, calibration and check standards, splits, and other such procedural requirements.²,³ Each of these is discussed in the following subsections.

8.2.1 Instrument calibration

Proper calibration of the analytical instrument will yield two significant benefits, namely: it ensures that the analyte of interest will give a signal that is detectable if of sufficient magnitude, and it allows definition of the correlation between the signal from the instrument and the amount of analyte present in the sample. To begin this procedure, the matrix in which the analyte of interest is to appear on injection into the instrument should be determined since it could be either aqueous or organic depending on the properties of the analyte. The matrix blank is then determined, i.e. the response of the instrument on injection of the matrix. An aqueous matrix may be distilled and deionized (DDI) water or DDI water with dilute acid or base while an organic matrix may be methylene chloride or methanol or some other organic solvent. The matrix chosen should be the one that interferes least with the analyte signal and produces little or no signal of its own.

With an appropriate matrix chosen, a certified pure variety of an analyte is added to it. The amount added can be estimated from prior knowledge but should be enough to give a definite signal when injected into the instrument. Depending on the magnitude of the resultant signal, higher and/or lower amounts are analyzed so that there are at least three results which span the expected range of the analyte in the samples to be analyzed. The instrument response is then plotted versus either the concentration or the amount of added analyte.
⁴ The desired outcome of this plot is a straight line which can be described by an equation of the form C = aR + b, where C is concentration or amount, R is the instrument response, a is the slope of the line, and b is the intercept on the C-axis. For those cases where the instrument response is nonlinear, intermediate concentrations of the certified material may be analyzed to determine the shape of the response more accurately.

8.2.2 Calibration check standard

Since certified pure materials are expensive, a calibration check standard (CCS) commonly is prepared. This can be made of reagent grade materials in some arbitrary concentration since the exact value is not necessary.⁵ Five to seven portions of the CCS are analyzed immediately after the instrument is calibrated to determine the mean value. A preliminary control chart is prepared that is updated each time the CCS is analyzed. This standard is analyzed each operating day or each shift to ensure that the instrument is still in calibration. Once the response of the instrument to the CCS exceeds the control limits, and remains so, then recalibration of the instrument becomes necessary.

The CCS is useful because it can be made with less expensive chemicals, because there is no real need to standardize it against national standards, and because it is not necessary to expend the effort to make a precise concentration. However, if the CCS is not stable over time, precise concentration information may be required so that replacement CCSs can be prepared until such time as recalibration of the analytical instrument becomes necessary.

8.2.3 Laboratory control standard

The laboratory control standard (LCS) normally consists of a standard reference material (SRM) in the same matrix as used for samples in the analytical procedure. In the USA the National Bureau of Standards produces a variety of SRMs that are useful for many types of analyses.
³ If an SRM is not available, the alternative is a purified analyte produced by a commercial concern or by a government laboratory that specializes in such materials. The function of the LCS is to estimate the bias in the analytical procedure and provide data on which the comparability of the results obtained by the analytical method can be assessed. Comparability is an important attribute of an analytical method since it implies that the measurements made can be reproduced by laboratories that may be using other methods. The LCS is measured just after the calibration procedure is completed and then with each 20 or so samples until 10 or more aliquots have been measured; the frequency can then be reduced. A running average of the LCS results is calculated and, when compared with the certified value, represents the bias (also called systematic error) in the method.

8.2.4 Blanks

Once the analytical procedure has been calibrated and the calibration controls have been identified, the result obtained from the procedure when the analyte of interest is thought not to be present should be assessed. This is done by analyzing blank samples. As a general rule, several types of blanks can be devised for any sampling and analysis project. The type of blank to be analyzed depends on which part of the procedure is to be tested for its contribution to the final result. The important blanks and a brief description of their use are listed below.⁶

Field blanks

This blank is the most comprehensive. It normally consists of DDI water placed in 5-10% of the sample containers that are sent out for sample collection. At the sampling site these containers are opened during sample collection. The DDI water is used to rinse the sampling instrument(s) after cleaning but, before collecting the next sample, the DDI water is placed back in the container, which is then sealed.
If any sample preparation, such as mixing, sieving, ball-milling, etc., is performed on the samples, the field blank is then used to rinse the equipment after cleaning. This blank is then analyzed along with the samples. Therefore this blank will contain any contamination introduced into the sample by the collection apparatus, the container, the sample preparation equipment, the reagents or the matrix.

Transit blank

This blank also consists of DDI water placed in the sample containers. Normally about half of the field blanks can be used for transit blanks. These blanks are used to assess any contamination due to leaching from the sample container during shipment and storage. They are sent to the field and returned to the laboratory unopened, stored during sample preparation, and analyzed with the samples.

Sample preparation blank

This is frequently called the sample bank blank because samples generally are sent to a central area (sample bank) for processing and numbering. Every 20 samples or so, a sample container of DDI water is used to rinse the sample preparation equipment and the rinse is collected and analyzed with the regular samples.

Reagent blank

This blank consists of DDI water of the same volume as the samples to be analyzed and is used to check for contamination in the reagents. It is analyzed in the same way as the samples, with the same amount of reagents, at a rate of about one blank for each 20 samples. Additional reagent blanks will be required whenever the reagents are changed.

Matrix blank

This blank is merely an aliquot of whatever final solvent is used in the analytical procedure. It is injected into the analytical instrument without further treatment and any result obtained is an indication of contamination of the matrix material.

Although all of the blanks listed above should be collected during the course of a project, only the field blanks, or the field and reagent blanks, need to be analyzed at the start.
However, if the blanks show any detectable amounts of the analyte in question, then the other types of blanks may need to be analyzed to locate the source of extraneous analyte signal. Even though the various kinds of blanks may make up 30% of the total sample load, it is possible to reduce this to 5% as experience is gained with the samples and procedures, and especially if the field and reagent blanks show negligible contamination.

8.2.5 Controls

The three most important parts of an analytical procedure are the standards, the blanks and the controls. Assuming the first two parts have been covered adequately, then a positive result may be obtained when samples are analyzed. However, any such result has no significance unless it can be related to a standard, a regulation, a normal baseline amount, or something similar to these. If the result of analysis is to be related to a standard, then the control is the reagent blank. If it is to be compared to a regulation, it generally would be the amount above a local background and the control would be the local baseline. Finally, if the result is to be related to a normal baseline, then the control would be regional or national average values.

Local control

The local control is composed of samples collected from areas that are close in time and space to the area being studied. The sample collection sites should be upwind and upstream (as regards surface and ground water flow) of the area in question and the samples should be distributed so they are affected by all the sources that impact that area except the source that is suspected of causing the impacts being measured. The local control is very useful because it is more directly relatable to the area of study than regional or national controls. However, in many cases an appropriate local control is impossible to obtain so the other types of control are needed.

Regional control

The same criteria apply to a regional control as stated above for the local control. Because of the larger area available for choice of samples, this control may be more readily available. An additional factor to be considered in selecting a regional control is the demographic characteristic of the two areas: study versus control. As an example, a farming area would not be a suitable control for an industrial area.

National control

If resources are scarce, a national control may be derived from a literature search as there exists a surprising amount of data that can be uncovered by a diligent search. In fact, there may even be routine monitoring or surveillance programs that can provide data useful for developing a suitable control value. A good example of a national control is the set of data on cancer rates published by the government that are used to determine whether or not these rates are high in a given area. The principal problem will be, in general, the lack of adequate data, either in the type of analyte reported or in the quality of the data. If the data were collected several years in the past, then the QA may be poor or nonexistent so interpretation would be difficult or impossible.

8.2.6 Detection limits

There have been many names for detection limits and many definitions for them, but present practice is to base the detection limit, by whatever name, on some multiplier of the standard deviation of sample analyses near zero analyte concentration. The value of the multiplier is related, in turn, to the acceptability of the errors that are possible in any analytical scheme. The two principal errors are Type I (also called alpha error), which is the probability of stating that a substance is present when it is not, and Type II (also called beta error), which is the probability of stating that a substance is not present (not found) when it actually is present.
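
The link between an accepted error probability and the multiplier of the standard deviation can be sketched as follows. This is a minimal illustration that assumes normally distributed results near zero concentration; the function name `detection_multiplier` is hypothetical and is not taken from any standard.

```python
from statistics import NormalDist
from typing import Optional


def detection_multiplier(alpha: float, beta: Optional[float] = None) -> float:
    """Return k such that a detection limit is k * s.

    alpha: accepted Type I (false detection) probability.
    beta:  accepted Type II (false nondetection) probability; if given,
           its one-sided normal deviate is added to the Type I deviate.
    Assumes a normal distribution of results (an assumption of this sketch).
    """
    z = NormalDist().inv_cdf  # one-sided standard normal quantile
    k = z(1.0 - alpha)
    if beta is not None:
        k += z(1.0 - beta)
    return k


print(round(detection_multiplier(0.05), 3))        # 1.645
print(round(detection_multiplier(0.05, 0.05), 2))  # 3.29
```

Accepting a 5% Type I error alone gives the one-sided normal deviate; accepting 5% for both errors doubles it.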
The American Society for Testing and Materials has issued a standard that includes a procedure for determining detection limits.⁷ In that procedure several samples are prepared by adding near zero concentrations of the analyte to the matrix for which the analytical method was developed. After analysis the standard deviation of the results is calculated. A detection limit called the criterion of detection is then determined by choosing to accept a Type I error of 5% and using a value of 1·645 from a table of normal cumulative probabilities as a multiplier for the standard deviation (s). Thus the criterion of detection becomes 1·645s. However, if many samples with an analyte concentration of 1·645s are analyzed, only half of them will give results exceeding the criterion of detection. This is because the probability of a Type II error is 50%. To adjust the probability of a Type II error to 5% to match the Type I error, the same multiplier is used. The result is termed the limit of detection when the risk of making a Type I and Type II error is 5% and is equal to (2)(1·645)s = 3·29s. The power of a test is determined by 1 − β, so when the risk of Type II error is 5%, then the power of the test (probability of finding a substance when it is present) is 100% − 5% = 95%.

Although the preceding was based on inorganic analyses of water samples, almost equivalent procedures apply for other types of analysis. For analysis of samples containing radioactivity, the detection limit is based on background counting rate, as well as the counting rate from the sample, so the multiplier for the lower limit of detection (LLD) is increased by the square root of 2.
This arises from the normal propagation of errors where the total error ($s_t$) is the square root of the background variance ($s_b^2$) plus the sample variance ($s_s^2$):

$$ s_t = \sqrt{s_b^2 + s_s^2} \qquad (8.1) $$

Since the LLD is based on analyte concentrations near background, then $s_b$ and $s_s$ will be equal and the equation reduces to

$$ s_t = \sqrt{2 s_b^2} = \sqrt{2}\, s_b \qquad (8.2) $$

Therefore the LLD becomes 4·65s where s is the standard deviation of background counts:

$$ \mathrm{LLD} = \sqrt{2}\,(2)(1.645\,s) = 4.65\,s \qquad (8.3) $$

This factor is also based on Type I = Type II = 5% false detection and false nondetection probabilities.⁸

Also, for analysis of organic contaminants in samples, a similar procedure is used for setting detection limits. In this case it is called a method detection limit and is based on the 99% confidence limit.⁹ Since the procedure specifies seven aliquots of a sample matrix containing the analyte at near zero concentration, the value of t for a one-sided test at 0·99 confidence level with six degrees of freedom (7 − 1 = 6) is 3·14. In this case, then, the MDL becomes 3·14s where s is the standard deviation of the seven aliquots.

The three detection limits described above (limit of detection, lower limit of detection, and method detection limit) are all computed by multiplying the standard deviation of a background or low-level sample by a factor of 3 or 4. Some analysts use various multiples of method detection limit (MDL) to describe method capability, such as quantitation limit (= 3 × MDL) and practical quantitation limit (= 10 × MDL).¹⁰ Many analysts use the term 'trace' to describe any detectable amounts that are greater than the MDL but less than the quantitation limit.

Figure 8.1 shows the basic relationships discussed in the preceding. Here $\mu_1$ is the mean of the background (or low-level) distribution and $\mu_2$ is the mean of the sample distribution.
If the MDL (C) was set at the 95% level, then all measured concentrations ≥ C would be detectable and the Type I error (α) would be 5% but the Type II error (β) would be greater, perhaps 30%. Such facts lead to the use of 4s in analyses of radioactivity and 3s for organic analysis of samples.

[Fig. 8.1. Sample and background distributions.]

8.2.7 Uncertainty of results

The uncertainty in the result generated by any analytical procedure is due to some combination of two principal errors, namely random and systematic. Systematic error is determined by examining the results from analysis of standards, such as the LCS described above. The average of all results from analysis of the LCS, when compared to the true value, is a measure of the bias to be expected when using the chosen method. The bias is due to the accumulation of systematic errors as they occur in the laboratory that is using the method. Systematic errors are not normally statistical in nature but are due to factors that consistently bias any result in one direction. Such errors may include incorrect calibration of balances, thermometers or pipettes as well as calculations of concentration, dilution, etc. Random errors, on the contrary, are those errors that can be treated statistically.¹¹ To eliminate the systematic errors in a procedure requires careful cross-checking of all the steps, with particular attention to any steps where small changes may have a large effect.

Random errors

These statistical errors in an analytical method can be detected by performing various duplicate or replicate analyses. For instance, to determine the random error for all the steps in a procedure from collection of the sample to the output of the analytical instrument, a duplicate sampling program is used. A duplicate sample is one that is collected adjacent, in time and space, to the principal sample.
For example, if two air sampling instruments are placed next to each other and both are then operated for the same length of time, the collected air samples are duplicates. If one large sample is divided into two parts, the result is termed a split sample. This differs from the duplicate sample because any sampling error has been eliminated, and analysis of the splits yields an estimate of the errors due to sample inhomogeneity, extraction, preparation and the instrument response. Instrumental errors can be evaluated by analyses of splits of sample extracts. Errors in extraction of the analyte from the sample can be determined if homogeneous aliquots of the sample can be obtained, for example, if the sample is in solution. Table 8.1 puts these methods of error assessment into perspective.

Table 8.1 Methods of error assessment

Replicate analyzed    Error(s) assessed
Duplicate sample      Sampling, homogeneity, processing and measurement
Split sample          Homogeneity, processing and measurement
Split extract         Measurement

The total random error is an estimate of the precision of a process and its value is determined by the standard deviation (s) of the measurements. The square of the standard deviation (s²) is the variance of the measurements and variances can be added so the variance of the total procedure is just the sum of all the component variances. In mathematical terms

$$ s_T^2 = s_S^2 + s_H^2 + s_E^2 + s_M^2 + \cdots \qquad (8.4) $$

where the subscripts of the variances are T for total, S for sampling, H for homogeneity, E for extraction and M for measurement. The square root of any of these values is the precision of that component of the method and the square root of the sum is the precision of the total method.

As an example, suppose many duplicate samples were to be analyzed and a mean and standard deviation calculated from the results.
If that calculation yielded 80 ± 3 units, it could then be stated that the precision of the method was 3 units for samples containing about 80 units of analyte. This would be the extent of the statement that could be made until data for samples with different amounts of analyte were obtained since it would not be known whether or not the relation between precision and concentration was linear. If it was desired to put some confidence limits on the above data, then a factor of 2 could be used to approximate the 95% probability interval, that is 2 × (±3) = ±6 units is the 95% probability interval for samples containing about 80 units.

Since these calculations are based on pairs of analyses, either duplicates or splits, the range is almost as efficient for expressing precision as the standard deviation. The range is found by subtracting the lower result in a pair from the higher one. If the range of paired observations is divided by 1·128, the result is an estimate of the standard deviation. To use this technique, the range is calculated for all pairs, the ranges are summed and the sum is divided by the number of pairs that were summed to get the mean range, and this result is divided by 1·128 to get the standard deviation estimate.

For some methods, or for some analytes, the standard deviation is not constant with changes in concentration but is a constant fraction of the concentration. In this case it is preferable to use the coefficient of variation (CV), which is defined as the standard deviation divided by the mean value and is usually expressed as a percentage (CV = 100s/x̄). This CV can be used as an estimate of precision in the same manner as the standard deviation or the range can be used. A third relationship may arise, that is the precision may be neither a constant nor a constant proportion of the analyte as may be revealed when precision is plotted against concentration.
If this is so, then the formula for a straight line (E = aC + b) must be developed and used to describe the relationship. In this formula, E is the error, C is the concentration, a is the slope of the line, and b is the intercept on the E axis. Linear regression analysis is used to obtain the parameters for the straight line (also called the least-squares line).

Total uncertainty

As stated at the beginning of this section, the total uncertainty in an analytical procedure is some combination of the bias and precision (systematic and random errors). There have been some theoretical studies of this subject but no particular method of combining these two errors has become predominant. The one suggested method that appears reasonable is to combine the various components in quadrature but adjusting the value of the bias by 1/3.⁸ The mathematical statement for this is

$$ U_T = \sqrt{s_T^2 + B^2/3} \qquad (8.5) $$

where $U_T$ is total uncertainty, $s_T$ is total random error, and B is the bias (systematic error). For this equation, the s used should be one with a high confidence level. The 95% level is recommended because the bias is known with high confidence.

To illustrate the use of this combination, suppose 10 samples were spiked with low levels of an analyte and then analyzed. When the results were compared with the known spikes, suppose the outcome was 92 ± 3%; then the 95% probability interval for precision is ±6% and the bias is 92% − 100% = −8%. The total uncertainty then is calculated as follows:

$$ U_T = \sqrt{(6)^2 + (-8)^2/3} = \sqrt{36 + 21.3} = 7.6\% \qquad (8.6) $$

This uncertainty value is used as a multiplier for the result of an analysis and sets the 95% probability interval in which the true value lies. For example, suppose a method for which the total uncertainty was 7·6% was used to analyze a sample and the result was X units, then the uncertainty in that result would be ±0·076X units.
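
The worked example above can be checked with a short sketch. It assumes that the quadrature combination takes the form of the square root of (random error squared plus bias squared divided by 3), which is the form that reproduces the 7·6% quoted in the text; the function name `total_uncertainty` is hypothetical.

```python
import math


def total_uncertainty(random_error_95: float, bias: float) -> float:
    """Combine random error and bias in quadrature, weighting bias^2 by 1/3.

    random_error_95: random error at the 95% probability level (same units as bias).
    bias:            systematic error; its sign does not matter once squared.
    The 1/3 weighting on the squared bias is an assumed reading of the text.
    """
    return math.sqrt(random_error_95 ** 2 + bias ** 2 / 3.0)


# Values from the spiked-sample example: precision +/-6%, bias -8%.
u_percent = total_uncertainty(6.0, -8.0)
print(round(u_percent, 1))  # 7.6
```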
Therefore a result of 250 ppm would have an uncertainty of 0·076(250) = 19 ppm and the true result would lie in the interval from 231 to 269 ppm.

8.2.8 Comparability

Comparability is an important, and in some cases an essential, characteristic of the data generated by an analytical method. Some analysts have thought that this consisted only of stating the results in consistent units such as the MKS or SI systems. Of course comparisons are easier when the units used are identical, but comparability implies much more than just use of any system of units. It implies that if other laboratories analyze the same sample but use different methods the results will be comparable. To achieve this a laboratory must participate in an intercomparison program or produce comparable results when analyzing standard reference materials or similar materials as produced by a national standards office.

In 1983 a symposium entitled Quality Assurance for Environmental Measurements was held in Boulder, Colorado. The proceedings of that symposium include many papers that discuss practical applications of the principles outlined in this section.¹²

8.3 DATA ANALYSIS

Preliminary to analyzing the data, some treatment of the results is required, such as grouping the data, eliminating outliers, and handling less than detectable values. Grouping the data is an obvious step. All the data should be categorized as to the area from which collected, the sample type (soil, air, water, etc.), and the constituent for which analyzed. This permits calculation of the mean and standard deviation, or putting the data in some kind of order for ease of analysis. If the quality control procedures of Section 8.2 have been followed, then the accuracy and comparability of the results produced by the analytical method will be known and only the significance of the results remains to be demonstrated.
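
The grouping step described above can be sketched in a few lines. The records, area names and values below are invented purely for illustration, and the function name `summarize` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (area, sample type, constituent, result).
records = [
    ("control", "water", "Pb", 2.0),
    ("control", "water", "Pb", 2.2),
    ("control", "water", "Pb", 2.4),
    ("study", "water", "Pb", 2.5),
    ("study", "water", "Pb", 2.7),
]


def summarize(rows):
    """Group results by (area, sample type, constituent) and return
    {key: (count, mean, standard deviation)} for each group."""
    groups = defaultdict(list)
    for area, sample_type, constituent, value in rows:
        groups[(area, sample_type, constituent)].append(value)
    summary = {}
    for key, values in groups.items():
        # A standard deviation needs at least two results.
        sd = stdev(values) if len(values) > 1 else float("nan")
        summary[key] = (len(values), mean(values), sd)
    return summary


for key, (n, m, s) in summarize(records).items():
    print(key, n, round(m, 2), round(s, 2))
```

Keeping the group key as a tuple makes it straightforward to add further categories, such as sampling date, without changing the summary logic.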

8.3.1 Rejection of outliers

Several methods have been, and presently are, used to reject data that appear not to belong to a given data set. One of the better methods, at least for data that can be described by a mean and standard deviation, for deciding whether or not to reject a result has been described.¹³ Theoretically, no result should be discarded unless an obvious error has occurred and, in some cases, deviant results may indicate flaws in the analytical procedure or may indicate 'hot spots' of environmental contamination. If it has been decided to reject possible outliers, then proceed as in the following. Arrange the set of results in order from low to high:

$$ X_L \le X_2 \le X_3 \le \cdots \le X_H \qquad (8.7) $$

and calculate the mean and standard deviation for the whole set. Next, calculate the following statistic:

$$ T = (X_H - \bar{X})/s \quad \text{if a value appears high} $$

or

$$ T = (\bar{X} - X_L)/s \quad \text{if a value appears low} \qquad (8.8) $$

Finally, compare the calculated value of T with the value under either the 5% or 1% column in the tabulation for tests of discordancy shown in Table 8.2. If the calculated value of T is greater than the value in the tabulation for the number of measurements made, then either the low or high result is an outlier with that level of significance.

Table 8.2 Tabulation for tests of discordancy

Number of measurements    9      10     12     14     16     18     20     30
5%                        2·11   2·18   2·29   2·37   2·44   2·50   2·56   2·74
1%                        2·32   2·41   2·55   2·66   2·75   2·82   2·88   3·10

As an example, suppose 10 samples were analyzed and the results ranged from 45 to 70 ppm. The mean and standard deviation are calculated as 56 ± 6 ppm. To determine whether or not the high value is an outlier, calculate T from (70 − 56)/6 = 2·33. That value is between the 2·18 and 2·41 values for 10 measurements, as indicated in the tabulation. Therefore 70 ppm is an outlier at the 5% level of significance but not at the 1% level.

8.3.2 Treatment of less-than values

It has been customary to treat less than detectable (e.g.
<MDL) values as zeroes when calculating statistics or when comparing data sets, and other conventions have also been used. However, all of these conventions tend to introduce a bias into any calculations. One such convention has been to use the value of the MDL for any value that is less than the MDL (or LLD, etc.). Using either zero or the detection limit for all values in a data set that are less than the limit yields a censored data set with an extremely biased average and standard deviation. Although using half the distance between zero and a detection limit, or the geometric mean in case of non-Gaussian distributions, results in less bias, these adjustments are also inappropriate.

A better method for handling nondetectable data is to use probability plotting on either normal or log-normal probability paper. The data are ranked so that percentages are readily obtained. The data are then plotted on the two kinds of probability paper. The kind on which the plotted points most nearly approach a straight line determines the type of distribution of the data. The value represented by the 50% point on the plot is the mean of the data. The standard deviation is found from the values at the 50% and 84·15% points: on a normal plot it is the difference between the two values, while on a log-normal plot the geometric standard deviation is the value at the 84·15% point divided by the value at the 50% point.

This discussion on handling less than detectable values applies to existing data or reports, where the values cannot be changed. At present, it is recommended that no data be reported as 'nondetectable', or 'less than MDL', or similar such expressions. It is always possible to calculate a value for any measurement that is made, even if it is a negative number as may occur when a background or blank value is subtracted from a sample value.

As an example of the points discussed above, consider the data displayed in Table 8.3. These data are from a study of a contaminated area where heavy metals had infiltrated the ground water system.
The first column shows a common way of displaying the results of 20 samples taken from a control area upstream of the contaminated area. Ignoring the <MDL values, or equating them to 0 or to the MDL, results in a censored distribution with an inaccurate mean and standard deviation. If some historical data exist, then it may be possible to approach the true mean by plotting the data on probability paper, as shown in Fig. 8.2. Here the column 1 data are ordered in column 3, and the cumulative number of values is expressed as a percentage of the total in column 4; that is, the seven values less than the MDL represent 35% of the data, and so on. The line fitted by eye to the points is shown on the figure. The 50% intersection is the mean of the data (2.0 mg/liter), which is very close to the actual mean shown for column 2.

Fig. 8.2. Probability plot of water concentration values. [Figure not reproduced; horizontal axis: concentration, ppm.]

8.3.3 Data comparisons

The most common procedure used to determine whether or not a set of data is significantly different from other data is Student's t-test. However, care must be exercised in using this test. The data in the two sets to be compared must be approximately normal or be transformed to fit a normal distribution. Much environmental data are log-normally distributed, which means that the data should be converted to logarithms before either plotting or calculating any statistics (Ref. 14). If there are 10 or fewer values to be tested in the data sets, then the type of distribution is not crucial. To perform the t-test, assume that the background, reagent blank or control sample data are known with high precision, and that the data from the collected samples are to be compared with those data.
Since it is expected that the sampled area will have a higher concentration of the analyte than the background, blank or control samples, a one-tailed t-test is used. At this point access to a table of the t distribution is necessary. Based on the confidence level chosen for the test and the degrees of freedom (one less than the number of samples), a t value is selected from the table and used as shown in eqn (8.9) to generate a test statistic.

Table 8.3 Concentration in water (mg/liter) (MDL = 1.93 mg/liter)

             Control samples (a)                Study
Number       1          2       3       Σ%      samples
1            2.3        2.3     <MDL            2.5
2            3.1        3.1     <MDL            3.2
3            2.0        2.0     <MDL            2.7
4            <MDL       1.8     <MDL            2.8
5            2.1        2.1     <MDL            2.7
6            <MDL       1.8     <MDL            2.0
7            <MDL       1.7     <MDL    35      2.3
8            2.2        2.2     2.0             2.9
9            2.2        2.2     2.0             2.3
10           2.0        2.0     2.0             2.7
11           2.0        2.0     2.0             3.0
12           <MDL       1.9     2.0     60      2.4
13           2.8        2.8     2.1     65      2.5
14           2.0        2.0     2.2             2.6
15           <MDL       1.9     2.2             2.8
16           2.2        2.2     2.2             2.7
17           <MDL       1.7     2.2     85      2.3
18           <MDL       1.0     2.3     90      2.4
19           2.0        2.0     2.8             2.8
20           2.2        2.2     3.1             2.5
Mean         2.13 (b)   2.04                    2.60
s            0.31       0.42                    0.28

(a) Column 1 is the common listing of control results; column 2 is all results (preferred); column 3 is column 1 ordered low to high; and Σ% is the cumulative percentage of column 3.
(b) Assumes all <MDL values are equal to the MDL; if the <MDL values were deleted, then the mean = 2.24.

The procedure will be clearer if a numerical example is presented. Assume the mean concentration of lead in control samples is 2.04 ppm in water. Assume 20 water samples are taken from a suspect area and that the mean is 2.60 ± 0.28 ppm (Table 8.3). If the desired confidence level is 95%, then from a table of t at the 95% level and 19 degrees of freedom (20 - 1 = 19) find the factor 1.729. The test statistic is then calculated as

$$u = \frac{t\,s}{\sqrt{n}} = \frac{1.729\,(0.28)}{\sqrt{20}} = 0.11 \qquad (8.9)$$

where s = 0.28 ppm is the standard deviation of the study samples and n = 20 is the number of samples.
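Equation (8.9) is simple to evaluate in code. The following is a minimal sketch, assuming the t value is looked up in a table and passed in by the user, since the Python standard library has no quantile function for the t distribution. The function names `threshold_u` and `is_significantly_higher` are hypothetical. The study values are those listed in Table 8.3.

```python
import math
from statistics import mean, stdev

# Study samples from Table 8.3 (mg/liter).
study = [2.5, 3.2, 2.7, 2.8, 2.7, 2.0, 2.3, 2.9, 2.3, 2.7,
         3.0, 2.4, 2.5, 2.6, 2.8, 2.7, 2.3, 2.4, 2.8, 2.5]
CONTROL_MEAN = 2.04  # mean of the control samples (column 2 of Table 8.3)
T_95_19DF = 1.729    # tabulated one-tailed t value: 95% level, 19 degrees of freedom


def threshold_u(t_value, s, n):
    """Return u = t * s / sqrt(n), as in eqn (8.9)."""
    return t_value * s / math.sqrt(n)


def is_significantly_higher(sample, control_mean, t_value):
    """True if the sample mean exceeds the control mean by more than u."""
    u = threshold_u(t_value, stdev(sample), len(sample))
    return mean(sample) - control_mean > u


u = threshold_u(T_95_19DF, stdev(study), len(study))
print(f"u = {u:.2f}")
print("significantly higher:",
      is_significantly_higher(study, CONTROL_MEAN, T_95_19DF))
```

The sketch treats the control mean as a fixed, precisely known value, which is the assumption stated in the text; it is not a general two-sample t-test.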
If the difference between the means of the control and tested samples is greater than 0.11, then the concentration of lead in the water in the suspect area is higher than in the control area at the 95% confidence level. Since the difference between the means is 2.60 - 2.04 = 0.56 ppm, which is greater than the 0.11 calculated above, the lead concentration in the water in the sampled area is significantly higher than in the water from the control area.

The t-test is one of the simpler tests that can be used for comparing two data sets, but there are others, which vary with the type of distribution of the data sets, as well as some nonparametric tests, which are independent of distribution (Ref. 15). One of the most generally used of the latter is the Wilcoxon-Mann-Whitney (W-M-W) test. For comparing two sets of data, assign a rank to each number in the two sets combined, starting from the lowest number (assign a 1) and proceeding to the highest. Use an average rank for all numbers that are equal. For example, use the first nine numbers of the study and control samples in Table 8.3 and rank them as shown in Table 8.4.

Table 8.4 Ranking of selected data from Table 8.3

Control    Rank      Study    Rank
2.3        10        2.5      12
3.1        17        3.2      18
2.0        4.5       2.7      13.5
1.8        2.5       2.8      15
2.1        6         2.7      13.5
1.8        2.5       2.0      4.5
1.7        1         2.3      10
2.2        7.5       2.9      16
2.2        7.5       2.3      10
Σ          58.5

Then choose the significance level desired for the test, e.g. α = 0.05, and obtain the test number from a table of 'critical values of smaller rank sum for the W-M-W test' (Ref. 15 contains such a table). In this example the value for samples where $n_1 = 9$ and $n_2 = 9$ is 66. If the ranks assigned to the control sample results are summed, the result is 58.5, as shown in Table 8.4. Since 58.5 is less than the 66 obtained from the table, one can conclude that the concentration in the study sample is greater than that in the control sample with only a 5% probability of being in error.
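The ranking step of the W-M-W test is mechanical and can be sketched in a few lines. The code below is an illustration only: the function names `average_ranks` and `rank_sums` are hypothetical, and the critical value is not computed but entered by hand as the tabulated number quoted in the text. The data are the first nine control and study values of Table 8.3.

```python
def average_ranks(values):
    """Rank values from 1 (lowest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sums(a, b):
    """Return the rank sums of a and b after ranking the combined data."""
    ranks = average_ranks(list(a) + list(b))
    return sum(ranks[:len(a)]), sum(ranks[len(a):])


control = [2.3, 3.1, 2.0, 1.8, 2.1, 1.8, 1.7, 2.2, 2.2]
study = [2.5, 3.2, 2.7, 2.8, 2.7, 2.0, 2.3, 2.9, 2.3]
CRITICAL = 66  # tabulated critical value quoted in the text

control_sum, study_sum = rank_sums(control, study)
smaller = min(control_sum, study_sum)
print(control_sum, study_sum, smaller < CRITICAL)
```

For these data the control rank sum reproduces the 58.5 of Table 8.4, and the two sums together equal N(N + 1)/2 = 171 for the N = 18 ranked values, which is a convenient check on the ranking.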
For precise tests that compare results from various sampling programs, the services of a statistician are indispensable and, if possible, the statistician should be consulted before any sampling program is initiated.

8.3.4 Graphical presentation

A picture is worth a thousand words, as the saying goes, and this is certainly true of the graphical presentation of scientific data. There is no fixed rule for determining the form of the presentation, whether it should be a line, bar or pie chart, etc. Therefore the empirical approach of testing several methods for presenting the pertinent data should be used, as well as attempting several ways of summarizing the data. For example, a pure time-series plot of data as shown in Fig. 8.3(A) may present a confusing picture, no better than the usual tabular listing, when wide variations in the data occur. To clarify the apparent trend shown in Fig. 8.3(A), a running six-month average of the data was calculated and plotted as shown in Fig. 8.3(B). This latter presentation is much more useful, as it shows the gradual upward trend in tritium concentration as well as two excursions that suggest a pulse of contamination passing the sampling point. The use of a six-month running average was purely arbitrary in this case; any convenient averaging period can be used. A disadvantage of using running averages is that the actual time of peak activity is shifted; where that time matters, the sequential plot or actual tabulation of the data should be consulted.

Fig. 8.3. Tritium in water, time-series (A) and running average (B) plots. [Figure not reproduced; horizontal axes: year, 1982 to 1988.]

The graphs shown in Fig. 8.3 are termed rectilinear coordinate graphs and are the most common type for displaying research data.
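The running average used for Fig. 8.3(B) can be sketched as follows. The text does not say whether the six-month window was trailing or centered, so the trailing window used here is an assumption, as is the function name `running_average`. The monthly values are invented for illustration and are not the data of Fig. 8.3.

```python
def running_average(values, window=6):
    """Return the trailing running average of values over the given window.

    The first result covers values[0:window]; each later result drops the
    oldest value and adds the next one, so the output has
    len(values) - window + 1 entries.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the value that left the window
        if i >= window - 1:
            averages.append(total / window)
    return averages


# Hypothetical monthly results (one analysis per month), arbitrary units.
monthly = [10, 12, 9, 11, 30, 28, 12, 10, 11, 9, 12, 10]

for month, avg in enumerate(running_average(monthly), start=6):
    print(f"month {month:2d}: six-month average = {avg:.1f}")
```

With a trailing window each average is assigned to the last month of its window, so a pulse appears in the smoothed series later than it occurred and persists for the length of the window. This is the shift in peak time that the text warns about.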
Rectilinear coordinate graphs can be modified to convey considerable additional information about a set of data. One common modification is to add the standard deviation (in some cases the standard error) and range of the data sets. The data displayed in Fig. 8.3 are based on one analysis per month, so range and standard deviation are not applicable. In Fig. 8.4 the data displayed are the mean, standard deviation and range for 15 air sampling stations in Nevada. An additional useful datum has been added to the figure: the city where the maximum air concentration was detected.

Fig. 8.4. Rectilinear plot with range, standard deviation, mean and location of maximum concentration. [Figure not reproduced; horizontal axis: month, Jan to Jun.]

The next most common form of graph for displaying research or monitoring data is the bar chart. This type of graph is most useful for comparing data from different sources. In our work of monitoring for radioactivity, data on the concentration of radioactive strontium in the bone of various species of animal have been collected for many years. The most compact method of displaying all the accumulated data is by means of a bar chart, as shown in Fig. 8.5. Such a graph shows not only the relative concentration by species but also the change in concentration as the amount of strontium in the environment changes over time.

Fig. 8.5. Bar chart of strontium-90 concentration in bone of bighorn sheep, deer and cattle. Number of bone samples for each animal is indicated. [Figure not reproduced; horizontal axis: year, 1956 to 1980.]

There is a publication, now in its second edition (Ref. 16), that includes an extensive discussion of the different types of graphical display as well as methods for designing them.
A large variety of each of the types of graph is shown and explained there, but the rectilinear coordinate graph and the bar chart as described above will suffice for most data display purposes.

REFERENCES

1. Sharaf, M., Illman, D. & Kowalski, B., Chemometrics. John Wiley, New York, Chapter 3, 1986.
2. Brown, K. W. & Black, S. C., Quality assurance and quality control data validation procedures used for the Love Canal and Dallas lead soil monitoring programs. Environ. Mon. Assmt, 3 (1983) 113-22.
3. Taylor, J. K., Quality assurance of chemical measurements. Anal. Chem., 53 (1981) 1588A-93A.
4. USEPA, Guidelines establishing test procedures for the analysis of pollutants under the Clean Water Act. Title 40, Code of Federal Regulations, Part 136 (40 CFR 136), Washington, DC, 1986.
5. Goldin, A. S., Evaluation of internal control measurements in radioassay. Health Phys., 47 (1984) 361-74.
6. Black, S. C., Defining control sites and blank sample needs. In Principles of Environmental Sampling, ed. L. H. Keith. American Chemical Society, Washington, DC, 1987.
7. ASTM, Standard Practice for Intralaboratory Quality Control Procedures and a Discussion on Reporting Low-Level Data. Designation: D4210-83, American Society for Testing and Materials, Philadelphia, PA, 1983.
8. USEPA, Upgrading Environmental Radiation Data. EPA 520/1-80-012, Washington, DC, Chapter 6, 1980.
9. Glaser, J. A., Foerst, D. L., McKee, G. D., Quave, S. A. & Budde, W. L., Trace analysis for wastewaters. Environ. Sci. Technol., 15 (1981) 1426-35.
10. USEPA, National Drinking Water Regulations for Synthetic Organic and Inorganics. Title 40, Code of Federal Regulations, Part 141, Washington, DC, 1985.
11. ASTM, Standard Practice for Determination of Precision and Bias of Methods of Committee D-19 on Water. Designation: D2777-77, American Society for Testing and Materials, Philadelphia, PA, 1977.
12. Taylor, J. K. & Stanley, T. W. (eds), Quality Assurance for Environmental Measurements. Special Technical Publication 867, American Society for Testing and Materials, Philadelphia, PA, 1985.
13. Barnett, V. & Lewis, T., Outliers in Statistical Data. John Wiley, New York, 1978.
14. Gale, H. J., The lognormal distribution and some examples of its application in the field of radiation protection. Report AERE-R-4736, AERE, Harwell, 1965.
15. Natrella, M. G., Experimental Statistics. National Bureau of Standards Handbook 91, Washington, DC, 1966.
16. Schmid, C. F. & Schmid, S. E., Handbook of Graphic Presentation, 2nd edn. John Wiley, New York, 1979.