L Moments

Download as pdf or txt
Download as pdf or txt
You are on page 1of 39

RC 20525 (90933) 8/5/96, revised 7/25/05 Mathematics 33 pages

Research Report
Fortran routines for use with the method of L-moments Version 3.04
J. R. M. Hosking IBM Research Division T. J. Watson Research Center Yorktown Heights, NY 10598

Research Division Almaden Austin Beijing Haifa T.J. Watson Tokyo Zurich

Fortran routines for use with the method of L-moments Version 3.04

J. R. M. Hosking IBM Research Division P. O. Box 218 Yorktown Heights, NY 10598

Abstract. Hosking (J. R. Statist. Soc. B, 1990) and Hosking and Wallis (Regional frequency analysis: an approach based on L-moments, Cambridge University Press, 1997) have described L-moments and L-moment ratios, quantities useful in the summarization and estimation of probability distributions. This report contains details of Fortran routines that should facilitate the use of L-moment-based methods.

The routines described in this Research Report are available on the electronic software repository StatLib, at the Internet location http://lib.stat.cmu.edu/general/lmoments.

Contents
Introduction Routines for specic distributions Routines to compute L-moments Routines for regional frequency analysis Auxiliary routines Data les Links between routines Machine dependence Change history References 1 4 12 16 26 27 28 29 30 32

Index of routines
Routine CDFEXP CDFGAM CDFGEV CDFGLO CDFGNO CDFGPA CDFGUM CDFKAP CDFNOR CDFPE3 CDFWAK CLUAGG CLUINF CLUKM DERF DIGAMD DLGAMA DURAND GAMIND LMREXP LMRGAM Page 6 6 6 6 6 6 6 6 6 6 6 17 19 20 26 26 26 26 26 8 8 Routine LMRGEV LMRGLO LMRGNO LMRGPA LMRGUM LMRKAP LMRNOR LMRPE3 LMRWAK PELEXP PELGAM PELGEV PELGLO PELGNO PELGPA PELGUM PELKAP PELNOR PELPE3 PELWAK QUAEXP Page 8 8 8 8 8 8 8 8 8 10 10 10 10 10 10 10 10 10 10 10 7 Routine QUAGAM QUAGEV QUAGLO QUAGNO QUAGPA QUAGUM QUAKAP QUANOR QUAPE3 QUASTN QUAWAK REGLMR REGTST SAMLMR SAMLMU SAMPWM SORT XCLUST XFIT XSIM XTEST Page 7 7 7 7 7 7 7 7 7 26 7 21 22 13 14 15 26 24 24 25 25

Introduction
Probability weighted moments of a random variable X with cumulative distribution function F were dened by Greenwood et al. (1979) to be the quantities Mp,r,s = E [X p {F (X )}r {1 F (X )}s ] . Particularly useful special cases are the probability weighted moments r = M1,0,r = E [X {1 F (X )}r ] and r = M1,r,0 = E [X {F (X )}r ] . Hosking (1990) dened L-moments to be the quantities
r = E [XPr 1 {F (X )}] ( . ) is the r th shifted Legendre polynomial. L-moments and probability weighted where Pr moments are related by r

r+1 =
k=0

p r,k k

(1)

where
r k p r,k = (1)

r k

r+k . k

L-moment ratios are the quantities r = r /2 . The foregoing quantities are dened for a probability distribution, but in practice must often be estimated from a nite sample. Let x1:n x2:n ... xn:n be the ordered sample. Let ar = n1 br = n 1
r+1 = k=0

(n j )(n j 1)...(n j r + 1) xj :n , (n 1)(n 2)...(n r) j =1 (j 1)(j 2)...(j r) xj :n , (n 1)(n 2)...(n r) j =1 p r,k bk .


n

(2) (3)

Then ar , br and r are unbiased estimators of r , r and r respectively. The estimator tr = r / 2 of r is consistent but not unbiased. Landwehr et al. (1979a, 1979b) also considered plotting-position estimators of probability weighted moments. Let pj :n be a plotting 1

position, i.e. a distribution-free estimator of F (xj :n ). Then r , r , r and r are estimated respectively by
n

r = n1
j =1 n

(1 pj :n )r xj :n , pr j :n xj :n ,
j =1 n Pr 1 (pj :n ) xj :n , j =1

r = n1 r = n1 r / 2 . r =

(4) (5)

Plotting-position estimators are consistent, but can have large bias and cannot be generally recommended (Hosking and Wallis, 1995; Hosking and Wallis, 1997, sec. 2.8). Hosking (1990) and Hosking and Wallis (1997, chap. 2) give expositions of the theory of L-moments and L-moment ratios. Hosking and Wallis (1997, Appendix) give, for many distributions in common use, expressions for the L-moments of the distributions and algorithms for estimating the parameters of the distributions by equating sample and population L-moments (the method of L-moments). This report contains details of Fortran routines that should facilitate the use of L-moment-based methods. The following routines are included. For each of the eleven distributions listed in Table 1 on page 5, routines to evaluate the cumulative distribution function and quantile function of the distribution, to calculate the L-moments given the parameters and to calculate the parameters given the loworder L-moments. Given a sample of data, routines to calculate estimates of the probability weighted moments and L-moments of the distribution from which the sample was drawn. Routines useful in connection with index-ood procedures for regional frequency analysis (Hosking and Wallis, 1997). Three routines are concerned with cluster analysis, and do not explicitly involve L-moments. One routine calculates a weighted average of L-moment ratios estimated from data measured at dierent sites, an essential part of L-moment-based index-ood procedures such as the regional L-moment algorithm described in Hosking and Wallis (1997, sec. 6.2). One other routine calculates diagnostic statistics useful in deciding whether a set of samples may be regarded as having similar frequency distributions: these statistics are described more fully in Hosking and Wallis (1997, chaps. 35). Programs to apply the regional L-moment algorithm and to calculate the diagnostic statistics for a regional data set. Auxiliary routines called by some of the earlier routines: mathematical functions, a pseudo-random number generator and a routine to sort a data array into ascending order.

The routines are written in Fortran-77. The routines aim for accuracy of about 8 signicant digits. For this reason double-precision arithmetic is used throughout. Single precision may 2

be adequate on machines that use 60 bits to represent a single-precision oating-point number. Caveat: Statements about the accuracy of individual routines are not unconditional guarantees, but are valid over wide ranges of values of the input parameters including those most frequently encountered in practice. Decreased accuracy is particularly likely when the shape parameter of the distribution is close to zero (less than 104 in absolute value), when the shape parameter is close to the end of the range of its permissible values, or is close to a value at which the mean of the distribution does not exist, or when the shape parameter is large and positive.

Routines for specic distributions


Routines are provided for the eleven distributions listed in Table 1. The following routines are provided for each distribution. FUNCTION CDFxxx FUNCTION QUAxxx SUBROUTINE LMRxxx SUBROUTINE PELxxx The cumulative distribution function of the distribution. The quantile function (inverse cumulative distribution function) of the distribution. Calculates the L-moment ratios of the distribution given its parameters. Calculates the parameters of the distribution given its L-moments. When the L-moments are the sample L-moments of a set of data, then the resulting parameters are of course the method of L-moments estimates of the parameters.

Here xxx is the 3-letter code used to identify the distribution, as given in Table 1. For example the cumulative distribution function of the gamma distribution is FUNCTION CDFGAM. Most of these distributions are standard and are as dened in Hosking and Wallis (1997, Appendix). The generalized Normal distribution combines lognormal distributions (which have positive skewness), reected lognormal distributions (which have negative skewness) and the Normal distribution (which has zero skewness). The three-parameter lognormal distribution, with cumulative distribution function F (x) = [{log(x ) }/ ] , is a generalized Normal distribution with parameters k = , = e , = + e . The Pearson type III distribution combines gamma distributions (which have positive skewness), reected gamma distributions (which have negative skewness) and the Normal distribution (which has zero skewness). We parametrize the Pearson type III distribution by its rst three (conventional) moments: the mean , the standard deviation and the skewness . The relationship between these parameters and those of the gamma distribution is as follows. Let X be a random variable with a Pearson type III distribution with parameters , and . If > 0, then X + 2/ has a gamma distribution with parameters = 4/ 2 , = /2. If = 0, then X has a Normal distribution with mean and standard deviation . If < 0, then X + 2/ has a gamma distribution with parameters = 4/ 2 , = |/2|.

Table 1: Distributions for which L-moment subroutines are provided.


Number of Parameters parameters

Distribution Code

F (x), x(F ) F = 1 exp{(x )/} x = log(1 F ) F = G(x/, ) x(F ) not explicitly dened F = exp[{1 k (x )/}1/k ] x = + {1 ( log F )k }/k F = 1/[1 + {1 k (x )/}1/k ] x = + [1 {(1 F )/F }k ]/k F = [k 1 log{1 k (x )/}] x(F ) not explicitly dened F = 1 {1 k (x )/}1/k x = + {1 (1 F )k }/k F = exp[ exp{(x )/}] x = + log( log F ) F = [1 h{1 k (x )/}1/k ]1/h x = + [1 {(1 F )h /h}k ]/k F = {(x )/ } x(F ) not explicitly dened
2 >0 F = G (x + 2/ )/| 1 2 |, 4/ , 1 F = 1G (x+2/ )/| 2 |, 4/ 2 , <0 x(F ) not explicitly dened

Exponential

EXP

Gamma

GAM

Generalized GEV extreme-value Generalized logistic Generalized Normal Generalized Pareto Gumbel GLO

GNO

GPA

GUM

Kappa

KAP

kh

Normal

NOR

Pearson type III

PE3

Wakeby

WAK

F (x) not explicitly dened x = + {1 (1 F ) } {1 (1 F ) }

G(x, ) = {()}1 (x) = (2 )


1/2 x

x 1 t t e 0 2

dt is the incomplete gamma integral.

exp(t /2) dt is the standard Normal cumulative distribution function.

Cumulative distribution functions


DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION CDFEXP(X,PARA) CDFGAM(X,PARA) CDFGEV(X,PARA) CDFGLO(X,PARA) CDFGNO(X,PARA) CDFGPA(X,PARA) CDFGUM(X,PARA) CDFKAP(X,PARA) CDFNOR(X,PARA) CDFPE3(X,PARA) CDFWAK(X,PARA)

Formal parameters X PARA [input; double precision] The argument of the function. [input; double-precision array of length equal to the number of parameters of the distribution] The parameters of the distribution.

Numerical method CDFGAM and CDFPE3 are calculated from the incomplete gamma integral, auxiliary routine GAMIND. CDFGNO and CDFNOR are calculated from the error function, supplied as the auxiliary routine DERF. Routine CDFWAK, given an argument x0 of the Wakeby cumulative distribution function, solves the equation x(F ) = x0 for F using Halleys method, the second-order analogue of Newton-Raphson iteration (Churchhouse, 1981, pp. 151-152; Huh, 1986). The other cumulative distribution functions have explicit analytical expressions, given in Table 1: the routines straightforwardly evaluate these expressions. Accuracy In principle (but see the caveat on page 3) all routines are accurate to at least 8 decimal places.

Quantile functions
DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE DOUBLE PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION PRECISION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION QUAEXP(F,PARA) QUAGAM(F,PARA) QUAGEV(F,PARA) QUAGLO(F,PARA) QUAGNO(F,PARA) QUAGPA(F,PARA) QUAGUM(F,PARA) QUAKAP(F,PARA) QUANOR(F,PARA) QUAPE3(F,PARA) QUAWAK(F,PARA)

Formal parameters F PARA [input; double precision] The argument of the function. [input; double-precision array of length equal to the number of parameters of the distribution] The parameters of the distribution.

Numerical method Routine QUAGAM, given an argument F0 of the Gamma quantile function, solves the equation G(x, ) = F0 for x using Newton-Raphson iteration. Routines QUAGNO and QUANOR use the quantile function of the standard Normal distribution, supplied as the auxiliary routine QUASTN, and transform it to get the required quantile function. The other quantile functions have explicit analytical expressions, given in Table 1: the routines straightforwardly evaluate these expressions. Accuracy Let F be the argument of the quantile function. In principle (but see the caveat on page 3) each routine returns a quantile x(F0 ) corresponding to an argument F0 that satises |F F0 | 108 . In other words, the error of the approximation is equivalent to a perturbation of no more than 108 in the argument of the function.

L-moment ratios
SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE LMREXP(PARA,XMOM,NMOM) LMRGAM(PARA,XMOM,NMOM) LMRGEV(PARA,XMOM,NMOM) LMRGLO(PARA,XMOM,NMOM) LMRGNO(PARA,XMOM,NMOM) LMRGPA(PARA,XMOM,NMOM) LMRGUM(PARA,XMOM,NMOM) LMRKAP(PARA,XMOM,NMOM) LMRNOR(PARA,XMOM,NMOM) LMRPE3(PARA,XMOM,NMOM) LMRWAK(PARA,XMOM,NMOM)

Formal parameters PARA XMOM NMOM [input; double-precision array of length equal to the number of parameters of the distribution] The parameters of the distribution. [output; double-precision array of length NMOM] The L-moment ratios of the distribution, in the order 1 , 2 , 3 , 4 , ... . [input; integer] The number of L-moments to be found. In routines LMRGAM and LMRPE3, NMOM may be at most 4; in the other routines it may be at most 20.

Numerical method LMREXP LMRGAM L-moments and L-moment ratios are calculated using the expressions in Hosking and Wallis (1997, Appendix A.2). L-moments and L-moment ratios are calculated using expressions in Hosking and Wallis (1997, Appendix A.9): analytic expressions for 1 and 2 , rational-function approximations for 3 and 4 . Probability weighted moments k are calculated using eq. (9) of Hosking et al. (1985). Each k is then calculated from k and the j , j = 1, ..., k 1, using the formula k 2k (2j 1)(2k )! k+1 = k j , k (k + j )! (k + 1 j )! j =1 easily proved by induction using (1). LMRGLO The rst two L-moments are calculated using the expressions in Hosking and Wallis (1997, Appendix A.7). Higher-order L-moment ratios are expressed as polynomials in the shape parameter k . The rst two L-moments are calculated using the expressions in Hosking and Wallis (1997, Appendix A.8). For k = 0, higher-order L-moment ratios are those 8

LMRGEV

LMRGNO

of the Normal distribution. For k = 0, higher-order L-moment ratios are found by numerical evaluation of the integral exp{(x + k/ 2)2 }Pr1 (erf x) dx ,

where Pr1 ( . ) is the (r 1)th Legendre polynomial and erf is the error function; this integral is, apart from a linear transformation, the rth L-moment of the distribution. LMRGPA The routine evaluates the explicit formulas given by Hosking and Wallis (1997, Appendix A.5). Higher-order L-moments (r , r 5) are computed from the general expression (1 + k )(r 1 k ) r = , (1 k )(r + 1 + k ) which can be derived from the expression for r given by Hosking and Wallis (1987, p. 341). The rst two L-moments are calculated using the expressions in Hosking and Wallis (1997, Appendix A.3). Higher-order L-moment ratios are the same for all Gumbel distributions. They are initialized in a data statement and copied into array XMOM as required. Probability weighted moments k are calculated using eq. (12) of Hosking (1994). The k are calculated from the k as in routine LMRGEV. The rst two L-moments are calculated using the expressions in Hosking and Wallis (1997, Appendix A.4). Higher-order L-moment ratios are the same for all Normal distributions. They are initialized in a data statement and copied into array XMOM as required. L-moments are calculated from those of the gamma distribution, of which the Pearson type III is a reparametrization. The numerical method is the same as that of routine LMRGAM. The routine evaluates the explicit formulas given by Hosking and Wallis (1997, Appendix A.11). Higher-order L-moments (r , r 5) are computed from the general expression r = (1 + )(r 1 ) (1 )(r 1 + ) + , (1 )(r + 1 + ) (1 + )(r + 1 )

LMRGUM

LMRKAP LMRNOR

LMRPE3

LMRWAK

which can be derived from the expression for r given by Kotz et al. (1988, p. 514). Accuracy In principle (but see the caveat on page 3) all routines are accurate to at least 8 signicant digits for 1 and 2 . For higher-order L-moment ratios, routines LMREXP, LMRGLO, LMRGPA, LMRGUM, LMRNOR and LMRWAK are accurate to at least 8 decimal places. Routines LMRGAM, LMRGNO and LMRPE3 are accurate to 6 decimal places. Routines LMRGEV and LMRKAP become less and less accurate as the order of the L-moment ratio increases; their accuracy for r is at least 8 decimal places if r 12, at least 4 decimal places if r 18. 9

Parameters
SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE SUBROUTINE PELEXP(XMOM,PARA) PELGAM(XMOM,PARA) PELGEV(XMOM,PARA) PELGLO(XMOM,PARA) PELGNO(XMOM,PARA) PELGPA(XMOM,PARA) PELGUM(XMOM,PARA) PELKAP(XMOM,PARA,IFAIL) PELNOR(XMOM,PARA) PELPE3(XMOM,PARA) PELWAK(XMOM,PARA,IFAIL)

Formal parameters XMOM PARA IFAIL [input; double-precision array of length equal to the number of parameters of the distribution] The L-moment ratios 1 , 2 , 3 , 4 , ... . [input; double-precision array of length equal to the number of parameters of the distribution] The parameters of the distribution. [output; integer] Used only by routines PELKAP and PELWAK. For explanation see Numerical method below.

Numerical method Except where indicated below, the routines evaluate expressions for the parameters in terms of the L-moments given in the appendix of Hosking and Wallis (1997). PELGAM PELGEV Rational approximation is used to express , the shape parameter of the gamma distribution, as a function of the L-CV = 2 /1 . For 0.8 3 < 1, the shape parameter k is estimated from rational-function approximations given by Donaldson (1996). This approximation has absolute accuracy better than 3 107 . If 3 < 0.8, Donaldsons approximation is further rened by Newton-Raphson iteration until similar accuracy is achieved. Parameters and are then calculated as in Hosking and Wallis (1997, Appendix A.6). The method is as described by Hosking (1994). Newton-Raphson iteration is used to solve the equations that express 3 and 4 as functions of k and h, and and are calculated as functions of 1 , 2 , k and h. To ensure a 1-1 relationship between parameters and L-moments, the parameter space is restricted as in Hosking (1994, eq. (13)): k > 1; if h < 0 then hk > 1; h > 1; k + 0.725h > 1 .

PELKAP

10

On exit, parameter IFAIL is set as follows. 0 1 2 Successful exit. L-moments invalid: 2 0, or 3 and 4 do not lie within the set of feasible values given in Hosking and Wallis (1997, eq. (2.45)). 2 )/6. This implies that the L-moments Routine was called with 4 (1 + 53 are inconsistent with a kappa distribution with parameters restricted as above. Newton-Raphson iteration did not converge. Newton-Raphson iteration reached a point from which no further progress could be made. Newton-Raphson iteration encountered numerical diculties: overow would have been likely to occur. Iteration for h and k converged, but overow would have occurred while calculating and .

3 4 5 6

PELWAK

The basic method is that of Landwehr et al. (1979b), but the distribution is differently parametrized and the calculations are expressed in terms of L-moments rather than probability weighted moments. The computations are given by Hosking and Wallis (1997, Appendix A.11). First a solution is sought in which all ve parameters are estimated, as functions of the rst ve L-moments. If no solution is found, is set equal to zero and a solution is sought in which the other four parameters are estimated as functions of the rst four L-moments. If this too is unsuccessful, then a generalized Pareto distribution is tted instead, using the rst three L-moments and the same method as routine PELGPA. On exit, parameter IFAIL is set as follows. 0 1 2 3 Successful exit; all ve parameters estimated. Solution could only be found by setting = 0. Solution could only be found by tting a generalized Pareto distribution. L-moments invalid: 2 0, or |r | 1 for some r 3.

Accuracy For shape parameters, routines PELGLO and PELGPA are accurate to at least 8 decimal places; PELGEV is accurate to within 3 107 ; PELGNO has relative accuracy better than 2.5 106 ; PELKAP is accurate to within 5 106 ; PELGAM and PELPE3 have relative accuracy better than 5 105 . For any of these routines, let be the relative error of estimation of the shape parameter; for routines PELGUM and PELNOR, let = 108 ; then the estimate of the scale parameter (or , or ) has relative error approximately , and the estimate of the location parameter has absolute error approximately . Routine PELWAK is usually accurate to within 106 , but can be much less accurate for certain combinations of input parameters. In such cases, however, the quantiles of the tted distribution are usually still reliable to within an error equivalent to a perturbation of less than 106 in the argument of the quantile function. 11

Routines for computing sample L-moments


Three routines are provided for computing sample L-moments and probability weighted moments of a data sample. SUBROUTINE SAMLMR SUBROUTINE SAMLMU SUBROUTINE SAMPWM Calculates the sample L-moment ratios of a data set, indirectly via the probability weighted moments. Calculates the unbiased sample L-moment ratios of a data set by a more direct method. Calculates the sample probability weighted moments of a data set.

12

SUBROUTINE SAMLMR(X,N,XMOM,NMOM,A,B) Purpose Calculates the sample L-moment ratios of a data set. Formal parameters X N XMOM NMOM A,B [input; double-precision array of length N] The data, in ascending order. [input; integer] Sample size. [output; double-precision array of length NMOM] The sample L-moment ratios, in the order 1 , 2 , t3 , t4 , ... . [input; integer] The number of L-moments to be found. At most min(N, 20). [input; double precision] Parameters of plotting position. If A and B are both set to zero, unbiased estimates of L-moments are found. Otherwise plotting-position estimates are found, based on the plotting position pj :n = (j + A)/(n + B) for the j th smallest of n observations. For example, A=-0.35D0 and B=0.0D0 yields the estimates recommended by Hosking et al. (1985) for use with the generalized extreme-value distribution.

Numerical method The routine explicitly computes probability weighted moments using (2) or (4), and from them the sample L-moments using (3) or (5).

13

SUBROUTINE SAMLMU(X,N,XMOM,NMOM) Purpose Calculates the unbiased sample L-moment ratios of a data set. Formal parameters X N XMOM NMOM [input; double-precision array of length N] The data, in ascending order. [input; integer] Sample size. [output; double-precision array of length NMOM] The sample L-moment ratios, in the order 1 , 2 , t3 , t4 , ... . [input; integer] The number of L-moments to be found. At most min(N, 100).

Numerical method From (2) and (3), we can write


r

is a linear combination of the ordered sample values x1:n , . . . , xn:n , and


n r

= n 1
j =1

wj :n xj :n .

(r )

The weights wj :n are related to the discrete Legendre polynomials dened by Neuman and Schonbach (1974). In Neuman and Schonbachs notation, the weight wj :n is the discrete Legendre polynomial (1)r Pr1 (j 1, n 1). The weights can be computed as wj :n = 1, wj :n = 2(j 1)/(n 1) 1, and, for r 3, (r 1)(n r + 1)wj :n = (2r 3)(2j n 1)wj :n
(r ) (r 1) (2) (1) (r )

(r )

(r 2)(n + r 2)wj :n ;

(r 2)

this recurrence relation is equivalent to equation (9) of Neuman and Schonbach (1974). Routine SAMLMU evaluates sample L-moments using this recurrence, and achieves additional (r ) (r ) eciency by making use of the symmetry in the weights, wj :n = (1)r1 wn+1j :n . This routine is generally equivalent to SAMLMR, but is particularly suitable when highorder L-moment ratios are to be calculated. Routine SAMLMU is accurate to 8 decimal places for about the rst 50 L-moment ratios, whereas SAMLMR achieves similar accuracy for only the rst 12 L-moment ratios. The routines require approximately the same amount of computing time, but SAMLMU can be a little slower than SAMLMR when NMOM is large or odd.

14

SUBROUTINE SAMPWM(X,N,XMOM,NMOM,A,B,KIND) Purpose Calculates the sample probability weighted moments of a data set. Formal parameters X N XMOM NMOM A,B [input; double-precision array of length N] The data, in ascending order. [input; integer] Sample size. [output; double-precision array of length NMOM] The sample probability weighted moments, in the order (if KIND=1) a0 , a1 , ..., or (if KIND=2) b0 , b1 , ... . [input; integer] The number of probability weighted moments to be found. At most min(N, 20). [input; double precision] Parameters of plotting position. If A and B are both set to zero, unbiased estimates of probability weighted moments are found. Otherwise plotting-position estimates are found, based on the plotting position pj :n = (j + A)/(n + B) for the j th smallest of n observations. For example, A=-0.35D0 and B=0.0D0 yields the estimates recommended by Hosking et al. (1985) for use with the generalized extreme-value distribution. [input; integer] Indicates the kind of probability weighted moments to be found. If KIND=1, the routine estimates the probability weighted moments r = E [X {1 F (X )}r ]. If KIND=2, the routine estimates the probability weighted moments r = E [X {F (X )}r ].

KIND

15

Routines for regional frequency analysis


These routines implement the methods described by Hosking and Wallis (1997) for regional frequency analysis, which combines the L-moments of data samples from dierent measuring sites to achieve improved estimates of the frequency distribution at each site. Three routines provide facilities for cluster analysis, and do not explicitly involve L-moments. Two routines implement the estimation and testing methods from Hosking and Wallis (1997, chaps. 36). Four driver programs are also provided: these illustrate the use of the regional frequency analysis routines by applying them to data sets used by Hosking and Wallis (1997). SUBROUTINE CLUAGG SUBROUTINE CLUINF SUBROUTINE CLUKM SUBROUTINE REGLMR SUBROUTINE REGTST Performs cluster analysis by one of several agglomerative hierarchical methods: single-link, complete-link, and Wards procedure. Obtains information about clusters arising from agglomerative hierarchical clustering. Performs cluster analysis by the K -means algorithm. Calculates regional weighted averages of L-moment ratios. Calculates statistics useful in regional frequency analysis. The statistics are based on the at-site and regional average sample L-moments. Discordancy measures for individual sites in a region. These are intended to assess whether the sample L-moments at a site are markedly dierent from the regional average. Heterogeneity measures for a region. These are intended to assess whether the between-site variation in the sample L-moments is consistent with what would be expected of a homogeneous region. Goodness-of-t measures, based on the regional average L-moment ratios. These are intended to assess whether any of ve candidate distributions gives an adequate t to the data. The statistics are described in more detail by Hosking and Wallis (1997, chaps. 35). PROGRAM XCLUST PROGRAM XFIT Program to illustrate the use of the cluster analysis routines. Program to illustrate the use of routine REGLMR. The program performs frequency analysis of a regional data set using an index-ood procedure and the method of L-moments. Program to illustrate the use of routine REGTST. Program to illustrate the use of simulation to nd the accuracy of estimates obtained from regional frequency analysis.

PROGRAM XTEST PROGRAM XSIM

16

SUBROUTINE CLUAGG(METHOD,X,NX,N,NATT,MERGE,DISP,IWORK,WORK,NW) Purpose Performs cluster analysis by one of several agglomerative hierarchical methods: single-link, complete-link, and Wards procedure. Formal parameters METHOD X NX N NATT MERGE [input; integer] Denes which clustering method is used. Should be set to 1 for single-link clustering, 2 for complete-link clustering, 3 for Wards procedure. [input; double-precision array of dimension (NX,NATT)] X(I,J) should contain the Jth attribute for the Ith data point. [input; integer] The rst dimension of array X, as declared in the calling program. [input; integer] Number of data points. [input; integer] Number of attributes for each data point. [output; integer array of dimension (2,N)] MERGE(1,I) and MERGE(2,I) are the labels of the clusters merged at the Ith stage. MERGE(1,N) and MERGE(2,N) are not used. [output; double-precision array of length N] DISP(I) is a measure of the withincluster dispersion after the Ith merge. Dispersion is dened dierently for each method: see below. DISP(N) is not used. [local; integer array of length N] Work array. [local; double-precision array of length NW] Work array. [input; integer] Length of array WORK. Must be at least N*(N-1)/2.

DISP

IWORK WORK NW Method

In agglomerative hierarchical clustering, there are initially N clusters, each containing one data point, labeled 1 through N in the same order as the data points. At each stage of clustering, two clusters are merged. Their labels are saved in the MERGE array. The smaller of the two labels is used as the label of the merged cluster. After the Mth stage of clustering there are N M clusters. To nd which data points belong to which clusters, use routine CLUINF. Dierent methods correspond to dierent criteria for which clusters should be merged at each stage. The following methods are implemented. Single-link clustering: the distance between two clusters A and B is dened to be the minimum of the Euclidean distances between pairs of points with one point in A and one in B . At each stage, the two clusters separated by the smallest distance are merged. The square of this distance is saved in the corresponding element of array DISP. 17

Complete-link clustering: the distance between two clusters A and B is dened to be the maximum of the Euclidean distances between pairs of points with one point in A and one in B . At each stage, the two clusters separated by the smallest distance are merged. The square of this distance is saved in the corresponding element of array DISP. DISP(I) is therefore the largest squared Euclidean distance between two points that are in the same cluster after the Ith merge. Wards method: at each stage, the clusters that are merged are chosen to minimize the total within-cluster sum of squared deviations of each attribute about the cluster mean. This sum of squares is saved in the corresponding element of array DISP. This routine makes no claim to eciency in either computing time or storage requirements. It is computationally practical for small and moderate-sized problems. For large problems, more ecient clustering programs may be preferred, e.g., for Wards procedure, the hcon2 program of Murtagh (1985), available from StatLib at http://lib.stat.cmu.edu/multi/hcon2.

18

SUBROUTINE CLUINF(NCLUST,N,MERGE,IASSGN,LIST,NUM) Purpose Obtains information about clusters arising from agglomerative hierarchical clustering. Agglomerative hierarchical clustering procedures typically produce a list of the clusters merged at each stage of the clustering. This routine uses this list to construct arrays that explicitly show which cluster a given data point belongs to, and which data points belong to a given cluster. Formal parameters NCLUST N MERGE [input; integer] Number of clusters. [input; integer] Number of data points. [input; integer array of dimension (2,N)] MERGE(1,I) and MERGE(2,I) identify the clusters merged at the Ith step. This is the array MERGE returned by routine CLUAGG, and should be left unchanged after exit from that routine. [output; integer array of length N] Its Ith element is the number of the cluster to which the Ith data point belongs. [output; integer array of length N] Contains the data points in cluster 1, followed by the data points in cluster 2, etc. Data points in each cluster are listed in increasing order. The last data point in each cluster is indicated by a negative number. See the example below. [output; integer array of length NCLUST] Contains the number of data points in each cluster.

IASSGN LIST

NUM

Cluster numbers used in arrays IASSGN, LIST and NUM range from 1 to NCLUST. They are arbitrary, but are uniquely dened: cluster 1 contains data point 1, cluster M (M 2) contains data point J, where J=MERGE(2,N-M). Example of the LIST array. Suppose that there are 8 data points and 3 clusters, and that the elements of the LIST array are 1, 4, 3, 6, 8, 2, 5, 7. Then the clusters are as follows: cluster 1 contains points 1 and 4; cluster 2 contains points 3, 6 and 8; cluster 3 contains points 2, 5 and 7.

19

SUBROUTINE CLUKM(X,NX,N,NATT,NCLUST,IASSGN,LIST,NUM,SS,MAXIT,IWORK,RW,NW) Purpose Performs cluster analysis by the K -means algorithm. The aim of the K -means algorithm is to partition a set of points into K clusters so that the within-cluster sum of squared distances from the points to their cluster centers is minimized. Except for very small data sets, it is not practical to require that the sum of squares is minimized over all possible partitions. Instead, a local optimum is sought such that no movement of a point from one cluster to another will reduce the within-cluster sum of squares. Formal parameters X NX N NATT NCLUST IASSGN [input; double-precision array of dimension (NX,NATT)] X(I,J) should contain the Jth attribute for the Ith data point. [input; integer] The rst dimension of array X, as declared in the calling program. [input; integer] Number of data points. [input; integer] Number of attributes for each data point. [input; integer] Number of clusters. [input and output; integer array of length N] On entry, should contain the initial assignment of sites to clusters. On exit, contains the nal assignment. The Ith element of the array contains the label of the cluster to which the Ith data point belongs. Labels must be between 1 and NCLUST, and each of the values 1 through NCLUST must occur at least once. [output; integer array of length N] Contains the data points in cluster 1, followed by the data points in cluster 2, etc. Data points in each cluster are listed in increasing order. The last data point in each cluster is indicated by a negative number. [output; integer array of length NCLUST] Contains the number of data points in each cluster. [output; double precision] Within-group sum of squares of the nal clusters. [input; integer] Maximum number of iterations for the K -means algorithm. [local; integer array of length NCLUST*3] Work array. [local; real array of length NW] Work array. N.B. this array is of type real, not double precision! [input; integer] Length of array RW. Must be at least (N+NCLUST)*(NATT+1)+2*NCLUST.

LIST

NUM SS MAXIT IWORK RW NW

Method This routine is a front end to Applied Statistics Algorithm AS136 (Hartigan and Wong, 1979). Algorithm AS136 is not included in this package, but is available from StatLib at http://stat.lib.cmu.edu/apstat/136.

20

SUBROUTINE REGLMR(NSITE,NMOM,NXMOM,XMOM,WEIGHT,RMOM) Purpose Calculates regional weighted averages of L-moment ratios, as in Hosking and Wallis (1997, sec. 6.2). Formal parameters NSITE NMOM NXMOM XMOM [input; integer] The number of sites in the region. [input; integer] The number of L-moments to be found. [input; integer] The rst dimension of array XMOM, as declared in the calling program. [input; double-precision array of dimension (NXMOM,NSITE)] XMOM(K,I) should contain the Kth L-moment ratio for site I, as returned by a previous call to routine SAMLMR. [input; double-precision array of length NSITE] WEIGHT(I) should contain the relative weight of site I in the regional weighted average. It is recommended that, as in Hosking and Wallis (1997, sec. 6.2), the weights be proportional to the record lengths at each site. [output; double-precision array of length NMOM] The regional weighted average R R R L-moment ratios R 1 , 2 , t3 , t4 , ... , as dened below.

WEIGHT

RMOM

Numerical method Let the number of sites be N and let the L-moment ratios at site i be 1 , 2 , t3 , t4 , ... . Let the weight allotted to site i be wi . When the data are scaled by dividing by the mean, (i) (i) (i) (i) the L-moment ratios become 1, t(i) , t3 , t4 , ..., where t(i) = 2 / 1 is the at-site L-CV. The regional weighted average mean, L-CV and L-moment ratios are the weighted averages of these at-site values, and are dened by
N R 1 N N N i) wi t( r i=1 i=1 (i) (i) (i) (i)

= 1,

tR =
i=1

wi t(i)
i=1

wi ,

tR r =

wi ,

r = 3, 4, ...

(cf. Hosking and Wallis, 1997, sec. 6.2, who set wi to be the record length at site i).

21

SUBROUTINE REGTST(NSITES,NAMES,LEN,XMOM,A,B,SEED,NSIM,NPROB,PROB, KPRINT,KOUT,RMOM,D,VOBS,VBAR,VSD,H,Z,PARA) Purpose Calculates the discordancy, heterogeneity and goodness-of-t measures described by Hosking and Wallis (1997, chaps. 35). Formal parameters NSITES NAMES LEN XMOM [input; integer] Number of sites in region. [input; CHARACTER*12 array of length NSITES] Site names. [input; double-precision array of length NSITES] Record lengths at each site. [input; double-precision array of dimension (5,NSITES)] Contains the rst 5 sample L-moment ratios for each site, in the order mean, L-CV, t3 , t4 , t5 . XMOM(K,I) should contain the Kth L-moment ratio for site I. Elements XMOM(1,.) are not used by the routine. Note that XMOM(2,.) should contain L-CV, not the usual 2! [input; double precision] Parameters of plotting position. Should be the same as the values used to calculate the L-moments in the XMOM array. [input; double precision] Seed for random number generator. Should be a whole number in the range 2D0 to 2147483647D0. [input; integer] Number of simulated worlds used in calculation of heterogeneity and goodness-of-t measures. If NSIM is set to zero, the routine will calculate only the discordancy measures. [input; integer] Number of quantiles to be calculated for each distribution that gives a good t to the data. [input; double-precision array of length NPROB] Probabilities for which quantiles are to be calculated. [input; integer] Output ag. Should be set to zero to suppress printed output, to 1 to produce printed output. [input; integer] Channel to which output is directed. [output; double-precision array of length 5] Regional weighted average L-moment ratios. [output; double-precision array of length NSITES] Discordancy measure for each site. [output; double-precision array of length 3] The observed values of three heterogeneity statistics. V(1) is the weighted standard deviation of the at-site L-CVs. V(2) is the weighted average distance from a site to the regional average on a graph of L-CV vs. L-skewness. V(3) is the weighted average distance from a site to the regional average on a graph of L-skewness vs. L-kurtosis. 22

A,B SEED NSIM

NPROB PROB KPRINT KOUT RMOM D VOBS

VBAR VSD H Z

[output; double-precision array of length 3] The means of the NSIM simulated values of the V statistics. [output; double-precision array of length 3] The standard deviations of the NSIM simulated values of the V statistics. [output; double-precision array of length 3] Heterogeneity measures for the region: H(J)=(VOBS(J)-VBAR(J))/VSD(J). [output; double-precision array of length 5] Goodness-of-t measures for ve distributions: (1) generalized logistic, (2) generalized extreme-value, (3) generalized Normal (lognormal), (4) Pearson type III (3-parameter gamma), and (5) generalized Pareto. [output; double-precision array of dimension (5,6)] Parameters of regional distributions tted by the above ve distributions, plus (6) Wakeby. PARA(I,J) contains the Ith parameter of the Jth distribution.

PARA

Numerical method The routine straightforwardly carries out the calculations described in the Formal denition subsections of Hosking and Wallis (1997, chaps. 35). Computed quantities are as dened in the following equations in Hosking and Wallis (1997): discordancy measures are the Di in eq. (3.3); heterogeneity statistic V(1) is V in eq. (4.4); heterogeneity statistics V(2) and V(3) are V2 and V3 in eqs. (4.6) and (4.7); heterogeneity measure H(1) is H in eq. (4.5); heterogeneity measures H(2) and H(3) are computed analogously to H in eq. (4.5) but using V2 and V3 in place of V ; goodness-of-t measures are Z DIST in eq. (5.6).

23

PROGRAM XCLUST Purpose This program illustrates the use of the cluster analysis routines. An example data set and the results obtained by applying PROGRAM XCLUST to it are given in the les APPALACH.DAT and APPALACH.OUT respectively. This cluster analysis is described in Hosking and Wallis (1997, sec. 9.2). The data are site characteristics for 104 streamow gaging stations in central Appalachia. The attributes for each site are drainage basin area, gage elevation, gage latitude and gage longitude. Nonlinear transformations are applied to two of the attributes: a logarithmic transformation to drainage basin area and a square root transformation to gage elevation. These transformations give a more symmetric distribution of the values of the site characteristics at the 104 sites, reducing the likelihood that a few sites will have site characteristics so far from the other sites that they will always be assigned to a cluster by themselves, and, in Hosking and Walliss judgement, give a better correspondence between dierences in site characteristics and the degree of hydrologic dissimilarity between dierent basins. All four attributes are then standardized by dividing by the standard deviation of their values at the 104 sites. Finally, the drainage basin area variable is multiplied by 3 to give it an importance in the clustering procedure equal to that of the other variables together. The transformation and weighting of the variables involves subjective decisions whose ultimate justication is the physical plausibility of the regions that are ultimately obtained from the clustering procedure. Clusters are formed by Wards method, using routine CLUAGG. Hosking and Wallis (1997) judged that seven clusters is an appropriate number; information about these clusters is printed using routine CLUINF. The clusters obtained by Wards method are adjusted using the K -means procedure (routine CLUKM). This yields clusters that are a little more compact in the space of transformed site characteristics. The cluster numbers in the output le are not those used by Hosking and Wallis (1997), who renumbered the clusters according to the average drainage area of the sites in each cluster.

PROGRAM XFIT Purpose This program is an example of how routine REGLMR and others can be used to perform frequency analysis on a regional data set using an index-ood procedure that ts a regional distribution using the method of L-moments. The program ts a Wakeby distribution to the regional L-moments using the regional L-moment algorithm described by Hosking and Wallis (1997, sec. 6.2). It should be easy to change the program to t other distributions. An example data set and the results obtained by applying PROGRAM XFIT to it are given in les MAXWIND.DAT and MAXWIND.OUT respectively. The data are annual maximum wind speeds at 12 sites in the U.S., and are taken from Simiu et al. (1979). These 12 sites, all in the south-east of the U.S. and close to the coast, were judged to constitute a homogeneous region suitable for the application of an index-ood procedure (Wallis, 1993). 24

PROGRAM XTEST Purpose This program is an example of the use of routine REGTST. The program reads a le of atsite sample L-moments for one or more regions and calls routine REGTST to calculate test statistics for the regions. An example data set and the results obtained by applying PROGRAM XTEST to it are given in les CASCADES.DAT and CASCADES.OUT respectively. The data set is given in Hosking and Wallis (1997, Table 3.4) and used in the Example sections of Hosking and Wallis (1997, chaps. 36). The data are annual precipitations at 19 sites in the northwest U.S., the North Cascades region of Plantico et al. (1990). Data were obtained from the Historical Climatology Network (Karl et al., 1990). The results in CASCADES.OUT indicate that the region is acceptably close to homogeneous and can be well described by a lognormal or Pearson type III distribution.

PROGRAM XSIM Purpose This program illustrates the use of Monte Carlo simulation to derive the properties of estimated quantiles in regional frequency analysis. The program implements the algorithm in Table 6.1 of Hosking and Wallis (1997, p. 95), with inter-site correlation matrix taking the form of equal correlation between each pair of sites (Hosking and Wallis, 1997, eq. (6.11)). The region is set up as in Hosking and Wallis (1997, sec. 6.5). The region contains 19 sites with record lengths as for the North Cascades data. The sites have lognormal frequency distributions with L-CV varying linearly from 0.0978 at site 1 to 0.1228 at site 19, and each site has L-skewness 3 = 0.0279. The intersite correlation is 0.64, the average intersite correlation for the North Cascades sites. 10, 000 realizations of this region are made and the regional L-moment algorithm described by Hosking and Wallis (1997, sec. 6.2) is used to t a lognormal distribution to the data generated at each realization. The regional average relative RMSE of the estimated growth curve is calculated from the simulations, and quantiles of the distribution of the regional average of the ratio of the estimated to the true at-site growth curve are computed from a histogram accumulated during the simulations. From these quantiles, 90% error bounds for the growth curve are computed as in Hosking and Wallis (1997, eq. (6.19)). These results are contained in the output from PROGRAM XSIM, which is le XSIM.OUT. PROGRAM XSIM can also be used to nd the average value taken by heterogeneity measures over dierent simulations of a given region. This is useful when assessing the accuracy of estimated quantiles in regional frequency analysis, as discussed by Hosking and Wallis (1997, sec. 6.4). File XSIMH.OUT contains the output from running PROGRAM XSIM with parameter values DSEED=619145091D0, NREP=100, NSIM=500, and KPRINT=1, and omitting the section between dashed lines near the end of the program. The rst of the AVERAGE HETEROGENEITY MEASURES in this output is the source of the statement the average H value of simulated regions is 1.08 in Hosking and Wallis (1997, p. 98). 25

Auxiliary routines
FUNCTION DERF FUNCTION DIGAMD Error function, double precision. The numerical method is the same as that of Algorithm 5666 of Hart et al. (1968). Eulers psi function, (x) = d{log (x)}/dx, double precision. The numerical method is the same as that of Algorithm AS103 (Bernardo, 1976). Natural logarithm of the gamma function, double precision. The numerical method is the same as that of Algorithm ACM291 (Pike and Hill, 1966).

FUNCTION DLGAMA

SUBROUTINE DURAND(SEED,N,X) Fills the double-precision array X, of length N, with pseudo-random numbers uniformly distributed on the interval (0, 1). The double-precision variable SEED is the seed for the random-number generator. It should be initialized to an integer value between 2 and 2147483647 (231 1), though small values, less than 200000 say, should be avoided. It should be left unchanged between successive calls to the routine. The routine uses the algorithm of Lewis et al. (1969). This routine is supplied for completeness and portability, but on most computers it is more ecient to replace routine DURAND by a routine programmed in assembly language. For example, on an IBM RS6000 44P model 170 workstation, the Engineering and Scientic Subroutine Library for AIX, Version 4.2, contains a routine, also called DURAND, which has an identical specication to the one described here and is about 7 times as fast. FUNCTION GAMIND(X,A,G) Incomplete gamma integral, double precision. The function is dened to be G(x, ) = {()}1
0 x

t1 et dt .

The three arguments of the routine are x, and log{()}. The numerical method is the same as that of Algorithm AS239 (Shea, 1988), except that a dierent approximation to G(x, )Hills approximation (Johnson et al., 1994, eq. (18.35))is used for large . FUNCTION QUASTN Quantile function of standard Normal distribution, double precision. The numerical method is the same as that of Algorithm AS241 (Wichura, 1988).

SUBROUTINE SORT(X,N) Sorts the double-precision array X, of length N, into ascending order.

26

Data les
APPALACH.DAT Example data set: data for 104 streamow gaging stations in central Appalachia. The columns of data are as follows. siteid lat long area elev n l1 t t3 t4 t5 The sites Hydrologic Unit Code, a unique identier. Gage latitude, in degrees. Gage longitude, in degrees west of the Greenwich meridian. Drainage basin area, in square miles. Gage elevation, in feet. Sample size. Sample mean, 1 . Sample L-CV, t. Sample L-skewness, t3 . Sample L-kurtosis, t4 . Sample 5th L-moment ratio, t5 .

Only the four columns lat, long, area and elev are used by PROGRAM XCLUST. The data in these columns, and the streamow data used to compute the sample L-moments, were obtained from Hydrodata CD-ROMs (Hydrosphere, 1993), which reproduce data from the U.S. Geological Surveys WATSTORE data les. APPALACH.OUT MAXWIND.DAT Output when PROGRAM XCLUST is applied to the data in APPALACH.DAT. Example data set: annual maximum wind speeds at 12 sites in the southeast U.S., from Simiu et al. (1979). Numbers in parentheses are those used by Simiu et al. (1979, Table 3.4.1) to identify the sites. Output when PROGRAM XFIT is applied to the data in MAXWIND.DAT. Example data set: L-moments of annual precipitation at 19 sites in the northwest U.S. The columns in the table are the site identier used by the Historical Climatology Network, the record length, and the at-site sample L-moments 1 , L-CV, t3 , t4 , t5 . Output when PROGRAM XTEST is applied to the data in CASCADES.DAT. Output from PROGRAM XSIM. Output from PROGRAM XSIM with parameters DSEED=619145091D0, NREP=100, NSIM=500, and other modications as described on page 25.

MAXWIND.OUT CASCADES.DAT

CASCADES.OUT XSIM.OUT XSIMH.OUT

27

Links between routines


The following functions and subroutines call other routines. CDFGAM calls DERF, DLGAMA, GAMIND. CDFGNO calls DERF. CDFNOR calls DERF. CDFPE3 calls DERF, DLGAMA, GAMIND. CDFWAK calls QUAWAK. LMRGAM calls DLGAMA. LMRGEV calls DLGAMA. LMRGNO calls DERF. LMRKAP calls DLGAMA, DIGAMD. LMRPE3 calls DLGAMA. PELGAM calls DLGAMA. PELGEV calls DLGAMA. PELGNO calls DERF. PELKAP calls DLGAMA, DIGAMD. PELPE3 calls DLGAMA. QUAGAM calls DERF, DLGAMA, GAMIND, QUASTN. QUAGNO calls QUASTN. QUANOR calls QUASTN. QUAPE3 calls DERF, DLGAMA, GAMIND, QUAGAM, QUASTN. CLUKM calls routines KMNS, OPTRA and QTRAN from Hartigan and Wong (1979). REGTST calls most of the PELxxx, QUAxxx and auxiliary routines. GAMIND calls DERF.

28

Machine dependence
Some constants initialized in data statements in some of the routines may need to be changed to give good results on dierent machines. The list below explains how the values of these constants should be chosen. The values assigned to these constants in the listings of the routines are appropriate for an IBM 390 computer using the FORTVS2 compiler, for which double-precision constants have precision of approximately 15 decimal digits and maximum magnitude approximately 1075 .
Routine Constant

CDFxxx

SMALL

[used by routines CDFGEV, CDFGLO, CDFGNO, CDFGPA, CDFKAP] Should be as small as possible, so long as underow or overow conditions do not occur. A number x such that the operation ex just does not cause underow. A number x such that the operation ex just does not cause overow. A number x such that the operation ex just does not cause overow. A number x such that the operation (x) just does not cause overow. A number x such that the operation ex just does not cause underow. A number x just large enough that erf x = 1 to machine accuracy. A number x such that
1 log (x) = (x 2 ) log x x + 1 2 log(2 )

CDFWAK LMRKAP PELKAP PELKAP QUAWAK DERF DLGAMA

UFL OFL OFLEXP OFLGAM UFL X1 BIG

to machine precision. The value 10d/2 , where d is the number of decimal digits in a double-precision constant, should be adequate. DLGAMA GAMIND TOOBIG UFL A number x such that the operation x2 just does not cause overow. A number x such that the operation ex just does not cause underow.

In routines that use iterative methods (CDFWAK, LMRGNO, PELGEV, PELKAP, QUAGAM, GAMIND), constants EPS and MAXIT control the test for convergence of the iterations. If an iterative procedure fails frequently, it may be helpful to change the values of these constants.

29

Change history
Initial release (July 1988) Routines: CDFGEV, CDFGLO, CDFGNO, CDFGPA, CDFGUM, CDFKAP, CDFWAK, LMRGEV, LMRGLO, LMRGNO, LMRGPA, LMRGUM, LMRKAP, LMRWAK, PELGEV, PELGLO, PELGNO, PELGPA, PELGUM, PELKAP, PELWAK, QUAGEV, QUAGLO, QUAGNO, QUAGPA, QUAGUM, QUAKAP, QUAWAK, SAMLMR, SAMPWM, SAMREG, SAMXMP, ALGAMD, DIGAMD, ERFD, QUASTN, SORT. Version 2 (August 1991) Routines added: CDFGAM, CDFNOR, CDFPE3, LMRGAM, LMRNOR, LMRPE3, PELGAM, PELNOR, PELPE3, QUAGAM, QUANOR, QUAPE3, REGLMR, REGTST, XFIT, XTEST, DERF, DLGAMA, GAMIND. Routines modied: all except CDFGUM, LMRGPA, LMRWAK. Routines replaced: ALGAMD (by DLGAMA), ERFD (by DERF), SAMREG (by REGLMR), SAMXMP (by XTEST). Version 3 (August 1996) Routines added: CDFEXP, LMREXP, PELEXP, QUAEXP, SAMLMU, CLUAGG, CLUINF, CLUKM, XCLUST, XSIM. Routines modied: CDFGUM CDFWAK LMRKAP PELGEV PELGNO PELKAP REGTST Removed initialization of an unused variable. Iterative procedure modied to reduce the (already small) chance that it will fail to converge. Minor bug xes. Removed initialization of an unused variable. More accurate numerical approximation. Simpler and more accurate numerical approximation. Routine QUASTN no longer required. Returns = 1 if 3 is too large (|3 | 0.95). Modied to reduce the chance of numeric overow. In consequence, two new values of the IFAIL parameter are possible. Changed critical values for discordancy measure. Order in which output is printed was changed: table of parameters now printed before table of quantiles. Placement of IMPLICIT statement corrected: this can cause minor changes in the simulation results returned by the routine. Minor bugs and inconsistencies removed. Minor changes to comments. Minor bug x. Minor changes to comments. Minor bug xes. Minor changes to comments. Corrected error when argument of function was exactly zero. Removed initialization of an unused variable. Minor bug x. Speed increased. 30

XFIT XTEST DERF DLGAMA DURAND

Version 3.01 (December 1996) Routines modied: XSIM Replaced call to nonexistent function CDFSTN by call to DERF.

Version 3.02 (March 1997) Routines modied: CLUAGG CLUINF XCLUST XSIM Implemented single-link and complete-link clustering. Added validity check for parameters. Minor changes to comments. Minor change to FORMAT statement 6080. Changed random number seed.

The le XSIM.OUT was also changed, in consequence of the changes to XCLUST and XSIM. Version 3.03 (June 2000) Routines modied: REGTST XTEST XSIM QUASTN CHARACTER variable declarations changed to conform to Fortran-77 standard. CHARACTER variable declarations changed to conform to Fortran-77 standard. RETURN statements replaced by STOP. Changed WRITE(6,7000) statement to be consistent with FORMAT statement 7000.

The le APPALACH.OUT was also changed, as a consequence of the change to PROGRAM XCLUST made at Version 3.02. Version 3.04 (July 2005) Routines modied: PELWAK SAMLMU REGTST XCLUST XSIM Minor bug x in test for validity of L-moments. Set XMOM(1) to sample mean, not zero, when all data values are equal. Print agged values of discordancy measure even when there is only one. Removed declarations of unused variables. Removed declarations of unused variables.

31

References
Bernardo, J. M. (1976). Algorithm AS 103: psi (digamma) function. Applied Statistics, 25, 315317. Churchhouse, R. F. (ed.) (1981). Numerical methods, Handbook of applicable mathematics, vol. 3. New York: Wiley. Donaldson, R. W. (1996). Calculating inverse CV, skew and PWM functions for Pearson-3, log-normal, extreme-value, and log-logistic distributions. Communications in StatisticsSimulation and Computation, 25, 741747. Greenwood, J. A., Landwehr, J. M., Matalas, N. C., and Wallis, J. R. (1979). Probability weighted moments: denition and relation to parameters of several distributions expressable in inverse form. Water Resources Research, 15, 10491054. Hart, J. F., et al. (1968). Computer approximations. New York: Wiley. Hartigan, J. A., and Wong, M. A. (1979). Algorithm AS 136: A K -means clustering algorithm. Applied Statistics, 28, 100108. Hosking, J. R. M. (1990). L-moments: analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society, Series B, 52, 105124. Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38, 251258. Hosking, J. R. M., and Wallis, J. R. (1987). Parameter and quantile estimation for the generalized Pareto distribution. Technometrics, 29, 339349. Hosking, J. R. M., and Wallis, J. R. (1995). A comparison of unbiased and plottingposition estimators of L-moments. Water Resources Research, 31, 20192025. Hosking, J. R. M., and Wallis, J. R. (1997). Regional frequency analysis: an approach based on L-moments. Cambridge, England: Cambridge University Press. Hosking, J. R. M., Wallis, J. R., and Wood, E. F. (1985). Estimation of the generalized extreme-value distribution by the method of probability-weighted moments. Technometrics, 27, 251261. Huh, M. Y. (1986). Computation of percentage points. Communications in Statistics Simulation and Computation, 15, 11911198. Hydrosphere (1993). Hydrodata CD-ROMs, vol. 4.0: USGS peak values. Hydrosphere Data Products, Boulder, Colo. Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994). Continuous univariate distributions, vol. 1, 2nd edn. New York: Wiley. Karl, T. R., Williams, C. N., Quinlan, F. T., and Boden, T. A. (1990). United States historical climatology network (HCN) serial temperature and precipitation data. ORNL/CDIAC-30 NDP-019/R1, Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tenn. Kotz, S., Johnson, N. L., and Read, C. B. (eds.) (1988). Encyclopedia of statistical sciences, vol. 9. New York: Wiley. Landwehr, J. M., Matalas, N. C., and Wallis, J. R. (1979a). Probability weighted moments compared with some traditional techniques in estimating Gumbel parameters and quantiles. Water Resources Research, 15, 10551064. 32

Landwehr, J. M., Matalas, N. C., and Wallis, J. R. (1979b). Estimation of parameters and quantiles of Wakeby distributions. Water Resources Research, 15, 13611379. Correction: Water Resources Research, 15 (1979), 1672. Lewis, P. A. W., Goodman, A. E., and Miller, J. M. (1969). A pseudorandom number generator for the System/360. IBM Systems Journal, 8, 136146. Murtagh, F. (1985). Multidimensional clustering algorithms. Berlin: Springer. Neuman, C. P., and Schonbach, D. I. (1974). Discrete (Legendre) orthogonal polynomials: a survey. International Journal for Numerical Methods in Engineering, 8, 743770. Pike, M. C., and Hill, I. D. (1966). Algorithm 291: logarithm of the gamma function. Communications of the Association for Computing Machinery, 9, 684. Plantico, M. S., Karl, T. R., Kukla, G., and Gavin, J. (1990). Is recent climate change across the United States related to rising levels of anthropogenic greenhouse gases? Journal of Geophysical Research, 95, 1661716637. Shea, B. L. (1988). Algorithm AS 239: chi-squared and incomplete gamma integral. Applied Statistics, 37, 466473. Simiu, E., Chanery, M. J., and Filliben, J. J. (1979). Extreme wind speeds at 129 stations in the contiguous United States. Building Science Series 118, National Bureau of Standards, Washington, D.C. Wallis, J. R. (1993). Regional frequency studies using L-moments. In Concise encyclopedia of environmental systems, ed. P. C. Young, 468476. Oxford, U.K.: Pergamon. Wichura, M. (1988). Algorithm AS 241: the percentage points of the normal distribution. Applied Statistics, 37, 477484.

33

You might also like