Travel Cost - AJAE - 2009
Employing a unique and rich data set of water quality attributes in conjunction with detailed household
characteristics and trip information, we develop a mixed logit model of recreational lake usage and
undertake thorough model specification and fitting procedures to identify the best set of explanatory
variables, and their functional form for the estimated model. Our empirical analysis shows that
individuals are responsive to the full set of water quality measures used by biologists to identify the
impaired status of lakes. Thus, changes in these quality measures translate into changes in the
recreational usage patterns and well-being of individual households. Willingness-to-pay (WTP)
estimates are reported based on improvements in these physical measures.
Parsons and Kealy (1992) use dummy variables based on dissolved oxygen levels and average Secchi transparency readings to capture the impact of water quality on Wisconsin lake recreation. Similarly, Parsons, Helm, and Bondelid (2003) construct dummy variables indicating high and medium water quality levels for use in their analysis of recreational demand in six northeastern states. These dummy variables are based on pollution loading data and water quality models, rather than on direct measurements of the local water quality. In these studies, the physical water quality indicators are found to significantly impact recreation demand, but because of the limited nature of the measures themselves, provide only a partial picture of the value associated with possible water quality improvements. Other papers that have used one or more measures of water quality include von Haefen (2003), Atasoy, Palmquist and Phaneuf (2006), Phaneuf (2002), Kaoru, Smith, and Liu (1995), Ribaudo and Piper (1991), Russell and Vaughan (1982), and Stevens (1966).
An alternative to physical measures of water quality has been the use of perceived or reported water quality measures (Adamowicz et al. 1997; Jeon et al. 2005). While perceived measures are likely to be the direct drivers of behavior (McConnell 1993) and could be studied in a structural model (Kaoru, Smith, and Liu 1995), their major drawback is that information can typically be gleaned only for the sites the individual has visited. While important questions concerning the relationships between perceived and observed measures remain, it is likely that perceptions are related to physical measures, the focus of this work.
Bockstael, Hanemann, and Strand’s (1986) analysis of beach usage in the Boston-Cape Cod area has perhaps one of the most extensive lists of objective physical water quality attributes included in a model of recreation: oil, fecal coliform, temperature, chemical oxygen demand (COD), and turbidity. However, the study also points out one of the frequently encountered problems in isolating the impact of individual water quality attributes: multicollinearity. Seven additional water quality measures were available to the analysts (color, pH, alkalinity, phosphorus, nitrogen, ammonia, and total coliform), but were excluded from the analysis due to correlation among the measures. While these choices are certainly reasonable, the lack of direct information on how nutrient levels (phosphorus and nitrogen) impact recreational usage is unfortunate in the context of setting standards in many states, where nutrient loadings are of particular concern.
The contribution of the current article lies in our ability to incorporate a rich set of physical water quality attributes, as well as site and household characteristics, into a model of recreational lake usage. Importantly, the full set of water quality variables used by biologists to classify lakes as impaired by the EPA, and therefore potentially in need of policy action, is included. Trip data for the study are drawn from the 2002 Iowa Lakes Survey. The survey was sent to a random sample of 8,000 Iowa households, eliciting information on their recreational visits to Iowa’s 129 principal lakes, along with socio-demographic data and attitudes toward water quality issues. The unique feature of the project, however, is that a parallel inventory of the physical attributes of these lakes was conducted by Iowa State University’s Limnology Laboratory. Three times a year, over the course of a five-year project, thirteen distinct water quality measurements were taken at each of the lakes, providing a clear physical characterization of the conditions in each lake. Moreover, because of the wide range of lake conditions in the state, Iowa is particularly well suited to identifying the impact of these physical characteristics on recreation demand. Iowa’s lakes vary from a few clean lakes with up to fifteen feet of visibility to other lakes having some of the highest concentrations of nutrients in the world, and roughly half of the 129 lakes included in the study are on the EPA’s list of impaired lakes. An additional unique aspect of Iowa lakes is that the diversity of land uses in the watershed contributing to them leads to a relatively low degree of collinearity among the physical and chemical water quality measures, with correlation coefficients ranging from −0.53 to 0.68, and typically lying below 0.4.[2] Thus, Iowa lakes provide an almost ideal “laboratory” for studying the effects of biological water quality measures on usage and value.
A second unique contribution of this study is the application of careful model specification and fitting procedures to identify the best set of explanatory variables, and their functional form, for the estimated model. Since economic theory does not provide guidance
[2] The biological measures, cyanobacteria and total phytoplankton, are more highly correlated. In the model below, the correlation reaches 0.9.
[6] The authors would like to thank Spencer Banzhaf for pointing out the concern with the correlated standard errors, and one of the reviewers for pointing out the concern with the misspecification error.
The unit of observation is the household, although we also include the survey respondent’s individual socio-demographic data.
[8] The $0.25 per mile is used as a relatively conservative estimate of gasoline and depreciation costs per mile of driving. This estimate is generally less than most official government reimbursement rates. The round-trip travel distances and times were calculated using the software package PCMiler (Streets Version 17).
[9] The “average wage rate” is calculated for all respondents as their household’s income divided by 2,000 (assuming total annual hours worked is forty hours per week for 50 weeks).
As one of the reviewers points out, the handicap facilities variable may proxy more generally for the ease of accessibility of the
[11] The candidate fish species are bluegill, crappie, largemouth bass, smallmouth bass, catfish, bullhead, walleye, sunfish, yellow perch, and northern pike. See Iowa Department of Natural Resources (2004b) for details.
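The footnotes above on the $0.25 per-mile cost and the “average wage rate” describe two ingredients of a travel cost calculation. A minimal sketch of how they might be combined is given below. The share of the wage used to value travel time is not stated in this excerpt, so it is left as an explicit argument (`time_cost_share`); the function and argument names are illustrative and are not the authors' code.

```python
COST_PER_MILE = 0.25     # dollars per mile, from the footnote above
HOURS_PER_YEAR = 2000    # forty hours per week for 50 weeks


def average_wage_rate(household_income):
    """Hourly wage proxy: annual household income divided by 2,000 hours."""
    return household_income / HOURS_PER_YEAR


def travel_cost(round_trip_miles, round_trip_hours, household_income, time_cost_share):
    """Out-of-pocket driving cost plus the opportunity cost of travel time.

    time_cost_share is an assumption supplied by the caller: the excerpt does
    not say which fraction of the wage is used to value travel time.
    """
    driving = COST_PER_MILE * round_trip_miles
    time = time_cost_share * average_wage_rate(household_income) * round_trip_hours
    return driving + time


# Hypothetical household: 100-mile, 2-hour round trip, $50,000 income.
print(travel_cost(100, 2.0, 50_000, time_cost_share=1 / 3))
```

Keeping the time-cost share as a required argument makes the assumption visible at every call site rather than burying it in a default.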
There are, of course, a large number of potential model specifications given the range of
characteristics, such as wake restrictions and site facilities, observed in previous studies (e.g., Train 1998). For example, some households may prefer to visit less developed lakes with wake restrictions in place, while others might be attracted to sites allowing the use of motorboats, jet skis, etc. To allow for a range of possible reactions to the various site characteristics, it is initially assumed that the random parameters (β_i^a) are each normally distributed with the mean and dispersion of each parameter estimated.
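The role of a normally distributed random parameter can be illustrated with a simulated choice probability: draw a coefficient, compute the logit probabilities for that draw, and average over draws. This is a minimal standard-library sketch under simplifying assumptions (one random coefficient, one choice occasion), not the authors' estimator; the attribute values and function names are hypothetical.

```python
import math
import random


def logit_probs(utilities):
    """Standard logit probabilities, computed in a numerically stable way."""
    m = max(utilities)
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]


def simulated_mixed_logit_probs(attr, fixed_util, mean, sd, n_draws=500, seed=0):
    """Average logit probabilities over normal draws of one random coefficient.

    attr[j]       : site attribute with a random coefficient (e.g., a wake-restriction dummy)
    fixed_util[j] : the part of site j's utility that has fixed coefficients
    mean, sd      : mean and dispersion of the normally distributed coefficient
    """
    rng = random.Random(seed)
    avg = [0.0] * len(attr)
    for _ in range(n_draws):
        beta = rng.gauss(mean, sd)
        probs = logit_probs([f + beta * a for f, a in zip(fixed_util, attr)])
        avg = [s + p / n_draws for s, p in zip(avg, probs)]
    return avg


# Hypothetical three-site example; only the first site has wake restrictions.
probs = simulated_mixed_logit_probs(attr=[1, 0, 0], fixed_util=[0.0, 0.2, -0.1],
                                    mean=0.5, sd=1.0)
print([round(p, 3) for p in probs])
```

With a large dispersion relative to the mean, some draws give the attribute a negative coefficient, which is how the model accommodates households that dislike a feature others value.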
Even restricting our attention to the water quality characteristics in table 2, there are a large number of potential model specifications. We focus on five groups of water quality characteristics for the Q_j in equation (9): Secchi transparency, chlorophyll, nutrients (total nitrogen and total phosphorus), suspended solids (inorganic and organic), and bacteria (cyanobacteria and total phytoplankton). The first four characteristic groups directly affect visible features of water quality, making it more likely that households respond to them. Bacteria is included because surveyed households report it to be the single most important water quality concern (see Azevedo et al. 2003).
Our initial intent was to consider three possible specifications for each of these groups of variables: inclusion linearly, inclusion logarithmically, or exclusion. However, preliminary analysis indicated that these variables individually and as groups were consistently significant at a 5% level or better. Thus, we chose to focus on determining whether each group of factors should enter the model in a linear or logarithmic fashion. This required estimating a total of 2^5 = 32 versions of the model. Table 4 provides a summary of the results, with the various specifications listed in terms of decreasing log-likelihood. The models in table 4 are non-nested alternatives relative to each other, precluding a direct test of one model over another.[13] Since the models all employ the same number of parameters, Pollak and Wales’ (1991) likelihood dominance criterion for model selection, used in the current analysis, reduces to a direct comparison of log-likelihood values. One interesting feature of the results is that the model ranking is lexicographic in terms of the bacteria specification, with all models that have bacteria entering logarithmically being preferred to (i.e., having a higher log-likelihood relative to) those in which bacteria enters linearly, regardless of the functional form for the remaining variables. Given the specification for bacteria, the models having chlorophyll entering logarithmically are generally preferred to those having it enter linearly. The preferred model has Secchi transparency and suspended solids entering the model linearly, with the remaining variables entering in a logarithmic fashion. This model is referred to as model A below.[14]
[13] One can, of course, estimate an “encompassing” model, including both linear and logarithmic versions of the water quality variables. Each model in table 4 could then be tested against this hybrid specification. As Greene (2000, p. 301) notes, however, there are problems with this approach. First, such an approach tests each model against the hybrid model, not against each other. Second, the hybrid model requires a substantial increase in the number of parameters, increasing potential multicollinearity problems. Moreover, given that the log-likelihood value for the preferred model provides a lower bound on the log-likelihood value for the hybrid model, the results in table 4 indicate that, at best, only the first two models would not be rejected using a likelihood ratio test. Thus, the hybrid strategy would provide little guidance regarding the choice of models.
[14] We also estimated exploratory models utilizing the quadratic function for the water quality measures. The results indicate that a model with Secchi transparency and the suspended solids entered quadratically (and the nutrients and bacteria levels remaining logged) fits better than model A. However, we do not formally include the quadratic form in our specification search, as the versions of the model would increase from 32 to 243 (= 3^5).
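The size of the specification search described above follows from simple counting: each of the five variable groups takes one functional form, so two forms give 2^5 = 32 models and three forms give 3^5 = 243. A small sketch of the enumeration, and of selection by log-likelihood when all models have the same number of parameters, is shown below. The group labels and function names are illustrative, and no fitted log-likelihoods from the paper are reproduced.

```python
from itertools import product

GROUPS = ["secchi", "chlorophyll", "nutrients", "suspended_solids", "bacteria"]
FORMS = ["linear", "log"]


def enumerate_specifications(groups=GROUPS, forms=FORMS):
    """Every assignment of one functional form to each variable group."""
    return [dict(zip(groups, combo)) for combo in product(forms, repeat=len(groups))]


def select_by_loglik(loglik_by_spec):
    """With equal parameter counts, pick the specification with the highest log-likelihood.

    loglik_by_spec maps a hashable specification label to its fitted log-likelihood.
    """
    return max(loglik_by_spec, key=loglik_by_spec.get)


specs = enumerate_specifications()
print(len(specs))                                                           # 2**5
print(len(enumerate_specifications(forms=["linear", "log", "quadratic"])))  # 3**5
```

The preferred model A corresponds to one element of `specs`: Secchi transparency and suspended solids linear, the other three groups logarithmic.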
and (4) bacteria (cyanobacteria and total phytoplankton). Focusing on the five groups of water quality measures we have considered, Secchi transparency is clearly the best single measure to include, as it is easy to obtain and consistently has a substantial and statistically significant impact on recreational site selection and participation decisions. Nutrient levels are also relatively easy to collect, and phosphorus levels are the most important additional measure after Secchi transparency. Moreover, nutrient loadings have consistently been a target of agricultural and environmental policies, and understanding their impact on recreational usage patterns would seem worthwhile. Therefore, if possible, we recommend the inclusion of the nutrient levels in the data collection process. The bacteria levels are also found to be consistently correlated with recreational behavior in our analysis. However, they are also the most costly to collect. Choosing to include them in the data collection process will be dictated in part by the potential for cyanobacteria levels to reach dangerous levels and threaten human health. Questions in the Iowa Lakes Survey indicate that this was the single greatest concern regarding water quality amongst respondents. In our informal cost-benefit analysis, chlorophyll and the suspended solids fare the worst. They are relatively more expensive to collect and also the least important measures in explaining recreational usage patterns.
[15] Hensher and Greene (2003) provide an excellent discussion of the alternative distributional assumptions commonly used in mixed logit models and their limitations.
Finally, the specification search process above assumes that the random parameters associated with site characteristics are each normally distributed. While this assumption is common practice in the literature, there is no a priori basis for this choice. Moreover, a number of authors have noted potential problems with unbounded parameter distributions (such as the normal) leading to implausible results for some portion of the population (Revelt and Train 1998). As a final stage in the specification search process, we consider alternative choices for the random parameters in our model.[15] Specifically, we grouped the site characteristic parameters in (b) the five parameters associated with the discrete variables for paved ramps, handicap facilities, state parks, wake restrictions, and the fish index, and (c) participation parameter α_i. For each set of parameters, we consider both normal and triangular distributions, for a total of eight sets of distributional assumptions. The advantage of the triangular distribution is that it allows for both positive and negative parameter values and retains the
For the results in table 7, a total of 750 MLHS draws are used in the simulation. Changing the distributional assumptions across specifications required a large number of draws to obtain stability in the comparisons. The estimation results in table 8 are also with 750 MLHS draws.
There is, of course, a key distinction between model A0 and the standard nested logit specification in that the latter assumes that the individual’s decisions over choice occasions are independent, whereas model A0 allows for correlation as induced by the shared element α_i.
The distributional assumptions investigated here are by no means exhaustive. In particular, discrete-factor and latent class distributions may also prove useful in capturing variations in agent preferences, including the potential multi-modal features of these underlying distributions. See, e.g., Morey, Thacher and Breffle 2006; Hilger and Hanemann 2006.
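The count of eight sets of distributional assumptions follows from choosing one of two distributions for each of three parameter sets (2^3 = 8). The sketch below enumerates the combinations and contrasts a normal draw with a symmetric triangular draw, which has bounded support. The set labels and the `draw` helper are hypothetical, and the triangular parameterization (center and spread) is an assumption made for illustration.

```python
import random
from itertools import product

PARAMETER_SETS = ["set_a", "set_b", "set_c"]   # placeholder labels for the three sets
DISTRIBUTIONS = ["normal", "triangular"]

# One distribution per parameter set: 2**3 combinations.
assumptions = list(product(DISTRIBUTIONS, repeat=len(PARAMETER_SETS)))
print(len(assumptions))


def draw(dist, center, spread, rng):
    """One draw of a random parameter.

    normal     : unbounded support, standard deviation equal to spread
    triangular : symmetric, bounded on [center - spread, center + spread]
    """
    if dist == "normal":
        return rng.gauss(center, spread)
    if dist == "triangular":
        return rng.triangular(center - spread, center + spread, center)
    raise ValueError(f"unknown distribution: {dist}")


rng = random.Random(0)
tri_draws = [draw("triangular", 0.5, 1.0, rng) for _ in range(1000)]
print(min(tri_draws) >= -0.5, max(tri_draws) <= 1.5)
```

Because the triangular bounds here straddle zero, draws can be positive or negative, while extreme values of the kind a normal tail can generate are ruled out.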
A1. Since model A1 best fits the data, we consider it to be the best model for estimation, welfare analysis, and prediction. In the next section, we turn to the estimation of model A1 and other models for
Estimation Results
Given the results from the specification search, five models were used in the second third of the sample: models A, A1, and A0 (varying the specification for the random parameters of the model), a model including only Secchi transparency as a measure of water quality (referred to as model B hereafter), and a model with all of the water quality variables entering in a linear fashion (model C).[19] We include model B to illustrate the consequences of relying on a single measure of water quality, in this case one that is often available to analysts. Model C reflects what might be naturally considered as a default specification. The resulting parameter estimates are presented in tables 7 and 8.
In table 7a, focusing on model A1, all of the coefficients associated with household characteristics are significant at the 5% level, except for age. Note that the socio-demographic data are included in the conditional indirect utility for the stay-at-home option. Therefore, in model A1, males, less-educated individuals, and larger households are all more likely to take a trip to a lake. However, the size and sign of the age, school, and household size variables are sensitive to the model specification used. The price coefficient is negative and virtually identical across all
[19] An additional model (model D), including all thirteen water quality characteristics listed in table 2, was also estimated and is available in the electronic appendix to this paper (Egan et al., forthcoming), available online through AgEconSearch. The results from this expanded model are not substantially different from the results for model A, with pH and alkalinity also found to be statistically significant factors in recreation site selection.
The physical water quality coefficients reported in table 7b are relatively stable across the various models. For all models, the effect of Secchi transparency is positive and, in general, inorganic and organic (volatile) suspended solids have a negative impact (although ISS and VSS are not statistically significant at the 5% level), indicating the respondents strongly value water clarity. However, the coefficient on logged chlorophyll is positive, suggesting that on average respondents do not mind (or even prefer) some “greenish” water. The negative coefficient on logged total phosphorus, the most likely principal limiting nutrient, indicates higher algae growth leads to fewer recreational trips. High logged total nitrogen levels also have a negative impact on recreational utility associated with a site, although it is only statistically significant at the 5% level for model A. The coefficient on the log of the possibly toxic cyanobacteria is negative, while the coefficient on the log of total phytoplankton is positive, indicating that as cyanobacteria make up a larger percentage of the total phytoplankton in the lake, recreators are less likely to visit.
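The cyanobacteria-share interpretation can be checked with a little algebra. Writing cyanobacteria as a share s of total phytoplankton T, the two log terms combine as b_c·ln(sT) + b_p·ln(T) = b_c·ln(s) + (b_c + b_p)·ln(T), so for a fixed total a larger share lowers the index whenever b_c is negative. The sketch below uses the model A coefficients reported in table 7b (−2.44 and 3.44, scaled by 10); the concentration values are hypothetical and the function names are illustrative.

```python
import math

B_LOG_CYANO = -2.44   # coefficient on log(cyanobacteria), model A, table 7b (scaled by 10)
B_LOG_PHYTO = 3.44    # coefficient on log(total phytoplankton), model A, table 7b (scaled by 10)


def algae_terms(cyano, total_phyto):
    """Contribution of the two bacteria terms to the site index."""
    return B_LOG_CYANO * math.log(cyano) + B_LOG_PHYTO * math.log(total_phyto)


def algae_terms_from_share(share, total_phyto):
    """The same contribution, rewritten in terms of the cyanobacteria share."""
    return (B_LOG_CYANO * math.log(share)
            + (B_LOG_CYANO + B_LOG_PHYTO) * math.log(total_phyto))


total = 100.0                               # hypothetical total phytoplankton level
low_share = algae_terms(0.2 * total, total)
high_share = algae_terms(0.8 * total, total)
print(low_share > high_share)               # a higher share gives a lower index
```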
Finally, turning to the site amenities, again all of the parameters are of the expected sign. As a lake increases in size, has a cement boat ramp, gains handicap facilities, or is adjacent to a state park, the average number of visits to the site increases. Notice, however, the large dispersion estimates in table 8b. For example, in model A the dispersion on the size of the lake indicates 11% of the population prefers a smaller lake, possibly someone who enjoys
Table 7. Repeated Mixed Logit Model Parameter Estimates: Fixed Parameters^a (Robust Std. Errors in Parentheses)

                              Model A      Model B      Model C      Model A1     Model A0
                                           (Secchi)     (Linear WQ)
a. Household Characteristics
Male                          −7.91        −6.65        −9.37**      −6.63**      −5.33**
                              (6.57)       (3.47)       (2.16)       (1.71)       (1.60)
Age                           1.13         −0.23        −0.73        −0.35        −0.43
                              (0.32)       (0.41)       (0.47)       (0.29)       (0.28)
Age^2                         −0.0071*     0.0064       0.0090*      0.0064*      0.0079**
                              (0.0032)     (0.0048)     (0.0042)     (0.0027)     (0.0027)
School                        −2.11        5.80         −2.59        5.70         −2.62
                              (2.29)       (2.34)       (2.37)       (1.84)       (1.74)
Household                     0.75         −0.68        0.36         −1.88**      −1.41*
                              (0.73)       (0.69)       (1.15)       (0.56)       (0.65)
Price                         −0.49**      −0.49**      −0.49**      −0.49**      −0.50**
                              (0.023)      (0.024)      (0.024)      (0.023)      (0.023)
b. Water Quality Attributes
Secchi transp. (m)            2.40**       1.77**       1.73**       2.49**       1.38**
                              (0.66)       (0.35)       (0.33)       (0.58)       (0.53)
Log(chlor. (µg/l))^b          2.37*                     0.022        2.42**       1.52*
                              (1.02)                    (0.16)       (0.69)       (0.67)
Log(nitrogen (mg/l))^b        −1.16                     −0.18        −0.84        −0.44
                              (0.53)                    (0.13)       (0.51)       (0.46)
Log(phosph. (µg/l))^b         −2.83**                   −0.023**     −2.87**      −2.30**
                              (0.96)                    (0.0078)     (0.76)       (0.75)
Inorganic SS (mg/l)           −0.0057                   −0.012       0.0013       −0.014
                              (0.029)                   (0.034)      (0.029)      (0.030)
Volatile SS (mg/l)            −0.019                    0.052        −0.014       −0.0074
                              (0.076)                   (0.78)       (0.069)      (0.069)
Log(cyanobact. (mg/l))^b      −2.44**                   −0.0010      −2.39**      −1.86**
                              (0.40)                    (0.030)      (0.36)       (0.38)
Log(total phytop. (mg/l))^b   3.44**                    0.0022       3.28**       2.19**
                              (0.56)                    (0.030)      (0.49)       (0.51)
LogLik                        −37,627      −37,759      −37,743      −37,568      −39,960

Note: All of the parameters are scaled by 10. Single asterisk (*) and double asterisk (**) are used to denote significance at the 5% and 1% levels, respectively. Model B includes Secchi transparency as its only water quality measure.
^b For Model C, each of these variables enters linearly, rather than logarithmically.
Note: All of the parameters are scaled by 10, except α (which is unscaled). Single asterisk (*) and double asterisk (**) are used to denote significance at the 5% and 1% levels, respectively.
a The dispersion coefficients are the estimated standard deviations for the normally distributed parameters, and the estimated spread coefficients for the
Root mean squared errors (RMSE) are reported in parentheses. We calculate RMSE as $\mathrm{RMSE} = \sqrt{N^{-1}\sum_{i=1}^{N}(Y_i - \hat{Y}_i)^2}$, where $Y_i$ is the total trips and $\hat{Y}_i$ is the fitted total trips for each person.
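The RMSE calculation in the note above can be written out directly. This is a minimal sketch: `actual` holds each person's total trips and `fitted` the corresponding fitted totals, and the function name and the example values are illustrative, not data from the paper.

```python
import math


def rmse(actual, fitted):
    """Root mean squared error between actual and fitted total trips."""
    if len(actual) != len(fitted) or not actual:
        raise ValueError("actual and fitted must be non-empty and of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, fitted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical trip counts for four people.
print(rmse([0, 2, 5, 10], [1.0, 2.5, 4.0, 7.0]))
```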
example, West Okoboji Lake has slightly over five times the water clarity, measured by Secchi transparency, of the other lakes. The second scenario is a less ambitious, though more realistic, plan of improving nine lakes to the water quality of West Okoboji Lake (see table 10 for comparison). The state is divided into nine zones with one lake in each zone, allowing every Iowan to be within a couple of hours of a lake with superior water quality. The nine lakes were chosen based on recommendations by the Iowa Department of Natural Resources for possible candidates of a clean-up project. The third and final scenario is also a policy-oriented improvement. Currently, of the 129 lakes, 65 are officially listed on the EPA’s impaired waters list, and by 2009 the plans must be in place to improve the water quality at these lakes enough to remove them from the list. Therefore, in this scenario, the 65 impaired lakes would be improved to the median physical water quality levels of the 64 non-impaired lakes. The last two columns of table 10 compare the median values for the non-impaired lakes to the averages of the impaired lakes.
The resulting compensating variations and changes in total trips under each scenario are reported in table 11. We consider the welfare results from model A1 to be the best results for policy analysis. For comparison purposes, we present welfare results from models A, A0, B, and C as well. Finally, the relatively poor performance of the various models in terms of out-of-sample predictions, particularly in terms of nonparticipants, raises concerns about the resulting welfare measures. One approach to the problem, as suggested by a reviewer, is to condition on the observed choices in the sample (see, e.g., von Haefen, 2003). Table 11 provides both unconditional and conditional welfare measures, though we would recommend the conditional estimates as more credible.[20]
We start by contrasting the unconditional and conditional welfare and trip measures. As expected, conditioning on the observed usage patterns typically reduces the compensating variations (by as much as 40%), though the qualitative pattern of the results remains generally the same. This is consistent with the fact that, unconditionally, the models generally over-predict both participation and the
[20] We use 1,000 draws in the unconditional and conditional welfare simulations. Moreover, for the conditional welfare simulations, we also report Krinsky and Robb (1986) standard errors using 200 simulated conditional welfare estimates.
Egan et al. Valuing Water as a Function of Quality Measures 121