The relevance of hedonic price indices
The case of paintings*
by
Olivier Chanel, Louis-André Gérard-Varet
GREQAM, Marseilles
and
Victor Ginsburgh
Université Libre de Bruxelles and CORE
August 1993
(Revised November 1995)
Abstract
We argue that for the case of heterogeneous commodities with infrequent trading, such as
paintings, it is relevant to base a price index on hedonic regressions using all sales and not
resales only. To support this conclusion we construct a price index for paintings by
Impressionists and their followers and compare the various estimators using bootstrapping
techniques.
Published in the Journal of Cultural Economics 20 (1996), 1-24.
* We are grateful to W. Baumol, S. Bazen, R. Blundell, R. Davidson, A. Farber, Cl. Le Pen, M. Lubrano,
P.-M. Menger, J. M. Montias, R. Moulin, M. Mougeot, J.-Cl. Passeron and M. Shubik for many useful
comments, discussions and encouragements. Two referees provided extremely helpful suggestions which led
us to a large number of revisions. The Musée Cantini (and especially Ms Creuset) and the Musée
Longchamps in Marseilles were of great help in completing the data. Victor Ginsburgh acknowledges
financial support from contract PAI n°26 from the Belgian Government as well as support from Université
Libre de Bruxelles.
Introduction
Real assets, such as houses or paintings, are known to be illiquid: only a fraction of
the stock is on sale during one run of the market. These are also heterogeneous commodities
and the price of each unit depends, to some extent at least, on its own characteristics. In order
to construct a price index for such markets, it is necessary to control for possible
non-temporal determinants of price variations. This is what motivated the estimation of "hedonic
price indices," initiated by Court (1939), extended and used among others by Griliches (1971)
for car prices and Ridker and Henning (1967) for housing.
These techniques produce indices of the market price for a standardized commodity
by using the estimates of a regression of the sale price of a sample of commodities on their
characteristics and on (some representation of) time. Because the correct set of characteristics
is not known with certainty, it has been suggested, following Bailey, Muth and Nourse (1963),
that the analysis be confined to commodities which have been sold more than once and to
estimate an index by regressing the change in the (logarithm of the) price of each commodity
on a set of dummy variables (one for each time period during which the commodity is held).
This is the so-called "repeat-sales regression" method which has been used to compute
indices for property values or family houses by Palmquist (1980), Mark and Goldberg
(1984), Case (1986), Case and Shiller (1987, 1989), Goetzmann (1990a), for paintings by
Anderson (1974) and Goetzmann (1990b, 1993) or for prints by Pesando (1993). Although
the method avoids the difficulty of specifying the various quality characteristics, it does so at
the cost of ignoring all the information on single transactions.
As observed by Shiller (1991), a repeat sales estimator is actually a hedonic estimator
where hedonic variables consist only of commodity dummy variables, one for each
commodity. This also suggests the possibility of using changed characteristics in a repeat
sales regression by augmenting the set of hedonic variables. See Palmquist (1982) or Case
and Quigley (1991). However, for markets characterized by infrequent trades, there is an
advantage to an ordinary hedonic regression including all commodities, even those sold only
once.
In this paper, we consider the market for paintings. Besides Anderson's (1974) and
Goetzmann's (1990b) papers mentioned above, Stein (1977), Baumol (1986) and Frey and
Pommerehne (1989) have also contributed to the issue. Most of these contributions were
motivated by measuring the expected returns to investment in art. They consider the
relationships between art and other assets, but they are not so much interested in constructing
a price index for art markets. We argue that constructing such an index is a preliminary step
to any sensible study of returns and that hedonic estimation provides a suitable method, since
it allows combining information on single sales with information on repeat sales for
commodities the characteristics of which may change over time.
Holub, Hutter and Tappeiner (1993) have criticized the approach in two respects. First,
they rightly claim that it is meaningless to compute a unique price index comprising all
painters, schools and artists.1 Second, they argue that the repeat sales method does not really
avoid the heterogeneity problem since results on homogeneous repeat sales are necessarily
aggregated to produce a global index.
We provide a partial answer to both of their remarks, first by considering a relatively
homogeneous market (Impressionist and Modern paintings), and second, by implicitly
weighting the various artists appearing in our sample. Using all observations on sales
provides many more observations and also avoids the difficult work of searching for paintings
which have been sold twice at least. Unless the artwork sold is described by its number in a
catalogue raisonné, one can never be sure that it is the same work: the title is often translated
into the language of the country where it is sold; many works bear titles which make them
indistinguishable (such as Reclining Nude, or Still Life); dimensions "change" because they are
sometimes not accurately reported or measured, etc.
It is also worth noting that most (if not all) studies estimating returns for art markets
are based on transactions at public auctions. Guerzoni (1994) pointed out that unobserved
private sales (through galleries or other intermediaries) may also take place between sales at
auctions. He shows that if one takes these into account (at least for Reitlinger's (1960, 1970)
compendium used by most researchers), returns may turn out to be very different from those
usually obtained. This adds to the reasons for which it may be better to use all the information
(resales as well as sales) to compute returns. Finally, one cannot exclude the possibility of
selection biases when using resales only: it may be the case that only "good" works (or on the
contrary, only "lemons") appear often on the market.
In the first section we discuss alternative approaches to the construction of price
indices. In section 2, we briefly survey previous empirical findings on the art market. The
third section is devoted to the presentation of the results that we have obtained using hedonic
1 This is also pointed out by Buelens and Ginsburgh (1994).
estimation. In Section 4, hedonic estimation is compared with two other widely used
estimators (the geometric mean and the geometric repeat sales estimator), using bootstrap
replications. We show that the hedonic estimator provides estimates that are much more
precise than the two other estimators, even if the number of observations is identical. In the
conclusion, we take up some issues concerning the art market as a financial institution.
1. On the construction of price indices
In this section, we describe three possible estimators of price indices, obtained from
observing a set of 2N transactions related to i = 1, 2, ..., N different commodities (described in
terms of some attributes or characteristics). For simplicity, we assume that each commodity
has been the object of two transactions.2 The set of dates (say, years) is t = 0, 1, ..., T and
defines possible periods (period t goes from date t - 1 to date t ) or market runs for the
commodities. There exist data on prices for each commodity during some (here, 2) periods,
but not for all commodities in every period. A transaction of commodity i in period t is
indexed by subscripts (i, t).
The three estimators considered now are the geometric mean, the geometric repeat-sales
estimator and the hedonic estimator. We illustrate our discussion using an example in
which there are 2N = 12 sales of N = 6 commodities at three possible dates (T = 2). Let pit be
the (log of the) price Pit of commodity i, sold at date t. Assume commodities 1 and 4 were
sold in t = 0 and 1, commodities 3 and 5 were sold in t = 1 and 2 and finally, commodities 2
and 6 were sold in t = 0 and 2. We define a vector y with elements yi (the logged differences
of prices obtained at two dates) as follows:
\[
y = \begin{pmatrix}
p_{11}-p_{10} \\
p_{22}-p_{20} \\
p_{32}-p_{31} \\
p_{41}-p_{40} \\
p_{52}-p_{51} \\
p_{62}-p_{60}
\end{pmatrix}.
\]
2 This simplifies exposition and is not too restrictive.
The geometric mean estimator
We write the following linear model:
\[
y_i/\tau_i = \beta + \varepsilon_i, \tag{1.1}
\]
where β is a parameter to be estimated and τi is a variable which takes as value the number of
periods during which a commodity was held by an owner (i.e. not sold); τi is thus equal to 1
for i = 1, 3, 4 and 5 and equal to 2 for i = 2 and 6; εi is a random disturbance with the usual
properties. The variable yi/τi is the annualized rate of return on commodity i. The parameter β
can be estimated by running a regression of y/τ on a variable which takes the value one for
each observation. It is trivial to check that the OLS estimate for β is the average of annualized
returns:
\[
\hat{\beta} = \frac{1}{N}\sum_i y_i/\tau_i. \tag{1.2}
\]
This is the estimator used by Baumol (1986) and Frey and Pommerehne (1989).3 It is
obviously very easy to compute, but its drawback is that it does not provide an index over
time. Moreover, it puts equal weights on all annualized rates, irrespective of the length of time
during which the commodity was held. For our example, (1.2) leads to:
\[
\hat{\beta} = \frac{1}{6}\Bigl[(p_{11}-p_{10}) + \frac{p_{22}-p_{20}}{2} + (p_{32}-p_{31}) + (p_{41}-p_{40}) + (p_{52}-p_{51}) + \frac{p_{62}-p_{60}}{2}\Bigr].
\]
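As a rough numerical sketch of (1.2) for the six-commodity example, the snippet below uses invented prices (the entries of `sales` are purely illustrative and are not taken from any data set). It also computes the exact compounded return mentioned in footnote 3, expressed as a net rate by subtracting one, which is a convention assumed here, so that the two can be compared.

```python
import math

# Hypothetical (purchase_date, sale_date, purchase_price, sale_price) for the
# six commodities of the example; the prices are invented for illustration.
sales = {
    1: (0, 1, 100.0, 110.0),
    2: (0, 2, 200.0, 242.0),
    3: (1, 2, 150.0, 165.0),
    4: (0, 1, 80.0, 84.0),
    5: (1, 2, 300.0, 315.0),
    6: (0, 2, 50.0, 60.5),
}

def log_return(t0, t1, p0, p1):
    """Annualized log return y_i / tau_i, as in (1.1)."""
    return (math.log(p1) - math.log(p0)) / (t1 - t0)

def exact_return(t0, t1, p0, p1):
    """(t1 - t0)-th root of P1/P0, minus one (net compounded annual rate)."""
    return (p1 / p0) ** (1.0 / (t1 - t0)) - 1.0

def geometric_mean_estimator(sales):
    """Equation (1.2): unweighted average of the annualized log returns."""
    returns = [log_return(*s) for s in sales.values()]
    return sum(returns) / len(returns)

beta_hat = geometric_mean_estimator(sales)
exact_avg = sum(exact_return(*s) for s in sales.values()) / len(sales)
print(f"beta_hat (log approximation): {beta_hat:.4f}")
print(f"average exact return:         {exact_avg:.4f}")
```

With these invented prices every commodity earns either 5% or 10% per year, so the log approximation sits slightly below the compounded rate, as expected when price ratios are not very close to one.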
The geometric repeat-sales estimator
To derive this estimator, we construct an N x T matrix X. The columns of this matrix
represent periods (not dates); observation i is in row i, which contains ones for periods during
which the commodity was held and zeroes otherwise. For our case, this matrix is:
3 Actually, Baumol and Frey and Pommerehne have used the exact formula $\sqrt[t-t']{P_{it}/P_{it'}}$ to compute the
annual return of a painting sold in t' and subsequently in t. We use the approximation $(\ln P_{it} - \ln P_{it'})/(t-t')$
instead. The two lead to comparable results if Pit is close to Pit'.
\[
X = \begin{pmatrix}
1 & 0 \\
1 & 1 \\
0 & 1 \\
1 & 0 \\
0 & 1 \\
1 & 1
\end{pmatrix}.
\]
The OLS estimator of the two coefficients β1 and β2 is given by:
\[
\hat{\beta} = (X'X)^{-1}X'y \tag{1.3}
\]
and leads, in our example, to the following system of two equations:
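\[
\begin{aligned}
4\hat{\beta}_1 + 2\hat{\beta}_2 &= (p_{11}-p_{10}) + (p_{22}-p_{20}) + (p_{41}-p_{40}) + (p_{62}-p_{60}), \\
2\hat{\beta}_1 + 4\hat{\beta}_2 &= (p_{22}-p_{20}) + (p_{32}-p_{31}) + (p_{52}-p_{51}) + (p_{62}-p_{60}).
\end{aligned}
\]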
This can also be written:
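\[
\begin{aligned}
\hat{\beta}_1 &= \frac{1}{4}\bigl[(p_{11}-p_{10}) + ((p_{22}-\hat{\beta}_2)-p_{20}) + (p_{41}-p_{40}) + ((p_{62}-\hat{\beta}_2)-p_{60})\bigr], \\
\hat{\beta}_2 &= \frac{1}{4}\bigl[(p_{22}-(p_{20}+\hat{\beta}_1)) + (p_{32}-p_{31}) + (p_{52}-p_{51}) + (p_{62}-(p_{60}+\hat{\beta}_1))\bigr].
\end{aligned}
\]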
If we now interpret $\hat{\beta}_1$ and $\hat{\beta}_2$ as being estimates of the mean rates of return in periods 1 and
2 respectively, $(p_{22}-\hat{\beta}_2)$ and $(p_{62}-\hat{\beta}_2)$ are estimates of the prices of commodities 2 and 6, had
they been resold in year 1 instead of year 2, while $(p_{20}+\hat{\beta}_1)$ and $(p_{60}+\hat{\beta}_1)$ are estimates of the
prices of the same commodities, had they been sold for the first time in year 1 instead of year
0. Once this interpretation is accepted, one immediately sees that $\hat{\beta}_1$ is the average return of
the commodities sold in t = 0 and in t = 1, while $\hat{\beta}_2$ is the average return of the commodities
sold in t = 1 and in t = 2.
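The normal equations of the example can be checked numerically. The following is a minimal sketch, assuming invented values for the logged price changes in `y`; it solves the 2 x 2 system implied by (1.3) and verifies the "imputed price" reading of the two coefficients given above.

```python
# Hypothetical logged price changes y_i for commodities 1..6 (invented values).
y = [0.10, 0.30, 0.20, 0.06, 0.16, 0.26]

# One row per commodity, one column per period (period 1, period 2);
# an entry is 1 if the commodity was held during that period.
X = [
    [1, 0],  # commodity 1: sold at t=0 and t=1
    [1, 1],  # commodity 2: sold at t=0 and t=2
    [0, 1],  # commodity 3: sold at t=1 and t=2
    [1, 0],  # commodity 4: sold at t=0 and t=1
    [0, 1],  # commodity 5: sold at t=1 and t=2
    [1, 1],  # commodity 6: sold at t=0 and t=2
]

def ols_two_regressors(X, y):
    """Solve the 2x2 normal equations (X'X) b = X'y by Cramer's rule."""
    a11 = sum(r[0] * r[0] for r in X)
    a12 = sum(r[0] * r[1] for r in X)
    a22 = sum(r[1] * r[1] for r in X)
    b1 = sum(r[0] * yi for r, yi in zip(X, y))
    b2 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

beta1, beta2 = ols_two_regressors(X, y)

# beta1 as the mean of period-1 returns once beta2 has been removed from the
# two-period holdings (commodities 2 and 6), and symmetrically for beta2.
beta1_check = (y[0] + (y[1] - beta2) + y[3] + (y[5] - beta2)) / 4
beta2_check = ((y[1] - beta1) + y[2] + y[4] + (y[5] - beta1)) / 4
print(beta1, beta2)
```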
Usually, the repeat-sales estimator is presented in a slightly different way. Let Ω be a
T x T matrix constructed as follows: row t starts with t ones, while the other elements of the
row are zeroes. For our example, this matrix is:
\[
\Omega = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.
\]
We then construct a matrix Z = XΩ-1 of explanatory variables. This leads to the following
OLS estimator:4
\[
\hat{\gamma} = (Z'Z)^{-1}Z'y. \tag{1.4}
\]
Some straightforward matrix algebra shows that (1.4) can also be written:
\[
\hat{\gamma} = \Omega\hat{\beta}, \tag{1.5}
\]
which relates estimators (1.3) and (1.4). It implies that:
\[
\hat{\gamma}_t = \sum_{\tau=1}^{t}\hat{\beta}_\tau, \quad t = 1, 2, ..., T. \tag{1.6}
\]
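For our example, this means that $\hat{\gamma}_1 = \hat{\beta}_1$ and $\hat{\gamma}_2 = \hat{\beta}_1 + \hat{\beta}_2$. Since we can set $\hat{\gamma}_0 = \hat{\beta}_0 = 0$, the
sequence $\exp(\hat{\gamma}_0)$, $\exp(\hat{\gamma}_1)$, $\exp(\hat{\gamma}_2)$ represents the price index over the three years.5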
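The relation between the two presentations can also be verified on the example. The sketch below, again with invented logged price changes, builds both the held-period matrix X and the matrix Z described in footnote 4 (with the year-0 column dropped), estimates both regressions, and checks that the coefficients of the second are the running sums of the first, as in (1.6). Scaling the index to 100 in year 0 is a presentation choice made here.

```python
import math
from itertools import accumulate

T = 2  # periods 1..T; year 0 is the base year

# Hypothetical (first_sale_year, resale_year, logged price change), commodities 1..6.
pairs = [(0, 1, 0.10), (0, 2, 0.30), (1, 2, 0.20),
         (0, 1, 0.06), (1, 2, 0.16), (0, 2, 0.26)]

def solve_2x2(rows, y):
    """OLS with two regressors via the normal equations and Cramer's rule."""
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * yi for r, yi in zip(rows, y))
    b2 = sum(r[1] * yi for r, yi in zip(rows, y))
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

X, Z, y = [], [], []
for first, resale, dy in pairs:
    # X: 1 for every period during which the commodity was held.
    X.append([1 if first < t <= resale else 0 for t in range(1, T + 1)])
    # Z: +1 in the resale year, -1 in the first-sale year (year 0 has no column).
    Z.append([(1 if t == resale else 0) - (1 if t == first else 0)
              for t in range(1, T + 1)])
    y.append(dy)

beta = solve_2x2(X, y)
gamma = solve_2x2(Z, y)

# (1.6): gamma_t is the running sum beta_1 + ... + beta_t.
gamma_from_beta = list(accumulate(beta))

# Price index over the three years, with gamma_0 = 0.
index = [100.0 * math.exp(g) for g in [0.0] + list(gamma)]
print(index)
```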
The hedonic estimator
We consider the vector of (logged) prices p, the elements of which are pit. For
convenience, we rank the observations for t = 0 first, then those for t = 1, etc., without taking
into account that some of the prices concern resales of the same commodity. We also
construct a matrix C of independent variables consisting of 2N rows and T+1 columns,
denoted c0, c1, ..., cT. Element cit is equal to one if a transaction on commodity i occurs in
year t, and zero otherwise. For the example at hand, the (column) vector of prices is (p10, p20,
p40, p60, p11, p31, p41, p51, p22, p32, p52, p62), while say, the first column of C contains 4 ones,
followed by 8 zeroes.
We next estimate the parameters δt of the linear model:
\[
p_{it} = \sum_{\tau=0}^{T}\delta_\tau c_{i\tau} + \varepsilon_{it}, \tag{1.7}
\]
4 This is the usual way to present the geometric repeat sales estimator, for which the matrix of independent
variables Z has the same dimensions as X; element t in row i is -1 if the first sale of commodity i occurred in
year t; it is 1 if the resale occurred in t and it is zero otherwise.
5 See Shiller (1991) for arithmetic repeat-sales estimators.
where εit is a random disturbance. The OLS estimator is:
\[
\hat{\delta} = (C'C)^{-1}C'p. \tag{1.8}
\]
It is straightforward to check that the estimator for the price in year t is simply the average of
the (log of) prices of the nt commodities sold during that year.
\[
\hat{\delta}_t = \frac{1}{n_t}\sum_i p_{it}, \quad t = 0, 1, ..., T. \tag{1.9}
\]
Obviously, this is a sound approach as long as the same mix of commodities is sold in each
year. This is where the hedonic approach will be useful since what it does is to homogenize
sales mixes over time.
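To make (1.9) concrete, the sketch below computes the year-dummy estimates as within-year averages of log prices and converts them into an index. The observations are invented, and normalizing the index to 100 in year 0 is a choice made here for readability.

```python
import math
from collections import defaultdict

# Hypothetical (year, price) observations; the values are invented.
observations = [(0, 100.0), (0, 200.0), (0, 80.0), (0, 50.0),
                (1, 110.0), (1, 150.0), (1, 84.0), (1, 300.0),
                (2, 242.0), (2, 165.0), (2, 315.0), (2, 60.5)]

def year_dummy_estimates(observations):
    """Equation (1.9): mean log price of the n_t sales observed in year t.

    With year dummies as the only regressors, C'C is diagonal with entries
    n_t, so OLS reduces to these within-year averages.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for year, price in observations:
        sums[year] += math.log(price)
        counts[year] += 1
    return {t: sums[t] / counts[t] for t in sorted(sums)}

delta = year_dummy_estimates(observations)

# Index relative to year 0: the ratio of geometric mean prices, times 100.
index = {t: 100.0 * math.exp(d - delta[0]) for t, d in delta.items()}
print(index)
```

Because nothing controls for which commodities happen to be sold in each year, this index moves with the sales mix as well as with prices.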
Consider now the set of commodities sold in a specific year t and assume that the
price of a commodity i sold in t can be considered as a function of m time-invariant
characteristics vik, k = 1, 2, ..., m (e.g. the dimensions of a painting) and of n time-varying
characteristics wijτ, τ = 0, 1, ..., t (e.g. the changing owners of a painting), j = 1, 2, ..., n. Then,
we can write that pit = f(vi1, ..., vim,wi10, ..., wi1t, wi20,...). We specialize the functional form
to:
\[
p_{it} = \sum_{k=1}^{m}\alpha_k v_{ik} + \sum_{\tau=0}^{t}\sum_{j=1}^{n}\theta_{j\tau}w_{ij\tau} + \delta_t + \varepsilon_{it}. \tag{1.10}
\]
The parameters αk and θjτ appearing in (1.10) can be interpreted as (implicit) prices of the
various characteristics describing the commodity, δt is the intercept and εit is a random error
term. These implicit prices can be obtained by a regression of the prices on observable
characteristics; once they are known, it is possible to compute, like in (1.9), the average price
$\hat{\delta}_t$ of a characteristic-free commodity in year t:
\[
\hat{\delta}_t = \frac{1}{n_t}\sum_i \Bigl(p_{it} - \sum_{k=1}^{m}\alpha_k v_{ik} - \sum_{\tau=0}^{t}\sum_{j=1}^{n}\theta_{j\tau}w_{ij\tau}\Bigr). \tag{1.11}
\]
The sequence of $\hat{\delta}_t$, t = 0, 1, ..., T would then describe the price of an (artificial)
characteristic-free commodity over time, and this can obviously be obtained by a hedonic regression pooling
the sales over time, by combining (1.7) and (1.10):
\[
p_{it} = \sum_{k=1}^{m}\alpha_k v_{ik} + \sum_{\tau=0}^{t}\sum_{j=1}^{n}\theta_{j\tau}w_{ij\tau} + \sum_{\tau=0}^{T}\delta_\tau c_{i\tau} + \varepsilon_{it}. \tag{1.12}
\]
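A stripped-down version of the pooled regression can illustrate why controlling for characteristics matters. The sketch below assumes a single time-invariant characteristic (`size`), no time-varying characteristics and no error term; the parameter values and the data are invented, and the small solver is written out only to keep the example self-contained.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """OLS through the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear(XtX, Xty)

# Invented data: log price = ALPHA * size + DELTA[year], without noise.
# Larger works are deliberately over-represented in year 1.
ALPHA = 0.5
DELTA = [1.0, 1.2, 1.5]
sales = [(0, 1.0), (0, 2.0), (0, 3.0),
         (1, 3.0), (1, 4.0), (1, 5.0),
         (2, 1.0), (2, 2.0), (2, 6.0)]  # (year, size)
log_prices = [ALPHA * size + DELTA[year] for year, size in sales]

# Regressors: the characteristic, then one dummy per year (no constant term).
X = [[size] + [1.0 if year == t else 0.0 for t in range(3)] for year, size in sales]
alpha_hat, *delta_hat = ols(X, log_prices)

# Plain year averages, as in (1.9), confound price change and mix change.
naive = [sum(p for (yr, _), p in zip(sales, log_prices) if yr == t) / 3
         for t in range(3)]
print(delta_hat, naive)
```

In this constructed case the plain yearly averages fall between years 1 and 2, only because smaller works were sold in year 2, whereas the year coefficients of the pooled regression recover the rising path of the characteristic-free price.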
The method easily allows for interactions between time and characteristics, if one
believes that the prices of some characteristics (say, the ones which represent painters) may
vary over time. For this, one merely has to introduce new variables ωkt = vkct. The regression
coefficients ζ kt picked by these variables describe the time path of the implicit price of
characteristic k. The two previous estimators can also provide such information, by computing
the parameters on subsamples, e.g. painter by painter. However, given that the number of
resales is small compared with the total number of sales, the coefficients would not be
estimated with much precision.
Obviously, there are many other ways to specify the way in which prices depend on
time. The $\sum_{\tau=0}^{T}\delta_\tau c_\tau$ formulation makes it possible to construct a price index. One can also
introduce a variable t which takes the values 0, 1, 2, ..., T and specify (1.12) with a term φt,
where φ would be an estimate of the price trend. One can also estimate different time trends
over subperiods. For example, the time dependent term can be written φ1u1t + φ2u2t with u1t
= t, u2t = t - τ for t > τ; the trend would then be φ1 between 0 and τ and φ1 + φ2 afterwards.
Combining repeat-sales and hedonic estimators
Case and Quigley (1991) try to use all the information and combine sales and resales
(of houses) in a system of equations. For sales, they use a hedonic equation similar to (1.12),
while a repeat sales equation is used for resales. The authors also distinguish resales for
which characteristics have changed from other resales.
Though the results are extremely interesting (in particular, they provide estimates with
small standard deviations), the suggestion is hard to apply to paintings, since characteristics
are mainly described by qualitative variables, while Case and Quigley deal with (a small
number of) continuous variables only. Since in our case time is represented by dummies, we
would need to introduce a very large number of such ωkt variables.
2. Real rates of return on paintings: existing evidence
Anderson (1974) computes the (nominal) rate of return for each of the 1,730
paintings sold at least twice over the period 1653-1970, using Reitlinger (1960, 1970). Next,
he runs a regression of returns on dummy variables representing subperiods6 during which
the painting was held before being resold, applying a variant of the geometric repeat sales
procedure described in Section 1. He estimates a long term rate of return of 4.9% (3.8% in
real terms). For the period 1950-1969, he also computes an average rate of return of 18% per
year for Impressionists (166 observations), and 23% for Twentieth Century paintings (49
observations).
Goetzmann (1990b) uses repeat sales estimation7 on two different databases
(Reitlinger with 1,233 resales and Mayer with 1,714 resales); the second one includes data up
to 1990. For the 1950-1987 period, he obtains a real rate of return of 10.5%,8 but his long
term rate is just around 3.3% between 1714 and 1986.
The work by Baumol (1986) is also based on Reitlinger, but introduces an extra
constraint in selecting resales separated by more than 20 years.9 The sample reduces to 650
observations. Baumol computes the real rate of return for each resale, and obtains a
distribution of returns (for which normality cannot be rejected at the 5% probability level),
leading to a mean (the geometric mean estimator) of 0.55% and a median of 0.85%. This is
much smaller than the 2.5% real rate of return on (risk-free) financial assets, such as bonds,
during the same period.10 Baumol concludes that "art prices behave randomly" and that
financial rationality alone is unable to explain why people buy and possess paintings.11
Frey and Pommerehne (1989) extend the data set used by Baumol to cover the 1961-1987
period, and draw similar conclusions on sales made during two subperiods: 1635-1949
and 1950-1987. The real rates of return on some 1,200 resales are respectively 1.4% and
1.7% (net of transaction costs estimated to amount to 0.4%); this is fairly close to Baumol's
6 Anderson works with five year periods to compensate for the lack of data.
7 His procedure is more sophisticated than the one described in Section 1.
8 These figures are consistent with Stein's (1977) findings who, using the Capital Asset Pricing approach to
value paintings, estimates the net return to be...anything between 0 and 11% for the period 1946-1968.
9 This is apparently done in order to eliminate works "bought" by their owner in case reservation prices were
not reached, or owners who "buy" at high prices in an effort to raise the market value of a painter or a work.
10 And also smaller than the rate of 3.8% found by Anderson, who uses all resales.
11 See also the paper by Buelens and Ginsburgh (1994), who show that this average is obtained from returns
that are very different over subperiods and schools.
findings. For the same periods, real rates on financial assets reach 3.3% and 2.4%, with a
standard deviation of 1.7%, implying a much lower risk than paintings (with a standard
deviation of 5%). Thus, though art has become a relatively better investment after the Second
World War, it still achieves lower rates of return than low risk paper assets.
All these findings are based on a relatively small number of resales which makes it
difficult to construct annual price indices.
3. Real rates of return on paintings: new evidence
We adopt the approach suggested in (1.12) and explain the price of a painting by its
characteristics. The first issue is related to defining characteristics. These may be the size of
the work, the year in which it was sold, the year in which it was painted, the place of sale, the
characteristics of the buyer and/or the seller, the type of painting (still-life, nude, landscape,
abstraction, etc.). Such a simple description is however insufficient to explain the price
difference between, say Picasso and Miro, and one is necessarily led to include a "measure of
the repute of individual artists" (Anderson (1974, p. 17)). To represent reputation, Anderson
(1974) uses an "estimated price" for each artist. We chose to work with dummy variables
representing artists.
We thus included in the set of characteristics a dummy for each artist12 and the size of
the work (height, width and surface). The auction house is included among the time-varying
variables and the year of sale is the time variable. We decided not to use "schools", since there
is little consensus on this matter, and several if not all artists have changed their style during
their lifetime.13 The "quality" of the buyer and/or the seller are, we think, important
characteristics: there may be differences in the willingness to pay (or to sell) of a museum, a
well-known collector, a Japanese insurance company or the Getty Foundation. Such
information is unfortunately only seldom available.14 Finally, we dropped all interaction terms
which would have cost too many degrees of freedom in our regression and would also have
made less meaningful the comparisons between estimators (Section 4).
12 It is worth noting that Grampp (1989) for instance, considers that the name of the painter is part of the
aesthetic object, no less than the painting itself. To make his point convincing, Grampp (1989, p. 131)
suggests evaluating how "a dealer would fare if he (...) did not provide information about the paintings he
offered for sale: no name, no title, no provenance, no references to works of art history or criticism, no dates.
Nothing but the price."
13 Styles can be recovered easily from the results of the regressions by a suitable renormalization.
14 See however a recent paper by Pommerehne and Feld (1995).
Our data set is based on Reitlinger (1960, 1970),15 and our interest is mainly centered
on Impressionists, Post-impressionists and their "followers," but our sample includes all
artists born after 1830 and having had auctions reported by Reitlinger between 1855 and
1970.16 This makes for the 46 painters listed in Table 1 and for some 1,900 sales.17
Obviously, Reitlinger's choice of artists is subjective, and ours is even more so, since we
exclude for instance "old masters." This is of relatively little importance here, since our main
purpose is to compare estimation methods.
Current prices were corrected for inflation using, like Baumol, the price index
constructed by Phelps-Brown and Hopkins (1956) for the years 1855 to 1954; for more
recent years, we used the US consumer price index (IMF (1972)). Prices were not corrected
for possible transaction fees charged to buyers and/or sellers by auction houses; nor did we
take into account costs of storing, restoring and insuring paintings.18
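The deflation step amounts to rescaling each nominal price by the ratio of price-index levels. As a minimal sketch, assuming invented index levels (these are not the Phelps-Brown and Hopkins or IMF series) and a hypothetical base year of 1969:

```python
import math

# Hypothetical consumer price index levels; the numbers are invented.
price_index = {1950: 50.0, 1960: 60.0, 1969: 75.0}

def to_real(nominal_price, year, base_year=1969):
    """Convert a nominal price into base-year purchasing power."""
    return nominal_price * price_index[base_year] / price_index[year]

def real_annual_log_return(p_buy, year_buy, p_sell, year_sell):
    """Annualized log return computed on deflated prices."""
    real_buy = to_real(p_buy, year_buy)
    real_sell = to_real(p_sell, year_sell)
    return (math.log(real_sell) - math.log(real_buy)) / (year_sell - year_buy)

real_1960 = to_real(1000.0, 1960)
r = real_annual_log_return(1000.0, 1950, 3000.0, 1960)
print(real_1960, r)
```

Transaction fees and holding costs, which the text notes were not taken into account, would enter as further adjustments to the purchase and sale prices.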
The table given in Appendix 1 reproduces our main results for a sample of equations
that we estimated. In all cases, painter dummies, auction house dummies and dimensions are
present. The equations differ mainly by how the temporal effects are taken into account: in
Equation 1, we split the period into 4 subperiods (see eqn. (3.1) below). In Equation 2, we use
a trend over the whole period. In Equation 3, we introduce dummy variables for years (see
eqn. (1.12)) and we restrict observations to the period 1950-1969 only, since there are few
sales per year for earlier years.
Note first that over 50% of the variance of the (log of) prices is explained by 60 to 80
variables. In many cases, the most interesting coefficients (i.e. those relative to variables
representing time or trends, dimensions and "age" of the painting) are significantly different
from zero at the 5% or even the 1% probability level.
15 In many cases, characteristics are missing in Reitlinger; for such cases, we have completed the data set
using Tout l'oeuvre peint de...
16 With the exception of Bastien-Lepage, Jacob, Mathew and William Maris, Orchardson and Walker, who
were dropped by Reitlinger in his 1970 additions and for whom he does not report sales after 1960. Note that
we did not include artists added by Reitlinger in his 1970 volume since we would have missed their sales
before 1960.
17 Reitlinger's compendium includes 2,986 sales, but we could only retrieve the dimensions for 1,972.
18 This is also the case in most studies, with the exception of Frey and Pommerehne (1989). Note that it is
easier to take these costs into account in repeat sales than in hedonic regression methods.
The regression coefficients picked by artist dummies can be used to rank each painter
according to the price of a "normalized" (i.e. ageless, standard dimension, sold in a standard
auction house) painting of his. Such a ranking is given in Table 1. It is based on Equation 3
which best fits the data. Since this model includes sales made between 1949 and 1969 only, it
ranks the artists according to their prices in the late 1950's-early 1960's. It is interesting to
look at these rankings with a 1990 eye which, given the more recent auctions, would probably
rank Picasso much higher. This was far from being true some thirty years ago, when Picasso
was "cheap" in comparison with Van Gogh, while Cezanne proved to be the most expensive
painter. The coefficients picked by auction house dummies imply that Christie's performed
rather poorly: an average sale makes some 10 to 20% less than at Sotheby's. The
dimensions of the painting significantly contribute to explaining prices; since the logarithm of
prices is a concave function of height and width, the results imply that there exists an
"optimal" size beyond which the price decreases.
We finally consider the effect of time. In Equation 1, the period 1855-1969 is split
into four sub-periods (obtained after some experimentation, by spline techniques):
1855-1914, 1915-1949, 1950-1960 and 1961-1969. For each subperiod, we include a trend among
the variables. More formally, Equation 1 is derived from model (1.12) as follows:
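\[
p_{it} = \sum_{k=1}^{m}\alpha_k v_{ik} + \sum_{\tau=0}^{t}\sum_{j=1}^{n}\theta_{j\tau}w_{ij\tau} + \varphi_1 u_{1t} + \varphi_2 u_{2t} + \varphi_3 u_{3t} + \varphi_4 u_{4t} + \varepsilon_{it}. \tag{3.1}
\]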
Table 1
Ranking of painters*
____________________________________________________
Rank  Painter          Coeff.   St.dev.   Index
____________________________________________________
 1.   Cezanne           1.718    0.177      557
 2.   Van Gogh          1.374    0.175      395
 3.   Renoir            1.222    0.160      339
 4.   Degas             1.159    0.169      319
 5.   Seurat            1.155    0.215      317
 6.   Manet             1.034    0.203      281
 7.   Monet             0.976    0.168      265
 8.   Sisley            0.911    0.173      249
 9.   Pissarro          0.893    0.167      244
10.   Gauguin           0.889    0.177      243
11.   Matisse           0.862    0.188      237
12.   Picasso           0.824    0.164      228
13.   Lautrec           0.799    0.245      222
14.   Braque            0.769    0.168      216
15.   Fantin-Latour     0.727    0.200      207
16.   Modigliani        0.686    0.181      199
17.   Bonnard           0.629    0.159      188
18.   Rouault           0.246    0.177      128
19.   Gris              0.233    0.183      126
20.   Signac            0.181    0.187      120
21.   Vuillard          0.179    0.183      120
22.   Chagall           0.129    0.164      114
23.   Soutine           0.128    0.193      114
24.   Morisot           0.116    0.198      112
25.   Klee              0.061    0.176      106
26.   Rousseau          0.000               100
27.   Utrillo          -0.029    0.174       97
28.   Derain           -0.103    0.185       90
29.   Vlaminck         -0.156    0.162       86
30.   Dufy             -0.175    0.168       84
31.   Kandinsky        -0.198    0.177       82
32.   Redon            -0.202    0.185       82
33.   Cassat           -0.251    0.182       78
34.   De Staël         -0.274    0.189       76
35.   Leger            -0.327    0.180       72
36.   Miro             -0.401    0.188       67
37.   Van Dongen       -0.482    0.166       62
38.   Munnings         -0.605    0.180       55
39.   Ernst            -0.778    0.183       46
40.   Sargent          -1.372    0.210       25
41.   Whistler         -1.649    0.233       19
42.   Tissot           -1.919    0.265       15
43.   John             -2.162    0.204       12
44.   Burne-Jones      -2.317    0.201       10
45.   Alma-Tadema      -2.794    0.234        6
46.   Lord Leighton    -3.649    0.234        3
____________________________________________________
* This ranking is based on Equation 3 reported in Appendix 1.
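The Index column of Table 1 appears to be 100 times the exponential of the coefficient, with Rousseau (coefficient 0.000, index 100) as the reference painter; this reading is inferred from the reported figures rather than stated in the text. A short check on a few rows copied from the table:

```python
import math

# (painter, coefficient, reported index) for a few rows of Table 1.
rows = [
    ("Cezanne", 1.718, 557),
    ("Van Gogh", 1.374, 395),
    ("Picasso", 0.824, 228),
    ("Rousseau", 0.000, 100),
    ("Miro", -0.401, 67),
    ("Lord Leighton", -3.649, 3),
]

def index_from_coefficient(coef, base=100.0):
    """Relative price implied by a dummy coefficient in a log-price regression."""
    return base * math.exp(coef)

for painter, coef, reported in rows:
    computed = index_from_coefficient(coef)
    print(f"{painter:14s} computed {computed:7.1f}   reported {reported}")

# Price of a "normalized" Van Gogh relative to a "normalized" Picasso.
ratio = math.exp(1.374 - 0.824)
```

Under this reading, the difference between two coefficients gives the log of the price ratio between two painters for otherwise identical works.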
Since t is the year at which the sale took place, we have u1t = t, u2t = t - 1915 if t >
1915 and 0 otherwise, u3t = t - 1949 if t > 1949 and 0 otherwise and u4t = t - 1960 if t > 1960
and 0 otherwise. Therefore, the growth rates during the four subperiods are φ1, (φ1+φ2),
(φ1+φ2+φ3) and (φ1+φ2+φ3+φ4), respectively. The results, reported in Table 2, show that the
growth rates of prices over the four subperiods are very different.
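The construction of the four trend variables and the way the slopes cumulate can be sketched as follows. The values of φ used here are not reported in this section; they are backed out as successive differences of the growth rates of Table 2, which is an assumption about how those rates relate to the coefficients.

```python
KNOTS = [1915, 1949, 1960]  # break years in the definitions of u2t, u3t, u4t

def trend_variables(t):
    """Return (u1t, u2t, u3t, u4t) for a sale in year t."""
    return (t,) + tuple(max(t - k, 0) for k in KNOTS)

def growth_rates(phi):
    """Slopes in the four subperiods: phi1, phi1+phi2, phi1+phi2+phi3, ..."""
    rates, total = [], 0.0
    for p in phi:
        total += p
        rates.append(total)
    return rates

# Growth rates of Table 2 (percent per year) and the implied slope changes.
table2 = [6.9, -3.1, 22.4, 4.3]
phi = [table2[0]] + [b - a for a, b in zip(table2, table2[1:])]

print(trend_variables(1955))
print(phi)
print(growth_rates(phi))
```

Each additional variable switches on only after its break year, so its coefficient measures the change in slope at that point rather than the slope itself.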
Table 2
Growth rates of inflation-free prices
___________________________________
Time period      Growth rate
                 (% per year)
___________________________________
1855-1914             6.9
1915-1949            -3.1
1950-1960            22.4
1961-1969             4.3

1950-1969            13.8
1855-1969             4.9
___________________________________
In Equation 2, a simple time trend βt is introduced over the whole period. The
equation shows that the average trend is equal to some 4.8% per year. In Equation 3, a
dummy is introduced for each year between 1950 and 1969. As shown in (1.12), this makes it
possible to construct a price index for the period 1950-1969. The coefficients are obtained
from the following regression:
\[
p_{it} = \sum_{k=1}^{m}\alpha_k v_{ik} + \sum_{\tau=0}^{t}\sum_{j=1}^{n}\theta_{j\tau}w_{ij\tau} + \sum_{\tau=1950}^{1969}\delta_\tau c_{i\tau} + \varepsilon_{it},
\]
where cit is, as before, a dummy variable which takes the value 1 for a sale which occurred in
year t, and 0 otherwise. The index so obtained is reproduced in Table 3 and is compared with
the index of real returns on US common stocks. The table needs little comment and shows
that during that period, paintings did much better than stocks.
Table 3
Price index for paintings
and for returns on common stocks*
1950-1969
____________________________________________________
Year   Coeff.   St.dev.   Index   Comm. stocks
                                  Index**
____________________________________________________
1949    0.000               100       100
1950   -0.138    0.229       87       110
1951    0.193    0.206      121       130
1952    0.731    0.217      207       150
1953    1.123    0.202      307       150
1954    0.862    0.177      236       229
1955    1.293    0.227      364       300
1956    1.186    0.316      327       311
1957    2.088    0.169      807       269
1958    2.179    0.169      883       381
1959    2.284    0.163      981       421
1960    2.349    0.155    1,047       416
1961    2.654    0.163    1,415       526
1962    2.781    0.155    1,613       474
1963    2.711    0.154    1,504       573
1964    2.705    0.155    1,494       661
1965    2.721    0.145    1,518       731
1966    2.567    0.147    1,303       633
1967    2.581    0.145    1,321       763
1968    2.998    0.143    2,005       814
1969    2.994    0.146    1,996       694
____________________________________________________
* The results are based on Equation 3 reported in Appendix 1.
** Calculations made on the basis of the Standard and Poors index, deflated
by the US CPI.
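The comparison in Table 3 can be summarized by average annual growth rates between the end points of the two indices. This is a simple end-point calculation on figures copied from the table; it is offered as a sketch and need not coincide with the regression-based growth rates of Table 2.

```python
import math

# Index levels copied from Table 3 (1949 = 100).
paintings = {1949: 100, 1960: 1047, 1969: 1996}
stocks = {1949: 100, 1960: 416, 1969: 694}

def avg_log_growth(index, start, end):
    """Average annual log growth rate between two index levels."""
    return math.log(index[end] / index[start]) / (end - start)

g_paint = avg_log_growth(paintings, 1949, 1969)
g_stock = avg_log_growth(stocks, 1949, 1969)
print(f"paintings: {100 * g_paint:.1f}% per year")
print(f"stocks:    {100 * g_stock:.1f}% per year")
```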
4. Comparing alternative estimators
In this section, we compare the statistical properties of the three estimators discussed
in Section 1. The first two have often been used by researchers, whose results are reported in
Section 2, while the third is the hedonic estimator. Table 4 reproduces annual returns resulting
from the three different methods. The first line shows the results obtained with a hedonic
regression run on the full sample of 1,972 observations (Equation 1 in Appendix 1). These
are compared with three calculations based on resales only. In the first, we use the geometric
mean estimator (1.2) to compute an average return over the whole period (245 resales) and
within subperiods. In the second, we use the geometric repeat-sales estimator (1.3) to compute
10-year-subperiod returns (adapted to match our four subperiods). Finally, we also run a
hedonic regression using resales only. As can be checked, the results are of the same order of
magnitude and there is little reason to believe that the quality of resales is different from the
quality of "all" sales.
Table 4
Comparison of real returns
______________________________________________________________________________________
Estimator                     1855-1969   1855-1914   1915-1949   1950-1960   1961-1969
______________________________________________________________________________________
Hedonic regression est.          4.9*        6.9        -3.1        22.4         4.3
(N = 1,972)
Geometric mean est.              5.9        14.9        -3.2        18.4         6.8
(N =)                           (245)       (31)        (19)        (10)        (42)
Geometric repeat-sales est.      5.0*        6.0        -3.7        23.8        11.3
(N = 245)
Hedonic regression est.          5.0         6.9        -2.4        13.5        12.2
(N = 295)
______________________________________________________________________________________
* Obtained by a weighted average of subperiod returns.
Bootstrapping
We now show that the estimators have quite different properties. In particular, the use
of the largest possible sample, which includes all sales, leads to estimates with much smaller
variances. To do this, we use non-parametric bootstrap methods, which make it possible to
describe the distributional properties of the estimates and not their means and standard
deviations only.
We have computed 3,000 bootstrap replications for each of the three methods. In the
regression and the GRS replications, we sample from the residuals of the computed
regressions; in the geometric means replications, the drawings are taken from the observed
individual rates of return.19
To make the comparisons as simple as possible, we estimate a unique parameter
assumed to represent time, instead of introducing years (as in Table 3) or subperiods (as in
Table 4). The results, reproduced in Table 5, are all very close to the "true values" of the
parameters obtained on the basis of our data (see column 1 of Table 4): all methods lead to
unbiased estimates, but the standard deviations resulting from the hedonic regression are 4 to
8 times smaller.
19 See Appendix 2 for details.
The results are illustrated in Figures 1 and 2, where we compare the distributions of
the bootstrap replications. Figure 1 clearly shows that the distribution of returns obtained with
the hedonic regression model is much more concentrated than with geometric means and
repeat sales regression methods.
To verify that these favorable results are not due to differences in sample sizes only
(245 for resales methods and 1,972 when all sales are used), we computed 3,000 bootstrap
replications using 490 random drawings20 out of our observations instead of 1,972. The
result is given in the last line of Table 5 (see also Figure 2, which compares the distributions),
and again, the standard deviation of the hedonic regression is much smaller than those
derived with the two other methods. Hedonic regression thus seems significantly more
accurate than methods based on resales only. Moreover, the geometric means method does
better than repeat sales regression, but it cannot provide an index over time.
Table 5
Comparison of annual returns
obtained from bootstrapping
______________________________________________________________________________
Method                "True value"   Bootstrap   Bootstrap     Minimum   Maximum
                                     mean        stand. dev.   value     value
______________________________________________________________________________
Geometric mean            5.90         5.92        0.92          2.22      9.53
(N = 245)
GRS estimator             4.99         5.00        1.87         -2.44     12.34
(N = 245)
Hedonic regression        4.88         4.88        0.25          3.93      5.85
(N = 1,972)
______________________________________________________________________________
Hedonic regression        4.88         4.87        0.51          2.99      6.53
(N = 490)
______________________________________________________________________________
20 The choice of 490 rather than 245 is based on the fact that Baumol and Anderson work on 245 resales, i.e.
490 pieces of information on prices.
Though the means do not look too different, we wanted to check this more carefully
and test formally whether the estimated coefficients were statistically different. For this, we
use the classical statistic for comparing means:
$$\frac{|\alpha_1 - \alpha_2|}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}},$$
where α1 and α2 are two bootstrapped means, σ1 and σ2 are their standard deviations and n1
and n2 represent the number of observations on which their computations are based.21,22 The
results, given in Table 6, show that there are significant differences, which seems to imply that
resales and sales are not drawn from the same population.23
Table 6
Are the results identical?
______________________________________________
Comparison between                  Test-value
______________________________________________
Hedonic (N=1972) and Geom. mean       17.61
Hedonic (N=1972) and GRS               1.00*
Hedonic (N=490) and Geom. mean        16.63
Hedonic (N=490) and GRS                1.07*
Geom. mean and GRS                     6.90
______________________________________________
* Equality accepted at the 5% probability level.
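The test values in Table 6 can be recomputed from the bootstrap means and standard deviations of Table 5. The sketch below is only an illustration: the function name `mean_difference_stat` is ours, and the 1.96 threshold assumes a two-sided normal test at the 5% level, which the text does not state explicitly. It reproduces the published values up to rounding.

```python
import math


def mean_difference_stat(a1, s1, n1, a2, s2, n2):
    """|a1 - a2| / sqrt(s1^2/n1 + s2^2/n2), the statistic for comparing two means."""
    return abs(a1 - a2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# (bootstrap mean, bootstrap standard deviation, number of observations), from Table 5.
estimators = {
    "geom_mean": (5.92, 0.92, 245),
    "grs": (5.00, 1.87, 245),
    "hedonic_1972": (4.88, 0.25, 1972),
    "hedonic_490": (4.87, 0.51, 490),
}

# The five comparisons reported in Table 6.
pairs = [
    ("hedonic_1972", "geom_mean"),
    ("hedonic_1972", "grs"),
    ("hedonic_490", "geom_mean"),
    ("hedonic_490", "grs"),
    ("geom_mean", "grs"),
]

CRITICAL_VALUE = 1.96  # assumption: two-sided normal critical value at the 5% level

stats = {}
for first, second in pairs:
    value = mean_difference_stat(*estimators[first], *estimators[second])
    stats[(first, second)] = value
    verdict = "equality accepted" if value < CRITICAL_VALUE else "equality rejected"
    print(f"{first} vs {second}: {value:.2f} ({verdict})")
```

Only the two comparisons between the hedonic regression and the GRS estimator fall below the threshold, in line with the starred entries of the table.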
We have already stressed that hedonic regression makes it possible to compute annual
price indices, without the need to collect a large number of resales (if they can be found at all24).
Therefore, it is of some importance to verify whether the annual coefficients associated with the
time dummies in (1.12), as well as their standard deviations, are meaningful. To verify this, we
ran 3,000 bootstrap replications of Equation 3 in Appendix 1. Results appear in Table 7.
21 The test is devised for comparing means in two independent samples. This is only approximately the
case here, since the hedonic regression with N = 1,972 observations includes those resales for which
we have all the characteristics (i.e. some 60% of all resales).
22 The number of observations is not the number of replications (3,000) but the number of cases on which
each estimator is calculated (for instance, 245 in the case of the geometric mean).
23 This is confirmed by the result of a hedonic regression run with the 1,972 observations, in which we also
include a dummy variable which takes the value one for resales. This dummy is significantly different from
zero at the 1% probability level.
24 This is possible in the case of prints, since the several copies sold of the same print can be considered as
resales of the same object. See Pesando (1993).
Table 7
Comparison of annual indexes obtained
from the regression equation and from bootstrapping
_____________________________________________________________________________________
                  Bootstrap                            Regression
Year     Mean  St.dev.   95% lb   95% ub   Coeff.  St.dev.   95% lb   95% ub  ∆ range
_____________________________________________________________________________________
1949
1950  -0.1362   0.2249  -0.5744   0.3245  -0.1380   0.2293  -0.5875   0.3114   0.0000
1951   0.1918   0.2050  -0.2342   0.5853   0.1927   0.2064  -0.2119   0.5974  -0.0102
1952   0.7319   0.2100   0.3151   1.1380   0.7310   0.2168   0.3061   1.1559   0.0229
1953   1.1239   0.1981   0.7343   1.5119   1.1229   0.2023   0.7265   1.5194   0.0153
1954   0.8595   0.1741   0.5134   1.1957   0.8618   0.1775   0.5139   1.2096   0.0135
1955   1.2930   0.2166   0.8638   1.7003   1.2933   0.2268   0.8487   1.7379   0.0527
1956   1.1884   0.3158   0.5603   1.8123   1.1861   0.3156   0.5674   1.8047  -0.0147
1957   2.0914   0.1663   1.7691   2.4121   2.0886   0.1687   1.7579   2.4194   0.0184
1958   2.1793   0.1664   1.8445   2.5008   2.1793   0.1686   1.8489   2.5097   0.0046
1959   2.2874   0.1610   1.9623   2.5994   2.2839   0.1634   1.9636   2.6041   0.0034
1960   2.3467   0.1541   2.0386   2.6594   2.3485   0.1553   2.0442   2.6528  -0.0121
1961   2.6554   0.1601   2.3413   2.9655   2.6540   0.1627   2.3351   2.9728   0.0135
1962   2.7829   0.1537   2.4864   3.0838   2.7809   0.1549   2.4774   3.0844   0.0097
1963   2.7103   0.1510   2.4069   3.0161   2.7111   0.1537   2.4099   3.0123  -0.0068
1964   2.7039   0.1518   2.4025   3.0002   2.7045   0.1545   2.4017   3.0074   0.0080
1965   2.7204   0.1422   2.4393   3.0102   2.7214   0.1450   2.4371   3.0056  -0.0024
1966   2.5672   0.1435   2.2893   2.8547   2.5673   0.1470   2.2792   2.8555   0.0109
1967   2.5816   0.1441   2.2941   2.8591   2.5812   0.1450   2.2970   2.8654   0.0034
1968   2.9990   0.1406   2.7220   3.2725   2.9984   0.1427   2.7187   3.2780   0.0089
1969   2.9929   0.1431   2.7142   3.2730   2.9943   0.1453   2.7096   3.2790   0.0107
_____________________________________________________________________________________
The first four columns concern the bootstrap replications and give, respectively, the
means $\bar{\delta}_t$ (t = 1950 to 1969) of the coefficients obtained in the 3,000 replications, their
standard deviations (computed as $\sqrt{\sum_i (\delta_{it} - \bar{\delta}_t)^2 / 3{,}000}$), and the confidence intervals at the 95%
level (based on the distribution of the 3,000 replications, leaving 2.5% at each tail). The
next four columns give the regression coefficients $\hat{\delta}_t$, their standard deviations $\hat{\sigma}_t$ and the 95%
confidence interval based on the standard deviations ($\hat{\delta}_t \pm 1.96\,\hat{\sigma}_t$). The last column in the table
reports the difference between the confidence intervals obtained from the bootstrap
replications and the hedonic regression. As can be checked, the hedonic coefficients and their
standard deviations are very close to the bootstrapped means and standard deviations, so that
the differences between the confidence intervals are negligible (except for the year 1955). This
means that the standard deviations from the regression are an accurate measure of the
precision with which indices over time are estimated by a hedonic regression.
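To make the construction of the last column of Table 7 concrete, the sketch below rebuilds the regression-based interval as the coefficient plus or minus 1.96 standard deviations and compares its width with that of the bootstrap interval. It assumes, as our reading of the table rather than something the text states, that "∆ range" is the width of the regression interval minus the width of the bootstrap interval; the function names are hypothetical.

```python
# Rows copied from Table 7:
# year -> (bootstrap 95% lb, bootstrap 95% ub, regression coeff., regression st.dev.)
rows = {
    1950: (-0.5744, 0.3245, -0.1380, 0.2293),
    1955: (0.8638, 1.7003, 1.2933, 0.2268),
    1969: (2.7142, 3.2730, 2.9943, 0.1453),
}


def regression_interval(coeff, st_dev, z=1.96):
    """Normal-approximation 95% interval: coeff +/- z * st_dev."""
    return coeff - z * st_dev, coeff + z * st_dev


def range_difference(boot_lb, boot_ub, coeff, st_dev):
    """Assumed definition of the last column: regression width minus bootstrap width."""
    reg_lb, reg_ub = regression_interval(coeff, st_dev)
    return (reg_ub - reg_lb) - (boot_ub - boot_lb)


deltas = {year: range_difference(*values) for year, values in rows.items()}
for year in sorted(deltas):
    lb, ub = regression_interval(rows[year][2], rows[year][3])
    print(year, round(lb, 4), round(ub, 4), round(deltas[year], 4))
```

For these three years the sketch matches the tabulated bounds and differences to the third decimal place, and 1955 stands out as the only sizeable gap.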
5. Conclusions
In this paper, we suggest that price indices of paintings should be based on
regressions using the full set of sales, and not resales only. To support this conclusion, we
have constructed a price index for Impressionists. However, while taking the art market as an
example, we did not consider other issues which seem essential, such as its efficiency.
There are good reasons to think that the art market should be less efficient than are
financial markets: trades are infrequent, transactions are individualized, etc. However, we are
not aware of any work confirming whether this is so, except for a few comments in
Goetzmann (1990b). Indeed, some empirical results, such as Baumol's (1986) have been used
to argue that, on the contrary, the art market is quite efficient. However, the observation that
"returns" on paintings are smaller than returns on bonds or other relatively secure assets
cannot be taken as an empirical test of the efficiency of the art market.
As a matter of fact, testing market efficiency by testing whether prices follow a random
walk poses special problems when, as in the case of paintings, commodities are traded
infrequently. As noted by Goetzmann (1990b), repeat sales estimation is particularly ill-suited
for studying serial correlation of the market. More importantly perhaps, it seems that
discussions on market efficiency have overlooked the fact that most statistical tests do not
show that returns cannot be forecast but only that these are not "very" forecastable. Models
that have prices determined by fads may well imply that returns are not very forecastable.25
The hedonic methodology suggested here, applied to extended data sets, will provide a
better basis for studying the predictability of returns and the efficiency of the art market.
25 See Fama (1991) for a survey of the literature.
Appendix 1
Regression results
__________________________________________________________________________________________
                                 Equation 1           Equation 2           Equation 3
                              Coeff.   St. dev.    Coeff.   St. dev.    Coeff.   St. dev.
__________________________________________________________________________________________
Time periods
  1855-1914                    0.062    0.006
  1915-1949                   -0.082    0.010
  1950-1960                    0.246    0.013
  1961-1969                   -0.184    0.016
Trend                                               0.048    0.002
Individual years *               -                    -                 included
Dimensions (inches)
  Height                       0.036    0.006       0.024    0.005       0.036    0.005
  Height squared (x1,000)     -0.246    0.074      -0.049    0.054      -0.259    0.060
  Width                        0.016    0.005       0.014    0.004       0.017    0.004
  Width squared (x1,000)      -0.051    0.072      -0.091    0.046      -0.090    0.058
  Surface (x1,000)            -0.066    0.124       0.013    0.117      -0.036    0.100
Auction houses
  Christie's                  -0.406    0.089      -0.298    0.094      -0.258    0.077
  Sotheby's                   -0.194    0.078      -0.015    0.075      -0.104    0.068
  Paris                       -0.247    0.083      -0.305    0.085       0.042    0.078
  New-York                    -0.156    0.080      -0.110    0.079      -0.045    0.071
  All other**                  0.000                0.000                0.000
Artists *                     included             included             included
Goodness of fit
  R2                           0.640                0.547                0.757
  Corr. R2                     0.630                0.534                0.747
  F-value                      58.55                42.02                70.72
Sample size ***                1,972                1,972                1,751
Degrees of freedom             1,913                1,916                1,676
__________________________________________________________________________________________
* We do not report all the results; see however Tables 1 and 3.
** "All other" (299 observations) also includes "auction house not available" (159 observations).
*** In Equation 3, only observations belonging to the period 1950-1969 are included.
Appendix 2. The bootstrap method
The idea is to approximate, with minimal assumptions, the unknown distribution F of
a function of the observations $z_i$ (i = 1, 2, ..., n) or of the residuals $\hat{\varepsilon}_i$ (i = 1, 2, ..., n)
of a regression.
Geometric (or arithmetic) average
The steps are the following:
(i) construct $\hat{F}$, the sample probability distribution, by putting mass 1/n on each observed data
point $z_i$, i = 1, 2, ..., n;
(ii) draw a bootstrap sample $z_1^*, z_2^*, ..., z_n^*$ by randomly sampling n draws with replacement
from $\hat{F}$; compute the value of the statistic $\theta^*$ (in our case, the real rate of return of paintings
between 1855 and 1969);
(iii) repeat step (ii) a large number of times (say, M) and obtain M independent bootstrap
replications $\theta^*_1, \theta^*_2, ..., \theta^*_M$ and the bootstrap distribution of $\theta^* = f(z^*, \hat{F})$.
Assuming M and n to be sufficiently large, it can be proved26 that the distribution of
$\theta^*$ is a consistent estimate of the true distribution of $\theta = f(z, F)$, where F is unknown.
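Steps (i) to (iii) can be sketched as follows. The rates of return are simulated, hypothetical values and not the data used in this paper; the geometric average computed here is a generic one and may differ in detail from estimator (1.2); the function names are ours.

```python
import math
import random
import statistics


def geometric_mean_return(rates):
    """Geometric average of rates of return given as fractions (0.05 means 5%)."""
    log_sum = sum(math.log1p(r) for r in rates)
    return math.expm1(log_sum / len(rates))


def bootstrap(values, statistic, n_replications, rng):
    """Steps (ii)-(iii): resample n values with replacement, recompute the statistic."""
    n = len(values)
    # rng.choices puts probability 1/n on each observed value, i.e. samples from F-hat.
    return [statistic(rng.choices(values, k=n)) for _ in range(n_replications)]


rng = random.Random(0)
# Hypothetical annual real rates of return on 245 resales (NOT the paper's data);
# the lower bound keeps 1 + r positive so that the logarithm is defined.
rates = [max(rng.gauss(0.06, 0.15), -0.9) for _ in range(245)]

theta_hat = geometric_mean_return(rates)
replications = bootstrap(rates, geometric_mean_return, 3000, rng)
boot_mean = statistics.mean(replications)
boot_sd = statistics.stdev(replications)
print(theta_hat, boot_mean, boot_sd, min(replications), max(replications))
```

The list `replications` plays the role of the bootstrap distribution of θ*; its mean, standard deviation, minimum and maximum are the kind of summaries reported in Table 5.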
Geometric repeat sales or hedonic regression
A regression is run, the estimated coefficients and residuals of which are the vectors $\hat{\beta}$
and $\hat{\varepsilon}$, respectively. Bootstrap samples are then drawn from these residuals, assumed to
follow an unknown distribution F. The steps are:
(i) construct $\hat{F}$, the sample probability distribution, by putting mass 1/n on each estimated
residual $\hat{\varepsilon}_i$, i = 1, 2, ..., n;
(ii) draw a bootstrap sample $\hat{\varepsilon}_1^*, \hat{\varepsilon}_2^*, ..., \hat{\varepsilon}_n^*$ by randomly sampling n draws with replacement
from $\hat{F}$; compute the value of the statistic $\beta^*$ (here, a vector of regression coefficients,
obtained as $\beta^* = \hat{\beta} + (X'X)^{-1}X'\hat{\varepsilon}^*$);
(iii) repeat step (ii) a large number of times (say, M) and obtain M independent bootstrap
replications $\beta^*_1, \beta^*_2, ..., \beta^*_M$ from which one can construct the bootstrap distribution
$\beta^* = f(\hat{\varepsilon}^*, \hat{F})$.
Assuming again M and n to be sufficiently large, it can be proved27 that the
distribution of $\beta^*$ is a consistent estimate of the true distribution of $\beta = f(\hat{\varepsilon}, F)$, where F is
unknown.
26 See Bickel and Freedman (1981).
27 See Freedman (1981).
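The residual bootstrap for a regression can be sketched in the same way. To stay self-contained, the example uses a single time-trend regressor with an intercept, in the spirit of the unique time parameter estimated for Table 5, and simulated, hypothetical log prices rather than the paper's data. For this simple case, the slope component of $\hat{\beta} + (X'X)^{-1}X'\hat{\varepsilon}^*$ reduces to the slope plus $\sum_i (x_i - \bar{x})\varepsilon_i^* / \sum_i (x_i - \bar{x})^2$. All function names are ours.

```python
import random
import statistics


def ols_fit(x, y):
    """Least squares for y = a + b*x; returns (a, b, residuals)."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    residuals = [yi - a - b * xi for xi, yi in zip(x, y)]
    return a, b, residuals


def residual_bootstrap_slopes(x, b_hat, residuals, n_replications, rng):
    """Slope component of beta* = beta_hat + (X'X)^{-1} X' eps* for each replication."""
    x_bar = statistics.mean(x)
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)
    n = len(x)
    slopes = []
    for _ in range(n_replications):
        eps_star = rng.choices(residuals, k=n)  # n draws with replacement from F-hat
        slopes.append(b_hat + sum(d * e for d, e in zip(dev, eps_star)) / sxx)
    return slopes


rng = random.Random(1)
# Hypothetical sample: 300 sales dated 1855-1969, log price growing by 0.048 per year.
x = [rng.randint(1855, 1969) - 1855 for _ in range(300)]
y = [2.0 + 0.048 * xi + rng.gauss(0.0, 1.0) for xi in x]

a_hat, b_hat, residuals = ols_fit(x, y)
slopes = residual_bootstrap_slopes(x, b_hat, residuals, 3000, rng)
print(b_hat, statistics.mean(slopes), statistics.stdev(slopes))
```

With several regressors the same logic applies, but the scalar update has to be replaced by the full matrix expression, which requires a linear algebra routine.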
References
Anderson, R. C. (1974), Paintings as an investment, Economic Inquiry 12, 13-25.
Bailey, M., J. F. Muth and H. O. Nourse (1963), A regression method for real estate price
index construction, Journal of the American Statistical Association 58, 933-942.
Baumol, W. J. (1986), Unnatural value: or art investment as floating crap game, American
Economic Review, Papers and Proceedings 76, 10-14.
Bickel, P. J. and D. A. Freedman (1981), Some asymptotic theory of the bootstrap, Annals of
Statistics 9, 1196-1217.
Buelens, N. and V. Ginsburgh (1993), Revisiting Baumol's 'Unnatural value or art as a floating
crap game', European Economic Review 37, 1351-1371.
Case, K. E. (1986), The market for single family homes in the Boston area, New England
Economic Review, May-June, 38-48.
Case, B and J. M. Quigley (1991), The dynamics of real estate prices, Review of Economics
and Statistics 73, 50-58.
Case, K. E. and R. J. Shiller (1979), The efficiency of the market for single family homes,
American Economic Review 79, 125-137.
Case, K. E. and R. J. Shiller (1987), Prices of single family homes since 1970: new indices
for four cities, New England Economic Review, 44-56.
Court, A. T. (1939), Hedonic price indexes with automotive examples, in The Dynamics of
Automobile Demand, New York: The General Motors Corporation, 99-117.
de la Barre, M., S. Docclo and V. Ginsburgh (1994), Returns of Impressionist, Modern and
Contemporary European painters, 1962-1991, Annales d'Economie et de Statistique 35
(1994), 143-181.
Fama, E. F. (1991), Efficient capital markets: II, Journal of Finance 46, 1575-1617.
Freedman, D. (1981), Bootstrapping regression models, Annals of Statistics 9, 1218-1228.
Frey, B. S. and W. W. Pommerehne (1989), Muses and Markets, London: Basil Blackwell.
Goetzmann, W. N. (1990a), Estimating price trends for residential property: a comparison of
repeat sales and assessed value methods, Working Paper, University of Connecticut.
Goetzmann, W. N. (1990b), Accounting for taste: an analysis of art returns over three
centuries, First Boston Working Paper Series FB-90-11, November.
Goetzmann, W.N. (1993), Accounting for taste: art and financial markets over three centuries,
American Economic Review 83, 1370-1376.
Grampp, W. D. (1989), Pricing the Priceless, Art Artists and Economics, New-York: Basic
Books Inc.
Griliches, Z. (1971), ed., Price Indexes and Quality Change: Studies in New Methods of
Measurement, Cambridge, Mass.: Harvard University Press.
Guerzoni, G. (1994), Testing Reitlinger's sample reliability, paper presented at the 8th
Conference on Cultural Economics, Witten, August.
Holub, H.W., M. Hutter and G. Tappeiner (1993), Light and shadow in art price competition,
Journal of Cultural Economics 17, 49-63.
IMF (1972), Supplement to International Financial Statistics, Washington, DC.
Mark, J. H. and M. A. Goldberg (1984), Alternative housing price indices: an evaluation,
AREUEA Journal 12, 31-49.
Palmquist, R. B. (1980), Alternative techniques for developing real estate price indices,
Review of Economics and Statistics 62, 442-448.
Palmquist, R. B. (1982), Measuring environmental effects on property values without hedonic
regressions, Journal of Urban Economics 11, 333-347.
Pesando, J.E. (1993), Art as an investment. The market for modern prints, American
Economic Review 83, 1075-1089.
Phelps-Brown, H. and S. V. Hopkins (1956), Seven centuries of the prices of consumables,
Economica 23, 313-314.
Pommerehne, W.W. and L. Feld (1995), The impact of museum purchase on the auction
price of paintings, manuscript, University of Saarland.
Reitlinger, G. (1960), The Economics of Taste: The Rise and Fall of the Picture Prices
1760-1960, London: Barrie and Rockliff.
Reitlinger, G. (1970), The Economics of Taste: The Art Market in the 60s, London: Barrie
and Jenkins Ltd.
Ridker, R. G. and T. A. Henning (1967), The determinants of residential property values with
special reference to air pollution, Review of Economics and Statistics 44, 246-257.
Rosen, S. (1974), Hedonic prices and implicit markets: product differentiation in pure
competition, Journal of Political Economy 82, 34-55.
Shiller, R. J. (1991), Arithmetic repeat sales price estimators, Journal of Housing Economics
1, 110-126.
Stein, J. P. (1977), The monetary appreciation of paintings, Journal of Political Economy 85,
1021-1035.
Tout l'oeuvre peint de..., Paris: Flammarion.