$p_u$. The rationale behind this rule is to
determine whether profitable rules can be developed whereby the away team is a strong
favourite (in which case the rule would default to betting on the away win if
$p_{a,i} > p_l$), or in
opposing a strong home team (in which case the rule would default to betting on the away
win if $p_{a,i} < p_u$) or whether it is better to focus betting on the away win when there is no
seemingly clear favourite.
Our second approach is to use ordinary least squares regression with bookmakers'
odds as the dependent variable. Residuals ($e_i$) under the model may be used to assess the
relative extent of disagreement between the bookmakers' odds for the away win and the
predicted bookmakers' odds under the model. Following the reasoning given above, we
consider a betting rule of the form: bet on the away win for match $i$ if, and only if,
$e_l < e_i < e_u$.
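The residual-based rule can be illustrated with a minimal sketch. It assumes a single predictor rather than the full model, and the data, the thresholds and the function names are hypothetical and chosen only to show the mechanics.

```python
from statistics import mean

def ols_residuals(x, y):
    """Fit y = a + b*x by least squares and return the residuals y - fitted."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def rule_fires(residual, e_lower, e_upper):
    """Band rule: bet on the away win only if e_lower < e_i < e_upper."""
    return e_lower < residual < e_upper

# Hypothetical data: home-minus-away rating difference and away-win odds.
rating_diff = [-100.0, -50.0, 0.0, 50.0, 100.0]
away_odds = [1.8, 2.9, 3.0, 4.1, 4.7]

residuals = ols_residuals(rating_diff, away_odds)
# Hypothetical thresholds e_l = -0.2 and e_u = 0.2.
selected = [i for i, e in enumerate(residuals) if rule_fires(e, -0.2, 0.2)]
```

Here `selected` holds the indices of the matches on which the rule would fire. In practice the thresholds would be chosen from the within-sample data.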
2. Sample data
Data were recorded on the 194 league football games that took place between 2nd October
2007 and 22nd October 2007 in the top four English football leagues
and the top four Scottish football leagues. The outcome of each game was recorded (home
win, draw, away win) along with fixed odds for each outcome offered by Ladbrokes plc, the
UK's largest bookmaker.
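As an illustration of how fixed odds for the three outcomes may be turned into bookmaker probabilities, the sketch below uses one common convention: inverse decimal odds, normalised to sum to one. This convention is an assumption and is not confirmed by this section. The odds shown are hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Normalise inverse decimal odds so the outcome probabilities sum to one."""
    raw = [1.0 / o for o in decimal_odds]
    overround = sum(raw)  # exceeds 1 when the book includes a margin
    return [r / overround for r in raw], overround

# Hypothetical decimal odds for (home win, draw, away win).
probs, overround = implied_probabilities([2.0, 3.2, 3.6])
```

The last element of `probs` is the bookmaker-derived probability of the away win under this assumed convention.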
Fixed odds are set with commercial and financial gains in mind and may not
necessarily reflect the best assessment of match outcomes since they may be set with
anticipated betting volumes in mind or indeed set to influence betting volumes. For these
reasons we consider as predictor variables the home and away team performance ratings
published weekly by the Racing and Football Outlook (RFO) which is a weekly newspaper
published by Trinity Mirror plc, dedicated to betting on horseracing and association football.
The RFO index is an index based on the results of the past 60,000 games and provides a form
rating on a scale of 0 to 1000 for each team in the English and Scottish football leagues.
Increasing ratings are intended to reflect increasing ability of a team and the difference in
RFO ratings between two teams is intended to reflect the degree of mismatch
between them. The RFO produces a separate index for home and away
performance to account for the home advantage effect and for club-specific variation in that
effect (the home effect cannot be assumed to have the same influence for all
teams). We therefore consider the RFO home rating for the home team and the RFO away
rating for the away team as predictor variables for match outcomes.
Our second approach requires a good predictor of betting odds which utilises
information that might not be used by bookmakers in deriving odds. For this reason, for each
team in each game, we consider the average proportion of time that the team was winning,
irrespective of goal margin, in their previous three league games as a predictor variable. This
choice of predictor is partly informed by the ready availability of the data and partly informed
by the idea that the margin of victory is not of primary importance but that the percentage of
time winning in previous games will still provide an indication of relative dominance in
recent games against teams from the same league. We therefore consider the average proportion
of time winning in the previous three games and the RFO ratings as predictor variables of estimated
bookmaker probabilities.
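The time-winning predictor can be sketched as follows. The sketch assumes a nominal 90-minute match and ignores stoppage time, since the text does not specify how match length is treated. The function name and the input values are hypothetical.

```python
def average_time_winning(minutes_ahead, match_length=90.0, window=3):
    """Average proportion of match time spent in the lead over the last `window` games."""
    recent = minutes_ahead[-window:]
    if len(recent) < window:
        raise ValueError("not enough previous games")
    return sum(m / match_length for m in recent) / window

# Hypothetical minutes spent in the lead in a team's last four league games.
time_winning = average_time_winning([0.0, 45.0, 30.0, 60.0])
```

Only the three most recent games enter the average, so the first value in the list is ignored.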
A second data set comprising all of those matches held in the English and Scottish
divisions (63 games) between 15th January 2008 and 21st January 2008 was used to assess
independently the out-of-sample usefulness of the derived betting rules.
3. Derived betting rules
Table 1 summarises the discrete choice complementary log-log model for predicting the
probability of an away win. Overall the model is statistically significant (log-likelihood
chi-square = 17.50, df = 2, p < 0.001), the individual predictors are statistically significant (p <
0.001) and the direction of effects for the RFO ratings for the home team at home (RFO HH)
and the RFO ratings for the away team playing away (RFO AA) make good conceptual sense.
The model adequately captures the structure in the data (the percentage of concordant pairs between
model predictions and outcomes is 66.6%) and goodness-of-fit tests using Pearson residuals
(p = 0.246) and deviance residuals (p = 0.106) do not cast doubt on the appropriateness of the
model specification. Inspection of delta beta and delta deviance graphics indicates that the model
does not suffer from the presence of overly influential observations. Prior to fitting this
model we did consider a simple logistic specification; however, application of Brown's test
indicated that a model with a non-symmetric link function would be more appropriate.
{Table 1 about here}
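To show how the Table 1 coefficients translate into a predicted probability, the sketch below applies the inverse complementary log-log link, p = 1 - exp(-exp(eta)), where eta is the linear predictor. The coefficients are those reported in Table 1. The ratings passed in are hypothetical and the function name is illustrative.

```python
import math

# Coefficients as reported in Table 1 (complementary log-log model).
B0, B_HH, B_AA = -1.1897, -0.0157, 0.0165

def away_win_probability(rfo_hh, rfo_aa):
    """P(away win) under the cloglog link: p = 1 - exp(-exp(eta))."""
    eta = B0 + B_HH * rfo_hh + B_AA * rfo_aa
    return 1.0 - math.exp(-math.exp(eta))

# Hypothetical ratings, chosen only for illustration.
p_even = away_win_probability(900.0, 900.0)
p_strong_away = away_win_probability(850.0, 950.0)
```

Raising the away rating or lowering the home rating increases the predicted probability, in line with the signs of the coefficients.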
Figure 1 is a plot of within sample profit against possible choices for $r_l$ for the betting
rule bet on the away win in match $i$ if and only if $r_{a,i} > r_l$, with $r_{a,i}$ estimated for match $i$ in
the data set using the complementary log-log regression equation and with a one pound bet
wagered each time the rule is fired. In this way the optimal value for $r_l$ was found to be
$r_l^* = 1.596$. In a similar way the value for the upper bound $r_u^*$ was determined to be 7.597, which
is the largest observed ratio in the data set. For the within sample data the rule bet on the
away win in match $i$ if, and only if, $1.596 < r_{a,i} < 7.597$ effectively defaults to bet on the
away win if $r_{a,i} > 1.596$ and fired 29 times yielding an absolute profit of £19.43 giving a
67% profit on monies staked. When applied to the test data, the rule fired on 13 occasions
giving an essentially break-even return of £0.80.
{Figure 1 about here}
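The threshold search behind Figure 1 can be sketched as a simple grid search over candidate lower bounds. The sketch assumes decimal odds, unit stakes and strict inequalities. The data are hypothetical, and the function names are illustrative rather than taken from the analysis.

```python
from typing import Sequence, Tuple

def rule_profit(ratios: Sequence[float], away_odds: Sequence[float],
                away_won: Sequence[bool], r_lower: float,
                r_upper: float = float("inf")) -> Tuple[int, float]:
    """Return (bets placed, profit) for unit stakes when r_lower < r < r_upper."""
    bets, profit = 0, 0.0
    for r, odds, won in zip(ratios, away_odds, away_won):
        if r_lower < r < r_upper:
            bets += 1
            # Decimal odds assumed: a winning unit stake gains odds - 1.
            profit += (odds - 1.0) if won else -1.0
    return bets, profit

def best_lower_threshold(ratios, away_odds, away_won):
    """Grid search over zero and the observed ratios for the most profitable lower bound."""
    candidates = [0.0] + sorted(set(ratios))
    return max(candidates,
               key=lambda c: rule_profit(ratios, away_odds, away_won, c)[1])

# Hypothetical ratios of model probability to bookmaker probability.
ratios = [0.8, 1.2, 1.7, 2.1]
odds = [2.0, 3.0, 4.0, 5.0]
won = [False, False, True, False]

r_star = best_lower_threshold(ratios, odds, won)
bets, profit = rule_profit(ratios, odds, won, r_star)
```

A threshold chosen this way is tuned to the sample, which is why the rule is then assessed on the separate test data.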
Applying the same procedure but using the predicted probabilities for an away win
from the complementary log-log model gives the betting rule bet on match $i$ to be an away
win if $0.4470 < p_{a,i} < 0.7146$. This rule fired on 22 occasions and with £1 staked on each
game an overall profit of £16.34 was obtained (i.e. a 74% return). When applied to the test
data the rule fired on five occasions giving an overall percentage profit of 62.8%.
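As a check on the within-sample percentages quoted for the two rules above, the return on unit stakes is the profit divided by the number of bets placed:

```latex
\[
\text{return} = \frac{\text{profit}}{\text{total staked}}, \qquad
\frac{19.43}{29} \approx 0.670, \qquad
\frac{16.34}{22} \approx 0.743 .
\]
```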
Table 2 summarises the fitted ordinary least squares model with the estimated
bookmaker odds for the away win as the dependent variable. The overall model is
statistically significant ($R^2$ = 41.9%, F(4, 195) = 34.04, MSE = 1.186, p < 0.001), each
predictor provides a unique statistically significant contribution to the model and the direction
of the effects in the model makes good conceptual sense. The model does not suffer from
problems associated with multicollinearity (all variance inflation factors are less than 4). A
visual examination of the residuals under the model suggests that the assumption of
independence of errors has not been grossly violated although there is some evidence of a
small departure from normality (the Kolmogorov-Smirnov test for normality has a
p-value of 0.01). Adopting the same procedure as earlier, but using the residuals, gives a
betting rule of the form: bet on the away win in match $i$ if and only if $0.1489 < e_i < 0.2042$,
where $e_i$ is the residual for match $i$. Application of this rule to the sample data gives rise to
placing 28 bets yielding an overall profit of £9.75 (34.8% profit). Applying the rule to the
out-of-sample data gives a percentage profit of 19.8%.
{Table 2 about here}
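The fitted equation in Table 2 can be applied to a single match as sketched below. The coefficients are those reported in Table 2. The residual is taken as observed minus predicted, and the time-winning variables are assumed to be on a percentage scale, which the table does not state. All input values are hypothetical.

```python
# Coefficients as reported in Table 2 (OLS model for away-win odds).
COEF = {
    "const": 1.057,
    "rfo_hh": 0.0202,
    "rfo_aa": -0.0194,
    "time_home": 0.0164,
    "time_away": -0.0129,
}

def predicted_away_odds(rfo_hh, rfo_aa, time_home, time_away):
    """Linear prediction of the bookmaker odds for the away win."""
    return (COEF["const"]
            + COEF["rfo_hh"] * rfo_hh
            + COEF["rfo_aa"] * rfo_aa
            + COEF["time_home"] * time_home
            + COEF["time_away"] * time_away)

def residual(observed_odds, rfo_hh, rfo_aa, time_home, time_away):
    """Residual e_i = observed odds minus model-predicted odds."""
    return observed_odds - predicted_away_odds(rfo_hh, rfo_aa, time_home, time_away)

# Hypothetical match: equal ratings, each team ahead 30% of the time recently.
pred = predicted_away_odds(900.0, 900.0, 30.0, 30.0)
e_i = residual(2.0, 900.0, 900.0, 30.0, 30.0)
```

The value `e_i` is the quantity that the betting rule compares against its lower and upper bounds.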
4. Discussion and conclusions
The preceding analyses indicate that a profitable betting strategy based on gambling on the
away win may be possible. The results of the discrete choice model indicate that profitability
may be obtained by avoiding those matches where there is a large estimated probability of an
away win or a small estimated probability of an away win. This finding is consistent with
previous research cautioning against a betting strategy based on a long-shot or on a clear
favourite. Instead the derived rule suggests that it may be profitable to wager on the away
win outcome on those seemingly difficult to call matches. This may be a reasonable finding
if the extent of the home effect advantage has been incorrectly estimated by the bookmaker.
The results from the value bet approach which considers the ratio of model estimated
probability of the away win to the derived bookmaker probability of an away win as a betting
trigger seem to be less spectacular.
Distinct from other approaches, we considered the direct modelling of bookmaker
odds using OLS regression. This analysis supported our prior reasoned hypothesis that
average time winning in previous games is associated with the odds on offer. Also distinct from
other approaches, we considered the residuals under the regression as quantifying the extent
of mismatch between the bookmaker odds for the away win and the model predicted odds.
The derived betting rule from this approach suggests avoiding betting on the away win when
there is a large discrepancy between predicted values and bookmaker values, and this finding
is quite contrary to the usual stance of betting on so-called value matches.
The results presented relate to league football only and, due to the small sample size,
should be treated with caution. However, we have only fitted prior reasoned models and have
not undertaken a data dredging exercise which otherwise may have led to too many false
findings. In deriving and assessing the betting rules we have simply placed a one-unit stake
per game. In practice it might be favourable to vary the stake in some optimal way (e.g.
betting stakes in proportion to perceived risk) and on this basis the percentage returns quoted
might be optimistically considered as an understatement. Likewise in practice a bettor will
be in a position to shop around the different bookmakers for best prices for the away win and
doing so would give a non-trivial positive impact on the percentage returns. We
chose to consider average winning times in the past three games as a predictor variable although
there may be further merit in extending this predictor variable over a different number of
previous games.
A similar strategy could be considered for betting on the home win; however, if
betting rules for both the home win and away win are to be considered then some additional
thought would have to be given to the possibility, or prevention, of both rules firing on the
same game.
References
Archontakis, F. & Osborne, E. (2007). Playing it safe? A Fibonacci strategy for soccer
betting. Journal of Sports Economics, 8(3), 295-308.
Avery, C. & Chevalier, J. (1999). Identifying investor sentiment from price paths: The case
of football betting. Journal of Business, 72(4), 493-521.
Bird, R. & McCrae, M. (1987). Tests of the efficiency of racetrack betting using bookmaker
odds. Management Science, 33, 1552-1562.
Clarke, S. R. & Norman, J. M. (1995). Home ground advantage of individual clubs in English
soccer. The Statistician, 44, 509-521.
Department of Culture, Media & Sport (2007). Taking Part: The National Survey of Culture,
Leisure and Sport, Chapter 9, Gambling. Available from
http://www.culture.gov.uk/images/research/TPMay2007_9_Gambling.pdf
Dixon, M. J. & Coles, S. G. (1997). Modelling association football scores and inefficiencies
in the football betting market. Applied Statistics, 46(2), 265-280.
Figlewski, S. (1979). Subjective information and market efficiency in a betting market.
Journal of Political Economy, 87(1), 75-88.
Forrest, D. (1999). The past and the future of British football pools. Journal of Gambling
Studies, 15(2), 161-176.
Forrest, D., Goddard, J. & Simmons, R. (2005). Odds-setters as forecasters: The case of
English football. International Journal of Forecasting, 21, 551-564.
Goddard, J. A. (2005). Regression models for forecasting goals and match results in
association football. International Journal of Forecasting, 21, 331-340.
Knight, F. H. (1965). Risk, Uncertainty and Profit. New York: Harper Torchbooks.
Pankoff, L. D. (1968). Market efficiency and football betting. The Journal of Business, 41(2),
203-214.
Pope, P. F. & Peel, D. A. (1989). Prices and efficiency in a fixed-odds betting market.
Economica, 56(223), 323-341.
Reep, C., Pollard, R. & Benjamin, B. (1971). Skill and chance in ball games. Journal of the
Royal Statistical Society, Series A, 134, 623-629.
Rue, H. & Salvesen, O. (2000). Prediction and retrospective analysis of soccer matches in a
league. The Statistician, 49(3), 399-418.
Sharpe, G. (1997). Gambling on goals: A century of football betting. Edinburgh: Mainstream.
Thaler, R. H. & Ziemba, W. T. (1988). Parimutuel betting markets: Racetracks and lotteries.
Journal of Economic Perspectives, 2, 161-174.
Woodland, L. M. & Woodland, B. M. (1994). Market efficiency and the favorite-longshot
bias: The baseball betting market. Journal of Finance, 49(1), 269-279.
Table 1 Complementary log-log model for the probability of the away win

Variable    Coefficient (B)    SE(B)     Z        p
Constant    -1.1897            1.7051    -0.70    0.485
RFO HH      -0.0157            0.0038    -4.12    <0.001
RFO AA       0.0165            0.0042     3.90    <0.001
Table 2 OLS regression model with bookmaker odds of away win as the dependent
variable

Variable             Coefficient (B)    SE(B)     t        p
Constant              1.057             1.0060     1.05    0.295
RFO HH                0.0202            0.0022     9.32    <0.001
RFO AA               -0.0194            0.0025    -7.79    <0.001
Time 1+ Home Team     0.0164            0.0047     3.50    0.001
Time 1+ Away Team    -0.0129            0.0048    -2.66    0.008
[Plot: profit for unit stakes (vertical axis, -50 to 20) against the threshold r* (horizontal axis, 0 to 8), with the value 1.595 marked.]
Figure 1: Profit from rule bet on away win if r > r*