Methodology
1 Introduction.
1 For small provinces that have too few seats, we regroup them at the regional level. This is
the case for the Atlantic provinces as well as for the Prairies. Also, polls are usually provided
at the regional level for those provinces.
2 Many forecasting models include variables such as the GDP growth rate during the
election year. While we do not deny the impact of such variables on the political outcome,
it is important to note that GDP per riding at the time of the election is not an available
variable.
3 We do not cover the three territories, which each have one seat, because polls are usually
lacking for them.
2 How it works.
The core principle of the model is to translate province-level variations into
riding-level ones. The simplest model assumes that the variation is uniform.
Specifically, if a party's support increases by 2 percentage points (between the
last election and now) at the province level, then this party's support will also
increase by 2 points in every riding. This is the simplest model we can think of
and the one used by some websites (Ipsos-Reid, for instance, sometimes makes
projections during a campaign using this model, or at least it used to). A
slightly more sophisticated model assumes that the variation is proportionally
the same. In other words, if a party's support increases by 20% at the province
level (for instance, from 20% to 24%), then the variation in every riding will
be 20% as well (so if the last election's results in two ridings were 10% and
40%, the forecasts would put this party at 12% and 48% respectively). As we show
in section 3, these two simplistic models do not perform as well as ours. The
proportional change model seems to provide better overall forecasts, but only
because many of the mistakes at the riding level simply cancel each other out.
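The two benchmark models can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, shares are in percent, and no attempt is made to keep projections between 0 and 100 or to make the parties sum to 100.

```python
def uniform_swing(last_results, prov_last, prov_now):
    """Add the province-level change (in percentage points) to every riding."""
    swing = prov_now - prov_last
    return [r + swing for r in last_results]


def proportional_swing(last_results, prov_last, prov_now):
    """Scale every riding by the province-level ratio prov_now / prov_last."""
    return [r * prov_now / prov_last for r in last_results]


# Two ridings where the party got 10% and 40% last time,
# while its province-wide support moved from 20% to 24%.
ridings = [10, 40]
uniform = uniform_swing(ridings, 20, 24)
proportional = proportional_swing(ridings, 20, 24)
print(uniform)       # [14, 44]
print(proportional)  # [12.0, 48.0]
```

The same 4-point province-level gain gives a riding at 10% either 14% or 12%, depending on which assumption is made.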
The main shortcoming of these models is that they assume the variation instead
of using data to estimate it. We, on the other hand, believe that estimation
should be used whenever possible, and in the case of electoral forecasting, data
are easily accessible to anyone. We share with those models the principle that
the results of the next election depend on the results of the last election, as
well as on the province-level variation in parties' support. The main idea of
the model is that the variation for one party in a given riding is a function of
the variation of the party's support at the province level, of region-specific
effects and of the results of the last election. Specifically, our econometric
specification looks like this:
Where i indexes political parties, j indexes the ridings and t represents the
time of the election. X_{i,j,t} contains the variables, such as the
province-level variation between t-1 and t for party i, the region where the
riding is located or the results of all the parties in riding j during the last
election. We work with percentages as opposed to numbers of votes in order to
simplify the model. Indeed, a more structural model would estimate the variation
in the number of voters for a given party, as well as the turnout for every
riding. By working directly with percentages, we don't need to estimate or
assume anything about the turnout rate. This is very convenient and not that
limiting. On top of that, only percentages actually matter as far as winning a
seat is concerned.
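As a sketch only, the linear specification described above can be written in a form consistent with these definitions and with the notes to Table 1. The symbols y (vote share in percent) and ε (error term) are notational assumptions and are not taken from the original equation:

```latex
y_{i,j,t} = y_{i,j,t-1} + X_{i,j,t}\,\beta + \varepsilon_{i,j,t}
```

Under this reading, X_{i,j,t} β is the predicted change in party i's share in riding j between elections t-1 and t.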
One important characteristic of the model is that the province-level variation
is interacted with all the other variables. This means that if a party doesn't
move in the polls between the last election and now, then the model will predict
that this party's support will not change in any riding (see footnote 4). We
impose this constraint because, as already mentioned, all we observe is the
variation at the provincial level. Therefore, while it is possible (and actually
likely in real life) that a party would not change province-wide but still
experience a lot of variation at the riding level, we don't try to model those
changes. To illustrate our point, imagine the following worst-case scenario: the
support for the NDP in Ontario remains at (say) 16%, but the party's support
actually shifts within the province. For instance, we could imagine that the NDP
loses a lot of support in rural areas but gains voters in some urban places. Our
model could not predict such a shift within a province and would end up being
very wrong in its projections. But again, all we can observe accurately during a
campaign are the polls, and those are given at the province level. We don't see
this as a major threat to our identification strategy (see footnote 5), but it
is worth mentioning.
The main difference from the other models is that we use data to actually
estimate the Beta coefficients, while they simply assume them. Therefore, our
model actually nests the other ones. Indeed, if the variation at the riding
level is actually uniform, then our coefficients will reflect this. However, in
real life, the variation is far from being uniform or perfectly proportional. It
does depend on the region where the riding is located and on the past results.
For instance, it is obvious that if the Conservatives lose 3 points in Quebec,
the variation will not be the same in Beauce as in the suburbs of Montreal. The
model is estimated by Ordinary Least Squares (OLS) using the electoral results
of 2004, 2006 and 2008. We estimate this model for each province and each party.
We don't use results before 2004 because the political parties were not all the
same: the Conservatives were split between the Alliance (ex-Reform) and the
Progressive Conservatives. We could make some assumptions about how the votes of
those two parties would add up, but that would only be an approximation.
Moreover, while having more elections means more data to estimate our
coefficients, it also increases the chances of the model being outdated. In
particular, it is plausible that the political landscape in 1997 was very
different from the one in 2008. Therefore, by using data from 1997, we would run
the risk of estimating parameters that are no longer valid.
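The nesting argument can be illustrated with a small OLS fit on synthetic data. This is a minimal sketch and not the estimation code behind the model: the data are made up, the regressors (`last_share`, an `urban` dummy) are hypothetical stand-ins for the variables in X, and NumPy's least-squares solver is used for the OLS step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for one party in one province (all values are made up).
n = 200
last_share = rng.uniform(5, 60, size=n)             # share at the last election, in %
urban = rng.integers(0, 2, size=n)                  # hypothetical region dummy
prov_swing = rng.choice([-3.0, 2.0, 5.0], size=n)   # province-level change, in points

# Riding-level changes generated by a pure uniform swing plus noise.
change = prov_swing + rng.normal(0, 0.5, size=n)

# Every regressor is interacted with the province-level swing,
# so a zero swing forces a zero predicted change.
X = np.column_stack([
    prov_swing,
    prov_swing * last_share,
    prov_swing * urban,
])

# OLS estimate of the coefficients.
beta, *_ = np.linalg.lstsq(X, change, rcond=None)
print(beta)  # close to [1, 0, 0] because the true process is a uniform swing

# Prediction for a riding at 30% in an urban region when the party does not move.
no_swing_row = np.array([0.0, 0.0 * 30.0, 0.0 * 1])
pred_no_swing = float(no_swing_row @ beta)  # zero, whatever the coefficients are
```

When the data really follow a uniform swing, the estimated coefficients recover it; when they do not, the interaction terms are free to pick up the dependence on region and past results.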
We only take into account the main political parties (for instance, at the
federal level for Canada, we make projections for the Conservatives, Liberals,
NDP, Greens and Bloc Québécois) and totally ignore the other parties and
independent candidates. The reason is that those others usually represent only a
very small fraction of the votes. On top of that, their dynamics in a specific
riding are usually not linked to any changes at the province level.
4 [...] variables to be not interacted. For instance, during the 2008 federal campaign, it was
well known that the Premier of Newfoundland & Labrador was running a campaign against the
Conservatives. In this case, our specification would include a dummy variable for the ridings
in NFL in 2008, and this variable would not be interacted with anything. We use such dummies
in other cases, for instance the fact that the Liberals didn't run any candidate in the riding
of Central Nova in 2008. We do that in order to capture the one-time jump that the other
parties (in particular the Green Party in this case) could have experienced. Such an effect was
limited to this riding and this election, and was likely too small to be observed in the
variation at the province level.
5 Especially given that the time between two elections is relatively short. Thus, major changes
in a party's support within a province are unlikely to happen in such a period of time.
It should be noted that our specification above assumes additivity between the
last election's results and the variation; we refer to it as our linear model.
Another specification, more closely related to the proportional change model,
looks like the following:
Where here the explanatory variables would be interacted with the proportional
variation of the party at the provincial level. A priori, both specifications
are valid and there is no reason to choose one over the other. However, as we'll
see in the next section, the first specification performs significantly better
and is thus the one implemented and displayed on the website.
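As a sketch only, the proportional specification can be written in a form consistent with the description above and with the notes to Table 1. The symbols y (vote share in percent), ε (error term) and the tilde on X (regressors interacted with the proportional province-level variation) are notational assumptions, not taken from the original equation:

```latex
\frac{y_{i,j,t} - y_{i,j,t-1}}{y_{i,j,t-1}} = \tilde{X}_{i,j,t}\,\beta + \varepsilon_{i,j,t}
```

Here the left-hand side is the proportional change in party i's share in riding j, rather than the change in percentage points.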
It is crucial to acknowledge that our model is suitable for making projections
only as long as the percentages of votes for each party stay within a certain
range. In particular, the model is not intended to forecast what would happen if
the Bloc Québécois didn't exist. The model was estimated with the Bloc being
present and hovering around 40% of voting intentions. Any projection putting the
Bloc far from its previously observed level of support should be considered an
out-of-sample forecast, or a simple extrapolation. With all the variables
included in our specification, we can for instance have non-linear effects, and
inputting one party far off its previous results could lead to unrealistic
results. Nevertheless, for the sole purpose of predicting an actual election, we
believe our model to be robust enough to this problem. Of course, our
coefficients are better estimated when a party actually experienced important
changes over time, thus providing us with a better source of variation for
identification. For instance, the Conservatives in Quebec are an ideal case
since between 2004 and 2008, their level of support increased from less than 10%
to almost 25%. On the other hand, the NDP in Ontario is more complex since this
party barely moved at the province level.
Table 1: Comparisons between models, by election.
Notes: uniform stands for the model with a uniform swing in terms of percentage points.
proportional is the model with uniform proportional changes. 2c2c linear is our model,
where the result of the next election is a linear combination of the results of the past
election and a function XBeta. 2c2c prop. is our model where we model a proportional
variation as a function of XBeta.
Number of mistakes is at the individual level. It is the count of ridings incorrectly projected
in terms of the winner (for instance, projecting the LPC to win when in fact the NDP did).
Accuracy measures the average absolute deviation between our projections and the actual
results. For instance, projecting the CPC at 34% when the actual result was 36% would lead
to a 2-point deviation.
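The two measures defined in these notes can be sketched as follows. The function names and the data are hypothetical, and averaging the deviation over every party in every riding is an assumption about how the accuracy figure is aggregated.

```python
def riding_mistakes(projected, actual):
    """Count ridings where the projected winner differs from the actual one."""
    wrong = 0
    for proj, act in zip(projected, actual):
        if max(proj, key=proj.get) != max(act, key=act.get):
            wrong += 1
    return wrong


def mean_abs_deviation(projected, actual):
    """Average absolute gap, in points, over every party in every riding."""
    gaps = [abs(proj[p] - act[p])
            for proj, act in zip(projected, actual)
            for p in act]
    return sum(gaps) / len(gaps)


# Two made-up ridings, shares in percent.
projected = [{"CPC": 34, "LPC": 40, "NDP": 26},
             {"CPC": 30, "LPC": 36, "NDP": 34}]
actual = [{"CPC": 36, "LPC": 39, "NDP": 25},
          {"CPC": 29, "LPC": 33, "NDP": 38}]

mistakes = riding_mistakes(projected, actual)     # 1: the second riding went NDP
accuracy = mean_abs_deviation(projected, actual)  # 2.0 points on average
```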
However, DS only accounts for the regional effects (see footnote 6). In our
model, not only are we taking the regions into account, but we are also using
the past election's results. This is potentially very important. For instance,
if a party is growing in a given province, it is likely that this party's
support will change differently in a riding where this party was already at 60%
than in one where it was only at 15%. The potential for growth is completely
different. By the same logic, the other parties' results can have an influence
as well. It is possible that party i is growing at the expense of party j, and
our model will be able to capture this fact.
Another crucial point to remember is that our model (like any other model using
percentages as the main input) is completely dependent on the accuracy of the
opinion polls. Indeed, if the polls indicate that the Liberals stand at 30%, but
on election day this party only gets 25% of the votes, our projections will of
course be affected. There are two steps involved in the projection process:
predicting the right province-wide percentages, and then translating these into
riding-level ones. Our model only intends to solve the second step because we
believe that providing accurate polls is a completely different job. We also
believe that there are enough polls during a typical campaign to give us an
accurate portrait of the situation. Therefore, for the comparisons in this
section, we use the actual results of the election as our input, the objective
being to test which model performs best, conditional on having the correct
percentages. For instance, for the 2006 projections, it means we had our
coefficients, the results of the 2004 election and some polls giving us the
precise percentages of the 2006 election as input. Table 1 shows the results.
As we can see, our (linear) model is clearly the best and the only one which
consistently projects correctly at more than 90%. However, one may argue that
what people are really interested in is the overall forecast. In other words,
people want to know if the Conservatives will have 148 or 135 seats after the
election. So the mistakes should be computed with respect to the overall
results. Specifically, for the 2008 election, our linear model would predict 142
CPC, 85 LPC, 31 NDP and 49 Bloc. The actual results were, respectively, 143, 76,
36 and 49 (plus one independent; see footnote 7). Therefore, the model would
have made 13 mistakes overall, down from the 25 at the riding level. The reason
is of course that some mistakes cancel each other out (this can be especially
true with close races where the win can go one way or the other). The next table
then shows the overall mistakes of each model.
6 Thus, again, the DS is nested in our specification. However, DS corrects for the ridings
that are too far from the regional average, something that could be implemented in our model
but currently is not.
Table 2: Overall number of mistakes of each model.
Election   uniform   proportional   2c2c linear   2c2c prop.
2006       13        16             3             3
2008       15        9              13            17
Notes: Overall number of mistakes is calculated as the sum of errors for the overall results.
For instance, projecting the CPC at 145 seats when they won only 143 would count as 2
mistakes.
With this measure, the proportional change model performs really well for the
2008 election. This is very surprising given that, for this election, it was the
model with the most mistakes at the riding level (40 ridings incorrectly
projected) and the worst accuracy. But a lot of the mistakes cancelled each
other out for the country as a whole. While we acknowledge this fact, we believe
that we shouldn't select a model on the basis that it was lucky for one
election. On top of that, overall mistakes can be misleading. For instance, for
2008, our two models were the closest for the projections of the CPC, while
being less accurate for the LPC. On the other hand, the uniform and proportional
models would predict the CPC at 149 and 147 seats. All that to say that overall
mistakes are an interesting concept, but when we can measure riding-level
mistakes, we think we should do so. It gives a better portrait of the situation.
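The way riding-level mistakes can cancel out in the overall count is easy to see on a toy example. The sketch below uses a hypothetical function name and made-up winners; it follows the definition of overall mistakes as the sum of seat-count errors across parties.

```python
from collections import Counter


def overall_mistakes(projected_seats, actual_seats):
    """Sum of absolute seat-count errors across parties."""
    parties = set(projected_seats) | set(actual_seats)
    return sum(abs(projected_seats.get(p, 0) - actual_seats.get(p, 0))
               for p in parties)


# Made-up winners in four ridings: two calls are wrong, but they offset.
projected_winners = ["CPC", "LPC", "LPC", "NDP"]
actual_winners = ["LPC", "CPC", "LPC", "NDP"]

riding_level = sum(p != a for p, a in zip(projected_winners, actual_winners))
overall = overall_mistakes(Counter(projected_winners), Counter(actual_winners))
print(riding_level, overall)  # 2 0

# Projecting the CPC at 145 seats when they won 143 counts as 2.
single = overall_mistakes({"CPC": 145}, {"CPC": 143})
```

Two wrong calls at the riding level leave the seat totals untouched, which is why a model can look good on the overall measure while doing poorly riding by riding.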