How well public opinion polls predict elections has been of interest almost since polling began, but in Canada this question has been particularly important since the 1998 Quebec election, when a bias in the polls led to a systematic underestimation of voter intent for the Quebec Liberal Party. Since that election, pollsters have asked themselves what they can do to prevent this situation from recurring, and the media and academics have wondered if they can rely on pollsters’ estimates of voter intent, particularly in Quebec.

This article examines the polls conducted during the 2000 federal electoral campaign. It uses statistical techniques to try to assess the polls’ accuracy in predicting results, their bias (if any) for or against certain political parties and the likelihood that different polling methodologies will produce biased or “volatile” estimates.

How does one evaluate the accuracy of polls’ estimates of voter intent? The most common method is to compare some sort of polling estimate, such as the leading party’s share of the vote or the difference between parties, to the final vote. The obvious shortcoming of this method is that it can only assess the accuracy of surveys published in the last days of a campaign, and even then only if one is confident that no substantial movement in voter intent occurred over these last days. A more practical disadvantage is that in Canada (unlike the United States) very few surveys are published during the last week of a campaign.

A second approach is to evaluate polls throughout the campaign. To do so, this article uses a slightly modified version of the method my colleagues Sébastien Vachon, André Blais and I devised (see our “Les sondages moins rigoureux sont-ils moins fiables?” Canadian Public Policy/Analyse de politiques, 1999). This method employs time-series analysis to estimate the evolution of voter intent and forecast election-day results that can then be compared with the real vote. In essence, polling results are weighted by their respective sample size and evenly distributed over the days the polls were conducted. Daily averages of polling results are then used to construct the series and to derive estimates of the parties’ daily positions that take into account possible effects like campaign events and the evolution of voter intent with time. Individual polls conducted during the campaign can then be compared to the estimated series and each pollster’s results examined for systematic bias and volatility.
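
The weighting step described above can be sketched in a few lines of Python. This is only an illustration of the idea of spreading each poll's respondents evenly over its field days and averaging by day; the function name, the dictionary keys and the two polls are hypothetical, and the time-series modelling that follows this step in the actual method is not reproduced here.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_weighted_averages(polls):
    """Return {day: weighted mean share}, spreading each poll evenly over its field days."""
    weighted_sum = defaultdict(float)   # sum of share * respondents allotted to the day
    weight_total = defaultdict(float)   # respondents allotted to the day
    for poll in polls:
        n_days = (poll["end"] - poll["start"]).days + 1
        per_day = poll["n"] / n_days    # even split of the sample over the field days
        for offset in range(n_days):
            day = poll["start"] + timedelta(days=offset)
            weighted_sum[day] += poll["share"] * per_day
            weight_total[day] += per_day
    return {day: weighted_sum[day] / weight_total[day] for day in sorted(weight_total)}


# Hypothetical polls (made-up numbers, not taken from the 2000 campaign).
polls = [
    {"start": date(2000, 11, 1), "end": date(2000, 11, 2), "n": 1000, "share": 44.0},
    {"start": date(2000, 11, 2), "end": date(2000, 11, 3), "n": 2000, "share": 41.0},
]
averages = daily_weighted_averages(polls)
```

On the one day when both hypothetical polls were in the field, the larger poll counts for twice as much as the smaller one, so the daily average sits closer to its estimate.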

The first task at hand is therefore to determine the evolution of voter intent during the campaign in order to assess whether the portrait painted by pollsters compares accurately with the election results. In Figure 1 the solid lines show the estimated evolution of voter intent from the day the election was called (Oct. 22) to election day (Nov. 27). The dotted lines show the margin of error of the series. Stars indicate the published estimates of each poll and are placed at the mid-point of the period over which the poll was taken, since polls take more than one day to conduct. Finally, triangles indicate the final election results for each party.

A total of 22 national polls were made public between the drop of the writs and 48 hours before polling day. They had a mean of 2,000 respondents, with a minimum of 898 and a maximum of 4,102. Figure 1 shows the evolution of voter intent as estimated for the Liberal Party of Canada (upper lines), the Canadian Alliance (middle lines), and the Progressive Conservative Party (lower lines). A reasonable interpretation of the evolution of voter intent is that Liberal support decreased by over four per cent during the campaign (from 45 per cent to less than 41 per cent), Conservative support increased by four per cent and Canadian Alliance support remained stable. As it turns out, the time-series forecast overestimates voter intent for the Conservatives by about one per cent, while the predictions are within the margin of error for the other two parties.

In Quebec, 20 polls were made public during the campaign. Five were conducted only in Quebec, while 15 took their data from national polls. Only those Canadian polls that identified a Quebec stratum and published the related information were used for this analysis. Excluding a Léger poll of 3,514 respondents from the calculation of the mean, an average of 670 respondents were interviewed, with averages of 990 for the Quebec-only polls and 590 for the Quebec strata of the Canadian polls.

Figure 2 shows the estimated evolution of voter intent for the Liberal Party of Canada (LPC) in Quebec. Like the pan-Canadian polls, the Quebec polls show that voter intent for the LPC decreased from 43 per cent to 40 per cent over the course of the election campaign. However, their forecast of the final Liberal result was four per cent below the 44 per cent the Liberals actually received on election day. Figure 3 shows that voter intent for the Bloc Québécois (BQ) likely increased from 40 per cent at the beginning of the campaign to almost 45 per cent by mid-campaign, with a subsequent decrease to 41 per cent on the eve of the election. This forecast was quite close to the final result (40 per cent) and within the margin of error for the series.

Thus, the time series illustrate that in general, the polls’ estimates were fairly accurate, albeit with one important exception, the underestimate of voter intent for the LPC in Quebec.

As mentioned, another way to assess the accuracy of the surveys is to compare the election results with the polls conducted during the last week of the campaign. Doing so provides results similar to those given by the time series. The first three rows of Table 1 show that for Canada as a whole, whether using the raw estimates or the estimates computed from the daily averages weighted by sample size, voter intent was estimated quite accurately: the difference between the estimates and the vote is very close to zero. The weighted estimate for the Progressive Conservatives (PC) is the one exception to this pattern: their vote was underestimated by 1.3 per cent.

By contrast, the last three rows of Table 1 show that in Quebec, voter intent for the LPC was seriously underestimated, by fully 3.9 percentage points for the weighted estimate, while voter intent for the BQ, the Canadian Alliance and the New Democratic Party was overestimated by one per cent each. Once again, while voter intent for the other parties is estimated accurately in both Canada and Quebec, voter intent for the Liberal Party of Canada is underestimated in Quebec.

How about the predictions of individual polls or pollsters? A number of factors, including the laws of probability, can affect the accuracy of an individual estimate, so the goal here is not to criticize or praise but to see whether certain methodologies appear to produce systematically worse estimates than others. Such an examination is made easier under the new Elections Act, which requires all published polls to be accompanied by relevant methodological information.

A first test of a poll’s accuracy is to compare individual estimates with the voter intent calculated using time-series analysis. Figure 1 above shows that, for Canada as a whole, six polls depart slightly from the results derived from the estimated series. During the last three weeks of the campaign, three Zogby polls and two Environics polls produced estimates that were slightly outside the margin of error of the series. While Zogby tended to overestimate voter intent for the Alliance and the Conservatives, Environics tended to underestimate the Alliance vote and overestimate the Liberals’. In addition, one Léger poll slightly underestimated voter intent for the Conservatives. On the other hand, since the series as a whole overestimates voter intent for the Conservatives, the Léger poll was probably accurate.

Figure 2 shows that for Quebec, two Ipsos-Reid polls, the four Environics polls, and one Compas poll depart markedly from the series’ estimates of voter intent for the LPC. But again, since the series itself underestimated voter intent for the Liberals, polls that lie outside the estimated series’ upper margin of error could be more accurate. This is the case with Environics’ first three polls; but its last poll of the campaign underestimates this same voter intent much more than the other polls. Figure 3 shows the same information for the BQ and leads to similar conclusions, i.e., that the polls conducted by Ipsos-Reid, Environics and Compas generally departed from the series estimates.

Figure 4 looks at the pattern of the differences between each pollster’s results and the series estimate for Canada as a whole using “box-and-whiskers” plots for the two leading parties (the Liberals and the Alliance) for each firm that conducted more than one poll. Box-and-whiskers plots illustrate two indicators of poll accuracy: the presence of systematic bias for or against a political party (as shown by the average difference from the series estimate) and the presence of high variability (as shown by the spread of differences). As Figure 4 shows, Ekos and Environics tended to slightly overestimate voter intent for the Liberals (their average difference with the series estimates being around two per cent), Ipsos-Reid tended to underestimate it (with a mean difference of nearly minus two per cent), while Zogby tended to overestimate voter intent for the Canadian Alliance (by more than two per cent on average). It should be stressed, however, that the great majority of the estimates fall within the margin of error. Figure 4 also shows that the estimates from Compas and Environics are more variable than those of other firms, as is indicated by a larger spread between the lower and the higher points of the boxes.
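
The two indicators read off such a plot can also be computed directly. The sketch below is a minimal illustration, assuming each firm's differences from the series estimate are already available; the firm names and numbers are hypothetical. It uses the mean difference as the bias indicator, following the text, and the interquartile range (the height of the box) as the variability indicator, although a box plot itself usually marks the median.

```python
from statistics import mean, quantiles


def bias_and_spread(differences_by_firm):
    """Summarize each firm's differences from the series estimate (percentage points)."""
    summary = {}
    for firm, diffs in differences_by_firm.items():
        # Quartiles of the differences; needs at least two polls per firm.
        q1, _median, q3 = quantiles(diffs, n=4, method="inclusive")
        summary[firm] = {
            "mean_diff": mean(diffs),  # systematic bias for (+) or against (-) the party
            "iqr": q3 - q1,            # spread between lower and upper edges of the box
        }
    return summary


# Hypothetical differences (poll estimate minus series estimate), not real data.
differences = {
    "Firm A": [1.0, 2.0, 3.0],    # consistently high, tightly grouped
    "Firm B": [-4.0, 0.0, 4.0],   # no average bias, but volatile
}
summary = bias_and_spread(differences)
```

In this made-up example, Firm A would appear biased but stable, while Firm B would appear unbiased but volatile, which are the two distinct patterns the figure is meant to separate.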

Figure 5 provides the same information for Quebec, though the generally smaller sample sizes mean that the overall spread of differences between poll estimates and the series’ estimates is larger. Compas and Zogby, who did not publish figures for Quebec, and SOM, which conducted only one survey, are not included. This figure does not suggest systematic bias by individual pollsters, the mean error being generally close to zero, with the exception of Ipsos-Reid’s underestimation of voter intent for the Liberals. The figure does, however, show high variability in the estimates produced by Environics and Ipsos-Reid. The difference between Environics’ estimates for the BQ and the time series ranges from minus four per cent to plus three per cent, while Ipsos-Reid’s ranges from minus three to nearly plus four per cent.

In sum, the estimates produced by Environics and Compas for Canada as a whole — and, to an even greater extent, those produced by Environics and Ipsos-Reid for Quebec — tend to suffer from high volatility and depart from the series’ estimates. In addition, Zogby’s estimates, though not volatile, tend to overestimate voter intent for the Canadian Alliance.

Since the margin of error shrinks as sample size grows (it is inversely proportional to the square root of the sample size), it’s tempting to conclude that the error of the estimation reflects only sample size and nothing else. This is not the case: there is a relationship between error and sample size, but here it does not explain all the discrepancies found. For instance, at the Canadian level, the relationship for the Liberals, though not significant, runs opposite to the expected direction: the larger the sample size, the larger the error (r between error and square root of sample size = .27). For the Alliance, it is in the expected direction (r = -.42). At the Quebec level, the relationship is in the expected direction for both parties (r = -.52 for the Liberals and -.32 for the BQ) but it is significant only for the Liberals. In short, though some of the variation across polls can be explained by sample size, most of it cannot be.
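
To make the sample-size argument concrete, the sketch below shows the standard margin-of-error formula for a simple random sample and a plain Pearson correlation between error and the square root of sample size. It is an illustration only: simple random sampling and a 95 per cent confidence level are assumptions, the function names are hypothetical, and the four polls are invented rather than drawn from the campaign.

```python
import math


def margin_of_error(p, n, z=1.96):
    """95% margin of error, in percentage points, for a share p (0 to 1) and n respondents."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Quadrupling the sample only halves the margin of error.
moe_1000 = margin_of_error(0.5, 1000)
moe_4000 = margin_of_error(0.5, 4000)

# Hypothetical polls: sample sizes and absolute errors in percentage points.
sample_sizes = [900, 1600, 2500, 3600]
abs_errors = [3.0, 2.5, 1.5, 1.0]
r = pearson_r([math.sqrt(n) for n in sample_sizes], abs_errors)
```

With these invented figures the correlation is strongly negative, the "expected direction" in which larger samples go with smaller errors. A coefficient near zero, or a positive one, would signal that something other than sample size is driving the discrepancies.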

Unless a last-minute shift in voter intent is suspected to have taken place, polls published during the last week of a campaign may also be helpful in trying to estimate bias. Three criteria are used to assess the accuracy of the polls published during the last week: how well they did in predicting both the leading party’s share of the total vote and the spread between the two leading parties, and how well they compare with the estimated series during the last week.
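
The first two criteria amount to simple subtractions, sketched below. The function name and the poll are hypothetical; the result figures are the rounded Quebec shares quoted earlier in this article (44 per cent for the LPC and 40 per cent for the BQ). The third criterion would use the same subtraction against the series estimate instead of the vote.

```python
def final_poll_errors(poll, result, leader, runner_up):
    """Return (error on the leader's share, error on the leader-minus-runner-up spread)."""
    share_error = poll[leader] - result[leader]
    poll_spread = poll[leader] - poll[runner_up]
    result_spread = result[leader] - result[runner_up]
    return share_error, poll_spread - result_spread


# Rounded Quebec results as quoted in the text; the poll itself is invented.
result = {"LPC": 44.0, "BQ": 40.0}
hypothetical_poll = {"LPC": 40.0, "BQ": 41.0}

share_error, spread_error = final_poll_errors(hypothetical_poll, result, "LPC", "BQ")
```

A negative share error means the leading party was underestimated. The spread error can be larger than the share error when the poll also overestimates the runner-up, as in this invented case.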

For Canada as a whole, every firm except Environics falls well within the margin of error whichever method is used. For example, the discrepancy between the LPC’s share of the vote and the various polls’ estimates varies between plus and minus two per cent (except for Environics, which missed by four per cent). In sum, with that one exception, the polls performed well.

The situation is different in Quebec, however. As Table 2 shows, all the final-week polls underestimated voter intent for the LPC by between 1.2 and 8.2 per cent. Léger and Ekos were within the margin of error but the three others were significantly outside it. The difference between the two leading parties was also poorly estimated. Léger was best. It was the only firm that forecast the LPC ahead of the BQ, while Ekos and Ipsos-Reid placed the two parties as equal, but all three estimates were within the margin of error. The differences between the polls’ estimates and the series’ estimate are presented in the right-hand columns. Because the series itself did not perform well in evaluating the LPC’s share of the vote, it would not be appropriate to rank the polls using this estimate. The series having underestimated the LPC vote, the polls that erred on the high side, both Ekos and Léger, were more accurate. Finally, the firms’ errors in estimating the BQ vote follow a similar pattern: Léger and Ekos generally fared well; Sondagem and Ipsos-Reid were not as good but were reasonably accurate; Environics was imprecise.

In sum, for Canada as a whole, the polls did a fairly good job of estimating voter intent. But in Quebec the underestimation of voter intent for the Liberals was a replica of the 1998 Quebec provincial campaign. The literature offers three broad explanations for such phenomena: a last-minute shift in voter intent; a spiral of silence or “social-acceptability” effect in which respondents do not reveal their true intentions; and, finally, a sampling problem in which a sub-group of voters that is different from the overall voting group is either deliberately excluded from the sample or cannot be reached or convinced to cooperate.

In earlier work with Blais and Vachon, I have shown that a late campaign shift cannot fully account for the discrepancy between the polls’ estimates and the eventual vote in the 1998 Quebec election. Having two such shifts in a row seems unlikely and, in any case, no commentator has suggested that such a shift occurred during the last week of the 2000 federal election campaign. Moreover, SOM surveys conducted during Quebec by-elections in October 2001 showed a similar tendency to underestimate voter intent for the Liberals by as much as ten per cent.

Of the two remaining possibilities, if the main explanation lies in a “spiral of silence” effect, the reverse effect — i.e., a systematic underestimation of the Parti Québécois/Bloc Québécois vote — could happen if polls start to show the Liberal Party ahead in voter intent. On the other hand, if the explanation has primarily to do with sampling framework and management, the underestimation of the Liberal Party will remain a factor.

Two general conclusions may be drawn from this analysis. First, the systematic underestimation of voter intent for the Liberals in Quebec that appeared in the 1998 Quebec election was still a factor in the 2000 Canadian election. Polls conducted during recent by-elections in Quebec, albeit by a single pollster, appear to indicate that this underestimation remains a factor. Until the reasons for such a systematic bias are found, it would be prudent to bear this in mind when examining various polls conducted in Quebec.

The second conclusion is that some polls — perhaps especially the tracking polls — seem to produce systematically higher variability in their estimates. Caution should therefore be exercised in inferring from these polls’ estimates that voter intentions are swinging substantially.