Here, what is interesting is that even while the vote share of the LDF has gone up by 2 per cent, the number of predicted seats for LDF comes down by about 3. Here is where, as we shall argue below, the robustness of the prediction model and the randomness of the sample can be called into serious question.
In any psephology survey, there are several typical steps, each of which could be affected by margins of error. First, you randomly select a sample set of constituencies for the survey. Here, it is important to know whether the sample of constituencies is randomly selected or purposively selected. Some may argue that if randomly selected, the diversity of electoral regions may not be captured. However, if purposively selected, the “purpose” and its rationale need to be publicised and justified.
Secondly, within these constituencies, you randomly select a set of voters. Here again, two sets of errors can enter. There could be a sampling error, which could be the outcome of the choice of the sampling design itself. There could also be a non-sampling error, which arises from errors in measurement and recording of data.
Thirdly, to compute vote shares, you use a suitable model of voting behaviour using past data in conjunction with the vote shares computed from sample opinion surveys. That is, you have baseline information (from the previous election) with the vote share of each party and who won how many seats. Based on this past information and the results of the present opinion poll, you estimate a “swing” for each party. Applying this state-level swing factor gives you the estimate for the state-wide share of votes for each party. You can also compute swings in regions within the state – i.e., for Malabar, central Kerala and south Kerala. Here, of course, the major possible source of error lies in the assumption of uniform swing factors across the state or its regions.
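The swing calculation described above can be sketched in a few lines of Python. This is only an illustration of the uniform-swing idea, not any agency's actual model; the vote shares and the helper name `uniform_swing` are hypothetical.

```python
# Hypothetical numbers throughout; none are taken from any survey.

def uniform_swing(baseline_state, poll_state, baseline_seat):
    """Apply each party's state-level swing unchanged to one constituency."""
    swing = {p: poll_state[p] - baseline_state[p] for p in baseline_state}
    projected = {p: baseline_seat[p] + swing[p] for p in baseline_seat}
    return swing, projected

baseline_state = {"LDF": 48.0, "UDF": 43.0, "Others": 9.0}  # previous election, per cent
poll_state = {"LDF": 45.0, "UDF": 46.0, "Others": 9.0}      # current opinion poll, per cent
baseline_seat = {"LDF": 49.0, "UDF": 45.0, "Others": 6.0}   # one constituency, previous election

swing, projected = uniform_swing(baseline_state, poll_state, baseline_seat)
winner = max(projected, key=projected.get)
print(swing)      # state-level swing per party
print(projected)  # projected shares in the constituency
print(winner)     # front with the highest projected share
```

In this made-up case a 3-point swing is enough to flip a seat that was held by 4 points, which is why the uniformity assumption carries so much weight.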
Fourthly, the number of predicted seats is arrived at from these vote shares. It is here that the maximum possibility of error lies. Let me take a common method used, which I borrow from the well-known statistician and psephologist Rajeeva Karandikar. After assuming that the swing factor is constant across a region and the state, you further assume that the swing in one seat is a convex combination of the state-level swing and the region-level swing. A convex combination is a linear combination of points, where all coefficients are non-negative and sum up to 1. Based on this swing factor for each seat, you arrive at a winning probability for each candidate. Summing the winning probabilities of the UDF/LDF candidates across all seats gives you the total number of seats for UDF/LDF in the state as a whole.
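A minimal sketch of this step follows. The convex combination and the summing of probabilities follow the description above; the way a projected margin is turned into a winning probability (a normal error with an assumed standard deviation `sd`) is my own simplifying assumption, as are all the numbers and function names.

```python
import math

def seat_swing(state_swing, region_swing, weight):
    """Convex combination: weight on the state swing, (1 - weight) on the region swing."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight * state_swing + (1.0 - weight) * region_swing

def win_probability(margin, sd):
    """P(true margin > 0), assuming a normal error with standard deviation sd."""
    return 0.5 * (1.0 + math.erf(margin / (sd * math.sqrt(2.0))))

def expected_seats(baseline_margins, regions, state_swing, region_swings, weight, sd):
    """Sum the winning probabilities of one front over all seats."""
    total = 0.0
    for margin, region in zip(baseline_margins, regions):
        s = seat_swing(state_swing, region_swings[region], weight)
        total += win_probability(margin + s, sd)
    return total

# Hypothetical lead of one front over the other (percentage points) in four seats.
baseline_margins = [6.0, 2.0, -1.0, -8.0]
regions = ["north", "north", "south", "south"]
region_swings = {"north": -5.0, "south": -1.0}  # assumed change in the lead, by region

seats = expected_seats(baseline_margins, regions, -3.0, region_swings, weight=0.5, sd=4.0)
print(round(seats, 2))
```

Both `weight` and `sd` are judgment calls, and the expected seat count moves with each of them.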
However, most survey agencies never publicise the exact method of transforming vote shares into number of seats. It is always a black box, except in a few rare instances. For example, in 1998, the Frontline-APT Research Group opinion poll in Tamil Nadu publicised the details of its methodology of converting vote shares into number of seats. This survey used a modified version (to suit Indian realities) of the “Cube Law” for this purpose; the Cube Law states that if the vote shares of two parties ‘A’ and ‘B’ are ‘a’ and ‘b’, their seats will be in the ratio a³ to b³. In most cases of opinion polls, the actual method remains unknown.
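To see what the Cube Law implies, here is a small sketch of its textbook two-party form. It is not the modified version used in the Frontline-APT poll, whose details are not reproduced here; the vote shares are hypothetical.

```python
def cube_law_seats(a, b, total_seats):
    """Split total_seats between two parties in the ratio a**3 : b**3."""
    if a <= 0 or b <= 0:
        raise ValueError("vote shares must be positive")
    seats_a = total_seats * a**3 / (a**3 + b**3)
    return seats_a, total_seats - seats_a

# Hypothetical two-party contest: 55 per cent against 45 per cent, 100 seats.
seats_a, seats_b = cube_law_seats(55.0, 45.0, 100)
print(round(seats_a, 1), round(seats_b, 1))  # 64.6 35.4
```

A 10-point lead in votes becomes a lead of roughly 29 seats in 100, which illustrates how strongly a first-past-the-post system can magnify small differences in vote share.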
According to Professor Venkatesh Athreya, a close observer of opinion polls,
In our first-past-the-post electoral system, an element of judgment is inevitable in moving from vote share forecasts to seat predictions, since there is no one-to-one correspondence between the two. Factors such as the degree of geographical concentration of a political formation's voter support, the degree of polarity in the electoral contest (whether a contest is bipolar, tripolar or even more multipolar) and the complexity (both on paper and in practice) of the electoral adjustments worked out are all crucial to determining the extent to which an increase in vote share translates into an increase in seats. This particular aspect needs to be borne in mind when evaluating or utilising opinion poll survey results.
In the Asianet-C Fore survey, only sketchy details of the methodology used are given. Asianet-C Fore had done a first survey in the month of February 2011, and the methodological details of that survey are available. I give below the only details that they have provided:
C fore (Centre for Forecasting & Research) conducted the [first] pre-poll survey between 23rd February and 7th March, 2011 in Kerala. In all, 6112 voters were interviewed using a structured questionnaire from 40 assembly constituencies using systematic random sampling method. In each constituency 5 urban and 15 rural locations were selected. For every location, a starting point was selected randomly in north, south, east and west direction. From each starting point right hand rule was followed and one person (above 18 years of age) was interviewed from a household with interval of 10 households. Thus, in all, polling was conducted in 200 urban and 600 rural localities in the state. Care was taken to ensure that different castes and communities were represented in the sample in their actual proportion. The survey has a margin of error of 1 percentage point at 90 percent confidence level.
It is not clear if the same methodology was followed for the second survey also.
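As a rough check on the margin of error quoted for the first survey, the standard formula for a proportion can be applied to the reported sample of 6,112. The sketch assumes simple random sampling and the worst case p = 0.5; the design effect of 2 in the second call is purely illustrative, since no design effect is reported for the survey.

```python
import math

def margin_of_error(n, z, p=0.5, design_effect=1.0):
    """Margin of error for a proportion, optionally inflated by a design effect."""
    return z * math.sqrt(design_effect * p * (1.0 - p) / n)

Z_90 = 1.645  # two-sided 90 per cent confidence level

srs = margin_of_error(6112, Z_90)
clustered = margin_of_error(6112, Z_90, design_effect=2.0)  # illustrative value only
print(round(100 * srs, 2), round(100 * clustered, 2))  # in percentage points
```

Under simple random sampling the figure comes to roughly one percentage point, close to what is quoted. A sample clustered into 20 locations in each of 40 constituencies would normally carry a design effect above 1, and hence a wider margin.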
Further, there are no details of (a) whether the 40 constituencies were selected randomly or purposively; (b) whether regional considerations were taken into account while selecting the 40 constituencies; (c) how care was taken to ensure that different castes and communities are represented in the sample; and (d) what method was used to convert vote shares into seats. Also missing are scenarios that show how the predictions change in the face of errors in the estimated swings.
Given these handicaps, it is extremely difficult to comment on the results of the Asianet-C Fore survey. This is because, as explained before, the robustness of the model and randomness of the sample are critical in the prediction of the number of seats. If the sample is not random (both the sample of constituencies and the sample of voters), results can go haywire. If the prediction model is sensitive to small shifts, results can go bizarre. From what we know of the results of Asianet-C Fore, these errors could be large.
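The sensitivity of seat counts to small shifts can be illustrated with a toy example. The ten margins below are hypothetical and simply bunch several seats close to the tipping point; the helper `seats_won` is not part of any agency's method.

```python
def seats_won(baseline_margins, swing):
    """Count seats where the projected margin (baseline + swing) stays positive."""
    return sum(1 for m in baseline_margins if m + swing > 0)

# Hypothetical baseline margins (percentage points) for one front in ten seats.
margins = [9.0, 6.5, 4.0, 3.5, 3.2, 2.8, 2.5, 1.0, -2.0, -5.0]

for swing in (-2.0, -3.0, -4.0):
    print(swing, seats_won(margins, swing))
```

With these numbers, each extra point of adverse swing costs two or three seats out of ten. An error of a point or two in the estimated swing can therefore change the headline prediction.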
Below, I list some of the serious anomalies in the results. Here, I shall use the post-poll results from two earlier elections in Kerala – 2004 and 2009 – to compare the nature of shifts that the 2011 results throw up. The 2004 and 2009 surveys were conducted as part of the National Election Study (NES) of Lokniti in New Delhi. In comparison with the NES surveys, the community-wise shift in votes in the Asianet-C Fore survey is extremely surprising (see Tables 1, 2 and 3 below). For instance,
- Compared to 2004 Parliament elections, the upper caste Hindu vote share of LDF has fallen from 40 per cent to 13 per cent.
- While the share among Ezhavas has increased for the LDF (from 58 per cent in 2004 to 68 per cent in 2011), the share among Hindu OBCs has fallen from 52 per cent in 2004 to 40 per cent in 2011. How can the share of Ezhava votes for the LDF increase and the share of Hindu OBC votes for the LDF fall, both so significantly? For the Hindu OBCs, the vote share for the UDF has increased from 17 per cent in 2004 to 41 per cent in 2011.
- Among Dalits, the vote share of LDF in 2004 was 72 per cent and in 2009 was 69 per cent, according to NES. However, the Asianet survey shows that it has fallen to 50 per cent in 2011.
- No survey anywhere in Kerala has shown till now that the vote share among upper-caste Hindus and Syrian Christians is higher for BJP and others than the LDF. The survey is saying that a non-UDF/non-LDF combination can net more votes among Syrian Christians than the LDF. This appears to be an extraordinarily mistaken result that defies common sense.
TABLE 1
NATIONAL ELECTION STUDY (NES) RESULTS FOR 2004, KERALA, in per cent
Caste/Religion | Share for LDF | Share for UDF | Share for BJP | Share for Others | N |
Hindu upper caste | 40 | 43 | 12 | 5 | 93 |
Nairs | 42 | 29 | 26 | 4 | 84 |
Ezhavas | 58 | 22 | 18 | 2 | 238 |
OBCs | 52 | 17 | 27 | 4 | 75 |
Dalits | 72 | 17 | 8 | 2 | 87 |
Muslims | 41 | 57 | 1 | 1 | 140 |
Christians | 31 | 62 | 2 | 5 | 214 |
Source: National Election Study, 2004, weighted data set.
TABLE 2
NATIONAL ELECTION STUDY (NES) RESULTS FOR 2009, KERALA, in per cent
Caste/Religion | Share for LDF | Swing for LDF from 2004 | Share for UDF | Swing for UDF from 2004 |
Nairs | 27 | -14 | 33 | +4 |
Ezhavas | 57 | -1 | 27 | +5 |
Dalits | 69 | -5 | 15 | +5 |
Muslims | 26 | -3 | 69 | -2 |
Christians | 32 | -15 | 69 | +13 |
Source: National Election Study, 2009, weighted data set.
TABLE 3
ASIANET-C Fore RESULTS OF THE SECOND SURVEY, 2011, KERALA
Caste/Religion | Share (%) for LDF | Share (%) for UDF | Share (%) for BJP & Others |
Hindu upper caste | 13 | 60 | 27 |
Ezhavas | 68 | 25 | 7 |
Hindu OBCs | 40 | 41 | 19 |
Dalits | 50 | 34 | 16 |
Syrian Christians | 11 | 77 | 12 |
Other Christians | 14 | 73 | 13 |
Muslims | 23 | 70 | 7 |
Source: Asianet
There is further evidence of bias in the second Asianet-C Fore survey. Let us compare the results of the first survey and second survey.
- In the first survey, among Hindu upper caste voters, 65 per cent would vote for UDF, 22 per cent would vote for LDF and 13 per cent would vote for Others (see Table 4 below).
- However, in the second survey, among Hindu upper caste voters, 60 per cent would vote for UDF, 13 per cent would vote for LDF and 27 per cent would vote for Others (see Table 3 above).
TABLE 4
ASIANET-C Fore RESULTS OF THE FIRST SURVEY, 2011, KERALA
Caste/Religion | Share (%) for LDF | Share (%) for UDF | Share (%) for BJP & Others |
Hindu upper caste | 22 | 65 | 13 |
Ezhavas | 47 | 35 | 18 |
Hindu OBCs | 42 | 37 | 21 |
Dalits | 54 | 31 | 15 |
Syrian Christians | 14 | 70 | 16 |
Other Christians | 22 | 68 | 10 |
Muslims | 21 | 72 | 7 |
Source: Asianet
How did such a large share of Hindu upper caste voters decide to vote against both UDF and LDF, and in favour of Others, just over a period of one month? Was there a deliberately induced bias in the second survey?
All these inexplicable trends show that serious questions can be raised regarding the randomness of the sample used in the Asianet-C Fore survey. Further, given that the same survey shows that the candidate’s personal qualities are important in voting patterns and that “political affiliations” are less important, would the assumption of a state-level or region-level swing be appropriate? Is that a safe assumption? It appears not, and this further reinforces the doubt that the sampling is not as random as it should be.
Robust psephology, as pioneers in the field would tell you, requires good statistics, lots of common sense and a good understanding of ground realities (“domain knowledge”). The Asianet-C Fore survey appears lacking in all three.
In addition, the results of other opinion polls appear at variance with Asianet-C Fore’s. The Institute for Monitoring Economic Growth (IMEG) has announced its opinion poll results. On the positive side, more details are available from IMEG regarding methodology than Asianet-C Fore:
This survey uses a somewhat improved methodology, drawing on the successful studies that IMEG conducted in the last three general elections and in the Ernakulam by-election. The result of this opinion poll is the pooled result of three types of surveys. The first is a swing survey covering voters’ loyalty to the fronts and shifts in that loyalty. The second covers voters’ opinions on the central and state governments; the correlation between the newspapers voters read and the television channels on which they watch news, on the one hand, and their political leanings, on the other; and voters’ interest in computers and the internet. The third is a direct opinion survey. In randomly selected houses in selected wards of Kerala’s 140 constituencies, respondents were given a slip on which to record in secret whom they intended to vote for among the UDF, the LDF, the BJP and other major parties; the opinion was recorded on the slip, which was then sealed and deposited in specially prepared covers. Alongside this, IMEG faculty members directly conducted a hit-and-run survey on matching samples in all the Assembly constituencies. The survey result combines all three.
Across the three studies, the IMEG team met a total of 59,678 people for the survey. The sampling error (SE) in these studies can be put at 2 per cent and the non-sampling error at 1 per cent.
Here, the claim is that all 140 constituencies were covered as in a census, and there was no sampling of constituencies. This does appear to be a needless effort to “spread resources thin”; as Venkatesh Athreya has noted:
A priori, results from surveys with a significantly larger number of sample constituencies may be regarded as more robust. This is in some ways more important than the size of the sample in terms of the number of respondents per se, provided of course that the latter does not go below a critical minimum figure. It must be noted, however, that increasing the number of sample constituencies beyond a critical minimum size also does not yield much greater precision in vote share estimates.
Yet, given that the sample size of voters is also higher in the IMEG survey, ceteris paribus, it appears to have higher reliability than the Asianet-C Fore survey.
The IMEG survey results suggest that the UDF may win 72-82 seats, while the LDF may get 58-68 Assembly seats. The BJP has very little chance of opening its account, even though it may improve its position. The survey also observes that there is a very strong contest in 20 Assembly segments, where the results can go either way. This finding on 20 seats should have led IMEG to declare the election too close to call. Although IMEG has nonetheless gone ahead and predicted seat ranges, its result appears more realistic than Asianet-C Fore’s, partly because of the larger spread of constituencies (all 140, as compared to 40) and partly because of the larger sample size (59,678 as compared to 6,112).
A third survey result has been reported by an agency called Centre for Electoral Studies (CES), financed fully by Asianet. However, Asianet has been trying to underplay the results of this survey and overplay its predictions with C Fore. Asianet has even refused to scroll these results in the news bar, though it did allow 9 minutes during its News Hour show for a discussion on this survey. According to Dr Syam Lal, who is attached to CES, the sample size of voters was 3,625 from 35 constituencies. In each constituency, 105 respondents were selected by systematic random sampling from 3 polling stations, which were themselves selected by systematic random sampling. Voter preferences were elicited through an actual mock ballot. The 35 constituencies were selected by probability proportionate to size sampling.
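The two sampling ideas mentioned here, probability proportionate to size (PPS) selection of constituencies and systematic random sampling of respondents, can be sketched as follows. This is a generic textbook version and not CES's actual procedure; the electorate sizes, the voter list and the seed are all hypothetical.

```python
import random

def pps_systematic(sizes, k, rng):
    """Select k units with probability proportional to size (systematic PPS).

    A unit larger than the sampling step could be picked more than once.
    """
    total = sum(sizes.values())
    step = total / k
    start = rng.random() * step
    points = [start + i * step for i in range(k)]
    chosen, cumulative, idx = [], 0.0, 0
    for name, size in sizes.items():
        cumulative += size
        while idx < k and points[idx] < cumulative:
            chosen.append(name)
            idx += 1
    return chosen

def systematic_sample(items, n, rng):
    """Pick n items at a fixed interval from a random start (needs len(items) >= n)."""
    interval = len(items) // n
    start = rng.randrange(interval)
    return [items[start + i * interval] for i in range(n)]

rng = random.Random(0)  # fixed seed so that the sketch is reproducible

# Hypothetical electorates for 140 constituencies.
electorates = {f"AC{i:03d}": 100_000 + 1_000 * i for i in range(1, 141)}
chosen = pps_systematic(electorates, 35, rng)

# Hypothetical voter list for one chosen constituency.
voter_list = [f"voter{j}" for j in range(1_000)]
respondents = systematic_sample(voter_list, 105, rng)
print(len(chosen), len(respondents))
```

Under PPS, larger constituencies are more likely to be chosen, so taking a fixed number of respondents in each chosen constituency gives every voter a roughly equal chance of selection.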
The CES survey also shows results very different from Asianet-C Fore’s. According to the CES, the difference in vote shares between the LDF and UDF is extremely narrow; the UDF has a vote share of 44.9 per cent, while the LDF has a vote share of 44.3 per cent. Accordingly, the seat predictions move much in favour of the LDF: the LDF would get 64 to 70 seats, while the UDF would get 70 to 76 seats. This represents an extremely close finish, with slight margins of error becoming capable of tilting the balance either way.
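How narrow this lead is relative to sampling error can be gauged with a rough calculation. The sketch below uses the reported vote shares and sample size, but treats the sample as a simple random sample, which is an assumption on my part; the actual multi-stage design would, if anything, widen the margin.

```python
import math

def lead_margin_of_error(p1, p2, n, z=1.96):
    """Margin of error for the lead p1 - p2 of two shares from the same sample."""
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)

lead = 0.449 - 0.443                           # UDF lead over LDF reported by CES
moe = lead_margin_of_error(0.449, 0.443, 3625)  # at the 95 per cent confidence level
print(round(100 * lead, 1), round(100 * moe, 1))  # in percentage points
```

The reported lead of 0.6 percentage points is several times smaller than the margin of error on the lead, so the poll cannot distinguish a narrow UDF win from a narrow LDF win.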
On balance, the results of Asianet-C Fore are at great variance with both the IMEG survey and the CES survey. Protests have already arisen in Kerala against the second Asianet-C Fore survey, alleging that it is politically motivated. One argument is that given Asianet’s BJP connection (its leading shareholder Rajeev Chandrasekhar is close to the BJP), the agency might have over-sampled from constituencies where the BJP has a stronger presence. The results, thus, show a 9 per cent vote share for the BJP in the State, which is highly unrealistic. This could have also led to biased estimates regarding how different communities vote; in regions where the BJP is strong, its major vote-base is the Hindu upper castes, mainly Nairs.
Of course, the final verdict is in the hands of voters. Only time will tell if the Asianet-C Fore predictions are correct or not. However, opinion polls do have an influence on voters’ decisions. Hence, it is important that the full methodological details of these surveys are put in the public domain by the agencies concerned. Asianet-C Fore appears less forthcoming in doing so, and this does raise doubts.