Saturday, April 2, 2011

Biases in the second Asianet-C Fore opinion polls in Kerala

R. Ramakumar

Asianet-C Fore has come up with a controversial set of estimates in its second opinion poll in the run-up to the Kerala Assembly elections 2011. There are several problems in the results of the opinion poll, and it appears that many criticisms against the survey are justified.

The results of the SECOND SURVEY show a major change in vote shares and seats from the earlier survey conducted by Asianet-C Fore. On March 9th, Asianet-C Fore published results of the FIRST SURVEY that showed that
  • UDF will win 77-87 seats
  • LDF will win 53-63 seats
  • BJP will win up to 5 seats
Further, the vote shares of each party were the following:
  • UDF: 43 per cent
  • LDF: 39 per cent
  • BJP and others: 18 per cent
The new results of the SECOND SURVEY that were shown on March 31st show a very different picture. This new survey shows that,
  • UDF will win 80-90 seats
  • LDF will win 50-60 seats
  • BJP will win up to 2 seats
Further, the vote shares were the following:
  • UDF: 46 per cent
  • LDF: 41 per cent
  • BJP: 9 per cent
  • Others: 4 per cent
Here, what is interesting is that even while the vote share of the LDF has gone up by 2 percentage points, the predicted number of seats for the LDF has come down by 3 at both ends of the range. Here is where, as we shall argue below, the robustness of the prediction model and the randomness of the sample can be called into serious question.

In any psephology survey, there are typically four steps, each of which could be affected by margins of error. First, you randomly select a sample set of constituencies for the survey. Here, it is important to know whether the sample of constituencies is randomly selected or purposively selected. Some may argue that if randomly selected, the diversity of electoral regions may not be captured. However, if purposively selected, the “purpose” and its rationale need to be publicised and justified.

Secondly, within these constituencies, you randomly select a set of voters. Here again, two sets of errors can enter. There could be a sampling error, which could be the outcome of the choice of the sampling design itself. There could also be a non-sampling error, which arises from errors in measurement and recording of data.

Thirdly, to compute vote shares, you use a suitable model of voting behaviour using past data in conjunction with the vote shares computed from sample opinion surveys. That is, you have baseline information (from the previous election) with the vote share of each party and who won how many seats. Based on this past information and the results of the present opinion poll, you estimate a “swing” for each party. Applying this state-level swing factor gives you the estimate of the state-wide share of votes for each party. You can also compute swings in regions within the state, i.e., for Malabar, central Kerala and south Kerala. Here, of course, the major possible source of error lies in the assumption that the swing is uniform across the state or across a region.
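The swing calculation described above can be sketched in a few lines of Python. This is only an illustration of a uniform state-level swing, not the model used by any agency. The poll shares are those of the second Asianet-C Fore survey (with BJP and Others combined), but the baseline shares, the seat names and the seat-level figures are all hypothetical placeholders.

```python
from typing import Dict

Shares = Dict[str, float]


def compute_swing(previous: Shares, poll: Shares) -> Shares:
    """Swing per party, in percentage points: poll share minus previous share."""
    return {party: poll[party] - previous[party] for party in previous}


def project_seat(seat_previous: Shares, swing: Shares) -> Shares:
    """Apply the same swing to one seat's previous shares, then renormalise to 100."""
    shifted = {p: max(s + swing.get(p, 0.0), 0.0) for p, s in seat_previous.items()}
    total = sum(shifted.values())
    return {p: 100.0 * s / total for p, s in shifted.items()}


def winner(shares: Shares) -> str:
    """Party with the largest projected share in a seat."""
    return max(shares, key=shares.get)


# HYPOTHETICAL baseline from "the previous election" (illustrative numbers only).
previous_state = {"UDF": 43.0, "LDF": 49.0, "Others": 8.0}
# Second Asianet-C Fore survey: UDF 46, LDF 41, BJP 9 + Others 4 = 13.
poll_state = {"UDF": 46.0, "LDF": 41.0, "Others": 13.0}
state_swing = compute_swing(previous_state, poll_state)

# HYPOTHETICAL seat-level results from the previous election.
previous_by_seat = {
    "Seat A": {"UDF": 40.0, "LDF": 52.0, "Others": 8.0},
    "Seat B": {"UDF": 45.0, "LDF": 47.0, "Others": 8.0},
    "Seat C": {"UDF": 50.0, "LDF": 44.0, "Others": 6.0},
}
projected_winners = {
    seat: winner(project_seat(shares, state_swing))
    for seat, shares in previous_by_seat.items()
}
print(state_swing)
print(projected_winners)
```

Because every seat receives the same shift, any seat whose real swing differs from the state average is projected wrongly, which is exactly the source of error noted above.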

Fourthly, the number of predicted seats is arrived at from these vote shares. It is here that the maximum possibility of error lies. Let me take a commonly used method, which I borrow from the well-known statistician and psephologist Rajeeva Karandikar. After assuming that the swing factor is constant across a region and the state, you further assume that the swing in one seat is a convex combination of the state-level swing and the region-level swing. A convex combination is a linear combination of points, where all coefficients are non-negative and sum up to 1. Based on this swing factor for each seat, you arrive at a winning probability for each candidate. The sum of the winning probabilities of the UDF/LDF candidates across all seats gives you the total number of seats for the UDF/LDF in the state as a whole.
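A minimal sketch of this fourth step follows. The convex combination and the summing of probabilities come from the description above. The normal error model used to turn a projected margin into a winning probability, the parameter `sigma`, the weight, and all seat data are my own assumptions for illustration; they are not taken from Karandikar's published method or from any survey.

```python
import math
from typing import Dict


def seat_swing(state_swing: float, region_swing: float, weight: float) -> float:
    """Convex combination: weight * state swing + (1 - weight) * region swing."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight * state_swing + (1.0 - weight) * region_swing


def win_probability(projected_margin: float, sigma: float) -> float:
    """P(true margin > 0), ASSUMING a normal error with std. dev. sigma (points)."""
    return 0.5 * (1.0 + math.erf(projected_margin / (sigma * math.sqrt(2.0))))


def expected_seats(
    previous_margins: Dict[str, float],  # front's lead in each seat last time
    seat_region: Dict[str, str],         # region each seat belongs to
    state_swing: float,                  # change in the lead, state level
    region_swings: Dict[str, float],     # change in the lead, per region
    weight: float,
    sigma: float,
) -> float:
    """Sum of winning probabilities over seats = expected number of seats."""
    total = 0.0
    for seat, margin in previous_margins.items():
        swing = seat_swing(state_swing, region_swings[seat_region[seat]], weight)
        total += win_probability(margin + swing, sigma)
    return total


# HYPOTHETICAL three-seat example, margins in percentage points.
margins = {"Seat A": -5.0, "Seat B": 0.0, "Seat C": 5.0}
regions = {"Seat A": "Malabar", "Seat B": "Central", "Seat C": "South"}
no_swing = {"Malabar": 0.0, "Central": 0.0, "South": 0.0}
print(expected_seats(margins, regions, 0.0, no_swing, weight=0.5, sigma=4.0))
```

Small changes in `weight` or `sigma` can move the seat total noticeably when many seats have narrow margins, which is why the choice of these parameters needs to be disclosed.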

However, most survey agencies never publicise the exact method of transforming vote shares into number of seats. It is almost always a black box, barring a few rare instances. For example, in 1998, the Frontline-APT Research Group opinion poll in Tamil Nadu publicised the details of its methodology of converting vote shares into number of seats. This survey used a modified version (to suit Indian realities) of the “Cube Law” for this purpose; the Cube Law states that if the vote shares of two parties ‘A’ and ‘B’ are ‘a’ and ‘b’, their seats will be in the ratio a³ to b³. In most cases of opinion polls, the actual method remains unknown. According to Professor Venkatesh Athreya, a close observer of opinion polls,

In our first-past-the-post electoral system, an element of judgment is inevitable in moving from vote share forecasts to seat predictions, since there is no one-to-one correspondence between the two. Factors such as the degree of geographical concentration of a political formation's voter support, the degree of polarity in the electoral contest (whether a contest is bipolar, tripolar or even more multipolar) and the complexity (both on paper and in practice) of the electoral adjustments worked out are all crucial to determining the extent to which an increase in vote share translates into an increase in seats. This particular aspect needs to be borne in mind when evaluating or utilising opinion poll survey results.
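To make the Cube Law mentioned above concrete, here is a small sketch that applies its plain, unmodified form to the UDF and LDF vote shares from the two Asianet-C Fore surveys. It is an illustration only: the modified version used by Frontline-APT is not reproduced here, nothing suggests that C Fore used the Cube Law, and the calculation ignores the BJP and others entirely.

```python
def cube_law_seats(share_a: float, share_b: float, total_seats: int,
                   exponent: float = 3.0) -> tuple:
    """Split total_seats between two parties in the ratio a^3 : b^3."""
    a = share_a ** exponent
    b = share_b ** exponent
    seats_a = total_seats * a / (a + b)
    return seats_a, total_seats - seats_a


TOTAL_SEATS = 140  # Kerala Assembly constituencies

# (UDF share, LDF share) in per cent, as reported in the two surveys.
first_survey = cube_law_seats(43.0, 39.0, TOTAL_SEATS)
second_survey = cube_law_seats(46.0, 41.0, TOTAL_SEATS)

print("First survey  (UDF, LDF): %.1f, %.1f" % first_survey)
print("Second survey (UDF, LDF): %.1f, %.1f" % second_survey)
```

Under this plain two-party rule the UDF gets roughly 80 seats from the first survey's shares and roughly 82 from the second's, a shift of about two seats.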

In the Asianet-C Fore survey, only sketchy details of the methodology used are given. Asianet-C Fore had done a first survey in the month of February 2011, and the methodological details of that survey are available. I give below the only details that they have provided:

C fore (Centre for Forecasting & Research) conducted the [first] pre-poll survey between 23rd February and 7th March, 2011 in Kerala. In all, 6112 voters were interviewed using a structured questionnaire from 40 assembly constituencies using systematic random sampling method. In each constituency 5 urban and 15 rural locations were selected. For every location, a starting point was selected randomly in north, south, east and west direction. From each starting point right hand rule was followed and one person (above 18 years of age) was interviewed from a household with interval of 10 households. Thus, in all, polling was conducted in 200 urban and 600 rural localities in the state. Care was taken to ensure that different castes and communities were represented in the sample in their actual proportion. The survey has a margin of error of 1 percentage point at 90 percent confidence level.

It is not clear if the same methodology was followed for the second survey also.
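The stated margin of error can be checked against the stated sample size. The sketch below uses the textbook formula for a proportion under simple random sampling, with the worst case p = 0.5. The `design_effect` parameter is my addition: the survey clusters interviews in 20 locations per constituency, which usually widens the margin, but C Fore reports no design effect, so the value 2.0 used here is purely hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(n: int, confidence: float = 0.90, p: float = 0.5,
                    design_effect: float = 1.0) -> float:
    """Margin of error in percentage points for a proportion p estimated from n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided critical value
    return 100.0 * z * math.sqrt(design_effect * p * (1.0 - p) / n)


srs_margin = margin_of_error(6112)                          # simple random sampling
clustered_margin = margin_of_error(6112, design_effect=2.0)  # HYPOTHETICAL design effect

print("Simple random sampling: %.2f points" % srs_margin)
print("With design effect 2.0: %.2f points" % clustered_margin)
```

With 6,112 respondents the simple-random-sampling margin at 90 per cent confidence comes to just over 1 percentage point, consistent with the agency's claim for the state-wide shares. It does not apply to community-wise sub-samples, which are much smaller.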

Further, there are no details of (a) whether the 40 constituencies were selected randomly or purposively; (b) whether regional considerations were taken into account while selecting the 40 constituencies; (c) how care was taken to ensure that different castes and communities are represented in the sample; and (d) what method was used to convert vote shares into seats. Also missing are scenarios that show how the predictions change in the face of errors in the estimated swings.

Given these handicaps, it is extremely difficult to comment on the results of the Asianet-C Fore survey. This is because, as explained before, the robustness of the model and randomness of the sample are critical in the prediction of the number of seats. If the sample is not random (both the sample of constituencies and the sample of voters), results can go haywire. If the prediction model is sensitive to small shifts, results can turn bizarre. From what we know of the results of Asianet-C Fore, these errors could be large.

Below, I list some of the serious anomalies in the results. Here, I shall use the post-poll results from two earlier elections in Kerala – 2004 and 2009 – to compare the nature of shifts that the 2011 results throw up. The 2004 and 2009 surveys were conducted as part of the National Election Study (NES) of Lokniti in New Delhi. In comparison with the NES surveys, the community-wise shift in votes in the Asianet-C Fore survey is extremely surprising (see Tables 1, 2 and 3 below). For instance,

  •  Compared to 2004 Parliament elections, the upper caste Hindu vote share of LDF has fallen from 40 per cent to 13 per cent. 
  • While the share among Ezhavas has increased for the LDF (from 58 per cent in 2004 to 68 per cent in 2011), the share among Hindu OBCs has fallen from 52 per cent in 2004 to 40 per cent in 2011. How can the share of Ezhava votes for the LDF increase and the share of Hindu OBC votes for the LDF fall, both so significantly? For the Hindu OBCs, the vote share for the UDF has increased from 17 per cent in 2004 to 41 per cent in 2011.
  • Among Dalits, the vote share of LDF in 2004 was 72 per cent and in 2009 was 69 per cent, according to NES. However, the Asianet survey shows that it has fallen to 50 per cent in 2011.
  • No survey anywhere in Kerala has shown till now that the vote share among upper-caste Hindus and Syrian Christians is higher for BJP and others than for the LDF. The survey is saying that a non-UDF/non-LDF combination can net more votes among Syrian Christians than the LDF. This appears to be an extraordinarily mistaken result that defies common sense.

    TABLE 1
    NATIONAL ELECTION STUDY (NES) RESULTS FOR 2004, KERALA, in per cent

    Caste/Religion       Share for                      N
                         LDF    UDF    BJP    Others
    Hindu upper caste     40     43     12         5     93
    Nairs                 42     29     26         4     84
    Ezhavas               58     22     18         2    238
    OBCs                  52     17     27         4     75
    Dalits                72     17      8         2     87
    Muslims               41     57      1         1    140
    Christians            31     62      2         5    214

    Source: National Election Study, 2004, weighted data set.
     
     
    TABLE 2
    NATIONAL ELECTION STUDY (NES) RESULTS FOR 2009, KERALA, in per cent

    Caste/Religion   Share for LDF   Swing for LDF from 2004   Share for UDF   Swing for UDF from 2004
    Nairs                       27                       -14              33                        +4
    Ezhavas                     57                        -1              27                        +5
    Dalits                      69                        -5              15                        +5
    Muslims                     26                        -3              69                        -2
    Christians                  32                       -15              69                       +13

    Source: National Election Study, 2009, weighted data set.

    TABLE 3
    ASIANET-C Fore RESULTS OF THE SECOND SURVEY, 2011, KERALA

    Caste/Religion       Share (%) for
                         LDF    UDF    BJP & Others
    Hindu upper caste     13     60              27
    Ezhavas               68     25               7
    Hindu OBCs            40     41              19
    Dalits                50     34              16
    Syrian Christians     11     77              12
    Other Christians      14     73              13
    Muslims               23     70               7

    Source: Asianet


    There is further evidence of bias in the second Asianet-C Fore survey. Let us compare the results of the first survey and second survey.

    • In the first survey, among Hindu upper caste voters, 65 per cent would vote for UDF, 22 per cent would vote for LDF and 13 per cent would vote for Others (see Table 4 below).
    • However, in the second survey, among Hindu upper caste voters, 60 per cent would vote for UDF, 13 per cent would vote for LDF and 27 per cent would vote for Others (see Table 3 above).

    TABLE 4
    ASIANET-C Fore RESULTS OF THE FIRST SURVEY, 2011, KERALA

    Caste/Religion       Share (%) for
                         LDF    UDF    BJP & Others
    Hindu upper caste     22     65              13
    Ezhavas               47     35              18
    Hindu OBCs            42     37              21
    Dalits                54     31              15
    Syrian Christians     14     70              16
    Other Christians      22     68              10
    Muslims               21     72               7

    Source: Asianet

    How did such a large share of Hindu upper caste voters decide to vote against both UDF and LDF, and in favour of Others, just over a period of one month? Was there a deliberately induced bias in the second survey?
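Whether a shift of this size could be sampling noise depends on how many Hindu upper caste respondents each survey contained, and Asianet-C Fore has not published those numbers. The sketch below runs a standard pooled two-proportion z-test on the rise in the "BJP & Others" share among these voters (13 per cent to 27 per cent) for a few sub-sample sizes. The sizes 50, 150 and 300 are purely hypothetical, and the test assumes simple random sampling within the group.

```python
import math


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """Pooled two-proportion z statistic for the change from p1 to p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p2 - p1) / std_error


FIRST_SURVEY_SHARE = 0.13   # "BJP & Others" among Hindu upper castes, Table 4
SECOND_SURVEY_SHARE = 0.27  # same group, Table 3

# HYPOTHETICAL sub-sample sizes; the real ones are not published.
for n in (50, 150, 300):
    z = two_proportion_z(FIRST_SURVEY_SHARE, n, SECOND_SURVEY_SHARE, n)
    print("n = %3d per survey: z = %.2f" % (n, z))
```

At the 90 per cent confidence level the critical value of |z| is about 1.645, so the verdict turns on the sub-sample size. This is one more reason why the community-wise sample sizes should be disclosed.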

    All these unexplained trends raise serious questions about the randomness of the sample used in the Asianet-C Fore survey. Further, given that the same survey shows that the candidate’s personal qualities are important in voting patterns and that “political affiliations” are less important, would the assumption of a state-level or region-level swing be appropriate? Is that a safe assumption? It appears not, and this further reinforces the doubt that the sampling is not as random as it should be.

    Robust psephology, as pioneers in the field would tell you, requires good statistics, lots of common sense and a good understanding of ground realities (“domain knowledge”). The Asianet-C Fore survey appears lacking in all three.

    In addition, the results of other opinion polls appear at variance with Asianet-C Fore’s. The Institute for Monitoring Economic Growth (IMEG) has announced its opinion poll results. On the positive side, more details are available from IMEG regarding methodology than Asianet-C Fore:

    The survey used a methodology somewhat improved over that of the successful studies IMEG conducted in the last three general elections and in the Ernakulam by-election. The opinion poll result is the pooled result of three types of surveys. The first is a swing survey covering loyalty to the fronts and shifts in loyalty. The second covers voters' opinions about the central and state governments, the correlation between voters' political leanings and the newspapers they read and the television channels on which they watch news, and voters' computer and internet interests. The third type is a direct opinion survey. In randomly selected households in selected wards of the 140 constituencies of Kerala, voters were given a slip on which to record in secret which of the UDF, LDF, BJP and other major parties they intend to vote for; they recorded their choice on the slip, sealed it and deposited it in specially prepared covers. Alongside this, IMEG faculty members directly conducted a hit-and-run survey on matching samples in all constituencies. The survey result is the combination of all three.

    Across the three studies, the IMEG team met a total of 59,678 people for the survey. The sampling error (SE) in the studies can be taken as 2 per cent and the non-sampling error as 1 per cent.

    Here, the claim is that all 140 constituencies were covered as in a census, and there was no sampling of constituencies. This does appear to be a needless effort to “spread resources thin”; as Venkatesh Athreya has noted:

    A priori, results from surveys with a significantly larger number of sample constituencies may be regarded as more robust. This is in some ways more important than the size of the sample in terms of the number of respondents per se, provided of course that the latter does not go below a critical minimum figure. It must be noted, however, that increasing the number of sample constituencies beyond a critical minimum size also does not yield much greater precision in vote share estimates.

    Yet, given that the sample size of voters is also higher in the IMEG survey, ceteris paribus, it appears to have higher reliability than the Asianet-C Fore survey.

    The IMEG survey results reveal that the UDF may win 72-82 seats, while the LDF may get 58-68 Assembly seats. The BJP has very little chance of opening its account, even though it may improve its position. The survey also observes that there is a very close contest in 20 Assembly segments, where the results can go either way. This finding on 20 seats should have forced IMEG to declare the election too close to call. Even though IMEG has gone ahead and made a full seat prediction, its result appears to be more realistic than Asianet-C Fore’s, partly because of the larger spread of constituencies (all 140, as compared to 40) and partly because of the larger sample size (59,678 as compared to 6,112).

    A third survey result has been reported by an agency called Centre for Electoral Studies (CES), financed fully by Asianet. However, Asianet has been trying to underplay the results of this survey and overplay its predictions with C-Fore. Asianet has even refused to scroll these results in the news bar, though it did allow 9 minutes during its News Hour show for a discussion on this survey. According to Dr Syam Lal, who is attached to CES, the sample size of voters was 3625 from 35 constituencies. 105 respondents were selected from each constituency on the basis of systematic random sampling method from 3 polling stations selected again on systematic random sampling method. Voter preferences were elicited through an actual mock ballot. The 35 constituencies were selected on the basis of probability proportionate to size sampling method.
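For readers unfamiliar with probability-proportionate-to-size (PPS) selection, the sketch below shows one common variant, systematic PPS sampling, in which larger units have a proportionally larger chance of being picked. CES has not said which PPS variant it used, so this is an assumption. The constituency names and electorate sizes are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import Dict, List


def pps_systematic(sizes: Dict[str, int], k: int, rng: random.Random) -> List[str]:
    """Pick k units with probability proportional to size, using a fixed step."""
    names = list(sizes)
    cumulative = list(accumulate(sizes[name] for name in names))
    step = cumulative[-1] / k
    start = rng.random() * step  # random start in [0, step)
    picks = []
    for i in range(k):
        point = start + i * step
        # First unit whose cumulative size exceeds the selection point.
        index = min(bisect_right(cumulative, point), len(names) - 1)
        picks.append(names[index])
    return picks


# HYPOTHETICAL constituencies and electorate sizes.
electorates = {
    "Constituency 1": 150_000,
    "Constituency 2": 175_000,
    "Constituency 3": 120_000,
    "Constituency 4": 190_000,
    "Constituency 5": 160_000,
    "Constituency 6": 140_000,
}
sample = pps_systematic(electorates, k=3, rng=random.Random(2011))
print(sample)
```

A unit larger than the step can be picked more than once under this scheme; with constituencies of broadly similar size and a small sample, that does not arise.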

    The CES survey also shows results very different from Asianet-C Fore’s. According to the CES, the difference in vote shares between the LDF and UDF is extremely narrow; the UDF has a vote share of 44.9 per cent, while the LDF has a vote share of 44.3 per cent. Accordingly, the seat predictions move much in favour of the LDF: the LDF would get 64 to 70 seats, while the UDF would get 70 to 76 seats. This represents an extremely close finish, with slight margins of error becoming capable of tilting the balance either way.
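How close this finish is can be seen by computing the margin of error on the lead itself, that is, on the difference between the two vote shares. The sketch below uses the standard variance of a difference between two shares from the same sample, with the CES figures quoted above (44.9 and 44.3 per cent, 3625 respondents). It assumes simple random sampling and a 90 per cent confidence level; both are my assumptions, since CES's own error estimates are not available here.

```python
import math
from statistics import NormalDist


def lead_margin_of_error(p1: float, p2: float, n: int,
                         confidence: float = 0.90) -> float:
    """Margin of error (percentage points) on the lead p1 - p2 from one sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Var(p1_hat - p2_hat) for two shares of the same multinomial sample.
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return 100.0 * z * math.sqrt(variance)


UDF_SHARE, LDF_SHARE, RESPONDENTS = 0.449, 0.443, 3625

lead = 100.0 * (UDF_SHARE - LDF_SHARE)
lead_moe = lead_margin_of_error(UDF_SHARE, LDF_SHARE, RESPONDENTS)
print("Lead: %.1f points, margin of error on the lead: %.2f points" % (lead, lead_moe))
```

The margin on the lead comes to about 2.6 points against a lead of 0.6 points, so the CES figures cannot separate the two fronts.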

    On balance, the results of Asianet-C Fore are at great variance with both the IMEG survey and the CES survey. Protests have already arisen in Kerala against the second Asianet-C Fore survey, alleging that it is politically motivated. One argument is that given Asianet’s BJP connection (its leading shareholder Rajeev Chandrasekhar is close to the BJP), the agency might have over-sampled from constituencies where the BJP has a stronger presence. The results, thus, show a 9 per cent vote share for the BJP in the State, which is highly unrealistic. This could have also led to biased estimates regarding how different communities vote; in regions where the BJP is strong, its major vote-base is the Hindu upper castes, mainly Nairs.

    Of course, the final verdict is in the hands of voters. Only time will tell if the Asianet-C Fore predictions are correct or not. However, opinion polls do influence voters’ decisions. Hence, it is important that the full methodological details of these surveys are put in the public domain by the agencies concerned. Asianet-C Fore appears less than forthcoming in doing so, and this does raise doubts.

    8 comments:

    1. This comment has been removed by a blog administrator.

    2. You have taken quite some trouble, comrade Ramkumar... anyway, good language, I am impressed. But one problem: I think you missed the 2010 election :D... Don't just state only what leans towards the LDF. This is somewhat like third-rate yellow papers...

    3. @anonymous: I will be glad if you give me similar caste-wise break-ups for 2010 elections. I could not get it, and NES has not covered Kerala in 2010.

      You must also remember that 2004 was a pro-LDF election and 2009 was a pro-UDF election. So, I have taken both extremes in this comparison. In fact, in the 2010 elections, the LDF had improved its position compared to 2009.

      Your cynicism appears interesting, but hollow. People name-call (like your calling this a yellow paper) when they have exhausted all responses. Look for a better reason to thrash me, brother!

    4. I think your findings are partly true. The reasoning about bias from Rajeev Chandrashekhar and the BJP has damaged your argument. Basically, as per your reasoning, the BJP has been promoted, but what the survey says is 0-5 seats and 0-2 seats, which is not much.

      The social base of the UDF is much bigger: Christians + Muslims + Nairs. The loss of KCJ means the figures for 2009 are not comparable for Christians.

      The variations for social groups like Nairs at the 90% confidence level are not statistically significant.

      My personal view is that the second asianet survey is relatively correct in terms of voteshare but not in terms of seats.

    5. @anonymous:

      I reserve my curiousness about who you are!

      The point about Rajeev Chandrasekhar is an argument that has come up. I just stated it as one criticism in the air.

      Yes, the number of seats for BJP comes down from 5 to 2, but the vote share appears to have gone up.

      In 2009, the Left lost Christian votes, despite the KC (J). So, the loss of KC (J) after 2009 doesn't mean much.

      Finally, I do not think even by vote shares, Asianet-C Fore is right. Yes, the conversion of votes to seats is problematic, but I think the problem begins much earlier i.e., from vote shares itself.

      Of course, fingers crossed! :)

    6. If you want to know my political orientation it is BJP and i am also not optimistic of its chances of winning a seat (maybe Nemom).

      If you look at it the First Asianet survey predicting 18 % for others is wrong . This puts a question mark about their sampling technique.

      But the second survey predicts 46% to UDF and 41 to LDF. With a margin of error of 3% this seems reasonable considering UDF will sweep Malappuram , Kottayam and Idukki by big margins.

      Moving onto variance with other surveys , India Today has predicted a land slide for UDF , the track record of IMEG is not encouraging ,(despite huge sample they predicted 6-9 seats for LDF in 2009, 5 for udf in 2004 , got 2006 spot on though)

      While the number of seats is difficult to predict , in a tight race alliance strength matter , plus history of our electorate UDF seems headed for a victory.

    7. As usual,a very good incisive analysis. Can I reproduce this in my own blog?

      I did a passing critique too in my blog. This is my latest post:

      Which way will SHG Vote Swing?: Battle for Tamil Nadu

      In 1994, Chandra Babu Naidu of the Telegu Desam was quick to grasp the huge potential of SHGs to swing tightly fought electoral contests. He then supported the anti-arrack (local liquor) agitation that was spearheaded by Self-Help Group (SHG)s and rode the accompanying wave that catapulted him as Chief Minister of the state for two terms.

      At least 1/6th voters in Tamil Nadu are members of Self-Help Groups (SHGs). There are an estimated 500,000 to 800,000 SHGs within the state with each unit having a membership of 20. Collectively, they comprise half of Tamil Nadu's roughly 30-40 million of the state's total electorate. Little wonder that the two main alliances in the state are bending backwards to woo SHGs. The DMK alliance promised each member of a SHG Rs 5,000-10,000 as a grant. Its rival, the AIADMK promised Rs 1 million to each SHG, three fourth as loans at soft interest rates and the remainder as a subsidy.

      So how will SHGs vote in the current Tamil Nadu Assembly Elections? To answer this question, we bring you an exclusive survey, the first of its kind, to know what way this section will swing, conducted by Bhakther Solomon, DPG, Chennai.

      Read more: http://exitopinionpollsindia.blogspot.com/2011/04/which-way-will-shg-vote-swing-battle.html

    8. @Rajan:

      Thanks, Rajan, will read it...Please feel free to reproduce.
