10 Statistical Problems with Philippine Pre-Election Surveys
There has been confusion in the Philippines over contrasting and significantly deviating pre-election surveys for the upcoming May 2016 presidential election. This has bothered me personally, since I worry that people will never take research as a serious and reliable tool for capturing a researchable phenomenon. Many Filipinos, myself included, doubt the reliability and accuracy of these pre-election surveys because they seem detached from what is really happening. This has led me to pinpoint ten statistical problems with pre-election surveys in the Philippines.
This article is an opinion. The assertions are based on facts and statistical principles I have learned, but the interpretation of these principles may vary from one person to another. What I have made is a list of the reasons I think pre-election surveys in the Philippines are unreliable. One could argue against my assertions, but I have seen to it that the reasons were not influenced by my personal choice in the election. This is supposed to guide the electorate on the function of pre-election surveys. Furthermore, there is no expectation that this article will lead to a concrete measure, legislation, change of regulation or recommendation. This is simply a discussion and elaboration of existing data, interpreted through theories, principles and personal experience.
1. The false presumption of a determined voter turn-out.
The surveys are limited by a number of assumptions. For example, Pulse Asia uses a formula to determine the distribution of the Philippine electorate. Under this assumption, more individuals will be voting from the rural areas than from the urban metropolis. This of course has a basis, perhaps in statistical data from previous surveys and censuses. However, the problem is the assumption itself. There will always be the possibility either that it was convenient for those gathering the data to conduct interviews in the urban areas, making the conclusion biased, or that it is impossible to infer that the assumed percentage distribution will reflect the real voter turn-out during the election.
2. Sample size may be statistically enough, but not reliable.
The Pulse Asia March 2016 survey uses a sample size of 2,600. This is the highest so far, making it the least biased if judged by sample size alone. However, this means that a single survey respondent represents approximately 13,000 voters. That is equivalent to the population of a whole large barangay. The question of whether that is enough can easily be answered: simply not. Worse, SWS uses 1,200 to 1,800 respondents divided equally among geographical areas. SWS gathered data from 300 respondents each from Metro Manila, Luzon, Visayas and Mindanao. This methodology is problematic. The equal distribution alone gives an idea of the magnitude of the survey bias. The researchers assumed that the turn-out and the voter population of each geographical area are the same. That is never true, and it threatens a biased analysis and eventually an invalid conclusion. In addition, just imagine a 5% lead in a survey. This translates to approximately 100 survey respondents in a sample of 2,000. Whether the right survey respondents were selected will therefore always influence the entire survey conclusion.
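The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration, not part of any survey firm's methodology. The function names are hypothetical, and the electorate size is simply the one implied by the figures above (2,600 respondents, each standing for roughly 13,000 voters).

```python
def respondents_behind_lead(sample_size: int, lead_pct: float) -> float:
    """Number of respondents that a lead of `lead_pct` percent corresponds to."""
    return sample_size * lead_pct / 100.0


def voters_per_respondent(electorate: int, sample_size: int) -> float:
    """Number of voters that each respondent stands for."""
    return electorate / sample_size


# Assumption: electorate size implied by the article's own figures,
# not an official count of registered voters.
IMPLIED_ELECTORATE = 2_600 * 13_000

if __name__ == "__main__":
    # A 5% lead expressed as a head count, for several sample sizes.
    for n in (1_200, 2_000, 2_600):
        print(n, respondents_behind_lead(n, 5.0))
    print(voters_per_respondent(IMPLIED_ELECTORATE, 2_600))
```

A 5% lead therefore rests on 60 respondents in a sample of 1,200, on 100 in a sample of 2,000, and on 130 in a sample of 2,600.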
3. Distribution of the sample size according to geographical location, socio-economic status and other factors is questionable.
Pulse Asia on its website breaks the survey data into categories to aid interpretation. This is a good measure, but statistically insufficient to ensure that bias is controlled. By merely adding up the data provided, one can easily see that the survey respondents were not distributed proportionally according to socio-economic factors. Looking at the table, there were more survey respondents from Class ABC, when in fact Class D constitutes the bulk of the Philippine population. This is true not just of the socio-economic classes, but also of other demographic factors.
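To see why a disproportionate sample matters, consider the following sketch, which compares a raw (unweighted) estimate with one re-weighted to population shares. All numbers here are hypothetical and chosen only for illustration. They are not taken from Pulse Asia or SWS data, and the function names are my own.

```python
def unweighted_share(support: dict, sample_counts: dict) -> float:
    """Candidate support when every respondent counts equally."""
    total = sum(sample_counts.values())
    return sum(support[g] * sample_counts[g] for g in sample_counts) / total


def weighted_share(support: dict, population_share: dict) -> float:
    """Candidate support when each group counts by its population share."""
    total = sum(population_share.values())
    return sum(support[g] * population_share[g] for g in population_share) / total


# Hypothetical inputs: support for one candidate within each class,
# number of respondents per class, and each class's population share.
support = {"ABC": 0.30, "D": 0.40, "E": 0.50}
sample_counts = {"ABC": 400, "D": 400, "E": 200}
population_share = {"ABC": 0.10, "D": 0.60, "E": 0.30}

if __name__ == "__main__":
    print(round(unweighted_share(support, sample_counts), 2))
    print(round(weighted_share(support, population_share), 2))
```

With these made-up inputs the raw estimate is 38% while the re-weighted estimate is 42%. That gap of four points comes purely from who was sampled, not from any change in opinion.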
4. Problem of where to get the right survey respondents.
This is perhaps the biggest challenge for survey firms. In which barangay, which purok, and which area within the purok should they conduct the survey? A female survey respondent with the same demographic variables may give a totally different answer just because of where she lives within the community. The sample size distribution becomes more problematic when data about the population are insufficient. The Philippines is of course improving in census-taking and the conduct of statistics. However, statistics remain questionable without a national ID system, which would provide accurate data about the distribution of the population according to various variables. Furthermore, the validity of each piece of data gathered is also not ensured, as Filipinos would even hesitate to tell the truth about how much they earn, or to give personal information perceived to be a direct threat to privacy and confidentiality.
5. Multi-stage sampling is not appropriate for the population of Philippine electorate.
There is simply too much difference among geographical areas, religions, cultures and languages in the Philippines. Multi-stage sampling always carries the assumption that all units within the sampling list are homogeneous and roughly equal on a given variable. When the researcher has randomly selected City X, how can the researcher be assured that, on a specific variable, a certain group of people within City X is homogeneous? No matter how many times the regrouping is made, the risk of non-homogeneity will always be there. Furthermore, it is also accepted that multi-stage sampling is not as accurate as simple random sampling. However, simple random sampling becomes impossible in a country without a national ID system and lacking the data needed to make good assumptions in statistical surveys.
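One common way to quantify the loss of accuracy from clustered sampling is the design effect, often approximated as deff = 1 + (m - 1) * rho, where m is the number of respondents per cluster and rho is the intra-cluster correlation. The sketch below assumes equal cluster sizes. The values of m and rho are hypothetical and are not taken from any Philippine survey.

```python
def design_effect(cluster_size: float, intra_cluster_corr: float) -> float:
    """Approximate design effect for equal-sized clusters."""
    return 1.0 + (cluster_size - 1.0) * intra_cluster_corr


def effective_sample_size(n: int, cluster_size: float, intra_cluster_corr: float) -> float:
    """Size of a simple random sample with the same precision."""
    return n / design_effect(cluster_size, intra_cluster_corr)


if __name__ == "__main__":
    # Hypothetical: 2,600 respondents, 5 per cluster, rho = 0.05.
    print(design_effect(5, 0.05))
    print(effective_sample_size(2_600, 5, 0.05))
```

Under these assumed values, 2,600 clustered interviews carry roughly the precision of 2,167 simple random ones. The more alike the neighbors within a cluster are, the larger the loss.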
6. Significant error margins per geographic areas.
Despite error margins of less than 2% (for Pulse Asia) at the national level, the error margins when divided by geographic area (Metro Manila, Luzon, Visayas and Mindanao) are significantly higher. In fact, in the March 2016 Pulse Asia survey, Metro Manila registered 5.7% and Visayas 4.2%. The reliability of the survey becomes questionable as the survey error is magnified when these area-level figures are combined into conclusions at the national level. Moreover, when other biases are taken into consideration, the survey bias balloons and becomes more significant, making it difficult to come up with even a respectably valid conclusion. Lastly, in a tight presidential election, 20 survey respondents may mean an entire one percent of the survey data. With a high possibility of error in both data gathering and respondent selection, each percentage point in pre-election surveys becomes questionable and unreliable.
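The relationship between sample size and error margin can be sketched with the textbook formula for a proportion, z * sqrt(p * (1 - p) / n). The sketch assumes simple random sampling, a 95% confidence level (z = 1.96) and the worst case p = 0.5. Whether the survey firms computed their published margins exactly this way is my assumption, and the function names are hypothetical.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1.0 - p) / n)


def implied_sample_size(moe: float, p: float = 0.5, z: float = 1.96) -> float:
    """Sample size that would produce the given margin of error."""
    return (z / moe) ** 2 * p * (1.0 - p)


if __name__ == "__main__":
    print(round(100 * margin_of_error(2_600), 2))  # whole sample
    print(round(100 * margin_of_error(300), 2))    # one area of 300 respondents
    print(round(implied_sample_size(0.042)))       # n implied by a 4.2% margin
```

Under these assumptions, 2,600 respondents give a margin just under 2%, while an area with only 300 respondents gives about 5.7%, and a 4.2% margin corresponds to roughly 544 respondents. Cutting the sample into areas quickly widens the margin of each area.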
7. The methodology of conducting face-to-face survey is probably unreliable.
Pulse Asia has introduced a secret balloting mechanism, which is a very good measure to decrease the so-called experimenter’s bias effect. However, this does not mean that the methodology eliminates the bias. Every action of the interviewer, and the content and context of the information the interviewer provides, significantly affects the survey responses. The time allotted for each face-to-face interview is undocumented, making it difficult to ensure that the survey was not done in haste. In addition, a lengthy interview may be perceived differently by the respondent, increasing the chance of merely choosing at random rather than deciding at will. Moreover, the survey methodology cannot guarantee that the respondents were not influenced by a family member or neighbor who was in close proximity while the survey was conducted.
8. The time of the survey contributes to research bias.
Filipinos easily forget. That is how difficult it is to capture a phenomenon through research in the Philippines. Time will always be a factor. The news and events in Philippine politics, delivered by media entities with varying political and organizational intentions and visions, will always have a significant influence on survey results. A candidate could be very popular one day and not the next. The timing of the surveys will always be in question in a country where controls on media are almost nonexistent and not well implemented. In fact, political advertisements are supposed to be controlled by law, but in practice they are not. Those who have more resources could literally gain control of the thoughts of the Philippine electorate. Hence, the genuine opinion of a respondent is very difficult to determine.
9. Attitude of survey respondents towards research could never be determined.
The Filipino electorate is easily intimidated by people conducting research. This is because surveys are not conducted frequently in their locality. Based on experience, the head of the family usually makes the response for each survey. This threatens the accuracy of the data, since the answers are taken from the perspective of the person who provides resources for the family. This does not take into account the submissiveness of other members, or their possible deviation of opinion from the head of the family on the issue at hand. Furthermore, in Filipino society, it is not uncommon that even when the wife is interviewed, the husband or the head of the family is consulted, thereby influencing the ability of the respondent to choose. This is more common in the lower socio-economic levels, where there is more economic dependency among family members.
10. The human factor of manipulation will never be eliminated.
This is of course a big problem. The survey firms conduct surveys of the same population, but come up with contrasting and varying conclusions. That makes one think that one or more of these surveys are unreliable and inaccurate. The problem is to pinpoint which of these surveys are inaccurate, or have a very high probability of inaccuracy. In layman’s terms, if several people are looking at the same apple, all of them should see the same apple, with the same color and appearance, and with minimal deviation among the observers’ descriptions. Moreover, this leaves open the question of where human error had its greatest effect on the survey: in data collection, tabulation, analysis or, worse, the methodology itself.
Lastly, I personally acknowledge that previous pre-election surveys have successfully inferred the winning presidential candidates, making certain survey firms credible enough to conduct these surveys. This article does not intend to attack these survey firms. It serves as a critique of the methodology these survey firms are using. The success of inference in previous elections does not automatically guarantee the reliability of succeeding pre-election surveys. May 2016 is perhaps the most highly contested presidential election in Philippine history, with four presidential candidates and three vice-presidential aspirants still having a statistical probability of winning. Previous elections were not as competitive as the 2016 election. The clear dominance of a single leading presidential aspirant made it relatively easy to come up with a survey conclusion similar to the election results, in spite of significant survey errors and bias. However, this is not true for the upcoming election. It is therefore a must that the reliability and accuracy of these surveys be scrutinized and ensured.