Intro to Statistics and Evidence – Part II by Eli Hymson

Check out our newest NSD Update article, part two of a three-part series: Intro to Statistics and Evidence by Eli Hymson. The aim of these articles is to equip debaters with some debate-applicable knowledge from the field of statistics. The list of subjects is nowhere near comprehensive, but it reflects a grab-bag of areas with the potential to improve the quality of in-round evidence comparison and out-of-round research practices.

Polls, Surveys, and Categorical Data

Researchers use statistical methods to infer characteristics of some target population using information obtained from a random sample of that population. Researchers often use polls to gauge public opinion on a variety of issues, and the assumption is that the sample surveyed is representative of the population of interest. You may come across unfamiliar statistical terminology when reading research papers based on polling and survey results. The aim of this section is to define and demystify some of these concepts.

  1. Margin of error

You should always read, report, and understand the margin of error when working with polling data. Ideally, when gauging public opinion on, say, a preferred presidential candidate, we would survey every voter in advance and know precisely how the final vote count will look. Obviously, we cannot do this. A poll’s margin of error helps us determine how close the sample results are likely to be to the true population results without needing to poll the entire population. The underlying theory for how this works depends on something called the sampling distribution of the parameter estimate. I encourage you to read more on this subject, but I will not explain the mathematical details here. Confidence intervals are a related concept and are discussed below, though I will not address their derivation quite yet.

If a poll says 55% of respondents favor Candidate A and researchers compute a confidence interval of (49, 61), the margin of error is +/- 6, or one half of the width of the interval. At the given confidence level, the observed data from our poll provides evidence that the true percentage of the population favoring Candidate A could be as high as 61% or as low as 49%. Because the interval includes 50%, our data cannot rule out the possibility that the public is perfectly split between the two candidates. If we had a narrower interval around our 55% result, such as (54, 56), then our margin of error would be +/- 1, and we would have a greater degree of comfort that the population truly favored one candidate over the other based on our single poll. The margin of error is a function of both the variability of the data and the sample size: it increases with the variability of the data and decreases with the sample size. One way to decrease the margin of error and obtain more precise estimates is to include a larger number of subjects in the sample. The more people you poll, the closer your sample should be to the population, provided you have truly conducted an unbiased random sample, and the narrower the range of plausible values for the population parameter suggested by your observed data.
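If you want to see where numbers like these come from, here is a minimal Python sketch using the standard normal-approximation formula for a proportion’s margin of error. The sample sizes are ones I picked to roughly reproduce the +/- 6 and +/- 1 intervals above, not figures from any actual poll; real surveys may adjust for design effects, so treat this as a back-of-the-envelope illustration.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Normal approximation: z * sqrt(p_hat * (1 - p_hat) / n).
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.55  # 55% of respondents favor Candidate A

for n in (265, 1000, 9500):  # illustrative sample sizes
    moe = 100 * margin_of_error(p_hat, n)
    print(f"n={n:>5}: margin of error ~ +/- {moe:.1f} points, "
          f"95% CI roughly ({100 * p_hat - moe:.0f}, {100 * p_hat + moe:.0f})")
```

Running this prints intervals of roughly (49, 61), (52, 58), and (54, 56): the same 55% headline number, but with very different levels of precision depending on the sample size.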

Some debate-specific takeaways:

  1. If your opponent reads evidence from a poll and only reports the raw percentages, demand to see the margin of error and confidence interval for the result. They might have a poll saying 55% of people polled agree with their position while conveniently leaving out the part of the card where the (49, 61) confidence interval appears.
  2. If your opponent cites a poll conducted with a very small sample, press them further to explain how the author established that the observed difference exceeds the margin of error. As stated before, the margin of error depends on the sample size, and for highly variable data, a much larger sample is needed to draw reliable conclusions. A poll of 20 people may yield an impressive percentage, but the confidence interval for that result can be disturbingly wide (see the sketch after this list).
  3. Question the representativeness of the sample used in the poll. If your opponent claims their results reflect the entire US population, but the researchers only polled people aged 65+, then the raw percentage result, confidence interval, margin of error, etc. will not be correct for the population of interest. Poll results which seem egregious relative to your expectations of how the population probably views an issue may result from inappropriately oversampling a specific subset of the population.
  4. A small margin of error does not rule out that the question itself was biased and yielded flawed information, and a large margin of error does not mean a poll’s results are fundamentally incorrect.
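To put a number on takeaway 2, here is a quick sketch with a hypothetical 20-person poll. The 14-of-20 split is invented purely for illustration, and the simple normal-approximation interval used here is itself shaky at samples this small, which only reinforces the point.

```python
import math

# Hypothetical small poll: 14 of 20 respondents agree (illustrative numbers).
agree, n = 14, 20
p_hat = agree / n  # 0.70

# Normal-approximation margin of error at the 95% level.
moe = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"Headline result: {100 * p_hat:.0f}% agree")
print(f"95% CI roughly ({100 * (p_hat - moe):.0f}%, {100 * (p_hat + moe):.0f}%)")
# -> roughly (50%, 90%): an impressive headline, but the data cannot rule out an even split.
```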
  2. Tabular data, odds ratio, and relative risk

Oftentimes, survey data is more complex than a simple Yes/No poll. Many surveys include a wide array of questions and try to uncover relationships between responses to one question and responses to another related question. For example, a survey may ask respondents to rate their overall life satisfaction on a scale from 1-5 and also ask respondents for their highest level of education completed. Without disclosing the purpose of these questions, the researchers might be collecting data to study whether educational attainment is correlated with life satisfaction. Data of this form is often presented in a table format like the following example, taken from Alan Agresti’s book Categorical Data Analysis. In this example, researchers collected data on a murder defendant’s race and whether they received the death penalty.

Defendant’s Race    Death Penalty: Yes    Death Penalty: No
White                       53                   414
Black                       11                    37

There are many statistical techniques a researcher might perform with data of this type, but I will focus on a few key concepts which are often cited in the primary results of such studies.

Firstly, the odds ratio. The odds of an event can be expressed as the probability of success (p) divided by the probability of failure (1 – p): O = p/(1 – p). If a debater has a 75% probability of winning a round, then the odds of that debater winning are .75/.25, or 3:1. However, researchers are often interested not only in the odds of an event occurring, but in how the presence of some other factor influences those odds. To measure this, they use a statistic called the odds ratio.

The odds ratio equals the odds of an event occurring in the presence of some other factor divided by the odds of an event occurring in the absence of that factor. The odds ratio measures how strongly the event and the other factor are related, but it does not establish a causal relationship in one direction or the other. In the death penalty example, researchers were interested in whether a defendant’s race affects the odds of receiving the death penalty. The probability of a white defendant receiving the death penalty = P1 = 53/(53+414) = .11. Thus, the odds of this outcome are P1/(1 – P1) = .11/.89 = .13. The probability of a black defendant receiving the death penalty = P2 = 11/(11+37) = .23. Thus, the odds of this outcome are P2/(1 – P2) = .23/.77 = .30. The odds ratio comparing the two races can be written with either odds in the numerator; the choice only determines which group serves as the reference. Since researchers may suspect black defendants receive the death penalty more often than white defendants, let’s place the second odds in the numerator. The odds ratio is .30/.13, which is about 2.3. This indicates the odds of a black defendant receiving the death penalty are about 2.3 times the odds of a white defendant receiving the death penalty.
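If you want to reproduce this arithmetic yourself, here is a minimal Python sketch using the counts from the table above. The odds() helper is just the p/(1 – p) definition from the previous paragraph, and the cross-product shortcut in the comment is a standard identity for 2x2 tables.

```python
# Counts from the death-penalty table above (rows: defendant's race, columns: death penalty yes/no).
white_yes, white_no = 53, 414
black_yes, black_no = 11, 37

def odds(p):
    """Odds from a probability: p / (1 - p). A 75% win probability gives odds of 3 (i.e., 3:1)."""
    return p / (1 - p)

p_white = white_yes / (white_yes + white_no)   # ~0.11
p_black = black_yes / (black_yes + black_no)   # ~0.23

odds_ratio = odds(p_black) / odds(p_white)     # ~2.3
# Equivalent cross-product shortcut for a 2x2 table: (11 * 414) / (53 * 37) ~ 2.3

print(f"P(white) = {p_white:.2f}, odds = {odds(p_white):.2f}")
print(f"P(black) = {p_black:.2f}, odds = {odds(p_black):.2f}")
print(f"Odds ratio (black vs. white) = {odds_ratio:.1f}")
```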

If the odds ratio is not equal to 1, the conditioning factor (here, race) makes a difference in the odds of the event occurring. An OR > 1 indicates that the event becomes more likely when the factor is present, whereas an OR < 1 indicates the event becomes less likely given that factor. When the OR = 1, the factor has no impact on the odds of the event occurring. Researchers should also present a confidence interval for the OR in addition to the specific estimate from the sample data. Similar to the polling example, if a study computes an odds ratio of 1.12 with a 95% confidence interval of (.99, 1.25), then the data do not conclusively show that the true odds ratio for the population differs from 1, even though the study's point estimate was 1.12.
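How such an interval is computed is beyond what the article needs, but one common textbook approach is a normal-approximation interval on the log odds ratio (often attributed to Woolf). The sketch below applies it to the death-penalty counts purely for illustration; published studies may use other, more refined methods.

```python
import math

# 2x2 counts: a, b = black defendants (yes, no); c, d = white defendants (yes, no)
a, b = 11, 37
c, d = 53, 414

or_hat = (a * d) / (b * c)                    # point estimate, ~2.3
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # standard error of log(OR)

low = math.exp(math.log(or_hat) - 1.96 * se_log_or)
high = math.exp(math.log(or_hat) + 1.96 * se_log_or)
print(f"OR = {or_hat:.1f}, 95% CI roughly ({low:.2f}, {high:.2f})")
# The interval stays above 1 here; if it covered 1 (as in the 1.12 example above),
# the data would not clearly show that the factor matters.
```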

A related statistic is called relative risk. Relative risk is a ratio of two probabilities rather than two odds. In our death penalty example, the probability for a white defendant was .11 and the probability for a black defendant was .23, so the relative risk of a black defendant receiving the death penalty, compared to a white defendant, is roughly 2 (2.09 using the rounded probabilities; about 2.02 from the raw counts). Again, relative risk figures should be reported with confidence intervals.
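The same counts give the relative risk directly; the tiny sketch below reproduces the calculation and shows why the raw counts come out slightly below the rounded 2.09 figure.

```python
# Relative risk: a ratio of the two probabilities rather than the two odds.
p_white = 53 / (53 + 414)   # ~0.11
p_black = 11 / (11 + 37)    # ~0.23

relative_risk = p_black / p_white
print(f"Relative risk (black vs. white defendants) = {relative_risk:.2f}")  # ~2.02
```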

Most quantitative studies cited in debate rounds will be peer-reviewed academic work and will include all of the required elements I’ve discussed. While you probably won’t catch your opponent missing these things in their evidence, it is helpful to understand what these terms mean when reading articles for your own research. A knowledge of these terms may help you filter out low-quality publications and help you target which sections of the study’s methodology you bring to tournaments in case someone asks.

Eli holds a Master of Science degree in Statistics from Texas A&M University. He competed in Lincoln-Douglas debate for Stoneman Douglas HS in Parkland, Florida, reaching the TOC his senior year.
