Intro to Statistics and Evidence – Part I by Eli Hymson

The aim of this article is to equip debaters with some debate-applicable knowledge from the field of statistics. The list of subjects is nowhere near comprehensive but reflects a grab-bag of areas which have the potential to improve the quality of in-round evidence comparison and out-of-round research practices. Some concepts and themes are referenced in sections I and II, but explained in more detail in section III. This is the first part of a three-part series.

  1. Sampling

Before discussing statistical techniques, we should all be on the same page about how researchers construct the samples that make up the data sets they analyze. A sample is a subset of the target population of interest for which data is collected. Using information from the sample, researchers employ statistical techniques to infer characteristics of the entire population. The sample will not perfectly communicate all information about the population; however, it should be generally representative of the population, meaning the way the sample is selected does not systematically favor or exclude members with particular characteristics. The gap between what the sample tells us and what is true of the population, which arises simply because only part of the population was observed, is referred to as sampling error. Non-sampling error is a much more problematic issue and results from errors and biases which would persist even if the sample included the entire population, e.g. measurements are inaccurately reported or a survey question is heavily biased.
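To make the distinction concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the population is invented (weekly study hours drawn from a normal distribution), and the function name `simulate_errors` and the `reporting_bias` parameter are hypothetical. It shows that sampling error is typically small for a random sample of reasonable size, while a reporting bias stays the same size even when the entire population is measured.

```python
import random
import statistics


def simulate_errors(pop_size=100_000, sample_size=500, reporting_bias=2.0, seed=0):
    """Return (sampling_error, census_error) for an invented population."""
    rng = random.Random(seed)

    # Hypothetical population: true weekly study hours for each person.
    population = [rng.gauss(10.0, 3.0) for _ in range(pop_size)]
    true_mean = statistics.fmean(population)

    # Sampling error: a random sample's mean differs a little from the
    # population mean only because we observed part of the population.
    sample = rng.sample(population, sample_size)
    sampling_error = statistics.fmean(sample) - true_mean

    # Non-sampling error: assume every respondent over-reports by the same
    # amount. Measuring the ENTIRE population does not remove this error.
    reported = [hours + reporting_bias for hours in population]
    census_error = statistics.fmean(reported) - true_mean

    return sampling_error, census_error


sampling_error, census_error = simulate_errors()
print(f"sampling error (random sample of 500): {sampling_error:+.3f}")
print(f"non-sampling error (full census):      {census_error:+.3f}")
```

Increasing `sample_size` should tend to shrink the first number, but it does nothing to the second. That is why a large sample is not, by itself, a reason to trust a study.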

Most standard statistical inference techniques assume random samples. In a simple random sample, every member of the population has the same chance of being selected. There are many ways to obtain a random sample and many techniques which extend simple random sampling to more complex situations. For example, in stratified random sampling, the population is divided into non-overlapping groups (strata) based on some important trait, and then random samples are taken within each group to form an overall sample. This helps ensure each group is represented in the sample and allows researchers to produce more precise estimates of parameters within each group.
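As a rough illustration of the stratified approach, the sketch below draws a simple random sample of equal size within each group. The function name `stratified_sample`, the urban/rural population, and the choice of equal allocation are all assumptions made for this example; real studies often allocate in proportion to each group's size instead.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, per_stratum, seed=0):
    """Draw a simple random sample of up to `per_stratum` units per stratum."""
    rng = random.Random(seed)

    # Step 1: divide the population into non-overlapping groups (strata).
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    # Step 2: take a random sample, without replacement, inside each stratum.
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 1,000 people, only 10% of whom live in rural areas.
population = [(i, "rural" if i < 100 else "urban") for i in range(1000)]
sample = stratified_sample(population, stratum_of=lambda person: person[1], per_stratum=50)

counts = defaultdict(int)
for _, region in sample:
    counts[region] += 1
print(dict(counts))
```

A simple random sample of 100 people from this population would include only about 10 rural residents on average, which is too few to say much about that group. Stratifying guarantees each group enough observations.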

Examples of sampling techniques which may not yield representative samples include convenience samples and judgmental samples. You should be skeptical of studies using these techniques and press your opponent on the sampling technique during cross-examination if you suspect data was collected in this manner.

  1. In convenience samples, data analyzed comes from a sample easily accessible to the researcher. For example, a college student may want to conduct a research project on political attitudes of people aged 18-22. In this example, the population of interest is ALL people aged 18-22, but the sample only includes students on a college campus since the researcher cannot access a broader population due to resource limitations. As a result, the researcher may only infer characteristics of a smaller population than intended using the data collected.
  2. In judgmental samples, a researcher makes a subjective determination of which subjects represent the population of interest. For example, a researcher might be interested in measuring the impact of corruption on national economic growth. The researcher selects a sample of countries to investigate based on their own expert opinion of which countries are corrupt enough to consider. This sample over-emphasizes countries based on their appeal to the researcher and therefore is not representative of the population of all countries.
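The college campus example can be sketched as a small simulation. All of the numbers are invented for illustration: the assumption that 40% of people aged 18-22 are students, the attitude scores on a 0-100 scale, and the group averages of 60 and 45 are hypothetical, as is the function name `convenience_vs_random`.

```python
import random
import statistics


def convenience_vs_random(n=400, seed=0):
    """Compare a random sample with a campus-only convenience sample."""
    rng = random.Random(seed)

    # Hypothetical population of people aged 18-22. Students and
    # non-students are ASSUMED to differ in their average attitude score.
    population = []
    for _ in range(50_000):
        is_student = rng.random() < 0.4
        score = rng.gauss(60.0 if is_student else 45.0, 10.0)
        population.append((is_student, score))

    true_mean = statistics.fmean([score for _, score in population])

    # Random sample: every person aged 18-22 is equally likely to be chosen.
    random_sample = rng.sample(population, n)
    random_mean = statistics.fmean([score for _, score in random_sample])

    # Convenience sample: the researcher can only reach students on campus.
    students = [person for person in population if person[0]]
    campus_sample = rng.sample(students, n)
    campus_mean = statistics.fmean([score for _, score in campus_sample])

    return true_mean, random_mean, campus_mean


true_mean, random_mean, campus_mean = convenience_vs_random()
print(f"population mean:         {true_mean:.1f}")
print(f"random sample mean:      {random_mean:.1f}")
print(f"convenience sample mean: {campus_mean:.1f}")
```

Under these assumptions the campus sample describes students reasonably well but misses the population of all people aged 18-22. A larger campus sample would not close that gap, because the problem is who can be selected, not how many.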

There are many other ways to obtain inappropriate samples relative to the question a research article purports to answer. Sometimes these flaws will be obvious, but other situations may require more careful thought. For example, when data exhibits survivorship bias, applying statistical techniques to the available sample fails to answer the question. You may suspect survivorship bias if subjects must pass some barrier before they can appear in the observed data. For example, a study that measures racial discrimination in bank lending by comparing the interest rates borrowers receive does not account for potential disparities in how frequently loans are approved for each group. You can't receive an interest rate if your application is denied. The bank may only lend to applicants from the aggrieved group if they have far above-average credit scores, whereas it accepts average credit from the other group. The sample of loan recipients from the aggrieved group may therefore not be representative of the population of that group, and similar interest rates across groups would not show that lending is fair.
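The lending example can be sketched with a toy simulation. Every number here is made up for illustration: the two groups "A" and "B", the credit score distribution, the approval cutoffs of 650 and 750, and the formula tying interest rates to scores are all hypothetical assumptions, not estimates from any real bank.

```python
import random
import statistics


def lending_simulation(n_per_group=20_000, seed=0):
    """Simulate a bank that holds group B to a stricter approval cutoff."""
    rng = random.Random(seed)
    cutoffs = {"A": 650, "B": 750}  # hypothetical approval thresholds
    summary = {}

    for group in ("A", "B"):
        approved_rates = []
        for _ in range(n_per_group):
            # Both groups are ASSUMED to have the same score distribution.
            score = rng.gauss(680.0, 60.0)
            if score >= cutoffs[group]:
                # The interest rate depends only on the score, never on group.
                approved_rates.append(12.0 - 0.01 * (score - 600.0))
        summary[group] = {
            "approval_rate": len(approved_rates) / n_per_group,
            "mean_interest_rate": statistics.fmean(approved_rates),
        }
    return summary


for group, stats in lending_simulation().items():
    print(group, {name: round(value, 3) for name, value in stats.items()})
```

In this sketch a study that compares only the interest rates of approved borrowers would find that group B pays no more than group A, and would miss that group B is approved far less often. The denied applicants never reach the data set, which is the survivorship problem described above.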

Article written by Eli Hymson. Eli holds a Master of Science degree in Statistics from Texas A&M University. He competed in Lincoln-Douglas debate for Stoneman Douglas HS in Parkland, Florida, reaching the TOC his senior year.

Check back next week to learn more about analyzing the statistics in your research!

Jayanne Forrest