A Statistical Analysis of Side-Bias on the 2018 November-December Lincoln-Douglas Debate Topic

By Sachin Shah

Due to the strong response to the September-October side-bias study, a follow-up analysis of the 2018 November-December Lincoln-Douglas topic is merited to determine whether the pattern of negative side bias holds. While the technical concerns raised about prior studies were addressed in the September-October side-bias article, a few potential concerns related to judge variability remain. This study avoids the known pitfalls of statistical studies outlined in that article. Although this topic has just started, most tournaments on it occur in November, presumably because semester exams and winter break deter tournament hosting in December.

Affirmative and negative ballots were gathered via tabroom.com and speechwire.com from six Tournament of Champions bid-distributing tournaments across the country: The 33rd Annual MinneApple Debate Tournament, Florida Blue Key Speech and Debate Tournament 2018, Damus Hollywood Invitational, Badgerland, The Tradition Cypress Bay, and The 15th Scarsdale Invitational. These tournaments range from octofinal to final bid level qualifiers. Semifinal and final bid level tournaments were included to provide sufficient data at this stage (early November) in the topic. The data set has a large sample size of 1701 ballots and represents a fairly diverse range of debating styles: the tournaments span the country from the west coast, where utilitarian rounds are more predominant, to the east coast, where philosophy is more popular. A 50-50 distribution of affirmative and negative ballots would be expected because topics should not inherently favor one side over the other.

However, when all posted ballots are analyzed, the negative won 52.15% of ballots. The question is whether the difference between 52.15% and the expected 50% is statistically significant or simply due to chance. To calculate a p-value and answer that question, a one-proportion z-test was used. The null hypothesis was set to p = 0.5, since it is expected, barring any bias, that the affirmative would win as often as the negative. The alternative hypothesis was p > 0.5, where p is the proportion of negative ballot wins. At the 0.01 significance level, the one-proportion z-test did not reject the null hypothesis in favor of the alternative hypothesis (p-value < 0.04). In other words, if there were no side bias, a negative win rate at least this high would arise by chance less than 4% of the time, which is suggestive but falls short of the 1% threshold.
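The test described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual computation: the function name `one_proportion_z_test` is hypothetical, and the count of 887 negative wins is an assumption reconstructed by rounding 52.15% of 1701 ballots.

```python
from math import erfc, sqrt


def one_proportion_z_test(wins, n, p0=0.5):
    """One-sided, one-proportion z-test of H0: p = p0 against Ha: p > p0.

    Returns the z statistic and the upper-tail p-value.
    """
    p_hat = wins / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Upper-tail probability of a standard normal, written with erfc.
    p_value = 0.5 * erfc(z / sqrt(2))
    return z, p_value


# Assumption: 887 negative wins, reconstructed from 52.15% of 1701 ballots.
z_all, p_all = one_proportion_z_test(wins=887, n=1701)
print(f"z = {z_all:.3f}, one-sided p-value = {p_all:.4f}")
```

Under these assumed counts the p-value falls between 0.03 and 0.04, which matches the reported "p-value < 0.04" and explains why the result is not significant at an alpha of 0.01.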

Although this data does not show a statistically significant bias at that threshold, there are potentially important lurking variables. The data set includes the quality disparity between debaters that occurs during preliminary rounds, where uneven pairings mean some debaters have more years of experience than their opponents. In addition, some judges weight the 2AR more heavily than the 2NR, and so favor the affirmative, which gives the last speech, instead of evaluating the round purely “by the flow.” Both factors could confound the result. The judging concern may be more prevalent at smaller tournaments, which could have more judging variability than larger ones that attract more ‘flow’ judges.

Removing preliminary rounds from the analysis could ameliorate the quality disparity described above. For the 380 elimination-round ballots across all six tournaments in the data set, the one-proportion z-test rejected the null hypothesis in favor of the alternative hypothesis. The negative won a statistically significant 56.32% of ballots (p-value < 0.01). That is, if there were no side bias, a negative win rate this high would occur by chance less than 1% of the time. The strength of this evidence suggests that, even if confounding variables remain, negating is still easier. The cause might not be the topic, but rather a structural concern.

Judge variability still exists in tournament elimination panels, which makes it hard to determine the specific cause of side bias. One method might be to use only the quarterfinal and octofinal bid level tournaments on this topic (The 33rd Annual MinneApple Debate Tournament and Florida Blue Key Speech and Debate Tournament 2018), as they tend to attract more ‘flow’ judges. The negative won a statistically significant 59.91% of those 227 elimination ballots (p-value < 0.01). Although this excludes smaller tournaments from the sample, it may provide a better sense of the bias on this topic rather than in debate as a whole.
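The two elimination-round results can be checked against the 0.01 significance level with a short Python sketch. The win counts (214 of 380 and 136 of 227) are assumptions reconstructed by rounding the reported percentages, and the helper name `negative_bias_p_value` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.01  # significance level stated in the article


def negative_bias_p_value(neg_wins, ballots):
    """Upper-tail p-value of a one-proportion z-test against p = 0.5."""
    z = (neg_wins - 0.5 * ballots) / sqrt(ballots * 0.25)
    return 1.0 - NormalDist().cdf(z)


# Assumption: counts reconstructed from 56.32% of 380 and 59.91% of 227.
subsets = {
    "all elimination ballots": (214, 380),
    "octofinal and quarterfinal bid tournaments": (136, 227),
}

results = {
    name: negative_bias_p_value(wins, n) for name, (wins, n) in subsets.items()
}

for name, p_value in results.items():
    print(f"{name}: p = {p_value:.4f}, reject H0 at {ALPHA}: {p_value < ALPHA}")
```

With these assumed counts, both subsets clear the 0.01 threshold, consistent with the p-values reported for the elimination rounds.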

These graphs demonstrate the extent of the negative side bias on this topic. The first graph illustrates that negative side bias is pervasive across all rounds at each tournament except for the Tradition Cypress Bay, ranging from just under -6% to over 3% deviation from an unbiased distribution. That tournament is likely an outlier and contributes to the lower overall bias, as it actually showed an affirmative skew. If the outlier’s data is removed from the sample, the negative won a statistically significant 52.97% of the 1548 total ballots (p-value < 0.01) and 59.76% of the 338 elimination ballots (p-value < 0.001). This statistic is likely a better representation of the skew on this topic. The second graph illustrates that the problem is exacerbated in elimination rounds; it shows the heightened negative side bias in elimination rounds across tournaments, ranging from just under -4% to 10%. However, it is difficult to extrapolate from these per-tournament values, as each tournament's sample size is small.
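To illustrate why small per-tournament samples are hard to extrapolate from, the sketch below computes an approximate 95% margin of error for a win rate near 50% at several sample sizes. The helper `margin_of_error` is hypothetical, and the sizes 153 and 42 are derived assumptions: the total and elimination ballots that drop out of the sample when the outlier is removed (1701 - 1548 and 380 - 338).

```python
from math import sqrt


def margin_of_error(n, z_star=1.96):
    """Approximate 95% margin of error for a proportion near 0.5 with n ballots."""
    return z_star * sqrt(0.25 / n)


# 42 and 153 are derived from the article's totals (see lead-in);
# 338 and 1548 are the ballot counts reported without the outlier.
sample_sizes = (42, 153, 338, 1548)
margins = {n: margin_of_error(n) for n in sample_sizes}

for n, moe in margins.items():
    print(f"n = {n:>4}: margin of error is about +/- {moe:.1%}")
```

Under this approximation, a single tournament's elimination bracket carries a margin of error several times wider than the pooled data set, so a swing of a few percentage points at one tournament is well within chance.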

This analysis is statistically rigorous and relevant in several respects: (A) the elimination-round p-values are less than the alpha (0.01); (B) it covers the current November-December topic, meaning it is relevant to rounds these months [1]; (C) it includes a diversity of debating styles; and (D) it accounts for the debating level of participants by isolating elimination rounds. The combination of these points validates this analysis.

As a closing note, some debaters are not debating the current topic in some rounds, so an analysis of all of this year’s Tournament of Champions bid-distributing tournaments with posted round results could be more applicable. Across those tournaments, the negative won 53.62% of 6475 ballots (p-value < 0.01) and 56.69% of the 1490 elimination ballots (p-value < 0.001). This suggests the bias might be structural, not just topic specific.

Therefore, this analysis confirms that affirming is once again harder on the 2018 November-December topic, just to a lesser extent than on the last topic [2]. So don’t lose the flip!

*Sachin Shah is a senior at Lake Highland Preparatory School in Orlando, FL, who is currently qualified to the 2019 Tournament of Champions. Outside of debate, he participates in robotics and lab research. He often enjoys solving Rubik’s cubes and programming challenges in his spare time.*

[1] It is important to note that numbers presented in this article should only be used within the context of the 2018 November-December topic; debaters who attempt to extrapolate this data to future topics would be misrepresenting the intent of this article.

[2] The data analysis presented in this article has been reviewed and authenticated by two AP Statistics instructors.