When Annotators Agree but Labels Disagree: The Projection Problem in Stance Detection

arXiv:2603.24231v2

Abstract: Stance detection is nearly always formulated as classifying text into Favor, Against, or Neutral. This convention was inherited from debate analysis and has been applied, without modification, to social media since SemEval-2016. However, attitudes toward complex targets are not unitary. A person can accept climate science while opposing carbon taxes, expressing support on one dimension and opposition on another. When annotators must compress such multi-dimensional attitudes into a single label, different annotators may weight different dimensions, producing disagreement that reflects differing compression choices rather than genuine confusion. We call this the projection problem. We conduct an annotation study across five targets from three stance benchmarks (SemEval-2016, P-Stance, COVID-19-Stance), with the same three annotators labeling all targets. For each target, annotators assign both a standard stance label and per-dimension judgments along target-specific dimensions discovered through bottom-up analysis, using the same number of categories for both. Across all fifteen target–dimension pairs, dimensional agreement consistently exceeds label agreement. The gap appears to scale with target complexity: modest for a single-entity target like Joe Biden (label AC1 0.87 vs. dimensional AC1 0.95), but large for a multi-faceted policy target like school closures (0.21 vs. 0.71).
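For context on the statistic quoted above: AC1 presumably refers to Gwet's AC1, a chance-corrected inter-annotator agreement coefficient known to be more stable than Cohen's or Fleiss' kappa when label distributions are skewed. The sketch below is an illustrative reimplementation, not the authors' code; the function name `gwet_ac1` and the toy labels are hypothetical, and it assumes complete data (every annotator rates every item, matching the abstract's design in which the same three annotators label all targets).

```python
import numpy as np

def gwet_ac1(ratings, categories):
    """Chance-corrected agreement (Gwet's AC1) for R raters over N items.

    ratings: (N, R) array-like of category labels, one row per item,
             one column per rater (complete data assumed).
    categories: list of the Q possible labels.
    """
    ratings = np.asarray(ratings)
    n_items, n_raters = ratings.shape
    q = len(categories)
    # counts[i, k] = number of raters who assigned category k to item i
    counts = np.stack([(ratings == c).sum(axis=1) for c in categories], axis=1)
    # Observed agreement: fraction of agreeing rater pairs, averaged over items
    p_a = ((counts * (counts - 1)).sum(axis=1)
           / (n_raters * (n_raters - 1))).mean()
    # Gwet's chance-agreement term, from average category prevalences
    pi = counts.sum(axis=0) / (n_items * n_raters)
    p_e = (pi * (1 - pi)).sum() / (q - 1)
    return (p_a - p_e) / (1 - p_e)

# Hypothetical toy example: three annotators, Favor/Against/Neutral
labels = [
    ["Favor",   "Favor",   "Favor"],
    ["Against", "Against", "Neutral"],
    ["Neutral", "Neutral", "Neutral"],
    ["Favor",   "Against", "Against"],
]
print(gwet_ac1(labels, ["Favor", "Against", "Neutral"]))
```

Running this once on the standard stance labels for a target and once on each dimensional column, with the same number of categories in both schemes, would produce the kind of label-versus-dimension comparison the abstract reports.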
