
The reliability and fairness of machine learning models hinge on the quality of annotated data. Bias in annotations, particularly for subjective tasks, has become a critical concern. It is especially prominent in sensitive domains such as hate speech recognition, where annotators from diverse backgrounds and perspectives may introduce bias into their labels.
Toxicity and Hate Speech

| | SBIC [1] | Kennedy [2] | Agree to Disagree [3] |
|---|---|---|---|
| # Annotators | 307 | 7,912 | 819 |
| # Annotations per annotator | 479±829.6 | 17.1±3.8 | 63.7±139 |
| # Unique texts | 45,318 | 39,565 | 10,440 |
| # Annotations per text | 3.2±1.2 | 2.3±1.0 | 5 |
| # Labels | 2 | 3 | 2 |
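
Statistics like those in the table above can be derived from raw annotation records. The following is a minimal sketch, assuming each record is a `(text_id, annotator_id, label)` triple. That record layout, the function name `annotation_stats`, and the use of the population standard deviation are illustrative assumptions, not details taken from the datasets.

```python
from collections import Counter
from statistics import mean, pstdev


def annotation_stats(records):
    """Summarize (text_id, annotator_id, label) triples (hypothetical layout)."""
    records = list(records)
    per_annotator = list(Counter(a for _, a, _ in records).values())
    per_text = list(Counter(t for t, _, _ in records).values())
    return {
        "annotators": len(per_annotator),
        # (mean, std); population std is an assumption, the source does not say
        "annotations_per_annotator": (mean(per_annotator), pstdev(per_annotator)),
        "unique_texts": len(per_text),
        "annotations_per_text": (mean(per_text), pstdev(per_text)),
        "labels": len({label for _, _, label in records}),
    }


# Toy data for illustration only
toy_records = [
    ("t1", "a1", 1), ("t1", "a2", 0), ("t1", "a3", 1),
    ("t2", "a1", 0), ("t2", "a2", 0),
]
stats = annotation_stats(toy_records)
```

A large standard deviation relative to the mean, as in the SBIC column, indicates that a few annotators contributed a disproportionate share of the labels.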

Model
Two models were trained for each dataset.

| | Majority Label Model | Multi Annotator Model |
|---|---|---|
| Model | RoBERTa-Base [6] | DisCo [7] |
| Epochs | 5 | 5 |
| Learning Rate | 5e-5 | 2e-3 |
| Batch Size | 32 | 200 |
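
The two setups differ mainly in their training targets. The sketch below illustrates the difference under stated assumptions: records are `(text_id, annotator_id, label)` triples, a majority-label model sees one aggregated label per text, and a multi-annotator model sees one label per (text, annotator) pair. The function names and the tie-breaking rule (smallest label wins) are hypothetical and are not taken from the project or from DisCo.

```python
from collections import Counter, defaultdict


def majority_targets(records):
    """One label per text: the most frequent annotation."""
    votes = defaultdict(Counter)
    for text_id, _annotator_id, label in records:
        votes[text_id][label] += 1
    # Assumption: ties are broken in favor of the smallest label value
    return {
        text_id: min(counts, key=lambda label: (-counts[label], label))
        for text_id, counts in votes.items()
    }


def per_annotator_targets(records):
    """One label per (text, annotator) pair: disagreement is preserved."""
    return {(text_id, annotator_id): label for text_id, annotator_id, label in records}


# Toy data for illustration only
toy_records = [
    ("t1", "a1", 1), ("t1", "a2", 0), ("t1", "a3", 1),
    ("t2", "a1", 0), ("t2", "a2", 0),
    ("t3", "a1", 1), ("t3", "a2", 0),
]
majority = majority_targets(toy_records)
individual = per_annotator_targets(toy_records)
```

In the toy data, text `t3` has a one-to-one split, so the majority target hides a disagreement that the per-annotator targets keep.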
Performance

| Dataset | F1 Score (majority) | F1 Score (multi-annotator) |
|---|---|---|
| Agree to Disagree | 0.78 | 0.78 |
| Kennedy | 0.68 | 0.75 |
| SBIC | 0.80 | 0.78 |
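
The page does not state which F1 variant is reported. As a point of reference, here is a minimal sketch of macro-averaged F1 computed from scratch. Macro averaging is an assumption made for this example, and the toy labels are illustrative only.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 of a single class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy labels for illustration only
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
score = macro_f1(y_true, y_pred)
```

Macro averaging weights every class equally, which matters for the three-label Kennedy dataset if its classes are imbalanced.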
AGREE TO DISAGREE

KENNEDY

SBIC

AGREE TO DISAGREE
![group count](plots_agree_to_disagree/group_count.png)
SBIC

KENNEDY

High Confidence in Majority model and Low Confidence in Multi-Annotator model
Mislabeled annotations:

| Text | Annotation |
|---|---|
| I thought you were going to win this in a landslide | Offensive |
| crossing my fingers for you from Germany. Even though I am not an American. | Offensive |
| boy F**K you | Not Offensive |
| Fragility at its finest | Offensive |
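
One way to surface examples like these is to compare per-example confidence across the two models. The sketch below assumes confidence is defined as in Dataset Cartography [5], that is, the mean probability assigned to the annotated label across training epochs. The thresholds `high=0.8` and `low=0.4`, the function names, and the toy probabilities are all hypothetical and are not taken from the project.

```python
from statistics import mean


def confidence(label_probs_per_epoch):
    """Mean probability given to the annotated label over the training epochs."""
    return mean(label_probs_per_epoch)


def high_majority_low_multi(majority_probs, multi_probs, high=0.8, low=0.4):
    """Return example ids where the majority model is confident and the
    multi-annotator model is not. Both arguments map an example id to a list
    of per-epoch probabilities; thresholds are illustrative assumptions."""
    selected = []
    for example_id, probs in majority_probs.items():
        if example_id not in multi_probs:
            continue
        if confidence(probs) >= high and confidence(multi_probs[example_id]) <= low:
            selected.append(example_id)
    return selected


# Toy per-epoch probabilities (5 epochs each), for illustration only
majority_probs = {"ex1": [0.9, 0.9, 0.95, 0.95, 0.9], "ex2": [0.6, 0.7, 0.7, 0.6, 0.7]}
multi_probs = {"ex1": [0.3, 0.2, 0.3, 0.2, 0.3], "ex2": [0.2, 0.2, 0.3, 0.3, 0.2]}
flagged = high_majority_low_multi(majority_probs, multi_probs)
```

Examples flagged this way are candidates for manual review, since the two models disagree on how well the annotation fits the text.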
Single Ground Truth Model
Multi-Annotator Model
[1] Social Bias Frames: Reasoning about Social and Power Implications of Language (Sap et al., ACL 2020)
[2] Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application (Kennedy et al., 2020)
[3] Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators’ Disagreement (Leonardelli et al., EMNLP 2021)
[4] SemEval-2018 Task 1: Affect in Tweets (Mohammad et al., SemEval 2018)
[5] Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics (Swayamdipta et al., EMNLP 2020)
[6] RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al., 2019)
[7] Disagreement Matters: Preserving Label Diversity by Jointly Modeling Item and Annotator Label Distributions with DisCo (Weerasooriya et al., Findings of ACL 2023)
[8] Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection (Sap et al., NAACL 2022)
[9] Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations (Mostafazadeh Davani et al., TACL 2022)

| Name | Degree |
|---|---|
| Abhishek Anand | MS in Computer Science |
| Anweasha Saha | MS in Computer Science |
| Prathyusha Naresh Kumar | MS in Computer Science |
| Negar Mokhberian | PhD in Computer Science |
| Ashwin Rao | PhD in Computer Science |
| Zihao He | PhD in Computer Science |