nlp

ferret: a Framework for Benchmarking Explainers on Transformers

Many interpretability tools allow practitioners and researchers to explain Natural Language Processing systems. However, each tool requires different configurations and provides explanations in different forms, hindering the possibility of assessing …

HATE-ITA: Hate Speech Detection in Italian Social Media Text

Online hate speech is a dangerous phenomenon that can (and should) be promptly and properly counteracted. While Natural Language Processing supplies appropriate algorithms for trying to reach this objective, all research efforts are directed toward the …

Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models

Hate speech detection models are typically evaluated on held-out test sets. However, this risks painting an incomplete and potentially misleading picture of model performance because of increasingly well-documented systematic gaps and biases in hate …

XLM-EMO: Multilingual Emotion Prediction in Social Media Text

Exposing the limits of Zero-shot Cross-lingual Hate Speech Detection

HONEST: Measuring Hurtful Sentence Completion in Language Models

LearningToAdapt with word embeddings: Domain adaptation of Named Entity Recognition systems