The 8th Workshop on Online Abuse and Harms (WOAH), held on June 20 at NAACL 2024.
09:00 - 09:15: Opening Remarks
09:15 - 09:45: Invited Talk 1: Alicia Parrish (Google)
09:45 - 10:15: Invited Talk 2: Lama Ahmad (OpenAI)
10:15 - 10:30: Mini Break
10:30 - 11:00: Invited Talk 3: Apostol Vassilev (NIST)
11:00 - 12:30: Poster Session
HausaHate: An Expert Annotated Corpus For Hausa Hate Speech Detection
Does Prompt Engineering Matter for LLM-based Toxicity and Rumor Stance Detection? Evidence from a Large-scale Experiment
Comparing LLM ratings of conversational safety with human annotators
The Mexican Gayze: A Computational Analysis of the Attitudes towards the LGBT+ Population in Mexico on Social Media Across a Decade
Improving Aggressiveness Detection using a Data Augmentation Technique based on a Diffusion Language Model
Robust Safety Classifier Against Jailbreaking Attacks: Adversarial Prompt Shield
AGORA: a Language Model for Safe Speech-to-Text Conversion
SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore
Investigating radicalisation indicators in online extremist communities
Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models
VIDA: The Visual Incel Data Archive. A Theory-oriented Annotated Dataset To Enhance Hate Detection Through Visual Culture
Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales
EkoHate: Abusive Language and Hate Speech Detection for Code-switched Political Discussions on Nigerian Twitter
Web Retrieval Agents for Evidence-Based Misinformation Detection
AustroTox: A Dataset for Target-Based Austrian German and English Offensive Language Detection
A Strategy Labelled Dataset of Counterspeech
Toxicity Classification in Ukrainian
Subjective Isms? On the Danger of Conflating Hate and Offence in Abusive Language Detection
Improving Covert Toxicity Detection by Retrieving and Generating References
[NAACL 2024] Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis
[NAACL 2024] MisgenderMender: A Community-Informed Approach to Interventions for Misgendering
[NAACL 2024] An Interactive Framework for Profiling News Media Sources
[Findings] Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies
[Findings] HateModerate: Testing Hate Speech Detectors against Content Moderation Policies
12:30 - 13:45: Lunch
13:45 - 14:15: Invited Talk 4: Seraphina Goldfarb-Tarrant (Cohere)
14:15 - 14:30: Outstanding Paper
14:45 - 15:30: Lightning Talks by Remote Presenters
A Study of the Class Imbalance Problem in Abusive Language Detection
Yaqi Zhang, Viktor Hangya and Alexander Fraser
Adversarial Nibbler - A Novel Crowdsourcing Procedure for Detecting Harmful Content in t2i Models
Jessica A. Quaye, Alicia Parrish, Oana Inel, Charvi Rastogi, Hannah Rose Kirk, Minsuk Kahng, Erin Van Liemt, Max Bartolo, Jess Tsang, Justin White, Nathan Clement, Rafael Mosquera, Juan Ciro, Vijay Janapa Reddi and Lora Aroyo
Towards a Unified Framework for Adaptable Problematic Content Detection via Continual Learning
Ali Omrani, Alireza Salkhordeh Ziabari, Preni Golazizian, Jeffrey Sorensen and Morteza Dehghani
From Linguistics to Practice: a Case Study of Offensive Language Taxonomy in Hebrew
Chaya Liebeskind, Natalia Vanetik and Marina Litvak
Estimating the Emotion of Disgust in Greek Parliament Records
Vanessa Lislevand, John Pavlopoulos, Panos Louridas and Konstantina Dritsa
Simple LLM based Approach to Counter Algospeak
Jan Fillies and Adrian Paschke
Harnessing Personalization Methods to Identify and Predict Unreliable Information Spreader Behavior
Shaina Ashraf, Fabio Gruschka, Lucie Flek and Charles Welch
Visual and Textual Narrative Analysis of the anti-femicide Movement in Mexico
Laura W. Dozal
X-posing Free Speech: Examining the Impact of Moderation Relaxation on Online Social Networks
Arvindh Arun, Saurav Chhatani, Jisun An and Ponnurangam Kumaraguru
A Strategy Labelled Dataset of Counterspeech
Aashima Poudhar, Ioannis Konstas and Gavin Abercrombie
From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets
Manuel Tonneau, Diyi Liu, Samuel Fraiberger, Ralph Schroeder, Scott A. Hale and Paul Röttger
A Bayesian Quantification of Aporophobia and the Aggravating Effect of Low-Wealth Contexts on Stigmatization
Ryan Brate, Marieke van Erp and Antal P.J. van den Bosch
15:30 - 16:00: Invited Talk 5: Yacine Jernite (Hugging Face)
16:00 - 16:45: Break
16:45 - 17:45: Panel Discussion - Online Harms in the Age of Large Language Models
17:45 - 17:55: Closing Remarks