WOAH 2020 | Shared Exploration

We are proud to announce the first WOAH Shared Exploration: Bias and Unfairness in the Detection of Online Abuse


This year we introduce the concept of Shared Exploration. We have selected a single dataset, and we encourage innovative and pathbreaking analyses which pertain to the theme of this year's workshop: Bias and Unfairness in the Detection of Online Abuse.

A Shared Exploration borrows traits from traditional shared tasks but differs in its primary aim. In contrast to shared tasks, the focus of a Shared Exploration is the analysis of a dataset rather than competitive modeling performance. We will review submissions against the three criteria detailed below rather than a single evaluation metric. This allows us to take a more holistic approach and to reward innovative and rigorous analyses rather than sophisticated engineering.

Important information

Data availability: All data has been released.

Shared Exploration papers due: September 7, 2020 (23h59, GMT-12, "anywhere on Earth")

Camera ready papers due: October 14, 2020

Submission portal: http://softconf.com/emnlp2020/WOAH4/

Mailing list: Interested parties are encouraged to subscribe to the shared task mailing list at https://groups.google.com/forum/#!forum/woah-shared-exploration-2020.

Organizer E-mail: sharedexploration@workshopononlineabuse.com

Detailed Description

We are using the previously released Wikipedia Detox dataset. It is described in the 2017 WWW paper by Ellery Wulczyn, Nithum Thain and Lucas Dixon (preprint available on arXiv) and is also documented in this Wiki. To access the data, download all files from Figshare.

We invite submissions which develop new models, systems and analyses (both quantitative and qualitative) which address our theme: Bias and Unfairness in the Detection of Online Abuse. We encourage you to explore interesting, unusual and novel analyses and to integrate social scientific insights with advanced engineering where possible. Approaches which consider the wider implications of the results and address social questions are encouraged. Because this is a “Shared Exploration” rather than a “Shared Task”, we are not stipulating a single task for which you need to maximize model performance, and you are free to define the scope of your own work. Some potential issues you could consider include:

  • How to best aggregate and combine annotations (and what the impact is on the performance, explainability, generalisability, and applicability of models)

  • Identifying and evaluating biases

  • Measures for mitigating biases

  • Enhancing, measuring and balancing model efficiency. Issues you could consider are model size, simplicity, run time, computational requirements and environmental impact.

  • Biases which emerge in cross-domain application of models and model generalisability (you may want to explore a secondary dataset).

  • Error evaluation, reduction and investigation, such as systematically investigating what types of content generate the most errors.
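As a purely illustrative starting point for the first two issues above, the sketch below compares two simple ways of aggregating per-annotator labels and computes a false positive rate per subgroup. It is a minimal sketch under stated assumptions, not a recommended method: the toy records, the function names and the group labels are all hypothetical and are not taken from the Wikipedia Detox files.

```python
from collections import defaultdict

# Hypothetical toy annotations: (comment_id, annotator_id, label), 1 = abusive.
# These records are invented for illustration only.
ANNOTATIONS = [
    ("c1", "a1", 1), ("c1", "a2", 1), ("c1", "a3", 0),
    ("c2", "a1", 0), ("c2", "a2", 0), ("c2", "a3", 0),
    ("c3", "a1", 1), ("c3", "a2", 0), ("c3", "a3", 0), ("c3", "a4", 1),
]


def group_labels(annotations):
    """Collect the list of annotator labels for each comment."""
    by_comment = defaultdict(list)
    for comment_id, _annotator_id, label in annotations:
        by_comment[comment_id].append(label)
    return dict(by_comment)


def majority_vote(labels):
    """Return 1 only if strictly more than half of the labels are 1 (ties give 0)."""
    return int(2 * sum(labels) > len(labels))


def aggregate(annotations, threshold=0.5):
    """Aggregate labels per comment by majority vote and by thresholded mean."""
    result = {}
    for comment_id, labels in group_labels(annotations).items():
        mean = sum(labels) / len(labels)
        result[comment_id] = {
            "majority": majority_vote(labels),
            "mean": mean,
            "thresholded": int(mean >= threshold),
        }
    return result


def false_positive_rate_by_group(examples):
    """Compute FPR per group from (group, gold_label, predicted_label) triples.

    Groups with no gold-negative examples are omitted from the result.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, gold, predicted in examples:
        if gold == 0:
            negatives[group] += 1
            false_positives[group] += int(predicted == 1)
    return {g: false_positives[g] / negatives[g] for g in negatives}


if __name__ == "__main__":
    for comment_id, summary in aggregate(ANNOTATIONS).items():
        print(comment_id, summary)

    # Hypothetical (group, gold, predicted) triples for a bias check.
    predictions = [("A", 0, 1), ("A", 0, 0), ("B", 0, 0), ("B", 0, 0), ("B", 1, 1)]
    print(false_positive_rate_by_group(predictions))
```

In this toy data, comment c3 has an evenly split vote, so the strict majority vote and the thresholded mean assign it different labels. Cases like this are where the choice of aggregation strategy can change downstream results.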

Shared Exploration Evaluation

All submissions will be reviewed by two reviewers. Submissions will be evaluated on the quality of their analysis in relation to three criteria:

  1. Advancement: Generation of new academic knowledge which addresses existing problems in the field, or identifies new problems. You should clearly outline the problem you are addressing and how your analyses respond to it.

  2. Innovation: Application of new methods, techniques and approaches. This could include combining social scientific methods and theories with advanced engineering solutions.

  3. Rigour: Specification of a clear research question and a well-executed, well-explained and fully considered research design. This could include integrating the Wikipedia Detox dataset with other sources of data and/or other training datasets.

A “Best Submission” award will be decided by a panel of experts and, depending on the number of submissions, a shortlist of the 5 best papers will be announced.

Submission information

We will be using the EMNLP 2020 Submission Guidelines. Authors are invited to submit Shared Exploration papers of 3 to 8 pages of content, with up to 2 additional pages for references. All papers submitted to the Shared Exploration will be reviewed and must be submitted as single-blind submissions. All research artifacts (code, models, additional datasets and any other artifacts) must be submitted alongside the paper as a zipped file, including a very short “Read Me” to explain the contents. We will share these through GitHub, enabling future researchers to easily access, cite and use your work.

As with shared tasks, papers submitted as part of the Shared Exploration are guaranteed acceptance, provided that reviewers do not flag serious concerns with their veracity or integrity. We reserve the right to withhold papers from publication in the conference proceedings if authors are unwilling to make changes required by our reviewers. These may include shortening certain sections, adding more detail, or fixing errors in analysis.