Hate speech, algorithms, and digital connectivity

The Online Hate Index (OHI) is a research partnership between UC Berkeley’s D-Lab and Google Jigsaw that seeks to improve society’s understanding of online hate speech from sources such as YouTube, Reddit, Twitter, and other social media sites. Its questions include the prevalence of hate speech over time, its variation across regions and demographics, our ability to measure it through crowdsourcing and algorithms, and how historical or future interventions might influence it. Through a combination of citizen science and machine learning, the team is developing a nuanced measurement methodology that decomposes hate speech into its constituent components. These components can then be transformed into a continuous “hate speech scale,” which is easier to rate, evaluate, and understand than a single omnibus question (i.e., “Is this comment hate speech?”).
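As a rough illustration of the idea of combining component ratings into one continuous score, the sketch below averages annotator ratings per component and rescales the result to the range 0 to 1. The component names, the 0-4 rating range, and the simple averaging are all assumptions made for illustration; the announcement does not specify the team's instrument or scaling method, and this is not it.

```python
from statistics import mean

# Hypothetical component names and rating range (0-4), chosen for illustration only.
COMPONENTS = ("disrespect", "dehumanization", "call_for_violence")
MAX_RATING = 4


def hate_score(ratings):
    """Average each component across annotators, then average the
    components and rescale to [0, 1]. A naive stand-in for a real scale."""
    per_component = [mean(r[c] for r in ratings) for c in COMPONENTS]
    return mean(per_component) / MAX_RATING


# Two hypothetical annotators rating the same comment.
ratings = [
    {"disrespect": 3, "dehumanization": 2, "call_for_violence": 0},
    {"disrespect": 4, "dehumanization": 2, "call_for_violence": 1},
]
score = hate_score(ratings)
```

Even this naive version shows why a multi-component instrument yields finer distinctions than a yes/no label: two comments that would both be marked "hate speech" can land at different points on the scale.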

The project is setting new standards for the data science of hate speech, with goals to 1) establish a theoretically grounded definition of hate speech inclusive of research, policy, and practice, 2) develop and apply a multi-component labeling instrument, 3) create a new crowdsourcing tool to label comments at scale, 4) curate an open, reliable, multi-platform labeled hate speech corpus, 5) grow existing data and tool repositories within principles of replicable and reproducible research, enabling greater transparency and collaboration, 6) create new knowledge through ethical online experimentation (and citizen science), and 7) refine AI models. The research team includes Geoff Bacon (Linguistics Ph.D. candidate); Nora Broege (Postdoc at Rutgers University); Chris Kennedy (Biostatistics Ph.D. student, BIDS Fellow); and Alexander Sahn (Political Science Ph.D. candidate).

Ultimately, we seek to understand the causal mechanisms for intervention and evaluation, while defending free speech. A new open-source platform, to be used by the Anti-Defamation League and other advocacy organizations, will make these resources (along with policy recommendations) available to educate the public and grow the larger data science and citizen science community.

Speaker: Claudia von Vacano, Executive Director, D-Lab, UC Berkeley

Wednesday, 05/08/19


Doe Memorial Library, Room 190
UC Berkeley
Berkeley, CA 94720