Where Was The Driver? Exploring responsibility framing in traffic news using BERTje

Jantina Schakel, Gosse Minnema and Malvina Nissim

Reporting an event implies taking a perspective on it, consciously or subconsciously. A key aspect of perspective-taking (or framing) concerns the different levels of focus, agentivity, and responsibility assigned to the actors involved in the event. For example, politicians are notorious for admitting that “mistakes were made” (without saying who made them), and sports commentators could say that “Vivianne Miedema scored a goal” or that “the goalkeeper failed to stop the goal”, depending on whose actions they would like to focus on. Linguistically speaking, responsibility framing is associated with specific lexical (e.g. “mistake” vs. “negligence”) and syntactic choices (e.g. active vs. passive constructions).

We investigate responsibility framing in news reports on traffic crashes using NLP tools. Worldwide, more than 1.3 million people are killed in traffic every year (Culver, 2018, p. 153). Traffic crashes are a socially relevant case study because road users differ in vulnerability and social power (e.g. car drivers vs. pedestrians) and because crashes are often perceived as an inevitable part of everyday life. Social scientists have argued that the media often report on traffic crashes in an ideologically biased way, favoring certain road users over others and reinforcing the societal status quo (Ralph et al., 2019; Goddard et al., 2019; Te Brömmelstroet, 2020). Specifically, Te Brömmelstroet (2020) observes that traffic crashes are often dehumanized: they are conceptualized as “glitches in the machine” and the responsibility of humans is de-emphasized. This leads to the frequent use of linguistic constructions that place excessive focus on the victim and remove car drivers from the scene, as in headlines such as “Woman dies while cycling” [1] or “Fietser klapt tegen voorruit” (“Cyclist hits windscreen”) [2], both of which describe incidents in which a car driver hit a cyclist.

Our computational analysis uses the Dutch portion of the RoadDanger dataset [3], a collection of 10K news reports on traffic crashes in the Netherlands and abroad. We exploit the fact that large language models capture and reflect societal stereotypes (Nozza et al., 2021) to test how strongly certain traffic-related agents, e.g. “bestuurder” (“driver”) or “auto” (“car”), are associated with relevant verbs (Kurita et al., 2019; Bartl et al., 2020). We create and release a template-based benchmark for Dutch, against which we evaluate BERTje (de Vries et al., 2019) before and after applying Domain-Adaptive Pretraining (DAPT; Gururangan et al., 2020). For DAPT, we use the bodies and/or headlines of articles from RoadDanger, which lets us expose and compare biased perspective-taking in news articles. Our hypothesis is that the domain-adapted LMs will show more distinctly the responsibility framing structures and the misleading perspectives associated with them. Preliminary results show patterns favoring object agents over human agents in both the original and the domain-adapted BERTje models.
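
As an illustration of template-based probing, the sketch below scores how strongly an agent word is associated with a verb, using a log-probability ratio in the spirit of Kurita et al. (2019). It is a minimal sketch under stated assumptions and not necessarily the exact measure used in this work. The template, the function names, and the probability values are all hypothetical and invented for the example, and the toy lookup table stands in for a masked language model such as BERTje.

```python
import math
from typing import Callable

# Hypothetical template: "The [AGENT] [VERB] the cyclist."
TEMPLATE = "De [AGENT] [VERB] de fietser."
MASK = "[MASK]"


def association(prob: Callable[[str, str], float], agent: str, verb: str,
                template: str = TEMPLATE) -> float:
    """Log-ratio of P(agent | verb visible) to P(agent | verb masked).

    `prob(sentence, token)` is assumed to return the probability that the
    FIRST mask in `sentence` is filled by `token`.
    """
    # Target: only the agent slot is masked, the verb is visible.
    target = template.replace("[AGENT]", MASK).replace("[VERB]", verb)
    # Prior: agent and verb are both masked, giving a verb-independent baseline.
    prior = template.replace("[AGENT]", MASK).replace("[VERB]", MASK)
    return math.log(prob(target, agent) / prob(prior, agent))


# Invented probabilities for illustration only; these are NOT model outputs.
TOY_PROBS = {
    ("De [MASK] raakte de fietser.", "auto"): 0.20,
    ("De [MASK] raakte de fietser.", "bestuurder"): 0.05,
    ("De [MASK] [MASK] de fietser.", "auto"): 0.10,
    ("De [MASK] [MASK] de fietser.", "bestuurder"): 0.10,
}


def toy_prob(sentence: str, token: str) -> float:
    """Stand-in for a masked LM: look the probability up in the toy table."""
    return TOY_PROBS[(sentence, token)]


if __name__ == "__main__":
    for agent in ("auto", "bestuurder"):
        score = association(toy_prob, agent, "raakte")
        print(f"{agent}: {score:+.3f}")
```

A positive score means the verb makes the agent more likely than the baseline, and a negative score means it makes the agent less likely. In practice, `toy_prob` would be replaced by a function that queries the model, once before and once after domain adaptation, so that the two sets of scores can be compared.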

[1] https://yhoo.it/3uSvJrA
[2] https://bit.ly/klapt-tegen-voorruit
[3] www.roaddanger.org