Navigating Stories

Erik Tjong Kim Sang, Kevin Pijpers, Stefan Bastholm Andrade, Malte Lüken, Anneke Sools and Gerben Westerhof

Navigating Stories in Times of Transition is a project of the University of Twente and the Netherlands eScience Center. The project aims to give social science researchers and narrative psychologists access to state-of-the-art natural language processing tools. In particular, we are interested in making available Digital Story Grammar [Andrade and Andersen, 2020], a method recently developed for narrative analysis. In the initial phase of the project, we focus on three tasks:

1. Adapting Digital Story Grammar (originally for English) to Dutch
2. Gathering insights into digital text analysis tools used for narrative research
3. Collecting requirements for digital text analysis tools from potential users

Digital Story Grammar is an automatic method for story analysis [Andrade and Andersen, 2020]. It relies on part-of-speech tagging, dependency parsing and a collection of task-specific rules to extract action, actor, object and modifier information from sentences. This information is then used to answer the questions who, where, when, what, why and how about a story [Franzosi, 2012]. When porting the English software to Dutch, we noticed that the task-specific rules mostly addressed a task known in natural language processing as semantic role labeling [Carreras and Màrquez, 2005]. Several machine learning solutions exist for this task. We used Stroll [Attema and van Kuppevelt, 2021], a semantic role labeler for Dutch. This eliminated the need for most of the task-specific and language-specific rules used in the Digital Story Grammar tool for English. The current tool for Dutch extracts information at the sentence level and, by aggregating the sentence-level data, at the story level.
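The rule-based extraction and the aggregation step described above can be illustrated with a minimal sketch. It is not the Digital Story Grammar or Stroll code: the Token class, the extract_roles function, the example sentences and their hand-written parses are all hypothetical, and the dependency labels only loosely follow Universal Dependencies conventions. In a real pipeline the tags and relations would come from a parser or a semantic role labeler rather than being typed in by hand.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str   # part-of-speech tag (simplified, hand-assigned)
    dep: str   # dependency relation to the head (simplified, hand-assigned)
    head: int  # index of the head token; the root points to itself


def extract_roles(tokens):
    """Apply toy rules to find actor, action, object and modifiers of the main verb."""
    root = next(i for i, t in enumerate(tokens) if t.dep == "root")
    roles = {"actor": None, "action": tokens[root].text,
             "object": None, "modifiers": []}
    for i, t in enumerate(tokens):
        if i == root or t.head != root:
            continue  # only direct dependents of the main verb are considered
        if t.dep == "nsubj":
            roles["actor"] = t.text
        elif t.dep == "obj":
            roles["object"] = t.text
        elif t.dep in ("advmod", "obl"):
            roles["modifiers"].append(t.text)
    return roles


# Hypothetical two-sentence "story" with hand-written parses.
story = [
    [Token("Anna", "PROPN", "nsubj", 1), Token("wrote", "VERB", "root", 1),
     Token("a", "DET", "det", 3), Token("letter", "NOUN", "obj", 1),
     Token("yesterday", "ADV", "advmod", 1)],
    [Token("Anna", "PROPN", "nsubj", 1), Token("mailed", "VERB", "root", 1),
     Token("the", "DET", "det", 3), Token("letter", "NOUN", "obj", 1)],
]

# Sentence-level information.
sentence_roles = [extract_roles(sentence) for sentence in story]

# Story-level information: aggregate the sentence-level results.
actor_counts = Counter(r["actor"] for r in sentence_roles if r["actor"])
actions_by_actor = {}
for r in sentence_roles:
    actions_by_actor.setdefault(r["actor"], []).append(r["action"])

print(sentence_roles)
print(actor_counts, actions_by_actor)
```

In this sketch the sentence-level dictionaries answer "who did what to what", and the counters built from them give a story-level view of which actors occur and which actions they perform. A semantic role labeler would replace the hand-written rules in extract_roles with learned predictions.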

Digital tools for qualitative data analysis (QDA), such as the industry standards NVivo, ATLAS.ti and MAXQDA, have been used in narrative research for more than twenty years. Some of these tools incorporate a degree of natural language analysis. To gain insight into the current state of the art, we compared 35 such tools in addition to the three industry standards [Tjong Kim Sang et al., 2022]. We identified Orange as a tool with the potential to combine automation with in-depth data analysis and visualization in a way that supports user adoption.

We will further develop Digital Story Grammar into a valuable assistive tool for narrative research. Furthermore, natural language processing may offer other useful analysis methods for narrative psychology and the social sciences. For our tools to be successful, however, it is crucial that they are adopted by expert researchers from these fields. The adoption of innovations is no simple matter [Ramilo and Embi, 2014], so we will conduct several interviews with expert researchers to understand how our NLP tools can assist them in analyzing and visualizing their data in rigorous ways [Pijpers, 2022]. We have initially identified five user personas and will use them to discuss tensions between coding and using digital text analysis software.

References

[Andrade and Andersen, 2020] Andrade, S. B. and Andersen, D. (2020). Digital story grammar: a quantitative methodology for narrative analysis. International Journal of Social Research Methodology, 23(4):405–421.