Aspect-Based Sentiment Analysis on Dutch Patient Experience Survey Data

Murad Bozik and Suzan Verberne

Aspect-based sentiment analysis (ABSA) extracts the sentiment expressed toward a given aspect in a text. Performance of ABSA on free text is typically lower in the domain of health care and well-being than in other domains [1]. Recent work showed that better performance is achievable by integrating syntactic dependencies into a graph convolutional network (GCN) [2]. In that work, Žunić et al. proposed a sentence-level sentiment analysis method in which each sentence is represented as an undirected dependency graph whose vertices are represented by 300-dimensional GloVe embeddings. By using two convolution layers in their GCN, they leveraged information from the second-order neighbors of each vertex.
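To illustrate why two convolution layers reach second-order neighbors, the following is a minimal sketch rather than the model of Žunić et al. It assumes a simplified, parameter-free graph convolution (averaging over a vertex and its neighbors, with no learned weights or nonlinearity), a hand-written dependency graph for a hypothetical four-token Dutch sentence, and one-hot vertex features in place of 300-dimensional embeddings so that the flow of information is easy to read off.

```python
def normalized_adjacency(num_vertices, edges):
    """Row-normalised adjacency matrix with self-loops for an undirected graph."""
    adj = [[0.0] * num_vertices for _ in range(num_vertices)]
    for i in range(num_vertices):
        adj[i][i] = 1.0  # self-loop keeps each vertex's own features
    for head, dependent in edges:
        adj[head][dependent] = 1.0  # dependency arcs are treated as undirected
        adj[dependent][head] = 1.0
    return [[value / sum(row) for value in row] for row in adj]


def propagate(adj, features):
    """One simplified convolution step: average features over the neighbourhood."""
    n, dim = len(adj), len(features[0])
    return [
        [sum(adj[i][k] * features[k][d] for k in range(n)) for d in range(dim)]
        for i in range(n)
    ]


# Hypothetical example sentence and hand-written (assumed) dependency arcs.
tokens = ["de", "wachttijd", "was", "lang"]
edges = [(1, 0), (3, 1), (3, 2)]  # (head, dependent) index pairs

adj = normalized_adjacency(len(tokens), edges)

# One-hot features: column j of the output shows how much vertex j contributes.
one_hot = [[1.0 if i == j else 0.0 for j in range(len(tokens))]
           for i in range(len(tokens))]

layer1 = propagate(adj, one_hot)   # first-order neighbourhood
layer2 = propagate(adj, layer1)    # second-order neighbourhood

for name, layer in (("layer 1", layer1), ("layer 2", layer2)):
    reached = [tokens[j] for j, weight in enumerate(layer[0]) if weight > 0]
    print(name, "representation of 'de' draws on:", reached)
```

In this toy graph, "lang" is two arcs away from "de", so it contributes to the representation of "de" only after the second step, while "was" (three arcs away) still does not.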

In this study, we replicate this approach for the Dutch language and evaluate its effectiveness in the health domain, using a data set of patient survey responses from multiple hospitals. In addition, we investigate the transferability of the method through cross-hospital analysis. Our patient experience survey data comprises 55,507 responses from 2 hospitals. Most documents are short free-text responses: the mean, median, and standard deviation of document length are 19, 11, and 26, respectively. To implement the approach, we annotated a small portion of the patient experience survey data and represented the annotated responses as graphs using syntactic dependencies.
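One possible way to organise a cross-hospital analysis is sketched below: train on the responses of one hospital and test on those of another, in both directions. The function name, the record fields ("hospital", "text"), and the toy records are assumptions made for illustration; they do not describe the actual data format or experimental protocol.

```python
from collections import defaultdict


def cross_hospital_splits(responses):
    """Yield (train_hospital, test_hospital, train_set, test_set) for every
    ordered pair of distinct hospitals."""
    by_hospital = defaultdict(list)
    for response in responses:
        by_hospital[response["hospital"]].append(response)
    for train_hospital in sorted(by_hospital):
        for test_hospital in sorted(by_hospital):
            if train_hospital != test_hospital:
                yield (train_hospital, test_hospital,
                       by_hospital[train_hospital], by_hospital[test_hospital])


# Invented toy records; the real survey responses are not reproduced here.
responses = [
    {"hospital": "A", "text": "example response 1"},
    {"hospital": "A", "text": "example response 2"},
    {"hospital": "B", "text": "example response 3"},
]

splits = list(cross_hospital_splits(responses))
for train_hospital, test_hospital, train_set, test_set in splits:
    print(f"train on {train_hospital} ({len(train_set)} responses), "
          f"test on {test_hospital} ({len(test_set)} responses)")
```

With two hospitals this yields two train/test directions, which allows a comparison of how well a model fitted on one hospital carries over to the other.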

We encountered a few challenges during this process. First, the data includes multiple questionnaires with open questions, such as “What went well in the hospital?” and “What could be improved?”. Naturally, responses to the first question are treated as positive and responses to the second as negative. However, most of the responses to the second question do not express any negative sentiment. Second, responses include abbreviations whose interpretation requires expertise in both the domain and the language.

We annotated the data with preselected aspects. First, we applied two topic modeling techniques, latent Dirichlet allocation (LDA) and non-negative matrix factorization (NMF), to the data. This automatic aspect extraction did not provide the desired outcome because the topics overlapped. Nonetheless, it was useful for getting an idea of what the aspects should be. We annotated the following aspects: waiting time, communication, food, treatment, and cleaning as negative aspects; kindness, explanation, speed, care, and staff as positive aspects. Unlike in the approach of Žunić et al., our aspects are composed of a varying number of words. For example, for the preselected aspect “waiting time” and the document “Had to be present very early, so had to wait a long time for surgery”, we annotated “had to wait a long time” as the aspect. Aspects range from a single word to as many as 6 words.
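A multi-word aspect annotation of this kind can be stored as a token span within the document. The sketch below uses the example above; the tokenizer (lowercased word characters only), the helper name, and the annotation record layout are assumptions for illustration and not the annotation format actually used.

```python
import re


def tokenize(text):
    """Very simple tokenizer (assumption): lowercase runs of word characters."""
    return re.findall(r"\w+", text.lower())


def find_aspect_span(document, aspect_phrase):
    """Return (start, end) token indices, end exclusive, of the first
    occurrence of the aspect phrase in the document, or None if absent."""
    doc_tokens = tokenize(document)
    aspect_tokens = tokenize(aspect_phrase)
    width = len(aspect_tokens)
    for start in range(len(doc_tokens) - width + 1):
        if doc_tokens[start:start + width] == aspect_tokens:
            return start, start + width
    return None


document = "Had to be present very early, so had to wait a long time for surgery"
span = find_aspect_span(document, "had to wait a long time")

# Hypothetical annotation record linking the span to a preselected aspect.
annotation = {"aspect": "waiting time", "polarity": "negative", "span": span}

# Binary mask over the document tokens marking which ones belong to the aspect.
aspect_mask = [1 if span[0] <= i < span[1] else 0
               for i in range(len(tokenize(document)))]

print(annotation)
print(aspect_mask)
```

Storing the span rather than the surface string keeps the annotation aligned with the tokens that become vertices of the dependency graph, regardless of how many words the aspect contains.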

In our poster, we will address the following questions:
“Does the integration of syntactic dependencies provide any advantage over other methods for the Dutch language?”
“Is this approach suitable for providing insight to improve health care and well-being in the hospital?”

[1] Žunić, Anastazia, Padraig Corcoran, and Irena Spasić. “Sentiment analysis in health and well-being: systematic review.” JMIR Medical Informatics 8.1 (2020): e16023.

[2] Žunić, Anastazia, Padraig Corcoran, and Irena Spasić. “Aspect-based sentiment analysis with graph convolution over syntactic dependencies.” Artificial Intelligence in Medicine 119 (2021): 102138.