Does Domain-Specific Training Improve Financial Sentiment Analysis for Stock Price Prediction?

Gauri R. Karnik and Raquel G. Alhama

The objective of financial sentiment analysis is to classify a piece of financial text as expressing a bullish or bearish opinion. Sentiment about future stock performance, both in news articles and on social media, has been shown to be a strong predictor of actual future stock performance (Khadjeh Nassirtoussi et al., 2014, Jrnl. Exprt. Sys. App.). However, financial sentiment analysis is hindered by the lack of large-scale training data and by the expert knowledge required to label such data (Xing et al., 2020, COLING).

In the field of Natural Language Processing, models pre-trained on large amounts of domain-general language, such as BERT (Devlin et al., 2019, NAACL) and GPT (Brown et al., 2020, NeurIPS), have shown success in a wide range of tasks, including sentiment analysis. Once pre-trained, these models can be taken off-the-shelf and fine-tuned for a specific task, hence providing a ready-to-use option.

Given the limited number of domain-specific resources (and the cost of creating them), it is worth investigating whether combining them with general resources improves stock price prediction performance. In this analysis, we therefore use domain-specific resources in combination with general language resources.

We use three domain-specific resources, covering the domains of finance, social media, or both: (1) posts by retail investors on Reddit, drawn from three of the largest subreddits related to stock markets and investing, comprising submissions and comments about five of the most discussed stocks over a two-year period (2019-2020); (2) the Financial Phrasebank, a financial sentiment analysis dataset compiled by Malo et al., 2014, Jrnl. AIST; and (3) VADER, a rule-based sentiment analysis model specialized for social media (Hutto & Gilbert, 2014, ICWSM). The general language resources are BERT and the Large Movie Review Dataset (Maas et al., 2011, ACL-HLT).
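To illustrate the kind of lexicon-and-rules scoring that VADER performs, the following is a minimal toy sketch: the lexicon entries, weights, and negation handling below are hypothetical simplifications, not VADER's actual lexicon or rule set.

```python
# Toy illustration of VADER-style lexicon-and-rules sentiment scoring.
# The words and valence weights are hypothetical examples, not VADER's lexicon.
LEXICON = {"bullish": 2.0, "moon": 1.5, "gain": 1.2,
           "bearish": -2.0, "crash": -2.5, "loss": -1.3}
NEGATIONS = {"not", "no", "never"}

def score(text: str) -> float:
    """Sum lexicon valences over tokens, flipping the sign after a negation."""
    total, flip = 0.0, 1
    for tok in text.lower().split():
        if tok in NEGATIONS:
            flip = -1
            continue
        total += flip * LEXICON.get(tok, 0.0)
        flip = 1  # negation only affects the immediately following token
    return total

print(score("not bearish , expecting gain"))  # negation flips "bearish": 3.2
```

The real VADER additionally handles intensifiers, punctuation emphasis, and capitalization, which makes it well suited to informal social media text without any task-specific training.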

We analyze the relative advantage of each of these resources, as well as the effectiveness of domain transfer in financial sentiment analysis. We start by comparing VADER against two versions of BERT: an off-the-shelf version, and one that has been further trained on the Reddit financial dataset described above. We train and evaluate each version of BERT on two sentiment analysis tasks: (a) financial sentiment analysis, using the Financial Phrasebank dataset, and (b) domain-general sentiment analysis, using data from the Large Movie Review Dataset. Finally, we test these models in the practical application of predicting stock prices from sentiment.
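The final evaluation step can be sketched as follows: aggregate per-post sentiment into a daily signal and check whether its sign agrees with the direction of the next day's price move. This is a minimal sketch on hypothetical data; the function names, the averaging scheme, and the directional-accuracy metric are illustrative assumptions, not the paper's actual evaluation protocol.

```python
# Hedged sketch of stock price prediction from aggregated sentiment.
# All data below is hypothetical; dates and returns are made up for illustration.
from collections import defaultdict

def daily_signal(posts):
    """Average post-level sentiment scores (bullish=+1, bearish=-1) per day."""
    by_day = defaultdict(list)
    for day, s in posts:
        by_day[day].append(s)
    return {day: sum(scores) / len(scores) for day, scores in by_day.items()}

def directional_accuracy(signal, next_day_returns):
    """Fraction of days where the sentiment sign matches the return sign."""
    hits = [(signal[d] > 0) == (next_day_returns[d] > 0)
            for d in signal if d in next_day_returns]
    return sum(hits) / len(hits)

posts = [("2019-01-02", 1), ("2019-01-02", -1), ("2019-01-02", 1),
         ("2019-01-03", -1)]
returns = {"2019-01-02": 0.012, "2019-01-03": -0.004}  # hypothetical returns

print(directional_accuracy(daily_signal(posts), returns))  # 1.0 on this toy data
```

Directional accuracy is only one possible criterion; a full evaluation would also need to control for baseline market movement and transaction-level confounds.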

Results of this ongoing work should shed light on the relative advantages of combining domain-specific and domain-general resources to mitigate the problem of scarce labeled data in financial sentiment analysis.