Under high pressure from work, family, and school, many people struggle with psychosocial and mental health problems, which has become a matter of public concern. In recent years, the pandemic has amplified the effects of mental health difficulties, such as suicidal thinking, severe depression, and self-harm behaviors (Salimi, et al. 2023). In this article, we adopt the term mental distress to capture mental disorders, high pressure, and related symptomatology. Social media has inevitably evolved into an outlet for many individuals to express their suppressed emotions (Naslund, et al. 2020). Consequently, numerous people vent online about work-related stress and emotional challenges in search of solace; however, this can in turn spread anxious sentiment further, because many readers are likely to have had similar experiences. Some individuals may also resort to sharing posts laden with racial (Thomas, et al. 2023) and gender (Chen, et al. 2022) discrimination as an outlet for their stress. Social media thus resembles a crucible of societal emotions, where people share their current feelings with others at any time. For an individual, posts viewed diachronically reflect emotional changes over a time span. In the context of a particular event, social media posts manifest the attitudes and viewpoints held by users. Therefore, studying emotions both over time and in the context of particular events is significant for monitoring mental health.
Given the research importance of this topic, scholars have investigated the impact of social media from various perspectives, including psychological analysis (Coyne, et al. 2020), adolescent growth (O’reilly 2020), and detection techniques (Chancellor and Choudhury, 2020; Kabir, et al. 2023), to name just a few. Researchers specializing in NLP have dedicated considerable effort to extracting important information from behavioral and linguistic cues in text. Text mining approaches can predict the presence of mental distress, for example by detecting psychological stress states (Lin et al. 2017). Since 2017, amid ongoing research on mental health with advanced techniques, researchers have recognized that NLP methods can be leveraged to analyze, predict, and help prevent mental illness in a timely manner (Garg 2023).
This study aims to explore the significant role of language in detecting mental distress. The pipeline is shown in Fig. 1. We focus on a particular subreddit, r/AmItheAsshole, a platform where people frequently share personal experiences. These posts provide valuable insights because users narrate their own stories, which other users then assess and appraise. Consequently, the content and associated evaluations within such specialized subreddits can serve as invaluable datasets for the development of classification models (Efstathiadis, et al. 2021; Haworth, et al. 2021). Such models, once refined, hold potential utility in the context of judicial judgment (Jiang, et al. 2020). While the selective sharing of information and moral judgment is an interesting area for future research, our main goal in this study is to examine the storytelling aspect. Through this case study, we aim to analyze the root causes of mental pressure and the types of events reported online.
We ground this analysis in theoretical approaches that focus on negative sentiment and its underlying sources (O’dea, et al. 2017; Vedula and Parthasarathy 2017), particularly stress expressed in posts, and investigate the following research questions:
RQ1
How frequently are different emotions, including joy, sadness, anger, fear, love, and surprise, expressed? Are posts with negative emotions correlated with mental distress?
RQ2
To what extent do individuals encounter distressing situations in various types of relationships? What are the attributed identities of individuals who experience such distress?
RQ3
Which specific events or circumstances are associated with the trouble individuals encounter?
To investigate these research questions, we first trained a classification model, using DistilBERT as the base architecture. We then calculated the frequency of each emotion to make a preliminary assessment of the presence of negative emotions. We confirm that the prevalence of negative emotions is notably elevated, which is consistent with the subreddit r/AmItheAsshole primarily serving as a platform for individuals to seek solace and subject themselves to external evaluation (RQ1). For RQ2, we use semantic role labeling to extract events and entities in order to identify conflict events and the relationships involved. Conflict is a normal, inevitable part of any relationship. In the context of family and society, it seems that the closer people are, the more quarrels and conflicts they may experience. This can be attributed to various factors, such as poor communication, lack of understanding, and unresolved issues. A study of communication between spouses during forced self-isolation found that features of this communication affected the degree of constructiveness of marital relationships; in stressful situations, disputes and unsolvable conflicts may arise, leading to quarrels between spouses and other family members (Sorokoumova, et al. 2020). We therefore assume that intimate interpersonal relationships, such as those between parents and children, friends, and spouses, are more likely to exhibit conflict and discord, and we expect this tendency to be discernible in the content of subreddit posts. Furthermore, we employ topic modeling, a valuable method for examining prevalent topics within a corpus. This approach gives us insight into the nature of the events and the focal points of the arguments (RQ3).
This combined analytical framework enables us to gain a comprehensive understanding of the specific incidents or situations encountered by the authors.
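As an illustration of the emotion-frequency step for RQ1, the following is a minimal sketch, not the implementation used in this study. It assumes that a classifier has already assigned one of the six emotion labels to each post. The function names and the grouping of sadness, anger, and fear as negative emotions are assumptions made for this example only.

```python
from collections import Counter

# The six emotion categories named in RQ1.
EMOTIONS = ("joy", "sadness", "anger", "fear", "love", "surprise")

# Assumption for this sketch: these three labels are treated as negative.
NEGATIVE = frozenset({"sadness", "anger", "fear"})


def emotion_frequencies(labels):
    """Return the relative frequency of each emotion among the given labels.

    `labels` is an iterable of predicted emotion names, one per post.
    Labels outside EMOTIONS are ignored.
    """
    counts = Counter(labels)
    total = sum(counts[e] for e in EMOTIONS)
    if total == 0:
        # No recognized labels: report zero for every emotion.
        return {e: 0.0 for e in EMOTIONS}
    return {e: counts[e] / total for e in EMOTIONS}


def negative_share(freqs):
    """Sum the relative frequencies of the emotions assumed to be negative."""
    return sum(freqs[e] for e in NEGATIVE)


if __name__ == "__main__":
    # Hypothetical predictions for four posts.
    predicted = ["anger", "sadness", "joy", "anger"]
    freqs = emotion_frequencies(predicted)
    print(freqs)
    print(negative_share(freqs))
```

Under these assumptions, the share of negative emotions gives a single summary figure that can be compared across subsets of posts, for example across relationship types.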
The main contributions of this paper can be summarized as follows:
(1) new datasets: Prior research has engaged with the subreddit r/AmItheAsshole (Haworth et al., 2021; Giorgi et al., 2023), but its aim has predominantly been the development of classification models. To address the specific aims of our study, we used web scraping to acquire our own datasets, which are useful for psychologists and sociologists. The datasets generated and analyzed during the current study are available in the mental_health_dataset repository, https://anonymous.4open.science/r/mental_health_dataset-5356/reddit_posts.csv.
(2) mental distress research: Our research contributes to a nuanced understanding of emotional well-being, interpersonal dynamics, and the ways in which individuals navigate and cope with distressing situations online.
(3) usage of NLP methods for social science study: From a methodological perspective, our research illustrates how language-model-based techniques, such as BERTopic, BERT-based classification (specifically, DistilBERT), and semantic role labeling, can be applied to social science questions. Our analysis can provide social scientists with insight into the usage of NLP techniques.