A hybrid depressive mood analysis model to detect blogger depression tendency from web posts

doi:10.21203/rs.3.rs-2121386/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

In recent years, reports of suicide have continuously increased as more people suffer from tremendous pressure or depression. According to the World Health Organization (WHO), 5% of adults worldwide suffer from depression. This disorder is difficult to diagnose and detect. In our previous work, we proposed a Negative Emotion Evaluation (NEE) model and an Event-Driven Depression Tendency Warning (EDDTW) model to detect depressive moods in advance. In this work, we combine the previous models into a Hybrid Depressive Mood Analysis (HDMA) model that predicts depression tendency from web posts. The experimental results show that the proposed HDMA model obtains over 70% precision.

major depression disorder

negative emotion

negative thought

stressful life event

text mining

According to an American Psychological Association (APA) study [1], people under a stressful environment have a high risk of suffering from depression due to negative effects on the individuals’ physical and mental health, such as irritability, tension, depression, etc. Therefore, it is an important issue to help people understand their stress effectively to prevent suffering from depression.

The Asia Barometer organization conducted a survey of the quality of life [2] in various Asian countries in 2009 and reported that Taiwan's happiness index ranked last. According to WHO, depression is a common mental disorder that affects an estimated 5% of adults. These findings show that depression is a serious health issue not only in Taiwan but around the world.

Nowadays, people use Web 2.0 technology to record their lives. There are many channels for sharing personal details, such as Facebook, blogs, Twitter, and Line. In these channels, we find that stressful events cause negative emotions, symptoms, and negative behaviors. When the influence of a high-pressure event reaches a certain level, it may increase the user's risk of suffering from depression. Therefore, we aim to capture events from the user's life history and predict depression tendencies using information retrieval technologies.

We observed that web posts may contain one or more stressful life event terms, followed by expressions of negative emotions, symptoms, or negative behavior. We refer to these four factors as depression factors and use them to predict depression tendencies from web posts. Table 1 gives example sentences for each factor. For instance, the stressful life event terms “my father’s suicide” and “parents’ divorce” match the “Death of a close family member” and “Divorce” items in the Social Readjustment Rating Scale (SRRS). The mean values of these events in the SRRS are 63 and 73, which places them among the top five most serious events. The other three depression factors also appear in posts: the word ‘depressed’ is a negative emotion word, ‘insomnia’ is a symptom term, and ‘suicide’ denotes a negative behavior. In clinical treatment, patients rarely have complete records of their life history and past expressions available for diagnosis. Therefore, we develop a new detection technology to provide more resources for medical diagnosis and treatment and to improve the quality of care.

Table 1

Example sentences of depressive factors
Factors	Chinese sentences	English sentences
Stressful life event	最近我阿嬤中風住院	Recently my grandmother had a stroke and was hospitalized
Stressful life event	爸媽離婚了	Divorce of my parents
Negative Emotion	最近一睜開眼睛,心情就很低落	Recently, I feel very depressed when I open my eyes
Symptom	居然才過了兩個禮拜就天天失眠	Can’t believe I have had insomnia everyday just after two weeks
Negative Behavior	就像我割腕過後才知道疤痕一輩子都跟著我	I did not know that the scars will follow me forever after I cut my wrists

In our previous work, we proposed a statistical inference approach, the Event-Driven Depression Tendency Warning (EDDTW) model, to investigate the depression tendency of web posts in blogs and forums. We collected about seven hundred Chinese forum posts from the PTT/Prozac forum in Taiwan as the dataset. We tried to find the stressful event, negative emotion, symptom, and negative thought terms in the dataset and predict the depression score. However, not all web posts tell the complete story. Some authors do not reveal life events in their posts, which reduces the performance of the model. When no event terms are present, we found close relationships among the occurrences of negative emotions, symptoms, and negative thoughts in the collected depressive web posts. Thus, in this work, we focus on two improvements over our previous work: (1) Event lexicon improvement: we refer to the Social Readjustment Rating Scale and the Daily Hassles Scale (Daily Chores Scale) to reconstruct the event lexicon and predict depression using the new life event lexicon. Overall, we collected over 1200 stressful life event terms. (2) Hybrid model: we propose a Hybrid Depressive Mood Analysis (HDMA) model to analyze web posts with or without life events and predict the depression score. The results show precision increased from 0.593 to 0.719.

2.1 Clinical diagnostic techniques of depression

Nowadays, depressive disorder, dysphoria, and neurasthenia have gradually become commonplace mental illnesses for modern people. The cause of depression has not been confirmed clinically. Many researchers have only tried to find the root cause using methods like psychoanalysis or existential psychology [4, 5]. However, sociology researchers point out that the environment, relationships, and life events encountered are important reasons for a person suffering from depression. This study tries to find the relationships among the biological, psychological, and social aspects.

The medical diagnostic criteria for depression in this study are based on the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR) (2000) [1], published by the American Psychiatric Association (APA), and the International Statistical Classification of Diseases and Related Health Problems (ICD-10) (2007) [6], published by the World Health Organization (WHO). The following is a brief explanation of the DSM-IV-TR diagnostic criteria for major depressive disorder:

1. Melancholy mood: unhappy, irritable, and depressed.

2. Reduced interest and joy: uninterested.

3. Unable to focus: unable to resolve contradictions, hesitant, unable to concentrate.

4. Weight and appetite disorders: weight loss or gain, decreased or increased appetite.

5. Insomnia (or hypersomnia): difficulty falling asleep or feeling sleepy all day.

6. Psychomotor retardation (or agitation): thinking slows down and becomes dull.

7. Tiredness or loss of vitality: lying in bed all day, thinking constantly, and physical deterioration.

8. A sense of worthlessness or guilt: feeling life is boring, sadness, remorse, or other negative thoughts.

9. Suicidal intent: repeated thoughts of death, suicidal ideation, attempts or plans.

There are several questionnaires provided for individuals to self-diagnose. The PHQ-9 questionnaire asks users to score the frequency of nine situations occurring in the last two weeks. The American psychiatrist Ivan K. Goldberg developed the Goldberg Depression Scale questionnaire. Individuals can use it to track their weekly moods. In addition, the John Tung Foundation [7] also provides a self-diagnosis depression scale for Taiwanese people to self-screen.

2.2 Stress

Stress is a term from psychology and biology introduced by Hans Selye [8], who defined it as “the non-specific response of the body to any demand for change.” A stressor can be any chemical or biological agent, environmental condition, external stimulus, or event that causes stress to an organism [9]. An event that affects the health of the individual is called a stressful life event [10]. Examples include examinations, dating, and marriage.

How can stress be assessed? Recorded physiological signals and questionnaires are usually used in clinical diagnosis. Jorn Bakker et al. [10] and Mykola Pechenizkiy et al. [11] used physiological signals in the form of Galvanic Skin Response (GSR) to predict the state of stress in order to adjust working hours and job content and prevent the accumulation of stress. Rafal Kocielnik et al. [6] also applied GSR to detect stress in students during examinations or ahead of a deadline for an ad hoc task. The results showed higher stress during these special events.

PG Holeyannavar et al. [12] and Alan H.S. et al. [13] studied the environmental conditions for teachers and designed a standard questionnaire to investigate common sources of stress. The stress sources included working overtime, management of student learning, interaction among colleagues, reduced salaries, school violence, and other stressful events. They also tried to find if the teachers’ education level, gender, or other information were significant factors for the stress.

Currently, stress rating scales are commonly used to capture actual events in clinical stress assessment. Common assessment scales include the Social Readjustment Rating Scale [14], the Daily Hassles Scale [15], and the Perceived Stress Scale [16]. The Social Readjustment Rating Scale, proposed by Thomas Holmes and Richard Rahe at the University of Washington Medical Center, lists 43 stressful events. Events within the last year are scored by severity, and the scores are added together to estimate susceptibility to illness in the near future. The top-rated event is the death of a spouse. The Daily Hassles Scale (Daily Chores Scale) was presented by Richard S. Lazarus et al. of the Department of Psychology, University of California, Berkeley. It lists 119 events that cause stress in daily life. In this study, we apply the Social Readjustment Rating Scale and the Daily Hassles Scale to extract stressful life events.

2.3 Stress and depression

Numerous studies have pointed out that a close relationship exists between stress and depression. Karen [8] pointed out that when people are in a state of pressure, it stimulates the secretion of dopamine. There is a strong correlation between this hormone and depression.

In addition, Sybille [17] varied conditions to create a stressful environment in experiments on rats. The results showed that many depressive symptoms appeared, including sleep problems and decreased learning ability and memory. The above studies assert that there is a close relationship between stress and depression.

2.4 Blogs mood analysis

Blogging has risen in popularity in recent years, and many scholars have found that emotion plays an important role in blog posts. Chen has focused on blog mood analysis and has many publications in this field [18]. Research into emotion detection and classification has increased [18, 19], and blog content analysis has gradually become a common research subject [20]. Many studies [18, 19, 21, 22, 23] focus on blog content and emotion analysis and put forward a variety of methods for identifying and classifying emotions. Leshed and Kaye [24] conducted a comprehensive investigation of LiveJournal.com in an attempt to interpret the moods of bloggers. The site provides 132 mood options to represent blogger emotions. Hsu and Lin [21] provided an SVM-based mood classifier to analyze the relationship between text and emotions. For example, the word "computer" is quite likely to be categorized as "depressed." Yang et al. [19, 22, 23] used SVM and CRF to classify blog posts. They evaluated effectiveness on sentences drawn from a corpus of collected articles. In this research, we use negative emotions as an innovative feature to forecast depression.

In our initial study [25], we reduced the complex criteria for major depression disorder from the DSM-IV-TR by proposing only three depression factors. These include negative emotions, symptoms, and negative thoughts. Table 2 shows how we mapped the depression factors to the major depressive disorder diagnosis guideline. We try to extract negative emotional terms in the web posts and find relationships between negative emotion terms and symptom terms, and between negative emotion terms and negative thought terms. We call this the Negative Emotion Evaluation model (NEE).

Table 2

Major depressive disorder mapping rules
Criteria	Description	Depression Factor
1	Depressed, sad, hopeless, discouraged most of the day	Negative Emotion
2	Loss of interest or pleasure in previously enjoyed activities	Negative Emotion
3	Impaired ability to think, concentrate, or make decisions	Symptom
4	Increased or reduced appetite. Loss or gain in weight.	Symptom
5	Common sleep disturbance (insomnia/hypersomnia)	Symptom
6	Psychomotor changes, including agitation or retardation	Symptom
7	Decreased energy, tiredness, and fatigue	Symptom
8	A sense of worthlessness or guilt.	Negative Thought
9	Frequent thoughts of death, suicidal ideation, or suicide attempts	Negative Thought

We faced a performance issue in the Negative Emotion Evaluation model. We found negative event terms presented in a majority of posts and proposed a new model to evaluate depression in web posts. We assume the event is the key factor and design an Event-Driven Depression Tendency Warning model (EDDTW) [26] to calculate the depression tendency score in a post. We propose a way to extend negative event terms via lexicon, part of speech pattern and the co-occurrence of negative event terms and negative emotion terms. In this study, we apply the Social Readjustment Rating Scale and Daily Hassles Scale to re-create the event lexicon and define the relations among depression factors to form the Stressful Life Event Driven Depression Analysis model (SLEDA).

We proposed two methods to analyze emotion-based and event-based post content for depressive mood detection. However, each model is designed for a specific type of post and not for general use. In this study, we combine these two models to analyze depressive moods in either emotion-based or event-based posts. Section 3.1 describes the Negative Emotion Evaluation model. Section 3.2 describes the Stressful Life Event Driven Depression Analysis model, which replaces the Event-Driven Depression Tendency Warning model. Section 3.3 describes the Hybrid Depressive Mood Analysis model.

3.1 Negative Emotion Evaluation model

In previous work, we introduced the four depression factors of negative emotion, triggering event, symptom, and negative thinking in the proposed Negative Emotion Evaluation model. This study mainly investigates the adverse effects arising from emotional words, so we collect the negative emotion terms from web posts. We refer to Plutchik's wheel of emotions [27] to divide negative emotion terms into three levels and assign a strength value for each level (0.25, 0.5, and 0.75). For example, the words "sad" and "grief" differ in intensity. The word "grief" is more serious and assigned the highest score. Table 3 presents a sample of negative emotion related terms and their intensity level.

Table 3

Examples of Negative Emotion terms
Level 1 (0.25)	Level 2 (0.5)	Level 3 (0.75)
Pensiveness	Sadness	Grief
Boredom	Disgust	Loathing
Apprehension	Fear	Terror
Annoyance	Anger	Rage
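The three-level intensity scheme can be sketched as a simple lexicon lookup. The snippet below is only an illustration: it uses the English sample terms from Table 3 rather than the full Chinese lexicon, and the function names `emotion_hits` and `total_emotion_score` are hypothetical.

```python
# Intensity values follow Table 3; only the sample rows shown there are included.
EMOTION_INTENSITY = {
    "pensiveness": 0.25, "boredom": 0.25, "apprehension": 0.25, "annoyance": 0.25,
    "sadness": 0.5, "disgust": 0.5, "fear": 0.5, "anger": 0.5,
    "grief": 0.75, "loathing": 0.75, "terror": 0.75, "rage": 0.75,
}


def emotion_hits(tokens):
    """Return (term, intensity) pairs for each negative emotion term in a tokenized post."""
    hits = []
    for token in tokens:
        key = token.lower()
        if key in EMOTION_INTENSITY:
            hits.append((key, EMOTION_INTENSITY[key]))
    return hits


def total_emotion_score(tokens):
    """Sum the intensity values of all negative emotion terms found in the post."""
    return sum(score for _, score in emotion_hits(tokens))


print(total_emotion_score(["Grief", "and", "fear", "today"]))
```

A term that is not in the lexicon simply contributes nothing, so the score depends entirely on lexicon coverage.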

We collected symptom related terms from several resources. First, we extracted symptom terms from collected web posts based on the major depressive disorder diagnosis criteria according to DSM-IV-TR. Second, additional symptom terms were collected from the bilingual MeSH vocabulary words associated with depression symptoms. Third, we collected similar terms from Google suggestions and Wikipedia.

Negative thought is also one of the important indicators in the diagnosis of depression. Many studies and clinical experience note that more than 60% of depression patients might have suicidal thoughts and/or suicidal behavior. Therefore, we gathered the vocabulary of negative thoughts and behavior from the collected articles in order to establish a negative thoughts dictionary. The focus was on specific negative behavior terms, which is the vocabulary related to depression or suicide. For example, the specific terms of “jumping,” or “wrists,” and other words signaling negative behavior may be included in negative behavior narrative content. Some depressive disorder patients disclose suicidal intentions or even acts of attempted suicide in their posts, so we assume that negative thought terms are different from the negative emotion terms. We extracted concentration problem, memory difficulty, and other suicide-related terms from web forum posts, e.g. “guilt” and “suicide”, to generate the negative thought lexicon.

Figure 2 shows the modified framework. Accumulating an amount of negative emotion during a certain period could cause depression tendencies. Given a web post $B$, we want to analyze the depression tendency $D$ from the above-mentioned web resources by computing the probability $P\left(D|B\right)$. We utilize depression tendency factors $P_f$ to estimate the depression tendency in a web post, and propose a probabilistic model as follows:

$$P\left(D|B\right)=P\left(M|B\right)\times P\left(S|M\right)\times P\left(T|M\right) \tag{1}$$

where M represents the negative emotion, S represents the symptom, and T represents the negative thought.

In Eq. 1, we extract the negative emotion terms in each post and sum all emotion scores to calculate $P\left(M|B\right)$. We use the distance relationship of all symptom and negative emotion term pairs for $P\left(S|M\right)$, and the distance relationship of all negative thought and negative emotion term pairs for $P\left(T|M\right)$. The framework of the Negative Emotion Evaluation model is shown in Fig. 1.
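To make the structure of Eq. 1 concrete, the following is a minimal sketch of how the three factors could be combined. The text does not specify how the emotion sum is normalized or how distance is converted into a probability, so both choices here are assumptions: the sum is capped at 1, and proximity is taken as the mean of 1/(1 + d) over the nearest term pairs. The names `proximity` and `nee_score` are hypothetical.

```python
def proximity(anchor_positions, other_positions):
    """Assumed distance-to-probability mapping: mean of 1 / (1 + nearest distance).

    Returns 0.0 when either list is empty, which zeroes the product in Eq. 1.
    """
    if not anchor_positions or not other_positions:
        return 0.0
    total = 0.0
    for a in anchor_positions:
        nearest = min(abs(a - o) for o in other_positions)
        total += 1.0 / (1.0 + nearest)
    return total / len(anchor_positions)


def nee_score(emotion_hits, symptom_positions, thought_positions):
    """Product form of Eq. 1: P(M|B) * P(S|M) * P(T|M).

    emotion_hits is a list of (token_position, intensity) pairs.
    """
    p_m = min(1.0, sum(score for _, score in emotion_hits))  # assumed cap at 1
    emotion_positions = [pos for pos, _ in emotion_hits]
    p_s = proximity(emotion_positions, symptom_positions)
    p_t = proximity(emotion_positions, thought_positions)
    return p_m * p_s * p_t


# One emotion term at token 2, a symptom at token 3, a negative thought at token 5.
print(nee_score([(2, 0.5)], [3], [5]))
```

Because the model is a plain product, a post with no symptom or no negative thought term scores zero in this sketch. Whether the original system smooths such cases is not stated.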

3.2 Stressful life event driven depression analysis model

To enhance the event lexicon for the EDDTW [26] model, we apply the SRRS and Daily Hassles Scale to collect a set of seed event terms, which are called stressful life event terms. Initially, we obtained 135 stressful life event terms. We then separate the stressful life event terms into 5 categories and 46 topics. For each topic, we calculate the average score from the point values of the 43 events in the SRRS and the 63 events in the Daily Hassles Scale as the stress severity score of the topic. After calculating the stress severity score, we rescan the relationships among the stressful life event terms, negative emotion terms, symptom terms and negative behavior terms and replace EDDTW with the stressful life event driven depression analysis model (SLEDA).

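We use Eq. (2) to calculate the probability $P\left(D|B\right)$ in the stressful life event driven depression analysis model.

$$P\left(D|B\right)=P\left(E|B\right)\times P\left(M|E\right)\times P\left(S|E\right)\times P\left(T|E\right) \tag{2}$$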

where E represents the stressful life event, M represents the negative emotion, S represents the symptom, and T represents the negative thought.

In Eq. (2), we utilize the average stress severity score of stressful life event terms in each post to compute the probability $P\left(E|B\right).$ To calculate $P\left(M|E\right),$ we selected negative emotion terms occurring in proximity to descriptions of stressful life events. Next, we also use the co-occurrence relation for all symptom and stressful life event term pairs for $P\left(S|E\right)$. We also use the co-occurrence relation for all negative thought and stressful life event term pairs for $P\left(T|E\right)$. The framework of the Stressful Life Event Driven Depression Analysis model is shown in Fig. 2.
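The event-driven factors described for Eq. (2) can be sketched in the same spirit. This is an illustration under stated assumptions, not the authors' implementation: the product form is inferred from the four factors listed above, the average severity is divided by the questionnaire maximum of 5 to obtain $P(E|B)$, and co-occurrence is approximated as the fraction of event terms that have a matching term within a fixed token window. The window size and the names `cooccurrence` and `sleda_score` are hypothetical.

```python
def cooccurrence(event_positions, term_positions, window=10):
    """Assumed co-occurrence measure: share of event terms with a term within `window` tokens."""
    if not event_positions:
        return 0.0
    near = sum(
        1 for e in event_positions
        if any(abs(e - t) <= window for t in term_positions)
    )
    return near / len(event_positions)


def sleda_score(event_hits, emotion_pos, symptom_pos, thought_pos,
                window=10, max_severity=5.0):
    """Product of P(E|B), P(M|E), P(S|E), and P(T|E).

    event_hits is a list of (token_position, stress_severity_score) pairs.
    """
    if not event_hits:
        return 0.0
    positions = [pos for pos, _ in event_hits]
    mean_severity = sum(score for _, score in event_hits) / len(event_hits)
    p_e = mean_severity / max_severity  # assumed normalization to [0, 1]
    p_m = cooccurrence(positions, emotion_pos, window)
    p_s = cooccurrence(positions, symptom_pos, window)
    p_t = cooccurrence(positions, thought_pos, window)
    return p_e * p_m * p_s * p_t


# One event term with severity 4.0 at token 0; other terms close by.
print(sleda_score([(0, 4.0)], [2], [3], [4]))
```

The false negative case discussed later in the paper corresponds to terms falling outside the window, which drives the co-occurrence factors toward zero regardless of the severity score.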

3.3 The system architecture of hybrid depressive mood analysis model

In the collected dataset, not every post mentions a negative event. For such posts, the event-driven model cannot detect depression tendency correctly. To solve this issue, we fall back on our original idea based on the diagnosis criteria of major depressive disorder to calculate the depression tendency score when no event is present in a post. We separate all posts into event and non-event categories. For posts with events, we use the new stressful life event lexicon to label event terms, together with the negative emotion terms that appear when authors complain about something or exhibit depression. For posts without events, we follow the diagnosis criteria and assume that negative emotion terms are the key factor that triggers the depressive symptoms and/or negative thoughts in the post. Figure 3 shows the proposed complementary model for analyzing these two categories of posts.
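The routing between the two categories can be sketched as a small dispatcher. The function name `hdma_score` and the idea of passing the two scorers as callables are illustrative assumptions; the lexicon and scorers below are placeholders, not the models from this paper.

```python
def hdma_score(tokens, event_lexicon, event_model, emotion_model):
    """Route a tokenized post to the event-driven or emotion-driven scorer.

    Posts containing at least one stressful life event term go to the
    event model; all other posts go to the negative emotion model.
    """
    has_event = any(token in event_lexicon for token in tokens)
    model = event_model if has_event else emotion_model
    category = "event" if has_event else "non-event"
    return category, model(tokens)


# Placeholder lexicon and scorers, for illustration only.
lexicon = {"divorce", "homework"}
print(hdma_score(["my", "parents", "divorce"], lexicon, lambda t: 0.7, lambda t: 0.3))
print(hdma_score(["i", "feel", "sad"], lexicon, lambda t: 0.7, lambda t: 0.3))
```

Keeping the scorers as parameters makes it easy to evaluate each branch separately, which matches the per-model comparison reported in the evaluation.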

4.1 Dataset

We collected more than 10,000 Chinese web posts from the PTT/Prozac forum, which is hosted on a well-known BBS in Taiwan. This forum provides people with depression a place to share their feelings and thoughts. First, we selected the posts of the top 30 authors according to posting frequency, resulting in a total of 724 posts used in this work. Each post is segmented into words and labeled with POS tags by a popular Chinese POS tagging tool (http://ckipsvr.iis.sinica.edu.tw). Three trained master's students from the Institute of Behavioral Medicine, College of Medicine, NCKU, manually labeled the depression tendency for each post. First, they read the posts and labeled the results. Then, they followed the major depressive disorder diagnosis guideline to count posts within a two-week period.

4.2 Lexicons

Initially, we applied the SRRS and the Daily Hassles Scale separately to collect a set of seed event terms as stressful life event terms, and then we combined the two scales. We obtained 135 stressful life event terms divided into 5 categories and 46 topics. We then designed a 5-point scale questionnaire and collected 106 valid responses to calculate the average score for each event term. Table 4 shows example stressful life events. A higher stress severity score represents higher pressure. For example, the stress severity score of “loss of job” is 4.49, which means it causes high pressure. “Quitting school” has a lower score of 2.93, since students who quit a school can start a job or transfer to another school.

Table 4

Examples of two-layer stressful life events
Category	Topic	Stress Severity Score
Family	Interaction between Family Members	3.51
	Change of Family Members	3.71
	Marital Problems	3.72
	Emotional Problems	3.60
	Get Married	3.21
	Family responsibilities	3.17
	Move House	2.96
	Quarrel	3.88
Campus	The death of a close friend	4.36
	Study problem	3.18
	Quit School	2.93
Medical	Not enough rest	3.64
	Care about medical problems	3.27
	Personal injury or illness	3.75
	Procreation	3.65
	Drug problems	3.13
Job	Outstanding Individual Achievement	3.07
Job	Lose Job	4.49
Life issues	Economic issues	3.40
	Smoking interference	3.46
	Fear of being rejected	3.62
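The averaging step behind the stress severity scores can be sketched as follows. The ratings and term names are made up for illustration, and computing a topic score as the mean of its term averages is an assumption, since the text does not say whether topics are averaged over terms or over raw responses.

```python
from statistics import mean


def term_scores(responses):
    """Average the 5-point questionnaire ratings collected for each event term."""
    return {term: mean(ratings) for term, ratings in responses.items()}


def topic_score(topic_terms, scores):
    """Assumed aggregation: a topic's severity is the mean of its term averages."""
    return mean(scores[term] for term in topic_terms)


# Hypothetical ratings from three respondents for two terms in one topic.
responses = {
    "lose job": [5, 4, 5],
    "laid off": [4, 4, 4],
}
scores = term_scores(responses)
print(round(topic_score(["lose job", "laid off"], scores), 2))
```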

Table 5 shows the number of terms in each of the four depression lexicons: stressful life event, negative emotion, symptom, and negative behavior. The stressful life event lexicon was extended to 1202 terms.

Table 5

The number of terms for four types of depression lexicons
Lexicons	Stressful life event	Negative emotion	symptom	Negative behavior
Number of terms	1202	1316	177	73

4.3 Model evaluation

The purpose of this experiment was to estimate the weight of each feature function. This study used 10-fold cross-validation to determine these weights, which are then used to assess the effectiveness of the model. To assess the effectiveness of the Hybrid Depressive Mood Analysis (HDMA) model, we compare the basic DSM-IV-TR diagnosis model with the proposed model.

The above experiment shows that our proposed hybrid depression analysis model, which combines the Negative Emotion Evaluation model (NEE) and Stressful Life Event Driven Depression Analysis model (SLEDA), effectively analyzes the depression tendency of a web post. We compare the performance of the proposed models and the DSM-IV-based depression tendency evaluation model.

The Negative Emotion Evaluation model (NEE) [25] proposed in our previous work was the first attempt at analyzing the depression tendency of bloggers by calculating the negative emotion score of their blog posts. The Event-Driven Depression Tendency Warning model (EDDTW) [26] enhanced the negative event lexicon by using three methods to extract negative events more efficiently. The DSM-IV-based depression tendency evaluation method uses the nine diagnosis criteria of major depressive disorder to analyze the depression tendency of a blogger from their blog posts within a two-week period.

Table 6 shows the evaluation results based on the metrics of precision, recall, and F-measure. The Hybrid Depressive Mood Analysis model obtains the best recall and F-measure, at 0.716 and 0.680 respectively. The main reason is that the hybrid model can analyze a post whether or not a negative event is present in it.

Table 6

Performance comparison of depression tendency analysis models
Model	Precision	Recall	F-measure
DSM-IV based model	0.666	0.571	0.614
NEE	0.613	0.494	0.547
EDDTW	0.593	0.668	0.624
SLEDA	0.638	0.656	0.645
HDMA	0.649	0.716	0.680
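As a quick consistency check on Table 6, the F-measure can be recomputed from the reported precision and recall using the standard harmonic mean $F = 2PR/(P+R)$. The sketch below assumes that definition. The recomputed values agree with the reported column to within about 0.005; the small gaps may come from averaging over cross-validation folds, though the text does not say.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from Table 6.
TABLE6 = {
    "DSM-IV based model": (0.666, 0.571, 0.614),
    "NEE": (0.613, 0.494, 0.547),
    "EDDTW": (0.593, 0.668, 0.624),
    "SLEDA": (0.638, 0.656, 0.645),
    "HDMA": (0.649, 0.716, 0.680),
}

for model, (p, r, reported) in TABLE6.items():
    print(f"{model}: recomputed {f_measure(p, r):.3f}, reported {reported:.3f}")
```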

5.1 Stressful life event lexicon

In this study, we combined two major scales from the field of psychology, the Social Readjustment Rating Scale and the Daily Hassles Scale, to identify major stressful life events. In total, 5 categories, 46 topic groups, and over 1200 stressful life event terms were extracted. The stressful life events help us identify event terms in a post more accurately and improve model performance. For example, in Table 4, we generate one new topic called “Change of Family Members” by merging 3 items from the Social Readjustment Rating Scale (“Death of a close family member”, “Gaining a new family member”, and “Major change in number of family get-togethers”) and 2 items from the Daily Hassles Scale (“Thoughts about death” and “Decisions about having children”). We used these 5 items to extract 32 stressful life event terms, and the average score for this topic is 3.71.

5.2 Correct detection case study

This study demonstrates the effectiveness of using stressful life events to detect depression tendency in each post, which improves the evaluation performance. We provide correct and incorrect examples of the Hybrid Depressive Mood Analysis model in the tables below. Table 7 shows a correctly detected (true positive) example. In this example, stressful life event, negative emotion, and symptom terms were extracted. The event extracted was “選課/select (school) course.” The stressful life event score for this event is 3.18, which marks it as a high-risk event. Four negative emotion terms were identified: “焦慮/anxiety”, “煩惱/worry”, “沉重/heavy”, and “緊張/nervous”. Eight symptom terms were identified, including “腰酸背痛/back pain”, “動作遲緩/slow movement”, “呆滯/sluggish”, and “好累/tired.” The first paragraph of the post describes the relationship between depression and symptoms. The author thinks that she needs to go see a doctor instead of taking medicine. The second paragraph describes her school life. Selecting courses before the semester makes her nervous, and her solution is to write the plan down and then carry it out. The event, emotion, and symptom terms appear close to one another in the post, so our HDMA model predicts a positive result.

Table 7

True positive example
(Chinese post) 每天都很焦慮煩惱未來該怎麼過, 腦袋一直有人跟我對話, 討論跟一直想同樣的問題想不出答案, 鑽牛角尖, 覺得好累腦袋沒辦法休息, 腦袋想到全身緊蹦, 沒辦法放鬆, 只有泡熱水澡跟早上醒來時情況會好些。一整天下來到晚上又會覺得身體很沉重、很累。憂鬱症會導致腰酸背痛不舒服感嗎?反應動作遲緩、呆滯、生理期不正常等情況讓我很困擾。關於焦慮, 我自己適合的方式是書寫, 把問題寫下來。舉例算了。1.選課(加退選沒結束前, 選課是一場戰爭!)我會寫: 選課讓我很緊張, 每天想選課就好了, 我還要做什麼? 1.看看自己能選的課程內容, 看有沒有喜歡的2.做一個「未來課表」3.問問學長姐然後我一條條去做了。4.我真做了一個未來課表, 附註上已確定或是待處理。5.遇到學長姐就聊聊課程, 諸如老師風格, 上課內容, 進一步可以想想適合自己嗎? 於是做了之後, 我整個可以一邊納涼, 因為我都弄好了, 只要打開自己的紀錄, 一目了然, 就不會太緊張, 馬上可以知道自己要怎麼做。
(English translation) Every day I worry about the future and I am anxious. I have conversations in my head with myself but I cannot seem to find any answers. I feel exhausted, unable to rest or relax. Only when I take hot bubble baths or when I first wake up in the morning do things feel better. All day long my body feels very heavy, and by evening I am drained. Does depression lead to back pain and discomfort? Delayed reactions, sluggishness, and irregular menstruation have troubled me a lot. Regarding anxiety, my way of solving it is to write down the problem. For example, registering for classes makes me very nervous. Aside from thinking about classes every day, what else do I need to do? 1. Look at the courses I am eligible for to see if there are some that interest me. 2. Make a "future class schedule." 3. Ask students in higher grades and follow their suggestions. 4. I made a future class schedule and indicated if I was registered or waitlisted. 5. Talk to others who have taken the course and ask about it, such as teaching style, class content, and decide whether I am a good fit. Afterwards, I feel better because I am ready. I will not be too nervous because I have a game plan of what to do.

5.3 Incorrect detection case study

Table 8 shows a false positive example. The model predicts a depressive mood when the post is actually not depressive. In this example, all four types of depression factors were extracted. The event “寫作業/do homework” was extracted, which consists of the two words “寫/write” (VC) and “作業/homework” (Na). The stressful life event score for “do homework” is 2.93. Two negative emotion terms were identified: “嗚咽/whimper” and “痛苦/pain.” Two symptom terms were identified: “壓力/stress” and “睡眠/sleep.” One negative behavior term was extracted: “自殺/suicide.” Because of the depression factors extracted from this example, the HDMA model predicted the depressive mood as positive. Nevertheless, the encouraging sentences at the end of the example in Table 8 are important for the expert in determining that this example is not depressive content.

Table 8

False positive example
(Chinese post) 今天雨好大, 只有十幾個人來上課啊...不過早上還是很崩潰對著電話的那頭嗚咽說:我要在公車上表演自殺了!事情太多, 卻沒有力氣去做完的壓力真的好大想到暑假要去蠻荒地區, 得挨好幾針又覺得好痛好煩。看來六月需要痛苦的奮鬥, 睡眠離我越來越遠了, 連夢裡都在寫作業... >"<
(English translation) It is raining heavily today and only a dozen or so classmates attended class. This morning I felt like I was falling apart, telling the person on the other end of the phone that I wanted to act out a suicide on the bus! So much to do, but not enough energy to complete them. I feel a lot of pressure. Thinking ahead to summer, about how I will go to the countryside (an undeveloped area) and need a lot of shots, I feel pain and annoyance. Looks like June will be a painful struggle. Sleep seems a lot more distant to me, I am even doing homework in my dreams. >"<

Table 9 shows a false negative example. This post was annotated as having a positive depression tendency. Stressful life event, negative emotion, and negative behavior terms were extracted in this example. One stressful life event term is “哭/cry”, with a stressful life event score of 3.51. Another life event is “抓傷自己/scratched herself”, which is scored as 3.75. The author said that she could not cry when she lived with her parents and that she also has self-harming behavior. Nevertheless, after leaving home to study, she feels better after crying. In this sample, the word distance between the stressful life event and negative emotion terms was longer than average, which reduces their effect in the proposed HDMA model. The higher stressful life event score did not increase the depression tendency score in the final result. As a result, the depression tendency score fell below the threshold and did not match the correct label.

Table 9

False negative example
(Chinese post) 好像很多人都很生氣很憤怒? 是不是壓抑憤怒也是造成憂鬱症的原因之一呢? 我知道我自己是從小就很憤怒我父母很不喜歡我哭的就算是現在在電話裡面跟他們哭他們也是很不耐煩的要我別哭而行為僅止於抓傷自己是因為留下疤痕,以後當總統的時候會不好看囧可是來美國唸書自己住以後我都可以哭到爽耶而且這種大哭過後感覺很不錯咧^_^
(English translation) Seems like many people are angry or furious? Isn’t repressed anger also one of the causes of depression? I was easily angered since childhood. My parents do not like me crying. Even now they are very impatient and do not want me to cry when I talk to them on the phone. They tell me that if I scratch myself out of frustration, it will leave scars, which will look bad in the future when I become president. 囧 But after coming to the US to study, I can cry as much as I want. I feel pretty good after crying. ^_^
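The damping effect described for Table 9 can be illustrated with a minimal sketch. The weighting rule below (full weight up to the average word distance, then inverse decay) and the distances 10 and 40 are assumptions made only for illustration; they are not the scoring formula of the HDMA model. Only the event score 3.75 is taken from the example.

```python
def event_contribution(event_score, word_distance, avg_distance):
    """Hypothetical distance weighting for a stressful life event score.

    The event keeps its full score while the negative emotion term is no
    farther away than the average distance; beyond that, the contribution
    shrinks in proportion to avg_distance / word_distance.
    """
    if word_distance <= 0 or avg_distance <= 0:
        raise ValueError("distances must be positive")
    if word_distance <= avg_distance:
        weight = 1.0
    else:
        weight = avg_distance / word_distance
    return event_score * weight


if __name__ == "__main__":
    # 3.75 is the score of "scratched herself"; both distances are made up.
    print(event_contribution(3.75, word_distance=10, avg_distance=10))
    print(event_contribution(3.75, word_distance=40, avg_distance=10))
```

Under such a rule, a high event score placed far from any negative emotion term contributes only a fraction of its value, which is consistent with the final score staying below the threshold in this case.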

5.4 Depression tendency trend

After we analyze a sequence of posts over a two-week period, we can draw a depression tendency trend for each author. The trend shows the changes in depressive mood over this timeframe, which might be helpful for doctors or psychologists when they are making an assessment. The two-week duration is taken from the DSM-IV-TR description of major depressive disorder. We calculate the average score from the golden sample, resulting in a threshold of 0.4. Figure 4 shows an example in which the depression score changes over a period of two weeks. A doctor can use this trend to understand the changes in a patient's depression.
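The trend construction can be sketched as follows. This is a minimal illustration, assuming each post has already been assigned a depression tendency score; the function names, the sample dates and scores, and the choice of treating a score equal to the threshold as "above" are all hypothetical. Only the 14-day window and the threshold of 0.4 come from the text.

```python
from datetime import date, timedelta
from statistics import mean

THRESHOLD = 0.4               # value reported in the text
WINDOW = timedelta(days=14)   # two-week period


def threshold_from_gold(gold_scores):
    """Average score of the golden sample, used as the decision threshold."""
    return mean(gold_scores)


def two_week_trend(posts, end_day, threshold=THRESHOLD):
    """Return (day, score, above_threshold) for posts in the 14 days ending on end_day.

    posts is an iterable of (date, score) pairs; the result is sorted by date.
    """
    start = end_day - WINDOW
    in_window = sorted((d, s) for d, s in posts if start < d <= end_day)
    return [(d, s, s >= threshold) for d, s in in_window]


if __name__ == "__main__":
    # Hypothetical scores for one author; not data from the paper.
    posts = [
        (date(2022, 1, 1), 0.20),
        (date(2022, 1, 10), 0.50),
        (date(2022, 1, 20), 0.45),
    ]
    for day, score, flagged in two_week_trend(posts, date(2022, 1, 20)):
        print(day.isoformat(), score, "above" if flagged else "below")
```

Plotting the returned scores against their dates, with a horizontal line at the threshold, would give a curve of the kind described for Figure 4.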

5.5 Limitations

The proposed HDMA model can only analyze text content. However, emoticons are increasingly used in web posts; some examples are Λ, 囧, @@, >_<, and ^_^. A mapping table for symbol and emoticon terms needs to be prepared so that the performance of depression tendency analysis can be improved in the future. We combined the life event terms from two stress-related scales to extract stressful life events. However, not all stressful life events were included in this study. We merged some events into one group and calculated the average score, but this averaging may introduce differences from the original scale scores. Finally, not all synonym terms were included in the stressful life event lexicon, which reduced the performance.
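A mapping table of the kind mentioned above could look like the following minimal sketch. The table name, the function, and above all the polarity assigned to each emoticon are assumptions for illustration; the study does not define such a table, and only a subset of the listed emoticons is covered.

```python
# Hypothetical emoticon-to-polarity table; the labels are illustrative guesses.
EMOTICON_TABLE = {
    "^_^": "positive",
    ">_<": "negative",
    "@@": "neutral",
    "囧": "negative",
}


def map_emoticons(text, table=EMOTICON_TABLE):
    """Strip known emoticons from text and return (clean_text, labels).

    Longer emoticons are matched first so that a short entry can never
    break up a longer one. Labels are grouped by emoticon, not ordered
    by position in the text.
    """
    labels = []
    for emoticon in sorted(table, key=len, reverse=True):
        count = text.count(emoticon)
        if count:
            labels.extend([table[emoticon]] * count)
            text = text.replace(emoticon, " ")
    return text.strip(), labels


if __name__ == "__main__":
    print(map_emoticons("感覺很不錯咧^_^"))
```

The returned labels could then be fed to the analysis as additional emotion terms, while the cleaned text goes through the existing text pipeline.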

6 Conclusion

This paper presents a Hybrid Depressive Mood Analysis (HDMA) model to automatically predict the depressive mood of web posts and possibly help bloggers or post authors to recognize depressive mood trends and even detect major depressive disorder in advance. The proposed model analyzes the post content for stressful life events and negative emotions. Our proposed method combines the Social Readjustment Rating Scale and the Daily Hassles Scale, merging similar items, to extract stressful life event terms. We provide a way to quantify the stress severity score of a stressful life event. The experimental results show that our proposed model obtains over 70% precision.

Declarations

Ethical Approval and Consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data and materials

The web posts were collected from the PTT/Prozac forum.

Competing interests

The authors declare that they have no competing interests.

Funding

Not applicable.

Authors' contributions

This paper adds two factors to predict authors’ depressive disorder, which are stressful life events and negative emotions.

We provide a way to quantify the stress severity score of the stressful life event.

The experimental results show that our proposed model obtains over 70% precision. Therefore, this study may contribute to early detection and early treatment.

References

American Psychiatric Association, Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition - Text Revision (DSM-IV-TR), Amer Psychiatric Pub, 2000.
林思恩，「東亞七國快樂指數香港倒數第二台灣人最不快樂」 [Happiness index of seven East Asian countries: Hong Kong second to last, Taiwanese the least happy]， http://www.gospelherald.com.hk/news/soc_1311.htm，2009。
G.L. Engel, “The need for a new medical model: a challenge for biomedicine,” Science, New Series, vol. 196, no. 4286, pp. 129-136, Apr. 8, 1977.
World Health Organization, “Depression,” https://www.who.int/health-topics/depression#tab=tab_1.
Freud, S., Strachey, J., Richards, A., “On Metapsychology: The Theory of Psychoanalysis”, Penguin Books, 1984, pp. 251-268.
A. Maslow, “The Farther Reaches of Human Nature”, New York, NY, USA: Viking Books, 1971, pp.318.
World Health Organization, “The International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10),” http://apps.who.int/classifications/apps/icd/icd10online, 2007.
財團法人董氏基金會 [John Tung Foundation]，「憂鬱情緒自我篩檢」 [Depressive mood self-screening]， http://www.jtf.org.tw/psyche/melancholia/overblue.asp
Karen Bruno, “Stress and Depression”, http://www.webmd.com/depression/features/stress-depression.
Wikipedia, “Stressor”, http://en.wikipedia.org/wiki/Stressor#cite_note-1
Sheldon Cohen, Ronald C. Kessler, and Lynn Underwood Gordon, “Strategies for Measuring Stress in Studies of Psychiatric and Physical Disorders”, Measuring Stress-A Guide for Health and Social Scientists 1995, Pp. 3-5.
Mykola Pechenizkiy, Jorn Bakker, Natalia Sidorova, “What’s your current stress level? Detection of stress patterns from GSR sensor data”, IEEE 11th International Conference on Data Mining Workshops 2011, Pp. 573-580.
P. G. Holeyannavar, S. K. Itagi, “Stress and Emotional Competence of Primary School Teachers”, Kamla-Raj 2013 Enterprise J Psychology 2012, Pp. 29-38.
Alan H.S. Chan, K. Chen, and Elaine Y.L. Chong, “Work Stress of Teachers from Primary and Secondary Schools in Hong Kong”, in International MultiConference of Engineers and Computer Scientists 2010, vol. III. Pp. 17-19.
Thomas H. Holmes and Richard H. Rahe, “The Social Readjustment Rating Scale”, Journal of Psychosomatic Research 1967, vol. 11. Pp. 213-218.
Ellen E. Pastorino and Susann M Doyle-Portillo, “What is Psychology second edition”, Cengage Learning 2008.
S. Cohen, T. Kamarck, and R. Mermelstein, “A global measure of perceived stress”, Journal of Health and Social Behavior 1983, vol. 24. Pp. 385-396.
Sybille Hildebrandt, “How stress can cause depression”, http://sciencenordic.com/how-stress-can-cause-depression.
Lin, H.Y. and Chen, H. H., Ranking reader emotions using pairwise loss minimization and emotional distribution regression, in Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2008, pp. 136-144.
Yang, C. H., Kuo, H. A., and Chen, H. H., Emotion Classification Using Web Blog Corpora, in Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence. 2007, IEEE Computer Society. p. 275-278.
Judit, B. I, An outsider's view on "topic-oriented blogging", in Proceedings of the 13th international World Wide Web conference on Alternate track papers. 2004, ACM: New York, NY, USA. pp. 28-34.
Hsu, C.W., and Lin, C. J., A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks. 2002, Vol. 13, pp. 415-425.
Yang, C. H., Kuo, H. A. and Chen, H. H., Building emotion lexicon from weblog corpora, in Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions. 2007, pp. 133-136.
Yang, C. H., Kuo, H. A. and Chen, H. H., Emotion Trend Analysis Using Blog Corpora, in Proceedings of the 19th Conference on Computational Linguistics and Speech Processing. 2007. pp. 205-218.
Leshed, G., and Kaye, J.J., Understanding how bloggers feel: recognizing affect in blog posts, in CHI '06 extended abstracts on Human factors in computing systems. 2006, pp. 1019-1024.
Tung C.-M., Lu W.-H., “Predict Depression Tendency of Web Posts using Negative Emotion Evaluation Model,” in ACM SIGKDD Workshop on Health Informatics 2012, vol. 2012.
Tung C.-M., Lu W.-H., “Analyzing Depression Tendency of Post Authors Using an Event-Driven Model”, in Artificial Intelligence in Medicine. 2016. Pp. 53-62.
Plutchik, R. and Kellerman, H., Emotion: Theory, research, and experience. Theories of emotion, pp. 3- 33, New York: Academic 1980, Vol.1.
