Computerized text and voice analysis - a quantitative single case study of seven chronically schizophrenic patients in art therapy

doi:10.21203/rs.3.rs-1753947/v1


Research Article


This work is licensed under a CC BY 4.0 License

Version 1


This explorative study of chronic schizophrenic patients aims to clarify whether group art therapy followed by a therapist-guided picture review could influence the patients' communication behavior. Characteristics of voice and speech were obtained via objective technological instruments and selected as indicators of communication behavior. Seven patients were recruited to participate in weekly group art therapy over a period of six months. Three days after each group meeting, they talked about their last picture during a standardized interview that was digitally recorded. The audio documents were evaluated using validated computer-assisted procedures, the transcribed texts using the German version of LIWC2015, and the voices using the audio analysis software VocEmoApI. The dual methodological approach was intended to form an internal control of the study results. An exploratory factor analysis of the complete sets of output parameters was carried out in the expectation of obtaining disease typical characteristics in speech and voice that map barriers to communication. The parameters of both methods were thus processed into five factors each, i.e., into a quantitative digitized classification of the texts and voices. The scores of the factors were subjected to a linear regression analysis to capture possible process-related changes. Most patients continued to participate in the study. This resulted in high quality data sets for statistical analysis. In answer to the study question, two results were summarized: A text analysis factor called presence proved a potential surrogate parameter for positive language development. Quantitative changes in vocal emotional factors were detected, demonstrating differentiated activation patterns of emotions. These results can presumably be interpreted as an expression of a cathartic healing process. 
The methods presented in this study make a potentially significant contribution to quantitative research into the effectiveness and mode of action of art therapy.

Trial Registration: ISRCTN12365070

Acronym: Linguistic and Voice inquiry of patients talking about their own Pictures (LiVoPict)

Keywords: art therapy, schizophrenia, communication disorder, speech and voice analysis, emotions

The discovery and publication of creative activities by psychiatric patients is considered a milestone in the history of art therapy. Von Spreti and Martius even refer to the "lunatic asylums" of the 19th and 20th centuries as its creative origins [1]. In addition to the unprecedented Heidelberg collection of the art historian and psychiatrist Hans Prinzhorn [2], the psychiatrist Leo Navratil was one of the first to incorporate pictorial material into psychiatric diagnosis and therapy [3]. The resulting revaluation of the mentally ill as artistic-aesthetic creators led to a hitherto unknown acceptance of such works by the public [4]. Ever since, the 'raw material' of the pictures has also exerted a great fascination on renowned artists. The French sculptor Jean Dubuffet accordingly coined the term Art Brut ('raw art') [5].

Regardless of the fascination that the creative work of psychiatric patients holds for many people, schizophrenia is a serious illness that affects the person as a whole and is accompanied by a profound disruption of the sense of self [6]. Social withdrawal, reduced activity, and an impoverishment in communication are typical features of the illness [7] and have traditionally been identified as the core or basic symptoms of the disease [8, 9]. In the current ICD-10 nomenclature (10th revision of the International Statistical Classification of Diseases and Related Health Problems), they are referred to as the group of 'negative symptoms' of the disease.

These negative or basic symptoms further form the core of a disease-typical communication barrier, showing up in a confusion of thoughts, peculiarities of language, and an impoverishment of speech [8, 9]. In addition, the clinical picture of schizophrenia is characterized by peculiarities in affectivity. On the one hand, there is anxiety, which in its disease-typical form above all hinders the approach to others; it also generates a high degree of agitation in the patients, which can appear as aggressiveness or suicidal tendencies. On the other hand, an inadequate affectivity (parathymia) belongs to the typical symptoms of this disease and is a major disruptive factor in any communication. Here, emotional expression and the content of what is said do not match at all. The disintegration of inner feeling and expression that underlies parathymia also finds its counterpart in the ambivalence typical of schizophrenia. Here, incompatible qualities of experience coexist without relation to one another. The sufferer can feel and express fear and happiness at once, or love and hate [9].

Furthermore, for Süllwold, depressive moods and "the predominance of displeasure-tinged emotions" represent an enormous hurdle in psychotherapy with the sick person [10]. Süllwold names the flattening of affect as another essential feature of the emotional changes, which also affects communication with schizophrenic patients [11]. According to Tölle, patients with the clinical picture of long-term schizophrenia also often show an "affective stiffness", and in extreme cases they appear indifferent and apathetic [9]. The psychiatrist Wing made a similar observation, describing the facial expressions of the sick as mask-like and the voice as monotonous [11].

In experimental studies, the psychologist Krause [13] was able to objectify the observations of Wing [12], Süllwold [11], and Tölle [9] on the flattening of affect by measuring facial affect expression. As a tool for this research, he used the Emotional Facial Action Coding System (EM-FACS) according to Ekman and Rosenberg [13] and Ekman and Friesen [14]. According to Krause's findings, a severe reduction of facial affectivity could be shown especially in male patients with paranoid-hallucinatory schizophrenia [13]. Krause attributes this to the fact that "real joy" is not detectable in the patient's facial expression [13].

In about a quarter of patients with schizophrenia, the disease progresses "unfavorably" [9]. This means that the episodes of illness either lead to a schizophrenic residuum after years or decades (ICD-10: F 20.5) and the illness is essentially characterized by the spectrum of basic symptoms [9]. Alternatively, a chronic course begins with the initial manifestation, in which at least no symptom-free intervals can be delineated. The course is described as chronic if the symptoms of the illness persist for more than two years without interruption. Chronic schizophrenia does not have its own ICD-10 code; it is defined in the treatment guidelines (S3 guideline Psychosocial Therapies for severe Mental Illness [S3 guideline PTMI]), among others [16]. Patients with a chronic course of illness and with residual schizophrenia are usually dependent on continuous medical and psychosocial support [16]. The people in this art therapy study belong to that quarter of patients with schizophrenia in whom the course of the illness is "unfavorable" [9].

The limitations typical of the disease, which were pointed out at the beginning, mean that many of the patients suffering from schizophrenia have only a limited ability to act in a purely verbal form of therapy.

Due to their symptoms and the associated communication barrier, they have difficulty concentrating on their counterpart in conversations and deciphering word meanings. They also have problems blocking out external and internal stimuli [10]. The seemingly unsolvable problems in conversation with the mentally ill can be overcome in art therapy through a special form of communication. After the patient has created his or her picture in a first step, it is set up with a certain distance for the subsequent interview of the patient by the art therapist. In contrast to a classical conversation situation, both the patient and the art therapist direct their gaze towards the picture. In this way, the created picture becomes an "indirect signal" for both viewers [17] and has a decisive function for the art-therapist guided reflection on the picture: "It is precisely the indirect way of communicating that is often the possibility for mentally ill people to contact others in a protected mode [author's emphasis]" [17].

This obvious advantage of a 'protected communication' model over purely verbal psychotherapy was also recognized by the psychiatrists Peciccia and Benedetti [18]. In their clinical practice, they found that verbal psychotherapy was not possible for some chronic schizophrenic patients, while they could express themselves through painting without problems [18]. From this, they developed the idea of conducting psychotherapeutic communication with the patient via the drawn picture and called this method the "Progressive Therapeutic Mirror Image" [18]. Peciccia and Benedetti [18] emphasize the following as the main findings of their method: Firstly, verbal communication was restored in all six chronically schizophrenic patients after about three to six months. Secondly, all six patients communicated mainly verbally with the therapist in the subsequent psychotherapy [18]. Betensky [19] and Dannecker [4], among others, chose a different therapeutic approach to overcoming the communication barrier. The psychologist and art therapist Betensky [19] emphasizes "phenomenological intuition" in the discussion of images as the key to her structured method [19]. At its core is the elaboration of the individual perception or the individual meaning that the picture has for the patient from their own perspective [19]. Likewise, for the art therapist Dannecker [4], a phenomenological description of the picture "is the least fearful verbal approach to the pictures. The diversity of structures and pictorial elements, their connections, the formal structure, and contents can be named without expressing judgement" [4]. The phenomenological method of viewing images therefore seems to be particularly suitable for psychiatric patients, as they are prone to anxiety and usually experience verbal psychotherapy and confrontation with gazes as a threat.

For the treatment of schizophrenia, the S3 treatment guideline Schizophrenia [20] recommends a "combination therapy" [20]. This provides for an overall treatment plan based on three pillars, pharmacotherapy, psychotherapy, and psychosocial interventions. Psychosocial interventions also include artistic therapies [20]. According to the S3 treatment guidelines for schizophrenia [20], artistic therapies include art, music, dance, and movement therapy as well as theatre and drama therapy [20]. Art therapy is thus integrated into the artistic therapies in the guidelines [16, 20]. For patients with a clinical diagnosis of schizophrenia, art therapies are used in outpatient clinics, day clinics, in full and partial inpatient psychiatric rehabilitation, forensic psychiatry, and integration assistance [16]. The current S3-guideline schizophrenia [20] recommends the use of art therapies under the "recommendation grade B", i.e., with a "should" recommendation as part of an overall treatment plan "to improve the psychopathological symptomatology" [20]. Art therapy is thus established as part of the complementary therapy offer in those areas of medicine in which the recovery of patients' psychological resources is of particular importance. However, this offer is not yet based on objective evidence of therapeutic effectiveness, but essentially based on a no less significant resonance among patients and their clinical or social environment. The quality of previous attempts to prove scientifically the therapeutic effectiveness of art therapy in schizophrenia is assessed differently both in the S3 guideline [5, 20] and by the authors of the guideline of the National Institute for Health and Clinical Excellence (NICE [22]). The only consensus is that, based on the studies, the therapeutic potential for artistic therapies and for art therapy in particular tends to lie in influencing negative symptoms [21]. 
The most recent studies on the effectiveness of art therapy in schizophrenia were reported by Ruiz et al. in 2017 [23]. The authors cite the Epistemonikos database (https://www.epistemonikos.org), a central database that is updated from many other scientific databases, as a source. Thus, no more than five systematic reviews could be identified on this topic: Apotsos [24], Attard and Larkin [25], Van Lith [26], Ruddy and Milnes [27], and Maujean, Pepping and Kendall [28]. According to Ruiz et al. [23], these reviews include four Randomized Controlled Trials (RCT) that have been classified as scientifically usable in the sense of evidence-based medicine (EBM). In addition, three outpatient art therapy studies are mentioned. These include the work of Green, Wehling, and Talsky [29], in which 47 patients were treated from April to September 1980. The patients were divided into two groups, one receiving art therapy and the other serving as a control group and receiving standard treatment (treatment-as-usual [TAU]) in the form of a 20-minute talk. As a result of this study, Green et al. [29] reported a significant improvement in social skills and self-esteem in the patients who received art therapy. Also in an outpatient setting, Richardson, Jones, Evans, Stevans, and Rowe [30] studied 90 patients with chronic schizophrenia in England who had been ill for an average of 13 years.

Of these, 43 patients received art therapy for 90 minutes over 12 weeks and the control group (47 patients) received standard treatment in the form of a conversation (TAU). In this study, a significant decrease in negative symptoms was shown for the patients receiving art therapy [30]. The third outpatient art therapy research project presented is the large-scale multicenter MATISSE study (Multi-centre study of Art Therapy In Schizophrenia - Systematic Evaluation) by Crawford et al. [31]. Here, the effect of outpatient art therapy was examined in 417 schizophrenic patients on the global level of functioning (Global Assessment of Functioning Scale; GAF) as well as on negative and positive symptoms (Positive and Negative Syndrome Scale; PANSS). The results of the study did not indicate any statistically significant improvement in the patients' health status. A high dropout rate in therapy participation of up to 40% was cited as the cause [16]. Finally, for the first time in Germany, an interdisciplinary team from the Weißensee School of Art and the Charité in Berlin researched the effectiveness of art therapy in hospitalized schizophrenic patients in the acute or subacute phase. Primary target parameters were positive and negative psychotic and depressive symptoms (SANS and PANSS) as well as the general level of functioning (GAF). The study results were published in 2014 by Montag et al. [32] and show positive results regarding positive and negative symptoms [16, 32]. The authors of the current PTMI S3 guideline also base their assessment on these four studies [16]. The authors of the studies that Ruiz et al. [23] list as the current state of science are evidently largely aware of the methodological weaknesses mentioned, which in their studies stand in the way of a successful proof of the effect of art therapy in patients with schizophrenia. Accordingly, they are cautious in their assessment of their own study results.
Ruiz and co-authors [23] conclude that a therapeutic effect of art therapy on the symptoms of existing schizophrenia is not clearly demonstrable [23]. The guidelines of the DGPPN (Deutsche Gesellschaft für Psychiatrie und Psychotherapie, Psychosomatik und Nervenheilkunde) also summarize their current assessment of the evidence for the effectiveness of art therapies, saying, "It remains to be stated that further studies are urgently needed to place the evidence in these areas on a broader basis and to make positive effects from smaller studies replicable" [16].

The art-therapeutic approach of this research project addresses the symptoms of language impoverishment and affective peculiarities that lead to the described disruption in communication with people suffering from schizophrenia. These are to be overcome with the described range of methods: the protected mode within the art-therapeutic triad, consisting of the interaction of patient, therapist, and picture, and the joint picture reflection specially adapted to these patients. However, the study presupposes a standardization of the picture discussion. Thus, the 'therapist-guided picture talk' evolves into a standardized guideline-based interview entitled Therapist-Guided Picture Reflection (TGPR).

This is the basis for the explorative approach of this study as a quantitative single case analysis without baseline (Single-Subject Research design [SSR]; see Petermann [33], Julius, Schlosser and Goetze [34], and Riley-Tillman and Burns [35]) by means of new quantitative research instruments that have never been used in art therapy research before and are therefore tested for their suitability for such studies. These are two automated computer-assisted text and voice analysis methods that use validated parameters to describe quantitatively the phenotypes of a speech or voice sample in a spectrum of differentiated categories. For the detection of process-related changes, they are used in such a way that the individual linguistic and vocal phenotypes become recognizable in their dynamics during therapy. This dual study approach with digitized interviews enables correlations between the two parts of the recorded speech samples and forms the internal control in the study approach.

In summary, the chosen study design essentially maps three processes: the individual participation behavior of the patients during the study (adherence), the process-related changes in target parameters in a temporal perspective, and the psychological process about the quality and robustness of the two study instruments LIWC2015 and VocEmoApI.
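The second of these processes, change in a target parameter over time, is captured in this study by a linear regression of factor scores. As a rough illustration only, the following sketch fits an ordinary least-squares line to one participant's scores across sessions. The function name `linear_trend` and the example values are hypothetical and are not study data; the study's actual statistical software and model specification are not stated in this section.

```python
from statistics import mean


def linear_trend(sessions, scores):
    """Return (slope, intercept) of an ordinary least-squares line.

    sessions: session numbers (the time axis), e.g. 1, 2, 3, ...
    scores:   one factor score per session for a single participant.
    """
    if len(sessions) != len(scores) or len(sessions) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(sessions), mean(scores)
    # Sum of squared deviations of the time axis.
    sxx = sum((x - x_bar) ** 2 for x in sessions)
    if sxx == 0:
        raise ValueError("session values must not all be identical")
    # Sum of cross-deviations between time and score.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sessions, scores))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical factor scores for one participant (illustration only).
example_sessions = [1, 2, 3, 4, 5]
example_scores = [-0.8, -0.3, 0.1, 0.4, 0.6]
slope, intercept = linear_trend(example_sessions, example_scores)
print(f"slope per session: {slope:.2f}, intercept: {intercept:.2f}")
```

A positive slope would indicate that the score tends to rise over the course of therapy. In a single-case design such a fit is computed separately for each participant, so no pooling across patients is assumed here.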

2.1. Ethical statement

The art therapy study was approved by the management of the privately run Marienheim, a social therapy facility for the mentally ill. The written project permission was received on 30 March 2016. The written, informed consent of all seven study participants was also available before the start of the study. The legal guardians of these seven patients were also informed about the study participation of their wards and made the consent forms legally binding by signing them. After completion of these formal requirements, the author drew up the final list of study participants, which was signed by the two treating psychiatrists. They endorsed the art therapy envisaged in the study and described it as a 'medically indicated therapeutic measure' integrated in the overall treatment plan. Furthermore, the Ethics Committee of the University of Augsburg confirmed the ethical acceptability of the art therapy research project in its statement of 11 May 2017. The study was registered at www.isrctn.com (identifier: ISRCTN12365070; https://doi.org/10.1186/ISRCTN12365070).

2.2. Sampling

Seven patients (n = 7) aged between 46 and 62 years were included in the explorative single case study (SSR in an isolated B design) over a period of six months (05/2016 to 10/2016). They were inpatients at the Marienheim facility in Peiting and suffered from long-term schizophrenia. The patients of the Marienheim are people of all genders who suffer from different, partly combined mental illnesses that are classified in the ICD code. Their stay is under a judicial accommodation order according to §1906 BGB and they have legal care. The construction of the sample from the total population of the home for this art therapy study resulted primarily from the specific objective. Secondarily, the complete lack of third-party funding for this research project also played a role, albeit a subordinate one.

The Marienheim was occupied by 69 patients in February 2016 (according to the resident list of 12 February 2016). The selection of the sample took place under inclusion and exclusion criteria in cooperation with both attending consultant physicians. One of them was a specialist in neurology and psychiatry from an independent practice and the other was a specialist from a psychiatric outpatient clinic (PIA); they were both commissioned by the Marienheim facility.

Inclusion criteria: Diagnosis of Paranoid Schizophrenia (F20.0) with chronic course of at least ten years or/and Schizophrenic Residual (F20.5), stable medication of psychotic symptoms for at least three months, preference towards artistic therapies (art therapy), and the presence of written informed consent. Exclusion criteria: Primary addictive disorder, acute accessory symptoms, suicidality, intelligence reduction, brain organic disorder, autistic disorder, personality disorder, visual impairment, language barrier, hemiparesis including speech disorder, moving out soon, or participation in another creative therapy (in group or/and individual therapy).

Based on the above 'inclusion and exclusion criteria', 15 patients were selected as suitable for the sample with the treating psychiatrists on 09 April 2016. Thereupon, the author informed the 15 patients about the details of the study in individual interviews, offering them the opportunity to participate in the study. The obligatory 'information and consent form' was also discussed in detail and handed out if required. Of these 15 patients, eight agreed to participate in the study. After a reflection period of several days, seven patients signed the 'information and consent form'. One study participant (P7) asked to participate in the study group but did not allow audio recording of the interviews.

The list of study participants signed by the psychiatrists (not anonymized in the original) is shown in Table 1 below with year of birth, gender, diagnosis(es), and date of entry into the Marienheim. In the further course of the study, the study patients were named exclusively as P1, P2, P3, P4, P5, P6, and P8.

Table 1

List of study participants (anonymized).

Patient	Birth year	Gender	Diagnosis	Move-in
P1	1967	M	paranoid schizophrenia (F20.0); mental behavioral changes due to alcohol (F10.3)	2015
P2	1958	M	paranoid schizophrenia (F20.0); schizoaffective disorder (F25.1)	2015
P3	1970	F	paranoid-hallucinatory schizophrenia (F20.0)	2014
P4	1963	M	schizophrenic residual (F20.5)	2002
P5	1956	F	paranoid-hallucinatory schizophrenia (F20.0)	2014
P6	1954	M	paranoid schizophrenia (F20.0)	2015
P8	1965	M	paranoid schizophrenia (F20.0); schizophrenic residual (F20.5)	2015

M: male; F: female.

2.3. Procedure

The course of the individual study phases, including study interruptions and dropouts with their reasons, is shown in the CENT-2015 Flow Diagram in Figure 1.

2.5. Art therapy intervention

2.5.1. Picture creation

Group art therapy with image creation took place once a week, always on Fridays from 10.00 am to 11.30 am, over a period of six months (05/2016 to 10/2016) with up to 20 possible sessions. The group was led exclusively by the author for the entire study and took place in the facility's creative room. The art therapy appointments were fixed in the therapy plan as well as in the calendar of the residential group. All the materials needed for the artistic work were available for each task or topic; in detail, these were acrylic and watercolor paints; oil and pastel crayons; colored, felt-tip, lead, and neon pencils; brushes of various thicknesses; palettes and water glasses; and various adhesive materials (e.g., glue and tape). In view of the diagnoses of the study participants, "solid materials" such as wax and colored pencils and collage materials were offered at the beginning [37]. Painting and drawing paper, cardboard in various colors and thicknesses, primed canvases or painting cardboard, and watercolor paper served as picture supports. Boxes with photographic material were available for the collages. In the first art therapy session, in addition to the timetable and schedule, the most important rules were explained and repeated as needed. These rules included that art therapy is not about creating works of art, that there is no 'right' or 'wrong', that the tasks and topics are for orientation and not binding, and that the materials offered can be used freely, according to individual expression needs. The group sessions of the study followed a fixed structure with regard to both the timing and the content.

The author moderated the three phases, consisting of the opening round, picture design, and closing round, according to Schrode [38] and von Spreti [37]. In the opening round, which lasted about 15 minutes, each participant was given the opportunity to talk about how they were feeling at that moment. Especially at the beginning of the art therapy study, when the participants did not yet know each other well, it was the author's task to create a trusting atmosphere within the art therapy group [37]. The author then set the tasks and themes of the day for creating the pictures. The tasks and themes were a central element in the art therapy study approach. The empirical findings of the art therapist Landgarten [40] formed the basis for the tasks and themes. In a group working on a long-term basis and thus 'experienced', the identification of themes also took into consideration current events, moods, or wishes expressed by one or several patients during the project [39]. The tasks and themes for the creation of the pictures are described in detail under A.1 (Appendix).

The pictures created by study participants in art therapy were placed in a personalized collection folder after drying. At the end of the study (beginning of November 2016), the images were also digitally photographed, anonymized, and archived on a hard drive. After the end of the study, study participants received their original pictures back.

2.5.2. Picture reflection

The Therapist-Guided Picture Reflection (TGPR) took the form of a guideline-based interview. Although the guided interview belongs to the "semi-structured forms of data collection for obtaining verbal data" [41], it was used as a structured form of data collection in this research. This was because the questions in the guide provide the necessary structure in the sense of a standardized course of the survey. At the same time, this form of interview left enough space for 'free speech', which was required here to obtain sufficiently large speech and voice samples. Furthermore, the standardization of the interview procedure established a certain comparability of the interviews during the study, specifically for intra-individual comparisons of each participant's speech and voice over the course of six months and for inter-individual comparisons across the study group. The structure of the TGPR (see Table A.2) was based on four phases according to Misoch [41]: the "information phase", the "warm-up and introduction phase", the "main phase", and the "fade-out and conclusion phase" [41].

It was necessary to create a catalogue of questions that, despite its stereotyped form, made the guided interview an event in which it would be possible, from image to image, to build up in the same way the essential emotional arc of tension, the actual therapeutic moment of the interview. When searching for suitable questions and their sequence, the disease-related limitations of the study participants had to be a primary consideration.

For the construction of such a catalogue of questions, it was possible to draw from a wealth of scientific foundations and the author's own practical experience. Drawing on these sources in a goal-oriented way requires a method, and here the choice fell on the structured CCSS method (Collecting, Checking, Sorting, Subsuming) by Helfferich [42]. This process was accompanied by the doctoral colloquium of the Research Association of Artistic Therapies (RAT) and by the upper seminar of art education research at the Chair of Art Education at the University of Augsburg in the period from September 2015 to May 2016. In a brainstorming session, all significant questions and topics for the art therapy guiding questions were 'collected'. Next, the questions were 'checked' for their suitability for a guideline with chronic schizophrenic patients, for the setting, and for the research question. Subsequently, the questions were 'sorted' according to content and as to whether they were guiding questions or open-ended narrative prompts, maintenance questions, or specific follow-up questions. The content aspects and the prompt 'What is the question aiming at?' are shown in columns 2 and 5 of Table A.3. In a final step, the collected, checked, and sorted questions were 'subsumed' into a guideline of eight questions.

The thematic areas of the question catalogue were based on the art-therapeutic method of picture discussion from a phenomenological perspective according to Betensky [19], on the art-based approaches to picture viewing and discussion according to Stuhler-Bauer and Elbing [43], Bader [44], Dannecker [4], and Titze [45], and on the conversational style specially elaborated for psychotherapy with schizophrenic patients by Süllwold [10, 11]. In the interviews for data collection, the study participants talked about the images they had created. The conversation about the study patients' own pictures took place on the Monday or Tuesday afternoon of the week after the pictures were created, as individual therapy in the form of an interview with a maximum duration of 50 minutes. The author picked up the study participant from the agreed meeting point and went with them to the meeting room reserved for this purpose.

2.6. Research instruments

This art therapy study was directed at recording the vocal expression of emotion or emotionality and the speech and thinking style from the verbal and paraverbal speech signals of the study participants. To this end, the study used two computer-assisted research methods to analyze the digitized audio recordings, the computer-assisted quantitative text analysis Linguistic Inquiry and Word Count (LIWC2015) and the voice analysis software Vocal Emotion recognition by Appraisal Inference, the VocEmoApI technology (see Figure 2). Thus, a dual-automated, computerized analysis of the audio documents was achieved.

2.6.1. Range of target parameters of LIWC2015

The study interviews were conducted in German. To evaluate the recordings, the current software version of LIWC2015 according to Pennebaker et al. [46] was combined with a working version of the German LIWC2015 dictionary by Meier et al. [47]. The analysis took place in 2017/2018. The German DE-LIWC2015 dictionary consists of around 18,000 words and word stems [47]. In the same way as in the English version of LIWC2015, the recognized words can be assigned to several word categories [47]. The quality of the German LIWC2015 was examined by Meier et al. [47] in two studies: first by comparing the German translation (DE-LIWC2015) with the English original (ENG-LIWC2015) and then with the previous German version, DE-LIWC2001. The findings of these studies showed a high level of equivalence between the dictionaries [47].

The computer-assisted text analysis LIWC is a computer program that automatically examines written or transcribed verbal texts for features of the author in a formal, quantitative way. In doing so, the application focuses less on the content of human speech than on individual words and word stems, which are counted, assigned to defined word categories, and reported as a percentage of the text length [48]. The recognized words can be assigned to several word categories (so-called upper and lower categories) at the same time and consequently be recorded and counted several times. The word cried can therefore be assigned to the categories sad (sadness), affect (all affect words as an upper category), negemo (negative emotion), verb (verbs), and focuspast (past focus) [46].
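The counting principle described above can be sketched in a few lines. This is a simplified illustration and not the LIWC2015 software: the mini-dictionary `TOY_DICTIONARY`, the tokenizer, and the stem convention (a trailing `*` meaning "prefix match") are assumptions made for the example, and the real dictionary is far larger.

```python
import re

# Hypothetical mini-dictionary (assumption, not the LIWC2015 dictionary).
# A trailing "*" marks a word stem that matches any word with that prefix.
TOY_DICTIONARY = {
    "cried": {"sad", "negemo", "affect", "verb", "focuspast"},
    "happ*": {"posemo", "affect"},
    "i": {"i"},
}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-zäöüß']+", text.lower())


def categories_for(word, dictionary):
    """Collect every category whose entry matches the word."""
    cats = set()
    for entry, entry_cats in dictionary.items():
        if entry.endswith("*"):
            if word.startswith(entry[:-1]):
                cats |= entry_cats
        elif word == entry:
            cats |= entry_cats
    return cats


def category_percentages(text, dictionary):
    """Return each category's share of all words in the text, in percent."""
    words = tokenize(text)
    if not words:
        return {}
    counts = {}
    for word in words:
        # One word may count towards several categories at once.
        for cat in categories_for(word, dictionary):
            counts[cat] = counts.get(cat, 0) + 1
    return {cat: 100.0 * n / len(words) for cat, n in counts.items()}


result = category_percentages("I cried and then I was happy", TOY_DICTIONARY)
for category in sorted(result):
    print(f"{category}: {result[category]:.1f}%")
```

Because one word can feed both a lower category and its upper category, the percentages do not sum to 100, which is why overlapping categories need care in later statistics.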

For the recording and automatic evaluation of the verbal speech signals, a complete transcription of all individual interviews (full transcription) was necessary. The transcription rules were those from the LIWC2015 user manual according to Pennebaker [45] and the “German LIWC” according to Meier et al. [47].

The LIWC2015 text analysis produces 90 output parameters. For the present work, however, only 61 of these parameters were included; the remaining 29 could not be entered into the statistics because each represents an additional share of an upper- or lower-level LIWC2015 category that is already covered.

For the statistical analysis, it was first relevant how high the percentage of speech data is in an audio document. For this purpose, the LIWC output parameter Dictionary Words (Dic) was used, which gives the proportion of words that are recorded and evaluated by the LIWC2015 text analysis procedure. In addition, it was necessary to select the study-relevant parameter sets for the exploratory factor analysis. These included the three general descriptive variables: words counted per interview (WC), average sentence length in words per sentence (WPS), and percentage of words longer than six characters (Sixltr). Of the output parameters of the categories 'basic linguistic and grammatical parameters' (I) and 'psychological processes' (II), only those belonging to a subordinate category entered the exploratory factor analysis. Therefore, of the 'basic linguistic and grammatical parameters' (I), all subcategories with 18 language features were included. These were the five personal pronoun categories (i, we, you, shehe, and they) and impersonal pronouns (ipron) as well as the linguistic categories of articles (article), prepositions (prep), auxiliary verbs (auxverb), adverbs (adverb), conjunctions (conj), negations (negate), verbs (verb), adjectives (adj), comparative words (compare), interrogative words (interrog), number words (number), and quantifiers (quant).

Likewise, of the word categories for thematic content, 'psychological processes' (II), only the subcategories with 30 parameters were included in the exploratory factor analysis. These included two parameters relating to affective processes (posemo, negemo), four relating to social processes (family, friend, female, and male), six relating to cognitive processes (insight, cause, discrep, tentat, certain, and differ), three relating to perception (see, hear, and feel), and four relating to biological processes (body, health, sexual, and ingest). In addition, five parameters describing drives (affiliation, achieve, power, reward, and risk), three describing temporal orientation (focuspast, focuspresent, and focusfuture), and three describing relativity (motion, space, and time) were analyzed. Finally, six parameters for 'personal concerns' (work, leisure, home, money, relig, and death) and four for 'informal language' (swear, assent, nonflu, and filler) were included. In summary, 61 parameters from the LIWC2015 text analysis were included in the exploratory factor analysis.
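As a quick plausibility check, the parameter selection listed above can be tallied in a few lines. The group labels in this sketch are shorthand chosen for illustration and are not official LIWC2015 names; the parameter names themselves are those given in the text.

```python
# Parameters named in the text, grouped with illustrative (non-official) labels.
SELECTED_PARAMETERS = {
    "general descriptors": ["WC", "WPS", "Sixltr"],
    "linguistic (I)": [
        "i", "we", "you", "shehe", "they", "ipron", "article", "prep",
        "auxverb", "adverb", "conj", "negate", "verb", "adj", "compare",
        "interrog", "number", "quant",
    ],
    "psychological processes (II)": [
        "posemo", "negemo",                                          # affective
        "family", "friend", "female", "male",                        # social
        "insight", "cause", "discrep", "tentat", "certain", "differ",  # cognitive
        "see", "hear", "feel",                                       # perception
        "body", "health", "sexual", "ingest",                        # biological
        "affiliation", "achieve", "power", "reward", "risk",         # drives
        "focuspast", "focuspresent", "focusfuture",                  # time orientation
        "motion", "space", "time",                                   # relativity
    ],
    "personal concerns": ["work", "leisure", "home", "money", "relig", "death"],
    "informal language": ["swear", "assent", "nonflu", "filler"],
}

group_sizes = {group: len(names) for group, names in SELECTED_PARAMETERS.items()}
all_names = [name for names in SELECTED_PARAMETERS.values() for name in names]
total_parameters = len(all_names)
```

The group sizes of 3, 18, 30, 6, and 4 add up to the 61 parameters entered into the exploratory factor analysis.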

2.6.2. Range of target parameters of VocEmoApI

The voice analysis procedure used here is a further development of a software called sensAI (sensitive Audio Intelligence), a technology for the computer-aided identification of affective speaker states from human speech or voice. The following description of the software is based on the working manual "sensAI WebAPI Documentation - Version 1.3 from 5.12.17" [49]. The sensAI technology analyzes the input audio files and scales the automated evaluation of the output variables from speech or voice as well as emotional content. It recognizes the human voice with the help of "voice activity detection" (VAD) and can also distinguish it from loud background noises (e.g., street noise). The speech signals of the speaker are decoded with respect to characteristic personal features (personality, age, and gender), emotional speaker states (emotional categories such as joy, anger, fear, etc.), speech activity (duration, speech tempo, etc.), and speech prosody (volume, pitch of voice, etc.).

The software, which works with 88 parameters, is based on the basic standard parameter set Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for voice research and affective computing [50]. It was applied in the voice analysis of the present study under the name VocEmoApI technology with its category_v2_scores. This includes the intensity parameter, 52 emotion scores of the category_v2_scores, the three emotion dimensions pleasantness, urgency, and control, and the four prosodic features pitchAverage, pitchVariation, loudnessAverage and speakingSpeed, which will be explained in the following paragraph. The parameter intensity (line 4) is used to check the emotional intensity of the category_v2_scores. The range goes from 0.0 (neutral) to 1.0 (highly emotional). A score below 0.25 can be considered neutral language. Scores above this indicate emotionally colored speech [49].

The 52 emotion scores are found in the category_v2_scores (line 5). For the category_v2_scores, a reference range is given within which the scores move. This reference range was determined from voice samples of unspecified speakers and is set between -1.0 and +1.0. Scores between -1.0 and 0 are called poor match, i.e., the recognized emotion fits 'not at all' (-1.0) to 'insignificantly' (0) into the designated category [49]. Values between 0 and 1.0 are referred to as good match and quantify the gradual fit of the acoustic signal into the specific emotion category. By this definition of the reference range, a value of 1.0 cannot be exceeded; at 1.0, the recognized emotional signal fits completely into the designated category. All 52 emotion scores of the category_v2_scores were included in the exploratory factor analysis. The three emotion dimensions pleasantness, urgency, and control are in line 6. The scores of pleasantness range from -1.0 (negative) through 0.0 ('neutral'/in between) to 1.0 (positive). The scores of urgency and control range from -1.0 (low) through 0.0 ('normal') to 1.0 (high) [49]. The four prosodic features pitchAverage (average fundamental frequency F0 in Hz), pitchVariation (variation of the fundamental frequency F0 in Hz), loudnessAverage (average perceived loudness of the speaker), and speakingSpeed (speaking tempo in syllables per second) are in line 8. Pitch is expressed in semitones relative to a base note (A0) of 27.5 Hz. The lowest value for pitch is 12 semitones (55 Hz) and the highest value is 62 semitones (~1000 Hz). For loudness, a psychoacoustically corrected loudness measure is employed that takes into account the human ear's selective frequency response and non-linear frequency and intensity perception. The speaking speed (speakingSpeed), given in syllables per second (Ss), also includes the pauses of a speaker. Values below 3 - 4 Ss are called slow speaking and values above 4 Ss fast speaking. Values below 2 Ss result from hesitant speaking, short utterances, or many pauses during speaking [49]. Of the total data set of 88 parameters, 28 were left out of further calculation. The remaining 60 selected parameters were used to analyze the affective features of the human voice.
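The relation between semitones and hertz mentioned above follows the usual equal-tempered formula f = 27.5 Hz · 2^(s/12). The following sketch illustrates this conversion; the function names are chosen for illustration and are not part of the VocEmoApI or sensAI interface.

```python
import math

BASE_NOTE_HZ = 27.5  # A0, the base note stated in the text


def semitones_to_hz(semitones):
    """Convert a pitch in semitones above A0 into a frequency in Hz."""
    return BASE_NOTE_HZ * 2 ** (semitones / 12)


def hz_to_semitones(frequency_hz):
    """Convert a frequency in Hz into semitones above A0."""
    return 12 * math.log2(frequency_hz / BASE_NOTE_HZ)


lowest_hz = semitones_to_hz(12)   # one octave above A0
highest_hz = semitones_to_hz(62)  # upper limit named in the text
```

With these definitions, 12 semitones correspond to 55 Hz and 62 semitones to roughly 988 Hz, consistent with the limits of 55 Hz and approximately 1000 Hz given in the text.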

2.7. Statistical analysis

The analysis of all data sets and the graphical visualization of the results were carried out with the statistics program IBM SPSS Statistics Standard, version 25.0 (SPSS®). Where this improved the visualization, the graphs were alternatively created with the software GraphPad Prism, version 8.1.0 (221) (GraphPad®).

The statistical evaluation of the LIWC2015 and VocEmoApI data sets was carried out using descriptive statistics, exploratory factor analysis, and linear regression analysis.

2.7.1. Descriptive statistics

As the first step of the data analysis, the mean (M), range (Range), minimum (Min), maximum (Max), and standard deviation (SD) were calculated as characteristics of the measured output parameters of both test methods. This was done in tabular form for the entire study group and for each individual study participant.

For the correlation of study outcomes from the two study instruments, the effect size of the bivariate correlation coefficient (r) was defined according to Cohen [50] as 'small' with r ≥ 0.10, as 'medium' with r ≥ 0.30, and as 'large' with r ≥ 0.50 [51].
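The thresholds above translate directly into a small classification helper. This is a minimal sketch: the function name and the label 'negligible' for coefficients below 0.10 are assumptions added for illustration, and the sketch applies the thresholds to the absolute value of r so that negative correlations are classified by strength.

```python
def cohen_effect_size(r):
    """Classify a correlation coefficient using the thresholds cited in the text."""
    magnitude = abs(r)  # assumption: strength is judged regardless of sign
    if magnitude >= 0.50:
        return "large"
    if magnitude >= 0.30:
        return "medium"
    if magnitude >= 0.10:
        return "small"
    return "negligible"  # label not used in the source; added for completeness


labels = [cohen_effect_size(r) for r in (0.05, 0.10, -0.35, 0.62)]
```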

2.7.2. Exploratory factor analysis

Exploratory factor analysis (EFA) was used to evaluate the extensive output parameters of the LIWC2015 text analysis (with 61 parameters) and the voice analysis category_v2_scores (with 52 parameters). As a prerequisite for an exploratory factor analysis, the following statistical procedures or tests were applied to the output parameters of both research methods. First, a bivariate correlation matrix according to Pearson gave an overview of the number and effect strength of the intercorrelations among the output parameters, which ultimately allows a statement on whether an exploratory factor analysis is suitable for uncovering "structures, trends and patterns" among the measured parameters (here the speech and voice analysis) [52]. The effect size of the correlation coefficient (r) was defined according to Cohen [51] as 'small' with r ≥ 0.10, as 'medium' with r ≥ 0.30, and as 'large' with r ≥ 0.50 [52]. Second, the standard test procedure developed by Kaiser, Meyer, and Olkin (KMO) was applied; a KMO value > 0.60 [53] is assumed to be an acceptable lower limit for suitability. Third, Horn's parallel analysis [54] was used to determine the number of factors. For factorization, all items with an absolute loading > 0.40 were considered; items with lower factor loadings were removed according to the 'rule of thumb' of Wentura and Pospeschill [52] [53]. To check the internal consistency of the factors, a reliability test was carried out using Cronbach's alpha. Wentura and Pospeschill [53] state that values between 0.30 and 0.50 are to be interpreted as medium and values above 0.50 as high [53]. Therefore, a value > 0.60 was assumed as a measure of reliability. Once the exploratory factor analysis was completed, further statistical analysis of the study was based on the factor scores alone.
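The reliability check can be illustrated with the standard formula for Cronbach's alpha, α = k/(k-1) · (1 - Σ item variances / variance of the sum score). The sketch below uses only the Python standard library and small hypothetical scores; it is not the SPSS procedure used in the study, and items with negative loadings would typically need to be reverse-scored before such a calculation.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores per observation."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Sum score per observation across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: two items measured on four observations.
alpha_example = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])

# Three identical items are perfectly consistent.
alpha_identical = cronbach_alpha([[1, 2, 3, 4]] * 3)
```

In the study's terms, a factor would be retained only if the resulting alpha exceeded 0.60.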

2.7.3. Linear regression analysis

The following criteria are stated as prerequisites for linear regression analysis: first, an interval scaling of the independent and dependent variables; second, a linear relationship between the two variables; third, a 'random' sample; fourth, a normal distribution of the residuals; and fifth, a constant variance of the residuals (homoscedasticity) [55]. The basic prerequisite of interval scaling for the independent and the dependent variable was met by the intervention (TGPR) as measurement time points 1 to 20 over 25 weeks (six months) and by the metric output parameters of LIWC2015 and VocEmoApI. Normal distribution and homoscedasticity (homogeneity of variance) were checked visually. Since non-probabilistic sampling was used in this study with seven study participants (n = 7), the requirement of "random sampling" was not met. This exploratory study therefore interpreted all statistical findings with due caution regarding chance results [52].

The strength of the linear relationship between the independent variable (IV) and the dependent variable (DV) was indicated by the regression coefficient (b). In addition, a "sample significance test" was performed using a t-test of the regression coefficient (b). The significance level for the t-value was set at p < 0.05. The coefficient of determination (R²) provided information about the quality of the regression model [52]. For all seven individual cases, the extracted factors of the two study instruments were visualized as a scatter plot with a regression line, including a 95% confidence interval. The individual interpretation of the process courses is based both on the visual examination of the graphical representations for normal distribution (histogram) and on the inferential statistical data of the linear regression analysis.
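The quantities named above (regression coefficient b, its t-value, and R²) can be sketched for a single predictor with ordinary least squares. The data in this example are hypothetical and not taken from the study; the p-value is omitted because it requires the t distribution, which the study obtained from SPSS.

```python
import math


def simple_linear_regression(x, y):
    """Ordinary least squares for one predictor: slope b, intercept a, R², t-value."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                # regression coefficient (slope)
    a = mean_y - b * mean_x      # intercept
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    se_b = math.sqrt(ss_res / (n - 2) / sxx)  # standard error of the slope
    t_value = b / se_b if se_b > 0 else math.inf
    return {"b": b, "a": a, "r_squared": r_squared, "t": t_value, "df": n - 2}


# Hypothetical factor scores at five measurement time points.
time_points = [1, 2, 3, 4, 5]
factor_scores = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_linear_regression(time_points, factor_scores)
```

The t-value is compared against a t distribution with n - 2 degrees of freedom to obtain the p-value for b.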

3.1. Descriptive statistics

3.1.1. LIWC2015

The speech recordings of the study participants showed a wide range from a minimum of 87 to a maximum of 1486 counted words (WC). On average (mean = M), the seven study participants used 467 words per interview session (SD = 247, Range = 87 - 1486; see Table B.2.1 in Appendix). The mean distribution of the LIWC2015 parameter Dic (Dictionary words) over the entire sample showed that the program recorded on average 90% of the words from all 115 transcribed interviews in this study (SD = 3%, Range = 82 - 96%). However, it should be critically noted that among the remaining 10%, key words from art or art therapy (such as words related to the colors 'blue' and 'black' or the artistic methods 'collage' and 'watercolor') were not recorded. Although the LIWC2015 text analysis focuses less on the thematic references of the speaker than on the syntactic connections of the words in sentences, another problem that emerged was that in some word count analyses ambiguities in language use were not considered. This became visible in the individual case analysis of P3. In the first interview (09.05.16), P3 showed her highest percentage of negative emotion words during the study (negemo = 2%). This was due to a misclassification by the LIWC2015 text analysis software, which counted the word 'monkey' as a negative emotion word; P3 used this word several times in the description of her collage.

The detection frequency of the 61 parameters from the LIWC2015 machine text analysis was ranked by their mean values (M) for each study participant and across all interviews. A participant's personal set of guiding parameters was created from the five highest-ranking parameters (see Tables B.2.1 and B.2.2 in Appendix). As a result, the same guiding parameters were determined for the seven study participants and were similarly distributed across the entire group. All seven study participants used a comparable proportion of words with the following characteristics: verbs (verb) (M = 18%, SD = 2%, Range = 11% - 25%), conjunctions (conj) (M = 17%, SD = 3%, Range = 7% - 29%), articles (article) (M = 13%, SD = 2%, Range = 7% - 20%), impersonal pronouns (ipron) (M = 14%, SD = 2%, Range = 7% - 20%), and words with more than six letters (Sixltr) (M = 18%, SD = 3%, Range = 12% - 26%).

Table 2

Mean distribution (M) of word count (WC), dictionary words (Dic) and the lead parameters for each study participant and the Total Collective (TC).

LIWC2015	Study participants
Parameters	P1	P2	P3	P4	P5	P6	P8	TC
	M (SD)	M (SD)	M (SD)	M (SD)	M (SD)	M (SD)	M (SD)	M (SD)
WC	409 (309)	645 (278)	401 (134)	362 (133)	548 (196)	687 (293)	298 (83)	467 (247)
Dic	91 (3)	91 (3)	89 (3)	87 (4)	92 (2)	88 (2)	89 (3)	90 (3)
Sixltr	15 (1)	17 (2)	17 (4)	19 (4)	17 (2)	22 (2)	19 (2)	18 (3)
ipron	16 (2)	13 (2)	13 (2)	13 (2)	15 (3)	12 (2)	14 (2)	14 (3)
article	15 (2)	12 (2)	12 (3)	14 (2)	13 (3)	13 (3)	13 (2)	13 (3)
conj	20 (3)	19 (2)	16 (2)	16 (2)	21 (2)	14 (3)	16 (3)	17 (3)
verb	19 (2)	20 (2)	20 (3)	16 (3)	18 (2)	15 (2)	15 (2)	18 (3)

M: mean; SD: standard deviation; TC: Total Collective; WC (word count): number of words used.

All other output parameters in %.

3.1.2. VocEmoApI

The scores of the emotion dimensions pleasantness, urgency, and control showed considerable variability in the sample but no striking patterns. If the male and female study participants are considered separately, it becomes clear that the mean scores of the male participants differ from those of the female participants. The males (P1, P2, P4, P6, and P8) showed exclusively negative scores in the dimensions pleasantness and urgency and exclusively positive scores for control. In contrast, the females showed a different grouping for the emotion dimensions: P5 showed positive scores for pleasantness, urgency, and control. P3 also showed positive scores for pleasantness and control but a negative score for urgency. The loudnessAverage score was used as a measure of psychoacoustic loudness. Within the study group, it ranged from a minimum of 0.85 to a maximum of 2.07, with a mean (M) of 1.32 and a standard deviation (SD) of 0.28. As no comparative values from healthy speakers were available, this parameter was only compared within the sample. The scores for loudnessAverage ranged from a minimum of 0.85 (P1) to a maximum of 1.7 (P4) for the male participants and from a minimum of 1.00 to a maximum of 2.07 (P5) for the female participants. The mean value (M) of the pitch (F0) (pitchAverage) was 105 Hz for the male study participants and 156 Hz for the two females. In comparison, the average pitch (F0) is generally approximately 120 Hz for men and approximately 220 Hz for women [56]. The speech rate (speakingSpeed) showed values from a minimum of 0.76 to a maximum of 2.15 syllables per second (Ss) across the entire sample. The mean (M) was 1.43 Ss with a standard deviation (SD) of 0.24 Ss (see Table 3).

This result illustrates that, compared to the average speaking rate of a healthy speaker of about four syllables per second (see Section 2.6.2), all participants in this study spoke more slowly on average, made many pauses, and sometimes produced only short utterances. However, for the interpretation of the speaking rate, it should be considered that even when participants were accustomed to the process of therapist-guided picture reflection (TGPR), a high level of concentration was necessary when answering the question and story prompts, and the study participants needed time to think. According to the interpretation guidelines of the developers of VocEmoApI, presented in Section 2.6.2, the values of the parameter intensity for the entire sample of 115 audio recordings ranged from a minimum of 0.53 (emotionally colored) to a maximum of 1.77 (far above highly emotional). The average score was 1.06 with a standard deviation (SD) of 0.26 and was thus rated as highly emotional (see Table B.2.5 in Appendix). This means that the intensity of the emotions measured in the study group was never in the range of neutral speech and that most of the study group was highly emotional throughout the study.

The examination of the mean values (M) for all 52 emotion categories showed that the reference range was slightly exceeded for the emotions disgust (M = 1.03, Range = 0.07 - 2.21, SD = 0.55) and grief (M = 1.01, Range = 0.00 - 1.73, SD = 0.59). The descriptive representation of the individual cases, on the other hand, showed that the mean values exceeded the reference range considerably. For the male study participants, this concerned above all the negative emotions disgust (P1: M = 1.92, Range = 1.52 - 2.21, SD = 0.19), sadness (P4: M = 1.29, Range = 1.04 - 1.43, SD = 0.11), and grief (P6: M = 1.38, Range = 1.09 - 1.73, SD = 0.21), and for the female study participant P5 the positive emotion euphoria (M = 2.07, Range = 1.58 - 2.38, SD = 0.23). The female participant P3 was the only one who did not exceed the reference range.

Table 3

Mean distribution (M) of the emotion dimensions, prosodic features, and emotion score headline parameters for each study participant and the TC.

VocEmoApI	Study participants
Parameters	P1	P2	P3	P4	P5	P6	P8	TC
dimensions	M (SD)	M (SD)	M (SD)	M (SD)	M (SD)	M (SD)	M (SD)	M (SD)
pleasantness	-0.49 (0.07)	-0.03 (0.19)	0.13 (0.10)	-0.42 (0.10)	0.04 (0.12)	-0.39 (0.08)	-0.18 (0.18)	-0.17 (0.28)
urgency	-0.85 (0.06)	-0.66 (0.09)	-0.27 (0.11)	-0.57 (0.09)	0.60 (0.14)	-0.58 (0.15)	-0.68 (0.07)	-0.43 (0.46)
control	0.20 (0.13)	0.14 (0.12)	0.54 (0.12)	0.52 (0.11)	0.95 (0.04)	0.54 (0.19)	0.38 (0.11)	0.46 (0.28)
prosody
loudnessAverage	1.10 (0.15)	1.09 (0.10)	1.20 (0.10)	1.39 (0.15)	1.86 (0.14)	1.38 (0.19)	1.28 (0.14)	1.32 (0.28)
pitchAverage	96.34 (2.97)	115.00 (6.32)	147.48 (2.34)	100.83 (3.25)	164.72 (2.98)	101.47 (4.22)	111.92 (2.98)	120.97 (24.61)
pitchVariation	7.80 (1.20)	7.80 (1.20)	6.0 (0.70)	6.40 (1.0)	11.0 (1.2)	11.0 (1.0)	6.4 (1.6)	1.74 (0.47)
speakingSpeed	1.58 (0.24)	1.30 (0.27)	1.43 (0.21)	1.53 (0.19)	1.49 (0.18)	1.28 (0.23)	1.36 (0.23)	1.43 (0.24)
intensity	1.21 (0.08)	0.88 (0.13)	0.75 (0.09)	1.06 (0.05)	1.53 (0.18)	1.17 (0.09)	0.97 (0.09)	1.06 (0.26)
category_v2_scores
sadness	1.91 (0.13)	1.06 (0.29)	0.24 (0.10)	1.29 (0.11)	0.01 (0.01)	1.37 (0.24)	1.23 (0.20)	0.99 (0.65)
passion	0.02 (0.00)	0.00 (0.00)	0.26 (0.19)	0.00 (0.01)	1.98 (0.21)	0.04 (0.07)	0.00 (0.01)	0.32 (0.68)
euphoria	0.00 (0.00)	0.00 (0.00)	0.27 (0.19)	0.00 (0.01)	2.07 (0.23)	0.05 (0.08)	0.00 (0.01)	0.34 (0.71)
enthusiasm	0.00 (0.00)	0.00 (0.01)	0.20 (0.12)	0.00 (0.00)	1.76 (0.28)	0.03 (0.05)	0.01 (0.01)	0.28 (0.61)
disgust	1.92 (0.19)	0.80 (0.32)	0.20 (0.08)	1.25 (0.11)	0.84 (0.19)	1.41 (0.22)	1.08 (0.18)	1.03 (0.55)
boredom	1.59 (0.22)	1.07 (0.23)	0.52 (0.13)	0.81 (0.22)	0.08 (0.07)	1.00 (0.38)	1.16 (0.21)	0.89 (0.50)
grief	1.62 (0.06)	1.17 (0.23)	0.31 (0.14)	1.35 (0.15)	0.01 (0.01)	1.38 (0.21)	1.36 (0.17)	1.01 (0.59)
valid values (N)	17	19	19	18	16	10	16	115

M: mean; SD: standard deviation; TC: total collective of all; pitchAverage/ pitchVariation in Hz; speakingSpeed in Ss (syllables per second), dimensions/ loudnessAverage/intensity/category_v2_scores as score values

3.2. Exploratory factor analysis

3.2.1. LIWC2015

The first phase of the exploratory factor analysis consisted of a principal component analysis. Table B.3.1 (Appendix) shows that the eigenvalue rule suggested 17 factors. These explained 74% of the total variance. The eigenvalue diagram also showed a factor number of 17 with the 'kink' in the eigenvalue progression of the Scree test (from top: number of scores assessed before the 'kink', see Figure B.3.9 in Appendix). With a value of 0.601, the KMO value showed a moderate suitability of the LIWC2015 parameters for a factor analysis (see Table B.3.2 in Appendix). The interpretation and naming of the identified factors under a 'title term' was facilitated by ordering the items that formed a factor in descending order of their factor loading. Items with the highest loading thus became more visible and were used as a hook for naming. This ranking was created in SPSS by a varimax rotation (see Table B.3.8 in Appendix). Only items with absolute loadings > 0.4 were considered for further factor formation; items below this threshold were removed. To check the internal consistency of the factors, a reliability test was carried out using Cronbach's alpha. The values in Table B.3.10 show that the measure of reliability (> 0.60) was only met for factors F1, F2, F3, F4, and F5. The remaining factors (F6, F7, F8, and F9) were not considered further.

Each title for a factor should briefly and concisely reflect the best possible interpretation of its content. The challenge of finding a title was that the content was a product of statistics, which created a new combination of parameters whose common denominator had to be reflected in the title. To facilitate the interpretation of the content of the factors of LIWC2015 (F_liwc), a table was created in which the parameters in the respective factors are listed according to descending loading amount (see Table 5).

Table 5

Factor formation of the parameters of LIWC2015.

61 Parameters	Factors
LIWC2015	F1_liwc	F2_liwc	F3_liwc	F4_liwc	F5_liwc
filler	-0.848
nonflu	-0.843
tentat	-0.827
focuspresent	0.757
auxverb	0.750
verb	0.727
prep	-0.681
WPS	-0.578
Sixltr	-0.535
space	-0.516
affiliation		0.763
family		0.756
sexual		0.730
female		0.596
conj		-0.477
interrog		-0.428
cause		-0.407
differ			0.701
negate			0.680
discrep			0.616
reward			-0.591
posemo			-0.553
adverb			0.494
home			0.403
article				-0.828
ipron				-0.693
focuspast				0.661
i				0.489
leisure				-0.459
achieve					0.817
work					0.685
compare					0.680
adj					0.602
number					0.496
see					0.457
insight					0.444

Allocation of the parameters to the factors (within the factors sorted by descending loading amount); 7 study participants.

The five factors of the exploratory factor analysis from the LIWC2015 dataset could be interpreted based on empirical findings by Tausczik and Pennebaker [57] and Schwitalla [58]. From the transcribed interviews, the extracted factors describe 'attention' in its orientation and with reference to the topic (F1_liwc Presence and F4_liwc Autobiography), the 'thinking style' of the study participants (F3_liwc Self-critical analysis and F5_liwc Claim & Ambition), and 'social relationships' (F2_liwc Social inclusion).

3.2.2. Emotions scores: category_v2_scores

All 52 parameters were used for the exploratory factor analysis of the category_v2_scores. For a first overview, an intercorrelation matrix according to Pearson was created. Correlations at the level of p < 0.01 (2-sided) were defined as significant. There were sufficiently strong correlations among the parameters of the category_v2_scores, so it seemed reasonable to perform a factor analysis on the data set. Owing to its size, the correlation matrix is provided in the Appendix under B.8.3. In the first phase of the exploratory factor analysis, a principal component analysis was carried out. According to Table B.4.1 (Appendix), the 'eigenvalue rule' suggested five factors. These five factors explained 96% of the total variance. Inspection of the eigenvalue plot also indicated the 'kink' of the Scree test at a factor count of five (see Figure B.4.6 in Appendix).

The suitability of the output parameters for an exploratory factor analysis was again assessed using the KMO value. With a value of 0.884 (see Table B.4.2 in Appendix), this showed a high suitability of the variables for factor analysis. To determine the number of factors, a parallel analysis according to Horn [54] was also carried out here. This also recommended five factors (see Table B.4.4 in Appendix), since the eigenvalues of the empirical data for the first five factors were larger than the eigenvalues of the random data. In SPSS 25, the items of the factors were again ranked by a varimax rotation so that the items with the highest loading could be identified right away for the naming process (see Table B.4.5 in Appendix). Only items with absolute loadings > 0.4 were considered for the factor formation. To check the internal consistency of the factors, a reliability test was carried out using Cronbach's alpha. The values in Table B.4.7 show that the measure of reliability (> 0.6) was met for the factors F1_emo, F2_emo, F3_emo, F4_emo, and F5_emo.

To create the best titles for the factors that reflect their complex contents as far as possible in a 'keyword', a table was first created in which the parameters under the factor are presented in descending order (see Table 6).

Table 6

Factor formation of the parameters of category_v2_scores (VocEmoApI).

52 Parameters	Factors
category_v2_scores	F1_emo	F2_emo	F3_emo	F4_emo	F5_emo
enthusiasm	0.972
euphoria	0.965
passion	0.963
cheerfulness	0.952
desire	0.945
amusement	0.935
excitement	0.908
happiness	0.903
delight	0.903
pride	0.900
interest	0.898
badtemper	0.887
outrage	0.886
anger	0.867
agitation	0.712
regret	-0.611
serenity		0.854
compassion		0.846
disgust		0.841
relief		0.825
resentment		0.807
disappointment		0.788
moved		0.771
dejection		0.759
boredom		0.754
sadness		0.745
grief		0.643
contentment		0.632
pleasure			-0.863
impressed			-0.848
surprise			-0.825
frustration			0.824
admiration			-0.731
displeasure			0.717
irritation			0.697
longing			-0.679
hurt			0.670
despair			0.666
suffering			0.589
stress				0.967
shock				0.959
panic				0.943
anxiety				0.934
worry				0.916
agony				0.786
humiliation				0.658
confusion					0.913
guilt					0.900
fear					0.866
nervousness					0.864
highstrung					0.827
loving					0.665

Allocation of the parameters to the factors (within the factors sorted by descending loading amount); 7 study participants.

Following the scientific findings of the authors Krause [13, 59], Ulich and Mayring [60], Tischer [61], and Fuchs [62], the emotion factors of the voice analysis VocEmoApI could depict an emotion expression that unites all study participants with characteristic "action and process patterns" in the conversations about their own image. Two reactive opposite action factors (F1_emo Extraversion and F2_emo Introversion), a self-reflexive frustration factor (F3_emo Frustration), a reactive negative action factor (F4_emo Stress, Panic & Anxiety), and a self-reflexive guilt factor (F5_emo Confusion & Guilt) are to be named here.

These two sets of five factors from the text and voice analysis, given behavioral and emotion-psychological titles, are described below as dependent variables (DV) for each individual study participant over the course of the independent variable (IV, the TGPR measurement time points) and are interpreted in the context of the more comprehensive individual case analyses. They were used as target parameters during the study and served as surrogate markers for changes in speech and voice.

3.3 Single case analyses

3.3.1. Study participation and adherence

The audio documents of all seven study participants (n = 7) were evaluated. Six participants finished the study at the scheduled time after six months. One was allowed to leave the facility after four months; his audio documents, which were complete up to that point, were included in the evaluation of the study. In the group art therapy, the seven study participants created a total of 118 of 140 possible pictures. The data loss here was 16%.

Study participants who created a picture in the group also generally presented it in the interviews, so the number of pictures created did not differ substantially from the number of interviews. Of 140 possible interviews, 115 took place.

The data loss for the entire study group was therefore 18%. The data loss for the individual study participants P2 and P3 was only 5%, 10% for P4, 15% for P1, and finally 20% each for P5 and P8. Due to the early departure of P6, the data loss was 55% and 50% respectively, including the last interview on 29 October 2016.
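The interview-related percentages above can be retraced from the number of valid recordings per participant listed in Table 3. The following sketch assumes 20 planned interviews per participant (140 in total across seven participants); it covers only the interview counts and the overall picture count, not the per-participant picture counts, which are not reported in this section.

```python
PLANNED_PER_PARTICIPANT = 20  # assumption: 140 planned interviews / 7 participants

# Valid recordings (N) per participant as listed in Table 3.
VALID_INTERVIEWS = {
    "P1": 17, "P2": 19, "P3": 19, "P4": 18, "P5": 16, "P6": 10, "P8": 16,
}


def loss_percent(completed, planned):
    """Share of planned units that did not take place, in percent."""
    return 100 * (planned - completed) / planned


interview_loss = {
    participant: loss_percent(n, PLANNED_PER_PARTICIPANT)
    for participant, n in VALID_INTERVIEWS.items()
}
total_planned = PLANNED_PER_PARTICIPANT * len(VALID_INTERVIEWS)
total_interviews = sum(VALID_INTERVIEWS.values())
total_interview_loss = loss_percent(total_interviews, total_planned)
picture_loss = loss_percent(118, total_planned)  # 118 of 140 pictures created
```

Rounded to whole percentages, this reproduces the reported interview losses of 5% (P2, P3), 10% (P4), 15% (P1), 20% (P5, P8), and 50% (P6), as well as 18% for all interviews and 16% for all pictures.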

The interview situation with a table microphone caused irritation for patients with the clinical condition of chronic schizophrenia. The repetitive question and narrative prompts served in principle as orientation in the interview but were also perceived as a restriction in the conversation. The 'maintenance questions' and the 'concrete follow-up questions' in the interviews proved to be a suitable means to 'bring back' the study participants when they 'digressed' from the topic, including latent psychotic symptoms.

3.3.2. Linear regression of the LIWC2015 factors

The quantitative individual case analyses showed exclusively individual courses or individual trends during the study for the LIWC2015 factors: for factor F1_liwc Presence for study participants P1 and P3, and for factor F4_liwc Autobiography for study participants P2 and P3. Only P3 showed statistically significant changes in two factors (see Table 8). The significant increase in her factor F1_liwc Presence showed that she was better able to focus her attention on the moment and on the object (picture) during the study.

In the summary of the individual case analyses, the exceptionally large variation, with sometimes extreme outlier values, in the factors F2_liwc Social inclusion, F4_liwc Autobiography, and F5_liwc Claim & Ambition reflects either the tasks and topics behind the pictures or the study participants' own guiding themes (as with P3 for factor F4_liwc). For factor F2_liwc Social inclusion, this would be, for example, the task 'My first name', for factor F4_liwc the 'Self-portrait', and for factor F5_liwc 'Painting with acrylic colors'.

For the sake of completeness, linear regression analyses with the target parameters of LIWC2015 were also carried out over the entire study group (TC = Total Collective), in order to capture developments during the study that could not, or not sufficiently, be depicted in the individual courses. In contrast to the individual case analyses, a significant change of a LIWC2015 factor during the study was found across the entire sample (see Table 8), namely an increase in the factor F1_liwc Presence.

Table 8

Regression coefficient (b) for the factors F_liwc of LIWC2015 over the course of the study for each study participant and the Total Collective (TC).

Factors	Study participants
	P1	P2	P3	P4	P5	P6 (14)	P8	TC
F1_liwc
b	0.015	-0.004	0.024*	0.017	1.66E-05	0.081	0.005	0.018**
p	0.063	0.687	0.021	0.123	0.998	0.103	0.627	0.005
R²	0.212	0.010	0.275	0.142	0.000	0.334	0.017	0.359
F2_liwc
b	0.002	-0.030*	-0.024	-0.017	0.001	-0.006	-0.010	-0.014
p	0.868	0.011	0.149	0.407	0.931	0.876	0.391	0.070
R²	0.002	0.321	0.119	0.043	0.001	0.004	0.053	0.171
F3_liwc
b	0.002	0.005	0.011	0.025	-0.004	0.034	0.017	0.009
p	0.849	0.630	0.444	0.128	0.792	0.272	0.083	0.052
R²	0.003	0.014	0.035	0.139	0.005	0.169	0.199	0.194
F4_liwc
b	-0.019	0.019	0.047*	-5.04E-05	-0.009	0.047	0.019	0.010
p	0.293	0.097	0.011	0.998	0.765	0.393	0.364	0.351
R²	0.073	0.153	0.321	0.000	0.007	0.106	0.059	0.048
F5_liwc
b	-0.001	0.016	-0.016	0.022	0.027	-0.051	-0.011	0.006
p	0.931	0.265	0.302	0.318	0.138	0.066	0.309	0.354
R²	0.001	0.072	0.062	0.062	0.150	0.404	0.074	0.048

b: regression coefficient; significance level: p ≤ 0.05* p ≤ 0.01** p ≤ 0.001***, R²: coefficient of determination; P6 (14): results for 14 weeks.
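The quantities reported in Table 8 (regression coefficient b and coefficient of determination R²) can be illustrated with an ordinary least-squares fit of a factor score against the interview week. This is a minimal sketch, not the procedure actually used in the study: the function name `fit_trend` and the example scores are hypothetical. The p-value is omitted because it requires the t-distribution, which the Python standard library does not provide.

```python
def fit_trend(weeks, scores):
    """Ordinary least-squares fit of scores against weeks.

    Returns (b, intercept, r_squared), where b is the slope per week.
    """
    n = len(weeks)
    mean_x = sum(weeks) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    syy = sum((y - mean_y) ** 2 for y in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, scores))
    b = sxy / sxx
    intercept = mean_y - b * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return b, intercept, r_squared


# Hypothetical factor scores of one participant over five interviews.
weeks = [1, 2, 3, 4, 5]
scores = [-0.4, -0.1, 0.0, 0.3, 0.2]

b, intercept, r_squared = fit_trend(weeks, scores)
print(f"b = {b:.3f}, R^2 = {r_squared:.3f}")
```

A positive b, as reported for F1_liwc in the Total Collective, corresponds to a score that rises over the weeks; R² gives the share of the score variance explained by this linear trend.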

3.3.3. Linear regression of the VocEmoApI factors

In the summary of the individual case analyses, the calculated data from VocEmoApI presented the picture of a clear dichotomy among the study participants. One group, consisting of participants P1, P4, and P6, showed an almost linear and quasi-'therapeutic' change in at least two of three possible target criteria of the voice analysis over the course of the study (hereafter named 'Group 1 with change'). The remaining participants (P2, P3, P5, and P8) formed the second group; they did not show any statistically relevant development in the results of the voice analysis during the study (hereafter referred to as 'Group 2 without change').

Table 9

Regression coefficient (b) for the factors F_emo of VocEmoApI over the course of the study for each study participant and the Total Collective (TC).

Factors	Study participants
	P1	P2	P3	P4	P5	P6 (14)	P8	TC
F1_emo
b	-0.002*	-0.002***	0.002	-0.003*	-0.009	-0.011*	0.000	-0.002
p	0.015	0.000	0.620	0.043	0.127	0.027	0.823	0.514
R²	0.335	0.532	0.015	0.232	0.158	0.525	0.004	0.024
F2_emo
b	0.018***	-0.004	-0.002	0.003	-0.002	0.056**	-0.006	0.003
p	0.001	0.552	0.414	0.387	0.161	0.003	0.319	0.484
R²	0.556	0.021	0.040	0.047	0.136	0.745	0.071	0.028
F3_emo
b	-0.008**	0.002	0.002	-0.007	0.003	-0.007	0.005	-0.003
p	0.004	0.709	0.497	0.088	0.341	0.440	0.325	0.124
R²	0.429	0.008	0.028	0.171	0.065	0.087	0.069	0.126
F4_emo
b	-0.003*	0.000	0.001	-0.011*	0.001	-0.028	0.000	-0.004**
p	0.021	0.628	0.565	0.023	0.409	0.066	0.623	0.003
R²	0.308	0.014	0.020	0.282	0.049	0.402	0.018	0.393
F5_emo
b	0.000	0.001	0.001	-1.98E-05	-0.005	0.000	-2.33E-05	0.000
p	0.456	0.335	0.797	0.942	0.161	0.948	0.918	0.852
R²	0.038	0.055	0.004	0.000	0.135	0.001	0.001	0.002

b: regression coefficient; significance level: p ≤ 0.05* p ≤ 0.01** p ≤ 0.001***, R²: coefficient of determination, P6 (14): results for 14 weeks.
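The star notation in the table footnotes can be written as a small lookup. The sketch below is illustrative only (the function name `significance_stars` is hypothetical); the example pairs of b and p are the values reported for P1 in Table 9.

```python
def significance_stars(p):
    """Map a p-value to the star notation used in the table footnotes."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return ""


# (b, p) pairs for participant P1 as reported in Table 9.
p1_results = {
    "F1_emo": (-0.002, 0.015),
    "F2_emo": (0.018, 0.001),
    "F3_emo": (-0.008, 0.004),
    "F5_emo": (0.000, 0.456),
}

for factor, (b, p) in p1_results.items():
    print(f"{factor}: b = {b:.3f}{significance_stars(p)} (p = {p:.3f})")
```

Because the tabulated p-values are rounded to three decimals, an entry of 0.000 stands for p < 0.0005 and is marked with three stars.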

For the participants in 'Group 1 with change', the extracted emotion factors developed in the same direction: the factors F1_emo, F3_emo and F4_emo decreased in P1, P4 and P6, and the factor F2_emo increased in P1 and P6. These changes were in part statistically significant (see Table 9). In the other group, which included P3, P5, and P8, no changes in the extracted VocEmoApI factors became visible during the study (see Table 10).

This dichotomy of the sample was initially based only on the criterion of change versus no change in the vocal target parameters. A further linear regression analysis of the factor scores, carried out separately for Group 1 and Group 2, revealed other group-specific characteristics in addition to this distinguishing feature. The results of this group-separated analysis of the F_emo scores (Group 1: P1, P4 and P6; Group 2: P2, P3, P5 and P8) are shown in Table 10. They show the degree of change in the factor scores (F_emo) for the two groups over the time of the study.

First, as expected, the changes that had already been named as characteristics distinguishing Group 1 from Group 2 were seen within Group 1. These significant changes concerned the factors F1_emo, F2_emo, F3_emo and F4_emo. The group-separated analysis also revealed a further distinguishing feature: while the factor F3_emo Frustration decreased continuously in Group 1 over the six months, it increased in Group 2, a trend that only just missed the significance level (b = 0.004; p = 0.056).

Table 10

Regression coefficient (b) for the factors F_emo of VocEmoApI during study for Groups 1 and 2.

Factors	Study participants
	Group 1	Group 2
F1_emo
b	-0.004***	-0.004
p	0.000	0.426
R²	0.524	0.036
F2_emo
b	0.018**	-0.002
p	0.004	0.593
R²	0.383	0.016
F3_emo
b	-0.008**	0.004
p	0.006	0.056
R²	0.351	0.188
F4_emo
b	-0.009***	0.000
p	0.001	0.610
R²	0.485	0.015
F5_emo
b	0.000	-0.001
p	0.421	0.551
R²	0.036	0.020

b: regression coefficient, significance level: p ≤ 0.05* p ≤ 0.01** p ≤ 0.001***, R²: coefficient of determination.

Finally, the graphs (see Figure 4) showed that in Group 1 the scores of F2_emo, F3_emo and F4_emo (Introversion, Frustration, Stress, Panic & Anxiety) were at a considerably higher level than in Group 2. Particularly striking is the considerably higher initial level of these emotions. The group difference was even clearer for the emotions frustration and anxiety, which were practically not detected in Group 2.

Group differences also appeared in F1_emo and F5_emo (Extraversion, Confusion & Guilt), but in the opposite direction: here the scores in Group 1 were significantly lower than in Group 2.

If the statistical results of the linear regression analysis for the factors of the category_v2_scores are evaluated over the entire group (TC), a statistically significant change (decrease) over the study period is discernible only for the factor F4_emo Stress, Panic & Anxiety (b = -0.004**; p = 0.003) (see Table 9). None of the other factors shows a statistically relevant change (all p ≥ 0.124). The analysis across the entire group (TC) therefore does not bring any new findings, but rather shows the expected effect that the opposite developments in the two subgroups neutralize each other in the total collective (TC).

3.5. Dual study approach

The validity of the study results should be secured by the dual study approach in the sense of an internal control through correlations between the factors of the text and voice analysis. In the individual case analyses, only individual correlations could be shown. On the other hand, correlations were also found across the entire collective, which brought factors, key parameters, and quantified individual emotions of the text and voice analysis into a statistically secured relationship.

Table 11

Pearson correlation matrix between the factors of F_emo and F_liwc for Total Collective (TC).

		F1_emo	F2_emo	F3_emo	F4_emo	F5_emo	F1_liwc	F2_liwc	F3_liwc	F4_liwc	F5_liwc
F1_emo	Pearson correlation	1	-0.542**	-0.221*	0.001	0.567**	0.033	-0.245**	-0.361**	0.152	-0.109
	Significance (2-sided)		0.000	0.018	0.988	0.000	0.726	0.008	0.000	0.105	0.246
F2_emo	Pearson correlation		1	0.509**	-0.062	-0.594**	-0.122	-0.318**	0.561**	-0.431**	-0.170
	Significance (2-sided)			0.000	0.508	0.000	0.193	0.001	0.000	0.000	0.069
F3_emo	Pearson correlation			1	0.552**	-0.526**	-0.423**	-0.227*	0.168	-0.279**	0.068
	Significance (2-sided)				0.000	0.000	0.000	0.015	0.073	0.003	0.468
F4_emo	Pearson correlation				1	-0.092	-0.331**	0.077	-0.190*	-0.121	0.110
	Significance (2-sided)					0.326	0.000	0.414	0.042	0.198	0.243
F5_emo	Pearson correlation					1	0.337**	0.207*	-0.425**	0.218*	-0.087
	Significance (2-sided)						0.000	0.027	0.000	0.019	0.357
F1_liwc	Pearson correlation						1	0.048	-0.103	0.035	0.126
	Significance (2-sided)							0.609	0.272	0.707	0.179
F2_liwc	Pearson correlation							1	-0.400**	0.108	0.117
	Significance (2-sided)								0.000	0.253	0.213
F3_liwc	Pearson correlation								1	0.010	-0.318**
	Significance (2-sided)									0.919	0.001
F4_liwc	Pearson correlation									1	0.020
	Significance (2-sided)										0.832
F5_liwc	Pearson correlation										1
	Significance (2-sided)
	(valid values) N	115	115	115	115	115	115	115	115	115	115
** The correlation is significant at the 0.01 level (2-sided). * The correlation is significant at the 0.05 level (2-sided).

N: number of valid measurement points; only statistically significant correlations between the factors of F_emo and F_liwc are in bold
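The entries of such a correlation matrix can be illustrated as follows. This is a minimal sketch with hypothetical function names, not the software routine used in the study. It computes the Pearson coefficient r for two score series and the associated t statistic with N - 2 degrees of freedom, on which the two-sided significance test is based; the p-value itself is not computed here because the t-distribution is not part of the Python standard library.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Reported correlation between F2_emo and F3_liwc, N = 115 measurement points.
t_value = t_statistic(0.561, 115)
print(round(t_value, 2))
```

For r = 0.561 and N = 115 the t statistic is about 7.2 with 113 degrees of freedom, far beyond the critical values of the two-sided test, which agrees with the significance reported in Table 11.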

In summary, the results in Table 11 show that when participants analyze their own image self-critically (F3_liwc), introverted emotions (F2_emo Introversion) increase (r = 0.561**; p < 0.01) and self-reflective feelings of Confusion & Guilt (F5_emo) decrease (r = -0.425**; p < 0.01). In contrast, when study participants talk about social and autobiographical topics in the interviews (F2_liwc Social Inclusion and F4_liwc Autobiography), introverted emotions (F2_emo Introversion) decrease (r = -0.318**; p < 0.01 and r = -0.431**; p < 0.01, respectively).

The negative correlation between F1_liwc and F4_emo (r = -0.331**; p < 0.01), which was also detected, completes this two-dimensional insight into the dynamics of the art therapy process once again in a quasi-mirror image. The moderate negative correlation of the emotion factor Stress, Panic & Anxiety with the LIWC2015 factor Presence underpins the results of the linear regression analysis from Chapters 3.3.2 and 3.3.3. This can be seen from the fact that the two factors were complementary and analogous to the findings of the correlation matrix across the entire collective (see Figure 5).

The starting point of this research project was the assumption that under the conditions of therapist-guided picture reflection (TGPR), there is a change in the communication behavior of patients with chronic schizophrenia. The aim of the art therapy study was therefore to test this basic assumption with two quantitative analysis procedures of speech and voice. The extent to which the art therapy study approach and the two quantitative research instruments (LIWC2015 and VocEmoApI) were suitable for this study will be discussed below.

4.1. Study participation and adherence

The adherence of the participants to the study approach was documented on the one hand by their presence/absence (see Tables B.1.1 and B.1.2 in the Appendix) and on the other hand by the author's written observations. Considering the nature and chronicity of the psychiatric disease, the data loss of 16% in the group art therapy and 18% in the interviews was relatively low, i.e., without much relevance for a statistical evaluation. This fulfilled a first prerequisite for the reproducibility of the study approach.

As a further condition for reproducible study results, a source of language and voice samples was to be created that made it possible to ensure comparable emotional requirements in up to 20 consecutive interviews.

To ensure this, the therapist-guided picture reflection (TGPR) was chosen in the form of a standardized interview, the catalogue of questions of which was specifically oriented to the peculiarities of patients with chronic schizophrenia. In this way, the emotionality of the creative process of image creation should also be captured in the patients’ voices in the later reflection on their pictures.

It was the consistent use of existing art therapy elements and the reference to indications that led to the development of a variant of art therapy that is suitable for use as a study approach in art therapy for patients with chronic schizophrenia. The trimming to study suitability, especially through the uniform repetition of art therapy procedures, could have limited the therapeutic potential of the method, so that significant processual changes would no longer be recognized by the measuring instruments used. However, both in the consideration of the statistical study results of the individual case analyses and of the overall collective, it can be determined that the measured changes were sufficient both quantitatively and, in their respective combinations, to indicate the therapeutic potential of this art therapy study approach.

A drawback typical of the study setting arose from the interview situation with a table microphone used for digital recording of the TGPR. It triggered paranoid reactions matching the clinical picture in some study participants. Similar observations were apparently made by Montag et al. [32], who had three participants leave the art therapy group due to the study setting (fear of video recordings). For patients with schizophrenia, this form of voice recording therefore remains an inherent problem. Although the present art therapy study had different circumstances than, e.g., Green et al. [29] and Crawford et al. [31], it showed relatively high adherence to the study approach with comparatively fewer dropouts. The chosen study approach of art therapy image development and picture reflection was therefore feasible in principle over the long period of six months in a study of patients with chronic schizophrenia.

4.2. Suitability of the research instruments

Overall, the two methods of text and voice analysis proved to be suitable research instruments in the hands of an art therapist and in use with patients suffering from chronic schizophrenia. However, apart from their contribution to a predominantly positive balance of the study, there were also limitations to be noted concerning the effort of data preparation and its evaluation.

4.2.1. LIWC2015

The text analysis LIWC2015 has already been used in studies with schizophrenic patients, but never in art therapy research. For a practical application of LIWC2015, the German-language electronic dictionary DE-LIWC2015 [47] would therefore have to be expanded to include terms from art science and art therapy. It would also be important for a complete analysis of the interviews that potential ambiguities of words are clearly recognized in relation to the context of what was said.

The enormous range in the volume of the interviews, from a minimum of 87 to a maximum of 1486 words, was shown in Chapter 3.1.1. A long interview with many words, filler words, and repeated words is described by Just et al. [63] as a typical feature of schizophrenic patients with formal thought disorders. The thought and language disorders of schizophrenic patients were also highlighted in the introduction. Accordingly, this fluctuation could also be seen in the study participants as a feature of language typical of the illness in the interview situation.

LIWC has been continuously developed - since its development by Pennebaker et al. [64] - and was used for the first time in this work in the current software version of LIWC2015 to test its ability to detect therapeutic changes in patients with this clinical picture. What distinguishes LIWC from other text analysis software is that it is also available and validated for the German language [48], [65], [47].

Because the interviews were conducted in German, the text analysis was carried out with the German version of LIWC2015, namely DE-LIWC2015, by Meier et al. [47]. A strength of this study was that an average of 90% of the words from all interviews could be assigned and evaluated. This confirmed the improvement of the extended DE-LIWC2015 dictionary highlighted by Meier et al., who reported an average word coverage of 83% [47].
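Word coverage in this sense is the share of words in a text that can be assigned to at least one dictionary entry. The following sketch only illustrates the idea: the tiny dictionary, the example sentence and the function name `word_coverage` are hypothetical, and the actual LIWC2015 matching (which also handles word stems and categories) is not reproduced here.

```python
import re


def word_coverage(text, dictionary):
    """Percentage of word tokens in text that are found in the dictionary."""
    # Tokens are runs of letters (including umlauts); digits and punctuation are skipped.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    matched = sum(1 for token in tokens if token in dictionary)
    return 100.0 * matched / len(tokens)


# Hypothetical mini-dictionary and transcript fragment.
dictionary = {"ich", "male", "ein", "bild"}
fragment = "Ich male ein Bild, ähm."

print(word_coverage(fragment, dictionary))
```

In this toy example four of the five tokens are found in the dictionary, which gives a coverage of 80%; the 90% reported above is the corresponding average over all interview transcripts.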

Regarding the application of this text analysis procedure in this study, the pre-existence of scientific results from studies with schizophrenic patients was particularly noteworthy. It has not yet been used as a potential control instrument for therapeutic interventions in corresponding studies, but rather to characterize linguistic features of the disorder (e.g., in Hong et al. [66], Minor et al. [67], Bonfils et al. [68], and Just et al. [63]).

4.2.2. VocEmoApI

In this study, the VocEmoApI technology was used for the first time in study participants with chronic schizophrenia.

It was helpful in the data analysis that this emotion recognition technology not only records the known prosodic parameters (fundamental frequency, variation of fundamental frequency, speech rate, etc.), but also automatically assigns voice signals with emotional coloring to 52 emotionally defined parameters (category_v2_scores) and scores them according to how well they fit the respective category. These ratings are the scores, and the emotion categories have names such as sadness, anger, or loving.

The results of the voice analysis show in their descriptive part based on the lead parameters an opposite vocal expression of emotion between the male and female study participants, which is the first thing to be discussed here.

The male study participants' lead parameters were sadness, disgust, and boredom; positive emotions were almost not detected. In facial expression research, male schizophrenic patients showed "a severe reduction of mimic affectivity" that could be recognized and interpreted as "a consequence of the disappearance of genuine joy" [13]. Krause [13] describes the emotional parameters sadness, disgust, and grief as "negative leading affects" of schizophrenic patients. Here, facial expression and voice analyses seem to correspond completely and allow the consistently negative emotional expression of the five male study participants to be classified as a known emotional phenotype in the diagnostic spectrum of schizophrenia.

The two women showed conspicuous and especially contrasting presentation of leading emotional parameters as compared to the male study participants, which is an interesting finding; it remains, however, an essentially unexplained phenomenon in these study results.

As further findings of the descriptive statistics, the commonalities linking all study participants should be discussed. These are the high intensity of the emotions, ranging between emotionally colored and highly emotional, and, in the prosodic features, a low pitch (F0) and slow speech rate. In its current state of research, emotion psychology offers meaningful approaches to a satisfactory interpretation only for the prosodic features, and only in part: Murray and Arnott [69] as well as Scherer and Wallbott [70] associate a low average fundamental frequency (F0) and a slow rate of speech with the emotion 'sadness' or 'dejection'. Similar observations were also made by Stassen [71], Cohen et al. [72], and Martinez-Sánchez et al. [73]. These authors were able to document that the speech samples of schizophrenic patients were characterized by increased pause times, a slower speech tempo, and a lower fundamental frequency compared to healthy control subjects. These findings would fit with the results of the five male study participants, who showed the emotions sadness, disgust, boredom, and grief as leading parameters, among others. However, there is no explanation in this model for the prosodic characteristics of the two women in the study. Their emotional parameters do not lie in the negative, but mainly in the positive emotional spectrum, e.g., longing for P3 and euphoria for P5.

In view of the present study results, the VocEmoApI technology in chronic schizophrenia also seemed to be generally suitable as a control instrument for psychotherapeutic interventions in the broader sense. Cohen and Elvevåg [74] also referred to this potential of automated voice analysis.

The suitability of this instrument for its use in art therapy studies with schizophrenic patients was also demonstrated by the fact that the VocEmoApI technology did not require laboratory conditions for the audio recordings due to the "Voice Activity Detection" (VAD).

4.3. Evidence for procedural changes

For the individual case analyses, the factors of LIWC2015 could only show individual progressions. Across the entire collective, F1_liwc (Presence) showed a statistically significant change and thus the development of increasing confidence in language use. The study participants increasingly succeeded in expressing themselves 'more precisely', i.e., they seemed to gain confidence in language use. The initial urge to speak of individual study participants possibly points to an uncertain conversational style (e.g., using more filler words) and/or to a characteristic feature of schizophrenic patients with formal thought disorders [63]. The statistically significant change in the factor F1_liwc (b = 0.018**; p = 0.005) shows that there were increasingly fewer filler words (filler) in the interview texts of the sample over the course of the study and that the speech was increasingly precise and fluent (fewer nonflu). The factor could therefore be regarded as a criterion, therapeutic per se, for a training effect and as a surrogate marker.

The results of the linear regression analysis of the VocEmoApI voice analysis factors revealed a dichotomy within the sample: study participants with a high level of distress who even experienced a further activation of the F2_emo Introversion factor in the protected manner of the therapist-guided picture reflection (TGPR) and study participants without shares of distress in the voice and without this activation.

Across the entire sample, the results of the VocEmoApI voice analysis show a particularly high sensitivity to change for the factors F3_emo Frustration and F4_emo Stress, Panic & Anxiety. They would thus possibly have the potential of surrogate markers of a sustainable activation of emotions. The activation of emotions could be recognized in a differentiated way with this method of voice analysis. This distinguishes it as a research tool that could also be of general importance for art therapy, especially since the activation of emotions is considered an important effective factor of this complementary form of therapy.

The original assumption that under the conditions of a therapist-guided picture discussion, linguistic and vocal characteristics of chronically schizophrenic patients would improve in the repeated conversations about their own picture could not be confirmed across the board. The results of the linear regression analysis of the LIWC2015 factors nevertheless indicate a procedural change in the sense of the research question of the study. However, this linguistic development was not evident in all study participants, but only in individuals (e.g., P3) and in the statistics on the overall collective. This change was reflected in the factor F1_liwc (Presence), which was interpreted as an increase in confidence in language use and as a training effect.

Of the study participants who showed an emotional process during the study ('Group 1 with change') none became 'happier'. The emotional states of Frustration and Stress, Panic & Anxiety depicted in factors F3_emo and F4_emo decreased significantly during the study and the study participants also reduced their anxiety and agitation levels, but they directed their emotions more 'inwards' and became 'sadder' (see factor F2_emo Introversion).

In this construction, the language and voice of the study participants should also contain the information that indicates an activation of cathartic emotions. In the case of the study participants in whom such an emotional process could be demonstrated in the six months of the study, this was ultimately successful. This variant of emotional activation is again one of the recognized components in the spectrum of general effective factors of artistic therapies [21,75]. The emotion-related information hidden in speech and voice offers itself as a completely new source of target parameters when it comes to demonstrating therapeutic effects of art therapy in studies.

4.4. Dual study approach

A special hope regarding the benefit of internal control in the study design was directed towards the dual approach of text and voice analysis. However, for the individual case analyses, only intra-individual correlations between the findings of both instruments could be shown. The dual use of text and voice analysis provided interesting insights into the linkage of thought, language, and emotions, but beyond that, the control function in the sense of an affirmative or corrective potential was fulfilled only for a small proportion of the results. Over the entire collective, the linear regression analysis showed complementary courses of F4_emo (decrease) and F1_liwc (increase).

This paper presents an elaborated art-therapeutic study approach for chronic schizophrenic patients with a form of therapist-guided picture reflection (TGPR), which could be proposed as a research tool for clinical studies. In addition, for the first time the two instruments LIWC2015 and VocEmoApI were used in an art therapy study in patients with chronic schizophrenia and their suitability for detecting procedural changes in study patients was evaluated in a differentiated manner. This could give new and innovative impetus to efficacy research.

The LIWC2015 instrument showed an extreme dispersion of the factor values over the totality of the interventions, which was apparently due to the individual influence of the pictorial topics on the speakers. The greater this thematic influence on the factors is, of course, the less likely is the possibility of statistically capturing linear, overarching developments in the factors, as the results are too dispersed under this theme. This shows that LIWC2015 was less suitable for showing longer-term developments, at least under the given study conditions, especially the very small sample.

Based on the available results, the voice analysis VocEmoApI could contribute to the basic theory of art therapy, opening insights into emotional processes, which are also discussed as essential therapeutic factors [16,21]. In the present art-therapeutic study approach, the emotional processes at least reveal changes that, in their dynamic and differentiated interplay, would also satisfy therapeutic demands.

The special message of the study results for this perspective lies in the differentiation with which the voice analysis VocEmoApI already identified an emotional constellation in the first interview of the patients, which could possibly prove to be prognostic for the success of the therapy. This combination describes and quantifies a quality of suffering pressure in which the conditions necessary for therapeutic development may be found.

However, because the results were not controlled against baseline data (Phase A), it is not possible to establish a causal relationship between the intervention (TGPR) and the procedural change [33]. Accordingly, the present art-therapeutic study approach offers a plausible 'entry' into efficacy research: it has not yet been possible to prove effectiveness itself, but only to test a study approach and investigative instruments with which such proof can be achieved in a series of follow-up studies.

Should the potential of VocEmoApI voice analysis be confirmed in follow-up studies, this would be of particular importance for the efficacy research of art therapy. In further controlled individual case studies, it might be possible to verify whether there is indeed a causal relationship between the art therapy intervention of a therapist-guided picture reflection (TGPR) and the demonstrated changes in the emotional response of the study patients.

With the vocal analysis method presented here, VocEmoApI, for the first time, the researching art therapist would have at their disposal a tool that would allow them to play their own role in the recruitment of study patients in addition to the purely medical selection criteria. Emotional profiles of art-therapeutic study patients could be created with an examination instrument optimized for art-therapeutic study practice (e.g., in the form of a mobile app as application software). This applies not only to the acquisition of profiles in the recruitment of study participants, with which to optimize the comparability of patients in a study group, but also to efficacy research in the function of variables at trial times of a therapeutic process.

CCSS	Collecting, Checking, Sorting, Subsuming
DGPPN	Deutsche Gesellschaft für Psychiatrie Psychotherapie und Neurologie
DV	Dependent Variables
GAF	Global Assessment of Functioning Scale
GeMAPS	Geneva Minimalistic Acoustic Parameter Set
ICD	International Classification of Diseases and Related Health Problems
KMO	Kaiser, Meyer, and Olkin
LIWC	Linguistic Inquiry and Word Count
NICE	National Institute for Health and Clinical Excellence
PTMI	Psychosocial Therapies for severe Mental Illness
RCT	Randomized Controlled Trials
sensAI	sensitive Audio Intelligence
SSR	Single-Subject Research design
TC	Total Collective
TGPR	Therapist-Guided Picture Reflection
VAD	Voice Activity Detection
VocEmoApI	Vocal Emotion recognition by Appraisal Inference

Ethics approval: Due to the lack of invasiveness of the study approach, the study supervisory team did not initially consider an ethics vote to be necessary.
Thus, a subsequent vote in the sense of ethical clearance was granted on 11/05/2017 by the chairman of the ethics committee of the University of Augsburg (Prof. Dr. Ulrich M. Gassner, Mag. rer. publ. M. Jur. (Oxon.), Faculty of Law, P.O. Box 86135 Augsburg, Germany; + 49 (0) 821 598 4600).

Consent for publication: All seven study participants (and legal guardians) gave their written consent for the publication of the data.

Availability of data: All data underlying the results in this study are available without restriction at https://doi.org/10.5281/zenodo.5929647.

Competing interests: not applicable

Funding: Investigator initiated and funded

Author's contributions: not applicable

Acknowledgments: My thanks go first and foremost to seven study patients and the associated support of the facility management and staff of the Marienheim in Peiting, Germany. I would like to thank Dr. Florian Eyben, Germany, one of the developers of the voice analysis software VocEmoApI. I would also like to thank Dr. Markus Wolf, who initiated the German adaptation of the computer-assisted text analysis LIWC and who has always been at my side with advice and support. Finally, I would like to thank Prof. Dr. Rainer Krause, Germany. With his scientific insights from emotion research, he laid one of the foundations for the interpretation of the study results presented here.

von Spreti F, Martius P. Kunsttherapie. Geschichte, Ansätze, Wirkweisen. In: Rössler W, Matter B, editors. Kunst- und Ausdruckstherapie. Ein Handbuch für die psychiatrische und psychosoziale Praxis. 2nd ed. München: Urban & Fischer; 2013. p. 231–43.
Prinzhorn H. Die Bildnerei der Geisteskranken. 2nd ed. Berlin: Springer; 1922.
Katschnig N. Die Künstler aus Gugging. In: Titze D, editor. Die Kunst der KunstTherapie. vol. 2. Kunstaustausch. Dresden: Sandstein; 2005. p. 106–7.
Dannecker K. Psyche und Ästhetik. Die Transformation der Kunsttherapie. 2nd ed. Berlin: Medizinische Wissenschaftliche Verlagsgesellschaft; 2010.
Kwiatkowski G. Meyers kleines Lexikon der Kunst. Berlin: Bibliogr. Inst.; 1986. p. 58–9.
Bäuml J, Martius P. Symptomatik, Ätiologie und Behandlung der schizophrenen Psychose. In: von Spreti F, Martius P, Förstl H, editors. Kunsttherapie bei psychischen Störungen. 2nd ed. München: Elsevier; 2012. p. 58–9.
Leucht S, Väth R, Olbrich HM, Jäger M. Schizophrenien und andere psychotische Störungen. In: Berger M, editor. Psychische Erkrankungen: Klinik und Therapie. 5th ed. München: Elsevier; 2015. p. 301–438.
Bleuler E. Lehrbuch der Psychiatrie. 14th ed., revised by Bleuler M. Berlin-Heidelberg-New York: Springer; 1979.
Tölle R. Psychiatrie. 17th ed. Berlin-Heidelberg-New York: Springer; 2014. p. 190–231.
Süllwold L. Psychologische Behandlung schizophren Erkrankter. Stuttgart-Berlin-Köln: Kohlhammer; 1990.
Süllwold L. Schizophrenie. 3rd ed. Stuttgart-Berlin-Köln: Kohlhammer; 1995.
Wing JK. Schizophrenie in Selbstzeugnissen. In: Katschnig H, editor. Die andere Seite der Schizophrenie. Patienten zu Hause. 3rd ed. München: Psychologie Verlags Union; 1989. p. 21–9.
Krause R. Allgemeine psychodynamische Behandlungs- und Krankheitslehre. 2nd ed. Stuttgart: Kohlhammer; 2012. p. 83–4.
Ekman P, Rosenberg EL. What the face reveals: Basic and applied studies of spontaneous expression using the facial action coding system (FACS). New York-Oxford: Oxford University Press; 1997.
Ekman P, Friesen W. A new pan-cultural facial expression of emotion. Motivation and Emotion. 1986; 10 (2): 159–68.
DGPPN-Deutsche Gesellschaft für Psychiatrie und Psychotherapie, Psychosomatik und Nervenheilkunde, editor. S3-Leitlinie Psychosoziale Therapien bei schweren psychischen Erkrankungen. Berlin: Springer; 2019.
Fuchs T. Kunst und das 'Als-ob'. Anthropologische Anmerkungen. In: von Spreti F, Martius P, Steger F, editors. KunstTherapie. Wirkung - Handwerk - Praxis. Stuttgart: Schattauer; 2018. p. 1–5.
Peciccia M, Benedetti G. Das progressive therapeutische Spiegelbild. In: Schottenloher G, editor. Wenn Worte fehlen, sprechen Bilder. Bildnerisches Gestalten und Therapie. Reflexionen. München: Kösel; 1994. p. 91–4.
Betensky MG. Kunsttherapie und künstlerische Äußerung aus phänomenologischer Sicht. In: Rubin JA, editor. Richtungen und Ansätze der Kunsttherapie. Theorie und Praxis. Karlsruhe: Gerardi; 1991. p. 167–84.
DGPPN-Deutsche Gesellschaft für Psychiatrie und Psychotherapie, Psychosomatik und Nervenheilkunde, editor. S3-Leitlinie Schizophrenie. Berlin: Springer; 2019.
DGPPN-Deutsche Gesellschaft für Psychiatrie und Psychotherapie, Psychosomatik und Nervenheilkunde, editor. S3-Leitlinie Psychosoziale Therapien bei schweren psychischen Erkrankungen. Berlin: Springer; 2013.
National Institute for Clinical Excellence. Schizophrenia: Core Interventions in the Treatment and Management of Schizophrenia in Adults in Primary and Secondary Care. NICE clinical guideline 82. London: NICE; 2009.
Ruiz MI, Aceituno D, Rada G. Art therapy for schizophrenia? Medwave. 2017; 17 (1): https://www.medwave.cl/link.cgi/Medwave/PuestaDia/ResEpis/6845.
Apotsos P. Art therapy in psychosocial rehabilitation of patients with mental disorders. Psychiatriki. 2012; 23(3): 245–54.
Attard A, Larkin M. Art therapy for people with psychosis: a narrative review of the literature. Lancet Psychiatry. 2016; 3(11): 1067–78.
van Lith T. Art therapy in mental health: A systematic review of approaches and practices. The Arts in Psychotherapy. 2016; 47: 9–22.
Ruddy R, Milnes D. Art therapy for schizophrenia-like illnesses. Cochrane Database Syst Rev. 2005; CD003728. doi: 10.1002/14651858.CD003728.pub2.
Maujean A, Pepping CA, Kendall E. A systematic review of Randomized Controlled Studies of Art Therapy. Art Therapy. 2014; 31: 37–44.
Green BL, Wehling C, Talsky GJ. Group art therapy as an adjunct to treatment for chronic outpatients. Hosp Community Psychiatry. 1987; 38 (9): 988–91.
Richardson P, Jones K, Evans C, Stevens P, Rowe A. Exploratory RCT of art therapy as an adjunctive treatment in schizophrenia. Journal of Mental Health. 2007; 16 (4): 483–91.
Crawford MJ, Killaspy H, Kalaitzaki E, Barrett B, Byford S, Patterson S et al. Group art therapy as an adjunctive treatment for people with schizophrenia: a controlled trial (MATISSE). Health Technology Assessment. 2012; 16 (8).
Montag C, Haase L, Seidel D, Gallinat J, Herrmann U, Dannecker K. A Pilot RCT of Psychodynamic Group Art Therapy in Acute Psychotic Episode: Feasibility, Impact on Symptoms and Mentalizing Capacity. PLOS ONE. 2014; 9 (11): e112348. doi: 10.1371/journal.pone.0112348.
Petermann F. Einzelfallanalysen. München: Oldenbourg; 1996.
Julius H, Schlosser RW, Goetze H. Kontrollierte Einzelfallstudien. Göttingen: Hogrefe; 2000.
Riley-Tillman TC, Burns MK. Evaluating Educational Interventions. London-New York: The Guilford Press; 2009.
Vohra S, Shamseer L, Sampson M, Bukutu C, Schmid CH, Tate R et al. CENT Group. CONSORT extension for reporting N-of-1 trials (CENT) 2015 Statement. BMJ. 2015 May 14; 350: h1738. doi: 10.1136/bmj.h1738.
von Spreti F. Kunsttherapie bei schizophrenen Patienten. In: von Spreti F, Martius P, Förstl H, editors. Kunsttherapie bei psychischen Störungen. 2nd ed. München: Urban & Fischer; 2012. p. 61–74.
Schrode H. Klinische Kunst- und Gestaltungstherapie. Regression und Progression im Verlauf einer tiefenpsychologisch fundierten Therapie. Stuttgart: Klett-Cotta; 1995.
Vopel KW. Kunsttherapie für die Gruppe. Spiele und Experimente. 2nd ed. Salzhausen: Iskopress; 2011. p. 18–20.
Landgarten HB. Klinische Kunsttherapie. Ein umfassender Leitfaden. Karlsruhe: Gerardi; 1990.
Misoch S. Qualitative Interviews. Berlin: De Gruyter; 2015.
Helfferich C. Die Qualität qualitativer Daten. Manual für die Durchführung qualitativer Interviews. 4th ed. Berlin: Springer; 2011.
Stuhler-Bauer A, Elbing U. Die phänomenologische Bilderfassung: Ein kunsttherapeutisches Instrument. Zeitschrift für Musik-, Tanz- und Kunsttherapie. 2003; 14 (1): 32–46.
Bader R. Das Bild ernst nehmen. In: Sinapius P, Wendlandt-Baumeister M, Niemann A, Bolle R, editors. Bildtheorie und Bildpraxis in der Kunsttherapie. Frankfurt a. M.: Lang; 2010.
Titze D. Formanalytische KunstTherapie. Zur integrativen und spezifischen Qualität von Kunst und Therapie. In: Mayer H, Niederreiter L, Staroszynski T, editors. Kunstbasierte Zugänge zur Kunsttherapie. München: Kopaed; 2015.
Pennebaker JW, Boyd RL, Jordan K, Blackburn K. The development and psychometric properties of LIWC2015. Austin, TX: University of Texas at Austin. http://liwc.wpengine.com/wp-content/uploads/2015/11/LIWC2015_LanguageManual.pdf.
Meier T, Boyd RL, Pennebaker JW, Mehl MR, Martin M, Wolf M et al. 'LIWC auf Deutsch': The Development, Psychometrics, and Introduction of DE-LIWC2015. Retrieved from https://osf.io/tfqzc/. 2018.
Wolf M, Horn AB, Mehl MR, Haug S, Pennebaker JW, Kordy H. Computergestützte quantitative Textanalyse. Äquivalenz und Robustheit der deutschen Version der Linguistic Inquiry and Word Count. Diagnostica. 2008; 54 (2): 85–98.
SensAI WebAPI Documentation. 2017 (Available on request from the developer of the software at: [email protected] or http://www.audeering.com)
Eyben F, Scherer KR, Schuller B, Sundberg J, André E, Busso C et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing. IEEE Transactions on Affective Computing. 2016; 7 (2): 190–202.
Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, N.J.: Erlbaum; 1988.
Döring N, Bortz J. Forschungsmethoden und Evaluation in den Sozial- und Humanwissenschaften. 5th ed. Berlin-Heidelberg: Springer; 2016.
Wentura D, Pospeschill M. Multivariate Datenanalyse. Eine kompakte Einführung. Wiesbaden: Springer; 2015.
Horn J. A rationale and test for the number of factors in factor analysis. Psychometrika. 1965; 30 (2): 179–85.
Bühner M, Ziegler M. Statistik für Psychologen und Sozialwissenschaftler. 2nd ed. Stuttgart: Alfred Krönner; 2017.
Pfister B, Kaufmann T. Sprachverarbeitung. Grundlagen und Methoden der Sprachsynthese und Spracherkennung. 2nd ed. Berlin: Springer Vieweg; 2017.
Tausczik YR, Pennebaker JW. The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. Journal of Language and Social Psychology. 2010; 29 (1): 24–54.
Schwitalla J. Gesprochenes Deutsch. Eine Einführung. 3rd ed. Berlin: Schmidt; 2006.
Krause R. Psychodynamik der Emotionsstörungen. In: Scherer KR, editor. Psychologie der Emotionen. Göttingen-Toronto-Zürich: Hogrefe; 1990. p. 630–8.
Ulich D, Mayring P. Psychologie der Emotionen. 2nd ed. Stuttgart: Kohlhammer; 2003.
Tischer B. Die vokale Kommunikation von Gefühlen. München: Psychologie Verl. Union; 1993.
Fuchs T. Verkörperte Emotion - Wie Gefühl und Leib zusammenhängen. Psychologische Medizin. 2014; 25 (1): 13–9.
Just SA, Haegert E, Koránová N, Bröcker AL, Nenchev I, Funcke J et al. Modelling incoherent discourse in non-affective psychosis. Front Psychiatry. 2020; 11: 1–11.
Pennebaker JW, Francis ME, Booth RJ. Linguistic Inquiry and Word Count: LIWC2001. Mahwah NJ: Erlbaum; 2001.
Linnenbürger A, Greb C, Gratzel D. PRECIRE Technologies. In: Stulle KP, editor. Psychologische Diagnostik durch Sprachanalyse. Wiesbaden: Springer Fachmedien; 2018. p. 23–56.
Hong K, Nenkova A, March ME, Parker AP, Verma R, Kohler C. Lexical use in emotional autobiographical narratives of persons with schizophrenia and healthy controls. Psychiatry Research. 2015; 225: 40–9.
Minor KS, Bonfils KA, Luther L, Firmin RL, Kukla M, MacLain VR et al. Lexical analysis in schizophrenia: how emotion and social word use informs our understanding of clinical presentation. Journal of Psychiatric Research. 2015; 64: 74–8.
Bonfils KA, Luther L, Firmin RL, Lysaker PH, Minor KS, Salyers MP. Language and hope in schizophrenia-spectrum disorders. Psychiatry Research. 2016; 245: 8–14.
Murray IR, Arnott JL. Towards the Simulation of Emotion in Synthetic Speech: A Review of the Literature on Human Vocal Emotion. Journal of the Acoustical Society of America. 1993; 93 (2): 1097–108.
Scherer KR, Wallbott HG. Ausdruck von Emotionen. In: Scherer KR, editor. Psychologie der Emotionen. Göttingen: Hogrefe; 1990. p. 345–420.
Stassen HH. Affekt und Sprache. Stimm- und Sprachanalysen bei gesunden, depressiven und schizophrenen Patienten. Berlin-Heidelberg-New York: Springer; 1995.
Cohen AS, Yunjung K, Najolia GM. Psychiatric symptom versus neurocognitive correlates of diminished expressivity in schizophrenia and mood disorders. Schizophrenia Research. 2013; 146 (1–3): 249–53.
Martínez-Sánchez F, Muela-Martínez JA, Cortés-Soto P, Meilán JJ, Ferrándiz JJV, Caporrós AE et al. Can the Acoustic Analysis of expressive Prosody Discriminate Schizophrenia? The Spanish Journal of Psychology. 2015; 18 (86): 1–9.
Cohen AS, Elvevåg B. Automated computerized analysis of speech in psychiatric disorders. Current Opinion in Psychiatry. 2014; 27 (3): 203–9.
Schuster M. Kunsttherapie in der psychologischen Praxis. Berlin-Heidelberg: Springer; 2014.
