In this observational prospective cohort study, we modeled depression symptoms using NLP outputs from sparse written text collected through a pregnancy smartphone app that was delivered to patients as part of their routine prenatal care. The model performed as well as or better than other machine learning models for maternal depression prediction, which have often been built on larger data sets with a greater number of variables34–35. Our results add to a new, but growing, literature indicating that even sparse language can be used to predict depression symptoms. Moreover, this study focuses on a population for whom depression can have severe consequences. By capturing language through a prenatal smartphone app, this study also lays a foundation for wider-scale remote assessment of maternal depression from patients’ everyday language.
Specifically, we find that several natural language features are indicative of depression symptoms: tone; first-person plural pronoun use; specific topics, such as space, mental and pregnancy-related health, and temporal wants; context-derived syntactic and semantic dimensions; and word count. Moreover, these features capture a unique aspect of symptom level beyond current mood, baseline demographics, or clinical risk factors. The best-performing model identified incident depression in a 30-day window, with mood, topics focused on mental health and pregnancy-related health issues, and syntactic/semantic features all associated with depression symptoms. Pronoun use and topics associated with depression symptoms could reflect aspects of social isolation, e.g., use of “I” rather than “we” and references to staying in or needing to be in certain physical spaces. Our results also shed light on the types of topics that current mood may be capturing, such as temporal desires (captured by the LIWC topic “wants”, e.g., “wish”, “hopeful”).
In an illustrative contrast to prior studies finding that use of the first-person singular is associated with depression36–40, we find that first-person plural pronoun use is negatively associated with (or protective against) depression symptoms. The use of first-person plural pronouns in our predominantly partnered sample, particularly during pregnancy, could be indicative of the strength of the existing family structure. Use of “we” rather than “I” when discussing pregnancy may indicate degree of bonding in the partnership unit or the mother-infant dyad, consistent with literature on the protective effects of social support and mother-child bonding41. This linguistic focus on first-person plural pronouns may indicate a protective counterpoint of social supports in opposition to the self-focus or self-criticism suggested by first-person singular pronouns that has been shown to be harmful in prior work35, 42–43.
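To make the pronoun contrast concrete, the following is a minimal sketch of how first-person singular and plural pronoun rates could be computed from a single entry. It is not the LIWC dictionary or the pipeline used in this study: the word lists, the tokenization rule, and the function name `pronoun_rates` are illustrative assumptions.

```python
import re

# Illustrative word lists (an assumption; not the proprietary LIWC categories).
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}
FIRST_PERSON_PLURAL = {"we", "us", "our", "ours", "ourselves"}


def pronoun_rates(text):
    """Return the share of tokens that are first-person singular or plural pronouns."""
    # Normalize curly apostrophes, lowercase, and keep alphabetic tokens with optional contraction.
    normalized = text.replace("\u2019", "'").lower()
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)
    # Strip the contraction suffix so that "we're" counts as "we" and "i'm" as "i".
    stems = [token.split("'")[0] for token in tokens]
    total = len(stems)
    if total == 0:
        return {"singular": 0.0, "plural": 0.0}
    singular = sum(1 for s in stems if s in FIRST_PERSON_SINGULAR)
    plural = sum(1 for s in stems if s in FIRST_PERSON_PLURAL)
    return {"singular": singular / total, "plural": plural / total}


if __name__ == "__main__":
    entry = "We are so excited. Our baby kicked and I felt it."
    print(pronoun_rates(entry))
```

Rates are normalized by entry length so that longer entries do not mechanically produce higher pronoun counts, which matters here because word count is itself associated with depression symptoms.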
Notably, the theme of mental health was retained as a predictive language feature in the 30-day timeframe when controlling for mood and other baseline characteristics. That those with moderate to severe depression symptoms were writing about their mental health (e.g., psychiatrist, zoloft, trauma) suggests that these writers already understood their depression status and were perhaps sensitive to their own ebbs and flows. Previous studies have shown that those who are depressed may find writing therapeutic44, while others have found that re-living events can be either therapeutic or harmful45. Here, when given the opportunity to share writing in a pregnancy app, individuals experiencing depression symptoms wrote about their mental health and wrote more extensively than those who were not depressed. In addition to being a tool for eliciting depression symptoms between routine prenatal care visits, such tools may offer an additional opportunity for sharing, particularly if structured to support therapeutic rather than harmful disclosure of experiences. To do this effectively, future work should explore how digital tools elicit writing, to understand which prompts and formats allow for therapeutic disclosure.
Much of our data were collected during a pandemic and through periods of mandated self-isolation with fewer in-person clinical appointments. However, even though COVID-19 was explicitly included as a novel LIWC theme, it was not retained as an indicator of depression symptoms in our modeling. This result suggests that other topics, which may be consequent to COVID-19 pandemic experiences but do not specifically reference COVID-19 precautions or symptoms, are more directly indicative of depression.
Consistent with other literature46–48, we find that a lack of fluctuation in a text’s sentiment is symptomatic of depression (i.e., less varied, or “flattened”, affect in language is associated with depression symptoms). We also find that high-dimensional representations of the underlying syntactic and semantic content of open-ended text, captured by word2vec features, were indicative of depression symptoms, even more so when paired with self-reported mood. While the word2vec features are not easily interpretable, these findings suggest that underlying word choice carries information that is distinct from explicitly psychologically meaningful themes. Future work should examine whether these word-based language features could be used as an automated trigger for depression screening among patients of a specific healthcare system, as has been discussed in the context of social media49.
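As an illustration of what “flattened” affect means operationally, the sketch below scores each sentence of an entry against a small valence lexicon and summarizes fluctuation as the standard deviation of those scores. This is a simplified stand-in under stated assumptions, not the sentiment model used in this study: the lexicon `TOY_LEXICON`, its weights, and the function names are hypothetical.

```python
import re
import statistics

# Toy valence lexicon (an illustrative assumption, not the study's sentiment model).
TOY_LEXICON = {
    "happy": 1.0,
    "excited": 1.0,
    "hopeful": 0.5,
    "tired": -0.5,
    "sad": -1.0,
    "scared": -1.0,
}


def sentence_scores(text, lexicon=TOY_LEXICON):
    """Score each sentence as the mean valence of its lexicon words (0.0 if none match)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    scores = []
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        hits = [lexicon[w] for w in words if w in lexicon]
        scores.append(sum(hits) / len(hits) if hits else 0.0)
    return scores


def sentiment_variability(text, lexicon=TOY_LEXICON):
    """Population standard deviation of sentence scores; 0.0 for fewer than two sentences."""
    scores = sentence_scores(text, lexicon)
    return statistics.pstdev(scores) if len(scores) > 1 else 0.0


if __name__ == "__main__":
    varied = "I am so happy today. I feel sad and scared tonight."
    flat = "I am tired. I am tired again."
    print(sentiment_variability(varied), sentiment_variability(flat))
```

Under this toy measure, lower values correspond to less varied affect. With entries as short as those in this sample, such a statistic would rest on very few sentences and should be read with caution.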
Our findings should be interpreted in the context of the study’s naturalistic, patient-led data generation. While the self-motivated collection approach translates clearly to practice, the resulting data are sparse and tend to be highly topically focused. Thus, we may not have captured the full range of language features that a more directed data collection structure could reveal as indicators of depression symptoms among pregnant people. For example, we did not find any emergent topics associated with depression symptoms in this sample, likely due to topic models’ need for larger bodies of text. It is possible that the number and, especially, the length of the written texts were not sufficient to expose more subtle or infrequent but meaningful themes. Future work could include manual coding of the entries to clarify more nuanced themes and experimental data collection with and without clear writing prompts.
The naturalistic structure of the study added some noise to our text. Some individuals used open-ended text opportunities to track blood pressure readings or take notes on medical appointments. How individuals used the open-ended writing opportunities in the tool is something that could be explored in future work. We find that individuals with a history of depression or anxiety tend to use the overall tool for longer and write more frequently. However, this disproportionate use of the tool by those who have a history of depression may also be a strength if prenatal apps offer an additional means of therapeutic disclosure and connection to patients who are more likely to become depressed.
To the best of our knowledge, this is one of the first prospective longitudinal studies to use natural language collection50 and the first focused on maternal depression symptom prediction. Incorporating language inputs yields moderate predictive ability for depressive symptoms among peripartum patients in a large academic health system. This work points to an immediate value in using digital tools for depression symptom evaluation and support between routine clinical care appointments. It also indicates the potential for future analysis of app-elicited language to trigger mental health care provision.