Study Design
Our study consisted of a one-time, cross-sectional survey of United States (U.S.)-based adults conducted in September 2022. We sampled a general U.S. adult population to elicit the public’s perspectives on AI for mental health. We partnered with Prolific, an online survey sampling platform, to recruit participants. Prolific provides access to an international sample of verified users (over 100 000 residing in the U.S.) who are willing to be involved in survey research studies. Prolific matches eligible participants with research studies to streamline the recruitment, data collection, and compensation processes. To be eligible, prospective participants had to be verified Prolific users aged 18 or older who were fluent in written and spoken English. This study was approved by the BRANY Institutional Review Board.
Questionnaire Design
We designed our questionnaire to mimic items asked in a previous study by Khullar et al. but applied to perspectives on AI for mental health, rather than healthcare broadly. Question categories related to AI for mental health comprised: 1) perceived benefits, 2) concerns, and 3) comfort with AI for specific predictive tasks. Adapting the questions on perceived benefits and concerns predominantly involved updating language referring to “health” or “healthcare” broadly to “mental health” or “mental healthcare”. In the Khullar et al. questionnaire, questions regarding predictive tasks included reading a screening test (i.e., a chest x-ray); making a diagnosis for two conditions of differing severity (pneumonia and cancer); telling a patient they had either of these conditions; and making a treatment recommendation. Our team worked with a trained psychiatrist to construct tasks following similar patterns but pertaining to mental health treatment, adding two more tasks (seven total) to explore more sensitive concepts related to mental health. Appendix A.1 presents the questions in each of these categories along with the question from which each was adapted, as applicable.
We also extended the questionnaire to capture participants’ values pertaining to AI design and implementation for mental health, with the goal of facilitating more patient-centered design of future AI applications for mental health. This section asked participants to rate the importance of various statements about AI for mental health, informed by the constructs of MITRE’s bioethical framework. Appendix A.2 displays the value statements presented to participants, organized by the relevant bioethics construct.
In addition to the perspectives and values questions, participants also provided socio-demographic information regarding personal characteristics, health literacy, subjective numeracy, previous mental health care experience, and pregnancy history (results reported in a forthcoming manuscript). The full battery of socio-demographic questions may be found in Appendix A.3.
Lastly, following the sections on concerns and values, the survey contained open-ended questions allowing participants to provide free-text responses describing additional concerns or values (included in the relevant appendices).
We designed the battery of questions with input from experts in ML (co-authors JP and YZ), human-centered design (co-authors MRT, NCB, and PD), psychiatry (co-author AH), and the author of the original survey from which the questions were adapted (Khullar et al.). The survey questions also underwent two rounds of pilot testing to improve their comprehensibility and to gauge the time needed to complete the questionnaire. The question-and-answer design was optimized and pilot-tested for both desktop and mobile (i.e., smartphone) completion to ensure that those with different device access or preferences could participate in the study.
Participants
All participants were recruited from Prolific’s survey sampling panel and were verified users who had agreed to participate in research studies via the Prolific website. Our sample included those 18 years or older, residing in the U.S., who could speak and read English. We recruited a sample representative of the adult U.S. population in terms of age, race, and gender, according to the U.S. Census. We initially recruited 530 survey respondents, of whom 30 did not begin the survey after reading the informed consent document, yielding a total of 500 respondents. All 500 respondents finished the survey (zero incomplete responses), with a median completion time of 15 minutes and 24 seconds.
Data Collection
Our team designed and programmed the questionnaire using the Qualtrics XM platform. Participants received an invitation through Prolific and then began the questionnaire by clicking a secure, anonymous link to Qualtrics. Participants could complete the survey on any smartphone, tablet, or computer with an Internet connection. Prior to beginning the survey, participants read an information sheet and consented to participate in this specific study. There was no time limit for completing the questionnaire; participants could stop and return to finish it at a later time, and could discontinue the survey at any point. Participants who completed the full questionnaire were compensated at an hourly rate of $13.60 in accordance with Prolific’s policies.
Data Analysis
Quantitative Analysis
The first level of analysis involved assessing descriptive statistics to understand trends in participant perceptions and values. We also selected an outcome of interest (perceived benefit of AI for mental health) and fit a logistic regression model to examine whether perceived benefits differed by socio-demographic factors, specifically age, gender, race, education, financial resources, mental health history, and self-rated health literacy.40 The alpha level for all analyses was set at 0.05, and analyses were conducted in R version 4.2.1 (R Foundation for Statistical Computing, Vienna, Austria, 2022). An analysis of a subset of these data (only those reporting female sex at birth) examining differences in perspectives by pregnancy history has been reported in a separate manuscript.61
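To make the modeling step concrete, the sketch below fits a logistic regression of a binary outcome (e.g., perceived benefit: yes/no) on a small set of predictors by maximum likelihood. This is a hypothetical, self-contained Python illustration of the general technique, not the study’s analysis code (which was written in R); the simulated data, coefficients, and variable names are all assumptions.

```python
import math
import random

# Illustrative sketch only: simulated data, NOT the study's survey responses.
random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical design matrix: intercept, age (scaled), self-rated health literacy.
n = 500
X = [[1.0, random.gauss(45, 15) / 10.0, random.random()] for _ in range(n)]
true_beta = [-0.5, 0.1, 1.2]  # assumed "true" coefficients for the simulation

# Simulate a binary outcome (1 = perceives benefit) from the logistic model.
y = [1 if random.random() < sigmoid(sum(b * x for b, x in zip(true_beta, xi))) else 0
     for xi in X]

# Fit by gradient ascent on the log-likelihood (maximum likelihood estimation).
beta = [0.0, 0.0, 0.0]
learning_rate = 0.5
for _ in range(2000):
    grad = [0.0, 0.0, 0.0]
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * x for b, x in zip(beta, xi)))
        for j in range(3):
            grad[j] += (yi - p) * xi[j]
    beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]

# Coefficients are interpreted on the log-odds scale; exponentiate for odds ratios.
odds_ratios = [math.exp(b) for b in beta]
```

In practice this hand-rolled fit would be replaced by a standard routine, such as R’s `glm(outcome ~ age + gender + ..., family = binomial)` on the observed survey data, which also supplies standard errors and p-values for the 0.05 alpha level described above.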
Qualitative Analysis
We analyzed free-text responses using inductive thematic analysis and the constant comparative process. One analyst (PD) initially reviewed the responses and created a draft codebook. Free-text responses to the two open-ended questions were analyzed using a single coding scheme. A second analyst (ZR) then used the coding scheme to independently dual-code each free-text response. The analysts met with a third team member (NB) to resolve discrepancies, coding via consensus and updating the codebook throughout the discussion. Once detailed codes had been developed and 50% of the initial coding was completed, the team conducted axial coding, developing higher-level summary themes to describe patterns in the detailed codes.