The proposed voice AI model discriminates between CD and CN individuals by analyzing “one-minute conversations”. The high discrimination accuracy of 0.950 attained by this simple method demonstrates the feasibility of using short conversational voice as a practical screening tool for assessing cognitive function and alerting individuals or their families to possible cognitive decline, thereby enabling early treatment.
Extensive research on AI-based dementia assessment, particularly for AD, has been conducted worldwide. Practical digital biomarkers for diagnosing dementia can reduce the burden on clinical practice. However, this is difficult to realize using only simple evaluation methods, such as short conversations. The diagnosis of dementia is complicated and should not be based solely on neuropsychological test scores or on blood, cerebrospinal fluid, or imaging biomarkers. The DSM-5 criteria for diagnosing dementia require significant cognitive decline from a previous level of performance, impairment in activities of daily living, and the exclusion of psychiatric disorders. Because detailed interviews with family members and caregivers, an understanding of actual living conditions, and the exclusion of psychiatric disorders are essential for diagnosis, we assumed that simple digital biomarkers alone could not cover all these criteria.[34] In recent years, significant progress has been made in developing AI systems that use observational methods to assess how daily life affects the well-being of older adults.[35] Although privacy concerns remain, digital biomarkers capable of diagnosing cognitive decline and dementia by integrating multiple assessments are expected in the future.
To clinically diagnose dementia based on pathological findings, cerebrospinal fluid, blood, and neuroimaging tests must be performed to identify abnormal protein accumulation in the brain. For example, under the ATN classification, confirming amyloid-beta and tau protein accumulation is essential for diagnosing AD and determining eligibility for DMT.[36] Pathological diagnosis of DLB and other synucleinopathies using cerebrospinal fluid and neuroimaging is also approaching clinical practice.[37,38] In an era in which DMTs are used effectively, it is crucial to diagnose neurodegenerative dementias pathologically. Moreover, it is essential to confirm the results of imaging modalities, such as positron emission tomography (PET) and magnetic resonance imaging (MRI), as well as cerebrospinal fluid or blood biomarkers. Because diagnosing dementia using AI without clinical tests is challenging, the focus has shifted from developing digital biomarkers for diagnosing dementia to designing strategies that facilitate early medical intervention for patients with cognitive decline, that is, at an early stage of dementia. This shift allows the effective use of forthcoming therapeutic agents. In this regard, our proposed approach, which requires no specialized environment or equipment, represents a highly significant milestone.
Conversation and language abilities become impaired in the early stages of most dementias.[5] Recent studies have focused on AI-based assessments that employ speech and language. Typical testing procedures involve extracting pertinent features and then inputting them into machine- or deep-learning classifiers to identify patterns consistent with dementia. Two primary feature types, acoustic and linguistic, can be extracted and analyzed from human conversational voice.[10] Acoustic features describe how individuals articulate speech, whereas linguistic features describe content aspects, such as vocabulary, grammar, and syntax. According to a recent review, analyzing linguistic features yields better accuracy (0.925) than using acoustic features alone (0.786), and combining linguistic and acoustic features (0.939) outperforms either in isolation.[10] The voice analysis AI developed in this study predominantly analyzed acoustic features and achieved a higher accuracy (0.950) than previous studies employing acoustic features alone. Using acoustic features alone may offer the following advantages over using linguistic features: because conversational voice need not be converted to text, transcription errors cannot affect the results; only a short conversation sample is needed; and the influence of Japanese dialectal characteristics is reduced.
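To make the general pipeline concrete, the following minimal sketch extracts a few generic acoustic features and applies a simple classifier. It is purely illustrative: the feature choices (energy, zero-crossing rate, spectral centroid) and the nearest-centroid decision rule are our assumptions for exposition, not the feature set or classifier of the voice AI model described here.

```python
import numpy as np

def acoustic_features(waveform, sr=16000):
    """Extract a few generic acoustic features from a mono waveform."""
    energy = float(np.mean(waveform ** 2))                        # loudness proxy
    zcr = float(np.mean(np.abs(np.diff(np.sign(waveform)))) / 2)  # zero-crossing rate
    spectrum = np.abs(np.fft.rfft(waveform))
    freqs = np.fft.rfftfreq(waveform.size, d=1.0 / sr)
    centroid = float((freqs * spectrum).sum() / (spectrum.sum() + 1e-12))  # spectral centroid (Hz)
    return np.array([energy, zcr, centroid])

def nearest_centroid_predict(features, class_means):
    """Assign a sample to the class whose mean feature vector is closest."""
    return int(np.argmin([np.linalg.norm(features - m) for m in class_means]))
```

In practice, such feature vectors would be computed for each labeled recording and fed to a trained machine- or deep-learning classifier rather than this toy decision rule.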
Recently, two types of tests have mainly been developed to analyze conversational voice: a picture description test (participants describe a picture while their voice is recorded) and a conversation generated by an interview. In our study, although voice data for ML were obtained from both picture descriptions and interviews, the discrimination test was based only on “one-minute conversations”. The one-minute conversation was not provoked by a special task, as in the picture description test, but was a spontaneous conversation based on the individual’s episodic memory, resembling an interview task. In interview-based diagnosis, subjects answer multiple questions posed by humans or avatars, and their acoustic and linguistic features are analyzed to identify patients with cognitive decline and dementia.[20,39-41] However, answering multiple questions makes the test time-consuming and may give subjects the impression that their cognitive function is being tested. In contrast, the ability to discriminate between CD and CN with high accuracy using short conversational voice data elicited by only one question makes this method simple and widely applicable in clinical settings. Another advantage is that, unlike tests with definite correct answers, freeform conversations have no fixed answers; this reduces the learning effect and makes repeated administration easier. The proposed voice AI model identified CN with 100% accuracy. The absence of false positives (that is, a CN individual classified as CD) indicates its usefulness as a screening test in real clinical situations, preventing unnecessary worry or anxiety in healthy individuals and avoiding the medical burden and cost of unnecessary additional testing.
However, our voice AI model cannot detect MCI-level cognitive decline, equivalent to an MMSE score of 24–27, because our cutoff score was 23/24. Identifying patients with cognitive impairment before progression to dementia, specifically at the SCD and MCI levels, is crucial, as early intervention provides more opportunities for prevention, care, and effective use of treatments such as DMT for AD. To identify cognitive impairment at the MCI level, it will be necessary to adjust the cutoff MMSE score used for data labeling and to perform ML that incorporates CDR scores of 0.5 and the results of more detailed neuropsychological tests. In the future, additional ML using voice data from individuals with milder cognitive decline should be performed to explore the potential for detecting such decline with higher accuracy. Another limitation of this study is that voice data were collected only once per individual, making it impossible to evaluate repeated recordings and confirm that the voice AI model's decision is consistent within the same individual. In other words, the possibility that discrimination results for the same individual may vary with their condition on a given day (e.g., lack of sleep, alcohol consumption, or accidental forgetfulness) must be considered to avoid incorrect decisions.
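The cutoff-based labeling described above can be sketched as follows. The binary 23/24 rule reflects the cutoff stated in the text; the extension that combines an MMSE of 24–27 with a CDR of 0.5 to flag MCI is a hypothetical illustration of the proposed adjustment, not our actual preprocessing code.

```python
def label_participant(mmse, cdr=None):
    """Illustrative data-labeling rule for training.

    Binary rule from the text: MMSE <= 23 -> "CD", otherwise "CN"
    (the 23/24 cutoff). As a hypothetical extension, MMSE 24-27
    together with CDR 0.5 is labeled "MCI" to target milder decline.
    """
    if mmse <= 23:
        return "CD"
    if cdr == 0.5 and 24 <= mmse <= 27:
        return "MCI"
    return "CN"
```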
The proposed voice AI model is novel in its ability to accurately detect cognitive decline based solely on minute-long conversations. To date, no free-conversation-based AI application has received pharmaceutical approval for dementia detection. Building on this technology, we aim to develop AI medical software that detects cognitive decline from minute-long conversations and is accessible via mobile devices such as smartphones. Patients with mild dementia are often unaware of their cognitive decline, resulting in fewer proactive visits to healthcare facilities for cognitive assessment. Even when family or friends raise suspicion, patients often resist cognitive assessment. In addition, geographical barriers make it difficult to seek medical evaluation, particularly in rural areas. Therefore, developing AI medical software that is universally accessible, respects personal dignity and privacy, and imposes minimal mental, physical, and financial burdens to support dementia diagnosis would improve diagnostic accuracy and promote widespread adoption. Moreover, digital biomarkers based on language and conversation could detect changes in cognitive function before conventional medical examinations, offering potential applications for the early diagnosis and detection of mental disorders such as depression.[42] Although uncertainty remains about how to connect individuals identified as having cognitive decline to medical institutions, we believe that an AI-assisted simple cognitive function screening tool using short conversational voice can be valuable in an era in which dementia is on the rise.