DementiaBank

ADReSS 2020 Challenge

INTERSPEECH 2020 - Alzheimer's Dementia Recognition through Spontaneous Speech: The ADReSS Challenge

This challenge was organized by Saturnino Luz, Fasih Haider, and Sofia de la Fuente Garcia of the University of Edinburgh and Davida Fromm and Brian MacWhinney of Carnegie Mellon University.

The objective of the ADReSS challenge is to make available a benchmark dataset of spontaneous speech, which is acoustically pre-processed and balanced in terms of age and gender, defining a shared task through which different approaches to AD recognition in spontaneous speech can be compared. Our JAD review describes the state of research at the beginning of the challenge.

The Challenge has these features:

It targets a difficult automatic prediction problem of societal and medical relevance, namely, the detection of cognitive impairment and Alzheimer's Dementia (AD). To the best of our knowledge, this will be the first such shared-task event focused on AD.
While a number of researchers have proposed speech processing and natural language processing approaches to AD recognition through speech, their studies have used different, often unbalanced and acoustically varied datasets, which hinders reproducibility and comparability of approaches. The ADReSS Challenge will provide a forum for those different research groups to test their existing methods (or develop novel approaches) on a new shared standardized dataset.
The ADReSS Challenge dataset has been carefully selected so as to mitigate common biases often overlooked in evaluations of AD detection methods, including repeated occurrences of speech from the same participant (common in longitudinal datasets), variations in audio quality, and imbalances of gender and age distribution.
Unlike some tests performed in clinical settings, where short speech samples are collected under controlled conditions, this task focuses on AD recognition using spontaneous speech.

The Challenge

This PowerPoint describes the challenge. The challenge consists of two tasks:

an AD classification task, where you are required to produce a model to predict the label (AD or non-AD) for a speech session. Your model can use speech data, language data (transcripts are provided), or both.
an MMSE score regression task, where you will create a model to infer the subject's Mini-Mental State Examination (MMSE) score based on speech and/or language data.

After joining as a DementiaBank member, you can gain access to the training and test data from here.

The training data consists of three folders of data (full enhanced audio, normalised sub-chunks, transcriptions) as well as two text files with information on age, gender and MMSE scores for participants with and without a diagnosis of AD (cc_meta_data.txt, cd_meta_data.txt). A README file is also included for further details.

The baseline results are in this paper.

Performance on AD classification is evaluated through F scores. Performance on MMSE prediction is evaluated through root mean squared error (RMSE).
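The two evaluation measures can be sketched in a few lines of Python. This is a minimal illustration, not the official scoring script: it assumes that "F scores" means the F1 score (the harmonic mean of precision and recall) computed for a chosen class, and the label strings "AD" and "non-AD" and the function names are hypothetical choices made for the example.

```python
import math


def f1_score(y_true, y_pred, positive="AD"):
    """F1 for one class: harmonic mean of precision and recall.

    Assumption: the challenge's "F scores" are per-class F1 values.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted MMSE scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy data, invented purely for illustration.
labels = ["AD", "AD", "non-AD", "non-AD"]
predictions = ["AD", "non-AD", "non-AD", "AD"]
print(f1_score(labels, predictions, positive="AD"))       # 0.5
print(f1_score(labels, predictions, positive="non-AD"))   # 0.5
print(rmse([30, 20], [28, 24]))                           # sqrt(10), about 3.162
```

Computing F1 separately for each class, as in the usage lines, shows how a model performs on AD and on non-AD sessions rather than hiding one behind the other. A lower RMSE means the predicted MMSE scores are closer to the true scores.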

This sheet lists the participants in the challenge.

This sheet summarizes the results.

This file gives the labels.

The complete set of conference papers is here.

Age Interval	AD Male	AD Female	non-AD Male	non-AD Female
[50, 55)	2	0	2	0
[55, 60)	7	6	7	6
[60, 65)	4	9	4	9
[65, 70)	9	14	9	14
[70, 75)	9	11	9	11
[75, 80)	4	3	4	3
Total	35	43	35	43
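The age and gender balance shown in the table above can be checked programmatically. The sketch below copies the counts from the table into two dictionaries and confirms that every age interval has the same number of male and female participants in both groups. The variable and function names are hypothetical, chosen only for this example.

```python
# Counts copied from the table: age interval -> (male, female).
AD_COUNTS = {
    "[50, 55)": (2, 0),
    "[55, 60)": (7, 6),
    "[60, 65)": (4, 9),
    "[65, 70)": (9, 14),
    "[70, 75)": (9, 11),
    "[75, 80)": (4, 3),
}
# The non-AD group has identical counts in every cell of the table.
NON_AD_COUNTS = dict(AD_COUNTS)


def totals(counts):
    """Return (total male, total female) summed over all age intervals."""
    male = sum(m for m, _ in counts.values())
    female = sum(f for _, f in counts.values())
    return male, female


def is_balanced(group_a, group_b):
    """True if both groups have the same (male, female) count per interval."""
    return group_a.keys() == group_b.keys() and all(
        group_a[interval] == group_b[interval] for interval in group_a
    )


print(totals(AD_COUNTS))                      # (35, 43)
print(totals(NON_AD_COUNTS))                  # (35, 43)
print(is_balanced(AD_COUNTS, NON_AD_COUNTS))  # True
```

The per-group totals match the Total row of the table: 35 male and 43 female participants, or 78 participants in each group.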
