DementiaBank
ADReSS 2020 Challenge |
INTERSPEECH 2020 - Alzheimer's Dementia Recognition through
Spontaneous Speech: The ADReSS Challenge
This challenge was organized by Saturnino Luz, Fasih Haider, and Sofia
de la Fuente Garcia of the University of Edinburgh and Davida Fromm
and Brian MacWhinney of Carnegie Mellon University.
The objective of the ADReSS Challenge is to make available a benchmark
dataset of spontaneous speech, acoustically pre-processed and balanced
in terms of age and gender, and to define a shared task through which
different approaches to AD recognition in spontaneous speech can be
compared. Our JAD review describes the state of research at the
beginning of the challenge.
The Challenge has these features:
- It targets a difficult automatic
prediction problem of societal and medical relevance, namely, the
detection of cognitive impairment and
Alzheimer's Dementia (AD). To the best of our knowledge, this will
be the first such shared-task event focused on AD.
- While a number of researchers have proposed speech processing and
natural language processing approaches to AD recognition through
speech, their studies have used different, often unbalanced and
acoustically varied data sets, consequently hindering
reproducibility and comparability of approaches. The ADReSS
Challenge will provide a forum for those different research groups
to test their existing methods (or develop novel approaches) on a
new shared standardized dataset.
The ADReSS Challenge dataset has been carefully selected so as to
mitigate common biases often overlooked in evaluations of AD
detection methods, including repeated occurrences
of speech from the same participant (common in longitudinal
datasets), variations in audio quality, and imbalances of gender and
age distribution.
- Unlike some tests performed in clinical settings, where short
speech samples are collected under controlled conditions, this task
focuses on AD recognition using spontaneous speech.
The Challenge
This PowerPoint presentation describes the challenge.
It consists of two tasks:
- an AD classification task, where you are required to produce a
model to predict the label (AD or non-AD) for a speech session. Your
model can use speech data, language data (transcripts are
provided), or both.
- an MMSE score regression task, where you will create a model to
infer the subject's Mini-Mental State Examination (MMSE) score
based on speech and/or language data. A purely illustrative sketch
of both tasks appears after this list.
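The sketch below is not the challenge baseline; it only illustrates, under stated assumptions, how a transcript-only approach to both tasks might be wired up with scikit-learn once each session's transcript has been read into a string. All variable names and example texts are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.pipeline import make_pipeline

# Placeholder data for illustration only: one transcript string per
# training session, with its AD/non-AD label and MMSE score.
train_texts = [
    "the boy is taking cookies from the jar and the stool is tipping",
    "the water is overflowing in the sink while the mother dries a dish",
]
train_labels = [1, 0]      # 1 = AD, 0 = non-AD
train_mmse = [18.0, 29.0]

# Task 1: AD classification from transcript text.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(train_texts, train_labels)

# Task 2: MMSE score regression from the same textual features.
reg = make_pipeline(TfidfVectorizer(), Ridge())
reg.fit(train_texts, train_mmse)

test_texts = ["the girl is reaching up for a cookie"]
print("predicted label:", clf.predict(test_texts))
print("predicted MMSE :", reg.predict(test_texts))
```

A real submission could of course replace the TF-IDF features with acoustic features extracted from the enhanced audio, or combine both modalities.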
After joining as a DementiaBank member,
you can gain access to the training and test data
from here.
The training data consists of three folders of data (full enhanced
audio, normalised sub-chunks, transcriptions) as well as two text
files with information on age, gender and MMSE scores for participants
with and without a diagnosis of AD (cc_meta_data.txt,
cd_meta_data.txt). A README file is also included for further
details.
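For orientation, here is one way the two metadata files might be loaded and combined into a single table. The semicolon delimiter, the column handling, and the group labels attached to each file are assumptions; the README in the download describes the actual file layout.

```python
import pandas as pd

def load_metadata(path, group):
    # Assumed format: delimited text with age, gender and MMSE columns.
    # Adjust `sep` to whatever the README specifies.
    df = pd.read_csv(path, sep=";", skipinitialspace=True)
    df.columns = [c.strip().lower() for c in df.columns]
    df["group"] = group
    return df

meta = pd.concat(
    [
        load_metadata("cc_meta_data.txt", "non-AD"),  # cc = control participants
        load_metadata("cd_meta_data.txt", "AD"),      # cd = participants with AD
    ],
    ignore_index=True,
)
print(meta.head())
```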
The baseline results are described in this paper.
Performance on AD classification is evaluated using F scores. Performance on
MMSE prediction is evaluated using root mean squared error (RMSE).
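For reference, both metrics can be computed as follows; the prediction arrays here are made-up placeholders, not challenge results.

```python
import numpy as np
from sklearn.metrics import f1_score

# Made-up placeholder predictions, for illustration only.
y_true_class = np.array([1, 0, 1, 1, 0])   # 1 = AD, 0 = non-AD
y_pred_class = np.array([1, 0, 0, 1, 0])

y_true_mmse = np.array([29.0, 18.0, 25.0, 11.0, 30.0])
y_pred_mmse = np.array([27.0, 21.0, 24.0, 15.0, 28.5])

# F score for the AD classification task.
print("F1  :", f1_score(y_true_class, y_pred_class))

# Root mean squared error for the MMSE regression task.
print("RMSE:", np.sqrt(np.mean((y_true_mmse - y_pred_mmse) ** 2)))
```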
This sheet lists the participants in the challenge.
This sheet summarizes the results.
This file gives the labels.
The complete set of conference papers is here.
Age and gender distribution of the dataset (AD and non-AD groups):
Age Interval | AD Male | AD Female | non-AD Male | non-AD Female |
[50, 55) | 2 | 0 | 2 | 0 |
[55, 60) | 7 | 6 | 7 | 6 |
[60, 65) | 4 | 9 | 4 | 9 |
[65, 70) | 9 | 14 | 9 | 14 |
[70, 75) | 9 | 11 | 9 | 11 |
[75, 80) | 4 | 3 | 4 | 3 |
Total | 35 | 43 | 35 | 43 |