The reliability of a deep learning model in clinical out-of-distribution MRI data: A multicohort study

Gustav Mårtensson; Daniel Ferreira; Tobias Granberg; Lena Cavallin; Ketil Oppedal; Alessandro Padovani; Irena Rektorova; Laura Bonanni; Matteo Pardini; Milica G Kramberger; John-Paul Taylor; Jakub Hort; Jón Snædal; Jaime Kulisevsky; Frederic Blanc; Angelo Antonini; Patrizia Mecocci; Bruno Vellas; Magda Tsolaki; Iwona Kłoszewska; Hilkka Soininen; Simon Lovestone; Andrew Simmons; Dag Aarsland; Eric Westman

doi:10.1016/j.media.2020.101714

Journal article, Medical Image Analysis, Year: 2020

Gustav Mårtensson

Role: Author
PersonId : 1364925
ORCID : 0000-0002-6100-991X

Karolinska Institutet [Stockholm]

Daniel Ferreira

Role: Author

Karolinska Institutet [Stockholm]

Tobias Granberg

Role: Author

Karolinska Institutet [Stockholm]

Karolinska University Hospital [Stockholm]

Lena Cavallin

Role: Author

Karolinska Institutet [Stockholm]

Karolinska University Hospital [Stockholm]

Ketil Oppedal

Role: Author

Stavanger University Hospital

University of Stavanger

Alessandro Padovani

Role: Author

Università degli Studi di Brescia = University of Brescia

Irena Rektorova

Role: Author

St. Anne’s University Hospital [Brno]

Central European Institute of Technology [Brno]

Laura Bonanni

Role: Author

Università degli studi "G. d'Annunzio" Chieti-Pescara [Chieti-Pescara]

Matteo Pardini

Role: Author

Università degli studi di Genova = University of Genoa

Milica G Kramberger

Role: Author

University of Ljubljana

John-Paul Taylor

Role: Author

Newcastle University [Newcastle]

Jakub Hort

Role: Author

University Hospital Motol [Prague]

Jón Snædal

Role: Author

Landspitali National University Hospital of Iceland

Jaime Kulisevsky

Role: Author

Universitat Autònoma de Barcelona = Autonomous University of Barcelona = Universidad Autónoma de Barcelona

Frederic Blanc

Role: Author

Laboratoire des sciences de l'ingénieur, de l'informatique et de l'imagerie

Angelo Antonini

Role: Author

Università degli Studi di Padova = University of Padua

Patrizia Mecocci

Role: Author

Università degli Studi di Perugia = University of Perugia

Bruno Vellas

Role: Author
PersonId : 933362
ORCID : 0000-0002-7678-5065
IdRef : 03566729X

Epidémiologie et analyses en santé publique : risques, maladies chroniques et handicaps

Gérontopôle

Magda Tsolaki

Role: Author

Aristotle University of Thessaloniki

Iwona Kłoszewska

Role: Author

Medical University of Łódź

Hilkka Soininen

Role: Author

University of Eastern Finland

University of Kuopio

Simon Lovestone

Role: Author

University of Oxford

Andrew Simmons

Role: Author

King's College London

Dag Aarsland

Role: Author

King's College London

Eric Westman

Role: Author

King's College London

Karolinska Institutet [Stockholm]

Abstract

Deep learning (DL) methods have in recent years yielded impressive results in medical imaging, with the potential to function as a clinical aid to radiologists. However, DL models in medical imaging are often trained on public research cohorts with images acquired with a single scanner or with strict protocol harmonization, which is not representative of a clinical setting. The aim of this study was to investigate how well a DL model performs in unseen clinical datasets (collected with different scanners, protocols and disease populations) and whether more heterogeneous training data improves generalization. In total, 3117 MRI scans of brains from multiple dementia research cohorts and memory clinics, which had been visually rated by a neuroradiologist according to Scheltens' scale of medial temporal atrophy (MTA), were included in this study. By training multiple versions of a convolutional neural network on different subsets of this data to predict MTA ratings, we assessed the impact that including images from a wider distribution during training had on performance in external memory clinic data. Our results showed that our model generalized well to datasets acquired with protocols similar to those of the training data, but performed substantially worse in clinical cohorts with visibly different tissue contrasts in the images. This implies that future DL studies investigating performance in out-of-distribution (OOD) MRI data need to assess multiple external cohorts for reliable results. Further, including data from a wider range of scanners and protocols improved performance in OOD data, which suggests that more heterogeneous training data makes the model generalize better. To conclude, this is the most comprehensive study to date investigating the domain shift in deep learning on MRI data, and we advocate rigorous evaluation of DL models on clinical data before they are certified for deployment.
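The evaluation design described in the abstract, scoring one model separately on each external cohort rather than on a pooled test set, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the abstract does not name the agreement metric, so linearly weighted Cohen's kappa is assumed here as a common choice for ordinal ratings; the five-level 0 to 4 range for the MTA scale, the cohort labels and all ratings are hypothetical.

```python
from collections import defaultdict


def weighted_kappa(rater, model, n_classes=5):
    """Linearly weighted Cohen's kappa for two equal-length lists of
    integer ratings in [0, n_classes). Assumed metric, not from the source."""
    n = len(rater)
    if n == 0 or n != len(model):
        raise ValueError("rating lists must be non-empty and of equal length")

    # Joint distribution of (rater, model) ratings.
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for r, m in zip(rater, model):
        observed[r][m] += 1.0 / n

    # Marginal distributions of each rater.
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    # Disagreement weighted by distance on the ordinal scale.
    obs_dis = 0.0
    exp_dis = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = abs(i - j) / (n_classes - 1)
            obs_dis += w * observed[i][j]
            exp_dis += w * row[i] * col[j]

    if exp_dis == 0.0:
        return float("nan")  # undefined when both raters use one identical category
    return 1.0 - obs_dis / exp_dis


def per_cohort_kappa(records, n_classes=5):
    """records: iterable of (cohort, radiologist_rating, model_rating)."""
    grouped = defaultdict(lambda: ([], []))
    for cohort, rating, prediction in records:
        grouped[cohort][0].append(rating)
        grouped[cohort][1].append(prediction)
    return {c: weighted_kappa(r, p, n_classes) for c, (r, p) in grouped.items()}


if __name__ == "__main__":
    # Synthetic ratings for two hypothetical external cohorts.
    records = [
        ("clinic_A", 0, 0), ("clinic_A", 1, 1), ("clinic_A", 2, 2),
        ("clinic_A", 3, 3), ("clinic_A", 4, 3),
        ("clinic_B", 0, 2), ("clinic_B", 1, 1), ("clinic_B", 2, 0),
        ("clinic_B", 3, 1), ("clinic_B", 4, 3),
    ]
    for cohort, kappa in sorted(per_cohort_kappa(records).items()):
        print(f"{cohort}: weighted kappa = {kappa:.2f}")
```

Reporting one score per cohort keeps a poorly handled scanner or protocol from being hidden by averaging, which matches the abstract's point that several external cohorts are needed for reliable conclusions.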

Keywords

Clinical application; Deep learning; Domain shift; Neuroimaging

Domains

Public health and epidemiology

Main file

Martensson_2020.pdf (899.52 KB)

Origin: Publisher files allowed on an open archive
License: CC BY - Attribution

Contributor: Alicia Benson-Rumiz

https://hal.science/hal-04510101

Submitted on: Monday, 18 March 2024, 16:59:53

Last modified on: Thursday, 11 April 2024, 13:08:14

Dates and versions

hal-04510101 , version 1 (18-03-2024)

License

Attribution

Identifiers

HAL Id : hal-04510101 , version 1
DOI : 10.1016/j.media.2020.101714
PUBMED : 33007638

Cite

Gustav Mårtensson, Daniel Ferreira, Tobias Granberg, Lena Cavallin, Ketil Oppedal, et al. The reliability of a deep learning model in clinical out-of-distribution MRI data: A multicohort study. Medical Image Analysis, 2020, 66, pp. 101714. ⟨10.1016/j.media.2020.101714⟩. ⟨hal-04510101⟩

Collections

CNRS ENGEES INSA-STRASBOURG INC-CNRS SITE-ALSACE INSA-GROUPE UNIV-UT3 UT3-TOULOUSEINP CERPOP
