Rofo 2023; 195(S 01): S36
DOI: 10.1055/s-0043-1763038
Abstracts
Oral Presentation (Science)
IT/Image Processing/Software

Training of AI Models Beyond the Local Dataset Using Federated Learning with 695,000 Non-Identically-Labeled Chest Radiographs

S Tayebi Arasteh
1   Uniklinik RWTH Aachen, Diagnostische und Interventionelle Radiologie, Aachen
,
P Isfort
2   Uniklinik RWTH Aachen University, Diagnostische und Interventionelle Radiologie, Aachen
,
M Sähn
2   Uniklinik RWTH Aachen University, Diagnostische und Interventionelle Radiologie, Aachen
,
C Kuhl
2   Uniklinik RWTH Aachen University, Diagnostische und Interventionelle Radiologie, Aachen
,
D Truhn
2   Uniklinik RWTH Aachen University, Diagnostische und Interventionelle Radiologie, Aachen
,
S Nebelung
2   Uniklinik RWTH Aachen University, Diagnostische und Interventionelle Radiologie, Aachen
 
 

    Purpose Artificial intelligence (AI) models require large annotated datasets for training. While federated learning (FL) enables multiple institutions to cooperate securely in training AI models, it requires all images to be annotated identically, i.e., according to the same classification scheme and conditions. In practice, this can usually only be achieved if the partners agree on a standardized annotation scheme beforehand. Our aim was to develop and validate an extension of FL for the collaborative training of AI models on chest radiographs that had not been labeled identically.

    Materials and Methods In this retrospective study, we included more than 695,000 chest radiographs from five institutions: (i) VinDr-CXR, n=18,000, (ii) ChestX-ray14, n=112,120, (iii) CheXpert, n=157,676, (iv) MIMIC-CXR-JPG, n=215,187, and (v) UKA, n=193,361. For these radiographs, annotations of typical radiological diagnoses (e.g., cardiomegaly, pneumonia, or pleural effusion) were available in a multilabel setting with distinctly different label schemes. For each dataset, we comparatively evaluated two training approaches: training 1) only on the local data and 2) on all data using our novel flexible FL (FFL) paradigm. Statistical analysis was performed on the area under the receiver operating characteristic curve (AUC) of held-out test sets (n=3,000, n=25,596, n=29,320, n=2,844, and n=39,824, respectively), using bootstrapping as the statistical test.
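The abstract does not include implementation details, but the core idea of jointly training on non-identically-labeled data can be sketched as a FedAvg-style scheme in which each site updates, and contributes averages for, only the label heads it actually annotates. The function names, the simple per-label logistic model, and the update rule below are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Global label space; each site annotates only a subset of these.
GLOBAL_LABELS = ["cardiomegaly", "pneumonia", "pleural_effusion"]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_update(weights, X, Y, label_idx, lr=0.1, epochs=20):
    """One site's training pass (hypothetical): only the weight rows in
    `label_idx` -- the globally indexed labels this site annotates -- are
    updated; Y has one column per locally available label."""
    W = weights.copy()
    for _ in range(epochs):
        for j, g in enumerate(label_idx):
            p = sigmoid(X @ W[g])                 # predictions for global label g
            grad = X.T @ (p - Y[:, j]) / len(X)   # gradient of binary cross-entropy
            W[g] -= lr * grad
    return W

def federated_round(weights, sites):
    """FedAvg-style round with per-label masking: average each label's
    weight row only over the sites that hold annotations for it."""
    acc = np.zeros_like(weights)
    counts = np.zeros(len(weights))
    for X, Y, label_idx in sites:
        W_local = local_update(weights, X, Y, label_idx)
        for g in label_idx:
            acc[g] += W_local[g]
            counts[g] += 1
    # Labels annotated at no site keep the previous global weights.
    for g in range(len(weights)):
        if counts[g]:
            weights[g] = acc[g] / counts[g]
    return weights
```

Only model parameters cross institutional boundaries in this sketch; the images and labels stay local, which is the privacy property FL provides.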

    Results FFL outperformed local training across all five datasets in terms of the average AUC: 1) VinDr-CXR: 0.885±0.049 [FFL] vs. 0.867±0.045 [local]; p=0.001, 2) ChestX-ray14: 0.744±0.080 vs. 0.744±0.076; p=0.363, 3) CheXpert: 0.797±0.061 vs. 0.796±0.064; p=0.243, 4) MIMIC-CXR-JPG: 0.786±0.066 vs. 0.772±0.072; p=0.004, and 5) UKA: 0.918±0.031 vs. 0.916±0.031; p=0.001.
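The bootstrapping comparison behind these p-values can be illustrated with a minimal paired bootstrap over a held-out test set. The exact resampling procedure is not specified in the abstract, so the rank-based AUC and the one-sided comparison below are assumptions for illustration only.

```python
import numpy as np

def auc(y_true, scores):
    """Rank-based AUC: probability that a random positive outranks
    a random negative (Mann-Whitney U formulation, no ties)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_p(y, s_ffl, s_local, n_boot=1000, seed=0):
    """Paired bootstrap (assumed procedure): resample test cases with
    replacement and count resamples where FFL does not beat local."""
    rng = np.random.default_rng(seed)
    n = len(y)
    worse = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if y[idx].sum() in (0, n):     # degenerate resample: skip
            continue
        if auc(y[idx], s_ffl[idx]) <= auc(y[idx], s_local[idx]):
            worse += 1
    return worse / n_boot
```

Pairing matters here: both models are scored on the same resampled cases, so case-level difficulty cancels out of the comparison.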

    Conclusions By enabling joint training of AI models beyond the local dataset and with heterogeneous labels, our new FFL framework further improves model performance and allows greater flexibility in the organization and integration of input data.



    Publication History

    Article published online:
    13 April 2023

    © 2023. Thieme. All rights reserved.

    Georg Thieme Verlag
    Rüdigerstraße 14, 70469 Stuttgart, Germany