Background: Machine learning (ML) and artificial intelligence (AI) based classifiers can be used to diagnose diseases from medical imaging data. However, few of the classifiers proposed in the literature translate to clinical use because of robustness concerns.
Materials and Methods: This study investigates how to improve the robustness of AI/ML imaging classifiers by simultaneously applying perturbations of common effects (Gaussian noise, contrast, blur, rotation, and tilt) to different amounts of training and testing images. A comparison with classifiers trained with adversarial noise is also presented. This procedure is illustrated using two publicly available datasets, the PneumoniaMNIST dataset and the Breast Ultrasound Images (BUSI) dataset.
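As a rough illustration of the perturbation step described above, the following Python sketch applies several perturbations at once to a randomly chosen fraction of an image set. It is not the authors' implementation: the function names, the parameter values (`sigma`, `contrast`, `k`), and the use of a box filter for blur are all assumptions made for illustration. Rotation and tilt are omitted to keep the sketch dependent on NumPy only.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise; img is a float array scaled to [0, 1]."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def adjust_contrast(img, factor):
    """Scale pixel deviations from the image mean by `factor`."""
    mean = img.mean()
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)

def box_blur(img, k=3):
    """Blur with a k x k mean filter (k odd); a stand-in for any blur kernel."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def perturb_fraction(images, fraction, rng, sigma=0.1, contrast=0.7, k=3):
    """Apply all three perturbations to a random `fraction` of the images.

    Returns the new image array and the sorted indices that were perturbed.
    All default parameter values are hypothetical, not taken from the study.
    """
    images = images.astype(np.float64)  # astype returns a copy
    n_perturb = int(round(fraction * len(images)))
    idx = rng.choice(len(images), size=n_perturb, replace=False)
    for i in idx:
        img = adjust_contrast(images[i], contrast)
        img = box_blur(img, k)
        images[i] = add_gaussian_noise(img, sigma, rng)
    return images, np.sort(idx)
```

Calling `perturb_fraction(images, 0.25, np.random.default_rng(0))` on an array of shape `(N, H, W)` with values in [0, 1] would perturb a quarter of the images and leave the rest unchanged. The same function could be applied separately to training and testing sets with different fractions.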
Results: Classifiers trained with small amounts of perturbed training images performed similarly on unperturbed testing images to the classifier trained with no perturbations. Additionally, classifiers trained with perturbed data performed significantly better than the classifier trained with unperturbed data on testing data perturbed by either a single perturbation (p-values: noise = 0.0186; contrast = 0.0420; rotation, tilt, and blur = 0.000977) or multiple perturbations (p-values: PneumoniaMNIST = 0.000977; BUSI = 0.00684).
Conclusions: Classifiers trained with perturbed data were more robust to perturbed testing data than the classifier trained with unperturbed data, without a performance decrease on unperturbed testing images. This indicates that including some perturbed images in the training data brings benefits and no significant downsides.
Reference
BioMedInformatics 4, No. 2, pp. 889-910 (2024)