Publication

Towards Transparency in Dermatology Image Datasets with Skin Tone Annotations by Experts, Crowds, and an Algorithm

Matt Groh

Sept. 19, 2022

Projects

Diagnosing Diagnosis in Dermatology

Groh, Matthew, Caleb Harris, Roxana Daneshjou, Omar Badri, and Arash Koochek. "Towards transparency in dermatology image datasets with skin tone annotations by experts, crowds, and an algorithm." CSCW (2022).

Abstract

While artificial intelligence (AI) holds promise for supporting healthcare providers and improving the accuracy of medical diagnoses, a lack of transparency in the composition of datasets exposes AI models to the possibility of unintentional and avoidable mistakes. In particular, public and private image datasets of dermatological conditions rarely include information on skin color. As a start towards increasing transparency, AI researchers have appropriated the use of the Fitzpatrick skin type (FST) from a measure of patient photosensitivity to a measure for estimating skin tone in algorithmic audits of computer vision applications including facial recognition and dermatology diagnosis. In order to understand the variability of estimated FST annotations on images, we compare several FST annotation methods on a diverse set of 460 images of skin conditions from both textbooks and online dermatology atlases. These methods include expert annotation by board-certified dermatologists, algorithmic annotation via the Individual Typology Angle algorithm, which is then converted to estimated FST (ITA-FST), and two crowd-sourced, dynamic consensus protocols for annotating estimated FSTs. We find the inter-rater reliability between three board-certified dermatologists is comparable to the inter-rater reliability between the board-certified dermatologists and either of the crowdsourcing methods. In contrast, we find that the ITA-FST method produces annotations that are significantly less correlated with the experts’ annotations than the experts’ annotations are correlated with each other. These results demonstrate that algorithms based on ITA-FST are not reliable for annotating large-scale image datasets, but human-centered, crowd-based protocols can reliably add skin type transparency to dermatology datasets. 
Furthermore, we introduce the concept of dynamic consensus protocols with tunable parameters including expert review that increase the visibility of crowdwork and provide guidance for future crowdsourced annotations of large image datasets.
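The ITA-FST method named in the abstract can be illustrated with a minimal sketch. The Individual Typology Angle is conventionally computed from the CIELAB lightness (L*) and yellow-blue (b*) values of a skin region as ITA = arctan((L* - 50) / b*) × 180/π. The cut-offs below are the widely cited six-category ITA ranges mapped one-to-one onto FST I through VI. That mapping is an assumption made for illustration and is not necessarily the conversion used in the paper. The function and constant names are hypothetical.

```python
import math

# Illustrative ITA lower bounds in degrees, paired with an estimated FST.
# ASSUMPTION: these are the commonly cited six-category ITA ranges mapped
# one-to-one onto FST I-VI; the paper may use different thresholds.
ITA_LOWER_BOUNDS = [(55.0, 1), (41.0, 2), (28.0, 3), (10.0, 4), (-30.0, 5)]


def individual_typology_angle(l_star: float, b_star: float) -> float:
    """Return the ITA in degrees from mean CIELAB L* and b* of a skin region.

    atan2 is used in place of arctan((L* - 50) / b*) so that b* == 0 does not
    raise a division error; for b* > 0 the two forms agree.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def ita_to_estimated_fst(ita_degrees: float) -> int:
    """Map an ITA value to an estimated FST (1-6) using the cut-offs above."""
    for lower_bound, fst in ITA_LOWER_BOUNDS:
        if ita_degrees > lower_bound:
            return fst
    return 6  # anything at or below -30 degrees


if __name__ == "__main__":
    # Hypothetical L*, b* values for a single image's skin region.
    ita = individual_typology_angle(l_star=70.0, b_star=20.0)
    print(round(ita, 1), ita_to_estimated_fst(ita))
```

Because the estimate depends only on the L* and b* values of the pixels in the image, it reflects whatever lighting and color calibration the photograph has, which is one plausible reason for such a method to diverge from expert annotation.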

via CSCW 2022

What happens when we give doctors an AI assistant?

Media Lab alum Matt Groh discusses a study finding that AI can help doctors diagnose dermatological conditions across different skin tones.

Post Research

Irmandy Wicaksono + Matt Groh receive Harold Horowitz (1951) Student Research Fund awards

Award recipients are selected based on their academic achievements and dedication to their graduate research.

Publication Research

Evaluating Deep Neural Networks Trained on Clinical Images in Dermatology with the Fitzpatrick 17k Dataset

Groh, Matthew, Caleb Harris, Luis Soenksen, Felix Lau, Rachel Han, Aerin Kim, Arash Koochek, and Omar Badri. "Evaluating Deep Neural Networks Trained on Clinical Images in Dermatology with the Fitzpatrick 17k Dataset." arXiv preprint arXiv:2104.09957 (2021). Accepted at CVPR ISIC 2021 workshop. Publication pending.