Matt Groh Dissertation Defense

Dissertation Title: The Science and Art of Human-AI Collaboration


While artificial intelligence (AI) appears to be surpassing the performance of human experts on a wide variety of games and real-world tasks, these algorithms are prone to systematic and surprising failures when deployed. In contrast to today's state-of-the-art algorithms, humans are highly capable of adapting to new contexts. The different strengths and weaknesses of humans and AI motivate a guiding research question for the emerging field of human-AI collaboration: When, where, why, and how does the combination of human problem solving and AI systems lead to a hybrid system that surpasses (or fails to surpass) the performance of either humans or the machine alone? This dissertation addresses various dimensions of this guiding question by conducting large-scale, digital experiments across three distinct tasks and domains: deepfake detection, dermatology diagnosis, and Wordle. First, the experiments in deepfake detection examine the similarities and differences between human and machine vision in identifying visual manipulations of people's faces in videos and identify important performance trade-offs between hybrid systems and human or AI only systems for deepfake detection. Second, the experiments in dermatology diagnosis reveal that non-visual information is often essential for diagnosing skin disease, diagnostic accuracy disparities across skin color exist in image-only store-and-forward teledermatology, and clinical decision support based on a fair deep learning system can significantly increase physicians' diagnostic accuracy and reduce diagnostic accuracy disparities in this experimental setting. Third, the experiment on Wordle demonstrates that digitally mediated empathy can counteract the negative effect of anger on human creative problem solving. In addition to these digital experiments, this dissertation presents two algorithmic audits on clinical dermatology images to reveal where systematic errors arise in state-of-the-art algorithms, examines how context influences automated affect recognition, and proposes methods for more effectively incorporating context in applied machine learning. Together, these contributions provide empirical evidence for why human-AI collaborations succeed and fail across a variety of tasks and domains, insights into how to design human-AI collaborations more effectively, and a framework for when and where hybrid systems should rely on human problem solving.

Committee members:
Rosalind Picard
Professor of Media Arts and Sciences

David Rand
Erwin H. Schell Professor of Management Science and Brain and Cognitive Sciences

Chaz Firestone
Assistant Professor of Psychological and Brain Sciences

More Events