Automatic emotion recognition has become a well-established machine learning task in recent years. Because emotions are sensitive and subjective, incorrect or misinterpreted predictions can give rise to societal harms. In this work, we argue that emotion recognition models have an obligation to quantify their uncertainty (or, equivalently, to provide confidence bounds). We demonstrate how classical network architectures can be altered to yield measures of epistemic and aleatoric uncertainty using established probabilistic inference techniques. We also explore what these uncertainties reveal about the data and predictions, including how they can expose a lack of diversity in the training data, and we show how difficult and subjective training samples can be identified using the learned uncertainty measures.
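To make the separation of the two uncertainty types concrete, the following is a minimal sketch, not the paper's implementation, of one established probabilistic technique: Monte Carlo dropout for epistemic uncertainty combined with a predicted-variance output head for aleatoric uncertainty. The toy network, its weights, and all names here are hypothetical placeholders chosen only to illustrate the decomposition.

```python
# Hypothetical sketch: MC dropout (epistemic) + variance head (aleatoric)
# on a toy one-input regressor. Weights are arbitrary illustrative values.
import math
import random
import statistics

random.seed(0)

# Arbitrary toy weights (placeholders, not learned parameters).
HIDDEN_W = [0.5, -0.3, 0.8]
MEAN_W = [0.4, 0.2, -0.1]
VAR_W = [0.1, -0.2, 0.05]


def dropout(values, p=0.5):
    """Zero each unit with probability p; dropout stays ON at test time."""
    return [0.0 if random.random() < p else v / (1 - p) for v in values]


def forward(x, hidden_w, mean_w, var_w):
    """One stochastic pass: dropout hidden layer, mean and log-variance heads."""
    hidden = dropout([w * x for w in hidden_w])
    mean = sum(w * h for w, h in zip(mean_w, hidden))
    log_var = sum(w * h for w, h in zip(var_w, hidden))
    return mean, math.exp(log_var)  # exp keeps the predicted variance positive


def mc_predict(x, n_passes=200):
    """Run many stochastic passes and decompose the uncertainty."""
    means, variances = [], []
    for _ in range(n_passes):
        m, v = forward(x, HIDDEN_W, MEAN_W, VAR_W)
        means.append(m)
        variances.append(v)
    # Epistemic: spread of the mean prediction across dropout masks.
    epistemic = statistics.pvariance(means)
    # Aleatoric: data noise the model itself predicts, averaged over passes.
    aleatoric = statistics.fmean(variances)
    return statistics.fmean(means), epistemic, aleatoric


prediction, epistemic, aleatoric = mc_predict(1.0)
```

Samples with high aleatoric uncertainty flag inherently ambiguous (e.g. subjective) inputs, while high epistemic uncertainty flags inputs the model has seen too little data to handle, which is the signal used to reveal gaps in training-data diversity.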