The social robot intelligence benchmark is a large-scale multimodal dataset comprising diverse contexts, tasks, and environments for evaluating a social robot's social intelligence. In this work, we first define a taxonomy of social intelligence in the context of social robots, grounded in findings from psychology, cognitive science, and human-robot interaction. Second, we establish a suite of machine learning tasks and evaluation metrics that measure different aspects of an AI agent's social intelligence. Finally, we evaluate and compare current state-of-the-art machine learning models on these tasks. Our unified, large-scale, rich, and accessible dataset enables further research on improving social intelligence in social embodied agents.