The Hol-Deep-Sense project (funded by the EU's Horizon 2020 Marie Skłodowska-Curie grant) aims at holistic machine perception of human phenomena such as personal attributes (e.g., age, gender), emotion, health, and cognitive and physical states. The machine learning methods developed in this project help personalize AI technologies and enable natural human-machine communication. In particular, the project addresses a shortcoming of today's recognition systems, which treat affective states as isolated patterns. Using novel multi-task and transfer learning techniques, we aim to shed light on the interrelations among the facets of human phenomena. The overarching goal of the Hol-Deep-Sense project is to create an end-to-end, multi-input, multi-output learning framework that, from multi-modal sensory inputs (e.g., audio, visual, and physiological signals), learns an acoustic model to jointly recognize multiple output targets.
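The multi-task, multi-output idea behind such a framework can be illustrated with a toy sketch: a shared encoder feeds several task-specific output heads, and the errors from all tasks jointly update the shared weights, so related targets can inform each other. Everything below is hypothetical and for illustration only — the task names, the synthetic data, and the simple linear model are assumptions, not the project's actual architecture.

```python
import numpy as np

# Toy multi-task learning sketch (hypothetical, not the project's model):
# two illustrative tasks, "emotion" and "age", share one linear encoder.
rng = np.random.default_rng(0)

X = rng.normal(size=(200, 16))        # synthetic input features
y_emotion = X[:, :4].sum(axis=1)      # targets built from overlapping
y_age = X[:, 2:6].sum(axis=1)         # latent factors (interrelated tasks)

W_shared = rng.normal(scale=0.3, size=(16, 8))  # shared encoder
w_emotion = rng.normal(scale=0.3, size=8)       # task-specific heads
w_age = rng.normal(scale=0.3, size=8)

lr, n = 0.02, len(X)
for _ in range(1500):
    H = X @ W_shared                   # shared representation
    err_e = H @ w_emotion - y_emotion  # per-task residuals
    err_a = H @ w_age - y_age
    # Joint loss = sum of per-task MSEs; gradients from BOTH heads
    # flow back into the shared encoder.
    grad_H = np.outer(err_e, w_emotion) + np.outer(err_a, w_age)
    W_shared -= lr * (X.T @ grad_H) / n
    w_emotion -= lr * (H.T @ err_e) / n
    w_age -= lr * (H.T @ err_a) / n

mse_emotion = float(np.mean((X @ W_shared @ w_emotion - y_emotion) ** 2))
mse_age = float(np.mean((X @ W_shared @ w_age - y_age) ** 2))
```

Because the two toy targets depend on overlapping input factors, the shared encoder learns features useful to both heads at once — a minimal analogue of jointly recognizing multiple interrelated human-state targets.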