Abstract
Generative AI has rapidly become a daily presence in human life — not just as a tool, but as a responsive, adaptive, and increasingly relational system. AI companions, tutors, and creative partners are shaping habits, trust, and reliance at scale. While children are the most visible and vulnerable users, the deeper shift affects all of us. Yet despite growing concern around manipulation, dependency, and erosion of agency, today’s accountability mechanisms remain largely static and indirect.
Model cards and “AI nutrition labels” describe how systems are built — but not how they actually shape human behavior in real-world use.
This talk introduces HumaneBench, a new paradigm for humane AI accountability that moves beyond disclosure to continuous, outcome-based evaluation. Instead of listing ingredients, HumaneBench dynamically measures AI behavior across research-grounded principles such as respecting attention, protecting dignity, fostering healthy relationships, and enhancing human capabilities — providing living evidence of human impact inside real products.
Through a production case study at Storytell.ai, we will share how continuous humane evaluation surfaced surprising patterns of unintended dependency and directly informed product redesign toward learning, agency, and augmentation — demonstrating how ethical intentions can be operationalized at scale.
More broadly, we will outline a vision for humane benchmarking as the accountability engine behind meaningful AI transparency — reshaping incentives from engagement to human flourishing.
At a moment when AI systems are forming unprecedented relationships with users, the central challenge is no longer defining humane principles — it is operationalizing them. HumaneBench offers a practical path from aspiration to measurable reality, transforming human flourishing from a value we endorse into a performance standard we can continuously improve.
Speaker Bio
Erika Anderson
Erika Anderson is the founder of Building Humane Technology and co-founder and Chief Customer Officer of Storytell.ai, an AI-powered data platform. Born on a commune founded by Stephen Gaskin, she brings a lifelong perspective on how intentional design shapes collective wellbeing and human flourishing. She co-developed HumaneBench, which measures AI systems' impact on human wellbeing, particularly how technology affects attention, autonomy, and mental health in everyday interactions. She has since developed HumaneBench into a continuous humane evaluation, shifting humane AI assessment from static disclosure to ongoing, outcome-based accountability that measures how systems actually shape human behavior and wellbeing over time. Humane evals like HumaneBench enable teams to treat human flourishing as a quantifiable product outcome rather than an aspirational afterthought.
Yaoli Mao
Yaoli Mao, PhD, is a cognitive scientist turned human-centered AI research leader who explores how people learn, decide, and grow in partnership with intelligent systems, with a deep curiosity about the shared cognitive architecture between humans and machines and how we can design systems that help both sides learn to learn. With over a decade of experience across academia and industry, she brings cognitive and behavioral science into real-world AI design to ensure technology expands human creativity, judgment, and growth rather than quietly eroding agency. She co-developed the Hybrid Intelligence Technology Acceptance Model (HI-TAM), a research-backed framework for building trustworthy, sustainable human–AI partnerships, and leads large-scale work on agentic AI experiences and humane evaluation that turns ethical intentions into measurable design outcomes.