Event

Data Provenance @ Mozilla Data Futures Lab


Projects

Data Provenance for AI

Monday, January 22, 2024, 11:00am to 12:00pm ET

Recent breakthroughs in language modeling are powered by large collections of natural language datasets. This has triggered an arms race to train models on disparate collections of data that are incorrectly, ambiguously, or insufficiently documented, leaving practitioners unsure of the ethical and legal risks.

To address this, the Data Provenance Initiative has mapped more than 2,000 popular text-to-text finetuning datasets from origin to creation, cataloging their data sources, licenses, creators, and other metadata so that researchers can explore them using an accompanying tool. The purpose of this work is to improve transparency, documentation, and informed use of datasets in AI.

Read more at Mozilla

Access the recording and slides
