Shayne Longpre

Graduate Student

AI crawler wars threaten to make the web more closed for everyone

There’s an accelerating cat-and-mouse game between web publishers and AI crawlers, and we all stand to lose.

In the struggle over who can train AI models and how, there’s a casualty many people overlook: the open web.

As junk web pages written by AI proliferate, the models that rely on that data will suffer.

After identifying major flaws in popular AI models, researchers are pushing for a new system to identify and report bugs.

New research from the Data Provenance Initiative has found a dramatic drop in content made available to the collections used to build AI.

The project's purpose is to improve transparency, documentation, and informed use of datasets in AI.

Researchers created a tool that enables AI practitioners to find data that suits their models, which could improve accuracy and reduce bias.

PhD student Shayne Longpre (Human Dynamics) discusses an open letter he co-authored with collaborators.