Join Our Paper Club with Shayne Longpre / MIT on "Consent in Crisis: The Rapid Decline of the AI Data Commons" [AI Tinkerers]

~ Please RSVP ~

Tuesday, October 22nd

12PM to 1:30PM (EDT)

Virtual Meeting

Attendees

286 / 300

⭐ Limited capacity. Attendance is screened by organizers. ⭐

By registering, you accept our Terms & Privacy

Join Our Paper Club Event Series! Meet with Shayne Longpre, AI Research Scientist at MIT.

Don’t miss this unique opportunity: Hear directly from the researcher & join a live Q&A!

☝️ Register Above for this Live Virtual Meeting with the MIT Researcher! ☝️

Info	Details
Event	Paper Club with MIT on “Consent in Crisis: The Rapid Decline of the AI Data Commons”
Date & Time	October 22, 2024, 12:00 PM EDT
Presenter	Shayne Longpre, AI Research Scientist, MIT
Research Paper	📄 Consent in Crisis: The Rapid Decline of the AI Data Commons

About the Presenter:

Meet Shayne Longpre, a PhD Candidate at MIT whose research focuses on the intersection of AI and policy. His work addresses the responsible training, evaluation, and governance of general-purpose AI systems.

Paper link: 📄 Consent in Crisis: The Rapid Decline of the AI Data Commons

Paper abstract:

Meet Shayne Longpre, the lead researcher of the Data Provenance Initiative report, “Consent in Crisis: The Rapid Decline of the AI Data Commons.”

Generative AI models depend on vast public datasets sourced from websites, but the rise in web restrictions is rapidly reducing available data. In the first large-scale audit of AI training corpora, Longpre and his team analyze 14,000 web domains, revealing a growing use of robots.txt and Terms of Service to block AI data collection. The report highlights:

A significant rise in data restrictions, with up to 45% of key AI training datasets such as C4 now subject to restrictions on crawling or use.
The impact of these restrictions on AI training data, which is shifting away from fresh, high-quality sources like news and academic sites.
The challenges these data limitations pose for AI developers, and the implications for future AI systems.

This work uncovers a critical crisis in data consent protocols, with long-term consequences for AI development, research, and data access.
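
For attendees less familiar with how robots.txt expresses these restrictions, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URL, the `SomeOtherBot` agent, and the helper name `allowed_agents` are illustrative assumptions, not material from the paper; `GPTBot` is used only as an example of an AI crawler user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption for illustration): it blocks one
# AI crawler user agent and leaves the site open to every other agent.
EXAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Report, per user agent, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    EXAMPLE_ROBOTS_TXT,
    ["GPTBot", "SomeOtherBot"],
    "https://example.com/articles/1",
)
print(result)  # {'GPTBot': False, 'SomeOtherBot': True}
```

This only illustrates the robots.txt side of the picture; restrictions written into a site's Terms of Service are not machine-readable in this way.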

What is Paper Club?

Paper Club is a virtual event series brought to you by the Human Feedback Foundation in collaboration with AI Tinkerers, featuring authors of cutting-edge AI and machine learning papers. These online meetups allow attendees to hear about groundbreaking research directly from the authors, participate in live Q&A sessions, and engage in discussions. Open to all, Paper Club offers a regular opportunity to learn and interact with leaders in the rapidly evolving field of artificial intelligence.