TechCrunch Industry News

Microsoft is exploring a way to credit contributors to AI training data

Mar 24, 2025
Microsoft is diving into research that estimates how specific training examples shape generative AI outputs. The initiative raises questions about ethical data use, including transparency and the possibility of compensating contributors. As copyright issues loom large, this project could redefine how we think about data dignity in AI training practices.
05:20

Podcast summary created with Snipd AI

Quick takeaways

  • Microsoft's research project aims to enhance the transparency of AI training processes by recognizing the contributions of specific datasets.
  • The podcast highlights ongoing legal challenges faced by AI companies over copyright issues, emphasizing the need for ethical data practices.

Deep dives

Microsoft's Training Time Provenance Project

Microsoft is launching a research project to assess how specific training examples influence the outputs of generative AI models. The initiative, described in a job listing for a research intern, aims to make model training more transparent by estimating the contributions of particular datasets, including images and text. Current neural network architectures are opaque about where their outputs come from, which has raised concerns, and this research could provide a framework for recognizing and compensating people who contribute valuable data, in line with the idea of data dignity championed by technologist Jaron Lanier. For instance, if an AI model generates a creative work that draws on many influences, this project could make it possible to acknowledge and pay the key contributors, connecting them to their impact on the result.
