OpenAI announced GPTBot, a web crawler that collects data from the public web for use in training future OpenAI models. The company also shared how website maintainers can block its access.
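For site owners wondering what blocking looks like in practice, here is a minimal sketch. It assumes the block is done through robots.txt with a rule aimed at the GPTBot user agent, which is how the opt-out was described at launch; check OpenAI's current documentation before relying on it. The helper name is_allowed and the example.com URLs are made up for illustration, and the check uses Python's standard urllib.robotparser rather than anything from OpenAI.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: disallow the whole site for the GPTBot user agent.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # GPTBot is blocked everywhere; crawlers not named in the file are unaffected.
    print(is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/page"))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page"))
```

Note that robots.txt is only a request: it works as long as the crawler chooses to honor it.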
NLW reviews a recent research paper that provides a comprehensive overview of the state of LLM development and organizes the remaining challenges and open problems, along with prospective solutions. Before that on the Brief: OpenAI launches GPTBot to crawl the web for AI training purposes; Zoom has to back off ToS changes that would have allowed it to collect data for AI training purposes. Plus a terrifying new cyberattack that can tell what you typed just by listening to your keystrokes.
Read the paper: https://arxiv.org/abs/2307.10169
Today's Sponsor:
Giskard - the testing framework for ML models - https://www.giskard.ai/
ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
Join the community: bit.ly/aibreakdown
Learn more: http://breakdown.network/