This chapter explores advanced tactics in AI red teaming, particularly the role of multi-turn interactions in improving jailbreak success rates. It reveals insights from a new dataset on human efforts to bypass AI safeguards, emphasizing the necessity of domain knowledge and highlighting the superiority of human-led strategies over automated ones.
Our 181st episode with a summary and discussion of last week's big AI news!
With hosts Andrey Kurenkov and Jeremie Harris
Read our text newsletter and comment on the podcast at https://lastweekin.ai/
If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.
Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
In this episode:
- Google's AI advancements with Gemini 1.5 models and AI-generated avatars, along with Samsung's lithography progress.
- Microsoft's Inflection usage caps for Pi, new AI inference services by Cerebras Systems competing with Nvidia.
- Biases in AI, prompt leak attacks, transparency in models, and distributed training optimizations, including the DisTrO optimizer.
- AI regulation discussions including California’s SB1047, China's AI safety stance, and new export restrictions impacting Nvidia’s AI chips.
Timestamps + Links:
- (00:00:00) Intro / Banter
- (00:03:08) Response to listener comments / corrections
- Tools & Apps
- Applications & Business
- Projects & Open Source
- Research & Advancements
- Policy & Safety
- Synthetic Media & Art
- (02:14:06) Outro