AI Agents Can Code 10,000 Lines of Hacking Tools In Seconds - Dr. Ilia Shumailov (ex-GDM)

Machine Learning Street Talk (MLST)

Thinking Traces Aren't Security Proof

  • Interpreting a model's internal traces offers limited security value because such projections collapse multidimensional behavior.
  • Interpretability helps safety but is insufficient for high-assurance security guarantees.