16 - Preparing for Debate AI with Geoffrey Irving

AXRP - the AI X-risk Research Podcast

Is There a Language Model in Isolation?

I've just realized I have a few questions. So, yeah, it seems like, with the adversarial part, there's a cool synergy with this in the previous paper, right? Where, ideally, your language model would say a thing and generate some evidence, and then you could use red teaming to check whether that evidence was confabulated or misleading. But we haven't done that for this particular paper. We are doing, like, language model interpretability work at DeepMind. It's just that this is a more complicated system than just language modelling in isolation. I would want to not put all the pieces together too early.
