Is There a Language Model in Isolation?

I've just realized I have a few questions. So, yeah, it seems like, with the adversarial part, there's a cool synergy with this in the previous paper, right? Where, ideally, your language model would say a thing and generate some evidence, and then you could use red teaming to check whether that evidence was confabulated or misleading. But we haven't done that for this particular paper. We are doing language model interpretability work at DeepMind. It's just that this is a more complicated system than language modelling in isolation. I would want to not put all the pieces together too early.
