Analysis of Refusals and Comparison of Instruction Sets

This chapter discusses the evaluation of the rate of refusals by an AI and provides a list of phrases indicating refusals along with their presence in the text. It also compares the frequency of the word 'bomb' in two different sets of instructions and comments on various requests related to making a bomb.

Transcript

Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app