
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Generally Intelligent
Retraining Models to Steer Towards Desirable Characteristics
A hypothesis was formed to steer a trained model away from undesirable characteristics by identifying a way to forget undesirable information./nDifficult examples and their associated information were considered undesirable, while easier examples were considered desirable./nAlgorithms with iterative training were used to measure the forgetting step and its impact on the model's features./nThe forgetting step resulted in a higher proportion of difficult examples being forgotten./nThe hypothesis was supported by observing that resetting the later layers of the model improved performance./nSimilar improvements were seen in other algorithms when the specificity of the forgetting step was increased.