How to Use Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs
Do you see model distillation as a way to get around the size issue, or do you see something more fundamental there? The language text inputs are sort of key to being able to reason about instructions and have all this knowledge encoded in it. So when you distill things, you inherently lose some of that reasoning capability and that knowledge.

To some extent, probably, but also probably not to the extent that it's needed for edge hardware.