Using Vision to Improve Sound Separation
Jonathan: I'm very excited about trying to, I would say, our ultimate goal in my team is what we call total transcription. In particular, trying to model how objects interact with each other to create sound as seen. These kinds of visual cues can help separate better. And so [inaudible] vision with our partners in the computer vision group at MERL and trying to tie this with sound. If [inaudible] sound, you could try to detect them, localize them in free space as well.