In this discussion, Aljosa Osep, a postdoctoral researcher specializing in robot vision, shares his insights on 3D scene understanding. He describes his work on Text2Pos, which aligns textual descriptions with localization cues to help robots navigate. Osep also explores approaches to forecasting from LiDAR data that rethink object tracking in dynamic environments. His research aims to extend robotic vision beyond autonomous vehicles, toward smarter, more adaptable robots across a range of applications.