Interconnects

Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions

Jan 10, 2024
This episode covers recent developments in multimodal language models, including the Unified IO 2 model, collecting preference data for images, the LLaVA-RLHF experiments, and open challenges in multimodal RLHF. It explores the architecture of multimodal models and the difficulties they raise, the potential of GPT-4V in multimodal RLHF, and how RLHF techniques carry over to multimodal models. It also discusses the need for clearer terminology and the adoption of synthetic data in this context.