
Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions
Interconnects
00:00
Collecting Preference Data for Images and the Potential of GPT-4V in Multimodal RLHF
A discussion of collecting preference data for images and the challenges it presents, including the adoption of a behavior where users select a specific image as a preference label. The chapter also explores tldraw's dataset using OpenAI's API and the potential of adding written text to improve GPT-4V.