
Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions
Interconnects
Collecting Preference Data for Images and the Potential of GPT-4V in Multimodal RLHF
A discussion of how preference data for images is collected and the challenges involved, including setups where users select a specific image as the preference label. The chapter also explores tldraw's dataset built with OpenAI's API and the potential of adding written text to improve GPT-4V.