AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Multimodal Summarization and How to Improve Its Faithfulness
Would you say that the introduction of multi-modality makes factuality an easier problem to solve than just factuality for large language models? Or is it harder? Or both, I guess is probably the answer. So we had this paper on multimodal summarization and how to improve its faithfulness. Some information that's present in both modalities gives you a better way to avoid faithfulness because it should not hallucinate based on either modality. But given the fact that some modalities are just harder in terms of being accurate, the features are not that high quality in certain modalities than other modalities, that definitely creates a little bit of an imbalance.