Zachary Lipton: Where Machine Learning Falls Short

The Gradient: Perspectives on AI

Pre-Training for Summarization

There's a question of: is it that this is a useful routine, or that transformers are just somehow unstable or ill-conditioned? Right. And actually, you know, we have a new paper that I'm kind of excited about, that I can tell you should be on arXiv tonight. So for summarization, people tend to use these setups where the pre-training objective represents everything as a sequence, like T5-type setups. For classification, people might use a BERT model that's been trained on the masked language modeling objective. And they're doing this with larger and larger transformer models and larger and larger corpora, and they keep seeing these gains. But there's a question, which is...
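
A minimal sketch of the two setups described above, assuming the Hugging Face transformers library (the episode doesn't name a toolkit, and the model checkpoints and parameters below are illustrative assumptions): a T5-style text-to-text model for summarization, and a BERT model pre-trained with masked language modeling, fine-tuned for classification.

```python
# Hypothetical illustration of the two pre-training setups discussed above;
# checkpoint names and settings are assumptions, not from the episode.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

# Summarization: T5 casts every task as text-to-text, so summarizing is
# conditional sequence generation with a task prefix.
t5_tok = AutoTokenizer.from_pretrained("t5-small")
t5 = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
article = "Long article text goes here..."
enc = t5_tok("summarize: " + article, return_tensors="pt", truncation=True)
summary_ids = t5.generate(**enc, max_new_tokens=60)
print(t5_tok.decode(summary_ids[0], skip_special_tokens=True))

# Classification: BERT is pre-trained with masked language modeling, then
# fine-tuned with a freshly initialized classification head on top.
bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
logits = bert(**bert_tok("Some input text", return_tensors="pt")).logits
print(logits.argmax(dim=-1))  # predicted class index (head not yet fine-tuned)
```

The contrast is mostly in the interface: the T5-style model emits output tokens for every task, while the BERT-style model exposes encoder features that a task-specific head consumes, which is why the same "scale the model and the corpus" recipe gets applied to both.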
