AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Is Parallelization a Good Idea?
This just helps like it done faster, cheaper than you know with recurrent models. I think generally speaking the fact that it's parallel hides a bunch of other stuff which was sort of discovered and like figured out later about this particular architecture. So besides making it more computationally efficient and making it efficient enough to trade on a vast corpus of data in a human reasonable amount of time,. shortening those paths by making the parallel instead of sequential means it trains better. It's better able to understand more context and better being better able to understanding more context is like how language models have gotten so good lately.