
Test-Time Adaptation: the key to reasoning with DL (Mohamed Osman)

Machine Learning Street Talk (MLST)

CHAPTER

Exploring Transformer Limitations and Architectural Fixes

This chapter examines the limitations of self-attention in transformers, particularly on tasks like counting and copying. It discusses how representational squashing and the softmax function degrade performance, and proposes deeper architectural changes to improve AI capabilities.
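
The softmax point can be made concrete: because attention weights must sum to 1 and logits are bounded in practice, the sharpest attention a single token can receive decays as context length grows, which is one mechanism consistent with the counting and copying failures discussed here. A minimal NumPy sketch (the fixed logit gap of 5.0 is an illustrative assumption, not a value from the episode):

```python
import numpy as np

def max_softmax_weight(seq_len: int, logit_gap: float = 5.0) -> float:
    """Largest attention weight one token can receive when its logit
    exceeds every other (equal) logit by `logit_gap`."""
    logits = np.zeros(seq_len)
    logits[0] = logit_gap          # the token we want to attend to sharply
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()       # softmax normalization
    return weights[0]

for n in [16, 256, 4096, 65536]:
    print(f"seq_len={n:6d}  max weight={max_softmax_weight(n):.4f}")
```

With a fixed gap, the top weight falls from roughly 0.91 at 16 tokens to under 0.3% at 65k tokens: the attention distribution is forced to disperse, so no single item can be picked out cleanly at long contexts.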
