Understanding Low Rank and Tokenization in Embedding Spaces
This chapter explores low rank and dimensionality in embedding spaces, showing how high-dimensional vectors can lie on lower-dimensional hyperplanes. It also examines the mathematics of tokenization in models like GPT-2, including cost calculations and resources for working with tokenization libraries.
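The low-rank idea can be made concrete with a small sketch (illustrative, not from the chapter itself): construct vectors in a high-dimensional space that are all linear combinations of a handful of directions, and the matrix rank reveals the lower-dimensional hyperplane they live on. The dimensions (768, as in GPT-2's embedding size) and the subspace size (10) are arbitrary choices for illustration.

```python
import numpy as np

# Illustrative sketch: 1000 vectors in a 768-dimensional space (GPT-2's
# embedding width), constructed so they all lie on a 10-dimensional
# subspace through the origin.
rng = np.random.default_rng(0)
basis = rng.standard_normal((10, 768))    # 10 directions in R^768
coeffs = rng.standard_normal((1000, 10))  # coordinates within the subspace
vectors = coeffs @ basis                  # 1000 x 768 matrix

# The ambient dimension is 768, but the rank exposes the
# low-dimensional structure the vectors actually occupy.
rank = np.linalg.matrix_rank(vectors)
print(rank)  # 10, not 768
```

Real embedding matrices are rarely exactly low rank, but their singular values often decay quickly, so much of the variance is captured by a small number of directions.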