Innovative Wafer Engine for AI Processing
This chapter explores a large-scale wafer engine designed for AI processing, which optimizes the training of large language models through weight streaming and partitioned computation. The chip places a substantial SRAM cache directly on the wafer, providing high memory bandwidth and supporting several streaming modes of computation. The discussion covers the evolution of wafer-scale engine architecture, support for models such as transformers, and the distinctive capabilities of the wafer setup, highlighting advances in efficient matrix computation and large-scale model training.
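The weight-streaming approach mentioned above can be illustrated with a minimal sketch. This is a hypothetical simulation, not the actual hardware's API: it assumes activations stay resident in fast on-chip memory while each layer's weights are fetched from off-chip storage, used once, and discarded. All names (`stream_layer`, `forward`, the layer sizes) are illustrative.

```python
# Minimal sketch of the weight-streaming idea: activations stay in fast
# on-chip memory while each layer's weights are streamed in from off-chip
# storage, used for one matrix multiply, then discarded.
import numpy as np

rng = np.random.default_rng(0)

# Off-chip "weight store": one weight matrix per layer (sizes illustrative).
layer_weights = [rng.standard_normal((64, 64)) for _ in range(4)]

def stream_layer(weight_store, layer_idx):
    """Simulate fetching one layer's weights onto the chip.

    In real hardware this would be a high-bandwidth transfer from
    external memory rather than a Python list lookup.
    """
    return weight_store[layer_idx]

def forward(x, weight_store):
    # Activations (x) remain "on chip" for the whole pass; only the
    # weights move, one layer at a time.
    for i in range(len(weight_store)):
        w = stream_layer(weight_store, i)   # stream this layer's weights in
        x = np.maximum(x @ w, 0.0)          # matmul + ReLU computed on chip
        del w                               # weights discarded after use
    return x

activations = forward(rng.standard_normal((8, 64)), layer_weights)
print(activations.shape)  # (8, 64)
```

The design point this captures is that on-chip memory only ever needs to hold one layer's weights plus the activations, so the model size is bounded by external storage rather than by on-chip SRAM.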