Gavin Uberti - Real-Time AI & The Future of AI Hardware - [Invest Like the Best, EP.356]

Invest Like the Best with Patrick O'Shaughnessy

NOTE

Using Model-Specific Chips for Faster Response Times

Model-specific chips, designed specifically for transformer models, can significantly reduce the initial response delay by providing more compute capacity and using it more efficiently. By embedding these models in chips that read memory efficiently, it is possible to achieve over 90% utilization and cut response times from seconds to milliseconds.
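
To make the link between utilization and initial delay concrete, here is a rough back-of-the-envelope sketch. It assumes the common approximation that a dense transformer forward pass costs about 2 FLOPs per parameter per token, and that the initial delay is dominated by processing the prompt. The function name, model size, prompt length, peak throughput, and the 30% baseline utilization are hypothetical illustration values, not figures from the episode; only the "over 90%" utilization comes from the note above.

```python
def time_to_first_token(params: float, prompt_tokens: int,
                        peak_flops: float, utilization: float) -> float:
    """Estimate seconds needed to process a prompt before the first output token.

    Assumes ~2 FLOPs per parameter per token for a dense transformer forward
    pass, and that the chip sustains `utilization` of its peak throughput.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    prompt_flops = 2 * params * prompt_tokens
    return prompt_flops / (peak_flops * utilization)


# Hypothetical numbers, for illustration only.
PARAMS = 7e9          # assumed model size (parameters)
PROMPT_TOKENS = 1000  # assumed prompt length
PEAK_FLOPS = 1e15     # assumed peak chip throughput (FLOP/s)

baseline = time_to_first_token(PARAMS, PROMPT_TOKENS, PEAK_FLOPS, 0.30)
specialized = time_to_first_token(PARAMS, PROMPT_TOKENS, PEAK_FLOPS, 0.90)

print(f"30% utilization: {baseline * 1e3:.1f} ms")
print(f"90% utilization: {specialized * 1e3:.1f} ms")
print(f"speedup from utilization alone: {baseline / specialized:.1f}x")
```

In this simplified model the delay scales inversely with both peak compute and utilization, so the two effects named in the note multiply: a chip with more raw compute that also sustains a higher fraction of it shortens the initial delay by the product of the two gains.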
