What limits AI today isn’t imagination – it’s the cost of running it at scale.
In this episode of Inference, Ksenia Se sits down with Lin Qiao, co-founder & CEO of Fireworks AI (an inference-first company) and former head of PyTorch at Meta, where she led the rebuild of Meta’s entire AI infrastructure stack.
We talk about:
Why product-market fit can be the beginning of bankruptcy in GenAI
The iceberg problem of hidden GPU costs
Why inference scales with people, not researchers
2025 as the year of AI agents (coding, hiring, SRE, customer service, medical, marketing)
Open vs closed models – and why Chinese labs are setting new precedents
The coming wave of 100× more efficient AI infrastructure
Watch to hear Lin’s vision for inference, alignment, and the future of AI infrastructure.
And, at the end, Lin shares her very personal story of overcoming fear. Don’t miss it!
Did you like the episode? You know the drill:
📌 Subscribe for more conversations with the builders shaping real-world AI.
💬 Leave a comment if this resonated.
👍 Like it if you liked it.
🫶 Thank you for watching and sharing!
Guest:
Lin Qiao, co-founder & CEO of Fireworks AI and former head of PyTorch at Meta
https://www.linkedin.com/in/lin-qiao-22248b4
https://x.com/lqiao
https://x.com/FireworksAI_HQ
https://fireworks.ai/
📰 Want the transcript and edited version?
Subscribe to Turing Post https://www.turingpost.com/subscribe
Turing Post is a newsletter about AI's past, present, and future. Publisher Ksenia Se explores how intelligent systems are built – and how they’re changing how we think, work, and live.
Sign up: Turing Post: https://www.turingpost.com
Follow us
Ksenia and Turing Post:
https://x.com/TheTuringPost
https://www.linkedin.com/in/ksenia-se
https://huggingface.co/Kseniase