Sam Hogan, Founder and CEO of Kuzco, dives into the rapidly changing landscape of AI infrastructure. He highlights the shift from model training to inference, predicting it will dominate AI computing in just five years. Hogan elaborates on Kuzco's role as a marketplace for idle GPU power, utilizing crypto for transactions. The discussion progresses to insights on AI agents, their potential effects on human interaction, and the necessity of preserving genuine human connections amid the surge in AI-driven technology.
The podcast highlights a significant shift toward AI inference computing, which is predicted to account for 99% of AI compute within five years.
Kuzco is revolutionizing the GPU market by creating a global marketplace for idle compute resources, making AI development more accessible and cost-effective.
The discussion emphasizes the importance of maintaining human connections, urging caution against over-reliance on AI for social interactions in an increasingly automated world.
Deep dives
Shift from Training to Inference
The podcast discusses a notable shift in the AI landscape: historically, up to 95% of AI compute was dedicated to training models and only 5% to inference. As models have become increasingly useful, compute is now split roughly evenly between training and inference. Over the next few years, the balance is expected to tip decisively toward inference, potentially reaching 99%, as demand grows for real-world applications and real-time interactions. This evolution marks a new phase in AI development, where efficiently serving models in practical applications becomes more of a priority than simply developing them.
Kuzco's Market for Idle Compute
Sam Hogan, the founder of Kuzco, describes his company as a marketplace for AI inference, connecting sellers of idle compute resources with developers who need them. Kuzco aims to put previously underutilized hardware, such as consumer-grade GPUs and spare data center capacity, to work, offering a more cost-effective alternative to traditional providers. The platform launched with around 5,000 GPUs connected and is designed to give developers access to AI models without incurring high costs. This approach turns surplus capacity into a mutually beneficial arrangement for data centers and developers alike.
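To make the developer side of such a marketplace concrete, here is a minimal, hypothetical sketch that assumes an OpenAI-compatible chat completions endpoint. The URL, model name, and API key are placeholders for illustration only and are not Kuzco's actual API.

```python
# Hypothetical sketch: calling a marketplace-hosted model through an
# OpenAI-compatible chat completions endpoint. The URL, model name, and
# API key below are placeholders, not Kuzco's actual API.
import os
import requests

API_BASE = "https://api.example-inference-marketplace.com/v1"  # placeholder URL
API_KEY = os.environ.get("INFERENCE_API_KEY", "sk-placeholder")

def chat(prompt: str) -> str:
    """Send a single-turn chat request and return the model's reply."""
    resp = requests.post(
        f"{API_BASE}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "llama-3-8b-instruct",  # placeholder model name
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Explain training versus inference in one sentence."))
```

From the developer's perspective, only the base URL and credentials change; the request shape stays the same regardless of whose idle GPU ultimately serves it.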
The Realities of GPU Availability
The discussion highlights the evolving GPU market, particularly NVIDIA's H100 and H200, which saw a rapid increase in availability and a subsequent drop in prices due to oversupply. When GPUs were scarce, data centers took out loans and built up inventory aggressively, but the glut that followed challenged the sustainability of that model. The ongoing transition from scarcity to accessibility may intensify competition, but it also opens opportunities for a wider range of hardware and configurations to be used for inference without relying solely on the latest technology. This shift points to a diversification of the resources available for AI applications beyond traditional, centralized computing infrastructure.
Understanding Training vs Inference
The podcast underscores the critical distinction between training AI models and using them for inference. Training requires enormous amounts of compute and data to build models that can contain billions of parameters, while inference applies those trained models to real-world tasks, such as generating text responses in chat applications. Inference demands a different approach: the focus is on efficient resource utilization and fast responses rather than sustained, heavy computational loads. Recognizing this difference is vital to understanding how AI can be integrated into applications that require real-time interaction and responsiveness.
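As a rough illustration of that distinction, the toy PyTorch sketch below contrasts a single training step (forward pass, backpropagation, weight update) with an inference call (forward pass only, gradients disabled). The tiny linear model is purely illustrative, not representative of the billion-parameter models discussed in the episode.

```python
# Toy illustration of the training-versus-inference distinction using PyTorch.
# Training runs forward AND backward passes and updates weights; inference is a
# single forward pass with gradients disabled, which is far cheaper per request.
import torch
import torch.nn as nn

model = nn.Linear(16, 2)                      # stand-in for a large model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# --- Training step: compute loss, backpropagate, update parameters ---
x, y = torch.randn(32, 16), torch.randint(0, 2, (32,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()          # extra compute and memory for gradients
optimizer.step()

# --- Inference: forward pass only, no gradients, no weight updates ---
with torch.no_grad():
    prediction = model(torch.randn(1, 16)).argmax(dim=-1)
print(prediction.item())
```

The asymmetry is the point: training happens in large, infrequent batches on dense clusters, while inference happens constantly, one request at a time, which is why spare and heterogeneous hardware can serve it.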
The Future of AI and Human Interaction
The conversation also delves into the potential impacts of AI on human society, especially concerning social interactions and personal experiences. As AI technologies advance, there is a concern that individuals may become overly reliant on AI systems for social engagement, potentially isolating themselves from real-world interactions. This reinforces the notion that while technology can facilitate communication and convenience, it should not replace meaningful human experiences. A balance must be struck where technology enhances our lives without diminishing the fundamental human need for connection and physical presence.
In today's episode, Kuzco founder Sam Hogan explores AI's major infrastructure shift, explaining why inference is set to account for 99% of AI compute within five years. He details how Kuzco is building a global marketplace for idle GPU power, leveraging crypto for payments and verification. The conversation moves from technical infrastructure to the emergence of AI agents and their potential impacts, and concludes with insights on preserving human value and authentic connections in an increasingly AI-driven world.
Disclaimer: Nothing said on Empire is a recommendation to buy or sell securities or tokens. This podcast is for informational purposes only, and any views expressed by anyone on the show are solely our opinions, not financial advice. Santiago, Jason, and our guests may hold positions in the companies, funds, or projects discussed.