Intelligent Machines (Audio) IM 842: None Pizza Left Beef - AI On the Edge
Oct 23, 2025 — Joey de Villa, a developer and AI educator, joins the discussion on the future of AI tools that can run on personal machines. He delves into local language models and what Nvidia's Blackwell chip and HP's ZGX Nano mean for privacy and cost control. They explore real-world applications, from healthcare to personal projects, and weigh the accuracy concerns raised by a BBC study of AI-generated news summaries. Insights on the pushback against AI content round out a look at the evolving landscape of technology and ethics.
Local Blackwell SoC Enables On-Desk LLMs
- NVIDIA's Blackwell-based SoC enables powerful local LLM inference in a small form factor like HP's ZGX Nano.
- Running models locally shifts privacy, latency, and cost trade-offs away from cloud dependence.
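The cost side of that trade-off can be made concrete with a back-of-envelope break-even comparison. The sketch below is illustrative only: the per-token cloud price, hardware cost, amortization period, and power figure are all assumptions, not quoted prices.

```python
# Rough break-even sketch for local vs. cloud inference.
# All figures below are illustrative assumptions, not quoted prices.

def cloud_cost(tokens_per_month: float, usd_per_million_tokens: float) -> float:
    """Monthly cloud spend for a given token volume."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens

def local_monthly_cost(hardware_usd: float, months: int, power_usd: float) -> float:
    """Hardware amortized over its useful life, plus electricity."""
    return hardware_usd / months + power_usd

# Hypothetical workload: 50M tokens/month at $2 per million tokens,
# vs. a $3,000 local box amortized over 36 months plus $15/month power.
cloud = cloud_cost(50_000_000, 2.0)
local = local_monthly_cost(3000, 36, 15.0)
print(f"cloud ${cloud:.2f}/mo vs local ${local:.2f}/mo")
```

At these assumed numbers the two roughly break even; heavier token volumes tilt toward local hardware, lighter ones toward the cloud.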
Choose Fine-Tuning For Lasting Specialization
- Use fine-tuning when you need long-term, tailored behavior for a specific application or product domain.
- Use RAG (retrieval-augmented generation) for one-off, open-book queries where documents supply context at runtime.
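The "open-book" pattern in the second bullet can be sketched in a few lines: retrieve the most relevant document at query time and splice it into the prompt, instead of baking knowledge into the weights. The keyword-overlap retriever, toy corpus, and prompt template here are invented for illustration; real systems use embedding similarity.

```python
# Toy RAG sketch: pick the document with the most word overlap with the
# query, then supply it to the model as runtime context ("open book").
# Corpus and prompt template are invented for illustration.

def retrieve(query: str, docs: list[str]) -> str:
    """Return the doc sharing the most whole words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble a prompt that hands the retrieved doc to the model."""
    context = retrieve(query, docs)
    return f"Context: {context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The ZGX Nano ships with 128GB of unified memory.",
    "Fine-tuning bakes domain behavior into the model weights.",
]
print(build_prompt("How much unified memory does the ZGX Nano have?", docs))
```

Fine-tuning, by contrast, would train the second fact into the model itself, which pays off when the specialized behavior is needed on every request.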
Unified Memory Is The Real Constraint
- Unified high-bandwidth memory is the gating factor for fitting large models on local hardware.
- The ZGX Nano's 128GB of unified memory lets it host models of up to ~200B parameters by sharing RAM between CPU and GPU.
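The sizing claim above can be sanity-checked arithmetically: weight memory is roughly parameters times bits-per-weight, and a ~200B-parameter model only fits in 128GB once quantized to around 4 bits per weight. The 20% overhead factor for KV cache and activations below is an assumption, not a measured figure.

```python
# Back-of-envelope check of whether a model's weights fit in unified
# memory. Weights only; the overhead factor for KV cache and
# activations is an assumed illustrative value.

def model_memory_gb(params_billion: float, bits_per_weight: int,
                    overhead: float = 0.2) -> float:
    """Approximate resident size of a model in GB."""
    weight_gb = params_billion * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb * (1 + overhead)

# A ~200B-parameter model quantized to 4 bits/weight:
need = model_memory_gb(200, 4)
print(f"~{need:.0f} GB needed vs 128 GB unified memory: fits={need <= 128}")
```

At 16-bit precision the same model would need several hundred gigabytes, which is why aggressive quantization is what makes models of this size viable on a 128GB machine.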
