The Myth of AI Breakthroughs // Jonathan Frankle // #205

Jan 19, 2024

Jonathan Frankle, Chief Scientist at Databricks, discusses the realities and usefulness of AI, including face recognition systems, the 'lottery ticket hypothesis,' and robust decision-making protocols for training models. The conversation also covers Jonathan's work in law as an adjunct professor, his experience with GPUs, and the amusing claim of a revolutionary algorithm called Qstar.

Chapters

Introduction

00:00 • 4min

Pragmatic Approach, Scientific Community, and Gratitude

03:41 • 2min

Panel Discussions and LM Avalanche

05:20 • 24min

Building a Platform for Training Models and Pushing the Frontier of Knowledge

29:20 • 2min

Building Expertise into a Product and Ensuring Quality

31:40 • 7min

The Use of Slurm Orchestration Solution for ML Workloads

38:17 • 30min

Expressing Appreciation, Anticipation, and Hot Takes

01:08:03 • 2min

Jonathan Frankle works as Chief Scientist (Neural Networks) at MosaicML (recently acquired by Databricks), a startup dedicated to making it easy and cost-effective for anyone to train large-scale, state-of-the-art neural networks. He leads the research team.

MLOps podcast #205 with Jonathan Frankle, Chief Scientist (Neural Networks) at Databricks, The Myth of AI Breakthroughs, co-hosted by Denny Lee, brought to us by our Premium Brand Partner, Databricks.

// Abstract

Jonathan takes us behind the scenes of the rigorous work they undertake to test new knowledge in AI and to create effective and efficient model training tools. With a knack for cutting through the hype, Jonathan focuses on the realities and usefulness of AI and its application. We delve into issues such as face recognition systems, the 'lottery ticket hypothesis,' and robust decision-making protocols for training models. Our discussion extends into Jonathan's interesting move into the world of law as an adjunct professor, the need for healthy scientific discourse, his experience with GPUs, and the amusing claim of a revolutionary algorithm called Qstar.

// Bio

Jonathan Frankle is Chief Scientist (Neural Networks) at Databricks, where he leads the research team toward the goal of developing more efficient algorithms for training neural networks. He arrived via Databricks’ $1.3B acquisition of MosaicML as part of the founding team. He recently completed his PhD at MIT, where he empirically studied deep learning with Prof. Michael Carbin, specifically the properties of sparse networks that allow them to train effectively (his "Lottery Ticket Hypothesis" - ICLR 2019 Best Paper). In addition to his technical work, he is actively involved in policymaking around challenges related to machine learning. He earned his BSE and MSE in computer science at Princeton and has previously spent time at Google Brain and Facebook AI Research as an intern and Georgetown Law as an Adjunct Professor of Law.
// MLOps Jobs board

https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch

https://mlops-community.myshopify.com/

// Related Links

Website: www.jfrankle.com
Facial recognition: perpetuallineup.org
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks by Jonathan Frankle and Michael Carbin paper: https://arxiv.org/abs/1803.03635

--------------- ✌️Connect With Us ✌️ -------------

Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Denny on LinkedIn: https://linkedin.com/in/dennyglee
Connect with Jonathan on LinkedIn: https://www.linkedin.com/in/jfrankle/

Timestamps:

[00:00] Jonathan's preferred coffee
[01:16] Takeaways
[07:19] LM Avalanche Panel Surprise
[10:07] Adjunct Professor of Law
[12:59] Low facial recognition accuracy
[14:22] Automated decision making human in the loop argument
[16:09] Control vs. Outsourcing Concerns
[18:02] perpetuallineup.org
[23:41] Face Recognition Challenges
[26:18] The lottery ticket hypothesis
[29:20] Mosaic Role: Model Expertise
[31:40] Expertise Integration in Training
[38:19] SLURM opinions
[41:30] GPU Affinity
[45:04] Breakthroughs with QStar
[49:52] Deciphering the noise advice
[53:07] Real Conversations
[55:47] How to cut through the noise
[1:00:12] Research Iterations and Timelines
[1:02:30] User Interests, Model Limits
[1:06:18] Debugability
[1:08:00] Wrap up