
The Gradient: Perspectives on AI
Deeply researched, technical interviews with experts thinking about AI and technology. thegradientpub.substack.com
Latest episodes

Oct 27, 2022 • 44min
Luis Voloch: AI and Biology
* Have suggestions for future podcast guests (or other feedback)? Let us know here!
* Want to write with us? Send a pitch using this form :)

In episode 46 of The Gradient Podcast, Daniel Bashir speaks to Luis Voloch.

Luis is co-founder of Immunai, a leading AI-led drug discovery company based out of NYC & Tel Aviv, with over 140 employees and a valuation of over one billion dollars. Before Immunai, Luis was Head of Data Science and Machine Learning at ITC and worked at Palantir, where he worked on a variety of ML efforts. He did his studies and research in Math and CS at MIT. He has also led AI, genomics, and software efforts at a number of other companies.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (02:25) Luis’s math background and getting into AI
* (06:35) Luis’s PhD experience, proving theoretical guarantees for recommendation systems
* (09:45) Why Luis left his PhD
* (15:45) Why Luis is excited about the intersection of ML and biology
* (18:28) Challenges of applying AI to biology
* (22:55) Immunai
* (27:03) Challenges in building a biotech (or “tech-bio”) company
* (30:30) Research at Immunai, Neural Design for Genetic Perturbation Experiments
* (34:43) Interpretability in ML + biology
* (36:00) What Luis plans to do next
* (37:55) Luis’s advice for grad students / ML people interested in biology
* (40:00) Luis’s perspective on the future of AI + biology
* (43:10) Outro

Links:
* Luis on LinkedIn, Crunchbase
* Luis’s article on The convergence of deep neural networks and immunotherapy
* Papers:
  * Luis’s thesis
  * Neural Design for Genetic Perturbation Experiments
  * SystemMatch: optimizing preclinical drug models to human clinical outcomes via generative latent-space matching

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Oct 13, 2022 • 1h 41min
Zachary Lipton: Where Machine Learning Falls Short
* Have suggestions for future podcast guests (or other feedback)? Let us know here!
* Want to write with us? Send a pitch using this form :)

In episode 45 of The Gradient Podcast, Daniel Bashir speaks to Zachary Lipton.

Zachary is an Assistant Professor of Machine Learning and Operations Research at Carnegie Mellon University, where he directs the Approximately Correct Machine Intelligence Lab. He holds a joint appointment between CMU’s ML Department and Tepper School of Business, and courtesy appointments at the Heinz School of Public Policy and the Software and Societal Systems Department. His research spans core ML methods and theory, applications in healthcare and natural language processing, and critical concerns about algorithms and their impacts.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (2:30) From jazz music to AI
* (4:40) “fix it in post” (we had some technical issues :))
* (4:50) spicy takes, music and tech
* (7:30) Zack’s plan to get into grad school
* (9:45) selection bias in who gets faculty positions
* (12:20) The slow development of Zack’s wide range of research interests, Zack’s strengths coming into ML research
* (22:00) How Zack got attention early in his PhD
* (27:00) Should PhD students meander?
* (30:30) Faults in the QA model literature
* (35:00) Troubling Trends, antecedents in other fields
* (39:40) Pretraining LMs on nonsense words, new paper!
  * The new paper (9/29)
* (47:25) what “BERT learns linguistic structure” misses
* (56:00) making causal claims in ML
* (1:05:40) domain-adversarial networks don’t solve distribution shift, underspecified problems
* (1:09:10) the benefits of floating between communities
* (1:14:30) advice on finding inspiration and learning
* (1:16:00) “fairness” and ML solutionism
* (1:21:10) epistemic questions, how we make determinations of fairness
* (1:29:00) Zack’s drives and motivations

Links:
* Zachary’s Homepage
* Papers:
  * DL Foundations, Distribution Shift, Generalization:
    * Does Pretraining for Summarization Require Knowledge Transfer?
    * How Much Reading Does Reading Comprehension Require?
    * Learning Robust Global Representations by Penalizing Local Predictive Power
    * Detecting and Correcting for Label Shift with Black Box Predictors
    * RATT
  * Explanation/Interpretability/Fairness:
    * The Mythos of Model Interpretability
    * Evaluating Explanations
    * Does mitigating ML’s impact disparity require treatment disparity?
    * Algorithmic Fairness from a Non-ideal Perspective
  * Broader perspectives/critiques:
    * Troubling Trends in Machine Learning Scholarship
    * When Curation Becomes Creation

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Oct 6, 2022 • 1h 11min
Stuart Russell: The Foundations of Artificial Intelligence
Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 44 of The Gradient Podcast, Daniel Bashir speaks to Professor Stuart Russell.

Stuart Russell is a Professor of Computer Science and the Smith-Zadeh Professor in Engineering at UC Berkeley, as well as an Honorary Fellow at Wadham College, Oxford. Professor Russell is the co-author with Peter Norvig of Artificial Intelligence: A Modern Approach, probably the most popular AI textbook in history. He is the founder and head of Berkeley’s Center for Human-Compatible Artificial Intelligence and recently authored the book Human Compatible: Artificial Intelligence and the Problem of Control. He has also served as co-chair on the World Economic Forum’s Council on AI and Robotics.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (02:45) Stuart’s introduction to AI
* (05:50) The two most important questions
* (07:25) Historical perspectives during Stuart’s PhD, agents and learning
* (14:30) Rationality and Intelligence, Bounded Optimality
* (20:30) Stuart’s work on Metareasoning
* (29:45) How does Metareasoning fit with Bounded Optimality?
* (37:39) “Civilization advances by reducing complex operations to be trivial”
* (39:20) Reactions to the rise of Deep Learning, connectionist/symbolic debates, probabilistic modeling
* (51:00) The Deep Learning and traditional AI communities will adopt each other’s ideas
* (51:55) Why Stuart finds the self-driving car arena interesting, Waymo’s old-fashioned AI approach
* (57:30) Effective generalization without the full expressive power of first-order logic: deep learning is “a weird way to go about it”
* (1:03:00) A very short shrift of Human Compatible and its ideas
* (1:10:42) Outro

Links:
* Stuart’s webpage
* Human Compatible page with reviews and interviews
* Papers mentioned:
  * Rationality and Intelligence
  * Principles of Metareasoning

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Sep 29, 2022 • 51min
Varun Ganapathi: AKASA, AI and Healthcare
Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 43 of The Gradient Podcast, Daniel Bashir speaks to Varun Ganapathi.

Varun is co-founder and CTO at AKASA, a company developing AI systems for healthcare operations. Varun’s previous entrepreneurial experience includes co-founding Numovis, a company focused on motion tracking and computer vision for user interaction that was acquired by Google, and Terminal.com, a browser-based IDE acquired by Udacity. Varun received his PhD from Stanford in 2014.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (1:50) Varun’s intro to AI
* (3:25) Working with Andrew Ng
* (7:37) Varun’s road to a PhD
* (13:20) Numovis, Google acquisition
* (15:00) Vacillating between research and entrepreneurship, Terminal.com
* (17:10) Roots of Varun’s interest in AI + healthcare
* (22:30) Research at AKASA, Deep Claim
* (25:45) Causality in claim denial, expert knowledge
* (25:52) we need to trademark the word “gradient”
* (28:20) AKASA’s Unified Automation, expert-in-the-loop
* (34:15) Varun’s near-term and long-term visions for AKASA
* (39:50) Towards “deploying a new version of healthcare”
* (42:25) Varun’s perspective on the role of AI in healthcare, the need for humans in the loop
* (47:02) Varun’s advice for aspiring AI researchers and practitioners
* (51:00) Outro

Links:
* AKASA’s Homepage
* Varun’s research
* AKASA is hiring!

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Sep 22, 2022 • 1h 39min
Joel Lehman: Open-Endedness and Evolution through Large Models
Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 42 of The Gradient Podcast, Daniel Bashir speaks to Joel Lehman.

Joel is a machine learning scientist interested in AI safety, reinforcement learning, and creative open-ended search algorithms. Joel has spent time at Uber AI Labs and OpenAI and is the co-author of the book Why Greatness Cannot Be Planned: The Myth of the Objective.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:40) From game development to AI
* (03:20) Why evolutionary algorithms
* (10:00) Abandoning Objectives: Evolution Through the Search for Novelty Alone
* (24:10) Measuring a desired behavior post-hoc vs. optimizing for that behavior
* (27:30) NeuroEvolution of Augmenting Topologies (NEAT), Evolving a Diversity of Virtual Creatures
* (35:00) Humans are an inefficient solution to evolution’s objectives
* (47:30) Is embodiment required for understanding? Today’s LLMs as practical thought experiments in disembodied understanding
* (51:15) Evolution through Large Models (ELM)
* (1:01:07) ELM: Quality Diversity Algorithms, MAP-Elites, bootstrapping training data
* (1:05:25) Dimensions of Diversity in MAP-Elites, what is “interesting”?
* (1:12:30) ELM: Fine-tuning the language model
* (1:18:00) Results of invention in ELM, complexity in creatures
* (1:20:20) Future work building on ELM, key challenges in open-endedness
* (1:24:30) How Joel’s research affects his approach to life and work
* (1:28:30) Balancing novelty and exploitation in work
* (1:34:10) Intense competition in AI, Joel’s advice for people considering ML research
* (1:38:45) Daniel isn’t the worst interviewer ever
* (1:38:50) Outro

Links:
* Joel’s webpage
* Evolution through Large Models: The Tweet
* Papers:
  * Abandoning Objectives: Evolution through the search for novelty alone
  * Evolving a diversity of virtual creatures through novelty search and local competition
  * Designing neural networks through neuroevolution
  * Evolution through Large Models
* Resources for (aspiring) ML researchers!
  * Cohere for AI
  * ML Collective

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Sep 15, 2022 • 57min
Andrew Feldman: Cerebras and AI Hardware
Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 42 of The Gradient Podcast, Daniel Bashir speaks to Andrew Feldman.

Andrew is the co-founder and CEO of Cerebras Systems, an AI accelerator company that has built the largest processor in the industry. Before Cerebras, Andrew co-founded and served as CEO of SeaMicro, which was acquired by AMD in 2012. He has also served in executive positions at Force10 Networks and RiverStone Networks.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (02:05) Andrew’s trajectory, from business school to Cerebras
* (10:00) The large model problem and Cerebras’s approach
* (19:50) Cerebras’s GPT-J announcement
* (22:20) Andrew explains weight streaming to Daniel
* (32:30) Andrew’s thoughts on the MLPerf benchmark
* (38:20) The venture landscape for AI accelerator companies
* (42:50) The hardware lottery, hardware support for sparsity
* (45:40) The CHIPS Act, the NVIDIA China ban, and the accelerator industry
* (48:00) Politics and Chips, US and China
* (52:20) Andrew’s perspective on tackling difficult problems
* (56:42) Outro

Links:
* Cerebras’s Homepage
* GPT-J Announcement
* TotalEnergies
* GlaxoSmithKline (GSK)
* Sources mentioned:
  * “Political Chips” by Ben Thompson (because Daniel’s a fanboy)
  * Daniel’s conversation with Sara Hooker
  * The Hardware Lottery

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Sep 8, 2022 • 1h 12min
Christopher Manning: Linguistics and the Development of NLP
Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 41 of The Gradient Podcast, Daniel Bashir speaks to Christopher Manning.

Chris is the Director of the Stanford AI Lab and an Associate Director of the Stanford Human-Centered Artificial Intelligence Institute. He is an ACM Fellow, an AAAI Fellow, and past President of ACL. His work currently focuses on applying deep learning to natural language processing; it has included tree recursive neural networks, GloVe, neural machine translation, and computational linguistic approaches to parsing, among other topics.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (02:40) Chris’s path to AI through computational linguistics
* (06:10) Human language acquisition vs. ML systems
* (09:20) Grounding language in the physical world, multimodality and DALL-E 2 vs. Imagen
* (26:15) Chris’s Linguistics PhD, splitting time between Stanford and Xerox PARC, corpus-based empirical NLP
* (34:45) Rationalist and Empiricist schools in linguistics, Chris’s work in the 1990s
* (45:30) GloVe and Attention-based Neural Machine Translation, global and local context in language
* (50:30) Different Neural Architectures for Language, Chris’s work in the 2010s
* (58:00) Large-scale Pretraining, learning to predict the next word helps you learn about the world
* (1:00:00) mBERT’s Internal Representations vs. Universal Dependencies Taxonomy
* (1:01:30) The Need for Inductive Priors for Language Systems
* (1:05:55) Courage in Chris’s Research Career
* (1:10:50) Outro (yes, Daniel does have a new outro with ~ music ~)

Links:
* Chris’s webpage
* Papers (1990s-2000s):
  * Distributional Phrase Structure Induction
  * Fast exact inference with a factored model for Natural Language Parsing
  * Accurate Unlexicalized Parsing
  * Corpus-based induction of syntactic structure
  * Foundations of Statistical Natural Language Processing
* Papers (2010s):
  * Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
  * GloVe
  * Effective Approaches to Attention-based Neural Machine Translation
  * Stanford’s Graph-based Neural dependency parser
* Papers (2020s):
  * Electra: Pre-training text encoders as discriminators rather than generators
  * Finding Universal Grammatical Relations in Multilingual BERT
  * Emergent linguistic structure in artificial neural networks trained by self-supervision

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Sep 1, 2022 • 1h 9min
Jeff Clune: Genetic Algorithms, Quality-Diversity, Curiosity
In episode 41 of The Gradient Podcast, Andrey Kurenkov speaks to Professor Jeff Clune.

Jeff is an Associate Professor of Computer Science at the University of British Columbia and a Faculty Member of the Vector Institute. Previously, he was a Research Team Leader at OpenAI; before that, a Senior Research Manager and founding member of Uber AI Labs; and prior to that, an Associate Professor in Computer Science at the University of Wyoming.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
(00:00) Intro
(01:05) Path into AI
(08:05) Studying biology with simulations
(10:30) Overview of genetic algorithms
(14:00) Evolving gaits with genetic algorithms
(20:00) Quality-Diversity Algorithms
(27:00) Evolving Soft Robots
(32:15) Genetic algorithms for studying evolution
(39:30) Modularity for Catastrophic Forgetting
(45:15) Curiosity for Learning Diverse Skills
(51:15) Evolving Environments
(58:3) The Surprising Creativity of Digital Evolution
(1:04:28) Hobbies Outside of Research
(1:07:25) Outro

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Aug 26, 2022 • 47min
Catherine Olsson and Nelson Elhage: Anthropic, Understanding Transformers
In episode 40 of The Gradient Podcast, Andrey Kurenkov speaks to Catherine Olsson and Nelson Elhage.

Catherine and Nelson are both members of technical staff at Anthropic, which is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems. Catherine and Nelson’s focus is on interpretability, and we will discuss several of their recent works in this interview.

Follow The Gradient on Twitter

Outline:
(00:00) Intro
(01:10) Catherine’s Path into AI
(03:25) Nelson’s Path into AI
(05:23) Overview of Anthropic
(08:21) Mechanistic Interpretability
(15:15) Transformer Circuits
(21:30) Toy Transformer
(27:25) Induction Heads
(31:00) In-Context Learning
(35:10) Evidence for Induction Heads Enabling In-Context Learning
(39:30) What’s Next
(43:10) Replicating Results
(46:00) Outro

Links:
Anthropic
Zoom In: An Introduction to Circuits
Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases
A Mathematical Framework for Transformer Circuits
In-context Learning and Induction Heads
PySvelte

Get full access to The Gradient at thegradientpub.substack.com/subscribe

Aug 18, 2022 • 1h 12min
Been Kim: Interpretable Machine Learning
In episode 38 of The Gradient Podcast, Daniel Bashir speaks to Been Kim.

Been is a staff research scientist at Google Brain focused on interpretability: helping humans communicate with complex machine learning models by not only building tools but also studying how humans interact with these systems. She has served with a number of conferences, including ICLR, NeurIPS, ICML, and AISTATS. She gave keynotes at ICLR 2022, ECML 2020, and the G20 meeting in Argentina in 2018. Her work TCAV received the UNESCO Netexplo award, was featured at Google I/O 2019, and appears in Brian Christian’s book The Alignment Problem.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:
(00:00) Intro
(02:20) Path to AI/interpretability
(06:10) The Progression of Been’s thinking / PhD thesis
(11:30) Towards a Rigorous Science of Interpretable Machine Learning
(24:52) Interpretability and Software Testing
(27:00) Been’s ICLR Keynote and Human-Machine “Language”
(37:30) TCAV
(43:30) Mood Board Search and CAV Camera
(48:00) TCAV’s Limitations and Follow-up Work
(56:00) Acquisition of Chess Knowledge in AlphaZero
(1:07:00) Daniel spends a very long time asking “what does it mean to you to be a researcher?”
(1:09:00) The everyday drudgery, more lessons from Been
(1:11:32) Outro

Links:
* Been’s website
* CAVcamera app

Get full access to The Gradient at thegradientpub.substack.com/subscribe