Practical AI: Machine Learning, Data Science, LLM

Latest episodes

Dec 1, 2020 • 44min

The world's largest open library dataset

Unsplash has released the world’s largest open library dataset, which includes 2M+ high-quality Unsplash photos, 5M keywords, and over 250M searches. They have big ideas about how the dataset might be used by ML/AI folks, and there have already been some interesting applications. In this episode, Luke and Tim discuss why they released this data and what it takes to maintain a dataset of this size.

Sponsors:
Linode – Get $100 in free credit to get started on Linode – our cloud of choice and the home of Changelog.com. Head to linode.com/changelog OR text CHANGELOG to 474747 to get instant access to that $100 in free credit.
Changelog++ – You love our content and you want to take it to the next level by showing your support. We’ll take you closer to the metal with no ads, extended episodes, outtakes, bonus content, a deep discount in our merch store (soon), and more to come. Let’s do this!
LaunchDarkly – Power experimentation at any scale. Fast and reliable feature management for the modern enterprise.

Featuring: Luke Chesser, Timothy Carbone, Chris Benson, Daniel Whitenack

Show Notes:
Unsplash
The world’s largest open library dataset from Unsplash
The Unsplash dataset on GitHub

Nov 24, 2020 • 51min

A casual conversation concerning causal inference

Lucy D’Agostino McGowan, cohost of the Casual Inference Podcast and a professor at Wake Forest University, joins Daniel and Chris for a deep dive into causal inference. Referring to current events (e.g. misreporting of COVID-19 data in Georgia) as examples, they explore how we interact with, analyze, trust, and interpret data - addressing underlying assumptions, counterfactual frameworks, and unmeasured confounders (Chris’s next Halloween costume).

Sponsors:
Linode
Changelog++
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com.
LaunchDarkly

Featuring: Lucy D’Agostino McGowan, Chris Benson, Daniel Whitenack

Show Notes:
Casual Inference Podcast
Casual Inference Podcast | Twitter
Communicating Complex Statistics (video)
Communicating Complex Statistics (slides)
Practical AI is a “Media Sponsor” of the R Conference | Government & Public Sector, where Lucy D’Agostino McGowan is giving the talk with Malcolm Barrett called “Causal Inference in R”, as well as a workshop with the same title. This will be the first ever R Conference focused on data science work in government, defense, and the public sector. Practical AI listeners get a special discount code valid for 20% off all ticket types, General & Academic Admission and workshops: PRACTICALAI20

Nov 17, 2020 • 49min

Building a deep learning workstation

What’s it like to try and build your own deep learning workstation? Is it worth it in terms of money, effort, and maintenance? Then once built, what’s the best way to utilize it? Chris and Daniel dig into these questions today as they talk about Daniel’s recent workstation build. He built a workstation for his NLP and Speech work with two GPUs, and it has been serving him well (minus a few things he would change if he did it again).

Sponsors:
Linode
Changelog++
Fastly

Featuring: Chris Benson, Daniel Whitenack

Show Notes:
Daniel’s workstation components:
CPU - AMD YD292XA8AFWOF Ryzen Threadripper 2920X
CPU cooler - Noctua NH-U12S TR4-SP3, Premium-Grade CPU Cooler for AMD sTRX4/TR4/SP3
Motherboard - GIGABYTE X399 AORUS PRO
Memory - Corsair Vengeance LPX 16GB (2x 2 packs), total 64GB
Storage 1 - Samsung (MZ-V7S1T0B/AM) 970 EVO Plus SSD 1TB
GPU 1 - RTX 2080 Ti
GPU 2 - Titan RTX
Case - Lian Li PC-O11AIR
Power Supply - Rosewill Hercules
Case fan(s) - Coolmaster 8mm
Daniel’s NUC 9 Extreme machine
References:
How to build the perfect Deep Learning Computer and save thousands of dollars
Curtis Northcut’s blog posts

Nov 9, 2020 • 51min

Killer developer tools for machine learning

Weights & Biases is coming up with some awesome developer tools for AI practitioners! In this episode, Lukas Biewald describes how these tools were a direct result of pain points that he uncovered while working as an AI intern at OpenAI. He also shares his vision for the future of machine learning tooling and where he would like to see people level up tool-wise.

Featuring: Lukas Biewald, Chris Benson, Daniel Whitenack

Show Notes:
Weights & Biases
Blueriver case study from W&B
W&B gallery

Oct 26, 2020 • 47min

Reinforcement Learning for search

Hamish from Sajari blows our minds with a great discussion about AI in search. In particular, he talks about Sajari’s quest for performant AI implementations and extensive use of Reinforcement Learning (RL). We’ve been wanting to make this one happen for a while, and it was well worth the wait.

Featuring: Hamish Ogilvy, Chris Benson, Daniel Whitenack

Show Notes:
Sajari
Blog post: “Reinforcement Learning Assisted Search Ranking”
Blog post: “Query Understanding 101”
Blog post: “The Inevitable Collision of Search and AI Tech”
Special offer from Sajari for Changelog listeners

Oct 20, 2020 • 48min

When data leakage turns into a flood of trouble

Rajiv Shah teaches Daniel and Chris about data leakage and its major impact on machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.

Sponsors:
DigitalOcean – DigitalOcean’s developer cloud makes it simple to launch in the cloud and scale up as you grow. They have an intuitive control panel, predictable pricing, team accounts, worldwide availability with a 99.99% uptime SLA, and 24/7/365 world-class support to back that up. Get your $100 credit at do.co/changelog.
Changelog++
Fastly

Featuring: Rajiv Shah, Chris Benson, Daniel Whitenack

Show Notes:
Rajiv Shah | University of Illinois at Chicago
Rajiv Shah | DataRobot Blog
DataRobot

Oct 13, 2020 • 55min

Productionizing AI at LinkedIn

Suju Rajan from LinkedIn joined us to talk about how they are operationalizing state-of-the-art AI at LinkedIn. She sheds light on how AI can be and is being used in recruiting, and she weaves in some great explanations of how graph-structured data, personalization, and representation learning can be applied to LinkedIn’s candidate search problem. Suju is passionate about helping people deal with machine learning technical debt, and that gives this episode a good dose of practicality.

Sponsors:
DigitalOcean
Changelog++
Fastly
Rollbar – We move fast and fix things because of Rollbar. Resolve errors in minutes. Deploy with confidence. Learn more at rollbar.com/changelog.

Featuring: Suju Rajan, Chris Benson, Daniel Whitenack

Show Notes:
The AI behind LinkedIn’s recruiter search
TensorFlow Extended (TFX)
Paper: “Hidden Technical Debt in Machine Learning Systems”

Oct 6, 2020 • 54min

R, Data Science, & Computational Biology

We’re partnering with the upcoming R Conference, because the R Conference is, well… amazing! Tons of great AI content, and they were nice enough to connect us to Daniel Chen for this episode. He discusses data science in Computational Biology and his perspective on data science project organization.

Sponsors:
DigitalOcean
Changelog++
Fastly

Featuring: Daniel Chen, Chris Benson, Daniel Whitenack

Show Notes:
R Conference:
Website
Twitter: @rstatsdc
Tickets
Discount code PRACTICALAI20 is good for 20% off every ticket type, including the conference & all workshops
Links relevant to the show:
William Stafford Noble 2009 - A Quick Guide to Organizing Computational Biology Projects
Greg Wilson, et al. 2014: “Best Practices for Scientific Computing”
Greg Wilson, et al. 2017: “Good enough practices in scientific computing”
Jenny Bryan’s code smells: link 1 and link 2
Jenny Bryan on naming things
JD Long’s talk at rstudio::conf this year about being empathetic
python’s version of pyprojroot
“Be kind: all else is details”. – Greg Wilson, Teaching Tech Together – The Rules
Books
“Pandas for Everyone” by Daniel Chen
“Advanced R” by Hadley Wickham

Sep 21, 2020 • 53min

Learning about (Deep) Learning

In anticipation of the upcoming NVIDIA GPU Technology Conference (GTC), Will Ramey joins Daniel and Chris to talk about education for artificial intelligence practitioners, and specifically the role that the NVIDIA Deep Learning Institute plays in the industry. Will’s insights from long experience are shaping how we all stay on top of AI, so don’t miss this ‘must learn’ episode.

Sponsors:
DigitalOcean
Changelog++
Fastly
Rollbar

Featuring: Will Ramey, Chris Benson, Daniel Whitenack

Show Notes:
NVIDIA GPU Technology Conference, October 5-9, 2020
20% off for our listeners! Use code CMINFDW20 by 9/25 for additional 20% off
NVIDIA Deep Learning Institute
NVIDIA to Acquire Arm for $40 Billion
NVIDIA Ampere Architecture
GeForce RTX 30 Series Graphics Cards
Practical AI Episode #15: Artificial intelligence at NVIDIA with Chief Scientist Bill Dally
Practical AI Episode #36: Growing up to become a world-class AI expert with Anima Anandkumar of NVIDIA and CalTech
Practical AI Episode #90: Fully-Connected with Chris and Daniel - Exploring NVIDIA’s Ampere & the A100 GPU

Sep 14, 2020 • 59min

When AI goes wrong

So, you trained a great AI model and deployed it in your app? It’s smooth sailing from there, right? Well, not in most people’s experience. Sometimes things go wrong, and you need to know how to respond to a real-life AI incident. In this episode, Andrew and Patrick from BNH.ai join us to discuss an AI incident response plan along with some general discussion of debugging models, discrimination, privacy, and security.

Sponsors:
Linode – Our cloud of choice and the home of Changelog.com. Deploy a fast, efficient, native SSD cloud server for only $5/month. Get 4 months free using the code changelog2019 OR changelog2020. To learn more and get started head to linode.com/changelog.
Pace.dev – Minimalist web-based management tool for your teams. Async by default communication and simplistic task management give you everything you need to build your next thing. Brought to you by Go Time panelist Mat Ryer. Try it out today!
Fastly

Featuring: Andrew Burt, Patrick Hall, Chris Benson, Daniel Whitenack

Show Notes:
AI Incident Response Checklist and other BNH.ai resources
“New Law Firm Tackles AI Liability” (article about BNH.ai)
In the realm of paper tigers – exploring the failings of AI ethics guidelines
Debugging Machine Learning Models workshop
Why you should care about debugging machine learning models
Strategies for model debugging
FTC: Using Artificial Intelligence and Algorithms
SR 11-7: Guidance on Model Risk Management
Apple Goldman case
California Consumer Privacy Act (CCPA)
Previous episode: Data management, regulation, the future of AI
