The Data Scientist Show

Latest episodes

10 snips
Oct 23, 2023 • 1h 16min

Academia vs. Industry for Machine Learning, Research at Uber AI Labs, ML for Wind Farms - Jason Yosinski - The Data Scientist Show #070

Jason Yosinski, founding member of Uber AI Labs and co-founder of Windscape AI, discusses academia vs. industry in machine learning, challenges of understanding neural networks, ML for wind farms, and the significance of metrics in evaluating models. They also explore hobbies, personal development retreats, and the power and pitfalls of patterns in behavior.
12 snips
Sep 14, 2023 • 1h 26min

Ads forecasting at Netflix and Spotify, how to build your personal moat - Jeff Li - The Data Scientist Show #069

Jeff Li, a senior data scientist at Netflix and former data science manager at Spotify, discusses ads forecasting, career paths as a manager vs. IC, and the culture differences at Spotify, Netflix, and DoorDash. They also talk about the challenges of forecasting in finance and ads, detecting and accounting for seasonality and black swan events in advertising, transitioning from manager to senior data scientist, changes in tech stacks and data visualization tools, the future of forecasting, the importance of mentors in career growth, and the role of communication skills for data scientists.
18 snips
Aug 25, 2023 • 1h 14min

A/B testing at Airbnb, building next-gen experimentation platform at Eppo - Che Sharma - The Data Scientist Show #068

Che Sharma, former data scientist at Airbnb and founder of Eppo, talks about A/B testing best practices, A/B testing for ML models, and his career journey. They discuss successful A/B testing, interpreting and communicating test results, A/B testing best practices for ML models, centralizing experiment analysis, preparing data scientists for the future, developing communication skills, transitioning to a manager role, and the future of experimentation.
Aug 10, 2023 • 1h 55min

From data scientist@Meta to full-time YouTuber (500k+ sub), AI engineering, future of work - Tina Huang - The Data Scientist Show #067

We talked about self-learning, productivity, how Tina navigates her career change and how she thinks AI could change the future of work. Tina's YouTube: www.youtube.com/@TinaHuang1 Lonely Octopus: www.lonelyoctopus.com Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Tina Huang is a data scientist turned YouTube creator with 500k subscribers. She is the founder of Lonely Octopus, an online program helping people gain data science, AI, and freelancing skills. She originally studied pharmacology before transitioning into tech, completing a master's degree in computer science at UPenn. (00:02:38) Transitioning from Data Science to Content Creation (00:06:29) Preparing for Data Science Interviews (00:10:59) Starting a YouTube Channel (00:14:18) Building Multiple Income Streams (00:17:35) Getting Started with AI Skills (00:29:29) Advice for Starting YouTube (00:34:47) Improving Storytelling Skills (00:36:58) Overcoming Procrastination (00:42:33) The Future of Work (01:47:08) Looking to the Future (01:26:49) Income Breakdown
Aug 1, 2023 • 1h 27min

Making LLMs hallucinate less, how to diagnose ML models, from PM in Google AI to CEO of Galileo - Vikram Chatterji - The Data Scientist Show #066

Vikram is the co-founder of Galileo – an AI diagnostics and explainability platform used by data science teams building NLP, LLMs and Computer Vision models across the Fortune 500 and high growth startups. Prior to Galileo, Vikram led Product Management at Google AI, where his team built models for the Fortune 2000 across retail, financial services, healthcare and contact centers. He has a master's degree from the School of Computer Science at Carnegie Mellon University. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Resources: LLM Studio: https://www.rungalileo.io/blog/announcing-llm-studio Galileo: https://www.rungalileo.io/ Blog on LLM Hallucination: https://thesequence.substack.com/p/guest-post-stop-hallucinations-from Vikram Chatterji’s LinkedIn: https://www.linkedin.com/in/vikram-chatterji/ "The Mom Test": https://www.amazon.com/The-Mom-Test-Rob-Fitzpatrick-audiobook/dp/B07RJZKZ7F Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:04:24) How he got into machine learning (00:06:53) Diagnosing large language models (00:09:56) Addressing model hallucination (00:12:46) Metrics for measuring hallucination (00:17:30) From Google AI to starting Galileo (00:24:08) Developing LLMs and putting them into production (00:32:51) Galileo's diagnostics and explainability platform (00:43:16) Advice for data scientists when joining a startup
Jul 28, 2023 • 1h 53min

Data Science "Mix Martial Arts", applied re-inforcement learning, scaling AI workloads using Ray - Max Pumperla - The Data Scientist Show #065

Max Pumperla designed his own career path in data science. He is a freelance software engineer at AnyScale, and also a data science professor. We talked about reinforcement learning, open source contributions, Ray for data scientists, and his view on the data scientist's role. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Max’s LinkedIn: https://www.linkedin.com/in/max-pumperla-a8099354/ Max's GitHub: https://github.com/maxpumperla Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:09:19) How he got a remote job through Twitter (00:14:06) Introduction to Ray (00:18:52) Reinforcement learning (00:23:56) Key lessons on integrating customer feedback (00:35:12) Flaws in data science job titles (00:45:51) How to be irreplaceable as a data scientist (00:48:55) An unconventional career path as a data scientist (01:12:24) Productivity and work-life balance (01:28:10) Advice for building a personal brand
5 snips
Jul 4, 2023 • 1h 50min

Uber's ML Systems (Uber Eats, Customer Support), Declarative Machine Learning - Piero Molino - The Data Scientist Show #064

Piero Molino was one of the founding members of Uber AI Labs. He worked on several deployed ML systems, including an NLP model for Customer Support, and the Uber Eats Recommender System. He is the author of Ludwig, an open source declarative deep learning framework. In 2021 he co-founded Predibase, the low-code declarative machine learning platform built on top of Ludwig. Piero's LinkedIn: https://www.linkedin.com/in/pieromolino Predibase free access: bit.ly/3PCeqqw Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:54) Journey to machine learning (00:03:51) Recommending system at Uber Eats (00:04:13) Projects at Uber AI (00:09:34) Uber's customer obsession ticket system (00:16:01) How to evaluate online-offline business and model performance metrics (00:17:16) Customer Satisfaction (00:28:38) When do you know whether a project is good enough (00:41:50) Declarative machine learning and Ludwig (00:45:32) Ludwig vs AutoML (00:54:44) Working with Professor Chris Re (00:58:32) Why he started Predibase (01:07:56) LLM and GenAI (01:10:17) Challenges for LLMs (01:22:36) Advice for data scientists (01:34:29) Career advice to his younger self
10 snips
Jun 26, 2023 • 47min

Data science in transportation, the intersection of operations research and ML - Holger Teichgraeber - The Data Scientist Show #063

Holger Teichgraeber is a Data Science Manager at Archer Aviation. Previously, he worked at Convoy as a Research Scientist on their trucking marketplace, and at various companies in the energy space. Holger has a Bachelor's degree in Mechanical Engineering from Aachen, Germany, and a Masters and Ph.D. with research focus on machine learning and optimization applied to energy systems from Stanford University. He regularly writes on LinkedIn, with the goal to show how to build valuable products at the intersection of machine learning and optimization in production. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Holger's LinkedIn: https://www.linkedin.com/in/holgerteichgraeber/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:28) How he got into operations research (00:02:39) Operation research vs data science (00:04:37) Trucking optimization at Convoy (00:08:42) Optimization problem (00:10:18) Strategic planning on air mobility at Archer (00:13:50) Using simulation and solving a problem (00:16:45) Big data science work vs smaller data science work (00:21:23) Stakeholder management (00:29:28) IC vs Manager (00:32:04) Advice on promotion (00:39:12) Work cultures in Germany and the US (00:41:16) How to handle tight deadlines (00:43:21) Important feedback from his work (00:44:14) How to plan projects (00:44:45) Next big challenge for data science teams (00:45:40) Career growth in the next few years (00:46:01) Connect with Holger
May 18, 2023 • 1h 22min

Tackling data quality issues, 5 pillars of data observability, from management consultant to CEO of Monte Carlo - Barr Moses - The Data Scientist Show #062

Barr Moses is a consultant turned CEO & Co-Founder of Monte Carlo, a data reliability company. She started her career as a management consultant at Bain & Company and a research assistant at the Statistics Department at Stanford University. Later, she became VP of Customer Operations at customer success company Gainsight, where she built the data and analytics team. She also served in the Israeli Air Force as a commander of an intelligence data analyst unit. Barr graduated from Stanford with a B.Sc. in Mathematical and Computational Science. Today, we’ll talk about Barr’s career journey, data reliability and observability, and what it means for data teams. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science. Barr's LinkedIn: https://www.linkedin.com/in/barrmoses/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:24) How she got into data science (00:08:26) Frameworks for data-driven decisions (00:11:20) Is a customer support ticket always bad? (00:15:20) How to quickly find out what is true (00:20:17) Struggles in the data team (00:23:37) Daliana’s story about lineage (00:28:00) People stressed about data (00:28:09) Netflix was down because of wrong data (00:30:40) Common issues with data quality (00:33:14) 5 pillars of data observability (00:39:14) How does Monte Carlo help data scientists (00:43:08) Build in-house vs adopt tools (00:45:48) How Daliana fixed a data quality issue (01:02:44) How to measure the impact of the data team (01:09:09) Mistakes she made (01:15:28) Beat the odds
Feb 21, 2023 • 1h 27min

Is search dead? Google vs ChatGPT, from Google Search to enterprise search at Glean, machine learning in search, tech layoffs - Deedy Das - The Data Scientist Show #061

Deedy Das is a founding engineer at Glean, an enterprise search startup. Previously, he was a Tech Lead at Google Search working on query understanding and the sports product in New York, Tel Aviv, and Bangalore. Before that, he was an engineer at Facebook New York and graduated from Cornell University. Outside of work, Deedy writes on his blog. He published a viral resume template and his work on exposing grading flaws in the Indian education system. He also enjoys running marathons, road cycling, and playing cricket. Today we’ll talk about the search projects he worked on at Google, why he left Google, his current work at Glean, and his thoughts on whether Google is doomed because of ChatGPT. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science. Deedy's Twitter: https://twitter.com/debarghya_das?s=20 Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:52) What is search (00:04:33) Query understanding (00:12:46) Google vs ChatGPT (00:18:24) Fixing bug for Sundar Pichai (00:27:33) Why he left Google (00:30:32) How to get into search (00:34:38) Enterprise search at Glean (00:46:55) Advice for people who got laid off (00:48:41) What do search engineers do (00:51:37) How he evaluates candidates (00:53:58) Future of search (00:57:16) Why the web is declining (00:59:25) Copilot and AI-powered developer tools (01:03:46) Indian startup ecosystem (01:07:45) India vs Silicon Valley (01:09:48) How he grew 30k followers on Twitter (01:13:28) Daliana and Deedy’s challenge with social media (01:19:31) Career mistakes he made