The Data Scientist Show - Daliana Liu

Latest episodes

Nov 12, 2023 • 55min

Machine learning in cybersecurity, computer vision in sports, from business analyst to ML engineer - Betty Zhang - The Data Scientist Show #072

Betty Zhang is a data scientist at a cloud security company; previously, she was a data scientist at Amazon Web Services. Today we’ll talk about her computer vision projects in sports, data science use cases in cybersecurity, her path from business major to data scientist, and her experience working at startups vs. big tech companies.

Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.

Betty’s LinkedIn: https://www.linkedin.com/in/betty-zhang-0bb63731/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/

(00:00:00) Introduction
(00:01:21) Computer Vision Project in Sports at AWS
(00:12:28) Challenges in computer vision
(00:14:02) Time allocation for ML projects
(00:15:22) 3 key skills for computer vision
(00:17:20) From business analyst to ML engineer
(00:18:14) How she got her data scientist job through LinkedIn
(00:21:32) How she got into Amazon
(00:22:17) Three tech skills needed during Amazon interviews
(00:26:11) Why she joined a Cyber Security startup
(00:27:22) Three cybersecurity use cases
(00:29:47) Anomaly detection
(00:30:40) ML for cybersecurity
(00:34:43) Tech stacks Amazon vs Startups
(00:39:35) Startups vs big tech
(00:45:56) Balance learning and impact
(00:48:35) Advice for new data scientists
Nov 4, 2023 • 1h 4min

Stop abusing A/B testing, toxic experimentation culture, how to run A/B tests with rigor - Che Sharma - The Data Scientist Show #071

Che Sharma, former data scientist at Airbnb and founder of Eppo, discusses toxic behaviors in experimentation culture, A/B testing best practices, and A/B testing for ML models on The Data Scientist Show. Topics include statistical power, effect size, monitoring metrics, alternative methods to A/B testing, difference in differences method, and A/B testing in ML and AI.
Oct 23, 2023 • 1h 16min

Academia vs. Industry for Machine Learning, Research at Uber AI Labs, ML for Wind Farms - Jason Yosinski - The Data Scientist Show #070

Jason Yosinski, founding member of Uber AI Labs and co-founder of Windscape AI, discusses academia vs. industry in machine learning, the challenges of understanding neural networks, ML for wind farms, and the significance of metrics in evaluating models. They also explore hobbies, personal development retreats, and the power and pitfalls of patterns in behavior.
Sep 14, 2023 • 1h 26min

Ads forecasting at Netflix and Spotify, how to build your personal moat - Jeff Li - The Data Scientist Show #069

Jeff Li, a senior data scientist at Netflix and former data science manager at Spotify, discusses ads forecasting, career paths as a manager vs. IC, and the cultural differences at Spotify, Netflix, and DoorDash. They also talk about the challenges of forecasting in finance and ads, detecting and accounting for seasonality and black swan events in advertising, and his transition from manager to senior data scientist. Other topics include changes in tech stacks and data visualization tools, the future of forecasting, the importance of mentors in career growth, and the role of communication skills for data scientists.
Aug 25, 2023 • 1h 14min

A/B testing at Airbnb, building next-gen experimentation platform at Eppo - Che Sharma - The Data Scientist Show #068

Che Sharma, former data scientist at Airbnb and founder of Eppo, talks about A/B testing best practices, A/B testing for ML models, and his career journey. They discuss successful A/B testing, interpreting and communicating test results, A/B testing best practices for ML models, centralizing experiment analysis, preparing data scientists for the future, developing communication skills, transitioning to a manager role, and the future of experimentation.
Aug 10, 2023 • 1h 55min

From data scientist@Meta to full-time YouTuber (500k+ sub), AI engineering, future of work - Tina Huang - The Data Scientist Show #067

We talked about self-learning, productivity, how Tina navigated her career change, and how she thinks AI could change the future of work.

Tina's YouTube: www.youtube.com/@TinaHuang1
Lonely Octopus: www.lonelyoctopus.com

Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.

Tina Huang is a data scientist turned YouTube creator with 500k subscribers. She is the founder of Lonely Octopus, an online program helping people gain data science, AI, and freelancing skills. She originally studied pharmacology before transitioning into tech, completing a master's degree in computer science at UPenn.

(00:02:38) Transitioning from Data Science to Content Creation
(00:06:29) Preparing for Data Science Interviews
(00:10:59) Starting a YouTube Channel
(00:14:18) Building Multiple Income Streams
(00:17:35) Getting Started with AI Skills
(00:29:29) Advice for Starting YouTube
(00:34:47) Improving Storytelling Skills
(00:36:58) Overcoming Procrastination
(00:42:33) The Future of Work
(01:26:49) Income Breakdown
(01:47:08) Looking to the Future
Aug 1, 2023 • 1h 27min

Making LLMs hallucinate less, how to diagnose ML models, from PM in Google AI to CEO of Galileo - Vikram Chatterji - The Data Scientist Show #066

Vikram is the co-founder of Galileo, an AI diagnostics and explainability platform used by data science teams building NLP, LLM, and computer vision models across the Fortune 500 and high-growth startups. Prior to Galileo, Vikram led Product Management at Google AI, where his team built models for the Fortune 2000 across retail, financial services, healthcare, and contact centers. He has a master's degree from the School of Computer Science at Carnegie Mellon University.

If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.

Resources:
LLM Studio: https://www.rungalileo.io/blog/announcing-llm-studio
Galileo: https://www.rungalileo.io/
Blog on LLM Hallucination: https://thesequence.substack.com/p/guest-post-stop-hallucinations-from
Vikram Chatterji’s LinkedIn: https://www.linkedin.com/in/vikram-chatterji/
"The Mom Test": https://www.amazon.com/The-Mom-Test-Rob-Fitzpatrick-audiobook/dp/B07RJZKZ7F
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu

(00:00:00) Introduction
(00:04:24) How he got into machine learning
(00:06:53) Diagnosing large language models
(00:09:56) Addressing model hallucination
(00:12:46) Metrics for measuring hallucination
(00:17:30) From Google AI to starting Galileo
(00:24:08) Developing LLMs and putting them into production
(00:32:51) Galileo's diagnostics and explainability platform
(00:43:16) Advice for data scientists when joining a startup
Jul 28, 2023 • 1h 53min

Data Science "Mixed Martial Arts", applied reinforcement learning, scaling AI workloads using Ray - Max Pumperla - The Data Scientist Show #065

Max Pumperla designed his own career path in data science. He is a freelance software engineer at Anyscale and also a data science professor. We talked about reinforcement learning, open source contributions, Ray for data scientists, and his view on the data scientist's role.

If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.

Max’s LinkedIn: https://www.linkedin.com/in/max-pumperla-a8099354/
Max's GitHub: https://github.com/maxpumperla
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu

(00:00:00) Introduction
(00:09:19) How he got a remote job through Twitter
(00:14:06) Introduction to Ray
(00:18:52) Reinforcement learning
(00:23:56) Key lessons on integrating customer feedback
(00:35:12) Flaws in data science job titles
(00:45:51) How to be irreplaceable as a data scientist
(00:48:55) An unconventional career path as a data scientist
(01:12:24) Productivity and work-life balance
(01:28:10) Advice for building a personal brand
Jul 4, 2023 • 1h 50min

Uber's ML Systems (Uber Eats, Customer Support), Declarative Machine Learning - Piero Molino - The Data Scientist Show #064

Piero Molino was one of the founding members of Uber AI Labs. He worked on several deployed ML systems, including an NLP model for Customer Support and the Uber Eats Recommender System. He is the author of Ludwig, an open source declarative deep learning framework. In 2021 he co-founded Predibase, the low-code declarative machine learning platform built on top of Ludwig.

Piero's LinkedIn: https://www.linkedin.com/in/pieromolino
Predibase free access: bit.ly/3PCeqqw
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu

(00:00:00) Introduction
(00:01:54) Journey to machine learning
(00:03:51) Recommender system at Uber Eats
(00:04:13) Projects at Uber AI
(00:09:34) Uber's customer obsession ticket system
(00:16:01) How to evaluate online-offline business and model performance metrics
(00:17:16) Customer Satisfaction
(00:28:38) When do you know whether a project is good enough
(00:41:50) Declarative machine learning and Ludwig
(00:45:32) Ludwig vs AutoML
(00:54:44) Working with Professor Chris Ré
(00:58:32) Why he started Predibase
(01:07:56) LLM and GenAI
(01:10:17) Challenges for LLMs
(01:22:36) Advice for data scientists
(01:34:29) Career advice to his younger self
Jun 26, 2023 • 47min

Data science in transportation, the intersection of operations research and ML - Holger Teichgraeber - The Data Scientist Show #063

Holger Teichgraeber is a Data Science Manager at Archer Aviation. Previously, he worked at Convoy as a Research Scientist on their trucking marketplace, and at various companies in the energy space. Holger has a Bachelor's degree in Mechanical Engineering from Aachen, Germany, and a Master's and Ph.D. from Stanford University, with a research focus on machine learning and optimization applied to energy systems. He regularly writes on LinkedIn, with the goal of showing how to build valuable products at the intersection of machine learning and optimization in production.

If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.

Holger's LinkedIn: https://www.linkedin.com/in/holgerteichgraeber/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu

(00:00:00) Introduction
(00:01:28) How he got into operations research
(00:02:39) Operations research vs data science
(00:04:37) Trucking optimization at Convoy
(00:08:42) Optimization problem
(00:10:18) Strategic planning on air mobility at Archer
(00:13:50) Using simulation and solving a problem
(00:16:45) Big data science work vs smaller data science work
(00:21:23) Stakeholder management
(00:29:28) IC vs Manager
(00:32:04) Advice on promotion
(00:39:12) Work cultures in Germany and the US
(00:41:16) How to handle tight deadlines
(00:43:21) Important feedback from his work
(00:44:14) How to plan projects
(00:44:45) Next big challenge for data science teams
(00:45:40) Career growth in the next few years
(00:46:01) Connect with Holger
