
The Data Scientist Show

Latest episodes

22 snips
Feb 20, 2023 • 1h 43min

The 100-hour work week of a self-taught machine learning researcher, how he got into Google Brain, why he started Omni - Jeremy Nixon - The Data Scientist Show #060

Jeremy Nixon is a machine learning researcher, software engineer, and startup founder. Previously he was a software engineer at Google Brain working on deep learning. Now, he is the co-founder and CEO of Omni, building an immersive information retrieval system for you and your team. He studied applied math at Harvard University. Today we’ll talk about how he got into Google Brain, his 3-month self-learning plan to learn machine learning, his startup, and how he has relentlessly pursued his goal since 2016. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science. Jeremy's Twitter: https://twitter.com/JvNixon Jeremy's Blog: https://jeremynixon.github.io/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu Jeremy's LinkedIn: https://www.linkedin.com/in/jeremyvnixon (00:00:00) Introduction (00:01:50) Research in Google Brain (00:03:37) How he got into Google Brain (00:07:56) His 3-month plan to learn ML (00:17:55) The 100-hour workweek (00:33:26) What if he is tired (00:39:59) Why he founded Omni (00:44:24) Data science problems in Omni (00:54:42) Future of machine learning (00:57:51) Silicon Valley is very accessible (00:59:47) The golden handcuffs (01:06:58) From data scientist to full-stack engineer (01:09:06) Close-minded data scientists (01:24:10) Advice to ML learners (01:29:41) Something he wished that he did when he was younger (01:37:25) The future of his career (01:42:17) Connect with Jeremy
Jan 24, 2023 • 1h 20min

The power of error analysis, tree models for search relevancy, what ChatGPT means for data scientists - Sergey Feldman - The Data Scientist Show #059

Sergey Feldman is the head of AI at Alongside, providing mental health support for students. He is also a Lead Applied Research Scientist at the Allen Institute for AI, where he built an ML model that improved search relevancy for scientific literature. Sergey has a PhD in Electrical and Electronics Engineering from the University of Washington. Today we’ll talk about machine learning for search, his consulting project for the Gates Foundation, AI for mental health, and career lessons. Make sure you listen till the end. If you like the show, subscribe, leave a comment, and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Sergey's LinkedIn: https://www.linkedin.com/in/sergey-feldman-6b45074b/ Data Cowboys: http://www.data-cowboys.com/ Sergey Feldman: You Should Probably Be Doing Nested Cross-Validation | PyData Miami 2019: https://www.youtube.com/watch?v=DuDtXtKNpZs December 4th, 2018 - Breakfast with WACh with Dr. Sergey Feldman, PhD: https://www.youtube.com/watch?v=vA_czRcCpvQ (00:00:00) Introduction (00:01:24) Machine learning skeptic (00:03:02) Tree-based models for search relevance (00:14:34) How to do error analysis (00:19:20) Nested cross-validation (00:21:34) Model evaluation (00:30:43) Error analysis common mistakes (00:33:37) How to avoid overfitting (00:35:56) Consulting project with Gates Foundation (00:41:16) Tree-based models vs linear models (00:45:19) Working with non-tech stakeholders (00:50:20) Chatbot for teens’ mental health (00:54:32) Can ChatGPT provide therapy? (00:58:12) How he got into machine learning (01:02:12) How to not have a boss (01:03:46) Feelings vs Facts (01:09:02) Future of machine learning (01:11:30) How to prepare for the future (01:13:39) AutoML (01:17:12) His passion for large language models
6 snips
Dec 7, 2022 • 1h 9min

How to build data science muscle memory, Deepchecks -- an open-source ML testing suite - Philip Tannor - The Data Scientist Show #058

Philip Tannor is the Co-Founder and CEO of Deepchecks, a Python package to run checks for machine learning models. Previously, he was the head of the data science group at the Israel Defense Forces. He has a master's degree in engineering from Tel Aviv University; his thesis was about a new algorithm that combines neural networks with gradient-boosting decision trees. Today we’ll talk about his career journey, how to build your data science muscle memory, the algorithm he worked on, and how to check ML models. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Philip’s LinkedIn: https://www.linkedin.com/in/philip-tannor-a6a910b7/?originalSubdomain=il Augboost: https://medium.com/@ptannor/augboost-like-xgboost-but-with-few-twists-e4df4017a5c4 (00:00:00) Introduction (00:01:17) How did he get into ML (00:02:52) Data science in the military (00:08:15) How to take feedback (00:13:24) Handling criticism (00:15:12) What he worked on (00:18:18) Testing deployment (00:21:28) How to build the data science muscle memory (00:27:09) Improving the skills of data scientists (00:30:42) His thesis in grad school (00:36:59) Combine NN and gradient boosting (00:40:05) AugBoost (00:41:15) Tools he uses (00:45:58) Deepchecks (00:50:46) Most challenging part of building Deepchecks (00:52:05) How can people contribute (00:53:40) Behind the scenes (00:56:09) Deciding how to fix or improve the model (01:00:49) Advice for those who want to create open-source projects (01:04:07) Features to add for the enterprise product (01:06:57) About his life and career right now (01:08:27) Connect with Philip
Nov 24, 2022 • 1h 15min

The Daliana Special: how I got into data science, 5 things only experienced data scientists know, and why I started "The Data Scientist Show" - Daliana Liu #057

Who is Daliana? This is a conversation I had in 2021 with Harpreet Sahota. I talked about my unexpected journey to data science all the way back in high school, things I wish I had known earlier about my career, the projects I worked on, what it is like to be a quote-unquote influencer on LinkedIn, and more. If you want more content from me, I write about data science, career, and nerdy jokes on my LinkedIn, and you can subscribe to my very infrequent newsletter at dalianaliu.com. I’m curious what you think about this episode; leave a comment on YouTube or send a DM on LinkedIn. Hope you enjoy the Daliana special! Daliana's Newsletter: https://dalianaliu.com Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Harpreet's LinkedIn: https://www.linkedin.com/in/harpreetsahota204/ The Artists of Data Science podcast: https://theartistsofdatascience.fireside.fm/ (00:00:00) Introduction (00:02:52) Where did Daliana grow up (00:05:19) Daliana in high school (00:07:11) How did she get into data science (00:11:36) Why is writing important for data scientists (00:15:51) How to write better (00:20:56) Career lessons you didn't learn in school (00:27:40) Imposter syndrome (00:31:29) Day-to-day work as a data scientist (00:36:16) Most common mistakes data scientists make (00:39:41) Data Analyst vs. Data Scientist (00:42:30) What is the science in data science? (00:44:51) Can everyone be a data scientist (00:49:21) LinkedIn profile tips for job search (00:52:59) How she creates content (00:54:11) Being a data scientist "influencer" (00:56:04) Why she started "The Data Scientist Show" (01:01:16) Women in data science (01:06:39) What's her legacy (01:09:43) What is she reading (01:14:21) Connect with Daliana
5 snips
Nov 8, 2022 • 1h 8min

How he carved his own path at Airbnb, from data engineer to CEO of Mage - Tommy Dang - The Data Scientist Show #056

Tommy Dang is the Co-founder and CEO of Mage, a data ingestion and transformation pipeline for data engineers (https://github.com/mage-ai/mage-ai). Previously, he worked on data engineering and machine learning engineering at Airbnb. He has a Bachelor of Science degree from UC Berkeley, where he studied economics, history, and sociology. Today we’ll talk about how he learned engineering and machine learning after college, the data tools and ML tools he built at Airbnb, performance reviews, and how he navigates his career. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Tommy’s LinkedIn: https://www.linkedin.com/in/dangtommy/ Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (00:00:00) Introduction (00:01:28) Get into computer science from non-tech background (00:03:08) How he started his first project (00:04:07) Projects at Airbnb (00:06:09) Speed vs Quality when building data pipelines (00:16:34) How to deal with ad hoc requests (00:21:00) How did he learn machine learning (00:24:04) How he convinced data scientists to teach him ML (00:25:15) Performance review (00:27:11) Don’t let your job title limit your career (00:28:29) Why he started his company (00:31:38) Build your own tool vs use open source solutions (00:33:12) Transitioning from an engineer to a CEO (00:34:50) Earn trust from internal stakeholders (00:36:27) Career advice (00:41:31) How he carved his own path at Airbnb (00:46:00) How did he learn to be a good engineer (00:47:10) Best advice for data scientists or engineers (00:48:41) Most important quality of data scientists or engineers (00:51:51) Design principles (00:58:51) Future of tools (01:01:00) What does he think about his future career (01:05:05) Inspiration of Tommy
9 snips
Oct 24, 2022 • 1h 24min

How to effectively test and debug machine learning models, from ML engineer@Apple to startup founder - Gabriel Bayomi - The Data Scientist Show #055

Gabriel Bayomi is the Co-Founder at OpenLayer, a tool that tests & debugs machine learning models. OpenLayer was part of Y Combinator’s 2021 batch, building tools for machine learning model testing. Previously he was a machine learning engineer at Apple working on Siri. He has a master's degree in computer science from Carnegie Mellon. He is passionate about Natural Language Processing, Machine Learning, and Computational Social Science. We talked about how to test and debug machine learning models, his experience at Apple, and career lessons. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Gabriel’s LinkedIn: https://www.linkedin.com/in/gbayomi Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (0:00) Intro (01:01:39) How he got into machine learning (01:06:43) His experience at Apple, Siri (01:15:55) How to validate the solution (01:19:39) Benefits of using external error analysis framework (01:21:30) How to build a model evaluation pipeline (01:28:26) Don’t overfit the subset of data (01:33:19) Your validation set shouldn’t be fixed (01:41:03) Become one with data (01:44:05) Three model interpretability libraries you should use (01:50:47) Common mistakes people make in model validation (01:53:33) How to create an adversarial test (01:55:43) How to check data quality (01:06:46) Transition from engineer to executive (01:10:04) Things he learnt from his favorite coworker (01:17:57) How job roles would evolve
10 snips
Oct 19, 2022 • 2h 12min

From Amazon research scientist to head of data product at Vestiaire Collective, why data science projects fail, how to be a good communicator - Alisa Kim - The Data Scientist Show #054

Alisa Kim is the head of data product at Vestiaire Collective. Previously, she was a research scientist at Amazon Web Services. We used to work on the same team in the Machine Learning Solutions Lab at Amazon Web Services, and we have collaborated on projects before. Earlier in her career, she was a consultant and worked in analytics and investment banking. She has a Ph.D. in Econ AI and has worked in various industries and on multiple continents. She's someone I really enjoyed working with. We talked about her journey, the projects she worked on, and the lessons she learnt. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Alisa's LinkedIn: https://de.linkedin.com/in/alisakolesnikova Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (0:00) Intro (00:01:38) how she got into data science (00:04:38) day-to-day at AWS ML Solutions Lab (00:08:00) AWS leadership principles (00:16:34) challenges the consultant faces when working with external customers (00:23:36) from AWS to Vestiaire Collective (00:37:54) how to build a better data product (00:44:17) how data scientists can align with business stakeholders (00:57:52) from tech to business (01:01:33) how to develop communication skills (01:09:17) increase visibility of the data science team (01:17:22) being proactive vs being passive in chasing opportunities (01:24:06) get feedback from your "nearest neighbors" (01:25:37) how to set boundaries at work (01:38:48) mistakes she made in her career (01:48:25) how to manage disagreement (01:57:53) future of data science
Oct 15, 2022 • 1h 33min

The lessons from almost losing a million dollars for his company, how to build good data assets and get buy-in from the leadership - Mark Freeman - The Data Scientist Show #053

Mark Freeman is a community health advocate turned data scientist. His mission is to improve the well-being of people, especially among those marginalized. He is currently a senior data scientist at Humu, where he builds data tools that drive behavior change to make work better. He has a master's degree from the Stanford School of Medicine in clinical research, experimental design, and statistics. He also has a certificate in entrepreneurship from the Stanford Graduate School of Business. In his free time, he volunteers with a Bay Area Community Health Advisory Council. He also plays Men's Division III Rugby. We talked about building data tools, data engineering skills for data scientists, how to pitch a project, and his career journey. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Mark's LinkedIn: https://www.linkedin.com/in/mafreeman2/ Chapters: (0:00) Intro (00:03:05) Our experience using R - 1000 lines of code (00:09:22) Entrepreneurship within a company (00:16:25) DBT and modern data stack (00:20:15) Tools don’t matter (in interviews) (00:21:09) Things DE enjoys but DS doesn’t (00:24:55) How to work with different stakeholders (00:30:32) Common SQL mistakes (00:33:34) SQL vs Python vs R (00:35:26) T.R.I.B.E framework for projects (00:40:43) Meet the stakeholders where they are (00:42:40) Use feedback to get buy-in from collaborators (00:46:36) How to pitch a new idea (00:49:45) Don’t lead with the solution, lead with the problem (00:51:03) How to get buy-in from the leadership (00:57:56) Present an idea as if the audience came up with it (00:58:41) How to iterate a project (01:00:27) How he almost lost 1 million dollars for his company (01:02:07) Things he learned from his manager (01:04:19) Things that help people make changes effectively (01:06:05) Things he learned from mentoring (01:12:19) Mental health and anxiety (01:17:12) Web3 (01:20:14) Why he cares about community health (01:25:40) "Soul-searching" on his future (01:28:36) Why he writes on LinkedIn (01:30:04) Future of data science
Oct 4, 2022 • 1h 31min

From deep learning architect at AWS to PM in AI product - Abhi Sharma - The Data Scientist Show #052

Abhi Sharma started his career as a software engineer at Amazon Lab126, building cloud services for Alexa. Later he transferred to Amazon Web Services as a deep learning architect. We used to work on the same team at the Machine Learning Solutions Lab at AWS. Currently, he is a product manager, responsible for machine learning products such as the chatbot at Chime. We talked about how he transitioned his career from software engineer to deep learning architect and then to product manager, cool projects he worked on, and our shared experiences at Amazon. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Abhi's LinkedIn: https://www.linkedin.com/in/abhivs/ Highlights: (0:00) Intro (00:01:48) from SWE to deep learning architect to product manager (00:12:44) day-to-day as a product manager at Chime (00:19:46) how he collaborates with different data personas (00:27:21) how to negotiate for more time for projects with leaders (00:33:59) some timelines are negotiable (00:38:00) most impactful project he worked on (00:44:22) how to evaluate KPI, and not game the system (00:48:02) think about development in the beginning (00:50:29) data scientists need to educate the business and demystify the buzz words (00:54:19) Amazon’s Think Big Challenge (00:57:09) Never solve the problem twice (01:00:25) How to transition to a product manager (01:07:48) why he wanted to become a PM (01:25:35) How can data scientists learn from PMs
Sep 27, 2022 • 2h 5min

What data scientists need to know about MLOps principles, from GPA 2.6 to Sr. MLOps Engineer@Intuit - Mikiko Bazeley - The Data Scientist Show #051

Mikiko Bazeley is a senior software engineer working on MLOps at Intuit. Previously, she worked as a growth hacker and a data analyst in finance, then became a data scientist, and later transitioned into machine learning. She has a bachelor's degree in economics and biological anthropology, and completed a data science bootcamp at Springboard. She is a tech writer for NVIDIA and she’s working on a course on MLOps. Her goal is to demystify MLOps & show how to develop high-quality ML products from scratch. You can find her content on LinkedIn and YouTube. Today, we’ll talk about useful engineering principles for data scientists, MLOps, and her career journey. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Mikiko's LinkedIn: https://www.linkedin.com/in/mikikobazeley/ Highlights: (0:00) Intro (00:02:00) from GPA 2.6 to data scientist (00:05:27) her experience at Mailchimp (00:11:44) her frustrations on Cookiecutter project (00:14:09) the pain point of a data scientist working with engineering (00:21:01) 2 MLOps patterns (00:25:52) challenges about her work (00:29:49) the basic engineering skills a data scientist should have (00:32:46) the tests a data scientist should write (00:37:42) how an MLOps engineer collaborates with a data scientist (00:45:28) what makes a good MLOps engineer (00:52:33) AWS vs GCP vs Azure (00:58:59) how a data scientist collaborates with an MLOps engineer (01:05:19) suggestions for building a model on a large scale (01:09:11) how she learnt MLOps on her own within 6 months (01:17:32) learn from code review (01:19:17) MLOps books and resources she recommended (01:24:13) mistakes she made earlier in her career (01:31:29) common mistakes people make during career change (01:38:22) "Start with the end in mind" (01:41:16) the future of MLOps (01:46:23) how she sees her career growth (01:56:40) how she continues learning new skills (02:00:09) what she is excited about her career and life
