
The Data Scientist Show - Daliana Liu
A deep dive into data scientists' day-to-day work, the tools and models they use, how they tackle problems, and their career journeys. This podcast helps you grow a successful career in data science. Listening to an episode is like having lunch with an experienced mentor. Guests are data science practitioners from various industries, AI researchers, economists, and CTOs of AI companies. Host: Daliana Liu, an ex-Amazon senior data scientist with 180k followers on LinkedIn.
Join 20k subscribers at www.dalianaliu.com to learn more about data science, career, and this show. Twitter @DalianaLiu.
Latest episodes

Jun 29, 2022 • 1h 50min
Applied machine learning research methods, human-machine team, AI strategies, trends in machine learning, how to earn trust - Vin Vashishta - The data scientist show #042
Vin Vashishta is a chief data officer and AI strategist at V Squared, a company he founded in 2012 that provides AI strategy, transformation, and data organizational build-out services.
He teaches data professionals about strategy, communications, business acumen, and applied machine learning research methods. Vin has 130k+ followers on LinkedIn, where he talks about AI, analytics, and strategy. His website: https://www.datascience.vin/ If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Highlights:
(00:00:00) Intro
(00:03:37) "ML strategy" with 'pricing' as an example
(00:09:45) what is a good metric for ML
(00:13:16) how to translate a business problem into a data problem
(00:23:42) leverage users in the "Human Machine Teaming"
(00:48:22) how he earned the trust
(01:17:31) data science evolution from 2012 to 2022
(01:31:06) how he learns new domain knowledge
(01:36:25) the mistakes he made
(01:42:15) what he learnt from his mentor

Jun 23, 2022 • 1h 31min
Retail store forecasting with video and audio, ML in high frequency trading, from tech to politics, ML in Web3 - Greg Tanaka, the data scientist show #041
Greg Tanaka is a computer scientist turned CEO of an AI company. He started coding when he was 6, studied computer science at UC Berkeley, and has built many machine learning applications. He is the founder and CEO of Percolata, which develops "Forecast as a Service". He is also a council member of Palo Alto in California, and just finished his campaign for Congress. Today we’ll talk about his career journey, forecasting, machine learning in blockchain, and political campaigns. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Greg's LinkedIn: https://www.linkedin.com/in/gltanaka/, Twitter: https://twitter.com/GregTanaka
Greg's DAO: https://www.gregtanaka.org/dao
Highlights:
(00:02:10) using computer vision, audio, and Wi-Fi fingerprints to forecast retail store traffic
(00:21:55) why time series forecasting is hard
(00:26:39) how he made the forecasting more stable
(00:28:46) how he troubleshot the spikes and drops in data
(00:36:04) human trading vs algorithmic trading
(00:47:36) his vision of machine learning in blockchain
(00:54:57) why he got into politics
(01:05:57) advice for people who are interested in Web3
(01:11:04) AutoML and the future of machine learning
(01:15:36) things he wished he could learn earlier

Jun 16, 2022 • 1h 58min
Weather forecasting with AI, Kaggle tips and tricks, dealing with missing data, deep learning with Jesper Dramsch, The Data Scientist Show #040
Jesper Dramsch is a scientist for machine learning at the European Centre for Medium-Range Weather Forecasts. They have a PhD in machine learning applied to geoscience from the Technical University of Denmark. They are a Kaggle Kernels Expert and TPU star, ranked in the top 81 of 100k worldwide. We talked about weather forecasting, things they learned from Kaggle, how to deal with missing data and outliers, deep learning, Keras vs PyTorch, XGBoost, their struggles as a PhD student, and working in the EU vs the US. Follow @DalianaLiu for more updates on data science and this show.
(00:01:27) how they got into ML
(00:09:10) how they handled missing data
(00:28:34) Transformers are eating the world
(00:49:36) Huber loss is a fantastic metric to deal with extreme values
(00:54:48) their experience with Kaggle competitions
(01:02:59) Kaggle tricks that helped their models perform better
(01:08:18) PyTorch vs Keras
(01:30:30) working in different countries and cultures
Resources shared by Jesper:
The newsletter issue on missing data:
https://buttondown.email/jesper/archive/towels-have-quite-a-dry-sense-of-humor/
The paper by Gael about missing data:
https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giac013/6568998
The Huber Loss:
https://en.wikipedia.org/wiki/Huber_loss
Skill Scores:
https://en.wikipedia.org/wiki/Forecast_skill
Brier Skill in Weather:
https://www.dwd.de/EN/ourservices/seasonals_forecasts/forecast_reliability.html
CRPS (Continuous Ranked Probability Score):
https://datascience.stackexchange.com/questions/63919/what-is-continuous-ranked-probability-score-crps
ConvNext, Convnets for the 2020s:
https://arxiv.org/abs/2201.03545
Transformers for ensemble forecasts:
https://arxiv.org/abs/2106.13924
Books I recommend:
https://www.amazon.com/shop/jesperdramsch/list/2DYS5KVR5TX0E
Blog posts I wrote about these books:
https://dramsch.net/tags/books/
Short I made about Test-Time Augmentation:
https://www.youtube.com/shorts/w4sAh9lKyls
Their links: https://dramsch.net/links
Their open PhD thesis: https://dramsch.net/phd
Newsletter: https://dramsch.net/newsletter
Twitter: https://dramsch.net/twitter
Youtube: https://dramsch.net/youtube
LinkedIn: https://dramsch.net/linkedin
Kaggle: https://dramsch.net/

Jun 8, 2022 • 1h 53min
Reinforcement learning common use cases, recommendation engine, productivity - Susan Shu Chang the data scientist show#039
Susan Shu Chang is a principal data scientist at Clearco, helping e-commerce founders by building machine learning-powered investing. In her previous role, she developed the company’s very first ML-powered website recommender system, deployed to millions of customers, and created a custom OpenAI Gym environment for a reinforcement learning project in production. She is also the founder and developer of Quill Game Studios, which sold ~10k copies of its debut game in 6 months. She has given talks at PyCon Canada, Toronto Machine Learning Summit (TMLS), and more. She writes about her career journey and learning on https://www.susanshu.com/ If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Highlights:
(00:00:00) Intro
(00:01:29) from economics to data science
(00:07:23) reinforcement learning (RL)
(00:20:00) recent reinforcement learning use cases
(00:27:28) reinforcement learning for social media's recommender system
(01:04:42) common mistakes when productionizing models
(01:08:30) principal data scientist's day-to-day
(01:14:05) what productivity really means
(01:21:04) productivity tips
(01:41:48) books and blogs on productivity

May 31, 2022 • 2h 2min
User-centric data science, design thinking, from UX researcher to data science manager@Visa - Laura Gabrysiak - the data scientist show #038
Laura Gabrysiak is a senior manager of data products and solutions at Visa. Previously, she was a data scientist, building machine learning models and decision tools to enable Visa clients. She has a college degree in computational linguistics and a master's in design thinking. She's building the local data science community in Miami, and is a co-founder of R-Ladies Miami. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Laura's LinkedIn: https://www.linkedin.com/in/lauragabrysiak/
(00:02:43) her journey into data science
(00:20:28) anecdotes vs big data
(00:27:05) the power of small data
(00:30:41) design thinking key elements
(00:47:25) mindset shift from a user researcher to a data scientist
(01:00:51) how to improve customer engagement
(01:02:10) how to make data visualization effective
(01:27:21) mindset shift from an individual contributor to a manager
(01:40:43) advice for people who are on a PIP

May 24, 2022 • 2h 10min
A/B testing and growth analytics at Airbnb, building data science tools and metrics store with Nick Handel, the data scientist show#037
Nick Handel was a senior data scientist at Airbnb, where he led the data side of the Airbnb Trips launch and later built a team that designed Airbnb’s end-to-end machine learning platform, Bighead. Currently, he is the cofounder and CEO of Transform, the first centralized 'metrics store' that empowers data analysts to deliver insights. He was named to the Forbes 30 Under 30 list in 2018. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Nick's LinkedIn: https://www.linkedin.com/in/nicholashandel/
Highlights:
(00:00:00) intro and career journey
(00:10:58) common mistakes in A/B testing
(00:25:48) how to do A/B testing deep dives
(00:27:32) surprising A/B testing results
(00:29:18) facts vs opinions
(00:33:55) A/B testing best practices
(00:55:01) how he built a new data schema for Airbnb Trips
(01:00:43) how to collect data when building data science tools
(01:38:53) trend of data science tools

May 17, 2022 • 1h 51min
Becoming a superforecaster, decision science for better human predictions - Pavel Atanasov-the data scientist show#036
Pavel is a decision scientist and co-founder at Pytho, using decision science to measure and improve human judgment and prediction. He has a PhD in psychology and decision science from the University of Pennsylvania, focusing on crowd predictions. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Pavel's Twitter: https://twitter.com/PavelDAtanasov
Superforecasting book, based on the Good Judgment Project: https://www.amazon.com/Superforecasting-Science-Prediction-Philip-Tetlock/dp/0804136718
Blogs about forecasting:
Vox's Future Perfect series: https://www.vox.com/future-perfect
Astral Codex Ten: https://astralcodexten.substack.com/
Highlights:
(00:01:10) how he got into decision science
(00:14:38) what makes someone a superforecaster
(00:16:20) three elements of becoming a superforecaster
(00:24:37) how to effectively update our opinions
(00:30:05) how he designed experiments to find out what was a better system
(00:48:27) why humans are sometimes better than algorithms
(01:14:50) how to collect data and information better
(01:33:25) why you should quit
(01:42:30) the future of decision science

May 10, 2022 • 1h 36min
Using AI to detect online abuse, from physics PhD to staff ML engineer@Linkedin, persuasion at work with James Verbus - the data scientist show #035
James Verbus is a Staff Machine Learning Engineer at LinkedIn. He has a PhD in Physics from Brown University. He is the tech lead of the Anti-Scraping and Automation AI Team, working on protecting LinkedIn's members from bots and abusive scripted behavior, and pioneering the use of deep learning to detect abusive automated sequences of user activity (blog post). If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
(00:01:14) from physics to data science
(00:16:37) background of online abuse detection
(00:24:40) Isolation Forest Algorithm
(00:42:59) his day-to-day as a staff ML Engineer
(00:52:57) how to persuade stakeholders
(00:58:17) how to build influence at work
(01:00:22) how he grew to staff engineer
(01:13:48) what he learned from his mentor

May 5, 2022 • 2h 46min
The golden age of AI and neuroscience, brain computer interface (BCI), from academia to FAANG with Patrick Mineault - The Data Scientist Show #034
Patrick Mineault is a neural data scientist. He has worked at Google and Facebook after he did a postdoc at UCLA. He worked on Brain Computer Interface (BCI) at Facebook Reality Labs, building a BCI that allows you to type with your brain. He tweets about neuro-AI @patrickmineault, and writes a blog (https://xcorr.net) sharing his career journey and learnings along the way. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
How he got into data science (00:02:41)
His work at Google on A/B testing (00:04:17)
How he joined Facebook Reality Labs (00:23:53)
Projects on neuro-AI and brain computer interface (BCI) (00:27:13)
Skills needed for BCI research (00:34:37)
How AI influences neuroscience (01:34:28)
Computer vision vs human vision (01:39:57)
Model vs data, nature vs nurture (01:45:32)

Apr 6, 2022 • 1h 25min
From biostatistician to the 'artist of data science', how he turned his life around, philosophy - Harpreet Sahota - The Data Scientist Show#033
Harpreet Sahota is a data scientist and ML developer advocate. He is also the host of “The Artists of Data Science” podcast and weekly data science happy hours, and the principal data science mentor at Data Science Dream Job. He is also a philosophy nerd. He had some struggles when he tried to get into data science, and today we’ll talk about his experience as a biostatistician and data scientist, the lessons he learned from his journey and from mentoring other people, and how he turned his life around. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Harpreet's LinkedIn: https://www.linkedin.com/in/harpreetsahota204/?originalSubdomain=ca
The Artists of Data Science podcast: https://theartistsofdatascience.fireside.fm/