A/B testing at Airbnb, building next-gen experimentation platform at Eppo - Che Sharma - The Data Scientist Show #068
Aug 25, 2023
Che Sharma, former data scientist at Airbnb and founder of Eppo, talks with host Daliana Liu about A/B testing best practices, A/B testing for ML models, and his career journey. Topics include running successful A/B tests, interpreting and communicating test results, centralizing experiment analysis, preparing data scientists for the future, developing communication skills, transitioning to a manager role, and the future of experimentation.
Sufficient data collection is crucial in experimentation to avoid inconclusive results and inaccurate institutional knowledge.
Experimentation can transform a company's culture and decision-making process, enabling data-driven decision-making and fostering innovation.
Building a trustworthy experimentation infrastructure is essential, emphasizing collaboration, clear metrics definitions, automation, and preserving confidence in results.
Deep dives
The Importance of Sufficient Data in Experimentation
One of the main ideas discussed in the podcast is the need for sufficient data in experimentation. The speaker highlights a past experiment at Airbnb that was terminated before enough data was collected, leading to inconclusive results. The example shows why experiments must run long enough to detect a meaningful effect. Left unaddressed, this failure mode can create inaccurate institutional knowledge and hinder decision-making.
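As a rough illustration of what "enough data" means, the sketch below estimates the users needed per arm with the standard two-proportion z-test approximation, then converts that into a run length. The function names, the 10% to 11% conversion rates, and the 2,000 daily users are illustrative assumptions, not figures from the episode.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, treatment_rate, alpha=0.05, power=0.8):
    """Approximate users per arm for a two-sided two-proportion z-test."""
    if baseline_rate == treatment_rate:
        raise ValueError("rates must differ to define a detectable effect")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = (baseline_rate * (1 - baseline_rate)
                + treatment_rate * (1 - treatment_rate))
    effect = treatment_rate - baseline_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def days_to_run(n_per_arm, daily_users, arms=2):
    """Days needed if daily_users are split evenly across all arms."""
    return math.ceil(n_per_arm * arms / daily_users)


# Hypothetical inputs: detect a lift from 10% to 11% with 2,000 users per day.
n = sample_size_per_arm(0.10, 0.11)
print(f"users per arm: {n}, days to run: {days_to_run(n, daily_users=2000)}")
```

Halving the detectable lift roughly quadruples the required sample, which is why stopping early on a small effect tends to produce an inconclusive read.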
The Transformative Power of Experimentation at Airbnb
The podcast episode also explores the transformative impact of experimentation at Airbnb. As one of the company's first data scientists, the speaker describes how experimentation shifted the company's culture and decision-making process. It enabled data-driven decision-making and empowered teams to try out new ideas without significant political battles. Experimentation became a key factor in the company's growth and success.
Challenges and Best Practices in Experimentation
The podcast delves into the challenges and best practices in experimentation. Running experiments well is difficult: it requires statistical power, robust randomization, and careful execution. The speaker recommends setting expectations and starting with small, lightweight experiments before pursuing larger and more visible projects. The episode also covers the value of experimentation for machine learning models and the need to use business metrics to justify further investment.
Building Trustworthy Experimentation Infrastructure
The episode discusses the importance of building trustworthy experimentation infrastructure. The speaker highlights the need for infrastructure that supports collaboration between data scientists, engineers, and product managers. This includes providing clear metrics definitions, automating common checks for data quality and randomization, and offering reliable statistical power analysis. The podcast also emphasizes the role of preserving confidence in experimentation by detecting anomalies in data pipelines and outliers in results. Overall, the episode underscores the importance of a collaborative and robust infrastructure in driving successful experimentation.
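One common automated randomization check is a sample ratio mismatch (SRM) test, which compares observed group sizes against the intended split. The episode summary does not name a specific check, so the sketch below is only an example of the kind of check such infrastructure might run. The function names and the 0.001 threshold are assumptions.

```python
import math


def srm_p_value(control_count, treatment_count, expected_control_share=0.5):
    """P-value of a chi-square goodness-of-fit test (1 degree of freedom)."""
    total = control_count + treatment_count
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control_count - expected_control) ** 2 / expected_control
            + (treatment_count - expected_treatment) ** 2 / expected_treatment)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def has_sample_ratio_mismatch(control_count, treatment_count, threshold=0.001):
    """Flag the experiment when the observed split is very unlikely by chance."""
    return srm_p_value(control_count, treatment_count) < threshold


# Hypothetical counts for an experiment intended to split traffic 50/50.
print(has_sample_ratio_mismatch(5000, 5050))
print(has_sample_ratio_mismatch(5000, 5500))
```

A flagged mismatch usually points to a bug in assignment or logging, so the results should not be trusted until the cause is found.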
The Challenges of Holdouts in Experimentation
Holdouts are a valuable tool for understanding the total additive effect of multiple experiments and assessing a team's overall impact. However, implementing holdouts can be challenging for many companies. A holdout typically uses a small percentage of traffic, so the resulting data volume may be too small for meaningful analysis. Holdouts are also less effective when cookies are cleared, because they rely on user-level metrics and identity resolution. Furthermore, maintaining a separate version of the application for a long period behind multiple feature flags is difficult and can leave that version with bugs. Holdouts are encouraged for those with the resources, but alternative approaches can achieve similar goals.
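To make the data-volume concern concrete, the sketch below approximates the smallest absolute difference in a conversion rate that a holdout comparison could detect, given the holdout's share of traffic. It uses the usual normal approximation for two proportions. The function name and all input numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist


def minimum_detectable_effect(total_users, holdout_share, baseline_rate,
                              alpha=0.05, power=0.8):
    """Approximate absolute MDE when comparing a holdout to everyone else."""
    if not 0 < holdout_share < 1:
        raise ValueError("holdout_share must be strictly between 0 and 1")
    n_holdout = total_users * holdout_share
    n_exposed = total_users * (1 - holdout_share)
    # Standard error of the difference between two proportions.
    se = math.sqrt(baseline_rate * (1 - baseline_rate)
                   * (1 / n_holdout + 1 / n_exposed))
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z * se


# Hypothetical: 1,000,000 users and a 10% baseline conversion rate.
for share in (0.01, 0.05, 0.50):
    mde = minimum_detectable_effect(1_000_000, share, 0.10)
    print(f"holdout share {share:.0%}: MDE about {mde:.4f}")
```

The smaller group dominates the standard error, so shrinking the holdout quickly widens the range of effects that cannot be distinguished from noise.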
A/B Testing versus Multi-Armed Bandits for Short-Term Effects
A/B testing is commonly used to explore and exploit variations for both short- and long-term effects. Multi-armed bandits (MABs), by contrast, automate more of the decision by shifting traffic to the winning variation early on. This allows for faster rewards and is particularly useful in time-sensitive situations with limited benefit windows. While MABs can be valuable in certain circumstances, A/B testing is generally recommended for most scenarios where rewards can be continuously reaped from the winning variation. It is important to carefully evaluate the trade-offs and align the chosen approach with the specific needs and context of the experiment.
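The traffic-shifting behavior can be sketched with Thompson sampling, one common bandit algorithm. The episode summary does not say which algorithm was discussed, so this is an assumption. The simulation below uses made-up conversion rates and a Beta(1, 1) prior for each arm.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate a Beta-Bernoulli bandit and return how often each arm was shown."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw a plausible conversion rate for each arm from its posterior.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)
        pulls[arm] += 1
        # Simulate whether this visitor converts under the chosen arm.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Hypothetical true conversion rates of 5% and 15% for two variations.
print(thompson_sampling([0.05, 0.15], rounds=5000))
```

An A/B test would hold the split fixed for the whole run, whereas the bandit sends most traffic to the stronger arm as evidence accumulates. That trade gives up clean, fixed-sample inference in exchange for earlier reward.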
Che Sharma was the 4th data scientist at Airbnb and later joined Webflow as an early employee. In 2021 he founded Eppo, a next-gen A/B experimentation platform designed for modern data and product teams to run more trustworthy and advanced experiments. We talked about A/B testing best practices, A/B testing for ML models, and Che’s career journey. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Che’s LinkedIn: https://www.linkedin.com/in/chetanvsharma/
Try Eppo for A/B testing: https://www.geteppo.com/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:01:26) Getting started in data science at Airbnb
(00:03:08) Keys to successful A/B testing
(00:06:53) Interpreting and communicating A/B test results
(00:15:00) A/B testing best practices for machine learning models
(00:41:39) Centralizing experiment analysis
(00:53:46) Preparing data scientists for the future
(00:59:33) Developing communication skills as a data scientist
(01:08:43) Transitioning from individual contributor to manager
(01:12:28) The future of experimentation