'Trustworthy Online Controlled Experiments,' written by experimentation leaders from Google, LinkedIn, and Microsoft, provides practical insights and examples on designing, running, and interpreting online controlled experiments. It covers the entire process from hypothesis formulation to result interpretation, emphasizing trustworthiness, statistical significance, and the avoidance of common pitfalls such as carryover effects and Simpson's paradox. The authors draw on their extensive experience to help readers make decisions based on reliable data and to foster a culture of trustworthy experimentation in their organizations.
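As a toy illustration of the kind of statistical significance check the book covers, here is a minimal sketch (not from the book; the conversion counts below are made up) of a two-sided two-proportion z-test in Python comparing control and treatment conversion rates:

from scipy.stats import norm

# Hypothetical counts: conversions and users in control (A) and treatment (B)
conv_a, n_a = 1_200, 100_000
conv_b, n_b = 1_290, 100_000

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null of no difference
se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
z = (p_b - p_a) / se
p_value = 2 * norm.sf(abs(z))  # two-sided p-value

print(f"relative lift = {(p_b - p_a) / p_a:+.2%}, z = {z:.2f}, p = {p_value:.4f}")

A p-value below the conventional 0.05 threshold is usually read as statistically significant, though the book, and the episode's p-value segment (1:02:14), stress what a p-value does and does not mean.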
In 'Hard Facts, Dangerous Half-Truths, and Total Nonsense', Jeffrey Pfeffer and Robert I. Sutton challenge conventional management wisdom by highlighting the flaws in popular practices. They promote evidence-based management as a more effective approach, encouraging leaders to rely on empirical evidence rather than intuition or trends. The book debunks several myths, such as the idea that financial incentives are the primary motivators or that the best organizations always have the best people.
In 'Calling Bullshit,' Carl T. Bergstrom and Jevin D. West provide readers with tools to critically evaluate information, especially in the context of data-driven narratives. The book addresses how to identify and refute misinformation by understanding statistical fallacies, data visualization, and the distinction between correlation and causation. It emphasizes the importance of skepticism in a hyperpartisan media environment.
In 'Mistakes Were Made (But Not by Me),' Carol Tavris and Elliot Aronson delve into the psychological mechanisms behind self-justification, drawing on anecdotal, historical, and scientific evidence. They explain how cognitive dissonance leads people to create fictions that absolve them of responsibility, restoring their belief in their own morality and intelligence. The authors discuss examples ranging from political decisions and marital conflicts to medical errors to illustrate how self-justification can lead to harmful consequences. The updated edition includes new examples and an extended discussion of how to live with dissonance, learn from it, and perhaps forgive oneself.
Brought to you by Mixpanel—Event analytics that everyone can trust, use, and afford | Round—The private network built by tech leaders for tech leaders | Eppo—Run reliable, impactful experiments
—
Ronny Kohavi, PhD, is a consultant, teacher, and leading expert on the art and science of A/B testing. Previously, Ronny was Vice President and Technical Fellow at Airbnb, Technical Fellow and Corporate Vice President at Microsoft (where he led the Experimentation Platform team), and Director of Data Mining and Personalization at Amazon. He was also honored with a lifetime achievement award by the Experimentation Culture Awards in September 2020, and he teaches a popular course on experimentation on Maven. In today’s podcast, we discuss:
• How to foster a culture of experimentation
• How to avoid common pitfalls and misconceptions when running experiments
• His most surprising experiment results
• The critical role of trust in running successful experiments
• When not to A/B test something
• Best practices for helping your tests run faster
• The future of experimentation
—
Enroll in Ronny’s Maven class: Accelerating Innovation with A/B Testing at https://bit.ly/ABClassLenny. Promo code “LENNYAB” will give $500 off the class for the first 10 people to use it.
—
Find the full transcript at: https://www.lennysnewsletter.com/p/the-ultimate-guide-to-ab-testing
—
Where to find Ronny Kohavi:
• Twitter: https://twitter.com/ronnyk
• LinkedIn: https://www.linkedin.com/in/ronnyk/
• Website: http://ai.stanford.edu/~ronnyk/
—
Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• Twitter: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/
—
In this episode, we cover:
(00:00) Ronny’s background
(04:29) How one A/B test helped Bing increase revenue by 12%
(09:00) What data says about opening new tabs
(10:34) Small effort, huge gains vs. incremental improvements
(13:16) Typical fail rates
(15:28) UI resources
(16:53) Institutional learning and the importance of documentation and sharing results
(20:44) Testing incrementally and acting on high-risk, high-reward ideas
(22:38) A failed experiment at Bing on integration with social apps
(24:47) When not to A/B test something
(27:59) Overall evaluation criterion (OEC)
(32:41) Long-term experimentation vs. models
(36:29) The problem with redesigns
(39:31) How Ronny implemented testing at Microsoft
(42:54) The stats on redesigns
(45:38) Testing at Airbnb
(48:06) Covid’s impact and why testing is more important during times of upheaval
(50:06) Ronny’s book, Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing
(51:45) The importance of trust
(55:25) Sample ratio mismatch and other signs your experiment is flawed (see the SRM check sketch after the timestamps)
(1:00:44) Twyman’s law
(1:02:14) P-value
(1:06:27) Getting started running experiments
(1:07:43) How to shift the culture in an org to push for more testing
(1:10:18) Building platforms
(1:12:25) How to improve speed when running experiments
(1:14:09) Lightning round
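Related to the sample ratio mismatch discussion (55:25), here is a minimal sketch (not from the episode; the assignment counts are hypothetical) of the standard SRM check: a chi-square goodness-of-fit test of observed assignment counts against the configured 50/50 split:

from scipy.stats import chisquare

# Hypothetical user counts assigned to control and treatment in a 50/50 experiment
observed = [502_000, 498_000]
total = sum(observed)
expected = [total * 0.5, total * 0.5]  # counts implied by the configured split

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p_value:.6f}")
# A very small p-value (e.g., below 0.001) signals a sample ratio mismatch:
# treat the experiment's results as untrustworthy until the imbalance is explained.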
—
Referenced:
• Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing: https://experimentguide.com/
• Seven rules of thumb for website experimenters: https://exp-platform.com/rules-of-thumb/
• GoodUI: https://goodui.org
• Defaults for A/B testing: http://bit.ly/CH2022Kohavi
• Ronny’s LinkedIn post about A/B testing for startups: https://www.linkedin.com/posts/ronnyk_abtesting-experimentguide-statisticalpower-activity-6982142843297423360-Bc2U
• Sanchan Saxena on Lenny’s Podcast: https://www.lennyspodcast.com/sanchan-saxena-vp-of-product-at-coinbase-on-the-inside-story-of-how-airbnb-made-it-through-covid-what-he8217s-learned-from-brian-chesky-brian-armstrong-and-kevin-systrom-much-more/
• Optimizely: https://www.optimizely.com/
• Optimizely was statistically naive: https://analythical.com/blog/optimizely-got-me-fired
• SRM: https://www.linkedin.com/posts/ronnyk_seat-belt-wikipedia-activity-6917959519310401536-jV97
• SRM checker: http://bit.ly/srmCheck
• Twyman’s law: http://bit.ly/twymanLaw
• “What’s a p-value” question: http://bit.ly/ABTestingIntuitionBusters
• Fisher’s method: https://en.wikipedia.org/wiki/Fisher%27s_method
• Evolving experimentation: https://exp-platform.com/Documents/2017-05%20ICSE2017_EvolutionOfExP.pdf
• CUPED for variance reduction/increased sensitivity: http://bit.ly/expCUPED
• Ronny’s recommended books: https://bit.ly/BestBooksRonnyk
• Chernobyl on HBO: https://www.hbo.com/chernobyl
• Blink cameras: https://blinkforhome.com/
• Narrative not PowerPoint: https://exp-platform.com/narrative-not-powerpoint/
—
Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com.
—
Lenny may be an investor in the companies discussed.
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.lennysnewsletter.com/subscribe