The Experimentation Edge

False negatives are killing your best product ideas

28 min · June 24, 2026

Description

Summary

How do you make a high-stakes product decision when the safe choice is to never test it at all? In this episode of The Experimentation Edge, host Ashley Stirrup talks with Arun Bodapati, director of data science at Twitch, about the discipline behind trustworthy experimentation. Drawing on his experience at Schwab, Uber, and Twitch, Arun explains why false negatives are the most dangerous result a team can produce, what hygiene to nail before you push play, and how Twitch used geo-fenced experiments and causal inference to finally settle a pricing question it had avoided for years. It's a practical conversation for product managers, engineers, data scientists, and growth leaders who want experiments that hold up and earn executive trust.

Chapters

00:00 Welcome and introduction
01:15 Arun's background and marketing experimentation at Schwab
04:15 Uber's mature, experiment-driven culture
06:30 Coming to Twitch: from Python notebooks to a shared standard
08:30 The pricing problem Twitch had long avoided
10:30 Geo-fenced experiments, matched markets, and elasticity
13:15 The gifted-subs surprise and testing promotions
16:15 The discipline that matters before you push play
18:15 Why false negatives are worse than false positives
20:05 Enrollment triggers and broad explore experiments
22:45 AI, the Kiro tool, and what's next for experimentation

Takeaways

* False negatives are more dangerous than false positives: they get institutionalized as "we tried that, it didn't work" and quietly kill good ideas for years.
* The most valuable experiment work happens before you push play: clear enrollment logic, a plain-English hypothesis, and no optimizing ahead of the test.
* If an intervention sounds weak when you write it out in plain English, don't run the experiment; you're just wasting time.
* Run a broad explore experiment first; small, over-narrowed populations lack power and raise the odds of a false negative. Find the responsive segment with heterogeneous treatment effects afterward.
* Twitch used geo-fenced experiments with matched markets and causal inference to measure true price elasticity, turning a feared pricing decision into a measured, accretive one.

Connect with the Guest

LinkedIn: https://www.linkedin.com/in/abodapati/
Website: https://www.twitch.tv

Sponsor

Growthbook helps you ship features with confidence by bringing experimentation and feature flagging into one open-source platform. No more guessing whether that new checkout flow actually moved the needle, waiting weeks for data team bandwidth, or flying blind on rollouts. Growthbook gives you a single place to run A/B tests, manage feature flags, and analyze results against your existing data warehouse. With powerful stats built in, it takes the complexity out of experimentation, helps you catch regressions before they hit every user, and makes it easy to test ideas that keep your product improving and your metrics moving in the right direction. See a demo at https://www.growthbook.io/
