The Experimentation Edge
This episode of The Experimentation Edge explores how A/B testing, feature flags, and user research transformed Atlassian's talent product after it failed with its first users. Andrew Willingham, who spent 11 years at Amazon and is now Head of Legal and People Products at Atlassian, shares how product experimentation works when you can't test at scale, why your customer and your user are not the same person, and how the metrics you choose decide which experiments you can even run.

Summary

Andrew Willingham, Head of Legal and People Products at Atlassian, spent 11 years at Amazon before joining Atlassian a year ago. His path from running A/B tests on millions of Amazon shoppers to building talent management software for a few hundred thousand employees forced a fundamental shift: when you can't run tests at scale, you have to sit with your actual users and watch them fail. He shares how a talent review product built for Amazon's HR specialists completely flopped when handed to HRBPs, and why that failure taught him more than any winning experiment. Now at Atlassian, he's applying the same rigor to reimagining hiring processes with AI, testing everything from recruiter screens to interview sequences that the industry has run the same way for decades.

Timestamps

03:09 From marketing Amazon's mobile app to building HR software for 1.5 million associates
08:19 Why a talent review product loved by IO psych experts flopped with actual HRBPs
11:11 How A/B testing helps product managers escape opinion-based politics
15:25 Testing copy that changes behavior: "We'll generate that status report for you"
17:20 The two North Star metrics Andrew optimizes: efficiency and quality
19:05 Khan Academy's metric trap: measuring cognitive engagement, not just completion
21:10 Why product managers resist experimentation, and what changes when you admit you don't know

Takeaways

- Your customer and your user may not be the same person: building for HR specialists instead of the HRBPs who actually run talent reviews resulted in a feature nobody could use.
- When you can't test at scale, desk rides replace A/B tests: sitting with users and watching them struggle reveals failures faster than any dashboard.
- Experimentation short-circuits political debates by removing opinion from product decisions.
- Test metrics before you test features: usage time could signal engagement or just mean your product takes too long to do its job.
- The experiments that fail deliver the most valuable learnings, especially when you expected a slam dunk.

Connect with the guest

Andrew Willingham on LinkedIn: https://www.linkedin.com/in/andrewwillingham/
Learn more about Atlassian: https://www.atlassian.com/

Sponsor

GrowthBook helps you ship features with confidence by bringing experimentation and feature flagging into one open-source platform. No more guessing whether that new checkout flow actually moved the needle, waiting weeks for data team bandwidth, or flying blind on rollouts. GrowthBook gives you a single place to run A/B tests, manage feature flags, and analyze results against your existing data warehouse. With powerful stats built in, it takes the complexity out of experimentation, helps you catch regressions before they hit every user, and makes it easy to test ideas that keep your product improving and your metrics moving in the right direction. See a demo at https://www.growthbook.io/

Topics: A/B testing, product experimentation, feature flags, user research, talent management, qualitative research, metric design, experimentation at scale, growth experimentation.