The Experimentation Edge
Summary

In this episode of The Experimentation Edge, GrowthBook CMO Ashley Stirrup sits down with Nick Beyler, data science manager at Stitch Fix, where he leads the decision and insights team and owns the company's internal experimentation platform. Nick shares why the metric he most wants is the one he can't measure yet, a North Star that predicts a client's long-term value from their earliest behaviors, and why the most impactful experiment learnings tend to come from adoption friction rather than product bugs. He makes the case that if you're only testing winners you're not taking enough risks, explains how guardrails make that risk safe, and looks ahead to a new in-house platform and the promise of agentic AI. It's a practical, statistician's-eye view of experimentation for product managers, data scientists, and engineers building serious testing programs.

Chapters

00:00 Cold open and welcome to the show
01:45 What Stitch Fix actually does
04:15 Balancing AI with the human stylist
05:15 From public policy to the A/B testing adrenaline rush
07:15 Inside the weekly experimentation review group
08:45 The AI style assistant and listening to qualitative feedback
10:45 Why adoption friction beats product bugs
13:45 Testing for losers and building guardrails
15:45 Keep rate, successful fixes, and the holy grail metric
18:15 The new platform and the promise of agentic AI

Takeaways

* The most impactful experiment learnings usually come from adoption friction, not product bugs. By the time a big feature reaches A/B testing, it's often already a winner, so the open question is how and where to introduce it.
* A losing test is a finding, not a failure. If every experiment wins, you're not taking enough risk to learn anything new.
* Guardrails and stopping criteria are what make risk-taking safe, especially when the experience is as personal as shopping.
* The most valuable North Star metric is the one you can't measure yet, long-term client value, and causal-inference modeling helps predict it from short-term behavior.
* Quantitative results are only half the story. Direct, qualitative client feedback inside an experiment often reshapes the rollout more than the numbers do.

Connect with the Guest

LinkedIn: https://www.linkedin.com/in/nick-beyler-381864119/
Website: https://www.stitchfix.com

Sponsor

GrowthBook is the warehouse-native platform for experimentation, feature flags, and product analytics trusted by AI-native product teams at 3,000+ companies worldwide. Go to http://growthbook.io