Data Breakthroughs: Solving Real-World Data Challenges

Podcast by Lior Barak - Cooking Data

English

Science & Technology

More about Data Breakthroughs: Solving Real-World Data Challenges

A podcast where data experts solve real-world operational challenges submitted by listeners. Each episode tackles a fresh problem, delivering actionable solutions, key insights, and implementation steps to help data professionals overcome barriers and create business value. impactoperations.substack.com

All episodes

14 episodes

Why Simple Data Integrations Take Months & How to Fix Them (feat. Ilya Vladimirskiy)

Episode Summary

I’m incredibly excited to have Ilya back for the Season 1 finale. He was my very first guest in the pilot episode (https://cookingdata.substack.com/p/data-breakthroughs-solving-pipeline), and honestly, he helped me figure out this format - what works, what doesn’t, how to make the collaborative problem-solving feel authentic. So it felt only right to close this season by bringing him back full circle.

In this finale, we tackle a frustratingly common scenario: a marketing stakeholder needs data from a new platform for critical quarterly forecasting, but the data team estimates four months to build the connector. Meanwhile, two hours disappear every morning into manual CSV downloads, cleanups, and copy-paste operations, a process that’s already caused a 50% error in pipeline analysis.

What seems like a technical integration problem quickly reveals itself as something much deeper: an organizational breakdown in ownership, communication, and mutual understanding between business stakeholders and data teams. And true to form, Ilya immediately zeroes in on the people side of things.

Problem Category: Data Integration & ETL
Runtime: 40 minutes

The Problem

Submitted by: Anonymous Marketing Professional
Industry Context: Company with established data infrastructure and quarterly business planning cycles

Problem Framework

Issue: Need data from the new marketing automation platform for quarterly forecasting, but building a connector will take four months, according to the data team. Currently manually downloading CSV files daily.

Trigger: Quarterly planning cycle starts in 8 weeks. Currently spending 120 minutes every morning downloading, cleaning, and manually importing marketing data files. Last week, a copy-paste error threw off pipeline analysis by 50% - only caught because the numbers seemed unrealistic.

Tension: The data team focuses on building robust, enterprise-grade connectors that take months to develop properly.
While understanding their approach, there’s an immediate business need that can’t wait for the perfect solution. The manual process is unsustainable and risky, but the data is critical for business planning.

Boundaries:
* Cannot change the quarterly planning timeline (set by business cycle)
* The marketing platform was selected by leadership and cannot be changed
* The data team has limited capacity and other priorities
* A budget exists for reasonable interim solutions
* Must maintain data quality standards for forecasting accuracy

Tech Stack: New marketing automation platform with CSV export capability, central data warehouse for forecasting (specific tools not disclosed)

Clarity Statement: Need an interim solution to get marketing platform data into the data warehouse reliably within the next 8 weeks, without waiting for the full enterprise connector that will take 4 months.

Our Guest

Ilya
Fractional Head of Data & Data Leadership Consultant

Ilya brings over 15 years of data experience, having led data functions at companies like Ada Health (symptom checker app), and various startups and scale-ups across Berlin and Munich. Originally from Moscow with a background in computational mathematics, he moved to Germany in 2002 and transitioned from database research to hands-on data engineering and leadership roles. After the biotech winter impacted Ada Health, Ilya pivoted to fractional and interim data leadership, helping companies build data platforms and teams across different domains and stages.

Special Note: Ilya was our very first guest in the pilot episode and returns to close out Season 1, bringing his people-first philosophy full circle.
Connect with Ilya:
* LinkedIn: https://www.linkedin.com/in/bkmy43/
* YouTube: https://www.youtube.com/@lab4.berlin (Data leadership discussions while smoking pipes - yes, really, and it’s worth checking out)
* Website: https://www.lab4.berlin/

The Breakthrough Discussion

Initial Reactions

Ilya and I immediately recognized this as a people problem disguised as a technical problem. As Ilya put it: “From most failing projects and situations like this, I rarely saw the root cause was technology or tools.”

The four-month estimate raised red flags for both of us. As Ilya observed, connecting to an API of an existing marketing tool shouldn’t take four months - what’s likely happening is that “building a connector” actually means the entire pipeline: getting the data, integrating it into the company data model, and delivering it through BI tools with proper business logic.

The Real Problem

Through the reflection break and collaborative discussion, we identified the core issues:
* Communication Breakdown: The data team likely down-prioritized this request because other initiatives have a higher business impact, but they haven’t articulated this clearly. The stakeholder hears “four months” without understanding what’s blocking it.
* Ownership Confusion: It’s unclear who owns the decision about prioritization and tradeoffs. Without clear ownership, every request becomes a negotiation rather than a strategic decision.
* Missing Context: The data team probably doesn’t understand the business impact of the delay (corrupted forecasts, wasted ad spend, strategic planning delays). The stakeholder doesn’t understand what the data team is actually building and why it takes time.
* Us vs Them Dynamic: The situation has devolved into adversarial positioning - “the data team won’t help me” versus “stakeholders want everything immediately” - rather than collaborative problem-solving.
The Solution Approach

Rather than a single technical fix, our discussion produced a multi-layered solution:

Immediate Relief (Week 1-2): Ask one of the engineers to build a simple Python script or similar automation. This doesn’t need to be production-grade infrastructure - just something that reliably pulls the CSV, does basic transformation, and loads it into the warehouse. This can be a 2-3 day task rather than a 4-month project.

Transparency & Context (Week 2-3): Create a visible initiative backlog overview showing everything the data team is working on. When someone says “it will take four months,” they should be able to show exactly what’s blocking it and why those other priorities matter more. This isn’t about justifying delays - it’s about enabling informed decisions.

Rational Decision Framework (Week 3-4): Develop a structured way to articulate both the cost of building solutions and the business impact of delays. Put numbers on the table: What does two hours of manual work daily cost? What’s the risk value of potential forecast errors? What’s the opportunity cost of the data team working on this versus their current priorities?

Strategic Alignment (Ongoing): Establish clear ownership and a prioritization process that considers both technical complexity and business impact. This isn’t about the data team gatekeeping or stakeholders demanding - it’s about having a framework where tradeoffs are visible and decisions are rational.

Key Takeaways

3 Critical Insights
* This is an Organizational Problem, Not a Technical One: The four-month timeline isn’t about technical complexity - it’s about priorities, communication, and organizational dynamics. The actual technical work of connecting to an API could be done much faster if approached as a quick automation rather than an enterprise-grade infrastructure.
* The “Us vs Them” Dynamic Is Killing Efficiency: When stakeholders and data teams position themselves as adversaries rather than collaborators, every interaction becomes a negotiation. The marketing person sees the data team as obstructionist; the data team sees stakeholders as demanding and unrealistic. Neither side wins in this dynamic, and the business suffers.
* Ownership Clarity Is Essential: Without clear ownership of prioritization decisions, every data request becomes contested territory. Someone needs to own the decision about whether a four-month wait is acceptable given the business impact, and that person needs visibility into both the technical constraints and business consequences.

4 Action Items

For the Problem Submitter (and anyone in similar situations):
* Request a Quick Automation Script (This Week) - Ask a data engineer to build a simple Python script or similar automation that pulls the CSV, does basic transformation, and loads it into your warehouse. Make it clear this doesn’t need to be production-grade infrastructure - just something reliable enough to bridge the gap. Timeline: 2-3 days of engineering time.
* Create Initiative Backlog Visibility (Week 2) - Work with the data team to create a visible overview of all current initiatives. When told something will take four months, you should understand what’s blocking it and why those priorities were chosen. This isn’t about challenging their decisions - it’s about having context for informed discussion.
* Articulate Cost and Impact With Numbers (Week 3) - Document the actual business impact: two hours daily of manual work (cost it out by salary), risk of forecast errors (quantify the potential impact), strategic planning delays (what decisions are being made without this data?). Similarly, ask the data team to articulate what they’d need to deprioritize to tackle this sooner.
* Establish Ongoing Prioritization Framework (Week 4+) - Work with leadership to create a clear process for prioritizing data work that considers both technical complexity and business impact. Identify who owns these decisions and ensure they have visibility into both technical constraints and business consequences. This prevents future “four months” surprises.

Episode Highlights
* 02:00 - Problem reveal: Four months for a marketing platform connector
* 06:30 - Ilya’s immediate diagnosis: “This is a people problem, not a technical one”
* 14:45 - Post-reflection discussion: Unpacking the communication breakdown
* 24:30 - The ownership question: Who actually decides priorities?
* 31:00 - Quick wins vs long-term solutions: The Python script approach
* 36:05 - Did we break through? An honest assessment

The Honest Assessment

At the end of the episode, Ilya and I agreed: This wasn’t a breakthrough moment. As Ilya candidly put it: “If I had a silver bullet, a process, or an idea how to solve it in a company, I would probably not be here. I’d be a millionaire.”

This pattern - stakeholder requests taking months, manual workarounds creating risk, “us versus them” dynamics - happens constantly across companies of all sizes. It’s not easily solved because it’s deeply rooted in organizational structure, culture, and human nature.

What our episode provided wasn’t a magic solution but a structured framework for thinking about these conflicts:
* Separating immediate tactical relief from long-term strategic solutions
* Making tradeoffs visible rather than hidden
* Moving from adversarial positioning to collaborative problem-solving
* Establishing clear ownership and rational decision processes

The problem that was submitted - the manual CSV downloads - can likely be solved with a quick automation.
But the deeper problem - the organizational dynamics that created a four-month timeline - requires more fundamental changes that depend heavily on company culture, team size, and leadership support.

Resources & Concepts Mentioned
* Python Automation Scripts: Simple scripts using libraries like pandas for CSV processing and database connectors (psycopg2, mysql-connector, etc.) for warehouse loading
* Backlog Transparency Tools: Project management platforms (Jira, Linear, etc.) configured for stakeholder visibility
* Prioritization Frameworks: Cost of delay, weighted shortest job first, RICE scoring (Reach, Impact, Confidence, Effort)
* Interim vs Enterprise Solutions: The concept of “good enough for now” automation versus production-grade infrastructure

Continue the Conversation

Submit Your Data Problem or Become a Guest

Visit our show website: https://data-breakthroughs-podcast.cookingdata.blog/

Here you can:
* Submit your data challenge for a future episode
* Apply to be a guest
* See the latest episodes
* Explore past problems and solutions

Share Your Alternative Solution

Have you dealt with similar stakeholder-data team conflicts? How did you resolve them?
* Use #DataBreakthrough on social media
* Reply to this newsletter with your approach

Implementation Follow-up

If you try any of these approaches and they work (or don’t work), I’d love to hear about it. Real implementation stories help the entire community learn.

Season 1 Reflection

This episode marks the end of Data Breakthroughs Season 1. I started with Ilya in the pilot episode, and it felt only right to close the season with him returning. Throughout the season, I’ve seen a consistent theme: most data problems are actually people problems. Whether it’s pipeline reliability, customer definition alignment, ML deployment challenges, or dashboard governance, the technical aspects are rarely the core issue.

Thank you for being part of this first season.
Your problem submissions, guest applications, and community engagement have made this possible.

About Data Breakthroughs

Data Breakthroughs brings together data practitioners to solve real operational challenges through collaborative problem-solving. Each episode features authentic, unscripted brainstorming sessions where neither the guest nor I see the problem beforehand, creating genuine, real-time problem-solving moments.

Host: Lior Barak

Credits

Host & Producer: Lior Barak
Guest: Ilya, Fractional Head of Data
Music: “Calisson” courtesy of Riverside
Season: 1, Episode 11 (Season Finale)

Season 2 Preview: Coming late February 2026 with enhanced problem submission requirements, extended reflection breaks, and continued commitment to authentic, unscripted data problem-solving. Season 3 recording begins in April 2026 for a June 2026 release.

Disclaimer

This podcast is for inspiration and educational purposes. The solutions and approaches discussed are general frameworks meant to spark ideas and collaboration. Always adapt recommendations to your specific organizational context, constraints, and requirements. Not every problem has a breakthrough solution, and sometimes the most valuable outcome is understanding the complexity better.

Thanks for being part of Season 1. See you in late February 2026 for Season 2!

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit impactoperations.substack.com/subscribe
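
The quick automation described in these show notes (pull the CSV, do a basic transformation, load it into the warehouse) could look roughly like the sketch below. It is only an illustration under stated assumptions: the column names (date, campaign, leads, spend) and the table name marketing_daily are hypothetical, and an in-memory SQLite database stands in for the warehouse, whose actual tooling was not disclosed. It uses the Python standard library to stay dependency-free; pandas and a warehouse-specific connector, as mentioned under Resources, would follow the same shape.

```python
import csv
import io
import sqlite3

# Hypothetical column names; the real export format is not described in the show notes.
REQUIRED_COLUMNS = ("date", "campaign", "leads", "spend")


def parse_export(csv_text):
    """Parse and validate the CSV export, returning clean rows as tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(REQUIRED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"export is missing columns: {sorted(missing)}")
    rows = []
    for line_no, raw in enumerate(reader, start=2):  # line 1 is the header
        try:
            rows.append((
                raw["date"].strip(),
                raw["campaign"].strip(),
                int(raw["leads"]),
                float(raw["spend"]),
            ))
        except (ValueError, TypeError, AttributeError) as exc:
            # Fail loudly instead of silently loading a half-broken file.
            raise ValueError(f"bad value on line {line_no}: {exc}") from exc
    return rows


def load_rows(conn, rows):
    """Load rows so that re-running the same file replaces rather than duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS marketing_daily ("
        "date TEXT, campaign TEXT, leads INTEGER, spend REAL, "
        "PRIMARY KEY (date, campaign))"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO marketing_daily VALUES (?, ?, ?, ?)", rows
        )
    return len(rows)


# Invented sample export, standing in for the daily download.
sample = (
    "date,campaign,leads,spend\n"
    "2025-12-01,webinar,40,250.0\n"
    "2025-12-01,newsletter,15,0\n"
)
conn = sqlite3.connect(":memory:")  # stand-in for the real warehouse connection
load_rows(conn, parse_export(sample))
load_rows(conn, parse_export(sample))  # a second run must not duplicate rows
total_rows = conn.execute("SELECT COUNT(*) FROM marketing_daily").fetchone()[0]
total_leads = conn.execute("SELECT SUM(leads) FROM marketing_daily").fetchone()[0]
print(total_rows, total_leads)
```

Two choices here target the failure described in the submission: validation raises on the first malformed value rather than loading questionable numbers, and the primary key makes the load safe to re-run.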

Dec 17, 2025 - 40 min
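
For the "Articulate Cost and Impact With Numbers" action item in the episode above, a back-of-the-envelope calculation might look like the following sketch. The two hours per day, the 8 weeks, and the 2-3 day script estimate come from the show notes; the hourly rates, the 5-day week, and the 8-hour engineering day are hypothetical placeholders to be replaced with real figures.

```python
import math

MANUAL_HOURS_PER_DAY = 2        # from the submission: 120 minutes every morning
WEEKS_UNTIL_PLANNING = 8        # from the submission
SCRIPT_EFFORT_DAYS = 3          # upper end of the 2-3 day estimate in the notes
WORKDAYS_PER_WEEK = 5           # assumption
HOURS_PER_ENGINEER_DAY = 8      # assumption
ANALYST_HOURLY_COST = 60.0      # hypothetical fully loaded rate
ENGINEER_HOURLY_COST = 90.0     # hypothetical fully loaded rate


def manual_cost(weeks):
    """Cost of continuing the manual CSV routine for the given number of weeks."""
    return MANUAL_HOURS_PER_DAY * WORKDAYS_PER_WEEK * weeks * ANALYST_HOURLY_COST


def script_cost():
    """One-off cost of having an engineer build the interim script."""
    return SCRIPT_EFFORT_DAYS * HOURS_PER_ENGINEER_DAY * ENGINEER_HOURLY_COST


def break_even_workdays():
    """Workdays of manual effort after which the script has paid for itself."""
    daily_manual_cost = MANUAL_HOURS_PER_DAY * ANALYST_HOURLY_COST
    return math.ceil(script_cost() / daily_manual_cost)


print(f"Manual work until planning starts: {manual_cost(WEEKS_UNTIL_PLANNING):.0f}")
print(f"Interim script: {script_cost():.0f}")
print(f"Break-even after {break_even_workdays()} workdays")
```

This covers only labor. The risk of forecast errors and the data team's opportunity cost, both raised in the notes, would need their own estimates on top.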
How to Bridge the Data-Experience Gap & Gain Executive Buy-In (feat. Tiankai Feng)

Data Breakthroughs - Episode 10: When Data Meets Decades of Experience

Real-world data problem solving in action! Tiankai Feng (Director of Data & AI Strategy at ThoughtWorks, author of "Humanizing Data Strategy" and "Humanizing AI Strategy") and host Lior Barak tackle a manufacturing company where plant managers with 20+ years of experience resist a modern data platform.

Problem Category: Organizational Data Strategy / Change Management
Runtime: 32 minutes

The Challenge: A family-owned manufacturer invested heavily in data infrastructure, but plant managers still make decisions based on "what happened last time" and gut instincts, creating a divide between analytics teams and operations.

The Solution: Transform data from replacement threat to support tool through co-creation, clear communication about expertise-data synergy, and defining decision-making rules with proper incentives.

Key Takeaways:
* Expertise vs. data is always a tension field - communicate how they work hand-in-hand, not against each other
* People don't use things they didn't help create - co-creation is essential for adoption
* Success metrics must reflect both short-term and long-term value to align incentives properly

Guest: Tiankai Feng, Director of Data & AI Strategy at ThoughtWorks
Author of "Humanizing Data Strategy" and "Humanizing AI Strategy"
Connect: https://www.linkedin.com/in/tiankaifeng/

Get Involved:
Submit your data problem or become a guest: https://data-breakthroughs-podcast.cookingdata.blog/
Join the conversation: #DataBreakthrough
Full show notes: https://data-breakthroughs-podcast.cookingdata.blog/

Disclaimer: This podcast is for inspiration and educational purposes. Solutions discussed are general approaches - adapt them to your specific context and constraints.

Music: "Calisson" courtesy of Riverside

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit impactoperations.substack.com/subscribe

Dec 10, 2025 - 33 min
How to Manage Analytics Overload & Build Effective Dashboards (feat. Eva Schreyer)

Data Breakthroughs - Episode 09: When Analytics Becomes a Dashboard Factory

Real-world data problem solving in action! Eva Schreyer (Head of Data & Analytics at Neugelb/Commerzbank) and host Lior Barak tackle a community-submitted challenge about analytics overload for the first time during recording.

Problem Category: Business Intelligence & Dashboarding / Organizational Data Strategy
Runtime: 37 minutes

The Challenge: A product team drowns in 40+ charts per report while struggling to make data-driven decisions, creating a disconnect between analytics investment and business value.

The Solution: Transform from dashboard factory to strategic partner through executive alignment, monetizing requests, and prioritizing deep-dive analyses over generic reporting.

Key Takeaways:
* Too much data doesn't mean good decisions - relevance matters more than volume
* Making stakeholders understand the cost of requests (in time/effort) dramatically improves prioritization
* Ask "what will you do differently when this metric changes?" to identify truly actionable insights

Guest: Eva Schreyer, Head of Data & Analytics at Neugelb (Commerzbank)
Connect: https://www.linkedin.com/in/eva-schreyer/

Get Involved:
Submit your data problem: https://data-breakthroughs-podcast.cookingdata.blog/
Become a guest: https://data-breakthroughs-podcast.cookingdata.blog/
Join the conversation: #DataBreakthrough
Full show notes & visual diagrams: https://data-breakthroughs-podcast.cookingdata.blog/

Disclaimer: This podcast is for inspiration and educational purposes. Solutions discussed are general approaches - adapt them to your specific context and constraints.

Music: "Calisson" courtesy of Riverside

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit impactoperations.substack.com/subscribe

Nov 26, 2025 - 37 min
Why Your Perfect Model Fails in Production: The Accuracy Paradox (feat. Irena Bojarovska)

Data Breakthroughs - Episode 8: The Office Kitchen Paradox

Real-world data problem solving in action! Irena Bojarovska and host Lior Barak tackle a community-submitted challenge for the first time during the recording.

Problem Category: Machine Learning & AI Implementation
Runtime: 50 minutes

The Challenge: A hackathon team built a smart kitchen demand forecasting model with 91% accuracy, but the company is still throwing away 20-25% of fresh products weekly while running out of popular items.

The Solution: The breakthrough isn't about fixing the model, it's about fixing the data. The model is missing critical inputs (office attendance, special events) and is operating blindly due to data quality problems. The real solution combines better data, human-AI collaboration, and proper A/B testing.

Key Takeaways:
* Model accuracy ≠ real-world performance (91% test accuracy doesn't guarantee waste reduction)
* Data quality and contextual information are your foundation (garbage in, garbage out)
* Humans should augment the model, not be replaced by it (hybrid approach wins)

Guest: Irena Bojarovska, Data Scientist at Zalando SE
Connect: https://www.linkedin.com/in/irenabojarovska/

Get Involved:
Submit your data problem: https://data-breakthroughs-podcast.cookingdata.blog/submit-problem
Become a guest: https://data-breakthroughs-podcast.cookingdata.blog/become-guest
Join the conversation: #DataBreakthrough
Full show notes & visual diagrams: https://wabi-sabi-data-newsletter.com
Figma Board: https://www.figma.com/board/jfC4ipNvd8zSPIyZreEten/Irena-Bojarovska?node-id=1-14&t=Q46O2Ae9yuRHZRwy-1

Disclaimer: This podcast is for inspiration and educational purposes. Solutions discussed are general approaches; adapt them to your specific context and constraints.

Music: "Calisson" courtesy of Riverside

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit impactoperations.substack.com/subscribe
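
The takeaway that model accuracy is not the same as real-world performance can be made concrete with a small sketch that scores a forecast on waste and stock-outs alongside a headline accuracy figure. Everything here is an assumption for illustration: the show notes do not say how the 91% was measured, so the sketch uses accuracy on total units, and the item names and quantities are invented toy numbers chosen only to mirror the figures quoted in the episode. It shows one way the two numbers can diverge and is not the episode's diagnosis, which points to missing inputs and data quality.

```python
# Invented toy numbers (units per week); not data from the show.
ORDERED = {"salad": 45, "yogurt": 34, "sandwich": 30}  # quantities bought per the forecast
DEMAND = {"salad": 25, "yogurt": 30, "sandwich": 45}   # quantities actually consumed


def aggregate_accuracy(ordered, demand):
    """Accuracy on total units: 1 - |total ordered - total demand| / total demand."""
    total_ordered = sum(ordered.values())
    total_demand = sum(demand.values())
    return 1 - abs(total_ordered - total_demand) / total_demand


def waste_rate(ordered, demand):
    """Share of ordered units that nobody consumed."""
    wasted = sum(max(ordered[item] - demand[item], 0) for item in ordered)
    return wasted / sum(ordered.values())


def stockouts(ordered, demand):
    """Items where demand exceeded what was ordered."""
    return sorted(item for item in ordered if demand[item] > ordered[item])


accuracy = aggregate_accuracy(ORDERED, DEMAND)
waste = waste_rate(ORDERED, DEMAND)
short_items = stockouts(ORDERED, DEMAND)
print(f"accuracy {accuracy:.0%}, waste {waste:.0%}, stock-outs: {short_items}")
```

With these toy numbers the over-ordering of one item and the under-ordering of another largely cancel in the total, so the headline figure looks healthy while waste and stock-outs remain. Tracking the business metrics per item is what exposes the gap.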

Nov 12, 2025 - 50 min
How to Compete Against AI-Powered Competitors With Limited Resources (feat. Jon Cooke)

Data Breakthroughs - Episode 7: Small Company vs AI Giants

Real-world data problem solving in action! Jon Cooke (Founder of Dataception) and host Lior Barak tackle a classic David vs. Goliath scenario for the first time during the recording.

Problem Category: Data Strategy & Customer Analytics
Runtime: 36 minutes

The Challenge: Small German seed company (7 people) with 600+ product varieties, 4 years of customer data, and 30 years of gardening expertise. They're losing to giants who use algorithms for personalized recommendations. Conversion rate: 2.1%. Sent tomato seeds in December while competitors suggested microgreens and winter planning guides. They have incredible data and domain knowledge - but no idea how to compete with automated personalization.

The Solution: You don't need massive tech teams. Start with customer segmentation workshops, map buying journeys, understand your data quality, build a simple recommendation engine (could be done in half a day), and test with friendly customers. The institutional knowledge trapped in people's heads is your competitive advantage - you just need to capture and automate it.

Key Takeaways:
* Understand customers and segments first - technology second
* Data quality dictates approach: good data = ML models, poor data = heuristic rules
* Simple models beat no models - you don't need world-class data scientists
* This is a business process problem with AI tools, not an AI problem
* Small teams can compete by moving fast and testing with customers

Guest: Jon Cooke, Founder of Dataception
20 years in data & AI | Former Databricks Solutions Architecture Lead | Ex-PwC
Expert in data products, GenAI, and knowledge graphs

Connect with Jon:
Website: https://dataception.com
LinkedIn: https://www.linkedin.com/in/jon-cooke-096bb0/
Company: Dataception

Get Involved:
Submit your data problem: https://data-breakthroughs-podcast.cookingdata.blog/submit-problem
Become a guest: https://data-breakthroughs-podcast.cookingdata.blog/become-guest
Join the conversation: #DataBreakthrough
Full show notes & visual diagrams: [Link to newsletter version]

Disclaimer: This podcast is for inspiration and educational purposes. Solutions discussed are general frameworks - adapt them to your specific context and constraints.

Music: "Calisson" courtesy of Riverside

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit impactoperations.substack.com/subscribe
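
The "heuristic rules" route mentioned in the takeaways above could start as small as the following sketch: a hand-written table of sowing windows plus a customer's purchase history. The categories and months are hypothetical placeholders and not gardening advice; in practice the table would be filled in from the team's own expertise, which is the institutional knowledge the episode suggests capturing.

```python
# Hypothetical sowing windows (month numbers) per product category.
SOWING_MONTHS = {
    "tomato": {2, 3, 4},
    "microgreens": set(range(1, 13)),  # assumed to be grown indoors all year
    "garlic": {10, 11},
}


def recommend(month, purchase_history, limit=3):
    """Return in-season categories, listing ones the customer bought before first."""
    in_season = [cat for cat, months in SOWING_MONTHS.items() if month in months]
    # Previously bought categories sort first; alphabetical order keeps output stable.
    ranked = sorted(in_season, key=lambda cat: (cat not in purchase_history, cat))
    return ranked[:limit]


december_picks = recommend(12, {"tomato"})
march_picks = recommend(3, {"tomato"})
print(december_picks, march_picks)
```

Under these assumed rules, a past tomato buyer is not offered tomato seeds in December but sees them first in March, which is the mistake the challenge describes. Rules like these can later be replaced by a learned model once data quality allows.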

Nov 12, 2025 - 35 min