
Tech on the Rocks

Podcast by Kostas, Nitay

English

Technology & science


About Tech on the Rocks

Join Kostas and Nitay as they speak with amazingly smart people who are building the next generation of technology, from hardware to cloud compute. Tech on the Rocks is for people who are curious about the foundations of the tech industry. Recorded primarily from our offices and homes, but one day we hope to record in a bar somewhere. Cheers!

All episodes

28 episodes


Building the Open Lakehouse for the AI Era with Shubham Baldava from DataZip / OLake

In this episode of Tech on the Rocks, Nitay and Kostas sit down with Shubham Baldava, co-founder of DataZip and creator of OLake, to trace the evolution of the modern open lakehouse — from the early days of Apache Hudi to today's Iceberg-centric world. Shubham shares stories from a decade of data engineering at scale, including building near real-time pipelines at Japanese fintech giant PayPay, scaling a TikTok-style social platform at ShareChat from 10M to 160M monthly active users, and the cost and complexity pressures that pushed teams to adopt lakehouse architectures in the first place. From there, the conversation digs into the table format wars: why Hudi was the early pick for truly open, vendor-neutral lakehouses, how Iceberg has caught up and pulled ahead on integrations, where Delta fits in, and what the Tabular acquisition means for the community. Shubham explains why he believes all the major formats are converging — single-file commits, deletion vectors, variant and geospatial types, Z-indexes — and why integration breadth, not features alone, is now the deciding factor. The discussion then turns practical: what the four real pillars of a lakehouse are (ingestion, optimization, query, governance), why Debezium is so hard to replace, what it takes to hit 10-minute CDC latency for fintech reconciliation, and how OLake is rethinking ingestion with Arrow-based writes, exactly-once semantics built on Iceberg metadata, multi-phase compaction, and watermark-based parallel backfills. Finally, Shubham looks ahead to a future where Iceberg becomes the single substrate for structured, semi-structured, and unstructured data — powering multi-engine analytics and AI workloads on top of formats like Lance and Vortex, now that Iceberg has decoupled from Parquet. 
Topics covered:
  • Lessons from PayPay, ShareChat, and indie app entrepreneurship
  • Hudi vs Iceberg vs Delta: history, trade-offs, and convergence
  • Why fintech reconciliation needs sub-10-minute CDC
  • The real cost of running BigQuery, Trino, and Spark side by side
  • Debezium's staying power and why Go (not Rust) for next-gen CDC
  • How OLake uses Arrow, equality and positional deletes, and multi-step compaction
  • The decoupling of Iceberg from Parquet and what Lance/Vortex unlock for AI
  • Where to build in-house vs adopt managed lakehouse tooling

21 May 2026 - 58 min

From Session Replays to Autonomous Improvement: Shipping the First AI Product Engineer with Milana

In this episode, we sit down with Rohan Katyal and Raghav Sethi, co-founders of Milana, to discuss the shift from passive analytics to the world’s first AI Product Engineer. Rather than just providing another dashboard to monitor, Rohan and Raghav are building an agentic partner that you add to your product to bridge the gap between discovery and deployment. Drawing on their experience at Meta, Yelp, and Airtable, they explore how Milana enables autonomous improvement - turning deep user intelligence into shippable code and structural refinements that act as a tireless extension of your engineering team. The conversation dives into why session replays — a mature but historically underused technology — are now a powerful data asset thanks to vision LLMs. Raghav explains how session replays are really just high-granularity logging of DOM changes, not screen recordings, and why feeding them through AI unlocks insights that traditional event-based analytics simply can’t capture. The team breaks down how they use just-in-time structuring to extract meaning from dense, unstructured session data without requiring upfront instrumentation. Rohan shares hard-won lessons from building Yelp’s experimentation platform — including how teams that simply ran more experiments consistently outperformed those with better data resources. They discuss the tension between A/B testing rigor and iteration speed, why most experiments never ship, and how lowering the cost of generating and testing hypotheses changes everything about product development velocity. We also get into the technical details of semantic clustering across millions of sessions, why video is actually a more compact representation than raw DOM for LLM reasoning, and how Milana analyzes sessions from multiple perspectives — user researcher, PM, founder — to surface real pain points. 
Plus, a bold prediction: analytics dashboards are dying, and the future belongs to agentic systems that don’t just deliver insights but actually own and drive your OKRs.

Topics covered:
  • Why session replays are the ultimate untapped data asset for product teams
  • How vision LLMs unlocked AI-powered analysis of user sessions
  • Just-in-time data structuring: querying unstructured sessions without upfront instrumentation
  • Lessons from building experimentation platforms at Yelp and Airtable
  • Why running more experiments beats having better data
  • Semantic clustering: separating signal from noise across millions of sessions
  • Video vs. DOM vs. events: the best data representation for LLM reasoning
  • Analyzing agent behavior through session replays
  • The death of dashboards and the rise of agentic growth systems
  • User research horror stories and the surprising things users do

Chapters
00:00 Introduction to Rohan and Raghav's Journey
04:47 The Importance of User Research
08:03 Making Solutioning a Science
11:09 Understanding Session Replays and Experimentation
14:50 Defining Sessions and Experimentation Platforms
18:54 The Need for Consistent Metrics
22:11 The Role of Events vs. Session Replays
29:46 Leveraging LLMs for Enhanced Insights
35:04 Determinism vs. Non-Determinism in Data Analysis
37:57 Understanding User vs. Agent Behavior
39:47 The Art of Structuring Data
45:25 Semantic Clustering and Its Importance
47:09 Building Infrastructure for Complex Data
51:24 The Future of User Simulation and Experimentation

24 Apr 2026 - 1 h 0 min

From Art to Science: Wild Moose and the Future of AI-Powered Debugging

In this episode, we sit down with the full founding team of Wild Moose (CEO Yasmin Dunsky, CTO Roei, and VP R&D Tom Tytunovich) to explore how they’re transforming production debugging from an art into a science using AI. The trio shares their unconventional founding story, from meeting across three different cities to living together for three months in a California Airbnb to stress-test both their idea and their relationship. They discuss how they identified production debugging as a massive unsolved problem before ChatGPT even launched, recognizing that while code generation is fundamentally a text problem, debugging is a search problem that demands a completely different approach. We dive deep into Wild Moose’s “microagents” architecture: fast, highly optimized AI agents that replicate the muscle memory of senior engineers to automatically investigate production incidents in under a minute. The team explains why accuracy trumps everything in their space (wrong answers are worse than no answers when you’re debugging at 3 AM), how they navigate the speed-cost-quality triangle, and why they built a test-driven approach to validate agents against past incidents. We also get into the multi-agent vs. single-agent debate, handling multimodal observability data (logs, metrics, traces, dashboards, code), and how the rapidly evolving LLM landscape creates both opportunities and challenges for production AI systems. Plus, the team shares their favorite outage war stories, including a “WatchCat” hack and a three-month hunt for a single rogue bit.

Topics covered:
  • The Wild Moose origin story and the California Airbnb experiment
  • Why production debugging is a search problem, not a text generation problem
  • Microagents: fast, specialized AI agents for incident investigation
  • Building institutional knowledge into AI: capturing engineering muscle memory
  • The speed-cost-quality triangle in real-time AI systems
  • Multi-agent vs. single-agent architectures: when to use what
  • Handling multimodal observability data with LLMs
  • The future of AI SRE and self-healing production environments
  • Favorite outage war stories from the trenches

Chapters
00:00 Introduction to the Wild Moose Team
04:12 The Spark Behind Wild Moose
08:41 Understanding the Debugging Landscape
12:45 The Role of AI in Debugging
17:31 Building Investigative Agents
21:55 Optimizing Workflows and Feedback Loops
29:12 Navigating Complexity in Software Systems
33:42 Adapting to Rapid Changes in AI Technology
40:02 Microagents: The Future of AI Architecture
44:46 Outage Stories: Lessons from the Trenches
50:49 Vision for the Future of AI in Production

17 Mar 2026 - 52 min

From Notebooks to Production: Xorq’s lockfile Approach for Reproducible, Portable ML Pipelines

In this episode, Hussain shares the story behind xorq: a “lockfile for ML pipelines” that makes notebook work easier to reproduce, debug, and ship. We talk about why the research→production path is still so manual, how schemas (and Arrow) become the contract between systems, and what it takes to run the same pipeline across engines like Snowflake and Databricks. We also dig into escape hatches for imperative code, why feature stores didn’t become the default, and how xorq fits alongside other technologies like Iceberg.

Chapters
00:00 Hussain's Journey in Data Science
06:00 The Need for xorq: Bridging Research and Production
10:38 Challenges in Machine Learning Deployment
17:40 The Role of Lock Files in Data Pipelines
29:51 Understanding Schema Management in Data Systems
34:40 Navigating Declarative and Imperative Transformations
36:39 The Developer's Journey with xorq
38:34 Feature Stores vs. xorq: A Comparative Analysis
43:43 The Future of Feature Stores and Machine Learning
51:41 Reproducibility in Data Pipelines: xorq vs. Git-like Operations
55:47 The Future of xorq and the Data Ecosystem

29 Jan 2026 - 57 min
