Software Engineering Radio - the podcast for professional software developers

SE Radio 719: Birol Yildiz on Building an Agentic AI SRE

53 min · 6 May 2026

Description

Birol Yildiz, CEO and co-founder of iLert, joins host Kanchan Shringi to explore how iLert built an AI SRE — an autonomous agent for handling production incidents — and what the experience revealed about building AI agents in the real world. Birol explains why incident response is a fundamentally agentic problem, where the unpredictability of novel incidents makes rule-based runbooks insufficient and reasoning models essential. He describes how the AI SRE evolved from an early browser-based approach to its current architecture, built around two key ingredients: reasoning models and the Model Context Protocol. The conversation examines the four layers of the AI SRE in depth: an orchestration layer that routes requests and abstracts model providers; a knowledge layer built on plain text memory and agentic search rather than vector databases; an evaluation framework based on recorded live investigations replayed against new model versions; and a human-in-the-loop constraint layer. The episode concludes with practical advice for teams building agents: own your context completely, avoid off-the-shelf frameworks that obscure what enters the model, and get out of the way of the reasoning model rather than over-prescribing its steps.


All episodes

709 episodes

SE Radio 724: Jure Leskovec on Relational Graph and Foundational Models

Jure Leskovec, Professor of Computer Science at Stanford University and Chief Scientist at Kumo.ai, speaks with host Sriram Panyam about relational and graph language models and their transformative impact on enterprise decision-making and predictive modeling. Jure begins by establishing the critical importance of predictive modeling across industries - from fraud detection in financial institutions to customer churn prediction, lifetime value estimation, product recommendations, and healthcare risk assessment. He notes that while AI has made remarkable advances in natural language understanding and computer vision, predictive modeling over enterprise operational data stored in relational databases has been largely left behind, still relying on 30-year-old machine learning approaches that are expensive, time-consuming, and require manual feature engineering. His proposed solution to the fundamental problem with current approaches is relational deep learning and relational transformers. The discussion explores how this approach differs from traditional graph neural networks (GNNs), which Jure pioneered and deployed successfully at Pinterest. Jure concludes with practical guidance for software engineers and data scientists interested in exploring this technology.

Yesterday · 1 h 2 min

SE Radio 723: Dave Airlie on Linux Kernel Maintenance

Dave Airlie, a Distinguished Engineer at Red Hat, speaks with host Gregory M. Kapfhammer about Linux kernel maintenance. After an overview of the scale and structure of the Linux kernel, they dive deep into the review and validation of kernel patches, drawing on examples from the GPU subsystem. After discussing the features and benefits of the Linux kernel's maintenance model, they also explore kernel maintenance best practices and the tools that support them. Dave and Gregory also discuss topics such as the integration of Rust code in the Linux kernel and the ways in which AI-driven code review is influencing kernel maintenance.

3 June 2026 · 1 h 9 min

SE Radio 722: Dwayne McDaniel on the Engineering Challenges of Secrets Management

Dwayne McDaniel [https://trail.gitguardian.com/api/t/c/usr_XgQEQPgwFZ78282oE/tsk_CSQx5hsyFYHpAHx2r/enc_U2FsdGVkX19aR9lGtbabCxEhb9Yde_hsokM0Br2H8cO0MuhkXtGOlxqoSa2kzhx9AJEkM4SrYvH4PEzf842ZL9fm-omZUuEVXLdnzhA74ugphvs8lMXgwE63YENVZ9Ax], developer advocate at GitGuardian.com [http://gitguardian.com/], joins host Priyanka Raghavan to talk about the engineering challenges of secrets management. They explore what "secrets" really are in modern systems, which go far beyond passwords to include API keys, tokens, certificates, and machine identities, and how "secret sprawl" emerges across the SDLC. Drawing on reports from GitGuardian and Verizon, they discuss the growing scale of secret leaks and why credential abuse and phishing remain dominant attack vectors. They examine common leak points, from code repos and logs to CI/CD pipelines, containers, and SaaS integrations, and how cloud, DevOps, and AI tooling are amplifying risks. Priyanka quizzes Dwayne about recent supply chain attacks in the PyPI and Trivy ecosystems, highlighting recurring root causes like poor access control, long-lived credentials, and weak security hygiene. Finally, they consider detection, response, and modern solutions (short-lived credentials, secret scanning, and identity-based approaches like OWASP NHIR and SPIFFE/SPIRE), ending with practical advice for engineers to reduce blast radius and design for secure secret lifecycle management.

27 May 2026 · 52 min

SE Radio 721: Rob Moffat on Risk-First Software Development

In this episode, Rob Moffat, author of Risk-First Software Development and chief technical architect at the FinTech Open Source Software Foundation (FINOS), speaks with host Brijesh Ammanath about how all of software development is actually risk management. Rob introduces the concept of 'risk-first software development,' which sits in the context of existing methodologies like scrum and kanban. Showcasing multiple real-world project patterns to illustrate how things can go wrong when risk is ignored, he makes the case for why risk should be the primary lens behind every development decision, from architecture to prioritization. Through various examples, he shows how every developer action can be viewed as a risk trade-off and why making that explicit can lead to better outcomes. The conversation takes a deep dive into the risk-first framework and how teams can apply it in their existing processes.

20 May 2026 · 52 min

SE Radio 720: Martin Dilger on Understanding Eventsourcing

Martin Dilger, founder and CEO of Nebuilt GmbH, speaks with host Giovanni Asproni about event sourcing -- a software architecture pattern in which, rather than storing just the current state of your data, you store a sequence of events that represents every change that has ever happened in the system. This episode starts by introducing the vocabulary around event sourcing, highlighting its relationship with event modeling, event streaming, and event storming. Martin describes some of the pros and cons of the approach, including which systems it is most suitable for. The conversation ends with guidance on how to get started with event sourcing, for both greenfield and legacy systems.

13 May 2026 · 55 min
