High Output: The Future of Engineering

Podcast by Maestro AI

English

Science & technology

A window into tomorrow's software organizations through conversations with visionary engineering leaders who are redefining the profession. Join us to explore how leadership will evolve, what makes high-performing teams tick, and where the true value of engineering lies as technology and human creativity continue to intersect in unexpected ways. maestroai.substack.com

All episodes

15 episodes

When Craft Meets Non-Determinism

Superhuman built its reputation on a number: 100 milliseconds. Every interaction in the product has to feel instantaneous. Not fast. Instantaneous. That’s the threshold where the human brain stops perceiving lag and starts feeling like the software is an extension of thought. They’ve been engineering to that constraint for years, and it has shaped everything: the architecture, the hiring bar, the way even a billing email gets crafted like a product.

Then they added AI. And for the first time, they were shipping something they couldn’t fully control.

The feeling that built a company

“Every single interaction needs to be below 100 milliseconds, because this is when you feel that things are instantaneous,” Loic says. The number didn’t come from a product spec. It came from game design. Rahul Vohra, Superhuman’s CEO, studied how games create the feeling of flow, and bet that people hate email because of how email works, not because email is email.

The architecture follows from that constraint. Superhuman assumes the network will slow you down, so they build as if the network isn’t there: local-first, syncing in the background, optimistic UI throughout. “You need to build without a backend. How do you do that across multiple devices and make it crazy fast?”

People pay $40 a month for email and feel it’s worth it. Their users, mostly executives and salespeople who average three hours a day in their inboxes, describe the experience the way people describe good tools: the software stops mattering and the work takes over.

How taste becomes infrastructure

Loic joined at the beginning of 2025 as an outsider. “I came in with genuine curiosity. I was blown away.” What surprised him wasn’t the rule but how thoroughly it had been internalized. “Even a backend engineer will think about the latency of their API and how this will reflect in the experience.” In most engineering organizations, backend engineers think about correctness and throughput.
At Superhuman, they think about how the user will feel.

It starts in hiring: product sense is a criterion for every role, not just product and design. The finance team applies the same scrutiny to the email a customer gets when they’re being told what they owe as the product team applies to the inbox. The offer letter is a product experience. “The offer is a ceremony. It’s not transactional, it’s already an experience.” Candidates who got that treatment show up acting like it.

Rahul reviews everything going into production. “Within the organization, this is building a muscle in every single engineer, designer, product manager. Everyone knows the bar is that high.” You can’t work at Superhuman long without developing an eye for when something feels off: a slightly slow animation, a misaligned pixel, an API call that’s a few milliseconds slower than it ought to be.

Loic calls it sensation transference. Packaging changes how you experience the product inside. They take that idea seriously enough that the bill you get from the finance team is treated like part of the product.

The part they can’t control

For ten years, everything in Superhuman’s stack was deterministic. Same input, same output. That’s what made the 100ms promise keepable: you could engineer to it, measure it, hold it. AI broke that.

“The consistency we were used to is not there anymore,” Loic says. “We all face the surprising change of behavior of a model that is technically not changing its version.” A model API doesn’t update its version number, but its outputs shift. The same query returns different results this week than last week.

For most products, this is annoying. For Superhuman, it’s a more serious problem, because their users aren’t tolerant of inconsistency. “We are similar to Apple in the sense that people expect the best. They pay a bunch, so they always expect the best.”

The specific problem is what happens when AI meets user-generated input.
Superhuman can engineer every designed interaction. They cannot engineer how users phrase search queries. “We were controlling every single part of the interaction (feels fast, feels right, feels correct) and all of a sudden, the outcome of the search box is not what I was looking for. Garbage in, garbage out. But how do you control the garbage in?”

There’s no bug to fix and no perf target to chase. The product was built on consistency, and now consistency is the thing they can’t fully promise.

What the numbers don’t say

Superhuman’s AI adoption numbers look good: 90% of engineers using AI daily, 70% of PRs AI-augmented, 90% of those interactions net positive, some engineers claiming 40% velocity gains. Loic is careful about how he explains this.

The numbers work partly because of who their engineers are. “We have a very senior team, over-optimized on seniority. Those people tend to use AI with care. They know the outcome they want, and they just use AI to get faster to that outcome.”

The 40% gains aren’t coming from code generation. They’re coming from everything before the code. “Coming into a new codebase, trying to understand what this library is doing: before, you had to find the entry point, map the dependencies, build your own mental model. Now Claude Code does that so much faster.” The win is in comprehension and orientation, not typing speed.

But the same playbook doesn’t transfer automatically. “If you have a lot of junior engineers, vibe coding’s impact on code quality might be real. It’s not a problem for us; it’s not part of our DNA.”

Taste filters the output. Senior engineers with strong judgment about what “right” looks like can catch what the model gets wrong. Engineers without that judgment can’t. Teams celebrating big AI velocity gains may be doing so because they have enough experienced judgment to catch the mistakes.
Teams where most of the engineers are still building that judgment may be accumulating comprehension debt they don’t know about yet.

The acquisition test

The Grammarly acquisition tests the same question at a different scale: can Superhuman’s taste survive contact with mass distribution?

Grammarly has the opposite profile. They’re embedded in Google Docs, Word, email clients, browsers. They have AI capabilities built over years of NLP work. What they’ve optimized for is breadth: supporting every kind of user, every context. Superhuman has been doing the opposite, going deep on one persona and refusing to compromise.

Loic frames the challenge clearly: “How do we make Superhuman not this niche, very fancy application, but something brought to the mass, while keeping our identity?” He reaches for Apple as the reference point. “Learning from Grammarly’s scale and AI capabilities, keeping our culture and taste, and bringing that to the mass: that would be really interesting.”

It’s a genuinely hard problem. Making things simple is hard. Linear built something delightful for small engineering teams, then got successful, then came the bigger companies, the feature requests, the complexity. The focus that made it work is what success makes hardest to maintain.

What this means for you

Superhuman is hitting a wall any product with a quality bar will hit. Three things their experience suggests are worth borrowing.

Make your implicit promises explicit. Superhuman’s was 100ms and determinism; they had ten years of architecture built around it before AI made determinism optional. Most teams have a similar promise they’ve never said out loud: accuracy, consistency, availability, something. Find yours before the model finds it for you, because you can’t defend a contract you haven’t named.

Treat the prompt box as a UX surface, not a backend problem. The moment that surprised Loic wasn’t a model bug; it was the search box. Users phrase queries badly.
Prompts are now part of the interface the user sees, and “garbage in, garbage out” is no longer an engineering excuse. Better prompts and evals matter, but if the search box returns the wrong thing, the design team owns that, not the ML team.

Don’t credit the tools for what your senior engineers are doing. Superhuman’s 40% velocity gains work because the people using AI know what right looks like and catch what the model gets wrong. If your team is junior, the same playbook will produce comprehension debt instead of speed. Once you can’t tell the tool’s contribution from the engineer’s, you’re not measuring AI productivity. You’re measuring how much taste you happened to hire.

Loic spent time before tech in contexts where craft standards weren’t optional and the feedback was immediate: a French Navy vessel that had to be back at sea in six weeks, no extensions. The discipline from that kind of constraint is different from the kind you get from a style guide. You learn it because you have no choice, and then it doesn’t really leave. He thinks that’s what Superhuman has built. He’s been there less than a year. Whether the taste travels at Grammarly scale is the thing he’s actually being paid to find out.

High Output is brought to you by Maestro AI [https://getmaestro.ai]. Loic’s AI numbers look good (90% daily adoption, 40% velocity gains), but he’s the first to say the metrics don’t explain themselves. They work because his senior engineers have the judgment to catch what the model gets wrong. Most engineering leaders have no way to see that layer. You can see PR counts and cycle time. You can’t see whether your engineers are using AI well or just generating output faster. Maestro’s daily briefings reveal where your team’s time and energy actually go: not just what shipped, but the quality of the judgment behind it. Visit https://getmaestro.ai to see how we help engineering leaders understand what their AI adoption numbers actually mean.
This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit maestroai.substack.com [https://maestroai.substack.com?utm_medium=podcast&utm_campaign=CTA_1]

14 May 2026 - 39 min

Stop writing code. Start reading it.

We recorded this episode with Steve back in October of 2025, before he invented Beads [https://steve-yegge.medium.com/introducing-beads-a-coding-agent-memory-system-637d7d92514a] and Gastown [https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04]. Several of his predictions have aged well in the months since.

Steve Yegge [https://steve-yegge.medium.com/] has been VP or head of engineering at four companies. He keeps stepping down on purpose. Not because things went wrong; his organizations were doing well. He’s the kind of leader whose reputation travels through a company; at Amazon, at Google, engineers lined up to transfer onto his teams.

He stepped down each time because he noticed the same thing: the moment he stopped being able to code alongside his engineers, conversations started requiring translation. Once you’re in translation mode, Yegge figured out, you’re not leading anymore. You’re triangulating toward an answer you don’t fully understand. In the AI era, he thinks this problem just got much more expensive.

The translation layer

When Yegge handed over the engineering org at Sourcegraph (his fourth deliberate step-down in a career that spans Amazon, Google, and Grab), he gave a specific reason. “I was going through a translation layer with my engineers where they’d be like, ‘Well, you see the AI does this, and then I do that, and then the AI does that, and then there’s a gateway’, and I’m like, what?”

It wasn’t that he didn’t trust his engineers. It was that he’d lost the ability to sense-check them. And he’d noticed what happened to leaders who stayed in that position too long: “That’s a technique that non-technical leaders use. People who’ve lost their technical chops, they can still be effective leaders, but they have to be very good at triangulating, almost like a GPS on the right answer by going to different technical people and getting it.”

Triangulation is better than nothing.
But it’s slow, and it requires your engineers to speak in executive-friendly summaries, which means you’re always one abstraction layer removed from what’s actually happening.

Yegge’s response has been consistent across his career: hand the org to someone ready to take it, go back to IC, get his hands back in the code. At Sourcegraph that meant 18 months as an individual contributor during the period when AI coding changed the most, which is exactly when he made the predictions that got Anthropic’s attention. His observation about himself is worth sitting with: his most accurate forecasts came during IC phases, not executive phases. Proximity to the work makes the signal cleaner.

The “Otherwise” has arrived

The case for technical proximity isn’t just philosophical anymore. Yegge has data. Andrew Glover, Director of Productivity at OpenAI, shared findings with Yegge and his co-author Gene Kim: at OpenAI itself, engineers who adopted Codex, their fully agentic CLI coding tool, are producing pull requests that, even accounting for higher rejection rates, “dwarf the contributions of the people who aren’t doing agentic coding by an order of magnitude. Ten times as many commits.”

The interesting part isn’t the 10x number. It’s where the 10x is and isn’t happening. “The ones who are successful with agentic coding were the ones living in the microservices world, where there’s lots of small, well-factored bits of software. The ones who are struggling are the folks in ChatGPT Land, which is one of the world’s largest monoliths.”

For a decade, engineers warned that monolithic codebases would become a liability. Every warning came with an implicit otherwise at the end: refactor now, or else. But the or-else never arrived. You could run with a monolith indefinitely; deployment was easier, QA was simpler, everything just “floated off and got deployed somewhere.” The warning was technically correct but operationally optional.

“You didn’t refactor it.
And so what we’re faced with right now is this rat race where first of all, everyone who’s already in microservices land is just being pigs. They can use all the tokens they want. AI is working for them beautifully. The ones with monoliths (and you just point at any company and they have a monolith), it is time to break them up.” The otherwise, he says, has finally arrived.

A 2025 METR study [https://addyo.substack.com/p/the-reality-of-ai-assisted-software] found that experienced developers were 19% slower when using AI tools on large, real-world repositories, the kind of environments where monoliths live.

What Bezos actually understood about services

Yegge built some of the original infrastructure that justified Amazon’s service-oriented architecture, so he has a view on why Bezos pushed it so hard in the early 2000s that most people don’t know about. It wasn’t primarily an engineering decision.

“I heard this later from a colleague at Amazon. Jeff had come from D.E. Shaw on Wall Street, and D.E. Shaw is a company that buys companies and breaks them up and sells the pieces off for a huge profit. He was worried that Amazon was gonna die because of the dot-com bust. And so what he wanted to do, as a last resort, was I’m gonna bust Amazon up and sell the pieces. Which means every one of them has to have a service interface.”

An exit strategy for a dying company accidentally created the architecture for a trillion-dollar one. Bezos wasn’t playing chess when everyone else was playing checkers; he was scared. The mandate came from a Wall Street M&A playbook, not a software architecture philosophy. Modular design was a byproduct of an exit strategy.

The companies that invested in microservices over the past decade for code organization reasons are now discovering they got AI compatibility for free. The companies that didn’t are discovering the bill is coming due.
The “Dial”

Yegge has a name for the decision every engineering leader is quietly making right now: the Dial. “Every company has been given a dial that goes from zero to a hundred, and it is the number of engineers that you’re gonna fire in order to pay for the rest of them to have AI.”

He’s not being glib. If a subset of your engineers can produce 10x the output with agentic tooling, and those tools require meaningful investment in compute and licensing, the question of headcount allocation is already embedded in your budget decisions. You’re turning the dial whether you’re thinking about it explicitly or not.

Most companies aren’t thinking about it explicitly. Yegge thinks that’s a mistake. “Once you finally figure out how coding is done today, with Codex, with Claude Code, with Sourcegraph Amp, you switched into that world. You are playing in the big leagues and everyone else is falling behind.” The dial isn’t just about AI spending. It’s about what you believe your engineers will be doing in 18 months.

Writing code is for agents

Which brings Yegge to his single most concrete piece of advice: stop spending your energy on writing code. Start spending it on reading code. “You’re gonna be generating 10 to 100 times as much code as you ever did before, and you’re gonna need to read it at some point because you need to own it.”

Addy Osmani [https://substack.com/profile/11623675-addy-osmani], VP of Engineering at Google Chrome, calls the alternative “comprehension debt [https://addyo.substack.com/p/the-80-problem-in-agentic-coding]”: the accumulation of plausible-looking code you’ve approved without truly understanding, a debt that comes due when something breaks at 2am and you can’t trace why.

The shift is real and immediate. Yegge has already made it. He describes his current workflow as watching his agents code (actually sitting there, following the diffs, paying attention to what they produce) rather than writing much himself.
“Turn off permission checks so you don’t have to hit enter all the time and just watch it. Watch it code. Pay attention to the diffs.”

The skill of reading code fast and evaluating it accurately (Is this correct? Does this make sense architecturally? Would I defend this in a code review?) is what separates a developer who’s a good director of agents from one who’s just vibe coding at scale and hoping for the best.

Yegge’s analogy: a musician who practices sight reading every day for 10 minutes compounds that skill faster than someone who only practices composition. The reading muscle and the writing muscle are different. For most developers, the writing muscle is heavily developed and the reading muscle isn’t, because historically writing was the job. That’s the ratio that’s inverting.

What this means to you

If you’re a leader who has drifted from direct technical work, the cost of that drift just increased. AI coding is changing fast enough that managing by summary will leave you making decisions you don’t understand. You don’t need to write the code, but you need to be able to read the diffs.

Ask whether your codebase is AI-ready. Not “are we using AI tools?” but “can an agent work effectively in our codebase?” The answer is mostly a function of modularity. If your engineers are struggling to adopt agentic coding, the problem is probably architectural, not motivational.

Have an explicit conversation with your leadership team about how AI changes the headcount math. Not as a cost-cutting exercise, but as a forcing function for getting clarity on what you believe your engineering team will look like in two years. Leaving this implicit means it gets decided by budget pressure instead.

And if you’re an engineer: watch your agent work. Follow the diffs. Treat it like sight reading practice. The engineers who can evaluate agent output quickly, who own what the agent ships, will be the ones who remain indispensable as the generation overhead approaches zero.
High Output is brought to you by Maestro AI [https://getmaestro.ai/]. Steve Yegge talked about the “translation layer” that forms when leaders drift from the code, but there’s a deeper version of that problem right now. Every engineering leader knows AI adoption is happening. What they can’t see is whether it’s working. Token counts and PR velocity tell you who’s generating more. They don’t tell you who’s actually using AI well. Maestro analyzes the AI sessions themselves, scoring how effectively each engineer is working with their tools, so you can see who’s genuinely leveling up and who’s just generating noise. Visit https://getmaestro.ai to see how we help engineering leaders measure AI effectiveness, not just AI activity.

How are you thinking about the difference between AI adoption and AI effectiveness on your team? We’d love to hear your story. Schedule a chat with our team → https://cal.com/team/maestro-ai/chat-with-maestro

29 Apr 2026 - 45 min

Principles Over Process with Gaurav Gargate

Most engineering leaders spend enormous energy on process. Which agile framework. Which sprint cadence. Which AI coding tool to adopt. How to standardize workflows across teams. The assumption is that the right process produces the right outcomes.

Gaurav Gargate [https://substack.com/profile/10954060-gaurav-gargate] has come to believe the opposite. Get the principles right, and the process can flex.

Gaurav is VP of Engineering at Confluent, where he runs their Security Products and Cloud Platform powering their cloud-native data streaming ecosystem. He joined when the business was sub-$100 million; today it’s $1.1 billion. Before Confluent, he spent seven years at Box and six years at Microsoft. And before any of that, he started his career at a 15-person startup in India: “Didn’t know what we were doing, but it was fun.”

Across all of those environments, from a scrappy team of 15 to a billion-dollar enterprise, one pattern has held: the organizations that thrive are rigid about their principles and flexible about everything else. The ones that struggle have it backwards.

The Agile Dogma Aha Moment

Gaurav has a specific story about when this clicked. Early in his career, he was a believer in classical agile: sprints, scrums, the full playbook. He thought it was the way to run engineering projects.

Then he hired a leader who was completely aligned on the principles: execution pays the bills, work needs visibility and traceability, quality gates matter. But the process? Different. “Look, I don’t necessarily care about the book process, whether you call it agile or you call it scrum or something else. I would love to have the agency to ensure I manage and track my work. My engineers feel like they’re actually doing the best work of their life and there is quality gate and accountability.”

Gaurav calls this a strong aha moment. “I realized I was being unnecessarily dogmatic in my approach.
And actually this additional way of doing it opened up so many gates.”

The lesson wasn’t that agile is bad. It was that confusing a specific process with the underlying principle is a trap. The principle (visible, accountable, high-quality execution) can be achieved multiple ways. Insisting on one process locks out people who could deliver the same outcomes through a different path. It closes doors you didn’t know existed.

The constraint is real, though. “You don’t wanna have 30 teams have 30 different innovative ways.” There’s a phase where letting a thousand flowers bloom is the right move, and there’s a point where you need to converge on five or six archetypes. The art is knowing when you’re in which phase.

Culture Add Over Culture Fit

The same logic applies to hiring. Early in his career, Gaurav screened for culture fit: people who matched the team’s existing style. Over time, he realized this was the same mistake as the agile dogma, applied to people instead of methodology.

“It’s actually a bad idea to have a very closed door: only follow this culture and nothing else.” When you hire exclusively for fit, you get a team that reinforces its own assumptions. The same instincts. The same blind spots. The culture calcifies instead of evolving.

His alternative: hire for culture add. Find people who share your principles and values, but bring their own approaches and experiences. “New people join in, people grow in their roles, people from different companies and backgrounds and experiences come together. The beauty is that an evolving culture being held strong on the principles of the company actually makes it a success story.”

The distinction is subtle but important: principles are fixed, culture is not. Values are the foundation. Everything built on top should be allowed to shift.
Share the Why, Trust the How

Gaurav applies the same framework to day-to-day management, and he sums it up bluntly: “The fundamental principle is to treat people like adults and they will behave like adults.”

In practice, that means sharing context aggressively (where the business is going, how decisions get made, what the company needs right now) and then stepping back. “Enable them, let them have that agency to make those micro decisions as much as possible.”

He’s not flexible about everything. Collaboration, one-team attitude, flat hierarchy, open communication: these are non-negotiable. “There are certain principles which I’m actually not ready to compromise on.” But beyond those fixed points, he lets leaders find their own style. “Ultimately what every strong individual or leader wants is to be held accountable for the outcomes and the results they deliver. And nobody likes to be micromanaged on how they get there.”

Rigid on values. Flexible on methods. The same pattern, applied to management instead of hiring or methodology.

The SDLC Tree

Where this gets most interesting is how Gaurav applies the framework to AI adoption. His approach is different from the typical “push coding copilots” playbook, and the principle underneath it is the same one driving everything else.

The principle: engineers should spend their time on high-value, creative work. The process for achieving that? That’s what changes.

Gaurav looks at the entire software development lifecycle as a tree of workflows and targets the branches no engineer enjoys. “Especially as a cloud infrastructure company, there is a ton of work in operating, managing, keeping your infrastructure secure, scaling the business. There are a lot of things that AI can generally do well.”

Confluent handles security patches and vulnerability management across three clouds and roughly a hundred regions. Infrastructure gets set up, tested, and torn down constantly.
These are the branches AI is taking over completely, with engineers administering and managing rather than doing the work by hand.

“Engineers actually love to do the innovation. They love to do the new problem solving. They love to have that ability to write new code in a way they feel is appropriate.” His conclusion follows directly: “I would love my engineers to actually have that mental space to invest their time in that high value work and let all the undifferentiated work be taken over completely by AI.”

This is a fundamentally different framing from “AI makes engineers faster.” It’s not about speed. It’s about expanding what engineering teams can accomplish. “The pie is getting bigger. We gotta look at AI as a way to expand the pie of work that an engineer can do, not necessarily just what they were doing last year.”

He invokes Jevons’ paradox: the idea that when something becomes more efficient, total consumption increases rather than decreases. Because it’s easier to build, more will get built. More demand, more opportunity, more roles.

And his take on whether AI threatens engineering jobs is unequivocal: “Every role, every job category is going to change because of AI.” But change isn’t elimination. It’s the same transition the industry went through when cloud replaced data center ops. The people who understood first principles learned the new layer and kept going.

The Fundamentals Don’t Change

This is the thread that ties everything together. Principles endure. Process shifts.

When Gaurav joined Microsoft, people questioned whether he was a real engineer because he didn’t write device drivers. “The previous generation did something at a lot lower level, and then the next generation is doing something at a different layer. That’s always been happening for decades.”

But through those decades of transformation, the fundamentals haven’t changed.
Understanding operating systems, databases, memory management: “the fundamental understanding of these core principles is what allows a great engineer to learn and pick up new things.”

His advice to new graduates is the same advice he’d have given five years ago: focus on the fundamentals. “Learning new things has become easier. Building and experimenting has become a lot easier than before. If people can really spend time understanding the core fundamental building blocks of computer science, applying them to learn and build new things is actually gonna be easier going ahead.”

The career lesson mirrors the organizational one. The engineers who thrive across generational shifts are the ones grounded in principles, not attached to any particular layer or tool. The organizations that scale from startup to $1.1 billion are the ones that hold their values tight and let everything else evolve. The leaders who get the most from AI are the ones who know which work matters and which work is just process.

Same pattern. Every level.

What This Means for You

First, separate your principles from your processes. Gaurav’s agile aha moment came when he realized he was treating a specific methodology as a principle. Identify which of your team’s practices are genuinely non-negotiable values and which are just comfortable habits dressed up as requirements.

Second, audit your hiring for culture fit vs. culture add. Are you screening for people who share your principles, or people who share your habits? The first builds a team that evolves. The second builds one that calcifies.

Third, when deploying AI, map your SDLC and target the work nobody wants. Instead of asking “how do we code faster,” ask “which branches of our workflow tree drain engineers without engaging them?” Security patches, infrastructure provisioning, repetitive operations: these are the high-ROI AI targets that also free engineers to do the work that drew them to the field.

Fourth, give context instead of instructions.
If you want people to make good micro-decisions without being micromanaged, they need the same information you have. Share the why and how you measure the what, then trust them to figure out the how.

The question worth asking your team: Are the things you’re rigid about actually principles, or are they processes you’ve held onto so long they just feel like principles?

High Output is brought to you by Maestro AI [https://getmaestro.ai]. Gaurav talked about giving teams the room to deliver their own way. But when you stop prescribing process, you lose the visibility that process used to provide. You’re no longer watching how the work happens, so you need a way to see whether the work is landing. That’s what Maestro does. Maestro is engineering intelligence for AI-first teams: AI-powered analysis that measures the true impact of your team’s work, from code changes to review quality to team health. Stop flying blind. Start leading with signal. Visit https://getmaestro.ai to learn more.

Building a team where autonomy and accountability coexist? We’d love to hear how. Schedule a chat with our team → https://getmaestro.ai/book

Feb 11, 2026 - 31 min

Why AI Productivity Gains Are Context-Dependent | With Raju Matta

Some engineering teams are seeing real, measurable AI productivity gains. Cursor is transforming how frontend developers build React apps. AI-assisted code review is catching bugs before deployment. Prototypes that took weeks now take days. But not everyone’s seeing the same results. Raju Matta [https://www.linkedin.com/in/raju-matta-4067a7/] runs engineering for Cambridge Mobile Telematics [https://www.linkedin.com/company/cambridge-mobile-telematics/]—200+ engineers, three countries, petabytes of real-time sensor data processing driver safety. Six months ago, he formed a tiger team to systematically track AI tool adoption. Status reports every two weeks. Multiple tools tested: Copilot, Cursor, PR review bots. His finding? “I’ve not seen the measurable velocity increase that people are saying out in the market—but that doesn’t mean I have totally written off LLMs yet.” This isn’t skepticism. It’s measured evaluation. And the pattern Raju’s seeing reveals something important about when AI tools deliver and when they don’t. Where AI Tools Excel As part of their evaluation, CMT ran an internal hackathon to see what AI tools could do in practice. The results told a clear story. Eighteen projects, all using AI. Teams built fully working web apps—complete with datasets—in 2-4 hours. “For that purpose, it’s great. It’s not bad at all,” he says. The pattern: AI coding tools work brilliantly for rapid prototyping with established patterns, web development using well-documented frameworks, mechanical coding tasks like boilerplate and test generation, and quick experiments to validate product ideas. These are real productivity gains. The people claiming 2x-3x aren’t exaggerating—they’re working in contexts where AI capabilities align perfectly with task requirements. When your bottleneck is writing React components or generating CRUD endpoints, AI tools deliver measurable acceleration. But CMT’s production systems are different. 
The Complexity Multiplier They’re processing petabytes of data from gyroscopes, accelerometers, GPS sensors, and video streams. They’re distinguishing potholes from crashes, sharp corners from reckless driving. They’ve been using AI and machine learning for this work for 13 years—long before LLMs became everyone’s productivity obsession. The engineering challenge isn’t writing code. It’s architecting systems that handle sensor fusion at scale, debugging why clusters fail under load, ensuring accuracy when lives depend on your classifications, and managing tech debt across distributed teams in multiple countries. “You can outsource your engineering and coding with AI tools, but not your thinking,” Raju explains. In complex production systems, the thinking is where the time goes. Code generation helps, but it’s not the bottleneck. The productivity multiplier drops from 3x to “incrementally helpful” because the constraint isn’t in the typing—it’s in the architectural decisions, the system design, the understanding of how everything fits together. This doesn’t make AI tools useless. They still catch bugs in PRs. They still help prototype solutions. They still accelerate certain tasks. But the overall velocity gain is modest because code generation often isn’t the long pole. The Tiger Team Approach Here’s what makes Raju’s perspective valuable: he’s not guessing. Six months ago, CMT’s CTO gathered the engineering leaders. “How are you guys thinking of AI?” The response: treat it like a first-class citizen. They formed a dedicated tiger team. Three people producing status reports every two weeks on tool adoption, usage patterns, and measurable impact. “We have about three or four tools that we are using all the way from PR review tools to tools like Copilot, Cursor.” This is systematic evaluation, not anecdotal impressions. 
And the data shows results that differ from the market narrative: “My general experience is that it’s good, it’s doing its job, but I haven’t seen the measurable velocity increase as much as what people are saying out in the market.” His peer conversations confirm the pattern isn’t unique to CMT: “Even other leaders and my peers that I speak with, who are working at big tech companies, have said similar things. So it’s not uncommon.” But Raju’s not dismissing the technology. “The tools are progressing at a very fast pace. I wouldn’t be surprised if it’s another six months or a year where we get to exhaust more pieces of the tool and get more done.” That “yet” matters. He’s still tracking, still evaluating, still expecting improvement. When Mistakes Have Consequences When Raju says “we have to save people’s lives,” he’s not being dramatic. CMT’s technology directly impacts driver safety. Their telematics platform processes sensor data to detect dangerous driving, assess risk, and potentially prevent accidents. This creates a different bar for “move fast and break things.” “We are a little bit more diligent because at the end of the day, we have to save people’s lives. So for us, we’d rather spend the time beforehand than reactively trying to address it.” The stakes are high—both financially and ethically. When your technology directly impacts human safety, you can’t afford to ship fast and fix later. The constraint isn’t just technical complexity—it’s the consequence of failure. “AI tools can take you north, but with the same speed, they can take you south.” In safety-critical systems, the review time, the testing time, and the verification time don’t compress even if code generation does. You can’t ship and iterate rapidly when mistakes could harm people. The overall productivity gain shrinks accordingly because the non-coding portions of the development cycle remain unchanged. This applies beyond telematics. Financial systems. Healthcare platforms. 
Infrastructure control. Any domain where errors have serious consequences faces the same limitation: AI can accelerate code generation, but it can’t compress the necessary validation and testing cycles. Where AI Struggles AI’s limitations show up in unexpected places. CMT uses AI to filter thousands of resumes for each job opening. The results? “50% makes sense. And 50% don’t make sense.” This split illustrates a broader pattern. AI works brilliantly for well-defined, repeatable tasks. It struggles with judgment calls, context-dependent decisions, and situations requiring nuanced understanding. The tool saves time on mechanical filtering. But the judgment about who’s actually right for the role? Still human. And critically, the humans can immediately spot when AI recommendations miss the mark—they don’t trust it blindly. This mirrors the coding experience. AI generates boilerplate quickly. But understanding whether the generated code fits the broader system architecture, handles edge cases properly, and follows team conventions? That requires human judgment that doesn’t compress. Where This Leaves Engineering Leaders The mistake isn’t believing AI tools work—they demonstrably do in many contexts. The mistake is assuming your context will see the same gains as someone in a completely different situation. Raju’s systematic evaluation reveals the variables that matter: Your problem domain determines gains. Web apps and prototypes with established patterns can see significant productivity improvements. Complex distributed systems with unique requirements tend to see incremental improvements. The difference isn’t the tool quality—it’s how much of your bottleneck typically sits in code generation versus system design. Your constraint defines the impact. If implementing features is your rate-limiting step, AI delivers massive value. If architectural decisions and system design are your constraint, AI helps less. 
Most production systems fall into the second category after the initial prototyping phase. Your risk tolerance changes the math. If you can ship and iterate rapidly, AI accelerates that cycle. If mistakes have serious consequences, the review and testing time doesn’t compress proportionally. The overall velocity gain depends heavily on how much of your process can safely be accelerated. Your system complexity matters. Greenfield projects with established patterns see huge gains. Legacy systems with unique constraints and interconnected dependencies see modest gains. The complexity of your codebase directly impacts how useful AI-generated code becomes. The Honest Assessment Raju isn’t claiming AI tools are overhyped. He’s providing the nuanced reality: they work extremely well for specific contexts and deliver modest improvements in others. His 6-month tiger team experiment with dedicated tracking hasn’t found a productivity revolution. They’ve found incremental gains with clear constraints. That’s the honest number engineering leaders need for planning. “LLMs can help us experiment and prototype features faster. They can help developers catch mistakes in our pull requests. They can help us find answers faster, and we are constantly evaluating,” he explains. “But I’ve not seen the impact that people are saying out there.” This doesn’t mean ignore AI tools. It means understand your context, measure systematically, and set realistic expectations. For rapid prototyping and web development? The 2-3x gains are real. For complex production systems with safety requirements? The gains exist but are much more modest. Both can be true simultaneously—the difference is context. What This Means for You First, measure systematically rather than relying on anecdotes. Set up dedicated tracking like Raju’s tiger team—assign ownership, establish regular reporting, and gather actual usage data. 
The hype cycle around AI tools means everyone has an opinion, but data reveals what actually works in your specific context. Second, understand where your bottleneck actually sits. If architectural decisions and system design consume most of your time, AI tools will help less than if code generation is your constraint. Be honest about what’s actually slowing you down before expecting AI to solve it. Third, adjust expectations based on risk profile. If your domain allows rapid iteration and tolerable failure rates, AI tools can deliver significant acceleration. If mistakes have serious consequences, the non-compressible validation cycles will limit overall gains regardless of how fast code gets generated. Fourth, keep evaluating as tools improve. Raju expects capabilities to expand significantly over the next 6-12 months. Today’s limitations may not be tomorrow’s. But base your current planning on current capabilities, not projected future states. The question every engineering leader should ask: What’s actually constraining my team’s velocity—code generation or everything else? Because if it’s everything else, AI coding tools will help incrementally, not transformationally. And that’s okay—incremental gains compound over time. Raju’s measured approach provides the reality check the market needs. AI tools deliver real value, but the magnitude depends entirely on your specific context. Understanding that context is how you set realistic expectations and make smart adoption decisions. High Output is brought to you by Maestro AI [https://getmaestro.ai]. Raju talked about forming a tiger team to systematically track AI tool adoption with biweekly status reports—but that measurement challenge extends beyond just AI tools. When your 200+ person engineering team is spread across multiple countries and multiple tools, it becomes impossible to see what’s actually happening without systematic tracking. 
Maestro cuts through that complexity with automated reporting and metrics that show where your team’s time and energy actually go, so you can spot patterns and make data-driven decisions about everything from AI adoption to resource allocation. Visit https://getmaestro.ai [https://getmaestro.ai] to see how we help engineering leaders get genuinely useful insights into their teams. Running systematic evaluations of new tools and processes? We’d love to hear your approach. Schedule a chat with our team → https://getmaestro.ai/book [https://getmaestro.ai/book] This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit maestroai.substack.com [https://maestroai.substack.com?utm_medium=podcast&utm_campaign=CTA_1]

Dec 11, 2025 - 36 min

Building AI Products Under HIPAA | With Muhammad Atif

When you’ve bootstrapped an engineering org from 2 people to 500, working with Fortune 500 clients like Intel and Samsung, you learn something most AI builders miss: the best technology doesn’t always ship. Muhammad Atif [https://www.linkedin.com/in/muhammadatif/], President and CTO of PureLogics [https://purelogics.com/], recently deployed an on-prem AI model that hits 70% of the accuracy of their original cloud-based prototype. That 30% accuracy gap represents the tradeoff required for HIPAA compliance. The cloud-based prototype couldn’t be deployed—patient data can’t touch external APIs under their client’s compliance requirements. This is the reality healthcare engineering leaders face: you’re building for the best model that meets your compliance requirements, not just the highest-performing model in isolation. Since co-founding PureLogics in 2007, Muhammad has grown it from 2 people coding in a room to a 500-person global engineering firm. They build on-prem models that achieve 60-70% of cloud-based prototype accuracy while meeting the strict data security requirements that healthcare demands. The Compliance Wall Everyone Hits Muhammad’s team was prototyping an AI feature using OpenAI’s API. Fast iteration, impressive results. Then the client’s compliance team saw the architecture diagram. “When the customer said they need to have on-prem AI, we changed the entire paradigm,” Muhammad explains. The shift required rethinking four critical areas. First, hardware specification: what GPU specifications, how much RAM, what storage architecture. These decisions determine whether your model trains in days or weeks, whether inference is real-time or batch. Second, model selection: which open-source model fits your domain? Healthcare has different requirements than generic NLP—you need models that work for medical terminology, clinical workflows, provider documentation patterns. 
Third, and most challenging, training data acquisition. You need millions of records to train effectively, but healthcare data is protected. “We need to have millions of records of data to train that model to bring up to that accuracy,” Muhammad explains. Where do you get training data that doesn’t violate HIPAA? Fourth, compliance layers: NIST AI RMF compliance, HHS trustworthy AI practices, OWASP LLM practices, HIPAA audit trails. “We need to make sure that we have all these security and safety guardrails implemented, especially when dealing with live patient data,” Muhammad says. “We have deployed an onsite model. It’s almost 70% accurate compared to the one we used to have in the initial POC,” Muhammad says. That 30% accuracy gap represents the tradeoff for meeting compliance requirements. The on-prem model that meets HIPAA requirements ships. The cloud-based prototype doesn’t. This is the reality healthcare leaders face. The question isn’t “what’s the highest-performing model?” It’s “what’s the best model we can deploy within our regulatory constraints?” What Compliance Expertise Enables PureLogics’ on-prem AI capabilities unlock healthcare applications that wouldn’t be possible without deep compliance knowledge. Take their diabetic foot monitoring project. Diabetic patients often can’t feel temperature changes in their feet—a dangerous condition that can lead to undetected injuries and infections. PureLogics is building algorithms that analyze thermal images of patients’ feet to detect temperature anomalies, giving providers early warning signs before problems escalate. Or their women’s health platform, which helps women track and manage their health throughout hormonal and menstrual cycles. These aren’t trivial consumer apps—they’re handling protected health information that requires the full compliance framework PureLogics has built. 
“We have also been working with few startups who are working on like diagnostics and disease detection kind of algorithms, and we are really proud that we are going to be part of those teams,” Muhammad says. This is the payoff for solving the hard problems. Teams that can’t navigate HIPAA constraints can’t build these applications. Teams that can navigate HIPAA but can’t achieve reasonable AI model performance on-prem can’t make them useful. PureLogics’ expertise in both areas—compliance frameworks and on-prem AI deployment—creates the foundation for meaningful healthcare innovation. The Hidden Cost of Moving Fast Muhammad sees a pattern with technical debt. “Tech debt is mostly built due to business pressure—’keep delivering, I need this thing or that thing’—or it can be due to poor planning or prioritization.” Add AI to the mix, and the pressure intensifies. Your CEO reads about companies shipping 4x faster with AI. Your board asks why you’re not seeing similar gains. Your competitors claim massive productivity jumps. But in healthcare, you can’t just vibe-code a system into production. “You can keep building things, but especially with AI—we are generating code through AI as well—we wanted to make sure we’re not building a product that reaches a certain level where we can’t add any further features, or it’s not scalable.” PureLogics’ solution: quarterly audits. Load testing. Security reviews. Code quality checks. Database design reviews. Access audits—who has credentials to which systems. And version upgrade planning—if you’re on Python version X but version Z is stable, what’s the migration path? This sounds expensive. It is. But Muhammad has watched what happens without it: systems that need complete rebuilds after two years. Technical debt that makes simple features take weeks. Security vulnerabilities that surface during compliance audits. The paradox: moving slower with proper guardrails lets you move faster long-term. 
The Twenty-Year View Muhammad started PureLogics in 2007 with one other person. They worked 12-14 hour days, went home at midnight, worked weekends. “The initial four to five months were quite challenging.” By 2008, they landed Fortune 500 clients—Live Nation, where they managed web presence for Mariah Carey and Taylor Swift. Today, they have 500 people across multiple countries. This growth path offers a different model than the typical startup story. No VC funding. No blitzscaling. Just steady, sustainable growth by solving real problems for enterprise clients. What does this teach about AI adoption? “We need to have people who are not just coders, but they are also thinking from an end-to-end problem solving mindset. And they are great at other areas like soft skills—communication, explaining and connecting with people and driving to a solution.” The companies that win with AI won’t be the ones that generate the most code the fastest. They’ll be the ones that understand the complete problem: technical constraints, compliance requirements, security frameworks, and human workflows. What This Means For You If you’re building AI products in regulated industries, Muhammad’s framework offers a practical path: First, map your constraints before you optimize. Don’t start with “what’s the best model?” Start with “what meets our compliance requirements?” An on-prem model that achieves 70% of your prototype’s accuracy but ships is more valuable than a cloud-based prototype that can’t be deployed. Second, build security guardrails into your development workflow. Muhammad’s team achieves 20-25% productivity gains from AI coding tools while maintaining code quality through static analysis, peer review, and technical debt checks. Third, audit regularly, not reactively. Quarterly reviews of code quality, security, database design, and access controls catch problems when they’re manageable, not when they’ve compounded into system-wide issues. 
Fourth, choose tools for integration, not hype. The best AI tool isn’t the one with the most impressive demos. It’s the one that integrates with your existing quality processes and workflow. Fifth, remember that constraints can become advantages. PureLogics’ on-prem expertise differentiates them. Companies that need HIPAA-compliant AI need teams that understand both AI and compliance frameworks. Your constraints are your moat. The critical question: are you building AI products that work within your industry’s reality, or are you trying to force approaches that only work for unrestricted consumer apps? About PureLogics: PureLogics [https://purelogics.com/] is a global engineering firm specializing in healthcare software development with deep expertise in HIPAA compliance and on-prem AI deployment. Founded in 2007, they’ve grown from 2 engineers to a 500-person team serving Fortune 500 clients including Intel, Samsung, and Live Nation. The company focuses on building compliant AI solutions for healthcare organizations, from e-prescription systems and EMR integrations to on-prem AI models for sensitive patient data. Their expertise in both AI implementation and healthcare compliance frameworks enables them to build applications that meet strict regulatory requirements while delivering meaningful clinical outcomes. Learn more at purelogics.com [https://purelogics.com/]. About Maestro AI: High Output is brought to you by Maestro AI [https://getmaestro.ai/]. Maestro is an engineering visibility platform that helps leaders make data-driven decisions backed by narrative context. While most dashboards offer surface-level metrics, Maestro analyzes your team’s actual code, PRs, tickets, and communications to reveal not just what’s happening, but why. The platform automatically synthesizes this activity into real-time feeds for every project, team, and individual—replacing subjective status meetings with objective truth. 
This allows you to identify blockers before they impact deadlines, de-risk key initiatives, and measure the true impact of tools like AI on your organization. Visit https://getmaestro.ai [https://getmaestro.ai] to see how we help engineering leaders build more predictable and efficient organizations. Leading distributed engineering teams? We’d love to hear your challenges. Schedule a chat with our team → https://getmaestro.ai/book [https://getmaestro.ai/book] This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit maestroai.substack.com [https://maestroai.substack.com?utm_medium=podcast&utm_campaign=CTA_1]

Nov 11, 2025 - 35 min