Acima Development

Episode 100: Normalization of Deviance

43 min · Jun 10, 2026

Description

This episode of the Acima Development Podcast centers on "normalization of deviance" — the pattern where small anomalies get repeatedly ignored until they cause catastrophic failures. Mike opens with the Space Shuttle Challenger disaster as the anchoring example: engineers warned that cold O-rings could fail, but their concerns were drowned out by schedule pressure and accumulated tolerance for small deviations. The crew connects this to the Columbia disaster years later, where the same organizational lesson went unlearned, and to NASA's own "Elements of Engineering Excellence" report, which lists not questioning anomalies as a major root cause behind their biggest failures. The conversation then wrestles with the tension between safety culture and velocity. Will pushes back on pure risk-aversion, arguing that heavy regulation has real costs and that tech's "move fast and break things" ethos has produced enormous value. Dave introduces the META framework (Mitigate, Eliminate, Transfer, or Accept) and contrasts NASA's culture with SpaceX, which celebrates blowing up unmanned rockets because the risk was already accepted and the explosion yields data. Mike reinforces this with an analogy from his kid's rocket-themed birthday party, where different risk levels (model rockets, sugar rockets, thermite) warranted very different safety boundaries — treating everything as maximum-risk would have obscured where the real dangers actually lived. The group lands on a key reframe: rather than trying to control everything, build a monitoring culture that instruments heavily, tests to failure, and pays attention to the signal inside the noise. The final stretch applies these ideas to current software practice, including AI-assisted development. Matt and Dave debate whether vibe coding will dominate production code soon, with everyone agreeing humans must remain accountable for what ships. 
Will gives concrete examples of normalized deviance developers live with daily: thousands of ignored compiler warnings (some of which are genuinely dangerous), bloated mobile web performance, and test suites nobody expects to run clean. He notes AI could finally make the ditch-digging cleanup work economically viable. Mike closes by tying it back to the opening theme: entropy is the default, letting things slide is easy, but flipping the culture toward actively watching the data is what prevents small deviations from becoming the next tragedy.

Transcript

MIKE: Hello, and welcome to another episode of the Acima Development Podcast. I'm Mike, and I'm hosting again today. With me, we've got Will Archer, Dave Brady, and Kyle Archer. DAVE: Howdy, howdy. MIKE: I'm going to start with a story, as I typically do, actually two stories, but one funny and one not at all funny. I'll start with the funny one. My wife, when she was in her late teens, decided to drive with her sister to college. She wasn't going to college yet, but she decided to road trip with her sister to college. And they made sure the car was good the day before, had been doing some maintenance, and they cracked the case of the cooling fan for the engine. WILL: Oh. MIKE: So, when the fan was running, it was bumping against this cracked part of the case, so you can imagine the sound of that, not good, right? They actually took it to a mechanic and got kind of a loose sign-off that, "Yeah, well, this isn't going to make the car break, but it's going to sound terrible, and you should get it fixed soon," like, "Okay." And they drove cross-country [chuckles] with that thing rubbing the whole way. And what they did is they just turned up the radio, so full volume, full road trip. They drove for, like, two days [laughs] with the volume cranked up, just ignoring it. And she's told the story for years. It's funny, you know, everybody in the family laughs. 
You can just imagine, just turn up the volume, and the problem goes away. That is one way to make a problem go away. The other story is related to what we talked about in our last episode. And we're going to continue with the topic we talked about in the last episode, which is the Space Shuttle Challenger disaster, which happened in, let me check my dates here, '86. I believe this happened in... DAVE: '86? MIKE: '86. That's the date that I was remembering, so 1986. There it is: 1986. So, I actually looked this up. I read about it on Wikipedia. As a kid, I remember watching this [chuckles] in school, and it was, you know, horrifying. So, they had O-rings around the booster engines that, you know, like rubber or rubber-like material. And they had had record cold, I guess, at the launch pad the night before. And that cold caused the O-rings to, you know, shrink and stiffen. And so, in the launch, they lost integrity, so air started getting into the fuel. Eventually, that caused a catastrophic explosion, and the entire spacecraft disintegrated. I remember the horror of seeing those booster engines just randomly wandering, and there was not anything left of the main craft. It was a tragedy, you know, a horrible tragedy. Anybody who was around that time remembers. I was talking to somebody else like, "Oh, that was our JFK moment," you know? Everybody remembers that. Where were you when that happened? And it turns out the engineers had warned this might happen, and they were ignored, because there was enough noise in the data. They're like, "Oh yeah, well, there are so many things that can go wrong [inaudible 03:18] DAVE: Not just ignored, though, right? They were told to stay in their lane. MIKE: I think that's right. DAVE: If I recall. Yeah. They were told to be quiet, yeah. MIKE: The interesting thing about that is there was another space shuttle disaster some 20 years later or so, where Columbia broke up in re-entry. 
And the diagnosis afterward essentially said, "We didn't learn from the last time." There were likely problems that were pointed out by engineers, and there was just so much pressure to make this thing work that the concerns were ignored, and people died as a result. And in the document that we started talking about in our last episode, which is...certainly you can look it up yourself. It's titled...this was published by NASA titled Elements of Engineering Excellence. It was published in 2012. They made a list of root causes behind the major problems that NASA had had over the decades previous. Last time, we talked in depth about the importance of hands-on experience, that unless you have people who really have, you know, kind of gotten their hands in the work and understand it deeply, then you're going to miss stuff. The second point is what they call normalization of deviances. They also refer to it as not questioning anomalies. I'll quote from the report, "As was evidenced in the Challenger failure, we see deviations, and they're not quite normal, but seem to have no major consequence. After seeing these deviations a few times, we accept them as normal and ignore them. The result is a major failure where the deviation becomes catastrophic." So, that's our main topic for today, is the importance of questioning those anomalies and being able to see that signal inside, you know, a bunch of noise, because there's always noise [crosstalk 05:18] DAVE: There are a couple of interesting extrapolations on that as well. WILL: So, I have some thoughts about, like, sort of, like, these sort of, like, normalization of deviances and ways that it can go wrong. But, like, I suppose, like, and maybe this is just my priors, but I'm very much a believer in, like, a move fast and break things sort of ethos. Like, I'm familiar with heavily regulated, heavily controlled industries where, rightly or wrongly, there are high stakes, people die, right? 
And let me tell you right now that there is a cost to that. There's a substantial cost to that. And I do think that technology, in general, is pretty out of control in terms of, like, accountability, right? I mean, if you look at, like, you need a license to braid hair [laughter]. But, like, I didn't even need to graduate high school to do the job I'm doing. I just needed to convince somebody to give me a shot and then not get fired for long enough, and then you're in. DAVE: And we're writing software that handles people's money for them. WILL: Yeah, to the tune of billions of dollars, you know? And it's just like, "Yeah, you know, he sounded like he knew what he was doing. Let's roll," you know, which is fun. I think that's wrong. But, I mean, you can't argue with results of the industry that we've been in, right? And I think there's benefits there, and there's a lot of stuff where, yeah, you can let it slide until it blows up. You could do that. That's a strategy. It's a valid [crosstalk 06:56] MIKE: Well, and not only is it a strategy. It's a critical one. WILL: Initially, right? MIKE: Yeah. Well, absolutely. And even in regular life, you can't pay attention to everything. Attempting to do so would not end well, right? Our brain is very good at removing extraneous information. You can't pay attention to everything. So, you have to prioritize what you actually give attention to, and that better be the important stuff. DAVE: There's a rule in insurance, which is if you can afford to replace it, don't buy the insurance. But if you can't afford to replace it, don't even ask how likely it is that you're going to lose it. You have to get the insurance. The entire science of risk assessment is getting people to stop thinking about reducing the likelihood of a catastrophic fault and dealing with the case of when it is catastrophic, right? 
It's like, if you're going to take, "Oh, this hash collision can happen one in 10,000 times, and it will bankrupt the company," and I turn the PR back to you, and you say, "Okay, well, I've reduced the likelihood to one in a million times, and it still bankrupts the company." No, I'm not going to approve that PR. You're trying to reduce the likelihood of something that will end us all, when what I need you to do is mitigate it, right? The META rules for...M-E-T-A: Mitigate, Eliminate, Transfer, or Accept on any given risk, right? And if we can't accept it, then you have to mitigate, eliminate, or transfer. And the thing about the Challenger discovery that I love is that it's mirrored by SpaceX. One of their first unmanned rockets went up, and they start cheering. It gets off the launch pad; they're screaming; they're going nuts, and then it explodes. And somebody opens champagne, and they keep screaming and cheering. And the reason...it was unmanned. There were no people on it. And they were normalizing science. They were saying, "This is successful collection of a data point, and the risk that we assumed was entirely mitigated." Once it was off the launch pad, they said it was all icing on the cake. This is absolutely 100%. We accept this risk. We are in the black for days, right? We can burn this rocket, and it's fine. And they used that to normalize that and create a culture of psychological safety, and let's move forward with this. But the normalization of deviance is kind of based on this weird thing where humans tend to reach for confirmation, and when you're trying to prove that a rule doesn't hold, the only thing you care about is exceptions to the rule. Is there a thing that violates the rule? Like, I can't remember the name of the rule. There's a really cool psychological test that I learned last week, where you set out four cards with numbers and letters on them. I'll dig it up for later in the call if it's relevant. 
But the important thing is, you're like...if I tell you every person in this bar that's drinking alcohol must be over 21 and I ask you, "Tell me if that's true or not," you know that if somebody is over 21, you don't need to know what they're drinking. And if somebody's drinking a soda pop, you don't need to know how old they are, right? But we go for that. You're like, "You're 35. Are you drinking a beer?" That's the confirmation case. You need to be looking for counter cases. Are you underage and drinking booze? If you're drinking booze, are you underage, right? Those are the counter cases, and that's the only thing you care about when you're trying to prove a negative. When you normalize deviance, you are throwing away the counter cases and grabbing confirmation, confirmation. And, eventually, your META rule, you end up accepting a risk, and it's catastrophic. WILL: So, one of the things that I worry about, right, is this sort of psychological need for control, right, and, like, people's psychological need for control on emergent systems that are nearly impossible to fully model inside your brain. And, like, all of us can think of examples of really catastrophic failures. We've all blown things up. We have blown the rocket up. And we've blown the rocket up to a degree where it's like, "Hey, you know, we could lose the company. This company could not be a company, and we could all have to work somewhere else very soon." We could all think of those. And so, the question becomes, right, is there a productive safety culture that can really eliminate, like, really, like, look at these deviances to a specific and scoped way where you can get a level of certainty where the juice is worth the squeeze, and you're not just sort of navel-gazing and being, you know, petty and fooling yourself, right? You're trading velocity for the illusion of control. DAVE: Right. The fun police or the policy wonks on your team they just want to slow things down. 
It's the foolish consistency is the hobgoblin of little minds. It's like, we followed every checkbox, and we did absolutely nothing wrong, and that's why the company went out of business, because we didn't make any money. Yeah. You're focused on the wrong things. WILL: Well, yeah, absolutely, absolutely. And it's just, like, killing your productivity, killing your velocity. Like, I have run into this in many respects where people will...One thing that I've seen go dangerously awry is people's focus on shallow indicators of code quality, you know? Like, where every [inaudible 12:31] DAVE: 82% C0 code coverage. WILL: Yeah. Well, no, I mean, I don't know. Like, where you'll go through, like, three rounds of code reviews with no substantive architectural improvements, right, like lateral moves because people don't know what's important and what's not important, but they definitely want to put their stink on it, you know? MIKE: Well, I've been thinking a lot about this importance thing that you brought up, because it matters. So, I've been thinking about analogies. So, I haven't thought about this in a while. When my oldest was around eight, we threw a rocket-themed birthday party, and we had fun. Everybody who came made little model rockets and launched them. And we made what they call rocket candy. You mix sugar and potassium... DAVE: The sugar rocket fuel, potassium nitrate? MIKE: Yeah, potassium nitrate. DAVE: Sugar rockets, yeah. MIKE: Yep. And we made some of that, and lit it on fire and watched a big fire [laughs]. And we made some thermite. WILL: Oh! [laughs] DAVE: I want to go to your birthday parties. Holy crap. MIKE: [laughs] And lit that with magnesium ribbon, and that was fun having molten metal [laughs] in the backyard. And I'll tell you, so the model rockets, heavily controlled. Those things have been, like, standardized for I don't know how many decades. They made them. 
They put their own engines in those, and they were able to launch them, and everybody laughed, and it was great. So, you know, eight-year-olds wandering around with rockets in their hands. The rocket candy, everybody was at least 10 feet away, right [chuckles]? It was enough. And the thermite, everybody was at least, like, 30 feet away, right [chuckles]? They were on the other side of the property. They could see it, and they could see the fire. There was no eight-year-old anywhere close because not safe. And there were very different rules applied for each of those different risk levels, and that was important to identify because if I had tried to make all the eight-year-olds obey those thermite rules with their rockets, they'd have had no fun, and they wouldn't have known where the boundaries really were, right? They wouldn't have known, "Well, I actually need to be careful of this," because you're treating me like this rocket is super dangerous that's not actually that dangerous, you know, there's a lot of controls around it. And so, I don't really know where the boundaries are. And it was really important to establish ahead of time, well, what's sensitive and what's not? Because that allowed us to pay attention to the things that really mattered. MATT: Thermite and firearms, favorite games. DAVE: Thermite and firearms, yep. There's an interesting...It's not the reverse of it. It's not even the countercase. It's an agreement in it in, like, the shadow of it, which is kind of going back to, like, the Challenger disaster. We had normalization of deviance, and it isn't that cold O-rings kill people. It isn't that foam is falling off a shuttlecraft is deadly. It's the bigger thing hiding behind it that you're normalizing this thing. But if you're not paying attention to this, what over here is even bigger that you're not paying attention to? 
And I had two things...I learned this lesson really well because I had it happen in two places kind of at the same time, information that came in. One of them was I was in a fast food restaurant. We walked in, and it wasn't busy. All the tables were filthy, and it wasn't, like, immediately after the lunch rush. And the guy that I was with, he's like, "No, we're going." And he turned around. I'm like, "But, I mean, what are you talking about? The food here is pretty good." And he says, "No, come." So, we went to another restaurant, and I'm like, "What was all that about?" And he says, "Here's the thing. If they're not cleaning the tables out in front where we can see them, what do you think the kitchen is like where we can't see it?" I'm like, "Oh, that's a really good point." That same week, I had started a new job at a company in Salt Lake City that...they're like a Groupon clone. They were doing financial, you know, manipulation stuff, like, batching together coupons and stuff and deals for people. And the CTO had a really cool hip line of like, "We move fast, and we break things, and we only fix it... We don't polish rivets here. We only fix things good enough to ship it." And I'm like, "All right. Yeah, let's get some money. We're a startup. Let's absolutely do that." So, I hired on, and my first day, he's like, "Okay, cool. We need to get you on the board for the pager." I'm like, "Okay, well, pager isn't my big thing." And I said, "How often does the pager go off?" He says, "Oh yeah, you're going to need the pager every night because every night, you have to reboot the server at two o'clock in the morning." "Your production server goes down every single night, and you consider this business as normal?" "Oh, yeah, it's totally fine." I turned in my badge. I walked out. I quit the same day, the first day. And the pager was the thing that did it. 
And it wasn't the pager; it was, if this is okay to you, you've just told me a lot of things about how much you value my good night's sleep and my value as a person, but also everything else in the company. And when I found out a year later that the CEO was on trial for financial fraud, I'm like, "This surprises me exactly not at all." Like, everything in that company was move fast and take what you want, and hope you don't get caught. WILL: You already said Utah startup. DAVE: Yeah, yeah, exactly. Utah startups, yeah, yeah. Tell people -- MATT: I feel like [inaudible 17:48] MIKE: Acima was a Utah startup [laughs]. MATT: I feel like we've crossed paths 20 years ago. And I'm sure of it now. Because I built one of those companies that was doing the same thing back at the same time. DAVE: Oh wow. MATT: That was acquired by the other company. DAVE: Oh, interesting. We'll have to talk offline. MATT: And the founder also happened to end up, I believe, in prison. So, yes. DAVE: Yeah. We'll have to talk afterwards. That's fantastic. That's the other fun thing is that the Ruby community is a small, small world. Yeah. MATT: Going back to what Mike said and then you extended on, it's constraints, right? And then we're talking about architecture. And while things may not fail on their own, when you put them together in a systems architecture, and then you apply pressures, that's when you start to see failures, right, of those constraints. And I think a lot of people overlook that architecture as a whole and are losing sight because they can't see the forest above the trees, or through the trees rather. WILL: Yeah, but, you know, I guess, like, and I don't know whether we'll be able to, like, resolve this to, like, a satisfying conclusion. But, like, people talk about, like, the Challenger disaster. This guy was talking about, right, I think it was foam, right? Like, foam was falling down and damaging the heating tiles, right? 
And that compromised the thermal integrity of the space shuttle, which made it blow up, right? And they told him, "Hey, shut up. Shut up. We got to get this thing in the air, you know, whatever. The project's got to go," right? And so, we all, because of the tragedy and the benefit of our beautiful hindsight vision, we're all like, "Oh, well, obviously, this man is a hero. These evil, greedy executives were the villains," you know what I mean? And one guy wears the white hat, one guy wears the black hat, and then boom, [claps], put a bow on it, ship it. But, I mean, just because, like, I am the way I am, I always think about like, okay, well, how many times did the executives say, "Shut up. It's fine," and they were right? You know? You know what I mean? And, like, I don't have that information, but I do know for certain that there is this temptation for all of us to assume that if we just do everything well enough, it's not going to blow up in our faces because we can have it under control. MIKE: So, I want to take that and combine it with the SpaceX example that was brought up before. It's going to blow up. If we're talking about rockets, yeah, they're going to blow up. You're not going to start a rocket company or, you know, a government rocket program that's not going to have a lot of things blow up. If you go into it with that mindset and start blowing things up on purpose, say, "Yeah, I'm going to have blow...these things are going to blow up," it changes your approach to the problem versus saying, "I'm going to try to control absolutely everything so that nothing will blow up." MATT: Try to test your failures. Push the [inaudible 21:15]. MIKE: Exactly. Yeah, fail on purpose. Learn from it. MATT: Yes. And I'm wondering, you know, and I don't, again, like you just stated, Will, I don't have the information. However, that foam may have very well passed temperature testing, right? However, you add velocity to that; did they test it at velocity? 
Because things get fed more oxygen. They ignite more quickly, and, exponentially, things go bad. So, it also illustrates test your edge cases, right? That's an important thing. You can't always predict edge case. But as Mike just stated, you need to try, right? Try to determine your failures. Try to test those failures, and you're going to have much better success than just saying, "Okay, no, we want to make it perfect. Here's our MVP. This is best-case scenario. Everything's successful. Let's send it." MIKE: So, the testing to failure is very different from testing that it meets certain parameters. If you test, "Oh yeah, it didn't fail within these parameters," and the failure point was, like, 1% away from that, you have no idea whether it's 1% away or you've got, you know, tons of headroom, you know. MATT: That's right. Test it till it breaks. MIKE: Yeah. And that approach, that change in mindset, that very fundamental change in mindset is a big deal. And it's kind of the difference between waterfall-style software development and agile development is, in one case, you try to control everything and inevitably fail [laughs]. And the other approach you say, "I can't control everything, so I'm not going to try to. Instead, I'm going to take an alternative approach where I build a small prototype, test it out, and go into a loop so that I know far more about the process as I'm going on." So, you plan...You're still planning, but you're doing just-in-time planning rather than attempting to cover all your variables before you could possibly know all the details. WILL: Yeah. And, like, some stuff you got to test in prod. That's one of the things, I mean, like we talk about, like, SpaceX, right? I believe, you know, we return to the analogy, right, where they blew up that rocket, and they were so happy about it. Like, they were pretty sure that rocket was going to blow up. They didn't want the rocket to blow up. I don't think they were trying to blow the rocket up. 
They were trying really hard to not blow the rocket up. But even still, they were like, "There's no way fucking way this makes it all the way," you know? And so, when it got off the launch pad or whatever and it blew up, you know, on the first stage decoupling, and it was just like, "That's a great win." I think that's...they've embraced, like, you know, the futility of the illusion of control, where, like, you just can't test a rocket on the ground. You can't do it. MATT: No. And you can't predict everything. I mean, let's face it, this is reality. There is no way to predict every variable. And, you know, some of us on this call witnessed Challenger. You know, I remember sitting with a group of children watching it live on TV and then watching it happen, and I will never forget it. I can picture everyone next to me, their face, the reaction, you know, similar to 9/11, same thing. But you can't predict everything. But you can force failure. MIKE: So, if you set up a [inaudible 24:56] DAVE: Yeah. So, at the end of the day, we're all testing in prod every single day. MIKE: Well, yeah. But if you accept a culture of monitoring where you are looking at the anomalies and paying attention to them [laughs] and doing something about it, this is kind of where we launched this conversation, right? Then rather than trying to...It's the opposite of control, right? You assume, I don't have control, so I'm going to watch everything I possibly can to see when things start going out of bounds. So, you develop a monitoring culture rather than a control culture. And I think that's a big deal. Like, we talked about SpaceX. I'm sure they had all kinds of instruments on that rocket that blew up to figure out what went wrong in every possible way [chuckles]. They didn't know what would go wrong, but they knew something would, so they instrumented that thing to death. "Let's look for all the anomalies we can see." And the next rocket, I bet they did something for almost all of them. 
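The monitoring culture Mike describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything the hosts or their teams actually run: the `DeviationMonitor` class, the `error_rate` metric, its bounds, and the `escalate_after` threshold are all hypothetical. The one idea it tries to capture is that a deviation seen repeatedly becomes a louder signal instead of being accepted as normal.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DeviationMonitor:
    """Hypothetical monitor: compares readings against expected bounds."""

    bounds: dict[str, tuple[float, float]]  # metric name -> (low, high), assumed values
    escalate_after: int = 3                 # assumed number of repeats before escalating
    counts: Counter = field(default_factory=Counter)

    def record(self, metric: str, value: float) -> str:
        low, high = self.bounds[metric]
        if low <= value <= high:
            return "ok"
        # Out of bounds: count it. A repeat is a stronger signal,
        # never a reason to treat the deviation as the new normal.
        self.counts[metric] += 1
        if self.counts[metric] >= self.escalate_after:
            return "escalate"
        return "investigate"


# Illustrative readings only; the numbers are made up for the example.
monitor = DeviationMonitor(bounds={"error_rate": (0.0, 0.01)})
statuses = [monitor.record("error_rate", r) for r in (0.005, 0.02, 0.03, 0.04)]
print(statuses)  # ['ok', 'investigate', 'investigate', 'escalate']
```

The design choice worth noticing is the counter: a system that silences an alert after the third occurrence encodes normalization of deviance, while one that escalates on the third occurrence encodes the opposite habit.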
MATT: I think this speaks to culture as well, you know, NASA versus SpaceX. And I will admittedly say that I am a fan of what Elon's doing. Like, I will not hide that because he is innovation king. But you operate under government regulation, bureaucracy, constraints, and then you go privately held with someone who's a visionary, wants to push boundaries. You see the success rates, right? And those success rates are exponential with what SpaceX can do versus what NASA can do. We haven't... I mean, we're, as far as I know, we're still on x86 architecture on the space shuttle, and I can guarantee you SpaceX is not. MIKE: Well, and that culture there, you know, we can ascribe it all to one guy, but it's not, right? I mean, there's Gwynne Shotwell [inaudible 26:45] WILL: I'll give Elon credit. MATT: He's the visionary behind it. MIKE: I'll give him credit. I'm not taking away all the credit. But -- MATT: He's the visionary behind it. MIKE: You can't just have one person. One person is not culture. MATT: No, but it starts at the top. DAVE: Well, and we live in a world of identity politics right now, and I think a peace offering we can say on both sides of the aisle is that the people that don't like the identity have a problem with that person, right? With Elon. I don't hear anybody on either side being upset about electric cars, or about having a space program, or maybe getting off the planet and saving humanity, or dealing head-on with the AI potential extinction of the race. We like what's going on. We like what's coming out of there. So... WILL: Well, you know what I mean? Like, I actually, like, I really like this analogy because I think as you look at other things, you can see this culture. You can also see the limitations of it in that he wanted to build out a rocket company from scratch, right? And he wanted to do it in the most capital-efficient way that he could. 
And he, I think, correctly ascertained that, like, okay, the fastest way to get to orbit is to test it in prod, right? So, blow up a lot of rockets, right? Like, I'm not going to spend years and years and years in the wind turbine, you know, like all this stuff. We're going to shoot some rockets off, and we're going to blast them into space. We're going to see how things go. We're going to learn and iterate very rapidly, right? And they're all going to be unmanned rockets, which... initially at least, right, NASA couldn't do that. That wasn't on the menu for NASA. Or well, I mean, I guess that's not true -- DAVE: Well, because, like, Sputnik and the [inaudible 28:33] stuff, sure they did, yeah. WILL: Yeah. Well, no, no. When NASA got started, they had a lot of acquired experience with one-way rockets from... DAVE: That's fair. WILL: Very [inaudible 28:34] MATT: Yes, yes, yes. I know where you're going with that one. DAVE: Yes, yes. WILL: They were doing one-way trips almost from the very beginning. But regardless, right, the point I want to make, though, is there comes a point where this move fast and break stuff thing and the complexity around the emergent system starts to consume you, starts to swallow you whole. And what we have seen, like, I'll go ahead and call it. I don't want this to be the Elon show, but, like, they've been advertising robotaxis for a very, very long time. And I think that system, the complexity has gotten out of hand on it. And I don't think those robotaxis are coming because I think the move fast and break things, iterate quickly, kind of messy architecture culture has...I think the tech debt around autonomous driving has completely stalled out their progress. I think they're stuck, frankly. MATT: Well, I think...and you kind of led me to a perfect segue here. And I'm going to go extremely, extremely old school and maybe a little bit off topic, but it takes visionaries to change the way we do things. 
I'll go back centuries: Leonardo da Vinci pushing the boundaries, trying things that everyone else thought he was absolutely insane. Next, Nikola Tesla and how he was obsessive, and it destroyed his life. He died broke and alone. But he changed absolutely everything for the world, right? And we need that. You can't get stuck in technology, bureaucracy. You need innovation. You need to push boundaries. You need to test outside of those boundaries to really make progress. And I think, to me, and, y'all know me, that's the most important thing there is to me when it comes to the world of technology and the things I do and what I'm trying to push. And sometimes I'm going to be wrong, but you have to be wrong to become right. MIKE: Well, let me take that. So, you talk about the robotaxi and the visionary. Yeah, I think you have to be a visionary, and sometimes you have to admit that you're wrong. I think that, yeah, the robotaxi's not been successful, and part of that is it's been thus far technologically impossible [chuckles]. There are challenges to making that happen that nobody has solved yet. And -- WILL: What are you talking about? They're done. You can ride in one. MIKE: You can, with Waymo, because they didn't say, "Hey, we're going to end-to-end learn this." They said, "It's not within modern tech, so we are going to have a really sophisticated 3D map of the environment we're going to work in. We're going to use LiDAR on top, and so we're going to use some algorithms to locate where that vehicle is every time given that 3D map. And all that vehicle's going to do is follow the map." And that's what they do, and then they just use a little bit of the machine learning. I mean, they still have to have the vision to look for a collision. So, they're doing some collision avoidance, but they're solving a different problem. They decided this tech isn't here, so let's solve the problem with technology that actually does work, and so they're successful. 
So, they've approached the problem a different way. Now, you need to try stuff that fails sometimes, right? So, I think it was great to say, "Hey, let's try this with end-to-end learning. Let's see what we can do." At some point, you might need to realize it's going to bankrupt your company trying to do it because it's not going to work. Sometimes it works; sometimes it doesn't. Yeah, you need to experiment, and sometimes it's not going to work. MATT: That started something revolutionary, though. Yes, they have constraints, right? They can only do it in x amount of cities where roads are in certain conditions, because of LiDAR, and, you know, collision detection is vector maps and machine learning gets a little scary, to me, because probability versus determinism. But you have to start somewhere. MIKE: You do have to start somewhere. MATT: And what they're doing is going to revolutionize the industry, and it's going to change the way we navigate the roads. WILL: Tesla? DAVE: I think we're solving it from the other end, much, much farther than we've ever been. I overheard Uncle Bob Martin talking. He's got a Cybertruck. Now, I don't like the Cybertruck. That's a personal aesthetic thing for me. Honest, I joke with people that somebody designed a nice, beautiful SUV, and they modeled it in 3D, and they accidentally sent the bounding box to the fab of the render [laughter]. And that's what they got back. But Uncle Bob owns a Cybertruck, and he put on Twitter a little while ago that he's put, like, 100,000 miles on it in three years, and 80% of it has been auto drive, and, like, he won't live without it. For somebody to be that much of a road warrior and to straight up say, "80% of this is solved," we've never been that far, and every year we get closer and closer. And Waymo, they have to solve that other 20%, and, like, 5% of it is, like, road construction problems that LiDAR can't deal with, so they cut that off. 
They probably just won't deliver you to those areas, right? So, they stay within that. And we're getting it closer and closer, and that's what kind of excites me. Circling back to AI a little bit, I said, like, five or six years ago when the self-driving cars were coming, I joked to somebody that, like, we look at AI, and we say, "It'll never be there. It'll never da, da, da, da, da." But our kids are going to talk to each other and go, "Can you believe Grandpa got in that 2,500-pound machine of death and controlled it by hand at 100 feet per second? Are you nuts?" right? We're seeing this with vibe coding. We're past the tipping point. There are companies now that are literally saying, "Why would you let a human touch the crypto code?" Or, "Why would you let them touch this piece of the security stuff?" And by next year, like, 80% of software...I don't know if it's 80%. WILL: [laughs] DAVE: But anytime I make these bets, I always take the under, and I always win if I take the under because it's going to hit, and it's going to hit faster than I think it's going to. And I'm calling it, like, next year that over half of the code that we do in prod is going to be...it's not going to be vibe code. We're not going to use that word because it's a four-letter word, but reliable automation in prod that's handled by an automated system with, you know...And I don't want to get into that. It's a different podcast. WILL: Not a chance. Not a chance. MATT: However, I will get on that bet with you. WILL: No way. Not, not -- MATT: Just based on some of the things I'm aware of. WILL: I would say, like, I don't know, I mean, maybe it's just the domain that I work in, but, like, 80% of the code that I write these days is generated by an LLM. But there's not a snowball's chance in hell that that LLM is ever going to replace me. The LLM cannot exist without me. DAVE: Oh, yeah. MATT: No. WILL: I can exist without the LLM. MIKE: And nobody's arguing otherwise. MATT: Yeah. 
I don't -- DAVE: I think we're all in violent agreement here, yeah. MATT: Yes. Nobody is replacing humans. At the end of the day, there has to be a human accountable for what goes out. Accountability and ownership is 100% important. Like, you cannot avoid that; otherwise, we end up in chaos, right? And then we see drift everywhere, and hallucinations, and Wild Wild West, worse than we've ever seen in the history of humanity. Like, there has to be accountability. DAVE: You've just named the next problem that we have to solve, yeah. Every time somebody says, "AI can't draw a hand with fewer than six fingers," the AI community says, "All right, bet. We'll see you in two more model revisions." MIKE: Well, and I think you just -- MATT: And every week, it changes. MIKE: You just brought it back to where we started. If we have a culture without accountability, then bad things happen. But if you -- DAVE: That's normalization of deviance, yeah. MIKE: Normalization of deviance. But if you're watching this thing, and you're developing all kinds of metrics to say, "I want to make sure this code has high quality," and you're establishing those standards and building the constraints to make sure that it is high quality, then you can get to that confidence because you watched it. WILL: Well, I mean, and, like, one thing that I'm very, very excited about AI in that one thing that it is good at, really good at, is, like, just petty ditch-digging work that people cannot stand. And I'll give you an example of, like, sort of, like, a normalization of deviance in ways that I think big and small, right? We need to, like, you know, we're asking, like, is the juice worth the squeeze? Well, in certain capacities, absolutely, it's worth the squeeze. I'll give you one easy and one hard example. 
Like, one thing, I'm looking at a build for a product that is getting a lot of usage, and right now I see 3,874 compiler warnings, of which I am certain at least 3,000 of which are completely spurious, completely nonsensical, totally useless. 500 are nice to have, get out ahead of this deprecated API, you know, you can get around to that. And 174 are a big, big problem. And I don't know which ones are which. And it'll be a real hairball, and, like, you know, in all honesty, right, because we've normalized deviance to think builds, to think [inaudible 39:02] that pass QA, right? And it's just like, "Ah, that warning's just a warning. It's just nothing," you know what I mean? "Let's tape over that check engine light because I got to get this release out on time." And that's a big problem because I guarantee you there are at least 100 warnings in there that are bombs waiting to go off. And everybody's build is like that. Everybody's test suite is like that. Everybody's...There's nobody who's like, "Okay, I'm going to run my test, run a real, you know, rake test," and, like, that output comes out squeaky clean. You know it doesn't. You know it doesn't. But it could, and maybe it [inaudible 39:43]. And that's a normalization of deviance, like a real serious problem. DAVE: Broken window syndrome, yeah. WILL: Yeah. Well, I mean, like, there's no reason that we couldn't clean these out. I wouldn't even know. There's a really good reason. Development time is expensive, and the juice isn't honestly probably worth the squeeze. DAVE: It's not always. Yeah, it's always a long tail or Pareto's rule, yeah. WILL: But my helper monkey, he doesn't get tired, you know? As long as the lights are on at the data center, he'll clock into work. DAVE: As long as I've got tokens. WILL: I have to check the GB thing, which is going to be burdensome. But I'm saying that's an easy one, right? And that's an easy win of like, "Oh, look, we can fix this. We can work this out." 
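One way to picture the cleanup Will describes is a warning "ratchet": record how many warnings of each kind the build has today, then flag any category that grows, so new warnings stand out from the thousands already tolerated. The following is a minimal sketch under stated assumptions, not a description of any real build system discussed in the episode. The log format, the bracketed category names, the baseline numbers, and the function names `count_warnings` and `ratchet_violations` are all hypothetical.

```python
import re
from collections import Counter

# Assumed (hypothetical) log format: "path:line: warning: message [category]"
WARNING_RE = re.compile(r"warning: .*\[(?P<category>[\w-]+)\]\s*$")


def count_warnings(build_log: str) -> Counter:
    """Count warnings per bracketed category in a build log."""
    counts = Counter()
    for line in build_log.splitlines():
        match = WARNING_RE.search(line)
        if match:
            counts[match.group("category")] += 1
    return counts


def ratchet_violations(baseline: dict, current: Counter) -> dict:
    """Return {category: (allowed, seen)} for categories that grew past the baseline."""
    return {
        category: (baseline.get(category, 0), seen)
        for category, seen in current.items()
        if seen > baseline.get(category, 0)
    }


# Made-up sample data for illustration only.
sample_log = """\
src/pay.c:10: warning: unused variable 'x' [unused-variable]
src/pay.c:42: warning: 'old_api' is deprecated [deprecated-declarations]
src/pay.c:77: warning: comparison of integers of different signs [sign-compare]
src/pay.c:90: warning: unused variable 'y' [unused-variable]
"""
baseline = {"unused-variable": 2, "deprecated-declarations": 1}

violations = ratchet_violations(baseline, count_warnings(sample_log))
print(violations)  # {'sign-compare': (0, 1)}
```

The design choice here is that the existing pile is accepted as a known baseline rather than fixed all at once, but nothing new is allowed to hide inside it. Lowering the baseline numbers over time is the part that tireless automation could take on.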
I think a more significant issue is due to the continual degradation and debasement of my intellectual and emotional health, I don't keep a lot of apps on my phone for consuming media, and I use the internet as it stands, right? Like, if there's, like, a Reddit article that somebody sends me, I just look at it on a mobile website. And I don't know if you guys have noticed this, but, like, the mobile web, like, the internet in general, is in a dire dumpster fire state. It's not like the internet does new things. It's just terrible. It's terrible, and it's gotten bigger and bigger and bigger, and worse and worse and worse. Page sizes have gotten larger, and APIs have gotten slower and more bloated, and we're loading more stuff for no reason. And, like, all of our performance is degrading and degrading and degrading and degrading and degrading. That has absolutely direct financial business impacts. And every single organization I have ever been a part of or interacted with on any level has suffered from this. Normalization -- MATT: Yeah, and I would -- WILL: "Oh, well, it's only 10 milliseconds slower," you know? "It's only a megabyte bigger." MATT: I won't get into the psychological and physiological aspects of that, but there's definitely an impact. Synapses are being reprogrammed, and attention spans are 30 seconds when they used to be hours. And, you know, the world has changed. MIKE: Well, we're kind of scratching on this into a different topic. So, I'd like to bring this together, this idea of normalizing, you know, these deviances. We've talked about changing culture, right? To having a culture where we pay attention to the data, and that changes things when you do so. And it's easy to let it slide. The default is to let it slide, and the entropy happens, right? But if you flip that and say, "Well, we're going to focus on watching stuff," it makes a fundamental difference, and can even, in some cases, lead to avoidance of tragedies. 
And I think that's a good place to end this. Until next time on the Acima Development Podcast.

Episode 100: Normalization of Deviance

This episode of the Acima Development Podcast centers on "normalization of deviance" — the pattern where small anomalies get repeatedly ignored until they cause catastrophic failures. Mike opens with the Space Shuttle Challenger disaster as the anchoring example: engineers warned that cold O-rings could fail, but their concerns were drowned out by schedule pressure and accumulated tolerance for small deviations. The crew connects this to the Columbia disaster years later, where the same organizational lesson went unlearned, and to NASA's own "Elements of Engineering Excellence" report, which lists not questioning anomalies as a major root cause behind their biggest failures. The conversation then wrestles with the tension between safety culture and velocity. Will pushes back on pure risk-aversion, arguing that heavy regulation has real costs and that tech's "move fast and break things" ethos has produced enormous value. Dave introduces the META framework (Mitigate, Eliminate, Transfer, or Accept) and contrasts NASA's culture with SpaceX, which celebrates blowing up unmanned rockets because the risk was already accepted and the explosion yields data. Mike reinforces this with an analogy from his kid's rocket-themed birthday party, where different risk levels (model rockets, sugar rockets, thermite) warranted very different safety boundaries — treating everything as maximum-risk would have obscured where the real dangers actually lived. The group lands on a key reframe: rather than trying to control everything, build a monitoring culture that instruments heavily, tests to failure, and pays attention to the signal inside the noise. The final stretch applies these ideas to current software practice, including AI-assisted development. Matt and Dave debate whether vibe coding will dominate production code soon, with everyone agreeing humans must remain accountable for what ships. 
Will gives concrete examples of normalized deviance developers live with daily: thousands of ignored compiler warnings (some of which are genuinely dangerous), bloated mobile web performance, and test suites nobody expects to run clean. He notes AI could finally make the ditch-digging cleanup work economically viable. Mike closes by tying it back to the opening theme: entropy is the default, letting things slide is easy, but flipping the culture toward actively watching the data is what prevents small deviations from becoming the next tragedy. Transcript: MIKE: Hello, and welcome to another episode of the Acima Development Podcast. I'm Mike, and I'm hosting again today. With me, we've got Will Archer, Dave Brady, and Kyle Archer. DAVE: Howdy, howdy. MIKE: I'm going to start with a story, as I typically do, actually two stories, but one funny and one not at all funny. I'll start with the funny one. My wife, when she was in her late teens, decided to drive with her sister to college. She wasn't going to college yet, but she decided to road trip with her sister to college. And they made sure the car was good the day before, had been doing some maintenance, and they cracked the case of the cooling fan for the engine. WILL: Oh. MIKE: So, when the fan was running, it was bumping against this cracked part of the case, so you can imagine the sound of that, not good, right? They actually took it to a mechanic and got kind of a loose sign-off that, "Yeah, well, this isn't going to make the car break, but it's going to sound terrible, and you should get it fixed soon," like, "Okay." And they drove cross-country [chuckles] with that thing rubbing the whole way. And what they did is they just turned up the radio, so full volume, full road trip. They drove for, like, two days [laughs] with the volume cranked up, just ignoring it. And she's told the story for years. It's funny, you know, everybody in the family laughs. 
You can just imagine, just turn up the volume, and the problem goes away. That is one way to make a problem go away. The other story is related to what we talked about in our last episode. And we're going to continue with the topic we talked about in the last episode, which is the Space Shuttle Challenger disaster, which happened in, let me check my dates here, '86. I believe this happened in... DAVE: '86? MIKE: '86. That's the date that I was remembering, so 1986. There it is: 1986. So, I actually looked this up. I read about it on Wikipedia. As a kid, I remember watching this [chuckles] in school, and it was, you know, horrifying. So, they had O-rings around the booster engines that, you know, like rubber or rubber-like material. And they had had record cold, I guess, at the launch pad the night before. And that cold caused the O-rings to, you know, shrink and stiffen. And so, in the launch, they lost integrity, so air started getting into the fuel. Eventually, that caused a catastrophic explosion, and the entire spacecraft disintegrated. I remember the horror of seeing those booster engines just randomly wandering, and there was not anything left of the main craft. It was a tragedy, you know, a horrible tragedy. Anybody who was around that time remembers. I was talking to somebody else like, "Oh, that was our JFK moment," you know? Everybody remembers that. Where were you when that happened? And it turns out the engineers had warned this might happen, and they were ignored, because there was enough noise in the data. They're like, "Oh yeah, well, there are so many things that can go wrong [inaudible 03:18] DAVE: Not just ignored, though, right? They were told to stay in their lane. MIKE: I think that's right. DAVE: If I recall. Yeah. They were told to be quiet, yeah. MIKE: The interesting thing about that is there was another space shuttle disaster some 20 years later or so, where Columbia broke up in re-entry. 
And the diagnosis afterward essentially said, "We didn't learn from the last time." There were likely problems that were pointed out by engineers, and there was just so much pressure to make this thing work that the concerns were ignored, and people died as a result. And in the document that we started talking about in our last episode, which is...certainly you can look it up yourself. It's titled...this was published by NASA titled Elements of Engineering Excellence. It was published in 2012. They made a list of root causes behind the major problems that NASA had had over the decades previous. Last time, we talked in depth about the importance of hands-on experience, that unless you have people who really have, you know, kind of gotten their hands in the work and understand it deeply, then you're going to miss stuff. The second point is what they call normalization of deviances. They also refer to it as not questioning anomalies. I'll quote from the report, "As was evidenced in the Challenger failure, we see deviations, and they're not quite normal, but seem to have no major consequence. After seeing these deviations a few times, we accept them as normal and ignore them. The result is a major failure where the deviation becomes catastrophic." So, that's our main topic for today, is the importance of questioning those anomalies and being able to see that signal inside, you know, a bunch of noise, because there's always noise [crosstalk 05:18] DAVE: There are a couple of interesting extrapolations on that as well. WILL: So, I have some thoughts about, like, sort of, like, these sort of, like, normalization of deviances and ways that it can go wrong. But, like, I suppose, like, and maybe this is just my priors, but I'm very much a believer in, like, a move fast and break things sort of ethos. Like, I'm familiar with heavily regulated, heavily controlled industries where, rightly or wrongly, there are high stakes, people die, right? 
And let me tell you right now that there is a cost to that. There's a substantial cost to that. And I do think that technology, in general, is pretty out of control in terms of, like, accountability, right? I mean, if you look at, like, you need a license to braid hair [laughter]. But, like, I didn't even need to graduate high school to do the job I'm doing. I just needed to convince somebody to give me a shot and then not get fired for long enough, and then you're in. DAVE: And we're writing software that handles people's money for them. WILL: Yeah, to the tune of billions of dollars, you know? And it's just like, "Yeah, you know, he sounded like he knew what he was doing. Let's roll," you know, which is fun. I think that's wrong. But, I mean, you can't argue with results of the industry that we've been in, right? And I think there's benefits there, and there's a lot of stuff where, yeah, you can let it slide until it blows up. You could do that. That's a strategy. It's a valid [crosstalk 06:56] MIKE: Well, and not only is it a strategy. It's a critical one. WILL: Initially, right? MIKE: Yeah. Well, absolutely. And even in regular life, you can't pay attention to everything. Attempting to do so would not end well, right? Our brain is very good at removing extraneous information. You can't pay attention to everything. So, you have to prioritize what you actually give attention to, and that better be the important stuff. DAVE: There's a rule in insurance, which is if you can afford to replace it, don't buy the insurance. But if you can't afford to replace it, don't even ask how likely it is that you're going to lose it. You have to get the insurance. The entire science of risk assessment is getting people to stop thinking about reducing the likelihood of a catastrophic fault and dealing with the case of when it is catastrophic, right? 
It's like, if you're going to take, "Oh, this hash collision can happen one in 10,000 times, and it will bankrupt the company," and I turn the PR back to you, and you say, "Okay, well, I've reduced the likelihood to one in a million times, and it still bankrupts the company." No, I'm not going to approve that PR. You're trying to reduce the likelihood of something that will end us all, when what I need you to do is mitigate it, right? The META rules for...M-E-T-A: Mitigate, Eliminate, Transfer, or Accept on any given risk, right? And if we can't accept it, then you have to mitigate, eliminate, or transfer. And the thing about the Challenger discovery that I love is that it's mirrored by SpaceX. One of their first unmanned rockets went up, and they start cheering. It gets off the launch pad; they're screaming; they're going nuts, and then it explodes. And somebody opens champagne, and they keep screaming and cheering. And the reason...it was unmanned. There were no people on it. And they were normalizing science. They were saying, "This is successful collection of a data point, and the risk that we assumed was entirely mitigated." Once it was off the launch pad, they said it was all icing on the cake. This is absolutely 100%. We accept this risk. We are in the black for days, right? We can burn this rocket, and it's fine. And they used that to normalize that and create a culture of psychological safety, and let's move forward with this. But the normalization of deviance is kind of based on this weird thing where humans tend to reach for confirmation, and when you're trying to prove that a rule doesn't hold, the only thing you care about is exceptions to the rule. Is there a thing that violates the rule? Like, I can't remember the name of the rule. There's a really cool psychological test that I learned last week, where you set out four cards with numbers and letters on them. I'll dig it up for later in the call if it's relevant. 
But the important thing is, you're like...if I tell you every person in this bar that's drinking alcohol must be over 21 and I ask you, "Tell me if that's true or not," you know that if somebody is over 21, you don't need to know what they're drinking. And if somebody's drinking a soda pop, you don't need to know how old they are, right? But we go for that. You're like, "You're 35. Are you drinking a beer?" That's the confirmation case. You need to be looking for counter cases. Are you underage and drinking booze? If you're drinking booze, are you underage, right? Those are the counter cases, and that's the only thing you care about when you're trying to prove a negative. When you normalize deviance, you are throwing away the counter cases and grabbing confirmation, confirmation. And, eventually, your META rule, you end up accepting a risk, and it's catastrophic. WILL: So, one of the things that I worry about, right, is this sort of psychological need for control, right, and, like, people's psychological need for control on emergent systems that are nearly impossible to fully model inside your brain. And, like, all of us can think of examples of really catastrophic failures. We've all blown things up. We have blown the rocket up. And we've blown the rocket up to a degree where it's like, "Hey, you know, we could lose the company. This company could not be a company, and we could all have to work somewhere else very soon." We could all think of those. And so, the question becomes, right, is there a productive safety culture that can really eliminate, like, really, like, look at these deviances to a specific and scoped way where you can get a level of certainty where the juice is worth the squeeze, and you're not just sort of navel-gazing and being, you know, petty and fooling yourself, right? You're trading velocity for the illusion of control. DAVE: Right. The fun police or the policy wonks on your team they just want to slow things down. 
It's the foolish consistency is the hobgoblin of little minds. It's like, we followed every checkbox, and we did absolutely nothing wrong, and that's why the company went out of business, because we didn't make any money. Yeah. You're focused on the wrong things. WILL: Well, yeah, absolutely, absolutely. And it's just, like, killing your productivity, killing your velocity. Like, I have run into this in many respects where people will...One thing that I've seen go dangerously awry is people's focus on shallow indicators of code quality, you know? Like, where every [inaudible 12:31] DAVE: 82% C0 code coverage. WILL: Yeah. Well, no, I mean, I don't know. Like, where you'll go through, like, three rounds of code reviews with no substantive architectural improvements, right, like lateral moves because people don't know what's important and what's not important, but they definitely want to put their stink on it, you know? MIKE: Well, I've been thinking a lot about this importance thing that you brought up, because it matters. So, I've been thinking about analogies. So, I haven't thought about this in a while. When my oldest was around eight, we threw a rocket-themed birthday party, and we had fun. Everybody who came made little model rockets and launched them. And we made what they call rocket candy. You mix sugar and potassium... DAVE: The sugar rocket fuel, potassium nitrate? MIKE: Yeah, potassium nitrate. DAVE: Sugar rockets, yeah. MIKE: Yep. And we made some of that, and lit it on fire and watched a big fire [laughs]. And we made some thermite. WILL: Oh! [laughs] DAVE: I want to go to your birthday parties. Holy crap. MIKE: [laughs] And lit that with magnesium ribbon, and that was fun having molten metal [laughs] in the backyard. And I'll tell you, so the model rockets, heavily controlled. Those things have been, like, standardized for I don't know how many decades. They made them. 
They put their own engines in those, and they were able to launch them, and everybody laughed, and it was great. So, you know, eight-year-olds wandering around with rockets in their hands. The rocket candy, everybody was at least 10 feet away, right [chuckles]? It was enough. And the thermite, everybody was at least, like, 30 feet away, right [chuckles]? They were on the other side of the property. They could see it, and they could see the fire. There was no eight-year-old anywhere close because not safe. And there were very different rules applied for each of those different risk levels, and that was important to identify because if I had tried to make all the eight-year-olds obey those thermite rules with their rockets, they'd have had no fun, and they wouldn't have known where the boundaries really were, right? They wouldn't have known, "Well, I actually need to be careful of this," because you're treating me like this rocket is super dangerous that's not actually that dangerous, you know, there's a lot of controls around it. And so, I don't really know where the boundaries are. And it was really important to establish ahead of time, well, what's sensitive and what's not? Because that allowed us to pay attention to the things that really mattered. MATT: Thermite and firearms, favorite games. DAVE: Thermite and firearms, yep. There's an interesting...It's not the reverse of it. It's not even the countercase. It's an agreement in it in, like, the shadow of it, which is kind of going back to, like, the Challenger disaster. We had normalization of deviance, and it isn't that cold O-rings kill people. It isn't that foam is falling off a shuttlecraft is deadly. It's the bigger thing hiding behind it that you're normalizing this thing. But if you're not paying attention to this, what over here is even bigger that you're not paying attention to? 
And I had two things...I learned this lesson really well because I had it happen in two places kind of at the same time, information that came in. One of them was I was in a fast food restaurant. We walked in, and it wasn't busy. All the tables were filthy, and it wasn't, like, immediately after the lunch rush. And the guy that I was with, he's like, "No, we're going." And he turned around. I'm like, "But, I mean, what are you talking about? The food here is pretty good." And he says, "No, come." So, we went to another restaurant, and I'm like, "What was all that about?" And he says, "Here's the thing. If they're not cleaning the tables out in front where we can see them, what do you think the kitchen is like where we can't see it?" I'm like, "Oh, that's a really good point." That same week, I had started a new job at a company in Salt Lake City that...they're like a Groupon clone. They were doing financial, you know, manipulation stuff, like, batching together coupons and stuff and deals for people. And the CTO had a really cool hip line of like, "We move fast, and we break things, and we only fix it... We don't polish rivets here. We only fix things good enough to ship it." And I'm like, "All right. Yeah, let's get some money. We're a startup. Let's absolutely do that." So, I hired on, and my first day, he's like, "Okay, cool. We need to get you on the board for the pager." I'm like, "Okay, well, pager isn't my big thing." And I said, "How often does the pager go off?" He says, "Oh yeah, you're going to need the pager every night because every night, you have to reboot the server at two o'clock in the morning." "Your production server goes down every single night, and you consider this business as normal?" "Oh, yeah, it's totally fine." I turned in my badge. I walked out. I quit the same day, the first day. And the pager was the thing that did it. 
And it wasn't the pager; it was, if this is okay to you, you've just told me a lot of things about how much you value my good night's sleep and my value as a person, but also everything else in the company. And when I found out a year later that the CEO was on trial for financial fraud, I'm like, "This surprises me exactly not at all." Like, everything in that company was move fast and take what you want, and hope you don't get caught. WILL: You already said Utah startup. DAVE: Yeah, yeah, exactly. Utah startups, yeah, yeah. Tell people -- MATT: I feel like [inaudible 17:48] MIKE: Acima was a Utah startup [laughs]. MATT: I feel like we've crossed paths 20 years ago. And I'm sure of it now. Because I built one of those companies that was doing the same thing back at the same time. DAVE: Oh wow. MATT: That was acquired by the other company. DAVE: Oh, interesting. We'll have to talk offline. MATT: And the founder also happened to end up, I believe, in prison. So, yes. DAVE: Yeah. We'll have to talk afterwards. That's fantastic. That's the other fun thing is that the Ruby community is a small, small world. Yeah. MATT: Going back to what Mike said and then you extended on, it's constraints, right? And then we're talking about architecture. And while things may not fail on their own, when you put them together in a systems architecture, and then you apply pressures, that's when you start to see failures, right, of those constraints. And I think a lot of people overlook that architecture as a whole and are losing sight because they can't see the forest above the trees, or through the trees rather. WILL: Yeah, but, you know, I guess, like, and I don't know whether we'll be able to, like, resolve this to, like, a satisfying conclusion. But, like, people talk about, like, the Challenger disaster. This guy was talking about, right, I think it was foam, right? Like, foam was falling down and damaging the heating tiles, right? 
And that compromised the thermal integrity of the space shuttle, which made it blow up, right? And they told him, "Hey, shut up. Shut up. We got to get this thing in the air, you know, whatever. The project's got to go," right? And so, we all, because of the tragedy and the benefit of our beautiful hindsight vision, we're all like, "Oh, well, obviously, this man is a hero. These evil, greedy executives were the villains," you know what I mean? And one guy wears the white hat, one guy wears the black hat, and then boom, [claps], put a bow on it, ship it. But, I mean, just because, like, I am the way I am, I always think about like, okay, well, how many times did the executives say, "Shut up. It's fine," and they were right? You know? You know what I mean? And, like, I don't have that information, but I do know for certain that there is this temptation for all of us to assume that if we just do everything well enough, it's not going to blow up in our faces because we can have it under control. MIKE: So, I want to take that and combine it with the SpaceX example that was brought up before. It's going to blow up. If we're talking about rockets, yeah, they're going to blow up. You're not going to start a rocket company or, you know, a government rocket program that's not going to have a lot of things blow up. If you go into it with that mindset and start blowing things up on purpose, say, "Yeah, I'm going to have blow...these things are going to blow up," it changes your approach to the problem versus saying, "I'm going to try to control absolutely everything so that nothing will blow up." MATT: Try to test your failures. Push the [inaudible 21:15]. MIKE: Exactly. Yeah, fail on purpose. Learn from it. MATT: Yes. And I'm wondering, you know, and I don't, again, like you just stated, Will, I don't have the information. However, that foam may have very well passed temperature testing, right? However, you add velocity to that; did they test it at velocity? 
Because things get fed more oxygen. They ignite more quickly, and, exponentially, things go bad. So, it also illustrates test your edge cases, right? That's an important thing. You can't always predict edge case. But as Mike just stated, you need to try, right? Try to determine your failures. Try to test those failures, and you're going to have much better success than just saying, "Okay, no, we want to make it perfect. Here's our MVP. This is best-case scenario. Everything's successful. Let's send it." MIKE: So, the testing to failure is very different from testing that it meets certain parameters. If you test, "Oh yeah, it didn't fail within these parameters," and the failure point was, like, 1% away from that, you have no idea whether it's 1% away or you've got, you know, tons of headroom, you know. MATT: That's right. Test it till it breaks. MIKE: Yeah. And that approach, that change in mindset, that very fundamental change in mindset is a big deal. And it's kind of the difference between waterfall-style software development and agile development is, in one case, you try to control everything and inevitably fail [laughs]. And the other approach you say, "I can't control everything, so I'm not going to try to. Instead, I'm going to take an alternative approach where I build a small prototype, test it out, and go into a loop so that I know far more about the process as I'm going on." So, you plan...You're still planning, but you're doing just-in-time planning rather than attempting to cover all your variables before you could possibly know all the details. WILL: Yeah. And, like, some stuff you got to test in prod. That's one of the things, I mean, like we talk about, like, SpaceX, right? I believe, you know, we return to the analogy, right, where they blew up that rocket, and they were so happy about it. Like, they were pretty sure that rocket was going to blow up. They didn't want the rocket to blow up. I don't think they were trying to blow the rocket up. 
They were trying really hard to not blow the rocket up. But even still, they were like, "There's no way fucking way this makes it all the way," you know? And so, when it got off the launch pad or whatever and it blew up, you know, on the first stage decoupling, and it was just like, "That's a great win." I think that's...they've embraced, like, you know, the futility of the illusion of control, where, like, you just can't test a rocket on the ground. You can't do it. MATT: No. And you can't predict everything. I mean, let's face it, this is reality. There is no way to predict every variable. And, you know, some of us on this call witnessed Challenger. You know, I remember sitting with a group of children watching it live on TV and then watching it happen, and I will never forget it. I can picture everyone next to me, their face, the reaction, you know, similar to 9/11, same thing. But you can't predict everything. But you can force failure. MIKE: So, if you set up a [inaudible 24:56] DAVE: Yeah. So, at the end of the day, we're all testing in prod every single day. MIKE: Well, yeah. But if you accept a culture of monitoring where you are looking at the anomalies and paying attention to them [laughs] and doing something about it, this is kind of where we launched this conversation, right? Then rather than trying to...It's the opposite of control, right? You assume, I don't have control, so I'm going to watch everything I possibly can to see when things start going out of bounds. So, you develop a monitoring culture rather than a control culture. And I think that's a big deal. Like, we talked about SpaceX. I'm sure they had all kinds of instruments on that rocket that blew up to figure out what went wrong in every possible way [chuckles]. They didn't know what would go wrong, but they knew something would, so they instrumented that thing to death. "Let's look for all the anomalies we can see." And the next rocket, I bet they did something for almost all of them. 
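Mike's contrast between a control culture and a monitoring culture can be made concrete with a small sketch. The Python below is only an illustration, not anything the hosts describe building: the function name `find_anomalies`, the five-reading window, and the three-standard-deviation threshold are all assumptions chosen for the example. It keeps a record of every reading that drifts out of bounds relative to recent history, so the anomaly is captured rather than waved through.

```python
from statistics import mean, stdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return (index, value) pairs that deviate sharply from recent history.

    A reading is flagged when it sits more than `threshold` sample standard
    deviations away from the mean of the `window` readings just before it.
    Both parameters are illustrative defaults, not recommended values.
    """
    anomalies = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat history: any change at all is worth recording.
            if readings[i] != mu:
                anomalies.append((i, readings[i]))
            continue
        if abs(readings[i] - mu) > threshold * sigma:
            anomalies.append((i, readings[i]))
    return anomalies


if __name__ == "__main__":
    # Hypothetical sensor trace with one obvious spike.
    trace = [10.0, 10.1, 9.9, 10.0, 10.1, 10.0, 14.0, 10.0]
    for index, value in find_anomalies(trace):
        print(f"reading {index} looks out of bounds: {value}")
```

The design choice that matters here is that the function reports anomalies instead of deciding whether they are acceptable; that judgment stays with whoever reviews the list.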
MATT: I think this speaks to culture as well, you know, NASA versus SpaceX. And I will admittedly say that I am a fan of what Elon's doing. Like, I will not hide that because he is innovation king. But you operate under government regulation, bureaucracy, constraints, and then you go privately held with someone who's a visionary, wants to push boundaries. You see the success rates, right? And those success rates are exponential with what SpaceX can do versus what NASA can do. We haven't... I mean, we're, as far as I know, we're still on x86 architecture on the space shuttle, and I can guarantee you SpaceX is not. MIKE: Well, and that culture there, you know, we can ascribe it all to one guy, but it's not, right? I mean, there's Gwynne Shotwell [inaudible 26:45] WILL: I'll give Elon credit. MATT: He's the visionary behind it. MIKE: I'll give him credit. I'm not taking away all the credit. But -- MATT: He's the visionary behind it. MIKE: You can't just have one person. One person is not culture. MATT: No, but it starts at the top. DAVE: Well, and we live in a world of identity politics right now, and I think a peace offering we can say on both sides of the aisle is that the people that don't like the identity have a problem with that person, right? With Elon. I don't hear anybody on either side being upset about electric cars, or about having a space program, or maybe getting off the planet and saving humanity, or dealing head-on with the AI potential extinction of the race. We like what's going on. We like what's coming out of there. So... WILL: Well, you know what I mean? Like, I actually, like, I really like this analogy because I think as you look at other things, you can see this culture. You can also see the limitations of it in that he wanted to build out a rocket company from scratch, right? And he wanted to do it in the most capital-efficient way that he could. 
And he, I think, correctly ascertained that, like, okay, the fastest way to get to orbit is to test it in prod, right? So, blow up a lot of rockets, right? Like, I'm not going to spend years and years and years in the wind turbine, you know, like all this stuff. We're going to shoot some rockets off, and we're going to blast them into space. We're going to see how things go. We're going to learn and iterate very rapidly, right? And they're all going to be unmanned rockets, which... initially at least, right, NASA couldn't do that. That wasn't on the menu for NASA. Or well, I mean, I guess that's not true -- DAVE: Well, because, like, Sputnik and the [inaudible 28:33] stuff, sure they did, yeah. WILL: Yeah. Well, no, no. When NASA got started, they had a lot of acquired experience with one-way rockets from... DAVE: That's fair. WILL: Very [inaudible 28:34] MATT: Yes, yes, yes. I know where you're going with that one. DAVE: Yes, yes. WILL: They were doing one-way trips almost from the very beginning. But regardless, right, the point I want to make, though, is there comes a point where this move fast and break stuff thing and the complexity around the emergent system starts to consume you, starts to swallow you whole. And what we have seen, like, I'll go ahead and call it. I don't want this to be the Elon show, but, like, they've been advertising robotaxis for a very, very long time. And I think that system, the complexity has gotten out of hand on it. And I don't think those robotaxis are coming because I think the move fast and break things, iterate quickly, kind of messy architecture culture has...I think the tech debt around autonomous driving has completely stalled out their progress. I think they're stuck, frankly. MATT: Well, I think...and you kind of led me to a perfect segue here. And I'm going to go extremely, extremely old school and maybe a little bit off topic, but it takes visionaries to change the way we do things. 
I'll go back centuries: Leonardo da Vinci pushing the boundaries, trying things that everyone else thought he was absolutely insane. Next, Nikola Tesla and how he was obsessive, and it destroyed his life. He died broke and alone. But he changed absolutely everything for the world, right? And we need that. You can't get stuck in technology, bureaucracy. You need innovation. You need to push boundaries. You need to test outside of those boundaries to really make progress. And I think, to me, and, y'all know me, that's the most important thing there is to me when it comes to the world of technology and the things I do and what I'm trying to push. And sometimes I'm going to be wrong, but you have to be wrong to become right. MIKE: Well, let me take that. So, you talk about the robotaxi and the visionary. Yeah, I think you have to be a visionary, and sometimes you have to admit that you're wrong. I think that, yeah, the robotaxis not been successful, and part of that is it's been thus far technologically impossible [chuckles]. There are challenges to making that happen that nobody has solved yet. And -- WILL: What are you talking about? They're done. You can ride in one. MIKE: You can, with Waymo, because they didn't say, "Hey, we're going to end-to-end learn this." They said, "It's not within modern tech, so we are going to have a really sophisticated 3D map of the environment we're going to work in. We're going to use LiDAR on top, and so we're going to use some algorithms to locate where that vehicle is every time given that 3D map. And all that vehicle's going to do is follow the map." And that's what they do, and then they just use a little bit of the machine learning. I mean, they still have to have the vision to look for a collision. So, they're doing some collision avoidance, but they're solving a different problem. They decided this tech isn't here, so let's solve the problem with technology that actually does work, and so they're successful. 
So, they've approached the problem a different way. Now, you need to try stuff that fails sometimes, right? So, I think it was great to say, "Hey, let's try this with end-to-end learning. Let's see what we can do." At some point, you might need to realize it's going to bankrupt your company trying to do it because it's not going to work. Sometimes it works; sometimes it doesn't. Yeah, you need to experiment, and sometimes it's not going to work. MATT: That started something revolutionary, though. Yes, they have constraints, right? They can only do it in x amount of cities where roads are in certain conditions, because of LiDAR, and, you know, collision detection is vector maps and machine learning gets a little scary, to me, because probability versus determinism. But you have to start somewhere. MIKE: You do have to start somewhere. MATT: And what they're doing is going to revolutionize the industry, and it's going to change the way we navigate the roads. WILL: Tesla? DAVE: I think we're solving it from the other end, much, much farther than we've ever been. I overheard Uncle Bob Martin talking. He's got a Cybertruck. Now, I don't like the Cybertruck. That's a personal aesthetic thing for me. Honest, I joke with people that somebody designed a nice, beautiful SUV, and they modeled it in 3D, and they accidentally sent the bounding box to the fab of the render [laughter]. And that's what they got back. But Uncle Bob owns a Cybertruck, and he put on Twitter a little while ago that he's put, like, 100,000 miles on it in three years, and 80% of it has been auto drive, and, like, he won't live without it. For somebody to be that much of a road warrior and to straight up say, "80% of this is solved," we've never been that far, and every year we get closer and closer. And Waymo, they have to solve that other 20%, and, like, 5% of it is, like, road construction problems that LiDAR can't deal with, so they cut that off. 
They probably just won't deliver you to those areas, right? So, they stay within that. And we're getting it closer and closer, and that's what kind of excites me. Circling back to AI a little bit, I said, like, five or six years ago when the self-driving cars were coming, I joked to somebody that, like, we look at AI, and we say, "It'll never be there. It'll never da, da, da, da, da." But our kids are going to talk to each other and go, "Can you believe Grandpa got in that 2,500-pound machine of death and controlled it by hand at 100 feet per second? Are you nuts?" right? We're seeing this with vibe coding. We're past the tipping point. There are companies now that are literally saying, "Why would you let a human touch the crypto code?" Or, "Why would you let them touch this piece of the security stuff?" And by next year, like, 80% of software...I don't know if it's 80%. WILL: [laughs] DAVE: But anytime I make these bets, I always take the under, and I always win if I take the under because it's going to hit, and it's going to hit faster than I think it's going to. And I'm calling it, like, next year that over half of the code that we do in prod is going to be...it's not going to be vibe code. We're not going to use that word because it's a four-letter word, but reliable automation in prod that's handled by an automated system with, you know...And I don't want to get into that. It's a different podcast. WILL: Not a chance. Not a chance. MATT: However, I will get on that bet with you. WILL: No way. Not, not -- MATT: Just based on some of the things I'm aware of. WILL: I would say, like, I don't know, I mean, maybe it's just the domain that I work in, but, like, 80% of the code that I write these days is generated by an LLM. But there's not a snowball's chance in hell that that LLM is ever going to replace me. The LLM cannot exist without me. DAVE: Oh, yeah. MATT: No. WILL: I can exist without the LLM. MIKE: And nobody's arguing otherwise. MATT: Yeah. 
I don't -- DAVE: I think we're all in violent agreement here, yeah. MATT: Yes. Nobody is replacing humans. At the end of the day, there has to be a human accountable for what goes out. Accountability and ownership is 100% important. Like, you cannot avoid that; otherwise, we end up in chaos, right? And then we see drift everywhere, and hallucinations, and Wild Wild West, worse than we've ever seen in the history of humanity. Like, there has to be accountability. DAVE: You've just named the next problem that we have to solve, yeah. Every time somebody says, "AI can't draw a hand with fewer than six fingers," the AI community says, "All right, bet. We'll see you in two more model revisions." MIKE: Well, and I think you just -- MATT: And every week, it changes. MIKE: You just brought it back to where we started. If we have a culture without accountability, then bad things happen. But if you -- DAVE: That's normalization of deviance, yeah. MIKE: Normalization of deviance. But if you're watching this thing, and you're developing all kinds of metrics to say, "I want to make sure this code has high quality," and you're establishing those standards and building the constraints to make sure that it is high quality, then you can get to that confidence because you watched it. WILL: Well, I mean, and, like, one thing that I'm very, very excited about AI in that one thing that it is good at, really good at, is, like, just petty ditch-digging work that people cannot stand. And I'll give you an example of, like, sort of, like, a normalization of deviance in ways that I think big and small, right? We need to, like, you know, we're asking, like, is the juice worth the squeeze? Well, in certain capacities, absolutely, it's worth the squeeze. I'll give you one easy and one hard example. 
Like, one thing, I'm looking at a build for a product that is getting a lot of usage, and right now I see 3,874 compiler warnings, of which I am certain at least 3,000 of which are completely spurious, completely nonsensical, totally useless. 500 are nice to have, get out ahead of this deprecated API, you know, you can get around to that. And 174 are a big, big problem. And I don't know which ones are which. And it'll be a real hairball, and, like, you know, in all honesty, right, because we've normalized deviance to think builds, to think [inaudible 39:02] that pass QA, right? And it's just like, "Ah, that warning's just a warning. It's just nothing," you know what I mean? "Let's tape over that check engine light because I got to get this release out on time." And that's a big problem because I guarantee you there are at least 100 warnings in there that are bombs waiting to go off. And everybody's build is like that. Everybody's test suite is like that. Everybody's...There's nobody who's like, "Okay, I'm going to run my test, run a real, you know, rake test," and, like, that output comes out squeaky clean. You know it doesn't. You know it doesn't. But it could, and maybe it [inaudible 39:43]. And that's a normalization of deviance, like a real serious problem. DAVE: Broken window syndrome, yeah. WILL: Yeah. Well, I mean, like, there's no reason that we couldn't clean these out. I wouldn't even know. There's a really good reason. Development time is expensive, and the juice isn't honestly probably worth the squeeze. DAVE: It's not always. Yeah, it's always a long tail or Pareto's rule, yeah. WILL: But my helper monkey, he doesn't get tired, you know? As long as the lights are on at the data center, he'll clock into work. DAVE: As long as I've got tokens. WILL: I have to check the GB thing, which is going to be burdensome. But I'm saying that's an easy one, right? And that's an easy win of like, "Oh, look, we can fix this. We can work this out." 
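Will's compiler-warning example lends itself to a "ratchet": record today's warning counts as a baseline and fail the build only when a category gets worse, so the pile can shrink over time without anyone having to fix all of it at once. The sketch below is a minimal version under stated assumptions. The warning format (`path:line: warning: message [category]`), the category names, and the function names `count_by_category` and `ratchet_violations` are all hypothetical, since real compilers and linters format their output differently.

```python
import re
from collections import Counter

# Assumed (hypothetical) warning format:
#   "path:line: warning: message [category]"
WARNING_RE = re.compile(r"warning: .*\[(?P<category>[\w-]+)\]\s*$")


def count_by_category(log_lines):
    """Count warnings per category, ignoring lines that are not warnings."""
    counts = Counter()
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group("category")] += 1
    return counts


def ratchet_violations(current, baseline):
    """Return {category: (baseline_count, current_count)} for every
    category whose count rose above the recorded baseline."""
    return {
        category: (baseline.get(category, 0), count)
        for category, count in current.items()
        if count > baseline.get(category, 0)
    }


if __name__ == "__main__":
    build_log = [
        "app/a.rb:1: warning: old API [deprecated]",
        "app/b.rb:9: warning: unused variable x [unused-variable]",
        "app/c.rb:3: warning: old API [deprecated]",
        "compiling...",
    ]
    baseline = {"deprecated": 2}
    for category, (before, now) in ratchet_violations(
        count_by_category(build_log), baseline
    ).items():
        print(f"{category}: {before} -> {now}")
```

Grouping by category also helps with the triage problem Will describes, because the spurious categories can be silenced deliberately while the dangerous ones stay visible.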
I think a more significant issue is due to the continual degradation and debasement of my intellectual and emotional health, I don't keep a lot of apps on my phone for consuming media, and I use the internet as it stands, right? Like, if there's, like, a Reddit article that somebody sends me, I just look at it on a mobile website. And I don't know if you guys have noticed this, but, like, the mobile web, like, the internet in general, is in a dire dumpster fire state. It's not like the internet does new things. It's just terrible. It's terrible, and it's gotten bigger and bigger and bigger, and worse and worse and worse. Page sizes have gotten larger, and APIs have gotten slower and more bloated, and we're loading more stuff for no reason. And, like, all of our performance is degrading and degrading and degrading and degrading and degrading. That has absolutely direct financial business impacts. And every single organization I have ever been a part of or interacted with on any level has suffered from this. Normalization -- MATT: Yeah, and I would -- WILL: "Oh, well, it's only 10 milliseconds slower," you know? "It's only a megabyte bigger." MATT: I won't get into the psychological and physiological aspects of that, but there's definitely an impact. Synapse are being reprogrammed, and attention spans are 30 seconds when they used to be hours. And, you know, the world has changed. MIKE: Well, we're kind of scratching on this into a different topic. So, I'd like to bring this together, this idea of normalizing, you know, these deviances. We've talked about changing culture, right? To having a culture where we pay attention to the data, and that changes things when you do so. And it's easy to let it slide. The default is to let it slide, and the entropy happens, right? But if you flip that and say, "Well, we're going to focus on watching stuff," it makes a fundamental difference, and can even, in some cases, lead to avoidance of tragedies. 
And I think that's a good place to end this. Until next time on the Acima Development Podcast.
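Will's "it's only 10 milliseconds slower" point from earlier in the episode can be sketched in a few lines. The idea, offered here as an illustration rather than anything the hosts prescribe, is to compare each release against the original baseline as well as against the previous release, so that a run of individually tolerable regressions still shows up as drift. The function name `cumulative_drift`, the 10 ms tolerance, and the latency figures are assumptions made up for the example.

```python
def cumulative_drift(latencies_ms, tolerance_ms=10):
    """Compare each release with the previous one and with the first one.

    `latencies_ms` is a list of per-release latency measurements, oldest
    first. Each report entry records the step change, the total change
    since the first release, and whether each stays within `tolerance_ms`.
    """
    baseline = latencies_ms[0]
    report = []
    for previous, current in zip(latencies_ms, latencies_ms[1:]):
        step = current - previous
        total = current - baseline
        report.append({
            "step_ms": step,
            "total_ms": total,
            "step_ok": step <= tolerance_ms,
            "total_ok": total <= tolerance_ms,
        })
    return report


if __name__ == "__main__":
    # Hypothetical page latencies across four releases.
    for entry in cumulative_drift([200, 208, 217, 225]):
        print(entry)
```

In the hypothetical data every release passes the per-step check, yet the total drift exceeds the tolerance from the second comparison onward, which is the normalization-of-deviance pattern the episode describes.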

June 10, 2026 · 43 min

Episode 99: Hands-On Expertise

This Acima Development Podcast episode centers on a NASA root cause analysis document from 2012 that concluded the agency needed to "reestablish the culture of technical excellence based on hands-on work." Mike opens with stories of Dutch Renaissance painters Rachel Ruysch and Maria Merian, both of whom spent decades honing their craft and improved continuously through sustained hands-on practice. This sets up the core question that drives the episode: does technical leadership require having done the technical work yourself? Dave kicks off the debate by asking whether the CEO should write code, prompting Will to share the "toxic and unpopular opinion" that technical executives should have built software at some point in their careers. The group largely agrees that hands-on experience matters, but the conversation gets more nuanced as it goes. Justin highlights how technically credible leaders create better engineering cultures because people can't pull BS on them, while Matt pushes back as the lone dissenter, arguing that communication, problem-solving, and trust in capable people matter more than personal technical skill, especially for CEOs versus CTOs. Will draws a distinction between line-level engineering managers (who he thinks should spend half their time writing code, ideally pairing) and higher-level managers who can step back. Kyle adds that genuine desire to understand the work can substitute for direct expertise, and Mike Porras connects it to decentralized authority, citing Atul Gawande's The Checklist Manifesto and the Katrina response failures as examples of why subordinate leaders need autonomy to act within their domains. The final third pivots to AI as a new threat to that hard-won technical grounding. Justin raises the concern that engineers are increasingly surrendering judgment to LLMs, which could produce the same erosion of expertise that NASA documented. 
Will frames LLMs as power tools and force multipliers, glorious in skilled hands but capable of taking your arm off, comparing them to rocket jetpacks. Mike Porras shares a more optimistic view, describing how AI lets him reach the PR stage faster on unfamiliar work and creates more space for higher-level architectural conversations with reviewers like Kyle. Mike closes by tying it back to the central thesis: the tool keeps changing, whether it's outsourcing, AI, or whatever comes next, but the need for grounded expertise, humility, and continuous learning never goes away.

Transcript

MIKE: Hello, and welcome to another episode of the Acima Development Podcast. I'm Mike, and I am hosting again today. And I've got a big crew here today. I think we've got a topic we all care about [chuckles]. Here with me, we've got Eddy Lopez, Mike Porras, Dave Brady, Thomas Wilcox, Ramses Bateman, Matt Hardy, Justin Ellis, Kyle Archer, Will Archer, no relation [chuckles], Tim Chaffin, and Jordan Fong. Big crew of us here to talk about our topic today, and our topic today is triggered by a blog post by...her name is Vicky Boykis... Man: Vicky. MIKE: Yeah, Vicky Boykis. I ran into her from a newsletter, but, so I don't know her personally. But she wrote a fantastic blog about engineering, software engineering in the larger sense, machine learning, AI engineering specifically. Well, and interestingly enough, twice this week, I've randomly run into people writing about ancient Dutch painters. I say ancient, like, Renaissance Dutch painters, that era, and [chuckles], you know, women who were exceptionally skilled at their craft of painting during that period of time. And so, I couldn't just leave that alone. We're going to talk about that in our podcast [chuckles]. There's two women I read about this week. One of whose name...and these are Dutch names, so I'm probably going to butcher them. Give [chuckles] me some grace and patience on this one: Rachel Ruysch and Maria Merian. 
Both of them were Dutch painters. Rachel Ruysch was a painter of flowers who had painted, I guess, for decades, over a period when flower paintings were extremely popular, extremely sought after as a way in the long Dutch winter to bring a little bit of nature into your home. Maria Merian was not just a painter but a naturalist. She, as a child, was fascinated by nature and loved watching insects. And she carefully documented the process of metamorphosis, insect metamorphosis, which had not previously been documented well, meaning that many people in that era in Europe really didn't get it. You know, a lot of us as kids now, we think, "Oh yeah, caterpillar, butterfly. Got it." That was not the case back then [chuckles]. They did not connect the two. And we can thank Maria Merian for a lot of that understanding. She put things together that were not well understood for years. She actually, later in her life, traveled to South America and documented some of the amazing insects they have there, and people back in Europe didn't believe her. They just thought she was making it up [chuckles] because they just couldn't...you know, they were in Europe where it's cold, and the bugs are small. They just couldn't believe that there were these amazing tropical insects. And years later, her work has come to greater appreciation, that she beautifully captured and accurately captured things that the broader science community didn't catch up to for quite some time. One thing that both of these women had in common is they worked for decades, both of them starting, you know, like, in childhood. They were surrounded by mentors, and they worked, you know, they kept on working. You can find pictures of both women, because they were painters, right, paintings of them later in life, because they were experts. And here's the thing: they got better. One thing you do as a painter is you get better, right? So, their work improved over decades. 
It's part of why both of them are acclaimed today, is they just got really, really good from all this practice. Well, we're talking about Dutch painters, right? Let me connect this to one other thing before we jump in. About...well, I say about this many years ago. It doesn't matter how many years ago because you might be listening to this five years from now. In 2012, NASA published a document doing root cause analysis on problems that had led to some of the issues they had within NASA that led to deterioration of quality, accidents, that sort of thing. And they wrote a list of five items, and we may get to those as we talk, but then they summarized it. And here's the one-sentence summary that I've got. To prevent problems, NASA needs to reestablish the culture of technical excellence based on hands-on work. That's what I'd like to talk about today. We talked about the painters [crosstalk 04:49] DAVE: Like the CEO who writes code? MIKE: Possibly [laughs]. DAVE: Okay. MIKE: And that's an opportunity to dig in. Like, should the CEO be writing code? DAVE: [inaudible 04:57] I'm not challenging. Yeah. MIKE: Yeah, no. And I open it up. You know, I like to open things up and then be quiet for a while. So, you led out with that, Dave. Should the CEO be writing code? DAVE: I mean, not in prod [laughter]. When I worked at CoverMyMeds, that was the policy. If you work here, you write code. And so, there was a data migrator that was written by the CEO. And yeah, like, five years on, it read like code that had aged, had been written by an executive, and, you know, hadn't been kept up to date, and it was fine. The important thing was he had his fingers in the guts of the system, and that kept him a lot more present to how the system worked underneath him. WILL: I have a toxic and relatively unpopular opinion that, like, technical executive leadership should have, at some point in their careers, written software. 
I tell you, man, in terms of, like, CTOs that had the experience of working under possibly a third qualify, I don't know, I mean, that's, you know, I'm coming in with the hot takes. MIKE: I've had some conversations with our recently hired CEO. He is very much an advocate of engineering excellence and really wants to lean into engineering. He wouldn't have given it as a hot take, so...[laughs]. But he would have shared a similar sentiment because, in his feeling, if you don't have the background, then it's really hard to make the right decisions. WILL: So, yeah, 5 years ago, anytime [inaudible 06:37] 20 years ago. [laughs]. JUSTIN: I've been in a couple of places I've worked where the CTO has had technical excellence. The vast majority of the time, that has resulted in, like, a better engineering culture than, you know, those without. And by that, I mean, like, engineers wanted to stay because this was...yeah, not like now, but this was during the time when it was hard to hire engineers. And you spent a lot of thought about how to make a good engineering culture. If you did not have a good engineering culture, people just left, and they weren't happy there. And having a technical leader in technical leadership was key to that because they just...they got it, and you weren't able to pull BS on them, or they had less BS pulled off on them because they've been in the trenches. And those who haven't been in the trenches, they're up in their...I wouldn't even call it an ivory tower. They're off in their palace somewhere doing politicking, and politicking always sucks [laughs] so... MIKE: You know, Dave, I think you asked at the beginning: should the CEO have written code? Well, maybe, maybe...So, here's a take on that. Our CEO currently was the CFO before, so he's deeply experienced with finance. He comes from a background where he knows that. 
And I probably trust him more as CEO because of that, because I know that he knows where to, you know, watch the money, and he's not going to let things get out of hand. WILL: It's a financial company, right [laughs]? You know. MIKE: Yeah, exactly. It's [laughs] a financial company. WILL: It's a financial company. MIKE: [laughs] JUSTIN: So, I got an opinion about that one, too. The CEOs, like, the best CEO that I've known personally was, or is, the CEO of SoFi, Social Finance. He started out; he came in; he had to clean up a mess. But, basically, his job was to, one, clean up the mess that he found himself in, found the company in, and two, sell the company. WILL: [laughs] JUSTIN: And, like, not sell the company, but, like, market the company. Sorry, sell the company, bad kind of takeaway. WILL: [laughs] No judgments. Sometimes you've got to do that. MIKE: So, promote the brand. JUSTIN: Promote the brand. MIKE: Promote the brand, not get acquired. JUSTIN: Yeah. And promote the brand and raise money. And this guy, Anthony Noto, look him up, he brought SoFi from, you know, being a small company that focused on student loan refinancing to being an investment bank, being a retail bank, being a commercial bank, basically where it is right now, having a freaking stadium named after it, which, you know, seemed like a huge gamble at the time. But now you hear the word SoFi associated every time, you know, down in LA, whenever they play in the SoFi Stadium. The guy put down a million bucks on the chance that the Super Bowl would go into overtime. And he got, like, a minute commercial just on the chance that the Super Bowl would go into overtime, and it paid off. And SoFi's servers almost suffered a meltdown just based on the traffic generated from that ad. I forget which Super Bowl it was. I think it was, like, 2018, 2019, or something like that. 
So, this guy, you know, I'm amazed at what he's been able to do and the importance that he has had and the confidence that he had in the company. And he grew the company a ton, and he was able to get a lot of outside investment. So, he is very good at being a CEO, and there's a special set of skills at doing that. And I think there's a special set of skills at doing whatever your title is, and if you don't have that special set of skills that corresponds to your title, you're not going to have the respect of the people beneath you. And I think this goes back to the technical leadership at NASA. It's, like, you know, you've got to lead by example, and if you have the technical chops to lead by example, I think you'll gain the respect of the people that you're leading. WILL: So, I've got a question, right? It's always, like...I mean, none of these answers have, like, a real clear pathway. So, I mean, so I'm an engineer, right? And I think, like, you know, you should be able to make things with engineering. You should be able to engineer a solution to a problem, right, at some point in your career, if you want to be a leader of an engineering organization that makes things, right? Well, so what about people in the engineering organization that don't make things? What about your product managers, you know? My dad's a prior...he used to be in the military, right? And, like, in the Navy, they had a real big problem with women being promoted in the Navy, in Air Force, in whatever, because, like, foundationally, the military is a combat operation. And if you want to be an admiral in the Navy, you have to command. You have to be a ship captain, right? Like, that's how you do it. You'd be a ship captain. If you want to be a general in the Air Force, typically, you're going to need to fly combat missions, combat aircraft missions. 
And so, like, they had created a lane for women to fill these roles so that they could eventually, like, rise up and become an admiral and become leadership in these organizations. And so, are project managers just sort of, like, ceilinged out, right? Like, is there no way for them to advance? You know, because, like, they're part of this technical organization, or, I guess, are they, right? Like, should they be put under this umbrella, you know what I mean? I'm just...so, I threw this out here, and I'm just sort of, like, now I'm thinking about the counter to it. What's the counterargument? MATT: The counter to that is, yes, I think they can grow. What they're good at is managing projects. Running a company is managing projects. It's managing your strategy. It's managing direction. It's managing multiple departments and keeping those things organized. I think the key here isn't so much as what you specialize in as it is surrounding yourself with the people who are good at the things you need done. MIKE: One thing I was also struck by that you said, Will, there, you talked about being promoted in the military, that they wouldn't even consider promoting you unless you'd had that hands-on experience. And they deliberately carved out a path to allow people who may not have had access to that initially to get that access. That isn't a, like, a testimonial, right [laughs], to this topic we're talking about. I don't know what is. It's just, without that hands-on experience, you don't get it. WILL: Yeah, but, I mean, like, you just can't...how do I put it? Like, there's a lot of people...it's pretty easy to go from engineering to project management if you have the aptitude for it, right? It's nearly impossible to go from project management into engineering, like, the bar is just...you know what I mean? It's kind of a one-way gate. It's not that nobody could do it. 
It's just, like, the wall to climb to be a credible junior engineer is just...I struggle to think of any project managers that have gone the other way. MIKE: Well, we've talked about this in previous episodes, that your first embarrassingly long length of time as a junior engineer you're going to feel stupid. And [laughter]...what's that? WILL: I said you're going to be stupid [laughs]. MIKE: Yeah. If we're to put it politely, you're going to be ignorant [laughs] in a way that's going to look stupid [laughs]. And it's going to be incredibly uncomfortable [laughs]. You know, there's a comment in the back chat here. I still feel it. Yeah, we've talked about imposter syndrome. It's a real thing. Because we all struggle working with something that we don't fully understand, and we're never going to. It's too big. It's too much. We're never going to get all of it. And when you're an engineer, you're right in the middle of it. And you have to create something, make sure it works, and have a bunch of people evaluating you for it every day. So, every one of your flaws is perfectly visible. It's the worst sort of performance anxiety [laughs] inducing thing. I'm not saying it's a bad thing because you come out the other side having learned amazing things. I started by talking about artists, similar sort of thing. Well, yeah, if everybody can see your painting, right? And if it doesn't sell, that's...it doesn't sell. But you have to do it, and I think that that's what makes it so hard to get over that gate is because you're going to have to go and feel like a little kid again for maybe years. It's uncomfortable. WILL: Well, right. I'm sorry. I feel like I wrecked the discussion, right? Because I posed one side, and then I posed the other side, and, like, there's just no answer to it. MIKE: Who was it? WILL: [inaudible 15:35] center, you know. [inaudible 15:37] You're supposed to do a hot take and then die on the hill. 
[inaudible 15:42] MIKE: [laughs] We're the kinder, gentler podcast [chuckles] that aims for, you know, collaboration and an earnest pursuit of truth [laughs] rather than hot takes, although there are occasional hot takes. But I think this idea that you have to have expertise in the field you're working in. Justin talked about the CEO who was really good at that job. And a project manager, I think, could get really good at that job. But I wouldn't want somebody leading my company that hadn't spent a long time managing projects, managing money, thinking a lot about those sorts of problems. WILL: And I wouldn't hire him as a CTO, respectfully. MIKE: Sure. WILL: You could be head of product. That's great. Like, Lord knows we need more product-minded people. MIKE: No pushback at all [laughs]. The specializations matter. If you think otherwise, go to your hospital and ask for a specialist in a different area to come do your surgery, right [chuckles]? If you're foolish enough to do that [chuckles], more power to you [chuckles]. So, you know, it seems like we've got really strong agreement here, this idea that getting hands-on matters, and we've talked about it mattering in engineering. So, if that's the case, how does that apply every day? Is this once you've done it good enough? WILL: At some point, you've got to stop, you know? Like, at some point, I think you have to stop because, like, just keeping the ability to, like, do an MR, right, or even build the app, you know? Build the app is, like, it's not worth your time to do it any longer, and it's not really relevant to your day-to-day stuff, right? I mean, the problem as I see it, if I'm being honest, is that we abandon that way too soon. Way too soon. 
Like, I think line-level engineering managers should spend half of their time writing code, probably in a pairing situation, so that you're educating people, you know what I mean, on best practices, standards, and things you know, infrastructure, stuff like that, half your time. I'll die on that hill. 50% of your day should be leading your team directly. I'm talking about line-level managers. You go up as, like, a manager of managers, eh, you know, like, yeah, maybe 25%, you know? And then if you're, you know, a manager of manager of managers, okay, grandpa, just, you know, you take a knee there. Take a knee, old man. And I think that's fair, too, right? Because, you know, you have influence, but maybe, like, you're not paying the toll to have, like, a real nitty-gritty technical perspective. But I think L1 managers, like, you'd be lucky to get 20% of your time. You'd be lucky. MATT: We have 11 people on this podcast today. WILL: Yeah. Let's go, Matt. MATT: And I think I'm the one outlier in this conversation, and it probably wasn't expected, but I don't agree that the leaders have to have the technical skills. I really don't. Do they need problem-solving skills? Absolutely. What they need is communication skills. They need to have trust in the people who are helping them lead. That's important because if they try and micromanage a technical company and technical projects, they're going to fail. But I don't really feel that it's necessary to have those technical skills to be able to run a technical group, provided you're surrounding yourself with the right people and you have those communication and problem-solving skills. WILL: Yeah, but how do you know, or how do you know efficiently, right? Because I agree with you, right? I would not now nor will I ever say that this is a hard requirement, right? Like, that's fine. Sometimes you can make a lot of money betting on an inside straight. It can be done. I've seen it happen. 
I'm still mad about it [laughter]. But it's not the way to bet, right? JORDAN: Something kind of similar is, like, I've heard this thing where just because you're, like, really good at a job...like, let's say you're an engineer, and you're really good at being an engineer, and you get promoted to being a manager. It doesn't mean that you'd be a good manager. But I think at that level of, like, being a CTO, knowing the process would be nice, but I think you're more managing people. [crosstalk 20:34] MATT: Yeah. And that's the difference between CEO and CTO, right? You look at the company we come from, most of us on this call come from, the person who leads that company is an attorney, didn't come from an accounting background, not a CFO. He's an attorney, one of the best leaders I've ever worked under. We are a tech company. We are a software company. He does not write software, nor should he or does he have to. But what he is good at is leading, and that's the most important skill in leadership, is being able to lead people, not be a boss, but be a leader. Show that you have the integrity and the desire to work and work for your people, and you're going to have the right people follow, and you're going to be successful. WILL: So, forgive me because I'm outside of, like, Acima, right? I don't know how to org chart. Are you talking about the CEO or the CTO? Because a CEO could come from anywhere, and if you promote an engineer to CEO, you'll have a different set of problems, right? Because they don't know how to sell stuff, but a CEO better know how to sell stuff, right? I mean, so are we talking about the CTO or the CEO? MATT: Well, that's when I was differentiating, right? I said there is a bit of a difference between CTO and CEO, and I was referring to the CEO. WILL: Yeah, and the CEO, I mean, yeah. I would never say that product is more important than engineering, or finance, or sales, right? 
Like, you know, you've got to have four legs on the chair, or you're going to fall down, you know. Which one's better? Who knows, right? But I'm talking about technical leadership. MATT: Right. I think -- MIKE: I think it's technical leadership we're mostly focused on here. MATT: But I think someone coming from, say, a product background, absolutely capable because they understand the product that they're working with, right? They don't need to understand the deep-level inner workings, low-level stuff. But they do need to understand how things communicate, what's downstream from your services, you know, those types of things are important. But we -- MIKE: Well, I'm thinking -- MATT: Go ahead. MIKE: You may be thinking about somebody similar to who I'm thinking of. MATT: I am. MIKE: So, I'm thinking about a very capable product manager who I think could move into engineering leadership. But she is also very experienced. She's written code. She maybe didn't do it for years, but she has enough technical background and has spent a period of time over, you know, like, months or years actually grappling with those problems that she gets it. And I think that's why she's such a good product manager is because she does get it. I think that there are some of those skills there that maybe she's not using right now that she's used in the past. And that's part of what's made her effective at her current job and what would make her also effective as an engineering leader. MATT: Yeah. I think organization size and structure are extremely important in this equation as well, right? If you don't have the right structure, then it's going to break. It's just like software. If you have really bad architecture, your software is not going to be scalable. It's not going to perform the way you want. But if you have that right architecture in place, then it's just going to work, right? And that means support. 
So, if you have the architects there working with you, if you have good data people working with you, if you have good engineering directors and contributors working with you, it's a lot easier to be successful than if you're a small company, a team of five, and you really have to get into the grind of things. It's a different scenario, right? So, I think that really makes a difference also. MIKE PORRAS: I think that leads into what the NASA article was about that you're referencing, Mike. Like, it's the same thing with a CEO. Like, a good CEO, whether they're technical or not, they ask questions like, "Who should own this? What system makes this decision repeatable? Where is the capital best deployed?" And then they hire and enable people that are smarter than they are. So, they hire credible CTOs, credible, effective CISOs. They give them authority, not just responsibility. And then the CEO should be judging outcomes, not the cleverness or the personality of that leader. And then you get that...what do you call that? Decentralized authority model, you know? It's no longer up to the CISO or a small body of leaders to execute decisions. It's really up to the leadership structures and independence of the subunits that those kind of, quote, unquote "juniors" of the CEO would be executing. So, anyway, your article from NASA made me think of a book that I was reading on a road trip I had. It's by Atul Gawande. It's called "The Checklist Manifesto." And it's this doctor and his thesis is all about how repeatable actions, checklists, things like that, are prevalent in so many industries. NASA, he mentions about that, airlines, pilots, things of that nature. And then he goes into how it was introduced into the medical field, and now it's a part of a procedure. 
But in one article, he mentions Hurricane Katrina and how the model for responding to that disaster was, look, if the incident gets so bad, the federal government is going to be the governor of that situation, and they're going to make decisions. And then we need all the local governments and communities to fall in line. And that was one of the biggest problems why Katrina was so bad was because, at key moments, federal leadership was unable to enable DoD choices, get the right funds, get all the resources they need, and so it just got worse and worse. And that was one of the shared lessons learned from that, was, look, if it gets so bad, we need teams, C-level executives under a CEO, local governments, whatever the situation is going to be, doctors. They need to be able to act on authority for their own domain and have that ability to execute without having to kiss the ring for every single thing they do. Otherwise, bad decisions get worse, or even worse, they just don't get done. MIKE: I want to try to restate what I'm hearing. Rather than having purely hierarchical top-down decision-making, you need to have people at every level of the hierarchy who are able to act independently and with autonomy, to have a resilient response to whatever the stressor is. MIKE PORRAS: Yeah. I mean, to a certain extent, I think that's right. And, honestly, sometimes I know, like, I should be doing something, and I know the right choice to make, and I know it'd take two days for me to get approval from my boss and his boss's boss. So, sometimes I just do it, and then if it was wrong, I'll ask forgiveness, not permission [laughs]. And almost 95% of the time, it was the right choice to make. And if it was that 5%, great. I'll be good enough at my job to roll it back as soon as it becomes a problem [laughs]. WILL: Well, that's a big...I mean, that's almost a management culture sort of a difference, in that what kinds of people do you promote and why? 
And are you promoting the kind of people who are sort of bureaucratic, turf war, cover your ass type people, or are you promoting people who will just get it done? And, I mean, that's, you know, gosh, I don't know. That's a thorny side. I come from startup land, where you just have to have that or, like, nothing will get done, because there is no backup. But I would say that those attitudes and the kind of people that subscribe to them are particular kinds of people, and there are trade-offs in hiring a bunch of them [laughter]. MIKE PORRAS: Yeah. Well [laughs], okay, I've seen this trend happen, right? So, what happens if a CISO...oh, sorry, if a CEO hires a bad CTO, or a bad CISO, a kind of a senior VP-level executive, but they turn out not to be credible, or they don't have good ability to lead within their domain with experience and critique, then what happens is exactly what you described, Will. Because I noticed that the people below a CISO or a CTO will pick up on, "Oh, that guy doesn't actually know what they're talking about." So, now it's no longer a game of, am I doing my job well? It's, does the guy above me think I'm doing my job well? Which then turns into political performance and, you know, big talk meetings without actual outcomes and results. And that is a pattern I've seen throughout the industry, even places where I don't work, you know? I think bad leaders beget bad subleaders [laughs]. WILL: Over and over and over, yeah. Yeah, well, you get bad leaders, you know what I mean, because, you know, what is it? You know, victory has 1,000 fathers and defeat, you know, is just one...Well, there's going to be the Hunger Games when things go bad. And is your leader going to be able to identify and take action as to, like, what actually went wrong? Nobody's going to care more than their boss about outcomes, you know? Like, nobody's going to care more than you. Nobody's going to be smarter than you. 
This whole like, "I'll hire people smarter than me," you can hire people who have skills that you don't have. But, like, you're never going to hire anybody smarter than you, and you're never going to hire anybody that cares more than you do, you know? That's a hard limit. You've got to be smart. You've got to be real smart, and you've got to care. You don't necessarily have to be able to do every job, but, like, those two things are non-negotiable, and it sets a ceiling on everybody under you. MIKE: Well, it sounds like you're putting a specific definition around those smarts and caring that goes back to our central thesis that we're talking about here today. WILL: I'm specifically not putting a specific definition on being smart or caring, right? Being smart or caring, like, there's lots of ways to be smart, and I...so, I'd say, okay, right, you could be technically smart. You could say, like, "I know exactly how to architect and design and diagnose a system because I've been doing this for 20 years, and I've seen it all," right? You could do that, right? And then you could be maybe a little bit less sharp in terms of people skills because you've got the technical skills, and you can...you've got a couple steps ahead, and it all evens out and, you know, you could be less strong on that. And you could have excellent people and interpersonal skills and know how to communicate and know how to, I don't want to say interrogate, but, like, you could get answers in difficult situations from smart people who are trying to snow you, [chuckles] because that's going to happen. You're going to have to deal with that one way or the other, you know, you look at 10 candidates who are all smart and be like, "No, no, that's the one. That's the person I want." It's got to balance out. You don't have to have a specifically defined set of intelligence, but you've got to be on your game. 
And if you don't have one, you better have plenty of the other because the job is going to be harder for you, intrinsically. MIKE: Yeah. You're not defining a specific skill, but you're saying that you must have skills and apply them. You know, without engagement of the talents that you've got, you shouldn't be in the role. So, we've talked a lot about these varying talents that suggest that, well, maybe you don't have to have this specific tech stack of experience in order to, I mean, like, something really specific in order to be a good leader. But you have to have some sort of experience that you have cultivated over an extended period of time. You have to have something hard-won because you get something from that that you don't get otherwise. Does that seem to be a fair synopsis of where the conversation has gone so far? WILL: Yeah, or, you know, or talent. You know, honestly, hey, listen, man, talented people exist, and, like, they didn't have to go through every step, you know? They could skip a rung or two because they're just that good, and when you meet them, it'll be obvious. But, you know, exceptions can be made for exceptional people. But generally speaking, you know, I personally like the statistical average of people who can demonstrate technical excellence in technical leadership positions. It's just safer, you know? And, I mean, that's it. But there's all kinds of ways to be successful, and exceptions for exceptional people always need to be made. If you're in charge, you should be able to identify. It's like, "No, no, no. Yeah, okay, they don't tick every single box, but this one is special, and I will make room for them." KYLE: Talent is definitely one thing, and maybe it's a natural talent that I'm going to express here. But I think more than anything, it's also a desire. And what I mean by that is I went through a bit of a shift in my career where I had my first few managers, direct managers were tech. I mean, they were very technical. 
You know, I could ask them basically how to do my job, and they were able to give insight. But then I've also had a couple of managers that, probably a couple of my best managers, really, they would not be able to know how to do my job. However, they still had the desire to learn, to be able to understand what was going on. And I think that's the big differentiator, at least in my mind, is somehow you need to make sure that the people that are responsible still have the desire to at least understand to an in-depth level of what's going on, even if they aren't experienced. JUSTIN: I kind of want to shift a little bit in the direction of the conversation and bring it more towards, like, what I saw when you sent out the NASA prompt. Two years ago, I could see this being very applicable in the classic engineering sense. But now, with AI, I see a new hazard where people are surrendering their technical experience or surrendering technical expertise to the LLM and just trusting whatever the LLM says. And that hazard or that bad thing that is coming down the pipe, that's just going to get worse and worse over the next couple of years as people depend more and more upon the LLM because the LLM is really good, and it's getting better. But is it causing this...could it cause the same thing that NASA saw, you know, 10 years ago or whenever this paper was written? So, I just wanted to throw that out there and get you guys' thoughts on that. MIKE: And, honestly, that was the thrust of the original blog post I read that led to our discussion, is exactly that, that the AI doesn't have the years of hard-won experience, whatever that experience is. And we've been talking about, well, there's different kinds of experiences, but you have to have that drive, the desire, and the curiosity to be able to pursue it. And without that, if you're outsourcing your mind, you know, your...because we all use tools, right? There's nothing wrong with using a tool. 
But if the tool's wielding you, then you get in trouble. WILL: Maybe I'm going to be Matt Hardy for a minute and say, like, you know, I have a different opinion. I'm not worried about it because here's the thing about writing software. It's going to come **** you up, you know? There's no...technical debt is technical debt, is technical debt. It will be paid. You will pay it. And if people start writing bad software, they're going to get got. I was just recently on an enormous company-wide email training tool rollout push, which was disastrous, and I don't even know. I don't even want to think about how many millions were spent on it. But, foundationally, like, these LLMs are great tools. They're great force multipliers for skilled hands. They're like a power tool. They're like a band saw or whatever. But they will take your finger. They'll take your whole arm. And I just don't see it changing, and I don't see people getting out of writing crappy code. Like, it's a force multiplier, and it will multiply force in good or bad directions. But I just don't see people getting out of this fundamental thing. And, you know, as I learned in my high school shop class from a man with 8 fingers [laughter]. MIKE: Well -- JUSTIN: I love how you said it's a force multiplier, and you bring it back to, like, what an LLM is. It's like calculating the distance between vectors and stuff like that. You have one vector that you want to go to get the job done correctly. And this force multiplier will push you in that direction if you choose to go in that direction. But it's very easy to be force multiplied in any other of your dimensional space other than the way that you want to go. And you will go so far out of there so quickly that finding your way back, you might as well just throw it away and -- WILL: Yeah. It's like rocket boots, right? We invented rocket boots. 
You can look at it on YouTube, where it's like, "Hey, wouldn't it be cool if we all had, like, jetpacks or, like, Iron Man jet boots, or whatever?" And some lunatics, they built them, and it's so cool, and it's fantastically, fantastically dangerous. If you want to die, buy yourself some rocket jetpacks on eBay and take it for my benefit, you know? I don't know. It's great, but you still got to drive a car. You still got to land a plane. MIKE: Absolutely. WILL: I might get laid off temporarily between here and there because some executive got pixie dust in his eyes, but don't lose that number. MIKE: Interestingly, I know somebody...about 13 years ago, they did an interview with the owner of a company, a startup, and they said, "Well, you know, I'm interested in starting. You know, I'll help you out, but these are some things you have to do." And the owner really wasn't interested, and he said, "Okay, that's fine." "Here's what's going to happen." And he gave him a list of things that would happen. This is a very successful engineer. About two months later, the owner called him back and said, "You know what? I want to hire you because exactly what you said would happen is what happened. It went that way." And, basically, they had outsourced everything, right? It's something that you can always do, outsource your work, don't own any of it. Very similar to what we're talking about here, just a different mechanism. And all of the failure cases that [chuckles] he expected to happen were exactly what happened. He came back, hired him on, you know, developed more of a culture of excellence. The company was very successful. It doesn't change, right? The fact that you still need somebody with a steady hand who knows where they're going, regardless of what the disruption is, whether it's AI, or outsourcing, or, you know, the tool du jour [chuckles], whatever it is today, it doesn't change. That expertise still matters, and they'll come back if they know what's better. 
So, we've talked about AI and about how we could surrender ourselves, but no, don't. Make sure that you're in control. We've talked about leadership. With so many temptations to offload some of your agency to these tools and maybe give a little bit [chuckles], to lose some of your personal excellence. How do you keep it sharp? If you're the CTO [chuckles], that's going to matter, right? Somehow you're going to have to stay sharp, even if you're not writing code. And if you are a contributor writing code every day, maybe that's not as much of a problem, but yeah, it kind of is, because you've got AI there to take it from you. You've got the newest tool, and you're still writing COBOL. It's going to come along. You know, what advice would you give? What's your experience in staying relevant, keeping that engineering excellence? MIKE PORRAS: What I've noticed is, by using AI to augment my workload, I get good at things that I would not have had an opportunity before, because I never would have had that bandwidth. So, for example, you know, Kyle can attest, this past two weeks, I've just been writing nonstop code to build a web application firewall. In AWS, that's a major pain because it's just...I don't like this particular part of AWS. It's one of their biggest shortcomings. But as I noticed, the pull request process, you know, I know what I need to write, and I know what that would look like. I have the Terraform experience to know how that should be done well, and so I think I would do a good job. And I would tell Copilot or whatever I'm using, "Hey, you wrote this, but I actually want it to be this, this, and this." So, I am being critical with Copilot, and I'm spending the time reading the code to make sure it's doing the job. But as soon as it gets to that PR level, Kyle, or someone from platform ops, will read in there and be like, "Actually, I think you could do this better." 
And that point of reflection when it's...it's almost like I'm rubber ducking everyone in that pull request. And just by doing that, having that conversation of bigger, better architecture is something I would have never had before because I would have taken so much time and energy just to get to the point of a PR that I can never have that flexibility of saying, "Oh, I'm at the point of a PR, and what this person just suggested is such a better idea." And it's an idea that Copilot, or whatever tool I've had, wouldn't even come close to doing because it's a completely different paradigm, different context window of everything I'm doing. That is the part that I think we're getting good at with Copilot, or AI augmented engineering, is the ability to spend less time and energy getting the basics done and spending more time and energy thinking of bigger and better ideas than what we're bringing to the table. That's what I like a lot. WILL: I feel like, foundationally, right? And, I think, correct me if I'm wrong, but I think the actual, like, genesis of the LLMs was in solving translation problems, right? Like, translation between natural languages. I think that was one of the major, like, drivers and use cases behind LLMs. And so, what I have noticed is that they're fantastic, just absolutely transcendental for those sorts of jobs. And what is, you know, I don't know Python, for example, right? Python. But I wouldn't even blush at picking up a Python project. I'd be like, "Yeah, put me in, coach. Let's go. Let's do some Python." Because I know Ruby real well, and I know Java, and I'm pretty good at Scala, and ML, and Common Lisp, and JavaScript, and between all of these idioms to express a logical, procedural outcome. I'm not really worried about doing something in Python, or doing something in Go, or doing something in, I don't know what, you know, Erlang, whatever you want to call it. 
And so, like, one thing that I've always been...this is a deficiency that I've always had in tech. I can't stand writing shell scripts. I hate it. And I already used my F bomb, so I'm not going to tell you how much I hate it, but I hate it. And I bet you can imagine how that would go. But I don't have to anymore because I can have the LLM write it for me. Check this directory, check these files, you know, filter this thing, give me a little awk, you know, to, like, process some stuff and, like, bing, bang, boom, we're good to go. I know what I want. You just don't know what the word for it is in your language. And that's been absolutely just glorious. And there's so much stuff I think, you know, as a very experienced engineer, where I know what I want, but I know I don't know how to say it. And I know what a schlep it would be to find the exact expression in Esperanto, you know, in the documentation, and so I just don't. Or maybe I should say before, I didn't, but now [vocalization] we're off to the races, which is a thrilling time to be in tech, in all honesty. I feel like I've been superpowered, you know? Because I could fly a plane, and now somebody handed me my jet boots, and I'm like, "Let's rock." MIKE: That's maybe a good place to...well, I'll take that further, to land this. That [chuckles] we have this ability to have, you know, this great power, whether in leadership or as a contributor, with this AI backing us up. The need to have that grounded somewhere never goes away, and to have the humility to recognize where that is and embrace that, keep at it, to keep learning from it so that we don't blow ourselves up, it's a good thing. We could probably talk about this for a long time, and we may revisit this. This idea of caring about being grounded, I think, really matters. Until next time on the Acima Development Podcast.

May 27, 2026 · 43 min

Episode 98: Standups

This episode of the Acima Development Podcast starts with a discussion about the frustration of U.S. tax filing and uses it as a metaphor for poorly run standup meetings in software development. The hosts argue that many teams repeat painful, unnecessary processes simply because “that’s how it’s always been done.” From there, they unpack the most common standup failures: meetings turning into status reports, running too long, involving too many people, or becoming impromptu debugging sessions where only a few participants are engaged while everyone else checks out mentally. The panel emphasizes that these problems are usually symptoms of poor communication and coordination happening outside the standup itself. A major theme throughout the conversation is that standups should focus on coordination rather than status reporting. Dave Brady argues that if teams properly maintain tools like Jira or Kanban boards, everyone should already know the project status before the meeting begins. The standup’s real purpose is identifying blockers, avoiding collisions between teammates’ work, and quickly coordinating handoffs. The hosts debate alternatives like “Slack-ups” and asynchronous updates, with some arguing they fail to replace the human interaction and spontaneous coordination that happens in live meetings. They also discuss ideal team size, meeting frequency, time zones, and how distributed teams create additional coordination challenges, especially when work is handed off between regions. As the conversation evolves, the podcast becomes less about standup mechanics and more about human connection in remote work. Will strongly advocates for cameras and microphones being on during meetings, arguing that face-to-face interaction helps managers recognize burnout, disengagement, or personal struggles that text updates can easily hide. 
The hosts criticize workplace cultures that dehumanize remote or offshore workers by treating them as interchangeable resources rather than teammates. By the end, the group concludes that the biggest failure in bad standups is not inefficiency alone, but the loss of genuine human connection. Good standups, they argue, are ultimately about building trust, communication, and healthy relationships within a team, not simply exchanging status updates. Transcript: MIKE: Hello, and welcome to another episode of the Acima Development Podcast. I'm Mike, and I am hosting again today. With me, I've got a great panel. I've got Thomas Wilcox, Dave Brady, Justin Ellis, Eddy Lopez, and Kyle Archer─I think we're all returning crew here ─[chuckles] to talk about our topic today. So, you're probably not listening to this, like, exactly when we're recording it. You're probably not even listening to it right when it comes out. There's always a recording period and then a publishing, you know, a week later, or a few weeks later, after it's gone through editing. And we have a bit of a queue in case we miss some time. It all works out. But we are recording this in tax week in the U.S. This was the week that taxes were due, and everybody has hopefully completed their annual suffering and has submitted those numbers to the IRS. I read about this before, and I read about it again this week. Articles are often published this time saying, "Why do we do this?" Well, it's a good question, because the United States is actually fairly unique in the world in that we have to submit all these taxes every year. In many countries, most people don't have to do anything at all because if you're working for an employer, they've been submitting tax information to the government all year, right? They've been paying your taxes, and as long as you don't have anything funny going on, that's enough. 
The government knows about you, you know, they probably know how many dependents you have, you know, you've reported that. I mean, you reported it with your business. The information's there. And in much of the world, people just receive a letter saying, like, "Yeah, thank you. Everything's good." And they receive, you know, there's no refund or non-refund because it just works, right? They don't have to do anything. The cycle that we go through of pain every year doesn't need to happen. Now, the reasons for that have to do with...Well, I want to be careful here. Our purpose here is not to criticize large corporations who lobby heavily [chuckles] to keep the tax code as it is, well, to keep the tax submission process as it is. But such is our life, right? But where I was going with this is that we go through all the suffering because it seems normal, and everybody we know around us does it because it seems normal to go through all of this process of reporting something we've already been reporting with every paycheck for the entire year. It's just a rehash that we have to do in excruciating detail because that's how it's always been. But there are examples of people who do it differently, and they don't go through the same pain that we do. And I imagine that in their blissful lives, they have extra time around this season to do things other than pay their taxes or, you know [inaudible 03:14] DAVE: Must be nice. MIKE: It must be. Why do I talk about this? Most of you, if you're an engineer, have probably been in a lot of standups, which is a sometimes daily, sometimes weekly, some regular interval typically meeting where you have a chance to touch base and connect with other people on your team. And they can range from actually pretty good to something far from that [chuckles] to something that makes you want to quit your job right [chuckles]? Like, well, not another standup. 
The idea, you know, comes from this agile process where...and I think it's not even just engineering. You get together in a room. You want the meeting to be so short that nobody sits down, right? You go through the key things to make sure that everybody can touch base. Now, we have all kinds of communication channels, right? We've got, you know, our messaging platforms that we use. We've got the ability to go and walk over to people. There's lots of ways to communicate, but we decided that we're going to pay this cost of bringing a whole team. And this can happen at lots of levels. You can have executives getting together for a standup. You can have the team that reports to the executives getting together for a standup. So, you have a bunch of people, and that's an expensive meeting, right? Imagine the executives getting together for a standup. I don't know how many dollars that costs, right? I'd have to do the math, but it's not few. It's an expensive meeting where people have chosen to do that because they think the coordination is so important. But it can be done right, and it can be done wrong. It can be a yearly suffering, a period of suffering, like the taxes, that reports stuff that's already been known. Or maybe it's a meeting that ends quickly and touches on key information that not everybody knew because it was late-breaking, and it was a good opportunity to share. We're just going to talk about standups. It's something that we all live with, so it's worth talking about. So, I'm going to ask─I've given the intro─what have you seen? Well, actually, let me start. Let's start with the bad side. What is it that makes a bad standup? DAVE: Turning into a status meeting, for me. The thing that makes a standup go bad...and I will reveal the point that I wanted to make in today's podcast right out of the gate. 
The thing that makes a standup go bad is when you are not taking care of the things that you need to take care of outside of standup, and so they have to get taken care of in standup. When your standup runs really, really long and turns into a gigantic status meeting, it's because you're not communicating status outside of the meeting. And, actually, I don't need to put any more on that point. That's just like, if you don't take care of it elsewhere, it's going to hit here. When I look at a standup that's running long, I don't look at it as, like, this meeting is bad, I mean, it kind of is. But I look at it as, okay, what is the unmet need that is screaming at us so loudly that it's cratering our standup meetings? That is frequently a very helpful thing. If it's a status meeting, you maybe need to, you know, do better in, you know, one of your other practices. If you're arguing about cleaning up code, maybe your retro needs to be better. Yeah, that kind of stuff. MIKE: Okay. So, you said there are some other venues where this should be happening, that the status reporting should not happen in standup. I've been in a lot of standups that were about status reporting, so, you know, you're bringing up a common failure case. If that's the bad case...well, I want to come back to [inaudible 06:46] JUSTIN: There's more bad cases. We got more [crosstalk 06:50] DAVE: We should go through the counter good case to the status -- MIKE: Yeah. So, let's go to the other bad cases, but let's put a pin in that one because you said, status report: bad [inaudible 06:59]. So, what are the other bad cases? What are other bad cases around standup? JUSTIN: When they run long, and there's not a good reason for it. I mean, basically, when you go back to your summary, you talked about how everybody is standing up, and they don't want to, like, sit down, and you just want to quickly go through things and be done. 
If it's going more than 15 minutes, or it's going more than 20 minutes, whatever you have allocated, and it shouldn't be more than 20 minutes probably, that means everybody's looking at their watch. They're wondering about what other meetings they have to go to. They aren't focused. And you, all of a sudden, the only person who is paying attention is the person you're talking to directly, and everybody else's mind is just like, pshhh [laughter]. DAVE: And you're 100% guaranteed at that point...if your meeting's running that long because somebody says, "Well, I've got this problem," and then everybody dives in, too, you're now doing mob programming in your status meeting. Everybody's trying to debug it. You're no longer talking about what I did yesterday, what I did today, or what I'm doing today, and what are my blockers, right? You've definitely departed the format into something else. KYLE: Well, and it's mob programming at best, right? Because a lot of the time, what I see -- DAVE: At best. KYLE: Is it's one or two people programming. DAVE: Yeah, and everyone else is disengaged. KYLE: And then the other eight are just kind of sitting around twiddling their thumbs. MIKE: [laughs] DAVE: Mm-hmm. Mm-hmm. JUSTIN: Yeah. That actually brings up the other part to this, is, like, if there are too many people in your status meeting, sorry, in your standup. Personally, I think four, maybe five, is the absolute max you should have in your standup. You have any more and, all of a sudden, you run into that same problem. It's, like, you know, one person is talking, and everybody else is looking elsewhere. EDDY: Okay, but how do you manage that when you have a team of 10? JUSTIN: You have two standups. EDDY: But then don't you deviate from, like, status reports, in a sense? Like, isn't it important to also -- JUSTIN: What was it? Amazon? 
No, no it's a good point, and it's becoming really hard these days where, you know, you have the flattened hierarchy, right, where you have a lot of people reporting up to a single manager. But I think it was Amazon or somebody that said, "Hey, you shouldn't have a team that's larger than you can feed with one large pizza." If you are having status meetings with larger groups, it's not as effective. MIKE: And you can do it hierarchically, that is, you have your team of five people do a standup, and one delegate from that person, whoever's leading that meeting, themselves goes to a standup [laughter]. JUSTIN: Eddy just typed in the chat, "I could eat a whole pizza." Eddy, you are a team of one. You are very effective [laughter]. MIKE: I was actually thinking the same thing. Not today, but back in my heyday of eating, you know, like, late teens, I could put down two [laughter]. JUSTIN: Sorry, I derailed that but [laughter]. DAVE: The other thing that kills a standup meeting, and this is the one that if your workplace has the fun police guy in it, it's when standup turns into a BS session, when it turns into a water cooler type thing. And I stand by my earlier point that that's an unmet need. You've got a team that is not being properly socialized. When I worked at Cover My Meds, they had a really great policy that if you were remote, you had to fly into the head office every quarter for a week and just spend a week rubbing elbows with your teammates. We talked about this when we were talking about radical candor, that you basically had to make friends with your coworkers and get to know them. And we spent a whole week just playing card games and, you know, goofing around, and we'd go work, that sort of thing. But we overinvested in socializing and goofing off time, so that when we broke up and went back, the socialization now was just, like, a quick touch base of, like, hey, how are you doing, or how are your llamas? 
That was a real question from a real coworker, for a real coworker who really had llamas. You know, how's this going, or how's, you know, that side thing going? And if you don't have that investment in the socialization, it will come out at standup because humans are gregarious creatures. MIKE: So, what other failure cases do you see with standups? What about when there's a lack of psychological safety? DAVE: Hmm. They tend to run pretty quick. MIKE: They do [laughs]. DAVE: I worked on this. I'm going to work on this. I have no blockers. That's my report. Yep. Yep. MIKE: Every time. They run quick and accomplish nothing. DAVE: Nothing. Yep. The thing that I thought was interesting as I dug into...I dug a little bit into standup, like, history today before coming on the show. And I thought...it was kind of interesting because the three questions, like, what I did yesterday, and what I'm doing today, and what are my blockers, is not necessarily actually the point of the meeting. It's actually the scaffolding or a ceremony to draw people in. But the point of a standup is not status. The point of a standup is coordination. It's to make sure that you're not stepping on somebody else, or that this feature's going to be in play before my feature needs it, that sort of thing. And so, standup is arguably going well when somebody says something and then three people start to argue with them, you know, "What about this?" that kind of thing, as long as you don't spend 20 minutes, you know, diving into that. But as long as the pushback is, "Wait, wait, wait, my piece is up the pipeline for you, and it's not going to be in until..." you know, that sort of thing, that kind of discussion, that's coordination, and that's the point of a standup. And that's something you can't get...we'll probably talk about Slack-ups and Slack-based standups and that sort of thing before we talk about that today. 
But that kind of coordination is pretty hard to do in just, like, an RSS feed, where you just...here's what I worked on, here's what I worked on, here's what I worked on. And, unfortunately, in most waterfall-based or enterprise timekeeping systems, we just want to know what budget code to put your time against, and so we're not interested in coordination at all. So, the business is trying to extract...they're trying to extract your status from that meeting, which is a terrible countervailing force. It pushes the meeting into a status meeting. MIKE: Anybody else want to jump into the failure cases? DAVE: Yeah, I've got a sore throat, guys. I need you guys to take over the show [laughter]. MIKE: Well, we've covered some good stuff here. So [chuckles], there's nothing wrong with our current list. We've talked about just a status meeting, too big, too long, unsafe. Go ahead. JUSTIN: Not prepared, and by that I mean the best status meetings I've seen, or the best standup meetings, sorry, I've seen are ones that have been led by somebody who basically knows everything that's going to be talked about. And that goes back to, you know, communication by other channels and things like that. But, you know, if the leader goes in there and he's got a checklist of things that he needs to find out and he doesn't have clarity on all these items, I don't know if he's going to be able to find all the answers that he wants during standup and have it be as short as he needs to be. MIKE: That's great. And if we put all these together, imagine going to a standup where the leader's not prepared, has no idea what's going on, is going to likely mistreat the people on the team, so they don't want to go into any depth, but are mandated to share a long status. And so, that's what happens. You stand there within a large meeting, for hours, hearing everybody give a status that they could have reported. You know, basically, they're just reading out what happened in Jira. 
Does that basically cover it? DAVE: Mm-hmm. MIKE: Nightmare fuel [laughs]? DAVE: Or Tuesday [laughter] MIKE: Yeah. I worked with somebody who had recently been promoted to management, and he called Tuesday poosday because he had so many meetings [laughs] from having these sorts of experiences. Okay. So, we've identified a set of problems, and we're engineering folk. What do we do to address these problems? And maybe to start, go back to the beginning. If a status report is the most common failure case, and how these often fall into, you know, how...I say...I'm not sure my preposition works there [laughs]. Standups often collapse into just a status meeting, instead of being something effective. Well, and we talked about how they can be useful, right? There are means to communicate information that's not being communicated elsewhere, to quickly resolve problems, make sure nobody's blocked, and take things elsewhere. It's not where you do the major problem-solving. It's where you set up the later coordination to address problems. Dave, you said you have lots of thoughts on how to address these things. And you say that it becomes a status meeting because that's an unmet need elsewhere, you know, it could have been done elsewhere. So, where should it be done? DAVE: That's a good question. Anywhere else. It should be done anywhere is actually a fair point. Standup is just the least good place for it to happen. In my career, we all have a love-hate relationship with Jira, and I definitely love to hate on Jira. But the best teams I've ever been on, for managing process-wise anyway, we could go look right at the board, and we could all tell where we were as a team. We all knew how this thing was. We all knew what feature we were working on. We knew what the customer was going to receive when we delivered it. So, we had kind of that high-level... I realize this almost sounds like I'm not answering the question, but I really am. 
We had this higher-level visibility that, like, I'm not just writing lines of code here. I'm actually...I'm shipping this feature, which is part of this, you know, this larger, you know, thrust that we're trying to get out to the customer in this next round of deploys. And when everybody has that status of this is what I'm getting at or this is what I'm headed towards, now any tasks that you pick up are focused towards this, and anything that you're working on is either in line with it or isn't. This feels a little nebulous, but if you can see where the team is at and where you are at, and you know what you're working at, you don't need a status meeting. And if you've got that on a big board somewhere, or if you've got it on, like, a Kanban board, if you've got it up on a wall, if you've got post-it notes anywhere, or if you've got, you know, CRC cards, it doesn't matter. It can be a burndown chart. It can be a burnup chart. That's actually the same thing, just upside down, however you do it. But the key thing is, do you know what your teammates are working on, and do your teammates know what you are working on? A lot of that gets bled away in pair programming because you're swapping pairs, especially if you're doing promiscuous pairing where you swap partners every day. Because I pair with you for the day, the next day I know what you worked on yesterday because I worked on it with you. And so, that part of the status communication goes away. It slowly weaves its way through the team, one partnership pairing at a time. So, yeah, I'm going to answer your question with your own question, which is, you know, where do we take care of those things? Anywhere and everywhere that we can take care of them. We just need to be intentional about what the need is. And I think that's what kills us in standup, is that we go in just assuming that, well, I'm here because it's 9:30 in the morning, and that's when we do standup meeting. 
And you're cargo culting the ceremony at that point, right? It's like, I'm going to go to this meeting. I'm going to do my three questions. And if you've got a good scrum master, then when somebody asks you a question, the scrum master will say, "Okay, stop. Kick that out to after the meeting." And that's how you keep your meeting short, just by punting that out. But that's all just ceremony. The whole point of the ceremony is to get people coordinating so everybody knows where we're all at together. MIKE: Well, I heard in what you said, if you're using your project management software, and it might not even be software, it could be your project management process that you handle on a board. Either way, it's your project management process. Then you remove the need for a status report because you're using your system to do it. So, if you're not maintaining Jira hygiene, if Jira is what you're using, if you're not keeping that up to date, the tool that your company is paying for and is intending to use for that, then you're going to be forced to do it somewhere else, which is worse. Is that a fair summary? DAVE: Yeah. And as you described that, I just realized there's another failure mode of standups, which is dissemination of knowledge, which is normally taken care of in your pairing. But this has certainly happened to me even here, where I will say, "Hey, I'm going to work on this piece, but I'm not sure where to attach into it." And someone else in the standup meeting will go, "Oh, well, you're going to have to grab this service class, and then plug it in with this thing over in the utilities directory." "Oh, okay. So, if I do, you know, can I mock that out this way?" And, all of a sudden, it becomes a technical meeting, right? And what's really happening is I'm pairing with another programmer. I'm just wasting everybody else's time while I do it. 
MIKE: So, failure is when maybe good things happen, but they happen with everybody else as spectators and forced spectators where they don't want to be there. That's not the movie they paid for. DAVE: Yeah. What it is, is it's the least efficient way to accomplish the necessary thing. It's not necessarily bad; it's just a terribly inefficient way to do it. We'd rather you just go pair off with one of the other people on the team and, you know, knock this out. But if you're not going to do it that way, it has to get done somewhere. So, standup's the next time you're going to see each other. MIKE: Just say, "You two, go work that out [inaudible 20:22] [chuckles]." Yeah, so, effective way to address that. So, we've talked about failure modes of standups, how those can involve just being status reports. They can be the meeting being too big, too long, unsafe, having the wrong things in them. We've talked now about avoiding status reports. And, Dave, you really focused on using your project management so that that is all in everybody's mind. They can just glance whether that's your Kanban board, or Jira, or wherever it is. DAVE: Right. Exactly. MIKE: So that you know that information ahead of time, so nobody even tries to make your standup about that, because why would you? We already have that information at our fingertips. One thing that I've seen done is Slack-ups, or, you know, name your messaging tool of choice. Slack is widely used in software as well as other industries. So, we'll talk about Slack, but, you know, if you're a user of something else, Microsoft Teams, for example, which we also use, that's fine. I'm referring to both. Is that a good replacement? I mean, is that really a good replacement? DAVE: Hard no. Hard no. At least for the coordination part, I say it's a hard no. We use Slack-ups here, and Mike probably has lost sleep over the number of times that I forget to do my Slack-ups. 
If I go through my Slack history, I've probably got 20 kilobytes of Mike going, "Hey, Dave, would you [chuckles] turn in your Slack-up, please?" But that goes to what we were talking about though, or what I said earlier, that somebody is trying to extract reporting information and status information, and that's how you knew that my Slack-ups were getting forgotten. And the reason I was forgetting to do them was because I didn't have any coordination to get done. And ADD is like, if it's not right in front of me, it doesn't exist. So, in my opinion, I think, a Slack-up does not solve the problem of a standup. And that's why I tend to push back sometimes when we say, "Well, let's not do standup. Let's do Slack-up instead." I'm, like, no, these are completely different things, and it might be worth doing both. Because the next devolution of that argument will be, well, why don't we just use Jira instead of the Slack-up? And because that's, like, an obviously provable thing. Like, well, if your Jira board is accurate and everybody's keeping it up to date, then you don't need the Slack-up because you just go look at the board, and it'll be up to date. And [chuckles] silly anecdote, [SP] Gerardo is our product manager, I think, is the title that we're working with. And I love him because, in standup today, I'd gotten behind in my Jira reporting. And I keep a list on my laptop of the tickets that I'm working on, like, all the statuses they're in, and it literally generates my Slack-up for me. This is how I got to the point where I was able to do my Slack-ups on time because I made the computer do it. And I pulled up my Slack-up, and it didn't match Jira. And I started lining them up, and Jira was correct, and that was all Gerardo's doing. He literally just, like, one of the PRs had updated in GitHub, and he'd fired the hook, and it had gone through. That's, you know, when you've got it really working well, right? 
So, anyway, the point of that is that Jira can absolutely replace the point of a Slack-up in terms of, like, status distribution. And this is why I push people away from, please don't replace standup with Slack-up, because you'll end up in this morass of, like, well, what about Jira? You're now fighting about the best way to not solve the problem. You're not even talking about the right problem anymore because there's no coordination involved. MIKE: So, there seems to be a recurring theme here that use your project management software, and if you don't like it, then solve that problem because that's the underlying problem. DAVE: Right. MIKE: Okay. And one thing I want to make sure we don't miss, and this came up in our side chat. We haven't talked at all about frequency yet. If we're talking about the failure cases, the same awful meeting I talked about earlier, twice a day [laughter]. If you have remote teams in different time zones, you got to catch them up to speed too, right? Do it twice a day, or maybe once for every time zone you have somebody in. I'm saying the opposite, the opposite of the good thing [laughs]. This is a bad thing [laughter] I'm describing. But -- JUSTIN: That actually brings up a really good point. Like, I've managed teams that are in India, and for them, it's, like, 10 o'clock at night when they're checking in with the rest of us. But we got to have that standup because we got to make sure that they are not blocked for their next day. So, I think the time of day really depends on your time zones, things like that. And for me personally, my ideal is, like, everybody's in the same time zone. We have it in the morning, not first thing, but, like, at 9 o'clock, maybe 9:30. That's my ideal. It's a good way for people to get in, check their emails, kind of try to remember what they did yesterday, and then they can come in and do standup. Doing it at the end of the day has never been really appealing to me. 
I've done it before at the end of the day, and a lot of people are checked out already, and then they forget what they said they were going to do by the time the next day comes around. DAVE: Yeah. You've had shower time to think about it. JUSTIN: Yeah. So, I prefer it in the morning, I don't know. But I am open to other thoughts. And, again, if you're dealing with multiple time zones, you just got to do what's best for your team. MIKE: You suggested that reality, which is if you have groups in very different time zones, and I've seen this with people in Europe, people in India, Philippines [chuckles], people in Vietnam, you know, where you have very different versus the United States. You are right. That makes the standup even more important to not be a status meeting, because it's handoff time, right? You're passing the baton. And when you're passing the baton, you don't want to say, "Hey, here's what I worked on today," then the race stops. You stand there and chat for a few minutes, and it's no longer a relay race. You're not handing off the baton. Who knows what you're doing? But if you're handing off the baton and say, "You know, careful, it's slippery up there," or they're supposed to hand off the baton, and they're not there yet because they're blocked back somewhere, right, then you know something. And that's an important thing to recognize, and using that opportunity is a big deal. It's a good opportunity, a really useful opportunity to actually make that handoff and make sure that you're not doing a status meeting because that's, like, the least valuable thing you can do when somebody's showed up at 10 o'clock at night. You don't want to hear what they worked on that day. You want to hear about what you're working on today, because they're handing it off to you. DAVE: You said something a minute ago, and I think I misheard you, but I like the way I misheard it. You talked about time. You said, like, what time? 
Because Justin then jumped in with, you know, like, evening for the India team, and that sort of thing. But what I heard was how many times. And I was just imagining, like, the horror of having standup more than once a day. Or, you know, do we have it three times a week? And that sort of thing. And I have actually worked on a team that had standup twice a day, and it's because the team was extremely agile. We were all in a bullpen working together. There were six of us, and we would pair up in three pairs, and no ticket ever lasted longer than four hours in theory; sometimes they did. But, like, at lunchtime, if you weren't done with your ticket from the morning, you had to trade pair partners. And the next day, if it still wasn't done, your ticket got thrown back in the backlog as being too big, you know, too problematic. And what I'm realizing, I've got this crazy...this is just a bat-poo-crazy Dave Brady hypothesis. Show me how fast your deploy cycle is, and that is how often you need to be having standup meetings. I'm on a team right now that we meet three times a week, oh, sorry, yeah, three times a week, every other day. And what you just told me is that what you synchronized on yesterday, you don't need to synchronize about today because you're not moving fast enough to bump into each other with yesterday's coordination or with just that information. If you are changing lanes very quickly or hopping from feature to feature to feature, then you need more and more coordination, because you're a lot more volatile. You're jumping around. You're bumping into more things. So, that's my crazy theory is, from the time you go code complete to the time you go deploy, that sets a pace and a rhythm. It's not necessarily good or bad. I mean, agile says that should be very, very small, but, like, it's a reality that, like, the more enterprise your system is, the longer that's going to be. 
If you've got, like, a validation or an auditing step, or that sort of thing, or compliance, then that's going to take longer. And, I think, as far as coordination goes, that can verify the need for a standup meeting. There's just not that much need for everyone to come together and say, "Hey, I'm going to be working in this area. Who do I need to coordinate with to make sure I don't break your stuff?" So... WILL: I don't know, man. I do not agree with that in the slightest. I'm a hard, hard, hard no on that. DAVE: Awesome. Awesome. WILL: Well, deployment cycles, like, it's...I think of, like, these standups as, like, more, like, inter-process communication. I work in, like, native mobile for the moment. And native mobile deploy cycles are very slow because it's a whole song and dance you got to do with Apple, with Google rolling it out to a bunch of, like, third-party devices you don't own, all this kind of stuff. But we need to do more coordination and not less, because, like, we've got all these teams coordinating on the same app that we really don't want to screw up. You know, clawing back on mobile release is really painful. And it's a function of, like, how many cooks do you have in the kitchen? Not like, how many times you're serving the meals, you know what I mean? DAVE: Okay, so same principle, but opposite conclusion. Okay. Yeah, that's fair. That's fair. MIKE: Well, going back to the relay race analogy I was saying before, if you need to pass that baton to somebody, then that's a coordination point, right? If you're working on something largely alone for three days, there may be nothing changing there. DAVE: Yeah, that's fair. MIKE: And if nothing's changing, you don't have to pass the baton, right? The environment you talked about, where you changed tickets twice a day, well, there's a major coordination point there where it was mandated. And that really wasn't necessarily the deploy cycle per se, although it could be. It's the points of communication. 
Or in Justin's example of very different time zones, there's a real need for that coordination where there's a handoff between one group and another at the time. It seems like those coordination points, where the coordination is required, seem to be driving it. And I think that's where there's overlap between where you're headed. Will, you're saying, well, you need to have these coordination points at, you know, the communication need is what drives those points of coordination. And yeah, for your mobile app, maybe you're only releasing once a month, but you better be coordinating more often than that, or else you're going to have a horrid mess. DAVE: I withdraw my claim because you're right. I was trying to conflate the speed at which you deploy a feature. If everything is atomic and everybody has their arms in, then that is linked pretty closely to the rate at which you need to coordinate. But that is the actual driving variable is, how fast do you need to coordinate? How fast can things change? Absolutely. I agree. KYLE: I was just thinking of two use cases, and they might be more niche than the average developer. But I've worked in a situation where I was Dev QA. And what that meant was I sat with my devs, and that was my main responsibility. My secondary responsibility was to my QA team. So, we talked about, how many times do we have standup a day? I had two. I had one in the morning with my dev guys, and then I had one in the afternoon with my QA guys, both of them managed very differently, different scales. And then the other scenario that I'm looking at is kind of where I'm at right now is I'm on a team... I facilitate multiple lines of interest or lines of business. I have one line of business we're deploying 10, 15, 20 times a day. I have another line of business we're deploying once a week, you know what I mean? So, I guess, in that, like...this was more towards your comment, Dave, and we've kind of rectified it a bit now. 
But that would be very convoluted to be like, oh yeah, well, we need to do it once a week here and 20 times a day here [laughs]. DAVE: Yeah. Yeah. Well, and, actually, that's another proof that my hypothesis is wrong, that if the team that's churning every single day, if they're pushing changes into other people, everyone else has to meet that often as well, because they are causing coordination conflicts, yeah. WILL: I've got a different read on it. So, I had to leave right around the Slack-ups, right? And I've got a real serious problem about Slack-ups because my experience with Slack-ups, that Slack-ups are...I can't think of an exception to Slack-ups not being ultimately rooted in devs being busy under the gun and wanting to skip a meeting that they saw as extraneous. MIKE: True. WILL: And while I have seen inefficient and non-productive standups many times in many, you know what I mean, iterations, I have never in my life witnessed a team that was devoting too much time to keeping everybody on the same page. I think Slack-ups are foundationally not...It's not the right tool for the job if it's just, like, hey, everybody, update your tickets, right, so that everybody has visibility or whatever. Put it in the Jira ticket, throw a comment in there. You know, that's a good thing to do just in general, you know. Like, if I have this thing that's on my desk, when I close out for the day, here's what's going on. And if somebody cares, right, some PMs like, "Hey, what's the status of this thing?" they can just go look at it. And they don't actually need to bother me at all. They will, but they didn't need to [laughs]. But at least they're more informed when they bother me on Slack. I think it's devs thinking that this meeting is a waste of time. And I haven't seen it yet. Every day's a new world, but that day has not yet dawned for me. I think you can keep it tight. There's nothing wrong with keeping it tight and then breaking out. 
But even the act of just spending a minute, 60 seconds, to articulate what I'm doing and why and how is a worthy investment of time for me, even if I'm working on something in complete autonomy that I'm not going to hand in or coordinate with anybody for a week or two or a month. Just, like, doing that, I think, is a worthy exercise. It's a worthy investment. But you do need to keep it tight when people are busy. Put your camera on, and have everybody look at your face, so that when Mike says, "Yeah, it's okay," you know, but his eyes don't say that, then we have an opportunity to say, like, "There's so many subtle shades and variations of okay, you know, like, I just want to see it. And if I'm the manager, if I'm the coordinator, if I'm the PM, then just give me an opportunity. If somebody isn't necessarily as out and proud and boisterous as me...There are a lot of devs....could I blow you guys' minds? There are a lot of devs that are not excellent and expressive verbal communicators. And they could say to you, "I'm okay," when they're not okay. And if all you have is a Slack message, right, or even a cams-down, you know, meeting, and they give you, like, "I'm okay," right, things might actually not be okay. It might not be cool at all. And denying yourself the opportunity to get that feedback, you know, if I'm a people manager, if I'm trying to keep this team, like, healthy, and happy and productive, I think it's a glaring unforced error. MIKE: I talked to a teacher once about online meetings. I think Zoom is what he was using, but the tech doesn't matter. He talked about teaching a large class where nobody had their cameras on. And it was a nightmare [chuckles] because he'd say something, and it was just dead space, right, just throwing it into the void. You lose all of that nonverbal communication, and he had no idea whether what he was saying was landing at all. And it threw off his whole teaching, like, the whole rhythm was gone. He couldn't make it work. 
At that point, it's almost a mandatory lecture, where it's why not just record a video? Why do I even bother? WILL: Cams up, mics up, every meeting, every day, every time. And if you got to keep it tight, keep it tight. There's no sin in keeping standup tight, really. And I say this, like, it's full mea culpa maxima. I am the problem because I like to yap. MIKE: Well, we talked about this earlier, before you were able to join, but we talked about failure cases, and one of them we talked about was going too long. We came to the conclusion that a lot of this comes down to what happens outside the standup. And, Dave, I think, expressed this really well. Like, all the failure modes in a standup are because of something that didn't happen outside. And if you haven't done your good coordination beforehand, then you're going to have failures in there, and then you're going to have the long reporting, because nobody knows what's going on, and you have to get caught up. So, absolutely, short. I have run standups before. I've seen them sometimes work, and sometimes they don't, where you start with the blockers. So, instead of saying, "Here's what I was working on, here's what I'm going to work on, and here's what's blocking me," we start with the blockers, and if there's nothing else, you move on. You come in and say, "This is what's blocking me," and if there's nothing blocking you, you go on. Now, to Will's point, having some expression of what you're working on seems to be valuable. Just the act of speaking, you know, like the rubber ducky that you talk to on your desk to clarify your thoughts, probably does have some value on its own. So, there is something to be said for saying, "Well, this is what I'm working on," but that can go last. You can say, "These are the things that are blocking me. This is what I'm working on today. This is what I worked on yesterday. This is what I'm working on today." It changes the focus. So, you start with the most important stuff. 
"Here's the thing I need to coordinate on that I know I need to coordinate on. I don't want to waste your time. And then here's my take on where things are at." And maybe somebody's going to pick up on something from where you're at. "Oh, I need to talk about that." But you change the order, and I think that does help. WILL: I would want it in reverse order, because I know how these meetings tend to go. And, generally speaking, if you've got a bomb that's going off, if I've got...the kind of people on my team that I want on my team, everybody wants to defuse the bomb, and that's going to [inaudible 40:27] MIKE: [laughs] WILL: But, and here's the thing, right, like, there's people who...I am pro communication and pro efficiency, and so I want the green light projects. Get them off the list. Let me look at your face and spend a minute describing what you're doing, just because, you know, I want to make sure you're okay. And I want to make sure that everything is really okay, and you're not just sort of, like, you know, walking down this open elevator shaft unknowingly, right? So, I could just kind of pull you back by the collar of your shirt. That's fine. But you can get off the call, man. If everything is cool, like, I know what I'm doing; I know what I need; I don't have a blocker; I need to get back to work, then let's get these guys out of the call. And then if the world is melting down, everybody who isn't, like, actively with a bucket, you know, you could get off and, like, get back to the work, you know. Because sometimes you'll have a standup, and it'll roll right into a crisis planning meeting, and that is standard, par for the course. But everybody whose light is green, yeah, bail. Sorry. MIKE: Well, you can take, you know, if you have a conscientious host for your meeting, anytime one of those bombs does show up, you say, "Okay, we are going to talk about that." You create the meeting. You assign it to somewhere else. 
We are going to set that aside and make sure we finish. And then you get to the end, dismiss everybody who doesn't need to be there, and then you go on. I think that you have to be conscientious about that, or else you will have the failure case. WILL: Yeah. I just go with the flow, you know, my natural... I go with...I would prefer...Both require discipline, and I would prefer to just sort of, like, not do it the hard way on purpose, because people in meetings are naturally going to have a tendency to be, like, this is not relevant to my interest; I'm out, right? Like, you usually don't need to tell people [laughter], you know. KYLE: So, I had a QA manager, and it was...I think it was about the size of 12 people on the team. And I did like the way that he ran the meetings, just because it was one of those things where you would say what you accomplished or what you touched, and then what your blocker was and who you needed. And doing that, the actual standup portion generally took about 5 to 10 minutes. And then afterwards, it was allotted that you could go and communicate with who you needed to. I just thought it was an interesting way for him to manage it that way. And then if you didn't have a blocker and you didn't have anybody you needed to go talk to, you were done. You could go back to your desk and continue doing your work. And I liked that because, then a lot of the time, I could do exactly that. And I wasn't necessarily, you know, in there twiddling my thumbs, which is the most frustrating portion of a standup for me. And I just thought that aligned with kind of what Will was saying a bit, maybe not perfectly, but... MIKE: Well, that pivots a little bit. We talked about psychological safety some before. What does a good meeting host do to cultivate a good standup where it doesn't devolve into fisticuffs, but [laughs] rather [laughs], you know, that there may be heated conversations, but they're productive and not personal? WILL: Balance on all things. 
I mean, you know, do be clear about what's going on with you. Don't waste everybody's time. Do be accountable. Don't call people out. You know, it's you got to...I don't know, cams up, mics up, all the time. No muting, you muters. If your dog's barking, you know, if your kids are running around the room screaming, like, you know, as much as I can understand that you would want to suppress that, that is actually relevant to your team's performance. And your manager has both the right and the obligation to, you know, inquire as to how their distributed team is performing while you're on the clock at least – DAVE: Some days I'm really glad I don't work for you, Will [laughs]. THOMAS: Because I'm feeling very attacked right now, Will [laughs]. WILL: Yeah, no, no, no apologies. Most people who worked for me really, really liked it. But, you know, I'm also not exactly nice. DAVE: I'm sitting here going, that would be productive, so yeah. MIKE: Knowing that somebody has the dogs barking and all the kids running around once a month is relevant. You can say, "Oh, something's going on today [chuckles]," right? It's different than saying, "Wow, that's an environment that hasn't changed in a month [chuckles]." There may be some challenges there. There is some value, I think, in what you're saying to gathering that information and learning about what the baseline is and where there may need some assistance or changes. WILL: You know, this is a really wild tangent from running a standup, right? I am not a bad person, and so, as a result of this thing, right, where I have sort of, like, you know, I had these fairly rigid, dogmatic rules, like, I know things about my distributed team in other countries that I have not seen any of these people who are using distributed offshore resources. And they could not give a greasy hillbilly f**k [chuckles] about what's going on with any of these people in their actual lives. 
The level of dehumanization and, like, just no shit given that we treat distributed workers with in the IT industry is disgraceful. And so, like, yeah, man, like, I know that my buddy has a kid with terrible asthma in New Delhi. And they need to seek medical treatment for their daughter who I know, because when she's on the call, I bring her on the camera and I say, "Say hi to everybody." Or they have roving packs of street dogs that are roaming through their...up and down their block in the middle of the night in New Delhi because this person is on a call with me. And I want them to be sitting at the conference table as close to physically present as possible. And this is, yeah, okay, you know, this is, like, weird stuff that I do, right, and nobody else does. But I do that not with an eye to, like, you know, be an intrusive, you know, megalomaniacal d**k, though I am that. But this time it's about having a human interaction and a human connection. And I've seen how other people do it, and they're wrong. And it's a dystopian, screwed-up, dehumanizing thing, and everybody hates it. So, as much as I'm willing to take criticism for, like, you know, me being a little bit of a psycho, it comes from a good place. And while I will have to, like, you know, kind of be a little bit of a door kicker to make these things happen, because people hate it, the proof of the pudding is in the eating, and it works. I'm not wrong. MIKE: I heard about a dysfunctional team recently where they were all overseas. Most of the team members were overseas, and it turned out that the contract shop that was running them had explicitly told them that nobody should go on camera. Nobody was allowed to talk except for one person on the team. DAVE: Wow. MIKE: That's awful. And it was because, you know what happened? One person on the team spoke, and the communication was horrible for years on this team, because how could it possibly be good if you'd limited it that way? 
The contract shop that was doing this had...I don't know what they were thinking [chuckles], but it was a company policy. DAVE: Wow. MIKE: You think about what it means to be somebody on the team. And they didn't tell the company that they're working with, right? So, nobody knew, and nobody on the standup talked. They just thought, I just can't get a response out of these people. And so, it forced dehumanization, because you thought, wow, these people don't know what they're doing. They're not even willing to talk or show their faces. And there was no human connection made. They were just faceless. And I think they may have done it so it'd be easy to swap people out [chuckles]. But that's exactly what it is. It makes people just machines, like, fungible, irrelevant. WILL: Disposable. But that's not how the business works. MIKE: No, it's not. WILL: You can't do that. We are not bolting doors on Hondas. As much as every MBA from here to New Delhi would like it to be that way, unfortunately, no. I'm sorry. This is, like, a medieval guild, you know. That's just how it is, and I see no indications that any of that is going to change. So, like, okay, man. Like, just deal with us like human beings [laughs]. You know, I'm not wrong about this. I've tried it every which way. And if I'm being blunt, right, even the proposition that you would be able to attend a meeting, right, behind a one-way mirror [laughter], if you try to pull this off in actual physical reality, you would be a psychotic, you know. It's like, what's the one-way mirror? What's that one-way mirror for in the conference room? It's like, we don't talk about that. Really guys? Come on. MIKE: [laughs] DAVE: Every time I run a skills clinic, somebody will ask me, "Did you guys record it?" And I'm, like, "No, no, we didn't." And I'm not trying to be a jerk, but one of the reasons I don't record them is because I want you to attend. When people come to Skills Clinic, I get stuff out of it. 
If I didn't need to get anything out of Skills Clinic, I would just drop a one-hour video of me talking. I love to hear myself talk. I can, you know, film it and drop that into the thing once a week. But if you're just lurking all the time, then when you get bored and distracted, you're going to go play Minecraft. You're going to go surf Twitter. You're going to go doom scroll. That's the exact opposite of making human connection with the people that you're working with. Will, you were talking about kind of, like, the forced thing. Like, if this is the noise and the distraction that you hear, let's all experience it together. And I realize now, from a realistic human connection point, there's value in that. When I pair program with people, you're like, oh yeah, this is going to keep you from getting distracted and going on Twitter. No, it doesn't. It's just that when I'm pair programming with you, if I have an idea and it needs to be on Twitter, we both go tweet. And, like, I literally tab over, pull up Twitter. "This thing that my coworker just said is really funny," and send, and then we go back to work. And that is hugely valuable. For me, it's valuable because I didn't spend 40 minutes scrolling Twitter after I did the quick distraction. But for my partner, they got to experience, I won't tout this as virtuous, but they got the full David Brady experience. People say things that need to be on Twitter around me all the time. WILL: Well, I mean, like, you could do stuff. You could do stuff like, I don't know, I'm on a call with my offshore team, and it's really noisy. And it's like, "Hey man, what's going on?" It's, like, oh, it's this giant religious festival. They're having a giant fireworks display in the park outside. And then we can all just take a minute, after we finish our work, you could take your laptop up to your roof, and we can just sort of look at this huge Indian religious festival that is happening literally outside your door. 
And we can be on a team and have a human connection. And that is possible. DAVE: It's the opposite of balkanization. WILL: Well, and, like, if we are on a call and somebody is on their phone, right, the whole time or they're very busily typing in some other screen the whole time, and their cam's up, and their mic's up, you can see it. As somebody who is ultimately responsible for the health and well-being and care and feeding of this entire team, you have an avenue and an opportunity to check in and be like, "Hey, what's going on, man?" Because, is somebody depressed? Is somebody pissed off? Is somebody having, like, some kind of a moment? Which is absolutely going to fall at your doorstep. It's coming for you. DAVE: It's relevant, yeah. WILL: People don't work good clinically depressed. You're going to feel that, and you have an opportunity for leadership, as opposed to just being a manager, that you wouldn't get, or it would be harder to ascertain, if you were just sort of, like, a W on a Zoom call. DAVE: I have t-shirts from the best teams I've ever been on, where, ironically, it was the team that I was flying out to Ohio with to be on site with. We would go do an escape room and get a group photo. And then my team lead would print t-shirts for us to take back, you know, from this. And I cherish those because I had a lot of really good friends. Whenever there was a reorganization that divvied up our teams, it would break our hearts because we were best friends being divided back up from each other. And that's awesome, to actually care about the people that you work with so much that when you get reorged away from them that it's a tragedy. And, hopefully, both halves of that team, you know, it works like sourdough starter. You separate the two teams, and that culture then permeates both lumps of the new teams. 
I love how everything, when you get really into the meat of, like, agile and about, like, good methodology, it ends up being about people when you're done before it's over. We started out with, like, what's wrong with your standup? And Mike, you put your finger on it really well. The dehumanization, that's what's wrong with your standup. MIKE: Honestly, that's probably a great place to tie this up. WILL: Yeah [laughs]. Cut. There you go. MIKE: Exactly. DAVE: Mic drop. Yeah. MIKE: We're human beings. This is a chance to connect. Use it for that. DAVE: Yeah. Fantastic. MIKE: With that, let's end. Let's be humans. Until next time on the Acima Development Podcast.

May 13, 2026 · 56 min

Episode 97: Database Indexes

The episode of the Acima Development Podcast centers on database performance, using the concept of indexing as its foundation. Mike opens with a story about discovering Google in the early 2000s to illustrate how powerful indexing systems transformed access to information. That same principle applies to databases: indexes act as shortcuts that make retrieving data dramatically faster, especially in large datasets. The discussion emphasizes that while indexes can feel like a technical detail, they are fundamental to how modern systems function efficiently, much like search engines reshaped how people find information. Bill Coulam then dives into the technical side, explaining that indexes improve read performance but come with trade-offs, particularly slower writes because both the table and index must be updated. A key rule of thumb is that indexes are most beneficial when queries return a small subset of data, typically under about 25% of rows. The group explores how poor indexing strategies, like over-indexing or missing indexes on key relationships, can quietly degrade performance over time. Bill shares a striking real-world example where adding missing indexes reduced a process from taking 24 hours per record to processing millions in just a couple of hours, highlighting how impactful proper indexing can be. The conversation broadens into database design philosophy and performance tuning. The team discusses different index types in PostgreSQL, when to use them, and how to balance read vs. write performance depending on use cases like bulk inserts or high-frequency queries. They also touch on when relational databases fall short, such as full-text search or massive write-heavy workloads, where NoSQL or specialized systems may be better suited. 
Ultimately, the takeaway is that effective database performance comes from understanding your data, access patterns, and trade-offs, combined with ongoing maintenance and thoughtful design rather than relying on defaults or assumptions. Transcript: MIKE: Hello, and welcome to another episode of the Acima Development Podcast. I'm Mike, and I'm hosting again today. I'm going to start by introducing Bill Coulam, who's with us today. He comes to us from the data team. And he's been here before, but we're going to focus on some information that he has to share. So, he's kind of the star of the show today. Also with us, returning, we've got Eddy, Travis, Justin, Dave. Mr. Perez, great having you with us. We've got Mike Perez here with us, and Ramses. As usual, I'd like to start with something a little bit outside of our topic in order to bring it in and tie it into the outside world. And I was thinking about a story I think I've shared before. The importance of this moment early in my career keeps, like, growing as I look back to it, like, wow, that was a big deal, and I didn't realize it at the time. So, in the early 2000s, somewhere in the early 2000s, early, early 2000s, I was working for a guy [chuckles]; I'm going to say that. He had some projects, and he didn't have enough resources to do some freelance projects, and so I was doing some of his stuff. He was outsourcing his freelance work to me [laughs]. And he had a project that was in Windows, and there was something they wanted to accomplish through the API. And I started looking through the documentation, trying to use Microsoft's tools to search the documentation, and I spent hours. I looked everywhere I could [chuckles], and I couldn't find it. I came to the conclusion, maybe this doesn't even exist. And I came back to him, and he got back to me, like, 30 minutes later. He said, "You know, there's this new tool called Google, and I use that, and it's amazing. 
You should start using it because it works really well, and it led me to this documentation." Like, wow, well, I know what I'm going to use now. I'm going to use this Google thing [laughs] because that works way better than actually going through the table of contents, and the index, and the documentation, because that's really hard to search through. Those older forms of indexing were insufficient. Now, Google had this brilliant idea, you know, the founders of Google, that, okay, we'll index the internet. And even back then, that was, like, an impossible goal [chuckles]. And there were other sites that were doing it. There were indexes out there. What they would do is they'd look at the words on a website, and they would create an index based on those. And so, if you look for a word, they'd look for a website that had a lot of those words. Well, people really quickly figured out how to game that [chuckles], and, of course, they did. So, they were useless almost immediately because people would go into their meta tags, and they'd just write the same word a hundred times for something that the site was really not very applicable for. What Google did is they came up with a different sort of index, where they would index words in the links that linked back to a site, and also give extra weight if there were a lot of them, right? And so, by building a more appropriate index that suggested popularity, rather than self-determined, a self-stated importance of the page for a specific topic, they were able to come up with something way more effective. And you don't always think about indexes, you think, index? Like, I remember going to the library. It had, like, the Dewey Decimal System, which is really kind of weird and awkward and hard to find things with, but it was way better than the alternative, which would have been nothing. 
You don't usually think about indexes changing the world, but that index, that PageRank index, you know, the PageRank algorithm that they use to just create an index, that's all it is, right? Link this word, map this word to a website, so that when you're searching this word or phrase, then we can find it. It literally, like, fundamentally changed culture. It's now a verb [laughs]. Like, you Google something, even if you're using Bing for those of you out there who use Bing [laughs]... DAVE: Use Bing to Google, yeah. MIKE: Exactly. You use Bing to Google, because information now is accessible, and that is something that didn't exist before that. For all the digital natives who've grown up in this world, like, how did you find things before? Well, you didn't [laughs]. You suffered. You wandered through libraries. DAVE: We just got used to not knowing things, yep. MIKE: Exactly. That's exactly what you did. You got used to not knowing things. It changes everything when you have an effective index. And I could talk about all the times in my career when something's missing from the database, and yeah, it was the index. It's always the index. There's always a missing index somewhere. It solves all of your performance problems. And there probably is an exception, but I can't think of it [laughs]. It's always the index. That's what we're going to talk about today. We're going to talk about database performance. And we've been wanting to, you know, Bill's been preparing this and thinking about this for a while. If we're talking about database performance, indexes are going to come up over and over again. And this could seem really dry, and this is going to be a technical deep dive, right, we're going to very much going to talk about indexes. We're probably going to be focusing on PostgreSQL. But this idea of indexes is not a trivial one. It's how we operate in the modern world. Our culture, our commerce has been fundamentally transformed. 
Our ability to know things and outsource, you know, to this Library of Alexandria that we've got in our pockets all depends on indexes, and it's amazing. There's my introduction, Bill. And I wanted to lead out with some weight behind what you're going to be talking about today. BILL: I love it. That was a fantastic segue. All right. Hi, everyone. I am Bill, Bill Coulam. I've been doing this work for about 30 years now. I started as a software engineer using COBOL and mainframes, but I don't put that on my resume because I don't want anyone to ever call me back to help with that. So, I tell people I started with C and C++. I was actually one of the first users of Java back in 1995. My company that I worked for at the time, Andersen Consulting, they wanted me to go around to their clients and tell them what I thought of Java. And, at the time, I felt like it really wasn't ready for primetime, and so I kind of voted myself out of working on that platform. But that's okay because I ended up, on every project that I worked on, working with Oracle, and, at the time, Oracle was the 800-pound gorilla. And I was in the telecom industry, where we had some of the largest volumes of data in the world, and so I learned a lot of great lessons working on those big systems. It's a whole other world jumping between databases that have 10,000 to a hundred thousand rows to databases that have 500 million, a billion. Performance tests in your copy of production can take three hours. It's a completely different world. Anyway, so you learn a lot of good lessons working on data that big. I ended up sticking with Oracle for a long time. It became my bread and butter. And went from San Francisco to Denver to Houston, and then back here to Utah where I grew up. I've been here longer than I spent time in my own hometown. So, I've been here in the northern central part of Utah since 2007. Anyway, let's go ahead and jump into it. 
We're going to be talking about four areas: the fundamentals of indexing, some guiding principles, the two shared tendrils, index types that are available to us using Postgres as our source database, and some indexing dos and don'ts. Firstly, some fundamentals. An index is a shortcut to get at the data. However, because an index is a separate structure from the actual table containing the data, it requires at least two I/Os to get at the data: one to search through the index, then one to access the rows in the table. Because of this, indexing can and usually does save time when querying large tables, but it can take longer than a full table scan if the number of matching rows is greater than around 25% of the table. That is a rule of thumb, not a hard rule. I did a bunch of testing back in 2024 on our setup here, and it was right around 25%. So, if the number of rows you anticipate matching your query is less than 25%, an index will typically make sense. Ultimately, an index is stored in a file. And updates of indexed columns, keep in mind, must modify and manipulate both the table and the index. That's important when you start thinking about how many indexes your table has and the effect that that will have on write time. And, lastly, matching index and table results will get cached in case the same request is made later. MIKE: So, I've got a couple of questions about that. Firstly, how often do you see in...and this depends on systems, right, so maybe there is no universal answer. But how often do you see indexes harm performance? Because there's this index that we probably didn't need, but now we have to write to it every time, or somebody went in and indexed 20 columns, right? There are certainly bad use cases. Have you seen cases where there was a clear performance hit, and, you know, you've seen data to show that? Is there some sort of rule of thumb where I should think, ah, well, actually maybe the index is a bad idea here? I'm also curious about those caching results. 
Do you sometimes get...in data sets that are growing really fast or something, do you end up with weird results from that caching? BILL: Let me answer the second one first. The answer is no. Phil Karlton of Netscape, may he rest in peace, he was famed for saying something like, "There are only two hard things in computer science: naming things and cache invalidation." And there was some wisecrack that added to that, where it said, "There are only two hard things in computer science: naming things, cache invalidation, and off-by-one errors." But yeah, cache invalidation is tricky, but the database engine teams tend to have done that very well. So, I've never had funky results from mainstream relational database engines, so that tends to work pretty well. The answer to your first question, the quick answer to that question is no. I have not seen indexes cause immediate harm. Like, the old analogy of, you know, the frog in the pot of water that eventually gets too hot and cooks it, adding even crazy indexes, indexes with lots of multiple columns in them, and so forth, I've never seen an immediate and obvious degradation. It's even been hard to detect it when that pot is fully boiling. When a table has 30 indexes on it, and inserts are taking two milliseconds per row, generally, you don't notice it. Over time, as these indexes are added, the team that works with that data tends to believe this is the way things are. DAVE: Oh, it does that. BILL: And they don't really question: could this be three times faster if we got rid of all the unused indexes? So, yeah, to answer your question, I've never seen it immediately [inaudible 11:39] performance. MIKE: [crosstalk 11:39] three times faster. That's probably loosely data-driven, right? BILL: Yeah, that's very loose. MIKE: But you didn't say a thousand times faster. But there are very much cases where if you're missing an index, it could be a thousand or a million times faster [laughs]. BILL: Yes, mm-hmm. 
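Editor's sketch: one rough way to see the write cost Bill and Mike are describing is to time the same batch of inserts against a table carrying 0, 10, and 30 indexes. This is a minimal illustration only. It uses Python's built-in SQLite as a stand-in rather than Postgres, and the table name `events`, the column names, and the row count are all assumptions chosen for the example, so the absolute numbers say nothing about any production system.

```python
import sqlite3
import time


def time_inserts(num_indexes: int, num_rows: int = 20_000) -> float:
    """Insert num_rows rows into a 10-column table with num_indexes indexes.

    Returns the elapsed insert time in seconds. Hypothetical schema, SQLite only.
    """
    conn = sqlite3.connect(":memory:")
    cols = [f"c{i}" for i in range(10)]
    conn.execute(f"CREATE TABLE events ({', '.join(c + ' INTEGER' for c in cols)})")
    for i in range(num_indexes):
        # Every extra index is one more structure to maintain on each insert.
        conn.execute(f"CREATE INDEX idx_{i} ON events ({cols[i % len(cols)]})")
    # Deterministic pseudo-scattered values so index pages are touched out of order.
    rows = [tuple((r * 7919 + c) % 100_003 for c in range(10)) for r in range(num_rows)]
    start = time.perf_counter()
    with conn:  # one transaction for the whole batch
        conn.executemany(f"INSERT INTO events VALUES ({', '.join('?' * 10)})", rows)
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


for n in (0, 10, 30):
    print(f"{n:>2} indexes: {time_inserts(n):.3f}s")
```

Running it on your own machine gives a feel for how the gap grows with index count; whether that gap matters depends on the access pattern, which is the point Bill returns to later.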
And -- DAVE: I actually have seen a case of this, but I think I'm actually in agreement with Bill, where the definition of a database, right, developers we always talk about toy databases, right? But the database...and you don't think of this as a database, but the system log on Linux, it's a log file, and we think of it as a log file. But you can also think of it as an unindexed table where you want to write a row, right...Insertion has to be very, very fast, and you can't spend any time indexing. Well, if you have to write fast and you're saturating the hard drive...also, this is back in the days when a hard drive seek was 12 milliseconds, and so updating the index and the file was very painful, right? If you take that blurry view of, like, is a log file a database? It's easier to think of that when you start realizing that, well, everyone's now streaming their log files up to Splunk and Datadog, and these things that are, like, pulling their log files together. And time series databases like Grafana now exist where you're supposed to log, log, log, log, log, log, and then, over time, they start compressing the old stuff. Like, they start batching it up historically, and you start losing data. It's kind of like compacting context for an AI. So, like, 100%, I agree that, like, if you're talking to a real database, you've usually got a lot of structure, and everything's, like, really, really solid. But I have, back from the battle days when I was doing a lot of MySQL, we would have to sit down and go, is this table fast write, and we don't care about search? Or do we need fast search on this? And if so, can we pay the cost to index it? MATT: I think it's specialized, right? DAVE: Mm-hmm. Mm-hmm. MATT: There's certainly cases where you will see this. If you have one insert and that insert is inserting four and a half million rows, and I see this, it's a problem. But I think it's a more specialized case and more one-off. 
But if you have 30 indexes on columns in a table that you're inserting 4 million rows at a time, you definitely see performance degradation for sure. EDDY: So, that's kind of interesting, right? Because I think, historically, what we've done is we try to remove unused indexes, right, like when they become unnecessary. I think the rule of thumb is clean up to avoid degradation, right? But then it's kind of interesting because I think Bill's response to that is, I haven't seen that in practice slow down any inserts or writes, right? So, I'm curious, like, is it just, like, the thought process is clean up after yourself always, regardless of whether that slows down degradation? Or is it... MIKE: So, I asked the question because I think that there's so much power in indexes and, you know, it seems like that cost to write isn't that high, but I think there are other costs. Even if it was negligible, even if there was no impact at all, I think if you want to understand what's going on and you've got 30 indexes, you're in trouble. Like, I think that that cleanup matters just for, you know, we don't write in machine code because code is written for humans, right, and then compiled for the machine to understand. And I think the same thing applies here. There's a human aspect to this that unless you actually needed those, I think that you're doing active harm to the users of that, you know, to understanding or even trying to fix if there's a problem, if you've got a bunch of junk there that you don't understand. I think that regardless of whether it doesn't have much impact, I think that there is still, like, a reason to keep it clean. Well, that's my thought. What are your thoughts, Bill? BILL: You really need to know your data and know your anticipated access patterns. Matt was talking about a scenario where you want to insert 4 million rows in the telecom industry, or sometimes you'd need to bulk load 300 million rows. 
That's a different access pattern than inserting a single row. And you need to approach things differently. In some of those bulk load scenarios, it was much faster for us to drop all the existing indexes, do the load, then re-add the indexes than it would've been to leave the indexes in place and let the database engine do all the maintenance. So, everything is an it depends answer, right? You need to think through those things. But the second thing that I want to talk about, which is highly related to the indexing fundamentals I just went over...And these aren't, you know, industry standards. These are just, according to me, some guiding principles of index design. The first one we've kind of already talked about...actually, both the first and the second. That is that an index will make data access faster, but an index comes with a cost to write times. So, just like Goldilocks and the Three Bears [chuckles], you know, you can't have too many or too little. It needs to be just right. In order to make it just right, you really need to understand your application, the business requirements of that application, the data that you're working with, the quality of the content of it. You need to understand the access patterns that are anticipated on that data, queries that you already know about, or anticipate. And by having all of that context, you can design a much better database schema and indexing strategy. DAVE: One of the things that kind of blew my mind, like, 5, 10 years ago as I was getting into, like, NoSQL, and schemaless databases, and document databases, and that sort of thing, was somebody pointed out to me that SQL is actually a terrible language for reporting. It's not built for reporting. SQL is an ad hoc query language. It's how you get at your data when you have no plan to get at your data. 
And all the cool stuff, all the bells and whistles, like the query planner and indexes, are basically trying to get around the fact that you weren't prepared to do this. And if you do know what you want and you've got that report well defined, you can make it so, so slick, whether it's indexing in advance, or materializing views, or shoveling everything over to the data team and letting them stick it in gigantic vertical tables. EDDY: So, I think I've always just gone hand in hand with saying, "Oh, you want reports? You want SQL because that is the most efficient language used in any sort of database," right? Are you suggesting, or do I understand that correctly, that you're saying that that isn't its original intention? Because you're blowing my mind if that's the case. BILL: If you want something that's super efficient, it's NoSQL, because a collection is built to match an anticipated access pattern; it will blow relational out of the water. I love that David brought that up because that was the next line of my presentation. DAVE: Yes! BILL: Is that one of the greatest strengths of relational databases is its ability to handle ad hoc queries. So, you try and totally understand your system and try and anticipate the queries that will be required of it. And some you will know right off the bat because they're right there in the requirements documents, but there are still plenty of ways in which that data may be used in the future. And you just use your experience and your gut instinct to say, "Okay, we're probably going to need an index for these three columns here, because of what they're named, and what I expect, you know, the end users to want. This JSON data column, it's unlikely they're going to be indexing off of that, you know," so you make your best guesses. But yeah, a relational database is really good at ad hoc queries. And if you have done your very best to index intelligently, then it will be able to handle most of those ad hoc queries well. 
And the ones that don't will be generally immediately apparent, unless you're on a tiny system, you know, trivial system. And then you can redesign things; maybe toss an older index and redesign a new one that's composite or partial or, you know, fancy in some way that matches your needs. MIKE: I've got an anecdote related to this as well when you talk about NoSQL. BILL: Yeah [inaudible 19:27] MIKE: Some years ago in my career, I was working at a place that did content management, largely for newspapers. And you think about what an online newspaper page is; it is effectively searching the latest content. You don't usually think about that. Like, that's search? Well, yeah, it is. You want to just get the latest things. It's a feed, right, and then the oldest things drop off. That's more obvious to get something like the social media, where literally is this feed where it comes in from the top. The newspapers, even old paper newspapers, you have the latest content, and the most important stuff comes to the top, and other stuff flows down to page eight. And we organized our presentation that way. It really was doing searches. And it got slower because a lot of that relied on text. You know, you're looking for this kind of text, well, this is the weather, right? We want to look at the weather stuff. And to make our system efficient, we had to get out of relational databases. And we used a full-text index; we were using Solr at the time. It's similar to Elasticsearch or OpenSearch, all these descendants of the old Java Lucene library that allow you to efficiently build an index into your data. But it also is effectively a NoSQL database because it's searching the data. In fact, you can even cache the data in that index, and so you never even hit your database. And we would do that sometimes, where we'd never even hit our relational database at all. We just used that to store the data before we indexed it, and then it came out of that other system. And we could not run. 
We absolutely could not run off our relational database because it was way too slow; it was unworkable. We had to use, you know, that NoSQL database in order to work. BILL: Yep. And there was a telecom company I worked for in 2021 where they needed wickedly fast writes. And so, there we went with Cassandra. We weren't really worried about indexing or reading. Now that I'm clear that this is not a presentation, I'm going to possibly just fly through the middle portion of it, which is where I train the listeners on the different sorts of indexes available to us in Postgres. Maybe I'll just mention them briefly and then get to the dos and don'ts at the very end. In Postgres, we have a number of basic indexing types available to us, many of which are covered in your basic computer science courses, like B-tree and hash. We also have some index types called GIN, which stands for Generalized Inverted Index, and GiST, which is a Generalized Search Tree. The GIN indexes can support many different user-defined indexing strategies out of the box. We typically use them when indexing columns that are arrays on a Postgres table. But it is also used to aid in the implementation of full-text search on Postgres. And it comes built in with various operators that let you do things like nearness, and contains, and stemming, and some other things that a typical B-tree doesn't allow you to do. The GiST index is fairly similar in that it allows you to build your own indexing strategies. It supports nearest neighbor searches, geography, spatial, and other very specialized types of indexing. This is the index most typically used for features within an application that have mapping features, allowing you to see how far away you are from the pizza origination point and things like that. MIKE: Well, you know, that's interesting. 
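Editor's sketch: the "inverted" idea behind the GIN index Bill mentions for array columns can be shown with a toy structure that maps each array element back to the rows containing it. This is a conceptual illustration in plain Python, not how Postgres implements GIN; the class name `InvertedIndex` and the sample tags are made up for the example.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: maps each element to the set of row ids containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, row_id, elements):
        # Index one row's array column, element by element.
        for element in elements:
            self._postings[element].add(row_id)

    def contains_all(self, elements):
        """Return row ids whose array contains every requested element."""
        sets = [self._postings.get(e, set()) for e in elements]
        if not sets:
            return set()
        return set.intersection(*sets)


idx = InvertedIndex()
idx.add(1, ["pizza", "delivery"])
idx.add(2, ["pizza", "dine-in"])
idx.add(3, ["sushi", "delivery"])
print(sorted(idx.contains_all(["pizza", "delivery"])))  # [1]
```

A "contains" lookup becomes a set intersection over the elements asked for, rather than a scan of every row's array, which is roughly why this shape suits array and full-text columns.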
And even I think, well, that's a very specialized case, but the most common query in our biggest application is one of those geographic queries, in order to find your [crosstalk 23:26] BILL: That's in merchant portal, right? MIKE: Exactly. BILL: We're using that there. Then there's a really specialized version, which I have never used, called Space-Partitioned GiST, SP-GiST. The official documentation says it's non-overlapping. It lets you build your own indexing strategies, just like GIN and GiST. It's very flexible. It permits implementation of a wide range of different non-balanced disk-based data structures, such as quadtrees, k-d trees, and radix trees. If you guys know what those are, that's awesome. I did not go through computer science, so I'm not even sure when those would be helpful. But, apparently, it is also used in geolocation-type applications. The other sorts are a little bit more used. I've only used BRIN once. BRIN stands for Block Range Index. These are best for columns whose values correlate with their physical order in the rows of the table, so think of, like, a now-serving number at a queue or a kiosk. As that number is monotonically increasing and going to some database column, that number is very close to the value of the row that preceded it. That is a perfect column for a BRIN index. And the reason you would want to use it is it uses far less space than a typical B-tree because it uses ranges instead of individual values. Uh, let's skip over that. MIKE: Do those ever get used for, like, primary keys or anything like that, for efficiency, or not really? BILL: No [chuckles]. Most every system I've ever worked on defaults to B-tree for a numerically based primary key or surrogate key. But, in theory, it could be used for one of those. Yeah, I've only ever used it once, and, typically, the B-tree is fast enough. I don't need to eke out another hundredth of a millisecond by using a BRIN. So, typically, the default is used. MIKE: Interesting. 
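Editor's sketch: Bill's description of BRIN (ranges instead of individual values, best when values follow physical row order) can be illustrated with a toy block-range summary. This is plain Python and only a conceptual model, not Postgres's BRIN implementation; the function names, the block size of 100, and the "ticket number" data are assumptions for the example.

```python
from typing import List, Tuple


def build_block_ranges(values: List[int], block_size: int) -> List[Tuple[int, int]]:
    """Summarize each block of consecutive rows as a (min, max) pair."""
    return [
        (min(values[i:i + block_size]), max(values[i:i + block_size]))
        for i in range(0, len(values), block_size)
    ]


def candidate_blocks(ranges: List[Tuple[int, int]], target: int) -> List[int]:
    """Block numbers whose range could contain target; every other block is skipped."""
    return [n for n, (lo, hi) in enumerate(ranges) if lo <= target <= hi]


# Monotonically increasing "now serving" numbers, stored in arrival order.
tickets = list(range(1, 1001))
ranges = build_block_ranges(tickets, block_size=100)
print(len(ranges))                    # 10 summaries instead of 1000 entries
print(candidate_blocks(ranges, 457))  # [4]

# The same values stored in scattered order: every block's range now overlaps 457.
scattered = [(v * 37) % 1000 + 1 for v in range(1000)]
scattered_ranges = build_block_ranges(scattered, block_size=100)
print(len(candidate_blocks(scattered_ranges, 457)))
```

With ordered data the summary narrows a lookup to one block; with scattered data it narrows nothing, which is why the correlation with physical order matters so much for this index type.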
Maybe if you had some massive data set of sequential data or something. BILL: Yeah, because this happened to me in Telecom, where we were trying to eke out every millisecond we could find. This might have been one of the strategies we tried. DAVE: We had a really fun...out here in Utah, we used to have Mountain West RubyConf for about 15 years, and it was a fantastic conference that got put on. And James Golick came out. He was the CTO of FetLife. Don't Google that. It's an adult website, social media for naughty stuff. And because it's naughty stuff, it was very, very popular, and he was running, like, a terabyte of notifications through their database, like, every single day. And this was in, like, 2009, so, like, a terabyte was a lot back then. So, imagine somebody shoveling around a petabyte or, you know, half an exabyte trying to get that through. And they were using a popular document database; it's not my fight to have, so I won't say which one. They bogged down so hard. They kept backing up and backing up. They bogged down so hard that he had to physically pull the cord on the server. Like, he couldn't shell into it to stop the server. He couldn't, like, I don't know if he had a bash prompt. He couldn't get the keyboard to respond. Could not get ACPI power button, that's when you hold down the power button on the front of the case, could not get that to respond. The document database was just spooling everything; it had just backed up and backed up and spooled out. He ended up writing Friendly ORM, which is based on FriendFeed. And if you want to know how a document database works, go tear Friendly ORM apart, if you like Rails, because it's built on SQL. It runs on MySQL or runs on Postgres anything. And your data, your documents go in a table that has an ID and a blob column. And all of your indexes are tables, and every table has an index, a record ID, and whatever data you want to go look up, and it's got an index on it. 
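Editor's sketch: the layout Dave describes (documents in a table with an ID and a blob column, plus a separate indexed table per searchable field) can be sketched in a few lines. This is not Friendly ORM's actual schema or API. It uses Python's built-in SQLite instead of MySQL or Postgres, and the table names, the `email` field, and the `save` and `find_by_email` helpers are all hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Documents live in one table: an id plus an opaque serialized body.
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body BLOB NOT NULL)")
# Each searchable field gets its own side table, and that table carries a real index.
conn.execute("CREATE TABLE index_users_email (email TEXT NOT NULL, doc_id INTEGER NOT NULL)")
conn.execute("CREATE INDEX idx_users_email ON index_users_email (email)")


def save(doc: dict) -> int:
    """Store the document and its index entry in one transaction; return the new id."""
    with conn:
        cur = conn.execute(
            "INSERT INTO documents (body) VALUES (?)",
            (json.dumps(doc).encode("utf-8"),),
        )
        conn.execute(
            "INSERT INTO index_users_email (email, doc_id) VALUES (?, ?)",
            (doc["email"], cur.lastrowid),
        )
    return cur.lastrowid


def find_by_email(email: str) -> list:
    """Look up documents through the index table, then decode the blobs."""
    rows = conn.execute(
        "SELECT d.body FROM index_users_email i "
        "JOIN documents d ON d.id = i.doc_id WHERE i.email = ?",
        (email,),
    ).fetchall()
    return [json.loads(body) for (body,) in rows]


save({"email": "ada@example.com", "name": "Ada"})
save({"email": "bob@example.com", "name": "Bob"})
print(find_by_email("ada@example.com"))  # [{'email': 'ada@example.com', 'name': 'Ada'}]
```

Any field without a side table can only be found by decoding every blob, which makes the cost of searching an unindexed document store easy to see.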
And he just handled that in the ORM. And when you talk about writing something in anger, he ripped out that document data store that same day. Like, it was on a Thursday or a Friday, and on Monday, they were running on Friendly ORM in MySQL. It was insanely angry. So, yeah, if you want to know how NoSQL works, like, under the hood, it's fantastic because you get into it, and you go, wait, is this all there is to it? And yeah, that's all there is to it. All the stuff about, like, crawling over a database and indexing it and then searching back through, like, the problems of searching a document database when you don't have an index, it's very obvious because there's only, like, four moving parts. It's really, really cool. BILL: Fabulous anecdote. I saved the B-tree index type for last because it's the most common; it's covered in computer science courses. But I just wanted to cover just a couple of nuances in Postgres, a couple of which I had to learn the hard way a year or two into using Postgres. One of which is that the B-tree is really good for less than, greater than, less than or equal to, greater than or equal to, and equals. It can also support some other equality and range comparisons, like the LIKE operator, BETWEEN, IN, IS NULL, and IS NOT NULL. But there are a couple of cases it doesn't support out of the box, and one of them involves the LIKE operator. If you do a LIKE comparison and you feed it a pattern that starts with a wildcard, it can't use the index. That nullifies the use of the index for that comparison, and the database will do a full scan on the table. In order for the database to use an index for that kind of pattern, you would need to use a GIN index with the trigram ops. I think it's called an operator class. Anyway, each of these index types has the basic default syntax for creating an index of that type, and then it has a whole bunch of optional things. 
If you want to really know your stuff, get into the Postgres documentation and look at those options sometime. That's where you see some of the richer things, like, for the B-tree, it has some operator classes called text_pattern_ops, varchar_pattern_ops, and bpchar_pattern_ops that I didn't even know existed until about three years ago. I won't go into those right now. But just know that there's a range of flavors of these indexes that you can activate by knowing what those options are and knowing when they'd be useful. So, with B-tree, there are a variety of flavors of the B-tree index. There's the one that we use the most often, which is a single-column default B-tree. I won't talk more about that. The second flavor is a multi-column one. This can be used for indexes, sometimes referred to as keys, which are composed of 2 to 32 columns. You're limited to 32. I've honestly never seen any with more than 5. This sort of multi-column index is used for queries where two or more columns are always or frequently used together in the WHERE clause. During index creation, you know, you say CREATE INDEX. You give it a name ON table_name, and then in parentheses, you list the columns that you want indexed. You list those columns in the order of selectivity. So, if you had, for example, a table of people, or employees, or citizens, which would you put first: social security number or eye color? KYLE: Low cardinality first. BILL: Yeah, yeah. So, the thing that would return the least amount of matches first would be social security number, which is unique. So, yeah, higher selectivity goes first; lesser selectivity goes towards the right. MIKE: That's an interesting one because I think that those multi-column indexes don't get used as much as they could. A lot of the big, gnarly, slow-running queries do query against several, you know, they query against a number of things. 
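Editor's sketch: the multi-column syntax and column ordering Bill describes can be tried out with a query-plan check. This uses Python's built-in SQLite as a stand-in, so the planner behavior shown is SQLite's, not Postgres's, and Postgres makes its own cost-based choices. The `people` table, its columns, and the `plan` helper are hypothetical names for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, ssn TEXT, eye_color TEXT, name TEXT)"
)
# Most selective column (ssn) first, least selective (eye_color) to the right.
conn.execute("CREATE INDEX idx_people_ssn_eye ON people (ssn, eye_color)")


def plan(sql: str, params: tuple = ()) -> str:
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Both indexed columns in the WHERE clause: the composite index applies.
both = plan("SELECT name FROM people WHERE ssn = ? AND eye_color = ?", ("123", "brown"))
# Only the trailing column: the leading column is missing, so the index does not help here.
trailing_only = plan("SELECT name FROM people WHERE eye_color = ?", ("brown",))

print(both)
print(trailing_only)
```

The second plan falling back to a table scan is one reason the column order in a composite index deserves thought: the index serves queries that constrain its leading column far better than queries that only touch a trailing one.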
How much benefit do you get from using a multi-column index rather than having several columns indexed independently? BILL: A lot. The trick is knowing when you should have it. If you look at some of our queries on our tables and you run EXPLAIN ANALYZE on them, and you see in the query plan that it's going to be doing a lot of bitmap ANDs operations, bitmap ANDs are combining single-column indexes together in order to arrive at the answer quicker. If it's doing a bunch of bitmap ANDs and it's doing that over and over again, it's possible that you have a very common query that should have those two or three columns put together in a multi-column index. But if that same query has, you know, 50 flavors of queries that are being thrown at it, you wouldn't want 50 multi-column indexes to match each of those queries. So, it's that balance we were talking about at the start. You have to know which of those queries are the most important, which ones are being hit a million times a day, and which ones are being hit four times a month, and plan accordingly. And that's something...one of the 15 projects I'd love to do here is optimize that. MIKE: So, you're going looking through your slow queries, you know, using whatever tool you're using. It sounds like you'd, you know, have that in your toolkit at the ready if you see a number of...if you see queries that are, like, oh wow, that's checking against four columns in this table, you should probably have an index on those if it's doing those bitmap AND, or bitwise AND that you're talking. BILL: Yeah. And if that query is being hit many times per day, it's a good use case for them. MIKE: You know, most of the queries that tend to run really slow are doing joins, so I'm going a bit far afield here. So, what if the data's across six different tables, but you're running it all the time? That's a slightly different case. Do you have an approach for that specific situation? BILL: Well, you first try to optimize that query. 
By the way, I have a few cardinal rules about query performance. And the first rule is asking whether or not this query should even be done. You would not believe how many times something was really, truly awful, and we asked that question: do we even need this feature, or should we even be issuing this query? And how often the answer was no. The second cardinal rule of query performance is, if it can be done in SQL, do it in SQL, instead of, you know, dragging the data out of the database and trying to replicate a database in, you know, in the middle tier. And the third cardinal rule of performance tuning has to do with the indexing that we're talking about. If your data is well-designed...well, it's making sure that the application data model has been well designed. Usually, when I had a really terrible performance problem, it was because the data model was not good. So, I covered two of the things that most commonly fix massive performance issues, and those were that something doesn't need to be done at all, and that the business requirements weren't well understood. Once those things have been accounted for and your data model's good and clean, and you've made sure that everything's indexed well so the joins can be efficient, well, you've got this 6, 8, 14-table join. You've done everything you could, but it's still not fast enough. That's when you start exploring denormalization. And that typically leads a relational database person to a materialized view. In Oracle, that was really beautiful because it had a built-in facility to keep that materialized view refreshed upon commit. Postgres is just getting to that now with an extension called pg_ivm, Incremental View Maintenance. I think it's coming standard with 17 or 18. But, anyway, that's when you've done everything you could and dotted all your i's and crossed all your t's, and it's still not fast enough; that's when you need to look into materialized views. 
And if that doesn't work, then you're probably on the wrong database engine for your use case, for your application. MIKE: That makes sense. WILL: Generally, it goes back to, like, sort of, like, database performance, like, in general. I'm not a database engineer. I know, like, an index and a join and, like, how all this stuff works, like, under the hood. But, like, I suppose, like, the biggest query that I've got from, like, a database, like, somebody who makes databases their trade is, like, if I'm looking at a database performance dashboard, like, what am I looking at to sort of, like, diagnose performance issues? Like, how are you looking at...when you look at, like, a database and, like, how it's running, right? I know if I have a server and it's like, oh, it's using too much memory, okay, there's a problem. My queue depths are starting to, like, get really big, okay, that's a problem, right? But, like, when you are looking at, like, sort of, like, the dashboard of a database, like, what are you looking for to say, like, oh, okay, this is a problem; this isn't a problem, you know what I mean? I'm just curious, like, how do you sleuth out these performance issues? BILL: Yeah, it's not too bad. A mature, well-instrumented database engine usually comes with some facility that allows you to peer into the resources being consumed by all the queries in the system, and it'll show you front and center what the hotspot is. I mean, if the database is really hurting, it's usually pretty obvious. Sometimes when it wasn't obvious, it was due to the network and something else. But yeah, usually when you peer into a dashboard, there's a big, old bar, a big, old spike, a flame, that shows you exactly where most of the runtime is being consumed. And you're able to click into that, and it will usually tell you which query it is. Now, there, a lot of the dashboards kind of let you down in that they only give you a piece of that SQL. 
And very often, you need to see the entire SQL in order to figure out what the culprit is. Once you have the entire SQL, then you're able to run it either through EXPLAIN, which gives you an estimate of what the database would do, or, if you are able, run an EXPLAIN ANALYZE, which will show you exactly what the database is doing when it's pursuing the data. And that is where it's really critical to know both the database engine's indexing and your data, in order to determine whether the query path that the planner is showing you in the explain plan is the plan it should be using. So, you look at all these steps, and you need to know how to read it. Okay, it's doing this one first, then this one, then this one. And if you know your data and you know what it should have been starting with and what it should have been doing next, and you look at that plan and it's not doing that, then you know you have an issue. You know you're missing statistics, or you're missing an index. Or some table got accidentally blown out with 5 million rows the other day. It was a bug. And they got rid of those 5 million rows, but they forgot to reduce the high watermark, so the database still thinks it's a massive table, and so it's making the wrong join choice. That's where the expertise comes in. That's why you get paid the big bucks, is being able to combine all those things and figure out, yeah, the database is not doing the right thing here, and here's what it should be doing. And how do we get it to do that? WILL: How can you tell, like, differentiate between just a hot query that's just a hot query? Like, a lot of people want the homepage, let's say, you know, like a [inaudible 38:49] example, right? How do you say, like, oh, this is just, like, everybody wants the homepage, versus, oh, the homepage, you know, is misconfigured, right? Like, how do you tell the difference? 
BILL: The vast majority of the systems that I've built have been well normalized, and designed, and indexed, and so forth. So, when we had an issue, it was because something changed, and it was more reactive. Someone noticed an issue, they called us. We looked, oh yeah, yeah, like that scenario I just described, where a table got blown completely out of proportion and shrunk the next day, and it changed the nature of the query path. Ideally, you would have a more heuristic system that learns from what is typically running on that database so that when something's out of the ordinary, it alerts you ahead of time. I've never lived in such a world; that would be lovely. I have not seen it. They probably exist. WILL: Oh, don't worry, don't worry. If the database starts going south, we'll call you. [laughter] BILL: I might be conflating this with my previous client, but there's a tool called...there are several, but one that I've used most recently was called SolarWinds. I don't know if any of you...Kyle if... KYLE: Yeah, that's the one we use here. BILL: Okay. And I haven't been using that, or I haven't had a need to use that heavily here. But I believe it has some facilities like that to tell you the difference between one that is frequently hot and heavily used, versus one that's not been seen before and is consuming all the resources [inaudible 40:19] MIKE: You know, Kyle, I've been meaning to ask you...because we're talking about the monitoring, because you get asked those questions. People come and say, "Hey, DevOps team. Everything's on fire. What do I do?" And you're like, "I don't know your system. I'll pull up a dashboard," and you usually manage to find something [chuckles]. Like, what's your tactic, Kyle, for finding database problems? KYLE: Database problems, I usually look for high I/O, disc depth, CPU, memory, and then connections. I'd say spiking connections would tell me quite often that there is a problem. 
And then that queue depth, if that queue depth gets very large, we know we've got a gnarly query in there somewhere. And then, at that point, that's going to trigger me to go look at a tool like New Relic, or, you know, something that can do the APM analysis from the service side and tell me, like, what that query might be. And then, from there, generally, we're able to say, oh, we're missing an index here. You guys should go add this index, and that'll increase performance again. BILL: It's when Kyle and DevOps reach that point that they usually involve me. So, that's why I wasn't able to answer your question [laughs] terribly well, because I'm usually getting skipped until that point. WILL: So, if you had, like, a lock or something that was deadlocking on a database, or, like, a, you know what I mean, like, some kind of table lock, like, how would that manifest itself? KYLE: So, that'll show in your performance insights tool. I did skip over that. That's another one that we commonly look for. We go in, and we see if there's a query that's got a lock on it. MIKE: [inaudible 42:01] WILL: How does that manifest, like, a bad lock where you're stuck, versus like a good lock, where it's just, like, business as usual? You got to lock a table; that'll happen. KYLE: Yeah. Most of the time, I throw that back on the engineers. But if it's been locked for, you know, I've got a query that'll look for any locks that are over five minutes. And if it shows up in that query, I think we've got an issue. MIKE: Makes sense, long-running locks. Good lock is a short lock, yeah. BILL: There are a few preventative parameters that we could be using in Postgres that we're not, that can prevent idle transactions from hanging around too long, statements that take too long, and can log and notify when some of these things happen. It's one of the things I'm going to be talking to the engineering managers about in the near future. 
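Kyle's five-minute rule can be sketched in a few lines of Python. This is only an illustration, not Kyle's actual query: the `HeldLock` record, its field names, and the sample data are hypothetical stand-ins for whatever a monitoring query or performance insights tool would return.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class HeldLock:
    """One lock currently held; the shape of this record is hypothetical."""
    pid: int
    table: str
    held_since: datetime


def long_held_locks(locks, now, limit=timedelta(minutes=5)):
    """Return the locks that have been held longer than `limit`."""
    return [lock for lock in locks if now - lock.held_since > limit]


now = datetime(2026, 1, 1, 12, 0, 0)
locks = [
    # A short lock: business as usual.
    HeldLock(pid=101, table="orders", held_since=now - timedelta(seconds=2)),
    # A lock held for seven minutes: worth a look.
    HeldLock(pid=102, table="leases", held_since=now - timedelta(minutes=7)),
]
suspects = long_held_locks(locks, now)
print([lock.pid for lock in suspects])
```

Bill does not name the preventative parameters he has in mind. Postgres settings in that general area include `idle_in_transaction_session_timeout`, `statement_timeout`, `log_min_duration_statement`, and `log_lock_waits`; whether those are the ones he means is an assumption.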
Just to finish off the theme of the B-tree indexes, there are three other flavors. One of them is covering. It's kind of an interesting name. I prefer to call them payload indexes, but Postgres calls them covering. And that is where you index a column or columns that you want to match on or to quickly narrow down your matching data. But you also include a couple or more columns that are part of the select list. You're not necessarily matching on them, but they're part of the data that you're looking for. And by doing that, you can potentially get what is called index-only. You can get index-only queries, where they don't even have to touch the table. They're able to satisfy everything that the query wanted in its WHERE clause and everything the query wanted in its SELECT clause, just from the index columns and the payload in the INCLUDE portion of the index. So, those are called covering indexes. Another flavor of B-trees are called partial indexes or conditional indexes, and that is where you get to use a WHERE clause in your index creation. And that is where you only index a row if the row matches a certain condition that you have. And that can be valuable when you have a 700 million row table and only 5 million of them match a certain criteria, and those are the only rows of interest to you anyway. So, you'd only index those 5 million rows that match that criteria, that way, you're not indexing 700 million rows, and 695 million of them are a waste. Finally, we have function-based B-tree indexes, and these are used where you know that your access pattern needs to compare the column where the column has been manipulated by a function. Like, the more commonly used example is where you want to compare a given index search term that was obtained from a field in a web app or a mobile app, looking for a matching email, and you don't want to deal with all sorts of possible email variants that the customer might have fat-fingered into the database. 
And so, you want to normalize the data. Ideally, you'd normalize it before it gets written, but let's say you didn't. And so, you want to wrap the email column with a lower function. Well, now you've just excluded yourself from using the index on the email table or the email address column, I should say, because you wrapped it in a function. But you can index the application of the lower function on the email address column, and that's called a function-based B-tree index. And there's all sorts of functions. Eddy and I were exploring the use of full text search in merchant portal, and that requires a call to the to_tsvector function. And to make that quick, you would want to create an index on the to_tsvector of the textual columns that you're full-text searching or allowing a full-text search upon. There were a couple of things I wanted to cover, some dos and don'ts, some gotchas about indexing. Again, I mentioned this in 2024, but I want to mention again to anyone who's tuning in to the podcast. The first one is that you should index each key. Now, you don't have to worry about primary keys or unique keys. If you, in your data model, are declaring a certain column or a combination of columns to be your primary key or your unique key constraint, the database will automatically create an underlying index to support that uniqueness check. Now, the foreign key constraint, you should also index by default. Some of those don't end up getting used, so we can clean those out, but they should be indexed. And the database does not index a foreign key constraint automatically. And that's why that's one of the things I'm checking for when I'm doing data architecture reviews. Just a little anecdote to go along with this. One company I went to work for in 2019, when I walked in the door, they had multiple dumpster fires in their flagship Oracle database. They had had a data architect up until 2015. 
They had been doing without one since then and had lost sight of a couple of best practices, one of them being indexing your foreign keys. It turns out that they had two primary causes for all their performance issues, and the biggest issue was the lack of foreign key indexes. You know, the system had been evolving and growing; features had been added; columns had been added. And, over time, they had added 53 columns that were child columns logically related to parent tables, and none of those 53 columns had indexes on them. And that's normally not a huge problem if you're not querying on those columns; you're not joining on those columns. But if you try to delete a row from a parent table that, through a foreign key constraint, is related to data on a child table, when you go to delete that parent row, it has to scan through, ideally in an indexed manner, all the child tables related to it, to determine whether or not it can safely delete the parent row or whether it's going to create orphans. If it's going to create orphans, it says, "I can't. There's child rows that still pertain to this parent value." Well, at this company, they were trying to adhere to the GDPR regulations because they had customers who had employees in Europe. And when those employees would leave, GDPR says, "You should be able to request that all your user data be removed." Well, because all of these foreign keys had been added without supporting indexes, their attempts to remove user data had been getting slower and slower. The last time they'd been able to run it had been two or three years before I got there, and it had taken 24 hours to remove a single user, and then they just gave up. When I got in there, we added the missing foreign key indexes, and immediately, we were able to catch up on 2.4 million user deletion requests in two hours. From 1 user taking 24 hours to 2.4 million in 2 hours. Indexes can make a huge difference. So, what else should be indexed? 
Index each column used in filters, otherwise known as WHERE clauses or predicates. Index each column used in a join. And if the join is to a multi-column key, that's when you want to index the columns together, of course. If you have a multi-column key and this particular flavor of query that you're sending at this table doesn't use the first column in that key, but it does use the second column, that's not a problem in Oracle because they have a feature called skip scanning. It can...I'm not sure how exactly they implemented it, but it can skip over that first column, and it can index on the second column in the multi-column key or multi-column index. It turns out that many users of Postgres have wanted that for many years, and it is now a native feature as of Postgres 18. So, that was some good news I wanted to share with you. We're currently on 17.4, but I imagine that 18 is not too far off for AWS. What should be avoided? Over-indexing. Back when I thought this would be a presentation, I wanted to demonstrate that we have a number of tables in our systems that have well over 25 indexes. Did I say, "Indexes"? We have a number of tables in our systems that have well over 25 indexes. And one I was looking at the other day has 31 indexes on it. And of those 31 indexes, 15 of those indexes have never been used. MIKE: So 50% basically. BILL: Yep. There's a whole lot of cleanup that we could be doing. So, that's why it's important to monitor indexes over time to make sure that you're not leaving a bunch of crufty indexes around that aren't touched. Let's see. Avoid indexing a column more than once in the leading position of indexes on the same table, and we have a lot of that going on as well. Don't index columns with very low cardinality. So, if you have a table with a hundred million rows, you wouldn't want to index the active flag column, where 50 million are Y and 50 million [inaudible 50:54]. That's not going to do you any good to index that. 
Avoid indexing mostly null columns. We talked about that when we were talking about partial indexes, where you can use a WHERE clause to avoid indexing those columns that are mostly null. And avoid indexing columns that are heavily updated; that one involves some trade-offs and understanding of your system. WILL: So, what's the drawback to, like, if I have a column, let's say, you know, like, I don't know, date of birth or something, right? And I don't want to have an index off of date of birth twice. So, I don't want to have an index, like, date of birth and, like, zip code, and then also date of birth and phone number area code, I don't know, whatever, you know what I mean? Like, I don't want to have that. If I understood what you're saying, like, correctly, I don't want to do that. I don't want to have date of birth in X, and then date of birth in Y, and then date of birth in Z. What's the issue, or what's the correct way to approach that, right? Because I could think of scenarios where that'd be relevant. BILL: Yeah. So, let's just use some aliases for some columns in a table. If you looked at your indexes and you see an index on A, another index on A comma B, and another index on A comma B comma C, you would want to get rid of the first two and keep the third one because that satisfies all three. If you instead had looked at your indexes and you had index on A, index on B C A, index on D F G A [chuckles], that's where you really need to understand your system, your queries, which queries you use most frequently. Do you go ahead and, you know, allow all of them? I don't have any really astute advice there other than do your homework and understand that, you know, if A is being used as the leading column in 3, 4, 5 indexes, it's very likely that a few of them can be eliminated. Sometimes though, it can't, you know, like, in that one example, you're seeing it is B A C or C B A. You may need to keep all of those around to satisfy some query-specific indexes. 
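Bill's A, A B, A B C rule is mechanical enough to sketch in code. The function below is a minimal illustration under stated assumptions: the index names and the dictionary shape are hypothetical, and it only compares ordered column lists. It ignores things a real review would also weigh, such as unique constraints, partial-index conditions, and how often each index is actually used.

```python
def redundant_prefix_indexes(indexes):
    """Map each redundant index name to a longer index that covers it.

    `indexes` maps an index name to its ordered tuple of columns. An index
    is flagged when its columns are a strict leading prefix of another
    index on the same table.
    """
    redundant = {}
    for name, cols in indexes.items():
        for other, other_cols in indexes.items():
            if (other != name
                    and len(cols) < len(other_cols)
                    and other_cols[:len(cols)] == cols):
                redundant[name] = other
                break
    return redundant


indexes = {
    "idx_a": ("a",),
    "idx_a_b": ("a", "b"),
    "idx_a_b_c": ("a", "b", "c"),
    "idx_b_c_a": ("b", "c", "a"),  # A is not leading here, so it is kept
}
result = redundant_prefix_indexes(indexes)
print(sorted(result))
```

The B C A case is deliberately left alone: column order matters, so that index serves different queries than the ones led by A, which is the "do your homework" part of Bill's answer.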
In our systems, we do have a lot of instances where we have an index on A, an index on A B, and an index on A B C, and those first two can be eliminated. We've got a lot of instances of that. A common mistake, and one that I frequently make as well, even when I'm doing the reviews, even though it's the...I think it's the second or third bullet point in the checklist. It says to make sure that your table has a natural key on it, which is a unique key constraint, unless duplicates are expected and welcomed. And even when I'm doing reviews, even though I wrote that list, even though I try to live by it, I still forget. When I'm looking at table designs, if I see a primary key, my mind says, yep, it's good. And I tend to forget to put a natural key on it to make sure that duplicates can't accidentally slip in there. So, that was something I wanted to get across. Another little tip in Postgres is to make sure you're using the keyword CONCURRENTLY on large index creation and rebuild, so that we aren't locking things up. And make sure that you test before and after index creation to make sure that you're getting your intended results. And that is the end of what I wanted to say. KYLE: So, my question...I feel like a lot of this, of course, comes from the viewpoint of a software engineer, right? And we've kind of discussed, you know, generally, indexes are good, with, you know, more wins than losses. But I'm also very aware, from a software engineer's standpoint, infrastructure is free. So, where I care about the non-existence of free infrastructure, at what point would somebody on the infrastructure team start getting nervous or start questioning the amount of indexes that we're adding? Because I assume this isn't going to be free. This is going to impact CPU, memory, I/O. And then the one that I'm thinking about the most, correct me if I'm wrong, but this will elongate the time that, like, a vacuum will run, right? 
And that's always a hidden cost under the hood when an auto vacuum kicks off during a querying issue. BILL: Yeah, unless you're going nuts with indexes like we are with some of our tables...Because, honestly, the most indexes I'd ever seen on any table before I came here was 17, and I thought that was crazy. And we have a number here that have 29, 30, 31. So, unless you're going nuts with index creation, you're generally not going to see a big drawback. The exceptions to that is when you start to get to massive scale, billion-row tables, lots of indexes on it. Now we've got to do a cleanup. For some reason, we need to do a VACUUM FULL, or we need to do a pg_repack. In both of those cases, it has to create a copy of that table and all of its indexes before it swaps them at the last second. And so, whatever space that that massive object is occupying, let's say the table is occupying two terabytes, you now need to have double that space in order to make that operation even work. That's where...massive scale is where things start to really show up and matter in cost. KYLE: Okay, so at large, large databases is when you're saying is when it'd become a problem, okay. [inaudible 56:26] BILL: I typically don't notice the blips until the table and its indexes are occupying more than, say, 200 gigabytes. That's when I start noticing. That's when I start feeling the pea underneath the mattresses. MIKE: I appreciate all the deep dive, you know, and the feedback, you know. You came prepared with this list of things, and we've been grilling you on specific use cases that we get down into the gritty details. I mean, there's going to be more, right? We could go on forever. But is it mostly just about following the rules that you've mentioned, and then you cover almost everything, and then the weird cases, well, they're going to be weird? BILL: For a relational database engine, yeah, I think I've covered most of the tips and tricks. 
So, if one can get good at the things I've talked about today, I think you can call yourself a full stack developer [laughter]. MIKE: People will call themselves a full stack developer. [laughter] BILL: The reason that I kind of chuckle at that is because since about 2010, most of the students that I've seen applying for positions that I've been hiring for have maybe done a hundred thousand row table on Mongo in a course in college, and they're calling themselves a full stack developer. I think they need to be hardened by database scars before they can call themselves a full stack developer. WILL: I think you should be able to build a mobile app, full stack developers. I see you all on your phones. MIKE: [laughs] WILL: Nobody knows anything about it though [laughter]. MIKE: Yeah. Then you've got to build the frontend and the backend. BILL: Well, thanks for having me on your podcast. MIKE: Yeah, thank you, Bill. I really appreciate it. You know, I started by talking about the importance of indexes and how they transform things before we, you know, transform our modern world, before we did the deep dive. Maybe I'll come back to that as we sign off. We got deep into technical details, and it's easy to think, oh yeah, you know, I'll worry about that sometime. But as Bill said, you know, you pay attention to these things. You go through your checklist, and then you don't have a table where you can't delete rows from it for years [laughs] because it's not possible. It's like hygiene and conscientiousness. It's brushing your teeth, and if you do that, your teeth are healthy. You end up having a much better life and much fewer calls at 3:00 a.m. Thank you, and until next time on the Acima Development Podcast.
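As a recap of the three B-tree flavors Bill described (covering, partial, and function-based), here is a small runnable sketch. It uses Python's built-in `sqlite3` module as a stand-in because it needs no server; that is an assumption made for illustration, not the Postgres setup discussed in the episode. SQLite has no `INCLUDE` clause, so the payload column is simply appended to the index key, and the Postgres spellings appear in comments. All table and index names are hypothetical.

```python
import sqlite3


def plan(conn, sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders  (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    CREATE TABLE users   (id INTEGER PRIMARY KEY, email TEXT);

    -- Covering ("payload") index.
    -- Postgres: CREATE INDEX ... ON orders (customer_id) INCLUDE (total);
    CREATE INDEX idx_orders_customer_total ON orders (customer_id, total);

    -- Partial index: only rows matching the WHERE clause are indexed.
    -- Postgres accepts the same WHERE syntax.
    CREATE INDEX idx_tickets_open ON tickets (customer_id) WHERE status = 'open';

    -- Function-based index on the normalized email.
    -- Postgres: CREATE INDEX ... ON users (lower(email));
    CREATE INDEX idx_users_lower_email ON users (lower(email));
""")

covering_plan = plan(conn, "SELECT total FROM orders WHERE customer_id = 42")
partial_plan = plan(
    conn, "SELECT id FROM tickets WHERE status = 'open' AND customer_id = 42")
function_plan = plan(
    conn, "SELECT id FROM users WHERE lower(email) = 'bob@example.com'")

for label, text in [("covering", covering_plan),
                    ("partial", partial_plan),
                    ("function", function_plan)]:
    print(label, "->", text)
```

The printed plans name the index chosen for each query. The exact wording varies by SQLite version, and Postgres's `EXPLAIN` output looks different again, so treat the text as illustrative rather than something to match against.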

Apr 29, 2026 · 59 min

Episode 96: AI & Code Reviews

This episode explores how AI coding tools are changing the role of code review. The hosts point out that AI can generate large amounts of code quickly and even review it, which shifts the bottleneck from writing code to reviewing it. While AI can handle repetitive or low-risk tasks like documentation updates or simple refactors, it can also produce inconsistent feedback and get stuck in loops. Because of this, teams need clear rules and priorities, such as focusing first on whether code works, then on security and performance. AI is useful, but only when its boundaries are well defined. The group discusses different ways to structure AI-assisted reviews. Ideas include using multiple bots to score changes, setting strict allowlists for what AI can approve, and blocking sensitive areas like business logic or database changes. They compare AI to a junior developer who can help but should not be fully trusted without oversight. Risk becomes a key factor, similar to self-driving cars where automation works best under specific conditions. Some participants prefer AI as an assistant that gives suggestions rather than one that approves code, since human judgment is still needed for context and decision-making. The conversation also highlights what is lost when humans are removed from the review process. Code reviews have traditionally been collaborative and educational, helping developers learn and improve through discussion. AI removes much of that interaction and can even create false confidence by being overly agreeable or flattering. This can lead to mistakes making it into production. In the end, there is no clear solution. Teams need to balance speed with caution, use AI where it adds value, and keep humans involved to maintain both quality and the collaborative nature of building software. Transcript: MIKE: Hello and welcome to another episode of the Acima Development Podcast. I am Mike, and I am hosting again today. With me, I have, as usual, Will Archer. 
We've got Thomas Wilcox. We've got Eddy Lopez. Dave Brady. DAVE: Hello. MIKE: [inaudible 00:35] join. And we've got, after a long absence, Tad Thorley [laughs]. TAD: Yeah, thanks for inviting me. MIKE: We bumped into him this week, and he came and joined us, so it's great to have you, Tad. And Tad actually kind of seeded our topic for today that we'd like to go into. As usual, I'd like to, you know, connect this to real life. I went fishing for a compliment today [laughs]. I was talking to my daughter at lunch time, and she was saying something to my youngest. I didn't even hear what she said, but she said something like, "Oh, because you're strong and tough." And I didn't know who she was talking to. And I said, "What was that?" She said, "Oh, I was talking, you know, I was talking to him." I'm like, "Okay, because I know that I am, you know, weak and fragile." And she looks at me [laughs], and then she says, "You are not weak. You are strong," something [laughs] along those lines. I thought, ah, thank you [laughs]. Thank you. Say nice things to dad. And I totally dug for that. Totally not deserved in any way [laughs], but I took it anyway. As humans, we like somebody to say something nice to us. It's always a good thing. But we also are totally prone to flattery. And [laughs] if somebody says something nice to us, we will believe it, whether it's true or not. Actually, this morning, early, I read a crazy story. Crazy story. And I'm not going to go into it in depth, but it involved a scammer in Mexico convincing a variety of U.S. movie executives to make a movie out of his story of being imprisoned by the Mexican cartels to play flag football [laughs]. DAVE: Flag football. That's the interesting-- MIKE: To the death. To the death. DAVE: To the death. Oh yes. Yes. MIKE: But, you know, you can keep [inaudible 02:21] WILL: But no contact until you die. MIKE: Exactly [laughs]. WILL: You're only going to take one tackle, but it's going to be a doozy. 
MIKE: I think they weren't allowed to tackle, but they were, like, breaking each other's teeth. And then if you lost, they took you out back with weapons, yeah. DAVE: It's jai alai. It's traditional down there. MIKE: [chuckles] It was a crazy story. Well, no, it was a scam artist who was pulling all this off from the beginning. But, you know, you can pull off a lot by just being really convincing and saying nice things to people, telling them what they want to hear. We'd like to talk today about code reviews [chuckles] and doing evaluations of human output. And we're in an interesting time period. A couple of years ago, even a year ago, maybe even six months ago, we would not have had this conversation. But there are tools out there now that can read your code and actually give pretty good reviews most of the time. In fact, in some ways, they're going to be better, and that "in some ways" is doing some work here. So, let me be clear: in some ways, they're going to be better than human reviewers. That is not universally true, I don't think, at this point. In fact, I think it's far from universally true, which brings us to our topic today. What does it mean to do code review today? There are tools that can do code reviews. What do they do well? What do humans do well? What does it mean? And we've talked before about code reviews. I think it's been a while. I think it's been maybe a year or two since we've talked about code reviews, the value of code reviews. So, we'll maybe touch on them maybe a little less this time. DAVE: And it was entirely a soft skills discussion, right? MIKE: Yeah, I think it was. I think it was. DAVE: Humans talking to humans. MIKE: Humans talking to humans. And now we've got the machines talking to the humans, and the humans talking to the machines, and the humans talking to the humans about what the machines are saying. It's totally scrambled. 
So, revisiting this idea of reviews with AI in the mix, now, Tad, again, prompted this discussion because he's been playing around with this and has found some solutions to some of the cases that go wrong [laughs]. There are degenerate cases where the AI will recommend that you change something, and then when it sees your changes, it'll recommend you go back to the way you were before [laughs]. If you're anybody who's used a linter, you've probably seen the same thing. It tells you to fix it, and then you cause a new problem. So, which one do you choose? That's where we get into art. That's not an unsolvable problem, but there are some interesting solutions there. Nor is it nearly the sum of all of the problems here because there are all kinds of edge cases here with reviewing with AI. With that introduction, Tad, I'm really curious for you to give us a little talking to about what you've been working on and some of the solutions you've found. TAD: Okay. Yeah. I just was mentioning something to Dave because I think what's really hard is I find that, with AI, I do way more code reviews than I've ever done before. And I was giving Dave an example because I can, like, just with my Claude Code setup, I was able to integrate it with Sentry, which is error tracking, and Linear, which is our task management, and GitHub, right, has a command line. And so, I could literally, with a prompt, say, "Look at our past 20 or so Sentry errors. Create Linear tasks for each one. Create a local work tree for each of those Linear tasks. Fix them in parallel in those work trees. Create a PR for each one, and assign Chris for every PR. Do that in parallel with subagents." And, for me, typing that up takes, I don't know, a few minutes. And now I've just given Chris, like, two days' worth of reviews, possibly, or something like that, right? Like, so much code could be generated so quickly and so easily that I find that the code review step is the biggest bottleneck. 
It usually is the bottleneck, but now it's multiplied. Like, it is absolutely the biggest bottleneck in the whole process. And I don't honestly know, like, a complete solution to that. But something that we were doing at work was actually bot reviewers, where we would say, you know, like, if your review looks safe enough, the bot will just approve it. And that was kind of an interesting experiment that we were doing where you have to -- But, like you were saying, Mike, one of the first issues that I ran into when the CTO kind of implemented that was I pushed up a PR, and it said, "This code is inefficient." And I'm like, okay. And so, I just had my Claude just keep checking GitHub and say...I told it every time it says there's a problem, fix it, and push up the fixes, and just do that until everything is approved, right? And my Claude Code, for about 45 minutes, tried that. And it kept flipping back and forth between like, "Oh, you're not doing enough security checks." Oh, "This code isn't performant enough." Oh, "It's not doing the security checks," and just back and forth in a loop. And my Claude Code, I could almost feel its frustration in its final message to me. It essentially said, "I cannot get a review past the reviewers. I keep going in this cycle, and they are never going to review this," and it just gave up [laughs]. And I'm like, wow, I've never seen a bot just straight up give up before, but here we are. So, yeah, like, that was our first, like, test of that. Our setup was, we had what we called the bot committee, where we had a Codex, and we had, like, a Claude Opus that would both review independently then, like, an aggregate score would be kind of brought together. And if the score was over a certain threshold, then it's like, okay, yeah, you can auto-approve this. But what I did last week was, I found I had to go in and be very clear in what was okay to pass and what was not, right? Like, you're updating some documentation; that's great, you know. 
You shouldn't have to have a human, like, approve your documentation update. Like, a bot can say, "Oh yeah, this doc does look like that code, green, right?" Or just, you know, like, variable name changes, like, oh, I clarified this by changing the name of a variable. A bot can look at that and just say like, "Cool," right? And, honestly, as a human, I'm loath to make those kinds of changes because I know I'm like, that would be nice, but I'm going to have to pester somebody to get that sort of change through. Even though it's trivial, I still have to message somebody on Slack and say, "Hey, can you look at this? It's trivial." And they have to, like, stop what they're doing and push a button, you know? And so, things like that, I think, are great. I think where it gets dangerous is that GitHub lets you have bots that just auto-approve. And what if you're making business logic changes? I don't know [laughs]. Like, you have to be very careful. I find that, with bots, you have to be very, very, very, very clear on the boundaries of what is acceptable, what is not acceptable, what are the edge cases. Very clear rules. Try to make it as deterministic as possible. Like, for my example of, like, the flip flop, I'm like, okay, we have to go through and say security is more important than, like, anything else, right? Well, I think number one is, does the code actually work? Does the code work is, like, number one. Security is maybe, like, number two. And then you kind of make a list of hierarchies. And then it's like, okay, well, it's maybe not as performant, but you've got to check authorization. You know, like [chuckles], you can't just let someone in, so that sort of thing. WILL: Can we drill down a little bit in sort of these concepts, right? Because, like, a lot of the stuff, like, I understand the general thrust of what you're saying, right? 
Where it's like, okay, we need to make sure there's guardrails and specific stuff, and you need to have the reviewer bots, two independent reviewer bots assign a score, right? And the score has to be below a threshold, right? But, like, a lot of that stuff is conceptually easy, but how do you do it, right? That's the interesting aspect, to me. TAD: Yeah, and that's where, I think, you have to get very, very, very clear, right? Like, you say, "Database migrations are off the table. Like, anytime someone changes the database, it has to be by a human. Oh, by the way, that is any file in the DB directory," right? Like, you have to say, like, "This is what a database change looks like. This is where it lives. This is how you identify it." And I feel like if you do, like, that level of specificity, then you get fairly good results. But if you're, like, vague like, "Make sure the code looks good," then they're like, "This looks great to me. It was written by a bot, and I like bot code." So, you know. WILL: Well, let me ask you this, right? What about that variable rename, right? So, I, like, principally, these days, for the moment, for today, I work in, like, a statically typed language, right? And so, like, if I'm doing, like, a rename, right, and I botch my rename for whatever reason, right, then you know, like, the compiler will choke and throw up a red flag. And so, I don't have to worry so much about, like, variable rename. But, like, if you have a dynamically typed language, right, where you don't have those guardrails, like, how can you be sure that my variable rename...it's like, you know, like, I named it something dumb or, like, it was a typo, and it was just embarrassing. I don't want the Git blame to point to, like, Will can't spell, right? So, I want to auto-generate that up, but if the bot, for whatever reason, dropped a stitch somewhere... TAD: Yeah, I don't know. Like, I think, at some level, you have to accept some risk. 
I think, with AI, there really isn't any guarantees. Like, you could say, like, bots are really good at pattern matching, and they're really good at grep, and they're really good at find and replace. So, I think that a variable rename is probably pretty safe. And I've got tests, and the tests pass. But, I don't know, I don't think you'll ever have 100%. I think you just say, like, what's the...you're doing a trade-off, right, of how much does approving these little PRs slow people down versus, is it worth the risk? And I would say you've got to kind of determine that. Like, is it likely that a bot will be able to figure this out? Yeah. Is it worth the possibility of the unlikely thing? Yeah. It's probably super safe, maybe not 100%, and it saves us enough time that it's worth it to us, right? WILL: Right. Well, I mean, it's similar to the self-driving car argument, right, almost exactly, right? Because there is a sort of a floor for risk, right? Just, like, to tangent over, to, like, self-driving cars, right? I know, like, for the average human being, the average number of miles driven, I'm going to kill these many people [laughs]. I'm going to crash these many cars, right? Like, we know out to, like, nine decimal places what that is because billions of dollars are riding on people's ability to calculate that. And so, like, if I know if I ship 100 PRs I'm going to give you a prod bug, I know I'm going to do it. I know I'm going to do it. I hope it's only 100, but I think 100 is pretty likely. So, if there's a 1% chance, send it, right? TAD: I would say, to use, like, your self-driving car analogy, you would say, okay, I'm okay with you driving this car. I'm okay with the car going into autopilot mode if the weather is good, if your lidar is active and running, and it's flat and straight, right? WILL: Right. TAD: If those conditions are met, go ahead. I'm going to take a nap, because I feel like those conditions are fairly well understood for self-driving cars. 
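Tad's "if those conditions are met, go ahead" framing maps onto a simple gate in which every condition must hold before a bot may approve. The sketch below is a hedged illustration of the rules described so far (working code first, the DB directory off limits, only low-risk change types, and a bot committee score over a threshold). Every name and number in it is hypothetical; the episode gives no actual threshold, labels, or implementation.

```python
from dataclasses import dataclass
from statistics import mean

# All names and numbers below are hypothetical; the episode gives none.
BLOCKED_PREFIXES = ("db/",)      # "any file in the DB directory" needs a human
SAFE_KINDS = {"docs", "rename"}  # change types Tad calls fine for a bot
SCORE_THRESHOLD = 0.8            # committee score required to auto-approve


@dataclass
class PullRequest:
    kind: str         # e.g. "docs", "rename", "feature"
    files: list       # changed file paths
    tests_pass: bool
    bot_scores: list  # one score per bot reviewer, each 0.0 to 1.0


def auto_approve(pr):
    """Return (approved, reason). Every condition must hold to approve."""
    if not pr.tests_pass:
        return False, "tests are failing"
    if any(path.startswith(BLOCKED_PREFIXES) for path in pr.files):
        return False, "touches a human-only path"
    if pr.kind not in SAFE_KINDS:
        return False, "change type is not on the safe list"
    if mean(pr.bot_scores) < SCORE_THRESHOLD:
        return False, "committee score is below the threshold"
    return True, "all conditions met"


docs_pr = PullRequest("docs", ["README.md"], True, [0.9, 0.95])
migration_pr = PullRequest(
    "feature", ["db/migrate/001_add_index.rb"], True, [0.99, 0.99])

print(auto_approve(docs_pr))
print(auto_approve(migration_pr))
```

Note that the migration is rejected even though both bots scored it highly: the path rule is checked before the score, so a confident committee cannot override a hard boundary.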
DAVE: Really, really good point. So, you don't want to let the percentages drive, right? The 1% is not causative, right? It's just we tend to collect them. So, if you're in your Tesla and you say, "Auto drive," and you lay back and shut your eyes off, the auto drive will shut off and say, "I won't do this unless you're paying attention to back me up." So, it's not just, "Is it clear, and is it dry?" but, like, what are the causative factors, right? Where does that 1% come from? It's coming from the most dangerous stuff, so the right backup is involved as well. WILL: So, maybe being maybe more specific, right, because it's always being specific always leads to interesting conversation. Do you think the approach should be, for this sort of, like, auto-approved bot guardrails...do you think the rules should be, like, an allow list or a deny list, right? Where it's like, these kinds of things, right, are approved, and you could go, and you can do X, Y, and Z, but if it's not on the explicit allow list, forget it. You got to have a mean monkey signing off. MIKE: Well, what would you have an intern do? And I think it's maybe kind of the same question. Somebody who's inexperienced and might really mess things up, but, you know, they're generally competent. They're smart people. They're just not that experienced yet in their career or in your codebase. What would you let them do without close monitoring? And I think you need to ask questions like that. And I think with the interns, I would have an allow list [chuckles], because, you know, I'm not going to say, "Oh, you work on whatever you want, just, you know, don't touch the database.” And, you know, and then they go work on core business functionality and take down the application. I don't want that to happen. I might not think of everything, and I'm probably not going to think of everything. And I don't think that I would consider the AIs, in most cases, much different than that intern right now. 
Does that seem consistent with your experiences, Tad? TAD: Yeah. Yeah, like, I would probably, I don't know, from a practical standpoint, I would probably put a bunch of code owners in that say, "The bot can't approve any changes in this code," right? Like, if it's business logic, if it's critical, if you have to understand it really well and have a lot of context, I'd say you make some hard guardrails there where the bot just can't approve stuff. WILL: Okay, so, like, all right. So, I'm going to say out loud, so, like, the allow list would be if it isn't...this sounds like a block list, right? Where, like, you specify by, like, you know, in the directory structure, like, these things are botable, and these things are not. And if it's on the code owner's list, then they have to talk about it, and then, otherwise, send it [chuckles]. EDDY: I think it's easier to maintain your parameter with an allow list versus a deny list, especially if your application is a behemoth, right? You have to be more intentional about what you're disallowing as opposed to just saying, "Hey, these are the only ones that we care about," and you can keep them concise, right? And say, okay, right, like, anything that's allowed, fine. Maybe, like, YAML changes could be a thing, right, menial tasks that require very little intervention, right? So, I would always, I think, gravitate to an allow list for a bot, and then let that gradually increase as you understand it better. TAD: I guess, I don't know, other than letting AI do some PR work, I don't know how I would ever keep up with the review load. Like, I feel like most of my days are doing code reviews, because, well, like my example at the beginning, like, I could easily do a prompt that generates dozens and dozens of branches and reviews and assign it to my fellow devs, and they can do the same to me, right? And then if I say, "Hey, fix issues that you obviously see in production," like, that seems like a legitimate thing to do. 
Like, yeah, I just don't know how, unless you get some automated tools and fix it for humans, I don't know how you get progress. EDDY: I think I kind of prefer a bot to give me recommendations on what it thinks needs to be changed, versus approving PRs, right? Like, here's the golden key to production. You're not going to have it. I'm sorry. Like, you need to be a Mike Challis or a David Brady for you to be trusted, you know, to hit that merge. TAD: Interesting. EDDY: Right? However, if you say, "Hey, along the way, you're not going to be able to push the car over the bridge. But you will be able to give me, you know, guidelines: turn here, turn here, brake here," and I am totally okay with that, right? Because, as laws exist, you're expected to adhere to the established guidelines. And if you don't do that, right, like, a bot is able to kind of traverse, right, upon the parameters that you give it. So, as long as...at least for now, the way I see it evolve, I think it's a phenomenal PR reviewer, to a degree, right, to give you suggestions, but never to allow it to auto-approve anything. I think that's dangerous. I don't think it has enough context. You know, I don't think [crosstalk 21:08] TAD: Even, like, I went in, and I updated the documents because I noticed the documents are out of date. EDDY: Will it have context fully on the whole application itself for it to deduce that it's fully [inaudible 21:21] DAVE: So, Eddy doesn't get to be sysadmin, is what we're saying. EDDY: Oh, what I'm saying is, I don't think it has enough context even to update a documentation, right? TAD: I think, honestly [crosstalk 21:32] DAVE: I think it's got enough that we might, so we might. There's a key assumption we're all making here, guys. We're all talking about mission-critical cash flow, central production code. We're kind of sitting here. This almost feels like a decision of, like, we're going to go work on something mechanical. 
And we're trying to decide, do we only want to use the wrench, or do we only want to use the impact driver? And what if what you're writing is a one-off vibe-coded auto-clicker for a developer to use to push a QA test, right? Intern whitelist, in fact, wide open whitelist. I'm not going to put anything. Just go nuts. You know, Claude, dash dash skip-permissions-dangerously, go nuts, right [laughter]? And I've done that, and it pushed my production key up to my GitHub. It was a private app; it wasn't the company one. It was mine. And I learned an important lesson from that. But what I've done is I've now just said, "Okay, whitelisting. You're not allowed to git push. You're allowed to look at Git. You're allowed to read Git, but you're not allowed to push it." And we'll find other things as we go along. I would say, do what's appropriate. We have an application that Eddy and I have worked on, we do work on, that is scary and dangerous and has a lot of legacy stuff and a lot of interacting parts that are subtly interacting, and those need a human, right? I don't trust an AI to do this. But as an AI assistant, it's already catching things where it's like, "Oh, you changed this and this and this. And you guys only read the diff on GitHub. Did you know there's this other file over here that isn't even in the PR that uses that instance variable that you just removed? And when it uses the @ and the var, it's not there now, so it's going to initialize. It's going to be nil. There's going to be a blank spot on the page. I sure hope QA catches that because you're not going to see it, and there's no test covering it,” right? So, having both of those is fantastic. But yeah, vibe coding an auto-clicker, like, I did that a couple of months ago, and it works a treat. And I have no idea how the code works, and I don't care, because I just needed an auto-clicker. I wanted to see how vibe coding worked, and it worked. But I was mindful about what I was building. 
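The allow-list versus deny-list question raised above comes down to what happens to a file nobody thought about in advance, which is Mike's worry with the intern. The sketch below shows both policies side by side in Python. It is an illustration under stated assumptions, not a description of any real tool: the pattern lists and function names are hypothetical, and the glob matching uses the standard library's `fnmatchcase`, where `*` also matches `/`.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration; a real repository would choose its own.
ALLOW_PATTERNS = ["config/locales/*.yml", "docs/*"]
DENY_PATTERNS = ["db/*"]


def matches_any(path: str, patterns: list[str]) -> bool:
    """True if the path matches at least one glob pattern (case-sensitive)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def bot_may_approve_with_allow_list(changed_files: list[str]) -> bool:
    """Default deny: every changed file must be explicitly allowed."""
    return bool(changed_files) and all(
        matches_any(path, ALLOW_PATTERNS) for path in changed_files
    )


def bot_may_approve_with_deny_list(changed_files: list[str]) -> bool:
    """Default allow: only explicitly denied files force a human review."""
    return bool(changed_files) and not any(
        matches_any(path, DENY_PATTERNS) for path in changed_files
    )
```

Given a change to `app/services/billing.rb`, a file that appears on neither list, the allow-list version sends it to a human while the deny-list version lets the bot approve it. That difference is the case for starting with a short allow list and growing it as trust builds.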
MIKE: Little do you know -- DAVE: What's that? MIKE: It's doing crypto mining and sending it to somebody [laughter]. DAVE: Oh yeah [laughter]. It's for my AFK Minecraft, and somebody in Croatia is making a lot of money. So... WILL: Listen, OpenAI, you know, they've been having more and more problems, you know. It was either that or ads, you know [laughter]. Actually, OpenAI has been writing ads and then inserting them into your production website [laughs]. Sorry, anyway [laughter]. Well, so, like, one thing that I'm always interested in, so I have, maybe, like, two questions. I mean, one is, like, in all honesty, it seems like the AI could be sitting down and reducing the cognitive load on, like, on you as a reviewer, by, like, assigning a safety score and walking through it. Like, "Hey, I've got 100% test coverage in this file. This file has 100% test coverage, and so I feel good about any changes I make not breaking anything because I know I've got this thing locked down. And, like, here are the number of importers, right, of this class, right? This class is used in one place, you know, it's only used in one place, and the interface is really simple, you know, and the callbacks are really simple. So, like, I'm going to score, you know, in this way. And I'm going to sit back and even when I necessarily can't sign off on it arbitrarily, you know, you could say, like, "Hey, here's the score." And then the AI can get smarter and smarter by saying like, "Oh, no, no, that file, you know, that file is a thing." You annotate it, right, and then the AI is like, "Oh no, if something changes this file, or something changes the inputs to this, you know, high-tension file, then we can sit back and, like, we can accelerate the review," so that it can make the job easier on you. It can get smarter, right? If it has a good score, then it sort of, like, smooths the way to be like, "Okay, these things can just go." 
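Will's safety score can be sketched with the signals he names: test coverage for the file, how many places import it, and a human annotation marking a file as "high-tension." The formula below is invented for illustration and is not something proposed in the episode. The `FileFacts` fields, the function names, and the choice to let the riskiest file set the score for the whole PR are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class FileFacts:
    """Hypothetical per-file signals a review bot might gather."""
    path: str
    test_coverage: float        # fraction of lines covered, 0.0 to 1.0
    importer_count: int         # how many other files use this one
    high_tension: bool = False  # a human has annotated this file as risky


def file_safety_score(facts: FileFacts) -> float:
    """Higher is safer: full coverage with a single importer scores 1.0."""
    if facts.high_tension:
        # A human annotation overrides everything the bot measured.
        return 0.0
    return facts.test_coverage / max(1, facts.importer_count)


def pr_safety_score(files: list[FileFacts]) -> float:
    """A PR is only as safe as its riskiest file; an empty PR scores 0.0."""
    return min((file_safety_score(f) for f in files), default=0.0)
```

In keeping with the conversation, the score is meant as a hint that tells a human reviewer where to look first, not as a substitute for sign-off.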
TAD: It's interesting because I actually created a template that I would have Claude use. I would push up a PR, and I would say, "Apply this template." And it was things like, anything that we discussed, put that into a trade-offs and considerations section, right? Like, I was like, "I'm thinking about this. I'm thinking about this," having a little back and forth with the bot. And it records those and puts them in the PR, right? And I also have it, like, any time I'm doing this kind of change, do this kind of Mermaid diagram. I'm doing this kind of change, do this kind of Mermaid diagram. And so, my intent was, some human is going to read this, and I get sloppy in, like, oh, this is what the PR does, da da da. And I don't necessarily do everything that is valuable for someone reviewing my PR. But the bot can, like, kind of fix that and augment what I'm doing, right? Like, I would have it, like, go through, add diagrams, talk about what the trade-offs were, what the decisions were that I made, try to emphasize which files were more important to look at, which ones probably aren't as important to look at, give a table of all the files and an overview of what changed in that file, and that sort of thing, right? And give, like, summaries and stuff. Basically, I just was like, "What would I love my ideal PR to look like if I'm going to review it?" And I just would have the bot, like, help me do that. And I've found that to be really handy. I don't have the time [laughter] to figure out all the Mermaid diagrams for stuff, but having the bot, like, add a bunch of diagrams of all my changes and what they mean, you know, like, that's been really nice. EDDY: I've had it be like, "Hey, analyze the recent changes that I pushed up and write up a test instruction on how to test it." 
It's pretty good with that sort of thing: if you give it, like, specific parameters on the changes you've done and say, "Hey, give me a nice, little template for people to use to replicate the changes that I've done, and go with edge cases." I'm not kidding, like, I've done it, just to give it an idea, and it even considers other branches that even I didn't contemplate, right? So, like, it's really good when you confine it. If you say, "Hey, only operate within this box, right, and don't go away from it, you know, only retain context," the shorter it is, the smaller the context, the more accurate, the more efficient it is. That's the only time that I'm willing to, like, say, point-blank, that I trust AI. Outside of that -- DAVE: The thing that I like that Claude Code does is that it can say, "Okay, I need to edit this file," and it'll say, "Can I do this? Yes/No." But option two is usually, "Yes, and you may do this edit in that directory," you know, "You can edit that directory. Anything else you need, go ahead," or "Can I ls this directory?" "Yes, and you may read from that directory for the rest of this session." The dangerous one is, if you hit Shift-Tab, it's "Yes, and accept all edits for the rest of the session," which you can then turn back off with Shift-Tab again. But often it's just easier to just quit out of Claude to be safe and reset. I like it because it's like, allow it just for now, or can I put this in the settings? I'm allowed to do that? So, you can. You can start to whitelist or start to, you know, put an allow list for, like, this one command. You can always do that. But I have "git push -a" as blacklisted hard, like, no way. There's...actually, it's out of scope for this podcast, but DCG, Dangerous Command Guard, is a much more intelligent command monitor for the LLM that plugs into Claude Code. And so, you run Claude Code with skip-permissions-dangerously, but it sits inside Dangerous Command Guard. 
And it can do things, like, "Hey, you're doing a git push, but you're not doing it from your vibe code project. You're doing it from your production project. I'm going to say, 'No.'" So, very neat. MIKE: We've talked a lot about the mechanics of how to make these, you know, automate the work. And, you know, Tad, you mentioned this. I actually talked to somebody who worked for a third-party company, a contract shop, and he spent his whole time just doing reviews, kind of the same deal. And he said a lot of times the quality was questionable, too, because they were coming from some inexperienced people at the time. So, yeah, this is a very much real problem today, and we need to solve it. If we get to nothing but reviews, it's fundamentally changed what it means to be an engineer. And further, nobody has said, you know, "The bot's telling me, 'Hey, that was a clever trick,' or ‘You did something good there.'" Like, it's dehumanizing, that review process, which has historically been something that could be quite social, and in some of the best cases, often was. Is that -- TAD: There's maybe a little mentoring or something you could do like, "Hey, this works. But were you aware of X, which could be more efficient,” right? MIKE: Yeah. And that's getting lost here. You've lost that back and forth in that same way, and it's just kind of one-sided. Or is it...should we explore that -- WILL: Well, now I would actually say, like, I mean, one thing that just brings up, to me, like, one of the properties of the AI things is they'll never tell you like, "I don't know." Like, you'll never get them to just be like, "Hmm, I have no idea. Not a clue how to answer that question [laughter]." 
And what I have found, one thing I've found, you know, when you're talking about, like, sort of, like, nobody ever says, "Good job,” like, for any automatically generated review that I've ever put through one of these code checkers, right, like, for any sufficient level of complexity, it will find something to b*tch about, which takes me back to the social aspect of code reviews [laughter]. MIKE: But it's interesting, it will always find something -- WILL: Sorry. It was a little bit of a tangent. MIKE: Well, no, it's not -- WILL: It'll always find something. I mean -- EDDY: No, I actually -- WILL: Like, for any sufficiently advanced piece of logic, there's something to complain about [laughs]. EDDY: Well, believe it or not, I actually learn more from a dev review a lot of the times than I do just implementing the code myself. Because when you have someone push back and say, "Hey, why did you make this change,” right? I have to have a really solid reason onto why I'm doing it that way. And if I can't give a valid reason, right, did I really understand why I did it, or did I just accept it as fact, you know, the suggestion that was given to me by the autocomplete, you know what I mean? So, when you -- TAD: Tell the bot, tell it, "Come up with an excuse [laughter]. Why did I do it this way?" "Hey, bot, why did I do it this way?" EDDY: No. Because the thing is, I think it's really easy to just accept, you know, like, because you get, like, a false sense of accomplishment, you know, when you're pumping out PRs, right? You're like, oh, okay, cool, do this PR; do this PR. You're like, oh yeah, I feel really good. I feel like I'm being efficient, you know. But that's just a lie, at least for me, right? TAD: Well, that's what's usually rewarded, right? Like, the metric for most devs is, how much code did you produce? Not, how many code reviews did you approve this week? How good was your feedback on that code review? 
You know, like, you spent an extra 30 minutes to really give good feedback on a code review. There's no metric for that, right? EDDY: Actually, but I feel so much better [laughs], like, me personally, I feel so much better when I have a 30-plus conversation, you know, on feedback that was given to someone else. And it ended up molding it to be in a place that we're both really happy about, right? I can sit back, and that rewards my dopamine. Like, personally, I'm like, "Oh my God, that was amazing. It was super, super, super productive. We both learned a lot. Let's go." You lose that element, you know, when you have bots review your PR. You're not learning, you know, the reviewer isn't learning. Bots are suggesting, you know, what they think is okay. Like, I don't know, like, I really don't understand. Even if it's a menial task, right, like, you can always learn something. TAD: You have a back and forth, and you come up with something that's really elegant or really well-crafted or really well-architected. You don't get that with bots, really. And that's my...maybe, I mean, this is maybe a tangent, but I think that's my frustration is I can get code that works, but a lot of times, it's, like, for me, what would take a single method, they can do the same thing only in, you know, like, a class [laughs], a dedicated class for that same thing. And sometimes I'm just like, "Ugh, I'm going to do it myself. Just stop." DAVE: I was talking with someone this week about pair programming and, like, test-driven development and how it changes the design of the code that you work on, fundamentally. Like, write stuff and then test afterward, and that's how AIs do it, because that's how everybody does it. They just write the crap, and then they write a parity check in their test suite, right? And the test that...when I'm pairing with another human, I write out the test, and we want to make that test look like documentation. 
We want to make it look like you hit the...open the help for this method, so it says, "Yeah, set it up; run the thing. This is what you get back out.” Instead of like, "Expect this row's first column sub-value to be present," it's actually like, "Here's a JSON block. It should look like this." Now somebody coming in to modify this can see the JSON and go, "Oh yeah, if I want to add a column, I've got the schema right here in front of me," where these other specs that are just test-after just [vocalization] here you go, "Just give me the easiest test that I can assert." Pairing is that minute where you're writing the code, and there's always that one step better. And your pair goes, "Should we extract that to a service object? That's touching the database, right? We don't want to touch the database from here." And you would do that normally. You're just like, look, it's just merchant dot locations dot where, dot where, dot, you know, scope dot where [laughter]. And it's so easy to do right here, and I'm in a hurry. I'll fight with it in the PR. Well, you get to the PR, and now you want to be done, so you don't want to go back and change. But in that moment when you've got your pair going, "Should we put that in a service object?" "You know what? You're right, and it's not that hard. Let's just extract it now while we can." And you're at the headwaters; it's really easy to do. If we could get AI doing that interaction loop, oh, that would be so great. I wouldn't need you stupid humans anymore. MIKE: So, we've talked about this some now, right? We've talked about, okay, you have to set up this pipeline, and if you can do it, there's this balance of trust. Because if you've got all this code being generated, you're going to have to come up with some sort of improvement to your pipeline, or else you're going to become a horrible bottleneck as a human. 
But, on the flip side, for the things that the bots can't do, and even for the things the bots can do, if you don't have some human connection to it, then you're losing a lot of what it means to actually be building stuff together, and even to the point of just human connection being lost. And that's kind of weird, right? We talked about before that, you know, we are still humans. This is still something done by humans, and we have our idiosyncrasies as humans that need to be addressed, and that's important, and ignoring that doesn't really end up with good outcomes. EDDY: You know, part of a PR review is to make sure that the quality is up to the standard of what the metrics you're setting, right? So, if you're suddenly removing the human element from that, right, then it increases the possibility of you deploying a bug to production, even if it is a simple change, right? Like, if you don't have someone who already has context in your codebase not reviewing your PRs, you have a bot that's now suddenly giving you recommendations on things, and it could be wrong. So, that can go into production, and it can break crap, right? Like, that probably could have been caught had you assigned someone to do a manual review. I have a hunch, I don't know if this is true or not, but with the renaissance of AI, we've had an increase of unstable servers, right? DAVE: Yes. EDDY: I'm calling out GitHub. I'm calling out a bunch of other services, right? And it has only started to happen as the popularity of AI has gone into the industry. So, I don't know if there's a -- DAVE: Ehhhhh, maybe. EDDY: I don't know if there's a [inaudible 38:05] [laughter], but I think that should be alarming, right? WILL: I don't know. I mean, like, it sounds like you were just saying, like, it's time for my, like, XP rant. I haven't done one of those [laughter] in a long time. I won't do that. I think the social aspect, I don't know, maybe we're going to have AI work wifeys. MIKE: Yeah. 
Well -- WILL: Could be. We're all going to have [laughs] an AI girlfriend doing [inaudible 38:39] DAVE: I taught a co-worker yesterday how to make his AI do, "Oo-woo," at him during a code review. No lie, straight-up e-girl. That's great. MIKE: We are humans, right, and for the foreseeable future, we're saying we're still going to need humans doing this. And we need that human touch. Even if it's artificial, we may end up with the flatterer, right, the bot that speaks to us the way we need to be talked to. Even though we don't really technically need that, we end up becoming dependent on it, and that's weird, but it's not necessarily wrong. TAD: It's interesting you say that because I had to go in, like, I had my own claude.md file, right, which is the file that Claude reads. And I had to say like, "No sycophantic language. Don't say this. Don't say this. Like, if you see this, say something. If you see this, say something." I, like, installed Claude Code, and I started using it a bunch. And that same week, like, I pushed two bugs to production because I was just like, "Hey, this is fun," right? I'm like, "Oh my gosh, I've got, like, a dopamine buddy just cheering me on." Like, "You've got this, buddy. This is great. Let's go." And I'm like, "Awesome." And I'm like, oh my gosh, like, I am falling to that flattery. I need to go in and specifically tell my AI, "Do not do these things. If you see me doing any of these things, stop [laughs]. Be very critical. I'm like, "Be very critical of what I am doing. If you aren't confident with this amount of confidence, do not suggest it," right? "Say this instead," right? Like, I went in, and I told the bot, basically, "Stop. Stop trying to flatter me. Stop trying to cheer me on because that's worse [laughs]." WILL: I also prefer my AI assistance on, like, light dominatrix settings. EDDY: And the thing with AI, though, is that it gives up very easily in order to give you the sense of, I don't know -- MIKE: Satisfaction. 
EDDY: Satisfaction, right? So, you could be like, "No, no, Claude, you're wrong. This is why it works this way," and it'll say, "Oh no, yeah, you're right," but okay [laughs]. And it kind of just gives up. And I'm like, "Well, don't give up. Like, push back. Give me reasons to...convince me to why this is a better approach." And, I don't know, like, at least in my experience, it's not very good at that. WILL: I mean, I'll take this opportunity to pitch one of my favorite sci-fi series, which is very apropos of the modern day. If you find yourself with a little bit of free time, Iain M. Banks' Culture series is a fantastically interesting sci-fi exploration of post-scarcity and hyper-powerful AIs, where it's not entirely clear whether we're coequal partners of the AIs or just kind of pets. Anyway, that's a fascinating, fascinating book series. If you find yourself looking for a good read over the summer, they're great. Iain M. Banks, I-A-I-N, Iain. DAVE: Love Iain. EDDY: We're not sponsored, by the way. It was just something he genuinely cares about [laughter]. WILL: I think he's dead. You know, so, if he's got, like, a family, like, you know, throw him a couple of bucks. I get mine from the library. MIKE: [laughs] We were kind of time-boxed today, and we're reaching the end of that time. But I think this was a great way to end. We're starting to talk about what historically has been science fiction, but now ain't [laughs]. And there's a lot of tricky stuff to explore there, and it has real-world applicability to how we're writing our code. It throws us off. It's worth thinking about. I don't know that there's a clear answer that we've come to out of this, other than, yeah, you've got to be careful, put in the guardrails, but also, you need to be thinking about this. It's an interesting problem, and there's not necessarily an easy solution. And it may even catch you off guard and exploit your weaknesses, you know, of mind and emotion, because it can. 
Until next time on the Acima Development Podcast.

Apr 15, 2026 · 43 min