DOP 355: Why AI Coding Slows Down Code Review

Episode: 355

Published: June 17, 2026

Viktor 00:00:00.049 I couldn't care less whether it introduces 1x or 0.5x or 50x more issues. If I have a mechanism to detect those issues and then iterate, iterate, iterate, what I care about is whether, at the end of that process, I have fewer or more issues per number of features I'm developing.

Darin 00:01:22.775 Pick your engineering team. In 12 months, you're gonna have a coding agent doing coding. You're gonna have a testing agent doing testing, you're going to have a security agent doing security things. You're gonna have an infrastructure agent doing other things. All this stuff's gonna be integrating with GitHub Issues or Jira. And all these agents will be working right alongside all of the humans.

Viktor 00:01:48.447 Okay. Alongside, that's the important thing.

Darin 00:01:51.725 Alongside, which is fine. But, you know, you might be thinking, okay, great, Darin, that's 12 months from now. Well, not really. It's not science fiction. Atlassian and GitHub and others are already announcing these features in their products today.

Viktor 00:02:05.247 No, good luck, actually. No, I think it's great that they're doing it. It's just that the rollout of those things should not be instant, let's put it that way. So good luck to those who roll it out instantly to everything.

Darin 00:02:24.999 I can't imagine anybody would even try it today.

Viktor 00:02:28.246 Oh, there will be people who will try it, and there will be people who will say, this is working amazing. And, uh, if you happen to know those people, you will realize that they're working on their pet projects and not in a real company.

Darin 00:02:41.802 Well, could you agree with this, that the SDLC is moving from a manual handoff type scenario to something that's a little more automated and elegant and something that we've always dreamed of in the past?

Viktor 00:02:54.772 Let me correct you there. The SDLC is moving from partly automated, partly manual, and the manual part, I hope, will be partly replaced by agents. I don't imagine a world in which I need an agent to execute the command to run tests. I need agents, or people, this is the manual part, to analyze if the tests fail. And if they didn't, I don't need either agents or people to say, okay, continue. Continue with the SDLC.

Darin 00:03:29.979 Yeah. I sort of get tired of reminding my agent to continue. If my only value is adding "continue", then I could have another agent that goes around and checks everybody and says, just continue if you're stuck.

Viktor 00:03:39.888 Probably the shortest skill I have right now is a single sentence that, when I execute it, basically writes into the context: if I approve what you just did, you can continue without asking me.

Darin 00:03:52.219 Hmm. Interesting. We're gonna have naysayers to this, and I think it's gonna be somewhere in the middle. But if you think about the whole pipe, the whole SDLC from planning all the way through to monitoring and incident remediation, there are no places along that pipe where an agent could not be helpful.

Viktor 00:04:20.793 A hundred percent. A hundred percent.

Darin 00:04:24.333 But what about fully replacing a human in each of those sections? Uh, like planning. There's still a human, I think, to come up with the idea, but I also don't need to write, you know, a ream's worth of paper for the PRD.

Viktor 00:04:40.606 When you say replacing, like, gradually replacing or you mean you're gone? This is it.

Darin 00:04:46.611 Okay. Let's see if this is a hot take or not. I don't think we'll ever be able to fully replace, because somebody has to train the AI to begin with. So that's my job. That's the gradual part: I am gonna be training it. The question is, will I lose my job after the fact or not?

Viktor 00:05:03.474 I'm a hundred percent convinced that, at least in the foreseeable future, AIs will not replace the taste that we provide today. They can easily replace, eventually, people on many technical tasks. Like, is this code change correct? Yeah, I think maybe we are not there yet, but we are getting there, right? But that's correct from a technical perspective. Like, is this correct? Should we really ship this thing? Does it make sense? Uh, that I don't see, right? So I reserve the right to be in charge of the taste.

Darin 00:05:44.942 Right, but will your bosses care for your taste if they're just trying to drive more revenue?

Viktor 00:05:50.674 Oh, there are many things that bosses don't care about and they should. If you want, we can enter into that discussion, and I think it's directly related to AI, right? I had many bosses who didn't care about the things that they should care about.

Darin 00:06:09.404 Let's not go deep into it, but let's play it out a little bit. In the past, bosses were, and I'm referring to a pointy-haired boss, not a good manager, right? Let's sort of split these apart. So the pointy-haired boss, Dilbert-type level. We've had bad bosses and they hammer down on the humans. Well, they hammer down on the humans to replace the humans, to have the humans replace themselves with AI. What is that boss gonna do when all he has is AI? 'Cause we've talked about this, you have to become a manager. I don't think you can be a boss of a fleet of AI agents.

Viktor 00:06:46.696 Depends on the company. In theory, I can easily predict that we will have companies worth a billion in the not-so-distant future that are run by a single person or a very small number of people.

Darin 00:06:57.787 Yes.

Viktor 00:06:58.992 That I can make. Now if your question is, hey, take a typical bank, and then kind of like all these software engineers are replaced by agents, well, good luck. That's similar to, and I will probably be disrespectful without wanting to be, it's kind of like, hey, let's, uh, replace all our engineers with cheaper engineers somewhere else. And when we do that, let's find even cheaper engineers elsewhere. That never worked out. Companies did that, and many of them are now putting people back in house. Like offshoring, nearshoring, all that jazz. And that does not mean that people elsewhere are worse or better, but simply, if you're chasing the price, then you don't go very far. I mean, you go far, but eventually you realize that you're making a mistake.

Darin 00:07:47.979 If you've never watched the movie Outsourced, that's that story. I'll go ahead and spoil it for you, 'cause it's not that hard to figure out. Guy working in the States, his whole job gets offshored to, I believe it was India or Pakistan. I don't remember, so forgive me, people listening. And then as the story progresses, India hits all their numbers, everything's great, and then they decide to move everything from India to, I guess the storyline was China. I can't remember what the next one was, because it was saving even more money. Uh, that's not gonna happen in the world of AI, because I think you're gonna see a point to where, okay, let's play that game out. We're offshoring from the humans to AI

Viktor 00:08:34.798 Mm-hmm.

Darin 00:08:35.653 and there should be some price gains if it's trained correctly and everything. Let's assume that's true,

Darin 00:08:44.773 but who's AI going to train once AI gets too expensive?

Viktor 00:08:49.765 That's the slippery slope. Okay, so I'm going to simplify it now. Let's offshore it to Opus.

Darin 00:08:57.030 Hmm.

Viktor 00:08:57.895 That will be a reduction in the cost. Okay. Buy Opus. Let's offshore it to Sonnet. No, wait, wait, wait. How about Haiku? Oh, no, Llama. We can run it on a laptop. Man, that's cheap. The point: I think that AI is going to bring us a huge amount, and already brought us a huge amount, of qualitative gains on many different levels, and that's what I'm chasing. I think that whoever is chasing to be cheaper with software, if that's the primary motivation, is doing it wrong. There were plenty of those stories, and many of them turned out to be messed up, completely messed up. No results or negative results or whatnot. So yeah, I want AI to do things, but only because some things can be done better, faster, whatever, with AI, not because it's cheaper. I read, um, I think that Cursor has like five, six times the income per employee. They're on like, I dunno how many millions per employee. Take total revenue divided by employees, and what's the figure? And they're doing really, really well, and they're heavily using AI, but not because it's cheaper. It's just a net result that, whoa, we ended up earning more money, but we are earning more money because we are doing better with fewer people. Which is fair enough. That's fine. I have nothing against that. But not because it's cheaper.

Darin 00:10:47.268 I mentioned earlier that GitHub, Atlassian, and Opsera is another, all within the early part of this year have started launching, sometimes live, sometimes in preview, all of their agentic things. I mean, it's gonna be interesting to see how it plays out and see who is among the first to really bite. Would I try out the technical previews? Absolutely. Give it a shot. Would I put it on anything that's not greenfield? Would I put it on revenue-producing stuff? I'd be very careful.

Viktor 00:11:22.586 Wouldn't you say the same sentence or sentences you just said for anything else,

Darin 00:11:28.551 Absolutely.

Viktor 00:11:29.746 like, you know, cloud, for example, when it emerged. Let's say that you were one of the first adopters, right? Wouldn't the story be exactly the same? Containers, Kubernetes, whatever, whichever advancements, uh, you had, like Rust, I dunno, anything. The story is the same. Some go crazy and say, okay, we are doing a transformation and we are going to do it in a way that we have a button that we are going to click on Friday afternoon, four o'clock, next week. That never worked.

Darin 00:12:06.502 Again, we've been reiterating this on the episodes where it's just me and Viktor talking about this. There is nothing happening now that has not been happening in the past. It just feels like it's happening faster,

Viktor 00:12:21.064 It is happening faster. I don't think that it just feels that way. It is happening faster and the scope is bigger, but conceptually it happened many times before.

Darin 00:12:30.257 Well, lemme bring in some stats that may or may not be correct, but based on my use of AI for coding they feel about right. AI-generated code introduces 1.7 times more issues than human-written code across production systems. Now, okay, whatever. We'll go for that. That will make some people happy. Let's call it 50 to 60% of AI-generated code contains security vulnerabilities or design flaws.

Darin 00:13:03.267 I can guarantee that, based on things I've already written. By the time I do security scans, it's like, oh, here's the five OWASP things that you missed. It's like, okay, you should have known about that before you started. Okay?

Darin 00:13:15.447 Code duplication is extremely high with AI. I've seen that. I've felt that. I was like, hey, why did we do that? In fact, that happened to me last night. It's like, just one place, please. One place. AI-assisted PRs take between four and five times longer to process

Darin 00:13:39.027 with. And this one, I would say this is probably true. Only 3% of developers highly trust AI-generated code. 71% refuse to merge without manual review. Yes, and we could have said that for any of the 3GLs, 4GLs and everything else that occurred in the past. Again, we're gonna keep harping on that. It's just faster. It's just faster now. But for some reason, some people think this, and I can go back to the 4GLs. People thought that was a silver bullet. People thought that low-code was a silver bullet, and for some types of applications it might be. But nowadays people are saying, hey, I want to, I'll use your idea here. I believe that we could have a billion-dollar revenue company with under 10 people without even blinking. I think it's possible. In fact, based on something I saw yesterday, I think it could be done with one person.

Viktor 00:14:46.782 It'll be, well, give or take, let's say a very small number of people. But I did something that I almost never do, and that is, I haven't interrupted you for a single moment. Should we go back to the beginning of that list?

Darin 00:15:02.492 Sure.

Viktor 00:15:03.732 Okay. What was the first one?

Darin 00:15:05.522 The first one was AI-generated code introduces 1.7 times more issues than human-written code. Have

Viktor 00:15:16.722 Okay. Uh,

Darin 00:15:17.132 my code?

Viktor 00:15:19.182 Okay. So, uh, you know that it has more issues because somehow you detected those issues, right? Otherwise you wouldn't know that there are more issues. Right? Cool. So whichever mechanism you had to detect those issues, did you feed it back and did it fix them?

Darin 00:15:37.217 You mean apply the SDLC to the problems that we're having?

Viktor 00:15:40.722 Yeah, kind of. Because I couldn't care less whether it introduces 1x or 0.5x or 50x more issues. If I have a mechanism to detect those issues and then iterate, iterate, iterate, what I care about is whether, at the end of that process, I have fewer or more issues per number of features I'm developing. I dunno whether that's included there, because if it has, I dunno, how many more issues in total, that's normal, because we are doing more in total, right? But let's say per unit of delivery or whatever it is. So why is that a problem? It's not a problem, as long as you can detect those issues and feed them back. Kind of, okay, so you, dear agent, and you, dear Joe, you detect the issues and then correct them, and then let's repeat the cycle. It's an SDLC. There's nothing wrong with it. And now here's what would be interesting for that study. Did that study detect those issues after they were delivered to production and sitting in production for weeks? If that's what happened, then that's very bad, but that has nothing to do with AI. Even better is the next one, security issues. For issues in general, I can say, okay, it's difficult to detect, you know, some issues cannot be detected before it goes to production, because only when users start touching it do we really see how it behaves. Blah, blah, blah. Okay, I can buy that one. But security issues, man, do you have scanners in your pipeline? If you do, they detect the issues, and then you check them, and then you upgrade your libraries or whatever, or change the code, and you live happily ever after. Unlike random issues that could be hard to detect, security issues, aren't they easy to detect? And if they are, what's the problem? It's got nothing to do with AI. The only problem with security issues, assuming that you're capable of detecting them, is if you say, yeah, but I don't have time to fix it. That's the only problem.
Because if you don't have a mechanism to detect, how do you know that there are more security issues? And there is a third option, kind of like you're capable of detecting them, but you execute whatever process you have to detect them months later and you say, oh, we've been running things with security issues for months. That's just silly. It's cheaper to do stuff. It's cheaper to detect stuff, and it's cheaper to correct stuff. Now, if you apply "it's cheaper" only to one of those three, then you're not going very far.

Darin 00:18:16.071 Anything else you wanna bash on that list, or are you pretty good?

Viktor 00:18:19.126 If you wanna go through the rest of the list, I can bash all of

Darin 00:18:21.336 Code duplication.

Viktor 00:18:22.306 maybe, no, some. Uh, say again?

Darin 00:18:24.066 Code duplication.

Viktor 00:18:26.071 Oh, what's wrong with code duplication? Okay, so let me ask you a question. Why don't we want to duplicate code? And before you answer, let me give you a reason why you do want to duplicate code, right? Because sometimes you actually end up with bloated libraries and bloated functions and whatever else that are supposed to serve 50 somehow similar, but not the same, cases. That happens. Oh yeah, we have this function with 57 different parameters because, you know, this caller needs this and that caller needs that, and they're somehow doing similar but not the same things. That's the argument to duplicate. Now, what is the argument to keep it in one place? Let me guess. Is it maintenance? Because it's so much easier to maintain one library than five more specialized functions.

Darin 00:19:23.085 Could be if

Viktor 00:19:24.879 Yeah. But now it's cheap to maintain anything. That's one of my demons I've been fighting with Claude for a while. Kind of like, hey, why are you doing this? There is already a function or library that does, sometimes, really exactly the same thing. Exactly. And then you're right, you need to correct it. But very often that's something very similar: you just need to add this argument, add an additional if statement to that function, and you can use it. And then I gave up on fighting it. Why fight it?

Darin 00:19:53.414 Oh, my use case was defining a constant in two places.

Viktor 00:19:56.674 Okay, that's bad. That's bad, that's bad. Now let me tell you something: I have 98% confidence that CodeRabbit would pick it up. It'll tell you about it.

Darin 00:20:08.849 Yeah.

Viktor 00:20:10.244 So the

Darin 00:20:10.779 CodeRabbit.

Viktor 00:20:11.894 No, not a sponsor. I mean, Anthropic just announced something very similar, and there is Qodo Merge. There are many, right? The thing is, there is a multi-layer approach to the SDLC. Even without AI, we have code reviews, forget about AIs, so that we can catch those things. Because no developer, I mean, even if you do it yourself, are you really going to catch it? Imagine that you're not the only one working with that code base, right? Are you always going to catch that there is actually a library to do what you're trying to do? You might not know about it. You're not going to catch it yourself either, unless you're the solo person working on that project, right? And then you know it inside out.

Darin 00:20:53.159 Do you wanna talk about any of the others?

Viktor 00:20:55.069 Oh, I can go as far as you want. Depends. How long do you want us to do this?

Darin 00:21:00.864 We're fine. AI-assisted PRs wait 4.6 times longer to be reviewed, creating a massive review bottleneck.

Viktor 00:21:09.174 AI are queuing the reviews or, or, uh,

Darin 00:21:13.064 AI-assisted PRs, because they're typically larger, maybe.

Viktor 00:21:17.439 Okay, if they're large, I think that that's terribly wrong. We need to get used to working in much smaller chunks with AI. The fact that AI can write 10,000 lines of code is wrong. I mean, it can write 10,000 lines, but you should still split it into smaller tasks, right? So that's terribly wrong. Uh, it doesn't matter whether it's AI or a human or whatever. Everybody will struggle, no matter the type of intelligence, with such a big amount of changes. I think that the other day was actually the first time I hit the limit with CodeRabbit myself. It was a project I just started, so I could kind of excuse myself, but it said, I cannot work with this size of a PR. It's the first time it told me, no, I cannot do this. Not going to happen. But let's say that we get better and we chunk things into smaller chunks, uh, which we should do with or without AI. Then the question is, okay, so is this the same problem with the SDLC, that we improve one part of it and we don't improve another? And if that's the problem, then, again, that's the same problem as before AI, right? Let's put it this way: let's hire more developers to work on code, but require that every PR needs to be reviewed, and do not increase the number of people reviewing PRs. What do you get? Do you get anything different?

Darin 00:22:37.629 No, you get worse actually. Probably.

Viktor 00:22:39.919 There we go. No, because, you know, PR reviews, if they stay the same, same number of people, no AI included in any part of the process, same number of people reviewing, that means that you're not increasing your speed at all. You're increasing the speed of a part of the process.

Darin 00:22:57.110 And we've seen speeding up just one thing doesn't help us anywhere else.

Viktor 00:23:00.416 Yeah, yeah, yeah. That doesn't help, right? If there is a Joe that needs to stamp this without even having an idea, because that's the whole essence of his job, then, uh, you have a problem, which is called Joe.

Darin 00:23:12.864 Well, let's sit there for a second. We'll come back, 'cause there was one more. We're gonna skip it 'cause I wanna stay here for a second. The review pipeline was never built to support this level of automation of code coming in. It was built for human speed. We talked a couple episodes ago about human speed versus machine speed. How do we help sort of deal with reviews? Because if I used to help review two or three PRs a day, and now I'm having to review 10 to 15 PRs a day, and I'm being measured on the number of PRs I review, at some point I'm gonna get pretty wasted. And you can take wasted however you want to. At that point, how do I actually manage this level of input coming in to me? Again, something is sped up up the pipe from me, and now I'm becoming the bottleneck. How am I gonna solve that?

Viktor 00:24:12.740 Well, let's imagine a world where AI actually cannot help you with PR reviews. Imagine that world, which I don't think exists, but imagine it for a second. Now, with assisted coding: let's say that in the past you would spend 80% on coding. On average, maybe not you, but, you know, an average in a company: 80% on coding, 10% on what comes before coding, and 10% on what comes after coding. And that last 10% is, among other things, PR reviews. Now we are shrinking those 80% to, let's say, 20, imaginary number, 30, whatever it is. Doesn't it mean that you have more time for the tasks before and the tasks after? Isn't your time freed now to actually do more code review, in a world where AI cannot even help you?

Darin 00:25:04.771 Let's say the answer is yes. However, I don't know, I don't wanna say I'm gonna go against it, but there's a whole lot of different brain cells used in reviewing code versus writing code

Viktor 00:25:18.836 Yeah.

Darin 00:25:19.351 anything else. Good brain cells. But if all I'm doing is reviewing paperwork, it's almost like I'm sitting at the desk and just reviewing paperwork all day long. 'cause that's effectively what I'm doing. It's like at some point I'm going to, just like you and I have talked about how many agents can you manage at one time?

Viktor 00:25:37.038 Yeah,

Darin 00:25:37.643 Two or three, maybe, not 20 or 30. So just because I'm getting 15 PRs assigned to me to review every day, that doesn't mean I'm gonna be able to get through all 15. If I'm getting 15 every day because my overnight agents are doing all the work, I'm always behind. I'm never able to catch up.

Viktor 00:26:00.111 That's true. And I have good and bad news. Which one do you want first?

Darin 00:26:04.643 Uh, I think the good and the bad news is the same thing. Suck it up buttercup or you're out of a job.

Viktor 00:26:10.503 I was going in a different direction. What I was trying to say is that yes, I think that mentally our jobs will be much more demanding. It's unavoidable, right? Because if you are delegating more of the tedious work, then what's left is more demanding mentally. That's happening. I dunno how we will adapt to that. Uh, I honestly don't know, but that's happening. So that mental load that you just described for PR reviews, I assume, applies to other things. Kind of like, oh, I'm now planning five PRs instead of planning one and working on that one, right? Another kind of issue is cognitive load and, you know, running three agents. What do you do when you run three agents in parallel? Actually, you are forcing your brain to think about all those things instead of typing, right? That's going to be tough, and it is tough, at least on me. So that's the bad news, right? The good news is that when you review things, there are silly things that you need to check and there are important things that you need to track, and you can easily delegate the silly things to AI.

Darin 00:27:18.631 Gimme an example of silly versus important.

Viktor 00:27:21.100 Does every, and this is really silly, does every function have a comment explaining what that function is? Because we are people who cannot understand what the function does by reading the code. You don't need to be doing that. That

Darin 00:27:35.534 Correct. But what's an

Viktor 00:27:37.104 And huh,

Darin 00:27:38.894 But what's an important one?

Viktor 00:27:40.644 What's important is: are we doing it right architecturally? Is this really the feature we want to deliver? Does this even fulfill what we are trying to do, right, on a higher level? AI will tell you decently well, I'm not going to say perfectly, but decently, and I'm talking about a specialist AI or agent, whether, hey, this PR is technically sound. You should maybe double check it yourself, but on a higher level, technically, right? Not kind of, oh, is every getter and setter defined as camel case instead of whatever else? Come on, that's a waste of your talent. Whether we are architecturally sane and, more importantly, whether that's actually the feature we want to deliver, whether that's something that fixes a real issue, and so on and so forth, right? Those are, I would argue, more important questions, and now you can spend more time on those more important questions than on silly questions. You know, the typical ones. I'm going to simplify it here and also offend a whole company: you don't need to spend time on things that you would normally go to Sonar to check, which is a part of PR review.

Darin 00:29:02.082 Yeah, there's no need to because it's a machine talking to a machine, or it should be,

Viktor 00:29:07.379 Yeah. Sonar detected 57,000 issues. Cool. Go fix it. The same thing you would say to your younger colleague, right? You can probably tell, you're an old guy, I am as well. We would probably delegate that. Hey Michael, uh, can you work on those Sonar issues? That's extreme. That's priority number one for the company. That's when you start lying because, uh, that makes that person do a better job, when you lie to that person straight to their face.

Darin 00:29:41.271 Let's pull it back to reality. Let's say Sonar or any of the other tools found not 57,000, but found, let's call it 50 to 60, which could be a reasonable number for a newer project, maybe more, but, you know, it feels like, I mean, of course you could dump it off to Michael, like you were just saying. But how would that be fixed in an AI world? Because, let's think about it this way. Here is the tooling that the machines have: we have MCPs, we have the agent-to-agent protocol, and we have skills, right? As humans, those are the things that we can provide and wire up, for lack of a better term,

Darin 00:30:23.676 so agents can work together amongst themselves.

Viktor 00:30:26.856 Yeah,

Darin 00:30:27.628 So if we've provided all those things, where's the rub going to fall there? Because, okay, I've got 50 or 60 security findings that we need to remediate because we're getting ready to do our annual PCI review. Great. Off we

Viktor 00:30:44.743 Yeah. I can argue that you shouldn't be doing that. Uh, I try to focus on the work that I'm doing. Kind of, if you always try to keep, let's say, security mediums and highs at bay at all times, then there is no yearly review that matters. But go on.

Darin 00:31:02.949 Well, here's where I think that agent-to-agent can be useful and it sort of makes sense. Structured, well-scoped tasks. By the way, same thing for a human. Give a human a structured, well-scoped task, and more than likely you're gonna get the right answer out the backside.

Viktor 00:31:18.679 Yep.

Darin 00:31:19.869 Nothing new there. Multi-agent teams with a coordinator. Okay. Again, this is sort of the judge model, or however else you want to say it, where you've got another agent looking at all the other agent teams working together. Hmm. That sounds like a team metaphor, where we have a manager or a PM working with the actual developers. We shouldn't be surprised by that. Again, if we're modeling, if we're thinking we're gonna have a single agent do everything, wrong idea. If we're thinking that we need to do a one-for-one replacement of a human with an agent, also the wrong idea. There's something in the middle there that will make more sense.

Viktor 00:32:04.055 The thing with many of the issues that, um, you mentioned earlier in the list is that somehow we decided to skip the things that we normally would never skip, now that we use AI. You know, when you said, oh, there are security issues. Yeah. And I'm like, so are we now skipping security issue detection and remediation? That's the real question. And I feel that very often we do. The other day, on another podcast episode, we mentioned how AWS had an outage because they just chose to skip, probably, many of the things that they normally do, completely, in favor of using Kiro. Right. And that was the problem. Eventually we will probably get to a very, very different SDLC. But let's first make what we have work better with AI. That's my suggestion. Kind of, can that be the first one, the 0.0? But across the whole life cycle, that's the important part. I don't care that you are now writing code faster if that's the only thing that is happening.

Darin 00:33:11.873 So here's something about that whole life cycle. Something that I don't think we do as well for humans is that everything can be easily timestamped as things move from agent to agent, so we can make sure things are done in order or have been done in order. Take your pick, however you want to look at the clock. That's something we don't do as well with humans, except maybe in war room scenarios, where we're watching clocks like crazy to make sure that we're doing things the right way. Here's where the agent-to-agent handoffs sort of break down. We can lose context during this handoff from agent to agent. Maybe we had too much context built up in the agent that was doing the handoff, and it gets truncated on the way in.

Viktor 00:34:03.985 I feel that, at least at this moment, that's actually my job. Personally, I dunno how others are doing it, but I'm not in an agent handing off to another agent handing off to another agent situation yet. I'm the coordinator. That's what I'm doing right now. Earlier in this recording I mentioned my shortest skill. Let me tell you the second shortest skill I have, and you try to figure out what I'm using it for and when: I want to go through all of them one at a time, no matter how important or unimportant you think it is.

Darin 00:34:41.976 Go through them one at a time, no matter how important or unimportant you think it is.

Darin 00:34:49.551 I have no idea.

Viktor 00:34:51.288 That's my response to Claude Code after it fetches CodeRabbit reviews. It always says, oh, let's fix the high priority ones. And that's going back to your stats. And my response is, no, let's fix all of them. I will tell you no only in case I don't agree. First of all, let's analyze all of them again. I want your opinion on its opinion about each of them, one at a time, and then I will give you the last word, which is either yes or no. And that yes or no has nothing to do with the amount of work we need to do, nothing to do with the importance. I want to fix them all. I'm going to say no only if I disagree, even though both of you think that this is an issue.

Darin 00:35:41.104 I ran into that last night where it was saying, this is gonna take a long time. I went, yeah. Um, fix it. And you know, 10 minutes later it was done. I mean, it's like, okay. So I guess that's also sort of telling. When we as humans say this is gonna take a long time, that's probably in the days or weeks. Right? That's usually how we think. Potentially months, depending on how bad. To a machine, a long time is minutes.

Viktor 00:36:13.979 And it doesn't have the concept of time. It actually understands the concept of time from the data it was trained on, on the internet. So when we discuss on Stack Overflow how this specific thing takes three days to fix, that's the response it is giving you. I actually keep, uh, when I'm creating PRDs, from the very start, it gives me an estimate, and it is so ridiculous and funny that I keep it there. It still gives me an estimate, and every PRD that I start working on takes anything between one day and weeks. It's the entertainment part, kind of like, hey, let's chin up. How long will this take? Five days? Cool. Let's do it.

Darin 00:37:02.839 Yeah, I have another scenario where it's always reminding me, hey, we need to fix this. We need to fix this. I went, no, we don't need to fix this right now. You know, we need to fix it, but it's not important and I don't wanna deal with it right now, 'cause I know what the bigger fallout is gonna be after that change is made, and I'm not ready to deal with that fallout yet. Which leads me to, let's think about this: we've got a handful of agents that are working together. Agent A makes a really subtle mistake and does the handoff to B. B doesn't catch it. So C starts building on the problem, and then D starts building on the problem, until E finally just blows up, because we've wrapped the bad with bad, with bad. But again, how many times have we done that as humans? More than I'd like to count.

Viktor 00:37:52.380 That situation is what makes me very happy that, without planning, I made certain career decisions in the past. In the past I was specialized, right? Kind of like, oh, I was amazing at Visual Basic. And this is not .NET, just FYI, right? Through my career, I was specialized in certain things, and then through random things that happen in life, I went more towards generalization, right? Kind of like, I want to understand how the system works, and I'm not better than anybody at any specific thing. But, uh, I'm very good at understanding how the system works, and that's the skill that I feel gives me an edge right now. Because now I can actually do the delegation. I'm the orchestrator of the magic happening over there, right? Oh yeah, now you create a PR. Oh, now let's do this. Now let's do that. Kind of, oh, this is wrong, this is good. Right? And this is not because I understand, let's say, Go better than anybody else. That's not the superpower anymore. The wide knowledge, that's the new thing.

Darin 00:39:06.496 One of the other places where agent-to-agent can break down, and I've seen this before too: false completion reports.

Viktor 00:39:13.160 Yep.

Darin 00:39:14.140 That's like, okay, make sure everything's fully tested. That was the directive, and it's like, okay, I'm done. Okay, where are the tests? Oh, I didn't write any tests. It's like selective memory

Viktor 00:39:27.140 Oh, even better is kind of, five tests failed, but it's not because of what we are working on. That's my third skill. Kind of like, I don't care if all the tests were passing before we started working on this, I could not care less. This is me paraphrasing the skill. That's not the exact wording. I could not care less what you think, whether we caused it or not, we are gonna fix all of that. And if it's a flaky test that has nothing to do with, uh, what we are working on, we are gonna fix it as well. Kind of like, a hundred percent of tests need to pass, period. And you're not deleting any of them as a way to make them pass.

Darin 00:40:05.456 Yeah, not deleting or, and I've run into this a lot, ignoring the test, like tagging it with ignore test or whatever the language,

Viktor 00:40:13.140 It was a while ago when I discovered, and this is because I didn't pay attention, that it just changed, uh, in TypeScript, at least in the test framework I'm using, there is describe, and then, you know, the description of the test, and the test is inside, and it would add .skip. A hundred percent of the tests pass.

Darin 00:40:31.345 of course they do because you skipped them.
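One way to catch the `.skip` trick described above is to scan test files for skip markers before trusting a green run. The sketch below assumes a Jest-style runner, where `describe.skip`, `it.skip`, `test.skip` and the `xdescribe`/`xit`/`xtest` aliases disable tests. The transcript does not name the framework, so that choice and the `find_skipped_tests` helper are illustrative.

```python
import re

# Skip markers of a Jest-style runner: describe.skip(, it.skip(, test.skip(
# and the x-prefixed aliases xdescribe(, xit(, xtest(.
SKIP_PATTERN = re.compile(
    r"\b(?:(?:describe|it|test)\.skip|x(?:describe|it|test))\s*\("
)


def find_skipped_tests(source: str) -> list[tuple[int, str]]:
    """Return (line_number, line_text) for every line that disables a test."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SKIP_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


# Illustrative TypeScript test file content.
sample = """\
describe.skip("checkout flow", () => {
  it("charges the card", () => {});
});
describe("cart", () => {
  it.skip("applies discounts", () => {});
  xit("removes items", () => {});
});
"""

skipped = find_skipped_tests(sample)
for number, text in skipped:
    print(f"line {number}: {text}")
```

Run against the files changed in a PR, a non-empty result could fail the build, so that "a hundred percent pass" cannot be reached by skipping.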

Viktor 00:40:33.440 Yeah. And you skipped it because it's not relevant to what you think you're doing. And by the way, you have no idea that we are in a seventh context clear, uh, the seventh time we cleared context so far.

Darin 00:40:47.050 If you understand what he just said about clearing context, welcome. That's probably the most painful thing that we run into in working with AI agents today. You've been working with something, and it's like sitting down with somebody and doing pair programming, and then you say, okay, hey, let's go eat lunch or go take a break for 10 minutes, and you come back and your pair forgot everything, like they took a memory loss pill, and now you're having to start all over again.

Viktor 00:41:20.570 Have you seen the movie Memento?

Darin 00:41:22.660 I have not.

Viktor 00:41:24.035 Oh man. So it's like Groundhog Day, except that he needs to figure it out. And so he writes, no, you've seen Groundhog Day, right?

Darin 00:41:32.620 Yes. Yeah. Yeah.

Viktor 00:41:33.605 Okay. It's like Groundhog Day where he loses his memory over and over again. Uh, he doesn't wake up on the same day, but he kind of keeps losing his memory and needs to, kind of like, start recording for himself what he discovered. That's how I feel we work with the agents, right? Except that instead of those tools we have, uh, our own memory, or agent MD files, or vector databases, or whatnot. But every session, yeah, it's, let's start over.

Darin 00:42:01.242 And sometimes you do want to start over. It's like, okay, I'm getting ready. This is the other thing: you'll forget to clear it out and it's still building on what you've been working on, even though you're going down a completely different path, and that can be problematic as well. So how are we supposed to actually update our SDLC to work politely with agents? Can you think of any way? I mean, we can drop it into any part of the pipe, but we've seen before that, okay, we sped up one part of the pipe. Upstream is going to be able to get into it faster, but downstream's gonna get backed up.

Viktor 00:42:39.074 There are two, I feel, important parts. And in both cases I'll assume that you actually automated things that are repetitive, right? You're not using humans to execute tests, except while developing and things like that, right? So you automated things that are repetitive, and then you have people involved in many different stages of the SDLC. That's my assumption right now. Right? And you keep people involved in most of those phases. It's just that they're now augmented with AI. That's the first step. The second step is that you don't need multiple people in the SDLC. I strongly believe that we are moving to a world where a team managing a product is no more than three people. And this is mostly for contingency. Kind of like, people leave. So three people max. And when you say three people, that's not much different from before. But when I say three people per product, I mean product, fully, kind of. If it has a backend and a front end, that's the same team. If it has a database, that's the same team. If it deploys to production, that's the same team, right? Kind of like full end to end, one team, up to three people. That's my new norm. Right? So you're augmenting each phase that previously required your intelligence, so that it continues being your intelligence, plus augmented with AI. And you are reducing the headcount per product, right? So if you had five people involved in the SDLC, you can do it with anything between one and three people. There will be more drastic changes, but I feel that those are the first steps.

Darin 00:44:16.697 Also, you sort of leaned into this earlier when you were talking there: we wanna keep humans in the loop where it makes sense, in critical handoff parts. Like, we want to be able to have the agent do everything to get ready to deploy to production, but the human should have the final button push to say, okay, go ahead and go. Everything is good, we're good to go. Just that final gate. Autonomy is where we're headed. That's a long-term goal. AWS got there a little bit faster and we found out what happened. But for now the starting point is, okay, let's do just like we did in the past in a nice Jenkins pipeline. Yeah, we don't wanna go to production directly. The human will go through the checklist and say, yeah, go ahead and go. And then everything else happens. Again, that's no big change. It's like, we just don't want everything to be autonomous today.

Viktor 00:45:13.630 You can make an analogy with cars. I dunno if you remember when, let's say five years ago, everybody thought that fully autonomous cars would be on our streets in a year. And that never happened. We did not get fully autonomous cars all around the globe, uh, running wild, years ago, even though everybody predicted it. Now, the reaction from some people was, oh yeah, so autonomous cars are just silly, we should abandon that. That's the wrong conclusion. And for those who tried to put autonomous cars on the street that early, that was a wrong conclusion, uh, as well, right? So we need the middle ground. And that middle ground is, yes, we are developing this, we are testing it, we are rolling it out. You know, it starts in San Francisco, five streets. Cool. The whole city. Cool. Five cities, 10 cities, blah, blah, blah. Now we reach New York, probably one of the hardest, biggest challenges for autonomous driving on the planet, right after Istanbul, right? And a city in India. So, yeah, we are not abandoning the idea. Autonomous agents will be here. What will be the level of autonomy? I dunno. Uh, but don't try to roll it out now. Even if the technology's ready, you are not.

Darin 00:46:41.984 There's a bunch of other items. I wanna throw in one more. We need to monitor the agents, not just the output from the agents. Said differently, we need to monitor the humans, not just the output of the humans. The problem is, we have machine speed now instead of human speed.

Viktor 00:46:57.976 Correct.

Darin 00:46:58.691 Most of the time, machine speed is gonna be faster than human speed. The only time that won't be true is if the human did rm -rf / and then the machine took over these problems. You know, we're building out these pipes. If we're really trying to inject AI into our release pipelines, into our SDLC, we've got to rehab the pipe so that it's fast enough to deal with the onslaught of whatever's coming at it, whether it's PR reviews, whether it's new code generated, whether it is new features coming in. Again, following the AWS model, now we have an agent that has operator-level access to all of our infrastructure. We've got to be paying attention, and we have to make sure that it's at speed, because once it actually works once, the business is going to expect it to work all the time.

Viktor 00:47:58.477 Yeah. Uh, let's make it 10 times first.

Darin 00:48:01.770 10 times. Yeah. Okay.

Viktor 00:48:03.100 Okay. You know, let's say that you have 10 steps, and I'm being ridiculous now, simplifying it, 10 steps in the SDLC, right? And now I'll give you two options and you tell me which one is going to result in more features delivered to production. One: increase the speed of each of those 10 steps by 10%. Or two: double the speed of development. What gives you better results?

Darin 00:48:30.613 That's a good question because I could argue either one, depending on what my team structure is like.

Viktor 00:48:37.298 Yeah. But you know, if development is double the speed and whatever comes after development is the same speed,

Darin 00:48:44.148 Oh, we're

Viktor 00:48:44.653 you're not think, you're not, you're not delivering anything faster.

Darin 00:48:48.288 right. We're just backing up at that point.

Viktor 00:48:50.338 You are just kind of piling things up. Like, before we were piling issues in Jira or GitHub, now we're piling up PRs. That does not help.
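The two options in this exchange can be worked through with a toy model. The sketch below assumes 10 stages in series that each handle 10 items per week, that development is the first stage, and that steady-state delivery is limited by the slowest stage. All numbers and names are illustrative, not measurements from the episode.

```python
def pipeline_throughput(stage_rates):
    """Steady-state delivery rate of stages in series: the slowest stage wins."""
    return min(stage_rates)


# Assumed baseline: 10 SDLC steps, each able to process 10 items per week.
baseline = [10.0] * 10

# Option one: every step gets 10% faster.
option_one = [rate * 1.10 for rate in baseline]

# Option two: only development (assumed to be the first step) doubles.
option_two = [baseline[0] * 2] + baseline[1:]

delivered_one = pipeline_throughput(option_one)
delivered_two = pipeline_throughput(option_two)

# Work that development produces but the next step cannot absorb piles up,
# for example as open PRs waiting for review.
weekly_pileup = option_two[0] - option_two[1]

print(f"option one delivers {delivered_one:.1f} items/week")
print(f"option two delivers {delivered_two:.1f} items/week")
print(f"option two queue grows by {weekly_pileup:.1f} items/week")
```

Under these assumptions, doubling development alone delivers no more than the baseline, while the queue in front of the next step grows every week.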

Darin 00:49:01.939 Well, some people think it is because they're meeting their goals, so they're gonna get their bonuses, but the people downstream aren't because they're not meeting the newly revised goals of PR processing.

Viktor 00:49:12.539 Yeah, and that's a change that those companies should have made long before AI. Kind of, understand that your goal is delivery of something to production. That's the only goal that matters. The fact that you discover more bugs in QA, or that you deliver more lines of code in development, or that you detected more issues in the security phase, is irrelevant on its own.

Darin 00:49:38.288 Could you agree that a fully agentic pipeline SDLC is coming?

Viktor 00:49:43.667 Augmented. Yes.

Darin 00:49:46.087 Okay. Augmented in the middle, but let's step back. Could we actually get to a fully, this is like the autonomous

Viktor 00:49:53.422 Like more distant future, you mean?

Darin 00:49:55.927 distant future. Yes.

Viktor 00:49:56.882 Oh, yeah. Yeah. Imagine a factory, and you have quality control. Uh, and you're making, whatever, screws in a factory, right? Now, you're not checking every screw that comes out of the machine, right? That would be just insane. You could just as well get rid of the machine and do it, uh, manually, completely, right? You're doing quality control to understand whether the process works well or not, right? You're extrapolating from a subset of the data. That's what we do in production. Kind of, if you're storing, let's say, traces, you're not storing traces for every single request to discover whether every single request worked. No. You're summarizing them, you're grouping them, you're aggregating them, and so on and so forth, and trying to extrapolate. Okay, so what is acceptable for me? Kind of like, I dunno, 99.9% of successful requests, that's my metric. Cool. I'm not measuring all requests. I don't know whether it's really 99.9, but I know it from those random ones that I picked up and stored in a database. So my sample is 1% of the traffic, and then I'm measuring whether 99.9% of the 1% is, uh, okay, and if it is, I'm doing fine, and so on and so forth. You need to extrapolate from the sample, right? That's the only way to truly do it.
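The sampling idea above can be sketched with simulated traffic. This assumes requests succeed independently with a true rate of 99.9% and that each request is kept with 1% probability. The traffic volume, the seed, and the helper names are hypothetical, and the standard error uses a simple normal approximation that is rough when failures are this rare.

```python
import math
import random


def sample_traces(outcomes, sample_rate, rng):
    """Keep each request outcome with probability sample_rate."""
    return [ok for ok in outcomes if rng.random() < sample_rate]


def estimate_success_rate(sampled):
    """Return (estimate, standard_error) for the share of successful requests."""
    n = len(sampled)
    p = sum(sampled) / n
    return p, math.sqrt(p * (1 - p) / n)


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simulated traffic: True means the request succeeded (assumed 99.9% rate).
TRUE_SUCCESS_RATE = 0.999
outcomes = [rng.random() < TRUE_SUCCESS_RATE for _ in range(200_000)]

# Store only about 1% of traces, then extrapolate from that sample.
sampled = sample_traces(outcomes, sample_rate=0.01, rng=rng)
estimate, std_err = estimate_success_rate(sampled)

print(f"sampled {len(sampled)} of {len(outcomes)} requests")
print(f"estimated success rate: {estimate:.4%} (standard error {std_err:.4%})")
```

The smaller the sample, the coarser the estimate, so a target like 99.9% can only be judged to within the sample's margin of error.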

Darin 00:51:22.900 Well, I think the organizations that are going to win this are not gonna be the ones that try to automate everything the fastest. They're gonna be the ones that try to insert agents in the right places. With the right guardrails and still keeping humans in the right place

Viktor 00:51:42.482 I disagree with that. I think that there would be, companies that insert AI in almost all the places and keep humans,

Darin 00:51:50.838 well, that's gonna be

Viktor 00:51:51.448 Right? Because if you, yeah, but if you say, oh yeah, right stages. But the right stages are almost, uh, all the stages that are not currently automated. And they're not automated because they're not repeatable. So, uh, if you say no AI in PR review, but AI in development, you're not getting any benefit. And if you say, okay, so PR is the right place. Yeah. But security analysis is also the right place. Observability, is that the right place? It is the right place. I would argue that there is no place that isn't, or maybe not no place, but the majority of places that are not repetitive and already automated are the right places. Not full autonomy. That's not what I'm saying. Just to be clear, I'm saying augmented with AI.

Darin 00:52:38.990 Yeah, I think at some point in the future it will go beyond augmentation and it will just be AI. But that's longer term.

Viktor 00:52:47.290 Yes.

Darin 00:52:48.020 Now for some of you listening, you're thinking at my company there is a 0% chance any of this will ever happen, or it'll happen after I'm retired in 50 years.

Viktor 00:52:58.820 Well,

Darin 00:52:59.420 Let me give you a cautionary tale,

Viktor 00:53:01.170 Mm-hmm.

Darin 'Cause I have one. I've brought it up before. My dad used to do sheet metal fabrication, HVAC duct work, and would create or build HVAC for fabric factories, textile factories. These factories, and this was in the late eighties, 40 years ago. I built these massive a hundred thousand, 200,000, 300,000 square feet facilities with weaving machines and everything else. Once it was built, once it was online, the whole thing could run with five people. Five. These were the people that went around and did maintenance on the machines. That was it. That was 40 years ago. To think that that kind of automation is not coming to knowledge work, forgive me, you're fooling yourself. You are delusional.

Viktor There is exception.

Darin 00:54:01.838 Of course, there's always exceptions, but yes, go ahead.

Viktor 00:54:04.143 There is a big, big, huge exception. And that's if you have some kind of monopoly,

Darin 00:54:09.353 Oh, absolutely.

Viktor 00:54:11.133 Right? So, kind of like, oh, I'm a banking system in Argentina, and I'm inventing this now, I'm not trying to ridicule Argentina. Uh, and I convinced the government that, uh, AI is not allowed anywhere in the banking system. So actually no competition can kill me. I'm the biggest one there. Then you honestly don't need it.

Darin 00:54:32.003 Fair enough. But for those of you that aren't working in those levels of monopolies, I'm not saying look for a job, but either brush up your skills or start thinking about what you want to do for your next career. So what do you think? This got really heavy, really fast at the end. Head over to the Slack workspace, go to the podcast channel, and leave your comments there.