DOP 327: When AI Tools Go Rogue

Episode: 327

Published: December 3, 2025

Viktor 00:00:00.000 I don't know if you read, I, found out about it yesterday. about Anthropic investigating rogue, sleeper agents. Agents that are activated on certain date. You know, like during Cold War where there was like sleeper Russian agents. same idea. Imagine you're running an agent that does exactly what you want it to do, and there is a hidden part that is activated on November 27th, 2025.

Darin 00:01:33.168 Viktor, do you remember the time when Google said to put glue on pizza?

Viktor 00:01:38.490 I do,

Darin 00:01:40.125 Did you do it?

Viktor 00:01:41.820 not yet. Not yet. I have it on my to-do list.

Darin 00:01:45.015 It's on your, I'm assuming it's way down on your to-do list.

Viktor 00:01:48.720 It's very low on my list. That's why I haven't done it yet. In part because I don't cook for myself. I order mostly, and I haven't found, pizzeria the top receipt yet.

Darin 00:01:57.211 It would be funny if it wasn't true. That's happened. There's been other things that have happened where AI tools apparently have gone rogue. We won't get into all the politics of it. I think what we can say though is no matter what you're. Political persuasion may be, uh, nobody wants glue on their pizza. And some of the other things that have come out of the AI tools,

Viktor 00:02:24.759 Almost nobody. There is always somebody. Somebody is wanting weird stuff. Who knows?

Darin 00:02:30.128 that's what we're gonna talk about today, is what do we do when the AI tooling that we're using starts to go rogue? Now there's been these high level pizza on or glue on pizza. That's not what we're gonna talk about today. I think we're gonna focus on what's gonna happen when, and we've seen this happen. When you allow an autonomous AI agent to go off and manage your infrastructure for you, is that a good idea or a bad idea?

Viktor 00:02:58.373 This is going to be the shortest podcast ever. Just don't, I don't believe in autonomous agents for anything but trivial tasks. that does not mean that I'm saying that we will never get there. I'm not saying that, but I haven't had a single session. With my AI agents that produced correct output without me providing additional instructions. Right. You know, like, include code when it, it does this and you say yes or no, and you say yes, yes, yes. And there was always a time when I said no. And then I had to type additional instructions, because it went in a different direction than I would want it to go. Right. I haven't had a single session where I haven't said no at least once, and I cannot imagine me running unsupervised agents today if I'm not capable of, saying yes, not capable. Sorry, if I don't reach the point where I say yes to everything. When I reach that point, then we can talk about unsupervised agents.

Darin 00:04:06.744 But even if you did say yes, you got to that point and then you switch out the model underneath, you can't trust it anymore. And, and actually even beyond that, you can't even trust that the models aren't changing out from underneath you. Today.

Viktor 00:04:21.943 first of all, you can trust that in majority of cases, that models are not changing underneath because there are, especially when you work with APIs, there are very specific versions that you can use, right? as an example, there was a bunch of Sonet four releases so far, they're also sonnet four, But there were multiple releases of Sonnet four so far. So if you're very specific and you say, I don't remember now the naming, uh, scheme for sonnet, but sonnet four, blah, blah, blah, blah, blah, then you're using the model that was specific model that was cut at specific moment in time. Now that model might not be around forever, so you are going to be switching and I will assume for a moment. That whenever you change any software that you're using, you're doing some testing first. And I consider models in this case software, let's say Twitter, API, Are you switching to a new version of Twitter, API, without checking it, testing it, validating it. If the answer is yes, and some people say yes, then I think that you're doing it terribly wrong.

Darin 00:05:34.173 In the future though, what you just said there though about models changing out, it's sonnet four dash something we usually forget about the dash.

Viktor 00:05:45.049 yes.

Darin 00:05:45.894 And if you forget about the dash, and I've heard this over the past few weeks at the time that we're recording this, that people are saying Sonnet four has gone really downhill. There's probably been a new flavor of, well, let's, let's assume that that's true

Viktor 00:06:03.684 say it's the case. Yes.

Darin 00:06:05.184 Let's say it's the case then could that have been the dashes changing out? Sure. It's, it's a much different conversation, I think, but to the point of, I'll, I'll step off on a rabbit trail here for a second. People are still sort of vibing instead of actually. Controlling what's going on. Different conversation, not for today, but it could happen, right? If you're not using the dash, it could go rogue because there is a new flavor, and if you don't have your guardrails up, that's the key point to this guardrails, then you should expect chaos to ensue.

Viktor 00:06:46.586 Yes. Absolutely, and especially now to be clear, I don't think that that's that important when you are using supervised agent. So when I'm using cloud code, I don't use Sonet four dash, it's just SONET four. because I'm behind the wheel all the time. I'm checking it, checking out what it's doing, confirming that it's okay, letting it proceed or instructing it, providing additional instructions, and so on and so forth, right? So I, I don't have a strong need to be very specific in those cases, but if you're talking about you releasing iGen software. Or if you're talking about you going towards unsupervised agents, then everything we said about D is extremely important because then, then you're releasing software with certain expectations. Right.

Darin 00:07:39.500 I am gonna pull a Viktor here 'cause you've done this to me a few times.

Viktor 00:07:42.680 Okay. Go for it.

Darin 00:07:44.810 Isn't this what's gonna happen when you just have a human doing the same thing anyway? They're unsurprised.

Viktor 00:07:50.220 first of all, humans are rarely unsupervised, especially in enterprises, There are, let's say a significant percentage of companies have, let's say, code reviews, right? What is the code review, if not a form of supervision? Or you run automated tests before it's merged to the main line. What is that? A supervision? Right. So we have supervision for humans. Not necessarily a hundred percent of the cases, but it's not far from that figure.

Darin 00:08:20.184 So that's the basics of supervision. But then what happens when it's that person's last day and they decide to be not so polite on their way out the door?

Viktor 00:08:33.219 Yeah, Now that depends on the permissions you give to that person. Now, my understanding, and this is especially common, my understanding in us is kind of basically first you remove the permissions of a person and then you tell that person that, uh, he or she's fired, right?

Darin 00:08:51.714 That's one way to think about it. Yes. That that is not uncommon. What if it's the other way? They're not getting fired. They're deciding they're going to fire themselves.

Viktor 00:09:03.480 Well then hope that, that that's, that's a good person without, uh, bad intentions and or maybe that person never had permissions to, to the damage. Kind of if, if you don't have permissions to, to push directly to the main line or to merge anything to the main line, then, uh, if you do something bad and that something bad actually happens, that means that you have very weak, processes like code reviews or what's not.

Darin 00:09:37.398 So how can we keep things going from going rogue? I don't think it's much different than what we do with humans.

Viktor 00:09:44.868 I think it's the same thing. We might change the processes, we might change the tools, but conceptually it's the same. Right? You are supervising the work of, I'm going to use the word intelligence now, right? intelligence is doing something and you're supervising that. Whether that intelligence is human or artificial or something else, uh, you know, squirrels running around, that's not that necessarily that important. Right. But you're doing something.

Darin 00:10:14.231 So if we've got our guardrails in, we know what to do with humans. You said conceptually that's what we're gonna be doing. What can we do in practicality? Today, what are the things that we can do to keep, I mean, to an extent, giving it good instructions only gets you so far because sometimes it just ignores the instructions.

Viktor 00:10:35.764 there are two common causes of why something happens that shouldn't happen, right? There is malice kind of this person did this, did this intentionally. he wants to mess, mess up with us because he, he doesn't like us anymore or because he got paid to do that or what not. And there is lack of knowledge and experience, I did this. Bad things happen in production. I didn't, I did not want those things to happen. I just did not know better. Now I'm less concerned when AI is, when we are talking about ai, I'm less concerned about the first use case malice. in theory it can happen, but that's much lower on my list because we have bigger problems to deal with. And those bigger problems is knowledge, You're going to mess up something badly. If you don't know, first of all how that something works like Kubernetes. Second, if you don't know what are the policies and patterns and practices and what's not in your company, If you don't know both you're likely to mess it up. now I can assume that AI knows how something works like common knowledge. Kubernetes, kind of like it knows Kubernetes. I have. I don't have problem with that at all. The problem I have is that it has no idea how your company operates. And that's where you come in. You're supposed to explain to ai. Okay. So we are using Kubernetes to begin with, like kind of don't try using eec ECS, for example, right? And we have those network policies. we always do this, we never do that, and so on and so forth, right? So there is that non-public knowledge that any intelligence in any form needs to get before he or she or read. He's capable of doing something really good, useful for the company. Uh, and that applies to, per say, intelligence. It applies to artificial and human right. You, you just hire the person, right? That person comes to your company and on day one, the moment that person enters the building, starts deploying to production, right? Uh, you cannot allow that, not because you don't trust that person, but because that person has no idea. That person just deployed something to Azure and your company's using only AWS. And how did, how would that person know that something, right. So we have a challenge of enhancing or augmenting AI experience or knowledge with the knowledge specific to the company where we work. Without that, it cannot be useful. it won't, it won't produce good results. It's impossible.

Darin 00:13:33.943 So here's my concern.

Viktor 00:13:35.698 Mm-hmm.

Darin 00:13:36.403 The humans that are responsible for managing the AI

Darin 00:13:43.693 typically are not trained managers. This is going off in a deep end.

Viktor 00:13:52.018 Okay. Okay.

Darin 00:13:53.623 All right. Most of us are developers. We think like developers. We're used to pulling Jira tickets and doing them. We're not trained to think like a manager. We're not trained to operate like a manager.

Viktor 00:14:10.185 Yes, that's correct. Right. We, we don't at, actually, let's put it differently. Some of us don't. So if you've been a tech lead, for example, right? You've been, you tech lead is a manager, a specific kind of a manager, right? You're managing technical aspects of a project and that involves people working on those technical aspects. The challenge is that. There are multiple types of managers and hardly anyone, uh, is doing all those types, right? There is a technical lead, right? It's one type, there is a product manager, right? Trying to figure out what is the feature, kind of what is it, what, what are we building in the first place, right? There is a people manager, right? And there are probably other types. And no single person, is all of those at the same time or rarely? Most of us are not. Right? And then we need to figure out do we become that Uber person, Uber manager, or do we somehow keep that separation and then have different people supervising different. Types of the work that AI that the agents are doing? not sure.

Darin 00:15:36.951 I think the key part to being a manager, and I've seen this over the years, unfortunately, is being able to document things. Managers are typically really good at documenting things.

Viktor 00:15:51.983 Yes, and everybody else is really bad at reading that documentation. Isn't that true? Let's face it. How, how often were you in teams that kind of, uh, months were spent or days on documenting something and then, everybody, most people just glimpse through it in, in five seconds and say, okay, I know. Uh, okay. Clear.

Darin 00:16:17.457 Okay. I'm trying to be a little more, let's play it out here for a second. You go through your annual performance review. This is probably a bad example, but you go through your annual performance review, you sit down, you talk with your manager. Manager talks with you. You have to put out some paperwork in order to make HR happy. We're gonna be doing those same things as people that are managing ai. If we don't do that, the parallel on that, back to the human is. I got hired in, I report into a manager, but we never have one-on-ones. We never do an annual performance review. I just go and do whatever I want to do without any practical oversight. What do you think my value to the company is really going to be?

Viktor 00:17:04.845 let me answer that with a question. What is the goal, do you think? Primary goal of annual performance review. Or the second, because first you'll probably say higher increase in salary. Right? So the second main goal of a performance review,

Darin 00:17:21.734 Well, you took the first one, which was more money, but of course, yeah, whatever. I, I don't know. What are you thinking? Because I could go three or four different ways.

Viktor 00:17:31.589 I think it's uh, it's one of many ways how we learn. Good performance review. Most of them are not just, let's face it, but kind of, okay, so this is good. This not so much. You should be doing this and that, It's redirecting you how to do your job better. It's providing knowledge in certain form. You've been focused more on these things. You haven't been focusing on those things as much as you should. You should read this, you should watch that. You should listen to this. That will give you a better understanding how this aspect where you haven't been doing well should be done. Would you say that that's a valuable part of performance review?

Darin 00:18:21.738 It is actually the, probably the most important part of the performance review.

Viktor 00:18:24.598 There we go. I just gave you a slack saying second most part, still money. Kind of there somewhere, but let, let's make it second.

Darin 00:18:32.135 So what happens if I don't get that?

Viktor 00:18:34.573 Then Basically, you are not getting the, the value of that reveal, right? Because the goal is for you to thrive in that company. The goal is for you to do. Better job in that company to be a more, more valuable employee. Right. And your supervisor is supposed to tell you how to get there, Tell you what are your strong, points and what are your weak points?

Darin 00:19:04.300 so once you get that feedback from the manager, that's good. But what happens when I walk away and I never look at that again and refer to it? Or I'm not reminded of it.

Viktor 00:19:15.690 Um, after that well done review

Darin 00:19:19.395 Yes.

Viktor 00:19:19.980 then it's up to you, right? Kind of you've been given tools how to. Perform better or knowledge or tips, how to perform better and you, you did not, do anything about it. Then if you're still talking about people, then it's up to you, right? Kind of. So you've been missing skills, operational skills. You've been mostly focused on programming and you're a DevOps engineer. And you keep ignoring that advice. either it was a bad advice, then the review itself was not good. Or you, you've been given good advice and you ignore it, and then it's really, uh, you cannot complain. Most of the time it's a bad advice in the first place, but let's be positive.

Darin 00:20:10.424 Alright, so let me flip it around a little bit. Let me say that I am the one managing AI

Darin 00:20:19.399 I've been told by my manager I have to. Use AI to do my job

Viktor 00:20:25.089 Okay.

Darin 00:20:25.953 and let's also assume that the AI has been properly configured, not trained, but it, it has the knowledge that it needs to have for, its for my situation in this company, right? Every, everything is, is looking good.

Viktor 00:20:42.593 now you reduce the number of, percentage of people using AI in that way to maybe 1% or 0.1, but go on.

Darin 00:20:50.898 Exactly. I, I, I give you that, but assuming that that's true, but remember, I'm being forced to use it. I am maliciously complying with the order to use ai. So let's flip it around. This is what we're talking about earlier too. What happens when somebody goes rogue, a person goes rogue, that's driving ai.

Darin 00:21:16.068 What are we gonna do then?

Viktor 00:21:18.049 I'm not sure I have the answer, but I also don't think it's any different in from many other similar cases. Before, let's say company decided to, to virtualize, now you're talking like, I dunno, 15 years ago, right? And you No, it should be always bare metal. Right. Company decided that we are going virtual because it's more efficient, because it this and that and you, you don't agree with it, you sabotage it or something like that. Well. Uh, whatever company was doing with doing that case in the past? I think it's the same now, right? Because we can discuss whether we are already there or No, but let's assume for a second that we are, that there is, increase in quality or decrease in, or increase in speed, uh, or something like that. There is a benefit to using ai, whether it's 10% or 90%. It doesn't matter, right? The company saw the benefit and did the right thing set up the system in the right way. It's really helpful. Still significant minority of the K of companies did that, but, uh, company did all that. And it's really showing that we are 15% more efficient. I'm being very, very, negative now, right? That it can easily be more than 15, but let's say 15% more efficient. And now you come and say, ah, no, I don't want to do that. You know, I think that the answer from the company should be, that's fine. It's perfectly fine. You don't have to use AI but our metrics, our thresholds changed, yesterday it was, okay, if you do X amount of work, now you're supposed to do X multiplied by something, right? Because others are doing it, kind of Others are. Somehow some percentage more efficient than ever before. I don't think that you have to use ai, but you have, but you cannot say that I'm, measured as, I was measured before. those are the new, new requirements, but not requirements in terms of whether you use, AI or No, but. if, you were doing 12 PRS a month before, now on average you need to do 15. That's a silly measurement, but let's roll with it. Right? That's because we are more efficient as a company. You don't have to use ai. Why we, I mean, we don't.

Darin 00:23:54.175 That's an interesting take on it because you can choose not to, and instead of me being the harsh HR person, it's like, oh, you didn't meet your numbers, so therefore you're out. You're taking the tact of, you don't have to use it, but the bar has been moved.

Viktor 00:24:10.825 Yeah, I mean it's same thing. Let's say that, we agree that most people agree that using an IDE for coding like VS Code or JetBrains or whatever makes you more efficient, right? Let's assume that for a second. Now, do I think that people must use id. They are not allowed to use Vim instead, no. Or none, or whatever you want. I don't care what you use. It's just that the how we are measured changes over time. And if you end up using something else and that makes you stay on top, power to you, man, kind of like just keep doing that. Amazing. Right. If it turns out that you. Drop to the bottom, then, I'm sorry, but kind of there is no raise, and this is positive outcome kind of for you're fired, right? Because simply you cannot keep up with your colleagues, but it's up to you which tools you should use. I don't think that we should ever force people to use specific tools. if you say that K nine SS is more efficient to manage Kubernetes resources than cube control, does that mean that everybody should be forced to use canine s.

Darin 00:25:26.541 Absolutely not. Yeah. Well, you're, you're bringing home the point again, the bar has moved. You're now moving down as the human in performance. You're put on a performance improvement plan, and you keep dropping because you keep deciding to not use AI or whatever else. Tooling that's there that everybody else is using, that's causing you to keep sliding down the ranks,

Viktor 00:25:51.004 Or maybe not for all I know. You might be even more efficient without it. I dunno.

Darin 00:25:57.631 But if you do start lowering in your efficiencies and that's being measured. Just forget ai, right? Again, I agree with you. You don't have to use ai. That doesn't mean you're gonna be fired, but let's see how you measure up against everybody else.

Viktor 00:26:15.596 And just to be clear, we had those changes in how we are measured all the time. There's nothing new about it. when we moved from punch cards to, I dunno, whatever came after the, how we are measured, the change when we moved from, from assembly to, I dunno, see it changed how much we can produce right. With Kubernetes, we can do things that we couldn't before and so on and so forth, right? So, no, you don't have to use the tools, but the objectives are different or measurements are different.

Darin 00:26:51.464 And I think that's gonna continue on the question will be what is going to be the next tool beyond ai? We don't know yet,

Viktor 00:27:01.409 We dunno yet.

Darin 00:27:03.029 but we can guarantee that there is gonna be something beyond that

Viktor 00:27:05.804 Oh yeah.

Darin 00:27:06.694 just because we're there now, will I have to use it? I hope not. I hope to be retired by that point, but as fast as everything is moving right now, that next thing could be coming in sooner than we think. I mean, again, AI has moved fast. It's not as fast as what we've seen in the beginning of 2025 with Ag Agentic AI coming in. I really dislike that phrase, but it

Viktor 00:27:32.857 Oh yeah.

Darin 00:27:33.262 make a difference.

Viktor 00:27:34.935 It's such a difference that today when I say ai, I actually mean always authentic ai. I, I don't even consider anymore. I dunno what else is there, like GPT, right? agents became synonym, at least in our industry for ai.

Darin 00:27:50.037 I wonder what it's gonna look like this time next year, like towards the end of 2026.

Viktor 00:27:58.121 if the speed continues being the same, I cannot even imagine, man, 'cause the difference between this time last year, massive difference. This time last year, I would not even consider it for anything except asking what's the closest restaurant, and now I'm the other camp. I, I don't remember the last time I actually did anything without, using an agent. Massive difference in a year and massive difference. This. People might get wrong. It's not that there is a massive difference in capabilities of models. There is, but it's massive difference in the ecosystem as a whole, right? Because we, what did we have before? Models, now we have models with the agents, with the cps, with, uh, some kind of semantic search slash memory with so many other things. In a year.

Darin 00:29:04.456 I think you're right. Models are models are models. They're not gonna be changing a lot, but it's that ecosystem around that's gonna, I think, still keep getting bigger and bigger. In fact, I recently saw, I had to shake my head. Uh, somebody doing technical SEO for LLMs and the, the tagline, my phrasing not theirs, was how to load up all the spam into the LLMs so you can be found for your latest nutritional product in an

Viktor 00:29:34.466 That segment of the industry is going to be massive.

Darin 00:29:39.191 I know I.

Viktor 00:29:40.586 It's going to be massive. 'cause this now unrelated to our conversation, but basically, okay, so you figured out yesterday how to be on a first page of Google. Now you need to figure out how to be. In Gemini as an answer to, somebody asking something Gemini, for example, right? And done in a way that actually, that provides not only the answer because that's not enough. and being there not only, so that you are the answer. But that users are redirected to you after reading the answer. So how to provide an answer, that LLM is going to pick up among infinite number of other answers. How to have that answer be helpful yet insufficient at the same time so that somebody comes to your site. That's a completely new set of challenges. For anybody publishing anything on web?

Darin 00:30:44.062 And there will never be any rogue agents. I'm using agents in a different way that will cause issues there ever. No, we know that is to be false.

Viktor 00:30:56.485 I don't know if you read, I, found out about it yesterday. about Anthropic investigating rogue, sleeper agents. Agents that are activated on certain date. You know, like during Cold War where there was like sleeper Russian agents. same idea. Imagine you're running an agent that does exactly what you want it to do, and there is a hidden part that is activated on November 27th, 2025.

Darin 00:31:29.574 I was feeling okay up until that,

Viktor 00:31:33.029 But it's, it's it's very possible kind of, it's not impossible, even hard to detect, but who is going to check it? All right? You go to a company, you get the agent for free. It does some amazing thing. He sends emails for you, right? And you just use it. In your own infra.

Darin 00:31:56.988 it sounds like agents are the next level of malware.

Viktor 00:31:59.838 Oh, they definitely are.

Darin 00:32:02.129 The next ticking time bomb. So we think people can go rogue. Well, now we know agents can go rogue. What do you think? Head over to the slack workspace. Go to the podcast channel. Look for episode number 3, 2, 7, and leave your comments there.