DOP 342: Your Company Documentation Is Useless for AI

Episode: 342

Published: March 18, 2026

Viktor 00:00:00.000 When people talk about GitOps they very often say, Git is the source of truth, and I always correct them. Git is not the source of truth, not by a long shot. The source of truth is the VM that ended up being created because you pushed something to Git. What you have in Git is that you can call it desired state, or even better, those are the replicas of the sources of truth.

Darin 00:01:32.180 If you think about it, companies are really good at producing documentation. It's not necessarily good documentation, but it's lots of documentation. But how much of it is any good?

Viktor 00:01:43.462 None. Almost none. Uh, thank you so much for watching. That's the answer to the, to the big question. Think of it this way. How often did you find documentation, wherever you were working or with whomever you were working, easy to find?

Darin 00:02:00.431 Easy to find,

Viktor 00:02:01.970 Yeah.

Darin 00:02:02.561 uh, probably 10% of the time

Viktor 00:02:05.840 Yeah. Kind of like, you know, you have a problem and 15 seconds later or whatever, the relatively short period of time, oh, okay. So here's the solution. Outside of public internet, I bet that more often than not, you would go public rather than, uh, to, to your internal documentation. Because even, even though public doesn't, doesn't necessarily have information that you're exactly looking for, it's still going to be more useful.

Darin 00:02:37.276 and potentially more correct.

Viktor 00:02:38.602 Before we get to being correct, I think that the major problem is first whether you can find it. Because if you cannot find it, it doesn't matter whether it's correct or not. And when I say find it, I don't mean, so there are always docs, you know, kind of like docs that you wrote, or somebody you work with directly wrote, and the same documentation you use often and you have bookmarked that direct link. That happens. That's fine. You probably even keep it up to date, that's fine. But I'm talking about information kind of, oh, I dunno, I dunno how to solve this. Somebody must have had the same problem and documented it, and then you find it.

Darin 00:03:18.815 Right? But the stack of documentation, I think, follows this flow. Go with me here for a second. We have what is documented versus what is true, step one. Then we potentially have what is true versus what people actually do. And finally we have what people actually do versus what AI might be able to extract from those people to actually use.

Viktor 00:03:46.789 correct.

Darin 00:03:47.865 So when we go from do we have a documentation down to AI, being able to use that documentation, there's pieces in there that most of us haven't thought about. Is it true? And then what do people actually do with the documentation that's not even the documentation? What do people actually do? Even if the documentation is true, that doesn't mean the people are actually using it.

Viktor 00:04:08.154 Actually, almost every company has, often without knowing it, very accurate documentation. It's just not in the place you're looking; it's not in Notion or a wiki, it's in your system. Almost everything is documented there.

Darin 00:04:26.658 Well, you would think that, okay, my code is self-documenting. So you would think all the documentation about the code would be in the code.

Viktor 00:04:32.862 Mm-hmm.

Darin 00:04:33.678 However, not always because how many people have touched that code

Viktor 00:04:38.452 But depends what you mean by documentation in code. So do you mean comments or do you mean code itself? Why do we write comments in code?

Darin 00:04:47.928 because we can't figure out what the code is actually supposed to do.

Viktor 00:04:50.577 Yeah. But if you could figure it out, you wouldn't need comments.

Darin 00:04:54.412 Correct. I think that's a completely different problem that we shouldn't be talking about today, that could go on for hours, but. There have been reports, McKinsey's, Adobe's, other places, that when I was doing research for this say people will spend up to a couple of hours a day just trying to find information. That seems pretty reasonable to me, right? Even coming from a big shop like McKinsey or Adobe, right? People are just searching; they're not actually finding the information they need to actually do the thing they're trying to do. They're just trying to find the information.

Viktor 00:05:30.981 Okay. So the situation is kind of, the valid information exists. I just cannot find it.

Darin 00:05:35.868 Well, we think the valid information exists and we don't know where to find it. We're not even sure that the valid information exists, but we're assuming that it does. So for these big companies thinking that, okay, hey, we'll just throw AI at all of our documentation and everything will be much better, this goes back to the early days, back when Google actually used to have an appliance that you could buy that would scan everything and index your whole corpus of documentation.

Darin 00:06:06.334 Uh, you notice they don't sell that anymore, or at least I don't think they do because, you know,

Viktor 00:06:13.483 It's useless. That's why. Let's say that there are two types of information. There is information about why something was done, and there is information about how something works. Now, the first group, that's almost forever. That doesn't go out of date. We might not be doing that anymore, but that's pretty much always up to date. That's eternal. Now, how something works, that's very problematic. I never saw a case where how something works is actually up to date.

Darin 00:06:51.148 Unless that thing just never changes

Viktor 00:06:53.755 Unless that thing never changes. Yes. Kinda. Okay. So excluding mainframe, I never saw that thing actually, uh, being up to date.

Darin 00:07:01.540 I was gonna actually use something much simpler, like cleaning the toilets in the bathrooms used to. You'd think, okay, great, this, this is easy to do. I take the brush, I use the thing, I scrub it, and everything's clean. Well, guess what? New regulations came down from the city that you can no longer use this chemical to clean this because it causes cancer that then causes something else.

Viktor 00:07:23.115 Correct.

Darin 00:07:24.515 And that update has not been posted yet. Governments are really good about posting things and not doing anything about 'em. Okay. That's an overreach, but that's, Hey, that's what government can do. But you're leaning into this part of this documentation, what we can call documentation decay, right? It was correct at one point in time,

Viktor 00:07:44.701 Yes.

Darin 00:07:45.292 maybe a few points in time until finally that documentation is no longer true or is becoming less true over time until it is completely false. That doesn't help us any, Even if I found that document and I said, oh, great, I found the documentation, and even if it's one day out of date, sure, it might gimme a context, but it's not gonna gimme the answer I need. How do we work around that? I mean, how do we even keep documentation up to date that's gonna keep moving like that?

Viktor 00:08:16.598 You know, nothing can ever be a hundred percent accurate, 'cause it changes. But one thing that can help is actually to consider everything documentation. Because those changes you're mentioning, they're not undocumented. Oh, sometimes they are, but more often than not, they're not undocumented. They're just not documented in the place where they should be documented. So if I keep your example about scrubbing toilets, right, there's a handbook there in a drawer in the bathroom, right? And then the rules changed. Somebody found out about those rules and, uh, was informed about it over a Zoom call. That somebody, because this is the most natural thing to do, did not update that playbook I mentioned earlier. Now, is it that the information doesn't exist? No, it does exist. It's in a playbook, outdated, and it's in the Zoom call. It just doesn't necessarily exist in the place where it should exist, or where you expect it to exist. But the information almost always exists. I,

Darin 00:09:29.345 In the case you're talking about there, the information realistically is in somebody's head.

Viktor 00:09:34.978 no, it's in a Zoom call.

Darin 00:09:36.454 Okay. It's in a Zoom call. Fair enough. But somebody was on the Zoom call and at least, and, and it was probably in their head at one point. It's probably gone, but the long term of that is in a Zoom call.

Viktor 00:09:46.201 Correct. Exactly. Now it's in a Zoom call, and it was in somebody's head. It was in somebody's head, then in a Zoom call. Maybe afterwards it'll also be in some Slack conversation that is long overlooked, 'cause, you know, there were new conversations in that Slack channel.

Darin 00:10:05.659 Which leads us into something that we were talking about earlier. What happens when that person that was on that Zoom call leaves the company? Right? That last quick link to get us to the Zoom call is gone.

Viktor 00:10:18.468 unless, unless you scan literally all information and find it,

Darin 00:10:25.602 But what's the chances of you being able to, to scan all the documentation, digest it, and understand what it is?

Viktor 00:10:34.069 chances are relatively low. but if your question is, can we do it, the answer is yes.

Darin 00:10:40.641 sure. We could do it. Anything's possible.

Viktor 00:10:43.080 Yeah, I mean, we can transcribe all Zoom conversations. Actually, as a matter of fact, most likely if you're using Zoom, your conversations are already transcribed and they already exist in written format. I'm just keeping with the same example. So the information is in a written format, just as the outdated one was before. Now you have two places where you can find it: the Zoom call, you can listen to it, and, uh, you can read it in the transcription. Then somebody in that Zoom call had the follow-up action, probably, and then it did get through Slack or Teams or whatever you're using, right?

Darin 00:11:19.033 This just seems completely asinine to me, that companies try to exist in this form in 2026. I know it's not gonna change overnight, but we've gotta get better at this. Technology isn't going to dig us out of this hole.

Viktor 00:11:35.977 Let's say that there are certain things where you say, okay, enough time passed for us to gather the evidence that this does not work. If Notion is your place to store information, knowing that there were other variations of a similar concept and whatnot, we have decades of proof that considering such a place the only source of truth does not work. So I just assume that, uh, we can give up on that idea. Whatever the next idea will be, it's not that one.

Darin 00:12:08.614 Well, I think you're leaning towards, or at least the way I'm interpreting it, you're leaning towards, uh, a couple of things I was reading and studying up on, a couple of items from Gartner: through 2026, organizations will abandon 60% of AI projects unsupported by AI-ready data. In other words,

Viktor 00:12:25.662 Yeah,

Darin 00:12:26.103 isn't there. Error.

Viktor 00:12:27.432 I can easily imagine that happening.

Darin 00:12:28.892 Yeah. And they also said somewhere in the ballpark of 30% of generative AI projects will be abandoned at the proof of concept stage, often due to poor data quality. Again, poor data quality isn't an AI problem. That is a forever problem

Darin 00:12:46.802 that's nothing new.

Viktor 00:12:48.458 you can apply the same thing to people. We are 40% less productive because of poor data quality. I can, I can easily imagine that.

Darin 00:12:55.894 Some people say, well, RAG is gonna solve all this. We'll just throw all these documents at RAG and let RAG take care of this. RAG's not the answer. It's a tool, but it's not the answer.

Viktor 00:13:07.382 It could be, but not necessarily RAG in how people imagine it. Usually when people say RAG, they mean, okay, so I tell my agent I need to do this. And then, uh, some RAG process is executed, finds information, then that is done. That's usually what people mean by RAG, right? Augment the context of AI with additional information, and then LLMs do whatever they need to do. That will not work. What could work is that you have some kind of constant process. So imagine that we already did the scanning of everything that exists, we organized it somehow, and then in the future we have some kind of constant process so that whenever information is generated somewhere, it is digested and some central location for that information is updated in an intelligent way. It cannot be just appended. It cannot be just, I insert another row into some database, right? Or create another page, because then you need to figure out, okay, so we have 57 entries of this and only one is valid, uh, and 56 are outdated. But imagine if you had an army of people going through everything that ever happened, every Slack conversation, every Zoom call, every whatever, and updating the documentation in some central place. In theory that could work with people, right? With a large enough number of people. Theoretically we could use RAG for that. So I'm not referring to RAG as in my agent uses RAG to find information, that of course, but RAG that would pull information in near real time and, uh, organize it.
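
The "update in place, don't just add another row" idea Viktor describes can be sketched roughly as below. This is only an illustration under stated assumptions: the Fact and KnowledgeBase names, the topic key, and the sample facts are all hypothetical, and the hard part (deciding which topic a Zoom or Slack snippet belongs to, which is where an LLM would come in) is assumed to have happened already.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Fact:
    topic: str          # hypothetical key, assumed to be assigned upstream
    text: str
    source: str         # where it was found: "handbook", "zoom", "slack", ...
    seen_at: datetime


@dataclass
class KnowledgeBase:
    current: dict = field(default_factory=dict)   # topic -> the one valid Fact
    history: dict = field(default_factory=dict)   # topic -> superseded Facts

    def ingest(self, fact: Fact) -> None:
        """Keep exactly one current entry per topic instead of appending rows."""
        old = self.current.get(fact.topic)
        if old is not None and old.seen_at > fact.seen_at:
            # The incoming fact is older than what we hold: archive it only.
            self.history.setdefault(fact.topic, []).append(fact)
            return
        if old is not None:
            # The incoming fact supersedes the current one: archive the old one.
            self.history.setdefault(fact.topic, []).append(old)
        self.current[fact.topic] = fact

    def lookup(self, topic: str):
        """Return the single current Fact for a topic, or None."""
        return self.current.get(topic)


# Made-up data following the toilet-cleaning example from the conversation.
kb = KnowledgeBase()
kb.ingest(Fact("toilet-cleaning", "Use chemical A", "handbook", datetime(2024, 1, 10)))
kb.ingest(Fact("toilet-cleaning", "Chemical A is banned, use B", "zoom", datetime(2026, 2, 3)))
```

With this shape, a lookup returns the Zoom-call version, and the handbook version stays around only as history rather than as one of 57 competing entries.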

Darin 00:14:56.179 Well, that's what the promise of RAG is. But again, garbage in; in the case of RAG, do you even get garbage out? It's gonna be a different type of garbage.

Viktor 00:15:09.366 Yeah. But, uh, I'm talking about garbage in, mostly garbage but not fully garbage, and garbage discarded, so that we are left with something that is meaningful, and then put in a box where it belongs. And then a second RAG that, when it needs that information, goes to that box and fetches it.

Darin 00:15:33.934 Yeah, but there are no vendors selling that solution today, Viktor. So

Viktor 00:15:37.058 Oh no, I'm just imagining the future, man. I'm not talking about the present. You need to specify the timeline when we discuss something. I can just see that that's feasible. We already have information stored in different places. We know how to digest, we know how to convert, uh, audio to text, for example. We know that LLMs can, given good instructions, extract meaning from mostly meaningless things. And we know how to update stuff once we figure out what is meaningful. I can say, for example, if you look at transcripts from Slack, they were horrible actually in the past, but now if you look at them you get that summary, and the summary is actually meaningful most of the time. So you had a one-hour conversation, 55 minutes of that conversation is useless, but it's pretty capable right now of creating a meaningful summary of the things that mattered from a conversation. Right. At least that's my experience right now.

Darin 00:16:41.922 I agree with you. It compared to where it used to be, to where it is today,

Darin 00:16:46.602 completely different.

Viktor 00:16:47.949 Not perfect, I'm not saying that, but, um, yeah. Better than Joe.

Darin 00:16:54.195 Correct, and especially if you look at that summary as soon as you receive it,

Viktor 00:17:01.634 Yes.

Darin 00:17:02.060 because while it's fresh in your brain, it's like, yeah, or wait, wait, wait. That's not right. Because if you wait too long, you'll forget what the meeting was and then you'll read it and like, oh yeah, that's what it was. And it's like, no, that's not really what we decided.

Viktor 00:17:14.690 The missing piece of that. So like, um, summary of a Zoom conversation that's happening and that's decent. The missing piece I feel that we should work on is how they, they get that tiny part of relevant context from that conversation. And I'm just using this example and put it within the much larger context of the whole organization. That's the missing piece, right? So that it, lives there, where it, where it should be.

Darin 00:17:46.380 Which sort of leads me into revisiting the gaps we talked about before. We'll start with the first one. What's documented versus what is true? When we have things documented, usually that's documented as the best case scenario, or perfect, versus how we're actually doing it. Would you agree with that?

Viktor 00:18:05.762 Yes.

Darin 00:18:07.366 Because we wrote it, we follow it. And then over time, again, going back to that documentation decay concept, oh, we actually moved it to a different server. It's a different IP. We forgot to update the IP in this, or whatever the case may be. We changed where the container registry lives, so now we forgot to update that endpoint. So things that appear innocuous at the time, and they are, as those little things keep stacking up and stacking up, eventually that document has gotten to a point of, wait, how did all these things actually get to this point?

Viktor 00:18:43.439 I think that brings me to the question of what is that location. Once we discard garbage, which we can do today already, and we come to something meaningful, where should that live? And you, almost without thinking, probably, provided the answer: that's the system itself. Now, I'm talking about software engineering, not all the use cases. If you go back to your toilet cleaning example, if that information about how to clean the toilet and whatnot is put right above it, and the next time you find out that you should do it a different way, you're very likely to erase it and just write down a different one on that note. Right?

Darin 00:19:37.135 Oh yeah, for sure.

Viktor 00:19:38.347 So you're more likely to do it than if it's on some piece of paper in some drawer, right? You're more likely to see it the next time you clean the toilet as well. That note right above the toilet results in it more likely being up to date and more likely being found. Now, what's the equivalent of that note above the toilet in what we do, right? You mentioned IPs. When you change the IPs, I dunno, you updated your Ingress configuration. Now, if you put the note there, which is equivalent to putting a note inside your code, that's more likely to stay up to date, because you're using it, and it's more likely to be found, because it's right in front of you.

Darin 00:20:23.308 Let me repeat back to you what I think I heard. New IP address, we put it in the Ingress config, not necessarily in the documentation, because if we were, every time we did an Ingress change, we ran a script passing in the values so that each time that Ingress file was basically ephemeral, we generated it every time, which would be silly, right. But playing out this scenario, I would be looking at a document and typing things in. Well, I could mistype, I could do whatever it is,

Darin 00:20:52.048 just letting, if something can be a file, let it be a file period.

Viktor 00:20:56.617 Correct. In the right location. And here comes the important thing. Don't put a comment in the Ingress saying what the new IP is. That's pointless. What is not pointless is why we are changing the IP. Because the IP, I see the IP. I don't need you to document what the IP is. Yesterday it was this one, and now when I look, it's something else; I don't need that information. Just like I, in theory, don't need information about what the code does. If I read the code and I have infinite cognitive capacity, I can understand what the code does. What I don't understand is why we are doing it in the first place. So the why, not the what, is missing. What we have in software, not necessarily in art, in software, we have the why, uh, sorry, the what, all the time. Check any part of the system and you will understand what is going on. How many replicas do we have? Five. How do I know? Well, kubectl get pods. It's easy, right? How many dashboards do we have in Grafana? Well, let me go and check. What do those dashboards do? Well, let me check the code of the dashboard. Ah, it's taking the information from here and here and then combining it this way. That's what it does. I know the what. The thing that I'm missing is motivation, is more ephemeral information, precise information.

Darin 00:22:35.642 What you're bringing up right now though is what is true versus what people actually do. Going through your stack: kubectl get pods, looking at the definitions of the Grafana dashboards. These are things that in theory are documented. In theory, it's true, but nobody had to come to you to say, oh, actually these are the things, because you hadn't fed that information back into a canon of documentation. Right, because it's coming outta your brain. Or maybe kubectl get pods was great until we got rid of kubectl and started using some other script in order to take care of getting our contexts right.

Viktor 00:23:15.337 Yeah,

Darin 00:23:15.732 And it's because of you that we knew those things, but you're not in line with what the documentation says. You know, what are we gonna do about that?

Viktor 00:23:24.753 Rely on the system more than we do.

Darin 00:23:27.459 Rely on the system more than we do.

Viktor 00:23:29.613 Yeah, very often people use terms like single source of truth. That's often misunderstood, in that we think there is only one source of truth. No, no, no. There are many sources of truth. Here's a great example that illustrates what I'm trying to say. When people talk about GitOps they very often say, Git is the source of truth, and I always correct them. Git is not the source of truth, not by a long shot. The source of truth is the VM that ended up being created because you pushed something to Git. What you have in Git, you can call it desired state, or even better, those are the replicas of the sources of truth, or copies of the sources of truth, or whatnot. But the source of truth, if you ask me, in the GitOps world, what is the source of truth? Well, your Kubernetes cluster is a source of truth. It's not Git, because there might be differences. You might have five pods in a cluster and your Git might say there should be four. Now, what is the actual state? Well, it's five pods. Your fantasy is to have four. Four is not the source of truth in that case. Git is not the source of truth. It's your cluster. Let's say, on a note or, yeah, whatever: we should have five EC2 instances in AWS. Is that the source of truth? Absolutely not. What is the source of truth? Well, how many EC2 instances do we have? Maybe we have zero because there is, uh, downtime. Maybe everything crashed. Still, the source of truth is that we have, right now, zero EC2 instances. That's the information you work with. Now, what is missing is that why. Oh, we wanted five because, uh, you know, this or that, whatever it is. Cool. That's the information we are missing.
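
The distinction between desired state (Git) and actual state (the running system) can be made concrete with a small comparison. This is a sketch only: the drift function and the dictionary keys are hypothetical, and the numbers are the ones from the conversation (four pods desired versus five running, five EC2 instances desired versus zero running). In practice the two dictionaries would be filled from the repository and from the cluster or cloud API.

```python
def drift(desired: dict, actual: dict) -> dict:
    """Return {name: (desired, actual)} for every resource where the two differ."""
    names = set(desired) | set(actual)
    return {
        name: (desired.get(name, 0), actual.get(name, 0))
        for name in sorted(names)
        if desired.get(name, 0) != actual.get(name, 0)
    }


# Desired state: what Git says should exist.
desired_from_git = {"pods": 4, "ec2_instances": 5}

# Actual state: what exists right now. This is the side Viktor calls the truth.
actual_in_system = {"pods": 5, "ec2_instances": 0}

report = drift(desired_from_git, actual_in_system)
# report == {"ec2_instances": (5, 0), "pods": (4, 5)}
```

Neither dictionary says why five instances were wanted in the first place, which is the missing piece the conversation keeps returning to.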

Darin 00:25:35.496 Well, speaking of missing, we now we take a look at what people are actually doing versus what AI can consume.

Darin 00:25:44.526 We can feed AI PDFs, slides, code; all that's fine. What it doesn't have is the overall context of each of those items

Viktor 00:25:58.051 Yes and no.

Darin 00:26:01.852 well, and how those items fit into the overall, I keep using this word into the overall corpus of the business because. Generalized AI is not going to understand all of your company's slogans or acronyms

Viktor 00:26:18.631 Yes.

Darin 00:26:19.332 out of the box, right? You can train it, blah, blah, whatever, but you know, just because you understand it and what you are doing doesn't mean that it's going to cleanly translate into consumption, into RAG, AI, whatever you wanna call it at this point.

Viktor 00:26:37.695 imagine this, you are the expert. You know the system you're managing, I dunno, service in your company, right? You, you know them inside out, you know, you know, kind of, you're the expert. You just hire somebody first day at work. What is the, the only reasonable way. That, and you know, everything, just to be clear, you're, you're almighty kind of like you're the biggest expert over there, right? But you just hire somebody first day at work. what do you do with that person? You tell, go and manage our production system. If that's what you say, what is your expectation from that person?

Darin 00:27:16.743 very, very low. Not going

Viktor 00:27:18.042 Okay. I think it's the same with AI. Now, it's not that the person you just hired never heard about SSH and never heard about Kubernetes and never heard about AWS, but that person doesn't even know: are you using AWS or on-prem? You told that person nothing. You said, deploy a new release. That's all you did. That's all the information, and you cannot expect miracles. What you need to do is to provide all the information you have that is relevant for that task to that person. So even if it's not docu... I'm talking about the worst company in the world. Nothing is documented, but there is a person who knows stuff, and that person delegates that work to a new hire. It is that person's responsibility to provide all the information the new hire might need to do the job right. Not to teach the person how to use Kubernetes. That's not the question at all. It's kind of, oh, are we using Kubernetes here? Maybe ECS? I dunno. I know those technologies, no problem, but you need to tell me, man: how do I connect, what do I do? You need to give me all the information, and once you give me the information, I'll do the job right. Now, I don't understand why we would expect anything different from AI.

Darin 00:28:43.660 Well, because it's a machine and machines should be able to do everything just by me telling it something.

Viktor 00:28:49.098 Yeah, good luck. Good luck with that. I mean,

Darin 00:28:57.744 isn't, that what we expect?

Viktor 00:28:59.283 uh, now you could be betting on a technology that will read brainwaves and figure it out, but I don't think that that's happening this year.

Darin 00:29:12.744 No, I don't think it's going to happen in 26 at all. No, it might, but you know what's gonna happen? Okay. I'll use you as an example. You're reading brainwaves from you, but yet, what's your first language you think in?

Viktor 00:29:30.925 Me,

Darin 00:29:31.591 Yes.

Viktor 00:29:32.450 in my case, English.

Darin 00:29:34.321 Oh, you think in English? Okay. Because that's the hardest one to deal with. Now

Viktor 00:29:40.180 Yeah, that's a long story,

Darin 00:29:41.671 right, I can get the rest of 'em right. Um.

Viktor 00:29:44.140 Let's say it's Serbian just

Darin 00:29:45.331 Okay.

Viktor 00:29:45.640 of conversation.

Darin 00:29:46.651 Yeah, so let's say you're thinking in Serbian, but now it gets your brainwaves, and it's like, what is this? I don't know what to do with this. We'll have another training problem, which leads me into the next point. We gotta rethink how we're actually going to do this. We actually have to treat our documentation as a product, not as a byproduct.

Viktor 00:30:10.671 Yes.

Darin 00:30:11.606 And if we are expecting AI to help us in our operations, really, if we think about it, that's what we want, whether we're an application developer or a marketing person, whatever. We're being sold a bill of goods, that AI can help us do our job. If it's going to help us do our job successfully, it can help us do our job. It doesn't mean it's gonna be successful, but if we want it to be successful, then we have to train, air quoting, train very hard, the AI to do what we do as we do. I.

Viktor 00:30:48.588 Correct. Ability to train AI? Yes. I think that's the key, if that's where you were going. Just as with a person, we need to train that person. And by training, I don't mean, let me teach you Kubernetes; I mean the intricacies of this organization. Let's play it out with the person here. Let's say that you do have documentation and it's accurate and stuff like that. You still cannot tell that new hire, that's the URL of the Notion in this company, and, uh, what should you do? You should, uh, deploy a new release. And that Notion happens to have a gazillion pages. That wouldn't work either. It cannot be that simple. I say this now as an improvement. Before, I was just saying, deploy a new release. Now I'm getting better at this. I'm saying, deploy a new release, and here's the URL of the Notion that has all the information that ever existed in this company. And it's all accurate because we are better than anybody else in the whole wide world. It still wouldn't work. You need to point the person directly to the relevant parts of your documentation, right? Otherwise that person will spend 75 years just reading through it, or at least scanning through it. So the challenge is what you said before. First of all, how do we have accurate documentation? But once we do have accurate documentation, how do we find it? How do we find the relevant parts of it? And that's a real challenge. Real, real challenge.

Darin 00:32:28.118 And that's, trying to get the documentation to the right size. Going back to what you said, we have this basically one document that documents everything. It's the, it is the, let's put it this way, it is the Bible of the organization,

Darin 00:32:43.148 right? Christian Bible, 66 books one, one big book. But it's like, okay, I got it, but where do I go? And so that's, that's good. At least you have it, but it's at, that's at one end of the continuum. The far other end of the continuum is, I'm gonna stick with a Bible metaphor. You pull out a single verse of a single chapter in a single book. That may be great, but that's not enough context to understand the overall story or you're trying to figure out one thing. So we've gotta be somewhere in the middle, like we need enough context, but not just bunches of little chunks all over the place. It's got to be a full story, whatever that story, like your, your case, deploying an application. Okay. What does deploying an application mean? Maybe that's a big umbrella of deploy application. But then underneath that we have deploy Kubernetes based application, deploy a VM based application, deploy a single page application.

Darin 00:33:44.723 Those are okay. I mean, it's okay having that big bubble; at least that gets you sliced down to one place, and then you start slicing the other ones, which are pretty easy steps. And then you can go full. Like, to me, a full context of deploy an SPA is good enough for me, because I'm betting that the platform engineering team already has the playbook for how to deploy an SPA 99% of the time, and if not, we would create a new one based on that.

Viktor 00:34:11.559 You know, going back to my early statement where you kind of rolled your eyes, I said, we have the technology to do those things. We just haven't done it. Here's an example. Google. Information is all over the place. The whole internet is information. It can be in HTML, it can be in PDF, it can be in videos, kind of like in different formats everywhere, in an infinite number of formats. And still, when you search for something, I'm not saying it's perfect, but most of the time, if you skip through the ads, you are gonna get to the correct one. Maybe not the first result, maybe the second. So we know how to do that. We are just not doing it.

Darin 00:34:57.806 Well, it's just too hard. I mean, I don't have 20,000 search engineers at my company. Heck, I don't even have two

Viktor 00:35:04.183 Yeah, that's true. I have a solution, then, for all company problems. It might not be acceptable, but I'm going to say it anyway: just make everything in your company public, and then you'll find it through Google.

Darin 00:35:18.790 You laugh, but, uh, sometimes you can't have a good search on your documentation. Look at open source projects. How many of them actually have onboard search? A handful. Not many. A handful. The ones that don't, what do you do? You do site colon, the docs domain, and search for the thing.

Viktor 00:35:44.604 How often do you use Notion?

Darin 00:35:47.262 Not a lot.

Viktor 00:35:49.686 Actually, it's, it's gotten pretty good with those things. Like if you just dump random stuff like everybody does in Notion, if you ask Notion AI directly or through, uh, MCP, most of the time it actually does manage to find relevant information for your query

Darin 00:36:08.064 So it's RAG-ing pretty reasonably is what I'm hearing.

Viktor 00:36:11.418 it's getting pretty good.

Darin 00:36:13.119 But here's the thing, I'll, I'll play that out. I'm the only one feeding it data, or maybe it's just three to five of us. Right. That's probably your use case, right? You probably have just a couple of people working in it,

Viktor 00:36:25.308 Not a big organization. Yes.

Darin 00:36:27.084 versus if you had a hundred or a thousand people, do you think it would be still as good?

Viktor 00:36:31.956 I think we are getting there. I don't think that's the real limitation. Notion would be very happy for you to have a gazillion pages and charge you accordingly, right? Now of course, bigger quantities are always more problematic, but we are there. The problem is mostly that the information is not relevant there for all outside types of queries, and this will be going back to the system, right? Kind of, you just need to understand that from anything but the system, you'll get only motivations of something, not instructions. So it'll find, very likely, relevant instructions that you're looking for. It's just that those instructions are not going to be accurate, not because it found the wrong ones, but because the information you have there is misplaced and you looked in the wrong place. You should have looked into the system. And when I say the system, I don't mean only clusters, I don't mean only AWS, I mean Git as well, and so on and so forth. We have real trouble actually looking at multiple sources of information. I feel that might be the problem, right? Kind of, oh, there is no wiki page, we are so terrible. Yeah, but the whole Git history, man. The whole Git history is there and you don't need to go through it. You just need to go from the top down until you find it.

Darin 00:38:00.575 That's great when it's in a system. But again, one of the things we have to fix is capturing this tribal knowledge. And we need to capture it systemically, or systematically, I guess is the better way to say it, right? Bob knows this, Doug knows this, Susan knows this. We've gotta capture that information in some way, shape or form. Whether it's actually having them write it down, probably a very low chance of that happening, sitting down and recording a Zoom call like what we were talking about earlier, or doing video, if it's something that they're actually physically having to do: how do you rack and stack in this data center, because there's something special about it.

Darin 00:38:40.205 We have to start capturing this information, 'cause that's not gonna be tracked in Git or anywhere else.

Viktor 00:38:46.540 Let's start first with that. This reinforces the need for concepts like infrastructure as code, right? If you haven't pressed the button in the AWS console to create an EC2 instance, but you created something with Crossplane, Terraform, Ansible, whatever you're using, you have it now, all of a sudden, both in AWS and in Git. You wrote some comments over there or updated comments, because while you were writing that Ansible, if you were updating that Ansible, you saw the comments. You're not a bad person. You saw the comments, you updated them, and off you go. I know that you're probably not updating comments because we are all lazy, but then we are slowly going to the vibe coding situation. One thing I noticed when I work with, in my case, Claude Code is that it updates the comments all the time without me saying anything. It really likes doing that. The comments of the code that I work on in a project through Claude Code are, I would say, always up to date. There might be a bit, maybe, too much, but that's a different type of problem. It's doing what I was always too lazy to do. Kind of like I never update comments. I'm too lazy to do that. I bet you didn't update, uh, comments of every function you modified.

Darin 00:40:09.865 no, never. Why would I? It

Viktor 00:40:14.449 Exactly.

Darin 00:40:14.935 thing, but now it's doing it a little bit differently. But I should have documented that.

Viktor 00:40:20.076 What I'm really trying to say is that, and this is obviously a technical challenge, there are things to be solved, but, you know, uh, whatever was the article you were referring to, I think that the real problem is not that the information is not there. I don't think that the problem is that the information is not accurate. The real problem is that, uh, we need to become better with multiple sources of information and distinguishing what is more likely to be accurate than others. And then making some conclusions, like going back to my example: hey, if you have information both in Git and AWS about this EC2 instance, well, which one is more accurate?

Darin 00:41:05.361 It's gonna be EC2 every day. The actual

Viktor 00:41:07.505 Exactly, exactly. And if you wrote labels in that EC2 instance, you will have the "why are we doing it in the first place?" So it's up to you to do it, or again, vibe code it and then you will have it.
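
A minimal sketch of the comparison discussed here, assuming both states have already been fetched into plain dictionaries. The field names, the sample values, and the `find_drift` helper are all hypothetical and do not correspond to a real AWS or Git API. The only point is that when the two sources disagree, the running system is treated as more accurate, and its labels carry the "why".

```python
def find_drift(desired: dict, actual: dict) -> dict:
    """Return {field: (desired_value, actual_value)} for every field that differs."""
    drift = {}
    for key in sorted(set(desired) | set(actual)):
        if desired.get(key) != actual.get(key):
            drift[key] = (desired.get(key), actual.get(key))
    return drift


# Hypothetical data: what Git declares versus what is actually running.
desired_in_git = {
    "instance_type": "t3.small",
    "ami": "ami-111",
    "labels": {"why": "batch jobs"},
}
actual_in_aws = {
    "instance_type": "t3.large",
    "ami": "ami-111",
    "labels": {"why": "batch jobs, resized after OOM"},
}

for field, (wanted, running) in find_drift(desired_in_git, actual_in_aws).items():
    # When the sources disagree, the live system is taken as the more accurate one.
    print(f"{field}: Git says {wanted!r}, system says {running!r} -> trust the system")
```

Fields that match in both places (here `ami`) are not reported, so the output lists only the places where the documentation in Git has fallen behind the system.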

Darin 00:41:23.311 Which leads me to the next point. We've gotta build this documentation process, whatever it is, for maintenance and not just for creation. Not just capturing it

Darin 00:41:34.861 because, let me play this out for a second. Uh, let's go back 10 years. The person in charge of documentation at that point said, let's put everything in Confluence. Great, no problem. Person leaves the company. Now the person that's in charge of that area has said, well, let's put everything in Google Docs. Great. But now we've got two quote unquote sources of truth. Don't yell at me, but we've got that so far. Now that person leaves the company and we've decided to put a center of excellence in, because that's the right thing to do. So now we're gonna do everything as a Jekyll-based website. Now remember, Confluence is still there. A lot of it is aged out. Some of it is still true. We still have things that are in Google Drive, which may be true or not true, and we've got Jekyll, which is more true than not, but there's still some not true things there. And that center of excellence gets torn down and now it's just a guy trying to deal with the whole mess because the company's gotten rid of all the people because AI is gonna solve the problems.

Viktor 00:42:35.180 Okay. What's the problem there?

Darin 00:42:37.502 the problem is, is we built for creation and not for maintenance. It's like whatever, whatever it was, the new hot thing or whatever, somebody sold them or whatever, the, I, I've used this for the past 10 years. That's what we're gonna do here without looking at, okay, how are we actually gonna maintain what we have? And then how do we add in things so that they are maintainable? This goes back to your conversation of where Claude Code is maintaining the comments as it goes through most of the time for you. it's already internalized. At least we're assuming it is internalized that comments are important.

Darin 00:43:11.572 understanding how things work, so therefore it is built in as maintenance, not just creation.

Viktor 00:43:18.685 I feel that the problem might be slightly different. So let's say, what is the motivation that made you decide to switch from Confluence to, I dunno, whatever was the second option, Google Docs or whatever you said? Right. And I bet in most cases, simply because you were never happy with it in the first place. Somebody built that Confluence thing and nobody's using it. If people were using Confluence and they were finding the information they need, there is a relatively small chance that you would have chosen to switch to something else. What we often do is that we switch to something else as a response, to solve a problem we created earlier by creating something that doesn't work or something that nobody wants. More often than not, nobody wants that. If Confluence was busy, I bet it'll stay. You know what? Even if you made the decision, let's say that Confluence was busy, people were getting real value from it, they were really using it, and then you make a decision to switch to Google Docs, what would happen? Well, nothing would happen. Your Google Docs would be free because people will stay where they're comfortable and where they get value from. The only reason why they switched to Google Docs is because Confluence was bad in the first place.

Darin 00:44:46.396 Right. The goal for all this knowledge that we have that we want to capture is we want data that is fresh. But anytime we create it understands the context, and it gives me my answers instantly that I'm looking for

Viktor 00:44:57.981 Data that is used

Darin 00:44:59.652 data that's used.

Viktor 00:45:01.281 And that's, I feel, the biggest problem: simply, we document stuff and we are the only consumers of that something, or not even us. That's why you chose to switch from Confluence to something else. You'd think that somewhere else, people would actually use it.

Darin 00:45:17.808 So how do we solve for this? I've got a couple of ideas I'm gonna throw out to you. This is mainly for the practitioners, me and you Viktor, right? These, this is what I'm thinking. we need to audit before we build out anything. It's like, what do I have to actually feed into this? Am I gonna spend more time cleaning up the data I have versus just potentially writing something fresh? It's one thing to think about,

Viktor 00:45:40.167 I know, thumbs up and thumbs down.

Darin 00:45:42.678 Okay.

Viktor 00:45:44.037 Kind of like

Darin 00:45:45.363 Well,

Viktor 00:45:45.542 if, if it,

Darin 00:45:46.263 and a thumbs down. That's the problem, right? We've got,

Viktor 00:45:48.957 kind of like, uh, if, if there, are no thumbs up,

Darin 00:45:53.193 yeah,

Viktor 00:45:53.757 nobody's using it or nobody finds this useful, delete it.

Darin 00:45:57.333 yeah, that's true. If people would actually remember to click those, some people are averse to actually being positive or negative because

Viktor 00:46:06.357 Oh. Statistically, you know, if, uh, not everybody puts thumbs down, but, uh, if enough people use it, there will be at least one, I guarantee, but at least a hundred need to go to find it useful to get one.
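
The thumbs up rule can be sketched in a few lines. This is only an illustration: the `PageStats` record, the sample pages, and the `min_views` threshold of 100 (borrowed loosely from the "at least a hundred" remark) are assumptions, and the "not enough data" bucket is a softening added here rather than something stated in the conversation.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    title: str
    views: int
    thumbs_up: int


def review(page: PageStats, min_views: int = 100) -> str:
    """Classify a page as 'keep', 'delete', or 'not enough data'."""
    if page.thumbs_up > 0:
        return "keep"             # at least one reader found it useful
    if page.views >= min_views:
        return "delete"           # plenty of readers, nobody found it useful
    return "not enough data"      # too few readers to expect a thumbs up yet


# Hypothetical pages for illustration only.
pages = [
    PageStats("Deploy a SPA", views=450, thumbs_up=3),
    PageStats("Old VPN setup", views=220, thumbs_up=0),
    PageStats("New hire checklist", views=12, thumbs_up=0),
]

for page in pages:
    print(f"{page.title}: {review(page)}")
```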

Darin 00:46:21.657 Here's another one for you. Instrument your documentation. We're used to instrumenting our applications, but what about instrumenting the documentation? 'cause that will tell us what people actually search for and can't find.

Viktor 00:46:32.610 Oh, that's good. It's like tracing for docs.

Darin 00:46:36.297 Tracing for docs. These gaps are gonna tell us where the problems are
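
A minimal sketch of what "tracing for docs" could look like, assuming the documentation search already logs each query together with the number of results it returned. The log format, the sample queries, and the `search_gaps` helper are hypothetical. The idea is simply to count the searches that found nothing, since those are the gaps.

```python
from collections import Counter


def search_gaps(search_log, top=3):
    """Return the most frequent queries that produced zero results."""
    misses = Counter(
        query.strip().lower()
        for query, result_count in search_log
        if result_count == 0
    )
    return misses.most_common(top)


# Hypothetical search log: (query, number of results returned).
log = [
    ("rotate db password", 0),
    ("deploy spa", 4),
    ("Rotate DB password", 0),
    ("vpn setup", 0),
]

for query, count in search_gaps(log):
    print(f"{count} failed searches for: {query}")
```

Queries are normalized to lowercase before counting so that the same question asked with different capitalization shows up as one gap rather than several.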

Viktor 00:46:40.143 Yeah, I like that. Oh, that's, that's, brilliant. That, that's, we should make a, have a startup based on that.

Darin 00:46:47.709 Isn't that what Confluence is anyway, or what it's supposed to be?

Viktor 00:46:51.123 probably do it. They probably have the statistics. Yeah.

Darin 00:46:55.089 Uh, we wanna start in one place. Don't boil the ocean, like pick something small, whatever that is. That's pretty normal anyway. We need to design for retrieval, not just for storage. If I have three petabytes of data and I can't get to it, what good is it? If I have three megabytes and I can't get to it, what good is it? Right? It's the same problem. We gotta get it back out. Just like everything else that we talk about in our space, we need feedback loops. If we don't have feedback loops, again, this goes back to the instrumentation. If we don't have feedback loops, how do we know if it's any good or not? So start small, grow from there. A couple of ideas beyond this. This is not a technology problem. This is a people and process problem. At the end of the day, this whole AI RAG is just the next evolution of search.

Viktor 00:47:45.623 Wait. If it's a people problem, then should we just get rid of people?

Darin 00:47:51.584 Yes. That's the purpose of AI.

Viktor 00:47:54.203 Yeah. Okay. So.

Darin 00:47:57.224 If it were that easy, 'cause we still, we still need people to get the AI trained and then we can let, let go of the people that that's how it'll work out. here's another thing. We're not gonna be able to buy our way out of our documentation problems When a vendor comes along and says, Hey, just let us have access to all your drives and we'll be able to get everything indexed and wonderful. You'll be able to ask anything. It'll only take us like, 30 years. okay, sure. Maybe because. Realistically, but you always have data moving. That's what we were talking about earlier. We want this data to be as fresh as possible because if it's outta date, what good is it? This goes back to basic models, right? You ask the model, what date is it? And it says September, 2023. Thinking back a couple of years ago, it's like it doesn't know anything beyond that.

Viktor 00:48:43.755 Now you're touching on something you said earlier, uh, when you mentioned the word maintenance, right? Most of those projects are projects with an end date. That's why we get those problems, right? Oh, let's index everything. Cool. And the next day the index is already out of date.

Darin 00:49:00.799 that was true even for the Google indexes, right? If, if they stop, if they stop the spiders,

Viktor 00:49:07.018 exactly.

Darin 00:49:08.089 it's done.

Viktor 00:49:09.238 Exactly. So you need to have a dedicated indexer in your company.

Darin 00:49:15.186 Okay, we're poking at that, but I think the companies that treat this as a first class citizen and as a strategic advantage and not a chore are gonna actually win the game

Darin 00:49:28.238 because how much easier is it? I'll use this. I guess we could go back to the bathroom, but we won't. Let's go to a car mechanic. I'm needing to replace the timing belt on a 1983 Chevy Malibu. I don't even know if that one exists, but if I know timing belt and '83 and make and model, I should be able to get the data. I know how much it's gonna cost and how long it should take to do the job. That one is not going to change over time, or shouldn't change over time, except maybe get faster and cheaper, or maybe more expensive because the timing belt costs more over time. But the process is gonna stay. But if I were a backyard mechanic and I raise up the hood and take a look: oh, you really need a timing belt, I can get to it in a couple of weeks. Uh, I'll charge you like $800. You don't know if it's good or bad. You just know the guy's in the backyard and you should be able to get a better price because he's in his backyard and doesn't have a nice, full, clean shop.

Viktor 00:50:33.647 Yep.

Darin 00:50:35.928 The guy may be completely correct and the shop may be completely incorrect, right? Just depends. But if we can treat our documentation, I've gone off the rails here, if we treat our documentation as a strategic advantage instead of just something we have to do, it's gonna be better. If you were to treat your corporate bylaws as a strategic advantage. Alright, lemme flip that around. If you're gonna treat your corporate bylaws as just something we have to do versus a strategic advantage. I'll back it up even one further, because this is what happens in real life. If you are coming into a VP, director, or C-level role at an organization, I can guarantee you, you are treating your employment agreement as a strategic advantage and not just a chore. You are setting up your golden handcuffs and also your golden parachute at the same time. You're not taking a boilerplate. We need to treat our documentation the same way, with that same level of fervor, I guess, 'cause either it's going to save us or it's gonna bankrupt us. I'm taking two extremes there. So what do you think? How are we gonna solve this, Viktor? Is there any way to solve it?

Viktor 00:51:51.826 Fire all the people. I already said, gave you

Darin 00:51:54.022 Okay, so what if we can't do that because you know we have a heart.

Viktor 00:52:01.456 Then hire only people who know everything.

Darin 00:52:04.324 But if they know everything, then they're not gonna want to write it down because they will see that as a strategic advantage for themselves.

Viktor 00:52:10.558 You don't need it written down. If they know everything,

Darin 00:52:13.336 What if they die?

Viktor 00:52:15.118 well treat them better so they live longer.

Darin 00:52:23.494 Capture the information. You need to be capturing the information anyway. Capture it correctly, build it out correctly. Don't fall for the scams. That would be another thing I would think. Maybe AI isn't your answer. Maybe you just need a basic search engine, right? That might be your right first step.

Viktor 00:52:41.534 Yeah. And you know, when it comes to AI, and I'm now talking about the user of that documentation, not the one helping you with documentation, uh, I think that in this context it's irrelevant. Kind of like, all those problems are equally valid and important with or without AI. 'Cause I read a lot of articles kind of like, oh, your AI initiative, this and that, will fail because, uh, AI will not be able to do whatever it needs to do the correct way because of your documentation. Your people cannot do it either because of documentation. It's exactly the same problem. Or new hires. New hires, let's say.

Darin 00:53:23.541 Lemme take it one step further. We'll let this be the last question to the audience. If you were to point your RAG process at your existing documentation, would the answer it gives you be better or worse than the hallucination it gives you? If the answer is worse based on your documentation, you got real problems. So what do you think? Head over to the Slack workspace. Go over to the podcast channel. Look for episode number 342, and leave your comments there.