DOP 53: Should You Maintain Your Systems or Let Them Rot on the Vine?

Posted on Wednesday, Apr 29, 2020

Show Notes

#53: Recently, the governor of New Jersey made a plea for COBOL programmers to help maintain the state’s unemployment system. In this episode, we discuss the ramifications of not taking the maintenance of your systems seriously.

Hosts

Darin Pope

Darin Pope is a developer advocate for CloudBees.

Viktor Farcic

Viktor Farcic is a member of the Google Developer Experts and Docker Captains groups, and a published author.

His big passions are DevOps, Containers, Kubernetes, Microservices, Continuous Integration, Delivery and Deployment (CI/CD) and Test-Driven Development (TDD).

He often speaks at community gatherings and conferences (latest can be found here).

He has published The DevOps Toolkit Series, DevOps Paradox and Test-Driven Java Development.

His random thoughts and tutorials can be found in his blog TechnologyConversations.com.

Transcript

Darin Pope 0:00
This is episode number 53 of DevOps Paradox with Darin Pope and Viktor Farcic. I am Darin

Viktor Farcic 0:06
and I am Viktor

Darin 0:07
and in a non-video recording, we're going old school today, with just audio. Today being 6am my time, at the time we're recording. The topic for today: in everything that's been going on these past few weeks, we've seen, especially in the US, I'm not sure about over in Europe, calls from certain places. I'll just go ahead and name it: from the governor of New Jersey. Their unemployment benefits system was written in COBOL. And they didn't have people to maintain it, and they were having lots of problems, and they were asking people to come and volunteer to write and fix whatever COBOL program issues they were having. Now, number one, why COBOL? We'll talk more about that in a minute. My second part to that is: come volunteer? Really? You want somebody that's maybe already retired or unemployed to come work for the unemployment office? Isn't that a little strange?

Viktor 1:29
And most likely that system requires that person to be physically present around the system, right?

Darin 1:37
Probably not remote working friendly. If I had to guess. Yeah, I mean, it's okay. Now there are people that wouldn't mind because they may be retired. It's like, Oh, yeah, I'll go back and just do something for a couple weeks. Right. It's just it's sort of like doing a, you know, a hackathon for a weekend, right.

Viktor 1:56
But let me repeat what you just said, just to see whether I understood. Basically, the government, or whoever it is, doesn't have people who know the system and the language who are not retired. So it's asking retired people, who are actually the people most affected by the current situation and in the most danger, to come and help fix the system and potentially put themselves in danger.

Darin 2:32
There is a chance of that, yes.

Viktor 2:35
Because, you know, if you're 40 or 50 you can get infected, but it's not that likely that it will have long-lasting effects, right? If you're retired, you're in a tough spot.

Darin 2:52
Yeah. Now, I was semi bashing on COBOL. If you're a COBOL programmer, you're actually worth a lot of money. I can't remember if I talked about this on the podcast or if I was just speaking with somebody else. I was at a client, I think it was late last year, and I was talking to this kid. And by kid, I mean he was 24, 25. I was like, so, you know, what are you doing here? He says, well, I work in the mainframe group, I'm a COBOL programmer. What? Now, he had the general look of a programmer from the 60s or 70s. He already had the long ponytail, except it just wasn't gray. Right? It was still normal colored hair. Why did you do that? He says, I'm set for life for at least the next 40 years.

Viktor 3:54
Exactly,

Darin 3:55
because my company will never move off of mainframe and they will never leave COBOL. He says, if they do, I can learn something else. Because if I've learned COBOL, I can learn anything. So COBOL as a language, I don't have a problem with COBOL as a language. Most of the problems with the whole thing are COBOL on certain pieces of hardware that cannot necessarily scale, because it would cost too much to scale it. That's part of it anyway.

Viktor 4:32
An even bigger problem is that this is a symptom of a different issue. And that issue is that very, very short-term way of thinking: yeah, we just need to fix this, we just need to do that. And then you end up in a situation where, I don't know, 30 years later, the difference between your system, which hasn't been updated or improved for 30 years, and what you would need to do to bring it up to speed today is so big that it doesn't even make sense anymore to adopt any other strategy. You know, if you are on a mainframe, then the only option you have is actually to keep using it, because the gap is so huge that it's close to impossible to move on.

Darin 5:28
Well, I mean, yeah, because you could not bring in your favorite fill-in-the-blank consulting firm and rewrite that in a weekend.

Viktor 5:40
Exactly. Because you don't have enough manpower. You don't have enough knowledge. You don't have enough docs. You don't have enough vendor support. You don't have enough of anything that is required for you to take on such a complicated task, and changing a system drastically is a huge task, unless we're talking about some tiny, tiny, tiny system. I always apply the same logic. To me, that's similar to upgrading, let's say, the operating system of a server, right? If you jump one minor version once a year or twice a year, that's not so complicated. If you have an operating system that hasn't been upgraded for 10 years, then it's very tough, simply because making big jumps is complicated. But if you don't make any jumps, then the only option you have is a big one.

Darin 6:33
Right. So let's talk about a couple of other things that I've seen that are sort of in the same vein. At the time we're recording, today is Friday, April 17. This is going to be releasing on April 29. So by the time you're listening to this, this will be old news. But in the States this week, there were stimulus checks sent out. And the first big wave happened on April 15. Funny enough, that is typically tax day in the United States, which has been delayed this year. But people started checking their bank sites to see if the stimulus check was in their account. Well, what started happening around midday US time, wherever midday was, probably, you know, between 11am and 1pm Eastern, as people on the West Coast were waking up and finding out and checking: bank sites were going down left and right, because they could not handle this enormous influx of traffic for what should be a simple read-only call.

Viktor 7:50
Yep.

Darin 7:51
Right. I'm being opinionated here. But people were saying, hey, what are my most recent five deposits? Right, that's effectively what they're looking for: what's my most recent deposit?

Viktor 8:08
Exactly

Darin 8:10
Or last 10 transactions or something, whatever the case may be.

Viktor 8:14
You know, I can understand that in that situation banks didn't already have everything ready, kind of like, oh, we are ready for any disaster that can happen, it's already scaled up or whatever is needed. That would be unrealistic. But what is, again, a symptom of something very bad is that they're not designed so that they can adapt to change when it comes. If I understood right, and correct me if I'm wrong, I think I've been hearing about stimulus checks coming for weeks now. So basically, we had weeks, let's say a month, between knowing what will happen and that something happening, right? Now the real question is whether a month is enough to adapt to something or not. And for many companies, it's enough. For many companies, a week or a month is plenty to adapt to change.

Darin 9:17
I would say, for the retail financial sector in the States, that answer is no.

Viktor 9:24
Yes

Darin 9:25
It's three to six months, at best. Because they have to get everybody around the table, and get their managers of their managers of their managers around the table, just so they can decide when they're going to have their meetings to start discussing what they're going to do.

Viktor 9:44
Yeah. Probably after that month, they're still discussing what the date will be for the meeting to think about it.

Darin 9:54
Right. Now, what would be interesting, because I have worked with some smaller regional banks. Not city- or county-size banks, right, you know, the really super local, hyper-local banks. But I've worked with some regionals that aren't like the majors or the mid-majors. And I haven't checked to see how they fared. Because when I've been to their places, their corporate headquarters where all the developers are, they're maybe 100,000 square feet, maybe. I mean, not a huge facility, if you think about a corporate headquarters where everybody is working. So to me, it would seem like there might have been some that would be like, oh, this is coming, we might ought to consider that we're gonna have a lot of people checking on this day, right? And do we feel like we're okay? At least you would have made a measured calculation. But maybe not, I don't know.

Viktor 10:56
I mean, logically speaking, I can only assume that the bigger you are, the more levels of decision makers and management you have, right? Because the major problem in those cases is usually how long it takes you to make a decision. That's usually the most painful part. And when I say decision, I don't mean only the decision that we're going to be able to handle that, but also the technical decision about how you're going to do that, all the different types of decisions before, actually, John, I'm inventing a name, sits down and does it.

Darin 11:35
Right. It might only take John five minutes to do it. It might just be a simple thing. But how many man months or years have occurred before John sits down and does it in five minutes?

Viktor 11:47
Exactly. Or for all we know, maybe John in that fictional bank actually already solved the problem, but now it's waiting on 57 approvals to get to production. Maybe there is a fix waiting.

Darin 12:02
right, and has been waiting.

Viktor 12:04
Yeah.

Darin 12:05
But it got out of queue because product owners were fighting to get their other features out first.

Viktor 12:11
Yes.

Darin 12:13
You never do that as a product owner, right? Um

Viktor 12:18
Let's not go there.

Darin 12:18
I might edit that out.

Viktor 12:23
It's okay.

Darin 12:25
I'm going to move on to a different one. What is his title? So I saw this tweet from Kieran Healy. I'm sure he's not listening, but I'm sure I butchered his name at the same time. He is a professor of sociology at Duke University in Durham, North Carolina. And he had a tweet that I'll include the link to. The date I don't even see right now, but it was sort of during this time, and he said, quote, If they cannot believe a ton of government HR infrastructure still runs on COBOL, wait until the cool data science kids find out just how much of their machine learning, AI, and model fitting kits sit on top of a bunch of Fortran libraries.

Viktor 13:26
But those are two extremes. And I like that tweet. I'm not sure whether that was intentional, but that tweet actually talks about both extremes: that we have systems that are rotting, and then we wonder why they don't work after 30 years of negligence. And then we also have cool kids that don't understand that the world did not start yesterday, who think that everything is starting today: we're going to design everything from scratch, we have no legacy, no business before today, right? And that's maybe just as dangerous as negligence.

Darin 14:13
Well, I think it's beyond dangerous. I think it's crazy. I can't think of a bigger word. But usually what happens, and I was going to mention this when we were talking about the banks, is that while they're trying to get figured out who's going to do what, there's usually at least one to two, maybe three consulting companies also involved in all these things, and they have their own priorities to make sure their contracts keep getting renewed. And you ask a consulting company, hey, we've got this system, we need to make it better. Most consulting companies, unless it's a long-haired, gray-haired, ponytailed person, are going to say, well, we need to replace it with the cool new hotness, and we need to get rid of this old new fartness. I made that up. Right. They're thinking that just because it's COBOL and mainframe, it's horrible and it needs to be replaced. That may or may not be true.

Viktor 15:18
Yeah. I mean, it really depends. One of the biggest dangers that I think companies are facing is that it's really hard to understand the internal and hidden motives of the companies that you're giving power to manage your business. A consulting company can be like the one you explained, but it can also be a vendor that has a very strong interest in keeping your mainframe, because they're the only provider of the hardware for that mainframe. So you have those competing interests that are pulling you in different directions, and none of them is actually working for your own interests. Unless you really, really know what you're doing, but you cannot know what you're doing because you externalized all that know-how. So you have no idea. You just need to trust those companies, and those companies are giving you opposing advice.

Darin 16:21
Yeah, and that is dangerous because I've seen that pattern happen a lot. It's like, Oh, yeah, we're just gonna outsource that. It's a legacy system to us. We'll just have somebody maintain it, we'll pay them a retainer fee of, you know, a few million dollars a year. And that means we can open up two tickets a year to have words changed on the screens.

Viktor 16:40
Yeah. And when the time comes to do something with that system, let's say improve it one way or another, let's say right now scale it, then you don't have that knowledge, because you decided that it does not matter. And then it suddenly does.

Darin 16:57
I also think what we're talking about here falls into this. Most companies, okay, I'm going to say this as a broad statement. This may not be true everywhere, but as a broad statement, most companies do not know how to quickly triage. Especially in these days, you have to have a first responder mindset. You can't have an upper-A Agile or upper-W Waterfall mindset.

Viktor 17:32
But it's not only the mindset. For many companies, it is impossible to quickly triage because of what we spoke about a few minutes ago. If you have 75 different providers working on different aspects of your system, and you happen to need to triage something that affects many of them, then realistically it can never be fast, because you need to go through all those different levels and companies and all that stuff before you get any feedback. You cannot triage anything fast, you cannot do anything fast, unless you're in control of that something. I'm not advocating that every company should be in a hundred percent control, but that's simply something that needs to be understood. If you're not in control, you cannot be fast.

Darin 18:23
And you have to be okay with that. When you chose to not be in control, you chose to give up flexibility and power. Because say you offshored, and I don't mean offshore necessarily, but you got it outside your four walls, however it physically happened. Your primary customer service system, if you're a bank, was mainframe. We've moved on to the new hotness, this thing called Java, and we've offshored the COBOL and mainframe stuff to somebody else. Now you need something fixed immediately for that. Well, some of the contracts I've seen for people that do that kind of work, they're literally, you know, seven-figure, eight-figure deals where you get to open up two things a year and that's it. And if you want this emergency thing, great, there's a force majeure clause in there too. That means we can charge you whatever we want, at whatever rate, and you have to pay it up front, before we do anything. That's mafia tactics.

Viktor 19:26
Yeah. And that's in case that you don't need to involve lawyers, which you often do.

Darin 19:31
You always have to involve, you always have to involve lawyers. So is, is there hope? Is there a way to solve this problem? Or at least plan for solving this problem?

Viktor 19:49
I mean, there is. I think we know how to solve the problem; it's just that it's very hard to convince companies. It's not necessarily bad to have a provider for part of your system. It's just a matter of which provider you're choosing. I can actually argue that there is no conceptual difference between, let's say, IBM, and I'm just using them as a name, and Amazon, right? They're both providers that are managing part of your system one way or another. The major difference is that one is actually allowing you to be in control and working in the background to enable you to be in that control, while the other is mostly focused on the tell-me-what-you-need-and-I'm-going-to-do-it-for-you type of stuff.

Darin 20:44
By the way, the former was AWS and the latter was IBM, just in case you weren't sure which way it was.

Viktor 20:52
I mean, I don't want to pick on any of those. It can be anybody. But there is a difference between using a provider that is specialised in something and helps you a lot without actually preventing you from having control, and the other one, which is simply: okay, give me 1000 people, no, put 1000 people on this problem and solve it for me. I don't want to know about it.

Darin 21:16
But what you just said there, give me 1000 people, just solve it, I don't care: that makes you an irresponsible business owner or business leader.

Viktor 21:29
I would say that's partly inertia, because I could understand that logic being true, let's say, 20 or 30 years ago, because we didn't know how to do stuff. I mean, the knowledge of how to do stuff well was not so widespread. Whether you had a website in the 80s or 90s was not really mission critical, and so on and so forth. But the needs today are different than they were then, and people are trying to fit those business needs into the models that are carried over from the past.

Darin 22:07
Right. So let's let's use the website example. That was a great use case to hand off because if you didn't have something at all, you're taking this as an experiment, an R&D type thing to see how it works out. And you don't have those skill sets in house at all. That makes sense.

Viktor 22:27
I mean, realistically, why would you actually put your core effort into something that is going to be useful to maybe 3% of your potential customers, which is what websites were in the 80s, right?

Darin 22:46
But what are the analogues to that today? How does that play today? What is the website of the 80s today?

Viktor 22:56
There is no set of... let me think. An office? A bank office would be the equivalent of the website. And when I say office, I don't mean where bankers work but the one I go to. I think that's becoming the equivalent of a website in the 80s. We're just now having the reverse effect.

Darin 23:22
Yeah. It's funny. We keep we keep hammering on the banks but we can hammer on numerous company types.

Viktor 23:31
Okay, even better. Okay. Okay. Banks, I can see people going into an office and I can see the need for that. That is diminishing but still existing. Show me a person. Okay. Find me a person who goes to a travel agency.

Darin 23:48
Very few.

Viktor 23:50
I still see around my house there are like three or four travel agencies. I never saw a person over there except the person working. Or an insurance company. When was the last time you went to some office and said, can you please explain to me the options I have for my insurance?

Darin 24:05
Okay, this is sort of a macabre story. But when I moved back to North Carolina, my father's wife, her sister or cousin or something, her family has an insurance company, and it was like, hey, just call them up. They'll get you set up on stuff. They'll give you a friends and family rate. And it's like, okay, whether they did or not didn't matter. It was easy enough. That was 2007. I have never been to their office. I might have talked to them on the phone once or twice. I never met them until my dad's funeral. It was just one of those things of, you know, I didn't need to, and that was, you know, 16 or, I can't even do the math right now, 11 years. And everything's been great. You know, I know I can reach them if I need to, but I don't need to go to see them. Is there any hope? How do we help people start to think differently? Because we talked about COBOL. We've talked languages, things like that that we like to poke fun at. It's not the languages. It's the technical debt that has accumulated over time. This is where the time factor of technical debt has really hurt people.

Viktor 25:33
Exactly. You can build the best possible system anybody can imagine right now. Give it 10 years, and it's going to be in the same state as whatever you consider legacy today. If you treat it in the same way, of course,

Darin 25:47
Right. I'm gonna go a little on the edge here. Let's say I know I'm introducing some technical debt right now. And I'm fully conscious that I'm doing it. But I also know that in my next sprint or the sprint after that, no later than that, it's going to be removed.

Viktor 26:06
But it's not.

Darin 26:08
But then it's not. But if it is, then you've planned for it. Maybe it's like doing a payment plan. I could pay everything, $300, right now. Or I could pay $125 a month for three months. I'm gonna pay more because I'm having to carry that debt with me for three months, but then in the end, I'm done. But most people just keep paying. And they keep saying, well, 125 is too much, I just need to pay a penny. And that interest is going to start overcoming the principal on it.
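The payment-plan analogy can be sketched in a few lines of Python. The $300 and $125 figures come from the conversation; the 2% monthly interest rate, the function name `months_to_pay_off`, and the ten-year cutoff are assumptions made purely for illustration.

```python
def months_to_pay_off(principal, monthly_payment, monthly_rate, max_months=120):
    """Return how many months it takes to clear a debt, or None if it never clears.

    Interest is added to the balance each month before the payment is applied.
    """
    balance = principal
    for month in range(1, max_months + 1):
        balance = balance * (1 + monthly_rate) - monthly_payment
        if balance <= 0:
            return month
    return None  # the debt is still outstanding after max_months


# Figures from the conversation: $300 up front versus $125 a month for three months.
pay_now = 300
installments = 3 * 125
carrying_cost = installments - pay_now  # extra paid for carrying the debt

# Hypothetical 2% monthly interest, chosen only to show the shape of the problem.
planned = months_to_pay_off(300, 125, 0.02)   # a deliberate plan that ends
token = months_to_pay_off(300, 0.01, 0.02)    # "I just need to pay a penny"

print(carrying_cost, planned, token)
```

Under these assumed numbers, the planned repayment finishes, while the penny-a-month approach never does because the interest added each month is larger than the payment.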

Viktor 26:45
But that might be partly the danger that we are in, because most businesses these days are publicly owned by many people, shareholders and all those things, right? Who by nature, and I would do exactly the same, are looking at quarterly goals, right? And if it's a quarterly goal, then realistically, none of the things that we are speaking about now that should be done can be done, because they don't fit into that timeframe. For example, if you take IKEA, and I'm taking them just as a first example, and I'm not saying in any form whether they're doing things right, they're a huge company that is truly privately owned by two people or something like that, right? And then they can say, okay, look, this might be in my interest to do or not on a three-year timeframe, right? Because it doesn't really matter. I mean, this is like my home, right? When I repair something, I'm doing things for a much longer run than only for this month. It's a similar thing, and I'm now going too far, to the problem that we potentially have in politics. And just to clarify, I don't want to change it. But since in most countries we have four-year mandates for presidents and prime ministers, there is no interest for any of them to do anything that might be beneficial longer than that term. And I'm not advocating dictatorship or anything like that, just to clarify. But I think it's very important, when you look at how a company operates and which type of decisions they make, to look at the timeframe that matters for them, whatever the timeframe is.

Darin 28:33
Because one of the things that you'll typically see, especially when new leadership comes in, is they have their first 100 days. And there's usually two types of people in that. For the first type, the first hundred days means they're holed up in their office, and they're ordering new paintings and putting in new carpet, and that's the first hundred days. The second type, which I hope is most of them, are actually getting out on the front lines and talking to the people. The people that are working for the company, the people that have been there maybe for five months and the people that have been there for 50 years. And understanding what they're getting into. When those people get in, it's like, okay, they're not going to know everything in 100 days. They've basically read the introduction page to The Art of War. Right. That's all that they've done. So how do people get beyond that tech debt? They have to realize that it is truly a debt. And even if they think they're paying no interest on it each month, they are.

Viktor 29:54
Oh, yeah.

Darin 29:56
And it goes back to we'll come back full circle here. New Jersey was looking for people to come in and volunteer their time COBOL programmers to come in donate their time to fix the unemployment system.

Viktor 30:14
But, you know, the real question to me is this. I mean, this will be fixed one way or another. It may take more time, but eventually it will happen one way or another, right? But now, what is the chance that this incident will not be forgotten the moment this is resolved with some workaround?

Darin 30:41
Oh, it will be completely forgotten.

Viktor 30:43
Exactly. So, like three years from now, five, whatever that period is, something else will happen, not necessarily a disaster, but something else will happen that will change the needs of that system, and there will be exactly the same conversation as we're having right now. Exactly the same, and the system will be in exactly the same state. And we will wonder again, how did this happen?

Darin 31:05
And even more COBOL programmers will be dead by then.

Viktor 31:08
Exactly

Darin 31:10
literally

Viktor 31:12
and those that are not will be getting even richer than they're getting right now

Darin 31:17
than they are now. Exactly. Okay. It feels like we're just scratching the surface on this topic.

Viktor 31:26
But it's kind of strange, and I might be wrong because I'm not that involved in the rest of the industries that exist in the world, but I have the impression that other types of industries are less negligent than we are. I mean, if you take, let's say you are in San Fran, right, that bridge over there is, from what I've seen in the short time I spent there, being maintained and kept well all the time. Nobody waits until it starts cracking and, you know, kind of like

Darin 32:06
like Seattle. I saw a note this week that in Seattle they are going to close down, let me pull it up real quick as we're doing this semi live, the West Seattle Bridge. It will stay closed from 2020 through 2021. And some people, even their state DOT, Department of Transportation, are saying it may not be fixable and may have to be completely replaced. Right, because it hasn't been maintained.

Viktor 32:42
Yeah. Then replacing also makes sense. You just need to make a decision. The problem that I feel we are having is that we don't make those decisions. We don't make a decision to maintain and we don't make a decision to replace. Both options are okay. It's just not okay not to make either of those two.

Darin 33:06
Yeah. Here's just one quick line, looking at one of the news sites from Seattle. City engineers have been documenting cracks first discovered in 2013. And then it goes on to say some other things. So they've been documenting that there are cracks in the bridge for six to seven years.

Viktor 33:25
Oh, my. Okay. I take back what I said when I said that others are not in such bad shape as we are. So I'm taking it back.

Darin 33:37
Yeah, but again, I would assume this is government-type things. This isn't financial, but it could be financial. So I'll stop right there. If you thought today's episode was interesting, thank you. We appreciate that. I'm glad you thought it was interesting. This is a big deal if you're in a company that is doing these types of things. You either are set for life for a job or you need to run as fast and as far away from there as you can. And if you're listening via Apple Podcasts, or now Google Podcasts: Google Podcasts came out with an app that runs on both Android and iOS. So if you're fed up with Apple Podcasts, now you can install Google Podcasts. There are others too, but that's going to become another big one. Regardless, if you're listening, go ahead and subscribe and leave a rating and review. All of our contact information, including Twitter and LinkedIn, can be found at https://www.devopsparadox.com/contact. And if you'd like to be notified by email anytime a new episode is released, you can sign up at https://www.devopsparadox.com. The signup form is at the top of every page. And there are links to the Slack workspace, the Voxer account and how to leave a review in the description of this episode. Also, we try to do at least one live stream a week outside of this. That's sort of an ask-me-anything, except for CI/CD, on whatever's going on, so we will talk about bridges and COBOL and everything else. But the only way you're going to find out about that is to join the Slack workspace. That's where we post the notifications, or you can follow Viktor or myself on Twitter, and we'll usually post when those are. They've typically been happening on Friday afternoons Eastern time, which would be evening Europe time. We may end up switching those up some to give Europe more midday, but who knows. That also means I'd have to get up at 4am to do these things. I don't know if I'm that dedicated yet. That means I would have to get up and shower before then because I have to be at least semi presentable. Or not. I don't know. Any final things about the world of technical debt?

Viktor 36:01
It just keeps increasing like any other debt.

Darin 36:07
And at some point, that debt will overtake you.

Viktor 36:10
Or you're gonna declare bankruptcy right?

Darin 36:14
And then start over creating more debt?

Viktor 36:16
Exactly.

Darin 36:17
I hope not.

Viktor 36:19
It's kind of cool. It's so much easier to start accumulating debt from nothing.

Darin 36:26
Yes, because then you have what you think is an unlimited ceiling.

Viktor 36:30
Exactly.

Darin 36:33
Okay. Thanks again for listening to episode number 53 of DevOps Paradox.