DOP 95: Should Everything Be Automated?

Transcript

Viktor: [00:00:00]
I can guarantee you that the script that is configured to log something as breadcrumbs is going to log something always, except if there is an outage or something like that. I can guarantee you that there is a much higher chance that a human will forget to log. It's a fact. We don't always repeat exactly the same steps.

Darin:
This is DevOps Paradox episode number 95. Should Everything Be Automated?

Darin:
Welcome to DevOps Paradox. This is a podcast about random stuff in which we, Darin and Viktor, pretend we know what we're talking about. Most of the time, we mask our ignorance by putting the word DevOps everywhere we can, and mix it with random buzzwords like Kubernetes, serverless, CI/CD, team productivity, islands of happiness, and other fancy expressions that make it sound like we know what we're doing. Occasionally, we invite guests who do know something, but we do not do that often, since they might make us look incompetent. The truth is out there, and there is no way we are going to find it. PS: it's Darin reading this text and feeling embarrassed that Viktor made me do it. Here are your hosts, Darin Pope and Viktor Farcic.

Darin: [00:01:19]
Last week, we were talking about text versus video and how you might be learning and how much we love working on producing books through Leanpub, because that's all automated. That's great. It's easy to do. It's all through Git. We love consuming video, but we hate the production side of it because we have to sit in front of Final Cut Pro or whatever tool we're using that day. But let's take it into our day-to-day life. Maybe not our day-to-day life, but maybe your day-to-day life, and the question we want to pose is: should everything be automated?

Viktor: [00:01:56]
No.

Darin: [00:01:57]
Great. Thanks for listening. Have a great rest of your

Viktor: [00:01:59]
you. Thank you. Bye-bye.

Darin: [00:02:02]
Should everything be automated? No, it shouldn't be. Should everything be under source control? Hm, no. That one's probably more fighting words than the other one.

Viktor: [00:02:15]
So it really depends on the stage you're at. Let's say when I start working on something completely new, I don't push it to Git immediately. I sometimes don't push it at all. It ends up going to the trash. I exaggerated a bit. Let's say that I push to Git 99% and then delete 20 afterwards from Git. But if we go back to automation, let's say when I find some tool, platform, something that I see for the first time, and this is now a confession that I probably never made before: I do not start with Terraform.

Darin: [00:02:46]
What? No. Really?

Viktor: [00:02:49]
Exactly.

Darin: [00:02:51]
It's not possible.

Viktor: [00:02:52]
But that's simply the case. After saying what I'm about to say, I'm going to go outside of this room, pour gasoline over myself and then burn. I very often start from the web UI. Like the other day, it's been a couple of years since I worked with Linode and I wanted to see where they are right now and so on and so forth. I spent the first half an hour in the web UI. I deleted everything I did at the end and then I went to the CLI and did the same thing and then I deleted everything I did and then I went to Terraform and that is now stored in Git. Now, I had the luxury to do all that. If I was in a startup, maybe I would be pressured, maybe not even by others, but even by personal pressure, to keep what I created in the web UI, because maybe I needed to turn my attention to something else. So I think that it very often depends on the experience, both with what you're working on right now and with the tooling around it. Like, I cannot blame somebody who is faced with a challenge to do something, let's say create some infrastructure, and who never worked with Terraform. I cannot blame that person for not using Terraform from day one. Let's say that person is familiar with the platform where they're creating stuff. Let's say AWS. If you never used Terraform, I don't expect you to start with Terraform, even if you're experienced with AWS, because you have pressure, you need to do things fast and it's definitely going to take you more time with Terraform than with clickity click or the AWS command line. That's where experience comes in. When you have startups, new projects, you don't want to automate because it takes too much time unless you're already experienced in that specific subject. If I was starting a new project from scratch and we say, Hey, let's use Google Cloud, I would use Terraform from the first second without a doubt, but that's because I'm familiar with Google Cloud and because I'm familiar with Terraform.
If I wasn't familiar with either of the two, I might not go to Terraform from the first moment. Then comes the question if I don't start with Terraform, and Terraform is just an example. It can be anything else. There is a strong chance that what I created would stay, and then the real question comes: am I aware of the debt that I'm generating and am I acknowledging it? If I am, that's not necessarily bad. The bad thing is if I try to ignore what I just did, if I ignore the fact that I created technical debt and then it stays forever.

Darin: [00:05:33]
The important part is realizing the technical debt that you created. If you don't realize that you have created technical debt, that is a larger problem for the company. Because that is one of those cases, as we talked about a couple of episodes ago, where you leave the company and you're the only one that knew that that was set up there and, oh, by the way, that wasn't a direct bill. That was a credit card that was on that account and it was your credit card because you were expensing the charges, and you cancel that. Then the company stops working. That's technical debt. But in automating, we're talking primarily about infrastructure at this point. Let's say we have all of our infrastructure automated. That was a non-negotiable. Let's talk about automating our application management or the SDLC. Let's use the real acronym for that. What is our software development life cycle? Is that what it stands for? I know the acronyms, I don't know the words behind the acronyms.

Viktor: [00:06:31]
Yeah. Software development life cycle. Yes.

Darin: [00:06:35]
Should that be automated?

Viktor: [00:06:36]
Should? Definitely. This time you didn't say always or all like before.

Darin: [00:06:42]
Should the SDLC be automated?

Viktor: [00:06:44]
Yes.

Darin: [00:06:45]
Yes.

Viktor: [00:06:46]
Not always, but most of the time.

Darin: [00:06:50]
So why would you not want to automate your SDLC? Same argument? You're just starting?

Viktor: [00:06:57]
Yeah, you're just starting. I think that applies to almost everything. It's the same thing if we change the subject and, say, go to the very beginning of the SDLC. There is no SDLC until there is S. You're not going to start writing production level code from the first second. You're most likely going to end up with your first function, class, whatever, that has a hundred lines of code or 200, which is unacceptable. You know that for testing reasons, it should be split into smaller pieces that are more encapsulated, but you're not going to start like that because you just want to prove to yourself that something works, and you go as fast as possible because those things take more time up front, even though the total time is easily shorter in the long run. We know that writing tests takes more time than not writing tests, but in the long run, looking at a timeframe of a month or longer, it is definitely shorter with tests than without, because then you start running into problems that take more time to resolve without testing. But still, you're most likely not going to write tests on the first day of a new project. Some people are very proficient. Some people do TDD and they're going to write them. It's the same thing as what I said before: from day one, I would write Terraform for my GKE cluster because I'm proficient with it, but most people are not proficient, almost nobody's proficient with everything, and there are many moving pieces. You might be writing production level code, but not writing tests. You might be writing tests from the first moment, but you're not writing the SDLC. You're most likely not going to build with a CI/CD tool on the first day. Now, somebody who lives and breathes that CI/CD tool is going to do it from the first day, but then he's not going to do something else from the first day. Every project generates technical debt from the very start.
Hopefully you generate more technical debt in the first week than ever after in that application or system or whatever it is, and that's okay. What is not okay is this: I said you have a couple of hundred lines in that first function. Now, if you end up a month later having 10,000 lines in that first function, then you're doing something terribly wrong. Then there is no excuse. Or you're already a couple of weeks into development and you don't have an automated pipeline that builds and runs your tests. You're still running it locally. After a while, there are no excuses anymore.

Darin: [00:09:28]
You're saying after a while, I would say based on the environment that you are working in even day one can be bad because if you're in a large enterprise organization and you are not in the research and development arm, you're actually in an operational type arm, you're now on a team that's producing a new service that's going to be consumed by another team. Hopefully you already have provided to you by that organization an easy button that gives you all of the things that you need. The corporate standard is we use Spring Boot for all of our services. Okay, great. So what should happen is since you're using Spring Boot, Oh, by the way, it's a specific version of Spring Boot that has been approved by the company to use. So there's an easy button. You give it a name. Maybe you give it how many environments that you need. Do you need dev, stage, prod or do you need dev, stage, load testing, prod? You get two choices. That's it. You give it a name. You figure out environment count and you press go and you go get a coffee. You go for a walk. You come back and everything is already up. So you clone the repository. You make a change and when it makes that commit, everything goes out to that lowest dev environment and is it working or not? If you're working in a large company and you do not have that from day zero, again, if you are not in a research and development part of the organization, that's a little bit different. That always has to be different, but if you're in a commoditized part of the organization and you don't have that easy button, your organization is making a very poor decision and creating risk by introducing technical debt that shouldn't be there anymore.

Viktor: [00:11:17]
Maybe we can rephrase: the period while you're experimenting and not doing this stuff should be short, and whatever short means, it means shorter than the time it takes to get to production.

Darin: [00:11:30]
Explain that a little bit more.

Viktor: [00:11:31]
In your example, if you're going to deploy that something to production the first day, then from the first day, it needs to follow all the guidelines. Everything. All the best practices. Now, if it's going to reach production for the first time a month from now, then hey, the first couple of days of finding your way is okay, assuming that there is no easy button. So you need to have a good reason not to do stuff. If I say, Hey, I want to try this out and it will take me triple the time if I spin up CI/CD and this and that, it's okay. I know that I will have to do it. But if there is an easy button, then there is less excuse. Why would anyone do it worse rather than better for the same amount of time? Then there is absolutely no excuse. In any larger company, you have those. You have an easy button. Now we can argue how easy the button is and how good the result of the easy button is. For this discussion, let's assume that the easy button is easy and that the result of the easy button is what you really need.

Darin: [00:12:29]
The argument that I do see from people that have an easy button and do not want to use it is because, well, the easy button doesn't do what I want. Look, you're working for the company. There is no I in company. They've already laid out what they expect. You don't like it? You can leave. Was that harsh?

Viktor: [00:12:53]
No. I was about to criticize what you're saying until you said the last sentence. If you don't like it, you leave. Then I agree. I've seen many easy buttons that are not easy, and if I change the word I to we, as in this team, this project, I've seen many easy buttons that simply do not provide what the project needs.

Darin: [00:13:14]
And that's okay because they're probably doing something different that falls outside of the scope of the easy button.

Viktor: [00:13:19]
Yeah. Then it's not easy anymore.

Darin: [00:13:21]
Right. But what was in my head, and I probably did not communicate it well enough, is that we are writing a new microservice that is going to be just like the other 50 that we have, and I don't want to use the easy button because I'm special, because I want to work on Saturdays and nobody else wants to work on Saturdays. Whatever the thing is. That's not okay. You need to suck it up, buttercup, and do what the other 50 are doing, and if you don't like it, again, you can leave.

Viktor: [00:13:47]
But there is one problem with the easy button, one big problem in my head, and that is that sometimes, or often, depending on how you want to define it, those easy buttons might actually stop any improvements from happening ever again, and that's problematic. Let's say that there is an easy button that deploys to mainframe, and there are buttons like that. Now you're starting a new project. You might want to deploy to VMs. You might want to deploy to Kubernetes, and an easy button might be problematic from that aspect. But that's why I said I was about to interrupt you, but then you said, then you can leave. If it's a silly easy button that only deploys to mainframe and there are good reasons not to use mainframe, then you leave.

Darin: [00:14:32]
Let's stay there for a second. If you have an easy button and it's not being updated at all, that means the backside of that easy button, which means provisioning a Git repository or a Subversion repo, whatever it is that your company is using, it's provisioning a space in the ticketing system, whatever that might be. It's setting up whatever the deployment targets might be. If those are never changing. Let's just stay here. You're using Git. You only use JIRA. You're using vanilla Git, because you didn't want to pay anybody. You're not using JIRA. You're using some open source ticketing system that you've hacked and whacked at for years. So you can't make that change and you're only going to mainframe. The chances of that easy button changing are near zero because your company is so behind the times they couldn't make a change. It would be too risky to the business.

Viktor: [00:15:23]
Exactly.

Darin: [00:15:24]
It's business driven, it is not technology driven. But if your business is forward looking and that easy button is not being updated as time moves along, then that is a huge problem.

Viktor: [00:15:35]
Let's put it this way: if there is a person in a company, or two or maybe three, that do not want to use the easy button and everybody else is willingly using the easy button, there's something wrong with that person or two. Now, if there is a significant percentage of people who are trying to avoid using the easy button, then there is a problem with the easy button. Then it's not doing what it's supposed to do.

Darin: [00:16:00]
It's not doing what it's supposed to do. I would agree with that. I don't know that there's a problem, but it's not doing what it needs because those two aren't equivalent.

Viktor: [00:16:08]
Yeah. It's not a problem that you have an easy button. The problem is what the easy button does, because realistically, I think that most of us are happy to avoid doing extra work. So if I choose not to use the easy button, and I'm not the only one choosing that, but let's say there are many of us choosing not to use the easy button, then, since an easy button is supposed to simplify our lives because that's the goal, it's not really simplifying them. I really have no desire to do extra work. There is no way that I would rather write Terraform than press a button that will create my cluster for me. There is no way that I will spend my time doing that unless there's something wrong with me, which happens every once in a while. But if there are many of us choosing, let's say, to create our own infrastructure instead of clicking a button, that's a smell to me.

Darin: [00:17:05]
Very, very smelly. Okay. So that was the front side. We were at the left-hand side of the SDLC. Let's get to the right-hand side of the SDLC. Should everything be automated in the runtime environments, whether that's actually production or any of the lower environments? Should the provisioning of those things always be automated?

Viktor: [00:17:27]
Which side did you say? Left?

Darin: [00:17:29]
No. Basically we've got all the code. All the code is done. It's packaged. It's all ready to go. Now we're talking about the actual runtime environments. Should the provisioning, the maintenance, all of that, be automated?

Viktor: [00:17:42]
There is even more need for that to be automated than the left-hand side, I believe, and the reason is that I believe that developers are front and center in a software company, or in the part of the company doing software. It doesn't have to be a software company. All the other roles are equally important, but they are facilitating, providing services to developers. Developers are customers of other software-related groups in a company. Everybody needs to have a customer. So if I'm, let's say, on the right side and I'm in charge of deployments, I'm providing a service to the developer. Otherwise, I'm not sure who my customer is. If I'm providing a service to a developer, I have even more need to automate it because I'm providing a service. I'm not renting my hours to the developer. I'm giving him the means to deploy his stuff to production. Now, I know that that's not the case in 80% or 90% of the companies, but that's how it should be. A developer develops software and wants that software to be running in production. He cannot do it directly because he has no knowledge, experience, authority, whatever, so others need to create a service that will allow him or her to do that, and that service requires a higher level of automation, otherwise it wouldn't be a service. Let's say the right side of me, as Viktor, is AWS. Would you agree with that? They are on my right side, and I don't think that when I request VMs, it goes to some ticketing system and then there's somebody who creates a VM for me. It wouldn't work. AWS is the right side of all the developers using AWS. Maybe there is something in between developers and AWS. Maybe there are one or two or three departments in between, but still the logic is the same. AWS is on the right side.

Darin: [00:19:38]
But what we see, going back to your 80% or 90% statement a moment ago, is people open up a ticket to get a VM. That ticket actually isn't the ticket that gets the VM created. That's the ticket that then allows another ticket to be created with a service provider to create the VM. But even then, that ticket isn't the ticket that gets the thing created. That's the ticket that actually gets the ticket created to the part of the company that is going to create the VM. So three tickets are created to create a VM because process. Whereas in reality, if that were automated, as it should be, you could still have the automation create all three tickets for you. This is why an ITSM tool like ServiceNow, like Remedy, like whatever, exists. It's like, well, we need proof that they happened. Well, great. I can guarantee you that, and I'm going to use the word broadly, a script doing something for you and leaving behind breadcrumbs of what it did is going to be better than any number of humans that can possibly do it.

Viktor: [00:20:43]
I can guarantee you that the script that is configured to log something as breadcrumbs is going to log something always, except if there is an outage or something like that. I can guarantee you that there is a much higher chance that a human will forget to log. It's a fact. We don't always repeat exactly the same steps.

Darin: [00:21:04]
I think it goes beyond fact and it's truth because fact isn't always truth.

Viktor: [00:21:08]
Okay. I'm not sure, but let's say it doesn't happen always.

Darin: [00:21:13]
It's interesting that you brought up the left-hand side of the SDLC. It would be great to have it automated, but you can sort of get by with parts of it not being automated depending on your organization. But regardless of the size of the organization, on the right-hand side, once you've figured out what it is that you're doing, it really needs to be automated. Again, if you're not part of R and D, research and development, and you're purely operational, that better be automated.

Viktor: [00:21:39]
So one thing is how you get somewhere and the other thing is what the output of that something is. Developers manually create, write code, whatever, but for their users, the usage of their outcomes is fully automated, right? When you go to Amazon to shop for something, you don't call them by phone and then somebody responds. The shopping cart is fully automated. That's the result of writing code, in a way. It runs. It's fully automated. Our customers get a fully automated solution and everybody's customers should get a fully automated solution. That means that if a developer is your customer, you have to give him the same level of attention a developer gives when the customer is the end user.

Darin: [00:22:25]
So in these last couple of minutes: if nothing is automated today, out of everything that we've talked about today, where should you begin your automation steps? Where you can?

Viktor: [00:22:37]
There is no answer to that because it really depends on everybody's work. If I'm a developer, I would start writing automated tests. If I'm in charge of deployments, I would start writing, I don't know, Kubernetes manifests. If I'm in charge of infrastructure, I would start writing... Actually, there is one answer. A good start is to write code. What that code will do depends on the domain and the problem you're trying to solve. Everything is code.

Darin: [00:23:07]
And that's it. Should everything be automated? Maybe. We initially said no, but should everything be automated? The answer is long-term, yes, but it depends on where you're at in the organization.

Viktor: [00:23:19]
Yeah, I'm intentionally saying no to the word everything, simply because, let's say, I'm going to click myself to merge a pull request, so it's not literally everything, but it should be very close to it. Maybe the better question would be, instead of should almost everything be automated, should almost everything be written as code? That's the real question. Automation is easy. Very easy. I can automate anything you want as long as there is code that will execute all of that. Automation is basically stitching different pieces of code together. The real question is should almost everything be written as code? Would you agree with that?

Darin: [00:24:01]
Yes, almost everything should be written as code. Can we have code write our JIRA tickets for us? Technically, yes.

Viktor: [00:24:09]
Oh, yeah. Many people have that: when you write a commit message with, I dunno, slash this, slash that, it triggers creation of JIRA tickets or closes JIRA tickets and stuff like that. Yeah.

Darin: [00:24:21]
But all of this is predicated on the idea that, through this automation, there are things listening for those things, whatever those actions are. In next week's episode, we're going to start diving into something that we're seeing a lot more of and that is not the easiest of things to comprehend, even though in reality, it is. And that is eventing.

Darin:
We hope this episode was helpful to you. If you want to discuss it or ask a question, please reach out to us. Our contact information and the link to the Slack workspace are at https://www.devopsparadox.com/contact. If you subscribe through Apple Podcasts, be sure to leave us a review there. That helps other people discover this podcast. Go sign up right now at https://www.devopsparadox.com/ to receive an email whenever we drop the latest episode. Thank you for listening to DevOps Paradox.

Show Notes

Hosts

Darin Pope

Viktor Farcic
