DOP 78: A Day in the Life of a SRE

Posted on Wednesday, Oct 21, 2020

Show Notes

#78: Today we speak with Adam Hawkins, a SRE for Skillshare and the host of the Small Batches podcast. We discuss what it’s like to be a day-to-day SRE and how some companies still don’t understand that it is possible to actually follow the “you build it, you run it” model of software development.

Adam Hawkins

Adam teaches teams how to improve velocity, reliability, and quality. He’s the host of the Small Batches podcast and is a Site Reliability Engineer at Skillshare.


Darin Pope

Darin Pope is a developer advocate for CloudBees.

Viktor Farcic

Viktor Farcic is a member of the Google Developer Experts and Docker Captains groups, and published author.

His big passions are DevOps, Containers, Kubernetes, Microservices, Continuous Integration, Delivery and Deployment (CI/CD) and Test-Driven Development (TDD).

He often speaks at community gatherings and conferences.

He has published The DevOps Toolkit Series, DevOps Paradox and Test-Driven Java Development.

His random thoughts and tutorials can be found on his blog.


Adam: [00:00:00]
If you are not willing to try new things and experiment out with it, experiment different ways of working or changes to your process to create better results, then you're never going to get better and practically speaking, you're only going to get worse.

This is DevOps Paradox episode number 78. A Day in the Life of a SRE.

Welcome to DevOps Paradox. This is a podcast about random stuff in which we, Darin and Viktor, pretend we know what we're talking about. Most of the time, we mask our ignorance by putting the word DevOps everywhere we can, and mix it with random buzzwords like Kubernetes, serverless, CI/CD, team productivity, islands of happiness, and other fancy expressions that make it sound like we know what we're doing. Occasionally, we invite guests who do know something, but we do not do that often, since they might make us look incompetent. The truth is out there, and there is no way we are going to find it. PS: it's Darin reading this text and feeling embarrassed that Viktor made me do it. Here are your hosts, Darin Pope and Viktor Farcic.

Darin: [00:01:15]
So this week we've decided to have another podcast host on with us. How did this come about, Viktor?

Viktor: [00:01:26]
I don't know, maybe we are getting out of ideas.

Darin: [00:01:30]
That's very possible.

Viktor: [00:01:31]
We need to bring people with ideas?

Darin: [00:01:35]
Well, that's what's in our intro, right? We bring people on because we don't know what we're talking about. So it's okay. Today we have Adam Hawkins. Is it Hawkins or Hawkings? Hawkins. Sorry. I was trying to give you at least a little bit of Stephen Hawking love there. Yeah, but it didn't work. Adam is the host of the Small Batches podcast, or just Small Batches podcast. Probably no “the” required. How are you doing, Adam? You doing pretty good today?

Adam: [00:02:09]
Yeah, doing good. Happy to be here and talk through some stuff with everybody.

Darin: [00:02:15]
So in Adam's real life, he is an SRE. He's not just a cool podcast host, right? He actually has a day job just like Viktor and myself do. So why don't you explain a little bit, if you want to say who you work for, that's great. If not, that's great too, but explain sort of what your day to day role looks like.

Adam: [00:02:36]
Yeah. So I work as a staff SRE at Skillshare. You may have seen Skillshare from a ton of YouTube ads. They market everywhere. They do online learning. You can go to Skillshare, sign up and you get access to all these courses. They primarily target creatives, a lot of classes for things like Adobe Photoshop, freelancers, like marketing. Some stuff for podcast production, video production, that type of stuff. So in my work as an SRE, I primarily focus on enabling teams in the sort of “you build it, you run it” model. Practically speaking, I do a lot of continuous delivery type things like building pipelines, making sure that there are appropriate checks in the pipelines and monitoring, and really focusing on being an advocate for the four metrics in Accelerate: deployment frequency, lead time, change failure rate, and MTTR, and ensuring that our team and our organization can move at a high velocity so we can succeed as a business. It's probably not the typical role you'd see as an SRE, but we're sort of a small team. Right now we're about 35 engineers. I think that those ideas, like the principles of DevOps and those metrics, are a great way to focus teams and keep them aligned to a vision of what success looks like for an engineering team.
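
As a rough illustration of the four metrics Adam lists, here is a minimal Python sketch. The Deployment record, its field names, and the four_metrics helper are hypothetical and are not taken from Skillshare's tooling or from Accelerate. MTTR is treated here as the mean time from a failed deployment reaching production to service being restored, which is only one possible reading.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def four_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarize the four metrics over a reporting period of period_days."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.failed]
    restore_hours = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / period_days,
        "lead_time_hours": mean([hours(d.deployed_at - d.committed_at) for d in deployments]),
        "change_failure_rate": len(failures) / len(deployments),
        "mttr_hours": mean(restore_hours) if restore_hours else None,
    }


if __name__ == "__main__":
    start = datetime(2020, 10, 1, 9, 0)
    history = [
        Deployment(start, start + timedelta(hours=2)),
        Deployment(start, start + timedelta(hours=4), failed=True,
                   restored_at=start + timedelta(hours=5)),
    ]
    print(four_metrics(history, period_days=1))
```

Tracking all four together matters because the first two describe speed and the last two describe stability, so a team cannot look good by trading one for the other.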

Viktor: [00:04:06]
So what does enabling a team mean in your context?

Adam: [00:04:12]
Well, I chose the word enabling specifically because this is actually one of the different interaction modes from the book, Team Topologies. So the book Team Topologies sets out a few different team types and different ways that these different teams interact. So like one of them is enabling. So I'm not sure what all of your experiences are, but, you know, working as somebody who's kind of moved backwards from writing user facing software, to building APIs, to doing infrastructure, and then into SRE, you get further and further away from the things that directly provide value to the business. Because of that, you get farther away from the code. But those problems and those capabilities still exist in the organization. It's not really possible for one team to own everything. Maybe not own, but fully understand everything. For example, I can't understand React. That's just outside of my wheelhouse. If I have to write software like that, somebody is going to have to come and enable me in some way, either by showing me how to use our internal tools or whatever structures we have in place. What I do as a SRE is try to enable teams and engineers to work at different levels of this “you build it, you run it” model. So in practice this means teaching people how to use our continuous delivery tools. Teaching people how to add monitoring and alerting into their production environment. Showing people how to construct deployment pipelines. Kind of bootstrapping the initial structure so that they can come in and add to it, or, you know, grow upon that. But making sure that whatever the different teams put in place hits the minimum viable requirements as we have defined them as an engineering team at Skillshare.

Viktor: [00:05:57]
I had a sinister reason for asking you what does enable mean because in many of the companies I've worked with enabling means you have to use this tool and if you want this done, you need to open 7 Jira tickets and things like that. If I understood right what you mean by enable, you mean enable. You are really treating them as your customers in a way.

Adam: [00:06:19]
Exactly, in the sense that, I mean, I don't know about you guys, but I never want to be on the critical path to anybody else's work. Right. Especially, you know, being like an SRE and coming in, like in the intersection of like infrastructure, software design that hits a whole bunch of different layers in the whole stack. SRE tends to hit a lot of those different things. So like, I try to just make sure that people know what to do, how to do it, and if they need help doing so then do it so that they can execute outside of, you know, like us or myself as dependencies getting the work into production.

Darin: [00:06:54]
I'm going to restate what Viktor said. You're actually doing enabling on the positive side and not on the negative side. Meaning sometimes you'll hear people say, okay, you're enabling an addict with something versus enabling somebody to really help somebody. That's the positive versus the negative. You brought up the Team Topologies book. For any very astute listeners, you may remember episode 51 where Adam Sandor was on with us, fighting with Viktor which ended up being more of a lovefest than a fight. That's where we first talked about Team Topologies, the book itself. And one question I have is you were talking about when you're trying to get your customers going, you onboard them with your CD tools, talk to them about monitoring and alerting. Do you allow people to actually deploy anything if proper monitoring and alerting is not in place?

Adam: [00:07:53]
No.
Darin: [00:07:55]
Well, that's not enabling.

Adam: [00:07:57]
How so?

Darin: [00:07:58]
Because you're stopped. Well, you're stopping product from shipping something just because of a stupid rule that they have to have monitoring and alerting in place.

Adam: [00:08:05]
Well, I can't tell if the “stupid rule” is being facetious or not, but part of the enabling, I think, comes from the fact that in order to build and deploy code properly, you have to meet a certain set of requirements. And that set of requirements may not be what every single engineer is familiar with. So you might have engineers who have never deployed code to production. You might have engineers who have worked at senior level and have gone through this whole process. But part of the enabling mode in my mind is making sure that all of these different criteria are met, and I would rather have us hit this minimum bar in production than ship something to production that didn't meet that minimum bar and have longer or more negative ramifications in production. So you can definitely think of it as a gate. I think the gate is a big part of the enabling. I mean, enabling doesn't necessarily mean do whatever you want. You can just deploy untested code to production. I mean, that would kind of be the opposite. I mean, that would definitely be enabling them, but that would not necessarily be enabling the business in my mind.

Darin: [00:09:17]
Talk to a product owner. They'll tell you that's the exact opposite. If you're telling me I cannot ship a feature that is required because I don't have my monitoring and alerting in place, product's going to have your head.

Adam: [00:09:30]
Oh, yeah, for sure. But the layer that I am talking about is more at the service level, not necessarily the specific stuff for a certain feature. For example, if you're going to deploy a new service to production that doesn't have any kind of telemetry for, say, HTTP requests or success rates or any of these kinds of things, well, then that doesn't meet the bar to deploy to production. Once you have certain services in production and then you have these more fine-grained features, personally I'm willing to just let that go, with the assumption that the team who's building that thing will monitor that specific thing if they need to. At the level that you're talking about, Darin, I'm willing to just let that one go.

Viktor: [00:10:16]
We're talking more or less, if I understood right, about having some rails. Does this mean that, for example, if I would be working now with you, if I meet those bars, let's say, or don't go out of those rails, any way I can accomplish that, with or without your help, is good, right or no?

Adam: [00:10:40]
I think that's fair.

Viktor: [00:10:41]
Is it more about making sure that certain objectives are met, no matter the path taken, or do you need to follow a strict path? Kind of, it needs to be like this or that?

Adam: [00:10:53]
It's a little bit of both. I think that every organization walks this line a bit differently. So of course, starting from first principles, right? Say you're going to launch a new service into production. I don't think it's a surprise to say, but you need to have automated tests. Okay. I mean, that might be new for some people, but if you put that in some sort of requirements, then they'll have to be met. And I don't necessarily care so much about how that happens, but about the fact that you have an automated test suite that says, Hey, I made a change. It's been tested, the tests passed. Therefore I'm confident I can deploy this code to production. You can create that test suite however you want. You don't necessarily need to require SRE involvement in that by design, but you can of course ask for help. And that goes for all of the other high level requirements. For example, that you have log streaming. You have telemetry. You have some way to do zero downtime deploys. You have perhaps, say, a canary release process. You have some SLOs established. You have alerts that trigger somebody on PagerDuty. You know, you have a process in place to detect and resolve production issues. Those are the high level objectives. And then I think every engineering team takes those first principles and defines some well-trodden paths, like, Hey, at company X, we use language Y and tool Z and we integrate them like this. This is the well-trodden path that you can follow and copy and it will work. But, you know, feel free to go off and do your own thing if you can meet all these requirements using a different set of tools. Whereas at Skillshare, we have a more prescriptive set of tools. Like, Hey, at Skillshare, we use PHP and Node.js. So if you use Node.js, then you're going to be using this set of libraries that have been written internally. They're going to be integrated in this way. And by using all of those things that have already been written internally, you can say that, Hey, it's already met all the different criteria. Now, write the code in such a way that you maintain these requirements as you build out the service in question.
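
The gate Adam describes can be pictured as a checklist that a new service must satisfy before its first production deploy. The following is only a sketch under stated assumptions: the capability names, the Service record, and both helper functions are hypothetical and do not describe Skillshare's actual checks. A canary release process is left out of the required list because Adam mentions it as a "perhaps".

```python
from dataclasses import dataclass, field

# Hypothetical checklist loosely derived from the requirements named in the conversation.
REQUIRED_CAPABILITIES = (
    "automated_tests",
    "log_streaming",
    "telemetry",
    "zero_downtime_deploys",
    "slos",
    "paging_alerts",
)


@dataclass
class Service:
    """Hypothetical description of what a service has in place today."""
    name: str
    capabilities: set[str] = field(default_factory=set)


def missing_capabilities(service: Service) -> list[str]:
    """Return the required capabilities the service does not declare yet."""
    return [c for c in REQUIRED_CAPABILITIES if c not in service.capabilities]


def may_deploy(service: Service) -> bool:
    """A new service passes the gate only when nothing is missing."""
    return not missing_capabilities(service)


if __name__ == "__main__":
    checkout = Service("checkout", {"automated_tests", "telemetry", "log_streaming"})
    print(may_deploy(checkout))            # False
    print(missing_capabilities(checkout))  # ['zero_downtime_deploys', 'slos', 'paging_alerts']
```

The check says nothing about how each capability is implemented, which matches the point that teams may meet the bar with the prescribed libraries or with their own tools.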

Viktor: [00:13:07]
What is the expectation from those, I don't know what you call them, application teams maybe? What is the expectation from an experience and knowledge perspective? Beyond “you need to be able to write the code of your application”, that's a given, right? But I'm really curious, how far do you go? Because I've seen both extremes. “I don't even want to write a Maven file. That's not my job.”

Adam: [00:13:34]
I think that is actually a really good question because that speaks to hiring, team composition, experience levels in the teams. At least the level that we strive for is that a team should be able to basically greenfield something, deploy it to production and operate it, using some sort of off the shelf tools. It always comes down to a question of where you draw the line, right? So we deploy to Kubernetes. Okay. So we, the SRE team, run the cluster, right? But if somebody is deploying something, they're going to have their deployments and their services and all these types of things. We might help them write those manifests, but it's up to them when they deploy it to be able to troubleshoot it in that environment. And if they can't figure that out or they need help, they can of course ask SRE, but it's their first responsibility to act in that model. But I've been on the other extreme, you know, where it's like you said. They don't even expect to, like, Oh, there's a change to the Dockerfile or to the Makefile. I can't do that. Somebody else should do that. Or, for example, using something like CircleCI or Codefresh, with these YAML files that define pipelines. Like, Oh, nope, somebody else should touch that. That's not me. I can't do that. We go for the opposite where, Hey, all of this is owned by your team. You have the capability to change it. You can modify it to meet your requirements. If you need help, ask, but you have the power to do that. You are the maintainer of your critical path. Change it as you see fit.

Viktor: [00:15:14]
Does it include also PagerDuty? If something goes wrong in production and it's not the cluster, it's the application, who gets a call at three o'clock in the morning?

Adam: [00:15:24]
The application teams. Given that we have a mapping between services and teams, or between high level code areas or business areas and individual teams, with the appropriate monitoring and alerting, say something goes down or there's an issue with, say, purchasing, like payments, right? There's a team who is responsible for that. They will get paged and then they're the front lines. So if they receive the page and say, Hey, I've gone through all this troubleshooting and it looks like there's something at this other layer, a lower layer of the stack, they can escalate it to the person who is on call for, say, SRE or a different team. But there's a PagerDuty rotation for all the different teams and without that it doesn't work. Right. Because if you don't have that, then you cut the feedback loop from production back to dev, which then defeats the whole purpose.
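
The ownership mapping and escalation path described here can be sketched in a few lines of Python. Everything below is hypothetical: the service and team names are made up, the fallback to SRE for unmapped services is an assumption, and no PagerDuty API is called. It only shows the routing rule that the owning team is paged first and escalates when the cause sits below the application.

```python
from typing import Optional

# Hypothetical ownership map: service name -> owning application team.
SERVICE_OWNERS = {
    "payments": "purchasing-team",
    "video-player": "playback-team",
}

# Hypothetical escalation target for lower layers of the stack.
ESCALATION_TEAM = "sre"


def first_responder(service: str) -> str:
    """The owning team is paged first; unmapped services fall back to SRE (an assumption)."""
    return SERVICE_OWNERS.get(service, ESCALATION_TEAM)


def next_responder(root_cause_layer: str) -> Optional[str]:
    """After triage, escalate only when the cause sits below the application."""
    if root_cause_layer == "application":
        return None  # the owning team keeps the incident
    return ESCALATION_TEAM


if __name__ == "__main__":
    print(first_responder("payments"))    # purchasing-team
    print(next_responder("application"))  # None
    print(next_responder("cluster"))      # sre
```

Keeping the application team first in line is what preserves the feedback loop from production back to development.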

Viktor: [00:16:12]
Exactly. But now, here's a critical question from my side. There will be a follow up one. Did you hire people into those roles or did you transition them into such a way of working?

Adam: [00:16:26]
Which roles?

Viktor: [00:16:27]
I mean, not roles. Sorry, but that level of ownership of applications. Is that how it always was?

Adam: [00:16:35]
It's probably a mix. I think Skillshare has been around for maybe like four or five years. I've been with the company for a year and a half. I think they always had a higher level of ownership, but not necessarily complete autonomy because there was a DevOps team or ops team handling some of this stuff. This is always the problem or the challenge when it comes to brownfield versus greenfield projects and organizations with older code bases and team structures, whatever. But for new stuff, it's certainly more like that where, hey, team X creates this thing. They are the complete owners of it from time zero up to production, operating it, and eventually even decommissioning it. Whereas for some of the stuff that doesn't fit that model, and for engineers who have never operated in that way, this is where SRE comes in, back to the idea of enabling. Giving them the tools and the knowledge and training or whatever it is that they need to be able to operate at that level. And also making it clear to people who come in the door that this is what's expected from them. That you can't just say, the Makefile needs to be changed, I will create a ticket in some other person's backlog. Like, no, no, no, no, no, no, no, no. If you need to change that, you need to figure out how to do it yourself. And if you can't, you need to ask, like, I don't know how to do this. I need help. Like, what do I need to do so that you can do it.

Viktor: [00:17:55]
What you are saying is very close to what I've been preaching, you know, kind of that's the way to go and all this stuff. Now, the answer that I get most of the time is that, yes, I agree. This is great. This is how things should be done, but we are special because, and then there is a lot of after that,

Adam: [00:18:16]
some rambling. Right.

Viktor: [00:18:18]
Financial industry. We are this, we are that. There is obviously a reason why that is a good thing, except in somebody's case. Can you imagine why, ignoring for now the fact that people cannot change easily overnight? Why is the rest of the industry then stuck? I mean, not the rest. Most of the industry is stuck in that: 75 departments, the lifecycle of an application split between 85 groups, and all that stuff.

Adam: [00:18:50]
That's a great question, and when you talk about the rest of the industry, it's always surprising how big the quote rest of the industry is. I read, don't quote me on this exact figure, but I read Mik Kersten's book, Project to Product, and he estimates what percentage of companies has adopted this DevOpsy and high velocity type way of building and delivering software. And it was significantly small. I'm talking like 2%, 3%, really small compared to the rest. And if you look at that, it's like 5% compared to 95%. Who knows why? Is it education? Is it understanding, or is it simply the fact that they haven't needed to? That business as usual is fine enough. For us who work at tech companies building software, it's easier to be on the forefront of these things, but say you work in retail, maybe you don't even think about software in the same way as part of your business, or see how important it is to change the ways of working to empower the business. I don't have a good answer as to why such a large percentage of the industry, not organization, doesn't think or act in this way, or why it's still kind of a small group of people. I don't know, but I think the challenge that we have, as people or teams or organizations that are working like this, is to really just do what we can to spread the knowledge and try to help other organizations succeed also. I mean, it's kind of a rising tide thing: if we all, as an industry, work in such a way, then we'll have better products. We'll have better services that we can use as customers of them also.

Darin: [00:20:27]
As the old guy here, I can give you two answers. First answer is that's not the way we do it. The way we do it here is this way. That's the answer one. The second answer and I've heard this time and time again is we don't trust our developers to do the right thing. And my, my response to the second question or statement is always then why did you hire them? Well, we needed somebody to do the work. I went, well, why didn't you hire the right people? Well, we needed somebody right now. I went then you've put yourself in the problem.

Viktor: [00:20:59]
But actually those two answers are

Darin: [00:21:02]
they're valid. They're valid. They're sadly valid.

Viktor: [00:21:06]
To be honest, when I spend, let's say, a few days with a huge company (some, not all, of course), I come to the same conclusion: those people cannot do that. Now the real question is, why did you train them not to be able to do that? That's the real challenge. Everybody starts somewhere from a very low bar. Some people get used to it and then others don't, and it must be the environment.

Adam: [00:21:35]
If somebody says, like you don't trust your coworkers, well then you have an entirely different problem, you

Darin: [00:21:40]
Oh, it's not the coworkers. It's the managers saying they don't trust their people.

Adam: [00:21:44]
Ah, okay. Well, you can even have it the other way, where developers don't trust the other developers to do what they need to do. It can go all sorts of ways, but I definitely agree with you in the sense that if the managers don't trust the developers, then you're screwed in that way. Even the other way around, where higher level management thinks that continuous delivery is too dangerous. We cannot go fast without creating more risk. I've been in organizations where that has been the top down organizational position, that this is too dangerous or we shouldn't do that. And the same thing can be said about things like remote work, where we've all been kind of forced into it now, but there is definitely a fear of new stuff. Maybe not misunderstanding, but fear and mistrust are, as you said, sadly true factors that inhibit adoption of these ways of working.

Viktor: [00:22:39]
To me, the curious thing is that usually it is phrased as we cannot do let's say continuous delivery because continuous delivery is too dangerous. While we know that it's not. If you start Googling right now in five minutes, you're going to find just enough proof for anybody that's not correct. What that sentence really usually means is that we don't know how to do it and especially that I'm a manager, I don't even know what you're talking about and those who know what they're talking about, they don't know what that is. I really get disappointed when I hear generalization of something that is your own personal or company or team inability to do something. We know that continuous delivery works, right. Maybe it's not perfect for every single organization, but it works. It produces results. I mean, look at Google, Amazon, Apple. I mean, what makes you think that you know better?

Adam: [00:23:32]
Well, right. Part of the other thing too is that some of these things that seem true on their face are sort of anti-patterns in thinking, like the misconception about continuous delivery that if you go fast, you're gonna break things more. What is that saying from Facebook, like move fast and break things. But people associate that with the idea of continuous delivery. As you said, Viktor, you can just go online and find out now that continuous delivery is, in fact, the actual safest way to build software. But if you only thought about it on its face, you might see how it was going against that. The same thing goes for automated testing. I haven't had this sort of conversation for a while, but you still encounter people who will think that automated testing actually slows down the whole process and that it has a negative impact. If I hear that I almost go completely white faced because I'm just so surprised that somebody would say that now, when there's so much evidence to the contrary that in fact the only way to succeed is with automated testing.

Viktor: [00:24:30]
Yes, but that's the difference. In reality, if you look at a specific organization, a specific company, then actually what you just said is true, because automated testing is slowing them down because it's flaky. So they need to rerun it 50 times until it passes, and you're still going to do manual testing after your automated testing. So why did you invest in automated testing? Now, I'm not saying in any formal way that that's the way to do it. But when you look at the sorry state of the actual implementation of testing in many companies, you say, I mean, I say also, I don't understand why you invested years in this. I don't understand why you did that.

Adam: [00:25:15]
Yeah, I know. I've been there too. The worst thing about some of these things is that if you adopt them, like sort of like halfway, they don't work, right. You end up with some automated test suite that takes a long time. It's flaky. Maybe it doesn't cover all the things that you expect it to do. You can't really trust it. And then it becomes this unfortunate ceremony of adding tests or re-running them just to get it green. But to what purpose? I think this comes back to one of the other things we mentioned prerecording, there's a sort of a continuum between where you start and like where you want to go, and this comes back also to your point about hiring people to work in a certain way, is that you need experienced people to lead these things to make sure they don't devolve into these things like these implementations that take years of time, or that just lead to the exact opposite result of what you want. Like the practices that teams and organizations put in place at the very beginning of their life tend to live on for a very long time. If you can get those things right, that's an amazing success.
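
One way to make "flaky" concrete is to look for tests that both passed and failed against the same commit, since nothing in the code changed between those runs. The sketch below assumes a hypothetical run history of (commit, test name, passed) tuples; it is not tied to any particular CI system and is just one possible definition of flakiness.

```python
from collections import defaultdict


def flaky_tests(runs: list[tuple[str, str, bool]]) -> set[str]:
    """Return tests with mixed results on at least one commit.

    runs holds (commit_sha, test_name, passed) tuples, a made-up format.
    """
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for commit, test, passed in runs:
        outcomes[(commit, test)].add(passed)
    # Two distinct outcomes for the same commit and test means pass and fail both occurred.
    return {test for (_, test), seen in outcomes.items() if len(seen) == 2}


if __name__ == "__main__":
    history = [
        ("abc123", "test_checkout", False),
        ("abc123", "test_checkout", True),   # same commit, different result
        ("abc123", "test_login", True),
        ("def456", "test_login", False),     # different commit, so not flaky by this rule
    ]
    print(flaky_tests(history))  # {'test_checkout'}
```

A list like this gives a team something specific to fix or quarantine, instead of re-running the whole suite until it goes green.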

Viktor: [00:26:25]
Now that you mentioned hiring, that's similar to when a company doesn't do well. Usually they change their CEO, CTO. They change the person at the top. Now what surprises me is, why not apply the same thing to engineers? And in this context, I don't mean change, but effectively, I believe that the only way, especially in large organizations, to make that change is to bring in fresh blood, give them a project, make it a success, and then start spreading that to the existing organization. If somebody hasn't succeeded in doing that for 20 years, what makes you think that by making a decision right now, “I make a decision. You're going to do automated tests today”, suddenly it's going to be beautiful? It's not. At least I never saw it.

Adam: [00:27:14]
Oh yeah. I mean, if you look at the literature, the model that you were suggesting is probably one of the best ways to do that: bring in people who aren't tainted by the status quo. Try to remove the status quo from their work as much as possible. Give them a project with clear objectives and say, figure out the best way to do that, and just let them. First, trust them to do that, right, very important, and then just let them go, and then learn from that, double down if it works, and then figure out how to spread that in the organization. I wouldn't necessarily call it a takeover, but it's almost like a progressive rollout of a different way of working. The power there comes from the fact that you can prove it. You can say, Hey, we did this experiment with this new team. We gave them this, it works. And that's the thing that convinces the people who are like, I don't know about this, or I'm uncertain, I don't know how to do it. It's right here. Follow this. We've already done it. You know, taking the unknown and turning it into at least a known and perhaps even a prescriptive solution. It's pretty powerful.

Viktor: [00:28:20]
Yeah, because otherwise you're locked into that endless conversation. I say this. You say this. I say it works. You say it doesn't work. We are special and you never get anywhere without that proof. And that proof cannot be done without drastic change

Adam: [00:28:36]
or perhaps without the curiosity to experiment, which to bring this back to the third way of DevOps that's in the title or the name. If you are not willing to try new things and experiment out with it, experiment different ways of working or changes to your process to create better results, then you're never going to get better and practically speaking, you're only going to get worse. You're not going to even maintain. You'll just degrade over time. If you're not willing to experiment, then you're not willing to change and then you're just dead. Like you're dead in the water. At least from the growth perspective.

Darin: [00:29:12]
Well, we're too busy to experiment. I can't hire somebody experienced because I can hire three other people for the same amount of money. So three people will be better than one really good person. What are the other things that I hear almost on a daily basis?

Adam: [00:29:25]
Is that The Mythical Man-Month right there?

Darin: [00:29:28]
You're breaking out an old one there. It's partially The Mythical Man-Month, but it's just most large companies. Like today, you said you have 35 engineers roughly on staff, correct? It's easy for you to do this. I go to clients that have 10,000, 20,000 developers. They're still building all their VMs by hand and it takes 12 months to get a VM through a Jira ticket. That's not going to change. Ever. If you want to retire, that's the place you want to go work. Which is fine, which is fine.

Viktor: [00:30:09]
I wouldn't say it's never going to change. Sometimes it does change, but the more important thing is that survival is optional. It's not God-given that we're all going to survive. That every single company is going to survive and hey, if it happened to TV industry with Netflix, it can happen to others.

Adam: [00:30:31]
Well, it's certainly not going to change if you're thinking too big. If your first objective is to roll something out to the whole organization, that's only going to work if your organization is sufficiently small. It's definitely not going to work if you have 10,000, a thousand or, you know, even like X hundred. Even 35 people or 20 people. How do you actually influence and change people, or change, like, ways of working? That's a really hard problem. Once you get beyond two people, it's hard. It's even hard for two people.

Viktor: [00:31:00]
I think that the whole approach that's taken, and now I'm building on top of what you said, Darin, is wrong. You get to a certain position in a big company, 10,000 people, and then you start making plans for how to change 10,000 people, and that's just horribly wrong. That's never going to work. You need to figure out how you're going to get to those 10 people. And then once you get to those 10 people, how you're going to convert them into 50 people. And then from 50 to, I don't know, a couple of hundred, and so on and so forth. The ultimate goal might be 10,000, but the problem is that I too often see those plans. Oh, now we have this project that will convert the whole company, the whole 10,000 people, into cloud native, and that is the problem from the very beginning. Before you even started thinking about the steps, you made a huge mistake. You envisioned something that is not doable. That's the same thing as if we say, okay, we have this 10-year-old application that is, you know, all horrible. How are we going to rewrite that application? If that's your thinking, you're going to fail. Now, if you start thinking about how we are going to remove this tiny part of that application and make it better, and that's going to be 1% of that monster, but it's going to be better. And then we're going to figure out how we're going to increase that 1% into 3. Then you're on the right track. But to me, it's the same thing. Like refactoring a 10-year-old application developed by a couple of hundred people during those 10 years. If you create the new architecture for that application, it's going to fail. Same thing with people.

Adam: [00:32:47]
Yeah. The only way to make real progress is by incremental updates. If you happen to be in a position where you can just burn it all down and recreate it, it's a different thing, but it's more predictable to make incremental changes, as you said, Viktor. Refactoring an application or breaking down a monolith or changing an organization, like even restructuring teams: for all these things, the more predictable and successful way to do it is in small batches.

Darin: [00:33:14]
Oh, and right there, another promotion for your podcast. I heard that, I heard that. And that's probably a good place to sort of wrap up. Adam is, again, the host. Just a host, right? You do it all by yourself.

Adam: [00:33:28]
Mostly. So I had been doing episodes, like I pre-write them and record them. They're like five to eight minutes. So they're small

Darin: [00:33:36]
small batches. Yes. Unlike this one, which will be longer.

Adam: [00:33:40]
Yeah, but that's okay. I started doing more interviews to keep it fresh. So like half interviews and half just me.

Darin: [00:33:48]
So go check it out wherever you listen to fine podcasts. That is Small Batches. Look for the name, Adam Hawkins, and just in case Small Batches. Do you have any competitors with that same name? Okay. Yeah. I looked up one today that I was listening to and there were like 20 with the exact same name. I'm like, okay. I have to add the host name in there as well.

Adam: [00:34:10]
Yeah, it's probably whiskey or a coffee or something to do with drinking.

Darin: [00:34:16]
It wasn't that. You've never listened to here. I don't drink.

Adam: [00:34:19]
Oh. Oh,

Darin: [00:34:20]
I'm one of the, I don't drink coffee. I don't drink alcohol. I have an addiction and he does all those. So he makes up for, like, my wife and daughter make up for the coffee part. Viktor makes up for the alcohol part.

Adam: [00:34:34]
Well, that's admirable. I couldn't, I couldn't do that.

Darin: [00:34:39]
Let's just call it: I have a very addictive personality. And we were talking about guardrails. I have my boundaries and I know where those are.

Viktor: [00:34:47]
Me too. I'm also very addictive, but that's why I have a lot of alcohol.

Adam: [00:34:55]
Yeah. Yeah. Sometimes. Sometimes the best way to write some code is to have a little scotch.

Darin: [00:35:02]
Wow. Okay. Maybe that would be one of the taglines for this week. To write better code, have a little scotch.

Adam: [00:35:10]
Yeah. It unlocks your brain a little bit, in my experience.

Darin: [00:35:14]
Alright, Adam, thanks for hanging out with us today. If people want to get in touch with you, where's the best place they can find out about you?

Adam: [00:35:20]
Just go to You'll have a link to the podcast and a link to my website and ways to contact me.

Darin: [00:35:26]
That link will also be down in the show notes, or if you're actually watching this on YouTube, it will be down in the description as well. Adam, thanks for hanging out with us today.

Adam: [00:35:34]
Oh my pleasure. Thank you, Darin. Thank you, Viktor.

We hope this episode was helpful to you. If you want to discuss it or ask a question, please reach out to us. Our contact information and the link to the Slack workspace are at contact. If you subscribe through Apple Podcasts, be sure to leave us a review there. That helps other people discover this podcast. Go sign up right now at to receive an email whenever we drop the latest episode. Thank you for listening to DevOps Paradox.