Omer
00:00:00.000
Don't overcomplicate, don't overthink. Sometimes the 80% you already have in place will do the job.
Darin
00:01:08.498
Kubernetes 1.34 was released recently. At the time of recording it was really recently released, as in a few hours ago, and it still seems like it's yet another boring release. What do you think, Viktor?
Viktor
00:01:22.177
Kubernetes is boring. It is a boring release, and that's a good thing. It was not boring for the first five years, when every release would be an earth-shattering change with new features that people were desperately missing, and so on and so forth. Right now, it's stable. It's used by almost everybody. It has been around for 10 years. People are more focused on making it continue being stable and working than on changing everything.
Darin
00:01:58.509
AI workloads have been introduced. Is that going to keep things stable? On today's show, we have Omer Hammerman on, from Zesty. Omer, how are you doing?
Darin
00:02:13.534
What do you think about that statement? Kubernetes is plain boring, 10 years in, everything's great. But now we have workloads that could potentially upend all of that.
Omer
00:02:25.837
I have these two figures on my shoulders, screaming counterarguments. One of them is saying boring is good: when you're going to production, that's what you want. On the other hand, Kubernetes isn't boring, because it takes the boring old workloads and shifts them into the dynamic world of pods coming up and down. You can run temporary jobs, you can basically do anything. And while the recent release is kind of boring, the one before that, to those of us who have been working with Kubernetes for quite a while, brought some exciting changes, and it's a very nerdy kind of exciting. 1.33 was one of the most exciting releases I've seen, with a very special feature. Bottom line, boring is good.
Omer
00:03:18.071
It's definitely pushing the limits of what the system can do, but it's not getting there. There are a few aspects to this, and AI is just another type of workload, not something we haven't seen, because we're not talking about whatever's going on inside the container. It's just a new type of workload. Sure, it requires a lot more: it needs more CPU, preferably GPUs, it needs a lot of disk space, and you probably do want to put more gateways and firewalls around it and cover it with more layers of security. But at the end of the day, you're just running another type of workload. I'm happy that Kubernetes has been around for that long, because we've kind of prepared ourselves. Not us, but whoever's working on the code base has been preparing the world for something stable enough to run AI workloads on. So I'm happy with how things are.
Darin
00:04:09.624
Well, you're speaking as a technical person. If I were the business person, I'm having to add disk, I'm having to add GPU, I'm having to add and add and add. What was costing me maybe a few million a year now costs me 5x, 10x that.
Viktor
00:04:26.898
But that cost is not in Kubernetes, right? That cost is outside the realm of Kubernetes. You just said GPUs, right? That's the cost, not the fact that you got your workload into Kubernetes.
Omer
00:04:40.833
Exactly. I'd actually claim that the fact that you can run this in Kubernetes helps you reduce those costs. Because if we didn't have Kubernetes, like Viktor said, you'd just provision from whatever cloud provider you use and launch your workload on there. But having Kubernetes, the keyword here is autoscaling. You can scale up when you need it for the load and you can scale down, and that goes for everything, with a few asterisks here and there. You can scale down pods, you can reduce the size of the nodes, you can even reduce the size of a volume. That's a comment for later, but you can basically resize everything. So having the ability to resize and automatically scale things works in your favor. You can use it when you need it. By the way, cloud resources are, air quotes, infinitely scalable. You can expand them forever and then you can shrink them. That's the fun of paying on demand and paying as you go. So it works in your favor. That's what I think.
Viktor
00:05:45.821
Wait, wait, wait. What? Okay, so you're saying you can actually spend less money on GPUs than you would normally spend?
Darin
00:05:53.716
Okay. People that have listened to the show know that I'm not anti-cloud, but I'm also a strong proponent of data centers, and that's a sunk cost. I'm not going to be able to scale that down or reallocate it.
Omer
00:06:19.613
Here's the thing. You're right that it's a sunk cost when you have your own infrastructure, or maybe just part of it. If you have your own infrastructure and it's bought and installed in a data center, the moment you allocate everything towards one goal, it's allocated. It doesn't matter if you use it or not; you've already paid for it. However, if you're using something like Kubernetes, you might be able to resize things dynamically to free up resources on the side. It doesn't mean you're not going to pay for them, but it does mean you can reallocate them towards other efforts. This is especially important in large enterprises and in larger teams that have multiple products, multiple divisions, multiple efforts going on. So if you can free up resources, it works to your benefit. And yes, you're not going to pay less in that sense, but you are going to be able to use it. And more often than not, you've already paid less by the fact that you went to a data center rather than a cloud provider. So I think you made a fair point.
Viktor
00:07:16.843
With your own data center, it's not that you can pay less; it's that you can avoid buying more.
Darin
00:07:26.599
Okay, I'm just trying to push on this AI thing, because people are getting all up in arms over it. Look, it's another workload. It's a sunk cost. And even if it's not a sunk cost, it's going to be an ongoing cost that you're needing, air quoting "needing" very hard here. Are people wasting money today on AI workloads, just like they were wasting money back in the early days of Kubernetes? They're doing one of two things: either wasting way too much money or optimizing too soon.
Viktor
00:08:04.780
Most likely. But, and this is my argument against your own data center, we don't really know. Companies do not really know yet what they need. They have no idea. Should they create their own models? Probably not. Maybe yes. Who knows? There are so many things that are completely unknown. People on YouTube or television or wherever will tell you they know stuff. Nobody knows stuff, and nobody knows what's going on, right? And even fewer people know what will be happening half a year from now. That means investment in your own hardware is investment in things you don't know. As a traditional enterprise, like a bank, you have no idea what to do nor what will be happening in the near future. So the only thing you can do is say, okay, let's just experiment, and let's experiment by leasing what we need instead of buying.
Omer
00:09:05.282
I agree 100%. I mean, you can pick that path, but let's take a look at, again in air quotes, influencers. You have a guy like DHH running 37signals. He's one of the biggest pushers of moving away from the cloud towards your own data center, which is great. But not only is he famous enough to get whatever people he wants, he can literally hire anyone. He just posts something on Twitter and he has like 1,500 CVs coming in the same second. So that's one. And two, he's had his company for like 20 years. It's a stable product. They know what they're doing, and even if they're slowly progressing towards AI, it's a slow move, so they can anticipate what's going on. They can forecast how many resources they need. But like Viktor said, we're not all DHH. Most of us aren't companies like that. Even banks that are now digital banks, the recent ones that started in the past decade, literally don't know what's going on. And I think Viktor said six months; I think most people don't know what's going to happen next month. The pace of AI releases is something our minds cannot comprehend, not at this pace. So in that sense, having something that can grow and shrink is amazing. Shifting back to the Kubernetes world, if I could give one suggestion to anyone listening who is running on Kubernetes and going into AI, it's to invest a lot more in your autoscaling capabilities. Meaning: have automation around scaling, both up and down. Be able to resize, right-size, of course, right-size everything. Make sure everything is as efficient as it can be, but also be able to grow and shrink as things move.
Darin
00:10:46.258
You were talking about a feature in 1.33 that really caught your eye, but you didn't tell us what it was.
Omer
00:10:53.918
You're right. So 1.33 released a very special feature which had been in alpha, and alpha features in Kubernetes basically never get rolled out by the major cloud providers until they go to, I think, either beta or GA, I don't remember. I don't remember the official name of this feature; I think it's in-place resource update or something like that. It basically means that you can take a pod and change its resources, its CPU allocation or memory allocation, on the fly, without restarting the pod, which is crazy. People have wanted it forever. There have been some really huge challenges with it, because it goes to the core of how namespaces work in Linux. It was just moved to beta in 1.33, and then it was released. And it's not the end of the story, because you literally have to change specific pods. You cannot go to a Deployment or a StatefulSet and update it there; that would require a recycle of all the pods and updates of everything. But this actually connects beautifully to autoscaling and right-sizing, because it means you can now go through your pods, whether with a script you wrote or a controller in an operator you built, look at the pods, monitor how much they're using and how much they have been using statistically going backwards, and then update them on the fly, changing them within the node. It's not a guarantee that they will get changed; they're running on hosts, and maybe there's not enough space, not enough CPU or memory. But they will get updated in place if that's possible, without causing any interruption to your system. That's the feature.
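For anyone who wants to try it: the feature gate is called InPlacePodVerticalScaling, and on a 1.33+ cluster it looks roughly like the sketch below. The pod name, container name, and values are hypothetical, and the resize policy shown (CPU without restart, memory with restart) is just one conservative combination.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web            # hypothetical pod
spec:
  containers:
    - name: app
      image: nginx
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired       # CPU changes apply in place
        - resourceName: memory
          restartPolicy: RestartContainer  # memory changes restart the container
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi
```

```bash
# Newer kubectl (1.32+) exposes the pod's 'resize' subresource; this bumps
# CPU on the running pod without recreating it, subject to node capacity.
kubectl patch pod web --subresource resize --patch \
  '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"400m"},"limits":{"cpu":"800m"}}}]}}'
```

As Omer notes, the kubelet only applies the change if the node actually has room; otherwise the resize stays pending.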
Viktor
00:12:32.575
Actually, I didn't forget; I'm always waiting for GA, because Kubernetes changed the rules. I think features only become available after GA. But anyway, that's very exciting, because it finally makes vertical scaling actually useful. Now, we are still missing something, or maybe I'm wrong, so correct me, you probably know much more about it than me: does it mean we are going to move towards finally having vertical and horizontal scaling working together?
Omer
00:13:04.028
Hmm, somehow I find myself in the same conversation every day. I don't know why. This is actually super exciting, because yes, I do think that's the direction. Kubernetes has this concept they call multidimensional autoscaling, meaning you can go horizontally or vertically based on this, air quotes, brain that makes the decision on the fly, automatically. You don't have such a thing today, right? We have the HPA, and then you have the VPA, which is an open-source component you can install, and most companies do, but it has this huge warning on it that says you cannot use the VPA with the HPA on the same metric for the same resource, because it will just create a feedback loop. So I definitely think it's going towards that. Now, back to the concept: the multidimensional autoscaler is just a concept. Kubernetes loves this pattern. They put out a concept, they will literally tell you how to build it, but they won't build it themselves. I don't know why they leave some things to the cloud providers. Like container storage, what do they call it, CSI, for example: it's an interface, and the cloud providers have to implement it, or whoever you are; it doesn't mean you have to be a cloud provider. That was a long answer to say yes, I definitely think that's the direction.
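The usual workaround for that warning today is to split the dimensions rather than wait for a multidimensional brain: let the HPA own replica count based on CPU, while the VPA only manages memory. A minimal sketch, assuming the VPA components are installed and a hypothetical Deployment named api:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources: ["memory"]   # leave CPU entirely to the HPA
```

Because each autoscaler reacts to a different resource, the feedback loop Omer describes never forms.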
Darin
00:14:14.402
I hear horizontal, I hear vertical. Therefore I hear X, I hear Y, but where's the Z?
Darin
00:14:25.502
And time. Or not time. What is Z? Because if we're going to be living in a two-dimensional world in Kubernetes, then I don't have great hope for it over the decades.
Viktor
00:14:37.601
Z is already, correct me if I'm wrong, but Z is already there together with X, assuming that X is horizontal, and has been for a while. Nothing prevents you from saying, okay, the criteria for scaling will be based on metrics gathered over a month or whatever. I can look into Prometheus, create a query, and define a threshold that will scale it horizontally. And that query can be, not can, must be based on a period of time, whether that's the last five seconds or the last five months; it's fully up to you. The problem is that the graph you are visually narrating doesn't have Y. Uh, sorry, yes, Y.
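To make Viktor's point concrete: with a Prometheus-backed scaler, the time window lives inside the query itself. A minimal sketch using a KEDA ScaledObject, assuming KEDA is installed, Prometheus is reachable at the address shown, and a hypothetical Deployment named api:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-memory
spec:
  scaleTargetRef:
    name: api                     # hypothetical Deployment
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # assumed address
        # The Z axis: average working-set memory per pod over the last
        # 15 minutes; swap [15m] for [5s] or [5w] as you see fit.
        query: avg(avg_over_time(container_memory_working_set_bytes{pod=~"api-.*"}[15m]))
        threshold: "1610612736"   # ~1.5GiB, Viktor's "random number"
```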
Omer
00:15:31.835
With that analogy, Viktor, I think what you mentioned is exactly that. For example, the feature of in-place right-sizing shortens the time to my goal. My goal is to right-size things, to autoscale things, and this basically does it in zero time. The time we count in Kubernetes is usually downtime: I'm trying to make a change, and that means something is getting replaced, there's some interruption in the system, or I'm provisioning something new, a new StatefulSet, a new Deployment, a new pod, and there's time to launch it until it serves a healthy health check. So this allows us to shorten the amount of time required to replace those things. And if we're talking about autoscaling, there is scaling out, and there is scaling up and down, which means making the same thing larger, right-sizing vertically. These things take time, because new pods take time to either update or restart or whatever. The closer we get to in-place changes, the more we can shorten that third dimension. That's actually something that came to mind from what you just brought up, but I think it would be perfect if we managed to get everything around that to zero time.
Darin
00:16:42.509
Say we're able to reach this nirvana where everything is magically up, down, wide, not wide. Do I even need to worry about requests and limits anymore on a pod?
Omer
00:16:56.847
I think, just like nirvana in Buddhism, you'll never actually get there. It's just a concept. There's no way to reach it, because, to the point of in-place right-sizing, there are limitations. You're running into the limits of the host, of the node. And if that's not yours and you're using a serverless, air quotes, solution, there are limitations to the underlying platform. And if there are no limitations to that, there's one thing I'm sure everyone will agree on: there are limitations to your monthly bill. So at some point you're going to hit a wall.
Viktor
00:17:30.761
It depends what you meant, Darin. By worrying, did you mean worrying about how much memory I should assign, or worrying about what's available?
Darin
00:17:40.661
Or do I need to assign anything at all? Because, you know, even coming up with those first numbers correctly just seems like a waste of time.
Viktor
00:17:48.386
This is going to be radical, but there is a collective hallucination going on in our industry: you thinking that you know how much memory and CPU something needs. You have no bloody idea. You don't know it on day one, you don't know it on day two, and you don't know it three years later. To me, that's actually one of the mistakes of Kubernetes, in a way: teaching you that you should specify how much memory and CPU something needs while you have no idea. At best, if you're very experienced, you have an approximate guess. You know, kind of, it's Java, so probably four gigabytes. And if it's Go, then let's say one gigabyte of memory, because you know it's not Java. That's how it works.
Omer
00:18:40.270
I'm laughing because the reason I have a day job is that I'm trying to answer this exact question, and to do it dynamically, over time, in Kubernetes, automatically. I totally agree with your assumption. We're letting users decide something they don't know. There's no way for you to learn about something that did not preexist. And on top of that, again air quotes, you cannot change it: if you change it, the pod dies and restarts itself. That's a great point. And by the way, it's not all about CPU and memory, mind you. There's another aspect, especially when we're talking about AI in Kubernetes, which is storage. Now storage, especially if you're connected to a cloud provider like we talked about earlier, can grow. You have to turn on a switch called auto-expand or volume expansion or something like that. And then if you've provisioned 10 gigs and you now need a hundred, it will go to AWS, GCP, or Azure, or whatever you're using, and ask it to please add more. Under the hood it'll add a bunch more disks, but you'll get the same volume. The moment you want to scale it down, though, it won't work. That's another thing I was, and still am, working on solving. Just another dimension to consider.
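In stock Kubernetes that "switch" is the allowVolumeExpansion flag on the StorageClass; after that, growing is just patching the claim upward, while shrinking is rejected by the API. A minimal sketch, with the provisioner as an assumption (it varies by cloud):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable
provisioner: ebs.csi.aws.com    # assumption: AWS EBS CSI driver
allowVolumeExpansion: true      # without this, resize requests are rejected
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  storageClassName: expandable
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi   # later edit this to 100Gi to grow; editing it down fails
```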
Darin
00:20:00.199
It's all about dimensionality, I guess, is what we're saying. Because even if you did know, even if you had observed and knew how to set the values, that's only true until the next spike or the next update of the app, which changes the memory profile, changes the CPU profile, changes something. So what was true an hour ago may no longer be true.
Omer
00:20:24.568
Which is exactly why you have to build automation around it. Making those decisions on, let's call it a cron schedule, will not always work. Sometimes it's okay: like you said, if you're making application adjustments that now require a bit more memory, and it's just a bit and it's growing over time, then in the next cycle, let's say seven days from now, a human would literally go check the Grafana graphs or whatever you're using, make decisions based on that, and update the workloads. Fine, it would work to some extent. However, with the spiky workloads we live with, it won't work. And we have to sleep sometimes. The only thing that will work, especially during off hours, is to create automation around it. If you have something, an operator, a script, whatever it is, that runs constantly in the cluster, measures what the applications are doing, and based on that updates the workloads, especially using the new feature that can do that without even replacing them, that's how you win. I don't see any other way.
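The crudest version of that automation fits in a few lines of shell. This is a naive sketch only, assuming metrics-server, kubectl 1.33+ with the in-place resize feature, and a hypothetical single-container pod web whose limits don't conflict with the new requests; a real controller would add history, hysteresis, and error handling:

```bash
#!/usr/bin/env bash
# Naive right-sizing loop: every 5 minutes, set requests ~25% above observed usage.
set -euo pipefail
POD=web CONTAINER=app   # hypothetical names
while true; do
  # 'kubectl top pod' prints e.g.: web 412m 380Mi
  read -r _ CPU MEM < <(kubectl top pod "$POD" --no-headers)
  NEW_CPU="$(( ${CPU%m} * 5 / 4 ))m"
  NEW_MEM="$(( ${MEM%Mi} * 5 / 4 ))Mi"
  kubectl patch pod "$POD" --subresource resize --patch \
    "{\"spec\":{\"containers\":[{\"name\":\"$CONTAINER\",\"resources\":{\"requests\":{\"cpu\":\"$NEW_CPU\",\"memory\":\"$NEW_MEM\"}}}]}}"
  sleep 300
done
```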
Darin
00:21:26.342
My concern with that is measurements are still probably all that we have. We're looking at disk I/O, we're looking at CPU utilization, we're looking at memory utilization. But those values are too static, and even if you were able to respond quickly enough, is it even good enough?
Omer
00:21:48.478
If I'm reading between the lines, I'm guessing you're saying we don't have enough context. Which begs the question: what if we had eyes inside the container, into what the application is doing? Viktor mentioned Java, so if I could see the JVM and its configuration, or if I could see the Node process, if I could see whatever the application is doing, then I would have a bigger context window and could use that. It's a good question. I don't think I have an answer, but maybe that's part of the world we're going to. You'd have to have more context.
Darin
00:22:24.214
What do you think about that? Is it all AI nowadays? Is that what we're headed towards?
Viktor
00:22:29.640
I mean, AI for sure. Well, probably not for everything; we can have a long discussion about how useless or useful AI is. But one thing I'm surely convinced of at this point: AI is pretty good at crunching data. It might be good or bad at giving you suggestions, kind of, hey, write this line of code, or do this, or do that; that can be a discussion. But at crunching data it's definitely better than us. So it mostly depends on how well we can explain to AI what to do with that data, or maybe not explain it, but have some automation behind it, or whatnot. But crunching data, yeah, definitely. Because when you think about it, at least in the past, we were doing some relatively unreliable things. Hey, I'm going to write a query for Prometheus, and that query will say, okay, what is the average memory utilization over the past 15 minutes? And if that is above 80% of what I specified there, and what I specified there is completely random, by the way, so if the last 15 minutes of utilization is above 80% of the random number that we put in, then you scale up, vertically or horizontally, whatever. It's a pretty unreliable way to do it, and it's very hard for us, at least for me as a person, to write a really comprehensive query. You know, kind of, I will write a 500-line query that actually takes into account both short-term and long-term memory usage and spikes and all this stuff. It's close to impossible for me as a person to write that. So I need to stick with the last 15 minutes, or whatever the period of time is, as the average. That's as far as I can go. And I would be very surprised if AI cannot improve on that, even if it's only 10% better.
Omer
00:24:38.260
So what you're suggesting is that AI improves the infrastructure on which AI runs, but then in turn, the new AI that improves it would need more infrastructure, because it would need more resources.
Viktor
00:24:52.083
I was not necessarily talking about scaling AI workloads now, but scaling in general. But yes, including itself.
Darin
00:25:00.115
We're poking at AI. That's just what we have to do in 2025. If I were a startup today, and again, I was thinking, what would I do in the next five years? Okay, a startup doesn't have five years. A startup has maybe five months. What would you prioritize at this point?
Viktor
00:25:21.564
Can I answer that? Can I, please, please, please? Are you a startup funded by VC money?
Viktor
00:25:28.399
You'll not get money if you're not doing AI-something. It doesn't matter how silly the idea is, doesn't matter how useless that something is: you will not get financing, you will not get money for your startup, if you're not inventing something ridiculous related to AI.
Darin
00:25:51.709
Okay, that's enough of that. What would you prioritize, though? I mean, okay, let's bring it back out, like one year, two years. What are you going to prioritize? Because in this world now, at least as of 1.33, I know that I can right-size a pod as long as I've got enough resources on the host. Correct?
Darin
00:26:21.010
So if I can do that in place, it seems like now I'm not chasing after the smallest anything. At what point do I stop over-optimizing and just take a good, air quotes, sized machine host to run my workloads on? I mean, how do I even decide? Because we were talking about optimizing pods, but now it's like, okay, what I really need to do is optimize the machines those pods are running on. I've just pushed the problem out a little bit further.
Omer
00:26:55.848
I think that's just another layer of the same system. When I speak about autoscaling, at least, I'm always considering the nodes underneath, because they're just part of the system. Pods are analogous to boxes. If you want more boxes in the same crate, the crate doesn't need to be bigger exactly, but it needs to hold a number of these boxes, and you need to play with the numbers for the boxes to fit. If there are two boxes and one is a little bit too big to fit both on the same machine, but the machine is still large enough to hold one, then you're wasting space. So there needs to be another statistical game here, deciding which are the perfect nodes, and not just in size. You know, instances can be memory-intensive, they can be geared towards CPU, they can be geared towards GPUs, coming with Nvidia or whatever other cards you can put on them. So another part of that automation is learning what kinds of instances you need, what sizes you need, and changing them over time. There are platforms and tools for that, like Karpenter or the Cluster Autoscaler that most companies have, and you use those to decide which nodes to run. You started by asking when you should stop over-optimizing. The word "over-optimizing" specifically suggests something negative; you don't want to over-optimize, because it means you've spent too much time doing something you probably shouldn't. But when do you stop optimizing? I think pretty much never. It doesn't mean that you as a person, as an engineer, have to physically go and manually make changes, but it does mean you want to have something that does it for you. And this leads me to your earlier question about a new startup, what would I invest in? That's naturally my own opinion: outsource as much as you can. If you're a new startup, especially in the AI field, you don't want to mess with autoscaling, with expanding or shrinking volumes, or with right-sizing your pods. That's my take on it. You can use the open-source tooling that's out there, that's great, but then you have to have someone not only install it, but maintain it and make sure it works properly, and it won't take you all the way. Like we mentioned, it won't right-size the way you think it will. That's my take.
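For reference, the "statistical game" of node shapes is exactly what Karpenter's NodePool expresses. A sketch assuming Karpenter's v1 API on AWS, with example values only:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                # hypothetical EC2NodeClass
      requirements:
        # Karpenter picks the cheapest shape that fits the pending pods
        # within these bounds.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]    # CPU-geared, general, memory-geared families
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized   # repack and shrink over time
```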
Darin
00:29:15.428
So let me bring the storyline back to DHH for just a minute. What you just said was: if I'm a new startup, I don't want to run Kubernetes myself, let's put it that way. I don't want to run it myself. I just want something to run whatever my app is. Then over time, I might want to run my own thing, because the other thing is costing me more money. And then if, by the grace of God, we make it beyond five years, then we might start thinking about pulling it back into our own data center.
Viktor
00:29:50.486
I want to push back on that "costing more money". I don't believe that cloud costs more money. Very often it doesn't. It costs more money only if you compare hardware with hardware and say, okay, I have a hundred servers that I bought, and a hundred servers in AWS cost me this much. You're completely ignoring that, first of all, you're paying for the maximum, whether you're using 50% of it or 100%. And more importantly, you're ignoring people. What is the average expense for a person, like $200K a year, $300K a year? With all the overhead, higher. And usually companies ignore that. It's kind of, okay, a server costs this much here and that much there, but the fact that I need to hire 20 more people just to manage my own data center, that is somehow not a cost.
Omer
00:30:48.956
And also, as a new startup, good luck finding someone who doesn't want to work in the cloud and wants to run a data center instead, which sounds archaic. Most of their friends want to have Kubernetes and AWS on their CVs, not working in a basement in a data center.
Darin
00:31:05.902
Oh, you just hit a point that ticks me off. Let's stay here for just a minute. CV-driven development. Come on, people. This has to stop. Will it ever stop?
Viktor
00:31:26.390
It's not only the CV. Let's say that you're a cook. Would you say, oh, it doesn't matter, I can work in a food truck or I can work in a restaurant, it doesn't matter to me at all? Of course it matters. It's not only CV-something-something; it's that I want to work in a good company, a company that will help me grow my skills. Yes, those skills eventually go onto my CV, of course they do, but it's simply natural that you don't want to work in a food truck and you do want to work in a proper restaurant, if you were a cook.
Omer
00:32:07.346
It's even worse in tech, because the game is rigged against you. Now, I would be the first one to say: ignore the CV, do what you like, go where your passion is. But if you want to work at a better company later, the game is built against that. I mean, LinkedIn is built that way. They have automation that scans keywords in your profile to learn whether you are a potential match as a candidate for a company. There's no way around it. So if that's your goal, if that's where you want to be, if you want to move laterally or gradually towards larger, more famous names, there's no way around it. It doesn't mean you have to ignore all options that don't include the cloud or Kubernetes or, I don't know, whatever the next hype is. But you do have to consider it.
Darin
00:33:00.453
What I will concede is that it's fine to put things on your CV. My problem is the job hoppers: okay, I spent two months on this, I got it on my CV, and now I'm going somewhere else. And okay, maybe not two months, maybe just a year. My concern is that these are the people who are going to be running the infrastructure for decades to come, unless they happen to go work at Meta or something, where they get a hundred million, two hundred million signing bonus paid out over five years. You know, I don't know where this is going, especially throwing AI back in there. Now we've got C-levels saying, hey, let's fire everybody, let's just replace it all with AI automation, and everything will be fine.
Omer
00:34:00.811
I have to say something I believe in in life, and it doesn't only have to do with tech, it's everything: regardless of what happens, there will always be demand for quality. Say the people of the next five or ten years are going to be the people who, after two months with Kubernetes, decide they're experts. No one is an expert in Kubernetes. No one. Even the founders; there were four core founders, and one of them is an expert in API machinery but has no idea how storage and scheduling work. So what I'm saying is, if these are the people, the ones who work two months with a new technology and now they're experts, and they're the ones who build the infrastructure, at some point it will break or get breached, something will happen, and then they'll go find counseling. They'll need the advisors. They'll find the professional, the high quality, the name they know they can trust.
Darin
00:34:54.706
So what you're saying is we're actually living in Jurassic Park and life will find a way.
Darin
00:35:08.637
Just don't go to the toilet out in the woods when there's a T-Rex there, because that's a bad, bad thing to do.
Darin
00:35:14.831
I want to go back to the memory and CPU thing for a second, because it got a little too dark. What else do we need to be tracking besides memory and CPU? Those are the basics; it seems like we should always start there. But the keyword there is "start". There has to be more in order to understand, especially breaking out this 1.33 feature, because none of us can remember the name of it, and even if we did, it'd probably be too long to say, so we're calling it "the 1.33 feature". What are the other things we really need to know? It seems like, sure, if I've got space and I can vertically scale, I'll take that. That's a good thing.
Viktor
00:35:56.572
Here's a question I have. All of that, the vertical scaling without restarts, without relocation of pods, assumes that we don't actually have right-sizing, in a way. Because imagine an ideal world where my pods are right-sized and my nodes are right-sized, and there is very little, in the ideal situation close to no, waste. That means that changes to memory and CPU that do not require pod restarts will never work. I know it's a world that doesn't exist, but in an ideal world I cannot change memory and CPU, at least cannot increase them, without a restart, even if that feature exists.
Omer
00:36:55.972
Yeah, which means you need to right-size the other layer. But I want to pull a thread for a second, from that question back to a previous question we didn't deal with. It will sound strange, but bear with me. We asked whether new companies even have to deal with Kubernetes, or should use managed Kubernetes, or whatever. What I want to say is, sometimes we overthink things and overcomplicate things. I mean, I work at a company where that's all I do all day, right? I build right-sizing features for everything, especially in Kubernetes. That's what I do. But sometimes we overcomplicate things. New companies tend to jump on Kubernetes because of FOMO, the fear of missing out on new technology, which connects back to the CV-driven development or hiring or whatever you want to call it. Back to the resources part: CPU and memory are great. They're the main two metrics to look at. You can use additional context, but you don't always have to, and putting that aside, more context is hard to get. You definitely have to consider whether it's worth it. If your application shouldn't scale based on CPU and memory but based on the length of a queue or another application-based metric, you can use things like KEDA, the open-source layer that helps you do that: it drives your horizontal autoscaler and lets it scale based on other metrics. Again, back to the startup world, I wouldn't invest so much energy and time figuring that out and fine-tuning things if they don't correlate directly to what I do and what pushes me forward. I made a complicated argument here, but I hope the message cut through.
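Scaling on queue length with KEDA looks something like this: a minimal sketch assuming KEDA is installed, a RabbitMQ broker at the address shown, and a hypothetical worker Deployment consuming the queue:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker
spec:
  scaleTargetRef:
    name: worker              # hypothetical Deployment
  minReplicaCount: 0          # KEDA can park the workload at zero replicas
  maxReplicaCount: 50
  triggers:
    - type: rabbitmq
      metadata:
        host: amqp://user:pass@rabbitmq.default:5672/   # assumed broker URL
        queueName: jobs
        mode: QueueLength
        value: "20"           # aim for ~20 messages per replica
```

Under the hood KEDA feeds this metric to the HPA for you, which is the "drives your horizontal autoscaler" part.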
Omer
00:38:47.596
Don't overcomplicate, don't overthink. Sometimes the 80% you already have in place will do the job.
Darin
00:38:54.631
Okay, that's a nice bow on top. Thank you, that helps. And I did take offense at the idea that we overcomplicate things. We're in tech. We don't overcomplicate things; it's just what's necessary. That's a joke, by the way.
Omer
00:39:12.210
If you had comments on here, people would say that's exactly what Kubernetes is. That's all it is: a complication of technology. It's not what I think.
Viktor
00:39:23.370
Whenever I hear those comments, I go into offensive mode. I dare anybody to show me a less complicated, easier way to do what Kubernetes does. You might not need it; that's a different thing. Do you need it? Maybe you don't. But there is no easier or simpler or less complex way to do what Kubernetes does. It doesn't exist. Sorry, it does exist, but I'm excluding special services like Google Cloud Run. It exists as a service, yes. But as something you can run yourself, it doesn't exist.
Omer
00:40:05.865
You know what, I'll bite. I think there are simpler frameworks and systems. But, and that's a big but, it's not about the complexity; it's whether you need the features and the community and the ecosystem that Kubernetes suggests, or sorry, provides. If you've ever looked at the CNCF landscape, that alone tells you how broad the ecosystem is. If you don't need a lot of that, and some companies don't, I love ECS on AWS, especially on Fargate. It's so simple. Everything's connected. If I have two applications that need a couple of replicas, that's the first place I'd go. So simple: no need to instantiate anything, it just works, connected to CloudWatch, everything's beautiful. The moment you need something complicated, something that Kubernetes knows how to do, that would be the first pain, and that would be the first signal. But like you said, not everyone needs it.
Viktor
00:41:04.478
Yeah, exactly. But if you do need it, I can easily, without thinking, come up with five things that ECS doesn't do and Kubernetes does. And you can probably do them in ECS, but then you'll start jumping into more complication than with Kubernetes.
Darin
00:41:26.229
What are some of the common mistakes people are making today when they're just trying to get started with right-sizing? Let's simplify it a bit: they're just trying to get started. What are the handful of things that everybody gets wrong?
Omer
00:41:39.637
I think the first, easiest one is just not doing it. Not being aware of your options. Most companies, I think, tend to go with horizontal scaling, which is more understandable, and they seem not to trust vertical scaling, for a good reason: it's more interruptive, more risk to the process. But sometimes you'd actually benefit more from having the VPA operating in your cluster. So what I'm saying, if I need to wrap everything into one line, is: have the information first, learn about the capabilities, and then make a decision. Too often I see companies literally not knowing about the options, just doing something because the LLM said so. They logged into ChatGPT or Claude, asked it to help them autoscale Kubernetes, and used the output.
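A low-risk way to "have the information first" is to run the VPA in recommendation-only mode, where it observes and suggests but never evicts or resizes anything. A minimal sketch, assuming the VPA components are installed and a hypothetical Deployment named api:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-recommender
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "Off"   # recommend only; take no action
```

After a few days, kubectl describe vpa api-recommender shows the lower bound, target, and upper bound it would apply, which you can compare against your hand-written requests before trusting any automation.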
Darin
00:42:30.851
Is that where our lives are headed? We get output from something that's generated that may or may not be correct, and we copy and paste and go.
Viktor
00:42:43.291
Can I propose an alternative, and you tell me whether that's any better? Stack Overflow.
Omer
00:42:52.441
Exactly what I had in mind. And we had it for years; we just didn't call it that.
Viktor
00:42:58.741
If you don't know how to do something and AI helps you, it'll probably not make it worse than you not knowing how to do it at all. Before AI, you would either not do it, or you would go to Stack Overflow and copy-paste. The only difference between Stack Overflow and AI is that one is faster than the other.
Omer
00:43:20.995
Which, by the way, means that most models we currently have are trained on answers from Stack Overflow, when we're talking about code.
Darin
00:43:32.260
I can never get away from Stack Overflow, is what I'm hearing. Stack Overflow is always going to be the true way of doing development going forward, because it's been immortalized in a model somewhere.
Viktor
00:43:45.331
It makes perfect sense, because models cannot really be trained only on code. If you say, hey, I have a model that digested every single GitHub repo, and I'm talking about code here, you won't get far. You need some kind of opinion, some kind of ranking, some kind of idea that A is better than B. And you get that through rankings on Stack Overflow, through comments, and so on and so forth. So Stack Overflow is potentially more valuable than code itself, sorry, than GitHub repos, not as a source of knowledge but as a source of ranking things. Now, how reliable Stack Overflow rankings and comments are, that's a separate discussion. But it's better than nothing.
Omer
00:44:36.826
Absolutely, I agree. I think every major AI company has their own army of, I forget the technical term, people who adjust the answers given by the LLM to help it. There's a technical term for that, but they help it manually, helping it make better decisions and better answers. With Stack Overflow it's open: you can literally see who voted up, who voted down, you can see what they said. With AI, it's hidden from you. You can't really see it.
Darin
00:45:04.639
But isn't that the same thing we're doing with auto-scaling our pods? We can't see what's doing it.
Omer
00:45:10.441
In a way, I think it's not the same, because with everything in Kubernetes, if it's an application, a controller, whatever is running in your cluster, you can view the logs and see what it's doing. It'll be collecting something you already have: if you're running Prometheus or whatever other system for collecting metrics and monitoring, you have the information. If you want to make better decisions, it's up to you. If you want to kill it and start something new, or build your own, it's yours. That's part of the magic of Kubernetes. It was built in a way that helps you extend it and do whatever you want. So in a way, I think that means Kubernetes is one of the most open platforms you can look at and learn from.
Darin
00:45:49.578
Now, we said at the top that you work for Zesty. What does Zesty have that can help us with this?
Omer
00:45:56.613
So I hinted at it earlier, but basically we're building a platform with a few components that help you do exactly that. You can right-size volumes, which natively you can only partially do, only upwards; you can't shrink volumes, so we do that. We help you right-size containers, and by right-sizing I mean we do both, horizontally and vertically. We'll do the VPA's work while staying aware of the horizontal scaling, so that they don't conflict. That's another big pain point we solve. We have a bunch of additional features. We can help you get instances faster: if you have something like Karpenter, and for context, Karpenter, like the Cluster Autoscaler, adds more nodes to your cluster when you have unschedulable pods that need more hosts, this can take time, because it talks to your cloud provider and launches new instances, which in turn need to download the images, unpack them, yada yada yada. This takes time, sometimes up to 40, 45 seconds. We have another mechanism that hibernates nodes with the images already on them and then gives them back to you way quicker, like 15, 20 seconds. So in essence, it's a platform that attacks a bunch of concerns: right-sizing, autoscaling, quicker scaling, and, connecting to our multidimensional discussion, the time dimension. We help you shorten the time in that sense.
Darin
00:47:24.914
Now, Zesty can be found at zesty.co. That's Z-E-S-T-Y dot co. All of Omer's contact information will be down in the episode description. Omer, thanks for being with us today.