#69: Google Cloud Run vs Azure Container Instances vs AWS ECS. We discuss the pros and cons of each Containers as a Service solution in today’s episode.
If you like our podcast, please consider rating and reviewing our show! Click here, scroll to the bottom, tap to rate with five stars, and select “Write a Review.” Then be sure to let us know what you liked most about the episode!
Also, if you haven’t done so already, subscribe to the podcast. We're adding a bunch of bonus episodes to the feed and, if you’re not subscribed, there’s a good chance you’ll miss out. Subscribe now!
Viktor Farcic is a Principal DevOps Architect at Codefresh, a member of the Google Developer Experts and Docker Captains groups, and published author.
His big passions are DevOps, Containers, Kubernetes, Microservices, Continuous Integration, Delivery and Deployment (CI/CD) and Test-Driven Development (TDD).
He often speaks at community gatherings and conferences (latest can be found here).
His random thoughts and tutorials can be found in his blog TechnologyConversations.com.
Now, if Google has the best Kubernetes service and everybody's trying to catch up with that service, then it's normal that everybody is focused on improving their Kubernetes service while Google has a luxury of creating containers as a service already on top of it.
This is DevOps Paradox episode number 69. Is Containers as a Service Serverless?
Welcome to DevOps Paradox. This is a podcast about random stuff in which we, Darin and Viktor, pretend we know what we're talking about. Most of the time, we mask our ignorance by putting the word DevOps everywhere we can, and mix it with random buzzwords like Kubernetes, serverless, CI/CD, team productivity, islands of happiness, and other fancy expressions that make it sound like we know what we're doing. Occasionally, we invite guests who do know something, but we do not do that often, since they might make us look incompetent. The truth is out there, and there is no way we are going to find it. PS: it's Darin reading this text and feeling embarrassed that Viktor made me do it. Here are your hosts, Darin Pope and Viktor Farcic.
Now last week we talked about the new edge features that are now available from Docker, but if you've been following along with the catalog course, either by book or by video, you've seen that we've released everything. Correct? By this point we've released everything about containers as a service. Oh, we haven't yet. Well, I
I know, sorry, by this time, what
by this time, the, this is, so this is releasing on. You would ask me this question. I should always write it. This is August 19th.
19th. We're we're done.
I have a problem with time travel. I always get confused.
Yes. Um, So we are recording this one a little bit early because it is holiday time for lots of people, including Viktor. Notice, I didn't say myself, um, not that I'm bitter about it or anything. The containers as a service. This is our sort of wrap up episode for that, at least as the head to head comparison. Google Cloud Run versus ECS Fargate flavor and ACI. Let's go through the pros and cons. I think basically, if you were to do the tl;dr on this, is it Google Cloud Run today?
Oh, without any doubt. If you would ask me, it's not even a question of which one is better. It's: can any of those be called containers as a service and be considered serverless computing at the same time? If you would frame it like that, then actually Google Cloud Run is the only one that fits that description today. The other two allow you, in very different ways, to do something with container images, but they're either not really CaaS or they're not serverless.
Before we go any further, let's just remind everyone that's listening that we're talking only about GCR, ECS and ACI today. We're not talking about anything else. So no hate mail about product X does this or product Y does this. We're talking about just these three today.
Managed and I repeat managed containers as a service in big cloud providers.
Yes. Not self managed. Not, not small managed. Right? So that sets the frame. Now you've already professed your love for everything Google Cloud Run, but there has to be something bad about it.
Or maybe not bad, but a weakness.
Yes, actually, there is one thing now that I think about it: I do get a slightly higher percentage of requests without a response, or slightly lower availability, when running a relatively small load. So, if you're talking about, let's say, thousands of concurrent requests, not higher than that, then with Google Cloud Run I'm not really at a hundred percent availability, but rather at two nines after the decimal point, right? 99.99 or 99.98 or something like that. So that's probably the only negative thing I can say about it.
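To put the availability figures above into arithmetic, here is a minimal sketch. The function name and the request counts are hypothetical and are not taken from Viktor's tests; the only thing it shows is how a percentage like 99.99 falls out of a count of failed requests.

```python
def availability(total_requests: int, failed_requests: int) -> float:
    """Return the percentage of requests that received a successful response."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    return 100.0 * (total_requests - failed_requests) / total_requests


# Hypothetical load test: 10,000 requests, one or two of them without a response.
print(f"{availability(10_000, 1):.2f}%")  # 99.99%
print(f"{availability(10_000, 2):.2f}%")  # 99.98%
```

At this scale, the gap between 99.98 and 100 percent is a couple of requests out of ten thousand.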
But wouldn't you expect, especially at something that low wouldn't you expect to get 100% availability?
I mean, I would expect, I would hope for, but the trick is that it is serverless, meaning that if I don't run anything and then start bombing it with, let's say, a few thousand concurrent requests, then it spins up an instance or two or five or 10, queues my requests and then forwards them further. But if I didn't design my application well, and in this particular case I'm talking about, let's say, not implementing proper health checks and what so not, so that Kubernetes behind the scenes knows when the application is ready, then I will get some failed requests because the application just started, Kubernetes thinks it's available and starts sending requests to it while the process is actually still about to start. So you can blame it on me not having properly tuned the definition of my application. But in the trials that I did, I did it more or less intentionally, because those are mistakes that many are going to make at the very beginning anyway. So I wanted to test it from the perspective of a user who hasn't already spent half a year with it.
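The health check idea described above can be sketched generically. This is not Cloud Run or Knative configuration. It is a minimal illustration, using only the Python standard library, of an application that reports "not ready" until it has finished starting. The `/ready` path, the `app_ready` flag and the `start_server` helper are all assumptions made for the example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical flag: set once the application has finished initializing.
app_ready = threading.Event()


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/ready":
            # 503 signals "do not route traffic here yet"; 200 signals "ready".
            status = 200 if app_ready.is_set() else 503
        else:
            status = 404
        self.send_response(status)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(port: int = 0) -> HTTPServer:
    """Serve the probe endpoint in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ReadinessHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A platform that polls such an endpoint before forwarding traffic would hold requests back until `app_ready.set()` has been called. Without a check like this, the first requests can land on a process that is still starting, which is the failure mode described above.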
And that's the reason why I called this one out way up front, because that is also a weakness of both ACI and ECS as well. Your numbers were a little bit better. You had worse numbers with GCR, but you were talking very small. Second, third, fourth decimal point number worse. Right? So it's very low difference. But to believe that you won't have any failures is shortsighted, even with it being serverless.
But that's the thing. Uh, it's not necessarily even serverless because, so what is happening with, let's say with Azure, uh, you are running a replica of your container image or container at all times. Not two, not three and not zero. One. Right? So there is no potential loss of requests due to the need to initialize to start your application when requests are coming in. Right. It is permanently running. So you will not. There is no confusion. No. Are you running, not running, should I start sending requests and stuff like that? Right? There is no gateway of sorts on top of it, there is almost nothing because nothing is needed. You are going to get a hundred percent or close to a hundred percent availability until you reach the point where a single replica cannot handle your load. And then you're going to see failures big time because it doesn't scale. It doesn't scale below, nor above one replica. So it's kind of unfair comparison. Right? When I say Google shows less availability, uh, let's say with thousand concurrent requests, then if I would increase that to 100,000. I would get completely opposite picture because Azure container instances would collapse completely because one replica cannot handle that load. I mean, imagine that it cannot, because it really depends on your application, how much memory, CPU assigned. But when you reach the point that a single replica cannot handle your load, Azure container instances collapse immediately.
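The single-replica argument above can be put into a toy model. It assumes, purely for illustration, that each replica can serve a fixed number of concurrent requests and that everything above that number fails. The capacity figure is hypothetical; as noted above, the real limit depends on the application and on how much memory and CPU are assigned.

```python
import math


def failed_requests(concurrent: int, per_replica_capacity: int, replicas: int) -> int:
    """Requests that cannot be served when the replica count is fixed."""
    return max(0, concurrent - per_replica_capacity * replicas)


def replicas_needed(concurrent: int, per_replica_capacity: int) -> int:
    """Replicas a horizontally scaling platform would need for the same load."""
    return math.ceil(concurrent / per_replica_capacity)


CAPACITY = 5_000  # hypothetical concurrent requests one replica can handle

for load in (1_000, 100_000):
    print(
        f"load={load}: "
        f"failed with one replica={failed_requests(load, CAPACITY, 1)}, "
        f"replicas needed to serve all={replicas_needed(load, CAPACITY)}"
    )
```

Under these assumptions, the fixed single replica loses nothing at a thousand concurrent requests and most of the traffic at a hundred thousand, which is the flip in the picture described above.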
so that is a weakness, I guess we're going through and talking weaknesses first. Um, that's a weakness of ACI is Highlander. There can only be one.
Exactly. I could maybe even argue that, you know, pay for what you use could be one of the most important, if not the most important, aspects of being serverless. Run when needed and pay for whatever your users are using. If you would define serverless like that, then ACI already drops out of any attempt at being serverless, because you're definitely not managing my infrastructure beyond the servers: you're not scaling my application, you're not making it highly available. You're just giving me a server for free. Not for free as in I don't pay you money, but for free as in I don't manage it. But it's not pay per use if there is no scaling, no high availability, none of those things. So you could probably discard it as being serverless. That does not mean that it is not containers as a service, because they definitely provide you a service for your containers, but it's not serverless.
So we've covered weaknesses for both ACI I'm sure there's more weaknesses for ACI, but that's the glaring. That's the glaring one. Is there another glaring one that we should call out?
Both ACI and ECS are something I wouldn't use, to be honest, in the context of me searching for containers as a service that is, or resembles, the serverless model, but for very, very different reasons. ECS is too complicated for my taste. Too complicated from the perspective of serverless, right? Because serverless is supposed to simplify things: here's my code, here's my container image, what so not, do something about it. Don't ask me too many questions. So ECS fails big time on that one. Yes, it is easier than not using it at all, but it's still not easy in any form or way. Horizontal scaling in ECS is horrifyingly complicated and unfulfilling, but at least it exists. So that's something, but again, it doesn't go to zero. When I say ECS cannot, yes, I know that you can set up what so not on top of it. Everything can be done, but out of the box, it doesn't scale to zero. So basically I will be running my container permanently. It could be configured to go up and down dynamically, but down is never going to go below one, and assuming that I'm a responsible person, that would mean that it would never go below two, because I just want to make sure that if one of them fails, the other one is going to pick up until AWS recreates it and what so not. So I'm always paying for a minimum of two containers. Now, for some that might not be an issue, because if you have more or less constant traffic, then why not? Why would you go to zero? But I think that that would discard it from being considered serverless as well. I don't know if you're noticing a theme: the problem with ACI is that it is not supposed to be used for production, while the problem with ECS is that it is too complex from the perspective of something as a service, right? Too many things you need to create.
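The "minimum of two containers" point above can be turned into a rough calculation. This sketch counts replica-hours only and ignores real pricing models such as per-request or per-second billing. The function name and the traffic pattern are hypothetical.

```python
def billed_replica_hours(
    busy_hours: float,
    replicas_when_busy: int,
    min_replicas: int,
    hours_in_period: float = 24.0,
) -> float:
    """Replica-hours paid for in one period, given a floor on running replicas."""
    if not 0 <= busy_hours <= hours_in_period:
        raise ValueError("busy_hours must be between 0 and hours_in_period")
    idle_hours = hours_in_period - busy_hours
    # While busy we run whatever the load needs, but never fewer than the floor.
    busy_cost = busy_hours * max(replicas_when_busy, min_replicas)
    # While idle we still pay for the floor; with a floor of zero that is nothing.
    idle_cost = idle_hours * min_replicas
    return busy_cost + idle_cost


# Hypothetical day: traffic for 2 hours needing 3 replicas, idle otherwise.
scale_to_zero = billed_replica_hours(busy_hours=2, replicas_when_busy=3, min_replicas=0)
floor_of_two = billed_replica_hours(busy_hours=2, replicas_when_busy=3, min_replicas=2)
print(scale_to_zero, floor_of_two)
```

With constant traffic, the two results converge, which matches the remark above that a floor of two is not a problem for everyone.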
Now, I would rather say that ECS could be containers as a service provided not by Amazon, but by your infrastructure department, if they do all the heavy lifting that needs to be done in ECS to make containers work. So to do the work on top of the work of ECS then it could be containers as a service or serverless from developer perspective. I think that I confused the hell out of everybody, including me.
Well, that's sad because I completely followed it all the way through. Your tests at least specific let's, since we're sort of edging towards ECS, your tests did not include the new copilot project at the time you did the initial tests. If you've watched our live streams, you've heard me gush about copilot. Now, it's still very early. But as much as I believe that the Docker compose things are promising, copilot specifically for ECS, I believe is highly promising because it is getting us closer to that, here's my image, go run it. They're not there yet with image, but go with me a minute because it will spin up everything that I need. Now again, my current problem with copilot is if I already have a VPC and I want it to run in that VPC, I can't do that. But the other side to that is if I'm doing something as a service, I expect all of that infrastructure to just be provided for me. With Google Cloud Run, did you have to spin up anything or did you just give it an image?
The only thing I had to do is create the project. You know Google Cloud everything is inside the project and to enable few services that, again, that's always Google Cloud. You need to tell it, yes, I allow myself to use Cloud Run. Other than that, nothing. Only the project.
And with copilot, if you're willing to let it be copilot, under the hood, it's creating CloudFormation. So it'll create VPC and create subnets and all the other things. And if that works for you, great, that's getting closer. But in a lot of enterprises today, it's not going to cut it. But for a startup, it may be fine.
or a smaller company. Right? If you don't have all the constraints, that will be a possible solution for you.
That does not necessarily make it less safe, less secure. In this day and age, defaults are often very good. It's rather that, you know, copilot and Docker, which we discussed previously, might simply not be injectable into your existing rules of the game, and that can be the biggest stumbling block for adoption of either of the tools, at least within companies.
My thing again with copilot is companies should get over it and just do what the tool does because this gets back to the point of the people that are developing copilot, it's not some third party developing copilot. It's AWS developing copilot, and you're not going to get anyone smarter than the people at AWS building your infrastructure for you. Period. So that's that. Okay. So we've talked some negatives. Sort of did this backwards. We always do things backwards. Let's talk about the positives. What are the strengths of each of the three? Let's start with ACI because it seems like it's the 98 pound weakling here.
So instead of actually starting with ACI, let's go kind of feature by feature
okay. Let's do the features. Yeah.
You want it to be easy to use, right? And ACI and GCR, Google and Azure, are extremely easy to use. It cannot get easier. It's a single command that just does the magic. ECS requires usage of CloudFormation, Terraform, and all that stuff. Not having to manage your infrastructure, that could be the second one, right? Again, Azure and Google. Amazing. Just do the job for me. Don't bother me with stuff. Now, the third one could be horizontal scaling, and this is something that actually only Google has. Right? So Google is amazing at that. It goes from zero to something, from something to something else, and from something else back to zero. In ECS, scaling is still an option. It's a thing, but it doesn't go to zero. So that's why I don't give it such a high ranking. Is it some kind of full or partial open standard? It doesn't necessarily have to be open source, but something open, where you know what it is, so that you can protect yourself if you ever want to switch somewhere else. Azure is amazing in that aspect. It just asks you for an image. There is no lock-in in a way, right? Give me your image and run stuff. If you want to go somewhere else, go. Google. Amazing. It's based on Knative. It's completely open source. They're just giving you a layer on top of it. A very, very thin one. Up points for Azure and Google again. Negative one for ECS. But when we jump into availability, that's where ECS shines. That's where Google shines. Production ready? ECS. No doubt. Only ECS. Even Google, I would not necessarily give it the complete full point for production ready, if for no other reason than because Knative, which is what they base their service on, is still, at least officially, not version one. It's zero point something. It is relatively new.
So if production ready in your head means it's been running in production by somebody else for over a year or two, then it's not there, but it's almost. And I already ruled Azure out as production ready; unacceptable in any form or way. So yes, you said positive points. Simplicity of use: positive points for GCR and ACI. Hands-off infrastructure: same suspects. Horizontal scaling: GCR amazing. Open source, open standard, what so not: Azure gets a point, Google gets a point. Availability: Azure loses the game, the other two are good. Production readiness: again, Azure loses big time, and the other two are really good. So if you consider how it's distributed, the parts where Azure shines are the parts where ECS fails, and the other way around. Whatever ECS is good at, Azure is really not. And then comes Google and just swoops in and is decently good or excellent or the best in almost every aspect. And think about it like this. This is my line of thinking, again, not official. Google has, without a doubt, the best Kubernetes service in the market from my perspective. I'm sorry, partners and whatever, but that's simply how it is. Now, if Google has the best Kubernetes service and everybody's trying to catch up with that service, then it's normal that everybody is focused on improving their Kubernetes service while Google has the luxury of creating containers as a service already on top of it. What I'm really trying to say is that Google has the luxury of going a step above where everybody is, because it is already in that place.
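The feature-by-feature rundown above can be tallied as a rough scorecard. The numeric values are an assumption made for this sketch, not scores given in the episode: +1 for a clear positive, 0 for "partial" or "almost", -1 for a clear negative. Where the conversation did not state a rating outright, for example ECS on hands-off infrastructure, the value is an interpretation.

```python
# Assumed encoding: +1 positive, 0 partial or "almost", -1 negative.
SCORES = {
    "ease of use":              {"GCR": 1, "ACI": 1,  "ECS": -1},
    "hands-off infrastructure": {"GCR": 1, "ACI": 1,  "ECS": -1},  # ECS value is interpreted
    "horizontal scaling":       {"GCR": 1, "ACI": -1, "ECS": 0},   # ECS scales, but not to zero
    "open standard":            {"GCR": 1, "ACI": 1,  "ECS": -1},
    "availability":             {"GCR": 1, "ACI": -1, "ECS": 1},
    "production readiness":     {"GCR": 0, "ACI": -1, "ECS": 1},   # GCR: "almost"
}


def totals(scores: dict[str, dict[str, int]]) -> dict[str, int]:
    """Sum each service's score across all criteria."""
    result: dict[str, int] = {}
    for per_service in scores.values():
        for service, value in per_service.items():
            result[service] = result.get(service, 0) + value
    return result


for service, total in sorted(totals(SCORES).items(), key=lambda item: -item[1]):
    print(f"{service}: {total}")
```

Changing the encoding shifts the exact totals. The pattern it is meant to show is the one described above: ACI and ECS are strong in opposite places, while Cloud Run is positive or close to it on almost every criterion.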
It just doesn't have the same adoption.
yes. um, From adoption point, if you would categorize them from adoption, then bye, bye Google. Rarely anybody's using it among big companies, uh, How much that will change, that's the, that we'll see or not. But yes, low adoption, amazing service.
It's the Betamax versus VHS conversation all over again.
Superior product. Nobody wants to use it, but that's okay.
uh, I think that the word wants is not necessarily true. Uh, if you speak with individuals, like person, not a company, quite a high percentage of people who used it love it. It's rather that, low penetration within companies, big companies.
because you're never going to get fired for buying Microsoft. In today's world, you're never going to get fired for buying AWS. You might get fired for buying Google
And you never know when they're going to kill a product.
they're really good about that. AWS. They don't kill products or at least not that I can remember. So that's, let's get the phrasing, right? Because I've messed this up too many times, too. That's managed, managed one more time, managed containers as a service from the big three.
Now, if you're interested in learning more about containers as a service and go through all the points that Viktor called out and you haven't bought the course yet, go buy the course or buy the book. Your choice doesn't matter. There's a link for that down in the show notes. Now, a few of the items that Viktor was calling out today, one of them in particular is availability. Next week we're going to tackle what availability really means.
We hope this episode was helpful to you. If you want to discuss it or ask a question, please reach out to us. Our contact information and the link to the Slack workspace are at https://www.devopsparadox.com/contact. If you subscribe through Apple Podcasts, be sure to leave us a review there. That helps other people discover this podcast. Go sign up right now at https://www.devopsparadox.com/ to receive an email whenever we drop the latest episode. Thank you for listening to DevOps Paradox.