DOP 66: AWS Lambda vs. Google Cloud Functions vs. Azure Functions for 2020

Transcript

Viktor Farcic 0:00
Now, where it shines really amazingly well, those are the sporadic workloads. That's just amazing.

Darin Pope 0:08
This is DevOps Paradox episode number 66, AWS Lambda versus Google Cloud Functions versus Azure Functions for 2020.

Darin Pope 0:22
Welcome to DevOps Paradox. This is a podcast about random stuff in which we, Darin and Viktor, pretend we know what we're talking about. Most of the time, we mask our ignorance by putting the word DevOps everywhere we can, and mix it with random buzzwords like Kubernetes, serverless, CI/CD, team productivity, islands of happiness, and other fancy expressions that make it sound like we know what we're doing. Occasionally, we invite guests who do know something, but we do not do that often, since they might make us look incompetent. The truth is out there, and there is no way we are going to find it. PS: it's Darin reading this text and feeling embarrassed that Viktor made me do it. Here are your hosts, Darin Pope and Viktor Farcic.

Darin Pope 1:15
Now over the course of the past few episodes, we've been talking about serverless. And recently Viktor just finished up the Functions as a Service section of the catalog course and book. And we figured now is a good time to talk about his experience a little bit more in depth of how he feels about Functions as a Service as they exist in 2020. Viktor, what is your feeling?

Viktor Farcic 1:54
So to begin with, I think that it is mature enough. If somebody wants to use Functions as a Service and was worried whether that is a mature technology, a mature service, I think that we are there as an industry. And you can see that by the differences, or to be more precise the lack of differences, between the solutions on different providers. So the big question, whether they're mature enough, the short answer is yes. Whether we should use them, that's a longer answer. But maybe let's start first with a comparison between the three, right?

Darin Pope 2:40
Yeah. Name the three.

Viktor Farcic 2:42
It would be Google Cloud Functions, Azure Functions and AWS Lambda. Right. So they're all managed Functions as a Service. I think that if you cover those three, that is representative of managed Functions as a Service in general. Now, this would exclude any other type of serverless. And this would completely exclude self-managed serverless, including Functions as a Service, right? So we're focusing on the managed Functions as a Service flavor of serverless in the big three providers.

Darin Pope 3:28
Okay, so who do you want to start with first?

Viktor Farcic 3:31
I mean, let's say that we start with features and then comment on how the three fare. So for example, one of the things that many must be interested in is, to begin with, which languages are supported. In Google, that would be NodeJS, Python, Go and Java. In Azure Functions, it would be C#, JavaScript, F#, Java, PowerShell, Python and TypeScript. AWS supports Java, Go, PowerShell, NodeJS, C#, Python and Ruby. So if you look at it from that perspective, then Google Cloud Functions is the big loser, right? It's the only service of that kind that supports fewer languages than the others. Only four, at least today, right? That might easily change in the future. But on the other hand, I could just as well argue that supported languages are not that important, because you should be able to switch to whichever language is supported, since we are talking about functions. We are not talking about developing something huge, where the choice of language is really important, mostly because you would get lost in a language you don't know, right? And then we can also enter into a discussion of whether the other languages should be supported at all. If I look at Google Cloud Functions, I get NodeJS, Python, Go and Java. If I exclude Java as being probably a silly choice for a function, then I could say that, yeah, Google supports only four, but the three that really matter in this context are supported. And then I could even argue that the loser in that story is actually Azure, because Azure doesn't support Go, and Go is definitely a good candidate for a function, simply because it is extremely efficient, extremely fast to start, and so on and so forth. That does not necessarily make Go better or worse than Java. But for something that is super...
Now, actually, the important thing to understand is that every time we send a request to a function, a new instance of the function is spun up. Or maybe it was kept warm for a few minutes after the last one, right? But potentially, not necessarily always, you get a new instance of your function or application. And yes, if you go to Azure Functions, you can write those functions in F#, but do you really want F# for a function, something that is supposed to start within milliseconds? No. If you're building a traditional application that will run for hours or days or even months, it doesn't really matter whether it takes five milliseconds, five seconds or five minutes to start, because it is a worthwhile investment to wait a while for something that is going to run for a long, long time. But for functions, you probably want to go with JavaScript, Python or Go, right? In that case, it's all the same. It doesn't matter which are supported, because they're all supported, with the exception of Go not being supported in Azure Functions.
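
The language comparison above can be checked with a few set operations. This is only a sketch: the lists are the ones read out in the episode (July 2020), NodeJS and JavaScript are treated as the same runtime, which is an assumption, and the helper names are made up for illustration.

```python
# Languages per provider as listed in the episode (July 2020).
# Assumption: "NodeJS" and "JavaScript" are counted as one runtime.
SUPPORTED = {
    "Google Cloud Functions": {"JavaScript", "Python", "Go", "Java"},
    "Azure Functions": {"C#", "JavaScript", "F#", "Java", "PowerShell", "Python", "TypeScript"},
    "AWS Lambda": {"Java", "Go", "PowerShell", "JavaScript", "C#", "Python", "Ruby"},
}


def common_languages(supported):
    """Return the languages that every provider in the mapping supports."""
    return set.intersection(*supported.values())


def providers_missing(language, supported):
    """Return the sorted names of providers that do not list the language."""
    return sorted(name for name, langs in supported.items() if language not in langs)


print(sorted(common_languages(SUPPORTED)))
print(providers_missing("Go", SUPPORTED))
```

With these lists, the overlap is Java, JavaScript and Python, and Azure Functions is the only provider missing Go, which matches the point made in the conversation.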

Darin Pope 7:27
And just for just for transparency, all the testing that you did was with JavaScript.

Viktor Farcic 7:34
Yes.

Darin Pope 7:35
Okay.

Viktor Farcic 7:35
Yes. And now, when we are talking about testing, and this is kind of what shows the maturity of the technology, I could not detect any meaningful discrepancy in the results between those three, right? They're all more or less equally fast. They don't show considerable lag in initialization. They can handle a potentially huge workload, all the good stuff. Highly available, and whatnot. So they are all equally good as long as you're not building something in, let's say, Java or C#. That's really a bad idea.

Darin Pope 8:19
So for languages, if you throw it all out there, the only one that's really questionable is the lack of Go support on Azure today.

Viktor Farcic 8:31
Yes, yes.

Darin Pope 8:32
And today is in July 2020.

Viktor Farcic 8:35
Yes, but at the end of the day, no matter how much I'm saying that it shouldn't be problematic to write your functions in any language, no matter how experienced you are or not, the chances are that people are going to look for a solution that works in their favorite, and potentially only, language they ever worked with. In that case, pick your delta, kind of check whether your language is supported or not, and then just go for it. Whatever works.

Darin Pope 9:12
Just go for it. I see what you tried to do there.

Viktor Farcic 9:14
There you go. That's a subliminal message.

Darin Pope 9:20
Okay, so languages effectively, they're roughly all the same.

Viktor Farcic 9:27
I mean, Google supports the least number of languages,

Darin Pope 9:30
but they support the core, the core number, right? They've got, yes, JavaScript, Python, Go and Java.

Viktor Farcic 9:38
Exactly.

Darin Pope 9:40
Those four are good, and then Azure has added PowerShell and F#. That's great. That's the Microsoft ecosystem, right? Easy onboarding. And then AWS has just been out there the longest.

Viktor Farcic 9:54
Exactly.

Darin Pope 9:55
They're all okay.

Viktor Farcic 9:59
Exactly. I mean, unless you really have to work with a specific language, but okay, I'll get to that later actually. So I mentioned high availability. They're more or less the same. Actually, it depends on how you look at availability. Google is maybe one decimal behind, in the second or third decimal, you know, in the number of nines. With Azure and AWS I got more nines, like three nines after the decimal, right? With Google I got two. Now, it wasn't some really deep testing that was running for days and stuff like that. It was a relatively quick type of availability testing. So yeah, Google is a bit less highly available, which, unlike languages, might be a bigger deal breaker. I mean, that might be a more important thing. Now, it all depends really on how you set up your system, you know, whether you have retries in your upper-level applications or functions or whatnot. But yeah, if high availability is important, and it should be, then again, Google is a bit behind where Functions as a Service is in question.

Darin Pope 11:33
And define high availability for us in a Functions as a Service scenario. How did you define it?

Viktor Farcic 11:42
So basically, how many requests... Actually, there are two things you want to look for. Generally speaking, you look at whether you run multiple replicas of something, spread across zones and all those things, but that does not really matter in Functions as a Service, because those things are hidden from you. You don't really know where your functions are running. You don't know much about them. So the only way you can really observe the availability of functions is by bombarding them with requests and seeing how many failed requests you receive, right? You cannot observe the system in enough detail to do something deeper than that. You cannot kill a function in the middle of the process and see whether some other function will take over. You don't have those tools at your disposal. So the only thing that I could come up with is, yeah, let's do some form of performance testing. Bombard the solution with a huge amount of requests and see what you're getting. Right. And then they all worked fairly well, with, as I said, a small negative difference where Google Cloud Functions is concerned.

Darin Pope 13:08
So from your perspective, you're checking for a non-200 response code.

Viktor Farcic 13:15
Exactly. I mean, a function that is simply small, not necessarily with all the bells and whistles that many are going to put inside. Just a simple function. Bombard it like there is no tomorrow and see what you get.
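
The availability check described here, counting non-200 responses from a load test and expressing the result in nines, might be summarized roughly as follows. This is a sketch, not the tooling used for the tests in the episode: the function names are hypothetical, and the sample status codes are invented purely to show the arithmetic.

```python
import math


def availability(status_codes):
    """Return the fraction of responses that came back as HTTP 200."""
    if not status_codes:
        raise ValueError("no responses recorded")
    successes = sum(1 for code in status_codes if code == 200)
    return successes / len(status_codes)


def count_nines(ratio):
    """Count leading nines: 0.99 -> 2, 0.999 -> 3. A run with no failures returns infinity."""
    if ratio >= 1.0:
        return math.inf
    # The small epsilon guards against floating-point error at exact boundaries such as 0.999.
    return math.floor(-math.log10(1.0 - ratio) + 1e-9)


# Invented load-test result: 100,000 requests, 5 of which failed.
sample = [200] * 99_995 + [503] * 5
ratio = availability(sample)
print(f"availability={ratio:.5f} nines={count_nines(ratio)}")
```

A short run like this only says how many requests failed during the test window, which is why the results in the conversation are presented as a quick check and not as a long-term availability figure.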

Darin Pope 13:34
Okay, we've talked about languages. We've talked about availability, and I'm not going to say high availability, but we've talked about availability. What's the third thing?

Viktor Farcic 13:45
There are small things. For example, I like default values being sensible, right? I like that a lot. For example, Google Cloud Functions are secure by default, meaning that nobody can access them except the person, or the process, you give the privilege to, while the others are available to all by default. But again, that's such a small difference, because it's just about default values. All of them can be secured, all of them can be monitored. You can collect logs from all of them. They all do the same thing. And then you can say, okay, if they all do the same thing, that's complicated. How are we going to decide which one is better than the others, right? And then we would probably enter into the story of cost. So which one is cheaper? And again, the answer is none of them. They're all equally expensive or equally cheap. I poked at different scenarios, and the differences are really minuscule. Depending on the memory and CPU you assign, how many you have in parallel, and so on, the differences can be, let's say, 10%. A 10% cost difference, I know, can be huge if you're talking about millions of dollars. But 10% is almost certainly not a significant enough difference for anybody to choose one over the other with that as the primary reason. That's probably just not enough. Long story short, I could not find a significant difference, something that would make me say this one is so much better than the other. And that leads me to conclude that, most likely, there is no real competition, because you're going to use the service of your favorite compute provider, right? It's not like with Kubernetes. With Kubernetes, I could argue that if you're running in one today, there could be a significant reason for you to switch to another.
But with functions, if you're, let's say, running in Azure, you have no incentive to go anywhere else. And the same applies to AWS and Google, right? So it's a depressing story. There is no real comparison because they're all equally good or equally bad. The only thing that you will notice, and this is now a shocker, is that they are extremely expensive when running at scale, like five times more expensive than almost anything else you can do.

Darin Pope 16:48
Meaning versus a long-running process that can handle the same amount of traffic?

Viktor Farcic 16:53
No, not necessarily a long-running process. Versus relatively constant traffic. And what do I mean by relatively constant traffic? You know, if I had said this 10 years ago, it would be: you expect maybe, I don't know, thousands or millions of concurrent requests, and you expect a variation of 10%, and that would be constant traffic. In my head, constant traffic is traffic that does not change drastically from one minute to another. Right? So it's okay if you have three times more traffic in the afternoon compared to the morning. That still counts as constant traffic. I'm referring more to those very fast variations. You know, like this very moment you have a thousand concurrent requests, and then half a minute later you have 100,000. There you benefit greatly because, first of all, it would be very hard for you to scale at that speed and in that quantity yourself. You cannot really scale hardware fast enough to handle those oscillations so that you get the financial benefits of using less or using more. But if it's not such a high-frequency oscillation, then it's very expensive because, let's put it this way, for each CPU you use as functions, you're going to pay five to 10 times more than for a CPU you use in a normal way. It really depends on the use case. But if you have normal, traditional running stuff, and your oscillations don't go beyond double, you can just say, okay, I'm going to put in double the servers, and I'm still going to pay the same, more or less. Now, where it shines really amazingly well, those are the sporadic workloads. That's just amazing. Let's say you have CI/CD pipeline builds. They are very sporadic. You don't normally have thousands of concurrent builds in parallel. I mean, some do, but many, many don't.
And the oscillations are amazing. People come to the office, and the first thing they do is start building something, and then they don't do much for the next three hours. And then for an hour before they leave the office, the number of builds increases drastically, and so on and so forth. Right. The same would go for ETL or creating backups, basically almost anything that is a cron type of job, either scheduled periodically or initiated through some event. That is extremely cost effective. Because if you want to run backups, let's say through cron jobs, right, you need a server there that will initiate that cron job. And maybe you have a couple of jobs a day, and you're paying for the whole server for that. And that's where those benefits really shine. Right. Sporadic workloads, I would say, would be the best candidate possible.
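
The cost argument in this answer can be put into a small model. It is a deliberately simplified sketch: it assumes provisioned servers are sized for peak load and billed around the clock, that functions are billed only for the CPU actually used, and that a function CPU costs a fixed multiple (the "five to 10 times" from the conversation) of a provisioned CPU. All names and numbers are illustrative, not vendor prices.

```python
def provisioned_cost(peak_cpus, hours, price_per_cpu_hour):
    """Cost of servers sized for peak load and kept running the whole time."""
    return peak_cpus * hours * price_per_cpu_hour


def function_cost(average_cpus, hours, price_per_cpu_hour, premium):
    """Cost of functions: pay only for average use, at a per-CPU premium."""
    return average_cpus * hours * price_per_cpu_hour * premium


def functions_are_cheaper(average_cpus, peak_cpus, premium):
    """Functions win only when average use times the premium stays below peak capacity."""
    return average_cpus * premium < peak_cpus


# Steady traffic whose peak is double the average, with a 5x premium.
print(functions_are_cheaper(average_cpus=10, peak_cpus=20, premium=5))
# Sporadic traffic: mostly idle, with an occasional burst to the same peak.
print(functions_are_cheaper(average_cpus=1, peak_cpus=20, premium=5))
```

Under these assumptions the break-even point is an average-to-peak ratio of one over the premium, so with a 5x premium functions only pay off when average use is below 20% of the peak. Savings on operations and staffing are left out of the model entirely.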

Darin Pope 20:16
So the one, three, five times a day, great usage. Well, let me ask, because I've heard two things here. I've heard this cron job type of thing. But if it's a cron job that's running every minute, maybe not? Or is that one okay as well? If it's running every minute, would that be better?

Viktor Farcic 20:42
So if it's running every minute, but let's say it takes five seconds, then yes, it's a good candidate, right? Because otherwise you have 55 seconds of wasted compute power. Now, if it's running every minute and it takes, I don't know, like 45 seconds, then it might be cheaper not to use Functions as a Service. Managed Functions as a Service, right?
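
The five-second versus 45-second comparison is a duty-cycle calculation. The sketch below assumes a server dedicated to the job and billed for the whole interval, and a function billed only while it runs at some per-second price premium; the premium values of 5 and 10 come from the range quoted earlier in the episode, and the function names are hypothetical.

```python
def idle_seconds(interval_s, runtime_s):
    """Seconds per interval during which a dedicated server would sit unused."""
    return interval_s - runtime_s


def break_even_runtime(interval_s, premium):
    """Longest runtime per interval for which the function is still the cheaper option."""
    return interval_s / premium


def function_is_cheaper(interval_s, runtime_s, premium):
    """Compare pay-per-run at a premium with paying for the whole interval."""
    return runtime_s < break_even_runtime(interval_s, premium)


for runtime in (5, 45):
    for premium in (5, 10):
        cheaper = function_is_cheaper(60, runtime, premium)
        print(f"runtime={runtime}s premium={premium}x function cheaper: {cheaper}")
```

With a one-minute schedule, the break-even runtime is 12 seconds at a 5x premium and 6 seconds at 10x, so the five-second job qualifies in both cases and the 45-second job in neither. A server shared with other workloads would shift these numbers.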

Darin Pope 21:11
And then the other one that you called out was the one for scaling for a spike.

Viktor Farcic 21:18
Yes

Darin Pope 21:18
You were talking about sporadics and spikes. So if I'm going along at 1,000 requests per second, and all of a sudden it spikes to 100,000 requests per second, I don't care if you're running on a Kubernetes cluster that has cluster autoscaling turned on, the cluster autoscaler is probably not going to react that fast.

Viktor Farcic 21:43
Yes.

Darin Pope 21:45
At least today, in July 2020, it will not react that fast.

Viktor Farcic 21:50
But then how often does that really happen?

Darin Pope 21:54
It shouldn't. Well, if you're a news site, then that could happen at any point in time, you know, with a breaking news item. If you're Twitter, you know, something like that, an informational type of thing, that would be very useful. But if you're a banking site, other than people rushing to transfer their money to different accounts, it may not be the right solution. And I think that's the thing that we're talking about here with the languages, the availability, the cost. Just because it fits right now doesn't mean it will fit every use case that comes up for what you're using it for. So be ready to spend money. You're going to pay money one way or the other.

Viktor Farcic 22:49
I must also stress that my comparison might be unfair, because while I do think that more often than not functions are very expensive compared to the alternatives, you might have other savings that actually make them cheaper. Maybe it fits your development model, and the savings from not managing your infrastructure are going to be greater than the cost that you pay to your vendor. Right? So there are many other factors that are close to impossible to calculate in general terms to say this is cheaper, this is expensive. But if you look simply at compute, at the bill from your vendor, then it's expensive. Now it's up to you to subtract other savings from that bill.

Darin Pope 23:42
For example, I've got an application that's doing 3,000 requests per second. I'm making a decision: okay, do I go Kubernetes or some sort of container as a service, or do I go Functions as a Service? So if I go containers as a service, and I'm on AWS today, I'm still probably going to need a body or two to help manage that in some way, shape or form, beyond just the developer creating a function and dropping it on Lambda.

Viktor Farcic 24:21
I mean, if you go for container as a service, you could use similar services in other providers, right? So the management overhead would be the same with containers as a service as with Functions as a Service, or very low, let's say, at least if providers implemented containers as a service well. Now, I think we should save this for a different episode, but I think that the state of container as a service is actually very immature. Implementations are horrible and really ineffective. But that's a question of time, because everybody just started with those concepts. But what was interesting is that you mentioned, I think you said, a service that has 3,000 requests or something like that, right? Yeah.

Darin Pope 25:12
Steady state 3000 requests per second.

Viktor Farcic 25:14
Yeah. Now, let's say that it's not even steady. But let's focus on 3,000 requests per second now, and forget about Functions as a Service and this and that, right? How effective is it, theoretically speaking, to handle each request with a different instance of something?

Darin Pope 25:37
In general, it's not practical.

Viktor Farcic 25:40
Exactly. So if it's some ETL process, let's say that you want to transform something in your database, maybe load some files and put them somewhere, ETL processes are perfect. Batch type processes are perfect. They're all single-process, single-job type of stuff, right? But handling 3,000 requests with 3,000 newly created processes, that is inefficient. That's horribly inefficient, even if we forget about the existence of functions or whatever. Rarely would anybody come to the conclusion that this is an efficient way to do stuff. For 3,000 requests, I'm just following your example.

Darin Pope 26:26
We should add this to the list. And by the way, if you're listening to this and you haven't taken the survey about the catalog yet, there's a link down in the show notes: https://www.devopsparadox.com/survey. This is something we need to add to the list. We've talked about functions, just the normal functions, right, the normal Lambdas and the normal functions from the other two. We need to test functions on the edge, because that could be different. It may be the same. But let's think of it this way: I've got a globally distributed application and I'm just trying to capture data as quickly as possible. So it may be something as simple as a little function on the edge that accepts the HTTP request, but what the function is really doing is taking the body of that request and posting it to a queue somewhere for later processing. We just want to make sure that we get the message as quickly as possible.
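
The edge function described here, one that accepts an HTTP request and does nothing but hand the body to a queue, could look something like the sketch below. It is not any provider's API: the handler signature and status codes are assumptions chosen for illustration, and an in-process `queue.Queue` stands in for whatever managed queue a real deployment would use.

```python
import json
import queue

# Stand-in for a managed message queue (an assumption for this sketch).
INGEST_QUEUE = queue.Queue()


def handle_request(method, body, sink=INGEST_QUEUE):
    """Hypothetical HTTP handler: accept a POST, enqueue the raw body, return immediately."""
    if method != "POST":
        return 405, json.dumps({"error": "method not allowed"})
    if not body:
        return 400, json.dumps({"error": "empty body"})
    # No parsing or processing here; a separate consumer reads the queue later.
    sink.put(body)
    return 202, json.dumps({"status": "queued"})


status, response = handle_request("POST", '{"sensor": "a1", "value": 42}')
print(status, response, INGEST_QUEUE.qsize())
```

Keeping the handler this small is the point of the design: the function returns as soon as the message is accepted, and everything that can be slow or fail happens downstream of the queue.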

Viktor Farcic 27:37
Yes, I mean, that would be a great example. Edge computing really deals with that type of process, right? Either very variable at the place where you're collecting it, or very sporadic, and in this case sporadic, you know, at the source. Your edge devices are likely not going to transmit a constant stream of something, right? I mean, there's a zillion types of use cases. But that kind of brings us, I guess, to the key thing. Are functions useful? Yes. I have absolutely no doubt that Functions as a Service is a useful model. Where I strongly differ from many people is that I don't see it as a model that carries a significant percentage of the workload in an average company. Right, I think we mentioned that in a previous episode. Like maybe 5%, maybe 10% at a stretch. Going back to the very beginning, if you focus on the comparison between the big three, the good news is that you can use whatever is offered by your vendor. It's all the same.

Darin Pope 29:15
So if you need to use Functions as a Service: mature enough, may not be cost effective, but might be, depending on your workload. And we haven't... I'll say it this way: operationally, how does it change?

Viktor Farcic 29:38
Basically, the only thing you do from an operational perspective is really set up some infrastructure type of initial things, you know, like VPCs in AWS, or things on the edge of computing in a way, right? But everything else is just: here's my code, here's my function, deploy it. Basically, most of the operations go into developers' hands. Now, the part where operations and sysadmins will suffer greatly is the very reduced visibility. When functions work, they run by themselves. Basically, there is not much, if anything, for you to do. When things go wrong, yep, try to trace the problem. Try to debug what's going on. Because we were talking about you having two monoliths and then switching to a hundred or hundreds of microservices, and now we're talking about you switching to thousands or tens of thousands of functions without a really appropriate mechanism for you to observe, monitor or alert on what's going on. And even if we had a proper mechanism for that, it's still extremely complex. And that is the part that I believe is immature, and it is immature mostly because you rely on your vendor a hundred percent. Right? So you will not be able to leverage some industry standards. You will not be able to leverage many of the other tools. Some maybe, but many not. So basically you're at the mercy of your vendor. If they have what it takes, you're good. If not, you're going to be running blind.

Darin Pope 31:50
Really, we see most of the time that people run blind anyway.

Viktor Farcic 31:55
Yes, but at least there is a light at the end of the tunnel. I can tell people, yes, you're blind, but why don't you add Jaeger for tracing? Why don't you add Prometheus for collecting metrics? Why don't you do this or that, right? Maybe go with Datadog? There are many solutions. So if you're blind, you're blind by choice. Mostly.

Darin Pope 32:21
Well, you've made an architectural decision to use a solution that doesn't support third party observability.

Viktor Farcic 32:31
Yes. And, you know, you might be extremely happy with what your vendor provides. If you are, great, power to you, right?

Darin Pope 32:40
Or going back to my example of I'm capturing a JSON POST and dropping it on a queue. I don't need a whole lot of observability in that. Either it works or it doesn't work.

Viktor Farcic 32:59
Yes.

Darin Pope 33:00
Right, but if my function is, okay, I'm going to write to the queue, and then I'm going to send a message to Slack, and I'm going to... If you start fanning things out in a single function, versus potentially decomposing those into other function calls or putting them on other things, you've got to think a lot more clearly about how you're writing your function. You cannot write a function the same way you write a normal application.

Viktor Farcic 33:30
Exactly. And that's where we're coming back, I think, to the previous examples. You know, if it's some batch type of processing, you know, create an event and then a function does something and that's it, or periodically run a function, that's all great. But if we're talking about building a system like the one I think you were describing, that would be: create an event, a receiver function calls three other functions, those three functions call seven other functions. It goes back and forth in circles and through weird paths and all those things, and something goes wrong somewhere along the way. What I just described, I think those were the main and valid reasons that people were rejecting microservices for a long time. And when I say for a long time, I mean until the last couple of years. Microservices are big today because those subjects are kind of better handled, easier to do. I think that we have them covered for microservices, more or less, or better at least, but not that well covered with managed Functions as a Service. And I repeat, managed. Just to clarify that I'm not talking about those that you would run yourself, because then they can follow the same patterns as any other type of application.

Darin Pope 34:53
So this doesn't wrap up our mini serverless series. We're going to be revisiting it again. But for now, this is sort of the end of the serverless series, at least as a straight block. Next week's episode is going to be different. But then the next big one up is container as a service, right?

Viktor Farcic 35:18
Yes. I mean, I can give a very quick tl;dr. I think serverless is the future. And I think that CaaS, or container as a service, is the most likely implementation of that future. And I think that FaaS is not it.

Darin Pope 35:36
FaaS is great for a single process. Again, your batch load, or if your type of traffic is highly spiky. So what were the two S's we have? Spiky and sporadic.

Viktor Farcic 35:51
They're good for that if there is no equally good alternative.

Darin Pope 35:57
Exactly. Those are the candidates for you to look at going with Functions as a Service.

Viktor Farcic 36:04
correct

Darin Pope 36:05
It doesn't mean that Functions as a Service is the correct answer. But it might be.

Viktor Farcic 36:13
yeah.

Darin Pope 36:16
Okay. That was Functions as a Service. Managed, good grief, managed Functions as a Service.

Viktor Farcic 36:27
It's so important, actually. I practiced to make sure that I always inject that word, because I got quite a few negative comments, like, ooh, you know, what you're saying is not true for this or that. Because OpenFaaS, for example, is self-managed. It's officially Functions as a Service, but it's not really functions, it's containers, at least as an input mechanism, and many other things do not apply to it, and so on and so forth. So people were yelling at me: this does not apply here, this does not apply there. So Azure Functions, Google Cloud Functions, AWS Lambda, only those three.

Darin Pope 37:17
If you would like more details and you haven't picked up the catalog course or book yet, go ahead and do that. The link for both of them is actually just one link that will take you to a page where you can choose which adventure you want to take, whether you want to take the course or read the book. Go see that down in the show notes. And we will talk to you again next week.

Darin Pope 37:49
We hope this episode was helpful to you. If you want to discuss it or ask a question, please reach out to us. Our contact information and the link to the Slack workspace are at https://www.devopsparadox.com/contact. If you subscribe through Apple Podcasts, be sure to leave us a review there. That helps other people discover this podcast. Go sign up right now at https://www.devopsparadox.com/ to receive an email whenever we drop the latest episode. Thank you for listening to DevOps Paradox.

Show Notes

Hosts

Darin Pope

Viktor Farcic

Links

Rate, Review, & Subscribe on Apple Podcasts

Signup to receive an email when new content is released
