Evan
00:00:00.000
If I've got a robot in my house that doesn't respond in milliseconds, you know, I might have a dead cat who I really loved. I really loved that cat, and the robot accidentally ran it over or basically mowed it up. Or if I'm building something of high precision, you know, I just need to be able to have time series based resolution of what's happening.
Darin
00:01:24.669
Back in episode 335, we had Peter on from Coru, and Viktor, you brought up a point that, especially in observability, in the old days, we would keep more recent data and start aggregating older data just to save on cost. Do you still agree with that?
Viktor
00:01:41.469
I do, I mean, I do agree with basically not keeping things that you don't need, be it in software or you know, in your house. Doesn't matter.
Darin
00:01:51.887
So it's gonna be interesting how we take this today, because coming from our primary world, you know, we think in seconds and maybe milliseconds, right? That's not a big deal, you know? 'cause you think about Kubernetes responding or waiting for a Jenkins build to finish up. It's like, okay, who cares? You know? It's not that big of a deal. But
Darin
00:02:21.917
Okay. You know, I still work with it. Come on, gimme a break. I could say something negative about your project too,
Darin
00:02:28.817
No, I won't do it now. But let's assume this for a second. What if we're waiting for a car to stop? Do you think we could wait seconds or milliseconds for it to respond? Or maybe a manufacturing arm, you know, a robot arm moving things around. That seems like it would be a bad thing. On today's show, we have Evan Kaplan on from InfluxData. Okay, Evan, how you doing today?
Darin
00:02:50.432
Good. I heard you snickering in the background a little bit about some of the earlier things, but what do you think about the car and the arm thing? Because, to me, we used to call it real-time data. I think there has to be a better phrase for it than real time, even though that's what people still call it. What are you seeing, and how do you think about it? Viktor's context, in case you haven't listened to the episode, is that we used to be able to throw data away, but now that we have AI, we want as much data as we can get so it can potentially help us make some better choices. But in the examples I was just bringing up, we're talking about making a choice now. Like if we need to shut an arm down because it's getting ready to blow a bearing or something, what are we thinking about these days?
Evan
00:03:36.252
So you guys share some background with me in that you're not new to this industry. People have been talking about real time, you know, forever. RTOS operating systems. Just, you know, the narrative. Nobody knows what real time is. I'm not sure it's an important distinction. I'm not sure it matters. I'm not sure the label helps. Of course, we use it in our marketing, and we use it because it signifies some broad concept that at least things don't have ridiculous amounts of latency. That's about the utility of the term. But let's ground it, you know, since you started with the little bit controversial, or not really controversial: that car, that robot arm, that dynamic is, we're clearly moving into a physical world of increasing automation. And if you look at sort of the backdrop of the universe, if you think the counterforce is entropy, then you think the, you know, the force against entropy is consistent organization. And for us as humans, that means increasing automation, building systems that are self-healing, self-adapting, eventually completely autonomous. That's where we want our systems to go. They don't all need to be there. They certainly don't need to be there tomorrow. Some of 'em are highly useful, like the self-driving car when it gets there, but that's the journey that all these systems are on. And so when you talk about microseconds, or you talk about observability data, which most people are happy to get every second, that's not particularly useful. You know, if I've got a robot in my house that doesn't respond in milliseconds, you know, I might have a dead cat who I really loved. I really loved that cat, and the robot accidentally ran it over or basically mowed it up. Or if I'm building something of high precision, you know, I just need to be able to have time series based resolution of what's happening.
The higher the sampling rate, the more fidelity I'm gonna have and the better picture of the real world I'm gonna have. So, let me not go on and on here. I took a little thread and I took it forward, but tell me where you want to go with this.
Viktor
00:05:43.145
When we talk about getting quick responses, wouldn't the logical direction then be direct communication? No intermediary at all. I mean, I know that's not feasible, but if somebody wants something fast, then just remove anything in between those two endpoints.
Evan
00:06:02.825
That would be great, except you want to build intelligence. And intelligence ideally can be resident at whatever the end point is, but isn't always available at the end point. And the nature of pipelines and data, you know, doesn't, doesn't often allow that. Right? And then you have just, you know, like these are just electrons moving through space. So simultaneous, you know, is, um, it's a little bit challenging, but I think you're right. The least impedance is the most beneficial for, if you want real time.
Darin
00:06:30.493
Isn't the gold standard for real time sort of a human? Or you would hope so, at least.
Viktor
00:06:36.573
We are not. You could be real time only when you change the rules of the speed of light. Make it instantaneous.
Darin
00:06:45.057
Okay. But okay. It depends. So self-driving cars, let's stay there for just a second. As a human, I still don't trust self-driving cars. I won't get in a self-driving car.
Darin
00:07:05.847
Yeah, my premise I guess is falling apart, but it seems like, how does, okay, so we have time series data. let's play this out for a second. We have time series data coming in. How long do I need to keep the time series data?
Evan
00:07:18.441
Depends on the problem you're trying to solve. Depends on the fidelity you want. It depends on what you wanna learn from it. The most common time series use case is: I collect really high resolution data for short periods of time, and then I downsample it and store it for long periods of time. Anything that would give me the signal I need and separate the signal from the noise, that's the period you want to keep. Now, if I want increasing fidelity and incredible intelligence about the physical world, I'm gonna keep more data that's more descriptive. You know, maybe an easy metaphor for you is a photograph, right? Do I want a two megapixel photograph that works fine for 98% of uses? I'm showing it on a screen. I'm showing it on my thing. Or do I want, you know, an 18 megapixel photograph that allows me to locate the tiniest portion or pixel on the image? It depends on your utility. My theory is that we want increasingly high resolution data to instrument the physical world, and that we're entering a very different world with physical AI. And so it's not okay for the car to respond in one second. The car's gotta respond, if it's moving at 60 miles an hour, and we can do the calculations. We're not assuming a car's moving at the speed of light, but we're assuming the car's moving at 60 miles an hour. It's gotta respond in some significant sub-second time to things that interject themselves in its path. This is the world we're going into. These factory arms. Think about the manufacture of a semiconductor: precision, accuracy, pattern. We're talking nanometers, nanoseconds. We're talking, you know, high resolution. This is the world we're going into.
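[Editor's sketch] The "collect at high resolution, then downsample and keep for longer" pattern Evan describes can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not how InfluxDB or any other product implements it: the function names `downsample` and `apply_retention`, the choice of a simple mean per bucket, and the `(timestamp_seconds, value)` tuple layout are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp_seconds, value) points into fixed-width time buckets.

    Assumption: a plain mean is an acceptable summary. Real systems may keep
    min/max/count or other aggregates as well.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        bucket_start = (ts // bucket_seconds) * bucket_seconds
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def apply_retention(points, now, keep_raw_seconds, bucket_seconds):
    """Keep recent points at full resolution and downsample everything older."""
    cutoff = now - keep_raw_seconds
    old = [(ts, v) for ts, v in points if ts < cutoff]
    recent = [(ts, v) for ts, v in points if ts >= cutoff]
    return downsample(old, bucket_seconds) + recent


if __name__ == "__main__":
    raw = [(0, 1.0), (1, 3.0), (60, 5.0), (61, 7.0), (120, 9.0)]
    # Keep the last 10 seconds raw, roll older data into 60-second averages.
    print(apply_retention(raw, now=121, keep_raw_seconds=10, bucket_seconds=60))
```

The trade-off is the one discussed in the conversation: the shorter the raw window and the wider the bucket, the cheaper the storage and the less detail is available later.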
Darin
00:09:01.122
Is nano now the true second? Because we used to think about milliseconds, right? Thinking about websites or any kind of response times from disk, we were thinking milliseconds.
Evan
00:09:17.042
Depends on your system. Our default period is nanosecond. Very few people use it. Some of the quantum computing customers we have use it, but our default is nanosecond. So if you want to store it at nanosecond, you can. It's expensive, as you'd expect.
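[Editor's sketch] One way to picture what timestamp precision buys you: if timestamps are stored as integer nanoseconds, truncating them to a coarser unit makes nearby samples collapse onto the same timestamp. The sketch below is only an illustration of that idea; the names `NS_PER_UNIT`, `truncate_timestamp`, and `distinct_timestamps` are hypothetical and are not part of any InfluxDB API.

```python
# Nanoseconds per supported unit (hypothetical table for this sketch).
NS_PER_UNIT = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}


def truncate_timestamp(ts_ns, precision):
    """Round an integer nanosecond timestamp down to the given precision."""
    unit = NS_PER_UNIT[precision]
    return (ts_ns // unit) * unit


def distinct_timestamps(timestamps_ns, precision):
    """Count how many samples remain distinguishable at a given precision."""
    return len({truncate_timestamp(ts, precision) for ts in timestamps_ns})


if __name__ == "__main__":
    # Four samples: two within the same microsecond, one 1.5 us in, one 2 ms in.
    samples = [0, 400, 1_500, 2_000_000]
    for precision in ("ns", "us", "ms", "s"):
        print(precision, distinct_timestamps(samples, precision))
```

At coarser precision, fewer samples stay distinguishable, which is the fidelity-versus-cost choice being discussed.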
Viktor
00:09:32.239
I guess the answer to that question is whatever you need, and not more granular than that. Or not faster than that, right? Because the moment you go above what you need, you are paying for it. And not only in money, but also in performance in many different ways, right? If you would say 10 milliseconds for the car is ideal, why would you go for one millisecond? I am not saying it's not better, but if you say, I don't need more than that. Or why would you have an infinite amount of pixels in a photo that you're going to show on a mobile?
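[Editor's sketch] The "we can do the calculations" remark about a car at 60 miles an hour can be worked through directly: the distance covered before the system reacts is just speed times latency. The function name `distance_during_latency` is hypothetical, and the sketch ignores braking distance entirely; it only shows how far the car moves while waiting for a response.

```python
METERS_PER_MILE = 1609.344  # exact definition of the international mile


def distance_during_latency(speed_mph, latency_seconds):
    """Distance in meters traveled at constant speed during a response delay."""
    speed_m_per_s = speed_mph * METERS_PER_MILE / 3600
    return speed_m_per_s * latency_seconds


if __name__ == "__main__":
    for latency in (1.0, 0.1, 0.01, 0.001):
        meters = distance_during_latency(60, latency)
        print(f"{latency * 1000:7.1f} ms -> {meters:.3f} m")
```

At 60 mph the car covers roughly 26.8 meters in one second, about 27 centimeters in 10 milliseconds, and under 3 centimeters in 1 millisecond, which is the kind of number you would weigh when deciding what is fit for purpose.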
Evan
00:10:11.782
So if what you're saying is fit for purpose, of course, I totally agree. The question is, how does the purpose change? Right. My initial thing is I just need the photograph to show anything. It turns out it's a piece of evidence in some sort of murder case, and all of a sudden I want high resolution to capture some tiny... anyway. I mean, you get the simple idea: I don't always know a priori what that period is. Of course, I should make some estimate, put a range on it and do it, which is what most people do.
Viktor
00:10:39.888
The tricky part, I feel, is that you also don't know what your needs will be in the future. So if you go back to my photo example, right? Let's say that my mobile cannot show more than a one megapixel photo, right? Maybe it will tomorrow. So do I over-optimize for tomorrow, when I don't know what it is, or what do I do, right? Do I pay today for tomorrow?
Evan
00:11:04.188
what did you do when you rented your apartment in Barcelona? Did you plan on having three more kids and a family? And so you know how you make these judgments.
Evan
00:11:13.092
I think the really important point here, guys, to pull out is that we're in an age, and you guys have been around this sort of infrastructure over a long period of time, where you've gotten these really quality, specialized tools on the data side. And you know, it started years ago with document data, search data, graph data, time series data, vector data. You can get more efficient. You can get purpose-built stuff that actually solves significant problems. And so the tool set you have is just so much wider. It's as if before you had a hammer and a screwdriver and you were gonna build pretty much everything, but now you've got a full set of tools that you can work with. That's the change in the world that I think is important. It doesn't mean you should have a full set of tools for every job, but it just means you have those tools available, and I think that's what's changing.
Darin
00:11:58.676
So, in other words, there needs to be a Home Depot or a Lowe's right down the street from where we're working. So if we need a new tool we can go get
Darin
00:12:08.846
That's instant. Right? I wanna nerd out for just a second, or at least to me it's nerdy. You were saying nanosecond is your base time series, or time second, whatever. How do you even have hardware that can write that fast? Keep up? That seems very odd to me. And he said quantum. Okay, sure. But I can't go down to my local Micro Center, rest in peace, Fry's,
Evan
00:12:33.388
Actually, I don't think the hardware needs to catch up. I don't think it's capturing every nanosecond, but it's breaking up the data. It captures the nanoseconds and reads, so, so, because, yeah, but the idea is, well, how do I wanna store the data and if I want to capture it in this
Darin
00:12:47.809
Do you think this whole AI thing is really coming into play? I mean, there's always been machine learning, for decades. I mean, obviously generative AI is sort of the new kid on the block, has been for a couple of years, but is that really helping us? Is it hurting us? Where's that going?
Evan
00:13:09.649
I can't talk about that. There are other people way smarter than me about the broader implications on society. I could opine on it forever. We all have our thoughts. I'm not sure I'm the useful guest for that. But in terms of generative AI, I find it amazingly useful. Our coders find it amazingly useful. I find it in my personal life amazingly useful. I just think it's a main tool. It's not what we do. I'm much more interested in deterministic models of the world rather than probabilistic models. For the digital world, for data, for language, probabilistic models work just fine. But for the physical world, you need deterministic models. Again, I don't want my robot to run over my cat. It's pretty simple. I don't want my self-driving car to hit the pedestrian. And so I don't want probabilistic models, I want deterministic models. In the generative world, we're collecting digital data, and that's pretty much all collected, and it appears every day. It's re-scraped. You get new data every day, and now they're building synthetic data to make their models. In the physical world, the amount of data you can collect is, by definition, infinite. By definition, infinite. Tell me the system you want to build. I'll tell you the period of data to collect, the amount of data you want to collect to train the model, and then you can collect it. That's what's new. You know, we can build models of the physical world that actually are quite deterministic. It's not new, actually. It's happening. It's been machine learning. It's just accelerated.
Darin
00:14:41.157
Well, I think that's where time series really does come in, because I mean, we've had time series data stores in the past as well, but that's really to me, better than a relational database for building out models.
Evan
00:14:52.347
A hundred. No, it's not even close. So if you built this thing 30 years ago and it still exists today, there are time series applications running on Oracle today, right there. If you built it 20 years ago, you ran it on MySQL. Maybe if you built it 15 years ago, you built it on HBase or something like that. You know, it's only in the last five or six years that people have built really meaningful stuff on dedicated time series platforms: satellites, self-driving cars, battery power systems, energy management alternatives, just a, you know, just the app. Now that you have a specialized tool, you can get that resolution, be less expensive, and be more responsive, which is where all these systems have to go over time. No pun intended.
Evan
00:15:39.297
Because as you started using the general purpose databases, relational databases, MySQL, PostgreSQL, you realize that you're just not optimizing for it. If you know your vector of search, you know your index is primarily time, you can change your ingest level. You can change the time it takes to write. You can change your query time and query speed. You can change your cost of storage if you specialize. If you don't, you pay all the prices by not specializing. If you want to take your minivan and you want to race in the F1 series, it'll be fine. It'll go around the track. It'll do what it's supposed to do, if you don't get run over.
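[Editor's sketch] A toy illustration of why knowing that "your index is primarily time" helps: if data arrives in time order, writes are plain appends, range queries become two binary searches, and evicting old data is cutting off a prefix. The class `TimeIndexedSeries` and its methods are hypothetical names for this sketch only; it makes the simplifying assumption of in-order timestamps and is not a description of any real storage engine.

```python
from bisect import bisect_left, bisect_right


class TimeIndexedSeries:
    """In-memory series kept sorted by timestamp (assumes in-order writes)."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, ts, value):
        # Ingest is an append because timestamps only move forward here.
        if self._timestamps and ts < self._timestamps[-1]:
            raise ValueError("this sketch only accepts in-order timestamps")
        self._timestamps.append(ts)
        self._values.append(value)

    def query_range(self, start, end):
        """Return (ts, value) pairs with start <= ts <= end via binary search."""
        lo = bisect_left(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))

    def evict_before(self, cutoff):
        """Drop every point older than cutoff; returns how many were removed."""
        n = bisect_left(self._timestamps, cutoff)
        del self._timestamps[:n]
        del self._values[:n]
        return n


if __name__ == "__main__":
    series = TimeIndexedSeries()
    for ts, value in [(10, "a"), (20, "b"), (30, "c"), (40, "d")]:
        series.append(ts, value)
    print(series.query_range(15, 30))
    print(series.evict_before(25))
    print(series.query_range(0, 100))
```

A general purpose database has to support arbitrary keys and update patterns, so it cannot assume any of this; that is the specialization being described.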
Darin
00:16:18.367
and you're guaranteed to come in last at that point too. But you know, that's, could you imagine a minivan riding around the Las Vegas F1 track? That would be,
Viktor
00:16:30.244
Now we are entering into the discussion of, you know, all-encompassing databases like Elastic, that actually are trying to be, not everything, but all-encompassing, at least within an area. You know, logs, metrics, traces,
Evan
00:16:45.908
That's that whole space, you know, that's the whole space. They all want to be the MELT stack. They want to have metrics, events, logs, and traces. And so Elastic is one of 'em, but there are, you know, there are 30 others.
Viktor
00:16:58.594
But that almost feels like almost inevitable is the wrong word, because it's obviously inevitable, but I'm going to keep using it. Inevitable Expansion of companies, right? You find your niche, you say, I'm very good at logs. Where do I expand my business? To metrics, right? How do I do that? By modifying what I have.
Evan
00:17:21.801
You build a tool, it's useful. Splunk did this before Elastic, right? You build something, you build a tool, it's useful, and you figure out how else you can use it. That's just capitalism. That's just, you know, how do I leverage my adjacency. You'd expect vendors to do it, but you'd also expect customers and architects like you to be very picky about what you assemble as a solution, and sometimes a one-size-fits-all makes a lot of sense for you, right? Just a simple MongoDB, they claim to do graph, vector, time, everything. Maybe it's good enough, I don't know. You gotta figure out, as a developer or an architect, what is the thing you're building and what do you need?
Darin
00:18:00.122
When Influx, InfluxDB specifically, was created, you weren't around for that. You came in a little bit later, right?
Evan
00:18:07.947
Yeah, I came in pre-revenue, but when I joined, there were 3,000 instances in the wild running daily. Today there are 1.4 million that run daily. And so, pretty early, but I'm not a founder.
Darin
00:18:24.813
You were in early enough. Why did the founders even decide to create a time series database? Was it that they just got bored and decided, let's go do this this weekend?
Evan
00:18:34.803
Like most good businesses, it's a pivot. Like most of this stuff, it's a pivot. My partner, Paul Dix, the founder, had worked on Wall Street. He had built a lot of application code on top of existing databases to do financial trading for different firms. And so he had that experience, but he didn't go right there. What he did was he tried to start something like a mini Datadog, a simple cloud-based server monitoring platform (in those days we didn't call it observability). And he realized that was gonna be a big lift and he'd have to raise a lot of money. And he was deeply convicted about open source, and he wanted to work on an open source thing anyway. And so the database that he originally built to collect the metrics and events for that cloud-based service, he then open sourced it. He wrote it in Go. He open sourced it, and it became immediately popular because it handled the four or five things that I mentioned. It had amazing ingest. It had super fast query time. It had the ability to evict data very fast for downsampling and resolution. It was cheap to store the data. So we solved a set of problems that, if you were a developer, you'd have to go solve on your own if you used Postgres or MySQL, which would've been the state of the art at that point. It was just really one of these organic popularity things. And it helped that it was originally a Go project. Go was early and Docker had just come out, and so that was a really interesting time for Go. But mostly it was just really super easy to use. People could get started right away, and it was powerful.
Darin
00:20:11.728
It's interesting you, you bring that up. So I'm guessing that was around 2011? 12.
Evan
00:20:15.418
13. He made the first, he started in early 13, made the first commits, I think publicly in 13.
Darin
00:20:21.028
So we're talking 12 years ago. That was early Go. I mean, that would predate even Kubernetes, Viktor. Is that possible?
Darin
00:20:38.464
That's actually pretty interesting because do you think it was probably one of the first major projects that took on
Evan
00:20:43.939
It was a very early project. I think Docker preceded it. I don't know. There were some other really interesting projects, but I don't actually remember. When I came to the company, it still was not really a revenue company. I came in early '16,
Darin
00:21:03.454
Inter, I mean, okay, it's, we've had a lot of things happen over these past nine years.
Evan
00:21:08.509
Well, you know, I was young. I had jet black hair, and now, your audience can't see me, but I've got gray hair, and it's actually been a really nice run. I really love working on open source projects. We have chosen to keep it promiscuously licensed, which has become almost controversial these days, as most of our competitors have gone to these secondary licenses. So we kept it MIT and Apache. We have a really vibrant OSS community, so that's been feeding, and the business has grown consistently and nicely over that period of time. And so we're now of a size where we can really continue to, you know, build something of significance.
Evan
00:21:52.035
You know, this is a debate between Paul and I. I think I would. I think our approach is really open core. We're committers on Apache DataFusion, we're committers on Apache Arrow. So we're very much involved in the whole stack. And we make contributions to the open source, not just our own project. And so, it's one of our core values, we're deeply committed to open source. And I think the right method for open source is we just don't put everything in the open source, right? That's just it. We're open core, really.
Viktor
00:22:25.183
To be clear, I'm an open source person completely. It's just that I can see certain concerns being raised about whether this system actually works for new players. I can feel, I cannot prove it, that some new startups are learning the lesson, and not necessarily learning it correctly, but saying, okay, maybe this is not actually beneficial for me, starting today, in a way, right?
Evan
00:22:58.618
I still think it's very beneficial, particularly for infrastructure. And I help out other open source companies and I'm pretty committed to the whole ecosystem. One of my advisors is Ali Ghodsi, who's the CEO of Databricks. And Ali said something I thought was very wise, but not so obvious. The problem with open source is you have to hit two home runs in order to even be in the game. First, you have to have a project that is interesting and popular, that people want to engage in. And for those guys that was Spark. The second, maybe even harder, is to figure out a way that it makes sense to monetize it and to keep a lead and to keep yourself as the primary player in that space. So look at where that's failed a little bit. So Redis: Redis is probably the fourth largest provider of Redis. Elastic is definitely the second largest provider of Elastic. HashiCorp may increasingly be becoming the second largest provider of, I mean, IBM may be the second largest partner of, um, Terraform. We're a little bit different, as we continue to be the largest provider of Influx. There are other third parties, and I know the question you're gonna ask but haven't asked, which is, what about the hyperscalers just taking your code and running it like they did with Elastic, or they did with Redis, and they did with Kubernetes, but you know, Google contributed Kubernetes, right? And so we have been, I don't know if it's lucky or whatever, but we have been very lucky in that, you know, Amazon runs InfluxDB and they pay us, and it's a very good business for us, and they're an amazing partner. That's not what happened with Elastic. It's not what happened with Redis. It's not what happened with a lot of open source projects. I think they have set the example with us in the last two years that it's better to partner with these open source vendors than to build your own development teams, fork it, do all this other stuff. And that's been a really generative thing.
So hopefully what that does is the next startups who do open source, they don't feel like they have to close off their licenses. They don't have to have infectious copyleft licenses. They don't have to do what Mongo did and Elastic did, and these other companies have done. Because, and you're an open source guy, if you build a good community, you've got a good chance of monetizing in some interesting way.
Viktor
00:25:25.007
I'm not sure that I would say that you have a good chance, but I would say that in certain areas you almost stand no chance without getting that initial adoption. Right. I mean, there are plenty of, I could say that compared to what it was, Docker is a failed company commercially. I mean, the, the, they're doing well now, but compared to.
Viktor
00:25:48.971
Yeah, exactly right. I do agree, assuming that that's what you meant, that the second stage, making a commercially successful product on top of open source, might be even more complicated than well-adopted open source itself. But it's also that there are areas where you simply don't even have a choice. Like today, you can easily create, let's say, a model-based company for LLMs and say, I'm going all closed source. Perfectly valid, right? But if you try to convince people to use, let's say, infrastructure as code, given such high competition in that area, in open source, basically there is no choice. I mean, or there is an illusion of choice, right? But you will never succeed.
Evan
00:26:35.099
That's part of how I got into this. So I'd been an infrastructure guy, mostly security and networking. And I was running a public company and we had a lot of Oracle, and we replaced a fair amount of it with open source Mongo. And then, you know, we built some big data pipelines based on, you know, Hadoop and other stuff. And so I saw the power of this stuff. I thought it was inevitable. And I think it is. I think in certain sectors, having a closed source product is a non-starter.
Viktor
00:27:01.746
I mean, I I would go as far as to say that if Oracle started, let's say later, I dunno how many years, but later, it would never become a company by being with the licenses that they had. They were just lucky to start. Not lucky, but they started early enough, right?
Darin
00:27:23.097
I wanna ask a clarifying question about something you just said there. You said you were the head of a public company
Darin
00:27:29.097
you were building things around Mongo and whatever The second thing was, I forgot
Evan
00:27:33.167
Hadoop, the big data infrastructure we were trying to assemble. One was successful, one was really freaking hard.
Darin
00:27:39.704
I'm guessing Mongo was really hard and Hadoop was very simple, right? It's, it's, yes.
Darin
00:27:45.704
Yes. Uh, yes. Uh, how important was it to you back then to make sure you were contributing back to those projects as they existed at the time?
Evan
00:28:00.309
That's a great question. It was not important at all. It's terrible to say it was not important at all. As a CEO of a company, it wasn't important at all. Now, if my people wanted to, I would be because they wanted to. I would totally be supportive. No issues. I wouldn't block it. I'd be supportive, but it just was not even on my radar.
Darin
00:28:22.908
here's something that's free. I can take off the shelf. It doesn't cost me anything in money. It's gonna cost me some body time, but it's gonna get me a lot faster and I don't have to pay anybody a single penny. Is that the thought process of a CEO?
Evan
00:28:35.853
That's right. That's right. But I would tell you, you know, this was 2012 or something. I guarantee if I was running that company today, I'd be paying MongoDB, because I'd want all the support, the added features, the stuff that they've added above the open source. Maybe I'd wanna run it on Atlas 'cause I'd save a bunch of money. So, what Mongo did that was so successful is they figured out the way to monetize it. I'm a little biased because the former CEO of Mongo is now on my board. But, um. No, no, it's no issue. It's no issue. You know, they started by selling support contracts. Support contracts are ephemeral. Once I realize I've got two support incidents a year, why am I paying you $20,000 for my two support incidents a year? I'll figure this out. But if you give me a service or if you give me added capability, I'll generally pay for it. It's still gonna be a lot cheaper than Oracle, and they have fewer lawyers.
Viktor
00:29:33.086
But that's human nature. Why would you invest work in something that does not benefit you? You will notice, actually, that companies running at massive scale are more likely to contribute, not because they're nice people or any of those things. I mean, individual employees, yes. And when I say contribute, I mean as a company policy. But because: we depend on this, if we don't put this into the project, we are toast, and we are one of five that actually need it.
Evan
00:30:10.281
That actually happens all the time. We have a project that's even more popular than our database, called Telegraf. We get tons of contributions to that because people want different connectors. Telegraf's a connection system, with different connectors to be compatible. I think there are 300 plus of these things out there. People contribute all the time. You can count a little bit on altruism, but you can't carry too much on altruism.
Viktor
00:30:35.166
Let's face it, and this might be harsh, what I'm going to say, but most open source maintainers are not doing it for altruism. Right? They're also getting paid by that company, right? Everybody does it for some selfish reasons to begin with. Well, not necessarily everybody, right? There are weekend projects, but I'm talking about serious dedication.
Evan
00:30:58.419
So we have an employee who's phenomenal, and he devotes almost his whole time to being the PMC for a big Apache project. And I will tell you, we pay his salary, and we use that stuff. So we're willing to do that. And he's also an amazing person, but I will tell you, he builds it because he's proud. He's just proud of building a really cool Apache product that lots of people are using. I mean, that was the original reason people became software engineers.
Viktor
00:31:29.230
oh. I, I have zero doubt about that. I'm just trying to say that he's still getting salary in a way to work on it. Uh, and you are giving, huh? Yeah.
Evan
00:31:41.936
Yeah. Would he be doing it as much if he just had to do this all on his own time? It'd
Viktor
00:31:45.711
yeah, kind of would he live kind of, uh, scraping by, you know, doing some odd part-time jobs so that he can dedicate every single minute he can of the rest of his time? To that for pure altruism, probably not. Right? That does not exclude pride, kind of. I get paid for much of the work I do, and I'm also proud of it. And it's open source, kind of great.
Evan
00:32:09.771
I think it's thriving. I think promiscuous license will continue to thrive. I think amazing things have been done on the Apache frameworks. Obviously, you know, Linux, Kubernetes, all the Apache projects, like this is our infrastructure, This is it.
Darin
00:32:24.247
How big of a deal are the foundations now? Rewinding back to your 2001, right? There were effectively no foundations. Linux Foundation might have been around then, I can't remember,
Evan
00:32:34.022
Yeah. I don't know, actually, when the Linux Foundation started. You know, I think they're important. I don't end up engaging much with them, but as sponsors of these projects, I think they're all pretty useful. Are they game changers? Should they be highly profitable businesses? I don't think so. But they're pretty useful.
Darin
00:32:52.442
That sort of goes against the concept, being highly profitable and being a foundation, but that's a whole different conversation.
Viktor
00:32:59.552
Oh yeah. I think that the question should be the other way around. I don't think, and this is my personal belief, that foundations are that beneficial for you as a company behind the project, but it is very beneficial for you as a user, because when the project is in a foundation, you cannot do a rug pull of the license or, you know, do some random stuff, at least not easily. You can, it's not literally impossible, but users are slightly more protected, right? Those that don't want to get the commercial version.
Evan
00:33:37.448
Yeah, as you said, they're not gonna get the rug pulled out. Generally, if it's a Linux Foundation or Apache project, I can't imagine how you'd pull the rug on an Apache project. But yeah.
Darin
00:33:46.688
I wanna go back to your mention of your partnership with AWS because it sounded interesting to me. Are they, you said they're paying you, but are they paying you for your commercial
Evan
00:33:57.458
Yeah, both. They pay to license our open source and run it, and then we offer our commercial product on their platform. It's called Timestream for InfluxDB. It's pretty much the same as InfluxDB, but you can just buy it directly on Amazon. And we have a good commercial relationship that works for them and works for us. And by the way, they worked hard to build that. They did a really good job with us because, as you suspect, we were a little bit like, okay, we've seen what happened with you guys in the past. But that is not the case here. They've done some really good work.
Viktor
00:34:30.613
If you look at yourself, Influx, and, let's say, some others, other extremes, was it in your case pure luck, or is that a strategy, or how you approach things? Like, why didn't AWS do the same thing as with Elastic?
Evan
00:34:48.533
I think there are a couple of reasons. One is, you know, they probably made that Elastic decision close to 10 years ago, and so now they've seen, you know, you'd have to ask them, but now they've seen what it takes to maintain their own fork of all this sort of stuff. They've seen what it takes to have a Valkey version of Redis. They've seen that this is expensive and their customers want faster turns or other stuff, and so all of a sudden they're a big development operation when maybe they don't need to be. Right? Maybe they could cultivate relationships with vendors. They can get the workloads on their platform, which is most important. And then when the workloads are on their platform, now you can use so many of their services, SageMaker, Redshift, Lambda, and integrate and use those tools. And you guys know data has gravity, so if I can get data on the platform, why would I have any impedance to getting data on the platform? You know, within reason, why wouldn't I just get as much data on the platform? And particularly in time series, and I say this humbly, we're the clear leader in that space. People have been asking them to run Influx even though they had their own product. And so it just fit, it was just good. There was a change in how they thought about it. And we happened to be there and, you know, we had customers who wanted to be on AWS with them. It just worked.
Viktor
00:36:08.187
I was more hoping for some words of wisdom, kind of, ah, because I did this and the
Evan
00:36:12.679
Oh, I'm a genius. No, no. It's 'cause of me, as you can see, because you can see video. I'm just a handsome, debonair executive that they couldn't resist.
Darin
00:36:22.182
Let me flip it around to something we haven't talked about at all. Okay. We've been talking about time series. Somebody's listening now and it's like, oh, I need to do some of that time series. Who doesn't need time series?
Evan
00:36:32.496
That's a great question. You know, I think if you're doing some sort of instrumentation, it's light work, you're not super worried about the operational nature of it. You might be posting some dashboards or things like that. You could easily use one of the observability vendors. You could easily use a general purpose database. You can do that if you don't care about that. The people who do need to use it are people who have operational workloads. People who not only have dashboards, but have things dependent on it. Like, they've got huge amounts of data. You know, we have a customer, NinjaOne, that has 10 million devices out there collecting at regular sampling rates, who are, you know, looking for security events. We have other customers in the security space. Like, you need an operational platform. You don't need observability, you need operational. You don't need analytics as much as you need, wow, I've gotta be able to, you know, I've gotta have zero time for the data to be read. I have to be able to act on the data in less than 15 milliseconds. Like, those are the people who need it. Other people, you know, they could spin up SQL Server and do fine. You know, we have a huge group of home users, believe it or not, because of the open source. So we have just people who instrument their thermostats, their pools, the different sensors around their house, and there are some incredibly cool projects. We don't monetize that at all. And occasionally we have a home user who brings us into their corporate job, and that's great, but it's not what we count on. I think you guys should actually become home users, and then we should do this whole process again.
Evan
00:38:19.814
It depends. Petabytes of data. It's more about how much you're writing simultaneously than what's collected, 'cause what's collected over time is huge. But, you know, can you write a billion points simultaneously? How many devices at the end of it? I'm sure I can get you guys some of the bigger projects we have and share with you, but,
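The distinction between write rate and stored volume can be sketched with some back-of-the-envelope arithmetic. The device count echoes the 10 million devices mentioned earlier in the conversation; the metrics per device, sampling interval, and retention window are assumptions made up for illustration, not figures from the show, and the function names are hypothetical.

```python
def points_per_second(devices: int, metrics_per_device: int, sample_interval_s: float) -> float:
    """Sustained write rate if every device reports every metric once per interval."""
    return devices * metrics_per_device / sample_interval_s


def stored_points(rate_per_s: float, retention_days: int) -> float:
    """Total points accumulated over a retention window at a constant write rate."""
    return rate_per_s * 86_400 * retention_days


# Assumed workload: 10 million devices, 5 metrics each, one sample every 10 seconds.
rate = points_per_second(devices=10_000_000, metrics_per_device=5, sample_interval_s=10)
# Assumed retention: 30 days.
total = stored_points(rate, retention_days=30)

print(f"{rate:,.0f} points/s")
print(f"{total:,.0f} points in 30 days")
```

Under these assumptions the stored total grows with the retention window, while the per-second write rate is the load the database has to absorb continuously.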
Darin
00:38:39.239
So no exabytes that you've seen at least, or at least none that you can speak of.
Evan
00:38:43.709
I don't, yes, I'm sure there are. I, we're not talking about it. I'm not, I'm not. I
Darin
00:38:51.569
It would be, you know, government, three-letter companies that would have that kind of, but hey, we won't get into that either. Anything else? I mean, you've been in this space for a long time. We've sort of jumped all around. Anything else you wanna talk about just
Evan
00:39:07.784
No, only that if I could get one message across to your audience, it's that there are times and places for developers to start with a time series database and not just use the database they've always had, or something they've spun up fast, Postgres or something like that. And that's, you know, often when they're reading either virtual or physical sensors and they know that over time, the amount of data they're gonna be working with is gonna be both expensive and needs to be performant and needs to come in. Our stuff is easy to start with. It's schema on write. It was sort of modeled after Mongo, to be honest with you. Mongo, if you remember Mongo in the early days, their brand was easy to use, easy to get started, versus, you know, your standard relational SQL database, and we very much have built on that brand. Time to Awesome is relatively quick and scale is pretty amazing. So just know when a developer should start with it. That would be the most important thing.
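As a rough illustration of what "schema on write" means in practice, here is a minimal in-memory sketch. It is not InfluxDB's API or storage model; the `SchemaOnWriteTable` class and the sensor names are hypothetical, and the only point is that new columns appear when data carrying them is written, with no table definition or migration step up front.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaOnWriteTable:
    """Hypothetical in-memory table whose columns appear as data arrives."""
    columns: set = field(default_factory=set)
    rows: list = field(default_factory=list)

    def write(self, timestamp_ns: int, **values) -> None:
        # No CREATE TABLE or ALTER TABLE step: unseen keys simply become columns.
        self.columns.update(values)
        self.rows.append({"time": timestamp_ns, **values})

    def select(self, column: str) -> list:
        # Rows written before a column existed report None for it.
        return [(row["time"], row.get(column)) for row in self.rows]


table = SchemaOnWriteTable()
table.write(1_000, sensor="pool", temp_c=27.5)
table.write(2_000, sensor="pool", temp_c=27.7, ph=7.2)  # new "ph" column, no migration

print(sorted(table.columns))
print(table.select("ph"))
```

The contrast is with a standard relational setup, where the `ph` column would have to be declared before the second write could succeed.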
Darin
00:40:04.969
Sounds good. So everything InfluxDB can be found at influxdata.com. And again, what I was gonna ask is, do you know what the home project is for
Evan
00:40:25.844
'cause we still like stars. Even though we're north of 30,000, we still like them.
Evan
00:40:35.249
You know, it's a little bit of pride. Like all these people, you know, these people after Ted, you know, projects. Listen, open source projects, if they don't have velocity, they die. And so I'm always looking at the rate of change in stars. And so if stars continue to go up at a nice rate, people can see that we're continuing to add, we're continuing to work. You know, I think if you were to ask Paul, my founder, he'd like to have a database like Postgres that's 40 years old or whatever. We have some pride in that, independent of the commercial instincts.
Darin
00:41:21.790
Yeah. 30 years from now, that would be, uh, let's see, do the math. That would be 90. That would be about right. And you should have at least a hundred thousand stars by then. We need more than that.
Darin
00:41:32.550
Wow. That time series database really does interesting things with math. That's all I can say.
Darin
00:41:42.330
Transformers. Alright. Again, everything about Influx can be found at influxdata.com. Again, open source at GitHub. Evan, thanks for being on the show today.
Evan
00:41:51.345
Hey guys. Thanks for having me. It was a real pleasure. Enjoy Barcelona.