Evan 00:00:00.000 If I've got a robot in my house that doesn't respond in milliseconds, you know, I might have a dead cat who I really loved. I really loved that cat and the robot accidentally ran it over or basically mowed it up. Or if I'm building something of high precision, you know, I just need to be able to have time series based resolution of what's happening.
Darin 00:01:24.669 Back in episode 335, we had Peter on from Coru, and Viktor, you brought up a point that, especially in observability, in the old days, we would keep more recent data and we would start aggregating older data just to save on cost. Do you still agree with that?
Viktor 00:01:41.469 I do, I mean, I do agree with basically not keeping things that you don't need, be it in software or you know, in your house. Doesn't matter.
Darin 00:01:51.887 So it's gonna be interesting how we take this today, because coming from our primary world, you know, we think in seconds and maybe milliseconds, right? That's not a big deal, you know? 'cause you think about Kubernetes responding or waiting for a Jenkins build to finish up. It's like, okay, who cares? You know? It's not that big of a deal. But
Viktor 00:02:09.962 Can I be mean for a second?
Darin 00:02:12.047 okay, sure.
Viktor 00:02:13.262 Jenkins builds never respond in seconds. I,
Darin 00:02:17.572 Okay.
Viktor 00:02:19.502 so you don't need to worry about that one.
Darin 00:02:21.917 Okay. You know, I still work with it. Come on, gimme a break. I could say something negative about your project too,
Viktor 00:02:27.877 Go for it. Go for
Darin 00:02:28.817 No, I won't do it now. But let's assume this for a second. What if we're waiting for a car to stop? Do you think we could wait seconds or milliseconds for it to respond? Or maybe a manufacturing arm, you know, a robot arm moving things around. That seems like it would be a bad thing. On today's show, we have Evan Kaplan on from InfluxData. Okay, Evan, how you doing today?
Evan 00:02:48.792 Good. Good. How you guys doing?
Darin 00:02:50.432 Good. I heard you snickering in the background a little bit about some of the earlier things, but what do you think about the car and the arm thing? Because to me, we used to call it real-time data. I think there has to be a better phrase for it than real time, even though that's what people still call it. What are you seeing, and how do you think about it? Viktor's context, in case you haven't listened to the episode, is that we used to be able to throw data away, but now that we have AI, we want as much data as we can get so it can potentially help us make some better choices. But in the examples I was just bringing up, we're talking about making a choice now. Like if we need to shut an arm down because it's getting ready to blow a bearing or something, what are we thinking about these days?
Evan 00:03:36.252 So you guys share some background with me in that you're not new to this industry. People have been talking about real time, you know, forever. RTOS operating systems, just, you know, the narrative. Nobody knows what real time is. I'm not sure it's an important distinction. I'm not sure it matters. I'm not sure the label helps. Of course, we use it in our marketing, and we use it because it signifies some broad concept that at least things don't have ridiculous amounts of latency. That's about the utility of the term. But let's ground it, since you started with the little bit controversial, or not really controversial: that car, that robot arm, that dynamic is that we're clearly moving into a physical world of increasing automation. And if you look at sort of the backdrop of the universe, if you think the counterforce is entropy, then you think the force against entropy is consistent organization. And for us as humans, that means increasing automation, building systems that are self-healing, self-adapting, eventually completely autonomous. That's where we want our systems to go. They don't all need to be there. They certainly don't need to be there tomorrow. Some of 'em are highly useful, like the self-driving car when it gets there, but that's the journey that all these systems are on. And so when you talk about microseconds, or you talk about observability data, which most people are happy with every second: not particularly useful. You know, if I've got a robot in my house that doesn't respond in milliseconds, you know, I might have a dead cat who I really loved. I really loved that cat and the robot accidentally ran it over or basically mowed it up. Or if I'm building something of high precision, you know, I just need to be able to have time series based resolution of what's happening. The higher the sampling rate, the more fidelity I'm gonna have and the better picture of the real world I'm gonna have. So, let me not go on and on here. I took a little thread and I took it forward, but tell me where you want to go with this.
Viktor 00:05:43.145 When we talk about getting quick responses and whatnot, wouldn't the logical direction then be direct communication? No intermediary at all. I mean, I know that's not feasible, but if somebody wants something fast, then just remove anything in between those two endpoints.
Viktor 00:05:43.145 when we talk about getting quick responses in what's not, uh, wouldn't the logical. Direction, then be direct communication. No intermediary at all. I mean, I know that's not feasible, but if somebody wants past something, then just remove anything in between of those two end points.
Evan 00:06:02.825 That would be great, except you want to build intelligence. And intelligence ideally can be resident at whatever the end point is, but isn't always available at the end point. And the nature of pipelines and data, you know, doesn't, doesn't often allow that. Right? And then you have just, you know, like these are just electrons moving through space. So simultaneous, you know, is, um, it's a little bit challenging, but I think you're right. The least impedance is the most beneficial for, if you want real time.
Darin 00:06:30.493 Isn't sort of the gold standard for real time a human? Or you would hope so, at least.
Viktor 00:06:36.573 We are not. You could be real time only when you change the rules of the speed of light. Make it instantaneous.
Darin 00:06:45.057 Okay. But okay. It depends. So self-driving cars, let's stay there for just a second. As a human, I still don't trust self-driving cars. I won't get in a self-driving car.
Viktor 00:06:54.672 Well, as a human, I don't trust other humans driving cars,
Evan 00:06:58.032 I,
Darin 00:06:58.197 Well, that's what I was gonna say as well. so
Evan 00:07:02.187 it's all a matter of risk. It's all a matter of what risk you want.
Darin 00:07:05.847 Yeah, my premise I guess is falling apart, but it seems like, how does, okay, so we have time series data. Let's play this out for a second. We have time series data coming in. How long do I need to keep the time series data?
Evan 00:07:18.441 Depends on the problem you're trying to solve. Depends on the fidelity you want. It depends on what you wanna learn from it. The most common time series use case is: I collect really high resolution data for short periods of time, and then I downsample it and store it for long periods of time. Anything that would give me the signal I need and separate the signal from the noise, that's the period you want to keep. Now, if I want increasing fidelity and incredible intelligence about the physical world, I'm gonna keep more data that's more descriptive. You know, maybe an easy metaphor for you is a photograph, right? Do I want a two megapixel photograph that works fine for 98% of cases? I'm showing it on a screen. I'm showing it on my thing. Or do I want, you know, an 18 megapixel photograph that allows me to locate the tiniest portion or pixel on the image? It depends on your utility. My theory is that we want increasingly high resolution data to instrument the physical world, that we're entering a very different world with physical AI. And so it's not okay for the car to respond in one second. The car's gotta respond if it's moving at 60 miles an hour, and we can do the calculations. We're not assuming a car's moving at the speed of light, but we're assuming the car's moving at 60 miles an hour. It's gotta respond in some significant sub-second time to things that interject themselves in its path. This is the world we're going into. These factory arms. Think about the manufacture of a semiconductor: precision, accuracy, pattern. We're talking nanometers, nanoseconds, you know, we're talking high resolution. This is the world we're going into.
Darin 00:09:01.122 Is nano now the true second? Because we used to think about milliseconds, right? Thinking about websites or any kind of response times from disk, we were thinking milliseconds.
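A minimal sketch of the collect-at-high-resolution, then downsample-for-long-term-storage pattern Evan describes. It is plain Python rather than any particular database's API; the function name `downsample`, the millisecond integer timestamps, and the sample data are all assumptions made for illustration, and averaging is only one of several possible aggregations.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_ms):
    """Average (timestamp_ms, value) points into fixed-width time buckets.

    Returns one (bucket_start_ms, mean_value) pair per non-empty bucket,
    sorted by time.
    """
    buckets = defaultdict(list)
    for ts_ms, value in points:
        # Integer division snaps each timestamp to the start of its bucket.
        bucket_start = (ts_ms // bucket_ms) * bucket_ms
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical raw data: one reading every 100 ms for 3 seconds.
raw = [(i * 100, float(i)) for i in range(30)]

# Keep one averaged point per second for long-term storage.
per_second = downsample(raw, bucket_ms=1000)
print(per_second)
```

Here 30 raw points collapse into 3 stored points. The raw series can then be dropped once it is older than whatever retention window the use case needs.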
Evan 00:09:17.042 Depends on your system. Our default period is nanosecond. Very few people use it. Some of the quantum computing customers we have use it, but our default is nanosecond. So if you want to store it at nanosecond, you can. It's expensive, as you'd expect.
Viktor 00:09:32.239 I guess the answer to that question is whatever you need, and not more granular than that. Or not faster than that, right? Because the moment you go above what you need, you are paying for it. And not only in money, but also in performance in many different ways, right? If you would say 10 milliseconds for the car is ideal, why would you go for one millisecond? I am not saying it's not better, but you'd say, I don't need more than that. Or why would you have an infinite amount of pixels in a photo that you're going to show on a mobile?
Evan 00:10:11.782 So if what you're saying is fit for purpose, of course, I totally agree. The question is, how does the purpose change? Right. My initial thing is I just need the photograph to show anything. It turns out it's a piece of evidence in some sort of murder case, and all of a sudden I want a high resolution to capture some tiny, anyway. I mean, you get the simple idea: I don't always know a priori what that period is. Of course, I should make some estimate, put a range on it and do it, which is what most people do.
Viktor 00:10:39.888 The tricky part, I feel, is that you also don't know what your needs will be in the future. So if you go back to my photo example, right? Let's say that my mobile cannot show more than a one megapixel photo, right? Maybe it will tomorrow. So do I over-optimize for tomorrow, when I don't know what it is, or what do I do, right? Do I pay today for tomorrow?
Evan 00:11:04.188 What did you do when you rented your apartment in Barcelona? Did you plan on having three more kids and a family? And so you know how you make these judgments.
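To put a rough number on "expensive", here is a back-of-the-envelope sketch of how many points one sensor would produce per day if it actually sampled at each interval (as opposed to merely storing timestamps with nanosecond precision). The 16 bytes per point, an 8-byte timestamp plus an 8-byte value with no compression, is an assumption for illustration only; real time series engines compress heavily.

```python
SECONDS_PER_DAY = 86_400
NS_PER_SECOND = 1_000_000_000
BYTES_PER_POINT = 16  # assumption: 8-byte timestamp + 8-byte float, uncompressed


def points_per_day(interval_ns):
    """Number of samples one series produces per day at a fixed interval."""
    return SECONDS_PER_DAY * NS_PER_SECOND // interval_ns


for label, interval_ns in [("1 s", 10**9), ("1 ms", 10**6), ("1 us", 10**3), ("1 ns", 1)]:
    n = points_per_day(interval_ns)
    gigabytes = n * BYTES_PER_POINT / 1e9
    print(f"{label:>5}: {n:.3e} points/day, about {gigabytes:.3g} GB/day uncompressed")
```

Each thousandfold increase in sampling rate multiplies the raw volume by a thousand, which is why resolution is usually matched to what the problem needs.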
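The 60 miles an hour calculation mentioned earlier can be sketched as follows: how far the car travels before it even begins to respond, for several response latencies. The function name and the chosen latencies are illustrative; braking distance is deliberately left out.

```python
METERS_PER_MILE = 1609.344
SECONDS_PER_HOUR = 3600


def distance_during_latency(speed_mph, latency_s):
    """Meters traveled at a constant speed while waiting latency_s seconds."""
    speed_m_per_s = speed_mph * METERS_PER_MILE / SECONDS_PER_HOUR
    return speed_m_per_s * latency_s


for latency_s in (1.0, 0.1, 0.01, 0.001):
    d = distance_during_latency(60, latency_s)
    print(f"{latency_s * 1000:7.1f} ms -> {d:.3f} m")
```

At 60 mph the car covers roughly 26.8 meters in one second, about 27 centimeters in 10 milliseconds, and under 3 centimeters in 1 millisecond, which frames the fit-for-purpose question of whether the extra resolution is worth paying for.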
Viktor 00:11:12.218 Yeah.
Evan 00:11:13.092 I think the really important point here, guys, to pull out is that we're in an age, and you guys have been around this sort of infrastructure over a long period of time, where you've gotten these really quality, specialized tools on the data side. And you know, it started years ago with document data, search data, graph data, time series data, vector data. You can get more efficient. You can get purpose-built stuff that actually solves significant problems. And so the tool set you have is just so much wider. It's as if before you had a hammer and a screwdriver and you were gonna build pretty much everything, but now you've got a full set of tools that you can work with. That's the change in the world that I think is important. It doesn't mean you should have a full set of tools for every job, but it just means you have those tools available, and I think that's what's changing.
Darin 00:11:58.676 So, in other words, there needs to be a Home Depot or a Lowe's right down the street from where we're working. So if we need a new tool we can go get
Evan 00:12:04.771 Or a virtual one. Or a virtual one that's instantaneously available.
Darin 00:12:08.846 That's instant. Right? I wanna nerd out for just a second, or at least to me it's nerdy. You were saying nanosecond is your base time series, or time second, whatever. How do you even have hardware that can write that fast? Keep up? That seems very odd to me. And you said quantum. Okay, sure. But I can't go down to my local Micro Center, rest in peace, Fry's,
Evan 00:12:33.388 Actually, I don't think the hardware needs to catch up. I don't think it's capturing every nanosecond, but it's breaking up the data. It captures the nanoseconds and reads, so, so, because, yeah, but the idea is, well, how do I wanna store the data and if I want to capture it in this
Darin 00:12:47.809 Do you think this whole AI thing is really coming into play? I mean, there's always been machine learning, for decades. I mean, obviously generative AI is sort of the new kid on the block, has been for a couple of years, but is that really helping us? Is it hurting us? Where's that going?
Viktor 00:13:04.991 Remove it from those cars you're afraid of, and let me know how it went.
Evan 00:13:09.649 I can't talk about, there are other people way smarter than me about the broader implications on society. I could opine on it forever. We all have our thoughts. I'm not sure I'm the useful guest for that. But in terms of generative AI, I find it amazingly useful. Our coders find it amazingly useful. I find it in my personal life amazingly useful. I just think it's a main tool. It's not what we do. I'm much more interested in deterministic models of the world rather than probabilistic models. For the digital world, for data, for language, probabilistic models work just fine. But for the physical world, you need deterministic models. Again, I don't want my robot to run over my cat. It's pretty simple. I don't want my self-driving car to hit the pedestrian. And so I don't want probabilistic models, I want deterministic models. In the generative world, we're collecting digital data, and that's pretty much all collected and it appears every day. It's re-scraped. You get new data every day, and now they're building synthetic data to make their models. In the physical world, the amount of data you can collect is, by definition, infinite. By definition, infinite. Tell me the system you want to build. I'll tell you the period of data to collect, the amount of data you want to collect to train the model, and then you can collect it. That's what's new, you know, we can build models of the physical world that actually are quite deterministic. It's not new actually, it's happening. It's been machine learning. It's just accelerated.
Darin 00:14:41.157 Well, I think that's where time series really does come in, because I mean, we've had time series data stores in the past as well, but that's really to me, better than a relational database for building out models.
Evan 00:14:52.347 A hundred. No, it's not even close. So if you built this thing 30 years ago and it still exists today, there are time series applications running on Oracle today, right there. If you built it 20 years ago, you ran it on MySQL. Maybe at 15, you built it on HBase or something like that. You know, it's only in the last five or six years that people have built really meaningful stuff on dedicated time series platforms: satellites, self-driving cars, battery power systems, energy management alternatives, just, you know, just the app. Now that you have a specialized tool, you can get that resolution, be less expensive, and be more responsive, which is where all these systems have to go over time. No pun intended.
Darin 00:15:36.282 Why did time series need to exist?
Evan 00:15:39.297 Because as you started using the general purpose databases, relational databases, MySQL, PostgreSQL, you realize that you're just not optimizing for it. If you know your vector of search, you know your index is primarily time, you can change your ingest level. You can change the time it takes to write. You can change your query time and query speed. You can change your cost of storage if you specialize. If you don't, you pay all the prices by not specializing. If you want to take your minivan and you want to race in the F1 series, it'll be fine. It'll go around the track. It'll do what it's supposed to do. If you don't get run over.
Darin 00:16:18.367 And you're guaranteed to come in last at that point too. But you know, that's, could you imagine a minivan riding around the Las Vegas F1 track? That would be,
Evan 00:16:26.942 I am sure somebody's outfitted one to do just that.
Viktor 00:16:30.244 Now we are entering into the discussion of, you know, all-encompassing databases like Elastic, that actually are trying to be, not everything, but all-encompassing, at least within an area. You know, logs, metrics, traces,
Evan 00:16:45.908 That's that whole space, you know, that's the whole space. They all want to be the MELT stack. They want to have metrics, events, logs, and traces. And so Elastic is one of 'em, but there are, you know, there are 30 others.
Viktor 00:16:58.594 But that almost feels like, inevitable is the wrong word, because it's obviously not inevitable, but I'm going to keep using it: the inevitable expansion of companies, right? You find your niche, you say, I'm very good at logs. Where do I expand my business? To metrics, right? How do I do that? By modifying what I have.
Evan 00:17:21.801 You build a tool, it's useful. Splunk did this before Elastic, right? You build something, you build a tool, it's useful, and you figure out how else you can use it. That's just capitalism. That's just, you know, how do I leverage my adjacency? You'd expect vendors to do it, but you'd also expect customers and architects like you to be very picky about what you assemble as a solution, and sometimes a one-size-fits-all makes a lot of sense for you, right? Just a simple MongoDB, they claim to do graph, vector, time, everything. Maybe it's good enough, I don't know. You gotta figure out, as a developer or an architect, what is the thing you're building and what do you need?
Darin 00:18:00.122 When Influx, InfluxDB specifically, was created, you weren't around for that. You came in a little bit later, right?
Evan 00:18:07.947 Yeah, I came in pre-revenue, but when I joined, there were 3,000 instances in the wild running daily. Today there are 1.4 million that run daily, and so pretty early, but I'm not a founder.
Darin 00:18:24.813 You were in early enough. Why did the founders even decide to create a time series? Was it that they just got bored and decided, let's go do this this weekend?
Evan 00:18:34.803 Like most good businesses, it's a pivot. Like most of this stuff, it's a pivot. My partner, Paul Dix, the founder, he had worked on Wall Street. He had built a lot of application code on top of existing databases to do financial trading for different firms. And so he had that experience, but he didn't go right there. What he did was he tried to start something like a mini Datadog, a simple cloud-based, in those days we didn't call it observability, server monitoring platform. And he realized that was gonna be a big lift and he'd have to raise a lot of money. And he was deeply convicted about open source. And he wanted to work on an open source thing anyway, and so the database that he originally built to collect the metrics and events for that cloud-based service, he then open sourced it. He wrote it in Go. He open sourced it, and it became immediately popular because it handled the four or five things that I mentioned. It had amazing ingest. It had super fast query time. It had the ability to evict data very fast for downsampling and resolution. It was cheap to store the data. So we solved a set of problems that, if you were a developer, you'd have to go solve on your own if you used Postgres or MySQL, which would've been the state of the art at that point. It was just really one of these organic popularity things. And it helped that it was originally a Go project. Go was early and Docker had just come out, and so it was a really interesting time for Go. But mostly it was just really super easy to use. People got started right away and it was powerful.
Darin 00:20:11.728 It's interesting you, you bring that up. So I'm guessing that was around 2011? 12.
Evan 00:20:15.418 13. He made the first, he started in early 13, made the first commits, I think publicly in 13.
Darin 00:20:21.028 So we're talking 12 years ago. That was early Go. I mean, early. That would predate even Kubernetes, Viktor. Is that possible?
Evan 00:20:29.878 It predates Kubernetes. I hope Viktor doesn't get insulted by that.
Darin 00:20:34.453 No, he,
Viktor 00:20:34.863 Oh, I've been around.
Darin 00:20:38.464 That's actually pretty interesting because do you think it was probably one of the first major projects that took on
Evan 00:20:43.939 It was a very early project. I think Docker preceded it. I don't know. I remember there were some other really interesting projects, but I don't actually remember. When I came to the company, it still was not really a revenue company. I came in early '16,
Darin 00:20:57.996 So what's it been like from 16 to now? 25
Evan 00:21:01.594 it's been.
Darin 00:21:03.454 Inter, I mean, okay, it's, we've had a lot of things happen over these past nine years.
Evan 00:21:08.509 Well, you know, I was young. I had jet black hair and now, your audience can't see me, but I've got gray hair. And it's actually been a really nice run. I really love working on open source projects. We have chosen to keep it promiscuously licensed, which has become almost controversial these days, as most of our competitors have gone to these secondary licenses. So we kept it MIT and Apache. We have a really vibrant OSS community, so that's been feeding us, and the business has grown consistently and nicely over that period of time. And so we're now of a size where we can really continue to, you know, build something of significance.
Viktor 00:21:47.284 If you started all over today, would you still keep it MIT?
Evan 00:21:52.035 You know, this is a debate between Paul and I. I think I would. I think I would. I think what our approach is, really, is open core. We're committers on Apache DataFusion, we're committers on Apache Arrow. So we're very much involved in the whole stack. And we make contributions to the open source, not just our own project. And so one of our core values is we're deeply committed to open source. And I think the right method for open source is we just don't put everything in the open source, right? That's just it. We're open core, really.
Viktor 00:22:25.183 To be clear, I'm an open source person completely. It's just that I can see certain concerns being raised about whether this system actually works for new players. I can feel, I cannot prove it, that some new startups are learning the lesson, and not necessarily learning it correctly, but saying, okay, maybe this is not actually beneficial for me, starting today, in a way, right?
Evan 00:22:58.618 I still think it's very beneficial, particularly for infrastructure. And I help out other open source companies and I'm pretty committed to the whole ecosystem. One of my advisors is Ali Ghodsi, who's the CEO of Databricks. And Ali said something I thought was very wise, but not so obvious. The problem with open source is you have to hit two home runs in order to even be in the game. First you have to have a project that is interesting, popular, that people want to engage in. And for those guys that was Spark. The second, maybe even harder, is to figure out a way that it makes sense to monetize it and to keep a lead and to keep yourself as the primary player in that space. So look at where that's failed a little bit. So Redis, Redis is probably the fourth largest provider of Redis. Elastic is definitely the second largest provider of Elastic. HashiCorp may increasingly be becoming the second largest provider of, I mean, IBM may be the second largest provider of Terraform. We're a little bit different, as we continue to be the largest provider of Influx. There are other third parties, and I know the question you're gonna ask but haven't asked, which is, what about the hyperscalers just taking your code and running it like they did with Elastic, or they did with Redis, and they did Kubernetes, but you know, Google contributed Kubernetes, right? And so we have been, I don't know if it's lucky or whatever, but we have been very lucky in that, you know, Amazon runs InfluxDB and they pay us, and it's a very good business for us, and they're an amazing partner. That's not what happened with Elastic. It's not what happened with Redis. It's not what happened with a lot of open source projects. I think they have set the example with us in the last two years that it's better to partner with these open source vendors than to build your own development teams, fork it, do all this other stuff. And that's been a really generative thing. So hopefully what that does is the next startups who do open source, they don't feel like they have to close off their licenses. They don't have to have infectious copyleft licenses. They don't have to do what Mongo did and Elastic did, and these other companies have done. Because, and you're an open source guy, if you build a good community, you've got a good chance of monetizing in some interesting way.
Viktor 00:25:25.007 I'm not sure that I would say that you have a good chance, but I would say that in certain areas you almost stand no chance without getting that initial adoption. Right. I mean, there are plenty of, I could say that compared to what it was, Docker is a failed company commercially. I mean, the, the, they're doing well now, but compared to.
Evan 00:25:46.451 For a different, for a totally different use case,
Viktor 00:25:48.971 Yeah, exactly right. I do agree, assuming that that's what you meant, that actually the second stage, making a commercially successful product on top of open source, might be even more complicated than well-adopted open source itself. But it's also that there are areas where you simply don't even have a choice. Like today, you can easily create, let's say, a model-based company for LLMs and say, I'm going all closed source. Perfectly valid, right? But if you try to convince people to use, let's say, infrastructure as code, given such high competition in that area in open source, basically there is no choice. I mean, or there is an illusion of choice, right? But you will never succeed.
Evan 00:26:35.099 That's part of how I got into this. So I'd been an infrastructure guy, mostly security and networking. And I was running a public company and we had a lot of Oracle and we replaced a fair amount of it with open source Mongo. And then, you know, we built some big data pipelines based on, you know, Hadoop and other stuff. And so I saw the power of this stuff. I thought it was inevitable. And I think it is. I think in certain sectors, having a closed source product is a non-starter.
Viktor 00:27:01.746 I mean, I would go as far as to say that if Oracle started, let's say, later, I dunno how many years, but later, it would never become a company with the licenses that they had. They were just lucky to start. Not lucky, but they started early enough, right?
Evan 00:27:20.082 Yeah. You know, you try to fit your times.
Darin 00:27:23.097 I wanna ask a clarifying question about something you just said there. You said you were the head of a public company
Evan 00:27:28.372 Mm-hmm.
Darin 00:27:29.097 You were building things around Mongo and whatever the second thing was, I forgot
Evan 00:27:33.167 Hadoop, the big data infrastructure. We were trying to assemble that. One was successful, one was really freaking hard.
Darin 00:27:39.704 I'm guessing Mongo was really hard and Hadoop was very simple, right? It's, it's, yes.
Evan 00:27:44.374 of bef Oh, you were joking.
Darin 00:27:45.704 Yes. Uh, yes. Uh, how important was it to you back then to make sure you were contributing back to those projects as they existed at the time?
Evan 00:28:00.309 That's a great question. It was not important at all. It's terrible to say it was not important at all. As a CEO of a company, it wasn't important at all. Now, if my people wanted to, I would be because they wanted to. I would totally be supportive. No issues. I wouldn't block it. I'd be supportive, but it just was not even on my radar.
Darin 00:28:19.338 Can I make a guess as to why that is?
Evan 00:28:21.843 Sure.
Darin 00:28:22.908 Here's something that's free. I can take it off the shelf. It doesn't cost me anything in money. It's gonna cost me some body time, but it's gonna get me there a lot faster and I don't have to pay anybody a single penny. Is that the thought process of a CEO?
Evan 00:28:35.853 That's right. That's right. But I would tell you, you know, this was 2012 or something. I guarantee if I was running that company today, I'd be paying MongoDB, because I'd want all the support, the added features, the stuff that they've added above the open source. Maybe I'd wanna run it on Atlas 'cause I'd save a bunch of money. So what Mongo did that was so successful is they figured out the way to monetize it. I'm a little biased because the former CEO of Mongo is now on my board. But, no, it's no issue. You know, they started by selling support contracts. Support contracts are ephemeral. Once I realize I get two support incidents a year, why am I paying you $20,000 for my two support incidents a year? I'll figure this out. But if you give me a service or if you give me added capability, I'll generally pay for it. It's still gonna be a lot cheaper than Oracle, and they have fewer lawyers.
Viktor 00:29:33.086 But that's human nature. I mean, why would you invest work in something that does not benefit you? You will notice, actually, that companies running at massive scale are more likely to contribute, simply because of their systems. They might not contribute because they're nice people or any of those things. I mean, individual employees, yes. And when I say contribute, I mean as a company policy. But because we depend on this, if we don't put this into the project, we are toast, and we are one of five that actually need it.
Evan 00:30:10.281 That actually happens all the time. We have a project that's even more popular than our database, called Telegraf. We get tons of contributions to that because people want different connectors, Telegraf's a connection system, different connectors to be compatible. I think there are 300 plus of these things out there. People contribute all the time. You can count a little bit on altruism, but you can't carry too much on altruism.
Viktor 00:30:35.166 Let's face it, and this might be harsh, what I'm going to say, but most open source maintainers are not doing it for altruism. Right? They're also getting paid by that company, right? Everybody does it for some selfish reasons to begin with. Well, not necessarily everybody, right? There are weekend projects, but I'm talking about serious dedication.
Evan 00:30:58.419 So we have an employee who's phenomenal, and he devotes almost his whole time to being the PMC for a big Apache project. And I will tell you, we pay his salary, and we use that stuff. So we're willing to do that. And he's also an amazing person, but I will tell you, he builds it because he's proud. He's just proud of building a really cool Apache product that lots of people are using. I mean, that was the original reason people became software engineers.
Viktor 00:31:29.230 Oh, I have zero doubt about that. I'm just trying to say that he's still getting a salary, in a way, to work on it. Uh, and you are giving, huh? Yeah.
Evan 00:31:39.400 to eat. Yeah.
Viktor 00:31:40.875 exactly.
Evan 00:31:41.936 Yeah. Would he be doing it as much if he just had to do this all on his own time? It'd
Viktor 00:31:45.711 Yeah, kind of, would he live kind of, uh, scraping by, you know, doing some odd part-time jobs so that he can dedicate every single minute he can of the rest of his time to that, for pure altruism? Probably not. Right? That does not exclude pride, kind of. I get paid for much of the work I do, and I'm also proud of it. And it's open source, kind of great.
Evan 00:32:09.771 I think it's thriving. I think permissive licenses will continue to thrive. I think amazing things have been done on the Apache frameworks. Obviously, you know, Linux, Kubernetes, all the Apache projects, like this is our infrastructure. This is it.
Darin 00:32:24.247 How big of a deal are the foundations now, rewinding back to your 2001, right? There were effectively no foundations. Linux Foundation might have been around then, I can't remember,
Evan 00:32:34.022 Yeah. I don't know actually when Linux Foundation started. You know, I think they're important. I, we don't, I don't end up engaging much with them, but as sponsoring these projects, I think they're all pretty useful. Are they game changers? Should they be highly profitable businesses? I don't think so. But they're pretty useful.
Darin 00:32:52.442 that sort of goes against the concept of being highly profitable and being a foundation, but. That's a whole different conversation.
Evan 00:32:58.262 it happens.
Darin 00:32:59.202 Yes,
Viktor 00:32:59.552 Oh yeah. I think that the question should be the other way around. I, I don't think that actually, this is my personal belief, that foundations are that beneficial for you as a company behind the project, but it is very beneficial for you as a user, because then actually, when a project is in a foundation, you cannot do a rug pull of the license or, you know, do some random stuff, at least not easily. You, you can, it's not literally impossible, but it means users are slightly more protected, right? Those that don't want to get the commercial version.
Evan 00:33:37.448 yeah, as you said, they're not gonna get the rug pulled out. Generally, if it's a Linux Foundation or Apache project, I can't imagine how you'd pull the rug on an Apache project. But yeah.
Darin 00:33:46.688 I wanna go back to your mention of your partnership with AWS because it sounded interesting to me. Are they, you said they're paying you, but are they paying you for your commercial
Evan 00:33:57.458 Yeah, both, both. They pay to license our open source and run it, and then we offer our commercial product on their platform. They, they, it's called Timestream for InfluxDB. It's pretty much the same as InfluxDB, but you could just buy it directly on Amazon. And we have a, a good commercial relationship that works for them, works for us. And by the way, they worked hard to build that. They did a really good job with us because as you suspect, we were a little bit like, okay, we've seen what happened with you guys in the past. But that is not the case here. They've done some really good work.
Viktor 00:34:30.613 If you look at yourself, Influx, and let's say some others, other extremes, was it in your case pure luck, or that's a strategy, or how, how you approach things? Kind of like, why didn't AWS do the same thing as with Elastic?
Evan 00:34:48.533 I think there are a couple reasons. One is you're now, you know, they probably made that Elastic decision close to 10 years ago, and so now they've seen, you know, you'd have to ask them, but now they've seen what it takes to maintain their own fork of all this sort of stuff. They've seen what it takes to have a, a Valkey version of Redis. They've seen that this is expensive and their customers want faster turns or other stuff, and so all of a sudden they're a big development operation when maybe they don't need to be. Right. Maybe they could cultivate relationships with vendors. They can get the workloads on their platform, which is most important. And then when the workloads are on their platform, now you can use so many of their services, SageMaker, Redshift, Lambda, and integrate and use those tools. And you guys know data has gravity, so if I can get data on the platform, why would I have any impedance to getting data on the platform? You know, within reason, why wouldn't I just get as much data on the platform? And this particularly, time series was, and I say this humbly, we're the clear leader in that space. People have been asking them to run Influx even though they had their own product. And so it just fit, it was just good. There was a change in how they thought about it. And we happened to be there and we, you know, we had customers who wanted to be on AWS with them. It just worked.
Viktor 00:36:08.187 I was more hoping for some words of wisdom, kind of, ah, because I did this and the
Evan 00:36:12.679 Oh, I'm genius. No, no. It's 'cause of me, as you can see, because you can see video. I'm just a handsome debonair executive that, that they couldn't resist.
Darin 00:36:22.182 Let me flip it around to something we haven't talked about at all. Okay. We've been talking about time series. Somebody's listening now it's like, oh, I, I need to do some of that time series. who doesn't need time series?
Evan 00:36:32.496 That's a great question. You know, I think if you're, if you're, if you're doing some sort of instrumentation, it's light work, you're not super worried about the operational nature of it. You might be posting some dashboards or things like that. You could easily use one of the observability vendors. You could easily use a general purpose database. You can do that if you don't care about that. The people who do need to use it are people who have operational workloads. People not only who have dashboards, but have things dependent on it. Like they've got huge amounts of data. You know, we have a customer, NinjaOne, has 10 million devices out there collecting at regular sampling rates who are, you know, looking for security events. We have other customers in the security space. Like, you need an operational platform. You don't need an observability, you need operational. You don't need analytics as much as you need, wow, I've gotta be able to, you know, I've gotta have zero time for the data to be read. I have to be able to act on the data in less than 15 milliseconds. Like those are the people who need it. Other people, you know, they could spin up SQL Server and, and do fine. You know, we have a huge group of home users, believe it or not, because of the open source. So we have just people who instrument their thermostats, their pools, the different sensors around their house, and there's some incredibly cool projects. We don't monetize that at all. And occasionally we have a home user who brings us into their, into their corporate job, and that's great, but it's not what we count on. I think you guys should actually become home users, and then we should do this, this whole process again.
Viktor 00:38:04.972 I am already home user. I use my home very frequently.
Evan 00:38:09.467 You well, well said. Well said.
Darin 00:38:14.323 can you define huge amount of data?
Evan 00:38:16.604 Yeah,
Darin 00:38:17.129 throw around huge amount of data. What is huge amount of data?
Evan 00:38:19.814 It depends. Petabytes of data. It's more about how much you're writing simultaneously than what's, what's collected. 'Cause what's collected over time is huge. But you know, can you write a billion points simultaneously? How many devices at the end of it? I'm sure I can get you guys some of the bigger projects we have and share with you, but,
Darin 00:38:37.094 Okay, so petabytes,
Evan 00:38:38.579 yeah,
Darin 00:38:39.239 so no exabytes that you've seen at least, or at least none that you can speak of.
Evan 00:38:43.709 I don't, yes, I'm sure there are. I, we're not talking about it. I'm not, I'm not. I
Darin 00:38:49.694 It could,
Evan 00:38:50.279 to talk about that, but
Darin 00:38:51.569 it, it, it, it would be, you know, government, three letter companies that would have that kind of, but hey, we won't get into that either. anything else that we, I mean, you've been in this space for a long time. We've sort of jumped all around. Anything else you wanna talk about just
Evan 00:39:07.784 No, only that, that if I could get one message across to your audience, it's that there's times and places for developers to start with a time series database and not just use the database they've always had or something they've spun up fast, Postgres or something like that. And that's when they're reading, you know, often it's when they're reading either virtual or physical sensors and they know that over time, the amount of data they're gonna be working with is gonna be both expensive and needs to be performant and needs to come in. Our stuff is easy to start with. It's incredibly, it's schema on write. It was sort of modeled after Mongo, to be honest with you. Mongo, if you remember Mongo in the early days, was, their brand was easy to use, easy to get started versus, you know, versus your standard relational SQL database, and we very much have built on that brand. Time to Awesome is relatively quick and scale is pretty amazing. So just know when a developer should start with it. That would be the most important thing.
Darin 00:40:04.969 Sounds good. So everything InfluxDB can be found at influxdata.com. Uh. And again, that's what I was gonna ask is, do you know what the, the home project is for
Evan 00:40:17.564 It's InfluxDB. Yeah.
Darin 00:40:19.334 it's influxdata slash influxdb
Evan 00:40:21.434 Yeah, and give us a GitHub star if you like it.
Darin 00:40:24.734 There you go.
Evan 00:40:25.844 'cause we still like stars. Even though we're north of 30,000, we still like them.
Darin 00:40:29.714 Okay. So why do Stars matter to you?
Evan 00:40:33.344 I just, you know.
Darin 00:40:34.274 probably.
Evan 00:40:35.249 You know, it's, it's a little bit of pride. Like, like all these people, you know, these people after Ted, you know, projects, listen, open source projects, if they don't have velocity, they die. And so I'm always looking at the rate of change in stars. And so if stars continue to go up at a nice rate and like people can see that we're continuing to add, we're continuing to work. You know, I think if you were to ask Paul, my founder, he'd like to have a database like Postgres that's 40 years old or whatever, we have some pride in that. So independent of the commercial instincts.
Darin 00:41:09.010 So we can come back again in what, 25 years would that be about? Right? No,
Evan 00:41:16.240 From your mouth, as they say.
Darin 00:41:18.880 Yeah.
Evan 00:41:19.180 they say. where I'm from, from your mouth to God's ears.
Darin 00:41:21.790 Yeah. 30 years from now, that would be, uh, let's see, do the math. That would be 90. That would be about right. and you should have at least a hundred thousand stars by then. We need more than that.
Evan 00:41:29.820 know, I'll be almost 40 by then.
Darin 00:41:32.550 Wow. That time series database really does interesting things with math. That's all I can say.
Evan 00:41:40.395 It transformed. We have transformers in there
Darin 00:41:42.330 Transformers. Alright. Again, everything about Influx can be found at influxdata.com. Again, open source at GitHub. Evan, thanks for being on the show today.
Evan 00:41:51.345 Hey guys. Thanks for having me. It was a real pleasure. Enjoy, um, um, enjoy Barcelona
Viktor 00:41:57.820 I.
Evan 00:41:58.005 and keep using, keep using your house.