#104: When a business decides to release code to production that hasn’t been fully tested, or is releasing because a date has been promised, that’s a business decision and not a technical decision. However, at some point in the future, the debt will come due. Today, we speak with Dan Burns from Testifi about TDD, BDD, and why an organization must apply test automation the correct way or they’ll find themselves playing catch-up all the time.
Viktor Farcic is a member of the Google Developer Experts and Docker Captains groups, and published author.
His big passions are DevOps, Containers, Kubernetes, Microservices, Continuous Integration, Delivery and Deployment (CI/CD) and Test-Driven Development (TDD).
He often speaks at community gatherings and conferences.
His random thoughts and tutorials can be found in his blog TechnologyConversations.com.
This is DevOps Paradox episode number 104. Technical Debt Is a Business Decision.
Welcome to DevOps Paradox. This is a podcast about random stuff in which we, Darin and Viktor, pretend we know what we're talking about. Most of the time, we mask our ignorance by putting the word DevOps everywhere we can, and mix it with random buzzwords like Kubernetes, serverless, CI/CD, team productivity, islands of happiness, and other fancy expressions that make it sound like we know what we're doing. Occasionally, we invite guests who do know something, but we do not do that often, since they might make us look incompetent. The truth is out there, and there is no way we are going to find it. PS: it's Darin reading this text and feeling embarrassed that Viktor made me do it. Here are your hosts, Darin Pope and Viktor Farcic.
Way back in episode 43. Can you believe it, Viktor?
That's so long ago that I don't remember what we did in that episode. I don't remember what we did in episode 90.
And that was only a few, well, weeks ago, I guess. But 43, we titled that one "there is no such thing as continuous testing." Has our stance on that changed at all?
Mine didn't. There is such a thing as testing continuously, but not as a separate practice. It's all part of CI/CD. So we do test continuously, just to clarify, but there is no separate continuous testing process. No.
Or we should say you should be testing continuously.
Yes, as part of your CI/CD.
Today we've got somebody on with us. Dan Burns from Testifi. Dan, how are you doing today?
Very well, thank you gentlemen. Nice to be on board.
Dan is the co-founder and CEO of Testifi. Give us the ten second overview of what Testifi is, and then let's get into why testing matters, not only from a shift left, but also still shift right.
Testifi is basically all about enabling test automation within an organization. It's not just automating and speeding up testing, but it's for the entire software delivery process. And I think Darin, you hit the nail on the head when you said, actually it's not about continuous testing. It's about facilitating continuous integration, continuous delivery, this constant process of delivering software for which therefore you need to be testing continuously. What does it mean from a shift left, shift right perspective? Shift left means testing, quick feedback, pushing the tests instead of testing the code at the end. You're testing your requirements. So the BDD, the specification by example stuff and then the shift right basically means having these progressive, staged quality gateways where every time one layer of tests passes, it pushes it to the next layer where you continually increase the scope of your tests and move from functional testing to non-functional testing to make sure you've got this constant process of evaluation going on. The ideal case is that every commit that you push into your CICD pipelines comes out the other side as a fully qualified piece of code.
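The staged quality gateways Dan describes can be sketched roughly as follows. This is a minimal illustration only, not Testifi's implementation: the gate names, the `run_gates` helper, and the lambda checks are all hypothetical stand-ins for real test layers.

```python
from typing import Callable, List, Tuple

# Each gate is a (name, check) pair; the check returns True when that layer passes.
Gate = Tuple[str, Callable[[str], bool]]


def run_gates(commit: str, gates: List[Gate]) -> List[str]:
    """Run gates in order and stop at the first failure.

    Returns the names of the gates the commit passed.
    """
    passed: List[str] = []
    for name, check in gates:
        if not check(commit):
            break  # a failing layer blocks promotion to the next one
        passed.append(name)
    return passed


# Hypothetical layers, ordered from fast and narrow to slow and broad.
gates: List[Gate] = [
    ("unit", lambda commit: True),
    ("functional", lambda commit: True),
    ("non-functional", lambda commit: "slow" not in commit),
]

# A commit is "fully qualified" only if it cleared every layer.
fully_qualified = run_gates("abc123", gates) == [name for name, _ in gates]
```

The ordering is the design choice that matters here: cheap checks run first so that most bad commits are rejected before the expensive layers ever start.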
That's one of those things that I believe if three of us would start a new company today, without having any legacy code, just starting from scratch, literally everything and we have certain experience, I would agree completely. We just do CICD from the start. We start testing from the start. BDD, TDD, whatever. But the real challenges are, how do you get to that point, knowing that you have 20 years of code that is not tested or partially tested or not well tested. The real challenge I believe is for those who are not starting today, right?
Sure, and this of course highlights many different topics. This topic of technical debt is a very real threat to everybody's project. What you see is the classic pattern: even if you're running an agile project or using agile or CICD tools or processes or whatever, you still have this classic design, develop, test, and there may be test automation as a process at the end. Of course, this leads to a buildup of technical debt. The whole shift left to move towards this BDD and best practices and everything is hard and difficult, but you can of course start at any point in time by starting to get control of and get a grip on the artifacts that you have. For me, this is not just testing. This is also understanding what your application does, understanding what are the core requirements, what's the core functionality of this, and then focusing your attention and your energies on proving or validating this core functionality is there. Testing doesn't really happen in isolation, obviously. It happens in the context of the broader project and depending on how big the pile of debt is, it's a question of how you scope but in the end, actually you start right from the start. You start with the requirements, the focus of the application, where should you be putting your energy and then you start to validate this piece by piece. The approach is relatively straightforward in the end. If we take a very simple example, let's imagine you have a website, it's a fancy website and on this website, before you can do anything, you have to login. That's the gatekeeper for you to be able to do anything on this website. So the first thing is to get this login under control. Once you have your login under control, then you can start to execute the next step, which might be to create an account or create a customer or do something fancy.
But every single time you encounter these new pieces of code or pieces of functionality, you have to battle to get that under control somehow, to create a building block or a reusable piece of code that you can use to make your life easier. But the point is the first time you do this, everything's new. The second time you do this, you've already got the login and now you've already got the create account or whatever other transaction it is that you're doing, and then by the time you've gone through 10 or 20 or 50 or a hundred different use cases, you've actually got a fairly complete library of actions, which allows you to cover the core functionality of your system. Then you have the nice effect that once you have this library of elements available, creating new test cases or use cases starts to become easier because you already have 80% of the work done. By building up this library of coverage and these elements that you can use, you start to enable better practices, maybe not best practices from day one, but you can slowly start to move left. You can have a stronger relationship between the requirements and the tests. You can start to prioritize the energies and the efforts on the relevant features, rather than just wading through the technical debt to get there. All of this then has the effect of enabling the rest of your CICD processes. It becomes a complement to and an accelerator for the rest of the process.
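The login and create-account example above can be sketched as a small library of reusable actions. This is a minimal sketch under stated assumptions: `FakeSite` is a hypothetical in-memory stand-in for the website, and the function names are illustrative, not taken from any real framework.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeSite:
    """Hypothetical in-memory stand-in for the system under test."""
    users: Dict[str, str] = field(default_factory=lambda: {"dan": "secret"})
    logged_in: str = ""
    accounts: List[str] = field(default_factory=list)


# --- reusable building blocks ---

def login(site: FakeSite, user: str, password: str) -> None:
    """First block: the gatekeeper every other action depends on."""
    if site.users.get(user) != password:
        raise AssertionError(f"login failed for {user}")
    site.logged_in = user


def create_account(site: FakeSite, name: str) -> None:
    """Second block: only works once login is under control."""
    if not site.logged_in:
        raise AssertionError("must be logged in to create an account")
    site.accounts.append(name)


# --- test cases composed from the blocks ---

def test_login() -> None:
    site = FakeSite()
    login(site, "dan", "secret")
    assert site.logged_in == "dan"


def test_create_account() -> None:
    site = FakeSite()
    login(site, "dan", "secret")  # reused, not rewritten
    create_account(site, "acme")
    assert site.accounts == ["acme"]
```

The second test needed only one new block; the login step came for free. That is the effect Dan describes, where each new use case gets cheaper as the library grows.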
I have a question and it's never crossed my mind until today. Is agile one of the reasons why we're collecting so much technical debt?
The reality is that it's not really anything to do with agility. It's all to do with these short timelines. If we have a two-week sprint, let's say, and we get to the end of the sprint and we're 80% done on some of those topics, there's always pressure to get that stuff out the door, if it's Scrum based, let's say. There's always pressure to finish, to be done. What typically happens is the undone work, the leftover stuff, gets pushed into a pile of "we'll do it later," and then we have the next iteration and now there's the next pressure to get the next lot of features and functionality out the door, and the pile of "we'll do it later" gets bigger and bigger as we move forward. A very typical anti-pattern, and I'm not saying this is everywhere, but I've seen it in far too many places not to be wary of it, is the silo-based mentality where you have people in the team who are primarily business analysts or POs and they're pushing the requirements into the sprint. Then the developers pick up the requirements and start working on them, and then the work gets pushed over to the testers on the team. If we don't have a mature team, the testers are typically not as technically capable as the developers. So the testers pick up the work towards the end of the process, and then when things finally stabilize, the test automation comes in. This is the typical anti-pattern that you see almost everywhere. If you're doing this, then you're always playing catch-up, and the only way to address this is to think about shifting left: instead of testing the code, you focus on testing the requirements, and that means test-driven or behavior-driven development or specification by example, et cetera.
Yeah, because if testers are testing somewhere close to the end of the sprint then they better not find any problems, because the sprint is about to finish.
They're disincentivized to find problems. The reality is at this point in time their tests are the most expensive. They're the most fragile and they offer the least value back into the team. You want those tests to be running right from the start as the development is working. Those tests should run red until they run green. Once they run green, they should always run green and there should be this constant feedback. If you've got that basic check in place, then you can always think about how you would exploratively or qualitatively work around those test cases and build them up and so on.
You're calling out TDD and BDD separately. Why are you doing that?
In my opinion, they're two different disciplines. Basically, when it comes to software development, or at least assessing what you've got, you've got a top-down approach, which means from requirements and specifications, and you have a bottom-up approach, which means code and so forth. So for me TDD is the bottom-up approach. That's focusing at a unit test level. This is a design practice which is focused around making your code focused and efficient. But the problem is that you can have a hundred percent unit test coverage using TDD, and TDD is the only way that you can really get a hundred or close to a hundred percent unit test coverage, and still the application doesn't do what you want it to do. So TDD tests whether things are working correctly, but it doesn't test whether you're doing the correct things, which is what you require from a top-down approach. So a top-down approach implies BDD or specification by example, which means now you want to test or prove that the behavior of your system is as expected. Now we're talking about a black box. We put input into that black box and we expect the system to behave in a certain way. This is a much more business-driven perspective. This is typically what a business wants. They want to validate. They don't really care how a system works per se. What they want to validate is that the system works as they expect it to work, or as requested, or as paid for. For me they're two very different disciplines. They're obviously consistent with each other; they're two sides of the same coin. The problem with the TDD approach is that, because it's at a unit test level, it's sometimes very difficult to assign a business value to one of those test cases. Because you are testing at a very low level, at a function or a method level, it's sometimes almost impossible to assign a business value.
Whereas when you're talking about BDD or acceptance test-driven development, or whichever way you want to think about it, you're basically saying I want my system to behave in a certain way. I want my system to behave in a certain way because of a functionality or a feature that I place business value on. I've requested this feature, so it's much easier to have that kind of relationship back to the value of the system.
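The gap between the two disciplines can be illustrated with a small sketch. Everything here is hypothetical: the `apply_discount` unit, the `checkout` feature, and the promised 10% member discount are invented to show a unit that is correct in isolation inside a feature that still does not behave as requested.

```python
from typing import List


def apply_discount(price: float, percent: float) -> float:
    """Unit under test: correct in isolation."""
    return round(price * (1 - percent / 100), 2)


def checkout(cart: List[float], is_member: bool) -> float:
    """Hypothetical feature: members were promised 10% off the total.

    The discount is deliberately never applied, to show a defect that
    unit tests of apply_discount cannot see.
    """
    return round(sum(cart), 2)


def unit_test_passes() -> bool:
    # TDD-style check: is the thing built correctly?
    return apply_discount(100.0, 10) == 90.0


def behavior_test_passes() -> bool:
    # BDD-style check, black box: given a member with a 100.00 cart,
    # when they check out, then they pay 90.00.
    return checkout([60.0, 40.0], is_member=True) == 90.0
```

In this sketch the unit check is green while the behavior check is red, which is the situation Dan describes: full unit coverage, and the application still does not do what the business asked for.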
How often do you think that, at least the companies you work with or the teams you work with, how often do they trust their tests? How often do they ask to rerun the pipelines that run tests?
This is perhaps one of the reasons why this topic of continuous testing was introduced right at the start. From an automation perspective, the cost is in creating the test cases rather than executing them. That means once you've gone to the trouble of setting up those test cases, you should be running them on a constant basis: on every commit, on a daily basis, on demand. The problem comes when you've gone to the trouble of setting up your test cases and sometimes they run red and sometimes they run green. You have a lot of different reasons why a test might fail. The reality is test automation is complicated. We're not just testing the code. We're also testing the data, the environments, the configuration. There are a lot of different reasons why it might be. Now the problem is if you don't have a well-architected testing strategy or testing concept, then you very often run into these problems with data being burnt or data not being in the right state or something interfering with the test execution, which basically means that it's not so straightforward. When there's an overhead to understanding the output of your test execution, to analyzing the results, then it's a disincentive for you to execute the test cases, because now instead of the test execution being zero cost, it starts to cost something. It costs effort. It costs time and so forth. If you have a hundred test cases and you have five tests that don't work so well, you can easily find out what the story is with each of those five test cases. But when you have a thousand or 5,000 test cases and you've still got 5% not working, it's now too much effort to analyze each of those test cases in isolation. What typically happens is the release manager runs around the team at the end of the sprint asking everyone: critical or non-critical? What's your gut feeling? Do you think we can go live? This of course is the end of the reliability of your test automation.
Sadly, what happens very often is that a lot of people take a lot of time and effort to set up a test automation suite, but they're not thinking about what happens when this test automation suite or this framework scales. How do we handle this? How do we maintain this? How do we make sure that we're still getting the value that we need out of it?
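The scaling problem in the 5% example is simple arithmetic, sketched below. The suite sizes and the 5% rate come from the conversation; the 15 minutes of analysis per failed test is purely an assumption for illustration.

```python
def failures_to_triage(total_tests: int, failure_rate: float) -> int:
    """Number of red tests someone has to look at after one run."""
    return round(total_tests * failure_rate)


def triage_hours(total_tests: int, failure_rate: float,
                 minutes_per_failure: float = 15.0) -> float:
    """Analysis effort per run; minutes_per_failure is an assumed figure."""
    return failures_to_triage(total_tests, failure_rate) * minutes_per_failure / 60


for total in (100, 1000, 5000):
    print(total, failures_to_triage(total, 0.05), triage_hours(total, 0.05))
```

The failure rate stays the same, but the absolute number of failures to analyze grows from 5 to 250, and the effort grows with it on every single run. That is the point at which teams fall back on gut feeling.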
Yeah because at the end of the day that can be a tremendous cost. You need to have a person or even a whole team going after failed tests and deciding whether they failed for real or they should be just re-run, right?
Sure, and then it's not just a test question. It's also a question of where the problem is. Is it something to do with the code? Is it the data? Then you need to have the best people that you can get to really look into what's wrong and so forth. Again, this is the reason why you should be shifting left, because these kinds of problems occur when you make changes to the system without anticipating the effect on your system. In my ideal case, in my ideal world, we're not talking about 95% pass rates. We're talking about a hundred percent pass rate. The tests run red until they run green, and then they always run green. If a test doesn't run green anymore, then somebody has work to do. But the difference is the feedback loop. First of all, you minimize the time from when the test is executed to when the feedback is back in some decision maker's hand. Secondly, instead of doing a two-month block of code, you're doing a small increment, a 30-minute or a one-hour block of programming time, so it becomes much easier to understand why that test is no longer working. If you do it properly, if we're shifting left, if we're planning the changes to our system properly, we should be able to predict with a high level of certainty which tests will be impacted by which change to the system. Then we can say, okay, is this an expected change or is this an unexpected change? If we do this, then instead of being an overhead, this becomes an incredibly powerful feedback tool, which means that we can move forward with much more confidence.
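One way to picture "predicting which tests will be impacted by which change" is a mapping from components to the tests that exercise them. The sketch below is an assumption-laden illustration: the `COVERAGE` map, component names, and test names are all hypothetical, and a real project would derive such a map from requirements or coverage data.

```python
from typing import Dict, Set

# Hypothetical map from system component to the tests that exercise it.
COVERAGE: Dict[str, Set[str]] = {
    "login": {"test_login", "test_create_account"},
    "accounts": {"test_create_account"},
    "billing": {"test_invoice"},
}


def expected_impact(changed: Set[str]) -> Set[str]:
    """Tests we predict a change could affect."""
    impacted: Set[str] = set()
    for component in changed:
        impacted |= COVERAGE.get(component, set())
    return impacted


def classify(failed: Set[str], changed: Set[str]) -> Dict[str, Set[str]]:
    """Split red tests into those the change predicted and those it did not."""
    predicted = expected_impact(changed)
    return {"expected": failed & predicted, "unexpected": failed - predicted}
```

For a change to `accounts`, a red `test_create_account` lands in the expected bucket, while a red `test_invoice` lands in the unexpected one and deserves immediate attention.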
What happens when the business decides: we don't care if the tests are failing, we have to ship now?
The answer is it depends, Darin. It depends on what the cost of failure of your system is. If you have to ship, and obviously this is the classic problem, normally release dates, especially for high-profile projects, are broadcast and advertised and there's a hard date, regardless of the status of your application. Now if you push code out the door which has known problems, first of all you've got to expect those problems to come home to roost. Then the question becomes what's the consequence of failure? How much is it going to cost me? What's the cost of poor reputation? What's the cost of exposed security? Does it have a direct financial cost? Does it have a transactional cost? Does it have a reputational cost? If those costs are acceptable to you, then okay. You can push that code out the door and you can fix it up with hotfixes or you can push new releases out the door. You're playing catch-up, but you should do it in a controlled way so you know exactly where to focus your energies, the attention, the time and the effort. Instead of playing with a shotgun, you should be playing with a sniper rifle.
But isn't that just another form of technical debt? You've decided to ship quote unquote too early.
My point I guess is around the feedback. Let's say you shipped too early. Fine. There's a lot of reasons for why you have to have a fixed release date. In an ideal world, of course, you wouldn't ship early. If you do ship early, you want to know which parts of the system are fragile at the very least. The best case you've got everything properly tested and validated and you're fully under control but in the worst case you have at least an idea where you can expect problems.
I don't even think that technical debt is necessarily always a bad thing. It's similar to financial debt. You can take on debt because it will speed up your business and you will pay it off, or you can be silly and just take on debt that you cannot pay off and then you need to sell your house. With debt, it really depends whether it's conscious or unconscious, whether it's planned or unplanned, whether you have an idea how you're going to pay it back and that you are going to pay it back. The problem is when you go to a shop and you start maxing out your credit cards like crazy. That's a bad form of debt.
I agree entirely. The point is technical debt is a business decision, so you have to understand what the cost or the consequence of going out with this technical debt is, and the problem is, as Viktor points out, if you don't pay off the debt, sooner or later it's going to be a problem for you. An anti-pattern that I've seen quite a lot is this: when you're working in sprints, there's only a certain amount of work that's possible, whether you're working on new features or on covering technical debt. As the technical debt builds up, more and more of the attention and the energy goes on dealing with that technical debt, or the team moves slower because they're slowed down by this additional technical debt, and then sooner or later you just have to stop the development of new features and focus on a consolidation round, on cleaning up those problems. This is the consequence of not paying back the debt, and again it's just a business decision that you have to make, but you have to make it with the best possible information available.
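The squeeze on sprint capacity can be made concrete with a toy model. None of the numbers below come from the conversation: the capacity of 10 points, the 2 points deferred per sprint, and the 0.25 points of drag per point of outstanding debt are all assumptions chosen only to show the shape of the curve.

```python
from typing import List


def feature_capacity(sprints: int, capacity: float = 10.0,
                     deferred_per_sprint: float = 2.0,
                     drag_per_debt_point: float = 0.25) -> List[float]:
    """Points left for new features in each sprint under a toy debt model.

    All default values are illustrative assumptions, not measurements.
    """
    debt = 0.0
    remaining: List[float] = []
    for _ in range(sprints):
        drag = debt * drag_per_debt_point  # effort lost to existing debt
        remaining.append(max(0.0, capacity - drag))
        debt += deferred_per_sprint        # unfinished work joins the pile
    return remaining


history = feature_capacity(21)
```

Under these assumed values, feature capacity falls by half a point every sprint and reaches zero in the 21st sprint, which corresponds to the forced consolidation round Dan mentions. Different assumptions move the date, not the direction.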
That becomes extremely complicated when you have multiple businesses on the same product or project. A long time ago, at one of the companies I worked in, we had three or four PMs and they would work one after another or in parallel. Then you have your project, my project, and your project, and we're all on the same code base. In such a situation, none of them had any incentive to ever fix the technical debt that was generated by the previous PM in the previous quarter, iteration, whatever it is. So it is a business decision, but only if it's made by somebody who has complete control of, and interest in, the project as a whole. If you split it among PMs, it's never going to be a real decision.
It's an interesting point, and I guess this is a reflection of the fact that as projects get bigger, they become more complex. There are more people. There are more parties involved. That makes it even more important to make sure that you've got the right systems and foundations in place. If you have a small team made up of capable, experienced developers, you don't need to be as strict and as consistent with the systems that you have in place because everything's under control. You have so-called trust-based quality, but as the team starts to get bigger, and indeed as it starts to split out and so on and so forth, the complexity increases and the importance of having these systems in place becomes higher and higher. The bigger and the more complicated it is, the more important it is to have these disciplined approaches. My argument, and one of the things that I say to our team, is that it's freedom through discipline. We can have the discipline to be able to be creative and to work in a way that's not as constrained, as long as we make sure we're addressing all the fundamental elements. I believe in one of your previous podcasts you said that by tightening the screws, by making sure everything's tightened down, you actually give yourself more freedom. This is exactly the same point. By making sure you've got these fundamentals in place, that you're disciplined, that you're observing these rules and systems that you set up, you give yourself the freedom and the ability to work more easily.
And by not increasing the size of the team. That's why we like microservices. Once you reach a number of, let's say, above 10 people, it's impossible for that team to work as a team. It never happens.
It certainly makes things complicated and this has always been the criticism of scrum or agile based approaches that they don't scale. In this case of course you have to think about ways to split off or to delegate out responsibilities or find ways to make the team a little bit more manageable.
But somebody sold me a SAFe certification, so it does scale, right?
Exactly, and you keep on believing that.
Hey, I was participating in a company practicing SAFe with 10 people. Now imagine that suffering.
So I'm not saying that it doesn't scale. It just becomes more complicated. As you introduce this complexity, all jokes aside about SAFe or LeSS or any of the other different disciplines that you've got with regards to scalable software development practices, the reality is as soon as you've got a bigger team, it becomes more complicated. There's less trust. You have to have a much better thought-out system or systematic approach to dealing with the way that you move forward, because you can't compensate as easily with the general brilliance of one member of the team. The impact of a high performer or a high-quality developer becomes minimized as the team gets bigger and bigger. You can't have somebody riding in and saving the day all the time. What this reflects to me is that it's even more important to be disciplined and consistent about laying down the foundations, especially around testing.
You brought up trust a couple of times. You've used it in a couple of different contexts. Instead of trust, should we be following the security side of the house and go more towards zero trust?
So what does trust mean in this context? We deal with enterprise applications. We also deal with small startups, and the typical scenario that you get in a startup environment is that you have one or two, generally speaking, pretty brilliant people who've got an idea and they start banging out code. Now typically there's one core developer and there's one or two other people supporting that person. In this context you've got one person who has more or less a good overview of the entire code base. You can say that they basically understand everything that's going on. If someone makes a mistake, they can close their eyes and they can see which part of the code we're talking about. It's all more or less under control. As the team starts to get bigger, by the time you've got four or five people on the team, it's already too much for one person to really keep that entire code base in their head. Already we're reaching a situation where we've got these limits of trust. So arguably we're moving towards this no-trust scenario, and what does this no-trust scenario look like? It means we want to validate that the functionality is working. We want to validate that the new functionality works when it's been implemented, but we don't want to do that by hand. We want to automate that process. We want to press go and we want to make sure that all of our checks and our validations are as automated as possible. If we're doing this, then we've introduced a more or less systematic way of validating our code base and of creating test cases to make sure that we're covering the functionality, so arguably we're already moving into this no-trust zone that you're talking about. But the point is it's not about not trusting. It's about trying to make everybody's life easier, because now we don't have to trust that we can keep everything in our head and so forth. We don't have to trust that somebody is going to follow up and catch our mistakes.
We're now trying to automate that and, let's say, take that pressure away.
I would just make a small correction. That first person that you mentioned that is exceptionally good. He tricked you. He wasn't. If he was, he would be automating himself from day one.
You've got it right, Viktor. Sure. Sure. It's the dream, right? It's not just the 10x-er, but the 100x developer. Right? They don't exist. The reality is everyone's just a human. We have degrees of capabilities and everything, but in the end, everybody's human and the idea is to minimize the downsides: to maximize the upsides of people's ability and creativity, but take away the drudgery and the mundane and anything that they have to do more than once. Well, we should be thinking about automating it, for sure.
So, that's a good transition point to how can Testifi help eliminate the humans?
We're doing our best to build a framework which allows this kind of systematic test automation at scale. We've built our framework on a number of principles. The first principle is democratization of testing. It's not enough that the technical people, the people who feel comfortable in an IDE, are the only ones creating test cases. So we want to spread the load to the non-technical people, the people who have the business perspective, the understanding of what acceptance tests are and so forth. Typically these people don't feel comfortable in a code base. So we want to enable these people, which means we have to use building-block kinds of concepts. We need to be able to drag and drop these building blocks and intuitively create test cases that make things easy. That's the first thing. The second thing is we have to enable the technical people too. We have to make sure that when we're doing tests, it's not just at a superficial level, it's at a full stack. If we've got full stack development, we need to have full stack testing. That means not just the request and the response, but the back ends, the database, the log files, any kind of API call. We need to make sure we're covering that. So it has to be both easy to use and technically complete. The third thing is seamless integration with the rest of the CI tools, which means task management, test management, your CI orchestration tool, whether that's Jenkins or AWS pipelines or whatever it might be. The whole tool needs to just fit seamlessly behind so that all of that overhead of maintaining your test framework, the test infrastructure, is just taken away. That's what we're trying to do with the framework that we offer our customers. We also have an AI tool which we're working on. This is called Forecast, and it has the purpose of automatically generating these building blocks.
This of course then speeds up the process of overcoming that technical debt. Between these two tools, what we're trying to offer is a means and a capability to rapidly boot up a framework which offers this kind of feedback mechanism.
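The idea of full stack testing, checking more than the request and the response, can be sketched like this. It is a minimal sketch and not how Testifi's framework works: the `create_customer` endpoint, the table layout, and the log format are hypothetical, with an in-memory SQLite database and a plain list standing in for the real back end and log files.

```python
import sqlite3
from typing import List, Tuple


def create_customer(db: sqlite3.Connection, log: List[str],
                    name: str) -> Tuple[int, dict]:
    """Hypothetical endpoint: returns (status, body), writes a row and a log line."""
    cur = db.execute("INSERT INTO customers (name) VALUES (?)", (name,))
    db.commit()
    log.append(f"customer created id={cur.lastrowid}")
    return 201, {"id": cur.lastrowid, "name": name}


def full_stack_check() -> bool:
    """Validate the response, the database row, and the log entry together."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    log: List[str] = []

    status, body = create_customer(db, log, "acme")

    # Layer 1: the request and the response.
    response_ok = status == 201 and body["name"] == "acme"
    # Layer 2: the database actually holds what the response claims.
    row = db.execute(
        "SELECT name FROM customers WHERE id = ?", (body["id"],)
    ).fetchone()
    database_ok = row == ("acme",)
    # Layer 3: the log recorded the same event.
    log_ok = any(f"id={body['id']}" in line for line in log)

    db.close()
    return response_ok and database_ok and log_ok
```

A test that stopped at the first layer would still pass if the insert were silently dropped; checking the database and the log as well is what makes the test full stack in the sense described above.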
If you want to get in touch with Dan, all of Dan's contact information will be down in the show notes and also a link off to Testifi. Dan, thanks for being with us today.
It's a pleasure, gentlemen. Thank you for the invitation.
We hope this episode was helpful to you. If you want to discuss it or ask a question, please reach out to us. Our contact information and the link to the Slack workspace are at https://www.devopsparadox.com/contact. If you subscribe through Apple Podcasts, be sure to leave us a review there. That helps other people discover this podcast. Go sign up right now at https://www.devopsparadox.com/ to receive an email whenever we drop the latest episode. Thank you for listening to DevOps Paradox.