Intentionally “Failing” the DORA Metrics

Many of us, when evaluating our DevOpsishness, have at one time or another answered how well we were doing based on the four metrics defined by the DORA research.

Deployment Frequency — How often an organization successfully releases to production

Lead Time for Changes — The amount of time it takes a commit to get into production

Change Failure Rate — The percentage of deployments causing a failure in production

Time to Restore Service — How long it takes an organization to recover from a failure in production

Source: Google's Four Keys project
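As a rough illustration of how these four numbers can be derived, here is a minimal sketch that computes them from hypothetical deployment and incident records. The data and field names are my own assumptions for the example, not the Four Keys implementation:

```python
from datetime import datetime
from statistics import median

# Hypothetical event records; in practice these would come from your
# CI/CD system and incident tracker, not hard-coded values.
deployments = [
    {"deployed_at": datetime(2023, 5, 1), "commit_at": datetime(2023, 4, 28), "caused_failure": False},
    {"deployed_at": datetime(2023, 5, 8), "commit_at": datetime(2023, 5, 3), "caused_failure": True},
    {"deployed_at": datetime(2023, 5, 15), "commit_at": datetime(2023, 5, 12), "caused_failure": False},
]
incidents = [
    {"started_at": datetime(2023, 5, 8, 10, 0), "resolved_at": datetime(2023, 5, 8, 11, 30)},
]

period_days = 30

# Deployment Frequency: deployments per week over the observed period.
deployment_frequency = len(deployments) / (period_days / 7)

# Lead Time for Changes: median time from commit to production.
lead_time = median(d["deployed_at"] - d["commit_at"] for d in deployments)

# Change Failure Rate: share of deployments that caused a production failure.
change_failure_rate = sum(d["caused_failure"] for d in deployments) / len(deployments)

# Time to Restore Service: median time from incident start to resolution.
time_to_restore = median(i["resolved_at"] - i["started_at"] for i in incidents)

print(f"Deployment frequency: {deployment_frequency:.1f} per week")
print(f"Lead time for changes: {lead_time}")
print(f"Change failure rate: {change_failure_rate:.0%}")
print(f"Time to restore service: {time_to_restore}")
```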

I always felt like I was working in high-performing teams, but when I answered the questionnaire honestly, the results came back that we could do better. For example, we currently choose to deploy to production once a week even though we could deploy daily. So, why don't we deploy more often?

Here are some of the reasons:

Small, full-stack team

There are only a handful of people contributing to our code base. Even though engineers have their preferences, there is enough pairing happening for the team to have collective ownership of the code and the features. Changes to trunk are frequent and small in size, usually on a single feature that more than one person is working on. Deploying once a week is enough to get the changes that matter, which are few in number, to production.

Few or no dependencies

Just as with continuous integration and automated testing for your team's code, it makes sense to check often how your changes integrate with the overall system you belong to. In our case, there are very few dependencies and nothing that some basic integration testing or a heads-up to the other teams won't fix. That is why we can get by without deploying so often, as our changes have a very low risk of affecting the overall system.

Freedom to choose when to deploy

Even though we decided to deploy once per week, there is no one stopping us from deploying more often if we see the need to. We have the authority and the autonomy to decide the frequency of our deployments based on the accountability we have for our service. That removes any unnecessary stress of having a fixed deployment window. If we find it necessary, we just go ahead and deploy.

Repeatable deployments

We are also in a position to technically deploy to production on demand. In case there is an issue in production, we know that whenever the fix is available and tested, it will take less than a minute to deploy it. The knowledge and access needed to deploy are distributed and documented, so no one person is the bottleneck for bringing changes to production whenever it is necessary.

Automation vs. ROI

If we decided to increase our deployment rate, then we would also need to increase our test automation coverage as well as our monitoring and observability capabilities. In our case, the return on investment (ROI) would be small. We already provide a service of good quality at a pace that our users find useful. So, we prefer adding automation based on quality demands and not only to increase our deployment frequency.
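To give an idea of the kind of automation I mean, here is a minimal sketch of a post-deployment smoke test that a pipeline could run after each release. The health endpoint, timeout, and rollback/alerting step are hypothetical assumptions for illustration, not a description of our actual setup:

```python
import sys
import urllib.request

# Hypothetical service URL; replace with your own health endpoint.
HEALTH_URL = "https://example.com/health"
TIMEOUT_SECONDS = 5


def smoke_test(url: str = HEALTH_URL) -> bool:
    """Return True if the service responds with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=TIMEOUT_SECONDS) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors and timeouts.
        return False


if __name__ == "__main__":
    # A CI/CD pipeline could run this right after a deployment and
    # roll back or alert when the check fails.
    if not smoke_test():
        print("Smoke test failed: health check did not return 200")
        sys.exit(1)
    print("Smoke test passed")
```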


The underlying reason we feel confident and like high performers, even though we only deploy once per week, is that we are in a very good place regarding our DevOps capabilities, such as Deployment Automation or Working in Small Batches. We didn't adopt or improve these capabilities consciously, but I guess we were drawn to them and implemented them because of the problems we faced.

So, next time you go through the four metrics questionnaire and you find that you are not doing well, maybe first question whether it makes sense to invest in doing better. If yes, identify why you need to do better and then have a look at whether one of the capabilities would improve the situation.

“Failing” the DORA metrics is not that bad, as long as we consciously know why.

One thought on “Intentionally “Failing” the DORA Metrics”

  1. Hi Areti, thanks for sharing your thoughts, it’s good to see and learn how others are using (or partially using) DORA in practice. I recognize some contexts may not require optimization on each of the metrics. I consider the metrics input to determine which bottleneck to tackle next, but they always have to be interpreted in the context of the team. I.e. delivery pressure, members leaving/joining, and other environmental constraints all influence the numbers.

    There are a few points in your post that I’d question in my context.

    “Deploying once a week is enough to get the changes that matter, which are few in number, to production.” => What value do your customers get from undeployed code? What risk lies in batching changes and deploying several in one go?

    “That is why we can get by without deploying so often, as our changes have a very low risk of affecting the overall system.” => If we consider production as the final proof that our changes are working correctly, what does it mean for the confidence we can have in undeployed changes?

    “That removes any unnecessary stress of having a fixed deployment window. ” => Could this be related to https://martinfowler.com/bliki/FrequencyReducesDifficulty.html?

    “In case there is an issue in production, we know that whenever the fix is available and tested, it will take less than a minute to deploy it” => Assuming you are fixing forward and not hot fixing a release branch, what risk lies in doing an emergency release that contains undeployed code + the fix?

    “If we decided to increase our deployment rate, then we would also need to increase our test automation coverage as well as our monitoring and observability capabilities” => Wouldn’t you need monitoring and high observability in any case? Failures may be unrelated to a deployment, but caused by external factors, and in those moments too you want to be able to observe your application health. Making observability instrumentation part of the DoR/DoD helps in deploying features that are observable from their very inception.

    I realize I’m assuming a lot about the context and your reasoning, so I might be completely off the mark here. Reaching a plateau in a comfortable way of working with a team is very natural, but I’d argue there’s always room to improve further and deliver even more value to customers, company and development team as test engineers.

