Metrics for Measuring Software Delivery Performance


For years, people have tried to measure software development performance, from counting lines of code to tracking hours worked and raw productivity. In the age of modern software, if we count lines of code, we are actively encouraging teams to create complex systems that become harder to maintain over time. Why do we need more code if we can do a better job with less? Similarly, productivity means nothing on its own: if we produce loads of things that are useless, we aren't really being productive.

I think we are looking at the problem in the wrong way, and there is perhaps a need to look at more meaningful metrics.

Let's look at how Google's DORA (DevOps Research and Assessment) team can help organizations deliver and operate software quickly and reliably, to meet the demands of an ever-changing business landscape.

What is DORA?

DORA provides assessments and surveys of organizations to determine whether they are high-performing IT organizations, and where they may need attention in order to improve. The DORA team conducted a seven-year research program that validated a number of technical, process, measurement, and cultural capabilities that drive higher software delivery and organizational performance.

What are DORA Metrics?

Think of DORA metrics as the instrument gauges in your car. They make no sense without context and correlation with the other instruments. For instance, is it good or bad if your speedometer shows you are travelling at 100 km/h? The answer is: it depends on the context. It could be good if you are within the speed limit on a highway; it could be bad if you are inside city limits or near a school zone.

Similarly, DORA metrics can be seen as instrument gauges that give us information about our progress.

The four DORA metrics are divided into two groups: Stability and Throughput. DORA research categorizes teams based on their scores on these metrics, ranging from 'Low' performers to 'Elite'.

Stability is a measure of the quality of the software we produce. It is measured by two metrics:

Metric 1. Change Failure Rate (the percentage of deployments that cause a failure in production)

Metric 2. Mean Time to Recovery (the time it takes to restore service in production, measured from the moment the defect is found in production)
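As a sketch of how these two Stability metrics could be computed, here is a minimal example assuming hypothetical deployment and incident records (the data structures and values are illustrative, not from any real tool):

```python
from datetime import datetime, timedelta

# Hypothetical deployment log: (deployed_at, caused_failure_in_production)
deployments = [
    (datetime(2024, 1, 1), False),
    (datetime(2024, 1, 3), True),
    (datetime(2024, 1, 5), False),
    (datetime(2024, 1, 8), False),
]

# Hypothetical incident log: (defect_found_at, service_restored_at)
incidents = [
    (datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 14, 30)),
]

# Change Failure Rate: failed deployments / total deployments
failures = sum(1 for _, failed in deployments if failed)
change_failure_rate = failures / len(deployments)

# Mean Time to Recovery: average of (restored - found) across incidents
recovery_times = [restored - found for found, restored in incidents]
mttr = sum(recovery_times, timedelta()) / len(recovery_times)

print(f"Change failure rate: {change_failure_rate:.0%}")  # 25%
print(f"Mean time to recovery: {mttr}")                   # 4:30:00
```

With one failure out of four deployments, the sketch reports a 25% change failure rate and a 4.5-hour recovery time.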

Throughput is a measure of the efficiency of our approach. It, too, is measured using two metrics:

Metric 3. Lead Time for Changes (how long it takes from 'commit' to running the code in production)

Metric 4. Deployment Frequency (how often we release code to end users)
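Under the same kind of hypothetical record-keeping, the two Throughput metrics could be sketched like this (the timestamps are made up for illustration; using the median for lead time is one common choice, not a DORA mandate):

```python
from datetime import datetime, timedelta

# Hypothetical changes: (commit_time, time_running_in_production)
changes = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 17, 0)),
    (datetime(2024, 1, 2, 10, 0), datetime(2024, 1, 3, 10, 0)),
    (datetime(2024, 1, 4, 8, 0), datetime(2024, 1, 5, 8, 0)),
]

# Lead Time for Changes: commit -> production, summarized as a median
lead_times = sorted(deployed - committed for committed, deployed in changes)
median_lead_time = lead_times[len(lead_times) // 2]

# Deployment Frequency: deployments per day over the observed window
deploy_times = [deployed for _, deployed in changes]
window_days = (max(deploy_times) - min(deploy_times)).days or 1
deploys_per_day = len(deploy_times) / window_days

print(f"Median lead time: {median_lead_time}")   # 1 day, 0:00:00
print(f"Deploys per day:  {deploys_per_day}")    # 1.0
```

Three deployments over a three-day window give one deploy per day, with a median commit-to-production lead time of one day.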

What Are the Benchmark Levels for DORA Metrics?

DORA Metrics benchmark levels:

Metric                  | Low                            | Medium                                           | High                                     | Elite
------------------------|--------------------------------|--------------------------------------------------|------------------------------------------|--------------------------
Deployment frequency    | Fewer than once per six months | Between once per month and once every six months | Between once per week and once per month | Multiple deploys per day
Lead time for changes   | More than six months           | Between one month and six months                 | Between one day and one week             | Less than one hour
Time to restore service | More than six months           | Between one day and one week                     | Less than one day                        | Less than one hour
Change failure rate     | 16-30 percent                  | 16-30 percent                                    | 16-30 percent                            | 0-15 percent
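The benchmark table above can be turned into a simple classifier. This is a sketch under stated assumptions: the published bands leave gaps (for example, lead times between one hour and one day fall between Elite and High), so the inclusive cut-offs and the rough six-month window below are choices of this example, not part of DORA's definition:

```python
from datetime import timedelta

HOUR = timedelta(hours=1)
WEEK = timedelta(weeks=1)
SIX_MONTHS = timedelta(days=182)  # rough six-month window (assumption)

def classify_lead_time(lead_time: timedelta) -> str:
    # Treating the band edges as inclusive cut-offs is an assumption;
    # the published table does not specify values that fall in the gaps.
    if lead_time < HOUR:
        return "Elite"
    if lead_time <= WEEK:
        return "High"
    if lead_time <= SIX_MONTHS:
        return "Medium"
    return "Low"

def classify_change_failure_rate(rate: float) -> str:
    # Only the Elite band (0-15 percent) is distinct in the table;
    # 16-30 percent spans Low through High performers.
    return "Elite" if rate <= 0.15 else "Low/Medium/High (16-30%)"

print(classify_lead_time(timedelta(days=3)))   # High
print(classify_change_failure_rate(0.10))      # Elite
```

A team shipping changes within three days of commit would land in the High band for lead time, while a 10% change failure rate is already Elite.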

Points to Keep in Mind While Using DORA Metrics:

The four DORA outcome metrics are inseparably linked to one another.

  • Some teams optimize only for Throughput at the expense of Stability. Such teams soon incur extra work because of poor quality, which means they end up going slower (hurting Throughput after all) and overall productivity drops dramatically.
  • It is important to balance Stability and Throughput, and even more important to do so in a way that is sustainable over the long term.
  • DORA metrics can help us amplify our engineering discipline. The goal is long-term success, not short-term gratification.
  • The four DORA outcome metrics are trailing indicators: they tell you how you did, not how you are going to do. (Practices such as continuous integration and test-driven development predict good scores on these trailing indicators.)

It is important to consider all the DORA metrics as a whole, not independently of one another, to get a true picture of our progress. These metrics should not become goals in themselves, and how we interpret them will depend on the context; we have to be thoughtful about what the results mean. DORA metrics are not easy to game or misinterpret, but we still have to be cautious about how we use them. Unfortunately, there is no silver bullet here.