In one of our latest Meta-Cast episodes, Josh and I went through a couple of questions from the audience. One of the questions concerned what to measure for individual developers. To be honest, I was taken aback by the question.
You see, I’ve been preaching for years that when you move to agile metrics, you want to do four things:
(One) Move from individual team member metrics to a more holistic, whole-team view.
(Two) Move from measuring functional teams (for example, test team progress) toward that same holistic, executing agile team view.
(Three) Collect fewer metrics and focus them in four distinct areas of interest. They are:
- Predictability
- Value
- Quality
- Team Health
(And Four) Make your metrics more output or results based than input based. For example, instead of being interested in “planned velocity”, be more interested in the “resulting velocity” from your teams.
An addendum here is that trending is much more important than any specific data point. Then, recently, I ran into the following thoughts around agile metrics.
Outcome Metrics
Gabrielle Benefield wrote about outcomes being what truly matters in her article on the subject. She makes the case for “outcome” metrics that shift the focus away from the traditional views. Here’s an excerpt:
Throughput tells us how fast we go, output tells us how much we delivered. But why are we doing any of this work? What are the problems we are trying to solve? What are the business results we desire?
By measuring throughputs and outputs, we are incentivising people to deliver more of them. More creates more waste. While some people may believe if they throw enough features out it may increase the probability of hitting their target, it feels eerily similar to the belief that if you put enough monkeys in a room typing for long enough they will produce the complete works of Shakespeare. This is not only misguided, but is wasteful for our products and our planet. Yes, you heard it right. Think of the amount of resources being used for all of those wasted features, let alone all the power they end up consuming as they linger around on servers for years. When the glaciers melt away, you know who to blame.
The one thing very few organisations appear to be measuring is Outcomes. Outcomes are the value we create. They range from wanting to increase revenue for a company, to improving the usability and learnability of a product.
In spite of all of this, we continue to measure what is easy rather than what is important. I think the reason is that it is not as simple as measuring throughput or output. You have to think, understand and test relentlessly.
Type of Metric
Based on her article and my own thinking, I’ll try to categorize each of the metrics below as one of the following:
Input – something defined or measured at the “beginning” of the pipeline. A typical measure might be a planned value, for example, planned test cases.
Output – something that results from the pipeline being executed. Velocity is a good example of an output metric.
Outcome – and to Gabrielle’s point, these are metrics focused on experimentation and customer testing. For example, measuring actual customer satisfaction with a new release’s feature set by leveraging a survey.
Metrics Brainstorming List
Most of these are wild and crazy ideas. However, I thought it might be useful to share some of them as candidates for what to measure in agile teams.
What might be even more interesting is what I haven’t listed. For example, individual metrics (test cases run per tester per day) or functional metrics (test case run coverage per day per plan).
- Features removed from the Product Backlog—over releases, quarterly, or perhaps as a percentage? Features de-scoped or simplified in the Backlog; same as above. These two are trying to show if we’re as willing to subtract as we are to add. (Value, Output)
- Stop the line events for a release or across an organization; could include Continuous Integration / Continuous Deployment stops and/or other “process” stops. Clearly a “quality centric” sort of look into the organization. (Quality, Outcome)
- Root Cause discovery sessions conducted per team – per Release; could view types of corrective actions and correlate patterns across teams? Again a quality-centric metric—are we truly focusing on continuous improvement? We could also consider the sheer number of issues and their trending over time. (Quality, Outcome)
- Number of Retrospective Items resolved per team. I’ve often wanted a way to look at the retrospective and still maintain the integrity/confidentiality of it for the team. I wonder if this would work? (Quality, Output)
- Number of stories delivered with zero bugs – per Sprint, per Release? And by zero bugs, I’m implying newly introduced bugs. Is the team holding to their agile quality values? (Quality, Output)
- Number of User Stories that were reworked based on PO / Customer review; perhaps even maintain a “Cost of Rework” factor. There is a range here where this is very healthy. But it can also become repetitive and unhealthy. How would we determine that? (Value, Output)
- The percentage of technical debt addressed; targets > 20%. You’d need to be clear about what fell into the “technical debt” bucket. (Value or Quality, Outcome)
- Trending of velocity per team, perhaps a rolling average in story points. What’s interesting here is release-level predictability per team in points. Avoid aggregating results across teams (organization-level) and beware of the effect of team turbulence. NO individual team member velocity measurement! (Predictability, Output)
- Time-stamp types of work as they move through your teams—capturing throughput per story size. Then you’ll have a range and average/mean for story throughput, which you can draw from to forecast your release-level commitments (see the first sketch after this list). Avoid aggregating results across teams (organization-level) and beware of the effect of team turbulence. (Predictability, Output)
- Delivery predictability per sized user story; average variance across teams and is the trending improving? This is the non-specific variant of the above—simply looking at raw story delivery variance. (Predictability, Output)
- The percentage of test automation coverage (including the UI, component/middle-tier, and unit levels). Or instead of coverage, you could show the ratio of planned vs. running automation. Trending here would be the most interesting bit. (Quality, Output)
- The percentage of each sprint spent on automation investments. The percentage of each sprint spent on Continuous Integration and Continuous Deployment investments. Again, trending over time would be the most interesting. (Quality, Output)
- Team agile health survey data; monitor trending & improvements. This could include a Happiness Factor for the team. (Team Health, Outcome)
- Training budget per agile team member. Simple. Perhaps look at year-over-year trending. (Team Health, Output)
- Agile Maturity Survey – there is a wide variety of these sorts of tools. I lean towards the Agile Journey Index (AJI) developed by Bill Krebs. The AJI strikes a very nice balance between measuring and not being too heavy-handed in reacting to team-based maturity and evolution. (Team Health, Output & Outcome)
- Customer usage of delivered features (actual usage). Somehow instrument your application or product so that usage can be collected and analyzed. For example, Google Analytics for web pages. Establish Product-driven Business Case values and then capture actual values. The delta might be interesting. (Value, Outcome)
- Customer surveys to identify value delivered (actual, not wishful). Perhaps use some sort of Net Promoter Score (see the second sketch after this list)? (Value, Outcome)
- Dedicated Scrum Masters vs. Multi-tasking Scrum Masters (Dual or more roles). Create some sort of ratio that represents your investment in the core roles for agile transformation. Could extend this to coaches and Product Owners as well. (Team Health, Output)
- Number of failures due to teams experimenting, taking risks, stretching, etc. I might be looking for something greater than zero. Could also make this cross-team or organizational. (Value or Team Health, Output)
- Measure organizational commitment levels to date-driven targets/expectations for any delivered release. This is an indicator of the level of planning within your organization and who is “signing up” for the credibility and feasibility of the plan. Score it along these lines:
  - 1 – senior leadership only
  - 5 – plus directors & managers
  - 10 – plus team leads
  - 15 – plus whole team
  - 25 – plus integration across ALL contributing teams
  - 50 – EVERYONE (contributing to the release) thumbs up…
  (Predictability, Output)
- New test cases added per Release or per Quarter. Retired test cases removed per Release or per Quarter. Both focus on the effort to keep our testing relevant in real time. (Quality, Outcome)
- Stakeholder & Senior Leadership attendance at Team Sprint Reviews. Could measure distinct feedback from this group, rework driven by feedback, and pure attendance. Point is, how often are the “right people” in the room for the review? And were they engaged? (Value, Outcome)
- Team Churn – measuring internal and external changes (impacts) to a team, sprint over sprint. At iContact we had a formula for this that created an index of sorts; I simply can’t recall it. But it’s a powerful metric because churn is essentially waste that slows the team down. Or you could correlate it with velocity. (Predictability, Output)
- Backlog grooming and look-ahead are incredibly important team activities. There are also milestones where you want to have groomed the work for the next release – within the current release. Track backlog grooming frequency and the pace of story analysis within it. I often use an egg timer to reinforce “crisp” grooming of stories at a 5-8 minute pace per story. (Value, Output)
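A couple of the ideas above lend themselves to a quick illustration. First, the time-stamping and throughput items: below is a minimal Python sketch of the mechanics, grouping cycle times by story size and producing a deliberately naive forecast. The story data is entirely made up; in real life you’d pull the timestamps from your tracking tool, and you’d probably forecast from the full distribution rather than simple averages.

```python
from datetime import date
from statistics import mean

# Hypothetical story records: (story points, start date, done date).
# In practice you'd pull these timestamps from your tracking tool.
stories = [
    (3, date(2014, 3, 3), date(2014, 3, 7)),
    (5, date(2014, 3, 4), date(2014, 3, 12)),
    (3, date(2014, 3, 10), date(2014, 3, 13)),
    (8, date(2014, 3, 11), date(2014, 3, 24)),
    (5, date(2014, 3, 17), date(2014, 3, 25)),
]

# Group cycle times (calendar days from start to done) by story size.
cycle_times = {}
for points, start, done in stories:
    cycle_times.setdefault(points, []).append((done - start).days)

# For each size: the range and average you could quote when forecasting.
for points in sorted(cycle_times):
    days = cycle_times[points]
    print(f"{points}-pt stories: n={len(days)}, "
          f"range {min(days)}-{max(days)} days, avg {mean(days):.1f}")

# A naive release forecast: sum the average cycle time of each
# remaining backlog story, assuming one story in flight at a time.
backlog = [3, 5, 5, 8]  # sizes of the remaining stories
forecast = sum(mean(cycle_times[p]) for p in backlog)
print(f"Naive single-stream forecast: ~{forecast:.0f} days")
```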
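Second, since a couple of the survey-oriented items mention Net Promoter Score, here’s the standard NPS calculation (promoters answer 9-10 and detractors 0-6 on a 0-10 “how likely are you to recommend us?” question), with hypothetical responses:

```python
def net_promoter_score(ratings):
    """Standard NPS: % promoters (9-10) minus % detractors (0-6),
    from 0-10 'how likely are you to recommend us?' survey answers."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical responses gathered after a release.
responses = [10, 9, 9, 8, 7, 7, 6, 4, 10, 8]
print(f"NPS: {net_promoter_score(responses):+.0f}")  # -> NPS: +20
```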
Wrapping Up
Remember the title of the article. These are potentially weird, outrageous, or downright silly metrics ideas. I’m not sure if any of them deserve serious consideration in your agile projects, teams, and work.
Now I wouldn’t have written them down and shared them if I didn’t think they had “potential”. Still, I’d like to hear back from you on their viability as high quality, agile-centric measures.
- Which ones are viable or do you like?
- Which ones aren’t?
- Do you have any “pet” metrics that you’d like to add to my list? Please do so.
- What do you think about the notion of: input, output, and outcome metrics? Does it matter?
- And if you had to “boil it down” to 5 simple metrics to measure High-Performance agile teams, what would you look at?
I’m looking for any and all feedback, so please give it to me.
Stay agile my friends,
Bob.
Great discussion!
For consideration:
Your top-level KPI categories are predictability, value, quality, and team health. I would argue that in a context where requirements are fuzzy and shifting (most agile contexts), you should replace “predictability” with “flow” (measuring WIP, cycle time, throughput, etc.) and make sure to add “learning” metrics to “value”. I would also argue that you might want to compress “value” & “quality”: if quality doesn’t lead to value, who cares?
In summary, I would argue for 3 top level categories for agile team KPIs
– Flow (e.g. cycle time)
– Value (e.g. increase in revenue, reduction in % tech debt backlog)
– Team Health (e.g. team Net Promoter Score)
Last thing, you might consider breaking value into
– revenue/market impact/customer benefit kpis vs.
– internal software quality metrics (test case coverage, OO complexity metrics, etc)
I think this is a slippery slope because both ultimately measure contribution to enterprise value, and when they’re separated, execs and teams tend to devalue the internal metrics.
I love this topic and hope there is continued conversation here.