On creating effective team performance metrics

My observation, as a person running a technical team in a startup, is that people often get team performance metrics wrong, me included. It turns out that when you formulate metrics, you are not just creating a way to measure performance; you are deciding how people will behave, which in turn shapes the culture of the team. People work to meet the metrics so that they look good, so the metrics must nudge them in the right direction. Otherwise, you will soon be looking at bad, unintended consequences, because you have created the wrong incentives. For example, if you judge a developer on how many bugs he or she fixed in the last quarter, a disingenuous engineer could introduce bugs only to fix them later and look good. The metrics have been met, so the performance looks good. It is not difficult to see that this kind of behavior has bad consequences: the company suffers because of the engineer's bad decisions. Moreover, team members who wrote high-quality code and had very few or no bugs will be unhappy that they have been judged negatively when they actually performed well.

Please note that this post is written with young startups in mind, not big companies like Facebook, Microsoft, or Netflix. Big companies have longer projects and bigger teams that are split into areas. Startups often don’t have that luxury, and measuring performance based on projects becomes difficult. So what do we do? Here are my two cents on how to go about it.

  • The goal is that people should be incentivized to do the right thing. The performance measurement process has to help engineers do it.
  • Look at the team and determine the kind of behavior/culture you want the team to exhibit.
  • Think about how a team member will go about meeting the metrics and whether there is a way to manipulate them to look good. Ideally there should be none. Do you have any safeguards to address it? Ex – can people give bad estimates of their work and get away with it? Someone needs to cross-check.
  • Look at behaviors that you want people to exhibit but that are subjective when it comes to measuring them, like helping team members.

You are not deciding the process of measuring performance; you are actually deciding the culture. This is very important to understand when determining what metrics to use.

atishayk

Suggestions on metrics to use

Here is a list of behaviors that I want to encourage in my team

  • Are our engineers doing adequate work, i.e. quantity of work? We want people to be incentivized to do work.
  • Are our engineers doing high-quality work? We want people to be incentivized to not goof up and to hold high standards of software development.
  • Are our engineers improving with time in terms of quality and quantity of work done? This is important to measure because we want engineers to perform better with time.
  • Are they fixing issues that come up post production at the right speed? Ex – are high priority bugs being fixed as soon as possible?

Here are the metrics I use, if you agree with the above.

  • Number of feature MRs (Merge Requests) done per week. Done means the code is merged into the master branch and is deployed.
    • What does it tell us – Is the person doing enough work? This is a measure of throughput.
    • Why – It is important to know if enough work is being done and if there is a sudden drop in work done.
    • How to measure – Each feature has to be given points, say 1 point for 1 day's worth of work. This is essential because different features require different amounts of work. Add a field for feature points to each feature in the bug tracking software of your choice. Each MR should have a tag to tell whether it is a feature or a bug. You need to check whether your repository tool has an automated way to capture this information via APIs. Otherwise, an alternative solution is needed.
  • Number of bugs found post deployment
    • What does it tell us – Is the work done with high quality? We should have an additional tag to track whether a bug is a regression (i.e. the feature was tested and was working earlier but is not working now) or a new bug (i.e. it was not found in testing and was found later on).
    • Why – Important because we want good quality. We should care about quality in addition to quantity. We also care about new work breaking old working features.
    • How to measure – In the bug tracking tool, we should mark each bug to clearly tell whether it is in a released feature or an unreleased feature, and whether it is a regression or not. Bugs in unreleased features should not be counted for this metric. They can be measured separately, if required.
  • Time to Merge a code review
    • What does it tell us – How long do we take to give feedback to engineers after they submit their work?
    • Why – This is important because we should not be slowing down individual engineers. It should be measured to find out if there are bottlenecks that are not in the hands of engineers.
    • How to measure – You need to check if your Git repository has APIs to pull this information.
  • Code review comments
    • What does it tell us – A high number of comments means that people are not submitting high-quality work.
    • Why – We want to know whether people are able to deliver high-quality code. A lot of comments means there is a lot for the engineer to learn. Code review comments should go down with time, which indicates improvement in the quality of work.
    • How to measure – We can get this information from Git Repository APIs.
  • Unit Test Cases per checkin
    • What does it tell us – Are we writing a good number of test cases to make sure code quality can be maintained?
    • Why – We want to make sure automation can be run for the features and that regressions are caught by it instead of being found post production.
    • How to measure – This could be a simple yes/no metric or you can give it a score based on the amount of code coverage.
  • Time to fix HIGH Priority Bugs
    • What does it tell us – Once we find issues that are high priority, do we fix them on time or not?
    • Why – We should care about customer experience, and important bugs should be fixed at a fast pace.
    • How to measure – Bug priority can be used to tag a bug, and the time to fix can then be measured. If you need a separate tag, you can create one.
  • Helping peers and collaboration
    • What does it tell us – Are people helping each other, and are they collaborating well?
    • Why – Teams must work well with each other. It is important to know if there is harmony in the team.
    • How to measure – This is a subjective metric. You can have team members rate each other on a scale of 1 to 10. Ratings can be kept anonymous (so that people are more comfortable giving an honest rating). Alternatively, you can have the engineering manager do the rating on the basis of his or her day-to-day observations (but visibility into effective collaboration is difficult, and it can be hard to make a correct judgement). You decide how you want to go about it.
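
As a rough illustration of the "How to measure" notes above, here is a minimal Python sketch of how a few of these metrics could be computed once the data has been exported from your repository and bug tracking tools. The record types (MergeRequest, Bug), their field names, the priority label "HIGH", and the sample data are all hypothetical; they are assumptions made for illustration and do not correspond to any particular tool's API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MergeRequest:
    # Hypothetical record, assumed to be exported from the repository tool.
    author: str
    kind: str                      # MR tag: "feature" or "bug"
    points: float                  # feature points; 1 point = 1 day of work
    opened_at: datetime
    merged_at: Optional[datetime]  # None while the MR is still open


@dataclass
class Bug:
    # Hypothetical record, assumed to be exported from the bug tracker.
    priority: str                  # e.g. "HIGH", "MEDIUM", "LOW"
    released: bool                 # True if found in a released feature
    regression: bool               # True if the feature was working earlier
    opened_at: datetime
    fixed_at: Optional[datetime]   # None while the bug is still open


def feature_points_per_week(mrs):
    """Sum merged feature points per (author, ISO year, ISO week)."""
    totals = defaultdict(float)
    for mr in mrs:
        if mr.kind == "feature" and mr.merged_at is not None:
            year, week, _ = mr.merged_at.isocalendar()
            totals[(mr.author, year, week)] += mr.points
    return dict(totals)


def post_release_bug_counts(bugs):
    """Count released-feature bugs, split into regressions and new bugs."""
    counts = {"regression": 0, "new": 0}
    for bug in bugs:
        if bug.released:  # bugs in unreleased features are not counted
            counts["regression" if bug.regression else "new"] += 1
    return counts


def mean_hours(pairs):
    """Mean hours between (start, end) pairs; unfinished items are skipped."""
    hours = [(end - start).total_seconds() / 3600
             for start, end in pairs if end is not None]
    return sum(hours) / len(hours) if hours else None


def mean_hours_to_merge(mrs):
    return mean_hours((mr.opened_at, mr.merged_at) for mr in mrs)


def mean_hours_to_fix_high_priority(bugs):
    return mean_hours((b.opened_at, b.fixed_at)
                      for b in bugs if b.priority == "HIGH")


# Made-up sample data, for illustration only.
mrs = [
    MergeRequest("asha", "feature", 2.0, datetime(2024, 1, 1, 9), datetime(2024, 1, 2, 9)),
    MergeRequest("asha", "feature", 1.0, datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 21)),
    MergeRequest("ravi", "bug", 0.0, datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 15)),
    MergeRequest("ravi", "feature", 3.0, datetime(2024, 1, 4, 9), None),
]
bugs = [
    Bug("HIGH", True, True, datetime(2024, 1, 5, 9), datetime(2024, 1, 5, 13)),
    Bug("HIGH", True, False, datetime(2024, 1, 6, 9), datetime(2024, 1, 6, 17)),
    Bug("LOW", False, False, datetime(2024, 1, 6, 9), None),
]

print(feature_points_per_week(mrs))
print(post_release_bug_counts(bugs))
print(mean_hours_to_merge(mrs))
print(mean_hours_to_fix_high_priority(bugs))
```

Open MRs and unfixed bugs are skipped in the averages here, which is a simplification: a long-open item will not show up until it closes, so you may want to track those separately. Code review comment counts and peer ratings could be aggregated in the same way once you decide how to export them.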

Some metrics that I do not suggest that you use

  • Burndown chart – This is good for measuring work done vs work remaining but is not an indicator of performance. It is more useful for project management. If there is too much work, it is difficult to finish it on time, and that implies a project management issue rather than an engineering issue.
  • Measuring performance on a project basis – In the startup world, projects are shorter, they change often, and new projects come up at a faster pace. Project-based performance management may be useful when dealing with large projects with dedicated teams.

I am eager to hear thoughts on the proposed metrics and how they can be made better.

5 thoughts on “On creating effective team performance metrics”

  1. I think your approach is quite balanced.
    How do you handle the case where one bug is more complex and difficult to fix than another one?

    1. Thank you for liking the approach. One way to handle a complex and difficult bug is to convert it into a feature and link it to the bug. Measure that feature like any other feature. Another approach is to associate effort with a bug, but then you end up creating additional process overhead, and I tend to stay away from that. I prefer a lightweight process.

    1. I would love to hear examples of over-indexing on the metrics that I listed above (you can simply disagree with the metrics as well) and your thoughts on avoiding it.
