This post was originally supposed to be a comment in Mike Talks' blog (http://testsheepnz.blogspot.co.nz/2012/02/are-we-there-yet-metrics-of-destination.html) but it became a bit lengthy so I decided to post it here...
There are great posts about metrics available already, so I don't want to dig too deeply into the subject. Nonetheless, I would like to comment on your interesting blog post!
I like the start with "are we there yet"! I wish more bloggers would make such associations between "real life" and software projects / testing.
When you mentioned that you estimate how long testing will take, do you mean a case where you and the project manager have a long history together and you both know what kind of testing you will do in that time? I am asking because testing is never done, and I don't think that is a very safe way for a PM to create a testing budget. Will the code be done at that point? Do you (you and the PM) have a lot of history with the same product?
"I know some people hate recording hours on a project. I personally think it's vital, because it helps a manager to determine spend on a project."
Yes, in some (possibly even most) cases this is very good. From a testing point of view, I am usually interested in knowing how much time (e.g. in 90-minute sessions) is used to test certain features/functionalities. For example, if no bugs are found in a week, I could start asking why we are not finding bugs and whether we could use that testing time more effectively elsewhere. (Assuming the goal is to raise bugs.)
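This kind of bookkeeping can stay very lightweight. Here is a minimal sketch (the feature names and session data are invented for illustration) that tallies 90-minute sessions per feature and flags features where testing time was spent but no bugs turned up, which is exactly the kind of "why?" prompt mentioned above:

```python
from collections import defaultdict

# Hypothetical session log: (feature tested, bugs found in that session)
sessions = [
    ("login", 2),
    ("login", 0),
    ("checkout", 0),
    ("checkout", 0),
    ("search", 1),
]

SESSION_MINUTES = 90  # fixed-length exploratory sessions

minutes_per_feature = defaultdict(int)
bugs_per_feature = defaultdict(int)
for feature, bugs in sessions:
    minutes_per_feature[feature] += SESSION_MINUTES
    bugs_per_feature[feature] += bugs

# Features that consumed testing time but yielded no bugs --
# a prompt to ask questions, not a verdict on quality.
quiet_features = [f for f in minutes_per_feature if bugs_per_feature[f] == 0]

print(dict(minutes_per_feature))
print(quiet_features)  # e.g. ['checkout']
```

Note that the output is meant for inquiry ("why is checkout quiet?"), not for control decisions.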
"What about the number of test requirements tested and the number passed? Personally I like this metric, as it gives a good feel of how many paths and features we've tested, and I do think it's useful to keep track of this (as long as it's relatively painless)."
In many cases this can be a good thing to track. There can be, for example, legal requirements that need to "pass". However, as you point out in your text, it can also be very misleading. Even if 100% of the requirements pass, that doesn't mean the product is good or free of critical bugs.
"Another metric I've seen is simple number of test cases run, and number passed. ... However it's more than likely a lot easier to track this number than the number of requirements if you're running manual test scripts which are just written up in Word (unless you're an Excel wizard)."
This is easy to track, but I don't see it telling us anything really interesting. What would be interesting to know is how you use this metric. Is it used to raise questions or to make decisions (inquiry or control)?
One thing I want to stress: passed tests can be dangerous to follow, because they don't tell us much (if anything) about the product. They might even give management false confidence in the stability/quality of the product.
"What about measuring defects encountered for each build?"
I like to see how these change over time and what questions the change might raise. Your text describes exactly such a case, where earlier bugs were blocking further testing.
When it comes to the regularity of metrics, I think automated/scripted systems would be great for producing the numbers in many cases. Numbers can raise good questions, but I would prefer not to spend too much testing time collecting them. This depends, of course, on things like project size.
"I had to keep daily progress number updates. After day 5 I had still not finished script 1. In fact after day 8 I still wasn't done on script 1. On day 10 all 3 of my scripts were completed."
Maybe the progress should have been communicated differently? It sounds like the unit being reported was too big for a single progress step. I don't know what the script involved, but here is an example breakdown for a UI check:
- Needed (navigation) controls are coded
- The flow between controls is automated
- The assertions/verifications/requirements of each successful step are automated
- Minor changes to produce script #2
- Minor changes to produce script #3
- Testing, fixing and re-factoring the scripts
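To make the granularity concrete, each of the steps above could become a visible checkpoint in the script work itself. A dependency-free sketch (the step names and the `run_with_progress` helper are invented for illustration) of reporting per step rather than per script:

```python
# Hypothetical breakdown of one UI-check script into reportable steps.
# Completing a step is a progress event the manager can see, instead
# of days of "script 1 still not done".

def code_navigation_controls():
    return "navigation controls coded"

def automate_flow_between_controls():
    return "flow between controls automated"

def automate_step_assertions():
    return "per-step assertions automated"

STEPS = [
    code_navigation_controls,
    automate_flow_between_controls,
    automate_step_assertions,
]

def run_with_progress(steps):
    """Run each step and yield one progress line after it completes."""
    for i, step in enumerate(steps, start=1):
        result = step()
        yield f"step {i}/{len(steps)}: {result}"

for line in run_with_progress(STEPS):
    print(line)
```

With five days spent on "script 1", a report like "step 2/3 done" would have told the manager far more than "0 of 3 scripts finished".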
As we saw, because your comments helped the manager understand there was nothing to worry about, the metric was (close to) useless. Your explanation was the better report, and the number could have been tossed away.