My colleague and friend Eric Jacobson, who recently (as I write) did a bang-up job on his first conference presentation at STAR West 2011, asks a question in response to this blog post from 2006. (I like it when people reflect on an issue for a few years.) Eric asks:
You are suggesting it may not make sense for testers to give time-based estimates to their teams, but what about relative estimates? Let’s say a Rapid Software Tester is asked to participate in Planning Poker (relative-based story estimation) on an Agile Scrum team. I’ve always considered this a golden opportunity. Are you suggesting said tester may want to refuse to participate in the Planning Poker?
Having observed Planning Poker in action, I’m conflicted. Estimating anything is always a bit of a dodgy business, even at the best of times. That’s especially true for investigation and in particular for discovery. (I’ve written about some of the problems with estimation here and in subsequent posts, and with how those problems pertain to testing here.) Yet Planning Poker may be one way to get a good deal closer to the best of times. I like the idea of testers hearing what’s going on in planning sessions, and of offering perspective on the possible implications of work or change. On the other hand, at Planning Poker sessions I’ve observed or participated in, testers are often pressured to lower their numbers. In an environment where there’s trust, there tends to be much less pressure; in an environment where there’s less trust, I’d take pressure to lower the estimate as a test result with several possible interpretations. (I leave those interpretations as an exercise for the reader, but don’t stop until you get to five, at least.)
In any case, some fundamental problems remain: First, testing is oriented towards discovering things, not building things. At the root of it all, any estimate of how long it will take to test something is like estimating how long it will take you to evaluate someone’s ability to speak Spanish (which I wrote about here), and discovering problems in their ability to express themselves. If you already know something or can reasonably anticipate it, that helps a lot, and the Planning Poker approach (among many others) can help with that to some degree.
The second problem is that there’s not necessarily symmetry between the effort in creating something and the effort in testing it. A function or feature that takes very little effort to program might take an enormous amount of effort to test. What kinds of variation could we put into data, workflow, timing, platform dependencies and interactions, scenarios, and so forth? Meanwhile, a feature that takes significant amounts of programming effort could take almost no time to test (since “programming effort” could include an enormous amount of testing effort). There are dozens of factors involved, including the amount of testing the programmers do as they code; what kind of review is being done; what the scope of the change is; when particular discoveries get made (during “development time” or “testing time”); the skill of the parties involved; the testability of the product under test; how buggy the finished feature is (in which case there will be more time needed for investigation and reporting)… Planning Poker doesn’t solve the asymmetry problem, but it provides a venue for discussing it and getting started on sorting it out.
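To see why a feature that’s trivial to program can still imply a large test space, consider how quickly those variations multiply. This is only a sketch; the dimensions and values below are hypothetical, and real products have many more of each:

```python
from itertools import product

# Hypothetical variation dimensions for one small, "simple" feature.
dimensions = {
    "data":     ["empty", "typical", "boundary", "malformed"],
    "workflow": ["happy path", "cancel midway", "retry"],
    "timing":   ["single user", "concurrent"],
    "platform": ["Windows", "macOS", "Linux"],
}

# Every combination of one value per dimension is a potentially
# interesting test condition.
combos = list(product(*dimensions.values()))
print(len(combos))  # 4 * 3 * 2 * 3 = 72 combinations
```

Seventy-two conditions for one feature, before we’ve considered scenarios or interactions with other features; the programming effort behind the feature tells us very little about that number.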
The third problem, closely related to the second, is this idea that all testing work associated with developing something must and shall happen within the same iteration. Testing never ends; it only stops. So it’s folly to think that all testing for a given amount of programming work can always fit into the same iteration in which the work is done. I’d argue that we need a more nuanced perspective and more options than that. The decision as to how much testing we’ll need is informed by many factors. Paradoxically, we’ll need some testing to help reveal and inform our notions of how much testing we’ll need.
I understand the desire to close the book on a development story within the sprint. I often—even usually—share that desire. Yet many kinds of testing work must respond to development work, and in such cases the development work has to be complete in some lesser sense than “fully tested”. Many kinds of confirmatory checking work, it seems to me, can be done within the same sprint as the programming work; no problem there. Yet it seems to me that other kinds of testing can reasonably wait for subsequent sprints—indeed, must wait for subsequent sprints, unless we’d like to have programmers stop all programming work altogether after a certain day in the sprint. Let me give you an example: in big banks, some kinds of transactions take several days to wend their way through batch processes that are run overnight. The testing work associated with that can be simulated, for sure (indeed, one would hope that most of such work would be simulated), but only at the expense of some loss of realism. For the test, whether the realism is important or not is always an open question with a fallible answer. Instead of making sure that there’s NO testing debt, consider reasonable, small, and sustainable amounts of testing debt that span iterations. Agile can be about actual agility, instead of dogma.
So… If playing Planning Poker is part of the context, go for it. It’s a heuristic approach to getting people to consider testing more consciously and thoughtfully, and there’s something to that. It’s oriented towards estimating things in a more comprehensible time frame, and in digestible chunks of task and effort. Planning Poker is fallible, and one approach among many possible approaches. Like everything else, its usefulness depends mostly on the people using it, and how they use it.
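The mechanics are simple enough to sketch in a few lines. In a typical round, everyone reveals a card at once, and a wide split between the lowest and highest votes triggers discussion before a re-vote. The roles, names, and numbers below are hypothetical; the point is that the split itself is the interesting result:

```python
# A minimal sketch of one Planning Poker round (names and votes hypothetical).
FIBONACCI_DECK = [1, 2, 3, 5, 8, 13, 21]

def reveal(votes):
    """Reveal all cards at once; return the outliers who should explain first."""
    low, high = min(votes.values()), max(votes.values())
    if low == high:
        return []  # consensus: the estimate stands
    # A wide split is a test result in itself: it signals different
    # understandings of the story, not just different numbers.
    return [name for name, v in votes.items() if v in (low, high)]

round_one = {"programmer": 3, "tester": 13, "product_owner": 5}
print(reveal(round_one))  # the programmer's 3 and the tester's 13 start the talk
```

When the tester’s card is the high outlier, the ensuing conversation—about variation, risk, and testability—is arguably worth more than whatever number the team eventually converges on.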
16 replies to “Should Testers Play Planning Poker?”
I (as Tester) take part in planning poker together with Devs. But we do not estimate testing, only development. I’m doing it to get more comprehension of the system I’m testing while discussing each task, and to give ideas of potential problems/conflicts. Considering that my estimates usually agree with the Devs’ estimates, I think it works.
Michael replies: If it works for you, it works for you. Cool.
This was something I was wondering about too. I’ve had difficulties estimating the time via this Planning Poker method when I started with a project. This was because of not yet knowing the technology, and not knowing exactly what the vision was (the agile way of doing things).
Asking questions during the sessions and discussions helped a bit to get more insight into the technical stuff, but the impact on the business, and estimating that part, was more difficult, because in the beginning the team took a technical approach.
I would even try to base my estimation on the discussions between the developers. More discussion seemed, to me, to indicate an area that would be more infected with possible risks.
Well, after a few sprints and some exploration I think I did somewhat better. So it is difficult when a project starts, as with all planning, and it gets better during the iterations. Although I still did not feel too comfortable doing it.
To get more insight into the possible issues, I did a follow-up on the planning sessions with an informal risk session: so, what did we plan, and what things do you see that could go wrong?
Maybe it is an idea to combine the Planning Poker with also talking about risks, so that from that point of view the tester gets a better idea about the ‘problem areas’ and the rest of the team also starts to get a feeling for the testing part.
This might speed up test consciousness for the complete team.
Of course this is my own experience; it could be different somewhere else. Just a thought.
Michael replies: Yes; I like putting the idea of talking about risks at the centre of things.
One important test result of Planning Poker is
“hearing what’s going on in planning sessions, and of offering perspective on the possible implications of work or change” – this is what testers find [http://www.developsense.com/blog/2011/03/more-of-what-testers-find/]
In some contexts these findings may be worth more to the project, than knowing the estimate of the said testing.
I think it is essential that testers participate in planning and estimation. They are part of the team and as such should be able to give their input. If that can’t or won’t be done, then you are not really doing Scrum, and it would be better for everyone if testers are *not* part of the Scrum team. Instead there should be a separate testing team which follows the development sprint. You then do miss out on part of the ‘synergy’ between testers and developers.
I have seen several times that developers revised their own estimates after I or another tester pointed out some potential icky parts, or areas that were so important that they either required a lot of regression testing or would involve more development work.
In cases where the testing work is expected to take a serious amount of time, there is always the potential to split the story into two parts: ‘development and basic testing’ and ‘testing and development support’, where the latter can be moved over to the next sprint (as long as the sprint result is not intended to actually go into production).
As always, it is better to be pragmatic about it than to act as if Scrum rules are set in stone, which is a bit contrary to the whole idea of agile.
It’s normal for developers in our team planning sessions to want to discuss testing when we’re discussing a story, and to increase their estimates accordingly if it seems that something will require a lot more testing effort. I have never felt pressured here to reduce my estimates (unlike some previous environments), and I suspect that if I was being pressured my dev colleagues would resist just as strongly as my test colleagues. This is the first Agile Scrum team I’ve worked with so I’m sorry to hear others are less civilised.
I’ve experimented with adding an extra item to discuss, “test depth”, so that we can discuss how risky we feel this card is (and as a result, how much effort we think we want to put into exploring it, & how varied we want to make our tests). I’m not sure yet how useful this is as a separate item, and we may drop it given time, but it’s started the conversation about how when I say “I’m done with this card”, what I mean is “I’m done with testing this card to the level that we agreed it needed, considering how important the functionality it delivers (or affects, to the best of our knowledge) is to our customers, and our best estimates of how technically tricky it would be to implement, and neither I nor the developer(s) working on it have discovered anything that would cause us to re-evaluate that decision.”
I don’t know yet how much more this adds than just discussing story point estimates. We’ve talked about that a lot, some of my developer colleagues feel it’s a bit of a duplication of story points in some ways, and perhaps it is. But it’s brought up plenty of discussion so far – so in that sense it’s been very useful.
Not only that testing spans over several iterations, there is an incremental testing effort being built-up from iteration to iteration.
Michael replies: Huh?
As one can’t assume done as done in testing and regression on past features interaction of past features with newly added features should also be tackled.
“As one can’t assume done as done in testing?” Huh?
So testing has to be defined on two levels: new additions in the iteration can be roughly evaluated (though, as you say, we need to estimate the quality of incoming code), while regression, end-to-end use cases, and interconnection are an incremental effort which one must consider.
We have to convey the message as summarised here:
Michael replies: I think it’s stretching things to say we have to, but that’s certainly the way that I prefer to think about the issue.
Another problem is the fact that people (on the business side and technical side) treat the estimates as actuals. As soon as we provide the estimate we forget it’s an estimate, forget that we’re going to learn new things that will affect the estimate and not bother to adjust said estimates accordingly.
Actuals, yes, or “commitments”. The whole setup is like a customer asking a salesman “Can you tell me how long you need for me to decide to buy something?”
The most important thing I did as a tester in planning poker sessions was not to be shy about asking what I didn’t understand.
I found myself asking questions like “I’ve got a stupid question, but how is the application going to….”
Actually, some of these stupid questions were very helpful for the whole team (other ones only made me wiser 😀 ).
I have realized that in these sessions there are a lot of assumptions not talked out loud. I see a tester as more helpful due to his way of thinking than due to his skill at estimating the effort to test a user story.
You raise good issues. I like the idea of getting everyone to think about testing. If you have to estimate (or guess) how long a feature or story will take to test, awareness of the testing process and the effort required increases. When awareness increases, effort often increases as well. The extra effort just might result in better quality. Estimating the test time is worth a try no matter how difficult it may be.
Assuming we are talking about Scrum, then of course they should. Planning Poker estimates the relative size of the problem, usually during backlog grooming. The solution is not yet known. People don’t think about the solution until the second half of the planning meeting, when they start to discuss what actual tasks need to be performed.
The fact that testing tasks are involved (or not) in developing the solution is completely irrelevant to estimating the rough size of the problem. The solution could involve all sorts of things; all that ends up in the velocity component rather than in the relative sizes of PBIs.
Michael replies: Completely irrelevant? Fascinating. The last sentence sounds like mumbo-jumbo to me, although I’m sure it means something to you.
I totally agree with Mark Rotteveel. One of the most important benefits of planning and estimation sessions with testers participating is the “whole team” effect. Ideally there are no “dev”/“tester” labels in modern teams at all.
According to whose ideals? To fit what purpose? “Ideally there are no labels ‘doctor’/’lab technician’ in modern hospitals at all.” “Ideally there are no labels ‘pilot’/’mechanic’ in modern airlines at all.” What if people like their specialities, and are good at them? Why stop at testers and programmers? Why not have programmers do the book-keeping when the company’s accountants fall behind?
Another benefit is capacity limits. We have used an approach with separated capacity (aka velocity) for testers and developers to make sure that we never commit to tasks that can’t be tested by the end of the iteration. When the testers’ capacity limit is reached but the developers can still take more work, we start thinking about how to distribute testing work among other team members to increase testing capacity.
I’ve never heard “separated capacity” used as a synonym for velocity (which is not a judgment; merely an observation), so I’m not sure what you mean here. I’m skeptical as to how often programmers can stay busy with programming work to the end of an iteration without either testing for the last bit (is that the best use of their time? It may be) or seriously slowing down for the last few days (which might not be the worst thing in the world, either). I understand the motivation for tidy closure on a sprint, too. But just as there can be a modest product backlog, I don’t see a big problem with a modest (a day or two) testing backlog between the end of one non-production iteration and the beginning of the next. And surely we should let teams decide either way for themselves, rather than putting them under the thumb of a calendar or an idealized process. No?
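If I understand the commenter’s “separated capacity” scheme, it amounts to tracking two budgets and refusing to commit a story once either one is exhausted. Here’s a rough sketch of that idea; the story names, point values, and capacities are all hypothetical:

```python
# A sketch of "separated capacity": stop committing stories once either
# the development budget or the testing budget for the iteration is spent.
# All numbers are hypothetical.

def plan_iteration(stories, dev_capacity, test_capacity):
    committed = []
    dev_left, test_left = dev_capacity, test_capacity
    for story in stories:
        # A story is committed only if BOTH budgets can absorb it.
        if story["dev"] <= dev_left and story["test"] <= test_left:
            committed.append(story["name"])
            dev_left -= story["dev"]
            test_left -= story["test"]
    return committed

backlog = [
    {"name": "login",  "dev": 5, "test": 3},
    {"name": "report", "dev": 2, "test": 8},  # cheap to build, costly to test
    {"name": "export", "dev": 3, "test": 2},
]
print(plan_iteration(backlog, dev_capacity=12, test_capacity=10))
# ['login', 'export'] — "report" is skipped even though development
# capacity remains, because the testing budget can't absorb it.
```

Note how the “report” story illustrates the asymmetry problem from the post: plenty of development capacity is left, yet the testing budget is the binding constraint.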
Also, testing estimates reduce risks in the iteration. In Scrum, for example, the team can manage work inside the iteration in any way. With testing estimates in mind, you can build the iteration plan to balance testing work throughout the iteration. Sometimes we started tasks that were easy for developers but complex for testers very early, to minimize the risk of having untested tasks at the iteration demo.
That makes sense to me.
And relative sizes for estimation in Planning Poker work very well for testing activities, because they are not so easy to estimate in raw hours. I agree with the note that some testing activities should run continuously, but only when we have the time and resources to perform them. Most testing tasks can be estimated, and these estimates will increase the flexibility and predictability of the testing process. The customer wants working software instead of perfectly tested software.
One more time: the goal is not to make testing predictable. The goal is for testing to reveal the unpredicted. In any case, James Bach and I are working on some ideas about estimation these days. Stay tuned.
Thank you for sharing this post and good to see valuable comments and response. I agree with the first comment as it comes from someone working on QA Testing. Mikalai Alimenkou has very nicely responded to this…good one!! Look forward to your next post.
I am confused: unless testers have development experience and know the stack of technologies in use, how can they make estimates? It’s a bit confusing to me. I am a BA on a project, and I am not really in a position to include estimates with the rest of the team.
Totally agree with the article.
From my experience, if there is Planning Poker already, then QA needs to be there. Otherwise, there is a risk that a story which is not even ready to discuss will be taken into the development phase. QA should at least be able to say that he or she understands how the story will be tested.
But when it comes to estimates… personally, I have no clue how it is possible for QA to estimate during planning. You need to have at least some checklist of what you are going to test. And how should you predict how many issues/bugs you will find, how much time the developer will take to fix those, and how much time you will need to retest the entire feature or part of it?
It would be interesting to hear about how people deal with all of that…