Scripts or No Scripts, Managers Might Have to Manage

A fellow named Oren Reshef writes in response to my post on Worthwhile Documentation.

Let me be the devil’s advocate for a post.

Not having fully detailed test steps may lead to insufficient data in bug reports.

Yup, that could be a risk (although having fully detailed steps in a test script might also lead to insufficient data in bug reports; and insufficient to whom, exactly?).

So what do you do with a problem like that? You manage it. You train the tester, reminding her of the heuristic that each problem report needs a problem description; an example of something that shows the problem; and why she thinks it’s a problem (that is, the oracle; the principle or mechanism by which the tester recognizes the problem). Problem, example, and why; PEW. You praise and reward the tester for producing reports that follow the PEW heuristic; you critique reports that don’t have them. You show the tester lots of examples of bug reports, and ask her to differentiate between the good ones and the bad ones, why each one might be considered good or bad, and in what ways. If the tester isn’t getting it, you have the tester work with and be coached by someone who does get it. The coach talks the tester through the process of identifying a problem, deciding why it’s a problem, and outlining the necessary information. Sometimes it’s steps and specific data; sometimes the steps are obvious and it’s only the data you need to specify; sometimes the problem happens with any old data, and it’s the steps that are important. And sometimes the description of the problem contains enough information that you need supply neither steps nor data. As a tester under time pressure, she needs to develop the skill to do this rapidly and well—or, if nothing works, she might have to find a job for which she is better suited.

You can argue that a good tester should include the needed information and steps in her bug report, but this raises (at least) two problems:

– The same information may be duplicated across many bugs, and even worse, it will not be consistent.

As a manager, I can not only argue that a tester should include the needed information; I can require that a tester include the needed information. Come on, Mr. Advocate… this is a problem that a capable tester and a capable test manager (and presumably your client) can solve. If “the same” information is duplicated across many bugs, might that be an interesting factor worth noting? A test result, if you will? Will this actually persist for long without the test manager (or test leads, or the test team) noticing or managing it?

And in any case, would a script solve the problem that you post above? If you can solve that problem in a script, can you solve it in a (set of) bug report(s)?

Writing test steps is not as trivial as it sounds (for example due to cognitive biases, or simply by overlooking steps that seem obvious to you), and to be efficient they also need to be peer reviewed and tested. You don’t want that to happen in a bug report.

“Writing test steps is not as trivial as it sounds.” I know. It’s non-trivial in terms of time, and it’s non-trivial in terms of skill, and it’s non-trivial in terms of cost. That’s why I write about those problems. That’s why James Bach writes about them.

Again: how do you solve problems like testers providing inefficient repro steps? You solve it with training, practice, coaching, review, supervision, observation, interaction… that is, if you don’t like the results you’re getting, you steer the testers in the direction you want them to go, with leadership and management.

The tester may choose the same steps over and over, or steps that are easier for her but do not represent real customers.

Yes, I often hear things like this to justify poor testing. “Real customers” according to whom? It seems as though many organizations have a problem recognizing that hackers are real; that people under pressure are real; that people who make mistakes are real; that people who can become distracted are real. That people who get up and go away from the keyboard, such that a transaction times out, are real.

Is it the role of testers to behave always like idealized “real” customers? That’s like saying that it’s the role of airport security to assume that all of the business class customers are “real” business people. I’d argue that it’s nice for testers to be able to act like customers, but it’s far more important for testers to act like testers. It’s the tester’s role to identify important vulnerabilities in the product. Sometimes that involves behaving like a typical customer, sometimes it involves behaving like an atypical customer, and sometimes it involves behaving like someone who is not a customer at all. But again, mostly it involves behaving like a tester.

Again you may argue that a good tester should take all that into account, but it’s not that simple to verify, especially for tests involving many short trivial steps.

Maybe it isn’t that simple. If that’s a problem, what about logging? What about screen capture tools? Such tools will track activities far more accurately than a script the tester allegedly followed. After all, a test script is just a rumour of how something should be done, and the claim that the script was followed is also a rumour. What about direct supervision and scrutiny? What about occasional pairing? What about reviewing the testers’ work? What about providing feedback to testers, while affording them both freedom and responsibility?

And would scripts solve that problem when (for example) you’re recording a bug that you’ve just discovered (probably after deviating from a script)? How, exactly? What happens when a problem identified by a script is fixed? Does the value of the script stay constant over time?

Detailed test steps (at least to some extent) might be important if your test activity might be transferred to another offshore team someday (happened to me a few weeks ago, I sent them a test document with only high level details and hoped for the best), or your customer requires in-depth understanding of your tests (a multi-billion Canadian telecommunication company insisted on getting those from us during the late ’90s, we chose the least readable TestDirector export format and shipped it to them…).

Ah, yes. “I sent them a test document with only high level details and hoped for the best.” What can I say about “hope” as a management approach? Does a pile of test scripts impart in-depth understanding? Or are they (as I suspect) a way of responding to a question that you didn’t know how to answer, which was in fact a question that the telco didn’t know how to ask?

Going through some set of actions by rote is not a test. A test script is not a test. A test is what you think and what you do. It is a complex, cognitive activity that requires the presence or the development of much tacit knowledge. Raw data or raw instructions at best provide you with a minuscule fraction of what you need to know. If someone wanted in-depth understanding of how a retail store works, would you send them a pile of uncontextualized cash register receipts?

The Devil’s Advocate never seems to have a thoughtful manager for a client. I would suggest that a tester neither hire nor work for the devil.

Thank you for playing the devil’s advocate, Oren.

7 replies to “Scripts or No Scripts, Managers Might Have to Manage”

  1. Awesome post, Michael. I so rarely see blog posts that so directly address the test manager’s role and responsibilities to the test team.

    You might want to have a look at Jon Bach’s and Eric Jacobson’s and Huib Schoots’ blogs, and at Jerry Weinberg’s writing (especially Perfect Software and Other Illusions About Testing, The Secrets of Consulting), and his blog.

    I am a test manager, and I try to be responsible to my team: not to waste their time by demanding that they write detailed test cases, but to understand whether THEY truly understand the complexity and scope of the software they are working with. I apply a certain level of trust after some initial observation of my tester, trust that will grow or decrease as more time passes and I observe and work with my tester more and more. When I observe a tester who consistently demonstrates they “get” the application they are working with, I get out of their way, and try to keep the path as free of obstacles as possible. And I leverage them to help the ones that might need more help getting there.

    Your writing always challenges me to think about how to raise my own bar, thanks!

    Michael replies: You’re welcome!

    On a related note, today I was called into a meeting with an IT auditor focused on an audit of one of my major initiatives. The auditor was questioning the QA process. How did we test, how did we document, what information did we get from development, etc. After a bit of back and forth, the auditor said, “Do you think your team could benefit from having time in your schedule to write more thoroughly detailed test cases? We could help you with making that case. I’m used to auditing projects where the QA team has all their test cases documented out in detailed test steps and covering the entire application.”

    This auditor and I have a pretty good working relationship, and she knows that we can be time-challenged on this project. She thought I would jump at the chance to have her help me gain a larger testing window and/or more resources, which in turn would enable time for the testers to sit down and script out everything they wanted to test in the application, and then follow that script. My response was a “thanks, but no thanks”; we are pretty happy with the mix of exploratory testing and checklists that we are doing. We want to continue to grow, but not in the direction of writing out how to test our application from A – Z.

    Great story! Thanks!

  2. The way you tackle each and every single sentence seems a bit over-the-top, Michael, that’s your style, isn’t it?

    Michael replies: That’s one of my styles, in this context in particular. My intention here was not to demolish Oren. He pointed out from the outset that he was playing devil’s advocate. When someone does that, they’re presenting arguments and inviting you to refute them.

    However, I see the point of pointing out the flaws in the thought work by the Oren guy. He seems to have cast himself into the concrete of TCBT, and now, as he’s sinking into the great river of testing, he tries to grab any straws he can, possibly taking as many under with him as he can.

    That’s how he’s acting, for the purpose of the exercise. You’ll see from his reply that he’s not really like that.

    Debating (wherein you agree to argue from a given position, whether or not it jibes with your personal beliefs) is an important skill for testers. Debating well demands that you consider multiple perspectives on an issue, and testers need that kind of mental agility. (Checkers, by the way, don’t; checkers don’t need to understand anything.)

    I believe that acting and role-playing are also important skills for testers. Like debating, they encourage you to take on a different perspective, allowing you to see both the strengths and the weaknesses of alternative points of view. As testers, we often have to deal with things that we might consider irrational. When that happens, as Jerry Weinberg cautions us, “your first step is to stop thinking of it as ‘irrational’, and to start thinking about it as ‘rational from the perspective of a different set of values’”.

    There is no need to defend one’s point of view if it works really well for you. In Oren’s case it seems that the TCBT doesn’t work and he has no way to make it work for him. So he defends it like a dying fire which will inevitably be extinguished, leaving him in the cold and dark. What he needs to do is to refine his way of working and adapt to the situations at hand.

    It seems to me that your last sentence there doesn’t agree with your first. I’d say (first sentence) that there’s no need to defend your point of view if it works for you and for everyone to whom you’re responsible. And if that’s so, (last sentence) he doesn’t need to do anything.

    For example, the situation of unexpected outsourcing: it’s not unexpected to someone (to him, maybe, but he could have found that out), and I believe he lost many good opportunities to communicate with the people who make the decisions by trying to write rigid test cases. Increasing communication within and without the company could have redeemed him with ANY level of test description. He could have broadened the board he plays on by giving more room for intelligence and commitment instead of controlling and narrowing the view.

    People have a tendency to be trustworthy, and with good skills they deliver good results with little knowledge of the product beforehand. The way they do their best work is by learning and testing. If some guidelines are given, then the testing can be directed into the areas where it is most needed. Making people run test cases is not directing, managing, or even guiding: it’s restricting, and it suffocates the inner fire within the testers.

    When it comes to test cases, I see them as documentation of what someone thinks about something. I may absorb information from them, but I question them just as I question every document, guide, instruction, function, everything. Test cases are a relic that should be put into a glass jar and stored in a museum, where intelligent people can be toured around it and told: “This is how it was in history. Now you kids know better, right?”

  3. Great, down to earth post Michael.

    Michael replies: Thank you. And thank you for prompting it.

    I’m coming from the embedded world where tracking activities is not always possible or practical.

    Tracking activities is always possible or practical to some degree, isn’t it? And it’s never possible or practical to track them completely (with all of the appropriate nods to “always”, “never”, and “completely”). By the way, have you seen the book Testing Embedded Software by my friend Bart Broekman? I was impressed by the sensitivity to context expressed in that book—which makes sense, considering that a PIC chip could end up in anything from a nuclear submarine to a Tamagotchi.

    Actually most of your writing is aimed at the non-embedded world, a world in which some of the theories and great ideas just fall apart for practical reasons, forcing you to practice context-driven testing without even calling it that.

    I might go one level higher: there is context in the embedded world too.

    BTW I exaggerated my two examples for the sake of the Devil. In reality the big telco wanted just that – a big pile of… – so they could cross out a vague requirement from upper management, and the offshore team has plenty of experience in that area, albeit using another embedded OS, so I could be sure they were doing a great job.

    I was pretty sure that you were exaggerating for effect. Thanks for doing that.

  4. To Oren Reshef


    I work in the embedded world as well, and we have had no problems using ET for 4 years. It all comes down to, as MB said, ‘appropriate in the current project’. Sometimes we need to track more, sometimes we need to test only briefly, etc.

    The context is even more important, I think, in the embedded world, especially in bug reporting, as you need to frame the report so that it is apparent what the (ultimate) impact is. For example, a missing parameter in a low-level communication protocol command might be a simple thing, or it might take down the whole data center.

    We use session-based test management (with some thread-based elements) to keep track, and we use that for reference when new people join the team or need to know what we have done.

    I suggest keeping an open mind and asking your bosses/clients why they need that test info, and in what form. It might be that they really don’t want all those details, but rather, more generally, what areas have been covered, how and to what extent (and, more importantly, what has not been tested).

  5. Hi Michael, the past four or five posts have been the best I’ve ever read. You’re busting through a lot of misconceptions I’ve encountered.

    Michael replies: Thanks, Joe.

    I’d like to offer my own experience that scripts have not been a great method of capturing and facilitating the majority of testing activities. They have been useful for smoke tests, and for particularly time-consuming functionality where a tester’s mistake can result in hours of lost time. When I was forced to follow scripts, I can definitely say that I didn’t follow them. I would look over them, but the requirements that they were based on, or the application itself, were far better triggers for the testing that I needed to complete. I would often take a large test suite (something that might take 2 days to complete) and condense it into a one-page document and then follow that or pass it on. After that I would go through the test cases and click “Pass” or “Fail”. I resented needing to (pretend to) follow scripts, but I knew I was doing better work by re-imagining the documents that were given to me. Other good testers have done the same.

    Sometimes people wave you off because you’re “just a consultant”. I know that you have done real work in the field, but hopefully, if more people who are not “just consultants” speak up and agree with you, it will lend more credibility. Or perhaps some people will find other ways to dismiss it. At least I’m making an honest attempt to report my observations.

    I appreciate that you’re doing that, and yes, I wish more working testers would talk about what they do. Alas, some of them face non-disclosure constraints. I wish some companies would be more forthcoming, but there are understandable competitive advantages associated with the good stuff and potential embarrassments or legal liability with the bad stuff.

    There are plenty of responses to the “just a consultant” critique. First, all testers are “just consultants” in a very real sense. We don’t write the code, we don’t fix the code, we don’t manage the project, we don’t set the project schedule, we don’t make hiring and firing or staffing decisions, we don’t negotiate customer contracts, we don’t allocate bonuses to programmers or managers… We provide information and insight. So, any tester who wants to use “just a consultant” as a critique might choose to notice what’s looking back in the bathroom mirror. (And, by the way, might choose also to order up a copy of Jerry Weinberg’s The Secrets of Consulting.)

    A second response to the “just a consultant” issue: Yes, I am a consultant, but I’m also a tester. I have a mandate that differs from others: I observe and study and investigate and write about and refine ideas on testing itself. That is, I test testing. I do this for most of my waking hours, and I’ve been focused on doing that with testing specifically for going on 20 years.

    A third response can be found on Page 6 of James Bach’s How To Fake A Test Project.

    A fourth response: through most of the 90s, I was a tester and program manager for what was, in its time, the best-selling piece of commercial software in the world. I’ve tested in banks, in other financial institutions, in educational software, in big retail. So yes, I do have real-world experience.

    All of which leads me to my fifth and final response to those who issue the “just a consultant” response: take a hike, eh?

    Thanks again for these posts. Best of 2011!

  6. @Debbie – I’ll focus on just one sentence which caught my eye:
    “get out of their way” is a good thing, as long as we remember to run some debriefs from time to time.

    As I see it, one of the best things which SBTM offers – is constant debriefing (every few hours).

    I think in many cases, test managers / team leaders do not have enough time to put into this task, but even the best of testers can benefit from feedback from managers and colleagues – especially if done as a small-group brainstorm.

    Michael replies: If that’s the case, I’m fascinated about what else they might be doing.

    If we are really lucky, we get to go over the bugs, and give some feedback on these too.

    In a more scripted testing approach, we tend not to plan for intermediate debriefings – a tester can get a task for anything between 3-14 days, and no debriefing is planned for it – this is something we need to tackle.

    Yowza! What other forms of supervision and accountability are there between the tester and test leads or test managers? Between the tester and the programmers? To me, with the limited information I have, that sounds like a long feedback loop, like trying to drive a bus by mail.
