Very Short Blog Posts (11): Passing Test Cases

Testing is not about making sure that test cases pass. It’s about using any means to find problems that harm or annoy people.

Testing involves far more than checking to see that the program returns a functionally correct result from a calculation.

Testing means putting something to the test, investigating and learning about it through experimentation, interaction, and challenge. Yes, tools may help in important ways, but the point is to discover how the product serves human purposes, and how it might miss the mark.

So a skilled tester does not simply ask “Does this check pass or fail?” Instead, the skilled tester probes the product and asks a much richer and more fundamental question: Is there a problem here?

9 replies to “Very Short Blog Posts (11): Passing Test Cases”

  1. Is it not also asking “is there an opportunity here” – testing is mainly about discovering what we don’t yet know – good or bad – as well as checking what we thought we knew.

    Michael replies: Yes. (It’s a Very Short Blog Post.) But arguably, an as-yet missed opportunity may be a special case of the more general “Is there a problem here?” It’s not a sure thing, to me, that a missed opportunity is necessarily a problem, though it might be. Problems are always relative to some person and some desire. More on that here.

  2. I agree with this post, but I also disagree. It holds for a QA tester, but I am a Development tester. Generally I am verifying that developers have achieved the goals of an iteration. This means asserting whether they have met the defined requirements. My role is to validate that the iteration is actually done. Once I have validated the iteration through my tests, it is given to the QA tester to validate the quality of the iteration, where they employ experimentation outside the scope of the defined requirements, which includes things like usability.

    Michael replies: I find this kind of setup confusing. Let me make sure I have it right.

    Suppose that there is a set of defined requirements that, in the current iteration, the programmers will implement a four-function calculator that handles a pair of single-digit inputs, separated by an operator, and a “calculate” button. As part of that, you create a set of checks that exercise the calculator in some way—a few examples for each operator. All of those checks pass, so presumably the programmers are done. Then you idly try to multiply eight by zero, and the calculator returns the result “eight”. Are the programmers done?

    All right: every single functional check passes, and the result appears in white on a white background. Are the programmers done?

    Okay; all the checks pass, and none of the white-on-white business, but the result takes seventeen minutes to appear. Are the programmers done?
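The simplified calculator scenario above can be sketched in code. This is a hypothetical illustration (the `calculate` function, its signature, and the check table are all invented for the example): every explicit check passes, yet an obvious bug survives, because no check happened to exercise multiplication by zero.

```python
def calculate(a, op, b):
    """A buggy four-function calculator: multiplying by zero
    mistakenly returns the first operand."""
    if op == "*":
        return a if b == 0 else a * b  # bug: 8 * 0 -> 8
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "/":
        return a / b
    raise ValueError("unknown operator: " + op)

# A few examples for each operator, as the defined requirements suggest.
checks = [
    (2, "+", 3, 5), (7, "+", 1, 8),
    (9, "-", 4, 5), (6, "-", 6, 0),
    (3, "*", 4, 12), (5, "*", 2, 10),
    (8, "/", 2, 4), (9, "/", 3, 3),
]

assert all(calculate(a, op, b) == want for a, op, b, want in checks)
print("all checks pass")      # every explicit check passes...
print(calculate(8, "*", 0))   # ...and yet this prints 8, not 0
```

The suite reports success because it can only speak to the inputs someone thought to write down; the question “Is there a problem here?” is outside its reach.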

    I’m aware that this is an extremely simplified example. Here’s a more real-world explanation of what bothers me:

    It sounds to me like the setup you’re describing isn’t a setup for engineering, but for an elaborate game in which one set of players says “bring me a rock” and the other says “we brought you the rock you asked for”. Along with defined and explicit requirements, there are always unstated and implicit requirements. I must be misunderstanding something. I sure hope so.

    Here’s a little more of my take on the subject of “done”.

  3. I agree, there are too many times when testers or project managers only want the test cases to pass. They are not concerned if defects are found and they certainly do not want them to fail because that will result in delays to the production implementation. The purpose of test cases is to find problems.

    Michael replies: “…that will result in delays to the production implementation.” I don’t understand people who think that way, since the discovery of a problem doesn’t have to delay the production implementation at all. The decision to delay the production implementation (or not) rests with those very people. It seems to me that they’re asking to wear a blindfold so they don’t have to be aware of anything on the road ahead—as though presumably you’ll never hit the moose you don’t see.

  4. Sorry for the length of my response, but you raised some points that I feel the need to address.

    Michael replies: I welcome the long response. Thank you.

    You asked whether the programmers were done in those various scenarios; yes, they are. They have to have some point at which they deliver their work to the next phase, and they met their definition of done. If they did not define done, they would continuously work to improve the product without ever delivering it.

    Really? Would they not be able to control themselves? I’ve never met programmers who were that much into gold-plating, although on one level, I suppose that level of initiative would be admirable. But if they were to do that, would no one ask them to stop? In all seriousness, would the programmers not be engaged in an ongoing conversation with managers and other stakeholders?

    Just because it is deemed not done in another phase of a process doesn’t mean it wasn’t done for the programmers. If in the next phase there is another definition of done and the programmers didn’t meet it, the work may eventually be sent back to the programmers, and they then have a new definition of done; but at some point the team has to deliver, or we all lose our jobs.

    I’m worried that we’re talking at cross-purposes here. If I’ve got it right, you check to make sure that certain explicit, specified requirements have been met, but that anything else that you might reveal in the process of doing so—even if it were intensely problematic—would be out of scope for you to discover, investigate, or report. This sounds to me like following a plan instead of responding to change; a contract negotiation instead of customer collaboration. That is: not very Agile. (I’m assuming a claim of Agility because of the invocation of “definition of done”.) In this approach, it seems to me that a problem that is not detected (or even ignored) by the “development tester” gets passed on to “QA testers”, which would greatly lengthen the feedback loop—the time between discovering and resolving the problem.

    There isn’t a point at which programmers, automated tests, human testers, or product owners will catch everything; there are always bugs. Yet there has to be a point where we move on, or we will be perpetually undone. As a team, our collective ability to define an adequate definition of done will be evident in our success in delivering the value the team is commissioned for.

    It seems to me that a product could pass any number of explicit checks and still fail to deliver that value; that a product could deliver some value and still have terrible problems that completely undermine that value. As a tester, I want to find those problems and alert people to them absolutely as soon as I can—even if the conclusion is no more than “Thanks; we’re willing to live with that problem.”

    I am not sure what you believe engineering is, but we practice engineering as a way of delivering solutions through organized problem solving. There is an engineering methodology to it. In our method we define a problem, propose solutions, evaluate solutions, select a solution, develop the solution, test the solution, deliver the solution, analyze the delivered solution to identify problems, and start over again. When a solution is proposed, there is a specification (requirement), and this gives programmers their definition of done. If they find something obvious to adjust in the spec during development, it is adjusted; and if the adjustment will affect the schedule, it is approved by the team. I really don’t understand your engineering comment. If a country says “we need a rocket” and a company says “we have the rocket you asked for”, am I to believe that the rocket wasn’t engineered?

    If the rocket were checked only against ideas that people had at the outset of a burst of development work, I’d worry that it wasn’t engineered, yes. Part of engineering is learning and discovering how to test the product you’re building. The Golem at Large has a very interesting discussion of what this meant on the Challenger project (spoiler: there was a lot of controversy over how to interpret test results, and over what constituted competent testing, long before the explosion happened).

    Back to the original point. You said, “Testing is not about making sure that test cases pass. It’s about using any means to find problems that harm or annoy people… a skilled tester does not ask simply ‘Does this check pass or fail?’” If the test cases test for problems that harm or annoy people, they are indeed tests by your definition.

    They are checks, and checking is a part of testing. Some might call them “unit tests” (or “microtests”, in Mike Hill’s lingo); we don’t do that, but we don’t get upset when others do, and we might call them tests in casual parlance. But it seems to me that this is checking of the functional correctness of the product, and there are far more ways to harm or annoy people than by functional incorrectness. Indeed, many things that are functionally incorrect go unnoticed by the users of software, and many things that are harmful or annoying are functionally correct.
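The distinction above can be sketched in code. This is a hypothetical illustration (the `render_result` function and its return convention are invented for the example): a check that asserts only on functional correctness, the computed text, passes cleanly, while saying nothing about a presentation problem that would make the result useless to a person.

```python
def render_result(value):
    """Return (text, foreground_color, background_color) for display."""
    # The text is functionally correct; the presentation is unreadable.
    return str(value), "white", "white"

def check_result_text():
    # The check asks only: is the text functionally correct?
    text, fg, bg = render_result(42)
    assert text == "42"

check_result_text()
print("check passed")  # yet foreground == background: white on white
```

The check is real and useful as far as it goes; the point is that “as far as it goes” is exactly the functional correctness of the output, and nothing more.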

    Or does your point mean that a teacher in a school who gives a multiple choice test with only one choice that passes for each question is not administering a “Test” because it is graded by a machine?

    I think the controversy over standardized testing in schools says a good deal about that. In many parts of the world, we’re seeing a narrowing of the curriculum to make sure that students pass the standardized tests, at the expense of, you know, education.

    Human software testers will always be necessary, and automated tests help programmers design, develop solutions, and prove their definition of done. As you alluded to in your other post on “done”, someone has to say what done is, or no software would ship and I couldn’t even respond on this blog because it would still be in development or testing. So, I am a skilled tester, and my tests are automated to specs, and I simply ask, “Does this check pass or fail?”

    No test can prove that a program or a programmer is done. No test can show that a product is acceptable. No test can prove that a programmer is not done, either, although you could make a case that a failing test suggests strongly that the program is rejectable, since presumably someone declared that “we won’t be done until this test passes” at some point in the past. Still, I would prefer to think that we base release decisions on what we know now, and not on what we hoped would be so a couple of weeks ago.

    Tests don’t make the done/not-done determination; people do, and people use the outcome of testing to inform that determination. I worry that with that pass/fail focus trained solely on explicit requirements, you’re losing an opportunity to inform your clients (programmers, managers) about problems that they’d want to know about, even if no one specifically planned to discover that specific problem. Mind, I can worry all I like; I’m not working for your organization, nor is it working for me.

    Nonetheless, if I were in the cockpit of an airplane, and there were a mountain directly in front of me, I’d make sure that the pilot in command knew about it, even if the mountain hadn’t been on the flight plan.

  5. Charles Bryant said, “If they did not define done they will continuously work to improve the product without ever delivering it.”

    Maybe Charles doesn’t realize he’s speaking to a programmer and a veteran of many product launches? Anyway, I’m sure Michael smells the bullshit. Charles you are conflating two issues: a decision rule, and the ability to make a decision. And there is no excuse for that, unless you are an intern or something.

    We don’t need a RULE in order to make a DECISION. Never in my career shipping award-winning products (Borland C++) did I use any definition of done other than that the management team felt that it was done. This has always come as a result of vigorous testing and discussion of test results, with no need for games involving passing test cases.

    Engineering is NOT the process of oversimplifying complex things. That’s what little kids do. Please grow up and take on the responsibility and mantle of adult life. We agree with Billy Koen, as he put it in his book Discussion of the Method. “Engineering is the use of heuristics to make the best change in an uncertain situation.” To do that, you must recognize the limitations of your methods, and certainly not defend methods that are deliberately weaker than they could be simply to make it nice and convenient for yourself.

    In our terms, Charlie, you are not talking like a tester (development or otherwise). You are talking like a tool jockey with delusions of grandeur. If you are going to take up responsibility as a tester, you cannot shrug off the scenarios that Michael raised.

    — James

  6. Another long one. First, I have incredible respect for Michael. My responses were not meant to be disrespectful, as James’s are to me. You are both testing gods in my eyes. Yet, if anyone tells me that there can be only one way to engineer software or test software, then I can’t accept that, as the industry would not move forward if we just stood on the shoulders of giants without aspirations of improving on their work. I have no delusions of improving on your work or otherwise positioning myself as a “somebody” in testing.

    Michael replies: And yet you could.

    James, in his sometimes brusque and direct way, is encouraging you to up your game. (It would help if we all met, I think.)

    I am a lowly developer turned tester with a lot still to learn. I can’t pretend to be an expert, especially on testing. I have made no attempts to promote myself, to direct people to my blog, or to force my views; I have only expressed my thoughts in hopes of clarifying the original post, which was vague, and of gaining more understanding. I like expressing my thoughts, especially in writing, and I like it even more when I can learn from people who correct my thoughts. My point is still one of context, where there seems to be a disagreement on the terms “test(er)” and “done”.

    We encourage this. Keep it up.

    I am not trying to bullshit you or twist my words to make my point. Maybe it is better said that I am fortunate, as most of the developers I have worked with are master craftsmen, and if given free rein they would not deliver in a timeframe that the business desires, as they would constantly refactor if there were no pressure to deliver or some goal to be reached. Yes, they need a point to guide them in determining when enough is enough and when they have done too little. This is what “done” means in our context, and it helps them to timebox their effort in a manner that lets them deliver something valuable in a reasonable amount of time. There are times when the timebox and business pressure have forced them to deliver subpar work, but in my current role I am a guard against that. There has to be some point of reference for deciding when to move work to the next phase of our process. Since I am obviously missing the point: when should we deliver work to QA? What is the trigger? James, you said you never had a RULE in order to make a decision. What is the word for the observation that informed you that it was safe to ship (in my context, to move a release to QA)? What is the basis for saying we have done enough for now… deploy?

    Here are some ideas. First of all, I think it’s a mistake on someone’s part to think of yourself as a guard, rather than as an investigator, a guide, riding shotgun rather than sitting in a tower—and certainly in an Agile context. Are you the supervisor of the programmers, or a faithful assistant and illuminator of the work they do? Second, rather than thinking in terms of “delivering work to QA”, why not stay engaged with the (other) testers all the way along? Isn’t that the Agile way? Third, the engineering method is not the application of rules, but the application of heuristics, fallible methods for solving a problem or making a decision.

    Engineering has context. You don’t need the same rigor to engineer a rocket or a heart pump as you do to develop a simple blog engine. There is a lot of give and take between those two extremes. I feel that there is disdain for Agile practices here. I share a lot of them, but I am not sure how stepping on the perceived limitations of my methods, without getting to know or understand them, promotes the inquisitive nature you are promoting for testers. I guess a lowly “intern” like myself doesn’t deserve a question about how I do what I do, as you have heard it all before and don’t have time to try to educate another lost checker. I am assuming that you believe my current practice is useless and that I need to be more like the grown-up testers we have on our QA team.

    Our point is precisely that you are not a lowly intern. The option to be more powerful is entirely available to you, and you can choose that option. Your practice is not useless, but I would argue that you can be even more useful if you shift a couple of things.

    To be clear: as I am developing tests or checks, if the developers, the analyst, or I find something wrong during development or our demos and walk-throughs, the issue can redefine what “done” means in order for developers to deliver a release to QA, or it can be addressed in the next iteration. If something is found and we pass it along to QA anyway, we inform QA of the issue. We don’t blindly stick to the current script on the spec. The spec lives and changes as the project progresses. It is not some 1990s requirements document that we slaved over for months and can’t change, but we don’t work from no spec at all either. “Done” could be redefined many times, but at some point we say we are “done” for now and push the big red release button.

    The results of my tests help the team bless a release as ready for QA, and ultimately for our users. In my role I just don’t spend as much time hunting for bugs, as that is not my focus. I learn more about how to test our product every day, but manually driving the product is not the major time spent in my day. If it were my focus, why have a QA team; or, on the other hand, why not have many more layers of QA? We have a QA team that handles the heavy work of exploratory testing. I am not on that team. I am on the engineering team, and it is my responsibility to ensure the software works as we currently expect it to work, so the QA team is not wasting their time in frustration over the basic and planned operation of the software. The QA team’s time is much better spent doing exactly what you are explaining to me: discovering the unexpected to increase the value of the product we deliver for users. Granted, they still test the basics, but hopefully the basics are OK because I “checked” them, so they can put more time and focus on the hard stuff. So, I focus on testing (excuse me: checking) the known specification; I help the developers “check”; and I help uncover new checks that we should perform.

    I build tools to help us with our checks. I do whatever I can to help developers increase their pace and, in the process, not tie up QA with issues with basic functionality. I stop the Dev team from just throwing their crap over the fence. James would probably call me a helper and maybe a junior checker, yet my work nonetheless provides value. (Actually, James, I mistook you for James Whittaker, former Google test guru, so I was very confused by your response for a moment. 😉)

    Well, for my part, I was confused by your reports of being on an Agile team. What you’re describing here (“blessing a release as ready for QA”; “just throwing their crap over the fence”; “(tying) up QA with issues with basic functionality”) simply doesn’t sound like an Agile organization. It sounds painful. One suggestion would be to look at this post.

    Then again, maybe I have it all wrong. I agree with your points on functional correctness vs. harmful and annoying. Maybe we are just doing it wrong, and the transactions we process successfully for some of the largest financial institutions and corporations are the result of dumb luck, and we are destined to crash and burn soon because of our impure testing practices on the engineering team. I have to believe that our QA team is doing something right. I have to believe that they follow a lot of what you are preaching. Yet our test engineering team follows a model somewhat similar to Google, Facebook, Adobe, Yahoo! and Microsoft; maybe they are doing it wrong too. Or maybe I don’t understand those companies, or I am not expressing my role correctly, or it could be that Michael and James are so far ahead of the game that I just need to study them more to attain enlightenment (this is not meant to be facetious; I’m very serious). I am one who knows there is always a better way of doing a thing. Michael and James are very knowledgeable and passionate about testing, and I have been spending more time on this blog learning from these master testers. Although I agree with most of what I read, I have not yet been steered from my naïve thoughts on my role as a test engineer, checker, tool jockey, or whatever you want to call me or belittle me with.

    We aren’t doing that, Charles. We’re pointing out an inconsistency. On the one hand, you refer to the complex judgements and organized problem-solving of engineering; on the other, you say “I simply ask ‘Does this check pass or fail?’” I suspect that you don’t behave that way in real life, and it sells you short to say that you do. That’s what bothers us (me, at least). If you’re going to be in the quality assurance business (and I by no means advocate that), at least don’t suggest that you or the team let bits, passing checks, make the decision for you.

    Anyway, I accept that your ideas are very much valid and our problem here is one of communication and context.

    I would say exactly the same thing here. I think I have far less trouble with what you do than with your description of what you do.

    I love vigorous discussion, especially with people as knowledgeable as both of you, Michael and James. I just believe you don’t understand my role; or maybe you do, and I have just replied to a post in hostile territory. I understand that there is a difference between a spec, test plan, scenario, story, or whatever you may call a test definition, and the actual act of doing and observing what is explained in the test definitions. The manual execution of a test definition is what testing is in your eyes, and a test is only a valid test when done by a competent human. I understand that you believe automated tests are automated checks and should not be trusted. I get it, which is why I believe our QA team is indispensable. In the end, no matter what we call it, our entire team focuses on quality, but there is a scope in each role’s focus on exploratory testing, active testing, manual testing, or however you want to define it in your context. If we were all 100% exploratory testers at the level and experience of professional “testers”, it would slow down our process, we wouldn’t need QA, and we might even have many more edge cases fixed, as there are many more people on the engineering team than on the QA team; but what does this gain us? We have people paid to focus on exploratory testing and ensuring our user experience is top notch, and I have no problem trusting them. In fact, we applaud them when they find bugs, as it makes our product better and uncovers shortcomings in our understanding of our product.

    I am not shrugging off anything either of you say. I am nowhere near your level, but I am not a do-whatever-you-say-just-because-you-say-it type of guy. I know and recognize who I am talking to, and I am not trying to play like I am an authority on testing; to the contrary, I am researching what you guys have to say on it and learning a lot in the process, but that doesn’t discount the fact that I believe my role is not one that you can turn your nose up at or discount. In the end, I still believe that for software to be released, test cases have to be run, whether planned or unplanned, scripted or unscripted, by man or machine, and they have to be interpreted by someone to make a decision on their results. My problem in this whole adventure has been that what I do in that scenario seems to be trivialized and expelled from the realm of testing, and I am not yet sure why, but I will find the answer. It is probably a problem of communication.

    Again: not what you do, but your description of what you do.

    Lastly, I brought up the subject of teachers and rockets as I wanted to get more info on what you meant on the original points as they left too much open and although I could also counter your responses I will leave those points where they sit as they are debatable. Thank you so much for responding in the manner you have and for breaking down what you meant so it’s a little clearer for me. It has been an honor thus far being able to have this discussion. I guess I couldn’t accept the “very short blog post” being so short.

    Well, you certainly addressed that problem [grin]. I appreciate you taking a stand. I (and, I’m sure, James) want to help you (and other testers like you) tell the story of what you do in a way that frames and underscores the value of your work. With your patience and resilience, we’ll get there.

  7. Hi, thank you for this post. “Testing is not about making sure that test cases pass. It’s about using any means to find problems that harm or annoy people.” Very useful information.
