On Testing and Checking Refined

Over the last few months, and especially during some face-to-face time that we had in England recently, James Bach and I have been working to sharpen our notions of testing and checking. Although the task had been on the list for some time, we didn’t get a sense of great urgency about it until we were surprised recently to find that, at a very subtle but important level, we meant different things by “checking”. Until then, what we had achieved was “shallow agreement”, something that’s very common in our world. Ideas can only be represented by words, and never completely described. Words are often ambiguous and slippery. For example, the word “versus” in my original post on the subject, “Testing vs. Checking”, was misunderstood by some people. “Versus” can mean “in opposition to” (Manchester United vs. Chelsea, Marbury vs. Madison), but it can also mean “in contrast to, distinct from”, which affords expressions like “trees vs. leaves”, “French people vs. Parisians”, or “riding vs. balancing”. It’s interesting and to some degree unfortunate that people naturally tend to drop anchor on their initial interpretations of words. But like software itself, sometimes it’s hard to anticipate what other people will recognize as a bug. It’s even harder to anticipate what we ourselves will recognize as bugs. Whatever we will realize eventually, we’re not there yet.

In the course of our conversations, we argued. A lot. In our business, argument is not to be feared. It’s the stone on which we sharpen ideas. From time to time, I adopted positions that were more like James’ used to be, and it seemed that James adopted positions that were more like mine used to be, until eventually we converged. We took confusion, comments, and complaints from colleagues (and some antagonists) seriously. We obtained some invaluable insights from the work of Harry Collins, whose books (The Shape of Actions, Tacit and Explicit Knowledge, Artificial Experts, Changing Order, and others) have been profoundly influential on us, as I predicted they would be a couple of years back. Indeed, the post in which I made that prediction reflects a lot of the background that informs what I’m announcing today.

The outcome of our conversations, a statement on what we mean by testing and checking in Rapid Testing and in the rest of our work, was posted on James’ blog on March 26 or so. Since that time, the post has been lightly edited in response to some thoughtful and helpful comments from reviewers and early readers.

I would like to emphasize our goals here. Our purpose is not to denigrate checking, nor to disparage the use of tools, nor to deplore those people who are asked to do human checking. On the contrary: we’re attempting to deepen our understanding of our craft; to show that checking is deeply embedded in testing; to emphasize that tools and the skilled use of them are essential to our work in many ways; to realize that humans will always inject human elements into the things they do; to realize the value of those human elements and the risks involved in asking humans to behave like machines. We must be clear on the differences between what humans do and what our processes and tools—media, as McLuhan would call them—do. Or, more accurately, the differences between what we do and how our tools affect what we do.

We must also be clear that media—processes and tools—do not do things well or badly. We do things well or badly by and through and with our media. Media extend, enhance, accelerate, intensify, enable, and amplify what we are, in ways that precisely reflect our thoughtfulness and our skill. This is crucially important to recognize in testing, where our goal is to use our minds, our skills, our tools, and our processes to help people understand the product they’ve got so that they can make informed decisions about whether they have the product that they want.

8 replies to “On Testing and Checking Refined”

  1. Michael,

    One (minor, tongue-in-cheek) complaint – the redefinition robbed me of a blog post! I had started to have problems with aspects of the original definitions, but never mind, I prefer the new ones, and have managed to recycle many of the fieldstones.

    I’m already working hard with the new definitions and distinctions, and look forward to seeing where they lead us.


  2. Hi Michael, can you help me with some thoughts about testing and checking that I’ve recently been going over in my mind? Based on your definitions of the terms, do you think the answer can ever be “yes” to the first question and “checks” to the second question (both below)? I’ve added my own thoughts in brackets below each question.

    Question 1: For a given testing goal, if the software and its surroundings remain static, can all tests ultimately be expressed as checks?

    [From one perspective, I want to say “yes”, as once we decide that a particular testing goal has been achieved (based upon whatever stopping heuristic we have chosen) then we have learnt everything (via testing) that we deem necessary to consider that testing complete. Once we have this knowledge, it should be possible to express the testing we recently performed as checks that can be followed by a human (human checking) and potentially by a computer program (machine checking).]

    [From another perspective, however, I want to say “no”, because whilst the checks created could be executed and provide similar information to the original tests, performing the same tests for a second time may result in different information because even if the software, the goal, the stopping heuristic and the surroundings remained unchanged, the person performing the tests may observe something different based upon their tacit knowledge. This makes me think that whilst it is possible to express as checks an instance of a test, we can never fully express the test (or the testing) itself.]

    Michael replies: There are a couple of ways to answer. First, instead of asking whether you can express all tests as checks (you can do that easily; simply do no testing other than checking), ask whether you might want to express all of your testing as checking. That’s a choice, based on cost, value, coverage, and risk. Another way to answer: human checking will tend to fail to be checking exclusively, because of these annoying habits people have: noticing, conjecturing, speculating, thinking, experimenting, screwing up, inventing… I could go on, but you get the picture. People make lousy machines, and that’s a good thing. A third way to answer is that things do change, including but not limited to things in the environment, concurrent processes, interrupting processes, temperature, time, sequencing, pacing, and—perhaps most importantly—our models and our observations, as you mention. This is an instance to which I’d apply The Unsettling Rule: nothing is ever settled.

    Question 2: When the software or its surroundings change, what is more valuable? The checks previously identified (via previous testing) or new tests? Does it depend?

    [My gut feel is that it must depend. If we were to suggest that one was always more valuable than the other, then this (to me) sounds too much like a best practice.]

    Michael replies: I agree, with the note that, for my part, I think it’s wonderful that you’re thinking in terms of “best practice” as a problematic notion.

    [So within which context does each provide value? My thoughts here are that checks tend to be more valuable when we want to confirm (or disconfirm) information that we already know. Testing, on the other hand, tends to be more valuable when we want to expand the information that we already have by discovering something new. Not surprisingly, I’m thinking that both of these statements, if true, are at best, heuristics.]

    Michael replies: checking by its nature depends on us having an observation to make and a rule or algorithm to apply. I agree with your assessment here.
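    To make that concrete, a machine check can be reduced to exactly those two ingredients: an observation and a decision rule that yields a bit. Here is a minimal sketch in Python; the function name and values are hypothetical, purely for illustration, not anyone’s actual tooling:

    ```python
    # A machine check: one observation, one algorithmic decision rule,
    # and a single bit of output. Everything else a tester might notice
    # (slowness, flicker, oddness) falls outside the check.

    def check_total(observed_total: float, expected_total: float) -> bool:
        """Apply the decision rule: does the observation match the expectation?"""
        return observed_total == expected_total

    print(check_total(42.0, 42.0))  # True: the check passes
    print(check_total(42.0, 41.0))  # False: the check fails
    ```

    The machine reports only the bit. Noticing that the total took ten seconds to compute, or wondering whether the expected value was the right one to expect, is testing, and it stays with the human.
    
    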

    — Matt

    Thanks for writing.

