Elements of Testing and Checking

In the last couple of weeks, I’ve been very gratified by the response to the testing-vs.-checking distinction. Thanks to all who have grabbed on to the idea and to those who have questioned it.

There’s a wonderful passage in Chapter 4 of Jerry Weinberg’s Perfect Software and Other Illusions About Testing in which he breaks down the activities of a programmer engaged in testing: testing for discovery, discovering an unexpected problem, pinpointing the problem in the behaviour of the product, locating the problem in the source code, determining the significance of the problem, repairing the problem, troubleshooting, and testing to learn (or hacking, or reverse engineering). He points out that confusing these different aspects of testing can lead to conflict, resentment, and failed projects.

I brought up the test-vs.-check idea because, like Jerry, I think that the word “test” lumps a large number of concepts into a single word, and (as any programmer will tell you) not knowing or noticing what’s going on inside an encapsulation can lead to trouble. I wanted to raise the issue that (as Dale Emery has helped me to articulate) excellent testing requires us to generate new knowledge, in addition to whatever confirmations we generate. Moreover, tests that generate new knowledge and tests (or checks) that confirm existing knowledge have different motivations, and therefore have different standards of excellence.

A test is a question (or set of questions) that we want to ask of the program. It might consist of a single idea, or many ideas. Designing a test requires us to model the test space (or consider the scope of the question we want to ask), and to determine the oracles we’ll use, the coverage we hope to obtain, and the test procedures that we intend to follow. These are the elements of test design. Performing the test requires us to configure, operate, observe, and evaluate the system, and then to report on what we’ve done, what we’ve observed, and our evaluation. These are the elements of test execution.

A check is a component of a confirmatory approach to testing. As James Bach and I reckoned, a check itself has three elements:

1) It involves an observation.
2) The observation is linked to a decision rule.
3) Both the observation and the decision rule can be performed without sapience (that is, without a human brain).
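
As a rough illustration of those three elements, here is a minimal Python sketch. The `Check` class, the `add` function standing in for the product, and the expected value are all hypothetical names invented for this example; they are not drawn from any particular tool.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Check(Generic[T]):
    """Hypothetical container for one check (an assumption for illustration)."""
    name: str
    observe: Callable[[], T]     # element 1: make an observation
    decide: Callable[[T], bool]  # element 2: the decision rule applied to it

    def run(self) -> bool:
        # Element 3: nothing here needs judgment. The observation is fed
        # straight into the decision rule and the outcome is binary.
        return self.decide(self.observe())


def add(a: int, b: int) -> int:
    """Stand-in for the product under test (hypothetical)."""
    return a + b


check = Check(
    name="add(2, 2) returns 4",
    observe=lambda: add(2, 2),
    decide=lambda value: value == 4,
)
print(check.run())
```

Everything that took thought (choosing `add(2, 2)` as the observation and `== 4` as the rule) happened before `run()` was ever called.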

Although you can execute a check without sapience, you can’t design, implement, or interpret a check without sapience. What needs to be done to make a check happen and to respond to it?

  • We start the process when we recognize the need for the check. That’s the bit in which we consider some problem to solve or identify some risk, and come up with an observation that we’d like to make. That requires sapience; it’s an act of testing.
  • Once we’ve seen a need for the check, we must translate the check into a question for the agency that’s going to perform it, whether that agency is a human or a machine. That requires us to develop the decision rule, turning the test idea into a question with a binary outcome. That requires sapience too, and thus is a testing activity.
  • When we have a question that expresses our test idea, the next step is to program the check, interpreting the binary question into program code (or into a script for non-sapient human execution), and put it into some transferable form, such as a source code file or a Word document. Part of this requires sapience (the design and expression of the idea), and part of it doesn’t (the typing). Maybe a machine could do the typing part (say, via voice recognition), but programming isn’t just typing; it’s typing and thinking.
  • When we have a check programmed, the next step is to initiate it, to start it up or kick it off. This too has a sapient and a non-sapient aspect. A machine could start a check automatically, either on a schedule or in response to an event, but someone has to tell the machine about the schedule and the event. So the decision to run a check and when to run it is sapient, but the actual kickoff isn’t; it can be done mechanically.
  • Once the check has been initiated, the agency (machine or human) will execute or run the check, going through a prescribed set of steps from start to end. By definition, that’s definitely machine-doable and non-sapient. Pre-scribed literally means written down beforehand. For a check, the script specifies exactly what the agency must do, exactly what the agency must observe, exactly how the agency must decide the result, and exactly how the agency must report, and the agency does no more and no less than that.
  • Upon completing the prescribed steps, the agency must decide the result of the check. Pass or fail? True or false? Yes or no? By definition, this decision must be non-sapient: it has to be machine-decidable, whether a human or a machine makes it.
  • The agency will typically record the result, based on a program for doing it. Checks performed by a machine might record results in an alert or result pane in an IDE, or in a log file. Checks performed by a human might show up in a pass or fail checkbox in a test management tool, or in a column of a spreadsheet.
  • The agency may report the result, alerting some human that something has happened. The report might be passive—the agency may be programmed to leave a log file in a folder at the end of a check run, say; or it might be more active, taking the form of a green or red bar, or a lava lamp. Depending upon the degree to which his actions have been scripted, a tester may or may not actively or immediately report the result.
  • Someone may interpret the result of the check, assigning meaning to it. Okay, so the output says “pass”, or “fail”. What does it mean? What is our oracle (that is, the heuristic principle or mechanism by which we might recognize a problem)? Is the result what we expected? Problem or no problem? If there’s a problem, is the problem in the check or in the item that we’re testing? Ascribing meaning requires sapience. Note that this step is optional. It’s possible for someone to consider a check “complete” without a human observation of the result. This should trigger a Black Swan alert: failing checks tend to get noticed, and passing checks don’t.
  • Someone may evaluate the check, ascribing significance to the outcome and to the meaning that we’ve reckoned. After the check has passed or failed, and we’ve figured out what it means, we have to decide “big deal” or “not a big deal”; whether we need to do something about it; whether the check and its outcome supply us with sufficient information. This step is optional too. Whether it happens or not, evaluation is definitely a human thing. Machines don’t make value judgments. Another Black Swan alert: if we don’t go through the previous step, interpreting the result of the check, we won’t get to this step either. There’s a risk here: the narcotic comfort of the green bar.
  • Whether we’ve ascribed meaning and significance or not, there is a response. One response is to ignore the result of the check altogether. Another is to pay just enough attention to say that the check has passed, and otherwise ignore interpretation and evaluation. Ignoring the check through sheer oblivion doesn’t require sapience. However, a person could also choose to ignore the result of the check consciously, which is a sapient act. Alternatively, if the check has passed, are we okay with that? Shall we proceed to something else, or should we program and execute another check? If the check has failed, a typical response is to decide to perform some action. We could perform some further analysis by developing new checks or other forms of testing. We could fix the program, fix the check, or change them to be consistent with one another. We could delete the check, or kill the program. All of these decisions and the subsequent activities require sapience, a human.
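
The mechanical middle of the list above (execute, decide, record, report) can be sketched in a few lines of Python. This is only an illustration under assumptions: `Result`, `run_checks`, `bar`, and the two shopping-cart checks are hypothetical names made up for the example, and the “product” is just Python’s built-in `sum`.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Result:
    name: str
    passed: bool


# Each check is (name, observation, decision rule). All three were
# designed by a person beforehand; that design work is not shown here.
CheckSpec = Tuple[str, Callable[[], object], Callable[[object], bool]]


def run_checks(checks: List[CheckSpec]) -> List[Result]:
    """Execute, decide, and record: the steps that need no sapience."""
    results = []
    for name, observe, decide in checks:
        observed = observe()                  # execute the prescribed steps
        passed = bool(decide(observed))       # decide: a binary outcome
        results.append(Result(name, passed))  # record the result
    return results


def bar(results: List[Result]) -> str:
    """Report: collapse the recorded results into a red or green bar."""
    return "green" if all(r.passed for r in results) else "red"


# Hypothetical checks against a stand-in product (the built-in sum).
checks: List[CheckSpec] = [
    ("cart total sums prices", lambda: sum([5, 7]), lambda v: v == 12),
    ("empty cart totals zero", lambda: sum([]), lambda v: v == 0),
]

results = run_checks(checks)
print(bar(results))
```

Nothing in this sketch interprets, evaluates, or responds. The bar only says that each decision rule returned true for its observation; whether the checks ask useful questions, and what to do about the answers, remains with a person.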

In future posts, I’ll be talking about how we can put this miniature task analysis to work for us, which I hope in turn will help us to consider important issues in the quality of our testing.

See more on testing vs. checking.

Related: James Bach on Sapience and Blowing People’s Minds
