
Give Us Back Our Testing

“Program testing involves the execution of a program over sample test data followed by analysis of the output. Different kinds of test output can be generated. It may consist of final values of program output variables or of intermediate traces of selected variables. It may also consist of timing information, as in real time systems.

“The use of testing requires the existence of an external mechanism which can be used to check test output for correctness. This mechanism is referred to as the test oracle. Test oracles can take on different forms. They can consist of tables, hand calculated values, simulated results, or informal design and requirements descriptions.”

—William E. Howden, A Survey of Dynamic Analysis Methods, in Software Validation and Testing Techniques, IEEE Computer Society, 1981

Once upon a time, computers were used solely for computation. Humans did most of the work that preceded or followed the computation, so the scope of a computer program was limited. In the earliest days, testing a program mostly involved checking to see if the computations were being performed correctly, and that the hardware was working properly before and after the computation.

Over time, designers and programmers became more ambitious and computers became more powerful, enabling more complex and less purely numerical tasks to be encoded and delegated to the machinery. Enormous memory and blinding speed largely replaced the physical work associated with storing, retrieving, revising, and transmitting records. Computers got smaller and became more powerful and protean, used not only by mathematicians but also by scientists, business people, specialists, consumers, and kids.

Software is now used for everything from productivity to communications, control systems, games, audio playback, video displays, thermostats… Yet many of the software development community’s ideas about testing haven’t kept up. In fact, in many ways, they’ve gone backwards.

Ask people in the software business to describe what testing means to them, and many will begin to talk about test cases, and about comparing a program’s output to some predicted or expected result. Yet outside of software development, “testing” has retained its many more expansive meanings.

A teenager tests his parents’ patience. When confronted with a mysterious ailment, doctors perform diagnostic tests (often using very sophisticated tools) with open expectations and results that must be interpreted. Writers in Cook’s Illustrated magazine test techniques for roasting a turkey, and report on the different outcomes that they obtain by varying factors—flavours, colours, moisture, textures, cooking methods, cooking times… The Mythbusters, says Wikipedia, “use elements of the scientific method to test the validity of rumors, myths, movie scenes, adages, Internet videos, and news stories.”

Notice that all of these things called “testing” are focused on exploration, investigation, discovery, and learning. Yet over the last several decades, Howden’s notions of testing as checking for correctness, and of an oracle as a mechanism (or an artifact) became accepted by many people in the development and testing communities at large. Whether or not people were explicitly aware of those notions, they certainly seem tacitly to have subscribed to the idea that testing should be focused on analysis of the output, displacing those broader and deeper meanings of testing.

That idea might have been more reasonable when computers did nothing but compute. Today, computers and their software are richly intertwined with daily social life and things that we value. Yet for many in software development, “testing” has this narrow, impoverished meaning, limited to what James Bach and I call checking. Checking is a tactic of testing: the part of testing that can be encoded as algorithms and that therefore can be performed entirely by machinery. It is analogous to compiling, the part of programming that can be performed algorithmically.
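To make the distinction concrete, here is a minimal sketch in Python of what a check in this sense might look like. Everything in it is an assumption made for illustration: the function under test (`celsius_to_fahrenheit`), the table of hand-calculated values standing in for Howden’s oracle, and the helper `run_checks` are hypothetical names, not anything taken from Howden’s paper or from a real product.

```python
# Hypothetical function under test (an illustrative assumption).
def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32


# A tiny oracle in Howden's sense: a table of hand-calculated values.
# The entries are illustrative; a person chose them.
ORACLE_TABLE = {
    -40.0: -40.0,
    0.0: 32.0,
    100.0: 212.0,
}


def run_checks(function, oracle_table, tolerance=1e-9):
    """Compare actual output to expected output for each table entry.

    Returns a list of (input, expected, actual) tuples, one per mismatch.
    """
    mismatches = []
    for given, expected in oracle_table.items():
        actual = function(given)
        if abs(actual - expected) > tolerance:
            mismatches.append((given, expected, actual))
    return mismatches


if __name__ == "__main__":
    # An empty list means every output matched the table.
    print(run_checks(celsius_to_fahrenheit, ORACLE_TABLE))
```

The loop is the part that machinery can perform entirely on its own. Choosing the function to examine, deciding which values belong in the table, and working out what a mismatch (or an empty list) actually tells us are not encoded anywhere in it.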

Oddly, since we started distinguishing between testing and checking, some people have claimed that we’re “redefining” testing. We disagree. We believe that we are recovering testing’s meaning, restoring it to its original, rich, investigative sense. Testing’s meaning was stolen; we’re stealing it back.

6 replies to “Give Us Back Our Testing”

  1. If what we are doing, by being clear in our use of “testing” and “checking”, is recovering testing’s meaning, would it not also mean that we are reasserting testing’s original meaning as valid?

  2. So, are you saying you are Martin Luther in spirit, performing the Reformation (which was really just getting to the beginning and recovering what was already there) in testing? Because it sounds rather reasonable.

    Michael replies: Ummm… usually people confuse me with Michael Bolton the singer. Martin Luther is a new one. I’ll leave the parallels up to others. 🙂

  3. If it doesn’t involve thinking, it is not testing. If it doesn’t involve discovery, it is not testing. Sherlock Holmes said “Even an empty result is a result” and I believe that discovering nothing is still discovering. Checking aims at NOT discovering anything – one checks that nothing is discovered.

    Michael replies: I’m not entirely sure that checking aims at not discovering anything, but I agree that it certainly is biased in the direction of confirmation, as you suggest. That reminds me of this.

  4. Well said, Michael! I couldn’t agree more that the term “testing” has been corrupted to mean checking. Too many so-called testers and test managers have fallen under the spell of test management tool companies and have boiled testing down to something they can copy from a requirement document, encapsulate in a test script, and ship off to a low-cost “test center of excellence” to execute. The use of brain power and proper scientific analysis has been pushed to the side.

