
Two Futures of Software Testing (STAR Tester Interview, post EuroSTAR 2009)

At the EuroSTAR 2008 conference, I gave a talk entitled “Two Futures of Software Testing”[i], which was rated as the highest-scoring track session at the conference. Conference attendees also chose the talk as the winning entry for the CapGemini Award for Innovation. Here I provide a number of answers to questions that people have asked since the presentation.

Q: How can we predict the future of software testing?
A: Well, we can’t, obviously. The talk was about possible futures, because (as my friend Fiona Charles titled her EuroSTAR presentation), The Future of Testing is Ours to Make.
In one of the futures that I proposed, testing is about demonstrating correctness—developing tests that show that the functions in the program produce the right answers. When those tests pass, we say that we’re done. To me, that’s part of the dark future.
In the brighter future that I’m rooting for, testing is motivated by asking questions about value. That requires human investigation and rapid learning—understanding the product and the systems that interact with it. That, in turn, includes learning about the context, and in particular the human-related systems—the problems that the program is trying to help us solve, the way people interact with the program, and the way people value it.
It also involves investigating adaptability—not merely demonstrating that the program can work in the lab, but that it will work in situations that we might not have anticipated when we set out to write it.
Q: Those sound like radical ideas.
A: They are—in the sense of what radical really means, which is “going back to the roots”.
Testing for adaptability is fundamental to why we test. That idea shows up in Herbert Leeds and Jerry Weinberg’s book Computer Programming Fundamentals, written in 1961[ii]. More recently, there’s a fellow named Sajay Samuel who talks about the difference between experiment and experience[iii].
Testing has traditionally been done as experiment, under very controlled circumstances, staged in a lab. Experience is about the way people deal with the world in the world, using what their senses are telling them.
For software testing, experience is about how the users of the program—very often people not at all involved with the development process—perceive the product. To those people, it doesn’t matter much that Vista passes millions of automated tests every night. To those people, what matters is that they have to get new drivers or ditch their old hardware, or that they have to say Yes to some security question for which they don’t really know the answer.
Both experience and experiment are important, but these days experiment—as performed at superhuman speeds by machines—seems to be dominating the discourse. I hope the future balances things.
Q: The Agile community suggests that testers should get involved early by writing the user acceptance tests for a given feature first. When the program passes those tests, we’re done with that feature. Isn’t it a good idea to use tests to show that we’re done?
A: Done what, though? Done the first round of development for that feature, sure. Done testing, though? No way.
Tests that show that the feature on its own does what it’s supposed to do are terribly important, but in Agile models, programmers tend to take the primary responsibility for that kind of testing. Since the inputs and outputs tend to be more deterministic way down low in the code, it makes a lot of sense to use automation assistance there, to keep the feedback cycles immediate, crisp, and directly to the point.
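To make the point concrete, here is a minimal sketch of the kind of low-level automated check described above, where inputs and outputs are deterministic and feedback is immediate. The function and its cases are hypothetical, not from the interview.

```python
# A toy low-level check: deterministic input, deterministic output,
# so a machine can verify it instantly on every change.

def parse_price(text):
    """Parse a price string like '$1,234.56' into cents (hypothetical example)."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    dollars, _, cents = cleaned.partition(".")
    return int(dollars) * 100 + int((cents or "0").ljust(2, "0")[:2])

# Immediate, crisp feedback: each check either passes or fails outright.
assert parse_price("$1,234.56") == 123456
assert parse_price("7") == 700
assert parse_price("$0.05") == 5
```

Checks like these are cheap to run on every build, which is exactly why automation assistance makes sense way down low in the code; they say nothing, of course, about the higher-level questions that the rest of the answer is concerned with.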
Of course testers can collaborate, contributing test ideas and coverage ideas—and they should, in my view. But in the bright future, the bulk of the work for testers in this kind of model will be oriented towards problems at a higher level than functional correctness. Each new feature interacts with something, and each new feature adds more interactions. We’re always developing new questions about the product, because despite our attempts to ignore it or prevent it, both the product and the world around keep changing all the time.
I’m with Cem Kaner: I believe that a lot of ideas in testing are stuck in the world of the 1970s[iv]. Back then, computer programs were much smaller, there was close collaboration between the people who wrote the programs and those who used them, and the problem space was far smaller. User interfaces were simpler—punch cards and glass teletypes—so the inputs were much more deterministic. Programs ran on computers—just a handful of them, from a handful of manufacturers.
Now programs run on cell phones, game consoles, refrigerators; they run on virtual machines that don’t have their own tangible existence. Look at the job the Opera people have to do, for example—dozens of versions of the browser adapted to several OS platforms on dozens of phones from dozens of manufacturers. The browser then has to render kajillions of different Web pages over which it has no real control. So one of the skills that we, as a community, will have to develop is our systems thinking—figuring out how to simplify the testing task without oversimplifying it; recognizing the role that the observer plays in observations; realizing the extents and the limitations of any approach to understanding things.
Q: That sounds less like testing and more like philosophy.
A: Philosophy is the love of wisdom. Our role is to provide information to people who need it, so it would be well to provide that information wisely. It’s not just our programs that we have to explore; we have to explore what it means to test them.
Q: So what’s the most important thing to know about exploratory testing?
A: The most important thing to know is that it’s an approach, rather than a technique. Purely exploratory and purely scripted approaches are endpoints on a continuum.
The question is not whether our testing is exploratory or scripted, but the degree to which the prescribed ideas control us. Too much exploration and we risk getting off the mission. Too little exploration and we risk failing to discover things that neither we nor our clients realized might be important. I think generally we’ve tended towards the latter.
The future starts right now. To what extent is the next thing we do informed by the last thing we learned? To what extent does our next test idea come from someone else, or from some point in the past?
Q: If testing is exploratory, how do we make it a manageable process?
A: Exploration is something that humans do, so it’s already manageable. Prospecting for gold is very exploratory, and people manage that.
Systems have inputs, functions that process them, and outputs to observe. When we design systems, we include controlling functions to manage the system. If there’s something we don’t like about the outputs, we use the controlling function—management, in other words—to change something about the inputs or the functions so that we get the outputs we seek.
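The control idea in that answer can be sketched in a few lines. The “system” below is a toy thermostat, purely illustrative: a controlling function observes the output and adjusts an input when the output isn’t what we seek.

```python
# A minimal feedback loop: observe outputs, compare against what we want,
# and use a controlling function to change the inputs.

def heater_system(temperature, heater_on):
    """The system: produces the next observable output from its inputs."""
    return temperature + (1.0 if heater_on else -0.5)

def controller(temperature, target):
    """The controlling function: changes an input when we dislike the output."""
    return temperature < target  # turn the heater on when we're too cold

temp, target = 15.0, 20.0
for _ in range(20):
    temp = heater_system(temp, controller(temp, target))

assert abs(temp - target) <= 1.0  # the loop steers the output toward the goal
```

Management of testing works the same way in outline: observe what the testing is producing, and adjust the mission, the time, or the people when the output isn’t serving the project.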
Sometimes the manageability question is about accountability; we don’t like the way exploratory testing is recorded and reported. Jon and James Bach provide a great set of answers to that problem with session-based test management, and with their focus on aligning the product story and the testing story.
Sometimes manageability is about understanding coverage; we don’t know how much of the product has been tested. This year I wrote a series of articles on test coverage—distinct from code coverage—to provide people with some ideas on how to account for the coverage that they’ve obtained in various dimensions.
Checklists, guideword heuristics, scenarios, coverage outlines, risk lists—these things, to me, blend the best of the two approaches. General ideas, expressed concisely and coherently, provide structure and reminders to help guide a skilled tester. They also afford degrees of freedom to the tester, such that the tester is encouraged to vary tests rather than repeating them—to seek and find new information, rather than covering ground that’s already been covered.
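One way to picture how concise guidewords afford variation rather than repetition: combine them with areas of the product to generate reminders for a session. The guidewords and features below are illustrative, not a real checklist.

```python
import itertools
import random

# Guideword heuristics: terse reminders, not step-by-step scripts.
guidewords = ["empty", "huge", "malformed", "concurrent", "interrupted"]
features = ["file import", "search", "saving a session"]

# Every pairing is a prompt that a skilled tester interprets and varies,
# rather than a fixed test to be repeated verbatim.
charters = [f"Try {g} input against {f}"
            for g, f in itertools.product(guidewords, features)]

random.shuffle(charters)       # vary the selection from session to session
for charter in charters[:3]:   # a few reminders to guide this session
    print(charter)
```

The point is not the enumeration itself but the degrees of freedom: the same short list keeps generating new ground to cover instead of re-covering ground already covered.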
The manageability question might be about how to predict when we’ll be done testing. We’re done when we have no more important unanswered questions about the product. Exploration is about discovering new information, so in one sense it’s not predictable at all.
So how much time should we allocate? Try setting up some general questions that we want to answer; explore the product to ask and answer those questions; and see where the answers take us. If we discover things that pose more important questions, management has a decision to make: ship the product, or allocate more time to answer the questions, fix the problems that the answers point to, and then ship the product. The management function decides whether testing should continue or stop.


[i] The notes for the presentation Two Futures of Software Testing can be found at http://www.developsense.com/presentations/e2008twofutures.pdf

[ii] Leeds, Herbert D., and Weinberg, Gerald M., Computer Programming Fundamentals. McGraw-Hill (1961; 2nd ed., 1970), ASIN: B000K6JVKY

[iii] An interview with Sajay Samuel is available at http://www.cbc.ca/ideas/features/science/index.html#episode11, as Episode 11 in CBC Radio’s Ideas series, How To Think About Science.

[iv] Cem Kaner’s talk The Ongoing Revolution in Software Testing can be found at http://www.kaner.com/pdfs/testingRevolution2007.pdf
