Dale Emery, a colleague for whom we have great respect, submitted a comment on my last blog post, which in turn referred to Testing and Checking Refined on James Bach’s blog. Dale says:
I don’t see the link between your goals and your solution. Your solution seems to be (a) distinguishing what you call checking from what you call testing, (b) using the terms “checking” and “testing” to express the distinction, and (c) promoting both the distinction and the terminology. So, three elements: distinction, terminology, promotion.
How do these:
- deepen understanding of the craft? (Also: Which craft?)
- emphasize that tools and skilled use are essential?
- illustrate the risks of asking humans to behave like machines?
I can see how your definitions contribute to the other goal you stated: to show that checking is deeply embedded in testing. And your recent refinements contribute better than your earlier definitions did.
But then there’s “versus,” which I think bumps smack into this goal. And not only the explicit use of “versus”; also the “versus” implied by repeatedly insisting that “That’s not testing, that’s checking!”
Also, I think your choice of terminology bumps up against this “deeply embedded” goal. Notice that you often express distinctions by adding modifiers. In James’s post: Checking, human checking, machine checking, human/machine checking. The terms with modifiers are clearly related to (and likely a subset of) the unmodified term.
Your use of a distinct word (“checking”) rather than a modified term (e.g. “mechanizable testing” or “scripted testing” or similar) has a natural effect of hinting at a relationship other than “this is a kind of that.” I read your choice of terminology (and what I interpret as insistence on the terminology) as expressing a more distant relationship than “deeply embedded in.”
James and I composed this reply together:
Our goal here is to improve the crafts of software testing, software engineering, and software project management. We use several tactics in our attempt to achieve that goal.
One tactic is to install linguistic guardrails to help prevent people from casually driving off a certain semantic cliff. At the bottom of that cliff is a gaggle of confused testers, programmers, and managers who are systematically—due to their confusion and not to any evil intent—releasing software that has been negligently tested.
This approach is less likely than they would wish to reveal important things that they would want to know about the software. You might believe that “negligently tested” is a strong way of putting it. We agree. To the extent that this unawareness brings harm to themselves or others, the software has been negligently tested. For virtual world chat programs on the Web, that negligence might be no big deal (or at least, no big deal until they store your credit card information). However, we have experience working with people in financial domains, retail, medical devices, and educational software who are similarly confused on this issue specifically: there’s more to testing a product than checking it.
Our tactic, we believe, deepens the understanding of the craft of test quite literally: where there were no distinctions and people talked at cross-purposes, we install distinctions so that we can more easily detect when we are not talking about the same things. This adds an explicit dimension where there had been just a tacit and half-glimpsed one. That is exactly what it means to deepen understanding. In Chapter 4 of Perfect Software and Other Illusions about Testing, Jerry Weinberg performed a similar task, de-lumping (that’s his term) “testing”. There, he calls out components of testing and some related activities that are not, strictly speaking, testing at all: “testing for discovery”, “pinpointing”, “locating”, “determining significance”, “repairing”, “troubleshooting”, “testing to learn”, “task-switching”. We’re working along similar lines here.
Our tactic, we believe, helps to emphasize that tools and skilled use of tools are essential by creating explicit categories for processes amenable to tooling and processes not so amenable. These categories then become roosting places for our thoughts and our conversations about how tools relate to our work. At best, understanding is elusive and communication is difficult. Without words to mark them, understanding and communication are even more difficult. That is not necessarily a problem in everyday life. As testers we work in a turbulent world of business, technology, and ideas. Problems in products (bugs) and in projects (issues) emerge from misunderstanding. The essence of testing work is to clear misunderstandings over differences between what people want and what they say they want; what people produced and what they say they produced; what they did and what they say they did. People often tell us that they’ve tested a product. It often turns out that people mean that they’ve checked the functions in a product. We want to know what else they’ve done to test it. We need words, we claim, to mark those distinctions.
We’re aware that other people have come up with labels for what we might call “checks”; for example, Mike Hill speaks of “microtests” in a similar way, and others have picked up on that, presenting arguments on similar lines to ours. That’s cool. In the post on James’ blog, we make it explicit that we use this terminology in the domain we control—the Rapid Software Testing class and our writings—and we suggest that it might be useful for others. Some people borrow bits of Rapid Software Testing for their own work; some plagiarize. We encourage the former, and ask the latter to give attribution to their sources. But in the end, as we’ve said all along, it’s the ideas that matter, and it’s up to people to use the language they want. To us, it’s not a terrible thing to say “simplistic testing” any more than it would be a terrible thing to call a compiler an automatic programmer, but we think “compiler” works better.
We visit many projects and companies, including a lot of Agile projects, and we routinely find that talk of checking has drowned out talk of testing—except that people call it testing so nobody even notices how skewed their focus has become. Testers become increasingly selected for their enthusiasm as quasi-programmers and check-jockeys. Who studies testing then? What do testers on Agile projects normally talk about at their conferences or on the Web? Tools. Tools. And tools—most of which focus on checking, and the design of checkable requirements. This is not in itself a bad thing. We’re concerned by the absence of serious discussion of testing, critical investigation of the product. Sometimes there is an off-handed reference to exploratory testing, based on naïve or misbegotten ideas about it. Here’s a paradigmatic example, from only yesterday as we write: http://www.scrumalliance.org/articles/511-agile-methodology-is-not-all-about-exploratory-testing.
The fellow who wrote that article speaks of “validation criteria”, “building confidence” (Lord help us, at one point he says “guarantees confidence”), “defined expected results”. That is, he’s talking about checking.
Checking is deeply embedded in testing. It is also distinct from testing. That is not a contradiction. Distinction is not necessarily disjunction; “or” in common parlance is not necessarily “xor”. Our use of “versus” is exactly how we English speakers make sharp distinctions even among things that are strongly related, even when one is embedded in the other (the forest vs. the trees, playing hockey vs. skating). Consider people who believe they can eat nothing but bread and meat, as long as they gobble a daily handful of vitamin pills. We think it would be perfectly legitimate to say “That’s not nutrition, that’s vitamin supplements.” Yes, vitamins are part of nutrition. But they are not nutrition. It’s reasonable, we would argue, to talk about “nutrition versus vitamins” in that conversation.
For instance we could say “mind vs. body.” Mind is obviously embedded in body. Deeply embedded. But don’t you agree that mind is quite a different sort of thing than body? Do you feel that some sort of violence is being done with that distinction? Perhaps some people do think so, but the distinction is undeniably a popular and helpful one, and it has been to a great many thinkers over hundreds of years. Some people focus on their minds and neglect their bodies. Others focus on their bodies and neglect their minds. At least we have these categories so that we can have a conversation about them.
When Pierre Janet first distinguished between conscious and sub-conscious thought, that also was not an easy distinction. Today it is a commonplace. Everyone, even those who never took a class in psychology, is aware of the concept of the sub-conscious, and that not everything we do is driven by purely conscious forces. We believe our distinction between testing and checking could have a similar impact and similar effect—in time.
Meanwhile, Dale, we know you and we respect you. Please help us resolve our confusion: what’s YOUR goal? In your world, are there testers? Do the ambitious testers in your world diligently study testing? Or would you say that they study programming and how to use tools? How do you cope with that? Do you feel that your goal is served best by treating testing and whatever people do with tools to verify certain facts about a product as just one kind of activity? Would you suggest breaking it down a different way than we do? If so, how?
7 replies to “Versus != Opposite”
Thank you for the elaboration. And particularly for “we make it explicit that we use this terminology in the domain we control” – in other words even RST et al. is a context.
“But in the end, as we’ve said all along, it’s the ideas that matter, and it’s up to people to use the language they want.” – in other words even the domain words of RST et al. have to be used in context.
This approach, we believe, is less likely than they would wish to reveal important things that they would want to know about the software.
Let me see if I’ve got it. People harm themselves when they imagine that this approach reveals important things that it does not reveal. You’d like to reduce this harm. Is that a fair summary of the point you’re making in this paragraph?
I want that, too, and I support anything you can do that reduces the harm.
where there were no distinctions and people talked at cross-purposes, we install distinctions so that we can more easily detect when we are not talking about the same things
I wasn’t asking about the general tactic of making distinctions. I was asking how this particular distinction, phrased in this particular way, and expressed using these particular terms, deepens understanding of the craft. I suspect the value (as with all distinctions that add value) is in what you do differently with some aspect of testing, having sorted it into one category or the other. And in what you invite other people to do differently with “checking” than with “testing.”
I have little insight into what you do differently based on your sorting of any given activity. Mostly I’ve seen Michael chide people for using the “wrong” term. (To be fair: Likely I’ve seen more than that, and chiding is merely what I remember most, given my own biases.)
Michael replies: I hope you’re right about the parenthetical. I also want to emphasize something that I hope is more clear now than it was before: we use this terminology in Rapid Testing, in the class, methodology, and community around them. Within our community, we’d generally be more adamant on the distinction.
I suspect that you try to use the distinction to help people stop hurting themselves. I’d like to hear more about how you do that, and maybe about how the distinction per se helps you to do that.
People often tell us that they’ve tested a product. It often turns out that people mean that they’ve checked the functions in a product. We want to know what else they’ve done to test it.
Yes. This seems to go back to wanting to help people reduce the risk of fooling themselves. And in particular, reduce the risk of unwarranted confidence in either the software or in the relevance and sufficiency of the information produced by their testing practices.
Our tactic, we believe, helps to emphasize that tools and skilled use of tools are essential by creating explicit categories for processes amenable to tooling and processes not so amenable.
I’m not seeing that. I don’t see anything in either the definitions or the terms “checking” and “testing” that says anything at all about amenability to tooling.
I suspect that definitions and terms that more directly convey the idea of “amenability to tooling” would be much more helpful here.
what’s YOUR goal?
I don’t know how to answer that.
In your world, are there testers? Do the ambitious testers in your world diligently study testing? Or would you say that they study programming and how to use tools?
There are testers in my world, sort of. But mostly I’ve been working with people to improve their test automation skills and practices. There are testers around, but mostly I have minimal contact with them. So I don’t know how they pursue their craft.
Mostly the ambitious test automators I work with study programming and tools. The tests they are automating originate elsewhere (and I am unhappy about many aspects of that).
Do you feel that your goal is served best by treating testing and whatever people do with tools to verify certain facts about a product as just one kind of activity?
Good god, no.
Would you suggest breaking it down a different way than we do?
In this comment and my previous one I’ve made all the suggestions I can think of. Until recently, I had no idea what problem you two were trying to solve by promoting your terminology, and so had no idea whether I had anything of value to suggest at all. With your recent posts, I have a better idea of your goals. And with this post, I’m starting to see some of the connections (and what seem to me to be disconnections) between your tactics and your goals. As I better understand your goals and the thinking that connects your tactics and your goals, I may be able to make better suggestions. Or maybe I’ll finally grok what you’re up to.
Michael replies: One explanation for disconnects, perhaps, is that we’re in different paradigms. (“There are testers around, but mostly I have minimal contact with them.”) We work with testers and with organizations, with a focus on developing testing skill. It seems you’re on a related but different tack: working with people in a programming specialty on a specific set of skills related to the development and use of tools.
You remark that you’re unhappy with the idea that the test (check?) automators are automating tests (checks?) that originate from elsewhere. Is there a chance that both the automators and the automation are being underemployed, and revealing less than they could about important problems in the software?
I’m glad that there’s another person truly questioning the validity and usefulness of the distinction. Dale, my respects.
I wrote a post about this in my own blog:
If you’re up for a constructive conversation and are not afraid of being questioned, come and check out my answer (and questions) to your comment:
Looking forward to your reply.
I was directed to this post by Michael Bolton to answer my question “What problem are you trying to solve?”
The answer is here, but it’s buried so deeply I’m not sure I understand it. Here’s what I’m thinking: Bolton and Bach find that people aren’t doing enough exploratory testing, or maybe not enough penetration testing, or some other testing that requires people to be actively and intelligently engaged with the product (as opposed to fully automated testing).
Michael replies: I don’t know of any kind of testing (including the testing activity that surrounds testing) worthy of the name that doesn’t require people to be actively and intelligently engaged with the product. “Fully automated testing” isn’t possible, since testing is evaluating a product by learning about it through experimentation (see Testing and Checking Refined). Machines and tools don’t learn. They may extend or assist or amplify our ability to learn, but “fully automated testing” is as hollow an idea as “fully automated programming” or “fully automated management”.
Their solution: change the meaning of the word “test” to cover only those practices, so now they get to say to their clients “You haven’t tested your product enough!”
This is a misunderstanding, misinterpretation, or misrepresentation of what we’ve written here.
Correct me if I’m wrong about that.
Consider that done.
If I’m right, though, there’s a ton of literature and practice that appears incorrect from Bach/Bolton’s POV.
It would be easier to tell people “you must do adequate exploratory testing, penetration testing, usability testing, and other types of unscripted testing.” So, I don’t see justification for the semantic change, although I do respect that if Bach and Bolton find it helps them communicate with clients, then good for them.
It has been helpful not only for our clients, but for others who are not our clients.
I find it very strange that this new meaning for “test” conflicts with an earlier blog post on this topic, which appears to have since been edited. This time, with their semantics, testing is “distinct” from checking. I see no need to be this disruptive 🙂
You claim that “an earlier blog has been edited”. It would be nice if you bothered to identify that post here, or the way in which it has been edited (it hasn’t). Once again: testing is distinct from checking in the same kind of way that trees are distinct from leaves. Leaves are parts of trees, but leaves are not trees; leaves are distinct from trees. I’m sorry if you find the distinctions between trees and leaves, or cars and wheels, or shoes and soles disruptive (disruptive how? to whom?), but I can’t understand how those distinctions are irrelevant or insignificant. You seemed to accept this distinction a while ago; I don’t know what’s changed for you.
Given a test procedure for a product, running it manually has quality measurement value A. Automating it, and then running it in the lab or cloud, has quality value B. B has some advantages, but some disadvantages too. It’s important to know that B is not a superset of A, but many people in the industry do not know that, and so they incur risk without being aware of it.
It’s not only important to know that B is not a superset of A. It’s also important to know that B is a subset of A.
I like a different version of the word “check:” Checking is when there’s no validation and the verifications are limited to what is identifiable in the automation code. Check is a kind of test, actually, what people generally call “automated test.”
This is broadly consistent with what we say, in fact.
For a web site or other GUI, the fact that B is not a superset of A (from paragraph above) is very important. That’s where the word “check” is useful, to emphasize that if you take a manual test and automate it, it’s still a test, but better to call it a “check” in that case as a reminder that manual testing is still needed.
This is not quite what we say. It’s not that “manual testing is still needed”. (Since we hold that “automated testing” doesn’t exist, as noted above, we hold that “manual testing” is an unhelpful concept.) Instead, the point is that checking must be embedded in skilled and conscientious testing. We don’t use “sapient” to describe this lately (as I did in this post from 2009); “sapience” gets too easily misinterpreted, alas, and back then I had not sufficiently cleared up the misunderstanding that some took from my use of “versus”. I meant “vs.” in the sense of “distinct from (yet contained within)”, rather than “in opposition to”; rooms vs. houses, rather than Hatfields vs. McCoys. I’m grateful to critics (collegial and non-collegial alike) who eventually were successful in pointing out the unhelpful aspect of “vs.” in making my meaning clear.
Whether you use tools heavily or not in your testing (we like tools whenever they help us), our aim is to emphasize the complex judgement involved in testing things. Harry Collins (with whom we met around the time Testing and Checking Refined was published) puts it beautifully.
There’s more on this in my book “MetaAutomation” that will likely come out in November.
I know this is an old post to pick up on, but I am responding in part to a comment made here and posts made elsewhere.
“Machines and tools don’t learn.”
I am curious what you think of the computer science topic “machine learning” (e.g. http://en.wikipedia.org/wiki/Machine_learning )? Is it mis-labeled in your mind, or is it that you don’t see machines as learning within the Rapid Software Testing Namespace (http://www.developsense.com/blog/2015/02/the-rapid-software-testing-namespace/ )? This also reminds me a little of the AI effect (http://en.wikipedia.org/wiki/AI_effect ).
““learning” is the process of developing one’s mind. Only humans can learn in the fullest sense of the term as we are using it here, because we are referring to tacit as well as explicit knowledge.”
– http://www.satisfice.com/blog/archives/856 (mentioned other places in your blog as well like http://www.developsense.com/blog/2014/03/harry-collins-motive-for-distinctions/)
Michael replies: I don’t think much of machine learning, even though I’m sometimes dazzled by what people have managed to do with it. (For example, I’m consistently blown away by SoundHound and Shazam, those programs that can “hear” a song and report very quickly on what it is. I know that they are strictly algorithmic, but that neither suggests that they are unimpressive nor that they are intelligent.) I think “machine learning” is misnamed; it cheapens what “learning” really means, to me. “Machine pattern refinement” would be closer to the mark; there are algorithms that refine other algorithms. In any case, it seems to me that you can’t have intelligence without intention, motivation, desire. Artificial intelligence researchers keep trying to create a more powerful brain; a faster, more powerful rule-processor. But Harry Collins suggests that real intelligence in a human sense means creating an artificial social agent, and there’s no sign of that anywhere on the horizon. A big reason for this is that we don’t know how the mechanisms of collective tacit knowledge work; how humans anticipate, process, and sort out each other’s actions, and how they repair problems in how those things are communicated. This is not to say that socially aware AI is impossible; but we haven’t figured it out yet and it doesn’t look like we’re close.
I am also a little curious about your take on “tacit” knowledge and your philosophical usage for the word. That is to say, from my reading of the usage of tacit, it appears you feel machines cannot reach this state because we can always express (even if really complicated) their state. But isn’t that just a bit like saying ‘I don’t know what is going on in my brain when I ride my bike, so machines can’t do that’? Eventually we might be able to get brain scans that could describe (even if really complicated) how I ride my bike. Perhaps that sort of research will uncover the mechanism to build machines that are human-equivalent in general intelligence, perhaps not. It feels like your statement at best should have a ‘for now’ clause in it.
Yes; for now. Collins works this out in Tacit and Explicit Knowledge. The brain scan isn’t even necessary; we could have sensors and control mechanisms that could handle this just fine. (We already have something like this on the Segway, for example.) Collins also points out that balancing a bike is easily within our reach, if you consider situating the prototype on a small asteroid where gravity is 1/1000 of what it is here. The bike would tip over slowly, and to keep it balanced, there’d be lots of time for processing. Our current machinery could keep up just fine.
But the real problem for intelligence is not in balancing the bike; that’s what Collins would put with behaviour, the intention-free part of actions. People don’t merely balance bikes; they ride bikes in a variety of social contexts: in Amsterdam, and in Beijing, and in London, and in Toronto where the interactions with other cyclists, drivers, pedestrians, and the law are all different. The protocols for riding bikes are different on the roads, on cycling trails in the mountains, on sidewalks, in velodromes; for adults and children; for commuting, for exercise, and for competition. Humans have intentions that include social purposes, and signal those intentions to each other in ways that we don’t understand and can’t encode. The real problem for intelligence is in the action of riding the bike, the behaviour plus the intention. Another example: when I was at a crosswalk the other day, I noticed that drivers would not slow down if I were facing away from them or away from the road, but they would slow down when I made eye contact. Can the Google Car do that? Not even close, apparently.
I get there is also social-tacit knowledge that requires a culture, but that isn’t limited to just humans. Google’s auto-complete learns from the culture doing the searching and gets better at helping me do my searching. If your argument is that Google search takes human input to decide a culture and that it by itself has no way of creating a culture (because it lacks a drive to socialize, etc.) I would agree with you there (for now).
Yes; for now. But let’s not mistake Google’s auto-complete algorithms for a human understanding of the social purposes that you’re trying to achieve.
Maybe this is not what you mean, or maybe I am missing something. Perhaps this too falls under your namespace. Ultimately, I get that trying to define things in simple ways is hard, but I’d enjoy hearing a detailed response. Thanks,
It would be a lot more efficient for you to read the work of Harry Collins (in particular Artificial Experts, The Shape of Actions, and Tacit and Explicit Knowledge). There’s a nice overview of some of the issues in The Teaching Company course, Philosophy of Mind: Brains, Consciousness, and Thinking Machines.