
Taking Severity Seriously

There’s a flaw in the way most organizations classify the severity of a bug. Here’s an example from the Elementool Web site (as of 14 January 2015); I’m sure you’ve seen something like it:

Critical: The bug causes a failure of the complete software system, subsystem or a program within the system.
High: The bug does not cause a failure, but causes the system to produce incorrect, incomplete, inconsistent results or impairs the system usability.
Medium: The bug does not cause a failure, does not impair usability, and does not interfere in the fluent work of the system and programs.
Low: The bug is an aesthetic (sic —MB), is an enhancement (ditto) or is a result of non-conformance to a standard.

These are serious problems, to be sure—and there are problems with the categorizations, too. (For example, non-conformance to a medical device standard can get you publicly reprimanded by the FDA; how is that low severity?) But there’s a more serious problem with models of severity like this: they’re all about the system as though no person used that system. There’s no empathy or emotion here; there’s no impact on people. The descriptions don’t mention the victims of the problem, and they certainly don’t identify consequences for the business. What would happen if we thought of those categories a little differently?

Critical: The bug will cause so much harm or loss that customers will sue us, regulators will launch a probe of our management, newspapers will run a front-page story about us, and comedians will talk about us on late night talk shows. Our company will spend buckets of money on lawyers, public relations, and technical support to try to keep the company afloat. Many capable people will leave voluntarily without even looking for a new job. Lots of people will get laid off. Or, the bug blocks testing such that we could miss problems of this magnitude; go back to the beginning of this paragraph.

High: The bug will cause loss, harm, or deep annoyance and inconvenience to our customers, prompting them to flood the technical support phones, overwhelm the online chat team, return the product demanding their money back, and buy the competitor’s product. And they’ll complain loudly on Twitter. The newspaper story will make it to the front page of the business section, and our product will be used for a gag in Dilbert. Sales will take a hit and revenue will fall. The Technical Support department will hold a grudge against Development and Product Management for years. And our best workers won’t leave right away, but they’ll be sufficiently demoralized to start shopping their résumés around.

Medium: The bug will cause our customers to be frustrated or impatient, and to lose faith in our product such that they won’t necessarily call or write, but they won’t be back for the next version. Most won’t initiate a tweet about us, but they’ll eagerly retweet someone else’s. Or, the bug will annoy the CEO’s daughter, whereupon the CEO will pay an uncomfortable visit to the development group. People won’t leave the company, but they’ll be demotivated and call in sick more often. Tech support will handle an increased number of calls. Meanwhile, the testers will have—with the best of intentions—taken time to investigate and report the bug, such that other, more serious bugs will be missed (see “High” and “Critical” above). And a few months later, some middle manager will ask, uncomprehendingly, “Why didn’t you find that bug?”

Low: The bug is visible; it makes our customers laugh at us because it makes our managers, programmers, and testers look incompetent and sloppy—and it causes our customers to suspect deeper problems. Even people inside the company will tease others about the problem via graffiti in the stalls in the washroom (written with a non-washable Sharpie). Again, the testers will have spent some time on investigation and reporting, and again test coverage will suffer.

Of course, one really great way to avoid many of these kinds of problems is to focus on diligent craftsmanship supported by scrupulous testing. But when it comes to that discussion in that triage meeting, let’s consider the impact on real customers, on the real people in our company, and on our own reputations.

13 replies to “Taking Severity Seriously”

  1. When you put it like that, even the Low level sounds pretty damn serious, and expensive too. Reputation can be invaluable.

    In my opinion, low-level issues need not be that serious as single bugs – not serious enough to make anyone laugh at us – but when there are many of this kind, an end user may react by crying out for help, for God, for mom, or for sacred creatures, or with despair or other minor cussing, thinking “I’d rather not keep my information in such a sloppy-feeling system.” If they’re met in the live environment, that is. When these problems go unrecognized or downplayed, they eat the value of the product slowly, from the inside.

    And oh yes, I know about the dangers of counting bugs as things. There’s a balance to strike between seeing problems as bug counts and seeing the level of their overall severity.

    How would you then answer a project manager who asks “How many of those are there, then?” after I, as a tester, have said “we also found a few low-level bugs that diminish the product’s value more than their severity as individual problems would suggest”? Often it feels as though time is wasted going through these one by one, when it makes the PM think, huh, these are not as bad as the tester claims.

    Michael replies: I might say “there are swarms of bugs here; like those clouds of gnats that you occasionally walk through.” I might say “there are enough to make us look really sloppy.” If I’m well prepared, I might have a list of headlines or summaries from the bug tracking system that I can hand to the project manager, so that the visual impact tells the story.

    But there’s another thing I might do, too: I might keep these problems out of the tracking system, and make a deal with the programmers and the product managers: “Let me keep a list of these things without making a formal report, and pass them to the programmers in a batch every now and again so we don’t have to talk about them so often.” Or, if the culture supports it, I might MiP the bug. (In the Rapid Testing namespace, “MiP” and “MiPping” are words invented by James Bach to stand for “Mentioning in Passing”. Mipping is a kind of informal communication that goes on all the time in a project, often unconsciously; yet we can choose to mip something consciously, too.) Try setting a little alarm in your head to help you recognize that you’re about to do something formal or time-consuming, and ask yourself if you could do it informally and inexpensively.

    I agree about the ways to avoid these problems. Thanks for writing this; I think it offers good points to help with describing and explaining problem severities.

    Thank you for the comment.

  2. Hi Michael,
    I had to laugh when I first read this, but then I wasn’t sure whether it shouldn’t make me cry as well.
    Most of the software testers I know have come across severity descriptions like these, but actually READ them the way you describe. Other team members may understand them in the original sense. This difference in understanding is where conflict arises. And an opportunity for both parties to learn.

    In the description for
    “Medium: The bug does not cause a failure, does not impair usability, and does not interfere in the fluent work of the system and programs.”

    I’d argue that this is NOT a bug. It causes no failure, I can use all the functions (so no missing requirements or functions that I, the user, would need), and it’s a breeze to use (it does not impair usability). I can use it fluently (no performance issues in the whole system, its submodules, or standalone programs, and no interface issues), so I’d argue it’s a very well-designed system. In fact, I’d like to work more on systems that fit this description!

    A personal bugbear of mine is having typos in the “Low” category. Yes, a typo could have low severity, but it could also be critical. Adding a k to an m adds a factor of 1000: what the developer thought would be 30m gets displayed as 30km.
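
    To make that concrete, here’s a minimal sketch in Python (the function and its names are hypothetical, invented purely for illustration) of how a one-character unit typo can misreport a value by a factor of 1000 without causing any failure:

        def format_distance(metres: float) -> str:
            # The value is measured in metres; the intended display is e.g. "30 m".
            # A one-character typo in the unit label makes 30 metres read as 30 km.
            return f"{metres:g} km"  # BUG: the label should be "m", not "km"

        print(format_distance(30))  # prints "30 km" instead of "30 m"

    Nothing crashes and no computation is wrong; only a label is off by one character. Whether that deserves Low or Critical depends entirely on who acts on the displayed number.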

    Thanks for making me think about this issue again.

  3. Really nice explanation. Today I also learned that human emotions and regulatory action can be part of deciding severity.
    After reading your post, maybe it’s time to make some changes to a post of my own, which I wrote when I was going through the ABCs of testing. Here is the link:
    http://www.abodeqa.com/2012/08/09/difference-between-severity-and-priority/

    Maybe I could get some better feedback on it as well.
    The post above was part of some notes that I had been thinking of sharing with everyone.

    Thanks for such a worthwhile post.

  4. Severity is, of course, a relevant issue, I suppose, if the goal is something other than perfect software, but rather something that works partly, some of the time. Oh, well. If Microsoft can live with it, I guess I have to.

    Michael replies: Software is subject to the decidability problem. We might believe we have perfect software, but until we’ve decided that everyone important has evaluated every aspect of it under every set of conditions, we can never be sure we have perfect software—and even then, we can’t know what might be revealed that would change our perception. Our understanding both of what we want and of what we’ve got is always provisional. In that, software is like a scientific theory: the best we’ve got is the best we’ve got so far, so far as we know.

    Another point I’d like to make: Severity is usually judged by the apparent severity of the symptom, but, in truth, the severity of the defect cannot be identified until the defect is diagnosed. A detected symptom is but one of the symptoms of the defect. Think sniffles.

    Absolutely so. You’ve reminded me to do something I should have done in the first place, which is to link back to this post. Thanks!

  5. Michael, I think you missed one very important category:
    Nobody Cares – This type of bug is one that upper management usually makes the biggest deal over. Inevitably, there is some sort of minor functionality, usability, or graphical issue which affects 0.001% of users in an almost harmless way, but someone (likely a HIPPO type of person) says it needs to be “fixed” and therefore it is listed as critical in JIRA.

    Michael replies: In saying “Nobody Cares”, you have an interesting take on “nobody”.

    You could choose to dismiss a complaint from upper management, but to do so might be something that my friend Johanna Rothman would classify as a CLM, or “Career-Limiting Move”. This is, of course, one of the shoals that testers have to navigate: helping our clients to understand the relative significance of problems to people who matter. For good or ill, in a business, the people who control the purse strings are the people who decide who matters most—and sometimes they decide that they themselves matter most.

    Love the post, keep it up!

    Thanks!

  6. Michael,

    Thanks for another insight. These severity descriptions tell such a story that they can surely help with bug advocacy. I’ll definitely give them a test drive in that respect.

    Now, having experienced much struggle with definitions like the ones you used as an example, I’d like to add a few things.

    Amazingly and sadly, sometimes folks (and whole teams) don’t see a difference between an incident in production and a finding within the development cycle, and approach both with the same severity classification system. As long as bugs don’t cause much bother from the development perspective, they are considered minor. Moreover, since bugs of Severity 3 and 4 are often seen as “nice to fix but OK to release” or as “improvement suggestions”, the incentive is to challenge and downplay the severity set by testers. Furthermore, Impact and Probability, packed together into Severity, cause confusion and disagreement over high-impact, low-probability bugs. And if that weren’t enough, sometimes fix Priority is not defined as a separate flag, so Severity has to stand for that, too. Good luck, testers.
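
    To illustrate the separation, here’s a minimal sketch in Python (the field names are invented for illustration, not taken from any particular tracker) of a bug record that keeps Impact, Probability, and fix Priority as distinct fields instead of packing them all into one Severity value:

        from dataclasses import dataclass
        from enum import Enum

        class Level(Enum):
            LOW = 1
            MEDIUM = 2
            HIGH = 3
            CRITICAL = 4

        @dataclass
        class BugRecord:
            summary: str
            impact: Level        # how much harm the problem does when it occurs
            probability: Level   # how likely people are to run into it
            fix_priority: Level  # when the team intends to address it; a business decision

        # A high-impact, low-probability bug stays visible as exactly that,
        # instead of being averaged away into a single middling "severity":
        bug = BugRecord(
            summary="Account balance corrupted when two transfers commit simultaneously",
            impact=Level.CRITICAL,
            probability=Level.LOW,
            fix_priority=Level.HIGH,
        )

    With the concerns separated, testers can report impact and probability as observations, while priority remains visibly a business decision.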

    On the other hand, I’ve learnt something new through my Accessibility testing experiences: a cluster of minor barriers is considered a major problem, because all of these technically minor issues have a major impact on people’s understanding and operation, and because software Accessibility is a holistic concept.

    Indeed, “traditional” functional testing must be humanized, too, and the severity descriptions you offered lead in that direction.

    Thanks,
    Albert

