When A Bug Isn’t Really Fixed

On Monday, January 10, Ajay Balamurugadas tweeted, “When programmer has fixed a problem, he marks the prob as fixed. Programmer is often wrong. – #Testing computer software book Me: why?”

I intended to challenge Ajay, but I made a mistake, and sent the message out to a general audience:

“Challenge for you: think of at least ten reasons why the programmer might be wrong in marking a problem fixed. I’ll play too. #testing”

True to my word, I immediately started writing this list:

1. The problem is fixed on a narrow set of platforms, but not on all platforms. (Incomplete fix.)

2. In fixing the problem, the programmer introduced a new problem. (In this case, one could argue that the original problem has been fixed, but others could argue that there’s a higher-order problem here that encompasses both the first problem and the second.) (Unwanted side effects)

3. The tester might have provided an initial problem report that was insufficient to inform a good fix. Retesting reveals that the programmer, through no fault of his own, fixed the special case, rather than the general case. (Poor understanding, tester-inflicted in this case)

4. The problem is intermittent, and the programmer’s tests don’t reveal the problem every time. (Intermittent problem)

5. The programmer might have “fixed” the problem without testing the fix at all. Yes, this isn’t supposed to happen. Stuff happens sometimes. (Programmer oversight)

6. The programmer might be indifferent or malicious. (Avoidance, Indifference, or Pathological Problem)

7. The programmer might have marked the wrong bug fixed in the tracking system. (Tracking system problem)

8. One programmer might have fixed the problem, but another programmer (or even the same one) might have broken or backed out the fix since the problem was marked “fixed”. Version control goes a long way towards lessening this problem. Everyone makes mistakes, and even people of good will can thwart version control sometimes. (Version control problem)

9. The problem might be fixed on the programmer’s system, but not in any other environments. Inconsistent or missing .DLLs can be at the root of this problem. (Programmer environment not matched to test or production)

10. The programmer hasn’t seen the problem for a long time, and guesses hopefully that he might have fixed it. (Avoidance or indifference)

I had promised myself that I wouldn’t look at Twitter as I prepared the list. By the time I finished, though, the iPhone on my hip was buzzing with Twitter notifications to the point where I was getting a deep-tissue massage.

When I got up this morning, there were still more replies. Peter Haworth-Langford wanted to know, “Do we need to focus on what’s wrong? Solutions? Are we missing something else by focusing on what’s wrong?” The last response on the thread, so far as I know, was Darren McMillan asking, rather like Peter, “I’d like to know which role you are playing when you say I’ll play too? Customer/PM….. Is there more to the challenge?” Nope. And thank goodness for that; I was swamped with replies. I decided to gather them and try grouping them to see if there were patterns that emerged.

Very few problems indeed have any single, unique, and exclusive cause. Moreover, this list is subject to Joel’s Law of Leaky Abstractions: “All non-trivial abstractions are to some degree leaky.” Some of the problems listed below may fit into more than one category, and, for you, may fit better into a category other than the one to which I’ve arbitrarily assigned them. Cool; we think differently. Let me know how, and why.

Note also that we’re looking at this without prejudice. The object of this game is not to question anyone’s skill or integrity, but rather to try to imagine what could happen if all the stars are misaligned. We’re not saying that these things always happen, or even that they frequently happen. Indeed, a couple of these items might never have happened at all. In a brainstorm like this, even wildly improbable ideas are welcome, because they might trigger us to think of a more plausible risk.

Erroneous Fix

Just as people might fail to implement something the first time, they might fail when they try to fix the error, even though they’re acting in good faith with all of the skill they’ve got.

Lynn McKee: He is a human being and simply made an error in fixing it.
Darren McMillan: The developer didn’t have the skills to apply fix correctly, resulting code caused later regression issues.

That kind of problem gets compounded when we add one or more of the other problems listed below.

Incomplete Fix

Sometimes fixes are incomplete. Sometimes the special case has been fixed, but not the general case. Sometimes that’s because of a poor understanding of the scope of the problem; problems are usually multi-dimensional. (We’ll get to the root of the misunderstanding later.)

Ben Kelly: ‘Fix’ hid erroneous behavior but did not resolve the underlying problem
Ben Simo: The problem existed in multiple places and requires additional fixes.
Lanette Creamer: Fix isn’t accesible
Lanette Creamer: Fix is not localized.
Lanette Creamer: Fix isn’t triggered in some paths.
Lanette Creamer: Fix doesn’t integrate w other code
Stephen Hill: The programmer might have fixed that symptom of the bug but not dealt with the root cause.
Stephen Hill: Has the fix been applied only to new installs or can it retrospectively fix pre-existing installs too?

Sometimes people will fix the letter of the problem without doing all of the related work.

Ben Kelly: Bug fix did not have accompanying automation checks added (in a culture where this is the norm)

It’s possible for people to comply maliciously with the letter of the spec, fixing a coding problem while ignoring an underlying design problem.

Erkan Yilmaz: dev knows since decision abt design it’s bad but against his belief fixs also bug(s). He cant look honestly in mirror anymore

Unwanted Side Effects

Sometimes a good-faith attempt to fix the problem introduces a new problem, or helps to expose an old problem. Much of the time such problems could easily intersect with “Incomplete Fix” above or “Poor Understanding” below.

Ben Simo: The fix created a new problem worse than the solution.
Lanette Creamer: fix breaks some browsers/platforms
Lanette Creamer: fix has memory leaks
Lanette Creamer: fix breaks laws/reqirements
Lanette Creamer: fix slows performance to a crawl.
Michel Kraaij: The bug was fixed, but “spaghetti code” increased.
Michel Kraaij: The dev did fix the bug, but introduced a dozen other bugs. (this issue fixed or not?
Michel Kraaij: The fix increases the user’s manual process to an unacceptable level.
Nancy Kelln: Was never actually a problem. By applying a ‘fix’ they now broke it.
Pete Walen: Mis-read problem description, “fixed” something that was previously working.

Intermittent Problem

In my early days as a telephone support person at Quarterdeck, customers used to ask me why we hadn’t fixed a particular problem. I observed that problems that happened in every case, on every platform, tended to get a lot of attention. Sometimes a fix will appear to solve a problem whose symptom is intermittent. The fix might apply to the first iteration through a loop, but not to subsequent iterations; or to all instances except the intended maximum. Problems may reproduce easily with certain data, and not so often or not at all with other data. Timing, network traffic, and available resources can conspire to make a problem intermittent.

Michel Kraaij: The dev happened to use test data which didn’t make the bug occur.
Michel Kraaij: The dev followed “EVERY described step to reproduce the bug” and now the bug didn’t occur anymore.
Pete Walen: Fixed sql query with commands that don’t work on that DB… not that that ever happened to anything I tested…
Pradeep Soundararajan: might have thought the bug to be fixed by tryin to repeat test mentioned in the report although there are other ways to repro
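
To make the “all instances except the intended maximum” case concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the limit `MAX_ITEMS` and both function names come from no real product. It shows how a retest that stays below the boundary would agree with the programmer that the problem is fixed.

```python
from typing import Optional

MAX_ITEMS = 5  # hypothetical intended maximum, chosen only for this illustration


def can_add_item_after_fix(current_count: int) -> bool:
    """The 'fixed' check, which is still off by one at the maximum."""
    # Comparing against MAX_ITEMS - 1 rejects the last item that should be allowed.
    return current_count < MAX_ITEMS - 1


def can_add_item_intended(current_count: int) -> bool:
    """What the requirement was (hypothetically) meant to allow."""
    return current_count < MAX_ITEMS


def first_disagreement() -> Optional[int]:
    """Return the smallest count at which the fix and the intent differ."""
    for count in range(MAX_ITEMS + 1):
        if can_add_item_after_fix(count) != can_add_item_intended(count):
            return count
    return None


if __name__ == "__main__":
    print("Fix and intent first disagree at count:", first_disagreement())
```

A retest that repeats only the counts from the original report, say zero through three items, would pass every time. The problem shows up only when someone deliberately walks up to the boundary.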

Environment Issues

We’ve all heard (or said) “Well, it works on my machine.” The programmer’s environment may not match the test environments.

Adam Yuret: The fix only works on the Dev’s non-production analogous workstation/environment.
Michel Kraaij: The fix is based on module, which has became obsolete earlier, but wasn’t removed from the dev’s env.
Pradeep Soundararajan: Works on his machine
Stephen Hill: Might be fixed in dev’s environment where he has all the DLLs already in place but not on a clean m/c.

“Works on my machine” is a special case of a more general problem: the programmer’s environment might not be representative of the test environment, but the test environment might not be representative of the production environment, either. There might not be a test environment. Patches, different browser versions, different OS versions, different libraries, different mobile platforms… all those differences can make it appear that a problem has been fixed.

Darren McMillan: Fix on production code blocking customer upgrades
Lynn McKee: No two tests are ever exactly the same, so even tho code change was made something is diff in testers “environment”.
Michel Kraaij: The fix demands a very expensive hardware upgrade for the production environment.
Pete Walen: Fixed code for 1 DB, not for the other 3, and not the one that was used in testing.
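
One modest way to make environment differences visible is to record a fingerprint of the environment alongside each retest, and compare fingerprints when a “fixed” problem comes back. The sketch below is an assumption about how that might look, not a description of any particular team’s practice. The function names are invented, and the fields captured are a small sample drawn from Python’s standard `platform` module; a real project would likely add library versions, database versions, and so on.

```python
import platform


def environment_fingerprint() -> dict:
    """Capture a few facts about the machine this code is running on."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python": platform.python_version(),
    }


def environment_differences(ours: dict, theirs: dict) -> dict:
    """Return {key: (our_value, their_value)} for every key that differs.

    A key missing on one side is reported with None on that side.
    """
    keys = sorted(set(ours) | set(theirs))
    return {
        key: (ours.get(key), theirs.get(key))
        for key in keys
        if ours.get(key) != theirs.get(key)
    }


if __name__ == "__main__":
    # Hypothetical fingerprints, invented for illustration.
    dev = {"os": "Windows", "python": "3.11.4"}
    test = {"os": "Windows", "python": "3.9.1", "db": "postgres"}
    print(environment_differences(dev, test))
```

An empty result doesn’t prove that the environments match; it shows only that the recorded facts match. The fingerprint is itself an abstraction, and a leaky one.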

Poor Understanding or Explanation

Arguably all problems include an element of this one. Sometimes there’s poor communication between the programmer and the tester, due to either or both. The tester may not have described or explained the problem well, and the programmer provided a perfect fix to the problem as (poorly) described. Sometimes the programmer doesn’t understand the problem or the implications of the fix, and provides an imperfect fix to a well-described problem. Sometimes a report might seem to refer to the same problem as another, when the report really refers to a different problem. These problems can be aided or exacerbated by the medium of communication: the bug tracking system, linguistic or cultural differences, written instead of face-to-face communication (or the converse).

Ben Kelly: Programmer can’t reproduce the problem – tester didn’t provide sufficient info.

Ben Simo: The problem wasn’t understood well enough to be satisfactorily fixed.
Michel Kraaij: The dev asked whether “the problem was solved this way?” He got back a “yes”. He just happened to ask the wrong stakeholder.
Nancy Kelln: Wasn’t clear what the problem was and they fixed something else.
Pradeep Soundararajan: Assumes it to be a duplicate of a bug he recently fixed.
Zeger Van Hese: Developer solved the wrong problem (talking to the tester would have helped).
Ben Simo: The tester was wrong and gave the programmer information leading to breaking, not fixing, the software.
Michel Kraaij: The tester classifies the bug as “incorrectly fixed”. However, it’s the tester who’s wrong. Bug IS fixed.

Zeger Van Hese: The dev doesn’t see a problem and marks it fixed (as in: functions as designed)

Another variation on poor understanding is that the “problem” might not be a problem at all.

Darren McMillan: It wasn’t a problem after all. Customer actually considered the problem a feature. On hearing about the fix customer cried.
Darren McMillan: Problem came from a tester with a tendency to create his own problems. Wasn’t actually a problem worth fixing.

Finally in this category, a “problem” can be defined as “a difference between things as perceived and things as desired” (that’s from Exploring Requirements, by Weinberg and Gause). To that, I would add the refinement suggested by the Relative Rule: “…to some person, at some time.” A bug is not a thing in the program; it’s a relationship between the product and some person. One way to address a problem is to solve it technically, of course. But there are other ways to address the problem: change the perception (without changing the fact of the matter); change the desire; decide to ignore the person with the problem; or wait, such that perhaps the problem, the perception, the desire, or the person is no longer relevant.

Darren McMillan: Fix was a customer support call. Fix satisfied customer, didn’t fit product needs.
Pradeep Soundararajan: Because the bug is the perception not the code

Insufficient Programmer Testing

Partial problem fixes, intermittent problems, and poor understanding tend not to thrive in the face of programmers who think and act critically. Inadequate programmer testing is never the sole source of a problem, but it can contribute to a problem being marked “fixed” when it really isn’t.

Ben Simo: The fix wasn’t sufficiently tested.
Darren McMillan: Obvious: wasn’t properly tested, didn’t consult required parties for fix,
Lynn McKee: …and therefore must not have done any or /any/ effective testing on his end first.

Version Control Problems

Version control software was relatively new to most on the PC platform in the middle 1990s. These days, it’s implemented far more commonly, which is almost entirely to the good. Yet no tool can guarantee perfect execution for either individuals or teams, and accidents still happen. Alas, version control can’t guarantee that the customer has the same configuration on which you’re developing or testing—or that he had yesterday, or that he’ll have tomorrow.

George Dinwiddie: Forgot to check in some of the new code.
Nancy Kelln: Bad code merge overwrote the changes. Bug fix got lost.
Pete Walen: Fixed code, did not check it in, fixed it again, check in BOTH (contradictory fixes)
Pete Walen: Fixed code, forgot to check it in. Twice. #hypotheticallyofcourse
Pradeep Soundararajan: Has a fix and marks fix before he commits the code and checks in the wrong one.
Stephen Hill: Programmer might not be using the same code build as the customer so does not get the bug.
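
Several of these slips (“forgot to check in some of the new code”) could be caught by a quick look at the working copy before marking anything fixed. The following is only a sketch, assuming the team uses Git; the function names are invented for illustration. It relies on `git status --porcelain`, whose version 1 output puts a two-character status code and a space in front of each path.

```python
import subprocess
from typing import List


def uncommitted_paths(porcelain_output: str) -> List[str]:
    """Extract paths from `git status --porcelain` (v1) output.

    Each line is a two-character status, a space, then the path.
    Renames appear as 'old -> new' and are returned unsplit.
    """
    paths = []
    for line in porcelain_output.splitlines():
        if line.strip():
            paths.append(line[3:])
    return paths


def working_copy_is_clean(repo_dir: str = ".") -> bool:
    """Ask Git whether anything in repo_dir is modified, staged, or untracked."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return not uncommitted_paths(result.stdout)


if __name__ == "__main__":
    # Hypothetical output: one modified file and one untracked file.
    sample = " M src/parser.py\n?? src/parser_fix.py\n"
    print(uncommitted_paths(sample))
```

A clean working copy says only that nothing was left behind on this machine. It says nothing about whether the commit reached the branch that gets built, or whether a later merge overwrote it.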

Tracking Tool Problems

Sometimes our tools introduce problems of their own.

Ben Kelly: Programmer fat-fingers the form response to a bug fix.
Nancy Kelln: Marked the wrong bug as fixed in the tracking tool.
Pradeep Soundararajan: The bug tracking system could have a bug that occasionally registers Fixed for any other option.
Pradeep Soundararajan: He might have overlooked a bug id and marked it fixed while it was the other
Zeger Van Hese: The dev wanted to re-assign, but marked it fixed instead (context: he hates that defect tracking system they made him use)

Process or Responsibility Issues

Perhaps there’s a protocol that should be followed, and perhaps it hasn’t been followed correctly—or at all.

Ben Simo: Perhaps it isn’t the responsibility of one programmer OR one tester to mark a problem fixed on their own.
Darren McMillan: Obvious: didn’t follow company procedure for fixes (fix notes, check lists, communications)
Dave Nicolette: Maybe programmer shouldn’t be the one to say the problem is fixed, but only that it’s ready for another review.
Michel Kraaij: The dev misunderstood the bug fixing process and declared it “fixed” instead of “resolved”.

Avoidance, Indifference, or Pathological Behaviour

We don’t like to think about certain kinds of behaviour, yet they can happen. As Herbert Leeds and Jerry Weinberg put it in one of the first books on computer programming, “When we have been working on a program for a long time, and if someone is pressing us for completion, we put aside our good intentions and let our judgment be swayed.”

Ben Kelly: Programmer is under duress from management to ‘fix all the bugs immediately’
Michel Kraaij: The dev is just lazy
Michel Kraaij: The dev is out of “fixing time”. To make the fake deadline, he declares it fixed.
Pradeep Soundararajan: Is forced by a manager or a stakeholder to do so
Pradeep Soundararajan: That way he is buying more time because it goes through a big cycle before it comes back.

Maybe the pressure comes from outside of work.

Erkan Yilmaz: Dev has import. date w. fiancé’s parents, but boss wants dev 2 work overtime suddenly, family crisis happens

There may be questions of competence and trust between the parties. A report from an unskilled tester who cries “Wolf!” too often might not be taken seriously. A bad reputation may influence a programmer to reject a bug with minimal (and insufficient) investigation.

Ben Kelly: Bug was dismissed as the tester reporting had a track record of reporting false positives.

Perhaps someone else was up to some mischief.

Erkan Yilmaz: dev went 2 other person’s pc who was away + used that pc 4 fix/marking – during that he found info that invaded privacy

Sometimes we can persuade ourselves that there isn’t a problem any more, even though we haven’t checked.

Michel Kraaij: The dev has a huge ego and declares EVERY bug he touches as “fixed”.
Pradeep Soundararajan: He did some code change that made the bug unreproducible and hence considers it to be fixed.
Pradeep Soundararajan: Considers it fixed thinking it was logged on latest version & prior versions have it altho code base in production is old
Pradeep Soundararajan: Thinks his colleague already fixed that.
Stephen Hill: Has the person for whom the problem was ‘a problem’ re-tested under the same circumstances as previously?

Sometimes management or measurement exacerbates undesirable behaviours.

Ben Kelly: Programmer’s incentive scheme counts the number of bugs fixed (& today is the deadline)
Ben Simo: Perhaps programmer is evaluated by number of things mark fixed; not producing working software.
Lynn McKee: He is being measured on how many bugs he fixes and hopes no one will notice no actual coding was done.
Pradeep Soundararajan: Yielding to SLA demands to fix a bug within a specific time.

It is remarkable how a handful of experienced people can come up with a list of this length and scope. Thank you all for that.

6 replies to “When A Bug Isn’t Really Fixed”

Matt

January 12, 2011 at 7:59 am

In the “Process or Responsibility Issues”, one point really stood out for me. Dave Nicolette said “Maybe programmer shouldn’t be the one to say the problem is fixed, but only that it’s ready for another review.” I’m working on making this part of our understanding of bugs at my company, simply because a programmer can’t say with confidence that a bug is fixed for the same reason that programmers can’t say a release is bug-free: bias. Even if they’ve tested their fix, the most they can say is that it has been programmer tested.

My preference is not to consider bugs “fixed”, but rather retested under certain conditions, and then mark THAT test as passed or failed. The programmer AND the tester may have missed some sort of detail which makes the bug appear to be fixed when it actually isn’t.
Pete Walen

January 12, 2011 at 11:06 am

When I saw the tweet, my first thought was “I wonder if he intended that to be a general blast.” Almost immediately after came the thought “Who cares? Ten? I can do ten…” I think I got to 5 before I realized other folks had done the same thing. Interesting reactions. Thanks for pulling this together! (I still owe you 5 instances…)
Lynn McKee

January 12, 2011 at 6:24 pm

Thanks for the challenge Michael, it was fun to see how the responses evolved on Twitter.
Nilanjan

January 20, 2011 at 2:39 am

This is a great list. Great set of ideas on what to test on bug fixes.

Might be good to add a disclaimer not to turn this into a slug fest against developers. Some of the comments have a tone 🙂

Wondering if we should prevent some of these issues (or do I need to reread http://www.developsense.com/blog/2010/05/testers-get-out-of-the-quality-assurance-business/)?

Michael replies: I hope it’s clear that I’ve taken an agnostic view of what the problem is or who’s involved. Note that I didn’t say who’s to blame, or who’s responsible. It seems to me that the concept of responsibility is far more powerful when it is granted and accepted, and far less when it’s used as a sneaky synonym for “blame” or “culpability”. Alas, in my experience as a programmer, I’ve been the one most likely to accept responsibility for something that wasn’t actually fixed. (At least eventually. And just to be clear: I’m not most likely to accept the responsibility because I’m noble; it’s because I make lots of mistakes.)

I’d be careful about questions like “Should we prevent some of these issues?” The obvious answer is “of course.” It might be a good idea to answer the question with some other questions: Who is “we”? What are the extents and limits of our roles, our responsibility, and our authority? While we’re preventing “these issues”, what else might we prevent?
Lisa Davidson

September 5, 2011 at 6:10 am

This is indeed a wonderful post. In my view, bug fixing should be the priority; who’s to blame, or who’s responsible, should not be the main focus. Experienced QA testing engineers will always test a product or application as independent thinkers, to ensure that quality is delivered.