Lessons Learned in Finding Bugs

This story is put together from several parallel experiences over the last while that I’ve merged into one narrative. The pattern of experiences and epiphanies is the same, but as they used to say on TV, names and details have been changed to protect the innocent.

I was recently involved in testing an online shopping application. In the app, there’s a feature that sends notifications via email.

On the administration screen for setting up that feature, there are “Save” and “Cancel” buttons near the upper right corner. Those buttons are not mapped to any keys. The user must either click on them with the mouse, or tab to them and press Enter.

Below and to the left, there are fields for configuring some settings. Then, at the bottom left, there is a field in which the user can enter a default email notification message.

Add a nice short string to that text field, and everything looks normal. Fill up that field, and the field starts to expand rightwards to accommodate the text. The element in which the text field is embedded expands rightwards too.

Add enough text (representing a perfectly plausible length for an email notification message) to the text field, and the field and its container expand rightwards far enough that they start to spill off the edge of the screen.

And here’s the kicker: all this starts to obscure the Save and Cancel buttons in the top right, such that they can’t be clicked on any more. You can delete the text, but the field and container stubbornly remain the same enlarged size. That is, they don’t shrink, and the Save and Cancel buttons remain covered up.

If you stumble around with the Tab key, you can at least make the screen go away—but if you were unlucky enough to click “Save” and return to the application, the front-end remains in the messed-up state.

There is a configuration file, but it’s obfuscated so that you can’t simply edit it and adjust the length of the field to restore it to something that doesn’t cover the Save and Cancel buttons. You can delete the file, but if you do that, you’ll lose a ton of other configuration settings that you’ll have to re-enter.

The organization had, the testers told me, a set of automated checks for this screen. We looked into it. Those checks didn’t include any variation. For the email notification field, they changed the default to a short string of different happy-path data, and pressed the Save button. But they didn’t press the on-screen Save button. They pressed a virtual Save button.

Thus, even if the check included some challenging data, the automated checks would still have been able to find and click on the correct invisible, inaccessible, virtual Save and Cancel buttons just fine. That is, there is no way that the checks would alert a tester or anyone else to this problem.
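To make that gap concrete, here is a minimal Python sketch. It is not the organization's check code; it models the screen as plain rectangles, and every name in it (`Rect`, `message_container`, `is_clickable`, `virtual_save`, `check_save`) and every pixel value is invented for illustration. The assumption is that a "virtual" press triggers Save through code without consulting the layout, while a user needs the button's centre point to be on screen and uncovered.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in screen pixels (hypothetical layout model)."""
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)

    def centre(self) -> Tuple[int, int]:
        return (self.left + self.width // 2, self.top + self.height // 2)


# All coordinates below are invented for illustration.
SCREEN = Rect(0, 0, 1024, 768)
SAVE_BUTTON = Rect(900, 50, 60, 24)


def message_container(text_length: int) -> Rect:
    """Assumed behaviour from the story: the container grows rightwards with the text."""
    return Rect(left=20, top=40, width=300 + 8 * text_length, height=600)


def is_clickable(target: Rect, overlays: List[Rect]) -> bool:
    """True if the target's centre is on screen and no overlay covers it."""
    x, y = target.centre()
    return SCREEN.contains(x, y) and not any(o.contains(x, y) for o in overlays)


def virtual_save(settings: Dict[str, str], message: str) -> Dict[str, str]:
    """Stand-in for a check that triggers Save through code, ignoring the layout."""
    return {**settings, "default_message": message}


def check_save(message: str) -> Tuple[bool, bool]:
    """Return (virtual check passed, a user could click Save)."""
    saved = virtual_save({}, message)
    virtual_ok = saved["default_message"] == message
    user_ok = is_clickable(SAVE_BUTTON, [message_container(len(message))])
    return virtual_ok, user_ok
```

With these invented numbers, `check_save("Thanks!")` returns `(True, True)` and `check_save("x" * 200)` returns `(True, False)`: the virtual check keeps passing after the button has become unreachable for a person.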

After searching for a product, there was a screen to display tiles of products returned in the search. Some searches returned a single product, displaying a single tile. It didn’t take very long for us to find that leaving that screen and coming back to it produced a second instance of the same tile. Leaving and coming back again left three tiles on the screen. It didn’t take long to produce enough tiles for a Gaudi building in Barcelona.

Logging in and putting products into the shopping cart was fine. Putting items into the shopping cart and then logging in put the session into a weird state. The number of items on the shopping cart icon was correct, based on what we had selected, but trying to get into the shopping cart and change the order produced a message that the shopping cart could not be accessed at this time, and all this rendered a purchase impossible. (I tried it later on the production site; same problem. Dang; I wanted those books.)

We found these problems within the first few minutes of free, playful interaction with this product and trying to find problems. We did it by testing experientially. That is, we interacted with the product such that our encounter was mostly indistinguishable from that of a user that we had in mind from one moment to the next. Most observers wouldn’t have noticed how our encounter was different from a user’s, unless they were keen to notice us doing testing.

That observer might have noticed us designing and performing experiments in real time, and taking notes. Those experiments were based on imagining data and work flows that were not explicitly stated in the requirements or use case documents. The experiments were targeted towards vulnerabilities and risks that we anticipated, imagined, and discovered. We weren’t there to demonstrate that everything was working just fine. We were there to test.

And our ideas didn’t stay static. As we experimented and interacted with the product, we learned more. We developed richer mental models of the data and how it would interact with functions in the product. We developed our models of how people might use the product, too; how they might perform some now-more-foreseeable actions—including some errors that they might commit that the product might not handle appropriately. That is, we were changing ourselves as we were testing. We were testing in a transformative way.

Upon recognizing subtle little clues—like the text field growing when it might have wrapped, or rendered existing data invisible by scrolling the text—we recognized the possibility of vulnerabilities and risks that we hadn’t anticipated. That is, we were testing in an exploratory way.

We didn’t let tools do a bunch of unattended work and then evaluate the outcomes afterwards, even though there can be benefits from doing that. Instead, our testing benefitted from our direct observation and engagement. That is, we were testing in an attended way.

We weren’t constrained by a set procedure, or by a script, or by tools that mediated and modified our naturalistic encounter with the product. That is, we weren’t testing in an instrumented way, but in an experiential way.

We were testing in a motivated way, looking for problems that people might encounter while trying to use the damned thing. Automated checks don’t have motivations. That’s fine; they’re simply extensions of people who do have motivations, and who write code to act on them. Even then, automated checks had not alerted anyone to this bug, and would never do so because of the differences between the way that humans and machines encounter the product.

Oh, and we found a bunch of other bugs too. Bunches of bugs.

In the process of doing all this, my testing partners realized something else. You see, this organization is similar to most: the testers typically design a bunch of scripted tests, and then run them over and over—essentially, automated checking without a machine. Eventually, some of the scripts get handed to coders who turn them into actual automated checks.

Through this experience, the testers noticed that neither their scripted procedures nor the automated checks had found the problems. They came to realize that even if someone wanted them to create formalized procedures, it might be a really, really good idea to hold off on designing and writing the scripts until after they had obtained some experience with the product.

Having got some experience with the product, the testers also realized that there were patterns in the problems they were finding. The testers realized that they could take these patterns back to design meetings as suggestions for the developers’ reviews, and for unit- and integration-level checks. That in turn would mean that there would be fewer easy-to-find bugs on the surface. That would mean that testers would spend less time and effort on reporting those bugs—and that would mean that testers could focus their attention on deeper, richer experiential testing for subtler, harder-to-find bugs.
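As one illustration of a pattern turned into a unit-level check, consider the duplicated search tiles. The Python sketch below is built entirely on assumptions: `SearchResultsScreen`, `show`, and `tiles_after_revisits` are hypothetical names, and the real cause of the duplication is not known from the story. It only shows the shape such a check might take: revisit the screen several times and confirm that the tile count still matches the result count.

```python
from typing import List


class SearchResultsScreen:
    """Hypothetical screen model; not the product's real code."""

    def __init__(self) -> None:
        self.tiles: List[str] = []

    def show(self, results: List[str]) -> None:
        # Rebuild the tile list on every visit. Appending to the existing
        # list here instead would reproduce the duplicate-tile problem.
        self.tiles = list(results)


def tiles_after_revisits(screen: SearchResultsScreen,
                         results: List[str],
                         visits: int) -> int:
    """Enter the screen `visits` times and report how many tiles are displayed."""
    for _ in range(visits):
        screen.show(results)
    return len(screen.tiles)


def check_revisit_does_not_duplicate() -> None:
    """A single search result should produce a single tile, however often we return."""
    screen = SearchResultsScreen()
    assert tiles_after_revisits(screen, ["one product"], visits=3) == 1
```

A check like this is cheap to write once the pattern is known. The point of the story is that the pattern was discovered by people interacting with the product, not by the check.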

They also realized that they would likely find and report some problems during early experiential testing, and that the developers would fix those problems and learn from the experience. For a good number of these problems, after the fix, there would be incredibly low risk of them ever coming back again—because after the fix, it would be seriously unlikely that those bits of code would be touched in a way to make those particular problems come back.

This would reduce the need for lengthy procedural scripting associated with those problems; a handful of checklist items, at most, would do the trick. The fastest script you can write is the script you don’t have to write.

And adding automated checks for those problems probably wouldn’t be necessary or desirable. Remember?—automated checks had failed to detect the problems in the first place. The testers who wrote code could refocus their work on lower-level, machine-friendly interfaces to test the business rules and requirements before the errors got passed up to the GUI. At the same time, those testers could use code to generate rich input data sets, and use code to pump that data through the product.

Or those testers could create tools and visualizations and log parsers that would help the team see interesting and informative patterns in the output. Or those testers could create genuinely interesting and powerful and rich forms of automated checking, as in this example. (Using the packing function against the unpacking function is a really nice application of the forward-and-backward oracle heuristic.)
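The forward-and-backward idea can be sketched in a few lines of Python. This is a stand-in and not the example referred to above: it uses the standard library's `zlib.compress` and `zlib.decompress` as an assumed pack/unpack pair, and `generate_inputs` and `round_trip_failures` are hypothetical helper names. It also shows, in miniature, code generating varied input data and pumping it through the functions under test.

```python
import random
import zlib
from typing import Iterable, Iterator, List


def generate_inputs(count: int, max_length: int, seed: int = 0) -> Iterator[bytes]:
    """Yield varied byte strings: three boundary cases, then seeded random data."""
    yield b""                       # empty input
    yield b"\x00" * max_length      # long run of a single byte
    yield bytes(range(256))         # every byte value once
    rng = random.Random(seed)       # seeded so that any failure can be reproduced
    for _ in range(count):
        length = rng.randint(0, max_length)
        yield bytes(rng.randrange(256) for _ in range(length))


def round_trip_failures(inputs: Iterable[bytes]) -> List[bytes]:
    """Return every input for which unpack(pack(x)) differs from x."""
    failures = []
    for data in inputs:
        if zlib.decompress(zlib.compress(data)) != data:
            failures.append(data)
    return failures


if __name__ == "__main__":
    failures = round_trip_failures(generate_inputs(count=200, max_length=4096))
    print(f"{len(failures)} round-trip failures")
```

The oracle is a heuristic: a round trip that succeeds does not prove that either function is correct, since the two could be wrong in complementary ways. What it offers is an inexpensive way to run a large volume of varied data past a pair of functions without having to predict each expected output in advance.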

One of the best ways to “free up time for exploratory testing” is to automate some checks—especially at the developer level. But another great way to free up time for exploratory testing is to automate fewer expensive, elaborate checks that require a lot of development and maintenance effort and that don’t actually find bugs. Some checks are valuable, and fast, and inexpensive. The fastest, least expensive check you can write is the valueless check you don’t have to write.

And attended, exploratory, motivated, transformative, experiential testing is a great way to figure out which is which.
