A couple of years ago, I developed a version of a well-known reasoning exercise. It’s a simple exercise, and I implemented it as a really simple computer program. I described it to James Bach, and suggested that we put it in our Rapid Software Testing class.
James was skeptical. From my description, he didn’t figure that the exercise would be interesting enough. I put in a couple of little traps, and tried it a few times with colleagues and other unsuspecting victims, sometimes in person, sometimes over the Web. Then I tried the actual exercise on James, using the program. He helpfully stepped into one of the traps. Thus emboldened, I started using the exercise in classes. Eventually James found an occasion to start using it too. He watched students dealing with it, had some epiphanies, tried some experiments. At one point, he sat down with his brother Jon and they tested the program aggressively, and revealed a ton of new things about it—many of which I hadn’t known myself. And I wrote the thing.
Experiential exercises are like peeling an onion; beneath everything we see on the surface, there’s another layer that we can learn about. Today we made a discovery; we found a bug as we transpected on the exercise, and James put a name on it.
We call it an Ellis Island bug. Ellis Island bugs are data conversion bugs, in which a program silently converts an input value into a different value. They’re named for the tendency of customs officials at Ellis Island, a little way back in history, to rename immigrants unilaterally with names that were relatively easy to spell. With an Ellis Island bug, you could reasonably expect an error on a certain input. Instead you get the program’s best guess at what you “really meant”.
There are lots of examples of this. We have an implementation of the famous triangle program, written many years ago in Delphi. The program takes three integers as input, with each number representing the length of a side of a triangle. Then the program reports on whether the triangle is scalene, isosceles, or equilateral. Here’s the declaration of the function that checks the sides:
function checksides (a, b, c : shortint) : string;
Here, no matter what numeric value you submit, the Delphi libraries silently convert it to a signed 8-bit integer between -128 and 127. This leads to all kinds of amusing results: a side of length between 128 and 255 is invisibly converted to a negative number, causing the program to report “not a triangle”; and entries like 300, 300, 44 are interpreted as an equilateral triangle, because 300 wraps around to 44.
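We don’t need Delphi installed to see the effect; we can mimic the shortint conversion in any language. Here’s a minimal sketch (in Ruby, since that’s what’s handy here; the helper name is mine, not the program’s):

# Mimic Delphi's shortint: a signed 8-bit integer, range -128..127.
def to_shortint(n)
  n &= 0xFF                      # keep only the low 8 bits
  n >= 0x80 ? n - 0x100 : n      # reinterpret the top bit as a sign
end

to_shortint(127)   # => 127   the last value that survives intact
to_shortint(128)   # => -128  anything from 128 to 255 turns negative
to_shortint(300)   # => 44    so 300, 300, 44 looks like an equilateral triangle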
Ah, you say, but no one uses Delphi any more. So how about C? In C, we’ve been advised forever not to trust input formatting strings, and to parse input ourselves. So how about Ruby?
Ruby’s String object supplies a to_i method, which converts a string to its integer representation. Here’s what the Pickaxe says about that:
to_i    str.to_i( base=10 ) → int
Returns the result of interpreting leading characters in str as an integer base base (2 to 36). Given a base of zero, to_i looks for leading 0, 0b, 0o, 0d, or 0x and sets the base accordingly. Leading spaces are ignored, and leading plus or minus signs are honored. Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of str, 0 is returned. The method never raises an exception.
We discovered a bunch of things today as we experimented with our program. The most significant was the behaviour described in the last two sentences: an invalid number is silently converted to zero, and no exception is raised!
We found the problem because we thought we were seeing a different one. Our program parses a string for three numbers. Depending upon the test that we ran, it appeared as though multiple signs were being accepted (+-+++-), but that only the first sign was being honoured. Or that only certain terms in the string tolerated multiple signs. Or that you could use multiple signs once in a string—no, twice. What the hell? All our confusion vanished when we put in some debug statements and saw invalid numbers being converted to 0, a kind of guess as to what Ruby thought you meant.
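A few illustrative calls make the behaviour plain. The inputs here are mine, but the results follow the documentation quoted above:

"42".to_i            # => 42
"  +42 sides".to_i   # => 42   leading spaces and one sign are fine; trailing junk is ignored
"forty-two".to_i     # => 0    no valid number at the start, so... zero, with no exception
"+-+++-42".to_i      # => 0    multiple signs don't make a valid number either
Integer("forty-two") # raises ArgumentError; Kernel#Integer is the stricter alternative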
This is by design in Ruby, so some would say it’s not a bug. Yet it leaves Ruby programs spectacularly vulnerable to bugs wherein the programmer isn’t aware of the behaviour of the language. I knew about to_i’s ability to accept a parameter for a number base (someone showed it to me ages ago), but I didn’t know about the conversion-to-zero error handling. I would have expected an exception, but it doesn’t do that. It just acts like an old-fashioned customs agent: “S-C-H-U-M-A-C… What did you say? Schumacher? You mean Shoemaker, right? Let’s just make that Shoemaker. Youse’ll like that better here, trust me.”
We also discovered that the method is incorrectly documented: to_i does raise an exception if you pass it an invalid number base—37, for example.
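For instance:

"101".to_i(36)   # => 1297  36 is the documented maximum base
"101".to_i(37)   # raises ArgumentError (invalid radix), despite "never raises an exception"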
There are many more stories to tell about this program—in particular, how the programmer’s knowledge is, at best, a different set from what empirical testing can reveal. Many of the things we’ve discovered about this trivial program could not have been caught by code review; many of them aren’t documented, or are poorly documented, in both the program and the Ruby literature. We couldn’t look them up, and in many cases we couldn’t have anticipated them if they hadn’t emerged from testing.
There are other examples of Ellis Island bugs. A correspondent, Brent Lavelle, reports that he’s seen a bug in which 50,00 gets converted to 5000, even if the user is from France or Germany (in those countries, a comma rather than a period denotes the decimal point, and spaces are used where we use commas).
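We don’t know what that program’s code looks like, but one plausible, purely hypothetical reconstruction is a parser that strips what it assumes are thousands separators before converting (the function name is mine):

# Hypothetical sketch only: strip "thousands separators", then convert.
def parse_amount(text)
  text.delete(",").to_i
end

parse_amount("50,00")   # => 5000, although a French or German user meant fifty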
Now: boundary tests may reveal some Ellis Island bugs. Other Ellis Island bugs defy boundary testing, because there’s a catch: many such tests would require you to know what the boundary is and what is supposed to happen when it is crossed. From the outside, that’s not at all clear. It’s not even clear to the programmer, when libraries are doing the work. That’s why it’s insufficient to test at the boundaries that we know about already; that’s why we must explore.
Can you share one or two more-compelling examples of Ellis Island bugs? Frankly, the fact the triangle-sides program didn't check for valid input is just plain old bad programming, and not a new category of bug. Is there an example of a case that actually would have slipped through widely-used good development practices such as TDD? I'd like to understand whether this sort of bad programming is all that common these days, and if so what the true root causes might be.
Cheers,
Dave
@Dave…
Frankly, the fact the triangle-sides program didn't check for valid input is just plain old bad programming, and not a new category of bug.
I've been trained (principally by Jerry Weinberg) to pay special attention to the word magic invoked by "just".
I think you might be subject to the narrative bias (a.k.a. hindsight bias) here: since you know the end of the story, you can anticipate the problem at the beginning of it. My question for you is this: would you have been aware of the conversion-to-zero issue in Ruby had we not reported it here? Or would you have expected an exception to be raised for invalid input?
As with many things, we're not identifying a new category of bug; we're putting a new name on a class of things that people have known about forever. As a matter of fact, the Y2K problem is a special case of this kind of problem. For many input forms, 1994 would have been silently converted to 94 internally; 1976 would have been converted to 76; and 2005 would have been converted to… what?
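(In Ruby terms, and purely as an illustration of the truncation:

[1994, 1976, 2005].map { |year| year % 100 }   # => [94, 76, 5]

The inputs are mine; the point is that the conversion happens silently.)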
Is there an example of a case that actually would have slipped through widely-used good development practices such as TDD?
A couple of problems here. First, do you have any evidence that TDD is widely used? Looking at Scott Ambler's most recent Agile survey that poses the question, TDD is "very commonly" used by 37 respondents out of 244, "commonly" used by 83 out of 244, and "sometimes" used by 65 of the 244. There are several inferences that one could make to suggest that even these numbers are high. Only 124 respondents to the survey identified themselves as programmers; non-programmers would be dealing in rumours of TDD a significant portion of the time. This was a survey of people who had adopted or were considering adopting Agile, so there would be a bias towards people who had, you know, heard of TDD. And so forth.
Second, what basis do you have for believing that TDD is any better for anticipating this kind of problem? Amongst the community of people who practice, teach, and talk about it (including Ward Cunningham, Kent Beck, Elisabeth Hendrickson) it is consistently described as a design technique, not as a testing technique. In fact, Elisabeth has recently referred to having to turn down the noise from her tester head as she tries to program using TDD.
I'd expect some correlation between TDD well-practiced and a smaller number of unit-level bugs, perhaps. There's a greater correlation, I'd expect, between TDD and a better-understood design. But when it's not universally practiced, is practiced at varying levels of quality, and is addressed towards discovering solutions rather than discovering problems, I don't see any reason to relax vigilance.
I'd like to understand whether this sort of bad programming is all that common these days, and if so what the true root causes might be.
The root cause? I'd start by considering that pretty much every function in every computer program is an attempt to effect a transformation of some piece of information into another form. Some of those transformations happen by the programmer's explicit instruction; some happen with the programmer's knowledge; and yet others happen invisibly as the operating system, language libraries, application frameworks, and hardware process information. One big issue, as always, is that we don't know what we don't know. Until we do. Do you own a Prius, by any chance?
—Michael B.
I've been trained (principally by Jerry Weinberg) to pay special attention to the word magic invoked by "just".
Good point. I'll try to be more conscious of my unconscious use of such words. Thanks!
…would you have been aware of the conversion-to-zero issue in Ruby had we not reported it here?
I must have communicated very poorly indeed. I didn't mean to comment on the details of language implementations. I would have expressed the desired behavior of the code in tests or examples prior to writing the code, both happy path and otherwise. If my goal was to write a program to report the type of triangle described by three input values, I would not try to test the implementation of Ruby internals. I would, however, make sure the program did not try to use illogical input values.
If I were working on legacy code, such as the Y2K problem you mentioned, then characterization tests would have exposed undesired silent conversions such as 1976 to 76, and unit tests would have ensured the revised code worked as desired. I don't see what is confusing or unusual about that.
First, do you have any evidence that TDD is widely used?
If it is not widely used, or if equally-effective or better methods are not widely used, then perhaps that is the root cause of the problem in this example; not to mention many other problems with software.
I'd start by considering that pretty much every function in every computer program…
You and I are not seeing the same problem, here. If developers expressed the desired behavior of their code in an executable form, the underlying support for string parsing or integers or what-have-you would not matter. As soon as they tried a unit test to ensure the code would reject an illogical input value, the test would signal the fact that the code was not behaving properly. The developer would then make the code behave properly. In this particular example, which I assume was intentionally chosen because it's easy to understand, the root cause of the bug is not any given language's implementation of integers or strings or what-have-you, it's that the developer did not check for valid input values.
The reason I asked for a more interesting example is that the triangle program (as you described it) was poorly written. It doesn't really make a compelling case for your argument. A well-written program that nevertheless exhibited behavior that is hard to test, or side-effects so obscure that a developer wouldn't even think to test for them, would make a stronger example. I would still be interested in such an example.
Cheers,
Dave
@Dave:
Here's another. Open WordPad. In the Font size combo box, type 1639. Observe a message that the number entered must be between 1 and 1638. Note that the value reverts to what was in there to start with (for me, 10).
Type some text. Highlight it. Right click. Choose Font. In the Size combo box, type 1639 (or any other number up to 9999). Click on OK. Observe that your text is really, really big. Note that the value 1638 appears in the drop-down box.
Here's the weirder bit: put your cursor in the Font size combo box. Press Ctrl-A to Select All. Press Ctrl-C to copy. Paste the contents of the clipboard somewhere you can see it (Notepad, or even WordPad, but you'll have to resize it to be able to read it!). Note that whatever value you entered, it had been changed to 1638.5. Not 1638, mind you, but 1638 point-five.
Now, Dave: You might want to put this down to WordPad being a Microsoft product; or to being a product that doesn't matter much, since it's free; or to bad programming; or to an apparent case of one programmer not realizing what another did. Those things would be true, but they'd be beside the point: if they can happen in WordPad, they can happen anywhere the programmer is unaware of the possibility of someone, or something, changing his data without his awareness. The repetition there is intentional; I'm not simply saying that we don't know when something changes, but also that we don't know that we don't know.
I would have expressed the desired behavior of the code in tests or examples prior to writing the code, both happy path and otherwise.
That's good. But that's you. I presume that you are more conscientious than many. Moreover, how do you know that you're expressing everything about the desire of the code in your tests or examples? The happy path is often intractable; the unhappy path is infinite.
If my goal was to write a program to report the type of triangle described by three input values, I would not try to test the implementation of Ruby internals. I would, however, make sure the program did not try to use illogical input values.
I didn't try to test the implementation of Ruby internals either. I did make sure that the program did not try to use illogical input values. The trouble is that my tests confirmed that improper data was being rejected, while the silent conversion was still happening in the background. As James Bach points out, this can happen when a function returns something that appears to be a reasonable value, but is in fact a sort of error code. Since the value appears to be reasonable (especially when the error code and the return value can be interpreted as the same data type), the program doesn't treat the return value as an error code. Or to put it another way, a function returns bits. The caller interprets those bits one way. There is no guarantee that the interpretation is correct.
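Here's a sketch of that trap, using the to_i behaviour described above (the function name is mine, not the exercise program's):

# Sketch: the "error code" (0) is indistinguishable from a legitimate value of the same type.
def side_length(text)
  text.to_i   # "9" => 9, but "nine" => 0 ... and "0" => 0 as well
end

side_length("nine") == side_length("0")   # => true; the caller can't tell error from data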
I'm willing to run the exercise on you to show you what I mean. Contact me via Skype. Speaking of which, here's another good example of an Ellis Island bug: http://www.developsense.com/blog/archive/2009_09_29_archive.html.
Cheers,
—Michael B.
@Dave
I can share a couple of real bugs of this sort.
1. Windows 2000 or XP, the standard graphical editor Paint. Open the picture size dialog, enter width=-1 and height=1, click OK, and watch what happens.
2. Windows 7, the Paint program again. Open page properties, enter left margin=-10, and click OK; the left margin changes to zero automagically. (Oh, you can't enter the minus sign, it is not accepted by the text field… No problem, just copy it to the clipboard and paste it into the left margin text field, and there it is!)
1 of 3
"…this can happen when a function returns something that appears to be a reasonable value, but is in fact a sort of error code."
"…a function returns bits. The caller interprets those bits one way. There is no guarantee that the interpretation is correct."
Honestly, I'm not trying to be argumentative, but isn't this something that "shouldn't" happen (yes, I know, there's that magic word, "should")? Why would the client code interpret the return value from a function in some way that wasn't intended? Answer: Because the author of the client code didn't bother to find out what the function's return value is supposed to mean. It seems to me this is another indication that development practices ought to be improved.
"Dear Programmer…I could have asked you, but you were downstairs getting another latte."
This is a good story, and I certainly can't debate the fact that this sort of thing happens in software development. To me, though, it's another illustration of the value of eliminating specialized silos and hand-offs on a development team. If the same person, or the same pair of people, did both the development and the (basic) testing, it would not be possible for the person who had the answer to be unavailable to the person who had the question. Also, as I mentioned previously, with agile methods we wouldn't continue to build and promote code if we didn't understand what the limit was on this input field. We wouldn't say, "Oh, he's off getting a latte. So I'll just pull an assumption out of thin air and keep on going." We would wait for him/her to return with latte in hand, or text him/her. This is fundamental to the agile working style, regardless of the particular process framework or methodology: We have to know the definition of "done" before we start writing any code. Otherwise, we're shooting in the dark at a moving target.
I take your point that sound development practices aren't the norm in industry as of this moment. That fact is the reason the Software Craftsmanship movement got started. I take your point that we shouldn't relax vigilance in ferreting out errors of this kind. As with any problems in any domain, it is necessary to act on both a tactical and a strategic level – on a tactical level to stop the bleeding (after-the-fact testing to find bugs such as the Ellis Island type), and on a strategic level to solve the underlying problem (working to improve development practices industry-wide, so that fewer bugs are generated in the first place). So, it's good that you and others are aware of various types of bugs and have techniques to expose them. You're right to do that. Please remember there are other kinds of "right," too. If we can address the true root cause on a strategic level, I think we can improve things a lot more than by trying to step on every bug that scurries by, one by one. All we will get out of that, ultimately, is sore feet.
2 of 3
"…how do you know that you're expressing everything about the desire of the code in your tests or examples?"
I don't know with absolute certainty. It may not be possible. I have found that by thinking about it in that way, more and more examples come to mind in the course of development; certainly more than I ever thought of when I was working from written specifications, in the Old Way. We can, at least, cover all the cases that are known. And we can reason through some of the edge cases. Consider the font size boundary conditions in one of your Windows examples. If you set the font size too high using one method, the program resets the value to its previous setting; if you set it too high using another method (right clicking and so forth), the program sets the value to its maximum allowable setting. So, the problem here is inconsistent behavior. The program doesn't take the invalid input value and go marching off through memory that it doesn't own, or anything horrible like that. Surely, we would think of that sort of thing during development. The more collaboration, the more feedback, the more fine-grained incremental results delivered, the greater the likelihood someone would notice this behavior early in the development cycle. Can we foresee and catch every possible error? I doubt it. Does that mean we shouldn't use the most effective methods available to avoid creating errors? I don't think so.
I maintain that the true root cause of these problems is the fact that most programmers don't employ the good development practices of the day, and don't make an effort to keep up as good practices evolve. As a testing specialist, wouldn't you rather add value by exercising a working application in production-like contexts than by discovering silly little bugs the developers never should have passed along to you in the first place? You point out, rightly, that TDD is a design technique and not a testing technique. Remember, too, that one of the effects of TDD is to prevent or avoid nearly all trivial programming errors. When employed in concert with other agile practices, such as automated unit tests, continuous integration, pairing (including cross-functional pairing), customer collaboration, frequent review and feedback, incremental delivery, and sitting together to foster osmotic communication, TDD and related practices really go a long way toward making Ellis Island bugs a thing of the past. When dealing with legacy code – like the function that returns bits, but we aren't sure what the bits mean – there are useful techniques that more developers ought to learn that would alleviate many of these problems. (And please don't ask for "studies" or "proof" of this, as if that train hadn't left the station years ago. That would have represented healthy skepticism ten years ago. Today…not so much.)
3 of 3
It seems to me testers would welcome that sort of change in development methods. Surely it's boring to spend your time discovering one trivial bug after another, and never have time left in the project schedule to do value-add system-level testing that developers can't do given the limitations of development environments, and that needs to be done before promoting the code to production? I think it's the developer's job to deliver working code – code that meets functional requirements and doesn't crash or behave badly. I don't think it's the tester's job to find my little bugs for me. When that happens, I feel embarrassed. Unfortunately, the majority of people who receive a paycheck for writing code don't feel that way. They seem to think bugs are someone else's responsibility. In my view, that is the root cause of the problems you've offered as illustrations.
With regard to your suggestions, I can't try out some of the programs you mention because I don't use Windows. The reason I don't use Windows is because so much Windows-based software behaves as you describe, and distracts me from getting work done. The reason so much Windows-based software behaves badly is that most developers don't…well, you already know what I have to say about that. What I may do, as time permits, is write a version of the triangle program in Ruby and see how it behaves when I allow "any" input values. But that would just be for fun and education about Ruby. I don't think this is the deeper problem with Ellis Island bugs. It's just one example of what can happen when the deeper problem goes unsolved.
Honestly, I'm not trying to be argumentative, but isn't this something that "shouldn't" happen (yes, I know, there's that magic word, "should")?
I think you've answered your own question here.
Why would the client code interpret the return value from a function in some way that wasn't intended? Answer: Because the author of the client code didn't bother to find out what the function's return value is supposed to mean. It seems to me this is another indication that development practices ought to be improved.
There's more to it than that. Please understand that I'm smiling when I say this: No matter how diligent a job a programmer OR a tester does, we're likely to encounter things that we don't expect. We'll be in situations where we thought we understood everything, but didn't; where we encountered a problem in a place we didn't think to look; where we tested, but didn't realize that the return value was in error; tested, but didn't cover all of the possible conditions; tested, but didn't have access to a particular platform; etc., etc., etc. Sure, development processes should be improved. I can go one better than that: programmers should never make mistakes. One better than that, even: people should know everything, never lie, and always be right.
One thing I'd like to emphasize, though, is that as a tester, I constantly struggle not to tell programmers how to do their jobs, and I advise other testers to do the same. Even if the tester has direct experience in programming, or in a particular problem domain, telling someone else how to do his job (unless you're his manager) is presumptuous.
If we can address the true root cause on a strategic level, I think we can improve things a lot more than by trying to step on every bug that scurries by, one by one.
Well, good luck with that. Really. Some of us have been at it for 20 years or more. Some of us (Jerry Weinberg comes to mind) for 50. If you figure out a really good way to get mindshare, please let me know.
So, the problem here is inconsistent behavior…Surely, we would think of that sort of thing during development.
What does the empirical evidence suggest? In fact, it suggests a number of possibilities: that Microsoft's programmers didn't test; that Microsoft's testers didn't test; or that they did test, but the problem was considered too unimportant to address. Even if we did agree to be more thoughtful and diligent, there are also cost vs. value considerations. How much time can our organizations afford for us to be cognizant of all of the errors in all of our products and all of their interactions with everyone else's products?
Does that mean we shouldn't use the most effective methods available to avoid creating errors? I don't think so.
If you see me advocating against using effective methods, let me know.
Unfortunately, the majority of people who receive a paycheck for writing code don't feel that way. They seem to think bugs are someone else's responsibility. In my view, that is the root cause of the problems you've offered as illustrations.
I agree. So, your next step?
—Michael B.
Here's another example of potential Ellis Island bugs: http://ow.ly/18cC1
[quote]
This topic discusses coding issues for developing applications to run on Windows 64-bit operating systems.
When you use Visual C to create applications to run on a 64-bit Windows operating system, you should be aware of the following issues:
* An int and a long are 32-bit values on 64-bit Windows operating systems.
* size_t, time_t, and ptrdiff_t are 64-bit values on 64-bit Windows operating systems.
* time_t is a 32-bit value on 32-bit Windows operating systems.
You should be aware of where your code takes an int value and processes it as a size_t or time_t value. It is possible that the number could grow to be larger than a 32-bit number and data will be truncated when it is passed back to the int storage.
The %x (hex int format) printf modifier will not work as expected on a 64-bit Windows operating system; it will only operate on the first 32 bits of the value that is passed to it.
* Use %I32x to display an integer on a Windows 32-bit operating system.
* Use %I64x to display an integer on a Windows 64-bit operating system.
* The %p (hex format for a pointer) will work as expected on a 64-bit Windows operating system.
[/quote]
—Michael B.
Your Ellis Island bug seems to be grouping together two different common programming mistakes.
Yes… or rather, it’s grouping together a number of things, many of which can be described as programming mistakes or triggered by them.
The bug which exists in the Delphi checksides function is really an integer overflow (or underflow). At the University of Toronto this was taught in first year computer programming and reiterated in first year computer science (since first year computer programming was not mandatory). On my website is an example of overflow which had existed in Java and many implementations of binary search until 2006. Just as I did in my lecture, in my web blog (http://i-am-geek.blogspot.com/2010/02/are-most-binary-searches-broken.html), I use a clock to show how overflow and underflow can occur.
That’s a great example; thank you. The fact that this problem could have existed in so many Java programs for as long as it did suggests that Ellis Island problems can exist below our notice for a long time.
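For readers who haven’t seen it: the bug is in the midpoint calculation. Ruby’s integers never overflow, but we can simulate 32-bit signed arithmetic to show the wraparound and the standard fix (a sketch of the idea, not the Java code itself):

# Simulate 32-bit signed addition (Ruby's own integers never overflow).
def int32_sum(a, b)
  s = (a + b) & 0xFFFFFFFF
  s >= 0x80000000 ? s - 0x100000000 : s
end

low, high = 1_500_000_000, 2_000_000_000
int32_sum(low, high) / 2    # => -397_483_648   the classic (low + high) / 2 midpoint goes negative
low + (high - low) / 2      # => 1_750_000_000  the standard fix avoids the overflowing sum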
Integer overflow/underflow occurs because programmers treat integers in a computer like the concept of integers in math. However, a computer is a FINITE state machine and cannot represent all integers. When this error occurs it is usually not obvious to the programmer.
Yes. I feel obliged to make this explicit: The fact of the bug in the triangle program is not interesting. It’s an example, a contrived example, and a toy example of something else: When an error exists in the code, it is often not obvious to the programmer, and THAT is interesting. Similarly, the problem with Ruby treating an indecipherable number as zero, rather than raising an exception, in the to_i method is not interesting in and of itself. Neither is my failure to notice this behaviour; that’s a straightforward case of me being a dweeb. Other, more general things are more interesting to me: 1) We don’t know when we’re being dweebs ourselves. 2) We often don’t know when other people are being dweebs either, until we have some evidence of it. 3) The occasions upon which someone will make a mistake are unpredictable, even after you’ve accounted for their experience, their expertise, their professionalism, and so forth. 4) Testing is one workaround for (1), (2), and (3). 5) Programming languages sometimes include behaviours that are reasonable (to some), illogical (to some), confusing (to some), dangerous (to some), and that lead to errors. Specifying them and documenting them will prevent many, even most, of the failures… and the occasional failure will get through.
On the other hand, something like the Ruby to_i method is usually the result of a conscious decision to implement some behaviour in a specific way. To me this is just a programmer not properly understanding (or possibly not even reading) the specifications of a language.
I’ve been trained by Jerry Weinberg to put a red light on the top of my car and turn on the siren whenever I hear the word “just”.
The reference documentation for Ruby and its standard library (as it exists in my copy of the Pickaxe, and distinct from the tutorial portion of the book) runs to 390 pages. I don’t know anyone who’s capable of reading that volume of information from cover to cover, recalling the whole works comprehensively, and putting it to good and error-free use. In addition, I don’t know of very many people who use comprehensive, cover-to-cover reading as their style for learning a new programming language. What’s more common, very common, is for people to act upon reasonable beliefs which mostly work, whereupon they infrequently get fooled and even less frequently find out about the problem.
There is also a related topic called “undefined behaviour”. If you look at the documentation for ANSI/ISO C there are many places it refers to “undefined behaviour”. Talk to anyone knowledgeable in C programming and they will tell you that “undefined behaviour” means the implementer is free to handle that behaviour however they like. There is no right or wrong. They are even free to change the behaviour from one version of the language to the next. Programmers are discouraged from using anything documented as “undefined behaviour”. I see this as similar to the issue with the Ruby to_i method. Programmers need to read and understand the documentation.
I agree. I’ve often suggested that too, as you do. I’ve noticed that the best programmers are the ones who least need to hear that advice. Meanwhile, the worst of them pay attention neither to the documentation nor to me. How’s it working out for you?
These are known, common, first year mistakes. I see all of them as a problem between the seat and the keyboard.
Indeed they are. When they happen, though, they result in other problems, between other seats and other keyboards.
Thanks for the reply.
I think the commenters may be missing the point here.
As a tester, I need to be aware of patterns of potential failure. I need cognitive tools, such as labels, which allow me to manage that awareness, as well as to convey it to other testers.
“Ellis Island Bug” is one such label. It represents a kind of behavior that I genuinely must be on my guard against as I puzzle over the strange things that products do. If I find myself thinking, “There’s no way this product can behave this way, because I entered a number greater than zero!”, then my awareness of Ellis Island bugs will cause me to consider the possibility that my input was replaced by a zero.
The fact that you can think of some code-based alternative heuristic is irrelevant. The Ellis Island heuristic is for black box testing.