Exploratory Testing on an API? (Part 3)

In the last two posts, I’ve been answering the question

Do you perform any exploratory testing on an API? How do you do it?

Last time I described the process of learning about factors in the project environment that affect my strategy as I go about the testing of an API. This time, I’ll go deeper, learn more, and begin to interact with the product. That learning and interaction feeds back into more learning and more interaction.

Along the way, I’ll develop ideas that are reflected in the Heuristic Test Strategy Model. I get ideas about potential problems and risks based on Quality Criteria (and things that threaten them). I identify Product Factors and get ideas about coverage: how to cover the product with testing. I apply those ideas using approaches that come from families of Test Techniques.

Notice, though, that even though I’ve tagged them, these ideas don’t come to me from the HTSM as such. Ideas occur to me as I encounter elements of the product and its context. Each interaction with and observation of the product triggers ideas about risk and how to test for it. Some of these ideas come from my testing and development experience; others from stories told by my colleagues or reported in the news; still others from the product and its documentation; and those all lead to new ideas that are synthesized in the moment.

One more thing, dear reader, dear tester: you may find that this post is long. That length reflects the volume of ideas that occur in an exploratory process, but not the pace. When it’s happening, it can be fast and invigorating and exciting! This description of the process, at this level of detail, might seem like a big meal to digest at one sitting. So feel free to take it slow, to drop it and try a few things on an API you’re testing, and then come back for more later. Reading (and definitely writing!) about certain parts of testing work takes longer than doing the work and getting the ideas. And it’s a lot less fun.

Whether they are end users—getting access to the product through the GUI—or developers—getting access to the product through an API—people use software to make things happen. They have tasks that they want to get done. So what can I imagine the API’s user wanting to do?
Product Factors/Operations
Test Techniques/Scenario Testing

An API’s user wants to get some data, make something happen, or effect some change. Anything that happens or changes in the product is driven by specific functions in the code. What functions does the product perform?
Product Factors/Functions

I find out about functions by looking at the product and API documentation if it’s available, or by looking at source code if that’s available. I might not have a programmer’s fluency in the programming language, but I can usually make some sense of it—and that will help me learn the language, in the longer run.
Project Environment/Information

If I can’t get to either documentation or source code, I might ask developers or designers for help in getting them.
Project Environment/Developer Relations

I take notes about what I’m learning, perhaps in the form of a mind map; perhaps in point-form text; perhaps in a table. The process of taking notes helps to cement what I’m learning. In addition, I want to notice inconsistencies between my sources of information.

Sometimes different people tell me different things. Some people contradict what is written in the requirements documents or specifications. Sometimes requirements documents or specs are inconsistent with each other. Sometimes they’re mysteriously silent on certain things, and people fill in the gaps with beliefs or guesses. Whenever those things are happening, bugs are breeding. I’ll collect lots of specific statements about what the product should do, and I’ll look for controversy.
Test Techniques/Claims Testing

I’ll review what I already know. To what functions does the API offer access? The user of the API wants to obtain some kind of value from interacting with the product, and those values are expressible in terms of quality criteria.
Quality Criteria/Capability

What could go wrong with those functions and threaten something to do with one of those quality criteria?

What are the protocols for calling those functions?
Product Factors/Interfaces

Is there anything, like setting up or logging in, that should happen before an API call? What if they don’t happen? I’ll make a note to try that later.
Test Techniques/Flow Testing
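A note like that can be turned into a tiny experiment. The sketch below is only an illustration: `FakeSession`, `login`, and `get_orders` are hypothetical names invented here, standing in for whatever client and calls the real API offers. It skips the prerequisite on purpose and records what comes back.

```python
class FakeSession:
    """Hypothetical stand-in for an API client; not a real product."""

    def __init__(self):
        self.logged_in = False

    def login(self, user, password):
        self.logged_in = True
        return {"status": 200}

    def get_orders(self):
        # A well-behaved product refuses the call when the prerequisite is missing.
        if not self.logged_in:
            return {"status": 401, "error": "not authenticated"}
        return {"status": 200, "orders": []}


def call_without_prerequisite(session):
    """Skip login on purpose and describe what the product does."""
    response = session.get_orders()
    return {
        "status": response["status"],
        "refused": response["status"] in (401, 403),
        "leaked_data": "orders" in response,
    }


observation = call_without_prerequisite(FakeSession())
print(observation)
```

With a real product, the interesting cases are the ones where the call is not refused, or where the refusal still carries data it shouldn’t.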

Thinking about what happens before an API call reminds me to ask myself about after, too: Are there things to be cleaned up after an API call?
Product Factors/Time

Does data need to be saved or purged?
Product Factors/Data

Purged? That makes me think of rare but specific occasions at which purging might happen in the lifetime of someone using the API. What are other contexts in which people might be using the product? My testing should address those.
Product Factors/Operations

Given a set of available functions, I’ll try to use a small selection of the functions that I’m aware of.
Test Techniques/Function Testing
Test Techniques/Claims Testing

I’ll try some simple stuff using a tool that affords direct interaction; these days, that means some quick calls from IRB (the Ruby interpreter). I might use Cypress or Postman, but probably not; using a GUI makes things feel less interactive and naturalistic for me. When the API gets used, it’s going to be used by coders! I want to maximize my experience of what they will experience.

Does the product appear to do what it should? Can it work?
Quality Criteria/Capability

I’ll start simple with representative inputs; then I’ll explore around the extents and limits that I’ve been told about or that are documented. To the degree that they haven’t been documented, I’ll try to find out what the extents and limits are.
Test Techniques/Domain Testing
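As a small illustration of exploring around documented limits, here is a sketch in Python (chosen only for the example; the same thing is easy from IRB). The function name `boundary_values` and the 1-to-100 range are assumptions made up for this sketch, not anything from a particular product.

```python
def boundary_values(low, high, step=1):
    """Return values just outside, on, and just inside each documented limit."""
    candidates = [
        low - step, low, low + step,
        high - step, high, high + step,
    ]
    # Drop duplicates (possible when the range is narrow) while keeping order.
    seen = []
    for value in candidates:
        if value not in seen:
            seen.append(value)
    return seen


# Assumed example: a field documented as accepting 1 to 100 inclusive.
print(boundary_values(1, 100))  # [0, 1, 2, 99, 100, 101]
```

A list like this is only a starting point; when the limits are undocumented, the responses to these values are themselves a way of discovering where the limits really are.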

I’ll put in some preposterously long strings, or large quantities, or deliberately malformed input. I’ll leave out mandatory fields in the call; put in extra fields that I’ve made up; provide letters where only numbers are anticipated; feed in symbols or accented characters where only normal letters are anticipated. If the API accepts JSON, I’ll feed it a dinner of invalid JSON. If it’s a Web app, I might focus on a smattering of characters that are associated with cross-site scripting, SQL injection, or other security risks.
Quality Criteria/Security
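Several of these variations can be produced mechanically from one known-good payload. The following is a minimal sketch, assuming a JSON API; the payload, the field names, and the helper `payload_variants` are all hypothetical. It only builds the request bodies, leaving the sending and the observing to whatever tool is at hand.

```python
import json


def payload_variants(valid, mandatory):
    """Yield (label, raw_body) pairs derived from one known-good payload."""
    yield "valid", json.dumps(valid)

    # Leave out each mandatory field in turn.
    for field in mandatory:
        trimmed = {k: v for k, v in valid.items() if k != field}
        yield f"missing:{field}", json.dumps(trimmed)

    # Add a field the API has never heard of.
    yield "extra-field", json.dumps({**valid, "made_up_field": "surprise"})

    # Letters where numbers are expected; long or accented text in string fields.
    for key, value in valid.items():
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            yield f"letters-for-number:{key}", json.dumps({**valid, key: "abc"})
        elif isinstance(value, str):
            yield f"long-string:{key}", json.dumps({**valid, key: "A" * 100_000})
            accented = {**valid, key: "Zoë Müller-Ångström"}
            yield f"accented:{key}", json.dumps(accented, ensure_ascii=False)

    # Finally, bodies that are not valid JSON at all.
    yield "truncated-json", json.dumps(valid)[:-1]
    yield "empty-body", ""


valid_payload = {"name": "Ada", "quantity": 3}
variants = list(payload_variants(valid_payload, mandatory=["name", "quantity"]))
for label, body in variants:
    print(label, len(body))
```

The labels matter as much as the bodies: when a response looks odd, the label says which variation provoked it.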

If I do something that should trigger error handling, does error handling happen? Does the product return an error code? After that, does it continue to do what it should do?
Test Techniques/Claims Testing
Quality Criteria/Reliability/Robustness

I pay special attention when the product appears to contradict the documentation or what someone has told me.
Test Techniques/Claims Testing

At this point, if I find problems, I note them. Even if I don’t, I start to gather some ideas about how I might use tools and automated checks to vary the input and cover big sets of possible inputs when it’s time for deep testing. I try to find out whether the programmers are using or have considered “contract testing”—checking against a set of agreed or published inputs and outputs.
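To make the idea of checking against agreed outputs concrete, here is a deliberately small sketch. It is not a contract-testing framework; `check_contract`, the contract, and the response are invented for illustration, and a real contract would usually cover far more than field names and types.

```python
def check_contract(response, contract):
    """Compare a response dict with an agreed {field: type} contract.

    Returns a list of human-readable problems; an empty list means the
    response matched the contract as far as this check can tell.
    """
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            actual = type(response[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    for field in response:
        if field not in contract:
            problems.append(f"unexpected field: {field}")
    return problems


# Hypothetical contract and response, invented for illustration.
contract = {"id": int, "name": str, "active": bool}
response = {"id": "42", "name": "Ada", "debug": True}
print(check_contract(response, contract))
```

A check like this reports that something differs from the agreement; deciding whether the difference is a problem in the product, the contract, or neither still takes a person.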

I pause and remember that there’s a difference between the product doing what it should and appearing to do what it should. Every step of the way, I’d like to be able to observe as much as I can of the product’s state. Are there any special API function calls that I can use to examine or control the state of the product before, during, and after the usual calls?
Quality Criteria/Development/Testability

Certain calls that developers use for diagnostics or debugging may not be noted in the public documentation; I may have to ask developers about them.
Project Environment/Developer Relations

If I have access to the source code, I might browse for them myself. Even if there are no undocumented functions, I’m learning about the code. And I might get ideas for adding diagnostics or debugging interfaces.
Project Environment/Information

Thinking about testability like this reminds me to ask: Does the product offer logging? I have a look. Is the format of the logs consistent with other logs in the same system? Consistent log files will help to make my testing go more quickly and easily, which helps me to get to deeper testing.
Quality Criteria/Development/Testability

If there is no logging and there is resistance from the developers or from management to putting it in, I’ll keep that in mind and note it for the future, when I design automated checks. If there’s no logging in the product, the logs for my checks may have to be more specific and detailed than if there were.

I’ve wandered from thinking about functions to testability, so I pull on the leash to bring my mind back to functions again. What services do the functions perform for the rest of the system? What are the bits and pieces of the system?
Product Factors/Structure

If I haven’t started one already, I’ll sketch a diagram of the product’s structure, and use that to help guide my understanding of the whole product. As my understanding develops, I’ll share the diagram with others to determine whether we’re all on the same page, and to help to maintain our awareness of the product’s structure. I might also post a copy up in a prominent place and invite people to let me know as soon as they see something wrong. Every now and again, I’ll deliberately review the diagram on my own and with other people.

Thinking about the parts of the system reminds me to ask myself: Upon what parts of the whole system do those functions depend? In Rapid Software Testing, something that our product depends upon that is outside the control of our current project is called a platform.
Product Factors/Platform

Platforms might include the operating system, the browser, application frameworks, third-party libraries, or stuff that was built in-house on another project. Platforms might also include hardware elements: displays, keyboards, mice, touchscreens, printers, or other peripherals. I’ll develop lists of platform dependencies, preparing myself for deeper coverage later, or helping others who aren’t testing via the API.

I return to exercising the functions. Most functions do something to something: data. What data can I supply via the API? In what format? What responses will the API provide? How are responses structured?
Product Factors/Data

Answers to the questions I’m asking will influence my selection of tools for deeper testing. I’ll keep a running list of tools and tool ideas.
Project Environment/Equipment and Tools

Is the API providing access to a database? If so, and I’m focused on testing the functions in the API, I might be able to use direct access to the database as an oracle.
Test Techniques/Function Testing
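Here is a minimal sketch of the database-as-oracle idea, using an in-memory SQLite database so that it runs anywhere. The table, the record, and `api_get_customer` are hypothetical; the stand-in API function contains a planted bug so that the comparison has something to find.

```python
import sqlite3


def api_get_customer(conn, customer_id):
    """Hypothetical stand-in for the API call under test.

    It contains a deliberate bug: it upper-cases the name it returns.
    """
    row = conn.execute(
        "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1].upper()}


def oracle_row(conn, customer_id):
    """Read the same record straight from the database."""
    row = conn.execute(
        "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1]}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")

from_api = api_get_customer(conn, 1)
from_db = oracle_row(conn, 1)
mismatches = {k: (from_api[k], from_db[k]) for k in from_db if from_api[k] != from_db[k]}
print(mismatches)  # {'name': ('ADA', 'Ada')}
```

Like any oracle, this one is fallible: a mismatch might be an intended transformation in the API rather than a bug, so it is a prompt for a question, not a verdict.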

Thinking about databases prompts me to think about stored data and its quality. I focus for a moment on data quality and reliability, and on what data I might want to use or generate both for interactive testing and for automated checks.
Test Techniques/Domain Testing
Test Techniques/Automated Checking

If I want to examine risk related to data quality, going directly to the database might be more efficient than trying to get at the data through the API’s functions. Of course, I could do a smattering of both, for which I’ll need different tools.
Project Environment/Equipment and Tools

Although I’m doing it in a somewhat more painstaking way, as I’m learning to use the API, I’m doing what one of its users—typically a developer—would do to learn about it.
Test Techniques/User Testing

Just like its intended user, I make mistakes. Some of these are intentional; lots happen unintentionally, too, because as a normal, non-superhuman, I make mistakes all the time. I try to remember to log those mistakes, and note the struggle of trying to solve them. My assumption is that managers and the team want to be aware of anything that could annoy, frustrate, delay, or interfere with anyone using the API successfully.

How might it be difficult for a developer to use this API correctly? How easy might it be to use it incorrectly, or in a way that provides incorrect, misleading, ambiguous, confusing, or undesired results?
Quality Criteria/Usability
Test Techniques/Risk-Based Testing

This is key: APIs are there for programmers to use in ways that help people. I imagine a set of programming tasks that I might want to accomplish using the API. At the same time, I imagine a person performing those tasks; a role. It’s quite possible that not all users of the API will be expert programmers, so it makes sense for me to create a “novice programmer” role.
Test Techniques/User Testing

What if I make the kinds of mistakes a novice programmer might make, and ask for something in a way that the API doesn’t support? What if I hand it pathological or malformed data; data structures with missing elements, dates in the wrong format, or characters that might be used in countries other than my own? Emojis that might not be filtered by the front end?
Quality Criteria/Localizability

Does the product report an error? Does it do so in unhelpful ways? Will the novice understand it? A potential misunderstanding might point to a problem in the product or in the documentation for it.

How well do things get cleaned up after an error condition? Are error conditions handled gracefully?
Quality Criteria/Reliability/Error Handling

How are error conditions relayed to the user of the API—the software calling it, and the human writing that software? Are there error codes to make troubleshooting easier?
Quality Criteria/Development/Supportability

Are there human-readable strings returned directly, or in data structures like JSON or YAML or XML? Are the error codes and messages consistent with the documentation?
Quality Criteria/Usability

Are error codes and messages reasonably elegant, helpful, and efficient at conveying useful information? Or are they clunky, confusing, misspelled, or embarrassing in any way?
Quality Criteria/Charisma

Some APIs are used for atomic interactions; single calls to a service followed by single responses from it. Other APIs work more like a conversation, retaining data or maintaining the product in a certain state from one call to the next. A conversation has its risks: what happens when I interrupt it? What happens if I leave out a particular call or response?
Test Techniques/Flow Testing
Test Techniques/Risk-Based Testing

What if I varied the sequence of interactions in a conversation? Programmers forget steps sometimes, and get things out of order. Things can happen asynchronously, too.
Product Factors/Time
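Varying the order of calls is easy to mechanize for a short conversation. The sketch below assumes a hypothetical three-step API (`FakeCart` with `open`, `add_item`, and `checkout`, all invented here, with a planted gap) and tries every ordering against a fresh instance.

```python
from itertools import permutations


class FakeCart:
    """Hypothetical stateful API: open a cart, add an item, check out."""

    def __init__(self):
        self.opened = False
        self.items = 0
        self.paid = False

    def open(self):
        self.opened = True

    def add_item(self):
        if not self.opened:
            raise RuntimeError("cart not open")
        self.items += 1

    def checkout(self):
        # Deliberate gap for the example: nothing stops an empty checkout.
        if not self.opened:
            raise RuntimeError("cart not open")
        self.paid = True


def try_sequence(steps):
    """Run one ordering of calls against a fresh cart and describe the result."""
    cart = FakeCart()
    try:
        for step in steps:
            getattr(cart, step)()
    except RuntimeError as error:
        return f"rejected: {error}"
    return f"accepted: items={cart.items}, paid={cart.paid}"


orderings = permutations(["open", "add_item", "checkout"])
results = {order: try_sequence(order) for order in orderings}
for order, outcome in results.items():
    print(" -> ".join(order), "|", outcome)
```

In this toy, the ordering that checks out before adding an item is accepted. Whether that would be a problem is a question for the people who matter; the point is that the experiment surfaces it cheaply.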

Has anyone made any claims about timeouts, either informally or in the documentation?
Test Techniques/Claims Testing

How long will the system let me leave a conversation dangling?
Product Factors/Time

What happens to the internal state of the system or to the data that’s being managed after a conversation has been abandoned for a while? What happens when I try to pick up a previously abandoned conversation?
Quality Criteria/Reliability

Thinking about sequence leads me to think about time generally. How long does it take for the system to process an API call and return a result?
Product Factors/Time

Tools help make the round-trip time easily visible—but I don’t trust evaluating performance based on the first result. Because the tool makes it so easy, I often do several quick tests to observe results from the same call several times. Small loops of test code handle this nicely, logging not only outcomes but timing. If there’s substantial variation, I make a note of it; it might be a bug, or it might be a target for deeper testing.
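A small loop of that kind might look like the following sketch. `fake_api_call` is a hypothetical stand-in that just sleeps briefly; in practice it would be replaced by a real request to the product. The helper names are likewise made up for the example.

```python
import statistics
import time


def time_calls(call, repetitions=10):
    """Call `call` repeatedly, returning each outcome with its duration in seconds."""
    observations = []
    for _ in range(repetitions):
        started = time.perf_counter()
        outcome = call()
        elapsed = time.perf_counter() - started
        observations.append((outcome, elapsed))
    return observations


def summarize(observations):
    """Boil the timings down to a few numbers worth a glance."""
    durations = [elapsed for _, elapsed in observations]
    return {
        "count": len(durations),
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def fake_api_call():
    """Hypothetical stand-in; replace with a real request to the product."""
    time.sleep(0.001)
    return 200


observations = time_calls(fake_api_call, repetitions=5)
print(summarize(observations))
```

Keeping the outcome alongside each timing makes it possible to notice when a fast response is fast only because it was an error.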

Has anyone made explicit claims about performance?
Project Environment/Information

Even though I might refer to those claims, I don’t need them to do performance testing; I’ll operate the product, collect observations about performance, and include those observations in my report.

Inconsistency within the product is a powerful oracle. In addition to the API, there may also be human user interfaces: a GUI, or a command line interface. I’ll exercise those interfaces too, and look for inconsistencies between the way they behave and the way the API behaves.
Product Factors/Interfaces

Considering the human interfaces reminds me of other interfaces. The API interacts with certain components in the product, and the product inevitably interacts with other products: the operating system, the file system, third-party frameworks and libraries, probably networks; maybe printers, ports, Bluetooth, near-field radio… These interfaces afford access to the platforms upon which the product depends.
Product Factors/Platform

What if one of the platform dependencies were missing? What if I deprived the product of something it needs by preventing access to files or databases?
Test Techniques/Stress Testing

Would the product stay up under stress? Would its behaviour or accuracy vary, or otherwise become undependable?
Quality Criteria/Reliability

Thinking about reliability prompts me to think about how the product might behave when I place it under load.
Quality Criteria/Performance

Could I stress the product out by overwhelming it with large amounts of data in specific fields, or with an extreme rate of calls?
Test Techniques/Stress Testing

After a certain point, I’ll almost certainly need tools for stress testing, so: build, ask for help from developers, or buy?
Project Environment/Equipment and Tools

I could use those tools to do the same checks over and over again, comparing the product’s output to a specified, hard-coded result. But I could broaden my coverage by varying the inputs for the checks, and using a parallel algorithm to compute the desired result. Exploring the product prepares me to identify those opportunities.
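As a sketch of varying the inputs and computing the desired result with a parallel algorithm: below, `product_sort` is a hypothetical stand-in for the product (with a planted bug), and `reference_sort` is an independent, deliberately simple implementation used as the oracle. The names, the input ranges, and the choice of sorting as the behaviour are all assumptions made for illustration.

```python
import random


def product_sort(values):
    """Hypothetical stand-in for the product's behaviour, with a planted bug:
    it drops duplicate values while sorting."""
    return sorted(set(values))


def reference_sort(values):
    """Independent, deliberately simple computation of the desired result
    (an insertion sort)."""
    result = []
    for value in values:
        index = 0
        while index < len(result) and result[index] <= value:
            index += 1
        result.insert(index, value)
    return result


def run_varied_checks(runs=200, seed=1234):
    """Feed both implementations the same random inputs; collect disagreements."""
    rng = random.Random(seed)  # fixed seed so a disagreement can be reproduced
    failures = []
    for _ in range(runs):
        values = [rng.randint(0, 9) for _ in range(rng.randint(0, 6))]
        if product_sort(values) != reference_sort(values):
            failures.append(values)
    return failures


failures = run_varied_checks()
print(len(failures), "disagreements; first:", failures[:1])
```

A single hard-coded check with distinct values would pass here forever; it is the variation in the inputs that exposes the duplicate-dropping behaviour.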

After a session or two of this kind of activity, I’ve learned a lot of useful stuff about the product. I might already be starting to develop code that mimics an intended user’s task. My learning allows me to make much better decisions about where I might want to apply automated checks to address particular risks; how I might want to induce variation in those checks; how I might want to extend ideas about checking.

Based on the length of this post, it may seem like this exercise is necessarily a long process. It doesn’t have to be. Code takes time to develop. Quick little informal experiments can be fast and cheap. Ideas about risk, coverage, quality criteria, and oracles can be developed much more quickly and less expensively than code can. Writing about these things, as I’ve done here, takes far longer than thinking them through! Yet taking notes as I go often helps by inducing pauses in which more ideas occur to me.

This is testing! I’m not only testing to find bugs here; I’m testing to learn how to test the product. If I’m going to use automated checking as part of a strategy for learning specific things about the product quickly, I must test to learn how to develop those checks. This is an exploratory process! Notice how often each idea above is connected to the one before. Testing is fundamentally exploratory.

Do I do exploratory testing on an API? Yes, because if I’m testing it, I’m doing exploratory work! If I’m doing anything other than that, it’s probably automated checking, and I can’t begin to do excellent checking until I’ve done that exploratory work.

I’ll have some follow-up notes in the final post in this series.

4 replies to “Exploratory Testing on an API? (Part 3)”

  1. My experience with API testing is a bit different, so I thought it worth sharing. APIs are built for UIs (mobile, web, desktop), machines (servers) or IOT devices (thermostat, smart fridge, tv, etc.) I’m aware that software is written by developers, but it also changes the focus of testing.

    Michael replies: Right! The builder mindset and the tester mindset are different from one another. When they switch from a programming role to a testing role, programmers will tend to focus on programmer-oriented problems. That’s both a feature and a bug. It underscores the importance of programmer testing, and why a diversity of testing mindsets is valuable.

    With that respect, it’s important that the API is providing a compelling business case, a fluent way of manipulating data to provide, in the end, a valuable customer experience.

    As far as REST is concerned, the new technologies and frameworks are smart enough to allow engineers to focus on what’s important. There are guidelines of what should happen in case of failures, status codes, etc so the business becomes prio in this phase as well.

    Yes. Consistency with relevant standards (laws, regulations, guidance documents, conventions) is a powerful oracle that might point to problems in usability, supportability, and testability.

    Be careful with saying that the tools are smart enough, though. The tools may be useful and powerful, but it takes a human to be smart.

    One aspect I would add to your full list is the problem of semantics and wording. How do you choose the correct words and vocabulary that define your API? Again, in my experience, this is a source of multiple evils. Do you implement a /me endpoint, or is it self-explanatory that you should be authenticated? Do you make resources/data configurable or embed it as part of Create operations?
    Implementation versioning and data versioning bring complexity to all of the above. Is your API backwards compatible? Are there multiple representations of the same data? These are some questions that need to be answered.

    Yes. Notice what was missing from my exercise so far? A couple of general quality criteria (scalability and installability) and a couple of development-related quality criteria (portability and maintainability). You’ve provided some excellent examples of maintainability here. Each of the design decisions you’ve pointed to presents alternatives; it’s a good idea for the tester (or a programmer in a testing role) to be aware of those alternatives and to ask if the current choice is okay if there’s any reason to believe it might not be.

    Headers (if HTTP) are worth taking into consideration.

    Postman could be useful, but it also hides some details of the request. Here’s where good old curl turns out to be much more successful.

    Yes; every tool comes with assets and liabilities; every tool extends some capabilities and constrains others.

    Additional Quality Criteria: consistent, reliable, guessable

    For us (and in the HTSM), consistency is part of reliability; “guessable” would probably fall under usability. But hey, if you want to make them top-level categories in your model, that’s cool. We encourage testers (and programmers!) to develop their own sets of quality criteria; to apply them; to share and compare so that we can learn from each other. You’ve done that here—thank you!

  2. Fantastic post Michael! This is an example of how I use to do API testing. We will definitely incorporate feedback from your approach.

    This is an example of my approach with ‘API testing’ a software library called ‘hystrix’ by Netflix. Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems and to prevent cascading failures for distributed systems.

    My team and I discussed the key features and software patterns that need to be surveyed. We created a list of features and patterns that we will use to assess the hystrix library to see if it meets our needs. Some of the features and patterns that were surveyed in the context of the hystrix library:

    1. Use of Timeouts
    2. Fail Fast for incoming requests
    3. Circuit breaker Patterns
    4. Bulkhead Patterns
    5. Handling dead connections
    6. etc….

    If the library passed our initial survey, then look at it deeper by implementing the key fault tolerant patterns that we want to support. The driver behind this initiative was our business asked us to support less than 5 minutes of total outage per month ( how we measure this is another conversation )

    This library was recommended by a trusted team so we reduced the amount of diligence around ( security reviews, maintainability by open source developers, etc…)

    We discovered that it was very easy to learn how to use the hystrix library because the demo application implemented some of the failure and timeout patterns we cared about. We continued learning and surveying the key features and then had a discussion if we should proceed by writing code for a quick pilot. Since the hystrix library passed our survey, we proceeded with the pilot. As part of our evaluation, we were trying to determine the overall learning barrier and difficulty of adoption of the library. So, we picked a senior developer and junior developer to use the library to implement two of our features. As part of this process, we discovered that the junior developer had issues implementing his solution because the documentation in the help guide didn’t exist while the senior developer, based on experience, had no issues.

    One of our most important principles is ‘governance through code’ which basically means do not place a burden on our engineers to implement all of our recommended practices. Let’s make it easy for them to do the right thing. So, we decided to create tailored templates to determine if we can write most of the code in place to support our fault tolerant practices using the hystrix library. We wrote several exemplars and then gave it to the junior developer again and he was able to write an implementation.

