The Two Kinds of Tests

February 27, 2020

One aspect of testing that gets fairly confusing is that there are many different kinds of tests discussed in various articles, and the exact difference between those tests is often a matter of debate. A partial list of some of the categories of tests you may have read about includes:

Unit tests
Integration tests
Functional tests
Smoke tests
Acceptance tests
Model tests
Component tests
Performance tests
UI tests
End-to-end tests

Wow! You might think that all of those are necessary to have a complete test suite. The good news is that everything you need for full test coverage of your mobile app can be reduced to just two kinds of tests, both of which can be clearly defined: API tests and UI tests.

API tests call into code directly through public methods and validate the results of those calls. In other words, API tests are code invoking other code directly, and directly verifying the outcome.

UI tests, on the other hand, simulate the way a user interacts with code through the human-exposed interface (screen, voice, gestures, motions, etc.). In other words, the code under test is triggered more indirectly, by replicating the kinds of actions a user is expected to take, and validating what a user would expect to see as a result.

As covered in earlier posts in this series, tests should always be based on requirements, and every kind of meaningful requirement can be validated with either API or UI tests — no further subcategorizations or arbitrary distinctions are required. Both kinds of tests have benefits and tradeoffs. And both kinds of tests are vital for a comprehensive mobile app test suite.

API Tests

A primary benefit of API tests is that they validate logical behavior independent of any specific UI. If you consider how common it is for the same business logic to drive multiple user experiences, the value of an API test really comes into focus.

For example, the same underlying code in a mobile application will frequently drive an iPhone view, an iPad view, and possibly a view for macOS, tvOS or watchOS. API tests are the most efficient way to validate the behavior of business logic or data in a single place. And because UI tends to iterate more frequently than business logic, API tests are generally more resilient and require fewer updates than UI tests.

The other primary benefit of API tests is that they usually run very quickly. Since they don’t depend on navigating through UI or being on a specific screen in order to execute, a large number of API tests can run in seconds or minutes to validate large sets of requirements without creating a bottleneck on every commit.

Where API tests fall short relative to UI tests is when it comes to validating the actual user experience. API tests validate what certain code responds with when called from other code using certain inputs; however, in the broader context of a user’s experience this is a rather narrow slice of what is going on.

For example, an API test may validate that an error is returned in code when a too-short password is passed to a createCredentials() method. However, nothing about such an API test checks whether that error is surfaced up to some UI that the user can see or hear, or whether it allows the user to retry with a longer password, etc.
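To make the password example concrete, here is a minimal sketch of such an API test in plain Swift. Everything beyond the name createCredentials() is an assumption made for illustration: the Credentials type, the CredentialsError enum, and the 8-character minimum are hypothetical. In a real project these checks would normally live in an XCTestCase.

```swift
// Hypothetical policy value; the real minimum would come from a requirement.
enum PasswordPolicy {
    static let minimumLength = 8
}

// Hypothetical types, introduced only for this sketch.
struct Credentials: Equatable {
    let username: String
    let password: String
}

enum CredentialsError: Error, Equatable {
    case passwordTooShort(minimumLength: Int)
}

/// Code under test: rejects passwords shorter than the policy minimum.
func createCredentials(username: String, password: String) throws -> Credentials {
    guard password.count >= PasswordPolicy.minimumLength else {
        throw CredentialsError.passwordTooShort(minimumLength: PasswordPolicy.minimumLength)
    }
    return Credentials(username: username, password: password)
}

/// API test: a too-short password must produce the expected error.
func testTooShortPasswordIsRejected() -> Bool {
    do {
        _ = try createCredentials(username: "ada", password: "short")
        return false // no error was thrown, so the requirement is not met
    } catch let error as CredentialsError {
        return error == .passwordTooShort(minimumLength: PasswordPolicy.minimumLength)
    } catch {
        return false // an unexpected kind of error
    }
}

/// API test: a long-enough password must produce credentials.
func testValidPasswordIsAccepted() -> Bool {
    let expected = Credentials(username: "ada", password: "long-enough-password")
    let actual = try? createCredentials(username: "ada", password: "long-enough-password")
    return actual == expected
}
```

Note that both tests are code calling other code directly. Neither one says anything about whether the user ever sees the error, which is exactly the gap described above.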

While there are many aspects of code that are absolutely vital to test in isolation from user experience (security / encryption, data integrity, etc.), a mobile application ultimately exists to help a human being accomplish something. And tests which ignore the aspect of how well something is actually working for a human being are entirely insufficient by themselves.

UI Tests

Because humans are the most important factor to consider in creating a mobile app (unless it’s an app for cats), our most important requirements will always be based on what humans expect to see, do, or accomplish using our application. It doesn’t matter if you have a well-tested API for deleting a message if there is no user interface that allows a human to perform that task and receive some sort of feedback that it was completed.

For this reason, the majority of requirements coming from a product team will usually be user-focused. For example: “There should be a button which allows the currently displayed message to be deleted. Tapping the button should bring up a confirmation prompt which asks if the user is sure they want to delete this message. The prompt should have a button to proceed with the deletion, and a button to cancel the deletion.”

This requirement doesn’t go into great detail about how this behavior should be implemented from an API perspective, but it does a pretty good job of describing a couple UI tests that can verify the code that is ultimately implemented, and ensure that it results in the desired experience for the human end user.

Therefore, the greatest benefit of UI tests is that they validate what your actual users will be seeing, hearing, touching, or doing. This is ultimately the most important test of your code if you want the application to be of use to human beings. Failing to validate the behaviors your actual users expect is leaving a huge gap in your test suite. This may in fact be the most common gap I’ve seen across multiple teams and codebases: there are lots of code-only tests, and somehow they never seem to catch glaring bugs that users stumble over in production.

The second benefit of UI tests is that they are in some ways the ultimate abstraction of behavior. A good UI test will have absolutely no knowledge of or expectations for a specific underlying implementation in code. A UI test that validates the delete button behavior described earlier should continue passing even if the screen is changed from UIKit to SwiftUI or the entire architecture of the app is changed drastically. Ultimately, the UI test only cares that (a) it can find the delete button, (b) tapping it brings up the prompt, (c) tapping the delete button on the prompt makes the message disappear, and (d) tapping the cancel button on the prompt dismisses the prompt and leaves the message visible. All the code that makes that stuff work doesn’t matter at all to the UI test.

Side Note:

One of the most common mistakes in UI tests is using them to validate styling and design, or making them depend on specific button text, screen titles, etc. Please don't do this. A good UI test will not depend on any specific hierarchy of elements or any specific placement of those elements on screen. This is why assigning accessibility identifiers and querying for them directly is the best (and hopefully only) way your UI tests should find on-screen elements. Assume that everything other than an identifier may change: a segmented control could become a radio button, a navbar item could become a toolbar item, and the specific text on a button could change at any time (especially for localization) without in any way affecting the behavior that is being verified. Always keep UI tests as abstract and behavior-based as possible, and never coupled to a specific layout, design or type of element.
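As a rough sketch of what this looks like for the delete button requirement, here is an XCUITest case that finds everything by accessibility identifier alone. The identifier strings and the MessageScreenID enum are hypothetical, and the sketch assumes the app assigns those same (unique) identifiers to its views and launches with a message already on screen. It needs to live in an Xcode UI test target to actually run.

```swift
// Hypothetical accessibility identifiers; the app must assign the same
// strings to its views (for example via `accessibilityIdentifier`).
enum MessageScreenID {
    static let message = "message.current"
    static let deleteButton = "message.deleteButton"
    static let confirmDelete = "message.deletePrompt.confirm"
    static let cancelDelete = "message.deletePrompt.cancel"
}

// Guarded so the file also compiles outside an iOS UI test target.
#if canImport(XCTest) && canImport(UIKit)
import XCTest

final class DeleteMessageUITests: XCTestCase {
    override func setUp() {
        super.setUp()
        continueAfterFailure = false
    }

    /// Launches the app and returns a query over every on-screen element,
    /// so lookups depend on identifiers only, never on element type.
    @MainActor
    private func launchApp() -> XCUIElementQuery {
        let app = XCUIApplication()
        app.launch()
        return app.descendants(matching: .any)
    }

    /// Taps the delete button and waits for the confirmation prompt.
    @MainActor
    private func openDeletePrompt(on screen: XCUIElementQuery) {
        XCTAssertTrue(screen[MessageScreenID.message].waitForExistence(timeout: 5))
        screen[MessageScreenID.deleteButton].tap()
        XCTAssertTrue(screen[MessageScreenID.confirmDelete].waitForExistence(timeout: 5))
    }

    @MainActor
    func testConfirmingDeletionRemovesMessage() {
        let screen = launchApp()
        openDeletePrompt(on: screen)
        screen[MessageScreenID.confirmDelete].tap()
        // Wait for the message to disappear rather than checking immediately.
        let gone = expectation(for: NSPredicate(format: "exists == false"),
                               evaluatedWith: screen[MessageScreenID.message])
        wait(for: [gone], timeout: 5)
    }

    @MainActor
    func testCancellingDeletionKeepsMessage() {
        let screen = launchApp()
        openDeletePrompt(on: screen)
        screen[MessageScreenID.cancelDelete].tap()
        // The prompt should go away and the message should still be there.
        let dismissed = expectation(for: NSPredicate(format: "exists == false"),
                                    evaluatedWith: screen[MessageScreenID.confirmDelete])
        wait(for: [dismissed], timeout: 5)
        XCTAssertTrue(screen[MessageScreenID.message].exists)
    }
}
#endif
```

Nothing in these tests mentions a button title, an alert, a navigation bar, or a position on screen. Querying with descendants(matching: .any) is a deliberate choice here: the test keeps passing if the delete control changes from one type of element to another, as long as the identifier stays the same.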

Despite their vital benefits, UI tests also have some downsides. The first of these is that they tend to run slowly, and a large suite of UI tests can take hours to run. This usually comes down to the fact that UI tests attempt to emulate the way a user interacts with the application, meaning that each test requires launching the app, tapping a few times to navigate, waiting for onscreen elements to become visible, scrolling the screen, etc. On top of this, UI test tooling in general usually has implementation constraints that make it necessary to pause frequently to ensure animations are “done” and they end up running more slowly than an actual human using the app would.

Thankfully there are some ways to mitigate this problem. Perhaps the most important is parallelization — Xcode UI tests for example can run on multiple simulators or devices in parallel. Make sure to take advantage of this whenever you can, since it easily speeds up execution of a UI test suite by a large factor (often 4 to 8 times faster).

In a well-modularized application where different screens and features are defined in separate modules, it’s also often possible to set up UI tests to present a screen directly and test much of its functionality in isolation, without need to navigate through the entire app to get to it. The app target itself would certainly have UI tests to validate navigation to and away from a particular screen, but the behaviors contained entirely within a screen itself can be tested inside its own feature module directly, greatly reducing the overhead associated with navigation.

Lastly, but with a large chunk of salt: it’s often possible to turn off or speed up animations when running UI tests, so that the time spent waiting for screen-to-screen transitions, for sheets to slide up or down, etc. is reduced to nearly zero. I’ve seen this reduce UI testing time by as much as 30 to 40%. However, I don’t really recommend it because you are now testing UI in a context that does not match what the user will experience, and it’s possible (though rare) to miss race conditions or subtle bugs that only happen when animations are enabled. As much as possible, use the first two approaches to speed up UI tests rather than disabling or accelerating animations.

The other significant downside for UI tests is that they are often pretty flaky. Flaky means that sometimes they produce false failures, and this is often an issue with the test tools themselves rather than your own code. There are few things in the world of development more frustrating than a test that fails sometimes and succeeds sometimes, even though the test conditions and underlying code are identical each time. And often the solution is just to run the whole suite again and hope for no false failures.

While the flakiness issue isn’t really inherent in the concept of UI testing itself (there’s no mathematical or philosophical reason why UI tests should be flaky), it’s so common across multiple UI testing frameworks and tools (including those provided by Apple and Google) that it really merits mention as a significant downside.

There are many discussions on Stack Overflow and developer forums that address workarounds to reduce flakiness in certain scenarios and those are worth looking into. However, such workarounds are usually just temporary until the tools themselves eventually get updated to resolve the underlying issues causing the flakiness.

In any event, make sure that the implementation of such workarounds is abstracted into helper functions and never built into the actual requirements or logical steps of the test itself. You should be able to update or remove these kinds of workarounds in a single place when they change or are no longer necessary, and this should not require modifying any lines of code in the test cases themselves.
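As one possible illustration, here is a small polling helper of the kind that could hold such a workaround. The name waitUntil and its default values are assumptions made for this sketch, not part of any testing framework.

```swift
import Foundation

/// Hypothetical workaround helper: polls `condition` until it returns true
/// or `timeout` seconds have passed. Returns whether the condition was met.
func waitUntil(timeout: TimeInterval,
               pollInterval: TimeInterval = 0.05,
               _ condition: () -> Bool) -> Bool {
    let deadline = Date().addingTimeInterval(timeout)
    while true {
        if condition() { return true }
        if Date() >= deadline { return false }
        // Pause briefly so the loop does not spin at full speed.
        Thread.sleep(forTimeInterval: pollInterval)
    }
}
```

A UI test might call something like this with a closure that checks whether an element exists and is hittable before tapping it. The test case itself then reads as plain requirement steps, and if the underlying tooling issue is ever fixed, the waiting logic can be changed or deleted in this one function without touching any test.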

Is That Really It?

Yes. It really is no exaggeration to say that the only useful tests, and the only tests which must be written are those that validate specific, explicit, expected behaviors (a.k.a. requirements). And any software requirement written can be verified by either an API test or a UI test.

Of course there are all sorts of other arbitrary distinctions implicit in the different types of tests listed at the top of this article: for example “integration tests” validate the interaction between different types. But this is not really a useful distinction and you can’t say that there are too few or too many integration tests, or that any such tests that have been written are good or bad without comparing them to explicit requirements.

Ultimately, the only point of any test (whether it is a smoke test, functional test, model test, etc.) is to validate a requirement, and doing so is the only value a test can provide. Requirements can either be tested via API or via UI, and therefore this is really the only distinction that matters.

A question that comes up often is “what about unit tests?” Surely unit tests are the most important tests, since most developers have heard that stated and most projects have some set of tests that the team calls unit tests. I was originally going to address this in a sidebar, but the sidebar outgrew its space and became a full article, so please feel free to check out the longer discussion of the “what about unit tests?” question.

The short answer, though, is that unit tests have no meaningful (or useful) definition or criteria beyond what is described above for API tests, but they do discourage developers and teams from testing many important requirements for arbitrary and dogmatic reasons.

Therefore, let’s move forward focused on the only two important and distinct types of tests — API tests and UI tests — because a mobile application will need both kinds (and thankfully no others) to be considered fully and thoroughly tested. Ensure you have good requirements for everything your team codes, and that API tests or UI tests exist to validate all of those requirements, and you will soon find yourself enjoying (possibly for the first time) a simple and practical approach to testing!

Next up: See how all this comes together into a useful and practical approach to testing
