The Test Pyramid In Practice 2/5

20/09/2018, by Lyman GILLISPIE and Jérôme Van Der Linden

In the previous article, we discussed the theory of the Testing Pyramid, a testing strategy to ensure our application’s quality at a reasonable cost. Notably, we discussed the notion of feedback, and the importance of having fast, accurate, and reliable feedback. Unit tests typically address these criteria for a modest investment. Through this article we’ll develop a concrete example to explore the use of automated unit tests and try to answer some of our readers’ recurring questions.

This article originally appeared on our French Language Blog on 26/06/2018.

Application

"The difference between theory and practice is that in theory there is no difference between theory and practice, but in practice there is one."
Jan Van de Snepscheut

Let's move to the practical part. To do this, and to complete our overview of tests, we will take the example of microservices. Of course this choice isn’t entirely random: microservices are intended to be as autonomous as possible (team, coupling, deployment, etc.) and this autonomy is enabled through testing: integration and end-to-end tests are not entirely appropriate if we want to continuously deploy our service independently of others.

Example

The following diagram succinctly describes the architecture of our example:

We’ve decided to create a set of services to search and book trips by train, but rather than use the API from the France National Railway company, SNCF, we picked a Swiss Open API available on: https://transport.opendata.ch/. The latter will provide us the routes and schedules.

The Connections Lookup service is a facade over this API which allows decoupling from this external service. Its purpose in this article is mainly educational, but we will come back to it.

And finally the heart of the system, the Journey Booking service is in charge of searching for routes and recording them in a database.

The endpoints are:

GET /journeys/search?from=...&to=...
allows searching for available routes, though not booked trips (this is the entry point for the Lookup service).
GET /journeys
gives the list of all reserved trips
GET /journeys/{id}
gives the trip whose id is passed in the path
POST /journeys
allows you to book a trip
PUT /journeys/{id}
allows you to change a trip reservation
DELETE /journeys/{id}
deletes the trip whose id is passed in the request

The last 5 endpoints will interact with a database (in our case, Postgres).

Our Booking microservice is structured as in the following diagram. It is very standard, the example being simple and the business logic minimal. It would be reasonable to do everything in the Controller, but for the sake of the example we’ll keep our Service layer and see where it leads us.

From a technology point of view, we’ll use a standard: Spring and its ecosystem. There are many tools available to test Spring and it’s good to be clear on what to use and when. The complete project is available on gitlab.

Unit tests

We’ll start at the base of the pyramid with unit tests. A unit test aims to validate a single behaviour (i.e. a method or a subset of a method) resulting from a business use case in isolation from the rest of the world:

other objects: instantiation, attributes, parameters, etc.
other systems: a database, a web service, system time, etc.
other tests: order of tests, test data

Some will say it isn’t necessary to isolate everything. In Working Effectively with Unit Tests, Jay Fields introduces the notions of sociable tests and solitary tests. Personally, I’m in favor of isolating as much as possible to avoid any interference. For simplicity, our unit tests are independent of any external input/output, i.e. databases, file systems, networks, etc.

To do this, we use what some call plugs, others stubs, mocks, or fakes -- what’s called a Test Double in the literature. It’s an object that we completely control which will stand in for a dependency of our object under test. It allows us to validate different behaviors depending on values returned from the double -- for example the happy path, edge and corner cases, and errors.

While it’s possible to develop test doubles by hand, there are also many libraries available that simplify their implementation: Mockito, EasyMock and JMockit are the best known in the Java world.
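To make the idea of a hand-written test double concrete, here is a minimal sketch in plain Java, with no mocking library. The names ConnectionClient and JourneySearch are hypothetical and are not taken from the example project, and returning an empty list on failure is only an illustrative policy. The double is a simple lambda that we fully control, which lets us exercise both the happy path and an error case.

```java
import java.util.List;

// Hypothetical dependency: the real implementation would call a remote service.
interface ConnectionClient {
    List<String> findConnections(String from, String to);
}

// Hypothetical unit under test.
class JourneySearch {
    private final ConnectionClient client;

    JourneySearch(ConnectionClient client) {
        this.client = client;
    }

    // Returns the connections found, or an empty list if the lookup fails
    // (an illustrative policy, not necessarily the project's).
    List<String> search(String from, String to) {
        try {
            return client.findConnections(from, to);
        } catch (RuntimeException e) {
            return List.of();
        }
    }
}

class HandWrittenStubDemo {
    public static void main(String[] args) {
        // Happy path: the stub returns a canned answer.
        ConnectionClient happy = (from, to) -> List.of(from + " -> " + to);
        System.out.println(new JourneySearch(happy).search("Lausanne", "Geneva"));
        // prints [Lausanne -> Geneva]

        // Error case: the stub simulates an unavailable remote service.
        ConnectionClient failing = (from, to) -> {
            throw new IllegalStateException("service unavailable");
        };
        System.out.println(new JourneySearch(failing).search("Lausanne", "Geneva"));
        // prints []
    }
}
```

Because the dependency is passed through the constructor, swapping the real client for a double requires no framework at all; a library such as Mockito mostly removes the boilerplate once the doubles become more numerous.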

What to test?

If we look at our previous schema, we would unit test each of the objects that make up our component:

In truth, since the Client is implemented with the Feign library, there is no real code to test:

See the GitHub link

Likewise for the Repository part, which is based on Spring Data and therefore has no code:

See the GitHub link

We will come back to these two elements in our integration tests since our objective is not to test the underlying frameworks, which are already well tested elsewhere.

So, we now have the following schema:

Here is an excerpt from the Service (Gitlab link):

See the GitHub link

And an excerpt of the Controller (Gitlab link):

See the GitHub link

As we said before, the Controller is almost a simple utility layer, almost.

Utility layers

A common question that many readers ask us is "Is it worth it to test a utility layer?", which we answer with another question: "Is it worth it to have this utility layer?". Often these layers exist only to enforce a layered pattern, and have no purpose other than to be there "just in case".

In general, the practice of TDD (Test Driven Development) helps us to avoid this. Without going into the details of a practice that would be worth a complete article, TDD aims to specify expected behavior via a test before actually implementing it. So we first write a test and then the simplest possible code that allows the test to pass and therefore satisfy the specified behavior. This avoids over-design and "just in case" layers, and focuses on the simplest code that provides value quickly.

In our example, though the controller seems to have little code, it still has two responsibilities: to expose Data Transfer Objects (DTOs) instead of entities and to expose the API via the use of annotations. The code (though minimal) will be tested individually and we will test the exposure (url mappings, error code management, etc.) in the component tests.
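As an illustration of the first responsibility, the sketch below shows what an entity-to-DTO conversion and its unit-level check could look like in plain Java. The classes JourneyEntity, JourneyDto and JourneyMapper, and their fields, are hypothetical; the actual project may structure this differently.

```java
// Hypothetical persistence entity: includes a field the API should not expose.
class JourneyEntity {
    final long id;
    final String departure;
    final String arrival;
    final String internalNote;

    JourneyEntity(long id, String departure, String arrival, String internalNote) {
        this.id = id;
        this.departure = departure;
        this.arrival = arrival;
        this.internalNote = internalNote;
    }
}

// Hypothetical DTO: only what the API exposes.
class JourneyDto {
    final long id;
    final String from;
    final String to;

    JourneyDto(long id, String from, String to) {
        this.id = id;
        this.from = from;
        this.to = to;
    }
}

// The small piece of code worth a unit test: the conversion itself.
class JourneyMapper {
    static JourneyDto toDto(JourneyEntity entity) {
        return new JourneyDto(entity.id, entity.departure, entity.arrival);
    }
}

class JourneyMapperDemo {
    public static void main(String[] args) {
        JourneyDto dto = JourneyMapper.toDto(new JourneyEntity(42L, "Bern", "Zurich", "internal"));
        if (dto.id != 42L || !dto.from.equals("Bern") || !dto.to.equals("Zurich")) {
            throw new AssertionError("unexpected mapping");
        }
        System.out.println(dto.id + ": " + dto.from + " -> " + dto.to);
        // prints 42: Bern -> Zurich
    }
}
```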

Private methods

Another recurring question among our customers is "Is it necessary to test private methods, and if so, how?".

The extreme answer is "no": If you do TDD, private methods only appear after the refactoring step (red / green / refactor) and are therefore indirectly tested through public methods.
A more pragmatic answer is "no, but": on legacy code, testing private methods can be a short-term way to put a test harness around a class before refactoring it (i.e. to reduce complexity, too much responsibility, too many dependencies etc.). Spring provides a utility class (ReflectionUtils) to simplify the writing of such tests. In the long term, after refactoring, these tests should be removed and replaced by public method tests.
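For the "no, but" case, here is a rough sketch of what such a temporary test could look like. To stay free of dependencies it uses the JDK’s java.lang.reflect directly rather than Spring’s helper; LegacyFareCalculator and its fare rule are hypothetical.

```java
import java.lang.reflect.Method;

// Hypothetical legacy class with logic buried in a private method.
class LegacyFareCalculator {
    int totalInCents(int distanceKm, boolean firstClass) {
        return baseFare(distanceKm) * (firstClass ? 2 : 1);
    }

    private int baseFare(int distanceKm) {
        return 500 + 20 * distanceKm;
    }
}

class PrivateMethodProbe {
    // Invokes the private baseFare method reflectively.
    // Reflection failures are rethrown unchecked to keep the test code short.
    static int callBaseFare(LegacyFareCalculator target, int distanceKm) {
        try {
            Method method = LegacyFareCalculator.class.getDeclaredMethod("baseFare", int.class);
            method.setAccessible(true);
            return (Integer) method.invoke(target, distanceKm);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not invoke baseFare", e);
        }
    }

    public static void main(String[] args) {
        int fare = callBaseFare(new LegacyFareCalculator(), 10);
        if (fare != 700) {
            throw new AssertionError("expected 700 but got " + fare);
        }
        System.out.println("baseFare(10) = " + fare);
        // prints baseFare(10) = 700
    }
}
```

Note that the method name is a plain string here: renaming baseFare breaks the test at run time rather than at compile time, which is one more reason to treat this kind of test as scaffolding.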

100% coverage or nothing

With tools such as JaCoCo, Cobertura or Clover, it’s possible to determine how much of our code is reached / covered when running tests. Beyond this simple indicator, these tools allow us to see which parts of the code the tests have exercised and, especially, which parts they haven’t. We can then check whether the critical paths of our application are tested.

We should take care when relying on code coverage as an indicator, because it can be misleading: it’s certainly possible to execute 100% of the code without testing anything (by not asserting anything, for example). Don’t aim for 100%; instead, begin by focusing on the critical parts of the application, and keep track of your code’s coverage trend. Is it increasing? Decreasing? If you want to go further, it’s possible to apply mutation testing, which modifies the business code, more or less randomly, and verifies that at least one test then fails. If the tests continue to pass, it’s likely that they don’t effectively validate the code. The Pitest framework can automate this in Java.
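The gap between coverage and mutation testing can be shown by hand with a toy example. In the sketch below, the business rule, the mutant and both "tests" are hypothetical and written in plain Java; a tool such as Pitest generates and runs mutants automatically, which this sketch does not attempt to reproduce.

```java
import java.util.function.IntPredicate;

class CoverageVsMutation {
    // Hypothetical business rule: a booking is allowed for 1 to 9 passengers.
    static boolean original(int passengers) {
        return passengers >= 1 && passengers <= 9;
    }

    // A mutant of the kind a mutation tool might produce: ">= 1" became "> 1".
    static boolean mutant(int passengers) {
        return passengers > 1 && passengers <= 9;
    }

    // "Test" without assertions: executes every line of the rule but checks nothing.
    static boolean testWithoutAssertions(IntPredicate rule) {
        rule.test(1);
        rule.test(10);
        return true; // can never fail
    }

    // Test that asserts on the boundaries of the rule.
    static boolean testWithAssertions(IntPredicate rule) {
        return rule.test(1) && rule.test(9) && !rule.test(0) && !rule.test(10);
    }

    public static void main(String[] args) {
        // The assertion-free test passes on the mutant: the mutant survives.
        System.out.println("without assertions, mutant passes: "
                + testWithoutAssertions(CoverageVsMutation::mutant)); // prints true
        // The asserting test passes on the original code...
        System.out.println("with assertions, original passes: "
                + testWithAssertions(CoverageVsMutation::original)); // prints true
        // ...and fails on the mutant: the mutant is killed.
        System.out.println("with assertions, mutant passes: "
                + testWithAssertions(CoverageVsMutation::mutant)); // prints false
    }
}
```

Both tests give the same line coverage for the rule, yet only the second one notices that the behavior changed.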

For example, the following report indicates that JourneyService (after removing all assertions) is entirely covered by tests, but these tests score rather poorly with respect to mutation coverage.

Example of an "incomplete" test:

See the GitHub link

And the associated report:

Implementation of unit tests

We’ll use JUnit, AssertJ, and Mockito to implement our tests. Notice that there’s no Spring at this level of the pyramid. Here’s an excerpt from our tests for the JourneyService (Gitlab link):

See the GitHub link

Several things to note in this code:

1. The test methods have explicit names. If a test fails, we’ll know very quickly what the source of the problem is. There’s no universal convention, but I advise that you adopt the following nomenclature, which is verbose but unambiguous:
  unitUnderTest_ShouldExpectedBehavior_WhenInitialState
  You may prefer a different convention, but test code must be at least as readable as the business code. So long as it is understandable, the test code documents what your application actually does better than any documentation.
2. To aid readability, you can use the following standard structure in your test code:
  - Preparation of the test environment and initialization of input data.
  - Execution of the behavior you want to test (usually a method).
  - Verification of the results and side effects.
  Personally, I use comments borrowed from the Behavior Driven Development (BDD) syntax (given, when, then) to structure the test. Others use the 3A rule: arrange, act, assert. The key is to have well-structured and readable code.
3. In the same vein, I use the org.mockito.BDDMockito class, which adopts the BDD structure: Mockito.when is replaced by BDDMockito.given, and verify by then.

Another important point in this example: Mockito is used both to provide a Stub (in the first two tests) and a Mock (in the third). Without going into details, the Stub is there to replace a dependency and validate that the tested system works. A Mock, on the other hand, allows us to check the interaction of the system under test with its dependencies: we can check that the dependency has been called with the expected parameters. We should pay attention to the use of Mocks in our tests. If we’re not careful, the tests can become tightly coupled to our dependency’s implementation, which can quickly become a nightmare to maintain and understand.
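The naming convention, the given / when / then structure and the idea of checking an interaction can be sketched without any library. The example below is not the project’s test code: BookingService, JourneyStore and the hand-written recording double are hypothetical stand-ins, and a real test would typically rely on JUnit and Mockito instead of the small check helper used here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical dependency: persists booked journeys.
interface JourneyStore {
    void save(String journey);
}

// Hypothetical unit under test.
class BookingService {
    private final JourneyStore store;

    BookingService(JourneyStore store) {
        this.store = store;
    }

    void book(String from, String to) {
        if (from.equals(to)) {
            throw new IllegalArgumentException("departure and arrival must differ");
        }
        store.save(from + " -> " + to);
    }
}

// Hand-written mock: records calls so a test can verify the interaction.
class RecordingJourneyStore implements JourneyStore {
    final List<String> saved = new ArrayList<>();

    @Override
    public void save(String journey) {
        saved.add(journey);
    }
}

class BookingServiceChecks {
    static void book_ShouldSaveJourney_WhenStationsDiffer() {
        // given
        RecordingJourneyStore store = new RecordingJourneyStore();
        BookingService service = new BookingService(store);
        // when
        service.book("Lausanne", "Geneva");
        // then: the dependency was called once, with the expected argument
        check(store.saved.equals(List.of("Lausanne -> Geneva")), "journey should be saved once");
    }

    static void book_ShouldRejectAndSaveNothing_WhenStationsAreIdentical() {
        // given
        RecordingJourneyStore store = new RecordingJourneyStore();
        BookingService service = new BookingService(store);
        // when
        boolean rejected = false;
        try {
            service.book("Bern", "Bern");
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        // then
        check(rejected, "booking should be rejected");
        check(store.saved.isEmpty(), "nothing should be saved");
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        book_ShouldSaveJourney_WhenStationsDiffer();
        book_ShouldRejectAndSaveNothing_WhenStationsAreIdentical();
        System.out.println("all checks passed");
    }
}
```

The first check verifies only what matters to the behavior (what was saved), not how the service reached that call, which limits the coupling to the dependency’s implementation mentioned above.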

It should go without saying that these tests should be run continuously within your build pipeline after every commit to detect regressions as soon as possible. Unit tests validate the business aspects of your application, i.e. business logic and algorithms. They are a safety net for any code modification (adding features, refactoring, and bug fixes), and I cannot stress enough that they are essential.

They are necessary, but not sufficient. In the next article, we’ll discuss component tests, which augment the collection of tests that it is good to have in your toolkit.