Question

As I understand it, the point of unit tests is to test units of code in isolation. This means that:

  1. They should not be broken by any unrelated code change elsewhere in the codebase.
  2. Only one unit test should be broken by a bug in the tested unit, as opposed to integration tests (which may break in heaps).

All of this implies that every outside dependency of a tested unit should be mocked out. And I mean all the outside dependencies, not only the "outside layers" such as networking, the filesystem, databases, etc.

This leads to the logical conclusion that virtually every unit test needs to mock. On the other hand, a quick Google search about mocking reveals tons of articles claiming that "mocking is a code smell" and should mostly (though not completely) be avoided.

Now, to the question(s).

  1. How should unit tests be written properly?
  2. Where exactly does the line between them and the integration tests lie?

Update 1

Please consider the following pseudo code:

class Person {
    constructor(calculator) {
        this.calculator = calculator; // store the injected dependency
    }

    calculate(a, b) {
        const sum = this.calculator.add(a, b);

        // do some other stuff with the `sum`
    }
}

Can a test that tests the Person.calculate method without mocking the Calculator dependency (given that the Calculator is a lightweight class that does not access "the outside world") be considered a unit test?


Solution

the point of unit tests is to test units of code in isolation.

Martin Fowler, on unit tests:

Unit testing is often talked about in software development, and is a term that I've been familiar with during my whole time writing programs. Like most software development terminology, however, it's very ill-defined, and I see confusion can often occur when people think that it's more tightly defined than it actually is.

And what Kent Beck wrote in Test Driven Development: By Example:

I call them "unit tests", but they don't match the accepted definition of unit tests very well

Any given claim of "the point of unit tests is" will depend heavily on what definition of "unit test" is being considered.

If your perspective is that your program is composed of many small units that depend on one another, and if you constrain yourself to a style that tests each unit in isolation, then heavy use of test doubles is an inevitable conclusion.

The conflicting advice that you see comes from people operating under a different set of assumptions.

For example, if you are writing tests to support developers during the process of refactoring, and splitting one unit into two is a refactoring that should be supported, then something needs to give. Maybe this kind of test needs a different name? Or maybe we need a different understanding of "unit".


Can a test that tests the Person.calculate method without mocking the Calculator dependency (given that the Calculator is a lightweight class that does not access "the outside world") be considered a unit test?

I think that's the wrong question to ask; it's again an argument about labels, when I believe what we actually care about are properties.

When I'm introducing changes to the code, I don't care about isolation of tests -- I already know that "the mistake" is somewhere in my current stack of unverified edits. If I run the tests frequently, then I limit the depth of that stack, and finding the mistake is trivial (in the extreme case, the tests are run after every edit -- the max depth of the stack is one). But running the tests isn't the goal -- it's an interruption -- so there is value in reducing the impact of the interruption. One way of reducing the interruption is to ensure that the tests are fast (Gary Bernhardt suggests 300ms, but I haven't figured out how to do that in my circumstances).

If invoking Calculator::add doesn't significantly increase the time required to run the test (or degrade any of the other properties that matter for this use case), then I wouldn't bother using a test double -- it doesn't provide benefits that outweigh the costs.

Notice the two assumptions here: a human being as part of the cost evaluation, and the short stack of unverified changes in the benefit evaluation. In circumstances where those conditions do not hold, the value of "isolation" changes quite a bit.
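To make that trade-off concrete, here is a minimal sketch that fills in the question's pseudocode with hypothetical implementations, testing Person.calculate against the real Calculator rather than a test double:

class Calculator {
    add(a, b) {
        return a + b;
    }
}

class Person {
    constructor(calculator) {
        this.calculator = calculator;
    }

    calculate(a, b) {
        // hypothetical: assume calculate() returns the sum it computes
        return this.calculator.add(a, b);
    }
}

// No test double: the real Calculator is fast and deterministic, so using it
// doesn't degrade the properties this use case cares about.
const person = new Person(new Calculator());
console.assert(person.calculate(2, 3) === 5, "calculate should return the sum");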

See also Hot Lava, by Harry Percival.

OTHER TIPS

How exactly should unit tests be written without mocking extensively?

By minimising side-effects in your code.

Taking your example code: if calculator, for example, talks to a web API, then you either create fragile tests that rely on being able to interact with that web API, or you create a mock of it. If, however, it's a deterministic, state-free set of calculation functions, then you don't (and shouldn't) mock it. If you do, you risk your mock behaving differently from the real code, leading to bugs in your tests.

Mocks should only be needed for code that reads/writes to the file system, databases, URL endpoints, etc.; that depends on the environment you are running under; or that is highly stateful and non-deterministic in nature. So if you keep those parts of the code to a minimum and hide them behind abstractions, they are easy to mock, and the rest of your code avoids the need for mocks.
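As a sketch of that structure (the names here are hypothetical), the effectful part sits behind a small seam while the calculation stays pure:

// The network access hides behind an injected provider; only this seam
// ever needs a mock.
function applyRate(amount, rate) {
    // pure and deterministic: test by checking outputs, no mock required
    return amount * rate;
}

async function quote(amount, rateProvider) {
    const rate = await rateProvider.currentRate(); // the only effectful call
    return applyRate(amount, rate);
}

// In a test, a stub provider is trivial:
const fixedProvider = { currentRate: async () => 1.25 };
quote(100, fixedProvider).then(q => console.assert(q === 125));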

For the parts of the code that do have side effects, it's worth writing both tests that mock and tests that don't. The latter need care, though, as they will inherently be fragile and possibly slow; you may want to run them only overnight on a CI server, say, rather than every time you save and build your code. The former should be run as often as practicable. Whether each test is then a unit test or an integration test becomes academic, and dropping the distinction avoids "flame wars" over what is and isn't a unit test.

These questions are quite different in their difficulty. Let's take question 2 first.

Unit tests and integration tests are clearly separated. A unit test tests one unit (a method or class) and uses other units only as much as necessary to achieve that goal. Mocking may be necessary, but it is not the point of the test. An integration test tests the interaction between different actual units. This difference is the entire reason we need both unit and integration testing - if one did the job of the other well enough, we wouldn't need both, but it has turned out that it's usually more efficient to use two specialized tools rather than one generalized tool.

Now for the important question: How should you unit test? As said above, unit tests should construct auxiliary structures only as far as necessary. Often it is easier to use a mock database than your real database, or even any real database. However, mocking in itself has no value. It often happens that it is in fact easier to use actual components of another layer as input for a mid-level unit test. If so, don't hesitate to use them.

Many practitioners are afraid that if unit test B reuses classes that were already tested by unit test A, then a defect in unit A causes test failures in multiple places. I don't consider this a problem: a test suite has to succeed 100% in order to give you the reassurance you need, so it is not a big problem to have too many failures - after all, you do have a defect. The only critical problem would be if a defect triggered too few failures.

Therefore, don't make a religion of mocking. It is a means, not an end, so if you can get away with avoiding the extra effort, you should do so.

OK, so to answer your questions directly:

How should unit tests be written properly?

As you say, you should be mocking dependencies and testing just the unit in question.

Where exactly does the line between them and integration tests lie?

An integration test is a unit test where your dependencies are not mocked.

Can a test that tests the Person.calculate method without mocking the Calculator be considered a unit test?

No. You need to inject the calculator dependency into this code, and you have a choice between a mocked version and a real one. If you use a mocked one, it's a unit test; if you use a real one, it's an integration test.
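As a sketch of that choice (the Person shape from the question, filled in hypothetically), the test body is identical either way; only the injected dependency changes:

class Person {
    constructor(calculator) {
        this.calculator = calculator;
    }

    calculate(a, b) {
        return this.calculator.add(a, b); // "other stuff" elided
    }
}

const real = { add: (a, b) => a + b }; // real dependency -> integration test
const stub = { add: () => 4 };         // mocked dependency -> unit test

console.assert(new Person(real).calculate(2, 2) === 4);
console.assert(new Person(stub).calculate(2, 2) === 4);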

However, a caveat: do you really care what people think your tests should be called?

But your real question seems to be this:

a quick Google search about mocking reveals tons of articles claiming that "mocking is a code smell" and should mostly (though not completely) be avoided.

I think the problem here is that a lot of people use mocks to completely recreate the dependencies. For example, I might mock the calculator in your example as:

public class MockCalc : ICalculator
{
    // Hand-rolled stub: ignores its inputs and returns a fixed value
    public int Add(int a, int b) { return 4; }
}

I would not do something like:

// Re-implement the dependency inside the mock, then assert it was called:
myMock = Mock<ICalculator>().Add((a, b) => { return a + b; });
myPerson.Calculate();
Assert.WasCalled(myMock.Add());

I would argue that that would be "testing my mock" or "testing the implementation". I would say: "Don't write mocks *like that*."
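For contrast, a sketch of a test I would be happy with, reusing the Person and stub from the sketch above: drive the object through its public method and assert on the outcome, not on which calls the mock received:

// Outcome-focused: the canned stub stands in for the calculator, but the
// assertion is about what calculate() produces, not about interactions.
const myPerson = new Person(stub);
console.assert(myPerson.calculate(1, 2) === 4); // behaviour, not "was Add called?"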

Other people would disagree with me, and we would start massive flame wars on our blogs about the best way to mock: wars that make no sense unless you understand the whole background of the various approaches, and that don't offer a whole lot of value to someone who just wants to write good tests.

  1. How should unit tests be implemented properly?

My rule of thumb is that proper unit tests:

  • Are coded against interfaces, not implementations. This has many benefits. For one, it ensures that your classes follow the Dependency Inversion Principle from SOLID. It is also what your other classes do (right?), so your tests should do the same. Finally, it allows you to test multiple implementations of the same interface while reusing much of the test code (only initialization and some assertions would change).
  • Are self-contained. As you said, changes in any outside code cannot affect the test result. As such, unit tests can execute at build-time. This means you need mocks to remove any side effects. However, if you are following the Dependency Inversion Principle, this should be relatively easy. Good test frameworks like Spock can be used to dynamically provide mock implementations of any interface to use in your tests with minimal coding. This means that each test class only needs to exercise code from exactly one implementation class, plus the test framework (and maybe model classes ["beans"]).
  • Do not require a separate running application. If the test needs to "talk to something", whether a database or a web service, it's an integration test, not a unit test. I draw the line at network connections or the filesystem. A purely in-memory SQLite database, for example, is fair game in my opinion for a unit test if you really need it.

If there are utility classes from frameworks that complicate unit testing, you may even find it useful to create very simple "wrapper" interfaces and classes to facilitate mocking of those dependencies. Those wrappers would then not necessarily be subject to unit tests.
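As a sketch of such a wrapper (hypothetical names), a thin seam around the system clock makes an otherwise awkward dependency trivial to mock:

// A thin wrapper around a runtime utility. The wrapper itself is so trivial
// that it need not be unit tested.
class SystemClock {
    now() {
        return new Date(); // the real, non-deterministic source of time
    }
}

class FixedClock {
    constructor(instant) {
        this.instant = instant;
    }

    now() {
        return this.instant; // deterministic stand-in for tests
    }
}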

  2. Where exactly does the line between them [unit tests] and integration tests lie?

I have found this distinction to be the most useful:

  • Unit tests simulate "user code", verifying behavior of implementation classes against the desired behavior and semantics of code-level interfaces.
  • Integration tests simulate the user, verifying behavior of the running application against specified use cases and/or formal APIs. For a web service, the "user" would be a client application.

There is a gray area here. For example, if you can run an application in a Docker container, run the integration tests as the final stage of a build, and destroy the container afterwards, is it OK to include those tests as "unit tests"? If this is your burning debate, you're in a pretty good place.

  3. Is it true that virtually every unit test needs to mock?

No. Some individual test cases will be for error conditions, like passing null as a parameter and verifying you get an exception. Lots of tests like that will not require any mocks. Also, implementations that have no side effects, for example string processing or math functions, may not require any mocks because you simply verify the output. But most classes worth having, I think, will require at least one mock somewhere in the test code. (The fewer, the better.)
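For illustration, two mock-free test cases of the kinds described above (the function is hypothetical): one error condition, one plain output check:

// A pure function with an error path: neither test needs a mock.
function parsePositive(input) {
    const n = Number(input);
    if (!Number.isFinite(n) || n <= 0) {
        throw new RangeError(`expected a positive number, got "${input}"`);
    }
    return n;
}

// Error condition: verify the exception directly.
let threw = false;
try {
    parsePositive("-5");
} catch (e) {
    threw = e instanceof RangeError;
}
console.assert(threw, "negative input should throw RangeError");

// No side effects: verify the output directly.
console.assert(parsePositive("42") === 42);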

The "code smell" issue you mentioned arises when you have a class that is overly complicated, that requires a long list of mock dependencies in order to write your tests. This is a clue that you need to refactor the implementation and split things up, so that each class has a smaller footprint and a clearer responsibility, and is therefore more easily testable. This will improve quality in the long run.

Only one unit test should be broken by a bug in the tested unit.

I don't think this is a reasonable expectation, because it works against reuse. You may have a private method, for example, that is called by multiple public methods published by your interface. A bug introduced into that one method might then cause multiple test failures. This doesn't mean you should copy the same code into each public method.

Mocking should only be used as a last resort, even in unit tests.

A method is not a unit, and even a class is not a unit. A unit is any logical separation of code that makes sense, regardless of what you call it.

An important element of having well tested code is being able to freely refactor, and part of being able to freely refactor means that you don't have to change your tests in order to do so. The more you mock, the more you have to change your tests when you refactor. If you consider the method the unit, then you have to change your tests every time you refactor. And if you consider the class the unit, then you have to change your tests every time you want to break a class up into multiple classes. When refactoring the code requires refactoring the tests, people choose not to refactor their code, which is just about the worst thing that can happen to a project.

It is essential that you can break a class up into multiple classes without having to refactor your tests, or you're going to end up with oversized 500 line spaghetti classes. If you are treating methods or classes as your units in unit testing, you are probably not doing object-oriented programming but some sort of mutant functional programming with objects.

Isolating your code for a unit test doesn't mean that you mock everything outside of it. If it did, you'd have to mock your language's Math class, and absolutely no one thinks that's a good idea. Internal dependencies should not be treated any differently than external dependencies. You trust that they are well tested and work like they are supposed to. The only real difference is that if your internal dependencies are breaking your modules, you can stop what you're doing to fix it rather than having to go post an issue on GitHub and either dig into a codebase you don't understand to fix it or hope for the best.

Isolating your code just means that you treat your internal dependencies like black boxes and don't test things that are happening inside them. If you have Module B, which accepts inputs of 1, 2, or 3, and you have Module A, which calls it, you don't have your tests for Module A exercise each of those options; you just pick one and use it. It means that your tests for Module A should test the different ways you treat the responses from Module B, not the things that you pass into it.

So if your controller passes a complex object to a dependency, and that dependency does several possible things (maybe saving it to the database, maybe returning a variety of errors), but all your controller actually does is check whether an error came back and pass that information along, then your controller needs exactly two tests: one for when the dependency returns an error, and one for when it does not. You don't test whether something got saved in the database or what kind of error the error is, because that would be an integration test. You do not have to mock the dependency to do this. You have isolated the code.
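A sketch of that shape (hypothetical names; a hand-rolled substitute stands in for the dependency here, though the same two tests could drive the real thing):

class Controller {
    constructor(saver) {
        this.saver = saver;
    }

    handle(data) {
        // all the controller does: check for an error and pass it along
        const result = this.saver.save(data);
        return result.error ? `failed: ${result.error}` : "ok";
    }
}

// Exactly two tests: one per kind of response. What save() does internally
// (database writes, error taxonomy) is out of scope here.
console.assert(new Controller({ save: () => ({}) }).handle({}) === "ok");
console.assert(new Controller({ save: () => ({ error: "boom" }) }).handle({}) === "failed: boom");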

  1. They should not be broken by any unrelated code change elsewhere in the codebase.

I'm not really sure how this rule is useful. If a change in one class/method/whatever can break the behaviour of another in production code, then the things are, in reality, collaborators, and not unrelated. If your tests break and your production code doesn't, then your tests are suspect.

  2. Only one unit test should be broken by a bug in the tested unit, as opposed to integration tests (which may break in heaps).

I'd regard this rule with suspicion too. If you're really good enough to structure your code and write your tests such that one bug causes exactly one unit test failure, then you're saying you've identified all the potential bugs already, even as the codebase evolves to use cases you haven't anticipated.

Where exactly does the line between them and integration tests lie?

I don't think that's an important distinction. What is a 'unit' of code anyhow?

Try to find entry points at which you can write tests that just 'make sense' in terms of the problem domain/business rules that that level of the code is dealing with. Often these tests are somewhat 'functional' in nature - put in an input, and test that the output is as expected. If the tests express a desired behaviour of the system, then they often remain quite stable even as the production code evolves and is refactored.

How exactly should unit tests be written without mocking extensively?

Don't read too much into the word 'unit', and lean towards using your real production classes in tests, without worrying too much if you're involving more than one of them in a test. If one of them is hard to use (because it takes a lot of initialisation, or it needs to hit a real database/email server etc), then let your thoughts turn to mocking/faking.

First, some definitions:

A unit test tests units in isolation from other units, but what that means is not concretely defined by any authoritative source, so let's define it a bit better: If I/O boundaries are crossed (whether that I/O is network, disk, screen, or UI input), there's a semi-objective place we can draw a line. If the code depends on I/O, it's crossing a unit boundary, and therefore it will need to mock the unit responsible for that I/O.

Under that definition, I don't see a compelling reason to mock things like pure functions, meaning that unit testing lends itself to pure functions, or functions without side-effects.
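For example (a hypothetical pure function), under that definition no I/O boundary is crossed, so the unit test needs no mocks at all:

// Pure: same input, same output, no effects.
function median(values) {
    const sorted = [...values].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

console.assert(median([3, 1, 2]) === 2);
console.assert(median([4, 1, 2, 3]) === 2.5);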

If you want to unit test units with effects, the units responsible for the effects should be mocked, but perhaps you should consider an integration test, instead. So, the short answer is: "if you need to mock, ask yourself if what you really need is an integration test." But there's a better, longer answer here, and the rabbit hole goes much deeper. Mocks may be my favorite code smell because there's so much to learn from them.

Code Smells

For this, we'll turn to Wikipedia:

In computer programming, a code smell is any characteristic in the source code of a program that possibly indicates a deeper problem.

It continues later...

"Smells are certain structures in the code that indicate violation of fundamental design principles and negatively impact design quality". Suryanarayana, Girish (November 2014). Refactoring for Software Design Smells. Morgan Kaufmann. p. 258.

Code smells are usually not bugs; they are not technically incorrect and do not prevent the program from functioning. Instead, they indicate weaknesses in design that may slow down development or increase the risk of bugs or failures in the future.

In other words, not all code smells are bad. Instead, they are common indications that something might not be expressed in its optimal form, and the smell may indicate an opportunity to improve the code in question.

In the case of mocking, the smell indicates that the units which seem to be calling for mocks depend on the units to be mocked. It may be an indication that we haven't decomposed the problem into atomically-solvable pieces, and that could indicate a design flaw in the software.

The essence of all software development is the process of breaking a large problem down into smaller, independent pieces (decomposition) and composing the solutions together to form an application that solves the large problem (composition).

Mocking is required when the units used to break the large problem down into smaller parts depend on each other. Put another way, mocking is required when our supposed atomic units of composition are not really atomic, and our decomposition strategy has failed to decompose the larger problem into smaller, independent problems to be solved.

What makes mocking a code smell is not that there's anything inherently wrong with mocking - sometimes it is very useful. What makes it a code smell is that it could indicate a problematic source of coupling in your application. Sometimes removing that source of coupling is much more productive than writing a mock.

There are many kinds of coupling, and some are better than others. Understanding that mocks are a code smell can teach you to identify and avoid the worst kinds early in the application design lifecycle, before the smell develops into something worse.

How exactly should unit tests be written without mocking extensively?

By not writing them. Not all code benefits from unit testing. TDD is a technique, a tool to solve problems, and not the One True Way™ to write all code.

Unit tests should only be written against code that has no dependencies: code that doesn't get covered well by integration and end-to-end tests.

So if you find yourself reaching for mocks, that's a sign it's not worth writing a unit test: your tests will depend on implementation details, they will break whenever you change the implementation, and they will need to be rewritten. The purpose of a test is to guard against regression, to be something you can rely on when you refactor. Mock-based tests can't fulfil that purpose, by their very nature.

Licensed under: CC-BY-SA with attribution