Is testing behavior of many classes in one test still unit testing?

https://softwareengineering.stackexchange.com/questions/400266

03-03-2021

Question

Our project's policy is to write unit tests for single classes only. All dependencies are mocked. Recently we've noticed that this approach makes us vulnerable in such cases:

Originally class A looks like this:

class A(val b: B) {

   fun doSomething() {
       b.doSomethingElse()
       b.doSomethingElse2()
   }
}

Class A, as written above, is covered with unit tests. Then, due to new requirements, class B is refactored and hidden behind an interface, so that class A can technically get different behavior depending on the scenario.

The problem is that, to follow our project's guidelines, we now also have to refactor A's unit tests. Previously there was a test for proper communication with an object of B (mocked, of course). Now that the reference to B is gone, the test is refactored to verify communication with the new interface.

I see this as information lost during the test refactoring. For the sake of unit test purity we no longer verify any communication between A and B, even though that communication exists in the real system.

Having that in mind, should we change our way of thinking about unit tests and create test cases where there is more than one real object? Would it still be unit testing or is it already integration testing?

Solution

The outer limit of a unit is IO. If you're talking to peripherals you ain't unit testing no more.

But within that you can carve things up as thickly or as finely as you see fit.

A test can exercise methods from three objects and still be a unit test. Some may claim that's integration, but three lines of code are just as "integrated", so that's not really a thing. Method and object boundaries don't present any significant barrier to testing. Peripherals do.
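As a rough sketch of that idea, here is a test that uses a real A together with a real B. The body of B is not shown in the question, so the in-memory version below (including its `log` property) is purely an assumption made for illustration.

```kotlin
// Hypothetical B: the question does not show its body, so this in-memory
// version simply records what it was asked to do.
class B {
    val log = mutableListOf<String>()
    fun doSomethingElse() { log.add("else") }
    fun doSomethingElse2() { log.add("else2") }
}

// A as given in the question.
class A(val b: B) {
    fun doSomething() {
        b.doSomethingElse()
        b.doSomethingElse2()
    }
}

// Uses two real objects, but touches no database, network or file system,
// so it stays fast and can run alongside any other unit test.
fun testAAndBTogether() {
    val b = B()
    A(b).doSomething()
    check(b.log == listOf("else", "else2")) { "unexpected log: ${b.log}" }
}
```

Nothing here needs a mocking library or any environment setup, which is the property that matters under the criteria quoted below.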

A test is not a unit test if:

It talks to the database

It communicates across the network

It touches the file system

It can't run at the same time as any of your other unit tests

You have to do special things to your environment (such as editing config files) to run it.

Michael Feathers - Working Effectively with Legacy Code

Now sure, you can argue that some people consider every test that can be automated by a unit testing framework a unit test.

If we just want to argue semantics then fine, I'll define a unit as a beautiful flower in a field that smells bad.

Regardless of the terms, what is needed is segregation of slow troublesome tests from tests so blindingly fast and stable that you can run them while you type.

Call what they act on what you will. Just don't mix them up.

OTHER TIPS

An alternative approach is to sort tests into two categories:
- Slow tests
- Quick tests

Slow tests are usually tests that require access to external resources. Tests that need a very complicated setup can also be slow.

Quick tests are the tests developers use to get quick feedback while they work.

Whether you test every class explicitly or test multiple classes through the tests of their consumer is a decision the team should make, as Arseni Mourzenko already pointed out in his answer.

If you write tests before writing production code, you will find that testing multiple classes makes the workflow quicker and provides more valuable feedback.
That is because when you start working on a feature, you don't yet know what kind of interactions you will have.
So you simply write everything in one class or one method covered with tests, and afterwards refactor it by extracting duplication or dependencies.

Don't get caught up in definitions. They are not the important thing.

There is some disagreement about the exact meaning of "unit". But the purpose of your automated tests is not to conform to some arbitrary definition. The purpose is to detect bugs. Or more generally: To verify the behavior of the code conforms to expectations.

When refactoring, unit tests should ensure you did not introduce bugs or change behavior. Therefore tests shouldn't be coupled to implementation details in the tested code. Indeed it should be possible to completely rewrite the implementation of a class and have the tests verify behavior is still the same.

So from this perspective a test does not care if the tested class calls into other classes. This is an implementation detail. Indeed, a typical refactoring is to extract some internal code in a method to a separate class - and this is exactly the kind of scenario where tests should be able to verify the refactoring didn't change behavior.

But if you decide that a test should only test a single class, then you couple the test to implementation details. You cannot safely refactor the implementation since you might have to change the test also and introduce mocks. But changing the test at the same time as changing the tested code defeats the whole purpose of testing, since you can't be sure that you verify the same as you did before.

Especially insidious is the kind of mocking where you just verify that certain methods are called with certain parameters on the mock. This form of test couples so tightly to the implementation that they become a hindrance to refactoring rather than a help.

So bottom line: Avoid mocks (except in the case of external services or non-deterministic input), and let the tests exercise as many classes as is necessary to verify the behavior under test.
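A minimal sketch of a test written this way follows. `Invoice` and `TaxPolicy` are hypothetical names invented for this example, and the 20% rate is an arbitrary assumption. The test asserts only on the observable result, so it would keep passing whether `Invoice` computes the tax inline or delegates to an extracted class.

```kotlin
// Hypothetical collaborator, e.g. extracted from Invoice during a refactoring.
class TaxPolicy(private val ratePercent: Int) {
    fun taxFor(netCents: Long): Long = netCents * ratePercent / 100
}

// Hypothetical class under test. Whether it calculates tax itself or
// delegates to TaxPolicy is an implementation detail the test ignores.
class Invoice(private val taxPolicy: TaxPolicy = TaxPolicy(20)) {
    private val lines = mutableListOf<Long>()

    fun addLine(netCents: Long) { lines.add(netCents) }

    fun totalCents(): Long {
        val net = lines.sum()
        return net + taxPolicy.taxFor(net)
    }
}

// State-based test: no mock, no verification of which methods were called.
fun testInvoiceTotalIncludesTax() {
    val invoice = Invoice()
    invoice.addLine(1_000)
    invoice.addLine(500)
    // 1500 net + 20% tax = 1800
    check(invoice.totalCents() == 1_800L)
}
```

A mock-based version would instead assert that `taxFor` was called with `1500`, and would break as soon as the calculation moved elsewhere, even if the total stayed correct.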

Personally I like to think of "unit" as "unit of behavior" - the smallest part of a specification which can be tested independently. But that is just my definition.

In your refactored class A, you say it is no longer responsible for communicating directly with class B, but is instead responsible for communicating with an Interface. So, as long as you're testing that communication, then you should not need to worry about no longer communicating directly with the specific class B.

Class A knows nothing of class B, so neither should class A's tests.

To test class B's behavior (in the case that class A or any other class calls an instance of class B at runtime), you should have other tests around class B.
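One way this split could look is sketched below. The interface name `Collaborator`, the hand-written `RecordingCollaborator` double and the `events` list inside B are all assumptions, since the question shows neither the interface nor B's body.

```kotlin
// Hypothetical interface that B is hidden behind after the refactoring.
interface Collaborator {
    fun doSomethingElse()
    fun doSomethingElse2()
}

// A now knows only the interface.
class A(private val collaborator: Collaborator) {
    fun doSomething() {
        collaborator.doSomethingElse()
        collaborator.doSomethingElse2()
    }
}

// Hand-written test double that records the calls it receives.
class RecordingCollaborator : Collaborator {
    val calls = mutableListOf<String>()
    override fun doSomethingElse() { calls.add("doSomethingElse") }
    override fun doSomethingElse2() { calls.add("doSomethingElse2") }
}

// Hypothetical B; its real behavior is not shown in the question.
class B : Collaborator {
    val events = mutableListOf<String>()
    override fun doSomethingElse() { events.add("first") }
    override fun doSomethingElse2() { events.add("second") }
}

// A's test verifies communication with the interface only.
fun testATalksToItsCollaborator() {
    val double = RecordingCollaborator()
    A(double).doSomething()
    check(double.calls == listOf("doSomethingElse", "doSomethingElse2"))
}

// B's behavior is covered by its own test, independent of A.
fun testBOnItsOwn() {
    val b = B()
    b.doSomethingElse()
    check(b.events == listOf("first"))
}
```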

Is testing behavior of many classes in one test still unit testing?

That depends on whose definition of unit test is considered authoritative.

should we change our way of thinking about unit tests and create test cases where there is more than one real object? Would it still be unit testing or is it already integration testing?

When you make backwards incompatible changes, things that depend on what you have changed break. So if your tests are now screaming at you, that's great -- everything is working the way that it is supposed to.

At this point, you've got a couple of choices. One is to rethink the changes you are making to your design, so that they are backwards compatible. This usually means creating new classes and interfaces, and implementing the old use cases in terms of the new elements. In effect, the old code gets refactored to leverage the new way of doing things, where appropriate, and the tests verify that you haven't introduced any new mistakes in doing that. Your new unit tests cover the new behaviors.

Side note: you can at this step deprecate the old way of doing things as a way of managing the change. This allows you to separate in time adding the new from removing the old.

Another possibility is that you decide that this breaking change is necessary, and this is the right time to pay the costs for that. So just remove the problem test. Ta-da! and you are done. Perfectly reasonable choice when you are replacing both the baby and the bath water.

The problem cases lie in the middle: you are committing a "small" incompatible change, and most of the value of the tests is still present, but continuing to accrue that value requires a disproportionate amount of work.

Often this means that your tests are spanning too many volatile decisions (see Parnas, 1971) and/or that your current design makes swapping elements too expensive.

For example, it's easy to pretend that a behavior is a single atomic thing, when in practice it is actually an accumulation of a number of different independent ideas. If you break your tests up so that they are better focused on the behavioral ideas (rather than worrying about structural concerns like "class"), then you get tests that are resilient to change outside of their immediate concern.

But let's be honest, getting to that state requires frontloading some thinking and design.

You may want to review James Shore's Testing Without Mocks; or some of the published articles on Property Based Testing, which often take a single behavior and partition it into a number of different tests, which allows you to preserve/re-use many of your tests even when the overall behavior changes.
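To make the partitioning idea concrete, here is a small hand-rolled sketch in the spirit of property-based testing, using only the Kotlin standard library rather than a real property-testing framework. The function `distinctSorted` and both properties are hypothetical examples, not anything taken from the articles mentioned above.

```kotlin
import kotlin.random.Random

// Hypothetical function under test: distinct input values in ascending order.
fun distinctSorted(values: List<Int>): List<Int> = values.distinct().sorted()

// Property 1: the output is strictly ascending (ordered, no duplicates).
fun isStrictlyAscending(xs: List<Int>): Boolean =
    xs.zipWithNext().all { (a, b) -> a < b }

// Property 2: the output contains exactly the values present in the input.
fun hasSameValues(input: List<Int>, output: List<Int>): Boolean =
    input.toSet() == output.toSet()

// Checks both properties against randomly generated inputs.
// The fixed seed keeps the run reproducible.
fun checkProperties(runs: Int = 100, seed: Int = 42): Boolean {
    val random = Random(seed)
    return (1..runs).all {
        val input = List(random.nextInt(0, 20)) { random.nextInt(-50, 50) }
        val output = distinctSorted(input)
        isStrictlyAscending(output) && hasSameValues(input, output)
    }
}
```

If the requirement later changed to, say, descending order, only the first property would need rewriting. The second would survive untouched, which is the kind of reuse described above.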

Historically, a unit test tests the behavior of a single component or class, in order to immediately find regression errors where the class no longer meets the expected behavior.

However unit tests are nowadays used for much, much more.

Test Driven Development

This is a fast way to develop parts in isolation/fluff, with an assured behavior, from the outset.

Data integrity

I have seen checks on the XML files of business process flows, to verify that there is always a default unconditional flow to a next state. These tests guarantee that the hard data resources do not contain such mistakes. Your own checking "compiler", so to say. The same approach can be used to compare the translations for several languages. (For resource files I have seen all files in a resource directory tree tested in one single unit test using a class-path directory walk, so that future resources are automatically tested.)
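A minimal sketch of the translation comparison might look like this. It works on in-memory maps to stay self-contained. Loading the real resource files (for example through the class-path walk mentioned above) is left out, and the function name and data shape are assumptions.

```kotlin
// For each language, returns the keys that exist in some other language but
// are missing from it. An empty result means all translations agree.
// The outer map is keyed by language code, the inner map by message key.
fun missingTranslationKeys(
    translations: Map<String, Map<String, String>>
): Map<String, Set<String>> {
    val allKeys = translations.values.flatMap { it.keys }.toSet()
    return translations
        .mapValues { (_, entries) -> allKeys - entries.keys }
        .filterValues { it.isNotEmpty() }
}

// Hypothetical data standing in for loaded resource files.
fun testTranslationsAreComplete() {
    val translations = mapOf(
        "en" to mapOf("greeting" to "Hello", "bye" to "Goodbye"),
        "de" to mapOf("greeting" to "Hallo", "bye" to "Tschüss")
    )
    val missing = missingTranslationKeys(translations)
    check(missing.isEmpty()) { "missing translation keys: $missing" }
}
```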

Already these cases show that unit testing is actually used as a quality assurance measure.

And therefore you can also test component containers and even whole-system behavior. The idea of only testing units (after writing the units) is obsolete.

Licensed under: CC-BY-SA with attribution
