Question

Generally when writing automated unit tests (eg JUnit, Karma) I aim to:

  • cover all the boundary conditions
  • get a high level of coverage.

I heard someone say:

coverage and boundary conditions aren't enough for a unit test, you need to write them so they will break if the code changes.

This sounds good to me in theory - but I'm not sure how to apply it.

My question is: Should I write automated unit tests that fail when the code changes? If so, how?


Solution

Your aim should be not to write unit tests that fail when the code changes, but unit tests that fail when the behaviour changes. Here, behaviour means anything that an external caller of the method wants it to do, like returning the right response to a question or saving the right thing to a database. How it achieves that is its own internal implementation, not its behaviour.

By testing behaviour rather than implementation, you can refactor code to make improvements, and instantly verify whether you've accidentally changed the way it behaves by running your tests.
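For example, here is a minimal JUnit 5 sketch of a behavioural test. The PriceCalculator class and its totalWithTax method are invented for illustration; the point is that the test asserts only on the value the caller gets back, not on how that value is computed internally.

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// Hypothetical class under test: computes an order total including tax.
class PriceCalculator {
    // Internal implementation detail; could be rewritten freely.
    double totalWithTax(double net, double taxRate) {
        return net + net * taxRate;
    }
}

class PriceCalculatorTest {
    // Behavioural test: asserts only on the observable result, so it keeps
    // passing across internal refactorings and fails only if the returned
    // value (the behaviour) changes.
    @Test
    void addsTaxToTheNetPrice() {
        PriceCalculator calculator = new PriceCalculator();
        assertEquals(110.0, calculator.totalWithTax(100.0, 0.10), 1e-9);
    }
}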

In reality, it's not possible to perfectly achieve this goal. If you have a method:

int add(int x, int y) {
    return x + y;
}

You can write as many unit tests as you want for it, but it's extremely unlikely any of them will fail if you modify it to:

int add(int x, int y) {
    if(x==10731 && y == -405571) {
        return 0;
    }
    return x + y;
}
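For instance, a perfectly reasonable-looking JUnit 5 suite like the sketch below (the local copy of add stands in for wherever the real method lives) covers zero, negative values and the int boundaries, yet it passes against both versions of add above.

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class AddTest {
    @Test
    void addsPositiveNumbers() {
        assertEquals(5, add(2, 3));
    }

    @Test
    void addsNegativeNumbers() {
        assertEquals(-5, add(-2, -3));
    }

    @Test
    void handlesZero() {
        assertEquals(7, add(7, 0));
    }

    @Test
    void handlesIntegerBoundaries() {
        assertEquals(Integer.MAX_VALUE, add(Integer.MAX_VALUE, 0));
        assertEquals(Integer.MIN_VALUE, add(Integer.MIN_VALUE, 0));
    }

    // Local stand-in for the method under test, so the sketch is
    // self-contained; none of the inputs above hit the poisoned pair.
    private static int add(int x, int y) {
        return x + y;
    }
}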

However, you can take some sensible steps to get as close to full behavioural coverage as is practical:

  • As you said, think about boundary conditions and corner cases. These are the places where you're most likely to see accidental behaviour change.
  • Think of line-by-line coverage as "necessary, but not sufficient". Imagine trying to be as lazy as you possibly can while still getting full line-by-line coverage, and you'll see how easy it is to write an inadequate set of tests that still achieves it.
  • Think about the behaviour your method is supposed to provide, and the branches it can follow. Ideally you should test that for every route through its implementation, it provides all the behaviour that's expected.
  • When you've written a set of tests, ask "Are there any implementations of my method which are at least as simple as the existing one, which would pass all of my tests, but which have the wrong behaviour?" This is a good rule of thumb to see if your behavioural coverage is good enough.

    As shown in the example add method above, you can never fully defend against modifications that add extra stuff to your method, making it less simple. But bugs are much more likely to sneak in through modifying or removing parts without adding to the complexity (because why would you refactor something in a way that adds to its complexity?). So by adding the "at least as simple" condition, you get something practically achievable. A concrete sketch of this check follows just below.
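To make that rule of thumb concrete, here is a hypothetical JUnit 5 example for an invented isEven method. A suite that only probed even inputs would also be passed by the simpler, wrong implementation "return true;"; adding a single odd-input case rules it out.

import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.Test;

class EvenCheckerTest {
    // On its own, this test is passed by the simpler-but-wrong
    // implementation "return true;", because every input is even.
    @Test
    void recognisesEvenNumbers() {
        assertTrue(isEven(0));
        assertTrue(isEven(2));
        assertTrue(isEven(-4));
    }

    // This extra case rules that wrong implementation out.
    @Test
    void rejectsOddNumbers() {
        assertFalse(isEven(3));
    }

    // Local stand-in for the method under test, for self-containment.
    private static boolean isEven(int n) {
        return n % 2 == 0;
    }
}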

Other tips

The answers given - that this is a wrongheaded thing to do, and violates every principle of good testing - are correct.

But this is programming. There are always some sensible rationales for even the strangest request, and it can be productive to consider them.

Say you have some critical kernel code, or some high-security part of your SSH library, which absolutely, under no circumstances, should ever change its behavior, not even for some tiny subset of the effectively infinite range of inputs it accepts, which you could never exhaustively test against.

And say you have a company set up such that there is a QA department responsible for writing unit tests against code provided by the Dev department, to ensure it meets business requirements.

In this situation, it makes sense to say "the critical code paths should only be modified by Dev when QA has been notified with signoff by upper management." Any change without signoff and notification is non-deliberate, and may be malicious.

In that case, a test which uses introspection to get a hash of the contents of each method, and compare that to a stored hash, could be legitimate. You can't serialize Method(), but you could subclass Method to be serializable. Perhaps a better approach is to get a hash of the file itself. Or just compare the file to a secure backup of the file.
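As a rough sketch of the file-hash variant (assuming Java 17+ for HexFormat; the file path and approved digest below are placeholders, not real values), a test could recompute the file's SHA-256 digest and compare it to the one recorded at the last signoff:

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HexFormat;
import org.junit.jupiter.api.Test;

class CriticalSourceUnchangedTest {
    // Placeholder path to the protected source file and the digest
    // approved at the last signed-off change.
    private static final Path CRITICAL_FILE =
            Path.of("src/main/java/com/example/crypto/KeyExchange.java");
    private static final String APPROVED_SHA256 =
            "0000000000000000000000000000000000000000000000000000000000000000";

    @Test
    void criticalFileMatchesApprovedDigest() throws Exception {
        byte[] bytes = Files.readAllBytes(CRITICAL_FILE);
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
        String actual = HexFormat.of().formatHex(digest);
        // Fails on any edit to the file, deliberate or not, until the
        // stored digest is updated as part of the signoff process.
        assertEquals(APPROVED_SHA256, actual);
    }
}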

But if the requirement is to fail on any modification, probably the best approach is to check the version control system's logs and see whether any changes were pulled for the relevant files. If so, fail the test.
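One hedged sketch of that approach using git (it assumes the tests run from the repository root with git on the path; the file path and approved commit hash are placeholders) pins the last commit that touched the protected file:

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.junit.jupiter.api.Test;

class CriticalFileHistoryTest {
    // Placeholders: the commit that last touched the file with signoff,
    // and the path to the protected file.
    private static final String APPROVED_COMMIT =
            "0123456789abcdef0123456789abcdef01234567";
    private static final String CRITICAL_FILE =
            "src/main/java/com/example/crypto/KeyExchange.java";

    @Test
    void lastCommitTouchingCriticalFileIsApproved() throws Exception {
        // Ask git for the most recent commit that modified the file.
        Process git = new ProcessBuilder(
                "git", "log", "-1", "--format=%H", "--", CRITICAL_FILE)
                .start();
        String lastCommit;
        try (BufferedReader out = new BufferedReader(
                new InputStreamReader(git.getInputStream()))) {
            lastCommit = out.readLine();
        }
        git.waitFor();
        // Fails if anyone has committed a change to the file since the
        // approved, signed-off commit.
        assertEquals(APPROVED_COMMIT, lastCommit);
    }
}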

Licensed under: CC-BY-SA with attribution