Question

In another question, it was revealed that one of the pains with TDD is keeping the testing suite in sync with the codebase during and after refactoring.

Now, I'm a big fan of refactoring. I'm not going to give it up to do TDD. But I've also experienced the problems of tests written in such a way that minor refactoring leads to lots of test failures.

How do you avoid breaking tests when refactoring?

  • Do you write the tests 'better'? If so, what should you look for?
  • Do you avoid certain types of refactoring?
  • Are there test-refactoring tools?

Edit: I wrote a new question that asked what I meant to ask (but kept this one as an interesting variant).


Solution

What you're trying to do is not really refactoring. With refactoring, by definition, you don't change what your software does; you change how it does it.

Start with all green tests (all pass), then make modifications "under the hood" (e.g. move a method from a derived class to the base class, extract a method, or encapsulate a Composite with a Builder). Your tests should still pass.

What you're describing seems to be not refactoring but a redesign, which also augments the functionality of your software under test. TDD and refactoring (as I tried to define it here) are not in conflict. You can still refactor (green-green) and apply TDD (red-green) to develop the "delta" functionality.
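
To make the green-green idea concrete, here is a minimal sketch (hypothetical OrderCalculator class, xUnit assumed, names are illustrative only): the test exercises only the public Total() method, so extracting the private ApplyTax() helper - or inlining it again - is a refactoring that leaves the test green.

    using Xunit;

    public class OrderCalculator
    {
        public decimal Total(decimal subtotal, decimal taxRate)
        {
            // Before the refactoring this was a single inline expression;
            // extracting ApplyTax() changes how the result is computed, not what it is.
            return ApplyTax(subtotal, taxRate);
        }

        private static decimal ApplyTax(decimal subtotal, decimal taxRate) =>
            subtotal * (1 + taxRate);
    }

    public class OrderCalculatorTests
    {
        [Fact]
        public void Total_adds_tax_to_the_subtotal()
        {
            var calc = new OrderCalculator();

            // Black-box assertion: it only cares about the result of the public API.
            Assert.Equal(110m, calc.Total(100m, 0.10m));
        }
    }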

OTHER TIPS

One of the benefits of having unit tests is so you can confidently refactor.

If the refactoring does not change the public interface, leave the unit tests as they are and make sure they all still pass after the refactoring.

If the refactoring does change the public interface then the tests should be rewritten first. Refactor until the new tests pass.
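
For example (a sketch only, assuming xUnit and a hypothetical ReportFormatter whose public interface is gaining an explicit culture parameter): the test is rewritten first to describe the new interface, and the production code is then changed until it passes.

    using System.Globalization;
    using Xunit;

    public class ReportFormatter
    {
        // New public interface: the culture is now an explicit parameter.
        public string FormatAmount(decimal amount, CultureInfo culture) =>
            amount.ToString("C", culture);
    }

    public class ReportFormatterTests
    {
        [Fact]
        public void Formats_amounts_for_the_requested_culture()
        {
            var formatter = new ReportFormatter();

            // Rewritten before the change; the test for the old
            // FormatAmount(decimal) overload was removed with the old method.
            Assert.Equal("$1,234.50",
                formatter.FormatAmount(1234.5m, CultureInfo.GetCultureInfo("en-US")));
        }
    }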

I would never avoid any refactoring because it breaks the tests. Writing unit tests can be a pain in the butt, but it's worth the pain in the long run.

Contrary to the other answers, it is important to note that some ways of testing can become fragile when the system under test (SUT) is refactored, if the tests are white-box.

If I'm using a mocking framework that verifies the order of the methods called on the mocks (even though the order is irrelevant because the calls are side-effect free), then refactoring my code so that those methods are called in a different, cleaner order will break my test. In general, mocks can introduce fragility into tests.
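
A sketch of the difference (Moq and xUnit assumed; IEmailSender, IAuditLog and OrderProcessor are hypothetical names): the first test pins the call order with a MockSequence and breaks if the two side-effect-free calls are swapped, while the second only verifies that the calls happened.

    using Moq;
    using Xunit;

    public interface IEmailSender { void Send(string message); }
    public interface IAuditLog { void Record(string entry); }

    // Hypothetical SUT: the order of the two collaborator calls is irrelevant.
    public class OrderProcessor
    {
        private readonly IEmailSender _email;
        private readonly IAuditLog _audit;

        public OrderProcessor(IEmailSender email, IAuditLog audit)
        {
            _email = email;
            _audit = audit;
        }

        public void Process(string orderId)
        {
            _email.Send($"Order {orderId} confirmed");
            _audit.Record(orderId);
        }
    }

    public class OrderProcessorTests
    {
        [Fact]
        public void Fragile_test_pins_the_call_order()
        {
            var email = new Mock<IEmailSender>(MockBehavior.Strict);
            var audit = new Mock<IAuditLog>(MockBehavior.Strict);

            // The strict mocks plus a MockSequence fail the test if Process()
            // ever calls Record() before Send() - a pure implementation detail.
            var sequence = new MockSequence();
            email.InSequence(sequence).Setup(e => e.Send(It.IsAny<string>()));
            audit.InSequence(sequence).Setup(a => a.Record(It.IsAny<string>()));

            new OrderProcessor(email.Object, audit.Object).Process("42");
        }

        [Fact]
        public void Robust_test_only_checks_that_the_calls_happened()
        {
            var email = new Mock<IEmailSender>();
            var audit = new Mock<IAuditLog>();

            new OrderProcessor(email.Object, audit.Object).Process("42");

            // Order-agnostic: this survives reordering the side-effect-free calls.
            email.Verify(e => e.Send(It.IsAny<string>()), Times.Once());
            audit.Verify(a => a.Record(It.IsAny<string>()), Times.Once());
        }
    }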

If I am checking the internal state of my SUT by exposing its private or protected members (we could use "Friend" in Visual Basic, escalate the access level to "internal" and use "InternalsVisibleTo" in C#, or, in many OO languages including C#, use a test-specific subclass), then suddenly the internal state of the class matters: you may be refactoring the class as a black box, but white-box tests will fail. Suppose a single field is reused to mean different things (not good practice!) depending on the SUT's state; if we split it into two fields, we may need to rewrite the broken tests.
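
A short sketch of that kind of white-box coupling (hypothetical Account class, xUnit assumed), using a test-specific subclass to reach a protected field: splitting or renaming the field breaks the test even though Account's observable behaviour is unchanged.

    using Xunit;

    public class Account
    {
        protected decimal balance;                      // internal detail
        public void Deposit(decimal amount) => balance += amount;
        public decimal Balance => balance;              // public, behavioural view
    }

    // Test-specific subclass that exposes the protected member.
    public class TestableAccount : Account
    {
        public decimal RawBalance => balance;
    }

    public class AccountWhiteBoxTests
    {
        [Fact]
        public void Deposit_updates_the_internal_field()
        {
            var account = new TestableAccount();
            account.Deposit(10m);

            // Coupled to the field rather than to observable behaviour;
            // asserting on account.Balance instead would survive refactoring.
            Assert.Equal(10m, account.RawBalance);
        }
    }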

Test-specific subclasses can also be used to test protected methods, which may mean that a refactoring from the point of view of the production code is a breaking change from the point of view of the test code. Moving a few lines into or out of a protected method may have no production side effects, but break a test.

If I use "test hooks" or any other test-specific or conditional compilation code, it can be hard to ensure that tests don't break because of fragile dependencies on internal logic.

So to prevent tests from becoming coupled to the intimate internal details of the SUT it may help to:

  • Use stubs rather than mocks, where possible (a sketch follows after this list). For more info see Fabio Pereira's blog on tautological tests, and my blog on tautological tests.
  • If using mocks, avoid verifying the order of methods called, unless it is important.
  • Try to avoid verifying the internal state of your SUT - use its external API if possible.
  • Try to avoid test-specific logic in production code.
  • Try to avoid using test-specific subclasses.

All of the points above are examples of white-box coupling used in tests. So to completely avoid refactoring breaking tests, use black-box testing of the SUT.
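
As a sketch of the stub-plus-black-box style recommended above (hypothetical IExchangeRates and PriceConverter, xUnit assumed): the hand-rolled stub just returns canned data, and the assertion is on the SUT's result rather than on which methods were called or in what order.

    using Xunit;

    public interface IExchangeRates { decimal RateFor(string currency); }

    // Hand-rolled stub: canned data, no interaction verification.
    public class FixedRates : IExchangeRates
    {
        public decimal RateFor(string currency) => 2m;
    }

    public class PriceConverter
    {
        private readonly IExchangeRates _rates;
        public PriceConverter(IExchangeRates rates) => _rates = rates;

        public decimal ToLocal(decimal price, string currency) =>
            price * _rates.RateFor(currency);
    }

    public class PriceConverterTests
    {
        [Fact]
        public void Converts_using_the_current_rate()
        {
            var converter = new PriceConverter(new FixedRates());

            // Black-box: only the public API and the returned value matter,
            // so the internals of ToLocal() can be refactored freely.
            Assert.Equal(20m, converter.ToLocal(10m, "EUR"));
        }
    }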

Disclaimer: For the purpose of discussing refactoring here, I am using the word a little more broadly to include changing internal implementation without any visible external effects. Some purists may disagree and refer exclusively to Martin Fowler and Kent Beck's book Refactoring - which describes atomic refactoring operations.

In practice, we tend to take slightly larger non-breaking steps than the atomic operations described there, and in particular changes that leave the production code behaving identically from the outside may not leave the tests passing. But I think it is fair to include "substituting one algorithm for another with identical behaviour" as a refactoring, and I think Fowler agrees. Martin Fowler himself says that refactoring may break tests:

>  When you write a mockist test, you are testing the outbound calls of the SUT to ensure it talks properly to its suppliers. A classic test only cares about the final state - not how that state was derived. Mockist tests are thus more coupled to the implementation of a method. Changing the nature of calls to collaborators usually cause a mockist test to break.
>
>  [...]
>
>  Coupling to the implementation also interferes with refactoring, since implementation changes are much more likely to break tests than with classic testing.

Fowler - Mocks Aren't Stubs

If your tests break when you're refactoring, then you're not, by definition, refactoring, which is "changing the structure of your program without changing the behaviour of your program".

Sometimes you DO need to change the behaviour of your tests. Maybe you need to merge two methods together (say, bind() and listen() on a listening TCP socket class), so the parts of your code that were written against the old API no longer work. But that's not refactoring!

I think the trouble with this question is that different people use the word 'refactoring' differently. I think it's best to carefully define the two things you probably mean:

  • Keep the API the same, but change how the API is implemented internally.
  • Change the API.

As one other person already noted, if you are keeping the API the same and all your regression tests operate on the public API, you should have no problems. Refactoring should cause no problems at all. Any failing test means either that your old code had a bug and the test encoded that bug, or that your new code has introduced one.

But that's pretty obvious. So by "refactoring" you PROBABLY mean that you are changing the API.

So let me answer how to approach that!

  • First create a NEW API that does what you want the NEW behavior to be. If this new API happens to have the same name as an OLDER API, I append the suffix _NEW to the new API's name.

    int DoSomethingInterestingAPI();

becomes:

    int DoSomethingInterestingAPI_NEW( int takes_more_arguments );
    int DoSomethingInterestingAPI_OLD();
    // the old name forwards to the new API with defaults that mimic the old behaviour
    int DoSomethingInterestingAPI()
        { return DoSomethingInterestingAPI_NEW( whatever_default_mimics_the_old_API ); }

OK - at this stage all your regression tests still pass, using the name DoSomethingInterestingAPI().

NEXT, go through your code and change all calls to DoSomethingInterestingAPI() to the appropriate variant of DoSomethingInterestingAPI_NEW(). This includes updating/rewriting whatever parts of your regression tests need to be changed to use the new API.

NEXT, mark DoSomethingInterestingAPI_OLD() as [[deprecated]]. Keep the deprecated API around as long as you like (until you've safely updated all the code that might depend on it).

With this approach, any failure in your regression tests is either a bug in that regression test or a bug in your code - exactly what you want. This staged process of revising an API by explicitly creating _NEW and _OLD versions allows the new and old code to coexist for a while.

I assume your unit tests are of a granularity that I would call "stupid" :) i.e. they test the absolute minutiae of each class and function. Step away from the code-generator tools and write tests that apply to a bigger surface; then you can refactor the internals as much as you want, knowing that the interfaces of your application have not changed and your tests still work.

If you want to have unit tests that test each and every method, then expect to have to refactor them at the same time.

>  keeping the testing suite in sync with the codebase during and after refactoring

What makes it difficult is coupling. Any test comes with some degree of coupling to implementation details, but unit tests (whether TDD-driven or not) are especially prone to this because they deal with internals: more unit tests means more code coupled to units, i.e. to method signatures and whatever other public interface the units expose - at the very least.

"Units" by definition are low-level implementation details, interface of units can and should change/split/merge and otherwise mutate as system evolves. Abundance of unit tests can actually hinder this evolution more than it helps.

How do you avoid breaking tests when refactoring? Avoid coupling. In practice this means avoiding unit tests as much as possible and preferring higher-level / integration tests that are more agnostic of implementation details. Remember, though, that there is no silver bullet: tests still have to couple to something at some level, but ideally it should be an interface that is explicitly versioned using Semantic Versioning, i.e. usually the published API/application level (you don't want to do SemVer for every single unit in your solution).
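
As an illustrative sketch (hypothetical InventoryApp facade, xUnit assumed): the test couples only to the published application-level API, so the internal units behind it can be split, merged or renamed without touching the test.

    using System.Collections.Generic;
    using Xunit;

    // Published facade: the only interface this test couples to.
    public class InventoryApp
    {
        private readonly Dictionary<string, int> _stock = new Dictionary<string, int>();

        public void Receive(string sku, int quantity)
        {
            _stock.TryGetValue(sku, out var current);
            _stock[sku] = current + quantity;
        }

        public int OnHand(string sku) =>
            _stock.TryGetValue(sku, out var quantity) ? quantity : 0;
    }

    public class InventoryAppTests
    {
        [Fact]
        public void Receiving_stock_increases_the_on_hand_count()
        {
            var app = new InventoryApp();

            app.Receive("SKU-1", 3);
            app.Receive("SKU-1", 2);

            // The internal representation (a dictionary today) is free to change.
            Assert.Equal(5, app.OnHand("SKU-1"));
        }
    }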

Your tests are too tightly coupled to the implementation rather than to the requirement.

Consider writing your tests with comments like this:

//given something
...test code...
//and something else
...test code...
//when something happens
...test code...
//then the state should be...
...test code...

This way you can't refactor the meaning out of the tests.
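
A filled-in version of that skeleton (hypothetical BankAccount, xUnit assumed), where each comment ties a block of test code to the requirement rather than to the implementation:

    using Xunit;

    public class BankAccount
    {
        public decimal Balance { get; private set; }
        public void Deposit(decimal amount) => Balance += amount;
        public void Withdraw(decimal amount) => Balance -= amount;
    }

    public class BankAccountTests
    {
        [Fact]
        public void Withdrawing_reduces_the_balance()
        {
            //given an account
            var account = new BankAccount();
            //and an existing balance of 100
            account.Deposit(100m);
            //when 30 is withdrawn
            account.Withdraw(30m);
            //then the balance should be 70
            Assert.Equal(70m, account.Balance);
        }
    }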

Licensed under: CC-BY-SA with attribution