Question

I'm working for a company with a huge legacy codebase. Changing existing code is almost guaranteed to break things. Most functions are a few thousand lines long, use global flags and have different modes, which basically lead to totally different functionality. Refactoring most of these things is impossible and won't happen because it is too expensive, although everything breaks every 1-2 weeks.

Part of the problem is our high hourly rates, which lead to short estimates, broken code and a lot of free fixing later on.

I'm trying to do a better job on the new applications, features and modules I’m adding, but I feel pretty much alone in this. No one else seems to realize how big a problem this is, probably because I'm the go-to person every time something complex breaks: most people are afraid to work on the existing code or don't understand it at all.

And when they do add something, they usually introduce another flag or mode, copy-paste the old behavior and add their change to that copy, which they call from only a few cases so the old stuff doesn't break. That makes the problem with the codebase even worse.

Although I care about the existing codebase my question is more about completely new features/modules inside that environment.

So far I’m focusing on:

  • keeping my functions short
  • giving everything meaningful names
  • doing some integration testing

We are talking PL/SQL, so there is no real OOP. Calling things dynamically hurts performance, which is pretty crucial, and it would let code with broken references compile, so that is not an option either.

The applications I’m developing are mostly material flow control systems. So most of the time the application, simplified, looks like a service that is basically a game loop with two modules:

Module 1:

  • getting data from a programmable logic controller (e.g. SIEMENS Simatic)
  • parsing the data into an array of structs (for each conveyor unit)
  • looping over the array and saving the current state to the database

Module 2:

  • getting the current transports from the database and putting them in an array of structs
  • looping over the array and, for each transport, looping over the possible routes
    • checking the database to see if the route is closed (e.g. because of a broken conveyor)
    • checking the database for other transports if the path is directional
  • either skipping the transport or creating a command for the PLC and updating the transport to send

At the end nearly all functions are basically checking some database state or modifying database state and all the outputs are determined by database state. So unit tests are pretty much impossible. The only things I could unit test are the parsing functions and some helper functions for conditionals, both of which are private by default because no one else uses them.

When I create a new application like this, my current workflow is to develop a simulation application at the same time that can act like the PLC, so I can do integration testing. That gives me some peace of mind when adding features or changing things, and it speeds up testing those features. Creating things bottom-up also improved my code a bit because I find it easier to create smaller functions with fewer parameters etc.

Sorry for the long text, but 99% of what I find is either about unit testing in general or about loose coupling via OOP/contracts/interfaces, which basically can’t be applied here, so I thought I needed to explain the applications I’m working with a bit.

Given those constraints, what approaches, techniques or tools are there to improve my code in the future and make it safer to change? Are there any useful principles, books or talks you could recommend?


Solution

At the end nearly all functions are basically checking some database state or modifying database state and all the outputs are determined by database state. So unit tests are pretty much impossible.

Yes, unit tests may be hard to introduce. But there is no "event driven GUI" in your way; it is just code that gets some input, does some processing and produces some output. This is an excellent starting point for regression tests.

You need

  • a test database

  • some code to bring the relevant parts of the test data into a defined initial state

  • some code to run the feature of the module you want to test

  • some "shadow" tables which contain the data in the state you expect

  • a module to compare the expected data with the current data (and report any differences)

(and, of course, some infrastructure bringing these things all together).

This will allow you to create a regression test suite first, then start introducing new features and refactorings into the code base and easily verify you did not break anything.
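
The pieces listed above can be sketched roughly as follows. This is only an illustration in Python, with an in-memory SQLite database standing in for the test database; the table names (`conveyor_state`, `conveyor_state_expected`) and the function `feature_under_test` are made up for the example and are not part of the asker's system. In Oracle, the same comparison would typically be written with `MINUS` instead of `EXCEPT`.

```python
import sqlite3


def reset_state(conn):
    """Bring the relevant test data into a defined initial state."""
    conn.executescript("""
        DROP TABLE IF EXISTS conveyor_state;
        DROP TABLE IF EXISTS conveyor_state_expected;
        CREATE TABLE conveyor_state (unit_id INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE conveyor_state_expected (unit_id INTEGER PRIMARY KEY, status TEXT);
        INSERT INTO conveyor_state VALUES (1, 'IDLE'), (2, 'IDLE');
        -- Shadow table: the state we expect after the feature has run.
        INSERT INTO conveyor_state_expected VALUES (1, 'BUSY'), (2, 'IDLE');
    """)


def feature_under_test(conn):
    """Hypothetical stand-in for the real procedure being tested."""
    conn.execute("UPDATE conveyor_state SET status = 'BUSY' WHERE unit_id = 1")


def diff_tables(conn, actual, expected):
    """Report rows that appear in only one of the two tables.

    Table names are interpolated directly, which is acceptable only
    because they are fixed names chosen by the test author.
    """
    missing = conn.execute(
        f"SELECT * FROM {expected} EXCEPT SELECT * FROM {actual}"
    ).fetchall()
    unexpected = conn.execute(
        f"SELECT * FROM {actual} EXCEPT SELECT * FROM {expected}"
    ).fetchall()
    return {"missing": missing, "unexpected": unexpected}


def run_regression_test():
    """Reset, execute, compare: the 'infrastructure' tying it together."""
    conn = sqlite3.connect(":memory:")
    reset_state(conn)
    feature_under_test(conn)
    return diff_tables(conn, "conveyor_state", "conveyor_state_expected")


if __name__ == "__main__":
    print(run_regression_test())
```

An empty `missing` and `unexpected` list means the feature left the data in the expected state. Note that `EXCEPT` compares sets of rows, so tables that may legitimately contain duplicate rows would need an additional row-count check.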

If required, you can also "simulate" dependency injection in PL/SQL; see this older article. It might help you to decouple your "subject under test" from parts of the code base which may block testing (like the PLC code).
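
To illustrate the general idea of decoupling the code under test from the PLC, here is a minimal sketch in Python. It is not the technique from the linked article, which is not reproduced here. The telegram format (`"id:status;id:status"`) and all names (`ConveyorState`, `parse_telegram`, `poll_once`, `fake_plc`) are assumptions made up for the example. In PL/SQL, one comparable option might be to route all PLC access through a single package whose body is replaced in the test schema.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ConveyorState:
    unit_id: int
    status: str


def parse_telegram(raw: str) -> List[ConveyorState]:
    """Parse a hypothetical 'id:status;id:status' telegram into structs."""
    states = []
    for part in raw.split(";"):
        if not part:
            continue  # tolerate a trailing separator
        unit_id, status = part.split(":")
        states.append(ConveyorState(int(unit_id), status))
    return states


def poll_once(
    read_plc: Callable[[], str],
    save_state: Callable[[ConveyorState], None],
) -> int:
    """One loop iteration. Both dependencies are passed in, so a test
    can replace the PLC and the database with simple stand-ins."""
    states = parse_telegram(read_plc())
    for state in states:
        save_state(state)
    return len(states)


def fake_plc() -> str:
    """Test double that plays the role of the PLC."""
    return "1:BUSY;2:IDLE"


# Usage: collect the saved states in a list instead of writing to a database.
saved: List[ConveyorState] = []
count = poll_once(fake_plc, saved.append)
```

With this shape, the parsing and loop logic can be exercised without a PLC or a simulator, while the simulator remains useful for integration tests.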

OTHER TIPS

If all the following are true:

  • it is old;
  • it is "ugly";
  • behavior must not change, except the details specifically requested;

then the answer is simple: do not refactor. While beautiful, maintainable code is a worthy goal, businesses could not care less about beauty. They rely on that behavior.

The stories are countless on the internet. I can even tell you stories from my own experience.

Other pieces of food for thought:

  • did you estimate how much time you need to implement the refactoring?
  • will management agree to let you spend all that time (time = money) on the refactoring?
  • how will you make sure that the behavior will not change, not even in some small details?
Licensed under: CC-BY-SA with attribution