Prevent deprecated code from compiling after reaching a deadline [closed]

https://softwareengineering.stackexchange.com/questions/367264

30-01-2021

Question

In my team we have been cleaning a lot of old stuff in a big monolithic project (whole classes, methods, etc.).

During those cleaning tasks I was wondering if there is some kind of annotation or library fancier than the usual @Deprecated. This @FancyDeprecated should prevent the build of the project from succeeding if you haven't cleaned up old unused code after a particular date has passed.

I have been searching the Internet and didn't find anything that has the capabilities described below:

should be an annotation, or something similar, to place on the code you intend to delete before a particular date
before that date the code will compile and everything will work normally
after that date the code will not compile and it will give you a message warning you about the problem

I think I am searching for a unicorn... Is there any similar technology for any programming language?

As a plan B I am thinking of the possibility of making the magic with some unit tests of the code that is intended to be removed that start to fail at the "deadline". What do you think about this? Any better idea?

Solution

I don't think this would be a useful feature if it really prohibits compilation. If on 01/06/2018 large parts of the code that compiled the day before suddenly won't compile, your team will quickly remove that annotation again, whether the code has been cleaned up or not.

However, you could add some custom annotation to the code like

@Deprecated_after_2018_07_31

and build a small tool to scan for those annotations. (A simple one-liner in grep will do, if you don't want to use reflection.) In languages other than Java, a standardized comment suitable for "grepping", or a preprocessor definition, can be used.

Then run that tool shortly before or after the particular date, and if it still finds that annotation, remind the team to clean up those code parts urgently.
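
As a rough illustration of such a scanning tool, here is a minimal Java sketch. It assumes the marker naming scheme shown above (@Deprecated_after_YYYY_MM_DD) and works on plain source text, so it would match markers in comments as well. The class and method names are made up for this example, and a malformed date such as 2018_13_45 would make LocalDate.of throw.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Scans source text for @Deprecated_after_YYYY_MM_DD markers whose date has passed. */
final class DeprecationScanner {
    // Matches the marker naming scheme from the answer, e.g. @Deprecated_after_2018_07_31.
    private static final Pattern MARKER =
            Pattern.compile("@Deprecated_after_(\\d{4})_(\\d{2})_(\\d{2})");

    private DeprecationScanner() {}

    /** Returns one "line N: marker" entry for every marker dated strictly before today. */
    static List<String> findExpired(String source, LocalDate today) {
        List<String> expired = new ArrayList<>();
        String[] lines = source.split("\\R"); // split on any line terminator
        for (int i = 0; i < lines.length; i++) {
            Matcher m = MARKER.matcher(lines[i]);
            while (m.find()) {
                LocalDate deadline = LocalDate.of(
                        Integer.parseInt(m.group(1)),
                        Integer.parseInt(m.group(2)),
                        Integer.parseInt(m.group(3)));
                if (today.isAfter(deadline)) {
                    expired.add("line " + (i + 1) + ": " + m.group());
                }
            }
        }
        return expired;
    }
}
```

Passing the current date in as a parameter, rather than reading the clock inside the scanner, keeps the result reproducible: the tool only reports, and whoever runs it decides what "today" means.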

OTHER TIPS

This would constitute a feature known as a time bomb. DON'T CREATE TIME BOMBS.

Code, no matter how well you structure and document it, will turn into an ill-understood near-mythical black box if it lives beyond a certain age. The last thing anyone in the future needs is yet another strange failure mode that catches them totally by surprise, at the worst possible time, and without an obvious remedy. There is absolutely no excuse for intentionally producing such a problem.

Look at it this way: if you're organized and aware of your code base enough that you care about obsolescence and follow through on it, then you don't need a mechanism within the code to remind you. If you're not, chances are that you are also not up to date on other aspects of the code base, and will probably be unable to respond to the alarm correctly and in a timely manner. In other words, time bombs serve no good purpose for anyone. Just Say No!

In C# you would use the ObsoleteAttribute in the following manner:

In version 1, you ship the feature. A method, class, whatever.
In version 2, you ship a better feature intended to replace the original feature. You put an Obsolete attribute on the feature, set it to "warning", and give it a message that says "This feature is deprecated. Use the better feature instead. In version 3 of this library, which will be released on such and such a date, use of this feature will be an error." Now users of the feature can still use it, but have time to update their code to use the new feature.
In version 3, you update the attribute to be an error rather than a warning, and update the message to say "This feature is deprecated. Use the better feature instead. In version 4 of this library, which will be released on such and such a date, this feature will throw." Users who failed to heed your previous warning still get a helpful message that tells them how to fix the problem, and they must fix it, because now their code doesn't compile.
In version 4, you change the feature so that it throws some fatal exception, and change the message to say that the feature will be removed entirely in the next version.
In version 5, you remove the feature entirely, and if users complain, well hey, you gave them three release cycles of fair warning, and they can always just keep on using version 2 if they feel strongly about it.

The idea here is to make a breaking change as painless as possible for the users affected, and to ensure that they can continue to use the feature for at least one version of the library.

You've misunderstood what "deprecated" means. Deprecated means:

be usable but regarded as obsolete and best avoided, typically because it has been superseded.

(Oxford Dictionaries)

By definition, a deprecated feature will still compile.

You are looking to remove the feature on a specific date. That's fine. The way you do that is you remove it on that date.

Until then, mark as deprecated, obsolete, or whatever your programming language calls it. In the message, include the date it will be removed and the thing that replaces it. This will generate warnings, indicating that other developers should avoid new usage and should replace old usage wherever possible. Those developers will either comply or ignore it, and someone will have to deal with the consequences of that when it's removed. (Depending on the situation, it might be you or it might be the developers using it.)
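
In Java 9 and later, for instance, this could look roughly like the sketch below. The since and forRemoval elements of @Deprecated are standard, but the class, the method names, the version number and the removal date are all made up for illustration.

```java
/** Hypothetical service with one deprecated entry point and its replacement. */
final class ReportService {
    private ReportService() {}

    /**
     * @deprecated Scheduled for removal on 2018-07-31 (illustrative date).
     *             Use {@link #renderReport()} instead.
     */
    @Deprecated(since = "2.0", forRemoval = true)
    static String legacyReport() {
        // Delegate to the replacement so both paths behave the same until removal.
        return renderReport();
    }

    /** The replacement that callers should migrate to. */
    static String renderReport() {
        return "report";
    }
}
```

With forRemoval = true, javac reports a "removal" warning at call sites outside the declaring class, and the Javadoc text carries the date and the replacement to whoever sees that warning.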

Don't forget that you need to retain the ability to build and debug older versions of the code in order to support versions of the software that have already been released. Sabotaging a build after a certain date means that you also risk preventing yourself from doing legitimate maintenance and support work in the future.

Also, it seems like a trivial workaround to set my machine's clock back a year or two before compiling.

Remember, "deprecated" is a warning that something will be going away in the future. When you want to forcefully prevent people from using that API, just remove the associated code. There's no point in leaving code in the code base if some mechanism makes it unusable. Removing the code gives you the compile-time checks you're looking for, and doesn't have a trivial workaround.

Edit: I see you refer to "old unused code" in your question. If the code really is unused, there's no point in deprecating it. Just delete it.

I have never seen such a feature before - an annotation that starts taking effect after a particular date.

The @Deprecated can be sufficient, however. Catch warnings in CI, and make it refuse to accept the build if any are present. This shifts the responsibility from the compiler to your build pipeline, but has the advantage that you can (semi)easily alter the build pipeline by adding additional steps.

Note that this answer does not fully solve your problem (e.g. local builds on developers' machines would still succeed, although with warnings) and assumes that you have a CI pipeline set up and running.

You are looking for calendars or todo lists.

Another alternative is to use custom compiler warnings or compiler messages, provided you manage to keep few if any warnings in your codebase. If you have too many warnings, you'll need to spend additional effort (about 15 minutes?) picking the relevant warning out of the build report that your continuous integration delivers on each build.

Reminders that code needs to be fixed are good and necessary. Sometimes these reminders do have strict real world deadlines, so putting them on a timer may be necessary as well.

The goal is to continuously remind people that the issue exists and needs to be fixed within a given timeframe. A feature that simply breaks the build at a specific time not only fails to do that, it is itself an issue that needs to be fixed within a given timeframe.

One way to think about this is to ask what you mean by time/date. Computers don't know what these concepts are: they have to be programmed in somehow. It's quite common to represent times in the UNIX format of "seconds since the epoch", and it's common to feed a particular value into a program via OS calls. However, no matter how common this usage is, it's important to keep in mind that it's not the "actual" time: it is just a logical representation.

As others have pointed out, if you made a "deadline" using this mechanism, it's trivial to feed in a different time and break that "deadline". The same goes for more elaborate mechanisms like asking an NTP server (even over a "secure" connection, since we can substitute our own certificates, certificate authorities or even patch the crypto libraries). At first it might appear that such individuals are at fault for working around your mechanism, but it may be the case that it's done automatically and for good reasons. For example, it's a good idea to have reproducible builds, and tools to help this might automatically reset/intercept such non-deterministic system calls. libfaketime does exactly that, Nix sets all files' timestamps to 1970-01-01 00:00:01, Qemu's record/replay feature fakes all hardware interaction, etc.

This is similar to Goodhart's law: if you make a program's behaviour depend on the logical time, then the logical time ceases to be a good measure of the "actual" time. In other words, people generally won't mess with the system clock, but they will if you give them a reason to.

There are other logical representations of time: one of them is the software's version (either your app or some dependency). This is a more desirable representation for a "deadline" than e.g. UNIX time, since it's more specific to the thing you care about (changing feature sets/APIs) and hence less likely to trample on orthogonal concerns (e.g. fiddling with the UNIX time to work around your deadline could end up breaking log files, cron jobs, caches, etc.).

As others have said, if you control the library and want to "push" this change, you can push a new version which deprecates the features (causing warnings, to help consumers find and update their usage), then another new version which removes the features entirely. You could publish these immediately after each other if you like, since (again) versions are merely a logical representation of time, they need not be related to the "actual" time. Semantic versioning may help here.

The alternative model is to "pull" the change. This is like your "plan B": add a test to the consuming application, which checks that the version of this dependency is at least the new value. As usual, red/green/refactor to propagate this change through the codebase. This may be more appropriate if the functionality isn't "bad" or "wrong", but just "a bad fit for this use-case".

An important question with the "pull" approach is whether or not the dependency version counts as a "unit" (of functionality), and hence deserves testing; or whether it's just a "private" implementation detail, which should only be exercised as part of actual unit (of functionality) tests. I'd say: if the distinction between the dependency's versions really does count as a feature of your application, then do the test (for example, checking that the Python version is >= 3.x). If not, then don't add the test (since it will be brittle, uninformative and overly restrictive); if you control the library then go down the "push" route. If you don't control the library then just use whatever version is provided: if your tests pass then it's not worth restricting yourself; if they don't pass then that's your "deadline" right there!
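
If you do decide the version check deserves a test, a minimal Java sketch of the comparison might look like this. It assumes plain dotted numeric versions (no qualifiers such as -SNAPSHOT), and the class and method names are invented for this example.

```java
/** Compares dotted numeric version strings such as "2.10.1". */
final class VersionGate {
    private VersionGate() {}

    /** Negative, zero or positive as a is older than, equal to or newer than b. */
    static int compare(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            // Missing trailing components count as 0, so "3" equals "3.0.0".
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Throws if the actual version is older than the required minimum. */
    static void requireAtLeast(String actual, String minimum) {
        if (compare(actual, minimum) < 0) {
            throw new IllegalStateException(
                    "Dependency version " + actual + " is older than required " + minimum);
        }
    }
}
```

Where the actual version string comes from depends on the dependency. One possible source is Package.getImplementationVersion(), which reads the jar manifest and may return null if the manifest does not declare a version.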

There is another approach, if you want to discourage certain uses of a dependency's features (e.g. calling certain functions which don't play well with the rest of your code), especially if you don't control the dependency: have your coding standards forbid/discourage the use of these features, and add checks for them to your linter.

Each of these will be applicable in different circumstances.

You manage this at package or library level. You control a package and control its visibility. You are free to retract visibility. I've seen this internally at large companies and it only makes sense in cultures that respect ownership of packages even if the packages are open source or free to use.

This is always messy because the client teams simply don't want to change anything, so you often need some rounds of whitelist-only as you work with the specific clients to agree on a deadline to migrate, possibly offering them support.

One requirement is to introduce a notion of time into the build. In C, C++, or other languages/build systems which use a C-like preprocessor¹, one could introduce a time stamp through defines for the preprocessor at build time: CPPFLAGS="-DTIMESTAMP()=$(date '+%s')" (quoted so that the shell does not trip over the parentheses). This would likely happen in a makefile, where GNU make would need the command substitution written as $(shell date '+%s').

In the code one would compare that token and cause an error if time is up. Note that using a function macro catches the case that somebody didn't define TIMESTAMP.

#if TIMESTAMP() == 0 || TIMESTAMP() > 1520616626
#   error "The time for this feature has run out, sorry"
#endif

Alternatively, one could simply "define out" the code in question when the time has come. That would allow the program to compile, provided nobody uses it. Say, we have a header defining an api, "api.h", and we don't allow calling old() after a certain time:

//...
void new1();
void new2();
#if TIMESTAMP() < 1520616626
   void old();
#endif
//...

A similar construct would probably eliminate old()'s function body from some source file.

Of course this is not foolproof; one can simply define an old TIMESTAMP in case of the Friday night emergency build mentioned elsewhere. But that is, I think, rather advantageous.

This obviously works only when the library is re-compiled — after that the obsolete code simply does not exist any more in the library. It would not prevent client code from linking to obsolete binaries though.

¹ C# only supports the simple definition of preprocessor symbols, no numerical values, which makes this strategy not viable.

In Visual Studio, you can set up a pre-build script that throws an error after a certain date. This will prevent compilation. Here's a script that throws an error after March 12, 2018, that is, from March 13 onwards (taken from here):

@ECHO OFF

SET CutOffDate=2018-03-12

REM These indexes assume %DATE% is in format:
REM   Abr MM/DD/YYYY - ex. Sun 01/25/2015
SET TodayYear=%DATE:~10,4%
SET TodayMonth=%DATE:~4,2%
SET TodayDay=%DATE:~7,2%

REM Construct today's date to be in the same format as the CutOffDate.
REM Since the format is a comparable string, it will evaluate date orders.
IF %TodayYear%-%TodayMonth%-%TodayDay% GTR %CutOffDate% (
    ECHO Today is after the cut-off date.
    REM throw an error to prevent compilation
    EXIT /B 2
) ELSE (
    ECHO Today is on or before the cut-off date.
)

Make sure to read the other answers on this page before using this script.

I understand the objective of what you're trying to do. But as others have mentioned, the build system/compiler is probably not the right place to enforce this. I'd suggest the more natural layer to enforce this policy is either the SCM or environment variables.

If you do the latter, basically add a feature flag that marks a pre-deprecation run. Every time you construct the deprecated class or call a deprecated method, check the feature flag. Just define a single static function assertPreDeprecated() and add a call to it on every deprecated code path. If the flag is set, the assert calls do nothing. If it's not, they throw an exception. Once the date rolls past, unset the feature flag in the runtime environment. Any lingering calls to the deprecated code will show up in runtime logs.
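
A minimal Java sketch of that gate, under a few assumptions: the flag lives in an environment variable, its name (PRE_DEPRECATION) and the value "true" are placeholders I picked, and LegacyExporter stands in for whatever class you are deprecating.

```java
/** Runtime gate for deprecated code paths, driven by an environment flag. */
final class DeprecationGate {
    // Hypothetical variable name; pick whatever fits your deployment.
    static final String FLAG_NAME = "PRE_DEPRECATION";

    private DeprecationGate() {}

    /** Call at the top of every deprecated constructor or method. */
    static void assertPreDeprecated() {
        check(System.getenv(FLAG_NAME));
    }

    // Kept separate from the environment lookup so the rule can be tested directly.
    static void check(String flagValue) {
        if (!"true".equalsIgnoreCase(flagValue)) {
            throw new IllegalStateException(
                    "Deprecated code path used while " + FLAG_NAME + " is not set");
        }
    }
}

/** Stand-in for a deprecated class: constructing it consults the gate. */
final class LegacyExporter {
    LegacyExporter() {
        DeprecationGate.assertPreDeprecated();
    }
}
```

While the flag is set the gate is a no-op. Once it is unset, every remaining use of LegacyExporter fails with a stack trace that points at the caller still to be migrated.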

For an SCM-based solution, I'll assume you're using git and git-flow. (If not, the logic is easily adaptable to other VCSs.) Create a new branch postDeprecated. In that branch, delete all the deprecated code and work on removing any references until it compiles. Continue to make any normal changes on the master branch. Keep merging any changes in master that are unrelated to the deprecation back into postDeprecated to minimize integration challenges.

After the deprecation date passes, create a new preDeprecated branch from master. Then merge postDeprecated back into master. Assuming your release goes off the master branch, you should now be using the post-deprecated code after the date. If there's an emergency, or you can't deliver results in time, you can always roll back to preDeprecated and make any needed changes on that branch.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange