Why are minor breaking language changes not handled with transpilers?

https://softwareengineering.stackexchange.com/questions/349311

12-01-2021

Question

For any successful programming language, making a breaking change to the language is extremely difficult. This is the case even for those changes where the legacy code could be reliably and automatically fixed with a simple tool (effectively, a trivial transpiler from the old version to the new). Among such changes might be:

a deprecation of a feature that can be converted to its equivalent, presumably safer or otherwise preferable, representation
an addition of a new reserved keyword (such identifiers can be renamed in the legacy code)
a requirement to be explicit in something that used to be implicit in the past, e.g., to disambiguate from some new feature being added to the language

Why is it not considered perfectly acceptable to force such breaking changes on users as long as a conversion tool is also provided?

To clarify, I'm talking about conversion tools that are guaranteed (by the definition of the previous and the current version) to never require manual intervention.
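
To make the "new reserved keyword" case concrete, here is a minimal sketch of such a converter, written in Python against Python source. It assumes a hypothetical future keyword `fetch`; the function name `rename_identifier` and the replacement name `fetch_` are made up for illustration. It only touches identifier tokens, so string literals and comments are left alone.

```python
import io
import tokenize


def rename_identifier(source, old, new):
    """Rename every identifier token equal to `old`; strings and comments are untouched."""
    lines = io.StringIO(source).readlines()
    hits = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.NAME and tok.string == old
    ]
    # Edit from the last hit to the first so earlier column offsets stay valid.
    for tok in reversed(hits):
        row, col = tok.start
        line = lines[row - 1]
        lines[row - 1] = line[:col] + new + line[col + len(old):]
    return "".join(lines)


legacy = 'fetch = 1\nprint("fetch", fetch)  # fetch\n'
print(rename_identifier(legacy, "fetch", "fetch_"))
```

Note that the string "fetch" is deliberately not rewritten. If the legacy code looks that name up dynamically (for example via getattr), a token-level tool has no way of knowing.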

Solution

You focus on syntax changes to a language. Syntax changes are fairly easy to pull off in a backwards-compatible manner:

The new syntax would have been illegal in previous versions. This is why new language features often reuse keywords or introduce contextual keywords.
The new features have to be enabled explicitly, either as a feature switch in the source code (like Perl's use v5.22 or use feature 'state'), or as a compiler option (like g++ --std=c++14 or javac -source 1.8). Both of these effectively put the transpiler into the compiler frontend itself, thus avoiding the need for an extra tool.

If a language is not yet very stable, a source conversion tool is more desirable than supporting old syntax versions in the compiler. For Golang, go fix is a tool that can rewrite code to reflect syntax and API changes. But in the long run, stability is more important: while you might not do this every day, it is important that even decades-old code can still be executed without extra hassle. Backwards-incompatible platform changes may be necessary, but they also explain why some organizations need to keep some DOS or Windows XP machines around…

Some language changes do not affect the syntax, but the semantics. Rewriting code to reflect semantic changes might be possible in some simple cases, but is impossible in the general case: you quickly run into the halting problem.

To aid the Python 2 to Python 3 transition, a 2to3 source converter was written. With this tool, there is no expectation that it will always work without intervention. Simple updates like rewriting print statements as print functions are of course no problem. But the changes to the numeric types (long was folded into int, which is no longer fixed-size, and the / operator now always produces floats) are already more difficult – remember that Python is dynamically typed. Non-trivial uses of metaprogramming are entirely hopeless. The lesson here isn't that you shouldn't write such tools, but that tools are very limited in what they can do.
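
As a small illustration of the / problem (the function names here are invented for the example): under Python 2, average(7, 2) returned 3, under Python 3 it returns 3.5. A converter might be tempted to rewrite / as // to keep the old behaviour, but that is only correct when both arguments are ints, which cannot be known from the source alone.

```python
def average(total, count):
    # Python 2: floor division when both arguments are ints.
    # Python 3: always true division.
    return total / count


def average_converted(total, count):
    # What a naive converter might emit to preserve the Python 2 int behaviour.
    return total // count


print(average(7, 2))              # 3.5 on Python 3
print(average_converted(7, 2))    # 3, matches the old int behaviour
print(average_converted(7.0, 2))  # 3.0, but Python 2 gave 3.5 for float input
```

Whichever operator the tool picks, some caller ends up with a different result than before.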

For libraries and some applications, it is often important that they can be used with multiple language versions. If we had to update the source code to use a newer language version, this would prevent it from working on other language versions as well. Such programs must be written in the common subset of all language versions. This is not hypothetical, but e.g. common in C++ (where you maybe want your header files to be legal C code, or want to support multiple C++ standards), or in applications for interpreted languages like Perl, Python, PHP where you don't want to force your users to upgrade (they often can't upgrade easily).
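
A rough sketch of what "writing in the common subset" tends to look like in Python code that has to run on both 2 and 3. The names integer_types, text_type and is_integer are only illustrative (compatibility libraries offer similar shims); the point is that the version check lives in the source, which a one-way converter would destroy.

```python
import sys

if sys.version_info[0] >= 3:
    integer_types = (int,)
    text_type = str
else:
    # Only evaluated on Python 2, where these names exist.
    integer_types = (int, long)  # noqa: F821
    text_type = unicode  # noqa: F821


def is_integer(value):
    """True for any integer, regardless of whether long is a separate type."""
    return isinstance(value, integer_types)


print(is_integer(10 ** 30), is_integer(1.5))
```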

Breaking language changes are then only OK if you have control over all aspects of deployment, and have the time to verify the changes made by some conversion tool. Otherwise, users are strongly incentivized to avoid upgrading to the new language version: breaking changes are highly disruptive, and are a strong indicator that the language is not yet fit for serious development.

Other tips

If a deprecated feature can be safely converted to its equivalent, is it really deprecated?

Programming languages are there for the programmers, not for the computers. You could feed a computer mumbo-jumbo and if you had a mumbo-jumbo interpreter it would run just fine.

Making small breaking changes to a language will catch more than a few programmers off-guard.

I think you're looking at it from the wrong, computer-centered perspective, when you should be looking at it from the programmer-centered one.

Without making potentially "breaking" changes, a language cannot evolve (it has to be abandoned and replaced instead).

There are a number of strategies to deal with and minimise the disruption of such changes; their use varies from language to language and depends a lot on the mindset of the language developers.

These include:

Backwards compatibility flags to allow you to specify which version of the language you are using
Semantic Versioning lets you know how much to expect to have changed
Long support lifetimes give you time to catch up
Really clear error messages when old code is encountered
Deprecation warnings long before the final change is introduced
Supplying automatic porting tools - although this can be a problem if you still need to support the older versions as well as the new.
Manual porting tools - ditto
Back porting allows you to start introducing the new features while still using the old version
Really good documentation and change history
Good and responsive support possibly from the authors of the change(s).
Community support in web forums, SO, etc.
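
For the deprecation-warning item in the list above, a minimal sketch using Python's standard warnings module. The functions old_api and new_api are placeholders: the old entry point keeps working, but tells the caller what to switch to well before it is removed.

```python
import warnings


def new_api(x):
    return x * 2


def old_api(x):
    # stacklevel=2 attributes the warning to the caller, not to this wrapper.
    warnings.warn(
        "old_api() is deprecated; use new_api() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_api(x)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api(21)

print(result, caught[0].category.__name__)
```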

All of the above can work really well when moving forward incrementally, i.e. from one version to the next, but in general you are likely to hit problems when jumping forward numerous steps, as all too often happens when you need to resurrect a project that hasn't been touched in years. Porting tools will try to help but are by nature prone to getting things wrong, because the author(s) of the tools can't know every usage.

Open source projects, especially those with large developer and user bases, tend to use all of the above.

The least popular attitude is "it has changed, deal with it", along with breaking changes shipped on a minor version number or even a build number change (without mentioning any specific company).

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange