Using data to maintain compatibility when evolving a RESTful service

https://softwareengineering.stackexchange.com/questions/378379

07-02-2021

Question

I've been reading up on API versioning for REST APIs.

Most of the articles I've come across (here's an example) seem to focus on two options:

URI based versioning, e.g. v1/my_resource/
Media type versioning, i.e. adding version information to the Accept header.

I have also seen proposals of adding a custom version parameter or header.

Generally though, from a philosophical point of view, one appears to end up in non-RESTful territory with significant complexity and caching implications, and REST purists would insist that versioning is wrong and that you need to maintain backwards compatibility (here are Roy Fielding's words on the matter).

One thing I haven't seen mentioned is using the data payload as a solution to the problem of changing your functionality while keeping backwards compatibility.

Take the following example JSON body:

{
  "v15": { 
    "field2": "abc",
    "field3": 42
  },
  "v14": {
    "field1": "code still supports me",
    "field2": "abc"
  }
}

The idea here would be to have the newer API consumers act on the v15 data and the older consumers will still get their required v14 data while ignoring the v(n > 14) data by design. Of course, the drawback is that both the producer and consumer code will need to be able to handle two or more versions of the data.
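To make the idea concrete, here is a minimal Python sketch of how a consumer might pick its section out of such a body. The function name `select_payload` and the "newest supported version first" ordering are assumptions made for illustration; they are not part of the question.

```python
import json

# Example body from the question: one section per payload version.
BODY = """
{
  "v15": {"field2": "abc", "field3": 42},
  "v14": {"field1": "code still supports me", "field2": "abc"}
}
"""

def select_payload(body, supported):
    """Return (version, data) for the first supported version found.

    `supported` lists the version keys this consumer understands,
    ordered from most to least preferred (hypothetical convention).
    Sections for any other version are ignored by design.
    """
    envelope = json.loads(body)
    for version in supported:
        if version in envelope:
            return version, envelope[version]
    raise ValueError("no supported version found in payload")

# An older consumer only knows v14; a newer one prefers v15.
old_client = select_payload(BODY, ["v14"])
new_client = select_payload(BODY, ["v15", "v14"])
```

The older consumer never looks at the `v15` section, while the producer has to keep emitting both sections for as long as v14 consumers exist.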

Still, the tradeoff on the face of it seems appealing to me. I am sceptical, however, since I haven't seen this solution described in any articles I've found.

My question is: What am I missing? Is this a bad solution for evolving a service?

Solution

Is this a bad solution for evolving a service?

No, but it needs more care and attention.

(REST doesn't really enter into the problem; this is the more general problem of evolving messages.)

If clients and servers are going to be independently deployable, then we face the following problem: a client from year:2017 needs to be understood by a server from year:2016, a server from year:2017 and a server from year:2018.

Presumably client:2017 to server:2017 is trivial -- both are using the 2017 definition of the message, and so can understand each other perfectly.

But how do we arrange that the client can talk to past and future servers? Either (a) the definition needs to be completely fixed for all time, or (b) the way the message evolves over time needs to be constrained.

More precisely, we need to describe the processing model for the message in such a way that sensible things happen when the producer and consumer are not using precisely the same schema.

What this usually means is that

required elements are fixed
elements are never repurposed (instead, create a new element)
optional elements can be added, with a specified value to be used in the absence of the element
unrecognized elements are ignored

So if the client sends a message with optional elements {A,B}

... then server:2016, which doesn't know about element B, just ignores it.

... then server:2017 uses both elements A and B.

... then server:2018, uses the elements A and B, and also the specified default value for element C.
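The rules above (ignore what you don't recognize, fall back to a specified default for a missing optional element) can be sketched in a few lines of Python. The `SCHEMAS` table, the default values, and the `read_message` name are invented for illustration only.

```python
# Hypothetical schemas: for each server year, the optional elements it
# knows about and the default used when an element is absent.
SCHEMAS = {
    2016: {"A": 0},
    2017: {"A": 0, "B": 0},
    2018: {"A": 0, "B": 0, "C": 10},
}

def read_message(message, server_year):
    """Read a message the way a server of the given year would.

    Unrecognized elements are ignored; missing optional elements
    take the default specified in that year's schema.
    """
    known = SCHEMAS[server_year]
    return {name: message.get(name, default) for name, default in known.items()}

# client:2017 sends optional elements A and B.
client_2017_message = {"A": 1, "B": 2}

seen_by_2016 = read_message(client_2017_message, 2016)  # B is ignored
seen_by_2017 = read_message(client_2017_message, 2017)  # A and B are used
seen_by_2018 = read_message(client_2017_message, 2018)  # C takes its default
```

The same message is valid input for all three servers because none of them rejects a message merely for carrying more or fewer optional elements than it expects.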

There's a lot of good material available that discusses these ideas in detail

Gierke: Evolving Distributed Systems
Pieter Hintjens: The End of Software Versions
Greg Young: Versioning in an Event Sourced System
Rich Hickey: Spec-ulation
David Orchard: Extensibility, XML Vocabularies, and XML Schema

Other answers

Yes, it is a bad solution for evolving a service.

The key thing about the way you send the version information is that it can alter the routing of the request.

You want to have your version 14 API on a different box from your version 15 API, and to use information in the incoming request to decide which box to send the request to.

A hostname, header or URL path can be picked up by load balancers or web servers fairly easily and used for that purpose.

However, if that information is embedded in the body of the request as part of the JSON, it's much harder to get at. Sure, you can program an API to parse the JSON and understand the version number, but can you set up an HAProxy ACL rule to do it?

I should also add that you don't really want the API consumers to have to worry about the version too much. Ideally they should be able to omit the version and get the latest one.

This is why I prefer the version header method. The API interface can remain ignorant of other versions but consumers who want a particular version can add the header as required.
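As a rough sketch of that routing decision, written in Python rather than as load balancer configuration: the header name `X-API-Version` and the backend names are made up for the example, and a request without the header falls through to the latest version.

```python
# Hypothetical header name and backend pool names, for illustration only.
VERSION_HEADER = "X-API-Version"
BACKENDS = {"14": "api-v14.internal", "15": "api-v15.internal"}
LATEST_VERSION = "15"

def choose_backend(headers):
    """Pick the backend for a request using only its headers.

    The body is never parsed, so this decision could equally be made
    by a load balancer in front of the API boxes.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    version = normalized.get(VERSION_HEADER.lower(), LATEST_VERSION).strip()
    try:
        return BACKENDS[version]
    except KeyError:
        raise LookupError(f"unsupported API version: {version!r}") from None

default_backend = choose_backend({"Accept": "application/json"})
pinned_backend = choose_backend({"x-api-version": "14"})
```

Each backend only ever sees requests for its own version, which is what lets the API code stay ignorant of the others.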

Another important caveat: data versioning is a thing, and you may need to version your data as well as your API, especially if you are processing data that has been around a long time. But normally a v14 API will be able to safely assume v14 data.

You should work to maintain backwards compatibility within reason.

I don't think Roy Fielding is necessarily arguing against breaking changes when they are required to evolve the application.

The best solution for versioning REST services that I have seen (and used as a consumer) is to specify a query string parameter. For example, Microsoft services take an api-version parameter which allows them to route the request to the correct version of the service.

If I specify api-version=4.1-preview.3 then it is perfectly clear which version of the service should be used.
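For illustration, reading such a parameter takes only the Python standard library. The URL below is a placeholder and the helper name `api_version_from_url` is hypothetical; only the `api-version` parameter name comes from the answer.

```python
from urllib.parse import parse_qs, urlsplit

def api_version_from_url(url, default=None):
    """Return the api-version query parameter, or `default` if absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("api-version")
    return values[0] if values else default

# Placeholder URL; the host and path are not taken from any real service.
requested = api_version_from_url(
    "https://example.invalid/my_resource?api-version=4.1-preview.3"
)
unspecified = api_version_from_url("https://example.invalid/my_resource")
```

Because the version travels in the URL, a router can read it without touching the request body.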

The biggest issue I see with your approach is how your client would generate the request body. You're going to need a serializer that is aware of the different model versions and somehow populates the correct fields for each version.

It's much easier when your client simply understands one version and doesn't have to account for versioning.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange