Is it worth excluding null fields from a JSON server response in a web application to reduce traffic?

StackOverflow https://stackoverflow.com/questions/21188145

Question

Let's say that the API is well documented and every possible response field is described.

Should a web application's server API exclude null fields from a JSON response to lower the amount of traffic? Is this a good idea at all?

I tried to calculate the traffic savings for a large app like Twitter, and the numbers are actually quite convincing.

For example: if you exclude a single response field, "someGenericProperty":null, which is 26 bytes, from every API response, and Twitter reportedly handles 13 billion API requests per day, the traffic reduction is more than 300 GB per day.

More than 300 GB less traffic every day is quite a money saver, isn't it? That's probably the most naive and simplistic calculation ever, but still.
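
The arithmetic above can be checked with a few lines of JavaScript. This is only a sketch of the naive estimate: the 13 billion requests per day is the figure quoted in the question (not verified here), and the calculation ignores the separating comma, HTTP compression, and the fact that not every response would contain the field.

```javascript
// Back-of-the-envelope estimate using the figures quoted in the question.
const field = '"someGenericProperty":null';

// Count UTF-8 bytes rather than characters (identical here, since the field is ASCII).
const bytesPerField = new TextEncoder().encode(field).length;

const requestsPerDay = 13e9; // figure from the question, an assumption
const bytesPerDay = bytesPerField * requestsPerDay;

const gigabytesPerDay = bytesPerDay / 1e9;     // decimal gigabytes (GB)
const gibibytesPerDay = bytesPerDay / 2 ** 30; // binary gibibytes (GiB)

console.log(bytesPerField);              // 26
console.log(gigabytesPerDay);            // 338
console.log(gibibytesPerDay.toFixed(1)); // 314.8
```

Either way the unit is gigabytes, not gigabits, and the result stays above 300.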


Solution

In general, no. The more public the API and the more potential consumers it has, the more invariant the API should be.

  • Developers getting started with the API are confused when a field shows up sometimes, but not other times. This leads to frustration and ultimately wastes the API owner's time in the form of support requests.
  • There is no way to know exactly how downstream consumers are using an API. Often, they are not using it just as the API developer imagines. Elements that appear or disappear based on the context can break applications that consume the API. The API developer usually has no way to know when a downstream application has been broken, short of complaints from downstream developers.
  • When data elements appear or disappear, uncertainty is introduced. Was the data element not sent because the API considered it to be irrelevant? Or has the API itself changed? Or is some bug in the consumer's code not parsing the response correctly? If the consumer expects a field and it isn't there, how does that get debugged?
  • On the server side, extra code is needed to strip out those fields from the response. What if the logic to strip out data is wrong? It's a chance to inject defects, and it means there is more code that must be maintained.
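
To make the last point concrete, here is a minimal sketch of the kind of stripping code a server would need. The helper name `stringifyWithoutNulls` and the sample object are hypothetical; the only real API used is the standard `JSON.stringify` replacer argument. Even this tiny version has behavior that is easy to get wrong.

```javascript
// Hypothetical helper: serialize a value while omitting null-valued object properties.
function stringifyWithoutNulls(value) {
  // When the replacer returns undefined for an object property,
  // JSON.stringify leaves that property out of the output.
  return JSON.stringify(value, (key, v) => (v === null ? undefined : v));
}

// Made-up sample response body.
const user = {
  id: 7,
  nickname: null,
  tags: ["a", null],
  profile: { bio: null, age: 30 },
};

const full = JSON.stringify(user);
const stripped = stringifyWithoutNulls(user);

console.log(full);
// {"id":7,"nickname":null,"tags":["a",null],"profile":{"bio":null,"age":30}}
console.log(stripped);
// {"id":7,"tags":["a",null],"profile":{"age":30}}

// The consumer can no longer tell "nickname is null" from "nickname does not exist".
console.log("nickname" in JSON.parse(stripped)); // false
```

Note that the null inside the `tags` array survives, because `JSON.stringify` writes `null` for array elements the replacer turns into `undefined`, and that passing `null` itself to this helper returns `undefined` rather than a string. These are exactly the kinds of edge cases that turn into defects.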

In many applications, network latency is the dominating factor, not bandwidth. For performance reasons, many API developers will favor a few large requests/responses over many small ones. At my last company, the sales and billing systems would routinely exchange messages of 100 KB, 200 KB or more. Sometimes only a few KB of the data was needed. But overall system performance was better than fetching some data, discovering more was needed, and then sending an additional request for that data.
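
A rough model shows why. The sketch below assumes sequential requests where each round trip pays the full latency once plus the transfer time of its payload. The 100 ms round-trip time, the 10 Mbit/s link, and the payload sizes are illustrative assumptions, not measurements from any real system.

```javascript
// Simplified cost model: every sequential round trip costs one RTT
// plus the time needed to transfer its payload.
function totalSeconds(roundTrips, bytesPerResponse, rttSeconds, bitsPerSecond) {
  const transferSeconds = (bytesPerResponse * 8) / bitsPerSecond;
  return roundTrips * (rttSeconds + transferSeconds);
}

const rtt = 0.1;        // assumed 100 ms round-trip time
const bandwidth = 10e6; // assumed 10 Mbit/s link

// One large 200 KB response versus five small 4 KB responses fetched one after another.
const oneLarge = totalSeconds(1, 200000, rtt, bandwidth);
const fiveSmall = totalSeconds(5, 4000, rtt, bandwidth);

console.log(oneLarge.toFixed(3));  // 0.260
console.log(fiveSmall.toFixed(3)); // 0.516
```

Under these assumptions the single large response finishes in about half the time, even though it carries ten times more data.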

For most applications, inconsistency is more dangerous than superfluous data is wasteful.

As always, there are a million exceptions. I once interviewed for a job at a torpedo maintenance facility. They had underwater sensors on their firing range to track torpedoes. All sensor data were relayed via acoustic modems to a central underwater data collector. Acoustic underwater modems? Yes. At 300 baud, every byte counts.

There are battery-powered embedded applications where every byte counts, as well as low-frequency RF communication systems.

Another exception is sparse data. For example, imagine a matrix with 4,000,000 rows and 10,000 columns where 99.99% of the values of the matrix are zero. The matrix should be represented with a sparse data structure that does not include the zeros.
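
As a minimal sketch of such a structure, the class below stores only non-zero cells in a `Map` keyed by "row,col" and serializes them as [row, column, value] triples. The class name `SparseMatrix` and its JSON layout are hypothetical choices for illustration; a real application would more likely use an established sparse format or library.

```javascript
// Hypothetical sparse matrix: only non-zero cells are stored.
class SparseMatrix {
  constructor(rows, cols) {
    this.rows = rows;
    this.cols = cols;
    this.cells = new Map(); // key: "row,col", value: non-zero number
  }

  set(row, col, value) {
    const key = `${row},${col}`;
    if (value === 0) {
      this.cells.delete(key); // zeros are represented by absence
    } else {
      this.cells.set(key, value);
    }
  }

  get(row, col) {
    const key = `${row},${col}`;
    return this.cells.has(key) ? this.cells.get(key) : 0;
  }

  // Called automatically by JSON.stringify.
  toJSON() {
    const entries = [];
    for (const [key, value] of this.cells) {
      const [row, col] = key.split(",").map(Number);
      entries.push([row, col, value]);
    }
    return { rows: this.rows, cols: this.cols, entries };
  }
}

const matrix = new SparseMatrix(4000000, 10000);
matrix.set(12, 34, 5.5);
matrix.set(3999999, 9999, -1);

console.log(matrix.get(0, 0)); // 0
console.log(JSON.stringify(matrix));
// {"rows":4000000,"cols":10000,"entries":[[12,34,5.5],[3999999,9999,-1]]}
```

The difference from dropping null fields ad hoc is that here the contract is explicit: the dimensions are always sent, and any cell not listed is zero by definition.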

Other suggestions

It definitely depends on the service and the amount of data it provides. You should evaluate the ratio of null to non-null data and set a threshold above which it is worth excluding those elements. Thanks for sharing; it's an interesting point to me.

The question approaches this from the wrong side: JSON is not the best format for compressing or reducing traffic, but something like Google Protocol Buffers or BSON is.

I am carefully re-evaluating nullables in our API schema right now. We use Swagger (OpenAPI), and JSON Schema does not really have something like a nullable type; I think there is a good reason for this.

If you have a JSON response that maps a DB integer field that is suddenly NULL (or can be, according to the DB schema), that is fine for a relational DB but not at all healthy for your API.

I suggest adopting a much more elegant approach: make better use of "required" for the response as well.

If the field is optional in the API response schema and has a null value in the DB, do not return the field.

We have also enabled strict schema checks for API responses; this gives us much better control of our data and forces us not to rely on states in the API.
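
A strict response check of this kind could look roughly like the sketch below. It is a hand-rolled illustration, not the Swagger/OpenAPI tooling itself: the `userSchema` object, its field names, and the `validateResponse` function are all hypothetical. Required fields must be present, optional fields may be absent, and null is never accepted.

```javascript
// Hypothetical response schema: "id" and "name" are required, "nickname" is optional.
const userSchema = {
  required: ["id", "name"],
  properties: { id: "number", name: "string", nickname: "string" },
};

// Returns a list of problems; an empty list means the response body is valid.
function validateResponse(schema, body) {
  const errors = [];

  for (const field of schema.required) {
    if (!(field in body)) errors.push(`missing required field: ${field}`);
  }

  for (const [field, value] of Object.entries(body)) {
    if (!Object.prototype.hasOwnProperty.call(schema.properties, field)) {
      errors.push(`unexpected field: ${field}`);
    } else if (value === null) {
      // Optional fields must be omitted, never sent as null.
      errors.push(`null is not allowed: ${field}`);
    } else if (typeof value !== schema.properties[field]) {
      errors.push(`wrong type for ${field}: expected ${schema.properties[field]}`);
    }
  }

  return errors;
}

console.log(validateResponse(userSchema, { id: 1, name: "Ann" }));
// []
console.log(validateResponse(userSchema, { id: 1, name: "Ann", nickname: null }));
// [ 'null is not allowed: nickname' ]
console.log(validateResponse(userSchema, { id: 1 }));
// [ 'missing required field: name' ]
```

Running such a check on outgoing responses in tests or staging catches a null that leaks from the database before any client sees it.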

For the API client that of course means doing checks like:

if ("key" in response) {
    // Look the property up by its name; the original used an undefined variable `key`.
    console.log("Optional key value: " + response["key"]);
} else {
    console.log("Optional key not found");
}
Licensed under: CC-BY-SA with attribution