Should a REST API return a 500 Internal Server Error to indicate that a query references an object that does not exist?

https://softwareengineering.stackexchange.com/questions/368213

31-01-2021

Question

I am working with a REST API which resides on a server that handles data for a multitude of IoT devices.

My task is to query the server using the API to collect specific performance information about said devices.

In one instance, I obtain a list of available devices and their corresponding identifiers, then later query the server for more details using those identifiers (GUIDs).

The server is returning a 500 Internal Server Error for a query on one of those IDs. In my application, an exception is thrown and I don't see details about the error. If I examine the response more closely with Postman, I can see that the server returned JSON in the body which contains:

errorMessage: "This ID does not exist".

Disregard the fact the server provided the ID to begin with -- that's a separate problem for the developer.

Should a REST API return a 500 Internal Server Error to report that a query references an object that doesn't exist? To my thinking, the HTTP response codes should refer strictly to the status of the REST call, rather than to the internal mechanics of the API. I would expect a 200 OK with the response containing the error and description, which would be proprietary to the API in question.

It occurs to me that there is a potential difference in expectation depending on how the REST call is structured.

Consider these examples:

http://example.com/restapi/deviceinfo?id=123
http://example.com/restapi/device/123/info

In the first case, the device ID is passed as a GET variable. A 404 or 500 would indicate that the path (/restapi/deviceinfo) is either not found or resulted in a server error.

In the second case, the device ID is part of the URL. I would be more understanding of a 404 Not Found, but still could argue based on which parts of the path are interpreted as variables versus endpoints.

Solution

I think a 404 response is the best semantic match here, because the resource you were trying to find (as represented by the URI used for the query) was not found. Returning an error payload in the body is reasonable, but not required.

According to RFC 2616, the definition of the 404 status code is:

10.4.5 404 Not Found
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
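
As a rough illustration of the position above (a 404 with an optional error payload), here is a minimal sketch of a lookup handler. The `DEVICES` store and the `get_device_info` function are hypothetical names invented for this example; only the `errorMessage` field comes from the question.

```python
import json
from http import HTTPStatus
from typing import Tuple

# Hypothetical in-memory store standing in for the real device database.
DEVICES = {"123": {"id": "123", "name": "thermostat"}}


def get_device_info(device_id: str) -> Tuple[int, str]:
    """Return (status code, JSON body) for a lookup of a single device."""
    device = DEVICES.get(device_id)
    if device is None:
        # The URI names a resource the server does not have: 404, not 500.
        # The body is optional; it only adds a human-readable explanation.
        body = {"errorMessage": "This ID does not exist"}
        return HTTPStatus.NOT_FOUND.value, json.dumps(body)
    return HTTPStatus.OK.value, json.dumps(device)
```

The same function would serve both URL shapes from the question, since the ID identifies the resource whether it arrives in the path or in the query string.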

OTHER TIPS

I will use your examples.

http://example.com/restapi/deviceinfo?id=123

If the endpoint returns a JSON array, the best choice is 200 OK with an empty array if no results were found.

If the endpoint is designed to return a single result, my choice would be 404 NOT FOUND, because, for me, the right syntax for this kind of endpoint is: http://example.com/restapi/deviceinfo/123. I usually use request parameters only for filtering, and only when my endpoint returns an array.

http://example.com/restapi/device/123/info

I think this question was already answered here. Whether the request is a POST or a GET, the better choice seems to be 404 NOT FOUND, because the resource 123 was not found.

In both cases I see no need to explain why the request was not completed. The request information and the HTTP status code already explain why.
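
The distinction drawn in this answer between a filtering endpoint and a single-resource endpoint can be sketched as follows. This is only an illustration under assumed names: `DEVICES`, `search_devices`, `get_device`, and the `colour` attribute are all hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical device list used only for this example.
DEVICES: List[Dict[str, Any]] = [
    {"id": "123", "colour": "red"},
    {"id": "456", "colour": "red"},
]


def search_devices(colour: str) -> Tuple[int, List[Dict[str, Any]]]:
    """Collection endpoint: an empty list is still a valid answer, so 200."""
    matches = [d for d in DEVICES if d["colour"] == colour]
    return 200, matches


def get_device(device_id: str) -> Tuple[int, Optional[Dict[str, Any]]]:
    """Single-resource endpoint: an unknown ID means the resource is missing."""
    for device in DEVICES:
        if device["id"] == device_id:
            return 200, device
    return 404, None
```

A search for a colour that no device has yields `(200, [])`, while asking for a device that does not exist yields `(404, None)`.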

HTTP 404 is correct, because the server understands what resource the client is asking for, but it doesn't have that resource.

The fact that you're working with a "REST API" is the key. The API should behave like it's performing a REpresentational State Transfer, not executing a function. (Sure, the term "REST" has come to have a broader meaning, but you can still use its literal meaning to good effect here.) The client has asked for the state of a resource described by the URL http://example.com/restapi/device/123/info . A querystring (/deviceinfo?id=123) wouldn't change the situation. The server knows you're asking to transfer the state of device 123, but it doesn't recognize that as a known resource. Hence HTTP 404.

The other possible responses discussed here have specific meanings too:

HTTP 200 - We got the state for you; it's in the response body.
HTTP 204 - We got the state for you; it's blank.
HTTP 400 - We can't tell what resource you're asking about. Fix your URL.
HTTP 500 - We malfunctioned. Not your fault.

See RFC 2616 Sec. 10 as appropriate.

A 5xx error is typically used to indicate that the server encountered an error and cannot complete the request. If the server takes the request, can successfully parse it, and then does its work, that shouldn't return a 5xx error.

I'm unsure if there's any kind of convention on what to return if a query yields no results. I've seen both what you describe (a 200 with a body that contains a message) as well as a 404 indicating that results were not found. The 200 probably makes the most sense - the request successfully completed and there were no problems with the client's request or during the server processing the request. A body can deliver a message to the client.

I would treat both of your examples (http://example.com/restapi/deviceinfo?id=123 and http://example.com/restapi/device/123/info) the same - the 123 is a parameter. Both cases are different ways of structuring a request to get device information for a device with ID 123.

First, I would consider authorization and authentication. If the user does not have the appropriate permissions, I would return a 403 or a 401 as appropriate. Although it's called "Unauthorized", my understanding is that 401 is more about authentication and 403 is about unauthorized or permission denied. I wouldn't be too picky here, though, if you just wanted to stick to 403 for all authentication and authorization errors.

Then, I would handle the ID. Based on your example, it looks like it's a numeric ID. If a non-numeric value was provided, I would return a 400. If the parameter could possibly be a valid device identifier, then I would continue with processing. If there were other arguments, they would also be checked here. I would expect the response body to contain appropriate information about why the request was bad.

If all of the parameters were valid, I would begin to process the request. If the system or any dependency (a database, a third-party service, another internal service) is unavailable, I would return a 5xx code - 503 would be specific, but a 500 would also be acceptable. In either case, I would return a body with additional details. Do consider that if an external dependency reports a 408 Request Timeout, I would eat that and return a 500 to my client, allowing a client to receive a 408 only if the request to my system timed out. If the system is able to complete the request, I would return a 200 and the appropriate body.
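
The order of checks described in this answer could look roughly like the sketch below. It is an assumption-laden illustration, not a reference implementation: `handle_device_info`, the two boolean flags, and both lookup helpers are hypothetical, and the final branch follows this answer's own preference for a 200 even when no device matches.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Response = Tuple[int, Dict[str, Any]]
Lookup = Callable[[int], Optional[Dict[str, Any]]]


def lookup_in_memory(device_id: int) -> Optional[Dict[str, Any]]:
    """Hypothetical working dependency that knows one device."""
    return {"id": device_id} if device_id == 123 else None


def lookup_unavailable(device_id: int) -> Optional[Dict[str, Any]]:
    """Hypothetical dependency that is currently down."""
    raise ConnectionError("device store unreachable")


def handle_device_info(authenticated: bool, permitted: bool,
                       raw_id: str, lookup: Lookup) -> Response:
    # 1. Authentication first, then authorization.
    if not authenticated:
        return 401, {"error": "authentication required"}
    if not permitted:
        return 403, {"error": "permission denied"}
    # 2. Validate the parameter; this sketch assumes numeric IDs.
    if not (raw_id.isascii() and raw_id.isdigit()):
        return 400, {"error": "id must be numeric"}
    # 3. Do the work; a failing dependency is the server's problem.
    try:
        device = lookup(int(raw_id))
    except ConnectionError:
        return 503, {"error": "device store unavailable"}
    # 4. The request completed, so this answer returns 200 either way.
    if device is None:
        return 200, {"device": None, "message": "no device with that id"}
    return 200, {"device": device}
```

Swapping the last branch for a 404 is the only change needed to follow the other answers in this thread instead.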

204 may be useful in some cases, but it precludes you from sending a response body. Especially in an API setting, sending a response body with information that can be fed into a logging or reporting mechanism seems like the right decision to make in most cases.

The only time that a 404 would be returned is if the server did not have a /deviceinfo endpoint or a /device/:id/info endpoint.

I would not consider an ID that is not found to be the same as the resource not found. The resource is the device information for a particular device (in your example). Returning a 404 would mean that the resource (the device information) does not exist. A 200 with an appropriate body means that the system can indeed provide device information. There may or may not be a device with the specified ID.

A 500-series HTTP error indicates a server malfunction. Apart from 501 Not Implemented and 505 HTTP Version Not Supported, using these error codes carries the implication that retrying the request at a later time may succeed (although only 503 Service Unavailable explicitly states this). Ideally, a server should never produce one of these codes, although the inability to write bug-free software and provision the server with infinite resources means you'll need them from time to time.

For an "object does not exist" result, you should probably return either 404 Not Found (when the request is for an object by name), or 200 OK with an empty result body (when searching for an object by attributes). 204 No Content looks tempting, but I'd use it only for situations where a lack of a response body is the expected result.

As a client of your API, when I make either of these calls:

http://example.com/restapi/deviceinfo?id=123
http://example.com/restapi/device/123/info

I expect to get back (a representation of) a DeviceInfo object (or some particular type anyway, whether that's a formal type or just something conforming to a documented "duck type" convention). I want a 200 status to mean I actually got one, and I can go ahead and use it.

For REST APIs, I think of 400 and 500 status codes as something like exceptions. You use them to indicate when you can't return a "normal" response for the request you've received, so the client will need to do something exceptional rather than process the information it was expecting to retrieve.

This means, as an API consumer, that I can use some sort of checked-rest-call function that retrieves a response or throws an exception. That's great; my normal logic can be straight line code, and I can organise my error-handling the same way I do in local code. Unexpected 404s will manifest as "no resource found" exceptions without me having to do anything at all, not as "missing attribute" errors when I'm later processing { errorMessage: "Device 123 not found" } as if it was a DeviceInfo object.
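
A "checked-rest-call" helper of the kind described above might be sketched like this. To keep the example runnable without a network, it works on an already-received status code and body; the names `checked_response`, `ResourceNotFound`, and `ApiError` are hypothetical.

```python
import json
from typing import Any, Dict


class ResourceNotFound(Exception):
    """Raised when the server answers 404."""


class ApiError(Exception):
    """Raised for any other non-2xx status."""

    def __init__(self, status: int, body: str) -> None:
        super().__init__(f"HTTP {status}: {body}")
        self.status = status


def checked_response(status: int, body: str) -> Dict[str, Any]:
    """Return the decoded body for a 2xx status, otherwise raise."""
    if status == 404:
        raise ResourceNotFound(body)
    if not 200 <= status < 300:
        raise ApiError(status, body)
    if not body:
        # For example 204 No Content: success with nothing to decode.
        return {}
    return json.loads(body)
```

With a helper like this, the calling code can treat the return value as a DeviceInfo without first inspecting it for an `errorMessage` field, because a missing device never reaches the normal code path.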

If you reason that the endpoint http://example.com/restapi/deviceinfo is found, and it's only the id=123 that isn't, and so return 200 with an error message in the body, then you're creating exactly the same kind of interface problems as C functions that could return either a correct result or an error code, or methods that indicate problems by arbitrarily returning null. It is much nicer as a user of your interface to have errors indicated through a "separate" channel from regular returns. That applies here too, even though HTTP 200, 404, and 500 responses are the same channel from a low level point of view. They're standardised and easy to tell apart so my REST client framework can easily switch on those statuses to turn them into the proper structures in my language; to do the same thing with the JSON layer (where you always say 200 and give me either a DeviceInfo or an error message) I need to embed some knowledge of the JSON schemas you use.

So only use 200 when you can return a valid value of the expected type (which is why http://example.com/restapi/search-devices?colour=blue can return 200 with an empty array if there are no blue devices; an empty array is a valid array, and a sensible answer to the request "I would like the details of all blue devices"). If you can't, use the most appropriate non-200 status code. Even though "device 123 does not exist" is the correct response to "give me the details of device 123", and is not an error for the server, it is an exception for the client's expectation that they'll get back a DeviceInfo and should not be communicated as a normal "here is what you asked for" response.

You could use 422 Unprocessable Entity to differentiate this case from a plain 404 Not Found. 422 means that the server understands the request but cannot process it. I use this code in similar situations.

Error 500 normally indicates that the request crashed a server-side program; in enterprise environments these errors are treated as egg on the face and are avoided.

Error 4xx should signal to the programmer that the API endpoint (resource) does not exist. Consequently, once the appropriate endpoint is reached, any error handling from that point on is the API programmer's responsibility and should be handled gracefully, i.e. with an error message in a 200 response.

Although the w3 notes that 404 is used when no other response is applicable, wouldn't this be what a 204 (No Content) response is good for? The request was valid from a processing standpoint and the server processed the request and has a result. That's a success, which leans towards 2xx responses. There was no content for this particular request so 204 tells the user that there was nothing wrong with their query but there's nothing there.

You could also make a weak case that 409 (Conflict) is an appropriate response. Though 409 is most often used for a POST, its definition says:

The request could not be completed due to a conflict with the current state of the resource. This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request. The response body SHOULD include enough information for the user to recognize the source of the conflict. Ideally, the response entity would include enough information for the user or user agent to fix the problem; however, that might not be possible and is not required.

Arguably, the non-existence of the requested ID is the conflict here, and stating in the message that no such ID exists in the system gives the user enough information to recognize and fix the conflict.

The other answers cover it, but I just wanted to point out that a 500-series error means "I messed up." In this case, "I" means the API, so an unhandled server error or similar. A 400-series error means "you messed up" -- the caller of the API sent something incorrect.

Other users have provided valid answers for what to do if you want to go by the book.

However I would suggest that you are reading too much into the RESTfulness and being absolutely kosher with respect to it.

REST is a capricious protocol, because if you want to go by the book while using it, then it forces both client and server to be built having specific knowledge of the fact that they are communicating via REST, so in essence the specific communication protocol being used cannot be abstracted out.

There is a different approach: completely eschew RESTfulness, and use it just as a communication protocol and nothing else. This means that the only responses that need to be returned are "HTTP 200 OK" and "HTTP 500 Internal Server Error" because as far as the communication protocol is concerned, any attempt to communicate can only have two outcomes: either the request was successfully delivered to the server, or not.

What happens after delivery, is none of the communication protocol's business. So, once the server has successfully received the request, and begins processing it, many errors may occur, but these are all application-specific errors that are none of the communication protocol's business to know anything about. They should be communicated within the payload of a response that looks perfectly successful as far as the communication protocol knows.

So, bottom line is, I would recommend returning an "HTTP 200 OK" and within the response payload ("response content body" in HTTP parlance) have an application-specific error code that says "ID not found" or whatever.
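
For comparison with the status-code approach, the always-200 style recommended in this answer could be sketched as below. The envelope fields `errorCode` and `data`, the code string `ID_NOT_FOUND`, and the function names are assumptions made up for the example; the answer itself does not specify a payload format.

```python
import json
from typing import Any, Dict, Optional, Tuple


class ApplicationError(Exception):
    """Raised by the client when the payload carries an error code."""


def make_response(data: Optional[Dict[str, Any]] = None,
                  error_code: Optional[str] = None) -> Tuple[int, str]:
    """Server side: always answer 200; the outcome lives in the payload."""
    payload = {"errorCode": error_code, "data": data}
    return 200, json.dumps(payload)


def unwrap(body: str) -> Dict[str, Any]:
    """Client side: the payload, not the status line, says whether it worked."""
    payload = json.loads(body)
    if payload["errorCode"] is not None:
        raise ApplicationError(payload["errorCode"])
    return payload["data"]
```

Note that the client has to know the envelope schema in order to detect a failure, which is the trade-off raised elsewhere in this thread.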

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange