Question

I have a case where a user can filter and download multiple PDF documents merged into a single file.

The client should display an error message when no document meets the filter, and ask for confirmation when there are many documents, because it might take a while to process them (let's say the confirmation should trigger at 10 documents).

For now I've implemented it as described below, but it just seems off.

Request: GET /foo?limit=10

The results are:

  • Everything is correct: status 200 OK; the body contains the download link.
  • No documents match the filter: status 404 NOT FOUND, empty body.
  • More than 10 (the requested limit) documents match the filter: honestly, no idea what the status should be; for now I am returning 429 TOO MANY REQUESTS, but it's not right. The body contains the number of documents, so the user can either abort the operation or create a new request with a higher limit.
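The mapping described above can be sketched as a small response builder (a sketch only; the function name and download link are hypothetical, and "documents" stands for whatever the filter matched):

```python
def map_result(documents, limit):
    """Map the filter result to (status, body) as currently implemented."""
    if not documents:
        return 404, None                       # no match: 404, empty body
    if len(documents) > limit:
        # Over the limit: 429 for now, though it doesn't feel right.
        return 429, {"found": len(documents)}
    return 200, {"link": "/download/merged"}   # hypothetical download link
```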

I'm sure I'm missing something important; I don't have enough experience in the field. What is the right way of handling this?

Other solutions I've been thinking about:

  • using custom HTTP codes, but I've come to the conclusion that it's probably a bad idea (even if they are free now, I don't know whether they might become official HTTP codes a year or ten from now; besides, it feels like overkill).
  • having each of these situations result in 200 OK and giving more info in the response body

Solution

The client should display an error message when no document meets the filter, and ask for confirmation when there are many documents because it might take a while to process them (let's say the confirmation should trigger at 10 documents).

Both cases are 200 OK.

  • OK, Foos resource exists, but I didn't find the subset you are looking for
  • OK, Foos resource exists, but the subset you are looking for is larger than expected.

each of these situations resulting in 200 OK and giving more info in the response body.

Absolutely

200 OK
{ limit:100, found:101, link:"<link>" }

In both cases (limit exceeded or not), there will be a link to follow, won't there?

200 OK
{ limit:100, found:0, link:null }

If no document is processable, then there's no link to show.
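Put together, the always-200 shape above might look like this (a sketch; the field names follow the examples, and the download link is hypothetical):

```python
def always_ok(found_ids, limit):
    """Always 200 OK; the body tells the client what happened."""
    link = "/foo/download" if found_ids else None   # no matches: link is null
    return 200, {"limit": limit, "found": len(found_ids), "link": link}
```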

In all the above cases the service/API did its job: it worked, and the client sent the right input to the right endpoint; there's nothing wrong with that. Don't make things confusing. 404 is misleading: the developer has to guess whether they are pointing at the right URI or not. So is 400: the developer has to guess whether some important argument is missing.

Use HTTP status codes to communicate with the HTTP client instead of business results to the web application.

We use HTTP status codes to make the HTTP client behave in one way or another. For example, if no confirmation is required, we can send 302 Found, put the download link into the response's Location header, and let the protocol do its magic. If we want the HTTP client to wait before following the redirect, we send a Retry-After header along with the 3xx status code. As you may guess, our web app doesn't need to know or understand the semantics of Retry-After; the HTTP client does.
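As a sketch of that idea (assuming a generic handler that returns a status and a header dict), the redirect-with-delay amounts to setting two headers and leaving their interpretation to the client:

```python
def redirect_response(download_link, wait_seconds=None):
    """Build a 302 redirect; optionally ask the client to wait first."""
    headers = {"Location": download_link}
    if wait_seconds is not None:
        # Retry-After is interpreted by the HTTP client, not our app code.
        headers["Retry-After"] = str(wait_seconds)
    return 302, headers
```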

Other tips

As an API consumer I would think that adding ?limit=10 would return the first 10 matching results, not throw an error if there are more than 10 results.

You could consider changing from the limit to an Expect HTTP request header.

Then, if there are more results than expected, return a 417 Expectation Failed.
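A minimal sketch of that check, assuming a custom Expect value such as max-results=10 (note this is a non-standard expectation; standard clients only send 100-continue):

```python
def check_expectation(headers, result_count):
    """Return 417 when the client's stated maximum is exceeded."""
    expect = headers.get("Expect", "")
    if expect.startswith("max-results="):
        if result_count > int(expect.split("=", 1)[1]):
            return 417, {"found": result_count}
    return 200, None
```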

If I am querying an endpoint that returns a list of items, I would not expect an error code in any of these cases. If I request a list of items matching filter X, and there are none, what I expect as return is an empty list. My query is valid, the endpoint exists, the filter exists and I passed an acceptable value to the filter. So my request is well-formed, and the answer in that case is simply that there are no items matching that request. That is a valid answer, not an error.

As for the "too many documents" part. The solution in a typical case would be to properly paginate the results. So you would return 10 results in this case, with metadata about pagination that indicates whether there are more results available. Then the client can use that metadata to prompt the user.
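A typical paginated response shape (field names here are illustrative, not prescribed) would carry that metadata alongside the items:

```python
def paginate(items, limit, offset=0):
    """Slice items and attach pagination metadata the client can act on."""
    return {
        "items": items[offset:offset + limit],
        "limit": limit,
        "offset": offset,
        "total": len(items),
        "has_more": offset + limit < len(items),  # client can prompt on this
    }
```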

In your case, where you return a merged document, this isn't possible. I would separate the endpoints for this purpose entirely, if that works for your application. The filtering endpoint would return all results (ideally paginated), so that the client knows which documents, and how many, are in that selection. The second endpoint would take a list of document IDs and return a merged PDF. The part about warning when too many documents are selected is a concern of the client, not of the web API. Of course you should also set a limit on the server, but that would be a global, non-overridable limit to prevent DoSing the system or overloading it accidentally.
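The two-endpoint split could be sketched like this (endpoint names, the stub link, and the value of the global limit are all assumptions):

```python
MAX_MERGE = 100  # assumed global, non-overridable server-side limit

def filter_documents(all_docs, predicate, limit=10, offset=0):
    """First endpoint: return matching document IDs, paginated."""
    ids = [d["id"] for d in all_docs if predicate(d)]
    return {"ids": ids[offset:offset + limit], "total": len(ids)}

def merge_documents(doc_ids):
    """Second endpoint: merge the given IDs into one PDF (link is a stub)."""
    if len(doc_ids) > MAX_MERGE:
        return 400, {"error": f"at most {MAX_MERGE} documents per merge"}
    return 200, {"link": "/downloads/merged.pdf"}
```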

I'm making some assumptions here about your case, but from what I read I would find your example API very confusing and the behaviour unexpected.

The 429 status code indicates that the user has sent too many requests in a given amount of time ("rate limiting").

RFC 6585 indicates that status 429 should be reserved for rate limiting only. That makes it a very poor status code for indicating too many results. Instead, consider adding an offset parameter to go with the limit. When more than limit documents are available, your API should include a URL to the next page of results (/foo?limit=10&offset=10). If you must decline the request when the limit is too low, I'd recommend a very standard 400 Bad Request, with more detailed information in the response body.
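Both options can be sketched together (the URL shape follows the example above; the decline flag and error wording are assumptions):

```python
def list_response(found, limit, offset=0, decline=False):
    """200 with a next-page URL, or 400 Bad Request when declining."""
    if found > offset + limit:
        if decline:
            # Decline with details in the body rather than a bare status.
            return 400, {"error": "limit too low for result set", "found": found}
        next_url = f"/foo?limit={limit}&offset={offset + limit}"
        return 200, {"found": found, "next": next_url}
    return 200, {"found": found, "next": None}
```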

And while a 404 may be used to indicate there are no documents matching the filter, your API should provide a more detailed status in the response body. After all, the status does not make clear why your API didn't return any results; perhaps the client misspelled the URL, and tried to search on /goo instead.

The client should display an error message when no document meets the filter, and ask for confirmation when there are many documents because it might take a while to process them (let's say the confirmation should trigger at 10 documents).

If the final decision rests with the client anyway, who can choose to set the limit to any arbitrary value they want, it'd be easier for you to just always return the link and the information on how many files this link contains.

Instead of sending a confirmation to you, the client can simply look at the returned file count, evaluate it for themselves, and decide whether to access the link or not.

This removes the need for the limit input parameter:

Request: GET /foo

And you simply return your findings to the client:

200 OK
{ files: 250, link:"..." }
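On the client side, the confirmation logic then reduces to a threshold check (a sketch; the threshold is the client's own choice, not anything the API dictates):

```python
CONFIRM_THRESHOLD = 10  # the client's own choice, not dictated by the API

def needs_confirmation(body):
    """Client-side check on the file count returned by GET /foo."""
    return body["files"] > CONFIRM_THRESHOLD
```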

What I find counterintuitive about your intended approach is that you put the responsibility of deciding the limit with the client, but your API forces the client to use the decision logic that your API has chosen to implement.

Who says they always care to only process a certain number of files? Even if that is the case today, who says that it's going to be the case tomorrow?

If tomorrow you have a client who only wishes to process a minimum amount of files (instead of a maximum), then you'd need to redevelop your logic and start accounting for all possible decisions that a client could want to make.

Rather than forcing a filter that does not impact your own service, simply return the requested information and let the client decide what to do with the provided information. By returning the file count to them, you leave the decision making process to them, so you don't have to develop every possible evaluation yourself.

This gives you the best of both worlds:

  • Less complexity for you to implement (no confirmation logic etc)
  • The client can evaluate their own decisions without needing you to update your API. If the client changes how they evaluate things, they don't need to wait for you to update your API logic.

Just to finish my thought: if the file count does impact the performance of your API, e.g. because it's your API that performs the expensive task of processing the files, then I would expect your API to enforce a reasonable file limit.

It makes no sense to want to defend against something (overloading your server) and then not regulate it yourself, instead giving the client full control over whether to do so or not.

Licensed under: CC-BY-SA with attribution