Put all exclusion conditions in the SQL query, or fetch all results first and then filter them in memory?

StackOverflow https://stackoverflow.com/questions/8742326

Question

In my query I want to get all users in the same city. The query will also be available to end users, so that they can see other users in their city. Because I query the user table, it returns every user in the city, including the one who ran the query.

Now there are 2 options:

  1. Add a condition to the query: user.id != (id of the user running the query).

  2. Post-process the query result before displaying it, removing the user who is running the query.

Does it matter which one I use, or does the choice have any considerable effect?

Note: my real query is not as simple as matching on city; it uses a three-table join to fetch the data the user wants to display. I used city here for brevity.


Solution

It depends, but in my experience, if adding conditions and parameters to the query so that the filtering happens in the database significantly reduces the data coming back, the database was usually able to use them to build a better execution plan with a smaller internal working set (not just less data over the wire), and that is generally better.

For instance, in a query I recently helped someone with, the query could be written to return all pairs of friends. But since, from the application's point of view, only the friends of one particular person are needed on any given page, there is no need to return extra data that is just discarded. The query plan itself is also different, because there is a smaller set on one side of a cross join. My point is that you are USUALLY better off giving the database as much information as possible and letting it work from there.
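To make the two options from the question concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table, its columns, and the sample rows are all made up for illustration (the real query joins three tables); only the shape of the two approaches matters.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO users (id, name, city) VALUES
        (1, 'alice', 'Pune'),
        (2, 'bob',   'Pune'),
        (3, 'carol', 'Pune'),
        (4, 'dave',  'Oslo');
""")


def same_city_filtered_in_sql(conn, current_user_id):
    """Option 1: the exclusion is part of the query itself."""
    return conn.execute(
        """
        SELECT other.id, other.name
        FROM users AS me
        JOIN users AS other ON other.city = me.city
        WHERE me.id = ? AND other.id != ?
        ORDER BY other.id
        """,
        (current_user_id, current_user_id),
    ).fetchall()


def same_city_filtered_in_memory(conn, current_user_id):
    """Option 2: fetch everyone in the city, then drop the current user."""
    rows = conn.execute(
        """
        SELECT other.id, other.name
        FROM users AS me
        JOIN users AS other ON other.city = me.city
        WHERE me.id = ?
        ORDER BY other.id
        """,
        (current_user_id,),
    ).fetchall()
    return [row for row in rows if row[0] != current_user_id]
```

Both functions return the same rows for the same user. The difference is that option 1 tells the database about the exclusion, so it never has to produce or transfer the extra row, and every caller gets the filtered result without repeating the check.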

Other suggestions

I hate to be the guy giving the standard answer, but: do some performance testing with both options and pick the one that's faster. If the difference is inconclusive, pick the one that is easier for future developers. I'd guess that putting the condition in the query is easier for developers: the results are likely to be used in more than one place, and a check done in code would most likely be done at the point of use, and so repeated for each use.
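A rough way to run that comparison, sketched with Python's sqlite3 and time.perf_counter. The table, the row count, the index, and the function names are assumptions for illustration; a real test should run against your actual database, your three-table query, and realistic data volumes, since an in-memory SQLite database hides network cost entirely.

```python
import sqlite3
import time

# Hypothetical test data: 10,000 users split across two cities.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany(
    "INSERT INTO users (id, city) VALUES (?, ?)",
    [(i, "Pune" if i % 2 else "Oslo") for i in range(1, 10001)],
)
conn.execute("CREATE INDEX idx_users_city ON users (city)")


def in_sql(user_id, city):
    # Exclusion handled by the database.
    return conn.execute(
        "SELECT id FROM users WHERE city = ? AND id != ?", (city, user_id)
    ).fetchall()


def in_memory(user_id, city):
    # Exclusion handled by the application after fetching.
    rows = conn.execute(
        "SELECT id FROM users WHERE city = ?", (city,)
    ).fetchall()
    return [row for row in rows if row[0] != user_id]


def average_seconds(func, repeats=50):
    # Average wall-clock time per call over several repetitions.
    start = time.perf_counter()
    for _ in range(repeats):
        func(1, "Pune")
    return (time.perf_counter() - start) / repeats


print(f"in SQL:    {average_seconds(in_sql):.6f} s per call")
print(f"in memory: {average_seconds(in_memory):.6f} s per call")
```

With only one row excluded, expect the two timings to be close; that is the "inconclusive" case where maintainability should decide.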

You can use the first option, since a three-table join plus one condition is not a heavy query.

From the information supplied, it shouldn't make any noticeable difference which option you select. The first option might be slightly preferable, since it retrieves slightly less data from the database.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow