Is this WHERE clause builder an over-engineered design?

https://softwareengineering.stackexchange.com/questions/409289

10-03-2021

Question

I've got to build some somewhat complicated WHERE clauses in SQL for a project I'm working on, and the clauses feel very hierarchical with their combination of ANDs and ORs. Instead of:

WHERE ([userId] NOT IN @excludeUsers) AND ((([firstname] LIKE @nameFilter) OR ([surname] LIKE @nameFilter)) AND (([jobTitle] LIKE @infoFilter) OR ([mobileNo] LIKE @infoFilter)))

... I want to be able to write something like the following:

// Wcb is a WhereClauseBuilder
OrClause innerOr;
var whereClause =
    Wcb.And(
        "[userId] NOT IN @excludeUsers",
        Wcb.And(
            Wcb.Or(
                "[firstname] LIKE @nameFilter",
                "[surname] LIKE @nameFilter"
            ),
            innerOr = Wcb.Or(
                "[jobTitle] LIKE @infoFilter",
                "[mobileNo] LIKE @infoFilter"
            )
        )
    );

The idea is to eliminate mistakes like missing whitespace, brackets, and AND/OR keywords from the query. The And and Or static methods would create instances of AndClause and OrClause classes, which would override ToString so that the whole object graph resolves to a string upon $"{whereClause}". I'd also like to be able to add to the query later on, like:

if (extraInfoFilter != null) {
    innerOr.Or(
        "[extraInfo] LIKE @extraInfoFilter"
    );
}

However, the code I'm writing for this has gotten complex enough to prompt me to ask: is this solution over-engineered? Should I just build the strings manually instead of generating them from a hierarchical object model like this? Are there any practical reasons why that would be a better approach?

Solution

This would be over-engineered if the components of the query are all known ahead of time. If you had one query that required a few dynamic criteria, then I would probably go for string concatenation and be done with it. If you have more than a few dynamic conditions, or multiple queries that need dynamic conditions, then investing the time in a query builder object is definitely justified, provided you have no existing utility that already does this.
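
For the builder route, a minimal sketch of the design described in the question might look like the following. Only Wcb, AndClause, and OrClause come from the question; the Clause, RawClause, and GroupClause types and the implicit conversion from string are assumptions made for illustration. It is not hardened: an empty group, for example, is not handled.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical base type: a clause is either a raw SQL fragment or a group.
public abstract class Clause
{
    // Lets plain strings be passed wherever a Clause is expected.
    public static implicit operator Clause(string sql) => new RawClause(sql);
}

// A raw SQL condition, rendered exactly as given.
public sealed class RawClause : Clause
{
    private readonly string _sql;
    public RawClause(string sql) => _sql = sql;
    public override string ToString() => _sql;
}

// A group of clauses joined by a single keyword ("AND" or "OR").
public abstract class GroupClause : Clause
{
    private readonly string _keyword;
    protected readonly List<Clause> Parts;

    protected GroupClause(string keyword, IEnumerable<Clause> parts)
    {
        _keyword = keyword;
        Parts = parts.ToList();
    }

    // Every part is wrapped in parentheses, so nesting cannot change precedence.
    public override string ToString() =>
        string.Join($" {_keyword} ", Parts.Select(p => $"({p})"));
}

public sealed class AndClause : GroupClause
{
    public AndClause(IEnumerable<Clause> parts) : base("AND", parts) { }

    // Appends further conditions to this group after construction.
    public AndClause And(params Clause[] more) { Parts.AddRange(more); return this; }
}

public sealed class OrClause : GroupClause
{
    public OrClause(IEnumerable<Clause> parts) : base("OR", parts) { }

    // Appends further conditions to this group after construction.
    public OrClause Or(params Clause[] more) { Parts.AddRange(more); return this; }
}

public static class Wcb
{
    public static AndClause And(params Clause[] parts) => new AndClause(parts);
    public static OrClause Or(params Clause[] parts) => new OrClause(parts);
}
```

With these types, the nested Wcb.And(...) expression from the question compiles unchanged, $"WHERE {whereClause}" renders the same text as the hand-written clause, and innerOr.Or(...) can still append a condition before the string is rendered. That is roughly fifty lines, so whether it pays off depends on how many queries end up sharing it.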

Other answers

Your proposal does seem over-engineered. Mere proper formatting of the SQL would make the conditions at least as understandable as the alternative you propose.

WHERE
([userId] NOT IN @excludeUsers) 
AND 
(
    (
        ([firstname] LIKE @nameFilter) 
        OR 
        ([surname] LIKE @nameFilter)
    ) 
    AND
    (
        ([jobTitle] LIKE @infoFilter)
        OR
        ([mobileNo] LIKE @infoFilter)
    )
)

Formatting

Proper formatting is very helpful, as is removing excess parentheses and nesting, which makes the query easier to parse.

Your query is just:

WHERE
    [userId] NOT IN @excludeUsers
    AND 
    ([firstname] LIKE @nameFilter OR [surname] LIKE @nameFilter)
    AND
    ([jobTitle] LIKE @infoFilter OR [mobileNo] LIKE @infoFilter)

This is not necessarily minimal. I don't recall the relative priority of the AND/OR operators off-hand, so I generally prefer to make the priorities explicit with parentheses rather than rely on (possibly failing) memory.

Note: it was mentioned that proper formatting is easy in languages with multiline string literals. While true, any language that supports string concatenation does the job just as well.
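
As a small illustration of the concatenation approach in C#, the formatted clause can be kept readable in source like this. The UserQueries class and the Where constant are hypothetical names chosen for the example.

```csharp
public static class UserQueries
{
    // Concatenating string literals with + is a constant expression in C#,
    // so the compiler folds these lines into one string at compile time.
    public const string Where =
        "WHERE\n" +
        "    [userId] NOT IN @excludeUsers\n" +
        "    AND\n" +
        "    ([firstname] LIKE @nameFilter OR [surname] LIKE @nameFilter)\n" +
        "    AND\n" +
        "    ([jobTitle] LIKE @infoFilter OR [mobileNo] LIKE @infoFilter)";
}
```

The layout in the source mirrors the layout of the SQL, and the explicit line breaks keep the statement readable if it is ever logged.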

Query Builders

Query builders can have great advantages over raw queries. One important advantage is catching spelling mistakes at compile time, such as both joTitle and infoFllter.

This, however, requires a much more elaborate query builder than what you have here:

You need your SQL model to be embedded into the application as a language construct.
You need a way to represent bindings.
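
To make those two requirements concrete, here is a deliberately tiny sketch, assuming a hypothetical Users table model and hypothetical Param, Column, and Condition types. None of these names come from an existing library; a real builder would need far more than LIKE, AND, and OR.

```csharp
using System.Collections.Generic;
using System.Linq;

// A binding: the parameter name used in the SQL text plus the value to send.
public sealed record Param(string Name, object Value);

// A column the compiler knows about.
public sealed record Column(string Name)
{
    // Produces the SQL fragment together with the binding it depends on.
    public Condition Like(Param p) =>
        new Condition($"[{Name}] LIKE @{p.Name}", new[] { p });
}

// A SQL condition that carries its bindings along with its text.
public sealed record Condition(string Sql, IReadOnlyList<Param> Params)
{
    public Condition And(Condition other) => Combine("AND", other);
    public Condition Or(Condition other) => Combine("OR", other);

    // Parenthesises both sides and merges the bindings, dropping duplicates.
    private Condition Combine(string keyword, Condition other) =>
        new Condition($"({Sql}) {keyword} ({other.Sql})",
            Params.Concat(other.Params).Distinct().ToList());
}

// Hypothetical model of the table: one static field per column.
public static class Users
{
    public static readonly Column JobTitle = new("jobTitle");
    public static readonly Column MobileNo = new("mobileNo");
}
```

Given var infoFilter = new Param("infoFilter", "%dev%"), the expression Users.JobTitle.Like(infoFilter).Or(Users.MobileNo.Like(infoFilter)) yields the SQL text ([jobTitle] LIKE @infoFilter) OR ([mobileNo] LIKE @infoFilter) with a single binding. Writing Users.JoTitle or infoFllter instead is now a compile error rather than a runtime failure.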

On the other hand, query builders are prone to balloon quite quickly (I know, I have built a few), simply because SQL is a pretty complex language. The SQL model suddenly becomes quite complicated to manipulate (efficiently) when users start using WITH or JOIN.

This whole problem arises because we accept SQL in string literals in the host language (C#). The person who helpfully formatted your SQL statement conveniently stripped all the C#, but how then are you going to run the query?

My answer to this is QueryFirst, a Visual Studio extension. You write your SQL in a real environment with syntax checking and IntelliSense. Then, every time you save, we generate the C# wrapper that lets you call your query. Your queries are continuously validated against the DB, without you having to run your app. It is DRY: you never have to repeat a column name, look up a datatype, or map result fields to an object. And SQL injection becomes impossible. And it's easier than doing it any other way.

Licensed under: CC-BY-SA with attribution
