Question

I recently had to wrote a query to filter some specific data that looked like the following:

Let's suppose that I have 3 distinct values that I want to search in 3 different fields of one of my tables on my database, they must be searched in all possible orders without repetition.

Here is an example (to make it easy to understand, I will use named queries notation to show where the values must be placed):

val1 = "a", val2 = "b", val3 = "c"

This is the query I've generated:

SELECT * FROM table WHERE
(fieldA = :val1 AND fieldB = :val2 AND fieldC = :val3) OR
(fieldA = :val1 AND fieldB = :val3 AND fieldC = :val2) OR
(fieldA = :val2 AND fieldB = :val1 AND fieldC = :val3) OR
(fieldA = :val2 AND fieldB = :val3 AND fieldC = :val1) OR
(fieldA = :val3 AND fieldB = :val1 AND fieldC = :val2) OR
(fieldA = :val3 AND fieldB = :val2 AND fieldC = :val1)

What I had to do is generate a query that simulates a permutation without repetition. Is there a better way to do this type of query?

This is OK for 3x3 but if I need to do the same with something bigger like 9x9 then generating the query will be a huge mess.

I'm using MariaDB, but I'm okay accepting answers that can run on PostgreSQL. (I want to learn if there is a smart way of writing this type of queries without "brute force")

Was it helpful?

Solution

There isn't a much better way, but you can use in:

SELECT *
FROM table
WHERE :val1 in (fieldA, fieldB, fieldC) and
      :val2 in (fieldA, fieldB, fieldC) and
      :val3 in (fieldA, fieldB, fieldC)

It is shorter at least. And, this is standard SQL, so it should work in any database.

OTHER TIPS

... I'm okay accepting answers that can run on PostgreSQL. (I want to learn if there is a smart way of writing this type of queries without "brute force")

There is a "smart way" in Postgres, with sorted arrays.

Integer

For integer values use sort_asc() of the additional module intarray.

SELECT * FROM tbl
WHERE  sort_asc(ARRAY[id1, id2, id3]) = '{1,2,3}' -- compare sorted arrays

Works for any number of elements.

Other types

As clarified in a comment, we are dealing with strings.
Create a variant of sort_asc() that works for any type that can be sorted:

CREATE OR REPLACE FUNCTION sort_asc(anyarray)
  RETURNS anyarray LANGUAGE sql IMMUTABLE AS
'SELECT array_agg(x ORDER BY x COLLATE "C") FROM unnest($1) AS x';

Not as fast as the sibling from intarray, but fast enough.

  • Make it IMMUTABLE to allow its use in indexes.
  • Use COLLATE "C" to ignore sorting rules of the current locale: faster, immutable.
  • To make the function work for any type that can be sorted, use a polymorphic parameter.

Query is the same:

SELECT * FROM tbl
WHERE  sort_asc(ARRAY[val1, val2, val3]) = '{bar,baz,foo}';

Or, if you are not sure about the sort order in "C" locale ...

SELECT * FROM tbl
WHERE  sort_asc(ARRAY[val1, val2, val3]) = sort_asc('{bar,baz,foo}'::text[]);

Index

For best read performance create a functional index (at some cost to write performance):

CREATE INDEX tbl_arr_idx ON tbl (sort_asc(ARRAY[val1, val2, val3]));

SQL Fiddle demonstrating all.

My answer assumes there is a Key column that we can single out. The output should be all the keys that meet all 3 values and each field and value being used:

This "should" get you a list of Keys that meet the criteria

SELECT F.KEY
FROM (
SELECT DISTINCT L.Key, L.POS
FROM (
  SELECT Key, 'A' AS POS, FieldA AS FIELD FROM table AS A
  UNION ALL
  SELECT Key, 'B' AS POS, FieldB AS FIELD FROM table AS A
  UNION ALL
  SELECT Key, 'C' AS POS, FieldC AS FIELD FROM table AS A ) AS L
WHERE L.FIELD IN(:VAL1, :VAL2, :VAL3)
) AS F
GROUP BY F.KEY
HAVING COUNT(*) = 3

Although Gordon's answer is definitely shorter and almost certainly faster as well, I was toying with the idea on how to minimize the code change when the number of combinations increase.

And I can come up with is something for Postgres which is by no means shorter, but more "change-friendly":

with recursive params (val) as (
   values (1),(2),(3) -- these are the input values
), all_combinations as (
   select array[val] as elements
   from params
   union all
   select ac.elements||p.val
   from params p
     join all_combinations ac 
       on array_length(ac.elements,1) < (select count(*) from params)
)
select *
from the_table
where array[id1,id2,id3] = any (select elements from all_combinations);

What does it do?

First we create a CTE holding the values we are looking for, the recursive CTE then builds a list of all possible permutations from those values. This list will include too many elements because it will also hold arrays with 1 or two elements.

The final select that puts the columns that should be compared into an array and compares that with the permutations generated by the CTE.

Here is a SQLFiddle example: http://sqlfiddle.com/#!15/43066/1

When the number of values (and columns) increase you only need to add the new value to the values row constructor and add the additional column to the array of columns in the where condition.

Using a naive approach, I would use the in clause for this job, and since there should not be any repetition, exclude when the fields repeat.

There is also some optimisations you could do.

First you can exclude the last field, since:

A <> B, A <> C
A <> B, B <> C, 

Also means that:

C <> B,  C <> A

And also, the following queries doesn't need a previously queried field, since:

A <> B == B <> A

The query would be written as:

SELECT * FROM table
WHERE :val1 in (fieldA, fieldB, fieldC) and
  :val2 in (fieldA, fieldB, fieldC) and
  :val3 in (fieldA, fieldB, fieldC) and
  fieldA not in (fieldB, fieldC) and
  fieldB <> fieldC

This is a naive approach, there are probably others which use the MySQL API, but this one does the job.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top