Question

I have a table that looks like this:

Key1 | Key2 | Key3 | Data  |
-----|------|------|-------| 
   1 |   1  |   1  |   a   |
   1 |   1  |   2  |   b   |
   1 |   2  |   1  |   c   |
   1 |   2  |   2  |   d   |
   2 |   1  |   1  |   e   |
   2 |   1  |   2  |   f   |
   2 |   2  |   1  |   g   |
   2 |   2  |   2  |   h   |
---------------------------|

Here is some code to create that example table:

CREATE TABLE Example(
Key1 int,
Key2 int,
Key3 int,
Data varchar(1));

INSERT INTO Example(Key1,Key2,Key3, Data)
VALUES (1,1,1,'a'), 
(1, 1,2,'b'),
(1, 2,1,'c'),
(1, 2,2,'d'),
(2, 1,1,'e'),
(2, 1,2,'f'),
(2, 2,1,'g'),
(2, 2,2,'h');

I want to randomly select one whole row for each distinct pair of Key1 and Key2 values (one for the 1 / 1 pair, one for the 1 / 2 pair, etc.). One possible result would be:

Key1 | Key2 | Key3 | Data  |
-----|------|------|-------| 
   1 |   1  |   1  |   a   |
   1 |   2  |   1  |   c   |      
   2 |   1  |   2  |   f   |
   2 |   2  |   1  |   g   |
---------------------------|

I can do this with multiple queries: first select all the distinct Key1/Key2 pairs, then use each pair to run another query like

SELECT stuff FROM table WHERE Key1 = value1 AND Key2 = value2 ORDER BY RAND() LIMIT 1

But this crude approach needs one query per existing Key1/Key2 pair, and it takes forever because my table is huge.

I've read things about using subqueries, partition, group by, but I struggle to implement them.
I'm new to SQL and only need to use it for a specific project and I don't really have the time to learn it properly, so I would be very thankful if you guys could give me a hand.

Thanks
JC


Solution

MariaDB 10.3 supports window functions, so something like this would work (here test stands in for your table, and k1, k2, k3 for Key1, Key2, Key3):

select * from (
  select 
     t.*, -- all columns
     row_number() -- assign sequential numbers
     over ( -- within a "window"
        partition by k1, k2 -- determined by a unique combination of k1, k2
        order by rand() -- while ordering rows randomly within the partition
     ) as rn -- set the column alias  
  from test t 
) tt 
where rn = 1 -- select only the first row from each "window" (partition)
order by k1, k2, k3

dbfiddle
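
For reference, here is a sketch of the same idea written against the Example table from the question. It assumes the column names Key1, Key2, Key3 and Data exactly as in the CREATE TABLE statement above, and a MariaDB version with window function support. The aliases e and ranked are just illustrative names. The outer query lists the columns explicitly so the helper column rn does not show up in the result.

```sql
-- Sketch only: assumes the Example table from the question
-- and a MariaDB version that supports window functions.
SELECT Key1, Key2, Key3, Data           -- explicit list, so rn is not returned
FROM (
    SELECT e.*,
           ROW_NUMBER() OVER (
               PARTITION BY Key1, Key2  -- restart numbering for each (Key1, Key2) pair
               ORDER BY RAND()          -- shuffle the rows inside each pair
           ) AS rn
    FROM Example AS e
) AS ranked
WHERE rn = 1                            -- keep one random row per pair
ORDER BY Key1, Key2;
```

With the sample data this should return four rows, one per pair, and which Key3/Data values appear will change from run to run. Running only the inner SELECT is a quick way to see how rn gets assigned within each pair.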

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange