SQL Group By - Select Both Columns

https://stackoverflow.com/questions/3162585

02-10-2019

Question

I have a table of users containing the following columns:

| User_ID (int) | Name (varchar) | Age (int) | Experience_Level (int) |

I would like to write an SQL query that outputs the IDs of all people who are not unique in their combination of age and experience level.

My code so far:

SELECT Count(*), User_ID FROM Users 
GROUP BY Age,Experience_Level
HAVING Count(*) > 1

This is obviously incomplete: it groups the non-unique users, but it does not tell me all of their User_IDs.

Thanks in advance for your help!

Solution

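Here's a query that uses negated logic: it returns every user except those whose (Age, Experience_Level) combination occurs exactly once.

SELECT *
FROM Users
WHERE User_ID NOT IN
(
    -- Groups of size 1 hold the unique users; MIN(User_ID) is that single user's ID
    SELECT MIN(User_ID)
    FROM Users
    GROUP BY Age, Experience_Level
    HAVING COUNT(*) = 1
)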

OTHER TIPS

Since you want information about groups of multiple users, how do you want this data returned? As a string containing a comma-separated list of User_ID values?

You didn't tag your question with the brand of SQL database you use.

If you use MySQL or SQLite, you can use the built-in GROUP_CONCAT() function:

SELECT Count(*), GROUP_CONCAT(User_ID) AS User_List FROM Users 
GROUP BY Age,Experience_Level
HAVING Count(*) > 1

By default, GROUP_CONCAT() separates the values with commas. Refer to the manual if you want it formatted another way.
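As a rough illustration of "another way", the separator can be changed as sketched below. This assumes the Users table from the question; the '; ' separator and the Group_Size alias are arbitrary choices for the example. MySQL takes a SEPARATOR keyword (and an optional ORDER BY) inside the call, while SQLite takes the separator as a second argument.

```sql
-- MySQL: sort the IDs and join them with '; ' instead of ','
SELECT COUNT(*) AS Group_Size,
       GROUP_CONCAT(User_ID ORDER BY User_ID SEPARATOR '; ') AS User_List
FROM Users
GROUP BY Age, Experience_Level
HAVING COUNT(*) > 1;

-- SQLite: the separator is passed as the second argument
SELECT COUNT(*) AS Group_Size,
       GROUP_CONCAT(User_ID, '; ') AS User_List
FROM Users
GROUP BY Age, Experience_Level
HAVING COUNT(*) > 1;
```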

There are other solutions for other SQL vendors. This question has come up many times on Stack Overflow.
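For instance, the same comma-separated list can be built with each vendor's own string aggregate. The sketch below assumes the Users table from the question and a version where the function is available (PostgreSQL 9.0+, SQL Server 2017+, Oracle 11g Release 2+); the Group_Size and User_List aliases are illustrative only.

```sql
-- PostgreSQL: string_agg needs a text argument, so cast the integer ID
SELECT COUNT(*) AS Group_Size,
       string_agg(User_ID::text, ',' ORDER BY User_ID) AS User_List
FROM Users
GROUP BY Age, Experience_Level
HAVING COUNT(*) > 1;

-- SQL Server 2017 or later: STRING_AGG with an explicit cast
SELECT COUNT(*) AS Group_Size,
       STRING_AGG(CAST(User_ID AS varchar(20)), ',') AS User_List
FROM Users
GROUP BY Age, Experience_Level
HAVING COUNT(*) > 1;

-- Oracle: LISTAGG requires a WITHIN GROUP ordering clause
SELECT COUNT(*) AS Group_Size,
       LISTAGG(User_ID, ',') WITHIN GROUP (ORDER BY User_ID) AS User_List
FROM Users
GROUP BY Age, Experience_Level
HAVING COUNT(*) > 1;
```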

SELECT t.User_ID, t.Age, t.Experience_Level
FROM Users t
INNER JOIN
(
    SELECT Age, Experience_Level
    FROM Users
    GROUP BY Age, Experience_Level
    HAVING COUNT(*) > 1
) d
    ON t.Age = d.Age
    AND t.Experience_Level = d.Experience_Level

Test script:

create table Users (
User_ID int,
Name varchar(50),
Age int,
Experience_Level int
);

insert into Users (User_ID, Name, Age, Experience_Level) values (1, 'A', 33, 1);
insert into Users (User_ID, Name, Age, Experience_Level) values (2, 'B', 37, 1);
insert into Users (User_ID, Name, Age, Experience_Level) values (3, 'C', 33, 1);
insert into Users (User_ID, Name, Age, Experience_Level) values (4, 'D', 35, 2);
insert into Users (User_ID, Name, Age, Experience_Level) values (5, 'E', 33, 1);
insert into Users (User_ID, Name, Age, Experience_Level) values (6, 'F', 35, 2);
insert into Users (User_ID, Name, Age, Experience_Level) values (7, 'G', 18, 1);
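To check the result against this test data, the join query can be run with an ORDER BY added so the output is deterministic. This is only a sketch: it assumes the table was created and filled by the test script above, and the expected rows in the comments were worked out by hand from those seven inserts.

```sql
SELECT t.User_ID, t.Age, t.Experience_Level
FROM Users t
INNER JOIN
(
    -- (Age, Experience_Level) combinations shared by more than one user
    SELECT Age, Experience_Level
    FROM Users
    GROUP BY Age, Experience_Level
    HAVING COUNT(*) > 1
) d
    ON t.Age = d.Age
    AND t.Experience_Level = d.Experience_Level
ORDER BY t.User_ID;

-- Expected rows for the test data:
--   User_ID | Age | Experience_Level
--   1       | 33  | 1
--   3       | 33  | 1
--   4       | 35  | 2
--   5       | 33  | 1
--   6       | 35  | 2
-- Users 2 (37, 1) and 7 (18, 1) are unique in their combination, so they are left out.
```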

In theory, what you want is a row-value comparison like this (PostgreSQL, MySQL, and Oracle accept it), but unfortunately SQL Server doesn't allow it:

SELECT * FROM Users 
WHERE (Age, Experience_Level) IN
(
    SELECT Age, Experience_Level
    FROM Users  
    GROUP BY Age,Experience_Level 
    HAVING Count(*) > 1 
)

So instead, you have to settle for joining to a subquery:

SELECT Users.* FROM Users 
INNER JOIN 
(
    SELECT Age, Experience_Level
    FROM Users  
    GROUP BY Age,Experience_Level 
    HAVING Count(*) > 1 
) subq 
    ON Users.Age = subq.Age
    AND Users.Experience_Level = subq.Experience_Level
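If the database supports window functions (SQL Server does, as do PostgreSQL, Oracle, MySQL 8.0+, and SQLite 3.25+), a different technique from the join above is to count each row's group in place and filter on that count. This is an alternative sketch, not part of the original answers; the counted and Group_Size names are made up for the example.

```sql
SELECT User_ID, Name, Age, Experience_Level
FROM
(
    -- Attach to every row the size of its (Age, Experience_Level) group
    SELECT u.*,
           COUNT(*) OVER (PARTITION BY Age, Experience_Level) AS Group_Size
    FROM Users u
) counted
WHERE Group_Size > 1
ORDER BY User_ID;
```

The filter has to sit in an outer query because window functions are evaluated after WHERE, so Group_Size cannot be referenced in the inner query's own WHERE clause.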

Licensed under: CC-BY-SA with attribution
