Question

I have a data set in which many users have annotated a set of artifacts, and I want to know every other user that a given user has interacted with, directly or indirectly. I want to develop a query that shows, for each user, every other user who contributed to an artifact that they also contributed to. My data looks something like this:

+----------+------+
| Artifact | User |
+----------+------+
|    1     |   a  |
|    1     |   b  |
|    1     |   c  |
|    1     |   d  |
|    2     |   a  |
|    2     |   m  |
+----------+------+

So in this case, user a has interacted with b, c, d and m; b has interacted with a, c and d; and m has interacted with a.

Ideally the result won't contain duplicates (only distinct values), but I can weed those out if necessary.


Solution

You can use the GROUP_CONCAT function (MySQL syntax) together with a self-join on Artifact:

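SELECT
    t1.User,
    -- comma-separated, de-duplicated list of co-contributors
    GROUP_CONCAT(DISTINCT t2.User ORDER BY t2.User) AS interacted_with
FROM
    test t1
    -- pair each row with every other user on the same artifact
    JOIN test t2 ON (t1.Artifact = t2.Artifact AND t1.User != t2.User)
GROUP BY t1.User;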
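
If GROUP_CONCAT is not available in your database, the same self-join can return one row per distinct pair of users, and the grouping can be done in application code. The following is only a minimal sketch, not part of the original answer: it assumes Python's built-in sqlite3 module, an in-memory table named test as in the query above, and the sample rows from the question. The variable names (pairs, interactions) are made up for illustration.

```python
import sqlite3

# Hypothetical in-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (Artifact INTEGER, User TEXT)")
conn.executemany(
    "INSERT INTO test (Artifact, User) VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (1, "d"), (2, "a"), (2, "m")],
)

# Same self-join as in the answer, but returning one row per distinct
# (user, other user) pair instead of a concatenated string.
pairs = conn.execute(
    """
    SELECT DISTINCT t1.User, t2.User
    FROM test t1
    JOIN test t2 ON t1.Artifact = t2.Artifact AND t1.User != t2.User
    ORDER BY t1.User, t2.User
    """
).fetchall()

# Group the pairs per user on the client side.
interactions = {}
for user, other in pairs:
    interactions.setdefault(user, []).append(other)

for user, others in interactions.items():
    print(user, ",".join(others))
```

SELECT DISTINCT removes the duplicate pairs that would otherwise appear when two users share more than one artifact, which covers the "only distinct values" requirement. Note that this, like the GROUP_CONCAT query, only finds users who share an artifact directly.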


Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow