Question

I am trying to identify groups of duplicates in a table. My table looks like this:

Column1-Column2

  1. 1-1
  2. 1-2
  3. 1-3
  4. 2-1
  5. 2-2
  6. 2-3
  7. 3-1
  8. 3-2
  9. 3-4
  10. 4-1
  11. 4-2
  12. 4-3
  13. 4-4
  14. 5-1
  15. 5-2
  16. 5-4

    • 1 has a group of {1,2,3}
    • 2 has a group of {1,2,3}
    • And are duplicates
    • 3 has a group of {1,2,4}
    • 5 has a group of {1,2,4}
    • And are duplicates
    • 4 has a group of {1,2,3,4}
    • And has no friends ;)

Column 2 is really a varchar column, but I made everything numbers for simplicity's sake.
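For anyone who wants to reproduce this, the sample data could be loaded roughly as follows. This is only a sketch: the temp table name `#tbl` is borrowed from the answer further down, and the column types (`INT` for column 1, `VARCHAR(10)` for column 2) are assumptions, since the question only states that column 2 is a varchar.

```sql
-- Assumed schema: only "column2 is varchar" comes from the question.
CREATE TABLE #tbl (
    column1 INT         NOT NULL,  -- group ID (type assumed)
    column2 VARCHAR(10) NOT NULL   -- group member (length assumed)
);

-- The 16 sample rows listed above
INSERT INTO #tbl (column1, column2) VALUES
    (1, '1'), (1, '2'), (1, '3'),
    (2, '1'), (2, '2'), (2, '3'),
    (3, '1'), (3, '2'), (3, '4'),
    (4, '1'), (4, '2'), (4, '3'), (4, '4'),
    (5, '1'), (5, '2'), (5, '4');
```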

I have been playing with CheckSum_Agg, but it has false positives. :(
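A CHECKSUM_AGG attempt would typically look something like the sketch below (assuming a table named `#tbl` with columns `column1` and `column2`; the exact query I tried is not shown here). Because each group is collapsed into a single integer, two different sets of values can end up with the same checksum, which is where the false positives come from.

```sql
-- Sketch only: equal checksums mark *candidate* duplicates, not proven ones.
SELECT
    column1,
    CHECKSUM_AGG(CHECKSUM(column2)) AS group_checksum  -- one integer per group
FROM #tbl
GROUP BY column1;
```

Any groups that share a `group_checksum` would still need an exact comparison of their values before being reported as duplicates.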

My output would look something like this:

  • 1,2
  • 3,5

Where I select the min ID for the first column and all of the other values for the second column. Non-duplicates are omitted.

Another example might look like:

  • 1,2
  • 1,6
  • 3,5
  • 3,7
  • 3,8
  • (Notice there is no "4" in the list. I just added other "pairs" to show that 1 and 3 are the lowest. If 4 is in the list, like 4,0 or 4,null, I can make that work too.)

I'm using SQL Server 2012. Thanks!

Solution

-- This code produced the results I was looking for in the original post.

WITH t AS (
  -- Number of rows in each column1 group
  SELECT
    column1,
    COUNT(*) c
  FROM #tbl
  GROUP BY column1
),
tt AS (
  -- Pair up groups of equal size whose column2 values match;
  -- the lower ID is the winner, the higher ID the loser
  SELECT
    t1.column1 AS winner,
    t2.column1 AS loser
  FROM t t1
  INNER JOIN t t2 ON (
    t1.c = t2.c AND
    t1.column1 < t2.column1
  )
  WHERE NOT EXISTS (
    SELECT column2 FROM #tbl WHERE column1 = t1.column1
    EXCEPT
    SELECT column2 FROM #tbl WHERE column1 = t2.column1
  )
)
-- Keep only the pairs whose winner never appears as a loser,
-- i.e. the lowest ID of each set of duplicates
SELECT fullList.winner, fullList.loser
FROM (
  SELECT winner FROM tt tt1
  EXCEPT
  SELECT loser FROM tt tt2
) winnerList
JOIN tt fullList ON winnerList.winner = fullList.winner
ORDER BY fullList.winner, fullList.loser
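
One thing worth noting about the equal-count-plus-EXCEPT test: it only proves the two sets are identical if each (column1, column2) pair occurs at most once. If the same value can repeat within a group, groups {1,1,2} and {1,2,3} both have three rows and the EXCEPT comes back empty, so they would be reported as duplicates. A possible variant, sketched here under the assumption that repeated rows should be ignored, counts distinct values instead:

```sql
WITH t AS (
    -- Count distinct members so repeated rows cannot mask a size difference
    SELECT column1, COUNT(DISTINCT column2) AS c
    FROM #tbl
    GROUP BY column1
)
SELECT
    t1.column1 AS winner,
    t2.column1 AS loser
FROM t AS t1
INNER JOIN t AS t2
    ON  t1.c = t2.c
    AND t1.column1 < t2.column1
WHERE NOT EXISTS (
    -- EXCEPT already compares distinct values, so no change is needed here
    SELECT column2 FROM #tbl WHERE column1 = t1.column1
    EXCEPT
    SELECT column2 FROM #tbl WHERE column1 = t2.column1
);
```

If the table already guarantees unique (column1, column2) pairs, for example through a primary key, the original COUNT(*) behaves the same and this change is unnecessary.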

Other answers

WITH t AS (
  SELECT
    column1,
    COUNT(*) c
  FROM MyTable
  GROUP BY column1
)
SELECT
  t1.column1,
  t2.column1
FROM t t1
INNER JOIN t t2 ON (
  t1.c = t2.c AND
  t2.column1 > t1.column1
)
WHERE NOT EXISTS (
  SELECT column2 FROM MyTable WHERE column1 = t1.column1
  EXCEPT
  SELECT column2 FROM MyTable WHERE column1 = t2.column1
)

SELECT column1, column2
FROM my_table
GROUP BY column1, column2
HAVING COUNT(*) > 1

This will give you the list of duplicate records, i.e. the (column1, column2) pairs that occur more than once.

Licensed under: CC-BY-SA with attribution