Question

I have following data:

Id: 1       Name:   apple       ForeignKey: 10
Id: 2       Name:   apple       ForeignKey: 10
Id: 3       Name:   apple       ForeignKey: 15
Id: 4       Name:   peach       ForeignKey: 11
Id: 5       Name:   peach       ForeignKey: 12

Rows with same Name and ForeignKey are duplicates in my case. Now I want to remove all the duplicates except one instance from the table.

In other words; I want to remove all rows but one, where Name and ForeignKey are equal.

In case with upper data only the row with Id 2 OR Id 1 should be removed.

With

select count(Name), Name, ForeignKey group by Name, ForeignKey having count(Name)>1

I am able to find items where more than 1 row with same Name and Foreign key exist, but how to get the IDs of those rows? And how to get the IDs of those rows except the first/last occurrence of that row with same Name and ForeignKey?

Was it helpful?

Solution

The answer if database specific, but here is how you can do it joining the table to itself:

delete t1
from yourtable t1
    join yourtable t2 on t1.id > t2.id
        and t1.name = t2.name 
        and t1.foreignkey = t2.foreignkey

OTHER TIPS

You can also do it with a CTE & window function, deleting the duplicate rows by counting the number of rows that are the same, and then deleting all but one.

SQL Fiddle demo

;WITH myvals
AS (
    SELECT [id]
        ,[name]
        ,[foreignkey]
        ,ROW_NUMBER() OVER (
            PARTITION BY [name]
            ,[foreignkey] ORDER BY [id]
                ,[name]
                ,[foreignkey]
            ) AS inst_count
    FROM yourtable
    )
DELETE
FROM myvals
WHERE inst_count > 1;

delete x from ( select *, rn=row_number() over (partition by name,foreignkey order by name) from yourtable ) x where rn > 1

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top