Question

Editing someone else's SQL Server design here:

I have been tasked with removing duplicate rows from a database.

I have 2 tables: table1 and table2.

  • table1 has columns T1ID and T1NAME
  • table2 has columns T2ID, DATA1, DATA2, DATA3

  • Tables join on T1ID = T2ID

If several rows have the same T1NAME, DATA1, DATA2, and DATA3, I need to remove all but one of them from both tables.


Solution

I am assuming T1ID in table1 and T2ID in table2 are the primary keys of their respective tables.

If so, you can use the following approach:

1) Because you want to delete from both tables, store the T1ID (or T2ID) values in a temp table first, so that the same values can be used for the second delete.

2) To find those ids, join the two tables and group by T1NAME, DATA1, DATA2, DATA3; any group with more than one row is a set of duplicates.

3) In each set of duplicates, keep one T1ID (for example the lowest) and mark all the others for deletion.

For this you can use commands like the ones shown below:

-- Collect every id that is not the lowest id of its
-- (T1NAME, DATA1, DATA2, DATA3) group. Groups with a single row
-- contribute nothing, because their only id is also the minimum.
SELECT t1.T1ID
  INTO #test_table
  FROM table1 AS t1
  JOIN table2 AS t2 ON t2.T2ID = t1.T1ID
 WHERE t1.T1ID NOT IN (SELECT MIN(a.T1ID)
                         FROM table1 AS a
                         JOIN table2 AS b ON b.T2ID = a.T1ID
                        GROUP BY a.T1NAME, b.DATA1, b.DATA2, b.DATA3);

-- Run both deletes in one transaction so the tables stay in sync.
BEGIN TRANSACTION;

DELETE FROM table1 WHERE T1ID IN (SELECT T1ID FROM #test_table);

DELETE FROM table2 WHERE T2ID IN (SELECT T1ID FROM #test_table);

COMMIT TRANSACTION;

DROP TABLE #test_table;
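
Before running any delete, it may be worth previewing which groups count as duplicates. The query below is only a sketch of that check, using the table and column names from the question; the aliases `id_to_keep` and `rows_in_group` are made up for illustration. It modifies nothing.

```sql
-- Preview only: one row per duplicate group, with the id that would
-- survive (the lowest) and the number of rows currently in the group.
SELECT t1.T1NAME, t2.DATA1, t2.DATA2, t2.DATA3,
       MIN(t1.T1ID) AS id_to_keep,
       COUNT(*)     AS rows_in_group
  FROM table1 AS t1
  JOIN table2 AS t2 ON t2.T2ID = t1.T1ID
 GROUP BY t1.T1NAME, t2.DATA1, t2.DATA2, t2.DATA3
HAVING COUNT(*) > 1;
```

If this returns no rows, there is nothing to delete. Otherwise the sum of `rows_in_group - 1` over all rows is the number of ids a delete should remove from each table.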

Other answers

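Untested, but try something like this:

;with ranked
as
(
-- number the rows inside each group of duplicates; the lowest T1ID gets rn = 1
select   t1.T1ID
        ,row_number() over(partition by t1.T1NAME,t2.DATA1,t2.DATA2,t2.DATA3 order by t1.T1ID) as rn
from table1 as t1
inner join table2 as t2
on t2.T2ID = t1.T1ID
)
-- a CTE over a join cannot be the target of a delete, so save the ids first
select T1ID
into #deleteThis
from ranked
where rn > 1;

delete from table1 where T1ID in (select T1ID from #deleteThis);
delete from table2 where T2ID in (select T1ID from #deleteThis);

drop table #deleteThis;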
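
To see how ROW_NUMBER behaves when it is partitioned by the columns that define a duplicate (T1NAME, DATA1, DATA2, DATA3), here is a small self-contained sketch. The table variables and sample rows are invented for illustration, the column types are assumptions, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data: ids 1 and 2 are duplicates of each other,
-- id 3 differs in T1NAME and so forms its own group.
DECLARE @table1 TABLE (T1ID int PRIMARY KEY, T1NAME varchar(20));
DECLARE @table2 TABLE (T2ID int PRIMARY KEY, DATA1 int, DATA2 int, DATA3 int);

INSERT INTO @table1 VALUES (1, 'a'), (2, 'a'), (3, 'b');
INSERT INTO @table2 VALUES (1, 10, 20, 30), (2, 10, 20, 30), (3, 10, 20, 30);

-- rn restarts at 1 for every distinct (T1NAME, DATA1, DATA2, DATA3) group.
SELECT t1.T1ID, t1.T1NAME,
       ROW_NUMBER() OVER (PARTITION BY t1.T1NAME, t2.DATA1, t2.DATA2, t2.DATA3
                          ORDER BY t1.T1ID) AS rn
  FROM @table1 AS t1
  JOIN @table2 AS t2 ON t2.T2ID = t1.T1ID;
-- T1ID 1 -> rn 1, T1ID 2 -> rn 2, T1ID 3 -> rn 1
```

Only T1ID 2 ends up with rn > 1, so it is the only row a `where rn > 1` filter would select for deletion. Partitioning by the id itself would give every row rn = 1, since each id is unique.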
Licensed under: CC-BY-SA with attribution