How to retain a row which is foreign key in another table and remove other duplicate rows?

https://stackoverflow.com/questions/19629956

01-07-2022

Question

I have two tables:

=====================

B:
id    Aid
1      1
2      4

(B doesn't contain any Aid that links to code C1.)

Let me explain the overall flow:
I want each row in table A to have a distinct code (by deleting the duplicates), and within each group of duplicates I want to retain the row whose id appears as Aid in table B. If none of the ids in a group is referenced in table B, I retain the row with the largest id.

So I cannot simply do something like this:

DELETE FROM A
WHERE  id NOT IN (SELECT MAX(id)
                  FROM   A
                  GROUP  BY code)

I can get the duplicated codes with the SQL statement below:

SELECT code
FROM   A
GROUP  BY code
HAVING COUNT(*) > 1

Is there some way to write something like this in SQL

for (var ids in duplicate_code_groups){
    for (var id in ids) {
        if (id in B){
            return id
        }
    }

    return max(ids)
}

and put the returned id into an idtable? I just don't know how to write such code in SQL.

Then I can do

DELETE FROM A
WHERE id NOT IN (SELECT id FROM idtable)

Solution

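Using ROW_NUMBER() inside a CTE (or subquery), you can number the rows within each Code according to your ordering, and then join that result set with table A to perform the delete. The ordering puts rows referenced in B first and, among the rest, the highest ID first, so the row numbered 1 in each group is the one to keep.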

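WITH CTE AS
(
    -- Number the rows within each Code: rows referenced in B first, then the highest ID.
    SELECT A.*,
           ROW_NUMBER() OVER (PARTITION BY A.Code
                              ORDER BY COALESCE(B.ID, 0) DESC, A.ID DESC) AS RN
    FROM A
    LEFT JOIN B ON A.ID = B.Aid
)
-- Keep row number 1 in each group and delete the rest.
DELETE A
FROM A
INNER JOIN CTE C ON A.ID = C.ID
WHERE C.RN > 1;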

SELECT * FROM A;
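
Before running the DELETE, it may help to preview which rows the numbering would mark for removal. The sketch below is only an illustration: the tables A_demo and B_demo and the rows in A_demo are hypothetical (only the two B rows come from the question), and the EXISTS flag is a variation on the LEFT JOIN above, chosen here so that an A row referenced by several B rows is still numbered only once.

```sql
-- Hypothetical demo tables; the column names follow the question.
CREATE TABLE A_demo (id INT PRIMARY KEY, code VARCHAR(10));
CREATE TABLE B_demo (id INT PRIMARY KEY, Aid INT);

-- Assumed sample rows: C1 has no row referenced in B_demo.
INSERT INTO A_demo (id, code) VALUES
    (1, 'C2'), (2, 'C2'), (3, 'C1'), (4, 'C3'), (5, 'C3'), (6, 'C1');
INSERT INTO B_demo (id, Aid) VALUES (1, 1), (2, 4);

WITH Flagged AS
(
    -- in_b = 1 when the row is referenced from B_demo, otherwise 0.
    SELECT A_demo.id,
           A_demo.code,
           CASE WHEN EXISTS (SELECT 1 FROM B_demo WHERE B_demo.Aid = A_demo.id)
                THEN 1 ELSE 0 END AS in_b
    FROM A_demo
),
Numbered AS
(
    -- Referenced rows sort first, then the largest id.
    SELECT id, code, in_b,
           ROW_NUMBER() OVER (PARTITION BY code ORDER BY in_b DESC, id DESC) AS RN
    FROM Flagged
)
-- Preview only: nothing is deleted here.
SELECT id, code, in_b,
       CASE WHEN RN = 1 THEN 'keep' ELSE 'delete' END AS verdict
FROM Numbered
ORDER BY code, RN;
```

With these sample rows, ids 1, 4 and 6 come back as 'keep' and ids 2, 3 and 5 as 'delete'.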


Other tips

The first SELECT returns every A.id that is referenced in B; you don't want to delete those. The second SELECT takes the codes for which no id appears in B and, for each such code, takes the maximum id. The union of these two sets is the ids you want to keep, so the DELETE removes every id that is not in it.

DELETE FROM A
WHERE A.id NOT IN
(
    SELECT Aid FROM B
    UNION
    SELECT MAX(A.id)
    FROM A
    LEFT OUTER JOIN B ON B.Aid = A.id
    GROUP BY code
    HAVING COUNT(B.id) = 0
)

The actual execution plan on MS SQL Server 2008 R2 shows that this solution performs quite well: it is 5-6 times faster than Nenad's solution.
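
One caveat with NOT IN: if the subquery returns a NULL, the predicate is never true for any row and nothing gets deleted. The question doesn't say whether B.Aid is nullable, so the following is just a sketch under the assumption that it might be; the only change is the extra IS NOT NULL filter.

```sql
DELETE FROM A
WHERE A.id NOT IN
(
    -- Assumption: B.Aid may contain NULLs, so filter them out explicitly.
    SELECT Aid FROM B WHERE Aid IS NOT NULL
    UNION
    -- For codes with no row referenced in B, keep the largest id.
    SELECT MAX(A.id)
    FROM A
    LEFT OUTER JOIN B ON B.Aid = A.id
    GROUP BY A.code
    HAVING COUNT(B.id) = 0
);
```

Note that this approach keeps every row referenced in B, so if two rows with the same code are both referenced, both of them survive.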

Try this solution:

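DELETE FROM A
WHERE id NOT IN
(
    -- Per code: the id referenced in B if there is one, otherwise the largest id.
    SELECT COALESCE(MAX(B.Aid), MAX(A.id))
    FROM A
    LEFT JOIN B ON A.id = B.Aid
    GROUP BY A.code
)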

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow