Finding list of Duplicate records from 2 tables in DB2
30-09-2020
Question
I have a Person table and an Address table. Their relationship is Person.AddrId = Address.Id.
The Person table has First_name, Last_name, and AddrId.
The Address table has Address1, Zipcode, and Id.
I want to list all duplicate records for this combination of columns.
I tried the query below:
SELECT A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE, count(1)
FROM SCHEMA.PERSON A, SCHEMA.ADDRESS B
WHERE
A.ADDR_ID = B.ID
group by A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
having count(1) > 1
Unfortunately, this returns only one row per duplicated combination. I want both the original and the duplicate records.
Solution
SQL is closed: the result of a query is itself a table that can be used in further queries. You can therefore join the result of your duplicates query with the original tables. Something like the query below should work:
select t2.*
from (
SELECT A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
FROM SCHEMA.PERSON A
JOIN SCHEMA.ADDRESS B
ON A.ADDR_ID = B.ID
group by A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
having count(1) > 1
) as t1
join (
SELECT A.*, B.*
FROM SCHEMA.PERSON A
JOIN SCHEMA.ADDRESS B
ON A.ADDR_ID = B.ID
) as t2
on (t1.FIRST_NAME, t1.LAST_NAME, t1.ADDR, t1.ZIPCODE)
= (t2.FIRST_NAME, t2.LAST_NAME, t2.ADDR, t2.ZIPCODE)
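To see what the join adds over the plain GROUP BY, here is a minimal sketch with made-up data. The table and column names follow the queries above. The rows, the column types, and the use of the default schema instead of SCHEMA are assumptions for illustration only.

```sql
-- Hypothetical sample tables; column types and lengths are assumptions.
CREATE TABLE ADDRESS (
    ID      INTEGER NOT NULL PRIMARY KEY,
    ADDR    VARCHAR(50),
    ZIPCODE VARCHAR(10)
);

CREATE TABLE PERSON (
    FIRST_NAME VARCHAR(30),
    LAST_NAME  VARCHAR(30),
    ADDR_ID    INTEGER REFERENCES ADDRESS (ID)
);

INSERT INTO ADDRESS (ID, ADDR, ZIPCODE) VALUES
    (1, '1 Main St', '11111'),
    (2, '1 Main St', '11111'),   -- same address stored under a second ID
    (3, '9 Oak Ave', '22222');

INSERT INTO PERSON (FIRST_NAME, LAST_NAME, ADDR_ID) VALUES
    ('Ann', 'Lee', 1),
    ('Ann', 'Lee', 2),   -- duplicate of the row above by name, address, and zip code
    ('Bob', 'Ray', 3);   -- unique combination
```

With this data, the GROUP BY ... HAVING query on its own returns a single row for the combination ('Ann', 'Lee', '1 Main St', '11111'). The joined query returns both Ann Lee rows, with ADDR_ID 1 and 2. Bob Ray appears in neither result.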
Using a CTE avoids writing the join twice and will probably be a bit more efficient:
with t as (
SELECT A.*, B.*
FROM SCHEMA.PERSON A
JOIN SCHEMA.ADDRESS B
ON A.ADDR_ID = B.ID
)
select t1.*
from t as t1
join (
SELECT FIRST_NAME, LAST_NAME, ADDR, ZIPCODE
FROM t
group by FIRST_NAME, LAST_NAME, ADDR, ZIPCODE
having count(1) > 1
) as t2
on (t1.FIRST_NAME, t1.LAST_NAME, t1.ADDR, t1.ZIPCODE)
= (t2.FIRST_NAME, t2.LAST_NAME, t2.ADDR, t2.ZIPCODE)
I replaced the "," joins with explicit ones. I also used * since I did not know the names of the additional columns.
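An alternative that avoids joining the result back to itself is a window function: count the rows per combination with COUNT(*) OVER (PARTITION BY ...) and keep the rows whose count is above 1. This is a different technique from the joins shown above, sketched under the same assumed table and column names. The alias DUP_COUNT is introduced here for illustration.

```sql
-- Sketch: keep every row whose (name, address, zip code) combination
-- occurs more than once, without a second join.
SELECT FIRST_NAME, LAST_NAME, ADDR_ID, ADDR, ZIPCODE, DUP_COUNT
FROM (
    SELECT A.FIRST_NAME, A.LAST_NAME, A.ADDR_ID, B.ADDR, B.ZIPCODE,
           -- number of joined rows sharing the same combination
           COUNT(*) OVER (PARTITION BY A.FIRST_NAME, A.LAST_NAME,
                                       B.ADDR, B.ZIPCODE) AS DUP_COUNT
    FROM SCHEMA.PERSON A
    JOIN SCHEMA.ADDRESS B
      ON A.ADDR_ID = B.ID
) AS T
WHERE DUP_COUNT > 1
ORDER BY FIRST_NAME, LAST_NAME, ADDR, ZIPCODE
```

One difference in behavior: PARTITION BY, like GROUP BY, puts NULL values in the same group, whereas the equality comparison in the join versions never matches a NULL. Rows with a NULL in one of the four columns are therefore reported by this query but dropped by the joins.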