Question

I have a Person table and an Address table. They are related by Person.AddrId = Address.Id.

The Person table has First_name, Last_name, and AddrId.

The Address table has Address1, Zipcode, and Id.

I want to list all the duplicate records for this combination of columns.

I tried the query below:

SELECT  A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE, count(1)
FROM  SCHEMA.PERSON A, SCHEMA.ADDRESS B
WHERE 
A.ADDR_ID = B.ID
group by A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
having count(1) > 1 

Unfortunately, this returns only one row per duplicated combination. I want every matching row, the original as well as its duplicates.

Solution

SQL is closed: the result of a query is itself a table that can be used in further queries. You can therefore join the result of your duplicates query back to the original tables. Something like the query below should work:

select t2.*
from (
    SELECT  A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
    FROM  SCHEMA.PERSON A
    JOIN  SCHEMA.ADDRESS B
        ON A.ADDR_ID = B.ID
    group by A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
    having count(1) > 1
) as t1
join ( 
    SELECT  A.*, B.*
    FROM  SCHEMA.PERSON A
    JOIN  SCHEMA.ADDRESS B
        ON A.ADDR_ID = B.ID
) as t2
     on (t1.FIRST_NAME, t1.LAST_NAME, t1.ADDR, t1.ZIPCODE)
      = (t2.FIRST_NAME, t2.LAST_NAME, t2.ADDR, t2.ZIPCODE)
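
The row comparison in the ON clause is not accepted by every DBMS. As a sketch, assuming the same table and column names as above, the same join can be spelled out with one equality per column:

select t2.*
from (
    -- one row per combination that occurs more than once
    SELECT  A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
    FROM  SCHEMA.PERSON A
    JOIN  SCHEMA.ADDRESS B
        ON A.ADDR_ID = B.ID
    group by A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
    having count(1) > 1
) as t1
join (
    -- every person row together with its address
    SELECT  A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
    FROM  SCHEMA.PERSON A
    JOIN  SCHEMA.ADDRESS B
        ON A.ADDR_ID = B.ID
) as t2
    on  t1.FIRST_NAME = t2.FIRST_NAME
    and t1.LAST_NAME  = t2.LAST_NAME
    and t1.ADDR       = t2.ADDR
    and t1.ZIPCODE    = t2.ZIPCODE

Note that GROUP BY puts NULLs in the same group, while = never matches NULL to NULL, so in either form a duplicated combination that contains a NULL in one of these columns will not be returned.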

Using a CTE avoids writing the join twice and will probably be a bit more efficient:

with t1 as (
    SELECT  A.*, B.*
    FROM  SCHEMA.PERSON A
    JOIN  SCHEMA.ADDRESS B
        ON A.ADDR_ID = B.ID
)
select t1.*
from t1
join (
    SELECT  FIRST_NAME, LAST_NAME, ADDR, ZIPCODE
    FROM  T1
    group by FIRST_NAME, LAST_NAME, ADDR, ZIPCODE
    having count(1) > 1
) as t2
    on (t1.FIRST_NAME, t1.LAST_NAME, t1.ADDR, t1.ZIPCODE)
     = (t2.FIRST_NAME, t2.LAST_NAME, t2.ADDR, t2.ZIPCODE)

I replaced the "," joins with explicit ones. I also used * because I did not know the names of the additional columns.
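
If your DBMS supports window functions, a different technique avoids the self-join altogether: count the rows in each combination with count(*) over (partition by ...) and keep the rows whose count is greater than 1. This is only a sketch under the same naming assumptions as above, and dup_count is an alias I made up for illustration:

select *
from (
    SELECT  A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE,
            -- number of rows sharing this name/address combination
            count(*) over (
                partition by A.FIRST_NAME, A.LAST_NAME, B.ADDR, B.ZIPCODE
            ) as dup_count
    FROM  SCHEMA.PERSON A
    JOIN  SCHEMA.ADDRESS B
        ON A.ADDR_ID = B.ID
) as t
where dup_count > 1

The filter has to sit in an outer query because a window function cannot be referenced in the WHERE clause of the query that computes it.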

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange