I have a CTE in which I am finding duplicate records matching on 5 columns:
;WITH DuplicateCount AS
(
SELECT
FirstName,
LastName,
DateofBirth,
Email,
c1.Status,
Count(*) AS TotalCount
FROM Customer c
INNER JOIN Customer_1 c1 ON c1.customerID = c.customerID
GROUP BY FirstName, LastName, DateofBirth, Email, c1.Status
HAVING COUNT(*) > 1
)
I am then selecting Status and TotalCount from that CTE and joining an Enum table to produce readable data:
;WITH DuplicateCount AS
(
SELECT
FirstName,
LastName,
DateofBirth,
Email,
c1.Status,
Count(*) AS TotalCount
FROM Customer c
INNER JOIN Customer_1 c1 ON c1.customerID = c.customerID
GROUP BY FirstName, LastName, DateofBirth, Email, c1.Status
HAVING COUNT(*) > 1
)
SELECT e.Display, dc.TotalCount
FROM DuplicateCount dc
INNER JOIN Enum e ON dc.Status = e.[Index]
In this scenario, I am able to pull back readable data and use Excel to spit out a graph report of duplicates by Status.
Problem
I need to join the Customer_1 table once again to gather one more column: Stage. Here is how I tried to do it:
;WITH DuplicateCount AS
(
SELECT c.customerID,
FirstName,
LastName,
DateofBirth,
Email,
c1.Status,
Count(*) AS TotalCount
FROM Customer c
INNER JOIN Customer_1 c1 ON c1.customerID = c.customerID
GROUP BY c.customerID, FirstName, LastName, DateofBirth, Email, c1.Status
HAVING COUNT(*) > 1
)
SELECT e.Display,
CASE
WHEN c1.Stage = 6 THEN 'First'
WHEN c1.Stage = 7 THEN 'Second'
WHEN c1.Stage = 8 THEN 'Third'
WHEN c1.Stage = 11 THEN 'Fourth'
WHEN c1.Stage = 9 THEN 'Fifth'
WHEN c1.Stage = 10 THEN 'Sixth'
WHEN c1.Stage = 12 THEN 'Unknown'
ELSE ''
END AS Stage,
dc.TotalCount
FROM DuplicateCount dc
INNER JOIN Enum e ON dc.Status = e.[Index]
INNER JOIN Customer_1 c1 ON c1.customerID = dc.customerID
Obviously, that didn't work: customerID is the PK, so once I group by it every group contains exactly one row, and HAVING COUNT(*) > 1 filters everything out.
Is there a way to join a table to my CTE without a PK? Or somehow add a PK to my CTE without grouping by it?
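For illustration, here is a minimal sketch of one way the PK could be kept without grouping by it, assuming SQL Server and that a windowed count is acceptable. COUNT(*) OVER (PARTITION BY ...) computes the group size but keeps every row, so customerID stays available for later joins. The CTE name Flagged is hypothetical, and the sketch is untested against the real schema.

```sql
;WITH Flagged AS
(
    SELECT
        c.customerID,
        FirstName,
        LastName,
        DateofBirth,
        Email,
        c1.Status,
        c1.Stage,
        -- Size of the duplicate group this row belongs to; no GROUP BY needed
        COUNT(*) OVER (PARTITION BY FirstName, LastName, DateofBirth, Email, c1.Status) AS TotalCount
    FROM Customer c
    INNER JOIN Customer_1 c1 ON c1.customerID = c.customerID
)
SELECT customerID, FirstName, LastName, Status, Stage, TotalCount
FROM Flagged
WHERE TotalCount > 1;  -- keep only rows that have at least one duplicate
```

Because each duplicate row is returned individually here, TotalCount repeats on every row of a group rather than appearing once per group.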
Edit: This is what I am trying to achieve:
| FirstName | LastName | Stage | Total Count |
|-----------|----------|--------|-------------|
| John | Smith | First | 2 |
| John | Smith | Third | 2 |
| Alex | Smith | First | 2 |
| Jane | Smith | Third | 2 |
| Jane | Smith | First | 2 |
| Jack | Smith | Second | 2 |
Then, when reporting on this data:
John Smith has 4 total records: two in First, two in Third.
Alex Smith has 2 total records: two in First.
Jane Smith has 4 total records: two in First, two in Third.
Jack Smith has 2 total records: two in Second.
When graphing this data, I should be able to see:
First: 6 total.
Second: 2 total.
Third: 4 total.
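As a rough sketch of how those per-Stage totals might be produced: Stage comes from Customer_1, which is already joined inside the CTE, so it could be added to the GROUP BY there instead of joining a second time. This assumes SQL Server and that a "duplicate" means two or more rows sharing the five columns and the same Stage. The StageTotal alias is hypothetical, and the query is untested against the real tables.

```sql
;WITH DuplicateCount AS
(
    SELECT
        FirstName,
        LastName,
        DateofBirth,
        Email,
        c1.Status,
        c1.Stage,                -- already available from the existing join
        COUNT(*) AS TotalCount
    FROM Customer c
    INNER JOIN Customer_1 c1 ON c1.customerID = c.customerID
    GROUP BY FirstName, LastName, DateofBirth, Email, c1.Status, c1.Stage
    HAVING COUNT(*) > 1
)
SELECT
    Stage,
    SUM(TotalCount) AS StageTotal  -- duplicate rows summed across all people in that Stage
FROM DuplicateCount
GROUP BY Stage;
```

Selecting FirstName, LastName, Stage, and TotalCount directly from the CTE would give the per-person rows instead, and the Stage codes can be mapped to labels with the same CASE expression used earlier.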
Ideally, I could then also bring in CreatedDate and begin to gather data-over-time reports for:
How many duplicates per Stage.
How many duplicates per Person.
How many duplicates for specific date ranges, events, etc.
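A possible shape for the over-time reports, sketched under loud assumptions: SQL Server 2012 or later (for DATEFROMPARTS), CreatedDate is a column on Customer, and monthly buckets are what is wanted. None of that is confirmed above, and the names Flagged, CreatedMonth, and DuplicateRows are hypothetical.

```sql
;WITH Flagged AS
(
    SELECT
        c1.Stage,
        c.CreatedDate,  -- assumption: CreatedDate lives on Customer
        -- Group size per set of matching columns, kept on every row
        COUNT(*) OVER (PARTITION BY FirstName, LastName, DateofBirth, Email, c1.Status) AS TotalCount
    FROM Customer c
    INNER JOIN Customer_1 c1 ON c1.customerID = c.customerID
)
SELECT
    Stage,
    DATEFROMPARTS(YEAR(CreatedDate), MONTH(CreatedDate), 1) AS CreatedMonth,
    COUNT(*) AS DuplicateRows  -- duplicate rows created in that month for that Stage
FROM Flagged
WHERE TotalCount > 1
GROUP BY Stage, DATEFROMPARTS(YEAR(CreatedDate), MONTH(CreatedDate), 1)
ORDER BY CreatedMonth, Stage;
```

Adding a WHERE filter on CreatedDate would narrow this to a specific date range, and grouping by the name columns instead of Stage would give duplicates per person.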