Question

I need to perform a self join that can match multiple rows, but I need to limit the join to a single row per record. When multiple rows match the join criteria, only the value from the row with the maximum PK should be used. Here is a simplified, hypothetical schema:

CREATE TABLE #Records(
    Id int NOT NULL,
    GroupId int NOT NULL,
    Node varchar(10) NOT NULL,
    Value varchar(10) NULL,
    Meta1 varchar(10) NULL,
    Meta2 varchar(10) NULL,
    Meta3 varchar(10) NULL
)

Here are some sample inserts:

INSERT INTO #Records VALUES(1,123,'Parent', '888', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(2,123,'Guardian', '789', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(3,123,'Parent', '999', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(4,123,'Guardian', '654', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(5,123,'Sibling', '222', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(6,456,'Parent', '777', 'meta1', 'meta2', 'meta3')
INSERT INTO #Records VALUES(7,456,'Guardian', '333', 'meta1', 'meta2', 'meta3')

In generic terms, I want the number of rows returned to equal the number of records in the table. I need a Parent column and a Guardian column. Parent should equal the Value of the most recent row, based on Id, that has a Node of 'Parent' for the matching GroupId. I need the same for Guardian, but the Node should be 'Guardian'. Results would look like this:

Id   GroupId  Node      Value  Meta1  Meta2  Meta3  Parent  Guardian
---  -------  --------  -----  -----  -----  -----  ------  --------
1    123      Parent    888    meta1  meta2  meta3  999     654
2    123      Guardian  789    meta1  meta2  meta3  999     654
3    123      Parent    999    meta1  meta2  meta3  999     654
4    123      Guardian  654    meta1  meta2  meta3  999     654
5    123      Sibling   222    meta1  meta2  meta3  999     654
6    456      Parent    777    meta1  meta2  meta3  777     333
7    456      Guardian  333    meta1  meta2  meta3  777     333

Note, I have this partially working now, but it does not limit to the latest value. It works fine when all parent and guardian value nodes have the same value. I was attempting to limit to MAX, but have failed. Looking at this query may bias your judgement, so please don't hesitate to toss it out completely.

SELECT #Records.*, Parent,Guardian
FROM #Records
LEFT JOIN (
    SELECT MAX(Id) As ParentRow, GroupId, Value AS Parent
    FROM #Records
    WHERE Node = 'Parent'
    GROUP BY GroupId, Value
) AS Parents
ON #Records.GroupId = Parents.GroupId
LEFT JOIN (
    SELECT MAX(Id) As ParentRow, GroupId, Value AS Guardian
    FROM #Records
    WHERE Node = 'Guardian'
    GROUP BY GroupId, Value
) AS Guardians
ON #Records.GroupId = Guardians.GroupId

Thanks in advance!


Solution

You were close, but your subqueries returned too many rows because they group on both GroupId and Value, which produces one row per distinct Value rather than one per group. Since you want the Value at MAX(Id) for each GroupId, you can use the ROW_NUMBER() function to achieve this:

SELECT DISTINCT #Records.*, Parent, Guardian
FROM #Records
LEFT JOIN ( SELECT GroupId, Value AS Parent,
                   ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY Id DESC) AS RowRank
            FROM #Records
            WHERE Node = 'Parent'
          ) AS Parents
    ON #Records.GroupId = Parents.GroupId
      AND Parents.RowRank = 1
LEFT JOIN ( SELECT GroupId, Value AS Guardian,
                   ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY Id DESC) AS RowRank
            FROM #Records
            WHERE Node = 'Guardian'
          ) AS Guardians
    ON #Records.GroupId = Guardians.GroupId
      AND Guardians.RowRank = 1
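
To see why the original subqueries over-returned, it may help to run one of them on its own against the sample data. This is only a diagnostic sketch built from the question's own query; the ORDER BY is added purely for readability:

```sql
-- The asker's Parents subquery, run by itself.
-- Grouping by Value yields one row per distinct Value, not one per GroupId.
SELECT MAX(Id) AS ParentRow, GroupId, Value AS Parent
FROM #Records
WHERE Node = 'Parent'
GROUP BY GroupId, Value
ORDER BY GroupId, ParentRow;
-- With the sample inserts, GroupId 123 comes back twice (Values 888 and 999),
-- so the join duplicates every row in group 123.
```

The ROW_NUMBER() version avoids this because the RowRank = 1 condition leaves at most one row per GroupId on each side of the join.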

Other tips

The following gets the most recent parent/guardian within the group for each row in your original table. It uses correlated subqueries in the select clause to do the work:

select r.*,
       (select top 1 Value
        from #Records r2
        where r2.GroupId = r.GroupId and Node = 'Parent'
        order by id desc
       ) parent,
       (select top 1 Value
        from #Records r2
        where r2.GroupId = r.GroupId and Node = 'Guardian' 
        order by id desc
       ) guardian
from #Records r;

The use of nested selects ensures that all rows in the original table are included exactly once.
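
If the correlated subqueries start to feel repetitive, roughly the same logic can be written with OUTER APPLY on SQL Server. This is a sketch of an alternative and not part of the original answer; the aliases p and g are made up for illustration:

```sql
-- Each OUTER APPLY runs once per outer row and returns at most one row,
-- so the row count of #Records is preserved.
SELECT r.*, p.Parent, g.Guardian
FROM #Records AS r
OUTER APPLY (
    SELECT TOP (1) r2.Value AS Parent
    FROM #Records AS r2
    WHERE r2.GroupId = r.GroupId
      AND r2.Node = 'Parent'
    ORDER BY r2.Id DESC          -- highest Id first
) AS p
OUTER APPLY (
    SELECT TOP (1) r2.Value AS Guardian
    FROM #Records AS r2
    WHERE r2.GroupId = r.GroupId
      AND r2.Node = 'Guardian'
    ORDER BY r2.Id DESC
) AS g;
```

Like a LEFT JOIN, OUTER APPLY keeps rows whose group has no Parent or Guardian and returns NULL in that column.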

In some databases, you can do this using window/analytic functions. The following, for instance, is Oracle syntax:

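-- Assumes a permanent table named Records; Oracle has no #temp table syntax.
select r.*,
       last_value(case when Node = 'Parent' then Value end ignore nulls)
         over (partition by GroupId order by Id rows between unbounded preceding and unbounded following) as Parent,
       last_value(case when Node = 'Guardian' then Value end ignore nulls)
         over (partition by GroupId order by Id rows between unbounded preceding and unbounded following) as Guardian
from Records r;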
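
On SQL Server, a windowed aggregate can get close to the same idea against the question's temp table. The sketch below assumes Id is unique, and the column names ParentId and GuardianId are hypothetical helpers introduced only for this example:

```sql
-- Find the highest Parent/Guardian Id per group with a windowed MAX,
-- then look up the Value stored on those rows.
SELECT r.Id, r.GroupId, r.Node, r.Value, r.Meta1, r.Meta2, r.Meta3,
       p.Value AS Parent,
       g.Value AS Guardian
FROM (
    SELECT *,
           MAX(CASE WHEN Node = 'Parent'   THEN Id END) OVER (PARTITION BY GroupId) AS ParentId,
           MAX(CASE WHEN Node = 'Guardian' THEN Id END) OVER (PARTITION BY GroupId) AS GuardianId
    FROM #Records
) AS r
LEFT JOIN #Records AS p ON p.Id = r.ParentId
LEFT JOIN #Records AS g ON g.Id = r.GuardianId;
```

Because each lookup joins on a single Id, every row of #Records appears exactly once as long as Id is unique.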
License: CC-BY-SA with attribution
Not affiliated with StackOverflow