Question

I need to apply some logic during an update to a table.

existing_items is the target table, and received_items holds either updates to existing_items or entirely new items.

The logic is this: for each group of received_items rows, a matching row should be identified in existing_items. If no match is found, a new line should be created.

The kicker is that rows can match on multiple criteria. They should always be matched on code, plus line_no (if given) and ref (if given). received_items rows should be processed in processing_seq order and, potentially, checked for a match in that order as well.

Each existing_items row can be matched by only one group of received_items rows. Once the available rows have been matched, any remaining group becomes a new line.

Given:

 create table #existing_items(id int identity(1,1), code varchar(10)
,qty numeric(10,2), line_no int, ref varchar(10))

create table #received_items(code varchar(10), qty numeric(10,2), line_no int
,ref varchar(10), processing_seq int)

insert into #existing_items (code, qty, line_no, ref)
    values('ABC123',2.0, 1, NULL)
insert into #existing_items (code, qty, line_no, ref)
    values('ABC123',3.0, 2, '1001')
insert into #received_items(code, qty, line_no, ref, processing_seq)
    values ('ABC123', 4, NULL, NULL, 1)
insert into #received_items(code, qty, line_no, ref, processing_seq)
    values ('ABC123', 3, NULL, NULL, 1)
insert into #received_items(code, qty, line_no, ref, processing_seq)
    values ('ABC123', 4, NULL, '1003', 2)
insert into #received_items(code, qty, line_no, ref, processing_seq)
    values ('ABC123', 4, 2, '1002', 3)
insert into #received_items(code, qty, line_no, ref, processing_seq)
    values ('ABC123', 5, NULL, NULL, 4)

select * from #received_items

        code      qty     line_no ref     processing_seq
        ABC123    4.00    NULL    NULL    1
        ABC123    3.00    NULL    NULL    1
        ABC123    4.00    NULL    1003    2
        ABC123    4.00    2       1002    3
        ABC123    5.00    NULL    NULL    4

select * from #existing_items

        id   code      qty     line_no  ref
        1    ABC123    2.00    1        NULL
        2    ABC123    3.00    2        1001

The results should be:

        id   code      qty     line_no  ref
        1    ABC123    7.00    1        NULL
        2    ABC123    4.00    2        1002
        3    ABC123    4.00    3        1003
        4    ABC123    5.00    4        NULL

To explain:

The existing_items row with id=1 is updated to a qty of 7, because received_items rows are grouped by (code, line_no, ref, processing_seq) and the two rows with processing_seq = 1 sum to 7. The group was matched on code only, because no line_no or ref was supplied.

A new item is created with id=3, because no match was found for ref 1003.

The existing_items row with id=2 has its qty and ref updated, because a match was found on line_no.

A new row is created with id=4, because there are no rows left to match (id=1 has already been matched with the first set, where processing_seq = 1).

I'm not sure how to go about it. I was thinking of a cursor, but there might be an easier way. I am currently working with multiple self joins, like so:

select grp.*
    ,fm.id as full_match, rm.id as ref_match, lm.id as line_match
    -- code-only match: attempted only when nothing else matched and no ref was supplied
    ,(select min(id) from #existing_items where code = grp.code
        and rm.id IS NULL and lm.id IS NULL and fm.id IS NULL and grp.ref IS NULL
    ) as code_match
    -- ,cm.id as code_match
from (
    select ri.code, sum(ri.qty) qty, ri.line_no, ri.ref, ri.processing_seq
    from #received_items ri
    group by code, line_no, ref, processing_seq
) grp
left outer join #existing_items fm -- full match
    on grp.code = fm.code and grp.line_no = fm.line_no and grp.ref = fm.ref
left outer join #existing_items rm -- ref match
    on grp.code = rm.code and grp.ref = rm.ref
left outer join #existing_items lm -- line match
    on grp.code = lm.code and grp.line_no = lm.line_no
order by grp.processing_seq

This gets part way to knowing which row to update and produces this interim result:

Code    Qty    Line_No  Ref    seq    fm.id   rm.id   lm.id   cm.id
ABC123  7.00   NULL     NULL   1      NULL    NULL    NULL    1
ABC123  4.00   NULL     1003   2      NULL    NULL    NULL    NULL
ABC123  4.00   2        1002   3      NULL    NULL    2       NULL
ABC123  5.00   NULL     NULL   4      NULL    NULL    NULL    1

I need a way to identify the nearest match on code only. That works for seq=1, but not for seq=4, which should have a cm.id of NULL. So would I need to change my subquery so that it does not return ids that earlier rows have already matched? Then I should be able to insert wherever there isn't an id in any of the matching columns.

Any insight into how to approach the problem would be greatly appreciated.


Solution 2

After a bit of time I managed to figure this one out:

;WITH existing as (
SELECT
    id, code, qty, line_no, ref
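    -- numbered per code so it lines up with received.code_inst when several codes are present
    ,ROW_NUMBER() OVER (PARTITION BY code ORDER BY line_no) AS code_inst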
FROM #existing_items
)
,received as (
SELECT
    code, SUM(qty) qty, line_no, ref
    ,ROW_NUMBER() OVER (ORDER BY processing_seq) pseq
    ,ROW_NUMBER() OVER (PARTITION BY code ORDER BY processing_seq) code_inst
FROM
    #received_items
GROUP BY
    code, line_no, ref, processing_seq
)
SELECT
    recv.*, COALESCE(lm.id, rm.id, cm.id) AS matched_id
FROM
    received recv
LEFT OUTER JOIN
    existing lm --line match
ON
    recv.code = lm.code and recv.line_no = lm.line_no
LEFT OUTER JOIN
    existing rm --ref match
ON
    recv.code = rm.code and recv.ref = rm.ref
LEFT OUTER JOIN
    existing cm --code match
ON
    recv.code = cm.code 
    AND recv.code_inst = cm.code_inst
    AND cm.ref IS NULL
    AND lm.id IS NULL AND rm.id IS NULL

Joining three times isn't ideal, but given the logic involved (different matching criteria) it made sense. I ended up loading this result set into a table variable, then updated existing_items where the id matched and inserted new rows where matched_id was NULL.
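
That final update-then-insert step might look something like the sketch below. It is not the exact code from this answer: the table variable @matched, its hand-filled sample rows (written to mirror the question's sample data), keeping the old ref when none is supplied, and numbering new lines from the next free line_no per code are all assumptions made for illustration.

```sql
-- Sketch only: rebuild the sample target table so this block runs on its own.
IF OBJECT_ID('tempdb..#existing_items') IS NOT NULL DROP TABLE #existing_items;
CREATE TABLE #existing_items (
    id int IDENTITY(1,1), code varchar(10),
    qty numeric(10,2), line_no int, ref varchar(10)
);
INSERT INTO #existing_items (code, qty, line_no, ref)
VALUES ('ABC123', 2.0, 1, NULL),
       ('ABC123', 3.0, 2, '1001');

-- Hypothetical table variable standing in for the output of the matching query.
DECLARE @matched TABLE (
    code varchar(10), qty numeric(10,2), line_no int,
    ref varchar(10), pseq int, matched_id int
);

-- Hand-filled sample rows (assumption); in practice this would be
-- INSERT INTO @matched ... SELECT ... from the matching query.
INSERT INTO @matched (code, qty, line_no, ref, pseq, matched_id)
VALUES ('ABC123', 7.0, NULL, NULL,   1, 1),
       ('ABC123', 4.0, NULL, '1003', 2, NULL),
       ('ABC123', 4.0, 2,    '1002', 3, 2),
       ('ABC123', 5.0, NULL, NULL,   4, NULL);

BEGIN TRANSACTION;

-- Matched rows: overwrite qty, and take the received ref when one was supplied.
UPDATE ei
SET    ei.qty = m.qty,
       ei.ref = COALESCE(m.ref, ei.ref)
FROM   #existing_items AS ei
JOIN   @matched AS m ON m.matched_id = ei.id;

-- Unmatched rows: insert as new lines in processing order.
-- Assumption: a missing line_no becomes the next free number for that code.
INSERT INTO #existing_items (code, qty, line_no, ref)
SELECT m.code, m.qty,
       COALESCE(m.line_no,
                (SELECT COALESCE(MAX(e.line_no), 0)
                 FROM #existing_items AS e
                 WHERE e.code = m.code)
                + ROW_NUMBER() OVER (PARTITION BY m.code ORDER BY m.pseq)),
       m.ref
FROM   @matched AS m
WHERE  m.matched_id IS NULL
ORDER BY m.pseq;

COMMIT TRANSACTION;

SELECT * FROM #existing_items ORDER BY id;
```

Wrapping both statements in one transaction keeps the target table from being left half-updated if the insert fails. The ORDER BY on the insert is there so that identity values are handed out in processing_seq order.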

OTHER TIPS

The way you are updating your existing data carries a lot of overhead. For each row you first have to identify whether it should be updated, and then run an update command as well. You can estimate the cost in server resources yourself: you loop through each record, work out whether it has changed, and then update it. On top of that you are using multiple tables: the two temp tables you mention (#existing_items and #received_items) plus your actual data in another table. That adds up to a lot of overhead.

I don't know whether this is possible in your case, but here is what I would suggest. Suppose I have an issue request in my inventory system, stored in two tables named ISSUE_MAST and ISSUE_DETL. ISSUE_MAST contains ISSUE_DOC_NO, and ISSUE_DETL contains ISSUE_DOC_NO and SEQ_NO.

Now suppose there is a new issue request to record. I can simply store the data in the MAST and DETL tables. Because the request is entirely new, there is no problem, and that is not where your problem lies. Now suppose that one day the request has to be changed. The user provides me with the MAST and DETL information again, and I can simply remove all DETL rows whose DOC_NO is the current document number and insert all of the supplied rows again as fresh data.

I don't like the update logic at all, because of the overhead. In some cases we have to go that way, but in most cases we should not. If possible, delete the DETL rows and re-insert them, although whether that works will depend on the complexity of your transaction.
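
As a rough illustration of the delete-and-reinsert idea, here is a minimal T-SQL sketch. The temp table #ISSUE_DETL stands in for the real detail table, and the ITEM_CODE and QTY columns, the @new_detl table variable and the sample values are all hypothetical. Only ISSUE_DOC_NO and SEQ_NO come from the description above. THROW assumes SQL Server 2012 or later.

```sql
-- Sketch only: temp table standing in for the real ISSUE_DETL table.
-- ITEM_CODE and QTY are hypothetical columns added for illustration.
IF OBJECT_ID('tempdb..#ISSUE_DETL') IS NOT NULL DROP TABLE #ISSUE_DETL;
CREATE TABLE #ISSUE_DETL (
    ISSUE_DOC_NO int, SEQ_NO int, ITEM_CODE varchar(10), QTY numeric(10,2)
);
INSERT INTO #ISSUE_DETL (ISSUE_DOC_NO, SEQ_NO, ITEM_CODE, QTY)
VALUES (1, 1, 'ABC123', 2.0),
       (1, 2, 'ABC123', 3.0),
       (2, 1, 'XYZ789', 1.0);

-- The document being changed and the full set of detail rows supplied for it.
DECLARE @doc_no int = 1;
DECLARE @new_detl TABLE (SEQ_NO int, ITEM_CODE varchar(10), QTY numeric(10,2));
INSERT INTO @new_detl (SEQ_NO, ITEM_CODE, QTY)
VALUES (1, 'ABC123', 7.0),
       (2, 'ABC123', 4.0),
       (3, 'ABC123', 5.0);

BEGIN TRY
    BEGIN TRANSACTION;

    -- Remove every detail row for the current document...
    DELETE FROM #ISSUE_DETL
    WHERE ISSUE_DOC_NO = @doc_no;

    -- ...and insert the supplied rows again as fresh data.
    INSERT INTO #ISSUE_DETL (ISSUE_DOC_NO, SEQ_NO, ITEM_CODE, QTY)
    SELECT @doc_no, SEQ_NO, ITEM_CODE, QTY
    FROM @new_detl;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo the delete if the re-insert fails, then re-raise the error.
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;

SELECT * FROM #ISSUE_DETL ORDER BY ISSUE_DOC_NO, SEQ_NO;
```

Rows belonging to other documents are left untouched. One trade-off to keep in mind: if the detail rows carry an identity column that other tables reference, re-inserting gives them new ids, so this approach suits tables keyed on (ISSUE_DOC_NO, SEQ_NO) better than the identity-keyed table in the question.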

Licensed under: CC-BY-SA with attribution