SQL Server: complex operation unique code for groups of rows

https://stackoverflow.com/questions/9990563

28-05-2021

Question

I have a table where most of the columns are filled with data. Some rows contain duplicate data in four of the columns. I want to do something like this:

Check all rows, group them by these four columns (one group = one distinct combination of values in the four columns), and assign a UniqueCode to each group in the last column. The UniqueCode is per group, so every row in a group gets the same code.

So, when I have something like this:

Name Street HouseNumber PostCode  UniqueCode
Cos  Cos    Cos         Cos
Cos  Cos    Cos         Cos

I want both rows to get the same UniqueCode. I want to write a query that clears all current UniqueCode values and fills in new ones, and a trigger (is that the right term?) that does the same for newly added rows.

Is it possible to implement this behaviour in SQL?

Or do I need to do it in my program's code?

Can you help me?

Sorry for my bad English.

Solution

Assuming you have a table similar to this one:

create table Persons
(
    ID int identity primary key,
    Name varchar(100),
    Street varchar(100),
    HouseNumber int,
    PostCode  varchar(100),
    UniqueCode uniqueidentifier
)

You would create a trigger that looks for existing rows in Persons with the same composite key and reuses their UniqueCode, or generates a new one if there are none:

create trigger AssignPersonGroup on Persons
after insert, update
as
    set nocount on

    -- For each inserted or updated row, reuse the code of an existing row
    -- with the same (Name, Street, HouseNumber, PostCode); otherwise
    -- generate a new one.
    update Persons
    set UniqueCode =
    isnull(
       (select top 1 existing.UniqueCode
        from Persons existing
        where existing.UniqueCode is not null
          and existing.Name = Inserted.Name
          and existing.Street = Inserted.Street
          and existing.HouseNumber = Inserted.HouseNumber
          and existing.PostCode = Inserted.PostCode)
      , newid())
    from Inserted inner join Persons
      on Inserted.ID = Persons.ID

Given this data:

insert into persons (Name, Street, HouseNumber, PostCode)
values ('Zagor', 'Darkwood', 23, '01010')
insert into persons (Name, Street, HouseNumber, PostCode)
values ('Zagor', 'Darkwood', 23, '01010')
insert into persons (Name, Street, HouseNumber, PostCode)
values ('Chico', 'Darkwood', 23, '01010')

Then

select * 
from Persons

Would deliver:

ID  Name   Street    HouseNumber  PostCode  UniqueCode
1   Zagor  Darkwood  23           01010     A113D12D-F730-42DD-B3EE-AC33E34C0679
2   Zagor  Darkwood  23           01010     A113D12D-F730-42DD-B3EE-AC33E34C0679
3   Chico  Darkwood  23           01010     FD0739AF-525C-42C2-B929-0AB8EEAC3A73

You can update your existing data by executing the following two updates:

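-- Step 1: give every row its own code.
update Persons
   set UniqueCode = newid()

-- Step 2: within each group, copy the code of the row with the lowest ID.
-- Rows with NULL in any of the four columns match no row here (not even
-- themselves), so the subquery yields NULL for them.
update Persons
   set UniqueCode =
   (
      select top 1 groups.UniqueCode
      from Persons groups
      where groups.UniqueCode is not null
        and groups.Name = Persons.Name
        and groups.Street = Persons.Street
        and groups.HouseNumber = Persons.HouseNumber
        and groups.PostCode = Persons.PostCode
      order by groups.ID
   )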
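
To check the result of the two updates, a query along these lines could list any group that did not end up with exactly one code. This is only a sketch against the Persons table defined above; the column aliases RowsInGroup and CodesInGroup are made up for illustration.

```sql
-- Lists groups whose rows carry no code at all or more than one distinct code.
-- An empty result suggests every group shares a single UniqueCode.
select Name, Street, HouseNumber, PostCode,
       count(*) as RowsInGroup,
       count(distinct UniqueCode) as CodesInGroup
from Persons
group by Name, Street, HouseNumber, PostCode
having count(distinct UniqueCode) <> 1;
```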

The two updates are separate because SQL Server executes joins before scalar expressions. My first attempt was a single update that joined Persons to a derived table grouping on all four columns and adding a newid() column, but because the inner join to Persons was executed first, newid() was evaluated for each joined row and the codes came out unique per row again.
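
One possible way around the derived-table problem, not part of the original answer, is to store the per-group codes in a temporary table first, so that each newid() value is fixed before the join runs. The sketch below assumes the Persons table defined above; the names #PersonGroups and GroupCode are hypothetical, and it has not been run against your data.

```sql
-- One row per distinct combination. SELECT ... INTO stores the result,
-- so each group's code is already fixed when the join below runs.
select Name, Street, HouseNumber, PostCode, newid() as GroupCode
into #PersonGroups
from Persons
group by Name, Street, HouseNumber, PostCode;

-- Copy the stored code to every row of the matching group.
-- Rows with NULL in any of the four columns are not matched by "="
-- and keep whatever UniqueCode they had.
update p
set UniqueCode = g.GroupCode
from Persons p
inner join #PersonGroups g
   on g.Name = p.Name
  and g.Street = p.Street
  and g.HouseNumber = p.HouseNumber
  and g.PostCode = p.PostCode;

drop table #PersonGroups;
```

This trades the two-pass update for an extra temporary table. Either approach should leave one code per group.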

Disable the trigger before running these updates and re-enable it afterwards. UniqueCode should not be set by the program, only by the trigger. You will also need a composite index on (Name, Street, HouseNumber, PostCode) and another one on UniqueCode.
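
As a rough sketch, the indexes and the trigger handling around the two updates could look like this. The index names IX_Persons_Group and IX_Persons_UniqueCode are hypothetical; the trigger and table names are the ones used above.

```sql
-- Indexes that support the group lookup and the lookup by code.
create index IX_Persons_Group
    on Persons (Name, Street, HouseNumber, PostCode);
create index IX_Persons_UniqueCode
    on Persons (UniqueCode);

-- Keep the trigger from firing while existing rows are recoded.
disable trigger AssignPersonGroup on Persons;

-- ... run the two updates for existing data here ...

enable trigger AssignPersonGroup on Persons;
```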

That aside, I have seen your other question concerning this project. What are you trying to do? I'm unsure about the nature of the relationship between the primary and high school tables. If you want to connect them through a particular person, you should have a person table with an identity primary key. Both tables would reference it, and you would have no problem identifying someone's academic path :-)
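
To illustrate that suggestion, here is a minimal sketch that reuses the Persons table from above as the person table. I don't know your actual schema, so the tables PrimarySchoolRecords and HighSchoolRecords and their columns are purely hypothetical.

```sql
-- Hypothetical tables; only Persons (ID) comes from the answer above.
create table PrimarySchoolRecords
(
    ID int identity primary key,
    PersonID int not null references Persons (ID),
    SchoolName varchar(100)
);

create table HighSchoolRecords
(
    ID int identity primary key,
    PersonID int not null references Persons (ID),
    SchoolName varchar(100)
);

-- A person's academic path, found through the shared PersonID.
select p.Name,
       ps.SchoolName as PrimarySchool,
       hs.SchoolName as HighSchool
from Persons p
left join PrimarySchoolRecords ps on ps.PersonID = p.ID
left join HighSchoolRecords hs on hs.PersonID = p.ID;
```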

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow