Pergunta

I have a data structure, where I have to store pairs of elements. Each pair has exactly 2 values in it, so we are employing a table, with the fields(leftvalue, rightvalue....). These pairs should be unique, and they are considered the same, if the keys are changed.

Example: (Fruit, Apple) is the same as (Apple, Fruit).

If it is possible in an efficient way, I would put a database constraint on the fields, but not at any cost - performance is more important.

We are using MSSQL server 2008 currently, but an update is possible.

Is there an efficient way of achieving this?

Foi útil?

Solução

Two solutions, both really about changing the problem into an easier one. I'd usually prefer the T1 solution if forcing a change on consumers is acceptable:

create table dbo.T1 (
    Lft int not null,
    Rgt int not null,
    constraint CK_T1 CHECK (Lft < Rgt),
    constraint UQ_T1 UNIQUE (Lft,Rgt)
)
go
create table dbo.T2 (
    Lft int not null,
    Rgt int not null
)
go
create view dbo.T2_DRI
with schemabinding
as
    select
        CASE WHEN Lft<Rgt THEN Lft ELSE Rgt END as Lft,
        CASE WHEN Lft<Rgt THEN Rgt ELSE Lft END as Rgt
    from dbo.T2
go
create unique clustered index IX_T2_DRI on dbo.T2_DRI(Lft,Rgt)
go

In both cases, neither T1 nor T2 can contain duplicate values in the Lft,Rgt pairs.

Outras dicas

If you always store the values in order but store the direction in another column,

CREATE TABLE [Pairs]
(
    [A] NVarChar(MAX) NOT NULL,
    [B] NVarChar(MAX) NOT NULL,
    [DirectionAB] Bit NOT NULL,
    CONSTRAINT [PK_Pairs] PRIMARY KEY ([A],[B]) 
)

You can acheive exaclty what you want with one clustered index, and optimize your lookups too.

So when I insert the pair 'Apple', 'Fruit' I'd do,

INSERT [Pairs] VALUES ('Apple', 'Friut', 1);

Nice and easy. Then I insert 'Fruit', 'Apple',

INSERT [Pairs] VALUES ('Apple', 'Fruit', 0); -- 0 becuase order is reversed.

The insert fails because this is a primary key violation. To further illustrate, the pair 'Coconuts', 'Bananas' would be stored as

INSERT [Pairs] VALUES ('Bananas', 'Coconuts', 0);

For additional lookup performance, I'd add the index

CREATE NONCLUSTERED INDEX [IX_Pairs_Reverse] ON [Pairs] ([B], [A]);

If you can't control inserts to the table, it may be necessary to ensure that [A] and [B] are inserted correctly.

CONSTRAINT [CK_Pairs_ALessThanB] CHECK ([A] < [B])

But this may be an unnecessary performance hit, depending on how controlled your inserts are.

One way would be to create a computed column that combines the two values and put a unique constraint upon it:

create table #test (
    a varchar(10) not null, 
    b varchar(10) not null, 
    both as case when a > b then a + ':' + b else b + ':' + a end persisted unique nonclustered
    )

so

insert #test
select 'apple', 'fruit'
insert #test
select 'fruit', 'apple'

Gives

(1 row(s) affected)
Msg 2627, Level 14, State 1, Line 3
Violation of UNIQUE KEY constraint 'UQ__#test_____55252CB631EC6D26'. Cannot insert duplicate key in object 'dbo.#test'.
The statement has been terminated.
Licenciado em: CC-BY-SA com atribuição
Não afiliado a StackOverflow
scroll top