T-SQL MERGE performance in a typical publishing context

https://stackoverflow.com/questions/7407560

29-10-2019

Question

I have a situation where a "publisher" application essentially keeps a view model up to date by querying a very complex view and then merging the results into a denormalized view model table, using separate insert, update, and delete operations.

Now that we have upgraded to SQL 2008, I figured it would be a great time to replace these with the SQL MERGE statement. However, after writing the query, the subtree cost of the MERGE statement is 1214.54! With the old way, the sum of the insert/update/delete costs was only 0.104!

I can't understand how a more straightforward way of describing the exact same operation could be so much worse. Perhaps you can see the error of my ways where I cannot.

Some statistics on the table: it has 1.9 million rows, and each MERGE operation inserts, updates, or deletes more than 100 of them. In my test case, only 1 row is affected.

-- This table variable has the EXACT same structure as the published table
-- Yes, I've tried a temp table instead of a table variable, and it makes no difference
declare @tSource table
(
    Key1 uniqueidentifier NOT NULL,
    Key2 int NOT NULL,
    Data1 datetime NOT NULL,
    Data2 datetime,
    Data3 varchar(255) NOT NULL, 
    PRIMARY KEY 
    (
        Key1, 
        Key2
    )
)

-- Fill the table variable with the desired current state of the view model,
-- for only those rows affected by @Key1.  I'm not really concerned about the
-- performance of this step; it's already good.  It leaves very few rows in
-- the table variable; in fact, only 1 in my test case
insert into @tSource
select *
from vw_Source_View with (nolock)
where Key1 = @Key1

-- Now it's time to merge @tSource into TargetTable

;MERGE TargetTable as T
USING @tSource S
    on S.Key1 = T.Key1 and S.Key2 = T.Key2

-- Only update if the Data columns do not match
WHEN MATCHED AND (T.Data1 <> S.Data1 OR T.Data2 <> S.Data2 OR T.Data3 <> S.Data3) THEN
    UPDATE SET
        T.Data1 = S.Data1,
        T.Data2 = S.Data2,
        T.Data3 = S.Data3

-- Insert when missing in the target
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Key1, Key2, Data1, Data2, Data3)
    VALUES (Key1, Key2, Data1, Data2, Data3)

-- Delete when missing in the source, being careful not to delete the REST
-- of the table by applying the T.Key1 = @Key1 condition
WHEN NOT MATCHED BY SOURCE AND T.Key1 = @Key1 THEN
    DELETE
;
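
For comparison, the "old way" with separate statements might look roughly like the sketch below. The original statements are not shown in the question, so this is a hypothetical reconstruction. It assumes it runs in the same batch as the declaration of @tSource above and that @Key1 is the same variable used to fill it.

```sql
-- Hypothetical reconstruction of the separate-statement approach;
-- assumes @tSource and @Key1 are already declared and populated in this batch.

-- 1. Update rows present on both sides whose data columns differ
UPDATE T
SET Data1 = S.Data1,
    Data2 = S.Data2,
    Data3 = S.Data3
FROM TargetTable AS T
JOIN @tSource AS S
    ON S.Key1 = T.Key1 AND S.Key2 = T.Key2
WHERE T.Data1 <> S.Data1
   OR T.Data2 <> S.Data2   -- <> is never true when either side is NULL
   OR T.Data3 <> S.Data3;

-- 2. Insert rows that are missing from the target
INSERT INTO TargetTable (Key1, Key2, Data1, Data2, Data3)
SELECT S.Key1, S.Key2, S.Data1, S.Data2, S.Data3
FROM @tSource AS S
WHERE NOT EXISTS (
    SELECT 1
    FROM TargetTable AS T
    WHERE T.Key1 = S.Key1 AND T.Key2 = S.Key2
);

-- 3. Delete rows for this key that no longer exist in the source
DELETE T
FROM TargetTable AS T
WHERE T.Key1 = @Key1
  AND NOT EXISTS (
    SELECT 1
    FROM @tSource AS S
    WHERE S.Key1 = T.Key1 AND S.Key2 = T.Key2
);
```

Each of these statements filters the target on the key columns, which is presumably why their combined estimated cost stays so small.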

So, how does this get to a subtree cost of 1200? The data access from the tables themselves seems to be quite efficient. In fact, 87% of the cost of the MERGE seems to come from a Sort operation near the end of the chain:

MERGE (0%) <- Index Update (12%) <- Sort (87%) <- (...)

And that sort has 0 rows feeding into it and coming out of it. Why does it take 87% of the resources to sort 0 rows?

UPDATE

I have posted the actual (not estimated) execution plan for just the MERGE operation in a Gist.
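
Note that the cost percentages in a plan are optimizer estimates, even when the plan is an actual one. As a minimal sketch of how the two approaches could be compared on measured work instead, the session settings below report real I/O and timing and return the actual plan as XML. The placeholder comment stands for whichever statements are being compared.

```sql
-- Report logical/physical reads and CPU/elapsed time for each statement
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Return the actual execution plan as XML alongside the results
SET STATISTICS XML ON;

-- Run the MERGE (or the separate insert/update/delete statements) here

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the logical reads and elapsed times of both versions would show whether the 1214.54 estimate reflects real work or only a poor estimate for the Sort operator.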

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow