How does a CTE really work?

https://stackoverflow.com/questions/7803640

25-10-2019

Question

I came across this CTE solution for concatenating row elements. I thought it was brilliant, and it made me realize how powerful CTEs can be.

However, to use such a tool effectively I need to know how it works internally, so that I can build the mental model that beginners like me need in order to apply it in different scenarios.

So I tried to run the snippet above in slow motion, and here is the code

USE [NORTHWIND]
GO
/****** Object:  Table [dbo].[Products2]  Script Date: 10/18/2011 08:55:07 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF OBJECT_ID('Products2','U') IS NOT NULL  DROP TABLE [Products2]
CREATE TABLE [dbo].[Products2](
  [ProductID] [int] IDENTITY(1,1) NOT NULL,
  [ProductName] [nvarchar](40) NOT NULL,
  [SupplierID] [int] NULL,
  [CategoryID] [int] NULL,
  [QuantityPerUnit] [nvarchar](20) NULL,
  [UnitPrice] [money] NULL,
  [UnitsInStock] [smallint] NULL,
  [UnitsOnOrder] [smallint] NULL,
  [ReorderLevel] [smallint] NULL,
  [Discontinued] [bit] NOT NULL
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[Products2] ON
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (1, N'vcbcbvcbvc', 1, 4, N'10 boxes x 20 bags', 18.0000, 39, 0, 10, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (2, N'Changassad', 1, 1, N'24 - 12 oz bottles', 19.0000, 17, 40, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (3, N'Aniseed Syrup', 1, 2, N'12 - 550 ml bottles', 10.0000, 13, 70, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (4, N'Chef Anton''s Cajun Seasoning', 2, 2, N'48 - 6 oz jars', 22.0000, 53, 0, 0, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (5, N'Chef Anton''s Gumbo Mix', 10, 2, N'36 boxes', 21.3500, 0, 0, 0, 1)
SET IDENTITY_INSERT [dbo].[Products2] OFF
GO
IF OBJECT_ID('DELAY_EXEC','FN') IS NOT NULL  DROP FUNCTION DELAY_EXEC
GO
CREATE FUNCTION DELAY_EXEC() RETURNS DATETIME
AS
BEGIN
  DECLARE @I INT=0
  WHILE @I<99999
  BEGIN
  SELECT @I+=1
  END
  RETURN GETDATE()
END
GO

WITH CTE (EXEC_TIME, CategoryID, product_list, product_name, length)
     AS (SELECT dbo.DELAY_EXEC(),
                CategoryID,
                CAST('' AS VARCHAR(8000)),
                CAST('' AS VARCHAR(8000)),
                0
         FROM   Northwind..Products2
         GROUP  BY CategoryID
         UNION ALL
         SELECT dbo.DELAY_EXEC(),
                p.CategoryID,
                CAST(product_list + CASE
                                      WHEN length = 0 THEN ''
                                      ELSE ', '
                                    END + ProductName AS VARCHAR(8000)),
                CAST(ProductName AS VARCHAR(8000)),
                length + 1
         FROM   CTE c
                INNER JOIN Northwind..Products2 p
                  ON c.CategoryID = p.CategoryID
         WHERE  p.ProductName > c.product_name)
SELECT *
FROM   CTE
ORDER  BY EXEC_TIME  

--SELECT CategoryId, product_list
--  FROM ( SELECT CategoryId, product_list,
--  RANK() OVER ( PARTITION BY CategoryId ORDER BY length DESC )
--   FROM CTE ) D ( CategoryId, product_list, rank )
--   WHERE rank = 1 ;

The commented-out block produces the desired output for the concatenation problem, but that is not the issue here.

I added an EXEC_TIME column to find out which row was added first. The output does not look right to me for two reasons

I thought there would be redundant data because of the condition p.ProductName > c.product_name. In other words, the empty rows from the first part of the CTE are always less than the values in the Products2 table, so every time it runs it should bring back a new set of rows that were already added. Does that make sense?
The hierarchy of the data is really strange. The last item should be the longest one, and look at what the last item is: an item with length=1?

Any expert to the rescue? Thanks in advance.

Sample results

EXEC_TIME               CategoryID  product_list                                                        product_name                      length
----------------------- ----------- ------------------------------------------------------------------- --------------------------------- -----------
2011-10-18 12:46:14.930 1                                                                                                                 0
2011-10-18 12:46:14.990 2                                                                                                                 0
2011-10-18 12:46:15.050 4                                                                                                                 0
2011-10-18 12:46:15.107 4           vcbcbvcbvc                                                          vcbcbvcbvc                        1
2011-10-18 12:46:15.167 2           Aniseed Syrup                                                       Aniseed Syrup                     1
2011-10-18 12:46:15.223 2           Chef Anton's Cajun Seasoning                                        Chef Anton's Cajun Seasoning      1
2011-10-18 12:46:15.280 2           Chef Anton's Gumbo Mix                                              Chef Anton's Gumbo Mix            1
2011-10-18 12:46:15.340 2           Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mix                Chef Anton's Gumbo Mix            2
2011-10-18 12:46:15.400 2           Aniseed Syrup, Chef Anton's Cajun Seasoning                         Chef Anton's Cajun Seasoning      2
2011-10-18 12:46:15.463 2           Aniseed Syrup, Chef Anton's Gumbo Mix                               Chef Anton's Gumbo Mix            2
2011-10-18 12:46:15.520 2           Aniseed Syrup, Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mi  Chef Anton's Gumbo Mix            3
2011-10-18 12:46:15.580 1           Changassad                                                          Changassad                        1

Solution

The page Recursive Queries Using Common Table Expressions describes the logic of a recursive CTE:

The semantics of the recursive execution are as follows:

1. Split the CTE expression into anchor and recursive members.

2. Run the anchor member(s), creating the first invocation or base result set (T0).

3. Run the recursive member(s) with Ti as input and Ti+1 as output.

4. Repeat step 3 until an empty set is returned.

5. Return the result set. This is a UNION ALL of T0 to Tn.
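The steps above can be sketched in Python against the sample data from the question. This is only an illustration of the logical semantics, not of what SQL Server physically does. The names PRODUCTS, anchor, recursive_member and evaluate are made up for this sketch, and Python's plain string comparison stands in for the SQL collation (it gives the same ordering for these five product names).

```python
# Sample rows from the question: (CategoryID, ProductName)
PRODUCTS = [
    (4, "vcbcbvcbvc"),
    (1, "Changassad"),
    (2, "Aniseed Syrup"),
    (2, "Chef Anton's Cajun Seasoning"),
    (2, "Chef Anton's Gumbo Mix"),
]


def anchor():
    """Step 2: the anchor member yields one empty row per category (T0)."""
    categories = sorted({cat for cat, _ in PRODUCTS})
    return [(cat, "", "", 0) for cat in categories]


def recursive_member(t_i):
    """Step 3: join the rows of Ti to the products to produce Ti+1."""
    t_next = []
    for cat, product_list, product_name, length in t_i:
        for p_cat, p_name in PRODUCTS:
            # Same condition as the CTE: same category, strictly greater name.
            if p_cat == cat and p_name > product_name:
                sep = "" if length == 0 else ", "
                t_next.append((cat, product_list + sep + p_name, p_name, length + 1))
    return t_next


def evaluate():
    """Steps 2 to 4: build T0, T1, ... until an empty set comes back."""
    levels = [anchor()]
    while levels[-1]:
        levels.append(recursive_member(levels[-1]))
    levels.pop()  # discard the empty set that ended the loop
    return levels


if __name__ == "__main__":
    for i, level in enumerate(evaluate()):
        print("T%d: %d rows" % (i, len(level)))
```

Step 5 is then the concatenation of all levels. Because the join requires a strictly greater product name, each row can only extend lists with names that come later, so no row is generated twice and the recursion stops on its own.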

However, this is only the logical flow. As always with SQL, the server is free to reorder operations as it sees fit, provided the result will be "the same" and the reordering is expected to deliver the results more efficiently.

The presence of a function with side effects (one that causes a delay and then returns GETDATE()) is not something that would normally be taken into account when deciding whether to reorder operations.

One obvious way in which the query may be reordered is that the server may start working on result set Ti+1 before it has completely built result set Ti. That can be more efficient than constructing Ti in full first, since the new rows are definitely already in memory and were accessed recently.
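To make the point about reordering concrete, here is a small Python sketch under stated assumptions. It processes pending rows either oldest first (level by level, as in the logical description) or newest first (starting on Ti+1 before Ti is finished). The helper names children and run and the dictionary PRODUCTS are hypothetical, and the sketch says nothing about which strategy a given server actually picks. It only shows that both strategies emit the same rows in a different order.

```python
from collections import deque

# Sample data from the question, grouped by CategoryID.
PRODUCTS = {
    1: ["Changassad"],
    2: ["Aniseed Syrup", "Chef Anton's Cajun Seasoning", "Chef Anton's Gumbo Mix"],
    4: ["vcbcbvcbvc"],
}


def children(row):
    """Recursive member applied to a single row."""
    cat, product_list, product_name, length = row
    sep = ", " if length else ""
    return [(cat, product_list + sep + name, name, length + 1)
            for name in PRODUCTS[cat] if name > product_name]


def run(newest_first):
    """Emit every row, taking the next pending row from either end."""
    pending = deque((cat, "", "", 0) for cat in sorted(PRODUCTS))
    emitted = list(pending)  # anchor rows come out first in both cases
    while pending:
        row = pending.pop() if newest_first else pending.popleft()
        for child in children(row):
            emitted.append(child)
            pending.append(child)
    return emitted


if __name__ == "__main__":
    for label, flag in (("oldest first", False), ("newest first", True)):
        print(label)
        for cat, product_list, _, length in run(flag):
            print("  %d  %d  %s" % (cat, length, product_list))
```

With the newest-first strategy the last row emitted is the length 1 row for category 1, which is the ordering the question found surprising. With the oldest-first strategy the last row is the longest list.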

Other suggestions

This is an interesting question that helped me understand recursive CTEs better too.

If you look at the execution plan you will see that a spool is used and that it has the WITH STACK property set. This means that rows are read in a stack-like manner, last in first out (see http://blogs.msdn.com/b/sqltips/archive/2007/08/30/spool-operators-in-query-plan.aspx).

So first the anchor part runs

EXEC_TIME               CategoryID  product_list  
----------------------- ----------- --------------
2011-10-18 12:46:14.930 1                         
2011-10-18 12:46:14.990 2                         
2011-10-18 12:46:15.050 4

Then the CategoryID = 4 row is processed, as it is the last row added. The JOIN returns 1 row, which is added to the spool, and then this newly added row is processed. In this case the join returns nothing, so nothing further is added to the spool and processing moves on to the CategoryID = 2 row.

This returns 3 rows that are added to the spool

Aniseed Syrup
Chef Anton's Cajun Seasoning
Chef Anton's Gumbo Mix

Each of these rows is then processed in turn in the same LIFO manner: any child rows that get added are dealt with first, before processing can move on to the sibling rows. Hopefully you can see how this recursive logic explains the observed results, but in case you can't, here is a C# simulation

using System;
using System.Collections.Generic;
using System.Linq;

namespace Foo
{
    internal class Bar
    {
        private static void Main(string[] args)
        {
            var spool = new Stack<Tuple<int, string, string>>();

            //Add anchor elements
            AddRowToSpool(spool, new Tuple<int, string, string>(1, "", ""));
            AddRowToSpool(spool, new Tuple<int, string, string>(2, "", ""));
            AddRowToSpool(spool, new Tuple<int, string, string>(4, "", ""));

            while (spool.Count > 0)
            {
                Tuple<int, string, string> lastRowAdded = spool.Pop();
                AddChildRows(lastRowAdded, spool);
            }

            Console.ReadLine();
        }

        private static void AddRowToSpool(Stack<Tuple<int, string, string>> spool,
                                          Tuple<int, string, string> row)
        {
            Console.WriteLine("CategoryId={0}, product_list = {1}",
                              row.Item1,
                              row.Item3);
            spool.Push(row);
        }

        private static void AddChildRows(Tuple<int, string, string> lastRowAdded,
                                         Stack<Tuple<int, string, string>> spool)
        {
            int categoryId = lastRowAdded.Item1;
            string productName = lastRowAdded.Item2;
            string productList = lastRowAdded.Item3;

            string[] products;

            switch (categoryId)
            {
                case 1:
                    products = new[] {"Changassad"};
                    break;
                case 2:
                    products = new[]
                                   {
                                       "Aniseed Syrup",
                                       "Chef Anton's Cajun Seasoning",
                                       "Chef Anton's Gumbo Mix"
                                   };
                    break;
                case 4:
                    products = new[] {"vcbcbvcbvc"};
                    break;
                default:
                    products = new string[] {};
                    break;
            }


            foreach (string product in products.Where(
                product => string.Compare(productName, product) < 0))
            {
                string product_list = string.Format("{0}{1}{2}",
                                                 productList,
                                                 productList == "" ? "" : ",",
                                                 product);

                AddRowToSpool(spool,
                              new Tuple<int, string, string>
                                  (categoryId, product, product_list));
            }
        }
    }
}

This returns

CategoryId=1, product_list =
CategoryId=2, product_list =
CategoryId=4, product_list =
CategoryId=4, product_list = vcbcbvcbvc
CategoryId=2, product_list = Aniseed Syrup
CategoryId=2, product_list = Chef Anton's Cajun Seasoning
CategoryId=2, product_list = Chef Anton's Gumbo Mix
CategoryId=2, product_list = Chef Anton's Cajun Seasoning,Chef Anton's Gumbo Mix
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Cajun Seasoning
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Gumbo Mix
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Cajun Seasoning,Chef Anton's Gumbo Mix
CategoryId=1, product_list = Changassad

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow