Question

This is my first time posting on Stack Overflow, but I have already found it to be an invaluable resource.

My question is about using a correlated subquery in the GROUP BY of a SQL statement, which I understand is not possible. I hope that explaining the intent of the query will help. In its simplest form, I am trying to count primary keys and group the counts by the result of a correlated subquery. The field used for correlating is unique per primary key record. Grouping by the correlating field itself does not give the intended results. I need to group by the result of the correlated subquery, but I am unsure how to restructure the query. A sample query is provided below.

Thanks

-John

SELECT DISTINCT 
    (SELECT substring(CommaDelimitedTranslatedList, 0, len(CommaDelimitedTranslatedList)) 
    FROM 
        ( 
            SELECT 
                cd.choiceValue + ',' 
            from SelectedChoiceTable sct 
                join ChoiceDescription cd on 
                    cd.ChoiceID = sct.ChoiceID 
            where 
                sct.ChoiceSet = TargetJoinTable.ChoiceKey  FOR XML PATH('')
        ) D ( CommaDelimitedTranslatedList ))  AS           [TargetJoinTableCommaDelimitedTranslatedList1] ,
    count(BaseTable.BaseKey)
 FROM BaseTable
    join TargetJoinTable on 
        BaseTable.BaseKey = TargetJoinTable.BaseKey
Group By 
    [TargetJoinTableCommaDelimitedTranslatedList1]
Order By 
    [TargetJoinTableCommaDelimitedTranslatedList1] ASC

Solution

Try changing the subquery to a CROSS APPLY (or OUTER APPLY). This moves the computed "column" from the SELECT clause to the FROM clause, where it can be referenced in the GROUP BY.

SELECT
    foo.TargetJoinTableCommaDelimitedTranslatedList1 ,
    count(BaseTable.BaseKey)
 FROM
    BaseTable
    join
    TargetJoinTable on BaseTable.BaseKey = TargetJoinTable.BaseKey
    CROSS APPLY
    (
    SELECT
        substring(CommaDelimitedTranslatedList, 0, len(CommaDelimitedTranslatedList))
               AS TargetJoinTableCommaDelimitedTranslatedList1
    FROM 
        ( 
        SELECT 
            cd.choiceValue + ',' 
        from
            SelectedChoiceTable sct 
            join
            ChoiceDescription cd on cd.ChoiceID = sct.ChoiceID 
        where 
            sct.ChoiceSet = TargetJoinTable.ChoiceKey
        FOR XML PATH('')
        ) D ( CommaDelimitedTranslatedList )
      ) foo
Group By 
    foo.TargetJoinTableCommaDelimitedTranslatedList1
Order By 
    foo.TargetJoinTableCommaDelimitedTranslatedList1 ASC

Alternatively, move the aggregate out to an outer query over a derived table:

SELECT
    COUNT(foo.BaseKey),
    TargetJoinTableCommaDelimitedTranslatedList1
FROM
    (SELECT DISTINCT 
        (SELECT substring(CommaDelimitedTranslatedList, 0, len(CommaDelimitedTranslatedList)) 
        FROM 
            ( 
                SELECT 
                    cd.choiceValue + ',' 
                from SelectedChoiceTable sct 
                    join ChoiceDescription cd on 
                        cd.ChoiceID = sct.ChoiceID 
                where 
                    sct.ChoiceSet = TargetJoinTable.ChoiceKey  FOR XML PATH('')
            ) D ( CommaDelimitedTranslatedList ))  AS           [TargetJoinTableCommaDelimitedTranslatedList1] ,
     BaseTable.BaseKey

     FROM BaseTable
        join TargetJoinTable on 
            BaseTable.BaseKey = TargetJoinTable.BaseKey
    ) foo
Group By 
    [TargetJoinTableCommaDelimitedTranslatedList1]
Order By 
    [TargetJoinTableCommaDelimitedTranslatedList1] ASC

One problem with this is that the CSV TargetJoinTableCommaDelimitedTranslatedList1 may be generated many times: once per BaseTable.BaseKey, if I read this correctly, which will be slow. My feeling is that the CSV generation should come last: you're actually grouping on TargetJoinTable.ChoiceKey, not the CSV it generates.
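
A rough sketch of that "CSV last" idea, untested and assuming the same tables and columns as the queries above. The aliases k and csv and the column names KeyCount and BaseKeyCount are made up for illustration. The inner query counts per ChoiceKey first, so the CSV is built once per distinct ChoiceKey rather than once per joined row. The outer GROUP BY then folds together any ChoiceKey values that happen to produce the same CSV, which may be why grouping on the correlating field alone did not give the intended result.

```sql
SELECT
    csv.TargetJoinTableCommaDelimitedTranslatedList1,
    SUM(k.KeyCount) AS BaseKeyCount          -- hypothetical column name
FROM
    (
    -- Step 1: aggregate on the correlating column only (cheap)
    SELECT
        tjt.ChoiceKey,
        COUNT(bt.BaseKey) AS KeyCount        -- hypothetical column name
    FROM
        BaseTable bt
        JOIN
        TargetJoinTable tjt ON bt.BaseKey = tjt.BaseKey
    GROUP BY
        tjt.ChoiceKey
    ) k
    CROSS APPLY
    (
    -- Step 2: build the CSV once per distinct ChoiceKey
    SELECT
        substring(CommaDelimitedTranslatedList, 0, len(CommaDelimitedTranslatedList))
               AS TargetJoinTableCommaDelimitedTranslatedList1
    FROM
        (
        SELECT
            cd.choiceValue + ','
        FROM
            SelectedChoiceTable sct
            JOIN
            ChoiceDescription cd ON cd.ChoiceID = sct.ChoiceID
        WHERE
            sct.ChoiceSet = k.ChoiceKey
        FOR XML PATH('')
        ) D ( CommaDelimitedTranslatedList )
    ) csv
-- Step 3: merge ChoiceKeys whose CSVs are identical
GROUP BY
    csv.TargetJoinTableCommaDelimitedTranslatedList1
ORDER BY
    csv.TargetJoinTableCommaDelimitedTranslatedList1 ASC
```

Note that without an ORDER BY inside the FOR XML PATH subquery, the order of values within each CSV is not guaranteed, so two ChoiceKey values with the same choices could in principle yield different strings. Adding an ORDER BY there (for example on cd.choiceValue) would make the grouping stable.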

Licensed under: CC-BY-SA with attribution