Extracting XML data in SQL - too many cross apply statements

https://stackoverflow.com/questions/12607943

04-07-2021
|

Question

I have an xml document containing details from a Statement:

<Statement>
<Id />

<Invoices>
    <Invoice>
        <Id />
        <Date />
        <AmountDue />
        etc.
    </Invoice>

    <Invoice>
        <Id />
        <Date />
        <AmountDue />
        etc.
    </Invoice>

    <Invoice>
        <Id />
        <Date />
        <AmountDue />
        etc.
    </Invoice>
</Invoices>

</Statement>

This works fine for the Statement specific details:

SET @statementId = @xml.value('(Id)[1]', 'UNIQUEIDENTIFIER');

but it requires a singleton, and only returns the first value. I need ALL of the values for the invoices, not just the first so a singleton won't work.

I am able to get the information out using cross apply statements like this:

SELECT 
@statementId AS STATEMENT_ID
Id.value('.', 'uniqueidentifier') AS INVOICE_ID
Date.value('.', 'smalldatetime') AS INVOICE_DATE
Due.value('.', 'decimal') AS INVOICE_AMOUNT_DUE

FROM @xml.nodes('Statement') A(S)
cross apply S.nodes('Invoices/Invoice') B(InvoiceD)
cross apply InvoiceD.nodes('Id') C(Id)
cross apply InvoiceD.nodes('Date') D(Date)
cross apply InvoiceD.nodes('AmountDue') E(Due)

This returns an Id, date, and amount from each Invoice in the Statement - perfect.

My problem comes when I try to extract all of the invoice details. I currently have seven cross apply statements and I got the following message:

"The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information."

What I want to do is have one cross apply for the Invoice and narrow down the exact field in the select statement, but unless I use '.' I must make the statement return a singleton and I don't get all of the data that I need.

I have done some research about specifying a namespace within the select statement, but all of the examples set the namespace to be an http address instead of a node in an xml document and I haven't gotten anything to return yet using this approach.

The result I'm looking for is something like this, but with more Invoice Details:

STATEMENT_ID      INVOICE_ID      INVOICE_DATE      INVOICE_AMOUNT_DUE     ...
Statement-1-Id    Invoice-1-Id    Invoice-1-Date    Invoice-1-AmountDue    ...
Statement-1-Id    Invoice-2-Id    Invoice-2-Date    Invoice-2-AmountDue    ...
Statement-1-Id    Invoice-3-Id    Invoice-3-Date    Invoice-3-AmountDue    ...

Where should I go from here?

EDIT: I removed some unnecessary information. Getting all of the invoice-specific details is my goal here.

Solution

select @XML.value('(Statement/Id/text())[1]', 'uniqueidentifier') as StatementId, 
       T.N.value('(Id/text())[1]', 'uniqueidentifier') as InvoiceId,
       T.N.value('(Date/text())[1]', 'smalldatetime') as InvoiceDate,
       T.N.value('(AmountDue/text())[1]', 'decimal') as AmountDue
from @XML.nodes('/Statement/Invoices/Invoice') as T(N)

.nodes will shred your XML to rows so that each row T.N is pointing to an Invoice node of its own. On that node there is only a single Id node so fetching the value specifying a singleton Id[1] works.

You can use Id[1] or (Id/text())[1] but the latter will give you a more efficient execution plan.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow