I have a query that looks like this:

SELECT 'FY2000' AS FY, COUNT(DISTINCT SGBSTDN_PIDM) AS CHEM_MAJORS
FROM SATURN.SGBSTDN, SATURN.SFRSTCR
WHERE SGBSTDN_PIDM = SFRSTCR_PIDM
  AND SGBSTDN_TERM_CODE_EFF = (SELECT MAX(SGBSTDN_TERM_CODE_EFF)
                               FROM SATURN.SGBSTDN
                               WHERE SGBSTDN_TERM_CODE_EFF <=  '200002'
                                 AND SGBSTDN_PIDM = SFRSTCR_PIDM)
  AND SGBSTDN_MAJR_CODE_1 = 'CHEM'
  AND SFRSTCR_TERM_CODE BETWEEN '199905' AND '200002'
  AND (SFRSTCR_RSTS_CODE LIKE 'R%' OR SFRSTCR_RSTS_CODE LIKE 'W%')
  AND SFRSTCR_CREDIT_HR >= 1

It returns a count of 48, which I believe is correct. However, I don't understand why the subquery can reference SFRSTCR_PIDM without having SATURN.SFRSTCR in its own FROM clause. I thought subqueries were self-contained and couldn't see the rest of the query?

But if I add SATURN.SFRSTCR to the subquery's FROM clause, the count changes to 22. If I instead take the condition AND SGBSTDN_PIDM = SFRSTCR_PIDM out of the subquery, the count also changes to 22. Can someone explain this to me?

Solution

You have a correlated subquery. This is a bit different from a non-correlated subquery, because it can include references to outer tables.

When using correlated subqueries, always qualify every column reference with its table alias. This is a good idea in general, but it matters more for correlated subqueries, where it is easy to lose track of which query a column belongs to.

AND SGBSTDN_TERM_CODE_EFF = (SELECT MAX(SGBSTDN.SGBSTDN_TERM_CODE_EFF)
                             FROM SATURN.SGBSTDN
                             WHERE SGBSTDN.SGBSTDN_TERM_CODE_EFF <=  '200002'
                               AND SGBSTDN.SGBSTDN_PIDM = SFRSTCR.SFRSTCR_PIDM
                            )

For each value of SFRSTCR.SFRSTCR_PIDM in the outer query (that is, for each outer row that passes the other conditions), the subquery returns that student's latest effective term code on or before '200002'.
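To see why dropping the correlation changes the count, here is a minimal sketch using Python's built-in sqlite3 module. The tables `student` and `reg`, their columns, and the three sample students are made up for illustration; they only mimic the shape of SGBSTDN and SFRSTCR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for SGBSTDN (student) and SFRSTCR (reg).
    CREATE TABLE student (pidm INTEGER, term_eff TEXT, major TEXT);
    CREATE TABLE reg (pidm INTEGER, term TEXT);
    -- Student 1: CHEM since 199801, never changed.
    -- Student 2: CHEM since 199905.
    -- Student 3: CHEM in 199801, switched to MATH in 199905.
    INSERT INTO student VALUES
        (1, '199801', 'CHEM'),
        (2, '199905', 'CHEM'),
        (3, '199801', 'CHEM'),
        (3, '199905', 'MATH');
    INSERT INTO reg VALUES (1, '200001'), (2, '200001'), (3, '200001');
""")

# Correlated: the subquery is evaluated per outer row, so MAX() is per student.
correlated = conn.execute("""
    SELECT COUNT(DISTINCT s.pidm)
    FROM student s JOIN reg r ON s.pidm = r.pidm
    WHERE s.major = 'CHEM'
      AND s.term_eff = (SELECT MAX(s2.term_eff)
                        FROM student s2
                        WHERE s2.term_eff <= '200002'
                          AND s2.pidm = r.pidm)  -- r comes from the outer query
""").fetchone()[0]

# Uncorrelated: MAX() is taken over ALL students, so only students whose
# latest record happens to carry that single global term code survive.
uncorrelated = conn.execute("""
    SELECT COUNT(DISTINCT s.pidm)
    FROM student s JOIN reg r ON s.pidm = r.pidm
    WHERE s.major = 'CHEM'
      AND s.term_eff = (SELECT MAX(s2.term_eff)
                        FROM student s2
                        WHERE s2.term_eff <= '200002')
""").fetchone()[0]

print(correlated, uncorrelated)  # 2 1
```

In this toy data, student 1 is lost in the uncorrelated version because their latest record (199801) is older than the global maximum (199905). The drop from 48 to 22 in the question is presumably the same effect on real data.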

In most versions of SQL, correlated subqueries are allowed in the select, where, and having clauses. (They might also be allowed in order by.)

Other tips

A subquery inside the WHERE clause can be correlated, that is, it can reference columns from the outer query. This is different from an inline view (a subquery inside the FROM clause), which generally cannot see columns defined in the parent query.

You are doing it right: the subquery first looks for the SFRSTCR_PIDM column in its own scope (SATURN.SGBSTDN) and, not finding it there, looks for it in the outer query.
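The lookup order also explains what happens when the outer table is added to the subquery's FROM clause: the column is then found in the inner scope, so the subquery no longer refers to the outer row at all. A small sketch with Python's sqlite3 module, using hypothetical tables `stu` and `reg` with prefixed column names like the ones in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; column prefixes make unqualified names unique.
    CREATE TABLE stu (stu_pidm INTEGER, stu_term_eff TEXT, stu_major TEXT);
    CREATE TABLE reg (reg_pidm INTEGER, reg_term TEXT);
    INSERT INTO stu VALUES (1, '199801', 'CHEM'), (2, '199905', 'CHEM');
    INSERT INTO reg VALUES (1, '200001'), (2, '200001');
""")

# Only the subquery's FROM list varies between the two runs.
template = """
    SELECT COUNT(DISTINCT stu_pidm)
    FROM stu, reg
    WHERE stu_pidm = reg_pidm
      AND stu_term_eff = (SELECT MAX(stu_term_eff)
                          FROM {inner_from}
                          WHERE stu_pidm = reg_pidm)
"""

# reg_pidm is not in the inner scope, so it resolves to the OUTER reg row:
# the subquery is correlated and MAX() is computed per student.
outer_ref = conn.execute(template.format(inner_from="stu")).fetchone()[0]

# reg_pidm is now found in the INNER scope, so the subquery is just an
# ordinary join over all students and returns one global MAX().
inner_ref = conn.execute(template.format(inner_from="stu, reg")).fetchone()[0]

print(outer_ref, inner_ref)  # 2 1
```

The second query gives the same result as removing the pidm condition from the subquery entirely, which matches the two changes in the question both producing 22.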

A subquery like the one you have listed takes whatever record is currently being processed at the outer level and makes it AVAILABLE to the subquery; that's why you don't need to include the outer table explicitly in the subquery. If the column name is uniquely identifiable, you don't need the alias.column reference and can get away with just the column name.

However, correlated subqueries can perform poorly, because the subquery is conceptually run for EVERY record being processed. Joins are typically used instead, but each query has its own needs, and here you need a MAX() at the subquery level.
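For comparison, one common join-based alternative computes each student's MAX() once in a grouped derived table and joins to it. The sketch below, again using Python's sqlite3 module with made-up `student` and `reg` tables, only checks that the two forms agree on sample data; whether the join is actually faster depends on the database and its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for SGBSTDN (student) and SFRSTCR (reg).
    CREATE TABLE student (pidm INTEGER, term_eff TEXT, major TEXT);
    CREATE TABLE reg (pidm INTEGER, term TEXT);
    INSERT INTO student VALUES
        (1, '199801', 'CHEM'),
        (2, '199905', 'CHEM'),
        (3, '199801', 'CHEM'),
        (3, '199905', 'MATH');
    INSERT INTO reg VALUES (1, '200001'), (2, '200001'), (3, '200001');
""")

# Correlated form: MAX() is looked up per outer row.
with_subquery = conn.execute("""
    SELECT COUNT(DISTINCT s.pidm)
    FROM student s JOIN reg r ON s.pidm = r.pidm
    WHERE s.major = 'CHEM'
      AND s.term_eff = (SELECT MAX(s2.term_eff)
                        FROM student s2
                        WHERE s2.term_eff <= '200002'
                          AND s2.pidm = r.pidm)
""").fetchone()[0]

# Join form: compute every student's latest term once, then join on it.
with_join = conn.execute("""
    SELECT COUNT(DISTINCT s.pidm)
    FROM student s
    JOIN reg r ON s.pidm = r.pidm
    JOIN (SELECT pidm, MAX(term_eff) AS term_eff
          FROM student
          WHERE term_eff <= '200002'
          GROUP BY pidm) latest
      ON latest.pidm = s.pidm AND latest.term_eff = s.term_eff
    WHERE s.major = 'CHEM'
""").fetchone()[0]

print(with_subquery, with_join)  # 2 2
```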

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow