Question

Alright, this one (3a; sample problem with provided answer) has got me scratching my head:

bbc(name, region, area, population, gdp)
3a. Find the largest country in each region:

SELECT region, name, population
  FROM bbc x
 WHERE population >= ALL
    (SELECT population
       FROM bbc y
      WHERE y.region = x.region
        AND population > 0)

I understand 'WHERE y.region = x.region' if I think of the db engine looping over the table entries and matching each x.region against the current y.region in the nested SELECT. But what does 'AND population > 0' do? The query doesn't give the right answer without it, and I don't see why.


Solution

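That clause is there only because one row in the Europe region (the Vatican) has NULL in the population column. Comparing anything with NULL yields NULL rather than TRUE, so while that row is in the subquery the >= ALL test can never be true for any European country. 'population > 0' happens to drop the NULL row, because NULL > 0 is not true. The following also works, and I believe it is easier to understand: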

SELECT region, name, population
  FROM bbc x
 WHERE population >= ALL
    (SELECT population
       FROM bbc y
      WHERE y.region = x.region
        AND population IS NOT NULL)

In the MySQL documentation for ALL subqueries, there's a helpful comment (emphasis theirs):

In general, tables containing NULL values and empty tables are "edge cases." When writing subquery code, always consider whether you have taken those two possibilities into account.
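
To make both edge cases concrete, here is a minimal sketch in Python that emulates SQL's three-valued logic for >= ALL. The helpers ge_all and largest are hypothetical names, and the population figures are made-up illustrative values, not the real bbc data.

```python
def ge_all(value, candidates):
    """Emulate SQL `value >= ALL (candidates)`.

    Returns True, False, or None (SQL UNKNOWN).
    """
    unknown = False
    for c in candidates:
        if value is None or c is None:
            unknown = True   # any comparison with NULL is UNKNOWN
        elif value < c:
            return False     # one definite failure makes ALL false
    return None if unknown else True


# Illustrative rows: (region, name, population). Figures are invented.
rows = [
    ("Europe", "Germany", 82_000_000),
    ("Europe", "France", 60_000_000),
    ("Europe", "Vatican", None),
]


def largest(rows, keep):
    """Mimic the correlated subquery; `keep` plays the role of the extra AND."""
    result = []
    for region, name, pop in rows:
        subquery = [p for r, _, p in rows if r == region and keep(p)]
        # WHERE only keeps rows whose condition is TRUE, not UNKNOWN.
        if ge_all(pop, subquery) is True:
            result.append(name)
    return result


no_filter = largest(rows, lambda p: True)          # NULL stays in the subquery
not_null = largest(rows, lambda p: p is not None)  # AND population IS NOT NULL
empty_case = ge_all(5, [])                         # ALL over an empty set
```

Without the filter, no European row qualifies, because the Vatican's NULL turns every otherwise-successful comparison into UNKNOWN. With the filter, only the row with the largest population passes. The last line shows the other edge case from the quote: ALL over an empty subquery is TRUE, so if every population in a region were NULL, the filtered query would return all rows of that region.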

OTHER TIPS

I'm speculating here.

What if population is NULL for every record in a specific region?

EDIT: It can also be seen as a safety net that ignores zero or negative population values (not a real-life scenario).
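
One way to check what 'population > 0' actually filters is to run it against a scratch table. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the NULL comparison behaviour shown is the same in both), with a hypothetical table named demo and made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [
        ("has_null", None),
        ("has_zero", 0),
        ("has_negative", -5),
        ("has_positive", 10),
    ],
)

# NULL > 0 evaluates to NULL (None in Python), neither true nor false.
comparisons = conn.execute("SELECT NULL > 0, -5 > 0, 10 > 0").fetchone()

# WHERE keeps only rows whose condition is true, so the NULL, zero,
# and negative rows are all dropped.
kept = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM demo WHERE population > 0 ORDER BY name"
    )
]
conn.close()
```

So the clause removes NULLs as a side effect of also removing zero and negative values, which is why 'IS NOT NULL' states the intent more directly.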

Licensed under: CC-BY-SA with attribution