Question

1) The following query obtains from each film category the cheapest possible DVD with the highest rating:

SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =
     (SELECT MIN(DVDPrice)
      FROM Films AS FM2
      WHERE FM2.DVDPrice IS NOT NULL
             AND FM1.CategoryId = FM2.CategoryId
             AND FM2.Rating =
                            (SELECT MAX(FM3.Rating)
                             FROM Films AS FM3
                             WHERE FM3.DVDPrice IS NOT NULL
                                  AND FM2.CategoryId = FM3.CategoryId
                             )
      )
ORDER BY FM1.CategoryId;


Query is not perfect, since film “Alien3” in category A could have a very low rating, but if its DVDPrice happens to be the same as the price of a cheapest film with highest rating ( in the same category ), then “Alien3” will also be returned in a query. Does the following query correct this problem:

SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =
     (SELECT MIN(DVDPrice)
      FROM Films AS FM2
      WHERE FM2.DVDPrice IS NOT NULL
             AND FM1.CategoryId = FM2.CategoryId
             AND FM1.Rating =
                            (SELECT MAX(FM3.Rating)
                             FROM Films AS FM3
                             WHERE FM3.DVDPrice IS NOT NULL
                                  AND FM2.CategoryId = FM3.CategoryId
                             )
      )
   AND FM1.Rating=(SELECT MAX(FM2.Rating)
                             FROM Films AS FM2
                             WHERE FM2.DVDPrice IS NOT NULL
                                  AND FM2.CategoryId = FM1.CategoryId
                             )
ORDER BY FM1.CategoryId;


2) I was sure that by changing "FM2.Rating=" to "FM1.Rating=" that the query would also produce the correct results, but it doesn’t. Any idea why it doesn’t work?

SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =
     (SELECT MIN(DVDPrice)
      FROM Films AS FM2
      WHERE FM2.DVDPrice IS NOT NULL
             AND FM1.CategoryId = FM2.CategoryId
             AND FM1.Rating =
                            (SELECT MAX(FM3.Rating)
                             FROM Films AS FM3
                             WHERE FM3.DVDPrice IS NOT NULL
                                  AND FM2.CategoryId = FM3.CategoryId
                             )
      )
ORDER BY FM1.CategoryId;


thanx


EDIT - REPLYING TO MR. Bill Karwin

If we insert into Films table the following rows:

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 1, 'The Dirty Half Dozen', 1987, 'Six men go to war wearing unwashed uniforms. The horror!', 'N', 2, 4, NULL );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 2, 'On Golden Puddle', 1967, 'A couple find love while wading through a puddle', 'Y', 4, 2, 12.99 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 3, 'The Lion, the Witch, and the Chest of Drawers', 1977, 'A fun film for all those interested in zoo/magic/furniture drama', 'N', 1, 3, NULL );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 4, 'Nightmare on Oak Street, Part 23', 1997, 'The murderous Terry stalks Oak Street', 'Y', 2, 3, 9.99 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 5, 'The Wide Brimmed Hat', 2005, 'Fascinating life story of a wide brimmed hat', 'N', 1, 5, NULL );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 6, 'Sense and Insensitivity', 2001, 'She longs for a new life with Mr Arcy, he longs for a small cottage in the Hamptons', 'Y', 3, 6, 15.99 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 7, 'Planet of the Japes', 1967, 'Earth has been destroyed, to be taken over by a species of comedians', 'Y', 5, 4, 12.99 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 8, 'The Maltese Poodle', 1947, 'A mysterious bite mark, a guilty looking poodle. 1st class thriller', 'Y', 1, 1, 2.99 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 2, '15th Late Afternoon', 1989, 'One of Shakespeare''s lesser known plays', 'N', 5, 6, 12.99 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 2, 'Soylent Yellow', 1967, 'Detective Billy Brambles discovers Soylent Yellow is made of Soya Bean. Ewwww!', 'Y', 5, 5, 12.99 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 16, 'First', 1967, '', 'Y', 1, 2, 12.99 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 17, 'Second', 1967, '', 'Y', 7, 2, 100 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 19, 'Third', 1967, '', 'Y', 7, 2, 10 );

INSERT INTO Films ( FilmId, FilmName, YearReleased, PlotSummary, AvailableOnDVD, Rating, CategoryId, DVDPrice ) VALUES ( 20, 'Fourth', 1967, '', 'Y', 1, 2, 10 );

Then my second query ( the one you've said that it works ) returns the following films:
• The Maltese Poodle
• Third Nightmare on Oak Street
• Planet of the Japes
• Soylent Yellow
• 15th Late Afternoon

while yours returns
• The Dirty Half Dozen
• The Lion
• the Witch and the Chest of Drawers
• Nightmare on Oak Street
• The Wide Brimmed Hat
• Planet of the Japes
• The Maltese Poodle
• 15th Late Afternoon
• Soylent Yellow and Third


EDIT - REPLYING TO MR. Russell Steen

SELECT * FROM Films INNER JOIN
(SELECT Min(DVDPrice) as DVDPrice, MaxRating, x1.CategoryId, x1.Category FROM
  (SELECT FilmName, DVDPrice, Rating, MaxRating, Category, Films.CategoryId  FROM Films INNER JOIN 
    (SELECT MAX(Rating) as MaxRating, FM1.CategoryId, C1.Category
      FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
      GROUP BY Category, FM1.CategoryId
    ) x on Films.Rating = x.MaxRating and Films.CategoryId = x.CategoryID
  ) x1 
  WHERE DVDPrice IS NOT NULL 
  GROUP BY CategoryId,Category
) y on Films.Rating = y.MaxRating and Films.CategoryId = y.CategoryId and Films.DVDPrice = y.DVDPrice

I’ve rewrote your query best I could, but I’m not so sure whether it produces the correct results and truth be told, I’ve got completely lost on whether particular inner queries should also select FilmID columns etc.


EDIT - SECOND REPLY TO MR. Bill Karwin

SELECT f.FilmName, f.Rating, f.DVDPrice, f.CategoryId
FROM Films f
LEFT OUTER JOIN Films p ON (f.CategoryId = p.CategoryId
   AND p.AvailableOnDvd = 'Y' AND f.AvailableOnDvd = 'Y' AND f.DVDPrice > p.DVDPrice)
LEFT OUTER JOIN Films r ON (f.CategoryId = r.CategoryId
   AND r.AvailableOnDvd = 'Y' AND f.DVDPrice = r.DVDPrice AND f.Rating < r.Rating)
WHERE  p.CategoryId IS NULL AND r.CategoryId IS NULL
ORDER BY f.CategoryId;
Was it helpful?

Solution

First let's state the problem. We want all the films with the highest rating for each category. Then, of those, we want the lowest price.

First get the highest ratings

SELECT * FROM Films 
INNER JOIN 
(SELECT Max(Rating) as Rating, Category
  FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
  GROUP BY Category
) x on Films.Rating = x.Rating and Films.Category = x.Category

Now, from that, get the cheapest price

  SELECT * FROM Films INNER JOIN
    (SELECT Min(DVDPrice), x.Rating, Category FROM 
      (SELECT * FROM Films INNER JOIN 
        (SELECT MAX(Rating) as Rating, Category
          FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
          GROUP BY Category
        ) x on Films.Rating = x.Rating and Films.Category = x.Category
      )
      WHERE DVDPrice IS NOT NULL
      GROUP BY Category, DVDPrice
    ) y on Films.Rating = y.Rating and Films.Category = y.Category and Films.DVDRating = y.DVDRating

OTHER TIPS

what you want is:
-----------------
for each category, retrieve a film that meets the following 2 conditions:
_condition1:____rating= max rating in that category_
_
condition2:____price= min price in that category for films verifying condition 1_

--> in other terms it's equivalent to order films by Rating Desc then DVDPrice Asc for each category and take the first one.

1 solution is:

SELECT FilmName, Rating, DVDPrice, Category
FROM Films FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.FilmId = (SELECT TOP 1 FilmId
                      FROM Films AS FM2
                     WHERE FM2.CategoryId = FM1.CategoryId
                  ORDER BY Rating DESC, DVDPrice)

OR:

SELECT FM.FilmName, FM.Rating, FM.DVDPrice, C1.Category
  FROM (SELECT FM0.*, ROW_NUMBER() over (ORDER BY Rating DESC, DVDPrice) rank
          FROM Films FM0) FM 
INNER JOIN Category AS C1 ON C1.CategoryId = FM.CategoryId
INNER JOIN (SELECT FM1.CategoryId, MIN(FM1.rank) rank
              FROM (SELECT CategoryId,
                           ROW_NUMBER() over (ORDER BY Rating DESC,DVDPrice) rank
                  FROM Films) AS FM1
        GROUP BY CategoryId) FM2
 ON FM.CategoryId = FM2.CategoryId
AND FM.rank = FM2.rank

with your data, I've done some tests and it seems that the following query is better than the 2 above:

SELECT FM.*, C.Category
FROM (SELECT FM1.CategoryId, MAX(FM1.FilmId) FilmId
        FROM Films FM1
        WHERE NOT EXISTS (SELECT NULL 
                                FROM Films AS FM2
                               WHERE FM2.CategoryId = FM1.CategoryId
                                 AND (FM1.Rating < FM2.Rating 
                                      OR (    FM1.Rating = FM2.Rating 
                                          AND FM1.DVDPrice > FM2.DVDPrice)
                                     )
                          )
      GROUP BY FM1.CategoryId) FF
INNER JOIN Films FM on FM.FilmId = FF.FilmId
                   AND FM.CategoryId = FF.CategoryId
INNER JOIN Category AS C1 ON C1.CategoryId = FM.CategoryId

1) Yes, the second query you give looks better. But I give +1 to @Russell Steen's solution because it avoids the use of correlated subqueries.

This is a variation of the greatest-n-per-group problem that I see frequently on SO. Here's another possible solution:

SELECT f.*
FROM Films f
LEFT OUTER JOIN Films p
 ON (f.CategoryId = p.CategoryId AND f.DVDPrice > p.DVDPrice)
LEFT OUTER JOIN Films r
 ON (f.CategoryId = r.CategoryId AND f.DVDPrice = r.DVDPrice AND f.Rating < r.Rating)
WHERE p.CategoryId IS NULL AND r.CategoryId IS NULL;

The explanation is that we try to find a film "p" in the same category, with a lower price. When we have found none, p.* will be NULL because that's how outer joins work. When there are no dvd's with a lower price, we've found the one with lowest price.

We further try the same trick to find a film "r" with the highest rating. This time we restrict to films in the same category and with the same price (that is, the lowest price) as the film f. Otherwise we'd unintentionally find the film with the highest rating in the category, even if it isn't cheap.

You can also reverse the order of the joins, first finding the highest rating and then finding the lowest price among those with the higest rating. It depends on what you place at a greater priority -- low price or high rating. No matter what solution you use, you have to make a decision about this priority.

2) The other query you tried doesn't work because the condition you use in the subquery doesn't eliminate any of the wrong rows of the FT2 subquery. It's a "Green Eggs and Ham" problem: whether on a train or on a plane, on a boat or on a goat, you've still got green eggs and ham included in the meal.


update: Okay, thanks for the sample data. When you first asked the question, you didn't include the information that some films could be ineligible because they aren't available on DVD and have a NULL in the DVDPrice column. Here's an updated query using my technique that returns the correct films, one per category, excluding films that aren't available on DVD, with the lowest price and highest rating:

SELECT f.FilmName, f.Rating, f.DVDPrice, f.CategoryId
FROM Films f
LEFT OUTER JOIN Films p ON (f.CategoryId = p.CategoryId
  AND p.AvailableOnDvd = 'Y' AND f.DVDPrice > p.DVDPrice)
LEFT OUTER JOIN Films r ON (f.CategoryId = r.CategoryId
  AND r.AvailableOnDvd = 'Y' AND f.DVDPrice = r.DVDPrice AND f.Rating < r.Rating)
WHERE f.AvailableOnDvd = 'Y' AND p.CategoryId IS NULL AND r.CategoryId IS NULL
ORDER BY f.CategoryId;

Output:

+-------------------------+--------+----------+------------+
| FilmName                | Rating | DVDPrice | CategoryId |
+-------------------------+--------+----------+------------+
| The Maltese Poodle      |      1 |     2.99 |          1 |
| Third                   |      7 |    10.00 |          2 |
| Nightmare on Oak Street |      2 |     9.99 |          3 |
| Planet of the Japes     |      5 |    12.99 |          4 |
| Soylent Yellow          |      5 |    12.99 |          5 |
| Sense and Insensitivity |      3 |    15.99 |          6 |
+-------------------------+--------+----------+------------+

This differs from your result in category 6, because Sense and Insensitivity in your sample data is the only film that is available on DVD. 15 Late Afternoon is not available, even though it has a non-null value for DVDPrice. If I change it to AvailableOnDvd='Y' then 15 Late Afternoon is chosen instead of the other film.


Regarding your question about how did I solve this, it's a variation of a common question in SQL, which I have tagged the "greatest-n-per-group" question. You want the query to return every film f such that no film exists with a lower DVDPrice in the same category. I solve with an outer join to p and if no matches are found in p then f must have the lowest price in that category. That's the common solution.

Your added twist in this problem is that you have another attribute to filter on. So given the film (or films in the case of ties) with the lowest price, you want the one with the highest rating. The technique is the same, to use an outer join to r where the category and price is equal, and the rating is higher. When no such films are found with a higher rating, then f must have the highest rating for a given category and price.

I'm going to add a tag to your question greatest-n-per-group so you can follow it and view other SQL questions that are solved with the same technique.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top