Question

In PostgreSQL 8.3, let's say I have a table called widgets with the following:

 id  | type | count
--------------------
  1  | A    |   21
  2  | A    |   29
  3  | C    |    4
  4  | B    |    1
  5  | C    |    4
  6  | C    |    3
  7  | B    |   14

I want to remove duplicates based upon the type column, leaving only those with the highest count column value in the table. The final data would look like this:

 id  | type | count
--------------------
  2  | A    |   29
  3  | C    |    4  /* `id` for this record might be '5' depending on your query */
  7  | B    |   14

I feel like I'm close, but I can't seem to wrap my head around a query that works to get rid of the duplicate columns.

Was it helpful?

Solution

count is a sql reserve word so it'll have to be escaped somehow. I can't remember the syntax for doing that in Postgres off the top of my head so I just surrounded it with square braces (change it if that isn't correct). In any case, the following should theoretically work (but I didn't actually test it):

delete from widgets where id not in (
select max(w2.id) from widgets as w2 inner join 
(select max(w1.[count]) as [count], type from widgets as w1 group by w1.type) as sq
on sq.[count]=w2.[count] and sq.type=w2.type group by w2.[count]
);

OTHER TIPS

There is a slightly simpler answer than Asaph's, with EXISTS SQL operator :

DELETE FROM widgets AS a 
WHERE EXISTS 
  (SELECT * FROM widgets AS b 
   WHERE (a.type = b.type AND b.count > a.count) 
          OR (b.id > a.id AND a.type = b.type AND b.count = a.count))

EXISTS operator returns TRUE if the following SQL statement returns at least one record.

According to your requirements, seems to me that this should work:

DELETE 
FROM widgets
WHERE type NOT IN 
(
    SELECT type, MAX(count) 
    FROM widgets 
    GROUP BY type
)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top