Question

I have two tables, a mls_history table, and a mls table. The mls_history table has every record uploaded to the system, and the mls table is a subset of mls_history, only containing the latest status date (stat_date).

For example:

MLS_HISTORY:

mlsnum | stat_date
-------------------
1      | 2013-01-04
2      | 2013-01-08
1      | 2013-04-09
1      |

MLS:

mlsnum | stat_date
-------------------
1      | 2013-04-09
2      | 2013-01-08

I want one insert/select query (if possible?) that grabs the max stat_date for each mlsnum, and stores it in the mls table.

There are a few examples already on stackoverflow, but almost all of them allow for multiple records if stat_dates are the same. I want this to only return 1 record, even if 2 meet the max criteria.

To further complicate things, there are no serial columns, so no unique id, so I can't just say "in case of a tie, pick the one with the max id".

Is this possible?

Was it helpful?

Solution

INSERT INTO mls                -- assuming it's empty
SELECT DISTINCT ON (mlsnum) *
FROM   mls_history
ORDER  BY mlsnum, stat_date DESC;

Select exactly 1 row per mlsnum with the latest stat_date.

Details and explanation:
Select first row in each GROUP BY group?

OTHER TIPS

In this case a mundane group by does the job

select mlsnum, max(stat_date)
from mls_history
group by mlsnum

The (very handy) distinct on used in @Erwin`s answer is Postgresql only and it is a bit harder to understand complicating future maintenance.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top