SQL Server rand() aggregate
-
21-08-2019 - |
Question
Problem: a table of coordinate lat/lngs. Two rows can potentially have the same coordinate. We want a query that returns a set of rows with unique coordinates (within the returned set). Note that distinct
is not usable because I need to return the id column which is, by definition, distinct. This sort of works (@maxcount
is the number of rows we need, intid
is a unique int id column):
select top (@maxcount) max(intid)
from Documents d
group by d.geoLng, d.geoLat
It will always return the same row for a given coordinate unfortunately, which is bit of a shame for my use. If only we had a rand()
aggregate we could use instead of max()
... Note that you can't use max()
with guids created by newid()
.
Any ideas? (there's some more background here, if you're interested: http://www.itu.dk/~friism/blog/?p=121)
UPDATE: Full solution here
Solution
You might be able to use a CTE for this with the ROW_NUMBER function across lat and long and then use rand() against that. Something like:
WITH cte AS
(
SELECT
intID,
ROW_NUMBER() OVER
(
PARTITION BY geoLat, geoLng
ORDER BY NEWID()
) AS row_num,
COUNT(intID) OVER (PARTITION BY geoLat, geoLng) AS TotalCount
FROM
dbo.Documents
)
SELECT TOP (@maxcount)
intID, RAND(intID)
FROM
cte
WHERE
row_num = 1 + FLOOR(RAND() * TotalCount)
This will always return the first sets of lat and lngs and I haven't been able to make the order random. Maybe someone can continue on with this approach. It will give you a random row within the matching lat and lng combinations though.
If I have more time later I'll try to get around that last obstacle.
OTHER TIPS
this doesn't work for you?
select top (@maxcount) *
from
(
select max(intid) as id from Documents d group by d.geoLng, d.geoLat
) t
order by newid()
Where did you get the idea that DISTINCT only works on one column? Anyway, you could also use a GROUP BY clause.