Question

The basic structure of my query is this:

  • I have a profiles table with profile information
  • I have a locations table with location coordinates
  • I have a location_assignment table which just contains (profile_id, location_id) pairs

Each profile is assigned to one or more locations, and what I'm trying to do is search the profiles, and return them in order of distance to the location coordinates. My query to do so is (with only the relevant parts included) as follows:

SELECT *, 
      (3959*acos(cos(radians(30.292424))*cos(radians(lat))*cos(radians(lng)-  
       radians(-97.73856))+sin(radians(30.292424))*sin(radians(lat)))) AS distance,
      `profiles`.`name` as profilename, 
      `profiles`.`profile_id` as profile_id
 FROM (`profiles`)
 JOIN `location_assignment` 
          ON `profiles`.`profile_id` =`location_assignment`.`profile_id`
 JOIN `locations` 
          ON `location_assignment`.`location_id` = `locations`.`location_id`
HAVING `distance` < 50
ORDER BY `distance`
LIMIT 3

(That grosstastic thing in the select line converts the lat/lng fields in the locations table into a distance from a given input lat/lng)

However, my query makes profiles appear multiple times in the results, once for each location the profile is assigned to. I would like each profile to appear only once, with the information for the location with the shortest distance.

My knee-jerk reaction is to use group_by location_id, but I want to make sure I get the location with the minimum distance to the input coordinates.

Solution

Go Longhorns!

Let's start by finding the right row in the locations table.

SELECT location_id
  FROM locations
 ORDER BY your_spherical_cosine_law_distance_formula
 LIMIT 1

That gets you the unique location id.

Now you want to use that as a subquery to get the appropriate profiles rows. That you do like this:

 SELECT whatever
   FROM (
        SELECT location_id
          FROM locations
         ORDER BY your_spherical_cosine_law_distance_formula
         LIMIT 1
        ) AS one
   JOIN location_assignment AS la ON one.location_id = la.location_id
   JOIN profiles AS p ON p.profile_id = la.profile_id

That should give you the appropriate list of profiles rows, without duplications.

You didn't ask about this, but I hope you don't have too many locations rows. The query you're using will necessarily scan through the whole table and do a lot of math for each row. Your HAVING clause really doesn't help, because it filters only after the distance has already been computed for every row. To make this faster you need to combine the distance search with a bounding-rectangle search, so that an index can rule out most rows before any trigonometry happens. This write-up might help: http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/
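
As a rough sketch of the bounding-rectangle idea, using the coordinates and the 50-mile radius from the question: the WHERE clause keeps only rows inside a rectangle around the search point, and the distance formula then runs on just those rows. The 69.0 figure is an approximate miles-per-degree-of-latitude constant, and the query assumes an index on lat (or on lat and lng together) exists; neither is stated in the question. The rectangle is an approximation that ignores the poles and the 180th meridian.

```sql
-- Sketch: locations within 50 miles of (30.292424, -97.73856).
-- Assumption: 69.0 is roughly the number of miles per degree of latitude.
SELECT location_id,
       3959 * ACOS(COS(RADIANS(30.292424)) * COS(RADIANS(lat))
                   * COS(RADIANS(lng) - RADIANS(-97.73856))
                   + SIN(RADIANS(30.292424)) * SIN(RADIANS(lat))) AS distance
  FROM locations
       -- Cheap rectangle filter first; an index on lat can serve this.
 WHERE lat BETWEEN 30.292424 - (50 / 69.0)
               AND 30.292424 + (50 / 69.0)
       -- A degree of longitude shrinks with cos(latitude), so widen accordingly.
   AND lng BETWEEN -97.73856 - (50 / (69.0 * COS(RADIANS(30.292424))))
               AND -97.73856 + (50 / (69.0 * COS(RADIANS(30.292424))))
       -- Exact circular cutoff, applied only to rows inside the rectangle.
HAVING distance < 50
 ORDER BY distance
 LIMIT 3;
```

The HAVING clause is still there, but now it only trims the corners of the rectangle instead of being the sole filter.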

OTHER TIPS

I think you should wrap the distance calculation in MIN() to get the distance to the closest location for each profile. Also add a GROUP BY on the profile columns, so that each profile produces one row.

(I know MySQL allows returning columns that are not in the GROUP BY, but this is not something I would recommend, so I removed * from your SELECT.)

SELECT MIN(3959*acos(cos(radians(30.292424))*cos(radians(lat))*cos(radians(lng)-  
       radians(-97.73856))+sin(radians(30.292424))*sin(radians(lat)))) AS distance,
      `profiles`.`name` as profilename, 
      `profiles`.`profile_id` as profile_id
 FROM (`profiles`)
 JOIN `location_assignment` 
          ON `profiles`.`profile_id` =`location_assignment`.`profile_id`
 JOIN `locations` 
          ON `location_assignment`.`location_id` = `locations`.`location_id`
GROUP BY `profiles`.`name`, `profiles`.`profile_id`
HAVING `distance` < 50
ORDER BY `distance`
LIMIT 3
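
The MIN() query returns the shortest distance for each profile, but not which location that distance belongs to. If the nearest location's columns are needed as well, one possible sketch is to rank each profile's locations by distance and keep only the first. This assumes MySQL 8.0 or later (or MariaDB 10.2 or later) for ROW_NUMBER(); the aliases d, ranked and rn are made up for illustration.

```sql
-- Sketch: one row per profile, carrying its nearest location.
-- Assumption: window functions are available (MySQL 8.0+ / MariaDB 10.2+).
SELECT ranked.profile_id,
       ranked.profilename,
       ranked.location_id,
       ranked.distance
  FROM (
        -- Number each profile's locations from nearest (1) to farthest.
        SELECT d.*,
               ROW_NUMBER() OVER (PARTITION BY d.profile_id
                                      ORDER BY d.distance) AS rn
          FROM (
                -- Same joins and distance formula as in the question.
                SELECT p.profile_id,
                       p.name AS profilename,
                       l.location_id,
                       3959 * ACOS(COS(RADIANS(30.292424)) * COS(RADIANS(l.lat))
                                   * COS(RADIANS(l.lng) - RADIANS(-97.73856))
                                   + SIN(RADIANS(30.292424)) * SIN(RADIANS(l.lat))) AS distance
                  FROM profiles AS p
                  JOIN location_assignment AS la ON p.profile_id = la.profile_id
                  JOIN locations AS l ON la.location_id = l.location_id
               ) AS d
       ) AS ranked
 WHERE ranked.rn = 1          -- keep only the nearest location per profile
   AND ranked.distance < 50
 ORDER BY ranked.distance
 LIMIT 3;
```

Any other locations columns can be added to the innermost SELECT and they will come from the same nearest row.
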
Licensed under: CC-BY-SA with attribution