Question

I'm trying to complete a modification of this Google tutorial.

I've written this SQL to query a table of locations using the location "name." Given the name of a location, the query returns nearby pizza restaurants. To accomplish this, I've cross joined my table of restaurant locations, titled "markers," to itself and calculated distances using the Haversine formula.

    SELECT m.address,
           m.name,
           m.lat,
           m.lng,
           (3959 * ACOS(COS(RADIANS(poi.lat)) *
           COS(RADIANS(m.lat)) *
           COS(RADIANS(m.lng) - RADIANS(poi.lng)) + SIN(RADIANS(poi.lat)) *
           SIN(RADIANS(m.lat)))) AS distance
    FROM markers poi
    CROSS JOIN markers m
    WHERE poi.address LIKE "%myrtle beach%"
      AND poi.id <> m.id
    HAVING distance < 200
    ORDER BY distance
    LIMIT 0,20

The query returns the expected results, but if the point of interest is outside the specified area, in this case "myrtle beach," I get duplicate records per match. This is because of the CROSS JOIN, and it would be easy to fix with a SELECT DISTINCT. But the "lng" and "lat" fields are FLOAT types, so the distance calculations are never identical, even for duplicated records.
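A rough way to see where the two rows per match come from is to replay the join outside the database. The Python sketch below uses three rows taken from the results listed in this question, with made-up ids, and the same distance expression as the query. It is only an illustration of the join logic, not the MySQL query itself.

```python
import math

EARTH_RADIUS_MILES = 3959  # same constant as the SQL query


def distance_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance using the same expression as the SQL query."""
    cos_angle = (
        math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
        * math.cos(math.radians(lng2) - math.radians(lng1))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return EARTH_RADIUS_MILES * math.acos(max(-1.0, min(1.0, cos_angle)))


# (id, name, address, lat, lng); the ids are made up for this sketch.
markers = [
    (1, "East of Chicago Pizza Company",
     "3901 North Kings Highway Suite 1, Myrtle Beach, SC", 33.716099, -78.855583),
    (2, "Domino's Pizza: Myrtle Beach",
     "1706 S Kings Hwy # A, Myrtle Beach, SC", 33.674881, -78.905144),
    (3, "Andolinis Pizza",
     "82 Wentworth St, Charleston, SC", 32.782330, -79.934235),
]

# WHERE poi.address LIKE "%myrtle beach%" matches two rows here, not one.
pois = [row for row in markers if "myrtle beach" in row[2].lower()]

# CROSS JOIN markers m ... AND poi.id <> m.id
rows = [
    (m[1], distance_miles(poi[3], poi[4], m[3], m[4]))
    for poi in pois
    for m in markers
    if poi[0] != m[0]
]

for name, dist in sorted(rows, key=lambda r: r[1]):
    print(f"{name}: {dist:.2f}")

# The Charleston row appears once per matching point of interest.
andolinis = sorted(dist for name, dist in rows if name == "Andolinis Pizza")
```

In this sketch the Charleston restaurant shows up twice, once per Myrtle Beach row, and the two distances differ because they are measured from different starting points. That difference would defeat a DISTINCT regardless of the column type.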

Here is a subset of the returns:

3901 North Kings Highway Suite 1, Myrtle Beach, SC | East of Chicago Pizza Company | 33.716099 | -78.855583 | 4.0285562196955125

1706 S Kings Hwy # A, Myrtle Beach, SC | Domino's Pizza: Myrtle Beach | 33.674881 | -78.905144 | 4.0285562196955125

82 Wentworth St, Charleston, SC | Andolinis Pizza | 32.782330 | -79.934235 | 85.68177495224947

82 Wentworth St, Charleston, SC | Andolinis Pizza | 32.782330 | -79.934235 | 89.71000040441085

114 Jungle Rd, Edisto Island, SC | Bucks Pizza of Edisto Beach Inc | 32.503971 | -80.297951 | 114.22243529200529

114 Jungle Rd, Edisto Island, SC | Bucks Pizza of Edisto Beach Inc | 32.503971 | -80.297951 | 118.2509427998286

Any suggestions on where to go from here?

No correct solution

OTHER TIPS

Try:

select distinct x.address, x.name, y.lat, y.lng, x.distance
  from (SELECT m.address,
               m.name,
               m.lat,
               m.lng,
               (3959 *
               ACOS(COS(RADIANS(poi.lat)) * COS(RADIANS(m.lat)) *
                     COS(RADIANS(m.lng) - RADIANS(poi.lng)) +
                     SIN(RADIANS(poi.lat)) * SIN(RADIANS(m.lat)))) AS distance
          FROM markers poi
         cross JOIN markers m
         WHERE poi.address LIKE "%myrtle beach%"
           and poi.id <> m.id HAVING distance < 200) x
  join markers y
    on x.address = y.address
   and x.name = y.name
   and x.lat = y.lat
   and x.lng = y.lng
 order by x.distance limit 0, 20

You are getting duplicate results because the two points are both matching "myrtle beach". Use a condition like poi.id < m.id to ensure you only get one match.

Example:

poi id    m id    distance
1         2       100
2         1       100

Query:

SELECT 
    m.address,
    m.name,
    m.lat,
    m.lng,
    (3959 * ACOS(COS(RADIANS(poi.lat)) * 
    COS(RADIANS(m.lat)) * 
    COS(RADIANS(m.lng) - RADIANS(poi.lng)) + SIN(RADIANS(poi.lat))*
    SIN(RADIANS(m.lat)))) AS distance
FROM markers poi
CROSS JOIN markers m
WHERE 
    (poi.address LIKE "%myrtle beach%" OR m.address LIKE "%myrtle beach%")
    AND poi.id < m.id 
HAVING distance < 200
ORDER BY distance LIMIT 0,20
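As a rough check of what the poi.id < m.id condition changes, the Python sketch below enumerates the id pairs each WHERE clause would keep. The ids are hypothetical: 1 and 2 stand for rows whose address matches "myrtle beach", and 3 stands for a row elsewhere.

```python
# Hypothetical ids: 1 and 2 match "myrtle beach", 3 is a row in another city.
poi_ids = [1, 2]
all_ids = [1, 2, 3]

# Original filter: poi matches the address AND poi.id <> m.id
not_equal = [(p, m) for p in poi_ids for m in all_ids if p != m]

# Filter above: (poi OR m matches the address) AND poi.id < m.id
less_than = [
    (p, m)
    for p in all_ids
    for m in all_ids
    if (p in poi_ids or m in poi_ids) and p < m
]

print(not_equal)  # [(1, 2), (1, 3), (2, 1), (2, 3)]
print(less_than)  # [(1, 2), (1, 3), (2, 3)]
```

Under these assumptions the mirrored pair (2, 1) disappears, but id 3 is still paired with both matching rows, so it would still be returned twice whenever more than one row matches the address.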

Or, if you truly do have a single row in markers as the point of interest, specify that row instead of matching any address. Your condition of poi.id <> m.id will then ensure there are no duplicates.

SELECT 
    m.address,
    m.name,
    m.lat,
    m.lng,
    (3959 * ACOS(COS(RADIANS(poi.lat)) * 
    COS(RADIANS(m.lat)) * 
    COS(RADIANS(m.lng) - RADIANS(poi.lng)) + SIN(RADIANS(poi.lat))*
    SIN(RADIANS(m.lat)))) AS distance
FROM markers poi
CROSS JOIN markers m
WHERE 
    poi.id = (SELECT id FROM markers WHERE address LIKE "%myrtle beach%" LIMIT 1)
    AND poi.id <> m.id 
HAVING distance < 200
ORDER BY distance LIMIT 0,20

Reviewing everyone's responses got me thinking. Instead of asking why I was getting duplicate results, I started wondering which of the two Myrtle Beach locations the query was calculating distances from. The answer was BOTH, which explains why I was getting two records per match in the first place.

Here's my solution:

SELECT  m.address, m.name, m.lat, m.lng, (3959 
   * ACOS(COS(RADIANS(poi.lat)) * COS(RADIANS(m.lat)) 
   * COS(RADIANS(m.lng) - RADIANS(poi.lng)) + SIN(RADIANS(poi.lat))
   * SIN(RADIANS(m.lat))))     AS distance
FROM markers m
cross JOIN (
   select  name, lat, lng from markers
   where address like '%myrtle beach%'
   limit 1
) poi
HAVING distance < 200
ORDER BY name
LIMIT 0, 20

This doesn't give me the most accurate distance calculations, as it arbitrarily uses the first restaurant it finds as the center point. But for my immediate purposes, this is good enough. I think for this app to be production ready, I would need a second table for cities which would contain the coordinates of each city center.
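A minimal sketch of that second-table idea follows, run against an in-memory SQLite database from Python so it can be tried without a MySQL server. The `cities` table, its columns, and the city-center coordinates are assumptions made for illustration (the coordinates are approximate). SQLite's dialect differs from MySQL, so the filter on the computed distance is applied in an outer query rather than with HAVING.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite builds do not always ship math functions, so register Python versions.
conn.create_function("RADIANS", 1, math.radians)
conn.create_function("COS", 1, math.cos)
conn.create_function("SIN", 1, math.sin)
# Clamp the argument so rounding error cannot leave the domain of acos.
conn.create_function("ACOS", 1, lambda x: math.acos(max(-1.0, min(1.0, x))))

conn.executescript("""
    CREATE TABLE markers (id INTEGER PRIMARY KEY, name TEXT, address TEXT,
                          lat REAL, lng REAL);
    -- Hypothetical table: one row per city with its city-center coordinates.
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lng REAL);
""")
conn.executemany(
    "INSERT INTO markers (name, address, lat, lng) VALUES (?, ?, ?, ?)",
    [
        ("East of Chicago Pizza Company",
         "3901 North Kings Highway Suite 1, Myrtle Beach, SC", 33.716099, -78.855583),
        ("Domino's Pizza: Myrtle Beach",
         "1706 S Kings Hwy # A, Myrtle Beach, SC", 33.674881, -78.905144),
        ("Andolinis Pizza",
         "82 Wentworth St, Charleston, SC", 32.782330, -79.934235),
    ],
)
# Approximate city-center coordinates, for illustration only.
conn.execute(
    "INSERT INTO cities (name, lat, lng) VALUES (?, ?, ?)",
    ("Myrtle Beach", 33.6891, -78.8867),
)

query = """
    SELECT name, address, distance
    FROM (
        SELECT m.name, m.address,
               3959 * ACOS(COS(RADIANS(c.lat)) * COS(RADIANS(m.lat))
                           * COS(RADIANS(m.lng) - RADIANS(c.lng))
                           + SIN(RADIANS(c.lat)) * SIN(RADIANS(m.lat))) AS distance
        FROM markers m
        CROSS JOIN cities c
        WHERE c.name = ?
    ) AS nearby
    WHERE distance < 200
    ORDER BY distance
    LIMIT 20
"""
rows = conn.execute(query, ("Myrtle Beach",)).fetchall()
for name, address, distance in rows:
    print(f"{name} | {address} | {distance:.2f}")
```

Because the city lookup matches exactly one row, each restaurant is joined to a single point of interest and appears once, and restaurants inside the city are no longer excluded or used as the origin.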

Licensed under: CC-BY-SA with attribution