Question

Let's say I have a MySQL table, people. Each record comprises a variety of properties, among these favourite_colour, country, and age_group.

What I would like to do is retrieve records from this table by their similarity to a set of specific parameters. Given "Red", "United States", and "18-25", for example, the best results would be those records that match all three. These would be 100% matches.

However, I would also like to retrieve records that match any combination of two parameters (66% match), or any one parameter (33% match). Moreover, I would like to be able to define additional points of comparison (e.g. underwear_type, marital_status, etc.).

Is there a relatively efficient solution to this problem?

Solution

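Yes. Each comparison, such as favourite_colour = 'Red', evaluates to 0 (false) or 1 (true). MySQL does this implicitly; in other databases you may need an explicit CAST((favourite_colour = 'Red') AS INTEGER), which MySQL itself spells CAST(... AS SIGNED). Then add the results together, i.e.,

SELECT
  userId,
  (favourite_colour = 'Red')
    + (country = 'United States')
    + (age_group = '18-25') AS match_score
FROM people
HAVING match_score >= 2
ORDER BY match_score DESC

This gives you perfect matches first and 2-of-3 matches next. HAVING is used rather than WHERE because a column alias such as match_score cannot be referenced in a WHERE clause; lower the threshold to 1 to include single-parameter matches as well. It is easy to generalize to even more checks: add one more comparison per column.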
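As a rough illustration of how this generalizes to additional points of comparison, here is a minimal Python sketch that builds the scoring query from a dictionary of criteria and also reports the score as a percentage. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere; build_match_query, ALLOWED_COLUMNS, and the sample rows are hypothetical names and data, not part of the original answer. The score is computed in a subquery so the outer WHERE can filter on it portably, and COALESCE is an added assumption so that a NULL column counts as a non-match.

```python
import sqlite3

# Column names come from the question; the whitelist itself is a hypothetical
# safeguard, because column names cannot be bound as query parameters.
ALLOWED_COLUMNS = {"favourite_colour", "country", "age_group", "marital_status"}


def build_match_query(criteria, min_matches=1):
    """Return (sql, params) scoring each row by how many criteria it matches."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    for column in criteria:
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
    # Each comparison evaluates to 1 or 0; COALESCE maps a NULL result
    # (comparison against a NULL column) to 0 so the sum stays numeric.
    score = " + ".join(f"COALESCE({column} = ?, 0)" for column in criteria)
    sql = (
        "SELECT userId, match_score, "
        f"ROUND(100.0 * match_score / {len(criteria)}) AS match_percent "
        f"FROM (SELECT userId, {score} AS match_score FROM people) AS scored "
        "WHERE match_score >= ? "
        "ORDER BY match_score DESC, userId"
    )
    return sql, [*criteria.values(), min_matches]


def demo():
    """Run the query against a small in-memory table of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (userId INTEGER PRIMARY KEY, favourite_colour TEXT, "
        "country TEXT, age_group TEXT, marital_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Red", "United States", "18-25", "single"),
            (2, "Red", "Canada", "18-25", "married"),
            (3, "Blue", "United States", "26-35", None),
            (4, "Green", "France", "36-45", "single"),
        ],
    )
    sql, params = build_match_query(
        {"favourite_colour": "Red", "country": "United States", "age_group": "18-25"}
    )
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Adding another point of comparison is then a matter of adding one more key to the criteria dictionary (and to the whitelist); the percentage adjusts automatically because it divides by the number of criteria.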

OTHER TIPS

For the first three it is easy:

select * from people
where
(case when favourite_colour = 'Red' then 33 else 0 end +
case when age_group = '18-25' then 33 else 0 end +
case when country = 'United States' then 33 else 0 end) >= 33

I don't understand the "additional points of comparison" part, can you explain?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow