Question

I'm looking to implement an algorithm that calculates how similar several numbers are, as a percentage (0-100%).

Here is one scenario, for a movie database: a user profile stores the user's movie preferences in three attributes (how much I like Action, Drama, or Cartoon), each with a value from 1 to 10 (10 means I like it a lot). Each movie has corresponding values (e.g. Terminator: Action=10, Drama=5, Cartoon=1).

Now I would like to calculate how well a user's preferences match a movie's values. What are your suggestions? (This is being written in C#.)

Regards

Solution

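// Similarity of two values on a 0-10 scale, as a percentage:
// identical values give 100, a difference of 10 gives 0.
public static double SingleSimilarity(double x, double y)
{
    return (10.0 - Math.Abs(x - y)) * 10.0;
}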

// 3 values of user preferences, 3 values of movie assessment
public static double Similarity(Tuple<double, double, double> user, Tuple<double, double, double> movie)
{
    return (SingleSimilarity(user.Item1, movie.Item1) + SingleSimilarity(user.Item2, movie.Item2) + SingleSimilarity(user.Item3, movie.Item3)) / 3.0;
}

Example:

var similarity = Similarity(Tuple.Create(10.0, 0.0, 5.0), Tuple.Create(0.0, 10.0, 5.0));

Or a more generic method:

// 3 items (Action, Drama, Cartoon), each of which contains a value for the user and one for the movie
public static double Similarity(IEnumerable<Tuple<double, double>> list)
{
    return list.Sum(t => SingleSimilarity(t.Item1, t.Item2)) / list.Count();
}

And an example:

var similarity = Similarity(new[] { Tuple.Create(10.0, 0.0), Tuple.Create(0.0, 10.0), Tuple.Create(5.0, 5.0) });

In this case the result is 33.33...% (the three pairs score 0, 0 and 100), which seems like a practical result.
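One thing to watch: with ratings on the 1-10 scale from the question, the largest possible difference is 9, so SingleSimilarity as written never drops below 10%. Below is a minimal sketch of a variant that takes the scale bounds as parameters. The names ScaledSimilaritySketch, ScaledSimilarity, min and max are hypothetical and not part of the answer above.

```csharp
using System;

public static class ScaledSimilaritySketch
{
    // Hypothetical helper: similarity of two ratings on a [min, max] scale, as 0-100 %.
    // Identical ratings give 100; ratings at opposite ends of the scale give 0.
    public static double ScaledSimilarity(double x, double y, double min = 1.0, double max = 10.0)
    {
        double range = max - min; // largest possible difference (9 for a 1-10 scale)
        if (range <= 0)
            throw new ArgumentException("max must be greater than min.");

        return (1.0 - Math.Abs(x - y) / range) * 100.0;
    }
}
```

With the defaults, ScaledSimilarity(1, 10) gives 0 and ScaledSimilarity(5, 5) gives 100. Averaging over the three attributes works the same way as in Similarity above.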

Other tips

How about this:

var metric = (movie.actionMetric - user.actionPreference) + (movie.dramaMetric - user.dramaPreference) + (movie.cartoonMetric - user.cartoonPreference);

This simple algorithm can be computed within a database query (which is usually important) and yields a lower number the higher a person's preference for the movie's genres. You could also convert the value to a percentage with (1/metric) x 100, as long as metric is not zero. You can also weight the algorithm fairly easily, say if you thought the "cartoon" metric was less important than drama or action:

var metric = (movie.actionMetric - user.actionPreference) + (movie.dramaMetric - user.dramaPreference) + 0.5 * (movie.cartoonMetric - user.cartoonPreference);
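A caveat with the signed differences above is that they can cancel: a user at (10, 1, 5) and a movie at (1, 10, 5) sum to 0, the same as a perfect match. A rough sketch of a weighted variant that uses absolute differences and maps the result to 0-100% might look like this. It assumes a 1-10 scale (maximum difference 9), and the names WeightedSimilaritySketch, WeightedSimilarity and maxDiff are made up for illustration. This is a different technique from the signed metric above, not a rewrite of it.

```csharp
using System;

public static class WeightedSimilaritySketch
{
    // Hypothetical helper: weighted similarity in percent.
    // user, movie and weights are parallel arrays (e.g. Action, Drama, Cartoon).
    // maxDiff is the largest possible per-attribute difference (assumed 9 for a 1-10 scale).
    public static double WeightedSimilarity(double[] user, double[] movie, double[] weights, double maxDiff = 9.0)
    {
        if (user.Length != movie.Length || user.Length != weights.Length)
            throw new ArgumentException("All arrays must have the same length.");

        double weightedDiff = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < user.Length; i++)
        {
            // Absolute differences cannot cancel each other out.
            weightedDiff += weights[i] * Math.Abs(movie[i] - user[i]);
            totalWeight += weights[i];
        }

        if (totalWeight <= 0)
            throw new ArgumentException("At least one weight must be positive.");

        // No difference -> 100 %, maximum weighted difference -> 0 %.
        return (1.0 - weightedDiff / (totalWeight * maxDiff)) * 100.0;
    }
}
```

For example, weights of { 1.0, 1.0, 0.5 } make the cartoon attribute count half as much as action or drama, matching the weighting idea above.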

Typical solutions use different similarity measures (e.g. cosine, Pearson, Manhattan). It's all covered beautifully in Toby Segaran's "Programming Collective Intelligence".
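As an illustration of one of these measures, here is a minimal sketch of cosine similarity over two rating vectors. The class and method names are hypothetical, and returning 0 for an all-zero vector is a convention chosen here, since the measure is undefined in that case.

```csharp
using System;

public static class CosineSimilaritySketch
{
    // Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|).
    // For non-negative ratings the result lies between 0 and 1.
    public static double CosineSimilarity(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // Undefined for a zero vector; return 0 by convention (an assumption of this sketch).
        if (normA == 0.0 || normB == 0.0)
            return 0.0;

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Multiply the result by 100 to get a percentage. Note that cosine similarity compares the direction of the vectors, not their magnitude: a user at (1, 1, 1) and a movie at (10, 10, 10) score approximately 1.0, which may or may not be what you want for this use case.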

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow