Question

I have a 1 to 5 voting system and i'm trying to figure out the best way to find the most popular item voted on, taking into consideration the total possible number of votes cast. To get a vote total, i'm counting "1" votes as -3, "2" votes as -2, "3" votes as +1, "4" votes as +2, "5" votes as +3, so a "1" vote would cancel out a "5" vote and vice versa.

For this example, say we have 3 films playing in 3 different size theaters.

Film 1: 800 seats / Film 2: 400 seats / Film 3: 180 seats

In a way, we're limiting the total amount of votes based on seats, so I would like a way for the film in the smaller theater to not get automatically overwhelmed by the film in the larger theater. It's likely that there will be more votes cast in the larger theater, resulting in a higher total score.


Edit 10/18:

Alright, hopefully I can explain this better. I'm working for a film festival, and we're balloting the first screening of each film in the fest. Therefore, each film will have from 0 to a maximum number of votes based on the size of each theater. I'm looking to find the most popular film in 3 categories: narrative, documentary, short film. By popular I mean a combination of highest average vote and number of votes.

It seems like a weighted average is what i'm looking for, giving less weight to votes from a bigger theater and more weight to votes from a smaller theater to even things out.

Was it helpful?

Solution

You're working with weighted averages.

Instead of just adding up and dividing by the total number of elements (arithmetic mean):

 a + b + c
 ---------
     3

You are adding weights to each element, as they are not all evenly distributed:

 w1*a + w2*b + w3*c
 ------------------
         3

In your case, the weights could be this:

# of people in current theater
--------------------------------
# of people in all the theaters

Let's try a test case:

Theater 1: 100 people       (rating: 1)
Theater 2: 1,000,000 people (rating: 5)

Average = (100 / (100 + 1000000)) * 1 + (1000000/(100 + 1000000)) * 5
          -----------------------------------------------------------
                                      2
        = 2.49980002

OTHER TIPS

Well, depending on your goals it sounds like you are interested in some sort of weighted average.

Continuing your film example, it sounds to me like you are trying to rate how "good" the films are. To do this, you don't want to factor the number of views of any particular film too highly into the final determination. However, you have to take it into account somewhat since a film that only got viewed 5 times and had an average rating of +2.7 has much less credibility than a film with 10,000 views getting the same rating.

You might consider simply not including a film in the results unless it has a minimum number of votes.

Given a uniform (even) distribution of votes across {1,2,3,4,5}, the expected rating of your film is 0.2. This is because the the votes {1 and 5} cancel eachother out, as do {2 and 4}. But the vote 3 has an expected value of 1/5 = 0.2. So if people give a rating of {1,2,3,4,5} with equal probability, then you would expect a film (no matter how many people see it) to have an average rating close to 0.2.

So I think the best option for you would be to add up all the scores received and simply divide by the number of people who have seen each film. This should be a good guess at people's sentiment toward the film as the average of the distribution should not get larger simply because more people see the film.

If I were you, I would also suggest adding a small penalty term to your final result, to take into account the fact that some people didn't even want to go see the movie. If lots of people didn't want to see the movie in the first place, but the 5 or so people that saw it gave it a 5* rating, that doesn't make it a good movie, does it?

So a final solution I would recommend: Add up all the points as you have described, and divide by the total number of people who have gone to the cinema. While not perfect (whatever perfect means), it should give you some indication of what people like and don't like. This essentially means people who chose not to see a movie are adding zero to the points total, but still affect the average because the end result is divided by a larger number.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top