Question

I have a couple of systems that each contain a users table with some form of karma/weight/reputation. Sometimes it's the number of posts a user has made; sometimes it's the number of up/down votes a user has received across all their activity on the site.

USER {
    id int
    name string
    karma int
}

How do I use these numbers to calculate that user's "weight" or "authority"? For example, the vote of one long-time member is often worth much more than 4 votes from brand new users.

I was thinking about adding up the total points/karma/reputation of all members and then trying to come up with a 1-100 scale.

SUM(user.points) / COUNT(user.*) = average user points

Then something like

CEIL(userA.points / average user points) = their weight on an issue

However, there also needs to be a curve on the points, as I don't want someone with 5,000 posts/karma to outweigh the votes of 20 new users.


Solution

Mathematically, your best bet is to weight by the log of the percentile ranking of the user in question. However, that is painful in SQL.

Simpler would be to cheat and assume the mean is the same as the median (a very bad assumption statistically, but much simpler programmatically):

 SELECT 1 - LOG10(
          (SELECT COUNT(*) FROM user WHERE points >= u.points)  -- users at or above this user
          / (SELECT COUNT(*) FROM user)                         -- all users
        ) AS weight
 FROM user AS u
 WHERE u.id = @UserID

With this, the top 10% by karma get about one and a half times the impact of the average user and almost twice the impact of a noob. Changing the log base rescales this: the natural log (LN(), or LOG() with a single argument, in MySQL) gives the top 10% about three times the impact of a noob and twice that of the average user. LOG2() is even more extreme. (Note: the subtraction is needed because the log of a fraction between 0 and 1 is negative.)
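As a rough check of those figures, here is a minimal Python sketch. It assumes the weight is 1 minus the log of the fraction of users who have at least as many points as the user in question; the function name and the 100-user population are made up for illustration.

```python
import math

def log_percentile_weight(points, all_points, base=10.0):
    """Weight = 1 - log_base(p), where p is the fraction of users with >= points.

    Assumes the user is part of all_points, so p is never zero.
    """
    at_or_above = sum(1 for p in all_points if p >= points)
    fraction = at_or_above / len(all_points)
    return 1.0 - math.log(fraction, base)

# Hypothetical population: 100 users with distinct scores 1..100.
population = list(range(1, 101))

newcomer = log_percentile_weight(1, population)    # p = 1.0
median = log_percentile_weight(51, population)     # p = 0.5
top_ten = log_percentile_weight(91, population)    # p = 0.1

print(round(newcomer, 2), round(median, 2), round(top_ten, 2))

# Same user, natural log instead of base 10.
top_ten_ln = log_percentile_weight(91, population, base=math.e)
print(round(top_ten_ln, 2))
```

With base 10 the three weights come out near 1, 1.3 and 2; with the natural log the top 10% lands near 3.3.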

If you want a more severe effect, you might try squaring the log. (Note: the square of the log is positive, so here you add it instead of subtracting.)
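To see how much squaring changes things, the sketch below compares the plain and squared variants. It assumes the input is the fraction of users at or above the given user; both function names are hypothetical.

```python
import math

def plain_log_weight(fraction_at_or_above):
    # The log of a fraction in (0, 1] is <= 0, so subtract it.
    return 1.0 - math.log10(fraction_at_or_above)

def squared_log_weight(fraction_at_or_above):
    # The squared log is >= 0, so add it.
    return 1.0 + math.log10(fraction_at_or_above) ** 2

for fraction in (1.0, 0.5, 0.1, 0.01):
    print(fraction,
          round(plain_log_weight(fraction), 3),
          round(squared_log_weight(fraction), 3))
```

Under these assumptions the two variants agree at the top 10% (both give 2), but the squared form pulls the median down toward 1 and pushes the top 1% up to 5 instead of 3.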

If you want a hyperprecise rule, you can go into standard deviations, but the SQL gets cumbersome and slow. It all depends on how far down the rabbit hole you want to go...

Other tips

There are probably resources that can give you parameters for this, but you should decide exactly what you want rather than adopt some predefined model. I suggest defining rules for which sets of users should be equivalent, or which should outweigh each other (e.g. ten 0-karma users = one 5k-karma user). Equivalences are much easier to work with, and they will very quickly produce the parameters for whichever equation you choose.

A log (as already suggested), a fractional power (such as a square root), or even a plain linear function can work.

I suggest something like newKarma = a*karma^b + c; it shouldn't be too difficult to solve for a, b and c. I suggest you pick b rather than trying to calculate it. Using new users (with karma = 0) should make this quite easy to solve. Guessing values until you get close to what you want can be easier than determining them mathematically, since some sets of rules won't fit any simple equation.
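As an illustration of solving for the parameters, the sketch below takes the example rule "ten 0-karma users = one 5k-karma user", picks b = 0.5 by hand, and assumes a new user should weigh exactly 1. The function names and chosen values are illustrative, not prescribed.

```python
def solve_a_and_c(b, new_user_weight, high_karma, high_karma_weight):
    """Fit newKarma = a * karma**b + c through two chosen points (b > 0)."""
    c = new_user_weight                       # karma = 0 gives newKarma = c
    a = (high_karma_weight - c) / high_karma ** b
    return a, c

def new_karma(karma, a, b, c):
    return a * karma ** b + c

b = 0.5  # picked by hand: a square-root curve
a, c = solve_a_and_c(b, new_user_weight=1.0,
                     high_karma=5000, high_karma_weight=10.0)

print(a, c)
print(new_karma(0, a, b, c), new_karma(5000, a, b, c))
```

With these inputs a comes out around 0.127, and a 5,000-karma user weighs as much as ten new users, as the rule requires.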

Note that c above is an offset added to each user's derived karma, so a large enough group of new users will have more total karma than a single high-karma user. You may also want to consider a*(karma + c)^b, or a*(karma + c)^b + d. Analysing the rules you defined should tell you which one to use.
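The shifted form a*(karma + c)^b can be solved in much the same way. The sketch below assumes a new user (karma = 0) should weigh exactly 1 and that one user with `high_karma` should weigh `ratio` times as much; dividing the two conditions gives ((high_karma + c) / c)^b = ratio, which can be rearranged for c. The names and numbers are illustrative.

```python
def solve_shifted(b, high_karma, ratio):
    """Fit f(karma) = a * (karma + c)**b so that f(0) = 1 and f(high_karma) = ratio."""
    # ((high_karma + c) / c)**b = ratio  =>  c = high_karma / (ratio**(1/b) - 1)
    c = high_karma / (ratio ** (1.0 / b) - 1.0)
    a = 1.0 / c ** b                          # forces f(0) = a * c**b = 1
    return a, c

def shifted_karma(karma, a, b, c):
    return a * (karma + c) ** b

b = 0.5  # picked by hand
a, c = solve_shifted(b, high_karma=5000, ratio=10.0)

print(a, c)
print(shifted_karma(0, a, b, c), shifted_karma(5000, a, b, c))
```

Here c works out to 5000/99 (about 50.5), which shows how strongly the choice of b drives the offset.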

UPDATE: Added alternatives for c

EDIT: You have some options for the SQL. A temp table (with sums) might actually be the fastest. You can also just use a view. A join on the same table might also be possible, though I'm not sure. Using a view would look something like the following, for some chosen a, b, c and d (the 1, 2, 3 and 4 in the expressions are placeholders; you may also want to add indices to the view):

Votes(issueID, userID) // table structure
User(userID, karma, ...) // table structure

CREATE VIEW Sums AS
SELECT issueID, SUM(1*POWER(karma + 2, 3) + 4) AS sumVal
FROM Votes JOIN User ON User.userID = Votes.userID
GROUP BY issueID

Query:

SELECT (1*POWER(karma + 2, 3) + 4)/sumVal AS influenceOnIssue
FROM Votes JOIN User ON User.userID = Votes.userID
  JOIN Sums on Sums.issueID = Votes.issueID
WHERE Votes.userID = @UserID AND Votes.issueID = @IssueID
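To sanity-check what the view and query compute, here is a small Python sketch of the same arithmetic: each voter's derived karma divided by the sum over everyone who voted on the issue. The constants mirror the placeholder expression 1*POWER(karma + 2, 3) + 4; the voter names and karma values are made up.

```python
def derived_karma(karma, a=1, b=3, c=2, d=4):
    # Mirrors 1*POWER(karma + 2, 3) + 4 from the SQL.
    return a * (karma + c) ** b + d

def influence_on_issue(voter_karma, user_id):
    """voter_karma maps each user who voted on one issue to their raw karma."""
    weights = {uid: derived_karma(k) for uid, k in voter_karma.items()}
    return weights[user_id] / sum(weights.values())

# Hypothetical voters on a single issue.
votes_on_issue = {"alice": 0, "bob": 1, "carol": 3}

for voter in votes_on_issue:
    print(voter, round(influence_on_issue(votes_on_issue, voter), 4))
```

The derived values are 12, 31 and 129, so carol ends up with 129/172 = 0.75 of the influence on this issue.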

A simplification may be to have a computed column equal to 1*POWER(karma + 2, 3) + 4.

The faster option would be to calculate the derived karma on insert/update, either by adding a column and using triggers, or by calculating it before you call insert/update and passing in the new value.
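The "calculate it before you call insert/update" variant might look like the sketch below. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for the real database; the table layout, column names and `save_karma` helper are assumptions for illustration.

```python
import sqlite3

def derived_karma(karma, a=1, b=3, c=2, d=4):
    # Same placeholder formula as 1*POWER(karma + 2, 3) + 4.
    return a * (karma + c) ** b + d

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "user_id INTEGER PRIMARY KEY, karma INTEGER, derived_karma INTEGER)"
)

def save_karma(conn, user_id, karma):
    """Insert or update a user, storing the derived value next to the raw karma."""
    conn.execute(
        "INSERT OR REPLACE INTO users (user_id, karma, derived_karma) "
        "VALUES (?, ?, ?)",
        (user_id, karma, derived_karma(karma)),
    )

save_karma(conn, 1, 0)
save_karma(conn, 1, 3)  # update: the derived column is recomputed

row = conn.execute(
    "SELECT karma, derived_karma FROM users WHERE user_id = 1"
).fetchone()
print(row)
```

Queries that sum influence per issue can then read the stored column directly instead of calling POWER on every row.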

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow