Question

(let ((g (* 2 (or (gethash word good) 0)))
      (b (or (gethash word bad) 0)))
   (unless (< (+ g b) 5)
     (max .01
          (min .99 (float (/ (min 1 (/ b nbad))
                             (+ (min 1 (/ g ngood))   
                                (min 1 (/ b nbad)))))))))

Solution

What is the problem? It is almost plain English:

Let g be the value of word in the hash table good (or 0 if it is not present there), times 2

(let ((g (* 2 (or (gethash word good) 0)))

and b the value of word in the hash table bad (or 0 if it is not present there).

      (b (or (gethash word bad) 0)))

With this in mind, and provided that the sum of g and b is not smaller than 5 (otherwise the whole form returns NIL)

   (unless (< (+ g b) 5)

return the maximum of either 0.01 or

     (max .01

the minimum of either 0.99 or

          (min .99 

b/nbad divided by the sum of b/nbad and g/ngood (as a float value, and those individual quotients should be at most 1).

               (float (/ (min 1 (/ b nbad))
                         (+ (min 1 (/ g ngood))   
                            (min 1 (/ b nbad)))))))))
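
To try the expression out, it can be wrapped in a function and fed some sample data. This is only a sketch: the function name word-score, its argument list, and the counts below are made up for illustration. Only the body comes from the question.

```lisp
;; Hypothetical wrapper: the name WORD-SCORE and its parameters are
;; assumptions; the body is the expression from the question.
(defun word-score (word good bad ngood nbad)
  (let ((g (* 2 (or (gethash word good) 0)))
        (b (or (gethash word bad) 0)))
    (unless (< (+ g b) 5)
      (max .01
           (min .99 (float (/ (min 1 (/ b nbad))
                              (+ (min 1 (/ g ngood))
                                 (min 1 (/ b nbad))))))))))

;; Made-up sample data. Tables keyed by strings need :test 'equal.
(defparameter *good* (make-hash-table :test 'equal))
(defparameter *bad* (make-hash-table :test 'equal))
(setf (gethash "offer" *good*) 1
      (gethash "offer" *bad*) 8)

;; g = 2, b = 8, so (8/10) / (2/10 + 8/10) = 0.8
(word-score "offer" *good* *bad* 10 10)

;; A word found in neither table has g + b = 0, so the result is NIL.
(word-score "hello" *good* *bad* 10 10)
```

Note that the expression divides by ngood and nbad, so both are assumed to be non-zero here.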

OTHER TIPS

Looks like it is trying to calculate a score based on the presence of word in the hash tables good and bad.

If the word does not exist in a hash table, it is given a value of 0 for that table. Its value from the good table is weighted by 2 (doubled).

If the sum of the two values is less than 5, nothing is calculated and the result is NIL. Otherwise the score is calculated (the portion below unless) as follows:

score = min(1, b/nbad) / (min(1, g/ngood) + min(1, b/nbad))
max(0.01, min(0.99, score))
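
The same formula can be written out step by step to see how the threshold and the two clamps behave. This is a rough sketch, not the original code: the names clamp-ratio and spam-score are hypothetical, and the counts are passed in directly instead of being looked up in hash tables.

```lisp
;; Hypothetical helper names; counts are passed in directly so the
;; example needs no hash tables.
(defun clamp-ratio (count total)
  "Return COUNT/TOTAL, capped at 1."
  (min 1 (/ count total)))

(defun spam-score (good-count bad-count ngood nbad)
  (let ((g (* 2 good-count))          ; good occurrences count double
        (b bad-count))
    (unless (< (+ g b) 5)             ; too little evidence: return NIL
      (let* ((bad-ratio  (clamp-ratio b nbad))
             (good-ratio (clamp-ratio g ngood))
             (score (float (/ bad-ratio (+ good-ratio bad-ratio)))))
        ;; Keep the result inside [0.01, 0.99].
        (max 0.01 (min 0.99 score))))))

(spam-score 1 2 100 100)   ; g + b = 4, below the threshold: NIL
(spam-score 0 5 100 100)   ; only seen in bad: raw score 1.0, clamped to 0.99
(spam-score 3 0 100 100)   ; only seen in good: raw score 0.0, clamped to 0.01
(spam-score 5 10 100 100)  ; g = 10, b = 10: 0.5
```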

I'm not sure what ngood and nbad are, but the n suggests to me that they are probably counts. The unless guard means that words with a combined value below 5 get no score at all. In the score calculation each ratio is capped at 1, so the denominator is at most 2, and the final result is clamped to the range 0.01 to 0.99.

Based on the tags you've used, I would guess (and it is just a guess) that it is trying to calculate a weighting for word based on some kind of frequency(?) counting of the word in good versus bad email.

Licensed under: CC-BY-SA with attribution