Question

I'm trying to add Laplace smoothing to my Naive Bayes code. It gives me 72.5% accuracy on a 70% train / 30% test split, which seems low. Does anyone see anything wrong?

posTotal = len(pos)
negTotal = len(neg)

for w in larr:
  if (w not in pos) or (w not in neg):
    unk[w] += 1
    unkTotal = len(unk)
  else:
    if (w in pos):
      posP += (math.log10(pos[w]) - math.log10(posTotal))
    if (w in neg):
      negP += (math.log10(neg[w]) - math.log10(negTotal))

pos and neg are defaultdicts.


OTHER TIPS

My Python's a little rusty, but for the if, don't you want if (w not in pos) and (w not in neg)? As written, with or, the else branch only runs for words that appear in both pos and neg, so those are the only words that ever adjust your scores.
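Also, nothing in the snippet actually applies the Laplace counts: unseen words get routed into unk instead of contributing to posP/negP. Since pos and neg are defaultdicts, you can fold add-one smoothing straight into the update and drop the unk branch. A rough sketch of what I mean, assuming posTotal and negTotal are the total token counts per class (the sum of the counts, not len()) and larr is the list of words in the document being scored:

import math

# vocabulary size for the +V term in the smoothed denominator
vocab = set(pos) | set(neg)
V = len(vocab)

posP = 0.0
negP = 0.0
for w in larr:
  # add-one (Laplace) smoothing: (count + 1) / (total + V)
  # .get(w, 0) avoids growing the defaultdicts while scoring;
  # unseen words get a small nonzero probability instead of being skipped
  posP += math.log10(pos.get(w, 0) + 1) - math.log10(posTotal + V)
  negP += math.log10(neg.get(w, 0) + 1) - math.log10(negTotal + V)

With the +1 / +V terms, a word missing from one class contributes log10(1 / (total + V)) rather than being dropped, which is usually what hurts accuracy in the unsmoothed version.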

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow