Question

I'm trying to add Laplace smoothing to my Naive Bayes code. It gives me 72.5% accuracy on a 70% train / 30% test split, which seems kind of low. Does anyone see anything wrong?

import math

posTotal = len(pos)
negTotal = len(neg)

for w in larr:
    if (w not in pos) or (w not in neg):
        unk[w] += 1
        unkTotal = len(unk)
    else:
        if w in pos:
            posP += (math.log10(pos[w]) - math.log10(posTotal))
        if w in neg:
            negP += (math.log10(neg[w]) - math.log10(negTotal))

pos and neg are defaultdicts.
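In case it helps, they get filled in during training more or less like this (simplified sketch; pos_docs and neg_docs are just placeholder names for my tokenized training data, not the actual variable names):

from collections import defaultdict

# Count how often each word appears in the positive and negative training data.
pos = defaultdict(int)
neg = defaultdict(int)
for doc in pos_docs:        # placeholder: list of positive documents, each a list of words
    for w in doc:
        pos[w] += 1
for doc in neg_docs:        # placeholder: list of negative documents
    for w in doc:
        neg[w] += 1

# Since these are defaultdict(int), pos[w] is 0 for a word that never occurred,
# and math.log10(0) raises a ValueError -- that's why the scoring loop above checks membership first.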

No correct solution

Other suggestions

My Python's a little rusty, but for the if, don't you want if (w not in pos) and (w not in neg)? Seems like this version would only adjust your scores for words that are somehow found in both pos and neg.
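For what it's worth, here's a rough sketch of what the scoring loop could look like with the membership test fixed and with add-one (Laplace) smoothing actually applied. It keeps your variable names where possible; the vocabulary size V and the per-class totals aren't in your snippet, so those parts are guesses:

import math

# Add-one (Laplace) smoothed log-probabilities.
# With smoothing, every word has a nonzero probability in both classes,
# so unseen words don't need to be diverted into a separate unk bucket.
vocab = set(pos) | set(neg)         # combined vocabulary of both classes
V = len(vocab)                      # vocabulary size used for smoothing
posTotal = sum(pos.values())        # total word occurrences per class
negTotal = sum(neg.values())        # (token counts, not len(pos)/len(neg))

posP = 0.0
negP = 0.0
for w in larr:                      # larr: the words of the document being classified
    # P(w | class) = (count + 1) / (total + V), accumulated in log space
    posP += math.log10(pos.get(w, 0) + 1) - math.log10(posTotal + V)
    negP += math.log10(neg.get(w, 0) + 1) - math.log10(negTotal + V)

Comparing posP and negP (plus the log class priors, if the training set isn't balanced) then gives the predicted label.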
