Question

I have a Naive Bayes classifier (implemented with WEKA) that looks for uppercase letters.

contains_A
contains_B
...
contains_Z

For a certain class, the word LCD appears in almost every instance of the training data. When I ask for the probability that "LCD" belongs to that class, I get something like 0.988. Great.

When I ask for the probability of "L" I get a plain 0, and for "LC" I get 0.002. Since the features are assumed to be independent (naive), shouldn't L, C and D each contribute to the overall probability on their own, so that "L" has some probability, "LC" more, and "LCD" even more?
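Roughly, what I expect the naive independence assumption to mean is that each contains_X feature multiplies in its own factor, so the score should grow as more of the letters are present. A toy sketch of that expectation (all the numbers here are made up):

    public class NaiveExpectation {
        public static void main(String[] args) {
            // Hypothetical prior and class-conditional likelihoods P(contains_X = 1 | class)
            double prior = 0.5;
            double pL = 0.9, pC = 0.9, pD = 0.9;

            // With independent features, each present letter multiplies in its own factor
            double scoreL   = prior * pL * (1 - pC) * (1 - pD); // only L present
            double scoreLC  = prior * pL * pC * (1 - pD);       // L and C present
            double scoreLCD = prior * pL * pC * pD;             // L, C and D present

            System.out.printf("L=%.4f LC=%.4f LCD=%.4f%n", scoreL, scoreLC, scoreLCD);
            // I expected this kind of monotonic increase from the classifier as well.
        }
    }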

At the same time, the same experiment with an MLP does not show this behavior: it gives probabilities of 0.006, 0.5 and 0.8 for "L", "LC" and "LCD" respectively.

So the MLP does what I would expect Naive Bayes to do, and vice versa. Am I missing something? Can anyone explain these results?

No correct solution

OTHER TIPS

I am not familiar with the internals of WEKA, so please correct me if you think I am not right.

When a text is used as a "feature", it is transformed into a vector of binary values. Each value corresponds to one concrete word, and the length of the vector is equal to the size of the dictionary.

If your dictionary contains 4 words: LCD, VHS, HELLO, WORLD, then, for example, the text HELLO LCD will be transformed to [1,0,1,0].
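A minimal sketch of that transformation, using the example dictionary and text from above:

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    public class BinaryVector {
        public static void main(String[] args) {
            String[] dictionary = {"LCD", "VHS", "HELLO", "WORLD"};
            Set<String> tokens = new HashSet<>(Arrays.asList("HELLO LCD".split("\\s+")));

            // One binary value per dictionary entry: 1 if the word occurs in the text
            int[] vector = new int[dictionary.length];
            for (int i = 0; i < dictionary.length; i++) {
                vector[i] = tokens.contains(dictionary[i]) ? 1 : 0;
            }
            System.out.println(Arrays.toString(vector)); // [1, 0, 1, 0]
        }
    }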

I do not know how WEKA builds its dictionary, but I think it might go over all the words present in the examples. Unless "L" is present in the dictionary (and therefore appears as a word in the examples), its probability is logically 0. Actually, it should not even be considered a feature.
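To make that concrete, here is a sketch assuming the dictionary is built from the whitespace-separated tokens of the training texts (the example texts are made up): a single letter like "L" never becomes an entry at all.

    import java.util.Set;
    import java.util.TreeSet;

    public class DictionaryFromTokens {
        public static void main(String[] args) {
            // Hypothetical training texts
            String[] trainingTexts = {"SAMSUNG LCD TV", "LCD MONITOR", "HELLO WORLD"};

            Set<String> dictionary = new TreeSet<>();
            for (String text : trainingTexts) {
                for (String token : text.split("\\s+")) {
                    dictionary.add(token);
                }
            }
            System.out.println(dictionary);               // [HELLO, LCD, MONITOR, SAMSUNG, TV, WORLD]
            System.out.println(dictionary.contains("L")); // false: "L" is not a feature at all
        }
    }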

Actually, you cannot reason over the probabilities of the features in this way, and you cannot add them together; I do not think there is such a relationship between the features.

Beware that in text mining, words (letters in your case) may be given weights different from their actual counts if you are using any sort of term weighting and normalization, e.g. tf-idf. With tf-idf, for example, term counts are converted to a logarithmic scale, and terms that appear in nearly every instance are penalized by the idf normalization.

I am not sure which options you are using to convert your data into Weka features, but you can see here that the StringToWordVector filter has parameters for exactly these weighting and normalization options:

http://weka.sourceforge.net/doc.dev/weka/filters/unsupervised/attribute/StringToWordVector.html

-T: Transform the word frequencies into log(1+fij), where fij is the frequency of word i in the jth document (instance).

-I: Transform each word frequency into fij*log(number of documents / number of documents containing word i), where fij is the frequency of word i in the jth document (instance).
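If you want to experiment with these options programmatically, the usual way to apply the filter looks roughly like this sketch (assuming weka.jar is on the classpath; the ARFF file name and class index are placeholders for your own data):

    import weka.core.Instances;
    import weka.core.Utils;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.Filter;
    import weka.filters.unsupervised.attribute.StringToWordVector;

    public class TfIdfFilterSketch {
        public static void main(String[] args) throws Exception {
            // "train.arff" is a placeholder: a dataset with a string attribute and a class attribute
            Instances raw = DataSource.read("train.arff");
            raw.setClassIndex(raw.numAttributes() - 1);

            StringToWordVector filter = new StringToWordVector();
            // -T: log(1 + fij) term-frequency transform, -I: IDF transform (see the options above)
            filter.setOptions(Utils.splitOptions("-T -I"));
            filter.setInputFormat(raw);

            Instances vectorized = Filter.useFilter(raw, filter);
            System.out.println(vectorized.numAttributes() + " attributes after filtering");
        }
    }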

I checked the Weka documentation and did not see support for extracting single letters as features. This suggests that Weka needs a space or punctuation to delimit each feature from the adjacent ones. If so, "L", "C" and "D" would each be treated as separate one-letter words that never occur in the training data, which would explain why they were not found.

If you think this is the issue, you could try splitting the text into single characters delimited by \n or a space before ingestion.
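A simple pre-processing sketch along those lines (the helper name is just an illustration), turning each character into its own space-delimited token before handing the text to Weka:

    public class CharSplitter {
        // Insert a space between every character so each letter becomes a separate token
        static String toCharacterTokens(String text) {
            StringBuilder out = new StringBuilder();
            for (char c : text.toCharArray()) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(c);
            }
            return out.toString();
        }

        public static void main(String[] args) {
            System.out.println(toCharacterTokens("LCD")); // prints: L C D
        }
    }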

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow