It looks like you'll need thousands of training messages.
Note that spammers have discovered ways to get past this kind of filter, e.g. misspellings like "v1agra". Iterative refinements to the classifier might help it catch up with their current techniques.
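One possible refinement, as a rough sketch only: fold common look-alike characters back to letters before the tokens reach the classifier, so "v1agra" and "viagra" count as the same word. The substitution table and the name `normalize_token` are my own assumptions for illustration, not part of any particular filter, and spammers will find substitutions this table misses.

```python
# Hypothetical look-alike table; extend it as new tricks show up.
LOOKALIKES = str.maketrans({
    "1": "i", "0": "o", "3": "e", "4": "a",
    "5": "s", "@": "a", "$": "s",
})

def normalize_token(token):
    """Map digit/symbol look-alikes to letters in word-like tokens."""
    token = token.lower()
    # Leave tokens without any letters (e.g. plain numbers) untouched.
    if not any(ch.isalpha() for ch in token):
        return token
    return token.translate(LOOKALIKES)

print(normalize_token("v1agra"))
```

The obvious cost is false merges (a legitimate token like "b2b" gets mangled too), so it is worth checking against real mail before relying on it.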
Bayesian_spam_filtering looks like a good place to start, especially its references to in-depth articles.
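To give a feel for the idea before reading those, here is a minimal naive Bayes sketch with add-one (Laplace) smoothing. It is a toy under my own assumptions: the class name `NaiveBayesSpamFilter`, the tokenizer, and the "spam"/"ham" labels are all made up for illustration, and a real filter would be trained on thousands of messages rather than a handful.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Very crude tokenizer: lowercase runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

class NaiveBayesSpamFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled message; label is 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Estimate P(spam | text) under the naive independence assumption."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        vocab_size = max(len(vocab), 1)
        total_messages = sum(self.message_counts.values())
        words = tokenize(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior, so an untrained class does not give log(0).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (total_words + vocab_size))
            log_scores[label] = score
        # Work in log space, then normalize back to a probability.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])

spam_filter = NaiveBayesSpamFilter()
spam_filter.train("cheap viagra buy now", "spam")
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("meeting at noon tomorrow", "ham")
spam_filter.train("lunch tomorrow with the team", "ham")
print(spam_filter.spam_probability("buy cheap viagra"))
```

With so little training data the probabilities are not meaningful in themselves; the point is only the shape of the computation (per-class word counts, a prior, and a product of smoothed word likelihoods).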