The text mining package (tm) maintains its own list of stopwords and provides useful tools for managing and summarizing this type of text.
Let's say your tweets are stored in a character vector.
library(tm)
words <- vector_of_strings      # your tweets
corpus <- Corpus(VectorSource(words))
# Strip punctuation, lowercase, then drop common English stopwords.
# In tm >= 0.6, plain functions such as tolower must be wrapped in
# content_transformer() so the corpus structure is preserved.
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removeWords, stopwords())
You can replace stopwords() in the last line with your own list:
stoppers <- c(stopwords(), "gonna", "wanna", "lol", ... )
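The extended list is then passed to removeWords in place of stopwords(). A minimal, self-contained sketch (the sample tweet and the chat-speak terms are illustrative):

```r
library(tm)

# Extend the built-in English stopwords with a few chat-speak terms
stoppers <- c(stopwords(), "gonna", "wanna", "lol")

# Toy corpus standing in for your tweets
corpus <- Corpus(VectorSource(c("I'm gonna be late lol")))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removeWords, stoppers)
```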
Unfortunately, you'll have to generate your own list of "text messaging" or "internet messaging" stopwords.
But you could cheat a bit by borrowing from NetLingo ( http://vps.netlingo.com/acronyms.php ):
library(XML)
theurl <- "http://vps.netlingo.com/acronyms.php"
h <- htmlParse(theurl)
# The acronyms are the link texts inside the page's list items
nodes <- getNodeSet(h, "//ul/li/span//a")
stoppers <- sapply(nodes, xmlValue)
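The scraped acronyms can then be lowercased (to match a corpus that has been through tolower), deduplicated, and folded into the stopword list before the removeWords pass. A sketch using a toy stand-in for the scraped vector:

```r
library(tm)

# Toy stand-in for the acronym list scraped above
stoppers <- c("LOL", "BRB", "IMHO")

# Lowercase to match the lowercased corpus, drop duplicates,
# and merge with tm's built-in English stopwords
allStops <- unique(c(stopwords(), tolower(stoppers)))

corpus <- Corpus(VectorSource(c("brb dinner imho overrated")))
corpus <- tm_map(corpus, removeWords, allStops)
```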