The answer to your question is over here: https://stackoverflow.com/a/13370840/1036500 (give it an upvote!)
In brief, more recent versions of the tm package do not include minDocFreq but instead use bounds. For example, your
smaller <- DocumentTermMatrix(corpus, control=list(minDocFreq=100))
should now be
require(tm)
data("crude")
smaller <- DocumentTermMatrix(crude, control=list(bounds = list(global = c(5,Inf))))
dim(smaller) # dimensions after terms that appear in fewer than 5 documents are discarded
[1] 20 67
smaller <- DocumentTermMatrix(crude, control=list(bounds = list(global = c(10,Inf))))
dim(smaller) # dimensions after terms that appear in fewer than 10 documents are discarded
[1] 20 17
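
If you want to convince yourself of what the lower bound does, here is a minimal sketch that applies the same filter by hand and compares it with the bounds version. It assumes the global bound counts the number of documents a term occurs in (not its total frequency); the names full, doc_freq, keep, min_docs, manual and bounded are my own, chosen for illustration.

```r
library(tm)
data("crude")

# Full matrix with no frequency filtering
full <- DocumentTermMatrix(crude)

# Document frequency: number of documents in which each term occurs
doc_freq <- colSums(as.matrix(full) > 0)

# Column positions of terms found in at least min_docs documents
min_docs <- 5  # illustrative threshold, same as the first example above
keep <- unname(which(doc_freq >= min_docs))
manual <- full[, keep]

# Same filter applied through the bounds control option
bounded <- DocumentTermMatrix(crude,
                              control = list(bounds = list(global = c(min_docs, Inf))))

dim(manual)
dim(bounded)
```

Under that assumption I'd expect the two dim() calls to agree. The manual route is handy if you already have a matrix built and don't want to rebuild it from the corpus.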