Here are two ways:
mydf[sapply(gregexpr("\\W+", mydf$ARTICLE), length) > 4, ]
#   NO                                          ARTICLE
# 1 34  The New York Times reports a lot of words here.
# 2 12                Greenwire reports a lot of words.
# 4  2      The Financial Times reports a lot of words.
# 6 13 The New York Times reports a lot of words again.
mydf[sapply(strsplit(as.character(mydf$ARTICLE), " "), length) > 5, ]
#   NO                                          ARTICLE
# 1 34  The New York Times reports a lot of words here.
# 2 12                Greenwire reports a lot of words.
# 4  2      The Financial Times reports a lot of words.
# 6 13 The New York Times reports a lot of words again.
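The first uses gregexpr to find, for each article, the starting position of every run of non-word characters (spaces and punctuation), and then takes the length of that vector. For single-spaced sentences that end in punctuation, as these do, that length equals the number of words, so the > 4 cutoff here keeps rows with at least five words, while the > 5 cutoff in the second call keeps rows with at least six.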
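To see what the first approach actually counts, it can help to look at the raw gregexpr result for a single string. This is only a small sketch; the variable names `x` and `m` are made up for illustration, and the string is one sentence taken from the output above.

```r
# One sentence from the example output, used on its own for illustration
x <- "Greenwire reports a lot of words."

# gregexpr() returns a list with one integer vector per input string;
# each value is the start position of a run of non-word characters
m <- gregexpr("\\W+", x)[[1]]
as.integer(m)   # positions of the five spaces and the final period
length(m)       # 6, which matches the word count only because the
                # sentence ends with punctuation

# With no match at all, gregexpr() returns -1, so the length is still 1
no_match <- gregexpr("\\W+", "Greenwire")[[1]]
length(no_match)
```

The last two lines show why this count is fragile: a one-word entry with no punctuation and an entry with no text at all would both report a length of 1.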
The second splits each entry of the ARTICLE column on single spaces, giving a list with one vector of words per row, and takes the length of each vector. This is probably the better approach, since it counts the words directly.
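One caveat with splitting on a single space is that consecutive spaces produce empty strings, which inflate the count. A possible variation, sketched here under the assumption that any run of whitespace separates two words, splits on "\\s+" instead. The helper name `count_words` and the sample data frame are hypothetical and not part of the original data.

```r
# Hypothetical data frame standing in for mydf (second row is made up)
mydf <- data.frame(
  NO = c(34, 7),
  ARTICLE = c("The New York Times reports a lot of words here.",
              "Too  short."),   # note the double space
  stringsAsFactors = FALSE
)

# Hypothetical helper: count words by splitting on runs of whitespace.
# trimws() removes leading/trailing blanks so no empty first element appears;
# lengths() returns the length of each list element.
count_words <- function(x) {
  lengths(strsplit(trimws(as.character(x)), "\\s+"))
}

count_words(mydf$ARTICLE)                # 10 2
mydf[count_words(mydf$ARTICLE) > 5, ]    # keeps only the first row
```

With a plain " " as the separator, the second row would be counted as three pieces ("Too", "" and "short.") rather than two.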