Question

I have a large Corpus object in R, created with the tm package, made up of millions of small documents.

How do I save that to disk as a single text file for use with other programs (such as word2vec)?

I tried

writeCorpus(myCorpus)

but that wrote out a million tiny text files that blew up my Mac!

I'm not very proficient in R, so any help on how to do this would be much, much appreciated. Thank you!


Solution

Try:

writeLines(as.character(myCorpus), con="mycorpus.txt")

I don't know how efficient this will be with a million documents, though.
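If the one-liner above turns out to be too slow or memory-hungry, one option is to open the output file once and write the documents in chunks, one document per line (the format word2vec-style tools usually expect). The following is only a sketch: `write_corpus_one_file` and `chunk_size` are hypothetical names, not part of tm, and the demo uses a plain list as a stand-in for the corpus. It assumes that `length()` and `[[` work on your corpus and that `as.character()` on a single document returns its text, which I have not tested against every tm version.

```r
# Hypothetical helper (not part of tm): write every document to a single
# text file, one document per line, processing the corpus in chunks.
write_corpus_one_file <- function(docs, path, chunk_size = 10000L) {
  con <- file(path, open = "w", encoding = "UTF-8")
  on.exit(close(con))                      # always close the connection
  n <- length(docs)
  if (n == 0L) return(invisible(0L))
  for (start in seq(1L, n, by = chunk_size)) {
    idx <- start:min(start + chunk_size - 1L, n)
    lines <- vapply(idx, function(i) {
      # Join a multi-line document, then flatten any embedded newlines
      # so that each document occupies exactly one line.
      txt <- paste(as.character(docs[[i]]), collapse = " ")
      gsub("[\r\n]+", " ", txt)
    }, character(1))
    writeLines(lines, con)                 # appends to the open connection
  }
  invisible(n)
}

# Stand-in data; assumption: with tm you would pass myCorpus here instead.
docs <- list("first small document",
             c("second document", "on two lines"),
             "third")
out <- tempfile(fileext = ".txt")
write_corpus_one_file(docs, out, chunk_size = 2L)
```

Only one chunk of text is held as a character vector at a time, and only one file is created, so this avoids the million-tiny-files problem from writeCorpus.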

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow