Reading .doc, .docx, .pdf, and .rtf documents in .NET without Word
Question
So far the only option I have found is Aspose.Words, which is very pricey.
The other options are to convert the documents to .pdf or to print them to .pdf.
I am looking for a way to read the contents of these document types without installing Office or a PDF application, i.e. to get the text of these documents for parsing.
Solution
DevExpress offers a document server component now, which is far less pricey than Aspose.
OTHER TIPS
You want to use components that plug into the IFilter framework, which is what Windows uses to index documents for its text search.
For Office documents, you can use the Microsoft Office 2010 Filter Pack. For PDF, you can use a commercial offering such as Foxit IFilter, which seems reasonably priced.
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow