Question

A vendor is providing content which needs to be inserted into the db. The content is basically questions with options and explanations. An example is below.

=========================================

1) What is the capital of United Kingdom?

1] London 2]Paris 3]Berlin 4] Edinburgh

Solution: Blah Blah Blah

Answer: Option 1

==========================================

There are hundreds of questions in the above format, and the vendor is supplying them in .doc or .docx format. All of these questions need to be entered into the database, and I have to automate the process so that the data is read from the Word document and inserted into the db.

What is the best way to go about it? I prefer using C#, and I already have code that takes custom objects and inserts them into the relevant tables. Now all I want is to read the Word document and populate those objects. Any pointers would be helpful.

Thank you for your time!

Cheers


Solution

You need to reference and use the COM object "Microsoft Word x.x object library" where x.x is some version depending on the version of Office that you are using.

You then need to use a Word.ApplicationClass to open the Word document and access its data. It is often suggested that you copy the entire Word document to the clipboard and then access it from there.

Something like:

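// Needs: using System.Windows.Forms; using Word = Microsoft.Office.Interop.Word;
Word.ApplicationClass wordApp = new Word.ApplicationClass();
object file = filepath;
object nullobj = System.Reflection.Missing.Value;
// Open the document; every optional argument is passed as Missing.Value.
Word.Document doc = wordApp.Documents.Open(ref file, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj);
// Select the whole document and copy it to the clipboard.
doc.ActiveWindow.Selection.WholeStory();
doc.ActiveWindow.Selection.Copy();
// Read the copied text back from the clipboard as plain text.
IDataObject data = Clipboard.GetDataObject();
txtFileContent.Text = data.GetData(DataFormats.Text).ToString();
// Close the document, then quit Word so no WINWORD.EXE process is left running.
doc.Close(ref nullobj, ref nullobj, ref nullobj);
wordApp.Quit(ref nullobj, ref nullobj, ref nullobj);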
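
The snippet above only gets the raw text out of the document; it still has to be turned into objects. A minimal sketch of that step follows, assuming every question looks like the example in the question: an "N) text" line, one line of "N] option" items, a "Solution:" line and an "Answer: Option N" line, with the numbers present as literal text in the extracted string. QuizQuestion and QuizParser are hypothetical names, not part of any library; substitute the custom objects your insert code already accepts.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed question.
public class QuizQuestion
{
    public int Number;
    public string Text = "";
    public List<string> Options = new List<string>();
    public string Solution = "";
    public int AnswerOption;
}

public static class QuizParser
{
    // "1) What is ...?" starts a new question.
    static readonly Regex QuestionLine = new Regex(@"^(\d+)\)\s*(.+)$");
    // One "N] text" item; the text runs up to the next "N]" or the end of the line.
    static readonly Regex OptionItem = new Regex(@"\d+\]\s*(.*?)\s*(?=\d+\]|$)");
    static readonly Regex SolutionLine = new Regex(@"^Solution:\s*(.*)$", RegexOptions.IgnoreCase);
    static readonly Regex AnswerLine = new Regex(@"^Answer:\s*Option\s*(\d+)", RegexOptions.IgnoreCase);

    public static List<QuizQuestion> Parse(string text)
    {
        var result = new List<QuizQuestion>();
        QuizQuestion current = null;
        // Split on carriage returns, line feeds and vertical tabs, dropping empty lines.
        char[] breaks = { '\r', '\n', '\v' };
        foreach (string raw in text.Split(breaks, StringSplitOptions.RemoveEmptyEntries))
        {
            string line = raw.Trim();
            Match m;
            if ((m = QuestionLine.Match(line)).Success)
            {
                current = new QuizQuestion
                {
                    Number = int.Parse(m.Groups[1].Value),
                    Text = m.Groups[2].Value
                };
                result.Add(current);
            }
            else if (current == null)
            {
                continue; // ignore anything before the first question
            }
            else if ((m = AnswerLine.Match(line)).Success)
            {
                current.AnswerOption = int.Parse(m.Groups[1].Value);
            }
            else if ((m = SolutionLine.Match(line)).Success)
            {
                current.Solution = m.Groups[1].Value;
            }
            else if (Regex.IsMatch(line, @"^\d+\]"))
            {
                foreach (Match o in OptionItem.Matches(line))
                    current.Options.Add(o.Groups[1].Value);
            }
            // Separator rows and other unrecognised lines are skipped.
        }
        return result;
    }
}
```

Calling QuizParser.Parse(txtFileContent.Text) would then return one QuizQuestion per question, ready to be mapped onto your own objects. The sketch does not handle solutions that span several paragraphs or option text that itself contains something like "2]"; real vendor files may need looser patterns.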
Licensed under: CC-BY-SA with attribution