Question

Is there a way to iterate through a document and remove all <w:p /> elements that don't have any runs? I am trying to remove paragraphs that look something like this:

<w:p>
    <w:pPr>
        <w:pStyle w:val="Heading1" />
        <w:numPr>
            <w:ilvl w:val="0" />
            <w:numId w:val="0" />
        </w:numPr>
        <w:ind w:left="432" />
    </w:pPr>
</w:p>

Here is what I have so far, but it only removes empty <w:p /> elements.

foreach (Paragraph P in D.Descendants<Paragraph>().Where(x => !x.HasChildren).ToList())
{
    P.Remove();
}

Solution

You can use this:

// Select every paragraph that contains no runs at any depth, then remove it.
foreach (Paragraph P in D.Descendants<Paragraph>()
         .Where(o => !o.Descendants<Run>().Any()).ToList())
{
    P.Remove();
}

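But keep in mind that if your document has sections, this may cause problems: the section properties (<w:sectPr>) of every section except the last are stored inside the <w:pPr> of that section's final paragraph, so removing such a paragraph also removes the section break. Check this for more information: http://msdn.microsoft.com/en-us/library/documentformat.openxml.wordprocessing.sectionproperties(v=office.14).aspx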
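One way to sidestep the section problem is to skip any paragraph that carries a <w:sectPr>. The following is only a minimal sketch, assuming the Open XML SDK (DocumentFormat.OpenXml) is referenced; the names `ParagraphCleaner` and `RemoveRunlessParagraphs` are made up for illustration and are not part of the SDK.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

public static class ParagraphCleaner
{
    // Hypothetical helper: removes paragraphs that contain no runs, but keeps
    // any paragraph holding a <w:sectPr>, because that element marks a section break.
    // Returns the number of paragraphs removed.
    public static int RemoveRunlessParagraphs(Body body)
    {
        var targets = body.Descendants<Paragraph>()
            .Where(p => !p.Descendants<Run>().Any())
            .Where(p => !p.Descendants<SectionProperties>().Any())
            .ToList(); // snapshot first, so removal does not disturb the enumeration

        foreach (var p in targets)
        {
            p.Remove();
        }
        return targets.Count;
    }
}
```

It could be called with the `MainDocumentPart.Document.Body` of a `WordprocessingDocument` opened for editing, followed by saving the document.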

OTHER TIPS

I would load the XML into an XmlDocument and then use XPath to find the paragraphs:

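using System.Linq;
using System.Xml;

XmlDocument doc = new XmlDocument();
doc.Load(@"C:\Path\To\Xml\File.xml");

// WordprocessingML elements are in the "w" namespace, so XPath needs a namespace manager.
XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("w", "http://schemas.openxmlformats.org/wordprocessingml/2006/main");

// Copy the matches to a list first so removing nodes does not disturb the iteration.
foreach (XmlNode p in doc.SelectNodes("//w:p", ns).Cast<XmlNode>().ToList())
{
    // ".//w:r" looks only inside this paragraph; "//w:r" would search the whole document.
    if (p.SelectSingleNode(".//w:r", ns) == null)
    {
        // Remove from the actual parent; a paragraph is not a direct child of the root.
        p.ParentNode.RemoveChild(p);
    }
}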

That code is untested, although it does compile. Let me know if it doesn't work for you, and try searching for more on XML parsing!
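The same idea can also be written with LINQ to XML instead of XmlDocument. This is just a rough sketch using only System.Xml.Linq; the class and method names are hypothetical, and it assumes the input is the WordprocessingML part (for example document.xml extracted from the .docx package), not the .docx file itself.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class XmlParagraphCleaner
{
    // The WordprocessingML main namespace, usually bound to the "w" prefix.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: removes every <w:p> that has no <w:r> descendant
    // and returns how many paragraphs were removed.
    public static int RemoveRunlessParagraphs(XDocument doc)
    {
        var runless = doc.Descendants(W + "p")
            .Where(p => !p.Descendants(W + "r").Any())
            .ToList(); // snapshot before modifying the tree

        foreach (var p in runless)
        {
            p.Remove();
        }
        return runless.Count;
    }
}
```

A caller would load the part with `XDocument.Load`, run the helper, and write the result back with `Save`. Note that this version does not check for <w:sectPr>, so the section caveat from the accepted solution still applies.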

I'm using LINQ, which can do this more concisely.

/* parent is the document body */
parent.Descendants<Paragraph>()
      .Where(p => !p.Descendants<Run>().Any())
      .ToList()                  // snapshot first, so removal does not disturb the enumeration
      .ForEach(p => p.Remove());

Hope this helps. Cheers.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow