Is there any Java library (maybe POI?) that can merge docx files? [closed]

StackOverflow https://stackoverflow.com/questions/2494549

  •  21-09-2019

Question

I need to write a Java application that can merge docx files. Any suggestions?

Solution

The following Java APIs are available to handle OpenXML MS Word documents with Java:

There was one more, but I don't recall the name anymore.

As for your functional requirement: merging two documents so that the result looks the way an end user would expect is technically tricky, and most APIs don't offer it directly. You'll need to extract the desired content from the two documents and then build a new document from it yourself.
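For the "extract" half of that, it helps to know that a .docx file is a ZIP archive and that the main body XML conventionally lives in the entry word/document.xml. Here is a rough sketch using only the JDK; DocxPeek and readMainDocument are names made up for this example, and it assumes the part is UTF-8 encoded. A real merge still has to deal with styles, numbering, images and relationships, which this does not touch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of POI or docx4j.
final class DocxPeek {

    // Returns the raw XML of word/document.xml, or null if the archive has no such entry.
    // Assumption: the entry is UTF-8 encoded, which is the usual case.
    static String readMainDocument(InputStream docx) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    int n;
                    // read() returns -1 at the end of the current entry.
                    while ((n = zip.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }
}
```

Calling DocxPeek.readMainDocument(new FileInputStream("a.docx")) gives you the XML as a string, which you can then inspect or feed to an XML parser before building the new document with one of the libraries.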

OTHER TIPS

With POI my solution is:

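import java.io.InputStream;
import java.io.OutputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.xmlbeans.XmlOptions;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTBody;

// Appends the body of src2 to the body of src1 and writes the merged document to dest.
public static void merge(InputStream src1, InputStream src2, OutputStream dest) throws Exception {
    OPCPackage src1Package = OPCPackage.open(src1);
    OPCPackage src2Package = OPCPackage.open(src2);
    XWPFDocument src1Document = new XWPFDocument(src1Package);
    CTBody src1Body = src1Document.getDocument().getBody();
    XWPFDocument src2Document = new XWPFDocument(src2Package);
    CTBody src2Body = src2Document.getDocument().getBody();
    appendBody(src1Body, src2Body);
    src1Document.write(dest);
}

// Splices the content of 'append' in just before the closing tag of 'src'.
private static void appendBody(CTBody src, CTBody append) throws Exception {
    XmlOptions optionsOuter = new XmlOptions();
    optionsOuter.setSaveOuter();
    String appendString = append.xmlText(optionsOuter);
    String srcString = src.xmlText();
    // Opening tag of the source body, including its namespace declarations.
    String prefix = srcString.substring(0, srcString.indexOf(">") + 1);
    // Existing content of the source body.
    String mainPart = srcString.substring(srcString.indexOf(">") + 1, srcString.lastIndexOf("<"));
    // Closing tag of the source body.
    String suffix = srcString.substring(srcString.lastIndexOf("<"));
    // Content of the appended body, without its own outer tags.
    String addPart = appendString.substring(appendString.indexOf(">") + 1, appendString.lastIndexOf("<"));
    CTBody makeBody = CTBody.Factory.parse(prefix + mainPart + addPart + suffix);
    src.set(makeBody);
}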
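If the substring calls in appendBody are hard to follow, here is the same splice on plain strings, runnable without POI on the classpath. BodySplice, inner and splice are names invented for this sketch, and the sample XML is heavily simplified compared to what a real document body contains.

```java
// Hypothetical helper for illustration; it mirrors the string handling in appendBody.
final class BodySplice {

    // Text between the first '>' and the last '<', i.e. the element's content
    // without its outer opening and closing tags.
    static String inner(String xml) {
        return xml.substring(xml.indexOf('>') + 1, xml.lastIndexOf('<'));
    }

    // Keeps the outer tags of 'src' and puts the content of 'append' after the content of 'src'.
    static String splice(String src, String append) {
        String prefix = src.substring(0, src.indexOf('>') + 1); // opening tag of src
        String suffix = src.substring(src.lastIndexOf('<'));    // closing tag of src
        return prefix + inner(src) + inner(append) + suffix;
    }

    public static void main(String[] args) {
        String first = "<w:body><w:p>one</w:p></w:body>";
        String second = "<w:body><w:p>two</w:p></w:body>";
        // prints <w:body><w:p>one</w:p><w:p>two</w:p></w:body>
        System.out.println(splice(first, second));
    }
}
```

Note that this only works when the element has separate opening and closing tags. A self-closing element such as <w:body/> makes the substring call throw a StringIndexOutOfBoundsException, so an empty body would need a check first.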

With Docx4j my solution is:

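import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.io.IOUtils;
import org.docx4j.jaxb.Context;
import org.docx4j.openpackaging.contenttype.ContentType;
import org.docx4j.openpackaging.io.SaveToZipFile;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.PartName;
import org.docx4j.openpackaging.parts.WordprocessingML.AlternativeFormatInputPart;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.relationships.Relationship;
import org.docx4j.wml.CTAltChunk;

public class MergeDocx {
    // Counter used to give each embedded part a unique name.
    private static long chunk = 0;
    private static final String CONTENT_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    // Embeds the second document into the first as an altChunk and writes the result to os.
    public void mergeDocx(InputStream s1, InputStream s2, OutputStream os) throws Exception {
        WordprocessingMLPackage target = WordprocessingMLPackage.load(s1);
        insertDocx(target.getMainDocumentPart(), IOUtils.toByteArray(s2));
        SaveToZipFile saver = new SaveToZipFile(target);
        saver.save(os);
    }

    private static void insertDocx(MainDocumentPart main, byte[] bytes) throws Exception {
        // Store the whole second .docx as a binary part inside the target package.
        AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/part" + (chunk++) + ".docx"));
        afiPart.setContentType(new ContentType(CONTENT_TYPE));
        afiPart.setBinaryData(bytes);
        Relationship altChunkRel = main.addTargetPart(afiPart);

        // Reference that part from the body; the consuming application (e.g. Word)
        // imports the embedded content when it opens the file.
        CTAltChunk altChunk = Context.getWmlObjectFactory().createCTAltChunk();
        altChunk.setId(altChunkRel.getId());

        main.addObject(altChunk);
    }
}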

It sure looks like POI can work with docx files. Are you trying to figure out how to merge them?

How to extract plain text from a DOCX file using the new OOXML support in Apache POI 3.5?

The Aspose API is the best I have found so far for merging Word doc or docx files, but it is neither free nor open source. If you need a free, open-source tool, there are a couple of APIs to choose from; you can find a review of them here:

http://www.esupu.com/open-source-office-document-java-api-review/

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow