Question

Python Docx is a pretty good library for generating Microsoft Word documents for something that doesn't directly deal with all of the COM stuff. Nonetheless, I'm running into some limitations.

  • Does anyone have any idea how one would put a carriage return in a string of text?

I want a paragraph to have multiple lines without there being extra space between them. However, writing out a string that separates the lines with the usual \n is not working. Nor is using &#10 or &#13. Any other thoughts, or is this framework too limited for something like that?

Was it helpful?

Solution

I'm not sure if this is possible. It looks as though Word is in fact treating presses of the enter key (I am treating this action as a sort of programmatic equivalent of "\r\n" and "\n") as the creation of a new paragraph.


If I record a macro in Word that consists of:

  1. Typing the text "One"
  2. Pressing the enter key

I get VBA of:

Selection.TypeText Text:="One"
Selection.TypeParagraph

If I create a Word document that looks like this (pressing enter after each word):

One

Two

Three

The body of that document looks like this in the documents.xml file:

<w:body>
    <w:p w:rsidR="00BE37B0" w:rsidRDefault="00CF2350">
        <w:r>
            <w:t>One</w:t>
        </w:r>
    </w:p>
    <w:p w:rsidR="00CF2350" w:rsidRDefault="00CF2350">
        <w:r>
            <w:t>Two</w:t>
        </w:r>
    </w:p>
    <w:p w:rsidR="00CF2350" w:rsidRDefault="00CF2350">
        <w:r>
            <w:t>Three</w:t>
        </w:r>
    </w:p>
    <w:sectPr w:rsidR="00CF2350" w:rsidSect="001077CC">
        <w:pgSz w:w="11906" w:h="16838"/>
        <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="708" w:footer="708" w:gutter="0"/>
        <w:cols w:space="708"/>
        <w:docGrid w:linePitch="360"/>
    </w:sectPr>
</w:body>

From MSDN we can see that the <w:p> element represents a paragraph.


I think the solution to this would be to follow the example in Python Docx:

body.append(paragraph("Hi."))
body.append(paragraph("My name is Alice."))
body.append(paragraph("Let's code"))

Or:

for paragraph_text in "Hi. \nMy name is Alice.\n Let's code".split("\n"):
    body.append(paragraph(paragraph_text.strip()))

Edit:

Looking into this some more, if you press Shift + Enter in Word it adds a manual line break (not a paragraph) via appending Chr(11). In Open XML, this translates to a Break.

Looking at the docx.py file of Python Docx, something like this might be the way to go (disclaimer: not tested):

for text in "Hi. \nMy name is Alice.\n Let's code".split("\n"):
    run = makeelement('r')
    run.append(makeelement('t', tagtext=text))
    run.append(makeelement('br'))
    body.append(run)

OTHER TIPS

You can achieve your carriage return using python-docx by calling add_break() on your run. For example:

doc = Document()
p = doc.add_paragraph()
run = p.add_run()
run.add_break()

python-docx reference

As of v0.7.2, python-docx translates '\n' and '\r' characters in a string to <w:br/> elements, which provides the behavior you describe. It also translates '\t' characters into <w:tab/> elements.

This behavior is available for strings provided to:

  • Document.add_paragraph()
  • Paragraph.add_run()

and for strings assigned to:

  • Paragraph.text
  • Run.text
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top