Calling pandoc from python using subprocess.Popen

https://stackoverflow.com//questions/12649908

11-12-2019

Question

I am having problems calling pandoc from Python using subprocess.Popen. It all works in the console. Here is the code.

# Test markdown file
here is just a simple markdown file.

Now my Python code, where filename is the full path to my markdown file:

import os
import subprocess

# Build the output path by swapping the extension for .pdf
fileout = os.path.splitext(filename)[0] + ".pdf"
args = ['pandoc', filename, '-o', fileout]
subprocess.Popen(args)

I also tried various ways to capture an error but that didn't work. In the console, however, everything is running fine:

pandoc '[filename]' -o '[fileout]'

Solution

That should work just fine, but you may want to wait for it to finish using subprocess.check_call rather than subprocess.Popen directly:

subprocess.check_call(args)

This also makes sure that it completed successfully. If the status code isn't 0, it raises a subprocess.CalledProcessError exception.
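A minimal sketch of how the check_call approach might be wrapped so that a failure is reported rather than ending the script. The helper name run_checked is hypothetical, and the file names in the usage comment are placeholders, not taken from the question.

```python
import subprocess


def run_checked(args):
    """Run a command and return True on success, False otherwise (hypothetical helper)."""
    try:
        # Blocks until the command finishes; raises on a non-zero exit status.
        subprocess.check_call(args)
    except subprocess.CalledProcessError as err:
        print("command failed with exit status", err.returncode)
        return False
    except OSError as err:
        # Raised when the executable cannot be found or started,
        # for example if pandoc is not on the PATH.
        print("could not start command:", err)
        return False
    return True


# Example usage (placeholder file names):
# run_checked(['pandoc', 'notes.md', '-o', 'notes.pdf'])
```

Catching OSError as well is a design choice: a script run from an IDE or a service may not see the same PATH as the console, which is one plausible reason a command works in one place and not the other.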

OTHER TIPS

This doesn't answer your question (and you may specifically want/need to call pandoc using subprocess.Popen) but there is a Python wrapper for Pandoc called Pyandoc: see my answer here.

I don't really like using PIPE: it's more complicated, and the Python docs on subprocess recommend not using it unless necessary (see section 17.1.1).

This works for me (taken from Markx).

filename is the name of the markdown file without .md, and extension is the desired output extension (.pdf, .docx):

import subprocess

def pandoc(filename, extension):
    # TODO manage pandoc errors, for example exit status 43 when citations include Snigowski et al. 2000
    options = ['pandoc', filename + '.md', '-o', filename + extension]
    options += ['--ascii', '-s', '--toc']  # some extra options
    options += ['--variable=geometry:' + 'a4paper']  # to override the default letter size
    print(options)  # for debugging; this form works in Python 2 and 3
    return subprocess.check_call(options)

If there was a problem, an exception is raised. If you want to get the status code instead of an exception, I think you should replace check_call with call, but see the docs.
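As a rough sketch of the call variant mentioned above: subprocess.call waits for the command and returns its exit status rather than raising. The helper name pandoc_status and the file names in the usage comment are hypothetical.

```python
import subprocess


def pandoc_status(args):
    """Return the command's exit status instead of raising (hypothetical helper)."""
    # subprocess.call blocks until the command finishes and returns its return code.
    return subprocess.call(args)


# Example usage (placeholder file names):
# status = pandoc_status(['pandoc', 'notes.md', '-o', 'notes.pdf'])
# if status != 0:
#     print("pandoc exited with status", status)
```

Note that call still raises OSError if the executable cannot be started at all, so only non-zero exit statuses are turned into return values.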

If you want to use citations see my original implementation from the Markx project with the bibliography option.

If you want to capture the stdout and stderr resulting from the Popen call, you'll need to use PIPE in conjunction with communicate().

import os
from subprocess import Popen, PIPE

fileout = os.path.splitext(filename)[0] + ".pdf"
args = ['pandoc', filename, '-o', fileout]
# communicate() waits for the process to exit and returns the captured output
stdout, stderr = Popen(args, stdout=PIPE, stderr=PIPE).communicate()
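Building on the snippet above, here is a small sketch that also keeps the return code and decodes the captured bytes, which may help when trying to see why pandoc failed. The helper name run_and_capture is hypothetical, and it assumes Python 3, where the pipes yield bytes.

```python
from subprocess import Popen, PIPE


def run_and_capture(args):
    """Run a command; return (returncode, stdout_text, stderr_text). Hypothetical helper."""
    proc = Popen(args, stdout=PIPE, stderr=PIPE)
    # communicate() reads both pipes until EOF and waits for the process to exit,
    # which avoids the deadlock that can occur when reading the pipes one at a time.
    out, err = proc.communicate()
    # Decode leniently so unexpected bytes in the output do not raise.
    return proc.returncode, out.decode(errors="replace"), err.decode(errors="replace")


# Example usage (placeholder file names):
# code, out, err = run_and_capture(['pandoc', 'notes.md', '-o', 'notes.pdf'])
# if code != 0:
#     print("pandoc failed:", err)
```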

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow