subprocess isn't outputting anything
20-09-2019
Question
I'm trying to use Python to run pdftotext, but for some reason my code isn't working. When I run the code below, I expect the content variable to contain the text of the PDF, but all I get is an empty string.
Does anybody know what I'm missing?
import subprocess

def getPDFContent(path):
    path = "/path/to/a valid/pdffile.pdf"
    process = subprocess.Popen(["pdftotext", path], shell=False,
                               stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    content, err = process.communicate()[0:2]
    return content, err
Solution
By default, pdftotext doesn't write anything to stdout; instead it creates a .txt file with the same base name as the PDF. To get the text on stdout, pass - as the second argument (the output file name) in the call to pdftotext:
process = subprocess.Popen(["pdftotext", path, "-"], shell=False,
                           stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
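For completeness, here is a minimal sketch of what the whole function might look like with that fix applied. It assumes Python 3.5+ (for subprocess.run) and that pdftotext is on the PATH. The names build_pdftotext_command and get_pdf_content are made up for illustration, and decoding the output as UTF-8 is an assumption about pdftotext's default encoding.

```python
import subprocess


def build_pdftotext_command(pdf_path):
    # "-" in place of the output file name makes pdftotext write the
    # extracted text to stdout instead of creating a .txt file.
    return ["pdftotext", pdf_path, "-"]


def get_pdf_content(pdf_path):
    # Keep stderr separate from stdout so that warnings or error
    # messages do not end up mixed into the extracted text.
    result = subprocess.run(
        build_pdftotext_command(pdf_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # Assumption: the output is UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")
```

Calling get_pdf_content("/path/to/a valid/pdffile.pdf") should then return the text as a str. Because the command is passed as a list with shell=False (the default), the space in the path needs no quoting.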
Licensed under: CC-BY-SA with attribution