pdftotext with external URLs (PHP)

https://stackoverflow.com/questions/12999825

PHP
shell-exec
pdftotext

13-07-2021

Question

I want to make PDFs from external URLs searchable. I'm using pdftotext from XPDF. It's working fine with PDFs already on my webspace, but I keep getting an error message when trying to use external PDFs instead. Specifically I get:

"Error: Couldn't open file 'https://www.vericoa.com/sandbox/test2.pdf' "

Here is my code:

$path = 'https://www.vericoa.com/sandbox/test2.pdf'; 

echo shell_exec('pdftotext -enc UTF-8 '.$path.' pdf.txt 2>&1');  

$file = file_get_contents('pdf.txt');

echo $file;

Is it even possible to extract text from external PDF sources? Are there any alternatives? (I spent the last few hours searching but found nothing.)

Thanks in advance, Matthias

Solution

pdftotext expects a path to a local file, so it cannot open a URL directly. You could try downloading the external PDF in PHP, saving it to a file, and passing that file's path to pdftotext.
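
A minimal sketch of that approach might look like the following. The function names `buildPdftotextCommand` and `extractTextFromPdfUrl` are hypothetical and not part of the original post. The sketch assumes that `allow_url_fopen` is enabled (so `file_get_contents()` can read an https URL), that `shell_exec()` is permitted, and that `pdftotext` is on the PATH of a Unix-like server.

```php
<?php
// Hypothetical helper: builds the pdftotext command line.
// escapeshellarg() keeps spaces or shell metacharacters in the paths harmless.
function buildPdftotextCommand(string $pdfPath, string $txtPath): string
{
    return 'pdftotext -enc UTF-8 ' . escapeshellarg($pdfPath)
        . ' ' . escapeshellarg($txtPath) . ' 2>&1';
}

// Hypothetical helper: downloads a remote PDF to a temporary local file,
// runs pdftotext on that file, and returns the extracted text.
function extractTextFromPdfUrl(string $url): string
{
    // Assumption: allow_url_fopen is enabled, otherwise this returns false.
    $pdfData = file_get_contents($url);
    if ($pdfData === false) {
        throw new RuntimeException('Could not download ' . $url);
    }

    $pdfPath = tempnam(sys_get_temp_dir(), 'pdf');
    if ($pdfPath === false) {
        throw new RuntimeException('Could not create a temporary file');
    }
    $txtPath = $pdfPath . '.txt';

    try {
        file_put_contents($pdfPath, $pdfData);

        // pdftotext now receives a local path instead of a URL.
        $output = shell_exec(buildPdftotextCommand($pdfPath, $txtPath));
        if (!is_file($txtPath)) {
            throw new RuntimeException('pdftotext failed: ' . $output);
        }

        $text = file_get_contents($txtPath);
        if ($text === false) {
            throw new RuntimeException('Could not read ' . $txtPath);
        }
        return $text;
    } finally {
        // Remove the temporary files whether or not the conversion succeeded.
        @unlink($pdfPath);
        @unlink($txtPath);
    }
}
```

With the `$path` from the question, `echo extractTextFromPdfUrl($path);` would then print the extracted text. If `allow_url_fopen` is disabled on the host, the download step could be done with the cURL extension instead.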

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow