Problems with calling unoconv from a Django app running in a virtualenv

Question 1

Since the app is running in a virtualenv, unoconv is being called with the virtualenv's Python interpreter instead of the system one, and that interpreter cannot see the system-wide uno modules.

The fix is pretty simple if you have virtualenvwrapper: just call the add2virtualenv command with the path to the directory containing uno.py and unohelper.py as the argument (/usr/share/pyshared in my case).
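
If you don't have virtualenvwrapper, a similar effect can probably be had with a plain .pth file, since Python adds every existing directory listed in a site-packages .pth file to sys.path at startup. This is only a rough sketch: the function name and the pyuno_extra.pth file name are made up for illustration, and you need to run it with the virtualenv's interpreter so that it picks up the right site-packages.

```python
import os
import sysconfig


def add_path_to_site_packages(extra_dir, site_packages=None,
                              pth_name="pyuno_extra.pth"):
    """Write a .pth file so that extra_dir is appended to sys.path at startup.

    pth_name is an arbitrary, hypothetical file name; any *.pth name works.
    """
    if site_packages is None:
        # "purelib" is the site-packages directory of the running interpreter
        # (the virtualenv's one, if the virtualenv is active).
        site_packages = sysconfig.get_paths()["purelib"]
    pth_file = os.path.join(site_packages, pth_name)
    with open(pth_file, "w") as handle:
        handle.write(os.path.abspath(extra_dir) + "\n")
    return pth_file
```

Calling add_path_to_site_packages('/usr/share/pyshared') once should be enough; the directory is then picked up the next time the interpreter starts.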

Question 2

Are you sure that you absolutely need unoconv for your use case? It is powerful, but since it needs a full-fledged LibreOffice to run, it is: 1) somewhat slow to convert files; 2) slow to start; 3) hungry for RAM; 4) not very scalable.

Why don't you try Apache Tika (which uses Apache POI to parse Microsoft Office formats)? It is somewhat more lightweight and more than good enough for most day-to-day tasks.

Launch Tika to process PDF files too, or use magic to distinguish between file types and go with a separate pdftotext utility or something similar. Here's a simplified version of what you can use to convert office files to, let's say, text:

import subprocess

import magic  # https://github.com/ahupp/python-magic
from django.db import models

PDFTOTEXT_COMMAND = '/usr/bin/pdftotext'
JAVA_COMMAND = '/usr/bin/java'
TIKA_PATH = '/path/to/tika.jar'
PDFTOTEXT_OPTIONS = ['-']  # "-" as the output file makes pdftotext write to stdout
JAVA_OPTIONS = ['-jar', TIKA_PATH, '--text']

mime = magic.Magic(mime=True)


class UploadedFileModel(models.Model):
    file = models.FileField(upload_to='files/')

    def get_txt(self):
        """Return the extracted text as bytes, or None if the converter failed."""
        path = self.file.path
        if 'application/pdf' in mime.from_file(path):
            # PDFs go to pdftotext: pdftotext <file> -
            command = [PDFTOTEXT_COMMAND, path] + PDFTOTEXT_OPTIONS
        else:
            # Everything else goes to Tika: java -jar tika.jar --text <file>
            command = [JAVA_COMMAND] + JAVA_OPTIONS + [path]

        pipe = subprocess.Popen(command, stdout=subprocess.PIPE)
        txt = pipe.communicate()[0]
        if pipe.returncode:
            return None
        return txt

P.S. The error "unoconv: Cannot find a suitable pyuno library and python binary combination" can be caused by a wide range of issues. It is impossible to tell for sure unless you provide additional information. For example, it could be a problem with paths.
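
To narrow down the path theory, you could run something like the following with the same interpreter that your Django app uses. It is just a quick diagnostic sketch (the function name is mine, not part of unoconv): it only checks whether uno and unohelper can be located on sys.path, and it does not prove that the underlying pyuno library will actually load.

```python
import importlib.util
import sys


def find_missing_modules(names=("uno", "unohelper")):
    """Return the module names that the current interpreter cannot locate."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Shows which interpreter is actually in use (system or virtualenv).
    print("Interpreter:", sys.executable)
    missing = find_missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

If both modules are reported as missing inside the virtualenv but not with the system Python, it is most likely the path issue described above.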

Be sure to check out the relevant unoconv troubleshooting guides.

Question 3

Just try setting these in your Linux terminal (after activating the environment):

URE_BOOTSTRAP=vnd.sun.star.pathname:/usr/lib64/libreoffice/program/fundamentalrc
UNO_PATH=/usr/lib64/libreoffice/program
PATH=/usr/lib64/libreoffice/program:/home/graaff/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.3:/opt/android-sdk-update-manager/tools:/opt/android-sdk-update-manager/platform-tools:/usr/games/bin

Or at least try setting UNO_PATH and PATH.
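
Since unoconv is being called from a Django app rather than from a shell, the same variables can also be passed to the subprocess directly. A rough sketch, assuming LibreOffice lives in /usr/lib64/libreoffice/program as above (adjust for your distro); libreoffice_env and convert_to_pdf are illustrative names, not part of unoconv:

```python
import os
import subprocess


def libreoffice_env(program_dir, base_env=None):
    """Return a copy of the environment with the LibreOffice variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["UNO_PATH"] = program_dir
    env["URE_BOOTSTRAP"] = (
        "vnd.sun.star.pathname:" + program_dir + "/fundamentalrc"
    )
    # Put the LibreOffice program directory in front of the existing PATH.
    old_path = env.get("PATH", "")
    env["PATH"] = program_dir + (os.pathsep + old_path if old_path else "")
    return env


def convert_to_pdf(path, program_dir="/usr/lib64/libreoffice/program"):
    """Run "unoconv -f pdf <path>" and return its exit code."""
    return subprocess.call(
        ["unoconv", "-f", "pdf", path],
        env=libreoffice_env(program_dir),
    )
```

That way the settings don't depend on how the web server process was started.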

Question 4

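I had this problem when using a virtual environment. Copying uno.py and unohelper.py from the system packages into the environment's site-packages fixed it for me (adjust the paths and Python version to match your setup):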

cp /usr/lib/python3/dist-packages/unohelper.py /path/to/env/lib/python3.6/site-packages/
cp /usr/lib/python3/dist-packages/uno.py /path/to/env/lib/python3.6/site-packages/