PyPdf Merge error

https://stackoverflow.com/questions/12781994

python
pypdf

05-07-2021
|

سؤال

When i merge several Pdf pages using PyPdf into one single page using mergeTranslatedPage, i got some unknown characters, these unknown squares are the characters not included in the last merged page, after some research i think that the method _merge_ressources not working very well , because the later page could overwrite the ressources of the older pages , i tried page1.compressContentStreams() after each merge but without a result.

in this link you will see an example of the PDF that has been merged and the PDF result.

Any help please

المحلول

The below solutions uses the pdfjam command to merge multiple pdf pages into a single pdf page. It's a very powerful command with many different options and good documentation. I tested the solution on the two files you've provided 4_P7.pdf and 4_P13.pdf. You can view the merged.pdf to verify that all characters are formatted correctly. The code below uses a 2x2 grid by default but you can change that by setting the grid argument when you call merge.

from subprocess import check_output

def merge(inputs, output, grid='2x2'):
    check_output(['pdfjam'] + inputs + ['--nup', grid, '--outfile', output])

merge(['4_P7.pdf', '4_P13.pdf'], 'merged.pdf')

There was a question in the comment below as to whether custom positions can be done as is the case in the questions example file. The same layout that was provided in the question is implemented below. It first constructs the top layout which is a 4x2 layout, then the bottom 2x6 layout, then finally merges these two layouts into final.pdf. The pdfs used in the below example can be found here.

from subprocess import check_output

def merge(inputs, output, grid='2x2'):
    return check_output(['pdfjam'] + inputs + ['--nup', grid, '--outfile', output])

files = ['1.pdf', '2.pdf', '3.pdf', '4.pdf', '1.pdf', '2.pdf', '3.pdf', '4.pdf']
merge(files, 'top.pdf', '4x2')

files = ['1.pdf', '2.pdf', '3.pdf', '4.pdf', '5.pdf', '6.pdf', '1.pdf', '2.pdf',
    '3.pdf', '4.pdf', '5.pdf', '6.pdf']
merge(files, 'bottom.pdf', '2x6')

merge(['top.pdf', 'bottom.pdf'], 'final.pdf', '1x2')

مرخصة بموجب: CC-BY-SA مع الإسناد

لا تنتمي إلى StackOverflow