I suspect the problem is that mediaBox is only a magic accessor for a variable that is shared between p and all of its copies t. An assignment through t.mediaBox therefore gives the mediaBox the same coordinates in all four copies.
The variable behind the mediaBox field is created lazily on the first access to mediaBox, so if you comment out the line (w, h) = p.mediaBox.upperRight, a separate mediaBox variable is created for each t.
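To make the suspected mechanism concrete, here is a minimal sketch that does not use pyPdf at all. The Page class, its box property and the "raw" key are hypothetical stand-ins, on the assumption that the real page object is dict-like and caches its rectangle on first access:

```python
import copy

class Page(dict):
    """Hypothetical dict-like page; a stand-in, not pyPdf's real page class."""

    @property
    def box(self):
        # Lazily build a mutable box from the raw coordinates on first access
        # and cache it in the dict, so later accesses return the same object.
        if "box" not in self:
            self["box"] = list(self["raw"])
        return self["box"]

def corner_after_edit(access_first):
    p = Page(raw=(0, 0, 100, 200))
    if access_first:
        p.box              # caches the box on p before any copy is made
    a = copy.copy(p)       # shallow copy: cached values are shared, not cloned
    b = copy.copy(p)
    a.box[2] = 50          # edit one copy in place
    return b.box[2]        # what does the other copy see?

shared = corner_after_edit(access_first=True)      # 50: b sees a's edit
separate = corner_after_edit(access_first=False)   # 100: b has its own box
```

If the box is touched on p before copying, every shallow copy ends up holding the same cached object. If it is first touched on the copies, each copy builds its own.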
Two possible solutions for automatically determining the page dimensions:
Get the dimensions after making the copy:
    for p in [input.getPage(i) for i in range(0, input.getNumPages())]:
        for j in range(0, 4):
            t = copy.copy(p)
            # Read the size from the copy, so the lazily created mediaBox belongs to t.
            (w, h) = t.mediaBox.upperRight
            t.mediaBox.lowerLeft = (ifel(j%2==1, w/2, 0), ifel(j<2, h/2, 0))
            t.mediaBox.upperRight = (ifel(j%2==0, w/2, w), ifel(j>1, h/2, h))
            output.addPage(t)
Instantiate fresh RectangleObjects to use for the mediaBox variables:
    for p in [input.getPage(i) for i in range(0, input.getNumPages())]:
        (w, h) = p.mediaBox.upperRight
        for j in range(0, 4):
            t = copy.copy(p)
            # RectangleObject takes a single four-element list: [llx, lly, urx, ury].
            t.mediaBox = pyPdf.generic.RectangleObject([
                ifel(j%2==1, w/2, 0),
                ifel(j<2, h/2, 0),
                ifel(j%2==0, w/2, w),
                ifel(j>1, h/2, h)])
            output.addPage(t)
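In case the ifel expressions are hard to read, the quadrant arithmetic can be checked on its own. This is only a sketch: quadrant_box is a hypothetical helper, it assumes ifel(cond, a, b) means "a if cond else b", and it assumes the page origin is at the bottom-left corner:

```python
def quadrant_box(j, w, h):
    """Return (llx, lly, urx, ury) for quadrant j of a w-by-h page.

    Hypothetical helper: j = 0, 1 are the top-left and top-right quadrants,
    j = 2, 3 the bottom-left and bottom-right ones.
    """
    llx = w / 2 if j % 2 == 1 else 0   # right-hand quadrants start at mid-width
    lly = h / 2 if j < 2 else 0        # top quadrants start at mid-height
    urx = w / 2 if j % 2 == 0 else w   # left-hand quadrants end at mid-width
    ury = h / 2 if j > 1 else h        # bottom quadrants end at mid-height
    return (llx, lly, urx, ury)

# Boxes for a 100 x 200 page, in the order j = 0, 1, 2, 3.
boxes = [quadrant_box(j, 100, 200) for j in range(4)]
```

For a 100 x 200 page, j = 0 gives the box from (0, 100) to (50, 200), which is the top-left quarter, and j = 3 gives the box from (50, 0) to (100, 100), the bottom-right quarter.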
Using copy.deepcopy() will cause memory issues for large, complex PDFs,