Split large PDF file into single-page PDFs with Python

I'm trying to split a large PDF file into one file per page, for pages 5000 to 6000. The PDF has 7000 pages of text and images and is 250 MB in size. The Python code I have written works for smaller PDF files.

I'm receiving the following errors. The first is RecursionError: maximum recursion depth exceeded.

After setting sys.setrecursionlimit(9999), I get a different error: Process finished with exit code -1073741571 (0xC00000FD). The PDF file is written to my output folder, but it is corrupt and 0 KB in size. Increasing the recursion limit further doesn't help either.
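For reference, the two numbers in that message are the same value: the negative exit code is the signed 32-bit reading of the hexadecimal one. A minimal sketch of the conversion follows (the helper name to_unsigned32 is made up for illustration). On Windows, 0xC00000FD is the NTSTATUS code STATUS_STACK_OVERFLOW, which suggests that the higher recursion limit let the interpreter run past the native stack instead of solving the underlying problem.

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit process exit code as an unsigned value."""
    return code & 0xFFFFFFFF


exit_code = -1073741571  # value reported in the question
print(hex(to_unsigned32(exit_code)))  # prints 0xc00000fd
```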

What could I do? Compress the PDF file and then split?

This is my code:

from PyPDF2 import PdfFileReader, PdfFileWriter

# path is the location of the large input PDF (defined elsewhere)
pdf_file = open(path, 'rb')
pdf_reader = PdfFileReader(pdf_file)
pageNumbers = pdf_reader.getNumPages()

output = PdfFileWriter()

# this is just to test if it works for 1 page
output.addPage(pdf_reader.getPage(5854))

with open("output_path" + "document-output.pdf", "wb") as f:
    output.write(f)
Tags: python, pdf, pypdf2
asked on Stack Overflow Apr 12, 2019 by Jelmer

1 Answer

Sharing what worked for me. I used the wand package (a Python binding for ImageMagick) to pull single pages out of this 7000-page PDF. Note that this saves each page as a JPG image rather than as a PDF.

from wand.image import Image

# Convert one page to JPG; the [5950] suffix selects the page by zero-based index
with Image(filename="C:/Users/Name/Documents/PDFfile.pdf[5950]", resolution=300) as img:
    img.save(filename="C:/Users/Name/Documents/temp1.jpg")
answered on Stack Overflow May 2, 2019 by Jelmer
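
The answer converts a single page, while the question asks for pages 5000 to 6000. A minimal sketch of how the same wand call might be looped over a range is shown here. It is not part of the original answer: the function names page_jobs and export_pages, the 1-based page numbering, and the page-NNNN.jpg file naming are all assumptions made for illustration.

```python
import os


def page_jobs(pdf_path, first_page, last_page, out_dir):
    """Yield (source, target) pairs for pages first_page..last_page.

    Page numbers are assumed to be 1-based and inclusive (hypothetical
    convention chosen for this sketch).
    """
    for page in range(first_page, last_page + 1):
        # ImageMagick's "[n]" suffix selects a page by zero-based index.
        source = "{}[{}]".format(pdf_path, page - 1)
        target = os.path.join(out_dir, "page-{:04d}.jpg".format(page))
        yield source, target


def export_pages(pdf_path, first_page, last_page, out_dir, resolution=300):
    """Save each page in the range as its own JPG file."""
    # Imported here so page_jobs can be used without wand installed.
    from wand.image import Image

    os.makedirs(out_dir, exist_ok=True)
    for source, target in page_jobs(pdf_path, first_page, last_page, out_dir):
        # Open and close one page at a time instead of loading the whole PDF.
        with Image(filename=source, resolution=resolution) as img:
            img.save(filename=target)
```

A call such as export_pages("C:/Users/Name/Documents/PDFfile.pdf", 5000, 6000, "C:/Users/Name/Documents/pages") would then write one JPG per page, assuming wand, ImageMagick and Ghostscript (which ImageMagick uses to read PDFs) are installed.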

User contributions licensed under CC BY-SA 3.0