Extract a single page (or range of pages) from pdf data without loading the whole pdf (which takes too much RAM sometimes)

Tags: ios, pdf, swift

Using PDFKit in Swift, you can open PDF files with PDFDocument. That's easy and works well. But I'm building a custom PDF viewer (for comic book PDFs) that suits my needs, and I've run into one problem: a viewer doesn't need the whole PDF file in memory. I only need a few pages at a time.

Also, the pdfs consist only of images. There's no text or anything.

When a PDFDocument is instantiated, the entire PDF data is loaded into memory. With really huge PDF files (over 1 GB) this isn't optimal, and it can crash on some devices. As far as I know, PDFKit offers no way to load only parts of a PDF document.

Is there anything I can do about that? I haven't found a Swift/Objective-C library that can do this (though I don't really know the right keywords to search for).

My workaround would be to preprocess the PDFs and save each page as an image in the documents directory (or similar) using FileManager. That would result in a tremendous number of files, but it would solve the memory problem. I'm not sure I like this approach, though.
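The preprocessing workaround described above could look roughly like the following sketch. It is only an illustration: the function names `pageImageURL` and `exportPagesAsImages`, the `page-N.jpg` naming scheme, the JPEG format, and the 2048-point size cap are assumptions made up for this example, not part of the original post. The export step still opens the file through PDFDocument once, so it shifts the memory cost to a one-time preprocessing pass rather than removing it.

```swift
import Foundation
#if canImport(PDFKit) && canImport(UIKit)
import PDFKit
import UIKit
#endif

/// Hypothetical naming scheme: one JPEG per page, e.g. "page-12.jpg".
func pageImageURL(in directory: URL, pageIndex: Int) -> URL {
    return directory.appendingPathComponent("page-\(pageIndex).jpg")
}

#if canImport(PDFKit) && canImport(UIKit)
/// Renders each page of the PDF at `pdfURL` to a JPEG file in `directory`
/// and returns the URLs of the files that were written.
func exportPagesAsImages(from pdfURL: URL,
                         to directory: URL,
                         maxDimension: CGFloat = 2048) throws -> [URL] {
    guard let document = PDFDocument(url: pdfURL) else { return [] }
    try FileManager.default.createDirectory(at: directory,
                                            withIntermediateDirectories: true)
    var written: [URL] = []
    for index in 0..<document.pageCount {
        // Release temporary image objects after every page.
        try autoreleasepool {
            guard let page = document.page(at: index) else { return }
            // Ask PDFKit for a bitmap of the page at the requested size.
            let size = CGSize(width: maxDimension, height: maxDimension)
            let image = page.thumbnail(of: size, for: .mediaBox)
            guard let data = image.jpegData(compressionQuality: 0.9) else { return }
            let fileURL = pageImageURL(in: directory, pageIndex: index)
            try data.write(to: fileURL, options: .atomic)
            written.append(fileURL)
        }
    }
    return written
}
#endif
```

A viewer could then load only the image files for the pages near the current one, using `pageImageURL(in:pageIndex:)` to locate them.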

Update:

So I did what @Prcela and @Sahil Manchanda proposed. It seems to be working for now.

@yms: Hm, that could indeed be a problem. Does this even happen when the PDF contains only images and nothing else?

@Carpsen90: They are local (saved in the documents directory).

EDIT: I haven't accepted the answer below or awarded it the bounty; that happened automatically. It does not solve the problem. It still loads the entire PDF into memory!

asked Sep 01 '18 11:09

Quantm

1 Answer

I have an idea for how you could achieve this in PDFKit. According to the documentation, there is a function that lets you select content across specific pages, which would probably solve your problem if you added it to a collectionFlowView.

func selection(from startPage: PDFPage, atCharacterIndex startCharacter: Int, to endPage: PDFPage, atCharacterIndex endCharacter: Int) -> PDFSelection?

However, since your PDFs consist mainly of images, there is another function that extracts parts of the PDF based on CGPoints:

func selection(from startPage: PDFPage, at startPoint: CGPoint, to endPage: PDFPage, at endPoint: CGPoint) -> PDFSelection?

Also have a look at PDFView (https://developer.apple.com/documentation/pdfkit/pdfview), as it might be what you need if you only want to view the pages without annotations, editing, etc.

I also prepared a short snippet below that extracts one page. Hope it helps.

import PDFKit
import UIKit

class PDFViewController: UIViewController {

    override func viewDidLoad() {
        super.viewDidLoad()

        guard let url = Bundle.main.url(forResource: "myPDF", withExtension: "pdf") else { fatalError("INVALID URL") }
        let pdf = PDFDocument(url: url) // nil if the file cannot be opened as a PDF
        let page = pdf?.page(at: 10) // zero-based index, so this is the 11th page; returns an optional PDFPage
        // now you have one page extracted and you can play around with it.
    }
}
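For the "range of pages" part of the question, a rough sketch of copying several pages into a new, smaller PDFDocument is shown below. The names `clampedPageRange` and `extractPages` are hypothetical helpers invented for this example. Like the snippet above, it still has to open the source document first, so it does not by itself avoid the memory cost the question is about.

```swift
import Foundation
#if canImport(PDFKit)
import PDFKit
#endif

/// Clamps a requested zero-based page range to the pages that actually exist.
/// Returns nil if no requested page falls inside the document.
func clampedPageRange(_ requested: ClosedRange<Int>, pageCount: Int) -> ClosedRange<Int>? {
    guard pageCount > 0 else { return nil }
    let lower = max(requested.lowerBound, 0)
    let upper = min(requested.upperBound, pageCount - 1)
    return lower <= upper ? lower...upper : nil
}

#if canImport(PDFKit)
/// Copies the requested pages of `source` into a new PDFDocument.
func extractPages(_ requested: ClosedRange<Int>, from source: PDFDocument) -> PDFDocument? {
    guard let range = clampedPageRange(requested, pageCount: source.pageCount) else {
        return nil
    }
    let result = PDFDocument()
    for index in range {
        // Copy the page so the source document keeps its own instance.
        guard let copy = source.page(at: index)?.copy() as? PDFPage else { continue }
        result.insert(copy, at: result.pageCount)
    }
    return result.pageCount > 0 ? result : nil
}
#endif
```

For example, `extractPages(10...12, from: pdf)` would return a three-page document if `pdf` has at least 13 pages.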

EDIT 1: Have a look at the code excerpt below. I understand that the whole PDF gets loaded; however, this approach might be more memory efficient, as iOS will perhaps handle it better in a PDFView:

func readBook() {

    // Remove the old book view when the user chooses a new book language
    if let oldBookView = self.view.viewWithTag(3) {
        oldBookView.removeFromSuperview()
    }

    if #available(iOS 11.0, *) {
        let pdfView = PDFView()
        let path = BookManager.getBookPath(bookLanguageCode: book.bookLanguageCode)
        let url = URL(fileURLWithPath: path)
        if let pdfDocument = PDFDocument(url: url) {
            pdfView.displayMode = .singlePageContinuous
            pdfView.autoScales = true
            pdfView.document = pdfDocument
            pdfView.tag = 3 // I assigned a tag to this view so that later on I can easily find and remove it when the user chooses a new book language
            let lastReadPage = getLastReadPage()

            if let page = pdfDocument.page(at: lastReadPage) {
                pdfView.go(to: page)
                // Subscribe to notifications so the last read page can be saved
                // Must subscribe after displaying the last read page or else, the first page will be displayed instead
                NotificationCenter.default.addObserver(self, selector: #selector(self.saveLastReadPage), name: .PDFViewPageChanged, object: nil)
            }
        }

        self.containerView.addSubview(pdfView)
        setConstraints(view: pdfView)
        addTapGesture(view: pdfView)
    }
}
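The excerpt calls `getLastReadPage()` and `saveLastReadPage` without showing them. One way they might be backed is sketched below, purely as an assumption: `LastReadPageStore`, its UserDefaults key format, and `currentPageIndex(of:)` are hypothetical names made up for this example and do not come from the answer.

```swift
import Foundation
#if canImport(PDFKit)
import PDFKit
#endif

/// Hypothetical helper that persists the last-read page index per book.
struct LastReadPageStore {
    let defaults: UserDefaults
    let bookLanguageCode: String

    private var key: String { "lastReadPage-\(bookLanguageCode)" }

    /// Stores the zero-based page index; negative values are stored as 0.
    func save(pageIndex: Int) {
        defaults.set(max(pageIndex, 0), forKey: key)
    }

    /// Returns the stored index, or 0 if nothing has been saved yet.
    func load() -> Int {
        return defaults.integer(forKey: key)
    }
}

#if canImport(PDFKit)
/// Returns the zero-based index of the page currently shown in `pdfView`.
func currentPageIndex(of pdfView: PDFView) -> Int? {
    guard let page = pdfView.currentPage, let document = pdfView.document else {
        return nil
    }
    return document.index(for: page)
}
#endif
```

A `saveLastReadPage` handler could then pass the result of `currentPageIndex(of:)` to `save(pageIndex:)`, and `getLastReadPage()` could simply return `load()`.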

EDIT 2: This is not the answer the OP was looking for. It also loads the whole PDF into memory. Read the comments.

answered Oct 17 '22 15:10

AD Progress
