 

Python & OpenCV

Is there any native support for grabbing images from PDFs, or a way to create some sort of object in Python that holds the images from a PDF so they can then be accessed via OpenCV? I've looked at some scripts that dump the images of a PDF into a directory, but I'm aiming to open the PDF and load its image data directly into an object I can work with in OpenCV. My own exploration hasn't yielded any results, so I figured I'd ask.

Added an example using PyMuPDF, based on the example from @Ghilas BELHADJ:

import fitz
import cv2
import numpy as np
from tkinter import Tk
from tkinter.filedialog import askopenfilename


class AccessPDF:

    def __init__(self):
        self.filepath = ""
        self.doc = None

    def openPDF(self):
        Tk().withdraw()
        self.filepath = askopenfilename()
        self.doc = fitz.open(self.filepath)

    def pixel2np(self,pix):
        im = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)
        im = np.ascontiguousarray(im[..., [2, 1, 0]])  # rgb to bgr
        return im

    def displayKey(self):
        pixobj = self.doc.getPagePixmap(0, alpha=False)  # render the first page
        im = self.pixel2np(pixobj)
        cv2.imwrite("testimg.png", im)
        cv2.imshow("Key", im)
        cv2.waitKey(0)  # keep the window open until a key is pressed
Rob asked Oct 30 '18 07:10




2 Answers

Edit: I've modified the code following the comment from @Dan Mašek.

You can achieve this (loading the images embedded in a PDF into OpenCV without writing intermediate files to disk) using PyMuPDF and NumPy.

In this example, I'm using this PDF file.

import fitz
import cv2
import numpy as np


def pix2np(pix):
    im = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)
    im = np.ascontiguousarray(im[..., [2, 1, 0]])  # rgb to bgr
    return im


doc = fitz.open('NGM_2018_Media_Kit.pdf')

# entire page
# pix = doc.getPagePixmap(0, alpha=False)

# first page, 5th image; element [0] of each entry is the image's xref
pix = fitz.Pixmap(doc, doc.getPageImageList(0)[4][0])
im = pix2np(pix)

cv2.putText(im, 'Azul fellawen', (100, 100),
            cv2.FONT_HERSHEY_SIMPLEX, 1.,
            (18, 156, 243), 2, cv2.LINE_AA)
cv2.imwrite('sample_0.png', im)
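
One caveat with `pix2np`: the `[2, 1, 0]` reorder assumes a three-channel RGB pixmap, and embedded images can also be grayscale, CMYK, or carry an alpha channel. Below is a minimal sketch of a more defensive converter. The name `pixmap_to_bgr` is hypothetical, and the function only assumes the pixmap exposes `samples`, `h`, `w`, `n` and `alpha` as PyMuPDF pixmaps do.

```python
import numpy as np


def pixmap_to_bgr(pix):
    """Convert a PyMuPDF-style pixmap into an array OpenCV can use.

    Hypothetical helper: it only reads pix.samples, pix.h, pix.w,
    pix.n (channels per pixel) and pix.alpha (1 if alpha is present).
    """
    color_channels = pix.n - pix.alpha
    im = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)

    if color_channels == 1 and not pix.alpha:
        # Grayscale: return a writable 2-D array.
        return im[..., 0].copy()

    if color_channels == 3:
        # RGB(A) -> BGR(A); the alpha channel, if any, stays last.
        order = [2, 1, 0] + ([3] if pix.alpha else [])
        return np.ascontiguousarray(im[..., order])

    # CMYK and other layouts are not handled here. With PyMuPDF they can
    # be converted first via fitz.Pixmap(fitz.csRGB, pix).
    raise ValueError(f"unsupported pixmap layout: n={pix.n}, alpha={pix.alpha}")
```

With this, a CMYK image raises a clear error instead of silently producing wrong colors, and a grayscale image no longer fails on the channel reorder.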

Ghilas BELHADJ answered Oct 29 '22 19:10


I've grabbed the images from a PDF containing images as well as text.

You can save the images using pix.writePNG() or just show them using cv2.imshow(), whichever suits you best.

import fitz  # PyMuPDF
import cv2
import numpy as np

def pix2np(pix):
    im = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)
    im = np.ascontiguousarray(im[..., [2, 1, 0]])  # rgb to bgr
    return im

def convertPdf(filename):  
    doc = fitz.open(filename)
    #count = 0
    for i in range(len(doc)):
        for img in doc.getPageImageList(i):
            xref = img[0]
            pix = fitz.Pixmap(doc, xref)

            #if pix.n < 5:       # this is GRAY or RGB
            # To save it to the disk
            #pix.writePNG(f"p{count}.png")

            im = pix2np(pix)
            cv2.imshow("image",im)
            cv2.waitKey(0)
            #count += 1
            pix = None

if __name__ == "__main__":
    filename = "sample.pdf"
    convertPdf(filename)
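
If the goal is an in-memory object rather than a window per image, the same loop can collect the arrays into a dictionary. The sketch below is written under a few assumptions: the names `unique_image_xrefs` and `load_pdf_images` are hypothetical, it uses the snake_case names found in newer PyMuPDF releases (`doc.page_count`, `doc.get_page_images()`) instead of the camelCase calls above, and it skips xrefs already seen, since one image can be referenced from several pages.

```python
def unique_image_xrefs(doc):
    """Return image xrefs in first-seen order, skipping repeats across pages."""
    seen = set()
    ordered = []
    for page_number in range(doc.page_count):
        for entry in doc.get_page_images(page_number):
            xref = entry[0]  # first element of each entry is the xref
            if xref not in seen:
                seen.add(xref)
                ordered.append(xref)
    return ordered


def load_pdf_images(path):
    """Return {xref: ndarray} for every embedded image (hypothetical helper)."""
    import fitz  # PyMuPDF, imported here so the helper above has no dependencies
    import numpy as np

    images = {}
    doc = fitz.open(path)
    for xref in unique_image_xrefs(doc):
        pix = fitz.Pixmap(doc, xref)
        if pix.n - pix.alpha > 3:  # CMYK: convert to RGB first
            pix = fitz.Pixmap(fitz.csRGB, pix)
        im = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)
        if pix.n - pix.alpha == 3:
            im = im[..., [2, 1, 0]]  # RGB -> BGR; any alpha channel is dropped
        images[xref] = np.array(im, order="C")  # writable, contiguous copy
    doc.close()
    return images
```

Something like `images = load_pdf_images("sample.pdf")` would then give arrays that can be passed straight to `cv2.imshow()` or any other OpenCV call, with nothing written to disk.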
Yash Soni answered Oct 29 '22 18:10