Is there any native support for grabbing images from PDFs, or a way to create some sort of object in Python that holds the images from a PDF and can then be accessed via OpenCV? I've looked at some scripts that dump the images of a PDF into a directory, but I'd rather open the PDF and load its image data into some sort of object I can access with OpenCV. My own exploration hasn't yielded any results, so I figured I'd ask.
Added an example using PyMuPDF, based on the example from @Ghilas BELHADJ:
import fitz  # PyMuPDF
import cv2
import numpy as np
from tkinter import Tk
from tkinter.filedialog import askopenfilename


class AccessPDF:
    def __init__(self):
        self.filepath = ""
        self.doc = None

    def openPDF(self):
        Tk().withdraw()  # hide the empty Tk root window
        self.filepath = askopenfilename()
        self.doc = fitz.open(self.filepath)

    def pixel2np(self, pix):
        im = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)
        im = np.ascontiguousarray(im[..., [2, 1, 0]])  # rgb to bgr
        return im

    def displayKey(self):
        # Newer PyMuPDF releases name this method get_page_pixmap
        pixobj = self.doc.getPagePixmap(0, alpha=False)
        im = self.pixel2np(pixobj)
        cv2.imwrite("testimg.png", im)
        cv2.imshow("Key", im)
        cv2.waitKey(0)  # keep the window open until a key is pressed
Edit: I've made a modification in the code following the comment of @Dan Mašek
You can achieve this (load the PDF's embedded images into OpenCV without writing intermediate objects to disk) using PyMuPDF and NumPy.
In this example, I'm using this pdf file.
import fitz
import cv2
import numpy as np


def pix2np(pix):
    im = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)
    im = np.ascontiguousarray(im[..., [2, 1, 0]])  # rgb to bgr
    return im


doc = fitz.open('NGM_2018_Media_Kit.pdf')
# entire page
# pix = doc.getPagePixmap(0, alpha=False)
# first page, 5th image, xref element
pix = fitz.Pixmap(doc, doc.getPageImageList(0)[4][0])
im = pix2np(pix)
cv2.putText(im, 'Azul fellawen', (100, 100),
            cv2.FONT_HERSHEY_SIMPLEX, 1.,
            (18, 156, 243), 2, cv2.LINE_AA)
cv2.imwrite('sample_0.png', im)
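One caveat with pix2np: the `[2, 1, 0]` index assumes at least three channels, so it fails on a grayscale image (`pix.n == 1`). Below is a rough sketch of a more forgiving conversion step, written against plain NumPy so it can be tried without a PDF. The helper name `samples_to_bgr` and its parameters are my own invention, not part of PyMuPDF. It assumes 8-bit samples and that a four-channel buffer is RGBA; CMYK data, which also has four channels, is not handled and would need converting to RGB first.

```python
import numpy as np


def samples_to_bgr(samples, height, width, channels):
    """Convert a raw, row-major 8-bit pixel buffer to a BGR array for OpenCV.

    Hypothetical helper: assumes gray (1 channel), RGB (3) or RGBA (4) data.
    """
    im = np.frombuffer(samples, dtype=np.uint8).reshape(height, width, channels)
    if channels == 1:
        # Grayscale: repeat the single channel three times.
        im = np.repeat(im, 3, axis=2)
    elif channels in (3, 4):
        # RGB or RGBA: drop any alpha channel and reverse the order to BGR.
        im = im[..., [2, 1, 0]]
    else:
        raise ValueError(f"unsupported channel count: {channels}")
    return np.ascontiguousarray(im)


# A single red RGB pixel and a single mid-gray pixel as tiny test buffers.
red_pixel = samples_to_bgr(bytes([255, 0, 0]), 1, 1, 3)
gray_pixel = samples_to_bgr(bytes([128]), 1, 1, 1)
```

With a real pixmap you would presumably call it as `samples_to_bgr(pix.samples, pix.h, pix.w, pix.n)`, using the same attributes that pix2np reads.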
I've grabbed the images from a PDF containing images as well as text. You can save the images using pix.writePNG() or just show them using cv2.imshow(), whichever suits you best.
import fitz  # pymupdf
import cv2
import numpy as np


def pix2np(pix):
    im = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)
    im = np.ascontiguousarray(im[..., [2, 1, 0]])  # rgb to bgr
    return im


def convertPdf(filename):
    doc = fitz.open(filename)
    # count = 0
    for i in range(len(doc)):
        for img in doc.getPageImageList(i):
            xref = img[0]
            pix = fitz.Pixmap(doc, xref)
            # if pix.n < 5:  # this is GRAY or RGB
            # To save it to the disk
            # pix.writePNG(f"p{count}.png")
            im = pix2np(pix)
            cv2.imshow("image", im)
            cv2.waitKey(0)
            # count += 1
            pix = None


if __name__ == "__main__":
    filename = "sample.pdf"
    convertPdf(filename)