Python Wand converts from PDF to JPG background is incorrect

I found a weird thing while converting a PDF to JPEG, and I'd like to figure out whether it is a small bug. In the converted JPG below, the background color is entirely black. The image is here: www.shdowin.com/public/02.jpg

However, in the source PDF file, the background color is plain white. The image is here: www.shdowin.com/public/normal.jpg

I thought this might be my PDF file's fault; however, when I use Acrobat.pdf2image in a .NET environment, the converted JPG is displayed correctly.

Here is my code:

from wand.image import Image
from wand.color import Color
import os, os.path, sys

def pdf2jpg(source_file, target_file, dest_width, dest_height):
    RESOLUTION    = 300
    ret = True
    try:
        with Image(filename=source_file, resolution=(RESOLUTION,RESOLUTION)) as img:
            img.background_color = Color('white')
            img_width = img.width
            ratio     = float(dest_width) / img_width  # float division, also on Python 2
            img.resize(dest_width, int(ratio * img.height))
            img.format = 'jpeg'
            img.save(filename = target_file)
    except Exception as e:
        ret = False

    return ret

if __name__ == "__main__":
    source_file = "./02.pdf"
    target_file = "./02.jpg"

    ret = pdf2jpg(source_file, target_file, 1895, 1080)

Any suggestions for the issue?

I have uploaded the PDF to this URL: 02.pdf

You can try...

cendy asked Dec 07 '13 08:12


3 Answers

For others who still have this problem: after a couple of hours of googling and experimenting, I fixed it thanks to this answer, https://stackoverflow.com/a/40494320/2686243, by using these two lines:

img.background_color = Color("white")
img.alpha_channel = 'remove'

Tried with Wand version 0.4.4
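
For context, here is a minimal sketch of how those two lines might fit into the function from the question. The names pdf_to_jpg and scaled_size are hypothetical helpers, not part of Wand, and the sketch assumes a single-page PDF (with several pages, ImageMagick typically writes one numbered JPG per page). The background color is set before the alpha channel is removed, so transparent areas are flattened onto white rather than black.

```python
def scaled_size(width, height, dest_width):
    """Return (dest_width, new_height), preserving the aspect ratio.

    Hypothetical helper; float() keeps the ratio fractional on Python 2 too.
    """
    ratio = float(dest_width) / width
    return dest_width, int(round(ratio * height))


def pdf_to_jpg(source_file, target_file, dest_width, resolution=300):
    """Render a PDF to a JPG with a white background (sketch, single page assumed)."""
    # Imported here so scaled_size() can be used without Wand/ImageMagick installed.
    from wand.image import Image
    from wand.color import Color

    with Image(filename=source_file, resolution=(resolution, resolution)) as img:
        # The two lines from the answer: pick the background first,
        # then flatten the transparency onto it.
        img.background_color = Color('white')
        img.alpha_channel = 'remove'

        new_width, new_height = scaled_size(img.width, img.height, dest_width)
        img.resize(new_width, new_height)
        img.format = 'jpeg'
        img.save(filename=target_file)
```

It could then be called as pdf_to_jpg("./02.pdf", "./02.jpg", 1895), using the file names from the question.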

Martin answered Nov 10 '22 22:11


I found the answer myself. The cause is the alpha channel. This PDF has a partly transparent background (visible after I converted it to PNG format), and for the resize ImageMagick chooses the best resize filter, so the background is displayed as black.

So, after a lot of experiments, I found that adding "img.alpha_channel=False" inside the "with" statement (before img.save()) makes it work properly.

Thanks to VadimR for the advice; it was helpful.

cendy answered Nov 10 '22 23:11


An easy solution is to change the order of the commands: change the format to jpeg first and then resize.

        img.format = 'jpeg'
        img.resize(dest_width, int(ratio * img.height))

It is also easy to open the PDF at exactly the target size by choosing the values in the resolution tuple accordingly, because the resolution can be a floating-point number.
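
A rough sketch of that idea, assuming the usual PDF convention of 72 points per inch: a page that is W points wide renders W / 72 * dpi pixels wide, so the resolution for a target width is dest_width * 72 / W. The helper names below are hypothetical, and the page width is estimated by first opening the file at 72 dpi, where one pixel corresponds to roughly one point. The rendered width may still be off by a pixel because of rounding.

```python
PDF_POINTS_PER_INCH = 72.0


def resolution_for_width(page_width_pt, dest_width_px):
    """Return the dpi at which a page of page_width_pt points is dest_width_px wide."""
    return dest_width_px * PDF_POINTS_PER_INCH / page_width_pt


def open_pdf_at_width(source_file, dest_width_px):
    """Open a PDF rendered at about dest_width_px pixels wide (sketch).

    The caller is responsible for closing the returned image,
    for example by using it in a "with" statement.
    """
    # Imported here so resolution_for_width() works without Wand installed.
    from wand.image import Image

    # Probe at 72 dpi: the pixel width then approximates the width in points.
    with Image(filename=source_file, resolution=(72, 72)) as probe:
        page_width_pt = probe.width

    dpi = resolution_for_width(page_width_pt, dest_width_px)
    # A fractional resolution is passed as a tuple, as in the question's code.
    return Image(filename=source_file, resolution=(dpi, dpi))
```

For a US Letter page (612 points wide) and the target width of 1895 from the question, this gives a resolution of about 222.9 dpi, so no separate resize step is needed.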

hynekcer answered Nov 10 '22 22:11