I am trying to extract the images stored as streams in a PDF. While I can do this easily, I am not able to get accurate image rotation information. I am looking for specific information such as the MediaBox, Rotate, and landscape/portrait mode.
When I extract the image, its orientation does not match what the end user sees in a PDF reader.
I did a binary comparison of two PDFs (an image was rotated 90 degrees in the first, and the same image was rotated 270 degrees in the second) and found a difference in one particular stream object. However, I cannot work out what the information in that stream means.
Here are the two documents I am talking about:
http://bit.ly/eQZGKJ http://bit.ly/g43Whb
The position, size and orientation of the image when displayed on the page is determined by the current transformation matrix (CTM). You have to execute the entire page content stream to determine the CTM that is in place when the image is displayed. It's like a virtual rendering of the PDF page.
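As a rough illustration of that "virtual rendering", here is a minimal Python sketch that walks a content stream and tracks the CTM through the `q`, `Q`, `cm` and `Do` operators. It assumes the stream has already been decompressed to text and contains no string operands or inline images, since the whitespace tokenizer would choke on those. The function names (`multiply`, `image_placements`, `rotation_degrees`) are made up for this example and do not come from any PDF library. It also ignores the page's `/Rotate` entry, which a viewer applies on top of the CTM.

```python
import math

# PDF matrices are written as six numbers [a b c d e f].
IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def multiply(m, n):
    """Return the product m x n of two PDF matrices (a, b, c, d, e, f)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def image_placements(content):
    """Yield (xobject_name, ctm) for every Do operator in a content stream.

    Assumption: `content` is decoded text whose tokens are separated by
    whitespace, with no string operands and no inline images.
    """
    ctm = IDENTITY
    stack = []      # graphics state stack used by q / Q
    operands = []   # tokens seen since the last recognised operator
    for token in content.split():
        if token == "q":
            stack.append(ctm)                 # save the graphics state
        elif token == "Q":
            if stack:
                ctm = stack.pop()             # restore the graphics state
        elif token == "cm":
            # cm premultiplies the new matrix onto the current CTM.
            matrix = tuple(float(x) for x in operands[-6:])
            ctm = multiply(matrix, ctm)
        elif token == "Do":
            # The operand is the XObject name, e.g. /Im0.
            yield operands[-1].lstrip("/"), ctm
        else:
            operands.append(token)
            continue
        operands = []


def rotation_degrees(ctm):
    """Counterclockwise angle, in [0, 360), of the image's x axis under ctm."""
    a, b = ctm[0], ctm[1]
    return math.degrees(math.atan2(b, a)) % 360
```

Calling `image_placements` on the text of a page's content stream gives one CTM per painted XObject, and `rotation_degrees` turns that matrix into an angle. A `Do` can also paint a form XObject rather than an image, so a fuller version would check the XObject's `/Subtype` and recurse into forms.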