
How to read a PDF text matrix

Tags: text, pdf, matrix

I am writing a program that creates PDF files directly. I have used the PDF Reference manual and managed to figure out everything except one thing: the text matrix. It has to be the most confusing thing I have ever read, googled, re-read, re-googled, and re-read about, and I still do not understand it. About the time I think I understand it, something comes up and I realize I don't.

What I am having an issue with is creating a landscape PDF file with the standard 11 x 8.5 inch size (792 x 612 points). I can create the file, and everything looks and displays correctly in landscape.

Now I want to remove all the common text that appears on every page, place it into a Form XObject, and use Do to add it to every page. I have this working great for portrait PDF files. When I try the same with a landscape PDF file, the Form XObject text prints rotated differently than the rest of the page. Apparently the rotation for the page does not carry forward to the Form XObject.

I believe this has to do with the text matrix, and I am trying to find a simple explanation of its values. I understand sine and cosine, but not the layout of how the values are specified. For example, I found this explanation for rotation: for a b c d e f Tm, "rotations are produced by cos θ sin θ -sin θ cos θ 0 0, which rotates the coordinate system axes by an angle θ counterclockwise". Huh? I understand sine, cosine, and "counterclockwise", but that's about it. I cannot find any simple examples, and I think I need to see some to understand this.

What would Text Matrix look like for:

  • 0 rotation?
  • 90 rotation?
  • 180 rotation?
  • 270 rotation?

I found this example but cannot decipher it. What does this text matrix translate to in simple English?

Example text matrix: 0 1 -1 0 07 07 Tm

What does each value represent?

  • 0 =
  • 1 =
  • -1 =
  • 0 =
  • 07 =
  • 07 =

Any help would be greatly appreciated, as would examples with explanations in simple English and sample PDF files that combine a landscape page with a Form XObject. A picture is worth a thousand words: I can usually open PDF sample files with Notepad and figure out the things I do not understand (except the text matrix).

Thanks Richard

user2315906 asked Apr 24 '13 14:04



3 Answers

The matrices used in PDFs are affine transforms.

The Tm operator loads its six operands (a b c d e f) into:

|a b 0|
|c d 0|
|e f 1|

Where:

a is Scale_x
b is Shear_y (how much of x is added to the new y)
c is Shear_x (how much of y is added to the new x)
d is Scale_y
e is offset x
f is offset y
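To make the roles of the six values concrete, here is a minimal Python sketch that applies them to a point. The function name `apply_matrix` is made up for this illustration; the arithmetic is just the row vector [x, y, 1] multiplied by the 3x3 matrix above.

```python
def apply_matrix(m, x, y):
    """Transform the point (x, y) by the six operands (a, b, c, d, e, f)."""
    a, b, c, d, e, f = m
    # [x, y, 1] times the 3x3 matrix |a b 0| |c d 0| |e f 1|
    return (a * x + c * y + e, b * x + d * y + f)


# Identity: the point is unchanged.
print(apply_matrix((1, 0, 0, 1, 0, 0), 3, 4))    # (3, 4)
# a and d scale x and y.
print(apply_matrix((2, 0, 0, 2, 0, 0), 3, 4))    # (6, 8)
# e and f move the point.
print(apply_matrix((1, 0, 0, 1, 10, 20), 3, 4))  # (13, 24)
```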

A good introduction can be found at http://docstore.mik.ua/orelly/java-ent/jfc/ch04_11.htm

Hope this helps someone.

Mike answered Oct 04 '22 22:10


Section 4.2.2, "Common Transformations", of the PDF Reference could also be interesting for you: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference.pdf

Ole K answered Oct 04 '22 21:10


Point definition

[x, y, 1]

This is a one-dimensional vector (array) that places the point at the coordinates x and y. The 1 is not needed to specify the point's location, but it helps when calculating the location of the point in a different coordinate system, for example when converting from device-independent pixels to device-dependent pixels.

Transformation calculation

x_new = a*x + c*y + e;
y_new = b*x + d*y + f;

Written as a matrix, the calculation looks like this:

[x_new, y_new, 1] = [x, y, 1] × M, where M is

|a b 0|
|c d 0|
|e f 1|

Matrix for translation

A translation moves a point.

[1, 0, 0, 1, tx, ty]

x_new = 1*x + 0*y + tx;
y_new = 0*x + 1*y + ty;

or

x_new = x + tx;
y_new = y + ty;

Matrix for rotation

[cos(theta), sin(theta), -sin(theta), cos(theta), 0, 0]

x_new = cos(theta)*x - sin(theta)*y + 0;
y_new = sin(theta)*x + cos(theta)*y + 0;
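As a rough illustration of the rotation formula, the Python sketch below computes the six operands for any angle and prints them as a Tm line. The helper names `rotation_operands` and `tm_line` are invented for this example, and rounding to six decimals is an assumption made only to hide floating-point noise such as cos(90°) coming out as 6e-17.

```python
import math


def rotation_operands(degrees):
    """Return (a, b, c, d, e, f) for a counterclockwise rotation."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    # Round away floating-point noise; adding 0.0 turns -0.0 into 0.0.
    return tuple(round(v, 6) + 0.0 for v in (c, s, -s, c, 0.0, 0.0))


def tm_line(degrees):
    """Format the operands the way they would appear before Tm."""
    return " ".join("%g" % v for v in rotation_operands(degrees)) + " Tm"


for angle in (0, 90, 180, 270):
    print(angle, "->", tm_line(angle))
# 0 -> 1 0 0 1 0 0 Tm
# 90 -> 0 1 -1 0 0 0 Tm
# 180 -> -1 0 0 -1 0 0 Tm
# 270 -> 0 -1 1 0 0 0 Tm
```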

0 degree rotation, cos(0)=1, sin(0)=0: [1, 0, -0, 1, 0, 0]

x_new = 1*x + 0*y + 0;
y_new = 0*x + 1*y + 0;

or

x_new = x;
y_new = y;

90 degree rotation, cos(90)=0, sin(90)=1: [0, 1, -1, 0, 0, 0]

x_new = 0*x + -1*y + 0;
y_new = 1*x + 0*y + 0;

or

x_new = -y;
y_new = x;

180 degree rotation, cos(180)=-1, sin(180)=0: [-1, 0, -0, -1, 0, 0]

x_new = -1*x + 0*y + 0;
y_new = 0*x + -1*y + 0;

or

x_new = -x;
y_new = -y;

270 degree rotation, cos(270)=0, sin(270)=-1: [0, -1, 1, 0, 0, 0]

x_new = 0*x + 1*y + 0;
y_new = -1*x + 0*y + 0;

or

x_new = y;
y_new = -x;
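Rotation and translation can be combined into a single matrix by multiplying the two 3x3 matrices. The sketch below does that directly on the six-operand form; `multiply` is a hypothetical helper, and it assumes the row-vector convention used above, so `multiply(m1, m2)` means "apply m1 first, then m2".

```python
def multiply(m1, m2):
    """Concatenate two six-operand matrices: m1 is applied first, then m2."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


rotate_90 = (0, 1, -1, 0, 0, 0)
move_7_7 = (1, 0, 0, 1, 7, 7)

# Rotate first, then move: the offset stays (7, 7).
print(multiply(rotate_90, move_7_7))  # (0, 1, -1, 0, 7, 7)
# Move first, then rotate: the offset itself gets rotated.
print(multiply(move_7_7, rotate_90))  # (0, 1, -1, 0, -7, 7)
```

The two results differ, which is why the order of concatenation matters when a rotated page also needs its content shifted back into view.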

Example text matrix

[0 1 -1 0 07 07]

0 1 -1 0: rotation by 90 degrees
07 07: translation (offset) by 7 in both the x and y directions
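Going the other way, a Tm line can be decoded back into an angle and an offset. This is only a sketch: `parse_tm` and `describe` are made-up names, and the angle recovery assumes the matrix is a pure rotation plus translation (no scaling or shear), in which case atan2(b, a) gives the rotation.

```python
import math


def parse_tm(line):
    """Split a line such as '0 1 -1 0 07 07 Tm' into six floats."""
    tokens = line.split()
    if len(tokens) != 7 or tokens[-1] != "Tm":
        raise ValueError("expected six numbers followed by Tm")
    return tuple(float(t) for t in tokens[:6])


def describe(m):
    """Return the rotation in degrees and the (x, y) offset."""
    a, b, c, d, e, f = m
    # Valid only when a, b, c, d form a pure rotation (assumption).
    angle = math.degrees(math.atan2(b, a)) % 360
    return {"rotation_degrees": angle, "offset": (e, f)}


info = describe(parse_tm("0 1 -1 0 07 07 Tm"))
print(info)  # rotation of about 90 degrees, offset (7.0, 7.0)
```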

Peter Huber answered Oct 04 '22 22:10