Is there a field in which PDF files specify their encoding?

I understand that it is impossible to determine the character encoding of arbitrary string data just by looking at the data. This is not my question.

My question is: is there a field in a PDF file where, by convention, the encoding scheme (e.g., UTF-8) is specified? This would be roughly analogous to <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> inside the <head> of an HTML document.

Thank you very much in advance, Blz

asked May 18 '12 16:05

Louis Thibault

1 Answer

A quick look at the PDF specification suggests that different encodings can be used within a single PDF file, which implies there is no single document-wide encoding field; have a look at page 86. A PDF library with low-level access should therefore be able to tell you the encoding used for a given string. If you just want the text and don't care about the internal encodings, I would suggest letting the library handle the conversions for you.
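To illustrate what "encoding per string" can look like at a low level, here is a minimal sketch using only the Python standard library. It covers PDF text strings only (the kind used for metadata such as a document title), where, as far as I know, a leading byte order mark distinguishes UTF-16BE (and UTF-8 in PDF 2.0) from PDFDocEncoding. Text drawn on pages is encoded per font and is not handled here. The function name decode_pdf_text_string is hypothetical, and Latin-1 is used as a stand-in for PDFDocEncoding, which is an assumption: the two differ for some byte values.

```python
import codecs
from typing import Tuple


def decode_pdf_text_string(raw: bytes) -> Tuple[str, str]:
    """Decode the raw bytes of a PDF text string.

    Returns (decoded_text, encoding_label). Hypothetical helper,
    not part of any PDF library.
    """
    # A leading FE FF marks the string as UTF-16, big-endian.
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be"), "UTF-16BE"

    # A leading EF BB BF marks the string as UTF-8 (PDF 2.0 files).
    if raw.startswith(codecs.BOM_UTF8):
        return raw[3:].decode("utf-8"), "UTF-8"

    # No marker: the string is in PDFDocEncoding.
    # ASSUMPTION: Latin-1 is only an approximation of PDFDocEncoding;
    # printable ASCII matches, but some other byte values differ.
    return raw.decode("latin-1"), "PDFDocEncoding (approximated as Latin-1)"


if __name__ == "__main__":
    samples = [b"\xfe\xff\x00H\x00i", b"\xef\xbb\xbfcaf\xc3\xa9", b"Hello"]
    for sample in samples:
        text, label = decode_pdf_text_string(sample)
        print(repr(sample), "->", repr(text), "as", label)
```

In practice a PDF library does this kind of detection for you, which is why letting the library return already-decoded text is usually the simpler route.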

answered Sep 19 '22 21:09

Mattias Wadman

Tags: pdf, unicode, utf
