Read Math Equation using iTextSharp

Question

I'm trying to read text from a PDF file with iTextSharp, using the following code, and assign it to a multiline TextBox (Windows desktop app).

Note: This code works fine.

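using System.IO;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public string ReadPdfFile(string fileName)
{
    StringBuilder text = new StringBuilder();

    if (File.Exists(fileName))
    {
        PdfReader pdfReader = new PdfReader(fileName);

        // Extract each page's text, ordered by position on the page.
        for (int page = 1; page <= pdfReader.NumberOfPages; page++)
        {
            ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
            string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);

            // Round trip through the system default encoding; where that is an ANSI
            // code page (.NET Framework), characters outside it come back as '?'.
            currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
            text.Append(currentText);
        }
        pdfReader.Close();
    }
    return text.ToString();
}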

But my PDF file contains an equation:

[image: the equation as it appears in the PDF]

and all I'm getting is the following output:

[image: the extracted text]

What could be added here to get the equation's text? Any help would be appreciated!

Djuro · Accepted Answer

I have used iTextSharp and I am 100% sure this is not possible. The problem lies in the PDF format itself: it does not contain tags that tie the content back to its meaning as text. A PDF holds a graphical representation of the content, with each piece placed at a specific position on the page. Even detecting bold text reliably is hard without OCR. PDF is not a good format to parse.
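To see what "positioned graphical content" means in practice, here is a minimal sketch that dumps every text chunk on a page together with its font name and baseline position. It assumes iTextSharp 5.x; the names ChunkDumpListener, ChunkDump, Describe and DumpPage are made up for this illustration and are not part of the library. On a page with an equation you would typically see separate fragments in math fonts at scattered coordinates, with nothing that says "this is a fraction" or "this is an exponent".

```csharp
using System;
using System.Globalization;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical listener: prints each text chunk the parser reports.
public class ChunkDumpListener : IRenderListener
{
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderImage(ImageRenderInfo renderInfo) { }

    public void RenderText(TextRenderInfo renderInfo)
    {
        // Start of the chunk's baseline in page coordinates.
        Vector start = renderInfo.GetBaseline().GetStartPoint();
        Console.WriteLine(Describe(
            renderInfo.GetText(),
            renderInfo.GetFont().PostscriptFontName,
            start[Vector.I1],
            start[Vector.I2]));
    }

    // Hypothetical formatting helper, kept separate so it can be checked on its own.
    public static string Describe(string text, string fontName, float x, float y)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "\"{0}\" font={1} x={2:F1} y={3:F1}", text, fontName, x, y);
    }
}

public static class ChunkDump
{
    // Dumps all text chunks of one page (1-based page number) to the console.
    public static void DumpPage(string fileName, int page)
    {
        PdfReader reader = new PdfReader(fileName);
        try
        {
            new PdfReaderContentParser(reader).ProcessContent(page, new ChunkDumpListener());
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Calling ChunkDump.DumpPage(fileName, 1) prints one line per chunk. The font name is the only styling information available this way, and the structure of the equation has to be guessed from the coordinates.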

My problem was simpler than yours, and reading from the PDF was still hell. It was plain text, but laid out in two columns per page. iTextSharp reads content by coordinates, so my text got mixed up: it read the first line of the first column, then the first line of the second column, instead of following the flow of the text. As for LaTeX: once LaTeX code has been converted to PDF, there is no way back to the LaTeX source.
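For the two-column case, one possible workaround is to extract each column separately with a region filter, so that lines from the two columns are not interleaved. This is only a sketch, assuming iTextSharp 5.x and two equal-width columns split at the middle of the page; TwoColumnReader, MidX and ReadPage are hypothetical names for this example. It does nothing for equations, and text that straddles the split may end up in both halves or in the wrong one.

```csharp
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class TwoColumnReader
{
    // Assumption: the columns are equally wide, so the split is the horizontal midpoint.
    public static float MidX(float left, float right)
    {
        return left + (right - left) / 2f;
    }

    // Extracts the left half of the page first, then the right half.
    public static string ReadPage(PdfReader reader, int page)
    {
        Rectangle size = reader.GetPageSize(page);
        float mid = MidX(size.Left, size.Right);

        Rectangle leftColumn = new Rectangle(size.Left, size.Bottom, mid, size.Top);
        Rectangle rightColumn = new Rectangle(mid, size.Bottom, size.Right, size.Top);

        StringBuilder text = new StringBuilder();
        foreach (Rectangle column in new[] { leftColumn, rightColumn })
        {
            // Only text located in this rectangle reaches the wrapped strategy.
            RenderFilter[] filters = { new RegionTextRenderFilter(column) };
            ITextExtractionStrategy strategy = new FilteredTextRenderListener(
                new LocationTextExtractionStrategy(), filters);

            text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page, strategy));
        }
        return text.ToString();
    }
}
```

The midpoint split is the weakest part: real layouts often have unequal columns or full-width headings, so the rectangles would need to be adjusted per document.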

Tags: c#, pdf, itext

Aimal Khan
