
Get text occurrences contained in a specified area with iTextSharp

Tags: c#, itextsharp

Is it possible, using iTextSharp, to get all text occurrences contained in a specified area of a PDF document?

enter image description here

Thanks.

Gigi asked Dec 16 '13 08:12



1 Answer

First you need the actual coordinates of the rectangle you marked in red. At a glance, I'd say the x value of 144 (2 inches, at 72 points per inch) is probably about right, but it would surprise me if the y value were 76, so you'll have to double-check.
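The unit arithmetic behind those numbers can be sketched as follows. This is only an illustration under stated assumptions: `PdfRegion` and its methods are hypothetical helpers, not part of iTextSharp, and the sketch assumes the default PDF user space (72 units per inch, origin at the lower-left corner of an unrotated page). The page height is passed in by the caller; with iTextSharp it could be read from `reader.GetPageSize(1).Height`.

```csharp
using System;

// Hypothetical helper, not part of iTextSharp. It converts a region measured
// in inches from the top-left corner of a page into PDF user-space values.
public static class PdfRegion
{
    // Default PDF user space: 72 units (points) per inch.
    public const float PointsPerInch = 72f;

    public static float InchesToPoints(float inches)
    {
        return inches * PointsPerInch;
    }

    // Returns { x, y, width, height } in points, where (x, y) is the
    // lower-left corner of the region.
    // Assumption: the page origin is its lower-left corner and the page is
    // not rotated, so the y axis points up from the bottom edge.
    public static float[] FromTopLeftInches(
        float left, float top, float width, float height, float pageHeightPoints)
    {
        float x = InchesToPoints(left);
        float w = InchesToPoints(width);
        float h = InchesToPoints(height);
        // Flip the offset measured from the top edge into one measured from the bottom.
        float y = pageHeightPoints - InchesToPoints(top) - h;
        return new[] { x, y, w, h };
    }
}
```

For example, `PdfRegion.InchesToPoints(2f)` yields 144, the x value estimated above. A y value read off a ruler that starts at the top of the page generally needs the flip shown in `FromTopLeftInches` before it can be used as a PDF coordinate.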

Once you have the exact coordinates of the rectangle, you can use iText's text extraction functionality using a LocationTextExtractionStrategy as is done in the ExtractPageContentArea example.

For the iTextSharp version of this example, see the C# port of the examples of chapter 15.

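// Assumes `reader` is an open PdfReader (iTextSharp.text.pdf) and that
// iTextSharp.text.pdf.parser is imported.
// RectangleJ takes (x, y, width, height) in points, not two corner points.
System.util.RectangleJ rect = new System.util.RectangleJ(70, 80, 420, 500);
RenderFilter[] filter = {new RegionTextRenderFilter(rect)};
ITextExtractionStrategy strategy = new FilteredTextRenderListener(
        new LocationTextExtractionStrategy(), filter);
string text = PdfTextExtractor.GetTextFromPage(reader, 1, strategy);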
Bruno Lowagie answered Oct 15 '22 03:10