I need to extract text (word by word) from a PDF file.
import java.io.*;
import com.itextpdf.text.*;
import com.itextpdf.text.pdf.*;
import com.itextpdf.text.pdf.parser.*;
public class pdf {
    private static String INPUTFILE = "http://ontology.buffalo.edu/ontology%28PIC%29.pdf" ;
    private static String OUTPUTFILE = "c:/new3.pdf";
    public static void main(String[] args) throws DocumentException,
            IOException {
        Document document = new Document();
        PdfWriter writer = PdfWriter.getInstance(document,
        new FileOutputStream(OUTPUTFILE));
        document.open();
        PdfReader reader = new PdfReader(INPUTFILE);
        int n = reader.getNumberOfPages();
        PdfImportedPage page;
        // Go through all pages
        for (int i = 1; i <= n; i++) {
                page = writer.getImportedPage(reader, i);
                System.out.println(i);
                Image instance = Image.getInstance(page);
                document.add(instance);
        }
        document.close();
        PdfReader readerN = new PdfReader(OUTPUTFILE);
        PdfTextExtractor parse = new PdfTextExtractor();
        for (int i = 1; i <= n; i++)
            System.out.println(parse.getTextFromPage(reader, i));
    }
}
When I compile the code, I have this error:
the constructor PdfTextExtractor is undefined
How do I fix this?
In iText, PdfTextExtractor contains only static methods and its constructor is private, so it cannot be instantiated with new.
You can call it like so:
String myLine = PdfTextExtractor.getTextFromPage(reader, pageNumber);
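Since the question asks for the text word by word, here is a minimal sketch of one way to split a page's text into words once you have it as a String. The WordSplitter class and its splitWords method are hypothetical helpers written for illustration, not part of iText, and the sample string stands in for whatever getTextFromPage returns. It assumes that "word" means a run of non-whitespace characters, so punctuation stays attached.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of iText): splits extracted page text into words.
final class WordSplitter {

    // Splits on runs of whitespace (spaces, tabs, newlines).
    // Punctuation stays attached to the word it follows.
    static List<String> splitWords(String pageText) {
        List<String> words = new ArrayList<>();
        if (pageText == null) {
            return words;
        }
        for (String token : pageText.trim().split("\\s+")) {
            // An empty or all-whitespace input yields a single empty token; skip it.
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Assumption: in real use this string would come from
        // PdfTextExtractor.getTextFromPage(reader, pageNumber).
        String pageText = "Ontology is the\nscience of  what is.";
        for (String word : splitWords(pageText)) {
            System.out.println(word);
        }
    }
}
```

If you need punctuation stripped or hyphenated line breaks rejoined, that would have to be handled separately. Text extracted from a PDF does not always come out in reading order.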
If you want to extract all the text from a PDF file and save it to a text file, you can use the code below.
It uses the pdfutil.jar library.
import java.io.IOException;
import java.io.PrintWriter;
import com.testautomationguru.utility.PDFUtil;
public class PDFToText {
    public static void main(String[] args) {
        String pdfFilePath = "C:\\abc.pdf";
        PDFUtil pdfUtil = new PDFUtil();
        try {
            // Extract the text of the whole PDF as a single String
            String content = pdfUtil.getText(pdfFilePath);
            // try-with-resources closes the writer even if writing fails
            try (PrintWriter out = new PrintWriter("C:\\abc.txt")) {
                out.println(content);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}