Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

iText 7.0.4.0 - Converting PdfDocument to byte array

Tags:

c#

itext

itext7

I'm attempting to split a PDF file page by page, and get each page file's byte array. However, I'm having trouble converting each page to byte array in iText version 7.0.4 for C#.

Methods referenced in other solutions rely on PdfWriter.GetInstance or PdfCopy, which seems to no longer exist in iText version 7.0.4.

I've gone through iText's sample codes and API documents, but I have not been able to extract any useful information out of them.

using (Stream stream = new MemoryStream(pdfBytes))
using (PdfReader reader = new PdfReader(stream))
using (PdfDocument pdfDocument = new PdfDocument(reader))
{
    PdfSplitter splitter = new PdfSplitter(pdfDocument);

    // My Attempt #1 - None of the document's functions seem to be of help.
    foreach (PdfDocument splitPage in splitter.SplitByPageCount(1))
    {
        // ??      
    }

    // My Attempt #2 - GetContentBytes != pdf file bytes.
    for (int i = 1; i <= pdfDocument.GetNumberOfPages(); i++)
    {
        PdfPage page = pdfDocument.GetPage(i);
        byte[] bytes = page.GetContentBytes();
    }
}

Any help would be much appreciated.

like image 936
James.K Avatar asked Sep 23 '17 03:09

James.K


Video Answer


1 Answers

Your approach of using PdfSplitter is one of the best ways to approach your task. Maybe not so much is available out of the box, but PdfSplitter is highly customizable and if you take a look at the implementation or simply the API, it becomes clear which are correct points for injecting your own customized behavior.

You should override GetNextPdfWriter to provide any output media you want the documents to be created at. You can also use IDocumentReadyListener to define the action that will be performed once another document is ready.

I am attaching one of the implementations that can achieve your goal:

class ByteArrayPdfSplitter : PdfSplitter {

    private MemoryStream currentOutputStream;

    public ByteArrayPdfSplitter(PdfDocument pdfDocument) : base(pdfDocument) {
    }

    protected override PdfWriter GetNextPdfWriter(PageRange documentPageRange) {
        currentOutputStream = new MemoryStream();
        return new PdfWriter(currentOutputStream);
    }

    public MemoryStream CurrentMemoryStream {
        get { return currentOutputStream; }
    }

    public class DocumentReadyListender : IDocumentReadyListener {

        private ByteArrayPdfSplitter splitter;

        public DocumentReadyListender(ByteArrayPdfSplitter splitter) {
            this.splitter = splitter;
        }

        public void DocumentReady(PdfDocument pdfDocument, PageRange pageRange) {
            pdfDocument.Close();
            byte[] contents = splitter.CurrentMemoryStream.ToArray();
            String pageNumber = pageRange.ToString();
        }
    }
}

The calls would be basically as you did, but with custom document ready event:

PdfDocument docToSplit = new PdfDocument(new PdfReader(path));
ByteArrayPdfSplitter splitter = new ByteArrayPdfSplitter(docToSplit);
splitter.SplitByPageCount(1, new ByteArrayPdfSplitter.DocumentReadyListender(splitter));
like image 181
Alexey Subach Avatar answered Oct 10 '22 22:10

Alexey Subach