
What page-image generating technology should I use?

I'm building a desktop application right now that presents its human-readable output as XHTML displayed in a WebBrowser control. Eventually, this output is going to have to be converted from an XHTML file to a document image in an imaging system. Unlike XHTML documents, the document image has to be divided into physical pages; additionally - and this is the part that's killing me - there need to be headers and footers on these pages.

Much as I would like to, I can't simply make the WebBrowser print to a file - the header/footer options it supports aren't anywhere near sophisticated enough. So I'm casting about trying to figure out what the right technology is for generating these images.

It seems likely to me (though it's not mandatory) that what I'll end up doing is producing PDF versions of the HTML documents (so that I can add headers and footers) and then rendering the PDFs as TIFFs, which is the ultimate format that the imaging system wants. So what I'm considering:

  • Use some kind of XHTML-to-PDF conversion software. The problem with this is that without doing a lot of evaluation and testing I can't figure out if the products I've looked at even have the ability to do what I need, which is to take existing XHTML documents, decorate them with headers and footers and paginate them.

  • Use XSL-FO to generate the PDFs. Being a ninja-level XSLT geek helps here (that's how I'm producing the XHTML in the first place), but it still seems like an awkward and slow solution with a lot of moving parts. Also this means I'm sticking a big clunky Java program into the middle of my nice clean .NET system, though I'm certainly enough of a grownup to do that if it's the right answer.

  • Use some other technology that I haven't even thought of yet, like LaTeX. Maybe there's some miraculous page-imaging tool that turns XHTML directly into TIFFs with page headers and footers. That would be ideal.
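To make the requirement concrete, here is a rough Python sketch of the decoration step that any of these options would have to perform before the pages are rasterized. It is illustrative only: the `Page` type, the `paginate` function, and the line-based page model are assumptions made up for this example, not part of any product mentioned above. A real renderer would measure laid-out content rather than count lines.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """One physical page: hypothetical model used only for this sketch."""
    header: str
    body: List[str]
    footer: str


def paginate(lines: List[str], lines_per_page: int, title: str) -> List[Page]:
    """Split content lines into pages and attach a header and footer to each."""
    if lines_per_page < 1:
        raise ValueError("lines_per_page must be at least 1")
    # Chunk the content; an empty document still yields one (empty) page.
    chunks = [
        lines[i:i + lines_per_page]
        for i in range(0, len(lines), lines_per_page)
    ] or [[]]
    total = len(chunks)
    # The footer needs the total page count, so it is built after chunking.
    return [
        Page(header=title, body=chunk, footer=f"Page {n} of {total}")
        for n, chunk in enumerate(chunks, start=1)
    ]
```

The point of the sketch is that the "Page n of N" footer depends on the total page count, so the headers and footers can only be applied after the whole document has been broken into pages.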

My primary concerns are:

  • I'm building a commercial product; whatever technology I use needs to be affordable and supportable. It doesn't have to be free.

  • I don't want to disappear down a rabbit hole for three months banging on this stuff to get it to work. This intuitively seems like the kind of problem space where I can lose a lot of time just evaluating and rejecting tools.

  • Whatever solution I adopt needs to be relatively immune to formatting changes in the XHTML. The whole reason I'm using XSLT and producing XHTML in the first place is that the documents I'm producing are being dynamically assembled using business rules that change all the time.

I've spent a lot of time searching for alternatives and haven't found anything that's obviously the answer. But maybe one of you fine people has already solved this problem, and if so, I would like to stand on your shoulders.

Robert Rossney asked Jan 29 '09 20:01



1 Answer

Edit (2010-11-28 12:30 PM PST) Please +1 this answer if you download my code. I notice my CodePlex sample has been downloaded hundreds of times. The code isn't spectacular, but it works as a starting point, with lots of links to source help included. Thanks! +tom

Edit (2009-03-29 9:00 AM PST) Posted sample conversion.

Edit (2009-03-23 12:30 PM PST, published to CodePlex) I developed a solution for this and posted it to CodePlex. The published version 2.0 is written using the WPF MVVM pattern. TIFF files (one per page) are output to c:\Temp\XhtmlToTiff. XAML and XPS formats are created as well. A compiled, installable version is available at CricketSoft.com.


Have you tried the "Microsoft XPS Document Writer"? This is a software-only printer that generates paged output from a variety of sources, including web pages.

There is an SDK for working with XPS documents and Open XML docs in general. Here is a How-to article by Beth Massi: "Accessing Open XML Document Parts with the Open XML SDK".

+tom

10 revs answered Oct 14 '22 10:10