
How to speed up generation of Word files from C#?

I'm working on an application that generates a relatively large amount of Word output. Currently, we're using Word Interop services to do the document creation, but it's quite slow, especially in older (pre-2007) versions of Office. We'd like to speed up the generation.

I haven't done a lot of profiling yet, but I'm pretty confident that the problem is that we're making tons of COM calls. I'm hoping that profiling will yield a subset of calls that are slower than the others, but my gut tells me that it's probably a question of COM overhead (or Word Interop overhead), and not just a few slow calls.

Also, the product can generate HTML output, and that process (a) is very fast, and (b) uses pretty much the same codepaths, just with a different subclass for the HTML-specific pieces of functionality. So I'm pretty sure that our algorithm isn't fundamentally slow.

So... I'm looking for suggestions for alternate ways to accelerate the generation of Word files.

We can't just rename the generated HTML files to .doc, and we can't generate RTF instead. In both cases, important formatting information gets lost, and in the RTF case, inlined graphics don't work robustly.

One of the approaches we're evaluating is programmatically generating and opening a Word file (via interop) from a template that has a macro that knows how to consume a flat file and create the requisite output. We're interested in feedback about that approach, as well as any other ideas for speeding things up.

Patrick Linskey asked Dec 07 '09 22:12


2 Answers

If you can afford it, I'd recommend the Aspose.Words product. It's very fast, and Word does not need to be installed.

It's also much easier to use than Office interop.

Crispy answered Oct 09 '22 00:10


Your macro approach is exactly how we sped up slow Excel interop (using version 2003, I think).

We found (at least with Excel) that much of the slowness was due to repeated individual calls via the interop. We started to bunch up commands (i.e. format large ranges, and then change specific cells as required, rather than formatting each cell individually), and logically moved on to macros.
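The same batching idea might carry over to Word roughly like this. This is only a sketch under assumptions, not something we measured: `BatchedWordInsert`, `JoinParagraphs` and `AppendAll` are made-up names, and `document` is assumed to be a late-bound Word `Document` COM object. The point is that the text is assembled in managed code and then handed to Word in a single `Range.InsertAfter` call, instead of one call per paragraph.

```csharp
using System;
using System.Collections.Generic;

public static class BatchedWordInsert
{
    // Word uses a carriage return ("\r") as its paragraph mark, so joining
    // with "\r" yields one Word paragraph per input string.
    public static string JoinParagraphs(IEnumerable<string> paragraphs)
    {
        return string.Join("\r", paragraphs) + "\r";
    }

    // 'document' is assumed to be a Word Document COM object (late-bound).
    // The whole batch crosses the COM boundary once, rather than once per paragraph.
    public static void AppendAll(dynamic document, IEnumerable<string> paragraphs)
    {
        document.Content.InsertAfter(JoinParagraphs(paragraphs));
    }
}
```

Formatting would still need separate calls, so the gain depends on how much of the output can be applied to large ranges at once.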

I think that the macro + template approach would happily translate.
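For what it's worth, the C# side of a macro + template setup could look something like the sketch below. Everything specific in it is an assumption for illustration: the flat-file format (one paragraph per line, style name and text separated by a tab), the macro name `BuildFromFlatFile`, and the class and method names. It uses late binding (`dynamic`, so C# 4 or later) to avoid depending on a particular interop assembly version, and it needs Word installed to run. The C# code then makes only a handful of COM calls, and the macro does the heavy lifting inside Word.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MacroDrivenWordExport
{
    // Hypothetical flat-file format: one paragraph per line, "StyleName<TAB>Text".
    public static string ToFlatFileLine(string styleName, string text)
    {
        // Tabs and line breaks would corrupt the format, so replace them with spaces.
        string clean = text.Replace('\t', ' ').Replace('\r', ' ').Replace('\n', ' ');
        return styleName + "\t" + clean;
    }

    // Writes all paragraphs (Key = style name, Value = text) to the flat file.
    public static void WriteFlatFile(string path,
        IEnumerable<KeyValuePair<string, string>> paragraphs)
    {
        using (var writer = new StreamWriter(path))
        {
            foreach (var p in paragraphs)
                writer.WriteLine(ToFlatFileLine(p.Key, p.Value));
        }
    }

    // Creates a document from the template, lets the macro build the content,
    // and saves the result. Requires Windows with Word installed.
    public static void Generate(string templatePath, string flatFilePath, string outputPath)
    {
        Type wordType = Type.GetTypeFromProgID("Word.Application");
        if (wordType == null)
            throw new InvalidOperationException("Word does not appear to be installed.");

        dynamic app = Activator.CreateInstance(wordType);
        try
        {
            dynamic doc = app.Documents.Add(templatePath);
            // "BuildFromFlatFile" is a hypothetical macro stored in the template;
            // it is assumed to take the flat-file path as its only argument.
            app.Run("BuildFromFlatFile", flatFilePath);
            doc.SaveAs(outputPath);
            doc.Close(false);   // already saved, so discard any pending changes
        }
        finally
        {
            app.Quit(false);    // always shut Word down, even on failure
        }
    }
}
```

You'd call `WriteFlatFile` first and then `Generate` with the template, the flat file and the target path. The macro itself (VBA inside the template) would read the file and apply the styles.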

Gregory answered Oct 08 '22 22:10