Doxygen is Slow

Tags: performance, multithreading, multiprocessing, doxygen

Doxygen takes about 12 hours to run on our code base. This is primarily because there is a lot of code to process (~1.5M lines). However, it's very quickly approaching the point at which we can't do nightly documentation updates because they take too long. We've already had to reduce the graph depth to get it down to 12 hours.

I've tried the standard approaches, but I really do need high-quality output, and this includes graphs and SEARCH_INCLUDES. I have a fairly good machine to run Doxygen on, but Doxygen doesn't take advantage of its many cores. (It pegs a single CPU on the build server, which is only 4% of the available system.) Having a multithreaded Dot build is handy, though that's only half or so of the build time.

Are there any techniques I can use to run doxygen via multiple processes and manually shard the task? I've seen some talk about creating tag files, but I don't understand enough about them to know if they'd do what I want. What I'm looking for is something like:

    doxygen Doxyfile-folder1
    doxygen Doxyfile-folder2
    doxygen Doxyfile-folder3
    doxygen Doxyfile-folder4
    doxygen-join output/folder1/html output/folder2/html output/folder3/html output/folder4/html

Of course, I'm just making stuff up, but that's an idea of what I'm looking for. Also, I'd use a lot more than 4 processes.

618

asked Nov 23 '11 18:11

alficles

1 Answer

Tag files are typically the way to go if:

- you have a number of logically coherent groups of source files (let's call them components),
- you know the dependencies between the components, e.g. component A uses components B and C, and component B only uses C,
- it is OK (or even preferred) that the index files (e.g. the lists of files/classes/functions) are limited to a single component, and
- you are interested in HTML output.

A tag file is basically just a structured list of symbols with links to the location in the documentation. Tag files allow doxygen to make links from the documentation of one component to that of another.
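
To make "structured list of symbols with links" concrete, here is a minimal sketch that reads a tag file and prints where each symbol would link to. The sample XML is hand-written and simplified (real tag files carry more fields, and details such as whether `filename` includes the `.html` suffix can vary between doxygen versions), and `list_links` is a hypothetical helper, not part of doxygen.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample in the general shape of a doxygen tag file.
# The class name and anchor value are made up for illustration.
SAMPLE_TAG = """<tagfile>
  <compound kind="class">
    <name>Widget</name>
    <filename>classWidget.html</filename>
    <member kind="function">
      <name>resize</name>
      <anchorfile>classWidget.html</anchorfile>
      <anchor>a0123456789abcdef</anchor>
    </member>
  </compound>
</tagfile>
"""


def list_links(tag_xml, html_root):
    """Return (symbol, link) pairs for every compound and member in a tag file.

    html_root is the location of the component's HTML docs, i.e. the part
    after the '=' in a TAGFILES entry.
    """
    links = []
    root = ET.fromstring(tag_xml)
    for compound in root.findall("compound"):
        cname = compound.findtext("name")
        links.append((cname, f"{html_root}/{compound.findtext('filename')}"))
        for member in compound.findall("member"):
            url = f"{html_root}/{member.findtext('anchorfile')}#{member.findtext('anchor')}"
            links.append((f"{cname}::{member.findtext('name')}", url))
    return links


if __name__ == "__main__":
    for symbol, url in list_links(SAMPLE_TAG, "path/to/compB/htmldocs"):
        print(symbol, "->", url)
```

This is roughly the lookup doxygen performs when component A mentions a symbol that is documented in component B.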

It is a two-step process:

First you run doxygen on each component to generate the tag file for that component. You can do this by disabling all output and using GENERATE_TAGFILE. So for component A, a Doxyfile.tagonly would have the following settings:
```
GENERATE_HTML         = NO
GENERATE_LATEX        = NO
GENERATE_RTF          = NO
GENERATE_MAN          = NO
GENERATE_TAGFILE      = compA.tag
```
You'll notice that running doxygen this way is very fast.

The second step is to generate the actual documentation. For component A you need a Doxyfile which includes the tag files of the components B and C since we determined A depends on these components.

    GENERATE_HTML         = YES
    GENERATE_LATEX        = NO
    GENERATE_RTF          = NO
    GENERATE_MAN          = NO
    TAGFILES              = path/to/compB/compB.tag=path/to/compB/htmldocs \
                            path/to/compC/compC.tag=path/to/compC/htmldocs
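
With many components, writing the TAGFILES entries by hand gets tedious, so they can be derived from the dependency list. The sketch below is one possible way to do that, not the script I actually used. It assumes the directory layout from the example above (`path/to/<component>/<component>.tag` and `path/to/<component>/htmldocs`), and `DEPENDENCIES` and `tagfiles_setting` are hypothetical names. It produces only the TAGFILES setting; INPUT and the other settings still have to come from your base Doxyfile.

```python
from pathlib import PurePosixPath

# Hypothetical dependency map mirroring the example in this answer:
# A uses B and C, B uses only C, C uses nothing.
DEPENDENCIES = {
    "compA": ["compB", "compC"],
    "compB": ["compC"],
    "compC": [],
}


def tagfiles_setting(component, dependencies, root="path/to"):
    """Build the TAGFILES setting for one component from its direct dependencies.

    Assumes each dependency lives at <root>/<dep>, with its tag file at
    <root>/<dep>/<dep>.tag and its HTML output at <root>/<dep>/htmldocs.
    """
    entries = []
    for dep in dependencies[component]:
        base = PurePosixPath(root) / dep
        entries.append(f"{base / (dep + '.tag')}={base / 'htmldocs'}")
    # A backslash at the end of a line continues the value on the next line.
    joined = " \\\n                        ".join(entries)
    return f"TAGFILES              = {joined}"


if __name__ == "__main__":
    print(tagfiles_setting("compA", DEPENDENCIES))
```

Only direct dependencies are listed, as in the example: A gets the tag files of B and C, while B gets only C's.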

Using this approach I have been able to generate documentation for 20M+ lines of code distributed over 1500+ components in under 3 hours on a standard desktop PC (Core i5 with 8 GB RAM and 64-bit Linux), including source browsing, full call graphs, and UML-style diagrams of all data structures. Note that the first step only took 10 minutes.

To accomplish this I made a script to generate the Doxyfiles for each component based on the list of components and their direct dependencies. In the first step I run 8 instances of doxygen in parallel (using http://www.gnu.org/s/parallel/). In the second step I run 4 instances of doxygen in parallel.
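
If you would rather not depend on GNU parallel, the same two-phase schedule can be sketched in Python. This is an illustration under stated assumptions, not the script I used. It assumes each component directory already contains a `Doxyfile.tagonly` and a `Doxyfile`, and `run_phase` and `build_all` are hypothetical helper names. The only doxygen behaviour it relies on is that `doxygen <configfile>` runs one build. The job counts 8 and 4 are the ones mentioned above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_doxygen(doxyfile):
    """Run one doxygen process on the given config file; return its exit code."""
    return subprocess.run(["doxygen", doxyfile]).returncode


def run_phase(doxyfiles, jobs, runner=run_doxygen):
    """Run `runner` over all Doxyfiles, at most `jobs` at a time.

    Threads are enough here: each one just waits on an external process.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = list(pool.map(runner, doxyfiles))
    return dict(zip(doxyfiles, codes))


def build_all(components, runner=run_doxygen):
    """Step 1: generate every tag file. Step 2: generate the HTML docs."""
    # All tag files must exist before any full build starts, because the
    # full builds read the tag files of their dependencies.
    tag_results = run_phase(
        [f"{c}/Doxyfile.tagonly" for c in components], jobs=8, runner=runner
    )
    failed = [path for path, code in tag_results.items() if code != 0]
    if failed:
        raise RuntimeError(f"tag file generation failed for: {failed}")
    return run_phase([f"{c}/Doxyfile" for c in components], jobs=4, runner=runner)
```

Calling `build_all(["compA", "compB", "compC"])` would then run both steps. The `runner` parameter exists only so the scheduling can be tried out without doxygen installed.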

See http://www.doxygen.nl/manual/external.html for more info about tag files.

143

answered Oct 13 '22 01:10

doxygen
