
Code documentation: How much is too much?

How much code documentation in your .NET source is too much?

Some background: I inherited a large codebase that I've talked about in some of the other questions I've posted here on SO. One of the "features" of this codebase is a God Class, a single static class with >3000 lines of code encompassing several dozen static methods. It's everything from Utilities.CalculateFYBasedOnMonth() to Utilities.GetSharePointUserInfo() to Utilities.IsUserIE6(). It's all good code that doesn't need to be rewritten, just refactored into an appropriate set of libraries. I have that planned out.

Since these methods are moving into a new business layer, and my role on this project is to prepare the system for maintenance by other developers, I'm thinking about solid code documentation. While these methods all have good inline comments, they don't all have good (or any) code doco in the form of XML comments. Using a combo of GhostDoc and Sandcastle (or Document X), I can create some pretty nice HTML documentation and post it to SharePoint, which would let developers understand more about what the code does without navigating through the code itself.

The more documentation there is in the code, the more difficult the code becomes to navigate. I'm beginning to wonder whether the XML comments will make the code harder to maintain than, say, a simple // comment on each method would.

These examples are from the Document X sample:

        /// <summary>
        /// Adds a new %Customer:CustomersLibrary.Customer% to the collection.
        /// </summary>
        /// <returns>A new Customer instance that represents the new customer.</returns>
        /// <example>
        ///     The following example demonstrates adding a new customer to the customers
        ///     collection. 
        ///     <code lang="CS" title="Example">
        /// CustomersLibrary.Customer newCustomer = myCustomers.Add(CustomersLibrary.Title.Mr, "John", "J", "Smith");
        ///     </code>
        ///     <code lang="VB" title="Example">
        /// Dim newCustomer As CustomersLibrary.Customer = myCustomers.Add(CustomersLibrary.Title.Mr, "John", "J", "Smith")
        ///     </code>
        /// </example>
        /// <seealso cref="Remove">Remove Method</seealso>
        /// <param name="Title">The customers title.</param>
        /// <param name="FirstName">The customers first name.</param>
        /// <param name="MiddleInitial">The customers middle initial.</param>
        /// <param name="LastName">The customers last name.</param>
        public Customer Add(Title Title, string FirstName, string MiddleInitial, string LastName)
        {
            // create new customer instance
            Customer newCust = new Customer(Title, FirstName, MiddleInitial, LastName);

            // add to internal collection
            mItems.Add(newCust);

            // return ref to new customer instance
            return newCust;
        }

And:

    /// <summary>
    /// Returns the number of %Customer:CustomersLibrary.Customer% instances in the collection.
    /// </summary>
    /// <value>
    /// An Int value that specifies the number of Customer instances within the
    /// collection.
    /// </value>
    public int Count
    {
        get 
        {
            return mItems.Count;
        }
    }

So my question to you: do you document all of your code with XML comments with the goal of using something like NDoc (RIP) or Sandcastle? If not, how do you decide what gets documentation and what doesn't? Something like an API would obviously have doco, but what about a codebase that you're going to hand off to another team to maintain?

What do you think I should do?

Robert S. asked Nov 13 '08 21:11



2 Answers

Nobody has mentioned that your code doesn't need to be bloated; the XML documentation can live in a separate file:

    /// <include file="Documentation/XML/YourClass.xml" path="//documentation/members[@name='YourClass']/*"/>

Your Add method then needs no other XML comments above it, or, if you prefer, just the summary (which is merged with the content of the separate file).
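As a rough sketch of how that split might look: the class below is hypothetical (`CustomerCollection` and the file name `Documentation/XML/CustomerCollection.xml` are made up for illustration), and the one-`members`-element-per-member layout of the XML file is only an assumption about how you might organise it. The compiler only requires that the XPath in `path` selects the tags to pull in, and it only processes `<include>` when XML documentation output is enabled.

```csharp
using System.Collections.Generic;

// Hypothetical class: the full documentation lives in
// Documentation/XML/CustomerCollection.xml instead of above each member.
//
// Assumed layout of that file (illustrative only):
//
//   <documentation>
//     <members name="CustomerCollection.Add">
//       <summary>Adds a customer name to the collection.</summary>
//       <param name="name">The name to add.</param>
//     </members>
//     <members name="CustomerCollection.Count">
//       <summary>Returns the number of names in the collection.</summary>
//     </members>
//   </documentation>
public class CustomerCollection
{
    private readonly List<string> mItems = new List<string>();

    /// <include file="Documentation/XML/CustomerCollection.xml" path="//documentation/members[@name='CustomerCollection.Add']/*"/>
    public void Add(string name)
    {
        mItems.Add(name);
    }

    /// <include file="Documentation/XML/CustomerCollection.xml" path="//documentation/members[@name='CustomerCollection.Count']/*"/>
    public int Count
    {
        get { return mItems.Count; }
    }
}
```

Each member carries a single line of documentation markup, so the source stays easy to scan while the generated help still gets the full text.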

The XML format is far more powerful than the rubbish format that is Javadoc and the derivatives you find in PHP/JavaScript (though Javadoc paved the way for the XML syntax). Plus, the tools available are far superior, and the default look of the help docs is more readable and easier to customise (I can say that from having written doclets and comparing that process to Sandcastle/DocProject/NDoc).

Chris S answered Oct 14 '22 19:10


I think a good part of the problem here is the verbose and crufty XML documentation syntax MS has foisted on us (JavaDoc wasn't much better either). The question of how to format it is, to a large degree, independent of how much is appropriate.

Using the XML format for comments is optional. You can use Doxygen or other tools that recognize different formats. Or write your own document extractor; it isn't as hard as you might think to do a reasonable job, and it is a good learning experience.
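To give a feel for the scale of such a tool, here is a minimal sketch of an extractor. Everything in it is hypothetical (`DocExtractor` and `Extract` are names invented for this example), and it assumes that the first non-blank line after a run of `///` comments is the declaration being documented, which breaks down when attributes such as `[Obsolete]` sit between the comment and the member.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any real tool.
public static class DocExtractor
{
    // Pairs each run of consecutive "///" lines with the first non-blank
    // line that follows it (assumed to be the documented declaration).
    // Key = declaration line, Value = comment text joined with spaces.
    public static List<KeyValuePair<string, string>> Extract(IEnumerable<string> sourceLines)
    {
        var results = new List<KeyValuePair<string, string>>();
        var pending = new List<string>();

        foreach (string raw in sourceLines)
        {
            string line = raw.Trim();
            if (line.StartsWith("///", StringComparison.Ordinal))
            {
                // Strip the marker and keep the comment text.
                pending.Add(line.Substring(3).Trim());
            }
            else if (pending.Count > 0 && line.Length > 0)
            {
                // First non-blank line after the comment run.
                results.Add(new KeyValuePair<string, string>(line, string.Join(" ", pending)));
                pending.Clear();
            }
        }
        return results;
    }
}
```

Feeding it the output of `System.IO.File.ReadAllLines` for a source file yields declaration/comment pairs that you could then render in whatever format you like. A real extractor would need a proper parser to attach comments to the right members reliably.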

The question of how much is more difficult. I think the idea of self-documenting code is fine if you are digging in to maintain some code. If you are just a client, you shouldn't need to read the code to understand how a given function works. Lots of information is implicit in the data types and names, of course, but there is a great deal that is not. For instance, passing in a reference to an object tells you what is expected, but not how a null reference will be handled. Or, in the OP's code, how any whitespace at the beginning or the end of the arguments is handled. I believe there is far more of this type of information that ought to be documented than is usually recognized.

To me, it takes natural-language documentation to describe the purpose of the function, as well as any pre- and post-conditions on the function, its arguments, and its return values that cannot be expressed through the programming language's syntax.
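As an illustration of documenting that kind of contract, here is a sketch in which the comments state exactly what the signature cannot. The `Customer` type is hypothetical, and the rules it enforces (null is rejected, surrounding whitespace is trimmed, empty values are rejected) are assumptions picked for the example, not behavior taken from the OP's code.

```csharp
using System;

// Hypothetical Customer type; the null-handling and trimming rules are
// assumptions chosen for illustration only.
public class Customer
{
    public string FirstName { get; private set; }
    public string LastName { get; private set; }

    /// <summary>Creates a customer from a first and last name.</summary>
    /// <param name="firstName">First name. Must not be null; leading and trailing whitespace is removed.</param>
    /// <param name="lastName">Last name. Must not be null; leading and trailing whitespace is removed.</param>
    /// <exception cref="ArgumentNullException">Either argument is null.</exception>
    /// <exception cref="ArgumentException">Either argument is empty after trimming.</exception>
    public Customer(string firstName, string lastName)
    {
        FirstName = Normalize(firstName, "firstName");
        LastName = Normalize(lastName, "lastName");
    }

    // Enforces the documented contract for a single argument.
    private static string Normalize(string value, string paramName)
    {
        if (value == null)
            throw new ArgumentNullException(paramName);

        string trimmed = value.Trim();
        if (trimmed.Length == 0)
            throw new ArgumentException("Value must not be empty or whitespace.", paramName);

        return trimmed;
    }
}
```

None of these rules is visible in the signature `Customer(string, string)`, so a caller who only reads the generated help would have no other way to learn them.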

Jeff Kotula answered Oct 14 '22 20:10