
Go parser not detecting Doc comments on struct type

I am trying to read the Doc comments associated with a struct type using Go’s parser and ast packages. In this example, the code simply uses itself as the source.

package main

import (
    "fmt"
    "go/ast"
    "go/parser"
    "go/token"
)

// FirstType docs
type FirstType struct {
    // FirstMember docs
    FirstMember string
}

// SecondType docs
type SecondType struct {
    // SecondMember docs
    SecondMember string
}

// Main docs
func main() {
    fset := token.NewFileSet() // positions are relative to fset

    d, err := parser.ParseDir(fset, "./", nil, parser.ParseComments)
    if err != nil {
        fmt.Println(err)
        return
    }

    for _, f := range d {
        ast.Inspect(f, func(n ast.Node) bool {
            switch x := n.(type) {
            case *ast.FuncDecl:
                fmt.Printf("%s:\tFuncDecl %s\t%s\n", fset.Position(n.Pos()), x.Name, x.Doc)
            case *ast.TypeSpec:
                fmt.Printf("%s:\tTypeSpec %s\t%s\n", fset.Position(n.Pos()), x.Name, x.Doc)
            case *ast.Field:
                fmt.Printf("%s:\tField %s\t%s\n", fset.Position(n.Pos()), x.Names, x.Doc)
            }

            return true
        })
    }
}

The comment docs for the func and fields are output no problem, but for some reason the ‘FirstType docs’ and ‘SecondType docs’ are nowhere to be found. What am I missing? Go version is 1.1.2.

(To run the above, save it into a main.go file, and go run main.go)

Matt Sherman asked Oct 25 '13 03:10



2 Answers

Great question!

Looking at the source code of go/doc, we can see that it has to deal with this same case in readType function. There, it says:

func (r *reader) readType(decl *ast.GenDecl, spec *ast.TypeSpec) {
...
    // compute documentation
    doc := spec.Doc
    spec.Doc = nil // doc consumed - remove from AST
    if doc == nil {
        // no doc associated with the spec, use the declaration doc, if any
        doc = decl.Doc
    }
...

Notice in particular how it needs to deal with the case where the AST does not have a doc attached to the TypeSpec. To do this, it falls back on the GenDecl. This gives us a clue as to how we might use the AST directly to parse doc comments for structs. Adapting the for loop in the question code to add a case for *ast.GenDecl:

for _, f := range d {
    ast.Inspect(f, func(n ast.Node) bool {
        switch x := n.(type) {
        case *ast.FuncDecl:
            fmt.Printf("%s:\tFuncDecl %s\t%s\n", fset.Position(n.Pos()), x.Name, x.Doc.Text())
        case *ast.TypeSpec:
            fmt.Printf("%s:\tTypeSpec %s\t%s\n", fset.Position(n.Pos()), x.Name, x.Doc.Text())
        case *ast.Field:
            fmt.Printf("%s:\tField %s\t%s\n", fset.Position(n.Pos()), x.Names, x.Doc.Text())
        case *ast.GenDecl:
            fmt.Printf("%s:\tGenDecl %s\n", fset.Position(n.Pos()), x.Doc.Text())
        }

        return true
    })
}

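Running this gives us the output below. It was evidently produced by printing x.Doc directly with %s, as in the question; with x.Doc.Text(), each line would instead show only the comment text, or nothing where the doc is nil: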

main.go:3:1:    GenDecl %!s(*ast.CommentGroup=<nil>)
main.go:11:1:   GenDecl &{[%!s(*ast.Comment=&{69 // FirstType docs})]}
main.go:11:6:   TypeSpec FirstType  %!s(*ast.CommentGroup=<nil>)
main.go:13:2:   Field [FirstMember] &{[%!s(*ast.Comment=&{112 // FirstMember docs})]}
main.go:17:1:   GenDecl &{[%!s(*ast.Comment=&{155 // SecondType docs})]}
main.go:17:6:   TypeSpec SecondType %!s(*ast.CommentGroup=<nil>)
main.go:19:2:   Field [SecondMember]    &{[%!s(*ast.Comment=&{200 // SecondMember docs})]}
main.go:23:1:   FuncDecl main   &{[%!s(*ast.Comment=&{245 // Main docs})]}
main.go:33:23:  Field [n]   %!s(*ast.CommentGroup=<nil>)
main.go:33:35:  Field []    %!s(*ast.CommentGroup=<nil>)

And, hey!

We've printed out the long-lost FirstType docs and SecondType docs! But this is unsatisfactory. Why is the doc not attached to the TypeSpec? The go/doc/reader.go file goes to extraordinary lengths to circumvent this issue, actually generating a fake GenDecl and passing it to the readType function mentioned earlier, if there is no documentation associated with the struct declaration!

fake := &ast.GenDecl{
    Doc: d.Doc,
    // don't use the existing TokPos because it
    // will lead to the wrong selection range for
    // the fake declaration if there are more
    // than one type in the group (this affects
    // src/cmd/godoc/godoc.go's posLink_urlFunc)
    TokPos: s.Pos(),
    Tok:    token.TYPE,
    Specs:  []ast.Spec{s},
}

But why all this?

Imagine we changed the type definitions from code in the question slightly (defining structs like this is not common, but still valid Go):

// This documents FirstType and SecondType together
type (
    // FirstType docs
    FirstType struct {
        // FirstMember docs
        FirstMember string
    }

    // SecondType docs
    SecondType struct {
        // SecondMember docs
        SecondMember string
    }
)

Run the code (including the case for ast.GenDecl) and we get:

main.go:3:1:    GenDecl %!s(*ast.CommentGroup=<nil>)
main.go:11:1:   GenDecl &{[%!s(*ast.Comment=&{69 // This documents FirstType and SecondType together})]}
main.go:13:2:   TypeSpec FirstType  &{[%!s(*ast.Comment=&{129 // FirstType docs})]}
main.go:15:3:   Field [FirstMember] &{[%!s(*ast.Comment=&{169 // FirstMember docs})]}
main.go:19:2:   TypeSpec SecondType &{[%!s(*ast.Comment=&{215 // SecondType docs})]}
main.go:21:3:   Field [SecondMember]    &{[%!s(*ast.Comment=&{257 // SecondMember docs})]}
main.go:26:1:   FuncDecl main   &{[%!s(*ast.Comment=&{306 // Main docs})]}
main.go:36:23:  Field [n]   %!s(*ast.CommentGroup=<nil>)
main.go:36:35:  Field []    %!s(*ast.CommentGroup=<nil>)

That's right

Now the struct type definitions have their docs, and the GenDecl has its own documentation, too. In the first case, posted in the question, the doc was attached to the GenDecl, since the AST sees the individual struct type definitions as "contractions" of the parenthesized version of type definitions, and wants to handle all definitions the same, whether they are grouped or not. The same thing would happen with variable definitions, as in:

// some general docs
var (
    // v docs
    v int

    // v2 docs
    v2 string
)
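
The fallback described above can be sketched directly against the AST. This is only a minimal sketch, not go/doc's actual implementation: typeDocs and sampleSrc are hypothetical names introduced here, and the input is parsed from a string instead of a directory. For each type it prefers the comment on the TypeSpec and otherwise uses the one on the enclosing GenDecl, so grouped and ungrouped declarations are handled the same way.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sampleSrc is a made-up input mixing an ungrouped and a grouped declaration.
const sampleSrc = `package sample

// Alone docs
type Alone struct{}

// Group docs
type (
	// Inner docs
	Inner struct{}

	Bare struct{}
)
`

// typeDocs (hypothetical helper) returns the doc text for every type in src,
// preferring the comment on the TypeSpec and falling back to the GenDecl.
func typeDocs(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "sample.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	docs := make(map[string]string)
	for _, decl := range f.Decls {
		gd, ok := decl.(*ast.GenDecl)
		if !ok || gd.Tok != token.TYPE {
			continue // skip funcs, imports, vars and consts
		}
		for _, spec := range gd.Specs {
			ts, ok := spec.(*ast.TypeSpec)
			if !ok {
				continue
			}
			doc := ts.Doc
			if doc == nil {
				doc = gd.Doc // ungrouped form, or a grouped spec with no comment
			}
			// Text is safe to call on a nil *ast.CommentGroup.
			docs[ts.Name.Name] = doc.Text()
		}
	}
	return docs, nil
}

func main() {
	docs, err := typeDocs(sampleSrc)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, name := range []string{"Alone", "Inner", "Bare"} {
		fmt.Printf("%s: %q\n", name, docs[name])
	}
}
```

Here Alone and Bare pick up the comment attached to their GenDecl, while Inner keeps the comment on its own TypeSpec. Whether an undocumented type in a group should inherit the group comment is a design choice; this sketch simply mirrors the readType excerpt quoted earlier.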

So if you wish to parse comments with pure AST, you need to be aware that this is how it works. But the preferred method, as @mjibson suggested, is to use go/doc. Good luck!

Herman Schaaf answered Nov 18 '22 05:11



You need to use the go/doc package to extract documentation from the AST:

package main

import (
    "fmt"
    "go/doc"
    "go/parser"
    "go/token"
)

// FirstType docs
type FirstType struct {
    // FirstMember docs
    FirstMember string
}

// SecondType docs
type SecondType struct {
    // SecondMember docs
    SecondMember string
}

// Main docs
func main() {
    fset := token.NewFileSet() // positions are relative to fset

    d, err := parser.ParseDir(fset, "./", nil, parser.ParseComments)
    if err != nil {
        fmt.Println(err)
        return
    }

    for k, f := range d {
        fmt.Println("package", k)
        p := doc.New(f, "./", 0)

        for _, t := range p.Types {
            fmt.Println("  type", t.Name)
            fmt.Println("    docs:", t.Doc)
        }
    }
}
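
On Go 1.14 or newer, doc.NewFromFiles offers a similar route that works from parsed files without going through ParseDir. The following is a minimal sketch under that assumption, feeding the question's two types in as a string; docTypes, questionSrc, and the import path example.com/sample are made up for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/doc"
	"go/parser"
	"go/token"
)

// questionSrc reproduces the two ungrouped types from the question.
const questionSrc = `package sample

// FirstType docs
type FirstType struct {
	// FirstMember docs
	FirstMember string
}

// SecondType docs
type SecondType struct {
	// SecondMember docs
	SecondMember string
}
`

// docTypes (hypothetical helper) maps each exported type name in src to the
// documentation that go/doc computes for it.
func docTypes(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "sample.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	// The import path is a placeholder; it is only recorded in the result.
	p, err := doc.NewFromFiles(fset, []*ast.File{f}, "example.com/sample")
	if err != nil {
		return nil, err
	}
	out := make(map[string]string)
	for _, t := range p.Types {
		out[t.Name] = t.Doc
	}
	return out, nil
}

func main() {
	docs, err := docTypes(questionSrc)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, name := range []string{"FirstType", "SecondType"} {
		fmt.Printf("type %s docs: %q\n", name, docs[name])
	}
}
```

Note that go/doc takes ownership of the AST it is given and may modify it, so parse the source again if the raw AST is needed afterwards.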
mjibson answered Nov 18 '22 04:11
