implement "find definition" in 77 lines of go

2023-01-09 #go

Most integrated development environments let you quickly jump from a symbol to its definition. In this post, we’ll implement a simple version of this feature in just 77 lines of Go code. Along the way, we’ll learn how to use the Go parser and type-checker included in the standard library. Let’s go!

Where we’re going

By the end of this post, we’ll have built a command-line tool that does this:

The tool takes a file location (path, line, and column) and returns the location of the corresponding definition. It can lookup any Go identifier, including the names of types, functions, and variables, both in the file’s package and any imported packages.

The full code is available at github.com/wedaly/find-definition-in-go. The next sections explain the implementation of lookupAndPrintGoDef, which is just 77 lines long.¹

Dependencies

We will use a few packages, including these two from the Go standard library:

go/ast represents the Abstract Syntax Tree (AST) of Go code. Every node in the abstract syntax tree corresponds to some part of a Go source file. For our purposes, the most important kind of node is ast.Ident, which represents an identifier in a Go program. The names of variables, functions, methods, and types are all represented as ast.Ident nodes in the abstract syntax tree.
go/types is a type-checker for Go code. One of its main responsibilities is identifier resolution, mapping identifiers in an AST to their declarations.² We will explore this in more detail below.

Type-checking a package requires first type-checking the package’s imports. So we need to understand the import graph, topologically sort it, and type-check the packages in order. Fortunately, there’s a helper package golang.org/x/tools/go/packages that does this for us.

Let’s build it

Setup: Main program

The main function is just boilerplate. We read three arguments: the file path, the line number, and the column number, then pass these to lookupAndPrintGoDef.

func main() {
	if len(os.Args) < 4 {
		fmt.Fprintf(os.Stderr, "Usage: %s FILE LINE COL\n", os.Args[0])
		os.Exit(1)
	}

	pathArg := os.Args[1]
	lineArg, err := strconv.Atoi(os.Args[2])
	if err != nil {
		fmt.Fprintf(os.Stderr, "Invalid line number %q\n", os.Args[2])
		os.Exit(1)
	}

	colArg, err := strconv.Atoi(os.Args[3])
	if err != nil {
		fmt.Fprintf(os.Stderr, "Invalid column number %q\n", os.Args[3])
		os.Exit(1)
	}

	err = lookupAndPrintGoDef(pathArg, lineArg, colArg)
	if err != nil {
		fmt.Fprintf(os.Stderr, "%s\n", err)
		os.Exit(1)
	}
}

All the interesting stuff happens in lookupAndPrintGoDef, which we’ll implement in the following sections.

Step 1: Load the Go package

First, we need to parse and type-check the Go package containing the file at path. To do this, we use the Load function from golang.org/x/tools/go/packages.

absPath, err := filepath.Abs(path)
if err != nil {
	return err
}

loadMode := (packages.NeedName |
	packages.NeedSyntax |
	packages.NeedDeps |
	packages.NeedTypes |
	packages.NeedTypesInfo)

cfg := &packages.Config{
	Mode: loadMode,
	Dir:  filepath.Dir(absPath),
}

pkgs, err := packages.Load(cfg, ".")
if err != nil {
	return err
} else if len(pkgs) == 0 {
	return fmt.Errorf("No packages loaded")
}

pkg := pkgs[0]

For loadMode, we’re passing some flags bitwise-OR’d together to tell packages.Load what information we need. In this case, we want the name of the package (NeedName), the AST (NeedSyntax), and the types (NeedTypes and NeedTypesInfo). Since type-checking requires loading imported packages, we also request those dependencies (NeedDeps).

In the configuration, we set cfg.Dir to the directory containing path, then call packages.Load(cfg, ".") to load the package in that directory. This is a small trick to avoid an error when the current working directory is outside a go module. Internally, packages.Load uses Go build tools, executing programs like go list to retrieve package information.³ If you run go list outside a Go module, you’ll get an error like:

go: go.mod file not found in current directory or any parent directory; see 'go help modules'

To avoid this error, we explicitly set the current working directory to the directory containing the target file. If that file is part of a Go module, then go list will succeed even if we invoke our tool from a different directory.

In the general case, packages.Load could return multiple packages, but here we expect at most one. Everything we need for the following steps is in pkgs[0].

Step 2: Find the AST identifier

Now that we’ve loaded the Go package, we need to find the ast.Ident at the target line and column. This is the reference whose definition we want to find.

The pkg that we loaded in step 1 has a Syntax field of type []*ast.File. These are the ASTs for the files in the package. We loop through them until we find one with the path for our target file:

var astFile *ast.File
for _, f := range pkg.Syntax {
	if pkg.Fset.Position(f.Pos()).Filename == absPath {
		astFile = f
		break
	}
}
if astFile == nil {
	return fmt.Errorf("Could not find AST file for %q", absPath)
}

You might be wondering what Fset, Position, and Pos mean. The AST stores file positions in an optimized format to save space. Fset represents a set of files, and Pos is an integer offset representing a specific file path, line, and column within an Fset. Given an Fset and Pos, we can retrieve an equivalent Position struct containing the full file path, line, and column.

Next, we need to find the ast.Ident node at our target line and column. We use ast.Inspect to walk the AST until we find an ast.Ident node containing the target position. If we find it, we return false to terminate the search; otherwise, we return true to continue looking.

var astIdent *ast.Ident
ast.Inspect(astFile, func(node ast.Node) bool {
	if node == nil || astIdent != nil {
		return false
	}
	start, end := pkg.Fset.Position(node.Pos()), pkg.Fset.Position(node.End())
	if line < start.Line ||
		line > end.Line ||
		(line == start.Line && col < start.Column) ||
		(line == end.Line && col > end.Column) {
		return false
	}
	if node, ok := node.(*ast.Ident); ok {
		astIdent = node
		return false
	}
	return true
})
if astIdent == nil {
	return fmt.Errorf("Could not find AST identifier at %s:%d:%d", path, line, col)
}

Step 3: Lookup and print the definition

Now that we’ve found the ast.Ident at the target file, line, and column, we can lookup its definition. When we called package.Load in step 1, we passed two flags: NeedTypes and NeedTypesInfo. This triggers type-checking, the results of which are returned in the pkg.TypesInfo field.

The most important field in TypesInfo is Uses, which has the type map[*ast.Ident]Object. What is an “object”? The go/types package tells us:

An Object describes a named language entity such as a package, constant, type, variable, function (incl. methods), or label.

You can think of an object as “the thing created by a declaration”. Here are some examples:

var Foo string  // <-- creates an object for the string Foo

// creates an object for the function PrintFoo
func PrintFoo() string {
    return Foo
}

// creates an object for the struct FooStruct
type FooStruct struct {
    FooField string // <-- creates an object for the field FooField
}

Luckily for us, TypesInfo.Uses maps from ast.Ident (the reference) to types.Object (the definition). This is exactly what we want!

obj, ok := pkg.TypesInfo.Uses[astIdent]
if !ok {
	obj = pkg.TypesInfo.Defs[astIdent]
}
if obj == nil {
	return fmt.Errorf("Could not find type object for ident %q at %s:%d:%d", astIdent.Name, path, line, col)
} else if !obj.Pos().IsValid() {
	return fmt.Errorf("Invalid position for type object for %q at %s:%d:%d", astIdent.Name, path, line, col)
}

There are two small wrinkles:

If we can’t find the object in Uses, we fallback to Defs. This handles lookups where the target file/line/column is the definition itself, not a reference.
Some built-in objects like nil aren’t defined in any Go source files. We exclude those by checking obj.Pos().IsValid().

Once we have the types.Object matching our target ast.Ident, we simply print its position. (The helper normalizePath returns a relative path if the file is in the current working directory and an absolute path otherwise.)

defPosition := pkg.Fset.Position(obj.Pos())
fmt.Printf("%q is defined at %s:%d:%d\n", obj.Name(), normalizePath(defPosition.Filename), defPosition.Line, defPosition.Column)

And that’s it! We’ve found the object “used by” the ast.Ident at our target file/line/column, which is the definition we were looking for.

Tradeoffs

As with everything else in software, this approach makes tradeoffs.

Failure on parse errors

Using the built-in Go parser ensures that the tool always interprets Go code correctly. However, as far as I’m aware, the parser cannot recover from errors: if the code isn’t syntactically valid, the parser will reject it. This works well for navigating working code but can be annoying when you’re in the middle of writing something new.

Speed

Type-checking a package and its dependencies can take some time. One important (and easy!) optimization is to delete sections of the AST to eliminate unnecessary type-checking. The packages.Config struct has a ParseFile field that can be used to modify the AST before type-checking. For “find definition”, we can delete any function body that doesn’t contain the target line/column (see here for an example).

With this optimization, most lookups complete within a few seconds, even for large projects like Kubernetes. Go build tools are fast! Although our approach is slower than using a language server like gopls, it’s still fast enough for interactive use.

Tests

To find definitions from Go test files, we need to do a bit more work. First, we need to set cfg.Tests = true so packages.Load will include test files. This produces two versions of each package, one with the test files/dependencies/binary and one without. If we’re looking for a definition from a test file, we need to use the “test” version of the package (as in this example).

Conclusion

By using Go’s built-in parser and type-checker, we implemented “find definition” in just 77 lines of code! The go/ast and go/types packages provide a simple and powerful mechanism for analyzing Go code.

It’s possible to take this much further. You can also implement type lookup, finding references to a definition, finding implementations for an interface/method, and more! Additional examples can be found in my tool gospelunk.

The rest of the code is boilerplate for interpreting CLI arguments and normalizing paths, so I’m not counting those lines. ↩︎
For a deep-dive into the go/types package, see this tutorial. ↩︎
It’s possible to specify a different “driver” for retrieving this information, but the default is go list. ↩︎