Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

bazel: why do targets keep increasing if build graph is known

Tags:

bazel

Per this doc, the analysis and execution phases handle building out the dependency tree (among other things) and going and doing the work if needed, respectively. If that's true, I'm curious why the total number of targets keeps increasing as the build progresses (i.e., when I start a large build, bazel may report that it's built 5 out of 100 targets, but later will say it's built 20 out of 300 targets, and so forth, with the denominator increasing for a while until it levels off).

I've heard the loading and analysis phases can be intermixed. My likely incomplete or incorrect understanding is that when bazel parses a BUILD file, analysis is invoked to determine what dependencies are needed for the requested targets on the command line, and then I guess this is somehow communicated back to the loader to pull in any other BUILD files referenced by these dependencies, which could cause the loader to go out and fetch a remote repo if the dependency (and thus BUILD file) is not in the local repo.

However, my understanding was also that while dynamic build-out of the dependency graph was a potential future direction for bazel, that currently, execution does not intermix with analysis, and thus when execution begins, the full dependency tree should be available to bazel (and thus the total number of targets known)? Does bazel have the full tree, but just not want to traverse the tree to get a count in case it's big, or is something else going on here?

Note: I found a brief mention of this phenomenon here, but without an explanation as to why it happens.

like image 748
Seth Koehler Avatar asked Jul 23 '18 17:07

Seth Koehler


1 Answers

The number you're seeing in the progress bar refers to actions (command-lines..ish) and not targets (e.g. //my:target). I wrote a blog post about the action graph, and here's the relevant description about it:

The action graph contains a different set of information: file-level dependencies, full command lines, and other information Bazel needs to execute the build. If you are familiar with Bazel’s build phases, the action graph is the output of the loading and analysis phase and used during the execution phase.

However, Bazel does not necessarily execute every action in the graph. It only executes if it has to, that is, the action graph is the super set of what is actually executed.

As to why the denominator is ever-increasing, it's because the actions-to-execute discovery within the action graph is lazy. Here's a better explanation from the Bazel TL, Ulf Adams:

The problem is that Skyframe does not eagerly walk the action graph, but it does it lazily. The reason for that is performance, since the action graph can be rather large and this was previously a blocking operation (where Bazel would just hang for some time). The downside is that all threads that walk the action graph block on actions that they execute, which delays discovery of remaining actions. That's why the number keeps going up during the build.

Source: https://github.com/bazelbuild/bazel/issues/3582#issuecomment-329405311

like image 183
Jin Avatar answered Oct 18 '22 02:10

Jin