I use the following GitHub Actions workflow for my C project. The workflow finishes in ~40 seconds, but more than half of that time is spent installing the valgrind package and its dependencies.
I believe caching could help me speed up the workflow. I do not mind waiting a couple of extra seconds, but this just seems like a pointless waste of GitHub's resources.
```yaml
name: C Workflow

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v1
      - name: make
        run: make
      - name: valgrind
        run: |
          sudo apt-get install -y valgrind
          valgrind -v --leak-check=full --show-leak-kinds=all ./bin
```
Running `sudo apt-get install -y valgrind` installs the following packages:
gdb
gdbserver
libbabeltrace1
libc6-dbg
libipt1
valgrind
I know Actions supports caching of a specific directory (and there are already several answered SO questions and articles about this), but I am not sure where all the different files installed by apt end up. I assume `/bin/` or `/usr/bin/` are not the only directories affected by installing packages.
Is there an elegant way to cache the installed system packages for future workflow runs?
The purpose of this answer is to show how caching can be done with GitHub Actions; not necessarily to show how to cache valgrind (which it does show), but rather to show that not everything can or should be cached, and that the trade-offs of caching and restoring a cache versus reinstalling the dependency need to be taken into account.
You will make use of the `actions/cache` action to do this.
Add it as a step (before you need to use valgrind):
```yaml
- name: Cache valgrind
  uses: actions/cache@v2
  id: cache-valgrind
  with:
    path: "~/valgrind"
    key: ${{ secrets.VALGRIND_VERSION }}
```
The next step should attempt to install the cached version if any or install from the repositories:
```yaml
- name: Install valgrind
  env:
    CACHE_HIT: ${{ steps.cache-valgrind.outputs.cache-hit }}
    VALGRIND_VERSION: ${{ secrets.VALGRIND_VERSION }}
  run: |
    if [[ "$CACHE_HIT" == 'true' ]]; then
      sudo cp --verbose --force --recursive ~/valgrind/* /
    else
      sudo apt-get install --yes valgrind="$VALGRIND_VERSION"
      mkdir -p ~/valgrind
      sudo dpkg -L valgrind |
        while IFS= read -r f; do if test -f "$f"; then echo "$f"; fi; done |
        xargs cp --parents --target-directory ~/valgrind/
    fi
```
Set the `VALGRIND_VERSION` secret to the output of:
```shell
apt-cache policy valgrind | grep -oP '(?<=Candidate:\s)(.+)'
```
This will allow you to invalidate the cache when a new version is released, simply by changing the value of the secret.
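To show what that grep pattern actually extracts, here is a self-contained sketch that feeds it a mocked-up snippet of `apt-cache policy` output (the version string below is made up for illustration):

```shell
# Hypothetical sample of `apt-cache policy valgrind` output (version is invented):
policy_output='valgrind:
  Installed: (none)
  Candidate: 1:3.13.0-2ubuntu2
  Version table:'

# The lookbehind pattern keeps only what follows "Candidate: ".
version=$(printf '%s\n' "$policy_output" | grep -oP '(?<=Candidate:\s)(.+)')
echo "$version"   # -> 1:3.13.0-2ubuntu2
```

That extracted string is what you would paste into the `VALGRIND_VERSION` secret.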
`dpkg -L valgrind` is used to list all the files installed when running `sudo apt-get install valgrind`.
We can now use this command to copy all of those files to our cache folder:

```shell
dpkg -L valgrind |
  while IFS= read -r f; do if test -f "$f"; then echo "$f"; fi; done |
  xargs cp --parents --target-directory ~/valgrind/
```
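The `test -f` filter is needed because `dpkg -L` lists directories as well as regular files, and `cp --parents` should only receive the files. The sketch below runs the same filter-and-copy pipeline over a synthetic file list in temporary directories, so it needs neither dpkg nor valgrind (the paths are made up):

```shell
# Stand-ins for a package's install tree and the cache folder:
src=$(mktemp -d); dest=$(mktemp -d)
mkdir -p "$src/usr/bin" "$src/usr/share/doc"
echo 'fake binary' > "$src/usr/bin/tool"

cd "$src"
# A dpkg -L-style listing mixes directories (usr, usr/bin, ...) with files;
# the while loop keeps only regular files, which xargs then copies with
# their full path preserved under $dest.
printf '%s\n' usr usr/bin usr/bin/tool usr/share/doc |
  while IFS= read -r f; do if test -f "$f"; then echo "$f"; fi; done |
  xargs cp --parents --target-directory "$dest"

ls "$dest/usr/bin"   # -> tool
```

The same structure-preserving copy is what makes the later `cp --recursive ~/valgrind/* /` restore step work.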
In addition to copying all the components of valgrind, it may also be necessary to copy its dependencies (such as libc in this case), but I don't recommend continuing along this path, because the dependency chain just grows from there. To be precise, the dependencies that would have to be copied to finally have an environment suitable for valgrind to run in are libc6, libgcc1, and gcc-8-base.
To copy all these dependencies, you can use the same syntax as above:

```shell
for dep in libc6 libgcc1 gcc-8-base; do
  dpkg -L "$dep" |
    while IFS= read -r f; do if test -f "$f"; then echo "$f"; fi; done |
    xargs cp --parents --target-directory ~/valgrind/
done
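Rather than hardcoding the dependency names, they could in principle be derived from `apt-cache depends`. The sketch below parses a mocked-up snippet of that output (package names are illustrative) to show the idea:

```shell
# Hypothetical sample of `apt-cache depends valgrind` output:
depends_output='valgrind
  Depends: libc6
  Depends: libgcc1
  Recommends: gdb'

# Keep only hard dependencies; Recommends lines are ignored.
deps=$(printf '%s\n' "$depends_output" | awk '/Depends:/ {print $2}')
echo $deps   # -> libc6 libgcc1
```

The resulting list could replace the hardcoded `libc6 libgcc1 gcc-8-base` in the loop above, though as noted, chasing the full dependency chain this way quickly stops being worthwhile.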
Is all this work really worth the trouble, when all that is required to install valgrind in the first place is simply running `sudo apt-get install valgrind`? If your goal is to speed up the build process, then you also have to take into account the time spent restoring (downloading and extracting) the cache versus simply running the command again to install valgrind.
And finally, to restore the cache, assuming it is stored at `/tmp/valgrind`, you can use the command:

```shell
sudo cp --force --recursive /tmp/valgrind/* /
```

which copies all the files from the cache onto the root partition.
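Since copying onto `/` is destructive, here is a safe, self-contained sketch of what the restore step does, using temporary directories to stand in for the cache and the root filesystem (the cached tree layout is illustrative):

```shell
# Stand-ins for the cache folder and the root partition:
cache=$(mktemp -d); fakeroot=$(mktemp -d)
mkdir -p "$cache/usr/bin"
echo 'cached binary' > "$cache/usr/bin/valgrind"

# The restore step: overlay every cached path onto the target root.
cp --force --recursive "$cache"/* "$fakeroot"/
ls "$fakeroot/usr/bin"   # -> valgrind
```

Because the cache preserved the original directory structure (thanks to `cp --parents` at save time), each file lands back in its original location.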
In addition to the process above, I also have an example of "caching valgrind" by installing and compiling it from source. The cache is now about 63 MB (compressed) in size, and one still needs to install libc separately, which kind of defeats the purpose.
Note: Another answer to this question proposes what I could consider to be a safer approach to caching dependencies, by using a container which comes with the dependencies pre-installed. The best part is that you can use actions to keep those containers up-to-date.
You could create a docker image with valgrind
preinstalled and run your workflow on that.
Create a Dockerfile
with something like:
```dockerfile
FROM ubuntu
RUN apt-get update && apt-get install -y valgrind
```
Build it and push it to dockerhub:
```shell
docker build -t natiiix/valgrind .
docker push natiiix/valgrind
```
Then use something like the following as your workflow:
```yaml
name: C Workflow

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    container: natiiix/valgrind

    steps:
      - uses: actions/checkout@v1
      - name: make
        run: make
      - name: valgrind
        run: valgrind -v --leak-check=full --show-leak-kinds=all ./bin
```
Completely untested, but you get the idea.