I would like to use Gitlab CI to compile a Latex article as explained in this answer on tex.stackexchange (a similar pdf generation example is shown in the gitlab documentation for artifacts). I use a special latex template given by the journal editor. My Latex article contains figures made with the R statistical software. R and Latex are two large software installations with a lot of dependencies so I decided to use two separate containers for the build, one for the statistical analysis and visualization with R and one to compile a Latex document to pdf.
Here is the content of .gitlab-ci.yml:
knit_rnw_to_tex:
  image: rocker/verse:4.0.0
  script:
    - Rscript -e "knitr::knit('article.Rnw')"
  artifacts:
    paths:
      - figure/

compile_pdf:
  image: aergus/latex
  script:
    - ls figure
    - latexmk -pdf -bibtex -use-make article.tex
  artifacts:
    paths:
      - article.pdf
The knit_rnw_to_tex job, executed in the R "rocker" container, succeeds and I can download the figure artifacts from the GitLab "jobs" page. The issue in the second job, compile_pdf, is that ls figure shows me an empty folder and the LaTeX compilation fails because of the missing figures. apt install latexmk also fails for an unknown reason. Maybe because it has over a hundred dependencies and that is too much for GitLab CI?
Thank you for the comment, as I wanted to be sure how you do it. An example would help too, but I'll be generic for now (using docker).
To run multiple containers you need the Docker executor. To quote the documentation on it:
The Docker executor, when used with GitLab CI, connects to Docker Engine and runs each build in a separate and isolated container, using the predefined image that is set up in .gitlab-ci.yml and in accordance with config.toml.
The Docker executor divides the job into multiple steps:
- Prepare: Create and start the services.
- Pre-job: Clone, restore cache and download artifacts from previous stages. This is run on a special Docker image.
- Job: User build. This is run on the user-provided Docker image.
- Post-job: Create cache, upload artifacts to GitLab. This is run on a special Docker Image.
Your config.toml could look like this:
[runners.docker]
  image = "rocker/verse:4.0.0"
  builds_dir = "/home/builds/rocker"

  [[runners.docker.services]]
    name = "aergus/latex"
    alias = "latex"
From the above-linked documentation:
The image keyword
The image keyword is the name of the Docker image that is present in the local Docker Engine (list all images with docker images) or any image that can be found at Docker Hub. For more information about images and Docker Hub, please read the Docker Fundamentals documentation.
In short, with image we refer to the Docker image which will be used to create the container that your build runs in.
If you don't specify the namespace, Docker implies library, which includes all official images. That's why you'll often see the library part omitted in .gitlab-ci.yml and config.toml. For example, you can define an image like image: ruby:2.6, which is a shortcut for image: library/ruby:2.6.
Then, for each Docker image there are tags, denoting the version of the image. These are defined with a colon (:) after the image name. For example, for Ruby you can see the supported tags at Docker Hub. If you don't specify a tag (like image: ruby), latest is implied.
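As a minimal sketch of the shorthand rules above (the job names are illustrative, not from the question):

```yaml
# These three jobs illustrate how image names are resolved:
job_short:
  image: ruby:2.6            # namespace "library" is implied

job_full:
  image: library/ruby:2.6    # fully qualified form of the same official image

job_untagged:
  image: ruby                # no tag given, so "latest" is implied
```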
The image you choose to run your build in via the image directive must have a working shell in its operating system PATH. Supported shells are sh, bash, and pwsh (since 13.9) for Linux, and PowerShell for Windows. GitLab Runner cannot execute a command using the underlying OS system calls (such as exec).
The services keyword
The services keyword defines just another Docker image that is run during your build and is linked to the Docker image that the image keyword defines. This allows you to access the service image during build time.
The service image can run any application, but the most common use case is to run a database container, e.g., mysql. It's easier and faster to use an existing image and run it as an additional container than to install mysql every time the project is built.
You can see some widely used services examples in the relevant documentation of CI services examples.
If needed, you can assign an alias to each service.
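A service with an alias can also be declared per job in .gitlab-ci.yml; the following is a sketch using the common mysql case mentioned above (the job name and the alias db are my own illustrative choices):

```yaml
test_job:
  image: ruby:2.6
  services:
    - name: mysql:8.0
      alias: db                          # the service is reachable at hostname "db"
  variables:
    MYSQL_ALLOW_EMPTY_PASSWORD: "yes"    # variable understood by the official mysql image
  script:
    - echo "connect to the database at host 'db'"
```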
It should be possible to use artifacts to pass data between jobs according to this answer and to this well-explained forum post, but they use only one container for different jobs. It doesn't work in my case, probably because I use two different containers?
The Docker executor by default stores all builds in /builds/<namespace>/<project-name> and all caches in /cache (inside the container). You can overwrite the /builds and /cache directories by defining the builds_dir and cache_dir options under the [[runners]] section in config.toml. This will modify where the data are stored inside the container.
If you modify the /cache storage path, you also need to make sure to mark this directory as persistent by defining it in volumes = ["/my/cache/"] under the [runners.docker] section in config.toml.
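Put together, a runner configuration with a custom, persistent cache path might look like this (a sketch; /my/cache/ is the example path from the documentation, not a recommendation):

```toml
[[runners]]
  cache_dir = "/my/cache"

  [runners.docker]
    # mark the custom cache directory as a persistent volume
    volumes = ["/my/cache/"]
```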
builds_dir → Absolute path to a directory where builds are stored in the context of the selected executor (for example, locally, Docker, or SSH). (From the [[runners]] section documentation.)
As you may have noticed, I have customized the builds_dir in your toml file to /home/builds/rocker; please adjust it to your own path.
How can I pass the artifacts from one job to the other?
You can use the builds_dir directive. A second option would be to use the Job Artifacts API.
Should I use cache as explained in docs.gitlab.com / caching?
Yes, you should use cache to store project dependencies. The advantage is that you fetch the dependencies from the internet only once, and subsequent runs are much faster as they can skip this step. Artifacts are used to share results between build stages.
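For instance, a cache could keep R packages between pipeline runs of the knitr job. This is a sketch under assumptions of mine: the cache key r-packages and the ci_libs/ library directory are hypothetical names, not part of the question's setup.

```yaml
knit_rnw_to_tex:
  image: rocker/verse:4.0.0
  cache:
    key: r-packages        # hypothetical key shared across pipelines
    paths:
      - ci_libs/           # hypothetical project-local R library directory
  script:
    - Rscript -e "knitr::knit('article.Rnw')"
```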
I hope it is now clearer and I have pointed you in the right direction.
The two different images are not the cause of your problems. The artifacts are saved in one image (which seems to work), and then restored in the other. I would therefore advise against building (and maintaining) a single image, as that should not be necessary here.
The reason you are having problems is that you are missing build stages, which inform GitLab about the dependencies between the jobs. I would therefore advise you to specify stages as well as their respective jobs in your .gitlab-ci.yml:
stages:
  - do_stats
  - do_compile_pdf

knit_rnw_to_tex:
  stage: do_stats
  image: rocker/verse:4.0.0
  script:
    - Rscript -e "knitr::knit('article.Rnw')"
  artifacts:
    paths:
      - figure/

compile_pdf:
  stage: do_compile_pdf
  image: aergus/latex
  script:
    - ls figure
    - latexmk -pdf -bibtex -use-make article.tex
  artifacts:
    paths:
      - article.pdf
By default, all artifacts of previous build stages are made available in later stages if you add the corresponding specifications.
If you do not specify any stages, GitLab will put all jobs into the default test stage and execute them in parallel, assuming that they are independent and do not require each other's artifacts. It will still store the artifacts but not make them available between the jobs. This is presumably what is causing your problems.
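If you want to be explicit about which artifacts a job consumes, you can additionally declare them with the dependencies keyword (a sketch based on the jobs above):

```yaml
compile_pdf:
  stage: do_compile_pdf
  image: aergus/latex
  dependencies:
    - knit_rnw_to_tex      # only fetch artifacts from this job
  script:
    - latexmk -pdf -bibtex -use-make article.tex
```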
As for the cache: artifacts are how you pass files between build stages, while caches are for, well, caching. In practice, caches are used for things like external packages in order to avoid having to download them multiple times, see here. Caches are somewhat unpredictable in situations with multiple different runners. They are only used for performance reasons, and passing files between jobs using cache rather than the artifact system is a huge anti-pattern.
Edit: I don't know precisely what your knitr setup is, but if you generate an article.tex from your article.Rnw, then you probably need to add that file to your artifacts as well.
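Under that assumption, the artifacts section of the first job would then look like this:

```yaml
knit_rnw_to_tex:
  stage: do_stats
  image: rocker/verse:4.0.0
  script:
    - Rscript -e "knitr::knit('article.Rnw')"
  artifacts:
    paths:
      - figure/
      - article.tex        # the .tex file generated by knitr
```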
Also, services are used for things like a MySQL server for testing databases, or the dind (Docker-in-Docker) daemon to build Docker images. This should not be necessary in your case. Similarly, you should not need to change any runner configuration (in the respective config.toml) from the defaults.
Edit2: I added a MWE here, which works with my gitlab setup.