How can I persist a docker image instance between stages of a GitLab pipeline?

Tags:

gitlab-ci

gitlab-ci-runner

Over the last couple of weeks I have been setting up my first pipeline using the public shared runners on GitLab.com for a PHP project in a private repository. The pipeline is pretty simple at this point, defining two stages:

stages:
  - test
  - deploy

The test stage runs composer update -o to build the project dependencies, connects to a remote database server, and runs the Codeception testing framework to test the build and generate code coverage reports.

The deploy stage runs composer update --no-dev -o to rebuild the project with only the production dependencies and uses rsync to push the files to the production webserver.

This is all working right now, but each stage repeats the whole process of pulling the Docker image, installing dependencies, and checking out the project from Git. It seems like it would be a whole lot more efficient to load the Docker image and project once, then run the test and deploy stages one after the other using the same persistent build instance.

I realize that many times you do want to create a fresh instance for each stage, but with my project I feel like this is rather inefficient for time and server resources.

I could configure everything to run in the same stage, which would eliminate the redundant Docker image pull, but I would lose the pipeline functionality in GitLab where you can see which stages failed and make later stages dependent on the success of the preceding ones.

From my review of the documentation and several related questions, it seems like this might have to do with the architecture of how this process works, where jobs are independent of each other (and can even be processed by different runners) and are organized into stages on a pipeline.

What I have is certainly workable (if a little slow), but I thought I would ask the question here in case there was something I was missing that would make this process more efficient while still retaining the CI pipeline functionality.

asked Jun 21 '19 15:06

AdamsTips

1 Answer

I know this is an old question, but I want to provide an answer for anyone who has the same issue.

There's a config option for the GitLab Runner application itself that controls whether the runner will use a local copy of an image. If you manage and use your own runners (even if using gitlab.com) you have full control over this option, but if you use the shared runners provided by GitLab, you do not.

Here are the three "pull policies" you can use:

1. never. The never pull policy instructs the runner to never pull images from Docker Hub or another registry, and to use only images already present on the Docker host. This gives you full control over the images and versions the runner uses.
2. if-not-present. The if-not-present policy instructs the runner to first check whether the image is available locally and, if so, to use it. Otherwise, it pulls the image from its registry.
3. always. The always policy instructs the runner to ignore any local images and pull from the registry every time the job runs.
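
Besides the runner-level setting described above, recent GitLab versions also accept a pull policy on the job's image in .gitlab-ci.yml. The following is a minimal sketch, not the original poster's configuration: the image name my-php-image:latest is hypothetical, the job names are assumed, and the request only takes effect if the runner picking up the job is configured to allow that policy.

```yaml
# Sketch only: image name and job names are assumptions for illustration.
stages:
  - test
  - deploy

test:
  stage: test
  image:
    name: my-php-image:latest      # hypothetical image with PHP and Composer
    pull_policy: if-not-present    # reuse the local copy when one exists
  script:
    - composer update -o

deploy:
  stage: deploy
  image:
    name: my-php-image:latest
    pull_policy: if-not-present
  script:
    - composer update --no-dev -o
```

Each job still starts in a fresh container, so this only avoids re-pulling the image; it does not carry a running instance from one stage to the next.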

For the shared runners on gitlab.com, the pull policy is set to always to serve the needs of most users. The solution to this issue is to register your own runner(s) for your projects (which you can run on AWS EC2, your laptop/workstation, etc.).
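
Once you have your own runner, jobs can be routed to it with the tags keyword. A minimal sketch, assuming the runner was given a tag named own-docker at registration (the tag name is hypothetical; use whatever tag your runner actually has):

```yaml
# Sketch only: "own-docker" is a hypothetical tag that must match a tag
# assigned to your self-managed runner.
test:
  stage: test
  tags:
    - own-docker     # only runners carrying this tag will pick up the job
  script:
    - composer update -o

deploy:
  stage: deploy
  tags:
    - own-docker
  script:
    - composer update --no-dev -o
```

Because a tagged job is only picked up by runners that carry all of its tags, both stages then run on the machine whose pull policy you control.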

Here is the information on available configuration options when running your own GitLab Runner.

Here are specific details on the available pull policies, and when to use them (or not to).

Here is how to register a runner to your projects (or to your entire instance if using self-hosted GitLab).

answered Sep 27 '22 17:09

Adam Marshall
