
How to store node modules between jobs and stages in GitLab with continuous integration

I am fairly new to GitLab CI and I've been trying different approaches to make the node_modules directory available throughout my entire pipeline. From what I've read in the official docs, cache and artifacts both seem to be valid ways to pass files between jobs:

cache is used to specify a list of files and directories which should be cached between jobs. You can only use paths that are within the project workspace.

However, my issue with the caching method is that the node_modules would be persisted between pipelines by default:

  • cache can be set globally and per-job.
  • from GitLab 9.0, caching is enabled and shared between pipelines and jobs by default.

I do not want to persist the node_modules between pipelines. What I actually want is to trigger a fresh install with npm in my setup stage and then allow all further jobs in the pipeline to use these modules. Hence, I started using artifacts instead of cache, which is described similarly:

artifacts is used to specify a list of files and directories which should be attached to the job after success. [...]

The artifacts will be sent to GitLab after the job finishes successfully and will be available for download in the GitLab UI. The dependency feature should be used in conjunction with artifacts and allows you to define the artifacts to pass between different jobs.

The artifact-dependency method seems to be usable in my case. However, both cache and artifacts are extremely inefficient and slow. The node_modules are installed and usable, but the entire directory then gets uploaded somewhere and is re-downloaded between each job. (I would really love to know what happens here... Where do the modules go?)

Is there a better approach to run npm install only once at the beginning of the pipeline and then keep the node_modules in the pipeline during its entire runtime? I do not need the node_modules once all jobs have finished, so they should not have to be uploaded or downloaded anywhere.

Sample pipeline configuration file to reproduce the behavior:

image: node:lts

stages:
  - setup
  - build
  - test

node:
  stage: setup
  script:
    - npm install
  artifacts:
    paths:
      - node_modules/

build:
  stage: build
  script:
    - npm run build
  dependencies:
    - node

test:
  stage: test
  script:
    - npm run lint
    - npm run test
  dependencies:
    - node
asked May 04 '19 at 15:05 by muffin



1 Answer

Where do the modules go?

By default, artifacts are saved on the main GitLab machine (on an Omnibus installation) under:

/var/opt/gitlab/gitlab-rails/shared/artifacts
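
Since the uploaded node_modules are not needed once the pipeline is done, one way to stop them from piling up in that directory is to give the artifacts an expiry. This is only a sketch based on the node job from the question; the one-hour value is an assumption and should be longer than your slowest pipeline:

```yaml
node:
  stage: setup
  script:
    - npm install
  artifacts:
    paths:
      - node_modules/
    # Assumption: one hour is enough for every later job to download the archive.
    expire_in: 1 hour
```

This does not remove the upload and download between jobs. It only limits how long the archive is kept on the GitLab machine.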

Is there a better approach to run npm install only once at the beginning of the pipeline and then keep the node_modules in the pipeline during its entire runtime?

There are some options that you can try:

  1. Merge the setup and build stages into one stage, so node_modules does not have to be passed between them.

  2. Keep a local npm cache on the builder machines for faster npm install times, or use a private npm proxy registry (for example, Nexus or Artifactory).

  3. Check whether the main GitLab machine and the builders are on the same network, so the upload/download will be faster.

  4. Consider packaging your build in Docker. You will get reusable Docker images between your GitLab stages. (Of course, there is the overhead of uploading the images to a Docker registry.)
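
Options 1 and 2 could be combined roughly as follows. This is a sketch, not a tested configuration: it assumes the repository commits a package-lock.json (required by npm ci), that the GitLab version supports cache:key:files, and it uses .npm/ as an arbitrary directory name for npm's download cache:

```yaml
image: node:lts

stages:
  - build
  - test

# Cache npm's download cache instead of node_modules itself.
cache:
  key:
    files:
      - package-lock.json   # assumption: a lockfile is committed
  paths:
    - .npm/

build:
  stage: build
  script:
    # Option 1: install and build in the same job, so nothing is handed over.
    - npm ci --cache .npm --prefer-offline
    - npm run build

test:
  stage: test
  script:
    # Option 2: reinstall from the local .npm cache rather than downloading an artifact.
    - npm ci --cache .npm --prefer-offline
    - npm run lint
    - npm run test
```

Each job still runs an install, but it reads packages from the cached .npm/ directory rather than fetching them from the registry, and no node_modules artifact is uploaded to GitLab. Because the key is derived from package-lock.json, the cache is reused across pipelines until the lockfile changes.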
answered Sep 21 '22 at 22:09 by Amityo