
How can I run a dependency install job in GitLab CI only when nothing is cached or package.json has changed?

I have a monorepo in GitLab with an Angular frontend and a NestJS backend. Each has its own package.json, and there is one more in the root. My pipeline consists of multiple stages like these:

stages:
  - build
  - verify
  - test
  - deploy

I also have a job in the .pre stage that installs dependencies. I would like to cache them between jobs and between branches, and reinstall only if any package-lock.json changed or if there are no cached node_modules yet. The job looks like this:

prepare:
  stage: .pre
  script:
    - npm run ci-deps # runs npm ci in each folder
  cache:
    key: $CI_PROJECT_ID
    paths:
      - node_modules/
      - frontend/node_modules/
      - backend/node_modules/
  only:
    changes:
      - '**/package-lock.json'

The problem is that if the cache was cleared, or if my first push did not change package-lock.json, this job does not run at all, and everything else fails because it requires node_modules. If I remove changes:, the job runs for every pipeline. I can still share the result between jobs, but every further commit and push then takes almost 2 minutes to install all the dependencies, even though nothing about them changed. Am I missing something? How can I cache this so that dependencies are reinstalled only when the cache is outdated or doesn't exist?

Blind Despair asked Feb 06 '20 10:02


3 Answers

rules:exists is evaluated before the cache is pulled down, so it was not a workable solution for me.

Since GitLab v12.5 we can use cache:key:files, which ties the cache key to the listed files, so a change to package-lock.json produces a new key.

Combining that with part of Blind Despair's conditional logic gives a nicely working solution:

prepare:
  stage: .pre
  image: node:12
  script:
    - if [[ ! -d node_modules ]];
      then
        npm ci;
      fi
  cache:
    key:
      files:
        - package-lock.json
      prefix: nm-$CI_PROJECT_NAME
    paths:
      - node_modules/

We can then reuse the cache in subsequent build jobs:

# let's keep it dry with templates
.use_cached_node_modules: &use_cached_node_modules
  cache:
    key:
      files:
        - package-lock.json
      prefix: nm-$CI_PROJECT_NAME
    paths:
      - node_modules/
    policy: pull # don't push unnecessarily

build:
  <<: *use_cached_node_modules
  stage: build
  image: node:12
  script:
    - npm run build

We use this successfully across multiple branches with a shared cache.
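
The example above only covers the root node_modules. For the monorepo in the question, the same "install only if the folder is missing" check could be applied per folder. The following is a minimal sketch, not part of the original answer: the function name install_missing and the INSTALL_CMD override are hypothetical, introduced only so the install command can be swapped or stubbed.

```shell
# Hypothetical helper: run the install command in every listed folder that
# has no node_modules directory yet (for example after a cache miss).
# INSTALL_CMD is an assumption of this sketch; it defaults to `npm ci`.
install_missing() {
  for dir in "$@"; do
    if [ ! -d "$dir/node_modules" ]; then
      echo "installing in $dir"
      # Run in a subshell so the working directory is unchanged afterwards.
      (cd "$dir" && ${INSTALL_CMD:-npm ci}) || return 1
    else
      echo "skipping $dir (node_modules already present)"
    fi
  done
}
```

In the prepare job's script this would be called as install_missing . frontend backend, with all three node_modules paths listed under cache:paths.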

Calummm answered Oct 16 '22 07:10


I had the same problem and solved it with the rules keyword instead of only/except. With it you can declare more complex cases using if, exists, and changes, for example. Also, note this:

Rules can't be used in combination with only/except because it is a replacement for that functionality. If you attempt to do this, the linter returns a "key may not be used with rules" error.

-- https://docs.gitlab.com/ee/ci/yaml/#rules

All the more reason to switch to rules. Here's my solution, which executes npm ci:

  • if the package-lock.json file was modified, or
  • if the node_modules folder does not exist (in case of new branches or cache cleaning):

npm-ci:
  image: node:lts
  cache:
    key: $CI_COMMIT_REF_SLUG-$CI_PROJECT_DIR
    paths:
      - node_modules/
  script:
    - npm ci
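  rules:
    # Rules are evaluated when the pipeline is created, before any cache is
    # restored, so `exists` only sees files committed to the repository.
    # If no rule matches, the job is not added to the pipeline.
    - changes:
        - package-lock.json
    - exists:
        - node_modules
      when: never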

Hope it helps!

Max answered Oct 16 '22 07:10


In the end I realized I could do this without relying on GitLab CI features, by running my own checks like so:

prepare:
  stage: .pre
  image: node:12
  script:
    # Reinstall when node_modules is missing or package.json changed in the last commit.
    - if [[ ! -d node_modules ]] || [[ -n $(git diff --name-only HEAD~1 HEAD | grep '^package\.json$') ]];
      then
        npm ci;
      fi
    - if [[ ! -d frontend/node_modules ]] || [[ -n $(git diff --name-only HEAD~1 HEAD | grep '^frontend/package\.json$') ]];
      then
        npm run ci-deps:frontend;
      fi
    - if [[ ! -d backend/node_modules ]] || [[ -n $(git diff --name-only HEAD~1 HEAD | grep '^backend/package\.json$') ]];
      then
        npm run ci-deps:backend;
      fi
  cache:
    key: '$CI_COMMIT_REF_SLUG-$CI_PROJECT_DIR'
    paths:
      - node_modules/
      - frontend/node_modules
      - backend/node_modules

The good thing about this is that it installs dependencies for a specific part of the project only if that part has no node_modules yet or its package.json changed. However, it will probably miss a change if I push multiple commits and package.json was not changed in the last one. In that case I can still clear the cache and rerun the pipeline manually, but I will try to improve my script further and update my answer.
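
One way to avoid depending on which commit touched the file is to compare the lock file against a checksum stored inside the cached node_modules. The following is a sketch of that idea, not the approach used above. It assumes sha256sum is available in the job image, and the function names and the .lock-sha256 stamp file are hypothetical.

```shell
# Hypothetical helpers: detect stale dependencies by comparing the lock file's
# checksum with a stamp file kept inside the cached node_modules folder.
# The stamp name (.lock-sha256) is an assumption made for this sketch.
lock_hash() {
  sha256sum "$1/package-lock.json" | cut -d' ' -f1
}

# Succeeds (exit 0) when the stamp is missing, which also covers a missing
# node_modules folder, or when it differs from the current lock file hash.
deps_stale() {
  stamp="$1/node_modules/.lock-sha256"
  [ ! -f "$stamp" ] || [ "$(cat "$stamp")" != "$(lock_hash "$1")" ]
}

# Record the current lock file hash after a successful install.
mark_fresh() {
  lock_hash "$1" > "$1/node_modules/.lock-sha256"
}
```

In the job script, each folder would then be handled with something like: if deps_stale frontend; then npm run ci-deps:frontend && mark_fresh frontend; fi. The stamp has to be written after the install, because npm ci removes an existing node_modules folder before it starts.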

Blind Despair answered Oct 16 '22 07:10