The internet is full of complains about Gitlab not caching, but in my case I think, that Gitlab CI indeed caches correctly. The thing is, that npm seems to install everything again anyway.
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- vendor/
- bootstrap/
- node_modules/
build-dependencies:
image: ...
stage: build
script:
- cp .env.gitlab-testing .env
- composer install --no-progress --no-interaction
- php artisan key:generate
- npm install
- npm run prod
- npm run prod
artifacts:
paths:
- vendor/
- bootstrap/
- node_modules/
- .env
- public/mix-manifest.json
tags:
- docker
This is my gitlab-ci.yml file (well.. the relevant part). While the cached composer dependencies are used, the node_modules aren't. I even added everything to cache and artifacts out of desperation..
You may not need to cache your entire node_modules Many projects such as ava , nyc , storybook , and many webpack loaders and plugins use this folder by default. With some configuration, you can speed up your CI and developer experience by pointing other caches to this folder and persisting it across builds.
Ideally, by caching node_modules when it appears safe to do so, the download of node_modules shouldn't need to happen as frequently and caching can save minutes.
By default, they are stored locally in the machine where the Runner is installed and depends on the type of the executor. Locally, stored under the gitlab-runner user's home directory: /home/gitlab-runner/cache/<user>/<project>/<cache-key>/cache. zip .
generally, it is safe to delete the node_modules folder, as it can be regenerated by the package manager.
GitLab@^15.12 & >13
)Just like the comments received, the use of artifacts is not ideal in the original answer, but was the cleanest approach that worked reliably. Now, that GitLab's documentation has been updated around the use of cache
and it was also expanded to support multiple cache keys per job (4 maximum unfortunately), there is a better way to handle node_modules
across a pipeline.
The rationale for implementation is based on understand the quirks of both GitLab and how npm
works. These are the fundamentals:
NPM recommends the use of npm ci
instead of npm install
when running in a CI/CD environment. FYI, this will require the existence of package-lock.json
, which is used to ensure 0 packages are automatically version bumped while running in a CI environment (npm i
by default will not create the same deterministic build every time, such as on a job re-run).
npm ci
deliberately removes the entirety of node_modules
first before re-installing all packages listed in package-lock.json
. Therefore, it is best to only configure GitLab to run npm ci
once and ensure the resulting node_modules
is passed to other jobs.
NPM has its own cache that it stores at ~/.npm/
in case of offline builds and overall speed. You can specify a different cache location with the --cache <dir>
option (you will need this). (variation of @Amityo's answer)
GitLab cannot cache any directory outside of the repository! This means the default cache directory ~/.npm
cannot be cached.
GitLab's global cache configuration is applied to every job by default. Jobs will need to explicit override the cache config if it doesn't need the globally cached files. Using YAML anchors, the global cache config can be copied & modified but this doesn't seem to work if you want to override a global setting when the cache uses a list (resolution still under investigation).
To run additional npx
or npm run <script>
without re-running an install, you should cache node_modules/
folder(s) across the pipeline.
GitLab's expectation is for users to use the cache
feature to handle dependencies and only use artifacts
for dynamically generated build results. This answer now supports this desire better than possible before. There is the restriction that artifacts should be less than the maximum artifact size or 1GB (compressed) on GitLab.com. And artifacts uses your storage usage quota
The use of the needs
or dependencies
directives will influence if artifacts from a previous job will be downloaded (or deleted) automatically on the next job.
GitLab cache's can monitor the hash value of a file and use that as the key so its possible the cache will only update when the package-lock.json
updates. You could use package.json
but you would invalidate your deterministic builds as it does not update when minor or patches are available.
If you have a mono-repo and have more than 2 separate packages, you will hit the cache entry limit of 4 during the install
job. You will not have the ideal setup but you can combine some cache definitions together. It is also worth noting, GitLab cache.key.files
supports a maximum of 2 files to use for the key hash so you likely will need to use another method to determine a useful key. One likely solution will be to use a non-file related key and cache all node_modules/
folders under that key. That way you have only 2 cache entries for the install
job and 1 for each subsequent job.
.pre
stage, using cached downloaded packages (tar.gz's) across entire repository.node_modules/
folders to following jobs for that pipeline execution. Do not allow any jobs except install to upload cache (lowers pipeline run time & prevents unintended consequences)build/
directory via artifacts on to other jobs only when needed# .gitlab-ci.yml
stages:
- build
- test
- deploy
# global cache settings for all jobs
# Ensure compatibility with the install job
# goal: the install job loads the cache and
# all other jobs can only use it
cache:
# most npm libraries will only have 1 entry for the base project deps
- key: &global_cache_node_mods
files:
- package-lock.json
paths:
- node_modules/
policy: pull # prevent subsequent jobs from modifying cache
# # ATTN mono-repo users: with only additional node_modules,
# # add up to 2 additional cache entries.
# # See limitations in #10.
# - key:
# files:
# - core/pkg1/package-lock.json
# paths:
# - core/pkg1/node_modules/
# policy: pull # prevent jobs from modifying cache
install:
image: ...
stage: .pre # always first, no matter if it is listed in stages
cache:
# store npm cache for all branches (stores download pkg.tar.gz's)
# will not be necessary for any other job
- key: ${CI_JOB_NAME}
# must be inside $CI_PROJECT_DIR for gitlab-runner caching (#3)
paths:
- .npm/
when: on_success
policy: pull-push
# Mimic &global_cache_node_mods config but override policy
# to allow this job to update the cache at the end of the job
# and only update if it was a successful job
# NOTE: I would use yaml anchors here but overriding the policy
# in a yaml list is not as easy as a dictionary entry (#5)
- key:
files:
- package-lock.json
paths:
- node_modules/
when: on_success
policy: pull-push
# # ATTN Monorepo Users: add additional key entries from
# # the global cache and override the policy as above but
# # realize the limitations (read #10).
# - key:
# files:
# - core/pkg1/package-lock.json
# paths:
# - core/client/node_modules/
# when: on_success
# policy: pull-push
# before_script:
# - ...
script:
# define cache dir & use it npm!
- npm ci --cache .npm --prefer-offline
# # monorepo users: run secondary install actions
# - npx lerna bootstrap -- --cache .npm/ --prefer-offline
build:
stage: build
# global cache settings are inherited to grab `node_modules`
script:
- npm run build
artifacts:
paths:
- dist/ # where ever your build results are stored
test:
stage: test
# global cache settings are inherited to grab `node_modules`
needs:
# install job is not "needed" unless it creates artifacts
# install job also occurs in the previous stage `.pre` so it
# is implicitly required since `when: on_success` is the default
# for subsequent jobs in subsequent stages
- job: build
artifacts: true # grabs built files
# dependencies: could also be used instead of needs
script:
- npm test
deploy:
stage: deploy
when: on_success # only if previous stages' jobs all succeeded
# override inherited cache settings since node_modules is not needed
cache: {}
needs:
- job: build
artifacts: true # grabs dist/
script:
- npm publish
GitLab's recommendation for npm
can be found in the GitLab Docs.
GitLab<13.12
)All the answers I see so far give only half answers but don't actually fully accomplish the task of caching IMO.
In order to fully cache with npm & GitLab, you must be aware of the following:
See #1 above
npm ci
deliberately removes the entirety of node_modules
first before re-installing all packages listed in package-lock.json
. Therefore, configuring GitLab to cache the node_modules
directory between build jobs is useless. The point is to ensure no preparation hooks or anything else modified node_modules
from a previous run. IMO, this is not really valid for a CI environment but you can't change it and maintain the fully deterministic builds.
See #3-#4 above
If you have multiple stages, global cache will be downloaded every job. This likely is not what you want!
To run additional npx
commands without re-running an install, you should pass the node_modules/
folder as an artifact to other jobs.
.pre
stage, using cached downloaded packages (tar.gz's) across entire repository.
stages:
- build
- test
- deploy
install:
image: ...
stage: .pre # always first, no matter if it is listed in stages
cache:
key: NPM_DOWNLOAD_CACHE # a single-key-4-all-branches for install jobs
paths:
- .npm/
before_script:
- cp .env.gitlab-testing .env
- composer install --no-progress --no-interaction
- php artisan key:generate
script:
# define cache dir & use it npm!
- npm ci --cache .npm --prefer-offline
artifacts:
paths:
- vendor/
- bootstrap/
- node_modules/
- .env
- public/mix-manifest.json
build:
stage: build
needs:
- job: install
artifacts: true # true by default, grabs `node_modules`
script:
- npm run build
artifacts:
paths:
- dist/ # whereever your build results are stored
test:
stage: test
needs:
- job: install
artifacts: true # grabs node_modules
- job: build
artifacts: true # grabs built files
script:
- npm test
deploy:
stage: deploy
needs:
# does not need node_modules so don't state install as a need
- job: build
artifacts: true # grabs dist/
- job: test # must succeed
artifacts: false # not needed
script:
- npm publish
Actually it should work, your cache is set globally, your key refers to the current branch ${CI_COMMIT_REF_SLUG}
...
This is my build and it seems to cache the node_modules between the stages.
image: node:latest
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .next/
stages:
- install
- test
- build
- deploy
install_dependencies:
stage: install
script:
- npm install
test:
stage: test
script:
- npm run test
build:
stage: build
script:
- npm run build
I had the same issue, for me the problem was down to the cache settings, by default the cache does not keep unversioned git files and since we do not store node_modules in git the npm files were not cached at all. So all I had to do was insert one line "untracked: true" like below
cache:
untracked: true
key: ${CI_COMMIT_REF_SLUG}
paths:
- vendor/
- bootstrap/
- node_modules/
Now npm is faster, although it still needs to check if things have changed, for me this still takes a couple of minutes, so I considering having a specific job to do the npm install, but it has sped things up a lot already.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With