
Gitlab CI/CD Pass artifacts/variables between pipelines

tl;dr

How do I pass data, e.g. the $BUILD_VERSION variable, between jobs in different pipelines in Gitlab CI?

So (in my case) this:

Pipeline 1 on push etc.            Pipeline 2 after merge

    `building` job ...                `deploying` job
          │                                ▲
          └─────── $BUILD_VERSION ─────────┘

Background

Consider the following example (full yml below):

building:
    stage: staging
    # only on merge requests
    rules:
        # execute when a merge request is open
        - if: $CI_PIPELINE_SOURCE == "merge_request_event"
          when: always
        - when: never
    script:
        - echo "BUILD_VERSION=1.2.3" > build.env
    artifacts:
        reports:
            dotenv: build.env

deploying:
    stage: deploy
    # after merge request is merged
    rules:
        # execute when a branch was merged to staging
        - if: $CI_COMMIT_BRANCH == $STAGING_BRANCH
          when: always
        - when: never
    dependencies: 
        - building
    script:
        - echo $BUILD_VERSION

I have two stages, staging and deploy. The building job in staging builds the app and creates a "Review App" (no separate build stage for simplicity). The deploying job in deploy then uploads the new app.

The pipeline containing the building job runs whenever a merge request is opened. This way the app is built and the developer can click on the "Review App" icon in the merge request. The deploying job is run right after the merge request is merged. The idea is the following:

                      *staging* stage (pipeline 1)        *deploy* stage (pipeline 2)

<open merge request> -> `building` job (and show)   ...   <merge> -> `deploying` job
                             │                                            ▲
                             └───────────── $BUILD_VERSION ───────────────┘

The problem is that the building job in staging creates some data, e.g. a $BUILD_VERSION. I want this $BUILD_VERSION to be available in the deploying job in deploy, e.g. for creating a new release via the GitLab API.

So my question is: How do I pass the $BUILD_VERSION (and other data) from staging/building to deploy/deploying?


What I've tried so far

artifacts.reports.dotenv

The described case is more or less handled in the GitLab docs in Pass an environment variable to another job. The yml file shown below is also heavily inspired by this example. Still, it does not work.

The build.env artifact is created in building, but whenever the deploying job is executed, the build.env file gets removed, as the "Removing build.env" entry in the job log below shows. I tried adding build.env to the .gitignore, but it still gets removed.

Preparing environment
Running on runner- via gitlab-runner...
Getting source from Git repository
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in
Checking out as staging...
Removing build.env
Skipping Git submodules setup
Executing "step_script" stage of the job script
Using docker image
echo $BUILD_VERSION
Job succeeded

After hours of searching I found in this GitLab issue comment and this Stack Overflow post that artifacts.reports.dotenv doesn't work with the dependencies or the needs keywords.

Removing dependencies doesn't work. Using only needs doesn't work either. Using both is not allowed.

Does anyone know how to get this to work? I feel like this is the way it should work.

Getting the artifacts as a file

This answer to the Stack Overflow post Gitlab ci cd removes artifact for merge requests suggests using build.env as a normal file. I also tried this. The (relevant) yml is the following:

building:
    # ...
    artifacts:
        paths:
            - build.env

deploying:
    # ...
    before_script:
        - source build.env

The result is the same as above. The build.env gets removed. Then the source build.env command fails because build.env does not exist. (Doesn't matter if build.env is in the .gitignore or not, tested both)

Getting the artifacts from the API

I also found the answer to the Stack Overflow post Use artifacts from merge request job in GitLab CI, which suggests using the API together with $CI_JOB_TOKEN. But since I need the artifacts in a non-merge-request pipeline, I cannot use the suggested CI_MERGE_REQUEST_REF_PATH.

I tried to use $CI_COMMIT_REF_NAME. The (important section of the) yml is then:

deploying:
    # ...
    script:
        - url=$CI_API_V4_URL/projects/jobs/artifacts/$CI_COMMIT_REF_NAME/download?job=building
        - echo "Downloading $url"
        - 'curl --header "JOB-TOKEN: ${CI_JOB_TOKEN}" --output $url'
        # ...

But the API request gets rejected with "404 Not Found". Since commit SHAs are not supported, $CI_COMMIT_BEFORE_SHA and $CI_COMMIT_SHA do not work either.
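For comparison, the documented shape of this endpoint is GET /projects/:id/jobs/artifacts/:ref_name/download?job=name, so a project ID (or URL-encoded project path) belongs between /projects/ and /jobs/. The ref also has to be one on which the named job actually ran successfully. The following is only a minimal sketch of building such a URL in Python. The function name, the example host and the project ID 42 are made up for illustration, and percent-encoding the ref name is an assumption for refs that contain slashes.

```python
from urllib.parse import quote, urlencode


def artifacts_download_url(api_v4_url: str, project: str, ref_name: str, job: str) -> str:
    """Build the 'download artifacts archive by ref name' URL.

    `project` is the numeric ID or the namespace/path; it is percent-encoded
    so that a path such as "group/app" becomes "group%2Fapp".
    """
    base = api_v4_url.rstrip("/")
    project_part = quote(str(project), safe="")
    # Assumption: refs containing "/" are percent-encoded as well.
    ref_part = quote(ref_name, safe="")
    return (
        f"{base}/projects/{project_part}/jobs/artifacts/{ref_part}/download?"
        + urlencode({"job": job})
    )


# Hypothetical values, for illustration only.
print(artifacts_download_url("https://gitlab.example.com/api/v4", "42", "staging", "building"))
```

Inside a job, the first two arguments would typically come from $CI_API_V4_URL and $CI_PROJECT_ID.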

Using needs

Update: I found the section Artifact downloads between pipelines in the same project in the gitlab docs which is exactly what I want. But: I can't get it to work.

The yml looks like the following after more or less copying from the docs:

building:
    # ...
    artifacts:
        paths:
            - version
        expire_in: never

deploying:
    # ...
    needs:
        - project: $CI_PROJECT_PATH
          job: building
          ref: staging # building runs on staging branch, main doesn't work either
          artifacts: true

Now the deploying job instantly fails and I get the following error banner:

This job depends on other jobs with expired/erased artifacts:
Please refer to https://docs.gitlab.com/ee/ci/yaml/README.html#dependencies

I tried to set artifacts.expire_in = never (as shown) but I still get the same error. Also in Settings > CI/CD > Artifacts "Keep artifacts from most recent successful jobs" is selected. So the artifact should be present. What did I miss here? This should work according to the docs!


I hope somebody can help me get the $BUILD_VERSION to the deploying job. If there are other ways than the ones I've tried, I'm very happy to hear them. Thanks in advance.


The example .gitlab-ci.yml:

stages:
    - staging
    - deploy

building:
    tags: 
        - docker
    image: bash
    stage: staging
    rules:
        - if: ($CI_PIPELINE_SOURCE == "merge_request_event") && $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == "staging"
          when: always
        - when: never
    script:
        - echo "BUILD_VERSION=1.2.3" > build.env
    artifacts:
        reports:
            dotenv: build.env
    environment:
        name: Example
        url: https://example.com

deploying:
    tags: 
        - docker
    image: bash
    stage: deploy
    rules:
        - if: $CI_COMMIT_BRANCH == "staging"
          when: always
        - when: never
    dependencies:
        - building
    script:
        - echo $BUILD_VERSION
miile7 asked Jun 29 '21 13:06



3 Answers

I assume we start out knowing the commit hash whose artifacts we want to retrieve.

This is the plan:

commit hash --> job id --> artifact archive --> extract artifact

  1. Gitlab's GraphQL API makes it possible to get, in JSON, a list of jobs for a project + artifact urls for each job.
  2. You can filter that JSON list for the commit + jobname you want. Can't do it in GraphQL directly, so I'm doing it in Python.
  3. Then print either the job id or the artifact archive url. In our case, we're grabbing the artifact archive URL directly; but somebody else might want to use the job id as input for some other API call.

First, let's look at just the GraphQL query and its result, to get a feel for the data available

GraphQL query: project jobs and artifacts

Here's the query to get a list of jobs for a project. You can try it out by pasting it into Gitlab's GraphQL explorer.

query {
  # FIXME: your project path goes here
  project(fullPath: "gitlab-org/gitlab") {
    # First page of jobs. To get the next page, change the head to
    # jobs(after: "123_my_endCursor") { ... }
    # You can find the endCursor in pageInfo
    jobs {
      pageInfo {
        endCursor
        startCursor
      }
      # No, we can't filter on `nodes(name: "my-job-name")`,
      # nor on `edges{ node(name: "my-job-name") }`. :-(
      nodes {
        id
        name
        refName
        refPath
        commitPath
        artifacts {
          edges {
            node {
              downloadPath
              fileType
            }
          }
        }
      }
    }
  }
}

GraphQL result

The GraphQL API will return JSON that looks like below. It contains cursor names for pagination, and a list of jobs. In this example the first job has no artifact, the second job does. In practice this list will contain 100 jobs.

{
  "data": {
    "project": {
      "jobs": {
        "pageInfo": {
          "endCursor": "eyJpZCI6IjE1NDExMjgwNDAifQ",
          "startCursor": "eyJpZCI6IjE1NDExNTY0NzEifQ"
        },
        "nodes": [
          {
            "id": "gid://gitlab/Ci::Build/1541156471",
            "name": "review-docs-cleanup",
            "refName": "refs/merge-requests/67466/merge",
            "refPath": "/gitlab-org/gitlab/-/commits/refs/merge-requests/67466/merge",
            "commitPath": "/gitlab-org/gitlab/-/commit/5ec616f5e8f3268c23ff06dc52ef098f76352a7f",
            "artifacts": {
              "edges": []
            }
          },
          {
            "id": "gid://gitlab/Ci::Build/1541128174",
            "name": "static-analysis 4/4",
            "refName": "refs/merge-requests/67509/merge",
            "refPath": "/gitlab-org/gitlab/-/commits/refs/merge-requests/67509/merge",
            "commitPath": "/gitlab-org/gitlab/-/commit/41f949d3a398968edb67e22526c93c2f5292c23d",
            "artifacts": {
              "edges": [
                {
                  "node": {
                    "downloadPath": "/gitlab-org/gitlab/-/jobs/1541128174/artifacts/download?file_type=metadata",
                    "fileType": "METADATA"
                  }
                },
                {
                  "node": {
                    "downloadPath": "/gitlab-org/gitlab/-/jobs/1541128174/artifacts/download?file_type=archive",
                    "fileType": "ARCHIVE"
                  }
                }
              ]
            }
          }
        ]
      }
    }
  }
}

Code that would hopefully work in practice

Please take care, the scripts below

  • Do not handle pagination
  • Have not been run from inside a CI container
    • The initial GraphQL API request script is untested
    • The final command to download and extract the archive is untested
    • Only the JSON -> path part has been tested. That bit works for sure.
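On the pagination caveat: the sketch below shows one possible way to walk all pages using only the fields the query already returns (nodes and pageInfo.endCursor). It assumes that a page with no nodes, or with no endCursor, marks the end. fetch_page is a hypothetical callable, not part of the scripts below, that runs the query with jobs(after: ...) and returns the parsed JSON reply.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_all_jobs(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield job nodes from every page of the jobs connection.

    fetch_page(after) is a hypothetical callable: it runs the query with
    jobs(after: ...) (or without `after` when it is None) and returns the
    parsed JSON reply as a dict.
    """
    cursor: Optional[str] = None
    while True:
        jobs = fetch_page(cursor)["data"]["project"]["jobs"]
        nodes = jobs.get("nodes") or []
        if not nodes:
            break  # assumption: an empty page means nothing is left
        yield from nodes
        cursor = (jobs.get("pageInfo") or {}).get("endCursor")
        if not cursor:
            break  # no cursor to continue from
```

Because it is a generator, the caller can stop early once the wanted (job name, commit) combination has been found, without requesting the remaining pages.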

get-jobs-as-json.sh: (token, project name) --> joblist

#!/bin/sh

# Usage:
#
#   GRAPHQL_TOKEN=mysecret get-jobs-as-json.sh gitlab-org/gitlab
#
# You can authorize your request by generating a personal access token to use
# as a bearer token.
# https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html

main() {
    # We want curl to see ` \" `, so we type ` \\\" ` when we define $QUERY.
    # I type        : \\\"$group_and_project\\\"
    # QUERY contains: \"asdf/jkl\"
    # I type        : --data "{\"query\": \"$QUERY\"}"
    # Curl sees     : {"query": "...\"asdf/jkl\"..."}
    group_and_project="$1"
    QUERY="
      query {
        # Project path goes here
        project(fullPath: \\\"$group_and_project\\\") {

          # First page of jobs. To get the next page, change the head to
          # jobs(after: \\\"123_my_endCursor\\\") { ... }
          # You can find the endCursor in pageInfo
          jobs {
            pageInfo {
              endCursor
              startCursor
            }
            # No, you can't filter on nodes(name: \\\"my-job-name\\\"),
            # nor on edges{ node(name: \\\"my-job-name\\\") }.
            nodes {
              id
              name
              refName
              refPath
              commitPath
              artifacts {
                edges {
                  node {
                    downloadPath
                    fileType
                  }
                }
              }
            }
          }
        }
      }
    "
    curl "https://gitlab.com/api/graphql" \
        --header "Authorization: Bearer $GRAPHQL_TOKEN" \
        --header "Content-Type: application/json" \
        --request POST \
        --data "{\"query\": \"$QUERY\"}"
}

main "$1"
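The triple-backslash quoting above exists only to get a JSON string through the shell. As a hedged alternative, the same request body could be produced in Python, where json.dumps takes care of the escaping. This is a sketch, not a tested replacement for the script: build_request_body is a made-up name, and reusing a JSON string literal as the GraphQL string literal is an assumption that holds for ordinary project paths.

```python
import json
from typing import Optional


def build_request_body(project_path: str, after: Optional[str] = None) -> str:
    """Return the JSON body for a POST to the GraphQL endpoint."""
    # json.dumps yields a double-quoted, escaped literal; it is reused here
    # as the GraphQL string literal (an assumption, see the lead-in).
    jobs_args = f"(after: {json.dumps(after)})" if after else ""
    query = f"""
    query {{
      project(fullPath: {json.dumps(project_path)}) {{
        jobs{jobs_args} {{
          pageInfo {{ endCursor startCursor }}
          nodes {{
            id
            name
            refName
            refPath
            commitPath
            artifacts {{ edges {{ node {{ downloadPath fileType }} }} }}
          }}
        }}
      }}
    }}
    """
    # The outer json.dumps escapes the quotes and newlines inside the query.
    return json.dumps({"query": query})


print(build_request_body("gitlab-org/gitlab"))
```

The returned string can be sent as the --data payload of the curl call, or with any HTTP client, together with the same Authorization and Content-Type headers.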

json2jobinfo.py: (joblist, job name, commit SHA) --> (slug of) archive url

Here is a Python script that will read the joblist JSON from stdin, and print the artifact archive path of the job + commit combination you specify.

#!/usr/bin/python3

# json2jobinfo.py

"""Read JSON from stdin, print archive path of job with correct (jobname, commit) combo.

The commit SHA does not have to be the full hash, just the start is enough.

Usage:
    json2jobinfo.py JOBNAME GITHASH

Example:
    json2jobinfo.py 'static-analysis 4/4' 41f949
    json2jobinfo.py 'static-analysis 4/4' 41f949d3a398968edb67e22526c93c2f5292c23d
"""


import sys, json
from pprint import pprint
from typing import List, Dict, Tuple


def matches_sha(commitPath: str, pattern: str) -> bool:
    """True if this commitPath's commit hash starts with {pattern}"""
    commit_part = commitPath.split('/')[-1]
    # print(commit_part)
    # print(pattern)
    return commit_part.startswith(pattern)


def filter_commit_job(jobs: List[Dict], jobname: str, commit: str) -> List[Dict]:
    """Given list of job dicts, find job with correct jobname and commit SHA"""
    return [
        j for j in jobs
        if matches_sha(j['commitPath'], commit)
        and j['name'] == jobname
    ]


def get_archive_url(job: Dict) -> str:
    """Given job dict, return download path of 'ARCHIVE' artifact"""
    archive = [
        arti for arti in job['artifacts']['edges']
        if arti['node']['fileType'] == 'ARCHIVE'
    ][0]
    return archive['node']['downloadPath']


def main_sans_io(graphql_reply: str, jobname: str, commit: str) -> Tuple[str, str]:
    """Return job id, artifact archive download path"""
    jobs = json.loads(graphql_reply)['data']['project']['jobs']['nodes']
    job = filter_commit_job(jobs, jobname, commit)[0]
    job_id = job['id'].split('/')[-1]
    archive_url = get_archive_url(job)
    return job_id, archive_url


def main(args):
    """Read stdin; look for job with correct jobname and commit; print
    download path of artifacts archive"""
    if len(args) == 3:
        jobname, commit = args[1], args[2]
    else:
        # hardcoded for example purposes
        jobname = 'static-analysis 4/4'
        commit = '41f949d3a398968edb67e22526c93c2f5292c23d'

    graphql_reply = sys.stdin.read()
    job_id, job_archive_url = main_sans_io(graphql_reply, jobname, commit)
    print(job_archive_url)

    # If you want to see the json, instead:
    # pprint(job)

if __name__ == '__main__':
    main(sys.argv)

Combined usage:

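# First, ensure $GRAPHQL_TOKEN contains your personal access token

# Save current directory (the two scripts are assumed to live here)
oldpwd=$(pwd)

# cd to a temporary directory
cd "$(mktemp -d)"

# Call the scripts via $oldpwd, because we are no longer in their directory.
# The result is a path without host, e.g.
# /gitlab-org/gitlab/-/jobs/1541128174/artifacts/download?file_type=archive
zip_path=$( \
    "$oldpwd/get-jobs-as-json.sh" gitlab-org/gitlab \
    | "$oldpwd/json2jobinfo.py" 'static-analysis 4/4' 41f949 \
)
# Prepend the GitLab host to turn the path into a full URL
curl \
    --location \
    --header "PRIVATE-TOKEN: <your_access_token>" \
    "https://gitlab.com$zip_path" > archive.zip
unzip archive.zip

# Extract the file we want
cp FILE/YOU/WANT "$oldpwd"

# Go back to where we were
cd "$oldpwd"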

Ideally, the code above will be folded into a single Python script that takes 5 inputs all in one place, and produces 1 output: (token, API URL, job name, commit sha, artefact path) -> artefact file. Edits welcome. For now, I've used shell as well as Python.

Also ideally, somebody will try out the code above and leave a comment whether they get it to work. I might test it myself. But not today.
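As a small step toward the single-script version mentioned above, the final "extract artifact" step can be done with Python's standard zipfile module instead of unzip and cp. This is a sketch only: extract_member is a made-up name, and it assumes the downloaded archive is already in memory as bytes.

```python
import io
import zipfile


def extract_member(archive_bytes: bytes, member: str) -> bytes:
    """Return the contents of one file inside a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        # ZipFile.read raises KeyError if `member` is not in the archive.
        return archive.read(member)
```

For the case in the question, member would be build.env, and the returned bytes would hold the BUILD_VERSION=... line.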

Esteis answered Jan 02 '23 20:01


You can't use CI/CD to pass artifacts between entirely unrelated pipelines. The fact that "building" runs on the merge request's source branch, and "deploying" runs on the result of the merge, doesn't imply that "deploying" is just the next stage. What if another MR was merged in between? What if there were merge conflicts?

In other words, you can't skip "building" on the main branch just because you built the development branch. Let "building" happen all the time, and limit "deploy" to main branch. In this setup, you can easily pass artifacts from "building" to "deploy".

Alternatively, if you want the merge event to actually update the main branch with the version state, just use a source-controlled VERSION file. That's what git is for. When you merge, main will take on the VERSION from the branch. If a different branch got in first, you'll have to resolve the conflict, as you should.

Peter answered Jan 02 '23 19:01


This is something you can pass by file.

Create a new variable in the building job:

 variables:
     CONFIG: "anyname"

Then, in the script, write the value to that file as a KEY=VALUE line so that it can be sourced later, for example:

- echo "BUILD_VERSION=$BUILD_VERSION" > $CI_PROJECT_DIR/$CONFIG

Add the file to the artifact paths:

artifacts:
   paths:
   - $CONFIG

Then, in the deploy job, define the same variable:

variables:
     CONFIG: "anyname"

and source the file:

- source $CI_PROJECT_DIR/$CONFIG

To make this work, keep dependencies so that the file is passed on, use "needs" to keep the artifacts available, and avoid clearing artifacts within the job.
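If the deploy step is driven by a script rather than by the shell, the same KEY=VALUE file can be written and read back without source. The sketch below is illustrative only: format_env and parse_env are made-up helpers, and the parser handles plain KEY=VALUE lines, with no quoting, export prefixes or variable expansion as a shell would.

```python
from typing import Dict


def format_env(values: Dict[str, str]) -> str:
    """Render variables as KEY=VALUE lines, one per variable."""
    return "".join(f"{key}={value}\n" for key, value in values.items())


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```

The building job would write format_env({"BUILD_VERSION": ...}) to the artifact file, and the deploying job would call parse_env on its contents.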

anynewscom answered Jan 02 '23 18:01