
SSH from shared Gitlab runner stopped working

This did work previously!

The deployment step in my pipeline SSHes onto a DO box & pulls the code from a Docker registry. As mentioned, this worked previously. This was the deploy step in my .gitlab-ci.yml back then, which worked fine (inspiration from here, under Using SSH):

deploy:
  stage: deploy
  image: docker:stable-dind
  only:
    - master
  services:
    # Specifying the DinD version here as the latest DinD version introduced a timeout bug
    # Highlighted here: https://forum.gitlab.com/t/gitlab-com-ci-stuck-on-docker-build/34401/2
    - docker:19.03.5-dind
  variables:
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: ""
  environment:
    name: production
  when: manual
  before_script:
    - mkdir -p ~/.ssh
    - echo "$DEPLOYMENT_SERVER_PRIVATE_KEY" | tr -d '\r' > ~/.ssh/id_rsa
    - chmod 600 ~/.ssh/id_rsa
    - eval "$(ssh-agent -s)"
    - ssh-add ~/.ssh/id_rsa
    - ssh-keyscan -H $DEPLOYMENT_SERVER_IP >> ~/.ssh/known_hosts
  script:
    - ssh -vvv gitlab@${DEPLOYMENT_SERVER_IP}
      "docker stop ${CI_PROJECT_NAME};
      docker rm ${CI_PROJECT_NAME};
      docker container prune -f;
      docker rmi ${CI_REGISTRY}/${CI_PROJECT_PATH};
      docker login -u ${CI_REGISTRY_USER} -p ${CI_REGISTRY_PASSWORD} ${CI_REGISTRY};
      docker pull ${CI_REGISTRY}/${CI_PROJECT_PATH}:latest;
      docker run -d -p ${HTTP_PORT}:${HTTP_PORT} --restart=always -m 800m --init --name ${CI_PROJECT_NAME} --net ${NETWORK_NAME} --ip ${NETWORK_IP} ${CI_REGISTRY}/${CI_PROJECT_PATH}:latest;"

I just attempted to run the deploy step again & it failed, coming back with this error:

...
 $ mkdir -p ~/.ssh
 $ echo "${DEPLOYMENT_SERVER_PRIVATE_KEY}" | tr -d '\r' > ~/.ssh/id_rsa
 $ chmod 600 ~/.ssh/id_rsa
 $ eval "$(ssh-agent -s)"
 Agent pid 22
 $ ssh-add ~/.ssh/id_rsa
 Identity added: /root/.ssh/id_rsa (/root/.ssh/id_rsa)
 $ ssh-keyscan -H ${DEPLOYMENT_SERVER_IP} >> ~/.ssh/known_hosts
 # xxx.xxx.xxx.xxx:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 # xxx.xxx.xxx.xxx:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 # xxx.xxx.xxx.xxx:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 # xxx.xxx.xxx.xxx:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 # xxx.xxx.xxx.xxx:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 $ ssh gitlab@${DEPLOYMENT_SERVER_IP} "docker stop ${CI_PROJECT_NAME}; docker rm ${CI_PROJECT_NAME}; docker container prune -f; docker rmi ${CI_REGISTRY}/${CI_PROJECT_PATH}; docker login -u ${CI_REGISTRY_USER} -p ${CI_REGISTRY_PASSWORD} ${CI_REGISTRY}; docker pull ${CI_REGISTRY}/${CI_PROJECT_PATH}:latest; docker run -d -p ${PORT}:${PORT} --restart always -m 2g --init --name ${CI_PROJECT_NAME} --net ${NETWORK_NAME} --ip ${NETWORK_IP} ${CI_REGISTRY}/${CI_PROJECT_PATH}:latest;"
 ssh: connect to host xxx.xxx.xxx.xxx port 22: Connection refused
Running after_script
00:02
Uploading artifacts for failed job
00:01
 ERROR: Job failed: exit code 255

Steps I took to set this up originally

  • Ran ssh-keygen -t rsa -b 2048 on the DO box (with no passphrase)
  • Added the public key to authorized_keys on the DO box
  • Copied the private key into a CI variable, DEPLOYMENT_SERVER_PRIVATE_KEY
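
The three setup steps above can be sketched as a small shell function. This is a minimal sketch, not the exact commands that were run: the function name setup_deploy_key and its directory argument are hypothetical, and it writes into whatever directory is passed in so it can be tried somewhere other than the real ~/.ssh.

```shell
# Hypothetical helper: generate a passphrase-less RSA key pair in "$1" and
# authorize its public half in "$1/authorized_keys".
setup_deploy_key() {
  dir="$1"
  mkdir -p "$dir" && chmod 700 "$dir" || return 1
  # -N "" sets an empty passphrase, -f picks the key file, -q keeps it quiet
  ssh-keygen -q -t rsa -b 2048 -N "" -f "$dir/id_rsa" || return 1
  # Append (not overwrite) so existing authorized keys are kept
  cat "$dir/id_rsa.pub" >> "$dir/authorized_keys"
  chmod 600 "$dir/id_rsa" "$dir/authorized_keys"
}
```

Running setup_deploy_key ~/.ssh on the DO box as the gitlab user would correspond to the first two bullets; the contents of the resulting id_rsa file would then be pasted into the DEPLOYMENT_SERVER_PRIVATE_KEY CI variable.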

I know the port is open for SSH, as I am able to SSH in from my local machine as the gitlab user. I have now changed my deployment step (based on comments from here, this article, & this one) to:

deploy:
  stage: deploy
  image: docker:stable-dind
  only:
    - master
  services:
    # Specifying the DinD version here as the latest DinD version introduced a timeout bug
    # Highlighted here: https://forum.gitlab.com/t/gitlab-com-ci-stuck-on-docker-build/34401/2
    - docker:19.03.5-dind
  variables:
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: ""
  environment:
    name: production
  when: manual
  before_script:
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    - eval $(ssh-agent -s)
    - echo "$DEPLOYMENT_SERVER_PRIVATE_KEY" | tr -d '\r' | ssh-add - > /dev/null
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - '[[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config'
    - cat ~/.ssh/config
    - echo ${CI_REGISTRY_USER}
    - ssh-keyscan -H ${DEPLOYMENT_SERVER_IP} >> ~/.ssh/known_hosts
  script:
    - ssh -vvv gitlab@${DEPLOYMENT_SERVER_IP}
      "docker stop ${CI_PROJECT_NAME};
      docker rm ${CI_PROJECT_NAME};
      docker container prune -f;
      docker rmi ${CI_REGISTRY}/${CI_PROJECT_PATH};
      docker login -u ${CI_REGISTRY_USER} -p ${CI_REGISTRY_PASSWORD} ${CI_REGISTRY};
      docker pull ${CI_REGISTRY}/${CI_PROJECT_PATH}:latest;
      docker run -d -p ${HTTP_PORT}:${HTTP_PORT} --restart=always -m 800m --init --name ${CI_PROJECT_NAME} --net ${NETWORK_NAME} --ip ${NETWORK_IP} ${CI_REGISTRY}/${CI_PROJECT_PATH}:latest;"

Still to no avail! The verbose logging of ssh spat out:

...
 $ which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )
 /usr/bin/ssh-agent
 $ eval $(ssh-agent -s)
 Agent pid 18
 $ echo "$DEPLOYMENT_SERVER_PRIVATE_KEY" | tr -d '\r' | ssh-add - > /dev/null
 Identity added: (stdin) ((stdin))
 $ mkdir -p ~/.ssh
 $ chmod 700 ~/.ssh
 $ [[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config
 $ cat ~/.ssh/config
 Host *
    StrictHostKeyChecking no
 $ echo ${CI_REGISTRY_USER}
 gitlab-ci-token
 $ ssh-keyscan -H ${DEPLOYMENT_SERVER_IP} >> ~/.ssh/known_hosts
 # xxx.209.184.138:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 # xxx.209.184.138:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 # xxx.209.184.138:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 # xxx.209.184.138:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 # xxx.xxx.xxx.xxx:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
 $ ssh -vvv gitlab@${DEPLOYMENT_SERVER_IP}
 OpenSSH_8.3p1, OpenSSL 1.1.1g  21 Apr 2020
 debug1: Reading configuration data /root/.ssh/config
 debug1: /root/.ssh/config line 1: Applying options for *
 debug1: Reading configuration data /etc/ssh/ssh_config
 debug2: resolve_canonicalize: hostname 134.xxx.xxx.xxx is address
 Pseudo-terminal will not be allocated because stdin is not a terminal.
 debug1: Authenticator provider $SSH_SK_PROVIDER did not resolve; disabling
 debug2: ssh_connect_direct
 debug1: Connecting to xxx.xxx.xxx.xxx [xxx.xxx.xxx.xxx] port 22.
 debug1: connect to address xxx.xxx.xxx.xxx port 22: Connection refused
 ssh: connect to host xxx.xxx.xxx.xxx port 22: Connection refused
 ERROR: Job failed: exit code 255

I also added the -T option suggested here to disable pseudo-tty allocation but all that did was remove the pseudo line from the logs.
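
Note that "Connection refused" in the verbose output is reported at the TCP connect stage, before any key exchange or authentication takes place, so the key setup is not yet involved at that point. A minimal sketch of a reachability check that could be run from the job just before the ssh command follows. The function name can_connect is hypothetical, and the sketch assumes bash and a timeout binary are available, which may not hold for the docker:stable-dind image used above.

```shell
# Hypothetical helper: succeed only if a TCP connection to host ($1) and
# port ($2) can be opened within 5 seconds.
# Assumes bash (for its /dev/tcp redirection) and coreutils `timeout`.
can_connect() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, can_connect "$DEPLOYMENT_SERVER_IP" 22 || echo "port 22 refused or timed out" would show whether the refusal happens independently of ssh itself.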

EDIT

Looking at the logs on the DO box (/var/log/auth.log), I've got the error:

Jun 22 15:53:37 exchange-apis sshd[16159]: Connection closed by 35.190.162.232 port 49750 [preauth]
Jun 22 15:53:38 exchange-apis sshd[16160]: Connection closed by 35.190.162.232 port 49754 [preauth]
Jun 22 15:53:38 exchange-apis sshd[16162]: Connection closed by 35.190.162.232 port 49752 [preauth]
Jun 22 15:53:38 exchange-apis sshd[16163]: Unable to negotiate with 35.190.162.232 port 49756: no matching host key type found. Their offer: [email protected] [preauth]
Jun 22 15:53:38 exchange-apis sshd[16161]: Unable to negotiate with 35.190.162.232 port 49758: no matching host key type found. Their offer: [email protected] [preauth]
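
All five entries come from a single address within about one second, which matches the five banner lines that ssh-keyscan printed in the job output. That suggests, though nothing in the logs confirms it, that these entries were produced by the ssh-keyscan probes rather than by the failing ssh command. A small sketch for tallying such pre-auth entries per client address is shown below; the function name count_preauth_by_ip is hypothetical.

```shell
# Hypothetical helper: count sshd "[preauth]" log lines per client IP.
# Reads auth.log-style text on stdin and prints "<count> <ip>" per address.
count_preauth_by_ip() {
  awk '/\[preauth\]/ {
         # The client address follows "by" (Connection closed by ...)
         # or "with" (Unable to negotiate with ...)
         for (i = 1; i < NF; i++)
           if ($i == "by" || $i == "with") { n[$(i + 1)]++; break }
       }
       END { for (ip in n) print n[ip], ip }'
}
```

It could be used as count_preauth_by_ip < /var/log/auth.log (reading that file may require root).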

Googling this error, the common cause seems to be OpenSSH dropping support for DSA keys. However, I'm not sure why this would affect me, as I generated an RSA key pair. Anyway, running dpkg --list | grep openssh spits out:

ii  openssh-client                         1:7.6p1-4ubuntu0.3                              amd64        secure shell (SSH) client, for secure access to remote machines
ii  openssh-server                         1:7.6p1-4ubuntu0.3                              amd64        secure shell (SSH) server, for secure access from remote machines
ii  openssh-sftp-server                    1:7.6p1-4ubuntu0.3                              amd64        secure shell (SSH) sftp server module, for SFTP access from remote machines

& sshd -v spits out:

OpenSSH_7.6p1 Ubuntu-4ubuntu0.3, OpenSSL 1.0.2n  7 Dec 2017

Nevertheless, I worked off the answers here & here, so my deploy stage is now:

deploy:
  stage: deploy
  image: docker:stable-dind
  only:
    - master
  services:
    # Specifying the DinD version here as the latest DinD version introduced a timeout bug
    # Highlighted here: https://forum.gitlab.com/t/gitlab-com-ci-stuck-on-docker-build/34401/2
    - docker:19.03.5-dind
  variables:
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: ""
  environment:
    name: production
  when: manual
  before_script:
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    - mkdir -p ~/.ssh
    - echo "$DEPLOYMENT_SERVER_PRIVATE_KEY" | tr -d '\r' > ~/.ssh/id_rsa
    - chmod 600 ~/.ssh/id_rsa
    - '[[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\tHostkeyAlgorithms +ssh-dss\n\tPubkeyAcceptedKeyTypes +ssh-dss\n\n" > ~/.ssh/config'
    - cat ~/.ssh/config
    - ssh-keyscan -H ${DEPLOYMENT_SERVER_IP} >> ~/.ssh/known_hosts
    - chmod 644 ~/.ssh/known_hosts
  script:
    - ssh -oHostKeyAlgorithms=+ssh-dss gitlab@${DEPLOYMENT_SERVER_IP} ls

Still no luck with that & I get the same error in the output of the runner & in the log on the DO box. Any ideas?

wmash asked Oct 26 '22 20:10


1 Answer

Ideally, if you can log on to the DO box, you would stop the ssh service and launch /usr/sbin/sshd -de, in order to establish a debug session on the SSH daemon side, with logs written to stderr (instead of the system log).

But if you cannot, at least try and generate an rsa key without passphrase, for testing. That means you don't need the ssh-agent.
And try a ssh -Tv gitlab@${DEPLOYMENT_SERVER_IP} ls to see what log is produced there.

Try with a classic PEM format

ssh-keygen -t rsa -P "" -m PEM

"After editing the pipeline a bit more, I've noticed that it is actually this line that is causing the issue: ssh-keyscan -H ${DEPLOYMENT_SERVER_IP} >> ~/.ssh/known_hosts"

It can be the case if it leads to a badly formatted ~/.ssh/known_hosts, especially if the ${DEPLOYMENT_SERVER_IP} is not correctly set.
Try adding an echo "DEPLOYMENT_SERVER_IP='${DEPLOYMENT_SERVER_IP}'" command and a cat ~/.ssh/known_hosts command to the before_script section, to know more.
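
Beyond printing the file, a rough well-formedness check on known_hosts can be sketched as below. The function name check_known_hosts is hypothetical, and the check is only an approximation: it requires every non-empty, non-comment line to have at least three fields (host, key type, key).

```shell
# Hypothetical helper: return 0 only if the known_hosts file ($1) exists,
# is non-empty, and every non-comment line has at least three fields.
check_known_hosts() {
  [ -s "$1" ] || { echo "known_hosts missing or empty: $1" >&2; return 1; }
  awk '!/^#/ && NF > 0 && NF < 3 {
         bad = 1
         print "malformed line " NR ": " $0
       }
       END { exit bad + 0 }' "$1"
}
```

Calling check_known_hosts ~/.ssh/known_hosts right after the ssh-keyscan line would make the job fail early if the scan wrote nothing or wrote truncated entries, for instance because ${DEPLOYMENT_SERVER_IP} was empty.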

VonC answered Oct 29 '22 15:10