
Cloning a Gitlab project to a Google Colab instance using SSH or HTTPS

I would like to connect a Google Colab instance with a GitLab project, but neither SSH nor HTTPS seems to work. From the error messages, I suspect a settings-related issue in Colab. Maybe I have to allow Colab to connect to GitLab and put it on a whitelist somewhere?

Running the following shell commands from a notebook in Colab, with '/content' as the working directory,

git config --global user.name "mr_bla"
git config --global user.email "[email protected]"
git clone https://gitlab.com/mr_bla/mr_blas_project.git

results in the following error messages:

Cloning into 'mr_blas_project'...
fatal: could not read Username for 'https://gitlab.com': No such device or address

I have generated SSH keys as I'm used to, but the SSH check

ssh -vvvT git@gitlab.com:mr_bla/mr_blas_project.git

fails, leading to the following error:

OpenSSH_7.6p1 Ubuntu-4ubuntu0.3, OpenSSL 1.0.2n  7 Dec 2017
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug2: resolving "gitlab.com:mr_bla/mr_blas_project.git" port 22
ssh: Could not resolve hostname gitlab.com:mr_bla/mr_blas_project.git: Name or service not known

Trying the SSH-way to clone a project doesn't work either:

git clone git@gitlab.com:mr_bla/mr_blas_project.git

results in:

Cloning into 'mr_blas_project'...
Host key verification failed.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

The Google Colab instance is running the following OS:

cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

I've checked, among many others, the following questions without success:

Jan Spörer asked Nov 27 '19 17:11



2 Answers

Here is the workflow I follow to keep my Google Colab notebooks under persistent version control with GitLab (I guess it would be quite similar with GitHub).

I use GitLab Personal Access Tokens (PATs) so that this also works with private repositories.

Workflow

  • Create a Personal Access Token in GitLab

    • From Edit Profile/User Settings go to Access Tokens
      • Then enter a name for the token (you will need it later) and an optional expiry date
      • Select desired scopes:
        • read_repository: Read-only (pull) for the repository through git clone
        • write_repository: Read-write (pull, push) for the repository.
      • Press Create personal access token
      • Save the personal access token somewhere safe. After you leave the page, you no longer have access to the token.
  • Then, for Colab to interact with GitLab, store the .git folder of the repository in a Google Drive folder so that it persists between Colab sessions

    • Let's say that you have a folder in Gdrive with some files you want to version control with Git:

      • /RootGDrive/Folder1/Folder2
    • Mount Google Drive in the Colab container file system. Let's say you mount it on /content/myfiles. Run the following lines in a notebook cell (this outputs a URL that you have to visit to give the Colab instance OAuth2 access to your Google Drive):

      from google.colab import drive
      drive.mount('/content/myfiles')
      
      • This mounts the root folder of your Google Drive at /content/myfiles/MyDrive in the container file system
    • Once mounted, change directory with the %cd magic command (!cd will not work: each shell command runs in a temporary subshell, so the change does not persist)

      %cd "/content/myfiles/MyDrive/Folder1/Folder2"
      !pwd
      
    • Once there, initialize the git repository (only the first time). Because all of this happens in your Google Drive, the repository persists between sessions; otherwise it would be removed as soon as you leave the Google Colab session.

       !git init
      
      • This creates the .git folder within your Google Drive folder
    • Now configure the usual git parameters locally (so they are stored in the .git folder); they are needed when pushing/pulling (again, this has to be done only the first time):

      !git config --local user.email your_gitlab_mail@your_domain.com 
      !git config --local user.name your_gitlab_name
      
    • Now add the remote using the PAT created before (again this is done just the first time):

      • Key point: the remote URL format (it has to be over HTTPS) depends on whether the GitLab project (repo) is under a group/subgroups or not:

        • Under a group (it could be /group/subgroup1/subgroup2/.../project.git or just /group/project.git)

          !git remote add origin https://<pat_name>:<pat_code>@gitlab.com/group_name/subgroup1/project_name.git
          
        • NOT Under a group

          !git remote add origin https://<pat_name>:<pat_code>@gitlab.com/your_gitlab_username/project_name.git
          
    • Now the git repository is configured within the Google Drive folder, not just in the container file system, so you can pull/push and run all the usual git commands

      !git add .
      !git commit -m "First commit"
      !git push -u origin master
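
The two remote URL formats above differ only in the project path, so the URL can also be assembled in Python before passing it to git. This is just a minimal sketch: `build_remote_url` is a hypothetical helper name of mine, and the percent-encoding of the credentials is my own precaution for token names or codes that contain characters such as '@' or ':'.

```python
from urllib.parse import quote


def build_remote_url(pat_name, pat_code, project_path, host="gitlab.com"):
    """Return an HTTPS remote URL with the token embedded as credentials.

    Hypothetical helper, not part of git or GitLab.
    """
    # Percent-encode the credentials so '@' or ':' cannot break the URL.
    user = quote(pat_name, safe="")
    token = quote(pat_code, safe="")
    # Accept the path with or without a leading slash or a .git suffix.
    path = project_path.strip("/")
    if not path.endswith(".git"):
        path += ".git"
    return f"https://{user}:{token}@{host}/{path}"


# Project under a group/subgroup
print(build_remote_url("my_pat_name", "XXXX", "group_name/subgroup1/project_name"))
# Project directly under a user namespace
print(build_remote_url("my_pat_name", "XXXX", "your_gitlab_username/project_name.git"))
```

In a notebook cell the result can be stored in a variable and used as `!git remote add origin $url`.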
      

Once this has been done the first time, you can keep version controlling the files in MyDrive/Folder1/Folder2 with Git and GitLab: create a notebook that mounts the Google Drive and runs the git commands you want while you edit the other files in the folder. (Again, I guess it is very similar with GitHub, but for me the Groups feature of GitLab is quite valuable.)

I would say the best way is to have a parametrized notebook that checks whether this is the first run, does the git initialization if so, and otherwise just adds/commits/pushes to the GitLab repository.

Cloning

To just clone into the container file system (or into Google Drive if it is already mounted), use the same remote URL explained above with git clone:

  • Under a group

      !git clone https://<pat_name>:<pat_code>@gitlab.com/group_name/project_name.git
    
  • NOT Under a group

      !git clone https://<pat_name>:<pat_code>@gitlab.com/gitlab_user_name/project_name.git
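
If you prefer to run the clone from Python instead of a `!` command, something like the following should work. It is only a sketch under my own assumptions: `clone_with_token` and `mask_token` are hypothetical helper names, and the masking is there because git may echo the remote URL (including the token) in its error messages.

```python
import subprocess


def mask_token(text, token):
    """Replace every occurrence of the token so it does not show up in cell output."""
    return text.replace(token, "***") if token else text


def clone_with_token(remote_url, token, destination):
    """Run `git clone` and return (succeeded, output) with the token masked."""
    result = subprocess.run(
        ["git", "clone", remote_url, destination],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,
    )
    return result.returncode == 0, mask_token(result.stdout + result.stderr, token)


# Example call with the placeholders used above (not executed here):
# ok, log = clone_with_token(
#     "https://<pat_name>:<pat_code>@gitlab.com/group_name/project_name.git",
#     "<pat_code>",
#     "project_name",
# )
```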
    

Edit: I am adding the notebook I created for interacting between Colab and GitLab, called Gitlab_Colab_Interaction.ipynb, so you can use it directly from Colab:

Imports

import os
from pathlib import Path

Parameters

# Paths
container_folder_abspath = Path('/content/myfiles')
gdrive_subfolder_relpath = Path('MyDrive/Colab Notebooks/PathTo/FolderYouWant') # No need to escape the space with pathlib Paths
gitlab_project_relpath = Path('group_name/subgroup1/YourProject.git') # Relative path: no leading slash, the URL below adds it
# Personal Access Token
PAT_name = 'my_pat_name'
PAT_code = 'XXXX_PAT_CODE_XXXXX'

Mount Drive

from google.colab import drive
drive.mount(str(container_folder_abspath))


fullpath = container_folder_abspath / gdrive_subfolder_relpath # Path objects with the operator /
%cd $fullpath
!pwd
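
As an aside, the reason the cell above uses `%cd` and not `!cd` can be reproduced in plain Python. This is a minimal sketch, assuming that a `!` command behaves like a child shell process and that `%cd` is roughly equivalent to `os.chdir`; the temporary directory is only a stand-in for the Drive folder.

```python
import os
import subprocess
import tempfile

start_dir = os.getcwd()
target_dir = tempfile.mkdtemp()  # hypothetical stand-in for the Drive folder

# Like `!cd`: the directory changes only inside the child shell, which exits right away.
subprocess.run(f'cd "{target_dir}"', shell=True, check=True)
unchanged = os.getcwd() == start_dir

# Like `%cd`: the working directory of the notebook process itself changes.
os.chdir(target_dir)
changed = os.path.samefile(os.getcwd(), target_dir)

print(unchanged, changed)

# Go back so the rest of the session is not affected by this experiment.
os.chdir(start_dir)
```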

Initialization (or not)

# A .git folder means the repository was already initialized in an earlier session
initialization = not (fullpath / '.git').is_dir()
if not initialization:
    print('Folder already initialized as a git repository!')
    

gitlab_url = 'https://' + PAT_name + ':' + PAT_code + '@gitlab.com/' + str(gitlab_project_relpath)
if initialization:
    !git init
    !git config --local user.email [email protected]
    !git config --local user.name your_gitlab_user
    !git remote add origin $gitlab_url # Check that PATs are still valid
    !echo "GitLab_Colab_Interaction.ipynb" >> ".gitignore" # To ignore this file itself if it is included in the folder

else:
    print("### Current Status ###")
    !git status
    print("\n\n### Git log ###")
    !git log

Git Commands

# Git Add
!git add *.ipynb # For example to add just the modified notebooks

# Git Commit
!git commit -m "My commit message"

# Git Push
!git push -u origin master # Use your branch name here; a fresh git init typically creates master, while newer GitLab projects default to main
Gonzalo Polo answered Sep 30 '22 07:09


If it is a private repo, you could use a GitLab deploy token or a GitLab personal access token. You would then just run

git clone https://<deploy_username>:<deploy_token>@gitlab.example.com/tanuki/awesome_project.git

Note that you probably don't want the code above in your notebook with the sensitive <deploy_token> exposed. You could hide it, for example, by putting it in an executable script stored on your mounted Drive, or I think you can hide the code cell.
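
Another way to keep the token out of the notebook is to read it at run time. The following is only a sketch of that idea: `get_token` is a hypothetical helper and the environment variable name `GITLAB_TOKEN` is my own assumption, not something GitLab or Colab defines.

```python
import os
from getpass import getpass


def get_token(env_var="GITLAB_TOKEN"):
    """Return the token from the environment, or prompt for it without echoing it."""
    token = os.environ.get(env_var)
    if not token:
        # getpass hides the typed value, so it is not saved in the cell output.
        token = getpass("GitLab token: ")
    return token


# Example use in a notebook cell (not executed here):
# token = get_token()
# clone_url = f"https://<deploy_username>:{token}@gitlab.example.com/tanuki/awesome_project.git"
```

The token then lives only in the memory of the running session and never in the saved .ipynb file.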

Bernie Lindner answered Sep 30 '22 06:09