
How to automatically get a certain file (>1MB) from git

Tags:

git

github

I want to grab a certain file from a private GitHub repository daily on Linux. Files under 1MB are no problem: I fetch them through the Get contents API with the following curl command.

curl -H "Content-Type: application/json" -H "Authorization: token $TOKEN" -H 'Accept: application/vnd.github.v3.raw' -O $FILEPATH

Now that the file has grown beyond 1MB, this no longer works and I don't know how to proceed.

GitHub tells me to use the Git Data API to get a blob (up to 100MB, more than enough for me).

I've been trying to find a way to get the SHA1 of this frequently updated file, but I haven't come across any applicable method yet. Any suggestions?

Or is there a method other than the GitHub API?

Thanks in advance.

H. Jiang asked Aug 12 '16 05:08




1 Answer

If the file's path in the repository is known, you can get its SHA using the Contents API. For example:

~ λ curl -H "Content-Type: application/json" \
    -H "Authorization: token $TOKEN" \
    -H "Accept: application/vnd.github.v3" \
    https://api.github.com/repos/smt116/dotfiles/contents/README.md

{
  "name": "README.md",
  "path": "README.md",
  "sha": "36bba4cf1f8fd3cbbdf81d4cc2291b54a4e56a63",
  "size": 16,
  "url": "https://api.github.com/repos/smt116/dotfiles/contents/README.md?ref=master",
  "html_url": "https://github.com/smt116/dotfiles/blob/master/README.md",
  "git_url": "https://api.github.com/repos/smt116/dotfiles/git/blobs/36bba4cf1f8fd3cbbdf81d4cc2291b54a4e56a63",
  "download_url": "https://raw.githubusercontent.com/smt116/dotfiles/master/README.md",
  "type": "file",
  "content": "IyMgTXkgZG90ZmlsZXMuCg==\n",
  "encoding": "base64",
  "_links": {
    "self": "https://api.github.com/repos/smt116/dotfiles/contents/README.md?ref=master",
    "git": "https://api.github.com/repos/smt116/dotfiles/git/blobs/36bba4cf1f8fd3cbbdf81d4cc2291b54a4e56a63",
    "html": "https://github.com/smt116/dotfiles/blob/master/README.md"
  }
}

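Now you can download the file with the Git Data API using the git_url link that is included in the JSON response. Send the Accept: application/vnd.github.v3.raw header with that request to receive the raw file contents instead of a JSON document with base64-encoded content.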
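The two steps above could be scripted roughly as follows. This is only a sketch under some assumptions: the function names (extract_sha, blob_url, download_blob) are made up for illustration, TOKEN is expected in the environment, and the SHA is pulled out with sed, which relies on the pretty-printed layout shown in the response above (a real JSON parser would be more robust). If the Contents API refuses the large file itself, the same SHA can also be read from a Git Trees listing.

```shell
#!/bin/sh
# Hypothetical helpers; names and argument order are illustrative only.

# Read a pretty-printed Contents API response on stdin and print the first
# 40-character "sha" value found in it.
extract_sha() {
    sed -n 's/^ *"sha": *"\([0-9a-f]\{40\}\)".*/\1/p' | head -n 1
}

# Build the Git Data API blob URL. Arguments: owner repo sha
blob_url() {
    printf 'https://api.github.com/repos/%s/%s/git/blobs/%s\n' "$1" "$2" "$3"
}

# Download the blob as raw bytes. Arguments: owner repo sha output_file
# Assumes TOKEN holds a personal access token with access to the repository.
download_blob() {
    curl -sS -f \
        -H "Authorization: token $TOKEN" \
        -H "Accept: application/vnd.github.v3.raw" \
        -o "$4" \
        "$(blob_url "$1" "$2" "$3")"
}

# Example (not run here, needs network access and a valid TOKEN):
#   sha=$(curl -sS -H "Authorization: token $TOKEN" \
#       https://api.github.com/repos/smt116/dotfiles/contents/README.md | extract_sha)
#   download_blob smt116 dotfiles "$sha" README.md
```

Since the SHA is looked up on every run, a script like this could be started daily from cron and would always fetch the current version of the file.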

However, if you want to download all blobs from a given repository, you can use the Git Trees API to fetch the list first. You need to specify a commit SHA, but you can use HEAD if the most recent commit is okay. For example:

~ λ curl -H "Content-Type: application/json" \
      -H "Authorization: token $TOKEN" \
      -H "Accept: application/vnd.github.v3.raw" \
      https://api.github.com/repos/smt116/dotfiles/git/trees/HEAD

{
  "sha": "0fc96d75ff4182913cec229978bb10ad338012fd",
  "url": "https://api.github.com/repos/smt116/dotfiles/git/trees/0fc96d75ff4182913cec229978bb10ad338012fd",
  "tree": [
    {
      "path": ".agignore",
      "mode": "100644",
      "type": "blob",
      "sha": "e2ca571728887bce8255ab3f66061dde53ffae4f",
      "size": 21,
      "url": "https://api.github.com/repos/smt116/dotfiles/git/blobs/e2ca571728887bce8255ab3f66061dde53ffae4f"
    },
    {
      "path": ".bundle",
      "mode": "040000",
      "type": "tree",
      "sha": "4148d567286de6aa47047672b1f2f73d7bea349b",
      "url": "https://api.github.com/repos/smt116/dotfiles/git/trees/4148d567286de6aa47047672b1f2f73d7bea349b"
    },
    ...

To get details of all files, including those in subdirectories, add the recursive=1 query parameter to the URL.

Then parse the JSON response, keep the items whose type is blob, and download each file using its url attribute.
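A rough sketch of that parse-and-filter step, again with made-up function names (tree_url, list_blobs, download_all) and TOKEN assumed to be set. It uses awk on the pretty-printed response shown above, so it would break on paths that contain double quotes; if jq is available, a filter such as .tree[] | select(.type == "blob") is the sturdier choice.

```shell
#!/bin/sh
# Hypothetical helpers; names and argument order are illustrative only.

# Build the recursive Git Trees URL. Arguments: owner repo ref
tree_url() {
    printf 'https://api.github.com/repos/%s/%s/git/trees/%s?recursive=1\n' "$1" "$2" "$3"
}

# Read a pretty-printed Git Trees response on stdin and print one line per
# blob entry: path, a tab, then the blob URL. Entries of type "tree" are skipped.
list_blobs() {
    awk -F'"' '
        $2 == "path" { path = $4 }
        $2 == "type" { type = $4 }
        $2 == "url" && type == "blob" { print path "\t" $4; type = "" }
    '
}

# Download every blob into the current directory, recreating subdirectories.
# Arguments: owner repo ref. Assumes TOKEN holds a personal access token.
download_all() {
    tab=$(printf '\t')
    curl -sS -f -H "Authorization: token $TOKEN" "$(tree_url "$1" "$2" "$3")" |
        list_blobs |
        while IFS="$tab" read -r path url; do
            mkdir -p "$(dirname "$path")"
            curl -sS -f \
                -H "Authorization: token $TOKEN" \
                -H "Accept: application/vnd.github.v3.raw" \
                -o "$path" "$url"
        done
}

# Example (not run here, needs network access and a valid TOKEN):
#   download_all smt116 dotfiles HEAD
```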

Maciej Małecki answered Oct 19 '22 21:10