
How to download and write a file from GitHub using Requests

Let's say there's a file that lives in a GitHub repo at:

https://github.com/someguy/brilliant/blob/master/somefile.txt

I'm trying to use requests to fetch this file and write its contents to disk in the current working directory, where they can be used later. Right now, I'm using the following code:

import requests
from os import getcwd

url = "https://github.com/someguy/brilliant/blob/master/somefile.txt"
directory = getcwd()
filename = directory + 'somefile.txt'
r = requests.get(url)

f = open(filename,'w')
f.write(r.content)

Undoubtedly ugly, and more importantly, not working. Instead of the expected text, I get:

<!DOCTYPE html>
<!--

Hello future GitHubber! I bet you're here to remove those nasty inline styles,
DRY up these templates and make 'em nice and re-usable, right?

Please, don't. https://github.com/styleguide/templates/2.0

-->
<html>
  <head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>Page not found &middot; GitHub</title>
    <style type="text/css" media="screen">
      body {
        background: #f1f1f1;
        font-family: "HelveticaNeue", Helvetica, Arial, sans-serif;
        text-rendering: optimizeLegibility;
        margin: 0; }

      .container { margin: 50px auto 40px auto; width: 600px; text-align: center; }

      a { color: #4183c4; text-decoration: none; }
      a:visited { color: #4183c4 }
      a:hover { text-decoration: none; }

      h1 { letter-spacing: -1px; line-height: 60px; font-size: 60px; font-weight: 100; margin: 0px; text-shadow: 0 1px 0 #fff; }
      p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; }

      ul { list-style: none; margin: 25px 0; padding: 0; }
      li { display: table-cell; font-weight: bold; width: 1%; }
      #error-suggestions { font-size: 14px; }
      #next-steps { margin: 25px 0 50px 0;}
      #next-steps li { display: block; width: 100%; text-align: center; padding: 5px 0; font-weight: normal; color: rgba(0, 0, 0, 0.5); }
      #next-steps a { font-weight: bold; }
      .divider { border-top: 1px solid #d5d5d5; border-bottom: 1px solid #fafafa;}

      #parallax_wrapper {
        position: relative;
        z-index: 0;
      }
      #parallax_field {
        overflow: hidden;
        position: absolute;
        left: 0;
        top: 0;
        height: 370px;
        width: 100%;
      }

etc etc.

So I get content from GitHub, but not the content of the file. What am I doing wrong?

Asked by Fomite, Jan 02 '13 10:01




1 Answer

The content of the file in question is included in the returned data. You are getting the full GitHub view of that file, not just the contents.

If you want to download just the file, you need to use the Raw link at the top of the page, which will be (for your example):

https://raw.github.com/someguy/brilliant/master/somefile.txt 

Note the change in domain name, and the blob/ part of the path is gone.
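The URL rewrite described above can also be done in code. The following is only a sketch for Python 3: the helper name `blob_to_raw` and its `raw_host` parameter are made up for illustration and are not part of requests or any GitHub API. It assumes the blob URL has the layout `<user>/<repo>/blob/<branch>/<path>` and defaults to the `raw.github.com` host used in this answer; GitHub also serves raw files from `raw.githubusercontent.com`, which can be passed as `raw_host` instead.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_raw(url, raw_host="raw.github.com"):
    """Turn a GitHub 'blob' page URL into the matching raw-file URL.

    Hypothetical helper: swaps the domain and drops the 'blob' path segment.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Assumed layout: <user>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a GitHub blob URL: %r" % url)
    del segments[2]  # remove the 'blob' segment
    return urlunsplit((parts.scheme, raw_host, "/" + "/".join(segments), "", ""))


print(blob_to_raw("https://github.com/someguy/brilliant/blob/master/somefile.txt"))
# -> https://raw.github.com/someguy/brilliant/master/somefile.txt
```

The result can then be passed to `requests.get()` in place of the original page URL.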

To demonstrate this with the requests GitHub repository itself:

>>> import requests
>>> r = requests.get('https://github.com/kennethreitz/requests/blob/master/README.rst')
>>> 'Requests:' in r.text
True
>>> r.headers['Content-Type']
'text/html; charset=utf-8'
>>> r = requests.get('https://raw.github.com/kennethreitz/requests/master/README.rst')
>>> 'Requests:' in r.text
True
>>> r.headers['Content-Type']
'text/plain; charset=utf-8'
>>> print r.text
Requests: HTTP for Humans
=========================


.. image:: https://travis-ci.org/kennethreitz/requests.png?branch=master
[... etc. ...]
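For the original goal of writing the file into the current working directory, a minimal Python 3 sketch might look like the following. Two things in the question's code are worth changing besides the URL: `directory + 'somefile.txt'` joins the path without a separator, so `os.path.join` is used here, and `r.content` is bytes, so the file is opened in binary mode. The function name `download` and its `get` parameter are hypothetical; `get` exists only so a different fetch function can be swapped in, and by default it falls back to `requests.get`.

```python
import os


def download(url, filename, directory=None, get=None):
    """Fetch url and write the response body to directory/filename.

    Hypothetical helper; `get` defaults to requests.get.
    """
    if get is None:
        import requests  # third-party; imported here so `get` can replace it
        get = requests.get
    response = get(url)
    response.raise_for_status()  # fail loudly on 404 and other HTTP errors
    # os.path.join inserts the path separator that plain '+' would omit
    path = os.path.join(directory or os.getcwd(), filename)
    with open(path, "wb") as f:  # binary mode: response.content is bytes
        f.write(response.content)
    return path
```

Calling `download('https://raw.github.com/someguy/brilliant/master/somefile.txt', 'somefile.txt')` would then save the file next to wherever the script is run from. With `raise_for_status()` in place, a wrong URL raises an exception rather than silently writing GitHub's error page to disk.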
Answered by Martijn Pieters, Sep 18 '22 11:09
