
GitPython: How can I access the contents of a file in a commit?

I am new to GitPython and am trying to get the contents of a file within a commit. I can list each file in a specific commit, but every time I run an external command on one of those files I get an error. I know the file exists in the repository, yet each run of my program fails with:

 returned non-zero exit status 1

I am using Python 2.7.6 and Ubuntu Linux 14.04.

I know the file exists because I can go into Git from the command line, check out the respective commit, find the file, and display its contents with cat. Often when the error shows up, it says that the file in question does not exist. I am trying to go through each commit with GitPython, get every blob (file) from each commit, and run an external Java program on the contents of that file. The Java program is designed to return a string to Python, which I capture with subprocess.check_output. Any help will be greatly appreciated.

I tried passing in the command as a list:

cmd = ['java', '-classpath', '/home/rahkeemg/workspace/CSCI499_Java/bin/:/usr/local/lib/*:', 'java_gram.mainJava','absolute/path/to/file']
subprocess.check_output(cmd, stderr=subprocess.STDOUT, shell=False)

And I have also tried passing the command as a string:

subprocess.check_output('java -classpath /home/rahkeemg/workspace/CSCI499_Java/bin/:/usr/local/lib/*: java_gram.mainJava {file}'.format(file=entry.abspath.strip()), shell=True)

Is it possible to access the contents of a file from GitPython? For example, say there is a commit with one file, foo.java, containing the following lines of code:

foo.java

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class foo{
    public static void main(String[] args) throws Exception{}
}

I want to access everything in the file and run an external program on it. Any help would be greatly appreciated. Below is part of the code I am using:

#!/usr/bin/env python

__author__ = 'rahkeemg'

from git import *
import git, json, subprocess, re


git_dir = '/home/rahkeemg/Documents/GitRepositories/WhereHows'


# make an instance of the repository from specified path
repo = Repo(path=git_dir)

heads = repo.heads  # obtain the branch heads of the repository
master = heads.master  # get the master branch

print master

# get all of the commits on the master branch
commits = list(repo.iter_commits(master))

cmd = ['java', '-classpath', '/home/rahkeemg/workspace/CSCI499_Java/bin/:/usr/local/lib/*:', 'java_gram.mainJava']

# start at the very 1st commit, or start at commit 0
for i in range(len(commits) - 1, 0, -1):
    commit = commits[i]
    commit_num = len(commits) - 1 - i
    print commit_num, ": ", commit.hexsha, '\n', commit.message, '\n'

    for entry in commit.tree.traverse():
        if re.search(r'\.java', entry.path):

            current_file = str(entry.abspath.strip())

            # add the current file or blob to the list for the command to run
            cmd.append(current_file)
            print entry.abspath

            try:

                # This is the scenario where I pass arguments into command as a string
                print subprocess.check_output('java -classpath /home/rahkeemg/workspace/CSCI499_Java/bin/:/usr/local/lib/*: java_gram.mainJava {file}'.format(file=entry.abspath.strip()), shell=True)

                # scenario where I pass arguments into command as a list
                j_response = subprocess.check_output(cmd, stderr=subprocess.STDOUT, shell=False)

            except subprocess.CalledProcessError as e:
                print "Error on file: ", current_file

            # Use pop on list to remove the last string, which is the selected file at the moment, to make place for the next file.
            cmd.pop()

Rahkeem George asked Apr 05 '16 14:04


1 Answer

First of all, traversing the commit history like this does not check out any files. All you get from entry.abspath is a file name inside the working tree. A file may or may not exist at that path, and if it does, it holds the contents of the currently checked-out revision, never those of the commit you are iterating over.
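
To confirm this diagnosis, it can help to check the path before launching the program and to print what the failing command actually wrote, since CalledProcessError carries the captured output. A minimal Python 3 sketch, where run_and_report is a hypothetical helper name and not part of GitPython:

```python
import os
import subprocess


def run_and_report(cmd, path):
    """Run cmd on path; return its output, or an explanatory message on failure."""
    if not os.path.isfile(path):
        # The path is only a name derived from the tree entry; nothing
        # guarantees that the working tree contains a file there.
        return "not in working tree: {}".format(path)
    try:
        out = subprocess.check_output(cmd + [path], stderr=subprocess.STDOUT)
        return out.decode("utf-8", errors="replace")
    except subprocess.CalledProcessError as e:
        # e.output holds whatever the program printed before exiting non-zero.
        return "exit status {}: {}".format(
            e.returncode, e.output.decode("utf-8", errors="replace"))
```

Here cmd would be the java command list from the question and path would be entry.abspath.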

However, there is a solution to this. Remember that in principle, anything you could do with some git command, you can do with GitPython.

To get a file's contents at a specific revision, you can use git show:

git show <treeish>:<file>

therefore, in GitPython:

file_contents = repo.git.show('{}:{}'.format(commit.hexsha, entry.path))
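
GitPython also exposes blob contents directly, without going through git show: each blob yielded by commit.tree.traverse() has a data_stream whose read() returns the raw bytes stored in that commit. A hedged Python 3 sketch; iter_java_blobs is a hypothetical helper name, and the suffix check is an assumption about which files are wanted:

```python
def iter_java_blobs(commits, suffix=".java"):
    """Yield (commit hexsha, path, content bytes) for each matching blob."""
    for commit in commits:
        for entry in commit.tree.traverse():
            # traverse() yields trees (directories) as well as blobs (files).
            if entry.type == "blob" and entry.path.endswith(suffix):
                yield commit.hexsha, entry.path, entry.data_stream.read()
```

With GitPython installed, this could be called as iter_java_blobs(repo.iter_commits('master', reverse=True)) to walk from the oldest commit to the newest, assuming the branch is named master as in the question.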

However, that still wouldn't make the file appear on disk. If you need some real path for the file, you can use tempfile:

import os
import tempfile

f = tempfile.NamedTemporaryFile(delete=False)
f.write(file_contents)
f.close()

# at this point file with name f.name contains contents of
#   the file from path entry.path at revision commit.hexsha
# your program launch goes here, use f.name as filename to be read

os.unlink(f.name) # delete the temp file
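
The same steps can be wrapped so that the temporary file is removed even when the external program fails. A minimal Python 3 sketch: run_on_contents is a hypothetical helper, cmd stands for the java command list from the question, and the .java suffix is an assumption in case the program inspects the file extension:

```python
import os
import subprocess
import tempfile


def run_on_contents(cmd, contents, suffix=".java"):
    """Write contents (bytes) to a temp file, run cmd on it, return the output."""
    f = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        f.write(contents)
        f.close()  # close before launching so the other process can open it
        return subprocess.check_output(cmd + [f.name], stderr=subprocess.STDOUT)
    finally:
        f.close()  # harmless if already closed
        os.unlink(f.name)  # delete the temp file
```

In Python 3 the temporary file is opened in binary mode by default, so text returned by repo.git.show would need to be encoded first, for example run_on_contents(cmd, file_contents.encode('utf-8')).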

mbdevpl answered Sep 27 '22 19:09