Implementing a new "git ..." subcommand as a Python script

Tags: python

I feel like this question could be answered with one or two hyperlinks; I'm just not coming up with the right search terms to find those links myself...

I'm trying to make some minor changes to the git p4 subcommand, which is implemented as a 3797-line Python script named /usr/libexec/git-core/git-p4. Before wading much further into the code, I'd like to get a sense of how a Python git command is structured.

What information does the subcommand get from its caller? What environment variables can it rely on existing? How does it detect the configuration (from .git/config and/or ~/.gitconfig)? How are git command-line options passed down from git (in case there are command-line options common to many different subcommands and you want to make sure their handling is centralized)? What is the current working directory at the time the subcommand is launched? What happens if I change the cwd? How do I communicate back to git that it should change the repo (make commits, add or delete tags, rewrite history) as part of the functionality of my new subcommand? Can my command "work on" the index directly without making a working tree, and if so, how? How do I handle and report errors? How do I print usage information in a consistent way?

I assume there must be blog posts and things about this topic, but searching "write a git subcommand in python", "create new git subcommand", etc., just turns up ways to run existing git commands from Python and ways to create new git repos, respectively.

asked Mar 02 '18 18:03

Quuxplusone

1 Answer

Git's documentation of course covers this. The short version:

If git foo is not a built-in command, Git looks for an executable named git-foo, first in its own exec path (the directory printed by git --exec-path) and then in PATH, and runs it with the remaining arguments. But it does nothing else at all: it doesn't even verify that you're in a repository directory. (After all, lots of commands don't even need one, like git hash-object or git ls-remote.)
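To make that dispatch rule concrete, here is a minimal sketch of what such an external command could look like. The command name git-foo and the helper names git and main are hypothetical. Since Git checks nothing before launching the script, the sketch asks git rev-parse --show-toplevel whether it is inside a work tree and passes Git's own error message and exit status through if it is not.

```python
#!/usr/bin/env python3
# Sketch of a hypothetical external command. Saved as an executable file
# named "git-foo" somewhere Git can find it, it would run as "git foo ...".
import subprocess
import sys


def git(*args):
    """Run "git <args>" and return its stdout without the trailing newline."""
    result = subprocess.run(
        ["git", *args],
        check=True,              # raise CalledProcessError on a non-zero exit
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return result.stdout.rstrip("\n")


def main(argv):
    # Git has not verified anything for us, so ask it where the work tree is.
    try:
        toplevel = git("rev-parse", "--show-toplevel")
    except subprocess.CalledProcessError as err:
        # Pass Git's own message and exit status through to the user.
        sys.stderr.write(err.stderr or "")
        return err.returncode
    print(f"work tree root: {toplevel}")
    print(f"arguments after 'git foo': {argv}")
    return 0


# In the real script, end with:
#     sys.exit(main(sys.argv[1:]))
```

With the file marked executable and placed in a directory on PATH, running git foo a b would call main(["a", "b"]).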

While internal and external APIs exist, most new commands are simply written against the existing suite of git commands. They are how everything happens, and they determine whether things like a working copy are needed or if the index is sufficient.
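For example, configuration and ref lookups can be delegated to existing commands rather than parsing .git/config or ~/.gitconfig by hand. The sketch below is one way to do that; the function names git_config, parse_refs, and local_branches are hypothetical, and it relies on git config --get exiting with status 1 when a key is unset.

```python
import subprocess


def git_config(key, default=None):
    """Return the effective value of a config key, or default if it is unset."""
    result = subprocess.run(
        ["git", "config", "--get", key],
        stdout=subprocess.PIPE,
        text=True,
    )
    if result.returncode == 0:
        return result.stdout.rstrip("\n")
    if result.returncode == 1:   # "git config --get" exits 1 for an unset key
        return default
    raise RuntimeError(f"git config failed with status {result.returncode}")


def parse_refs(output):
    """Turn "<objectname> <refname>" lines into a {refname: objectname} dict."""
    refs = {}
    for line in output.splitlines():
        objectname, refname = line.split(" ", 1)
        refs[refname] = objectname
    return refs


def local_branches():
    """Map each ref under refs/heads to the object it points at."""
    result = subprocess.run(
        ["git", "for-each-ref", "--format=%(objectname) %(refname)", "refs/heads"],
        check=True,
        stdout=subprocess.PIPE,
        text=True,
    )
    return parse_refs(result.stdout)
```

Because git config does the lookup, the usual precedence between system, user, and repository configuration is applied without any extra code in the script.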

It’s in cases like this that the “plumbing” commands often become important: in my experience writing a few of these (and using filter-branch), cat-file, commit-tree, for-each-ref, merge-base, rev-list, and rev-parse have been particularly helpful. Obviously it can depend on what you want to do: others might find the --cached option to various (otherwise) porcelain commands more relevant.
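As an illustration of the plumbing approach, the sketch below records the current index as a new commit without reading the working tree. It is an example under stated assumptions, not how git p4 does it: the names git and commit_index are hypothetical, and the ref is assumed to already point at a commit.

```python
import subprocess


def git(*args):
    """Run "git <args>" and return its stdout without the trailing newline."""
    result = subprocess.run(
        ["git", *args],
        check=True,
        stdout=subprocess.PIPE,
        text=True,
    )
    return result.stdout.rstrip("\n")


def commit_index(message, ref="HEAD"):
    """Commit whatever is in the index onto ref, ignoring the working tree."""
    tree = git("write-tree")                      # index -> tree object
    parent = git("rev-parse", "--verify", ref)    # current tip (assumed to exist)
    commit = git("commit-tree", tree, "-p", parent, "-m", message)
    # Passing the old value makes update-ref refuse if the ref moved meanwhile.
    git("update-ref", ref, commit, parent)
    return commit
```

The same pattern (read with rev-parse or cat-file, write with commit-tree and update-ref) extends to tags and other refs.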

answered Sep 29 '22 11:09

Davis Herring
