I'm using the GitHub API to grab the contents of files, but I'd also like to see when the files were created. Is there a way to get that information with the GitHub API?
The commits endpoint can be filtered via the path parameter, so that it only returns commits that touch the given path. Unless you want to get tricky and make multiple requests to follow file moves/renames, I'd just use the commit date of the oldest commit returned.
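A rough sketch of that approach, assuming the JSON shape the "list commits" endpoint returns (each entry carrying commit.committer.date). The helper names commits_url and oldest_commit are made up for illustration, and the actual HTTP request and pagination are left to you:

```python
from datetime import datetime
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"


def commits_url(owner, repo, path, page=1, per_page=100):
    """Build the 'list commits' URL, filtered to commits touching `path`."""
    query = urlencode({"path": path, "per_page": per_page, "page": page})
    return f"{API_ROOT}/repos/{owner}/{repo}/commits?{query}"


def _committed_at(entry):
    # Timestamps look like "2011-04-14T16:00:49Z"; rewrite the trailing "Z"
    # so datetime.fromisoformat accepts it on older Python versions too.
    stamp = entry["commit"]["committer"]["date"]
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def oldest_commit(entries):
    """Return the entry with the earliest committer date, or None if empty."""
    if not entries:
        return None
    return min(entries, key=_committed_at)
```

If the file has a long history, the oldest commit will be on the last page of results, so you would need to collect every page before calling oldest_commit.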
Git does not store file creation dates (in a significant sense, Git's files don't have creation dates). You may be able to get something of use to you anyway, but GitHub's interface is leading you in the wrong direction.
If you examine the "create a file" operation closely enough, and are familiar with Git, you will realize that it does not really create a file at all: instead, it makes a new commit. This is why it needs a commit message, and allows an author and committer.
What this means is that you must find a commit that contains the file, and then retrieve the information about that commit. The problem here is twofold:
To find a commit, you need a commit hash (SHA-1 ID). The main place to get a commit hash is from a commit. So you'll need a commit hash in order to find a commit hash. This is, of course, a problem.
To break the deadlock here, you must start by looking up a reference. A reference is simply a name that is paired with a Git object hash—often, but not always, a commit object hash. You then use that reference to locate a commit: if the reference points directly to a commit, you're there; if it points to a tag (an annotated tag object), you read the tag object, which contains another hash ID. Keep reading these objects until you arrive at something that is not a tag. If that's a commit, you've succeeded. If it's a tree or a blob, the tag does not lead to a commit in the first place and probably is not interesting to you.
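The "keep reading until it is not a tag" loop might look something like this. It is only a sketch: it assumes each object is described by a {"type": ..., "sha": ...} dict, which is the shape of the "object" field in the Git references and tags responses, and read_tag is a hypothetical callback that you would implement to fetch a tag object by its hash.

```python
def peel_to_commit(obj, read_tag, max_depth=10):
    """Follow annotated tags from `obj` until a non-tag object is reached.

    `obj` is a {"type": ..., "sha": ...} dict taken from a reference.
    `read_tag(sha)` must return the parsed tag object, whose own "object"
    field has the same shape.
    Returns the commit hash, or None if the chain ends at a tree or blob.
    """
    for _ in range(max_depth):
        if obj["type"] != "tag":
            return obj["sha"] if obj["type"] == "commit" else None
        # An annotated tag: read it and move on to whatever it points at.
        obj = read_tag(obj["sha"])["object"]
    raise ValueError("too many nested tags")
```

The max_depth guard is an arbitrary safety limit, not something Git itself imposes.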
The reference you want to start with is the same reference you have been using to retrieve the file.
Now that you do have a commit ID and can retrieve the commit, it is time to see if the file in question even exists in that commit. Presumably it does exist in that particular commit, since you have been retrieving the file. But the file may not have been created in that commit: it may simply have been carried over from a previous commit.
If, by "created", you mean "when was the most recent commit created", you can stop here: get the commit and use the author date and/or committer date. If, however, you mean "find me a commit where in some previous commit, the file does not exist", you must do much more work. You must now retrieve the tree object associated with each commit, in order to see whether some file $path, such as foo/bar.txt, exists in that particular commit (the other API that you have been using only retrieves files from the tip-most commit on a given reference).
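Checking one commit's tree for a path could be sketched like this, assuming you have already fetched the tree recursively and parsed the JSON, so that it has a "tree" list of entries with "path" and "type" fields plus a "truncated" flag. The function name path_in_tree is made up for illustration.

```python
def path_in_tree(tree_response, path):
    """Report whether `path` names a file (blob) in a recursive tree listing.

    Raises ValueError when the listing was truncated, because a missing
    entry would then prove nothing about the commit.
    """
    if tree_response.get("truncated"):
        raise ValueError("tree listing was truncated; cannot decide")
    return any(
        entry["path"] == path and entry["type"] == "blob"
        for entry in tree_response["tree"]
    )
```

You would call this once per commit you examine, which is where the number of requests starts to add up.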
Note that each commit can have more than one parent. This occurs for merge commits. Most such commits have exactly two parents, but any number above 1 is possible. When looking at a merge commit that contains some file path $path, each of its (multiple) parents may also have file $path or lack file $path. This is where "defining what you mean by created" becomes particularly difficult.
In a commit with no parents, foo/bar.txt is "created" here if it exists.
In a commit with one parent, foo/bar.txt is "created" if foo/bar.txt exists in this commit, but not in this commit's parent.
In a merge commit, foo/bar.txt is "created" if foo/bar.txt exists in this commit, but not in any of its parents.
If foo/bar.txt exists in this commit and some but not all parents, is the file "created"?
Once you solve this problem, you are ready to go, with a few minor caveats. First, the "get a tree" API imposes limits, so for bigger trees, you must clone the repository. Second, the work involved in doing this is, in general, the same as cloning the repository: you are basically re-implementing Git. You might as well just use Git.
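To make the cases for "created" concrete, here is one way they could be written down, given a yes/no answer for the commit itself and one for each parent. The function name and the labels it returns are invented for this sketch, and it deliberately leaves the "some but not all parents" case undecided, since that is the part you have to define for yourself.

```python
def creation_status(in_commit, in_parents):
    """Classify a commit with respect to one file path.

    `in_commit` says whether the path exists in this commit; `in_parents`
    holds one boolean per parent (empty for a commit with no parents).
    """
    if not in_commit:
        return "absent"
    if not any(in_parents):
        # No parents at all, or the path is missing from every parent.
        return "created"
    if all(in_parents):
        return "carried over"
    # The path exists in some parents but not all: your call.
    return "ambiguous"
```

For example, creation_status(True, [True, False]) gives "ambiguous", which is exactly the merge case that has no obvious right answer.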