On the command line, if I run
git tag --contains {commit}
to obtain the list of releases that contain a given commit, it takes around 11 to 20 seconds per commit. Since the target code base has more than 300,000 commits, retrieving this information for every commit would take far too long.
However, gitk apparently manages to retrieve this data quickly. From what I have read, it uses a cache for that purpose.
I have two questions:
git command line tool to generate that same information?

You can get this almost directly from git rev-list.
latest.awk:
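BEGIN { thiscommit = "" }

# "commit <hash> <child>..." header line (children come from --children)
$1 == "commit" {
    # flush the previous commit before starting a new one
    if (thiscommit != "")
        print thiscommit, tags[thiscommit]
    thiscommit = $2
    line[$2] = NR
    # inherit the refs of the child that was listed most recently
    latest = 0
    for (i = 3; i <= NF; ++i)
        if (line[$i] > latest) {
            latest = line[$i]
            tags[$2] = tags[$i]
        }
    next
}

# decoration line from --format=%d: the commit's own refs replace inherited ones
$1 != "commit" { tags[thiscommit] = $0 }

END { if (thiscommit != "") print thiscommit, tags[thiscommit] }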
A sample command:
git rev-list --date-order --children --format=%d --all | awk -f latest.awk
You can also use --topo-order, and you'll probably have to weed out unwanted refs in the $1 != "commit" logic.
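One way to do that weeding, sketched here as a pre-filter placed in front of the script rather than as a change to latest.awk itself. The function name tags_only and the sample input are hypothetical; the sample only imitates the shape of the rev-list output above (a commit line listing the commit and its children, optionally followed by a decoration line). The filter keeps only the "tag: " entries of each decoration line and drops lines that have no tags left.

```shell
# Hypothetical sample shaped like `git rev-list --children --format=%d` output.
sample='commit c3
 (HEAD -> master, tag: v2.0)
commit c2 c3
commit c1 c2
 (tag: v1.0, origin/old)'

# Keep only tag names in decoration lines; pass commit lines through unchanged.
tags_only() {
    awk '
    $1 == "commit" { print; next }
    {
        refs_line = $0
        sub(/^ *\(/, "", refs_line)   # strip the leading " ("
        sub(/\) *$/, "", refs_line)   # strip the trailing ")"
        n = split(refs_line, refs, ", ")
        kept = ""
        for (i = 1; i <= n; ++i)
            if (refs[i] ~ /^tag: /) {
                name = substr(refs[i], 6)   # drop the "tag: " prefix
                kept = (kept == "") ? name : kept ", " name
            }
        if (kept != "")
            print " (" kept ")"
    }'
}

printf '%s\n' "$sample" | tags_only
```

Because the filtered lines keep the " (...)" shape, the filter could presumably sit in the middle of the pipeline, as in git rev-list --date-order --children --format=%d --all | tags_only | awk -f latest.awk.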
Depending on what kind of transitivity you want and how explicit the listing has to be, accumulating the tags might need a dictionary. Here's one that gets an explicit listing of all refs for all commits:
all.awk:
BEGIN {
    thiscommit = ""
}

# "commit <hash> <child>..." header line
$1 == "commit" {
    if (thiscommit != "")
        print thiscommit, tags[thiscommit]
    thiscommit = $2
    line[$2] = NR
    split("", seen)    # clear the per-commit duplicate check
    # merge the refs of every child, skipping duplicates
    for (i = 3; i <= NF; ++i) {
        nnew = split(tags[$i], new)
        for (n = 1; n <= nnew; ++n) {
            if (!seen[new[n]]) {
                tags[$2] = tags[$2] " " new[n]
                seen[new[n]] = 1
            }
        }
    }
    next
}

# decoration line such as " (tag: v1.0, origin/master)"
$1 != "commit" {
    nnew = split($0, new, ", ")
    new[1] = substr(new[1], 3)                               # drop leading " ("
    new[nnew] = substr(new[nnew], 1, length(new[nnew]) - 1)  # drop trailing ")"
    for (n = 1; n <= nnew; ++n)
        tags[thiscommit] = tags[thiscommit] " " new[n]
}

END { if (thiscommit != "") print thiscommit, tags[thiscommit] }
all.awk took a few minutes to process the 322K commits of the Linux kernel repo, roughly a thousand commits a second (there are lots of duplicate strings and redundant processing), so you'd probably want to rewrite it in C++ if you're really after the complete cross-product. But I don't think gitk shows that, only the nearest neighbors, right?
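If the goal is repeated lookups, one option is to store the script output once and query that file afterwards. This is a minimal sketch, assuming a hypothetical cache file path and a hypothetical helper named lookup_tags; neither comes from git or gitk, and this is not a description of how gitk's own cache works.

```shell
# Hypothetical cache location; fill it once with something like:
#   git rev-list --date-order --children --format=%d --all | awk -f latest.awk > "$cache"
cache="${TMPDIR:-/tmp}/commit-tags.cache"

# Print the cached refs for one full commit hash (prints nothing if it is unknown).
lookup_tags() {
    awk -v want="$1" '
    $1 == want {
        sub(/^[^ ]+ */, "")   # drop the hash column, keep the refs as written
        print
        exit
    }' "$cache"
}
```

Each lookup is then a single scan of a text file instead of a walk over the history, at the cost of having to regenerate the file whenever refs or commits change.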