
What is the most efficient method to get the last modification time of every file in a git revision?

Tags: git, python, time

I want to programmatically list the name and last modification time of every file in a certain revision. Running git log for every file, as suggested here, is very slow. Is there a faster way to accomplish this?

Running the script below on a non-trivial repo (SDL) takes 59s on my machine.

#!/usr/bin/env python

import datetime
import subprocess
import time

commit = "HEAD"

start = time.time()

file_names = subprocess.check_output(["git", "ls-tree", "--name-only", "-r", commit], text=True).strip().split("\n")

print(f"[{time.time() - start:.4f}] git ls-tree finished")

# One `git log -1` call per file; this loop is the slow part.
file_times = [
    datetime.datetime.fromisoformat(
        subprocess.check_output(
            ["git", "log", "-1", "--pretty=format:%cI", commit, "--", name],
            text=True,
        ).strip()
    )
    for name in file_names
]

print(f"[{time.time() - start:.4f}] git info finished")
maarten asked Dec 31 '25 21:12


1 Answer

The basic idea is to postprocess git log --name-status with whatever per-commit info you want and look for the first occurrence of names you're interested in. The all-of-them version:

 git log --name-status --pretty=%ci | awk -F$'\t' '
         NF==1 { stamp=$0; next }
         !seen[$2]++ { print stamp,$0 }
' | sort -t$'\t' -k2,2

As always, season to taste. Are you running on spinning rust? When I run that on the SDL default checkout with a cheap SSD, it takes 0.548s, so more than a hundred times faster. But then, it's doing 1500+ times fewer walks through history, so there's that.
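
Since the question drives git from Python, the same single-pass idea could be sketched there as well. This is only a sketch under assumptions: the names parse_name_status and file_times and the \x01 marker are made up for illustration, and it takes the last tab-separated field as the path so that a rename is recorded under its new name. Results may differ slightly from per-file git log -1, for example for a file that was only changed in a merge commit, since git log shows no name-status lines for merges by default.

```python
#!/usr/bin/env python
import datetime
import subprocess

# Marker that prefixes each commit's timestamp line (arbitrary choice for this
# sketch); git emits it for the %x01 placeholder in the pretty format.
STAMP_PREFIX = "\x01"


def parse_name_status(log_text):
    """Map each path to the timestamp of the newest commit that lists it.

    Expects `git log --name-status` output, newest commit first, where every
    commit header line is STAMP_PREFIX followed by an ISO 8601 date.
    """
    times = {}
    stamp = None
    for line in log_text.split("\n"):
        if line.startswith(STAMP_PREFIX):
            stamp = datetime.datetime.fromisoformat(line[len(STAMP_PREFIX):])
            continue
        fields = line.split("\t")
        if stamp is None or len(fields) < 2:
            continue  # blank separator lines carry no path
        # Renames and copies (R100, C75, ...) list old and new path;
        # the last field is the path that exists after the commit.
        times.setdefault(fields[-1], stamp)  # first occurrence wins
    return times


def file_times(commit="HEAD"):
    """Return {path: commit time} for the files present in `commit`."""
    log_text = subprocess.check_output(
        ["git", "log", "--name-status", "--pretty=format:%x01%cI", commit],
        text=True,
    )
    names = subprocess.check_output(
        ["git", "ls-tree", "--name-only", "-r", commit], text=True
    ).splitlines()
    seen = parse_name_status(log_text)
    # Keep only paths that still exist in the revision; deleted ones drop out.
    return {name: seen[name] for name in names if name in seen}
```

Calling file_times("HEAD") inside a repository would return a dict from path to datetime. Paths that appear in history but no longer exist in that revision are filtered out by the git ls-tree step.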

jthill answered Jan 02 '26 11:01


