
How do I modify gitstats to only utilize a specified file extension for its statistics?

Tags:

python

The website of the statistics generator in question is:

http://gitstats.sourceforge.net/

Its git repository can be cloned from:

git clone git://repo.or.cz/gitstats.git

What I want to do is something like:

./gitstats --ext=".py" /input/foo /output/bar

If that option can't be added without heavy modification, I'd settle for hard-coding the file extension I want included.

However, I don't know which section of the code to modify, and even if I did, I'm not sure how to start making the changes.

It seems like it should be rather simple, but alas...

Fake Code Monkey Rashid asked Jan 13 '11 23:01


2 Answers

I found this question today while looking for the same thing. After reading sinelaw's answer I looked into the code and ended up forking the project.

https://github.com/ShawnMilo/GitStats

I added an "exclude_extensions" config option. It doesn't affect all parts of the output, but it's getting there.

I may end up doing a pretty extensive rewrite once I fully understand everything it's doing with the git output. The original project was started almost exactly four years ago today and there's a lot of clean-up that can be done due to many updates to the standard library and the Python language.

ShawnMilo answered Sep 20 '22 12:09


EDIT: Apparently the previous solution (below) only affects the "Files" stats page, which is not very useful. I'm still trying to find something better. The line that needs to change is line 254:

    lines = getpipeoutput(['git rev-list --pretty=format:"%%at %%ai %%aN <%%aE>" %s' % getcommitrange('HEAD'), 'grep -v ^commit']).split('\n')

Previous attempt was:

Unfortunately, it seems that git does not provide options for easily filtering by the files in a commit (in git log and git rev-list). This solution doesn't filter all of the statistics by file type (the statistics on tags, for example, are unaffected), but it does so for the part that calculates activity by number of lines changed.

So the best I could come up with is at line 499 of gitstats (the main script):

res = int(getpipeoutput(['git ls-tree -r --name-only "%s"' % rev, 'wc -l']).split('\n')[0])

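You can change that either by adding a grep stage to the pipeline, like this (the pattern is quoted so that the shell passes the backslash through to grep, assuming getpipeoutput runs each stage through the shell):

res = int(getpipeoutput(['git ls-tree -r --name-only "%s"' % rev, 'grep "\\.py$"', 'wc -l']).split('\n')[0])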

Or you could drop the 'wc -l' stage, read the output of git ls-tree into a list of strings, and filter the file names with the fnmatch module (and then count the lines in each file, possibly with 'wc -l'). That sounds like overkill for the specific problem you're trying to solve, though.
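For what it's worth, the fnmatch variant could look roughly like the sketch below. It only counts the matching file names, which is what the 'wc -l' stage does in the grep version. The function name count_matching_files and the sample listing are made up for illustration and are not part of gitstats.

```python
import fnmatch


def count_matching_files(ls_tree_output, pattern="*.py"):
    """Count the paths in `git ls-tree -r --name-only` output that match pattern.

    Hypothetical helper, not part of gitstats. ls_tree_output is the raw
    text of the command, one path per line.
    """
    # Drop empty strings left behind by the trailing newline.
    names = [line for line in ls_tree_output.split("\n") if line]
    # fnmatch's "*" also matches "/", so files in subdirectories are included.
    return len(fnmatch.filter(names, pattern))


# Made-up listing standing in for real git output.
sample = "setup.py\ngitstats\ndoc/README\nlib/util.py\n"
print(count_matching_files(sample))
```

In gitstats itself, the string passed in would presumably be the result of getpipeoutput(['git ls-tree -r --name-only "%s"' % rev]), with the 'wc -l' stage removed.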

This still doesn't solve the problem (the rest of the stats will ignore this filter), but hopefully it helps.

sinelaw answered Sep 22 '22 12:09