
How to detect code change frequency?

I am working on a program written by several folks with widely varying skill levels. There are files in there that have never changed (and probably never will, as we're afraid to touch them) and others that are changing constantly.

I wonder, are there any tools out there that would look at the entire repo history (git) and produce an analysis of how frequently a given file changes? Or package? Or project?

It would be of value to recognize that (for example) we spent 25% of our time working on a set of packages, which would be indicative of the code's fragility, as compared with code that "just works".

James Raitsev asked Feb 22 '12 00:02


4 Answers

If you're looking for an open-source solution, I'd probably start with gitstats and look at extending it by grabbing file logs and aggregating that data.
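
As a rough sketch of what that aggregation step could look like (this is not part of gitstats), the file names printed by `git log --name-only` can be grouped by directory to approximate per-package churn. The function name `churn_by_dir` and the choice of the top-level directory as the "package" are assumptions made for illustration.

```shell
# Hypothetical helper: reads one changed path per line on stdin
# (as printed by `git log --name-only`) and prints "count percent% dir"
# for each top-level directory, busiest first.
churn_by_dir() {
  awk -F/ '
    NF == 0 { next }                  # skip the blank lines between commits
    {
      dir = (NF > 1) ? $1 : "."       # files in the repo root are grouped under "."
      n[dir]++
      total++
    }
    END {
      for (d in n) printf "%d %.1f%% %s\n", n[d], 100 * n[d] / total, d
    }
  ' | sort -rn
}

# Usage, inside a repository:
#   git log --pretty=format: --name-only | churn_by_dir
```

The percentage is each directory's share of all recorded file changes, which is only a proxy for time spent; a commit that touches many files in one directory counts once per file.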

Dave Newton answered Nov 15 '22 07:11


I'd have a look at NChurn:

NChurn is a utility that helps assess the churn level of the files in your repository. Churn can help you detect which files are changed the most in their lifetime. This helps identify potential bug hives and improper design. The best thing to do is to plug NChurn into your build process and store the history of each run. Then, you can plot the evolution of your repository's churn.

Henrik answered Nov 15 '22 08:11


I wrote something that we use to visualize this information successfully.

https://github.com/bcarlso/defect-density-heatmap

Take a look at the project and you can see what the output looks like in the readme.

You can do what you need by first getting a list of files that have changed in each commit from Git.

~ $ git log --pretty="format:" --name-only | grep -v '^$' > file-changes.txt

~ $ for i in $(cut -d"." -f1,2 file-changes.txt | sort -u); do num=$(grep -cF -- "$i" file-changes.txt); if (( num > 1 )); then echo "$num,0,$i"; fi; done | heatmap > results.html

This will give you a tag cloud in which files that churn more show up larger.
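
If the heatmap tool is not installed, the same kind of per-file counts can be sanity-checked as a plain ranked list. This is only a minimal sketch and not part of defect-density-heatmap; the function name `rank_churn` and the default of ten entries are assumptions. Unlike the loop above, it counts exact paths rather than substring matches.

```shell
# Hypothetical helper: reads one changed path per line on stdin and
# prints the N most frequently changed paths as "count path", busiest first.
rank_churn() {
  limit=${1:-10}
  grep -v '^$' | sort | uniq -c | sort -rn | head -n "$limit" |
    sed 's/^ *//'          # drop the left padding that uniq -c adds
}

# Usage, inside a repository:
#   git log --pretty=format: --name-only | rank_churn 20
#   rank_churn 20 < file-changes.txt
```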

bcarlso answered Nov 15 '22 09:11


I suggest using a command like

git log --follow -p file

That will give you all the changes that happened to the file in its history (including renames). If you want the number of commits that changed the file, you can do this on a UNIX-based OS:

git log --follow --format=oneline Gemfile | wc -l

You can then create a bash script to apply this to multiple files, printing each file's name beside its count.
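
One possible shape for such a script, as a sketch only: it assumes you want every tracked file (listed with `git ls-files`), and the function name `commit_counts` is made up for this example.

```shell
# Hypothetical helper: prints "count path" for every tracked file, where
# count is the number of commits that touched the file, following renames.
commit_counts() {
  git ls-files | while IFS= read -r f; do
    # One `git log` per file; tr strips the padding some wc versions add.
    n=$(git log --follow --format=oneline -- "$f" | wc -l | tr -d ' ')
    printf '%s %s\n' "$n" "$f"
  done | sort -rn
}

# Usage, from the root of a repository:
#   commit_counts | head -n 20
```

Because it runs `git log` once per file, this can be slow on large repositories.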

Hope it helps!

Cydonia7 answered Nov 15 '22 09:11