
How much of my code is still around?

Tags: git, github

Strange question, but a pretty reasonable one I think. Basically there's a project that I started several years ago with a couple hundred lines of code. Amazingly, since then it's grown to be a huge, robust project that I'm very proud of.

Now, I have a question that very often pops into my head:

How much of my code is still around?

Almost certainly the vast majority of my code has been rewritten at this point, but it feels like it should be very possible to have git give me a picture of what's still around.

I've looked into this on a basic level, but can't really find anything along these lines, though some of GitHub's charts are helpful.

Any ideas?

Slater Victoroff asked Aug 23 '16 22:08



1 Answer

git blame is the way to go. Here is how you can count the lines in the current revision that were last changed by each author:

# Blame every tracked file at HEAD, then count lines per last-modifying author
git ls-tree -r HEAD --name-only \
    | xargs -I{} git blame --line-porcelain {} \
    | sed -n 's/^author //p' \
    | sort \
    | uniq -c \
    | sort -rn

Which will give output like:

15492 Alice
 3406 Bob
  100 Carol
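
To turn those counts into a "how much of my code is still around" figure, the per-author totals can be reduced to a share of all surviving lines. The following is a minimal sketch, not a definitive tool: author_share is a hypothetical helper name, and it assumes its input is in the "count name" format that uniq -c prints above.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): reads "count author" lines on stdin,
# in the format produced by `uniq -c`, and prints the share of lines
# attributed to the author given as the first argument.
author_share() {
    awk -v who="$1" '
        {
            n = $1                              # leading count from uniq -c
            sub(/^[ \t]*[0-9]+[ \t]+/, "")      # strip the count, keep the full name
            total += n
            if ($0 == who) mine += n
        }
        END {
            if (total > 0)
                printf "%d of %d lines (%.1f%%)\n", mine, total, 100 * mine / total
        }
    '
}
```

Appending author_share "Alice" as one more stage of the pipeline above would, for the sample counts shown, report 15492 of 18998 lines (81.5%). The name must match the author string exactly as git blame reports it.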
vsminkov answered Oct 13 '22 12:10