How many people were involved in a project, based on its revision control system?
Question
How can you find out how many developers were involved in a project that uses a revision control system? A friend of mine found this way to get the answer from git log:
git log | grep Author: | sort -u | cut --delimiter=" " -f2 | sort -u | wc -l
Is there a straightforward way in git? How about other revision control systems such as Subversion, Bazaar or Mercurial?
Solution
git
The shortlog command is very useful: it summarizes the usual git log output. With -s it prints only a commit count per author, and -n sorts by that count:
$ git shortlog -sn
119 tsaleh
113 Joe Ferris
70 Ryan McGeary
45 Tammer Saleh
45 Dan Croak
19 Matt Jankowski
...
Pipe the output to wc -l to see the number of unique author names:
$ git shortlog -sn | wc -l
40
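One caveat when scripting this: if no revision is given and standard input is not a terminal, git shortlog summarizes a log read from standard input instead of walking the repository history. Naming a revision avoids that. A minimal sketch, assuming a POSIX shell; count_authors is a hypothetical helper name, not a git command:

```shell
# count_authors [REV]: print the number of distinct author names
# reachable from REV (defaults to HEAD). Hypothetical helper.
count_authors() {
    # An explicit revision makes shortlog walk the history even when
    # stdin is not a terminal (cron jobs, CI, pipelines).
    # tr strips the padding that some wc implementations add.
    git shortlog -sn "${1:-HEAD}" | wc -l | tr -d ' '
}
```

Run it inside a repository as count_authors, or pass a range such as count_authors v1.0..HEAD (the tag name here is only an illustration). Note that this counts distinct names, so one person committing under two names is counted twice.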
OTHER TIPS
For Mercurial, there's an extension to do exactly that: hg churn. By default hg churn sorts by lines changed; if you want changeset counts, use hg churn -c.
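For Subversion (the grep drops the dashed separator lines that svn log prints between entries, which cut would otherwise pass through unchanged):
svn log -q svn://path/to/repo | grep '^r' | cut -f 3 -d " " | sort -u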
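To check the parsing without a repository at hand, the same extraction can be tried on canned svn log -q output. This is only a sketch, assuming the usual "rN | author | date" line layout; svn_authors is a hypothetical helper name. It splits on " | " with awk so that author names containing spaces survive intact:

```shell
# svn_authors: read `svn log -q` output on stdin and print the
# distinct author names, one per line. Hypothetical helper.
svn_authors() {
    # Keep only revision lines ("r123 | author | date") and print the
    # second " | "-separated field; separator lines never match.
    awk -F' [|] ' '/^r[0-9]+ [|] /{print $2}' | sort -u
}

# Usage:
#   svn log -q svn://path/to/repo | svn_authors | wc -l
```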
There is a stats plugin for Bazaar that reports various information about project contributors.
I'm not aware of a straightforward way for Mercurial either, and a good search of all its documentation didn't reveal anything. So here's a *nix command for Mercurial, similar to the one your friend found:
hg log | grep '^user:' | cut -c 14- | sort -u | wc -l
BTW, I think the git command in the question does redundant work: the first sort -u can be dropped, because the second one, after cut, already removes the duplicates.
A simpler git version is:
git log --pretty=tformat:%an | sort -u | wc -l
or if you care about unique email addresses:
git log --pretty=tformat:%ae | sort -u | wc -l
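Addresses that differ only in capitalization are counted separately by sort -u. A minimal sketch that folds case first, assuming that treating addresses case-insensitively is acceptable for your project; count_unique_ci is a hypothetical helper name:

```shell
# count_unique_ci: count distinct lines on stdin, ignoring letter case.
# Hypothetical helper, not part of git.
count_unique_ci() {
    # Lowercase everything, drop duplicates, count what is left;
    # the final tr strips the padding some wc implementations add.
    tr '[:upper:]' '[:lower:]' | sort -u | wc -l | tr -d ' '
}

# Usage (inside a repository):
#   git log --pretty=tformat:%ae | count_unique_ci
```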
Mercurial has a powerful built-in template language (see hg help templates), so you can get a list of all people in the project without enabling the churn extension:
hg log --template '{author}\n' | sort -u
If people have changed their email address (but otherwise kept their name the same), you can apply the person filter to the author template keyword, which keeps only the name part:
hg log --template '{author|person}\n' | sort -u
Then append | wc -l to the commands above to get a count.