[Unix] Using the 'uniq -c' command to get some statistics

I'm an obsessive Unix geek, and I can't really understand how anyone could work without all these command-line tools.

Here is a small 'uniq' recipe.

I've downloaded the CADO-NFS git repo.

Getting a list of commits:

 % git log

commit 89f7f3b547a82318b76108f25382dadadd1ea6a3 (HEAD -> master, origin/master, origin/HEAD)
Merge: 988b3b5bb 969d40f12
Author: Emmanuel Thomé <emmanuel.thome@inria.fr>
Date:   Wed Feb 23 19:52:43 2022 +0000

    Merge branch 'review-testing-pool-and-doc' into 'master'

    Review testing pool and doc

    See merge request cado-nfs/cado-nfs!63

commit 969d40f129e399a184a368e442a75218edbeabcc
Author: Emmanuel Thomé <Emmanuel.Thome@inria.fr>
Date:   Wed Feb 23 10:53:03 2022 -0800

    add dpkg --add-architecture for debian 32-bit builds

...

List of authors:

 % git log | grep Author

...

Author: Alexander Kruppa <akruppa@gmail.com>
Author: Alexander Kruppa <akruppa@gmail.com>
Author: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Author: Pierrick Gaudry <pierrick.gaudry@loria.fr>
Author: Pierrick Gaudry <pierrick.gaudry@loria.fr>
Author: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Author: Alexander Kruppa <akruppa@gmail.com>
Author: Alexander Kruppa <akruppa@gmail.com>
Author: Alexander Kruppa <akruppa@gmail.com>
Author: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Author: Paul Zimmermann <Paul.Zimmermann@inria.fr>
...
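
A plain 'grep Author' would also match a commit message that happens to contain that word. Since 'git log' indents message lines, anchoring the pattern to the start of the line is a bit safer. A minimal sketch; the function name 'authors' and the sample lines are made up for illustration:

```shell
# Keep only real header lines: they start at column 0 with "Author: ".
# Commit message lines are indented by git log, so they never match.
authors() {
    grep '^Author: '
}

# Two sample lines: a header and an (invented) indented message line.
printf '%s\n' \
    'Author: Alexander Kruppa <akruppa@gmail.com>' \
    '    Fix Author field parsing' | authors
```

Only the first sample line gets through. In a real checkout this would be used as 'git log | authors'.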

Who are the most active authors of this project?

 % git log | grep Author | sort | uniq -c | sort -n

...

    128 Author: Thomas Richard <thomas.richard@inria.fr>
    136 Author: Lionel Muller <lionel.muller@loria.fr>
    138 Author: Jérémie Detrey <Jeremie.Detrey@loria.fr>
    177 Author: Cyril Bouvier <cyril.bouvier@lirmm.fr>
    183 Author: Alain Filbois <alain.filbois@inria.fr>
    229 Author: Laurent Imbert <Laurent.Imbert@lirmm.fr>
    233 Author: Shi Bai <shih.bai@gmail.com>
    459 Author: François Morain <Francois.Morain@lix.polytechnique.fr>
    500 Author: Laurent Grémy <laurent.gremy@inria.fr>
    774 Author: Cyril Bouvier <cyril.bouvier@loria.fr>
   1456 Author: Pierrick Gaudry <pierrick.gaudry@loria.fr>
   2934 Author: Alexander Kruppa <akruppa@gmail.com>
   4197 Author: Paul Zimmermann <Paul.Zimmermann@inria.fr>
   5553 Author: Emmanuel Thomé <Emmanuel.Thome@inria.fr>
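
Why the first 'sort'? 'uniq' only merges duplicate lines that are next to each other, so the input has to be grouped first. Here is a small sketch on made-up data; the names 'sample', 'count_adjacent' and 'count_all' are mine, not part of any tool:

```shell
# Sample input: three names, with duplicates that are not adjacent.
sample() {
    printf '%s\n' alice bob alice carol bob alice
}

# Without a preceding sort, uniq -c only merges neighbouring duplicates.
# No two equal lines are neighbours here, so every line gets count 1.
count_adjacent() {
    uniq -c
}

# sort groups identical lines together, uniq -c counts each group,
# and sort -n orders the result by that count (smallest first).
count_all() {
    sort | uniq -c | sort -n
}

sample | count_adjacent
sample | count_all
```

The first pipeline prints six lines, each with a count of 1. The second prints three lines and ends with alice, who has a count of 3.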

As you can see, there are many French academic domains. Which French institution provided the most authors to the project?

Using 'cut', drop everything before the '@' sign to get a list of domains:

 % git log | grep Auth | sort | uniq -c | sort -n | cut -d '@' -f 2

...

lix.polytechnique.fr>
gmail.com>
inria.fr>
gmail.com>
inria.fr>
univ-lille1.fr>
inria.fr>
lix.polytechnique.fr>
ens-rennes.fr>
inria.fr>
loria.fr>
loria.fr>
lirmm.fr>
inria.fr>
lirmm.fr>
gmail.com>
lix.polytechnique.fr>

...
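
The trailing '>' is left over from the 'Name <user@domain>' format. If it bothers you, 'tr -d' can remove it. A sketch, with a hypothetical function name 'domains'; the '-s' flag is my addition and makes 'cut' skip lines that contain no '@' at all:

```shell
# Read "Author: Name <user@domain>" lines on stdin, print the domain.
# cut keeps the second '@'-separated field (-s drops lines without '@'),
# and tr -d deletes the closing '>'.
domains() {
    cut -s -d '@' -f 2 | tr -d '>'
}

printf '%s\n' 'Author: Paul Zimmermann <Paul.Zimmermann@inria.fr>' | domains
# prints: inria.fr
```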

Collecting statistics on the domains:

 % git log | grep Auth | sort | uniq -c | sort -n | cut -d '@' -f 2 | sort | uniq -c | sort -n

...
      1 cis.upenn.edu>
      1 ens-lyon.fr>
      1 ens-rennes.fr>
      1 fcatrel.loria.fr>
      1 google.com>
      1 quarkslab.com>
      1 seas.upenn.edu>
      1 uevora.pt>
      1 unilim.fr>
      1 univ-lille1.fr>
      1 univ-lille.fr>
      1 wurst.nancy.grid5000.fr>
      2 lirmm.fr>
      3 lix.polytechnique.fr>
      5 gmail.com>
      7 loria.fr>
     13 inria.fr>

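INRIA provided the most: 13 authors. LORIA comes second with 7. Strictly speaking, these are counts of distinct 'Author:' lines, so someone who committed under several names or addresses is counted more than once. Getting this information took me only about a minute.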
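
The same address can show up in different letter case (compare the two 'Author:' lines for Emmanuel Thomé in the 'git log' output at the top). Here is a variant that counts distinct addresses per domain instead, ignoring case. This is only a sketch of a different approach, not the pipeline used above; 'domain_stats' is a made-up name, and it expects one bare e-mail address per line:

```shell
# Read one e-mail address per line on stdin and print, for each domain,
# how many distinct addresses use it (compared case-insensitively).
domain_stats() {
    tr '[:upper:]' '[:lower:]' |   # fold case so the two spellings match
        sort -u |                  # keep each address once
        cut -s -d '@' -f 2 |       # keep only the domain part
        sort | uniq -c | sort -n   # count addresses per domain
}

# Assumption: in a git checkout the addresses could be produced with
#   git log --format='%ae' | domain_stats
# Here a few addresses from this post serve as sample input.
printf '%s\n' emmanuel.thome@inria.fr Emmanuel.Thome@inria.fr \
    akruppa@gmail.com pierrick.gaudry@loria.fr | domain_stats
```

With this sample input each of the three domains gets a count of 1, because the two inria.fr spellings collapse into one address.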

Another use of the 'uniq -c' command on my blog.

