Question

How do I do a natural sort on uniq -c output?

When the counts are <10, the uniq -c | sort output looks fine:

alvas@ubi:~/testdir$ echo -e "aaa\nbbb\naa\ncd\nada\naaa\nbbb\naa\nccd\naa" > test.txt
alvas@ubi:~/testdir$ cat test.txt
aaa
bbb
aa
cd
ada
aaa
bbb
aa
ccd
aa
alvas@ubi:~/testdir$ cat test.txt | sort | uniq -c | sort
      1 ada
      1 ccd
      1 cd
      2 aaa
      2 bbb
      3 aa

But when the counts reach 10 or more (let alone hundreds or thousands), the sort goes wrong because it compares the lines as strings instead of comparing the counts as integers:

alvas@ubi:~/testdir$ echo -e "aaa\nbbb\naa\nnaa\nnaa\naa\nnaa\nnaa\nnaa\nnaa\nnaa\nnaa\nnaa\nnaa\nnnaa\ncd\nada\naaa\nbbb\naa\nccd\naa" > test.txt
alvas@ubi:~/testdir$ cat test.txt | sort | uniq -c | sort
     10 naa
      1 ada
      1 ccd
      1 cd
      1 nnaa
      2 aaa
      2 bbb
      4 aa

How can I sort the output of "uniq -c" naturally, in ascending or descending order?

Solution

Use -n in your sort command, so that it sorts numerically. Also -r allows you to reverse the result:

$ sort test.txt | uniq -c | sort -n
      1 ada
      1 ccd
      1 cd
      1 nnaa
      2 aaa
      2 bbb
      4 aa
     10 naa

$ sort test.txt | uniq -c | sort -nr
     10 naa
      4 aa
      2 bbb
      2 aaa
      1 nnaa
      1 cd
      1 ccd
      1 ada
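
Note that in the -nr output above, lines with equal counts also come out in reverse alphabetical order (bbb before aaa). If you'd rather have the highest counts first but ties listed alphabetically, one possible sketch is to give sort separate keys with -k. The count_desc name is made up for this example, and LC_ALL=C is only there to make the tie order predictable across locales:

```shell
# count_desc is a hypothetical helper name, not a standard command.
# It reads lines on stdin and prints "count line", highest count first;
# lines with the same count are ordered alphabetically.
count_desc() {
    # -k1,1nr : first field (the count), numeric, reversed
    # -k2     : then the rest of the line, ascending
    LC_ALL=C sort | uniq -c | LC_ALL=C sort -k1,1nr -k2
}

# Same sample data as in the question
printf '%s\n' aaa bbb aa naa naa aa naa naa naa naa naa naa naa naa \
    nnaa cd ada aaa bbb aa ccd aa | count_desc
```

With the question's data this should list naa (10) first and aa (4) second, followed by aaa and then bbb for the count of 2.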

From man sort:

-n, --numeric-sort
    compare according to string numerical value

-r, --reverse
    reverse the result of comparisons
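
To see what "string numerical value" means in practice, here is a small sketch (the variable names are just for illustration) that sorts the same three numbers with and without -n:

```shell
# Plain sort compares character by character, so "10" lands before "2"
lexical=$(printf '%s\n' 10 9 2 | sort | tr '\n' ' ')

# -n compares the values as numbers
numeric=$(printf '%s\n' 10 9 2 | sort -n | tr '\n' ' ')

# -nr does the same, largest first
reversed=$(printf '%s\n' 10 9 2 | sort -nr | tr '\n' ' ')

echo "sort:     $lexical"    # 10 2 9
echo "sort -n:  $numeric"    # 2 9 10
echo "sort -nr: $reversed"   # 10 9 2
```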

Licensed under: CC-BY-SA with attribution