Question

I've often wanted to sort strings with numbers in them so that, when sorting e.g. abc_2, abc_1, abc_10 the result is abc_1, abc_2, abc_10. Every sort mechanism I've seen sorts as abc_1, abc_10, abc_2, that is character by character from the left.

Is there any efficient way to sort to get the result I want? The idea of looking at every character, determining if it's a numeral, building a substring out of subsequent numerals and sorting on that as a number is too appalling to contemplate in bash.

Has no bearded *nix guru implemented an alternative version of sort with a --sensible_numerical option?


Solution

Split each line on the underscore and sort numerically on the second field:

sort -t _ -k 2 -g data.file
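  • -t _ : use the underscore as the field separator
  • -k 2 : sort on the key that starts at field 2 (the number after the underscore)
  • -g : general numeric sort, which also handles floats and exponents; -n is enough for plain integers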
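
As a quick check against the names from the question, here is a minimal sketch of the same idea. It assumes every line has exactly one underscore followed by the number. The key -k 2,2n (restrict the key to field 2, compare it as a plain integer) is a variation on the command above, not the answerer's exact invocation.

```shell
# Names from the question, one per line.
# -t _    : fields are separated by underscores
# -k 2,2n : the key is field 2 only, compared numerically
sorted=$(printf '%s\n' abc_2 abc_1 abc_10 | sort -t _ -k 2,2n)
printf '%s\n' "$sorted"
```

This only works while the number sits in a predictable field. Lines with a different number of underscores before the number would need a different key.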

OTHER TIPS

I think this is a GNU extension to sort, but you're looking for the --version-sort (or -V) option:

$ printf "prefix%d\n" $(seq 10 -3 1)
prefix10
prefix7
prefix4
prefix1

$ printf "prefix%d\n" $(seq 10 -3 1) | sort
prefix1
prefix10
prefix4
prefix7

$ printf "prefix%d\n" $(seq 10 -3 1) | sort --version-sort
prefix1
prefix4
prefix7
prefix10

https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html
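
Unlike a field-based sort, version sort does not need a fixed separator in front of the number. A small sketch to illustrate; the file names are made up for the example, and it assumes a sort that supports -V (GNU coreutils does).

```shell
# Hypothetical file names with the number in the middle of the string.
names=$(printf '%s\n' file10.txt file2.txt file1.txt)

# -V compares runs of digits as numbers, wherever they appear.
sorted=$(printf '%s\n' "$names" | sort -V)
printf '%s\n' "$sorted"
```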

Try sort -V (short for --version-sort):

$ cat a.txt
abc_1
abc_4
abc_2
abc_10
abc_5

$ sort -V a.txt
abc_1
abc_2
abc_4
abc_5
abc_10
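
If your sort has no -V, one possible fallback is to decorate each line with its number, sort on that, and strip it again. This is only a sketch under the assumption that every line ends in an underscore followed by digits, as in a.txt above; it is not a general replacement for version sort.

```shell
# Decorate-sort-undecorate:
#   1. sed copies the trailing number to the front: "abc_10" becomes "10 abc_10"
#   2. sort -n orders the lines by that leading number
#   3. cut drops the number again, leaving the original line
sorted=$(printf '%s\n' abc_1 abc_4 abc_2 abc_10 abc_5 |
  sed 's/.*_\([0-9]*\)$/\1 &/' |
  sort -n |
  cut -d ' ' -f 2-)
printf '%s\n' "$sorted"
```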
Licensed under: CC-BY-SA with attribution