Question

I need help understanding the default algorithm used by GNU sort. I assumed it sorted lexicographically, but I found behavior that does not match that. As an example, take the following strings:

alex.
alex.a
alex.Z
alexa
alex0
alexZ
alex.~
alex
alex.|
alex.}
alex.abc

Then sort them in a shell with sort, for example echo 'stuff' | sort

This is the result I get:

alex
alex.
alex.~
alex.|
alex.}
alex0
alexa
alex.a
alex.abc
alexZ
alex.Z

I can't figure out why alex0 and alexa appear between alex.} and alex.a.

Can someone explain this to me?


Solution 2

The sort order depends largely on the locale settings (LC_ALL, LC_COLLATE, LANG). Here the input is in a file named sort:

$ sort sort 
alex
alex.
alex.~
alex.|
alex.}
alex0
alexa
alex.a
alex.abc
alexZ
alex.Z
$ LC_ALL=C sort sort
alex
alex.
alex.Z
alex.a
alex.abc
alex.|
alex.}
alex.~
alex0
alexZ
alexa
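
The byte-order result can also be reproduced without an input file. This is a minimal sketch, assuming a POSIX shell and a sort that honors LC_ALL; the variable names words and c_sorted are made up for illustration.

```shell
# The eleven strings from the question, one per line.
words='alex.
alex.a
alex.Z
alexa
alex0
alexZ
alex.~
alex
alex.|
alex.}
alex.abc'

# Show the locale variables that influence sort's collation order.
printf 'LANG=%s LC_ALL=%s LC_COLLATE=%s\n' "${LANG-}" "${LC_ALL-}" "${LC_COLLATE-}"

# Prefixing the command with LC_ALL=C forces byte-wise comparison
# for this one invocation only; the rest of the shell is unaffected.
c_sorted=$(printf '%s\n' "$words" | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

In the C locale "." (0x2E) sorts before "0" (0x30), which sorts before "Z" (0x5A), "a" (0x61) and "|" (0x7C), so every alex.* string lands ahead of alex0, alexZ and alexa.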

OTHER TIPS

GNU sort implements merge sort. The complete answer is posted in this Super User thread: https://superuser.com/questions/234054/unix-sort-algorithm-implementation
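
The merge step on its own can be observed with sort's -m option, which merges inputs that are already sorted. This is only an illustration of merging, not a description of GNU sort's internals; the file names run1 and run2 are hypothetical.

```shell
# Two runs that are each already sorted in byte order.
tmpdir=$(mktemp -d)
printf '%s\n' alex alex.a alexa > "$tmpdir/run1"
printf '%s\n' alex. alex0 alexZ > "$tmpdir/run2"

# -m merges already-sorted inputs instead of sorting from scratch.
merged=$(LC_ALL=C sort -m "$tmpdir/run1" "$tmpdir/run2")
printf '%s\n' "$merged"

# Clean up the temporary directory.
rm -r "$tmpdir"
```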

Try "sort InFile.txt" on the file below, then change the first word from Start to Begin. Can anyone explain the difference? I keep setting LC_LOCAL-C to no effect.

InFile.txt:

Start report    /* Change Start to Begin to see difference */    
MISSING..
NEW file.
Updated files
/home/me/path/To/file.txt
/home/me/path/To/new.txt
/home/me/path/To/old.txt
/home/me/path/To/lost.txt
/home/me/path/To/file.txt

Yes, I think I've figured this out. The "/" character seems to be ignored. I wonder where that detail is documented; maybe it is hidden in the locale selected by LC_ALL.
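
For comparison, here is a sketch of the same file sorted byte-wise, assuming the intended setting was LC_ALL=C (LC_LOCAL is not a variable that sort reads). The first line is abbreviated to "Start report", and the names infile and bytewise are made up for illustration.

```shell
# Abbreviated copy of InFile.txt (the comment on the first line is dropped).
infile=$(mktemp)
cat > "$infile" <<'EOF'
Start report
MISSING..
NEW file.
Updated files
/home/me/path/To/file.txt
/home/me/path/To/new.txt
/home/me/path/To/old.txt
/home/me/path/To/lost.txt
/home/me/path/To/file.txt
EOF

# LC_ALL=C placed before the command applies to that command only.
bytewise=$(LC_ALL=C sort "$infile")
printf '%s\n' "$bytewise"

# Clean up the temporary file.
rm "$infile"
```

With byte-wise comparison "/" (0x2F) is not ignored: it sorts before every uppercase letter, so all the path lines come first and changing Start to Begin only moves that one line among the non-path lines.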

Licensed under: CC-BY-SA with attribution