Question

I have a file that has the following format:
12345 TAB_HERE Name : The Actual Name TAB_HERE 6785

e.g.

1001020 Name : SMITH S ANNALOLA     14570
5701061 Name : MATTHEW SANDY HILL   6440
7001083 Name : TANYA MORRISON MILLER    14406

I want to sort by the last field of numbers.

I'd prefer a simple one-line Python solution or a Linux tool based solution.

I tried using sort -k 3,3n but it did not work.
I also can't seem to write a single line of Python code that I can run as python -c "code here".

I looked at the following but to no avail:

http://www.unix.com/unix-dummies-questions-answers/18359-how-do-i-specify-tab-field-separator-sort.html

http://www.unix.com/unix-dummies-questions-answers/30450-sort-third-column-n-command.html

http://www.linuxquestions.org/questions/programming-9/unix-sort-on-multiple-fields-598813/


Solution

Quick solution:

import sys
# Sort the lines from stdin numerically by their last whitespace-separated field.
sys.stdout.write("".join(sorted(sys.stdin, key=lambda x: int(x.split()[-1]))))

This solution has some limitations. It fails if any line does not end in a number (blank lines included), and it does not help if you want to sort by some other part of the line rather than the last field. In those cases you can use regular expressions (the re module) and describe the field you want to sort by in the key function.
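As a rough sketch of the regular-expression approach, the key function below pulls the trailing number out with a pattern instead of split(). The names TRAILING_NUMBER, sort_key and sort_lines are made up for illustration, and putting lines without a trailing number at the end is just one possible policy, not something the question asked for.

```python
import re

# Hypothetical pattern: a run of digits at the very end of the line,
# optionally followed by whitespace such as the newline.
TRAILING_NUMBER = re.compile(r"(\d+)\s*$")

def sort_key(line):
    """Return (0, number) for lines ending in digits, (1, 0) otherwise."""
    match = TRAILING_NUMBER.search(line)
    if match:
        return (0, int(match.group(1)))
    # Assumption: lines without a trailing number are sorted after all others.
    return (1, 0)

def sort_lines(lines):
    """Sort lines numerically by their trailing number."""
    return sorted(lines, key=sort_key)

# Sample data modelled on the question, plus one line with no number.
sample = [
    "1001020\tName : SMITH S ANNALOLA\t14570\n",
    "a line without a trailing number\n",
    "5701061\tName : MATTHEW SANDY HILL\t6440\n",
    "7001083\tName : TANYA MORRISON MILLER\t14406\n",
]
print("".join(sort_lines(sample)), end="")
```

To sort by a different part of the line, only the pattern and the conversion inside sort_key would need to change. For real input, sort_lines(sys.stdin) could replace the sample list.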

OTHER TIPS

Python one liner:

cat file | python -c 'import sys; sys.stdout.write("".join(sorted(sys.stdin, key=lambda x: int(x.split()[-1]))))'

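My guess as to why the other Python example won't work as a one-liner is that the code uses " for the empty string passed to join(), which clashes with the double quotes in python -c "code here". Wrapping the code in single quotes avoids that.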

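I guess the --key parameter of the sort command counts fields separated by spaces as well as tabs, so the spaces inside the name add extra fields.

sort -k7n

worked for me, because the number is the seventh whitespace-separated field in each of the sample lines. It will pick the wrong field for names with a different number of words. If the columns really are tab-separated, making tab the field separator (for example sort -t$'\t' -k3,3n in bash) should avoid that.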
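To see why the field number comes out as 7 rather than 3, here is a small Python illustration. It only approximates how sort counts fields by default, and it assumes the sample line really contains tabs between the three columns, as the question describes.

```python
# One sample line from the question, with the tabs written out explicitly.
line = "1001020\tName : SMITH S ANNALOLA\t14570\n"

# Splitting on any whitespace, roughly how sort counts fields by default.
whitespace_fields = line.split()

# Splitting on tabs only, which matches the three-column layout.
tab_fields = line.rstrip("\n").split("\t")

print(len(whitespace_fields), whitespace_fields[6])  # 7 14570
print(len(tab_fields), tab_fields[2])                # 3 14570
```

With whitespace splitting the position of the number depends on how many words the name has, while with tab splitting it is always the third field.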

Licensed under: CC-BY-SA with attribution