Question

I have a plain text file where the newline character is not "\n" but a special character.

Now I want to sort this file.

Is there a direct way to specify a custom newline character when using the Unix sort command? I would prefer not to use a script for this if possible.

Please note that the data in the text file contains \n, \r\n, and \t characters (the reason for such data is application-specific, so please don't comment on that).

The sample data is as below:

1111\n1111<Ctrl+A>
2222\t2222<Ctrl+A>
3333333<Ctrl+A>

Here Ctrl+A is the newline character.

Solution

Use perl -001e 'print sort <>' to do this:

prompt$ cat -tv /tmp/a
2222^I2222^A3333333^A1111
1111^A

prompt$ perl -001e 'print sort <>' /tmp/a | cat -tv    
1111
1111^A2222^I2222^A3333333^Aprompt$  

That works because character 001 (octal 1) is control-A ("\cA"), which is your record terminator in this dataset.

You can also give the code point in hex with -0xHHHHH. Note that this shortcut accepts only a single code point, not a string. There are ways of doing it for strings and even regexes that involve infinitesimally more code.

Licensed under: CC-BY-SA with attribution