Question

I have a .csv file with some addresses and admin codes. I want to sort it on the final column (UK postcode). I'm trying to reorder the file by moving the final column to the beginning and then using sort, but I'm running into an odd sort of issue. Here's some of the data:

$ head T201311ADDR\ BNFT.CSV 
201311,A81001,THE DENSHAM SURGERY                     ,THE HEALTH CENTRE        ,LAWSON STREET            ,STOCKTON                 ,CLEVELAND                ,TS18 1HU
201311,A81002,QUEENS PARK MEDICAL CENTRE              ,QUEENS PARK MEDICAL CTR  ,FARRER STREET            ,STOCKTON ON TEES         ,CLEVELAND                ,TS18 2AW
201311,A81003,THE GALLAGHER PRACTICE                  ,THE HEALTH CENTRE        ,VICTORIA ROAD            ,HARTLEPOOL               ,CLEVELAND                ,TS26 8DB
201311,A81004,WOODLANDS ROAD SURGERY                  ,6 WOODLANDS ROAD         ,                         ,MIDDLESBROUGH            ,CLEVELAND                ,TS1 3BE 
201311,A81005,SPRINGWOOD SURGERY                      ,SPRINGWOOD SURGERY       ,RECTORY LANE             ,GUISBOROUGH              ,                         ,TS14 7DJ

I can get the final column:

$ head T201311ADDR\ BNFT.CSV | awk -F ',' 'BEGIN {OFS = ","} {print $NF}' 
TS18 1HU
TS18 2AW
TS26 8DB
TS1 3BE 
TS14 7DJ

But if I print any other field on the same line, the output gets overwritten:

$ head T201311ADDR\ BNFT.CSV | awk -F ',' 'BEGIN {OFS = ","} {print $NF, $2}' 
,A81001U
,A81002W
,A81003B
,A81004 
,A81005J
,A81006T
,A81007W
,A81008 
,A81009 
,A81011W

Clearly not useful. I've tried using a test file that I just edited and don't see this behaviour, so it has something to do with this particular file. Any suggestions? Is there a known issue with some newline characters? I'm working on a Mac, and I suspect the file comes from a PC. It's NHS data from the UK.


Solution

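Your file most likely has Windows-style (CRLF) line endings, so the last field on every line ends in a carriage return. When awk prints that field first, the terminal moves the cursor back to the start of the line and the next field is written over the postcode. To check, try this: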

cat -vet yourfile

and look for ^M, which is how a carriage return appears (the $ marks the end of each line). Here is an example:

cat -vet file
I came from Windows-world ^M$
so did I ^M$

Or if you only have, or prefer, sed:

sed -ne "l"  file
I came from Windows-world \r$
so did I \r$
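
If you would rather count the affected lines than read through the output, something like the following should work. It is only a sketch: crlf_sample.csv is a made-up file created here for illustration, so substitute your own file name.

```shell
# Hypothetical sample: two lines with Windows (CRLF) endings, one with a Unix (LF) ending.
printf 'a,b\r\nc,d\r\ne,f\n' > crlf_sample.csv

# Store a literal carriage return in a variable so it can be used as a grep pattern.
cr=$(printf '\r')

# grep -c prints the number of lines that contain the pattern.
crlf_lines=$(grep -c "$cr" crlf_sample.csv)
echo "lines with a carriage return: $crlf_lines"
```

A count of zero would mean the line endings are not the cause.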

You can remove unwanted characters with tr. To remove carriage returns (\r), do this:

tr -d '\r' < inputfile > outputfile
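
For the original goal (postcode first, then sort), the cleanup can go straight into the pipeline. This is a rough sketch, assuming no field contains an embedded comma; sample.csv and sorted.csv are placeholder names, and the two sample rows are shortened stand-ins for the real data.

```shell
# Hypothetical sample standing in for the real file, with CRLF line endings.
printf '201311,A81002,QUEENS PARK,TS18 2AW\r\n201311,A81004,WOODLANDS,TS1 3BE\r\n' > sample.csv

# 1. tr strips the carriage returns.
# 2. awk builds a new line: the last field first, then fields 1 to NF-1.
# 3. sort orders the lines by the postcode, which is now the leading field.
#    LC_ALL=C forces plain byte order so the result does not depend on the locale.
tr -d '\r' < sample.csv |
  awk -F ',' 'BEGIN { OFS = "," } { out = $NF; for (i = 1; i < NF; i++) out = out OFS $i; print out }' |
  LC_ALL=C sort > sorted.csv

cat sorted.csv
```

Note that this is a plain text sort on the postcode, not one that understands postcode structure.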
Licensed under: CC-BY-SA with attribution