I've solved it!
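Instead of passing the tab as "\t" inside the awk program, I passed it to -F as $'\t'.
I couldn't find documentation on why at first, but there are two parts to it. My original command never set the input field separator: OF is not a built-in awk variable, so OF=OFS="\t" only changed the output separator, and awk kept splitting on any run of spaces or tabs. $'\t' is bash's ANSI-C quoting, which hands awk a literal tab character; -F'\t' works as well.
Final answer:
awk -F$'\t' 'BEGIN{OFS=FS}{print $1,$2,$10,$12,$14,$20}' AECPRDA.TAB | head -10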
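To see the difference between awk's default splitting and splitting on tabs only, here is a minimal sketch. The two-line sample is made up for illustration (it is not the real AECPRDA.TAB), and the variable names are hypothetical; -F'\t' is used instead of $'\t' so the snippet also runs in shells without ANSI-C quoting.

```shell
# Hypothetical two-line sample; the second field contains spaces.
sample=$(printf '22746528\tTHROUGH THE NEVER (2PC)\t050\n25584998\tFROZEN / (WS DOL DTS)\t050\n')

# Default FS: awk splits on any run of blanks, so $2 is only the first word.
default_split=$(printf '%s\n' "$sample" | awk '{print $2}')

# FS set to a tab: $2 is the whole title, spaces included.
tab_split=$(printf '%s\n' "$sample" | awk -F'\t' '{print $2}')

printf 'default FS:\n%s\n\ntab FS:\n%s\n' "$default_split" "$tab_split"
```

With the default separator the titles are cut at the first space, which matches the scrambled columns in the output shown in the question.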
Question
I'm having a very difficult time parsing a tab-delimited file. The client says that it is definitely tab-delimited, but it seems like there are no text qualifiers.
I am running this statement:
awk '{OF=OFS="\t"}{print $1,$2,$10,$12,$14,$20}' AECPRDA.TAB | head -10
and this is the output that I get:
+-----------------------------------------------------------------------+
| 22746528 BKEN48DVD NEVER 050 R N |
| 22746535 BKEN48BR NEVER 050 R N |
| 25584998 WD1194190DVD DTS) / DOL 29.99 |
| 21548598 DSND001906102.2 / 001 11.49 8 |
| 25812794 WHV1000292717BR / 050 PG13 N |
| 25812787 WHV1000284958DVD SPEC GRAVITY / PG13 |
| 21425462 PBSDMST64400DVD SEASON (3PC) CLASSIC: 050 |
| 25584974 WD1194170BR (WS DTS DIGC) AC3 |
| 21388262 HBO1000394029DVD 3 OF SEASON 59.98 |
| 25688450 WD11955700DVD / DOL) THE 050 |
+-----------------------------------------------------------------------+
I don't believe that the columns are being split on the tabs correctly.
Here is a plain-text sample of the file:
22746528 BKEN48DVD AW40 48 18 METALLICA (2PC) THROUGH THE NEVER (2PC) 050 090 R 12.99 19.98 85611500487 01/28/2014 N N 30 1 A 1 11/27/2013 01/24/2014 11/27/2013 11/27/2013
22746535 BKEN48BR AW40 48 BR METALLICA (2PC) THROUGH THE NEVER (2PC) 050 090 R 16.25 24.98 85611500488 01/28/2014 N N 30 1 A 2 11/27/2013 01/24/2014 11/27/2013 11/27/2013
25584998 WD1194190DVD 0819 1194190 18 FROZEN / (WS DOL DTS) FROZEN / (WS DOL DTS) 050 110 G 21.25 29.99 78693683896 03/18/2014 N N 0 2 A 3 12/20/2013 03/20/2014 12/20/2013 12/20/2013
21548598 DSND001906102.2 0107 001906102 02 FROZEN / O.S.T. FROZEN / O.S.T. 001 024 11.49 13.95 05008729574 11/25/2013 N N 8 1 E 4 10/07/2013 03/20/2014 10/07/2013 10/07/2013
25812794 WHV1000292717BR 0526 1000292717 BR GRAVITY / (UVDC) GRAVITY / (UVDC) 050 093 PG13 29.49 35.99 88392924457 02/25/2014 N N 30 1 E 5 01/16/2014 02/11/2014 01/16/2014 01/16/2014
Am I doing something wrong with my awk command? Why aren't the fields being split on the tabs? Is there a hidden "space" qualifier that I am missing?
Here is an explanation that I got from someone, but I would like to implement it using awk, NOT Excel (God forbid):
"Tab delimited will probably not line up. The tab character is defined differently in different operating systems. Usually it is defined as 4 or 5 spaces when displayed. So if you have an artist name that is 5 characters, then the tab character, then the title would start at character position 9. If on the next line the artist is 20 characters long, then the tab character, then the title would appear at position 24. Hope this helps. (Another thought: tell the user to open a blank spreadsheet in Excel and use the Text Import.)"
Thanks so much for your guidance!
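The quoted explanation is about how tabs are displayed, which does not affect how many fields a line has. As a rough check, the sketch below counts tab-separated fields and makes the tabs visible. The sample line and variable names are assumptions for illustration only.

```shell
# Hypothetical line: the fields differ in length, but each is still one field.
line=$(printf 'METALLICA\tTHROUGH THE NEVER\t12.99')

# Count tab-separated fields; display width plays no part in this.
field_count=$(printf '%s\n' "$line" | awk -F'\t' '{print NF}')
printf 'fields: %s\n' "$field_count"

# Print the line with each tab shown explicitly as \t (sed's "l" command).
visible=$(printf '%s\n' "$line" | sed -n 'l')
printf '%s\n' "$visible"
```

Running the same awk command against the real file (for example piped through sort and uniq -c) would show whether every line has the same number of fields.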
OTHER TIPS
This should do:
awk 'BEGIN {FS=OFS="\t"} NR<=10 {print $1,$2,$10,$12,$14,$20}' AECPRDA.TAB