Question

I would like some help or direction with a problem I have in awk.

I have a tab-delimited file with more than 5 fields. I want to output the fields excluding the first 5 fields.

Could you please tell me how to write an awk script to accomplish this task?

Best, jianfeng.mao

Note the following comment:

There are many fields in my files. Different lines have a different number of fields. The number of fields per line is not standard.


Solution

I agree with matchew's suggestion to use cut: it's the right tool for this job. But if this is just going to become a part of a larger awk script, here's how to do it:

awk -F "\t" -v OFS="\t" '{ for (i=6; i<=NF; ++i) $(i-5) = $i; NF = NF-5; print; }'
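To see the field-shifting approach on lines with different field counts, here is a small sketch. The function name drop_first_five and the sample data are made up for illustration. It sets OFS to a tab so the rebuilt line stays tab-delimited, and it prints an empty line when there are five fields or fewer (my assumption about what you would want there). Shrinking NF rebuilds the record in gawk, mawk and BusyBox awk as far as I know, but very old awks may behave differently.

```shell
# Hypothetical helper: drop the first five tab-separated fields of every input line.
drop_first_five() {
  awk -v FS='\t' -v OFS='\t' '
    NF <= 5 { print ""; next }                 # nothing remains after dropping five fields
    {
      for (i = 6; i <= NF; ++i) $(i-5) = $i    # shift fields left by five positions
      NF -= 5                                  # cut off the duplicated tail
      print                                    # record is rebuilt using OFS (a tab)
    }'
}

# Two sample lines with 6 and 8 fields.
printf 'a\tb\tc\td\te\tf\n1\t2\t3\t4\t5\t6\t7\t8\n' | drop_first_five
```

Inside a larger awk script you would keep only the body of the second rule and carry on working with the shortened record.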

OTHER TIPS

My tab-delimited file temp.txt looks like this:

field1 field2 field3 field4 field5 field6
field1 field2 field3 field4 field5 field6 field7
field1 field2 field3 field4 field5 field6 field7 field8

As per your update, I strongly recommend using cut:

cut -f6- temp.txt

This prints everything from field 6 to the end of each line.

Note that -d specifies the delimiter, but tab is already the default, so it is not needed here. You can do this in awk, but I find cut simpler.
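Since the lines have different numbers of fields, a quick sketch of how cut copes may help. The function name drop_with_cut and the sample lines are made up. I have added -s, which is my own addition rather than part of the answer above: without it, cut prints any line that contains no tab at all unchanged.

```shell
# Hypothetical helper: keep field 6 onward, relying on the default tab delimiter.
drop_with_cut() {
  # -f6- selects field 6 through the last field of each line.
  # -s suppresses lines that contain no delimiter at all.
  cut -s -f6-
}

# Lines with 6 fields, no tabs, and 7 fields.
printf 'a\tb\tc\td\te\tf\nno-tabs-here\n1\t2\t3\t4\t5\t6\t7\n' | drop_with_cut
```

Leave -s out if you want lines without any tab passed through as they are.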

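With awk it would look like this:

awk -F"\t" '{print substr($0, index($0, $6))}' temp.txt

Note that index() returns the first place the text of $6 occurs in the line, so this prints too much if the same text also appears in an earlier field.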
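One thing to watch with the index() approach: it searches for the text of $6, not for the sixth field as such. The sketch below shows a line where that goes wrong, plus an alternative that skips past five tabs instead. The function names and the sample line are made up for illustration, and the alternative is my suggestion, not part of the answer above.

```shell
# The index() approach, wrapped in a hypothetical helper for comparison.
from_sixth_by_index() {
  awk -v FS='\t' '{ print substr($0, index($0, $6)) }'
}

# Alternative sketch: strip everything up to and including the fifth tab.
from_sixth_by_tabs() {
  awk '{
    rest = $0
    for (i = 0; i < 5; i++) {
      p = index(rest, "\t")              # position of the next tab
      if (p == 0) { rest = ""; break }   # fewer than six fields: nothing to print
      rest = substr(rest, p + 1)         # drop one field and its tab
    }
    print rest
  }'
}

# Field 6 is "6", but "6" also occurs inside field 1 ("x6").
printf 'x6\tb\tc\td\te\t6\tg\n' | from_sixth_by_index
printf 'x6\tb\tc\td\te\t6\tg\n' | from_sixth_by_tabs
```

The first call starts printing from the "6" inside the first field, so nearly the whole line comes back. The second call prints only fields 6 and 7.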

Given the same temp.txt as above,

awk -F"\t" '{print $6}' temp.txt

will print only the 6th field. If the delimiter is a tab, this will usually work without -F as long as no field is empty or contains spaces, but I like to set my field separator when I can.

cut can do the same:

cut -f6 temp.txt

I have a hunch your question is a bit more complicated than this, so if you respond to my comment I can try to expand on my answer.

A Perl way (note that -a splits on whitespace, so this assumes no field contains spaces):

perl -lane 'splice @F,0,5;print "@F"'

so,

echo 'field1 field2 field3 field4 field5 field6' | perl -lane 'splice @F,0,5;print "@F"'

will produce

field6

awk -vFS='\t' -vOFS='\t' '{
  $1=$2=$3=$4=$5=""
  print substr($0,6) # delete leading tabs
}'

I use -vFS='\t' rather than -F'\t' because some implementations of awk (e.g. BusyBox's) don't honor C escapes in the latter construction.
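If you would rather not depend on awk interpreting '\t' at all, another option is to let the shell produce a literal tab and pass that in. This is only a sketch under that assumption. The names tab and skip_five are made up for illustration, and I have not tried it on every awk out there.

```shell
# Hypothetical names: "tab" holds a literal tab character produced by printf.
tab=$(printf '\t')

skip_five() {
  # Blank the first five fields; the rebuilt record then starts with five tabs,
  # so the remaining fields begin at character 6.
  awk -F "$tab" -v OFS="$tab" '{ $1 = $2 = $3 = $4 = $5 = ""; print substr($0, 6) }'
}

printf 'a\tb\tc\td\te\tf\tg\n' | skip_five
```

Lines with five fields or fewer come out empty, because the blanked fields are all that is left.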

Licensed under: CC-BY-SA with attribution