Question

I have a question about processing files in UNIX line by line. What I have right now is this -

Source file:

header-1 header-sub1
field1|field2|field3|field4
field5|field6|field7|field8
header-2
field9|field0|fieldA|fieldB

Now I want to process this file line by line and generate an output file. The header should be appended to the first column of every line until the next header is found. That is in essence the output file should be as below:

Output:

header-1 header-sub1|field1|field2|field3|field4
header-1 header-sub1|field5|field6|field7|field8
header-2|field9|field0|fieldA|fieldB    

The shell script loop that I have with me is this -

while read line 
do
    echo "Line ---> ${line}"
    if [ $line = "header-1" -o $line = "header-2" ]
    then
        first_col=$line
    else
        complete_line=`echo $first_col"|"$line`
        echo "$complete_line" >> out.csv
    fi
done < input.txt

Shouldn't the input file be read line by line and then create an appended "complete line"? The thing is the program will treat header-1 and header-sub1 as two distinct fields and it will not match the complete header line 1. But I know they are on the same line, so they should be considered as a single line. Or maybe I am missing out on the logic and/or syntax somewhere?

Also is there any way I can use sed or awk to create such a file? Thanks in advance for any suggestions.

Was it helpful?

Solution

You can use this awk:

$ awk 'BEGIN{OFS="|"} /^header/ {h=$0; next} {print h, $0}' file
header-1 header-sub1|field1|field2|field3|field4
header-1 header-sub1|field5|field6|field7|field8
header-2|field9|field0|fieldA|fieldB

Explanation

  • BEGIN{OFS="|"} set the output field separator as |.
  • /^header/ {h=$0; next} if the line starts with header, then store it without printing.
  • {print h, $0} on the rest of the lines, print the stored header first.

OTHER TIPS

This might work for you (GNU sed):

sed -r '/^header/{h;d};G;s/(.*)\n(.*)/\2|\1/' file

Store the header in the hold space and inserts it before non-header lines.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top