Question

Is there a quick sed or awk one-liner that I can run to modify a value in a text file, specifically the value in the last line of the file?

Currently my file has a trailer line containing a count of the data lines. I want to change it so that the count also includes the header and trailer lines. Any help would be much appreciated.

Contents of file1:

H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
D|849007|38|1208004|1
T|3

After modification the output should be:

H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
D|849007|38|1208004|1
T|5

Solution

To modify the line that starts with T:

$ awk '{sub(/^T.*/,"T|"NR)}1' file
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
D|849007|38|1208004|1
T|5

To modify the last line of your input file as originally requested:

$ awk '{printf "%s",p} {p=$0 ORS} END{sub(/\|.*/,"|"NR,p); print p}' file
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
D|849007|38|1208004|1
T|5
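
The hold-back-one-line idea used by the command above can be spelled out as follows. This is a minimal sketch, not the answer's exact code: the variable name prev, the temporary sample file, and the guard for empty input are assumptions added for illustration.

```shell
# Temporary sample file (hypothetical; any file with a trailer as its last line works).
sample=$(mktemp)
printf '%s\n' 'H|ACCT|XEC|1|TEMP|20130215035845|' \
              'D|849002|48|1208004|1' \
              'D|849007|28|1208004|1' \
              'D|849007|38|1208004|1' \
              'T|3' > "$sample"

result=$(awk '
  NR > 1 { print prev }            # the previous line is now known not to be the last
         { prev = $0 }             # hold the current line back for one step
  END {
    if (NR) {                      # assumption: print nothing for empty input
      sub(/\|.*/, "|" NR, prev)    # replace everything from the first "|" onward
      print prev
    }
  }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because each line is printed one step late, the END block is the only place that ever sees the last line, and NR already holds the total line count there.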

There was some debate in the comments about why I downvoted a getline solution posted here, and it's difficult to give examples in comments, so here are a couple of examples of why you should not use that getline solution (or any like it) for this problem (or any like it):

Works for one set of input:

$ cat file1
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
D|849007|28|1208004|1
T|3

$ awk '{printf "%s",p} {p=$0 ORS} END{sub(/\|.*/,"|"NR,p); print p}' file1
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
D|849007|28|1208004|1
T|5

$ awk '{l=$0; if(getline==1){print l; print} else {sub("\\|.*","|"NR);print}}' file1
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
D|849007|28|1208004|1
T|5

Fails for another:

$ cat file2
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
T|3

$ awk '{printf "%s",p} {p=$0 ORS} END{sub(/\|.*/,"|"NR,p); print p}' file2
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
T|4

$ awk '{l=$0; if(getline==1){print l; print} else {sub("\\|.*","|"NR);print}}' file2
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
T|3
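
The dependence on the number of lines can be checked with a small loop. This is a rough sketch under stated assumptions: the generated D|n and T|0 lines are made-up test data, and the two awk programs are copied from the commands shown above.

```shell
# Run both commands on files of 1..6 lines and compare the trailer each one produces.
report=$(
  for n in 1 2 3 4 5 6; do
    f=$(mktemp)
    # Made-up input: n-1 data lines followed by a trailer with a placeholder count.
    awk -v n="$n" 'BEGIN { for (i = 1; i < n; i++) print "D|" i; print "T|0" }' > "$f"

    buffered=$(awk '{printf "%s",p} {p=$0 ORS} END{sub(/\|.*/,"|"NR,p); print p}' "$f" | tail -n 1)
    paired=$(awk '{l=$0; if(getline==1){print l; print} else {sub("\\|.*","|"NR);print}}' "$f" | tail -n 1)

    printf '%s lines: buffered=%s getline=%s\n' "$n" "$buffered" "$paired"
    rm -f "$f"
  done
)

printf '%s\n' "$report"
```

With an even number of lines the getline version consumes the trailer as the second line of a pair, so its else branch never runs and the placeholder count is left unchanged.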

Awkward (at best) to enhance for the smallest job, e.g. printing each line to stderr for debugging:

$ awk '{print |"cat>&2"} {printf "%s",p} {p=$0 ORS} END{sub(/\|.*/,"|"NR,p); print p}' file2

$ awk '{print |"cat>&2"; l=$0; if(getline==1){print |"cat>&2"; print l; print} else {print |"cat>&2"; sub("\\|.*","|"NR); print}}' file1

Notice the difference in simplicity between modifying the two versions. Modifying the getline version is clumsy, complicated, non-obvious, inefficient, and open to insidious errors, and it needs duplicated code and/or a significant rewrite.

What we see above are the VERY common repercussions of trying to use getline to solve problems that awk's natural text processing mode can easily handle.

getline is useful when used appropriately; see http://awk.info/?tip/getline for some examples of valid applications.

OTHER TIPS

It's not strictly a one-liner, and it makes assumptions about the format of the "T" line, but:

(sed '${=;d;}' | sed '$s/^/T|/') < infile > outfile

And an awk one-liner:

awk '/^T/ {sub(/[0-9]*$/, NR)}; {print}' < infile > outfile

Update 2:

  • This solution works and is efficient in that it only reads the input file once.
  • However, for a more idiomatic awk solution that, too, only reads the file once, see @Ed Morton's answer.
  • This solution uses getline, an awk function that has many pitfalls (yet also has legitimate applications) - see http://awk.freeshell.org/AllAboutGetline
    • Case in point: the original version of this answer was fundamentally broken in that it worked only with input files having an odd number of lines; see Ed's answer again for an illustration.
  • Another aspect that can make getline-based solutions problematic in general is maintainability - modifying this solution to do more than just update the line count would be cumbersome.

An awk solution that reads the input file only once:

awk '{l=$0; while(getline==1){print l;l=$0;} sub("\\|.*","|"NR); print}' file

Annotated version:

awk '
  {
    l=$0                     # save 1st line read
    # Start a loop that reads all remaining lines.
    # Print them EXCEPT for the LAST one.
    while (getline == 1) {   # loop until the last line is read
      print l                # print the saved line now known not to be the last
      l=$0                   # save this line for the next iteration
    }
    # Getting here means: the last line was read (and is stored in $0).
    sub("\\|.*","|"NR)       # replace the part after "|" with the line count
    print                    # output modified last line
  }
  ' file

Note that POSIX awk and many implementations do not support modifying an input file in place, so you'll have to save the output (at least temporarily) to a different file.

However, as @Ed Morton points out, GNU awk, version 4.1 or higher, does allow in-place modification with -i inplace - see http://www.gnu.org/software/gawk/manual/gawk.html#Extension-Sample-Inplace
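
Without an in-place option, the usual workaround is to write to a temporary file and then replace the original. The following is a minimal sketch of that pattern; the sample file created with mktemp stands in for the real input and is an assumption, and the awk program is the buffering one-liner from the accepted answer rather than the getline version.

```shell
# Stand-in for the real input file (hypothetical sample data).
file=$(mktemp)
printf '%s\n' 'H|ACCT|XEC|1|TEMP|20130215035845|' \
              'D|849002|48|1208004|1' \
              'T|1' > "$file"

# Write the result to a temporary file, then replace the original
# only if every previous step succeeded.
tmp=$(mktemp) &&
awk '{printf "%s",p} {p=$0 ORS} END{sub(/\|.*/,"|"NR,p); print p}' "$file" > "$tmp" &&
mv "$tmp" "$file"

cat "$file"
```

Note that mv replaces the file with a new one, so permissions and hard links of the original may not carry over; copying the temporary file's contents back over the original is an alternative when that matters.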

An awk version that reads the file twice:

awk -F\| 'FNR==NR{f++;next} FNR==f {$NF=f} 1' OFS=\| file{,}
H|ACCT|XEC|1|TEMP|20130215035845|
D|849002|48|1208004|1
D|849007|28|1208004|1
D|849007|38|1208004|1
T|5

If your shell does not expand file{,}, write file file instead so that awk reads the file twice: the first pass counts the lines, and the second pass writes that count into the last field of the final line.
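
The two-pass approach can also be written out with comments. This is an illustrative sketch only: the variable name total and the temporary sample file are assumptions, and the file name is passed twice explicitly instead of relying on brace expansion.

```shell
# Temporary sample file (hypothetical data matching the question's layout).
f=$(mktemp)
printf '%s\n' 'H|ACCT|XEC|1|TEMP|20130215035845|' \
              'D|849002|48|1208004|1' \
              'D|849007|28|1208004|1' \
              'D|849007|38|1208004|1' \
              'T|3' > "$f"

result=$(awk -F'|' -v OFS='|' '
  FNR == NR    { total++; next }   # first pass: only count the lines
  FNR == total { $NF = total }     # second pass: overwrite the last field of the final line
               { print }           # second pass: print every line
' "$f" "$f")

printf '%s\n' "$result"
rm -f "$f"
```

Only the final line is rebuilt with OFS, so the other lines, including the header with its trailing delimiter, pass through unchanged.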


To count only the lines that start with H, D, or T:

awk -F\| 'FNR==NR{if (/^(H|D|T)/) f++;n=NR;next} FNR==n {$NF=f} 1' OFS=\| file{,}
Licensed under: CC-BY-SA with attribution