shell script to read and print a part of a string
Question
Good day members,
I have an input file containing rows of numeric digits (close to 2000 rows). I want to extract the second through eighth digits from the right of every row into a separate file, with each result followed by a comma, as shown.
Example: input.txt
00000000000001303275310752
00000000000001827380519015
00000000000000800081610361
00000000000000449481894004
00000000000000449481894004
00000000000001812612607514
Expected result: newfile.txt
7531075,
8051901,
8161036,
8189400,
8189400,
1260751,
I'm guessing something like 'sed' can be used to solve my problem, but I'm not quite sure how to go about it. I'm connected to a machine running Solaris 5.10. I'd appreciate it if someone could guide me with a brief explanation.
regards,
novice.
Solution
For fixed width input, try:
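cut -c19-25 input.txt | sed 's/$/,/'
which is to say, extract the 19th to 25th characters of each line of input.txt (the second to eighth digits from the right of a 26-character line) and then append a comma at the end of each line. Add > newfile.txt to the end of the command to save the result to a file.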
If you have variable length lines, you will need something a little different.
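For variable-length lines, one possibility is to anchor the match at the end of the line instead of counting columns from the left. A minimal sketch, assuming every line has at least eight characters; extract_field is a hypothetical helper name used only for illustration:

```shell
# Hypothetical helper: keep the 2nd to 8th characters from the right of each
# line and append a comma. The greedy ".*" swallows everything except the
# last eight characters; the group captures seven of them and the final "."
# discards the rightmost one.
extract_field() {
    sed 's/^.*\(.......\).$/\1,/'
}

printf '%s\n' 00000000000001303275310752 123456789 | extract_field
# prints:
# 7531075,
# 2345678,
```

Lines shorter than eight characters do not match the pattern and pass through unchanged, so they may need separate handling. To process the real file, something like extract_field < input.txt > newfile.txt should do.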
OTHER TIPS
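You can strip the leading zeros with:
sed 's/^0*//'
Thus something like:
sed 's/^0*//' input.txt | sed 's/$/,/'
removes the zeros and appends a comma. Note that this keeps every remaining digit (for example 1303275310752, for the first row), so on its own it does not produce the seven-digit field in the expected result.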
Try:
perl -pe 's/^.*(\d{7})\d$/$1,/' < input.txt
Or if you don't like regular expressions:
perl -pe '$_ = substr($_,-9,-2) . ",\n"' < input.txt
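Both work for fixed- or variable-length lines, as long as each line has at least eight digits. The substr version also assumes every line ends with a newline, since its -2 offset counts that newline along with the last digit.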
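The same count-from-the-end idea can also be sketched in awk, for anyone who would rather stay with standard shell tools. This is an illustration under assumptions, not part of the original answer: extract_field_awk is a hypothetical name, and lines shorter than eight characters are simply skipped.

```shell
# Hypothetical helper: for each line of at least 8 characters, print the
# 7 characters that start 8 positions from the end, followed by a comma.
# awk strips the newline before the line reaches $0, so no offset for it
# is needed here.
extract_field_awk() {
    awk 'length($0) >= 8 { print substr($0, length($0) - 7, 7) "," }'
}

printf '%s\n' 00000000000001303275310752 123456789 | extract_field_awk
# prints:
# 7531075,
# 2345678,
```

For a 26-character line, length($0) - 7 is 19, so this picks characters 19 to 25, matching the fixed-width cut approach.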
Here is a solution in Python; it should be intuitive:
$ cat data2
00000000000001303275310752
00000000000001827380519015
00000000000000800081610361
00000000000000449481894004
00000000000000449481894004
00000000000001812612607514
$ cat digits.py
import sys
for line in sys.stdin:
    # line still ends with "\n", so [-9:-2] is the 2nd to 8th digit from the right
    print('%s,' % line[-9:-2])
$ python digits.py < data2
7531075,
8051901,
8161036,
8189400,
8189400,
1260751,