Question

I have the following file

CHO 1
4096
26 20 0 0 0 0 0 0 0 0 
0 0 0 0 0 3 5 15 8 14 
9 7 13 10 12 9 5 3 3 2 
2 0 0 0 0 0 0 1 1 0 
0 0 0 0 0 0 0 0 0 0 
0 0 0 0 1 0 1 0 0 0 
0 0 0 0 0 0 1 0 0 0 
0 0 0 0 1 0 0 0 0 0
6 8 5 5 7 13 13 33 23 29 
44 51 56 42 39 31 21 24 18 18 
18 30 44 43 51 67 102 110 130 130 
100 96 87 49 25 16 4 1 1 0
0 0 0 0 0 0

What I want to do is put all entries after 4096 in one column. A desired output is the following

1 26
2 20
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
...
4096 0

I don't have a clue on how to do it using awk. I tried for instance to put them in one line using

awk -F'\n' '{if(NR == 1) {printf $0} else {printf $0}}' file

but I don't know how to get them to one column. Let alone the fact that the first entries are not as expected.

CHO 1409626 20 0 0 0 0 0 0 0 0 0 0 0 0 0 3 5

Any idea on how to get the desired two column output? Any help is more than welcome!!!

Solution

The OP's request is to put all entries after 4096 in one column. The other solutions just assume that 4096 is record number 2. This GNU awk command should take care of that, as well as the problem with spaces at the end of the lines:

awk 'f{print ++x,$1} /4096/{f=1}' RS=" | *\n" file

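PS: you need GNU awk because RS contains more than one character, which gawk treats as a regular expression. Note also that /4096/ matches any record that contains 4096, not only a record equal to it.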
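
If GNU awk is not available, the same idea can probably be expressed in POSIX awk by looping over the fields instead of changing RS. The following is only a sketch under stated assumptions: the function name after_marker is made up for illustration, and it assumes the marker (4096 by default) is the only value on its line.

```shell
# Hypothetical helper (name invented for this sketch): reads the data on
# stdin and numbers every value that follows the marker line, starting at 1.
# Assumption: the marker (default 4096) is the only field on its line.
after_marker() {
  awk -v marker="${1:-4096}" '
    # Once the marker has been seen, print each field on its own line.
    found { for (i = 1; i <= NF; i++) print ++n, $i }
    # The marker line itself is not printed; output starts on the next line.
    !found && NF == 1 && $1 == marker { found = 1 }
  '
}
```

It could be called as after_marker < file. Because awk splits fields on runs of whitespace by default, trailing spaces at the end of a line do not produce empty entries.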

Other tips

Using Perl, it could be done with an adaptation of this:

#!/usr/bin/perl

use strict;
use warnings;

my @lines = ('CHO 1', '4096', #simulate line-by-line loading of the file
'26 20 0 0 0 0 0 0 0 0',
'0 0 0 0 0 3 5 15 8 14', 
'9 7 13 10 12 9 5 3 3 2', 
'2 0 0 0 0 0 0 1 1 0', 
'0 0 0 0 0 0 0 0 0 0', 
'0 0 0 0 1 0 1 0 0 0', 
'0 0 0 0 0 0 1 0 0 0', 
'0 0 0 0 1 0 0 0 0 0',
'6 8 5 5 7 13 13 33 23 29', 
'44 51 56 42 39 31 21 24 18 18', 
'18 30 44 43 51 67 102 110 130 130', 
'100 96 87 49 25 16 4 1 1 0',
'0 0 0 0 0 0');


my $first_line = shift @lines; #removes CHO 1
my $stop = shift @lines; #removes 4096 
my $i = 0;


foreach my $line (@lines) {
  $line =~ s/^\s*//;
  $line =~ s/\s*$//;

  my @parts = split(/\s+/, $line);
  foreach my $part (@parts) {
    print "$i $part\n"; #prints to stdout, maybe you want to print into a file
    $i++;
  }

}

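and this is the output (note that the index starts at 0, whereas the question asks for it to start at 1; initializing $i to 1 would match the desired output):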

0 26
1 20
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 3
16 5
 ...
125 0
 ...

This will put every value after the second line on its own line (without the index column):

$ awk 'NR>2{$1=$1;print}' OFS='\n' file 
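
To also get the index column the question asks for, one option is to pipe the result through a second awk that prefixes the line number. This is a minimal sketch, assuming the data starts on line 3; the function name number_values is invented for illustration.

```shell
# Hypothetical helper (name invented for this sketch): reads the file on
# stdin, skips the two header lines, and prints "index value" pairs.
number_values() {
  # First awk: rebuild each non-empty line with newline as the output
  # field separator, so that every value lands on its own line.
  # Second awk: prefix each of those lines with its line number.
  awk -v OFS='\n' 'NR > 2 && NF { $1 = $1; print }' |
    awk '{ print NR, $0 }'
}
```

It could be called as number_values < file. The NF guard skips blank lines, which would otherwise be counted as entries.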

This can be done with GNU awk, which can use a regex as the record separator (RS):

gawk -v RS="[[:space:]]+" 'NR > 3 { print NR-3, $0 }' file

This might work for you (GNU sed):

sed -r '1d;2{s/.*/seq -s: &/e;s/$/:/;h;d};G;:a;/:/!d;/^\s*\n/{s///;h;$!d;x;s/:/ 0\n/g;s/.$//p;d};s/^(\S+)\s*([^\n]*\n)([^:]*):/\3 \1\n\2/;P;s/[^\n]*\n//;ba' file

This removes the first line. It then stores a sequence of the numbers from 1 to the number held in the second line in the hold space, and removes the second line. Next, it pairs the first number on the following line with the first number in the hold space and adds a newline, prints the pairing, and repeats. When the last number of the last line has been matched, any sequence numbers that are left are paired with zero.
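
The pairing and padding logic described above may be easier to follow in awk. The following is a rough sketch rather than a translation of the sed script: the function name pad_to_count is invented here, and it assumes that line 2 holds the expected number of entries, that missing entries should be reported as 0, and that any values beyond the count are dropped.

```shell
# Hypothetical helper (name invented for this sketch): reads the file on
# stdin, pairs each value with a running index, and pads with zeros until
# the count given on line 2 is reached.
pad_to_count() {
  awk '
    NR == 1 { next }                 # skip the header line (CHO 1)
    NR == 2 { total = $1; next }     # line 2: expected number of entries
    {
      # Print "index value" pairs, stopping once the count is reached.
      for (i = 1; i <= NF; i++)
        if (n < total) print ++n, $i
    }
    END { while (n < total) print ++n, 0 }   # pad the remainder with zeros
  '
}
```

It could be called as pad_to_count < file. For the sample file, which has 126 values and a count of 4096, this would print the 126 pairs followed by zero-valued entries up to index 4096.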

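Here is another way with awk (the index starts at 0 here; using ++y instead of y++ would start it at 1, as in the desired output):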

awk 'NR>2{for(x=1;x<=NF;x++) print y++,$x}' file

Test:

$ cat file
CHO 1
4096
26 20 0 0 0 0 0 0 0 0
0 0 0 0 0 3 5 15 8 14
9 7 13 10 12 9 5 3 3 2
2 0 0 0 0 0 0 1 1 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 1 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0 0 0
6 8 5 5 7 13 13 33 23 29
44 51 56 42 39 31 21 24 18 18
18 30 44 43 51 67 102 110 130 130
100 96 87 49 25 16 4 1 1 0
0 0 0 0 0 0

$ awk 'NR>2{for(x=1;x<=NF;x++) print y++,$x}' file
0 26
1 20
2 0
3 0
4 0
5 0
6 0
7 0
---
---
122 0
123 0
124 0
125 0
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow