Finding the minimum and maximum in a daughter file and relating that result to the parent file

https://stackoverflow.com/questions/9395538

29-10-2019
Question

I have an input file like below.

element  materl(local) 
ipt-shl  stress       sig-xx      sig-yy      sig-zz      sig-xy      sig-yz      sig-zx       plastic
       state                                                                                 strain 
1346995-     25
1-  2 elastic   5.9309E-01 -1.0920E-02  0.0000E+00  2.4431E-04  2.3158E-03  1.0608E-03    7.4616E-02
2-  2 elastic   6.1335E-01 -9.1746E-03  0.0000E+00 -4.2870E-04  2.3158E-03  1.0608E-03    7.4089E-02
3-  2 elastic   6.4586E-01 -7.3146E-03  0.0000E+00 -1.2961E-03  2.3158E-03  1.0608E-03    7.3794E-02
4-  2 elastic   6.7056E-01 -1.5564E-03  0.0000E+00 -1.0469E-03  2.3158E-03  1.0608E-03    7.3682E-02
5-  2 elastic   6.7493E-01  7.1420E-03  0.0000E+00  1.7934E-03  2.3158E-03  1.0608E-03    7.3708E-02
6-  2 elastic   6.7828E-01  1.4787E-02  0.0000E+00  5.4871E-03  2.3158E-03  1.0608E-03    7.3825E-02
7-  2 elastic   6.8092E-01  1.9656E-02  0.0000E+00  8.2580E-03  2.3158E-03  1.0608E-03    7.4210E-02
1346996-     25
1-  2 elastic   6.0586E-01 -4.6476E-03  0.0000E+00  9.4464E-03 -1.9585E-03 -5.1396E-03    7.4299E-02
2-  2 elastic   6.2548E-01 -5.1646E-03  0.0000E+00  6.3450E-03 -1.9585E-03 -5.1396E-03    7.4147E-02
3-  2 elastic   6.5631E-01 -5.3780E-03  0.0000E+00  1.1554E-03 -1.9585E-03 -5.1396E-03    7.4000E-02
4-  2 elastic   6.7186E-01 -1.5611E-03  0.0000E+00 -3.7045E-03 -1.9585E-03 -5.1396E-03    7.3999E-02
5-  2 elastic   6.7481E-01  5.1501E-03  0.0000E+00 -7.2939E-03 -1.9585E-03 -5.1396E-03    7.4107E-02
6-  2 elastic   6.7769E-01  1.1733E-02  0.0000E+00 -1.0146E-02 -1.9585E-03 -5.1396E-03    7.4238E-02
7-  2 elastic   6.7946E-01  1.5462E-02  0.0000E+00 -1.1218E-02 -1.9585E-03 -5.1396E-03    7.4362E-02

and so on.

What I am trying to do is select only the column under plastic strain, write it to another file, and then find the minimum and maximum of it. The problem is that when I move the column to another file I lose the identity of the maximum or minimum, which is the element number at the top of each group of 7 lines. I used

awk '{ print $10 }' elout > Plastic.k    # copy the required field to another file
sed -i -e '/^$/d' Plastic.k              # remove all the empty lines
sed  -n '/^[0-9]\{1\}/p' Plastic.k > tt  # keep only the lines that start with a digit
mv tt Plastic.k

Now I have to find the maximum and minimum in this file Plastic.k and then find the element number (identity) of each occurrence in elout, the original file.

Any suggestions?

P.S. By identity I mean the 7-digit number followed by a - symbol at the top of each group of 7 lines.

The output would be of the form

min=7.3682E-02 at 1346995-25
max=7.4616E-02 at 1346995-25

It would not be 1346996-25, as that element has neither the minimum nor the maximum in field 10. I have such data in an input file and want to write the result to an output file.

If the input format is changed slightly, as follows, the answer from Potong doesn't work. I tried hard to understand why but could not. The new input is as follows.

It is almost the same.

element  materl(local)   
ipt-shl  stress       sig-xx      sig-yy      sig-zz      sig-xy      sig-yz      sig-zx       plastic
state                                                                                 strain
699425-     13
1- 16 elastic   4.9281E-01  5.9754E-02  0.0000E+00 -2.7210E-02  1.4192E-02  1.2603E-01    1.7112E-02
2- 16 elastic   4.6965E-01  4.8664E-02  0.0000E+00 -2.1255E-02  1.4192E-02  1.2603E-01    1.2814E-02
3- 16 elastic   4.3029E-01  2.6264E-02  0.0000E+00 -7.2280E-03  1.4192E-02  1.2603E-01    7.1400E-03
4- 16 elastic   3.1283E-01 -1.4079E-02  0.0000E+00  1.3315E-02  1.4192E-02  1.2603E-01    1.9514E-03
5- 16 elastic  -3.4911E-01 -2.9740E-02  0.0000E+00  3.7036E-02  1.4192E-02  1.2603E-01    7.5132E-04
6- 16 elastic  -4.5764E-01 -7.0891E-02  0.0000E+00  3.6667E-02  1.4192E-02  1.2603E-01    7.1070E-03
7- 16 elastic  -4.8788E-01 -8.1926E-02  0.0000E+00  4.1023E-02  1.4192E-02  1.2603E-01    1.1321E-02
699426-     13
1- 16 elastic   3.5073E-01  6.2039E-03  0.0000E+00 -9.4607E-02 -3.4023E-03 -2.4265E-02    1.4478E-02
2- 16 elastic   3.5540E-01  8.6871E-03  0.0000E+00 -7.2062E-02 -3.4023E-03 -2.4265E-02    1.0498E-02
3- 16 elastic   3.6224E-01  7.2871E-03  0.0000E+00 -3.5263E-02 -3.4023E-03 -2.4265E-02    6.1994E-03
4- 16 elastic   2.3782E-01 -1.7772E-02  0.0000E+00  5.9101E-03 -3.4023E-03 -2.4265E-02    1.6298E-03
5- 16 elastic  -2.3065E-01 -3.2565E-02  0.0000E+00  6.0890E-02 -3.4023E-03 -2.4265E-02    1.3029E-03
6- 16 elastic  -3.0923E-01 -3.0984E-02  0.0000E+00  9.0408E-02 -3.4023E-03 -2.4265E-02    5.3680E-03
7- 16 elastic  -3.3606E-01 -2.5992E-02  0.0000E+00  1.0568E-01 -3.4023E-03 -2.4265E-02    9.3878E-03

The only difference is that in this file we have 16 instead of 2 next to the numbers 1 to 7.

Please suggest a correction.

Solution

This might work for you:

sed '/^\([0-9]\{7\}\).*/,+7!d;//{s//\1/;h;d};s/.* //;G;s/^\(.*\)\n\(.*\)/\2 \1/' file |
sort -g |
sed 'h;N;N;N;N;N;N;s/.*\n//;H;g;s/\n\S*//'
1346995 7.3682E-02 7.4616E-02
1346996 7.3999E-02 7.4362E-02

EDIT:

With reference to comments below and requested output shown in amended question, here is an amended solution:

sed '/^\([0-9]\{7\}-\)\s*\([0-9]*\).*/,+7!d;//{s//at \1\2/;h;d};s/.* //;G;s/\n/ /' file| 
sort -g | 
sed '1s/^/min=/p;$s/^/max=/p;d'
min=7.3682E-02 at 1346995-25
max=7.4616E-02 at 1346995-25
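
Note that the second sample in the question uses six-digit element numbers (699425), so the [0-9]\{7\} in the sed address above cannot match its header lines. A minimal single-pass awk sketch that does not depend on the number of digits is shown below. It assumes element header lines have exactly 2 fields and data rows exactly 10, as in both samples; the function name find_minmax is hypothetical.

```shell
# find_minmax is a hypothetical helper; it reads the named files, or stdin if none are given.
find_minmax() {
  awk '
    # Element header such as "1346995-     25": remember it as "1346995-25".
    NF == 2 && $1 ~ /^[0-9]+-$/ { id = $1 $2; next }
    # Data row such as "1-  2 elastic ... 7.4616E-02": field 10 is the plastic strain.
    NF == 10 && $1 ~ /^[0-9]+-$/ {
      v = $10 + 0                      # force a numeric comparison
      if (!seen || v < min) { min = v; minraw = $10; minid = id }
      if (!seen || v > max) { max = v; maxraw = $10; maxid = id }
      seen = 1
    }
    END {
      if (seen) {
        print "min=" minraw " at " minid
        print "max=" maxraw " at " maxid
      }
    }
  ' "$@"
}
```

It could be called as find_minmax elout > minmax.out, where minmax.out is just an example output name. The title lines are skipped because their first field is not a number followed by "-".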

Other tips

Here's a solution based on "Sorting Scientific Number With Unix Sort": use sort -g on column 10 of the original file elout.

awk '{ print $10 }' elout | sed -ne '/^[0-9]\{1\}/p' | sort -g | sed -n -e '1p' -e '$p'
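
To illustrate why -g matters for this data: sort -n stops reading a number at the E, so the exponent is ignored, while sort -g (general numeric) parses the whole floating-point value. A small sketch with made-up values in the same notation, assuming a sort that supports -g such as GNU coreutils sort:

```shell
# Made-up values in the same notation as the plastic strain column.
values='7.4616E-02
1.2000E-03
9.9000E-04'

# -n compares only the leading 7.4616, 1.2000 and 9.9000, so the order is wrong.
by_n=$(printf '%s\n' "$values" | LC_ALL=C sort -n)
# -g parses the exponent too, so the smallest value comes first.
by_g=$(printf '%s\n' "$values" | LC_ALL=C sort -g)

printf 'sort -n:\n%s\nsort -g:\n%s\n' "$by_n" "$by_g"
```

LC_ALL=C is set only so that the decimal point is interpreted the same way regardless of the user's locale.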

$ cat input.txt | awk 'NR<4{next}; NF==2{id=$1}; NF==10{printf "%s %f\n",id+0,$10}' | sort -k1,1 -k2,2n | awk 'x!=$1{if(NR!=1)printf "%s\n\n",y;x=$1;print};{y=$0};END{print}'

Broken into multiple lines (`>` is the bash continuation prompt):

$ cat input.txt |
> awk 'NR<4{next}; NF==2{id=$1}; NF==10{printf "%s %f\n",id+0,$10}' |
> sort -k1,1 -k2,2n |
> awk 'x!=$1{if(NR!=1)printf "%s\n\n",y;x=$1;print};{y=$0};END{print}'

Result:

1346995 0.073682
1346995 0.074616

1346996 0.073999
1346996 0.074362

Explanation:

NR<4{next} skip the first 3 lines
NF==2{id=$1} keep track of the current group id
NF==10{printf...$10} print both the id and the value of column 10
sort -k1,1 -k2,2n sort by column 1, then numerically by column 2
awk 'x!=$1...print} print the previous group's last line before printing the current group's first line
{y=$0} keep track of the last line
END{print} print the last line

rrr... can't insert a code block in a comment =( If I understood you right, you need the numbers from the first column that correspond to the minimum and maximum values in the 10th column, right? Then you can use the following script:

#!/bin/bash
# Read the original file elout; Plastic.k holds only the extracted column, so it has no $1 or $10.
minAndMax="$(awk '{ print $10 }' elout | sed -ne '/^[0-9]\{1\}/p' | sort -g | sed -n -e '1p' -e '$p')"
min="$(echo "$minAndMax" | head -n 1)"
max="$(echo "$minAndMax" | tail -n 1)"
# $1 is the first column of each matching row (e.g. "4-"); sed strips the trailing "-".
minIDs="$(awk -v v="$min" '$10 == v { print $1 }' elout | sed -e 's/-$//')"
maxIDs="$(awk -v v="$max" '$10 == v { print $1 }' elout | sed -e 's/-$//')"
echo "\$minIDs==$minIDs"
echo "\$maxIDs==$maxIDs"

#!/bin/bash   

cat - <<-EOD > testMinMaxData.txt
1346995-     25
1-  2 elastic   5.9309E-01 -1.0920E-02  0.0000E+00  2.4431E-04  2.3158E-03  1.0608E-03    7.4616E-02
2-  2 elastic   6.1335E-01 -9.1746E-03  0.0000E+00 -4.2870E-04  2.3158E-03  1.0608E-03    7.4089E-02
3-  2 elastic   6.4586E-01 -7.3146E-03  0.0000E+00 -1.2961E-03  2.3158E-03  1.0608E-03    7.3794E-02
4-  2 elastic   6.7056E-01 -1.5564E-03  0.0000E+00 -1.0469E-03  2.3158E-03  1.0608E-03    7.3682E-02
5-  2 elastic   6.7493E-01  7.1420E-03  0.0000E+00  1.7934E-03  2.3158E-03  1.0608E-03    7.3708E-02
6-  2 elastic   6.7828E-01  1.4787E-02  0.0000E+00  5.4871E-03  2.3158E-03  1.0608E-03    7.3825E-02
7-  2 elastic   6.8092E-01  1.9656E-02  0.0000E+00  8.2580E-03  2.3158E-03  1.0608E-03    7.4210E-02
1346996-     25
1-  2 elastic   6.0586E-01 -4.6476E-03  0.0000E+00  9.4464E-03 -1.9585E-03 -5.1396E-03    7.4299E-02
2-  2 elastic   6.2548E-01 -5.1646E-03  0.0000E+00  6.3450E-03 -1.9585E-03 -5.1396E-03    7.4147E-02
3-  2 elastic   6.5631E-01 -5.3780E-03  0.0000E+00  1.1554E-03 -1.9585E-03 -5.1396E-03    7.4000E-02
4-  2 elastic   6.7186E-01 -1.5611E-03  0.0000E+00 -3.7045E-03 -1.9585E-03 -5.1396E-03    7.3999E-02
5-  2 elastic   6.7481E-01  5.1501E-03  0.0000E+00 -7.2939E-03 -1.9585E-03 -5.1396E-03    7.4107E-02
6-  2 elastic   6.7769E-01  1.1733E-02  0.0000E+00 -1.0146E-02 -1.9585E-03 -5.1396E-03    7.4238E-02
EOD

if ${testingMode:-true} ; then
  set -- testMinMaxData.txt
fi

awk '
   NF==2{gsub(/[  ]*/,"",$0); header=$0}
   NF==10{print header "\t" $10}
' "${@:?Usage:$0 file1 [file2 ....]}" \
| awk '{
    hd=$1
    if (! (hd in hdrs)) {
      hdrs[hd]=++i ; hdrsVal[i]=hd; min[hd]=999999; max[hd]=0.000000009 ;
      #dbg print "#dbg:added " hd " to hdrs"
    }
    #dbg print "#dbg:$2=" $2 "\tmin["hd"]=" min[hd] "\tmax["hd"]="max[hd]
    if ($2 < min[hd]) {
      min[hd]=$2
      #dbg print "#dbg:added "$2" to min["hd"]"
    }
    if ($2 > max[hd]+0.0) {
      max[hd]=$2
      #dbg print "#dbg:added "$2" to max["hd"]"
    }
}
END {
   #dbg for (x in hdrs) print "hdrs["x"]=" hdrs[x]
    for (j=1;j<=i;j++) {
      print hdrsVal[j] "\t" min[hdrsVal[j]] "\t" max[hdrsVal[j]]
    }
  }
'\
| awk 'BEGIN{
  minVal=9999999999
  maxVal=.000000009
  }
  {
    if ($2 < minVal) {
        minVal=$2 ; minTag=$1
        #dbg print "#dbg:added "$2" to min["hd"]"
    }
    if ($3 > maxVal) {
        maxVal=$3  ; maxTag=$1
        #dbg print "#dbg:added "$2" to max["hd"]"
    }
  }
  END {
   print "min=" minVal " at " minTag
   print "max=" maxVal " at " maxTag
  }
'

output

min=7.3682E-02 at 1346995-25
max=7.4616E-02 at 1346995-25

This script is a self-contained proof-of-concept test. For real usage, I would recommend deleting both of the following 'blocks' of code and leaving only the 3 awk programs in your working script file.

The cat ... > testMin...EOD block writes your sample data to a test file.

The if ${testingMode:-true}... block uses the shell feature of set -- arg1 arg2 ... to set positional parameters. This value is then expanded as the shell parameter "${@}" that you see at the end of the first awk program (just before the pipe char ('|')).
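
A minimal sketch of that mechanism in isolation. The function name show_args is hypothetical; testMinMaxData.txt is the test file name used in the script. Inside a function, set -- replaces the function's own positional parameters, whereas at the top level of a script it replaces the script's.

```shell
# show_args is a hypothetical helper that mimics the testing-mode block.
show_args() {
  # ${testingMode:-true} expands to the command "true" unless testingMode is set,
  # so the branch runs by default.
  if ${testingMode:-true}; then
    set -- testMinMaxData.txt     # replace the positional parameters
  fi
  # "$#" is the argument count; "$@" expands to each argument as a separate word.
  printf '%s\n' "$#" "$@"
}
```

With testingMode unset, show_args a b reports a single argument, testMinMaxData.txt. With testingMode=false the original arguments a and b are left alone.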

I have also embedded a usage statement into the evaluation of "${@:?Usage:$0 file1 [file2 ....]}". If no filenames are supplied, the script gives you a simple error/usage message.

I have left the debugging statements in; you can remove the '#dbg' prefix from a line to see how the data is being processed as it goes through the script.

Note that awk associative arrays (hdrs[hd]=++i ; hdrsVal[i]=hd; etc.) are not always intuitively obvious to the new awk user, BUT they are one of the language's most powerful features. They are definitely worth your time experimenting with to understand how they work. Turn on some of the debugging lines to see what values are getting stored where.

The only reason I keep the arrays hdrs[hd] and hdrsVal[i] is so that at the end we can enumerate by numeric key (1, 2, 3, ...), which means the data will be printed in the order it was read in. Using the value returned by hdrsVal[1], which is 1346995-25, we can then look up the min and max values via min["1346995-25"] and max["1346995-25"].
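
The same bookkeeping can be sketched on its own, away from the min/max logic. This is only an illustration: the function name first_seen_order and the array names index_of, key_at and count are hypothetical, chosen to mirror hdrs and hdrsVal.

```shell
# first_seen_order is a hypothetical helper: it prints each distinct first field
# once, in the order it first appeared, followed by how many times it occurred.
first_seen_order() {
  awk '
    {
      key = $1
      if (!(key in index_of)) {      # the "in" test does not create the entry
        index_of[key] = ++n          # key -> position (like hdrs[hd])
        key_at[n] = key              # position -> key (like hdrsVal[i])
      }
      count[key]++
    }
    END {
      # "for (k in count)" would visit the keys in an unspecified order,
      # so walk the numeric positions instead.
      for (j = 1; j <= n; j++) print key_at[j], count[key_at[j]]
    }
  ' "$@"
}
```

For example, printf 'b\na\nb\n' | first_seen_order lists b before a, because b was read first.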

Finally, as your data looks to be engineering data, you may find further help by looking at the links at awk.info--Engineering

Edits

I have added the final distillation to just 1 min and max value with the setID.

You wrote

How can I add an input file name and an output file name.

When you edit the script as I have mentioned above, you need to save the file.

Then from the Unix/Linux/Cygwin cmd-line, you need to 'mark' the file so the O.S. knows it is meant to be executable.

chmod 755 ./myMinMaxFinder.sh

Now, you can execute the cmd like

 ./myMinMaxFinder.sh file1 [file2 .... filen] > myOUTPUT.FILE

This is the standard way of creating output files in Unix. Argument processing will be a consulting fee ;-)

I mentioned awk.info above. As you're a mechanical engineer, be sure to check out

http://awk.info/?doc/mecheng.html

This also points to another website, done by a mechanical engineer

http://www.tikmark.com/awkeng/awkscripts.html

The design I'm using here is a traditional Unix pipeline. Each awk section solves one part of the puzzle. You can disconnect any section (by inserting 2 blank lines and adding exit) to see what each stage of the script is doing.
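
As an illustration of looking at a single stage, here is a sketch of the first awk program wrapped in a function so it can be run by itself. The name stage1 is hypothetical, and the whitespace pattern is written as [ \t]+ here, which is an assumption about what the original bracket expression was meant to contain.

```shell
# stage1 is a hypothetical wrapper around the first awk program of the pipeline.
stage1() {
  awk '
    # Element header: strip the blanks so "1346995-     25" becomes "1346995-25".
    NF==2  { gsub(/[ \t]+/, "", $0); header = $0 }
    # Data row: emit the current header and the plastic strain, tab-separated.
    NF==10 { print header "\t" $10 }
  ' "$@"
}
```

Running stage1 on the input file prints one header/value pair per data row, which is what the second awk program consumes.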

For more general info on using awk, see the Grymoire's most excellent Awk Tutorial.

I hope this helps.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow
