Question

As part of an annotation pipeline for de novo fish genomes, I need to compare e-values from BLAST to see whether they are lower than a certain threshold.

To get the semantics right, I first tested one of the other columns in the BLAST output, and it works fine like this:

for f in FOLDER/*; do
    myVar=$(head -1 "$f" | awk '{print $4}')
    if [[ $myVar -gt 50 ]]; then echo ..... ; fi
done

Here $4 is a column in the BLAST output that holds integer values (hit length or something similar).

However, when I change the script to work with the e-values, there are problems with the interpretation of the scientific notation.

What I WOULD like is this:

for f in FOLDER/*; do
    myVar=$(head -1 "$f" | awk '{print $11}')
    if [[ $myVar -gt 1.0e-10 ]]; then echo ..... ; fi
done

where $11 points to the e-value for each hit.

Could this be done in a not too cumbersome manner in bash?


Solution

With awk, it is possible:

for f in FOLDER/*; do awk '$11 < 1e-10 {print $11}' "$f"; done

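awk compares the field numerically and understands scientific notation, so no shell variable needs to be defined first. Note that this checks every line of each file, not only the first one.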
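If the test should drive an `if` in the shell loop and look only at the first hit, as in the question, one option is to let awk do the floating-point comparison and report the result through its exit status. This is only a sketch under some assumptions: the function name `below_threshold`, the `threshold` variable and the two sample files are made up for illustration, and the e-value is assumed to sit in column 11 of whitespace-separated BLAST tabular output.

```shell
# Hypothetical cutoff; any number awk can parse works, including scientific notation.
threshold=1e-10

# below_threshold FILE
# Succeeds (exit status 0) only if the first line of FILE has at least 11
# fields and its 11th field is numerically smaller than $threshold.
below_threshold() {
    awk -v t="$threshold" '
        NR == 1 { ok = (NF >= 11 && $11 + 0 < t + 0); exit }  # "+ 0" forces a numeric comparison
        END     { exit !ok }                                   # empty file: ok is unset, so fail
    ' "$1"
}

# Demo with two made-up files standing in for FOLDER/*.
demo_dir=$(mktemp -d)
printf 'q1\ts1\t98.5\t120\t1\t0\t1\t120\t5\t124\t3e-25\t230\n' > "$demo_dir/a.tsv"
printf 'q2\ts2\t80.0\t60\t9\t1\t1\t60\t3\t62\t0.002\t41.2\n' > "$demo_dir/b.tsv"

for f in "$demo_dir"/*; do
    if below_threshold "$f"; then
        echo "$(basename "$f"): e-value below threshold"
    fi
done

rm -rf "$demo_dir"
```

With the two sample files, only a.tsv is reported. The shell's `[[ ... -gt ... ]]` test handles integers only, which is why the comparison is delegated to awk here.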

Licensed under: CC-BY-SA with attribution