Question

When I type ls I get:

aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_albimanus_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_arabiensis_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_stephensi_upstream_dremeready_all_simpleMasked_random.fasta
culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta

I want to pipe this into cut (or via some alternative way) so that I only get:

aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus

If cut accepted a string (multiple characters) as its delimiter, then I could use:

cut -d "_upstream_" -f1

But that is not permitted as cut only takes single characters as delimiters.

Solution

awk does allow a multi-character string as the field separator:

$ awk -F"_upstream_" '{print $1}' file
aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
drosophila_melanogaster
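
Since the question starts from ls output rather than a file, the same awk call can sit at the end of a pipeline. The following is only a sketch: the function name strip_upstream is made up for illustration, and the sample names are fed in with printf so the snippet runs anywhere. Note that awk treats a multi-character separator as a regular expression; _upstream_ contains no special characters, so it matches literally.

```shell
# Hypothetical helper: keep only the text before "_upstream_" on each input line.
strip_upstream() {
    awk -F'_upstream_' '{ print $1 }'
}

# Stand-in for `ls | strip_upstream`, using two of the names from the question.
printf '%s\n' \
    'aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta' \
    'culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta' |
    strip_upstream
```

In the real directory this would be used as ls | strip_upstream.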

Note that for the given input you can also use cut with _ as the delimiter and print the first two fields:

$ cut -d'_' -f-2 file
aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
drosophila_melanogaster
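
The cut approach counts fields, so it only works while every prefix has exactly two underscore-separated parts. As a small illustration, the name below is invented (it does not appear in the question) and has a three-part prefix; splitting on the _upstream_ marker with awk keeps the whole prefix, while cut truncates it.

```shell
# Made-up file name with a three-part prefix, purely for illustration.
name='homo_sapiens_neanderthalensis_upstream_dremeready.fasta'

# Counting fields keeps only the first two parts.
by_count=$(printf '%s\n' "$name" | cut -d'_' -f-2)

# Splitting on the marker keeps everything before it.
by_marker=$(printf '%s\n' "$name" | awk -F'_upstream_' '{ print $1 }')

echo "$by_count"    # homo_sapiens
echo "$by_marker"   # homo_sapiens_neanderthalensis
```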

sed and grep can also do it. For example, this grep (the -P option requires GNU grep) uses a look-ahead to print everything from the beginning of the line up to _upstream:

$ grep -Po '^\w*(?=_upstream)' file
aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
drosophila_melanogaster
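
For the sed side of that remark, a minimal sketch could look like the following. It works as a filter on standard input and does not modify any file; the function name strip_with_sed is an assumption made for this example. Unlike the \w* pattern in the grep command, it places no restriction on which characters appear before the marker.

```shell
# Hypothetical helper: delete "_upstream_" and everything after it on each line.
strip_with_sed() {
    sed 's/_upstream_.*//'
}

# Stand-in for `ls | strip_with_sed`, using names from the question.
printf '%s\n' \
    'anopheles_albimanus_upstream_dremeready_all_simpleMasked_random.fasta' \
    'anopheles_stephensi_upstream_dremeready_all_simpleMasked_random.fasta' |
    strip_with_sed
```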

Other tips

If you only want the first field, you could do the trimming with bash parameter expansion:

ls | while IFS= read -r line; do echo "${line%%_upstream_*}"; done

You could also use sed:

sed -i.bak 's/_upstream.*//' file

Result (the contents of file after the in-place edit):

aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
drosophila_melanogaster

Note: -i.bak edits file in place and keeps a backup of the original as file.bak.

Similar to @Tom Fenech's answer, this uses bash parameter expansion (substring removal), but with a for loop:

$ ls
aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_albimanus_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_arabiensis_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_stephensi_upstream_dremeready_all_simpleMasked_random.fasta
culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta
drosophila_melanogaster_upstream_dremeready_all_simpleMasked_random.fasta

$ for file in *; do
> echo "${file%%_upstream_*}"
> done
aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
drosophila_melanogaster
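
To see what the %% operator in that expansion is doing, it may help to compare it with the single % form. This is a short sketch using one of the file names from the question; the variable names are chosen here for illustration. % removes the shortest suffix that matches the pattern, and %% removes the longest.

```shell
name='aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta'

# Longest suffix matching "_upstream_*": everything from the marker onward.
prefix=${name%%_upstream_*}

# Shortest suffix matching "_*": only the last underscore-separated part.
shortest=${name%_*}

# Longest suffix matching "_*": everything from the first underscore onward.
longest=${name%%_*}

echo "$prefix"     # aedes_aegypti
echo "$shortest"   # aedes_aegypti_upstream_dremeready_all_simpleMasked
echo "$longest"    # aedes
```

Because _upstream_ occurs only once in these names, % and %% give the same result with that particular pattern; the difference only shows up with a pattern such as _* that can match in several places.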
Licensed under: CC-BY-SA with attribution