Question

I've got a big text file. I need to extract all the lines that contain the exact word "DUSP1". Here is an example of the lines:

9606    ENSP00000239223 DUSP1   BLAST
9606    ENSP00000239223 DUSP1-001 Ensembl

I want to retrieve the first line but not the second one.

I tried several commands, such as:

grep -E "^DUSP1"
grep '\<DUSP1\>'
grep '^DUSP1$'
grep -w DUSP1

But none of them seem to work. Which option should I use?

Solution 2

The problem you are facing is that grep treats a dash (-) as a word delimiter, so word-based matches such as -w and \<...\> also match DUSP1-001.

You should try this command:

grep '\sDUSP1\s' file

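to ensure that there is whitespace on both sides of your word.
Word boundaries are stricter than a plain substring search, but \b also treats the dash as a boundary, so the following still matches DUSP1-001: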

grep '\bDUSP1\b' file
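
To compare word-based matching with whitespace-delimited matching, here is a minimal sketch. It assumes the columns are separated by tabs (the question does not say), builds a throwaway sample file with mktemp, and uses the POSIX class [[:space:]] in place of \s so it does not depend on GNU extensions. The variable names are only illustrative.

```shell
#!/bin/sh
# Build a small sample file (assumed tab-separated, like the lines in the question).
sample=$(mktemp)
printf '9606\tENSP00000239223\tDUSP1\tBLAST\n' > "$sample"
printf '9606\tENSP00000239223\tDUSP1-001\tEnsembl\n' >> "$sample"

# Word-based matching: the dash is a non-word character, so both lines match.
word_hits=$(grep -c -w 'DUSP1' "$sample")

# Whitespace-delimited matching: DUSP1 must be preceded by line start or
# whitespace and followed by whitespace or line end, so only the first line matches.
field_hits=$(grep -c -E '(^|[[:space:]])DUSP1([[:space:]]|$)' "$sample")

echo "grep -w matches: $word_hits"
echo "field matches:   $field_hits"
rm -f "$sample"
```

The (^|...) and (...|$) alternatives also cover the cases where DUSP1 is the first or last field on a line, which a pattern requiring whitespace on both sides would miss.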

OTHER TIPS

If you want to grep exactly the whole word, you can use word boundaries like this:

grep '\bDUSP1\b'

This requires a word boundary both before and after DUSP1. Note that a dash counts as a boundary, so DUSP1-001 still matches.

Adding to what sputpick said, it could be either that or:

grep '\sDUSP1$' file 

if DUSP1 is at the end of the line.
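
The end-of-line case can be checked with a short sketch. The sample file below is hypothetical: it assumes tab-separated columns with DUSP1 as the last field, and it uses [[:space:]] instead of \s for portability.

```shell
#!/bin/sh
# Hypothetical sample: DUSP1 (or DUSP1-001) is the last field on each line.
sample=$(mktemp)
printf '9606\tENSP00000239223\tDUSP1\n' > "$sample"
printf '9606\tENSP00000239223\tDUSP1-001\n' >> "$sample"

# Requires whitespace on both sides, so a line that ends in DUSP1 is missed.
# grep exits non-zero when nothing matches; "|| true" keeps the script going.
both_sides=$(grep -c '[[:space:]]DUSP1[[:space:]]' "$sample" || true)

# Anchors the word to the end of the line instead.
at_end=$(grep -c '[[:space:]]DUSP1$' "$sample")

echo "whitespace on both sides: $both_sides"
echo "anchored at end of line:  $at_end"
rm -f "$sample"
```

If the file has Windows line endings, a trailing carriage return sits between DUSP1 and the end of the line, so the anchored pattern would need to allow for it.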

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow