Question

I need to search multiple files for a PATTERN and, when it is found, display the file name, the line number, and the PATTERN surrounded by a few extra characters. My problem is that if the line matching the PATTERN ends with ^M (CRLF), grep prints an empty line instead.

Create a file like this: first line "a^M", second line "a", third line empty, fourth line "a" (not followed by a newline).

a^M
a

a

Without trying to match a few chars after the PATTERN all occurrences are found and displayed:

# grep -srnoEiI ".{0,2}a" *
1:a
2:a
4:a

If I try to match any chars at the end of the PATTERN, it prints an empty line instead of line one, the one ending in CRLF:

# grep -srnoEiI ".{0,2}a.{0,2}" *

2:a
4:a

How can I change this to act as expected?

P.S. I would like to fix this with grep, but I will accept other solutions, for example in awk.

EDIT:

Based on the answers below I chose to strip the \r, forcing grep to keep its colors when piping to tr:

grep --color=always -srnoEiI ".{0,2}a.{0,2}" * | tr -d '\r'

Solution

Here's a simpler case that reproduces your problem:

# Prints "a"
echo $'a\r' | grep -o "a"
# Appears to print nothing (the output is really "a" followed by a carriage return)
echo $'a\r' | grep -o "a."

This is because the ^M (carriage return) is matched by "." like any regular character, and it makes your terminal overwrite its own output (this is purely cosmetic).

How you want to fix this depends on what you want to do.

# Show the output in hex format to ensure it's correct
$ echo $'a\r' | grep -o "a." | od -t x1 -c
0000000  61  0d  0a
      a  \r  \n

# Show the output in visually less ambiguous format
$ echo $'a\r' | grep -o "a." | cat -v
a^M

# Strip the carriage return
$ echo $'a\r' | grep -o "a." | tr -d '\r'
a
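If you would rather stay within grep alone, another option (not from the answer above, just a sketch) is to keep the carriage return out of the trailing context by using a negated bracket expression in place of the final ".". The `cr` variable is an illustrative name, and the example assumes a grep that supports -o, such as GNU grep:

```shell
# Build a literal carriage return portably; command substitution
# strips trailing newlines only, so the \r survives.
cr=$(printf '\r')

# Same pattern as in the question, but the trailing context is
# "up to two characters that are not CR" instead of "any two characters".
result=$(printf 'a\r\na\n\na' | grep -noE ".{0,2}a[^$cr]{0,2}")

printf '%s\n' "$result"
```

The leading ".{0,2}" is left alone because a CR normally appears only at the end of a line; a stray CR in the middle of a line could still be captured there.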

OTHER TIPS

awk -v pattern="a" '$0 ~ pattern && !/\r$/ {print NR ": " $0}' file

or

sed -n '/a/{/\r$/!{=;p}}' ~/tmp/srcfile | paste -d: - -

Both of these find the pattern, check that the line does not end in a carriage return, and print the line number and the line. In the sed version the line number is printed on its own line, so we have to join each pair of consecutive lines with a colon.
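Note that both commands skip lines ending in a carriage return entirely. If those lines should be reported too, one possibility is to strip the CR inside awk before matching. This is a minimal sketch rather than part of the original answer: `ctx_grep` and the context width `n` are hypothetical names, matching is case-sensitive, and only the first match on each line is reported.

```shell
# Hypothetical helper: ctx_grep PATTERN FILE...
# Prints file:line:context, where context is the first match plus
# up to n characters on either side.
ctx_grep() {
    pat=$1; shift
    awk -v pat="$pat" -v n=2 '
        { sub(/\r$/, "") }                    # drop a trailing CR, if any
        match($0, pat) {
            start = RSTART - n                # begin n chars before the match
            if (start < 1) start = 1          # clamp at the start of the line
            len = (RSTART - start) + RLENGTH + n
            print FILENAME ":" FNR ":" substr($0, start, len)
        }' "$@"
}
```

Usage would look like `ctx_grep a file1 file2`; FNR is used so that line numbers restart for each file.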

You could use pcregrep:

pcregrep -n '.{0,2}a.{0,2}' inputfile

For your sample input:

$ printf $'a\r\na\n\na\n' | pcregrep -n '.{0,2}a.{0,2}' 
1:a
2:a
4:a

A couple more ways:

Use the dos2unix utility to convert the dos-style line endings to unix-style:

dos2unix myfile.txt

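Or preprocess the file using tr to remove the CR characters, then pipe to grep (without -r, because with -r and no file argument GNU grep searches the current directory instead of reading standard input):

$ tr -d '\r' < myfile.txt | grep -snoEiI ".{0,2}a.{0,2}"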
1:a
2:a
4:a
$

Note that dos2unix may need to be installed on whatever OS you are using, whereas tr should be available on any POSIX-compliant OS.
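Piping through tr loses the file name, which matters when searching several files. A possible workaround, sketched here under the assumption that your grep supports the --label and -H options (GNU grep does), is to loop over the files and label each stream. `grep_nocr` is a hypothetical helper name:

```shell
# Hypothetical helper: grep_nocr PATTERN FILE...
# Strips every CR from each file, then greps the result while
# reporting the original file name via --label.
grep_nocr() {
    pattern=$1; shift
    for f in "$@"; do
        tr -d '\r' < "$f" | grep -noEiI -H --label="$f" -- "$pattern"
    done
}
```

For example, `grep_nocr '.{0,2}a.{0,2}' *.txt` prints lines of the form file:line:match. Unlike dos2unix, this leaves the files themselves untouched, but tr removes all carriage returns, not only those at line ends.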

You can use awk with a custom field separator:

awk -F '[[:blank:]\r]' '/.{0,2}a.{0,2}/{print FILENAME, NR, $1}' OFS=':' file

TESTING:

Your grep command:

grep -srnoEiI ".{0,2}a.{0,2}" file|cat -vte
file:1:a^M$
file:2:a$
file:4:a$

Suggested awk command:

awk -F '[[:blank:]\r]' '/.{0,2}a.{0,2}/{print FILENAME, NR, $1}' OFS=':' file|cat -vte
file:1:a$
file:2:a$
file:4:a$
Licensed under: CC-BY-SA with attribution