Linux shell: Print lines whose first word matches a variable containing special characters

StackOverflow https://stackoverflow.com/questions/18248055

  •  24-06-2022

Question

I have a file containing rows of strings like this:

uh-g+n uh-g+b
uh-g g
uh-g+r
g+n
uh-g+s g
sh-n+b
sh-n+d
n+d sh-n+d
g-n+d sh-n+d

I have a list of strings I am searching for, such as

set pats = (g+n sh-n+b n+d)

For each string, I want to find the line whose first "word" matches the string, and append that line to another file.

As you can see...

  • The strings to match are in variables

  • The strings may or may not contain special leading characters such as "-"

  • Lines may contain the string as the first of two words, or in isolation

  • The string may be a substring of a longer string containing special leading characters

  • The string may be the second word or part of the second word (which should not be a match)

It has been an unexpected challenge to find the right combination of things to do with grep to make this work!

Here's a simple attempt that doesn't work (assuming the rows listed above are in the file in.txt):

#!/bin/tcsh
set pats = (g+n sh-n+b n+d)
foreach pat ($pats)
   grep -w $pat in.txt >> out.txt
end

In this case, out.txt looks like this:

uh-g+n uh-g+b
g+n
sh-n+b
sh-n+d
n+d sh-n+d
g-n+d sh-n+d
uh-g+n uh-g+b
g+n
sh-n+b
sh-n+d
n+d sh-n+d
g-n+d sh-n+d

But what I want is this:

g+n
sh-n+b
n+d sh-n+d

Solution

The following pipeline gives the expected output:

( IFS=$'\n' ; echo "${pats[*]/#/^}" ) | grep -f- in.txt

The first part prints the patterns, one per line, each preceded by ^. grep then reads that list from standard input as its pattern file (-f-), and the ^ anchors each pattern to the beginning of a line.
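
To make the two steps visible, here is a small self-contained sketch that recreates the sample input in a temporary directory and prints the intermediate pattern list before running grep. The variable names workdir, anchored and matches are only for illustration and are not part of the original one-liner.

```bash
#!/usr/bin/env bash
# Recreate the sample input from the question in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/in.txt" <<'EOF'
uh-g+n uh-g+b
uh-g g
uh-g+r
g+n
uh-g+s g
sh-n+b
sh-n+d
n+d sh-n+d
g-n+d sh-n+d
EOF

pats=(g+n sh-n+b n+d)

# Step 1: prefix every array element with ^ and join the elements with newlines.
anchored=$( IFS=$'\n' ; echo "${pats[*]/#/^}" )
printf '%s\n' "$anchored"

# Step 2: pass the list to grep as a pattern file read from standard input.
matches=$(printf '%s\n' "$anchored" | grep -f - "$workdir/in.txt")
printf '%s\n' "$matches"

rm -rf "$workdir"
```

Two caveats. The patterns are still interpreted as basic regular expressions, so this works here because + and - are literal in that syntax; a pattern containing . or * would need escaping. Also, ^ only anchors the start of the line, so a pattern such as uh-g would select every line that begins with uh-g, not only those whose first word is exactly uh-g.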

Update: The tag was changed to tcsh. Ouch. This is a bash solution.
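
If the match has to cover the whole first word rather than just the start of the line, one possible variation (not part of the original answer) is to let awk compare the first field literally against the list. The sketch below is written for bash to stay consistent with the solution above. It assumes the patterns contain no whitespace or backslashes and passes them as a single space-separated string; the names want, list and first_word_matches are made up for this example.

```bash
#!/usr/bin/env bash
# Recreate the sample input from the question in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/in.txt" <<'EOF'
uh-g+n uh-g+b
uh-g g
uh-g+r
g+n
uh-g+s g
sh-n+b
sh-n+d
n+d sh-n+d
g-n+d sh-n+d
EOF

# Patterns as one space-separated string (assumption: no spaces inside a pattern).
pats="g+n sh-n+b n+d"

# Build a lookup table from the patterns, then print each line whose first
# whitespace-separated field is exactly one of them. The comparison is a plain
# string lookup, so no character in a pattern is treated as a regex operator.
first_word_matches=$(awk -v pats="$pats" '
    BEGIN { n = split(pats, list, " "); for (i = 1; i <= n; i++) want[list[i]] = 1 }
    $1 in want
' "$workdir/in.txt")
printf '%s\n' "$first_word_matches"

rm -rf "$workdir"
```

Like the grep pipeline, this prints matching lines in the order they appear in in.txt, not in the order of the patterns.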

Licensed under: CC-BY-SA with attribution